I'm trying to model a realistic social network (Facebook). I'm a Computer Science graduate student, so I have a grasp of fundamental data structures and algorithms.
I started this project in Java. My idea is to create multiple Areas of Users. Each User in a given Area will have a random number of Friends, normally distributed around a given mean. Each User will have a large percentage, or cluster, of Friends in the Area they belong to. The rest of their Friends will be smaller clusters from a few different random Areas.
I wanted to create an ArrayList of Areas,
with each Area holding an ArrayList of Users,
and each User holding an ArrayList of Friends.
Then I can go through each Area, and each User in that Area, and give that User most of their Friends from that Area, plus a few Friends from a few random Areas. This is easy as long as my data set remains small.
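As a rough illustration of the approach described above, here is a small in-memory Java sketch. The class name `SocialGraphSketch`, the parameters `meanFriends`, `stdDev`, and `localFraction`, and the choice to pick a fresh random Area for each non-local Friend are assumptions made for this example, not details of the actual project. It does not deduplicate Friends or make friendships mutual.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Small in-memory sketch of the Area -> User -> Friends layout.
// All names and parameters here are illustrative assumptions.
class SocialGraphSketch {
    static class User {
        final int area;   // index of the Area this User belongs to
        final int index;  // position of this User inside its Area
        final List<User> friends = new ArrayList<>();
        User(int area, int index) { this.area = area; this.index = index; }
    }

    // Create every Area and User first, so a User exists before it can become a Friend.
    static List<List<User>> createAreas(int areaCount, int usersPerArea) {
        List<List<User>> areas = new ArrayList<>();
        for (int a = 0; a < areaCount; a++) {
            List<User> users = new ArrayList<>();
            for (int u = 0; u < usersPerArea; u++) {
                users.add(new User(a, u));
            }
            areas.add(users);
        }
        return areas;
    }

    // Give each User a normally distributed number of Friends, mostly from the home Area.
    static void assignFriends(List<List<User>> areas, double meanFriends, double stdDev,
                              double localFraction, Random rng) {
        for (List<User> home : areas) {
            for (User user : home) {
                int total = (int) Math.max(0, Math.round(meanFriends + stdDev * rng.nextGaussian()));
                int local = (int) Math.round(total * localFraction);
                for (int i = 0; i < total; i++) {
                    // The first `local` picks come from the home Area, the rest from a random Area
                    // (which, in this simplified sketch, may happen to be the home Area again).
                    List<User> source = (i < local) ? home : areas.get(rng.nextInt(areas.size()));
                    User candidate = source.get(rng.nextInt(source.size()));
                    if (candidate != user) {
                        user.friends.add(candidate); // skip self; duplicates are not filtered
                    }
                }
            }
        }
    }
}
```

For example, `SocialGraphSketch.assignFriends(SocialGraphSketch.createAreas(3, 50), 10.0, 2.0, 0.8, new Random(42))` builds a tiny graph that fits comfortably in the heap. The same structure is what runs out of memory at larger sizes.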
When I try to create large data sets, I get an OutOfMemoryError because there is no more memory in the heap. I now realize that doing it this way will be impossible if I want to create, say, 30 Areas with 1 million Users per Area and 200 Friends per User. I use almost 2 GB with 1 Area... Now what? My algorithm works if I can create all the Users ahead of time and then simply "give" Friends to each User. But I need the Areas and Users created first. A User must exist in an Area before it can be made a "Friend".
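To put numbers on the sizes mentioned above, here is a back-of-the-envelope calculation in Java. The 4 and 8 bytes per friend entry are assumptions (roughly an `int` ID versus a 64-bit reference); the real footprint of an `ArrayList` of objects depends on the JVM and includes per-object overhead not counted here.

```java
// Rough estimate of how many friend entries the target data set contains
// and how much raw storage they need. Bytes-per-entry values are assumptions.
class FriendListMemoryEstimate {
    static long friendEntries(long areas, long usersPerArea, long friendsPerUser) {
        return areas * usersPerArea * friendsPerUser;
    }

    static long bytesNeeded(long entries, int bytesPerEntry) {
        return entries * bytesPerEntry;
    }

    public static void main(String[] args) {
        long entries = friendEntries(30, 1_000_000, 200);
        System.out.println(entries + " friend entries");                                      // 6000000000
        System.out.println(bytesNeeded(entries, 4) / 1_000_000_000L + " GB at 4 bytes each"); // 24
        System.out.println(bytesNeeded(entries, 8) / 1_000_000_000L + " GB at 8 bytes each"); // 48
    }
}
```

Even with a compact encoding, 6 billion entries is far more than a typical heap can hold, which is consistent with a single Area already using close to 2 GB.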
I like my algorithm; it's simple and easy to understand. What I need is a better way to store this data, since it can't all be held in memory at once. For each User, I will need to access not only the Area that User belongs to, but a few random Areas as well.
1. What technology or data structure should I be putting this data into? Ultimately I just want a User->Friends relationship. The "Area" idea is a way to make this relationship realistic.
2. Should I be using a different language altogether? I know that technologies such as Lucene, Hadoop, etc. were created with Java and are used for large amounts of data... But I have never used them and would like some guidance before I dive into something new.
3. Where should I begin? Clearly I can't just use Java with all the data in memory. But I also need to create these Areas of Users before I can give a User a list of Friends.
Sorry for the semi-long read, but I wanted to lay out exactly where I am so you can guide me in the right direction. Thank you to everyone who took the time to read or help me with this topic.
You need a searchable storage solution to hold your data (instead of holding everything in memory). Either a relational database (such as Oracle, MySQL, or SQL Server) with an O/RM (such as Hibernate) or a NoSQL database such as MongoDB will work just fine.
- Use a database with an ORM tool (JPA with Hibernate, etc.).
- Load data lazily, only when it is actually needed.
- Unload data from the cache/session when it is no longer needed or is inactive.
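The lazy-load and unload points above can be sketched in plain Java without committing to a particular database. This is only an illustration: `LazyFriendCache`, `friendsOf`, and the `loader` function are hypothetical names, and the loader stands in for whatever database or ORM query would actually fetch a User's Friends. An ORM such as Hibernate has its own lazy-loading and caching mechanisms, which this does not reproduce.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Lazy-loading, bounded cache of friend lists keyed by user ID.
// The loader is a hypothetical placeholder for a database or ORM lookup.
class LazyFriendCache {
    private final Function<Long, List<Long>> loader;
    private final Map<Long, List<Long>> cache;

    LazyFriendCache(int maxEntries, Function<Long, List<Long>> loader) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<Long, List<Long>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, List<Long>> eldest) {
                return size() > maxEntries; // unload the least recently used entry
            }
        };
    }

    // Load a user's friends only on first access; later calls are served from the cache.
    List<Long> friendsOf(long userId) {
        List<Long> friends = cache.get(userId);
        if (friends == null) {
            friends = loader.apply(userId);
            cache.put(userId, friends);
        }
        return friends;
    }

    int cachedUsers() {
        return cache.size();
    }
}
```

With `maxEntries` set to a number that fits in the heap, only recently used friend lists stay in memory, and everything else is fetched again from storage on demand.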
Feel free to let me know if anything is unclear.