This is inspired by the article "Why are Facebook, Digg, and Twitter so hard to scale?".

What database systems (however obscure) are available that can handle this kind of data better?

Thank you for your help!

Using a database system whose data model is tailored to the data structure you are trying to represent is often beneficial. Social networks lend themselves perfectly to graph databases, for example AllegroGraph, Neo4j etc.

There's a good article on the Neo4j blog about how to represent social networks in a graph database, with examples using Neo4j.

The advantage of graph databases is that data is stored so that traversing relationships between entities is an extremely fast operation, allowing you to traverse complex networks quickly. These operations would typically be (at best) expensive join operations in current implementations of relational databases. Just like relational databases, graph databases have some trouble scaling to multiple hardware nodes. However, the need for multiple hardware nodes should be smaller with a graph database than with a relational database for social-network types of data; a couple of billion nodes on one machine is no problem. Scaling to multiple hardware nodes is where key-value stores shine, since entities in a key-value store are completely isolated from one another. The problem here is rather that nothing is isolated in a social network, so emulating the relationships requires multiple queries to the database, one for each entity. This is slow, especially for friend-of-a-friend types of queries, where you only discover one level of friends with each query.
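To make the friend-of-a-friend point concrete, here is a minimal Python sketch. It is a toy illustration only, not Neo4j's API or any real key-value store's API: the `KeyValueStore` class, both function names, and the sample data are made up for this example. Each `get()` call stands in for one round trip to the database.

```python
from collections import defaultdict


class KeyValueStore:
    """Hypothetical toy store; every get() represents one database query."""

    def __init__(self):
        self._data = {}
        self.lookups = 0

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        self.lookups += 1
        return self._data.get(key, [])


def friends_of_friends_kv(store, user):
    """Emulate the traversal with one query per entity visited."""
    direct = set(store.get(user))         # query 1: the user's own friend list
    result = set()
    for friend in direct:
        result.update(store.get(friend))  # one further query per direct friend
    return result - direct - {user}


def friends_of_friends_graph(adjacency, user):
    """Same question answered by following in-memory references directly."""
    direct = adjacency[user]
    result = set()
    for friend in direct:
        result |= adjacency[friend]
    return result - direct - {user}


# Made-up sample friendships (undirected).
edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "erin"),
    ("carol", "dave"),
]

adjacency = defaultdict(set)
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)

store = KeyValueStore()
for person, friends in adjacency.items():
    store.put(person, sorted(friends))

fof_kv = friends_of_friends_kv(store, "alice")
fof_graph = friends_of_friends_graph(adjacency, "alice")
```

Both functions return the same set of people, but the key-value version needs one query for the user plus one per direct friend, so the query count grows with the size of the friend list and again with every extra level of depth.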

Disclaimer: I'm a member of the Neo4j team.

Check out the NOSQL debrief; it has interesting resources on several distributed, non-relational databases:

Presentation slides and videos
Intro session - Todd Lipcon, Cloudera (slides, video1, video2)
Voldemort - Jay Kreps, LinkedIn (slides pdf ppt, video1, video2)
Cassandra - Avinash Lakshman, Facebook (slides pdf ppt, video)
Dynomite - Cliff Moon, Powerset (slides, video)
HBase - Ryan Rawson, StumbleUpon (slides, video)
Hypertable - Doug Judd, Zvents (slides pdf ppt, video1, video2)
CouchDB - Chris Anderson (slides, video1, video2)

VPork - Jon Travis, SpringSource (slides, video)
MongoDB - Dwight Merriman, 10gen (slides, video)
Infinite Scalability - Jonas S Karlsson, Google (slides, video)

Some videos by Digg's John Quinn, the rest by Martin Dittus. Pictures by Russ Garrett.

For the links to the slides and videos, check the original page; there are way too many of them to paste.

You might also want to read "NoSQL: If Only It Was That Easy" (as well as the NoSQL entry on Wikipedia).