I have spent a couple of days researching the pros and cons of MySQL versus NoSQL solutions (specifically MongoDB) for my project.

The project needs to be able to eventually scale to handle hundreds of thousands of concurrent users - millions of users in total. The site is heavily user focused and will interact with the database as much as or even more than a typical site like Facebook - it is very relational, all functionality is based on the relations of the user and their relationships with other users. It is also data heavy - lots of files, images, audio, messaging, personal news feeds etc.

I like the look of MongoDB a lot, I like the way it works, and I like the way it scales - but I can't get my head around how it would work with a site like the one I describe. Would all interactions for a particular user have to be stored in a single document?

I am however very comfortable using MySQL and like the relational aspect of it. I am just worried that, without a lot of work, there will be scalability issues with this project - although perhaps with memcached and sharding this won't be a problem?

I'd like to hear from people with experience of both databases on large projects: out of MySQL and MongoDB, which is the right tool for this job?

If the data is highly relational, use a relational database. If not, don't. NoSQL is great, don't get me wrong, but it's not right for all tasks. It may be right for your job, but the only way to find out is to build some tests for your specific use case. Add a lot of dummy data (millions if not hundreds of millions of rows). And then load test it.
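As a rough illustration of that kind of test, here is a minimal sketch that fills a database with synthetic rows and times a friend lookup. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere; the table layout and the function names (build_dummy_db, time_friend_lookups) are made up for the example, and a real test would target your actual MySQL or MongoDB setup with far larger row counts.

```python
import random
import sqlite3
import time


def build_dummy_db(n_users, friends_per_user, seed=0):
    """Create an in-memory database filled with synthetic users and friendships."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE users_friends (user_id INTEGER, friend_id INTEGER)")
    conn.execute("CREATE INDEX idx_uf_user ON users_friends (user_id)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        ((i, f"user{i}") for i in range(n_users)),
    )
    # Each user gets a fixed number of randomly chosen friends.
    conn.executemany(
        "INSERT INTO users_friends (user_id, friend_id) VALUES (?, ?)",
        (
            (user_id, rng.randrange(n_users))
            for user_id in range(n_users)
            for _ in range(friends_per_user)
        ),
    )
    conn.commit()
    return conn


def time_friend_lookups(conn, n_users, n_queries=1000, seed=1):
    """Return the average seconds per 'names of this user's friends' query."""
    rng = random.Random(seed)
    start = time.perf_counter()
    for _ in range(n_queries):
        conn.execute(
            "SELECT u.name FROM users_friends f "
            "JOIN users u ON u.id = f.friend_id WHERE f.user_id = ?",
            (rng.randrange(n_users),),
        ).fetchall()
    return (time.perf_counter() - start) / n_queries


if __name__ == "__main__":
    db = build_dummy_db(n_users=100_000, friends_per_user=10)
    print("average seconds per lookup:", time_friend_lookups(db, 100_000))
```

Scale the row counts up towards your expected production size and watch how the timing changes; numbers from a small in-memory run like this say very little on their own.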

As far as scaling goes, that's more a function of how you build your application than of the backend you choose. Do you have a solid schema? Do you have a strong cache layer with write-through caching? Do you access the backend as efficiently as possible (queries and such)? Can you shard based on your application?
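To make "write-through caching" concrete, here is a minimal sketch of the idea: every write goes to the backend and to the cache in the same operation, so reads served from the cache do not go stale. The WriteThroughCache class is hypothetical, and plain dicts stand in for both memcached and the database.

```python
class WriteThroughCache:
    """Cache in front of a backing store; every write updates both."""

    def __init__(self, store):
        self._store = store  # dict-like backend (stand-in for the database)
        self._cache = {}     # in-process stand-in for memcached

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._store[key]  # cache miss: read from the backend
        self._cache[key] = value
        return value

    def put(self, key, value):
        self._store[key] = value  # write to the backend first
        self._cache[key] = value  # then update the cache in the same call


if __name__ == "__main__":
    backend = {}
    cache = WriteThroughCache(backend)
    cache.put("user:1", {"name": "Ann"})
    print(cache.get("user:1"), backend["user:1"])
```

In a real deployment the cache would live in a separate process, and you would also need to decide what happens when one of the two writes fails.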

Those are the kind of questions that are appropriate here. Not "which will scale better for me". And not "which is the right tool". Both can do the job fine. Which is best is up to you...

Clearly, there is no silver bullet here. But I have to challenge one assumption you have made:

... it is very relational, all functionality is based on the relations of the user and their relationships with other users...

OK, I want you to picture having 100M users in a relational database and start building this model. Let's try something simple: grab the names of a user's friends.

How do you get a user's friends? Well, you go to the users_friends table. If each user has just 10 friends, that table contains a billion rows. If users have a more modest 100 friends, you have 10B rows.

So now you have a user and a list of friend IDs. How do we get the friends' names? You go through the list of 100 IDs and pull down each of the friends. Perfect.

Now, if you want to show one user the names of all of their friends, all you have to do is join the 100M record table to the 10B record table. This is not a simple task. Scaling joins becomes much harder and more expensive as the dataset grows.

So, to make this easier, you're probably going to run a for loop and manually collect the records for each friend. You have to do this because the friends are spread across multiple servers, so each "look-up" has to be done individually.
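Here is a minimal sketch of what that for loop tends to look like. The sharding rule (user ID modulo the number of shards), the NUM_SHARDS value and the function names are all assumptions made for the illustration, and each "server" is just a dict.

```python
NUM_SHARDS = 4  # hypothetical number of database servers


def shard_for(user_id):
    """Pick the shard that holds a given user (simple modulo rule)."""
    return user_id % NUM_SHARDS


def build_shards(users):
    """Distribute {user_id: name} records across the shards."""
    shards = [dict() for _ in range(NUM_SHARDS)]
    for user_id, name in users.items():
        shards[shard_for(user_id)][user_id] = name
    return shards


def collect_friend_names(shards, friend_ids):
    """One individual look-up per friend, each against that friend's shard."""
    names = []
    for friend_id in friend_ids:
        shard = shards[shard_for(friend_id)]
        names.append(shard[friend_id])
    return names


if __name__ == "__main__":
    shards = build_shards({1: "Ann", 2: "Bob", 5: "Cy", 8: "Di"})
    print(collect_friend_names(shards, [2, 5, 8]))
```

Note that nothing here is a join: it is a series of look-ups by key, which is the point made in the paragraphs that follow.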

Already you've broken your "relational model".

What about the friends list? Is keeping a table of 10B records really practical? Why not just keep a list of friend IDs with each user? Why do an extra query?
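As a sketch of that alternative, each user record below carries its own list of friend IDs, so no separate users_friends table is needed. The field names (name, friend_ids) and the dict used as the store are assumptions for the example, not any particular database's API.

```python
# Stand-in for a document or key-value store: user id -> user document.
users = {
    1: {"name": "Ann", "friend_ids": [2, 3]},
    2: {"name": "Bob", "friend_ids": [1]},
    3: {"name": "Cy", "friend_ids": [1]},
}


def names_of_friends(store, user_id):
    """Fetch the user by key, then fetch each friend by key."""
    user = store[user_id]
    return [store[friend_id]["name"] for friend_id in user["friend_ids"]]


if __name__ == "__main__":
    print(names_of_friends(users, 1))
```

The trade-off is that a friendship now lives in two documents, so adding or removing a friend means updating both.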

If you notice the pattern here, we've basically broken down the "very relational" model into something that's effectively key-value look-ups. Of course, the key-value model will scale much better. And so, MongoDB looks like a good fit here.

Don't get me wrong, there are lots of good uses for relational databases. But when you're talking about handling millions of individual key-value style requests, you probably want to look at a NoSQL database.