I am currently developing a prototype of a web application that aggregates a large number of text records from many customers. This data needs to be displayed frequently and updated frequently. Right now I keep the content in a MySQL database and use the NHibernate ORM layer to interact with the DB. I have tables defined for customers, roles, distribution, tags, notices, etc. I like this solution because it works well and my code looks nice and sane, but I am also concerned about how MySQL will perform once our database grows to a substantial size. I suspect it may struggle to perform join operations fast enough.

This has made me consider non-relational database systems such as MongoDB, CouchDB, Cassandra, or Hadoop. Unfortunately, I have no experience with any of them. I have read some good reviews of MongoDB and it looks interesting. I am happy to spend the time learning if one of them turns out to be the way to go. I would much appreciate anyone offering points or issues to consider when choosing a non-relational DBMS.

The other answers here have focused mainly on the technical aspects, but I think there are important points to be made that focus on the startup side of things:

  • Availability of talent. MySQL is very common, and you will probably find it easier (and, more importantly, cheaper) to find developers for it than for the more rarefied database systems. This larger developer base also means more tutorials, a more active support community, etc.
  • Ease of development. Again, because MySQL is so common, you will find it is the DB of choice for many systems and services. This common ground can make any external integration a little easier.
  • You are planning for a scenario that may never exist, and it is manageable if it does. Very few companies (never mind startups) come close to MySQL's limits. With all due respect (and I am just guessing here), the chance that your startup will ever hit the kind of data throughput that would cripple a properly structured, well-resourced MySQL DB is close to zero.

Essentially, don't spend time (== money) worrying about which DB to use, as MySQL can handle a large amount of data, is well proven, and is well supported.

Returning to the technical side... Something that will have a far greater effect on the speed of your application than the choice of DB is how effectively data can be cached. An effective cache can dramatically reduce DB load and improve the overall responsiveness of an application. I would spend your time investigating caching solutions and making sure you are developing your app in such a way that it can make the best use of those solutions.

FYI, my caching solution of choice is memcached.
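
To make the caching point concrete, here is a minimal sketch of the common "cache-aside" pattern in Python: look in the cache first, fall back to the DB on a miss, and invalidate the cached entry on every write. Everything named here is an assumption made up for illustration: `TTLCache` is a tiny in-process stand-in for a cache server such as memcached (not the memcached client API), and `fake_db`, `load_record_from_db`, `get_record` and `update_record` are hypothetical placeholders for your real data access code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-process stand-in for a cache server such as memcached."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._store[key] = (self._clock() + ttl, value)

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


# Stand-ins for the real database (assumptions for this sketch only).
fake_db: Dict[int, str] = {42: "first post"}
stats = {"db_reads": 0}
cache = TTLCache()


def load_record_from_db(record_id: int) -> str:
    """Placeholder for an expensive query, e.g. one with several joins."""
    stats["db_reads"] += 1
    return fake_db[record_id]


def get_record(record_id: int, ttl: float = 60.0) -> str:
    """Cache-aside read: try the cache, hit the DB only on a miss."""
    key = f"record:{record_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_record_from_db(record_id)
    cache.set(key, value, ttl)
    return value


def update_record(record_id: int, text: str) -> None:
    """Write to the DB, then invalidate so the next read is fresh."""
    fake_db[record_id] = text
    cache.delete(f"record:{record_id}")
```

With this shape, repeated calls to `get_record(42)` only reach the database once per TTL window, and `update_record` keeps frequently updated records from being served stale. The TTL value and key naming scheme are arbitrary choices here and would need tuning for a real workload.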

So far nobody has mentioned PostgreSQL as an alternative to MySQL on the relational side. Be aware that the MySQL libs are pure GPL, not LGPL. That may force you to release your code if you link to them, although maybe someone with more legal experience could tell you more about the implications. On the other hand, linking to a MySQL library is not the same as simply connecting to the server and issuing commands; that can be done with closed source.

PostgreSQL is often considered the best free alternative to Oracle, and its BSD-style license should be more business friendly.

If you do prefer a non-relational database, bear in mind that the transition will be more dramatic. If you ever have to customize your database, take the license type into account as well.

There are three things that really have a deep impact on which database choice is best and that you do not mention:

  1. The size of your data, or whether you need to store files within your database.
  2. A great number of reads and very few (even restricted) writes. In that case, more than a database, you need a directory such as LDAP.
  3. The importance of data distribution and/or replication. Most relational databases can be replicated more or less well, but because of their concept/design they do not handle data distribution as well... and will you handle so much data that it does not fit into one server, or have access rights that need special separate/extra servers?

That said, many people go for a non-relational database simply because they do not like learning SQL.