By upper hundreds of thousands of requests/second I mean 60,000 -> 90,000+ requests/second.

My setup is as follows:

user ---> web application --> message queue --> parser --> database?

I should point out that the parser can currently parse/stuff around 18,750 records/second using COPY, so we are limited on that end until we start adding more parsers -- this isn't a huge concern for me right now.
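For context, here is roughly what that COPY-based bulk load step might look like -- a minimal sketch assuming PostgreSQL (since COPY is mentioned) and the Ruby pg gem; the database name, file path, and column names (taken from the queries below) are all placeholders:

require 'pg'

conn = PG.connect(dbname: 'game')  # connection details are placeholders

# One server-side COPY of N rows is far cheaper than N INSERTs;
# the CSV file must be readable by the postgres server process.
conn.exec("COPY actions (player, type, amount, hands) FROM '/tmp/actions.csv' WITH CSV")

conn.close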

I have a system that needs to be able to bulk upload as many records as it can, as fast as it can. This same system (or it could be a different one, depending on how you'd approach it) should also be able to respond to analytical-type queries like this:


wonq = "select sum(amount) from actions where player = '@player' and " +
       "(type = 'award' or type = 'return') and hands = hands_num"

lostq = "select sum(amount) from actions where player = 'player' and " +
        "type != 'award' and type != 'return' and hands = hands_num"

..... run 10-15 thousand times (PER USER), since they are keyed off to another table. Obviously we paginate these results at 10/page for now.
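Since these run thousands of times per user, here is a minimal sketch of the same two queries with bound parameters and LIMIT/OFFSET pagination -- assuming PostgreSQL and the Ruby pg gem; the database name and the placeholder inputs are made up, and the $1-style parameters avoid quoting mix-ups like the stray '@' above:

require 'pg'

conn = PG.connect(dbname: 'game')
player, hand_num, page_num = 'alice', 42, 1  # placeholder inputs

# Amount won by one player in one hand.
won = conn.exec_params(
  "select sum(amount) from actions " \
  "where player = $1 and (type = 'award' or type = 'return') and hands = $2",
  [player, hand_num]
)

# Amount lost by the same player in the same hand.
lost = conn.exec_params(
  "select sum(amount) from actions " \
  "where player = $1 and type != 'award' and type != 'return' and hands = $2",
  [player, hand_num]
)

# Paginate a player's hands at 10 per page, as described above.
hands_page = conn.exec_params(
  "select distinct hands from actions where player = $1 " \
  "order by hands limit 10 offset $2",
  [player, (page_num - 1) * 10]
)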

I have looked at the following (assuming these are all on a single server):

  • mysql (reg. ordinary rdbms) -- was able to get into the 15-20 thousand requests/second range; under current conditions, any attempt to scale this out means a separate host/database every time we need to scale -- that's not doable

  • couchdb (document oriented db) -- didn't break 700 requests/second; I was really hoping this one would save our ass -- not a chance!

  • vertica (columnar oriented db) -- was hitting 60,000 requests/second; closed source and very pricey; this is still an option but I personally didn't like it at all

  • tokyocabinet (hash based db) -- is currently clocking in at 45,000 inserts/second and 66,000 selects/second; yesterday when I wrote this I was using an FFI-based adapter that was performing around 5,555 requests/second; this is by far THE fastest, most awesome database I've seen yet!! (figures gathered with a crude harness like the one sketched after this list)

  • terracotta -- (vm cluster) currently evaluating this along with jmaglev (can't wait until maglev itself comes out) -- this is the SLOWEST!
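For reference, requests/second figures like the ones above can be compared with a tiny backend-agnostic harness along these lines -- a sketch in which `store` is whichever client object you're testing and `put` merely stands in for that client's insert call:

require 'benchmark'

# Crude throughput check: time n inserts against whatever client
# object responds to put (tokyo tyrant, a memcache client, etc.)
# and report operations per second.
def ops_per_second(store, n = 100_000)
  elapsed = Benchmark.realtime do
    n.times { |i| store.put("key:#{i}", "value:#{i}") }
  end
  n / elapsed
end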

maybe I'm just approaching this problem wrong, but I've ALWAYS heard that RDBMSes were slow as all hell -- so where are these blazing fast systems I've heard about?

Testing Conditions:

So people know, the specs on my dev box are:


dual 3.2GHz Intel, 1 GB RAM

MySQL my.cnf edits were:


key_buffer = 400M               # was 16M
innodb_log_file_size = 100M     # non-existent before
innodb_buffer_pool_size = 200M  # non-existent before

UPDATE:

It turns out that terracotta may have a place in our application structure, but it plain WILL NOT be replacing our database any time soon: its speeds are terrible and its heap utilization sucks.

On the other hand, I was thrilled to see that tokyocabinet's NON-FFI ruby library (meaning tyrant/cabinet) is super fast, and right now it is in first place.
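For anyone curious what using that library looks like, here is a minimal sketch against a running ttserver, based on the tokyotyrant gem's documented API (the host, port, and key layout are made up):

require 'tokyotyrant'
include TokyoTyrant

rdb = RDB.new
# Connect to a ttserver instance (1978 is the default port).
rdb.open('localhost', 1978) or abort("open error: #{rdb.errmsg(rdb.ecode)}")

rdb.put('player:123:hand:42', '150')  # insert
puts rdb.get('player:123:hand:42')    # fetch it back
rdb.close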

For crazy-large scalability, you'll want to focus on two things:

  • Sharding: Split your data set into groups that don't overlap, and have an easy, fast way to map from a request to a server (player names starting with a-f on server 1, g-q on server 2... etc.); see the sketch after this list.
  • Caching: Use Memcache to remember the output of some really common select queries, so you don't have to hit disk as often.
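A minimal sketch of that player-to-server mapping in Ruby (the server names and letter ranges are illustrative):

SHARDS = {
  ('a'..'f') => 'server1',
  ('g'..'q') => 'server2',
  ('r'..'z') => 'server3',
}

# Map a player name to the server that owns its data.
def shard_for(player)
  first = player.downcase[0, 1]
  SHARDS.each { |range, server| return server if range.include?(first) }
  'server3'  # fallback for names outside a-z
end

shard_for('Phil')  # => "server2"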

Well, the big player in the game is Oracle, but that's big bucks.

If you want to go cheap then you'll have to pay the price in different terms:

  • by partitioning the DB across multiple instances and distributing the load.
  • by potentially caching results so actual DB access is reduced (a sketch of a read-through cache follows).
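A hedged sketch of that caching idea, assuming the classic Ruby memcache-client gem plus the pg gem; the key scheme and TTL are made up:

require 'memcache'
require 'pg'

CACHE = MemCache.new('localhost:11211')

# Read-through cache: serve hot sum(amount) results from memcached,
# hit the database only on a miss.
def won_amount(conn, player, hand_num)
  key = "wonq:#{player}:#{hand_num}"
  cached = CACHE.get(key)
  return cached if cached

  value = conn.exec_params(
    "select sum(amount) from actions " \
    "where player = $1 and (type = 'award' or type = 'return') and hands = $2",
    [player, hand_num]
  ).getvalue(0, 0)

  CACHE.set(key, value, 300)  # expire after five minutes
  value
end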

user ---> web application --> message queue --> parser --> database?

What do you need the message queue for? Those are normally a big performance problem.