Coming from an RDBMS background, I have always been of the opinion "Try as hard as possible to use one query, assuming it's efficient," because every request you make to the database is costly. When it comes to MongoDB, it seems like this might not be possible, since you can't join tables.

I realize it's not meant to be relational, but it's also being pushed for things like blogs, forums, and other use cases I'd find easier to approach with an RDBMS.

There are some hang-ups I've had trying to understand the efficiency of MongoDB, or NoSQL in general. If I wanted to get all "posts" associated with certain users (as if they were grouped), in MySQL I'd probably do some joins and get it that way.

In MongoDB, assuming I want to keep the collections separate, would it be efficient to use a large $in: ['user1', 'user2', 'user3', 'user4', ...] ?

Does that method get slow after a while? What if I include 1000 users? And if I wanted to get the list of posts associated with users X, Y, Z, would it be efficient and/or fast in MongoDB to do:

  • Get users array
  • Get posts IN users array

That's 2 queries for a single request. Is that bad practice in NoSQL?
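For concreteness, those two steps might look roughly like the sketch below with the MongoDB C# driver. The collection and field names ("users", "posts", "group_id", "user_id") and the connection details are placeholders, not a real schema.

```csharp
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

var db = new MongoClient("mongodb://localhost:27017").GetDatabase("blog");
var users = db.GetCollection<BsonDocument>("users");
var posts = db.GetCollection<BsonDocument>("posts");

// Query 1: get the users (here, everyone in some hypothetical group).
var userIds = users
    .Find(Builders<BsonDocument>.Filter.Eq("group_id", "groupX"))
    .ToList()
    .Select(u => u["_id"])
    .ToList();

// Query 2: get all posts whose author is IN that array of user ids.
var groupPosts = posts
    .Find(Builders<BsonDocument>.Filter.In("user_id", userIds))
    .ToList();
```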

To answer the question about $in...

I did some performance tests with the following scenario:

~24 million documents in a collection
Look up 1 million of those documents by a different key (indexed)
Using the C# driver from .NET
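(For reference, the "indexed" part of that setup amounts to something like the snippet below; the collection name and the field name "key" are assumptions, since the original test's schema isn't shown.)

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

var coll = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("test")
    .GetCollection<BsonDocument>("docs");

// Ensure the lookup key is indexed; the timings below depend on this.
coll.Indexes.CreateOne(
    new CreateIndexModel<BsonDocument>(
        Builders<BsonDocument>.IndexKeys.Ascending("key")));
```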

Querying 1 at a time, single-threaded: 109s
Querying 1 at a time, multi-threaded: 48s
Querying 100K at a time using $in, single-threaded: 20s
Querying 100K at a time using $in, multi-threaded: 9s

So there is noticeably better performance using a large $in (limited by the maximum query size).
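A minimal sketch of the two approaches being compared, assuming the documents are looked up by a field I'll call "key" (the field name, key values, and connection details are placeholders; Enumerable.Chunk needs .NET 6+):

```csharp
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

var coll = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("test")
    .GetCollection<BsonDocument>("docs");

// Placeholder stand-in for the ~1 million keys being looked up.
var keysToFind = Enumerable.Range(0, 1_000_000).Select(i => (long)i).ToList();

// Slow case: one query (one round trip) per key.
foreach (var key in keysToFind)
{
    var doc = coll.Find(Builders<BsonDocument>.Filter.Eq("key", key))
                  .FirstOrDefault();
}

// Fast case: one $in query per 100K keys, so far fewer round trips.
foreach (var batch in keysToFind.Chunk(100_000))
{
    var docs = coll.Find(Builders<BsonDocument>.Filter.In("key", batch))
                   .ToList();
}
```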

Update: Following on from the comments below about how $in performs with different chunk sizes (all queries multi-threaded):

Querying 10 at a time (100000 batches) = 8.8s
Querying 100 at a time (10000 batches) = 4.32s
Querying 1000 at a time (1000 batches) = 4.31s
Querying 10000 at a time (100 batches) = 8.4s
Querying 100000 at a time (10 batches) = 9s (as per the original results above)

So there does appear to be a sweet spot for how many values to batch up into an $in clause versus the number of round trips.
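Putting that together, a sketch of the chunked, multi-threaded variant might look like the following, with a batch size of 1000 to match the sweet spot measured above (field and collection names are again placeholders):

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

var coll = new MongoClient("mongodb://localhost:27017")
    .GetDatabase("test")
    .GetCollection<BsonDocument>("docs");

// Placeholder stand-in for the ~1 million keys being looked up.
var keysToFind = Enumerable.Range(0, 1_000_000).Select(i => (long)i).ToArray();
var results = new ConcurrentBag<BsonDocument>();

// IMongoCollection is thread-safe, so the batches can be queried in parallel.
Parallel.ForEach(keysToFind.Chunk(1000), batch =>
{
    var filter = Builders<BsonDocument>.Filter.In("key", batch);
    foreach (var doc in coll.Find(filter).ToList())
        results.Add(doc);
});
```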