I have been working on a web project with MongoDB as the database layer. I have a particular entity that I can't map to a document database properly, so I thought it would be good to get some feedback.

Say I have User and Item collections. Users can like or dislike items. Items also have tags, and users can like or dislike tags as well. I need to be able to look up like / dislike counts fast enough.

What I came up with is something like this (for an item):

{
    name: "Item Name",
    statistics : {
        likes:      5,
        dislikes:   6
    },
    tags: [
        { name: "Foo", likes: 10, dislikes: 20 },
        { name: "Bar", likes: 5,  dislikes: 1  }
    ]
}

This is pretty good. The problem is, I also need to know whether a given user liked / disliked a tag or item. So what I came up with next is something like this:

{
    name: "Item Name",
    statistics : {
        likes:      5,
        dislikes:   6
    },
    tags: [
        { 
            name: "Foo", 
            likes: 2, 
            dislikes: 1,
            votes: [
                { user: "user1_id", vote: 1 }, //like 
                { user: "user2_id", vote: 1 }, //like 
                { user: "user3_id", vote: -1 }, //dislike 
            ]
        },
        { 
            name: "Bar", 
            likes: 0,  
            dislikes: 0,
            votes: []
        }
    ]
}

This looks promising, and the biggest advantage I see here is that I can do atomic updates if a user changes his mind and dislikes something that he liked before.

But I expect around 10 tags in each item, with maybe 100 votes each. That gives me around 1000 nested vote objects for each item. I know that MongoDB can handle 16 MB documents, but still, is it OK to store that much data in one document?

Should I go for a normalized model? Maybe with a "tagvotes" collection and an "itemvotes" collection? It actually feels natural to me.

Just wondering if I am thinking relational or rational?

Thanks.

At some point, trying to embed everything becomes impossible in any M x N type of situation as M and N grow. Before you reach that point you need to create a separate collection and do client-side joins, but that doesn't mean you have to normalize everything completely.

In this case, think about what views you want to show the user: clearly you will want to show the item, the number of likes it has, the set of tags that have been applied to it, and maybe how popular each of those tags is. But the actual list of users who liked/disliked the item and liked/disliked each tag can go into a separate document (in a separate collection).

With a schema like that you can do one query to get the item and everything you need to display alongside it. Then, if you need it, just one more query gets the current user's opinions on this item and on all of the tags they have voted on that are relevant to that item.
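As a rough sketch of that two-query pattern, here is how the documents might be laid out, with plain JavaScript arrays standing in for real collections so that it runs without a server. The `votes` collection, its `item` / `tag` / `user` / `vote` fields, and all helper names are assumptions made for illustration; they are not taken from the question.

```javascript
// In-memory stand-ins for two collections (hypothetical layout).

// items: denormalized counts only, no per-user data.
// The counts are sample values; only a few matching votes are listed below.
const items = [
  {
    _id: "item1",
    name: "Item Name",
    statistics: { likes: 5, dislikes: 6 },
    tags: [
      { name: "Foo", likes: 2, dislikes: 1 },
      { name: "Bar", likes: 0, dislikes: 0 },
    ],
  },
];

// votes: one small document per (user, item, tag).
// tag is null when the vote is on the item itself.
const votes = [
  { user: "user1_id", item: "item1", tag: null, vote: 1 },
  { user: "user1_id", item: "item1", tag: "Foo", vote: 1 },
  { user: "user2_id", item: "item1", tag: "Foo", vote: 1 },
  { user: "user3_id", item: "item1", tag: "Foo", vote: -1 },
];

// Query 1: the item plus everything needed to render it.
// Against a real server this would be roughly db.items.findOne({ _id: itemId }).
function findItem(itemId) {
  return items.find((doc) => doc._id === itemId) || null;
}

// Query 2: only the current user's votes for that item.
// Roughly db.votes.find({ item: itemId, user: userId }).
function findUserVotes(itemId, userId) {
  return votes.filter((v) => v.item === itemId && v.user === userId);
}

// Client-side join: merge the user's votes into the item view.
function itemViewFor(itemId, userId) {
  const item = findItem(itemId);
  if (item === null) return null;
  const mine = findUserVotes(itemId, userId);
  const voteOn = (tag) => {
    const hit = mine.find((v) => v.tag === tag);
    return hit ? hit.vote : 0; // 0 means "has not voted"
  };
  return {
    name: item.name,
    statistics: item.statistics,
    myVote: voteOn(null),
    tags: item.tags.map((t) => ({ ...t, myVote: voteOn(t.name) })),
  };
}
```

Under this layout the item document stays small and stable in size, and the per-user lookup touches only that user's vote documents; an index on the (item, user) pair would be the natural companion in a real deployment.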

is it OK to store that much data in one document?

I don't see a problem with the amount of data you store per object, but your read/update patterns are worrying: every time you fetch the item, you will also fetch all of the votes, each user's id, etc. Also, when adding votes, you will grow the object. Sometimes MongoDB will have to reallocate your object, which takes a bit of time. Over time it will learn that you are often growing objects, and the padding factor will increase, but frequently growing objects isn't the best idea.

I can do atomic updates if a user changes his mind and dislikes something that he liked before.

This is a bit tricky. You can use $pull and $push, but off the top of my head I don't know how to also keep the likes and dislikes counts in sync. Also, what happens if a user actually changes his mind? You'd have to do both a $push and a $pull on the same array in one update, which isn't possible if I remember correctly.
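If the per-user votes live in their own documents rather than in an embedded array, a change of mind reduces to overwriting one vote and adjusting the counters by a delta. Here is a minimal sketch of computing that delta as a `$inc` update document. `counterDelta` is a hypothetical helper, and the field paths assume the `statistics.likes` / `statistics.dislikes` layout from the question. Making the vote write and the counter update atomic together is a separate concern that this sketch does not address.

```javascript
// Vote values: 1 = like, -1 = dislike, 0 = no vote.
// Returns an update document for the item's counters, or null when
// the counters do not need to change.
function counterDelta(oldVote, newVote) {
  const inc = {};
  const likes = (newVote === 1 ? 1 : 0) - (oldVote === 1 ? 1 : 0);
  const dislikes = (newVote === -1 ? 1 : 0) - (oldVote === -1 ? 1 : 0);
  if (likes !== 0) inc["statistics.likes"] = likes;
  if (dislikes !== 0) inc["statistics.dislikes"] = dislikes;
  return Object.keys(inc).length > 0 ? { $inc: inc } : null;
}
```

For example, `counterDelta(1, -1)` (a like turned into a dislike) yields `{ $inc: { "statistics.likes": -1, "statistics.dislikes": 1 } }`, so both counters move in a single update on the item document.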

Just wondering if I am thinking relational or rational?

Both. This is a relational problem :-)

Now I wanted to conclude that you should denormalize the counts and keep the relations in a different collection, but Hightechrider already wrote that. Not fast enough. ;-)