A pretty common situation. I have a simple application that lets people upload photos and follow other people. As a result, every user has something like a "wall" or an "activity feed" where they see the most recent photos uploaded by their friends (the people they follow).
Most of the functionality is easy to implement. However, when it comes to this activity feed, things can easily turn into a mess for pure performance reasons.
I have come to the following dilemma: I can easily design the activity feed as a normalized part of the database, which will save me write cycles but will enormously increase the complexity of selecting the results for each user (for each photo uploaded within a certain time period, select a certain number whose uploaders I am following; or, for each person I follow, select their photos).
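To make the normalized option concrete, here is a minimal sketch using SQLite from Python. The table names (`follows`, `photos`), the column names, and the helper functions are all made up for illustration; a real schema would differ. The point is that a write touches one row, while every feed read pays for a join.

```python
import sqlite3


def make_normalized_db():
    """Create an in-memory database; all table/column names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE follows (
            follower_id INTEGER NOT NULL,
            followee_id INTEGER NOT NULL,
            PRIMARY KEY (follower_id, followee_id)
        );
        CREATE TABLE photos (
            id          INTEGER PRIMARY KEY,
            uploader_id INTEGER NOT NULL,
            uploaded_at INTEGER NOT NULL  -- Unix timestamp
        );
        CREATE INDEX photos_by_uploader ON photos (uploader_id, uploaded_at);
    """)
    return conn


def upload_photo(conn, uploader_id, uploaded_at):
    """Cheap write: exactly one row, no matter how many followers there are."""
    cur = conn.execute(
        "INSERT INTO photos (uploader_id, uploaded_at) VALUES (?, ?)",
        (uploader_id, uploaded_at),
    )
    return cur.lastrowid


def read_feed_normalized(conn, user_id, since, limit=20):
    """Expensive read: join follows to photos on every feed refresh."""
    rows = conn.execute(
        """
        SELECT p.id
        FROM photos AS p
        JOIN follows AS f ON f.followee_id = p.uploader_id
        WHERE f.follower_id = ? AND p.uploaded_at >= ?
        ORDER BY p.uploaded_at DESC, p.id DESC
        LIMIT ?
        """,
        (user_id, since, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

The `since` argument plays the role of the "certain time period" above, and `limit` the "certain number".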
One optimization would be to introduce a series of threshold constraints which, for example, would let me order the people I follow by the date of their last upload, even exclude some of them to save cycles, and for each followed user select only the last 5 (for instance) uploaded photos.
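A rough sketch of those thresholds, again with SQLite from Python. The schema, the function `capped_feed`, and the parameters `max_followees` and `per_followee` are assumptions made for illustration, and the per-followee loop is only one possible way to express the cap.

```python
import sqlite3


def make_db():
    """In-memory database with hypothetical follows/photos tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER,
                              PRIMARY KEY (follower_id, followee_id));
        CREATE TABLE photos (id INTEGER PRIMARY KEY,
                             uploader_id INTEGER, uploaded_at INTEGER);
        CREATE INDEX photos_by_uploader ON photos (uploader_id, uploaded_at);
    """)
    return conn


def capped_feed(conn, user_id, max_followees=50, per_followee=5):
    """Feed built from the most recently active followees only."""
    # Threshold 1: rank the people I follow by their latest upload and
    # keep only the max_followees most recently active ones.
    active = conn.execute(
        """
        SELECT p.uploader_id, MAX(p.uploaded_at) AS last_upload
        FROM photos AS p
        JOIN follows AS f ON f.followee_id = p.uploader_id
        WHERE f.follower_id = ?
        GROUP BY p.uploader_id
        ORDER BY last_upload DESC
        LIMIT ?
        """,
        (user_id, max_followees),
    ).fetchall()

    # Threshold 2: take at most per_followee photos from each of them.
    feed = []
    for uploader_id, _last_upload in active:
        feed += conn.execute(
            """
            SELECT uploaded_at, id FROM photos
            WHERE uploader_id = ?
            ORDER BY uploaded_at DESC, id DESC
            LIMIT ?
            """,
            (uploader_id, per_followee),
        ).fetchall()

    # Merge the per-followee slices into one newest-first list of photo ids.
    feed.sort(reverse=True)
    return [photo_id for _uploaded_at, photo_id in feed]
```

This bounds the work per refresh at roughly `max_followees * per_followee` rows, at the cost of possibly hiding photos from people who post rarely.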
The second approach would be to introduce a completely denormalized schema for the activity feed, in which every row represents a notification for one of my followers. This means that each time I upload a photo, the DB will put n rows into this "drop bucket", n being the number of people who follow me, i.e. lots of write cycles. If I have such a table, though, I could easily apply optimization techniques such as clever indexing, as well as pruning records older than a certain period of time (a queue).
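As a sketch of what this denormalized "drop bucket" could look like (the table `feed_items`, its columns, the two indexes, and the helper names are all hypothetical choices, not a recommendation):

```python
import sqlite3


def make_feed_db():
    """In-memory database with a hypothetical per-recipient feed table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER,
                              PRIMARY KEY (follower_id, followee_id));
        CREATE TABLE feed_items (
            recipient_id INTEGER NOT NULL,
            photo_id     INTEGER NOT NULL,
            uploader_id  INTEGER NOT NULL,
            created_at   INTEGER NOT NULL  -- Unix timestamp
        );
        -- Serves "latest items for one user" without a join.
        CREATE INDEX feed_by_recipient ON feed_items (recipient_id, created_at);
        -- Serves the age-based pruning.
        CREATE INDEX feed_by_age ON feed_items (created_at);
    """)
    return conn


def fan_out(conn, photo_id, uploader_id, created_at):
    """Expensive write: one feed row per follower of the uploader."""
    conn.execute(
        """
        INSERT INTO feed_items (recipient_id, photo_id, uploader_id, created_at)
        SELECT follower_id, ?, ?, ? FROM follows WHERE followee_id = ?
        """,
        (photo_id, uploader_id, created_at, uploader_id),
    )


def read_fanout_feed(conn, user_id, limit=20):
    """Cheap read: a single indexed lookup on recipient_id."""
    rows = conn.execute(
        """
        SELECT photo_id FROM feed_items
        WHERE recipient_id = ?
        ORDER BY created_at DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


def prune(conn, cutoff):
    """Delete feed rows older than cutoff; returns how many were removed."""
    cur = conn.execute("DELETE FROM feed_items WHERE created_at < ?", (cutoff,))
    return cur.rowcount
```

In this sketch `prune` would be run periodically (for example from a scheduled job) with `cutoff` set to "now minus the retention period".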
Yet another approach that comes to mind is a less denormalized schema in which the server-side application takes some of the complexity off the DB. I have seen that some social applications, such as FriendFeed, rely heavily on storing serialized objects, such as JSON objects, in the DB.
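One way this could look, as a sketch only: each user's feed is kept as a single JSON document and the application does the merging and capping. The table `user_feeds`, the cap `MAX_ITEMS`, and the helper names are assumptions for illustration; this is not a description of how FriendFeed actually stores its data.

```python
import json
import sqlite3

MAX_ITEMS = 100  # hypothetical cap on stored feed entries per user


def make_blob_db():
    """In-memory database holding one serialized feed per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_feeds (user_id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def load_feed(conn, user_id):
    """Read and deserialize a user's feed; empty list if none is stored yet."""
    row = conn.execute(
        "SELECT body FROM user_feeds WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


def push_item(conn, user_id, item, max_items=MAX_ITEMS):
    """Prepend an item in application code and store the capped list back."""
    feed = [item] + load_feed(conn, user_id)
    feed = feed[:max_items]  # the application, not the DB, enforces the cap
    conn.execute(
        "INSERT OR REPLACE INTO user_feeds (user_id, body) VALUES (?, ?)",
        (user_id, json.dumps(feed)),
    )
```

The DB only ever does primary-key reads and writes here; the trade-off is that it can no longer index or query the individual feed entries.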
I am definitely still learning the art of scalable DB design, so I am sure there are many things I have missed or have yet to learn. I would highly appreciate it if someone could at least point me in the right direction.
If your application is successful, then it's a good bet that you will have more reads than writes: I only upload a photo once (a write), but each of my friends reads it every time they refresh their feed. So you should optimize for fast reads, not fast writes, which points in the direction of a denormalized schema.
The problem here is that the amount of data you create could quickly get out of hand if you have a lot of users. Big tables are hard on the DB to query, so again there is a potential performance problem. (There is also the question of having enough storage, but that is much more easily solved.)
If, as you suggest, you can delete rows after a certain period of time, then this could be a good solution. You can reduce that period (up to a point) as you grow and run into performance issues.
Regarding storing serialized objects, it's a good option if these objects are immutable (you won't change them after writing) and you don't need to index them or query on them. Note that if you denormalize your data, it probably means that you have a single table for the activity feed. In that case I see little gain in storing blobs. If you do go the serialized-objects way, consider using a NoSQL solution such as CouchDB: these are better optimized for handling that kind of data, so in principle you should get better performance from the same hardware setup. Note that I'm not suggesting that you move all your data to NoSQL, only the part where it's a better solution.
Finally, a word of caution, spoken from experience: building an application that can scale is hard and takes time better spent elsewhere. You should spend your time worrying about how to get millions of users for your application before you worry about how you are going to serve those millions; the first is the more difficult problem. When you get to the point where you are hugely successful, you can re-architect and rebuild the application.