I'm building a website that depends on serving lots of small MP3 files (approximately 10-15 KB each) very quickly. Each file contains a word pronunciation, and each user will download 20-30 of them per minute while they're using the site. Each user might download 200 a day, and I anticipate 50 simultaneous users. There will be approx. 15,000 separate files eventually.
What would be the best way to store, manage, call and play these files as needed? Do I need specialist hosting to deal with all the small files, or will they behave happily in one large folder (using a standard host)? Any delays will ruin the experience.
Having done a bit more searching, I think the problem could be solved with either:
- Something like Photobucket but for audio instead, with its own API
- Another kind of 'bucket hosting' solution where you can upload thousands of files at a reasonable price, and request them easily
Does anyone know of such a product?
If you want (or need) to keep the files on disk rather than as BLOBs in a database, there are a few things you need to keep in mind.
Many (but not necessarily all) file systems don't work too well with folders that contain many files, so you probably don't want to store everything in one big folder. That doesn't mean you need specialist hosting, though.
The key is to distribute the files into a folder hierarchy, based on some hash function. As an example, we'll use the MD5 of the filename here, but it's not particularly important which algorithm you use or what data you're hashing, as long as you're consistent and have the data available when you need to locate a file.
In general, the output of a hash function is formatted as a hexadecimal string: for example, the MD5 of "foo.mp3" is 10ebb1120767e9de166e0f5905077cb1.
You can create 16 folders, one for each of the possible hexadecimal characters, so you have a directory named 0, one named 1, and so on up to f.
In each of those 16 folders, repeat this structure, so you have two levels. (0/0/, 0/1/, ..., f/f/)
What you then do is simply place the file in the folder dictated by its hash. You can use the first character to determine the first folder, and the second character to determine the subfolder. Using that scheme, foo.mp3 would go in 1/0/, bar.mp3 goes in b/6/, and baz.mp3 goes in 1/b/.
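The placement rule above can be sketched in a few lines of Python. This is only an illustration, not a definitive implementation: the helper name `hashed_path` and its `levels` parameter are my own assumptions, and it hashes the UTF-8 bytes of the filename with MD5, as in the example.

```python
import hashlib
import os


def hashed_path(filename, levels=2):
    """Return the relative path for `filename` inside the hashed hierarchy.

    Hypothetical helper: uses one folder level per leading hex character
    of the MD5 of the filename (levels must be at least 1).
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    folders = list(digest[:levels])  # e.g. two single-character folder names
    return os.path.join(*folders, filename)


# Print where each example file would be stored.
for name in ("foo.mp3", "bar.mp3", "baz.mp3"):
    print(name, "->", hashed_path(name))
```

Because the path is derived purely from the filename, the same call works both when storing a file and when serving it later.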
Since these hash functions are intended to distribute their values evenly, your files will be spread fairly evenly across these 256 folders, which reduces the number of files in any single folder. Statistically, 15,000 files would result in an average of nearly 60 per folder (15,000 / 256 is about 58.6), which should be no problem.
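If you want to check the spread before committing to a layout, a quick simulation along these lines may help. The filenames below are made up (`word00000.mp3` and so on) and only stand in for the real files; with your actual names the per-folder counts will differ.

```python
import hashlib
from collections import Counter

# Hypothetical filenames standing in for the real pronunciation files.
names = [f"word{i:05d}.mp3" for i in range(15000)]


def bucket(filename):
    """Return the two hex characters that select the folder pair."""
    return hashlib.md5(filename.encode("utf-8")).hexdigest()[:2]


counts = Counter(bucket(n) for n in names)
average = len(names) / 256  # 16 * 16 possible two-level folders

print(f"folders used: {len(counts)}")
print(f"average per folder: {average:.1f}")
print(f"largest folder: {max(counts.values())}")
print(f"smallest folder: {min(counts.values())}")
```

If the largest folder turns out far above the average, that is the clumping case where a deeper hierarchy or a different hash becomes worth considering.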
If you're unlucky and the hash function you chose ends up clumping too many of your files in one folder anyway, you can extend the hierarchy to more than 2 levels, or simply use a different hash function. In both cases, you need to redistribute the files, but you only need to do that once, and it shouldn't be too much trouble to write a script to do it for you.
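A redistribution script could look roughly like the sketch below. The function names `hashed_dir` and `redistribute` are hypothetical. It assumes filenames are unique across the whole tree and that MD5 of the filename is still the hash in use; swap in another hash if that is what you are changing.

```python
import hashlib
import os
import shutil


def hashed_dir(filename, levels=2):
    """Relative folder for `filename`: one level per leading hex character."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return os.path.join(*digest[:levels])


def redistribute(source_root, target_root, levels=2):
    """Move every file under source_root into the hashed layout under target_root.

    Assumes filenames are unique; returns the number of files processed.
    """
    # Collect paths first so folders created during the move are not re-walked.
    sources = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(source_root)
        for name in files
    ]
    for source in sources:
        name = os.path.basename(source)
        target_dir = os.path.join(target_root, hashed_dir(name, levels))
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, name)
        if os.path.abspath(source) != os.path.abspath(target):
            shutil.move(source, target)
    return len(sources)
```

Calling it with the same directory for both arguments and a larger `levels` value deepens an existing hierarchy in place. Empty folders left behind are not removed by this sketch.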
For managing your files, you will likely want a small database indexing what files you currently have, but this doesn't necessarily need to be used for anything other than managing them. If you know the name of the file, and you use the filename as input to your hash function, you can just calculate the hash again and find its location that way.
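To illustrate that split between the index and the lookup, here is a minimal sketch using SQLite. The table layout, the `word` column, the `locate` function and the `/srv/audio` root are all assumptions made up for the example. The point is only that the index records which files exist, while the location is always recomputed from the name.

```python
import hashlib
import os
import sqlite3


def hashed_path(root, filename, levels=2):
    """Recompute the on-disk location from the filename alone."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return os.path.join(root, *digest[:levels], filename)


# Hypothetical index: it records which files exist, not where they live.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (name TEXT PRIMARY KEY, word TEXT)")
db.executemany(
    "INSERT INTO files (name, word) VALUES (?, ?)",
    [("foo.mp3", "foo"), ("bar.mp3", "bar")],
)


def locate(root, filename):
    """Return the path for an indexed file, or None if it is not in the index."""
    row = db.execute(
        "SELECT 1 FROM files WHERE name = ?", (filename,)
    ).fetchone()
    return hashed_path(root, filename) if row else None


print(locate("/srv/audio", "foo.mp3"))      # a path under /srv/audio
print(locate("/srv/audio", "missing.mp3"))  # None
```

When serving files you could skip the database entirely and call `hashed_path` directly, keeping the index for housekeeping only.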