I know there are many sites that involve uploading files. What I would like to know is whether this is possible: when a user clicks "Browse" and chooses a file, can the website automatically search the site's database for similar files before the file is uploaded? Something like the automatic "Related Questions" that appear when you ask a question on this site.
Sure, that's possible. But you will need to come up with your own definition of what counts as similar, as well as an algorithm for finding it.
File Type variations
Different file types need to be compared in different ways. For instance, a text file could be run through a diff to find similar files, but comparing images or videos for similarity is substantially harder.
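For the text-file case, a diff-style comparison can be sketched with Python's standard difflib module. This is only an illustration: the function names, the `stored` dictionary, and the 0.6 threshold are assumptions, not something the site in question is known to use, and it presumes the text content is already available on the server.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two texts are identical."""
    return SequenceMatcher(None, a, b).ratio()


def find_similar_texts(new_text: str, stored: dict, threshold: float = 0.6) -> list:
    """Compare new_text against every stored text (hypothetical name -> text map).

    Returns (name, score) pairs at or above the threshold, best match first.
    """
    scored = ((name, text_similarity(new_text, text)) for name, text in stored.items())
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

Nothing comparable exists in the standard library for images or video, which is part of why those types are harder.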
Infeasibility of comparisons
Also, comparing against a large number of files is very expensive because it is typically done pairwise. Some indexing techniques may improve the efficiency of the search, but I don't see a good way to do this quickly.
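One indexing idea, shown here only as a rough sketch for text content, is an inverted index from words to file ids. It does not make the comparison itself cheaper, but it narrows the set of files that need a pairwise comparison at all. The class name, the whitespace tokenization, and the `min_shared` cutoff are all assumptions made for illustration.

```python
from collections import defaultdict


class TokenIndex:
    """Maps each word to the ids of the files that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> set of file ids

    def add(self, file_id, text):
        """Index a file's text under each distinct lowercase word."""
        for token in set(text.lower().split()):
            self._postings[token].add(file_id)

    def candidates(self, text, min_shared=2):
        """Return ids of files sharing at least min_shared distinct words with text."""
        counts = defaultdict(int)
        for token in set(text.lower().split()):
            for file_id in self._postings.get(token, ()):
                counts[file_id] += 1
        return {file_id for file_id, n in counts.items() if n >= min_shared}
```

Only the files returned by `candidates` would then go through the more expensive comparison.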
Crowd Source Alternative
Another alternative is to have the users of the site indicate the similarities; that way you just display a list of the files most often marked as similar. Of course, this does not help when uploading a brand new file, but it can give you insight into what users find similar.
What many sites do to compare the similarity of submissions is to let users tag items. If one item shares many of the same tags with another, they are probably similar. This is probably the simplest approach.
This has the advantage that any content type can be compared with any other content type. So text files that have the same tags as a video can be presented as similar.
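A minimal sketch of tag-based similarity, assuming tags are stored as a set per item. Jaccard overlap is used here as one reasonable way to score "shares many of the same tags"; it is a choice made for this example, and the function names and `catalog` structure are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Share of tags the two sets have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_by_tags(new_tags, catalog: dict, top_n: int = 5) -> list:
    """Rank catalog items (hypothetical item -> set-of-tags map) by tag overlap.

    Items with no tag in common are dropped; the best matches come first.
    """
    wanted = set(new_tags)
    scored = [(item, jaccard(wanted, tags)) for item, tags in catalog.items()]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Because only tags are compared, the same function works whether the items are text files, images, or videos.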
You can get the file name without uploading the file, so you can perform the search based on the file name. The content would only be available after the upload.
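On the server side, a name-based lookup could look roughly like the following, assuming the page sends the chosen file's name to the server before the upload starts. The helper names, the normalization rules, and the 0.6 cutoff are illustrative assumptions; difflib's `get_close_matches` does the fuzzy matching.

```python
from difflib import get_close_matches
from pathlib import PurePath


def normalize(name: str) -> str:
    """Drop the extension, lowercase, and turn punctuation into single spaces."""
    stem = PurePath(name).stem.lower()
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stem)
    return " ".join(cleaned.split())


def similar_file_names(candidate: str, existing: list, limit: int = 5, cutoff: float = 0.6) -> list:
    """Return stored file names whose normalized form is close to the candidate's."""
    by_key = {}
    for name in existing:
        by_key.setdefault(normalize(name), []).append(name)
    keys = get_close_matches(normalize(candidate), list(by_key), n=limit, cutoff=cutoff)
    return [name for key in keys for name in by_key[key]]
```

This only catches files that were named alike; two copies of the same content under unrelated names would not match until the content itself has been uploaded.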