Long-time listener, first-time caller.
I am a full-time SE during the day and a full-time data mining student at night. I have taken the courses and heard what our professors think. Now I come to you, the stackoverflowers, to bring out the real truth.
What is your favorite data mining algorithm, and why? Are there any special techniques you have used that helped you succeed in the past?
Most of my professional experience has involved last-minute feature additions like, "Hey, we should add a recommendation system to this e-Commerce site." The solution was usually a quick and dirty nearest neighbor search: brute force, Euclidean distance, doomed to fail if the site ever became popular. But hey, premature optimization and all that...
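For anyone who has not written one, the quick and dirty version looks roughly like this. It is only a sketch: it assumes each item has already been boiled down to a small numeric feature vector, and the catalog, its values, and the function name are made up for illustration.

```python
import math


def nearest_neighbors(query, items, k=3):
    """Return the ids of the k items closest to `query` by Euclidean distance.

    Brute force: every call compares the query against every item, so the
    cost grows linearly with the catalog size (and with the vector length).
    """
    scored = [(math.dist(query, vec), item_id) for item_id, vec in items.items()]
    scored.sort()  # smallest distance first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical feature vectors; the three numbers stand in for whatever
# features the site actually has (price bucket, category score, rating, ...).
catalog = {
    "mug": (1.0, 0.2, 4.5),
    "teapot": (1.2, 0.3, 4.4),
    "laptop": (9.0, 5.0, 4.1),
    "kettle": (1.5, 0.4, 4.0),
}

viewed = catalog["mug"]
others = {name: vec for name, vec in catalog.items() if name != "mug"}
recommended = nearest_neighbors(viewed, others, k=2)
print(recommended)
```

The full scan per request is exactly why this stops working once the catalog or the traffic grows; a spatial index or an approximate nearest neighbor method would be the usual next step.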
I do enjoy the idea that data mining can be elegant and wonderful. I have followed the Netflix Prize and played with its dataset. In particular, I like the fact that imagination and experimentation have played such a large role in developing the top ten entries:
- Acmehill blog
- Acmehill New York Times article
- Just a guy in a garage blog
- Just a guy in a garage Wired article
So mostly, like a lot of software dev, I think the best algorithm is a balanced view and some creativity.
If you can be more specific about the task the data mining algorithm will perform (classification, clustering, association rule detection, etc.), we can surely help you.
There are a lot of data mining algorithms for different tasks, so I found it a bit difficult to choose.
I would say that my favorite data mining algorithm is Apriori, because it has inspired hundreds of other algorithms and has several applications. The Apriori algorithm itself is quite simple, but it has laid the foundation for many other algorithms (FPGrowth, PrefixSpan, etc.) that use the so-called "Apriori property".
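To show how simple it is, here is a minimal sketch of Apriori for frequent itemset mining. The "Apriori property" says that every subset of a frequent itemset must itself be frequent, so any candidate with an infrequent subset can be thrown away before it is ever counted. The basket data and the function name below are made up for illustration, and the sketch favors readability over speed.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset that appears in
    at least `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]

    def count_frequent(candidates):
        # Count how many transactions contain each candidate itemset.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_support}

    # Level 1: frequent single items.
    singles = {frozenset([item]) for t in transactions for item in t}
    frequent = count_frequent(singles)
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (the Apriori property): keep a candidate only if
        # all of its (k-1)-subsets were frequent at the previous level.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = count_frequent(candidates)
        result.update(frequent)
        k += 1
    return result


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
for itemset, support in sorted(frequent_itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

With these baskets, {bread, diapers, beer} is never counted at all, because its subset {bread, beer} already failed the support threshold. That pruning idea is what the later algorithms reuse.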