I'm reading up on Solr and on indexing a MySQL database into Solr.

What exactly do they mean by "tokenized" and "un-tokenized"?

And what exactly does it mean when fields are "normalized"?

I understand how and what it means to normalize a database, but a field? How can a single field be normalized?

Thanks

The tokenizer splits a character stream into tokens, which are the atomic units of search. Strings can be split on whitespace, word boundaries, etc. These tokens are often passed through filters in a second stage, which apply additional transformations to them (such as soundex codes, Porter stemming, etc.). The result is a normalized representation of the tokens that can be compared efficiently.

For instance: "The Cats Eat Cheese!" might be normalized to the tokens: 1) cat 2) eat 3) cheese

"the" was removed (stopword), cat has become singular (stemming), punctuation is finished, and also the test is lower cased.

What exactly do they mean by "tokenized" and "un-tokenized"?

Tokenizing a field enables full-text search, i.e. finding any word that occurs anywhere in the field. An un-tokenized field will only be found when there is a complete and exact match, e.g. if the field's content is "blue moon" it will only be found when you search for "blue moon", not when you search for just "blue".
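
A rough Python illustration of the difference in matching behaviour (Solr really answers these queries from an inverted index, but the visible effect on what matches is the same; the function names here are made up for the example):

```python
def matches_untokenized(field_value, query):
    # Un-tokenized (string) field: the whole value is one opaque term,
    # so only an exact match hits.
    return field_value == query

def matches_tokenized(field_value, query):
    # Tokenized (text) field: the value is indexed word by word,
    # so every word of the query just has to appear somewhere in the field.
    field_terms = set(field_value.lower().split())
    query_terms = set(query.lower().split())
    return query_terms <= field_terms

print(matches_untokenized("blue moon", "blue"))       # False
print(matches_untokenized("blue moon", "blue moon"))  # True
print(matches_tokenized("blue moon", "blue"))         # True
print(matches_tokenized("blue moon", "blue moon"))    # True
```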

And what exactly does it mean when fields are "normalized"?

This probably refers to Unicode normalization - Unicode has separate code points for diacritics, e.g. U+0300 is the combining grave accent, so the accented letter è can be either a single Unicode character (U+00E8) or composed of two (U+0065 followed by U+0300). But you want both forms to be found when you search for è.
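
A quick Python sketch of the idea, using the standard unicodedata module (Solr itself would typically handle this with a normalizing filter in its analysis chain; Python is only used here to show what normalization does):

```python
import unicodedata

composed = "\u00e8"      # è as a single precomposed code point (U+00E8)
decomposed = "e\u0300"   # "e" (U+0065) followed by COMBINING GRAVE ACCENT (U+0300)

# The raw strings differ, even though they render identically.
print(composed == decomposed)  # False

# After normalizing both to NFC, they compare equal.
print(unicodedata.normalize("NFC", composed) ==
      unicodedata.normalize("NFC", decomposed))  # True
```

If both the indexed text and the query are normalized to the same form before comparison, a search for either spelling of è finds documents containing the other.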