I'm wondering if there is a "best" choice of collation in MySQL for a general website where you aren't 100% sure what will be entered? I understand that all the encodings should be the same: MySQL, Apache, the HTML and anything inside PHP.
In the past I have set PHP to output in "UTF-8", but which collation does this match in MySQL? I'm thinking it's one of the UTF-8 ones, but I have used utf8_unicode_ci, utf8_general_ci, and utf8_bin before.
The main difference is sorting accuracy (when comparing characters in the language) and performance. The only special one is utf8_bin, which compares characters by their binary values.
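To make the binary versus case-insensitive distinction concrete, here is a rough analogy in Python. It is only a sketch of the idea: it does not reproduce MySQL's actual sort weights, and the word list and variable names are made up for illustration.

```python
# Illustration only: this mimics the idea of a binary collation versus a
# case-insensitive one. It is NOT how MySQL computes its sort order.
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Binary-style ordering: compare the raw UTF-8 bytes, so every uppercase
# ASCII letter sorts before every lowercase one.
binary_order = sorted(words, key=lambda w: w.encode("utf-8"))

# Case-insensitive-style ordering: compare case-folded text instead.
ci_order = sorted(words, key=str.casefold)

# Equality follows the same rule as ordering.
binary_equal = "Apple".encode("utf-8") == "apple".encode("utf-8")
ci_equal = "Apple".casefold() == "apple".casefold()

print(binary_order)
print(ci_order)
print(binary_equal, ci_equal)
```

With a binary comparison "Apple" and "apple" are different strings and all capitalised words sort first; with a case-insensitive comparison they are equal and sort next to each other.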
utf8_general_ci is somewhat faster than utf8_unicode_ci, but less accurate (for sorting). The language-specific utf8 collations (such as utf8_swedish_ci) contain additional language rules that make them the most accurate for sorting in those languages. Most of the time I use utf8_unicode_ci (I prefer accuracy to small performance improvements), unless I have a good reason to prefer a specific language.
You can read more about the specific Unicode character sets in the MySQL manual - http://dev.mysql.com/doc/refman/5./en/charset-unicode-sets.html
Collations affect how data is sorted and how strings are compared to each other. That means you should use the collation that most of your users expect.
Example from the documentation:
utf8_general_ci is also satisfactory for both German and French, except that ‘ß’ is equal to ‘s’, and not to ‘ss’. If this is acceptable for your application, then you should use utf8_general_ci because it is faster. Otherwise, use utf8_unicode_ci because it is more accurate.
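The ‘ß’ case from the quote can be sketched in Python as two simplified comparison rules. This is an analogy, not MySQL's algorithm: the ONE_TO_ONE table and the helper functions are hypothetical names invented for this example. The first rule can only map one character to one character; the second uses Python's full case folding, which expands ‘ß’ to ‘ss’.

```python
# Illustration only: two simplified ways to decide string equality, loosely
# mirroring the 'ß' example above. These are NOT MySQL's real algorithms.

# Hypothetical one-to-one table: each character maps to exactly one
# character, so 'ß' can only be treated like a single 's'.
ONE_TO_ONE = {"ß": "s"}


def simple_key(text: str) -> str:
    """Lowercase, then map character by character (no expansions)."""
    return "".join(ONE_TO_ONE.get(ch, ch) for ch in text.lower())


def expanding_key(text: str) -> str:
    """Full Unicode case folding, which expands 'ß' to 'ss'."""
    return text.casefold()


def equal_simple(a: str, b: str) -> bool:
    return simple_key(a) == simple_key(b)


def equal_expanding(a: str, b: str) -> bool:
    return expanding_key(a) == expanding_key(b)


for other in ("Strase", "Strasse"):
    print(other, equal_simple("Straße", other), equal_expanding("Straße", other))
```

Under the one-to-one rule "Straße" matches "Strase" but not "Strasse"; under the expanding rule it is the other way round, which is the behaviour a German speaker would normally expect.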
So - it depends on your expected user base and on how much you need correct sorting. For an English user base, utf8_general_ci should suffice; for other languages, like Swedish, special collations have been created.