I converted from latin1 to utf8. Although most text displayed fine, I noticed that non-English characters were stored in the database as strange symbols. I spent the whole day trying to fix that, and now non-English characters finally show up as non-English characters in the database and display the same way in the browser. However, I noticed that apostrophes are stored as &#39; and exclamation marks are stored as &#33;. Is this normal, or should they be showing up as ' and ! in the database instead? If so, what do I need to do to fix that?
It really depends on what you want to do with the contents of the database. If your invariant is that "contents of the database are sanitized and can be placed directly in a web page without further validation/sanitization", then having &amp; and other HTML entities in your database makes sense. If, on the other hand, your database is meant to store only the raw original data, and you intend to process or sanitize it before displaying it in HTML, then you should probably replace these entities with the original characters, encoded using UTF-8. So it really depends on how you interpret your database content.
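To illustrate the second policy (raw data in the database, escaping only at output time), here is a minimal Python sketch. The function name render_comment and the sample string are made up for illustration; in PHP the equivalent step would be a call such as htmlspecialchars when building the page.

```python
import html


def render_comment(raw_text: str) -> str:
    """Escape raw text at output time, just before it goes into HTML.

    Hypothetical helper: the database keeps the original characters,
    and escaping happens only here, when the page is built.
    """
    return "<p>" + html.escape(raw_text, quote=True) + "</p>"


# What the database would hold under the "raw data" policy.
stored = "Don't stop & go!"
print(render_comment(stored))  # <p>Don&#x27;t stop &amp; go!</p>
```

With this arrangement the stored value stays usable outside HTML (plain-text email, JSON, search), because the entities only ever exist in the rendered page.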
The &#XX; forms are HTML character entities, implying you passed the values stored in the database through a function such as PHP's htmlentities. When the values are processed within an HTML document (or by any HTML processor, regardless of what they are part of), they should display fine. Outside of that, they won't.
So you probably don't want them encoded as HTML entities. You can convert the values back using the counterpart to the function you used to encode them (e.g. html_entity_decode), which should take an argument specifying which encoding to convert to. Once you've done that, check some of the previously problematic entries, making sure you're using the correct encoding to view them.
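As a rough illustration of what the decoding step does, here is a small Python sketch using html.unescape from the standard library, which plays the same role as PHP's html_entity_decode. The helper name decode_entities and the sample values are assumptions, not taken from your data.

```python
import html


def decode_entities(stored_value: str) -> str:
    """Turn HTML character entities back into the characters they stand for."""
    return html.unescape(stored_value)


# Hypothetical value as it might sit in the database after entity encoding.
print(decode_entities("Don&#39;t panic&#33;"))  # Don't panic!

# A value that was encoded twice needs two passes; one pass only
# peels off the outer layer.
print(decode_entities("&amp;#39;"))  # &#39;
```

If a single pass still leaves entities behind, the values were probably encoded more than once on the way in, which is worth tracking down before converting everything.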
If you're still having problems, there's a mismatch between the encoding the stored values are supposed to use and the encoding they're actually using. You'll need to figure out which encoding they're actually using, and then either pull them from the DB and convert them to the target encoding before re-inserting them, or re-insert them with the encoding they actually use. An alternative to the latter option is to convert the columns to a binary type (e.g. BLOB), then change the column character set, then change the column type back to a text type, then directly convert the column to the desired character encoding. The reason for this unwieldy sequence is that text types are converted when changing the character encoding, but binary types aren't.
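Both routes can be sketched in Python. The first helper shows the "pull, convert, re-insert" idea on a single string, assuming UTF-8 bytes were read as latin-1. The second only builds the MySQL statements for the binary round trip so you can review them before running anything. The table name, column name, character sets and both function names are hypothetical. MySQL sets the character set and the text type in one MODIFY, so the four steps above collapse into three statements.

```python
def repair_misdecoded(text: str, actual: str = "utf-8", assumed: str = "latin-1") -> str:
    """Undo text whose `actual`-encoded bytes were decoded as `assumed`."""
    return text.encode(assumed).decode(actual)


def binary_roundtrip_statements(table, column, actual_charset, target_charset,
                                text_type="TEXT", binary_type="BLOB"):
    """Build (not execute) the ALTER TABLE sequence for one column."""
    return [
        # 1. Binary type: the bytes are kept exactly as they are.
        f"ALTER TABLE `{table}` MODIFY `{column}` {binary_type};",
        # 2. Back to text, labelled with the encoding the bytes really use.
        #    Coming from a binary type, no conversion takes place.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} CHARACTER SET {actual_charset};",
        # 3. Text to text: this time MySQL converts the data.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} CHARACTER SET {target_charset};",
    ]


print(repair_misdecoded("cafÃ©"))  # café

for statement in binary_roundtrip_statements("comments", "body", "latin1", "utf8"):
    print(statement)
```

Note that MODIFY redefines the whole column, so attributes such as NOT NULL or a default would have to be restated in each statement. Try it on a copy of the table first.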
Read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" for more on character encodings in general, and § 9.1.4 of the MySQL manual, "Connection Character Sets and Collations", for how encodings are used in MySQL.