Today I converted a client’s older EE database from latin-1 to UTF-8, which has solved a bunch of character headaches with é, ó, á, etc. in certain extensions. The problem is that many of the existing characters still were not converted, and I haven’t been able to find a reliable source of info on how to fix them (other than going into each entry with the offending characters, fixing them by hand, and re-saving the entry). Here are the steps I took:
1) Exported my dev database dev_ee02 to SQL file.
2) In the SQL file, changed every table’s default character set to utf8 from whatever it was.
3) Created a new dev database, set its default character set and collation to utf-8 (general), and imported the tables I had exported earlier.
4) Pointed EE to the new database.
5) Changed the default charset in EE from the iso version to utf-8.
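For reference, steps 1 to 3 can be sketched in Python as below. This is only a rough sketch under assumptions I have not verified: that the bytes in the exported file are really latin-1 (some export tools already write UTF-8, in which case re-encoding would do damage), and that the tables declare their charset as `DEFAULT CHARSET=...`. The function name `convert_dump` is made up for illustration. The point it shows is that changing the charset labels alone does not change the bytes of the data; the file contents have to be re-encoded as well.

```python
import re


def convert_dump(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode a SQL dump to UTF-8 and relabel its table charsets.

    Assumption: every byte in `raw` is encoded in `source_encoding`.
    """
    # Interpret the file's bytes using the encoding they were written in.
    text = raw.decode(source_encoding)
    # Relabel the table defaults (step 2). Column-level CHARACTER SET and
    # COLLATE clauses are not handled in this sketch.
    text = re.sub(r"(DEFAULT CHARSET=)\w+", r"\g<1>utf8", text)
    # Write the same characters back out as UTF-8 bytes.
    return text.encode("utf-8")
```

Usage would be along the lines of reading the exported file in binary mode, passing the bytes through `convert_dump`, and writing the result to a new file before importing it into the new database.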
For the most part, everything seems to be working pretty well. I can type an é and it doesn’t get screwed up at all when rendering on the template. However, I think some of the old characters are still stored as ISO-8859-1 (latin-1) bytes, so they show up as those funky black-background question marks on the templates. When I go into the database via phpMyAdmin, I can see that:
ó is being rendered as � (�)
é is being rendered as both é and �
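A minimal Python sketch of what is probably behind those symbols, assuming the old entries still hold single latin-1 bytes while the page is now read as UTF-8 (I have not confirmed this against the actual table data):

```python
# Assumption: an old "é" is still stored as the single latin-1 byte 0xE9,
# while a newly saved "é" is stored as the two UTF-8 bytes 0xC3 0xA9.
latin1_bytes = "é".encode("latin-1")
utf8_bytes = "é".encode("utf-8")

# Reading the lone latin-1 byte as UTF-8 fails, so a UTF-8 reader
# substitutes U+FFFD, the black-background question mark.
old_entry = latin1_bytes.decode("utf-8", errors="replace")

# The UTF-8 bytes decode cleanly, which is why new entries look fine.
new_entry = utf8_bytes.decode("utf-8")

print(repr(old_entry), repr(new_entry))
```

That would explain why é shows up both ways: the outcome depends on whether the entry was saved before or after the switch.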
I’m sure there are other offenses. Is there any way to sweep through the exp_weblog_data table (if nothing else) and mass convert those oddities? Or, if it is ultimately easier, I could do the export/import again if there is something I overlooked in the 5 steps above. Thanks.
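One possible shape for such a sweep, sketched in Python rather than SQL. It assumes the original latin-1 bytes are still intact somewhere (in the column or in the original export); if the import already replaced them with literal question marks, nothing can be recovered this way. The names `latin1_fallback` and `repair_value` are made up for illustration, and fetching and updating the rows of exp_weblog_data is left out.

```python
import codecs


def _latin1_fallback(error):
    """Decode bytes that are not valid UTF-8 as latin-1 instead."""
    if not isinstance(error, UnicodeDecodeError):
        raise error
    bad_bytes = error.object[error.start:error.end]
    # Return the replacement text and the position to resume decoding at.
    return bad_bytes.decode("latin-1"), error.end


# Register the handler under a hypothetical name so decode() can use it.
codecs.register_error("latin1_fallback", _latin1_fallback)


def repair_value(raw: bytes) -> str:
    """Decode a field that may mix UTF-8 and leftover latin-1 bytes."""
    return raw.decode("utf-8", errors="latin1_fallback")
```

Valid UTF-8 sequences pass through untouched and only the bytes that fail to decode are treated as latin-1, so a field containing both kinds of é comes out consistent. The caveat is that latin-1 text which happens to form a valid UTF-8 sequence would be left as-is, so it is worth testing on a copy of the data first.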