Surely, though, the function of the default character set variable is to force the connection to behave according to a set of standards, in the manner you described earlier?
Not exactly. Because ExpressionEngine version 1.x supports versions of MySQL all the way back to 3.23.32, it connects to your database with the default client connection character set, which in PHP’s case is Latin-1. However, MySQL automagically converts the incoming data to the character set of your database when storing it. It’s a little double-change dance that PHP and MySQL do, each talking to the other in a common language, so to speak, but still able to act independently on the data. In other words, EE and MySQL would know that they are working with UTF-8 characters, but PHP and MySQL would be using Latin-1 radios to talk to each other.
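To make the “Latin-1 radios” idea a bit more concrete, here is a rough sketch in Python (not EE or PHP code) of what I understand happens to the bytes. The function names are made up for illustration, and Python’s latin-1 codec is only a stand-in for MySQL’s latin1, which is actually closer to cp1252, so treat this as an approximation rather than a description of what the server does internally.

```python
# Hypothetical simulation of UTF-8 text travelling over a connection
# that is declared as Latin-1, into a column stored as UTF-8.

def store_over_latin1_link(text: str) -> bytes:
    """Return the bytes that would end up stored in the UTF-8 column."""
    wire_bytes = text.encode("utf-8")            # the application sends UTF-8 bytes
    server_view = wire_bytes.decode("latin-1")   # the server reads them as Latin-1
    return server_view.encode("utf-8")           # and converts them to UTF-8 to store

def read_over_latin1_link(stored: bytes) -> str:
    """Return the text the application sees when reading the row back."""
    server_text = stored.decode("utf-8")         # the server reads its stored UTF-8
    wire_bytes = server_text.encode("latin-1")   # and converts back to Latin-1 to send
    return wire_bytes.decode("utf-8")            # the application treats that as UTF-8

original = "café"
stored = store_over_latin1_link(original)
round_trip = read_over_latin1_link(stored)

print(stored)       # the bytes on disk differ from plain UTF-8 for "café"
print(round_trip)   # the application still gets its original text back
```

The round trip is lossless, which is why everything looks fine from inside EE, but the bytes MySQL actually holds are not the ones you would get from plain UTF-8. That is the reason I would not be surprised if it affects how MySQL sorts.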
I would not rule out the possibility that it might have some impact on the aforementioned MySQL sorting idiosyncrasies. If you’re feeling bold, you might try implementing this hack and see if it makes a difference. Mind you, it will not operate on existing data, so you’d need to create new entries with which to compare.
The article says /core/db/db.mysql.php - I assume that’s an error, or the v164 structure changed, or I’ve only got a partial install.. 😉
I can ignore the conversion stuff because we tried the ISO->binary->UTF8 conversion and it only partly worked (the data’s great, but we couldn’t add templates or upload folders, and we lost other functionality as well); I’ll do it on a fresh EH-based install.
I guess that if the hack works, I’ll need to change the same file each time I upgrade, on into perpetuity (I ’ate core mods)?
I will certainly give all of that a go in the morning (it’s 2am here now and I’ve a full day tomorrow) and report back.
jiF