ExpressionEngine CMS
Open, Free, Amazing


This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.


Foreign alphabet characters.

June 30, 2008 1:09pm

  • #16 / Jun 30, 2008 7:04pm

    julianps

    175 posts

    As Derek said, please add

    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />

    to your test template.

    In a quick test, my entries sorted correctly.

    ETA: I see you’re ahead of me…

    I understood Derek but not this; is that the full tag?

    jiF

  • #17 / Jun 30, 2008 7:05pm

    julianps

    175 posts

    As Derek said, please add

    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />

    to your test template.

    In a quick test, my entries sorted correctly.

    ETA: I see you’re ahead of me…

    I can make them sort correctly; switching to ISO-8859-1 does that easily (but we don’t do that); still no cigar..😉

  • #18 / Jun 30, 2008 7:09pm

    Ingmar

    29245 posts

    That is the full tag, yes, minus the spaces which are only needed here on this forum. It instructs your browser to display the page in UTF-8.
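To illustrate what the charset declaration changes, here is a minimal Python sketch (not part of the original thread). It assumes a page whose bytes are UTF-8 and shows the same bytes read two ways; the sample word is only an illustration.

```python
# The bytes a server would send for "Österreich" when the page is UTF-8.
page_bytes = "Österreich".encode("utf-8")

# A browser told "charset=utf-8" decodes them as intended.
as_utf8 = page_bytes.decode("utf-8")

# A browser assuming Latin-1 turns the two-byte "Ö" into two characters.
as_latin1 = page_bytes.decode("latin-1")

print(as_utf8)
print(repr(as_latin1))
```

The bytes on the wire never change; only the reader's interpretation does, which is why the tag affects display but not what is stored or how it is sorted.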

  • #19 / Jun 30, 2008 7:22pm

    julianps

    175 posts

    Not defaulting to send Latin-1 characters, but to send instructions to the browser as to what character set the output should be interpreted as.  The output is the same either way.

    Check with your host for this install that your database, table, and columns all have unicode collation.  And of course, the entries in question will all have to have been entered into the CP with UTF-8 selected as your character set in your preferences, and the database collations would need to match at that time as well.  If any setting was incorrect or switched at any point along the way, then manual data conversion would be necessary.

    And some info on MySQL’s sort and order behavior:  http://dev.mysql.com/doc/refman/4.1/en/charset-configuration.html

    Okay,

    Let’s be temporarily clear (because there’s no point going back through all the posts) that I have the same issue on both EngineHosting and my Dedicated Server.

    DS = http://www.immocherche.com/i/index2/
    EH = http://www.stockting.com/content/index2/

    On both servers the table collation is utf8_general_ci and the column collation is utf8_general_ci.

    Both EE installations are running Default charset = UTF8

    For the DS we have the following;

    character set client   utf8 (global value: latin1)
    character set connection   utf8 (global value: latin1)
    character set database   latin1
    character set results   utf8 (global value: latin1)
    character set server   latin1
    character set system   utf8
    character sets dir   /usr/share/mysql/charsets/
    collation connection   utf8_unicode_ci (global value: latin1_swedish_ci)
    collation database   latin1_swedish_ci
    collation server   latin1_swedish_ci

    For the EH we have the following;

    character set client   utf8
    character set connection   utf8
    character set database   utf8
    character set results   utf8
    character set server   utf8
    character set system   utf8
    character sets dir   /usr/share/mysql/charsets/
    collation connection   utf8_unicode_ci (global value: utf8_general_ci)
    collation database   utf8_general_ci
    collation server   utf8_general_ci

    But I reiterate: both servers were running EE in UTF-8 mode, because the server we had at the beginning (running latin1_swedish_ci) was dropped, along with 500 hours of labour, to ensure that we were, in our simple way, providing factually supportive information to you (we tried converting the data, but then we couldn’t add templates or upload directories; it was just a mess [I cried that day]).

    To conclude: on EH servers we are 101% UTF8 and we have the same sort-order issue as on the DS, and I’ve given up trying to get DS Support to change it[*]

    jiF

    [*] They say removing the Latin1 foundation destabilises some older scripts and, come to think of it, we cancelled three EH hosting accounts last year because they couldn’t run our scripts in their utf8 environment, so, as a CPA, who am I to tell them what to do?

  • #20 / Jun 30, 2008 7:25pm

    julianps

    175 posts

    That is the full tag, yes, minus the spaces which are only needed here on this forum. It instructs your browser to display the page in UTF-8.

    EH = http://www.stockting.com/content/
    DS = http://www.immocherche.com/i/

    The tags are and have always been there; the /index2/ pages were requested by Lisa Wess and are not system pages.

    jiF

  • #21 / Jun 30, 2008 7:31pm

    Derek Jones

    7561 posts

    Ok, so the EH install sounds like a reliable test bed, thank you for the clarification.  Can you verify with EngineHosting that the column collation is also set correctly?  That environment value will take precedence over any server, database, and connection collation in this instance.  I don’t know if you read that article, or have read other articles regarding MySQL’s handling of high ASCII characters with sorting, but it’s a bundle of insanity even when all the character sets and planets are aligned.
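As a rough illustration of that precedence (added here, not from the original posts), the Python sketch below models collation defaults as a cascade in which the most specific level wins. The function name `effective_collation` is hypothetical, and the model is simplified: MySQL applies these defaults when a table or column is created, and the connection collation is left out entirely. The values are the dedicated-server figures reported earlier in the thread.

```python
from typing import Optional


def effective_collation(server: str,
                        database: Optional[str] = None,
                        table: Optional[str] = None,
                        column: Optional[str] = None) -> str:
    """Return the most specific collation that is set (simplified model)."""
    for level in (column, table, database):
        if level is not None:
            return level
    return server


# Dedicated-server values as reported in the thread.
ds = effective_collation(
    server="latin1_swedish_ci",
    database="latin1_swedish_ci",
    table="utf8_general_ci",
    column="utf8_general_ci",
)
print(ds)
```

Under this model, a column explicitly set to a utf8 collation sorts by that collation even when the server and database defaults are Latin-1, which is why the column-level setting is the one worth confirming.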

  • #22 / Jun 30, 2008 7:33pm

    Derek Jones

    7561 posts

    Unrelated to MySQL’s sorting, but you have some whitespace above your DOCTYPE (on the stockting site), which can cause some problems both with character display and DOM rendering in certain browsers and locales.

  • #23 / Jun 30, 2008 7:34pm

    julianps

    175 posts

    As Derek said, please add

    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />

    to your test template.

    In a quick test, my entries sorted correctly.

    ETA: I see you’re ahead of me…

    Aha - you cheated!

    [strike]<meta http-equiv="Content-type" c>[/strike]
    
    Ärger
    Île de Fleurön
    Österreich
    Überkonto
    Zachä
    Zachö
    

    Read Lisa’s request again..😉

    jiF

  • #24 / Jun 30, 2008 7:38pm

    julianps

    175 posts

    Unrelated to MySQL’s sorting, but you have some whitespace above your DOCTYPE (on the stockting site), which can cause some problems both with character display and DOM rendering in certain browsers and locales.

    Thanks for the tip; yep, forgot to remove those pesky tags from the top of the page - we only installed it today and there’s been way too much going on.

    Never met DOM; who’s he?

    jiF

  • #25 / Jun 30, 2008 7:44pm

    julianps

    175 posts

    Ok, so the EH install sounds like a reliable test bed, thank you for the clarification.  Can you verify with EngineHosting that the column collation is also set correctly?  That environment value will take precedence over any server, database, and connection collation in this instance.  I don’t know if you read that article, or have read other articles regarding MySQL’s handling of high ASCII characters with sorting, but it’s a bundle of insanity even when all the character sets and planets are aligned.

    I only keep the EH account for situations like this so you’re more than welcome to have the UN/PW if it helps.

    Can I not see the collations in phpMyAdmin? They all look to be utf8_general_ci; they’re the ones shown when I’m looking at the individual rows, aren’t they?

    I’ve read a hundred links and more and could bore-for-britain on the subject, but the truth is that even with a clean, clear EE install on EE’s servers we cannot see what, other than the equipment used to key the data into the EECP, could possibly be different (i.e. a circumstantial variable).

    Ingmar, can you PM me with screen-caps of your table, rows and the weblog_fields with those German/Austrian entries, to see how they look to you on your screen? Is your account on EH as well?

    jiF

  • #26 / Jun 30, 2008 7:45pm

    Derek Jones

    7561 posts

  • #27 / Jun 30, 2008 7:53pm

    julianps

    175 posts

    Document Object Model

    This is what I was into at school.....eeeek

    So, is that it; we’re done. EE+EH=we don’t know?

    Surely there’s still something we can try?

    jiF

  • #28 / Jun 30, 2008 8:00pm

    Derek Jones

    7561 posts

    Surely there’s still something we can try?

    Yes, there is:

    verify with EngineHosting that the column collation is also set correctly?

    Not anything against your ability to read and examine MySQL settings, but it would be best to get the answer from the horse’s mouth.

    Once that’s verified, then it’s a matter of discovering what’s being entered vs. being stored (phpMyAdmin has its own client connection and page charset settings that might distort the truth here as well), and finding out whether or not it’s acting in a manner consistent with expected behavior for the character set, the characters involved, and MySQL’s interpretation thereof of what comes before what.
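To make "what comes before what" concrete, here is a small Python sketch (an illustration added to the thread, not MySQL's actual algorithm). It sorts the test entries posted earlier in two ways: by raw code point, and by a key that strips accents and case. The helper `fold_accents` is a hypothetical name and only approximates how an accent-insensitive collation such as utf8_general_ci treats these letters.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks and case so that 'Ä' compares like 'a'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Normalise to NFC so each accented letter is a single code point.
entries = [unicodedata.normalize("NFC", e) for e in
           ["Zachö", "Österreich", "Ärger", "Überkonto", "Zachä", "Île de Fleurön"]]

by_code_point = sorted(entries)                   # binary-style ordering
by_folded_key = sorted(entries, key=fold_accents)  # accent-insensitive ordering

print(by_code_point)
print(by_folded_key)
```

In the code-point ordering the two "Zach" entries come first, because every accented capital has a higher code point than "Z". In the folded ordering the list begins with "Ärger" and ends with "Zachö". Comparing an actual result set against both orderings is one way to tell which kind of comparison the database is really applying.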

  • #29 / Jun 30, 2008 8:18pm

    julianps

    175 posts

    Surely there’s still something we can try?

    Yes, there is:

    verify with EngineHosting that the column collation is also set correctly?

    Not anything against your ability to read and examine MySQL settings, but it would be best to get the answer from the horse’s mouth.

    Once that’s verified, then it’s a matter of discovering what’s being entered vs. being stored (phpMyAdmin has its own client connection and page charset settings that might distort the truth here as well), and finding out whether or not it’s acting in a manner consistent with expected behavior for the character set, the characters involved, and MySQL’s interpretation thereof of what comes before what.

    I’m a big boy; you can tell me when I’m not up to it (most of the time with this stuff, I’m afraid); I have a ticket open on this at EH and have asked Daniel for confirmation.

    We’d had a frank exchange of views with our server admins over phpMyAdmin connections and never got an intelligent answer out of them on whether our installed version was reputable, or whether we were looking at things through sh!t-coloured glasses or a reality distortion field.

    Surely though the function of the Default character set variable is to force the connection to behave according to a set of standards in the manner you earlier described?

    As it is, we’re going to wipe the EH server clean in the morning (no caches, no data, no nothing; I’d rather do a second install but that’ll cost another license and I’m feeling a bit mean right now), so we’ll see what that throws up.

    Anything else you suggest?

    jiF

    You do recall (down Lisa, or we’ll go back over the subject of collations) that if I run ISO-8859-1 on the db (I cannot, Pages Module will not allow me) then the sort order is fine. So this certainly (whoops; another assumption, sorry) has to do with the way data is getting from the client to MySQL.

  • #30 / Jun 30, 2008 8:24pm

    Derek Jones

    7561 posts

    Surely though the function of the Default character set variable is to force the connection to behave according to a set of standards in the manner you earlier described?

    Not exactly.  Because ExpressionEngine version 1.x supports versions of MySQL all the way back to 3.23.32, it connects to your database with the default client connection character set; in PHP’s case, this is Latin-1.  However, MySQL automagically converts this to the character set of your database when storing it.  It’s a little double-change dance that PHP and MySQL do, each talking to each other in a common language, so to speak, but still able to act independently on the data.  In other words, EE and MySQL would know that they are working with UTF-8 characters, but PHP and MySQL would be using Latin-1 radios to talk to each other.

    I would not rule out the possibility that it might have some impact on the aforementioned MySQL sorting idiosyncrasies.  If you’re feeling bold, you might try implementing this hack and see if it makes a difference.  Mind you, it will not operate on existing data, so you’d need to create new entries with which to compare.
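The "Latin-1 radios" description can be sketched in Python (an illustration added here, under stated assumptions). It assumes the application sends UTF-8 bytes, the connection is treated as Latin-1, and the column is utf8, so the server converts on the way in and on the way out. Both function names are hypothetical, and Python's `latin-1` codec stands in for MySQL's latin1, which differs slightly for some byte values.

```python
def store_over_latin1_link(text: str) -> bytes:
    """Model what could land in a utf8 column via a Latin-1 connection."""
    sent = text.encode("utf-8")        # bytes the application sends
    misread = sent.decode("latin-1")   # server reads each byte as one Latin-1 character
    return misread.encode("utf-8")     # server converts those characters to utf8


def read_over_latin1_link(stored: bytes) -> str:
    """Model the reverse trip: utf8 column back out over a Latin-1 connection."""
    converted = stored.decode("utf-8").encode("latin-1")
    return converted.decode("utf-8")   # the application treats the bytes as UTF-8


stored = store_over_latin1_link("Ä")
print(stored)
print(read_over_latin1_link(stored))
```

Under these assumptions the round trip returns the original text, so pages look right, yet the stored bytes are not the plain UTF-8 encoding of "Ä". That difference is one way the connection setting could influence sorting, since ORDER BY works on what is stored rather than on what the application sees.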
