Foreign alphabet characters.
Posted: 30 June 2008 04:25 PM   [ Ignore ]   [ # 19 ]  
Research Assistant
Total Posts:  369
Joined  12-31-2004
Ingmar Greil - 30 June 2008 04:09 PM

That is the full tag, yes, minus the spaces which are only needed here on this forum. It instructs your browser to display the page in UTF-8.

EH = http://www.stockting.com/content/
DS = http://www.immocherche.com/i/

The tags are and have always been there; the /index2/ pages were requested by Lisa Wess and are not system pages.

jiF

 Signature 

There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy - William Shakespeare

Posted: 30 June 2008 04:31 PM   [ Ignore ]   [ # 20 ]  
Administrator
Total Posts:  15831
Joined  06-03-2002

Ok, so the EH install sounds like a reliable test bed, thank you for the clarification.  Can you verify with EngineHosting that the column collation is also set correctly?  That setting will take precedence over any server, database, and connection collation in this instance.  I don’t know if you read that article, or have read other articles regarding MySQL’s handling of high ASCII characters with sorting, but it’s a bundle of insanity even when all the character sets and planets are aligned.

Posted: 30 June 2008 04:33 PM   [ Ignore ]   [ # 21 ]  
Administrator

Unrelated to MySQL’s sorting, but you have some whitespace above your DOCTYPE (on the stockting site), which can cause some problems both with character display and DOM rendering in certain browsers and locales.

Posted: 30 June 2008 04:34 PM   [ Ignore ]   [ # 22 ]  
Research Assistant
Ingmar Greil - 30 June 2008 03:58 PM

As Derek said, please add

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

to your test template.

In a quick test, my entries sorted correctly.

ETA: I see you’re ahead of me…

Aha - you cheated!

[strike]<meta http-equiv="Content-type" c>[/strike]

Ärger<br />
Île de Fleurön<br />
Österreich<br />
Überkonto<br />
Zachä<br />
Zachö<br />

Read Lisa’s request again. ;-)

jiF

Posted: 30 June 2008 04:38 PM   [ Ignore ]   [ # 23 ]  
Research Assistant
Derek Jones - 30 June 2008 04:33 PM

Unrelated to MySQL’s sorting, but you have some whitespace above your DOCTYPE (on the stockting site), which can cause some problems both with character display and DOM rendering in certain browsers and locales.

Thanks for the tip; yep, forgot to remove those pesky tags from the top of the page - we only installed it today and there’s been way too much going on.

Never met DOM; who’s he?

jiF

Posted: 30 June 2008 04:44 PM   [ Ignore ]   [ # 24 ]  
Research Assistant
Derek Jones - 30 June 2008 04:31 PM

Ok, so the EH install sounds like a reliable test bed, thank you for the clarification.  Can you verify with EngineHosting that the column collation is also set correctly?  That setting will take precedence over any server, database, and connection collation in this instance.  I don’t know if you read that article, or have read other articles regarding MySQL’s handling of high ASCII characters with sorting, but it’s a bundle of insanity even when all the character sets and planets are aligned.

I only keep the EH account for situations like this so you’re more than welcome to have the UN/PW if it helps.

Can I not see the collations in phpMyAdmin? They all look to be utf8_general_ci. Those are the ones shown when I’m looking at the individual rows, aren’t they?

I’ve read a hundred links and more, and could bore for Britain on the subject, but the truth is that even with a clean EE install on EE’s servers we cannot see what could possibly be different, other than the equipment used to key the data into the EE CP (i.e. a circumstantial variable).

Ingmar, can you PM me with screen-caps of your table, rows and the weblog_fields with those German/Austrian entries to see how they look to you on your screen? Is your account on EH as well?

jiF

Posted: 30 June 2008 04:45 PM   [ Ignore ]   [ # 25 ]  
Administrator

Document Object Model

Posted: 30 June 2008 04:53 PM   [ Ignore ]   [ # 26 ]  
Research Assistant
Derek Jones - 30 June 2008 04:45 PM

Document Object Model

This is what I was into at school.....eeeek

So, is that it? We’re done? EE+EH = we don’t know?

Surely there’s still something we can try?

jiF

Posted: 30 June 2008 05:00 PM   [ Ignore ]   [ # 27 ]  
Administrator
Jules In France - 30 June 2008 04:53 PM

Surely there’s still something we can try?

Yes, there is:

verify with EngineHosting that the column collation is also set correctly?

Not anything against your ability to read and examine MySQL settings, but it would be best to get the answer from the horse’s mouth.

Once that’s verified, then it’s a matter of discovering what’s being entered vs. being stored (phpMyAdmin has its own client connection and page charset settings that might distort the truth here as well), and finding out whether or not it’s acting in a manner consistent with expected behavior for the character set, the characters involved, and MySQL’s interpretation of what comes before what.
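
One way to compare what was entered with what was stored is to look at raw bytes rather than at a rendered page. The sketch below is Python, used purely for illustration (it is not part of EE, PHP, or phpMyAdmin). It assumes the suspected failure is UTF-8 text that was re-read as Latin-1; Python's cp1252 codec stands in for MySQL's latin1, and the helper names are hypothetical.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as hex, for comparison with MySQL's HEX(column)."""
    return text.encode("utf-8").hex()


def looks_double_encoded(text: str) -> bool:
    """Heuristic: True if text reads like UTF-8 that was mis-decoded as cp1252."""
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The round trip failed, so this is not mis-decoded UTF-8.
        return False
    return repaired != text


entered = "Île-de-France"
# Assumption: this is what a double-encoded copy of the same title would look like.
suspect = entered.encode("utf-8").decode("cp1252")

print(utf8_hex(entered[0]), looks_double_encoded(entered))
print(utf8_hex(suspect[:2]), looks_double_encoded(suspect))
```

If the hex of a stored title matches the second pattern rather than the first, the data was probably converted once too often on its way into the table.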

Posted: 30 June 2008 05:18 PM   [ Ignore ]   [ # 28 ]  
Research Assistant
Derek Jones - 30 June 2008 05:00 PM
Jules In France - 30 June 2008 04:53 PM

Surely there’s still something we can try?

Yes, there is:

verify with EngineHosting that the column collation is also set correctly?

Not anything against your ability to read and examine MySQL settings, but it would be best to get the answer from the horse’s mouth.

Once that’s verified, then it’s a matter of discovering what’s being entered vs. being stored (phpMyAdmin has its own client connection and page charset settings that might distort the truth here as well), and finding out whether or not it’s acting in a manner consistent with expected behavior for the character set, the characters involved, and MySQL’s interpretation of what comes before what.

I’m a big boy; you can tell me when I’m not up to it (most of the time with this stuff, I’m afraid); I have a ticket open on this at EH and have asked Daniel for confirmation.

We’d had a frank exchange of views with our server admins over phpMyAdmin connections and never got an intelligent answer out of them on whether our installed version was reputable and whether we were looking at things through sh!t-coloured glasses or a reality distortion field.

Surely though the function of the Default character set variable is to force the connection to behave according to a set of standards in the manner you earlier described?

As it is we’re going to wipe the EH server clean in the morning (no caches, no data, no nothing; I’d rather do a second install but that’ll cost another license and I’m feeling a bit mean right now) so we’ll see what that throws up.

Anything else you suggest?

jiF

You do recall (down Lisa, or we’ll go back over the subject of collations) that if I run ISO-8859-1 on the db (I cannot, Pages Module will not allow me) then the sort order is fine. So this certainly (whoops; another assumption, sorry) has to do with the way data is getting from the client to MySQL.

Posted: 30 June 2008 05:24 PM   [ Ignore ]   [ # 29 ]  
Administrator
Jules In France - 30 June 2008 05:18 PM

Surely though the function of the Default character set variable is to force the connection to behave according to a set of standards in the manner you earlier described?

Not exactly.  Because ExpressionEngine version 1.x supports versions of MySQL all the way back to 3.23.32, it connects to your database with the default client connection collation; in PHP’s case, this is Latin-1.  However, MySQL automagically converts this to the character set of your database when storing it.  It’s a little double-change dance that PHP and MySQL do, each talking to each other in a common language, so to speak, but still able to act independently on the data.  In other words, EE and MySQL would know that they are working with UTF-8 characters, but PHP and MySQL would be using Latin-1 radios to talk to each other.

I would not rule out the possibility that it might have some impact on the aforementioned MySQL sorting idiosyncrasies.  If you’re feeling bold, you might try implementing this hack and see if it makes a difference.  Mind you, it will not operate on existing data, so you’d need to create new entries with which to compare.
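
The double-change dance can be sketched in a few lines. This is Python for illustration only, not EE's database driver; the function names are hypothetical, and Python's cp1252 codec is assumed to be a close stand-in for what MySQL calls latin1. The model assumes the application hands UTF-8 bytes to a connection that MySQL treats as Latin-1.

```python
def store_over_latin1_connection(text: str) -> bytes:
    """Model the bytes that end up in a utf8 column."""
    wire_bytes = text.encode("utf-8")       # what the application sends
    misread = wire_bytes.decode("cp1252")   # server reads each byte as one Latin-1 character
    return misread.encode("utf-8")          # server converts those characters to utf8 for storage


def read_over_latin1_connection(stored: bytes) -> bytes:
    """Model the reverse conversion on SELECT: utf8 column back to a Latin-1 connection."""
    return stored.decode("utf-8").encode("cp1252")


title = "Österreich"
stored = store_over_latin1_connection(title)
returned = read_over_latin1_connection(stored)

print(stored != title.encode("utf-8"))    # stored bytes differ from plain UTF-8
print(returned.decode("utf-8") == title)  # yet the round trip gives the title back
```

Under these assumptions the round trip is lossless, which would fit pages displaying correctly, while the stored bytes are not plain UTF-8, and the stored bytes are what ORDER BY works on.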

Posted: 30 June 2008 05:52 PM   [ Ignore ]   [ # 30 ]  
Research Assistant
Derek Jones - 30 June 2008 05:24 PM
Jules In France - 30 June 2008 05:18 PM

Surely though the function of the Default character set variable is to force the connection to behave according to a set of standards in the manner you earlier described?

Not exactly.  Because ExpressionEngine version 1.x supports versions of MySQL all the way back to 3.23.32, it connects to your database with the default client connection collation; in PHP’s case, this is Latin-1.  However, MySQL automagically converts this to the character set of your database when storing it.  It’s a little double-change dance that PHP and MySQL do, each talking to each other in a common language, so to speak, but still able to act independently on the data.  In other words, EE and MySQL would know that they are working with UTF-8 characters, but PHP and MySQL would be using Latin-1 radios to talk to each other.

I would not rule out the possibility that it might have some impact on the aforementioned MySQL sorting idiosyncrasies.  If you’re feeling bold, you might try implementing this hack and see if it makes a difference.  Mind you, it will not operate on existing data, so you’d need to create new entries with which to compare.

The article says /core/db/db.mysql.php - I assume that’s an error, or the v164 structure changed, or I’ve only got a partial install. ;-)

I can ignore the conversion stuff because we tried the ISO->binary->UTF8 and it’s only partial (data’s great but we couldn’t add templates, upload folders and lost other functionality as well); I’ll do it on a fresh EH-based install.

I guess that if the hack works I need to change the same file each time I upgrade, on into perpetuity (I ‘ate core mods’)?

I will certainly give all of that a go in the morning (it’s 2am here now and I have a full day tomorrow) and report back.

jiF

Posted: 30 June 2008 07:36 PM   [ Ignore ]   [ # 31 ]  
Administrator

The database driver is in /system/db/.  I don’t have details for you about all future versions, but can give you an assurance that it will not require a hack in 2.0 to accomplish what this wiki article covers.  Until then, yes, you’d need to note and maintain the hack if you choose to use it.

Posted: 01 July 2008 01:46 AM   [ Ignore ]   [ # 32 ]  
Moderator
Total Posts:  15380
Joined  05-15-2004
Jules In France - 30 June 2008 04:34 PM

Aha - you cheated!

What exactly do you mean? I have a few entries with high ASCII characters and had EE sort them; is that not what we are after?

Here’s the code I whipped up quickly:

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
{exp:weblog:entries orderby="{segment_3}" sort="asc" entry_id="197|195|196|198|199|200" dynamic="off" }
{title}
<br />
{/exp:weblog:entries}

Fair enough? Here’s the result. If I change the orderby parameter to, say, “date”, the order of entries changes as well. So unless I completely misunderstood you, I don’t see where I should have “cheated” here.

 Signature 

Everything will be good in the end. If it’s not good, it’s not the end.

Posted: 01 July 2008 02:20 AM   [ Ignore ]   [ # 33 ]  
Moderator
Jules In France - 30 June 2008 04:44 PM

Ingmar, can you PM me with screen-caps of your table, rows and the weblog_fields with those German/Austrian entries to see how they look to you on your screen? Is your account on EH as well?

This particular account is not on EH, but I don’t think it would make any difference. The collation and charsets are the same. These (largely nonsensical) German entries show the correct umlaut both on the frontend, as you have seen, as well as on the backend. I have a screenshot attached.

I also ran a manual query in the CP

SELECT title FROM `exp_weblog_titles` WHERE entry_id >= '196' ORDER BY title DESC

and it worked perfectly. I admit to not having used phpMyAdmin.
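
For anyone without a MySQL install to hand, the difference a collation makes to ORDER BY can be reproduced with Python's built-in sqlite3 module. This is only a stand-in: SQLite has no utf8_general_ci, so the accent_ci collation below is a hypothetical approximation that strips accents and ignores case, and the titles table is made up for the example.

```python
import sqlite3
import unicodedata

TITLES = ["Zachö", "Überkonto", "Ärger", "Österreich", "Île de Fleurön", "Zachä"]


def fold(text: str) -> str:
    """Strip combining accents and upper-case, roughly like an accent-insensitive collation."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).upper()


def accent_ci(a: str, b: str) -> int:
    """Three-way comparison on the folded strings, as sqlite3 collations require."""
    ka, kb = fold(a), fold(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
conn.create_collation("accent_ci", accent_ci)
conn.execute("CREATE TABLE titles (title TEXT)")
conn.executemany("INSERT INTO titles (title) VALUES (?)", [(t,) for t in TITLES])

# Default SQLite ordering compares the raw bytes of each title.
byte_order = [row[0] for row in conn.execute("SELECT title FROM titles ORDER BY title")]
# The custom collation compares accent-stripped, upper-cased titles instead.
folded_order = [row[0] for row in conn.execute(
    "SELECT title FROM titles ORDER BY title COLLATE accent_ci")]

print(byte_order)
print(folded_order)
```

With byte ordering the two Zach entries come first, because every accented capital encodes to bytes above plain Z; with the folding collation Ärger leads and the Zach entries come last.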

Image Attachments
shot1.jpg
 
Posted: 01 July 2008 03:08 AM   [ Ignore ]   [ # 34 ]  
Research Assistant
Derek Jones - 30 June 2008 07:36 PM

The database driver is in /system/db/.  I don’t have details for you about all future versions, but can give you an assurance that it will not require a hack in 2.0 to accomplish what this wiki article covers.  Until then, yes, you’d need to note and maintain the hack if you choose to use it.

Derek, I followed the links and thoughts and nothing changes (you see the stuff and nonsense at http://www.stockting.com).
My only observation however was that Forum-member Sasha refers to a change that seems to have disappeared from the “thread”.

Do you have any thoughts on;

$this->query("SET CHARACTER SET utf8");
        
$this->query("SET COLLATION_CONNECTION=utf8_general_ci");

and where I might safely add that to test it?

jiF

Posted: 01 July 2008 03:14 AM   [ Ignore ]   [ # 35 ]  
Research Assistant
Ingmar Greil - 01 July 2008 01:46 AM
Jules In France - 30 June 2008 04:34 PM

Aha - you cheated!

What exactly do you mean?

I was teasing... Lisa’s point was to use her code and no more, to see a “raw” dump.

This is because we already believe that the issue is installation specific; what we do not know is why it is happening to a fresh installation of EE on an EH-based hosting account. That’s the clever answer. ;-)

Do you have any thoughts on “where”, “when” and “how” for;

$this->query("SET CHARACTER SET utf8");
        
$this->query("SET COLLATION_CONNECTION=utf8_general_ci");

jiF

Posted: 01 July 2008 03:29 AM   [ Ignore ]   [ # 36 ]  
Research Assistant
Ingmar Greil - 01 July 2008 02:20 AM
Jules In France - 30 June 2008 04:44 PM

Ingmar, can you PM me with screen-caps of your table, rows and the weblog_fields with those German/Austrian entries to see how they look to you on your screen? Is your account on EH as well?

This particular account is not on EH, but I don’t think it would make any difference. The collation and charsets are the same. These (largely nonsensical) German entries show the correct umlaut both on the frontend, as you have seen, as well as on the backend. I have a screenshot attached.

I also ran a manual query in the CP

SELECT title FROM `exp_weblog_titles` WHERE entry_id >= '196' ORDER BY title DESC

and it worked perfectly. I admit to not having used phpMyAdmin.

Firstly, as I have said repeatedly, there is no issue with seeing foreign characters in either the CP or on the page; I attach my EDIT list of entries. The challenge lies in the underlying DB and the characters there (I changed the sort order to ASC). That entry that starts AZ is what we all know and love as Île-de-France…

Well, I will upload them when I can get around “Error Message:  The file you are attempting to upload has invalid content for its MIME type.” For now you can see them here
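
One possible reading of the entry that starts AZ, offered as a guess rather than a diagnosis: if the two UTF-8 bytes of Î were stored as two separate Latin-1 characters, an accent-insensitive collation would see an accented A followed by an accented Z. The Python sketch below illustrates the idea; general_ci_key is a hypothetical, rough approximation of utf8_general_ci, and Python's cp1252 codec stands in for MySQL's latin1.

```python
import unicodedata


def general_ci_key(text: str) -> str:
    """Rough sort key: strip combining accents, then upper-case."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).upper()


intended = "Île-de-France"
# Assumption: the UTF-8 bytes were interpreted as Latin-1 characters before storage.
as_stored = intended.encode("utf-8").decode("cp1252")

print(as_stored, "->", general_ci_key(as_stored))
print(intended, "->", general_ci_key(intended))
```

If this is what happened, the title would sort among the A entries instead of the I entries, even though it still displays correctly after the reverse conversion.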
