This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Chinese encoding issue

December 03, 2010 5:28pm

  • #1 / Dec 03, 2010 5:28pm

    Ponder The Web

    20 posts

    I’m creating a site in Mandarin Chinese that duplicates an existing English site. A translation company translated an exported copy of the weblog_data table, and when I reimport it, it looks fine in phpMyAdmin: the characters display correctly. But when I view the content in EE, I see question marks.
    If I copy/paste some Chinese into EE and save it, it’s not readable in the DB, but it displays correctly on the site. I’m dealing with around 100 pages of content, so I can’t paste it all in through EE. I have to do a mass import into the DB, and that imported content doesn’t display correctly when EE retrieves it.

    The database is utf8_unicode_ci, the weblog_data table is utf8_unicode_ci and all the fields with content in that table are also utf8_unicode_ci.
    The Default Character Set in EE is utf-8 and I’m running version 1.6.8.

    I don’t think it’s an EE issue; I think I just need to figure out the encoding. Why doesn’t the Chinese that looks fine in phpMyAdmin also look fine when it’s pulled into EE? And what happens to the Chinese pasted into EE so that it gets encoded differently, isn’t viewable in phpMyAdmin, and yet displays correctly in EE?

    Any advice would be great.

    Thanks.

  • #2 / Dec 04, 2010 5:47pm

    Greg Salt

    3988 posts

    Hi Ponder The Web,

    Are you certain that the data was originally converted into UTF-8 and not some other character set? And to clarify, the text was translated in another similarly configured copy of EE?

    Cheers

    Greg

  • #3 / Dec 04, 2010 5:56pm

    Ponder The Web

    20 posts

    The text started as UTF-16, exported from Excel as a tab-delimited text file. The only way we could get a non-corrupt file from our translation company was as an Excel file. I then converted the UTF-16 to UTF-8; it imports just fine and appears to display correctly in phpMyAdmin. It just doesn’t display correctly in EE.

    Yes, the text was translated from the same configuration it was imported back into. I did change some of the database encoding, though, as it started with the default latin1_swedish_ci.

    It’s weird that the Chinese entered through EE doesn’t display correctly in phpMyAdmin (it’s a bunch of jumbled characters), yet it displays correctly in EE.

    Thanks,
    Gary

  • #4 / Dec 06, 2010 2:44am

    John Henry Donovan

    12339 posts

    Gary,

    Leave the DB encoding as UTF-8, then try this to see the text correctly in EE.

    At lines 163-165 of the /system/db/db.mysql.php file, locate the code below:

    $this->server_info = @mysql_get_server_info();

    return TRUE;

    Replace with the following content, which will comprise new lines 163-168:

    $this->server_info = @mysql_get_server_info();

    // ADD THIS to allow utf-8 character set connection between EE and MySQL
    $this->query("SET CHARACTER SET utf8");
    $this->query("SET COLLATION_CONNECTION=utf8_general_ci");

    return TRUE;

    Note: Only the comment “// ADD THIS…” and the two $this->query() lines beneath it are new content.

    This is a core hack, however, so do it at your own risk. Be sure to back up everything you have before attempting it.
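
For anyone wondering why the connection character set produces both symptoms described in this thread, here is a rough Python sketch of the likely mechanism. It is an illustration only, not EE or MySQL code: it assumes the unpatched connection defaults to latin1, it approximates MySQL's latin1 with Python's latin-1 codec, and the function names are made up for this example.

```python
# Illustration only: simulates a latin1 client connection talking to utf8 columns.
# The function names are hypothetical; none of this is EE or MySQL code.

def read_utf8_row_over_latin1(text: str) -> str:
    """A correctly stored UTF-8 row is converted to latin1 for the client.
    Chinese characters have no latin1 equivalent, so each becomes '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


def write_over_latin1(text: str) -> str:
    """The application sends UTF-8 bytes, but the connection labels them latin1,
    so every byte is stored as if it were a separate latin1 character."""
    return text.encode("utf-8").decode("latin-1")


def read_back_over_latin1(stored: str) -> str:
    """Reading through the same latin1 connection reverses the mislabeling,
    so the application gets its original UTF-8 bytes back."""
    return stored.encode("latin-1").decode("utf-8")


sample = "中文"

# Symptom 1: imported data looks fine in phpMyAdmin but shows as "??" in EE.
print(read_utf8_row_over_latin1(sample))

# Symptom 2: text saved through EE is jumbled in phpMyAdmin...
jumbled = write_over_latin1(sample)
print(repr(jumbled))

# ...but displays correctly when EE reads it back.
print(read_back_over_latin1(jumbled))
```

If this is what is happening, switching the connection to utf8 makes the imported rows display correctly, but rows previously saved through EE over the old connection would then appear jumbled in EE as well and would need to be re-saved or converted.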

  • #5 / Dec 07, 2010 3:15pm

    Ponder The Web

    20 posts

    I think that did the trick.

    Thanks for the help.

    Gary

  • #6 / Dec 08, 2010 9:19am

    Sue Crocker

    26054 posts

    Glad John was able to help. Don’t hesitate to post again as needed.
