ExpressionEngine CMS

This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Korean UTF-8 Encoding

July 22, 2009 12:47pm

  • #1 / Jul 22, 2009 12:47pm

    Kim Ryu Hyun

    65 posts

    This question may be related to a resolved thread.

    Everything works fine using Korean characters on the EE side. My only problem is that when I open the MySQL database directly using phpMyAdmin or any other tool, all my Korean characters are broken and unreadable.

    All my settings are UTF-8. Any ideas on what causes this? Here is the link to my test site:

    -- edited after resolution.

    ExpressionEngine 1.6.7 Build 20090515

  • #2 / Jul 22, 2009 12:51pm

    Sue Crocker

    26054 posts

    What are the MySQL settings set to?

  • #3 / Jul 22, 2009 12:56pm

    Kim Ryu Hyun

    65 posts

    MySQL character set: UTF-8 Unicode (utf8), and MySQL connection collation: utf8_general_ci. I have also tried MySQL connection collation latin1_swedish_ci, with the same result.

    Thanks.

  • #4 / Jul 22, 2009 1:02pm

    Ingmar

    29245 posts

    Does it work on both your site and the EE control panel? If so, it’s not an issue. I suspect that phpMyAdmin simply uses the wrong charset. That should be a configurable option.

  • #5 / Jul 22, 2009 1:09pm

    Kim Ryu Hyun

    65 posts

    Well Ingmar,

    It works on both sides: EE and CP. However, it doesn’t seem to be a settings or options issue on the database side, because when I create a simple table directly in the same database as the EE system, it works perfectly. Only the EE-created content gets garbled.

    According to this Wiki article, it seems that EE writes to the database in latin1 natively even if you set everything to use UTF-8, which creates this problem.

    http://expressionengine.com/wiki/Switching_EE_to_Use_UTF-8_Charset/

    What gives?

  • #6 / Jul 22, 2009 5:00pm

    Ingmar

    29245 posts

    Yes, EE uses the MySQL default character set, which is latin1 (with the latin1_swedish_ci collation). It’s no problem to store Unicode characters that way, provided the encoding is declared correctly. If you are using third-party software to manipulate the database, you must make sure that it, too, applies the correct encoding. Perhaps you need to force utf-8 somewhere?

    If that’s a big issue for you, consider making a feature request, applying the hack shown in the wiki, or waiting for EE 2.0, which will use utf-8 natively everywhere.
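
To illustrate why a third-party tool can show broken text while EE displays it correctly, here is a minimal Python sketch. It assumes the garbling comes from UTF-8 bytes being read back as latin1. The function names and the sample string are hypothetical, and Python's `latin-1` codec (ISO-8859-1) stands in for MySQL's latin1, which is not byte-for-byte identical.

```python
# Sketch only: simulates UTF-8 bytes being displayed through a latin1 lens.
# Assumption: Python's "latin-1" codec approximates MySQL's latin1 here.

def as_seen_through_latin1(text: str) -> str:
    """Encode text as UTF-8, then misread every byte as one latin1 character."""
    return text.encode("utf-8").decode("latin-1")


def recover_utf8(garbled: str) -> str:
    """Reverse the misreading: turn the characters back into bytes, decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "한국어"  # hypothetical sample text
garbled = as_seen_through_latin1(original)

print(len(original), len(garbled))        # 3 9 (each syllable is 3 bytes in UTF-8)
print(recover_utf8(garbled) == original)  # True: the bytes themselves were never damaged
```

The round trip succeeds because the stored bytes are intact; only the interpretation differs. That is consistent with the text looking fine in EE and broken in a tool that reads the same bytes with a different charset.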

  • #7 / Jul 22, 2009 10:22pm

    Kim Ryu Hyun

    65 posts

    I cannot wait for EE2 because of a project deadline.

    My immediate problem is that I need to import a few thousand records from my Excel spreadsheet into my EE weblogs. I am trying to use http://brandnewbox.co.uk/products/details/csvgrab/
    It seems to work fine, but the Korean characters created in EE are written in latin1 by default, and the newly imported data from Excel is written in utf8. They cannot be mixed, right? What do I need to do in this situation? Is hacking EE my only option? Since it is not recommended or supported by you, I do not want to go this route if possible.

    If I decide to proceed with a latin1 database as you recommend here, my concern going forward is whether I will be able to convert the latin1 database to a utf8 database with all Korean characters intact once I decide to move to EE2 when it’s released. How will this be achieved?

    Thanks.
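
The worry about mixing rows written through a latin1 connection with rows imported as real utf8 can be sketched in Python. This is a hedged illustration, not how EE or CSVGrab works internally: `normalize_value` is a hypothetical helper, the sample strings are made up, and `latin-1` again approximates MySQL's latin1.

```python
# Sketch only: repair a value if it looks like UTF-8 misread as latin1,
# otherwise leave it untouched. Helper name and samples are hypothetical.

def normalize_value(value: str) -> str:
    try:
        # Works only if every character fits in one latin1 byte AND
        # the resulting byte sequence is valid UTF-8.
        return value.encode("latin-1").decode("utf-8")
    except UnicodeEncodeError:
        # Characters outside latin1 (e.g. real Hangul): already proper text.
        return value
    except UnicodeDecodeError:
        # Genuine latin1 text such as "café": bytes are not valid UTF-8.
        return value


misread = "한국어".encode("utf-8").decode("latin-1")  # simulated garbled row
rows = [misread, "한국어", "café", "plain ascii"]
cleaned = [normalize_value(r) for r in rows]
print(cleaned == ["한국어", "한국어", "café", "plain ascii"])  # True
```

A heuristic like this can misfire on genuine latin1 text whose bytes happen to form valid UTF-8, so it is best tried on a copy of the data first.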

  • #8 / Jul 23, 2009 3:44am

    Ingmar

    29245 posts

    Kim,

    I would suggest starting with a few test records and seeing what you get. They are in Korean, I suppose? I don’t see why this should be much of an issue: if both your EE setup and your templates are using utf-8, the db collation should not matter much.

    If you need assistance with the import, my suggestion would be to start a new thread in Howto, and provide a bit of sample data, like a small .csv file.

    "If I decide to proceed with latin1 database as you recommend here"

    I recommend staying with the default setup, yes.

    "my concern going forward is will I be able to convert latin1 database to utf8 database with all Korean characters intact once I decide to move to EE2 when it’s released. How will this be achieved?"

    Yes, the update script will convert your db to utf-8. Unlike EE 1.6.7, however, EE 2.0 will use native utf-8 connections, so no hacks are required (which, you are correct, we regularly discourage).
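
The advice to start with a few test records could look roughly like the following Python sketch, which checks that a small CSV sample decodes as UTF-8 before any import is attempted. Everything here is an assumption for illustration: the helper name, the column names, and the sample rows are hypothetical, and `utf-8-sig` is used on the guess that a spreadsheet export may carry a byte-order mark.

```python
# Sketch only: confirm a small CSV sample is valid UTF-8 and inspect a few rows.
import csv
import io


def read_sample_rows(raw: bytes, limit: int = 3) -> list:
    """Decode raw CSV bytes as UTF-8 (BOM tolerated) and return up to `limit` rows."""
    text = raw.decode("utf-8-sig")  # raises UnicodeDecodeError if not UTF-8
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [row for _, row in zip(range(limit), reader)]


# Hypothetical sample standing in for a small exported .csv file.
sample = "title,body\r\n안녕하세요,첫 번째 글\r\n공지,두 번째 글\r\n".encode("utf-8-sig")

rows = read_sample_rows(sample)
print(len(rows))         # 2
print(rows[0]["title"])  # 안녕하세요
```

If the decode step raises an error, the file is not UTF-8 and would need re-exporting or converting before a test import is meaningful.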

  • #9 / Jul 23, 2009 4:04am

    Kim Ryu Hyun

    65 posts

    Ok. That seems to work out for me.

    Thanks again.

  • #10 / Jul 23, 2009 4:38am

    Ingmar

    29245 posts

    Excellent 😊 As I’ve said, you’re very welcome to start a new thread if you need additional help.
