This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.


UTF-8 Woes... Solved... and auto-punishment applied

October 31, 2007 8:13am

  • #1 / Oct 31, 2007 8:13am

    latrine

    28 posts

    Damn Portuguese characters… and UTF-8

    On my commercial-licence EE install at http://www.mpsfarmaceutica.pt I am trying to generate an XML file to integrate dynamically into Flash.

    I have created a template that has the following lines:

    <?xml version="1.0" encoding="UTF-8"?>
    <documento>
    {exp:weblog:entries weblog="noticias" limit="2" rdf="off" dynamic_start="on" disable="member_data|trackbacks" category="2&5" orderby="date"}
    
        <item>
          <title>{exp:xml_encode}{title}{/exp:xml_encode}</title>
          <sumario>{exp:xml_encode}{summary}{/exp:xml_encode}</sumario>
        </item>
    
    {/exp:weblog:entries}
    </documento>

     


    and that shows this output:

    http://www.mpsfarmaceutica.pt/index.php/palop2/noticias_xml

    <documento>
    <item>
    <title>Ranking de Importações de Portugal</title>
    <sumario>
    Isto Ã© o sumÃ¡rio que Ã© introduzido nas caixas de entrada do site. nÃ£o pode ser muito grande, apenas grandito!
    </sumario>
    </item>
    <item>
    <title>Novas regras do créditos documentários em Portugal</title>
    <sumario>Isto Ã© um sumÃ¡rio para exemplificar o tamanho mÃ¡ximo desta caixa de texto
    </sumario>
    </item>
    </documento>

    Now… can you see what is happening?

    The <title> is being generated correctly with the Portuguese characters.
    The <sumario> isn’t!

    I know they are in different tables, but the encoding is site-wide…

    Now I see that the title is being inserted “unformatted” into one table, while the other fields are being inserted converted to UTF-8.

    How can I remove this conversion?

    What can I do to make it output correctly?

    Thanks for any help…
    JPC

    PS: this project for a friend has gone through so many mutations that I have already learnt CSS, PHP, semantics, and now it’s moving on to Flash (ActionScripted from the ground up!), all courtesy of the designer… who is never happy 😉

  • #2 / Oct 31, 2007 11:13am

    Robin Sowell

    13255 posts

    Argh- not my strong point either.  What happens if you use the same weblog tag on a regular page?  And- let’s see what happens if we use the query module rather than the tag:

    <?xml version="1.0" encoding="UTF-8"?>
    <documento>
    {exp:query sql="SELECT t.title, field_id_x FROM exp_weblog_titles t, exp_weblog_data d WHERE t.entry_id=d.entry_id AND t.weblog_id = 'y' LIMIT 5"}
    
        <item>
          <title>{exp:xml_encode}{title}{/exp:xml_encode}</title>
          <sumario>{exp:xml_encode}{field_id_x}{/exp:xml_encode}</sumario>
        </item>
    
    {/exp:query}
    </documento>

    In the query- ‘x’ needs to be the correct field id for the summary and y needs to be the weblog id you want.

    I’m just poking at it with the above, trying to figure out what’s up.

    Also- what character set is selected for the site- and do you have ‘autoconvert high ascii’ turned on?

  • #3 / Oct 31, 2007 5:55pm

    latrine

    28 posts

    Also- what character set is selected for the site- and do you have ‘autoconvert high ascii’ turned on?

    Everything is in UTF-8 / Portuguese… in the templates, in the weblog and in the XML options.

    The result is exactly the same… looking under the hood at the tables in phpMyAdmin, I can see that the text is “garbled” in two different ways in the database… right from the input… in the two underlying tables.

    The title has some characters (undoubtedly from a conversion) and the rest of the fields have other strings for the same letters…

    The funny part is that I always thought this came from the “database model” used by my provider; now I see that it must be something you do in the input phase that is different between exp_weblog_titles and exp_weblog_data.

    (The tables and fields are built at the same time and in the same database)… the title is 100% correct, the rest of the fields aren’t!

  • #4 / Nov 01, 2007 9:01am

    Derek Allard

    3168 posts

    Could you try going into one entry and changing the “formatting type” to none for me?  Does that help?  I’m not sure it will… but wanted to try changing something up.

    You mention that it looks “garbled” in your database.  Could you post both what you are putting into EE, and what appears in the database for us?

  • #5 / Nov 01, 2007 11:34am

    latrine

    28 posts

    OK… testing with the phrase:

    “Teste Inscrição” in the title and in the summary

    Browsing with phpMyAdmin:

    UTF-8, XML Portuguese, utf-8 and Convert into Entities set to yes, formatting to “none”

    Table ee_weblog_titles    ->    Teste inscriÃ§Ã£o
    Table ee_weblog_data      ->    Teste inscrição


    UTF-8, XML Portuguese, utf-8 and Convert into Entities set to no, formatting to “none”

    Table ee_weblog_titles    ->    Teste inscriÃ§Ã£o
    Table ee_weblog_data      ->    Teste inscrição

    UTF-8, XML Portuguese, utf-8 and Convert into Entities set to no, formatting to “xhtml”

    Table ee_weblog_titles    ->    Teste inscriÃ§Ã£o
    Table ee_weblog_data      ->    Teste inscrição

    The funny part is that when written to an HTML UTF-8 page (EE template), they all display correctly, but in the source code of the page you get this (first two tries):

    <li class="noticia">
          <h5><a href="http://www.mpsfarmaceutica.pt/index.php/palop2/texto_noticias/teste1/" title="Teste inscrição">Teste inscrição</a></h5>
    
    <p>      Teste inscrição<br />
         </li></p>
    
    <p> </p>
    
    <p>     <li class="noticia"><br />
          </p><h5><a href="http://www.mpsfarmaceutica.pt/index.php/palop2/texto_noticias/teste_inscricaeo/" title="Teste inscri��o">Teste inscrição</a></h5>
    <p>      Teste inscrição<br />
         </li>

    available here:
    html page-> http://www.mpsfarmaceutica.pt/index.php/palop2/areadecliente/

    and the same entries in XML page:

    XML page->

    <?xml version="1.0" encoding="UTF-8"?>
    <documento>
    
    
        <item>
          <title>Teste inscrição</title>
          <sumario>Teste inscri&ccedil;&atilde;o</sumario>
        </item>
    
    
        <item>
          <title>Teste inscrição</title>
          <sumario>Teste inscri&ccedil;&atilde;o</sumario>
        </item>
    
    
    
    </documento>

    available here-> http://www.mpsfarmaceutica.pt/index.php/palop2/noticias_xml

  • #6 / Nov 01, 2007 12:56pm

    Robin Sowell

    13255 posts

    I may have to get Derek to poke this more- and it may be complicated by the fact that I’m not dead sure whether viewing the db table info in the browser is also affecting things.  I sort of recall that there may be something going on there- and that a raw export might look different.

    Let us poke a bit more.
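
The concern above, that a browser-based table view may itself change how stored text looks, can be illustrated outside EE. What follows is a minimal Python sketch (Python is used only for illustration; none of this is EE or phpMyAdmin code, and `show_as` is a hypothetical helper). It assumes the test phrase from post #5 was stored as UTF-8 bytes, and shows that the very same bytes, read as ISO-8859-1, produce the garbled form quoted in that post.

```python
# Hypothetical helper: decode stored bytes the way a viewer set to `charset` would.
def show_as(raw: bytes, charset: str) -> str:
    return raw.decode(charset)


phrase = "Teste inscrição"           # test phrase taken from the thread
stored = phrase.encode("utf-8")      # assumption: the bytes were written as UTF-8

as_utf8 = show_as(stored, "utf-8")       # viewer agrees with the stored encoding
as_latin1 = show_as(stored, "latin-1")   # viewer assumes ISO-8859-1 instead

print(as_utf8)        # readable text
print(as_latin1)      # two-character sequences in place of each accented letter
print(stored.hex())   # the raw bytes, which do not depend on any viewer setting

# Under the same assumption, the garbled form can be mapped back to the original.
repaired = as_latin1.encode("latin-1").decode("utf-8")
```

If that assumption holds for a given table, comparing raw bytes (for example a hex dump or a raw export) rather than rendered text removes the ambiguity about which layer is doing the garbling.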

  • #7 / Nov 01, 2007 5:25pm

    Derek Allard

    3168 posts

    In another thread, we determined that you had database errors:

    Your PHP MySQL library version 4.1.13 differs from your MySQL server version 4.1.22. This may cause unpredictable behavior.

    Has that been resolved?  In that same thread I advised that I thought you’d need to backup, trash the database, and start with a newly collated (UTF-8) one.  Did that ever happen?  These answers will greatly affect our next steps.

  • #8 / Nov 01, 2007 6:15pm

    latrine

    28 posts

    I may have to get Derek to poke this more

    But wouldn’t this affect both the tables?!? wouldn’t this produce two exact matching entries (one in each table) , both of them “garbled” in the same way?


    Thxs for all your help!

    JPCarvalhinho

  • #9 / Nov 01, 2007 7:15pm

    Derek Allard

    3168 posts

    In order to move forward, I’ll need answers to my questions from the previous post.

  • #10 / Nov 02, 2007 4:00pm

    latrine

    28 posts

    Well Derek… My ISP doesn’t allow me to create a new database, so I am stuck with this one… On the other hand, I spent almost an hour talking to technical support and they assured me that every database is created equal on their servers and that nobody has ever complained…

    So I will have to say no, I could not perform the steps you mentioned… And I had already accepted my predicament… It was only now, when I created the XML template and saw the output, that I realised the “title” entry is being inserted in a usable way and the rest isn’t… Since the database is the same… I figured it could mean some kind of PHP treatment of the title that I could reproduce for the rest of the entry…

    The funny thing, again, is that the “convert to entities” setting is only being applied to the title as well…

  • #11 / Nov 03, 2007 5:51pm

    Derek Allard

    3168 posts

    OK, enable PHP and try this.

    <?xml version="1.0" encoding="UTF-8"?>
    <documento>
    {exp:weblog:entries weblog="noticias" limit="2" rdf="off" dynamic_start="on" disable="member_data|trackbacks" category="2&5" orderby="date"}
    
        <item>
          <title>{exp:xml_encode}<?php
    $title = "{title}";
    echo htmlspecialchars_decode($title);
    ?>{/exp:xml_encode}</title>
          <sumario>{exp:xml_encode}{summary}{/exp:xml_encode}</sumario>
        </item>
    
    {/exp:weblog:entries}
    </documento>

    Any joy?

  • #12 / Nov 03, 2007 8:07pm

    latrine

    28 posts

    THXS!...You guys are priceless 😊 but no sugar this time either! 😊

    But let me leave some remarks and comments…
    The title is the only thing that is correct in the whole entry, so I also tried this “operation” on the other fields…


    --> The field that has the PHP code returns no data; it retrieves no value. But

    ----> if I do something like moving the {exp:xml_encode} tags to:

    $sumario = "{exp:xml_encode}{summary}{/exp:xml_encode}";
    echo $sumario;

    I get its value from the database, but an error when the output is parsed:

    XML Parsing Error: undefined entity
    Location: http://www.mpsfarmaceutica.pt/index.php/palop2/teste_xml/
    Line Number 10, Column 59:<sumario>Teste inscri&ccedil;&atilde;oTeste inscrição</sumario>
    ----------------------------------------------------------^


    ----> and removing the {exp:xml_encode} tags entirely, I get:

    Fatal error: Call to undefined function htmlspecialchars_encode() in /web/sites/vhbu/3/12/73923/public/www/sistema/core/core.functions.php(635) : eval()'d code on line 11


    Once again… I can only say… thanxs!
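
The “undefined entity” error above is consistent with how XML treats named entities: XML predefines only `&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;`, so HTML names such as `&ccedil;` are rejected unless a DTD declares them. The following is a minimal Python sketch of that behaviour, not EE or PHP code; `html_entities_to_xml_text` and `parses` are hypothetical helpers, and the sample string is the summary value quoted in the thread. It turns HTML named entities into literal characters and then re-escapes only the characters XML requires.

```python
import html
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def html_entities_to_xml_text(text: str) -> str:
    """Resolve HTML named entities to characters, then escape only &, < and >."""
    return escape(html.unescape(text))


def parses(xml_source: str) -> bool:
    """Return True if the string is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_source)
        return True
    except ET.ParseError:
        return False


raw_summary = "Teste inscri&ccedil;&atilde;o"   # value as it appears in the XML output
broken = f"<sumario>{raw_summary}</sumario>"
fixed = f"<sumario>{html_entities_to_xml_text(raw_summary)}</sumario>"

print(parses(broken))   # the HTML entity names are not known to the XML parser
print(parses(fixed))    # literal characters need no entity declarations
print(ET.fromstring(fixed).text)
```

Whether the entities enter at input time or at output time in this particular install is not established in the thread; the sketch only shows why an XML consumer such as Flash or a browser would refuse the summary as it stands.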

  • #13 / Nov 04, 2007 12:19am

    Derek Allard

    3168 posts

    Ok, well I was hoping to work around some limitations here. It should have been summary not title as you correctly surmised - I was hoping it would be a quick fix.

    And are you telling me that your host told you that they weren’t going to resolve

    Your PHP MySQL library version 4.1.13 differs from your MySQL server version 4.1.22. This may cause unpredictable behavior.

    for you?  And their justification was “nobody else has complained”?  Does that seem like an acceptable resolution to you?

  • #14 / Nov 04, 2007 5:49pm

    latrine

    28 posts

    Not really… But I got my client to ditch his ISP in exchange for this one, based on recommendations from many other people I know… and they have been really fast and proactive in solving every other problem so far (e-mail setup, alternate FTP accounts, full backups), and they are a branch of one of the biggest European providers (AMEN).

    And if it were a problem with the database, shouldn’t the {title} be “garbled” too?!? They are loaded into the database at the same time, in the same form, with the same submit instruction… with nothing different structure-wise on the MySQL side… that’s what is bothering me and made me open this thread!

    Thanks… anyway, if you want full access to the database and the EE installation to poke around, just send me a PM (I sent you this info once, I will send it again!)…

    If there is nothing to be done, I will try to work with iso-8859-1 and see if the result is different. It will take me some time, but… at least it should be all over… albeit not with the correct and universal UTF side of things…

  • #15 / Nov 05, 2007 3:15pm

    Derek Allard

    3168 posts

    Hey Latrine, I’m not trying to abandon this, but it’s just that I can’t recreate it.  If I use “Teste Inscrição” as both title and summary, and then use this template (set to XML), it works wonderfully.

    <?xml version="1.0" encoding="UTF-8"?>
    <documento>
    {exp:weblog:entries weblog="moblog"}
    
        <item>
          <title>{exp:xml_encode}{title}{/exp:xml_encode}</title>
          <sumario>{exp:xml_encode}{summary}{/exp:xml_encode}</sumario>
        </item>
    
    {/exp:weblog:entries}
    </documento>

    I’ve tested this now on 4 different servers, 3 different PHP versions and several character sets.  Since you have known database/PHP incompatibilities, it seems reasonable to assume that they are contributing to the error.  We don’t have to stop support, but we do need to ask that this at least gets fixed up before we continue with the issue.  Sorry.
