Proof of concept : multilanguage urls with the help of google translate api

June 04, 2009 10:36am

  • #1 / Jun 04, 2009 10:36am

    xwero

    4145 posts

    It’s a longtime brainteaser for me to have multilanguage urls instead of adding a language segment/query string to the url. The solutions i came up with required too much maintenance.

    Today i was working on some other translation problem and i found out the google translate api’s curl urls. And that is when the lightbulb began to shine.

    $ch = curl_init();
    $uri_string = substr($_SERVER['PATH_INFO'],1); // remove begin slash
    $org_lang = 'en';
    
    curl_setopt($ch, CURLOPT_URL, "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=".$uri_string."&langpair=|".$org_lang);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    
    $response = curl_exec($ch);
    
    curl_close($ch);
    // no go if the url couldn't be translated
    if(strpos($response,'400}') === false)
    {
        $matched = preg_match('/translatedText":"(.+?)","detectedSourceLanguage":"([a-z]{2})/',$response,$matches);
        // no go if nothing matched or the detected language is the original url language
        if($matched && $matches[2] != $org_lang)
        {
            $route[$uri_string] = str_replace(' ','',$matches[1]);
            define('DETECTED_LANG',$matches[2]); // for further use on the page
        }
    }

    If you add this to the config/routes.php file the urls can be translated into the language pairs google translate supports.

    As this is a proof of concept, the code shouldn’t be used in production.

  • #2 / Jun 04, 2009 10:48am

    sophistry

    906 posts

    very cool. always thinking outside the nine dots, xwero!

    i tested the google API with this commandline curl (from their API page):

    curl -e http://www.my-ajax-site.com 'http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=hello world 400&langpair=en|it'

    i added something to the q parameter that would break the code above. can you see it?

    nice work. i look forward to getting some juice out of this.

    EDIT: just thought of something… what about encoding of URLs? the first english i tried returned a spanish word with an accent…

  • #3 / Jun 04, 2009 10:52am

    Dam1an

    2385 posts

    @sophistry, you need the double quotes around the string, do you not? At least they’re there in xwero’s code.

    Great idea xwero 😊

  • #4 / Jun 04, 2009 10:54am

    sophistry

    906 posts

    @sophistry, you need the double quotes around the string, do you not? At least they’re there in xwero’s code.

    nope. no need for double-quotes. anyway, it’s commandline curl not PHP cURL (for demo only).

  • #5 / Jun 04, 2009 11:13am

    xwero

    4145 posts

    You need to url encode the pipe symbol: %7C. I guess the rest of the query string will also have to be url encoded, to be on the safe side.

    It’s not a real solution because further tests i did with other urls made me aware of the fact that the language of single segment urls quite often doesn’t get recognized.
    Another dealbreaker is that the actual controller and method name has to be what the translate api provides you. So you can only use it with controllers/methods you are in control of. Or you need to run the slugs through google translate as well.
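
    The encoding mentioned at the top of this post could be sketched roughly as below. The function name build_translate_url is hypothetical and not part of the original snippet; it only assumes the same endpoint and parameters as the code in post #1, and leans on PHP's http_build_query to percent-encode each value.

```php
<?php
// Hypothetical helper (not from the thread): builds the request URL with
// every query value percent-encoded, so the pipe becomes %7C and spaces
// become + signs.
function build_translate_url($uri_string, $org_lang)
{
    $query = http_build_query(array(
        'v'        => '1.0',
        'q'        => $uri_string,
        // Same langpair as post #1: nothing before the pipe, target language after it.
        'langpair' => '|' . $org_lang,
    ), '', '&');

    return 'http://ajax.googleapis.com/ajax/services/language/translate?' . $query;
}

echo build_translate_url('hello world', 'en'), "\n";
```

    The result could then be handed to CURLOPT_URL in place of the hand-built string.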

  • #6 / Jun 04, 2009 11:45am

    Evil Wizard

    223 posts

    personally I would have encoded the spaces to either “+”, “%20” or “&#32;”, since spaces tend to break URLs

    EDIT: had to change the code of the space codes to display the code and not the space character it represents.

  • #7 / Jun 04, 2009 12:01pm

    sophistry

    906 posts

    actually, on my (forgiving) system the unencoded pipe and spaces work OK, but the 400 in the q parameter hoses the PHP code because it relies on a fragile strpos rather than decoding the JSON response.

    here’s to keeping it bubbly.

    cheers.

  • #8 / Jun 04, 2009 12:16pm

    xwero

    4145 posts

    sophistry, the 400 in the url string isn’t a problem because it’s not followed by a curly bracket, and i don’t think a url string will have a curly bracket in it.

    I didn’t go for the json php functions because it’s a one line response. I think decoding the json object will take longer than using strpos and preg_match.

  • #9 / Jun 04, 2009 12:39pm

    sophistry

    906 posts

    doh! didn’t notice the curly brace… :red:

    anyhow, i suggest a tighter solution: use the string '"responseStatus": 400' in the strpos.

    cheers.

  • #10 / Jun 05, 2009 7:46am

    xwero

    4145 posts

    This is the message when the language isn’t detected

    {"responseData": null, "responseDetails": "could not reliably detect source language", "responseStatus": 400}

    i could do a responseData preg_match check but i wanted to catch as many errors as possible to make the code light. If someone would actually use the code the error messages could be intercepted and routed to the appropriate error page.
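
    For comparison, here is a minimal sketch of the json_decode route discussed earlier in the thread, run against the error message above. The function name parse_translate_response is hypothetical. The shape of the successful response and the responseStatus of 200 are assumptions inferred from the preg_match pattern in post #1, not something confirmed in this thread.

```php
<?php
// Hypothetical helper (not part of the original routes.php snippet).
// Returns array(translatedText, detectedSourceLanguage), or null on any failure.
function parse_translate_response($response)
{
    $data = json_decode($response, true);

    // Invalid JSON, a null responseData, or a non-200 status all count as failure.
    if (!is_array($data)
        || !isset($data['responseStatus'], $data['responseData'])
        || $data['responseStatus'] != 200) {
        return null;
    }

    $result = $data['responseData'];
    if (!isset($result['translatedText'], $result['detectedSourceLanguage'])) {
        return null;
    }

    return array($result['translatedText'], $result['detectedSourceLanguage']);
}

// The error message quoted above.
$error = '{"responseData": null, "responseDetails": "could not reliably detect source language", "responseStatus": 400}';
var_dump(parse_translate_response($error));

// Assumed shape of a successful response (illustrative values only).
$ok = '{"responseData": {"translatedText":"hello world","detectedSourceLanguage":"it"}, "responseDetails": null, "responseStatus": 200}';
var_dump(parse_translate_response($ok));
```

    Whether this is slower than strpos plus preg_match for a one line response is not measured here. The point is only that a 400 or a curly bracket inside the q parameter can no longer confuse the check.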
