ExpressionEngine CMS

This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Problems with utf8 links and rank denial

January 26, 2008 1:27pm

  • #1 / Jan 26, 2008 1:27pm

    alex7

    130 posts

    Nobody knows that they must edit ‘category_ns’ in all language packs (otherwise they can get a mishmash with broken links in their Wiki in some time).

    German:

    "category_ns" =>
    "Kategorie",

    French:

    "category_ns" =>
    "Categorie",

    Do any of your EE Wiki users know they should edit this parameter?

    It is not a correct comparison. All the above examples still use Latin characters in parameters. I guess Lisa was right when she pointed out that there should not be high ASCII characters in the URL. No matter what language we use in the wiki, we should use the same category namespace in the Russian and English wiki language packs, IMHO:

    "category_ns" =>
    "Category",

    Instead of

    "category_ns" =>
    "Раздел",
  • #2 / Jan 26, 2008 1:38pm

    Derek Jones

    7561 posts

    Some browsers still can’t handle those characters in URLs, Alex, but most modern browsers do, so with the exclusion of some on certain browsers, it’s perfectly fine, and perhaps more correct to rename it.  Again, looking at Wikipedia as a model, Категория is used for Russian, Categoría is used for Spanish, and so on.  Of course, language packs are entirely under your control, so you are free to implement it as you see fit.  In my personal opinion, it doesn’t make sense to use English namespaces on a non-English wiki.
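
As a rough illustration of what a non-Latin namespace looks like once it travels in a URL, here is a minimal sketch using PHP's standard `rawurlencode()`. The variable names are hypothetical and the snippet is not taken from ExpressionEngine itself. It assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical value mirroring the Russian namespace discussed in the thread.
// This file is assumed to be saved as UTF-8.
$namespace = 'Раздел';

// rawurlencode() percent-encodes every byte outside the unreserved ASCII set,
// so each two-byte UTF-8 Cyrillic letter becomes two %XX escapes.
$encodedNamespace = rawurlencode($namespace);

echo $encodedNamespace, "\n"; // %D0%A0%D0%B0%D0%B7%D0%B4%D0%B5%D0%BB

// Decoding restores the original bytes, so nothing is lost in transit
// as long as both ends agree that the bytes are UTF-8.
$decodedNamespace = rawurldecode($encodedNamespace);
echo ($decodedNamespace === $namespace) ? "round trip ok\n" : "mismatch\n";
```

Browsers that display Раздел in the address bar are typically sending this percent-encoded form on the wire; the readable version is a display convenience.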

  • #3 / Jan 26, 2008 2:52pm

    alex7

    130 posts

    Some browsers still can’t handle those characters in URLs, Alex, but most modern browsers do, so with the exclusion of some on certain browsers, it’s perfectly fine, and perhaps more correct to rename it…

    Well… modern browsers do handle high ASCII characters in the URL, but I cannot say the same about the wiki PHP file. Please look at the example here:

    http://alex.ourera.org/index.php/wiki/index/

    I have a freshly installed wiki and am using Russian here. On the only page, the index page, I made a category using this syntax:

    [[Раздел:Основной раздел]]

    It did make the category “Основной раздел”, but it did not hide this markup in the visible part of the article’s body. Instead of the expected full text of the article:

    Welcome to the opening page of your Wiki!

    I’ve got:

    Welcome to the opening page of your Wiki!

    Раздел:Основной раздел

    Clicking on this link leads to an absolute mess in the URL.

  • #4 / Jan 26, 2008 2:55pm

    Derek Jones

    7561 posts

    It appears that’s being caused by the redirect.  Do you have Rank Denial enabled by chance in your Security and Sessions preference?

  • #5 / Jan 26, 2008 3:02pm

    alex7

    130 posts

    It appears that’s being caused by the redirect.  Do you have Rank Denial enabled by chance in your Security and Sessions preference?

    Yes, I’ve got it enabled and I’d like to keep it that way. 😛 For me, security is an important reason.

  • #6 / Jan 26, 2008 3:24pm

    Derek Jones

    7561 posts

    I’ve split this thread to a new one as it is a different issue.  Whether or not you’ve changed the category name here looks to be immaterial as I would suspect it does the same thing for any pMcode link that was formed with those characters.  I can duplicate that on my own installation, so I will post back with the resolution.

  • #7 / Jan 26, 2008 3:32pm

    Derek Jones

    7561 posts

    Ok, the issue is that the redirect HTML does not specify the page charset, so whatever the server sends and/or the browser interprets it to be is how the GET variables are understood by the browser.  So for instance if the browser thought it was receiving ISO-8859-1 characters, it would read Раздел in GET data as something like Ð Ð°Ð·Ð´ÐµÐ».  A fix for you is to open your index.php file and change this:

    if ( ! isset($_SERVER['HTTP_REFERER']) OR ! stristr($_SERVER['HTTP_REFERER'], $host))
    {
        // Possibly not from our site, so we give the user the option
        // Of clicking the link or not
        
        $str = "<html>\n<head>\n<title>Redirect</title>\n</head>\n<body>".
                "To proceed to the URL you have requested, click the link below:".
                "<a href='".$_GET['URL']."'>".$_GET['URL']."</a>\n</body>\n</html>";
    }
    else
    {
        $str = "<html>\n<head>\n<title>Redirect</title>\n".
               '<meta http-equiv="refresh" content="5; URL='.$_GET['URL'].'">'.
               "\n</head>\n<body>\n</body>\n</html>";
    }

    to:

    if ( ! isset($_SERVER['HTTP_REFERER']) OR ! stristr($_SERVER['HTTP_REFERER'], $host))
    {
        // Possibly not from our site, so we give the user the option
        // Of clicking the link or not
        
        $str = "<html>\n<head>\n<meta http-equiv='Content-Type' content='text/html; charset=utf-8'/>\n<title>Redirect</title>\n</head>\n<body>".
                "To proceed to the URL you have requested, click the link below:".
                "<a href='".$_GET['URL']."'>".$_GET['URL']."</a>\n</body>\n</html>";
    }
    else
    {
        $str = "<html>\n<head>\n<meta http-equiv='Content-Type' content='text/html; charset=utf-8'/>\n<title>Redirect</title>\n".
               '<meta http-equiv="refresh" content="0; URL='.$_GET['URL'].'">'.
               "\n</head>\n<body>\n</body>\n</html>";
    }
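
The charset mismatch described in this post can be reproduced outside ExpressionEngine. The following is a minimal sketch, assuming the mbstring extension is available and the file is saved as UTF-8. The variable names are hypothetical. It reinterprets the UTF-8 bytes of Раздел as ISO-8859-1, which is roughly what happens when a page carries no charset declaration and the browser guesses wrong.

```php
<?php
// Requires the mbstring extension; this file is assumed to be saved as UTF-8.
$original = 'Раздел'; // 6 Cyrillic letters, 2 bytes each in UTF-8

// Treat the UTF-8 bytes as if they were ISO-8859-1 and convert them to UTF-8.
// Every original byte becomes its own (wrong) character: classic mojibake.
$garbled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

echo strlen($original), "\n";            // 12 (bytes)
echo mb_strlen($garbled, 'UTF-8'), "\n"; // 12 (characters, one per original byte)
echo $garbled, "\n";                     // unreadable accented Latin characters

// Undoing the wrong guess recovers the original text, which shows that
// the bytes were never damaged, only mislabelled.
$restored = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');
echo ($restored === $original) ? "restored\n" : "lost\n";
```

Because the bytes themselves are intact, declaring `charset=utf-8` in the redirect page, as in the change above, is enough for the browser to read them correctly.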
  • #8 / Jan 26, 2008 3:36pm

    Derek Jones

    7561 posts

    Oh, and title= attributes are only allowed in links when at least Safe HTML is allowed, so to fix that separate issue, you would need to change your Wiki’s HTML parsing preference to at least allow Safe HTML.

  • #9 / Jan 26, 2008 4:47pm

    alex7

    130 posts

    Ok, the issue is that the redirect HTML does not specify the page charset, so whatever the server sends and/or the browser interprets it to be is how the GET variables are understood by the browser.  So for instance if the browser thought it was receiving ISO-8859-1 characters, it would read Раздел in GET data as something like Ð Ð°Ð·Ð´ÐµÐ».  A fix for you is to open your index.php file and change this:
    [...]

    Thank you, Derek! This solution works like a charm! But it didn’t fix the duplicated appearance of the category name in the body of the article. I believe it should be hidden. Any clue?

  • #10 / Jan 26, 2008 4:51pm

    Derek Jones

    7561 posts

    No, that’s intended (as you can see on our own wiki).  Category markers serve to both mark an article as a particular category and display a link to the category.  The Categories list that appears below the article is a nested list, different from the inline links that the category tag provides.

  • #11 / Jan 26, 2008 5:00pm

    alex7

    130 posts

    No, that’s intended (as you can see on our own wiki)...

    Oh, I see… It’s supposed to be this way. I misunderstood because I was comparing it with Wikipedia. Thanks for your time and your truly appreciated help!

  • #12 / Jan 26, 2008 5:05pm

    Derek Jones

    7561 posts

    No problem, Alex, glad I could help.

  • #13 / Jun 20, 2008 3:57am

    alex7

    130 posts

    By the way, I found how to avoid doubling the category’s name on the same page. All we need is “|” at the end. Instead of

    [[Category:category name]]

    we have to use

    [[Category:category name|]]

  • #14 / Jun 22, 2008 12:52pm

    Robin Sowell

    13255 posts

    😉  Glad Derek caught this one!  Is all now well, and good to close out?

  • #15 / Jun 22, 2008 1:08pm

    alex7

    130 posts

    😉  Glad Derek caught this one!  Is all now well, and good to close out?

    I guess so. The problem is solved. Thank you, folks! 😜
