Thread

This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Links aren't formatting properly with parentheses and percent signs.

February 02, 2008 9:11pm

  • #1 / Feb 02, 2008 9:11pm

    ShadowXOR

    134 posts

    A member was trying to link to the Wikipedia article which can be reached via EITHER of these two links:

    http://en.wikipedia.org/wiki/Steppenwolf_(band)

    or

    http://en.wikipedia.org/wiki/Steppenwolf_(band)

    (I guess I’m going to have to put a space in between all of the symbols at the end so that you can understand what I’m trying to link to.  This is the real link (if you remove the spaces):

    http://en.wikipedia.org/wiki/Steppenwolf _ % 2 8 b a n d % 2 9

    The EE software keeps altering the percentage sign followed by those two numbers into something mangled…)

    When the link is created with the parentheses method, even if we enter it correctly like this:

    [url=http://en.wikipedia.org/wiki/Steppenwolf_(band)]Steppenwolf[/url]

    It automatically removes the parentheses from around the word “band” and links to this:

    http://en.wikipedia.org/wiki/Steppenwolf_band

    And when using the percentage sign method like this:

    [url=http://en.wikipedia.org/wiki/Steppenwolf_%28band%29]Steppenwolf[/url]

    It comes out mangled and links to this:

    http://en.wikipedia.org/wiki/Steppenwolf_⢺nd;)

    Any idea why this is happening or how I can fix it?  I thought I had successfully created the link on these forums once, but it appears to be doing the same thing here now.  Check it out:

    Link 1: Steppenwolf

    Link 2: Steppenwolf

    This is obviously a big problem since Wikipedia is so popular.  Any idea on how to fix this?  I didn’t think it happened on your official forums, but now that I see it does, this may even be a bug report.

    EDIT 1: Here is another link that won’t format properly in EE:

    http://tv.yahoo.com/show/35099/news/urn:newsml:eonlinekristen.com:20080202:TV-6a829db24533e7aa026268d0738c29d5__ER:1;_ylt=AuK7ETKJIdGTP7M9Gu47UVyAo9EF

    EDIT 2: Parentheses don’t work when trying to hotlink an image either.
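
As a side note on the two link forms in the post above: the literal-parentheses URL and the percent-encoded URL name the same page, because `%28` and `%29` are the percent-encoded forms of `(` and `)`. A minimal sketch using Python's standard `urllib.parse` (this only illustrates the encoding, not anything ExpressionEngine does internally):

```python
from urllib.parse import quote, unquote

# The two forms of the link discussed in the first post.
literal = "http://en.wikipedia.org/wiki/Steppenwolf_(band)"
encoded = "http://en.wikipedia.org/wiki/Steppenwolf_%28band%29"

# Percent-decoding the encoded form yields the literal form.
decoded = unquote(encoded)
print(decoded == literal)

# Going the other way: quote() percent-encodes the parentheses
# in a path segment, producing the form Wikipedia hands out.
reencoded = quote("Steppenwolf_(band)")
print(reencoded)
```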

  • #2 / Feb 03, 2008 9:43am

    Derek Jones

    6974 posts

    Parentheses aren’t allowed in URLs in ExpressionEngine because in certain circumstances they can cause parts of the URL to be interpreted as code, and they are used as part of cross-site scripting attacks.  Unfortunately, because some browsers will interpret even URL-encoded parentheses in this manner and still allow code execution, ExpressionEngine takes the high road and errs on the side of caution.  URL encoding makes invalid characters safe for transport in a URL, but not safe to use.  The action ExpressionEngine is taking keeps you and your site’s visitors safe.  Of course, this won’t happen to URLs that you enter into weblog entries, since we can assume that people with access to those have a higher level of trust than it is safe to assume for forums, wikis, comments, etc.  We know it can be annoying, so we continue to make improvements to the security checks when we can to allow desired information to pass, but only if it does not present a security risk.  So, this is known behavior, and an unintended consequence of maintaining effective armor against the ever-increasing prevalence of XSS attacks.
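
The point that URL encoding is "safe for transport, but not safe to use" can be illustrated with a small sketch. The helper names `fully_decode` and `contains_parentheses` are hypothetical and this is not ExpressionEngine's actual filter; it only shows why a cautious check would look at the decoded form of a submitted URL rather than the raw text:

```python
from urllib.parse import unquote


def fully_decode(url: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly so that double encoding (e.g. %2528) is unwrapped too."""
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break  # nothing left to decode
        url = decoded
    return url


def contains_parentheses(url: str) -> bool:
    """Return True if the URL carries parentheses in literal or encoded form."""
    decoded = fully_decode(url)
    return "(" in decoded or ")" in decoded


# The encoded text has no literal parentheses, yet it decodes to a script call.
payload = "javascript:alert%28document.cookie%29"
print(unquote(payload))
print(contains_parentheses(payload))
```

A check like this also flags the harmless Wikipedia link, which is the trade-off described in the post above.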

  • #3 / Feb 03, 2008 6:57pm

    ShadowXOR

    134 posts

    How did IPB protect against such things?  I can’t imagine that, at their high version number, they’re leaving their customers vulnerable, yet these links worked properly when I was using it.

  • #4 / Feb 03, 2008 9:00pm

    Derek Jones

    6974 posts

    I do not know much about IPB, but they have had a history of security issues.

  • #5 / Feb 04, 2008 1:03am

    ShadowXOR

    134 posts

    I do not know much about IPB, but they have had a history of security issues.

    Well, is there at least a way for people to copy the plain text in there?  As you can see in my example above, EE changes some of the characters so that people can’t even copy and paste the text… we just cannot link to any articles that have those characters (which is a decent number).

  • #6 / Feb 04, 2008 10:13am

    Derek Jones

    6974 posts

    Yes, if you disable the “Auto-convert URLs and email addresses into links?” feature then the submitted text will remain unformatted.

  • #7 / Feb 04, 2008 3:37pm

    ShadowXOR

    134 posts

    Yes, if you disable the “Auto-convert URLs and email addresses into links?” feature then the submitted text will remain unformatted.

    Well I want that feature enabled without distorting my URLs.  It even distorts it within the “code” formatting.

    Also, it distorts it on the official EE forums (right here) and you have auto-converting disabled, so that isn’t the problem.  I had to insert spaces in between each character in my original post so that it didn’t change the characters.

  • #8 / Feb 04, 2008 3:57pm

    Derek Jones

    6974 posts

    True, my apologies.  Though if you just submit it as you have in the first post, no such conversion occurs.

    http://en.wikipedia.org/wiki/Steppenwolf_(band)

    It’s only with URL encoded values that you are seeing the conversion.  Let me bring this topic up again with the rest of the development team and see if they have any thoughts as to a safe workaround for you.

  • #9 / Feb 04, 2008 4:16pm

    ShadowXOR

    134 posts

    True, my apologies.  Though if you just submit it as you have in the first post, no such conversion occurs.

    http://en.wikipedia.org/wiki/Steppenwolf_(band)

    It’s only with URL encoded values that you are seeing the conversion.  Let me bring this topic up again with the rest of the development team and see if they have any thoughts as to a safe workaround for you.

    Thanks for the help.  The reason I provided the link with the percent signs is that Wikipedia gives you that link.  I just happen to know you can change them to parentheses and get the same result, but my members aren’t going to figure that out and are just going to end up with a broken link.

  • #10 / Feb 04, 2008 4:58pm

    Derek Jones

    6974 posts

    OK, we have made a determination: future versions will be able to display the URLs as plain text without the character conversion, but I’m afraid it’s just not safe for the default behavior of pMcode and auto-created links to allow parentheses of any kind in the URL.  Even with this future change, you would need to allow all HTML in your forums, and users would need to create regular HTML links.

    You may email me (not PM) if you would like the modified file that will at least prevent the characters from being converted to invalid character entities in plain text.

  • #11 / Nov 26, 2009 2:09am

    Michael Rog

    179 posts

    Curiously, for site owners willing to take the risk of allowing parentheses in URIs, are there sufficient hooks around EE’s URI cleaning functions to rewrite the functionality via extension?

  • #12 / Nov 26, 2009 2:11am

    Derek Jones

    6974 posts

    No, Michael, sorry, such systems are not exposed to extensions so that every unmodified installation of ExpressionEngine can be assured to have certain protections.  For your own installations, of course, you are certainly free to modify the code as you see fit, but particularly in the case of application security measures, such changes are at your own risk and not supported.

  • #13 / Nov 26, 2009 3:03pm

    Michael Rog

    179 posts

    Sorry to be beating this dead horse, but…

    I just noticed the line in config.php about uri characters:

    $config['permitted_uri_chars'] = 'a-z 0-9~%.:_\\-';

    Can I modify this variable to allow parentheses in blog and wiki titles?

  • #14 / Nov 26, 2009 3:04pm

    Derek Jones

    6974 posts

    No, that config item is not used by ExpressionEngine.
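
For readers wondering what a setting like `permitted_uri_chars` would do where it is honored: the sketch below assumes the string is read as the body of a regular-expression character class, applied case-insensitively to a URI segment. That reading is an assumption made for illustration, the helper `is_permitted` is hypothetical, and, as stated above, ExpressionEngine itself does not use this config item.

```python
import re

# The value from config.php once PHP has processed the string literal
# ('\\-' in single quotes is a backslash followed by a hyphen).
PERMITTED_URI_CHARS = r"a-z 0-9~%.:_\-"

# Assumption: the string is dropped into a character class as-is.
SEGMENT_RE = re.compile(f"[{PERMITTED_URI_CHARS}]+", re.IGNORECASE)


def is_permitted(segment: str) -> bool:
    """Return True if every character of the segment is in the whitelist."""
    return SEGMENT_RE.fullmatch(segment) is not None


for segment in ("Steppenwolf_band", "Steppenwolf_(band)", "Steppenwolf_%28band%29"):
    print(segment, is_permitted(segment))
```

Under this reading, literal parentheses are rejected while the percent-encoded form passes, since `%` and digits are on the list.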

  • #15 / Nov 30, 2009 8:03pm

    Michael Rog

    179 posts

    Interestingly, John Gruber just released his ideal regex for matching URLs, which does account for a set of parentheses in a URL:

    http://daringfireball.net/2009/11/liberal_regex_for_matching_urls

    Nifty.
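
To illustrate the idea of a URL matcher that tolerates a set of parentheses, here is a deliberately simplified sketch. It is not the pattern from the linked article; `URL_PATTERN` is a hypothetical, much smaller regex that only handles `http`/`https` URLs and one level of balanced parentheses.

```python
import re

URL_PATTERN = re.compile(
    r"""https?://              # scheme
        (?:
            [^\s()<>]+         # a run of ordinary URL characters
          | \([^\s()<>]*\)     # or one balanced (...) group
        )+
    """,
    re.VERBOSE,
)

text = "See http://en.wikipedia.org/wiki/Steppenwolf_(band) for details."
print(URL_PATTERN.findall(text))

# A closing parenthesis that belongs to the surrounding sentence is left out.
wrapped = "(see http://example.com/page)"
print(URL_PATTERN.findall(wrapped))
```

Matching a URL for display is a separate question from deciding whether it is safe to turn into a link, which is the concern raised earlier in the thread.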
