This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Character entities in forum descriptions on frontend

October 13, 2007 10:01am

  • #1 / Oct 13, 2007 10:01am

    municipal

    165 posts

    I’m not sure if this cropped up because of the latest build update but I think so…

    I input the character entity & # 8 2 1 7; for a curly apostrophe in one of the forum descriptions. Now I see that the encoding doesn’t get translated and appears as typed. So I removed it and was pleasantly surprised to see that EE made it curly anyway. But it’s doing so by using an invalid numeric character reference—it’s using & # 1 4 6 ; instead of & # 8 2 1 7;

    And it does nothing with quote marks—like, if I type in straight quote marks, they are output as straight quote marks. If I use the character entity & # 8 2 2 0;, it appears as typed.

    Build:  20070918
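For context on why & # 1 4 6 ; is called invalid here: 146 (0x92) is the right single quotation mark in the Windows-1252 code page, but numeric character references are defined in terms of Unicode code points, where 146 falls in a control-character range and the curly apostrophe is 8217 (U+2019). The following is a minimal sketch of a post-processing workaround, not anything ExpressionEngine provides; the function name `fix_cp1252_references` is hypothetical and the mapping covers only the four curly quote characters.

```php
<?php
// Hypothetical helper (not part of ExpressionEngine): rewrite numeric
// character references that use Windows-1252 byte values so that they
// use the corresponding Unicode code points instead.
function fix_cp1252_references(string $html): string
{
    $map = [
        '&#145;' => '&#8216;', // left single quotation mark
        '&#146;' => '&#8217;', // right single quotation mark (curly apostrophe)
        '&#147;' => '&#8220;', // left double quotation mark
        '&#148;' => '&#8221;', // right double quotation mark
    ];

    // strtr() with an array replaces each key in a single pass and never
    // re-scans text it has already replaced.
    return strtr($html, $map);
}

echo fix_cp1252_references('Member&#146;s lounge'), "\n";
```

A helper like this would have to run on the rendered output, so it treats the symptom rather than the code that generates the reference.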

  • #2 / Oct 14, 2007 11:25am

    Robin Sowell

    13255 posts

    Weird- is this just happening for the forum titles and descriptions- not an issue in the body?  Which I shall ‘now’ try to “test” in some way- “ is 8220 and ’ 8217 .

  • #3 / Oct 14, 2007 12:14pm

    municipal

    165 posts

    Yup…just forum descriptions (I didn’t try it on titles).

    It seems to be happening here also. Go look at the EE forum home page…straight quotes remain straight and the curly apostrophe is being output with the verboten & # 1 4 6 ;.

  • #4 / Oct 14, 2007 12:18pm

    Robin Sowell

    13255 posts

    I’m calling in the crew on this one, as I’m not dead sure what the expected behavior should be.  One quick question- I know you’re running the latest build of EE- are you running the latest build on the forum as well?  There have been a few build updates for that as well.

  • #5 / Oct 14, 2007 12:20pm

    municipal

    165 posts

    Yup…I upgraded both yesterday. That’s when I saw my encoded apostrophe end up being outputted literally. Part of the forum changelog actually referenced character entities. Lemme’ go get it…

    Edit: Okay, here it is:

    Fixed a bug where board, forum, and category names, and their descriptions were not having special characters converted to entities.

  • #6 / Oct 14, 2007 12:26pm

    Robin Sowell

    13255 posts

    Huh- ok, try clearing your caches.  Can’t hurt.  But I suspect that won’t do the trick.  Let’s see what the crew has to say.

  • #7 / Oct 14, 2007 12:43pm

    municipal

    165 posts

    I cleared the caches but then I noticed something after adding more test text. Here’s the screen shot…see how the HTML line breaks are being output as HTML…

  • #8 / Oct 19, 2007 3:44pm

    municipal

    165 posts

    Hiya. I think I fell by the wayside here…?

  • #9 / Oct 19, 2007 5:34pm

    Derek Allard

    3168 posts

    Sorry municipal, 5 days is entirely inappropriate.  You say you can see it on our forums… can you still?  I’m looking over our forum descriptions right now, and don’t notice it.

    Is your forum publicly viewable?

    After the new build, if you create a new forum, does this behaviour exist, or is it only in old ones?

  • #10 / Oct 19, 2007 5:52pm

    municipal

    165 posts

    That’s okay…the project I need this for keeps getting delayed so launch isn’t imminent.

    Hmmm…I think I misspoke and was referring to the invalid character entities being generated in forum topic titles for apostrophes, not forum descriptions. But in the forum descriptions here, I don’t see *any* character conversions going on.

    You asked about what happens if I create a new forum…ugh. I’d rather not if I don’t have to because of the numbering/URL scheme that occurs. I could PM you access to the site?

    Edited to add a little more info:
    I had a test installation associated with another site that I had installed the forum on last time we had an in-depth problem and I never dumped it afterwards. So, I updated that installation and created a “new” forum and am seeing the same behavior.

  • #11 / Oct 21, 2007 11:31am

    Derek Allard

    3168 posts

    OK, just to be clear, are you actually entering that html and the converted entities (ie: the full 6 character special character starting with the ampersand and ending with the semi-colon) into your forum descriptions?  If so, then this is expected behaviour, EE converts all special characters into their entity equivalents.

    You can change this behaviour by hacking your installation, but be aware that it would be a hack, and you’d need to re-apply it for every update (also, we can’t provide support on hacks).  The line you’d want to change is around line 1988 of system/modules/forum/mod.forum_core.php.  You need to remove the call to _convert_special_chars() on that line.  The easiest approach is to make

    $match['1'] = str_replace('{forum_description}', $this->_convert_special_chars($row['forum_description'], TRUE), $match['1']);

    look like

    $match['1'] = str_replace('{forum_description}', $row['forum_description'], $match['1']);

    If you aren’t entering the html into your forum descriptions, then could you open up your database, find the exp_forums table, find the forum descriptions field, and tell me what it says in the database?  I’m trying to determine if the db is storing the converted characters or if they are getting converted on the “way out”.

  • #12 / Oct 21, 2007 1:35pm

    municipal

    165 posts

    OK, just to be clear, are you actually entering that html and the converted entities (ie: the full 6 character special character starting with the ampersand and ending with the semi-colon) into your forum descriptions?  If so, then this is expected behaviour, EE converts all special characters into their entity equivalents.

    When I enter the HTML and the character entities (the full 6 character special character starting with the ampersand and ending with the semi-colon), it is output exactly as entered. There is no conversion going on. The instances where you DO see what appears to be a curly apostrophe are actually invalid numeric character references. And it’s not encoding the quote marks at all.

    Wait a sec…let me put up another screenshot…

    So, to reiterate..

    1) For straight quote marks, no conversion occurs
    2) For straight apostrophes, converted using invalid numerical character references
    3) All other HTML seems to get output as straight HTML

    The only reason I typed in the character entities myself to begin with was because EE wasn’t doing any conversions. But that didn’t help because they were then output as straight HTML.

  • #13 / Oct 21, 2007 1:37pm

    Derek Allard

    3168 posts

    OK, we just have a misunderstanding here.  What you are seeing is expected behaviour.  EE converts all special characters.  It is converting your characters.  It’s making your & into an &amp; and that is therefore preserving the appearance that it is not converting.  In fact it is.  If you want to use line breaks, and to enter your own entities, you’ll need to make the change I outlined above.
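The effect described in this reply can be reproduced in isolation. The sketch below is only an illustration under assumptions: the sample description text is made up, and a plain `str_replace()` stands in for whatever ExpressionEngine does internally. It shows that once the ampersand is escaped, the browser decodes it back to a literal "&" and the visitor sees the entity exactly as typed.

```php
<?php
// Made-up sample of what an admin might type into a forum description.
$typed = 'Member&#8217;s lounge';

// Stand-in for the ampersand conversion: "&" becomes "&amp;".
$escaped = str_replace('&', '&amp;', $typed);

// This is the markup that would be sent to the browser.
echo $escaped, "\n";

// A browser decodes "&amp;" back to "&" exactly once, so the text shown
// on the page is the original string, entity and all.
$displayed = htmlspecialchars_decode($escaped);
echo $displayed, "\n";
```

So the conversion is happening, but its visible result is identical to the input, which is why it looks as though nothing was converted.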

  • #14 / Oct 21, 2007 2:55pm

    municipal

    165 posts

    Okay, bear with me…

    Just so I have this straight then…in the forum titles and descriptions…

    EE will not convert straight quote marks and other assorted typographical elements into their correct typographical counterpart. But it will attempt to do just that on apostrophes (single quotes), only it is encoding them with an invalid numerical character reference.

    So I guess my questions are these:

    1) How do I fix the invalid numerical character reference that is being generated for apostrophes?

    2) How does one get typographically correct XHTML in the forum titles and descriptions, similar to everywhere else on the forums? I presume this is where your hack from above comes into play?

    Um…can this be changed? Don’t people want typographically correct XHTML in their forum titles and descriptions?

  • #15 / Oct 21, 2007 4:15pm

    Derek Allard

    3168 posts

    No problem at all municipal.  These things get confusing.  EE will convert the following characters into their entity reference equivalent, and won’t touch the others:  '<', '>', '{', '}', '\'', '"', '?', '&'.

    So the conversion of the ‘&’ prevents your manually entering &#8220; from working.

    This is probably worthy of a feature request, but in the meantime, to get it to work, then yeah, I don’t see a way around changing the source code.  On looking at the code again, I have another change I’d rather you make.  The same line as before, around line 1988 of system/modules/forum/mod.forum_core.php

    $match['1'] = str_replace('{forum_description}', $this->_convert_special_chars($row['forum_description'], TRUE), $match['1']);

    into

    $match['1'] = str_replace('{forum_description}', $this->_convert_special_chars($row['forum_description'], FALSE), $match['1']);

    The only difference in the two is that I’ve changed a “TRUE” to say “FALSE”.
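To illustrate why flipping that flag matters, here is a rough sketch of a converter with an optional ampersand step. It is not the actual `_convert_special_chars()` source: the function name `convert_special_chars_sketch`, the reduced character map, and the sample description are all assumptions made for the example.

```php
<?php
// Hypothetical stand-in for a converter with an "also convert ampersands" flag.
// The real ExpressionEngine method may differ in its map and its details.
function convert_special_chars_sketch(string $str, bool $convert_amp): string
{
    // Reduced, assumed map: only a few of the characters listed in the thread.
    $map = ['<' => '&lt;', '>' => '&gt;', '"' => '&quot;'];

    if ($convert_amp) {
        $map['&'] = '&amp;';
    }

    // strtr() replaces in one pass, so the "&" inside a freshly produced
    // "&lt;" is never escaped a second time.
    return strtr($str, $map);
}

$description = 'Tips &#8220;and&#8221; tricks<br />';

// Flag on: hand-typed entities are escaped and show up literally on the page.
echo convert_special_chars_sketch($description, true), "\n";

// Flag off: hand-typed entities pass through untouched.
echo convert_special_chars_sketch($description, false), "\n";
```

In this sketch the `<br />` is still escaped when the flag is off, which is consistent with the earlier suggestion that removing the call altogether is what allows raw HTML such as line breaks.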
