Stripping HTML and entities from a field
Posted: 25 October 2007 02:26 PM
Lab Assistant
Total Posts:  116
Joined  03-14-2007

Can I pull XHTML-formatted weblog entry fields into a template as plain text? I want no HTML, no entities, etc.

For example, let’s say I allow XHTML formatting of the {title} field, and now I want to use it in the page header in the <title> </title> tag. How do I strip out all HTML and entities? I don’t want to disable XHTML formatting in the underlying field, just in this particular display of its contents.

The HTML_Stripper plugin gets me halfway there—it removes HTML and, optionally, both < and > characters, but I’m still stuck with ampersands, non-breaking spaces, etc. The Find and Replace plugin could do it, too, if I felt like listing every possible HTML tag and entity in every usage of the plugin, but I’d rather not be updating my templates every time a user discovers a new tag or entity.

Posted: 25 October 2007 02:28 PM   [ # 1 ]
Administrator
Total Posts:  36398
Joined  05-14-2004

You would be much better off doing this the other way around: enter the content with no formatting, and wrap a plugin around the content in the template to add formatting.

Posted: 25 October 2007 02:32 PM   [ # 2 ]
Lab Assistant
Total Posts:  116
Joined  03-14-2007

That makes sense assuming I want to format the whole field, but what I’m trying to solve is much more a per-instance problem. Sometimes I want to underline a word, or use a non-breaking space, in my title field. I can’t think of how to do that efficiently at run-time.

Posted: 25 October 2007 02:34 PM   [ # 3 ]
Administrator
Total Posts:  36398
Joined  05-14-2004

Have you looked at other plugins, like TruncHTML? Your best option might be to extend a plugin to cover the scenarios you need. In any case, I’m going to move this up to the How To forum. =)

Posted: 25 October 2007 02:49 PM   [ # 4 ]
Lab Assistant
Total Posts:  116
Joined  03-14-2007

Thanks for the pointer, and for moving the post. I’m often unsure of the difference between tech support and “how to” questions.

I’ve read through all of the “Text Formatting” and “Output Text” plugins, and none seem to convert entities (like curly quotes) to plain text or remove them. Any suggestions would be greatly appreciated.

Posted: 26 October 2007 08:21 AM   [ # 5 ]
Lab Assistant
Total Posts:  123
Joined  01-05-2007

I use the Find and Replace plugin to strip HTML tags and newline sequences from the summary field, in order to use it in the meta description tag. I don’t replace any HTML entities or other special characters because they all seem to work fine in the meta description tag.

<meta name="description" content="{exp:replace find="/(<[^>]{1,}>|&nbsp;[\r\n]|[\r\n])/" regex="yes"}{summary}{/exp:replace}" />

To simply delete any HTML entities, you should search for an ampersand/semicolon sequence. I haven’t tested it, but I assume that changing the “find” string to

/(<[^>]{1,}>|&nbsp;[\r\n]|[\r\n]|&[^;]{1,};)/

would work fine. Converting HTML entities requires a lot more work, and also requires that you nest the plugin (I believe plugins are processed innermost first, aren’t they?)
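
For anyone who wants to try the untested pattern outside a template first, here is a minimal Python sketch. It assumes Python's re module behaves like the plugin's regex engine for this pattern, and the helper name strip_tags_and_entities is made up for illustration. It shows that entities are deleted rather than converted, and that a bare ampersand followed later by a semicolon gets swallowed as well.

```python
import re

# The "find" pattern from the post above, without the surrounding / delimiters.
# Assumption: Python's re module stands in for the plugin's regex engine.
FIND = re.compile(r"(<[^>]{1,}>|&nbsp;[\r\n]|[\r\n]|&[^;]{1,};)")


def strip_tags_and_entities(text: str) -> str:
    """Delete tags, line breaks, and anything shaped like &...; (hypothetical helper)."""
    return FIND.sub("", text)


print(strip_tags_and_entities("<em>Fish</em> &amp; chips&nbsp;today\n"))
# prints "Fish  chipstoday" (the entities are deleted, not converted)

print(strip_tags_and_entities("Tom & Jerry; the sequel"))
# prints "Tom  the sequel" (a bare ampersand swallows text up to the next semicolon)
```

Since deleting &nbsp; also deletes the space it stood for, words can run together, so it may be worth replacing entities with a space instead of an empty string.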

Posted: 02 June 2008 09:15 AM   [ # 6 ]
Grad Student
Total Posts:  51
Joined  11-27-2007

I’ve been using this combination successfully to remove both entities and HTML code:

{exp:replace find="\x26[^\x26\x3B]*\x3B" regex="yes"}
{exp:html_strip}
This will <strong>strip</strong> and/or replace&reg; both entities &amp; html code.
{/exp:html_strip}
{/exp:replace}
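
To see roughly what this combination produces, here is a small Python sketch of the same two steps. It assumes the inner tag runs first, uses a crude tag regex as a stand-in for html_strip (the real plugin may behave differently), and the name plain_text is hypothetical. In the pattern, \x26 is the ampersand and \x3B is the semicolon.

```python
import re

# Assumption: a crude stand-in for the html_strip plugin, not its real logic.
TAG = re.compile(r"<[^>]+>")

# The "find" pattern from the post: & followed by anything except & or ; up to a ;
ENTITY = re.compile(r"\x26[^\x26\x3B]*\x3B")


def plain_text(text: str) -> str:
    """Strip tags first (inner tag), then delete entities (outer tag)."""
    return ENTITY.sub("", TAG.sub("", text))


sample = "This will <strong>strip</strong> and/or replace&reg; both entities &amp; html code."
print(plain_text(sample))
# prints "This will strip and/or replace both entities  html code."

print(plain_text("A & B &amp; C"))
# prints "A & B  C" (the bare ampersand survives; only &amp; is removed)
```

Excluding the ampersand inside the character class keeps a match from running across a second ampersand, although a bare ampersand followed directly by text and a semicolon would still be removed.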

By the way, RegexBuddy is a fantastic tool for creating and testing regular expressions. I highly recommend it.
