Hi all,
I’m currently using EExcerpt to create short summary views of each entry on my blog. While it does a great job stripping out all the images from each post, oftentimes I will have captions alongside those images that will remain intact.
I am attempting to format the results of my EExcerpt query using Low’s “Find and Replace” plugin along with a custom RegEx to identify the caption code and strip it out as well.
Each one of my captions is formatted via a custom style in Wygwam that wraps them in the following code:
…so I have been attempting to use the following RegEx to identify that snippet:
<p[^>]*class\s*=\s*(['\"])caption\1[^>]*>(.*?)
My final code block looks like this:
{exp:md_eexcerpt if_exceeds="50" stop_after="50"}
{exp:replace find="<p[^>]*class\\s*=\\s*(['\\"])caption\1[^>]*>(.*?)" regex="yes"}
{body}
{/exp:replace}
{/exp:md_eexcerpt}
This code gives me the following error:
Warning: preg_replace() [function.preg-replace]: No ending delimiter '/' found in /public_html/system/plugins/pi.replace.php on line 72
I guess my first question would be: do any of you guys have experience with this type of implementation? Am I going about this the wrong way? Part of me is thinking this wouldn’t work regardless, as EExcerpt might strip HTML from the result anyway, so there’d be nothing to match the expression to.
And if that’s not the case, is there any chance someone might be able to point me in the right direction with my RegEx code?
Thanks!
Benjamin,
Thanks! That definitely took care of the error, but still didn’t strip the HTML. I’ve got a feeling that it’s interfering with the EExcerpt plugin, which might be stripping the tags before the RegEx gets a chance to process them. Maybe this isn’t the way to go after all.
Thanks for the heads up on Kses, I will definitely check it out!
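To illustrate the ordering concern above, here is a rough Python sketch (not ExpressionEngine code). It assumes the caption markup is a `<p class="caption">…</p>` paragraph, adds a closing `</p>` to the pattern, and uses a crude tag stripper as a hypothetical stand-in for whatever EExcerpt actually does internally, which this thread does not confirm.

```python
import re

# Caption pattern from the thread, with a closing </p> appended (an assumption).
CAPTION_RE = re.compile(
    r"""<p[^>]*class\s*=\s*(['"])caption\1[^>]*>(.*?)</p>""", re.DOTALL
)
# Crude tag stripper: a hypothetical stand-in for an excerpt plugin's HTML stripping.
TAG_RE = re.compile(r"<[^>]+>")

# Made-up sample entry body for illustration only.
body = '<p class="caption">A photo caption.</p><p>First paragraph of the post.</p>'

# Order 1: tags are stripped first, so the caption pattern has nothing left to match.
stripped_first = CAPTION_RE.sub("", TAG_RE.sub("", body))

# Order 2: captions are removed first, then the remaining tags are stripped.
captions_first = TAG_RE.sub("", CAPTION_RE.sub("", body))

print(stripped_first)
print(captions_first)
```

If the stripping really does happen before the replace, the caption text survives as in the first result, which would match the behavior described above.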
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
Regex doesn’t parse HTML. You can almost always come up with some HTML that will break a regex.
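As a quick illustration in Python (the sample markup is made up, and the closing `</p>` on the pattern is an assumption), here are a few perfectly ordinary variations of a caption paragraph that the pattern from the original post would miss:

```python
import re

# Pattern from the original post, with a closing </p> appended (an assumption).
CAPTION_RE = re.compile(
    r"""<p[^>]*class\s*=\s*(['"])caption\1[^>]*>(.*?)</p>""", re.DOTALL
)

# Hypothetical variations of caption markup a browser would treat alike.
samples = {
    "plain": '<p class="caption">Sunset</p>',
    "extra_class": '<p class="caption wide">Sunset</p>',  # second class name
    "unquoted": '<p class=caption>Sunset</p>',            # no quotes around the value
    "uppercase_tag": '<P CLASS="caption">Sunset</P>',     # different letter case
}

# Record which samples the pattern actually finds.
matched = {name: bool(CAPTION_RE.search(html)) for name, html in samples.items()}
print(matched)
```

Only the first sample matches. Each miss can be patched individually, but every patch makes the pattern harder to read, which is the usual argument for a real HTML parser when the input is not tightly controlled.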
That said, if you know your data very well you might be able to use something like this as your regex:
.*?
The key here is the .*?, which is a lazy (non-greedy) match. This should keep the regex from consuming everything until the last </p> in your field (which would almost certainly be more than you want to delete).
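Here is a small Python sketch of the difference between greedy and lazy matching. The surrounding `<p class="caption">` and `</p>` in the patterns are an assumption about the caption markup, and the sample field content is invented.

```python
import re

# Invented field content: one caption followed by two ordinary paragraphs.
html = '<p class="caption">Caption text</p><p>Body one.</p><p>Body two.</p>'

# Greedy .* runs to the LAST </p> in the field, so everything is deleted.
greedy = re.sub(r'<p class="caption">.*</p>', "", html)

# Lazy .*? stops at the FIRST </p>, so only the caption is deleted.
lazy = re.sub(r'<p class="caption">.*?</p>', "", html)

print(repr(greedy))
print(repr(lazy))
```

PHP's preg_replace uses the same PCRE-style quantifiers, so the lazy form should behave the same way there, provided the pattern is wrapped in delimiters as that function requires.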