EExcerpt using Low "Find and Replace" - Anyone good with RegEx?

Development and Programming

Juan Largo

11 posts

16 years ago

Hi all,

I’m currently using EExcerpt to create short summary views of each entry on my blog. While it does a great job stripping out all the images from each post, oftentimes I will have captions alongside those images that will remain intact.

I am attempting to format the results of my EExcerpt query using Low’s “Find and Replace” plugin along with a custom RegEx to identify the caption code and strip it out as well.

Each one of my captions is formatted via a custom style in Wygwam that wraps them in the following code:

<p class="caption">…</p>

…so I have been attempting to use the following RegEx to identify that snippet:

<p[^>]*class\s*=\s*(['\"])caption\1[^>]*>(.*?)</p>

My final code block looks like this:

{exp:md_eexcerpt if_exceeds="50" stop_after="50"}
    {exp:replace find="<p[^>]*class\\s*=\\s*(['\\"])caption\1[^>]*>(.*?)</p>" regex="yes"}
    {body}
    {/exp:replace}
{/exp:md_eexcerpt}

This code gives me the following error:

Warning: preg_replace() [function.preg-replace]: No ending delimiter '/' found in /public_html/system/plugins/pi.replace.php on line 72

I guess my first question would be: do any of you guys have experience with this type of implementation? Am I going about this the wrong way? Part of me is thinking this wouldn’t work regardless, as EExcerpt might strip HTML from the result anyway, so there’d be nothing to match the expression to.

And if that’s not the case, is there any chance someone might be able to point me in the right direction with my RegEx code?

Thanks!

Benjamin David

77 posts

16 years ago

My guess is that it comes from the closing paragraph tag:

</p>

It looks like the slash is being used as the delimiter in the preg_replace call, so you'll have to escape it:

<\/p>

Hope this works for you!

Benjamin David

77 posts

16 years ago

You might also have a look at Kses, a PHP script that does a great job of cleaning HTML tags and attributes. Someone should build an ExpressionEngine add-on with this one 😊

http://sourceforge.net/projects/kses/

Juan Largo

11 posts

16 years ago

Benjamin,

Thanks! That definitely took care of the error, but still didn’t strip the HTML. I’ve got a feeling that it’s interfering with the EExcerpt plugin, which might be stripping the tags before the RegEx gets a chance to process them. Maybe this isn’t the way to go after all.

Thanks for the heads up on Kses, I will definitely check it out!
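
To illustrate the ordering concern, here is a rough Python sketch (Python's re module rather than PHP's preg_replace, so no delimiters are involved). The sample body, the helper names, and the crude strip_tags function are all assumptions made for illustration; this is not how EExcerpt is actually implemented. It only shows that a caption pattern has nothing to match once the tags are already gone.

```python
import re

# Hypothetical entry body; the caption markup is assumed from the
# class="caption" paragraphs described in this thread.
body = '<p>Intro text.</p><p class="caption">Photo caption</p><p>More text.</p>'

# The caption pattern from the first post, with the closing tag restored.
CAPTION_RE = re.compile(
    r"""<p[^>]*class\s*=\s*(['"])caption\1[^>]*>(.*?)</p>""", re.DOTALL
)
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(html):
    """Crude stand-in (assumption) for an excerpt plugin's tag stripping."""
    return TAG_RE.sub(" ", html)


def remove_captions(html):
    """Delete whole caption paragraphs, tags and text together."""
    return CAPTION_RE.sub("", html)


def tidy(text):
    """Collapse runs of whitespace so the results are easy to compare."""
    return " ".join(text.split())


# Tags stripped first: the caption pattern no longer matches anything.
tags_first = tidy(remove_captions(strip_tags(body)))

# Captions removed first: the caption text is gone before stripping.
captions_first = tidy(strip_tags(remove_captions(body)))

print(tags_first)      # caption text survives
print(captions_first)  # caption text removed
```

If the real plugins run in the first order, the caption text would survive no matter how good the RegEx is.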

ender

1,644 posts

16 years ago

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Regex can't reliably parse HTML. You can almost always come up with some HTML that will break a given regex.

That said, if you know your data very well you might be able to use something like this as your regex:

<p class="caption">.*?<\/p>

The key here is the .*?, which is a lazy (non-greedy) match. It stops at the first </p> after the opening caption tag, which keeps the regex from consuming everything up to the last </p> in your field (almost certainly more than you want to delete).
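
As a quick illustration of the difference, here is a minimal Python sketch (Python's re module, so there are no delimiters and the slash needs no escaping). The sample body is made up, and it assumes the captions are plain class="caption" paragraphs with no nested paragraphs inside them.

```python
import re

# Hypothetical field content with one caption between two normal paragraphs.
body = (
    '<p>First paragraph.</p>'
    '<p class="caption">Photo by someone</p>'
    '<p>Second paragraph.</p>'
)

# Greedy: .* runs on to the LAST </p> in the field.
greedy = re.compile(r'<p class="caption">.*</p>', re.DOTALL)

# Lazy: .*? stops at the FIRST </p> after the opening caption tag.
lazy = re.compile(r'<p class="caption">.*?</p>', re.DOTALL)

greedy_result = greedy.sub('', body)
lazy_result = lazy.sub('', body)

print(greedy_result)  # the second paragraph is deleted along with the caption
print(lazy_result)    # only the caption paragraph is deleted
```

The greedy version swallows the caption and everything after it, while the lazy version removes just the caption paragraph.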

brittanyA

184 posts

16 years ago

Hi Juan Largo,

I’m having this exact same problem. Any luck finding a solution?

Thanks!
