I’m trying to pull images from the {body} of blog posts. How can I properly escape {body} for use in either XML parsing or just preg_match?
<?php
// I can use this to grab the body, but the result won't parse
ob_start();
?>{body}<?php
$body = ob_get_contents();
ob_end_clean();
// this won't parse either
$body = '{body}';
$body = htmlentities($body);
// but this works
$body = '<a href="http://www.test.com/">http://www.x.com/thumbnails/x.jpg</a>';
preg_match_all('/<img[^>]+>/i',$body, $result);
$doc = new DOMDocument();
$doc->loadHTML($body);
$xml = simplexml_import_dom($doc);
$images = $xml->xpath('//img');
foreach ($images as $img) {
    echo $img["src"];
}
?>
The HTML I’m expecting in {body} doesn’t seem to be funky, so what gives? How can I escape {body} so that it can be parsed properly by either preg_match or the XML parser?
Thanks!
-dave
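
One possible approach, sketched under some assumptions: the template engine substitutes {body} into the file before PHP runs, so the output-buffer capture shown above is the safest way to get the raw HTML into a variable, and the parse failures come from DOMDocument complaining about loose markup rather than from anything that needs escaping. The function name extract_image_sources and the sample markup are hypothetical, made up for illustration and not part of any blog engine's API.

```php
<?php
// Hypothetical helper (not from the original post): returns the src
// attribute of every <img> found in an HTML fragment.
function extract_image_sources(string $html): array
{
    // Nothing to parse; newer PHP versions also reject an empty string in loadHTML().
    if (trim($html) === '') {
        return [];
    }

    // Collect libxml warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a full document; the meta tag hints that the input is UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html
        . '</body></html>'
    );

    // Discard the collected warnings and restore the previous error handling.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Sample markup standing in for the captured {body} (an assumption for this sketch).
// Note the apostrophe and the unclosed <p>, both common in real post bodies.
$body = '<p>Dave\'s post <img src="http://www.x.com/thumbnails/x.jpg" alt="x"><p>more text';

foreach (extract_image_sources($body) as $src) {
    echo $src, "\n";
}
```

If the engine really does paste {body} in before PHP parses the file, then $body = '{body}'; would break on the first apostrophe in a post, which may be why that variant fails. Calling htmlentities() on the body turns the tags into plain text, so neither the regex nor the parser would find any <img> elements afterwards.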