I spent many hours last weekend working on an import script for WordPress XML exports, and thought I would share it in the hope that it saves someone out there some time. It’s not a general-use, plug-and-play type of thing; it’s very specific to my project. But a web developer should be able to use it as a starting point for a WordPress import on their own project.
For me there were two big stumbling blocks: 1) how to properly use PHP’s SimpleXML to parse the WordPress XML export, and 2) how to properly escape the characters in the XML for import into the DB. If nothing else, the code serves as a guide on those two points.
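To illustrate the first stumbling block, here is a minimal sketch (not the script itself) of reading a WordPress export with SimpleXML. The usual catch is that fields such as `wp:status` and `content:encoded` live in XML namespaces, so plain property access does not see them; `children()` with a prefix does. The sample XML, the `$posts` array, and its keys are made up for the example, and the namespace URIs are the ones commonly seen in exports, which may differ by WordPress version.

```php
<?php
// Tiny stand-in for a WordPress export file; real exports are much larger.
$wxr = <<<'WXR'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello &amp; welcome</title>
      <content:encoded><![CDATA[<p>It's a "quoted" post</p>]]></content:encoded>
      <wp:status>publish</wp:status>
    </item>
  </channel>
</rss>
WXR;

$xml = simplexml_load_string($wxr);

$posts = [];
foreach ($xml->channel->item as $item) {
    // Namespaced children are not reachable via $item->status;
    // children($prefix, true) selects them by namespace prefix.
    $wp      = $item->children('wp', true);
    $content = $item->children('content', true);

    // Casting to string yields the decoded text, including CDATA contents.
    $posts[] = [
        'title'  => (string) $item->title,
        'body'   => (string) $content->encoded,
        'status' => (string) $wp->status,
    ];
}
```

After the casts, each value is an ordinary PHP string with XML entities already decoded, so any quoting or escaping for the database has to happen afterwards, on these strings, using whatever escaping or parameter binding the database layer provides.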
Posts: All posts are imported and given a proper status.
Comments: Only approved comments are imported, to prune the large amount of spam I was dealing with.
Categories: WordPress categories are imported into an EE category group, which must already contain the matching categories.
Tags: WordPress tags are imported into a separate EE category group (which may start out empty) and are created on the fly if they don’t already exist.
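The comment, category, and tag rules above can be sketched roughly as follows. This is an illustration under assumptions, not the actual import code: the two arrays stand in for the EE category groups, `find_or_create()` is a hypothetical helper, and the sample assumes an export where approved comments carry `wp:comment_approved` of `1` and tags use `domain="post_tag"` (older exports may label tags differently).

```php
<?php
$wxr = <<<'WXR'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>First post</title>
      <category domain="category" nicename="news"><![CDATA[News]]></category>
      <category domain="post_tag" nicename="php"><![CDATA[PHP]]></category>
      <category domain="post_tag" nicename="xml"><![CDATA[XML]]></category>
      <wp:comment>
        <wp:comment_author><![CDATA[Ann]]></wp:comment_author>
        <wp:comment_approved>1</wp:comment_approved>
      </wp:comment>
      <wp:comment>
        <wp:comment_author><![CDATA[Spammer]]></wp:comment_author>
        <wp:comment_approved>spam</wp:comment_approved>
      </wp:comment>
    </item>
  </channel>
</rss>
WXR;

// Hypothetical stand-ins for the two category groups (slug => id).
$categoryGroup = ['news' => 1]; // categories must already exist here
$tagGroup      = [];            // tags may start empty

// Hypothetical helper: return the id for $slug, creating it if missing.
function find_or_create(array &$group, string $slug): int
{
    if (!isset($group[$slug])) {
        $group[$slug] = count($group) + 1; // pretend auto-increment id
    }
    return $group[$slug];
}

$xml      = simplexml_load_string($wxr);
$imported = [];

foreach ($xml->channel->item as $item) {
    // Keep only comments marked as approved; spam and pending are skipped.
    $approved = [];
    foreach ($item->children('wp', true)->comment as $comment) {
        $fields = $comment->children('wp', true);
        if ((string) $fields->comment_approved === '1') {
            $approved[] = (string) $fields->comment_author;
        }
    }

    // Categories and tags share the <category> element; "domain" tells them apart.
    $categoryIds = [];
    $tagIds      = [];
    foreach ($item->category as $cat) {
        $slug   = (string) $cat['nicename'];
        $domain = (string) $cat['domain'];
        if ($domain === 'category' && isset($categoryGroup[$slug])) {
            $categoryIds[] = $categoryGroup[$slug];
        } elseif ($domain === 'post_tag') {
            $tagIds[] = find_or_create($tagGroup, $slug);
        }
    }

    $imported[] = [
        'title'      => (string) $item->title,
        'comments'   => $approved,
        'categories' => $categoryIds,
        'tags'       => $tagIds,
    ];
}
```

In this sketch a category that is missing from the predefined group is silently skipped, while an unknown tag is added to the tag group; a real import might prefer to log the skipped categories.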