It should just be a matter of constructing the Template correctly. We’re certainly planning on adjusting the current 0.3 one to 1.0, but we haven’t done so yet.
Can I just mention that the “human readable” information available regarding Atom 1.0 is sorely lacking at this point. I have a Template that’s almost complete, but for some reason I’m getting errors regarding the “&nbsp;” character entity. [sigh]
I’ll go ahead and attach a text version of the “Atom 1.0” Template as it stood when I last stopped working on it. It validated fine aside from the presence of a non-breaking space character (&nbsp;) in the main “body” content. I was unable to determine why the validator was claiming it was an “unidentified entity”.
Well, I played around with it some and had the same issue as you: it claimed &nbsp; was an unidentified character. It seems to work fine in NetNewsWire and Safari RSS, however, so I’m going to chalk it up to a validator bug.
The only way around it that I found was to put the content section into a CDATA block, but that caused some rendering errors in Safari.
Well, supposedly the use of the “type” attribute on <content> and the “xmlns” attribute on the <div> inside should make use of the CDATA bit redundant (or maybe even invalid, I’m not sure). My guess is that if the numeric entity were used for the non-breaking space instead of the character entity then it might work fine. I haven’t tested that, though. That still wouldn’t explain why the character entity is “undefined”.
The practical information available for Atom 1.0 (at least as of a few days ago when I was working on it) is virtually non-existent right now. Further, it’s not even an approved format at this point so it isn’t really even “official” or “released”. It’s simply a draft. I never even really “got” the point behind Atom and why it’s somehow better than RSS 2.0 (or 1.0, for that matter) in any practical, real-world scenario. My feeling is that until there is more useful information on the format available and until it actually is an approved format then it’s probably not worth spending the effort on this. [shrug]
Well, the reason CDATA would work is that anything inside a CDATA block is basically exempt from parsing when the feed is validated (though it still needs to be well formed for the reader to interpret it).
Speaking of the numeric value for the non-breaking space, where would I find that in the core files so I can modify it and test whether using the numeric value fixes the issue?
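To make the CDATA point concrete, here is a minimal Python sketch. It is not ExpressionEngine code, and the one-element document is a made-up stand-in for an Atom content body. It only shows that an XML parser returns the inside of a CDATA section as plain character data, without looking up entities or tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document standing in for an Atom <content> body.
cdata_doc = "<content><![CDATA[Hello&nbsp;world <b>unclosed]]></content>"

# The parser hands back the CDATA section as plain character data:
# "&nbsp;" is not looked up as an entity and "<b>" is not treated as a tag.
element = ET.fromstring(cdata_doc)
cdata_text = element.text
print(repr(cdata_text))
```

The parser accepts this even though the markup inside is not well formed, which is why the validator stops complaining while a feed reader can still trip over the content.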
Yes, CDATA is used with RSS to do that, but the technical specification document for the Atom 1.0 draft suggested to me that it should not be used in this case in Atom 1.0. According to it, if you have the “type” and “xmlns” defined correctly for XHTML, then you should simply put valid, well-formed XHTML in as the content. The question is, why is the non-breaking space supposedly “invalid” (“undefined”, specifically) in this case? It’s certainly allowed in standard XHTML. Is it really invalid or is feedparser.org returning a false-positive here?
As for using the numeric entity, do a search through system/core/core.typography.php for “nbsp;”. There are 8 instances I see in a quick search.
No, I completely agree, CDATA is unnecessary in Atom 1.0. I have a feeling it’s a parser issue, but I’ll play with the typography file for a bit to see if it makes a difference.
Edit: I also filed a bug report on the Feedvalidator.org bug tracker to have them look into it.
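&nbsp;, while valid XHTML, is not valid XML. To be proper XML it would need to be either &amp;nbsp; or the numeric value (&#160;). This would be the case for any named entity used in place of its numeric form, apart from the five that XML itself predefines (&amp;, &lt;, &gt;, &quot; and &apos;).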
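This is easy to check with any XML parser that is not given a DTD. The Python sketch below assumes only the standard library; the helper name `parses_as_xml` and the sample fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML when no DTD is supplied."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


named = "<div>a&nbsp;b</div>"        # named entity, undefined without a DTD
numeric = "<div>a&#160;b</div>"      # numeric character reference
escaped = "<div>a&amp;nbsp;b</div>"  # ampersand escaped, entity kept as text

print(parses_as_xml(named), parses_as_xml(numeric), parses_as_xml(escaped))
```

Only the named form is rejected. The numeric form comes back as a real non-breaking space, while the escaped form comes back as the literal text `&nbsp;`, so the two fixes are not interchangeable in every kind of content.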
I tried wrapping the content section in the XML_encode plugin, which does get the feed to validate, but it strips out way too much, so the feeds contain a lot of garbage when viewed in a reader. So nanovivid and I are working on another plugin specifically for this task. I’ll let you know how it goes.
If this works then it may be something you guys might want to look into just incorporating generally into the RSS module if possible to “future proof” feeds.
Thanks for continuing to look into this, schweb. I’m curious where you found the information about the character entities not being valid XML; I’d like to read up on that.
Ok the plugin is made (and submitted to the plugin library) that fixes this problem and turns named entities into valid XML without messing up the rest of the encoding.
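For readers curious what that kind of conversion looks like, here is a rough Python sketch. It is not the plugin's code, which is not shown in this thread. The function name `named_to_numeric` is hypothetical, and it relies on Python's standard HTML 4 entity table.

```python
import re
from html.entities import name2codepoint

# XML predefines these five entities, so they can stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &nbsp; but not numeric ones such as &#160;.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup: str) -> str:
    """Rewrite known HTML named entities as numeric character references."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Leave XML's own entities and unrecognised names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return ENTITY_RE.sub(replace, markup)


print(named_to_numeric("a&nbsp;b &amp; &copy;"))  # a&#160;b &amp; &#169;
```

Touching only the entity references, and nothing else in the markup, is what keeps the rest of the encoding intact.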
No problem Chris, I’m kind of an Atom junior evangelist so I really wanted to get this working. My first clue that it wasn’t a validator issue was that LiveJournal was also giving an error on the feed. Here’s more information on the whole XML character entity library and its issues. Not that it specifically spells out the problem, but through inference you can deduce it.