263 Re: [radio-features] Re: Good example of runaway double encoding
Oct 9, 2002
OK, now we're both completely confused. <grin/>
The issue of /properly/ entity encoding in XML is a whole other can of worms.
It's /proper/ to declare use of HTML entities in the doctype. That's a whole
other headache. I'm not specifically addressing that mess here.
I'm simply stating that to put &nbsp; is wrong. To do &amp;nbsp; is
even worse. To mangle the UTF encoded ones is even more heinous. Radio quite
merrily commits all three of these sins.
Encoding of entities isn't all that hard once you grasp it. It's the sort of
grunt work that many programmers never seem to get around to doing PROPERLY.
This is indicative of many things inside Radio. But in this case that laziness
makes it harder for many other programs to decipher Radio's gibberish.
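
A minimal sketch of the encoding passes, in Python purely for illustration. Nothing here is Radio's actual code, and the sample string is made up. It assumes the standard library's xml.sax.saxutils and shows what one escaping pass and a second, runaway pass do to an HTML fragment, plus how a non-breaking space can be written as a numeric character reference, which needs no doctype declaration.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical sample: an HTML fragment containing a named HTML entity.
html_fragment = "a&nbsp;b"

# One pass: the ampersand is escaped so the fragment survives XML parsing as text.
once = escape(html_fragment)
print(once)    # a&amp;nbsp;b

# A second pass over already-escaped text: the runaway case.
twice = escape(once)
print(twice)   # a&amp;amp;nbsp;b

# A consumer that decodes once recovers the fragment only from the single pass.
print(unescape(once) == html_fragment)    # True
print(unescape(twice) == html_fragment)   # False

# A numeric character reference for U+00A0 is always legal in XML,
# unlike the named entity, which has to be declared first.
nbsp_ref = "\u00a0".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(nbsp_ref)  # &#160;
```

Each extra pass adds another "amp;", so whoever reads the feed has to guess how many times to decode.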
----- Original Message -----
> > Look at lines 106-108 of that XML.
> Ah, yeah, it was the other strange construction I saw last time. Sorry about
> > Double encoding of HTML entities is bad.
> I sure won't argue there!
> > There's also the issue of using HTML entities
> > inside an XML document without making note of
> > such in the declaration. That's a whole other
> > train wreck. One I'll avoid mentioning again.
> But isn't that the point of lines 106 through 108? The "&" in "&nbsp;" should
> be escaped because the author intends the tag's contents after parsing to be
> ampersand n b s p semicolon. The only way to have "&nbsp;" as PCDATA and still
> have the intended HTML after parsing would be to define the XML entity
> "&nbsp;" as expanding to "&nbsp;". I wouldn't call it double encoding just
> because XML and HTML's character entities use the same syntax.
> > As for wrapping in CDATA, there's no need to
> > entity encode at all. It'd be doubly stupid to
> > encode AND wrap in CDATA blocks.
> I agree. I thought this might be unclear after I reread my message (and read
> your similar post to rss-dev), so just to clarify: both in my previous message
> and above, by "CDATA" I mean PCDATA that's been parsed, not necessarily text
> inside a CDATA section as defined in section 2.7 of the XML spec. I thought
> "CDATA" was the name for that, as DTDs use the term as the appropriate
> alternative to PCDATA. Is there a different name I should be using instead?
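
The CDATA-versus-escaping point above can be checked with a parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the <description> element and the sample strings are made up for illustration and are not taken from the feed under discussion.

```python
import xml.etree.ElementTree as ET

# Three hypothetical ways of carrying the HTML text "a&nbsp;b" in an XML element.
escaped = "<description>a&amp;nbsp;b</description>"             # ampersand escaped once
cdata = "<description><![CDATA[a&nbsp;b]]></description>"       # CDATA section, no escaping
undeclared = "<description>a&nbsp;b</description>"              # named entity, never declared


def parsed_text(doc: str) -> str:
    """Return the element's character data after XML parsing."""
    return ET.fromstring(doc).text


def is_well_formed(doc: str) -> bool:
    """Report whether the parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(parsed_text(escaped))                        # a&nbsp;b
print(parsed_text(cdata))                          # a&nbsp;b
print(parsed_text(escaped) == parsed_text(cdata))  # True
print(is_well_formed(undeclared))                  # False
```

Escaping once and wrapping in a CDATA section hand the consumer the same text after parsing, so doing both gains nothing, and the bare named entity is rejected unless a DTD declares it.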