RSS Profile and Extensions
- I was just made aware of RSS 2.0 support in Blogger.
One issue that caught my eye is the entity-encoded XHTML. So, I went
to the RSS profile and noticed that we don't address extensions in the
profile, or maybe I'm missing something.
I thought one of the reasons for the profile was to include the common
extensions. Is this the common thought?
I was thinking that in order to better accommodate microformats that we
should discuss a way of embedding XHTML, rather than encoding it.
<xhtml:body> has been used in the past and I'd like to re-use
something that many aggregators already support.
- Randy Morin wrote:
> One issue that caught my eye is the entity-encoded XHTML.
It looks to me like that outer XHTML div is just a vestigial remnant of the
Atom feed, which obviously requires the namespace when the content isn't
encoded. Most clients will just ignore the namespace and treat everything as
HTML anyway. I don't think there's any significance to it, other than maybe
indicating that the content is well-formed (which ironically it isn't).
> I was thinking that in order to better accommodate microformats that we
> should discuss a way of embedding XHTML, rather than encoding it.
> <xhtml:body> has been used in the past and I'd like to re-use
> something that many aggregators already support.
My last test results indicated 9 of 23 aggregators supported <xhtml:body>.
Neither IE7 nor the new Firefox made the list. One aggregator lost all
markup in the <xhtml:body> so it would actually lose functionality by the
inclusion of that element (actually nearly all aggregators lost the markup
if it used a namespace prefix).
I haven't done heavy testing of <xhtml:body>, but from my experience of
testing xhtml content in Atom I would say that in general the xhtml support
is considerably worse than escaped html. And bear in mind that when you
screw up inline xhtml it can make the entire feed invalid and thus
unreadable in aggregators that use strict XML parsers (IE7 for example).
Bottom line: <xhtml:body> is nice for hardcore XML zealots that think
escaping is evil, but if you want the broadest possible support for your
feed, you're better off just sticking with <description>.
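The point about strict parsers can be illustrated with a minimal sketch. The two feed snippets below are hypothetical, and Python's `xml.etree.ElementTree` only stands in for a strict XML parser; it says nothing about how any particular aggregator behaves. The same unclosed `<br>` is harmless when escaped inside `<description>`, but makes the whole document unparseable when inlined under `<xhtml:body>`.

```python
import xml.etree.ElementTree as ET

# Hypothetical item with escaped HTML: the unclosed <br> is only text here.
escaped_feed = """<rss version="2.0"><channel><item>
<description>&lt;p&gt;Hello&lt;br&gt;world&lt;/p&gt;</description>
</item></channel></rss>"""

# The same markup inlined as XHTML: the unclosed <br> is now an XML error.
inline_feed = """<rss version="2.0" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<channel><item>
<xhtml:body><xhtml:p>Hello<xhtml:br>world</xhtml:p></xhtml:body>
</item></channel></rss>"""


def parses_strictly(feed_text):
    """Return True if a strict XML parser accepts the whole feed."""
    try:
        ET.fromstring(feed_text)
        return True
    except ET.ParseError:
        return False


escaped_ok = parses_strictly(escaped_feed)
inline_ok = parses_strictly(inline_feed)
print(escaped_ok, inline_ok)
```

With escaped content the damage stays inside one item's text; with inline content a single markup slip costs the reader the entire feed.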
- James Holderness wrote:
> Bottom line: <xhtml:body> is nice for hardcore XML zealots that think
> escaping is evil, but if you want the broadest possible support for your
> feed, you're better off just sticking with <description>.
And then there are those, like http://www.thinksecret.com/rss.xml, who
simply put well formed XHTML (sans namespace) directly into their
descriptions. In that particular case, I somehow doubt that there is
any hardcore XML zealotry involved.
- Sam Ruby
- Thanks James. I figured support for it would be very low. If we
include support in the RSS profile, then maybe we can convince more
aggregators to get this working. Eventually, I'd like to be able to
use an xpath on an RSS file to find microformat data. Currently, with
embedded HTML, that's a less than obvious HOW TO DO.
Randy Charles Morin
--- In firstname.lastname@example.org, "James Holderness" <j4_james@...>
> My last test results indicated 9 of 23 aggregators supported <xhtml:body>.
> Neither IE7 nor the new Firefox made the list. One aggregator lost all
> markup in the <xhtml:body> so it would actually lose functionality by the
> inclusion of that element (actually nearly all aggregators lost the markup
> if it used a namespace prefix).
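As a rough sketch of what querying a feed for microformat data could look like when the XHTML is embedded rather than encoded: the feed, the hCard-style `vcard`/`fn` class names, and the person's name below are all made up for illustration, and Python's `xml.etree.ElementTree` supports only a limited XPath subset. The attribute test matches the exact `class` value, so an element with several class tokens would need a fuller XPath engine.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used by the path expressions below.
NS = {"xhtml": "http://www.w3.org/1999/xhtml"}

# Hypothetical RSS 2.0 item carrying inline XHTML with an hCard-like fragment.
feed = """<rss version="2.0" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<channel><item>
<title>Example</title>
<xhtml:body>
<xhtml:div class="vcard"><xhtml:span class="fn">Jane Doe</xhtml:span></xhtml:div>
</xhtml:body>
</item></channel></rss>"""

root = ET.fromstring(feed)

names = []
# Locate each inline body first, then search beneath it for class="fn" spans.
for body in root.iterfind(".//xhtml:body", NS):
    for span in body.iterfind(".//xhtml:span[@class='fn']", NS):
        names.append(span.text)

print(names)
```

No such query is possible against entity-encoded content without first decoding and re-parsing the description text.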
- * Randy Morin <randy@...> [2006-06-07 18:35]:
> Eventually, I'd like to be able to use an xpath on an RSS file
> to find microformat data. Currently, with embedded HTML, that's
> a less than obvious HOW TO DO.
In theory, evil hacks for this already work. libxslt supports the
EXSLT extensions which include an `encode-uri` function, so
theoretically you could URI-encode the tagsoup and append it to
the string `data:text/html,`, then pass that to the `document`
function to get it parsed, since libxml2 has a HTML tagsoup
parser that builds a DOM in addition to the parser for
well-formed XML. However, it seems libxslt doesn’t support the
`data:` scheme and it also appears that its `document` function
always goes directly to the XML parser, even if you enable HTML
tagsoup parsing mode.
I had a look at making this go at some point, though, and it
didn’t seem that the patches would be too difficult to write, but
like so many tuit-predicated things the effort never concluded.
If you find another, better implementation of EXSLT that sports a
tagsoup parser, you might get this working immediately. Or you
might try Mark Nottingham’s libxslt_web, which marries libxslt
with HTML Tidy so that you hand it tagsoup and get back an XHTML
nodeset, then hand it off to libxml2 for parsing. (Tidy copes
with more crud than libxml2’s tagsoup parser too.)
Of course, all this is a lot less optimal than having the content
be a first-class citizen of the feed document… but double-encoded
tagsoup is here to stay, so we’ll have to find ways to cope with it.
Aristotle Pagaltzis // <http://plasmasturm.org/>
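One way of coping with escaped tagsoup, sketched outside XSLT: this does not use libxslt, EXSLT, or Tidy at all, but swaps in Python's standard library. The XML parser decodes the entities in `<description>`, and the lenient `html.parser` then walks the resulting tagsoup. The feed, the `fn` class name, and the `ClassTextCollector` helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical item: escaped tagsoup with an uppercase tag and an unclosed <br>.
feed = """<rss version="2.0"><channel><item>
<description>&lt;DIV class="vcard"&gt;&lt;span class="fn"&gt;Jane Doe&lt;/span&gt;&lt;br&gt;&lt;/DIV&gt;</description>
</item></channel></rss>"""

# Elements that never take an end tag, so they must not affect nesting depth.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextCollector(HTMLParser):
    """Collect the text inside elements carrying a given class token."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0  # greater than zero while inside a matching element
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
        elif self.wanted in classes:
            self.depth = 1
            self.found.append("")

    def handle_endtag(self, tag):
        if tag not in VOID and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.found[-1] += data


root = ET.fromstring(feed)
collector = ClassTextCollector("fn")
for desc in root.iter("description"):
    # ElementTree has already turned &lt; and &gt; back into real markup here.
    collector.feed(desc.text or "")
collector.close()
names = collector.found
print(names)
```

The two-stage parse is the cost of double encoding: the feed parser sees only a string, so the microformat lookup has to happen in a second pass.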
- Sam Ruby wrote:
>And then there are those, like http://www.thinksecret.com/rss.xml, who
>simply put well formed XHTML (sans namespace) directly into their
>descriptions. In that particular case, I somehow doubt that there is
>any hardcore XML zealotry involved.
Not much to go on there, but the occasional use of uppercase element names
makes me think it's probably just good old HTML that someone forgot to
escape. Either way, definitely not an XML zealot - they'd never get
something like that wrong.
Incidentally, I mean XML zealot in the nicest possible way.
- secou wrote:
> Hi,
> For two days now, Feed Validator is down. Any explanation?
Only between around 06:00 and 12:00 GMT both days. My hosting provider
(Cornerhost) has had some unexplained issue which effectively disabled
HTTP access to all Cornerhost machines.
Meanwhile, the W3C hosts a stable snapshot at
- Sam Ruby