1590 Re: [syndication] HTML Encoding BooHoo...
- May 23, 2001
>>Now, the "reversion of entities" code in my RSS reader doesn't know about
>>HTML - it just blindly reverts &lt; to < and so forth. Is the only solution
>>to my problem to make the code understand all the possible HTML entities?
>>Or is there something else?
>There's a fair bit of code around that removes all tags except a subset
>of "Allowable html". PHP even has this as a function built into the
Yes, but that wouldn't solve my above problem (** and see earlier message).
In this case, <XML> wasn't a tag, it was part of the actual <title>.
Removing all HTML tags wouldn't affect the <XML>, cos that's not a valid
HTML tag anyways... Right now, my reader:
- loads in an XML file.
- converts any encoded &lt;/&gt;'s to </> (to cover encoded HTML).
  This is a mass replacement, which causes the above problem.
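To make the failure concrete, here is a minimal Python sketch of that kind of mass replacement. It is not the reader's actual code; the function name and the sample strings are made up for illustration.

```python
def blind_revert(text):
    """Mass-replace encoded angle brackets, with no knowledge of HTML."""
    return text.replace("&lt;", "<").replace("&gt;", ">")

# Hypothetical values: a title that mentions a literal <XML>, and a
# description that carries entity-encoded HTML markup.
title = blind_revert("What is &lt;XML&gt;?")
description = blind_revert("&lt;b&gt;Bold&lt;/b&gt; news")

print(title)        # What is <XML>?
print(description)  # <b>Bold</b> news
```

After the replacement, the literal <XML> in the title looks exactly like markup, so anything that later renders the string as HTML will treat it as an unknown tag and most likely swallow it.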
Ultimately, I don't want to remove tags (that's not a decision I'm willing
to make for the users of my program, but it will be an option that they can
In this case, it's not even an issue of allowable tags or not - it's an
issue of preparing for people correctly encoding HTML (&lt;b&gt;) and not
encoding HTML (<b>).
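One possible way to cope with both cases, sketched in Python under some loud assumptions: decode entities once, keep a small set of recognised HTML tags as live markup, and re-escape everything else so it displays as text. The KNOWN_TAGS set, the regular expression, and the normalize name are all illustrative, and this is only one approach, not something anyone in the thread proposed.

```python
import html
import re

# Illustrative subset only; a real reader would need a fuller list.
KNOWN_TAGS = {"a", "b", "i", "em", "strong", "p", "br"}

# Matches something shaped like an opening or closing tag.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^<>]*>")

def normalize(text):
    """Keep known tags as markup; escape everything else as plain text."""
    decoded = html.unescape(text)  # &lt;b&gt; becomes <b>; a raw <b> is unchanged
    out = []
    pos = 0
    for m in TAG_RE.finditer(decoded):
        out.append(html.escape(decoded[pos:m.start()], quote=False))
        if m.group(1).lower() in KNOWN_TAGS:
            out.append(m.group(0))  # recognised HTML tag: leave it live
        else:
            out.append(html.escape(m.group(0), quote=False))  # e.g. <XML>
        pos = m.end()
    out.append(html.escape(decoded[pos:], quote=False))
    return "".join(out)
```

With this sketch, both "&lt;b&gt;hi&lt;/b&gt;" and "<b>hi</b>" come out as "<b>hi</b>", while "What is &lt;XML&gt;?" stays displayable as text. It cannot tell when someone really meant a literal <b> as text, so it is a heuristic at best.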
** I eventually tracked the culprit to nothing in my code, but rather the
XML::Simple Perl module, which seems to magick &lt;XML&gt; into <XML> all
by itself. I'm still investigating, but seeing the file encoded, and then
loading it through XML::Simple and Data::Dump[ing] it shows that it's
autoconverted. Why, I'm not sure...
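A likely explanation is that this is ordinary XML parsing rather than anything special in XML::Simple: a conforming XML parser resolves the predefined entities (&lt;, &gt;, &amp; and friends) in character data before handing the text to the caller. The sketch below shows the same effect with Python's standard-library parser. It is an analogue, not XML::Simple itself, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical item whose title contains an entity-encoded literal <XML>.
doc = "<item><title>What is &lt;XML&gt;?</title></item>"

# The parser decodes the entities while reading the character data.
title = ET.fromstring(doc).findtext("title")

print(title)  # What is <XML>?
```

If that is what is happening, the text is already decoded once by the time the reader sees it, and a further mass replacement of &lt; and &gt; is a second round of decoding.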
.sig on other machine.