G. Ken Holman wrote:
> Do you have the opportunity in the process to run "tidy"? This is an
> open tool that converts tag soup into XML, which is then suitable for XSLT:
>
> http://tidy.sourceforge.net/
>
> The only input to XSLT is well-formed XML and this would give it to
> you. HTML of any SGML flavour (messy or not messy) is
> unacceptable. XHTML is acceptable, but then you wouldn't be having
> any problems because it would be well-formed.
Perhaps even easier, Ken:
http://ccil.org/~cowan/XML/tagsoup/ links a variant of tidy
into Saxon, so that the HTML can be processed in Saxon as if
it were well-formed XHTML.
Just call up John's jar instead of saxon.jar and it all
comes out well.
Really nice.
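
For anyone wondering why the clean-up step matters at all, here is a
minimal sketch of Ken's point that XSLT only accepts well-formed XML.
It uses only the JAXP classes that ship with the JDK (not Saxon, and
not John's jar), and the class and method names are my own, purely for
illustration. It runs the identity transform over a string and reports
whether the processor could read it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Saxon, tidy or TagSoup.
class WellFormedCheck {

    // Returns true if the JDK's built-in XSLT processor can read the markup.
    static boolean transforms(String markup) {
        try {
            // newTransformer() with no stylesheet yields the identity transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.transform(new StreamSource(new StringReader(markup)),
                               new StreamResult(new StringWriter()));
            return true;
        } catch (TransformerException e) {
            // The XML parser behind the transformer rejected the input.
            return false;
        }
    }

    public static void main(String[] args) {
        // Tag soup: <br> and <p> are never closed.
        System.out.println(transforms("<p>one<br>two"));
        // The same content written as well-formed XHTML.
        System.out.println(transforms("<p>one<br/>two</p>"));
    }
}
```

The unclosed version should be refused and the XHTML version accepted,
which is exactly the gap that tidy, or the tagsoup jar, fills before
the stylesheet ever sees the document.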
Regards,
--
Dave Pawson
XSLT, XSL-FO and Docbook FAQ
http://www.dpawson.co.uk