Re: xml string node to xsl:fo

  • astragh
    Message 1 of 4 , Oct 15, 2007
      Hi Ken,

      Firstly thanks so much for taking the time out to reply to my issue.

      >>Fine .... but I'm guessing this process does not mandate
      the "proper" use of element start and end tags (nesting, completion,
      etc.).

      We would be looking at implementing a framework that would
      produce "proper" end and start tags. Here is a list of what we are
      currently looking at:
      MSHTML: http://msdn2.microsoft.com/en-us/library/aa753630.aspx (part
      of IE)
      TinyMCE: http://tinymce.moxiecode.com/
      FCKeditor: http://www.fckeditor.net/
      OpenWYSIWYG: http://www.openwebware.com/products/openwysiwyg/

      >>Do you have the opportunity in the process to run "tidy"? This is
      an open tool that converts tag soup into XML, which is then suitable
      for XSLT:

      Not unless we can run it in a web application server.

      >>The painful alternative is to use XSLT 2.0 (which isn't painful),
      read in the HTML fragments as a text string, and to do your own
      string analysis in a piecemeal fashion to reconstruct the structure
      from the mess of tag soup from the user. Not at all pleasant, but
      doable.

      This seems to be where all my R&D is pointing me.

      So basically if we can guarantee the use of valid html tags we should
      be able to do the conversion between html tags and xml (xsl:fo?)
      tags? And if so, where would be my starting point in completing this?

      Again with many thanks,

      -george
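
As one possible starting point for the question above, here is a minimal sketch of mapping already well-formed HTML tags onto XSL-FO elements. It is written in Python rather than XSLT purely for illustration, and it is not anyone's production approach from this thread. The wrapper element name `description`, the `INLINE_PROPS` table, and the `convert` and `html_fragment_to_fo` helpers are all assumptions invented for the example; only `<b>`, `<i>`, `<ul>` and `<li>` are handled.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Hypothetical mapping from inline HTML tags to fo:inline properties.
INLINE_PROPS = {
    "b": {"font-weight": "bold"},
    "i": {"font-style": "italic"},
}


def fo(name):
    """Return the namespace-qualified name that ElementTree expects."""
    return "{%s}%s" % (FO_NS, name)


def convert(node, parent):
    """Append the XSL-FO counterpart of an HTML node under parent."""
    if node.tag in INLINE_PROPS:
        outer = target = ET.SubElement(parent, fo("inline"), INLINE_PROPS[node.tag])
    elif node.tag == "ul":
        outer = target = ET.SubElement(parent, fo("list-block"))
    elif node.tag == "li":
        # A list item needs a label (the bullet) and a body holding a block.
        outer = ET.SubElement(parent, fo("list-item"))
        label = ET.SubElement(outer, fo("list-item-label"), {"end-indent": "label-end()"})
        ET.SubElement(label, fo("block")).text = "\u2022"
        body = ET.SubElement(outer, fo("list-item-body"), {"start-indent": "body-start()"})
        target = ET.SubElement(body, fo("block"))
    else:
        # Anything unrecognised, including the wrapper element, becomes a block.
        outer = target = ET.SubElement(parent, fo("block"))
    target.text = node.text
    for child in node:
        convert(child, target)
    # Text that followed the HTML element follows its FO counterpart.
    outer.tail = node.tail


def html_fragment_to_fo(xml_text):
    """Parse a well-formed fragment and return its XSL-FO form as a string."""
    holder = ET.Element("holder")
    convert(ET.fromstring(xml_text), holder)
    return ET.tostring(holder[0], encoding="unicode")


print(html_fragment_to_fo(
    "<description>Some <b>bold</b> and <i>italic</i> text.</description>"))
```

In an XSLT stylesheet the same idea would be one template per HTML element (for example, a template matching `b` that emits `fo:inline` with `font-weight="bold"`), but that only works once the stored fragment is well-formed XML.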



      --- In XSL-FO@yahoogroups.com, "G. Ken Holman" <gkholman@...> wrote:
      >
      > At 2007-10-15 11:09 +0000, astragh wrote:
      > >Hi to the forum,
      > >
      > >We currently use xsl fo for styling our xml.
      > >Our application takes user submitted content and adds it to our
      > >database where we then extract this content and transform it into an
      > >xml tree. This is then forwarded to our .xsl file and then produces
      > >the PDF.
      >
      > Sounds good so far!
      >
      > >We would like to allow our application to accept html tags using a
      > >wysiwyg module above the textarea where the client can then format
      > >their descriptions with various tags i.e. <b>, <i>, <ul> etc.
      >
      > Fine .... but I'm guessing this process does not mandate the "proper"
      > use of element start and end tags (nesting, completion, etc.).
      >
      > >This works ok when the user is forwarded to the next JSP page all
      > >formatting is saved to the database and displayed correctly, further
      > >calls to display this page gets the information from the database and
      > >again displays all information.
      >
      > Right ... because HTML browsers are forgiving of discrepancies that
      > would otherwise be considered errors in a well-formedness check. Any
      > mess entered by the users is acceptable to the end processing in the
      > browser client.
      >
      > >The problem is when we request this
      > >view in the PDF format XSL can't handle the html tags. Is there a way
      > >that I can make the request to the database during, after or before
      > >we process our XML tree and swap out the html tags with xsl? And
      > >would this work?
      >
      > Do you have the opportunity in the process to run "tidy"? This is an
      > open tool that converts tag soup into XML, which is then suitable for XSLT:
      >
      > http://tidy.sourceforge.net/
      >
      > The only input to XSLT is well-formed XML and this would give it to
      > you. HTML of any SGML flavour (messy or not messy) is
      > unacceptable. XHTML is acceptable, but then you wouldn't be having
      > any problems because it would be well-formed.
      >
      > The painful alternative is to use XSLT 2.0 (which isn't painful),
      > read in the HTML fragments as a text string, and to do your own
      > string analysis in a piecemeal fashion to reconstruct the structure
      > from the mess of tag soup from the user. Not at all pleasant, but doable.
      >
      > >Sorry if this is a little brief I have just inherited the project and
      > >only starting to make some ground on the understanding. And I am a
      > >total newbie to this :(
      >
      > Have fun!
      >
      > . . . . . . . . . . Ken
      >
      > --
      > Comprehensive in-depth XSLT2/XSL-FO1.1 classes: Austin TX,Jan-2008
      > World-wide corporate, govt. & user group XML, XSL and UBL training
      > RSS feeds: publicly-available developer resources and training
      > G. Ken Holman mailto:gkholman@...
      > Crane Softwrights Ltd. http://www.CraneSoftwrights.com/f/
      > Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)
      > Male Cancer Awareness Jul'07 http://www.CraneSoftwrights.com/f/bc
      > Legal business disclaimers: http://www.CraneSoftwrights.com/legal
      >
    • Dave Pawson
      Message 2 of 4 , Oct 15, 2007
        G. Ken Holman wrote:

        > Do you have the opportunity in the process to run "tidy"? This is an
        > open tool that converts tag soup into XML, which is then suitable for XSLT:
        >
        > http://tidy.sourceforge.net/
        >
        > The only input to XSLT is well-formed XML and this would give it to
        > you. HTML of any SGML flavour (messy or not messy) is
        > unacceptable. XHTML is acceptable, but then you wouldn't be having
        > any problems because it would be well-formed.

        Perhaps even easier Ken,
        http://ccil.org/~cowan/XML/tagsoup/ links a variant of tidy
        into Saxon, so that the html can be processed in Saxon as if
        it were well formed xhtml?

        Just call up John's jar instead of saxon.jar and it all
        comes out well.

        Really nice.

        regards


        --
        Dave Pawson
        XSLT, XSL-FO and Docbook FAQ
        http://www.dpawson.co.uk
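
To illustrate what "converting tag soup into XML" means in practice, here is a rough sketch using only Python's standard `html.parser` module. It is not tidy and not TagSoup, and it is far less thorough than either: it only closes unclosed elements, drops stray end tags, self-closes a few void elements, and escapes text. The class name `SoupToXml`, the function `soup_to_xml`, the `VOID` set, and the `description` wrapper element are all assumptions made for the example.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never have content; emitted as self-closing tags.
VOID = {"br", "hr", "img"}


class SoupToXml(HTMLParser):
    """Re-emit lenient HTML as well-formed XML, tracking open elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.stack = []

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(" %s=%s" % (k, quoteattr(v or "")) for k, v in attrs)
        if tag in VOID:
            self.out.append("<%s%s/>" % (tag, attr_text))
        else:
            self.out.append("<%s%s>" % (tag, attr_text))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag with no matching start: ignore it
        # Close every element opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def soup_to_xml(fragment, wrapper="description"):
    """Return the fragment as well-formed XML inside a single wrapper element."""
    parser = SoupToXml()
    parser.feed(fragment)
    parser.close()
    # Close anything the user left open at the end of the fragment.
    while parser.stack:
        parser.out.append("</%s>" % parser.stack.pop())
    return "<%s>%s</%s>" % (wrapper, "".join(parser.out), wrapper)


print(soup_to_xml("<b>bold <i>both</b> text"))
```

Unlike a real tidying tool, this sketch does not know HTML's implied-end-tag rules, so `<li>one<li>two` comes out as nested list items rather than siblings. The result is well-formed, which is all XSLT requires as input, but it is not necessarily the structure the user intended.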