
Re: [XSL-FO] How to use properly?

  • C. Myers
    Message 1 of 7 , Jan 20, 2005
      Eliot,

      Thank you so much for your prompt reply and the valuable
      information you have provided. We are using RenderX, and its
      intermediate file is called xep and is basically text. I will
      share your message with my colleague and decide what to do next.

      Thanks again.


      --- "W. Eliot Kimber" <ekimber@...>

      > C. Myers wrote:
      > > Hi Eliot,
      > >
      > > Could you provide me a little more information/clue(s)
      > > how to implement the two-pass approach? Thanks.
      >
      > The exact mechanism will be entirely dependent on the FO engine
      > you're using and what, if anything, they do to help in this case.
      > Alternatively you can use the "put the data in PDF and extract it"
      > approach, which is generic but can be a bit more trouble to
      > implement (but not that hard).
      >
      > For any solution, the basic approach is:
      >
      > 1. Figure out what layout-related information you need in order to
      > get the effect you want. In your case you need to know what page
      > each footnote reference falls on. That is, for each element that
      > makes a footnote reference, you need to know the page number it
      > falls on.
      >
      > Thus you need to create an association between the original input
      > XML element and the page number of the page it eventually falls on.
      > This is easiest if the original element has an ID or some other
      > easily-referenceable identifier, but that's not a hard requirement.
      > For example, if you are using Saxon, the generated IDs will be
      > consistent for the same input document because the IDs directly
      > reflect the document and tree organization of the elements [NOTE:
      > XSLT doesn't require this and you should not depend on it as a
      > general solution. The Saxon implementation could change at any time
      > (and it may not even be true in Saxon 8, I don't know).]
      >
      > 2. Generate the layout-related information. In the absence of a
      > more direct extension, there are essentially two available
      > approaches:
      >
      > A. Use Ken Holman's technique of creating leading or trailing pages
      > in your PDF that contain the data you want in some convenient text
      > format (e.g., as XML data or comma-delimited strings or something).
      > You can then use any number of PDF page and text-extraction tools
      > to get the text out of the pages. Note that it doesn't matter what
      > the font size is, so you can make the text very small if you want.
      > See www.cranesoftwrights.com for details on Ken's technique. This
      > should work for any FO implementation.
      >
      > B. If your FO engine produces one, use the (proprietary) area tree
      > serialization produced by your FO implementation. Both XEP and XSL
      > Formatter provide the ability to dump the paginated area tree to an
      > XML file (FOP might as well, I don't know). These trees are
      > non-standard (there is no standard for area tree representation,
      > nor should there be) but pretty obvious in their structure given an
      > understanding of the FO specification. You can use an XSLT
      > transform to process this tree in order to figure out which
      > elements occur on which pages. The main downside with this approach
      > is that these area trees can be quite large, easily 10 times as big
      > as the original XML document, which can make the total process time
      > slow. This is one reason I would prefer the ability to generate
      > only that information I actually need for a given process.
      >
      > 3. In your second pass, use the information gathered in step 2 to
      > reprocess the original input XML document. In this pass you will
      > now know which pages your footnote references fall on and can
      > therefore do things like only generate one reference per page or
      > reset the callouts per page.
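
[Illustration of steps 2B and 3: a minimal Python sketch, not anyone's
actual implementation. The XML layout used here (area-tree, page, src-id)
is a hypothetical simplification invented for the example. The real XEP
and XSL Formatter area-tree formats use different element names, and
getting a source ID into the area tree at all depends on your engine.
The sketch builds the ID-to-page map, then renumbers footnote callouts so
they restart on each page.]

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, simplified stand-in for an area-tree dump. Real engines
# use their own (proprietary) element and attribute names.
AREA_TREE = """
<area-tree>
  <page number="1"><block src-id="fn-a"/><block src-id="fn-b"/></page>
  <page number="2"><block src-id="fn-c"/></page>
</area-tree>
"""


def page_map(area_tree_xml):
    """Map each source element ID to the page number it landed on."""
    root = ET.fromstring(area_tree_xml)
    pages = {}
    for page in root.iter("page"):
        number = int(page.get("number"))
        for area in page.iter():
            src_id = area.get("src-id")
            if src_id is not None:
                # Keep the first page on which the ID is seen.
                pages.setdefault(src_id, number)
    return pages


def per_page_callouts(ref_ids, pages):
    """Number footnote references in document order, restarting per page."""
    counters = defaultdict(int)
    callouts = {}
    for ref_id in ref_ids:
        page = pages[ref_id]
        counters[page] += 1
        callouts[ref_id] = counters[page]
    return callouts


# ref_ids would come from the original input XML, in document order.
pages = page_map(AREA_TREE)
callouts = per_page_callouts(["fn-a", "fn-b", "fn-c"], pages)
```

[In a real pipeline the callouts map would be written out as a small
lookup file and read by the second-pass XSLT, so only the information
actually needed is carried forward rather than the whole area tree.]
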
      >
      > Unfortunately, in this particular example, because you will likely
      > be changing which footnotes actually occur on which pages, you will
      > likely change the pagination. This will require at least one more
      > pass to settle out the footnote placement, and may require a 4th
      > pass to ensure that there is no change from pass 3 to pass 4.
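
[Illustration of the "repeat until nothing changes" idea: a minimal
Python sketch under stated assumptions. run_pass is a hypothetical
callable standing in for "run the XSLT, run the FO engine, extract the
page map"; fake_pass is a toy replacement so the loop can be run without
a formatter. Neither name comes from any real tool.]

```python
def settle(run_pass, max_passes=6):
    """Repeat formatting passes until the page map stops changing.

    run_pass takes the previous page map (None on the first pass) and
    returns the page map produced by the new pass.
    """
    previous = None
    for pass_number in range(1, max_passes + 1):
        current = run_pass(previous)
        if current == previous:
            # This pass confirmed the previous one: placement is stable.
            return current, pass_number
        previous = current
    raise RuntimeError("pagination did not settle within max_passes")


def fake_pass(previous):
    """Toy formatter: using layout data pulls fn-b back onto page 1."""
    if previous is None:
        return {"fn-a": 1, "fn-b": 2}  # first pass, no layout data yet
    return {"fn-a": 1, "fn-b": 1}      # later passes use the page map


final_map, passes = settle(fake_pass)
```

[The cap on passes is a design choice for the sketch: footnote placement
can in principle oscillate between two layouts, so an unbounded loop is
risky.]
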
      > Cheers,
      > Eliot
      > --
      > W. Eliot Kimber
      > Professional Services
      > Innodata Isogen
      > 9390 Research Blvd, #410
      > Austin, TX 78759
      > (512) 372-8122
      > ekimber@...
      > www.innodata-isogen.com
