
Re: [XSL-FO] How to use properly?

  • C. Myers
    Message 1 of 7 , Jan 20, 2005
      Eliot,
      Thank you so much for your prompt reply and the valuable
      information you have provided. We are using RenderX,
      and its intermediate file is called xep and is basically
      text. I will share your message with my colleague and
      decide what to do next.

      Thanks again.

      Sincerely,
      Ching

      --- "W. Eliot Kimber" <ekimber@...>
      wrote:

      > C. Myers wrote:
      >
      > > Hi Eliot,
      > >
      > > Could you provide me a little more
      > information/clue(s)
      > > how to implement the two-pass approach? Thanks.
      >
      > The exact mechanism will be entirely dependent on
      > the FO engine you're
      > using and what, if anything, they do to help in this
      > case. Alternatively
      > you can use the "put the data in PDF and extract it"
      > approach, which is
      > generic but can be a bit more trouble to implement
      > (but not that hard).
      >
      > For any solution, the basic approach is:
      >
      > 1. Figure out what layout-related information you
      > need in order to get
      > the effect you want. In your case you need to know
      > what page each
      > footnote reference falls on. That is, for each
      > element that makes a
      > footnote reference, you need to know the page number
      > it falls on.
      >
      > Thus you need to create an association between the
      > original input XML
      > element and the page number of the page it
      > eventually falls on. This is
      > easiest if the original element has an ID or some
      > other
      > easily-referenceable identifier, but that's not a
      > hard requirement. For
      > example, if you are using Saxon, the generated IDs
      > will be consistent
      > for the same input document because the IDs directly
      > reflect the
      > document and tree organization of the elements
      > [NOTE: XSLT doesn't
      > require this and you should not depend on it as a
      > general solution. The
      > Saxon implementation could change at any time (and
      > it may not even be
      > true in Saxon 8, I don't know).]
      >
      > 2. Generate the layout-related information. In the
      > absence of a more
      > direct extension, there are essentially two
      > available approaches:
      >
      > A. Use Ken Holman's technique of creating leading
      > or trailing pages in
      > your PDF that contain the data you want in some
      > convenient text format
      > (e.g., as XML data or comma-delimited strings or
      > something). You can
      > then use any number of PDF page and text-extraction
      > tools to get the
      > text out of the pages. Note that it doesn't matter
      > what the font size
      > is, so you can make the text very small if you want.
      > See
      > www.cranesoftwrights.com for details on Ken's
      > technique. This should
      > work for any FO implementation.
      >
      > B. If your FO engine produces one, use the
      > (proprietary) area tree
      > serialization produced by your FO implementation.
      > Both XEP and XSL
      > Formatter provide the ability to dump the paginated
      > area tree to an XML
      > file (FOP might as well, I don't know). These trees
      > are non-standard
      > (there is no standard for area tree representation,
      > nor should there be)
      > but pretty obvious in their structure given an
      > understanding of the FO
      > specification. You can use an XSLT transform to process
      > this tree in order
      > to figure out which elements occur on which pages.
      > The main downside
      > with this approach is that these area trees can be
      > quite large, easily
      > 10 times as big as the original XML document, which
      > can make the total
      > process time slow. This is one reason I would prefer
      > the ability to
      > generate only that information I actually need for a
      > given process.
      >
      > 3. In your second pass, use the information gathered
      > in step 2 to
      > reprocess the original input XML document. In this
      > pass you will now
      > know which pages your footnote references fall on
      > and can therefore do
      > things like only generate one reference per page or
      > reset the callouts
      > per page.
      >
      > Unfortunately, in this particular example, because
      > you will likely be
      > changing which footnotes actually occur on which
      > pages, you will likely
      > change the pagination. This will require at least
      > one more pass to
      > settle out the footnote placement, and may require a
      > 4th pass to ensure
      > that there is no change from pass 3 to pass 4.
      >
      > Cheers,
      >
      > Eliot
      > --
      > W. Eliot Kimber
      > Professional Services
      > Innodata Isogen
      > 9390 Research Blvd, #410
      > Austin, TX 78759
      > (512) 372-8122
      >
      > ekimber@...
      > www.innodata-isogen.com
      >
      >

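For step 3 in the quoted message, here is a minimal Python sketch of turning extracted layout data into per-page footnote callouts. It assumes the first pass left behind plain "id,page" lines (for example, text pulled from a trailing PDF page as in approach A); that line format, the sample IDs, and the function names are all hypothetical, not something any FO engine produces by itself.

```python
import csv
import io
from collections import defaultdict


def parse_page_data(text):
    """Parse 'ref_id,page' lines into (ref_id, page) tuples, in document order.

    The 'ref_id,page' layout is an assumed format for this sketch.
    """
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        rows.append((row[0].strip(), int(row[1])))
    return rows


def callouts_per_page(rows):
    """Assign callout numbers that restart at 1 on every page."""
    counters = defaultdict(int)  # page number -> callouts used so far
    callouts = {}
    for ref_id, page in rows:
        counters[page] += 1
        callouts[ref_id] = counters[page]
    return callouts


# Hypothetical sample: two references on page 1, one on page 2.
sample = "fn-a,1\nfn-b,1\nfn-c,2\n"
print(callouts_per_page(parse_page_data(sample)))
# → {'fn-a': 1, 'fn-b': 2, 'fn-c': 1}
```

The resulting mapping could then be fed to the second-pass transform, for instance as a small lookup file keyed by element ID.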

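The closing point of the quoted message, that changing the footnotes can change the pagination and so needs extra passes, can be sketched as a loop that reruns the formatter until the ID-to-page mapping stops changing. This is only an illustration: `format_pass` is a hypothetical callable standing in for "transform, format, and extract page numbers", and the pass limit is an arbitrary assumption.

```python
def settle_pagination(format_pass, max_passes=5):
    """Repeat formatting until the id-to-page mapping is stable.

    format_pass is a hypothetical callable: it receives the page map from
    the previous pass (None on the first pass) and returns the new page
    map as a dict {ref_id: page}.
    Returns (final_page_map, number_of_passes_run).
    """
    previous = None
    for pass_number in range(1, max_passes + 1):
        current = format_pass(previous)
        if current == previous:
            # No change since the last pass: the layout has settled.
            return current, pass_number
        previous = current
    raise RuntimeError(f"pagination did not settle in {max_passes} passes")


# Stand-in formatter for demonstration only: the second pass moves "b"
# to page 2, after which nothing changes.
def fake_format_pass(previous):
    if previous is None:
        return {"a": 1, "b": 1}
    return {"a": 1, "b": 2}


page_map, passes = settle_pagination(fake_format_pass)
print(page_map, passes)
```

A cap on the number of passes guards against layouts that oscillate between two paginations and never settle.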