XSL-FO and foreign language.

  • LuzErez
    Message 1 of 3 , Jan 12, 2005
      Hi all
      I have a general question regarding XSL-FO. The concept that the formatting
      direction is set in the tags (unicode-bidi="embed") is not clear to me. I am
      building a system that exports XSL-FO, or that uses an XML stream as a data
      source. How do I know whether my user typed Hebrew or Arabic into one of the
      application's text boxes? What if my external XML has Arabic in it? Should I
      use unicode-bidi in every text element?


      Best Regards

      Luz Erez
      Email: Luz@...
      Web: www.ErezSoft.com
      Phone +972-545661331



    • W. Eliot Kimber
      Message 2 of 3 , Jan 12, 2005
        LuzErez wrote:

        > Hi all
        > I have a general question regarding XSL-FO. The concept that the formatting
        > direction is in the tags: unicode-bidi="embed" is not clear to me. I build
        > a system that knows to export XSL-FO or I am using XML stream as a data
        > source. How do I know if my user keyed in one of the application text box
        > Hebrew or Arabic? Maybe my external XML has Arabic in it ? Should I use the
        > unicode-bidi in every text element?

        Definitely you should not use unicode-bidi in every text element.

        The first key is that Unicode has built-in directionality information
        such that a Unicode-aware, directionality-aware processor should be able
        to detect the inherent directionality of any sequence of characters.
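
        As a rough illustration of that built-in information, here is a
        minimal Python sketch that reads each character's bidi class from the
        Unicode database. The function names contains_rtl and base_direction
        are made up for this example, and the "first strong character" rule
        is only a simple heuristic, not the full bidi algorithm.

```python
import unicodedata

# Bidi classes Unicode assigns to strongly right-to-left characters:
# "R" (for example Hebrew letters) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text):
    """Return True if any character in text is strongly right-to-left."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def base_direction(text):
    """Guess 'rtl' or 'ltr' from the first strong character (default 'ltr')."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            return "rtl"
        if cls == "L":
            return "ltr"
    return "ltr"


if __name__ == "__main__":
    hebrew = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"
    print(contains_rtl("Hello"), contains_rtl("Hello " + hebrew))
    print(base_direction(hebrew + " world"))
```

        A check like this could tell an application whether a text box or an
        incoming XML field holds Hebrew or Arabic, without any markup in the
        data itself.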

        The second key is that Unicode defines a sophisticated "bidirectionality
        algorithm" by which applications can determine how to correctly render
        text that mixes left-to-right and right-to-left characters. Most of the
        time this algorithm produces correct results, assuming it's correctly
        implemented (which is a big if).

        With respect to getting the correct layout in FO, there are three
        possible cases with respect to directionality:

        1. A document consists of content that is all of one directionality
        (e.g., all Latin script, all Arabic script, all ideographic characters).

        2. A document consists of mostly left-to-right script with some
        right-to-left script, such as an English document with quoted Arabic words.

        3. A document consists of mostly right-to-left script with some
        left-to-right script, such as a Hebrew document with some English words.

        In the first case all that should be required is to set the appropriate
        writing-mode where necessary in order to get the correct layout. The
        Unicode directionality information will be sufficient for the processor
        to compose the character content correctly.
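
        A sketch of how a generator might choose writing-mode, assuming the
        text is wrapped in an fo:block-container (one of the objects that
        accepts writing-mode). The helper names writing_mode_for and
        block_container are hypothetical, and picking the mode from the
        first strong character is an assumption; if you already know the
        document language, set the mode from that instead.

```python
import unicodedata
from xml.sax.saxutils import escape


def writing_mode_for(text):
    """Pick an XSL-FO writing-mode value from the first strong character."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):
            return "rl-tb"  # right-to-left, top-to-bottom
        if cls == "L":
            return "lr-tb"  # left-to-right, top-to-bottom
    return "lr-tb"


def block_container(text):
    """Return an FO fragment; the fo: prefix must be declared by the caller."""
    mode = writing_mode_for(text)
    return (
        f'<fo:block-container writing-mode="{mode}">'
        f"<fo:block>{escape(text)}</fo:block>"
        "</fo:block-container>"
    )


if __name__ == "__main__":
    print(block_container("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew "shalom"
    print(block_container("Hello"))
```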

        In the second and third cases, most of the time you need to do nothing
        special, because again the Unicode directionality and the Unicode
        bidirectional algorithm will produce the correct result (at least in
        those FO processors that implement the Unicode bidi algorithm).

        The only place where bidi-override should be necessary is where you need
        to get a result that is different from the result that would be produced
        by the application of the Unicode bidi algorithm or different from what
        your renderer produces (for whatever reason).

        The biggest problems I've run into are when right-to-left text is mixed
        with both Arabic digits and punctuation or bracketing characters--in
        these cases the default result is almost never right, either because of
        the way the bidi algorithm works or because of bugs in a particular
        renderer.
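
        To see why digits and brackets are awkward, it can help to list the
        bidi class of each character: letters are "strong", but digits are
        "weak" and brackets are "neutral", so their placement depends on
        their neighbours. This is a small Python sketch; bidi_classes is a
        name invented for the example.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidi class) pairs for every character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


if __name__ == "__main__":
    # Hebrew alef, a space, then "(12)".
    sample = "\u05d0 (12)"
    for ch, cls in bidi_classes(sample):
        print(f"U+{ord(ch):04X} {cls}")
```

        In this sample only the Hebrew letter has a strong class (R); the
        digits are EN (European number) and the brackets are ON (other
        neutral).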

        Also, the fo:bidi-override element is functionally equivalent to the
        Unicode directionality control characters, \u202a through \u202e, and in
        fact some, if not all, implementations simply translate fo:bidi-override
        into the equivalent control characters. Therefore it may be easier to
        just use these control characters where you need them, especially if the
        directionality control needs to span FO boundaries.
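
        For reference, the control characters in question are LRE (U+202A),
        RLE (U+202B), PDF (U+202C), LRO (U+202D) and RLO (U+202E). A minimal
        Python sketch of wrapping text in them follows; the helper names
        embed and override are hypothetical and simply mirror the
        unicode-bidi values "embed" and "bidi-override".

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (closes an embedding or override)
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE


def embed(text, direction):
    """Wrap text like unicode-bidi="embed" with the given direction."""
    opener = {"ltr": LRE, "rtl": RLE}[direction]
    return f"{opener}{text}{PDF}"


def override(text, direction):
    """Wrap text like unicode-bidi="bidi-override" with the given direction."""
    opener = {"ltr": LRO, "rtl": RLO}[direction]
    return f"{opener}{text}{PDF}"


if __name__ == "__main__":
    # Show the code points so the invisible controls are visible.
    wrapped = embed("(12)", "rtl")
    print([f"U+{ord(ch):04X}" for ch in wrapped])
```

        Every opener needs its matching PDF, so it is safer to wrap complete
        runs of text than to emit the controls one at a time.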

        The best thing to do is create some test FO instances and see what you
        get--that will tell you whether you need to do more work to get the
        correct result in the output.
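
        One way to produce such test instances is to generate them. The
        Python sketch below writes a minimal FO document with one block per
        test string. The function name build_test_fo, the sample strings and
        the A4 page size are all assumptions for illustration; run the
        result through whichever FO processors you need to support.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

FO_NS = "http://www.w3.org/1999/XSL/Format"

HEBREW = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"

# One string per directionality case worth checking.
TEST_CASES = [
    "Plain English text.",
    HEBREW,
    "English with a quoted word: " + HEBREW + ".",
    HEBREW + " with English (and brackets) 123.",
]


def build_test_fo(cases):
    """Return a minimal FO document with one fo:block per test string."""
    blocks = "\n".join(
        f'      <fo:block space-after="6pt">{escape(c)}</fo:block>' for c in cases
    )
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="{FO_NS}">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-height="297mm" page-width="210mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
{blocks}
    </fo:flow>
  </fo:page-sequence>
</fo:root>
"""


if __name__ == "__main__":
    doc = build_test_fo(TEST_CASES)
    ET.fromstring(doc.encode("utf-8"))  # raises ParseError if not well-formed
    with open("bidi-test.fo", "w", encoding="utf-8") as out:
        out.write(doc)
```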

        Cheers,

        Eliot
        --
        W. Eliot Kimber
        Professional Services
        Innodata Isogen
        9390 Research Blvd, #410
        Austin, TX 78759
        (512) 372-8122

        ekimber@...
        www.innodata-isogen.com
      • luzerez
        Message 3 of 3 , Jan 12, 2005
          Thank you for the detailed answer. From my tests (with several
          renderers), without specifying the correct tags the mixed
          Hebrew-English text comes out wrong ;-( . Maybe in the future this
          will no longer be needed.

