Re: Invalid byte 1 of 1-byte error when using ant

  • Chris Kravogel
    Message 1 of 4 , Dec 1, 2005
      Don, I found the problem. My mistake was that I searched only the
      XML files for the bug, but it was in my bookmap.dtd. In a comment
      I had written a version history of the bookmap specialization,
      including my name, company name, and city. The character "ü"
      in the city name caused the problem. I changed it to "ue" and it
      works now.

      Best regards

      Chris
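
A minimal sketch of why a single "ü" can trigger this kind of error, assuming the DTD was saved in Windows-1252 (as suggested later in the thread) and then read as UTF-8. The city name "Zürich" and the helper name `first_invalid_utf8` are hypothetical and used only for illustration; the real city is not given in the thread.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")  # strict decoding is the default
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Hypothetical DTD comment; the actual text is not shown in the thread.
comment = "<!-- Version history: bookmap specialization, Zürich -->"

as_cp1252 = comment.encode("cp1252")  # "ü" becomes the single byte 0xFC
as_utf8 = comment.encode("utf-8")     # "ü" becomes the two bytes 0xC3 0xBC

bad = first_invalid_utf8(as_cp1252)
if bad is not None:
    offset, value = bad
    print(f"invalid UTF-8 byte 0x{value:02X} at offset {offset}")
print(first_invalid_utf8(as_utf8))  # None: the UTF-8 copy decodes cleanly
```

Running a check like this over every file the build touches, DTDs included, would likely have pointed at bookmap.dtd directly.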


      --- In dita-users@yahoogroups.com, Don Day <dond@u...> wrote:
      >
      > Chris, this sounds suspiciously like the presence of a Unicode Byte Order
      > Mark (BOM) in a source file. From unicode.org, I found this statement: "In
      > the UTF-16 encoding scheme, U+FEFF at the very beginning of a file or
      > stream explicitly signals the byte order." You might do a grep or search
      > across your source files for a leading FEFF(hex) pair of characters. If you
      > find such a file, it was probably saved as UTF-16 despite the attribution
      > in the XML PI. DITA processing assumes default UTF-8 byte alignment, in
      > which the leading BOM characters in a file would be noise rather than data.
      > I suspect it to be in source because of the fact that gen-list is literally
      > touching all files referenced by the ditamap.
      >
      > Regards,
      > --
      > Don Day
      > Chair, OASIS DITA Technical Committee
      > IBM Lead DITA Architect
      > Email: dond@u...
      > 11501 Burnet Rd. MS9033E015, Austin TX 78758
      > Phone: +1 512-838-8550
      > T/L: 678-8550
      >
      > "Where is the wisdom we have lost in knowledge?
      > Where is the knowledge we have lost in information?"
      > --T.S. Eliot
      >
      >
      >

      > "Chris Kravogel" <dita@s...om>
      > Sent by: dita-users@yahoogroups.com
      > 11/30/2005 09:55 AM
      > Please respond to dita-users
      > To: dita-users@yahoogroups.com
      > Subject: [dita-users] Invalid byte 1 of 1-byte error when using ant
      >
      > Hi
      > I have started using ant to generate PDF output. It started working
      > fine, selected the right bookmap.xml file, accepted the outputdir
      > folder, and proceeded with clean-temp, but when ant came to gen-list I
      > received the following error message:
      > "Invalid byte 1 of 1-byte UTF-8 sequence"
      > That confuses me, as I have declared the utf-8 encoding in all XML
      > files, the map and the topics, with <?xml version="1.0" encoding="utf-8"?>
      >
      > So why do I receive this error message? Does anyone have an idea?
      > (The XML files were generated using Epic Editor.)
      >
      > Best regards
      >
      > Chris
    • Don Day
      Message 2 of 4 , Dec 2, 2005
        dita-users@yahoogroups.com wrote on 12/01/2005 02:36:34 AM:

        > Don, I found the problem. My fault was that I was searching in the
        > XML files for the bug only, but it was in my bookmap.dtd. In a
        > remark line I wrote a version history of the bookmap specialization.
        > There I wrote my name, companyname and city. And the character "ü"
        > in the city name caused the problem. I changed it to "ue" and it
        > works now.

        Thanks for letting us know your discovery, Chris. So perhaps we should
        formalize this lesson learned as a guideline for any DITA specializers
        working with DTDs.

        DTDs are not ordinarily parsed as XML (which is Unicode), and some tools
        are sensitive to any characters outside of the usual ASCII range. The
        umlauted u in Chris's case appears to be a Windows-1252 character, which
        is stored as a single byte that is not valid UTF-8 and so was rejected by
        that particular parser. Interestingly, Chris, you could have
        saved your umlauted version of the DTD as a UTF-8 document and it would
        have worked properly. But then you have the issue of some tools not
        responding properly to the Byte Order Mark that might be set at the start
        of the file.
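
The re-saving step could be sketched as follows, assuming the original DTD bytes really are Windows-1252. The helper names `detect_bom` and `cp1252_to_utf8` and the sample comment are hypothetical; the BOM constants come from Python's standard `codecs` module. Writing UTF-8 without a BOM is one way to sidestep tools that mishandle a leading BOM.

```python
import codecs
from typing import Optional


def detect_bom(data: bytes) -> Optional[str]:
    """Name the BOM at the start of data, or return None (UTF-32 BOMs are not checked)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


def cp1252_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed to be Windows-1252 as UTF-8 without a BOM.

    Raises UnicodeDecodeError for the few byte values cp1252 leaves undefined.
    """
    return data.decode("cp1252").encode("utf-8")


# Hypothetical DTD comment saved by a Windows-1252 editor.
original = "<!-- city: Zürich -->".encode("cp1252")
converted = cp1252_to_utf8(original)

print(detect_bom(converted))                  # None: no BOM was written
print(detect_bom(codecs.BOM_UTF8 + converted))  # utf-8
print(converted.decode("utf-8"))
```

The same `detect_bom` check can be run over the XML source files to look for the leading FEFF mark mentioned earlier in the thread.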

        Therefore, to ensure that your new DTD is accessible to the widest set of
        available tools, be careful that your comments and declarations are using
        only normally available 7-bit ASCII characters, and follow standard
        conventions for phonetic or symbolic representation of umlauts, accents,
        and other special characters.
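
One way to check a DTD against this guideline is to scan it for bytes outside the 7-bit ASCII range before distributing it. The sketch below is an assumption about how such a check might look and is not part of the DITA toolkit; `non_ascii_lines`, `to_ascii`, the fallback table, and the sample DTD text are all hypothetical.

```python
from typing import List, Tuple


def non_ascii_lines(data: bytes) -> List[Tuple[int, List[int]]]:
    """Return (1-based line number, offending byte values) for each line with bytes > 0x7F."""
    findings = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        offending = [b for b in line if b > 0x7F]
        if offending:
            findings.append((lineno, offending))
    return findings


# Common German fallbacks, in the spirit of the "ü" -> "ue" change made in this thread.
ASCII_FALLBACKS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss",
}


def to_ascii(text: str) -> str:
    """Replace the characters listed in ASCII_FALLBACKS; other characters are left unchanged."""
    return "".join(ASCII_FALLBACKS.get(ch, ch) for ch in text)


# Hypothetical DTD content with an umlaut in a comment.
sample = (
    "<!-- bookmap.dtd -->\n"
    "<!-- Author city: Zürich -->\n"
    "<!ELEMENT bookmap (title?)>\n"
)

for lineno, values in non_ascii_lines(sample.encode("cp1252")):
    print(lineno, [hex(v) for v in values])
print(to_ascii(sample).isascii())
```

The scan works on raw bytes on purpose, so it flags the problem regardless of which encoding the editor used when saving the file.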

        This limitation should not exist for Schema specializations. But since you
        might want to provide equivalent DTDs so that the greatest number of tools
        can handle your specialization, the guidelines for DTDs still set the
        ground rules.

        And while we know it is possible to save the DTD files as UTF-8 and thereby
        use any Unicode character in element names, remember that not all users
        might have access to keyboards for direct keying of those characters when
        authoring content as raw XML. This does not apply for editors that
        generate markup automatically. These are fine decision points between
        expressiveness and usability! One is often at the cost of the other.


        Regards,
        --
        Don Day
        Chair, OASIS DITA Technical Committee
        IBM Lead DITA Architect
        Email: dond@...
        11501 Burnet Rd. MS9033E015, Austin TX 78758
        Phone: +1 512-838-8550
        T/L: 678-8550

        "Where is the wisdom we have lost in knowledge?
        Where is the knowledge we have lost in information?"
        --T.S. Eliot