Re: Invalid byte 1 of 1-byte error when using ant
- Don, I found the problem. My fault was that I was searching in the
XML files for the bug only, but it was in my bookmap.dtd. In a
remark line I wrote a version history of the bookmap specialization.
There I wrote my name, company name and city. And the character "ü"
in the city name caused the problem. I changed it to "ue" and it
works now.
--- In email@example.com, Don Day <dond@u...> wrote:
> Chris, this sounds suspiciously like the presence of a Unicode Byte Order
> Mark (BOM) in a source file. From unicode.org, I found this
> the UTF-16 encoding scheme, U+FEFF at the very beginning of a file
> stream explicitly signals the byte order." You might do a grep or
> across your source files for a leading FEFF(hex) pair of characters. If you
> find such a file, it was probably saved as UTF-16 despite the
> in the XML PI. DITA processing assumes default UTF-8 byte
> which the leading BOM characters in a file would be noise rather
> I suspect it to be in source because of the fact that gen-list is
> touching all files referenced by the ditamap.
> Don Day
> Chair, OASIS DITA Technical Committee
> IBM Lead DITA Architect
> Email: dond@u...
> 11501 Burnet Rd. MS9033E015, Austin TX 78758
> Phone: +1 512-838-8550
> T/L: 678-8550
> "Where is the wisdom we have lost in knowledge?
> Where is the knowledge we have lost in information?"
> --T.S. Eliot
> Sent by: dita-
> Date: 11/30/2005 09:55
> Subject: [dita-users] Invalid byte 1 of 1-byte error when using ant
> Please respond
> I have started using ant to generate pdf output. It started working
> fine, selected the right bookmap.xml file, accepted the outputdir
> folder proceeded with clean-temp but then when ant came to gen-
> received the following error message:
> "Invalid byte 1 of 1-byte UTF-8 sequence"
> That confuses me as I have defined the utf-8 encoding in all xml
> files, the map and the topics by <?xml version="1.0" encoding="utf-
> So why do I receive this error message? Does anyone have an idea?
> xml files were generated by using epic editor.
> Best regards
- firstname.lastname@example.org wrote on 12/01/2005 02:36:34 AM:
> Don, I found the problem. My fault was that I was searching in the
> XML files for the bug only, but it was in my bookmap.dtd. In a
> remark line I wrote a version history of the bookmap specialization.
> There I wrote my name, company name and city. And the character "ü"
> in the city name caused the problem. I changed it to "ue" and it
> works now.
Thanks for letting us know your discovery, Chris. So perhaps we should
formalize this lesson learned as a guideline for any DITA specializers
working with DTDs.
DTDs are not themselves XML documents, and some tools are sensitive to any
characters in them outside of the usual ASCII range. The umlauted u in
Chris's case appears to have been saved as a single Windows-1252 byte
(0xFC), which is not a valid UTF-8 sequence and so was rejected by that
particular parser. Interestingly, Chris, you could have saved your umlauted
version of the DTD as a UTF-8 document and it would have worked properly.
But then you have the issue of some tools not responding properly to the
Byte Order Mark that might be set at the start of the file.
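To make the byte-level difference concrete, here is a minimal Python sketch (not part of any DITA tooling). The function name classify_bytes and the sample comment text are hypothetical; the city name in Chris's DTD was not given, so "Zürich" is only a stand-in. It shows why the same comment is rejected when saved as Windows-1252, accepted when saved as UTF-8, and how a leading BOM can be spotted.

```python
import codecs


def classify_bytes(data: bytes) -> str:
    """Return a rough label for how `data` relates to UTF-8 (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 with BOM"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # e.g. a lone 0xFC byte, which is how Windows-1252 stores "ü"
        return "not valid utf-8"
    return "ascii" if data.isascii() else "utf-8"


# Stand-in for a DTD comment containing an umlaut (assumed sample text).
sample = "<!-- city: Z\u00fcrich -->"

print(classify_bytes(sample.encode("cp1252")))        # not valid utf-8
print(classify_bytes(sample.encode("utf-8")))         # utf-8
print(classify_bytes(sample.encode("utf-8-sig")))     # utf-8 with BOM
print(classify_bytes(b"<!-- city: Zuerich -->"))      # ascii
```

The "utf-8 with BOM" case is the one that may still trip up some tools, even though the content itself is valid.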
Therefore, to ensure that your new DTD is accessible to the widest set of
available tools, be careful that your comments and declarations are using
only normally available 7-bit ASCII characters, and follow standard
conventions for phonetic or symbolic representation of umlauts, accents,
and other special characters.
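One way to check a specialization against this guideline is to scan the DTD files for any byte above 0x7F. The following is a rough Python sketch under stated assumptions: the function names find_non_ascii and scan_tree are hypothetical, and the file suffixes (.dtd, .mod, .ent) are an assumption about how the DTD modules are named in your setup.

```python
from pathlib import Path
from typing import Dict, Iterator, List, Tuple

Hit = Tuple[int, int, int]  # (line, column, byte value), both positions 1-based


def find_non_ascii(data: bytes) -> Iterator[Hit]:
    """Yield the position and value of every byte outside 7-bit ASCII."""
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col_no, byte in enumerate(line, start=1):
            if byte > 0x7F:
                yield line_no, col_no, byte


def scan_tree(root: str, suffixes=(".dtd", ".mod", ".ent")) -> Dict[str, List[Hit]]:
    """Map each matching file under `root` to its non-ASCII byte positions."""
    report: Dict[str, List[Hit]] = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            hits = list(find_non_ascii(path.read_bytes()))
            if hits:
                report[str(path)] = hits
    return report


if __name__ == "__main__":
    # Scan the current directory; adjust the path to your DTD folder.
    for name, hits in scan_tree(".").items():
        for line_no, col_no, byte in hits:
            print(f"{name}:{line_no}:{col_no}: byte 0x{byte:02X}")
```

Working on raw bytes rather than decoded text is deliberate: it reports the offending position without first having to guess the file's encoding.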
This limitation should not exist for Schema specializations, but since you
might want to provide equivalent DTDs so that the greatest number of tools
can handle your specialization, then the guidelines for DTDs still set the
And while we know it is possible to save the DTD files as UTF-8 and thereby
use non-ASCII Unicode characters in element names, remember that not all
users might have access to keyboards for direct keying of those characters
when authoring content as raw XML. This does not apply to editors that
generate markup automatically. These are fine decision points between
expressiveness and usability! One often comes at the cost of the other.
Chair, OASIS DITA Technical Committee
IBM Lead DITA Architect
11501 Burnet Rd. MS9033E015, Austin TX 78758
Phone: +1 512-838-8550
"Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?"