8995 Re: [NTO] Text file differences - what's going on?

  • Alec Burgess
    Dec 2, 2010
      On 2010-12-02 12:36, Alex Plantema wrote:
      > On Thursday, 2 December 2010 at 13:01, Alec Burgess wrote:
      >
      > > Can anybody explain and/or point me to a tool I can use to change the
      > > "bad" files to "good" so I can get them indexed by X1?
      >
      > EF BB BF is the UTF-8 representation of U+FEFF, the byte order mark.
      > With Notepad, you can create a file containing only these three bytes
      > by saving an empty file with UTF-8 encoding,
      > and use it to add them to a file, e.g.:
      >
      > copy empty.txt+art_bad.html art_good2.html
      Thanks Alex for that explanation and workaround. I'll use it.
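
For anyone who would rather script the workaround than keep an `empty.txt` around, here is a minimal Python sketch of the same idea: write the three bytes EF BB BF in front of the file's existing content. The function name `add_utf8_bom` is made up for illustration; the file names in the usage note are the ones from the `copy` example. Reading and writing in binary mode sidesteps any text-mode handling of the file contents.

```python
import codecs
from pathlib import Path


def add_utf8_bom(src, dst):
    """Copy src to dst, prepending the UTF-8 BOM (EF BB BF) if it is missing.

    Hypothetical helper, not part of any tool mentioned in this thread.
    """
    data = Path(src).read_bytes()          # raw bytes, no decoding
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data      # codecs.BOM_UTF8 == b"\xef\xbb\xbf"
    Path(dst).write_bytes(data)
```

Usage would be along the lines of `add_utf8_bom("art_bad.html", "art_good2.html")`. A file that already starts with the mark is copied unchanged, so running it twice does not stack two marks.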

      After googling [EF BB BF] and reading
      http://en.wikipedia.org/wiki/Byte_order_mark it appears that the
      presence of the byte order mark is supposed to be optional in UTF-8
      files (and deprecated).
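
To illustrate what "optional" means for a program that reads UTF-8, here is a small Python sketch of a reader that accepts a file with or without the mark. The helper names `has_utf8_bom` and `decode_utf8` are hypothetical; the `utf-8-sig` codec is part of Python's standard library and skips a leading EF BB BF when one is present. This only shows one tolerant approach; it says nothing about how PsPad or X1 actually read files.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with EF BB BF (hypothetical helper)."""
    return data.startswith(codecs.BOM_UTF8)


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 bytes whether or not they begin with a BOM.

    "utf-8-sig" drops one leading BOM if present and otherwise
    behaves like the plain "utf-8" codec.
    """
    return data.decode("utf-8-sig")
```

With this, `decode_utf8(b"\xef\xbb\xbfabc")` and `decode_utf8(b"abc")` both give the same text, so the "good" and "bad" files would look identical to such a reader.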

      Am I reading that correctly? If so, should the behavior of PSPad (says
      it can't read in hex format) and X1 (won't index) be considered "bugs"?

      --
      Regards ... Alec (buralex@gmail& WinLiveMess - alec.m.burgess@skype)