Re: fencs trial is terminated unexpectedly.

  • Tony Mechelynck
    Message 1 of 4 , Apr 23, 2013
      On 23/04/13 17:19, Taro MURAOKA wrote:
      > Hi list.
      >
      >
      > When 'enc' is "utf-8" and 'fencs' includes "ucs-2",
      > and a file that is not encoded in "ucs-2" is opened,
      > the fencs trial terminates at "ucs-2" unexpectedly.
      >
      > For example:
      >
      > :set enc=utf-8
      > :set fencs=ucs-2
      > :e abc.txt
      >
      > It fails when opening the attached "abc.txt".
      >
      >
      > I wrote an attached patch to fix this.
      > Please check it.
      >
      >
      > Best.
      >

      - Especially when 'encoding' is utf-8, it is recommended to start
      'fileencodings' with ucs-bom.
      - It is always recommended to end 'fileencodings' with some 8-bit
      encoding, which will serve as the default.
      - It is useless to put more than one 8-bit encoding in 'fileencodings':
      an 8-bit encoding is always accepted, so nothing after the first one
      will ever be tried.
      - ucs-2 is obsolete; utf-16 should be used instead. (UTF-16 can
      represent codepoints up to U+10FFFF, using surrogate pairs for anything
      above U+FFFF. UCS-2 cannot go beyond U+FFFF, and surrogates are
      invalid when using it.)
      - For ucs-something and utf-something other than utf-8 (and utf-7 which
      is also obsolete), big-endian is assumed unless you explicitly specify
      little-endian, even when running on a little-endian machine. So, for
      Vim, utf-16 is the same as utf-16be, not utf-16le, even on Intel x86
      processors.
      - It is very hard to detect utf-16 (and the obsolete ucs-2) correctly
      unless there is a BOM (in which case ucs-bom will handle it).
      - In recent versions of Vim (including all patchlevels of 7.3),
      ++enc=something completely bypasses the 'fileencodings' heuristics,
      forcing the encoding you specified. You may get � or hollow-box
      wildcards if the file contents are invalid for that encoding.
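
To make the surrogate-pair and byte-order points above concrete, here is a small sketch in plain Python. It is only an illustration of the encodings themselves, not of Vim's conversion code, and the variable names are made up for the example.

```python
# Illustration only: plain Python codecs, not Vim's conversion code.

# U+1F600 lies above U+FFFF, so UTF-16 needs a surrogate pair for it
# (two 16-bit units, four bytes). UCS-2 has no way to represent it.
ch = "\U0001F600"
be_hex = ch.encode("utf-16-be").hex()   # high surrogate first, big-endian
le_hex = ch.encode("utf-16-le").hex()   # same units, bytes swapped
print(be_hex, le_hex)

# Without a BOM, the same two bytes are valid in either byte order,
# which is why guessing utf-16 from content alone is unreliable.
raw = b"\x00A"
as_be = raw.decode("utf-16-be")         # U+0041
as_le = raw.decode("utf-16-le")         # U+4100, a valid but different character
print(hex(ord(as_be)), hex(ord(as_le)))
```

Both decodes succeed without error, so nothing in the bytes themselves says which byte order was intended.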

      For "Western" locales, I recommend

      :set fencs=ucs-bom,utf-8,latin1
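
As a rough model of why that order works, the following Python sketch tries candidates in sequence and keeps the first one that decodes cleanly. The function `guess_encoding` and the `BOMS` table are hypothetical names for this illustration; Vim's actual 'fileencodings' logic is implemented differently and this is not a description of it.

```python
import codecs

# Hypothetical BOM table for this sketch. UTF-32 LE is listed before
# UTF-16 LE because its BOM starts with the same two bytes.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)

def guess_encoding(data, candidates=("ucs-bom", "utf-8", "latin1")):
    """Return the first candidate that accepts data, or None (sketch only)."""
    for name in candidates:
        if name == "ucs-bom":
            # Accept only if the data starts with a known BOM.
            for bom, bom_name in BOMS:
                if data.startswith(bom):
                    return bom_name
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue        # conversion error: move on to the next candidate
        return name
    return None

print(guess_encoding("é".encode("utf-8")))
print(guess_encoding("é".encode("latin1")))
print(guess_encoding(codecs.BOM_UTF16_LE + "a".encode("utf-16-le")))
# latin1 accepts any byte sequence, so a candidate listed after it is never reached:
print(guess_encoding("é".encode("utf-8"), ("latin1", "utf-8")))
```

The last call shows why the 8-bit encoding belongs at the end of the list: every byte is valid latin1, so it always "wins" wherever it is placed.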

      For East-Asian locales there is a script somewhere that improves on the
      'fileencodings' heuristic (trying to discriminate as best as possible
      between the common encodings used for the various CJK languages) but I
      don't know the details.


      Best regards,
      Tony.
      --
      This message contains 78% recycled characters.

      --
      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php
