Re: opening a Unicode file

  • mbbill
    Message 1 of 6 , Sep 14, 2007

      >John (Eljay) Love-Jensen wrote:
      >> Hi Tony,
      >>
      >>> :e ++enc=utf-16 filename
      >>
      >> Thanks Tony! I've been wondering how to do that!
      >>
      >> Note: if the utf-16 file contains a BOM (which, often, it should/will), then it should not be necessary to specify utf-16le or utf-16be explicitly (and, indeed, would be incorrect according to Unicode standards to do so -- Vim probably does the friendly thing anyway).
      >
      >If any Unicode file (here I mean UTF-8, UTF-16le, UTF-16be, UTF-32le or
      >UTF-32be -- I'll leave out GB18030 for the moment) starts with a BOM, Vim will
      >recognise it _provided_ that your 'fileencodings' (plural) starts with
      >"ucs-bom". In order for it to work properly, though, 'encoding' should already
      >be UTF-8 (or UTF-16 or UTF-32, which Vim handles internally as UTF-8 to avoid
      >problems with null bytes terminating C strings).
      >
      >Specifying explicitly that a file is, for instance, UTF-16le is IMHO not
      >"wrong" (unless the file is actually in some other encoding, of course); it is
      >just "unnecessary" if the file starts with a BOM.
      >
      >>
      >> I say this not for Tony's edification, because I'm sure that he already knows this, but for everyone else who may be in msorens's situation.
      >
      >:-)
      >
      >>
      >> Also if you need to make sure the file is written with BOM you can use:
      >>
      >> :set bomb
      >>
      >> Or without the BOM:
      >>
      >> :set nobomb
      >
      >...and if you want to make sure that "newly created" Unicode files will (or
      >won't) have a BOM by default you can write
      >
      > setglobal bomb
      >or
      > setglobal nobomb
      >
      >in your vimrc. (I use ":setglobal bomb" but YMMV.) This setting has no
      >influence on non-Unicode files such as those in Latin1.
      >
      >>
      >> For some light reading on Unicode 5.0:
      >>
      >> http://www.amazon.com/dp/0321480910/
      >
      >For serious reading, see also http://www.unicode.org/ -- and others.
      >
      >>
      >> HTH,
      >> --Eljay
      >
      >Best regards,
      >Tony.
      >--
      >99 blocks of crud on the disk,
      >99 blocks of crud!
      >You patch a bug, and dump it again:
      >100 blocks of crud on the disk!
      >
      >100 blocks of crud on the disk,
      >100 blocks of crud!
      >You patch a bug, and dump it again:
      >101 blocks of crud on the disk! ...
      >
      >>
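To make the BOM discussion above concrete, here is a minimal Python sketch of what "recognising a file by its BOM" amounts to. It is not Vim's implementation; the names `sniff_bom` and `decode_with_bom` and the `fallback` parameter are hypothetical, with `fallback` standing in for whatever comes after "ucs-bom" in a 'fileencodings'-style list.

```python
import codecs

# BOM signatures, 4-byte marks first: the UTF-32LE BOM (FF FE 00 00) begins
# with the UTF-16LE BOM (FF FE), so UTF-32 must be tested before UTF-16.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes, fallback: str = "latin-1"):
    """Return (encoding, bom_length) guessed from a leading BOM.

    `fallback` is an assumption of this sketch: it is returned, with a
    BOM length of 0, when the data does not start with any known BOM.
    """
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return fallback, 0


def decode_with_bom(data: bytes, fallback: str = "latin-1") -> str:
    """Decode `data` using the sniffed encoding, dropping the BOM itself."""
    encoding, skip = sniff_bom(data, fallback)
    return data[skip:].decode(encoding)
```

One caveat of this simple approach: a UTF-16le file whose first character after the BOM is U+0000 is indistinguishable from UTF-32le by its first four bytes, so naming the encoding explicitly remains useful in odd cases.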

      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Camillo Särs
      Message 2 of 6 , Sep 15, 2007
        Tony Mechelynck wrote:
        > ...and if you want to make sure that "newly created" Unicode files will (or
        > won't) have a BOM by default you can write
        >
        > setglobal bomb
        > or
        > setglobal nobomb
        >
        > in your vimrc. (I use ":setglobal bomb" but YMMV.) This setting has no
        > influence on non-Unicode files such as those in Latin1.

        Beware, though, that if your environment defaults to utf-8 file
        encoding, then setting "bomb" will cause the BOM to be written to all
        new files. This can become a problem when dealing with some legacy
        applications that don't expect to see those extra bytes at the
        beginning. Examples range from *nix shells and hashbang (#!) processing
        to Windows .ini file headings [...].

        So this setting may indeed cause some legacy apps to "bomb" on you.
        Pardon the pun, but I thought it was hilarious once I got over the "duh"
        factor after debugging.
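
The hashbang problem described above can be reproduced outside Vim. The following Python sketch is only an illustration under stated assumptions: `has_shebang` is a hypothetical, simplified stand-in for the "first two bytes are #!" check, and Python's "utf-8-sig" codec plays the role of writing a UTF-8 file with 'bomb' set.

```python
import codecs

script = "#!/bin/sh\necho hello\n"

# Plain UTF-8: comparable to writing the file with 'nobomb'.
without_bom = script.encode("utf-8")
# "utf-8-sig" prepends the 3-byte UTF-8 BOM: comparable to writing with 'bomb'.
with_bom = script.encode("utf-8-sig")


def has_shebang(data: bytes) -> bool:
    """Simplified check: the file must begin with the two bytes '#!'."""
    return data.startswith(b"#!")


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


print(has_shebang(without_bom))               # file starts with '#!'
print(has_shebang(with_bom))                  # BOM bytes come first
print(strip_utf8_bom(with_bom) == without_bom)
```

With the BOM in place the file starts with EF BB BF rather than "#!", which is why a byte-oriented consumer stops recognising it even though the text looks identical in an editor.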

        Regards,
        Camillo
        --
        Camillo Särs <ged@...>
        http://www.ged.fi
        Aim for the impossible and you will achieve the improbable
