Re: vim + win + utf-8 => I'm lost

  • Tony Mechelynck
    Message 1 of 7 , Aug 4 6:47 PM
      ----- Original Message -----
      From: "Mojca Miklavec" <mojca.miklavec.lists@...>
      To: <vim@...>
      Sent: Friday, August 05, 2005 3:28 AM
      Subject: Re: vim + win + utf-8 => I'm lost


      >> Someone said that probably windows doesn't pass the proper characters
      >> from keyboard to vim and I figured out that he was probably right. I'm
      >> using cp1250 by default and it works perfect. :set encoding=utf-8 also
      >> works, but I can't type in anything but plain ASCII. Copy-paste from and
      >> to other programs works OK.
      >
      > I'm sorry. It seems that I had to flood the mailing list before
      > discovering the :set termencoding=cp1250 command by myself (and I've
      > been looking for it for at least two years). However, the other two
      > questions are still relevant and I would still be interested in the
      > answer about how to convince Windows to send proper unicode to the
      > editor.
      >
      > Thank you,
      > Mojca

      See my other reply.

      See also, in addition to my tips and script listed over there

      :help digraph.txt
      :help i_CTRL-V_digit
      :help mbyte-keymap
      :help 'langmap'

      * Digraphs are a great method to type "simple" Unicode characters like
      c-caron, s-cedilla, o-slash, oe-ligature, one-half, etc.
      * ^Vuxxxx and ^VUxxxxxxxx (where each x is a hex digit) are invaluable when
      you know the codepoint number but don't have a handy digraph.
      * Keymaps are very useful to define an "alternate keyboard" for a given
      language and to switch "on the fly" between that and English.
      * The 'langmap' option is useful to type Vim commands in Latin alphabet when
      your "native" encoding is something else, for instance Cyrillic or Greek.

      HTH,
      Tony.
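
      The four tips above can be sketched as a small vimrc fragment. This is a minimal illustration, not part of the original message; it assumes a Vim built with +digraphs, +keymap and +langmap, and the keymap name and the Greek letters are only examples.

      ```vim
      " Work in UTF-8 internally and tell Vim this script is UTF-8 too.
      set encoding=utf-8
      scriptencoding utf-8

      " Digraphs: in Insert mode press CTRL-K followed by two characters.
      "   CTRL-K c <    inserts c-caron
      "   CTRL-K s ,    inserts s-cedilla
      "   CTRL-K o /    inserts o-slash
      "   CTRL-K o e    inserts the oe-ligature
      "   CTRL-K 1 2    inserts one-half
      " ":digraphs" lists every digraph this Vim knows.

      " Codepoints: in Insert mode press CTRL-V u and four hex digits.
      "   CTRL-V u 0 1 0 d    inserts U+010D (c-caron)

      " Keymap: load an alternate keyboard (example name), but start with
      " it switched off; CTRL-^ in Insert mode toggles it on the fly.
      set keymap=russian-jcukenwin
      set iminsert=0

      " Langmap: let Normal-mode commands work while the keyboard produces
      " non-Latin letters. Only three illustrative Greek pairs are shown.
      set langmap=αa,βb,δd
      ```

      A real 'langmap' would list every letter of the alphabet; ":help 'langmap'" has a complete Greek example.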
    • Mojca Miklavec
      Message 2 of 7 , Aug 6 5:37 PM
        Tony, thank you very much for all the hints. Digraphs, termencoding
        and langmap (once I write some definitions) now solve 85% of my
        problems. I would be glad if windows could communicate with vim in
        unicode directly, but I can live with intermediate step in cp1250 for
        now.

        1. Now another question: I have plenty of material in cp1250. Can I
        write something like that in vimrc:

        if (file seems to be in utf-8 or if this is a new window)
        set encoding=utf-8
        else
        set encoding=cp1250
        ?

        2. Does anyone have any idea why I can't set the latin2 encoding? (I
        can set it, but the files are not displayed any different than if
        cp1250 encoding is set. The worst thing is that probably 10 characters
        are at some other place, but exactly the ones I need are displayed
        wrong.)

        Thanks,
        Mojca


      • Tony Mechelynck
        Message 3 of 7 , Aug 6 6:14 PM
          ----- Original Message -----
          From: "Mojca Miklavec" <mojca.miklavec.lists@...>
          To: <vim@...>
          Sent: Sunday, August 07, 2005 2:37 AM
          Subject: Re: vim + win + utf-8 => I'm lost


          > Tony, thank you very much for all the hints. Digraphs, termencoding
          > and langmap (once I write some definitions) now solve 85% of my
          > problems. I would be glad if windows could communicate with vim in
          > unicode directly, but I can live with intermediate step in cp1250 for
          > now.
          >
          > 1. Now another question: I have plenty of material in cp1250. Can I
          > write something like that in vimrc:
          >
          > if (file seems to be in utf-8 or if this is a new window)
          > set encoding=utf-8
          > else
          > set encoding=cp1250
          > ?

          set encoding=utf-8 termencoding=cp1250
          set fileencodings=ucs-bom,utf-8,cp1250

          This will set 'fileencoding' and 'bomb' buffer-locally to:

          1. bomb fileencoding=<the proper Unicode encoding (of the 5 possible)>
          for any file with a BOM
          2. nobomb fileencoding=utf-8
          - for an empty (or new) file
          - for a file which doesn't contain invalid byte sequences for UTF-8
          3. nobomb fileencoding=cp1250
          otherwise

          Note that ucs-bom should always be first, that there should be at most one
          8-bit encoding, and that it should be last.

          These 3 steps are run in that order. Step 1 is what Windows does: it
          recognises UTF-8 files with a BOM, i.e., files whose first 3 bytes are EF
          BB BF in hex (the UTF-8 encoding of codepoint U+FEFF). To add a BOM to any
          Unicode file of yours, use ":setlocal bomb".
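
          As a minimal sketch (not part of the original message), the fragment below repeats the settings and adds a few commands for checking what the detection chose. It assumes a cp1250 Windows setup like the one discussed in this thread.

          ```vim
          " Internal representation is UTF-8; keyboard input arrives as cp1250.
          set encoding=utf-8
          set termencoding=cp1250

          " Detection order: BOM first, then strict UTF-8, then the single
          " 8-bit fallback, which accepts any byte sequence and so must be last.
          set fileencodings=ucs-bom,utf-8,cp1250

          " After opening a file, ask Vim what it detected for this buffer:
          "   :setlocal fileencoding? bomb?
          "
          " Add a BOM to the current (Unicode) buffer and write it out:
          "   :setlocal bomb
          "   :w
          "
          " Convert a file detected as cp1250 to UTF-8 on disk:
          "   :setlocal fileencoding=utf-8
          "   :w
          ```

          Because 'fileencoding' and 'bomb' are local to the buffer, these commands affect only the file being edited.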

          Below is my 'statusline' setting, you may or may not find it useful. It
          displays the 'fileencoding' (or 'encoding' if 'fileencoding' is empty), the
          'bomb' status and (if any) the current keymap. Disregard any linebreaks
          added by my mail client; it should be all on one line, and spurious line
          breaks should be replaced by spaces.

          set statusline=%<%f\ %h%m%r%=%k[%{(&fenc==\"\")?&enc:&fenc}%{(&bomb?\",BOM\":\"\")}]\ %-14.(%l,%c%V%)\ %P
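
          For readability, the same setting can also be built up piece by piece with "+=", which avoids the long line altogether. This is a sketch of an equivalent form, not the author's own layout; the 'laststatus' line is an addition that assumes you want the status line shown even with a single window.

          ```vim
          " File name (truncated on the left if too long), then the
          " help, modified and read-only flags.
          set statusline=%<%f\ %h%m%r

          " Switch to right alignment, then show the active keymap, if any.
          set statusline+=%=%k

          " [fileencoding, or encoding if fileencoding is empty][,BOM if set]
          set statusline+=[%{(&fenc==\"\")?&enc:&fenc}%{(&bomb?\",BOM\":\"\")}]

          " Line, column and virtual column, then the position in the file.
          set statusline+=\ %-14.(%l,%c%V%)\ %P

          " Assumption: always display the status line.
          set laststatus=2
          ```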


          >
          > 2. Does anyone have any idea why I can't set the latin2 encoding? (I
          > can set it, but the files are not displayed any different than if
          > cp1250 encoding is set. The worst thing is that probably 10 characters
          > are at some other place, but exactly the ones I need are displayed
          > wrong.)

          The Vim name is iso-8859-2 and you may need a working iconv.dll in your
          PATH. I got my iconv.exe and iconv.dll from the GnuWin32 project on
          sourceforge.net.

          To read a latin2 file, use

          :e ++enc=iso-8859-2 filename.ext

          (see ":help ++opt") after installing iconv and making sure that it is in
          your PATH.
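
          A short sketch of how this might be checked and used in practice follows. It is not from the original message, and the file name is hypothetical.

          ```vim
          " Check whether this Vim can convert through iconv (1 = yes, 0 = no).
          echo has('iconv')

          " Open a file as latin2 (hypothetical file name):
          "   :e ++enc=iso-8859-2 notes.txt
          "
          " Re-read the file already in the current buffer as latin2:
          "   :e ++enc=iso-8859-2
          "
          " Then, if desired, save it as UTF-8 from now on:
          "   :setlocal fileencoding=utf-8
          "   :w
          ```

          If has('iconv') reports 0, Vim cannot find a usable iconv library and the conversion from iso-8859-2 will most likely fail.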

          >
          > Thanks,
          > Mojca

          My pleasure,
          Tony.
        • Mojca Miklavec
          Message 4 of 7 , Aug 7 2:40 AM
            Tony Mechelynck wrote:

            > > 1. Now another question: I have plenty of material in cp1250. Can I
            > > write something like that in vimrc:
            > >
            > > if (file seems to be in utf-8 or if this is a new window)
            > > set encoding=utf-8
            > > else
            > > set encoding=cp1250
            > > ?
            >
            > set encoding=utf-8 termencoding=cp1250
            > set fileencodings=ucs-bom,utf-8,cp1250

            Thank you. The last command is exactly what I was looking for (but I
            would never figure it out alone)!

            > Below is my 'statusline' setting, you may or may not find it useful. It
            > displays the 'fileencoding' (or 'encoding' if 'fileencoding' is empty), the
            > 'bomb' status and (if any) the current keymap. Disregard any linebreaks
            > added by my mail client; it should be all on one line, and spurious line
            > breaks should be replaced by spaces.
            >
            > set statusline=%<%f\
            > %h%m%r%=%k[%{(&fenc==\"\")?&enc:&fenc}%{(&bomb?\",BOM\":\"\")}]\
            > %-14.(%l,%c%V%)\ %P

            Thank you. Something very useful indeed.

            > > 2. Does anyone have any idea why I can't set the latin2 encoding? (I
            > > can set it, but the files are not displayed any different than if
            > > cp1250 encoding is set. The worst thing is that probably 10 characters
            > > are at some other place, but exactly the ones I need are displayed
            > > wrong.)
            >
            > The Vim name is iso-8859-2 and you may need a working iconv.dll in your
            > PATH. I got my iconv.exe and iconv.dll from the GnuWin32 project on
            > sourceforge.net.
            >
            > To read a latin2 file, use
            >
            > :e ++enc=iso-8859-2 filename.ext

            It seems that I already installed iconv once (or that it was installed
            by some other program). However, ++enc was the magic missing word :)
            Thank you a lot!

            And GnuWin32 is another fantastic set of tools. Thank you for telling
            me about them. I have some other gnu tools installed, but some tools
            are only present in GnuWin32, not in the one I have installed.

            Thank you for those short, life-saving pieces of code once again, Tony,
            Mojca