
vim + win + utf-8 => I'm lost

  • Mojca Miklavec
    Message 1 of 7 , Aug 4, 2005
      Hello,

      1. I've been using vim for quite some time as a basic user, but I cannot
      figure out how to type unicode under Windows. When I installed vim on a
      friend's computer, windows + unicode was no problem at all; it's just my
      computer that's causing problems.

      Someone said that probably windows doesn't pass the proper characters
      from keyboard to vim and I figured out that he was probably right. I'm
      using cp1250 by default and it works perfectly. :set encoding=utf-8 also
      works, but I can't type in anything but plain ASCII. Copy-paste from and
      to other programs works OK.

      In Control panel -> Regional options -> Advanced, there's an option
      "Select the language version for the programs which don't support
      Unicode". I selected Slovenian (cp1250) and cp1250 actually works with
      vim. Unicode in Mozilla also works without any problems. If I connect to
      a remote computer with putty (ssh) and use vim there, typing unicode is
      no problem at all.

      I also worked on a computer under linux, where the encoding in the locale
      was latin1. There, too, I couldn't type our characters (ccaron, scaron)
      from the keyboard, although kwrite, Mozilla, OpenOffice and many other
      graphical programs had no problems dealing with unicode and a foreign
      keyboard.

      2. If I have a file in ISO-8859-2 encoding, I can't open it properly.
      :set encoding=latin2 doesn't have any influence on the way I see
      accented characters. The only remaining option is recode or another text
      editor.

      3. I don't need to read Swahili and I don't need to have all the
      10^\infinity Chinese characters, but with the font that vim uses by default
      (fixedsys) I can't see Cyrillic, Greek, the euro symbol and some of the very
      common characters from European languages (with ogonek, cedilla, stroke,
      ...). Which fonts can be recommended?

      Thank you very much for any hints,
      Mojca
    • Mojca Miklavec
      Message 2 of 7 , Aug 4, 2005
        > Someone said that probably windows doesn't pass the proper characters
        > from keyboard to vim and I figured out that he was probably right. I'm
        > using cp1250 by default and it works perfect. :set encoding=utf-8 also
        > works, but I can't type in anything but plain ASCII. Copy-paste from and
        > to other programs works OK.

        I'm sorry. It seems that I had to flood the mailing list before
        discovering the :set termencoding=cp1250 command by myself (and I've
        been looking for it for at least two years). However, the other two
        questions are still relevant and I would still be interested in the
        answer about how to convince Windows to send proper unicode to the
        editor.

        Thank you,
        Mojca
      • Tony Mechelynck
        Message 3 of 7 , Aug 4, 2005
          ----- Original Message -----
          From: "Mojca Miklavec" <mojca.miklavec.lists@...>
          To: <vim@...>
          Sent: Friday, August 05, 2005 2:23 AM
          Subject: vim + win + utf-8 => I'm lost


          > Hello,
          >
          > 1. I've been using vim for quite some time as a basic user, but I cannot
          > figure out how to type unicode under Windows. When I installed vim to a
          > friend's computer, windows + unicode was no problem at all, it's just my
          > computer that's causing problems.
          >
          > Someone said that probably windows doesn't pass the proper characters from
          > keyboard to vim and I figured out that he was probably right. I'm using
          > cp1250 by default and it works perfect. :set encoding=utf-8 also works,
          > but I can't type in anything but plain ASCII. Copy-paste from and to other
          > programs works OK.
          >
          > In Control panel -> Regional options -> Advanced, there's an option
          > "Select the language version for the programs which don't support
          > Unicode". I selected Slovenian (cp1250) and cp1250 actually works with
          > vim. Unicode in Mozilla also works without any problems. If I connect to a
          > remote computer with putty (ssh) and use vim there, typing unicode is no
          > problem at all.
          >
          > I also worked with a computer under linux, where the encoding in locale
          > was latin1. I also didn't succeed to type our characters (ccaron, scaron)
          > from keyboard there, although kwrite, Mozilla, OpenOffice and many other
          > graphical programs had no problems dealing with unicode and a foreign
          > keyboard.
          >
          > 2. If I have a file in ISO-8859-2 encoding, I can't open it properly. :set
          > encoding=latin2 doesn't have any influence on the way I see accented
          > characters. The only remaining option is recode or other text editor.
          >
          > 3. I don't need to read Swahili and I don't need to have all the
          > 10^\infinity Chinese figures, but with the font that vim uses by default
          > (fixedsys) I can't see cyrillic, greek, euro symbol and some of the very
          > common characters from European languages (with ogonek, cedilla, stroke,
          > ...). Which fonts can be recommended?
          >
          > Thank you very much for any hints,
          > Mojca
          >
          >

          I've written a few tips and scripts about Vim and Unicode; and I am on
          Windows myself -- currently XP, but before that I was on 98 which didn't go
          as smoothly.

          Here are the links:

          http://vim.sourceforge.net/tips/tip.php?tip_id=246 (tip) "Working with
          Unicode"

          http://vim.sourceforge.net/scripts/script.php?script_id=789 (script)
          "Switching to Unicode in an orderly manner"

          http://vim.sourceforge.net/tips/tip.php?tip_id=632 (tip) "Setting the font
          in the GUI"

          Notes about the latter:
          * This is not specifically Unicode-related, but many good-looking
          fonts don't have a wide variety of glyphs in different scripts. Myself, I
          use Lucida_Console for Latin, Courier_New for non-East-Asian Unicode,
          MingLiU for Traditional Chinese. YMMV.
          * Not for you, but maybe for others: The way to do it in kvim is
          explained in the "user comments".

          In particular, when you switch over from your Windows-default encoding to
          UTF-8, your 'termencoding' should not remain empty. It should always jibe
          with what your keyboard is actually inputting, and that hasn't changed. See
          how the script above does it.
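
          As a rough illustration of that idea (a minimal sketch for a vimrc, not
          a copy of the script above; whether 'termencoding' has any effect
          depends on the Vim version and on GUI versus console):

          ```vim
          " Sketch only: remember the old 'encoding' as the keyboard encoding,
          " then switch Vim's internal encoding to UTF-8.
          if has("multi_byte")
            if &termencoding == ""
              " Keyboard input is still in the system codepage (e.g. cp1250).
              let &termencoding = &encoding
            endif
            set encoding=utf-8
          endif
          ```

          The order matters: 'termencoding' has to be filled in before
          'encoding' is changed, otherwise the old value is already lost.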

          To type "special" characters not on your keyboard, see ":help digraph.txt".
          Here are a few examples (where ^K means "hit Ctrl-K"):

          ^Kc< gives č LATIN SMALL LETTER C WITH CARON
          ^KS< gives Š LATIN CAPITAL LETTER S WITH CARON

          etc. (see ":help digraphs-default" for some widely used "second characters"
          in digraphs).
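
          If a character you need often has no convenient digraph, you can list
          the existing ones and define your own. A small sketch; the pair "eu"
          is just a hypothetical choice for illustration:

          ```vim
          " List all currently defined digraphs with their decimal codepoints.
          :digraphs
          " Define an extra digraph: afterwards Ctrl-K e u inserts the euro
          " sign (U+20AC, decimal 8364).
          :digraphs eu 8364
          ```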

          If, after reading all this, you have more questions, feel free to come back
          to the list.


          HTH,
          Tony.
        • Tony Mechelynck
          Message 4 of 7 , Aug 4, 2005
            ----- Original Message -----
            From: "Mojca Miklavec" <mojca.miklavec.lists@...>
            To: <vim@...>
            Sent: Friday, August 05, 2005 3:28 AM
            Subject: Re: vim + win + utf-8 => I'm lost


            >> Someone said that probably windows doesn't pass the proper characters
            >> from keyboard to vim and I figured out that he was probably right. I'm
            >> using cp1250 by default and it works perfect. :set encoding=utf-8 also
            >> works, but I can't type in anything but plain ASCII. Copy-paste from and
            >> to other programs works OK.
            >
            > I'm sorry. It seems that I had to flood the mailing list before
            > discovering the :set termencoding=cp1250 command by myself (and I've
            > been looking for it for at least two years). However, the other two
            > questions are still relevant and I would still be interested in the
            > answer about how to convince Windows to send proper unicode to the
            > editor.
            >
            > Thank you,
            > Mojca

            See my other reply.

            See also, in addition to my tips and script listed over there

            :help digraph.txt
            :help i_CTRL-V_digit
            :help mbyte-keymap
            :help 'langmap'

            * Digraphs are a great method to type "simple" Unicode characters like
            c-caron, s-cedilla, o-slash, oe-ligature, one-half, etc.
            * ^Vuxxxx and ^VUxxxxxxxx (where each x is a hex digit) are invaluable when
            you know the codepoint number but don't have a handy digraph.
            * Keymaps are very useful to define an "alternate keyboard" for a given
            language and to switch "on the fly" between that and English.
            * The 'langmap' option is useful to type Vim commands in Latin alphabet when
            your "native" keyboard layout is something else, for instance Cyrillic or Greek.
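
            A few of these in concrete form, as a hedged sketch: it assumes
            'encoding' is utf-8, that the russian-jcuken keymap from the Vim
            runtime is installed, and that your Vim has the +langmap feature.
            The three-letter 'langmap' value is only a fragment for illustration.

            ```vim
            " In Insert mode, Ctrl-V u 010d inserts č (U+010D).

            " Load a keymap; in Insert mode Ctrl-^ switches it on and off.
            set keymap=russian-jcuken

            " Let Greek capitals act as Latin command letters in Normal mode
            " (same pattern as the example in :help 'langmap').
            set langmap=ΑA,ΒB,ΔD
            ```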

            HTH,
            Tony.
          • Mojca Miklavec
            Message 5 of 7 , Aug 6, 2005
              Tony, thank you very much for all the hints. Digraphs, termencoding
              and langmap (once I write some definitions) now solve 85% of my
              problems. I would be glad if windows could communicate with vim in
              unicode directly, but I can live with an intermediate step in cp1250 for
              now.

              1. Now another question: I have plenty of material in cp1250. Can I
              write something like that in vimrc:

              if (file seems to be in utf-8 or if this is a new window)
              set encoding=utf-8
              else
              set encoding=cp1250
              ?

              2. Does anyone have any idea why I can't set the latin2 encoding? (I
              can set it, but the files are not displayed any different than if
              cp1250 encoding is set. The worst thing is that probably 10 characters
              are at some other place, but exactly the ones I need are displayed
              wrong.)

              Thanks,
              Mojca


              Tony Mechelynck wrote:
              > See also, in addition to my tips and script listed over there
              >
              > :help digraph.txt
              > :help i_CTRL-V_digit
              > :help mbyte-keymap
              > :help 'langmap'
              >
              > * Digraphs are a great method to type "simple" Unicode characters like
              > c-caron, s-cedilla, o-slash, oe-ligature, one-half, etc.
              > * ^Vuxxxx and ^VUxxxxxxxx (where each x is a hex digit) are invaluable when
              > you know the codepoint number but don't have a handy digraph.
              > * Keymaps are very useful to define an "alternate keyboard" for a given
              > language and to switch "on the fly" between that and English.
              > * The 'langmap' option is useful to type Vim commands in Latin alphabet when
              > your "native" encoding is something else, for instance Cyrillic or Greek.
              >
              > HTH,
              > Tony.
            • Tony Mechelynck
              Message 6 of 7 , Aug 6, 2005
                ----- Original Message -----
                From: "Mojca Miklavec" <mojca.miklavec.lists@...>
                To: <vim@...>
                Sent: Sunday, August 07, 2005 2:37 AM
                Subject: Re: vim + win + utf-8 => I'm lost


                > Tony, thank you very much for all the hints. Digraphs, termencoding
                > and langmap (once I write some definitions) now solve 85% of my
                > problems. I would be glad if windows could communicate with vim in
                > unicode directly, but I can live with intermediate step in cp1250 for
                > now.
                >
                > 1. Now another question: I have plenty of material in cp1250. Can I
                > write something like that in vimrc:
                >
                > if (file seems to be in utf-8 or if this is a new window)
                > set encoding=utf-8
                > else
                > set encoding=cp1250
                > ?

                set encoding=utf-8 termencoding=cp1250
                set fileencodings=ucs-bom,utf-8,cp1250

                This will set 'fileencoding' and 'bomb' buffer-locally to:

                1. bomb fileencoding=<the proper Unicode encoding (of the 5 possible)>
                   for any file with a BOM
                2. nobomb fileencoding=utf-8
                   - for an empty (or new) file
                   - for a file which doesn't contain invalid byte sequences for UTF-8
                3. nobomb fileencoding=cp1250
                   otherwise

                Note that ucs-bom should always be first, that there should be at most one
                8-bit encoding, and that it should be last.

                These 3 steps are run in that order. Step 1 is what Windows does too. In
                that step Vim will recognise UTF-8 files with a BOM, i.e., files whose first
                3 bytes are EF BB BF in hex (the UTF-8 encoding of codepoint U+FEFF). To add
                a BOM to any Unicode file of yours, use ":setlocal bomb".
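
                To check what was detected for a given buffer, and to convert an
                old cp1250 file to UTF-8 once it is loaded, something along these
                lines should do (a sketch; try it on a copy of the file first):

                ```vim
                " Show what Vim detected for the current buffer.
                :setlocal fileencoding? bomb?
                " Mark the buffer to be written as UTF-8 without a BOM, then write it.
                :setlocal fileencoding=utf-8 nobomb
                :write
                ```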

                Below is my 'statusline' setting, you may or may not find it useful. It
                displays the 'fileencoding' (or 'encoding' if 'fileencoding' is empty), the
                'bomb' status and (if any) the current keymap. Disregard any linebreaks
                added by my mail client; it should be all on one line, and spurious line
                breaks should be replaced by spaces.

                set statusline=%<%f\ %h%m%r%=%k[%{(&fenc==\"\")?&enc:&fenc}%{(&bomb?\",BOM\":\"\")}]\ %-14.(%l,%c%V%)\ %P


                >
                > 2. Does anyone have any idea why I can't set the latin2 encoding? (I
                > can set it, but the files are not displayed any different than if
                > cp1250 encoding is set. The worst thing is that probably 10 characters
                > are at some other place, but exactly the ones I need are displayed
                > wrong.)

                The Vim name is iso-8859-2 and you may need a working iconv.dll in your
                PATH. I got my iconv.exe and iconv.dll from the GnuWin32 project on
                sourceforge.net.

                To read a latin2 file, use

                :e ++enc=iso-8859-2 filename.ext

                (see ":help ++opt") after installing iconv and making sure that it is in
                your PATH.
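
                A short sketch of the whole round trip; the output file name is
                only a hypothetical example:

                ```vim
                " Prints 1 if conversion through iconv is available, 0 otherwise.
                :echo has("iconv")
                " Open the latin2 file with an explicit encoding...
                :e ++enc=iso-8859-2 filename.ext
                " ...and save a UTF-8 copy under a new name.
                :w ++enc=utf-8 filename-utf8.ext
                ```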

                >
                > Thanks,
                > Mojca

                My pleasure,
                Tony.
              • Mojca Miklavec
                Message 7 of 7 , Aug 7, 2005
                  Tony Mechelynck wrote:

                  > > 1. Now another question: I have plenty of material in cp1250. Can I
                  > > write something like that in vimrc:
                  > >
                  > > if (file seems to be in utf-8 or if this is a new window)
                  > > set encoding=utf-8
                  > > else
                  > > set encoding=cp1250
                  > > ?
                  >
                  > set encoding=utf-8 termencoding=cp1250
                  > set fileencodings=ucs-bom,utf-8,cp1250

                  Thank you. The last command is exactly what I was looking for (but I
                  would never figure it out alone)!

                  > Below is my 'statusline' setting, you may or may not find it useful. It
                  > displays the 'fileencoding' (or 'encoding' if 'fileencoding' is empty), the
                  > 'bomb' status and (if any) the current keymap. Disregard any linebreaks
                  > added by my mail client; it should be all on one line, and spurious line
                  > breaks should be replaced by spaces.
                  >
                  > set statusline=%<%f\
                  > %h%m%r%=%k[%{(&fenc==\"\")?&enc:&fenc}%{(&bomb?\",BOM\":\"\")}]\
                  > %-14.(%l,%c%V%)\ %P

                  Thank you. Something very useful indeed.

                  > > 2. Does anyone have any idea why I can't set the latin2 encoding? (I
                  > > can set it, but the files are not displayed any different than if
                  > > cp1250 encoding is set. The worst thing is that probably 10 characters
                  > > are at some other place, but exactly the ones I need are displayed
                  > > wrong.)
                  >
                  > The Vim name is iso-8859-2 and you may need a working iconv.dll in your
                  > PATH. I got my iconv.exe and iconv.dll from the GnuWin32 project on
                  > sourceforge.net.
                  >
                  > To read a latin2 file, use
                  >
                  > :e ++enc=iso-8859-2 filename.ext

                  It seems that I already installed iconv once (or that it was installed
                  by some other program). However, ++enc was the magic missing word :)
                  Thank you a lot!

                  And GnuWin32 is another fantastic set of tools. Thank you for telling
                  me about them. I have some other gnu tools installed, but some tools
                  are only present in GnuWin32, not in the one I have installed.

                  Thank you for those short, life-saving pieces of code once again, Tony,
                  Mojca