
Encoding Issues on Windows

  • Alexander Shukaev
    Message 1 of 12, Nov 14, 2013
      My encoding settings look as follows:

      if has('multi_byte')
        if empty(&termencoding)
          let &termencoding = &encoding
        endif
        let &encoding = 'utf-8'
        let &fileencoding = 'utf-8'
      endif
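
A possible variant of the block above, offered as a sketch rather than a verified fix for the console problem. `:let &fileencoding` also changes the value for the buffer that is current while the vimrc runs, whereas `:setglobal` changes only the global default applied to newly created files. The 'fileencodings' line is an addition that is not part of the original settings; it lists the encodings Vim tries, in order, when reading an existing file.

```vim
if has('multi_byte')
  " Remember the terminal's encoding before 'encoding' is changed.
  if empty(&termencoding)
    let &termencoding = &encoding
  endif
  set encoding=utf-8
  " Default for new files only; existing files keep their detected encoding.
  setglobal fileencoding=utf-8
  " Assumed detection order (not from the original post).
  set fileencodings=ucs-bom,utf-8,latin1
endif
```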

      I have no problems running under GVim: I can type any characters, and my patched Consolas for Powerline works just fine. The problems start when I run Vim in terminal mode. I use ConEmu (https://code.google.com/p/conemu-maximus5/), a feature-rich terminal emulator for Windows that claims to support Unicode out of the box. Furthermore, according to `:h termencoding`, Win32 uses Unicode by default to pass characters. What I don't understand is why anything non-ANSI that I type in terminal Vim shows up as `?`. Airline does not display the fancy symbols from the patched Consolas either. How do I configure true Unicode for terminal Vim on Win32?

      Thanks, regards.

      --
      --
      You received this message from the "vim_use" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_use" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@....
      For more options, visit https://groups.google.com/groups/opt_out.
    • Tony Mechelynck
      Message 2 of 12, Nov 14, 2013
        On 14/11/13 18:50, Alexander Shukaev wrote:
        > My encoding settings look as follows:
        >
        > if has('multi_byte')
        > if empty(&termencoding)
        > let &termencoding = &encoding
        > endif
        > let &encoding = 'utf-8'
        > let &fileencoding = 'utf-8'
        > endif
        >
        > I have no problems running under GVim: can type any characters, and my patched Consolas for Powerline works just fine. The problems start when I try to run Vim in terminal mode. I use ConEmu (https://code.google.com/p/conemu-maximus5/). a feature-rich terminal emulator for Windows. It claims to officially support Unicode out of the box. Furthermore, reading through `:h termencoding`, I see that Win32 uses Unicode by default to pass symbols. Now, the problem is that I don't understand why when I type anything non-ANSI in terminal Vim, I see `?` symbols? Furthermore, Airline does not display fancy symbols from patched Consolas as well. How to configure true Unicode for terminal Vim on Win32?
        >
        > Thanks, regards.
        >

        For questions internal to Vim, see
        http://vim.wikia.org/wiki/Working_with_Unicode

        In addition to what is said there, in order to _display_ correctly what
        you type into Console Vim, you need a terminal font with the necessary
        glyphs for whatever characters you're typing. Most fonts only have
        glyphs for a comparatively small part of all the characters to which
        the Unicode Consortium has assigned a codepoint: almost always ASCII,
        and often in addition the characters frequently used in a given country.
        For instance, a font targeted at China will probably have, in addition
        to ASCII, all Chinese characters (sometimes only either the Traditional
        ones used in Hong Kong and Taiwan _or_ the Simplified ones used in
        Mainland China, and sometimes not the rarest ones); it may also have
        Japanese kana and "national" kanji, and Korean Hangeul; it may or may
        not have Thai, or Russian and Mongol Cyrillic; it probably won't have
        the glyphs for Tamazight, Cherokee or Deseret.

        How to set the terminal font varies from terminal to terminal (on Linux
        there is a bevy of different terminal emulators, there are also several
        on the Mac, while on Windows there is only one AFAIK, and I can't help
        you there because I'm on Linux). Any character that has no glyph in
        your font will be shown by means of some "fallback glyph"; frequently
        seen possibilities are an empty box or a reverse-video question mark.

        Even if the terminal cannot display a character, Vim will represent it
        correctly in memory and on disk, provided that 'encoding' is UTF-8 or
        the 'encoding' in use is compatible with the file's 'fileencoding'
        _and_, for files being modified rather than created, the file's
        encoding has been correctly set, either automatically via
        'fileencodings' [plural], or manually by using the ++enc modifier in
        the :edit, :view, or similar command used to read the file. You can
        check how a character is represented in memory by putting the cursor
        on it and then typing ga: this will show its value in decimal, hex and
        octal, in addition to the glyph used, which, of course, won't be the
        "right" glyph if the font in use hasn't got it.
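
The commands mentioned above can be sketched as a short interactive session. This is only an illustration: cp1251 is an example value chosen here, not something the thread has established at this point, and the 'fileencodings' list shown is Vim's documented default when 'encoding' is a Unicode encoding.

```vim
" Encodings tried, in order, when an existing file is read.
:set fileencodings=ucs-bom,utf-8,default,latin1

" Re-read the current file, forcing a specific encoding (example value).
" Vim refuses this if the buffer has unsaved changes.
:edit ++enc=cp1251

" Show which encoding Vim settled on for this buffer.
:set fileencoding?

" Print the decimal, hex and octal value of the character under the cursor;
" this is the Ex equivalent of typing ga in Normal mode.
:ascii
```

If `:ascii` reports the expected codepoint while the screen shows a box or a question mark, the text in memory is fine and the problem lies on the display side.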


        Best regards,
        Tony.

      • Alexander Shukaev
        Message 3 of 12, Nov 14, 2013
          I'm aware of all of this. This, however, does not bring us closer to the solution. I know that `Consolas` supports everything I want to display; otherwise, do you think I would be so stupid as to start this thread in the first place? As I said, in GVim the same font behaves just fine: I see Russian and some other fancy symbols. But as soon as I switch to the terminal, that is no longer true.

          My terminal is configured to use the same font (Consolas). I can type Russian in the command line, and it is displayed without any problems. Furthermore, running the attached script in the terminal produces the right output as well, there are no broken symbols:

          English: texts, web pages and documents
          Graves,etc: à á â ã ä å æ ç è é ê ë ì í î ï
          Greek: ΐ Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο
          Arabic: ڠ ڡ ڢ ڣ ڤ ڥ ڦ ڧ ڨ ک ڪ ګ ڬ ڭ ڮ گ
          Full width: ＠ Ａ Ｂ Ｃ Ｄ Ｅ Ｆ Ｇ Ｈ Ｉ Ｊ Ｋ Ｌ Ｍ Ｎ Ｏ
          Romanian: texte, pagini Web şi a documentelor
          Vietnamese: văn bản, các trang web và các tài liệu
          Russian: тексты, веб-страницы и документы
          Japanese: テキスト、Webページや文書
          Yiddish: טעקסץ, וועב זייַטלעך און דאָקומענטן
          Hindi: पाठ, वेब पृष्ठों और दस्तावेज
          Thai: ข้อความ หน้า เว็บ และ เอกสาร
          Korean: 텍스트, 웹 페이지 및 문서
          Chinese: 文本,網頁和文件
          Press any key to continue . . .

          Yes, the script switches the console to codepage 65001. I've already discovered that Vim cannot work with codepage 65001, so that does not seem to be an option anyway. The default codepage in the terminal is 437. `&termencoding` reports 437 as well, while `&encoding` is set to "utf-8" in my vimrc. So where do we go from here?
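
For anyone wanting to reproduce the values reported above, a minimal sketch of how they can be inspected from inside Vim. Nothing here changes any setting; it only prints what is currently in effect.

```vim
" Show each option's current value and, thanks to :verbose,
" the script in which it was last set.
:verbose set encoding? termencoding? fileencoding?

" 1 if this Vim was built with multi-byte support, 0 otherwise.
:echo has('multi_byte')

" 1 when running the GUI (GVim), 0 in console Vim.
:echo has('gui_running')
```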

        • Tony Mechelynck
          Message 4 of 12, Nov 14, 2013
            On 15/11/13 04:25, Alexander Shukaev wrote:
            > I'm aware of all of this. This, however, does not bring us closer to the solution. I know that `Consolas` supports all what I'm want to display, otherwise do you think I would be so stupid to start this post in the first place? As I said, in GVim the same font behaves just fine, I see Russian and some other fancy symbols. But as soon as I go to terminal it's not true any more.
            >
            > My terminal is configured to use the same font (Consolas). I can type Russian in the command line, and it is displayed without any problems. Furthermore, running the attached script in the terminal produces the right output as well, there are no broken symbols:
            >
            > English: texts, web pages and documents
            > Graves,etc: à á â ã ä å æ ç è é ê ë ì í î ï
            > Greek: ΐ Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο
            > Arabic: ڠ ڡ ڢ ڣ ڤ ڥ ڦ ڧ ڨ ک ڪ ګ ڬ ڭ ڮ گ
            > Full width: ＠ Ａ Ｂ Ｃ Ｄ Ｅ Ｆ Ｇ Ｈ Ｉ Ｊ Ｋ Ｌ Ｍ Ｎ Ｏ
            > Romanian: texte, pagini Web şi a documentelor
            > Vietnamese: văn bản, các trang web và các tài liệu
            > Russian: тексты, веб-страницы и документы
            > Japanese: テキスト、Webページや文書
            > Yiddish: טעקסץ, וועב זייַטלעך און דאָקומענטן
            > Hindi: पाठ, वेब पृष्ठों और दस्तावेज
            > Thai: ข้อความ หน้า เว็บ และ เอกสาร
            > Korean: 텍스트, 웹 페이지 및 문서
            > Chinese: 文本,網頁和文件
            > Press any key to continue . . .
            >
            > Yes, it turns on 65001 codepage. I've already discovered that Vim cannot work with 65001 page, so this seems to not be an option anyway. The default page in the terminal is 437. `&termencoding` reports 437 as well, while `&encoding` is set to "utf-8" in "vimrc". So where do we go from here?
            >

            I don't know — and as I said, I'm not on Windows anymore, so I can't
            experiment with any wild idea that I might have. If anyone here is on
            Windows and knows the solution, please speak up!


            Best regards,
            Tony.

          • dyh_2011
            Message 5 of 12, Nov 28, 2013
              Hi Alexander,
              > Now, the problem is that I don't understand why when I type anything
              > non-ANSI in terminal Vim, I see `?` symbols? Furthermore, Airline does not display fancy symbols from patched Consolas as well. How to configure true Unicode for terminal Vim on Win32?

              I downloaded ConEmu and ran into the same problem as you.
              After a little googling, I typed:
              :set termencoding
              and got *utf-8*.
              Then I put these lines in my _vimrc:

              set encoding=utf-8    " fix all kinds of garbled-text problems
              if has("win32") || has("win64")
                  set termencoding=gbk
              endif
              if has("linux") || has("unix")
                  set termencoding=utf-8
              endif

              and the problem disappeared! :)

              dyh_2011
               
            • Alexander Shukaev
              Message 6 of 12, Nov 28, 2013
                First of all, could you type `chcp` in the terminal and tell me what your default codepage is? Thanks.

              • Jarrod Hermer
                Message 7 of 12, Dec 30, 2013
                  Hi Alex,

                  I have the exact same problem. Have you been able to resolve it?

                  On Friday, November 29, 2013 9:05:46 AM UTC+2, Alexander Shukaev wrote:
                  > First of all, could you type `chcp` in terminal and tell me what is your default codepage? Thanks.

                  On my system chcp returns 437.

                • Ben Fritz
                  Message 8 of 12, Dec 30, 2013
                    On Monday, December 30, 2013 11:32:07 AM UTC-6, Jarrod Hermer wrote:
                    > Hi Alex,
                    >
                    > I have the exact same problem. Have you been able to resolve it?
                    >
                    > On Friday, November 29, 2013 9:05:46 AM UTC+2, Alexander Shukaev wrote:
                    > > First of all, could you type `chcp` in terminal and tell me what is your default codepage? Thanks.
                    >
                    > On my system chcp returns 437.

                    Codepage 437 covers very few characters beyond the basic ASCII range. Try a different codepage. Unfortunately, I don't know enough about how the terminal works to say for certain whether any of them will work.

                  • Tony Mechelynck
                    Message 9 of 12, Dec 30, 2013
                      On 30/12/13 22:15, Ben Fritz wrote:
                      > On Monday, December 30, 2013 11:32:07 AM UTC-6, Jarrod Hermer wrote:
                      >> Hi Alex,
                      >>
                      >> I have the exact same problem. Have you been able to resolve it?
                      >>
                      >> On Friday, November 29, 2013 9:05:46 AM UTC+2, Alexander Shukaev wrote:
                      >>> First of all, could you type `chcp` in terminal and tell me what is your default codepage? Thanks.
                      >>
                      >> On my system chcp returns 437.
                      >
                      > codepage 437 contains very few Unicode characters beyond the basic ASCII range. Try a different codepage. Unfortunately I don't know enough about how the terminal works to know for certain whether any will work.
                      >

                      IIRC, cp437, which used to be the default MS-DOS ROM codepage for the
                      United States, contains a number of characters for ASCII art (single,
                      double and mixed frame borders and corners, several shades of grey
                      dithering, and dark blocks: full, vertical-split and horizontal-split)
                      which aren't in Latin1. All of them are in the upper half of the table,
                      i.e. somewhere in the range 0x80-0xFF.


                      Best regards,
                      Tony.

                    • Ben Fritz
                      Message 10 of 12, Dec 30, 2013
                        On Monday, December 30, 2013 4:12:16 PM UTC-6, Tony Mechelynck wrote:
                        > On 30/12/13 22:15, Ben Fritz wrote:
                        > > codepage 437 contains very few Unicode characters beyond the basic ASCII range. Try a different codepage. Unfortunately I don't know enough about how the terminal works to know for certain whether any will work.
                        > >
                        >
                        > IIRC, cp437, which used to be the default MS-DOS ROM codepage for United
                        > States, contains a number of characters for ASCII art (single, double
                        > and mixed frame borders and corners, several shades of grey dithering,
                        > and dark blocks: full, vertical-split and horizontal-split) which aren't
                        > in Latin1. All of them are in the upper half of the table, i.e.
                        > somewhere in the range 0x80-0xFF.
                        >

                        Completely true, but also completely useless for the current problem. Most people won't be trying to draw fancy ASCII art with special characters in Vim; they'll be trying to type Russian or Greek, or to add a few special quotes or punctuation marks here and there.

                        I suspect the solution will be one of:

                        1. Use chcp to set a codepage containing the desired glyphs (Russian, Greek, whatever) in the upper half of the table
                        2. Use chcp to set a Unicode codepage

                        I'm not sure which will work best, or frankly whether either will work.

                        And I cannot possibly recommend a codepage since I don't know what characters are desired. Some googling should tell you what's available...here's a list I've used before:

                        http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx

                        Maybe try the UTF-8 codepage first.

                      • Alexander Shukaev
                        Message 11 of 12, Jan 7, 2014
                          Hey Jarrod,

                          I've not been able to solve the problem. You could also take a look at my post on Stack Overflow, which has stayed unanswered for several months now. As I say there, this topic is kind of shady and surrounded by controversy, since nobody can answer it. I also discuss the UTF-8 codepage there: it does not work properly with Vim or with many other applications, primarily because the majority of them (including Vim) simply don't support it. It seems weird to me that a modern OS does not treat the UTF-8 codepage as a first-class citizen. Currently, I'm forced to use GVim because of this issue. The best we can hope for is that Mr. Moolenaar finally clarifies this topic once and for all.

                          Mr. Moolenaar, what is the real state of Unicode support for terminal Vim on Windows? Do you plan to support the UTF-8 codepage?

                          Regards,
                          Alexander

                        • Alexander Shukaev
                          Message 12 of 12, Mar 11, 2014
                            It looks like there was a patch [https://groups.google.com/forum/#!searchin/vim_dev/windows8%2420ime/vim_dev/JhtPJhOdTKo/E2d5RcIwL2AJ] which was included in 7.4.142. From now on, I can type `chcp 1251` to switch to the Cyrillic codepage and then start terminal Vim. As a result, I can now see Russian characters as I type them, i.e. no more question marks. Nevertheless, it is still not possible to see fancy symbols such as the ones coming from Powerline. The problem is likely that codepage 1251 simply does not include them. If so, then the only way to truly support Unicode in terminal Vim is to support codepage 65001, the true UTF-8 codepage. Currently, if one starts terminal Vim with this codepage active, random Unicode symbols appear all over the screen, so the support is obviously broken. I would still appreciate it if Mr. Moolenaar could comment directly on this issue here.
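
One way the console-only part of this setup might be written in a vimrc, as a sketch under stated assumptions: the console has already been switched with `chcp 1251` before Vim starts, and the Vim build is 7.4.142 or later. Setting 'termencoding' explicitly is an assumption borrowed from the approach dyh_2011 posted earlier in the thread (there with gbk); the thread does not confirm whether it is needed in this case.

```vim
set encoding=utf-8
if has('win32') && !has('gui_running')
  " Assumption: this must match the active console codepage (1251 here).
  set termencoding=cp1251
endif
```

The `!has('gui_running')` guard keeps GVim, which already works, out of the picture, so only console Vim is affected.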
