Re: Extended ASCII characters garbled in Vim 6.3b [Win32 console]

  • Craig Barkhouse
    Message 1 of 13, Jun 2, 2004
      Bram,

      Thanks for the reply. On your advice, I looked at what codepage I was
      using, and it was something unexpected -- 936. This must have been left
      over from the previous person to use this computer (I thought I had undone
      all the crap he did to this computer, but I didn't think to check this).
      After some fiddling I managed to change my default codepage back to 437.

      This fixed the problem with characters being displayed as two characters on
      screen. However, it still didn't produce the results I expected. I
      expected to be able to see "ASCII art" like I used to. Instead, I see
      accented letters. But in GVIM, it does show the "ASCII art". These are the
      default options that VIM and GVIM start up with:

      VIM: enc=latin1 fenc= tenc=cp437
      GVIM: enc=latin1 fenc= tenc=

      So, there is a difference between the two. Now, if I clear tenc in VIM, I
      do see my "ASCII art" once more. :) I can work around this by using "set
      tenc=" in my .vimrc. But should VIM and GVIM really display different
      characters on screen by default? GVIM seems to ignore tenc completely, so
      I'm not sure I understand what you mean about having the same option value
      work in both the GUI and console.
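
One possible explanation for the two-character display under codepage 936, sketched in Python. If 'termencoding' had been initialised to cp936, each latin1 character above 127 would be converted to a two-byte sequence, and a console drawing those bytes one at a time with cp437 glyphs would show two symbols. The helper name `console_residue` and the assumption that the console drew bytes with cp437 glyphs are illustrative guesses, not something confirmed in this thread.

```python
def console_residue(byte_value, encoding="latin-1",
                    termencoding="cp936", glyph_table="cp437"):
    """Convert one buffer byte to termencoding, then show how a console
    drawing each output byte with glyph_table would render it."""
    char = bytes([byte_value]).decode(encoding)   # character as the editor sees it
    sent = char.encode(termencoding)              # bytes sent to the console
    return sent, sent.decode(glyph_table)         # raw bytes and per-byte glyphs


# <F9> and <FA> are two of the listchars bytes from the original report.
for value in (0xF9, 0xFA):
    sent, drawn = console_residue(value)
    print(f"<{value:02X}> -> {sent.hex()} -> {drawn}")
```

Under these assumptions <F9> becomes the bytes A8 B4, and A8 is the upside-down question mark in cp437, which would be consistent with the symptom reported earlier.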


      ----- Original Message -----
      From: "Bram Moolenaar" <Bram@...>
      To: "Craig Barkhouse" <cabarkho@...>
      Cc: "VIM Developers" <vim-dev@...>
      Sent: Tuesday, June 01, 2004 6:49 AM
      Subject: Re: Extended ASCII characters garbled in Vim 6.3b [Win32 console]


      >
      > Craig Barkhouse wrote:
      >
      > > I just noticed a problem which I believe is new in 6.3b. I know the problem
      > > wasn't there in 6.2 with all patches installed, and I believe it wasn't
      > > there with 6.2a.
      > >
      > > I have some extended ASCII characters in my .vimrc as follows (where you see
      > > two hex digits in angle brackets, I have the actual character):
      > >
      > > set listchars=tab:<F9><FA>,trail:<FA>,precedes:<AE>,extends:<AF>
      > >
      > > Editing .vimrc with the Win32 console version, I see the <F9> displaying as
      > > two characters -- an upside-down question mark followed by an extended ASCII
      > > character. The <FA> and <AF> are also displayed as two-character sequences,
      > > though the <AE> is just displayed as a single question mark. This appears
      > > to be *only* a display problem. As I cursor over the line, the cursor stops
      > > at the 'd' in 'extends', indicating that Vim doesn't know about the extra
      > > four characters being displayed. My ruler also confirms that Vim sees my
      > > original characters correctly.
      > >
      > > This problem does not appear in GVIM 6.3b.
      > >
      > > This problem no doubt has something to do with file encodings or multibyte
      > > characters or some such, but that stuff confuses me. Up until now I haven't
      > > had to concern myself with it.
      >
      > Please mention the values of the 'encoding' and 'termencoding' options.
      > 'fileencoding' might also matter.
      >
      > What changed is that 'termencoding' is now initialized to the console
      > codepage. If 'encoding' is "latin1", which is the default, this means
      > that characters are converted from "latin1" to the console codepage when
      > displaying them. If you previously used characters in the console
      > codepage, the conversion will show something different.
      >
      > The problem actually is that it was wrong before, and it has been fixed
      > now. But if you already compensated for the missing conversion, then
      > it's wrong now.
      >
      > I still think this is a good fix, because it makes it possible to have
      > the same option value work in the GUI (gvim) and the console (vim).
      > Well, only for those characters that appear both in latin1 and the
      > console codepage (cp437 for me).
      >
      > --
      > For a moment, nothing happened.
      > Then, after a second or so, nothing continued to happen.
      > -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"
      >
      > /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      > /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      > \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      > \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
      >
    • Bram Moolenaar
      Message 2 of 13, Jun 2, 2004
        Craig -

        > Thanks for the reply. On your advice, I looked at what codepage I was
        > using, and it was something unexpected -- 936. This must have been left
        > over from the previous person to use this computer (I thought I had undone
        > all the crap he did to this computer, but I didn't think to check this).
        > After some fiddling I managed to change my default codepage back to 437.
        >
        > This fixed the problem with characters being displayed as two characters on
        > screen. However, it still didn't produce the results I expected. I
        > expected to be able to see "ASCII art" like I used to. Instead, I see
        > accented letters. But in GVIM, it does show the "ASCII art". These are the
        > default options that VIM and GVIM start up with:
        >
        > VIM: enc=latin1 fenc= tenc=cp437
        > GVIM: enc=latin1 fenc= tenc=
        >
        > So, there is a difference between the two. Now, if I clear tenc in VIM, I
        > do see my "ASCII art" once more. :) I can work around this by using "set
        > tenc=" in my .vimrc. But should VIM and GVIM really display different
        > characters on screen by default? GVIM seems to ignore tenc completely, so
        > I'm not sure I understand what you mean about having the same option value
        > work in both the GUI and console.

        I don't know what you mean with "ASCII art". Often these are
        DOS characters that do not exist in latin1. But then it would not work
        in gvim either, since you mention it uses latin1.

        If you have a file with cp437 then you need to make sure that 'encoding'
        includes these characters. Either set it to "cp437" or "utf-8".
        Unicode is better, of course, since it includes all possible characters.

        - Bram

        --
        "Space is big. Really big. You just won't believe how vastly hugely mind-
        bogglingly big it is. I mean, you may think it's a long way down the
        road to the chemist, but that's just peanuts to space."
        -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"

        /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
        /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
        \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
        \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
      • Craig Barkhouse
        Message 3 of 13, Jun 2, 2004
          ----- Original Message -----
          From: "Bram Moolenaar" <Bram@...>
          To: "Craig Barkhouse" <cabarkho@...>
          Cc: "VIM Developers" <vim-dev@...>
          Sent: Wednesday, June 02, 2004 3:14 PM
          Subject: Re: Extended ASCII characters garbled in Vim 6.3b [Win32 console]


          >
          > Craig -
          >
          > > Thanks for the reply. On your advice, I looked at what codepage I was
          > > using, and it was something unexpected -- 936. This must have been left
          > > over from the previous person to use this computer (I thought I had undone
          > > all the crap he did to this computer, but I didn't think to check this).
          > > After some fiddling I managed to change my default codepage back to 437.
          > >
          > > This fixed the problem with characters being displayed as two characters on
          > > screen. However, it still didn't produce the results I expected. I
          > > expected to be able to see "ASCII art" like I used to. Instead, I see
          > > accented letters. But in GVIM, it does show the "ASCII art". These are the
          > > default options that VIM and GVIM start up with:
          > >
          > > VIM: enc=latin1 fenc= tenc=cp437
          > > GVIM: enc=latin1 fenc= tenc=
          > >
          > > So, there is a difference between the two. Now, if I clear tenc in VIM, I
          > > do see my "ASCII art" once more. :) I can work around this by using "set
          > > tenc=" in my .vimrc. But should VIM and GVIM really display different
          > > characters on screen by default? GVIM seems to ignore tenc completely, so
          > > I'm not sure I understand what you mean about having the same option value
          > > work in both the GUI and console.
          >
          > I don't know what you mean with "ASCII art". Often these are
          > DOS characters that do not exist in latin1. But then it would not work
          > in gvim either, since you mention it uses latin1.
          >
          > If you have a file with cp437 then you need to make sure that 'encoding'
          > includes these characters. Either set it to "cp437" or "utf-8".
          > Unicode is better, of course, since it includes all possible characters.
          >
          > - Bram

          OK, by "ASCII art" I think I do mean what you call "DOS characters" --
          extended ASCII characters in the 128-255 range. These include special
          graphic symbols as well as certain accented characters. For example, 251 is
          a square root symbol (√). If you're at a Windows cmd prompt you should be
          able to press Alt+251 (numeric keypad) to make the symbol appear. You can
          echo the character to a file to do a little test. Edit the file with VIM,
          and with tenc=cp437 the character appears as a u-circumflex. With tenc= or
          tenc=cp1252, it appears as a square root symbol. In GVIM, no matter what
          tenc is set to, the character appears as a square root symbol.

          Now, I have some extended ASCII (128-255) symbols in my .vimrc, and I just
          want them to display properly (as symbols, not accented characters as in
          Latin-1) on screen.

          Curiously, when I set my codepage to 1252 and start VIM, tenc is not
          "cp1252" but rather is empty. cp1252 looks like it has all the special
          symbols I'm looking for.
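
The two renderings of character 251 come down to which code page is used to interpret the byte. A small Python check illustrates this; Python's codec names `cp437`, `latin-1` and `cp1252` stand in for the code pages discussed here, and this is only an illustration, not a description of how Vim performs the lookup.

```python
# Byte 251 (0xFB), the value entered with Alt+251.
raw = bytes([251])

# Decode the same byte under each code page mentioned in the thread.
decoded = {codec: raw.decode(codec) for codec in ("cp437", "latin-1", "cp1252")}

for codec, char in decoded.items():
    print(f"{codec:8} U+{ord(char):04X} {char}")
```

Only cp437 maps the byte to the square root sign; latin1 and cp1252 both map it to u-circumflex.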
        • David Brown
          Message 4 of 13, Jun 2, 2004
            On Wed, Jun 02, 2004 at 03:59:25PM -0400, Craig Barkhouse wrote:

            > OK, by "ASCII art" I think I do mean what you call "DOS characters" --
            > extended ASCII characters in the 128-255 range. These include special
            > graphic symbols as well as certain accented characters.

            That is a severely broken choice of terminology, though. ASCII defines
            the 0-127 characters. Having the high bit set drops you into numerous
            possible code pages, or other encodings. Usually the term ASCII is used
            to indicate that the characters are 7-bit clean.

            Dave Brown
          • Antoine J. Mechelynck
            Message 5 of 13, Jun 2, 2004
              Bram Moolenaar <Bram@...> wrote:
              > Craig -
              >
              [...]
              > > So, there is a difference between the two. Now, if I clear tenc in
              > > VIM, I do see my "ASCII art" once more. :) I can work around this
              > > by using "set tenc=" in my .vimrc. But should VIM and GVIM really
              > > display different characters on screen by default? GVIM seems to
              > > ignore tenc completely, so I'm not sure I understand what you mean
              > > about having the same option value work in both the GUI and console.
              >
              > I don't know what you mean with "ASCII art". Often these are
              > DOS characters that do not exist in latin1. But then it would not work
              > in gvim either, since you mention it uses latin1.
              >
              > If you have a file with cp437 then you need to make sure that
              > 'encoding' includes these characters. Either set it to "cp437" or
              > "utf-8". Unicode is better, of course, since it includes all possible
              > characters.
              >
              > - Bram
              >
              > --
              > "Space is big. Really big. You just won't believe how vastly hugely
              > mind-bogglingly big it is. I mean, you may think it's a long way down
              > the road to the chemist, but that's just peanuts to space."
              > -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"
              >
              > /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
              > /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
              > \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
              > \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///

              "ASCII" art --

              The attached file contains characters 128 to 255. Don't display it in
              gvim -- "type" it in a cp437 Dos box. Look especially at the boxes and lines
              in positions 176-223. They were used (together with the space, of course) to
              display "pretty pictures" in Dos long before Windows existed.

              Best regards,
              Tony.
            • Bram Moolenaar
              Message 6 of 13, Jun 2, 2004
                Craig Barkhouse wrote:

                > OK, by "ASCII art" I think I do mean what you call "DOS characters" --
                > extended ASCII characters in the 128-255 range. These include special
                > graphic symbols as well as certain accented characters. For example, 251 is
                > a square root symbol (√). If you're at a Windows cmd prompt you should be
                > able to press Alt+251 (numeric keypad) to make the symbol appear. You can
                > echo the character to a file to do a little test. Edit the file with VIM,
                > and with tenc=cp437 the character appears as a u-circumflex. With tenc= or
                > tenc=cp1252, it appears as a square root symbol.

                Yes, but that's only because no conversion is done. To do it correctly
                you would actually have to set 'encoding' to cp437 and leave
                'termencoding' at the default.

                > In GVIM, no matter what tenc is set to, the character appears as a
                > square root symbol.

                For me it appears as u-circumflex. Do you have cp437 as the default
                encoding for your system? Not a good idea. Or do you have
                'fileencodings' include cp437? Then it would still not work when
                'encoding' is latin1. Thus I don't understand why you see the character
                you want to see...

                > Now, I have some extended ASCII (128-255) symbols in my .vimrc, and I just
                > want them to display properly (as symbols, not accented characters as in
                > Latin-1) on screen.
                >
                > Curiously, when I set my codepage to 1252 and start VIM, tenc is not
                > "cp1252" but rather is empty. cp1252 looks like it has all the special
                > symbols I'm looking for.

                cp1252 does _not_ contain those symbols. Vim is simply omitting the
                conversion, which happens to work for these characters. The symbols are
                in cp437.

                You could use cp437 as your normal encoding, but be warned that most
                files these days are latin1 (or cp1252) and then those won't show up
                correctly. Using "utf-8" for 'encoding' and making sure the files are
                read with the real encoding would be the only way to use both.
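
A rough Python model of the behaviour described above may help. It assumes a console that draws bytes with cp437 glyphs; the function name `shown_on_console` and its parameters are made up for this sketch, and Vim's real output path is more involved.

```python
def shown_on_console(file_byte, encoding, termencoding, console_cp="cp437"):
    """Model what the console draws for one byte read from a file."""
    char = bytes([file_byte]).decode(encoding)     # how 'encoding' interprets the byte
    if termencoding:
        # Convert from 'encoding' to 'termencoding' before output.
        sent = char.encode(termencoding, errors="replace")
    else:
        # Empty 'termencoding': no conversion, the original byte goes out.
        sent = char.encode(encoding)
    return sent.decode(console_cp)                 # glyph the console draws


cases = [
    ("latin-1", "cp437"),   # 6.3b default in a cp437 console
    ("latin-1", ""),        # the "set tenc=" workaround
    ("latin-1", "cp1252"),  # conversion leaves byte 251 unchanged
    ("cp437", "cp437"),     # 'encoding' matches the file contents
]
for enc, tenc in cases:
    print(f"enc={enc:8} tenc={tenc or '(empty)':8} -> {shown_on_console(251, enc, tenc)}")
```

In this model only the first case shows u-circumflex. The second and third show the square root sign because byte 251 reaches the console unchanged, while the fourth shows it because the byte was interpreted as cp437 from the start.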

                --
                Apparently, 1 in 5 people in the world are Chinese. And there are 5
                people in my family, so it must be one of them. It's either my mum
                or my dad. Or my older brother Colin. Or my younger brother
                Ho-Cha-Chu. But I think it's Colin.

                /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
                \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
              • Antoine J. Mechelynck
                Message 7 of 13, Jun 2, 2004
                  Craig Barkhouse <cabarkho@...> wrote:
                  [...]
                  > OK, by "ASCII art" I think I do mean what you call "DOS characters" --
                  > extended ASCII characters in the 128-255 range. These include special
                  > graphic symbols as well as certain accented characters. For example,
                  > 251 is a square root symbol (√). If you're at a Windows cmd prompt
                  > you should be able to press Alt+251 (numeric keypad) to make the
                  > symbol appear. You can echo the character to a file to do a little
                  > test. Edit the file with VIM, and with tenc=cp437 the character
                  > appears as a u-circumflex. With tenc= or tenc=cp1252, it appears as
                  > a square root symbol. In GVIM, no matter what tenc is set to, the
                  > character appears as a square root symbol.
                  >
                  > Now, I have some extended ASCII (128-255) symbols in my .vimrc, and I
                  > just want them to display properly (as symbols, not accented
                  > characters as in Latin-1) on screen.
                  >
                  > Curiously, when I set my codepage to 1252 and start VIM, tenc is not
                  > "cp1252" but rather is empty. cp1252 looks like it has all the
                  > special symbols I'm looking for.

                  Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do ":view
                  ++enc=cp1252 upascii.txt" (the file I sent with my previous post), then
                  "setl fenc?" the answer is "cp437"... Apparently, when told to use code page
                  1252, gvim uses 437 instead.

                  Regards,
                  Tony.
                • Antoine J. Mechelynck
                  Message 8 of 13, Jun 2, 2004
                    Antoine J. Mechelynck <antoine.mechelynck@...> wrote:
                    [...]
                    > Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do
                    > ":view ++enc=cp1252 upascii.txt" (the file I sent with my previous
                    > post), then "setl fenc?" the answer is "cp437"... Apparently, when
                    > told to use code page 1252, gvim uses 437 instead.
                    >
                    > Regards,
                    > Tony.

                    Oops... My bad. If that same file is open in cp437 in a different window,
                    vim disregards the ++enc option. But if I close all windows but one, then
                    ":view ++enc=cp1252" will show true 1252 (just like in W98 Notepad) without
                    the lines and blocks.

                    Regards,
                    Tony.

                    PS. Bug or feature?
                  • Bram Moolenaar
                    Message 9 of 13, Jun 3, 2004
                      Antoine J. Mechelynck wrote:

                      > Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do
                      > ":view ++enc=cp1252 upascii.txt" (the file I sent with my previous
                      > post), then "setl fenc?" the answer is "cp437"... Apparently, when
                      > told to use code page 1252, gvim uses 437 instead.

                      If you have already opened a file, then another edit command will use
                      the same buffer. You can't have two buffers for the same file (that
                      would be very confusing).

                      Perhaps a warning about the "++enc" argument being ignored would be
                      appropriate? It's confusing, because when you do the same command again
                      while already editing the buffer, the buffer _is_ reloaded. That's
                      because you indicate you want to re-edit the file.

                      Another possibility would be to reload the file, so that other windows
                      show the buffer also with the new encoding. This is then only to be
                      done if the "++enc" argument differs from 'fileencoding'.
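
The buffer reuse described above can be pictured with a toy Python model. `BufferList`, `edit` and the file names are hypothetical and only mirror the behaviour as stated in this message; they are not Vim internals, and windows are reduced to a single "current buffer" for simplicity.

```python
class BufferList:
    """Toy model: at most one buffer per file name."""

    def __init__(self):
        self.buffers = {}    # file name -> encoding the buffer was loaded with
        self.current = None  # name of the buffer being edited right now

    def edit(self, name, enc=None):
        """Edit a file, optionally asking for an encoding (like ++enc)."""
        if name in self.buffers and name != self.current:
            # The file already has a buffer: reuse it, the requested
            # encoding is ignored.
            self.current = name
            return self.buffers[name]
        # New file, or re-editing the current buffer: (re)load it.
        self.buffers[name] = enc or self.buffers.get(name, "default")
        self.current = name
        return self.buffers[name]


bufs = BufferList()
bufs.edit("upascii.txt", "cp437")           # first load, read as cp437
bufs.edit("other.txt")                      # move to a different buffer
print(bufs.edit("upascii.txt", "cp1252"))   # existing buffer reused
print(bufs.edit("upascii.txt", "cp1252"))   # same command again: reloaded
```

In this sketch the first request for cp1252 still reports cp437 because the existing buffer is reused, and repeating the command on the now-current buffer reloads it as cp1252.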

                      --
                      This planet has -- or rather had -- a problem, which was this: most
                      of the people living on it were unhappy for pretty much of the time.
                      Many solutions were suggested for this problem, but most of these
                      were largely concerned with the movements of small green pieces of
                      paper, which is odd because on the whole it wasn't the small green
                      pieces of paper that were unhappy.
                      -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"

                      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
                      \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
                    • Antoine J. Mechelynck
                      Message 10 of 13, Jun 3, 2004
                        Bram Moolenaar <Bram@...> wrote:
                        > Antoine J. Mechelynck wrote:
                        >
                        > > Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do
                        > > ":view ++enc=cp1252 upascii.txt" (the file I sent with my previous
                        > > post), then "setl fenc?" the answer is "cp437"... Apparently, when
                        > > told to use code page 1252, gvim uses 437 instead.
                        >
                        > If you have already opened a file, then another edit command will use
                        > the same buffer. You can't have two buffers for the same file (that
                        > would be very confusing).

                        Hmm. As long as I don't hide buffers, I don't feel the confusion; but I
                        suppose it would be confusing to Vim, inasmuch as the "identity" of a buffer
                        lies in its name. I just thought of displaying, how shall I call it? ("view"
                        is already a Vim technical term) two different "aspects" of a file in a
                        single instance of gvim, which can be done in other circumstances, like
                        left-to-right vs. right-to-left, and IIRC folded vs. unfolded. Well in any
                        case, the workaround isn't hard to find (use ":saveas" to copy the file,
                        giving the buffer a different name; then reopen the original file in a
                        split window with different options).
                        >
                        > Perhaps a warning about the "++enc" argument being ignored would be
                        > appropriate? It's confusing, because when you do the same command
                        > again while already editing the buffer, the buffer _is_ reloaded.
                        > That's because you indicate you want to re-edit the file.

                        That is obvious in retrospect.
                        >
                        > Another possibility would be to reload the file, so that other windows
                        > show the buffer also with the new encoding. This is then only to be
                        > done if the "++enc" argument differs from 'fileencoding'.

                        I suppose this latter option can certainly be used if the buffer in question
                        is not displayed in a window other than the one where the file is to be
                        newly displayed (e.g., if it's hidden). What to do when the buffer /is/
                        displayed in another window is less clear to me. If the current behaviour is
                        kept unchanged, then I suppose a note in the help (under ++opt and maybe
                        elsewhere) would be in order. But maybe reloading the file (with the usual
                        caveats about "abandon") would be the thing to do -- less confusing, as you
                        say.
                        >
                        > --
                        > This planet has -- or rather had -- a problem, which was this: most
                        > of the people living on it were unhappy for pretty much of the time.
                        > Many solutions were suggested for this problem, but most of these
                        > were largely concerned with the movements of small green pieces of
                        > paper, which is odd because on the whole it wasn't the small green
                        > pieces of paper that were unhappy.
                        > -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"
                        >
                        > /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                        > /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                        > \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
                        > \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///

                        Best regards,
                        Tony.