Re: Extended ASCII characters garbled in Vim 6.3b [Win32 console]

  • Bram Moolenaar
    Message 1 of 13 , Jun 2, 2004
      Craig -

      > Thanks for the reply. On your advice, I looked at what codepage I was
      > using, and it was something unexpected -- 936. This must have been left
      > over from the previous person to use this computer (I thought I had undone
      > all the crap he did to this computer, but I didn't think to check this).
      > After some fiddling I managed to change my default codepage back to 437.
      >
      > This fixed the problem with characters being displayed as two characters on
      > screen. However, it still didn't produce the results I expected. I
      > expected to be able to see "ASCII art" like I used to. Instead, I see
      > accented letters. But in GVIM, it does show the "ASCII art". These are the
      > default options that VIM and GVIM start up with:
      >
      > VIM: enc=latin1 fenc= tenc=cp437
      > GVIM: enc=latin1 fenc= tenc=
      >
      > So, there is a difference between the two. Now, if I clear tenc in VIM, I
      > do see my "ASCII art" once more. :) I can work around this by using "set
      > tenc=" in my .vimrc. But should VIM and GVIM really display different
      > characters on screen by default? GVIM seems to ignore tenc completely, so
      > I'm not sure I understand what you mean about having the same option value
      > work in both the GUI and console.

      I don't know what you mean with "ASCII art". Often these are
      DOS characters that do not exist in latin1. But then it would not work
      in gvim either, since you mention it uses latin1.

      If you have a file with cp437 then you need to make sure that 'encoding'
      includes these characters. Either set it to "cp437" or "utf-8".
      Unicode is better, of course, since it includes all possible characters.
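The point about 'encoding' having to include the characters can be illustrated outside Vim. The following is a minimal Python sketch, not part of the original exchange; it assumes Python's built-in "cp437", "latin-1" and "utf-8" codecs correspond to the encodings Vim knows by those names, and the helper name `fits` is made up for the illustration.

```python
# Three cp437 bytes that draw the top edge of a double-line box.
raw = bytes([0xC9, 0xCD, 0xBB])
text = raw.decode("cp437")  # U+2554, U+2550, U+2557


def fits(s, codec):
    """Return True if every character of s can be stored in codec."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# ascii() keeps the output printable on any console.
for codec in ("latin-1", "cp437", "utf-8"):
    print(codec, ascii(text), fits(text, codec))
```

Only the latin-1 check fails, which matches the statement that these DOS characters do not exist in latin1.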

      - Bram

      --
      "Space is big. Really big. You just won't believe how vastly hugely mind-
      bogglingly big it is. I mean, you may think it's a long way down the
      road to the chemist, but that's just peanuts to space."
      -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
    • Craig Barkhouse
      Message 2 of 13 , Jun 2, 2004
        ----- Original Message -----
        From: "Bram Moolenaar" <Bram@...>
        To: "Craig Barkhouse" <cabarkho@...>
        Cc: "VIM Developers" <vim-dev@...>
        Sent: Wednesday, June 02, 2004 3:14 PM
        Subject: Re: Extended ASCII characters garbled in Vim 6.3b [Win32 console]


        >
        > Craig -
        >
        > > Thanks for the reply. On your advice, I looked at what codepage I was
        > > using, and it was something unexpected -- 936. This must have been left
        > > over from the previous person to use this computer (I thought I had undone
        > > all the crap he did to this computer, but I didn't think to check this).
        > > After some fiddling I managed to change my default codepage back to 437.
        > >
        > > This fixed the problem with characters being displayed as two characters on
        > > screen. However, it still didn't produce the results I expected. I
        > > expected to be able to see "ASCII art" like I used to. Instead, I see
        > > accented letters. But in GVIM, it does show the "ASCII art". These are the
        > > default options that VIM and GVIM start up with:
        > >
        > > VIM: enc=latin1 fenc= tenc=cp437
        > > GVIM: enc=latin1 fenc= tenc=
        > >
        > > So, there is a difference between the two. Now, if I clear tenc in VIM, I
        > > do see my "ASCII art" once more. :) I can work around this by using "set
        > > tenc=" in my .vimrc. But should VIM and GVIM really display different
        > > characters on screen by default? GVIM seems to ignore tenc completely, so
        > > I'm not sure I understand what you mean about having the same option value
        > > work in both the GUI and console.
        >
        > I don't know what you mean with "ASCII art". Often these are
        > DOS characters that do not exist in latin1. But then it would not work
        > in gvim either, since you mention it uses latin1.
        >
        > If you have a file with cp437 then you need to make sure that 'encoding'
        > includes these characters. Either set it to "cp437" or "utf-8".
        > Unicode is better, of course, since it includes all possible characters.
        >
        > - Bram

        OK, by "ASCII art" I think I do mean what you call "DOS characters" --
        extended ASCII characters in the 128-255 range. These include special
        graphic symbols as well as certain accented characters. For example, 251 is
        a square root symbol (√). If you're at a Windows cmd prompt you should be
        able to press Alt+251 (numeric keypad) to make the symbol appear. You can
        echo the character to a file to do a little test. Edit the file with VIM,
        and with tenc=cp437 the character appears as a u-circumflex. With tenc= or
        tenc=cp1252, it appears as a square root symbol. In GVIM, no matter what
        tenc is set to, the character appears as a square root symbol.
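What byte 251 stands for in each of the code pages mentioned can be checked with a short Python sketch. This is only an illustration added for reference: it says nothing about what Vim itself does with 'tenc', and the helper name `describe` is invented here.

```python
import unicodedata


def describe(byte_value, codec):
    """Return (code point, Unicode name) for one byte decoded with codec."""
    ch = bytes([byte_value]).decode(codec)
    return "U+%04X" % ord(ch), unicodedata.name(ch)


# 251 is the value typed as Alt+251 in the test described above.
for codec in ("cp437", "latin-1", "cp1252"):
    print(codec, *describe(251, codec))
```

Only cp437 maps the byte to the square root sign; latin-1 and cp1252 both map it to u-circumflex.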

        Now, I have some extended ASCII (128-255) symbols in my .vimrc, and I just
        want them to display properly (as symbols, not accented characters as in
        Latin-1) on screen.

        Curiously, when I set my codepage to 1252 and start VIM, tenc is not
        "cp1252" but rather is empty. cp1252 looks like it has all the special
        symbols I'm looking for.
      • David Brown
        Message 3 of 13 , Jun 2, 2004
          On Wed, Jun 02, 2004 at 03:59:25PM -0400, Craig Barkhouse wrote:

          > OK, by "ASCII art" I think I do mean what you call "DOS characters" --
          > extended ASCII characters in the 128-255 range. These include special
          > graphic symbols as well as certain accented characters.

          That is a severely broken choice of terminology, though. ASCII defines
          the 0-127 characters. Having the high bit set drops you into numerous
          possible code pages, or other encodings. Usually the term ASCII is used
          to indicate that the characters are 7-bit clean.
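As a small illustration of "7-bit clean", here is a Python sketch (an addition for reference, with a made-up function name) that tests whether any byte has the high bit set:

```python
def is_7bit_clean(data: bytes) -> bool:
    """True if every byte is in the ASCII range 0-127 (high bit clear)."""
    return all(byte < 0x80 for byte in data)


print(is_7bit_clean(b"plain text"))        # True
print(is_7bit_clean(bytes([0x41, 251])))   # False: 251 has the high bit set
```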

          Dave Brown
        • Antoine J. Mechelynck
          Message 4 of 13 , Jun 2, 2004
            Bram Moolenaar <Bram@...> wrote:
            > Craig -
            >
            [...]
            > > So, there is a difference between the two. Now, if I clear tenc in
            > > VIM, I do see my "ASCII art" once more. :) I can work around this
            > > by using "set tenc=" in my .vimrc. But should VIM and GVIM really
            > > display different characters on screen by default? GVIM seems to
            > > ignore tenc completely, so I'm not sure I understand what you mean
            > > about having the same option value work in both the GUI and console.
            >
            > I don't know what you mean with "ASCII art". Often these are
            > DOS characters that do not exist in latin1. But then it would not
            > work in gvim either, since you mention it uses latin1.
            >
            > If you have a file with cp437 then you need to make sure that
            > 'encoding' includes these characters. Either set it to "cp437" or
            > "utf-8". Unicode is better, of course, since it includes all
            > possible characters.
            >
            > - Bram

            "ASCII" art --

            The attached file contains characters 128 to 255. Don't display it in
            gvim -- "type" it in a cp437 Dos box. Look especially at the boxes and lines
            in positions 176-223. They were used (together with the space, of course) to
            display "pretty pictures" in Dos long before Windows existed.
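The attachment itself is not reproduced in this archive. As a stand-in, the following Python sketch builds a byte string holding characters 128 to 255 and checks what positions 176-223 become when read as cp437. The variable names are illustrative, and Python's "cp437" codec is assumed to match the DOS code page.

```python
# Bytes 128..255, one per position (a stand-in for the attached file).
high = bytes(range(128, 256))

# Positions 176-223, decoded as cp437.
boxes = high[176 - 128 : 224 - 128].decode("cp437")

# Check that each one lies in the Unicode Box Drawing (U+2500-U+257F)
# or Block Elements (U+2580-U+259F) range.
all_box_or_block = all(0x2500 <= ord(c) <= 0x259F for c in boxes)
print(len(boxes), all_box_or_block)
```

Writing `high` to a file and using "type" on it in a cp437 console should reproduce the experiment described above.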

            Best regards,
            Tony.
          • Bram Moolenaar
            Message 5 of 13 , Jun 2, 2004
              Craig Barkhouse wrote:

              > OK, by "ASCII art" I think I do mean what you call "DOS characters" --
              > extended ASCII characters in the 128-255 range. These include special
              > graphic symbols as well as certain accented characters. For example, 251 is
              > a square root symbol (√). If you're at a Windows cmd prompt you should be
              > able to press Alt+251 (numeric keypad) to make the symbol appear. You can
              > echo the character to a file to do a little test. Edit the file with VIM,
              > and with tenc=cp437 the character appears as a u-circumflex. With tenc= or
              > tenc=cp1252, it appears as a square root symbol.

              Yes, but that's only because no conversion is done. To do it correctly
              you would actually have to set 'encoding' to cp437 and leave
              'termencoding' at the default.

              > In GVIM, no matter what tenc is set to, the character appears as a
              > square root symbol.

              For me it appears as u-circumflex. Do you have cp437 as the default
              encoding for your system? Not a good idea. Or do you have
              'fileencodings' include cp437? Then it would still not work when
              'encoding' is latin1. Thus I don't understand why you see the character
              you want to see...

              > Now, I have some extended ASCII (128-255) symbols in my .vimrc, and I just
              > want them to display properly (as symbols, not accented characters as in
              > Latin-1) on screen.
              >
              > Curiously, when I set my codepage to 1252 and start VIM, tenc is not
              > "cp1252" but rather is empty. cp1252 looks like it has all the special
              > symbols I'm looking for.

              cp1252 does _not_ contain those symbols. Vim is simply omitting the
              conversion, which happens to work for these characters. The symbols are
              in cp437.
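The difference between converting and omitting the conversion can be sketched in Python. This is a rough model of the byte-level effect only, under the assumption that Python's codecs match the code pages discussed; it is not a description of Vim's internals.

```python
SQRT = "\u221a"  # SQUARE ROOT

# cp1252 has no slot for the square root sign at all.
try:
    SQRT.encode("cp1252")
    in_cp1252 = True
except UnicodeEncodeError:
    in_cp1252 = False

# With 'encoding' latin1, byte 0xFB in the file is taken as u-circumflex.
ch = bytes([0xFB]).decode("latin-1")

# Converting for a cp437 terminal re-encodes that character; in cp437
# u-circumflex lives at 0x96, so the console shows u-circumflex.
converted = ch.encode("cp437")

# Omitting the conversion passes 0xFB through untouched, and a cp437
# console draws byte 0xFB as the square root sign.
passthrough = bytes([0xFB])

print(in_cp1252, converted.hex(), passthrough.decode("cp437") == SQRT)
```

So the symbol shows up only because the raw byte reaches a cp437 console unchanged, not because cp1252 contains it.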

              You could use cp437 as your normal encoding, but be warned that most
              files these days are latin1 (or cp1252) and then those won't show up
              correctly. Using "utf-8" for 'encoding' and making sure the files are
              read with the real encoding would be the only way to use both.
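Why Unicode allows both kinds of file to be used together can be shown with a short Python sketch (an illustration only; the variable names are made up). The same byte value is decoded with each file's real encoding, and both results then coexist in one UTF-8 string.

```python
# The same byte value means different things in the two legacy encodings.
dos_bytes = bytes([0xFB])    # square root sign in a cp437 file
latin_bytes = bytes([0xFB])  # u-circumflex in a latin1 file

# Decode each with its real encoding, then keep everything as Unicode.
combined = dos_bytes.decode("cp437") + latin_bytes.decode("latin-1")

# UTF-8 can store both characters side by side.
utf8 = combined.encode("utf-8")
print(ascii(combined), utf8.hex())
```

Reading either file with the wrong encoding would silently turn one character into the other, which is the warning given above.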

              --
              Apparently, 1 in 5 people in the world are Chinese. And there are 5
              people in my family, so it must be one of them. It's either my mum
              or my dad. Or my older brother Colin. Or my younger brother
              Ho-Cha-Chu. But I think it's Colin.

            • Antoine J. Mechelynck
              Message 6 of 13 , Jun 2, 2004
                Craig Barkhouse <cabarkho@...> wrote:
                [...]
                > OK, by "ASCII art" I think I do mean what you call "DOS characters" --
                > extended ASCII characters in the 128-255 range. These include special
                > graphic symbols as well as certain accented characters. For example,
                > 251 is a square root symbol (√). If you're at a Windows cmd prompt
                > you should be able to press Alt+251 (numeric keypad) to make the
                > symbol appear. You can echo the character to a file to do a little
                > test. Edit the file with VIM, and with tenc=cp437 the character
                > appears as a u-circumflex. With tenc= or tenc=cp1252, it appears as
                > a square root symbol. In GVIM, no matter what tenc is set to, the
                > character appears as a square root symbol.
                >
                > Now, I have some extended ASCII (128-255) symbols in my .vimrc, and I
                > just want them to display properly (as symbols, not accented
                > characters as in Latin-1) on screen.
                >
                > Curiously, when I set my codepage to 1252 and start VIM, tenc is not
                > "cp1252" but rather is empty. cp1252 looks like it has all the
                > special symbols I'm looking for.

                Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do ":view
                ++enc=cp1252 upascii.txt" (the file I sent with my previous post), then
                "setl fenc?" the answer is "cp437"... Apparently, when told to use code page
                1252, gvim uses 437 instead.

                Regards,
                Tony.
              • Antoine J. Mechelynck
                Message 7 of 13 , Jun 2, 2004
                  Antoine J. Mechelynck <antoine.mechelynck@...> wrote:
                  [...]
                  > Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do
                  > ":view ++enc=cp1252 upascii.txt" (the file I sent with my previous
                  > post), then "setl fenc?" the answer is "cp437"... Apparently, when
                  > told to use code page 1252, gvim uses 437 instead.
                  >
                  > Regards,
                  > Tony.

                  Oops... My bad. If that same file is open in cp437 in a different window,
                  vim disregards the ++enc option. But if I close all windows but one, then
                  ":view ++enc=cp1252" will show true 1252 (just like in W98 Notepad) without
                  the lines and blocks.

                  Regards,
                  Tony.

                  PS. Bug or feature?
                • Bram Moolenaar
                  Message 8 of 13 , Jun 3, 2004
                    Antoine J. Mechelynck wrote:

                    > Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do
                    > ":view ++enc=cp1252 upascii.txt" (the file I sent with my previous
                    > post), then "setl fenc?" the answer is "cp437"... Apparently, when
                    > told to use code page 1252, gvim uses 437 instead.

                    If you have already opened a file, then another edit command will use
                    the same buffer. You can't have two buffers for the same file (that
                    would be very confusing).

                    Perhaps a warning about the "++enc" argument being ignored would be
                    appropriate? It's confusing, because when you do the same command again
                    while already editing the buffer, the buffer _is_ reloaded. That's
                    because you indicate you want to re-edit the file.
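The behaviour described above can be pictured with a toy model in Python. This is purely a sketch of the description in this message: the class `ToyEditor` and its fields are hypothetical and do not reflect how Vim is actually implemented.

```python
class ToyEditor:
    """Toy model of the described behaviour; not Vim's real implementation."""

    def __init__(self):
        self.buffers = {}    # file name -> encoding used when it was loaded
        self.current = None  # name of the buffer in the current window

    def view(self, name, enc):
        """Model ':view ++enc={enc} {name}' and return the encoding in effect."""
        if name in self.buffers and name != self.current:
            # The file is already loaded elsewhere: the existing buffer is
            # reused and the requested encoding is ignored.
            self.current = name
        else:
            # A new file, or a re-edit of the current buffer: (re)load it.
            self.buffers[name] = enc
            self.current = name
        return self.buffers[name]


ed = ToyEditor()
print(ed.view("upascii.txt", "cp437"))   # first load
print(ed.view("other.txt", "latin1"))    # move to a different buffer
print(ed.view("upascii.txt", "cp1252"))  # existing buffer reused
print(ed.view("upascii.txt", "cp1252"))  # same command again: reloaded
```

In this model the third call still reports cp437, while repeating the command reloads the buffer with cp1252.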

                    Another possibility would be to reload the file, so that other windows
                    show the buffer also with the new encoding. This is then only to be
                    done if the "++enc" argument differs from 'fileencoding'.

                    --
                    This planet has -- or rather had -- a problem, which was this: most
                    of the people living on it were unhappy for pretty much of the time.
                    Many solutions were suggested for this problem, but most of these
                    were largely concerned with the movements of small green pieces of
                    paper, which is odd because on the whole it wasn't the small green
                    pieces of paper that were unhappy.
                    -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"

                  • Antoine J. Mechelynck
                    Message 9 of 13 , Jun 3, 2004
                      Bram Moolenaar <Bram@...> wrote:
                      > Antoine J. Mechelynck wrote:
                      >
                      > > Are you sure it's 1252? When I set 'enc' to "utf-8" in gvim, then do
                      > > ":view ++enc=cp1252 upascii.txt" (the file I sent with my previous
                      > > post), then "setl fenc?" the answer is "cp437"... Apparently, when
                      > > told to use code page 1252, gvim uses 437 instead.
                      >
                      > If you have already opened a file, then another edit command will use
                      > the same buffer. You can't have two buffers for the same file (that
                      > would be very confusing).

                      Hmm. As long as I don't hide buffers, I don't feel the confusion; but I
                      suppose it would be confusing to Vim, inasmuch as the "identity" of a buffer
                      lies in its name. I just thought of displaying, how shall I call it? ("view"
                      is already a Vim technical term) two different "aspects" of a file in a
                      single instance of gvim, which can be done in other circumstances, like
                      left-to-right vs. right-to-left, and IIRC folded vs. unfolded. Well in any
                      case, the workaround isn't hard to find (use ":saveas" to copy the file,
                      giving the buffer a different name; then reopen the original file in
                      a split window with different options).
                      >
                      > Perhaps a warning about the "++enc" argument being ignored would be
                      > appropriate? It's confusing, because when you do the same command
                      > again while already editing the buffer, the buffer _is_ reloaded.
                      > That's because you indicate you want to re-edit the file.

                      That is obvious in retrospect.
                      >
                      > Another possibility would be to reload the file, so that other windows
                      > show the buffer also with the new encoding. This is then only to be
                      > done if the "++enc" argument differs from 'fileencoding'.

                      I suppose this latter option can certainly be used if the buffer in question
                      is not displayed in a window other than the one where the file is to be
                      newly displayed (e.g., if it's hidden). What to do when the buffer /is/
                      displayed in another window is less clear to me. If the current behaviour is
                      kept unchanged, then I suppose a note in the help (under ++opt and maybe
                      elsewhere) would be in order. But maybe reloading the file (with the usual
                      caveats about "abandon") would be the thing to do -- less confusing, as you
                      say.

                      Best regards,
                      Tony.