Trouble getting started with vim and utf-8 file

  • DanKegel
    Message 1 of 6 , Apr 7, 2011
      The file http://winetricks.org/winetricks is, I hope, a utf-8 file,
      but is not recognized as such in the vim that comes
      with ubuntu 11.04 (with german locale, even).
      It's mostly ascii, with just a few non-ascii lines, e.g.

      # If you do not see an o with two dots over it here [ö], stop!
      ...
      mymenu="$HOME/.local/share/applications/wine/Programs/
      Electronic Arts/The Sims Medieval/The Sims™ Medieval.desktop"

      That first line contains an o umlaut, and the second line contains the
      trademark symbol.

      Opening the file with vi winetricks shows

      # If you do not see an o with two dots over it here [ö], stop!
      ...
      mymenu="$HOME/.local/share/applications/wine/Programs/
      Electronic Arts/The Sims Medieval/The Simsâ<84>¢ Medieval.desktop"

      which isn't right. Just opening up vi with no arguments, and doing
      !!cat winetricks
      brings the file in great, and the utf-8 chars look good, but then
      saving it complains
      "winetricks" CONVERSION ERROR in line 12328; 14640 lines, 496509
      characters written
      and yields a very corrupt file.

      So what's going on? It seems that vim has decided the file Is Not
      UTF-8. :se shows
      fileencoding=latin1
      fileencodings=ucs-bom,utf-8,default,latin1
      even if I put
      set encoding=utf8 fileencoding=utf8
      in ~/.vimrc.
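
The `â<84>¢` shown in place of ™ is consistent with the three UTF-8 bytes of the trademark sign each being read as a separate Latin-1 character. A minimal Python sketch of that effect follows; the variable names are illustrative only, and this mimics the symptom rather than Vim's internals.

```python
# The trademark sign is one character but three bytes in UTF-8.
tm_bytes = "™".encode("utf-8")
print(" ".join(f"{b:02x}" for b in tm_bytes))

# Reading those same bytes as Latin-1 yields three unrelated characters:
# U+00E2 (a with circumflex), U+0084 (a control character, which Vim
# displays as <84>), and U+00A2 (cent sign).
misread = tm_bytes.decode("latin-1")
print([f"U+{ord(ch):04X}" for ch in misread])
```

The middle character is unprintable, which is why it appears as `<84>` rather than as a glyph.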

      Help...

      Thanks,
      Dan

      --
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php
    • Aleksey
      Message 2 of 6 , Apr 8, 2011
        Here's what I've found.

        Opening this file in gVim doesn't display it correctly. The detected
        encoding is cp1251 (with my configuration).

        Issuing this command

        :e ++enc=utf-8

        worked and displayed the TM symbol, but it also gave a warning about
        an illegal byte at line 7388, which looked like this:

        title="?Torrent 3.0" \

        The previous section had µ, so I just replaced the illegal character
        with it.

        After that, saving the file and opening it from the command line works
        fine, with the encoding detected correctly.

        This doesn't answer your question; it's just a workaround.

      • Tony Mechelynck
        Message 3 of 6 , Apr 8, 2011
          I've downloaded that file in my browser, then tried to open it in Vim,
          which does not see it as UTF-8 even though I have 'enc' set to utf-8 and
          'fencs' set to ucs-bom,utf-8,latin1

          Intrigued, I hit 8g8 which brings me to line 7388 column 11 where the
          character µ ("micro" prefix, similar to Greek mu, 0xB5) cannot be UTF-8
          (bytes in the range 0x80 to 0xBF can only exist in UTF-8 as "trailing
          bytes" in a multibyte sequence whose first byte is 0xC0 or higher).
          Moving the cursor one position right and repeating gives me only a beep,
          so this is AFAICT the only illegal character in the file -- but one
          illegal byte in the whole file is enough to reject UTF-8 as the file's
          'fileencoding'.
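
The all-or-nothing rejection described above can be sketched in Python as a first-fit search over candidate encodings. This is a simplified illustration and not Vim's actual implementation: `detect_encoding` is a hypothetical helper, and the `ucs-bom` and `default` entries of 'fileencodings' are left out.

```python
def detect_encoding(data: bytes, candidates=("utf-8", "latin-1")) -> str:
    """Return the first candidate that decodes the whole input without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # a single bad byte rejects this candidate entirely
        return name
    raise ValueError("no candidate encoding fits")


# Made-up sample data: valid UTF-8, then the same text plus one Latin-1 byte.
clean = "# [ö] The Sims™\n".encode("utf-8")
mixed = clean + b'title="\xb5Torrent 3.0" \\\n'

print(detect_encoding(clean))
print(detect_encoding(mixed))
```

Because Latin-1 assigns a character to every byte value, it never fails, so it ends up as the result whenever an earlier candidate is rejected.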

          Rereading the file with

          :view ++enc=utf-8

          reads it as UTF-8 at the cost of an error message about line 7388, where
          the µ is now replaced by a question mark (but the o-umlaut at line 71
          appears as ö).

          It seems that your file is in UTF-8 at line 71 but in Latin1 at line
          7388, which means that it is the file's fault, not Vim's fault, that
          such a file cannot be displayed correctly.
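
For a file that mixes the two encodings line by line, one possible repair is to decode each line as UTF-8 and fall back to Latin-1 only for lines that fail. The sketch below assumes each line is entirely in one encoding; `decode_mixed` is a hypothetical helper, not something from this thread. A line containing both encodings would be read wholly as Latin-1 and its UTF-8 characters would come out garbled.

```python
def decode_mixed(data: bytes) -> str:
    """Decode line by line: UTF-8 where valid, Latin-1 otherwise."""
    lines = []
    for raw in data.split(b"\n"):
        try:
            lines.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Assumption: a line that is not valid UTF-8 is Latin-1.
            lines.append(raw.decode("latin-1"))
    return "\n".join(lines)


# Made-up sample: one UTF-8 line followed by one Latin-1 line.
sample = "[ö]\n".encode("utf-8") + b'title="\xb5Torrent 3.0" \\\n'
text = decode_mixed(sample)
repaired = text.encode("utf-8")  # now consistently UTF-8
print(text)
```

Writing `repaired` back to disk would give a file that decodes cleanly as UTF-8 from start to finish.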

          See
          :help 8g8
          :help ++opt


          Best regards,
          Tony.
          --
          Never hit a man with glasses. Hit him with a baseball bat.

        • Dan Kegel
          Message 4 of 6 , Apr 8, 2011
            Thanks very much, guys!

          • John Beckett
            Message 5 of 6 , Apr 8, 2011
              DanKegel wrote:
              > The file http://winetricks.org/winetricks is, I hope, a utf-8
              > file, but is not recognized as such in the vim that comes
              > with ubuntu 11.04 (with german locale, even).

              It looks like you created that file, so you need to fix it
              because it is not UTF-8.

              Downloading the file with wget and dumping the bytes shows that
              the character which I have shown as "?" in the following is not
              valid UTF-8:
              title="?Torrent 3.0" \

              That single byte is hex B5 or binary 10110101. That starts with
              "10" which is never valid as the first byte of a character in
              UTF-8.

              BTW you can find that in Vim by opening the file and typing 8g8
              which jumps to the next illegal byte sequence, then typing ga to
              show the value.
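
Outside Vim, the same check can be sketched in Python: attempt a strict UTF-8 decode and report where it first fails, then test the offending byte's top two bits. The helper names are illustrative, and the sample bytes below are made up to resemble the line quoted above rather than read from the real file.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


def is_continuation_byte(value: int) -> bool:
    """True if the byte has the form 10xxxxxx (a UTF-8 trailing byte)."""
    return (value & 0xC0) == 0x80


sample = b'title="\xb5Torrent 3.0" \\\n'
offset, value = first_invalid_utf8(sample)
print(offset, hex(value), format(value, "08b"), is_continuation_byte(value))
```

A byte of the form 10xxxxxx is only valid after a lead byte, so one that directly follows an ASCII character (here the double quote) cannot be part of a UTF-8 sequence.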

              John

            • Dan Kegel
              Message 6 of 6 , Apr 8, 2011
                Yeah, that's what I gathered from the other replies.
                Thanks!
                - Dan
