Re: Trouble getting started with vim and utf-8 file
Apr 8, 2011
On 08/04/11 07:33, DanKegel wrote:
> The file http://winetricks.org/winetricks is, I hope, a utf-8 file,
> but is not recognized as such in the vim that comes
> with ubuntu 11.04 (with german locale, even).
> It's mostly ascii, with just a few non-ascii lines, e.g.
> # If you do not see an o with two dots over it here [ö], stop!
> Electronic Arts/The Sims Medieval/The Sims™ Medieval.desktop"
> That first line contains an o umlaut, and the second line contains the
> trademark symbol.
> Opening the file with vi winetricks shows
> # If you do not see an o with two dots over it here [Ã¶], stop!
> Electronic Arts/The Sims Medieval/The Simsâ<84>¢ Medieval.desktop"
> which isn't right. Just opening up vi with no arguments, and doing
> !!cat winetricks
> brings the file in great, and the utf-8 chars look good, but then
> saving it complains
> "winetricks" CONVERSION ERROR in line 12328; 14640 lines, 496509
> characters written
> and yields a very corrupt file.
> So what's going on? It seems that vim has decided the file Is Not
> UTF-8. :se shows
> even if I put
> set encoding=utf8 fileencoding=utf8
> in ~/.vimrc.

I've downloaded that file in my browser, then tried to open it in Vim,
which does not see it as UTF-8 even though I have 'enc' set to utf-8 and
'fencs' set to ucs-bom,utf-8,latin1.
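
For anyone who wants to see why a fallback list like utf-8,latin1 ends up on latin1, here is a rough Python sketch of the idea: try each encoding in order and keep the first one that decodes the whole buffer without error. This is only an illustration, not Vim's actual code; the function name detect and the sample bytes are made up, and the BOM check that ucs-bom performs is left out.

```python
# Rough emulation of an ordered encoding fallback list.
# Assumption: "first encoding that decodes cleanly wins"; BOM detection omitted.

def detect(data: bytes, candidates=("utf-8", "latin-1")) -> str:
    """Return the first candidate encoding that decodes data without error."""
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding fits")


# Hypothetical sample data: one valid UTF-8 line, then one stray Latin-1 byte.
good = "stop [ö]\n".encode("utf-8")
bad = good + b"5 \xb5s\n"

print(detect(good))             # utf-8
print(detect(bad))              # latin-1, because 0xB5 alone is not valid UTF-8
print(bad.decode(detect(bad)))  # the o-umlaut now shows up as two characters
```

Since Latin-1 assigns a character to every byte value, it never fails, which is why it usually sits last in such a list.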
Intrigued, I hit 8g8 which brings me to line 7388 column 11 where the
character µ ("micro" prefix, similar to Greek mu, 0xB5) cannot be UTF-8
(bytes in the range 0x80 to 0xBF can only exist in UTF-8 as "trailing
bytes" in a multibyte sequence whose first byte is in the range 0xC2 to 0xF4).
Moving the cursor one position right and repeating gives me only a beep,
so this is AFAICT the only illegal character in the file -- but one
illegal byte in the whole file is enough to reject UTF-8 as the file's encoding.
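
Outside Vim, the same kind of check can be sketched in a few lines of Python: decode strictly as UTF-8 and, on failure, report where the first offending byte sits. The function name first_invalid_utf8 and the sample data are my own for illustration, and the column here is a 1-based byte offset within the line, which need not match the column Vim displays.

```python
def first_invalid_utf8(data: bytes):
    """Return (line, byte_column, byte) for the first byte that breaks
    UTF-8 decoding, or None if the whole buffer is valid UTF-8."""
    try:
        data.decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        offset = err.start                              # index of the bad byte
        line = data.count(b"\n", 0, offset) + 1         # 1-based line number
        line_start = data.rfind(b"\n", 0, offset) + 1   # 0 if on the first line
        column = offset - line_start + 1                # 1-based byte column
        return line, column, data[offset]


# Hypothetical sample: a valid UTF-8 line followed by a lone 0xB5 byte.
sample = "x [ö]\n".encode("utf-8") + b"10 \xb5s\n"
print(first_invalid_utf8(sample))
```

To try it on a real file, read it in binary mode, e.g. open("winetricks", "rb").read(), and pass the bytes in.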
Rereading the file with
reads it as UTF-8 at the cost of an error message about line 7388, where
the µ is now replaced by a question mark (but the o-umlaut at line 71
appears as ö).
It seems that your file is in UTF-8 at line 71 but in Latin1 at line
7388, which means that it is the file's fault, not Vim's fault, that
such a file cannot be displayed correctly.
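
One possible way to repair such a mixed file, sketched in Python under the assumption that every line is either entirely valid UTF-8 or entirely Latin-1: decode line by line, fall back to Latin-1 only for the lines that fail, and write everything back out as UTF-8. The name decode_mixed and the sample bytes are hypothetical.

```python
def decode_mixed(data: bytes) -> str:
    """Decode each line as UTF-8, falling back to Latin-1 for lines that fail.
    Assumption: a failing line is entirely Latin-1, not a mix of both."""
    out = []
    for raw in data.splitlines(keepends=True):
        try:
            out.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            out.append(raw.decode("latin-1"))
    return "".join(out)


# Hypothetical sample: one UTF-8 line and one Latin-1 line in the same file.
mixed = "stop [ö]\n".encode("utf-8") + "5 µs\n".encode("latin-1")
fixed = decode_mixed(mixed)
clean = fixed.encode("utf-8")   # consistent UTF-8, safe to write back
print(fixed)
```

A line that contains both a UTF-8 sequence and a stray Latin-1 byte would be decoded wholly as Latin-1 by this sketch, so the result is worth checking by eye before overwriting the original.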
Never hit a man with glasses. Hit him with a baseball bat.