Re: Unicode conversion bug?
May 5, 2008
On 06/05/08 04:58, T.P.S.Nakagawa wrote:
> Sorry, Tony.
> But I am pleased to report the next thing about this problem.
> 2008-05-05 23:48 (JST), Tony Mechelynck sent the following message:
> > If what you said above is exact, it's a Notepad bug: a UTF-8 BOM is
> > three bytes, a UTF-16 BOM (also used for UCS-2) is two bytes, a UTF-32
> Oh yes. I deleted 2 bytes, which were displayed in the Unix UTF-8 console.
> But as shown by the "od -xc" command, Notepad attaches 3 bytes of BOM. Sorry.
> Then, I report on this problem in more depth.
> When Vim reads UTF-8 + BOM and 'fileencodings' is set, it always displays as UTF-8,
> even on the Japanese version of Windows (which must display cp932)
> and on a Unix console set to ja_JP.eucJP.

If your 'fileencodings' starts with "ucs-bom", Vim ought to detect
correctly any Unicode encoding when there is a BOM without interfering
with the detection of other encodings, unless they may start with one or
more of the following codes and contain not a single invalid byte (or
invalid sequence of bytes) for the corresponding Unicode encoding (I
know that many combinations of bytes higher than 0x7F are invalid in
UTF-8; I'm less sure about the others):
    EF BB BF       UTF-8
    FE FF          UTF-16be
    FF FE          UTF-16le
    00 00 FE FF    UTF-32be
    FF FE 00 00    UTF-32le
Notice that Vim (and any other program with BOM detection) may "guess
wrong" if a file in UTF-16le with BOM starts with a NUL character: its
first four bytes, FF FE 00 00, are then identical to the UTF-32le BOM.
But I suppose that such a case is so rare it may be safely ignored.
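
To make the table and the caveat above concrete, here is a minimal sketch of BOM sniffing in Python. It is not Vim's actual implementation; the names BOMS and detect_bom are made up for illustration, and checking the four-byte marks before the two-byte ones is an assumption of this sketch.

```python
# Hypothetical helper, not taken from Vim's source.
# Longer BOMs are listed first so that FF FE 00 00 is tried as UTF-32le
# before FF FE is tried as UTF-16le (an assumption of this sketch).
BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32be"),
    (b"\xff\xfe\x00\x00", "utf-32le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16be"),
    (b"\xff\xfe", "utf-16le"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# A UTF-16le file whose first character is NUL begins FF FE 00 00,
# which is byte-for-byte the UTF-32le BOM, so the sniffer "guesses wrong".
ambiguous = b"\xff\xfe" + "\x00A".encode("utf-16le")
print(detect_bom(b"\xef\xbb\xbfabc"))
print(detect_bom(ambiguous))
```

The second call reports utf-32le even though the data was written as UTF-16le, which is the rare misdetection mentioned above.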
- Even if editing cp932 files, you may set 'encoding' to utf-8
- In GUI mode, anything that 'encoding' can represent, can be displayed
if your 'guifont' has a glyph for it. Characters for which your
'guifont' has no glyph may be represented by a "placeholder" question
mark or hollow box etc.; but if you use the GTK2 GUI (X11 only, thus not
on Windows) it may, in some cases, be clever enough to find an
appropriate glyph in a different font.
- Even if your terminal display is set to accept cp932 output, you may
still set 'encoding' to utf-8 in Console mode if 'termencoding' is set
to cp932, but of course in that case if you edit Unicode (or other
non-cp932) files containing characters which cannot be represented in
cp932, you will get a "placeholder" display (possibly a question mark or
a hollow box) at that position even though the actual contents of the
file are correct.
- The above applies also, of course, with "cp932" replaced everywhere by
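
The "placeholder" behaviour in the Console-mode item above can be imitated outside Vim. The following Python sketch is only an analogy: render_for_terminal is a hypothetical name, and Python's "replace" error handler stands in for whatever substitution Vim performs when converting from 'encoding' to 'termencoding'.

```python
def render_for_terminal(text: str, termencoding: str = "cp932") -> bytes:
    """Convert text for display, substituting "?" for anything the
    terminal encoding cannot represent (hypothetical helper)."""
    return text.encode(termencoding, errors="replace")


# Three kanji (representable in cp932), a space, and one Hangul
# syllable (not representable in cp932).
buffer_text = "\u65e5\u672c\u8a9e \ud55c"

shown = render_for_terminal(buffer_text)

# What the terminal would display: the Hangul syllable becomes "?".
print(shown.decode("cp932"))

# The text held in memory is untouched, so writing it out as UTF-8
# still preserves every character.
print(buffer_text.encode("utf-8").decode("utf-8") == buffer_text)
```

Only the displayed bytes lose information; the in-memory text, and therefore the file as written, stays intact.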
> That's the whole reason for the bad display.
> I read the sources for 1 hour, around the *p_fencs setting, but I fell asleep.
> It's hard to read part of a big source.

Yes, especially when you're lacking sleep. ;-)

> Best regards, yaemon.
> P.S. now, download page of libiconv is
> NAKAGAWA Tsuneo (a.k.a. yaemon ) mailto:yaemon@...
> Web site (Japanese only) http://www.kikansha.jp/~yaemon/
"The Good Ship Enterprise" (to the tune of "The Good Ship Lollipop")
On the good ship Enterprise
Every week there's a new surprise
Where the Romulans lurk
And the Klingons often go berserk.
Yes, the good ship Enterprise
There's excitement anywhere it flies
Where Tribbles play
And Nurse Chapel never gets her way.
See Captain Kirk standing on the bridge,
Mr. Spock is at his side.
The weekly menace, ooh-ooh
It gets fried, scattered far and wide.
It's the good ship Enterprise
Heading out where danger lies
And you live in dread
If you're wearing a shirt that's red.
-- Doris Robin and Karen Trimble of The L.A. Filkharmonics
You received this message from the "vim_multibyte" maillist.
For more information, visit http://www.vim.org/maillist.php