658 Re: mbyte.c patch
Jul 14, 2002
Glenn Maynard wrote:
> On Sun, Jul 14, 2002 at 12:51:38PM +0200, Bram Moolenaar wrote:
> > Question: You changed UTF-16 to UCS-2 in several places. Are you sure
> > this is correct? I thought that MS-Windows does use UTF-16.
> Hmm. You're probably right; thinking about it, I'm really not sure.
> > For the
> > code it doesn't really matter, I suppose, since it's still using
> > two-byte words.
> For this code, it shouldn't, since it's only used for the IME which
> probably won't generate anything that'll translate outside of the BMP.
> (Actually, is there anything at all within the regular Windows codepages
> that's outside the BMP?) Still, it should be commented correctly to
> avoid confusion later on.

As far as I know, MS-Windows only supports UCS-2 so far, but since they
finally discovered that 16 bits is not enough (just like 640 Kbyte wasn't
enough! :-), they found the UTF-16 hack to work around it. A really
ugly solution compared to UTF-8. I don't know how much of MS-Windows
currently actually supports UTF-16.
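
For readers unfamiliar with the distinction, here is a minimal sketch of how UTF-16 extends UCS-2 with surrogate pairs. It is not taken from the Vim source; the function name utf16_encode is purely illustrative. Inside the BMP the two encodings produce the same single 16-bit word, which is why the naming does not affect code that only handles BMP characters.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name, not part of Vim):
 * encode one Unicode code point as UTF-16.
 * Writes 1 or 2 16-bit units to out[] and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value.
 */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                   /* reserved for surrogates */
    if (cp > 0x10FFFF)
        return 0;                   /* beyond the Unicode range */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;      /* BMP: same single word as UCS-2 */
        return 1;
    }
    cp -= 0x10000;                  /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

A code point such as U+1D11E needs two words here, which plain UCS-2 cannot represent at all.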

> Also, the code setting up the IME converters:
> convert_setup(&ime_conv, "ucs-2", p_enc);
> ime_conv_cp.vc_type = CONV_DBCS_TO_UCS2;
> ime_conv_cp.vc_dbcs = GetACP();
> ime_conv_cp.vc_factor = 2; /* we don't really know anything about the codepage */
> This works if p_enc is Unicode or latin1 (due to the special cases), but
> I don't think it'll work if it's anything else (ie. cp932), since it'll
> fall back on iconv and that'll force "UCS-2" to "UTF-8".

Hmm, the call to iconv probably has to be fixed for this. Sounds like
we need an extra flag in vimconv_T that indicates if any Unicode is
handled as UTF-8 or not. Perhaps this could also be handled when
filling vimconv_T, we don't need the flag then.

> Now, setting encoding to anything but UTF-8 or latin1 currently doesn't
> work anyway: it doesn't render correctly. Is that intended to work?
> If not, encoding should probably reject other settings. It'd simplify
> a lot of things if the internal encoding was always UTF-8 in Windows.
> I think you mentioned this idea before.

Setting 'encoding' to some Asian codepage should certainly work. Korean
and Japanese users couldn't work without this.
Unfortunately we can't drop all the other encodings and use only UTF-8:
conversion from/to Unicode will not always be possible. There is the
famous yen vs backslash problem, for example.
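
To illustrate the yen vs backslash problem, here is a simplified sketch, not Vim code. The names sjis_flavor, decode_low_byte and encode_low_byte are hypothetical, and only byte 0x5C is modeled. It assumes the usual two readings of that byte: JIS X 0201 defines it as the yen sign (U+00A5), while the Microsoft cp932 mapping table sends it to the backslash (U+005C).

```c
#include <stdint.h>

/* Two conventions for the single-byte range of Shift-JIS-style encodings. */
enum sjis_flavor { FLAVOR_JISX0201, FLAVOR_CP932 };

/*
 * Decode one byte below 0x80 to a Unicode code point.
 * Simplification: only 0x5C is treated specially; every other byte is
 * passed through as if it were ASCII.
 */
static uint32_t decode_low_byte(uint8_t b, enum sjis_flavor f)
{
    if (b == 0x5C && f == FLAVOR_JISX0201)
        return 0x00A5;              /* YEN SIGN */
    return b;                       /* 0x5C becomes U+005C under cp932 */
}

/*
 * Encode a code point back to a single byte under a strict mapping.
 * Returns -1 when the flavor has no byte for that code point.
 */
static int encode_low_byte(uint32_t cp, enum sjis_flavor f)
{
    if (cp == 0x00A5)
        return f == FLAVOR_JISX0201 ? 0x5C : -1;
    if (cp == 0x005C)
        return f == FLAVOR_CP932 ? 0x5C : -1;
    if (cp < 0x80)
        return (int)cp;
    return -1;
}
```

The same byte decodes to two different characters depending on the convention, and a strict table for one convention cannot encode the character produced by the other, so a round trip through Unicode is not guaranteed to give back the original text.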
BLACK KNIGHT: I'm invincible!
ARTHUR: You're a looney.
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD
/// Bram Moolenaar -- Bram@... -- http://www.moolenaar.net \\\
/// Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim \\\
\\\ Project leader for A-A-P -- http://www.a-a-p.org ///
\\\ Help me helping AIDS orphans in Uganda - http://iccf-holland.org ///