675 Re: windows and unicode filenames, etc.
Aug 5, 2002

Glenn Maynard wrote:
> On Mon, Aug 05, 2002 at 09:09:38PM +0200, Bram Moolenaar wrote:
> > > Editing files with Unicode in the filename that don't fit in the ANSI
> > > codepage doesn't work. Fixed, except for the browser, and except for
> > > renaming (since I don't really want to go near win32's mch_rename, but
> > > it does need fixing.)
> > I thought it did work for some DBCS encodings. I did include patches
> > for this in the past.
> If encoding is set to the current codepage, it'll work: the paths are
> being sent directly to the system routines, unedited, and that's what
> the *A (ANSI) versions expect--the ANSI codepage.
> If encoding is set to anything else (including Unicode), it'll only work
> for ASCII, and will probably do something nonsensical for anything else.
> If encoding is set to the current codepage, it's impossible to represent
> filenames that don't fit in that codepage, too. (I can't edit files
> with Japanese in the filename, since my codepage is US.)

I think it so far only worked for text in the system codepage. When
setting 'encoding' to something else I would guess we don't convert,
thus you end up with nonsense. Converting the title to Unicode should
work (if the wide version of the function is available, might not be
true on Win 9x).

> > This has a big drawback: for DBCS codes finding the start of a character
> > is complicated and slow. Don't want to use the same code for single
> > byte encodings. There are quite a few other places where DBCS is
> > handled much slower.
> > Isn't it easier to ignore enc_dbcs where the code needs to be used for
> > both encodings?
> Well, I need to be able to know the codepage if encoding is set to one.
> This is easy if encoding=cp932, for example, but it's less easy if it's
> "2byte-cp932" or something like that.

Ah, you are running into the problem that enc_dbcs is both used as a
flag that DBCS encoding is being used and the number of the codepage
used for 'encoding'. We could separate the two to avoid confusion.
Introduce enc_codepage perhaps?
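
To illustrate the quoted remark that finding the start of a character is slow for DBCS: a trail byte can have the same value as an ASCII or lead byte, so a single byte cannot tell you where its character begins, and the code has to walk from the start of the line. The following is a minimal sketch, not Vim's actual code. The function names are hypothetical, and the cp932 lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are assumed for the example.

```c
#include <stddef.h>

/* Assumed cp932 lead-byte ranges; other DBCS codepages use other ranges. */
static int sketch_cp932_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/*
 * Return the offset of the first byte of the character that contains
 * byte `pos` in line[0..len).  The scan always begins at offset 0,
 * which is what makes this O(pos) instead of O(1).
 */
static size_t sketch_dbcs_char_start(const unsigned char *line,
                                     size_t len, size_t pos)
{
    size_t i = 0;

    while (i < len) {
        /* A lead byte followed by another byte forms a 2-byte character. */
        size_t n = (sketch_cp932_is_lead(line[i]) && i + 1 < len) ? 2 : 1;

        if (pos < i + n)
            return i;
        i += n;
    }
    return pos; /* pos is past the end: nothing to adjust */
}
```

For a single-byte encoding the answer is always `pos` itself, which is why sharing one code path between the two kinds of encoding is unattractive.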

> Perhaps there should be a single function, win_get_penc_codepage(),
> which does all of that parsing and returns the codepage (or -1 if it's
> not a codepage)?

Since 'encoding' doesn't change very often this could be done once and
stored in a global variable, just like enc_utf8 and enc_dbcs.
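
A rough sketch of how that could look, under assumptions: the helper names below are hypothetical (this is not the proposed win_get_penc_codepage() itself), only the two spellings mentioned in this thread ("cp932" and "2byte-cp932") are recognized, and the encoding name is assumed to be lowercase already.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache in the spirit of enc_utf8 and enc_dbcs: the codepage
 * number named by 'encoding', or -1 when it does not name a codepage. */
static int enc_codepage = -1;

/* Parse "cp<digits>" or "2byte-cp<digits>".
 * Returns the codepage number, or -1 for anything else. */
static int sketch_encoding_to_codepage(const char *enc)
{
    const char *p = enc;
    char *end;
    long n;

    if (strncmp(p, "2byte-", 6) == 0)
        p += 6;
    if (strncmp(p, "cp", 2) != 0 || !isdigit((unsigned char)p[2]))
        return -1;
    n = strtol(p + 2, &end, 10);
    if (*end != '\0' || n > 65535)
        return -1; /* trailing junk or an implausibly large number */
    return (int)n;
}

/* Meant to be called once whenever 'encoding' is set, so that all other
 * code reads the cached global instead of re-parsing the name. */
static void sketch_encoding_changed(const char *enc)
{
    enc_codepage = sketch_encoding_to_codepage(enc);
}
```

With this, a caller would compare `enc_codepage` against the system codepage directly, and a value of -1 (for example with 'encoding' set to "utf-8") never matches.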

> Also, the is_funky_dbcs code in the win32 renderer should use this, too,
> since it needs to do the same thing. (Render with Unicode conversion if
> win_get_penc_codepage() != GetACP(); then is_funky_dbcs can probably go
> away, too, since nothing else uses it.)

If GetACP() is really fast, then is_funky_dbcs becomes obsolete.
Otherwise, I thought you were planning to rename it anyway.

> > > I'll probably revert removing the broken Korean stuff and just comment out
> > > the call for now; I doubt it's needed, but it's not important.
> > Still didn't find someone who can tell when the code is really needed?
> Can you contact the person named in the code? I can't find him in the
> archives at all. I still suspect it's no longer needed, due to the
> newer IME fixes, and the Korean IME does work for me, but I don't know
> about e.g. older Korean IMEs from 9x. All that code does is poll the IME
> when the cursor blinks, and prints whatever's in there on the cursor;
> since the IME displays the character automatically, there's no need for
> this. (But before the new IME code, this may not have worked.)

I last received a message from Sung-Hoon Baek in 1998...
Hopefully another Korean can help us here! Namsh?

> I don't know about the weird fake-backslash code. I can see why it was
> wanted: MS Korean fonts actually do apparently have a Yen sign on \, which
> I'd imagine Korean users might not want. If you want, I can try to make
> this code work for now, and add an option for this. (I think it should
> be replaced completely at some point, as I've mentioned, but I don't
> expect to have that ready soon, since I need to figure out how to retrofit
> that without being overly intrusive. Also, since it's a nontrivial
> block of code, I'd much rather wait until the current stuff is settled,
> or the diff is going to get unmanageable and there'll be too much to test

Even though your reasons sound sensible, I'm a bit careful about
throwing out code that nobody complained about.
"A clear conscience is usually the sign of a bad memory."
-- Steven Wright
/// Bram Moolenaar -- Bram@... -- http://www.moolenaar.net \\\
/// Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim \\\
\\\ Project leader for A-A-P -- http://www.a-a-p.org ///
\\\ Lord Of The Rings helps Uganda - http://iccf-holland.org/lotr.html ///