Re: Woe with MBCS File Names in UTF-8 Mode on Windows

  • Bram Moolenaar
    Message 1 of 13 , Jul 2, 2005
      Yongwei wrote:

      > I have finally found out the reason. The cause is the _fullpath (which
      > finally calls GetFullPathNameA) in mch_FullName. It is quite normal
      > that the non-Unicode Win32 API requires that file names should be
      > provided in native encoding.
      >
      > Non-DBCS-system users generally will not feel the problem since valid
      > UTF-8 code points are generally valid SBCS (say, Latin1) code points,
      > and 炜.txt will be regarded as code points |e7 82 9c 2e 74 78 74|. On
      > DBCS systems, |9c2e| is invalid and will become `?' (|3f|).
      >
      > To solve this problem, maybe Vim needs to provide its own version of
      > fullpath? Bram, what is your opinion?
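
The byte-level effect described above can be sketched in a few lines of C. This is only a simplified model, not what GetFullPathNameA actually does internally: the lead-byte and trail-byte ranges are assumed to be those of a GBK-like code page (lead 0x81..0xFE, trail 0x40..0xFE except 0x7F), and the function name dbcs_sanitize is hypothetical. It shows why the UTF-8 bytes for 炜.txt survive on an SBCS system, where every byte is a valid character, but lose the dot on a DBCS system.

```c
#include <stddef.h>

/* Assumed byte ranges for a GBK-like double-byte code page.
 * The real Windows tables are more detailed; this is only a model. */
static int is_lead_byte(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

static int is_trail_byte(unsigned char c)
{
    return c >= 0x40 && c <= 0xFE && c != 0x7F;
}

/* Hypothetical helper: copy n bytes from src to dst, reading them as
 * DBCS text.  A lead byte followed by an invalid trail byte is replaced,
 * pair and all, by '?', as the message describes for |9c2e|.
 * dst must have room for n + 1 bytes.  Returns the number of bytes
 * written, not counting the terminating NUL. */
size_t dbcs_sanitize(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i = 0;
    size_t o = 0;

    while (i < n) {
        unsigned char c = src[i];

        if (!is_lead_byte(c)) {
            /* Single-byte character: copy as-is. */
            dst[o++] = c;
            i += 1;
        } else if (i + 1 < n && is_trail_byte(src[i + 1])) {
            /* Valid double-byte character: copy both bytes. */
            dst[o++] = c;
            dst[o++] = src[i + 1];
            i += 2;
        } else {
            /* Invalid pair (or a lead byte at the very end): emit '?'. */
            dst[o++] = '?';
            i += (i + 1 < n) ? 2 : 1;
        }
    }
    dst[o] = '\0';
    return o;
}
```

Feeding it the seven bytes |e7 82 9c 2e 74 78 74| gives |e7 82 3f 74 78 74|: the pair |e7 82| is accepted as some unrelated double-byte character, |9c 2e| collapses to `?', and the `.' that separated the name from its extension is gone.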

      I'm glad you were able to isolate the problem.

      Vim 7 already included a fix for this. This has been tried out for a
      while now, thus I think it's safe to include in Vim 6.3. Please try out
      this patch. If it works OK for you then I'll release it.

      *** os_mswin.c~ Sun Dec 5 16:39:37 2004
      --- os_mswin.c Sat Jul 2 13:07:35 2005
      ***************
      *** 367,385 ****
      nResult = mch_dirname(buf, len);
      else
      #endif
      - if (_fullpath(buf, fname, len - 1) == NULL)
      {
      ! STRNCPY(buf, fname, len); /* failed, use the relative path name */
      ! buf[len - 1] = NUL;
      ! #ifndef USE_FNAME_CASE
      ! slash_adjust(buf);
      #endif
      }
      - else
      - nResult = OK;

      #ifdef USE_FNAME_CASE
      fname_case(buf, len);
      #endif

      return nResult;
      --- 367,421 ----
      nResult = mch_dirname(buf, len);
      else
      #endif
      {
      ! #ifdef FEAT_MBYTE
      ! if (enc_codepage >= 0 && (int)GetACP() != enc_codepage
      ! # ifdef __BORLANDC__
      ! /* Wide functions of Borland C 5.5 do not work on Windows 98. */
      ! && g_PlatformId == VER_PLATFORM_WIN32_NT
      ! # endif
      ! )
      ! {
      ! WCHAR *wname;
      ! WCHAR wbuf[MAX_PATH];
      ! char_u *cname = NULL;
      !
      ! /* Use the wide function:
      ! * - convert the fname from 'encoding' to UCS2.
      ! * - invoke _wfullpath()
      ! * - convert the result from UCS2 to 'encoding'.
      ! */
      ! wname = enc_to_ucs2(fname, NULL);
      ! if (wname != NULL && _wfullpath(wbuf, wname, MAX_PATH - 1) != NULL)
      ! {
      ! cname = ucs2_to_enc((short_u *)wbuf, NULL);
      ! if (cname != NULL)
      ! {
      ! STRNCPY(buf, cname, len);
      ! buf[len - 1] = NUL;
      ! nResult = OK;
      ! }
      ! }
      ! vim_free(wname);
      ! vim_free(cname);
      ! }
      ! if (nResult == FAIL) /* fall back to non-wide function */
      #endif
      + {
      + if (_fullpath(buf, fname, len - 1) == NULL)
      + {
      + STRNCPY(buf, fname, len); /* failed, use relative path name */
      + buf[len - 1] = NUL;
      + }
      + else
      + nResult = OK;
      + }
      }

      #ifdef USE_FNAME_CASE
      fname_case(buf, len);
      + #else
      + slash_adjust(buf);
      #endif

      return nResult;

      --
      hundred-and-one symptoms of being an internet addict:
      210. When you get a divorce, you don't care about who gets the children,
      but discuss endlessly who can use the email address.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html ///
    • adah@netstd.com
      Message 2 of 13 , Jul 3, 2005
        Bram wrote:
        >
        > Yongwei wrote:
        >
        > > I have finally found out the reason. The cause is the _fullpath
        > > (which finally calls GetFullPathNameA) in mch_FullName. It is quite
        > > normal that the non-Unicode Win32 API requires that file names
        > > should be provided in native encoding.
        > >
        > > Non-DBCS-system users generally will not feel the problem since
        > > valid UTF-8 code points are generally valid SBCS (say, Latin1) code
        > > points, and 炜.txt will be regarded as code points |e7 82 9c 2e 74
        > > 78 74|. On DBCS systems, |9c2e| is invalid and will become `?'
        > > (|3f|).
        > >
        > > To solve this problem, maybe Vim needs to provide its own version of
        > > fullpath? Bram, what is your opinion?
        >
        > I'm glad you were able to isolate the problem.
        >
        > Vim 7 already included a fix for this. This has been tried out for a
        > while now, thus I think it's safe to include in Vim 6.3. Please try
        > out this patch. If it works OK for you then I'll release it.

        Yes, your patch works like a charm. Thanks, Bram!

        Best regards,

        Yongwei