Woe with MBCS File Names in UTF-8 Mode on Windows

  • adah@netstd.com
    Message 1 of 13, Jun 29, 2005
      By UTF-8 mode I mean `set encoding=UTF-8', and I also have `set
      fileencodings=ucs-bom,utf-8,chinese' in my _vimrc. My environment is
      Simplified Chinese version of Windows 2000 (code page 936). I am sure
      the problems that I describe later affect other language versions too.

      I begin with an empty file, and put the characters `测试' (Chinese for
      `test') in it. Then I write it as 测试.txt. It seems OK on the disk.

      Problems arose when I reopen it. The title displays the name as
      `<b2><e2><ca><d4>.txt' (B2E2 CAD4 is the GBK encoding for the two
      Chinese characters), and it is readonly! The problem of readonly should
      be that Vim could not create the swap file with a valid name. Repeat
      the above test with file name `测试件.txt', the file cannot be saved at
      all.

      Unsetting fileencodings seems to have no effect. The only way to make
      the file name right is to `:e ++enc=utf-8 测试.txt'. It works also when
      the file content is in other encodings (like GBK and Big5).

      Even this cannot work when opening an existing file named `测试件.txt'.
      Vim will report `Unable to open swap file for "测试件.txt", recovery
      impossible', and then open an EMPTY new file! (It is possible to open
      this file on the command line, but then the wrong-file-name and readonly
      problems are still there.)

      All the file names work as expected when encoding is the default value.
      However, without UTF-8 I can only work with UTF-8 files and GBK files,
      but not files in other encodings like Big5.

      I am guessing it is a bug in the encoding handling part of Vim.

      Best regards,

      Yongwei
    • Bram Moolenaar
      Message 2 of 13, Jun 30, 2005
        Yongwei wrote:

        > By UTF-8 mode I mean `set encoding=UTF-8', and I also have `set
        > fileencodings=ucs-bom,utf-8,chinese' in my _vimrc. My environment is
        > Simplified Chinese version of Windows 2000 (code page 936). I am sure
        > the problems that I describe later affect other language versions too.
        >
        > I begin with an empty file, and put the characters `测试' (Chinese for
        > `test') in it. Then I write it as 测试.txt. It seems OK on the disk.
        >
        > Problems arose when I reopen it. The title displays the name as
        > `<b2><e2><ca><d4>.txt' (B2E2 CAD4 is the GBK encoding for the two
        > Chinese characters), and it is readonly! The problem of readonly should
        > be that Vim could not create the swap file with a valid name. Repeat
        > the above test with file name `测试件.txt', the file cannot be saved at
        > all.

        When you open the file, do you mean you start gvim with the file name as
        an argument? There is a known problem with changing 'encoding' in that
        situation. Did you try opening the file from inside gvim, thus after
        'encoding' has been set to utf-8?

        The problem is that gvim gets the file name in the current codepage on
        the command line. When you switch 'encoding' to "utf-8" and the command
        line is used as-is then the characters will be garbled. I have solved
        this in Vim 7, but it's complicated and I don't want to include this
        change in Vim 6.
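As a rough illustration of the garbling described above, here is a minimal Python sketch (not Vim code). It assumes the file name arrives on the command line as code page 936 (GBK) bytes while 'encoding' is utf-8; the helper names `is_valid_utf8` and `codepage_to_utf8` are hypothetical and exist only for this example.

```python
# Illustration only: a file name passed on the command line in the system
# code page (assumed here to be cp936/GBK) while 'encoding' is utf-8.

def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be read as UTF-8 without error."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def codepage_to_utf8(raw: bytes, codepage: str = "gbk") -> bytes:
    """Convert a name from the system code page to UTF-8."""
    return raw.decode(codepage).encode("utf-8")


# The bytes the program receives for the name 测试.txt under code page 936.
name_on_command_line = "测试.txt".encode("gbk")

print(name_on_command_line.hex())            # b2e2cad42e747874
print(is_valid_utf8(name_on_command_line))   # False: used as-is, it is garbled
print(codepage_to_utf8(name_on_command_line) == "测试.txt".encode("utf-8"))  # True
```

The first four bytes are the `<b2><e2><ca><d4>` shown in the title: they are not valid UTF-8, so they cannot be displayed as characters unless the name is converted first.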

        > Unsetting fileencodings seems to have no effect. The only way to make
        > the file name right is to `:e ++enc=utf-8 测试.txt'. It works also when
        > the file content is in other encodings (like GBK and Big5).

        'fileencodings' is only used for the file contents, not for the file
        name. The name must always be in 'encoding'.
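The role of 'fileencodings' for file contents can be sketched as trying each candidate in order and keeping the first one that fits. The Python below is a simplified model, not Vim's actual detection logic; it assumes `chinese` means GBK (code page 936) on this system, and the function name `pick_fileencoding` is hypothetical.

```python
import codecs
from typing import Optional, Sequence


def pick_fileencoding(
    data: bytes,
    candidates: Sequence[str] = ("ucs-bom", "utf-8", "gbk"),
) -> Optional[str]:
    """Return the first candidate that fits the file CONTENTS, or None."""
    for enc in candidates:
        if enc == "ucs-bom":
            # Only matches when the data starts with a byte order mark.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8-sig"
            if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
                return "utf-16"
            continue
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None  # no candidate fits; this sketch simply reports that


print(pick_fileencoding("测试".encode("utf-8")))   # utf-8
print(pick_fileencoding("测试".encode("gbk")))     # gbk
```

Nothing in this loop ever looks at the file name, which matches the point above: the name is not run through the candidate list at all.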

        > Even this cannot work when opening an existing file named `测试件.txt'.
        > Vim will report `Unable to open swap file for "测试件.txt", recovery
        > impossible', and then open an EMPTY new file! (It is possible to open
        > this file on the command line, but then the wrong-file-name and readonly
        > problems are still there.)

        Again, is this when starting Vim or when using ":e" later?

        --
        hundred-and-one symptoms of being an internet addict:
        182. You may not know what is happening in the world, but you know
        every bit of net-gossip there is.

        /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
        /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
        \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
        \\\ Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html ///
      • adah@netstd.com
        Message 3 of 13, Jun 30, 2005
          Bram wrote:
          >
          > Yongwei wrote:
          >
          > > By UTF-8 mode I mean `set encoding=UTF-8', and I also have `set
          > > fileencodings=ucs-bom,utf-8,chinese' in my _vimrc. My environment
          > > is Simplified Chinese version of Windows 2000 (code page 936). I am
          > > sure the problems that I describe later affect other language
          > > versions too.
          > >
          > > I begin with an empty file, and put the characters `测试' (Chinese
          > > for `test') in it. Then I write it as 测试.txt. It seems OK on the
          > > disk.
          > >
          > > Problems arose when I reopen it. The title displays the name as
          > > `<b2><e2><ca><d4>.txt' (B2E2 CAD4 is the GBK encoding for the two
          > > Chinese characters), and it is readonly! The problem of readonly
          > > should be that Vim could not create the swap file with a valid name.
          > > Repeat the above test with file name `测试件.txt', the file cannot
          > > be saved at all.
          >
          > When you open the file, do you mean you start gvim with the file name
          > as an argument?

          Yes.

          > There is a known problem with changing 'encoding' in
          > that situation. Did you try opening the file from inside gvim, thus
          > after 'encoding' has been set to utf-8?

          If I start Vim first, set encoding to utf-8, and then open the file, it
          seems OK. However, it is not possible to `:e 测试件.txt' inside Vim
          (`Unable to open swap file for "测试件.txt", recovery impossible').
          Also, opening a file and then setting utf-8 is not an option, since
          the file content would be corrupted.

          > The problem is that gvim gets the file name in the current codepage on
          > the command line. When you switch 'encoding' to "utf-8" and the
          > command line is used as-is then the characters will be garbled. I
          > have solved this in Vim 7, but it's complicated and I don't want to
          > include this change in Vim 6.

          OK. But the problem around 测试件.txt seems more subtle. I had started
          Vim first.

          > > Even this cannot work when opening an existing file named `测试件
          > > .txt'. Vim will report `Unable to open swap file for "测试件.txt",
          > > recovery impossible', and then open an EMPTY new file! (It is
          > > possible to open this file on the command line, but then the
          > > wrong-file-name and readonly problems are still there.)
          >
          > Again, is this when starting Vim or when using ":e" later?

          As said above, using ":e" later.

          BTW, the strange problem seems to lie in the combination of the three
          Chinese characters: `:e 测试.txt' and `:e 试件.txt' are both OK. However,
          some other characters in the file name can become corrupted when saving
          the file, e.g., 炜 (e7829c in UTF-8, ecbf in GBK) will become ç? (c3a7
          c282 in UTF-8). I have no clue how this comes about.
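The byte values quoted above can be checked with a few lines of Python. The last step is only one reading that happens to be consistent with the numbers, not a confirmed diagnosis: it assumes the third UTF-8 byte is dropped and the remaining two bytes are then treated as single-byte (Latin1) characters and encoded to UTF-8 again.

```python
# Check the byte values for 炜 and reproduce the corrupted form.
name = "炜"

utf8 = name.encode("utf-8")
gbk = name.encode("gbk")
print(utf8.hex(), gbk.hex())   # e7829c ecbf

# Assumption for illustration: the last byte (9c) goes missing, and the
# two bytes that remain are read as Latin1 characters and re-encoded.
truncated = utf8[:2]
reencoded = truncated.decode("latin-1").encode("utf-8")
print(reencoded.hex())         # c3a7c282
```

Under that assumption the result is exactly c3a7 c282: `ç` (U+00E7) followed by the control character U+0082, which would explain the `?` seen on screen.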

          Best regards,

          Yongwei
        • Bram Moolenaar
          Message 4 of 13, Jun 30, 2005
            Yongwei wrote:

            > > There is a known problem with changing 'encoding' in that situation.
            > > Did you try opening the file from inside gvim, thus after 'encoding'
            > > has been set to utf-8?
            >
            > If I start Vim first, set encoding to utf-8, and open the file, it
            > seems OK.

            Good, that means you didn't discover a new problem.

            > However, it is not possible to `e 测试件.txt' inside Vim
            > (`Unable to open swap file for "测试件.txt", recovery
            > impossible').

            It appears to work fine for me. It may be something in your system that
            has a problem with this file name. I tried using the original Vim 6.3,
            no patches. I can't see the actual characters, my font doesn't support
            them, there are three rectangles.

            > Also, opening a file and then setting utf-8 are not an option, since
            > file content would be corrupt.

            No, that wouldn't work. Changing 'encoding' invalidates most text.

            > BTW, the strange problem seems in the three Chinese characters. `:e 测
            > 试.txt' and `:e 试件.txt' both are OK. However, some other
            > characters in the file name can become corrupt when saving the file,
            > e.g., 炜 (e7829c in UTF-8, ecbf in GBK) will become ç? (c3a7
            > c282 in UTF-8). I have no clue how it comes.

            I'm afraid I also don't know. Perhaps there is some problem with
            conversion from Unicode to your current codepage. This uses the
            MS-Windows library functions, thus it's not something I can fix.

            --
            hundred-and-one symptoms of being an internet addict:
            186. You overstay in the office so you can have more time surfing the net.

          • Strange
            Message 5 of 13, Jun 30, 2005
              I gave it a try with my gvim, and I can't find any problem with
              测试件.txt containing 测试, encoded as UTF-8, either from the command
              prompt or with :e!.
              I'm using a Chinese version of Windows 2000 SP4.

              Best regards,

              Strange

              On Thu, Jun 30, 2005 at 11:42:27AM +0800, adah@... wrote:
              > By UTF-8 mode I mean `set encoding=UTF-8', and I also have `set
              > fileencodings=ucs-bom,utf-8,chinese' in my _vimrc. My environment is
              > Simplified Chinese version of Windows 2000 (code page 936). I am sure
              > the problems that I describe later affect other language versions too.
              >
              > I begin with an empty file, and put the characters `测试' (Chinese for
              > `test') in it. Then I write it as 测试.txt. It seems OK on the disk.
              >
              > Problems arose when I reopen it. The title displays the name as
              > `<b2><e2><ca><d4>.txt' (B2E2 CAD4 is the GBK encoding for the two
              > Chinese characters), and it is readonly! The problem of readonly should
              > be that Vim could not create the swap file with a valid name. Repeat
              > the above test with file name `测试件.txt', the file cannot be saved at
              > all.
              >
              > Unsetting fileencodings seems to have no effect. The only way to make
              > the file name right is to `:e ++enc=utf-8 测试.txt'. It works also when
              > the file content is in other encodings (like GBK and Big5).
              >
              > Even this cannot work when opening an existing file named `测试件.txt'.
              > Vim will report `Unable to open swap file for "测试件.txt", recovery
              > impossible', and then open an EMPTY new file! (It is possible to open
              > this file on the command line, but then the wrong-file-name and readonly
              > problems are still there.)
              >
              > All the file names work as expected when encoding is the default value.
              > However, without UTF-8 I can only work with UTF-8 files and GBK files,
              > but not files in other encodings like Big5.
              >
              > I am guessing it is a bug in the encoding handling part of Vim.
              >
              > Best regards,
              >
              > Yongwei
            • adah@netstd.com
              Message 6 of 13, Jun 30, 2005
                Do you have `set encoding=UTF-8' in your _vimrc or somewhere? If not, you
                will have no problems at all. If yes, please tell me your exact version
                and patch number as reported by `:version'.

                Best regards,

                Yongwei





              • adah@netstd.com
                Message 7 of 13, Jun 30, 2005
                  > > BTW, the strange problem seems in the three Chinese characters. `:e
                  > > 测试.txt' and `:e 试件.txt' both are OK. However, some other
                  > > characters in the file name can become corrupt when saving the file,
                  > > e.g., 炜 (e7829c in UTF-8, ecbf in GBK) will become ç? (c3a7 c282 in
                  > > UTF-8). I have no clue how it comes.
                  >
                  > I'm afraid I also don't know. Perhaps there is some problem with
                  > conversion from Unicode to your current codepage. This uses the
                  > MS-Windows library functions, thus it's not something I can fix.

                  I did a trace into Vim, and I found that it was because the `9c' of
                  e7829c (炜) had been lost before mch_open is called. Could this give
                  you a clue? Or could you give me some guidance on where I should
                  investigate further?

                  Best regards,

                  Yongwei
                • Bram Moolenaar
                  Message 8 of 13, Jul 1, 2005
                    Yongwei wrote:

                    > > > BTW, the strange problem seems in the three Chinese characters.
                    > > > `:e 测试.txt' and `:e 试件.txt' both are OK.
                    > > > However, some other characters in the file name can become corrupt
                    > > > when saving the file, e.g., 炜 (e7829c in UTF-8, ecbf in
                    > > > GBK) will become ç? (c3a7 c282 in UTF-8). I have no clue how it
                    > > > comes.
                    > >
                    > > I'm afraid I also don't know. Perhaps there is some problem with
                    > > conversion from Unicode to your current codepage. This uses the
                    > > MS-Windows library functions, thus it's not something I can fix.
                    >
                    > I did a trace into Vim, and I found that it was because the `9c' of
                    > e7829c (炜) had been lost before mch_open is called. Could
                    > this give you a clue? Or give me a guidance where I should
                    > investigate further?

                    I would guess that somewhere in the code the DBCS codepage is used to
                    locate the character, instead of using it as UTF-8. Since I don't have
                    a DBCS system, I can't try this.

                    If you are able to see what happens in a debugger then you should be
                    able to follow the route from typing the command to the mch_open() call.

                    --
                    Some of the well known MS-Windows errors:
                    ETIME Wrong time, wait a little while
                    ECRASH Try again...
                    EDETECT Unable to detect errors
                    EOVER You lost! Play another game?
                    ENOCLUE Eh, what did you want?

                  • adah@netstd.com
                    Message 9 of 13, Jul 1, 2005
                      > > I did a trace into Vim, and I found that it was because the `9c' of
                      > > e7829c (炜) had been lost before mch_open is called. Could this
                      > > give you a clue? Or give me a guidance where I should investigate
                      > > further?
                      >
                      > I would guess that somewhere in the code the DBCS codepage is used to
                      > locate the character, instead of using it as UTF-8. Since I don't
                      > have a DBCS system, I can't try this.
                      >
                      > If you are able to see what happens in a debugger then you should be
                      > able to follow the route from typing the command to the mch_open()
                      > call.

                      Since I was tracing mch_open out (not from outside in), I soon lost my
                      way. And I was not familiar with the Vim code organization. That is
                      the reason why I asked for guidance. I need a starting point to trace
                      (where `:w file.txt' is really executed).

                      And it is not difficult to change one's system into a DBCS one, as long
                      as one has a Windows 2000/XP box with installation files/CD. Just
                      install the Far East support and set the default code page in the
                      Regional Setting.

                      Best regards,

                      Yongwei
                    • Bram Moolenaar
                      Message 10 of 13, Jul 1, 2005
                        Yongwei wrote:

                        > > > I did a trace into Vim, and I found that it was because the `9c' of
                        > > > e7829c (炜) had been lost before mch_open is called. Could this
                        > > > give you a clue? Or give me a guidance where I should investigate
                        > > > further?
                        > >
                        > > I would guess that somewhere in the code the DBCS codepage is used to
                        > > locate the character, instead of using it as UTF-8. Since I don't
                        > > have a DBCS system, I can't try this.
                        > >
                        > > If you are able to see what happens in a debugger then you should be
                        > > able to follow the route from typing the command to the mch_open()
                        > > call.
                        >
                        > Since I was tracing mch_open out (not from outside in), I soon lost my
                        > way. And I was not familiar with the Vim code organization. That is
                        > the reason why I asked for guidance. I need a starting point to trace
                        > (where `:w file.txt' is really executed).

                        You can step out of mch_open() to see what happened in the calling
                        function.

                        If you need to step through the code that leads to opening the file you
                        might want to put a breakpoint in open_buffer(). Check that
                        curbuf->b_ffname is right. The file reading is done in readfile().

                        --
                        hundred-and-one symptoms of being an internet addict:
                        202. You're amazed to find out Spam is a food.

                      • adah@netstd.com
                        Message 11 of 13, Jul 1, 2005
                          Bram wrote:
                          >
                          > Yongwei wrote:
                          >
                          > > > > I did a trace into Vim, and I found that it was because the `9c'
                          > > > > of e7829c (炜) had been lost before mch_open is called. Could
                          > > > > this give you a clue? Or give me a guidance where I should
                          > > > > investigate further?
                          > > >
                          > > > I would guess that somewhere in the code the DBCS codepage is used
                          > > > to locate the character, instead of using it as UTF-8. Since I
                          > > > don't have a DBCS system, I can't try this.
                          > > >
                          > > > If you are able to see what happens in a debugger then you should
                          > > > be able to follow the route from typing the command to the
                          > > > mch_open() call.
                          > >
                          > > Since I was tracing mch_open out (not from outside in), I soon lost
                          > > my way. And I was not familiar with the Vim code organization.
                          > > That is the reason why I asked for guidance. I need a starting
                          > > point to trace (where `:w file.txt' is really executed).
                          >
                          > You can step out of mch_open() to see what happened in the calling
                          > function.
                          >
                          > If you need to step through the code that leads to opening the file
                          > you might want to put a breakpoint in open_buffer(). Check that
                          > curbuf->b_ffname is right. The file reading is done in readfile().

                          I have finally found out the reason. The cause is the _fullpath (which
                          finally calls GetFullPathNameA) in mch_FullName. It is quite normal
                          that the non-Unicode Win32 API requires that file names should be
                          provided in native encoding.

                          Non-DBCS-system users generally will not feel the problem, since the
                          bytes of a UTF-8 sequence are generally all valid SBCS (say, Latin1)
                          characters, and 炜.txt will be regarded as the bytes
                          |e7 82 9c 2e 74 78 74|. On DBCS systems, |e7 82| is taken as one
                          double-byte character, which leaves |9c| as a lead byte; |2e| is not
                          a valid trail byte, so |9c2e| is invalid and will become `?' (|3f|).
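The pairing effect can be modelled in a few lines of Python. This is a simplified sketch of how a code page 936 API might scan the bytes, following the description above; it is not the actual Windows implementation. It checks only lead-byte/trail-byte structure (lead 0x81-0xFE, trail 0x40-0xFE except 0x7F), not whether each pair maps to a defined character, and the function names are hypothetical.

```python
def is_lead_byte(b: int) -> bool:
    """Lead-byte range assumed for code page 936."""
    return 0x81 <= b <= 0xFE


def is_trail_byte(b: int) -> bool:
    """Trail-byte range assumed for code page 936."""
    return 0x40 <= b <= 0xFE and b != 0x7F


def through_dbcs_api(raw: bytes) -> bytes:
    """Scan bytes as DBCS text; an invalid pair becomes a single '?'."""
    out = bytearray()
    i = 0
    while i < len(raw):
        b = raw[i]
        if not is_lead_byte(b):
            out.append(b)              # plain single-byte character
            i += 1
        elif i + 1 < len(raw) and is_trail_byte(raw[i + 1]):
            out += raw[i:i + 2]        # well-formed double-byte pair
            i += 2
        else:
            out.append(0x3F)           # invalid pair (or dangling lead byte)
            i += 2
    return bytes(out)


def survives(name: str) -> bool:
    """True if the UTF-8 bytes of the name pass through unchanged."""
    raw = name.encode("utf-8")
    return through_dbcs_api(raw) == raw


print(through_dbcs_api("炜.txt".encode("utf-8")).hex())  # e7823f747874
for candidate in ("测试.txt", "试件.txt", "测试件.txt", "炜.txt"):
    print(candidate.encode("utf-8").hex(), survives(candidate))
```

Under this model 测试.txt and 试件.txt pass through intact, because their UTF-8 bytes happen to pair up evenly, while 测试件.txt and 炜.txt leave an odd byte that swallows the `.`. That matches which names were reported as working and failing earlier in the thread.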

                          To solve this problem, maybe Vim needs to provide its own version of
                          fullpath? Bram, what is your opinion?

                          Best regards,

                          Yongwei
                        • Bram Moolenaar
                          Message 12 of 13, Jul 2, 2005
                            Yongwei wrote:

                            > I have finally found out the reason. The cause is the _fullpath (which
                            > finally calls GetFullPathNameA) in mch_FullName. It is quite normal
                            > that the non-Unicode Win32 API requires that file names should be
                            > provided in native encoding.
                            >
                            > Non-DBCS-system users generally will not feel the problem since valid
                            > UTF-8 code points are generally valid SBCS (say, Latin1) code points,
                            > and 炜.txt will be regarded as code points |e7 82 9c 2e 74 78 74|. On
                            > DBCS systems, |9c2e| is invalid and will become `?' (|3f|).
                            >
                            > To solve this problem, maybe Vim needs to provide its own version of
                            > fullpath? Bram, what is your opinion?

                            I'm glad you were able to isolate the problem.

                            Vim 7 already included a fix for this. This has been tried out for a
                            while now, thus I think it's safe to include in Vim 6.3. Please try out
                            this patch. If it works OK for you then I'll release it.

                            *** os_mswin.c~ Sun Dec 5 16:39:37 2004
                            --- os_mswin.c Sat Jul 2 13:07:35 2005
                            ***************
                            *** 367,385 ****
                                      nResult = mch_dirname(buf, len);
                                  else
                              #endif
                            -     if (_fullpath(buf, fname, len - 1) == NULL)
                                  {
                            !         STRNCPY(buf, fname, len); /* failed, use the relative path name */
                            !         buf[len - 1] = NUL;
                            ! #ifndef USE_FNAME_CASE
                            !         slash_adjust(buf);
                              #endif
                                  }
                            -     else
                            -         nResult = OK;

                              #ifdef USE_FNAME_CASE
                                  fname_case(buf, len);
                              #endif

                                  return nResult;
                            --- 367,421 ----
                                      nResult = mch_dirname(buf, len);
                                  else
                              #endif
                                  {
                            ! #ifdef FEAT_MBYTE
                            !         if (enc_codepage >= 0 && (int)GetACP() != enc_codepage
                            ! # ifdef __BORLANDC__
                            !                 /* Wide functions of Borland C 5.5 do not work on Windows 98. */
                            !                 && g_PlatformId == VER_PLATFORM_WIN32_NT
                            ! # endif
                            !            )
                            !         {
                            !             WCHAR *wname;
                            !             WCHAR wbuf[MAX_PATH];
                            !             char_u *cname = NULL;
                            !
                            !             /* Use the wide function:
                            !              * - convert the fname from 'encoding' to UCS2.
                            !              * - invoke _wfullpath()
                            !              * - convert the result from UCS2 to 'encoding'.
                            !              */
                            !             wname = enc_to_ucs2(fname, NULL);
                            !             if (wname != NULL && _wfullpath(wbuf, wname, MAX_PATH - 1) != NULL)
                            !             {
                            !                 cname = ucs2_to_enc((short_u *)wbuf, NULL);
                            !                 if (cname != NULL)
                            !                 {
                            !                     STRNCPY(buf, cname, len);
                            !                     buf[len - 1] = NUL;
                            !                     nResult = OK;
                            !                 }
                            !             }
                            !             vim_free(wname);
                            !             vim_free(cname);
                            !         }
                            !         if (nResult == FAIL) /* fall back to non-wide function */
                              #endif
                            +         {
                            +             if (_fullpath(buf, fname, len - 1) == NULL)
                            +             {
                            +                 STRNCPY(buf, fname, len); /* failed, use relative path name */
                            +                 buf[len - 1] = NUL;
                            +             }
                            +             else
                            +                 nResult = OK;
                            +         }
                                  }

                              #ifdef USE_FNAME_CASE
                                  fname_case(buf, len);
                            + #else
                            +     slash_adjust(buf);
                              #endif

                                  return nResult;

                            --
                            hundred-and-one symptoms of being an internet addict:
                            210. When you get a divorce, you don't care about who gets the children,
                            but discuss endlessly who can use the email address.

                            /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                            /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                            \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
                            \\\ Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html ///
                          • adah@netstd.com
                            Message 13 of 13 , Jul 3, 2005
                              Bram wrote:
                              >
                              > Yongwei wrote:
                              >
                              > > I have finally found out the reason. The cause is the _fullpath
                              > > (which finally calls GetFullPathNameA) in mch_FullName. It is quite
                              > > normal that the non-Unicode Win32 API requires that file names
                              > > should be provided in native encoding.
                              > >
                              > > Non-DBCS-system users generally will not feel the problem since
                              > > valid UTF-8 code points are generally valid SBCS (say, Latin1) code
                              > > points, and 炜.txt will be regarded as code points |e7 82 9c 2e 74
                              > > 78 74|. On DBCS systems, |9c2e| is invalid and will become `?'
                              > > (|3f|).
                              > >
                              > > To solve this problem, maybe Vim needs to provide its own version of
                              > > fullpath? Bram, what is your opinion?
                              >
                              > I'm glad you were able to isolate the problem.
                              >
                              > Vim 7 already included a fix for this. This has been tried out for a
                              > while now, thus I think it's safe to include in Vim 6.3. Please try
                              > out this patch. If it works OK for you then I'll release it.

                              Yes, your patch works like a charm. Thanks, Bram!

                              Best regards,

                              Yongwei
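To make the byte-level explanation quoted above concrete, here is a small sketch in portable C. It is not part of Bram's patch and does not call any Windows API. It only checks the structural GBK (code page 936) byte ranges, which I am assuming to be a lead byte in 0x81..0xFE followed by a trail byte in 0x40..0xFE other than 0x7F. The function names are made up for this illustration.

```c
#include <stddef.h>

/* Simplified structural rules for GBK (code page 936), assumed for this
 * sketch: lead byte 0x81..0xFE, trail byte 0x40..0xFE except 0x7F.
 * Only byte ranges are checked, not whether a pair is actually assigned.
 * Single bytes 0x80 and 0xFF are treated as invalid here. */
static int is_gbk_lead(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

static int is_gbk_trail(unsigned char c)
{
    return c >= 0x40 && c <= 0xFE && c != 0x7F;
}

/* Hypothetical helper: returns the offset of the first byte that cannot
 * start a structurally valid GBK sequence, or -1 if the buffer passes. */
static long first_invalid_gbk(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        if (s[i] < 0x80) {          /* plain ASCII, one byte */
            i++;
            continue;
        }
        if (!is_gbk_lead(s[i]) || i + 1 >= n || !is_gbk_trail(s[i + 1]))
            return (long)i;         /* bad lead, truncated pair, or bad trail */
        i += 2;                     /* accept the double-byte pair */
    }
    return -1;
}
```

Fed the UTF-8 bytes of 炜.txt (e7 82 9c 2e 74 78 74), the helper accepts e7 82 as a pair and then stops at offset 2, because 2e cannot follow the lead byte 9c. That is the |9c2e| pair that turns into `?'. Fed the GBK bytes of 测试.txt (b2 e2 ca d4 2e 74 78 74), it returns -1.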