Re: Vim can't use filename having MBYTE

  • mattn@mail.goo.ne.jp
    Message 1 of 10, Jul 30, 2000
      Bram@... wrote:
      >
      > Yasuhiro Matsumoto wrote:
      >
      > > When a filename argument contains multi-byte characters
      > > and one of their bytes is a backslash code, Vim can't read the file.
      > > (Vim creates [New File] instead.)
      >
      > Again you found the right place to fix the problem, but it doesn't take care
      > of UTF-8. UTF-8 encoding uses up to six bytes for one character. That's
      > unlike the two-byte encodings supported before Vim 6.0.
      >
      > Try this patch instead:
      > ...

      Thanks, your patch looks good.
      By the way,
      --------------------------------
      "misc1.c" 6125 lines

      if ((add_pat <= 0 && (flags & EW_NOTFOUND))
              || (add_pat == -1 && mch_getperm(p) >= 0))
      {
          char_u *t = backslash_halve_save(p);

          addfile(&ga, t, flags);
          vim_free(t);
      }
      --------------------------------
      Is "backslash_halve_save" needed here?
      Without it, things seem to work.

      > By the way, please use ordinary English/ASCII text in the subject, not all
      > mail software can decode the MIME encoding setting.

      Sorry, I'll be careful.
    • Bram Moolenaar
      Message 2 of 10, Jul 31, 2000
        Yasuhiro Matsumoto wrote:

        > By the way,
        > --------------------------------
        > "misc1.c" 6125 lines
        >
        > if ((add_pat <= 0 && (flags & EW_NOTFOUND))
        > || (add_pat == -1 && mch_getperm(p) >= 0))
        > {
        > char_u *t = backslash_halve_save(p);
        >
        > addfile(&ga, t, flags);
        > vim_free(t);
        > }
        > --------------------------------
        > "backslash_halve_save" is needed?
        > if no using, it like good.

        Hmm, this looks strange. First there is a check if the file exists with
        "mch_getperm(p)" and then the number of backslashes is halved. That can't be
        right. Try this instead (misc1.c, around line 5888):

        if (add_pat == -1 || (add_pat == 0 && (flags & EW_NOTFOUND)))
        {
            char_u *t = backslash_halve_save(p);

            if ((flags & EW_NOTFOUND) || mch_getperm(t) >= 0)
                addfile(&ga, t, flags);
            vim_free(t);
        }

        If this still doesn't work, please give an example of how to reproduce the
        problem. And don't forget to mention which system this is on.
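
A minimal sketch of the ordering described above: halve the backslashes first, then test whether the resulting name exists. The names "halve_backslashes_copy", "name_to_add" and "fake_exists" are hypothetical stand-ins (for backslash_halve_save(), for the patched block, and for the mch_getperm(t) >= 0 test); this is an illustration under those assumptions, not Vim's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for backslash_halve_save(): returns a newly
 * allocated copy of p in which every "\x" pair is reduced to "x".
 * The real Vim function decides per backslash and may differ in detail. */
static char *halve_backslashes_copy(const char *p)
{
    char *res = malloc(strlen(p) + 1);
    char *d = res;

    if (res == NULL)
        return NULL;
    while (*p != '\0') {
        if (p[0] == '\\' && p[1] != '\0')
            ++p;                /* drop the escaping backslash */
        *d++ = *p++;
    }
    *d = '\0';
    return res;
}

/* Hypothetical existence test used only for illustration:
 * pretends that exactly one file, named "a b", exists. */
static int fake_exists(const char *name)
{
    return strcmp(name, "a b") == 0;
}

/* Returns the halved name when it should be added to the list, else NULL.
 * The existence check runs on the halved name t, not on the original p. */
static char *name_to_add(const char *p, int notfound_ok,
                         int (*exists)(const char *))
{
    char *t = halve_backslashes_copy(p);

    if (t != NULL && !notfound_ok && !exists(t)) {
        free(t);
        t = NULL;
    }
    return t;
}
```

With the check applied to the original string instead, the argument "a\ b" would be looked up under a name that still contains the backslash, which is the mismatch pointed out above.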

        --
        Everybody lies, but it doesn't matter since nobody listens.

        /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
        \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
      • mattn@mail.goo.ne.jp
        Message 3 of 10, Jul 31, 2000
          Bram@... wrote:
          > Hmm, this looks strange. First there is a check if the file exists with
          > "mch_getperm(p)" and then the number of backslashes is halved. That can't be
          > right. Try this instead (misc1.c, around line 5888):
          > ...
          > If this still doesn't work, please give an example how to reproduce the
          > problem. And don't forget to mention on what system this is.

          I checked it and it is right. It seems to work well (on Win2K).

          In addition, when a filename argument contains multi-byte characters,
          "rem_backslash" does not work.

          ---------------------------------------
          *** src.org/ex_docmd.c Mon Jul 24 00:06:21 2000
          --- src/ex_docmd.c Mon Jul 31 19:36:22 2000
          ***************
          *** 3457,3462 ****
          --- 3457,3465 ----
          || (str[1] != NUL
          && str[1] != '*'
          && str[1] != '?'
          + #ifdef FEAT_MBYTE
          + && has_mbyte && (mb_ptr2len_check(&str[1]) == 1)
          + #endif
          && !vim_isfilec(str[1]))));
          #else
          return (str[0] == '\\' && str[1] != NUL);
          ---------------------------------------

          Yasuhiro Matsumoto.
        • Bram Moolenaar
          Message 4 of 10, Jul 31, 2000
            Yasuhiro Matsumoto wrote:

            > Still more, if holding MBYTE in filename arguments,
            > "rem_backslash" do not work.
            >
            > ---------------------------------------
            > *** src.org/ex_docmd.c Mon Jul 24 00:06:21 2000
            > --- src/ex_docmd.c Mon Jul 31 19:36:22 2000
            > ***************
            > *** 3457,3462 ****
            > --- 3457,3465 ----
            > || (str[1] != NUL
            > && str[1] != '*'
            > && str[1] != '?'
            > + #ifdef FEAT_MBYTE
            > + && has_mbyte && (mb_ptr2len_check(&str[1]) == 1)
            > + #endif
            > && !vim_isfilec(str[1]))));
            > #else
            > return (str[0] == '\\' && str[1] != NUL);
            > ---------------------------------------

            This function is very tricky. It was tuned carefully to remove
            backslashes where wanted, and keep them where needed.

            It looks like this patch doesn't work when "has_mbyte" is not set.

            Can you please explain what your change intends to do? I believe all
            multi-byte characters are already marked as filename characters. Thus when a
            multi-byte character is preceded with a backslash the backslash should be
            kept, since a file name can start with a multi-byte character, right?

            --
            hundred-and-one symptoms of being an internet addict:
            43. You tell the kids they can't use the computer because "Daddy's got work to
            do" and you don't even have a job.

            /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
            \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
          • mattn@mail.goo.ne.jp
            Message 5 of 10, Jul 31, 2000
              Bram@... wrote:
              > This function is very tricky. It was tuned carefully to remove
              > backslashes where wanted, and keep them where needed.
              >
              > It looks like this patch doesn't work when "has_mbyte" is not set.

              Oh, my mistake.
              ------------------------------- (ex_docmd.c, around line 3454)
              #ifdef BACKSLASH_IN_FILENAME
                  return (str[0] == '\\'
              #ifdef FEAT_MBYTE
                          && (!has_mbyte || (has_mbyte && (mb_ptr2len_check(str+1) == 1)))
              #endif
                          && (str[1] == ' '
                              || (str[1] != NUL
                                  && str[1] != '*'
                                  && str[1] != '?'
                                  && !vim_isfilec(str[1]))));
              #else
                  return (str[0] == '\\' && str[1] != NUL);
              #endif
              -----------------------------------
              > Can you please explain what your change intends to do? I believe all
              > multi-byte characters are already marked as filename characters. Thus when a
              > multi-byte character is preceded with a backslash the backslash should be
              > kept, since a file name can start with a multi-byte character, right?

              It seems this was my misunderstanding.
              The problem is not only in this part.
              The cause is that "expand_wildcard" is called before
              "set_init_3", which means that the multi-byte table
              has not been initialized yet.
              Thus, when analysing filename arguments,
              it can't recognize multi-byte characters correctly.

              I attach a diff file to this mail
              (it changes the order of the function calls).
              But I am not certain about it.
              I would welcome anyone's suggestions.
              -------------------------------------
              Yasuhiro Matsumoto
            • Bram Moolenaar
              Message 6 of 10, Aug 1, 2000
                Yasuhiro Matsumoto wrote:

                > This problem is not in the part only.
                > The cause is that calling "expand_wildcard" before
                > calling "set_init_3". This mean that no initializing
                > multi-byte table.

                Ah, thus vim_isfilec() is called before the chartab[] table has been
                initialized. That's wrong.

                > Thus, when analysing file-name arguments, it can't judge multi-byte right.

                Right. But has_mbyte will also not be set. Unless you specified it on the
                command line, which is not likely to happen.

                > I attach diff-file on this mail. (change order of calling function)
                > But this is not certain.
                > I want any one's suggestion.

                Changing the order of initializations is very dangerous. In this case the
                vimrc file can do just about anything with the files specified on the command
                line, thus it should be run with a valid set of arguments. But since 'isfname'
                can be set in the vimrc, expanding may work differently after it.

                This is a tough chicken-egg problem. I really don't know a good solution.
                Perhaps it would be sufficient to initialise chartab[] first, assuming that
                all characters above 0x80 are filename characters?
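
A minimal sketch of that idea, assuming a hypothetical early table "early_chartab" that is filled before any vimrc is read. The flag name "CT_FNAME", the ASCII punctuation set and the helper names are assumptions made for illustration only; Vim's real chartab[] is built from 'isfname'. The sketch marks every byte from 0x80 upward as a filename character.

```c
#include <ctype.h>
#include <string.h>

#define CT_FNAME 0x01   /* hypothetical flag bit: byte may appear in a file name */

/* Hypothetical early table; Vim's real chartab[] is filled from 'isfname'. */
static unsigned char early_chartab[256];

/* Fill the table before any option or vimrc has been processed. */
static void init_early_chartab(void)
{
    /* Assumed ASCII punctuation, loosely modelled on a Windows-style 'isfname'. */
    static const char punct[] = "/\\.-_+,#$%~=:";
    int c;

    memset(early_chartab, 0, sizeof(early_chartab));
    for (c = 1; c < 256; ++c) {     /* NUL is never a filename character */
        if (c >= 0x80)
            early_chartab[c] = CT_FNAME;    /* possible multi-byte lead or trail byte */
        else if (isalnum(c) || strchr(punct, c) != NULL)
            early_chartab[c] = CT_FNAME;
    }
}

/* Lookup that is safe to call as soon as init_early_chartab() has run. */
static int early_isfilec(int c)
{
    return c >= 0 && c < 256 && (early_chartab[c] & CT_FNAME) != 0;
}
```

This only covers bytes that are themselves non-ASCII. A trail byte that happens to be an ASCII backslash is still indistinguishable from a real backslash without knowing the encoding.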

                --
                hundred-and-one symptoms of being an internet addict:
                56. You leave the modem speaker on after connecting because you think it
                sounds like the ocean wind...the perfect soundtrack for "surfing the net".

                /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
                \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
              • mattn@mail.goo.ne.jp
                Message 7 of 10, Aug 1, 2000
                  Bram@... wrote:
                  > Changing the order of initializations is very dangerous. In this case the
                  > vimrc file can do just about anything with the files specified on the command
                  > line, thus it should be run with a valid set of arguments. But since 'isfname'
                  > can be set in the vimrc, expanding may work differently after it.

                  I would like p_cc to be set before "rem_backslash" is called.
                  I do "set fileencoding=japan" in my vimrc. Is that too late?

                  > This is a tough chicken-egg problem. I really don't know a good solution.
                  > Perhaps it would be sufficient to initialise chartab[] first, assuming that
                  > all characters above 0x80 are filename characters?

                  Japanese users work with several character sets,
                  such as "shift_jis" and "euc-jp".
                  Japanese MS-Windows uses "shift_jis", while UNIX systems often use "euc-jp".
                  Both are composed of a lead byte and a trail byte.

                  With "shift_jis", the trail byte may be a backslash (0x5C).
                  With "euc-jp", the trail byte is never a backslash.
                  Thus, trail bytes have to be skipped according to the lead byte and the encoding.
                  However, both encodings have in common that the lead byte is non-ASCII.
                  So how about the solution below?

                  When a backslash is an escaping backslash (not a path separator),
                  the next character should be a single-byte character, that is, ASCII.

                  ---------------------------------------------
                  rem_backslash(str)
                      char_u *str;
                  {
                  #ifdef BACKSLASH_IN_FILENAME
                      return (str[0] == '\\'
                  #ifdef FEAT_MBYTE
                              && isascii(str[1])
                  #endif
                              && (str[1] == ' '
                                  || (str[1] != NUL
                                      && str[1] != '*'
                                      && str[1] != '?'
                                      && !vim_isfilec(str[1]))));
                  #else
                      return (str[0] == '\\' && str[1] != NUL);
                  #endif
                  }
                  ---------------------------------------------

                  Yasuhiro Matsumoto
                • Bram Moolenaar
                  Message 8 of 10, Aug 3, 2000
                    Yasuhiro Matsumoto wrote:

                    > Bram@... wrote:
                    > > Changing the order of initializations is very dangerous. In this case the
                    > > vimrc file can do just about anything with the files specified on the
                    > > command line, thus it should be run with a valid set of arguments. But
                    > > since 'isfname' can be set in the vimrc, expanding may work differently
                    > > after it.
                    >
                    > I hope that setting p_cc before calling "rem_backslash".
                    > I do "set fileencoding=japan" in vimrc, Is it too later?

                    You would set 'fileencoding' or 'charcode' (which are really the same thing)
                    in your vimrc file. But in your vimrc file you may also want to check which
                    arguments Vim got and they need to be expanded before sourcing the vimrc file.
                    That won't work.

                    A very clever solution would be not to expand wildcards in the arguments until
                    they are used. Thus if you would set 'charcode' in your vimrc and then access
                    an argument, the expansion would be done right there. I'm not sure if this is
                    really possible though.

                    > > This is a tough chicken-egg problem. I really don't know a good solution.
                    > > Perhaps it would be sufficient to initialise chartab[] first, assuming that
                    > > all characters above 0x80 are filename characters?
                    >
                    > Japanese users work with several character sets,
                    > such as "shift_jis" and "euc-jp".
                    > Japanese MS-Windows uses "shift_jis", while UNIX systems often use "euc-jp".
                    > Both are composed of a lead byte and a trail byte.
                    >
                    > With "shift_jis", the trail byte may be a backslash (0x5C).
                    > With "euc-jp", the trail byte is never a backslash.
                    > Thus, trail bytes have to be skipped according to the lead byte and the
                    > encoding. However, both encodings have in common that the lead byte is non-ASCII.

                    I see the problem. It's nearly impossible to guess whether a backslash is
                    really the second byte of a multi-byte character. It might just as well be
                    an ISO-8859-1 character followed by a backslash.

                    > So how about below solution?

                    It might solve it in some cases, but not when the double-byte character that
                    contains a backslash is followed by an ascii character. I don't think this is
                    reliable enough.
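
A minimal sketch of why the guess is hard, assuming the encoding is already known to be Shift_JIS (lead bytes 0x81-0x9F and 0xE0-0xFC). The helper names "sjis_is_lead" and "sjis_is_trail_at" are hypothetical. Whether a given 0x5C byte is a trail byte can only be decided by scanning from the start of the string, and the answer changes if the same bytes are read in another encoding.

```c
#include <stddef.h>

/* Shift_JIS lead-byte ranges: 0x81-0x9F and 0xE0-0xFC. */
static int sjis_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Returns 1 when s[idx] is the second byte of a double-byte character.
 * The scan has to start at the beginning of the string, because a byte
 * such as 0x5C is valid both as plain ASCII and as a trail byte. */
static int sjis_is_trail_at(const unsigned char *s, size_t idx)
{
    size_t i = 0;

    while (s[i] != '\0' && i <= idx) {
        if (sjis_is_lead(s[i]) && s[i + 1] != '\0') {
            if (i + 1 == idx)
                return 1;       /* idx is the trail byte of this character */
            i += 2;             /* skip the whole double-byte character */
        } else {
            ++i;                /* single-byte character */
        }
    }
    return 0;
}
```

For the bytes 0xE9 0x5C this reports a trail byte, yet under ISO-8859-1 the same two bytes are an accented letter followed by a real backslash. That is the ambiguity described above: the scan is only as good as the assumed encoding.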

                    --
                    hundred-and-one symptoms of being an internet addict:
                    78. You find yourself dialing IP numbers on the phone.

                    /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
                    \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
                  • mattn@mail.goo.ne.jp
                    Message 9 of 10, Aug 3, 2000
                      Bram@... wrote:
                      > > So how about below solution?
                      >
                      > It might solve it in some cases, but not when the double-byte character that
                      > contains a backslash is followed by an ascii character. I don't think this is
                      > reliable enough.

                      ---------------------------------------------
                      rem_backslash(str)
                          char_u *str;
                      {
                      #ifdef BACKSLASH_IN_FILENAME
                          return (str[0] == '\\'
                      #ifdef FEAT_MBYTE
                                  && isascii(str[1])
                      #endif
                                  && (str[1] == ' '
                                      || (str[1] != NUL
                                          && str[1] != '*'
                                          && str[1] != '?'
                                          && !vim_isfilec(str[1]))));
                      #else
                          return (str[0] == '\\' && str[1] != NUL);
                      #endif
                      }
                      ---------------------------------------------

                      I think this is enough.
                      If I write multi-byte characters as "[A][B][C][D][E][F]",
                      it works as shown below ("[B]", "[C]" and "[F]" contain a backslash byte).

                      ex: C:\[A][B]\ [C]\I_love_[D][E][F]s.txt
                      (1)-(6) mark the six backslashes in the path above.

                      In this case, there are six check points.
                      (Is the case you mentioned (6)?)

                      (1) The next character is ASCII, but it is not "*", "?" or " ".
                      ---> this is not an escaping backslash.

                      (2) The next character is non-ASCII.
                      ---> this is not an escaping backslash.

                      (3) The next character is ASCII, and it is " ".
                      ---> this is an escaping backslash.

                      (4) The next character is ASCII, but it is not "*", "?" or " ".
                      ---> this is not an escaping backslash.

                      (5) The next character is ASCII, but it is not "*", "?" or " ".
                      ---> this is not an escaping backslash.

                      (6) The next character is ASCII, but it is not "*", "?" or " ".
                      ---> this is not an escaping backslash.
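
The walk-through above can be replayed with a small sketch. "sketch_rem_backslash" mirrors the BACKSLASH_IN_FILENAME branch of the proposed function, and "sketch_isfilec" is a hypothetical stand-in for vim_isfilec() that assumes a Windows-style 'isfname' (letters, digits, some punctuation including the backslash, and all bytes from 0x80 upward). Neither is Vim's actual code.

```c
#include <ctype.h>
#include <string.h>

/* True for 7-bit bytes; stands in for isascii(). */
static int is_ascii_byte(unsigned char c)
{
    return c < 0x80;
}

/* Hypothetical stand-in for vim_isfilec(), assuming a Windows-style
 * 'isfname' in which the backslash itself is a filename character. */
static int sketch_isfilec(unsigned char c)
{
    if (c >= 0x80)
        return 1;
    if (c == '\0')
        return 0;
    return isalnum(c) || strchr("/\\.-_+,#$%~=:", c) != NULL;
}

/* Mirrors the proposed check: the backslash at str[0] is only treated as
 * removable when the following byte is ASCII and is a space or some other
 * non-filename character (but not NUL, '*' or '?'). */
static int sketch_rem_backslash(const unsigned char *str)
{
    return str[0] == '\\'
        && is_ascii_byte(str[1])
        && (str[1] == ' '
            || (str[1] != '\0'
                && str[1] != '*'
                && str[1] != '?'
                && !sketch_isfilec(str[1])));
}
```

Applied to a byte string shaped like the example path, with double-byte characters whose second byte is 0x5C standing in for "[B]", "[C]" and "[F]", only the backslash that is followed by a space is reported as removable under these assumptions. All the others, including the trail bytes, are kept.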
                    • Bram Moolenaar
                      Message 10 of 10, Aug 4, 2000
                        Yasuhiro Matsumoto wrote:

                        > I think eough.
                        > If I liken multi-byte characters to "[A][B][C][D][E][F]",
                        > it will work as below.("[B]" and "[C]", [F] contain backslash)
                        >
                        > ex: C:\[A][B]\ [C]\I_love_[D][E][F]s.txt

                        You are right, when [F] is a double-byte character and the second byte is a
                        backslash: "[x\]s" the backslash would be kept.

                        I see one remaining problem at [C], but I'm not sure if it actually happens:

                        [x\]\

                        The double backslash would be reduced to one. Hmm, the comment about this
                        appears to be wrong, a double backslash would be kept. It would work OK then.

                        --
                        To be rich is not the end, but only a change of worries.

                        /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
                        \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///