
Re: Vim can't use filename having MBYTE

  • mattn@mail.goo.ne.jp
    Message 1 of 10 , Jul 31, 2000
      Bram@... wrote:
      > This function is very tricky. It was tuned carefully to remove
      > backslashes where wanted, and keep them where needed.
      >
      > It looks like this patch doesn't work when "has_mbyte" is not set.

      Oh, my mistake.
      ------------------------------- (ex_docmd.c, around line 3454)
      #ifdef BACKSLASH_IN_FILENAME
      return (str[0] == '\\'
      #ifdef FEAT_MBYTE
      && (!has_mbyte || (has_mbyte && (mb_ptr2len_check(str+1) == 1)))
      #endif
      && (str[1] == ' '
      || (str[1] != NUL
      && str[1] != '*'
      && str[1] != '?'
      && !vim_isfilec(str[1]))));
      #else
      return (str[0] == '\\' && str[1] != NUL);
      #endif
      -----------------------------------
      > Can you please explain what your change intends to do? I believe all
      > multi-byte characters are already marked as filename characters. Thus when a
      > multi-byte character is preceded with a backslash the backslash should be
      > kept, since a file name can start with a multi-byte character, right?

      It seems I misunderstood.
      The problem is not only in this part.
      The cause is that "expand_wildcard" is called before
      "set_init_3". This means the multi-byte table has not
      been initialized yet.
      Thus, when file-name arguments are analysed,
      multi-byte characters cannot be recognized correctly.

      I attach a diff file to this mail
      (it changes the order of the function calls).
      But I am not certain about it.
      I would welcome anyone's suggestions.
      -------------------------------------
      Yasuhiro Matsumoto
    • Bram Moolenaar
      Message 2 of 10 , Aug 1, 2000
        Yasuhiro Matsumoto wrote:

        > The problem is not only in this part.
        > The cause is that "expand_wildcard" is called before
        > "set_init_3". This means the multi-byte table has not
        > been initialized yet.

        Ah, thus vim_isfilec() is called before the chartab[] table has been
        initialized. That's wrong.

        > Thus, when file-name arguments are analysed, multi-byte characters
        > cannot be recognized correctly.

        Right. But has_mbyte will also not be set. Unless you specified it on the
        command line, which is not likely to happen.

        > I attach a diff file to this mail (it changes the order of the function calls).
        > But I am not certain about it.
        > I would welcome anyone's suggestions.

        Changing the order of initializations is very dangerous. In this case the
        vimrc file can do just about anything with the files specified on the command
        line, thus it should be run with a valid set of arguments. But since 'isfname'
        can be set in the vimrc, expanding may work differently after it.

        This is a tough chicken-egg problem. I really don't know a good solution.
        Perhaps it would be sufficient to initialise chartab[] first, assuming that
        all characters above 0x80 are filename characters?
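
The early initialisation suggested above could look roughly like the following. This is only a sketch under stated assumptions: `demo_chartab`, `CT_FNAME`, `demo_init_chartab_default()` and `demo_isfilec()` are hypothetical stand-ins, not Vim's real table or functions, and the set of ASCII filename characters is an illustrative guess rather than the actual 'isfname' default. It also treats 0x80 itself as a filename byte.

```c
#include <string.h>

#define CT_FNAME 0x01   /* hypothetical flag: byte may appear in a file name */

/* Hypothetical stand-in for the chartab[] table discussed in the thread. */
static unsigned char demo_chartab[256];

/* Fill the table with conservative defaults before any vimrc is sourced:
 * a small ASCII set plus every byte from 0x80 upward. */
static void demo_init_chartab_default(void)
{
    int c;

    memset(demo_chartab, 0, sizeof(demo_chartab));
    for (c = 0; c < 256; ++c) {
        if ((c >= '0' && c <= '9')
                || (c >= 'A' && c <= 'Z')
                || (c >= 'a' && c <= 'z')
                || c == '_' || c == '.' || c == '-' || c == '/'
                || c >= 0x80)               /* assume all high bytes are filename bytes */
            demo_chartab[c] |= CT_FNAME;
    }
}

/* Lookup that is safe to call as soon as the defaults are in place. */
static int demo_isfilec(int c)
{
    return c >= 0 && c < 256 && (demo_chartab[c] & CT_FNAME) != 0;
}
```

A later 'isfname' setting from the vimrc would still overwrite these defaults, so expansion done before and after sourcing the vimrc could differ, which is the chicken-egg problem described above.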

        --
        hundred-and-one symptoms of being an internet addict:
        56. You leave the modem speaker on after connecting because you think it
        sounds like the ocean wind...the perfect soundtrack for "surfing the net".

        /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
        \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
      • mattn@mail.goo.ne.jp
        Message 3 of 10 , Aug 1, 2000
          Bram@... wrote:
          > Changing the order of initializations is very dangerous. In this case the
          > vimrc file can do just about anything with the files specified on the command
          > line, thus it should be run with a valid set of arguments. But since 'isfname'
          > can be set in the vimrc, expanding may work differently after it.

          I hope that p_cc can be set before "rem_backslash" is called.
          I do "set fileencoding=japan" in my vimrc. Is that too late?

          > This is a tough chicken-egg problem. I really don't know a good solution.
          > Perhaps it would be sufficient to initialise chartab[] first, assuming that
          > all characters above 0x80 are filename characters?

          Japanese users work with several character sets,
          such as "shift_jis" and "euc-jp".
          Japanese MS-Windows uses "shift_jis", while UNIX systems use "euc-jp".
          Both are composed of a lead byte and a trail byte.

          With "shift_jis", the trail byte may be a backslash (0x5C).
          With "euc-jp", the trail byte is never a backslash.
          Thus the trail byte has to be skipped according to the lead byte and the encoding.
          However, both have in common that the lead byte is non-ASCII.
          So how about the solution below?

          If a backslash is an escaping backslash (that is, not a path separator),
          the next character should be a single-byte character, which means ASCII.

          ---------------------------------------------
          rem_backslash(str)
          char_u *str;
          {
          #ifdef BACKSLASH_IN_FILENAME
          return (str[0] == '\\'
          #ifdef FEAT_MBYTE
          && isascii(str[1])
          #endif
          && (str[1] == ' '
          || (str[1] != NUL
          && str[1] != '*'
          && str[1] != '?'
          && !vim_isfilec(str[1]))));
          #else
          return (str[0] == '\\' && str[1] != NUL);
          #endif
          }
          ---------------------------------------------

          Yasuhiro Matsumoto
        • Bram Moolenaar
          Message 4 of 10 , Aug 3, 2000
            Yasuhiro Matsumoto wrote:

            > Bram@... wrote:
            > > Changing the order of initializations is very dangerous. In this case the
            > > vimrc file can do just about anything with the files specified on the
            > > command line, thus it should be run with a valid set of arguments. But
            > > since 'isfname' can be set in the vimrc, expanding may work differently
            > > after it.
            >
            > I hope that p_cc can be set before "rem_backslash" is called.
            > I do "set fileencoding=japan" in my vimrc. Is that too late?

            You would set 'fileencoding' or 'charcode' (which are really the same thing)
            in your vimrc file. But in your vimrc file you may also want to check which
            arguments Vim got and they need to be expanded before sourcing the vimrc file.
            That won't work.

            A very clever solution would be not to expand wildcards in the arguments until
            they are used. Thus if you would set 'charcode' in your vimrc and then access
            an argument, the expansion would be done right there. I'm not sure if this is
            really possible though.
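
The deferred expansion described above can be sketched as a cache that is filled on first access. Everything here is hypothetical: `lazy_arg`, `expand_fn`, `lazy_arg_get()` and `demo_expand()` are illustrative names, and `demo_expand()` only copies its input where a real implementation would expand wildcards using whatever encoding settings are in effect at that moment.

```c
#include <stdlib.h>
#include <string.h>

/* One command-line argument whose expansion is postponed until first use. */
typedef struct {
    const char *raw;        /* argument exactly as given on the command line */
    char       *expanded;   /* cached result, NULL until first requested */
} lazy_arg;

/* Hypothetical expander type: returns a malloc'ed string or NULL. */
typedef char *(*expand_fn)(const char *raw);

/* Expand on first access only, so settings made in between are honoured. */
static const char *lazy_arg_get(lazy_arg *arg, expand_fn expand)
{
    if (arg->expanded == NULL)
        arg->expanded = expand(arg->raw);
    return arg->expanded;
}

/* Demo expander: copies the input and counts how often it was called. */
static int demo_expand_calls = 0;

static char *demo_expand(const char *raw)
{
    size_t n = strlen(raw) + 1;
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, raw, n);
    ++demo_expand_calls;
    return copy;
}
```

With this shape, a vimrc that never touches the arguments triggers no expansion before it finishes, while one that does touch them gets the expansion under the settings active at that point.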

            > > This is a tough chicken-egg problem. I really don't know a good solution.
            > > Perhaps it would be sufficient to initialise chartab[] first, assuming that
            > > all characters above 0x80 are filename characters?
            >
            > Japanese users work with several character sets,
            > such as "shift_jis" and "euc-jp".
            > Japanese MS-Windows uses "shift_jis", while UNIX systems use "euc-jp".
            > Both are composed of a lead byte and a trail byte.
            >
            > With "shift_jis", the trail byte may be a backslash (0x5C).
            > With "euc-jp", the trail byte is never a backslash.
            > Thus the trail byte has to be skipped according to the lead byte and the
            > encoding. However, both have in common that the lead byte is non-ASCII.

            I see the problem. It's near to impossible to guess that a backslash is
            really the second byte of a multi-byte character. It might have been an
            ISO-8859-1 character followed by a backslash.
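
The ambiguity can be made concrete with a small scanner. This is a sketch, not Vim code: `demo_enc`, `demo_sjis_lead()` and `demo_is_real_backslash()` are hypothetical names, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are the commonly used Shift_JIS ranges including vendor extensions. The byte pair 0xE0 0x5C is one double-byte character under Shift_JIS, but a Latin-1 letter followed by a backslash under ISO-8859-1.

```c
#include <stddef.h>

enum demo_enc { DEMO_LATIN1, DEMO_SJIS };

/* Assumed Shift_JIS lead-byte ranges, including vendor extensions. */
static int demo_sjis_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Return 1 when the byte at index pos is a stand-alone backslash, that is,
 * a backslash that is not the trail byte of a double-byte character.
 * The string is walked from the start because a trail byte can only be
 * recognised by knowing where the preceding character began. */
static int demo_is_real_backslash(const unsigned char *s, size_t pos,
                                  enum demo_enc enc)
{
    size_t i = 0;

    while (i < pos && s[i] != 0) {
        if (enc == DEMO_SJIS && demo_sjis_lead(s[i]) && s[i + 1] != 0)
            i += 2;     /* skip lead byte and trail byte together */
        else
            i += 1;
    }
    /* i overshoots pos when pos fell inside a double-byte character. */
    return i == pos && s[i] == '\\';
}
```

The same bytes give opposite answers depending on `enc`, which is why the result cannot be known before the encoding has been set.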

            > So how about the solution below?

            It might solve it in some cases, but not when the double-byte character that
            contains a backslash is followed by an ascii character. I don't think this is
            reliable enough.

            --
            hundred-and-one symptoms of being an internet addict:
            78. You find yourself dialing IP numbers on the phone.

            /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
            \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
          • mattn@mail.goo.ne.jp
            Message 5 of 10 , Aug 3, 2000
              Bram@... wrote:
              > > So how about the solution below?
              >
              > It might solve it in some cases, but not when the double-byte character that
              > contains a backslash is followed by an ascii character. I don't think this is
              > reliable enough.

              ---------------------------------------------
              rem_backslash(str)
              char_u *str;
              {
              #ifdef BACKSLASH_IN_FILENAME
              return (str[0] == '\\'
              #ifdef FEAT_MBYTE
              && isascii(str[1])
              #endif
              && (str[1] == ' '
              || (str[1] != NUL
              && str[1] != '*'
              && str[1] != '?'
              && !vim_isfilec(str[1]))));
              #else
              return (str[0] == '\\' && str[1] != NUL);
              #endif
              }
              ---------------------------------------------

              I think it is enough.
              If I write the multi-byte characters as "[A][B][C][D][E][F]",
              it will work as below ("[B]", "[C]" and "[F]" contain a backslash).

              ex: C:\[A][B]\ [C]\I_love_[D][E][F]s.txt
              (1)
              (2)
              (3)
              (4)
              (5)
              (6)

              In this case there are six check-points.
              (Is the one you mentioned (6)?)

              (1) The next character is ASCII, but it is not "*", "?" or " ".
              ---> this is not an escaping backslash.

              (2) The next character is non-ASCII.
              ---> this is not an escaping backslash.

              (3) The next character is ASCII, and it is " ".
              ---> this is an escaping backslash.

              (4) The next character is ASCII, but it is not "*", "?" or " ".
              ---> this is not an escaping backslash.

              (5) The next character is ASCII, but it is not "*", "?" or " ".
              ---> this is not an escaping backslash.

              (6) The next character is ASCII, but it is not "*", "?" or " ".
              ---> this is not an escaping backslash.
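
The check-points above can be tried out with a self-contained version of the proposed test. This is a sketch under assumptions: `sketch_isfilec()` is a hypothetical stand-in for `vim_isfilec()` that accepts ASCII letters, digits, a few punctuation characters (including the backslash, as a guess at a Windows-style 'isfname') and every byte from 0x80 upward, and `str[1] < 0x80` stands in for `isascii(str[1])`.

```c
#include <string.h>

/* Hypothetical stand-in for vim_isfilec(); the accepted set is an assumption. */
static int sketch_isfilec(int c)
{
    if (c >= 0x80)
        return 1;
    if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
        return 1;
    return c != '\0' && strchr("/\\.-_:", c) != NULL;
}

/* Return 1 when the backslash at str[0] should be removed, following the
 * BACKSLASH_IN_FILENAME branch of the proposal in this message. */
static int sketch_rem_backslash(const unsigned char *str)
{
    return str[0] == '\\'
        && str[1] < 0x80                    /* next byte must be ASCII */
        && (str[1] == ' '
            || (str[1] != '\0'
                && str[1] != '*'
                && str[1] != '?'
                && !sketch_isfilec(str[1])));
}
```

Under these assumptions a backslash before a space is removed, while a backslash before a letter, a wildcard, another backslash, or a non-ASCII byte is kept.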
            • Bram Moolenaar
              Message 6 of 10 , Aug 4, 2000
                Yasuhiro Matsumoto wrote:

                > I think it is enough.
                > If I write the multi-byte characters as "[A][B][C][D][E][F]",
                > it will work as below ("[B]", "[C]" and "[F]" contain a backslash).
                >
                > ex: C:\[A][B]\ [C]\I_love_[D][E][F]s.txt

                You are right, when [F] is a double-byte character and the second byte is a
                backslash: "[x\]s" the backslash would be kept.

                I see one remaining problem at [C], but I'm not sure if it actually happens:

                [x\]\

                The double backslash would be reduced to one. Hmm, the comment about this
                appears to be wrong, a double backslash would be kept. It would work OK then.

                --
                To be rich is not the end, but only a change of worries.

                /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
                \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///