Re: Vim can't use filename having MBYTE

  • mattn@mail.goo.ne.jp
    Message 1 of 10 , Aug 1, 2000
      Bram@... wrote:
      > Changing the order of initializations is very dangerous. In this case the
      > vimrc file can do just about anything with the file specified on the command
      > line, thus it should be run with a valid set of arguments. But since 'isfname'
      > can be set in the vimrc, expanding may work differently after it.

      I was hoping that p_cc could be set before "rem_backslash" is called.
      I do "set fileencoding=japan" in my vimrc. Is that too late?

      > This is a tough chicken-egg problem. I really don't know a good solution.
      > Perhaps it would be sufficient to initialise chartab[] first, assuming that
      > all characters above 0x80 are filename characters?

      Japanese users work with several character sets, such as "shift_jis" and
      "euc-jp". Japanese MS-Windows uses "shift_jis", while there are UNIX systems
      that use "euc-jp". Both are built from a lead byte and a trail byte.

      With "shift_jis", the trail byte may be a backslash (0x5C).
      With "euc-jp", the trail byte is never a backslash.
      Thus the trail byte has to be skipped according to the lead byte and the
      encoding in use. However, both encodings share one property: the lead byte
      is non-ASCII. So how about the solution below?

      If a backslash is used as an escape character (that is, not as a path
      separator), the character after it should be a single-byte character,
      in other words ASCII.

      ---------------------------------------------
      rem_backslash(str)
          char_u *str;
      {
      #ifdef BACKSLASH_IN_FILENAME
          return (str[0] == '\\'
      #ifdef FEAT_MBYTE
                  && isascii(str[1])
      #endif
                  && (str[1] == ' '
                      || (str[1] != NUL
                          && str[1] != '*'
                          && str[1] != '?'
                          && !vim_isfilec(str[1]))));
      #else
          return (str[0] == '\\' && str[1] != NUL);
      #endif
      }
      ---------------------------------------------

      Yasuhiro Matsumoto
    • Bram Moolenaar
      Message 2 of 10 , Aug 3, 2000
        Yasuhiro Matsumoto wrote:

        > Bram@... wrote:
        > > Changing the order of initializations is very dangerous. In this case the
        > > vimrc file can do just about anything with the file specified on the
        > > command line, thus it should be run with a valid set of arguments. But
        > > since 'isfname' can be set in the vimrc, expanding may work differently
        > > after it.
        >
        > I was hoping that p_cc could be set before "rem_backslash" is called.
        > I do "set fileencoding=japan" in my vimrc. Is that too late?

        You would set 'fileencoding' or 'charcode' (which are really the same thing)
        in your vimrc file. But in your vimrc file you may also want to check which
        arguments Vim got and they need to be expanded before sourcing the vimrc file.
        That won't work.

        A very clever solution would be not to expand wildcards in the arguments until
        they are used. Thus if you would set 'charcode' in your vimrc and then access
        an argument, the expansion would be done right there. I'm not sure if this is
        really possible though.

        > > This is a tough chicken-egg problem. I really don't know a good solution.
        > > Perhaps it would be sufficient to initialise chartab[] first, assuming that
        > > all characters above 0x80 are filename characters?
        >
        > Japanese users work with several character sets, such as "shift_jis" and
        > "euc-jp". Japanese MS-Windows uses "shift_jis", while there are UNIX
        > systems that use "euc-jp". Both are built from a lead byte and a trail byte.
        >
        > With "shift_jis", the trail byte may be a backslash (0x5C).
        > With "euc-jp", the trail byte is never a backslash.
        > Thus the trail byte has to be skipped according to the lead byte and the
        > encoding in use. However, both encodings share one property: the lead
        > byte is non-ASCII.

        I see the problem. It's nearly impossible to tell whether a backslash is
        really the second byte of a multi-byte character. It might just as well be
        an ISO-8859-1 character followed by a backslash.

        > So how about the solution below?

        It might solve it in some cases, but not when the double-byte character that
        contains a backslash is followed by an ascii character. I don't think this is
        reliable enough.

        --
        hundred-and-one symptoms of being an internet addict:
        78. You find yourself dialing IP numbers on the phone.

        /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
        \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///
      • mattn@mail.goo.ne.jp
        Message 3 of 10 , Aug 3, 2000
          Bram@... wrote:
          > > So how about the solution below?
          >
          > It might solve it in some cases, but not when the double-byte character that
          > contains a backslash is followed by an ascii character. I don't think this is
          > reliable enough.

          ---------------------------------------------
          rem_backslash(str)
              char_u *str;
          {
          #ifdef BACKSLASH_IN_FILENAME
              return (str[0] == '\\'
          #ifdef FEAT_MBYTE
                      && isascii(str[1])
          #endif
                      && (str[1] == ' '
                          || (str[1] != NUL
                              && str[1] != '*'
                              && str[1] != '?'
                              && !vim_isfilec(str[1]))));
          #else
              return (str[0] == '\\' && str[1] != NUL);
          #endif
          }
          ---------------------------------------------

          I think it is enough.
          If I write multi-byte characters as "[A][B][C][D][E][F]",
          it works as shown below ("[B]", "[C]" and "[F]" contain a backslash).

          ex: C:\[A][B]\ [C]\I_love_[D][E][F]s.txt
          (1)
          (2)
          (3)
          (4)
          (5)
          (6)

          In this case there are six check points.
          (Is the case you mentioned (6)?)

          (1) The next character is ASCII, but it is not "*", "?" or " ".
          ---> this is not an escaping backslash.

          (2) The next character is non-ASCII.
          ---> this is not an escaping backslash.

          (3) The next character is ASCII, and it is " ".
          ---> this is an escaping backslash.

          (4) The next character is ASCII, but it is not "*", "?" or " ".
          ---> this is not an escaping backslash.

          (5) The next character is ASCII, but it is not "*", "?" or " ".
          ---> this is not an escaping backslash.

          (6) The next character is ASCII, but it is not "*", "?" or " ".
          ---> this is not an escaping backslash.
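
The six check points above can be traced with a small stand-alone sketch. It is not Vim's source: `is_file_char` is a hypothetical stand-in for `vim_isfilec` (assuming a Windows-like 'isfname' in which the backslash and every byte from 0x80 up count as filename characters), `rem_backslash_sketch` mirrors the BACKSLASH_IN_FILENAME branch with the proposed ASCII test written as `< 0x80`, and the byte values chosen for [A] to [F] are purely illustrative.

```c
#include <ctype.h>
#include <string.h>

typedef unsigned char char_u;   /* stand-in for Vim's char_u */
#define NUL '\0'

/* Hypothetical replacement for vim_isfilec(): ASSUMES that letters, digits,
 * some punctuation (including the backslash) and all bytes >= 0x80 are
 * filename characters. The real answer depends on 'isfname'. */
static int is_file_char(int c)
{
    if (c == NUL)
        return 0;
    return c >= 0x80 || isalnum(c) || strchr("/\\.-_+,#$%~=:", c) != NULL;
}

/* Sketch of the proposed check: the backslash at str[0] is treated as an
 * escape only when the byte after it is ASCII and is either a space or a
 * character that cannot appear in a filename (other than '*' and '?'). */
static int rem_backslash_sketch(const char_u *str)
{
    return str[0] == '\\'
        && str[1] < 0x80                    /* proposed multi-byte guard */
        && (str[1] == ' '
            || (str[1] != NUL
                && str[1] != '*'
                && str[1] != '?'
                && !is_file_char(str[1])));
}

/* Count the backslashes in path that the sketch would treat as escapes. */
static int count_escaping(const char_u *path)
{
    int n = 0;

    for (; *path != NUL; ++path)
        if (rem_backslash_sketch(path))
            ++n;
    return n;
}

/* C:\[A][B]\ [C]\I_love_[D][E][F]s.txt with illustrative byte values;
 * [B], [C] and [F] end in 0x5C, i.e. their trail byte is a backslash. */
static const char_u example_path[] = {
    'C', ':', '\\',
    0x83, 0x41,                             /* [A] */
    0x83, 0x5C,                             /* [B] */
    '\\', ' ',
    0x95, 0x5C,                             /* [C] */
    '\\',
    'I', '_', 'l', 'o', 'v', 'e', '_',
    0x83, 0x42,                             /* [D] */
    0x83, 0x43,                             /* [E] */
    0x83, 0x5C,                             /* [F] */
    's', '.', 't', 'x', 't', NUL
};
```

Under these assumptions `count_escaping(example_path)` finds a single escaping backslash, the one in front of the space, which corresponds to check point (3).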
        • Bram Moolenaar
          Message 4 of 10 , Aug 4, 2000
            Yasuhiro Matsumoto wrote:

            > I think it is enough.
            > If I write multi-byte characters as "[A][B][C][D][E][F]",
            > it works as shown below ("[B]", "[C]" and "[F]" contain a backslash).
            >
            > ex: C:\[A][B]\ [C]\I_love_[D][E][F]s.txt

            You are right, when [F] is a double-byte character and the second byte is a
            backslash: "[x\]s" the backslash would be kept.

            I see one remaining problem at [C], but I'm not sure if it actually happens:

            [x\]\

            The double backslash would be reduced to one. Hmm, the comment about this
            appears to be wrong, a double backslash would be kept. It would work OK then.

            --
            To be rich is not the end, but only a change of worries.

            /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
            \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///