Error when matching a null character with NFA engine

  • Jonathon Merz
    Message 1 of 7 , Sep 19, 2013
      With the new regexp engine, when searching for a null character using decimal/octal/hex character matches, all lines are matched instead of only the specified character.

      The attached .txt file (ok to attach I hope) has a null character (represented as "^@") in the first line.

      Using the new engine:
          \%#=2\%d0
      Matches all lines in the file, but using:
          \%#=1\%d0
      matches only the null character as expected.

      The same goes for using \%o and \%x to match characters specified in octal and hex.

      --
      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php
       
      ---
      You received this message because you are subscribed to the Google Groups "vim_dev" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_dev+unsubscribe@....
      For more options, visit https://groups.google.com/groups/opt_out.
    • Christian Brabandt
      Message 2 of 7 , Sep 19, 2013
        On Thu, 19 Sep 2013, Jonathon Merz wrote:

        > With the new regexp engine, when searching for a null character using
        > decimal/octal/hex character matches, all lines are matched instead of only
        > the specified character.
        >
        > The attached .txt file (ok to attach I hope) has a null character
        > (represented as "^@") in the first line.
        >
        > Using the new engine:
        > \%#=2\%d0
        > Matches all lines in the file, but using:

        That doesn't match anything for me.

        > \%#=1\%d0
        > matches only the null character as expected.
        >
        > The same goes for using \%o and \%x to match characters specified in octal
        > and hex.

        This patch fixes it:
        diff --git a/src/regexp_nfa.c b/src/regexp_nfa.c
        --- a/src/regexp_nfa.c
        +++ b/src/regexp_nfa.c
        @@ -1385,7 +1385,7 @@
                         _("E678: Invalid character after %s%%[dxouU]"),
                             reg_magic == MAGIC_ALL);
                     /* TODO: what if a composing character follows? */
        -            EMIT(nr);
        +            EMIT(nr == 0 ? 0x0a : nr);
                 }
                 break;
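
        Why 0x0a rather than 0: Vim's pattern documentation notes that NUL characters in a file are stored as NL (0x0a) in memory, so a pattern that names code point 0 has to look for 0x0a in the buffer line. A minimal C sketch of that mapping is below. `pattern_char_for` and `count_matches` are hypothetical helpers for illustration only, not functions from Vim's source, and the byte-wise loop ignores multi-byte characters.

```c
/* Hypothetical helper (not part of Vim): returns the byte a matcher should
 * look for when a pattern names code point `nr` via \%d, \%o or \%x. */
int pattern_char_for(int nr)
{
    /* A NUL read from a file is kept as NL (0x0a) in memory, so code
     * point 0 has to be searched for as 0x0a. Other values are unchanged. */
    return nr == 0 ? 0x0a : nr;
}

/* Hypothetical helper: counts how often the requested code point occurs in
 * one in-memory, NUL-terminated line. Byte-wise only, no multi-byte support. */
int count_matches(const char *line, int nr)
{
    int want = pattern_char_for(nr);
    int n = 0;

    for (; *line != '\0'; ++line)
        if ((unsigned char)*line == want)
            n++;
    return n;
}
```

        With these helpers, count_matches("yes\x0ano", 0) returns 1, because the in-memory stand-in for the NUL is found exactly once, while count_matches("yes no", 0) returns 0.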

        regards,
        Christian
        --
        If the farmer's barn is full, he pays a bit more in tax.

      • Bram Moolenaar
        Message 3 of 7 , Sep 19, 2013
          Jonathon Merz wrote:

          > With the new regexp engine, when searching for a null character using
          > decimal/octal/hex character matches, all lines are matched instead of only
          > the specified character.
          >
          > The attached .txt file (ok to attach I hope) has a null character
          > (represented as "^@") in the first line.
          >
          > Using the new engine:
          > \%#=2\%d0
          > Matches all lines in the file, but using:
          > \%#=1\%d0
          > matches only the null character as expected.
          >
          > The same goes for using \%o and \%x to match characters specified in octal
          > and hex.

          I can see \%d0 does not match the NUL, but I don't see it matching all
          lines.

          --
          You can tune a file system, but you can't tuna fish
          -- man tunefs

          /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
          /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
          \\\ an exciting new programming language -- http://www.Zimbu.org ///
          \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

        • Jonathon Merz
          Message 4 of 7 , Sep 19, 2013
            Oops, further clarification on that:

            1. Start vim with -u NONE -U NONE
            2. Repeat the search from my previous email
            3. Hit 'n' a few times - cursor hits first character of each line
            4. :set hlsearch
            5. The entirety of every line is highlighted

            I was running with hlsearch on when I originally found it and figured that the entire line was the match.  I still think that might be the case... if I do:
                %s/\%#=2\%d0/_/g
            The contents of every line (including empty lines) become an underscore, just as if I had done:
                %s/^.*$/_/g


            And Christian's patch does fix all of the above for me.


            Thanks,

            Jonathon


          • Bram Moolenaar
            Message 5 of 7 , Sep 20, 2013
              Christian Brabandt wrote:

              > This patch fixes it:

              Thanks! Guess what the bonus question is...


              --
              Eagles may soar, but weasels don't get sucked into jet engines.

            • Charles Campbell
              Message 6 of 7 , Sep 20, 2013
                Bram Moolenaar wrote:
                > Thanks! Guess what the bonus question is...
                >
                >
                Testy, testy, testy! :)
                Chip

              • Christian Brabandt
                Message 7 of 7 , Sep 21, 2013
                  Hi Bram!

                  On Fri, 20 Sep 2013, Bram Moolenaar wrote:

                  > Thanks! Guess what the bonus question is...

                  The bonus question is:
                  Why do we need 2 regexp engines? ;)


                  diff --git a/src/regexp_nfa.c b/src/regexp_nfa.c
                  --- a/src/regexp_nfa.c
                  +++ b/src/regexp_nfa.c
                  @@ -1385,7 +1385,7 @@
                                   _("E678: Invalid character after %s%%[dxouU]"),
                                       reg_magic == MAGIC_ALL);
                               /* TODO: what if a composing character follows? */
                  -            EMIT(nr);
                  +            EMIT(nr == 0 ? 0x0a : nr);
                           }
                           break;

                  diff --git a/src/testdir/test64.in b/src/testdir/test64.in
                  --- a/src/testdir/test64.in
                  +++ b/src/testdir/test64.in
                  @@ -373,6 +373,7 @@
                   :call add(tl, [2, '\%x20', 'yes no', ' '])
                   :call add(tl, [2, '\%u0020', 'yes no', ' '])
                   :call add(tl, [2, '\%U00000020', 'yes no', ' '])
                  +:call add(tl, [2, '\%d0', "yes\x0ano", "\x0a"])
                   :"
                   :""""" \%[abc]
                   :call add(tl, [2, 'foo\%[bar]', 'fobar'])
                  diff --git a/src/testdir/test64.ok b/src/testdir/test64.ok
                  --- a/src/testdir/test64.ok
                  +++ b/src/testdir/test64.ok
                  @@ -863,6 +863,9 @@
                   OK 0 - \%U00000020
                   OK 1 - \%U00000020
                   OK 2 - \%U00000020
                  +OK 0 - \%d0
                  +OK 1 - \%d0
                  +OK 2 - \%d0
                   OK 0 - foo\%[bar]
                   OK 1 - foo\%[bar]
                   OK 2 - foo\%[bar]


                  regards,
                  Christian
                  --
                  "I believe in a God!" This is a fine, praiseworthy saying;
                  but to recognize God, where and how he reveals himself, that is
                  truly bliss on earth.
                  -- Goethe, Maximen und Reflektionen, No. 539
