
Re: NFA regexp engine incorrectly matching pattern

  • Dominique Pellé
    Message 1 of 5 , Jun 14, 2013
      Lech Lorens wrote:

      > Search for \v.*\/(.*)\n.*\/\1$ in a file with the following contents.
      > This should find adjacent lines with paths to files with the same base name.
      > With re=2 I get the first pair of lines highlighted (which is
      > nonsense). With re=1 I get a correct highlighting of the other pair.
      >
      > #v+
      > ./Dir1/Dir2/Dir3/zyxwvuts.txt
      > ./Dir1/Dir2/Dir3/abcdefgh.bat
      >
      > ./Dir1/Dir2/Dir3/file1.txt
      > ./OtherDir1/OtherDir2/file1.txt
      > #v-
      >
      > This happens in 7.3.1184.
      >
      > Cheers,
      > Lech


      Valgrind finds an error when doing what you describe
      using Vim-7.3.1189 with re=2:

      ==12654== Conditional jump or move depends on uninitialised value(s)
      ==12654== at 0x54D823: sub_equal (regexp_nfa.c:3654)
      ==12654== by 0x54DA16: has_state_with_pos (regexp_nfa.c:3736)
      ==12654== by 0x54DD3E: addstate (regexp_nfa.c:3967)
      ==12654== by 0x54DEBC: addstate (regexp_nfa.c:4012)
      ==12654== by 0x5518B5: nfa_regmatch (regexp_nfa.c:6032)
      ==12654== by 0x551D35: nfa_regtry (regexp_nfa.c:6203)
      ==12654== by 0x5523A6: nfa_regexec_both (regexp_nfa.c:6387)
      ==12654== by 0x55281F: nfa_regexec_multi (regexp_nfa.c:6636)
      ==12654== by 0x552A1E: vim_regexec_multi (regexp.c:8073)
      ==12654== by 0x566D72: searchit (search.c:639)
      ==12654== by 0x568119: do_search (search.c:1356)
      ==12654== by 0x503952: normal_search (normal.c:6433)
      ==12654== by 0x5038A1: nv_search (normal.c:6400)
      ==12654== by 0x4FA744: normal_cmd (normal.c:1200)
      ==12654== by 0x5ED7C2: main_loop (main.c:1329)
      ==12654== by 0x5ED10B: main (main.c:1020)
      ==12654== Uninitialised value was created by a heap allocation
      ==12654== at 0x4C2C78F: malloc (vg_replace_malloc.c:270)
      ==12654== by 0x4E785B: lalloc (misc2.c:929)
      ==12654== by 0x54F640: nfa_regmatch (regexp_nfa.c:4952)
      ==12654== by 0x551D35: nfa_regtry (regexp_nfa.c:6203)
      ==12654== by 0x5523A6: nfa_regexec_both (regexp_nfa.c:6387)
      ==12654== by 0x55281F: nfa_regexec_multi (regexp_nfa.c:6636)
      ==12654== by 0x552A1E: vim_regexec_multi (regexp.c:8073)
      ==12654== by 0x566D72: searchit (search.c:639)
      ==12654== by 0x568119: do_search (search.c:1356)
      ==12654== by 0x503952: normal_search (normal.c:6433)
      ==12654== by 0x5038A1: nv_search (normal.c:6400)
      ==12654== by 0x4FA744: normal_cmd (normal.c:1200)
      ==12654== by 0x5ED7C2: main_loop (main.c:1329)
      ==12654== by 0x5ED10B: main (main.c:1020)
      (more errors after that)

      I have not found how to fix it. However, this line of code
      looks suspicious in function sub_equal():

      3629 todo = sub1->in_use > sub2->in_use ? sub1->in_use : sub2->in_use;

      I think it should be...

      3629 todo = sub1->in_use < sub2->in_use ? sub1->in_use : sub2->in_use;

      ... but that still does not solve it anyway.
      sub1->list.multi[i].end.lnum is accessed and is uninitialized.

      Dominique

      --
      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_dev" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_dev+unsubscribe@....
      For more options, visit https://groups.google.com/groups/opt_out.
    • Bram Moolenaar
      Message 2 of 5 , Jun 14, 2013
        Lech Lorens wrote:

        > Search for \v.*\/(.*)\n.*\/\1$ in a file with the following contents.
        > This should find adjacent lines with paths to files with the same base name.
        > With re=2 I get the first pair of lines highlighted (which is
        > nonsense). With re=1 I get a correct highlighting of the other pair.
        >
        > #v+
        > ./Dir1/Dir2/Dir3/zyxwvuts.txt
        > ./Dir1/Dir2/Dir3/abcdefgh.bat
        >
        > ./Dir1/Dir2/Dir3/file1.txt
        > ./OtherDir1/OtherDir2/file1.txt
        > #v-
        >
        > This happens in 7.3.1184.

        Back references in a previous like was not implemented yet.
        Thanks for providing an example to test with.
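
For testing, the expected result from the report can be stated independently of either regexp engine. The following is a minimal sketch in C, not code from Vim: `base_name()` and `same_base_name()` are hypothetical helpers that say whether two adjacent lines should be reported by the pattern, assuming the intent is "both lines contain a '/' and the text after the last '/' is identical".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the part of `path` after the last '/', or NULL if there is none.
 * This approximates what the greedy `.*\/(.*)` captures on one line. */
static const char *base_name(const char *path)
{
    const char *slash;

    assert(path != NULL);
    slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : NULL;
}

/* Hypothetical helper: nonzero when two adjacent lines should be found
 * by the pattern, i.e. both contain a '/' and share the same base name. */
static int same_base_name(const char *line1, const char *line2)
{
    const char *b1 = base_name(line1);
    const char *b2 = base_name(line2);

    return b1 != NULL && b2 != NULL && strcmp(b1, b2) == 0;
}
```

Applied to the sample file, only the `file1.txt` pair should qualify, which is what re=1 highlights.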

        --
        hundred-and-one symptoms of being an internet addict:
        197. Your desk collapses under the weight of your computer peripherals.

        /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
        /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
        \\\ an exciting new programming language -- http://www.Zimbu.org ///
        \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

      • Bram Moolenaar
        Message 3 of 5 , Jun 14, 2013
          I wrote:

          > Lech Lorens wrote:
          >
          > > Search for \v.*\/(.*)\n.*\/\1$ in a file with the following contents.
          > > This should find adjacent lines with paths to files with the same base name.
          > > With re=2 I get the first pair of lines highlighted (which is
          > > nonsense). With re=1 I get a correct highlighting of the other pair.
          > >
          > > #v+
          > > ./Dir1/Dir2/Dir3/zyxwvuts.txt
          > > ./Dir1/Dir2/Dir3/abcdefgh.bat
          > >
          > > ./Dir1/Dir2/Dir3/file1.txt
          > > ./OtherDir1/OtherDir2/file1.txt
          > > #v-
          > >
          > > This happens in 7.3.1184.
          >
          > Back references in a previous like was not implemented yet.

          :s/like/line/

          > Thanks for providing an example to test with.

          --
          hundred-and-one symptoms of being an internet addict:
          201. When somebody asks you where you are, you tell them in which chat room.

          /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
          /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
          \\\ an exciting new programming language -- http://www.Zimbu.org ///
          \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

        • Bram Moolenaar
          Message 4 of 5 , Jun 14, 2013
            Dominique Pelle wrote:

            > Lech Lorens wrote:
            >
            > > Search for \v.*\/(.*)\n.*\/\1$ in a file with the following contents.
            > > This should find adjacent lines with paths to files with the same base name.
            > > With re=2 I get the first pair of lines highlighted (which is
            > > nonsense). With re=1 I get a correct highlighting of the other pair.
            > >
            > > #v+
            > > ./Dir1/Dir2/Dir3/zyxwvuts.txt
            > > ./Dir1/Dir2/Dir3/abcdefgh.bat
            > >
            > > ./Dir1/Dir2/Dir3/file1.txt
            > > ./OtherDir1/OtherDir2/file1.txt
            > > #v-
            > >
            > > This happens in 7.3.1184.
            > >
            > > Cheers,
            > > Lech
            >
            >
            > Valgrind finds an error when doing what you describe
            > using Vim-7.3.1189 with re=2:
            >
            > ==12654== Conditional jump or move depends on uninitialised value(s)
            > ==12654== at 0x54D823: sub_equal (regexp_nfa.c:3654)
            > ==12654== by 0x54DA16: has_state_with_pos (regexp_nfa.c:3736)
            > ==12654== by 0x54DD3E: addstate (regexp_nfa.c:3967)
            > ==12654== by 0x54DEBC: addstate (regexp_nfa.c:4012)
            > ==12654== by 0x5518B5: nfa_regmatch (regexp_nfa.c:6032)
            > ==12654== by 0x551D35: nfa_regtry (regexp_nfa.c:6203)
            > ==12654== by 0x5523A6: nfa_regexec_both (regexp_nfa.c:6387)
            > ==12654== by 0x55281F: nfa_regexec_multi (regexp_nfa.c:6636)
            > ==12654== by 0x552A1E: vim_regexec_multi (regexp.c:8073)
            > ==12654== by 0x566D72: searchit (search.c:639)
            > ==12654== by 0x568119: do_search (search.c:1356)
            > ==12654== by 0x503952: normal_search (normal.c:6433)
            > ==12654== by 0x5038A1: nv_search (normal.c:6400)
            > ==12654== by 0x4FA744: normal_cmd (normal.c:1200)
            > ==12654== by 0x5ED7C2: main_loop (main.c:1329)
            > ==12654== by 0x5ED10B: main (main.c:1020)
            > ==12654== Uninitialised value was created by a heap allocation
            > ==12654== at 0x4C2C78F: malloc (vg_replace_malloc.c:270)
            > ==12654== by 0x4E785B: lalloc (misc2.c:929)
            > ==12654== by 0x54F640: nfa_regmatch (regexp_nfa.c:4952)
            > ==12654== by 0x551D35: nfa_regtry (regexp_nfa.c:6203)
            > ==12654== by 0x5523A6: nfa_regexec_both (regexp_nfa.c:6387)
            > ==12654== by 0x55281F: nfa_regexec_multi (regexp_nfa.c:6636)
            > ==12654== by 0x552A1E: vim_regexec_multi (regexp.c:8073)
            > ==12654== by 0x566D72: searchit (search.c:639)
            > ==12654== by 0x568119: do_search (search.c:1356)
            > ==12654== by 0x503952: normal_search (normal.c:6433)
            > ==12654== by 0x5038A1: nv_search (normal.c:6400)
            > ==12654== by 0x4FA744: normal_cmd (normal.c:1200)
            > ==12654== by 0x5ED7C2: main_loop (main.c:1329)
            > ==12654== by 0x5ED10B: main (main.c:1020)
            > (more errors after that)
            >
            > I have not found how to fix it. However, this line of code
            > looks suspicious in function sub_equal():
            >
            > 3629 todo = sub1->in_use > sub2->in_use ? sub1->in_use : sub2->in_use;
            >
            > I think it should be...
            >
            > 3629 todo = sub1->in_use < sub2->in_use ? sub1->in_use : sub2->in_use;

            No, it was correct. Further down it checks the index against in_use.

            > ... but that still does not solve it anyway.
            > sub1->list.multi[i].end.lnum is accessed and is uninitialized.

            Yeah, I think the problem is that in_use is incremented when only the
            start position is set, the end position is still undefined.

            I think we can omit checking the end position. Two identical states
            that start at the same position must also end at the same position.
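
The idea can be sketched with a simplified model. This is not the code in regexp_nfa.c: the type names `pos_sk`, `span_sk`, `subs_sk`, the constant `NSUBEXP_SK` and the function `subs_equal()` are made up for illustration, and the sketch assumes a set start position never has line number -1. It loops up to the larger `in_use`, checks each index against `in_use` before reading the entry, and compares only start positions, so an end position that has not been set yet is never read.

```c
#include <assert.h>

#define NSUBEXP_SK 10   /* hypothetical capacity, for illustration only */

typedef struct {
    long lnum;          /* line number */
    int  col;           /* column */
} pos_sk;

typedef struct {
    pos_sk start;       /* set when the group is opened */
    pos_sk end;         /* may still be unset while the group is open */
} span_sk;

typedef struct {
    int     in_use;     /* number of entries whose start is set */
    span_sk multi[NSUBEXP_SK];
} subs_sk;

/* Return nonzero when both sets of submatches start at the same places.
 * End positions are deliberately ignored. */
static int subs_equal(const subs_sk *a, const subs_sk *b)
{
    int todo = a->in_use > b->in_use ? a->in_use : b->in_use;
    int i;

    assert(todo <= NSUBEXP_SK);
    for (i = 0; i < todo; ++i) {
        /* An index at or beyond in_use counts as "not set" (-1). */
        long lnum_a = i < a->in_use ? a->multi[i].start.lnum : -1;
        long lnum_b = i < b->in_use ? b->multi[i].start.lnum : -1;

        if (lnum_a != lnum_b)
            return 0;
        /* Equal and not -1 means both entries are in use, so reading
         * the columns is safe. */
        if (lnum_a != -1 && a->multi[i].start.col != b->multi[i].start.col)
            return 0;
    }
    return 1;
}
```

Using the larger `in_use` as the loop bound is what lets an entry set on only one side be detected as a difference; taking the smaller value would skip it.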

            --
            hundred-and-one symptoms of being an internet addict:
            200. You really believe in the concept of a "paperless" office.

            /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
            /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
            \\\ an exciting new programming language -- http://www.Zimbu.org ///
            \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
