
Re: extremely bad 7.4 regex performance for syntax highlighting

  • Ben Fritz
    Message 1 of 5 , Oct 9, 2013
      On Wednesday, October 9, 2013 10:09:05 AM UTC-5, Julian Taylor wrote:
      >
      > Is easytag just doing the highlight regex wrong or is this an issue in the new engine?
      >
      > the regex used looks like this:
      >
      > syntax match cFunctionTag /\C\<\%(cpl_column_get_mean_complex\|cx_tree_previous\|cpl_table_copy_data_int\|cpl_table_get_column_format\|cx_map_erase_range\|cpl_table_divide_columns ....
      >
      >

      I think this is a perfect example of a case where the old regex engine SHOULD do better. In this case, you have a bunch of literal matches with very little if any backtracking, and the number of possible matches is so large that there will be many many states added in the new engine, none of which will normally match.

      Bram, are you still planning on making the automatic selection of regex engine based on which one is probably going to be faster? This might be a good first type of pattern to trigger selection of the old engine.
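
      Until any automatic selection exists, the old engine can be chosen by hand. A minimal sketch, assuming a Vim 7.4 build with the 'regexpengine' option and the \%#=1 pattern prefix; the pattern is cut down to two of the names quoted above, and the closing \)\> is an assumption because the original pattern is truncated:

      ```vim
      " Force the old backtracking engine for all patterns
      " (0 = automatic selection, 1 = old engine, 2 = new NFA engine).
      set regexpengine=1

      " Or force the old engine for one pattern only: \%#=1 must come first.
      " The closing \)\> is assumed; the quoted pattern ends in "....".
      syntax match cFunctionTag /\%#=1\C\<\%(cpl_column_get_mean_complex\|cx_tree_previous\)\>/
      ```

      The per-pattern prefix leaves every other pattern on whatever engine 'regexpengine' selects, which may be preferable if only this one alternation is slow.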

      --
      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_dev" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_dev+unsubscribe@....
      For more options, visit https://groups.google.com/groups/opt_out.
    • Charles E Campbell
      Message 2 of 5 , Oct 9, 2013
        Ben Fritz wrote:
        > On Wednesday, October 9, 2013 10:09:05 AM UTC-5, Julian Taylor wrote:
        >> Is easytag just doing the highlight regex wrong or is this an issue in the new engine?
        >>
        >> the regex used looks like this:
        >>
        >> syntax match cFunctionTag /\C\<\%(cpl_column_get_mean_complex\|cx_tree_previous\|cpl_table_copy_data_int\|cpl_table_get_column_format\|cx_map_erase_range\|cpl_table_divide_columns ....
        >>
        >>
        > I think this is a perfect example of a case where the old regex engine SHOULD do better. In this case, you have a bunch of literal matches with very little if any backtracking, and the number of possible matches is so large that there will be many many states added in the new engine, none of which will normally match.
        >
        > Bram, are you still planning on making the automatic selection of regex engine based on which one is probably going to be faster? This might be a good first type of pattern to trigger selection of the old engine.

        A caveat: this looks like a situation where "syntax keyword" should be
        used instead of "syntax match".

        Regards,
        C Campbell
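
        The suggestion above could look roughly like this. It is a sketch only: it uses the six function names visible in the quoted pattern (the rest of the list is elided in the thread), and the highlight link at the end is hypothetical, since the thread does not show how cFunctionTag is highlighted.

        ```vim
        " ':syntax case match' is the default and corresponds to the \C
        " in the original pattern: keywords are matched case sensitively.
        syntax case match

        " Each name must consist only of 'iskeyword' characters, which holds
        " for C identifiers like these. Several names can share one command.
        syntax keyword cFunctionTag cpl_column_get_mean_complex cx_tree_previous
        syntax keyword cFunctionTag cpl_table_copy_data_int cpl_table_get_column_format
        syntax keyword cFunctionTag cx_map_erase_range cpl_table_divide_columns

        " Hypothetical: link the group to a standard highlight group.
        highlight default link cFunctionTag Function
        ```

        Keywords only match whole words, so the \< anchor of the original pattern is implied. Note that keywords take precedence over match and region items, which may change how these names are highlighted where they overlap with other syntax items.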

      • Bram Moolenaar
        Message 3 of 5 , Oct 9, 2013
          Ben Fritz wrote:

          > On Wednesday, October 9, 2013 10:09:05 AM UTC-5, Julian Taylor wrote:
          > >
          > > Is easytag just doing the highlight regex wrong or is this an issue in the new engine?
          > >
          > > the regex used looks like this:
          > >
          > > syntax match cFunctionTag /\C\<\%(cpl_column_get_mean_complex\|cx_tree_previous\|cpl_table_copy_data_int\|cpl_table_get_column_format\|cx_map_erase_range\|cpl_table_divide_columns ....
          > >
          > >
          >
          > I think this is a perfect example of a case where the old regex engine
          > SHOULD do better. In this case, you have a bunch of literal matches
          > with very little if any backtracking, and the number of possible
          > matches is so large that there will be many many states added in the
          > new engine, none of which will normally match.
          >
          > Bram, are you still planning on making the automatic selection of
          > regex engine based on which one is probably going to be faster? This
          > might be a good first type of pattern to trigger selection of the old
          > engine.

          Well, that's going to be a bit difficult if the problem is that the new
          engine uses more memory, going over a limit that causes swapping.
          Suppose you do the same on a machine with sufficient RAM, which engine
          is faster?
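
          One way to answer that question is to time the syntax patterns under each engine. A rough sketch, assuming a Vim build with the +profile feature (needed for :syntime) and assuming that syntax patterns must be redefined after changing 'regexpengine' for the new setting to take effect:

          ```vim
          " Pick the engine to test: 1 = old engine, 2 = new NFA engine.
          set regexpengine=1

          " Reload the syntax definitions so the patterns are compiled again
          " (assumption: the engine is chosen when a pattern is compiled).
          syntax off
          syntax enable

          " Collect timing data while the visible text is re-highlighted.
          syntime clear
          syntime on
          redraw!
          syntime off

          " Show the time spent per syntax pattern, slowest first.
          syntime report
          ```

          Running this once with regexpengine=1 and once with regexpengine=2 on the same file gives two reports to compare. Highlights that a plugin adds later (as easytag does) may need to be triggered again after the syntax reload.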

          Over time Vim has grown in its memory footprint, because it's often
          useful and machines have more memory these days. And RAM is cheap.

          It's true that the states of the new engine take quite a bit of memory.
          The main cause is the need to support all the regexp features that Vim
          has. It might be possible to reduce it a bit, but it's not easy to do
          this.

          --
          A mathematician is a device for turning coffee into theorems.
          Paul Erdos
          A computer programmer is a device for turning coffee into bugs.
          Bram Moolenaar

          /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
          /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
          \\\ an exciting new programming language -- http://www.Zimbu.org ///
          \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

        • Julian Taylor
          Message 4 of 5 , Oct 9, 2013
            On 09.10.2013 23:26, Bram Moolenaar wrote:
            >
            > Ben Fritz wrote:
            >
            >> On Wednesday, October 9, 2013 10:09:05 AM UTC-5, Julian Taylor wrote:
            >>>
            >>> Is easytag just doing the highlight regex wrong or is this an issue in the new engine?
            >>>
            >>> the regex used looks like this:
            >>>
            >>> syntax match cFunctionTag /\C\<\%(cpl_column_get_mean_complex\|cx_tree_previous\|cpl_table_copy_data_int\|cpl_table_get_column_format\|cx_map_erase_range\|cpl_table_divide_columns ....
            >>>
            >>>
            >>
            >> I think this is a perfect example of a case where the old regex engine
            >> SHOULD do better. In this case, you have a bunch of literal matches
            >> with very little if any backtracking, and the number of possible
            >> matches is so large that there will be many many states added in the
            >> new engine, none of which will normally match.
            >>
            >> Bram, are you still planning on making the automatic selection of
            >> regex engine based on which one is probably going to be faster? This
            >> might be a good first type of pattern to trigger selection of the old
            >> engine.
            >
            > Well, that's going to be a bit difficult if the problem is that the new
            > engine uses more memory, going over a limit that causes swapping.
            > Suppose you do the same on a machine with sufficient RAM, which engine
            > is faster?
            >

            It isn't swapping; it is just many soft page faults (writes to memory
            that is available in RAM but has not yet been mapped into the process's
            address space).
            This is due to the large number of fairly large allocations it performs.
            Increasing MALLOC_MMAP_THRESHOLD_ improves speed by 25% (it tells
            glibc to recycle more memory), but it's still bad compared to Vim 7.3.

            >> A caveat: this looks like a situation where "syntax keyword" should
            >> be used instead of "syntax match".

            If syntax keyword does not use the regex engine, it looks like a
            solution. I'll try it out and notify the easytag developer if it solves the
            issue.
