RE: Regexp help

  • Srinath Avadhanula
    Message 1 of 15 , Dec 1, 2003
      Hi Jonah,

      On Tue, 2 Dec 2003, jonah wrote:

      > The general problem cannot be solved with regular expressions.
      >
      > See, eg, http://science.slc.edu/~msiff/old-courses/compilers/notes/parse.html
      >
      > It is possible that vim's extended regular expression support offers some way to
      > do it, but I doubt it.

      Thanks for the answer. It was quite an eye-opener to know that
      parentheses balancing is unsolvable using regexps! On further googling,
      it looks like none of the common regexp engines can actually do this...

      Srinath
    • Srinath Avadhanula
      Message 2 of 15 , Dec 1, 2003
        On Tue, 2 Dec 2003, John Aldridge wrote:

        > At 22:30 12-01-2003, Srinath Avadhanula wrote:
        >
        > >Thanks for the answer. It was quite an eye-opener to know that
        > >parentheses balancing is unsolvable using regexps! On further googling,
        > >it looks like none of the common regexp engines can actually do this...
        >
        > For my curiosity, what were you hoping to do with such regexes?
        > Do you think you'll attempt a scripting solution?
        >
        I wanted a regexp solution for paren balancing specifically for use in
        the vim-latex project. I have a function which creates regexp based
        folds just like vim's native syntax foldmethod. That function requires
        a regexp start pattern and regexp end pattern to define the start and
        end of a fold region.

        The high level problem was to specify regexps to fold lines like:

        ------------------------%<------------------------
        \command{This is a command in latex \textbf{which}
        spans several lines and
        finally concludes \textbf{here}}
        ------------------------%<------------------------

        but avoid lines like

        ------------------------%<------------------------
        \command{This is a command in latex \textbf{which}}
        next line is not part of the command
        finally concludes \textbf{here}
        ------------------------%<------------------------

        As can be seen, the problem is to start a fold on a line which has
        unbalanced parentheses after the start of the command...

        It looks like checking whether a string has balanced parentheses or not
        is not really a difficult problem per se. It's just that the regexp
        solution is not possible. Unfortunately, I cannot use a parser based
        solution at the moment.
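The counting idea can be sketched outside Vim. What follows is a minimal Python illustration under stated assumptions, not the vim-latex implementation: the names `brace_depth` and `starts_fold` are hypothetical, and escaped braces (`\{`) and `%` comments are ignored, which a LaTeX-aware version would have to handle.

```python
def brace_depth(line: str) -> int:
    """Return how many '{' are still unclosed at the end of the line."""
    depth = 0
    for ch in line:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            # A stray '}' with nothing open is ignored in this sketch.
            depth -= 1
    return depth


def starts_fold(line: str, command: str = r"\command{") -> bool:
    """True if `command` opens on this line and is still open at its end."""
    start = line.find(command)
    if start == -1:
        return False
    # Count only from the start of the command onwards.
    return brace_depth(line[start:]) > 0
```

With the two sample lines from this message, `starts_fold` is true for the first `\command{...` line (one brace stays open) and false for the second (the command closes on the same line).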

        Srinath
      • John Aldridge
        Message 3 of 15 , Dec 2, 2003
          At 22:30 12-01-2003, Srinath Avadhanula wrote:

          >Hi Jonah,
          >
          >On Tue, 2 Dec 2003, jonah wrote:
          >
          > > The general problem cannot be solved with regular expressions.
          > >
          > > See, eg, http://science.slc.edu/~msiff/old-courses/compilers/notes/parse.html
          > >
          > > It is possible that vim's extended regular expression support offers some way to
          > > do it, but I doubt it.
          >
          >Thanks for the answer. It was quite an eye-opener to know that
          >parentheses balancing is unsolvable using regexps! On further googling,
          >it looks like none of the common regexp engines can actually do this...
          >
          >Srinath


          For my curiosity, what were you hoping to do with such regexes?
          Do you think you'll attempt a scripting solution?

          I like to tinker with parsers, and I have some working, although buggy,
          Vim scripts that do parenthesis counting. One of these days, I
          might also take the time to dig into matchit.vim and have a look
          at how that works.

          ~
          ~
          ~
          "John R. Aldridge, Jr."
        • Jürgen Krämer
          Message 4 of 15 , Dec 2, 2003
            Hi,

            Srinath Avadhanula wrote:
            >
            > Thanks for the answer. It was quite an eye-opener to know that
            > parentheses balancing is unsolvable using regexps! On further googling,
            > it looks like none of the common regexp engines can actually do this...

            IIRC there is an example for validating nested structures with Perl in
            the second edition of Jeffrey Friedl's "Mastering Regular Expressions",
            although, strictly speaking, these aren't regular expressions anymore.

            Regards,
            Jürgen

            --
            Jürgen Krämer Softwareentwicklung
            Habel GmbH mailto:jkr@...
            Hinteres Öschle 2 Tel: (0 74 61) 93 53 15
            78604 Rietheim-Weilheim Fax: (0 74 61) 93 53 99
          • Benji Fisher
            Message 5 of 15 , Dec 2, 2003
              > I like to tinker with parsers, and I have some working, although buggy,
              > Vim scripts that do parenthesis counting. One of these days, I
              > might also take the time to dig into matchit.vim and have a look
              > at how that works.

              It uses the searchpair() function. This uses a while loop and a
              counter to keep track of nesting depth. Before the searchpair()
              function was added to vim, the while loop was in the matchit script.
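The loop-and-counter idea described here can be illustrated with a short Python sketch. This is not how searchpair() or matchit.vim is actually written; the function name `find_matching` and its parameters are hypothetical, and it only handles single-character delimiters.

```python
def find_matching(text: str, open_pos: int,
                  opener: str = "{", closer: str = "}") -> int:
    """Return the index of the closer that matches text[open_pos], or -1."""
    if text[open_pos] != opener:
        raise ValueError("open_pos must point at an opener")
    depth = 0
    pos = open_pos
    while pos < len(text):
        if text[pos] == opener:
            depth += 1          # one level deeper
        elif text[pos] == closer:
            depth -= 1          # one level back out
            if depth == 0:
                return pos      # this closer balances the original opener
        pos += 1
    return -1                   # ran off the end: unbalanced
```

The counter is the piece a pure regular expression lacks: it remembers how many openers are still pending, however deep the nesting goes.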

              HTH --Benji Fisher
            • John Aldridge
              Message 6 of 15 , Dec 2, 2003
                At 23:01 12-01-2003, Srinath Avadhanula wrote:

                >On Tue, 2 Dec 2003, John Aldridge wrote:
                >
                > > At 22:30 12-01-2003, Srinath Avadhanula wrote:
                > >
                > > >Thanks for the answer. It was quite an eye-opener to know that
                > > >parentheses balancing is unsolvable using regexps! On further googling,
                > > >it looks like none of the common regexp engines can actually do this...
                > >
                > > For my curiosity, what were you hoping to do with such regexes?
                > > Do you think you'll attempt a scripting solution?
                > >
                >I wanted a regexp solution for paren balancing specifically for use in
                >the vim-latex project. I have a function which creates regexp based
                >folds just like vim's native syntax foldmethod. That function requires
                >a regexp start pattern and regexp end pattern to define the start and
                >end of a fold region.

                Ah... The only automatic folding I've done is by way of the syntax files,
                and it's been a while since I've written any block syntax commands.

                Thanks for the response.


                ~
                ~
                ~
                "John R. Aldridge, Jr."
              • John Aldridge
                Message 7 of 15 , Dec 2, 2003
                  At 09:47 12-02-2003, Benji Fisher wrote:

                  > > I like to tinker with parsers, and I have some working, although buggy,
                  > > Vim scripts that do parenthesis counting. One of these days, I
                  > > might also take the time to dig into matchit.vim and have a look
                  > > at how that works.
                  >
                  > It uses the searchpair() function. This uses a while loop and a
                  >counter to keep track of nesting depth. Before the searchpair()
                  >function was added to vim, the while loop was in the matchit script.
                  >
                  >HTH --Benji Fisher


                  Thank you. I've downloaded both the 1.7 and 1.0 versions of
                  matchit for later study.


                  ~
                  ~
                  ~
                  "John R. Aldridge, Jr."
                • RICHARD PITMAN
                  Message 8 of 15 , Jul 30, 2013
                    I have a Fortran program in which there are two sorts of numbers: integers, which are simply digits not followed by a period, and double precision numbers in the form 1.0D0, or more precisely \d\+\.\d\+D\d\+. I would like to search for the former, avoiding the latter. Any suggestions gratefully received!

                    --
                    --
                    You received this message from the "vim_use" maillist.
                    Do not top-post! Type your reply below the text you are replying to.
                    For more information, visit http://www.vim.org/maillist.php

                    ---
                    You received this message because you are subscribed to the Google Groups "vim_use" group.
                    To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@....
                    For more options, visit https://groups.google.com/groups/opt_out.
                  • Ben Fritz
                    Message 9 of 15 , Jul 30, 2013
                      On Tuesday, July 30, 2013 4:15:37 AM UTC-5, kilter wrote:
                      > I have a fortran program in which there are two sorts of numbers, integers which are simply digits not followed by a period, and double precision numbers in the form 1.0D0, or more precisely \d\+\.\d\+D\d\+ I would like to search for the former, avoiding the latter. Any suggestions gratefully received!

                      You need a negative look-ahead assertion, i.e. a way to tell Vim, "match my pattern wherever this other pattern DOESN'T match after it". The way to do this in Vim is with \@!. But you also need to make sure your pattern does not match the stuff after the "." either. I found an easy way to do this is to anchor to word boundaries with \< and \>.

                      So the final pattern is:

                      \<\d\+\>\.\@!

                      Maybe better using "very magic":

                      \v<\d+>\.@!

                      See :help /\@!
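For readers more used to Perl-style syntax, the same idea can be sketched with Python's `re` module. This is an analogue, not Vim syntax: `\b` stands in for Vim's `\<` and `\>` (the two agree only as long as Vim's 'iskeyword' setting treats letters, digits and underscore as word characters), `(?!\.)` is the negative look-ahead, and the names `INTEGER` and `find_integers` are made up for the example.

```python
import re

# Digits bounded by word boundaries and NOT followed by a period.
INTEGER = re.compile(r"\b[0-9]+\b(?!\.)")


def find_integers(line: str) -> list[str]:
    """Return every integer literal that is not the start of a 1.0D0-style number."""
    return INTEGER.findall(line)
```

On a line such as `N = 10 + 1.0D0` this finds only `10`: the `1` is rejected by the look-ahead, and the digits next to `D` are rejected because there is no word boundary between a digit and a letter.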

                    • Erik Christiansen
                      Message 10 of 15 , Jul 30, 2013
                        On 30.07.13 06:57, Ben Fritz wrote:
                        > Maybe better using "very magic":

                        Everything's better with \v ;-)

                        > \v<\d+>\.@!

                        On the test line:

                        123 123.0 123. 456 0.123 .123 789

                        that regex also detects the fractional parts as integers, so it still
                        needs a tweak. This seems to do it:

                        /\v\.@<!<\d+>\.@!

                        But even that finds 06 and 57 in 06:57. Whether they are desired
                        integers may vary between use cases. Admittedly they'll probably only
                        crop up in rare strings in a fortran program.
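The difference between the two patterns on that test line can be reproduced with a small Python sketch. Again this is an analogue rather than Vim syntax: `(?<!\.)` plays the role of `\.@<!`, `\b` the role of `<` and `>` under `\v`, and the constant names are invented for the example.

```python
import re

# Look-ahead only: digits not followed by a period.
LOOKAHEAD_ONLY = re.compile(r"\b[0-9]+\b(?!\.)")
# Look-behind added: additionally, digits not preceded by a period.
WITH_LOOKBEHIND = re.compile(r"(?<!\.)\b[0-9]+\b(?!\.)")

TEST_LINE = "123 123.0 123. 456 0.123 .123 789"

print(LOOKAHEAD_ONLY.findall(TEST_LINE))   # ['123', '0', '456', '123', '123', '789']
print(WITH_LOOKBEHIND.findall(TEST_LINE))  # ['123', '456', '789']
print(WITH_LOOKBEHIND.findall("06:57"))    # ['06', '57']
```

The first result includes the fractional parts `0`, `123` and `123`; the look-behind removes them, while the `06:57` case still matches both halves.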

                        Erik

                        --
                        Mollison's Bureaucracy Hypothesis:
                        If an idea can survive a bureaucratic review and be implemented
                        it wasn't worth doing.

                      • Ben Fritz
                        Message 11 of 15 , Jul 30, 2013
                          On Tuesday, July 30, 2013 9:27:51 AM UTC-5, Erik Christiansen wrote:
                          > On 30.07.13 06:57, Ben Fritz wrote:
                          > > Maybe better using "very magic":
                          >
                          > Everything's better with \v ;-)

                          Yes. I occasionally edit portions of my .vimrc where I didn't use it, and wonder why.

                          > > \v<\d+>\.@!
                          >
                          > On the test line:
                          >
                          > 123 123.0 123. 456 0.123 .123 789

                          The OP specifically said that valid decimals are "in the form 1.0D0, or more precisely \d\+\.\d\+D\d\+" so I didn't try stuff like "123." or ".123".

                          But possibly as in the other thread we need to account for negative numbers?

                          > that regex also detects the fractional parts as integers, so it still
                          > needs a tweak. This seems to do it:
                          >
                          > /\v\.@<!<\d+>\.@!

                          Good. Even without the narrow constraints I assumed, it's fairly easy to tweak the pattern to get it more correct.

                          For the OP: Erik added a negative look-behind (similar to the look-ahead my first response used, but constraining what comes BEFORE the match instead). See :help /\@<!

                          > But even that finds 06 and 57 in 06:57. Whether they are desired
                          > integers may vary between use cases. Admittedly they'll probably only
                          > crop up in rare strings in a fortran program.

                          Good point...but these would be harder to guard against.

                        • Erik Christiansen
                          Message 12 of 15 , Jul 30, 2013
                            On 30.07.13 07:41, Ben Fritz wrote:
                            > The OP specifically said that valid decimals are "in the form 1.0D0,
                            > or more precisely \d\+\.\d\+D\d\+" so I didn't try stuff like "123."
                            > or ".123".

                            Wot ... just trust the problem specification? OK, the OP might be a
                            mathematician or engineer, since fortran is mentioned, so you're
                            probably right. But in years gone by, I sometimes wrote regexes for
                            others in a technical department, and the original problem spec almost
                            always had to be tightened, to exclude stuff which hadn't been thought of.

                            > But possibly as in the other thread we need to account for negative numbers?

                            If we change the test text to:

                            123 123.0 123. -456 0.123 .123 789

                            then what we had:

                            > > /\v\.@<!<\d+>\.@!

                            also finds -456, but the cursor is on the 4, not the minus sign.
                            If signed integers are also needed, we'd probably have to ditch the
                            precondition, since /\v(-?|\.@<!)<\d+>\.@! introduces an ambiguity which
                            defeats that alternative. (It's rotten regex construction.)

                            This, though, finds "-456", rather than "456":

                            \v(^|[ \t+-])<\d+>\.@!

                            but again finds "123" " 789", as before. Maybe that's OK?

                            Erik

                            --
                            Remembering is for those who have forgotten.
                            - Chinese proverb

                          • Nikolay Pavlov
                            Message 13 of 15 , Jul 30, 2013
                              On Jul 30, 2013 7:26 PM, "Erik Christiansen" <dvalin@...> wrote:
                              >
                              > On 30.07.13 07:41, Ben Fritz wrote:
                              > > The OP specifically said that valid decimals are "in the form 1.0D0,
                              > > or more precisely \d\+\.\d\+D\d\+" so I didn't try stuff like "123."
                              > > or ".123".
                              >
                              > Wot ... just trust the problem specification? OK, the OP might be a
                              > mathematician or engineer, since fortran is mentioned, so you're
                              > probably right. But in years gone by, I sometimes wrote regexes for
                              > others in a technical department, and the original problem spec almost
                              > always had to be tightened, to exclude stuff which hadn't been thought of.
                              >
                              > > But possibly as in the other thread we need to account for negative numbers?
                              >
                              > If we change the test text to:
                              >
                              > 123 123.0 123. -456 0.123 .123 789
                              >
                              > then what we had:
                              >
                              > > > /\v\.@<!<\d+>\.@!
                              >
                              > also finds -456, but the cursor is on the 4, not the minus sign.
                              > If signed integers are also needed, we'd probably have to ditch the
                              > precondition, since /\v(-?|\.@<!)<\d+>\.@! introduces an ambiguity which
                              > defeats that alternative. (It's rotten regex construction.)
                              >
                              > This, though, finds "-456", rather than "456":
                              >
                              > \v(^|[ \t+-])<\d+>\.@!
                              >
                              > but again finds "123" " 789", as before. Maybe that's OK?

                              If you are talking about numbers in a programming language, you should take care with expressions: there is no number -456 in the expression 123-456. Also note that in programming languages negative numbers may exist only as an optimization for minus being a unary operator in an expression like -456 (it does not make any difference whether you treat -456 as unary minus applied to positive 456 or as a negative number, but the latter is faster). So I would not widen the request this way.

                              \v[[:alnum:].]@<!\d+[[:alnum:].]@!

                              should be fine.
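A rough Python equivalent of this pattern, for trying it out away from Vim. The sketch assumes `[[:alnum:]]` covers only ASCII letters and digits, which is why it is spelled out as `0-9A-Za-z`; the name `STANDALONE_INT` is made up for the example.

```python
import re

# Digits that are neither preceded nor followed by a letter, digit or period.
STANDALONE_INT = re.compile(r"(?<![0-9A-Za-z.])[0-9]+(?![0-9A-Za-z.])")

print(STANDALONE_INT.findall("123-456"))          # ['123', '456']
print(STANDALONE_INT.findall("X1 = 10 + 1.0D0"))  # ['10']
```

Here the minus sign is left out of the match on purpose, so `123-456` yields two unsigned integers, and the digit in an identifier such as `X1` is skipped because a letter precedes it.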

                              > Erik
                              >
                              > --
                              > Remembering is for those who have forgotten.
                              >                             - Chinese proverb