RE: regexp: problem?

  • Moore, Paul
    Message 1 of 9 , Oct 5, 2001
      From: Zdenek Sekera [mailto:zs@...]
      > Let's take a line:
      >
      > aaa #ssss
      >
      > place the cursor on that line and execute:
      >
      > :echo (getline(".") =~ '\v^\s*aaa\s*[^#]')
      >
      > (what I am trying to find out is if the 'aaa' anywhere on the
      > line, surrounded only by whitespaces, is followed by any
      > char that's not '#')
      >
      > The answer will be 1 (it is), to me that's wrong.
      >
      > Change the command to (add a blank inside [^#]) and you get:
      >
      > :echo (getline(".") =~ '\v^\s*aaa\s*[^# ]')
      >                                        ^
      >
      > and you get 0!! Which is correct.
      >
      > Why do I need that additional blank in the regexp?

      Because a blank is a character that matches [^#], your original regexp
      matches (with one less space matching the \s*, and the last space matching
      the [^#]).

      Paul.
    • Preben Guldberg
      Message 2 of 9 , Oct 5, 2001
        Thus wrote Zdenek Sekera (zs@...) on [011005]:
        > Let's take a line:

        > aaa #ssss

        > place the cursor on that line and execute:

        > :echo (getline(".") =~ '\v^\s*aaa\s*[^#]')

        > (what I am trying to find out is if the 'aaa' anywhere on the
        > line, surrounded only by whitespaces, is followed by any
        > char that's not '#')

        > The answer will be 1 (it is), to me that's wrong.

        But nonetheless correct :-)

        One of the (white) spaces after "aaa" is not a "#".

        Thing is, after the "aaa", "\s*[^#]" will look for the next character
        that is not a "#" skipping as much whitespace as possible while still
        finding a match.

        In other words, if you have "aaa   #" (three spaces), "\s*" will match
        the first two spaces and "[^#]" the third space.
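
        A minimal sketch of that backtracking, assuming Vim 7 or later for
        matchlist(). The variable names are illustrative and not from the
        original posts. It captures what "\s*" and "[^#]" each consume on a
        line with three spaces before the "#":

        ```vim
        " Sample line: "aaa", three spaces, then "#".
        let text = 'aaa   #'
        " Group 1 is what \s* consumed, group 2 is what [^#] consumed.
        let parts = matchlist(text, '\v^\s*aaa(\s*)([^#])')
        let ws_len = strlen(parts[1])
        let last_char = parts[2]
        echo ws_len
        echo last_char ==# ' '
        ```

        If the explanation above holds, this echoes 2 and then 1: "\s*" gives
        back one space so that "[^#]" has something to match.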

        > Change the command to (add a blank inside [^#]) and you get:

        > :echo (getline(".") =~ '\v^\s*aaa\s*[^# ]')

        > and you get 0!! Which is correct.

        It just confirms that you don't have "aaa<Tab>#".

        I guess it is time for some of the new "\@" pattern magic:

        :echo (getline(".") =~ '\v\s*aaa(\s*)@>[^#]')

        which will, however, also match "aaasss". Perhaps you want

        :echo (getline(".") =~ '\v\s*aaa(\s*$|(\s+)@>[^#])')

        In any case, have a look at ":help /\@>".
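
        A small sketch of the difference "\@>" makes, assuming Vim 7 or later
        for the list syntax. The "^" anchor and the variable names are
        additions for illustration and are not part of the pattern above:

        ```vim
        let text = 'aaa #ssss'
        " Plain \s* may give back its last space, so [^#] can match that space.
        let plain = text =~ '\v^\s*aaa\s*[^#]'
        " (\s*)@> keeps everything it matched, so [^#] is tested against "#".
        let atomic = text =~ '\v^\s*aaa(\s*)@>[^#]'
        " With no whitespace at all, [^#] simply matches the next "s".
        let no_space = 'aaasss' =~ '\v^\s*aaa(\s*)@>[^#]'
        echo [plain, atomic, no_space]
        ```

        This should echo [1, 0, 1], which also shows why the simple version
        still accepts "aaasss".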

        Peppe [who finds "/\@" pretty cool]
        --
        "Before you criticize someone, walk
        Preben "Peppe" Guldberg __/-\__ a mile in his shoes. That way, if
        c928400@... (o o) he gets angry, he'll be a mile away
        ----------------------oOOo (_) oOOo-- - and barefoot." --Sarah Jackson
      • Zdenek Sekera
        Message 3 of 9 , Oct 5, 2001
          Preben Guldberg wrote:
          >
          ...
          > > :echo (getline(".") =~ '\v^\s*aaa\s*[^#]')
          >
          > > (what I am trying to find out is if the 'aaa' anywhere on the
          > > line, surrounded only by whitespaces, is followed by any
          > > char that's not '#')
          >
          > > The answer will be 1 (it is), to me that's wrong.
          >
          > But nonetheless correct :-)
          >

          How come there are always several truths ???? :-)

          > One of the (white) spaces after "aaa" is not a "#".
          >
          > Thing is, after the "aaa", "\s*[^#]" will look for the next character
          > that is not a "#" skipping as much whitespace as possible while still
          > finding a match.
          >

          Hmmm, and there is nothing left of my understanding of greedy
          algorithms. Frustrating!

          ...
          >
          > I guess it is time for some of the new "\@" pattern magic:
          >

          ...which is almost as opaque as writing syntax files :-)
          I guess I'll have to spend this weekend on those though they
          look quite *unfriendly* to me :-)

          ...
          >
          > :echo (getline(".") =~ '\v\s*aaa(\s*$|(\s+)@>[^#])')
          >

          And here goes the Saturday....

          > In any case, have a look at ":help /\@>".

          I'll give it a chance :-) !

          Thanks Preben, Mike and Paul, I understood my mistake.

          ---Zdenek
        • Preben Guldberg
          Message 4 of 9 , Oct 5, 2001
            Thus wrote Zdenek Sekera (zs@...) on [011005]:
            > Preben Guldberg wrote:

            > > I guess it is time for some of the new "\@" pattern magic:

            > I guess I'll have to spend this weekend on those though they
            > look quite *unfriendly* to me :-)

            It can take some mind bending to understand them (and get them right),
            but if you give them a chance they can become really good friends :-)

            > > :echo (getline(".") =~ '\v\s*aaa(\s*$|(\s+)@>[^#])')

            > And here goes the Saturday....

            And the Sunday if you think that one works :-(

            Should have tested it. The simple one was OK, but here is a funny one:

            Test case (tabs on line 2 and 4, trailing space on line 7):

            aaa #ss
            aaa #ss
            aaa ss
            aaa ss
            aaa#ss
            aaass
            aaa
            aaa

            Now try (one line, the interesting pattern on line 2):

            :g/./echo (getline(".") =~
            '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
            ).":\t".getline(".").'$'

            Which produces the expected

            0: aaa #ss$
            0: aaa #ss$
            1: aaa ss$
            1: aaa ss$
            0: aaa#ss$
            0: aaass$
            1: aaa $
            1: aaa$
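
            The same check can be sketched without a buffer, assuming Vim 7
            or later (lists and map()) and that the snippet is sourced at
            script top level so that "pat" is visible inside map(). The
            variable names are illustrative only:

            ```vim
            " Test lines from above; "\t" marks the tabs on lines 2 and 4.
            let lines = ["aaa #ss", "aaa\t#ss", "aaa ss", "aaa\tss",
                  \ "aaa#ss", "aaass", "aaa ", "aaa"]
            let pat = '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
            " One 0/1 flag per test line, in order.
            let flags = map(copy(lines), 'v:val =~ pat')
            echo flags
            ```

            With the pattern behaving as in the listing above, this should
            echo [0, 0, 1, 1, 0, 0, 1, 1].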

            However, using this string:

            '\v^\s*aaa((\s+)@>[^#]|\s*$)'

            the last two lines are flagged as "0".

            Looking at all options

            BAD:  '\v^\s*aaa((\s+)@>[^#]|\s*$)'
            OK:   '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
            Same: '\m^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
            OK:   '\M^\s\*aaa\(\(\s\+\)\@>\[^#]\|\s\*$\)'
            OK:   '\V\^\s\*aaa\(\(\s\+\)\@>\[^#]\|\s\*\$\)'

            Can anyone spot something wrong in the first pattern here?
            To me they look the same, but I may be overlooking something.

            Could someone please double check this (gvim 6.0, patches 1-11).

            Peppe
          • Zdenek Sekera
            Message 5 of 9 , Oct 5, 2001
              Preben Guldberg wrote:
              >
              ...
              > > > I guess it is time for some of the new "\@" pattern magic:
              ...
              > It can take some mind bending to understand them (and get them right),
              > but if you give them a chance they can become really good friends :-)

              You mean - like in "buddies"? That's encouraging!!

              ...
              > Test case (tabs on line 2 and 4, trailing space on line 7):
              >
              > aaa #ss
              > aaa #ss
              > aaa ss
              > aaa ss
              > aaa#ss
              > aaass
              > aaa
              > aaa
              >
              > Now try (one line, the interesting pattern on line 2):
              >
              > :g/./echo (getline(".") =~
              > '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
              > ).":\t".getline(".").'$'
              >
              > Which produces the expected
              >
              > 0: aaa #ss$
              > 0: aaa #ss$
              > 1: aaa ss$
              > 1: aaa ss$
              > 0: aaa#ss$
              > 0: aaass$
              > 1: aaa $
              > 1: aaa$
              >
              > However, using this string:
              >
              > '\v^\s*aaa((\s+)@>[^#]|\s*$)'
              >
              > the last two lines are flagged as "0".
              >
              > Looking at all options
              >
              > BAD: '\v^\s*aaa((\s+)@>[^#]|\s*$)'
              > OK: '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
              ...
              > Can anyone spot something wrong in the first pattern here?
              > To me they look the same, but I may overlook something.

              I see them the same, too, but what about this:

              "\\v^\\s*aaa((\\s*)@>[^#]|\\s*$)"

              Which I meant to be the first one just written with "..." instead of
              '...'
              Can't see the mistake but this gives:

              0: aaa #ss$
              0: aaa #ss$
              1: aaa ss$
              1: aaa ss$
              0: aaa#ss$
              1: aaass$ <- where does this come from?
              0: aaa $
              0: aaa$

              >
              > Could someone please double check this (gvim 6.0, patches 1-11).

              Same version.

              ---Zdenek
            • Preben Guldberg
              Message 6 of 9 , Oct 5, 2001
                Thus wrote Zdenek Sekera (zs@...) on [011005]:
                > You mean - like in "buddies"? That's encouraging!!

                Yep, if nothing else, you will always have vim :-)

                > Preben Guldberg wrote:
                > > BAD: '\v^\s*aaa((\s+)@>[^#]|\s*$)'
                > > OK: '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'

                > I see them the same, too, but what about this:

                > "\\v^\\s*aaa((\\s*)@>[^#]|\\s*$)"

                > Which I meant to be the first one just written with "..." instead of
                > '...'

                Personally, I tend to use '...' rather than "..." when I can -
                especially for variables used as patterns - in an attempt to avoid all
                this quoting of '\'.
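
                A quick sketch of what the two quoting styles do to the same
                pattern text (variable names are illustrative only). Inside
                '...' a backslash is literal, while inside "..." each "\\"
                collapses to a single "\":

                ```vim
                let single = '\v^\s*aaa((\s*)@>[^#]|\s*$)'
                let double = "\\v^\\s*aaa((\\s*)@>[^#]|\\s*$)"
                " Case-sensitive comparison of the two resulting strings.
                let same = single ==# double
                echo same
                ```

                This should echo 1, so the quoting style by itself does not
                change what the regexp engine sees.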

                > Can't see the mistake but this gives:

                > 1: aaass$ <- where does this come from?

                Oh, but 's' matches "[^#]" just fine. Try changing a '*' to a '+'
                above.

                Thanks for checking.

                Peppe
              • Zdenek Sekera
                Message 7 of 9 , Oct 5, 2001
                  Preben Guldberg wrote:
                  >
                  ...
                  > > Preben Guldberg wrote:
                  > > > BAD: '\v^\s*aaa((\s+)@>[^#]|\s*$)'
                  > > > OK: '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
                  >
                  > > I see them the same, too, but what about this:
                  >
                  > > "\\v^\\s*aaa((\\s*)@>[^#]|\\s*$)"
                  >
                  > > Which I meant to be the first one just written with "..." instead of
                  > > '...'
                  >
                  > Personally, I tend to use '...' rather than "..." when I can -
                  > especially for variables used as patterns - in an attempt to avoid all
                  > this quoting of '\'.
                  >
                  > > Can't see the mistake but this gives:
                  >
                  > > 1: aaass$ <- where does this come from?
                  >
                  > Oh, but 's' matches "[^#]" just fine. Try changing a '*' to a '+'
                  > above.

                  I also prefer '...' for the same reason; I was just playing with your
                  regexps. But I think you missed the reason for my surprise (unless I
                  missed the fine print in your answer :-)): my regexp *should be*
                  identical to your first one (marked BAD above), except that it is
                  written in "..." notation. If they are indeed functionally identical,
                  why don't they give the same result?

                  Cheers,

                  ---Zdenek
                • Benji Fisher
                  Message 8 of 9 , Oct 5, 2001
                    Zdenek Sekera wrote:
                    >
                    > Preben Guldberg wrote:
                    > >
                    > ...
                    > > > Preben Guldberg wrote:
                    > > > > BAD: '\v^\s*aaa((\s+)@>[^#]|\s*$)'
                    > > > > OK: '^\s*aaa\(\(\s\+\)\@>[^#]\|\s*$\)'
                    > >
                    > > > I see them the same, too, but what about this:
                    > >
                    > > > "\\v^\\s*aaa((\\s*)@>[^#]|\\s*$)"
                    > >
                    > > > Which I meant to be the first one just written with "..." instead of
                    > > > '...'
                    > >
                    > > Personally, I tend to use '...' rather than "..." when I can -
                    > > especially for variables used as patterns - in an attempt to avoid all
                    > > this quoting of '\'.
                    > >
                    > > > Can't see the mistake but this gives:
                    > >
                    > > > 1: aaass$ <- where does this come from?
                    > >
                    > > Oh, but 's' matches "[^#]" just fine. Try changing a '*' to a '+'
                    > > above.
                    >
                    > I also prefer '...' for the same reason, I just played with your
                    > regexps,
                    > but I think you missed the reason for my surprise (unless I missed the
                    > fine print in your answer :-)): my regexp *should be* identical to your
                    > first one (marked BAD above) except it is written in "..." notation.
                    > If indeed they are functionally identical, why don't they give the
                    > same result?

                    They are different: one has a + and the other a *. That's why
                    Peppe said "Try changing...".
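
                    A short sketch of that difference, assuming Vim 7 or
                    later for the list syntax; the variable names are
                    illustrative only. It compares the two pattern strings
                    and tries both on "aaass", the line that was flagged
                    unexpectedly:

                    ```vim
                    let plus = '\v^\s*aaa((\s+)@>[^#]|\s*$)'
                    let star = "\\v^\\s*aaa((\\s*)@>[^#]|\\s*$)"
                    " The strings differ only in the quantifier after \s.
                    let identical = plus ==# star
                    " "aaass" has no whitespace after "aaa".
                    let star_hit = 'aaass' =~ star
                    let plus_hit = 'aaass' =~ plus
                    echo [identical, star_hit, plus_hit]
                    ```

                    This should echo [0, 1, 0]: with "*" the atomic group may
                    match nothing and "[^#]" then accepts the "s", while "+"
                    requires at least one whitespace character first.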

                    --Benji Fisher