regex question

  • David Stuart
    Message 1 of 16 , Jun 20, 2002
      Hi All,

      I know you love regex questions, so I thought I'd send one out. It's not
      really a vim problem (really a sed problem), but I think people here
      might know the answer.

      If I have some output:

      ABC.ABC.ABC.ABC

      And I want to substitute every occurrence of ABC with DEF, but only when
      ABC is a "word" (that is, some alpha-numeric string + "_"), so that
      ABCcomment won't get substituted, I have come up with the following
      regular expression:

      s/\(^\|[^a-zA-Z0-9_]\)ABC\([^a-zA-Z0-9_]\|$\)/\1DEF\2/g

      Which essentially says: match every ABC that is surrounded by non-word
      characters, or that begins or ends a line.

      The problem with this is that when you have something like:

      ABC.ABC

      The first one gets substituted, but the second one doesn't, because the
      first match has already consumed the "." that the second ABC needs as
      its leading delimiter.

      Can anyone help?

      Dave

      --
      David Stuart
      Computing Scientist, Accelight Networks
      e-mail: d.stuart@...
      web: http://www.accelight.com/
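
The overlap problem described in the message above can be reproduced outside sed. The following is a minimal sketch using Python's `re` module rather than sed itself (the names `CONSUMING` and `substitute_once` are illustrative only). Because each match consumes the delimiter that follows ABC, that delimiter is no longer available as the leading context of the next ABC.

```python
import re

# Python translation of the sed pattern from the message above:
# ABC preceded by start-of-line or a non-word character, and followed
# by a non-word character or end-of-line. Both delimiters are consumed.
CONSUMING = re.compile(r"(^|[^a-zA-Z0-9_])ABC([^a-zA-Z0-9_]|$)")

def substitute_once(text):
    """Apply the substitution in one left-to-right pass, like s/.../.../g."""
    return CONSUMING.sub(r"\1DEF\2", text)

result = substitute_once("ABC.ABC")
print(result)  # DEF.ABC
```

The first match is "ABC." (including the dot), so scanning resumes after the dot and the second ABC has no leading delimiter left to match.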
    • Thomas S. Urban
      Message 2 of 16 , Jun 20, 2002
        Why won't this work?

        s/\<ABC\>/DEF/g

        Modify 'iskeyword' option if you need to.
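
As a rough illustration of why word-boundary atoms avoid the overlap, here is a sketch in Python, where `\b` plays approximately the role that `\<` and `\>` play in Vim. This is an assumption-laden analogy, not sed or Vim syntax: Python's `\b` does not distinguish the start of a word from its end, and `re.ASCII` is used so that "word character" means `[a-zA-Z0-9_]`. The helper name `replace_word` is hypothetical.

```python
import re

# \b is a zero-width word boundary: it checks the neighbouring character
# without consuming it, so adjacent matches can share a delimiter.
# re.ASCII restricts "word character" to [a-zA-Z0-9_].
WORD_ABC = re.compile(r"\bABC\b", re.ASCII)

def replace_word(text):
    """Replace ABC with DEF only where ABC stands alone as a word."""
    return WORD_ABC.sub("DEF", text)

print(replace_word("ABC.ABC.ABC.ABC"))       # DEF.DEF.DEF.DEF
print(replace_word("ABCcomment ABC_x ABC"))  # ABCcomment ABC_x DEF
```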

        --
        Everyone is more or less mad on one point.
        -- Rudyard Kipling
      • David Stuart
        Message 3 of 16 , Jun 20, 2002
          The main issue is that it's a "sed" problem, not a "vi" problem. I don't
          think the "\<" and "\>" atoms exist in sed?

          I'll go and check though.

          Thomas S. Urban wrote:

          >Why won't this work?
          >
          >s/\<ABC\>/DEF/g
          >
          >Modify 'iskeyword' option if you need to.

          --
          David Stuart
          Computing Scientist, Accelight Networks
          e-mail: d.stuart@...
          web: http://www.accelight.com/
        • David Stuart
          Message 4 of 16 , Jun 20, 2002
            Yes, to put it in vim terms, I do not have the "zero-width" matches
            available to me (see ":help zero-width").

            Thomas S. Urban wrote:

            >Why won't this work?
            >
            >s/\<ABC\>/DEF/g
            >
            >Modify 'iskeyword' option if you need to.

            --
            David Stuart
            Computing Scientist, Accelight Networks
            e-mail: d.stuart@...
            web: http://www.accelight.com/
          • Thomas S. Urban
            Message 5 of 16 , Jun 20, 2002
              On Thu, Jun 20, 2002 at 18:54:19 -0400, David Stuart sent 1.5K bytes:
              > The main issue is that it's a "sed" problem, not a "vi" problem. I don't
              > think the "<" and ">" characters exist in sed?
              >
              > I'll go and check though.
              >

              Didn't see that it was a sed question (ignore 'iskeyword'), but \< and \>
              are available at least in GNU sed and in the one in /usr/bin on my
              Solaris 8 machine. If the version you are using doesn't have them or
              something similar (sometimes \b is used), get GNU sed if you can.
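
If no word-boundary atom is available at all, one common workaround is to repeat the consuming substitution until a pass changes nothing (in sed this is usually written with a label and the `t` command). The sketch below shows that idea in Python rather than sed. The function name `substitute_until_stable` is made up for illustration, and the loop relies on the replacement text not matching the pattern again.

```python
import re

# The original consuming pattern: both delimiters are part of the match.
CONSUMING = re.compile(r"(^|[^a-zA-Z0-9_])ABC([^a-zA-Z0-9_]|$)")

def substitute_until_stable(text):
    """Repeat the consuming substitution until a pass changes nothing."""
    while True:
        text, count = CONSUMING.subn(r"\1DEF\2", text)
        if count == 0:
            return text

print(substitute_until_stable("ABC.ABC.ABC.ABC"))  # DEF.DEF.DEF.DEF
```

Each pass picks up the occurrences whose leading delimiter was consumed by a neighbour in the previous pass, so the loop ends after a small number of passes.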


              Scott

              --
              You have all the characteristics of a popular politician: a horrible voice,
              bad breeding, and a vulgar manner.
              -- Aristophanes
            • Arun Bhanu
              Message 6 of 16 , Nov 11, 2002
                Hi,

                If I have a piece of text like the one given below:

                <Action>
                <ActionName>abc</ActionName>
                <HandlerClass>
                XYZHandler
                </HandlerClass>
                <Link linkName="btnCancel" linkType="button" linkLabel="Cancel">
                <TargetService>foo</TargetService>
                <TargetAction>bar</TargetAction>
                </Link>
                </Action>

                What is the regex to match all the text in between the <Link> and </Link> tags?
                I.e. the regex should match the text
                <Link linkName="btnCancel" linkType="button" linkLabel="Cancel">
                <TargetService>foo</TargetService>
                <TargetAction>bar</TargetAction>
                </Link>


                TIA.
                Arun
              • Colin Keith
                Message 7 of 16 , Nov 11, 2002
                  On Tue, Nov 12, 2002 at 12:12:08PM +0800, Arun Bhanu wrote:
                  > Subject: regex question

                  You don't have to follow up someone else's message if it's on a totally
                  unrelated topic...

                  > What is the regex to match all the text in between <Link> and </Link> tags?

                  That depends on where you're using it. If you know that you have a certain
                  number of lines then you can rely on standard regexp patterns:

                  <Link .*\n\(.*\n.*\)\n.*</Link>

                  I.e.

                  :.,.+4s%<Link .*\n\(.*\n.*\)\n.*</Link>%\1%

                  Gives you:

                  | <TargetService>foo</TargetService>
                  | <TargetAction>bar</TargetAction>

                  Which looks a little dopey. You can tidy it up using \s*'s. But this only
                  works with a fixed number of lines. So you should look at \_.


                  :.,.+4s%<Link .*>\(\_.*\)</Link>%\1%

                  |------
                  | <TargetService>foo</TargetService>
                  | <TargetAction>bar</TargetAction>
                  |------
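
One thing to keep in mind with a greedy multi-line match such as `\_.*` is that it runs to the last closing tag it can find. The sketch below shows the difference between a greedy and a non-greedy quantifier using Python's `re` module, where `re.DOTALL` makes `.` span newlines much like `\_.` does in Vim. The two-block sample text is hypothetical (the original post contains a single `<Link>` block).

```python
import re

# Hypothetical sample with two <Link> blocks (the original post has one).
TEXT = """<Link linkName="a">
<TargetService>foo</TargetService>
</Link>
<Link linkName="b">
<TargetAction>bar</TargetAction>
</Link>
"""

# re.DOTALL lets "." match newlines, similar to \_. in Vim.
greedy = re.findall(r"<Link [^>]*>(.*)</Link>", TEXT, re.DOTALL)
lazy = re.findall(r"<Link [^>]*>(.*?)</Link>", TEXT, re.DOTALL)

print(len(greedy))  # 1: one match running from the first <Link> to the last </Link>
print(len(lazy))    # 2: one match per block
```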


                  If you want to do something with this though, you're better off using a
                  script to manipulate it.

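                  " Hypothetical wrapper name; 'return' is only valid inside a function.
                  function! ProcessLinkBlocks()
                    " Remember the cursor position so it can be restored afterwards.
                    let x = line('.')
                    let y = col('.')
                    " line('$') is the number of the last line in the buffer.
                    let ll = line('$')
                    let i = 1
                    while i <= ll
                      if match(getline(i), '^\s*<Link ') == 0
                        let startline = i
                        " search() starts at the cursor, so move to the opening tag first.
                        call cursor(i, 1)
                        let endline = search('^\s*</Link>', 'W')

                        " Skip out if we don't find the end:
                        if !endline
                          echoerr "No </Link> found"
                          return cursor(x, y)
                        endif

                        " Found the end of the match, go back to the line above.
                        let endline = endline - 1

                        " ...... do something with the lines
                        " between startline and endline here

                      endif

                      let i = i + 1
                    endwhile
                    return cursor(x, y)
                  endfunction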



                  ... which might contain errors cus I haven't tested it, and I was watching
                  TV as I wrote it :) (and should be going to sleep :)

                  Col.


                  --
                  We are often told that the Chinese government restricts Internet access for its
                  citizens by blocking e-mail and web site access. I find this somewhat ironic
                  since the majority of my spam comes from Chinese open mail relays and proxy
                  servers that aren't blocked, banned or filtered in the slightest...
                • Michael Geddes
                  Message 8 of 16 , Nov 11, 2002
                    Have a look at my html stuff
                    http://vim.sourceforge.net/script.php?script_id=14 which does more
                    complicated tag matching and auto-closing &c, however a simplistic one
                    would be:

                    '\c<link[^>]*>\_.\{-}</link\s*>'

                    If you use \zs and \ze you can then select just the text in between :) -
                    of course this doesn't account for '>' characters inside the quotes in
                    the <link ... > bit.

                    Also be aware there's a '/' in there, so if you're using / as the
                    search delimiter make sure you escape it ;)
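
For readers more familiar with other regex dialects, the pattern above can be approximated in Python as follows. This is a sketch, not a Vim equivalent in every detail: `re.IGNORECASE` stands in for `\c`, `re.DOTALL` with `.*?` for `\_.\{-}`, and a capture group for `\zs`/`\ze` (Python's look-behind must be fixed-width, so it cannot replace `\zs` here). The names `LINK_BODY` and `SAMPLE` are illustrative.

```python
import re

# Python counterpart of '\c<link[^>]*>\_.\{-}</link\s*>'.
# The capture group stands in for \zs ... \ze.
LINK_BODY = re.compile(r"<link[^>]*>(.*?)</link\s*>", re.IGNORECASE | re.DOTALL)

SAMPLE = """<Action>
<Link linkName="btnCancel" linkType="button" linkLabel="Cancel">
<TargetService>foo</TargetService>
<TargetAction>bar</TargetAction>
</Link>
</Action>
"""

inner = LINK_BODY.search(SAMPLE).group(1)
print(inner.strip())

# The caveat from the message: a '>' inside a quoted attribute ends the
# opening tag too early, so the "body" starts in the middle of the attribute.
broken = LINK_BODY.search('<Link linkLabel="a>b">x</Link>').group(1)
print(broken)  # b">x
```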

                    //.

                  • Benji Fisher
                    Message 9 of 16 , Nov 12, 2002
                      Arun Bhanu wrote:
                      > What is the regex to match all the text in between <Link> and </Link> tags?
                      > ie. the regex can match the text
                      > <Link linkName="btnCancel" linkType="button" linkLabel="Cancel">
                      > <TargetService>foo</TargetService>
                      > <TargetAction>bar</TargetAction>
                      > </Link>

                      How would you like an easy way to select everything from "<Link"
                      to "</Link>" in Visual mode? Try

                      :source $VIMRUNTIME/macros/matchit.vim

                      and then, starting in Normal mode with the cursor anywhere on "Link",
                      type "v%" (without the quotes).

                      HTH --Benji Fisher
                    • Eric Arnold
                      Message 10 of 16 , Jun 1, 2006
                        Sorry if I've got brain lock on this, but is it possible to match a
                        substring like

                        match wildmenu ;\(directory\)\{3,};

                        such that it will match three or more substring chars of the pattern
                        to match "dir" as well as "directory"? (I know the above format isn't
                        this.) I know I could do it if I could use an expression, but syntax
                        highlighting doesn't allow that, so I'm wondering if I can do it with
                        regex alone.
                      • Cory Echols
                        Message 11 of 16 , Jun 1, 2006
                          On 6/1/06, Eric Arnold <eric.p.arnold@...> wrote:
                          > Sorry if I've got brain lock on this, but is it possible to match a
                          > substring like
                          >
                          > match wildmenu ;\(directory\)\{3,};
                          >
                          > such that it will match three or more substring chars of the pattern
                          > to match "dir" as well as "directory"? (I know the above format isn't
                          > this.) I know I could do it if I could use an expression, but syntax
                          > highlighting doesn't allow that, so I'm wondering if I can do it with
                          > regex alone.
                          >

                          Enclose "ectory" in another group that matches zero or one times. The
                          "\v" enables "very magic" mode, and the "%()" construct causes the
                          group to not be counted as a sub-expression:

                          \v(dir%(ectory)?)
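
The same idea can be sketched in Python, where `(?:...)` is the non-capturing group corresponding to `%()` in very magic mode. Note that this accepts exactly "dir" or "directory" and nothing in between; the helper name `is_full_match` is hypothetical.

```python
import re

# Python spelling of \v(dir%(ectory)?): (?:...) is a non-capturing group.
DIR_OR_DIRECTORY = re.compile(r"dir(?:ectory)?")

def is_full_match(word):
    """True if the whole word is matched by the pattern."""
    return DIR_OR_DIRECTORY.fullmatch(word) is not None

for word in ("dir", "directory", "direct"):
    print(word, is_full_match(word))
```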
                        • Benji Fisher
                          Message 12 of 16 , Jun 1, 2006
                            On Thu, Jun 01, 2006 at 05:05:00AM -0600, Eric Arnold wrote:
                            > Sorry if I've got brain lock on this, but is it possible to match a
                            > substring like
                            >
                            > match wildmenu ;\(directory\)\{3,};
                            >
                            > such that it will match three or more substring chars of the pattern
                            > to match "dir" as well as "directory"? (I know the above format isn't
                            > this.) I know I could do it if I could use an expression, but syntax
                            > highlighting doesn't allow that, so I'm wondering if I can do it with
                            > regex alone.

                            Do you mean like /\<dir\%[ectory]/ ?

                            :help /\%[]

                            HTH --Benji Fisher
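
Most other regex dialects have no direct counterpart to Vim's `\%[]`, but it can be emulated with nested optional groups. The sketch below builds such a pattern in Python. The helper `optional_sequence` is a made-up name, and `\b` is used as an approximation of `\<`.

```python
import re

def optional_sequence(chars):
    """Build a regex that matches as much of chars, in order, as possible.

    optional_sequence("ab") returns "(?:a(?:b)?)?", which is the nested
    form of what \\%[ab] expresses in Vim.
    """
    pattern = ""
    for ch in reversed(chars):
        pattern = "(?:" + re.escape(ch) + pattern + ")?"
    return pattern

# Rough counterpart of /\<dir\%[ectory]/
DIR_PREFIX = re.compile(r"\bdir" + optional_sequence("ectory"))

for word in ("di", "dir", "direct", "directory"):
    print(word, DIR_PREFIX.fullmatch(word) is not None)
```

Every prefix of "directory" that is at least three characters long matches in full, while "di" does not.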
                          • Eric Arnold
                            Message 13 of 16 , Jun 1, 2006
                              On 6/1/06, Benji Fisher <benji@...> wrote:
                              > On Thu, Jun 01, 2006 at 05:05:00AM -0600, Eric Arnold wrote:
                              > > Sorry if I've got brain lock on this, but is it possible to match a
                              > > substring like
                              > >
                              > > match wildmenu ;\(directory\)\{3,};
                              > >
                              > > such that it will match three or more substring chars of the pattern
                              > > to match "dir" as well as "directory"? (I know the above format isn't
                              > > this.) I know I could do it if I could use an expression, but syntax
                              > > highlighting doesn't allow that, so I'm wondering if I can do it with
                              > > regex alone.
                              >
                              > Do you mean like /\<dir\%[ectory]/ ?
                              >
                              > :help /\%[]
                              >
                              > HTH --Benji Fisher
                              >

                              Real close. Turns out I think I want:

                              /\<\%[directory]\{1,}\>/

                              but it doesn't seem to recognize \{1,} and without the \< it seems to
                              be matching white space. The problem with \< is that it doesn't
                              seem to allow \<\%[.directory]

                              What I'm actually trying to do is walk through a list of displayed
                              files, highlighting each file individually (full length), i.e. via
                              the <TAB> key. The regex is needed because the file names are
                              truncated to a given length, and the remainder is wrapped down onto
                              the next column:


                              ./                  >8.3                 >oaded
                              ../                 TabLineSet.vim.2.0   WinWalker.zip.upl>
                              TabLineSet.vim.1.>  WinWalker.1.2.1.zip  >oaded2
                              >7.1.vim            WinWalker.1.2.zip    doc/
                              TabLineSet.vim.1.8  WinWalker.2.0.zip    plugin/
                              TabLineSet.vim.1.>  WinWalker.2.1.zip
                              >8.1                WinWalker.2.2.zip
                              TabLineSet.vim.1.>  WinWalker.zip.upl>

                              After I solve the \%[ problem, I then have to see if I can deal with
                              the continuation segments.....
                            • Eric Arnold
                              Message 14 of 16 , Jun 1, 2006
                                Sorry I wasn't clear, I wanted it to match any substring of
                                'directory'. I think \%[] does this (courtesy of Benji).


                              • Charles E Campbell Jr
                                Message 15 of 16 , Jun 6, 2006
                                  Eric Arnold wrote:

                                  > Real close. Turns out I think I want:
                                  >
                                  > /\<\%[directory]\{1,}\>/


                                  I suspect you want
                                  /\<d\%[irectory]\>/

                                • Charles E Campbell Jr
                                  Message 16 of 16 , Jun 6, 2006
                                    Eric Arnold wrote:

                                    > Real close. Turns out I think I want:
                                    >
                                    > /\<\%[directory]\{1,}\>/


                                    I suspect you want
                                    /\<d\%[irectory]\>/

                                    Regards,
                                    Chip Campbell