Re: Finding matching parentheses

  • m.feichtinger
    Message 1 of 10, Dec 2, 2012
      From the help file of NTP62 "Regular Expressions, Recursive Pattern":
      \( ( [^()]++ | (?R) )* \)
      Or in a larger pattern:
      ( \( ( [^()]++ | (?1) )* \) )

      In your case:
      ^!Find "\\foreignlanguage\{greek\}(\{([^{}]++|(?1))*\})" rs1

      The same as named recursion (long line):
      ^!Find "(?x)\\foreignlanguage\{greek\} (?<braces>(?#named) \{ ( [^{}]++ | (?&braces)(?#recursion; reference by name) )* \} )" rs1

      This finds the outer braces and everything nested between them, for example in:
      Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber
      or (long line):
      Dummes Gelaber \foreignlanguage{greek}{Qr'o {some nested {and some more} braces} nos} mehr Gelaber

      You can then strip the enclosing braces, e.g.
      ^!Set %var%=^$StrCopy("^$GetSelection$";2;^$Calc(^$StrSize("^$GetSelection$")$-2)$)$
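
To check the idea outside NoteTab, here is a minimal Python sketch. It does not use a recursive regex (Python's standard `re` module has no `(?R)` or `(?1)`); instead it counts brace depth by hand, which is a different technique that should give the same result on the samples above. The helper name `find_braced_argument` is made up for illustration, and the final slice assumes the goal of the `^$StrCopy$` line is simply to drop the first and last character.

```python
PREFIX = r"\foreignlanguage{greek}"


def find_braced_argument(text, prefix=PREFIX):
    """Return the balanced {...} group that follows prefix, or None.

    Hypothetical helper: a hand-written depth counter, not the PCRE
    recursion used in the clip.
    """
    start = text.find(prefix)
    if start == -1:
        return None
    pos = start + len(prefix)
    if pos >= len(text) or text[pos] != "{":
        return None
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[pos:i + 1]  # includes the outer braces
    return None  # braces never balanced


sample = (r"Dummes Gelaber \foreignlanguage{greek}"
          r"{Qr'o {some nested {and some more} braces} nos} mehr Gelaber")
group = find_braced_argument(sample)
inner = group[1:-1]  # drop the enclosing braces, as the StrCopy line does
print(group)
print(inner)
```

The sketch only looks at the first occurrence of the prefix and treats every `{` and `}` as structural, so escaped braces such as `\{` in the TeX source would need extra handling.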

      HTH

      --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
      >
      > In my documents I have this string:
      >
      > \foreignlanguage{greek}{Qr'onos}
      >
      > To convert that from TeX to HTML I need to find the greek word "Qr'onos"
      > Easy:
      >
      > ^!Find "\\foreignlanguage\{greek\}\{([^\}]+)\}" RS1
      >
      > But what if the word or phrase contains an inner {} pair? How do I find
      > and select from the starting outer { up to its matching closing } ?
      >
      > Thanks
      > Axel
      >
    • flo.gehrke
      Message 2 of 10, Dec 3, 2012
        --- In ntb-clips@yahoogroups.com, "m.feichtinger" <mafei@...> wrote:
        >
        > In your case:
        > ^!Find "\\foreignlanguage\{greek\}(\{([^{}]++|(?1))*\})" rs1
        >
        > Finds the outer braces and all nested braces between from:
        > Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber
        > (...)
        > You can then skip the enclosing braces, i.e.
        > ^!Set %var%=^$StrCopy("^$GetSelection$";2;^$Calc(^$StrSize("^$GetSelection$")$-2)$)$

        You will achieve the same result without ^$StrCopy$ when writing...

        ^!Find "(?x)\\foreignlanguage\{greek\} (\{ ( ([^{}]++|(?1) )* )\} )" RS2

        So far, however, we didn't see any "inner {} pair" in the sample strings. Also in...

        Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber

        there is no sequence of outer and inner braces but a sequence of two brace-delimited strings. So I suppose it's about something like...

        \foreignlanguage{greek}{Axel{Berger}Odenthal}

        Since there isn't much recursion needed here I think a simple pattern like...

        ^!Find "\\foreignlanguage\{greek}\{(.+)}" RS1

        will suffice here.

        If the outer braces should be enclosed in the match (as with Axel's latest clip) try...

        ^!Find "\\foreignlanguage\{greek}(\{.+})" RS1

        Regards,
        Flo
      • Axel Berger
        Message 3 of 10, Dec 3, 2012
          "flo.gehrke" wrote:
          > If the outer braces should be enclosed in the match
          > (as with Axel's latest clip)

          I'd rather not include them, that's just what Match Brackets does. I'd
          have had to eliminate them later.

          > So I suppose it's about something like...
          > \foreignlanguage{greek}{Axel{Berger}Odenthal}

          Yes. Or rather some LaTeX construct like \v{s} for an accented s.

          > I think a simple pattern like...
          > ^!Find "\\foreignlanguage\{greek}\{(.+)}" RS1
          > will suffice here.

          No, that'll find "Axel{Berger" in the example above, not
          "Axel{Berger}Odenthal"

          Thanks
          Axel
        • flo.gehrke
          Message 4 of 10, Dec 4, 2012
            --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
            >
            > > So I suppose it's about something like...
            > > \foreignlanguage{greek}{Axel{Berger}Odenthal}
            > > (...)
            > > I think a simple pattern like...
            > > ^!Find "\\foreignlanguage\{greek}\{(.+)}" RS1
            > > will suffice here.
            >
            > No, that'll find "Axel{Berger" in the example above, not
            > "Axel{Berger}Odenthal"

            How come? For me, it's perfectly matching the whole string 'Axel{Berger}Odenthal'. Please test it again.

            I don't have your complete data, so it's difficult to decide this -- but the only problem could possibly be the dot, which might cause a lot of backtracking. So it might be more efficient to define the characters that occur between those braces:

            ^!Find "\\foreignlanguage\{greek}\{([\w{}']+)}" RS1

            Regards,
            Flo
          • Axel Berger
            Message 5 of 10, Dec 4, 2012
              "flo.gehrke" wrote:
              > Please test it again.

              To be honest, I had not tested, just looked at it, and you're right. It
              works because you use the greedy find, something I almost never do as a
              matter of course. The simple reason:
              Have

              \foreignlanguage{greek}{Axel{Berger}Odenthal} some more waffle
              \foreignlanguage{greek}{Axel{Berger}Odenthal}

              without any hard line break and that search falls flat on its face.
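
The failure can be reproduced outside NoteTab with a small Python sketch. It assumes Python's `re` module treats these simple patterns the same way as NoteTab's regex engine, and it adds a non-greedy variant (`.+?`, not taken from the thread) for comparison.

```python
import re

# Two occurrences on one line, with no hard line break between them.
line = (r"\foreignlanguage{greek}{Axel{Berger}Odenthal} some more waffle "
        r"\foreignlanguage{greek}{Axel{Berger}Odenthal}")

# Greedy dot: runs to the LAST closing brace on the line.
greedy = re.search(r"\\foreignlanguage\{greek}\{(.+)}", line)

# Non-greedy dot: stops at the FIRST closing brace.
lazy = re.search(r"\\foreignlanguage\{greek}\{(.+?)}", line)

print(greedy.group(1))
print(lazy.group(1))
```

The greedy search swallows everything up to the final `}` of the second occurrence, while the non-greedy one stops at `Axel{Berger`. Neither form tracks nesting, which is why a recursive pattern or an explicit depth count is needed for the general case.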

              > I don't have your complete data, so it's difficult to decide this

              To use that clip at all I need it to cover a very general case. Once I
              begin using the foreignlanguage notation in my database in earnest and
              begin to rely on automatic HTML conversion, I've no idea what might crop
              up in there. The only thing I do know is that nested curly braces are
              one of the most frequent TeX constructs of all.

              Axel
            • flo.gehrke
              Message 6 of 10, Dec 4, 2012
                --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
                >
                > "flo.gehrke" wrote:
                > > Please test it again.
                >
                > To be honest, I had not tested, just looked at it, and you're
                > right. (..) Have
                >
                > \foreignlanguage{greek}{Axel{Berger}Odenthal} some more waffle
                > \foreignlanguage{greek}{Axel{Berger}Odenthal}
                >
                > without any hard line break and that search falls flat on its face.

                That's why I said there could be a problem with the dot and made a second proposal. Would you mind testing this? Thanks.

                Flo

                P.S. It's always a problem to deal with an issue without having the data and all details. And it's even less fun if the conditions are changed with each message :-(
              • Axel Berger
                Message 7 of 10, Dec 5, 2012
                  "flo.gehrke" wrote:
                  > And it's even less fun if the conditions are changed
                  > with each message :-(

                  Yes, I'm very sorry about that. I had /meant/ to make clear from the
                  outset, that those {} may contain just about "anything" and still be a
                  legal TeX construct. Just about the only constraint is inner curly
                  braces having to be paired and come in the right order.

                  The example given was an extremely simple one. I already knew Match
                  Brackets copes with anything thrown at it so did not need to try to trip
                  it up. I only needed to make sure that my sequence of Find, Match,
                  GetSelection worked as I wanted it to and for that a simple word
                  sufficed. Your second Regex won't find

                  \foreignlanguage{greek}{Axel {Berger} Odenthal}
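
A short Python sketch makes the point concrete. It assumes Python's `re` module handles the character class `[\w{}']` the same way as NoteTab's engine; the variable names are only for illustration.

```python
import re

# The character-class pattern from the earlier message.
pattern = re.compile(r"\\foreignlanguage\{greek}\{([\w{}']+)}")

without_spaces = r"\foreignlanguage{greek}{Axel{Berger}Odenthal}"
with_spaces = r"\foreignlanguage{greek}{Axel {Berger} Odenthal}"

m1 = pattern.search(without_spaces)
m2 = pattern.search(with_spaces)

print(m1.group(1) if m1 else None)  # the class covers every character here
print(m2)                           # the space is not in the class
```

The class matches only word characters, braces and the apostrophe, so the first space ends the run before a closing brace is reached and the search finds nothing.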

                  I see now that I had not stressed from the outset that a _general_ solution
                  was needed. What I have is a database of books and articles and I
                  automatically generate lists of references in LaTeX and HTML. TeX is the
                  main format and NT translates to HTML. What I am working on right now is
                  the best format for incorporating titles in non-latin script into that
                  database.

                  Thanks
                  Axel