Re: Finding matching parentheses

  • Eb
    Message 1 of 10, Dec 1, 2012
      When matching tag pairs in HTML, I use a loop and a stack to match up pairs. For nested brackets a similar technique might work, except that there is no need for a stack, since the target is always the same. Just use a counter.


      Start with the 1st opening bracket,
      set the counter to 1
      save your cursor position GetRowStart:GetColStart.
      Search for the next bracket (\{)|(\}) (open or close).
      If it is another opening brace, increment the counter.
      If it is a closing brace, decrement the counter.
      When the counter is back to zero, you're done.
      Save the END cursor position, and select between
      Start and End.

      You can track the depth of the nesting with a second counter that you set to max(counter1,counter2) each time counter1 changes. After counter1 returns to zero, counter2 retains the depth of the deepest nesting.

      This technique also finds the closing brace, if there are parallel nested braces: {...{...}...{...}...}
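
The counter approach described above can be sketched outside NoteTab. This is a minimal Python illustration, not a NoteTab clip; the function name `find_matching_brace` and its variable names are hypothetical and chosen only for this example. It assumes the braces in the text are balanced and that none of them are escaped.

```python
def find_matching_brace(text, open_pos):
    """Return (close_pos, max_depth) for the brace opened at open_pos."""
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at an opening brace")
    depth = 0      # counter1: current nesting level
    max_depth = 0  # counter2: deepest nesting level seen so far
    for i in range(open_pos, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Counter is back to zero: this is the matching brace.
                return i, max_depth
    raise ValueError("no matching closing brace")


sample = (r"Dummes Gelaber \foreignlanguage{greek}"
          r"{Qr'o {some nested {and some more} braces} nos} mehr Gelaber")
prefix = r"\foreignlanguage{greek}"
start = sample.index(prefix) + len(prefix)  # position of the outer "{"
end, deepest = find_matching_brace(sample, start)
inner = sample[start + 1:end]               # text between the outer braces
print(inner, deepest)
```

Slicing from `start + 1` to `end` drops the enclosing braces, so no separate trimming step is needed afterwards.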

      You'll need help from a regexpert for doing this without a loop. I suspect it's possible, but haven't a clue how.



      Cheers


      Eb




      --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
      >
      > In my documents I have this string:
      >
      > \foreignlanguage{greek}{Qr'onos}
      >
      > To convert that from TeX to HTML I need to find the greek word "Qr'onos"
      > Easy:
      >
      > ^!Find "\\foreignlanguage\{greek\}\{([^\}]+)\}" RS1
      >
      > But what if the word or phrase contains an inner {} pair? How do I find
      > and select from the starting outer { up to its matching closing } ?
      >
      > Danke
      > Axel
      >
    • Axel Berger
      Message 2 of 10, Dec 1, 2012
        Eb wrote:
        > When the counter is back to zero, you're done
        > Save the END cursor position, and select between
        > Start and end.

        Yes, I was afraid I might have to resort to something like that. The
        shame is, NT already has everything needed in Search-->Match Brackets,
        but no Clips function for it.

        Still this works:

        ^!Find "\\foreignlanguage\{greek\}(\{)" RS1
        ^!Menu Search/"Match Brackets"
        ^!Set %var%=^$GetSelection$
        ^!Info ^%var%

        on this minimal file:

        Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber

        Axel
      • m.feichtinger
        Message 3 of 10, Dec 2, 2012
          From the help file of NTP62 "Regular Expressions, Recursive Pattern":
          \( ( [^()]++ | (?R) )* \)
          Or in a larger pattern:
          ( \( ( [^()]++ | (?1) )* \) )

          In your case:
          ^!Find "\\foreignlanguage\{greek\}(\{([^{}]++|(?1))*\})" rs1

          The same as named recursion (long line):
          ^!Find "(?x)\\foreignlanguage\{greek\} (?<braces>(?#named) \{ ( [^{}]++ | (?&braces)(?#recursion; reference by name) )* \} )" rs1

          This finds the outer braces and everything nested between them in:
          Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber
          or (long line):
          Dummes Gelaber \foreignlanguage{greek}{Qr'o {some nested {and some more} braces} nos} mehr Gelaber

          You can then skip the enclosing braces, i.e.
          ^!Set %var%=^$StrCopy("^$GetSelection$";2;^$Calc(^$StrSize("^$GetSelection$")$-2)$)$

          HTH

        • flo.gehrke
          Message 4 of 10, Dec 3, 2012
            --- In ntb-clips@yahoogroups.com, "m.feichtinger" <mafei@...> wrote:
            >
            > In your case:
            > ^!Find "\\foreignlanguage\{greek\}(\{([^{}]++|(?1))*\})" rs1
            >
            > Finds the outer braces and all nested braces between from:
            > Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber
            > (...)
            > You can then skip the enclosing braces, i.e.
            > ^!Set %var%=^$StrCopy("^$GetSelection$";2;^$Calc(^$StrSize("^$GetSelection$")$-2)$)$

            You will achieve the same result without ^$StrCopy$ when writing...

            ^!Find "(?x)\\foreignlanguage\{greek\} (\{ ( ([^{}]++|(?1) )* )\} )" RS2

            So far, however, we haven't seen any "inner {} pair" in the sample strings. Also in...

            Dummes Gelaber \foreignlanguage{greek}{Qr'onos} mehr Gelaber

            there is no sequence of outer and inner brackets but a sequence of two brace-delimited strings. So I suppose it's about something like...

            \foreignlanguage{greek}{Axel{Berger}Odenthal}

            Since there isn't much recursion needed here I think a simple pattern like...

            ^!Find "\\foreignlanguage\{greek}\{(.+)}" RS1

            will suffice here.

            If the outer braces should be included in the match (as with Axel's latest clip) try...

            ^!Find "\\foreignlanguage\{greek}(\{.+})" RS1

            Regards,
            Flo
          • Axel Berger
            Message 5 of 10, Dec 3, 2012
              "flo.gehrke" wrote:
              > If the outer braces should be enclosed in the match
              > (as with Axel's latest clip)

              I'd rather not include them, that's just what Match Brackets does. I'd
              have had to eliminate them later.

              > So I suppose it's about something like...
              > \foreignlanguage{greek}{Axel{Berger}Odenthal}

              Yes. Or rather some LaTeX construct like \v{s} for an accented s.

              > I think a simple pattern like...
              > ^!Find "\\foreignlanguage\{greek}\{(.+)}" RS1
              > will suffice here.

              No, that'll find "Axel{Berger" in the example above, not
              "Axel{Berger}Odenthal"

              Danke
              Axel
            • flo.gehrke
              Message 6 of 10, Dec 4, 2012
                --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
                >
                > > So I suppose it's about something like...
                > > \foreignlanguage{greek}{Axel{Berger}Odenthal}
                > > (...)
                > > I think a simple pattern like...
                > > ^!Find "\\foreignlanguage\{greek}\{(.+)}" RS1
                > > will suffice here.
                >
                > No, that'll find "Axel{Berger" in the example above, not
                > "Axel{Berger}Odenthal"

                How come? For me, it's perfectly matching the whole string 'Axel{Berger}Odenthal'. Please test it again.

                I don't have your complete data, so it's difficult to decide this -- but the only problem could possibly be the dot, which might cause a lot of backtracking. So it might be more efficient to define the characters that occur between those brackets:

                ^!Find "\\foreignlanguage\{greek}\{([\w{}']+)}" RS1

                Regards,
                Flo
              • Axel Berger
                Message 7 of 10, Dec 4, 2012
                  "flo.gehrke" wrote:
                  > Please test it again.

                  To be honest, I had not tested, just looked at it, and you're right. It
                  works because you use the greedy find, something I almost never do as a
                  matter of course. The simple reason:
                  Have

                  \foreignlanguage{greek}{Axel{Berger}Odenthal} some more waffle
                  \foreignlanguage{greek}{Axel{Berger}Odenthal}

                  without any hard line break and that search falls flat on its face.
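
The difference between the greedy and the non-greedy search on such a line can be reproduced in a small sketch. It uses Python's `re` module as a stand-in for NoteTab's regex engine, on the assumption that both treat greedy and lazy quantifiers the same way for this simple pattern; the variable names are illustrative only.

```python
import re

# Two occurrences on one logical line, with no hard line break between them.
line = (r"\foreignlanguage{greek}{Axel{Berger}Odenthal} some more waffle "
        r"\foreignlanguage{greek}{Axel{Berger}Odenthal}")

# Greedy: ".+" runs on to the LAST closing brace on the line.
greedy = re.search(r"\\foreignlanguage\{greek\}\{(.+)\}", line)

# Non-greedy: ".+?" stops at the FIRST closing brace, the inner one.
lazy = re.search(r"\\foreignlanguage\{greek\}\{(.+?)\}", line)

print(greedy.group(1))
print(lazy.group(1))
```

Neither variant stops at the matching outer brace: the greedy capture swallows the text between the two occurrences, and the non-greedy capture ends after "Axel{Berger".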

                  > I don't have your complete data, so it's difficult to decide this

                  To use that clip at all I need it to cover a very general case. Once I
                  begin using the foreignlanguage notation in my database in earnest and
                  begin to rely on automatic HTML conversion, I've no idea what might crop
                  up in there. The only thing I do know is, that nested curly braces are
                  one of the most frequent TeX constructs of all.

                  Axel
                • flo.gehrke
                  Message 8 of 10, Dec 4, 2012
                    --- In ntb-clips@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
                    >
                    > "flo.gehrke" wrote:
                    > > Please test it again.
                    >
                    > To be honest, I had not tested, just looked at it, and you're
                    > right. (..) Have
                    >
                    > \foreignlanguage{greek}{Axel{Berger}Odenthal} some more waffle
                    > \foreignlanguage{greek}{Axel{Berger}Odenthal}
                    >
                    > without any hard line break and that search falls flat on its face.

                    That's why I said there could be a problem with the dot and made a second proposal. Would you mind testing this? Thanks.

                    Flo

                    P.S. It's always a problem to deal with an issue without having the data and all details. And it's even less fun if the conditions are changed with each message :-(
                  • Axel Berger
                    Message 9 of 10, Dec 5, 2012
                      "flo.gehrke" wrote:
                      > And it's even less fun if the conditions are changed
                      > with each message :-(

                      Yes, I'm very sorry about that. I had /meant/ to make clear from the
                      outset that those {} may contain just about "anything" and still be a
                      legal TeX construct. Just about the only constraint is that inner curly
                      braces have to be paired and come in the right order.

                      The example given was an extremely simple one. I already knew Match
                      Brackets copes with anything thrown at it so did not need to try to trip
                      it up. I only needed to make sure that my sequence of Find, Match,
                      GetSelection worked as I wanted it to and for that a simple word
                      sufficed. Your second Regex won't find

                      \foreignlanguage{greek}{Axel {Berger} Odenthal}
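
A short check shows why the character-class pattern misses this string. As a hedged illustration it again uses Python's `re` module rather than NoteTab, assuming `\w` and the character class behave alike in both engines for these ASCII samples.

```python
import re

# Flo's second proposal: only word characters, braces and apostrophes allowed.
pattern = re.compile(r"\\foreignlanguage\{greek\}\{([\w{}']+)\}")

no_spaces = r"\foreignlanguage{greek}{Axel{Berger}Odenthal}"
with_spaces = r"\foreignlanguage{greek}{Axel {Berger} Odenthal}"

hit = pattern.search(no_spaces)
miss = pattern.search(with_spaces)

print(hit.group(1))  # the whole braced content is captured
print(miss)          # the space is not in the class, so there is no match
```

Adding more characters to the class only postpones the problem, since the braces may contain almost anything.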

                      I see only now that I had not stressed from the outset that a _general_
                      solution was needed. What I have is a database of books and articles and I
                      automatically generate lists of references in LaTeX and HTML. TeX is the
                      main format and NT translates to HTML. What I am working on right now is
                      the best format for incorporating titles in non-Latin script into that
                      database.

                      Danke
                      Axel