
Empty Array Elements...

  • Paul
    Message 1 of 7 , Dec 8, 2010
      Hi all,

      I have a peculiar problem today! I want GetDocMatchAll to recognise an empty match and assign it to an array element (the empty string)!

      Here's the deal:

      ^!Find "{[^{}]++}" WRS

      finds nested bracket terms. Here's the challenge example:

      {strongly|}

      And the assignment code:
      ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$

      And I want:
      ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%

      to show ^%arrayelements_0%=2.

      Currently either GetDocMatchAll OR the array assignment is ignoring the 'blank' match that the original ^!Find locates. I haven't been able to determine which is the cause and I'm not sure if this can be accomplished.

      Naturally I'm considering using a token in the bracket like VOIDoption to indicate that the two options are either "strongly" or "". i.e. {strongly|VOIDoption} though it would be cleaner not to.

      Thanks in advance for your help.
      Paul


      P.S. For what started as a simple random number generator, the Article Spinner is behaving quite well now, and I'm surprised that it comprises over 1400 lines of commented code, including a maths engine that computes "like-by-hand" to handle ridiculously large numbers. Clips rock! :) Thank you to all who've helped.
    • flo.gehrke
      Message 2 of 7 , Dec 8, 2010
        --- In ntb-clips@yahoogroups.com, "Paul" <xboa721@...> wrote:
        >
        > Hi all,
        >
        > I have a peculiar problem today! I want GetDocMatchAll to recognise an empty match and assign it to an array element (the empty string)!
        >
        > Here's the deal:
        >
        > ^!Find "{[^{}]++}" WRS
        >
        > finds nested bracket terms. Here's the challenge example:
        >
        > {strongly|}
        >
        > And the assignment code:
        > ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$
        >
        > And I want:
        > ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%
        >
        > to show ^%arrayelements_0%=2.
        >
        > Currently either GetDocMatchAll OR the array assignment is ignoring the 'blank' match that the original ^!Find locates. I haven't been able to determine which is the cause and I'm not sure if this can be accomplished.
        >
        > Naturally I'm considering using a token in the bracket like VOIDoption to indicate that the two options are either "strongly" or "". i.e. {strongly|VOIDoption} though it would be cleaner not to.
        >
        > Thanks in advance for your help.
        > Paul

        Paul,

        I'm trying to understand your examples. When running...

        ^!Find "{[^{}]++}" WRS

        against the subject...

        {strongly|}

        you will get one match: '{strongly|}'. So where's the "empty match"?

        ^!Find "[^|}{]++" WRS

        will get one match as well: 'strongly'. Again, no empty match.

        Accordingly,...

        ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$
        ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%

        will output 'Nest_0 = 1' (and not '2').

        If there actually are "empty matches" (matches of zero length), NT will count them correctly. For example: When running...

        ^!SetArray %Array%=^$GetDocMatchAll("(?=.)")$
        ^!Info %Array0% = ^%Array0%

        against the subject

        abc

        the output will be '%Array0% = 3', because '(?=.)' matches at the position before any character, i.e. it achieves three empty matches (matches which don't consume any character). Consequently,...

        ^!Info ^%Array%

        will output ';;'. That is, NT will assign three empty values to the array.
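Flo's two observations can be reproduced outside NoteTab. The following is a minimal sketch in Python, not clip code. It assumes Python's `re` module behaves like NoteTab's regex engine for these simple patterns; the possessive `++` is written as a plain `+` (which finds the same matches here), and the `;` join only imitates how the array was displayed above.

```python
import re

# One-or-more can never produce an empty match: only 'strongly' is found.
one_or_more = re.findall(r"[^|}{]+", "{strongly|}")
print(one_or_more)

# (?=.) succeeds before every character but consumes nothing,
# so each of the three matches is the empty string.
zero_length = re.findall(r"(?=.)", "abc")
print(len(zero_length))
print(";".join(zero_length))  # three empty elements joined by ';'
```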

        Regards,
        Flo
      • diodeom
        Message 3 of 7 , Dec 8, 2010
          Paul <xboa721@...> wrote:
          >
          > I want GetDocMatchAll to recognise an empty match and assign it to an array element (the empty string)!
          >
          > Here's the deal:
          >
          > ^!Find "{[^{}]++}" WRS
          >
          > finds nested bracket terms. Here's the challenge example:
          >
          > {strongly|}
          >
          > And the assignment code:
          > ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$
          >
          > And I want:
          > ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%
          >
          > to show ^%arrayelements_0%=2.
          >
          > Currently either GetDocMatchAll OR the array assignment is ignoring the 'blank' match that the original ^!Find locates. I haven't been able to determine which is the cause and I'm not sure if this can be accomplished.
          >
          >

          The pattern [^|}{] is quantified here with a plus, so it expects *at least one* non-pipe/bracket in order to make a match.

          If it were simply set to look for zero or more, it would capture also unwanted "empties" before and after the brackets. One way to prevent it could be to demand that any match has to be both preceded and followed by a pipe/bracket:

          In selection: {strongly|}
          ^$GetDocMatchAll("(?<=[|}{])[^|}{]*+(?=[|}{])")$
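The effect of the look-behind/look-ahead pair can be checked with a small Python sketch. This is an illustration only, not NoteTab code: it assumes Python's `re` module handles these assertions the same way, and it uses a plain `*` in place of the possessive `*+` (the two find the same matches for this pattern, because giving back characters could never make the look-ahead succeed).

```python
import re

# Zero or more non-pipe/bracket characters, but only where the match is
# both preceded and followed by a pipe or a curly bracket.
OPTION = re.compile(r"(?<=[|}{])[^|}{]*(?=[|}{])")

options = OPTION.findall("{strongly|}")
print(options)       # the second element is the empty option
print(len(options))
```

The empty element comes from the position between `|` and `}`. The positions before `{` and after `}` are rejected because one of the two assertions fails there.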

          Alternatively, these funky look-behind/ahead assertions could be avoided if the selection that GetDocMatchAll operates on (acquired in your previous step) were reduced to just what's inside the brackets, e.g.:

          ^!Find "{\K[^{}]++" WRS
          Then in selection: strongly|
          ^$GetDocMatchAll("[^|}{]*+")$
        • Eb
          Message 4 of 7 , Dec 8, 2010
            Paul,

            without fully understanding the context, but expecting the content AND the curlies already selected, the easy answer is to assign the VBAR as array element, AND leave it out of your forbidden character set:

            ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^}{]+)$

            I'm not sure why you included a double '+', but to capture the example you will get a single array element with an array delimiter OTHER than the vbar. However, with the VBAR as delimiter AND as part of the match, the assignment will automatically return two elements, even if one (or both) is empty.

            Assuming you have a reason for the Find command, if you add look-behind and ahead assertions for the curlies, you will not even need a GetDocMatchAll function, but just an array assignment.

            i.e.

            ^!Find "(?<={)[^{}]+(?=})" WRS
            ^!SetArray ... = ^$GetSelection$

            (you may need to escape the curlies)
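Eb's idea (select only the text between the curlies, then let the vbar act as the array delimiter) can be sketched in Python. This is an analogy under stated assumptions, not clip code: `str.split("|")` stands in for an array assignment that uses `|` as its delimiter, and the sample sentence is made up for illustration.

```python
import re

# Look-behind and look-ahead keep the curlies out of the selection.
INSIDE = re.compile(r"(?<=\{)[^{}]+(?=\})")

selection = INSIDE.search("I {strongly|} recommend it").group(0)
print(selection)

# Splitting on the delimiter keeps empty fields, so the blank option survives.
elements = selection.split("|")
print(elements)
```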


            Cheers,


            Eb


            --- In ntb-clips@yahoogroups.com, "Paul" <xboa721@...> wrote:
            >
            > ... assign it to an array element (the empty string)!
            >
            > ^!Find "{[^{}]++}" WRS
            > ;{strongly|}
            > ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$
            > ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%
            >
            > to show ^%arrayelements_0%=2.
          • Paul
            Message 5 of 7 , Dec 8, 2010
              Thank you for the replies.

              Without implementing the ideas presented (it's waaaaay too late tonight!) they make sense and I'm sure will clear the sticking point for me. Many thanks.

              In answer to the questions: having been away from the details of the regex for so long, I'd basically overlooked the double '+' operator as a first port of call for the ^!Find.

              The lookbehind/ahead assertions really are no problem, and in a subsequent search I have to use them anyway. However, it remains to be seen from testing what is required and what works best.

              I'm intrigued by a direct search without the use of GetDocMatchAll (GDMA); however, given the complexity of search within search, I might just stick with the system I'm currently running :) Then again, I will get a chance to try it out soon. Perhaps the processing time is better.

              Certainly, the proof's in the pudding and I'll be baking soon!

              Kind regards,
              Paul

              p.s. does anyone run an annual competition for search terms that regex can't find? thought not! ;)


            • Paul
              Message 6 of 7 , Dec 8, 2010
                Ah.. the simple answer here Eb is that a document may contain the following:

                {For an example|Zum bespiel} this is {a good|ein besser|the best} sentence to {understand|comprehend|make sense of} the {higher|} purpose behind the program's {intention|purpose|function|ideology}.

                The first {part{ing|ner} left the house in {tatters|pristine condition} as the absent minded vicar {rowed across the {creek.|Thames.}|ran around in circles!}

                Ok, so it's not simple, but did you ever read a Choose Your Own Adventure story by Edward Packard? Great stuff. Well, it's a similar idea run on a nested bracket system. The purpose lends itself to article marketing and to chasing lazy uni students submitting others' work as their own (perhaps with a few words changed).

                So that's content, and yes the curlies were already selected. So if I understand your comment correctly, leaving the VBAR out of the forbidden char set is not an option, as per the previous examples.

                Cheers.
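The nested bracket idea described above can be sketched in Python as an innermost-first loop. This is a guess at the general approach, not Paul's clip: the function name `spin` and the `choose` callback are hypothetical, and a real spinner would presumably pick options at random rather than by a fixed rule.

```python
import re

# An innermost term is a pair of curlies with no other curlies inside.
INNERMOST = re.compile(r"\{([^{}]*)\}")

def spin(text, choose):
    """Replace innermost {a|b|...} terms until none remain."""
    while True:
        text, count = INNERMOST.subn(
            lambda m: choose(m.group(1).split("|")), text
        )
        if count == 0:
            return text

sample = "the vicar {rowed across the {creek.|Thames.}|ran around in circles!}"
first = spin(sample, lambda options: options[0])   # always take the first option
last = spin(sample, lambda options: options[-1])   # always take the last option
print(first)
print(last)
```

Passing `random.choice` as `choose` would give a random variant. An empty option such as the one in `{higher|}` simply yields an empty string.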

              • Paul
                Message 7 of 7 , Dec 9, 2010
                  Thanks Diodeom,

                  With reference to: ^!Find "{[^{}]++}" WRS

                  > The pattern [^|}{] is quantified here with a plus, so it expects *at least one* non-pipe/bracket in order to make a match.

                  If there is no match I want to exit the search routine.

                  > ^$GetDocMatchAll("(?<=[|}{])[^|}{]*+(?=[|}{])")$

                  This works superbly, albeit with the funky look-behind/ahead assertions!

                  > ..if the selection that GetDocMatchAll operates on (acquired in your previous step) were reduced to just what's inside the brackets, e.g.:
                  >
                  > ^!Find "{\K[^{}]++" WRS

                  This knocks out the LHS curly from the found term which is a problem when I replace it with a token.

                  Test
                  As a quick test using the following text:

                  7. {Knocking it|Putting it} Together
                  The {old school|old fashioned|traditional} way to get a screw {into|in} a piece of wood {is|was} to use a {screwdriver|screw driver}! {As with|Like using} any hand tool this is a bit of a {practised|skilled} art.

                  I {strongly|} recommend you beg-borrow-steel a cordless {driver|screwdriver} or {at least|as a second option} a cordless drill. Driving a screw at a {steady|constant} {pace|rate|speed} into the wood will give the best hold.


                  I get the following results:

                  Nest  Number  Contents

                  1:    2       Knocking it | Putting it
                  2:    3       old school | old fashioned | traditional
                  3:    2       into | in
                  4:    2       is | was
                  5:    2       screwdriver | screw driver
                  6:    2       As with | Like using
                  7:    2       practised | skilled
                  8:    2       strongly | (blank)
                  9:    2       driver | screwdriver
                  10:   2       at least | as a second option
                  11:   2       steady | constant
                  12:   3       pace | rate | speed

                  7. *1* Together
                  The *2* way to get a screw *3* a piece of wood *4* to use a *5*! *6* any hand tool this is a bit of a *7* art.

                  I *8* recommend you beg-borrow-steel a cordless *9* or *10* a cordless drill. Driving a screw at a *11* *12* into the wood will give the best hold.

                  Result? I get nest(8) reporting 2 options, one of which is blank, as desired.
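The test above (replace each bracket term with a numbered token and record its options) can be approximated with a short Python sketch. It is not the actual clip: the name `tokenise` and the `*n*` token format are taken from the output shown here or invented for illustration, and plain `+`/`*` stand in for the possessive quantifiers.

```python
import re

TERM = re.compile(r"\{[^{}]+\}")                        # one bracket term
OPTION = re.compile(r"(?<=[|}{])[^|}{]*(?=[|}{])")      # options, blanks included

def tokenise(text):
    """Return the text with *n* tokens plus the option list for each nest."""
    nests = []

    def replace(match):
        nests.append(OPTION.findall(match.group(0)))
        return f"*{len(nests)}*"

    return TERM.sub(replace, text), nests

text, nests = tokenise("I {strongly|} recommend a cordless {driver|screwdriver}.")
print(text)
for number, options in enumerate(nests, start=1):
    print(number, len(options), options)
```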

                  Curiously ^!Find "{[^{}]*+}" WRS
                  also works which got me a little puzzled.. shouldn't 0 or more of the [^{}] match *any* text? Even with the ungreedy? Not a biggie. Unless you can see it obviously I wouldn't spend time on it. Cheers.

                  Thank you very much.
                  Paul
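On the closing question about `{[^{}]*+}`: one likely explanation is that the literal `{` and `}` are still required on either side, so zero-or-more only widens the match to include an empty pair `{}`; it cannot match arbitrary text. Note also that `*+` is the possessive form, while the ungreedy (lazy) form would be `*?`. The Python sketch below illustrates the first point under the assumption that the engines agree here; it uses plain `+` and `*`, and the sample string is made up.

```python
import re

sample = "I {strongly|} recommend {} a {driver|screwdriver}"

# One or more characters between the curlies: '{}' is skipped.
with_plus = re.findall(r"\{[^{}]+\}", sample)

# Zero or more characters between the curlies: '{}' is found too,
# but text outside the curlies still never matches.
with_star = re.findall(r"\{[^{}]*\}", sample)

print(with_plus)
print(with_star)
```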