Re: Empty Array Elements...

  • diodeom
    Message 1 of 7 , Dec 8, 2010
      Paul <xboa721@...> wrote:
      >
      > I want GetDocMatchAll to recognise an empty match and assign it to an array element (the empty string)!
      >
      > Here's the deal:
      >
      > ^!Find "{[^{}]++}" WRS
      >
      > finds nested bracket terms. Here's the challenge example:
      >
      > {strongly|}
      >
      > And the assignment code:
      > ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$
      >
      > And I want:
      > ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%
      >
      > to show ^%arrayelements_0%=2.
      >
      > Currently either GetDocMatchAll OR the array assignment is ignoring the 'blank' match that the original ^!Find locates. I haven't been able to determine which is the cause and I'm not sure if this can be accomplished.
      >
      >

      The pattern [^|}{] is quantified here with a plus, so it expects *at least one* non-pipe/bracket in order to make a match.

      If it were simply set to look for zero or more, it would capture also unwanted "empties" before and after the brackets. One way to prevent it could be to demand that any match has to be both preceded and followed by a pipe/bracket:

      In selection: {strongly|}
      ^$GetDocMatchAll("(?<=[|}{])[^|}{]*+(?=[|}{])")$
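
For anyone who wants to check the three variants outside NoteTab, here is a minimal Python sketch. It assumes Python's `re` module treats these patterns the same way NoteTab's regex engine does, which has not been verified here. The plain `+` and `*` stand in for the possessive `++` and `*+`; on this input they match the same text because the delimiters are excluded from the character class. The helper name `matches` is hypothetical.

```python
import re

selection = "{strongly|}"

def matches(pattern, text):
    """Return every match of pattern in text, empty matches included."""
    return [m.group(0) for m in re.finditer(pattern, text)]

# One or more: the empty alternative after the pipe is never matched.
plus_only = matches(r"[^|}{]+", selection)

# Zero or more, no assertions: extra empty matches appear around the brackets.
star_only = matches(r"[^|}{]*", selection)

# Zero or more, but only between two delimiters.
bounded = matches(r"(?<=[|}{])[^|}{]*(?=[|}{])", selection)

print(plus_only)  # ['strongly']
print(star_only)  # 'strongly' plus several unwanted empty strings
print(bounded)    # ['strongly', '']
```

Only the bounded pattern yields exactly two elements, one of them empty.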

      Alternatively, these funky look-behind/ahead assertions could be avoided if the selection that GetDocMatchAll operates on (acquired in your previous step) were reduced to just what's inside the brackets, e.g.:

      ^!Find "{\K[^{}]++" WRS
      Then in selection: strongly|
      ^$GetDocMatchAll("[^|}{]*+")$
    • Eb
      Message 2 of 7 , Dec 8, 2010
        Paul,

        without fully understanding the context, but expecting the content AND the curlies already selected, the easy answer is to assign the VBAR as array element, AND leave it out of your forbidden character set:

        ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^}{]+)$

        I'm not sure why you included a double '+', but to capture the example you will get a single array element with an array delimiter OTHER than the vbar. However, the VBAR as delimiter AND as part of the match will automatically return two elements, even if one (or both) is empty.

        Assuming you have a reason for the Find command, if you add look-behind and ahead assertions for the curlies, you will not even need a GetDocMatchAll function, but just an array assignment.

        i.e.

        ^!Find "(?<={)[^{}]+(?=})" WRS
        ^!SetArray ... = ^$GetSelection$

        (you may need to escape the curlies)
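
The idea of selecting only the text between the curlies and letting the vertical bar act as the delimiter can be sketched in Python. This is an illustration under assumptions, not NoteTab behaviour: `re.search` stands in for ^!Find, `str.split` stands in for the array assignment, and the sample sentence is borrowed from later in the thread.

```python
import re

text = "I {strongly|} recommend"

# Select only what sits between the curlies, as the look-behind/ahead Find does.
inner = re.search(r"(?<=\{)[^{}]+(?=\})", text)

# Treat the vertical bar as the delimiter; empty elements are kept.
options = inner.group(0).split("|") if inner else []

print(options)  # ['strongly', '']
```

Splitting on the delimiter keeps the trailing empty option without needing a zero-length regex match.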


        Cheers,


        Eb


        --- In ntb-clips@yahoogroups.com, "Paul" <xboa721@...> wrote:
        >
        > ... assign it to an array element (the empty string)!
        >
        > ^!Find "{[^{}]++}" WRS
        > ;{strongly|}
        > ^!SetArray %Nest^%Index%_%=^$GetDocMatchAll([^|}{]++)$
        > ^!Info Nest^%Index%_0 = ^%Nest^%Index%_0%
        >
        > to show ^%arrayelements_0%=2.
      • Paul
        Message 3 of 7 , Dec 8, 2010
          Thank you for the replies.

          Without implementing the ideas presented (it's waaaaay too late tonight!) they make sense and I'm sure will clear the sticking point for me. Many thanks.

          In answer to the questions: having been away from the details of the regex for so long, I'd basically overlooked the double '+' operator as a first port of call for the ^!Find.

          The lookbehind/ahead assertions really are no problem, and in a subsequent search I have to use them anyway. However, it remains to be seen from testing what is required and what works best.

          I'm intrigued by a direct search without the use of GDMA (GetDocMatchAll). However, the complexity of search within search means I might just stick with the system I'm currently running :) Then again, I will get a chance to try it out soon. Perhaps the processing time is better.

          Certainly, the proof's in the pudding and I'll be baking soon!

          Kind regards,
          Paul

          p.s. does anyone run an annual competition for search terms that regex can't find? thought not! ;)


        • Paul
          Message 4 of 7 , Dec 8, 2010
            Ah.. the simple answer here Eb is that a document may contain the following:

            {For an example|Zum bespiel} this is {a good|ein besser|the best} sentence to {understand|comprehend|make sense of} the {higher|} purpose behind the program's {intention|purpose|function|ideology}.

            The first {part{ing|ner} left the house in {tatters|pristine condition} as the absent minded vicar {rowed across the {creek.|Thames.}|ran around in circles!}

            OK, so it's not simple, but did you ever read a Choose Your Own Adventure story by Edward Packard? Great stuff. Well, it's a similar idea run on a nested bracket system. The purpose lends itself to article marketing and to chasing lazy uni students who submit others' work as their own (perhaps with a few words changed).

            So that's content, and yes the curlies were already selected. So if I understand your comment correctly, leaving the VBAR out of the forbidden char set is not an option, as per the previous examples.
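
A nested bracket system like this can be resolved innermost-first, which is what a pattern that forbids braces inside the braces naturally finds. The Python sketch below is only an illustration: the function name `resolve_first` is hypothetical, it always picks the first option instead of choosing one, and it is not the clip itself.

```python
import re

# Innermost group: a pair of braces with no other brace between them.
INNERMOST = re.compile(r"\{([^{}]*)\}")

def resolve_first(text):
    """Replace each innermost group with its first option until none remain."""
    while True:
        text, count = INNERMOST.subn(lambda m: m.group(1).split("|")[0], text)
        if count == 0:
            return text

sample = "the vicar {rowed across the {creek.|Thames.}|ran around in circles!}"
print(resolve_first(sample))  # the vicar rowed across the creek.
```

Each pass removes one level of nesting, so the outer group only becomes matchable once its inner group has been replaced.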

            Cheers.

          • Paul
            Message 5 of 7 , Dec 9, 2010
              Thanks Diodeom,

              With reference to: ^!Find "{[^{}]++}" WRS

              > The pattern [^|}{] is quantified here with a plus, so it expects *at least one* non-pipe/bracket in order to make a match.

              If there is no match I want to exit the search routine.

              > ^$GetDocMatchAll("(?<=[|}{])[^|}{]*+(?=[|}{])")$

              This works superbly, albeit with the funky look-behind/ahead assertions!

              > ..if the selection that GetDocMatchAll operates on (acquired in your previous step) were reduced to just what's inside the brackets, e.g.:
              >
              > ^!Find "{\K[^{}]++" WRS

              This knocks out the LHS curly from the found term which is a problem when I replace it with a token.

              Test
              As a quick test using the following text:

              7. {Knocking it|Putting it} Together
              The {old school|old fashioned|traditional} way to get a screw {into|in} a piece of wood {is|was} to use a {screwdriver|screw driver}! {As with|Like using} any hand tool this is a bit of a {practised|skilled} art.

              I {strongly|} recommend you beg-borrow-steel a cordless {driver|screwdriver} or {at least|as a second option} a cordless drill. Driving a screw at a {steady|constant} {pace|rate|speed} into the wood will give the best hold.


              I get the following results:

              Nest  Number  Contents

              1:    2       Knocking it | Putting it
              2:    3       old school | old fashioned | traditional
              3:    2       into | in
              4:    2       is | was
              5:    2       screwdriver | screw driver
              6:    2       As with | Like using
              7:    2       practised | skilled
              8:    2       strongly | (blank)
              9:    2       driver | screwdriver
              10:   2       at least | as a second option
              11:   2       steady | constant
              12:   3       pace | rate | speed

              7. *1* Together
              The *2* way to get a screw *3* a piece of wood *4* to use a *5*! *6* any hand tool this is a bit of a *7* art.

              I *8* recommend you beg-borrow-steel a cordless *9* or *10* a cordless drill. Driving a screw at a *11* *12* into the wood will give the best hold.

              Result? I get nest(8) reporting 2 options, one of which is blank, as desired.
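
The token replacement and option counts reported above can be mimicked in a few lines of Python. This is a rough sketch, not the clip: `tokenize` is a hypothetical helper, it makes a single pass over one sentence, and nested groups would need repeated passes.

```python
import re

def tokenize(text):
    """Replace each innermost {a|b} group with *n* and record its options."""
    nests = []

    def to_token(match):
        # Splitting on the pipe keeps an empty option such as the one in {strongly|}.
        nests.append(match.group(1).split("|"))
        return "*%d*" % len(nests)

    return re.sub(r"\{([^{}]+)\}", to_token, text), nests

line = "I {strongly|} recommend a cordless {driver|screwdriver}."
tokens, nests = tokenize(line)

print(tokens)  # I *1* recommend a cordless *2*.
for number, options in enumerate(nests, start=1):
    print(number, len(options), options)
```

The first nest reports two options, the second of which is the empty string.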

              Curiously ^!Find "{[^{}]*+}" WRS
              also works, which got me a little puzzled. Shouldn't 0 or more of the [^{}] match *any* text? Even with the ungreedy? Not a biggie. Unless you can see it obviously, I wouldn't spend time on it. Cheers.
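
One likely explanation is that the literal braces on both sides anchor the match: zero or more non-brace characters still have to sit between a `{` and a `}`, so the only extra thing the `*` form can match is an empty pair `{}`. Also, the extra `+` makes the quantifier possessive, not ungreedy. The Python sketch below illustrates the first point with plain `+` and `*`, and the sample text is made up for the comparison.

```python
import re

text = "I {strongly|} recommend {} a cordless {driver|screwdriver}"

# The literal braces anchor every match, so text outside a group never qualifies.
one_or_more = re.findall(r"\{[^{}]+\}", text)
zero_or_more = re.findall(r"\{[^{}]*\}", text)

print(one_or_more)   # ['{strongly|}', '{driver|screwdriver}']
print(zero_or_more)  # ['{strongly|}', '{}', '{driver|screwdriver}']
```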

              Thank you very much.
              Paul