find a specific word along with surrounding words, etc.

  • KenH
    Message 1 of 6 , Sep 4 5:27 PM
      I need to be able to find a word and its surrounding words, up to 3 in front and 3 following. For example, in the sentences below I'd like to search for 'aid' in the source file and get the results file, including the number at the beginning of the line. Any suggestions? (I once had a small smattering of NoteTab clip knowledge but I've let it lapse to practically zero now. I've spent several hours trying to bone up but so far only dismal failure. I searched this group but could not find what I need -- maybe my group searching is rusty too.)

      source file:
      1. Now all good people should come to the aid of their party.
      2. Can't we all just get along?
      3. First aid is important to know.

      result file:
      1. come to the aid of their party
      3. First aid is important to
    • Don
      Message 2 of 6 , Sep 4 5:37 PM
        On 9/4/2011 8:27 PM, KenH wrote:
        > I need to be able to find a word and its surrounding words, up to 3 in front and 3 following. [...]

        Check this out if you want to search for something we have previously done.
        http://htmlfixit.com/blog/?p=344
      • diodeom
        Message 3 of 6 , Sep 5 6:34 AM
          To handle capitalization, punctuation, hyphenated or apostrophized words, the term sitting at the very beginning or end of a sentence, and (presumably disqualifying) mid-word instances (e.g. "said" or "aide"), as in:

          4. It's a well-meant aid; however, it seems rather futile.
          5. Aid them, and they'll quadruple their numbers of needy.
          6. Yeah, I said it with some conviction.
          7. Let the effin' bleeding-heart cavalry come to their aid.

          ...one (of many) alternatives for a pattern meeting your stated needs could be:

          ^(\d+\. ).*?((([\w'-]+)([\pP ]+)){0,3}\b[Aa]id\b)(((?5)(?4)){0,3})

          ...where the first set of parentheses ($1) captures the line number, dot and space; the second outer set ($2) gets up to three words and their separators plus your sample term at word boundaries; and the third outer set ($6) picks up as many as three following words by reusing, in reverse order, subpatterns 5 and 4 of the second set of parens (the (?5) and (?4) subroutine calls).

          In the following clips the %s%earch and %r%eplacement patterns are set apart as variables for clarity:

          ;(start long line)
          ^!Set %s%=^(\d+\. ).*?((([\w'-]+)([\pP ]+)){0,3}\b[Aa]id\b)(((?5)(?4)){0,3})
          ;(end long line)
          ^!Set %r%=$1$2$6
          ^!SetClipboard ^$GetDocListAll("^%s%";"^%r%\r\n")$
          ^!Toolbar Paste New
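
For readers who want to try the word-based pattern outside NoteTab, here is a rough Python sketch of the same idea. It is not the clip above: Python's built-in re module has no \pP class and no (?4)/(?5) subroutine calls, so the word and separator pieces are spelled out twice, and the separator class is only an approximation (any non-word, non-whitespace character or a space). The names WORD, SEP, PATTERN, SAMPLE and word_context are made up for this illustration.

```python
import re

# Assumed building blocks (hypothetical names, not part of the original clip).
WORD = r"[\w'-]+"            # a word, allowing apostrophes and hyphens
SEP = r"(?:[^\w\s]|[ ])+"    # rough stand-in for [\pP ]+ (punctuation or spaces)

# Group 1: line number; group 2: up to 3 words + term; group 3: up to 3 words after.
PATTERN = re.compile(
    rf"^(\d+\. ).*?((?:{WORD}{SEP}){{0,3}}\b[Aa]id\b)((?:{SEP}{WORD}){{0,3}})",
    re.MULTILINE,
)

SAMPLE = """\
1. Now all good people should come to the aid of their party.
2. Can't we all just get along?
3. First aid is important to know.
4. It's a well-meant aid; however, it seems rather futile.
5. Aid them, and they'll quadruple their numbers of needy.
6. Yeah, I said it with some conviction.
7. Let the effin' bleeding-heart cavalry come to their aid.
"""


def word_context(text):
    """Return one 'number + context' string per line that contains the term."""
    return [m.group(1) + m.group(2) + m.group(3) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for hit in word_context(SAMPLE):
        print(hit)
```

Lines 2 and 6 produce nothing, since neither contains "aid" as a whole word.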

          You might find it much more appealing though to collect context chunks by setting the maximum number of characters -- instead of words -- trimmed at word boundaries before and after the term:

          ^!Set %s%=^(\d+\. ).*?((\b\w.{0,17})?\b[Aa]id\b)((.{0,17}\w\b)?)
          ^!Set %r%=$1$2$4
          ^!SetClipboard ^$GetDocListAll("^%s%";"^%r%\r\n")$
          ^!Toolbar Paste New
        • KenH
          Message 4 of 6 , Sep 5 5:40 PM
            Every time I see an elegant solution like this I am impressed. Works great. Thanks.

          • diodeom
            Message 5 of 6 , Sep 5 6:02 PM
              After removing one needless grouping:

              ^!Set %s%=^(\d+\. ).*?((\b\w.{0,17})?\b[Aa]id\b(.{0,17}\w\b)?)
              ^!Set %r%=$1$2
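
The character-window pattern uses nothing beyond ordinary regex syntax, so it should carry over to Python's re module unchanged. The sketch below assumes the same seven sample sentences as earlier in the thread; CHAR_PATTERN, SAMPLE and char_context are names invented for this illustration, and re.MULTILINE is added so that ^ anchors at every line.

```python
import re

# Same pattern as the simplified clip: $1 is the line number, $2 the term
# with up to 18 characters of context on either side, trimmed at word edges.
CHAR_PATTERN = re.compile(
    r"^(\d+\. ).*?((\b\w.{0,17})?\b[Aa]id\b(.{0,17}\w\b)?)",
    re.MULTILINE,
)

SAMPLE = """\
1. Now all good people should come to the aid of their party.
2. Can't we all just get along?
3. First aid is important to know.
4. It's a well-meant aid; however, it seems rather futile.
5. Aid them, and they'll quadruple their numbers of needy.
6. Yeah, I said it with some conviction.
7. Let the effin' bleeding-heart cavalry come to their aid.
"""


def char_context(text):
    """Return 'number + context chunk' for each line containing the term."""
    return [m.group(1) + m.group(2) for m in CHAR_PATTERN.finditer(text)]


if __name__ == "__main__":
    for hit in char_context(SAMPLE):
        print(hit)
```

On this sample the chunks differ from the word-based version only on line 4, where the 18-character window after the term ends one word sooner.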
            • KenH
              Message 6 of 6 , Sep 5 6:58 PM
                Thanks again. I ran across something in my research and tried it in this code. It seems to work but maybe I'll run into side effects somewhere down the line?

                [Aa]id -> (?i)aid
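
One way to see what that swap changes is to compare the two forms on a few spellings. As far as PCRE's documented behavior goes, an inline (?i) stays in effect for the rest of the enclosing group; in the patterns from this thread the rest of that group contains no literal letters, so the visible difference should be limited to the term itself now also matching spellings such as "AID" or "aId". The Python sketch below uses the scoped form (?i:...) because recent Python versions reject a bare (?i) in the middle of a pattern; STRICT, LOOSE and hits are illustrative names only.

```python
import re

STRICT = re.compile(r"\b[Aa]id\b")     # only "aid" and "Aid"
LOOSE = re.compile(r"\b(?i:aid)\b")    # any mix of upper and lower case


def hits(pattern, text):
    """List every whole-word match of the pattern in the text."""
    return pattern.findall(text)


if __name__ == "__main__":
    sample = "aid Aid AID aId said"
    print(hits(STRICT, sample))
    print(hits(LOOSE, sample))
```

Both forms still skip "said", because the word boundaries are unaffected by case folding.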
