[Clip] Re: Creation of clip

  • Flo
    Message 1 of 30 , Jun 21, 2007
      Thanks for that information, Sheri!

      I remember those "10K chunks". Members who want to read up on that
      issue - it's in message # 15213 (see ^!Select +10000...).

      Flo
       
    • Flo
      Message 2 of 30 , Jun 21, 2007
        Sheri wrote...

        > I haven't been following this thread in detail, but if he just wants
        > to remove lines having a keyword, wouldn't it be better to use a
        > replace command (replacing keyword lines with "") instead of using
        > getdocmatchall?

        Indeed - why not this way...


        ^!SetScreenUpdate Off
        ^!SetHintInfo Working...
        ^!Set %Doc%=^$GetDocIndex$
        ^!Set %Keywords%=^?[(T=O;F="Textfiles (*.txt)|*.txt")Choose Keyword File:]
        ^!Set %Case%=^?[Case-sensitive search:==Yes^=(?-i)|_No^=(?i)]
        ^!Open ^%Keywords%
        ^!Replace "(\r\n)+" >> "|" AWRS
        ^!Replace "\|\Z" >> "" AWRS
        ^!Replace "\A\|" >> "" AWRS
        ^!Set %Search%=^$GetText$
        ^!Close ^%Keywords% Discard
        ^!SetDocIndex ^%Doc%
        ^!Menu Edit/Copy All
        ^!Menu Edit/Paste New
        ^!Replace "^%Case%^.*(^%Search%).*\r\n" >> "" AWRS
        ^!Info Finished!


        Regards,
        Flo
         
      • Sheri
        Message 3 of 30 , Jun 22, 2007
          --- In ntb-clips@yahoogroups.com, "Flo" <flo.gehrke@...> wrote:
          >
          > Sheri wrote...
          >
          > > I haven't been following this thread in detail, but if he just wants
          > > to remove lines having a keyword, wouldn't it be better to use a
          > > replace command (replacing keyword lines with "") instead of using
          > > getdocmatchall?
          >
          > Indeed - why not this way...
          >
          >
          > ^!SetScreenUpdate Off
          > ^!SetHintInfo Working...
          > ^!Set %Doc%=^$GetDocIndex$
          > ^!Set %Keywords%=^?[(T=O;F="Textfiles (*.txt)|*.txt")Choose Keyword
          > File:]
          > ^!Set %Case%=^?[Case-sensitive search:==Yes^=(?-i)|_No^=(?i)]
          > ^!Open ^%Keywords%
          > ^!Replace "(\r\n)+" >> "|" AWRS
          > ^!Replace "\|\Z" >> "" AWRS
          > ^!Replace "\A\|" >> "" AWRS
          > ^!Set %Search%=^$GetText$
          > ^!Close ^%Keywords% Discard
          > ^!SetDocIndex ^%Doc%
          > ^!Menu Edit/Copy All
          > ^!Menu Edit/Paste New
          > ^!Replace "^%Case%^.*(^%Search%).*\r\n" >> "" AWRS
          > ^!Info Finished!
          >
          >
          > Regards,
          > Flo
          >
          >

          Great! If interested in making further improvements, here are a few
          more enhancements to consider.

          When a clip makes use of the clipboard, it's nice to restore its
          original contents at the end.

          You are closing the keyword document before navigating back to the
          original document. That is only safe if the keyword document was not
          already open when the clip was started. If it gets closed from a lower
          docindex than the starting document, you would not return to the
          original document when you set your docindex. You'd have to navigate
          to the original docindex first and then close (discard) the keywords
          document.

          Normally it is a good idea to reverse sort the alternates when
          constructing a regular expression, because alternates are tried
          from left to right. With a keyword "be" and a keyword "before",
          "be|before" will never match the whole word "before": the "be"
          alternative succeeds first and only that prefix is matched. Here
          it makes no difference, since whole lines containing any
          alternate are deleted anyway. Using \b's before and after the
          alternates would also work, if the keywords are meant to be
          whole words only.
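
Illustration only: a minimal Python sketch of the alternation-order point above. It is not NoteTab clip code. Python's `re` module tries alternatives left to right in the same way, but it is a different engine from the one NoteTab uses, so treat it as a demonstration of the principle. The sample text is made up.

```python
import re

text = "be here before noon"

# Short alternative first: "be" wins inside "before",
# so the full keyword "before" is never returned.
unsorted_hits = re.findall(r"be|before", text)

# Longer alternative first (what a reverse sort achieves for this pair).
sorted_hits = re.findall(r"before|be", text)

# Word boundaries force whole-word matches, so the order no longer matters.
bounded_hits = re.findall(r"\b(?:be|before)\b", text)

print(unsorted_hits)
print(sorted_hits)
print(bounded_hits)
```

With whole-line deletion, as in the clip, all three variants would remove the same lines; the difference only shows when the matched text itself matters.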

          If there are any characters that might get interpreted by the regex
          engine as metacharacters in the keyword document, they should be
          escaped with a backslash prior to using them in the alternates.
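
Illustration only: a Python sketch of why keywords should be escaped before they are joined into an alternation. `build_alternation` is a hypothetical helper written for this example, and `re.escape` stands in for the backslash-escaping step described above; it is not how the clip does it.

```python
import re

def build_alternation(keywords, escape=True):
    """Join keywords with "|", optionally escaping regex metacharacters."""
    if escape:
        keywords = [re.escape(k) for k in keywords]
    return "|".join(keywords)

keywords = ["a.b", "what?"]

raw = build_alternation(keywords, escape=False)   # a.b|what?
safe = build_alternation(keywords)                # a\.b|what\?

# Unescaped, "." matches any character, so "axb" is hit by mistake.
print(bool(re.search(raw, "axb")))
# Escaped, only the literal keywords match.
print(bool(re.search(safe, "axb")))
print(bool(re.search(safe, "so what?")))
```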

          When constructing a regular expression with code, it's probably a good
          idea to check ^!IfRegexOK before using the expression in a "real"
          statement. If there is an error, you'd have an opportunity to show a
          message and still do clean up tasks (like restore the clipboard).
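
Illustration only: the same "validate first, then use" idea sketched in Python. `check_regex` is a hypothetical helper for this example; in a clip the check would be done with ^!IfRegexOK as described above.

```python
import re

def check_regex(expr):
    """Return (True, None) if expr compiles, else (False, error message)."""
    try:
        re.compile(expr)
    except re.error as err:
        return False, str(err)
    return True, None

# A generated expression with a missing closing parenthesis.
ok, message = check_regex("(be|before")
if not ok:
    # Report the problem; clean-up work can still run after this point.
    print("Invalid expression:", message)
```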

          Regards,
          Sheri
        • Flo
          Message 4 of 30 , Jun 23, 2007
            Hi Sheri,

            I'm grateful to you for all these recommendations, and I tried to
            apply them to this clip...

            > When a clip makes use of the clipboard, it's nice to restore its
            > original contents at the end.

            That isn't the case here, is it? But I think it could easily be done
            by saving its contents in a variable, and afterwards writing it back
            to the clipboard like...

              ^!Set %Var%=^$GetClipboard$ ... ^!SetClipboard ^%Var%

            > You'd have to navigate to the original docindex and then close
            > discard the keywords document.

            I changed the order of these command lines.

            By the way: Isn't it even safer to work with the document name? Given
            that the clip always gets started from the original document, we
            could replace...

              ^!Set %Doc%=^$GetDocIndex$  with  ^!Set %Doc%=^$GetDocName$

            and

              ^!SetDocIndex ^%Doc%  with  ^!Open ^%Doc%

            (According to the help file, I suppose that ^!Open also selects a
            document that is open already.)

            > Normally it would be a good idea to reverse sort alternates...

            See lines 8 and 9 now.

            > metacharacters in the keyword document...should be escaped
            > with a backslash

            Certainly, this would be a professional solution. In message # 15199
            you created a subclip GetRegEscape that would do this job.

            > it's probably a good idea to check ^!IfRegexOK before using the
            > expression in a "real" statement.

            I hope I've done it the right way.

            > Using \b's before and after the alternates would also work, if
            > the keywords are meant to be whole words only.

            This has been added too.

            In addition to that, I've combined the \b's with a negative
            lookbehind and lookahead. They do not allow certain characters
            directly before or after a search word that is being treated as a
            whole word. This is mainly aimed at words joined with the hyphen
            - (ANSI 45) and the apostrophe ' (ANSI 39). For example: if
            "McDonald" is defined as a keyword, it normally matches
            "McDonald's" too, even if wrapped in \b, since - and ' are
            interpreted as word delimiters. Consequently, the clip would
            delete a line like...

                "eating a hamburger at McDonald's"

            although it isn't really matched by "McDonald" as a whole word.
            Or "self-service" would be matched by "self" and "service" as well
            although they possibly are regarded as substrings of "self-service"
            only. It depends, of course, on the way you look at "lexical
            problems" like that, and also on the sort of text to be processed.
            Certainly, this construction needs some more testing...
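
Illustration only: a Python sketch of the lookaround idea described above. Python's `re` has no `[[:punct:]]` class, so this example lists just the hyphen and apostrophe explicitly; `whole_word` is a hypothetical helper, not part of the clip.

```python
import re

def whole_word(keyword):
    """Match keyword as a whole word that is not glued to ' or -."""
    return re.compile(r"(?<!['-])\b" + re.escape(keyword) + r"\b(?!['-])")

line = "eating a hamburger at McDonald's"

# Plain \b sees the apostrophe as a word delimiter, so this matches.
print(bool(re.search(r"\bMcDonald\b", line)))

# With the lookarounds the line is left alone.
print(bool(whole_word("McDonald").search(line)))

# The same guard protects hyphenated compounds.
print(bool(whole_word("service").search("a self-service counter")))
```

The guard works both ways: it also stops a keyword from matching a genitive that perhaps should be caught, so it remains a judgment call per word list.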

            How to deal with compound nouns written with a space (ANSI 32)? For
            example: "Express" would delete "American Express" although we
            possibly don't regard it as a match of that compound. The only
            solution I can see for that is to enter "American Express" with a
            protected space (ANSI 160) in order to distinguish it from the normal
            space (ANSI 32). With regard to this, we could extend the Lookarounds
            with \xA0 in order to match ANSI 160. Maybe there's a better solution
            (or even more problems)...

            Regards,
            Flo


            ^!SetScreenUpdate Off
            ^!SetHintInfo Working...
            ^!Set %Doc%=^$GetDocIndex$
            ^!Set %Keywords%=^?[(T=O;F="Textfiles (*.txt)|*.txt")Choose Keyword File:]
            ^!Set %Case%=^?[Case-sensitive search:==Yes^=(?-i)|_No^=(?i)]
            ^!Set %Substr%=^?[Search whole words only:==Yes^=1|_No^=0]
            ^!Open ^%Keywords%
            ^!Select All
            ^$StrSort("^$GetSelection$";0;0;1)$
            ^!Replace "(\r\n)+" >> "|" AWRS
            ^!Replace "\|\Z" >> "" AWRS
            ^!Replace "\A\|" >> "" AWRS
            ^!Set %Search%=^$GetText$
            ^!SetDocIndex ^%Doc%
            ^!Close ^%Keywords% Discard
            ^!IfTrue ^%Substr% Next Else Skip_2
            ;^!Set %Expr%="^%Case%^.*\b(^%Search%)\b.*\r\n"
            ; start of long line
            ^!Set %Expr%="^%Case%^.*\b(?<![[:punct:]])(^%Search%)(?![[:punct:]])\b.*\r\n"
            ; end of long line
            ^!Goto Skip
            ^!Set %Expr%="^%Case%^.*(^%Search%).*\r\n"
            ; Try next line for testing RegEx error ;-)
            ;^!Set %Expr%="[[:punkt:]]+"
            ^!IfRegExOK "^%Expr%" Next Else Message
            ^!Menu Edit/Copy All
            ^!Menu Edit/Paste New
            ^!Replace "^%Expr%" >> "" AWRS
            ^!Info Finished!
            ^!Goto End

            :Message
            ^!Prompt ^$GetRegexErrorMsg$
          • Sheri
            Message 5 of 30 , Jun 24, 2007
              Hi Flo,

              --- In ntb-clips@yahoogroups.com, "Flo" <flo.gehrke@...> wrote:
              >
              > I'm grateful to you for all these recommendations, and I tried to
              > apply them to this clip...
              >
              > > When a clip makes use of the clipboard, it's nice to restore its
              > > original contents at the end.
              >
              > That isn't the case here, is it?

              Well you do "^!Menu Edit/Copy All" near the end so you can paste the
              result to a new document. As is, that ends up remaining on the
              clipboard after the clip has finished.

              > But I think it could easily be done by saving its contents in a
              > variable, and afterwards pasting it back to the clipboard like..
              >
              > ^!Set %Var%=^$GetClipboard$ ... ^!SetClipboard ^%Var%

              See ^!ClipboardSave and ^!ClipboardRestore

              >
              > > You'd have to navigate to the original docindex and then close
              > > discard the keywords document.
              >
              > I changed the order of these command lines.
              >
              > By the way: Isn't it even safer to work with the document name?
              > Given that the clip always gets started from the original
              > document, we could replace...

              >
              > ^!Set %Doc%=^$GetDocIndex$ with ^!Set %Doc%=^$GetDocName$
              >
              > and
              >
              > ^!SetDocIndex ^%Doc% with ^!Open ^%Doc%

              Yes, that should work. But then NoteTab has to find the docindex
              itself, so it may be slightly faster if you save and restore the
              docindex yourself.

              >
              > (According to the help file, I suppose that ^!Open also selects a
              > document that is open already.)
              >
              > > Normally it would be a good idea to reverse sort alternates...
              >
              > See line #8, and 9 now
              >
              > > metacharacters in the keyword document...should be escaped
              > > with a backslash
              >
              > Certainly, this would be a professional solution. In message # 15199
              > you created a subclip GetRegEscape that would do this job.

              Since you're using a document buffer, you could use a single ^!Replace
              to replace any metacharacters (alternates -- be sure to escape them)
              with "\\$0"; the GetRegEscape clip approach is necessary only when
              acting on a string instead of a document. There is currently no
              provision in NoteTab to do regex string operations.

              >
              > > it's probably a good idea to check ^!IfRegexOK before using the
              > > expression in a "real" statement.
              >
              > I hope I've done it the right way.

              Haven't tried it, but it looks good to me :)

              I haven't made use of classes like punct before myself, so you're
              blazing a trail :)

              >
              > > Using \b's before and after the alternates would also work, if
              > > the keywords are meant to be whole words only.
              >
              > This has been added too.
              >
              > In addition to that, I've combined the \b's with a negative
              > lookbehind and lookahead. They do not allow certain characters
              > before or behind a search word that is being treated as a whole
              > word. This is mainly aiming at words hyphenated with - (ANSI 45)
              > and the apostrophe ' (ANSI 39). For example: If "McDonald" is
              > defined as a keyword it normally matches "McDonald's" too even if
              > embraced with \b
              > since - and ' are interpreted as word delimiters. Consequently, the
              > clip would delete a line like...
              >
              > "eating a hamburger at McDonald's"
              >
              > although it isn't really matched by "McDonald" as a whole word.
              > Or "self-service" would be matched by "self" and "service" as well
              > although they possibly are regarded as substrings of "self-service"
              > only. It depends, of course, on the way you look at "lexical
              > problems" like that, and also on the sort of text to be processed.
              > Certainly, this construction needs some more testing...

              > How to deal with compound nouns written with a space (ANSI 32)?
              > For example: "Express" would delete "American Express" although
              > we possibly don't regard it as a match of that compound. The only
              > solution I can see for that is to enter "American Express" with a
              > protected space (ANSI 160) in order to distinguish it from the
              > normal space (ANSI 32). With regard to this, we could extend the
              > Lookarounds with \xA0 in order to match ANSI 160. Maybe there's a
              > better solution (or even more problems)...

              Hmm, you bring up some interesting points. "American Express" would
              be its own keyword, as would "Express". In the case of the "Express"
              alternate, it could use a negative lookbehind to make sure it is not
              preceded by "American\x20". Obviously that would require some fine
              tuning of the keywords or alternates before applying them, to
              customize them to that extent.
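
Illustration only: the negative lookbehind suggested above, sketched in Python rather than as clip code. The sample lines are made up, and `express_alone` is just a name chosen for this example.

```python
import re

# "Express" as a whole word, unless directly preceded by "American" + space.
express_alone = re.compile(r"(?<!American\x20)\bExpress\b")

lines = [
    "Pay with American Express",
    "Take the Express train",
    "No match here",
]

# Keep every line that does not contain a stand-alone "Express".
kept = [ln for ln in lines if not express_alone.search(ln)]
print(kept)
```

Each such exception has to be written by hand for the keyword concerned, which is the fine tuning mentioned above.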

              Regards,
              Sheri

            • hsavage
              Message 6 of 30 , Jun 25, 2007
                Flo wrote:
                > Hi Sheri,
                >
                > I'm grateful to you for all these recommendations, and I
                > tried to apply them to this clip...
                >
                >> When a clip makes use of the clipboard, it's nice to
                >> restore its original contents at the end.
                >
                > That isn't the case here, is it? But I think it could
                > easily be done by saving its contents in a variable, and
                > afterwards pasting it back to the clipboard like...
                >
                > ^!Set %Var%=^$GetClipboard$ ... ^!SetClipboard ^%Var%

                Flo,

                If you're insistent about restoring the clipboard to its previous state
                after running a clip you might want to check into the following 2 clip
                commands.

                ^!ClipBoardSave
                ^!ClipBoardRestore [+]


                ºvº SL-6-199 -created- 2007.06.25 - 19.48.24

                "Party Etiquette; Drinking Your Fair Share."
                ¤ ø ¤ hrs ø hsavage@...
              • Flo
                Message 7 of 30 , Jun 27, 2007
                  The latest version of this clip splits the keyword list into chunks
                  of 500 lines in order to stay within the limits of the alternation.
                  In my tests, that error message (mentioned above) appeared from 818
                  keywords on. Now it works with any number of keywords. It's
                  designed to delete certain keywords (i.e. stopwords) in a word list,
                  or complete lines in a list that contain these keywords. In
                  full text it will delete whole paragraphs containing the keyword
                  (or substrings).
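
Illustration only: the chunking strategy described above, sketched in Python. The function names and the sample data are hypothetical; the chunk size of 500 is taken from the post. Python's `re` does not share the limit that triggered the error in NoteTab, so this shows the structure of the approach rather than a workaround that Python needs.

```python
import re

def chunked_patterns(keywords, size=500):
    """Yield one alternation pattern per group of at most `size` keywords."""
    # Drop empty entries: an empty alternative would match every line.
    keywords = [k for k in keywords if k]
    for start in range(0, len(keywords), size):
        group = keywords[start:start + size]
        yield "|".join(re.escape(k) for k in group)

def delete_matching_lines(lines, keywords, size=500):
    """Remove every line containing a keyword, one chunk at a time."""
    for pattern in chunked_patterns(keywords, size):
        rx = re.compile(pattern)
        lines = [ln for ln in lines if not rx.search(ln)]
    return lines

keywords = ["kw%dx" % i for i in range(1200)]
sample = ["has kw7x inside", "has kw1199x too", "clean line"]
print(len(list(chunked_patterns(keywords))))
print(delete_matching_lines(sample, keywords))
```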

                  Also metacharacters in the keyword list are escaped now (e.g.,
                  replace ? with \?).

                  H=Delete Keywords
                  ^!SetScreenUpdate Off
                  ^!SetHintInfo Working...
                  ; Save clipboard, and restore it later on (recommended by Sheri)
                  ^!ClipBoardSave
                  ; Store the index of active document
                  ^!Set %Doc%=^$GetDocIndex$
                  ; Choose keyword (stopword) file, case, and whole words
                  ^!Set %Keywords%=^?[(T=O;F="Textfiles (*.txt)|*.txt")Choose Keyword File:]
                  ^!Set %Case%=^?[Case-sensitive search:==Yes^=(?-i)|_No^=(?i)]
                  ^!Set %WholeWords%=^?[Search whole words only:==Yes^=1|_No^=0]
                  ^!Open ^%Keywords%
                  ; Reverse sort of keywords (to put longer words before shorter words)
                  ^!Select All
                  ^$StrSort("^$GetSelection$";0;0;1)$
                  ; Escape metacharacters (next one long line)
                  ^!Replace "\\|\^|\!|\$|\?|\.|\*|\<|\>|\+|\(|\)|\[|\]|\{|\}|\=|\||\:" >> "\\$0" AWRST
                  ; Divide document into chunks of 500 lines to meet the
                  ; restrictions of alternation
                  ^!Set %ChunkIndex%=1
                  ^!Jump 1

                  :Loop_1
                  ^!Select 500
                  ^!Toolbar Copy
                  ; Make alternation by replacing NL with vertical bar
                  ^!SetClipboard ^$StrReplace(^%NL%;|;^$GetClipboard$;0;0)$
                  ; Remove vertical bar at end of string to avoid an empty
                  ; alternative; note: (A|B|) matches A, B, or the empty string.
                  ; You may do the same at start of string, or watch for empty lines
                  ; at the start of keyword list
                  ^!IfSame "^$StrCopyRight(^$GetClipboard$;1)$" "|" Next Else Skip
                  ^!SetClipboard ^$StrDeleteRight(^$GetClipboard$;1)$
                  ; Save chunks in variables %Chunk1%, %Chunk2%, etc.
                  ^!Set %Chunk^%ChunkIndex%%=^$GetClipboard$
                  ^!Jump +1
                  ^!If ^$GetRow$=^$GetLineCount$ Replace
                  ^!Inc %ChunkIndex%
                  ^!Goto Loop_1

                  :Replace
                  ; Return to active document
                  ^!SetDocIndex ^%Doc%
                  ; Close keyword file and copy active document to new document
                  ^!Close ^%Keywords% Discard
                  ^!Menu Edit/Copy All
                  ^!Menu Edit/Paste New
                  ^!Set %RepIndex%=1

                  :Loop_2
                  ^!If ^%RepIndex% > ^%ChunkIndex% Finish
                  ; Grab %Chunk1%, %Chunk2%, etc. for search
                  ^!Set %Search%=^%Chunk^%RepIndex%%
                  ; If "whole words", use word delimiters in RegEx; lookarounds
                  ; prevent hyphenated words from being deleted
                  ^!IfTrue ^%WholeWords% Next Else Skip_2
                  ^!Set %Expr%="^%Case%^.*\b(?<![-])(^%Search%)(?![-])\b.*(\r\n|\z)"
                  ^!Goto Skip
                  ^!Set %Expr%="^%Case%^.*(^%Search%).*(\r\n|\z)"
                  ; Check syntax of RegEx
                  ^!IfRegExOK "^%Expr%" Next Else Message
                  ; Delete matching words and lines
                  ^!Replace "^%Expr%" >> "" AWRS
                  ^!Inc %RepIndex%
                  ^!Goto Loop_2

                  :Finish
                  ^!Info Finished!
                  ^!ClipBoardRestore
                  ^!Goto End

                  :Message
                  ^!Prompt ^$GetRegexErrorMsg$
                  ; end of clip


                  The clip prevents terms hyphenated with - (ANSI 45) from being
                  deleted by substrings, e.g. "self" would not delete "self-catering"
                  (unless you choose deleting of substrings).

                  Regarding apostrophes and compound nouns with space I've been on the
                  wrong track. This issue is much more complicated, and I don't think
                  it could be solved by a general RegEx that would match all
                  eventualities. The apostrophe, for example, is used in a company name
                  like "McDonald's". This name will be deleted by a substring "Mc", and
                  by "McDonald" defined as a whole word as well since the apostrophe is
                  interpreted as a word delimiter. On the other hand, it indicates the
                  genitive of a lemma that possibly should be deleted, e.g. "Dickens'
                  works".

                  Another idea is to process the source file with the following clip
                  before running the "Delete Keywords" clip (of course, it also may be
                  integrated into "Delete Keywords").

                  Look at the following company names...

                  McDonald's
                  General Electric
                  Bank of America

                  In order to protect these names from being deleted
                  by "McDonald", "electric", or "bank", the Protect Keywords clip
                  replaces the apostrophe and space with _apo_ and _spc_ (even more
                  characters may be added that function as word delimiters). Thus the
                  names are interpreted as whole words. After running "Delete Keywords"
                  we can reverse this replacement.
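
Illustration only: the protect / delete / unprotect round trip described above, sketched in Python. `protect` and `unprotect` are hypothetical helpers; the placeholders `_apo_` and `_spc_` are the ones named in the post. The trick relies on the underscore counting as a word character, so \b no longer finds a boundary inside the masked term.

```python
import re

def protect(text, terms):
    """Mask apostrophes and spaces inside each protected term."""
    for term in terms:
        masked = term.replace("'", "_apo_").replace(" ", "_spc_")
        text = text.replace(term, masked)
    return text

def unprotect(text):
    """Reverse the masking after the keyword deletion has run."""
    return text.replace("_spc_", " ").replace("_apo_", "'")

terms = ["McDonald's", "General Electric", "Bank of America"]
source = "Lunch at McDonald's, then a General Electric meeting"

masked = protect(source, terms)
print(masked)
# Neither whole-word keyword matches the masked text any more.
print(bool(re.search(r"\bMcDonald\b", masked)))
print(bool(re.search(r"\bElectric\b", masked)))
print(unprotect(masked) == source)
```

This protects against whole-word keywords only; a substring search for "Mc" would still hit the masked name.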

                  First of all, you have to create a PROTECT.TXT file that contains a
                  list of terms like those three company names mentioned above.

                  Please note that "Protect Keywords" is meant to be run on the source
                  file, not on the keyword (or stopword) list!


                  H=Protect Keywords
                  ^!SetScreenUpdate Off
                  ^!SetHintInfo Working...
                  ^!Goto=^?[Choose action:==Protect Words^=Protect|Remove Protection^=Remove]

                  :Protect
                  ^!Set %Doc%=^$GetDocIndex$
                  ; Choose the list of words to be protected, e.g. PROTECT.TXT
                  ^!Set %ProFile%=^?{(T=O;F="Textfiles (*.txt)|*.txt")Choose Protected List:}
                  ^!Open ^%ProFile%
                  ^!Jump Doc_End
                  ^!IfFalse ^$IsEmpty(^$GetLine$)$ Next Else Skip
                  ^!InsertText ^%NL%
                  ^!Set %LineIndex%=^$GetTextLineCount$

                  :Loop_1
                  ^!Jump ^%LineIndex%
                  ^!SetClipboard ^$StrReplace("'";"_apo_";"^$GetLine$";0;0)$
                  ^!SetClipboard ^$StrReplace("^%Space%";"_spc_";"^$GetClipboard$";0;0)$
                  ^!Jump Line_End
                  ^!InsertText "^P^$GetClipboard$"
                  ^!If ^%LineIndex%=1 Replace
                  ^!Dec %LineIndex%
                  ^!Goto Loop_1

                  :Replace
                  ^!Select All
                  ^!SetListDelimiter ^p
                  ^!SetArray %Except%=^$GetSelection$
                  ^!SetDocIndex ^%Doc%
                  ^!Close ^%ProFile% Discard
                  ^!Jump 1
                  ^!Set %Count%=1

                  :Loop_2
                  ^!If ^%Count%=^%Except0% End
                  ^!Set %Search%="^%Except^%Count%%"
                  ^!Inc %Count%
                  ^!Set %Repl%="^%Except^%Count%%"
                  ^!Replace "^%Search%" >> "^%Repl%" AWRS
                  ^!Inc %Count%
                  ^!Goto Loop_2

                  :Remove
                  ^!Replace "_spc_" >> "^%Space%" AWST
                  ^!Replace "_apo_" >> "'" AWST

                  :End
                  ^!Info Finished!


                  Regards,
                  Flo
                   
                • ebbtidalflats
                  Message 8 of 30 , Jun 28, 2007
                    Hi Flo,

                    I'm curious about a line in your clips, where you replace the text in
                    the document with ^$StrSort.

                    I see what you're doing, but am wondering why you chose the function,
                    rather than the menu command?

                    ^!Menu Modify/Lines/Sort/Descending

                    to select and sort all in one step, instead of using three different
                    functions.

                    > ^!Select All
                    > ^$StrSort("^$GetSelection$";0;0;1)$

                    Also, why sort the short words to the bottom? I know you put a lot of
                    effort into this, but didn't the original poster's example call for
                    finding partial words? (We haven't heard from him for some time.) If
                    so, wouldn't finding the partials first speed up the search by
                    eliminating a lot of lines from the search for the longer words?

                    Just curious.


                    One more question. Do you have a specific use in mind for this keyword
                    manipulation? Is this a comparison of two keyword lists, or what? Or
                    was this just a clipcoding exercise?


                    Thanks,


                    Eb
                  • Flo
                    Message 9 of 30 , Jun 29, 2007
                      --- In ntb-clips@yahoogroups.com, "ebbtidalflats" <ebbtidalflats@...>
                      wrote:
                      >
                      > Hi Flo,
                      >
                      > I'm curious about a line in your clips,...

                      Eb,

                      > ...why you chose the function, rather than the menu command?

                      The menu command follows the settings in "Options | Tools".
                      ^$StrSort$ lets you define the sort order independently of
                      these settings.

                      > Also, why sort the short words to the bottom?

                      This has been described by Sheri before. Sheri also explained why
                      this isn't really necessary when running the clip on word lists and
                      lines.

                      > wouldn't finding the partials speed up the search

                      I think it isn't a matter of speed, and the difference would scarcely
                      be measurable. What really matters is what you want to achieve.
                      That's why you can choose substrings or whole words.

                      > Do you have a specific use in mind for this keyword
                      > manipulation? Is this a comparison of two keyword lists, or
                      > what?

                      One use, I suppose, has been described sufficiently (protection of
                      certain terms and word forms from being deleted by substrings). There
                      are many more applications I could think of. Why not compare two
                      word lists, e.g. by subtracting list A from list B in order to get
                      the difference? For me, dealing with word lists is mainly related to
                      Text Retrieval and indexing of text databases, and NT has become an
                      indispensable tool in this field.
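
Illustration only: "subtracting list A from list B" sketched in Python. `subtract_list` is a hypothetical helper and the word lists are made up; it compares whole entries exactly, which corresponds to the clip's whole-word, case-sensitive mode.

```python
def subtract_list(list_b, list_a):
    """Return the words of list_b that do not occur in list_a, keeping order."""
    exclude = set(list_a)
    return [word for word in list_b if word not in exclude]

list_a = ["beta", "delta"]
list_b = ["alpha", "beta", "gamma", "delta"]
print(subtract_list(list_b, list_a))
```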

                      Several members have contributed to this thread. I just tried to find
                      out how these proposals could be integrated into this clip. It isn't
                      more than a box of building blocks. Maybe you could pick out some
                      ideas matching your own needs...

                      Flo
                       
                    • ebbtidalflats
                      Message 10 of 30 , Jun 30, 2007
                        Flo,

                        --- In ntb-clips@yahoogroups.com, "Flo" <flo.gehrke@...> wrote:
                        >
                        > > Also, why sort the short words to the bottom?
                        >
                        > This has been described by Sheri before. Sheri also explained why
                        > this isn't really necessary when running the clip on word lists and
                        > lines.

                        I asked, because that approach is counter to the original request.
                        Not that there was a whole lot of input from the requester.

                        However, he did furnish an example, that specifically searched for
                        partial words. Hence my curiosity.


                        > are many more applications I could think of. Why not comparing two
                        > word lists, e.g. by subtracting list A from list B in order to get
                        > the difference?

                        Ahh! Good idea.

                        > For me, dealing with word lists is mainly related to
                        > Text Retrieval and indexing of text databases, and NT has become an
                        > indispensable tool in this field.

                        Hm, mine is more in the area of glossaries, but NT is just as
                        indispensable to me.


                        Thanks for your comments.


                        Eb