
12909 Re: Extracting words from a file

  • Hugo Paulissen
    Dec 4, 2004
      >
      > I just happened across this thread. If I have understood your needs
      > correctly, why not just reduce the list to a single column of words
      > and sort them case-sensitively?
      >
      > 1. Replace all spaces in the document with "^P" to change the list to
      > individual words (ignore punctuation, if you like).
      >
      > 2. Sort the list CASE SENSITIVE
      >
      > 3. Delete the lower case words
      >
      >
      > A 500 K file should contain about 80,000 words or so. It shouldn't take
      > more than a few minutes to do this by hand. If you have a lot of
      > files you can always write down the keystrokes you use, then do the
      > sort by Menu commands (^!Menu Modify/...). I think there's a
      > configuration switch to change sorting behaviour (remove duplicates
      > or not; case sensitive or not).
      >
      >
      > Abair



      We're going around in circles...

      Isn't this what I proposed a few messages earlier?

      > What about this approach? You can easily see for yourself if this is
      > of any help.
      >
      > 1. replace " " with "^P" - don't know how fast that would be
      > 2. trim/left align the text (which should have most words on a
      > separate line by now)
      > 3. sort the document with [Case Sensitive Sorting] and [Remove
      > Duplicates] switched on (in options)
      >
      > Hugo
      >
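
For anyone who would rather script this than do it in the editor, the three steps quoted above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the function names `unique_words` and `drop_lowercase` are made up here, punctuation is left attached to the words, and "lower case words" is taken to mean words whose first character is lower case.

```python
def unique_words(text):
    """Split text into words, drop duplicates, sort case-sensitively."""
    # Steps 1 and 2: split() breaks on any run of whitespace, which puts
    # each word in its own entry and trims stray blanks at the same time.
    words = text.split()
    # Step 3: set() removes duplicates; sorted() compares by code point,
    # so for plain ASCII text capitalised words come before lower-case ones.
    return sorted(set(words))


def drop_lowercase(words):
    """Keep only the words that do not start with a lower-case letter."""
    return [w for w in words if not w[0].islower()]


if __name__ == "__main__":
    sample = "the Quick brown Fox jumps over the lazy Dog and the Fox"
    words = unique_words(sample)
    print(words)
    print(drop_lowercase(words))
```

Because the case-sensitive sort groups the capitalised words at the top of the list for ASCII text, the final filter does the same job as deleting the lower-case block by hand.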