Re: Filter out common words in Text Statistics

  • flo.gehrke
    Message 1 of 3, Feb 16, 2011
      --- In notetab@yahoogroups.com, "calarts72" <calarts@...> wrote:
      >
      > I bought NoteTab specifically to develop lists of keywords in my
      > web files. I'd like to screen/filter/delete/whatever common words
      > like a, an, the--even a long list of such words, if possible.

      There's no way to exclude a list of "noise words" or "stop-words" from 'Tools | Text Statistics'.

      For this, I would strongly recommend AntConc (www.antlab.sci.waseda.ac.jp/software.html), a freeware tool for text analysis. AntConc lets you create a word list while excluding a list of stop-words.

      For example, I use AntConc to create word lists from text files while excluding more than 30,000 words that have no relevance for indexing.

      Nevertheless, you will enjoy NT, since it's an excellent tool for many jobs like that. For example, you could run the NT Text Statistics and then remove those stop-words from the result. But for that job you must be willing to get involved in NT Clip Programming...
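
To show the general idea of removing stop-words from a word count, here is a minimal sketch in Python. It is neither NoteTab Clip code nor the way AntConc works internally. The function name keyword_counts, the sample text, and the tokenizing rule (lowercase letters, digits, and apostrophes) are assumptions made for illustration only.

```python
import re
from collections import Counter


def keyword_counts(text, stop_words):
    """Count the words in text, skipping every word listed in stop_words."""
    # Compare case-insensitively, so "The" and "the" are treated alike.
    stop = {w.lower() for w in stop_words}
    # Assumed tokenizing rule: runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(w for w in words if w not in stop)


sample = "The cat and the hat: a cat in an old hat."
counts = keyword_counts(sample, ["a", "an", "the", "and", "in"])

# List the remaining words, most frequent first.
for word, n in counts.most_common():
    print(word, n)
```

A real stop-word list would normally be read from a file with one word per line, so it can grow to thousands of entries without changing the code.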

      Regards,
      Flo