Re: Filter out common words in Text Statistics
--- In email@example.com, "calarts72" <calarts@...> wrote:
> There's no way to exclude a list of "noise words" or "stop-words" from 'Tools | Text Statistics'.
> I bought NoteTab specifically to develop lists of keywords in my
> web files. I'd like to screen/filter/delete/whatever common words
> like a, an, the--even a long list of such words, if possible.
For this, I would strongly recommend AntConc (www.antlab.sci.waseda.ac.jp/software.html), a freeware tool for text analysis. AntConc lets you build a word list while excluding a list of stop-words that you supply.
For example, I use AntConc to create word lists from text files while excluding more than 30,000 words that have no relevance for indexing.
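
If you would rather script the idea yourself, here is a minimal Python sketch of a word list with stop-words filtered out. It is not how AntConc or NoteTab work internally, only an illustration of the principle; the function name word_frequencies, the sample text, and the short stop-word list are all made up for the example.

```python
import re
from collections import Counter


def word_frequencies(text, stop_words):
    """Count the words in text, skipping any word found in stop_words."""
    # Lower-case the stop-words so the comparison is case-insensitive.
    stop = {w.lower() for w in stop_words}
    # A simple notion of "word": runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(w for w in words if w not in stop)


# Hypothetical sample data, for illustration only.
sample = "The cat and the dog chased a ball, and the ball rolled away."
stop_words = ["a", "an", "the", "and"]

freq = word_frequencies(sample, stop_words)
for word, count in freq.most_common():
    print(word, count)
```

In practice you would read the text and the stop-word list from files (one stop-word per line, say) instead of hard-coding them.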
Nevertheless, you will enjoy NT, since it's an excellent tool for many jobs of that kind. For example, you could run the NT Text Statistics and then remove the stop-words from the result. But for that job you must be willing to get involved in NT Clip programming...