RE: [NTB] Filter out common words in Text Statistics
- G'day Cal,
> because it was "too advanced" for this group. Really?
Not really - it's more a case of it being off-topic for this list, which is intended for the basic use of NoteTab. Clip programming warrants a list of its own, and the Clips list is provided expressly for this purpose. Your chances of getting help (which may even extend to someone writing a clip for you) will be greatly increased by asking your question on the Clips list.
> -----Original Message-----
> From: email@example.com [mailto:firstname.lastname@example.org] On
> Behalf Of calarts72
> Sent: Thursday, 17 February 2011 06:13
> To: email@example.com
> Subject: [NTB] Filter out common words in Text Statistics
> I bought NoteTab specifically to develop lists of keywords in my web
> files. I'd like to screen/filter/delete/whatever common words like a,
> an, the--even a long list of such words, if possible. However, I don't
> have the time (deadlines!) to learn NoteTab from the ground up.
> An authoritative "no way!" or a brief referral to sections of the
> tutorial would be acceptable.
> Another member asked a similar question and was referred to the Clips
> group because it was "too advanced" for this group. Really?
- --- In firstname.lastname@example.org, "calarts72" <calarts@...> wrote:
There's no way to exclude a list of "noise words" or "stop-words" from 'Tools | Text Statistics'.
> I bought NoteTab specifically to develop lists of keywords in my
> web files. I'd like to screen/filter/delete/whatever common words
> like a, an, the--even a long list of such words, if possible.
For this, I would strongly recommend AntConc (www.antlab.sci.waseda.ac.jp/software.html), a freeware tool for text analysis. AntConc lets you create a word list while excluding a list of stop-words.
For example, I use AntConc to create word lists from text files while excluding more than 30,000 words that have no relevance for indexing.
Nevertheless, you will enjoy NT, since it's an excellent tool for many jobs like that. For example, you could run the NT Text Statistics and then remove the stop-words from the result. But for that job you must be willing to get involved in NT Clip programming...
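If you just want to see the general idea of a word list with stop-words excluded, here is a rough sketch in Python. It is only an illustration, not how AntConc or NoteTab work internally; the function name `keyword_counts` and the tiny `STOP_WORDS` set are made up for this example, and a real stop-word list would be far longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in"}

def keyword_counts(text, stop_words=STOP_WORDS):
    """Count word frequencies in text, skipping any word in stop_words."""
    # Lowercase first so "The" and "the" are treated as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

if __name__ == "__main__":
    sample = "The cat and the hat: a tale of an old cat."
    # Print the remaining words, most frequent first, as "count<TAB>word".
    for word, count in keyword_counts(sample).most_common():
        print(f"{count}\t{word}")
```

In practice the stop-words would probably live in a plain text file, one word per line, that gets read into the set before counting.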