
Using NTB and Regex to search for and list groups of words, phrases, in a speech

  • paultrewern
    Message 1 of 4 , Nov 6, 2013
      Hi all. My problem is that I need to find how often phrases or groups of words are used in a speech. That is, I want to search the text to find combinations of words within sentences that are repeated, starting with, say, 2-word phrases and going up to 8 or more words.

      Any assistance would be greatly appreciated. Thanks in advance, Paul.
    • Alec Burgess
      Message 2 of 4 , Nov 6, 2013
        On 2013-11-06 20:45, paultrewern@... wrote:
        Hi all. My problem is I need to find how often phrases or groups of words are used in a speech. That is, to search the text to find combinations of words within sentences that are repeated, starting with say 2 word phrases and going up to 8 or more words.

        Any assistance would be greatly appreciated. Thanks in advance, Paul.
        I thought I remembered something about this: back in 2007 there was some discussion about concordances (which I think is what you are describing). I haven't re-checked the link, but I think TextStat (described below) might be worth looking at. Whether or not it helps, please post back.
        "idisnick" <idisnick@...> said on 06/13/2007 5:47:08 PM -0400

        Can you help me create this clip? I can't figure it out.
        I have a long list and a second list of keywords.
        I would like to have NoteTab take the keywords list, search the
        first list for them, and parse out all the lines (entire lines) that
        contain those keywords. If you want a fee to do this, let me know. Thanks.
        We had a long discussion about this at the beginning of the year (search the list for "concordance"). It has a working clip.

        An open-source program to do this is: TextStat.
        "TextSTAT - Free concordance software for Windows and Linux
        TextSTAT is a simple programme for the analysis of texts. It reads ASCII/ANSI texts (in different encodings) and HTML files (directly from the internet) and ...
        www.niederlandistik.fu-berlin.de/textstat/software-en.html"
        http://www.google.com/search?q=TextStat&sourceid=navclient-ff&ie=UTF-8&rls=GGGL,GGGL:2006-39,GGGL:en

        On checking the link, I can't tell if it does phrases as well as words but it does support regex so maybe you can make use of it.
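As a rough illustration of how a regex alone can pick out repeated phrases, here is a minimal Python sketch. It is not TextStat's or NoteTab's syntax, and the function name `repeated_phrase_pattern` is made up for this example. It uses a backreference inside a lookahead to find every n-word phrase that occurs again later in the text. Words must be separated by whitespace only, so a phrase never runs across a comma or full stop.

```python
import re

def repeated_phrase_pattern(n):
    """Build a regex matching any n-word phrase that recurs later in the text."""
    # One word followed by (n - 1) further whitespace-separated words.
    phrase = r"\w+(?:\s+\w+){%d}" % (n - 1)
    # The outer lookahead keeps the match zero-width, so overlapping phrases
    # are all reported; \1 requires the same phrase to appear again later.
    return re.compile(
        r"\b(?=(" + phrase + r")\b(?=.*?\b\1\b))",
        re.IGNORECASE | re.DOTALL,
    )

text = "We shall fight on the beaches, we shall fight on the landing grounds."
hits = [m.group(1) for m in repeated_phrase_pattern(2).finditer(text)]
print(hits)  # ['We shall', 'shall fight', 'fight on', 'on the']
```

This rescans the rest of the text for every candidate phrase, so it suits a single speech rather than a large corpus.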

        --
        Regards ... Alec (buralex@gmail & WinLiveMess - alec.m.burgess@skype)
      • josephHarris
        Message 3 of 4 , Nov 6, 2013
          Paul,

          I have seen this in word processors, but it is a long time since I used that facility, and I can't direct you to it. It might be worth searching for something like "word repetition software" or "phrase repeat". I doubt it needs much, if any, expenditure to find a way. I imagine most book and script writing programs have it.

          Have a look at http://www.spacejock.com/. He offers several good programs free for writers, and is a writer himself. yWriter has a usage count that might help.

          Joseph Harris

        • paultrewern
          Message 4 of 4 , Nov 12, 2013

            Hi Joseph and Alec,


            Thanks for the responses and the suggestions. In researching those I have found the following, all of which do what I'm after to some extent or other:


            http://ontolo.com/content-marketing-phrase-occurrence-tool - super fast and online - I wish I knew the workings behind this one


            http://prowritingaid.com/ - a Word app, with an online version


            http://www.textanz.com/ - Java based


            Regards, Paul.
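For anyone wondering how a phrase-occurrence counter might work, here is a minimal Python sketch. It is only an assumption about the general approach (counting word n-grams within each sentence), not the actual method used by any of the tools listed above. The function name `repeated_phrases` and its parameters are made up for this example.

```python
import re
from collections import Counter

def repeated_phrases(text, min_words=2, max_words=8):
    """Count every phrase of min_words..max_words words; keep those seen more than once."""
    counts = Counter()
    # Split on ., ! or ? so that a phrase never spans a sentence break.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z0-9']+", sentence)
        for n in range(min_words, max_words + 1):
            # Slide an n-word window across the sentence.
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return {phrase: c for phrase, c in counts.items() if c > 1}

speech = "We shall fight on the beaches. We shall fight on the landing grounds."
result = repeated_phrases(speech)
# List the longest phrases first, then alphabetically.
for phrase in sorted(result, key=lambda p: (-len(p.split()), p)):
    print(result[phrase], phrase)
```

The word pattern only covers unaccented letters, digits and apostrophes, so it would need widening for text in other languages.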






