NTP: Find and Extract?

  • BH
    Message 1 of 3 , Feb 4, 2005
      Is it possible for NTP to search for a specific string of text in
      a .txt file and extract only those lines or only the text after the
      specific search string? That sounds confusing, so let me explain in
      more detail.

      I am trying to figure out a way to extract search engine keywords
      from my website's raw data logs, ie.

      If I go to NTP's find option and type in the beginning of Google's
      search query, ie:

      ?q=

      It will find all the strings with those characters in it. Now I want
      NTP to highlight everything after the ?q= so I can copy it to a new
      document and drill down to the actual SE keywords. Is there an
      easy way to do this? My web host's stats are worthless, so I'm
      trying to find a better route in data log analysis.

      Second example:

      I'd like to be able to follow a specific IP through my data logs and
      have NTP extract only those lines that start with that IP number. Is
      that possible?

      Thanks a million. I've been using NTP for many years and so have all
      of my colleagues. It has us completely spoiled when it comes to text
      editors.
    • Don - htmlfixit.com
      Message 2 of 3 , Feb 4, 2005
        The short answer: yes
        The off-topic answer: try this ...
        http://htmlfixit.com/cgi-bin/demos/advx-counter/stats.pl?login=yes
        log in, click on search engine summary, click on view google terms.
        This is a $10 counter (there is also a free version which lacks this
        level of drill down to terms but is still a great little utility). You
        might find that useful. Please keep any further discussion of this
        under off-topic.
        The long answer: to do it well I need some exact data, but I presume you
        can go one line at a time, search for ?q=, select to beginning of line,
        delete, and jump to next line. When you are done you have only the text
        after the q=. You could then run text statistics on it. I'll see if I
        can write that in a minute.
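
For readers working outside NTP, the line-by-line steps described above can be sketched in Python. This is only an illustration under assumptions: the function name and the sample lines are made up, and the word count at the end is a stand-in for NTP's text statistics, not a reproduction of it.

```python
from collections import Counter

def text_after_marker(lines, marker="?q="):
    """Keep only the text that follows `marker` on lines that contain it."""
    kept = []
    for line in lines:
        pos = line.find(marker)
        if pos == -1:
            continue  # no marker on this line, so the whole line is dropped
        # Discard the marker and everything before it.
        kept.append(line[pos + len(marker):].rstrip("\n"))
    return kept

# Made-up sample lines, not real log data.
sample = [
    "http://www.google.com/search?q=notetab+clips\n",
    "http://www.example.com/index.html\n",
    "http://www.google.com/search?q=text+editor\n",
]
terms = text_after_marker(sample)

# Rough word statistics, assuming words are joined with "+".
word_counts = Counter(word for term in terms for word in term.split("+"))
```

Calling it a second time with marker="&q=" would cover the lines where q is not the first parameter.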
      • Don - htmlfixit.com
        Message 3 of 3 , Feb 4, 2005
          As you know I am "blogging" these clips now to make them easier for me
          to find my own work, and in this case to provide additional details
          without bogging down the list. Here is the address of the posting:
          http://htmlfixit.com/blog/index.php?p=313
          In the post I give some sample data, the results obtained from that, and
          the text statistics on it. I could do lots more. I also noticed that
          some search engines don't use q=, for example Dogpile. And some
          non-search engines might use q= in a query string ... and you would
          still get those.
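
One way around both caveats, sketched in Python rather than as a clip, is to parse each URL and look up the parameter name per host. The table below is hypothetical: only the Google entry comes from this thread, and any other engine would have to be added after checking what its URLs actually look like.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical table: host name -> query parameter that holds the search terms.
# Only Google's "q" is taken from this thread; extend it after inspecting real URLs.
SEARCH_PARAMS = {"www.google.com": "q"}

def search_terms(url, table=SEARCH_PARAMS):
    """Return the decoded search terms, or None if the host is not a known engine."""
    parts = urlsplit(url)
    param = table.get(parts.hostname or "")
    if param is None:
        return None  # not a listed search engine, even if the URL has q=
    values = parse_qs(parts.query).get(param)
    return values[0] if values else None
```

Because the lookup is keyed on the host, a non-search site that happens to use q= in its query string is skipped instead of being counted.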

          Anyway, here is the clip:

          ;effort by don at htmlfixit.com
          ;02/04/05

          ; to take query terms from lines of stats
          ;one long example line
          ;http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q="We are what we know"
          ;becomes
          ;"We are what we know"

          ;turn off wordwrap
          ^!SetWordWrap Off
          ;^!SetDebug On

          ;go to start of document
          ^!Jump Doc_Start

          ;loop for cleaning lines
          :Loop
          ;highlight just this line
          ^!Select Eol
          ^!Find "?q=" TIHS
          ^!IfError TryAgain ELSE KillStart

          :TryAgain
          ;sometimes it is &q=
          ^!Find "&q=" TIHS
          ^!IfError KillLine ELSE KillStart

          :KillLine
          ;KillLine (not a search with a ?q=)
          ;delete highlighted line
          ^!DeleteLine
          ;repeat til done - if done go to done subroutine
          ^!If ^$GetRow$ = ^$GetLinecount$ DONE ELSE Next
          ; next line
          ^!Goto Loop

          :KillStart
          ;KillStart (get rid of ?q= and everything before it)
          ;jump to select end
          ^!Jump Select_End
          ;select to line beginning
          ^!Select Bol
          ;delete highlighted piece
          ^!Keyboard DELETE

          ;now get rid of post search terms by finding &
          ;highlight just this line
          ^!Select Eol
          ^!Find "&" TIHS
          ^!IfError SKIP_3
          ;jump to selection start (the & that was just found)
          ^!Jump Select_Start
          ^!Select Eol
          ;delete highlighted piece
          ^!Keyboard DELETE

          ;repeat til done - if done go to done subroutine
          ^!If ^$GetRow$ = ^$GetLinecount$ DONE ELSE Next

          ;advance to next line
          ^!Jump +1
          ;repeat
          ^!Goto Loop


          :DONE
          ^!Info Done
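
The clip above covers the keyword question only. For the second question in the thread, following one IP through the logs, here is a minimal Python sketch rather than a clip. It assumes each log line begins with the client IP followed by a space; the function name and the sample lines are made up for illustration.

```python
def lines_for_ip(lines, ip):
    """Return only the lines that start with `ip` followed by a space."""
    # The trailing space keeps 192.0.2.1 from also matching 192.0.2.10.
    prefix = ip + " "
    return [line for line in lines if line.startswith(prefix)]

# Made-up sample lines using documentation-only addresses.
sample = [
    "192.0.2.1 - - GET /index.html",
    "192.0.2.10 - - GET /about.html",
    "192.0.2.1 - - GET /contact.html",
]
visits = lines_for_ip(sample, "192.0.2.1")
```

If the logs separate fields with tabs or some other character, the prefix would need to change to match.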