
Re: [NTB] NTP: Find and Extract?

  • Don - htmlfixit.com
    Message 1 of 3 , Feb 4, 2005
      Don - htmlfixit.com wrote:
      > BH wrote:
      >
      >>
      >>Is it possible for NTP to search for a specific string of text in
      >>a .txt file and extract only those lines or only the text after the
      >>specific search string? That sounds confusing, so let me explain in
      >>more detail.
      >>
      >>I am trying to figure out a way to extract search engine keywords
      >>from my website's raw data logs, ie.
      >>
      As you know, I am "blogging" these clips now to make them easier for me
      to find my own work, and in this case to provide additional details
      without bogging down the list. Here is the address of the posting:
      http://htmlfixit.com/blog/index.php?p=313
      In the post I give some sample data, the results obtained from it, and
      the text statistics on it. I could do lots more. I also noticed that
      some search engines don't use q=, for example Dogpile, so this clip
      will miss their queries. And some non-search sites might use q= in a
      query string, and you would still get those.

      Anyway, here is the clip:

      ;effort by don at htmlfixit.com
      ;02/04/05

      ; to take query terms from lines of stats
      ;one long example line
      ;http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q="We are what we know"
      ;becomes
      ;"We are what we know"

      ;turn off wordwrap
      ^!SetWordWrap Off
      ;^!SetDebug On

      ;go to start of document
      ^!Jump Doc_Start

      ;loop for cleaning lines
      :Loop
      ;highlight just this line
      ^!Select Eol
      ^!Find "?q=" TIHS
      ^!IfError TryAgain ELSE KillStart

      :TryAgain
      ;sometimes it is &q=
      ^!Find "&q=" TIHS
      ^!IfError KillLine ELSE KillStart

      :KillLine
      ;KillLine (not a search with a ?q=)
      ;delete highlighted line
      ^!DeleteLine
      ;repeat til done - if done go to done subroutine
      ^!If ^$GetRow$ = ^$GetLinecount$ DONE ELSE Next
      ; next line
      ^!Goto Loop

      :KillStart
      ;KillStart (get rid of ?q= and everything before it)
      ;jump to select end
      ^!Jump Select_End
      ;select to line beginning
      ^!Select Bol
      ;delete highlighted piece
      ^!Keyboard DELETE

      ;now get rid of post search terms by finding &
      ;highlight just this line
      ^!Select Eol
      ^!Find "&" TIHS
      ^!IfError SKIP_3
      ;jump to start of selection (the & that was just found)
      ^!Jump Select_Start
      ^!Select Eol
      ;delete highlighted piece
      ^!Keyboard DELETE

      ;repeat til done - if done go to done subroutine
      ^!If ^$GetRow$ = ^$GetLinecount$ DONE ELSE Next

      ;advance to next line
      ^!Jump +1
      ;repeat
      ^!Goto Loop


      :DONE
      ^!Info Done
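
For anyone who wants to cross-check the clip's results outside NoteTab, the same logic can be sketched in Python. This is only a rough equivalent, not part of the clip: the function name extract_queries and the second and third sample lines are made up for illustration. It keeps whatever follows the first ?q= or &q= on a line, cuts that off at the next &, and drops lines that have no q= at all.

```python
import re

# Matches "?q=" or "&q=" (case-insensitive, like the clip's I option) and
# captures everything up to the next "&" or the end of the line.
QUERY_RE = re.compile(r"[?&]q=([^&\r\n]*)", re.IGNORECASE)


def extract_queries(lines):
    """Return the q= value from each line that has one; skip other lines."""
    results = []
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            results.append(match.group(1))
    return results


# The first line is the example from the clip's comments; the other two
# are hypothetical log lines added for illustration.
sample = [
    'http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q="We are what we know"',
    "http://example.com/page.html",
    "http://www.google.com/search?q=notetab+clips&hl=en",
]

for query in extract_queries(sample):
    print(query)
```

As with the clip, the values are left URL-encoded (for example + instead of spaces), and any non-search URL that happens to carry a q= parameter is still picked up.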