RE: [govtrack] "FEAR-less Site Scraping", in Perl, at O'Reilly

  • Ryan Rarick
    Message 1 of 2, Jun 14, 2006

      Hmm. Interesting article, though from my perspective FEAR::API probably should not be used for this project. For example, in the second paragraph of the documentation the author states, "However, this module violates probably every single rule of any Perl coding standards. Please stop here if you don't want to see the yucky code."

      Reducing code size does seem to be a noble pursuit on the author's part, but it comes at the cost of added complexity.

      That said, this particular package does have some cool features, such as tabbed content.

      For those of us using Perl for screen scraping, I think we should stick to what is easily readable and maintainable.

      What I like to do is use WWW::Mechanize and HTML::TokeParser, and keep all the common code in a package that I then use through objects. That keeps the logical flow of the code separate from the physical flow, so I can read the script at a high level, which helps me stay focused on what the script is meant to do. Basically, I put the skeleton in the script itself and fill in the fleshy details in the package. That's just my preference, though.
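
      As a rough sketch of that skeleton-and-package split (not my actual code): the package name My::Scraper, its fetch and links methods, the sample HTML and the example.com URL below are all made up for illustration. It assumes WWW::Mechanize and HTML::TokeParser are installed from CPAN. WWW::Mechanize is only loaded when fetch is called, so the parsing part can be tried offline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper package: the "fleshy details" live here.
package My::Scraper;

use HTML::TokeParser;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Fetch a page and return its HTML. WWW::Mechanize is loaded lazily
# so the parsing code below can be used without network access.
sub fetch {
    my ($self, $url) = @_;
    require WWW::Mechanize;
    $self->{mech} ||= WWW::Mechanize->new( autocheck => 1 );
    $self->{mech}->get($url);
    return $self->{mech}->content;
}

# Return a list of [ href, link text ] pairs, one per <a href="..."> tag.
sub links {
    my ($self, $html) = @_;
    my $parser = HTML::TokeParser->new( \$html )
        or die "cannot create parser";
    my @links;
    while ( my $tag = $parser->get_tag('a') ) {
        my $href = $tag->[1]{href};
        next unless defined $href;    # skip anchors without an href
        push @links, [ $href, $parser->get_trimmed_text('/a') ];
    }
    return @links;
}

package main;

# The skeleton: reads as a high-level outline of what the script does.
my $scraper = My::Scraper->new;

# Sample HTML (made up); swap in the fetch call below for a live page.
my $html = '<p><a href="/bills/1">Bill One</a> <a href="/bills/2">Bill Two</a></p>';
# my $html = $scraper->fetch('http://example.com/');    # placeholder URL

for my $link ( $scraper->links($html) ) {
    print "$link->[1] => $link->[0]\n";
}
```

      The script part stays a few lines long, and anything site-specific (login steps, table parsing and so on) would go into more methods on the package.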

      And when I get my Internet access back at home (maybe by the end of the month; we're still in the middle of switching phone companies), I'll be able to finish the DB part of my code.


      From: Neal McBurnett <neal@...>
      Reply-To: govtrack@yahoogroups.com
      To: govtrack@yahoogroups.com
      Subject: [govtrack] "FEAR-less Site Scraping", in Perl, at O'Reilly
      Date: Sat, 10 Jun 2006 18:07:23 -0600 (MDT)

      Since scraping information is a common need in the govtrack space,
      when I saw this I thought it might be of interest:

      http://www.perl.com/pub/a/2006/06/01/fear-api.html
      by Yung-chung Lin
      June 01, 2006

      Thanks for all the wonderful work out there to increase the
      transparency of the government, etc.

      Neal McBurnett http://mcburnett.org/neal/
      Signed and/or sealed mail encouraged. GPG/PGP Keyid: 2C9EBA60
