Re: New Senator

  • severian43
    Message 1 of 10, Feb 8, 2010
      OK. I preferred wget to rsync in this case because it's easy to tell from the output of the command if a new file was downloaded. But there are other ways to figure it out, so I'll switch to rsync...

      Cheers,
      Kevin


      --- In govtrack@yahoogroups.com, Josh Tauberer <tauberer@...> wrote:
      >
      > As soon as I can fix some bug, that file is going to start to get
      > written out every day, twice a day, instead of just on Sundays. The
      > contents generally won't change, but the modification date will.
      >
      > (Which is why I generally say use rsync anyway.)
      >
      > - Josh Tauberer
      > - CivicImpulse / GovTrack.us
      >
      > http://razor.occams.info | www.govtrack.us | civicimpulse.com
      >
      > "Members of both sides are reminded not to use guests of the
      > House as props."
      >
      > On 02/08/2010 03:24 PM, severian43 wrote:
      > > Perhaps the easiest thing to do is to use something like wget, which only downloads the file when it's been changed (based on comparing the file modified timestamp and the Last-Modified header, I think).
      > >
      > > For whereabill.org, I run a daily cron bash script that does this:
      > >
      > > WGET_OUTPUT=$(wget -N -P data/ http://www.govtrack.us/data/us/people.xml 2>&1)
      > > if echo "$WGET_OUTPUT" | grep -qF 'saved'
      > > then
      > > ...do stuff...
      > > fi
      > >
      > > So the file is only downloaded and processed if it's changed.
      > >
      > > Cheers,
      > > Kevin
      > >
      > >
      > > --- In govtrack@yahoogroups.com, Jack Angelo <jangelo42@> wrote:
      > >>
      > >> the MD5 solution is a good one.
      > >>
      > >> jack
      > >>
      > >>
      > >> On Feb 8, 2010, at 8:16 AM, John Factorial wrote:
      > >>
      > >>> I think a "lastupdated" attribute on the <people> root element would be a nice step toward this end.
      > >>>
      > >>> Until that happens, Jack, when you download people.xml from govtrack you could generate an MD5 checksum of the file and save it. The next time you download people.xml, generate the checksum again; if the two match, you know the file is unchanged. I'm not sure what you mean when you say you're "uploading the file," but maybe this is a good solution for you.
      > >>>
      > >>> --- In govtrack@yahoogroups.com, "Jack" <jangelo42@> wrote:
      > >>>>
      > >>>> Is there a way to notice that the people.xml dataset has changed, and what changed? Currently I am uploading the file once a week to make sure I am current, but it would be better to be able to get a last-modified date so that I only upload when something is new, and only what is new.
      > >>>>
      > >>>
      > >>>
      > >>
      > >> Jack Angelo
      > >> jangelo42@
      > >>
      > >
      > >
      >
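
The MD5 suggestion quoted above could be sketched in shell roughly as follows. This is only an illustration, not a script anyone in the thread posted: the helper name `has_changed`, the `.md5` sidecar file stored next to the data file, and the use of GNU coreutils `md5sum` are all assumptions made for the example.

```shell
# Sketch of the checksum comparison suggested in the thread.
# Assumptions (not from the thread): the helper name has_changed, the
# ".md5" sidecar file next to the data file, and GNU coreutils md5sum.

# has_changed FILE
#   returns 0 if FILE's MD5 differs from the checksum saved by the previous
#             call, or if no checksum has been saved yet
#   returns 1 if the contents are the same as last time
#   returns 2 if FILE does not exist
# The saved checksum is rewritten whenever a change is detected.
has_changed() {
    local file="$1"
    local sumfile="$file.md5"
    local new=""
    local old=""

    [ -f "$file" ] || return 2

    # md5sum prints "<hash>  -" when reading stdin; keep only the hash.
    new=$(md5sum < "$file" | cut -d' ' -f1)

    if [ -f "$sumfile" ]; then
        old=$(cat "$sumfile")
    fi

    if [ "$new" = "$old" ]; then
        return 1    # same contents as last time
    fi

    printf '%s\n' "$new" > "$sumfile"
    return 0        # first run, or contents changed
}
```

After each download, something like `if has_changed data/people.xml; then ...do stuff...; fi` would run the processing step only when the contents differ. Because the comparison looks at contents rather than timestamps, it should be unaffected by the file being rewritten with a new modification date but the same data.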