Re: [govtrack] Re: Where to get BioguideIDs for People in House.

  • Josh Tauberer
    Message 1 of 15, Nov 15, 2008
      Joe Carmel wrote:
      > A generalized approach for this purpose is to use the
      > http://bioguide.congress.gov website to generate a well-formed HTML file
      > containing your specified criteria and then using that HTML file which
      > will contain the Bioguide IDs as hrefs.

      Oops, I should have remembered to suggest that.

      I actually have a parser for that site. Code here:
      http://razor.occams.info/code/repo/?/govtrack/gather/us/bioguide.pl
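
For readers who only need the IDs and not the full parser, the href approach Joe describes can be sketched in a few lines of Python. This is a minimal sketch, not the Perl parser linked above. It assumes each member link carries the ID as an `index=` query parameter (for example `biodisplay.pl?index=A000001`) and that an ID is one letter followed by six digits; the names `BioguideLinkParser` and `extract_bioguide_ids` are made up for illustration.

```python
import re
from html.parser import HTMLParser

# Assumption: the ID appears in the href as "index=<letter><six digits>".
ID_IN_HREF = re.compile(r"index=([A-Za-z]\d{6})\b")


class BioguideLinkParser(HTMLParser):
    """Collects Bioguide IDs found in <a href="..."> attributes."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                match = ID_IN_HREF.search(value)
                if match:
                    bioguide_id = match.group(1).upper()
                    # Keep first-seen order and skip duplicates.
                    if bioguide_id not in self.ids:
                        self.ids.append(bioguide_id)


def extract_bioguide_ids(html_text):
    """Return the unique Bioguide IDs linked from a saved results page."""
    parser = BioguideLinkParser()
    parser.feed(html_text)
    return parser.ids
```

Save the search results page to disk, read it in as text, and pass it to `extract_bioguide_ids`. If the site's link format differs from the assumption above, only the regular expression needs to change.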

      The whole bioguide directory is here (just updated, now UTF-8):

      http://www.govtrack.us/data/us/bioguide1.csv
      CSV file with bioguide ID, name, birth year, death year

      http://www.govtrack.us/data/us/bioguide2.csv
      CSV file with bioguide ID, role (Senator, Representative, ContCong,
      Delegate (I think that's it, not sure)), party, state, and congress number
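
Since the thread started with a question about IDs for people in the House, here is a rough sketch of how the two files above could be joined on the bioguide ID. It assumes the columns come in the order described, that there is no header row, and that the role strings are spelled exactly as written in this thread; none of that has been checked against the actual files. The function names are hypothetical.

```python
import csv
from collections import defaultdict

# Assumption: role strings for House members, spelled as in this thread.
HOUSE_ROLES = {"Representative", "Delegate", "Resident Commissioner"}


def load_people(lines):
    """Map bioguide ID -> name from rows of (id, name, birth year, death year)."""
    people = {}
    for row in csv.reader(lines):
        if len(row) < 4:
            continue  # skip blank or malformed rows
        bioguide_id, name = row[0], row[1]
        people[bioguide_id] = name
    return people


def load_roles(lines):
    """Map bioguide ID -> list of (role, party, state, congress) tuples."""
    roles = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 5:
            continue
        bioguide_id, role, party, state, congress = row[:5]
        roles[bioguide_id].append((role, party, state, congress))
    return roles


def house_members(people, roles, congress):
    """Return sorted (bioguide ID, name, state) for House roles in one congress."""
    found = set()
    for bioguide_id, terms in roles.items():
        for role, _party, state, term_congress in terms:
            if role in HOUSE_ROLES and term_congress == str(congress):
                found.add((bioguide_id, people.get(bioguide_id, ""), state))
    return sorted(found)
```

Both loaders take any iterable of lines, so an open file works: `with open("bioguide1.csv", newline="", encoding="utf-8") as f: people = load_people(f)`.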

      And it's currently updating this:

      http://www.govtrack.us/data/us/bioguide3.csv
      CSV file with bioguide ID and the biographical text

      which will probably be done in four hours (it rate limits one page per
      second).
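
For scale, four hours at one page per second works out to about 14,400 page fetches. Anyone writing their own scraper can get the same throttling with a small loop like the one below. This is a sketch of the general idea, not the logic in bioguide.pl; `fetch_one` is a hypothetical callable that downloads one biography page for a given ID.

```python
import time


def fetch_all(ids, fetch_one, min_interval=1.0,
              clock=time.monotonic, sleep=time.sleep):
    """Call fetch_one(id) for each ID, starting at most one call per min_interval seconds.

    clock and sleep are injectable so the loop can be tested without waiting.
    """
    results = {}
    last_start = None
    for bioguide_id in ids:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last request began.
            remaining = min_interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results[bioguide_id] = fetch_one(bioguide_id)
    return results
```

Measuring from the start of the previous request means a slow download does not add extra delay on top of the one-second spacing.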

      An interesting limitation of that data is that reps don't have district
      numbers last I checked.

      --
      - Josh Tauberer
      - GovTrack.us

      http://razor.occams.info

      "Yields falsehood when preceded by its quotation! Yields
      falsehood when preceded by its quotation!" Achilles to
      Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)
    • Derek Willis
      Message 2 of 15, Nov 16, 2008
        On Sat, Nov 15, 2008 at 7:33 PM, Josh Tauberer <tauberer@...> wrote:
        > The whole bioguide directory is here (just updated, now UTF-8):
        >
        > http://www.govtrack.us/data/us/bioguide1.csv
        > CSV file with bioguide ID, name, birth year, death year
        >
        > http://www.govtrack.us/data/us/bioguide2.csv
        > CSV file with bioguide ID, role (Senator, Representative, ContCong,
        > Delegate (I think that's it, not sure)), party, state, and congress number

        There is one more role: Resident Commissioner (used for Puerto Rico's
        member of the House)


        > And it's currently updating this:
        >
        > http://www.govtrack.us/data/us/bioguide3.csv
        > CSV file with bioguide ID and the biographical text
        >
        > which will probably be done in four hours (it rate limits one page per
        > second).
        >
        > An interesting limitation of that data is that reps don't have district
        > numbers last I checked.

        Yeah, that's likely because district numbers can change over time with
        redistricting, or members can run for other offices, so the bioguide
        biotext plays it safe by just using the role. There are other sites
        that provide information about specific districts and more. I like
        Charles Stewart's page at MIT, but even that data has typos:

        http://web.mit.edu/17.251/www/data_page.html

        Derek
        --
        Derek Willis
        dwillis@...
        http://www.thescoop.org/docs/