Re: [govtrack] Bioguide data
At the House Legislative Data conference in February, the lady who invented the Bioguide was there, and after the conference she went and fixed a couple of duplicate/broken bioguides, which might account for this.
-- Eric
On Fri, May 11, 2012 at 12:08 PM, Josh Tauberer <tauberer@...> wrote:
I re-ran my bioguide scraper and updated those files:
However, something changed in the character encoding, so it looks like
non-ASCII characters are now garbled or in some encoding other than UTF-8.
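One way to check whether a scraped page is still valid UTF-8 is to attempt a strict decode and fall back to a single-byte encoding when that fails. The sketch below is only an illustration: the function name `decode_scraped` and the choice of Latin-1 as the fallback are assumptions, not a description of what the bioguide site actually serves or of how the scraper works.

```python
from typing import Tuple


def decode_scraped(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding_used) for a scraped byte string.

    Tries strict UTF-8 first. If the bytes are not valid UTF-8, falls back
    to Latin-1, which maps every byte to a character and so never fails.
    The Latin-1 fallback is an assumption made for illustration only.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    # Hypothetical sample: the same name encoded two different ways.
    for sample in ("José".encode("utf-8"), "José".encode("latin-1")):
        text, used = decode_scraped(sample)
        print(used, text)
```

Logging which encoding was used per record makes it easier to see whether the whole site changed encoding or only some entries did.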
Looking at a diff of the changes, there were some corrections in the data and
some people were removed... I am hoping the removed identifiers were
dupes or name changes, but I'd have to look more closely to know for sure.
Deletions obviously pose an interesting problem if anyone is using
bioguide IDs as primary keys!
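For anyone who does key on bioguide IDs, a quick set comparison between the old and new dumps shows which identifiers disappeared and which are new. This is a minimal sketch: the function name `diff_ids` and the sample IDs are made up for illustration, and how the IDs are loaded from the actual files is left out.

```python
from typing import Iterable, List, Tuple


def diff_ids(old_ids: Iterable[str], new_ids: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (removed, added) identifiers between two dumps, each sorted."""
    old, new = set(old_ids), set(new_ids)
    removed = sorted(old - new)  # present before, missing now
    added = sorted(new - old)    # missing before, present now
    return removed, added


if __name__ == "__main__":
    # Hypothetical IDs, not real bioguide entries.
    old_dump = ["X000001", "X000002", "X000003"]
    new_dump = ["X000001", "X000003", "X000004"]
    removed, added = diff_ids(old_dump, new_dump)
    print("removed:", removed)
    print("added:", added)
```

Any ID in the removed list that is still referenced as a primary key elsewhere would need to be remapped or checked by hand.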
- Josh Tauberer (@JoshData)
On 05/04/2012 06:30 PM, Francis wrote:
> I'm looking for machine-readable Bioguide data, in particular the short biographical paragraphs for currently-serving congressmen.
> I know about bioguide html: http://bioguide.congress.gov/
> I know that a bioguide xml dtd exists: http://xml.house.gov/#bioguide
> I know that at least four bioguide xml documents exist: http://xml.house.gov/bioguide/BioguideXML.zip
> I know that the biographical text on the bioguide website corresponds exactly to the text in the four bioguide xml documents.
> I know that in January of 2009, Josh produced a dump of bioguide data scraped from the html site into CSVs:
> It was also scraped by Bill Farrell, and the data was at a now-defunct pythia.progressivenation.net:
> I strongly suspect that the bioguide HTML is generated directly from the bioguide XML, or at least from some source common to both.
> However, *where is the bioguide XML*?
--Developer | sunlightfoundation.com