Fwd: Public domain US legal data and code

  • Josh Tauberer
    Message 1 of 1, Oct 5, 2012
      FYI - The project below is one day going to replace the current GovTrack raw data. This is just an early heads-up. I don't plan to discontinue any of the existing data, but perhaps in six months it will be considered deprecated (=still operational but not recommended) in favor of the new data model.

      Josh


      -------- Original Message --------
      Subject: [sunlightlabs] Public domain US legal data and code
      Date: Fri, 5 Oct 2012 11:30:56 -0400
      From: Eric Mill <eric@...>
      Reply-To: sunlightlabs@...
      To: sunlightlabs@...


      Hi all,

      I've been working for the last month or two with Josh Tauberer (of GovTrack.us) and Derek Willis on a project to produce a public domain scraper and dataset from THOMAS.gov, the official source for legislative information for the US Congress. 

      It's a reasonably well documented set of Python scripts, which you can find here:

      We just hit a great milestone - it gets everything important that THOMAS has on bills, back to the year THOMAS starts (1973). We've published and documented all of this data in bulk, and I've worked it into Sunlight's pipeline, so that searches for bills in Scout use data collected directly from this effort.

      The data and code are all hosted on GitHub under a "unitedstates" organization, which is right now co-owned by me, Josh, and Derek - the intent is to have this all exist in a common space. To the extent that the code needs a license at all, I'm using a public domain "unlicense" that should at least be sufficient for the US (other suggestions welcome).

      There's other great stuff in this organization, too - Josh made an amazing donation of his legislator dataset, and converted it to YAML for easy reuse. I've worked that dataset into Sunlight's products already as well. I've also moved my legal citation extractor into this organization -- and my colleague Thom Neale has an in-progress parser for the US Code, to convert it from binary typesetting codes into JSON.

      GitHub's organization structure actually makes a very neat commons possible. I'm hoping this model proves useful, both for us and for the public.

      -- Eric

      --
      You received this message because you are subscribed to the Google Groups "sunlightlabs" group.
      To post to this group, send email to sunlightlabs@....
      To unsubscribe from this group, send email to sunlightlabs+unsubscribe@....
      For more options, visit this group at http://groups.google.com/group/sunlightlabs?hl=en.


