investigating how to integrate earmark data into GovTrack

  • Roger Williams
    Message 1 of 3 , Jul 20, 2008
      Hello GovTrack group:

      On the GovTrack Projects wiki page
      (http://wiki.govtrack.us/index.php/Projects), the subject project is
      listed.

      Based on your recommendation, I looked at the Taxpayers for Common
      Sense website (http://www.taxpayer.net/index.php). Their Earmarks
      section has some downloadable data that they have culled from [I am
      not sure where; is it the Congressional Record?].

      The files I have downloaded so far are all Excel spreadsheets. There
      are a few anomalies here:

      - they list earmarks in individual bills but they don't show the S or
      HR number of the bill. I guess we can extract this from the
      description column
      - some are in slightly different formats, i.e. totals at the top or
      totals at the bottom
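
      Both anomalies could probably be handled while reading the rows. Here
      is a rough sketch, not tested against the real TCS files: it assumes
      the description text mentions the bill as something like "H.R. 3043"
      or "S. 1710", and that a totals row has a cell starting with "Total".
      The function names and the sample strings are made up for
      illustration.

```python
import re

# Assumed pattern: "H.R. 3043", "HR 3043", "S. 1710" or "S 1710".
# The real spreadsheets may word this differently.
BILL_RE = re.compile(r"\b(H\.?\s?R\.?|S\.?)\s*(\d+)\b", re.IGNORECASE)


def extract_bill(description):
    """Return (chamber, number) found in a description, or None."""
    match = BILL_RE.search(description)
    if match is None:
        return None
    chamber = "h" if match.group(1).upper().startswith("H") else "s"
    return chamber, int(match.group(2))


def is_totals_row(row):
    """Guess whether a row is a totals line (at the top or the bottom)."""
    return any(
        isinstance(cell, str) and cell.strip().lower().startswith("total")
        for cell in row
    )


print(extract_bill("Labor-HHS-Education, H.R. 3043"))
print(is_totals_row(["Total", None, 1234.0]))
```

      A pattern this loose can misfire (for example on "Children's 100"),
      so any rows where nothing or something odd is found would still need
      a look by hand.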

      Then there is a page with earmark disclosures, where some members
      have supplied the info, and sometimes you must go to the member's
      website to get a PDF file with all of the information [for that
      member only].

      Finally I found the page which has all of the FY08 earmarks. It is a
      huge spreadsheet, but I am not sure how we could integrate this into
      GovTrack.

      My questions are:

      - Is there a separate bill number for when it comes out
      of "conference"? I think it can have different provisions/earmarks
      from either the House or Senate versions at that point.
      - I can write a program to read an Excel spreadsheet. Should it emit
      SQL INSERT statements, or would it be better to connect directly to
      the underlying MySQL DB?
      - I think FY08 is almost [or already] over, but maybe it is in the
      110th Congress. Does it make sense to do any work to integrate this
      data into GovTrack?

      TIA.

      Regards..RogerW
      --------------------------------------------------------
      References:

      spreadsheets for FY09 Senate Labor-HHS-Education
      http://www.taxpayer.net/search_by_category.php?
      action=view&proj_id=1106&category=Earmarks&type=Project

      spreadsheets for FY09 Senate Commerce-Justice-Science
      http://www.taxpayer.net/search_by_category.php?
      action=view&proj_id=1085&category=Earmarks&type=Project

      search results page for all earmarks on the TCS site
      http://www.taxpayer.net/search_by_category.php?
      action=search_by_category&category=Earmarks

      spreadsheet for all FY08 earmarks [bigkahuna.xls]
      http://www.taxpayer.net/search_by_category.php?
      action=view&proj_id=998&category=Earmarks&type=Project
    • Josh Tauberer
      Message 2 of 3 , Jul 21, 2008
        Roger Williams wrote:
        > On the GovTrack Projects wiki page
        > (http://wiki.govtrack.us/index.php/Projects), the subject project is
        > listed.

        Yes, that's a great place to start. Also take a look at
        http://earmarkwatch.org/. They may have done all of the hard work
        already of making a nice database of the data. I'm not sure who (i.e.
        what human being) to contact about that site, but I'll put you in touch
        with someone who will know.

        > Based on your recommendation, I looked at the Taxpayers for Common
        > Sense website (http://www.taxpayer.net/index.php) and in their
        > Earmarks section they do have some downloadable data that they have
        > culled from [I am not sure where..is it the congressional record?].

        Probably a combination of the text of the bills as well as the committee
        reports that go along with the bills.

        > - they list earmarks in individual bills but they don't show the S or
        > HR number of the bill..I guess we can query this from the description
        > column

        Weird.

        > - some are in slightly different formats, i.e. totals at the top or
        > totals at the bottom
        ...
        > - I can make a program to read an XL spreadsheet. Should it emit "SQL
        > INSERT" statements or would it be better to connect directly to the
        > underlying MySQL DB.

        The goal should be to make a nice normalized, simple, structured,
        flat-file database, meaning:

        Normalized: Identifiers for people and bills are clear. People should
        be identified with the numbering that I use on GovTrack, and bills
        with the way I identify bills (e.g. h110-2358). Dollar amounts should
        be in a consistent format, etc.

        Simple, Flat-File, Structured: Either CSV or XML format. Something one
        can open up in a text editor and deal with directly.
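
        As a minimal sketch of what that could look like in CSV form: the
        bill identifier below follows the h110-2358 example, and the "s"
        prefix for Senate bills, the column names, the input row layout, and
        the person ID 400000 are all assumptions made up for illustration,
        not GovTrack's actual schema.

```python
import csv
import io


def bill_id(chamber, congress, number):
    """Build an identifier shaped like the h110-2358 example."""
    return f"{chamber}{congress}-{number}"


def parse_dollars(text):
    """Turn '$1,500,000' or 250000.0 into a whole number of dollars."""
    cleaned = str(text).replace("$", "").replace(",", "").strip()
    return int(round(float(cleaned)))


def write_csv(rows, out):
    """Write normalized earmark rows; the column names are hypothetical."""
    writer = csv.writer(out)
    writer.writerow(["bill_id", "person_id", "amount_usd", "description"])
    for r in rows:
        writer.writerow([
            bill_id(r["chamber"], r["congress"], r["number"]),
            r["person_id"],
            parse_dollars(r["amount"]),
            r["description"],
        ])


sample = [{
    "chamber": "h", "congress": 110, "number": 2358,
    "person_id": 400000,  # placeholder, not a real GovTrack ID
    "amount": "$1,500,000", "description": "Example project",
}]
buffer = io.StringIO()
write_csv(sample, buffer)
print(buffer.getvalue())
```

        Keeping the amount as a plain integer and the identifiers in one
        fixed shape means the file stays readable in a text editor and easy
        to load elsewhere later.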

        > - Is there a separate bill number for when it comes out
        > of "conference"? I think it can have different provisions/earmarks
        > from either the House or Senate versions at that point.

        There may be multiple bills under consideration simultaneously, all
        but one of which are culled by the end of the process.

        > - I think FY08 is almost [or already] over, but maybe it is in the
        > 110th congress. Does it make sense to do any work to integrate this
        > data into GovTrack?

        It's a good first step.

        --
        - Josh Tauberer
        - GovTrack.us

        http://razor.occams.info

        "Yields falsehood when preceded by its quotation! Yields
        falsehood when preceded by its quotation!" Achilles to
        Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)
      • Roger Williams
        Message 3 of 3 , Jul 24, 2008
          --- In govtrack@yahoogroups.com, Josh Tauberer <tauberer@...> wrote:
          > <snipped/>
          Hello UG:

          I looked at the GovTrack screens to figure out where the earmark data
          should be displayed.

          Here are the simple ideas I can understand/work on so far:

          - on the bills page, list earmarks "attached" to each bill
          - on the legislators page, list earmarks "sponsored" by each member
          - links to raw data on TCS and/or earmarkwatch.org
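
          For the first two ideas, the lookups could be as simple as
          grouping the earmark records twice. This is only a sketch: the
          field names (bill_id, person_id, amount_usd) and the sample
          values are hypothetical, not anything GovTrack defines.

```python
from collections import defaultdict


def index_earmarks(earmarks):
    """Group earmark records by bill and by sponsoring member."""
    by_bill = defaultdict(list)
    by_member = defaultdict(list)
    for e in earmarks:
        by_bill[e["bill_id"]].append(e)
        by_member[e["person_id"]].append(e)
    return by_bill, by_member


def total_for(earmarks):
    """Sum the dollar amounts of a list of earmark records."""
    return sum(e["amount_usd"] for e in earmarks)


# Made-up records for illustration only.
records = [
    {"bill_id": "h110-2358", "person_id": 400000, "amount_usd": 100},
    {"bill_id": "h110-2358", "person_id": 400001, "amount_usd": 250},
    {"bill_id": "s110-1710", "person_id": 400000, "amount_usd": 50},
]
by_bill, by_member = index_earmarks(records)
print(total_for(by_bill["h110-2358"]))
print(total_for(by_member[400000]))
```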

          If anyone has any other ideas, please post them here.

          Regards..RogerW