Coding projects

  • Josh Tauberer
    Message 1 of 6 , May 23, 2008
      Now that GovTrack is open source (http://www.govtrack.us/source.xpd) I
      am going to try to be a little more encouraging to get others involved
      in maintaining and improving the website.

      Here are some possible things that you might be interested in working on:

      Scraping more data: committee and conference reports (good for finding
      earmarks), news articles, general committee information

      Parsing bills: finding earmarks, relating bills to the laws being amended,
      comparing bills, and tracking the evolution of bills better

      Improving site features: new ways to search legislation (e.g. by
      sponsor), visualizations of legislative language and its evolution, tagging
      bills, visualization of legislative statistics, generation of new
      statistics and graphs

      Extending the site to new areas: browsing the Constitution, U.S. Code,
      regulations, judicial documents, state-level legislation

      I'm more than happy to help you work on these things if you're interested.

      --
      - Josh Tauberer
      - GovTrack.us

      http://razor.occams.info

      "Yields falsehood when preceded by its quotation! Yields
      falsehood when preceded by its quotation!" Achilles to
      Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)
    • Leonard Linde
      Message 2 of 6 , May 24, 2008
        On earmarks: Earmarks are released in committee reports as poorly scanned embedded images. Each committee uses its own format for the scanned portion of the report. It's a tough nut, on purpose, I assume.

        Taxpayers for Common Sense (TCS) has gathered last year's earmarks in an Excel spreadsheet. I've parsed that spreadsheet and associated the (text) names with the GovTrack voter ID. I can send you a SQL dump of the database if you're interested. TCS requires non-commercial use and attribution as conditions of using it.


      • Roger Williams
        Message 3 of 6 , Jul 20, 2008
          Hello Leonard:

          I was looking at this spreadsheet. I am interested in how you got
          that data correlated with the govtrack voter id.

          Regards..RogerW
        • Leonard Linde
          Message 4 of 6 , Jul 20, 2008
            I exported it to CSV and wrote a Python (Django framework) program to match, by name, to the govtrack.us people file, which I already have in a MySQL table.

            It was an iterative process because the Excel file's formatting is not 100% consistent: some names were misspelled, some punctuation was missing, etc. IIRC, I had to edit one or two lines in the file.

            I did not treat the program as something I'd re-use, because I assumed that the Taxpayers for Common Sense file format will change, and even if it doesn't, next year's file will have a different set of formatting errors.

            I think my result is at least 99% correct. I'd be happy to export the resulting data in any format you'd like and send it to you, as long as you follow the attribution policy of TCS. I'll send you the program, too, for what it's worth.
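
For readers who want a feel for this kind of name matching, here is a minimal sketch. It is not the program described above: the `sponsor` column name, the sample people table, the IDs, and the 0.85 similarity cutoff are all assumptions made for illustration. It normalizes names (case, accents, punctuation), tries an exact match first, and then falls back to approximate matching with Python's standard `difflib` module so that small misspellings still resolve. Anything left over is reported for manual review, which mirrors the iterative clean-up described in the message.

```python
import csv
import difflib
import io
import re
import unicodedata


def normalize(name):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    cleaned = re.sub(r"[^a-z ]+", " ", ascii_name.lower())
    return " ".join(cleaned.split())


def match_names(earmark_rows, people, cutoff=0.85):
    """Map each raw sponsor name to a person id; collect names that fail to match.

    earmark_rows: iterable of dicts with a "sponsor" key (hypothetical column name).
    people: dict of person id -> full name (stand-in for the people table).
    """
    by_name = {normalize(full_name): pid for pid, full_name in people.items()}
    matched, unmatched = {}, []
    for row in earmark_rows:
        raw = row["sponsor"]
        key = normalize(raw)
        if key in by_name:
            # Exact match after normalization (handles missing punctuation).
            matched[raw] = by_name[key]
            continue
        # Approximate match to absorb small misspellings.
        close = difflib.get_close_matches(key, list(by_name), n=1, cutoff=cutoff)
        if close:
            matched[raw] = by_name[close[0]]
        else:
            unmatched.append(raw)
    return matched, unmatched


# Hypothetical sample data; the ids and names are invented for this example.
PEOPLE = {400001: "John Q. Smith", 400002: "Maria Lopez-Garcia"}
CSV_TEXT = (
    "sponsor,amount\n"
    "John Q Smith,1000\n"
    "Maria Lopez Garcia,2500\n"
    "Jon Q. Smith,300\n"
    "Unknown Person,50\n"
)

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
matched, unmatched = match_names(rows, PEOPLE)
print(matched)
print(unmatched)
```

A lower cutoff catches more misspellings but also raises the risk of pairing two different people with similar names, so the unmatched list and any approximate matches are worth checking by hand.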

          • Roger Williams
            Message 5 of 6 , Jul 24, 2008
              Hello Leonard:

              I am interested in the .csv version of the data as well as the
              program. I think I saw somewhere that Josh wants to get stuff in Perl
              [for consistency with the other parts of the GovTrack implementation],
              so I can use the program as a guide.

              TIA.

              Regards..RogerW
            • Leonard Linde
              Message 6 of 6 , Jul 24, 2008
                I just sent you a zip file with the data and program.

