Re: Initial source code release

  • Joshua Tauberer
    Message 1 of 8 , Jul 5, 2007
      --- In govtrack@yahoogroups.com, "damianmont" <photoca@...> wrote:
      > Any way you could just document what each file does?
      >
      > Here's list:

      Sure.

      > database.people.sql
      MySQL table schema and data for "people" table of all people that have
      served in Congress (name, birthday, etc.), and "people_roles" table
      for every role in Congress each person has served (role type
      (senator/representative), start/end date, party, etc.).

      > database.tables2.sql
      > database.tables.sql
      MySQL table schemas for other tables that are filled in by the
      scripts, mostly for indexing bills. The people tables are the only
      ones I edit by hand and are not automatically generated from some
      other source.

      You'll need to pipe these to mysql, or otherwise load them, for any of
      the scripts to work. (The indexing tables aren't strictly necessary if
      you disable the indexing routines one way or another, but the people
      tables are pretty critical for all of the parsing scripts.)

      > db.pl
      Utility script for opening the MySQL database.

      > general.pl
      Really old utility functions that I don't really use.

      > indexing.pl
      Subroutines that update the indexing MySQL tables based on the
      contents of a bill or vote file.

      > parse_record.pl
      Parses the Congressional Record from THOMAS.

      > parse_rollcall.pl
      Parses roll call pages from the House and Senate websites.

      > parse_status.pl
      Parses bill status pages from THOMAS.

      > personaldb.pl
      Converts a representative's name into an ID. Uses a date, role type,
      and state/district info to disambiguate when the name is ambiguous.

      > sql.pl
      Utility functions for dealing with MySQL (preparing SQL statements
      programmatically).

      > util.pl
      A ton of utility functions used throughout.
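
As a rough illustration of the kind of lookup personaldb.pl is described as doing, here is a minimal Python sketch. It is not the actual Perl code: the `Role` record, the `find_person_id` function, and the sample rows are all hypothetical, and the matching rules are assumed from the one-line description above (match on name, then narrow by date, role type, and state/district).

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass
class Role:
    """One term served by one person (hypothetical stand-in for a people_roles row)."""
    person_id: int
    last_name: str
    role_type: str            # "sen" or "rep" (assumed codes)
    state: str
    district: Optional[int]   # None for senators
    start: date
    end: date


def find_person_id(roles: Iterable[Role], last_name: str, when: date,
                   role_type: Optional[str] = None,
                   state: Optional[str] = None,
                   district: Optional[int] = None) -> Optional[int]:
    """Return the one matching person ID, or None if nobody or several people match."""
    candidates = set()
    for role in roles:
        if role.last_name.lower() != last_name.lower():
            continue
        if not (role.start <= when <= role.end):
            continue  # not serving on the given date
        if role_type is not None and role.role_type != role_type:
            continue
        if state is not None and role.state != state:
            continue
        if district is not None and role.district != district:
            continue
        candidates.add(role.person_id)
    return candidates.pop() if len(candidates) == 1 else None


# Made-up records: three different people who share a last name.
ROLES = [
    Role(1, "Doe", "rep", "NJ", 4, date(2007, 1, 4), date(2009, 1, 3)),
    Role(2, "Doe", "rep", "TX", 21, date(2007, 1, 4), date(2009, 1, 3)),
    Role(3, "Doe", "sen", "OR", None, date(2003, 1, 7), date(2009, 1, 3)),
]

when = date(2007, 7, 5)
ambiguous = find_person_id(ROLES, "Doe", when)                          # several matches
texan = find_person_id(ROLES, "doe", when, role_type="rep", state="TX")
senator = find_person_id(ROLES, "Doe", when, role_type="sen")
```

Returning None for an ambiguous name, rather than guessing, is a design choice of this sketch; the real script may handle that case differently.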

      > (p.s. EXCELLENT article on XML.com... That's how I found your site)

      Thanks!

      > Also could you maybe send us more or less the pages you scrape?
      > I'll probably get them from reading the script, but I'm a php guy, not
      > perl...but a programmer is a programmer right? I should be able to
      > figure it out.

      That's a long list. Maybe another time!

      - Josh
    • tay199
      Message 2 of 8 , Jul 6, 2007
        YEAH! Thanks Josh. I'll keep the group posted on any patches or things
        of value we find and can contribute.

        Taylor
      • Kevin Henry
        Message 3 of 8 , Jul 8, 2007
          Great, thanks Josh...

          I didn't do much scripting (proper) for http://www.whereabill.org/,
          but since Josh has started the ball rolling, I'll add in the one
          script I did write: an XSLT file that takes the XML version of Josh's
          people database
          (http://www.govtrack.us/data/us/110/repstats/people.xml) and extracts
          the people/roles relevant to a single specified Congressional session.

          The problem (for my purposes, which include sending this information
          to the browser client on each request) is that the people.xml file is
          large (6.3MB), and includes lots of dead people. :) So I use this
          script to get only the information for a particular Congress.

          Kevin


          bioseparate.xml:

          <?xml version="1.0" encoding="UTF-8"?>
          <xsl:stylesheet version='1.0'
              xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
              xmlns:exsl="http://exslt.org/common">
            <xsl:output method="xml" version="1.0" encoding="ISO-8859-1"
                indent="yes"/>

            <!-- Number of the Congress to keep. -->
            <xsl:param name="congress">110</xsl:param>

            <xsl:template match="people">
              <xsl:copy>
                <!-- Copy each person with only the roles whose start/end years
                     bracket either year of the Congress. Congress N covers the
                     years 2N+1787 and 2N+1788 (110 gives 2007 and 2008). -->
                <xsl:variable name="striprole">
                  <xsl:for-each select="person">
                    <xsl:copy>
                      <xsl:copy-of select="@*"/>
                      <xsl:copy-of select="role[(((number($congress)*2 + 1787) >=
                          number(substring-before(@startdate,'-'))) and ((number($congress)*2 +
                          1787) &lt;= number(substring-before(@enddate,'-')))) or
                          (((number($congress)*2 + 1788) >=
                          number(substring-before(@startdate,'-'))) and ((number($congress)*2 +
                          1788) &lt;= number(substring-before(@enddate,'-'))))]"/>
                    </xsl:copy>
                  </xsl:for-each>
                </xsl:variable>
                <!-- Output only the people who still have at least one role. -->
                <xsl:for-each select="exsl:node-set($striprole)/person[role]">
                  <xsl:copy-of select="."/>
                </xsl:for-each>
              </xsl:copy>
            </xsl:template>

          </xsl:stylesheet>
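
For readers who do not work in XSLT, the same filtering idea can be sketched in Python. This is an approximation under stated assumptions, not Kevin's script: the function names and the sample document are made up for illustration, and only the `people`/`person`/`role` element names and the `startdate`/`enddate` attributes are taken from the stylesheet above. The other attributes in the sample (`id`, `name`, `type`) are placeholders.

```python
import xml.etree.ElementTree as ET


def congress_years(congress):
    """Return the two calendar years a Congress covers (110 gives 2007 and 2008)."""
    first = 2 * congress + 1787
    return first, first + 1


def _year(date_text):
    """Extract the year from a 'YYYY-MM-DD' string, or None if it is missing or malformed."""
    try:
        return int(date_text.split("-")[0])
    except (AttributeError, ValueError):
        return None


def role_is_active(role, year):
    """True if the role's start and end years bracket the given year."""
    start = _year(role.get("startdate"))
    end = _year(role.get("enddate"))
    return start is not None and end is not None and start <= year <= end


def filter_people(people, congress):
    """Build a new <people> element keeping only people with a role in `congress`."""
    years = congress_years(congress)
    result = ET.Element(people.tag)
    for person in people.findall("person"):
        roles = [r for r in person.findall("role")
                 if any(role_is_active(r, y) for y in years)]
        if roles:  # drop people with no role in this Congress
            kept = ET.SubElement(result, "person", dict(person.attrib))
            kept.extend(roles)
    return result


# Made-up sample document for illustration only.
SAMPLE = """<people>
  <person id="1" name="A">
    <role type="sen" startdate="2003-01-07" enddate="2008-12-31"/>
    <role type="rep" startdate="1995-01-04" enddate="2002-12-31"/>
  </person>
  <person id="2" name="B">
    <role type="rep" startdate="1951-01-03" enddate="1953-01-03"/>
  </person>
</people>"""

filtered = filter_people(ET.fromstring(SAMPLE), 110)
kept_ids = [p.get("id") for p in filtered.findall("person")]
```

Like the stylesheet, the sketch compares years only, so a role that ends in January of the Congress's first year still counts as active.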
        • damianmont
          Message 4 of 8 , Jul 10, 2007
            Kevin,

            Love that www.WhereaBill.org site.
            You use the information from josh's xml files?

            Love the mashup, very well done.

            --- In govtrack@yahoogroups.com, "Kevin Henry" <k@...> wrote:
            >
            > Great, thanks Josh...
            >
            > I didn't do much scripting (proper) for http://www.whereabill.org ,
            > but since Josh has started the ball rolling, I'll add in the one
            > script I did write: an XSLT file that takes the XML version of Josh's
            > people database
            > (http://www.govtrack.us/data/us/110/repstats/people.xml) and extracts
            > the people/roles relevant to a single specified Congressional session.
          • Peggy Garvin
            Message 5 of 8 , Jul 10, 2007
              I'd like to know, too. I am writing a brief article (right now, due tomorrow) about some of the new legislative info projects and want to mention whereabill as well as a sample of sites that have used Govtrack's file.

              Thanks,
              Peggy Garvin
              peggy -at- garvinconsulting.com


              damianmont <photoca@...> wrote:
              Kevin,

              Love that www.WhereaBill.org site.
              You use the information from josh's xml files?

              Love the mashup, very well done.

              --- In govtrack@yahoogroups.com, "Kevin Henry" <k@...> wrote:
              >
              > Great, thanks Josh...
              >
              > I didn't do much scripting (proper) for http://www.whereabill.org ,
              > but since Josh has started the ball rolling, I'll add in the one
              > script I did write: an XSLT file that takes the XML version of Josh's
              > people database
              > (http://www.govtrack.us/data/us/110/repstats/people.xml) and extracts
              > the people/roles relevant to a single specified Congressional session.


            • Kevin Henry
              Message 6 of 8 , Jul 11, 2007
                Peggy and Damian,

                Thanks, glad you like the site.

                I'm getting all the data from govtrack. Specifically, I'm using the
                following files:

                - the bill status data (www.govtrack.us/data/us/*/bills/*.xml)
                - the roll vote data (/data/us/*/rolls/*.xml)
                - the people database (/data/us/110/repstats/people.xml)
                - the popularity listing (/data/us/bills.technorati.xml)
                - the search service (/congress/billsearch_api.xpd)

                I keep a copy of all the files on my server, and do a daily rsync (as
                Josh describes here: http://www.govtrack.us/source.xpd) to stay current.

                Basically, when the server gets a request for a certain bill, it
                retrieves the status data and goes through the action items, parsing
                them into the steps that will be represented in the "driving
                directions". It also retrieves any relevant roll vote data, and then
                sends that information (along with the summary, the titles, the list
                of sponsors, and the biographical data for that session of Congress)
                back to the client, which renders everything (with the help of the
                Google Maps API).

                So it's really a (sort-of) UI sitting on top of govtrack's (sort-of) API.

                Let me know if you need any more information...


                Regards,
                Kevin
                http://www.whereabill.org/