Re: [govtrack] Local/State Gov

  • Joshua Tauberer
    Message 1 of 10 , Jan 9, 2005
      sc0ttbeardsley wrote:
      > One of the biggest problems I faced this most recent election season
      > was determining who was lying and who was lying more. I feel there is
      > a need in the US for a community-driven repository of public domain
      > material

      Hi, Scott. Thanks for joining the list and sharing your thoughts. When
      I first started working on GovTrack my intention was only to create a
      service for people interested in politics, but recently I've started to
      see how much more there is to be gained by simply amassing and sharing data.

      The past month I've been working on and thinking about how an open
      standard for GovTrack-type information might look. In the next week
      I'll be posting more about that. Till then, here are some responses to
      what you wrote...

      > I'm interested in gathering this type of data for local and state
      > governments also.

      Two people have contacted me about working on collecting data for their
      states (New Hampshire and Tennessee; I'm not sure if those people are on
      this list). It would be great to add California to the list.

      > However, I see four likely problems:
      >
      > 1) Information Gathering.
      > Although my county and state governments have the foresight to publish
      > this information on their websites, I expect that people in many other
      > states and counties do not have this luxury.

      It's a problem, but with the small number of people actively working on
      these things right now (i.e. me), even if the information were available
      there probably wouldn't be anyone with the time/interest to get it useable.

      > 3) Districts.
      > What makes up a US congressional district? It's determined by congress
      > right?

      It's actually decided by the states, iirc, but the geographic boundaries
      are provided by the National Atlas (http://nationalatlas.gov). Districts
      for state-level politics are different. I don't know anything about
      that, though.

      > Any suggestions/comments?

      If you could head up getting Calif. political data organized, that would
      be fantastic. We should talk more about the details of that so you can
      avoid some of the mistakes I've made, but the general approach I suggest is:

        - Get a list of the politicians involved and assign them all ID numbers.
        - Start fetching legislative information in whatever way possible.
          GovTrack screen-scrapes various websites with a bunch of Perl scripts
          and tons of regexes.
        - Get the data into a good machine-usable format
          (e.g., see http://www.govtrack.us/data/us/109/bills/sr2.xml).
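
A minimal sketch of the three steps above, written in Python rather than the Perl that GovTrack uses. The sample HTML, the regular expression, and the XML element names are all assumptions made up for illustration; they are not GovTrack's actual pages or schema.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical scraped page fragment; real legislative sites will differ.
SAMPLE_HTML = """
<tr><td>S.Res. 2</td><td>Sen. Jane Doe</td><td>2005-01-04</td></tr>
<tr><td>S.Res. 3</td><td>Sen. John Roe</td><td>2005-01-06</td></tr>
"""

# One table row: bill number, sponsor name, ISO date (assumed layout).
ROW_RE = re.compile(
    r"<tr><td>(?P<bill>[^<]+)</td><td>(?P<sponsor>[^<]+)</td>"
    r"<td>(?P<date>\d{4}-\d{2}-\d{2})</td></tr>"
)


def assign_ids(names, start=1):
    """Step 1: give every politician a numeric ID (sorted for repeatability)."""
    return {name: start + i for i, name in enumerate(sorted(set(names)))}


def scrape_rows(html):
    """Step 2: pull fields out of the page with a regular expression."""
    return [m.groupdict() for m in ROW_RE.finditer(html)]


def to_xml(rows, ids):
    """Step 3: emit a machine-usable XML document."""
    root = ET.Element("bills")
    for row in rows:
        bill = ET.SubElement(root, "bill", number=row["bill"], introduced=row["date"])
        ET.SubElement(bill, "sponsor", id=str(ids[row["sponsor"]]))
    return ET.tostring(root, encoding="unicode")


rows = scrape_rows(SAMPLE_HTML)
ids = assign_ids(r["sponsor"] for r in rows)
xml_text = to_xml(rows, ids)
print(xml_text)
```

Regex scraping of this kind is brittle: any change to the page layout breaks the pattern, which is one reason to convert to a stable format as early as possible.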

      Once there is state-level data available it will be much easier (and
      less premature) to talk about creating a unified data format for all of
      the states. But in the meanwhile, I'll be working (and looking for
      feedback) on a format suitable for sharing federal-level data.

      > Thanks for starting a revolution.

      Thanks for joining it.

      --
      - Joshua Tauberer

      http://taubz.for.net

      ** Nothing Unreal Exists **
    • Aaron Huslage
      Message 2 of 10 , Jan 9, 2005
        I think unification of all of the data is possible, but you're right
        that it's going to take actually getting it in its raw form and then
        normalizing it.

        I've been working on getting the data for Oregon, however, abstracts
        are the only things available online and there are no typed
        transcripts at all.

        The transcription is via audio cassette (!), so I'm looking for people
        who know a lot about voice recognition systems to help me out.




        --
        I have decided to move from the planet. I'm sorry but I simply cannot
        remain on a world where Paris Hilton is allowed to publish "memoirs".
        - Alton Brown
      • Joshua Tauberer
        Message 3 of 10 , Jan 9, 2005
          Hi, Aaron.

          Aaron Huslage wrote:
          > I've been working on getting the data for Oregon, however, abstracts
          > are the only things available online and there are no typed
          > transcripts at all.

          Well, abstracts are a start. There is value in anything you can get
          together, so definitely keep working on it. The more states that have
          *something* the easier it is to show other states how important and
          useful it is to get everything open and online.

          > The transcription is via audio cassette (!), so I'm looking for people
          > who know a lot about voice recognition systems to help me out.

          I don't think you'd have much luck with that, especially if the audio
          isn't really really good.

          Keep us all updated with your progress on Oregon's politics.

          --
          - Joshua Tauberer

          http://taubz.for.net

          ** Nothing Unreal Exists **
        • Scott Beardsley
          Message 4 of 10 , Jan 11, 2005
            > If you could head up getting Calif. political data
            > organized, that would
            > be fantastic. We should talk more about the details
            > of that so you can
            > avoid some of the mistakes I've made, but the
            > general approach I suggest is:

            I'm in the process of gathering California State
            Assembly and Senate info now. I found a brutally slow
            anonymous FTP site to get almost everything I need
            (ftp://leginfo.public.ca.gov/pub). Most of the data is
            available in HTML format (complete with
            <strike></strike> tags for removing content and
            <em></em> for adding content to existing statutes). I'm
            using Perl and HTML::Parser to do most of the dirty
            work. I've been able to find text data back to the
            93-94 session (although the older data doesn't use
            strike and em tags). BTW - using strike and em to
            denote changes works great visually; maybe that might
            work well for govtrack.us' RSS feeds too.
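
A rough sketch of how the strike/em markup described above could be separated into removed and added text. It uses Python's standard `html.parser` instead of Perl's HTML::Parser, and the bill fragment is invented for illustration; it is not taken from the California FTP site.

```python
from html.parser import HTMLParser


class AmendmentParser(HTMLParser):
    """Collect text inside <strike> (removed) and <em> (added) tags."""

    def __init__(self):
        super().__init__()
        self.removed = []
        self.added = []
        self._stack = []  # currently open strike/em tags, innermost last

    def handle_starttag(self, tag, attrs):
        if tag in ("strike", "em"):
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return  # unchanged text outside any strike/em tag
        target = self.removed if self._stack[-1] == "strike" else self.added
        target.append(text)


# Hypothetical amended sentence.
SAMPLE = (
    "The fee shall be <strike>ten dollars</strike> "
    "<em>fifteen dollars</em> per year."
)

parser = AmendmentParser()
parser.feed(SAMPLE)
parser.close()
print("removed:", parser.removed)
print("added:  ", parser.added)
```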

            > Get a list of the politicians involved and assign
            > them all ID numbers.

            I've thought a little about this lately. I think we
            need to be careful here. If we want to eventually
            merge federal and state (and local) data we have to
            prevent duplicate IDs for real people. Eureka! We can
            just use each politician's SSN! That'd be an excellent
            unique ID. haha j/k.

            I've seen your people.xml and it looks like your IDs
            range from 300000-300159 and 400000-400661. How did
            you pick those? How should I pick mine in such a way
            that they don't overlap with yours and politicians
            from other jurisdictions? Doing this right the first
            time will help avoid potential problems (say, when we
            join databases) in the future.

            > a format suitable for sharing federal-level data.

            SOAP?

            Also maybe people and roles should be in separate
            files?

            Maybe make people.xml read:
            <people>
            <person id="299997">
            <firstname>Foo</firstname>
            <surname>Bar</surname>
            <party>Republicrat</party>
            <address>
            <street>123 Main st</street>
            ...
            </address>
            <address>
            <street>321 Main Ave</street>
            ...
            </address>
            </person>
            ...
            </people>

            Then have a roles.xml:
            <roles>
            <role>
            <level>US</level>
            <branch>Legislature</branch>
            <!-- Judicial and Executive someday? -->
            <district type="congressional">5</district>
            <person id="299998">
            <session>109th</session>
            <started how="elected">2004-01-01</started>
            <ended why="RIP Matsui">2005-01-02</ended>
            </person>
            <status>vacant</status>
            <person id="299999">
            ...
            </person>
            </role>
            ...
            </roles>

            One person entry for every person. One role entry for
            every position in government (along with the list of
            people who have held that position).
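
A small sketch of how the two proposed files might be read and joined on the person ID. The sample documents below fill in the elided parts of the proposal with made-up values and drop the second address and the `<status>` element, so treat the exact shape as an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down versions of the proposed people.xml and roles.xml.
PEOPLE_XML = """
<people>
  <person id="299997">
    <firstname>Foo</firstname>
    <surname>Bar</surname>
    <party>Republicrat</party>
  </person>
</people>
""".strip()

ROLES_XML = """
<roles>
  <role>
    <level>US</level>
    <branch>Legislature</branch>
    <district type="congressional">5</district>
    <person id="299997">
      <session>109th</session>
      <started how="elected">2004-01-01</started>
    </person>
  </role>
</roles>
""".strip()


def load_people(xml_text):
    """Map person id -> full name."""
    people = {}
    for person in ET.fromstring(xml_text).iter("person"):
        name = f"{person.findtext('firstname')} {person.findtext('surname')}"
        people[person.get("id")] = name
    return people


def holders(roles_xml, people):
    """Yield (name, level, district, session) for each person under each role."""
    for role in ET.fromstring(roles_xml).iter("role"):
        level = role.findtext("level")
        district = role.findtext("district")
        for person in role.iter("person"):
            name = people.get(person.get("id"), "unknown")
            yield name, level, district, person.findtext("session")


people = load_people(PEOPLE_XML)
result = list(holders(ROLES_XML, people))
print(result)
```

Keeping people and roles apart means a person's name and address are stored once, however many offices that person has held.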

            Thoughts/Advice?

            Scott




          • Joshua Tauberer
            Message 5 of 10 , Jan 11, 2005
              Scott Beardsley wrote:
              > I'm using Perl and HTML::Parser to do most of the dirty
              > work.

              Okay, good.

              > BTW - using strike and em to
              > denote changes works great visually; maybe that might
              > work well for govtrack.us' RSS feeds too.

              I'm not tracking changes at that level of detail now. It's a little bit
              beyond the scope of what I think people would generally find useful.

              > If we want to eventually
              > merge federal and state (and local) data we have to
              > prevent duplicate IDs for real people.

              We're probably going to have to go through a few attempts at assigning
              common IDs before we get a good system, so I wouldn't worry about it for
              now. We can each map our own ID systems to a common naming system later.
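
One way to picture that later mapping is a crosswalk table keyed by (source, local ID). This is only a sketch: the source names, IDs, and common keys below are invented, and nothing here reflects an agreed naming system.

```python
# Each project keeps its own IDs; the crosswalk maps (source, local_id)
# to a shared key. All entries are hypothetical.
CROSSWALK = {
    ("federal", "400001"): "person/jane-doe",
    ("california", "17"): "person/jane-doe",
    ("california", "18"): "person/john-roe",
}


def to_common(source, local_id):
    """Return the shared key for a local ID, or None if it is unmapped."""
    return CROSSWALK.get((source, local_id))


def same_person(a, b):
    """True when two (source, local_id) pairs map to the same shared key."""
    common_a, common_b = to_common(*a), to_common(*b)
    return common_a is not None and common_a == common_b


print(same_person(("federal", "400001"), ("california", "17")))
print(same_person(("federal", "400001"), ("california", "18")))
```

Because local IDs are always qualified by their source, overlapping numeric ranges between projects do no harm.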

              > I've seen your people.xml and it looks like your IDs
              > range from 300000-300159 and 400000-400661. How did
              > you pick those?

              I actually just picked up the IDs that www.opengov.us (now defunct) was
              using, two summers ago. There's no rhyme or reason to the ID assignment
              anymore, though.

              > > a format suitable for sharing federal-level data.
              >
              > SOAP?

              RDF would be more appropriate, and this is what I'm looking into now.
              See http://w3.org/TR/rdf-primer/ (I haven't read the whole thing myself.)

              Because I want to go with RDF, it would be most natural to identify
              people with URIs, e.g. I could be:
              urn://taubz.for.net/me
              And Rep. Mike Rogers of Alabama could be:
              urn:govshare.info/data/people/us/congress/2003/rogers

              The actual URI itself doesn't matter, so long as we can agree on a
              system of creating them. A URI is slightly more cumbersome than a
              numeric ID, but it's more transparent. You have a good idea who a URI
              refers to just by looking at it.
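
A sketch of one possible convention for minting such URIs from a few fields. The base address and the jurisdiction/body/year/surname ordering are assumptions loosely modeled on the example above, not an agreed scheme.

```python
import re

BASE = "http://example.org/data/people"  # hypothetical base address


def slug(text):
    """Lower-case the text and collapse anything non-alphanumeric to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def person_uri(jurisdiction, body, year, surname):
    """Build a readable identifier from the parts that describe a person."""
    parts = [slug(jurisdiction), slug(body), str(year), slug(surname)]
    return "/".join([BASE] + parts)


uri = person_uri("US", "Congress", 2003, "Rogers")
print(uri)
```

A surname alone will collide sooner or later (two members named Rogers in the same body and year), so a real scheme would need a tie-breaker such as a state or given name.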

              Something else to keep in mind is that we don't necessarily need to
              agree on a single URI for each person. If we find out we've assigned
              two URIs to the same person, we can annotate one URI with the reference
              to the other with something like a "this person is the same as this
              person" note.
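
Such "same as" notes can be resolved mechanically. The sketch below groups URIs that have been declared equivalent using a small union-find structure; the class name and the example URIs are invented. In RDF terms, one existing vocabulary for this kind of statement is OWL's `owl:sameAs`, though the message does not commit to any particular one.

```python
class SameAs:
    """Track groups of URIs that have been declared to mean the same person."""

    def __init__(self):
        self._parent = {}

    def find(self, uri):
        """Return the canonical URI for the group containing `uri`."""
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            # Path halving: point this node at its grandparent as we walk up.
            self._parent[uri] = self._parent[self._parent[uri]]
            uri = self._parent[uri]
        return uri

    def link(self, a, b):
        """Record a 'this person is the same as this person' note."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI as the canonical one
            # so the result does not depend on the order of the notes.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep


notes = SameAs()
notes.link("urn:a/rogers", "urn:b/mike-rogers")
notes.link("urn:b/mike-rogers", "urn:c/m-rogers")
print(notes.find("urn:c/m-rogers"))
```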

              There's more to be said about this, but I'll come back to it in the future.

              > Also maybe people and roles should be in separate files?

              Not too important, as long as the information is in there somewhere.

              --
              - Joshua Tauberer

              http://taubz.for.net

              ** Nothing Unreal Exists **
            • sc0ttbeardsley
              Message 6 of 10 , Jan 26, 2005
                --- In govtrack@yahoogroups.com, Aaron Huslage <huslage@g...> wrote:

                > I've been working on getting the data for Oregon, however, abstracts
                > are the only things available online and there are no typed
                > transcripts at all.

                Try to email the governor... He seems to be pro open source; maybe
                he'll also be pro open government.

                Did you see this yet:
                http://katu.com/stories/74397.html
              • directaction
                Message 7 of 10 , Jan 29, 2005
                  I'm glad I found this site and am intrigued by the intent of the
                  originators. Congratulations to this fine organization on its
                  recent award.

                  I hope that your organization can make good use of the current
                  opportunity to protect and enhance the citizen. I'm a political
                  consultant. My business and my clients run campaigns which are all
                  about citizen control of the government rather than the one other
                  and reverse option.

                  I work extensively with government data, and specifically the data
                  concerning registration of voters---I use voter files on behalf of
                  my clients and for our various intrusions into the "processes of
                  government", which is another way to say "getting votes".

                  For the last six months, SOLID, I've been lost in a major overhaul
                  of my business, and my ongoing confusion and despair is all about
                  the inability to find and pay for the expertise and advice anyone
                  needs when confronting the myriad technologies now available and
                  indispensable for those in this field. Frankly, like it would be for
                  any business, my problem is learning enough myself so that I can
                  make wise choices in what technical support I need.

                  Your group can go one of three ways: your proposal, which looks to
                  be headed in a generally good direction; an indifferent course,
                  where you either are ineffective because you never do anything or
                  ineffective because you continually are in the dark about what is
                  going on, and always are outmaneuvered by more knowledgeable
                  operatives who for whatever reasons, are seeking something different
                  than what you think is important; or you could go in the direction
                  of profiting from what I think is an enormous amount of power and
                  influence by closing off access to any real information and making
                  damn sure that a whole lot of intentionally misleading information
                  is substituted for and passed off as the real thing...and this last
                  route is well-traveled, as it's been the choice of many, dating to
                  the country's first day.

                  I read the prior posts to this list, which included one from someone
                  in Sacramento (my old home town and a town in which I've done much
                  campaign work) and another post from someone in Oregon (where I now
                  live and in which I do extensive campaign work) and I've included
                  here some fairly lengthy comments, with examples on where to go or
                  what to do in Sacramento or Oregon, but which apply elsewhere. I've
                  managed campaigns all over the country, and for more than 25 years
                  now, and will gladly help anyone with any information they need and
                  which I might know something about.

                  If your mission or goal is openness and access by the citizen to
                  this thing called "government", I'm on your side. And before you
                  consider that a good thing, ask yourselves "Just exactly who is our
                  new ally, and what is he trying to get?"

                  And remember to judge all you do and all that is done or proposed by
                  others with that kind of general and reasonable examination.

                  Please first consider the history and intent of all those with whom
                  you deal and upon whom you rely for guidance or cooperation...for
                  example: the posts concerning the Governor of Oregon, and the "open
                  source" generalities, didn't mention any of the more obvious
                  concerns we all must have about compilation and disposition of
                  data...and as for Oregon's Governor and government, their history
                  and current practices concerning such data are chock full of major
                  problems. That's not to suggest that the Governor of Oregon and the
                  whole of that state's government are worse than elsewhere---no, I
                  will say, however, that not a single state in this nation is
                  anywhere near good or decent---and I will also insist that not one
                  of the existing state-by-state comparisons for "openness" is all
                  that accurate as yet. And there are many groups which have proposed
                  and are seeking the kind of openness which your group proposes to
                  protect and enhance.

                  Open access to all data which government compiles and/or manages is
                  a hot debate being openly conducted (though I would put many
                  qualifications on how to define "open", and herein, I'm not using
                  it literally).

                  Today, there is a real need for people to become engaged in this
                  debate. And it's gotta be done right now---and the good guys better
                  have some geeks with them, to translate and to inform--- so that the
                  non-technically proficient among us don't get into trouble, BIG TIME
                  and suddenly find that we gave away all kinds of data which is NO
                  ONE's business just because we thought we were doing the right thing
                  or because we didn't pay any attention when it was being opened up
                  to "public" access (at the end I've attached a few lines about some
                  problems which have erupted during this last election cycle).


                  I suggest that you start with a review and analysis of all those
                  individuals and groups who for so long have been doing or attempting
                  to do what you now propose (since those groups which have goals
                  similar to your own and with which I am familiar make for a VERY
                  long list, I've noted only a few here).

                  And let's all of us also identify the people and groups who are or
                  likely will be MISUSING that same opportunity.

                  As for groups you can begin with, one of your obvious tools, as well
                  as a starting point for any group establishing its intent or model,
                  is the Freedom of Information Act. FOIA, to put it simply, IS
                  government information. You will want to consider the original
                  intent of FOIA, all of its revisions over the years, and its
                  current implementation.

                  Especially for those of you in California, the California Voter
                  Foundation is a place you might go to get an overview of some of the
                  info now available and to see how various interested parties are
                  attempting to influence the collection of data and its ultimate
                  disposition---it's a private foundation, with an agenda (and EVERY
                  ONE has an agenda, so learn the agenda of this group, too, and keep
                  it in mind when evaluating what they propose---when you look into
                  those who oppose some of what this Foundation advocates, you'll find
                  many new avenues of inquiry).

                  Right there in Sacramento you have one of the absolute best
                  companies of all those which specialize in selling voter data and
                  enhancements. And learn about all the major players in selling this
                  data and using it for campaigns.

                  If I can answer any questions or add anything of use to someone, I
                  will be glad to help. Let's keep it open and only keep what we have
                  a right to possess.
                • Joshua Tauberer / GovTrack
                  Message 8 of 10 , Jan 29, 2005
                    directaction wrote:
                    > I'm glad I found this site and am intrigued by the intent of the
                    > originators. Congratulations to this fine organization on its
                    > recent award.

                    Hello, and thanks!

                    I'm not sure how much intrigue there could be about GovTrack, unless you
                    think I might have some ulterior profit motives. My intent was to
                    create what you see now on the site. That's basically it.

                    > I hope that your organization can make good use of the current
                    > opportunity to protect and enhance the citizen.

                    That's the general idea.

                    > Your group can go one of three ways: your proposal, which looks to be
                    > headed in a generally good direction;

                    Not sure what proposal you're referring to. If you mean the 'long-term
                    mission' box on the main page of the site, it's a good direction, yeah...

                    > or you could go in the direction of profiting from what I think is an
                    > enormous amount of power and influence by closing off access to any
                    > real information

                    Well, that's not my intention. That will become clearer in the next few
                    months as I work on open standards for sharing information.

                    > I've managed campaigns all over the country, and for more than 25 years
                    > now, and will gladly help anyone with any information they need and
                    > which I might know something about.

                    I'm sure people will appreciate that.

                    > Open access to all data which government compiles and/or manages is a
                    > hot debate being openly conducted ...
                    > and suddenly find that we gave away all kinds of data which is NO
                    > ONE's business just because we thought we were doing the right thing

                    That type of information is far beyond the scope of this mailing list. All
                    we're concerned about here is legislative records that are already a
                    matter of public record and, for the most part, already accessible on
                    the Internet.

                    > I suggest that you start with a review and analysis of all those
                    > individuals and groups who for so long have been doing or attempting
                    > to do what you now propose

                    Well, as if I have time to do a careful review and analysis of anything. :)

                    > And let's all of us also identify the people and groups who are or
                    > likely will be MISUSING that same opportunity.

                    I can't disagree more. I have absolutely no concerns about whether
                    people might misuse the data I publish. Given the type of information
                    that I'm dealing with, there's simply no harm, without deliberate
                    misuse, in publishing the truth. And, the same for the other types of
                    information we've been talking about on this list.

                    > If I can answer any questions or add anything of use to someone, I
                    > will be glad to help. Let's keep it open and only keep what we have
                    > a right to possess.

                    Last time I checked, we've got a right to all of the information that
                    we've ever talked about here. One might say more than a right to
                    possess it, a duty to publish it.

                    --
                    - Joshua Tauberer

                    http://taubz.for.net

                    ** Nothing Unreal Exists **