Re: [govtrack] Proposed new terms of data use

  • Josh Tauberer
    Message 1 of 15 , Oct 14 4:18 PM
      Michael Dale wrote:
      > In terms of data re-sharing ...you could license the "transformed" data
      > that GovTrack makes available under cc-by-sa, but the Creative Commons
      > license does not say much about ~how~ the transformations are re-made
      > available.

      Besides what Fred posted (thanks Fred), I'm not sure I can even assert
      copyright over the data --- it's a database, there's basically no
      creative difference from the public domain original. I wouldn't really
      want to anyway, for the same reason I probably wouldn't create real TOS,
      since I do think the data should be free.

      > focus on providing constructive advice to groups working in this space
      > to maximize the commons and reusability of the data, i.e. provide a means
      > of "querying the data with gov_track ID if the govtrack data is used"

      Sure, your feedback and the other comments have been instructive for
      figuring out that angle.

      > I will quickly profile data usage / re-usage on metavid.org ;)

      I didn't have MetaVid in mind. :) I would, however, love to see database
      dumps (in a useful format) rather than having to query for everything.

      (I hope I'm the only one who hates queries and APIs as a primary means
      of data access....)

      --
      - Josh Tauberer
      - GovTrack.us

      http://razor.occams.info

      "Yields falsehood when preceded by its quotation! Yields
      falsehood when preceded by its quotation!" Achilles to
      Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)
    • Josh Tauberer
      Message 2 of 15 , Oct 14 4:24 PM
        Ilan Rabinovitch wrote:
        > Josh Tauberer wrote:
        >> Use XML instead of/in addition to JSON/SQL. Normalize names to IDs in
        >> the XML. Document what's in the files (http://watchdog.net/about/api is
        >> broken atm so I don't know what's there).
        >>
        >>
        > Josh,
        >
        > At the moment GeekPAC is using your data by parsing the feeds via rsync
        > and putting them into a SQL database. I'm still doing a little clean
        > up, but I do plan to post both the database dumps, as well as the Deki
        > extensions we've written that perform the SQL queries we display. Does
        > that fall in line with what you were thinking for acceptable use?
        >
        > We are not currently adding anything new to the data so re-outputting to
        > XML seems a bit redundant.

        In that case you have nothing to worry about. :)

      • David Moore
        Message 3 of 15 , Oct 14 5:07 PM
          Hi everyone, David with OpenCongress here. Definitely count us in on
          whatever community standards are agreed upon, we're happy to contribute.
          More details below, think that Josh is right to bring it up.

          As a foundation, our site code is open-source under the GPL and we offer
          a host of RSS feeds & widgets & sharing tools to push info out.

          We've always wanted to build an open API, but to be honest, given our
          small staff & limited programming time, it wasn't as much of a priority
          as major feature development.

          Of course, that hasn't stopped us from starting work on a totally open
          API on the back burner, making all data on OC & created by the OC user
          community available. We've looped in a volunteer programmer to work on
          the project with us in his spare time.

          The OpenCongress API should do the trick as far as putting more data
          from our corner of the transparency world on the communal table. Overall
          goal is to provide programmers w/ an API that they could access and get
          the bills associated with a given issue area, their status, and
          blogs/commentary/social wisdom about them. We'll be able to provide
          developers with at least the following data for non-commercial use:

          a) Aggregated news & blog coverage of bills, Senators, and
          Representatives, including those ranked "most useful"

          b) Counts and locations of users tracking bills, Members, committees,
          issues, etc.

          c) User comments, incl. those rated "most useful", i.e. filtered up

          d) User approval ratings for Members

          e) User votes "aye" or "nay" on bills sitewide

          f) Users also tracking related bills, issues, Members (connections)

          g) Users who support/oppose also support/oppose related bills & Members

          h) Users' OC friend relationships -- in their district, state, and
          nationwide

          i) Coming soon, more personally bookmarked content from users of MyOC

          Coming from this, a few sample use cases:

          i. Political bloggers will be able to more easily access user opinion on
          bills & issues & Members in a specific Congressional district, e.g., "In
          the NY-12 Congressional District, public opinion is running strongly
          against this bill, with 147 out of 195 users opposing it. These users
          are also opposing this related bill, and have given their Rep an
          approval rating of only 29%, etc."

          ii. Issue-based groups will be able to create highly customizable
          widgets identifying the most significant bills, votes, related issue
          areas, and Members relating to them. Groups will be able to easily
          display & re-publish the news coverage, blog coverage, and user comments
          rated "most helpful" on their issue by OC users.

          iii. With future planned feature development, users will be able to
          interact with each other in new ways, and contribute analysis of bills &
          votes on the site -- this too will be made available to programmers
          looking to keep their communities in touch with issue areas they care
          about. All the social actions & opinions taking place on OC will be
          available through the API.

          If you're interested in helping us build the API, we'd love volunteer
          time -- send me an email at drm@... -- or if you have
          questions, feel free to drop me a line as well. I don't really have a
          pinpoint estimate of when the API will be finished at its current rate,
          given other development work underway, but it should be ready before the
          start of the next Congress in January '09, and hopefully much before then.

          Input welcome on all the above, and volunteer help greatly appreciated,
          Thanks,
          -David

          --
          David Moore
          c: (917) 753-3462
          www.opencongress.org
        • Josh Tauberer
          Message 4 of 15 , Oct 14 6:43 PM
            Bah! APIs! The next time someone says API I'm gonna jump out a window.
            I've got a window right here. It's open. I'm ready.

            The one case an API makes sense as a primary means of data access is
            when the data is so large and inseparable that it cannot be reasonably
            distributed in files. It would have to be, say, at least several hundred
            megabytes if not a few gigabytes for that to be the case --- and even
            then one would have to justify not making use of resources like
            public.resource.org to host it.

            Can you imagine the outrage if the FEC decided to make its data
            available only via an API with an API key that was limited to some fixed
            number of queries per day? What's the first thing that would happen?
            People (people like Carl Malamud, right?) would reconstruct the database
            and make it available via FTP.

            Besides the case where the data is just too big, if the data is not
            available in a flat file, it is IMO simply not open data, and as far as
            what I am talking about on this thread, it "doesn't count".

            (APIs take time to program correctly. Yes. Insufficient resources =
            acceptable reason not to have an API. Database dumps do not take serious
            effort.)
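
To illustrate how little code a flat-file dump can take, here is a minimal sketch, assuming the data sits in a SQLite database. The function name dump_tables_to_csv and the one-CSV-per-table layout are hypothetical choices made for this example, not anything GovTrack or OpenCongress is known to ship.

```python
import csv
import sqlite3
from pathlib import Path


def dump_tables_to_csv(db_path, out_dir):
    """Write every user table in a SQLite database to its own CSV file.

    Hypothetical helper: returns the list of files written, sorted by table name.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Skip SQLite's internal bookkeeping tables.
        tables = [name for name in names if not name.startswith("sqlite_")]
        for table in tables:
            # Quote the identifier; the names come from the database itself.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute("SELECT * FROM " + quoted)
            target = out / (table + ".csv")
            with target.open("w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                # Header row taken from the column names of the result set.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(target)
    finally:
        conn.close()
    return written
```

Pointing this at a database file and publishing the resulting directory over plain HTTP, FTP, or rsync would cover the bulk-access case discussed above; other database engines have their own dump tools that do the same job.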

          • aronpilhofer
            Message 5 of 15 , Oct 15 5:52 AM
              > Bah! APIs! The next time someone says API I'm gonna jump out a window.

              Let's hope it's a low floor, because I wanted to let folks know we've
              just released our campaign finance API. Not necessarily of great use
              to this group, but who knows.

              http://developer.nytimes.com/docs/campaign_finance_api

              Incidentally, I agree that APIs are a rather crappy way of
              distributing data in toto, but who is arguing this as an either/or?
              There is significant value in both.

              First, you mention how horrible it would be should the FEC create an
              API. But not everyone has the technical know-how to handle, what, 12?
              13? million FEC records, much less make sense of the arcane, poorly
              documented system they use to categorize and code individual records.
              If you don't know what you are doing, you can end up completely
              shooting yourself in the foot.

              And don't even get me started on the electronic filings, which is what
              we are using for our own API. The process of massaging those data into
              something meaningful is far far more complicated than it should be.
              (Like, who's the genius who decided not to require campaigns to
              disclose their aggregate amount of unitemized donors?)

              So, why should you be required to become a campaign finance expert in
              order to use the data? That's an artificial and unnecessary barrier.

              Second, not everyone wants all 8 kazillion records. They may only care
              about specific donors, or specific candidates, or specific localities.
              A well-written API (ours is a work in progress, so, don't judge it
              just yet) is another way of lowering the barrier of entry.

              I agree that the term and the concept are getting a bit overused. But
              that isn't a compelling reason NOT to make access to data easier for
              people.

              > Again, I'm not actually enacting this policy over my data.

              On the specific point that started this thread, it might be a good
              time to gently remind you that this is not your data. It's the
              public's data, which you (and god bless you for having done it) have
              taken the time and effort to make available in a rational format for
              the betterment of all.

              It is a lesson I think we all learned on the playground: sharing is
              not always reciprocal. There are going to be people out there who
              take, and don't give back. I understand your frustration, but I don't
              think adding some new requirement is going to help all that much, and
              may actually end up hurting more than anything else.

              My 2 cents,
              Aron
            • Josh Tauberer
              Message 6 of 15 , Oct 15 6:50 AM
                aronpilhofer wrote:
                > Incidentally, I agree that APIs are a rather crappy way of
                > distributing data in toto, but who is arguing this as an either/or?
                > There is significant value in both.

                Well, look, I wasn't making a statement about APIs in general.

                I was responding to a response to my statement about contributing to the
                commons, and I was saying that an API doesn't contribute the data to the
                commons.

                In the case of the Times's FEC API, the data is already available in
                bulk from the FEC. You're providing an additional service to make things
                easier, and I say that is only a good thing. You're also a commercial
                enterprise, with different goals, and I meant to only be addressing the
                strictly nonprofit/transparency world, though I know I didn't say it.

                > On the specific point that started this thread, it might be a good
                > time to gently remind you that this is not your data.

                For all the time I put into it, I think I get a little say in how it is
                used (if you access my server to get it). I have no moral obligation to
                provide the data to everyone. At worst it would be hypocritical to start
                adding restrictions when I talk about openness, which is why I don't
                actually have any.

                And the irony is not lost on me that if I actually add a restriction,
                someone could fork the project.

                > There are going to be people out there who
                > take, and don't give back.

                But that doesn't mean I shouldn't have an expectation about what they
                *ought* to be doing. The fact that someone isn't contributing data that
                they have back doesn't mean I stop asking.

              • aronpilhofer
                Message 7 of 15 , Oct 15 7:47 AM
                  > Well, look, I wasn't making a statement about APIs in general.

                  Fair enough. I move to strike my statement from the record.

                  > For all the time I put into it, I think I get a little say in how it is

                  I guess that depends on what restrictions you do decide to slap on it,
                  if any. I'm not telling you anything you don't know -- but that's part
                  of the deal when you decide to open things up. People take and don't
                  play nice. It sucks, but you can't really have it both ways.

                  > The fact that someone isn't contributing data that
                  > they have back doesn't mean I stop asking.

                  No one said that. But putting some kind of license on the data to
                  enforce it, that's another matter.