
Re: [govtrack] Mixing Facts with Speculation/Gossip

  • Bill Farrell
    Message 1 of 3 , Mar 1, 2005
      Hi Scott,

      You've raised some really good points. I think you're absolutely right that pure data mining in the same site with an open forum that has a decided purpose of shaping public opinion is something we'd REALLY like to avoid. That was the reason I separated Pythia from WWW on ProgressiveNation.net. The idea: let visitors mine what they need to mine, the way they need to mine it -- on the mining site. Once they've drawn their conclusions or are ready to publish their studies, they must go back to the main site.

      As we're scraping and conditioning the raw data, we might think about a common format for citing the source, perhaps briefly describing the normalization procedures, and stating outright that no opinion has been made on any of the contents. I'm not familiar with other members' sites, so I'm not sure what sort of mining and publishing operations are already underway.

      Joshua, would OGDEX support such a resource registration/citation form? This kind of thing is more common in the genealogy world, where citation of source is EVERYTHING. (Genealogists are even more cynical than political researchers :chuckle:) That would give us an idea of the resources we've got on hand, so we can begin building the inter-site communications. (More on that in another post).

      I think by registering our resources on OGDEX and/or GovTrack we can lend some assurance that we've at least done a peer review and have agreed that the data (wherever it may currently lie) is coherent, standardized, and as complete as currently possible, given the technologies we're forced to employ. (The expression "BFH" [Big Friendly Hammer] springs to mind.)

      Resource registration isn't the whole solution, but it would be a start.
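
As a rough illustration of the registration/citation form discussed above, here is a minimal sketch of one such record. The class name, field names, and sample values are all hypothetical; nothing here is an existing OGDEX or GovTrack format. It only shows the three pieces mentioned: a source citation, a short note on normalization, and an explicit no-opinion statement.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ResourceRegistration:
    """Hypothetical citation record for one scraped data set."""
    title: str
    source_url: str            # where the raw data was scraped from
    retrieved: str             # retrieval date, ISO 8601 (YYYY-MM-DD)
    normalization: str         # brief description of the conditioning steps
    opinion_free: bool = True  # states outright that no opinion has been applied

    def to_json(self) -> str:
        """Serialize the record so it could be posted to a shared registry."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Invented sample values, for illustration only.
record = ResourceRegistration(
    title="Example roll-call votes",
    source_url="http://example.org/raw/votes",
    retrieved="2005-03-01",
    normalization="Stripped HTML; normalized member names to 'Last, First'.",
)
print(record.to_json())
```

Whatever serialization the group settles on, the point is that every data set carries the same few fields, so a reader can trace it back to its source.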

      Best!
      Bill

      ----- Original Message -----
      From: Scott Beardsley <sc0ttbeardsley@...>
      To: govtrack@yahoogroups.com
      Sent: Tue, 1 Mar 2005 00:34:50 +0000
      Subject: [govtrack] Mixing Facts with Speculation/Gossip


    • Joshua Tauberer / GovTrack
      Message 2 of 3 , Mar 1, 2005
        Bill Farrell wrote:
        > As we're scraping and conditioning the raw data, we might think about
        > a common format for citing the source, perhaps briefly describing the
        > normalization procedures, and state outright that no opinion has been
        > made on any of the contents. [snip]
        >
        > Joshua, would OGDEX support such a resource registration/citation
        > form?

        OGDEX was started as a community effort with no particular leadership,
        so don't look at me. :) Its mission is to serve as a hub for efforts
        along these lines, and I agree that we will need a system for describing
        data sources, and I also think OGDEX would work well as a central place
        to list sources in that way.

        I have lots of ideas (as per usual) about how to go about describing
        data sources, but before I present them 1) I need to get everyone to
        agree that RDF is the way to approach this (otherwise we're going to be
        debating XML formats forever), and 2) we need to actually get different
        data sources available on the web.

        The ideal way to get all of this going is for someone that has data
        (e.g. you, Bill) to pick out a slice of their data that is related to
        what's on GovTrack but doesn't overlap with it, and then for you and
        GovTrack to export that data in a common format (e.g. RDF). These days
        I'm just waiting for this to happen.

        Take the bioguide IDs, for instance. They're related to GovTrack's data
        in that they're about the same people that GovTrack has data about, but
        they don't overlap because I don't have that info. The common format I
        hope to convince you of is RDF (pending my finishing the explanation of
        RDF), and then I can work with you on getting the data exported in that
        format...
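
To make the bioguide example concrete, here is a minimal sketch of what exporting such a slice as RDF might look like, written out as N-Triples using only the Python standard library. The person URIs, the `bioguideId` predicate, and the sample IDs are all placeholders invented for illustration; no vocabulary has been agreed on in this thread.

```python
# Placeholder namespace and predicate: assumptions, not a real vocabulary.
PERSON_BASE = "http://example.org/people/"
HAS_BIOGUIDE_ID = "http://example.org/vocab#bioguideId"


def bioguide_triples(ids):
    """Return N-Triples lines linking each person URI to a bioguide ID literal."""
    lines = []
    for person_key, bioguide_id in sorted(ids.items()):
        subject = f"<{PERSON_BASE}{person_key}>"
        predicate = f"<{HAS_BIOGUIDE_ID}>"
        # Escape backslashes and double quotes inside the literal.
        escaped = bioguide_id.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} {predicate} "{escaped}" .')
    return lines


# Invented sample data; real IDs would come from whoever holds that slice.
sample = {"person-1": "X000001", "person-2": "Y000002"}
for line in bioguide_triples(sample):
    print(line)
```

If two sites use the same URI for the same person, their exports can be merged simply by concatenating the triples, which is the kind of non-overlapping but related linkage described above.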

        --
        - Joshua Tauberer

        http://taubz.for.net

        ** Nothing Unreal Exists **