  • Joshua Tauberer
    Message 1 of 1 , Mar 12, 2005
      This is mainly a follow-up to something Chris from DIA (he's on the mail
      list now) and I were talking about Monday, but I think the list at large
      would be interested in seeing this.

      When I met with Chris on Monday, he raised an important point that
      whatever system is used to share data, it should be really easy to use,
      in part to encourage people to use it.

      Sharing data as raw databases makes it really easy to drop the data into
      a website, he suggested. And, I totally admit that it's much easier
      than dealing with RDF.

      But, as I responded Monday, once data is in RDF, it's easy to export it
      into a database. Two weeks ago I had been working on an RDF querying
      engine (for fun, really, since there are already existing programs to do
      this), and this week I added to it an SQL output format. The result is
      the ability to query an RDF data model and output it, more or less, as a

      First some background...

      GovTrack publishes an RDF version of all of its data in
      http://www.govtrack.us/data/rdf/. You should take a look at the
      people.rdf file if you haven't seen it yet, to get a general idea of the
      structure of the data.

      You can browse the data at http://www.govtrack.us/rdfbrowse.xpd. The
      browser program itself knows nothing about the type of data that it's
      browsing, which is a good example of the advantage of using RDF. All of
      the different types of information magically just come together, with no
      glue specific to each type of data. (The browser uses the RDF schemas
      in http://www.govtrack.us/share/ and some labels present in the RDF
      files above to display nice names in place of some URIs.)

      RDF can be written in XML or Notation 3, among other formats. There are
      N3 versions of the schemas in the share directory if you want to see
      what they look like. N3 is a much simpler format than RDF/XML. It's
      basically just a list of statements: subject predicate object, followed
      by a period.
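
      To make the "subject predicate object, period" shape concrete, here is
      a small Python sketch that writes a few statements out in that style.
      The ex: names are made up for illustration and are not the actual
      GovTrack URIs, and a real N3 file would also declare its prefixes.

```python
# Each statement is a (subject, predicate, object) triple.
# The ex: identifiers are hypothetical placeholders, not GovTrack's real URIs.
triples = [
    ("ex:alice", "rdf:type", "ex:Representative"),
    ("ex:alice", "foaf:name", '"Alice Example"'),
    ("ex:alice", "ex:represents", "ex:NY"),
]


def to_statement_lines(triples):
    """Render each triple as 'subject predicate object .' on its own line."""
    return [f"{s} {p} {o} ." for s, p, o in triples]


lines = to_statement_lines(triples)
for line in lines:
    print(line)
```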

      For the query engine that I wrote, the query itself is written as RDF
      (in this case as N3). You give it an RDF graph with some nodes marked
      as variables, and the engine tells you the different ways it can match
      up (bind) those variables with entities in the target data model.
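
      The matching idea can be sketched in a few lines of Python. This is
      not the code of the engine described here, just a naive illustration:
      the query is a list of triples, terms starting with "?" are treated as
      variables (an assumption of this sketch), and the result is every
      consistent set of bindings. All the ex: names are hypothetical.

```python
def is_var(term):
    """In this sketch, a term starting with '?' is a variable."""
    return term.startswith("?")


def unify(pattern_triple, data_triple, binding):
    """Try to extend `binding` so pattern_triple matches data_triple.

    Returns the extended binding, or None if the two cannot match.
    """
    new = dict(binding)
    for p, d in zip(pattern_triple, data_triple):
        if is_var(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, d) != d:
                return None
        elif p != d:
            return None
    return new


def query(pattern, graph):
    """Return every way the variables in `pattern` can be bound in `graph`."""
    bindings = [{}]
    for pattern_triple in pattern:
        extended = []
        for binding in bindings:
            for data_triple in graph:
                new = unify(pattern_triple, data_triple, binding)
                if new is not None:
                    extended.append(new)
        bindings = extended
    return bindings


# Hypothetical data model and query.
graph = [
    ("ex:alice", "rdf:type", "ex:Representative"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Senator"),
    ("ex:bob", "foaf:name", "Bob"),
]
pattern = [
    ("?who", "rdf:type", "ex:Representative"),
    ("?who", "foaf:name", "?name"),
]
results = query(pattern, graph)
print(results)
```

      A real engine would index the statements instead of scanning the whole
      graph for each pattern triple, but the binding logic is the same idea.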

      Ok, the example...

      At http://www.govtrack.us/rdfquery.cgi you can try it out, although
      admittedly it's difficult to come up with valid queries because the
      structure of the data isn't all that simple.

      The example queries the data model for all representatives currently
      serving in an office. (Since the data model is pretty rich, it's also
      possible to write queries to list the population of each state for any
      senator that voted Nay on legislation related to Copyright, for instance.)

      Anyway, the idea is that once the data is in RDF, we could come up with
      some queries to generate database versions of the information, and then
      also publish those.
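
      As a rough sketch of what generating a database version could look
      like, the Python below turns a list of query results (one dict per
      row) into SQL statements and loads them into an in-memory SQLite
      database. The table name, column names, and rows are invented for
      illustration; this is not the actual output format of the query
      engine.

```python
import sqlite3


def bindings_to_sql(table, columns, rows):
    """Build CREATE TABLE and INSERT statements from rows of query results."""

    def quote(value):
        # Escape single quotes the standard SQL way by doubling them.
        return "'" + str(value).replace("'", "''") + "'"

    column_defs = ", ".join(f"{c} TEXT" for c in columns)
    column_list = ", ".join(columns)
    statements = [f"CREATE TABLE {table} ({column_defs});"]
    for row in rows:
        values = ", ".join(quote(row[c]) for c in columns)
        statements.append(
            f"INSERT INTO {table} ({column_list}) VALUES ({values});"
        )
    return statements


# Hypothetical query results: one dict per set of variable bindings.
rows = [
    {"name": "Jane O'Example", "state": "NY"},
    {"name": "John Sample", "state": "CA"},
]
sql = bindings_to_sql("representatives", ["name", "state"], rows)

# Load the generated SQL into a throwaway database to check that it is valid.
conn = sqlite3.connect(":memory:")
conn.executescript("\n".join(sql))
loaded = conn.execute(
    "SELECT name, state FROM representatives ORDER BY state"
).fetchall()
print(loaded)
```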

      - Joshua Tauberer


      ** Nothing Unreal Exists **