On Sun, 2007-02-25 at 21:53 +0100, Danny Ayers wrote:
> Wow, what a thread. I'll respond at greater length once I've re-read a
> couple of times and thought a bit... but there is one point I can pick
> up on right away, from Benjamin:
> I challenge the effectiveness of RDF on a number of points
> * The effectiveness of the graph structure for conveying data machine
> The Web is a graph structure.
That's fine in the abstract sense, but
* An atom document has an atom structure
* An HTML document has an HTML structure
* A train list document has a train list structure
These are the structures I really want to get at when I process
information from another component in the network. If these are encoded
directly in XML I can extract this information using tree-walking
algorithms. If they are encoded into RDF I need a tool that does
tree-walking to build up an RDF graph, then I need to do graph-walking
to build the structure I really want to extract.
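To make the two paths concrete, here is a minimal sketch in Python. It
assumes a tiny Atom feed and a hand-written list of triples standing in
for the RDF encoding of the same feed. The predicate names "hasEntry"
and "title", the "urn:example:" identifiers and both function names are
made up for illustration and do not come from any real vocabulary or
toolkit.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First entry</title>
    <id>urn:example:1</id>
  </entry>
  <entry>
    <title>Second entry</title>
    <id>urn:example:2</id>
  </entry>
</feed>"""

# Hypothetical triples for the same feed (subject, predicate, object).
SAMPLE_TRIPLES = [
    ("urn:example:feed", "hasEntry", "urn:example:1"),
    ("urn:example:feed", "hasEntry", "urn:example:2"),
    ("urn:example:1", "title", "First entry"),
    ("urn:example:2", "title", "Second entry"),
]


def entries_from_tree(xml_text):
    """One pass: walk the XML tree and read the entries directly."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": entry.findtext(ATOM_NS + "id"),
            "title": entry.findtext(ATOM_NS + "title"),
        }
        for entry in root.findall(ATOM_NS + "entry")
    ]


def entries_from_graph(triples, feed_uri):
    """Two steps: index the graph, then walk it to rebuild the entries."""
    by_subject = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, []).append((predicate, obj))

    entries = []
    for predicate, obj in by_subject.get(feed_uri, []):
        if predicate == "hasEntry":
            properties = dict(by_subject.get(obj, []))
            entries.append({"id": obj, "title": properties.get("title")})
    return entries
```

Both functions return the same list of entries for this sample. The
graph version starts from triples that some parser would already have
had to produce, so in practice it sits on top of a tree walk rather
than replacing one.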
Not only is the graph structure level unnecessary, it is more
algorithmically complex than the tree walk. I suggest that it is at a
fundamental level easier to write a feed reader that understands
atom/xml than it is to write a feed reader that understands atom/rdf, no
matter how good your tools are for processing the underlying format or
In the case of either RDF or XML, you still need to specialise your
document type. In RDF you need vocabulary. In XML you need schema, which
encompasses vocabulary and structure.
I concede that uniform structure is important when you want to throw data
into a database and allow query over it. However, I would contend that
this is not a common function in machine-to-machine interoperation. Most
machine processing needs to do something specific with the data it
receives, and for that we do need the higher-level vocabulary or schema
to be well-defined.
If it is a prerequisite of the machine-processable web to have fully
self-describing documents, then we can always translate these to RDF for
our storage needs if we really want to. In the meantime, I would
suggest that RDF complicates the common case in favour of an uncommon
case that can be solved in a different way once the common case is dealt
with.
I believe Mark Baker has a different perspective on this, one which I
would like to understand better.
On Sat, 2007-02-24 at 18:40 +0000, Bill de hOra wrote:
> Benjamin Carlyle wrote:
> > As I will point out later in the document, I
> > don't think RDF is as conducive to good vocabulary evolution as
> XML isn't conducive to vocabulary evolution either. This is very
> juxtaposition. Most XML vocabularies I've seen that declare an
> extensibility based end up defining a subset of what RDF defines.
I think the evidence says otherwise. We have HTML and other formats to
demonstrate that the basic approach behind good XML development works.
The important rules seem to be:
* Use must-ignore semantics for anything that is not understood
* Don't define new namespaces for extensions, so the extensions can one
day be merged back into the base document type
* Attack a specific problem space, align communities behind the common
brand-name, and hammer things out until it all interoperates
I'm not sure whether or not we have evidence of RDF vocabularies that
have survived similar kinds of pressures, though FOAF may be an example.
> > RSS was defined in terms of RDF so that it
> > could be easily aggregated. However, aggregation did not happen at
> > RDF level in practice. Instead, RSS was aggregated at a higher
> But you don't say why that was. Why was that?
I would guess: Because it wasn't useful. Because the graph structure is
too low-level to meet application-specific data integration requirements
automatically. Do you have any alternative thoughts on that?
> > Must-ignore semantics mean that a document with additional elements
> > be ignored by old implementations.
> mI in my mind is about having a trailing "else" in the code
> logs to disk instead of throwing an exception. It's a sensible
> programmatic default.
The evidence seems to suggest that mI is critical to long-term evolution
of documents. It is about handling messages from the future and from the
past: Only require information if you need it to function. Ignore what
you don't understand.
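As a rough illustration of that rule, here is a minimal Python sketch
of a must-ignore reader. The train document, its element names and the
function name are all hypothetical. The reader keeps the elements it
understands, silently skips the rest, and only fails when something it
actually needs is missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for a train-list entry.
UNDERSTOOD = {"destination", "minutes"}
REQUIRED = {"destination", "minutes"}

OLD_DOC = (
    "<train><destination>Central</destination>"
    "<minutes>4</minutes></train>"
)
# A later version of the document type adds <platform>.
NEW_DOC = (
    "<train><destination>Central</destination>"
    "<minutes>4</minutes><platform>2</platform></train>"
)


def read_train(xml_text):
    """Read a train entry, ignoring any element this version does not know."""
    record = {}
    for child in ET.fromstring(xml_text):
        if child.tag in UNDERSTOOD:
            record[child.tag] = child.text
        # Anything else is a message from the future: ignore it.

    missing = REQUIRED - set(record)
    if missing:
        raise ValueError(
            "missing required elements: " + ", ".join(sorted(missing))
        )
    return record
```

An old reader built this way gives the same result for OLD_DOC and
NEW_DOC, which is the property that lets the newer document type be
deployed without breaking it.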
> > This allows new versions of the
> > document type to be deployed without breaking the architecture. It
> > allows extensions to be added for various purposes. If we continue
> > use mime we can be specific about particular kinds of subclasses.
> > example, I might sub-class atom for the special purpose of
> > the next three trains that will arrive at a railway station:
> > application/pids+atom+xml.
> > RDF isn't really as flexible.
> I can't agree. RDF's handling of unknown triples is far more flexible
> than mI.
Could you provide some examples of this?
> [aside: it's weird to watch people argue up the uniform interface as
> key constraint of REST, but happily rail on uniform data. ]
This was part of Mark's recent statements. I would like to attack the
issue from a specific direction, and that is application-to-application
One of my impressions from WSEC was that there wasn't a great maturity
of understanding about the uniform interface being displayed around the
room. Everyone was looking for the practical benefits of specific
methods, which is fine, but weren't quite seeing the benefits of uniform
interfaces in general.
One voice in the room asked why he should care about uniform methods,
when the component that receives a message still has to understand the
whole thing. He didn't see the point of using a uniform method vs an ad
hoc method when the whole message still had to be understood in a very
specific way... and the thing is that in a static architecture he is
exactly right. The uniform interface doesn't offer a fundamental benefit
in a static architecture. It is only as we evolve our architectures and
allow different webs to interact with each other that the key rule takes
effect, and that is:
* The kinds of interactions in an architecture and the kinds of data
transferred in the interactions should be decoupled from each other.
That is to say, the set of methods and the set of content types should
be decoupled from one another. The reason for this is that they vary at
different rates. I am very rarely going to need to add new
methods or return codes to form new interactions in the architecture,
but very often going to need to add new kinds of information. I am very
often going to need to add new content types.
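A small Python sketch of what that decoupling could look like follows.
It is an assumption-laden toy, not a description of any real server:
the Resource class, the PARSERS registry and the register and
understand helpers are invented for illustration. The point is only
that the method handling never mentions a content type, so new content
types can be registered without touching it.

```python
class Resource:
    """Holds one representation; knows methods, not content types."""

    def __init__(self):
        self._content_type = None
        self._body = None

    def handle(self, method, content_type=None, body=None):
        # The set of methods is small and changes rarely.
        if method == "PUT":
            self._content_type, self._body = content_type, body
            return 204, None, None
        if method == "GET":
            if self._body is None:
                return 404, None, None
            return 200, self._content_type, self._body
        return 405, None, None


# Content types live in a separate registry that changes often.
PARSERS = {}


def register(content_type, parser):
    """Teach this component a new content type (hypothetical helper)."""
    PARSERS[content_type] = parser


def understand(content_type, body):
    """Return the parsed body, or None if the type is not understood."""
    parser = PARSERS.get(content_type)
    return parser(body) if parser else None


register("text/plain", lambda body: body.strip())

resource = Resource()
resource.handle("PUT", "text/plain", " next train in 4 minutes ")
status, content_type, body = resource.handle("GET")
```

Adding support for application/pids+atom+xml later would be one more
register call. Resource.handle stays exactly as it is.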
The goal of application-to-application integration is to constrain the
kinds of message that are sent around an architecture so that the
messages can be understood wherever they arrive. Whenever the data
schemas of two components line up, I should be able to configure them to
have specific kinds of interactions with each other. I might
want them to have the GET interaction, or the PUT, or the SUBSCRIBE. The
thing is that uniform methods are just an underpinning for uniform
interactions, and that uniform data is still required.
I see the claim that RDF provides uniform data, but it really doesn't.
It doesn't any more than XML provides uniform data. It just provides a
uniform way of creating different data types. Uniform data only comes
about with RDF when you add vocabulary to it. Uniform data only comes
about with XML when you add both vocabulary and structure to it.
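A short Python sketch of that point, with made-up data: two producers
emit triples of identical shape but use different predicate URIs. Both
example.org and example.net vocabularies here are hypothetical, as is
the function name. A consumer written against one vocabulary gets
nothing useful from the other, even though both are perfectly good
graphs.

```python
# Hypothetical predicate URI that this consumer was written against.
MINUTES = "http://example.org/trains#minutesUntilArrival"

# Same structure, same meaning, different vocabularies.
PRODUCER_A = [
    ("urn:example:train1", MINUTES, "4"),
]
PRODUCER_B = [
    ("urn:example:train1", "http://example.net/rail#eta", "4"),
]


def minutes_until_arrival(triples):
    """Return the arrival time in minutes, or None if not understood."""
    for _subject, predicate, obj in triples:
        if predicate == MINUTES:
            return int(obj)
    return None
```

The triple structure is uniform in both cases, but only PRODUCER_A
yields a value. Agreement on the vocabulary is what makes the data
usable by this consumer.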
Thus, I suggest that RDF and REST are not an automatic fit to each
other. It is necessary to prove that RDF facilitates better ways of
constructing uniform kinds of data than XML does. RDF's uniform
structure is not in and of itself a clear win for REST.