
[fielding@apache.org: Re: httpRange-14 , what's the problem]

  • Mark Baker
    Message 1 of 1 , Aug 1, 2002
      Long, but *well* worth the read. Every time I talk to Roy, I learn
      something new and useful.

      MB

      ----- Forwarded message from "Roy T. Fielding" <fielding@...> -----

      Date: Wed, 31 Jul 2002 23:48:05 -0700
      Content-Type: text/plain; charset=US-ASCII; format=flowed
      Mime-Version: 1.0 (Apple Message framework v482)
      Cc: "www-tag" <www-tag@...>
      To: "Tim Berners-Lee" <timbl@...>
      From: "Roy T. Fielding" <fielding@...>
      In-Reply-To: <031301c23721$5661e5a0$84001d12@...>
      Message-Id: <A2224494-A51A-11D6-8895-000393753936@...>
      Content-Transfer-Encoding: 7bit
      Subject: Re: httpRange-14 , what's the problem
      X-Mailing-List: <www-tag@...> archive/latest/1886
      Sender: www-tag-request@...
      List-Id: <www-tag.w3.org>
      List-Help: <http://www.w3.org/Mail/>
      List-Unsubscribe: <mailto:www-tag-request@...?subject=unsubscribe>
      X-Spam-Status: No, hits=-1.8 required=5.0
      tests=IN_REP_TO,DOUBLE_CAPSWORD,PORN_3
      version=2.30
      X-Spam-Level:
      Status: O
      Content-Length: 34095
      Lines: 681


      On Monday, July 29, 2002, at 09:59 AM, Tim Berners-Lee wrote:
      > http://www.w3.org/DesignIssues/HTTP-URI.html

      > Tim Berners-Lee
      > Date: 2002-07-27, last change: $Date: 2002/07/31 20:15:46 $
      > [...]

      > This question has been addressed only vaguely in the specifications.
      > However, the lack of very concise logical definition of such things had
      > not been a problem, until the formal systems started to use them. There
      > were no formal systems addressing this sort of issue (as far as I know,
      > except for Dan Connolly's Larch work [@@]), until the Semantic Web
      > introduced languages such as RDF which have well-defined logical
      > properties and are used to describe (among other things) web operations.

      There has been quite a lot of work outside the W3C regarding the Web
      architecture, both in formalisms and mere descriptions. Google is a
      good way to find them, though most are intended more as a way of showing
      off the properties of some variation on a formalism than they are of
      actually modeling the Web.

      > The efforts of the Technical Architecture Group to create an architecture
      > document with common terms highlighted this problem. (It demonstrates the
      > ambiguity of natural language that no significant problem had been noticed
      > over the past decade, even though the original author of HTTP, and later
      > co-author of HTTP 1.1 who also did his PhD thesis on an analysis of the
      > web, and both of whom have worked with Web protocols ever since, had had
      > conflicting ideas of what the various terms actually mean.)

      Tim, when you invented HTTP it only allowed one method (GET), did not
      have header fields, and interpreted every response as HTML. HTTP has
      changed considerably over time.

      I don't think we have conflicting ideas about the terms. I think that
      the changes introduced in 1995 for the sake of HTTP caching and content
      negotiation are absent from your model of how the Web works because we
      needed to change the model in order for it to work at all. Furthermore,
      the community made a very conscious decision to stop referring to
      resources as documents because they simply do not always fit the mental
      model of the English word "document".

      > This document explains why the author finds it difficult to work in the
      > alternative proposed philosophies. If it misrepresents those others'
      > arguments, then it fails, for which I apologize in advance and will
      > endeavor to correct.
      >
      > 1. Web Concepts as proposed
      >
      > The WWW is a space of information objects. The URI was originally called a
      > UDI, and originally all URIs identified information objects. Now, URI
      > schemes exist which identify more or less anything (eg uuids) or
      > emailboxes (mailto:) but if we look purely at HTTP URIs, they define a web
      > of information objects. Information objects -- perhaps in Cyc terms
      > ConceptualWorks -- are normally things which
      >
      > * Carry some sort of message, and

      What does that mean? Information objects are things that carry
      information? All objects carry information by virtue of having state.

      > * Can be represented, to a greater or lesser authenticity, in bits

      Any object's state can be represented in bits. The state doesn't have
      to be stored in bits -- it can merely be observed at the time that a
      method is applied.

      > I want to make it clear that such things are generic (See Generic
      > Resources) -- while they are documents, they generally are abstractions
      > which may have many different bit representations, as a function of, for
      > example:
      >
      > * Time -- the contents can vary with revision --
      > * Content-type in which the bits are encoded
      > * Natural language in which a human-readable document is written
      > * Machine language in which a machine-processable document is written
      > * and a few more
      >
      > but the philosophy is that an HTTP URI may identify something with a
      > vagueness as to the dimensions above, but it still must be used to refer
      > to a unique conceptual object whose various representations have a very
      > large amount in common. Formally, it is the publisher which defines
      > what an HTTP URI identifies, and so one should look to the publisher for a
      > commitment as to the exact nature of the identity along these axes.

      Yes, no argument there.

      > I'm going to refer to this as a document, because it needs a term and that
      > is the best I have to date, but the reader should be sure to realize that
      > this does not mean a conventional office document, it can be for example
      >
      > * A poem
      > * An order for ball bearings
      > * A painting
      > * A Movie
      > * A review of a movie
      > * A sound clip
      > * A record of the temperature of the furnace
      > * An array of a million integers, all zero
      >
      > and so on, as limited only by our imagination.

      None of which are similar to the examples I gave, wherein an http URI
      is being used to identify the I/O control-system for a physical robot
      or a gateway to SMS-enabled phone devices. Nor does it lend appropriate
      significance to the properties of a Web-enabled refrigerator or a
      car radio, both of which I have personally interacted with via HTTP.
      It is therefore far more reasonable to refer to the thing identified
      by an http URI as a resource that is accessible via HTTP.

      > The Web works because, given an HTTP URI, one can in a large number of
      > cases, get a representation of the document. For a human readable
      > document, the person is presented with the information by virtue of some
      > gadget which is given the bits of a representation. In the case of a
      > hypertext document, a reference to another document is encoded such that,
      > upon user request, the referenced document can in turn be automatically
      > presented. In the case of a machine-readable document, identifiers of
      > concepts, being HTTP URIs, will often allow definitive reference
      > information about those concepts to be pulled in to guide further
      > actions.
      >
      > The web, then, is made of documents as the internet is made of cables and
      > routers. The documents can be about anything, so when we move to talk
      > about the contents of documents we break away from talking about
      > information space and the whole universe of human -- and machine --
      > discourse is open to us. Web pages can compare renaissance choral works
      > with jazz pop hits, and discuss whether pigs have wings.
      > Machine-processable documents can encode information about shoes, and
      > ships, and sealing-wax. Until recently, the Internet protocol standards
      > out of which the Web is built had little to say about such things. They
      > were concerned only with the human-readable side, so it was people,
      > reading natural language (not internet specs) who formed and communicated
      > the concepts at this level. Nowadays, however, semantic web languages
      > allow information to be expressed not only about URIs, TCP ports and
      > documents, but also about arbitrary concepts - the shoes, and ships and
      > sealing wax, and whether pigs have wings. Simple semantic web applications
      > allow one to order shoes and travel on ships, and determine that, given
      > the data, pigs do not have wings.
      >
      > For these purposes it is of course quite essential to distinguish between
      > something described by a document and the document itself. Now that we --
      > for the first time -- have not only internet protocols which can talk
      > about documents but also those which talk about real world things, we must
      > either distinguish or be hopelessly fuzzy.

      No argument there either, except that I don't think that this is
      anything new. The solution is to use a formalism that understands
      the difference between an identifier and *use* of that identifier.

      In order for HTTP caching to work, there needs to be a distinction
      between the attributes of a resource (whatever is identified by *any*
      scheme accessed via HTTP) and the attributes of one particular
      representation of that resource obtained via GET. Assertions in
      the form of HTTP metadata (header fields) are made about each,
      independently, and without ambiguity because they are defined by
      a shared standard, albeit without syntactic clarity and independent
      extensibility due to the limitation of mixing them all together
      in a MIME-like header. Apparently, RDF is capable of the same
      distinctions, so there is no technical issue here.
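
      The separation described above can be sketched in a few lines of
      Python. This is only an illustration under stated assumptions: the
      class names, the example.org URI, and the furnace readings are all
      hypothetical, and nothing here is taken from an HTTP implementation.
      It shows one resource whose identity stays fixed while two GETs
      yield two different representations, each carrying its own metadata.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Representation:
    """Bits returned by one GET, plus metadata about those bits only."""
    body: bytes
    content_type: str


@dataclass
class Resource:
    """The thing identified by the URI; its state may change over time."""
    uri: str
    allowed_methods: Tuple[str, ...]   # an assertion about the resource
    current_state: Callable[[], str]   # hypothetical source of state

    def get(self) -> Representation:
        # A GET observes the state at this moment and encodes it as bits.
        return Representation(self.current_state().encode("ascii"), "text/plain")


# Hypothetical furnace whose temperature changes between two requests.
readings = iter(["20C", "25C"])
furnace = Resource("http://example.org/furnace", ("GET", "HEAD"), lambda: next(readings))

first = furnace.get()
second = furnace.get()
print(furnace.uri, first.body, second.body)
```

      The URI and the allowed methods describe the resource and do not
      change; the body and content type describe one representation only.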

      > And is this bad, is it an inhibition to have to work our way through
      > documents before we can talk about whatever we desire? I would argue not,
      > because it is very important not to lose track of the reasons for our
      > taking and processing any piece of information. The process of publishing
      > and reading is a real social process between social entities, not
      > mechanical agents. To be socially responsible, to be able to handle trust,
      > and so on, we must be aware of these operations. The difference between a
      > car and what some web page says about it is crucial - not only when you
      > are buying a car.

      Correct, but in those circumstances we would not be using the "http"
      URI to define the identity of the car. Instead, we use the http URI
      to provide access via HTTP to a representation of a car that is ALSO
      identified by a VIN (without any URI form being necessary). A legal
      document that would later be drawn up, or even a transaction via
      contract passed through a cashier (also identified via a http URI),
      would use the VIN for physical identification of the car, not
      because it is an inherently better string than one beginning "http:",
      but because the VIN has been permanently affixed to the dashboard and
      engine block for precisely this purpose.

      If car producers wished such a thing, they could all agree to stamp
      each car with a unique http URI instead and achieve the same
      purpose: unique identification within the class of objects under use.
      The fact that we could also use that URI to access information about
      that specific car via the Web, which is a different resource from the
      car itself, doesn't change the fact that it uniquely identifies the
      car within the realm of cars.

      However, it is a bit of a waste of time to talk about this in terms
      of cars when the real objective, or at least that of the vocal
      minority, is to distinguish between the abstract concept of a
      namespace and a document describing that namespace. It is the same
      problem, but is easier to think about. When used within an xmlns
      attribute of an XML document, an http URI identifies the namespace.
      When used within an xlink:href attribute of an XML document or a
      browser's URI entry field or similar construct, an http URI
      identifies the namespace by providing a consistent view of that
      namespace in the form of representations.

      The URI still identifies an unambiguous resource precisely because
      we do not say that the result of a GET is the thing that is
      identified, and we do not say that the thing identified is a
      document just because access is allowed via HTTP by way of documents.
      People who use the Web don't care about that difference, but the
      technology of distributed caching absolutely depends on it.
      In other words, just because a URI only identifies one resource
      does not mean that every use of that URI is equivalent, just as
      using an http URI as a cache key is not equivalent to using it
      as the target of an anchor href.

      I say an http resource is a conceptual object that has state and
      identity and behavior, just as you define it in your own design notes
      prior to getting involved in this debate, but I do not generally refer
      to it as an object because all of the OOP developers get hot and
      bothered when I do so -- it is a term that is inextricably linked
      with a common implementation, just like document is a term that is
      inextricably linked to words/images on renderable media. HTTP is
      designed to hide all details of the implementation, so saying http
      URI identify resources is the most accurate statement.

      > Some have opined that the abstraction of the document is nonsense, and all
      > that exists, when a web page describes a car, is the car and various
      > representations of it, the HTML, PNG and GIF bit streams. This is however
      > very weak in my opinion. The various representations have much more in
      > common than simply the car. And the relationship to the car can be many
      > and varied: home page, picture, catalog entry, invoice, remote control
      > panel, weblog, and so on. The document itself is an important part of
      > society - to dismiss its existence is to prevent us being aware of human
      > and aspects of information without which we are impoverished. By contrast,
      > the difference between different representations of the document (GIF or
      > PNG image for example) is very small, and the relationship between
      > versions of a document which changes through time is a very strong one.

      That argument is weird. No one has opined that the abstraction of
      a document is nonsense -- it is merely insufficient to describe all http
      resources. Furthermore, if the same URI is used to identify a resource
      whose representations are a home page, picture, catalog entry, invoice,
      remote control panel, weblog, and so on, then that URI obviously does
      not identify the car. It might be said to identify a bunch of random
      things related to a car of that type, but certainly not the car.

      URI in general identify a resource -- one concept, one identity, one
      sameness that might be observable via its representations. The vast
      majority of URI do identify documents. HOWEVER, the architecture is
      not defined by what is true of the vast majority -- it is defined by
      what is true of ALL resources that fit the given criteria. And if the
      criterion is "all http URI", your definition of "document" simply does
      not fit. That does not in any way prevent people from using unambiguous
      identifiers in http or the semantic web, nor does it somehow reduce
      the value of a document as an abstraction.

      > 2. Trying out the Alternatives
      >
      > The folks who disagree with the model do so for a number of different
      > arguments. This article, therefore will have to take them one by one but
      > the ones which come to mind are as follows:

      Why didn't you simply refer to the arguments that others made, rather
      than your interpretation of what they meant to say? My messages have
      been a lot more carefully worded than this document.

      > 1. Every web page (or many of them) are in fact themselves
      > representations of some abstract thing, and the URI really identifies
      > that thing, not a document at all.

      Not *necessarily* a document.

      > 2. There are many levels of identification (representation as a set of
      > bits, document, car which the web page is about) and the URI
      > publisher, as owner of the URI, has the right to define it to mean
      > whatever he or she likes;

      They can define it to mean anything, but it only has meaning if it is
      used according to that definition. Likewise, it may take on meaning
      that wasn't intended by the publisher if it can be consistently used
      as such, and may take on a temporary meaning if the temporal period is
      sufficient to be usable. That's because the meaning of a URI is
      insignificant when compared to the reason why the reference is
      being made (the meaning of the resource in context of its use), and
      not all references are made by the publisher of the URI.

      It is possible, though not at all desirable, that the meaning of a
      resource will at some point differ from the intended meaning that
      a person had in mind when they used the URI as a reference. That
      is a well-known problem that will affect the Semantic Web just as
      much as it does the current Web. It is a social problem of any
      system that allows identifiers to exist independent of the entity
      being identified. Yes, it has drawbacks, but try identifying a concept
      that has no current realization within a system that depends on
      the realization for identity.

      > 3. Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate using common sense
      > and logic
      > 4. Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate using the fact that
      > different properties will refer to different levels.
      > 5. Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate using extra
      > information which will be provided in other ways along with the URI
      > 6. Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate them by context: A
      > catalog card will talk about a document. A car catalog will talk about
      > a car.

      None of the above. I have consistently stated that the URI identifies
      the same resource as far as the architecture is concerned, even if the
      people using that URI are only partially aware of its real sameness
      over time, and even if its meaning changes over time. The only thing
      that differs by context is the RESULT of using that URI. That is the
      separation of concerns between methods and identifiers which has been
      central to the architecture since HTTP/1.0 was introduced.

      > 7. They may have been used to identify documents up till now, but for RDF
      > and the Semantic Web, we should change that and start to use them as
      > the Dublin Core and RDF Core groups have for abstract concepts.

      IMO, there is only one Web.

      > 2.1 Identify abstract things not documents
      >
      > Let's take the alternatives in order. These alternatives all make sense.
      > Each one, however, has problems I can't see any way around when we
      > consider them as a basis as
      >
      > The first was,
      >
      > Every web page (or many of them) are in fact themselves representations
      > of some abstract thing, and the URI really identifies that thing, not a
      > document at all.
      >
      > Well, that wasn't the model I had when URIs were invented and HTTP was
      > written. However, let's see how it flies. If we stick with the principle
      > that a URI (or URIref) must unambiguously identify the same thing in any
      > context, then we come to the conclusion that URIs can not identify the web
      > page. If a web page is about a car, then the URI can't be used to refer to
      > the web page.

      It doesn't identify both. It identifies the car. The web page is
      what you GET. The same URI can then be used, in another context, to
      indirectly identify a representation that was formerly the result of
      a GET (which is what caches do when they look up a response), but the
      cache isn't even remotely confused between the two because we have
      defined them as different things.
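
      A minimal sketch of that cache behaviour, in Python, may make the
      point concrete. Everything named here (the Cache class, the
      example.org URI, the mileage strings) is hypothetical, and the
      sketch deliberately omits freshness and validation rules. It only
      shows the URI serving as a lookup key for a stored representation
      while the resource itself moves on.

```python
from typing import Callable, Dict, List


class Cache:
    """Maps a URI to a previously obtained representation, never to the resource."""

    def __init__(self) -> None:
        self._stored: Dict[str, bytes] = {}

    def lookup(self, uri: str, fetch: Callable[[str], bytes]) -> bytes:
        # The URI is used as a key; the origin is contacted only on a miss.
        if uri not in self._stored:
            self._stored[uri] = fetch(uri)
        return self._stored[uri]


# Hypothetical origin: the current state of a car, keyed by its URI.
car_uri = "http://example.org/car"
origin_state = {car_uri: b"mileage: 100"}
origin_calls: List[str] = []


def fetch_from_origin(uri: str) -> bytes:
    origin_calls.append(uri)
    return origin_state[uri]


cache = Cache()
before = cache.lookup(car_uri, fetch_from_origin)
origin_state[car_uri] = b"mileage: 250"   # the resource's state changes
after = cache.lookup(car_uri, fetch_from_origin)
print(before, after, len(origin_calls))
```

      The second lookup returns the stored representation, not the new
      state, which is why the two uses of the same URI are kept distinct.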

      > 2.1.1 Same URI can identify a web page and a car
      >
      > What, a web page can't be a car? At this point a pedantic line of reasoning
      > suggests that we should allow web pages and cars to conceptually overlap,
      > so that something can be both. This is counterintuitive, as a web page is
      > in common sense, not a concrete object whereas a car is. But sure, we
      > could construct a mathematics in which we use the terms rather specially
      > and something can be at the same time a web page and a car.
      >
      > Frankly, this doesn't serve the social purpose of the semantic web, to be
      > able to deal with common sense concepts and objects. A web page about a
      > car and a car are in most people's minds quite distinct (as I argue
      > further below). A philosophy in which they are identical does not allow me
      > to distinguish between them. It not only conflicts with reality as I see it,
      > but also leaves us no way to make statements individually about the two
      > things.

      A web page is something that you GET from a resource, not the resource
      itself.

      The only aspect of this that limits the Semantic Web is that it
      cannot pretend the result of a GET and the resource identified by the
      URI that was used to perform the GET are necessarily the same thing,
      which is a perfectly reasonable thing to require considering that they
      aren't even the same thing for time-varying documents, let alone cars.
      A URI is an identifier of a resource, not the resource itself.

      What is necessary for the Semantic Web is that it be able to distinguish
      between resources and representations, and further that it can deal with
      the very common situation where the representation has no known URI
      by which it can be directly referred, because Web sites deliberately
      hide the URI of those resources that they do not wish to be directly
      accessible. [BTW, Content-Location is not a sufficient fix for this
      problem simply because the resource provider has no desire to use it.]

      > 2.1.2 The URI identifies the car, not the web page
      >
      > So let's fall back on the idea that the URI identifies the subject of the
      > web page, but not the web page itself. This makes sense. We can build the
      > semantic web on top of that easily.
      >
      > The problem with this is that there are a large number of systems which
      > already do use URIs to identify the document. This is the whole metadata
      > world. Think of a few:
      >
      > * The Dublin Core

      uses URI to identify abstract concepts (metadata relationships),
      indirectly obtain sections of a resource that describes, and identify
      other resources that are the target of that relationship.

      > * RSS

      uses URI to identify namespaces, indirectly obtain a document
      that defines a namespace, and other resources that supply
      representations in a given format.

      > * The HTTP headers

      refer to the resource, the representation, or the message, depending
      on the definition of the header field.

      > * The Adobe XML system

      no idea

      > * Access control systems

      always refer to the resource.

      I don't see any problem.
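
      The remark about header fields above can be illustrated with a
      small Python sketch. The grouping below is an assumption made for
      illustration, covering only a handful of well-known fields as they
      are commonly read; it is not a table taken from the HTTP
      specification, and the names Scope, HEADER_SCOPE and group_by_scope
      are hypothetical.

```python
from enum import Enum
from typing import Dict


class Scope(Enum):
    RESOURCE = "resource"
    REPRESENTATION = "representation"
    MESSAGE = "message"


# Illustrative assumption: what each field's definition makes it describe.
HEADER_SCOPE: Dict[str, Scope] = {
    "allow": Scope.RESOURCE,                   # methods the resource supports
    "content-type": Scope.REPRESENTATION,      # format of these particular bits
    "content-language": Scope.REPRESENTATION,  # language of these particular bits
    "date": Scope.MESSAGE,                     # when this message was generated
    "connection": Scope.MESSAGE,               # handling of this exchange only
}


def group_by_scope(headers: Dict[str, str]) -> Dict[Scope, Dict[str, str]]:
    """Split header fields by the thing they make an assertion about."""
    grouped: Dict[Scope, Dict[str, str]] = {scope: {} for scope in Scope}
    for name, value in headers.items():
        scope = HEADER_SCOPE.get(name.lower())
        if scope is not None:          # fields outside the sketch are skipped
            grouped[scope][name] = value
    return grouped


response_headers = {
    "Allow": "GET, HEAD",
    "Content-Type": "text/html",
    "Date": "Thu, 01 Aug 2002 06:48:05 GMT",
    "X-Unknown": "1",
}
grouped = group_by_scope(response_headers)
print(grouped[Scope.RESOURCE], grouped[Scope.REPRESENTATION], grouped[Scope.MESSAGE])
```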

      > (I'm sticking with the machine-processable languages as examples because
      > human-processable ones like HTML have a level of ambiguity traditional in
      > human natural language but quite out of place in the WWW infrastructure --
      > or the Semantic Web. You can argue that people say "I work for w3.org" or
      > "http://www.amazon.com/shrdlu?asin=314159265359" is a great book, just as
      > they happily say "Moby Dick weighs over three thousand tonnes", "Moby Dick
      > was finished over a century ago" and "I left Moby Dick on the beach"
      > without expecting to be misunderstood. So we won't use human language as a
      > guide when defining unambiguously the question of what a URI identifies.)

      So you intend to define meaning without reference to humans? I thought
      that the purpose of the Semantic Web was to help humans understand
      and operate within the realm of interrelated resources. What good does
      it do if the human is first required to translate their "real world"
      reference to one that applies to the less-messy-than-the-real-world
      Semantic Web? I think that is an interface error.

      > Roy Fielding argues that the URI which I associate with his web page
      > actually identifies him.

      No, I do not. I never have. I even explicitly corrected you on
      this very point while we were in your office talking this over.
      My home page URI identifies my home page, where I go for a hypertext
      representation of the topics that I am working on so that I can
      easily jump from there to other resources of interest. It is a
      public resource so that others can do the same. Some people use
      that URI as an indirect way of identifying me, but only in the
      sense that the resource contains more information about me.
      However, someone could build a system that accepts the URI of a
      home page as an indirect identifier of a person and performs some
      action based on that relationship that affects me as a person, such
      as calling my phone number, just as any identifier can be indirectly
      used in ways that are not expected by the identifying authority.

      Mark Baker has argued in the past that an http URI does identify
      him, but I think that is reasonable since he owns the naming authority.
      If you have complete control of the namespace, the identifiers
      within it can identify anything provided that you only use them
      consistently to identify that thing. I doubt that is the case for
      his home page URI, but it could be for some other URI.

      > He argues that conventionally people use the
      > identifier to identify the person. However, consider another Roy Fielding
      > page put together by friends who found a photograph of him with no clothes
      > on. A lot of content filtering systems would collect that URI and put it
      > into their list. Even though the photo had many representations which
      > different devices could download using content negotiation and/or CC/PP
      > (color or black and white and various different resolutions) the URI itself
      > would be listed as containing nudity. The public are very aware of
      > different works on the web, even though they have the same topic.

      Yikes, what an unpleasant mental picture. Does that identifier provide
      representations of my state, or the state of a nude picture taken at
      some particular point in time? The public is capable of distinguishing
      the two over time, and thus those two resources do not have the same
      topic/meaning/identity, even though there does exist a relation between
      the two. Fortunately, I don't have "friends" like that.

      > 2.2.3 Indirect identification
      >
      > You can argue that a web page indirectly identifies something, of course,
      > and I am quite happy with that. If you identify an organization as that
      > which has home page http://www.w3.org, then you are not saying that
      > http://www.w3.org/ itself is that organization. This scenario is very very
      > common, just as we identify people and things by their "unambiguous
      > properties": books by ISBN, people by email address, and so forth. So long
      > as we don't think that the person is an email address, we are fine. Some
      > people have thought that in saying "An HTTP URI can't identify an
      > organization" I was ruling out this indirect identification, but not so: I
      > am very much in favor of it. The whole SQL world, after all, only
      > identified things indirectly by a key property. This causes no
      > contradiction. Perhaps I should say "An HTTP URI can't directly identify
      > an organization". But by "identify" I mean "directly identify", and
      > "identity" is a fairly direct word and concept, so I will stick with it.

      Identity is not a fairly direct word and concept -- that is why I posted
      a very long description of what it means to www-tag, including its
      definition according to webster.com. If this is the source of our
      disagreement, then I give up. All identifiers are by their very nature
      an indirect means to establishing identity.

      An http URI, when it is dereferenced, activates a mechanism whereby
      the string of characters in the URI is used to select a bag of bits
      that is supposed to represent the state of the abstract thing
      identified as a resource via the http naming authority. Does that
      imply that the resource is an HTTP mechanism? No. The URI is not
      the resource, the mechanism is not the resource, and the bag of bits
      is not the resource. Therefore, an http resource is always
      restricted to indirect identification. It is simply impossible to
      "directly" identify a resource for which the representation is
      allowed to change over time.

      Consumers don't give a rat's ass about the mechanism beyond a desire
      that it consistently provide representations that match the semantics
      they intended by referencing it in the first place. The semantics
      are the resource. Usually the semantics correspond to a "living
      document about a particular subject", but not always.

      > Conclusion so far: the idea that a URI identifies the thing the document
      > is about doesn't work because we can only use a URI to identify one thing
      > and we have and already do use it to identify documents on the web.

      No, we use it to obtain documents via the Web by identifying a
      resource and asking for a representation of its current state.
      We indirectly identify information within the representations,
      the web page, by referring to it as the state obtained by doing
      a GET on a resource's URI. All http-based Web pages are only
      indirectly identified by http URI, since a Web page is a
      representation obtained at an instant in time and not the
      resource itself.

      > 2.2 Author definition
      >
      > So how can we break free of that line of reasoning? We can try throwing
      > away the rule that a URI identifies only one thing.

      I don't.

      > 2.3 Logic disambiguates
      >
      > Otherwise, we have to try another way of letting the URI mean sometimes
      > one thing and sometimes another. Here is another.

      Nope, not here either.

      > 2.4 Different Properties
      >
      > Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate using the fact that
      > different properties will refer to different levels.

      Machines do have to know the realm of identification. If an identifier
      is used for identifying multiple things in the same realm, then it
      is clearly ambiguous. However, if the machine knows that it is using
      the URI in a context that is clearly direct, such as xmlns attributes,
      then there is no ambiguity just because it is used differently in
      other realms. It is still better though to remain unambiguous, which
      is the case for the examples I described. An xmlns attribute doesn't
      access the resource -- it only uses the name of the resource as an
      identifier. Whether or not the same identifier can be used in a GET
      is irrelevant to the mechanism of xmlns.
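
The xmlns point can be illustrated with a short sketch using Python's standard `xml.etree.ElementTree`. The namespace URI and element names below are made up for the example. The parser treats the URI purely as a name: it is compared as a string and never dereferenced, so whether a GET on it would succeed plays no part.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; nothing is fetched from it.
NS = "http://example.org/ns/invoice"

xml_text = f'<inv:invoice xmlns:inv="{NS}"><inv:total>42</inv:total></inv:invoice>'
doc = ET.fromstring(xml_text)

# ElementTree folds the namespace URI into the tag name as an opaque string.
total = doc.find(f"{{{NS}}}total")

print(doc.tag)
print(total.text)
```

The lookup succeeds with no network access at all, which is the sense in which the identifier is used directly in this realm.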

      > 2.5 Extra info with URI
      >
      > Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate using extra
      > information which will be provided in other ways along with the URI

      No, that twists the argument. The argument is that people use identifiers
      in an ambiguous way because there is no such thing as universal agreement
      about the semantics of a resource if the publisher does not make those
      semantics explicit. The true nature of a resource cannot be observed at
      any instant in time because its definition depends on how much it varies
      over time, which is outside the perceptive capacity of humans and
      machines. The publisher can improve understanding of the semantics
      of a resource by adding external assertions, but the identity of the
      resource itself does not change by those assertions.

      > 2.6 Different meaning in different context
      >
      > Actually the URI has to, like in English, identify these different
      > things ambiguously. Machines have to disambiguate them by context: A
      > catalog card will talk about a document. A car catalog will talk about a
      > car.

      The URI doesn't identify different things in different contexts. It is,
      however, used for different purposes in different contexts. An xmlns
      uses the URI of a resource on the Web (or not) to directly identify a
      namespace whose state may (or may not) be indirectly described by a
      representation found by performing a GET on the resource identified by
      that URI. There is no ambiguity here.

      > 2.7 Change it for the Semantic Web
      >
      > They may have been used to identify documents up till now, but for RDF
      > and the Semantic Web, we should change that and start to use them as the
      > Dublin Core and RDF Core groups have for abstract concepts.

      There is no need. In any case, the world doesn't need another AI system
      for describing semantic networks in isolation -- they are only useful
      when they are allowed to be enmeshed in the real world.

      > 2.8 Abandon any identification of abstract things

      I can't imagine anything more abstract than an identifier that identifies
      "Roy's favorite quote from TimBL", which is bound to change over time.
      I refuse to let anyone stick an HTTP server in my head just because that
      is the only way to directly identify that resource. They will have to
      make do with a URI that I control, wherein I manage an appropriate
      mapping and occasionally drop a bag of bits that represents the
      last recorded value of the resource.

      > 3. Conclusion
      >
      > I didn't have this thought out a few years ago. It has only been in
      > actually building a relatively formal system on top of the web
      > infrastructure that I have had to clarify these concepts in my own mind.
      > I am forced to conclude that modeling the HTTP part of the web as a web
      > of abstract documents is the only way to go which is practical and, by
      > the philosophical underpinnings of the WWW, tenable.

      I still disagree, particularly since you still haven't described how
      you can hold that position for resources that are clearly not documents,
      namely the resources that are service gateways to other systems. POST
      does not always mean "append to this document". Oh, wait ...

      > Q: Some HTTP URIs can be POSTed to. Can you still say they identify
      > documents?
      >
      > A: Well, some HTTP URIs can't be accessed at all, and some access is not
      > allowed, and yes, some URIs are not only documents but also can be posted
      > to. So the object is more complex than simply a document. But that it
      > has this extra functionality doesn't make it any less an HTTP document
      > formally. Something can have extra features and still remain in the same
      > class of things.

      That makes no sense to me at all. There is no stretch of the imagination
      that would allow an HTTP POST to a URI that consistently identifies an
      HTTP-to-GSM SMS message gateway to be formally equivalent to a document.
      REST defines the message to be a representation of a document and the
      service to be a resource that consumes representations, resulting in
      a state change in the service that is reflected in the response message.
      One could claim that the state of all SMS messages flying through the
      GSM network is identified by this URI, and that therefore we are only
      appending to that state, but that clearly is not the intention of the
      publisher of the URI and is not consistent with the result of a GET
      on that same URI, and is certainly not useful for reasoning about the
      interaction. It is an invalid model of the system.
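
A rough sketch of the gateway case, assuming a toy in-memory model with no real HTTP or GSM involved. The class `SmsGateway`, its methods, and the response strings are all hypothetical. It models the service as a resource that consumes a representation on POST, changes its own state, and reports that change in the response, while a GET returns a description of the service rather than an accumulation of posted messages.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SmsGateway:
    """Hypothetical stand-in for an HTTP-to-GSM SMS gateway resource."""
    sent_count: int = 0
    outbox: List[str] = field(default_factory=list)  # stands in for the GSM network

    def post(self, message: str) -> str:
        """Consume a representation (the message) and change service state."""
        self.outbox.append(message)
        self.sent_count += 1
        # The response reflects the state change; the wording is an assumption.
        return f"202 Accepted: message {self.sent_count} queued"

    def get(self) -> str:
        """Describe the service; this is not a document built from the posted messages."""
        return f"SMS gateway, {self.sent_count} message(s) accepted"


gateway = SmsGateway()
response = gateway.post("hello")
description = gateway.get()

print(response)
print(description)
```

Under these assumptions the posted text never appears in what GET returns, so reading POST as "append to this document" does not describe the interaction.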

      Cheers,

      Roy T. Fielding, Chief Scientist, Day Software
      (roy.fielding@...) <http://www.day.com/>

      Chairman, The Apache Software Foundation
      (fielding@...) <http://www.apache.org/>

      ----- End forwarded message -----