Re: [rng-users] Lets standardize PI for associating Relax NG schema with XML document
- On Jul 3, 2005, at 00:42, B Tommie Usdin wrote:
> At 11:14 PM +0200 7/2/05, Jirka Kosek wrote:
>> Primary motivation (although not stated clearly) for my proposal was
>> not validation, but guided editing of XML document. Describing
>> complex validation is out of scope of my proposal, something much
>> more powerful like NRL could be used.
> But if there is a "standard" there will be pressure to use it for
> everything conceivable, whether it is appropriate or not.
I agree. I think it is a desirable feature that the RELAX NG validation
process takes two *independent* inputs: the schema and the document.
(Mentioned also by James Clark in the famous IETF post:
I can see three main cases here:
1) Apps that want to check their input in an off-the-shelf manner
2) Quality assurance tools
3) Editors with autocomplete/error high-lighting
In case 1) an application receives input from an outside source and
cannot trust that the outside source produces correct output (correct
in the sense that the receiving application works properly when using
it as input). To avoid hand-coding checks for all the possible
error situations, the developer of the application decides to embed a
RELAX NG validator and an appropriate schema. Then the hand-coded
part of the application can trust that anything it sees conforms to the
schema.
If the input can smuggle in its own rules the way DOCTYPE and
schemaLocation allow it to do, the app can no longer trust the
validation stage, which defeats the whole point of embedding the
validator. Therefore, I think a PI for the input to specify its own
schema is totally wrong considering case 1).
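Case 1) can be sketched roughly as follows. This is only an illustration
of the "two independent inputs" idea, not any particular validator's API:
the names accept_input, RejectedInput and Validator are made up for this
sketch, and the validate argument stands in for whatever RELAX NG
validator the application actually embeds.

```python
from typing import Callable

# Hypothetical type: any function that checks a document against a schema
# and returns True when the document conforms. A real application would
# wrap an actual RELAX NG validator here; this sketch assumes none.
Validator = Callable[[bytes, bytes], bool]


class RejectedInput(Exception):
    """Raised when untrusted input does not conform to the embedded schema."""


def accept_input(document: bytes, embedded_schema: bytes,
                 validate: Validator) -> bytes:
    # The schema is supplied by the application itself. Nothing found
    # inside the document (DOCTYPE, schemaLocation, a PI) is consulted
    # when deciding which rules apply.
    if not validate(document, embedded_schema):
        raise RejectedInput("input does not conform to the embedded schema")
    # From here on, the hand-coded part may rely on the schema's guarantees.
    return document
```

The point of the signature is that the caller has no way to let the
document pick its own schema: the only schema in play is the one the
application passes in.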
In case 2) a user has a document (not necessarily created by the user
him/herself) and is interested in the syntactic correctness of the
document. If the document is allowed to define the rules, the user is
getting the answer to the question "Does this document conform to the
grammar it sets for itself?"
http://validator.w3.org/ works like this. It gives you a little badge
of validity to show off, but it doesn't tell you if the internal subset
was used to introduce home-grown rules radically different from what
the "This document is valid FooML" message implies. All you know is
that whoever produced the document managed to adhere to his/her own
rules. Then what? The rules could be anything.
http://hsivonen.iki.fi/validator/ - being a RELAX NG validator - works
differently. It allows the user to pose the (in my opinion much more
useful) question "Does this document conform to this grammar?" It does
not give out a badge, but after the validation the user knows what
schema the document did or did not conform to. I think RELAX NG-based
QA tools would regress to a less useful level if the user of a QA tool
only knew that the document is internally consistent without knowing
whether it adheres to the particular grammar the user is interested in.
Therefore, I think a PI for the input to specify its own schema would
harm case 2).
I agree that in case 3) it is desirable to use a RELAX NG schema for
editing assistance. However, I think such use is a private matter
between the user and his/her editor and, therefore, it is not necessary
to expose such private editing method details to whoever subsequently
receives the document. Moreover, the schema repository is likely to be
local, so the most obvious references, i.e. installation-specific file
system paths, would be useless to others, making the PI useful only
privately. OTOH, registering common identifiers for schemas and
abstracting away the file paths would probably be overkill, and for
the same effort you could use some configurable association method that
does not contaminate the document.
Also, having to contaminate the document itself with editing
process-specific artifacts can be a sign of a design flaw in the
editor. In the common cases, the schema could be bound to the root
namespace or to the filename extension (as is customary with
programming language-specific syntax highlighting in text editors).
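As a rough sketch of such a configurable association kept outside the
document: the mapping tables, schema paths and function names below are
all invented for illustration (they are not taken from any existing
editor), and the sketch assumes the document is well-formed enough for
the root element to be parsed.

```python
import os
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical editor-side configuration. The paths and bindings are
# examples only; a real editor would load them from its own settings.
SCHEMA_BY_NAMESPACE = {
    "http://www.w3.org/1999/xhtml": "schemas/xhtml.rnc",
    "http://docbook.org/ns/docbook": "schemas/docbook.rnc",
}
SCHEMA_BY_EXTENSION = {
    ".xhtml": "schemas/xhtml.rnc",
    ".dbk": "schemas/docbook.rnc",
}


def root_namespace(xml_text: str) -> Optional[str]:
    """Return the namespace URI of the root element, or None if it has none."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


def schema_for(filename: str, xml_text: str) -> Optional[str]:
    """Pick a schema from local configuration, never from the document."""
    ns = root_namespace(xml_text)
    if ns in SCHEMA_BY_NAMESPACE:
        return SCHEMA_BY_NAMESPACE[ns]
    # Fall back to the filename extension, as text editors do for
    # syntax highlighting.
    ext = os.path.splitext(filename)[1].lower()
    return SCHEMA_BY_EXTENSION.get(ext)
```

With tables like these, the association lives entirely in the editor's
configuration, so the document that is later passed on to others carries
no trace of it.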
Since case 3) seems more like a private issue, I think central
endorsement of a standard PI is not necessary for case 3).
BTW, I think DOCTYPE and schemaLocation are design bugs, because they
foil the point of cases 1) and 2).
- MURATA Makoto (FAMILY Given) wrote:
> If the camp trying to standardize PIs is not the majority
> of the RELAX NG community, I do not think that PIs will take off.
> Here is my understanding of the current status. Please let me know
> if I misinterpret somebody.
> For schema-associating PIs
> Jirka Kosek
> Robin Berjon
> George Cristian Bina
Sorry to answer so late, I've been on vacation. Just for the record, if
you are going to be counting heads in the RNG community, I think I would
be best counted as "neutral". My take on this is that a schema PI is
just as bad an idea as a stylesheet PI, which is to say that it's most
of the time a very bad idea (and in the absence of a processing model,
dreadfully underspecified in its interactions with other specs at that),
but *if* people are going to be doing it anyway (as seems to be the
case) then I would prefer that there is a standard made by people who
understand the issues and limitations of this approach rather than ad
hoc proprietary options made by people who are probably smart and
probably understand some of the problems, but won't benefit from the
head-banging that some form of community standard would get (or rather,
Senior Research Scientist