rng + schematron questions

  • Bruce D'Arcus
    Message 1 of 4 , Aug 1 8:16 AM
      I'm wondering if anyone has any practical suggestions about whether or
      not I should consider mixing RNG and schematron.

      I have a schema to represent citation styling configuration, and I
      want to add some restrictions that are possible in RNG, but then don't
      map well to XSD. I don't consider it an absolute requirement that I be
      able to convert the schema to XSD, but I guess someone is likely to
      ask for it if it's successful.

      You can see what I want with this schema fragment:

      cs-citationstyle =
        element cs:style {
          cs-author-date.class
          | cs-number.class
          | cs-label.class
          | cs-note.class
          | cs-annotated.class
          | cs-custom.class
        }

      ... and then I have stuff like:

      ## The author-date class assumes a bibliography, sorted using an author-date
      ## algorithm. Where an author has more than one entry in a given year, the
      ## entries are disambiguated with a suffix (e.g. Doe, 1999a). Because of the
      ## importance of the author variable, it is essential to support rich
      ## substitution behavior. This can and should be configured in
      ## cs:defaults/cs:author/cs:substitution.
      cs-author-date.class =
        attribute class { "author-date" },
        cs-info,
        cs-defaults,
        cs-citation.author-date,
        cs-bibliography

      ... and I can also have a special bibliography pattern to ensure that
      there is a sort element child that has an "algorithm" attribute set to
      "author-date", etc.

      That sort of thing.

      Or, I can have a simpler RNG schema that is easier to transform to
      XSD, and have a few rules like:

      [
        s:rule [
          # Restrict the rule to author-date styles so that other classes
          # are not flagged by the assertion.
          context = "/cs:style[@class='author-date']"
          s:assert [
            test = "cs:bibliography/cs:sort/@algorithm='author-date'"
            "Must use author-date sorting for the author-date class."
          ]
        ]
      ]

      The question I have is, what sort of practical implementation
      trade-offs might I run into with each approach? I really need these
      instances to be easy to validate for developers that may not be that
      comfortable with XML. Is it common, for example, to write little
      scripts, and perhaps expose web services for this sort of thing?
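As a rough illustration of the "little script" idea, here is a minimal sketch that checks only the author-date rule above, using nothing but the Python standard library. The namespace URI and the function name `check_author_date` are assumptions made for the example (the thread does not give the real `cs:` namespace), and this is not a substitute for a real RELAX NG or Schematron validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the real URI bound to "cs:" is not given in the thread.
CS_URI = "http://example.org/ns/citation-style"
CS_NS = {"cs": CS_URI}

MESSAGE = "Must use author-date sorting for the author-date class."


def check_author_date(xml_text):
    """Return a list of error messages; an empty list means the rule holds."""
    root = ET.fromstring(xml_text)
    errors = []
    # The rule only applies to a cs:style root that declares class="author-date".
    if root.tag == f"{{{CS_URI}}}style" and root.get("class") == "author-date":
        sorts = root.findall("cs:bibliography/cs:sort", CS_NS)
        # Mirror the XPath test: at least one sort must use the author-date algorithm.
        if not any(s.get("algorithm") == "author-date" for s in sorts):
            errors.append(MESSAGE)
    return errors
```

A developer could call `check_author_date(open(path).read())` and print whatever comes back; wrapping the same function in a small web service would be a separate step.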

      Bruce
    • Jirka Kosek
      Message 2 of 4 , Aug 1 8:36 AM
        Bruce D'Arcus wrote:

        > I'm wondering if anyone has any practical suggestions about whether or
        > not I should consider mixing RNG and schematron.

        It is a very useful feature, highly recommended.

        > The question I have is, what sort of practical implementation
        > trade-offs might I run into with each approach?

        The former approach is friendlier to editing applications. Editors
        that support RELAX NG (e.g. nxml-mode, oXygen, XXE) will offer different
        content models depending on the value of the attribute. This can
        dramatically improve data entry if you are using general XML editors.

        So my suggestion is to use RELAX NG to express the things which can be
        expressed in RELAX NG without too much effort. The rest can be done in
        Schematron in a very compact way.

        Jirka

        --
        ------------------------------------------------------------------
        Jirka Kosek e-mail: jirka@... http://www.kosek.cz
        ------------------------------------------------------------------
        Professional training and consulting in XML technologies.
        Take a look at our newly launched website http://DocBook.cz
        Detailed overview of training courses http://xmlguru.cz/skoleni/
        ------------------------------------------------------------------
        Upcoming training dates:
        ** XSLT 23.-26.10.2006 ** XML schemas 13.-15.11.2006 **
        ** DocBook 11.-13.12.2006 ** XSL-FO 11.-12.12.2006 **
        ------------------------------------------------------------------
        http://xmlguru.cz Blog mostly about XML for English readers
        ------------------------------------------------------------------
      • Daniel Mahler
        Message 3 of 4 , Aug 1 10:11 AM
          Bruce,

          I personally favour the pure RNG approach where practical. Apart from
          the editor support that Jirka mentioned, a pure RNG schema is better
          for attaching semantics to your schema. For example, you can then
          naturally use tools like relaxngcc to extract the semantic content of
          documents by annotating the schema.

          There are actually two distinct reasons for this. The obvious one is
          that relaxngcc and data binding tools only understand the pure RNG
          part. The more subtle reason is that in the pure RNG approach, your
          grammar actually reflects the semantics of your data. The Schematron
          approach effectively allows you to conflate semantically different
          cases into the same schema production, so when processing the
          document, your code must recover the semantic distinctions itself.
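To make that last point concrete, here is a minimal sketch of what consumer code tends to look like when the grammar is coarse: it has to re-derive the style class and re-check facts that a fine-grained grammar would already have guaranteed. The namespace URI, the function name `classify_style`, and the class values other than "author-date" (inferred from the pattern names in Bruce's fragment) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the thread does not state the real one for "cs:".
CS = "http://example.org/ns/citation-style"

# Only "author-date" appears in the thread; the rest are guessed from pattern names.
KNOWN_CLASSES = {"author-date", "number", "label", "note", "annotated", "custom"}


def classify_style(xml_text):
    """Recover the semantic case of a cs:style document by hand."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{CS}}}style":
        raise ValueError("not a cs:style document")
    cls = root.get("class")
    if cls not in KNOWN_CLASSES:
        raise ValueError(f"unknown style class: {cls!r}")
    # With a coarse grammar, nothing guarantees the bibliography is present,
    # so the processing code has to check for it again.
    has_bibliography = root.find(f"{{{CS}}}bibliography") is not None
    if cls == "author-date" and not has_bibliography:
        raise ValueError("author-date style without a bibliography")
    return cls, has_bibliography
```

With one grammar production per class, a schema-driven tool could dispatch on the production that matched instead of repeating these checks in every consumer.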

          An extreme example of this is the encoding of the MARC21 bibliographic
          standard in XML. MARC21slim.xsd is a very coarse schema which requires
          additional Schematron rules to ensure validity according to the MARC21
          standard. Deriving tools to populate an RDBMS or RDF store with
          information from MARC21slim records using the schema plus Schematron
          rules requires convoluted programming logic. Since the MARC people use
          XSD, which does not allow content-dependent constraints, they have
          actually devised a separate encoding (MODS) to better reflect the
          semantic structure of MARC records and thus make it easier to write
          declarative, structure-driven tools.

          With RNG it would be possible to simply write a more fine-grained
          schema for the simple encoding, very much like the first alternative
          in your example. While this would make the schema more complex,
          deriving processing tools from the schema becomes much more natural.

          Hope that helps

          cheers
          Daniel



        • Bruce D'Arcus
          Message 4 of 4 , Aug 1 11:44 AM
            Thanks for the comments. I think I've figured out a way to do this.

            I'll have the normative schema pure RNG, with all the constraints. If
            I end up needing support for XSD (let's hope I don't), I can then have
            a customization schema that simultaneously includes the Schematron
            rules. That achieves the same thing, but keeps the normative schema
            pure (and tight).

            Daniel, don't even get me started on the LoC's technology and design
            choices. They should have used RDF, and even if they didn't, they
            should have either used RNG for the schema technology, or designed the
            schemas in such a way that they are at least easier to validate in
            XSD. A whole lot of critical logic is encoded in (optional)
            attributes.

            Bruce