Finding News Taxonomies [was: RE: Towards a TAG consideration of CURIEs]

  • Misha Wolf
    Message 1 of 7 , Apr 7 8:14 AM
      Richard Cyganiak wrote:

      > I don't really want to get into yet another hash vs. slash debate,
      > but I do want to point out that it is possible to get what you want
      > with option 1, though it requires a tiny bit of extra work. Example
      > Apache configuration:
      >
      > Redirect 303 /NewsCodes/ http://www.iptc.org/docs/newscodes.html#
      >
      > This will transparently redirect
      > http://www.iptc.org/NewsCodes/123456
      > to http://www.iptc.org/docs/newscodes.html#123456. HTTP redirects
      > are hardly rocket science.
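As a rough illustration of the mapping Richard describes, here is a minimal Python sketch of the prefix rewrite that the quoted `Redirect 303` line expresses. The function name `redirect_target` is hypothetical, and the sketch only mimics the string mapping from the example; it makes no claim about how Apache implements the directive internally.

```python
# Hypothetical sketch of the prefix mapping expressed by
#   Redirect 303 /NewsCodes/ http://www.iptc.org/docs/newscodes.html#
PREFIX = "/NewsCodes/"
TARGET = "http://www.iptc.org/docs/newscodes.html#"


def redirect_target(path):
    """Return the URL a 303 response would point to, or None if the prefix does not match."""
    if not path.startswith(PREFIX):
        return None
    # Whatever follows the matched prefix is appended to the target URL.
    return TARGET + path[len(PREFIX):]
```

Under these assumptions, `redirect_target("/NewsCodes/123456")` yields the `newscodes.html#123456` URL quoted above.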

      This:
      http://www.iptc.org/docs/newscodes.html#123456
      is not legal, as "123456" is an illegal fragment identifier.

      Also, not every maintainer of a taxonomy has control over the
      behaviour of the server hosting the taxonomy.

      Misha Wolf
      News Standards Manager, Reuters, http://www.reuters.com/
      Vice Chair, News Architecture WP, IPTC, http://www.iptc.org/

      This email was sent to you by Reuters, the global news and information company.
      To find out more about Reuters visit www.about.reuters.com

      Any views expressed in this message are those of the individual sender,
      except where the sender specifically states them to be the views of Reuters Limited.

      Reuters Limited is part of the Reuters Group of companies, of which Reuters Group PLC is the ultimate parent company.
      Reuters Group PLC - Registered office address: The Reuters Building, South Colonnade, Canary Wharf, London E14 5EP, United Kingdom
      Registered No: 3296375
      Registered in England and Wales
    • John Cowan
      Message 2 of 7 , Apr 7 9:23 AM
        Misha Wolf scripsit:

        > This:
        > http://www.iptc.org/docs/newscodes.html#123456
        > is not legal, as "123456" is an illegal fragment identifier.

        Not exactly. We can decompose this into three claims, two false
        and one true.

        1) "123456" is an invalid fragment: false. If you look at the
        syntax rules in RFC 3986, you see that every character in a fragment
        can be a digit.

        2) "123456" can't be the value of an XML attribute of type ID: false.
        An XML document may contain attributes of type ID in one of two
        ways: every attribute with the name "xml:id" is of type ID, and so
        is any attribute declared in the DTD (internal or external) to have
        type ID. Such attributes may contain any value, and the document is
        well-formed.

        3) "123456" can't be the value of an attribute of type ID in a
        *valid* XML document: true. However, plenty of documents are not
        valid: in particular, any document without a DTD is not valid, and
        there is nothing wrong with having a DTD without expecting or requiring
        validity.
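Claims 1 and 3 above can be checked mechanically. The following is a minimal Python sketch, not an authoritative validator: `FRAGMENT_RE` transcribes the `fragment` rule of RFC 3986, while `ASCII_NAME_RE` is a deliberately simplified, ASCII-only approximation of the XML `Name` production that an ID value must match in a valid document. Both helper names are hypothetical.

```python
import re

# RFC 3986: fragment = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
FRAGMENT_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*")

# ASCII-only simplification of the XML Name production (assumption:
# the real production also admits many non-ASCII characters).
ASCII_NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_rfc3986_fragment(s):
    """True if s is syntactically a legal URI fragment."""
    return FRAGMENT_RE.fullmatch(s) is not None


def is_ascii_xml_name(s):
    """True if s matches the simplified Name pattern (must not start with a digit)."""
    return ASCII_NAME_RE.fullmatch(s) is not None
```

With these definitions, "123456" passes the fragment check but fails the Name check, which is the distinction between claims 1 and 3.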

        --
        John Cowan cowan@...
        "Not to know The Smiths is not to know K.X.U." --K.X.U.
      • Misha Wolf
        Message 3 of 7 , Apr 7 9:33 AM
          John Cowan wrote:

          > Misha Wolf scripsit:
          >
          > > This:
          > > http://www.iptc.org/docs/newscodes.html#123456
          > > is not legal, as "123456" is an illegal fragment identifier.
          >
          > Not exactly. We can decompose this into three claims, two false
          > and one true.
          >
          > 1) "123456" is an invalid fragment: false. If you look at the
          > syntax rules in RFC 3986, you see that every character in a
          > fragment can be a digit.
          >
          > 2) "123456" can't be the value of an XML attribute of type ID:
          > false. An XML document may contain attributes of type ID in one
          > of two ways: every attribute with the name "xml:id" is of type ID,
          > and so is any attribute declared in the DTD (internal or external)
          > to have type ID. Such attributes may contain any value, and the
          > document is well-formed.
          >
          > 3) "123456" can't be the value of an attribute of type ID in a
          > *valid* XML document: true. However, plenty of documents are not
          > valid: in particular, any document without a DTD is not valid, and
          > there is nothing wrong with having a DTD without expecting or
          > requiring validity.

          I'm out of my depth here. At the W3C AC meeting in Edinburgh, last
          year, I understood Henry to be stating that something like:

          http://www.iptc.org/docs/newscodes.html#123456

          is not legal, as "123456" is an illegal fragment identifier. It may
          be that the resulting XML document is legal, but that the use in
          (X)HTML is illegal?

          Misha Wolf
          News Standards Manager, Reuters, http://www.reuters.com/
          Vice Chair, News Architecture WP, IPTC, http://www.iptc.org/

        • Misha Wolf
          Message 4 of 7 , Apr 7 1:26 PM
            Shane P. McCarron wrote:

            > John Cowan wrote:
            > > 3) "123456" can't be the value of an attribute of type ID in a
            > > *valid* XML document: true. However, plenty of documents are not
            > > valid: in particular, any document without a DTD is not valid,
            > > and there is nothing wrong with having a DTD without expecting or
            > > requiring validity.
            > >
            > I guess... however, note that XHTML family documents are REQUIRED
            > to be valid. That was the IPTC's goal when they started this
            > thing. Not sure if it is still their goal... defining a syntax
            > that breaks the rules would seem misguided to me.

            Which end of the hyperlink are we discussing? The use, within a News
            story, of a specified concept from a specified taxonomy, or the
            description of this concept on some Web page?

            The IPTC Standards (NewsML-G2, SportsML-G2, EventsML-G2, etc) are not
            (X)HTML-based. The Web page is, of course, (X)HTML.

            The issue, as I understand it, is not what is legal within the News
            story, but rather what is legal within the Web page.

            Misha Wolf
            News Standards Manager, Reuters, http://www.reuters.com/
            Vice Chair, News Architecture WP, IPTC, http://www.iptc.org/

          • John Cowan
            Message 5 of 7 , Apr 7 2:50 PM
              Misha Wolf scripsit:

              > Which end of the hyperlink are we discussing? The use, within a News
              > story, of a specified concept from a specified taxonomy, or the
              > description of this concept on some Web page?

              The latter.

              > The IPTC Standards (NewsML-G2, SportsML-G2, EventsML-G2, etc) are not
              > (X)HTML-based. The Web page is, of course, (X)HTML.

              Why? It could just as well (perhaps better) be some XML + CSS.

              --
              John Cowan cowan@... http://ccil.org/~cowan
              I must confess that I have very little notion of what [s. 4 of the British
              Trade Marks Act, 1938] is intended to convey, and particularly the sentence
              of 253 words, as I make them, which constitutes sub-section 1. I doubt if
              the entire statute book could be successfully searched for a sentence of
              equal length which is of more fuliginous obscurity. --MacKinnon LJ, 1940
            • John Cowan
              Message 6 of 7 , Apr 9 2:50 PM
                Booth, David (HP Software - Boston) scripsit:

                > I'm confused by your point #3 below, as it seems to be implying that a
                > document without a DTD could legitimately have an attribute of type ID
                > with value "123456", and after looking at the specs I don't see how it
                > can. Did I miss something? Detailed analysis below.

                An xml:id processor will report a constraint violation when it sees
                "xml:id='123456'" in an element, but it will perform ID type assignment
                anyway, as noted in Section 4 of the xml:id Recommendation.

                In addition, a conformant XML processor must report an attribute declared
                to be of type ID as having that type, no matter what the value may be.
                For example, the document:

                <!DOCTYPE items [
                <!ATTLIST item id ID #IMPLIED>
                ]>
                <items>
                <item id="123456">...</item>
                ...
                </items>

                is not valid, but the id attribute of the item element is of type ID.
                So you can use either xml:id or a (possibly partial) DTD to force
                an attribute to be of type ID, and ignore any xml:id or validation errors.

                Neither of these devices is available in valid XHTML documents, of course.
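The DTD case above can be observed with a non-validating parser. This is a minimal sketch using Python's standard `xml.parsers.expat` module; the helper name `inspect_ids` is hypothetical. It records the attribute types declared in the internal subset and collects the values of attributes declared as ID, without performing any validation.

```python
import xml.parsers.expat

# The example document from the message above, unchanged.
DOC = """<!DOCTYPE items [
<!ATTLIST item id ID #IMPLIED>
]>
<items>
<item id="123456">...</item>
...
</items>"""


def inspect_ids(doc):
    """Return (declared attribute types, values of ID-typed attributes)."""
    declared = {}
    id_values = []
    parser = xml.parsers.expat.ParserCreate()

    def on_attlist(elname, attname, type_, default, required):
        # Called once per attribute declared in an ATTLIST.
        declared[(elname, attname)] = type_

    def on_start(name, attrs):
        for attname, value in attrs.items():
            if declared.get((name, attname)) == "ID":
                id_values.append(value)

    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)  # non-validating: no error for the numeric ID value
    return declared, id_values
```

Calling `inspect_ids(DOC)` parses the document without complaint and reports the `id` attribute of `item` as declared with type ID, even though its value would fail validation.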

                --
                John Cowan http://www.ccil.org/~cowan cowan@...
                You are a child of the universe no less
                than the trees and all other acyclic
                graphs; you have a right to be here.
                --DeXiderata by Sean McGrath
              • John Cowan
                Message 7 of 7 , Apr 11 2:05 PM
                  Booth, David (HP Software - Boston) scripsit:

                  > I guess an approach that depends on routinely ignoring xml:id processing
                  > errors and not validating the XML does not seem to me like a wise design
                  > choice,

                  A matter of opinion and/or taste. People do routinely work with invalid
                  XML documents, which is the whole reason for introducing xml:id;
                  xml:id processing errors reintroduce a little bit of validity, which is
                  why processors SHOULD report them for those who care. But there is
                  no MUSTard about what XML applications do with such errors.

                  --
                  John Cowan cowan@... http://www.ccil.org/~cowan
                  Does anybody want any flotsam? / I've gotsam.
                  Does anybody want any jetsam? / I can getsam.
                  --Ogden Nash, No Doctors Today, Thank You