Re: [rss-public] Re: data-types-url

  • Sam Ruby
    Message 1 of 8 , Feb 28, 2006
      rcade wrote:
      > --- In rss-public@yahoogroups.com, Sam Ruby <rubys@...> wrote:
      >
      >>Again, I don't think all this background needs to be included in the
      >>spec, but a simple statement like the one suggested above would be
      >>appropriate.
      >
      > Is there a workaround that RSS publishers in other languages are using
      > so that they may use IRIs as URLs in RSS, or are they simply forced to
      > employ URLs with the anglicized character set?

      Unless you are setting out to change what RSS 2.0 is, one need only look
      at the baseline 2.0.1-rv-6 (a.k.a. "Harvard") spec for the enclosure
      element, which says quite simply and clearly "The url must be an http
      url". Given this, one could say that "they simply [are] forced to
      employ URLs with the anglicized character set"(*).

      While that SOUNDS bad, in practice it is not. There is a clear and
      reversible (for all but some pesky edge cases of no consequence) mapping
      from IRIs to URIs. And all this is handled transparently by some browsers.

      Try entering either http://www.atemschutzunfälle.de/asu.rdf or
      http://www.xn--atemschutzunflle-7nb.de/asu.rdf in the Feed Validator.
      Either way, you will get the same results. In the validation results,
      you will see the "human friendly" version in the input field. If you
      look at the text link at the bottom of the page, you will see the
      internal or "IDNA" version, one that is completely acceptable to all
      HTTP stacks, and conforms to the RSS 2.0 specification.

      - Sam Ruby

      (*) Note that I am talking about the "host" portion of the URI here.
      Non-ASCII characters may be percent encoded and included in other
      portions of the URI, for example, inside a query string.
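The mapping described in this message can be sketched in Python. This is only an illustration, not the Feed Validator's code: the function name `iri_to_uri` is hypothetical, the host is converted with Python's built-in `idna` codec, and non-ASCII characters elsewhere are percent-encoded as UTF-8. Userinfo in the authority is ignored for brevity.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched outside the host. "%" is kept so that
# existing percent escapes are not encoded a second time.
SAFE_PATH = "/%:@!$&'()*+,;="
SAFE_QUERY = SAFE_PATH + "?"


def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: map an IRI to an ASCII-only URI."""
    parts = urlsplit(iri)

    # Host portion: IDNA ("xn--...") encoding, label by label.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"

    # Other portions: percent-encode non-ASCII characters as UTF-8.
    path = quote(parts.path, safe=SAFE_PATH)
    query = quote(parts.query, safe=SAFE_QUERY)
    fragment = quote(parts.fragment, safe=SAFE_QUERY)

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://www.atemschutzunfälle.de/asu.rdf"))
```

Under these assumptions the printed result is the "IDNA" form quoted above, and an address that is already plain ASCII passes through unchanged.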
    • rcade
      Message 2 of 8 , Mar 1, 2006
        --- In rss-public@yahoogroups.com, Sam Ruby <rubys@...> wrote:
        >IRIs MUST be converted to URIs before being included in an RSS 2.0
        >document.

        I think that the following sentence in data-types-urls serves the same
        purpose without sounding like a new requirement for RSS implementers:

        These elements MUST NOT contain IRIs.

        The word "IRIs" could link to http://www.apps.ietf.org/rfc/rfc3987.html.

        Implementers who are conversant with IRIs would know this means a
        conversion to URLs is necessary in order to be compliant with Really
        Simple Syndication.

        This wouldn't be a change because 2.0.1-rv-6 requires URLs, and IRIs
        are not URLs.
      • Sam Ruby
        Message 3 of 8 , Mar 1, 2006
          rcade wrote:
          > --- In rss-public@yahoogroups.com, Sam Ruby <rubys@...> wrote:
          >
          >>IRIs MUST be converted to URIs before being included in an RSS 2.0
          >>document.
          >
          > I think that the following sentence in data-types-urls serves the same
          > purpose without sounding like a new requirement for RSS implementers:
          >
          > These elements MUST NOT contain IRIs.
          >
          > The word "IRIs" could link to http://www.apps.ietf.org/rfc/rfc3987.html.
          >
          > Implementers who are conversant with IRIs would know this means a
          > conversion to URLs is necessary in order to be compliant with Really
          > Simple Syndication.
          >
          > This wouldn't be a change because 2.0.1-rv-6 requires URLs, and IRIs
          > are not URLs.

          I am fine with that wording. Now let's look at how these two suggestions
          can be combined.

          These elements MUST NOT contain IRIs. IRIs MUST be converted to
          URIs before being included in an RSS 2.0 document.

          The first sentence sounds like "note to non-English people: you are
          screwed". The second sentence says "no you are not, here's a path
          forward, complete with a helpful link to section 3.1 of RFC 3987 which
          tells you what you need to do".

          But however you choose to word it is fine with me.

          - Sam Ruby
        • Sam Ruby
          Message 4 of 8 , Mar 1, 2006
            Sam Ruby wrote:
            > rcade wrote:
            >
            >>--- In rss-public@yahoogroups.com, Sam Ruby <rubys@...> wrote:
            >>
            >>>IRIs MUST be converted to URIs before being included in an RSS 2.0
            >>>document.
            >>
            >>I think that the following sentence in data-types-urls serves the same
            >>purpose without sounding like a new requirement for RSS implementers:
            >>
            >>These elements MUST NOT contain IRIs.
            >
            > I am fine with that wording,

            Upon further reflection, that sentence is misleading.

            The set of valid IRIs is a proper superset of the set of valid
            URIs. So disallowing IRIs would disallow URIs.

            The process defined for converting an IRI which is already a URI to a URI
            is a no-op.

            - Sam Ruby
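The no-op claim can be illustrated with a small Python sketch. It assumes the simplest reading of the conversion outside the host portion: each non-ASCII character is replaced by the percent-encoded bytes of its UTF-8 form, and everything else is left alone. The function name is hypothetical, and `example.org` is a placeholder address.

```python
def percent_encode_non_ascii(text: str) -> str:
    """Hypothetical helper: escape only the non-ASCII characters."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # ASCII characters are already legal in a URI: keep as-is.
            out.append(ch)
        else:
            # Non-ASCII: one %XX escape per UTF-8 byte.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


already_a_uri = "http://example.org/asu.rdf?x=1"
print(percent_encode_non_ascii(already_a_uri) == already_a_uri)
print(percent_encode_non_ascii("q=unfälle"))
```

Because a URI contains only ASCII characters, the loop never takes the second branch for it, which is why converting a URI yields the same URI.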
          • A. Pagaltzis
            Message 5 of 8 , Mar 1, 2006
              * Sam Ruby <rubys@...> [2006-03-01 13:15]:
              >The set of valid IRIs is a proper superset of the set of
              >valid URIs. So disallowing IRIs would disallow URIs.

              I think the correct wording for the spec would be that IRIs with
              non-ASCII characters MUST be given in their punycode-encoded URI
              representation.

              Regards,
              --
              Aristotle Pagaltzis // <http://plasmasturm.org/>
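For readers unfamiliar with punycode, here is a rough Python sketch of what the encoding does to a single host label, using the label from the example earlier in the thread. The helper names are hypothetical, and the sketch skips the normalization step (nameprep) that a full IDNA implementation performs first. Punycode applies to host names only; other parts of the address would be percent-encoded instead.

```python
ACE_PREFIX = "xn--"  # marks a punycode-encoded host label


def label_to_ascii(label: str) -> str:
    """Hypothetical helper: punycode-encode one host label if needed."""
    if label.isascii():
        return label  # nothing to do for plain ASCII labels
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def label_to_unicode(label: str) -> str:
    """Hypothetical helper: reverse of label_to_ascii."""
    if label.startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


encoded = label_to_ascii("atemschutzunfälle")
print(encoded)
print(label_to_unicode(encoded))
```

The round trip shows the mapping is reversible for this label, so no information is lost by requiring the encoded form in the feed.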