
RSS Mime Type Obstacle

  • rcade
    Message 1 of 22 , Apr 5, 2006
      I'm reading the RFCs related to MIME types and the process to apply
      for one.

      My read of Section 2.2.4 of RFC 2048 would exclude a joint application
      for "application/rss+xml" by the RSS Advisory Board and RSS-DEV
      Working Group (or any other shared type).

      The section:

      "2.2.4. Canonicalization and Format Requirements

      "All registered media types must employ a single, canonical data
      format, regardless of registration tree.

      "A precise and openly available specification of the format of each
      media type is required for all types registered in the IETF tree and
      must at a minimum be referenced by, if it isn't actually included in,
      the media type registration proposal itself."

      ftp://ftp.isi.edu/in-notes/rfc2048.txt

      Does anyone else have a different take on this?
    • David Powell
      Message 2 of 22 , Apr 5, 2006
        Wednesday, April 5, 2006, 4:07:52 PM, rcade wrote:

        > My read of Section 2.2.4 of RFC 2048 would exclude a joint application
        > for "application/rss+xml" by the RSS Advisory Board and RSS-DEV
        > Working Group (or any other shared type).

        RFC 2048 has been obsoleted by RFC 4288 btw.

        --
        Dave
      • Sean Lyndersay
        Message 3 of 22 , Apr 5, 2006

           

          For what it’s worth, perhaps RSS 1.0 should register for application/rss+rdf+xml, as has been suggested in a couple of different places:

          http://www.advogato.org/article/852.html.

           

           



           

           


        • Charles Iliya Krempeaux
          Message 4 of 22 , Apr 6, 2006
            Hello Sean,

            (Sorry for getting off topic everyone.)

            Just to chime in (because you quoted something I wrote...).  (I haven't gotten around to reposting this onto Advogato yet, but) if you are looking at distinguishing the different kinds of feeds apart (which is what that article I wrote is about) then you may want to look at a newer proposal I made too:

            RSS and Atom Feed Auto-Discovery for Internet TV
            http://maketelevision.com/log/rss_and_atom_feed_auto-discovery_for_internet_tv

            The article's primarily targeted towards Internet TV (which is also known as vlogging, vidcasting, vodcasting, IPTV, etc).  But in the comment it also applied the same technique to Internet radio (also known as podcasting, IPradio, etc).


            See ya


            On 4/5/06, Sean Lyndersay <seanlynd@...> wrote:

             

            For what it's worth, perhaps RSS 1.0 should register for application/rss+rdf+xml, as has been suggested in a couple of different places:

            http://www.advogato.org/article/852.html .

             

             




            --
                Charles Iliya Krempeaux, B.Sc.

                charles @ reptile.ca
                supercanadian @ gmail.com

                developer weblog: http://ChangeLog.ca/
            ___________________________________________________________________________
             Make Television                                http://maketelevision.com/
          • ecomputerd
            Message 5 of 22 , Apr 6, 2006
              I appreciate the intention of the "media" parameter.

              I would recommend using multiple "enclosure-type" parameters that
              list the MIME types of the enclosures. For example, use
              enclosure-type="video/*" instead of media="tv". This handles mixed
              feeds much more reliably and generically. In fact, I'm not sure I
              would even agree with placing these definitions in the *pointer to*
              the RSS feed. They may be much more relevant within the feed itself.
              There is no current RSS element that lists *potential* enclosures.
              This may be a decent area of extension using a new namespace.
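
To make the idea concrete, here is a rough Python sketch of how the list of enclosure types could be derived from a feed and checked against a wildcard such as "video/*". The sample feed, the example.org URLs, and the helper names `enclosure_types` and `matches` are all made up for illustration; "enclosure-type" itself is only a suggestion in this thread, not part of any specification.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

# Hypothetical mixed feed: one video, one audio, and one Ogg enclosure.
FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example</title>
<item><enclosure url="http://example.org/a.mp4" length="1" type="video/mp4"/></item>
<item><enclosure url="http://example.org/b.mp3" length="1" type="audio/mpeg"/></item>
<item><enclosure url="http://example.org/c.ogg" length="1" type="application/ogg"/></item>
</channel></rss>"""


def enclosure_types(feed_xml):
    """Return the sorted, de-duplicated MIME types declared on <enclosure> elements."""
    root = ET.fromstring(feed_xml)
    return sorted(
        {e.get("type").lower() for e in root.iter("enclosure") if e.get("type")}
    )


def matches(pattern, types):
    """Return the types that match a wildcard pattern such as 'video/*'."""
    return [t for t in types if fnmatchcase(t, pattern.lower())]


types = enclosure_types(FEED)
print(types)                      # every type a publisher could advertise
print(matches("video/*", types))  # only the types declared as video/...
```

A wildcard like this only sees what the declared type says, so it depends on publishers labelling enclosures consistently.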

              The overall issue with your proposal, in my mind, is that it is not
              aware of mixed feeds. In addition, it appears that your intention is
              targeted for a very specific application: that of dedicated video
              aggregators.

              As for additional "descriptors" in your suggestion (media="screen,
              tv, 3d-glasses, print and resolution > 90dpi"), it seems much more
              relevant to include such information within the RSS feed itself,
              using RSS-Media for example, largely because such descriptors are
              extremely difficult to parse, interpret, display, and (if necessary)
              offer to the user as a choice.

              Greg Smith

              --- In rss-public@yahoogroups.com, "Charles Iliya Krempeaux"
              <supercanadian@...> wrote:
              >
              > Hello Sean,
              >
              > (Sorry for getting off topic everyone.)
              >
              > Just to chime in (because you quoted something I wrote...). (I haven't
              > gotten around to reposting this onto Advogato yet, but) if you are looking
              > at distinguishing the different kinds of feeds apart (which is what that
              > article I wrote is about) then you may want to look at a newer proposal I
              > made too:
              >
              > RSS and Atom Feed Auto-Discovery for Internet TV
              > http://maketelevision.com/log/rss_and_atom_feed_auto-discovery_for_internet_tv
              >
              > The article's primarily targeted towards Internet TV (which is also known
              > as vlogging, vidcasting, vodcasting, IPTV, etc). But in the comment it also
              > applied the same technique to Internet radio (also known as podcasting,
              > IPradio, etc).
              >
              >
              > See ya
              >
              >
            • ecomputerd
              Message 6 of 22 , Apr 6, 2006
                Below is a stab at the application for the media type
                application/rss+xml. The formatting of Yahoo! Group postings is
                horrible. I'm open to suggestions on how to make such drafts
                available in a more readable format.

                In particular, this application draft specifies the use of an
                optional "version" parameter that mirrors the rss element's version
                attribute. This would allow differentiation of the various RSS
                specifications when and where desired.
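
As a rough illustration of how a consumer might read such a parameter, here is a minimal Python sketch using the standard library's header parsing. The header values and the helper name `media_type_and_version` are hypothetical, and the "version" parameter exists only as a proposal in this draft.

```python
from email.message import Message


def media_type_and_version(content_type):
    """Split a Content-Type header value into (media type, version parameter).

    The "version" parameter is the optional one proposed in this draft;
    None is returned when the header does not carry it.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_type() lowercases the type/subtype and drops parameters.
    # get_param() returns the unquoted parameter value, or None if absent.
    return msg.get_content_type(), msg.get_param("version")


# Hypothetical header values, for illustration only.
print(media_type_and_version('application/rss+xml; version="2.0"; charset=utf-8'))
print(media_type_and_version("application/rss+xml"))
```

A processor that does not care about the version can ignore the second value and treat every such document as RSS.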

                Other notes and questions:
                Should the version parameter include: "In some cases, it is expected
                that maintenance of interoperability with respect to any specific
                value of the "version" parameter could be deferred to the respective
                specification author or organization."?

                I would argue that the charset parameter is not authoritative when
                it conflicts with the XML encoding declaration within the XML MIME
                entity. My argument: Any processor willing to change the charset,
                should also be capable of changing the declaration within the XML
                itself, although this hierarchy is not definitive. This is in direct
                conflict with RFC3023 Section 3.2.

                Is there a significant advantage to differentiating RSS files from
                other XML files? For File Extension, it seems reasonable to
                use ".rss" (or we can stick to ".xml" as is specified by some other
                generic xml files) and for Macintosh File Type Code: "TEXT" (what
                about "RSS" presuming that a 3-character code is acceptable?). I'm
                not aware of the interoperability issues with the Macintosh File
                Type Code.

                The "Published Specification:" section may need an actual
                specification (or link) to be accepted.

                Base URI (RFC3023 Section 6) and Fragment identifiers (RFC3023
                Section 5) probably need review (for possible insertion
                in "Additional Information")

                Email address, Author, and Change Controller need to be filled.

                ---
                According to the charter (http://www.rssboard.org/charter), the "RSS
                Advisory Board is an independent organization with three primary
                duties: publishing the RSS specification, guiding developers who
                create RSS applications and broadening the public understanding of
                RSS." Membership to the board is open, following a vote of existing
                board members, and all actions of the Board are publicly available.
                In addition, the Board maintains a message board open to all for
                specific public discussion on any RSS-related issues.

                To: ietf-types@...
                Subject: Registration of media type application/rss+xml

                Type name: application

                Subtype name: rss+xml

                Required parameters:

                Optional parameters:
                "version": This parameter is identical to the version
                attribute of the rss element within the document, when specified within
                the document. This parameter is optional, whether specified within
                the document or not. A list of valid version parameters is to be
                kept by the RSS Advisory Board. Because some older RSS
                specifications are not currently maintained, the RSS Advisory Board
                does currently maintain a list of versions and a mirrored copy of each
                specification. The maintenance of a version-to-specification mapping
                is easily accomplished, and in any case would be subject to public
                discussion when decided.
                "charset": This parameter has identical semantics to the
                charset parameter of the "application/xml" media type as specified
                in [RFC3023].

                Encoding considerations: Identical to those of "application/xml"
                as described in [RFC3023], section 3.2.

                Security considerations:
                HTML may be used within "one or more" RSS elements. Such use
                of HTML could include [whether allowed in the specification or not]
                potentially-dangerous use of Javascript or other scripting language.
                In some implementations of RSS processors, certain RSS elements may
                be inserted directly into an HTML processing engine or component.
                Such use of Javascript in these RSS elements may be a security
                concern to the implementor.
                In addition, as this media type uses the "+xml" convention,
                it shares the same security considerations as described in
                [RFC3023], section 10.

                Interoperability considerations:
                There are a variety of specifications that call
                themselves "RSS". The effort of the RSS Advisory Board is intended
                to continue to clarify and reconcile interoperability issues as they
                arise. The "version" parameter of this proposal is intended to allow
                differentiation between these various versions, when required.

                Published specification:
                The RSS Advisory Board is currently in the process of
                clarifying and publishing draft specification and Best Practices
                guidelines with the unique purpose of clarifying the specification
                and guiding implementations of RSS document originators and
                processors.

                Applications that use this media type:
                This media type, as well as a handful of others, is currently
                being used by a variety of publishers and clients to indicate "RSS"
                files. The intent of this application is to guide implementors to a
                common media type and encourage the use of differentiating
                parameters where needed.

                Additional information:

                Magic number(s):
                File extension(s):
                Macintosh file type code(s):

                Person & email address to contact for further information:

                Intended usage: COMMON

                (One of COMMON, LIMITED USE or OBSOLETE.)

                Restrictions on usage:

                (Any restrictions on where the media type can be used go here.)

                Author:

                Change controller:

                (Any other information that the author deems interesting may be
                added below this line.)
              • Bill Kearney
                Message 7 of 22 , Apr 6, 2006
                  No, let's not make it any more confusing than it needs to be.

                  -Bill Kearney
                  Syndic8.com

                  ----- Original Message -----
                  > From: "Sean Lyndersay" <seanlynd@...>



                  For what it's worth, perhaps RSS 1.0 should register for
                  application/rss+rdf+xml
                • A. Pagaltzis
                  Message 8 of 22 , Apr 6, 2006
                    * ecomputerd <ecomputerd@...> [2006-04-06 12:35]:
                    > I would argue that the charset parameter is not authoritative
                    > when it conflicts with the XML encoding declaration within the
                    > XML MIME entity. My argument: Any processor willing to change
                    > the charset, should also be capable of changing the declaration
                    > within the XML itself, although this hierarchy is not
                    > definitive. This is in direct conflict with RFC3023 Section
                    > 3.2.

                    Disagree. No one will deny that HTTP / MIME / XML is a tangled
                    mess, but it strikes me as less than wise to willfully make it
                    even worse by having specs arbitrarily overriding other upstream
                    specs. Your proposal would mean that anyone processing RSS
                    documents as generic XML will behave differently from someone
                    processing RSS documents as RSS documents. It would be better to
                    just tell people that it is best practice to not supply the
                    charset parameter. In that case, RFC3023 requires processors to
                    fall back to the encoding in the XML preamble, which is sanity.

                    This doesn't just apply to RSS; it is best practice for *any* XML
                    content to serve it using an appropriate `application/*` MIME
                    type without an explicit charset parameter.

                    > Is there a significant advantage to differentiating RSS files
                    > from other XML files? For File Extension, it seems reasonable
                    > to use ".rss" (or we can stick to ".xml" as is specified by
                    > some other generic xml files) and for Macintosh File Type Code:
                    > "TEXT" (what about "RSS" presuming that a 3-character code is
                    > acceptable?). I'm not aware of the interoperability issues with
                    > the Macintosh File Type Code.

                    Filename extensions are meaningless on the internet, and Mac file
                    type codes even more so. Since end users don't generally save RSS
                    documents to disk, I don't see much value in making any of this
                    up, and since neither these pieces of information nor the Magic
                    Number are applicable to RSS documents, I'd just leave out the
                    Additional Information section entirely.

                    Regards,
                    --
                    Aristotle Pagaltzis // <http://plasmasturm.org/>
                  • Sam Ruby
                    Message 9 of 22 , Apr 6, 2006
                      A. Pagaltzis wrote:
                      > * ecomputerd <ecomputerd@...> [2006-04-06 12:35]:
                      >
                      >>I would argue that the charset parameter is not authoritative
                      >>when it conflicts with the XML encoding declaration within the
                      >>XML MIME entity. My argument: Any processor willing to change
                      >>the charset, should also be capable of changing the declaration
                      >>within the XML itself, although this hierarchy is not
                      >>definitive. This is in direct conflict with RFC3023 Section
                      >>3.2.
                      >
                      > Disagree. No one will deny that HTTP / MIME / XML is a tangled
                      > mess, but it strikes me as less than wise to willfully make it
                      > even worse by having specs arbitrarily overriding other upstream
                      > specs. Your proposal would mean that anyone processing RSS
                      > documents as generic XML will behave differently from someone
                      > processing RSS documents as RSS documents. It would be better to
                      > just tell people that it is best practice to not supply the
                      > charset parameter. In that case, RFC3023 requires processors to
                      > fall back to the encoding in the XML preamble, which is sanity.
                      >
                      > This doesn't just apply to RSS; it is best practice for *any* XML
                      > content to serve it using an appropriate `application/*` MIME
                      > type without an explicit charset parameter.

                      How about a simple statement that the charset parameter, if present,
                      MUST agree with the encoding that would be determined by following the
                      rules in Section F.1 of the XML 1.0 specification (third edition)?
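
A check along those lines might look roughly like the following Python sketch. It is only an approximation of the F.1 rules: it looks at a byte-order mark or an ASCII-compatible XML declaration and otherwise assumes UTF-8, and it ignores cases such as UTF-16 without a BOM or EBCDIC. The function names `sniff_xml_encoding` and `charset_agrees` are invented for this example.

```python
import codecs
import re

# Byte-order marks, UTF-32 first so its BOM is not mistaken for UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]

# Encoding pseudo-attribute of an XML declaration at the very start of the entity.
_DECL = re.compile(
    rb"""<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data):
    """Guess the encoding of an XML entity from its first bytes (simplified)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # default when there is no BOM and no declaration


def charset_agrees(charset, data):
    """True when charset is absent or names the same codec as the sniffed encoding."""
    if charset is None:
        return True
    try:
        declared = codecs.lookup(charset).name
        sniffed = codecs.lookup(sniff_xml_encoding(data)).name
    except LookupError:
        return False  # unknown charset label: treat as disagreement
    return declared == sniffed


doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><rss version="2.0"/>'
print(charset_agrees("latin-1", doc), charset_agrees("utf-8", doc))
```

Comparing codec names rather than raw labels means aliases such as "latin-1" and "ISO-8859-1" count as agreeing.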

                      - Sam Ruby
                    • Charles Iliya Krempeaux
                      Message 10 of 22 , Apr 6, 2006
                        Hello,

                        (First, just to say it explicitly, this media="tv" proposal does NOT take into account "mixed" feeds.  However, I still believe this to be very useful.  So, having said that...)

                        I agree with you that it is important to have this information in the RSS feed too.

                        However, I think it is necessary to put this "hinting" information in the HTML because it allows software to "take action" without having to understand RSS.  That is one of the points of HTML "hinting" in general: it is done so that the browser can "take action" without having to download the target and without having to understand the information at the target.

                        For a more concrete example, imagine a webpage that has a "link" to an RSS feed, where that RSS feed is on a different site.  Because of security restrictions, (JavaScript on) the webpage would NOT be able to "probe" the RSS feed (on the other site) and figure out what kind of feed it is.  It would need to rely completely on "hinting" information.
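
To sketch what acting on a hint could look like, here is a small Python example that reads only the page markup and never fetches the feeds. It assumes the hint is carried as a media attribute on an auto-discovery `<link>` element, which is my reading of the media="tv" proposal; the page markup, the placeholder hosts, and the class name `FeedLinkCollector` are invented for illustration.

```python
from html.parser import HTMLParser


class FeedLinkCollector(HTMLParser):
    """Collect feed auto-discovery <link> elements and their media hints."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        if "alternate" in rel and a.get("href"):
            self.feeds.append(
                {"href": a["href"], "type": a.get("type"), "media": a.get("media")}
            )


# Hypothetical page markup; example.org and example.net are placeholder hosts.
PAGE = """
<html><head>
<link rel="alternate" type="application/rss+xml" media="tv" href="http://example.org/shows.rss">
<link rel="alternate" type="application/rss+xml" href="http://example.net/blog.rss">
</head><body></body></html>
"""

collector = FeedLinkCollector()
collector.feed(PAGE)
# Pick the feeds hinted as "tv" without downloading or parsing any RSS.
tv_feeds = [f["href"] for f in collector.feeds if f["media"] == "tv"]
print(tv_feeds)
```

The second feed lives on another host, yet it can still be classified (here, as "not hinted") from the markup alone.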


                        Another point, you mentioned using a MIME type of "video/*", with RSS <enclosure>'s, as an alternative to media="tv".  I do NOT think this is a sufficient solution.  The reason is that not all "video" data has a MIME type such as that.  2 examples that come to mind immediately are:
                        • Ogg Theora - application/ogg
                        • SMIL - application/smil  and  application/smil+xml

                        So basically, although I agree with you that adding that kind of metadata into the RSS feed is useful, I still believe that providing the HTML hinting information (of media="tv", media="radio", or whatever) is still useful, and for many use cases (which I've neglected to mention here) is necessary.


                        See ya

                        On 4/6/06, ecomputerd <ecomputerd@...> wrote:
                        I appreciate the intention of the "media" parameter.

                        I would recommend using multiple "enclosure-type" parameters that
                        list the MIME types of the enclosures. For example, use
                        enclosure-type="video/*" instead of media="tv". This handles mixed feeds much
                        more reliably and generically. In fact, I'm not sure I would even
                        agree with placing these definitions in the *pointer to* the RSS
                        feed. They may be much more relevant within the feed itself. There
                        is no current RSS element that lists *potential* enclosures. This
                        may be a decent area of extension using a new namespace.

                        The overall issue with your proposal, in my mind, is that it is not
                        aware of mixed feeds. In addition, it appears that your intention is
                        targeted for a very specific application: that of dedicated video
                        aggregators.

                        As for additional "descriptors" in your suggestion (media="screen,
                        tv, 3d-glasses, print and resolution > 90dpi"), it seems much more
                        relevant to include such information within the RSS feed itself,
                        using RSS-Media for example. Largely because such additional
                        descriptors as described are extremely difficult to parse,
                        interpret, display, and (if necessary) allow to be chosen by the
                        user.

                        Greg Smith

                        --- In rss-public@yahoogroups.com, "Charles Iliya Krempeaux"
                        <supercanadian@...> wrote:
                        >
                        > Hello Sean,
                        >
                        > (Sorry for getting off topic everyone.)
                        >
                        > Just to chime in (because you quoted something I wrote...).  (I
                        haven't
                        > gotten around to reposting this onto Advogato yet, but) if you are
                        looking
                        > at distinushing the different kinds of feeds apart (which is what
                        that
                        > article I wrote is about) then you may want to look at a newer
                        proposal I
                        > made too:
                        >
                        > RSS and Atom Feed Auto-Discovery for Internet TV
                        > http://maketelevision.com/log/rss_and_atom_feed_auto-
                        discovery_for_internet_tv
                        >
                        > The article's primarily targetted towards Internet TV (which is
                        also known
                        > as vlogging, vidcasting, vodcasting, IPTV, etc).  But in the
                        comment it also
                        > applied the same technique to Internet radio (also known as
                        podcasting,
                        > IPradio, etc).
                        >
                        >
                        > See ya
                        >
                        >
                        > On 4/5/06, Sean Lyndersay <seanlynd@...> wrote:
                        > >
                        > >
                        > >
                        > > For what it's worth, perhaps RSS 1.0 should register for
                        > > application/rss+rdf+xml, as has been suggested in a couple of
                        different
                        > > places:
                        > >
                        > > http://www.advogato.org/article/852.html.
                        > >
                        > >
                        > >
                        > >
                        > >  ------------------------------
                        > >
                        > > *From:* rss-public@yahoogroups.com [mailto:rss-
                        public@yahoogroups.com] *On
                        > > Behalf Of *rcade
                        > > *Sent:* Wednesday, April 05, 2006 8:08 AM
                        > > *To:* rss-public@yahoogroups.com
                        > > *Subject:* [rss-public] RSS Mime Type Obstacle
                        > >
                        > >
                        > >
                        > > I'm reading the RFCs related to MIME types and the process to apply
                        > > for one.
                        > >
                        > > My read of Section 2.2.4 of RFC 2048 would exclude a joint application
                        > > for "application/rss+xml" by the RSS Advisory Board and RSS-DEV
                        > > Working Group (or any other shared type).
                        > >
                        > > The section:
                        > >
                        > > " 2.2.4.  Canonicalization and Format Requirements
                        > >
                        > > "All registered media types must employ a single, canonical data
                        > > format, regardless of registration tree.
                        > >
                        > > "A precise and openly available specification of the format of each
                        > > media type is required for all types registered in the IETF tree and
                        > > must at a minimum be referenced by, if it isn't actually included in,
                        > > the media type registration proposal itself."
                        > >
                        > > ftp://ftp.isi.edu/in-notes/rfc2048.txt
                        > >
                        > > Does anyone else have a different take on this?

                        [...]


                        --
                            Charles Iliya Krempeaux, B.Sc.

                            charles @ reptile.ca
                            supercanadian @ gmail.com

                            developer weblog: http://ChangeLog.ca/
                        ___________________________________________________________________________
                         Make Television                                http://maketelevision.com/
                      • ecomputerd
                        Message 11 of 22 , Apr 6, 2006
                          Charles,

                          Thank you for this discussion!

                          I'm not arguing against the "hinting", I'm arguing for a different
                          form of the hinting (one that is consistent with other protocols
                          and "hinting" mechanisms). As for my suggestion not covering Ogg
                          and SMIL: I agree that "video/*" does not, and that is exactly why I
                          indicated that *multiple* "enclosure-type" parameters be used.
                          Adding those MIME types (application/ogg, application/smil, and
                          application/smil+xml) to the "hint" in addition to "video/*" comes
                          closer to properly indicating the types of files in a flexible,
                          generic, and extensible way.
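
                          As a rough illustration of the multiple-hint idea, here is a
                          minimal Python sketch of how a client might test an enclosure's
                          MIME type against a list of hints that mixes wildcards such as
                          "video/*" with exact types. The function name `matches_hint` and
                          the `tv_hints` list are hypothetical; neither comes from any
                          proposal or specification discussed in this thread.

```python
from fnmatch import fnmatchcase


def matches_hint(mime_type, hints):
    """Return True if mime_type matches any hint pattern such as "video/*"."""
    # Media types are case-insensitive; drop any ";param=value" suffix first.
    essence = mime_type.split(";", 1)[0].strip().lower()
    return any(fnmatchcase(essence, hint.strip().lower()) for hint in hints)


# Hypothetical hint list for an Internet TV feed (an assumption for illustration).
tv_hints = ["video/*", "application/ogg", "application/smil", "application/smil+xml"]

for candidate in ("video/mp4", "application/ogg", "audio/mpeg"):
    print(candidate, matches_hint(candidate, tv_hints))
```

                          Note that application/ogg matches here whether the file holds
                          audio or video, which is the ambiguity discussed next.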

                          It is arguable from a high level that application/ogg might not need
                          further demarcation of audio vs. video because they are in fact the
                          same container format. The last time that I reviewed them, both APP
                          and RSS-Media (plus my suggested modifications to your proposal) use
                          the media type as the determining factor on whether the file type is
                          acceptable (for download or upload, as appropriate). This behooves
                          maintainers of MIME types to ensure that a proper and decent amount
                          of differentiation between files can be made. In the case of these
                          newer uses of the MIME type, the proper and decent amount of
                          differentiation apparently includes differentiation between files
                          meant for audio players (with or without video capability) and files
                          meant for video players.

                          Because of the various uses of MIME type for determining the further
                          disposition of the media file, my strong suggestion to the
                          maintainers of the Ogg specification is that they 1) confirm my
                          assertions here, and 2) consider registering audio/ogg, video/ogg or
                          come up with another scheme (parameters or "+audio" or "-audio",
                          perhaps?) that differentiates audio from video. I wasn't previously
                          aware of the "parameter" notation of MIME types (using semicolons),
                          but I recall at least one recommendation/Pace for APP that included
                          a semicolon-separated list of acceptable MIME types for POST. I'm
                          not sure what was finally decided on. This may play a role in how
                          the Ogg specification decides to differentiate audio from video.

                          I know there are edge cases (audio file containing embedded jpeg
                          images for example, and the fact that the vast majority of video
                          files contain audio as well), but the overall intent could be
                          aligned with the current audio/* and video/* MIME types.

                          It's possible that a review of current recommended MPEG-4 MIME types
                          and common uses is in order prior to determining how and if Ogg
                          should be indicated by further MIME types to differentiate audio and
                          video. Ogg could have the same issue, as MPEG-4 specifies audio (Part
                          3), video (Part 2 and Part 10 "AVC"), and file format (Part 14, among
                          others if I recall correctly).

                          For this message, I very briefly reviewed the specification:
                          http://www.rfc-editor.org/rfc/rfc3534.txt
                          and a small portion of the discussion at
                          http://lists.xiph.org/pipermail/theora-dev/2003-February/000385.html


                          You wrote:
                          >I still believe that providing the HTML hinting information
                          >(of media="tv", media="radio", or whatever) is still useful,
                          >and for many use cases (which I've neglected to mention here) is
                          >necessary.

                          I am simply suggesting a proxy for your "media" tag because I do not
                          believe the tag is sufficiently flexible to accommodate a wide
                          variety of potential uses. It creates a new paradigm around
                          the "media" tag that could easily be replaced by the
                          existing MIME type mechanism (assuming there are "proper and decent"
                          definitions of MIME types as described above, which there are now
                          multiple reasons to seek).

                          I would love to hear any number of use cases that you've not yet
                          mentioned that might convince me to alter my opinion. That's exactly
                          what this forum is for!

                          I think we are in complete agreement about the utility; we are just
                          disagreeing on the format.

                          And for those who might doubt the usefulness of its discussion here,
                          Autodiscovery, in my opinion, is an important part of RSS Best
                          Practices.

                          Greg Smith
                        • ecomputerd
                          Message 12 of 22 , Apr 6, 2006
                            --- In rss-public@yahoogroups.com, Sam Ruby <rubys@...> wrote:
                            >
                            > A. Pagaltzis wrote:
                            > > * ecomputerd <ecomputerd@...> [2006-04-06 12:35]:
                            > >
                            > >>I would argue that the charset parameter is not authoritative
                            > >>when it conflicts with the XML encoding declaration within the
                            > >>XML MIME entity. My argument: Any processor willing to change
                            > >>the charset, should also be capable of changing the declaration
                            > >>within the XML itself, although this hierarchy is not
                            > >>definitive. This is in direct conflict with RFC3023 Section
                            > >>3.2.
                            > >
                            > > Disagree. Noone will deny that HTTP / MIME / XML is a tangled
                            > > mess, but it strikes me as less than wise to willfully make it
                            > > even worse by having specs arbitrarily overriding other upstream
                            > > specs. Your proposal would mean that anyone processing RSS
                            > > documents as generic XML will behave differently from someone
                            > > processing RSS documents as RSS documents. It would be better to
                            > > just tell people that it is best practice to not supply the
                            > > charset parameter. In that case, RFC3023 requires processors to
                            > > fall back to the encoding in the XML preamble, which is sanity.
                            > >
                            > > This doesn't just apply to RSS; it is best practice for *any* XML
                            > > content to serve it using an appropriate `application/*` MIME
                            > > type without an explicit charset parameter.
                            >
                            > How about a simple statement that the charset parameter, if present,
                            > MUST agree with the encoding that would be determined by following the
                            > rules in Section F.1 of the XML 1.0 specification (third edition)?
                            >
                            > - Sam Ruby
                            >

                            Sam,

                            What is the relationship between RFC3023 and the XML 1.0
                            specification (third edition)? RFC3023 makes frequent reference
                            to the Second Edition. What are the "inheritance rules" for RFC3023
                            adopting the Third Edition as opposed to the specifically-stated
                            Second Edition? (I assume these are the 2nd and 3rd editions of
                            the same document.)

                            Also, my apologies if what follows sounds like an attack, Sam. I
                            assure you it is not! I know you are only trying to accommodate my
                            uneasiness with the specification as written.

                            Is it really possible and reasonable to restrict the encoding
                            attribute this way when RFC3023 indicates specifically that the
                            Internal Encoding Declaration can contradict the charset (in which
                            case the charset is used)? This is implied in section 8.8 by the
                            comment in parentheses and stated directly in section 8.20 (see
                            below for 8.20 excerpt).

                            In section 8.10 et al, RFC3023 defers (in very specific cases) to
                            Appendix F of "Extensible Markup Language (XML) 1.0 (Second
                            Edition)" issued by The World Wide Web Consortium for determining
                            the appropriate character set/encoding.

                            In section 8.20, the example is where the charset is specified and
                            it disagrees with the Internal Encoding Declaration. In such a case,
                            the Internal Encoding Declaration MUST be ignored.

                            There is an additional paragraph within section 8.20 that I don't
                            know how to interpret:

                            Processors generating XML MIME entities MUST NOT label
                            conflicting charset information between the MIME
                            Content-Type and the XML declaration.

                            My apologies if this seems like my stream-of-consciousness reading of
                            RFC3023. It only seems that way because it is. I was trying to
                            cherry-pick the relevant sections related to encoding interpretation.

                            I would agree that my original "argument" regarding character
                            set/encoding should be ignored and that we may have to accept at
                            minimum, A. Pagaltzis' recommendation to "do nothing." But I am
                            intrigued by your suggestion, Sam.

                            In particular, I am concerned about the implementation and storage
                            of XML documents where, as is the most common case of RSS, they are
                            delivered via HTTP. It seems that to accurately interpret an RSS
                            file, a processor/client must know the Content-type and prefer any
                            charset parameter over the Internal Encoding Declaration. Has
                            conformance to this ever been tested on any clients? And have the
                            implications of following this rule ever been reviewed; i.e., do
                            producers get this correct, and has any producer ever been documented
                            that has a "case 8.20" conflict between charset and Internal Encoding
                            Declaration and done it correctly?
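
                            To make the precedence rule concrete, here is a minimal Python
                            sketch of how a client might pick the encoding for an
                            application/xml (or "+xml") response: the charset parameter wins
                            when present, otherwise the XML declaration is consulted. The
                            helper names are hypothetical. As simplifying assumptions, it only
                            sniffs ASCII-compatible declarations, skips the BOM and UTF-16
                            autodetection of Appendix F, and does not cover text/xml, whose
                            default differs.

```python
import re
from email.message import Message

# Matches encoding="..." inside an ASCII-compatible XML declaration.
_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def transport_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("charset")


def declared_encoding(body):
    """Return the encoding named in the XML declaration, or None."""
    match = _DECL.match(body)
    return match.group(1).decode("ascii") if match else None


def effective_encoding(content_type, body):
    """Return (encoding, conflict) with the charset parameter taking precedence."""
    charset = transport_charset(content_type)
    declared = declared_encoding(body)
    if charset is not None:
        # The "case 8.20" situation: both are present and they disagree.
        conflict = declared is not None and declared.lower() != charset.lower()
        return charset, conflict
    # No charset parameter: fall back to the declaration, else assume UTF-8.
    return declared or "utf-8", False


feed = b'<?xml version="1.0" encoding="iso-8859-1"?><rss version="2.0"/>'
print(effective_encoding("application/xml; charset=utf-8", feed))
print(effective_encoding("application/xml", feed))
```

                            The returned conflict flag is the condition a validator could
                            report; what a client should do about it is the open question.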

                            On another, related, note: Section 8.16 et al, mentions the
                            imperative nature of registering a "+xml" type prior to
                            use: "However, no content type has yet been registered for MathML
                            and so this media type should not be used until such registration
                            has been completed."

                            Greg Smith
                          • Sam Ruby
                            Message 13 of 22 , Apr 6, 2006
                              ecomputerd wrote:
                              >
                              > In particular, I am concerned about the implementation and storage
                              > of XML documents where, as is the most common case of RSS, they are
                              > delivered via HTTP. It seems that to accurately interpret an RSS
                              > file, a processor/client must know the Content-type and prefer any
                              > charset parameter over the Internal Encoding Declaration. Has
                              > conformance to this ever been tested on any clients? And have the
                              > implications of following this rule ever been reviewed; i.e. do
                              > producers get this correct and has a producer ever been documented
                              > that has a "case 8.20" conflicting charset vs. Internet Encoding
                              > Declaration and done it correctly?

                              Forgive me for the significant snippage, but that's actually the best
                              place to start: real data. The following is slightly dated, but I don't
                              believe that the situation has materially changed:

                              http://www.xml.com/pub/a/2004/07/21/dive.html

                              My way of dealing with this situation is that the following two
                              conditions are reported as warnings:

                              http://feedvalidator.org/docs/warning/UnexpectedContentType.html
                              http://feedvalidator.org/docs/warning/EncodingMismatch.html

                              By rights, they both should be errors.

                              - Sam Ruby
                            • ecomputerd
                              Message 14 of 22 , Apr 6, 2006
                                --- In rss-public@yahoogroups.com, Sam Ruby <rubys@...> wrote:
                                >
                                > ecomputerd wrote:
                                > >
                                > > In particular, I am concerned about the implementation and storage
                                > > of XML documents where, as is the most common case of RSS, they are
                                > > delivered via HTTP. It seems that to accurately interpret an RSS
                                > > file, a processor/client must know the Content-type and prefer any
                                > > charset parameter over the Internal Encoding Declaration. Has
                                > > conformance to this ever been tested on any clients? And have the
                                > > implications of following this rule ever been reviewed; i.e. do
                                > > producers get this correct and has a producer ever been documented
                                > > that has a "case 8.20" conflicting charset vs. Internet Encoding
                                > > Declaration and done it correctly?
                                >
                                > Forgive me for the significant snippage, but that's actually the best
                                > place to start: real data. The following is slightly dated, but I don't
                                > believe that the situation has materially changed:
                                >
                                > http://www.xml.com/pub/a/2004/07/21/dive.html
                                >
                                > My way of dealing with this situation is that the following two
                                > conditions are reported as warnings:
                                >
                                > http://feedvalidator.org/docs/warning/UnexpectedContentType.html
                                > http://feedvalidator.org/docs/warning/EncodingMismatch.html
                                >
                                > By rights, they both should be errors.
                                >
                                > - Sam Ruby
                                >

                                Sam,

                                I don't mind the significant snippage at all. You reposted and
                                responded to what I agree is the most important part.

                                My only issue with the article linked to above is that it talks
                                about this being the "fault" (and presumably its opposite:
                                the "responsibility") of "the libraries". While I am only familiar
                                with the .NET framework, I would find it difficult to believe that
                                any library would process this correctly because it's not clear to
                                me how I would specify a "default, mandatory" encoding that is *not*
                                overridden by the Internal Encoding Declaration. I know that I could
                                manually strip the XML declaration, and specify an encoding, but I
                                could not just give the stream to the XML parser. The "stream" in
                                this case would not include any information about the Content-type.
                                I would have to somehow insert that manually.

                                I have not specifically tested this aspect of the .NET Compact
                                Framework, but do any XML parsing libraries easily allow this type
                                of processing (character encoding override)?

                                Unless the XML Parsing Library in question "hooks up" to an Http
                                object of some type, or the raw stream from the server is fed
                                directly into the parser AND the parser understands and parses HTTP
                                Headers including Content-type, it seems likely that some "manual"
                                intervention and coding would be required.
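
                                For comparison, here is a minimal sketch of what such an
                                override can look like in Python's standard library, where
                                `xml.etree.ElementTree.XMLParser` takes an `encoding` argument
                                that overrides the encoding named in the document. The wrapper
                                `parse_with_charset` and the sample bytes are hypothetical, and
                                the caller still has to extract the charset from the HTTP
                                headers itself. The underlying expat parser only supports a
                                small set of encodings for this override (UTF-8, UTF-16,
                                ISO-8859-1 and ASCII).

```python
import xml.etree.ElementTree as ET


def parse_with_charset(body, charset=None):
    """Parse XML bytes, letting a transport charset override the declaration."""
    # XMLParser(encoding=...) overrides whatever the XML declaration says.
    # With no charset, the parser uses the document's own declaration.
    parser = ET.XMLParser(encoding=charset) if charset else ET.XMLParser()
    parser.feed(body)
    return parser.close()


# UTF-8 bytes mislabelled as ISO-8859-1 in the XML declaration.
sample = '<?xml version="1.0" encoding="iso-8859-1"?><title>caf\u00e9</title>'.encode("utf-8")

overridden = parse_with_charset(sample, "utf-8")  # trusts the transport charset
declared = parse_with_charset(sample)             # trusts the declaration
print(repr(overridden.text), repr(declared.text))
```

                                The two calls yield different text for the same bytes, which
                                is why a client that ignores the Content-Type can disagree
                                with one that honours it.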

                                Either way, client implementations are stuck in the "user's dilemma"
                                where if any one client can't read a feed that others can, users
                                start complaining and believe the one client that CAN'T read the
                                feed is the one at fault. This feedback loop to implementors only
                                exacerbates the problem.

                                Sam, thanks for already being "around the block" on these issues.
                                Your help with enlightening me on these issues is invaluable and
                                appreciated.

                                Greg Smith
                              • James Holderness
                                Message 15 of 22 , Apr 6, 2006
                                  ecomputerd wrote:
                                  > I have not specifically tested this aspect of the .NET Compact
                                  > Framework, but do any XML parsing libraries easily allow this type
                                  > of processing (character encoding override)?

                                  Mine does. I don't actually use it because I'm still not sure how best to
                                  handle this, but the option is there.

                                  > Either way, client implementations are stuck in the "user's dilemma"
                                  > where if any one client can't read a feed that others can, users
                                  > start complaining and believe the one client that CAN'T read the
                                  > feed is the one at fault.

                                  Yeah. That's exactly why I haven't done anything. Pointless adding code that
                                  you know is going to break things for your users just to comply with some
                                  spec that everybody else ignores.

                                  My gut feel is that the best solution might just be to treat all RSS feeds
                                  as application/xml. If the charset parameter is present, it takes
                                  precedence. If omitted, follow the requirements in section 4.3.3 of the XML
                                  spec. This would presumably allow for transcoding proxies without causing
                                  major problems elsewhere.

                                  If I were the RSS board, though, I'm not sure I'd be comfortable
                                  recommending that developers deliberately ignore certain RFCs, no matter how
                                  sensible that might seem. However, you could at least warn people that
                                  blindly following the rules is likely to cause problems.

                                  For the moment I'm just sticking with the do-what-everyone-else-does
                                  strategy.

                                  Regards
                                  James
                                • A. Pagaltzis
                                  ... libxml2 will soon. http://bugzilla.gnome.org/show_bug.cgi?id=331266 http://www.mail-archive.com/xml@gnome.org/msg02264.html Regards, -- Aristotle Pagaltzis
                                  Message 16 of 22 , Apr 6, 2006
                                    * ecomputerd <ecomputerd@...> [2006-04-07 04:00]:
                                    > but do any XML parsing libraries easily allow this type of
                                    > processing (character encoding override)?

                                    libxml2 will soon.

                                    http://bugzilla.gnome.org/show_bug.cgi?id=331266
                                    http://www.mail-archive.com/xml@.../msg02264.html

                                    Regards,
                                    --
                                    Aristotle Pagaltzis // <http://plasmasturm.org/>
                                  • Sam Ruby
                                    Message 17 of 22 , Apr 7, 2006
                                      James Holderness wrote:
                                      >
                                      >>Either way, client implementations are stuck in the "user's dilemma"
                                      >>where if any one client can't read a feed that others can, users
                                      >>start complaining and believe the one client that CAN'T read the
                                      >>feed is the one at fault.
                                      >
                                      > Yeah. That's exactly why I haven't done anything. Pointless adding code that
                                      > you know is going to break things for your users just to comply with some
                                      > spec that eveybody else ignores.

                                      Nearly all clients make some effort to accept XML that is not well
                                      formed due to character encoding issues. How far to go is often a
                                      matter of judgment and is a matter of considerable debate. For
                                      example, the Microsoft FeedAPI team has taken the courageous stand
                                      that they won't accept feeds that aren't well formed, but then they
                                      wimped out when it came to well-formedness issues due to RFC 3023.

                                      Picking which specifications one wishes to respect and which specs one
                                      chooses to ignore is a dicey proposition. Particularly as it is quite
                                      possible to construct a completely well formed and valid feed that the
                                      Microsoft FeedAPI will not accept and that the majority of others will
                                      accept - not due to the others respecting RFC 3023, mind you, but by
                                      virtue of being "liberal". Not that I would condone anybody doing this.
                                      In fact, the Feed Validator explicitly checks for and issues a warning
                                      on all such feeds.

                                      - - -

                                      This thread started when Sean asked about producing feeds. That side of
                                      the equation is somewhat easier. You want to produce a feed that
                                      complies with all relevant specifications AND is accepted by the
                                      majority of consumers.

                                      I will assert that best practice is to use a mime type of either
                                      application/xml or one of the more specific types, application/rss+xml
                                      or application/atom+xml. And then either to omit the charset parameter
                                      or to have the charset exactly match the encoding the feed would have
                                      if it were interpreted as a standalone document.

                                      The difference between the less specific mime type (application/xml)
                                      and a more specific one is a matter of how older browsers handle the
                                      feed. With the less specific mime type, users are presented with an
                                      incomprehensible stream of angle brackets. With the more specific mime
                                      type, users are presented with an incomprehensible error message.

                                      With newer browsers (including IE7 which is in beta), this does not
                                      matter as much. Either way, they are presented with information on what
                                      the feed is, how to subscribe, and some preview of the content contained
                                      therein.

                                      Properly configured browsers (even older ones) can emulate this behavior
                                      with a combination of the more specific mime type and Atom's link
                                      rel="self".

                                      In my opinion, this tips the scale to the more specific one being
                                      preferred. The only issue that this poses is the fact that such a mime
                                      type isn't registered.

                                      - - -

                                      My recommendation on how to proceed is to register the mime type without
                                      ANY best practices guidelines, and to ensure that the registration is
                                      consistent with all existing specifications.

                                      Then to follow up with information in the rss-profile on this mime type
                                      being the best practice, coupled with a recommendation that the charset
                                      parameter either be omitted or exactly match the feed if the feed were
                                      interpreted as a standalone document.
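
As a rough illustration of the charset recommendation above, here is a minimal Python sketch. The function names (`standalone_encoding`, `feed_content_type`) are hypothetical, the detection is deliberately simplified (byte order mark, then XML declaration, then the XML default of UTF-8), and `application/rss+xml` is used only because it is the type under discussion, not because it is registered.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start
# of an ASCII-compatible document.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def standalone_encoding(body: bytes) -> str:
    """Guess the encoding the feed would have as a standalone XML document."""
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # XML default when nothing is declared


def feed_content_type(body: bytes, mime: str = "application/rss+xml",
                      include_charset: bool = True) -> str:
    """Build a Content-Type value that either omits charset or matches the feed."""
    if not include_charset:
        return mime
    return f"{mime}; charset={standalone_encoding(body)}"


feed = b'<?xml version="1.0" encoding="ISO-8859-1"?><rss version="2.0"/>'
header_with_charset = feed_content_type(feed)
header_without_charset = feed_content_type(feed, include_charset=False)
```

Either header value is consistent with the recommendation: the charset is omitted entirely, or it repeats what the document already says about itself.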

                                      - Sam Ruby
                                    • James Holderness
                                      Message 18 of 22 , Apr 7, 2006
                                        Sam Ruby wrote:
                                        > For example, the Microsoft
                                        > FeedAPI team has taken the courageous stand that they won't accept feeds
                                        > that aren't well formed, but then they wimped out when it came to well
                                        > formedness issues due to RFC 3023.

                                        It could be argued that their courageous stand was more a case of being too
                                        lazy to modify their XML parser to be fault tolerant. And their wimping out
                                        on RFC 3023 issues could equally be attributed to being too lazy to modify
                                        their XML parser to accept charset overrides. Consistent laziness seems more
                                        believable than on again off again courageousness, but I'm willing to give
                                        them the benefit of the doubt.

                                        It's also worth noting that their courageous stand against malformed XML
                                        doesn't extend to rejecting RSS feeds with, say, malformed dates.

                                        > Particularly as it is quite
                                        > possible to construct a completely well formed and valid feed that the
                                        > Microsoft FeedAPI will not accept and that the majority of others will
                                        > accept - not due to the others respecting RFC 3023, mind you, but by
                                        > virtue of being "liberal".

                                        You mean set the charset to something like iso-8859-1 in the HTTP
                                        content-type, but set the charset in the XML declaration to us-ascii? Nice
                                        idea. Of course you could also just serve up a perfectly valid 0.91 RSS feed
                                        with a DTD.
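
To make the mismatch concrete, here is a small Python sketch of how an RFC 3023 style consumer would pick the charset. The function name `rfc3023_charset` is made up for illustration, and the rules are simplified: an explicit HTTP charset parameter wins, text/xml without one falls back to us-ascii, and anything else defers to what the document declares.

```python
def rfc3023_charset(content_type: str, declared: str = "utf-8") -> str:
    """Pick the charset from the HTTP Content-Type, falling back to the document."""
    parts = [p.strip() for p in content_type.split(";")]
    mime = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"').lower()
    if "charset" in params:
        return params["charset"]   # the HTTP header is authoritative
    if mime == "text/xml":
        return "us-ascii"          # text/xml default when charset is absent
    return declared.lower()        # otherwise use the XML declaration


# The case described above: the header says iso-8859-1, the document says us-ascii.
body = b'<?xml version="1.0" encoding="us-ascii"?><rss><title>caf\xe9</title></rss>'
effective = rfc3023_charset("application/rss+xml; charset=iso-8859-1",
                            declared="us-ascii")
text = body.decode(effective)      # decodes cleanly as iso-8859-1

try:
    body.decode("us-ascii")        # what a parser ignoring the header would attempt
    declaration_only_ok = True
except UnicodeDecodeError:
    declaration_only_ok = False
```

A parser that honours the header decodes the feed; one that only trusts the XML declaration chokes on the 0xE9 byte.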

                                        > Not that I would condone anybody doing this.

                                        Of course not. Unless maybe someone wanted to take a courageous stand
                                        against RSS aggregators that are taking courageous stands. ;-)

                                        Regards
                                        James
                                      • Sam Ruby
                                        Message 19 of 22 , Apr 7, 2006
                                          James Holderness wrote:
                                          > Sam Ruby wrote:
                                          >
                                          >>For example, the Microsoft
                                          >>FeedAPI team has taken the courageous stand that they won't accept feeds
                                          >>that aren't well formed, but then they wimped out when it came to well
                                          >>formedness issues due to RFC 3023.
                                          >
                                          > It could be argued that their courageous stand was more a case of being too
                                          > lazy to modify their XML parser to be fault tolerant. And their wimping out
                                          > on RFC 3023 issues could equally be attributed to being too lazy to modify
                                          > their XML parser to accept charset overrides. Consistent laziness seems more
                                          > believable than on again off again courageousness, but I'm willing to give
                                          > them the benefit of the doubt.

                                          "lazy", "courageous", and "wimped out" are all loaded terms.

                                          I still feel that rejecting feeds that aren't (locally) well formed is a
                                          gutsy and welcome move.
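
For anyone unsure what "rejecting" means in practice, here is a minimal Python sketch of a strict acceptance check using the standard library's XML parser. The function name `accept_feed` is hypothetical, and this is not a description of how the Microsoft FeedAPI is implemented.

```python
import xml.etree.ElementTree as ET


def accept_feed(body: bytes) -> bool:
    """Return True only if the feed parses as well-formed XML."""
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return False
    return True


good = b'<rss version="2.0"><channel><title>AT&amp;T news</title></channel></rss>'
bad = b'<rss version="2.0"><channel><title>AT&T news</title></channel></rss>'

good_accepted = accept_feed(good)  # the ampersand is escaped
bad_accepted = accept_feed(bad)    # the bare ampersand is not well formed
```

A "liberal" aggregator would instead try to repair the second feed before parsing it.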

                                          It will likely require more effort to answer all the "but it works in
                                          Bloglines" and "all I did was cut and paste from Microsoft Word into
                                          IE7" questions than it would to compensate for ill-formed feeds.

                                          In the long run, we all benefit.

                                          - Sam Ruby
                                        • James Holderness
                                          Message 20 of 22 , Apr 7, 2006
                                            Sam Ruby wrote:
                                            > James Holderness wrote:
                                            >> It could be argued that their courageous stand was more a case of
                                            >> being too lazy to modify their XML parser to be fault tolerant.
                                            >> And their wimping out on RFC 3023 issues could equally be
                                            >> attributed to being too lazy to modify their XML parser to accept
                                            >> charset overrides.
                                            >
                                            > "lazy", "courageous", and "wimped out" are all loaded terms.

                                            True. And my suggestions were really only half serious. I just thought your
                                            post was in need of an opposing viewpoint on this issue. I find it difficult
                                            to envision Microsoft as the beneficent hero, dedicated to saving the RSS
                                            world from malformed XML. That's just too fantastic to be believable.

                                            Regards
                                            James
                                          • A. Pagaltzis
                                            Message 21 of 22 , Apr 8, 2006
                                              * James Holderness <j4_james@...> [2006-04-08 03:15]:
                                              > I find it difficult to envision Microsoft as the beneficent
                                              > hero, dedicated to saving the RSS world from malformed XML.
                                              > That's just too fantastic to be believable.

                                              I don't think it was pure laziness. I think they found that only
                                              a very minimal number of feeds is actually malformed once you get
                                              past encoding problems (which has certainly been my experience,
                                              and the data published by the Google Reader team agrees), and so
                                              they decided that taking a stance would not cause them too much
                                              trouble. My take is that it wasn't just laziness and wasn't just
                                              heroism; it was a dash of heroism mixed with equal parts laziness
                                              and realism.

                                              Regards,
                                              --
                                              Aristotle Pagaltzis // <http://plasmasturm.org/>
                                            • James Holderness
                                              Message 22 of 22 , Apr 8, 2006
                                                A. Pagaltzis wrote:
                                                > a very minimal number of feeds is actually malformed once you get
                                                > past encoding problems (which has certainly been my experience,
                                                > and the data published by the Google Reader team agrees)

                                                I've only seen the one article in the Google Reader Blog [1], in which they
                                                claim about seven percent of all feeds have XML errors. I could be mistaken,
                                                but I don't think they were even considering the charset issues of RFC 3023.
                                                15.6% of the errors were feeds claiming to be UTF-8 containing invalid
                                                characters, but I suspect that's a result of the encoding in the XML header
                                                (or relying on the defaults) rather than charset issues from the HTTP
                                                headers.

                                                So unless I'm missing something, I would expect IE7 to reject all of those
                                                feeds. That amounts to seven in every hundred, about one in fourteen. That's
                                                a huge number IMO. Although, frankly, I don't care. IE7 can reject all the
                                                feeds it wants. I'm not going to use their aggregator, I'm not going to use
                                                their API, so it really doesn't affect me.
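
The arithmetic behind those figures can be checked quickly in Python. This assumes the 15.6% applies to the same seven percent of feeds reported as having XML errors; the variable names are only for illustration.

```python
from fractions import Fraction

error_rate = Fraction(7, 100)         # feeds with XML errors, per the article
one_in = 1 / error_rate               # 100/7, i.e. roughly one feed in fourteen

utf8_share = Fraction(156, 1000)      # 15.6% of the errors are invalid UTF-8
utf8_of_all = error_rate * utf8_share # share of all feeds with that problem

one_in_rounded = round(float(one_in), 2)
utf8_percent = float(utf8_of_all * 100)
```

On these assumptions, roughly one feed in 14.3 has an XML error, and a little over one percent of all feeds claim UTF-8 while containing invalid characters.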

                                                At least for now. I'm trying not to think about the future in which
                                                Microsoft RSS has become the de facto standard.

                                                Regards
                                                James

                                                [1] http://googlereader.blogspot.com/2005/12/xml-errors-in-feeds.html