
Concall: Tue 28/02 Short notes

  • Laurent Le Meur
    Message 1 of 23 , Mar 1, 2006

      Here is the result of yesterday’s confcall.

       

      - Report of actions

      Dave has added an appendix to Reuters EP1 report, with Reuters extension to the draft standard.

      - Remove semantic equivalence notes from the Tech Spec?

      Agreed. Separate documents (with comparison tables) will be drafted, taking Tech Spec v18 as their starting point, and this information will be removed from the Tech Spec.

      - Rename @content to @literal?

      Result: agreed.

      - Catalog: interaction between the items in a message?

      There is no interaction between the items in a message as far as the scheme declarations are concerned.

      - Catalog: ban relative URIs for remote catalogs?

      Result: agreed. Rationale: relative URIs don’t provide a proper key for (http) caches, and are fragile.
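
As an illustration of this rule, a processor could reject relative catalog references with a check along these lines. This is only a minimal sketch in Python; the function name and the example URIs are assumptions made for illustration and are not part of the draft.

```python
from urllib.parse import urlsplit


def is_absolute_catalog_uri(uri: str) -> bool:
    """Return True when the URI carries both a scheme and a network location.

    Hypothetical helper: the notes only say that relative URIs are banned
    for remote catalogs, not how a processor should check for them.
    """
    parts = urlsplit(uri)
    return bool(parts.scheme) and bool(parts.netloc)


# An absolute http URI gives a cache a stable key.
print(is_absolute_catalog_uri("http://example.org/catalogs/cat1.xml"))
# A relative reference depends on where the referencing document lives.
print(is_absolute_catalog_uri("../catalogs/cat1.xml"))
```

The first call should report an acceptable URI and the second a rejected one, since the relative reference has neither scheme nor host.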

      - Catalog: allow <catalog> to take a <title> child element?

      Result: agreed, perceived as useful, in particular in a catalog file.

      - Catalog: allow the overriding of external scheme declarations through the use of inline ones?

      Result: interesting, but we have spotted a problem with the processing of clashing aliases, so the issue is deferred.

      - Catalog: Indefinite caching can be problematic if a catalog contains an error, and must be corrected.

      Result: this question is deferred to mail exchanges.

      - Catalog: clashing scheme declaration

      Result: Reconsider the processing for alias clashes.
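
One way to picture an alias clash is to compare two sets of scheme declarations. The following is a rough sketch in Python, assuming (this is not stated in the notes) that each declaration simply maps an alias to a scheme URI; the function name, aliases and URIs are invented for illustration.

```python
from typing import Dict, Tuple


def find_alias_clashes(
    external: Dict[str, str], inline: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return aliases declared in both places with different scheme URIs."""
    clashes = {}
    for alias in sorted(external.keys() & inline.keys()):
        if external[alias] != inline[alias]:
            clashes[alias] = (external[alias], inline[alias])
    return clashes


# Hypothetical declarations: "subj" is bound to two different URIs.
external_catalog = {
    "subj": "http://example.org/schemes/subject/",
    "geo": "http://example.org/schemes/geo/",
}
inline_declarations = {
    "subj": "http://example.com/other/subject/",
    "role": "http://example.org/schemes/role/",
}

print(find_alias_clashes(external_catalog, inline_declarations))
```

The sketch only detects the clash; which declaration should win is exactly the open question.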

      - Catalog: Should we separate the catalog ID from the catalog location? Should the catalog be an item?

      Result: Deferred.

      - handling CDATA in a news item

      Agreed: CDATA content will be implemented as content of an <encodedContent encoding="enc:CDATA"> element.

      Agreed: @encoding will now be defined at the core conformance level with two recommended values: "enc:base64" (default) and "enc:CDATA".

      - person names

      Result: agreed to define <name part="NMTOKEN" role="NMTOKENS">xxxxx</name>

      See "http://groups.yahoo.com/group/newsml-2/message/275".

      - group of I18N attributes: on which data types and elements?

      Result: agreed to define them on all elements that contain free-form text (e.g. labels) and on all ancestors of these elements. In practice, only elements of type integer, date, etc. do not support this group.

      - side-effects of defining one namespace for NAR elements?

       Q1 - how to specify standards built on the NAR?

        1/ content components could be defined in another ns (same situation as XHTML used as content of a News Item)

        2/ all kinds of Items could be defined in the NAR ns (so a News Item and an EventsML Assignment Item would share the same ns)

        3/ all metadata properties, if they don't collide with existing NAR properties, would be defined in the NAR ns.

        4/ all entity components (person, organisation ...), used as content of a Topic Item, would be defined in the NAR ns.

      Q2 - how to set the boundaries for a standard specification?

        An IPTC standard would be associated with a set of xml schemas.

      Q3 - could an IPTC std disallow a feature defined in the NAR?

        No

      Q4 – could an IPTC std adopt a part of the NAR only?  

        The NAR is made of 5 building blocks

        a) framework

        b) News Item

        c) Package Item

        d) Topic Item

        e) News Message

        An IPTC std must use the framework.

        An IPTC std is free to pick one or more NAR Items and import them as part of the std.

        Note from llm (after the concall): The use of the News Message for information exchange is a generic IPTC feature, provided “for free” by the NAR, outside of the std boundaries

      Q5 - can standards be updated individually?

        Yes, as long as they reside in a specific schema file

      Q6 - what has to be considered for an update to the NAR?

        When the NAR is updated, it is up to individual stds to update their specs at their own pace.

      Result: we shall use one namespace for all NAR elements if no schema-related constraint stops us from doing so.

       

      Laurent

       



      -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

      This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

      For more information on Agence France-Presse, please visit our web site at http://www.afp.com
      -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



    • John Cowan
      Message 2 of 23 , Mar 1, 2006
        Laurent Le Meur scripsit:

        > - handling CDATA in a news item
        >
        > Agreed: CDATA content will be implemented as content of an <encodedContent
        > encoding="enc:CDATA"> element.
        >
        > Agreed: @encoding will be now defined at the core conformance level with two
        > recommended values: "enc:base64" (default) and "enc:CDATA".

        I believe this decision should be reconsidered.

        CDATA sections do not really constitute an encoding. There is no difference
        at the infoset level between "<![CDATA[<not>an element</not>]]>"
        and "&lt;not&gt;an element&lt;/not&gt;", and few XML parsers even bother
        to report the distinction. DOM provides a mechanism for doing so, but
        says that reporting CDATA sections as simple text is conformant behavior.

        So any element that's allowed to contain character content may contain
        one or more CDATA sections or may use escaping as a free choice of the
        document creator. That being so, "enc:CDATA" really says "no encoding
        at all".
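
The point can be checked with any ordinary parser. Here is a small sketch using Python's standard xml.etree.ElementTree; the wrapper element name "content" is just a placeholder, not a NAR element.

```python
import xml.etree.ElementTree as ET

# The same character data, once in a CDATA section and once with escaping.
with_cdata = "<content><![CDATA[<not>an element</not>]]></content>"
with_escapes = "<content>&lt;not&gt;an element&lt;/not&gt;</content>"

text_from_cdata = ET.fromstring(with_cdata).text
text_from_escapes = ET.fromstring(with_escapes).text

# The parser hands back plain character data in both cases and does not
# report which syntax the document creator chose.
print(text_from_cdata == text_from_escapes)
print(text_from_cdata)
```

Both documents yield the same string, so a receiver working from the parsed tree cannot tell them apart.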

        --
        John Cowan cowan@...
        http://www.ccil.org/~cowan http://www.ap.org
        Thor Heyerdahl recounts his attempt to prove Rudyard Kipling's theory
        that the mongoose first came to India on a raft from Polynesia.
        --blurb for Rikki-Kon-Tiki-Tavi
      • Laurent Le Meur
        Message 3 of 23 , Mar 2, 2006
          John,

          We currently have 3 variants of content wrappers:
          - directContent handles XML. It needs an XML element as a child.
          - remoteContent handles remote content.
          - encodedContent handles characters that are logically encoded (base64, binhex,
          why not quoted-printable).

          So, what would be your choice for handling CDATA or simply plain text (with line
          breaks I guess)?

          1/ use one of the existing elements?
          2/ create a specific one?

          Our current choice is to extend the use of encodedContent with a value meaning,
          as you say, not really encoded but rather plain text.

          Laurent


        • John Cowan
          Message 4 of 23 , Mar 2, 2006
            Laurent Le Meur scripsit:
            > John,
            >
            > We currently have 3 variants of content wrappers:
            > - directContent handles XML. It needs an XML element as a child.
            > - remoteContent handles remote content.
            > - encodedContent handles characters that are logically encoded (base64, binhex,
            > why not quoted-printable).
            >
            > So, what would be your choice for handling CDATA or simply plain-text (with line
            > breaks I guess)?

            In that case I'd say "enc:none", since there is no (content transfer) encoding.
            I assume that a media type is also allowed, so people can tell if this is
            image/jpeg or text/plain or text/html or whatever.

            --
            Híggledy-pìggledy / XML programmers John Cowan
            Try to escape those / I-eighteen-N woes; http://www.ccil.org/~cowan
            Incontrovertibly / What we need more of is http://www.ap.org
            Unicode weenies and / François Yergeaus. cowan@...
          • Michael Steidl/MDir IPTC
            Message 5 of 23 , Mar 3, 2006
              Sometimes postings to this group address a big bunch of different issues - like
              e.g. notes from the developer's group conference calls.

              If you reply not to the notes as a whole but only to a single issue (or a few
              issues)...

              **... please change the Subject of your posting to correspond with the issue**.

              This will make it easier to browse the many postings which appear on this group - thank
              you for your contributions.

              Michael
              ==================================================
              Sent by:
              Michael Steidl
              Managing Director of the IPTC <mdirector@...>
              International Press Telecommunications Council
              "Information Technology for News"
              Visit us on the web at http://www.iptc.org
            • Laurent Le Meur
              Message 6 of 23 , Mar 3, 2006
                > In that case I'd say "enc:none", since there is no (content transfer)
                > encoding.

                Fair enough

                > I assume that a media type is also allowed, so people can tell if this is
                > image/jpeg or text/plain or text/html or whatever.

                Yes

                Laurent

              • John Cowan
                Message 7 of 23 , Mar 3, 2006
                  Laurent Le Meur scripsit:

                  > > In that case I'd say "enc:none", since there is no (content transfer)
                  > > encoding.
                  >
                  > Fair enough

                  In fact, I'd go further and propose changing the name to "simpleContent"
                  with a default encoding of "enc:none" and a default media type of
                  "text/plain", since I suspect that very common use cases will be plain
                  text and escaped HTML (which would require a media type but no encoding).
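
To make the proposal concrete, here is a minimal sketch in Python of how a receiver might read such an element. It assumes the proposed name "simpleContent" and the defaults given above; the attribute name "mediatype" and the function name are assumptions for illustration only, since no attribute name for the media type has been stated in this thread.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Tuple

DEFAULT_ENCODING = "enc:none"      # proposed default: no content transfer encoding
DEFAULT_MEDIA_TYPE = "text/plain"  # proposed default media type


def read_simple_content(xml_text: str) -> Tuple[str, bytes]:
    """Return (media type, payload bytes) for one simpleContent element."""
    element = ET.fromstring(xml_text)
    encoding = element.get("encoding", DEFAULT_ENCODING)
    # "mediatype" is a hypothetical attribute name.
    media_type = element.get("mediatype", DEFAULT_MEDIA_TYPE)
    raw = element.text or ""
    if encoding == "enc:none":
        # Escaping or CDATA has already been resolved by the parser.
        payload = raw.encode("utf-8")
    elif encoding == "enc:base64":
        payload = base64.b64decode(raw)
    else:
        raise ValueError(f"unsupported encoding: {encoding}")
    return media_type, payload


plain = "<simpleContent>Hello, world</simpleContent>"
html = (
    '<simpleContent mediatype="text/html">'
    "One &lt;strong&gt;bold&lt;/strong&gt; foot forward"
    "</simpleContent>"
)
binary = (
    '<simpleContent encoding="enc:base64" mediatype="image/jpeg">'
    "/9j/4A==</simpleContent>"
)

for sample in (plain, html, binary):
    print(read_simple_content(sample))
```

Plain text needs no attributes at all, escaped HTML needs only a media type, and binary data needs both, which is the split of use cases described above.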

                  I may be blundering about in a minefield here.

                  --
                  Possession is said to be nine points of the law, John Cowan
                  but that's not saying how many points the law might have. cowan@...
                  --Thomas A. Cowan (law professor and my father)
                • Michael Steidl/MDir IPTC
                  Message 8 of 23 , Mar 7, 2006
                    I went back to the start of this thread, Laurent's conf call notes of 28/2, and
                    even beyond this:

                    - what was the initial "processing use case" for CDATA: to wrap plain text.
                    Hence the initial question is: how to convey plain text in a News Item.

                    - currently three elements are defined for the <contentSet>: <directContent>
                    for XML data, <encodedContent> for any kind of data, which has to be
                    encoded (base64 by default), and finally <remoteContent> for any kind of data
                    outside this XML instance.

                    - so back to square 1: which of these elements is best to use for this purpose,
                    or should we create a new element?

                    - pro/con <directContent>: would work with CDATA but this requires at least one
                    XML element as a wrapper, this requires a (provider specific) specification in
                    its own namespace etc ... - not that easy to handle.

                    -- my proposal regarding this: think of an optional <cdataCont> element under
                    <directContent> which could act as a predefined wrapper for CDATA.

                    - pro/con <encodedContent>: should work with plain text but blows up the size.

                    - pro/con <remoteContent>: should work with plain text but handling an external
                    file for each piece of plain text adds complexity to processing.

                    Considering all this, I stick to the proposal of a predefined wrapper element
                    for CDATA in <directContent>.

                    Michael


                  • John Cowan
                    Message 9 of 23 , Mar 7, 2006
                      Michael Steidl/MDir IPTC scripsit:

                      > - pro/con <directContent>: would work with CDATA but this requires at
                      > least one XML element as a wrapper, this requires a (provider specific)
                      > specification in its own namespace etc ... - not that easy to handle.

                      In order to avoid this arbitrary wrapper element, I am therefore proposing
                      to allow non-encoded content in encodedContent and change its name to
                      simpleContent. This seems to me to be a very clean solution.

                      > - pro/con <encodedContent>: should work with plain text but blows up
                      > the size.

                      Allowing enc:none as the encoding type would eliminate this problem.

                      --
                      John Cowan cowan@... www.ap.org www.ccil.org/~cowan
                      Assent may be registered by a signature, a handshake, or a click of a computer
                      mouse transmitted across the invisible ether of the Internet. Formality
                      is not a requisite; any sign, symbol or action, or even willful inaction,
                      as long as it is unequivocally referable to the promise, may create a contract.
                      --Specht v. Netscape
                    • Laurent Le Meur
                      Message 10 of 23 , Mar 8, 2006

                        Ok, back to plain text and CDATA and escaped HTML.

                         

                        We now have 3 different proposals:

                          1/ rename <encodedContent encoding=> to <simpleContent encoding=> and … (JC)

                          2/ add a new element <simpleContent> or <plainContent>

                          3/ (new, MS) create an optional <cdata> sub-element of directContent

                        Samples:

                        Proposal 3:

                        <contentSet>
                          <directContent> for XML data
                          <directContent><cdata> for plain text and CDATA
                          <encodedContent> for any kind of encoded data
                          <remoteContent>

                        Proposal 1:

                        <contentSet>
                          <directContent> for XML data
                          <simpleContent encoding="..."> for plain text, CDATA and encoded data
                          <remoteContent>

                        Proposal 2:

                        <contentSet>
                          <directContent> for XML data
                          <plainContent> for plain text and CDATA
                          <encodedContent encoding="..."> for encoded data
                          <remoteContent>


                        A look at Atom:

                        http://www.xml.com/pub/a/2005/12/07/handling-atom-text-and-content-constructs.html

                         

                        First, a note: we could still adopt the Atom content model, but at the price of very lax XSD schema rules. E.g. it would be technically possible to have both an href attribute AND embedded content. This is not the current choice.

                         

                        Atom creates three specific values for the type attribute: text, html and xhtml. ‘text’ means "treat me as plain text", and html means "treat this escaped text as html". xhtml means "inside is a div element which contains the content".

                         

                        Then a quote: "CDATA sections are nothing but syntactic sugar and do not in any way affect the core semantic issues of escaped markup". It means that

                        <content>One &lt;strong&gt;bold&lt;/strong&gt; foot forward</content>

                        and

                        <content><![CDATA[One <strong>bold</strong> foot forward]]></content>

                        are equivalent, and both forms can live together in the same element.

                         

                        Therefore <cdata> is not a proper name for an element: <plain> seems better.

                         

                        I propose that if you see a good reason to choose one solution or the other, you post a comment on this list.

                         

                        Laurent

                         

                         

                         

                         




                      • Johan Lindgren
                        Message 11 of 23 , Mar 8, 2006
                          newsml-2@yahoogroups.com wrote on 3 March 2006 at 09:32 +0000:
                          >Sometimes postings to this group address a big bunch of different issues
                          >- like
                          >e.g. notes from the developer's group conference calls.

                          I find it even more confusing that the conf-call of March 7 is titled
                          07/02 and that different notes in various versions turn up both in the
                          Newsml-2 list and in the NAR-list.

                          johan
                        • Laurent Le Meur
                          Message 12 of 23 , Mar 9, 2006
                            Apologies for the wrong date. I'm short of time, and sometimes an error happens.

                            The official notes are on the nar-dev list.
                            I "simplify" them for the newsml-2 list.

                            The issue is: external readers have no interest in who was present at the
                            confcall or who voted yes or no.
                            But they need to know whether a publicly discussed issue was solved or not during an
                            IPTC confcall.

                            This is the issue with a semi-public initiative. We cannot do the work
                            totally internally, as IPTC member companies don't allocate many resources to it.
                            And we can't do it totally publicly, as the IPTC management would like more
                            companies to join, and we could run into IPR issues (Michael will talk about this at
                            Vancouver I guess).

                            Laurent


                            > -----Original Message-----
                            > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On behalf of
                            > Johan Lindgren
                            > Sent: Thursday, 9 March 2006 08:17
                            > To: newsml-2@yahoogroups.com
                            > Subject: Re: [newsml-2] Group's netiquette reminder: subject line updates
                            >
                            > newsml-2@yahoogroups.com wrote on 3 March 2006 at 09:32 +0000:
                            > >Sometimes postings to this group address a big bunch of different issues
                            > >- like
                            > >e.g. notes from the developer's group conference calls.
                            >
                            > I find it even more confusing that the conf-call of march 7 is titled
                            > 07/02 and that different notes in various versions turn up both in the
                            > Newsml-2 list and in the NAR-list.
                            >
                            > johan



                            -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

                            This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

                            For more information on Agence France-Presse, please visit our web site at http://www.afp.com

                            -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                          • Laurent Le Meur
                            Message 13 of 23, Mar 9, 2006

                              I go with J.Cowan and prefer option 1.

                              I'm OK with <plainContent> (what does "simple" mean here?) or <inlineContent>.

                              I propose to follow the Atom choice and disambiguate between escaped HTML and escaped or CDATA plain text.

                              Plain text or CDATA would be indicated by encoding='enc:none'

                              Escaped HTML would be indicated by e.g. encoding='enc:escHTML'

                               

                              Why this choice?

                              Option 2 adds a new element: three are complex enough.

                              Option 3 also adds a new (sub)element (currently proposed as <cdata>; it should be renamed). This element should accept escaped HTML or plain text (CDATA or escaped), and I fear that there will be a need to make clear whether it is supposed to be HTML or plain text (see Atom). So it would impose an attribute on this element.

                               

                              Laurent

                               


                              From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On behalf of Laurent Le Meur
                              Sent: Wednesday, 8 March 2006 22:49
                              To: newsml-2@yahoogroups.com
                              Subject: RE: [newsml-2] handling CDATA

                               

                              Ok, back to plain text and CDATA and escaped HTML.

                               

                              We now have three different proposals:

                                1/ rename <encodedContent encoding=> to <simpleContent encoding=> and … (JC)

                                2/ add a new element <simpleContent> or <plainContent>

                                3/ (new, MS) create an optional <cdata> sub-element of directContent

                               

                              Samples:

                               

                              <contentSet>

                                <directContent> for XML data

                                <directContent><cdata> for plain text and CDATA

                                <encodedContent> for any kind of encoded data

                                <remoteContent>

                               

                              <contentSet>

                                <directContent> for XML data

                                <simpleContent encoding="..."> for plain text, CDATA and encoded data

                                <remoteContent>

                               

                              <contentSet>

                                <directContent> for XML data

                                <plainContent> for plain text and CDATA

                                <encodedContent encoding="..."> for encoded data

                                <remoteContent>

                               

                              A look at Atom:

                              http://www.xml.com/pub/a/2005/12/07/handling-atom-text-and-content-constructs.html

                               

                              First, a note: we could still adopt the Atom content model, but at the price of very lax XSD schema rules. E.g. it would be technically possible to have both an href attribute AND embedded content. That is not the current choice.

                               

                              Atom creates three specific values for the type attribute: text, html and xhtml. ‘text’ means "treat me as plain text", and html means "treat this escaped text as html". xhtml means "inside is a div element which contains the content".
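To make the three Atom cases concrete, here is a minimal Python sketch of how a consumer might branch on the type attribute. It assumes the standard library's xml.etree.ElementTree; the function name read_atom_text is hypothetical, and the sample elements omit the Atom namespace for brevity.

```python
import xml.etree.ElementTree as ET

XHTML_DIV = "{http://www.w3.org/1999/xhtml}div"

def read_atom_text(elem):
    """Return (kind, payload) for an Atom-style text construct.

    Hypothetical helper: 'text' and 'html' both yield a string (the XML
    parser has already undone the escaping); 'xhtml' yields the div element.
    """
    kind = elem.get("type", "text")  # Atom treats a missing type as "text"
    if kind in ("text", "html"):
        return kind, elem.text or ""
    if kind == "xhtml":
        return kind, elem.find(XHTML_DIV)
    raise ValueError(f"unsupported type: {kind}")

text_el = ET.fromstring('<title type="text">1 &lt; 2</title>')
html_el = ET.fromstring(
    '<title type="html">One &lt;strong&gt;bold&lt;/strong&gt; step</title>')
xhtml_el = ET.fromstring(
    '<title type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">'
    'One <strong>bold</strong> step</div></title>')

print(read_atom_text(text_el))
print(read_atom_text(html_el))
print(read_atom_text(xhtml_el)[1].tag)
```

After parsing, the "text" and "html" payloads are both plain strings; only the type label tells the consumer whether `<strong>` is literal text or markup to be rendered.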

                               

                              Then a quote: "CDATA sections are nothing but syntactic sugar and do not in any way affect the core semantic issues of escaped markup". It means that

                              <content>One &lt;strong&gt;bold&lt;/strong&gt; foot forward</content>

                              and

                              <content><![CDATA[One <strong>bold</strong> foot forward]]></content>

                              can live together in the same element.

                               

                              Therefore <cdata> is not a proper name for an element: <plain> seems better.
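The "syntactic sugar" point can be checked with a short Python sketch, assuming the standard library's xml.etree.ElementTree. The third document, which mixes escaped text and a CDATA section inside one element, is an added illustration and not part of the quoted article.

```python
import xml.etree.ElementTree as ET

escaped = "<content>One &lt;strong&gt;bold&lt;/strong&gt; foot forward</content>"
cdata = "<content><![CDATA[One <strong>bold</strong> foot forward]]></content>"
# Added illustration: character data and a CDATA section in the same element.
mixed = "<content>One <![CDATA[<strong>bold</strong>]]> foot forward</content>"

# The parser reports the same character data in all three cases.
texts = [ET.fromstring(doc).text for doc in (escaped, cdata, mixed)]
for text in texts:
    print(repr(text))
```

Since the application cannot tell the three forms apart after parsing, an element name such as `<cdata>` would describe a serialization detail, not the content.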

                               

                              I propose that if you see a good reason to choose one solution or the other, you post a comment on this list.

                               

                              Laurent

                               

                               

                               

                               

                              > -----Original Message-----

                              > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On behalf of

                              > John Cowan

                              > Sent: Tuesday, 7 March 2006 16:56

                              > To: newsml-2@yahoogroups.com

                              > Subject: Re: [newsml-2] handling CDATA

                              >

                              > Michael Steidl/MDir IPTC scripsit:

                              >

                              > > - pro/con <directContent>: would work with CDATA but this requires at

                              > > least one XML element as a wrapper, this requires a (provider specific)

                              > > specification in its own namespace etc ... - not that easy to handle.

                              >

                              > In order to avoid this arbitrary wrapper element, I am therefore proposing

                              > to allow non-encoded content in encodedContent and change its name to

                              > simpleContent.  This seems to me to be a very clean solution.

                              >

                              > > - pro/con <encodedContent>: should work with plain text but blows up

                              > > the size.

                              >

                              > Allowing enc:none as the encoding type would eliminate this problem.

                              >

                              > --

                              > John Cowan  cowan@...  www.ap.org  www.ccil.org/~cowan

                              > Assent may be registered by a signature, a handshake, or a click of a computer

                              > mouse transmitted across the invisible ether of the Internet. Formality

                              > is not a requisite; any sign, symbol or action, or even willful inaction,

                              > as long as it is unequivocally referable to the promise, may create a

                              > contract.

                              >        --Specht v. Netscape


                            • John Cowan
                              Message 14 of 23, Mar 9, 2006
                                Laurent Le Meur scripsit:

                                > I go with J.Cowan and prefer option 1.
                                >
                                > I'm ok with <plainContent> (what is simple here) or <inlineContent>.
                                >
                                > I propose to follow the Atom choice and disambiguate between escaped html and
                                > escaped or CDATA plain-text.
                                >
                                > Plain-text or cdata would be indicated by encoding='enc:none'
                                >
                                > Escaped html would be indicated by eg encoding='enc:escHTML'

                                Isn't it enough to set the encoding to enc:none and the media type to
                                text/html in this case?

                                Escaping is not encoding; it will be necessary to escape & and < characters
                                even in plain text, since the enclosing document remains XML.
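The difference between escaping and encoding can be sketched in Python as follows. This is only an illustration under assumptions: the element name simpleContent and the value enc:none come from the proposals in this thread, while enc:base64 and the helper read_content are hypothetical names chosen for the example.

```python
import base64
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

plain = "AT&T shares: 1 < 2\nsecond line"

# Escaping: required by XML itself, undone by every XML parser.
escaped = escape(plain)
# Encoding: a transformation the application must undo (and it grows the data).
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

doc_escaped = f'<simpleContent encoding="enc:none">{escaped}</simpleContent>'
doc_encoded = f'<simpleContent encoding="enc:base64">{encoded}</simpleContent>'

def read_content(xml_text):
    """Hypothetical reader: return the original text of a simpleContent element."""
    elem = ET.fromstring(xml_text)
    if elem.get("encoding") == "enc:none":
        return elem.text  # the parser already reversed the escaping
    return base64.b64decode(elem.text).decode("utf-8")

print(escaped)
print(len(plain), len(encoded))
print(read_content(doc_escaped) == read_content(doc_encoded) == plain)
```

With enc:none the consumer does nothing beyond parsing, which is why escaping does not need its own encoding value; the base64 form costs roughly a third more characters.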

                                Any of simpleContent, plainContent, or inlineContent is fine with me.

                                --
                                John Cowan  cowan@...  http://www.ap.org  http://www.ccil.org/~cowan
                                You know, you haven't stopped talking since I came here. You must have been
                                vaccinated with a phonograph needle.  --Rufus T. Firefly
                              • Laurent Le Meur
                                Message 15 of 23, Mar 10, 2006
                                  > Isn't it enough to set the encoding to enc:none and the media type to
                                  > text/html in this case?

                                  I wonder why the Atom WG did not choose this kind of solution.
                                  Does somebody know?
                                  Laurent

                                • Michael Steidl/MDir IPTC
                                  Message 16 of 23, Mar 10, 2006
                                    Sorry to be a bother, but:

                                    - looking back to the roots of this discussion: early this week I pointed out that it started
                                    with tackling the issue of how to preserve line breaks, tabs etc. ("white space") in plain
                                    text content. Laurent proposed to use CDATA for this purpose, and added that this also
                                    helps with escaping e.g. HTML.

                                    - re-reading the XML (1.1) spec on CDATA, and remembering when I checked the issue of
                                    preserving white space in XML in general, I'm no longer convinced that CDATA will
                                    provide this feature. XML 1.1 says: [CDATA sections] are used to escape blocks of
                                    text containing characters which would otherwise be recognized as markup.

                                    Taken literally, this is *only* about shielding characters such as <, >, & etc.
                                    from being recognized as markup by the parser. It says *nothing* about preserving white
                                    space. I feel this still has to be tackled in the scope of XML white space handling:

                                    http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-white-space
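A small Python check, assuming the standard library's xml.etree.ElementTree, suggests that CDATA makes no difference to white space: the parser hands over the same character data with or without it. The element name t and the sample text are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Sample text with a tab, an empty line and runs of spaces (no markup characters).
body = "Line 1\n\tindented line 2\n\n  Line 4 with  two spaces"

as_text = f"<t>{body}</t>"
as_cdata = f"<t><![CDATA[{body}]]></t>"

text_plain = ET.fromstring(as_text).text
text_cdata = ET.fromstring(as_cdata).text

print(repr(text_plain))
print(text_plain == text_cdata == body)
```

What an application does with that white space after parsing (collapse it, reflow it, keep it) is a separate decision, which is the part the white space section of the XML spec addresses.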

                                    - re the structure issue discussed under this subject lately:

                                    -- the children of the "contentSet" should reflect a straightforward model. Currently
                                    <remoteContent>, for any content external to the XML instance, is undisputed.

                                    -- the logical sibling name would be <inlineContent>, making a statement about where
                                    to find the content.

                                    -- but how to structure <inlineContent>:

                                    --- first we have to cover the XML structured content of any namespace (=
                                    "extensibility point" in IPTC speak).

                                    --- then we have to cover a string representing encoded and escaped content

                                    I propose this:

                                    <inlineContent>
                                    <xmlCont>... xml from any namespace ...</xmlCont>
                                    <escCont escaping="esc:cdata (or: esc:html)"> ... CDATA ... </escCont>
                                    <encCont encoding="base64">.... encoded data ...</encCont>
                                    </inlineContent>

                                    * the child elements are mutually exclusive
                                    * I prefer to disambiguate escaped and encoded content (as John pointed out) by the
                                    element and not by an attribute's value.
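The "mutually exclusive" rule could be enforced by a schema choice; as a rough illustration, the same check can be sketched in Python. The element names come from the proposal above, the helper inline_kind is hypothetical, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Child element names taken from the proposal; everything else is illustrative.
ALLOWED = {"xmlCont", "escCont", "encCont"}

def inline_kind(inline):
    """Return the name of the single allowed child of an inlineContent element."""
    children = list(inline)
    if len(children) != 1 or children[0].tag not in ALLOWED:
        raise ValueError("inlineContent must hold exactly one of: "
                         + ", ".join(sorted(ALLOWED)))
    return children[0].tag

ok_doc = ('<inlineContent><escCont escaping="esc:cdata">1 &lt; 2</escCont>'
          '</inlineContent>')
bad_doc = ('<inlineContent><escCont escaping="esc:cdata">a</escCont>'
           '<encCont encoding="base64">YQ==</encCont></inlineContent>')

print(inline_kind(ET.fromstring(ok_doc)))
try:
    inline_kind(ET.fromstring(bad_doc))
except ValueError as err:
    print("rejected:", err)
```

Because the kind of content is carried by the element name, a consumer can dispatch on the tag alone without inspecting attribute values.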

                                    Michael



                                    ==================================================
                                    Sent by:
                                    Michael Steidl
                                    Managing Director of IPTC <mdirector@...>
                                    International Press Telecommunications Council
                                    Working to improve the efficiency of News exchange.
                                    Visit our Web Site at http://www.iptc.org
                                  • Laurent Le Meur
                                    Message 17 of 23, Mar 10, 2006
                                      Michael,
                                      it seems you are right about the scope of CDATA.

                                      In fact XML defines a specific attribute, xml:space, for space preservation purposes:
                                      http://www.w3.org/TR/REC-xml/
                                      "A special attribute named xml:space MAY be attached to an element to signal an
                                      intention that in that element, white space should be preserved by applications.
                                      In valid documents, this attribute, like any other, MUST be declared if it is
                                      used. When declared, it MUST be given as an enumerated type whose values are one
                                      or both of "default" and "preserve"."
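As a rough sketch of how an application might honour xml:space, the following Python example reads the attribute with the standard library's xml.etree.ElementTree. The helper display_text and its collapsing policy are assumptions made for the illustration, and inheritance of xml:space from ancestor elements is deliberately not handled.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predeclared and bound to this namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def display_text(elem):
    """Hypothetical policy: keep white space only when xml:space="preserve"."""
    text = elem.text or ""
    if elem.get(XML_SPACE) == "preserve":
        return text
    return " ".join(text.split())  # collapse runs of white space

kept = ET.fromstring('<content xml:space="preserve">Line 1\n  Line 2</content>')
collapsed = ET.fromstring('<content>Line 1\n  Line 2</content>')

print(repr(display_text(kept)))
print(repr(display_text(collapsed)))
```

The parser delivers the same text in both cases; the attribute only signals the author's intention, and it is the application that acts on it.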

                                      But I'm not a fan of the additional nesting you propose. It goes back to
                                      NewsML1 nesting.

                                      And now I don't see the benefit of treating CDATA content and escaped
                                      content separately (the processing is done by the parser, isn't it?).

                                      I prefer:
                                      <news:item schema="0.6" guid="xxxx" version="1"
                                      xmlns:news="urn:iptc:std:news:1.0:xmlns" xmlns="urn:iptc:std:newsml:2.0:xmlns">
                                      <itemMeta />
                                      <contentMeta />
                                      <news:contentSet>
                                      <news:encodedContent type="text/plain">
                                      Value 1 &lt; Value 2
                                      Line 2
                                      Line 3
                                      </news:encodedContent>
                                      <news:inlineContent type="text/xml+xhtml">
                                      <html/>
                                      </news:inlineContent>
                                      </news:contentSet>
                                      </news:item>

                                      rather than:
                                      <news:item schema="0.6" guid="xxxx" version="1"
                                      xmlns:news="urn:iptc:std:news:1.0:xmlns" xmlns="urn:iptc:std:newsml:2.0:xmlns">
                                      <itemMeta />
                                      <contentMeta />
                                      <news:contentSet>
                                      <news:inlineContent>
                                      <news:escCont type="text/plain">
                                      Value 1 &lt; Value 2
                                      Line 2
                                      Line 3
                                      </news:escCont>
                                      </news:inlineContent>
                                      <news:inlineContent>
                                      <news:extCont type="text/xml+xhtml">
                                      <html/>
                                      </news:extCont>
                                      </news:inlineContent>
                                      </news:contentSet>
                                      </news:item>

                                      Re <xmlCont>: please remember that no XML element name should begin with "xml", if I
                                      remember correctly.

                                      Laurent


                                      > -----Original Message-----
                                      > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On behalf of
                                      > Michael Steidl/MDir IPTC
                                      > Sent: Friday, 10 March 2006 10:11
                                      > To: newsml-2@yahoogroups.com
                                      > Subject: RE: [newsml-2] handling CDATA
                                      >
                                      > Sorry to be a bother, but:
                                      >
                                      > - Looking back to the roots of this discussion: early this week I pointed out
                                      > that it started with the issue of how to preserve line breaks, tabs etc.
                                      > ("white space") in plain text content. Laurent proposed to use CDATA for this
                                      > purpose, and added that this also helps with escaping e.g. HTML.
                                      >
                                      > - Re-reading the XML 1.1 spec on CDATA, and remembering when I checked the
                                      > issue of preserving white space in XML in general, I'm no longer convinced
                                      > that CDATA will provide this feature. XML 1.1 says: [CDATA sections] are used
                                      > to escape blocks of text containing characters which would otherwise be
                                      > recognized as markup.
                                      >
                                      > Taking this literally, this is *only* about escaping markup characters like
                                      > <, >, & etc. from recognition by the parser. It says *nothing* about
                                      > preserving white space. I feel this still has to be tackled in the scope of
                                      > XML white space handling:
                                      >
                                      > http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-white-space
                                      >
                                      > - Re the structure issue discussed under this subject lately:
                                      >
                                      > -- The children of the "contentSet" should reflect a straightforward model.
                                      > Currently <remoteContent> for any content external to the XML instance is
                                      > undisputed.
                                      >
                                      > -- The logical sibling name would be <inlineContent>, making a statement about
                                      > where to find the content.
                                      >
                                      > -- But how to structure <inlineContent>?
                                      >
                                      > --- First we have to cover the XML structured content of any namespace (=
                                      > "extensibility point" in IPTC speak).
                                      >
                                      > --- Then we have to cover a string representing encoded and escaped content.
                                      >
                                      > I propose this:
                                      >
                                      > <inlineContent>
                                      > <xmlCont>... xml from any namespace ...</xmlCont>
                                      > <escCont escaping="esc:cdata (or: esc:html)"> ... CDATA ... </escCont>
                                      > <encCont encoding="base64">.... encoded data ...</encCont>
                                      > </inlineContent>
                                      >
                                      > * The child elements are mutually exclusive.
                                      > * I prefer to disambiguate escaped and encoded content (as John pointed out)
                                      > by the element and not by an attribute's value.
                                      >
                                      > Michael
                                      >
                                      >
                                      > On 10 Mar 2006 at 9:29, Laurent Le Meur wrote:
                                      >
                                      > > > Isn't it enough to set the encoding to enc:none and the media type to
                                      > > > text/html in this case?
                                      > >
                                      > > I wonder why the Atom WG did not choose this kind of solution.
                                      > > Does somebody know?
                                      > > Laurent
                                      > >
                                      > > > -----Original Message-----
                                      > > > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On Behalf Of
                                      > > > John Cowan
                                      > > > Sent: Thursday, 9 March 2006 22:56
                                      > > > To: newsml-2@yahoogroups.com
                                      > > > Subject: Re: [newsml-2] handling CDATA
                                      > > >
                                      > > > Laurent Le Meur scripsit:
                                      > > >
                                      > > > > I go with J.Cowan and prefer option 1.
                                      > > > >
                                      > > > > I'm ok with <plainContent> (what is simple here) or <inlineContent>.
                                      > > > >
                                      > > > > I propose to follow the Atom choice and disambiguate between escaped
                                      > > > > HTML and escaped or CDATA plain text.
                                      > > > >
                                      > > > > Plain text or CDATA would be indicated by encoding='enc:none'
                                      > > > >
                                      > > > > Escaped HTML would be indicated by e.g. encoding='enc:escHTML'
                                      > > >
                                      > > > Isn't it enough to set the encoding to enc:none and the media type to
                                      > > > text/html in this case?
                                      > > >
                                      > > > Escaping is not encoding; it will be necessary to escape & and < characters
                                      > > > even in plain text, since the enclosing document remains XML.
                                      > > >
                                      > > > Any of simpleContent, plainContent, or inlineContent is fine with me.
                                      > > >
                                      > > > --
                                      > > > John Cowan cowan@... http://www.ap.org http://www.ccil.org/~cowan
                                      > > > You know, you haven't stopped talking since I came here. You must have been
                                      > > > vaccinated with a phonograph needle. --Rufus T. Firefly
                                      > > >
                                      >
                                      > ==================================================
                                      > Sent by:
                                      > Michael Steidl
                                      > Managing Director of IPTC <mdirector@...>
                                      > International Press Telecommunications Council
                                      > Working to improve the efficiency of News exchange.
                                      > Visit our Web Site at http://www.iptc.org
                                      >
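
On the white-space question raised in the quoted message: an XML processor is required to pass all character data that is not markup, white space included, through to the application, so a CDATA section changes only how `<` and `&` are written, not what the receiver gets. A small sketch with Python's standard library parser, reusing the three-line sample payload and the proposed `escCont` element name purely as an illustration:

```python
import xml.etree.ElementTree as ET

# Same three-line payload, once in a CDATA section and once with entity escaping.
cdata_doc = "<escCont><![CDATA[Value 1 < Value 2\n\tLine 2\nLine 3]]></escCont>"
escaped_doc = "<escCont>Value 1 &lt; Value 2\n\tLine 2\nLine 3</escCont>"

cdata_text = ET.fromstring(cdata_doc).text
escaped_text = ET.fromstring(escaped_doc).text

# The parser reports identical character data in both cases,
# and the line feeds and the tab survive in both.
print(cdata_text == escaped_text)
print(repr(cdata_text))
```

What happens to that white space afterwards (for example when a stylesheet or an editor re-indents the document) is an application matter, which is where xml:space and the white space section of the spec come in.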



                                      -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

                                      This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

                                      For more information on Agence France-Presse, please visit our web site at http://www.afp.com

                                      -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                                    • Misha Wolf
                                      Message 18 of 23 , Mar 10, 2006
                                        Doesn't a media type apply to an (entire) external resource
                                        rather than to some sequence of characters inside an XML
                                        document?

                                        Misha


                                        -----Original Message-----
                                        From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On
                                        Behalf Of Laurent Le Meur
                                        Sent: 10 March 2006 08:29
                                        To: newsml-2@yahoogroups.com
                                        Subject: RE: [newsml-2] handling CDATA

                                        > Isn't it enough to set the encoding to enc:none and the media type to
                                        > text/html in this case?

                                        I wonder why the Atom WG did not choose this kind of solution.
                                        Does somebody know?
                                        Laurent


                                        To find out more about Reuters visit www.about.reuters.com

                                        Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Reuters Ltd.
                                      • Laurent Le Meur
                                        Message 19 of 23 , Mar 10, 2006
                                          Note: I'm not saying I like the name *encodingContent* for both encoded and
                                          escaped content. *plainContent* would be fine also, or any other name which fits
                                          the idea of a set of characters that is not pure XML.
                                          Laurent
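
To make the "encoded" versus "escaped" distinction concrete, here is a minimal sketch using Python's standard library. The sample payload is made up; base64 is used because it is the encoding named in the proposal quoted further down.

```python
import base64
from xml.sax.saxutils import escape, unescape

payload = "<p>Fish & chips</p>"

# Escaping: the text stays readable; only markup-significant characters are replaced.
escaped = escape(payload)

# Encoding: the UTF-8 bytes are rewritten in a different alphabet (base64 here).
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")

print(escaped)
print(encoded)

# Both forms are reversible, but by different means.
print(unescape(escaped) == payload)
print(base64.b64decode(encoded).decode("utf-8") == payload)
```

An XML parser undoes the escaping by itself when it reports the element text, whereas the receiver has to be told which encoding to undo, which is one argument for signalling the two cases differently.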

                                          >
                                          > I prefer:
                                          > <news:item schema="0.6" guid="xxxx" version="1"
                                          > xmlns:news="urn:iptc:std:news:1.0:xmlns"
                                          > xmlns="urn:iptc:std:newsml:2.0:xmlns">
                                          > <itemMeta />
                                          > <contentMeta />
                                          > <news:contentSet>
                                          > <news:encodedContent type="text/plain">
                                          > Value 1 < Value 2
                                          > Line 2
                                          > Line 3
                                          > </news:encodedContent>
                                          > <news:inlineContent type="text/xml+xhtml">
                                          > <html/>
                                          > </news:inlineContent>
                                          > </news:contentSet>
                                          > </news:item>
                                          >
                                          > To:
                                          > <news:item schema="0.6" guid="xxxx" version="1"
                                          > xmlns:news="urn:iptc:std:news:1.0:xmlns"
                                          > xmlns="urn:iptc:std:newsml:2.0:xmlns">
                                          > <itemMeta />
                                          > <contentMeta />
                                          > <news:contentSet>
                                          > <news:inlineContent>
                                          > <news:escCont type="text/plain">
                                          > Value 1 < Value 2
                                          > Line 2
                                          > Line 3
                                          > </news:escCont>
                                          > </news:inlineContent>
                                          > <news:inlineContent>
                                          > <news:extCont type="text/xml+xhtml">
                                          > <html/>
                                          > </news:extCont>
                                          > </news:inlineContent>
                                          > </news:contentSet>
                                          > </news:item>
                                          >
                                          > Re <xmlCont>: please remember that no XML element name should begin with "xml",
                                          > if I remember correctly.
                                          >
                                          > Laurent
                                          >



                                        • John Cowan
                                          Message 20 of 23 , Mar 10, 2006
                                            Laurent Le Meur scripsit:

                                            > > Isn't it enough to set the encoding to enc:none and the media type to
                                            > > text/html in this case?
                                            >
                                            > I wonder why the Atom WG did not choose this kind of solution.
                                            > Does somebody know?

                                            AFAIK because they did not see a use case for arbitrary embedded content
                                            beyond plain text, HTML, and XHTML. All other content must be remote.

                                            Disclaimer: I wasn't involved with the ins and outs of Atom development.

                                            --
                                            John Cowan cowan@... www.ap.org www.ccil.org/~cowan
                                            If a traveler were informed that such a man [as Lord John Russell] was
                                            leader of the House of Commons, he may well begin to comprehend how the
                                            Egyptians worshiped an insect. --Benjamin Disraeli
                                          • John Cowan
                                            Message 21 of 23 , Mar 10, 2006
                                              Misha Wolf scripsit:

                                              > Doesn't a media type apply to an (entire) external resource
                                              > rather than to some sequence of characters inside an XML
                                              > document?

                                              An entire resource, yes, but not necessarily an external one. Email
                                              is the original context for media types, and every email attachment
                                              bears a media type. (There are even URIs for referring to specific
                                              attachments using the mid: scheme, but it is rarely used.)

                                              --
                                              John Cowan cowan@... http://www.ccil.org/~cowan
                                              Most languages are dramatically underdescribed, and at least one is
                                              dramatically overdescribed. Still other languages are simultaneously
                                              overdescribed and underdescribed. Welsh pertains to the third category.
                                              --Alan King
                                            • Laurent Le Meur
                                              Message 22 of 23 , Mar 10, 2006
                                                > AFAIK because they did not see a use case for arbitrary embedded content
                                                > beyond plain text, HTML, and XHTML. All other content must be remote.

                                                Not really: you'll find clear samples at:
                                                http://www.xml.com/pub/a/2005/12/07/handling-atom-text-and-content-constructs.html
                                                (from Uche Ogbuji)

                                                extract =
                                                a/ plain text
                                                <title type="text">One bold foot forward</title>
                                                b/ escaped html
                                                <title type="html">One &lt;strong&gt;bold&lt;/strong&gt; foot forward</title>
                                                c/ xhtml
                                                <title type="xhtml">
                                                  <div xmlns="http://www.w3.org/1999/xhtml">
                                                    One <strong>bold</strong> foot forward
                                                  </div>
                                                </title>
                                                d/ other xml
                                                <content type="image/svg+xml">
                                                  <svg xmlns="http://www.w3.org/2000/svg"
                                                       width="100px" height="100px">
                                                    <title>Itsy bitsy SVG</title>
                                                    <circle cx="40" cy="25" r="20" style="fill: black;"/>
                                                    <text x="10" y="80" fill="blue">Hello World</text>
                                                  </svg>
                                                </content>
                                                e/ png encoded base64
                                                <content type="image/png">
                                                iVBORw0KGgoAAAANSUhEUgAAAB8AAAAqCAYAAABLGYAnAAAABmJLR0QA/wD/AP+gvaeTAAAACXBI
                                                </content>
                                                f/ remote content
                                                <content src="image.png" type="image/png"/>

                                                All with the same element.

                                                @type contains special values (text, html, xhtml) OR a mime type.
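
To make the @type rule concrete, here is a minimal sketch in Python of how a consumer might decide what to do with an Atom content element, following my reading of the processing rules in the Atom format spec (RFC 4287, section 4.1.3). The function name and the returned labels are only illustrative assumptions, not part of Atom or of any library.

```python
def classify_atom_content(type_attr=None, src=None):
    """Return an illustrative label saying how an atom:content element is carried.

    The labels ("remote", "inline-xml", ...) are hypothetical names for this sketch.
    """
    # A src attribute means the content lives elsewhere and the element is empty.
    if src is not None:
        return "remote"

    # When @type is absent, Atom treats the content as plain text.
    media = (type_attr or "text").lower()

    # The three special values defined by Atom itself.
    if media in ("text", "html", "xhtml"):
        return media

    # XML media types are embedded as child elements.
    if media.endswith("+xml") or media.endswith("/xml"):
        return "inline-xml"

    # Other textual media types are carried as (escaped) character data.
    if media.startswith("text/"):
        return "inline-text"

    # Everything else is carried as base64-encoded character data.
    return "base64"
```

With the samples above, "image/svg+xml" would come out as inline XML, "image/png" without @src as base64, and the same type with @src as remote.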

                                                Laurent

                                              • Michael Steidl/MDir IPTC
                                                Message 23 of 23 , Mar 12, 2006
                                                  I feel we have to keep our terminology straight: we have to decide what the names of
                                                  the child elements of <contentSet> express: how the content is included
                                                  (by value or by reference), how it is presented (a kind of data type statement), or
                                                  both. Then we have to align the naming with that decision.

                                                  My approach in a previous posting was to let the child element's name express
                                                  how the content is included, hence I needed a second level of elements for the
                                                  different presentation types. But this can be flattened by renaming the elements:

                                                  <contentSet>
                                                    <inlineXml type="application/xml" format="xbrl">...e.g.: an XBRL structure in
                                                    an XBRL namespace ...</inlineXml>
                                                    <inlinePlain type="text/plain" encoding="base64">.... encoded data as plain
                                                    xs:string...</inlinePlain>
                                                    <remoteAny type="image/jpeg" href="content-access-url"/>
                                                  </contentSet>

                                                  Reading my own proposal I ponder this:

                                                  - the NAR developers wanted different child elements to distinguish content
                                                  included "by value" from content included "by reference". The processing model is:
                                                  for an "include by value" element, take the element's content and process it; for an
                                                  "include by reference" element, follow the href URL, fetch the data and process it.

                                                  - How to process the data should be indicated by the "type" attribute: the processor
                                                  should decide from the given MIME type how to process it ...

                                                  - ... but as this processor is already a kind of XML processor, a "special case" for
                                                  "included by value" content was made: if the content is a kind of XML, it goes
                                                  into a special child element of contentSet. As this content can be processed directly
                                                  by the XML processor, we chose the name <directContent>. The technical
                                                  statement behind this element is: the MIME type must be a variant of XML, e.g.
                                                  "application/xml", "text/xml", "text/xml+xhtml" and so forth ...

                                                  - On the other hand we implied: content in <encodedContent> cannot be
                                                  processed directly by an XML processor; it has to be decoded first and then
                                                  processed according to the given MIME type. (Somebody could even compress
                                                  XML first, encode the compressed binary data and then convey this as
                                                  <encodedContent>.)

                                                  - My conclusion: the structure discussed lately no longer aligns with this processing
                                                  model:

                                                  -- if CDATA content goes into the <inlinePlain> element, it can be processed by the
                                                  XML processor directly, as the CDATA markup is a direct instruction to any XML
                                                  processor and nothing else. (That's the reason why I proposed to keep the CDATA
                                                  content in the scope of the <inlineXml> element.)

                                                  -- we have to make clear why we have two different elements for "included by value"
                                                  content ("inline content"): what are the different processing rules for each one?

                                                  -- the crucial issue in terms of XML Schema is: once the content of an XML element
                                                  is specified as "any ##other" (= any XML structure from another namespace),
                                                  a/ CDATA and b/ any other plain xs:string content are no longer allowed in
                                                  this element. In other words: "any ##other" content always has to be in a
                                                  special wrapper element. Therefore, as long as we want to have an extensibility point
                                                  for non-NAR XML structures, this requires some additional structures in the scope of
                                                  <contentSet>.

                                                  -- As a conclusion from the item above I propose to consider specifying this:

                                                  <contentSet>
                                                    <inlineXmlOther type="application/xml" format="....">MUST be an XML
                                                    structure, MUST be from an external namespace</inlineXmlOther>
                                                    <inlineData type="text/plain" encoding="base64">.... encoded data as plain
                                                    xs:string...</inlineData>
                                                    <remoteData type="image/jpeg" href="content-access-url"/>
                                                  </contentSet>
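
On the CDATA point above, a minimal sketch (Python, standard library only) suggests that an XML parser hands the application the same character data whether "<" was escaped as an entity or wrapped in a CDATA section, and that tabs and line breaks inside the element survive in both cases. The helper name `element_text` is an assumption made for this illustration; the element name is taken from the proposal.

```python
import xml.etree.ElementTree as ET


def element_text(xml_source):
    """Parse a one-element document and return its character data."""
    return ET.fromstring(xml_source).text


# The same two lines of text, once with an entity and once in a CDATA section.
escaped = "<inlineData>Value 1 &lt; Value 2\n\tLine 2</inlineData>"
cdata = "<inlineData><![CDATA[Value 1 < Value 2\n\tLine 2]]></inlineData>"

# After parsing, the application can no longer tell the two notations apart.
same_text = element_text(escaped) == element_text(cdata)

# The line break and the tab are still present in the parsed text.
keeps_whitespace = "\n\t" in element_text(cdata)
```

In other words, CDATA here is only a notation for the XML parser. Whether white space is significant remains a matter for the application (or xml:space), not for CDATA.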

                                                  Processing rules:

                                                  * the content of <inlineXmlOther> is always an XML structure, and it MUST be from
                                                  another namespace. It is only a special case of <inlineData>, see below.

                                                  * the content of <inlineData> is, at a first level, data which can be represented by a
                                                  plain string. **But** this string has to be processed according to the value of the
                                                  encoding attribute first and of the type attribute second.

                                                  Note: I feel CDATA has to be marked like this: type="text/xml", encoding="none",
                                                  as no decoding is required but the data has to be fed into an XML processor, as it is
                                                  the only one able to deal with the <![CDATA[ ... ]]> syntax.

                                                  * finally: <remoteData> indicates that the data does not reside in this XML instance.
                                                  Processing is exactly the same as for <inlineData> after decoding the content. (Aside:
                                                  should we allow encoded data for remote data???)
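
The processing rules above could be sketched roughly as follows in Python. This is only an illustration under several assumptions: the element names come from the proposal and are used without a namespace, a missing encoding attribute is treated as "none", the only encodings handled are "none" and "base64", and `fetch` is a hypothetical callable supplied by the caller to retrieve remote data.

```python
import base64
import xml.etree.ElementTree as ET


def process_content_set(content_set, fetch):
    """Return a list of (kind, media_type, payload) tuples for a <contentSet>.

    `fetch` is a hypothetical callable mapping an href to bytes.
    """
    results = []
    for child in content_set:
        media_type = child.get("type")
        if child.tag == "inlineXmlOther":
            # Already parsed by the XML processor: the payload is the
            # foreign-namespace element wrapped by inlineXmlOther.
            wrapped = list(child)
            results.append(("xml", media_type, wrapped[0] if wrapped else None))
        elif child.tag == "inlineData":
            # First the encoding attribute, then the type attribute.
            encoding = child.get("encoding", "none")  # assumed default
            text = child.text or ""
            if encoding == "base64":
                data = base64.b64decode(text)
            elif encoding == "none":
                data = text.encode("utf-8")
            else:
                raise ValueError("unsupported encoding: " + encoding)
            results.append(("data", media_type, data))
        elif child.tag == "remoteData":
            # The data does not reside in this XML instance: follow the href.
            results.append(("data", media_type, fetch(child.get("href"))))
        else:
            raise ValueError("unexpected child element: " + child.tag)
    return results
```

The caller would then dispatch on the media type of each tuple. Keeping `fetch` as a parameter leaves open the question raised in the aside, namely whether remote data may itself be encoded.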

                                                  Michael

                                                  On 10 Mar 2006 at 12:36, Laurent Le Meur wrote:

                                                  > Note: I'm not saying I like the name *encodingContent* for both encoded and
                                                  > escaped content. *plainContent* would be fine also, or any other name which fits
                                                  > the idea of a set of characters that is not pure XML.
                                                  > Laurent
                                                  >
                                                  > >
                                                  > > I prefer:
                                                  > > <news:item schema="0.6" guid="xxxx" version="1"
                                                  > >     xmlns:news="urn:iptc:std:news:1.0:xmlns"
                                                  > >     xmlns="urn:iptc:std:newsml:2.0:xmlns">
                                                  > >   <itemMeta />
                                                  > >   <contentMeta />
                                                  > >   <news:contentSet>
                                                  > >     <news:encodedContent type="text/plain">
                                                  > > Value 1 < Value 2
                                                  > > Line 2
                                                  > > Line 3
                                                  > >     </news:encodedContent>
                                                  > >     <news:inlineContent type="text/xml+xhtml">
                                                  > >       <html/>
                                                  > >     </news:inlineContent>
                                                  > >   </news:contentSet>
                                                  > > </news:item>
                                                  > >
                                                  > > To:
                                                  > > <news:item schema="0.6" guid="xxxx" version="1"
                                                  > >     xmlns:news="urn:iptc:std:news:1.0:xmlns"
                                                  > >     xmlns="urn:iptc:std:newsml:2.0:xmlns">
                                                  > >   <itemMeta />
                                                  > >   <contentMeta />
                                                  > >   <news:contentSet>
                                                  > >     <news:inlineContent>
                                                  > >       <news:escCont type="text/plain">
                                                  > > Value 1 < Value 2
                                                  > > Line 2
                                                  > > Line 3
                                                  > >       </news:escCont>
                                                  > >     </news:inlineContent>
                                                  > >     <news:inlineContent>
                                                  > >       <news:extCont type="text/xml+xhtml">
                                                  > >         <html/>
                                                  > >       </news:extCont>
                                                  > >     </news:inlineContent>
                                                  > >   </news:contentSet>
                                                  > > </news:item>
                                                  > >
                                                  > > Re <xmlcont> please remember that no xml element should begin with "xml", if I
                                                  > > remember well.
                                                  > >
                                                  > > Laurent
                                                  > >
                                                  > >
                                                  > > > -----Original Message-----
                                                  > > > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On behalf of
                                                  > > > Michael Steidl/MDir IPTC
                                                  > > > Sent: Friday 10 March 2006 10:11
                                                  > > > To: newsml-2@yahoogroups.com
                                                  > > > Subject: RE: [newsml-2] handling CDATA
                                                  > > >
                                                  > > > Sorry to be a bother, but:
                                                  > > >
                                                  > > > - looking back to the roots of this discussion early this week, I pointed out it
                                                  > > > started with tackling the issue of how to preserve line breaks, tabs etc. ("white
                                                  > > > space") in plain text content. Laurent proposed to use CDATA for this purpose, and
                                                  > > > added that this also helps with escaping e.g. HTML.
                                                  > > >
                                                  > > > - re-reading the XML (1.1) spec on CDATA, and remembering that I checked the issue
                                                  > > > of preserving white space in XML in general, I'm no longer convinced CDATA will
                                                  > > > provide this feature. XML 1.1 says: [CDATA sections] are used to escape blocks of
                                                  > > > text containing characters which would otherwise be recognized as markup.
                                                  > > >
                                                  > > > Taken literally, this is *only* about escaping characters such as <, >, & etc.
                                                  > > > so that the parser does not recognize them as markup. But it says *nothing* about
                                                  > > > preserving white space. I feel this still has to be tackled in the scope of XML
                                                  > > > white space handling:
                                                  > > >
                                                  > > > http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-white-space
                                                  > > >
                                                  > > > - re the structure issue discussed under this subject lately:
                                                  > > >
                                                  > > > -- the children of the "contentSet" should reflect a straightforward model.
                                                  > > > Currently <remoteContent> for any content external to the XML instance is undisputed.
                                                  > > >
                                                  > > > -- the logical sibling name would be <inlineContent>, making a statement about
                                                  > > > where to find the content.
                                                  > > >
                                                  > > > -- but how to structure <inlineContent>:
                                                  > > >
                                                  > > > --- first we have to cover the XML structured content of any namespace (=
                                                  > > > "extensibility point" in IPTC speak).
                                                  > > >
                                                  > > > --- then we have to cover a string representing encoded and escaped content
                                                  > > >
                                                  > > > I propose this:
                                                  > > >
                                                  > > > <inlineContent>
                                                  > > >   <xmlCont>... xml from any namespace ...</xmlCont>
                                                  > > >   <escCont escaping="esc:cdata (or: esc:html)"> ... CDATA ... </escCont>
                                                  > > >   <encCont encoding="base64">.... encoded data ...</encCont>
                                                  > > > </inlineContent>
                                                  > > >
                                                  > > > * the child elements are mutually exclusive
                                                  > > > * I prefer to disambiguate escaped and encoded content (as John pointed out) by
                                                  > > > the element and not by an attribute's value.
                                                  > > >
                                                  > > > Michael
                                                  > > >
                                                  > > >
                                                  > > > On 10 Mar 2006 at 9:29, Laurent Le Meur wrote:
                                                  > > >
                                                  > > > > > Isn't it enough to set the encoding to enc:none and the media type to
                                                  > > > > > text/html in this case?
                                                  > > > >
                                                  > > > > I wonder why the Atom WG did not choose this kind of solution.
                                                  > > > > Does somebody know?
                                                  > > > > Laurent
                                                  > > > >
                                                  > > > > > -----Original Message-----
                                                  > > > > > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On behalf of
                                                  > > > > > John Cowan
                                                  > > > > > Sent: Thursday 9 March 2006 22:56
                                                  > > > > > To: newsml-2@yahoogroups.com
                                                  > > > > > Subject: Re: [newsml-2] handling CDATA
                                                  > > > > >
                                                  > > > > > Laurent Le Meur scripsit:
                                                  > > > > >
                                                  > > > > > > I go with J.Cowan and prefer option 1.
                                                  > > > > > >
                                                  > > > > > > I'm ok with <plainContent> (what is simple here) or <inlineContent>.
                                                  > > > > > >
                                                  > > > > > > I propose to follow the Atom choice and disambiguate between escaped html and
                                                  > > > > > > escaped or CDATA plain-text.
                                                  > > > > > >
                                                  > > > > > > Plain-text or cdata would be indicated by encoding='enc:none'
                                                  > > > > > >
                                                  > > > > > > Escaped html would be indicated by eg encoding='enc:escHTML'
                                                  > > > > >
                                                  > > > > > Isn't it enough to set the encoding to enc:none and the media type to
                                                  > > > > > text/html in this case?
                                                  > > > > >
                                                  > > > > > Escaping is not encoding; it will be necessary to escape & and < characters
                                                  > > > > > even in plain text, since the enclosing document remains XML.
                                                  > > > > >
                                                  > > > > > Any of simpleContent, plainContent, or inlineContent is fine with me.
                                                  > > > > >
                                                  > > > > > --
                                                  > > > > > John Cowan  cowan@...  http://www.ccil.org/~cowan  http://www.ap.org
                                                  > > > > > You know, you haven't stopped talking since I came here. You must have
                                                  > > > > > been vaccinated with a phonograph needle. --Rufus T. Firefly
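
The point made above (escaping is separate from encoding, and & and < must be escaped even in plain text because the enclosing document remains XML) can be illustrated with a minimal sketch. It uses only the Python standard library. The element name inlineContent is one of the candidate names from the thread, the mediatype attribute name and the wrap() helper are hypothetical, and nothing here is taken from the draft schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Sample payloads (invented for illustration).
plain = "AT&T shares rose; price < 30"
html_fragment = "<p>AT&amp;T shares rose</p>"


def wrap(text, media_type, use_cdata=False):
    """Put text inside a hypothetical <inlineContent> element.

    encoding="enc:none" follows the proposal in the thread; the
    "mediatype" attribute name is an assumption made for this sketch.
    """
    if use_cdata:
        # CDATA avoids escaping, but the text must not contain "]]>".
        body = "<![CDATA[" + text + "]]>"
    else:
        # escape() replaces &, < and > with entity references.
        body = escape(text)
    return '<inlineContent encoding="enc:none" mediatype="%s">%s</inlineContent>' % (
        media_type,
        body,
    )


# Escaped and CDATA forms carry the same character data once parsed.
escaped_doc = wrap(plain, "text/plain")
cdata_doc = wrap(plain, "text/plain", use_cdata=True)
html_doc = wrap(html_fragment, "text/html")

parsed_plain = ET.fromstring(escaped_doc).text
parsed_cdata = ET.fromstring(cdata_doc).text
parsed_html = ET.fromstring(html_doc).text
```

After parsing, the escaped and CDATA variants yield identical text, so a receiver cannot tell them apart; only the media type distinguishes plain text from an HTML fragment, which is the argument for keeping enc:none in both cases.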
                                                  > > > >
                                                  > > > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                                                  > > > >
                                                  > > > > This e-mail, and any file transmitted with it, is confidential and intended
                                                  > > > > solely for the use of the individual or entity to whom it is addressed. If you
                                                  > > > > have received this email in error, please contact the sender and delete the
                                                  > > > > email from your system. If you are not the named addressee you should not
                                                  > > > > disseminate, distribute or copy this email.
                                                  > > > >
                                                  > > > > For more information on Agence France-Presse, please visit our web site at
                                                  > > > > http://www.afp.com
                                                  > > > >
                                                  > > > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                                                  > > >
                                                  > > > ==================================================
                                                  > > > Sent by:
                                                  > > > Michael Steidl
                                                  > > > Managing Director of IPTC <mdirector@...>
                                                  > > > International Press Telecommunications Council
                                                  > > > Working to improve the efficiency of News exchange.
                                                  > > > Visit our Web Site at http://www.iptc.org

                                                  ==================================================
                                                  Sent by:
                                                  Michael Steidl
                                                  Managing Director of IPTC <mdirector@...>
                                                  International Press Telecommunications Council
                                                  Working to improve the efficiency of News exchange.
                                                  Visit our Web Site at http://www.iptc.org