Re: [newsml-2] RE: Clashing scheme declaration

  • John Cowan
    Message 1 of 28, Feb 28, 2006
      Misha Wolf scripsit:

      > Any comments?

      +1

      --
      Business before pleasure, if not too bloomering long before.
      --Nicholas van Rijn
      John Cowan <cowan@...>
      http://www.ccil.org/~cowan http://www.ap.org
    • Laurent Le Meur
      Message 2 of 28, Mar 1, 2006
        I had exactly the same feeling. Let's make it easy for the consumer.

        +1

        Further comment, use case:

        An aggregator creates a package from News Items created by multiple providers,
        using different catalogs. A Package Item contains links to the News Items, with
        supplementary hints (e.g. subject codes).

        It is up to the provider of the package to generate these hints using scheme
        aliases declared in its own catalog, referenced at the top of the Package Item.

        Ex:

        A News Item from providerA:
        <nws:item guid="urn:newsml:...:newsA">
        <catalogRef href="www.providerA.com/..." />
        ...
        <subject code="s:1" />
        </nws:item>

        Where "s" is mapped to "http://purl.org/provA/subjects"

        A News Item from providerB:
        <nws:item guid="urn:newsml:...:newsB">
        <catalogRef href="www.providerB.com/..." />
        ...
        <subject code="subj:a" />
        </nws:item>

        Where "subj" is mapped to "http://provB.com/MySubjects"

        A Package Item for Aggregator, referencing both News Items:
        <pack:item>
        <catalogRef href="www.aggregator.com/......" />
        ...
        <link href="urn:newsml:...:newsA">
        <subject code="sA:1" />
        </link>
        <link href="urn:newsml:...:newsB">
        <subject code="sB:a" />
        </link>
        </pack:item>

        Where "sA" is mapped to "http://purl.org/provA/subjects"
        Where "sB" is mapped to "http://provB.com/MySubjects"

        The aggregator has done the magic: it has decided to use "sA" instead of "s" to
        represent the scheme used by provider A, and "sB" instead of "subj" to represent
        the scheme used by provider B.

        The consumer has the easy part.
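
As a rough illustration of the re-aliasing in the example above, the aggregator's step can be sketched in Python. This is only a sketch under stated assumptions: the function name `realias` and the dictionary layout are hypothetical, and only the aliases and scheme URIs come from the example.

```python
# Provider catalogs: alias -> scheme URI (values taken from the example above).
PROVIDER_A = {"s": "http://purl.org/provA/subjects"}
PROVIDER_B = {"subj": "http://provB.com/MySubjects"}

# Aliases chosen by the aggregator for its own catalog: scheme URI -> alias.
PACKAGE_ALIASES = {
    "http://purl.org/provA/subjects": "sA",
    "http://provB.com/MySubjects": "sB",
}


def realias(code, provider_catalog, package_aliases):
    """Rewrite an 'alias:value' code from a provider item for the package.

    The alias is resolved to its scheme URI through the provider's catalog,
    then replaced by the alias the aggregator declared for that URI.
    (Hypothetical helper, not part of any NewsML API.)
    """
    alias, sep, value = code.partition(":")
    if not sep:
        raise ValueError(f"not an alias:value code: {code!r}")
    uri = provider_catalog[alias]  # KeyError if the alias is undeclared
    return f"{package_aliases[uri]}:{value}"


if __name__ == "__main__":
    print(realias("s:1", PROVIDER_A, PACKAGE_ALIASES))
    print(realias("subj:a", PROVIDER_B, PACKAGE_ALIASES))
```

Keying the aggregator's table by scheme URI rather than by alias is what keeps the consumer's side easy: two providers may reuse the same alias, but the URI identifies the scheme unambiguously.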

        Laurent

        > -----Original Message-----
        > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On Behalf Of
        > Misha Wolf
        > Sent: Tuesday, 28 February 2006 23:41
        > To: newsml-2@yahoogroups.com
        > Subject: [newsml-2] RE: Clashing scheme declaration
        >
        > Having thought about it further, I suggest that:
        >
        > - When an item of any sort is made available by a producer, the
        > scheme declarations should have to be in a state where they just
        > work, without any gymnastics on the part of the consumer.
        >
        > - When an aggregator builds a package item, scheme declarations
        > from different sources have to be resolved. This is where the
        > rules given below may be right. The order in which declarations
        > are encountered by the aggregator is not significant. The
        > aggregator must simply make sure that in the package item:
        >
        > - each scheme URI is associated with at least one alias, and
        >
        > - each alias is associated with exactly one URI.
        >
        > Any comments?
        >
        > Misha
        >
        >
        > -----Original Message-----
        > From: Misha Wolf
        > Sent: 28 February 2006 19:40
        > To: newsml-2@yahoogroups.com
        > Subject: RE: Clashing scheme declaration
        >
        > I think I've confused what the aggregator should do when building
        > a package of items from different providers with what the consumer
        > should do when receiving the package. Back to the drawing board :-)
        >
        > Misha
        >
        >
        > -----Original Message-----
        > From: Misha Wolf
        > Sent: 28 February 2006 18:07
        > To: newsml-2@yahoogroups.com
        > Subject: Clashing scheme declaration
        >
        > We've discussed what to do if a scheme declaration clashes with
        > another one and came to the conclusion that a declaration appearing
        > later in the list of declarations should override a clashing one
        > occurring earlier. In today's teleconference of the NewsML2
        > Architecture WP, we realised that things aren't so simple, as this
        > is OK:
        >
        > http://www.iso.ch/iso4217 ~ cur
        > http://www.iso.ch/iso4217 ~ currency
        >
        > but this is not OK:
        >
        > http://www.reuters.com/scheme ~ foo
        > http://www.afp.com/scheme ~ foo
        >
        > So what constitutes clashing declarations?
        >
        > I've tried to write down the various cases and the appropriate
        > actions. Please review them.
        >
        > ---------------------------------------------------------------------
        > URI "URI-1"          | Alias "alias-1"    | Action
        > ---------------------------------------------------------------------
        > absent               | absent             | Add the declaration to
        >                      |                    | the table.
        > ---------------------------------------------------------------------
        > present with alias-1 | present with URI-1 | Discard the declaration
        > ---------------------------------------------------------------------
        > present with alias-2 | absent             | Add the declaration to
        >                      |                    | the table. Results in two
        >                      |                    | aliases for URI-1.
        > ---------------------------------------------------------------------
        > absent               | present with URI-2 | Add URI-1 to the table,
        >                      |                    | using some alias-3. In
        >                      |                    | all cases where alias-1
        >                      |                    | is being used to
        >                      |                    | represent URI-1, replace
        >                      |                    | alias-1 with alias-3.
        > ---------------------------------------------------------------------
        > present with alias-2 | present with URI-3 | Discard the declaration.
        >                      |                    | In all cases where
        >                      |                    | alias-1 is being used to
        >                      |                    | represent URI-1, replace
        >                      |                    | alias-1 with alias-2.
        > ---------------------------------------------------------------------
        >
        > Misha
        >
        >
        > To find out more about Reuters visit www.about.reuters.com
        >
        > Any views expressed in this message are those of the individual sender, except
        > where the sender specifically states them to be the views of Reuters Ltd.
        >

        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

        This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

        For more information on Agence France-Presse, please visit our web site at http://www.afp.com

        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
      • Darko Gulija
        Message 3 of 28, Mar 1, 2006

          I agree with your rules, but I still believe that we must provide aggregators with a mechanism to "override" a clashing scheme alias declaration in a remote catalog.

          Otherwise, the rule you set would in practice mean that aggregators could not use providers' catalogs at all: if any of the declared aliases clashed, the only way to fix it would be to copy all the non-clashing declarations inline.

          The syntax could be something like the following. If the scheme alias "foo" in the HINA catalog

                  <scheme alias="foo" uri="http://www.hina.hr/catalog/Subject" />

          clashes with the "foo" alias in some other catalog, the aggregator could use syntax like

          <package:item>
                  <catalogRef href="http://www.hina.hr/catalog/MainCatalog_v1.xml">
                          <scheme alias="bar" uri="http://www.hina.hr/catalog/Subject" override="foo"/>
                  </catalogRef>
                  ...
          </package:item>

          to instruct the processor (software) to put the pair
          "bar", "http://www.hina.hr/catalog/Subject"
          in the list of scheme aliases instead of
          "foo", "http://www.hina.hr/catalog/Subject"
          (from the remote catalog), because the latter would have clashed with the definition of "foo" specified in some other catalog.

          It still means that the aggregator simply has to do it right, but with reasonable effort. On the other hand, it does not complicate catalog processing on the recipient side too much.
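
To make the intended processing concrete, here is a minimal Python sketch of how a processor might apply such an override. It assumes the remote catalog has already been read into an alias-to-URI dictionary; the function `apply_overrides`, the triple layout and the second catalog entry in the usage example are hypothetical, and the `override` attribute itself is only the proposal made above.

```python
def apply_overrides(remote_catalog, overrides):
    """Return a copy of remote_catalog (alias -> URI) with aliases replaced.

    overrides is a list of (new_alias, uri, old_alias) triples, mirroring
    <scheme alias="bar" uri="..." override="foo"/>.
    """
    result = dict(remote_catalog)
    for new_alias, uri, old_alias in overrides:
        # The override must name a declaration that really is in the catalog.
        if result.get(old_alias) != uri:
            raise ValueError(f"{old_alias!r} is not declared for {uri!r}")
        # The replacement alias must not already point at a different URI.
        if new_alias in result and result[new_alias] != uri:
            raise ValueError(f"{new_alias!r} is already declared for another URI")
        del result[old_alias]
        result[new_alias] = uri
    return result


if __name__ == "__main__":
    hina = {
        "foo": "http://www.hina.hr/catalog/Subject",
        "med": "http://www.hina.hr/catalog/MediaType",  # hypothetical entry
    }
    override = [("bar", "http://www.hina.hr/catalog/Subject", "foo")]
    print(apply_overrides(hina, override))
```

All the non-clashing declarations are inherited from the remote catalog untouched, which is the point of the proposal: the aggregator states only the exceptions.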

          =================================================================
          Darko.Gulija@...
          IT Manager / Voditelj informatike
          tel:  +385 1 48 08 800
          fax:  +385 1 48 08 820
          Croatian News Agency (HINA)   


        • Michael Steidl/MDir IPTC
          Message 4 of 28, Mar 1, 2006
            To "visualise" the approach we've taken for notating the scheme/alias
            declarations: consider it as a table with two columns, one for the scheme URI
            and one for the scheme alias. (I will refer to this table below)

            Processing model, in general: this table has to be built for each individual
            item and has a local scope for this item only. The table is built by reading
            all <scheme> elements from all Catalogs associated with "this" item, either by
            reference as an external catalog or by value as an item-local catalog.

            Specifically for Package Items: the table is built by reading all <scheme>
            elements from the Catalogs associated with "this" (package) item and from the
            Catalogs associated with every item that is included by reference (using a
            <link>) AND for which "hints" (metadata about the linked item) are provided.
            (Note: only this addition of hint metadata requires applying CURIEs taken
            from the original items.)

            After reading in all <scheme> elements the table is validated, and this
            validation may reveal clashing aliases. What has to be done in this case is
            the subject of the discussion initiated by Misha.

            Finally, the results of any "clash-resolving action" have to be added to the
            XML representation of "this" item for further use, primarily by the
            consumer/receiver of the item.

            1/ What is a "clash case": in this table an alias a/ is present more than
            once and b/ points to different scheme URIs. (Note: catalogs from different
            providers may assign the same alias to different scheme URIs.)

            (Honestly, I was not able to fully understand Misha's "clash table" below:
            e.g. for "URI-1" = absent and "alias-1" = absent, what does the action "add
            the declaration.." mean? A declaration of absent values?)

            2/ Rules for resolving the clash case: any alias occurring in the table a 2nd,
            3rd ... time for **different** URIs has to be replaced by an ad-hoc generated
            replacement value. (Be aware: this action is not required for multiple
            occurrences of the same alias that point to the same URI in all cases. This may
            result from different providers using the same alias in their catalogs for a
            "common" scheme URI, e.g. one controlled by the IPTC.)
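
The clash definition in 1/ and the resolution rule in 2/ can be sketched in Python as follows. This is an illustrative sketch, not an agreed processing model: the function names and the "alias_N" pattern for ad-hoc replacement values are assumptions; the sample declarations reuse the URIs from Misha's message.

```python
from itertools import count


def resolve_clashes(declarations):
    """Build the alias table from (alias, uri) pairs given in reading order.

    Returns (table, renames). table maps alias -> uri with no clashes left;
    renames maps each clashing (alias, uri) pair to its ad-hoc replacement.
    """
    table = {}
    renames = {}
    fresh = count(1)
    for alias, uri in declarations:
        if alias not in table:
            table[alias] = uri            # first occurrence of this alias
        elif table[alias] == uri:
            continue                      # same alias, same URI: not a clash
        elif (alias, uri) in renames:
            continue                      # this clash was already resolved
        else:
            # Clash: same alias, different URI. Generate an unused alias.
            new_alias = f"{alias}_{next(fresh)}"
            while new_alias in table:
                new_alias = f"{alias}_{next(fresh)}"
            table[new_alias] = uri
            renames[(alias, uri)] = new_alias
    return table, renames


def every_uri_has_alias(table, declarations):
    """Check that each declared scheme URI is reachable through some alias."""
    return {uri for _, uri in declarations} <= set(table.values())


if __name__ == "__main__":
    decls = [
        ("cur", "http://www.iso.ch/iso4217"),
        ("currency", "http://www.iso.ch/iso4217"),
        ("foo", "http://www.reuters.com/scheme"),
        ("foo", "http://www.afp.com/scheme"),
    ]
    table, renames = resolve_clashes(decls)
    print(table)
    print(renames)
    print(every_uri_has_alias(table, decls))
```

Because the table is a dictionary keyed by alias, "each alias is associated with exactly one URI" holds by construction; the separate check covers the other condition, that each scheme URI keeps at least one alias. The renames mapping is what the aggregator would then use to rewrite the codes taken from the original items.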

            3/ Notation of the "clash-resolving action":

            Darko proposed to annotate a replacement of the alias "foo" by the ad-hoc value
            "bar" like:

            > <package:item>
            > <catalogRef href="http://www.hina.hr/catalog/MainCatalog_v1.xml">
            > <scheme alias="bar" uri="http://www.hina.hr/catalog/Subject"
            > override="foo"/>
            > </catalogRef>

            Sorry, this won't work for formal reasons. As we decided, the catalogRef will
            only have an href attribute and a title child, but no <scheme> children. Only
            the <catalog> element will have <scheme> children.

            Ok, no problem to restructure this:
            <package:item>
            <catalogRef href="http://www.hina.hr/catalog/MainCatalog_v1.xml" />
            <catalog>
            <scheme alias="bar" uri="http://www.hina.hr/catalog/Subject"
            override="foo"/>
            </catalog>
            ...
            </package:item>

            Hence the <scheme> element from the local <catalog> overrides the declaration
            from the included HINA catalog.
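
A small Python sketch of how a receiver might read such an item-local catalog and let it take precedence over the referenced one. Assumptions: the remote catalog is already available as an alias-to-URI dictionary, the `override` attribute is only the syntax proposed in this thread, and `merge_local_catalog` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Item-local catalog as proposed above (the override attribute is a proposal).
LOCAL_CATALOG = """<catalog>
  <scheme alias="bar" uri="http://www.hina.hr/catalog/Subject" override="foo"/>
</catalog>"""


def merge_local_catalog(remote, local_xml):
    """Merge an item-local <catalog> into the remote alias -> URI table.

    A <scheme> carrying override="x" removes alias "x" inherited from the
    remote catalog before its own alias/uri pair is added.
    """
    merged = dict(remote)
    for scheme in ET.fromstring(local_xml).iter("scheme"):
        overridden = scheme.get("override")
        if overridden is not None:
            merged.pop(overridden, None)
        merged[scheme.get("alias")] = scheme.get("uri")
    return merged


if __name__ == "__main__":
    remote = {"foo": "http://www.hina.hr/catalog/Subject"}
    print(merge_local_catalog(remote, LOCAL_CATALOG))
```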

            (Aside: Darko, your scheme URI needs a character like "#" or "/" at the end
            so that a code can be appended.)

            Michael

            On 1 Mar 2006 at 10:43 Darko Gulija wrote:

            >
            > I agree with your rules, but I still believe that we must provide aggregators
            > with the mechanism to "override" clashing scheme alias declaration in remote
            > catalog.
            >
            > Otherwise, the rule you set would in practice mean that aggregators could not
            > use provider's catalogs at all, because if any of the declared aliases clash
            > they would not be able to fix it except by copying all non-clashing
            > declarations inline.
            >
            > The syntax could be something like:
            >
            > if scheme alias "foo" in Hina catalog
            > <scheme alias="foo" uri="http://www.hina.hr/catalog/Subject" />
            > clashes with "foo" alias in some other catalog, aggregator could use syntax
            > like
            > <package:item>
            > <catalogRef href="http://www.hina.hr/catalog/MainCatalog_v1.xml">
            > <scheme alias="bar" uri="http://www.hina.hr/catalog/Subject"
            > override="foo"/>
            > </catalogRef>
            > to instruct the processor (software) to put the pair
            > "bar", "http://www.hina.hr/catalog/Subject"
            > in the list of scheme aliases instead of
            > "foo", "http://www.hina.hr/catalog/Subject"
            > (from the remote catalog) because it would have clashed with the definition of
            > "foo" specified in some other catalog.
            >
            > It still means that the aggregator simply has to do it right, but with
            > reasonable effort. On the other hand, it does not complicate catalog processing
            > on the recipient side too much.
            >
            > =================================================================
            > Darko.Gulija@...
            > IT Manager / Voditelj informatike
            > tel: +385 1 48 08 800
            > fax: +385 1 48 08 820
            > Croatian News Agency (HINA)
            >

            ==================================================
            Sent by:
            Michael Steidl
            Managing Director of the IPTC <mdirector@...>
            International Press Telecommunications Council
            "Information Technology for News"
            Visit us on the web at http://www.iptc.org
          • John Cowan
            Message 5 of 28, Mar 1, 2006
              Darko Gulija scripsit:
              > I agree with your rules, but I still believe that we must provide
              > aggregators with the mechanism to "override" clashing scheme alias
              > declaration in remote catalog.
              >
              > Otherwise, the rule you set would in practice mean that aggregators could
              > not use provider's catalogs at all, because if any of the declared aliases
              > clash they would not be able to fix it except by copying all non-clashing
              > declarations inline.

              Since the aggregator will have to change the prefixes at the point of use anyway,
              I think inserting inline declarations (or using a self-written catalog) will
              not be that much of an additional hardship.

              --
              Knowledge studies others / Wisdom is self-known; John Cowan
              Muscle masters brothers / Self-mastery is bone; cowan@...
              Content need never borrow / Ambition wanders blind; www.ccil.org/~cowan
              Vitality cleaves to the marrow / Leaving death behind. --Tao 33 (Bynner)
            • Darko Gulija
              Message 6 of 28, Mar 1, 2006
                Actually, they'd have to change only the prefixes that collide.

                That said, I'm not sure which is the better strategy: to check whether some
                prefixes collide and change only the problematic ones, or to change them all
                and not care about collisions.

                In the second case, the catalog would then probably have to be inline, unless
                the aggregator decides to expose (in hints) only the metadata from their own
                catalogs (which is not a bad idea at all).

                =================================================================
                Darko.Gulija@...
                IT Manager / Voditelj informatike
                tel: +385 1 48 08 800
                fax: +385 1 48 08 820
                Croatian News Agency (HINA)


                > -----Original Message-----
                > From: newsml-2@yahoogroups.com
                > [mailto:newsml-2@yahoogroups.com] On Behalf Of John Cowan
                > Sent: 1 March 2006 13:42
                > To: newsml-2@yahoogroups.com
                > Subject: Re: [newsml-2] RE: Clashing scheme declaration
                >
                > Darko Gulija scripsit:
                > > I agree with your rules, but I still believe that we must provide
                > > aggregators with the mechanism to "override" clashing scheme alias
                > > declaration in remote catalog.
                > >
                > > Otherwise, the rule you set would in practice mean that aggregators
                > > could not use provider's catalogs at all, because if any of the
                > > declared aliases clash they would not be able to fix it except by
                > > copying all non-clashing declarations inline.
                >
                > Since the aggregator will have to change the prefixes at the
                > point of use anyway, I think inserting inline declarations
                > (or using a self-written catalog) will not be that much of an
                > additional hardship.
                >
              • John Cowan
                Message 7 of 28, Mar 1, 2006
                  Darko Gulija scripsit:

                  > Although, I'm not sure what is the better strategy: to check whether some
                  > prefixes collide and change only the problematic ones, or change them all
                  > and don't care about collision.

                  I'd certainly use the second strategy myself, since no article is likely
                  to mention more than about 100 prefixes; if there were tens of thousands
                  of prefixes, it might be worth changing them selectively.

                  --
                  John Cowan cowan@... http://www.ap.org
                  "Not to know The Smiths is not to know K.X.U." --K.X.U.
                • Misha Wolf
                  Message 8 of 28 , Mar 1, 2006
                    Michael wrote:

                    > (Honestly, I was not able to fully understand Misha's "clash
                    > table" below: e.g. what does for "URI-1" = absent and
                    > "alias-1" = absent the action "add the declaration.." mean -
                    > a declaration of absent values?)

                    I'm aware that I didn't include any explanation of the table, so
                    I'll do so now.

                    The table assumes that the recipient is processing scheme
                    declarations and is storing them in a table consisting of two
                    columns: URI and Alias.

                    The recipient is now processing a scheme declaration, consisting of
                    a URI ("URI-1") and an alias ("alias-1"). I'm aware of five
                    possibilities:

                    1. The URI ("URI-1") and the alias ("alias-1") are both absent from
                    the table.

                    2. The URI ("URI-1") and the alias ("alias-1") are both present in
                    the table and, importantly, appear in the same row of the table.

                    3. The URI ("URI-1") is present in the table, but with some other
                    alias ("alias-2"). The alias ("alias-1") is absent from the
                    table.

                    4. The URI ("URI-1") is absent from the table. The alias
                    ("alias-1") is present in the table, but with some other URI
                    ("URI-2").

                    5. The URI ("URI-1") is present in the table, but with some other
                    alias ("alias-2"). The alias ("alias-1") is present in the
                    table, but with some other URI ("URI-3").

                    ---------------------------------------------------------------------
                    URI "URI-1"          | Alias "alias-1"    | Action
                    ---------------------------------------------------------------------
                    absent               | absent             | Add the declaration to
                                         |                    | the table.
                    ---------------------------------------------------------------------
                    present with alias-1 | present with URI-1 | Discard the declaration
                    ---------------------------------------------------------------------
                    present with alias-2 | absent             | Add the declaration to
                                         |                    | the table. Results in two
                                         |                    | aliases for URI-1.
                    ---------------------------------------------------------------------
                    absent               | present with URI-2 | Add URI-1 to the table,
                                         |                    | using some alias-3. In
                                         |                    | all cases where alias-1
                                         |                    | is being used to
                                         |                    | represent URI-1, replace
                                         |                    | alias-1 with alias-3.
                    ---------------------------------------------------------------------
                    present with alias-2 | present with URI-3 | Discard the declaration.
                                         |                    | In all cases where
                                         |                    | alias-1 is being used to
                                         |                    | represent URI-1, replace
                                         |                    | alias-1 with alias-2.
                    ---------------------------------------------------------------------
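
                    The five rows can be sketched in a few lines of Python. This is only
                    an illustration of the table, not a proposed implementation: the
                    class name SchemeTable, its methods, and the way a fresh "alias-3"
                    is generated (appending a counter) are all assumptions made for the
                    example.

```python
from dataclasses import dataclass, field


@dataclass
class SchemeTable:
    """Hypothetical recipient-side table of scheme declarations."""
    alias_to_uri: dict = field(default_factory=dict)
    # (declared alias, uri) -> alias the recipient substitutes at the point of use
    renames: dict = field(default_factory=dict)
    counter: int = 0

    def aliases_for(self, uri):
        """All aliases currently bound to `uri`."""
        return [a for a, u in self.alias_to_uri.items() if u == uri]

    def declare(self, uri, alias):
        """Process one declaration and return the alias to use for `uri`."""
        known_aliases = self.aliases_for(uri)
        bound_uri = self.alias_to_uri.get(alias)

        if bound_uri == uri:
            # Row 2: the same pair is already in the table, so discard it.
            return alias
        if bound_uri is None:
            # Rows 1 and 3: the alias is free, so add the declaration.
            # In row 3 this leaves URI-1 with two aliases.
            self.alias_to_uri[alias] = uri
            return alias
        if known_aliases:
            # Row 5: discard, and use the existing alias-2 instead of alias-1.
            new_alias = known_aliases[0]
        else:
            # Row 4: add URI-1 under a newly invented alias-3.
            new_alias = self._fresh_alias(alias)
            self.alias_to_uri[new_alias] = uri
        self.renames[(alias, uri)] = new_alias
        return new_alias

    def _fresh_alias(self, base):
        """Invent an unused alias (assumed scheme: base plus a counter)."""
        while True:
            self.counter += 1
            candidate = f"{base}{self.counter}"
            if candidate not in self.alias_to_uri:
                return candidate
```

                    The value returned by declare() is the alias the recipient would
                    substitute wherever the incoming item uses alias-1 to mean URI-1.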


                    Misha


                    To find out more about Reuters visit www.about.reuters.com

                    Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Reuters Ltd.
                  • Laurent Le Meur
                    Message 9 of 28 , Mar 3, 2006
                      I find Misha's table a bit hard to visualize, but it highlights for me that we
                      don't need to solve alias clashes via some extra scheme declaration, as Darko
                      proposed, but rather by on-the-fly processing of aliases.

                      I will illustrate this with two use cases:

                      1/ An aggregator creates a package from News Items created by multiple
                      providers, using different catalogs. The Package Item contains links to the News
                      Items, with supplementary hints (e.g. subject codes).

                      2/ A recipient adds information (e.g. via automatic categorisation and entity
                      extraction) to a News Item received from an external provider, before storing it
                      in its CMS.

                      I gave an example for use case 1/ in a preceding post; I repeat it here,
                      slightly modified:

                      A News Item from providerA:
                      <nws:item guid="urn:newsml:...:newsA">
                      <catalogRef href="www.providerA.com/..." />
                      ...
                      <subject code="s:1" />
                      </nws:item>

                      Where "s" is mapped to "http://purl.org/provA/subjects" in providerA's catalog.

                      A News Item from providerB:
                      <nws:item guid="urn:newsml:...:newsB">
                      <catalogRef href="www.providerB.com/..." />
                      ...
                      <subject code="s:2" />
                      </nws:item>

                      Where "s" is mapped to "http://purl.org/provB/subjects" in providerB's catalog.

                      The Aggregator creates a Package Item with references to both News Items:
                      <pack:item>
                      <catalogRef href="www.aggregator.com/......" />
                      ...
                      <link href="urn:newsml:...:newsA">
                      <subject code="sA:1" />
                      </link>
                      <link href="urn:newsml:...:newsB">
                      <subject code="sB:2" />
                      </link>
                      </pack:item>

                      Here "sA" is mapped to "http://purl.org/provA/subjects" and "sB" is mapped to
                      "http://purl.org/provB/subjects" in the aggregator's catalog.

                      The aggregator has done the magic: he has decided to use "sA" instead of "s" to
                      represent the scheme used by provider A, and "sB" instead of "s" to represent
                      the scheme used by provider B.

                      For an aggregator, providing hints using schemes he doesn't know about would be
                      nonsense. So the aggregator provides information he can manage, coded in
                      schemes present in his own catalog. He can therefore easily map the aliases
                      used by the different providers to his own set of aliases.
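
                      A minimal Python sketch of that on-the-fly mapping, assuming each
                      catalog has already been loaded into a plain alias-to-URI dict. The
                      function name remap_code and the dict representation are assumptions
                      for illustration; the aliases and URIs are the ones from the example
                      above.

```python
def remap_code(code, provider_catalog, aggregator_catalog):
    """Rewrite 'alias:value' so it uses the aggregator's alias for the same scheme URI."""
    alias, sep, value = code.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed code: {code!r}")
    # Resolve the provider's alias to its scheme URI (KeyError if undeclared).
    uri = provider_catalog[alias]
    # Look up the alias the aggregator has chosen for that URI
    # (KeyError if the aggregator's catalog does not know the scheme).
    uri_to_alias = {u: a for a, u in aggregator_catalog.items()}
    return f"{uri_to_alias[uri]}:{value}"


provider_a = {"s": "http://purl.org/provA/subjects"}
provider_b = {"s": "http://purl.org/provB/subjects"}
aggregator = {
    "sA": "http://purl.org/provA/subjects",
    "sB": "http://purl.org/provB/subjects",
}

print(remap_code("s:1", provider_a, aggregator))  # sA:1
print(remap_code("s:2", provider_b, aggregator))  # sB:2
```

                      The lookup fails for a scheme that is missing from the aggregator's
                      catalog, which matches the point that the aggregator only provides
                      hints in schemes he can manage.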

                      Now an example for use case 2/:

                      A News Item from providerA:
                      <nws:item guid="urn:newsml:...:newsA">
                      <catalogRef href="www.providerA.com/..." />
                      ...
                      <subject code="s:1" />
                      </nws:item>

                      Where "s" is mapped to "http://purl.org/provA/subjects" in providerA's catalog.

                      ProviderB wants to add information, and creates a new News Item (he cannot
                      update a News Item he does not own), using a copy of the original content and
                      metadata. He then adds some subject codes, in a scheme he *decides* to prefix
                      with "s". He must then resolve an alias clash:
                      <nws:item guid="urn:newsml:...:newsB">
                      <catalogRef href="www.providerB.com/..." />
                      ...
                      <subject code="sA:1" />
                      <subject code="s:a" />
                      </nws:item>

                      ProviderB has done the magic: he has decided to use "sA" instead of "s" to
                      represent the scheme used by provider A, and "s" for his own subject scheme.

                      If providerB doesn't support the subject scheme used by providerA, he cannot use
                      providerA's metadata, but can still map aliases on the fly when creating the new
                      News Item.

                      In both cases there is no specific declaration at the top of the item for
                      resolving the alias clash: the magic is done when creating the XML news item.

                      Laurent





                      > -----Original Message-----
                      > From: newsml-2@yahoogroups.com [mailto:newsml-2@yahoogroups.com] On Behalf Of
                      > Misha Wolf
                      > Sent: Wednesday, 1 March 2006 15:58
                      > To: newsml-2@yahoogroups.com
                      > Subject: RE: [newsml-2] RE: Clashing scheme declaration
                      >
                      > Michael wrote:
                      >
                      > > (Honestly, I was not able to fully understand Misha's "clash
                      > > table" below: e.g. what does for "URI-1" = absent and
                      > > "alias-1" = absent the action "add the declaration.." mean -
                      > > a declaration of absent values?)
                      >
                      > I'm aware that I didn't include any explanation of the table, so
                      > I'll do so now.
                      >


                      -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

                      This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

                      For more information on Agence France-Presse, please visit our web site at http://www.afp.com

                      -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                    • Misha Wolf
                      Message 10 of 28 , Mar 3, 2006
                        Darko wrote:

                        > I still believe that we must provide aggregators with the
                        > mechanism to "override" clashing scheme alias declaration in
                        > remote catalog.
                        >
                        > Otherwise, the rule you set would in practice mean that
                        > aggregators could not use provider's catalogs at all, because if
                        > any of the declared aliases clash they would not be able to fix
                        > it except by copying all non-clashing declarations inline.

                        Laurent wrote:

                        > For an aggregator, providing hints using schemes he doesn't know
                        > about would be non sense. So the aggregator provides information
                        > he can manage, coded in schemes present in his own catalog.
                        > Therefore he can easily map the aliases provided by different
                        > providers to his own set of aliases.

                        I agree with Darko. The aggregator should be able to aggregate news
                        from different providers, without needing to "know" the schemes used
                        by the providers.

                        Darko proposed this syntax:
                        <package:item>
                        <catalogRef href="...">
                        <scheme alias="bar" uri="..." override="foo"/>
                        </catalogRef>
                        ...
                        </package:item>

                        to mean:
                        After retrieving the remote catalog, replace "foo" with "bar".

                        Michael proposed another syntax for this:
                        <package:item>
                        <catalogRef href="...">
                        <catalog>
                        <scheme alias="bar" uri="..." override="foo"/>
                        </catalog>
                        ...
                        </package:item>

                        and I propose a third syntax :-)
                        <package:item>
                        <catalogRef href="...">
                        <replace alias="..." with="..."/>
                        </catalogRef>
                        ...
                        </package:item>

                        Note the following:

                        - I have excluded the URI as it is redundant. Including it
                        complicates the processing model, as we would have to specify the
                        action to be taken if it does not match the declaration in the
                        remote catalog.

                        - I have placed the override statement (<replace .../>) inside the
                        catalogRef element, so making quite clear what is being replaced
                        and the scope of the replacement.

                        - This proposal is very simple to implement, both for the
                        aggregator and for the recipient.

                        In summary, the full syntax would be as illustrated below:

                        <xxx:item>
                        <catalogRef href="..."/>
                        <catalogRef href="..."/>
                        <catalogRef href="...">
                        <replace alias="..." with="..."/>
                        </catalogRef>
                        <catalogRef href="...">
                        <replace alias="..." with="..."/>
                        </catalogRef>
                        <catalog>
                        <scheme alias="bar" uri="..."/>
                        <scheme alias="bar" uri="..."/>
                        </catalog>
                        ...
                        </xxx:item>

                        where:

                        - There can be any number of <catalogRef/> elements.

                        - There can be at most one <catalog/> element.

                        While there is no reason to impose any order on these, it would be
                        reasonable to require the <catalog/> element (if present) to come
                        after the <catalogRef/> elements (if present).

                        As this doesn't make any functional difference, I suggest we go for
                        whichever approach results in a simpler XML Schema.

                        Misha


                      • Darko Gulija
                        Message 11 of 28 , Mar 3, 2006
                          Does this mean that the following assertions are true?
                          1. When creating a Package Item from items from various sources, an
                          aggregator MUST NOT use clashing aliases at all, but must map them to aliases
                          from its own catalog (so aliases would still clash when the recipient parses
                          all the catalogs, but they would not be used in hints to denote any
                          metadata; thus, the collision would be avoided).
                          2. It is not an error if a clashing alias exists only in catalogs, but is
                          not used in any metadata value. Such aliases SHOULD be ignored.
                          3. When receiving an item that contains clashing scheme aliases, a
                          recipient MUST NOT use the metadata with aliases that clash. It MAY signal
                          a warning or silently ignore them ("MUST NOT" because it is impossible to tell
                          which of the two declarations is the intended one, so the real meaning/value
                          of the metadata is not known).
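
                          If those assertions hold, the recipient side could look roughly like
                          the Python sketch below. It assumes each catalog has been parsed into
                          an alias-to-URI dict; the function names find_clashing_aliases and
                          usable_codes are made up for the example.

```python
def find_clashing_aliases(catalogs):
    """Return the aliases that are bound to more than one URI across all catalogs."""
    seen = {}
    clashing = set()
    for catalog in catalogs:
        for alias, uri in catalog.items():
            # setdefault stores the first URI seen for an alias and returns it afterwards.
            if seen.setdefault(alias, uri) != uri:
                clashing.add(alias)
    return clashing


def usable_codes(codes, catalogs):
    """Split codes into (kept, ignored); codes with a clashing alias are ignored."""
    clashing = find_clashing_aliases(catalogs)
    kept, ignored = [], []
    for code in codes:
        alias = code.partition(":")[0]
        (ignored if alias in clashing else kept).append(code)
    return kept, ignored
```

                          A clash that exists only in the catalogs produces an empty "ignored"
                          list (assertion 2); a recipient that wants to signal a warning can do
                          so whenever that list is non-empty (assertion 3).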

                          Warning: a problem might arise in future with declaring an identical
                          "alias", "uri" pair in multiple catalogs if we decide to add a third
                          property to the "scheme" element, providing a way to get the details of
                          the codes from the provider. That property would differ between catalogs
                          even though the actual code URI is unambiguous.

                          =================================================================
                          Darko.Gulija@...
                          IT Manager / Voditelj informatike
                          tel: +385 1 48 08 800
                          fax: +385 1 48 08 820
                          Croatian News Agency (HINA)


                          > -----Original Message-----
                          > From: newsml-2@yahoogroups.com
                          > [mailto:newsml-2@yahoogroups.com] On Behalf Of Laurent Le Meur
                          > Sent: 3 March 2006 9:40
                          > To: newsml-2@yahoogroups.com
                          > Subject: RE: [newsml-2] RE: Clashing scheme declaration
                          >
                          > I find Misha's table a bit hard to visualize, but it
                          > highlights for me that we don't need to solve alias clashes
                          > via some extra scheme declaration, as Darko proposed, but
                          > rather by on-the-fly processing of aliases.
                        • Michael Steidl/MDir IPTC
                          Message 12 of 28 , Mar 3, 2006
                            On 3 Mar 2006 at 12:17 Misha Wolf wrote:
                            > Darko proposed this syntax:
                            > <package:item>
                            > <catalogRef href="...">
                            > <scheme alias="bar" uri="..." override="foo"/>
                            > </catalogRef>
                            > ...
                            > </package:item>
                            >
                            > to mean:
                            > After retrieving the remote catalog, replace "foo" with "bar".
                            >
                            > Michael proposed another syntax for this:
                            > <package:item>
                            > <catalogRef href="...">
                            > <catalog>
                            > <scheme alias="bar" uri="..." override="foo"/>
                            > </catalog>
                            > ...
                            > </package:item>
                            >
                            > and I propose a third syntax :-)
                            > <package:item>
                            > <catalogRef href="...">
                            > <replace alias="..." with="..."/>
                            > </catalogRef>
                            > ...
                            > </package:item>

                            Misha, thanks for coming back to my initial proposal, which was to use the
                            <scheme> children for a catalog element pointing to an external file for
                            "healing" the clashes. But there are processing issues with that structure
                            that I'd like to point out; see below:

                            > Note the following:
                            >
                            > - I have excluded the URI as it is redundant. Including it
                            > complicates the processing model, as we would have to specify the
                            > action to be taken if it does not match the declaration in the
                            > remote catalog.

                            We have to be very careful about dropping the URI:

                            - if it is in the "replace" statement, then the pair of URI and alias should
                            unambiguously identify a row of the table. (Regarding your warning about a
                            wrong URI in the "replace" statement: yes, providers have to be accurate, but
                            the alias value could be wrong too ...)

                            - if there is no URI in the "replace" statement, the processing of the URI/alias
                            table has to be done at two levels:

                            a/ the final, reconciled table and

                            b/ a temporary table for each external catalog.

                            This is required because the replace statement is scoped to the aliases of the
                            parent <catalog(Ref)>'s external catalog.

                            The processing has to be done this way: first read the declarations of the
                            external catalog into a temporary table, then apply the "replace" declarations,
                            and then move all declarations from the temporary table to the final reference
                            table.

                            I feel the implementation of this processing model is more complex than having
                            URIs in the replace statement, where one can build the final table in one step.
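
                            To make the two-level processing concrete, here is a rough Python
                            sketch of the three steps for one external catalog. The function
                            name merge_remote_catalog, the dict and tuple representations, and
                            the choice to raise an error on a leftover clash are assumptions for
                            illustration only; "replacements" stands for the alias/with pairs of
                            the proposed <replace/> elements.

```python
def merge_remote_catalog(final_table, remote_declarations, replacements):
    """Merge one external catalog into the final alias -> URI table.

    final_table:         dict alias -> uri (the reconciled table)
    remote_declarations: list of (alias, uri) pairs read from the external catalog
    replacements:        dict old_alias -> new_alias, scoped to this catalog only
    """
    # Step 1: read the external catalog into a temporary table.
    temp = dict(remote_declarations)

    # Step 2: apply the "replace" declarations to the temporary table.
    for old, new in replacements.items():
        if old not in temp:
            raise KeyError(f"replace refers to undeclared alias {old!r}")
        if new != old and new in temp:
            raise ValueError(f"replacement alias {new!r} is already declared")
        temp[new] = temp.pop(old)

    # Step 3: move everything from the temporary table to the final table.
    for alias, uri in temp.items():
        existing = final_table.get(alias)
        if existing is not None and existing != uri:
            raise ValueError(f"unresolved alias clash on {alias!r}")
        final_table[alias] = uri
    return final_table
```

                            With URIs carried in the replace statement, steps 1 and 2 would not
                            need the temporary table, which is the one-step variant referred to
                            above.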

                            >
                            > - I have placed the override statement (<replace .../>) inside the
                            > catalogRef element, so making quite clear what is being replaced
                            > and the scope of the replacement.
                            >
                            > - This proposal is very simple to implement, both for the
                            > aggregator and for the recipient.

                            As I said above I can't agree to that assertion.

                            Michael
                            ==================================================
                            Sent by:
                            Michael Steidl
                            Managing Director of the IPTC <mdirector@...>
                            International Press Telecommunications Council
                            "Information Technology for News"
                            Visit us on the web at http://www.iptc.org
                          • John Cowan
                            ... IMHO all this imposes costs on the many (the consumers of items) for a limited benefit to the few (the aggregators). An aggregator already has to read
                            Message 13 of 28 , Mar 3, 2006
                            • 0 Attachment
                              Misha Wolf scripsit:

                              > I agree with Darko. The aggregator should be able to aggregate news
                              > from different providers, without needing to "know" the schemes used
                              > by the providers.

                              IMHO all this imposes costs on the many (the consumers of items) for a
                              limited benefit to the few (the aggregators). An aggregator already has
                              to read through all catalogs mentioned in the content he's aggregating
                              to determine if there is a conflict and, if there is, go through all
                              the CURIEs in the content, decide which ones to keep in the packageItem,
                              fixing up any conflicted aliases as he goes.

                              After doing all that, generating a new catalog with no conflicts really
                              isn't that big a deal, and it makes the life of the consumer much simpler
                              -- just check for conflicts as a regular part of validation, and treat
                              the item like any other invalid item if there are any.
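A rough sketch of that consumer-side check, in Python. It assumes each catalog has already been read into a list of (alias, URI) pairs; the function names are hypothetical.

```python
def find_alias_conflicts(catalogs):
    """Return {alias: set of URIs} for every alias bound to more than one URI."""
    seen = {}
    for decls in catalogs:
        for alias, uri in decls:
            seen.setdefault(alias, set()).add(uri)
    return {alias: uris for alias, uris in seen.items() if len(uris) > 1}


def is_valid_item(catalogs):
    # An item whose catalogs contain clashing aliases is treated
    # like any other invalid item.
    return not find_alias_conflicts(catalogs)
```

The same alias declared twice with the same URI is not counted as a conflict here; that is a design choice of this sketch, not something the thread has settled.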

                              --
                              Where the wombat has walked, John Cowan <cowan@...>
                              it will inevitably walk again. http://www.ccil.org/~cowan
                            • Misha Wolf
                              ... The alias does this without any help from the URI. ... One should not provide the same info in two ways, unless one has a very good reason. For example,
                              Message 14 of 28 , Mar 3, 2006
                              • 0 Attachment
                                Michael wrote:

                                > > - I have excluded the URI as it is redundant. Including it
                                > > complicates the processing model, as we would have to specify
                                > > the action to be taken if it does not match the declaration
                                > > in the remote catalog.

                                > We have to be very careful about that killing the URI:
                                >
                                > - if it is in the "replace" statement then the pair of URI and
                                > alias should unambiguously identify a row of the table.

                                The alias does this without any help from the URI.

                                > (Your warning regarding a wrong URI in the "replace" statement:
                                > yes, providers have to be accurate, also the alias value could be
                                > wrong ...)

                                One should not provide the same info in two ways, unless one has a
                                very good reason. For example, in the case of:

                                <subject code="foo:x1g87p">
                                <title xml:lang="grr">Whoosh!</title>
                                </subject>

                                we are explicitly stating that the real info is the code value and
                                that the title is just a hint.

                                You appear to be proposing that both the alias and the URI be
                                provided as real info.

                                As for saying that the alias value could be wrong, I do not consider
                                that to be relevant. Everything in a news item could be wrong.
                                What we need to do is keep the processing simple.

                                > - if there is no URI in the "replace" statement the processing of
                                > the URI/alias table has to be done at two levels:
                                >
                                > a/ the final and reconciled table and
                                >
                                > b/ a temporary table for each external catalog.
                                >
                                > This is required as the replace statement is scoped to the aliases
                                > of the parent <catalog(Ref)>'s external catalog.
                                >
                                > The processing has to be done this way: first read the
                                > declarations of the external catalog to a temporary table, then
                                > apply the "replace" declarations and the move all declarations
                                > from the temporary table to the final reference table.

                                I disagree. All that one has to do is:

                                1. Fetch the remote catalog (possibly, from cache).

                                2. While it contains unprocessed scheme declarations:
                                2a. Take the next scheme declaration.
                                2b. Check for a <replace> instruction.
                                2c. If one is applicable, apply it.
                                2d. Add the resulting alias/URI pair to the in-memory table.

                                3. Do something else.
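The steps above can be sketched in Python as follows. This assumes the remote catalog has already been fetched (step 1) and parsed into (alias, URI) pairs, and that the <replace> instructions for that catalog are held in a dict keyed by the old alias; both representations are assumptions of this sketch.

```python
def load_catalog(table, scheme_decls, replacements):
    """Add one catalog's declarations to the in-memory alias -> URI table."""
    for alias, uri in scheme_decls:              # 2a: next scheme declaration
        alias = replacements.get(alias, alias)   # 2b/2c: apply <replace> if any
        table[alias] = uri                       # 2d: add the resulting pair
    return table                                 # clash detection is omitted


table = {}
load_catalog(table, [("s", "http://purl.org/provA/subjects")], {})
load_catalog(table, [("s", "http://provB.com/MySubjects")], {"s": "subjB"})
print(table)
```

With the replacements held in a dict, the check in step 2b is a single lookup per declaration rather than a scan of every <replace> instruction.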

                                > I feel the implementation of this processing model is more complex
                                > like having URIs in the replace statement, where one can build the
                                > final table in one step.

                                I disagree.

                                Misha


                                To find out more about Reuters visit www.about.reuters.com

                                Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Reuters Ltd.
                              • Michael Steidl/MDir IPTC
                                ... Well, it depends on: if you build a table at once and replace aliases later then you may have an alias in the table more than once. ... Now I disagree:
                                Message 15 of 28 , Mar 3, 2006
                                • 0 Attachment
                                  On 3 Mar 2006 at 13:22 Misha Wolf wrote:
                                  > > We have to be very careful about that killing the URI:
                                  > >
                                  > > - if it is in the "replace" statement then the pair of URI and
                                  > > alias should unambiguously identify a row of the table.
                                  >
                                  > The alias does this without any help from the URI.

                                  Well, it depends: if you build the table in one go and replace aliases later,
                                  then you may have an alias in the table more than once.
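To illustrate the point, here is a small Python sketch of a table built in one go from two catalogs. The clashing alias "s" and both helper functions are hypothetical.

```python
# One combined table, built before any replacement is applied.
rows = [
    ("s", "http://purl.org/provA/subjects"),
    ("s", "http://provB.com/MySubjects"),
]


def replace_by_alias(rows, old, new):
    # Alias alone: every row carrying the old alias is renamed.
    return [(new if a == old else a, u) for a, u in rows]


def replace_by_pair(rows, uri, old, new):
    # URI plus alias: exactly the intended row is renamed.
    return [(new if (a, u) == (old, uri) else a, u) for a, u in rows]


print(replace_by_alias(rows, "s", "subjB"))
print(replace_by_pair(rows, "http://provB.com/MySubjects", "s", "subjB"))
```

In the first call both rows end up with the alias "subjB", so the clash is merely moved; in the second call only provider B's row changes.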

                                  .....
                                  >
                                  > > - if there is no URI in the "replace" statement the processing of
                                  > > the URI/alias table has to be done at two levels:
                                  > >
                                  > > a/ the final and reconciled table and
                                  > >
                                  > > b/ a temporary table for each external catalog.
                                  > >
                                  > > This is required as the replace statement is scoped to the aliases
                                  > > of the parent <catalog(Ref)>'s external catalog.
                                  > >
                                  > > The processing has to be done this way: first read the
                                  > > declarations of the external catalog to a temporary table, then
                                  > > apply the "replace" declarations and the move all declarations
                                  > > from the temporary table to the final reference table.
                                  >
                                  > I disagree. All that one has to do is:
                                  >
                                  > 1. Fetch the remote catalog (possibly, from cache).
                                  >
                                  > 2. While it contains unprocessed scheme declarations:
                                  > 2a. Take the next scheme declaration.
                                  > 2b. Check for a <replace> instruction.
                                  > 2c. If one is applicable, apply it.
                                  > 2d. Add the resulting alias/URI pair to the in-memory table.
                                  >
                                  > 3. Do something else.

                                  Now I disagree: your processing model requires checking all the <replace>
                                  instructions for **each** fetched external catalog row. Assuming only a few
                                  aliases have to be replaced, this is a pure waste of processing power and time.
                                  I guess every programmer would prefer fetching all aliases into a table first
                                  and then replacing only the few rows which have to be updated.

                                  These processing considerations led me to prefer building the full table first
                                  and then applying all the replace instructions.

                                  Michael
                                  ==================================================
                                  Sent by:
                                  Michael Steidl
                                  Managing Director of the IPTC <mdirector@...>
                                  International Press Telecommunications Council
                                  "Information Technology for News"
                                  Visit us on the web at http://www.iptc.org
                                • John Cowan
                                  ... This whole discussion illustrates why consumers of package items shouldn t have to do this stuff, and why the burden falls on aggregators to get it right.
                                  Message 16 of 28 , Mar 3, 2006
                                  • 0 Attachment
                                    Michael Steidl/MDir IPTC scripsit:

                                    > Now I disagree: your processing model requires to check all the <replace>
                                    > instructions for **each** fetched external catalog row. Assuming only a few
                                    > aliases have to be replaced this is a pure waste of processing power and time.
                                    > I guess every programmer would prefer fetching all aliases into a table first
                                    > and then to replace only the few rows which have to be updated.

                                    This whole discussion illustrates why consumers of package items
                                    shouldn't have to do this stuff, and why the burden falls on aggregators
                                    to get it right. After all, the aggregator is responsible for choosing
                                    the metadata to go into the package item, if I understand correctly.
                                    Why not make him responsible for getting the aliases on these items
                                    consistent as well?

                                    --
                                    A witness cannot give evidence of his John Cowan
                                    age unless he can remember being born. cowan@...
                                    --Judge Blagden http://www.ccil.org/~cowan
                                  • Misha Wolf
                                    ... This is a perfectly reasonable approach. The question is whether the other approach is also reasonable. I think that it is. I ll summarise for those who
                                    Message 17 of 28 , Mar 3, 2006
                                    • 0 Attachment
                                      John wrote:

                                      > Misha Wolf scripsit:

                                      > > I agree with Darko. The aggregator should be able to aggregate
                                      > > news from different providers, without needing to "know" the
                                      > > schemes used by the providers.

                                      > IMHO all this imposes costs on the many (the consumers of items)
                                      > for a limited benefit to the few (the aggregators). An aggregator
                                      > already has to read through all catalogs mentioned in the content
                                      > he's aggregating to determine if there is a conflict and, if there
                                      > is, go through all the CURIEs in the content, decide which ones to
                                      > keep in the packageItem, fixing up any conflicted aliases as he
                                      > goes.
                                      >
                                      > After doing all that, generating a new catalog with no conflicts
                                      > really isn't that big a deal, and it makes the life of the
                                      > consumer much simpler -- just check for conflicts as a regular
                                      > part of validation, and treat the item like any other invalid item
                                      > if there are any.

                                      This is a perfectly reasonable approach. The question is whether
                                      the other approach is also reasonable. I think that it is. I'll
                                      summarise for those who have joined the list recently ...

                                      We've decided to use URIs for all NewsML 2 values that are not
                                      literals (dates, numbers, free-form text, etc). These URIs will be
                                      represented in content (and here I'm including metadata as content)
                                      using CURIEs, eg:

                                      <subject code="theme:17001000"/>
                                      <!-- Weather Forecast -->
                                      <subject code="theme:17002000"/>
                                      <!-- Weather Global change -->
                                      <subject code="theme:17004000"/>
                                      <!-- Weather Statistics -->
                                      <subject code="theme:17005000"/>
                                      <!-- Weather Warning -->

                                      The aliases (eg "theme") will be associated with URIs via <scheme>
                                      declarations, eg:

                                      <scheme alias="theme" uri="http://www.iptc.org/NewsCodes/theme#"/>
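As a rough illustration of how a recipient might resolve such a CURIE, here is a Python sketch. It assumes that expansion is plain concatenation of the scheme URI and the code, and that the declarations sit inside a <catalog> wrapper element with no XML namespace; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_schemes(xml_text):
    """Build an alias -> URI table from the <scheme> declarations."""
    root = ET.fromstring(xml_text)
    return {s.get("alias"): s.get("uri") for s in root.iter("scheme")}


def expand_curie(curie, schemes):
    """Expand 'alias:code' to a full URI using the declared schemes."""
    alias, sep, code = curie.partition(":")
    if not sep or alias not in schemes:
        raise KeyError(f"no scheme declared for CURIE {curie!r}")
    return schemes[alias] + code


CATALOG = (
    '<catalog>'
    '<scheme alias="theme" uri="http://www.iptc.org/NewsCodes/theme#"/>'
    '</catalog>'
)
schemes = read_schemes(CATALOG)
print(expand_curie("theme:17001000", schemes))
# prints http://www.iptc.org/NewsCodes/theme#17001000
```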

                                      As a given item (eg News Item or Package Item) may contain many of
                                      these declarations (maybe 20-30), we're allowing them to be stored
                                      in an external catalog and retrieved as needed. The idea is that
                                      retrieval of external catalogs will be fairly rare, as they will
                                      mostly be cached. Indeed, if this is not the case, then there is no
                                      point in using external catalogs, as there will be no saving of
                                      bandwidth or time.

                                      And so, there is an advantage in the aggregator passing on the
                                      external catalogs referenced in the aggregated content, with one or
                                      two "annotations" needed in order to resolve alias clashes. The
                                      recipients are likely to have these catalogs cached and all they
                                      have to do is locally apply the "annotations", before fully
                                      processing the package item.

                                      Misha


                                    • Misha Wolf
                                      ... I disagree. Misha To find out more about Reuters visit www.about.reuters.com Any views expressed in this message are those of the individual sender, except
                                      Message 18 of 28 , Mar 3, 2006
                                      • 0 Attachment
                                        Michael wrote:

                                        > These processing considerations led me to preferring a first
                                        > full table build and then applying all the replace instructions.

                                        I disagree.

                                        Misha


                                      • John Cowan
                                        ... Except that these annotations have to be applied in a non-persistent way, since there is no URI that refers to a catalog-as-annotated-by-X. Consequently,
                                        Message 19 of 28 , Mar 3, 2006
                                        • 0 Attachment
                                          Misha Wolf scripsit:

                                          > And so, there is an advantage in the aggregator passing on the
                                          > external catalogs referenced in the aggregated content, with one or
                                          > two "annotations" needed in order to resolve alias clashes. The
                                          > recipients are likely to have these catalogs cached and all they
                                          > have to do is locally apply the "annotations", before fully
                                          > processing the package item.

                                          Except that these annotations have to be applied in a non-persistent
                                          way, since there is no URI that refers to a catalog-as-annotated-by-X.
                                          Consequently, since consumers often get lots of items from the same
                                          aggregator, they wind up applying the annotations over and over.

                                          We could resolve this by allowing a catalog element to contain an
                                          @extends attribute which says "This catalog extends the named
                                          catalog with changes". But on the whole, since catalogs are immutable,
                                          I think it's just as easy for the aggregator to publish "his own
                                          version" of an upstream catalog with his patches installed, and then
                                          always refer to this version when retrieving metadata items described
                                          in the upstream catalog.

                                          That continues to keep things simple and stable for the consumer, who
                                          at most has to retrieve a few extra catalogs for aggregated items.
                                          If this standard isn't simple for consumers, it won't get traction.
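A sketch of why this stays cheap for the consumer, in Python: because a published catalog is immutable and has its own URI, its alias table can be cached under that URI and reused for every item from the same aggregator. The class name, the fetch callback and the example URL are assumptions made for illustration.

```python
class CatalogCache:
    """Cache alias -> URI tables by catalog URI (catalogs are immutable)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: catalog URI -> list of (alias, uri)
        self._tables = {}

    def get(self, catalog_uri):
        # Fetch and build the table only the first time this URI is seen.
        if catalog_uri not in self._tables:
            self._tables[catalog_uri] = dict(self._fetch(catalog_uri))
        return self._tables[catalog_uri]


fetched = []


def fake_fetch(uri):
    # Stand-in for an HTTP retrieval of the aggregator's patched catalog.
    fetched.append(uri)
    return [("subjB", "http://provB.com/MySubjects")]


cache = CatalogCache(fake_fetch)
for _ in range(3):  # three package items from the same aggregator
    table = cache.get("http://aggregator.example/catalogs/provB-patched")
print(len(fetched))
```

A catalog-as-annotated-by-X has no URI of its own, so a table patched by per-item annotations has no natural cache key; that is the cost the paragraph above is pointing at.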

                                          --
                                          What asininity could I have uttered John Cowan <cowan@...>
                                          that they applaud me thus? http://www.ap.org
                                          --Phocion, Greek orator http://www.ccil.org/~cowan
                                        • Laurent Le Meur
                                          ... +1 ... +1000 Laurent -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- This e-mail, and any file transmitted with it, is confidential and intended solely for
                                          Message 20 of 28 , Mar 3, 2006
                                          • 0 Attachment
                                            > ... , since catalogs are immutable,
                                            > I think it's just as easy for the aggregator to publish "his own
                                            > version" of an upstream catalog with his patches installed, and then
                                            > always refer to this version when retrieving metadata items described
                                            > in the upstream catalog.

                                            +1

                                            > That continues to keep things simple and stable for the consumer, who
                                            > at most has to retrieve a few extra catalogs for aggregated items.
                                            > If this standard isn't simple for consumers, it won't get traction.

                                            +1000

                                            Laurent


                                            -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

                                            This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

                                            For more information on Agence France-Presse, please visit our web site at http://www.afp.com

                                            -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                                          • Laurent Le Meur
                                            ... Yes. The prerequisite is that he has programmed incoming scheme declarations which are of interest for him. (so, aliases would still clash when recipient
                                            Message 21 of 28 , Mar 3, 2006
                                            • 0 Attachment
                                              > Darko Gulija
                                              > Does it mean that the following assertions are true:
                                              > 1. When creating a Package Item from the items from various sources, an
                                              > aggregator MUST NOT use clashing aliases at all, but map them to the aliases
                                              > from its own catalog

                                              Yes. The prerequisite is that he has "programmed" the incoming scheme
                                              declarations that are of interest to him.

                                              > (so, aliases would still clash when recipient parses
                                              > all the catalogs, but they would not be used in hints to denote any
                                              > metadata; thus, the collision would be avoided).

                                              No, because the items provided by different providers are independent. There are
                                              NO clashing aliases in the news items. There are NO clashing aliases in the
                                              package item either, because the package uses the aggregator's catalog.

                                              > 2. It is not an error if the clashing alias exists only in catalogs, but is
                                              > not used in any metadata value. Such aliases SHOULD be ignored.
                                              > 3. When receiving the item that contains clashing scheme aliases, a
                                              > recipient MUST NOT use the metadata with aliases that clash. It MAY signal
                                              > warning or silently ignore them ("MUST NOT" because it is impossible to tell
                                              > which of the two is wrong, so the real meaning/value of the metadata is not
                                              > known).

                                              This is of no use in the context I describe.
                                              Laurent



                                            • Misha Wolf
                                              ... Is this a case of vote early, vote often ? Misha To find out more about Reuters visit www.about.reuters.com Any views expressed in this message are those
                                              Message 22 of 28 , Mar 3, 2006
                                              • 0 Attachment
                                                Laurent wrote:

                                                > +1000

                                                Is this a case of "vote early, vote often"?

                                                Misha


                                              • Misha Wolf
                                                Whether we nest the alias replacement instruction within a <catalog> element or within a <catalogRef> element, AND whether it contains just two aliases or two
                                                Message 23 of 28 , Mar 6, 2006
                                                • 0 Attachment
                                                  Whether we nest the alias replacement instruction within a <catalog>
                                                  element or within a <catalogRef> element, AND whether it contains
                                                  just two aliases or two aliases and a URI, we should not use the
                                                  <scheme> element for this purpose, but rather a different element,
                                                  called, say, <replace>. The advantages include:

                                                  - the element can be made part of the power profile,

                                                  - restrictions can be imposed via XML Schema on where the element
                                                  may appear.

                                                  There are two possible syntaxes:

                                                  <replace alias="..." with="..."/>

                                                  <replace uri="..." alias="..." with="..."/>

                                                  This makes me wonder whether we should write:

                                                  <scheme uri="..." alias="..."/>
                                                  <!--
                                                  I am using scheme "..." and am locally representing it using the
                                                  alias "..."
                                                  -->

                                                  rather than (at present):

                                                  <scheme alias="..." uri="..."/>
                                                  <!--
                                                  When you see the alias "...", interpret it as the scheme "..."
                                                  -->

                                                  Note also that we may, later, wish to add a @type, eg:

                                                  <scheme uri="..." type="..." alias="..."/>
                                                  <!--
                                                  I am using scheme "..." for "..." and am locally representing it
                                                  by "..."
                                                  -->

                                                  For example:

                                                  <scheme
                                                  uri="http://en.wikipedia.org/wiki/ISO4217#Currency_Numeric_Codes"
                                                  type="type:cur"
                                                  alias="iso4217"/>
                                                  <!--
                                                  I am using scheme
                                                  "http://en.wikipedia.org/wiki/ISO4217#Currency_Numeric_Codes"
                                                  for currencies and am locally representing it by "iso4217"
                                                  -->
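A minimal sketch of how a consumer might read such declarations, under stated assumptions: the `<catalog>` wrapper element, the helper names `load_schemes` and `resolve`, and the sample code `iso4217:978` are hypothetical and not part of any agreed NewsML 2 syntax. It also illustrates that the order of `uri` and `alias` in the tag only affects how a human reads the declaration, since attribute order is not significant to an XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog fragment. The <catalog> wrapper is an assumption;
# the two <scheme> declarations deliberately use different attribute orders.
CATALOG_XML = """
<catalog>
  <scheme uri="http://en.wikipedia.org/wiki/ISO4217#Currency_Numeric_Codes"
          type="type:cur"
          alias="iso4217"/>
  <scheme alias="s" uri="http://purl.org/provA/subjects"/>
</catalog>
"""


def load_schemes(xml_text):
    """Return {alias: (uri, type_or_None)} for every <scheme> element."""
    root = ET.fromstring(xml_text)
    schemes = {}
    for element in root.iter("scheme"):
        # Attributes are looked up by name, so their order in the tag is irrelevant.
        schemes[element.get("alias")] = (element.get("uri"), element.get("type"))
    return schemes


def resolve(code, schemes):
    """Split 'alias:value' at the first colon and map the alias to its scheme URI."""
    alias, _, value = code.partition(":")
    uri, _scheme_type = schemes[alias]
    return uri, value


SCHEMES = load_schemes(CATALOG_XML)
```

With these assumptions, `resolve("s:1", SCHEMES)` yields the providerA subject scheme URI paired with the value `1`, and the optional `type` attribute is simply `None` where it is not declared.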


                                                  Misha

                                                  NewsML 2 resources:
                                                  http://www.iptc.org/ | http://www.iptc.org/NAR/
                                                  http://www.iptc.org/NAR/1.0 | http://groups.yahoo.com/group/newsml-2/


                                                  To find out more about Reuters visit www.about.reuters.com

                                                  Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Reuters Ltd.
                                                • Laurent Le Meur
                                                  Message 24 of 28 , Mar 6, 2006
                                                    I hope you are joking. A CURIE inside the very thing that helps define CURIEs...
                                                    that is like a catalog with a catalogRef at the top. Never.
                                                    Laurent

                                                    >
                                                    > Note also that we may, later, wish to add a @type, eg:
                                                    >
                                                    > <scheme uri="..." type="..." alias="..."/>
                                                    > <!--
                                                    > I am using scheme "..." for "..." and am locally representing it
                                                    > by "..."
                                                    > -->
                                                    >
                                                    > For example:
                                                    >
                                                    > <scheme
                                                    > uri="http://en.wikipedia.org/wiki/ISO4217#Currency_Numeric_Codes"
                                                    > type="type:cur"
                                                    > alias="iso4217"/>
                                                    > <!--
                                                    > I am using scheme
                                                    > "http://en.wikipedia.org/wiki/ISO4217#Currency_Numeric_Codes"
                                                    > for currencies and am locally representing it by "iso4217"
                                                    > -->
                                                    >
                                                    >
                                                    > Misha
                                                    >


                                                  • Misha Wolf
                                                    Message 25 of 28 , Mar 6, 2006
                                                      Michael wrote:

                                                      > - for package items the provider must provide either one (or a
                                                      > small set of) pre-reconciled external catalog(s) - presumably
                                                      > maintained by him - OR only reconciled and on-the-fly created
                                                      > inline catalogs.
                                                      >
                                                      > This should work.

                                                      I think that we should take this approach for now. If it proves
                                                      inadequate for an important set of cases, we can always add the kind
                                                      of feature we've been discussing.
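One way to picture the on-the-fly reconciliation Michael describes is the sketch below. It is only an illustration under assumptions: the function name `reconcile`, the numeric-suffix renaming rule, and the clash on alias `s` are all hypothetical (in the earlier example providerB actually used `subj`), and nothing here is a proposed NewsML 2 mechanism.

```python
def reconcile(catalogs):
    """Merge alias->URI maps from several providers into one catalog.

    An alias that is already bound to a different URI gets a numeric suffix.
    Returns (merged, renames), where renames maps (catalog_index, old_alias)
    to the alias used in the merged catalog.
    """
    merged = {}   # alias -> uri in the reconciled catalog
    by_uri = {}   # uri -> alias already chosen for that scheme
    renames = {}
    for index, catalog in enumerate(catalogs):
        for alias, uri in catalog.items():
            if uri in by_uri:
                # Same scheme seen before: reuse the alias already chosen.
                chosen = by_uri[uri]
            else:
                chosen = alias
                suffix = 2
                while chosen in merged:
                    # Alias is taken by a different scheme: try alias2, alias3, ...
                    chosen = f"{alias}{suffix}"
                    suffix += 1
                merged[chosen] = uri
                by_uri[uri] = chosen
            if chosen != alias:
                renames[(index, alias)] = chosen
    return merged, renames


# Hypothetical clash: both providers use the alias "s" for different schemes.
PROVIDER_A = {"s": "http://purl.org/provA/subjects"}
PROVIDER_B = {"s": "http://provB.com/MySubjects"}
MERGED, RENAMES = reconcile([PROVIDER_A, PROVIDER_B])
```

The `renames` map is what the package provider would then need in order to rewrite the codes copied from each News Item, which is the cost that moves from the consumer to the provider under this approach.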

                                                      Misha
