RE: [newsml] Identifying story threads from NewsML

  • Misha Wolf
    Message 1 of 4 , May 2, 2008
      When using the NewsML-G2 Subject property to tie together News Items
      about an event, one would place an Event identifier in the Subject
      property.

      Misha


      -----Original Message-----
      From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf Of Laurent LE MEUR
      Sent: 02 May 2008 13:18
      To: newsml@yahoogroups.com
      Subject: RE: [newsml] Identifying story threads from NewsML

      It is true that NewsML 1 did not tackle this topic.

      * NewsItemId must not be used for this purpose, as it is part of the unique identification of the news item.
      * NameLabel would not be my favorite choice, as it aims at identifying a single news item (and not a "thread").
      * Currently most wires use a slug term for this purpose (e.g. OLY2008).
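
The slug approach in the last bullet can be sketched as follows. This is only an illustration under stated assumptions, not any wire's actual practice: it assumes the slug sits in SlugLine at the path quoted in the original question, that every item carries a NewsManagement/ThisRevisionCreated timestamp in one consistent compact UTC form (so that plain string comparison orders them), and the sample items and the make_item helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def make_item(item_id, slug, revised):
    """Build a hypothetical, heavily trimmed NewsML 1.x document."""
    return f"""<NewsML><NewsItem>
      <Identification><NewsIdentifier>
        <NewsItemId>{item_id}</NewsItemId>
      </NewsIdentifier></Identification>
      <NewsManagement>
        <ThisRevisionCreated>{revised}</ThisRevisionCreated>
      </NewsManagement>
      <NewsComponent><NewsComponent><NewsLines>
        <SlugLine>{slug}</SlugLine>
      </NewsLines></NewsComponent></NewsComponent>
    </NewsItem></NewsML>"""


# Illustrative data only; identifiers and timestamps are made up.
SAMPLES = [
    make_item("item-001", "OLY2008", "20080501T100000Z"),
    make_item("item-002", "PLANE-CRASH", "20080501T120000Z"),
    make_item("item-003", "OLY2008", "20080502T090000Z"),
]


def latest_by_slug(documents):
    """Map each slug (the "thread") to the NewsItemId of its latest item."""
    latest = {}
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for item in root.iter("NewsItem"):
            slug_el = item.find(".//SlugLine")
            if slug_el is None:
                continue  # no slug, so the item cannot be assigned to a thread
            # SlugLine may contain inline markup, so collect all nested text.
            slug = "".join(slug_el.itertext()).strip()
            revised = item.findtext(
                "NewsManagement/ThisRevisionCreated", default="")
            item_id = item.findtext(
                "Identification/NewsIdentifier/NewsItemId", default="")
            # Assumes same-format timestamps that sort lexicographically.
            if slug not in latest or revised > latest[slug][0]:
                latest[slug] = (revised, item_id)
    return {slug: item_id for slug, (_, item_id) in latest.items()}


print(latest_by_slug(SAMPLES))
```

Since the meaning of a slug is left to each provider, a grouping like this only holds within one provider's feed and only while that provider keeps the slug stable for the thread.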

      Note that NewsML-G2 defines a property named <instanceOf> specifically for this purpose (a "thread", or a "fixture" like "market opening").
      Note also that if the "thread" is an entity (an event, a location) that is the "subject" of a set of news items, you can use the NewsML-G2 <subject> property to identify this entity.
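
For the two NewsML-G2 properties mentioned above, a reader could pull out candidate thread keys roughly as below. This is a sketch under assumptions that should be checked against the G2 specification: it assumes <instanceOf> sits under itemMeta and <subject> under contentMeta, that concepts are referenced through a qcode attribute, and that event subjects are flagged with type="cpnat:event". The scheme aliases exfix, exevent and exsubj, the guid, and the sample item are hypothetical, and the sample is far from a complete, valid G2 item.

```python
import xml.etree.ElementTree as ET

# Namespace of NewsML-G2 elements, bound here to the local prefix "nar".
NAR = {"nar": "http://iptc.org/std/nar/2006-10-01/"}

# Hypothetical, trimmed item; scheme aliases and values are made up.
SAMPLE = """<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/"
          guid="urn:example:item-1" version="1">
  <itemMeta>
    <instanceOf qcode="exfix:market-opening"/>
  </itemMeta>
  <contentMeta>
    <subject type="cpnat:event" qcode="exevent:oly2008"/>
    <subject qcode="exsubj:sport"/>
  </contentMeta>
</newsItem>"""


def thread_keys(xml_text):
    """Return the identifiers that could tie this item to a thread."""
    root = ET.fromstring(xml_text)
    # Fixtures or threads the item declares itself an instance of.
    fixtures = [
        el.get("qcode")
        for el in root.findall("nar:itemMeta/nar:instanceOf", NAR)
    ]
    # Only subjects explicitly typed as events (assumed convention).
    events = [
        el.get("qcode")
        for el in root.findall("nar:contentMeta/nar:subject", NAR)
        if el.get("type") == "cpnat:event"
    ]
    return {"guid": root.get("guid"), "instanceOf": fixtures, "events": events}


print(thread_keys(SAMPLE))
```

Items sharing the same instanceOf value or the same event qcode could then be grouped and ordered by their version or timestamps to find the latest one in the thread.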

      Best regards
      Laurent Le Meur
      AFP


      > -----Original Message-----
      > From: newsml@yahoogroups.com [mailto:newsml@yahoogroups.com] On Behalf
      > Of masood_a
      > Sent: Thursday 1 May 2008 20:08
      > To: newsml@yahoogroups.com
      > Subject: [newsml] Identifying story threads from NewsML
      >
      > Hi-
      >
      > We are trying to identify a field in NewsML that can be used to
      > identify a set of related stories consistently. This set of stories
      > can be about, let's say, Iraq, or an event that is occurring on a single
      > day like "Plane Crash". We would like to always identify the latest
      > news article about a particular "thread". A thread may last for a
      > short time, like a day, or longer, say months.
      >
      > There are a few options available to us right now. I would like input
      > on the various options.
      >
      > I am including a list of the fields (from the NewsML 1.2 spec) that
      > we are looking at below. From all the options below, it appears that
      > the field NameLabel may make more sense as this is in the category of
      > items that are "Informal Identifiers", and are not expected to be
      > using any markup.
      >
      > It would be good to know the mechanism being used by major wires as
      > that can help us narrow down our options. I am including the
      > description of the elements from the NewsML spec for your convenience.
      >
      > 1. /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines/SlugLine
      >
      > [The SlugLine element provides a string of text, possibly embellished
      > by hyperlinks and/or formatting, used to display a NewsItem's slug
      > line. (Note that the meaning of the term "slug line", and the uses to
      > which it is put, are a matter for individual providers to define
      > within their own workflow and business practice.)]
      >
      > 2. /NewsML/NewsItem/Identification/NameLabel
      >
      > [The NameLabel element contains a string used by human users as a name
      > to help identify a NewsItem. Its form is determined by the provider.
      > It might be identical to the textual content of the SlugLine element,
      > for example, but even if this is so, the system should not process the
      > NameLabel as a slugline. Nothing can be assumed about the nature of
      > the string within NameLabel beyond the fact that it can help to
      > identify the NewsItem to humans.]
      >
      > In addition NameLabel is in the category of items that are identified
      > as "InformalIdentifiers". [In addition to the formal identification
      > mechanisms described above, NewsML provides a series of Label elements
      > that can be used by human users to identify NewsItems. As far as the
      > NewsML system is concerned, these are arbitrary strings, and cannot be
      > relied upon to provide a robust identification mechanism. Their sole
      > purpose is to provide a convenient way for humans to identify a
      > particular NewsItem in informal exchanges and communications, or as
      > part of a user interface.]
      >
      > 3. /NewsML/NewsItem/Identification/NewsIdentifier/NewsItemId
      >
      > [The NewsItemId is an identifier for the NewsItem. The combination of
      > NewsItemId and DateId must be unique among NewsItems that emanate from
      > the same provider. Within these constraints, the NewsItemId can take
      > any form the provider wishes. It may take the form of a name for the
      > NewsItem that will be meaningful to humans, but this is not a
      > requirement.]
      >
      > thanks,
      > -Masood
      >
      >
      > ------------------------------------
      >
      > Find more on NewsML at http://www.newsml.org
      >
      > Any member of this IPTC moderated Yahoo group must comply with the
      > Intellectual Property Policy of the IPTC, available at
      > http://www.iptc.org/goto/ipp. Any posting is assumed to be submitted
      > under the conditions of this IPTC IP Policy.
      > Yahoo! Groups Links
      >
      >
      >



      This e-mail, and any file transmitted with it, is confidential and intended solely for the use of the individual or entity to whom it is addressed. If you have received this email in error, please contact the sender and delete the email from your system. If you are not the named addressee you should not disseminate, distribute or copy this email.

      For more information on Agence France-Presse, please visit our web site at http://www.afp.com



      This email was sent to you by Thomson Reuters, the global news and information company.
      Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Thomson Reuters.
    • Jon Garfunkel
      Message 2 of 4 , May 2, 2008
        I'm at the NewsTools conference at Yahoo in Sunnyvale, listening
        to Dan Meredith, a product manager at Google News. He has soured on
        classification systems because he calls them (1) unstable, (2) unreliable.

        I was going to ask him whether they've considered NewsML.

        I'm at http://www.newstools.org/
        http://twitter.com/newstools2008

        Jon

        Jon Garfunkel
        Boston, Mass.
        http://civilities.net/
      • Misha Wolf
        Message 3 of 4 , May 2, 2008
          Hi Jon,

          NewsML isn't a classification system. It's a markup language.

          Regards,
          Misha


          -----Original Message-----
          From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On
          Behalf Of Jon Garfunkel
          Sent: 02 May 2008 22:00
          To: newsml-g2@yahoogroups.com
          Cc: newsml@yahoogroups.com
          Subject: [newsml-g2] Google News talk

          I'm at the NewsTools conference at Yahoo in Sunnyvale, listening
          to Dan Meredith, a product manager at Google News. He has soured on
          classification systems because he calls them (1) unstable, (2) unreliable.

          I was going to ask him whether they've considered NewsML.

          I'm at http://www.newstools.org/
          http://twitter.com/newstools2008

          Jon

          Jon Garfunkel
          Boston, Mass.
          http://civilities.net/


        • Jon Garfunkel
          Message 4 of 4 , May 2, 2008
            Natch, NewsCodes.
            I'll have a follow-up post. In short, a number of folks were a bit wary of
            Google News defining a new SiteMaps schema without sufficient input from
            the community of online news producers. I try to mention NewsML as much as
            I can so that people at least know that there are precedents in this space.

            Jon

            On Sat, 3 May 2008, Misha Wolf wrote:

            > Hi Jon,
            >
            > NewsML isn't a classification system. It's a markup language.
            >
            > Regards,
            > Misha