
RE: [newsml-g2] Concepts and Subjects

  • dave.compton@thomsonreuters.com
    Message 1 of 11 , Jan 15, 2010
      <inlineRef> is useful when you have multiple refs (in the text) to the same concept; each <inlineRef> can then map through to the single <assert> via inlineRef/@qcode (as below).
       

      <assert qcode="pers:gw.bush">

         <name>President G. W. Bush</name>

         <type qcode="cptType:9"/>

         ...

      </assert>

      <assert qcode="pers:g.bush">

         <name>President G. Bush</name>

         <type qcode="cptType:9"/>

         ...

      </assert>

      <assert qcode="N2:IR">

         <name>Iran</name>

         <type qcode="cptType:5" rtr:facet="geoProp:5"/>

         ...

      </assert>

      ...

      <inlineRef idrefs="x1"     qcode="pers:gw.bush"    confidence="50"/>

      <inlineRef idrefs="x1 x3"  qcode="pers:g.bush" confidence="70"/>

      <inlineRef idrefs="x7"     qcode="N2:IR"           confidence="100"/>

      ...

      <inlineXML contenttype="application/xhtml+xml" xml:lang="en" ...>

         <html xmlns="http://www.w3.org/1999/xhtml">

             <head><title/></head>

             <body>

                <!-- Inline2 -->

                <p> <span id="x1">President Bush</span> makes a speech about <span id="x7">Iran </span> today at 14:00.</p>

                <p> Later, <span id="x3">Bush</span> indicated that there were still many issues to be addressed.</p>

                ...

             </body>

         </html>

      </inlineXML>

       
      Or:
      The <inlineRef> can define different @confidence etc. values for instances of the same concept, e.g.
      <inlineRef idrefs="x1"  qcode="pers:g.bush" confidence="70"/>
      <inlineRef idrefs="x3"  qcode="pers:g.bush" confidence="100"/>
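
A consumer could follow the mapping described above (span id, then inlineRef/@qcode, then the matching <assert>) roughly as in the sketch below. It is only an illustration under stated assumptions: the fragment is simplified and namespace-free, the <item> wrapper element is hypothetical, the helper names are invented for this example, and treating a missing @confidence as 0 is an assumption rather than anything the thread or the standard states.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment modelled on the example above.
# The <item> wrapper is hypothetical; a real newsItem uses the NewsML-G2 namespace.
DOC = """
<item>
  <assert qcode="pers:gw.bush"><name>President G. W. Bush</name></assert>
  <assert qcode="pers:g.bush"><name>President G. Bush</name></assert>
  <assert qcode="N2:IR"><name>Iran</name></assert>
  <inlineRef idrefs="x1" qcode="pers:gw.bush" confidence="50"/>
  <inlineRef idrefs="x1 x3" qcode="pers:g.bush" confidence="70"/>
  <inlineRef idrefs="x7" qcode="N2:IR" confidence="100"/>
</item>
"""


def candidates_by_span(root):
    """Map each span id to a list of (confidence, qcode, name) candidates."""
    # One <assert> per concept, looked up by its qcode.
    names = {a.get("qcode"): a.findtext("name") for a in root.iter("assert")}
    spans = {}
    for ref in root.iter("inlineRef"):
        qcode = ref.get("qcode")
        # Assumption: a missing @confidence is treated as 0 in this sketch.
        confidence = int(ref.get("confidence", "0"))
        # @idrefs is a whitespace-separated list of span ids.
        for span_id in ref.get("idrefs", "").split():
            spans.setdefault(span_id, []).append(
                (confidence, qcode, names.get(qcode))
            )
    return spans


def best_candidate(spans):
    """Pick the highest-confidence candidate for each span id."""
    return {sid: max(cands, key=lambda c: c[0]) for sid, cands in spans.items()}


root = ET.fromstring(DOC)
spans = candidates_by_span(root)
best = best_candidate(spans)
for span_id in sorted(best):
    print(span_id, best[span_id])
```

Here the span x1 has two candidate concepts, and the sketch simply keeps the one with the higher @confidence; a real application might prefer to keep all candidates.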
       
      Rgds
      DC

      From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On Behalf Of Paul Harman
      Sent: 15 January 2010 14:33
      To: newsml-g2@yahoogroups.com
      Subject: RE: [newsml-g2] Concepts and Subjects

       

      Actually I don't need to use <assert> at all; I can use <inlineRef> on its own to achieve what I need.
       

      From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On Behalf Of Paul Harman
      Sent: 15 January 2010 12:13
      To: newsml-g2@yahoogroups.com
      Subject: RE: [newsml-g2] Concepts and Subjects

       

      So an article that mentions Margaret Thatcher in the text would use an <assert> which references the concept in my knowledge base.
       
      I could use an <inlineRef> to link this assert to the occurrences of "Margaret Thatcher" in the <inlineXML>, and other NewsML-G2 elements such as <headline> and <description>.
       
      If the article is in fact Thatcher's biography, I should also add a <subject> referencing Thatcher. It's probably an editorial decision whether the article is "about" any of the concepts identified within it.
       
          Paul


      From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On Behalf Of Darko Gulija
      Sent: 15 January 2010 12:06
      To: newsml-g2@yahoogroups.com
      Subject: Re: [newsml-g2] Concepts and Subjects

       

      <assert> just asserts things about a concept; it tells you nothing about how the concept relates to the current item (you could even assert things about concepts not mentioned in the newsItem at all, although that would be a little strange).

      <inlineRef> (and <inline>) are references to concepts mentioned in some part of the item (usually the content, for inlineRef). They say nothing about the semantics of the relation between the concept and the item, only that the concept is referenced from within the item. You could provide some basic details about the concept inside <inlineRef>, or use <assert> to provide more details.

      <subject> tells you that the referenced concept is subject of the item.
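
The distinction above could be applied by a consumer roughly as in the following sketch: concepts carried in <subject> are what the item is about, while concepts that appear only in <inlineRef> are merely mentioned. The fragment is simplified and namespace-free, and the <item> wrapper, the qcodes and the function name are all hypothetical, chosen only to echo the Thatcher/London example elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment; the wrapper element and all qcodes
# are hypothetical and only illustrate the subject/inlineRef distinction.
DOC = """
<item>
  <contentMeta>
    <subject qcode="subj:politics"/>
    <subject qcode="pers:thatcher"/>
  </contentMeta>
  <inlineRef idrefs="x1" qcode="pers:thatcher"/>
  <inlineRef idrefs="x2" qcode="geo:london"/>
</item>
"""


def classify(root):
    """Split concept qcodes into 'about' and 'mentioned only' sets."""
    # Concepts the item is about: everything listed as a <subject>.
    about = {s.get("qcode") for s in root.iter("subject")}
    # Concepts referenced from the text via <inlineRef>.
    referenced = {r.get("qcode") for r in root.iter("inlineRef")}
    # A concept that is both a subject and referenced counts as 'about'.
    return {"about": about, "mentioned_only": referenced - about}


result = classify(ET.fromstring(DOC))
print(sorted(result["about"]))
print(sorted(result["mentioned_only"]))
```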

      I agree with the first sentence below. I would add "except for the concepts that are subjects".

      2010/1/15 Paul Harman <paul.harman@pressassociation.com>


      Thanks Misha. I've been doing this all wrong, haven't I?
       
      Inside my newsItem, I should be using <assert> to include detail about the discovered concept/entity, and <inlineRef> to link to where it occurs in the text (as per guidelines ch 18.3). I should NOT be using <subject>.
       
      I think this is a reasonable distinction: <subject> is for what the news item is about; <assert> is for things that it mentions...?
       
          Paul


      From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On Behalf Of misha.wolf@thomsonreuters.com
      Sent: 13 January 2010 18:03
      To: newsml-g2@yahoogroups.com
      Subject: RE: [newsml-g2] Concepts and Subjects

       

      Hi Paul,
       
      See the inlineRef element.
       
      Regards,
      Misha
       

      From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On Behalf Of Paul Harman
      Sent: 13 January 2010 17:43
      To: newsml-g2@yahoogroups.com
      Subject: RE: [newsml-g2] Concepts and Subjects

      I suppose this is what @why is for, but there isn't a suitable value in the existing recommended/preferred scheme. Maybe the IPTC could add "referenced" to the existing vocabulary?
       
          Paul


      From: newsml-g2@yahoogroups.com [mailto:newsml-g2@yahoogroups.com] On Behalf Of Paul Harman
      Sent: 13 January 2010 16:53
      To: newsml-g2@yahoogroups.com
      Subject: [newsml-g2] Concepts and Subjects

       

      Sorry everyone, I'm confused.

      Inside my <newsItem> I want to provide lists of classifications and
      lists of entities discovered in the text. I had somehow got the mistaken
      impression that I should use <subject> for the classification and
      <concept> for the entities.

      However, there is no mechanism inside <newsItem> to embed [references
      to] concepts (or at least I can't find it); I see now (from reading the
      Implementer's Guide) that I should use <subject> to refer to the
      identified entities too, with an @type to declare whether it's an
      abstract concept (e.g. Politics) or a concrete entity of a specified type
      (e.g. London, Margaret Thatcher).

      But the article isn't ABOUT e.g. London - it's just that London is
      mentioned in the story. So London is not the SUBJECT of the article in
      the way that (say) politics IS the subject of the article. How do I
      distinguish articles /about/ Margaret Thatcher from articles that
      mention Margaret Thatcher? @relevance doesn't really convey what I am
      trying to express.

      Paul

      This email is from the Press Association. For more information, see www.pressassociation.com.
      This email may contain confidential information.
      Only the addressee is permitted to read, copy, distribute or otherwise use this email or any attachments.
      If you have received it in error, please contact the sender immediately.
      Any opinion expressed in this email is personal to the sender and may not reflect the opinion of the Press Association.
      Any email reply to this address may be subject to interception or monitoring for operational reasons or for lawful business practices.


      This email was sent to you by Thomson Reuters, the global news and information company.
      Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of Thomson Reuters.




