Re: [TaxoCoP] Re: Tools to merge vocabularies

  • Seth Earley
    Message 1 of 11, Sep 29, 2009
    This might be a good topic for one of our monthly conference calls.

    ________________________________
    From: TaxoCoP@yahoogroups.com <TaxoCoP@yahoogroups.com>
    To: TaxoCoP@yahoogroups.com <TaxoCoP@yahoogroups.com>
    Sent: Tue Sep 29 08:35:15 2009
    Subject: [TaxoCoP] Re: Tools to merge vocabularies



    I would be interested to hear of software that meets Heather's specifications. We use Data Harmony's Thesaurus Master for merging taxonomies (full disclosure: I'm employed by its producer Access Innovations). The software permits successive imports from external files, providing various views of the combined file. Each source taxo becomes a main branch of the target taxo and the aggregate can be worked with, analyzing for similar or overlapping terms and concepts. Viewing the aggregate in permuted format reveals common words in terms and sometimes common underlying concepts that can be obscured by how the term is expressed. The software alerts you to incoming terms that are identical to target file NPTs. Identical preferred terms from the separate files are merged; they should be analyzed to be sure they represent the same concept--if not, one or the other term must be reworded to be more accurate and clear.
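The permuted view described above can be approximated with a simple rotated-word index. This is only a minimal sketch under stated assumptions, not how Thesaurus Master works internally: the function name `permuted_index`, the stop list, and the sample terms are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical stop list; a real one would be tuned to the vocabulary.
STOP_WORDS = {"of", "and", "the", "for", "in", "to"}


def permuted_index(terms):
    """Map each significant word to the (source, term) pairs that contain it."""
    index = defaultdict(list)
    for source, term in terms:
        for word in term.lower().replace("-", " ").split():
            if word not in STOP_WORDS:
                index[word].append((source, term))
    return dict(sorted(index.items()))


# Made-up sample terms from two source vocabularies, "A" and "B".
terms = [
    ("A", "Water pollution"),
    ("B", "Pollution of water"),
    ("B", "Air quality"),
]

# List only the words shared by terms from more than one source.
for word, entries in permuted_index(terms).items():
    if len({source for source, _ in entries}) > 1:
        print(word, entries)
```

Words that bring together terms from different sources are candidates for human review; the index only surfaces them, it does not decide whether the concepts are the same.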

    Working in the hierarchy view, identical terms in separate taxo files may acquire multiple broader terms if the additional new parents work out well; if they don't, the expression must change to reflect the concept more accurately. Through the merge, some terms may also adopt children under the identical BT from the other source files. If the collected children play well together, great; otherwise the incompatible family groups get separated and the name of one of the parents must change to disambiguate.
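A rough sketch of that merge step, assuming each source file is reduced to a mapping from term to its set of broader terms (BTs). The function `merge_hierarchies` and the sample data are hypothetical and are not taken from any particular product.

```python
from collections import defaultdict


def merge_hierarchies(sources):
    """Union broader-term (BT) links across sources and flag terms for review.

    sources: {source_name: {term: set_of_broader_terms}}
    Returns (merged, review). review lists terms that occur in more than
    one source with differing BT sets, so a person can decide whether the
    extra parents fit or the term needs rewording.
    """
    merged = defaultdict(set)
    seen_in = defaultdict(dict)
    for name, hierarchy in sources.items():
        for term, broader in hierarchy.items():
            merged[term] |= broader
            seen_in[term][name] = frozenset(broader)
    review = sorted(
        term
        for term, by_source in seen_in.items()
        if len(by_source) > 1 and len(set(by_source.values())) > 1
    )
    return dict(merged), review


# Made-up sample data: the same label arrives with different parents.
sources = {
    "astronomy": {"Mercury": {"Planets"}, "Venus": {"Planets"}},
    "chemistry": {"Mercury": {"Metals"}, "Iron": {"Metals"}},
}
merged, review = merge_hierarchies(sources)
print(sorted(merged["Mercury"]))
print(review)
```

In this sample the term "Mercury" ends up with two parents and is flagged, which is exactly the case where the parents do not "work out well" and one expression has to change.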

    We've used the software to blend up to five source files into one and also for mapping multilingual files.

    A first pass run to spot duplicates is good. Before long, humans must, of course, analyze and compare overall structure and specific terms.

    Alice

    --- In TaxoCoP@yahoogroups.com, Heather Hedden <heather@...> wrote:
    >
    > If you want to merge one controlled vocabulary into another, what tools
    > are there to do a first pass (before a human review) of automatic
    > matching? This matching would include 1) exact matches of preferred
    > terms between the two vocabularies, 2) matches between preferred terms
    > of one and nonpreferred terms of the other, and 3) possibly even
    > additional fuzzy matches based on natural language processing.
    >
    > Information sought in preparation for a Taxonomy Boot Camp presentation.
    > Thanks.
    >
    > -- Heather
    >
    > --
    > Heather Hedden
    > Hedden Information Management
    > Heather@...
    > www.Hedden-Information.com
    >
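The three kinds of matching in the quoted question can be sketched as a single first pass. This is an illustration under stated assumptions, not a recommendation of any tool: each vocabulary is assumed to be a dict of preferred term to list of nonpreferred terms, the function names and sample terms are made up, and the fuzzy step uses plain string similarity from Python's standard `difflib` rather than real natural language processing.

```python
import difflib


def normalize(label):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(label.lower().split())


def match_vocabularies(source, target, cutoff=0.85):
    """Return (source_term, target_term, match_type) candidates for human review.

    source, target: {preferred_term: [nonpreferred_terms]}
    """
    target_pts = {normalize(pt): pt for pt in target}
    target_npts = {normalize(n): pt for pt, npts in target.items() for n in npts}
    matches = []
    for pt, npts in source.items():
        key = normalize(pt)
        if key in target_pts:                      # 1) exact PT to PT
            matches.append((pt, target_pts[key], "exact PT"))
            continue
        if key in target_npts:                     # 2a) source PT is a target NPT
            matches.append((pt, target_npts[key], "PT to NPT"))
            continue
        hit = next((target_pts[normalize(n)] for n in npts
                    if normalize(n) in target_pts), None)
        if hit is not None:                        # 2b) source NPT is a target PT
            matches.append((pt, hit, "NPT to PT"))
            continue
        close = difflib.get_close_matches(key, list(target_pts), n=1, cutoff=cutoff)
        if close:                                  # 3) fuzzy string match
            matches.append((pt, target_pts[close[0]], "fuzzy"))
    return matches


# Made-up sample vocabularies.
source = {"Automobiles": ["Cars"], "Heart attack": [],
          "Organisations": [], "Physicians": ["Doctors"]}
target = {"Automobiles": [], "Myocardial infarction": ["Heart attack"],
          "Organizations": [], "Doctors": []}

for match in match_vocabularies(source, target):
    print(match)
```

Every row is only a candidate. As the rest of the thread makes clear, a taxonomist still has to confirm that matched labels stand for the same concept.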
  • laptopjockey
    Message 2 of 11, Sep 29, 2009
      --- In TaxoCoP@yahoogroups.com, "Alice" <aredmondneal@...> wrote:
      >
      > I would be interested to hear of software that meets Heather's specifications.

      I'm curious...if merging multiple taxonomies / thesauri is focused on (among other things) getting rid of 'duplicates' how do homographs live in such an environment?

      For example, how does 'Appendix' in a medical thesaurus live with 'Appendix' from a publishing thesaurus?

      John O'
    • Heather Hedden
      Message 3 of 11, Sep 29, 2009
        First of all, merging should only be done on thesauri in the same field
        (subject area). But if the field is broad, then undifferentiated
        homographs with different meanings may occur and be matched. A
        taxonomist must review the automated matches. Parenthetical qualifiers
        can be added if both homographs with different meanings are kept.

        -- Heather

        Heather Hedden
        Hedden Information Management
        www.Hedden-Information.com



      • Janice M Herd
        Message 4 of 11, Sep 29, 2009
          Hi John,
          It would be necessary to disambiguate the two terms if one merges two vocabularies of distinct subject areas such as medicine and publishing.
          Example: Appendix (Anatomy) uses a parenthetical qualifier
          Appendices (to printed material) might be used in the plural since Z39.19 (NISO monolingual thesaurus construction standard) requires most terms to be plural and it could also receive a parenthetical qualifier.
          Jan
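Adding parenthetical qualifiers during a merge can be sketched as follows, assuming each incoming term carries the name of its subject area. The function `qualify_homographs` and the domain names are made up for illustration, and the sketch does not attempt the singular/plural choice discussed above.

```python
from collections import defaultdict


def qualify_homographs(entries):
    """Add a parenthetical qualifier to labels that occur in more than one domain.

    entries: list of (label, domain) pairs from the merged vocabularies.
    """
    domains = defaultdict(set)
    for label, domain in entries:
        domains[label.lower()].add(domain)
    return [
        f"{label} ({domain})" if len(domains[label.lower()]) > 1 else label
        for label, domain in entries
    ]


# Sample data based on the 'Appendix' example in this thread.
entries = [("Appendix", "Anatomy"), ("Appendix", "Publishing"), ("Liver", "Anatomy")]
print(qualify_homographs(entries))
```

Labels that are unique across the merged set are left alone, so qualifiers appear only where a homograph actually exists.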


        • John O'Gorman
          Message 5 of 11, Sep 29, 2009
            That doesn't seem like a very scalable solution...

            Subject fields intersect all the time - and are more likely to do so with greater regularity in the future. Think of all the areas where 'Law' touches the ground, or 'Medicine' or heck 'Technology'. Just ten cents worth, but if Taxonomists as a group are going to make an even more significant contribution in the future, we're going to have to find a more elegant way to manage ambiguity in mixed subject areas.

            John O'

          • Heather Hedden
            Message 6 of 11, Sep 29, 2009
              My mistake, I was thinking of mapping projects. Merging, yes, can be
              done in overlapping fields.
              But Jan is correct that parenthetical qualifiers are used to
              disambiguate homonyms in the same taxonomy, or at least the same facet.
              If that's not elegant, then when the terms are in separate facets the
              parenthetical qualifiers may not be necessary in the display, but the
              terms still have to be distinguished under the hood and in the indexing.

              -- Heather
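One way to picture "distinguished under the hood" is to give every concept its own identifier and decide at display time whether the qualifier is needed. The sketch below is only an assumption about how that could look: the `Concept` class, the identifier scheme, the facet names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str   # hypothetical unique identifier used for indexing
    label: str
    facet: str
    qualifier: str


def display_label(concept, concepts):
    """Show the qualifier only when another concept in the same facet shares the label."""
    clash = any(
        other.concept_id != concept.concept_id
        and other.facet == concept.facet
        and other.label.lower() == concept.label.lower()
        for other in concepts
    )
    return f"{concept.label} ({concept.qualifier})" if clash else concept.label


# Made-up sample concepts: one homograph pair split across facets, one within a facet.
concepts = [
    Concept("c1", "Appendix", "Body parts", "Anatomy"),
    Concept("c2", "Appendix", "Document parts", "Publishing"),
    Concept("c3", "Mapping", "Activities", "Cartography"),
    Concept("c4", "Mapping", "Activities", "Vocabularies"),
]

for concept in concepts:
    print(concept.concept_id, display_label(concept, concepts))
```

Indexing would record the identifier rather than the label, so the two "Appendix" concepts stay distinct even though users browsing either facet see the bare word.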



            • John O'Gorman
              Message 7 of 11, Sep 29, 2009
                This is interesting. Your response uses the word 'mapping' in a way that I can't disambiguate: does 'mapping' refer to Geography or to a broader qualifier in the Taxonomy field?  :~)

                Definitely agree that homographs have to be disambiguated somewhere; the point I was trying to make (and as usual not very elegantly) is that we as taxonomists have an opportunity in this rapidly shrinking environment to push the process to the fore.

                John O'
