data modeling and taxonomy

  • Heather Hedden
    Message 1 of 21 , Jan 4, 2010
      How do data modeling and taxonomy relate?
      I'm curious to hear from those of you who have done data modeling how you would describe what a data modeler does.
      Thanks.

      -- Heather
      -- 
      Heather Hedden
      Hedden Information Management
      www.Hedden-Information.com
    • Gabriel Tanase
      Message 2 of 21 , Jan 4, 2010
        A data modeler is expected to create blueprint designs (known as 'logical data models') for data structures that are then physically implemented in databases, for use by software application(s).
        As part of logical data modeling a data modeler may create a generalization/specialization hierarchy of types (classes, entities), which - IMHO - is the area where data modeling and taxonomy meet most intensely.
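To make the overlap concrete, here is a minimal Python sketch of a generalization/specialization hierarchy. The entity names (Party, Person, Organization, Insurer) are hypothetical examples and do not come from any real model. The helper reads the type hierarchy back out as broader/narrower pairs, which is roughly the shape a taxonomy would have.

```python
class Party:
    """Most general type: anyone the business deals with."""


class Person(Party):
    """Specialization of Party."""


class Organization(Party):
    """Specialization of Party."""


class Insurer(Organization):
    """Specialization of Organization."""


def broader_narrower_pairs(root):
    """Walk the subtype tree and return (broader, narrower) name pairs."""
    pairs = []
    for sub in root.__subclasses__():
        pairs.append((root.__name__, sub.__name__))
        pairs.extend(broader_narrower_pairs(sub))
    return pairs


if __name__ == "__main__":
    for broader, narrower in broader_narrower_pairs(Party):
        print(f"{broader} > {narrower}")
```

The same structure serves two readers: the data modeler sees supertypes and subtypes, while the taxonomist sees broader and narrower terms.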

        Regards,
        Gabriel
        http://www.linkedin.com/in/gabrieltanase

      • Matt Moore
        Message 3 of 21 , Jan 4, 2010
          Gabriel has got it there for me. The key difference is that data modelers work with the mapping of structured data (i.e. entities and rows in databases), whereas taxonomists work with unstructured information. Organizations often want to use the structures in their database applications (e.g. customer attributes in their CRM system) in their taxonomies (e.g. applying this information to proposal documentation). This is a good idea in principle, but some care needs to be taken with the implementation for it to be usable. Discussions around Master Data Management should probably involve both.
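As a rough illustration of reusing database values in a taxonomy, the Python sketch below derives a controlled list of industry terms from CRM-style customer rows and uses it to check the tags on a proposal document. The records, the field names and the `controlled_values` and `validate_tags` helpers are all invented for the example.

```python
# Hypothetical structured data, as it might come out of a CRM table.
CUSTOMERS = [
    {"id": 1, "name": "Acme Ltd", "industry": "Insurance", "region": "EMEA"},
    {"id": 2, "name": "Globex", "industry": "Health Care", "region": "APAC"},
    {"id": 3, "name": "Initech", "industry": "Insurance", "region": "EMEA"},
]


def controlled_values(rows, attribute):
    """Collect the distinct values of one attribute as a sorted term list."""
    return sorted({row[attribute] for row in rows})


def validate_tags(tags, vocabulary):
    """Split document tags into those found in the vocabulary and those not."""
    accepted = [t for t in tags if t in vocabulary]
    rejected = [t for t in tags if t not in vocabulary]
    return accepted, rejected


if __name__ == "__main__":
    industries = controlled_values(CUSTOMERS, "industry")
    print(industries)
    print(validate_tags(["Insurance", "Banking"], industries))
```

Values taken straight from a database column often need tidying (spelling variants, codes instead of labels) before they are usable as terms, which is one place where the implementation needs care.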


        • Bob Bater
          Message 4 of 21 , Jan 4, 2010

            I think Gabriel captures the overlap between data modelling and taxonomy precisely. My only comment would be that the missing component is an over-arching ontology which expresses the context of both data model and taxonomy.

             

            Regards,

             

            Bob

             

          • Gabriel Tanase
            Message 5 of 21 , Jan 5, 2010
              Bob,

              I agree with you and I already knew it, however I didn't want to complicate matters in my first response to Heather.

              Apologies if the below sounds like a product plug. It is not intended as such; it is only meant to give you brief information on something that people seem to have an interest in, and to illustrate one of the approaches that exist.

              As it happens, the set of models I am lead developer for (the IBM Insurance Information Warehouse - IIW, part of the IBM Industry Models product set) does have at its "conceptual" apex an ontology / taxonomy of business terms, relationships between these and a classification hierarchy. It is named the "Conceptual Model" (CM). It started originally - many years ago - as a flat glossary.
              (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences).

              All the elements in the downstream IIW data models (below the taxonomy) do have mappings back to terms / concepts and relationships in the CM taxonomy.
              Also, the supporting toolset can generate a vocabulary / glossary from our CM that can be exported to the IBM InfoSphere Business Glossary product (part of the InfoSphere Metadata Server). From here it can be used as a reference vocabulary / glossary for other IBM products dealing with the data lifecycle (Information Analyzer, FastTrack, DataStage, QualityStage, Cognos).
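A rough Python sketch of the general idea of mapping data model elements back to conceptual terms and generating a glossary from them. Everything here (the term names, the table and column names, the `unmapped_elements` and `build_glossary` helpers) is hypothetical and made up for illustration. It does not describe how IIW or InfoSphere Business Glossary actually work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    name: str
    definition: str
    broader: Optional[str] = None  # parent term in the classification hierarchy


# Hypothetical conceptual model: business terms with one broader term each.
CONCEPTS = {
    "Agreement": Concept("Agreement", "A formal arrangement between parties."),
    "Insurance Policy": Concept(
        "Insurance Policy",
        "An agreement that provides insurance cover.",
        broader="Agreement",
    ),
}

# Hypothetical mapping from data model elements (tables, columns) to concepts.
ELEMENT_TO_CONCEPT = {
    "AGREEMENT": "Agreement",
    "POLICY": "Insurance Policy",
    "POLICY.POLICY_NUMBER": "Insurance Policy",
}


def unmapped_elements(mapping, concepts):
    """List data model elements that point at a term the model does not define."""
    return sorted(e for e, c in mapping.items() if c not in concepts)


def build_glossary(mapping, concepts):
    """Return one glossary entry per concept, with the elements that implement it."""
    glossary = {}
    for name, concept in concepts.items():
        glossary[name] = {
            "definition": concept.definition,
            "broader": concept.broader,
            "implemented_by": sorted(e for e, c in mapping.items() if c == name),
        }
    return glossary


if __name__ == "__main__":
    print(unmapped_elements(ELEMENT_TO_CONCEPT, CONCEPTS))
    for term, entry in build_glossary(ELEMENT_TO_CONCEPT, CONCEPTS).items():
        print(term, entry)
```

Keeping the mapping as data in its own right means it can be checked for gaps and exported as a glossary, rather than living only in the modeler's head.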

              I believe that, currently, IBM's Industry Models is the only product on the market that supports data models with an ontology / taxonomy in the same model set and maintains mappings between these. There might be other similar products out there; however, I am not aware of any myself.



              Kind regards,
              Gabriel
              http://www.linkedin.com/in/gabrieltanase



            • John O'Gorman
              Message 6 of 21 , Jan 5, 2010
                Excellent question, and as an expert in neither activity (data modelling or taxonomy creation) I can give you a layman's perspective that may (or may not) add to the discussion.
                 
                Individually, both exercises are designed to create a portable (meaning: for the purpose of communication) snapshot of someone's (or a group of someones) understanding of a problem space at a given point in time and in a defined 'locale'. From my point of view, these conditions (relatively narrow constraints of what, when and where) are prerequisites for the creation of a data model, be it conceptual, logical or to a lesser extent physical, and a taxonomy. To press the point a little, see what happens when one or more of the constraints are lifted: a data model for the insurance industry vs health care or now vs the 1900s or in Canada vs the United Arab Emirates (or for that matter between IBM and Microsoft) would all be more or less different.
                 
                So, the first thing a given data model and a given taxonomy have in common is that they are purpose-built to solve a particular problem. They have a whole range of assumptions built in that - depending on the skill of the creator - make them more or less extensible. A potential problem may exist if/when one of each is used to describe the same application, but that's a topic for another day.
                 
                The second thing is that, given the constraints previously mentioned, they should contain almost all of the same 'primary entities' but in different conformations. Data models and taxonomies do not serve the same function: the former is one of several classes of artifacts used in the process of capturing the information management lifecycle in a given enterprise. The latter is a standalone representation of the conceptual classification categories that normally serves only that function. Again, this is not without precedent or value: the alternator in my car only serves one purpose.
                 
                The third thing is related to the previous point: Data models have almost unlimited degrees of freedom, while a taxonomy (at least in the traditional sense) only ever has two. Almost by definition, a data model must include any entity in which the particular problem space is 'interested' while a taxonomy must make binary decisions from top to bottom: in or out (of this category); yes or no. While the creative taxonomist can invent interesting splits, each level of a taxonomic hierarchy can only accommodate one choice.
                 
                Lastly, I believe that there could be a much more organic relationship between data models and taxonomies. For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary.
                 
                It sounds like this is what Gabriel is close to (if not exactly) describing, so apologies if I've stated the obvious.
                 
                John O'
                 
                 

              • Keipat Patkei
                Message 7 of 21 , Jan 5, 2010
                  John, great response. Actually, this is a terrific thread overall, and thank you, Heather, Gabriel, and Bob for providing terrific information getting at the crux of defining parts of the whole.

                  As some of you know, one of my frustrations over the years has been the ongoing view or use of the term "taxonomy" to describe sort of the beginning and the end of all enterprise management of data everywhere, i.e. "the be all and end all" or "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise, required need, expected outcomes, and so on.  I suppose my real frustration comes when "taxonomy" limitations are then considered a "trap" or "failure" by those who think it can work around or make up for deficiencies in other systems with which it must co-exist/integrate. 

                  Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome. I've been actively engaged in all points raised in this thread in the course of my work of the last ten years; but you've provided me with some new ways of thinking about and describing/communicating the work that I do.  It's particularly welcome at the beginning of what I know is going to be a great year for all of us.  Thanks, again.

                  Keith DeWeese

                  Central Administration/Tribune Technology-Architecture

                  Tribune Company-Tribune Interactive

                  kdeweese@.../keith@...

                  +1-312-527-8740 (w)/+1-312-286-3568 (c)





                • Seth Earley
                  Message 8 of 21 , Jan 5, 2010

                    At the Academy of Motion Picture Arts and Sciences Metadata Symposium that I facilitated a few months ago, we spoke a great deal about taxonomy and metadata for motion picture assets.

                     

                    The discussions about metadata transformations by Gracenote and the Metadata Service Bureau by Warner Brothers were very interesting and relevant to this discussion of data modeling versus taxonomy.

                     

                    http://www.oscars.org/science-technology/council/projects/metadata-symposium/webcasts.html

                     

                    Seth

                     

                    Seth Earley

                    President
                    _____________________________

                    EARLEY & ASSOCIATES, Inc.
                    Cell: 781-820-8080

                    Email: seth@...  
                    Web: www.earley.com

                     


                     

                  • Bob Bater
                    Message 9 of 21 , Jan 5, 2010

                      Heather, Gabriel, John, Keith & anyone else who's following this thread:

                       

                      I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.

                       

                      Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melvil Dewey, the UDC and the LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.

                       

                      John also says:

                      > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <

                      I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.

                       

                      Keith says that he finds taxonomies are regarded as:

                      > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <

                      Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important to them when they want to access and process information. That gives you a metadata application profile for each community which can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes, it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
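A minimal sketch of the sequence described above, in Python. It assumes each community's profile is just a list of metadata elements, each tagged with how its value is populated (ad hoc, flat list, or taxonomy). The community names, element names and helper functions are all hypothetical, invented only to illustrate the idea that a taxonomy is needed only where an element calls for one.

```python
from dataclasses import dataclass
from enum import Enum


class ValueSource(Enum):
    AD_HOC = "ad hoc"        # free value supplied at entry time
    FLAT_LIST = "flat list"  # fixed, unordered list of allowed values
    TAXONOMY = "taxonomy"    # organized, maintained hierarchy of values


@dataclass(frozen=True)
class Element:
    name: str
    source: ValueSource


# Hypothetical metadata application profiles, one per community.
profiles = {
    "records managers": [
        Element("title", ValueSource.AD_HOC),
        Element("retention class", ValueSource.FLAT_LIST),
        Element("subject", ValueSource.TAXONOMY),
    ],
    "project teams": [
        Element("title", ValueSource.AD_HOC),
        Element("project phase", ValueSource.FLAT_LIST),
    ],
}


def corporate_profile(community_profiles):
    """Aggregate community profiles, keeping the first definition of each element."""
    merged = {}
    for elements in community_profiles.values():
        for element in elements:
            merged.setdefault(element.name, element)
    return list(merged.values())


def needs_taxonomy(profile):
    """Return the names of the elements whose values come from a taxonomy."""
    return [e.name for e in profile if e.source is ValueSource.TAXONOMY]


corporate = corporate_profile(profiles)
print([e.name for e in corporate])
print(needs_taxonomy(corporate))
```

In this toy profile only "subject" would justify building a taxonomy; the other elements are served by free text or a flat list.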

                       

                      Gabriel said:

                      > (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <

                       

                      My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.

                       

                      Keith said:

                      > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome.  <

                       

                      Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.

                       

                      Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is made therefore for variance in terminology by developing a thesaurus, which allows people to search using their native term, and for back-end software to translate this into the 'preferred term' established by the taxonomy.
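The thesaurus step above can be sketched roughly as a lookup table, assuming the simplest possible case where every variant term maps to exactly one preferred term. The terms and the `to_preferred` helper are hypothetical, and real thesaurus software would typically handle ambiguity, stemming and multi-word matching as well.

```python
# Hypothetical thesaurus: preferred term -> variant terms people actually use.
THESAURUS = {
    "motor vehicles": ["cars", "automobiles", "autos"],
    "personnel": ["staff", "employees", "workforce"],
}

# Invert it into a lookup keyed by lower-cased entry term.
ENTRY_TERMS = {}
for preferred, variants in THESAURUS.items():
    ENTRY_TERMS[preferred.lower()] = preferred
    for variant in variants:
        ENTRY_TERMS[variant.lower()] = preferred


def to_preferred(term):
    """Translate a searcher's native term into the preferred term, or None if unknown."""
    return ENTRY_TERMS.get(term.strip().lower())


print(to_preferred("Staff"))
print(to_preferred("widgets"))
```

The searcher keeps their own vocabulary; only the back end needs to know which term the taxonomy prefers.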

                       

                      Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.

                       

                      Regards,

                       

                      Bob

                    • Matt Moore
                      Message 10 of 21 , Jan 5, 2010
                        Bob,

                        "My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences."

                        I think you're using a particular version of the term "ontology" that might cause a little confusion. How does your ontology differ from a faceted classification structure? My understanding of ontologies is that they specify the "verbs" that link "nouns" as well as the nouns themselves (so they specify what a certain subclass of person can do to a certain subclass of document for example). What's "in" and what's "out" of your model?
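One way to picture the "nouns plus verbs" reading, as a rough Python sketch: a class hierarchy supplies the nouns, and a set of (subject, verb, object) triples supplies the verbs. All class names, verbs and the `may` helper are hypothetical, and the inheritance check is only a toy stand-in for what an ontology language would provide.

```python
# Class hierarchy (the "nouns"): child class -> parent class. Names are hypothetical.
SUBCLASS_OF = {
    "Manager": "Employee",
    "Employee": "Person",
    "Contract": "Document",
    "Invoice": "Document",
}

# Permitted relationships (the "verbs"): (subject class, verb, object class).
RELATIONS = {
    ("Manager", "approves", "Contract"),
    ("Employee", "submits", "Invoice"),
    ("Person", "reads", "Document"),
}


def ancestors(cls):
    """Yield cls itself, then each of its superclasses, nearest first."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def may(subject, verb, obj):
    """True if the relation is declared for these classes or for any of their superclasses."""
    return any(
        (s, verb, o) in RELATIONS
        for s in ancestors(subject)
        for o in ancestors(obj)
    )


print(may("Manager", "reads", "Invoice"))
print(may("Employee", "approves", "Contract"))
```

A faceted classification would stop at something like `SUBCLASS_OF`; the `RELATIONS` set and the lookup over superclasses are what this sketch adds on the ontology side.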

                        Cheers,

                        Matt



                      • Patrick Lambe
                        Message 11 of 21 , Jan 5, 2010
                          Well, I was just sitting back and enjoying the conversation, Bob. But since you ask, I'll start with a comment that Matt made early on, that there might be usability issues with reusing structures from data models in taxonomies, even though in principle such reuse makes sense.

                          I think there's a tendency for us to get very entity-focused in these discussions and definitions and stop there. There's a good reason for this. The common ground for data models, ontologies and taxonomies is their need to establish relatively stable entities at the very least; they each do slightly different things around the language referring to those entities, and they diverge in the type and extent of work around establishing and defining relationships and maybe inference-generating capabilities (which some taxonomy forms can support as well as ontologies). But the entities are the core point of reference.

                          But Matt's comment reminds us that it's important to remember that data models, taxonomies and ontologies are at the end of the day just instruments, and to understand the instrument is not just about understanding the entities it manipulates, but how the instrument is used, and for what purpose. 

                          The design of a tool is driven by its functionality, not its components. DM-T-Os (data models, taxonomies and ontologies) serve related purposes via different means and in different contexts. There are important differences in the amount of human vs machine processing expected or served. As Matt suggests, master data management is one way of getting a handle on how they can inter-operate. But fixing an entity and definition in one space (eg a data model) does not automatically qualify it for use in another space (eg a taxonomy).

                          I think we also assume that usability is only really relevant at the taxonomy level. In my book I suggested that taxonomies are for humans and ontologies are for machines, which risks feeding that assumption. But at the end of the day, the rationale for using any of these instruments, whether data models, taxonomies or ontologies, is that they must emerge into human use in some way. It's just that for DMs and Os, machine processes provide different opportunities and constraints from human ones. If we can't see the pathway to human use (which is where some of the visionary talk on ontologies falls down, I feel) then they risk floating away into philosophical (or organisational) abstractions. I think this is where a lot of the hard wrestling work needs to be done, to resolve relationships between the instruments, preserve a common core where possible, and reflect the context-driven needs at organisational and user levels.

                          This is all very abstract still... I think what would be useful would be some good clear cases where we can see the relationships in specific contexts.

                          P

                          Patrick Lambe

                          website: www.straitsknowledge.com

                          Have you seen our KM Method Cards or
                          Organisation Culture Cards?  







                        • Adrian Walker
                          Message 12 of 21 , Jan 6, 2010
                            Hi All,

                            Good discussion.  Here's a hopefully useful background item.

                            It's actually possible to combine data modeling, taxonomy creation and application writing in one notation -- something we call Executable English.

                            Here's a simple example that adds up fund amounts over a business hierarchy:

                              www.reengineeringllc.com/demo_agents/FundManagement1.agent
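For readers who want a feel for the kind of calculation involved, here is a rough Python sketch of adding up fund amounts over a business hierarchy. It is not the Executable English notation and is not taken from the linked demo; the hierarchy, the amounts and the `total_funds` function are all made up for illustration.

```python
# Hypothetical business hierarchy: unit -> list of its sub-units.
CHILDREN = {
    "Company": ["Division A", "Division B"],
    "Division A": ["Team A1", "Team A2"],
    "Division B": [],
}

# Hypothetical fund amounts held directly by each unit.
FUNDS = {
    "Company": 100,
    "Division A": 50,
    "Division B": 75,
    "Team A1": 20,
    "Team A2": 30,
}


def total_funds(unit):
    """Return the unit's own funds plus the totals of all units beneath it."""
    own = FUNDS.get(unit, 0)
    return own + sum(total_funds(child) for child in CHILDREN.get(unit, []))


print(total_funds("Division A"))
print(total_funds("Company"))
```

Here the hierarchy plays the part of the taxonomy, the fund table plays the part of the data model, and the recursive function is the application logic that ties them together.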

                            Apologies if you have seen this approach before, and thanks for comments.

                                                                                -- Adrian
                                              
                            Internet Business Logic
                            A Wiki and SOA Endpoint for Executable Open Vocabulary English over SQL and RDF
                            Online at www.reengineeringllc.com    Shared use is free   No advertisements

                            Adrian Walker
                            Reengineering







                          • lisa colvin
                            Message 13 of 21 , Jan 6, 2010
                              Thanks for the lively discussion. It's exciting to see these ideas coming together.

                              While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'. Typically the more expressive the language, the more expensive it is computationally. So, you need to pick the representation language which best fits your needs. If you're not building a model to drive some sort of expert system or related capabilities,  a simpler knowledge representation scheme is probably better.

                              However, one reason people use ontology languages in general is when there is a need for strong semantics which define the relationships/context. Even if you don't want to build an expert/recommendation/QA/NL-based system, you can still use a more formal ontology language as just a pure specification language.

                              So, is a faceted classification scheme an ontology? Some would say 'yes, if it uses an ontology language to express it'. Others might say it's not if you're not expressing/defining any inheritance relations. Overall, it probably doesn't matter what you call it as long as the semantics are rich enough to solve whatever problem you needed solving.

                              There are fundamental differences in how the various disciplines approach information modeling. What I've found most helpful in working with people in another discipline is to be very explicit about how basic terms (like "term" :) , "class", "instance", "inference") are used in expressing the model that you're sharing. The idea of "inference", for example, can vary widely between an expert system developer and an OO developer. If these terms aren't described explicitly and used consistently, people get confused.

                              I also found that defining the capabilities and mathematical relationship distinctions between "controlled vocabulary list", "synonym rings/synsets", "taxonomy", "thesaurus", "ontology", "description logics", etc. is really only interesting to taxonomists/ontologists and other curious people like us. :)

                              :) Lisa

                              On Tue, Jan 5, 2010 at 7:36 PM, Patrick Lambe <plambe@...> wrote:
                               

                              Well, I was just sitting back and enjoying the conversation, Bob. But since you ask, I'll start with a comment that Matt made early on, that there might be usability issues with reusing structures from data models in taxonomies, even though in principle such reuse makes sense.


                              I think there's a tendency for us to get very entity-focused in these discussions and definitions and stop there. There's a good reason for this. The common ground for data models, ontologies and taxonomies is their need to establish relatively stable entities at the very least; they each do slightly different things around the language referring to those entities, and they diverge in the type and extent of work around establishing and defining relationships and maybe inference-generating capabilities (which some taxonomy forms can support as well as ontologies). But the entities are the core point of reference.

                              But Matt's comment reminds us that it's important to remember that data models, taxonomies and ontologies are at the end of the day just instruments, and to understand the instrument is not just about understanding the entities it manipulates, but how the instrument is used, and for what purpose. 

                              The design of a tool is driven by its functionality, not its components. DM-T-Os serve related purposes via different means and in different contexts. There are important differences in the amount of human vs machine processing expected or served. As Matt suggests, master data management is one way of getting a handle on how they can inter-operate. But fixing an entity and definition in one space (e.g. a data model) does not automatically qualify it for use in another space (e.g. a taxonomy).

                              I think we also assume that usability is only really relevant at the taxonomy level. In my book I suggested that taxonomies are for humans and ontologies are for machines, which risks feeding that assumption. But at the end of the day, the rationale for using any of these instruments, whether data models, taxonomies or ontologies, is that they must emerge into human use in some way. It's just that for DMs and Os, machine processes provide different opportunities and constraints from human ones. If we can't see the pathway to human use (which is where some of the visionary talk on ontologies falls down, I feel) then they risk floating away into philosophical (or organisational) abstractions. I think this is where a lot of the hard wrestling work needs to be done, to resolve relationships between the instruments, preserve a common core where possible, and reflect the context-driven needs at organisational and user levels.

                              This is all very abstract still... I think what would be useful would be some good clear cases where we can see the relationships in specific contexts.

                              P

                              Patrick Lambe

                              website: www.straitsknowledge.com

                              Have you seen our KM Method Cards or
                              Organisation Culture Cards?  





                              On Jan 6, 2010, at 7:30 AM, Bob Bater wrote:


                              Heather, Gabriel, John, Keith & anyone else who's following this thread:

                               

                              I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.

                               

                              Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melville Dewey, UDC, LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.

                               

                              John also says:

                              > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <

                              I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.

                               

                              Keith says that he finds taxonomies are regarded as:

                              > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <

                              Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important to them when they want to access and process information. That gives you a metadata application profile for each community which can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes, it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
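
To make the "metadata profile first" sequence concrete, here is a rough sketch, assuming each metadata element declares how its value is populated: ad hoc, from a fixed flat list, or from a maintained hierarchy (a taxonomy). All class, element and community names below are hypothetical and only illustrate the idea; this is not a standard profile format.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, FrozenSet, List, Optional


class ValueSource(Enum):
    AD_HOC = "ad hoc"        # free-text value
    FLAT_LIST = "flat list"  # value drawn from a fixed list
    TAXONOMY = "taxonomy"    # value drawn from a maintained hierarchy


@dataclass
class MetadataElement:
    name: str
    source: ValueSource
    allowed_values: FrozenSet[str] = frozenset()  # used by FLAT_LIST
    # Used by TAXONOMY: term -> broader term (None for a top term).
    broader_term: Dict[str, Optional[str]] = field(default_factory=dict)

    def accepts(self, value: str) -> bool:
        """Check a candidate value against this element's population rule."""
        if self.source is ValueSource.AD_HOC:
            return bool(value.strip())
        if self.source is ValueSource.FLAT_LIST:
            return value in self.allowed_values
        return value in self.broader_term


def corporate_profile(*profiles: List[MetadataElement]) -> Dict[str, MetadataElement]:
    """Aggregate community profiles; the first definition of a name wins."""
    merged: Dict[str, MetadataElement] = {}
    for profile in profiles:
        for element in profile:
            merged.setdefault(element.name, element)
    return merged


title = MetadataElement("title", ValueSource.AD_HOC)
status = MetadataElement(
    "status", ValueSource.FLAT_LIST,
    allowed_values=frozenset({"draft", "approved", "archived"}),
)
subject = MetadataElement(
    "subject", ValueSource.TAXONOMY,
    broader_term={"Finance": None, "Budgeting": "Finance", "Auditing": "Finance"},
)

engineering_profile = [title, subject]  # hypothetical community profiles
records_profile = [title, status]
corporate = corporate_profile(engineering_profile, records_profile)
```

Only the `subject` element needs a taxonomy here; `title` and `status` are populated without one, which is the point of deciding element by element.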

                               

                              Gabriel said:

                              > (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <

                               

                              My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.

                               

                              Keith said:

                              > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome.  <

                               

                              Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.

                               

                              Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is made therefore for variance in terminology by developing a thesaurus, which allows people to search using their native term, and for back-end software to translate this into the 'preferred term' established by the taxonomy.

                               

                              Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.

                               

                              Regards,

                               

                              Bob




                            • John O'Gorman
                              Message 14 of 21 , Jan 6, 2010
                                I'd like to introduce one more abstract into the mix, followed by a concrete example as per Patrick's excellent suggestion. As Lisa mentioned, the mathematical subtleties of taxonomies and data models and such are of little interest outside groups like ours, but the truth is that this line of inquiry is predicated on a flat geometry. The digital universe - owing primarily to its binary origins - is comprised of only two dimensions. Manifestation of this singular truth is everywhere and in spite of some very clever attempts to mitigate the flatness of things, we still have folder structures, naming conventions, hierarchies and super- and sub-types. This is not to suggest that these inventions are not and have not been useful, but we need something more elegant to save ourselves from drowning in a sea of digits and bytes.
                                 
                                Take search: enter 'cricket' and get back two point seven million hits on the sport, the insect, the ethical construct (as in "not cricket") and Buddy Holly. Because humans live in a multi-faceted universe and computers in a flat one, reconciling the semantics (i.e. the gap between n-dimensions and two) is up to us. What is needed is a new 'geometry' of information that simultaneously incorporates more precision and recognizes the existing symmetry of information.
                                 
                                Concrete example: In programming, the stupid computer must be 'told' what a string is and how it is going to be used. So a given string may be a variable, a global variable, an object or a method depending on the context. To avoid 'collision', the same string may not be used in any way other than the one for which it has been declared. In the context of the 'cricket' search a similar approach may be taken, albeit with a twist. For every unique concept behind the string 'cricket' a unique identifier is declared. Now we have something like: 1234 - cricket - sport; 3456 - cricket - status; 4567 - cricket - insect; 6789 - cricket - member of Buddy Holly's band.
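
A small sketch of the declared-identifier idea, using the four 'cricket' identifiers from the example above. The `Concept` structure and the `senses` and `resolve` helpers are assumptions made for illustration, not part of any existing system.

```python
from typing import List, NamedTuple, Optional


class Concept(NamedTuple):
    concept_id: int
    label: str
    concept_class: str


# The identifiers and classes given in the example above.
CONCEPTS = [
    Concept(1234, "cricket", "sport"),
    Concept(3456, "cricket", "status"),
    Concept(4567, "cricket", "insect"),
    Concept(6789, "cricket", "member of Buddy Holly's band"),
]


def senses(label: str) -> List[Concept]:
    """Return every declared concept that shares the given string."""
    wanted = label.strip().lower()
    return [c for c in CONCEPTS if c.label == wanted]


def resolve(label: str, concept_class: str) -> Optional[Concept]:
    """Pick the single concept for a string once its class is known."""
    for concept in senses(label):
        if concept.concept_class == concept_class:
            return concept
    return None


if __name__ == "__main__":
    for concept in senses("Cricket"):
        print(concept.concept_id, concept.label, concept.concept_class)
    print(resolve("cricket", "insect"))
```

The string alone is ambiguous; the string plus its declared class resolves to exactly one identifier, which is what a search or tagging tool would store.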
                                 
                                As Bob correctly points out, individual data models, taxonomies and ontologies (DM-T-O) are by necessity fairly narrow in scope. That's typically why taxonomies tend to break and data models fail with the introduction of information classes from a wider scope. Wouldn't it be interesting, though, if in spite of these focused artifacts their individual members already had a declarative that uniquely identified not only what they represent but also what class they are in and how they can be connected to other patterns of use? In other words, have a new geometry built into the vocabulary values to encourage reuse at a very granular level.
                                 
                                I can expand on the 'patterns' concept in a separate post (like Lisa says, I risk being the only one interested) but for now, think of any formally constructed language and think of the universal patterns used to exchange information. There must be an agreement about the what and the how, and there must also be an understanding about the context and construction, and there is always semantics. A taxonomy (like a data model) becomes a new pattern in a given language using existing elements.
                                 
                                 
                                 
                                 
                                 

                              • Adrian Walker
                                Message 15 of 21 , Jan 6, 2010
                                  Hi Lisa,

                                  You wrote

                                  While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'.

                                  Yes, indeed.  Here's a quote from Ed Barkmeyer of NIST:

                                  'What makes written knowledge an "ontology" is that the language has a grammar and an interpretation of the grammatical constructs that is suitable for automated reasoning.  If most of the desired reasoning depends on your interpretations of constructs you introduced, that can't happen unless you build the engine.'

                                                             Cheers,  -- Adrian

                                  Internet Business Logic
                                  A Wiki and SOA Endpoint for Executable English over SQL and RDF
                                  Online at www.reengineeringllc.com    Shared use is free   No advertisements

                                  Adrian Walker
                                  Reengineering


                                  On Wed, Jan 6, 2010 at 11:19 AM, lisa colvin <lisacolvin@...> wrote:
                                   

                                  Thanks for the lively discussion. It's exciting to see these ideas coming together.

                                  While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'. Typically the more expressive the language, the more expensive it is computationally. So, you need to pick the representation language which best fits your needs. If you're not building a model to drive some sort of expert system or related capabilities,  a simpler knowledge representation scheme is probably better.

                                  However, one reason people use ontology languages in general is when there is a need for strong semantics which define the relationships/ context. Even if you don't want to build an expert/recommendation/QA/NL-based system, you can still use a more formal ontology language as just a pure specification language.

                                  So, is a faceted classification scheme an ontology? Some would say 'yes, if it uses an ontology language to express it'. Others might say it's not if you're not expressing/defining any inheritance relations. Overall, it probably doesn't matter what you call it as long as the semantics are rich enough to solve whatever problem you needed solving.

                                  There are fundamental differences to how the various disciplines approach information modeling. What I've found most helpful in working with people in another discipline is to be very explicit on how basic terms (like "term" :) , "class", "instance", "inference") are used in expressing the model that you're sharing. The idea of "inference", for example, can vary widely between an expert system developer and an OO developer. If these terms aren't described explicitly and used consistently, people get confused.

                                  I also found that defining the capabilities and mathematical relationship distinctions between "controlled vocabulary list", "synonym rings/synsets", taxonomy", "thesaurus", "ontology", "desciption logics",etc.  is really only interesting to taxonomists/ontologists and other curious people like us. :)

                                  :) Lisa


                                  On Tue, Jan 5, 2010 at 7:36 PM, Patrick Lambe <plambe@...> wrote:
                                   

                                  Well I was just sitting back and enjoying the conversation, Bob. But since you ask, I 'll start with a comment that Matt made early on, that there might be usability issues with reusing structures from data models in taxonomies, even though in principle such reuse makes sense.


                                  I think there's a tendency for us to get very entity focused in these discussions and definitions and stop there. There's a good reason for this. The common ground for data models, ontologies, taxonomies is their need to establish relatively stable entities at the very least; they each do slightly different different things around the language referring to those entities, and they diverge in the type and extent of work around establishing and defining relationships and maybe inference-generating capabilities (which some taxonomy forms can support as well as ontologies). But the entities are the core point of reference.

                                  But Matt's comment reminds us that it's important to remember that data models, taxonomies and ontologies are at the end of the day just instruments, and to understand the instrument is not just about understanding the entities it manipulates, but how the instrument is used, and for what purpose. 

                                  The design of a tool is driven by its functionality, not its components. DM-T-Os serve related purposes via different means and in different contexts. There are important differences in the amount of human vs machine processing expected or served. As Matt suggests, master data management is one way of getting a handle on how they can inter-operate. But fixing an entity and definition in one space (e.g. a data model) does not unquestionably qualify it for use in another space (e.g. a taxonomy).

                                  I think we also assume that usability is only really relevant at the taxonomy level. In my book I suggested that taxonomies are for humans and ontologies are for machines, which risks feeding that assumption. But at the end of the day, the rationale for using any of these instruments whether data models, taxonomies or ontologies, is that they must emerge into human use in some way. It's just that for DMs and Os machine processes provide different opportunities and constraints from human ones. If we can't see the pathway to human use (which is where some of the visionary talk on ontologies falls down, I feel) then they risk floating away into philosophical (or organisational) abstractions. I think this is where a lot of the hard wrestling work needs to be done, to resolve relationships between the instruments, preserve a common core where possible, and reflect the context-driven needs at organisational and user levels.

                                  This is all very abstract still... I think what would be useful would be some good clear cases where we can see the relationships in specific contexts.

                                  P

                                  Patrick Lambe

                                  website: www.straitsknowledge.com

                                  Have you seen our KM Method Cards or
                                  Organisation Culture Cards?  





                                  On Jan 6, 2010, at 7:30 AM, Bob Bater wrote:


                                  Heather, Gabriel, John, Keith & anyone else who's following this thread:

                                   

                                  I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.

                                   

                                  Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melvil Dewey, UDC and LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.

                                   

                                  John also says:

                                  > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <

                                  I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.

                                   

                                  Keith says that he finds taxonomies are regarded as:

                                  > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <

                                  Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important to them when they want to access and process information. That gives you a metadata application profile for each community which can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes, it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
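The "profile first, then decide how each attribute is populated" sequence described above could be sketched roughly as follows. This is only an illustrative Python sketch under stated assumptions: the names `ValueSource`, `MetadataElement` and `validate`, and the sample elements and values, are hypothetical and do not come from the thread.

```python
from dataclasses import dataclass, field
from enum import Enum


class ValueSource(Enum):
    """The three ways of populating an attribute named in the message."""
    AD_HOC = "ad hoc"        # free-text value, no control
    FLAT_LIST = "flat list"  # value drawn from a fixed, flat list
    TAXONOMY = "taxonomy"    # value drawn from a maintained hierarchy


@dataclass
class MetadataElement:
    name: str
    source: ValueSource
    allowed: set = field(default_factory=set)    # used by FLAT_LIST
    parents: dict = field(default_factory=dict)  # child -> parent, used by TAXONOMY

    def validate(self, value: str) -> bool:
        """Check a value against whatever controls this element."""
        if self.source is ValueSource.AD_HOC:
            return bool(value.strip())
        if self.source is ValueSource.FLAT_LIST:
            return value in self.allowed
        return value in self.parents  # taxonomy: must be a known node


# Hypothetical metadata profile for one community (assumed example data).
profile = [
    MetadataElement("title", ValueSource.AD_HOC),
    MetadataElement("status", ValueSource.FLAT_LIST, allowed={"draft", "final"}),
    MetadataElement(
        "subject",
        ValueSource.TAXONOMY,
        parents={"sport": None, "cricket": "sport", "tennis": "sport"},
    ),
]

record = {"title": "Match report", "status": "final", "subject": "cricket"}
errors = [e.name for e in profile if not e.validate(record.get(e.name, ""))]
```

In this sketch only the `subject` element needs a taxonomy at all, which mirrors the point that a taxonomy becomes relevant only when a metadata element requires one.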

                                   

                                  Gabriel said:

                                  > (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <

                                   

                                  My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.

                                   

                                  Keith said:

                                  > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome.  <

                                   

                                  Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.

                                   

                                  Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is made therefore for variance in terminology by developing a thesaurus, which allows people to search using their native term, and for back-end software to translate this into the 'preferred term' established by the taxonomy.
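The thesaurus step described above (search with a native term, have back-end software translate it into the preferred term) can be illustrated with a minimal Python sketch. The terms, the `THESAURUS` table and the function names are assumptions made up for illustration, not anything specified in the thread.

```python
# Hypothetical thesaurus: each preferred term (as fixed by the taxonomy)
# lists the variant terms people actually use when searching.
THESAURUS = {
    "automobile": ["car", "motor car", "auto"],
    "invoice": ["bill", "tax invoice"],
}


def build_lookup(thesaurus):
    """Map every variant, and the preferred term itself, to the preferred term."""
    lookup = {}
    for preferred, variants in thesaurus.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


LOOKUP = build_lookup(THESAURUS)


def to_preferred(term):
    """Return the preferred term for a user's search term, or None if unknown."""
    return LOOKUP.get(term.strip().lower())
```

A search front end could call `to_preferred()` on each query term before matching against indexed content, leaving unknown terms for full-text search.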

                                   

                                  Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.

                                   

                                  Regards,

                                   

                                  Bob





                                • cheriewagner@comcast.net
                                  Message 16 of 21 , Jan 6, 2010

                                    In reading this I wanted to express my appreciation for the time and knowledge that all of you on this list share…I’m a behind-the-scenes lurker, so by way of brief introduction I worked in the content management and taxonomy space for many years and I am working now in different areas. I know that I am quickly falling behind in what is a rapidly developing and ever-changing information modeling arena, so the following comment may seem obvious or archaic or just plain off!…but in reading this exchange it makes me think of fractals or fractal geometry and how it helps to predict the systematic chaos of nature. Perhaps one could apply the concepts around fractal geometry to information or information modeling? Or maybe it would just result in some very cool geometric shapes... :)

                                     

                                    http://en.wikipedia.org/wiki/Fractal

                                     

                                    Best,

                                    Cherie

                                     

                                     

                                    From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of John O'Gorman
                                    Sent: Wednesday, January 06, 2010 12:14 PM
                                    To: TaxoCoP@yahoogroups.com
                                    Subject: Re: [TaxoCoP] data modeling and taxonomy

                                     

                                     

                                    I'd like to introduce one more abstract into the mix, followed by a concrete example as per Patrick's excellent suggestion. As Lisa mentioned, the mathematical subtleties of taxonomies and data models and such are of little interest outside groups like ours, but the truth is that this line of inquiry is predicated on a flat geometry. The digital universe - owing primarily to its binary origins - is comprised of only two dimensions. Manifestation of this singular truth is everywhere and in spite of some very clever attempts to mitigate the flatness of things, we still have folder structures, naming conventions, hierarchies and super- and sub-types. This is not to suggest that these inventions are not and have not been useful, but we need something more elegant to save ourselves from drowning in a sea of digits and bytes.

                                     

                                    Take search...enter 'cricket' and get back two point seven million hits on the sport, the insect, the ethical construct (as in "not cricket") and Buddy Holly. Because humans live in a multi-faceted universe and computers in a flat one, reconciling the semantics (i.e. the gap between n-dimensions and two) is up to us. What is needed is a new 'geometry' of information that simultaneously incorporates more precision and recognizes the existing symmetry of information.

                                     

                                    Concrete example: In programming, the stupid computer must be 'told' what a string is and how it is going to be used. So a given string may be a variable, a global variable, an object or a method depending on the context. To avoid 'collision', the same string may not be used in any way other than the one for which it has been declared. In the context of the 'cricket' search a similar approach may be taken, albeit with a twist. For every unique concept behind the string 'cricket' a unique identifier is declared. Now we have something like: 1234 - cricket - sport; 3456 - cricket - status; 4567 - cricket - insect; 6789 - cricket - member of Buddy Holly's band.
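The identifier scheme in this example might look something like the following Python sketch. The ids and classes are the ones listed in the message; the data structure and the `resolve` function are hypothetical, added only to show how one string can map to several declared concepts.

```python
from collections import defaultdict

# (concept id, label, class) triples taken from the example in the message.
CONCEPTS = [
    (1234, "cricket", "sport"),
    (3456, "cricket", "status"),
    (4567, "cricket", "insect"),
    (6789, "cricket", "member of Buddy Holly's band"),
]


def index_by_label(concepts):
    """Group concept ids and classes under their shared, lower-cased label."""
    index = defaultdict(list)
    for concept_id, label, concept_class in concepts:
        index[label.lower()].append((concept_id, concept_class))
    return index


INDEX = index_by_label(CONCEPTS)


def resolve(label, concept_class=None):
    """Return candidate concept ids for a string, optionally narrowed by class."""
    candidates = INDEX.get(label.lower(), [])
    if concept_class is None:
        return [cid for cid, _ in candidates]
    return [cid for cid, cls in candidates if cls == concept_class]
```

Calling `resolve("cricket")` returns every declared sense, while passing a class narrows the result to the single intended concept, which is the disambiguation step the search example is asking for.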

                                     

                                    As Bob correctly points out, individual data models, taxonomies and ontologies (DM-T-O) are by necessity fairly narrow in scope. That's typically why taxonomies tend to break and data models fail with the introduction of information classes from a wider scope. Wouldn't it be interesting, though, if in spite of these focused artifacts their individual members already had a declarative that uniquely identified not only what they represent but also what class they are in and how they can be connected to other patterns of use? In other words, have a new geometry built into the vocabulary values to encourage reuse at a very granular level.

                                     

                                    I can expand on the 'patterns' concept in a separate post (like Lisa says, I risk being the only one interested) but for now, think of any formally constructed language and think of the universal patterns used to exchange information. There must be an agreement about the what and the how, and there must also be an understanding about the context and construction, and there is always semantics. A taxonomy (like a data model) becomes a new pattern in a given language using existing elements.

                                     

                                     

                                     

                                     

                                     

                                    -----Original Message-----
                                    From: lisa colvin [mailto:lisacolvin@gmail.com]
                                    Sent: Wednesday, January 6, 2010 09:19 AM
                                    To: TaxoCoP@yahoogroups.com
                                    Subject: Re: [TaxoCoP] data modeling and taxonomy

                                     

                                    Thanks for the lively discussion. It's exciting to see these ideas coming together.

                                    While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'. Typically the more expressive the language, the more expensive it is computationally. So, you need to pick the representation language which best fits your needs. If you're not building a model to drive some sort of expert system or related capabilities, a simpler knowledge representation scheme is probably better.

                                    However, one reason people use ontology languages in general is when there is a need for strong semantics which define the relationships/context. Even if you don't want to build an expert/recommendation/QA/NL-based system, you can still use a more formal ontology language as just a pure specification language.

                                    So, is a faceted classification scheme an ontology? Some would say 'yes, if it uses an ontology language to express it'. Others might say it's not if you're not expressing/defining any inheritance relations. Overall, it probably doesn't matter what you call it as long as the semantics are rich enough to solve whatever problem you needed solving.

                                    There are fundamental differences in how the various disciplines approach information modeling. What I've found most helpful in working with people in another discipline is to be very explicit about how basic terms (like "term" :) , "class", "instance", "inference") are used in expressing the model that you're sharing. The idea of "inference", for example, can vary widely between an expert system developer and an OO developer. If these terms aren't described explicitly and used consistently, people get confused.

                                    I also found that defining the capabilities and mathematical relationship distinctions between "controlled vocabulary list", "synonym rings/synsets", "taxonomy", "thesaurus", "ontology", "description logics", etc. is really only interesting to taxonomists/ontologists and other curious people like us. :)

                                    :) Lisa


                                  • John O'Gorman
                                    Message 17 of 21 , Jan 6, 2010
                                      Brilliant, Cherie - absolutely brilliant.
                                       
                                      -----Original Message-----
                                      From: cheriewagner@... [mailto:cheriewagner@...]
                                      Sent: Wednesday, January 6, 2010 12:04 PM
                                      To: TaxoCoP@yahoogroups.com
                                      Subject: RE: [TaxoCoP] data modeling and taxonomy

                                       

                                      In reading this I wanted to express my appreciation for the time and knowledge that all of you on this list share…I’m a behind-the-scenes lurker, so by way of brief introduction I worked in the content management and taxonomy space for many years and I am working now in different areas.  I know that I am quickly falling behind in what is a rapidly developing and ever-changing information modeling arena, so the following comment may seem obvious or archaic or just plain off!…but in reading this exchange it makes me think of fractals or fractal geometry and how it helps to predict the systematic chaos of nature.  Perhaps one could apply the concepts around fractal geometry to information or information modeling?  or maybe it would just result in some very cool geometric shapes... J

                                       

                                      http://en.wikipedia .org/wiki/ Fractal

                                      Best,

                                      Cherie

                                      From: TaxoCoP@yahoogroups .com [mailto:TaxoCoP@ yahoogroups. com] On Behalf Of John O'Gorman
                                      Sent: Wednesday, January 06, 2010 12:14 PM
                                      To: TaxoCoP@yahoogroups .com
                                      Subject: Re: [TaxoCoP] data modeling and taxonomy

                                       

                                      I'd like to introduce one more abstract into the mix, followed by a concrete example as per Patrick's excellent suggestion. As Lisa mentioned, the mathematical subtleties of taxonomies and data models and such are of little interest outside groups like ours, but the truth is that this line of inquiry is predicated on a flat geometry. The digital universe - owing primarily to its binary origins - is comprised of only two dimensions. Manifestation of this singular truth is everywhere and in spite of some very clever attempts to mitigate the flatness of things, we still have folder structures, naming conventions, hierarchies and super- and sub-types. This is not to suggest that these inventions are not and have not been useful, but we need something more elegant to save ourselves from drowning in a sea of digits and bytes.

                                       

                                      Take search...enter 'cricket' and get back two point seven million hits on the sport, the insect, the ethical construct (as in "not cricket") and Buddy Holly. Because humans live in a multi-faceted universe and computers in a flat one, reconciling the semantics (i.e. the gap between n-dimensions and two) is up to us. What is needed is a new 'geometry' of information that simutaneously incorporates more precision and recognizes the existing symmetry of information.

                                      Concrete example: In programming the stupid computer must be 'told' what a string is and how it is going to be used. So a given string may be a variable, a global variable, an object or a method depending on the context. To avoid 'collision', the same string may not be used in any way other than the one for which it has been declared. In the context of the 'cricket' search a similar approach may be taken, albeit with a twist. For every unique concept behind the string 'cricket' a unique identifier is declared. Now we have something like: 1234 - cricket - sport; 3456 - cricket - status;  4567 - cricket - insect; 6789 - cricket - member of Buddy Holly's band.

                                      As Bob correctly points out, individual data models, taxonomies and ontologies (DM-T-O) are by necessity fairly narrow in scope. That's typically why taxonomies tend to break and data models fail with the introduction of information classes from a wider scope. Wouldn't it be interesting, though if in spite of these focused artifacts their individual members already had a declarative that uniquely identified not only what they represent but also what class they are in and how they can be connected to other patterns of use? In other words, have a new geometry built in to the vocabulary values to encourage reuse at a very granular level.

                                      I can expand on the 'patterns' concept in a separate post (like Lisa says, I risk being the only one interested) but for now, think of any formally constructed language and think of the universal patterns used to exchange information. There must be an agreement about the what and the how, and there must also be an understanding about the context and construction, and there is always semantics. A taxonomy (like a data model) becomes a new pattern in a given language using existing elements.

                                      -----Original Message-----
                                      From: lisa colvin [mailto:lisacolvin@gmail.com]
                                      Sent: Wednesday, January 6, 2010 09:19 AM
                                      To: TaxoCoP@yahoogroups.com
                                      Subject: Re: [TaxoCoP] data modeling and taxonomy

                                       

                                      Thanks for the lively discussion. It's exciting to see these ideas coming together.

                                      While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'. Typically the more expressive the language, the more expensive it is computationally. So, you need to pick the representation language which best fits your needs. If you're not building a model to drive some sort of expert system or related capabilities,  a simpler knowledge representation scheme is probably better.

                                      However, one reason people use ontology languages in general is when there is a need for strong semantics which define the relationships/context. Even if you don't want to build an expert/recommendation/QA/NL-based system, you can still use a more formal ontology language as just a pure specification language.

                                      So, is a faceted classification scheme an ontology? Some would say 'yes, if it uses an ontology language to express it'. Others might say it's not if you're not expressing/defining any inheritance relations. Overall, it probably doesn't matter what you call it as long as the semantics are rich enough to solve whatever problem you needed solving.

                                      There are fundamental differences to how the various disciplines approach information modeling. What I've found most helpful in working with people in another discipline is to be very explicit on how basic terms (like "term" :) , "class", "instance", "inference") are used in expressing the model that you're sharing. The idea of "inference", for example, can vary widely between an expert system developer and an OO developer. If these terms aren't described explicitly and used consistently, people get confused.

                                      I also found that defining the capabilities and mathematical relationship distinctions between "controlled vocabulary list", "synonym rings/synsets", "taxonomy", "thesaurus", "ontology", "description logics", etc. is really only interesting to taxonomists/ontologists and other curious people like us. :)

                                      :) Lisa

                                      On Tue, Jan 5, 2010 at 7:36 PM, Patrick Lambe <plambe@straitsknowledge.com> wrote:

                                       

                                      Well I was just sitting back and enjoying the conversation, Bob. But since you ask, I'll start with a comment that Matt made early on, that there might be usability issues with reusing structures from data models in taxonomies, even though in principle such reuse makes sense.

                                      I think there's a tendency for us to get very entity-focused in these discussions and definitions and stop there. There's a good reason for this. The common ground for data models, ontologies, taxonomies is their need to establish relatively stable entities at the very least; they each do slightly different things around the language referring to those entities, and they diverge in the type and extent of work around establishing and defining relationships and maybe inference-generating capabilities (which some taxonomy forms can support as well as ontologies). But the entities are the core point of reference.

                                      But Matt's comment reminds us that it's important to remember that data models, taxonomies and ontologies are at the end of the day just instruments, and to understand the instrument is not just about understanding the entities it manipulates, but how the instrument is used, and for what purpose. 

                                      The design of a tool is driven by its functionality, not its components. DM-T-Os serve related purposes via different means and in different contexts. There are important differences in the amount of human vs machine processing expected or served. As Matt suggests, master data management is one way of getting a handle on how they can inter-operate. But fixing an entity and definition in one space (eg a data model) does not unquestionably qualify it for use in another space (eg a taxonomy).

                                      I think we also assume that usability is only really relevant at the taxonomy level. In my book I suggested that taxonomies are for humans and ontologies are for machines, which risks feeding that assumption. But at the end of the day, the rationale for using any of these instruments whether data models, taxonomies or ontologies, is that they must emerge into human use in some way. It's just that for DMs and Os machine processes provide different opportunities and constraints from human ones. If we can't see the pathway to human use (which is where some of the visionary talk on ontologies falls down, I feel) then they risk floating away into philosophical (or organisational) abstractions. I think this is where a lot of the hard wrestling work needs to be done, to resolve relationships between the instruments, preserve a common core where possible, and reflect the context-driven needs at organisational and user levels.

                                      This is all very abstract still... I think what would be useful would be some good clear cases where we can see the relationships in specific contexts.

                                      P

                                      Patrick Lambe

                                      weblog: www.greenchameleon.com

                                      website: www.straitsknowledge.com

                                      book: www.organisingknowledge.com

                                      Have you seen our KM Method Cards or

                                      Organisation Culture Cards?  

                                      http://www.straitsknowledge.com/store/

                                      On Jan 6, 2010, at 7:30 AM, Bob Bater wrote:



                                      Heather, Gabriel, John, Keith & anyone else who's following this thread:

                                      I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.

                                      Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melville Dewey, UDC, LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.

                                      John also says:

                                      > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <

                                      I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.

                                      Keith says that he finds taxonomies are regarded as:

                                      > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <

                                      Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important to them when they want to access and process information. That gives you a metadata application profile for each community which can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes, it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
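To make the "profile first, taxonomy only where needed" point concrete, here is a rough Python sketch of a metadata profile in which each element declares how it is populated. Everything in it is an assumption for illustration: the element names, the allowed values and the helpers `is_valid` and `needs_taxonomy` are invented, and the taxonomy is reduced to a simple term-to-broader-term mapping.

```python
from enum import Enum
from typing import List


class Source(Enum):
    AD_HOC = "ad hoc value"
    FLAT_LIST = "fixed, flat list"
    TAXONOMY = "maintained hierarchy of values"


# Hypothetical profile for one community: element -> (how populated, allowed values).
# For the taxonomy element the mapping is term -> broader term (None at the top).
PROFILE = {
    "title": (Source.AD_HOC, None),
    "status": (Source.FLAT_LIST, {"draft", "approved", "archived"}),
    "subject": (Source.TAXONOMY, {"finance": None, "budgeting": "finance", "audit": "finance"}),
}


def is_valid(element: str, value: str) -> bool:
    """Check a value against the population rule declared for its element."""
    source, allowed = PROFILE[element]
    if source is Source.AD_HOC:
        return bool(value.strip())  # any non-empty text is accepted
    return value in allowed         # works for both the flat set and the hierarchy


def needs_taxonomy(profile) -> List[str]:
    """List the elements for which a taxonomy actually has to be built."""
    return sorted(name for name, (source, _) in profile.items() if source is Source.TAXONOMY)
```

In this toy profile only `subject` calls for a taxonomy; `title` and `status` are served by free text and a flat list, which is the ordering of decisions described above.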

                                      Gabriel said:

                                      > (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <

                                      My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.

                                      Keith said:

                                      > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome.  <

                                      Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.

                                      Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is made therefore for variance in terminology by developing a thesaurus, which allows people to search using their native term, and for back-end software to translate this into the 'preferred term' established by the taxonomy.
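The thesaurus step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the terms are invented, and `THESAURUS`, `ENTRY_TO_PREFERRED` and `to_preferred` are hypothetical names, not part of any particular search product.

```python
# Hypothetical thesaurus: each preferred term (as fixed by the taxonomy)
# lists the variant terms people actually type. All terms are invented.
THESAURUS = {
    "motor vehicles": ["cars", "automobiles", "autos"],
    "personnel": ["staff", "employees", "workforce"],
}

# Invert it once so lookups go from any entry term to its preferred term.
ENTRY_TO_PREFERRED = {}
for preferred, variants in THESAURUS.items():
    ENTRY_TO_PREFERRED[preferred] = preferred
    for variant in variants:
        ENTRY_TO_PREFERRED[variant] = preferred


def to_preferred(query: str) -> str:
    """Translate a searcher's own term into the taxonomy's preferred term.

    Unknown terms are returned unchanged (lower-cased, whitespace collapsed)
    so the search can still run on them.
    """
    key = " ".join(query.lower().split())
    return ENTRY_TO_PREFERRED.get(key, key)
```

The back end would run the search on `to_preferred(user_input)`, so a user typing their native term is still matched against content indexed under the preferred term.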

                                      Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.

                                      Regards,

                                      Bob




                                       

                                    • Bob Bater
                                      Cherie, As John commented, a brilliant new dimension to our discussion. – particularly your concept of the ‘systematic chaos of nature’. Wow! I think we
                                      Message 18 of 21 , Jan 6, 2010

                                        Cherie,

                                         

                                        As John commented, a brilliant new dimension to our discussion, particularly your concept of the ‘systematic chaos of nature’. Wow! I think we need to consider that, but it does make an already complex issue even more complex!

                                         

                                        Regards,

                                         

                                        Bob

                                         

                                        From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of cheriewagner@...
                                        Sent: 06 January 2010 20:05
                                        To: TaxoCoP@yahoogroups.com
                                        Subject: RE: [TaxoCoP] data modeling and taxonomy

                                         

                                         

                                        In reading this I wanted to express my appreciation for the time and knowledge that all of you on this list share… I’m a behind-the-scenes lurker, so by way of brief introduction I worked in the content management and taxonomy space for many years and I am working now in different areas. I know that I am quickly falling behind in what is a rapidly developing and ever-changing information modeling arena, so the following comment may seem obvious or archaic or just plain off!… but in reading this exchange it makes me think of fractals or fractal geometry and how it helps to predict the systematic chaos of nature. Perhaps one could apply the concepts around fractal geometry to information or information modeling? Or maybe it would just result in some very cool geometric shapes... :)

                                         

                                        http://en.wikipedia.org/wiki/Fractal

                                         

                                        Best,

                                        Cherie

                                         

                                         

                                        From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of John O'Gorman
                                        Sent: Wednesday, January 06, 2010 12:14 PM
                                        To: TaxoCoP@yahoogroups.com
                                        Subject: Re: [TaxoCoP] data modeling and taxonomy

                                         

                                         


                                         

                                         




                                         

                                         

                                      • Seth Earley
                                        Message 19 of 21 , Jan 6, 2010

                                          I have not followed the entire thread – need to catch up.  But did see these last comments about fractals.  The following is a bit off topic and has absolutely no practical value for building taxonomies but I could not resist…   I really like fractals.  <smile>

                                           

                                          Here is a blog post I wrote last year about the nature of knowledge and taxonomies that ties into fractals.

                                           

                                          http://www.earley.com/blog/the-fractal-nature-of-knowledge

                                           

                                          Recorded knowledge is an extension of nature. (Intelligence is embedded in natural processes – nature is an excellent problem solver) It only makes sense that classifying that knowledge results in a similar structure. Data models are our way of enabling machines to derive connections in that chaotic sea of information. 

                                           

                                          In fact, there is a body of writing that discusses the role of tags and labels in allowing knowledge to emerge from chaos. 

                                           

                                          From the above blog post:

                                           

                                          In complexity, there is a sweet spot between chaos and control where value emerges. Too much chaos and nothing gets done. Too much control and there are no new solutions to problems. But what are necessary are mechanisms to encourage self-organization. Labels and classifications tell the organization what is important and allow people and teams to find and leverage knowledge that is created in one part of the organization and contribute to the overall goal or value creation. In the “Biology of Business” John Clippinger states that a manager’s job is to encourage knowledge flows. Knowledge flows are encouraged by use of tags that tell the organization what is important.

                                           

                                          Ontologies allow knowledge to emerge across domains of information. 

                                           

                                          Of course, biological systems have exploited the principle of self organization for eons. Life has evolved as order emerging from chaos and differentiates in the process of solving problems of competition and resource utilization.

                                           

                                          Economies are extensions of ecologies.  An economy solves problems of resource allocation, utilization, and competition for the best use of those resources.  So when we are trying to organize information for a business purpose, we’re really just operating on the fringe of some infinitesimally granular knowledge fractal.  It’s all part of the same process.  Makes sense that the principles are the same.

                                           

                                          I’ve been fascinated by this area for many years (must be the chemistry degree).  It gives me a sense of satisfaction that emergent intelligence is just the nature of things.  You can apply fractals to any and everything.  And labels are part of principles of self organization.  Thus the fundamental importance of the work that we do.

                                           

                                          (The following is completely off topic)

                                          When you consider value creation – value comes from knowledge flow – solutions applied to problems.  The financial crisis we just went through was a disruption of the value creation process.  People were taking value when none was created.  To quote Paul Volcker, former chairman of the Federal Reserve, financial engineering does not do anything for the economy. http://online.wsj.com/article/SB10001424052748704825504574586330960597134.html

                                           

                                          I would argue that the people on this list create more real value for organizations than the people who engage in financial engineering and get paid ridiculous sums for “moving the rents” as Volcker states.  

                                           

                                          (To the people who are less familiar with this forum, this is not a typical post – apologies for the tangent)

                                           

                                          May your new year be full of organized information and value creation.

                                           

                                          Seth

                                           

                                          Seth Earley

                                          President
                                          _____________________________

                                          EARLEY & ASSOCIATES, Inc.
                                          Cell: 781-820-8080

                                          Email: seth@...  
                                          Web: www.earley.com

                                           

                                          Follow me on twitter: sethearley

                                           

                                          Free four part Jumpstart Series

                                          On Digital Asset Management starts

                                          Thursday, January 14th, 2010 1 pm eastern

                                          http://www.earley.com/webinars/jumpstarts/digital-asset-management

                                           

                                          From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of Bob Bater
                                          Sent: Wednesday, January 06, 2010 7:36 PM
                                          To: TaxoCoP@yahoogroups.com
                                          Subject: RE: [TaxoCoP] data modeling and taxonomy

                                           

                                           

                                          Cherie,

                                           

                                          As John commented, a brilliant new dimension to our discussion, particularly your concept of the ‘systematic chaos of nature’. Wow! I think we need to consider that, but it does make an already complex issue even more complex!

                                           

                                          Regards,

                                           

                                          Bob

                                           

                                          From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of cheriewagner@...
                                          Sent: 06 January 2010 20:05
                                          To: TaxoCoP@yahoogroups.com
                                          Subject: RE: [TaxoCoP] data modeling and taxonomy

                                           

                                           

                                          In reading this I wanted to express my appreciation for the time and knowledge that all of you on this list share… I’m a behind-the-scenes lurker, so by way of brief introduction: I worked in the content management and taxonomy space for many years and I am working now in different areas.  I know that I am quickly falling behind in what is a rapidly developing and ever-changing information modeling arena, so the following comment may seem obvious or archaic or just plain off!… But in reading this exchange it makes me think of fractals or fractal geometry and how it helps to predict the systematic chaos of nature.  Perhaps one could apply the concepts around fractal geometry to information or information modeling?  Or maybe it would just result in some very cool geometric shapes... :)

                                           

                                          http://en.wikipedia.org/wiki/Fractal

                                           

                                          Best,

                                          Cherie

                                           

                                           

                                          From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of John O'Gorman
                                          Sent: Wednesday, January 06, 2010 12:14 PM
                                          To: TaxoCoP@yahoogroups.com
                                          Subject: Re: [TaxoCoP] data modeling and taxonomy

                                           

                                           

                                          I'd like to introduce one more abstract into the mix, followed by a concrete example as per Patrick's excellent suggestion. As Lisa mentioned, the mathematical subtleties of taxonomies and data models and such are of little interest outside groups like ours, but the truth is that this line of inquiry is predicated on a flat geometry. The digital universe - owing primarily to its binary origins - is comprised of only two dimensions. Manifestation of this singular truth is everywhere and in spite of some very clever attempts to mitigate the flatness of things, we still have folder structures, naming conventions, hierarchies and super- and sub-types. This is not to suggest that these inventions are not and have not been useful, but we need something more elegant to save ourselves from drowning in a sea of digits and bytes.

                                           

                                          Take search...enter 'cricket' and get back two point seven million hits on the sport, the insect, the ethical construct (as in "not cricket") and Buddy Holly. Because humans live in a multi-faceted universe and computers in a flat one, reconciling the semantics (i.e. the gap between n-dimensions and two) is up to us. What is needed is a new 'geometry' of information that simultaneously incorporates more precision and recognizes the existing symmetry of information.

                                           

                                          Concrete example: In programming the stupid computer must be 'told' what a string is and how it is going to be used. So a given string may be a variable, a global variable, an object or a method depending on the context. To avoid 'collision', the same string may not be used in any way other than the one for which it has been declared. In the context of the 'cricket' search a similar approach may be taken, albeit with a twist. For every unique concept behind the string 'cricket' a unique identifier is declared. Now we have something like: 1234 - cricket - sport; 3456 - cricket - status;  4567 - cricket - insect; 6789 - cricket - member of Buddy Holly's band.
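The 'cricket' identifiers above can be sketched roughly as follows. This is only an illustration, assuming each concept carries a unique identifier, a label and a class; the names `Concept`, `senses` and `resolve` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Concept:
    concept_id: int      # unique identifier declared for the concept
    label: str           # the string people actually type
    concept_class: str   # what kind of thing the label refers to


# The four senses of 'cricket' listed in the post above.
CONCEPTS = [
    Concept(1234, "cricket", "sport"),
    Concept(3456, "cricket", "status"),
    Concept(4567, "cricket", "insect"),
    Concept(6789, "cricket", "member of Buddy Holly's band"),
]


def senses(label: str) -> List[Concept]:
    """Return every concept that shares the given label."""
    return [c for c in CONCEPTS if c.label == label.lower()]


def resolve(label: str, concept_class: str) -> Optional[Concept]:
    """Pick the single concept for a label once its class is known."""
    for concept in senses(label):
        if concept.concept_class == concept_class:
            return concept
    return None
```

With this shape, a search on the bare string returns all four senses, while a search that also supplies the class resolves to one identifier, which is the disambiguation the post is asking for.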

                                           

                                          As Bob correctly points out, individual data models, taxonomies and ontologies (DM-T-O) are by necessity fairly narrow in scope. That's typically why taxonomies tend to break and data models fail with the introduction of information classes from a wider scope. Wouldn't it be interesting, though if in spite of these focused artifacts their individual members already had a declarative that uniquely identified not only what they represent but also what class they are in and how they can be connected to other patterns of use? In other words, have a new geometry built in to the vocabulary values to encourage reuse at a very granular level.

                                           

                                          I can expand on the 'patterns' concept in a separate post (like Lisa says, I risk being the only one interested) but for now, think of any formally constructed language and think of the universal patterns used to exchange information. There must be an agreement about the what and the how, and there must also be an understanding about the context and construction, and there is always semantics. A taxonomy (like a data model) becomes a new pattern in a given language using existing elements.

                                           

                                           

                                           

                                           

                                           

                                          -----Original Message-----
                                          From: lisa colvin [mailto:lisacolvin@...]
                                          Sent: Wednesday, January 6, 2010 09:19 AM
                                          To: TaxoCoP@yahoogroups.com
                                          Subject: Re: [TaxoCoP] data modeling and taxonomy

                                           

                                          Thanks for the lively discussion. It's exciting to see these ideas coming together.

                                          While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'. Typically the more expressive the language, the more expensive it is computationally. So, you need to pick the representation language which best fits your needs. If you're not building a model to drive some sort of expert system or related capabilities,  a simpler knowledge representation scheme is probably better.

                                          However, one reason people use ontology languages in general is when there is a need for strong semantics which define the relationships/ context. Even if you don't want to build an expert/recommendation/QA/NL-based system, you can still use a more formal ontology language as just a pure specification language.

                                          So, is a faceted classification scheme an ontology? Some would say 'yes, if it uses an ontology language to express it'. Others might say it's not if you're not expressing/defining any inheritance relations. Overall, it probably doesn't matter what you call it as long as the semantics are rich enough to solve whatever problem you needed solving.

                                          There are fundamental differences to how the various disciplines approach information modeling. What I've found most helpful in working with people in another discipline is to be very explicit on how basic terms (like "term" :) , "class", "instance", "inference") are used in expressing the model that you're sharing. The idea of "inference", for example, can vary widely between an expert system developer and an OO developer. If these terms aren't described explicitly and used consistently, people get confused.

                                          I also found that defining the capabilities and mathematical relationship distinctions between "controlled vocabulary list", "synonym rings/synsets", "taxonomy", "thesaurus", "ontology", "description logics", etc. is really only interesting to taxonomists/ontologists and other curious people like us. :)

                                          :) Lisa

                                          On Tue, Jan 5, 2010 at 7:36 PM, Patrick Lambe <plambe@...> wrote:

                                           

                                          Well I was just sitting back and enjoying the conversation, Bob. But since you ask, I'll start with a comment that Matt made early on, that there might be usability issues with reusing structures from data models in taxonomies, even though in principle such reuse makes sense.

                                           

                                          I think there's a tendency for us to get very entity focused in these discussions and definitions and stop there. There's a good reason for this. The common ground for data models, ontologies, taxonomies is their need to establish relatively stable entities at the very least; they each do slightly different things around the language referring to those entities, and they diverge in the type and extent of work around establishing and defining relationships and maybe inference-generating capabilities (which some taxonomy forms can support as well as ontologies). But the entities are the core point of reference.

                                           

                                          But Matt's comment reminds us that it's important to remember that data models, taxonomies and ontologies are at the end of the day just instruments, and to understand the instrument is not just about understanding the entities it manipulates, but how the instrument is used, and for what purpose. 

                                           

                                          The design of a tool is driven by its functionality, not its components. DM-T-Os serve related purposes via different means and in different contexts. There are important differences in the amount of human vs machine processing expected or served. As Matt suggests master data management is one way of getting a handle on how they can inter-operate. But fixing an entity and definition in one space (eg a data model) does not unquestionably qualify it for use in another space (eg a taxonomy).

                                           

                                          I think we also assume that usability is only really relevant at the taxonomy level. In my book I suggested that taxonomies are for humans and ontologies are for machines, which risks feeding that assumption. But at the end of the day, the rationale for using any of these instruments whether data models, taxonomies or ontologies, is that they must emerge into human use in some way. It's just that for DMs and Os machine processes provide different opportunities and constraints from human ones. If we can't see the pathway to human use (which is where some of the visionary talk on ontologies falls down, I feel) then they risk floating away into philosophical (or organisational) abstractions. I think this is where a lot of the hard wrestling work needs to be done, to resolve relationships between the instruments, preserve a common core where possible, and reflect the context-driven needs at organisational and user levels.

                                           

                                          This is all very abstract still... I think what would be useful would be some good clear cases where we can see the relationships in specific contexts.

                                           

                                          P

                                           

                                          Patrick Lambe

                                           

                                          weblog: www.greenchameleon.com

                                          website: www.straitsknowledge.com

                                          book: www.organisingknowledge.com

                                           

                                          Have you seen our KM Method Cards or

                                          Organisation Culture Cards?  

                                           

                                          http://www.straitsknowledge.com/store/

                                           

                                           

                                           

                                           

                                          On Jan 6, 2010, at 7:30 AM, Bob Bater wrote:





                                           

                                          Heather, Gabriel, John, Keith & anyone else who's following this thread:

                                           

                                          I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.

                                           

                                          Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melville Dewey, UDC, LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.

                                           

                                          John also says:

                                          > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <

                                          I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.

                                           

                                          Keith says that he finds taxonomies are regarded as:

                                          > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <

                                          Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important to them when they want to access and process information. That gives you a metadata application profile for each community which can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes, it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
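As a rough sketch of the metadata profile idea above: each element declares how its value is populated (ad hoc, from a flat list, or from a taxonomy), and a taxonomy only appears where an element requires one. The element names, values and helpers (`MetadataElement`, `PROFILE`, `validate`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataElement:
    name: str
    source: str                                   # "ad_hoc", "flat_list" or "taxonomy"
    flat_list: frozenset = frozenset()            # used when source == "flat_list"
    taxonomy: dict = field(default_factory=dict)  # term -> parent term (None for a top term)

    def accepts(self, value: str) -> bool:
        """Check a value against the way this element is populated."""
        if self.source == "ad_hoc":
            return bool(value.strip())            # any non-empty free text
        if self.source == "flat_list":
            return value in self.flat_list
        if self.source == "taxonomy":
            return value in self.taxonomy
        raise ValueError(f"unknown source: {self.source}")


# Hypothetical profile for one community of users.
PROFILE = [
    MetadataElement("title", "ad_hoc"),
    MetadataElement("status", "flat_list",
                    flat_list=frozenset({"draft", "approved", "archived"})),
    MetadataElement("subject", "taxonomy",
                    taxonomy={"sport": None, "cricket": "sport", "tennis": "sport"}),
]


def validate(record: dict) -> list:
    """Return the names of elements whose value is missing or not accepted."""
    return [e.name for e in PROFILE if not e.accepts(record.get(e.name, ""))]
```

Profiles for several communities could then be aggregated into a corporate profile by merging their element lists; that step is left out here.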

                                           

                                          Gabriel said:

                                          > (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <

                                           

                                          My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.

                                           

                                          Keith said:

                                          > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome.  <

                                           

                                          Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.

                                           

                                          Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is made therefore for variance in terminology by developing a thesaurus, which allows people to search using their native term, and for back-end software to translate this into the 'preferred term' established by the taxonomy.

                                           

                                          Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.

                                           

                                          Regards,

                                           

                                          Bob

                                           

                                           





                                           

                                           

                                        • laptopjockey
                                          Message 20 of 21 , Jan 7, 2010
                                            Hi Seth - disagree a tad about the 'off topic' designation...like the man said: "Just because we're wandering around doesn't mean we're lost."

                                            My primary justification for spending time with this group is to learn about the state of the discipline and maybe in the process contribute something to its future. Cherie's comment was brilliant because it represents a way of thinking about classification (by whichever method works) in a broader context, and by doing so perhaps develop a twenty-first century perspective on what I like to call 'predictive taxonomies'.

                                            Concrete being the order of the day, my job as an information integration architect relies on my ability to show clients - before I get a contract - that all their information (regardless of how it is represented) is composed of the same elements. Almost always their initial reaction is skeptical - very similar, I suppose, to that of an alchemist when told about the periodic table. The truth is that when you begin to think about it like fractals, all information MUST be composed of granular, self-described elements; otherwise we would not be able to communicate their meaning to a larger audience.

                                            I think the group is in agreement that individual taxonomies, data models and even ontologies are of necessity purpose-built artifacts designed to do a job, but in order to extend their usefulness they must exist in a context - be it fractal or otherwise - that is consistent and extensible to other DM - T - Os.

                                            My T.O.E. is not an attempt to classify everything, but like Cherie's fractal comment it is a reasonably sound proposition that the elements of information are the same everywhere - like the laws of Physics - and the apparent chaos we see around us is just nature's way of reminding us that 108 usable chemical elements can produce an infinite variety of combinations (to mix metaphors pretty badly).


                                            --- In TaxoCoP@yahoogroups.com, Seth Earley <seth@...> wrote:
                                            >
                                            > I have not followed the entire thread â€" need to catch up. But did see these last comments about fractals. The following is a bit off topic and has absolutely no practical value for building taxonomies but I could not resist… I really like fractals. <smile>
                                            >
                                            > Here is a blog post I wrote last year about the nature of knowledge and taxonomies that ties into fractals.
                                            >
                                            > http://www.earley.com/blog/the-fractal-nature-of-knowledge
                                            >
                                            > Recorded knowledge is an extension of nature. (Intelligence is embedded in natural processes â€" nature is an excellent problem solver) It only makes sense that classifying that knowledge results in a similar structure. Data models are our way of enabling machines to derive connections in that chaotic sea of information.
                                            >
                                            > In fact, there is a body of writing that discusses the role of tags and labels in allowing knowledge to emerge from chaos.
                                            >
                                            > From the above blog post:
                                            >
                                            > In complexity, there is a sweet spot between chaos and control where value emerges. Too much chaos and nothing gets done. Too much control and there are no new solutions to problems. But what are necessary are mechanisms to encourage self-organization. Labels and classifications tell the organization what is important and allow people and teams to find and leverage knowledge that is created in one part of the organization and contribute to the overall goal or value creation. In the “Biology of Business” John Clippinger states that a manager’s job is to encourage knowledge flows. Knowledge flows are encouraged by use of tags that tell the organization what is important.
                                            >
                                            > Ontologies allow knowledge to emerge across domains of information.
                                            >
                                            > Of course, biological systems have exploited the principle of self organization for eons. Life has evolved as order emerging from chaos and differentiates in the process of solving problems of competition and resource utilization.
                                            >
                                            > Economies are extensions of ecologies. An economy solves problems of resource allocation, utilization, and competition for the best use of those resources. So when we are trying to organize information for a business purpose, we’re really just operating on the fringe of some infinitesimally granular knowledge fractal. It’s all part of the same process. Makes sense that the principles are the same.
                                            >
                                            > I’ve been fascinated by this area for many years (must be the chemistry degree). It gives me a sense of satisfaction that emergent intelligence is just the nature of things. You can apply fractals to any and everything. And labels are part of principles of self organization. Thus the fundamental importance of the work that we do.
                                            >
                                            > (The following is completely off topic)
                                            > When you consider value creation - value comes from knowledge flow - solutions applied to problems. The financial crisis we just went through was a disruption of the value creation process. People were taking value when none was created. To quote Paul Volcker, former chairman of the Federal Reserve, financial engineering does not do anything for the economy. http://online.wsj.com/article/SB10001424052748704825504574586330960597134.html
                                            >
                                            > I would argue that the people on this list create more real value for organizations than the people who engage in financial engineering and get paid ridiculous sums for “moving the rents” as Volcker states.
                                            >
                                            > (To the people who are less familiar with this forum, this is not a typical post - apologies for the tangent)
                                            >
                                            > May your new year be full of organized information and value creation.
                                            >
                                            > Seth
                                            >
                                            > Seth Earley
                                            > President
                                            > _____________________________
                                            > EARLEY & ASSOCIATES, Inc.
                                            > Cell: 781-820-8080
                                            > Email: seth@...<mailto:seth@...>
                                            > Web: www.earley.com<http://www.earley.com>
                                            >
                                            > Follow me on twitter: sethearley
                                            >
                                            > Free four part Jumpstart Series
                                            > On Digital Asset Management starts
                                            > Thursday, January 14th, 2010 1 pm eastern
                                            > http://www.earley.com/webinars/jumpstarts/digital-asset-management
                                            >
                                            > From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of Bob Bater
                                            > Sent: Wednesday, January 06, 2010 7:36 PM
                                            > To: TaxoCoP@yahoogroups.com
                                            > Subject: RE: [TaxoCoP] data modeling and taxonomy
                                            >
                                            >
                                            > Cherie,
                                            >
                                            > As John commented, a brilliant new dimension to our discussion - particularly your concept of the ‘systematic chaos of nature’. Wow! I think we need to consider that, but it does make an already complex issue even more complex!
                                            >
                                            > Regards,
                                            >
                                            > Bob
                                            >
                                            > From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of cheriewagner@...
                                            > Sent: 06 January 2010 20:05
                                            > To: TaxoCoP@yahoogroups.com
                                            > Subject: RE: [TaxoCoP] data modeling and taxonomy
                                            >
                                            >
                                            > In reading this I wanted to express my appreciation for the time and knowledge that all of you on this list share…I’m a behind-the-scenes lurker, so by way of brief introduction I worked in the content management and taxonomy space for many years and I am working now in different areas. I know that I am quickly falling behind in what is a rapidly developing and ever-changing information modeling arena, so the following comment may seem obvious or archaic or just plain off!…but in reading this exchange it makes me think of fractals or fractal geometry and how it helps to predict the systematic chaos of nature. Perhaps one could apply the concepts around fractal geometry to information or information modeling? or maybe it would just result in some very cool geometric shapes... ☺
                                            >
                                            > http://en.wikipedia.org/wiki/Fractal
                                            >
                                            > Best,
                                            > Cherie
                                            >
                                            >
                                            > From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of John O'Gorman
                                            > Sent: Wednesday, January 06, 2010 12:14 PM
                                            > To: TaxoCoP@yahoogroups.com
                                            > Subject: Re: [TaxoCoP] data modeling and taxonomy
                                            >
                                            >
                                            > I'd like to introduce one more abstract into the mix, followed by a concrete example as per Patrick's excellent suggestion. As Lisa mentioned, the mathematical subtleties of taxonomies and data models and such are of little interest outside groups like ours, but the truth is that this line of inquiry is predicated on a flat geometry. The digital universe - owing primarily to its binary origins - is comprised of only two dimensions. Manifestation of this singular truth is everywhere and in spite of some very clever attempts to mitigate the flatness of things, we still have folder structures, naming conventions, hierarchies and super- and sub-types. This is not to suggest that these inventions are not and have not been useful, but we need something more elegant to save ourselves from drowning in a sea of digits and bytes.
                                            >
                                            > Take search...enter 'cricket' and get back two point seven million hits on the sport, the insect, the ethical construct (as in "not cricket") and Buddy Holly. Because humans live in a multi-faceted universe and computers in a flat one, reconciling the semantics (i.e. the gap between n-dimensions and two) is up to us. What is needed is a new 'geometry' of information that simultaneously incorporates more precision and recognizes the existing symmetry of information.
                                            >
                                            > Concrete example: In programming the stupid computer must be 'told' what a string is and how it is going to be used. So a given string may be a variable, a global variable, an object or a method depending on the context. To avoid 'collision', the same string may not be used in any way other than the one for which it has been declared. In the context of the 'cricket' search a similar approach may be taken, albeit with a twist. For every unique concept behind the string 'cricket' a unique identifier is declared. Now we have something like: 1234 - cricket - sport; 3456 - cricket - status; 4567 - cricket - insect; 6789 - cricket - member of Buddy Holly's band.
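
The identifier scheme in the post above can be sketched in a few lines of Python. This is only an illustration under assumptions: the four identifiers and senses are copied from the post, while the `Concept` record and the `senses_for` and `resolve` helpers are hypothetical names invented here, not part of any system mentioned in the thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Concept:
    """One uniquely identified meaning behind a shared string label."""
    concept_id: int
    label: str
    sense: str


# Identifiers and senses as listed in the post; the structure is an assumption.
CONCEPTS = [
    Concept(1234, "cricket", "sport"),
    Concept(3456, "cricket", "status"),
    Concept(4567, "cricket", "insect"),
    Concept(6789, "cricket", "member of Buddy Holly's band"),
]


def senses_for(label: str) -> List[Concept]:
    """Return every concept that shares the given string label."""
    wanted = label.strip().lower()
    return [c for c in CONCEPTS if c.label == wanted]


def resolve(label: str, sense: str) -> Optional[int]:
    """Return the single concept id for a label once its sense is known."""
    for concept in senses_for(label):
        if concept.sense == sense:
            return concept.concept_id
    return None
```

A bare search on the string returns all four candidates; only when a sense is supplied (by the user, or by context) does the lookup collapse to one identifier, which is the point the post makes about declaring what a string stands for.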
                                            >
                                            > As Bob correctly points out, individual data models, taxonomies and ontologies (DM-T-O) are by necessity fairly narrow in scope. That's typically why taxonomies tend to break and data models fail with the introduction of information classes from a wider scope. Wouldn't it be interesting, though if in spite of these focused artifacts their individual members already had a declarative that uniquely identified not only what they represent but also what class they are in and how they can be connected to other patterns of use? In other words, have a new geometry built in to the vocabulary values to encourage reuse at a very granular level.
                                            >
                                            > I can expand on the 'patterns' concept in a separate post (like Lisa says, I risk being the only one interested) but for now, think of any formally constructed language and think of the universal patterns used to exchange information. There must be an agreement about the what and the how, and there must also be an understanding about the context and construction, and there is always semantics. A taxonomy (as would a data model) become a new pattern in a given language using existing elements.
                                            >
                                            >
                                            >
                                            >
                                            >
                                            > -----Original Message-----
                                            > From: lisa colvin [mailto:lisacolvin@...]
                                            > Sent: Wednesday, January 6, 2010 09:19 AM
                                            > To: TaxoCoP@yahoogroups.com
                                            > Subject: Re: [TaxoCoP] data modeling and taxonomy
                                            >
                                            >
                                            > Thanks for the lively discussion. It's exciting to see these ideas coming together.
                                            >
                                            > While there are some accepted standards for ontology modeling practice (RDFS/OWL), there are multiple knowledge representation languages which can be used to express any 'ontology'. Typically the more expressive the language, the more expensive it is computationally. So, you need to pick the representation language which best fits your needs. If you're not building a model to drive some sort of expert system or related capabilities, a simpler knowledge representation scheme is probably better.
                                            >
                                            > However, one reason people use ontology languages in general is when there is a need for strong semantics which define the relationships/ context. Even if you don't want to build an expert/recommendation/QA/NL-based system, you can still use a more formal ontology language as just a pure specification language.
                                            >
                                            > So, is a faceted classification scheme an ontology? Some would say 'yes, if it uses an ontology language to express it'. Others might say it's not if you're not expressing/defining any inheritance relations. Overall, it probably doesn't matter what you call it as long as the semantics are rich enough to solve whatever problem you needed solving.
                                            >
                                            > There are fundamental differences to how the various disciplines approach information modeling. What I've found most helpful in working with people in another discipline is to be very explicit on how basic terms (like "term" :) , "class", "instance", "inference") are used in expressing the model that you're sharing. The idea of "inference", for example, can vary widely between an expert system developer and an OO developer. If these terms aren't described explicitly and used consistently, people get confused.
                                            >
                                            > I also found that defining the capabilities and mathematical relationship distinctions between "controlled vocabulary list", "synonym rings/synsets", "taxonomy", "thesaurus", "ontology", "description logics", etc. is really only interesting to taxonomists/ontologists and other curious people like us. :)
                                            >
                                            > :) Lisa
                                            > On Tue, Jan 5, 2010 at 7:36 PM, Patrick Lambe <plambe@...<mailto:plambe@...>> wrote:
                                            >
                                            >
                                            > Well I was just sitting back and enjoying the conversation, Bob. But since you ask, I'll start with a comment that Matt made early on, that there might be usability issues with reusing structures from data models in taxonomies, even though in principle such reuse makes sense.
                                            >
                                            > I think there's a tendency for us to get very entity focused in these discussions and definitions and stop there. There's a good reason for this. The common ground for data models, ontologies, taxonomies is their need to establish relatively stable entities at the very least; they each do slightly different things around the language referring to those entities, and they diverge in the type and extent of work around establishing and defining relationships and maybe inference-generating capabilities (which some taxonomy forms can support as well as ontologies). But the entities are the core point of reference.
                                            >
                                            > But Matt's comment reminds us that it's important to remember that data models, taxonomies and ontologies are at the end of the day just instruments, and to understand the instrument is not just about understanding the entities it manipulates, but how the instrument is used, and for what purpose.
                                            >
                                            > The design of a tool is driven by its functionality, not its components. DM-T-Os serve related purposes via different means and in different contexts. There are important differences in the amount of human vs machine processing expected or served. As Matt suggests master data management is one way of getting a handle on how they can inter-operate. But fixing an entity and definition in one space (eg a data model) does not unquestionably qualify it for use in another space (eg a taxonomy).
                                            >
                                            > I think we also assume that usability is only really relevant at the taxonomy level. In my book I suggested that taxonomies are for humans and ontologies are for machines, which risks feeding that assumption. But at the end of the day, the rationale for using any of these instruments whether data models, taxonomies or ontologies, is that they must emerge into human use in some way. It's just that for DMs and Os machine processes provide different opportunities and constraints from human ones. If we can't see the pathway to human use (which is where some of the visionary talk on ontologies falls down, I feel) then they risk floating away into philosophical (or organisational) abstractions. I think this is where a lot of the hard wrestling work needs to be done, to resolve relationships between the instruments, preserve a common core where possible, and reflect the context-driven needs at organisational and user levels.
                                            >
                                            > This is all very abstract still... I think what would be useful would be some good clear cases where we can see the relationships in specific contexts.
                                            >
                                            > P
                                            >
                                            > Patrick Lambe
                                            >
                                            > weblog: www.greenchameleon.com<http://www.greenchameleon.com/>
                                            > website: www.straitsknowledge.com<http://www.straitsknowledge.com/>
                                            > book: www.organisingknowledge.com<http://www.organisingknowledge.com/>
                                            >
                                            > Have you seen our KM Method Cards or
                                            > Organisation Culture Cards?
                                            >
                                            > http://www.straitsknowledge.com/store/
                                            >
                                            >
                                            >
                                            >
                                            > On Jan 6, 2010, at 7:30 AM, Bob Bater wrote:
                                            >
                                            >
                                            >
                                            >
                                            >
                                            > Heather, Gabriel, John, Keith & anyone else who's following this thread:
                                            >
                                            > I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.
                                            >
                                            > Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melville Dewey, UDC, LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.
                                            >
                                            > John also says:
                                            > > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <
                                            > I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.
                                            >
                                            > Keith says that he finds taxonomies are regarded as:
                                            > > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <
                                            > Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important to them when they want to access and process information. That gives you a metadata application profile for each community which can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes, it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
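
The three ways of populating a metadata element described above (an ad hoc value, a fixed flat list, or a maintained hierarchy) could be sketched roughly as follows. Everything named here is hypothetical: the `PROFILE` layout, the element names, the sample values and the `is_valid` helper are assumptions made for illustration, not a profile taken from the thread.

```python
# Hypothetical two-level taxonomy: broader term -> narrower terms.
TOPIC_TAXONOMY = {
    "Finance": ["Budgets", "Audits"],
    "HR": ["Recruitment"],
}

# Hypothetical metadata profile: each element declares how it is populated.
PROFILE = {
    "title": {"source": "ad_hoc"},
    "status": {"source": "flat_list", "values": ["draft", "approved"]},
    "topic": {"source": "taxonomy", "taxonomy": TOPIC_TAXONOMY},
}


def taxonomy_terms(tree):
    """Collect every term in a two-level hierarchy into one set."""
    terms = set(tree)
    for narrower in tree.values():
        terms.update(narrower)
    return terms


def is_valid(element, value):
    """Check a value against the population rule declared for its element."""
    rule = PROFILE[element]
    if rule["source"] == "ad_hoc":
        # Free text: any non-empty string is accepted.
        return isinstance(value, str) and value.strip() != ""
    if rule["source"] == "flat_list":
        return value in rule["values"]
    if rule["source"] == "taxonomy":
        return value in taxonomy_terms(rule["taxonomy"])
    raise ValueError("unknown source: " + rule["source"])
```

In this sketch only the `topic` element involves a taxonomy at all, which mirrors the argument that the profile comes first and a taxonomy is needed only where an element calls for one.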
                                            >
                                            > Gabriel said:
                                            > > (I said "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <
                                            >
                                            > My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.
                                            >
                                            > Keith said:
                                            > > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome. <
                                            >
                                            > Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.
                                            >
                                            > Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is made therefore for variance in terminology by developing a thesaurus, which allows people to search using their native term, and for back-end software to translate this into the 'preferred term' established by the taxonomy.
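
The thesaurus step described above, where a searcher's native term is translated into the taxonomy's preferred term, might look like this in outline. The terms and the `to_preferred` helper are invented for illustration; a real thesaurus would also carry related terms and scope notes, which this sketch leaves out.

```python
# Hypothetical entry terms (non-preferred and preferred) mapped to preferred terms.
ENTRY_TERMS = {
    "car": "Motor vehicles",
    "automobile": "Motor vehicles",
    "motor vehicles": "Motor vehicles",
    "lorry": "Trucks",
    "trucks": "Trucks",
}


def to_preferred(term):
    """Translate a user's own term into the preferred term, or None if unknown."""
    key = " ".join(term.lower().split())  # normalise case and whitespace
    return ENTRY_TERMS.get(key)


def search_terms(query_terms):
    """Map each query term to its preferred term, keeping unknown terms as typed."""
    return [to_preferred(t) or t for t in query_terms]
```

Unknown terms are passed through unchanged here; whether to reject them, log them as candidate entry terms, or search on them as free text is a design choice the post does not address.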
                                            >
                                            > Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.
                                            >
                                            > Regards,
                                            >
                                            > Bob
                                            >
                                          • Bob Bater
                                            Matt et al., Sorry for the long delay in responding, I’ve been tied-up with other things. I’m not intending to restart this thread, but simply to be polite
                                            Message 21 of 21 , Jan 16, 2010

                                              Matt et al.,

                                               

                                              Sorry for the long delay in responding, I’ve been tied-up with other things. I’m not intending to restart this thread, but simply to be polite in responding to Matt’s question below.

                                               

                                              I guess my use of the word ‘ontology’ is what Wikipedia describes under the article Ontology (Information Science):

                                               

                                              > In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. <

                                               

                                              In my case, the ‘domain’ is an activity system – usually an organization or part of an organization. At that point, I disagree with the Wikipedia entry where it says:

                                               

                                              > An ontology provides a shared vocabulary, which can be used to model a domain. <

                                               

                                              I don’t believe ontologies generate vocabularies directly. They provide a shared depiction of concepts and their relationships in an activity system, and it’s only in the next stage (the taxonomic system) that terms are agreed for describing those concepts. A third stage – the retrieval system – is where one recognizes that different communities use different terms for the same thing, and if people are to retrieve information using their own terms, then these must be reconciled through the conventional techniques of preferred and non-preferred terms and related terms exemplified by the thesaurus.

                                               

                                              In this sense, my ‘ontology’ is analogous to what Wikipedia describes as ‘Upper ontology (information science)’. I don’t use the ontology itself to make inferences, but to draw the broad outlines of my taxonomic system, which is where I start making inferences to construct the hierarchical and referential relationships.

                                               

                                              I don’t know if it will make my approach to and use of ontologies any clearer, but anyone interested can look at a presentation I gave to the NKOS workshop in Vienna in 2005: http://www2.db.dk/nkos2005/Bob Bater.pdf.  

                                               

                                              Best regards,

                                               

                                              Bob

                                               

                                              From: TaxoCoP@yahoogroups.com [mailto:TaxoCoP@yahoogroups.com] On Behalf Of Matt Moore
                                              Sent: 06 January 2010 01:01
                                              To: TaxoCoP@yahoogroups.com
                                              Subject: Re: [TaxoCoP] data modeling and taxonomy

                                               

                                               

                                              Bob,

                                              "My 'working hypothesis' in this respect does not include the need for ontologies to enable the making of inferences."

                                              I think you're using a particular version of the term "ontology" that might cause a little confusion. How does your ontology differ from a faceted classification structure? My understanding of ontologies is that they specify the "verbs" that link "nouns" as well as the nouns themselves (so they specify what a certain subclass of person can do to a certain subclass of document for example). What's "in" and what's "out" of your model?
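
One way to picture the "verbs that link nouns" reading of an ontology is as a set of subject-verb-object statements. The sketch below is an assumption-laden illustration: the class names, verbs and the `allowed_actions` helper are made up here and do not come from any model discussed in the thread.

```python
# Hypothetical statements: (subclass of person, verb, subclass of document).
STATEMENTS = {
    ("Author", "drafts", "Proposal"),
    ("Editor", "approves", "Proposal"),
    ("Editor", "archives", "Report"),
}


def allowed_actions(person_class, document_class):
    """List the verbs the model permits between the two classes, sorted."""
    return sorted(
        verb
        for subject, verb, obj in STATEMENTS
        if subject == person_class and obj == document_class
    )
```

A faceted classification would, by contrast, record only the nouns (the person classes and document classes) without the verbs between them, which is the distinction the question above is probing.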

                                              Cheers,

                                              Matt

                                               


                                              From: Bob Bater <bbater@...>
                                              To: TaxoCoP@yahoogroups.com
                                              Sent: Wed, January 6, 2010 10:30:02 AM
                                              Subject: RE: [TaxoCoP] data modeling and taxonomy

                                               

                                              Heather, Gabriel, John, Keith & anyone else who's following this thread:

                                               

                                              I'm still feeling my way around these kinds of issues (have been for years), and have no hard-and-fast solutions. However, I do have some 'working hypotheses' which I find to be helpful. I'll refer to them as I respond to a few points made by John, Keith and Gabriel.

                                               

                                              Firstly, John is quite right in pointing out that both data models and taxonomies are necessarily bounded. Who'd want to undertake a data model or a taxonomy of *everything*? Well, I suppose Melville Dewey, UDC, LCC have all attempted it, with varying degrees of success. But that's a topic for another day. In an organizational context, both data models and taxonomies need to be restricted to a specific domain, if only for practical reasons.

                                               

                                              John also says:

                                              > For example, if all of the 'entities' that a data modeller wanted to use were already classified by a taxonomist and resided in a master data management inventory, then a sort of symbiotic relationship could exist between the necessarily narrow application of the data and the universal 'connectivity' of a fully faceted business vocabulary. <

                                              I see this as the role of the 'over-arching ontology which expresses the context of both data model and taxonomy', to quote my own post. The ontology, developed first, ensures that both data modeller and taxonomist are singing from the same hymn sheet. That will also prove of great benefit to data warehouse developers, document managers, records managers and information architects, further down the line.

                                               

                                              Keith says that he finds taxonomies are regarded as:

                                              > "THE solution" rather than being viewed as "A solution" or part of a larger system of models and decision-making depending on the nature of the enterprise <

                                              Taxonomies have been over-egged. Many in the field think 'taxonomy' first and context later. IMHO, that's bad! Build the ontology first, then do your data modelling. Then you'll have done a PoC (Proof of Concept) for the domain - identifying the entities which are important, their important attributes (for the data modellers) and a first lead-in to the language people use to refer to them (for the taxonomists). Using both the ontology and the data model, define the key attributes which different communities regard as important when they want to access and process information. That gives you a metadata application profile for each community, and these can be aggregated into a corporate metadata profile. Only then do you look at each attribute in each profile and decide how it is to be populated. Sometimes it will be an /ad hoc/ value; sometimes the value will be drawn from a fixed, flat list; sometimes the value will be drawn from an organized, maintained hierarchy of values - a taxonomy. For me, the metadata profile comes first. A taxonomy only becomes relevant if a metadata element requires it.
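To make the 'profile first, taxonomy only where needed' idea concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the element names, the two community profiles, the document-type hierarchy, and the rule that the first definition of an element wins when profiles are aggregated. It is one way to read the approach, not a prescribed implementation.

```python
from dataclasses import dataclass
from enum import Enum


class ValueSource(Enum):
    """How a metadata element gets its value (the three cases named above)."""
    AD_HOC = "ad hoc"        # free text, no controlled values
    FLAT_LIST = "flat list"  # fixed, flat list of allowed values
    TAXONOMY = "taxonomy"    # organized, maintained hierarchy of values


@dataclass(frozen=True)
class MetadataElement:
    name: str
    source: ValueSource
    allowed: frozenset = frozenset()  # only used for FLAT_LIST and TAXONOMY

    def accepts(self, value: str) -> bool:
        # Ad hoc elements accept any non-blank text; the others check the list.
        if self.source is ValueSource.AD_HOC:
            return bool(value.strip())
        return value in self.allowed


def flatten(hierarchy: dict) -> frozenset:
    """Collect every term from a nested {term: {narrower terms}} hierarchy."""
    terms = set()
    for term, narrower in hierarchy.items():
        terms.add(term)
        terms |= flatten(narrower)
    return frozenset(terms)


def aggregate(*profiles: dict) -> dict:
    """Merge community profiles into a corporate profile.

    Assumption: if two communities define the same element name,
    the first definition encountered is kept.
    """
    corporate = {}
    for profile in profiles:
        for name, element in profile.items():
            corporate.setdefault(name, element)
    return corporate


# Hypothetical taxonomy, needed only because one element requires it.
DOCUMENT_TYPES = {
    "report": {"audit report": {}, "progress report": {}},
    "proposal": {},
}

title = MetadataElement("title", ValueSource.AD_HOC)
status = MetadataElement("status", ValueSource.FLAT_LIST,
                         frozenset({"draft", "final"}))
doc_type = MetadataElement("document type", ValueSource.TAXONOMY,
                           flatten(DOCUMENT_TYPES))

# Two hypothetical community profiles and their corporate aggregate.
records_profile = {e.name: e for e in (title, status, doc_type)}
sales_profile = {e.name: e for e in (title, doc_type)}
corporate_profile = aggregate(records_profile, sales_profile)
```

The point of the sketch is the ordering: the taxonomy (`DOCUMENT_TYPES`) exists only because the 'document type' element calls for one, while 'title' and 'status' get by without it.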

                                               

                                              Gabriel said:

                                              > (I said  "ontology / taxonomy" in the above because I'm not clear myself whether our CM does satisfy a full definition of "ontology"; for example as yet we have no mechanisms for making inferences). <

                                               

                                              My 'working hypothesis' in this respect does not require ontologies to support inference. That is a requirement of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide the context for ensuring that information and knowledge management structures and systems are coherent and interoperable.

                                               

                                              Keith said:

                                              > Getting at just where taxonomy, data modeling, and ontology specification begin, end, and overlap is really welcome.  <

                                               

                                              Again, my 'working hypothesis' is that ontologies come first, specifying the entities involved in an activity system, and their relationships. Data modellers will want to define the attributes of each entity and to characterize their relationships more rigorously, to enable their capture in the highly structured world of the DBMS, focused on logical consistency.

                                               

                                              Information managers, on the other hand, are less data-focused and more user-focused, concerned with linking entities and their key attributes to the concepts - and the terms which represent those concepts - employed by workers. So - where appropriate - they build a taxonomy proposing terms to be used for those concepts, reflecting the taxonomic relationships inherent in any domain - generic, partitive, instantial. While the taxonomy can establish the entities (concepts) involved, and their relationships, it cannot dictate the terms which people use to refer to those concepts. Provision is therefore made for variance in terminology by developing a thesaurus, which lets people search using their native term and lets back-end software translate that term into the 'preferred term' established by the taxonomy.
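The thesaurus lookup described above can be sketched in a few lines of Python. The vocabulary here is made up for illustration (the terms, the `BROADER` and `ENTRY_TERMS` tables and the function names are all assumptions), and only the generic (broader/narrower) relationship is modelled, not the partitive or instantial ones.

```python
# Taxonomy: each preferred term points to its broader term (None at the top).
BROADER = {
    "motor vehicle": None,
    "car": "motor vehicle",
    "lorry": "motor vehicle",
}

# Thesaurus: users' native (entry) terms mapped to the preferred term.
ENTRY_TERMS = {
    "automobile": "car",
    "auto": "car",
    "truck": "lorry",
}


def to_preferred(term):
    """Translate a user's term into the taxonomy's preferred term, or None."""
    key = term.strip().lower()
    if key in BROADER:            # already a preferred term
        return key
    return ENTRY_TERMS.get(key)   # known variant, or None if unrecognized


def narrower_terms(preferred):
    """Return all terms below a preferred term in the hierarchy."""
    found = []
    for term, parent in BROADER.items():
        if parent == preferred:
            found.append(term)
            found.extend(narrower_terms(term))
    return found


def expand_query(term):
    """Back-end step: search on the preferred term plus its narrower terms."""
    preferred = to_preferred(term)
    if preferred is None:
        return [term]             # unknown term: fall back to searching it as typed
    return [preferred] + narrower_terms(preferred)
```

With this vocabulary, a user who types 'Automobile' is searched under 'car', and a search on 'motor vehicle' also picks up items indexed under 'car' and 'lorry'. The user never needs to know which term the taxonomy prefers.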

                                               

                                              Hope that stimulates some thoughts. Meanwhile, where's Patrick Lambe in this thread? Patrick, I'm sure you have some informative views on these issues. Please join us.

                                               

                                              Regards,

                                               

                                              Bob

                                               
