Heather, Gabriel, John, Keith and anyone else who's following,
I'm still feeling my way around these kinds of issues (have been
for years), and have no hard-and-fast solutions. However, I do have some
'working hypotheses' which I find to be helpful. I'll refer to them as I
respond to a few points made by John, Keith and Gabriel.
Firstly, John is quite right in pointing out that both data
models and taxonomies are necessarily bounded. Who'd want to undertake a data
model or a taxonomy of *everything*? Well, I suppose Melvil Dewey, UDC and LCC
have all attempted it, with varying degrees of success. But that's a topic for
another day. In an organizational context, both data models and taxonomies need
to be restricted to a specific domain, if only for practical reasons.
John also says:
> For example, if all of the 'entities' that a data modeller
wanted to use were already classified by a taxonomist and resided in a master
data management inventory, then a sort of symbiotic relationship could exist
between the necessarily narrow application of the data and the universal
'connectivity' of a fully faceted business vocabulary. <
I see this as the role of the 'over-arching ontology which
expresses the context of both data model and taxonomy', to quote my own post.
The ontology, developed first, ensures that both data modeller and taxonomist
are singing from the same hymn sheet. That will also prove of great benefit to
data warehouse developers, document managers, records managers and information
architects, further down the line.
Keith says that he finds taxonomies are regarded as:
> "THE solution" rather than being viewed as
"A solution" or part of a larger system of models and decision-making
depending on the nature of the enterprise <
Taxonomies have been over-egged. Many in the field think 'taxonomy'
first and context later. IMHO bad! Build the ontology first, then do your data
modelling. Then you'll have done a PoC (Proof of Concept) for the domain -
identifying the entities which are important, their important attributes (for
the data modellers) and a first lead-in to the language people use to refer to
them (for the taxonomists). Using both the ontology and the data model, define
the key attributes which different communities regard as important to them when
they want to access and process information. That gives you a metadata
application profile for each community which can be aggregated into a corporate
metadata profile. Only then do you look at each attribute in each profile and
decide how it is to be populated. Sometimes, it will be an /ad hoc/ value;
sometimes the value will be drawn from a fixed, flat list; sometimes the value
will be drawn from an organized, maintained hierarchy of values - a taxonomy.
For me, the metadata profile comes first. A taxonomy only becomes relevant if a
metadata element requires it.
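To make the 'profile first, taxonomy only where needed' idea concrete, here is
a minimal Python sketch. It is an illustration of my working hypothesis, not a
recommended implementation: the community names, the element names and the
helpers corporate_profile and needs_taxonomy are all invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """How a metadata element gets its value."""
    AD_HOC = "ad hoc"        # free value, entered as needed
    FLAT_LIST = "flat list"  # fixed, flat list of permitted values
    TAXONOMY = "taxonomy"    # organized, maintained hierarchy of values


@dataclass(frozen=True)
class Element:
    name: str
    source: Source


# Hypothetical application profiles, one per community (names are assumptions).
profiles = {
    "records_managers": [
        Element("title", Source.AD_HOC),
        Element("retention_class", Source.FLAT_LIST),
        Element("subject", Source.TAXONOMY),
    ],
    "engineers": [
        Element("title", Source.AD_HOC),
        Element("asset_type", Source.TAXONOMY),
    ],
}


def corporate_profile(community_profiles):
    """Aggregate community profiles into one, keeping first-seen order."""
    merged = {}
    for elements in community_profiles.values():
        for element in elements:
            merged.setdefault(element.name, element)
    return list(merged.values())


def needs_taxonomy(profile):
    """Names of the elements that actually call for a taxonomy."""
    return [e.name for e in profile if e.source is Source.TAXONOMY]


corporate = corporate_profile(profiles)
print([e.name for e in corporate])
print(needs_taxonomy(corporate))
```

The point of the sketch is the order of work: the profiles are defined first,
and only the elements marked TAXONOMY generate any taxonomy-building effort.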
> (I said "ontology / taxonomy" in the above
because I'm not clear myself whether our CM does satisfy a full definition of
"ontology"; for example as yet we have no mechanisms for making
My 'working hypothesis' in this respect does not include the
need for ontologies to enable the making of inferences. That is a requirement
of strict 'ontologies' in the Semantic Web sense. For me, ontologies provide
the context for ensuring that information and knowledge management structures
and systems are coherent and interoperable.
> Getting at just where taxonomy, data modeling, and ontology
specification begin, end, and overlap is really welcome. <
Again, my 'working hypothesis' is that ontologies come first,
specifying the entities involved in an activity system, and their
relationships. Data modellers will want to define the attributes of each entity
and to characterize their relationships more rigorously, to enable their
capture in the highly structured world of the DBMS, focused on logical
Information managers, on the other hand, are less data-focused
and more user-focused, concerned with linking entities and their key attributes
to the concepts - and the terms which represent those concepts - employed by
workers. So - where appropriate - they build a taxonomy proposing terms to be
used for those concepts, reflecting the taxonomic relationships inherent in any
domain - generic, partitive, instantial. While the taxonomy can establish the
entities (concepts) involved, and their relationships, it cannot dictate the
terms which people use to refer to those concepts. Provision is therefore made
for variance in terminology by developing a thesaurus, which lets people
search using their native term and lets back-end software translate that term
into the 'preferred term' established by the taxonomy.
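The thesaurus lookup described above can be sketched in a few lines of Python.
This is only an illustration under my own assumptions: the terms, the THESAURUS
and PREFERRED tables and the to_preferred function are made up for the example,
and a real system would draw them from the maintained taxonomy.

```python
# Hypothetical preferred terms, as established by the taxonomy.
PREFERRED = {"truck", "car"}

# Hypothetical thesaurus entries: variant (native) term -> preferred term.
THESAURUS = {
    "lorry": "truck",
    "hgv": "truck",
    "automobile": "car",
}


def to_preferred(term):
    """Translate a searcher's native term into the preferred term.

    Returns None when the term is neither a preferred term nor a known variant.
    """
    key = term.strip().lower()
    if key in PREFERRED:
        return key
    return THESAURUS.get(key)


print(to_preferred("Lorry"))
print(to_preferred("bicycle"))
```

The searcher never needs to know the preferred term; the mapping is applied
behind the scenes, and unknown terms are flagged rather than guessed at.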
Hope that stimulates some thoughts. Meanwhile, where's Patrick
Lambe in this thread? Patrick, I'm sure you have some informative views on
these issues. Please join us.