Ontology

In recent years the word ontology has, besides its philosophical meaning, come to mean a kind of knowledge organization system (KOS). An ontology has been defined as “a specification of a representational vocabulary for a shared domain of discourse - definitions of classes, relations, functions, and other objects” (Gruber, 1993a).

 

The word "ontology" has a long history in philosophy, in which it refers to the subject of existence. In the context of knowledge sharing, Gruber (1993a) uses the term ontology to mean a specification of a conceptualization. That is, an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as set-of-concept-definitions, but more general.

 

Jermey & Browne (2004, glossary): "Specification of a conceptualisation of a knowledge domain. An ontology is a controlled vocabulary that describes objects and the relations between them in a formal way, and has a grammar for using the vocabulary terms to express something meaningful within a specified domain of interest. The vocabulary is used to make queries and assertions. Ontological commitments are agreements to use the vocabulary in a consistent way for knowledge sharing."

 

"Ontologies resemble faceted taxonomies but use richer semantic relationships among terms and attributes, as well as strict rules about how to specify terms and relationships. Because ontologies do more than just control a vocabulary, they are thought of as knowledge representation. The oft-quoted definition of ontology is 'the specification of one's conceptualization of a knowledge domain.'" (Lombardi, 2004).

 

 

“… Information systems as simple as catalogs, in which each product type has a unique code (e.g. the item number), have been dubbed ‘ontologies’. A catalog is, in a sense, the ontology of the things a company sells. A slightly more complex information system may provide simple natural language texts and allow string matching. Glossaries are information systems that provide natural language descriptions of terms, thus imposing some structure on the text (indexing by terms). Thesauri are standardized information systems that provide, in addition to descriptions of terms, also relations to other more general or more specific terms within a common hierarchy. The fields of knowledge representation, database development, and object-oriented software engineering all employ ontologies conceived as taxonomies in which properties of more general classes are inherited by the more specific ones. Frame-based systems provide, in addition to taxonomic structure, relations between objects and restrictions on what and how classes of objects can be related to each other. Finally, the most expressive and complex information system ontologies use the axioms of full first order, higher order, or modal logic. All these types of information systems satisfy Gruber’s definition, and all are now common bedfellows under the rubric of ‘ontology.’” (Smith & Welty, 2001).

 

"Uschold [17] writes that an ontology:


. . . is often conceived as a set of concepts (e.g. entities, attributes, processes), their definitions and their inter-relationships


. . . An ontology may take a variety of forms, but will necessarily include a vocabulary of terms and some specification of their meaning (i.e. definitions). It may be:

The term ontology has been used in information studies by, among others, Ding (2001), Soergel (1999), Soergel et al. (2004), Svenonius (2000), Vickery (1997) and Wang & Wang (1995).

 

Soergel et al. (2004) provide information on how to reengineer thesauri into rich ontologies.

 

Table 1: Statements and rules of a hypothetical ontology versus the information given in the ERIC Thesaurus (BT = broader term, RT = related term)

(from Soergel et al., 2004)

ERIC Thesaurus                Hypothetical ontology

                              Statements:

reading instruction           reading instruction
  BT instruction                <isa> instruction
  RT reading                    <hasDomain> reading
  RT learning standards         <governedBy> learning standards

reading ability               reading ability
  BT ability                    <isa> ability
  RT reading                    <hasDomain> reading
  RT perception                 <supportedBy> perception

 

Rule 1

Instruction in a domain should consider ability in that domain:

X <shouldConsider> Y
IF X <isa (type of)> instruction AND X <hasDomain> W
AND Y <isa> ability AND Y <hasDomain> W

yields: The designer of reading instruction should also consider reading ability.

Rule 2

X <shouldConsider> Z
IF X <shouldConsider> Y
AND Y <supportedBy> Z

yields: The designer of reading instruction should also consider perception.
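
The statements and the two rules of Table 1 can be sketched as a small forward-chaining program. This is only an illustration under stated assumptions: the triple representation and the function names (objects, apply_rule_1, apply_rule_2, infer) are hypothetical choices made here, not part of Soergel et al. (2004).

```python
# Statements from the hypothetical ontology in Table 1, stored as
# (subject, relation, object) triples. The representation is an assumption
# of this sketch.
FACTS = {
    ("reading instruction", "isa", "instruction"),
    ("reading instruction", "hasDomain", "reading"),
    ("reading instruction", "governedBy", "learning standards"),
    ("reading ability", "isa", "ability"),
    ("reading ability", "hasDomain", "reading"),
    ("reading ability", "supportedBy", "perception"),
}


def objects(facts, subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return {o for s, r, o in facts if s == subject and r == relation}


def apply_rule_1(facts):
    """Rule 1: instruction in a domain should consider ability in that domain."""
    subjects = {s for s, _, _ in facts}
    derived = set()
    for x in subjects:
        if "instruction" not in objects(facts, x, "isa"):
            continue
        for y in subjects:
            if "ability" not in objects(facts, y, "isa"):
                continue
            # X and Y must share at least one domain W.
            if objects(facts, x, "hasDomain") & objects(facts, y, "hasDomain"):
                derived.add((x, "shouldConsider", y))
    return derived


def apply_rule_2(facts):
    """Rule 2: if X shouldConsider Y and Y is supportedBy Z, X shouldConsider Z."""
    derived = set()
    for x, relation, y in facts:
        if relation == "shouldConsider":
            for z in objects(facts, y, "supportedBy"):
                derived.add((x, "shouldConsider", z))
    return derived


def infer(facts):
    """Apply both rules repeatedly until no new statement is derived."""
    facts = set(facts)
    while True:
        new = (apply_rule_1(facts) | apply_rule_2(facts)) - facts
        if not new:
            return facts
        facts |= new


inferred = infer(FACTS) - FACTS
for statement in sorted(inferred):
    print(statement)
```

Run on the six statements above, the sketch derives the two conclusions given in the table: reading instruction should consider reading ability (Rule 1) and, through it, perception (Rule 2). The BT/RT entries of the thesaurus column offer no comparable handle, because the rules depend on knowing which relation holds.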

 

 

Table 2: AGROVOC relationships compared with more differentiated relationships of a hypothetical ontology (NT = narrower term, BT = broader term)

(from Soergel et al., 2004)

AGROVOC                                   Hypothetical ontology

Undifferentiated hierarchical             Differentiated relationships
relationships in AGROVOC                  in an ontology

milk                                      milk
  NT cow milk                               <includesSpecific> cow milk
  NT milk fat                               <containsSubstance> milk fat

cow                                       cow
  NT cow milk                               <hasComponent> cow milk*

Cheddar cheese                            Cheddar cheese
  BT cow milk                               <madeFrom> cow milk

 

Rule 1

Part X <mayContainSubstance> Substance Y
IF Animal W <hasComponent> Part X
AND Animal W <ingests> Substance Y

Rule 2

Food Z <containsSubstance> Substance Y
IF Food Z <madeFrom> Part X
AND Part X <containsSubstance> Substance Y
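
The two rules of Table 2 can be sketched in the same spirit. Two things in the sketch are assumptions and do not come from the table: the fact that the cow ingests a "feed additive" (a hypothetical substance added so that Rule 1 has something to match), and an inheritance step that lets cow milk take over <containsSubstance> from the more general concept milk. Without that extra step, Rule 2 would find nothing to apply to Cheddar cheese, since the table states the substance only for milk.

```python
# Differentiated relationships from Table 2 as (subject, relation, object) triples.
FACTS = {
    ("milk", "includesSpecific", "cow milk"),
    ("milk", "containsSubstance", "milk fat"),
    ("cow", "hasComponent", "cow milk"),
    ("Cheddar cheese", "madeFrom", "cow milk"),
    # Hypothetical fact, not in Table 2, so that Rule 1 can fire.
    ("cow", "ingests", "feed additive"),
}


def inherit_substances(facts):
    """Assumed extra step (not a rule in Table 2): a more specific concept
    inherits containsSubstance from the general concept that includes it."""
    return {
        (specific, "containsSubstance", y)
        for general, r1, specific in facts if r1 == "includesSpecific"
        for s, r2, y in facts if s == general and r2 == "containsSubstance"
    }


def rule_1(facts):
    """Part X mayContainSubstance Y if animal W hasComponent X and W ingests Y."""
    return {
        (x, "mayContainSubstance", y)
        for w, r1, x in facts if r1 == "hasComponent"
        for w2, r2, y in facts if w2 == w and r2 == "ingests"
    }


def rule_2(facts):
    """Food Z containsSubstance Y if Z madeFrom X and X containsSubstance Y."""
    return {
        (z, "containsSubstance", y)
        for z, r1, x in facts if r1 == "madeFrom"
        for x2, r2, y in facts if x2 == x and r2 == "containsSubstance"
    }


def closure(facts, rules):
    """Apply all rules repeatedly until nothing new is derived."""
    facts = set(facts)
    while True:
        new = set().union(*(rule(facts) for rule in rules)) - facts
        if not new:
            return facts
        facts |= new


derived = closure(FACTS, [inherit_substances, rule_1, rule_2]) - FACTS
for statement in sorted(derived):
    print(statement)
```

The point of the sketch is that each rule selects facts by relation name. With only NT and BT, as in the AGROVOC column, "cow NT cow milk" and "milk NT cow milk" look alike and neither rule could be stated.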

 

 

 

Soergel et al. (2004) also state what are, in their opinion, the limitations of existing KOSs and the potential benefits of future-generation KOSs:

 

 

"The limitations of existing KOS can be summarized as follows:
  • Lack of conceptual abstraction: thesauri and other traditional KOSs are collections of terms (generic or domain-specific), ordered in a polyhierarchic lattice structure or a monohierarchic tree structure and interlinked with some very broad and basic relationships. The distinction between a concept (meaning) and its lexicalizations (words) is not made consistently, if at all, in such a system, and as such it does not reflect the ways humans understand the world in terms of meaning and language.
  • Limited semantic coverage: most thesauri do not differentiate concepts into types (such as living organism, substance, or process) and have a very limited set of relationships between concepts, distinguishing only between hierarchical relationships, i.e. NT/BT, and associative relationships, i.e. RT. These very rudimentary relationships are not powerful enough to guide a user in meaningful information discovery on the Web or to support inference. They do not reflect the conceptual relationships that people know and that can be used by a system to suggest concepts for expanding the query or making it more specific. Examples:
    • The relation between cow and its part cow milk is expressed as NT rather than the more semantically expressive relation <hasComponent>, so a user who wants to expand the query hierarchically (search for all concepts narrower than cow as well) could not distinguish between searching for all cow parts or searching for all varieties of cow;
    • the relation between mad cow disease and the animal it afflicts, cow, is expressed using RT instead of the more semantically precise relation <afflicts>, so the user could not easily assemble a list of all cow diseases and search for recent occurrences;
    • mad cow disease and one of its symptoms anorexia would also be related using RT rather than the more semantically expressive relation <hasSymptom>.
    The concept relations provided by most thesauri force all relations into the two broad categories, hierarchical and associative. Too often the semantic relationships captured in this way are ambiguous and poorly defined. The generalization/specialization relations defined in most thesauri are not adequately developed to be of use for semantic description and discovery of Web resources. Thus there is a need for a richer and more powerful set of relationships.
  • Lack of consistency: since the relationships in thesauri lack precise semantics, they are applied inconsistently, both creating ambiguity in the interpretation of the relationships and resulting in an overall internal semantic structure that is irregular and unpredictable. Many of the NT/BT hierarchical relationships could, for example, be resolved to the non-hierarchical RT relationship, and vice versa.
  • Limited automated processing: traditionally thesauri were designed for indexing and query formulation by people and not for automated processing. The ambiguous semantics that characterizes many thesauri makes them unsuitable for automated processing." (Soergel et al., 2004).
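
The examples in the quotation above (cow, cow milk, mad cow disease, anorexia) can be turned into a small sketch of why undifferentiated relations limit retrieval. The triple lists and the two helper functions are hypothetical; the relation names are the ones used in the quoted text.

```python
# The same three relationships, first as a thesaurus would record them
# (NT/RT only), then with the differentiated relations named in the text.
THESAURUS = [
    ("cow", "NT", "cow milk"),
    ("mad cow disease", "RT", "cow"),
    ("mad cow disease", "RT", "anorexia"),
]
ONTOLOGY = [
    ("cow", "hasComponent", "cow milk"),
    ("mad cow disease", "afflicts", "cow"),
    ("mad cow disease", "hasSymptom", "anorexia"),
]


def objects_of(triples, subject, relation):
    """Sorted objects linked to `subject` by `relation`."""
    return sorted(o for s, r, o in triples if s == subject and r == relation)


def subjects_of(triples, relation, obj):
    """Sorted subjects linked to `obj` by `relation`."""
    return sorted(s for s, r, o in triples if r == relation and o == obj)


# Thesaurus: the afflicted animal and the symptom come back mixed together.
related_terms = objects_of(THESAURUS, "mad cow disease", "RT")

# Ontology: each question can be asked separately.
cow_diseases = subjects_of(ONTOLOGY, "afflicts", "cow")
cow_parts = objects_of(ONTOLOGY, "cow", "hasComponent")

print(related_terms)
print(cow_diseases)
print(cow_parts)
```

With RT alone, a system cannot tell that cow is the afflicted animal and anorexia a symptom; with the named relations, a list of all cow diseases or all cow parts is a simple lookup.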

 

 

"Potential benefits of future generation KOSs

For emerging KOSs to satisfy user needs, they must improve both information organization and retrieval in a way that was not possible with traditional KOSs. The following potential benefits are expected from such systems:
  • Unique identifiers and formal semantics: the explicit definition of concepts and relations in an ontology allows a unique identifier to be assigned to each concept. As each concept and relation is explicitly defined as a unique entity, the ontology lends itself to semantic formalization.
  • Internal consistency: another benefit of explicit semantics is the achievement of internal structural consistency in the expression of knowledge due to the possibility of applying integrity constraints.
  • Interoperability: clear semantics enables interoperability among different KOSs since corresponding concepts within different KOSs would have the same unique identifier, irrespective of the actual lexicalizations used to express those concepts. Semantic interoperability promotes sharing and reuse of knowledge.
  • Greater information integration: interoperability among different KOSs makes it possible for machines to recognize and analyze intended meaning of terms from disparate vocabularies. This is possible by using structured meta-information and formal knowledge description such as agreed-upon metadata schemas, controlled domain vocabularies, and taxonomies. The ability to integrate terminologies from different sources maximizes the value of investment made in the ontology.
  • Inferencing capability: new KOSs have the potential for expressing knowledge beyond what is present in the structure of the system. Unlike traditional KOSs where both concepts and relations are underspecified and very few, if any, axiomatic rules exist, the facts (concepts and relations) and rules that can be derived from an ontology have the expressive capabilities that allow for reasoning.
  • Automated information processing: new KOSs create improved potential to discover relevant information from different sources by exploring patterns and filtering information using conceptual connections represented in the ontology. This enables question-answering from one or more databases or, using natural language processing (see next bullet), from text.
  • Natural language processing (NLP) support: offers the possibility of providing a direct reply to a search question that is expressed in natural language, using the enhanced relationships and semantics in an ontology, instead of only returning a list of relevant documents.
  • Search query understanding: using NLP and semantic processing, a system can understand a query posed in natural language, determine the concepts involved and, where useful, create a Boolean query.
  • Concept-based search: an ontology can provide context-aware search capabilities specific to the area of interest.
  • Integrated information search/browse support: text mining on the Web (Web mining) through meaning-oriented access, dynamic organization of information with the possibility for cross-domain links are feasible with emerging KOSs.
  • Search query expansion: the enhancement, extension, and disambiguation of user query terms become possible with the addition of enriched domain- and context-specific information." (Soergel et al., 2004).
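
The points on unique identifiers, interoperability and query expansion can be illustrated with a minimal sketch. Everything specific in it is hypothetical: the two vocabularies, their labels and the identifiers such as "concept:0001" are invented for the example and do not come from any real KOS.

```python
# Two hypothetical vocabularies that use different lexicalizations but share
# concept identifiers. Labels and identifiers are invented for illustration.
KOS_A = {"cow milk": "concept:0001", "milk fat": "concept:0002"}
KOS_B = {"bovine milk": "concept:0001", "butterfat": "concept:0002"}


def translate(term, source, target):
    """Labels in `target` that name the same concept as `term` in `source`."""
    concept_id = source.get(term)
    if concept_id is None:
        return []
    return sorted(label for label, cid in target.items() if cid == concept_id)


def expand_query(term, vocabularies):
    """Every label, across all vocabularies, for the concept(s) named by `term`."""
    ids = {vocab[term] for vocab in vocabularies if term in vocab}
    return sorted(
        {label for vocab in vocabularies for label, cid in vocab.items() if cid in ids}
    )


print(translate("cow milk", KOS_A, KOS_B))
print(expand_query("butterfat", [KOS_A, KOS_B]))
```

Because matching is done on the identifier rather than on the wording, a query phrased in one vocabulary can be expanded with the labels of another, which is the kind of interoperability the quoted list describes.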

 

 

 

Literature:

 

Ashburner, M. et al. (2000). (Gene Ontology Consortium). Gene Ontology: tool for the unification of biology. Nature genetics, 25, 25-29. Available at: http://www.geneontology.org/GO_nature_genetics_2000.pdf

 

Barnes, J. & Robertson, J. (2002). The use of ontologies in drug discovery. Bioinformatics World.
http://web.archive.org/web/20021210111424/http://www.bioinformaticsworld.info/biwaut02ontologies.html

 

de Bruijn, J. & Fensel, D. (2005). Ontology definitions. IN: Encyclopedia of Library and Information Science. New York: Marcel Dekker. Pp. 1-11. Online: http://www.dekker.com/sdek/issues~content=t713172967.

 

Burkhardt, H. & Smith, B. (Eds.). (1991). Handbook of Metaphysics and Ontology. Vol. 1-2. Munich: Philosophia.

 

DAML Ontology library. http://www.daml.org/ontologies/

 

Ding, Y. (2001). A review of ontologies with the Semantic Web in view. Journal of Information Science, 27(6), 377-384.

 

Fast, K. V. & Campbell, D. G. (2001). The ontological perspectives of the Semantic Web and the metadata harvesting protocol: Applications of metadata for improving web search. Canadian Journal of Information and Library Science-Revue Canadienne des Sciences de l'Information et de Bibliotheconomie, 26(4), 5-19. Available at:
http://www.biblio.iteso.mx/biblioteca/oaithemes/themes/ontologicalperspectives.pdf

 

Gilchrist, A. (2003). Thesauri, taxonomies and ontologies - an etymological note. Journal of Documentation, 59(1), 7-18.

 

Gruber, T. R. (1993a). A translation approach to portable ontologies. Knowledge Acquisition, 5(2), 199-220. Available online: http://ksl-web.stanford.edu/KSL_Abstracts/KSL-92-71.html

 

Gruber, T. R. (1993b). Toward principles for the design of ontologies used for knowledge sharing. Presented at the Padua workshop on Formal Ontology, March 1993, to appear in an edited collection by Nicola Guarino. Available online http://ksl-web.stanford.edu/KSL_Abstracts/KSL-93-04.html

 

Guarino, N. (1995). Formal ontology, conceptual analysis and knowledge representation. International Journal of Human-Computer Studies, 43(5-6), 625-640.


Guarino, N. (1997). Understanding, building and using ontologies. International Journal of Human-Computer Studies, 46(2-3), 293-310. Modified version available at: http://ksi.cpsc.ucalgary.ca/KAW/KAW96/guarino/guarino.html

 

Guarino, N. (1998). Formal Ontology and Information Systems. IN: Proceedings of FOIS’98, Trento, Italy, 6-8 June 1998. Amsterdam, IOS Press, pp. 3-15. Available at: http://www.loa-cnr.it/Papers/FOIS98.pdf

 

Jermey, J. & Browne, G. (2004). Website Indexing: Enhancing Access to Information within Websites. Blaxland, NSW: Glenda Browne and Jonathan Jermey.

 

Kumar, A. & Smith, B. (2003). The unified medical language system and the gene ontology: Some critical reflections. IN: KI 2003: Advances in Artificial Intelligence (Lecture Notes in Artificial Intelligence 2821), Berlin: Springer, 135–148. http://ontology.buffalo.edu/medo/UMLS_GO.pdf

 

Legg, C. (2007). Ontologies on the semantic web. Annual Review of Information Science and Technology, 41, 407-451.

 

Lombardi, V. (2004). A metadata glossary. http://noisebetweenstations.com/personal/essays/metadata_glossary/metadata_glossary.html


McGuinness, D. L. (2003). Ontologies Come of Age. IN: Dieter Fensel, Jim Hendler, Henry Lieberman, and Wolfgang Wahlster, editors. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press. http://www-ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-mit-press-(with-citation).htm

 

Nellhaus, T. (1998). Signs, social ontology, and critical realism. Journal for the theory of social behaviour, 28(1), 1-24.

 

Ontosaurus: A Tool for Browsing and Editing Ontologies

 

Poli, R. (1996). Ontology for knowledge organization. In R. Green (Ed.). Knowledge organization and change (pp. 313-319). Frankfurt: Indeks Verlag.

 

Prueitt, P. S. (1996). Ontology based document understanding. Notate 96 Conference. http://web.archive.org/web/20030223161937/http://www.acsa2000.net/notion_l.html 

 

Sharman, R.; Kishore, R. & Ramesh, R. (Eds.). (2007). Ontologies: A handbook of principles, concepts and applications in information systems. New York: Springer.

 

Shirky, C. (2005). Ontology is Overrated: Links, Tags, and Post-hoc Metadata. From the O'Reilly Emerging Technology Conference held in San Diego, California, March 14-17, 2005. http://shirky.com/writings/ontology_overrated.html

 http://www.itconversations.com/shows/detail470.html; http://conferences.oreillynet.com/cs/et2005/view/e_sess/6117

 

Smith, B. (1995). Formal ontology, common sense and cognitive science. International Journal of Human-Computer Studies, 43(5-6), 641-667.

 

Smith, B. (2003a). Ontology. In L. Floridi (Ed.), The Blackwell Guide to the Philosophy of Computing and Information (pp. 155-166). Malden, MA: Blackwell.

 

Smith, B. (2003b). Ontology and Information Science. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Stanford, CA: The Metaphysics Research Lab, Center for the Study of Language and Information, Stanford University.

Smith, B. (2004). Beyond Concepts: Ontology as Reality Representation. Achille Varzi and Laure Vieu (eds.), Formal Ontology and Information Systems. Proceedings of the Third International Conference (FOIS 2004), Amsterdam: IOS Press, 73–84. http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf

 

Smith, B. & Welty, C. (2001). Ontology: Towards a New Synthesis. FOIS’01, October 17-19, 2001, Ogunquit, Maine, USA. Pages: 3-9. http://lists.w3.org/Archives/Public/www-webont-wg/2002Aug/att-0056/fois-intro.pdf (Visited April 17, 2004).

 

Smith, T. C. & Cleary, J. G. (2003). Automatically linking MEDLINE abstracts to the Gene Ontology. Proceedings of the ISMB 2003 BioLINK Text Data Mining SIG, Brisbane, Australia. June. http://www.reeltwo.com/news/Text_Mining_SIG_abstract.pdf

 

Soergel, D. (1999). The rise of ontologies or the reinvention of classification. Journal of the American Society for Information Science, 50(12), 1119-1120.

 

Soergel, D., Lauser, B., Liang, A., Fisseha, F., Keizer, J., & Katz, S. (2004). Reengineering thesauri for new applications: the AGROVOC example. Journal of Digital Information, 4(4). http://jodi.tamu.edu/Articles/v04/i04/Soergel/

 

Sowa, J. F. (2000). Ontology, Metadata, and Semiotics. Presented at ICCS'2000 in Darmstadt, Germany, on August 14, 2000. Published in B. Ganter & G. W. Mineau, eds., Conceptual Structures: Logical, Linguistic, and Computational Issues, Lecture Notes in AI #1867, Springer-Verlag, Berlin, 2000, pp. 55-81.  http://users.bestweb.net/~sowa/peirce/ontometa.htm

 

Stahl, B. C. (2007b). Positivism or non-positivism: tertium non datur. A critique of ontological syncretism in IS research. IN: Ontologies: A handbook of principles, concepts and applications in information systems. Ed. by Raj Sharman, Rajiv Kishore & Ram Ramesh. New York: Springer. (Pp. 115-142).

 

Stahl, B. C. (2007a). Ontology, lifeworld, and responsibility in IS. IN: Ontologies: A handbook of principles, concepts and applications in information systems. Ed. By Raj Sharman, Rajiv Kishore & Ram Ramesh. New York: Springer. (Pp. 143-169).

 

Svenonius, E. (2000). The intellectual foundation of information organization. Cambridge, MA: MIT Press.

 

Uschold, M. (1996). Building Ontologies: Towards a Unified Methodology. Presented at Expert Systems ’96 Conference. [not available 2006-06-15] ftp://ftp.aiai.ed.ac.uk/pub/documents/1996/

 

Uschold, M. (1998). Knowledge level modelling: concepts and terminology. Knowledge Engineering Review, 13(1), 5-29.


Uschold, M., & Gruninger, M. (1996). Ontologies: Principles, methods and applications. Knowledge Engineering Review, 11(2), 93-136.

 

Vickery, B. C. (1997). Ontologies. Journal of Information Science, 23(4), 277-286.

 

Wang, H. & Wang, C. (1995). Ontologies for universal information systems. Journal of Information Science, 21(3), 232-239.

 

Welty, C. A. (1998). The Ontological Nature of Subject Taxonomies. IN: N. Guarino (ed.), Proceedings of the First Conference on Formal Ontology and Information Systems, Amsterdam, IOS Press. http://www.cs.vassar.edu/faculty/welty/papers/fois-98/fois-98-1.html

 

 

 

See also: Ontology & metaphysics (Epistemological lifeboat); Semantic web; Topic Maps

 

 

 

Birger Hjørland

Last edited: 24-05-2007


 



Questions:

 

  1. An ontology has been defined as “a specification of a representational vocabulary for a shared domain of discourse - definitions of classes, relations, functions, and other objects” and as a specification of a conceptualization. Discuss whether these two definitions also apply to other kinds of semantic tools.

  2. What is (if anything) the principal difference between an ontology and other kinds of semantic tools (such as taxonomies and thesauri)?