Metathesaurus / megathesaurus
A metathesaurus (or megathesaurus) is a kind of thesaurus that tries to integrate existing thesauri and vocabularies.

 

"Metathesaurus – is intellectual middleware; The National Library of Medicine’s Unified Medical Language System (UMLS) Metathesaurus cross-references national and international medical vocabularies" (National Committee on Vital and Health Statistics, 2000).

 

 

 

"To facilitate multi-file searching across Wilson databases that have been indexed with different controlled vocabularies, the H. W. Wilson Company has developed and is maintaining a “megathesaurus,” in a sense, a thesaurus of thesauri. Each record in the megathesaurus contains the “authority main term” or “megathesaurus term,” which serves as the anchor to which equivalent terms, along with their respective relational terms, from twelve vocabularies are mapped and stored. An automatic switching mechanism in searching has been developed, which enables the user to search a single index, multiple indexes simultaneously, or the combined indexes in the multi-file OMNI Index in a transparent manner, by using search terms based on any of the source vocabulary" (Kuhr, 2003).

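The record structure and switching mechanism that Kuhr describes can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout, the vocabulary names `VocabA` and `VocabB`, the example terms, and the function `switch_term` are hypothetical and do not reproduce the H. W. Wilson implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MegaRecord:
    """One megathesaurus record: an anchor term plus equivalents per source vocabulary."""
    anchor: str
    equivalents: dict[str, list[str]] = field(default_factory=dict)


# Hypothetical records; vocabulary names and terms are invented for illustration.
RECORDS = [
    MegaRecord(
        anchor="Motion pictures",
        equivalents={
            "VocabA": ["Motion pictures"],
            "VocabB": ["Films", "Cinema"],
        },
    ),
]


def switch_term(query: str, target_vocab: str, records=RECORDS) -> list[str]:
    """Map a search term from any source vocabulary to the terms used in target_vocab."""
    q = query.casefold()
    for rec in records:
        # A query matches a record if it equals the anchor or any mapped equivalent.
        known = [rec.anchor] + [t for terms in rec.equivalents.values() for t in terms]
        if any(q == t.casefold() for t in known):
            return rec.equivalents.get(target_vocab, [])
    return []  # no record contains the query term
```

In this sketch a user who types "cinema" while searching an index built on `VocabA` would be switched to "Motion pictures" without needing to know which vocabulary that index uses.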
 

“The Unified Medical Language System (UMLS) Metathesaurus is concept-oriented; its goal is to unite all names with identical meaning in a single Concept. The names come from its constituent vocabularies or "sources" - a wide variety of biomedical terminologies including many controlled vocabularies and classifications used in patient records, administrative health data, bibliographic, research, full-text, and expert systems. Many offer little definitional information, and many are not themselves concept-oriented, so identifying synonymy is a challenging semantic task. The rapidly increasing size of the Metathesaurus makes the task daunting, demanding effective computational support; there are more than 1.5 million names for 730,000 concepts in the January 2000 release.
Vocabularies are added and updated using sophisticated lexical matching, selective algorithms, and expert review. Yet the result is imperfect; we have discovered and corrected missed synonymy in approximately 1% of previously released concepts each year” (Hole & Srinivasan, 2000).
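The idea of uniting names under one concept by lexical matching can be illustrated with a small Python sketch. It is a toy under stated assumptions: the normalization in `normalize` (lower-casing, dropping punctuation, sorting the words) only gestures at lexical matching and is not the algorithm used for the UMLS Metathesaurus, and the source names and example strings are hypothetical. Names whose normalized forms coincide are proposed as one candidate concept, which would still need expert review.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Crude lexical normalization: lower-case, drop punctuation, sort the words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(words))


def candidate_concepts(names):
    """Group (source, name) pairs whose normalized names coincide."""
    groups = defaultdict(list)
    for source, name in names:
        groups[normalize(name)].append((source, name))
    return list(groups.values())


# Hypothetical names from two hypothetical source vocabularies.
names = [
    ("SourceA", "Heart attack"),
    ("SourceB", "Attack, heart"),
    ("SourceA", "Myocardial infarction"),
]
groups = candidate_concepts(names)
```

Here "Heart attack" and "Attack, heart" fall into one group, while "Myocardial infarction" stays on its own even though it names the same condition. Purely lexical matching cannot see that, which suggests one way synonymy can be missed and why expert review remains part of the process.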
                      Thesauri may be merged into metathesauri. As stated by Nelson, Johnston, Powell & Hole (2001), a precondition for such a merger is that the different thesauri share the same theoretical commitments: “The first is to be sure that the worldviews are completely in agreement. This is a necessary step before attempting to merge the two maintenance environments. Another is to review the methods of assigning the unique identifiers, and identify any areas where there are potentials for problems. A third is to provide a step in quality control of the agreed upon and shared worldview. Still another is to help identify areas where there may be some subtle shifts in meaning. This process has made us aware of how subtle differences in the maintenance environments or in the operational approach to some deep philosophical issues can affect the vocabulary produced.”
                      Such differing worldviews may also explain why query expansion based on metathesauri sometimes causes a decline in retrieval performance, as demonstrated by Hersh, Price & Donohoe (2000), who concluded from their experiment: “[Meta]thesaurus-based query expansion causes a decline in retrieval performance generally but improves it in specific instances. Further research must focus on identifying instances where performance improves and how it can be exploited by real users”.
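Thesaurus-based query expansion of the general kind discussed above can be sketched in Python as follows. The concept table, its identifiers, and the function `expand_query` are hypothetical; the sketch shows only the basic technique (adding every name of each matched concept to the query) and is not the procedure used by Hersh, Price & Donohoe.

```python
# Hypothetical concept table: concept identifier -> names with the same meaning.
CONCEPTS = {
    "C1": ["heart attack", "myocardial infarction"],
    "C2": ["cold", "common cold"],
    "C3": ["cold", "low temperature"],
}


def expand_query(terms, concepts=CONCEPTS):
    """Return the query terms plus all names of every concept that contains a term."""
    expanded = []
    for term in terms:
        t = term.casefold()
        candidates = [t]
        for names in concepts.values():
            if t in names:
                candidates.extend(names)
        for c in candidates:
            if c not in expanded:  # keep first-seen order, drop duplicates
                expanded.append(c)
    return expanded
```

Expanding "heart attack" adds "myocardial infarction", which may help recall. Expanding the ambiguous term "cold" pulls in names from two unrelated concepts, which is one plausible way for expansion to lower retrieval performance.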

 

 

Literature:


Aronson, A. R. (2001). Effective mapping of biomedical text to the UMLS metathesaurus: The MetaMap Program. Journal of the American Medical Informatics Association, S, 17-21.
 

Cimino, J. J. (2001). Battling Scylla and Charybdis: the search for redundancy and ambiguity in the 2001 UMLS metathesaurus. Journal of the American Medical Informatics Association, S, 120-124.
 

Hersh, W.; Price, S. & Donohoe, L. (2000). Assessing thesaurus-based query expansion using the UMLS metathesaurus. Journal of the American Medical Informatics Association, S, 344-348.
 

Hole, W. T. & Srinivasan, S. (2000). Discovering missed synonymy in a large concept-oriented metathesaurus. Journal of the American Medical Informatics Association, S, 354-358.

 

Kuhr, P. S. (2003). Putting the world back together: mapping multiple vocabularies into a single thesaurus. In: I. C. McIlwaine (ed.): Subject Retrieval in a Networked Environment. Proceedings of the IFLA Satellite Meeting held in Dublin, Ohio, 14-16 August 2001. München: K. G. Saur. Pp. 33-42.
 

National Committee on Vital and Health Statistics (2000). Report on Uniform Data Standards for Patient Medical Record Information. Available at: http://ncvhs.hhs.gov/hipaa000706.pdf

 

Schuyler, P. L.; Hole, W. T.; Tuttle, M. S. & Sherertz, D. D. (1993). The UMLS Metathesaurus: representing different views of biomedical concepts. Bulletin of the Medical Library Association, 81(2), 217-222. ABSTRACT: The UMLS Metathesaurus® is a compilation of names, relationships, and associated information from a variety of biomedical naming systems representing different views of biomedical practice or research. The Metathesaurus is organized by meaning, and the fundamental unit in the Metathesaurus is the concept. Differing names for a biomedical meaning are linked in a single Metathesaurus concept. Extensive additional information describing semantic characteristics, occurrence in machine-readable information sources, and how concepts co-occur in these sources is also provided, enabling a greater comprehension of the concept in its various contexts.
                       The Metathesaurus is not a standardized vocabulary; it is a tool for maximizing the usefulness of existing vocabularies. It serves as a knowledge source for developers of biomedical information applications and as a powerful resource for biomedical information specialists.
 

Sparck Jones, K. (1992). Thesaurus. In: S. C. Shapiro (ed.): Encyclopedia of Artificial Intelligence, Vol. 1-2. New York: John Wiley & Sons. (Vol. 2, pp. 1605-1613).
 

Tuttle, M. S.; Olson, N. E.; Campbell, K. E.; Sherertz, D. D.; Nelson, S. J. & Cole, W. G. (1994). Formal properties of the metathesaurus. Journal of the American Medical Informatics Association, S, 145-149. ABSTRACT: The Metathesaurus is a machine-created, human edited and enhanced synthesis of authoritative biomedical terminologies. Its formal properties permit it to be a) exploited by computers, and b) modified and enhanced without compromising that usage. If further constraints were imposed on the existence and identity of Metathesaurus relationships, i.e., if every Metathesaurus concept had a "genus" and a "differentia," then the Metathesaurus could be converted into an "Aristotelian Hierarchy." In this sense, a genus is a concept that classifies another concept, and a differentia is a concept that distinguishes the classified concept from all other concepts in the same class. Since, in principle, these constraints would make the Metathesaurus easier to leverage and maintain computationally, it is interesting to ask to what degree the maintenance and enhancement procedures now in place are producing a Metathesaurus that is also an "Aristotelian Hierarchy." Given a liberal interpretation of the current Metathesaurus schema, the proportion of the Metathesaurus that is "Aristotelian" in each annual version is increasing in spite of dramatic concurrent increases in the number of Metathesaurus concepts.
 

Wang, J.; Yan, J. S.; Strasberg, H. R. & Melmon, K. L. (2000). Versatile user interface using UMLS Metathesaurus. Journal of the American Medical Informatics Association, S, 888-892. ABSTRACT: One of the obstacles for a successful search in the biomedical field is that different vocabularies are used by different databases but more than one database is usually needed to respond adequately to a healthcare professional's query. A typical searcher usually is unfamiliar with these vocabularies and the sophisticated measures to narrow or broaden a search. As a result, a failed search is often due to using "inappropriate" search terms. We have developed a highly interactive and versatile user interface, SHINE Refined Search (SHINE RS). It uses medical concepts from the UMLS Metathesaurus as the building block to help searchers find "appropriate" search terms for their queries. The results of our preliminary usability assessment are promising and demonstrate the potential to improve retrieval results.
 

Rindflesch, T. C. & Aronson, A. R. (1994). Ambiguity resolution while mapping free text to the UMLS metathesaurus. Journal of the American Medical Informatics Association, S, 240-244.
 
 

See also: Unified Medical Language System

 

 

 

 

Birger Hjørland

Last edited: 16-07-2006
