Semantic distance
"Distance" is a metaphor for relatedness. Semantic distance is the degree to which concepts are related in meaning. The concept "swan" may, for example, be said to be closer to the concept "waterfowl" than to the concept "fox".
This example may, however, give the false impression that semantic relatedness is objective and part of fixed structures. Pellegrin, Bastien & Roux (1994), among others, found variable semantic distances between concepts of the same class, and dynamic variations in these distances due to contextual representation, which is why the view of semantic distances as objective is probably an unfruitful assumption. It would thus be better to say, for example, "according to current evolutionary theory, 'swan' is more closely related to 'waterfowl' than to 'fox'". Just as a classification always reflects a purpose, semantic distances are relative to the goals for which they are used. Consequently, semantic relations cannot just be studied empirically, but must be studied in relation to different interests and views.
If concepts are the units of knowledge, then the semantic distance between concepts is a fundamental way of considering knowledge organization (KO). Any theory or measurement of semantic distance must be considered a theory of, or an approach to, KO. A mapping of semantic distances represents a semantic tool and a knowledge organization system (KOS).
One way to measure semantic distances is to consider the number of levels in a thesaurus. This method was used by Brooks (1995a, 1995b, 1997 & 1998).
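Counting levels in a thesaurus can be illustrated with a minimal sketch. It is not Brooks's actual procedure: the toy thesaurus, the single-broader-term assumption, and the names BROADER, path_to_root and level_distance are all hypothetical and serve only to show the idea of counting hierarchical steps between two terms.

```python
# Toy thesaurus (hypothetical): each term maps to its single broader term (BT).
BROADER = {
    "swan": "waterfowl",
    "duck": "waterfowl",
    "waterfowl": "birds",
    "birds": "animals",
    "fox": "canids",
    "canids": "mammals",
    "mammals": "animals",
    "animals": None,  # top term
}


def path_to_root(term, broader=BROADER):
    """Return the chain [term, its BT, the BT of that, ..., top term]."""
    path = []
    while term is not None:
        path.append(term)
        term = broader[term]
    return path


def level_distance(a, b, broader=BROADER):
    """Count the hierarchical steps between a and b via their nearest shared broader term."""
    up_a = path_to_root(a, broader)
    up_b = path_to_root(b, broader)
    steps_b = {term: i for i, term in enumerate(up_b)}
    for steps_a, term in enumerate(up_a):
        if term in steps_b:
            return steps_a + steps_b[term]
    raise ValueError("terms share no broader term")


print(level_distance("swan", "waterfowl"))  # 1
print(level_distance("swan", "fox"))        # 6
```

In this toy hierarchy "swan" is one level from "waterfowl" and six steps from "fox". A thesaurus built for another purpose would arrange the levels differently and so yield other distances.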
Another way to consider semantic distances is to use bibliometric methods. A map such as the one in Åström (2002) (LIS.PDF) shows the distances between the 47 most frequently occurring keywords in Library and Information Science (LIS).
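A bibliometric distance can be sketched from keyword co-occurrence. The example below is an illustration under stated assumptions, not Åström's method: the records are invented, and the Jaccard-based distance (1 minus shared records divided by records containing either keyword) is just one of several possible measures. The names RECORDS and cooccurrence_distance are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword sets, one per bibliographic record.
RECORDS = [
    {"information retrieval", "relevance", "indexing"},
    {"information retrieval", "relevance"},
    {"bibliometrics", "citation analysis"},
    {"bibliometrics", "citation analysis", "information retrieval"},
]


def cooccurrence_distance(records):
    """Return {(keyword_a, keyword_b): Jaccard distance} for every keyword pair."""
    occurs = Counter()    # number of records containing each keyword
    together = Counter()  # number of records containing both keywords of a pair
    for keywords in records:
        occurs.update(keywords)
        together.update(frozenset(pair) for pair in combinations(sorted(keywords), 2))

    distances = {}
    for a, b in combinations(sorted(occurs), 2):
        both = together[frozenset((a, b))]
        either = occurs[a] + occurs[b] - both
        distances[(a, b)] = 1 - both / either
    return distances


distances = cooccurrence_distance(RECORDS)
print(distances[("bibliometrics", "citation analysis")])  # 0.0: always used together
print(distances[("bibliometrics", "relevance")])          # 1.0: never used together
```

A matrix of such distances is the kind of input that mapping techniques turn into a two-dimensional keyword map. The resulting distances depend on which literature is sampled and how it was indexed.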
Lussky's research (2004) contributes to the theoretical explanation of semantic relatedness as measured bibliometrically by considering how theories/epistemologies influence the use of words in the scientific literature.
Literature:
Ammon, U. (1977). Indføring i sociolingvistik. Copenhagen: Gyldendal. (Translated from "Probleme der Soziolinguistik", 1973).
Bousquet, C.; Jaulent, M. C.; Chatellier, G. & Degoulet, P. (2000). Using semantic distance for the efficient coding of medical concepts. Journal of the American Medical Informatics Association, S, 96-100.
Brooks, T. A. (1995a). People, words and perceptions: A phenomenological investigation of textuality. Journal of the American Society for Information Science, 46(2), 103-115.
Brooks, T. A. (1995b). Topical subject expertise and the semantic distance model of relevance assessment. Journal of Documentation, 51(4), 370-387.
Brooks, T. A. (1997). The relevance aura of bibliographic records. Information Processing & Management, 33(1), 69-80.
Brooks, T. A. (1998). The Semantic Distance Model of Relevance Assessment. Proceedings of the 61st Annual Meeting of ASIS, Pittsburgh, PA, October 25-28, 1998: Information Access in the Global Information Economy, Vol. 35 (pp. 33-44).
Budanitsky, A. & Hirst, G. (2006). Evaluating WordNet-based measures of semantic distance. Computational Linguistics, 32(1), 13-47. http://ftp.cs.toronto.edu/pub/gh/Budanitsky+Hirst-2006.pdf
Byrne, C. C. & McCracken, S. A. (1999). An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval. Journal of Information Science, 25(2), 113-131.
Caviedes, J. E. & Cimino, J. J. (2004). Towards the development of a conceptual distance metric for the UMLS. Journal of Biomedical Informatics, 37(2), 77-85.
Lussky, J. P. (2004). Bibliometric patterns in an historical medical index: using the newly digitized Index Catalogue of the Library of the Surgeon General's Office, United States Army. Thesis, Drexel University. Available (full text): http://dspace.library.drexel.edu/retrieve/3815/Lussky_Joan.pdf
Pellegrin, L.; Bastien, C. & Roux, M. (1994). Representation of medical concepts of the thyroid-gland by physicians in anatomy and pathology. Methods of Information in Medicine, 33(4), 382-389.
Rips, L. J., Shoben, E. J. & Smith, E. E. (1973). Semantic distance and the verification of semantic relations. Journal of Verbal Learning and Verbal Behavior, 12, 1-20.
Schvaneveldt, R. W., Durso, F. T., & Mukherji, B. R. (1982). Semantic distance effects in categorization tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 1-15.
Shoben, E. J. (1976). The verification of semantic relations in a same-different paradigm: An asymmetry in semantic memory. Journal of Verbal Learning and Verbal Behavior, 15, 365-379.
White, H. D., & McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4), 327-355.
Åström, F. (2002). Visualizing Library and Information Science concept spaces through keyword and citation based maps and clusters. In: Bruce, Fidel, Ingwersen & Vakkari (Eds.). Emerging frameworks and methods: Proceedings of the fourth international conference on conceptions of Library and Information Science (CoLIS4), pp. 185-197. Greenwood Village: Libraries Unlimited. Two figures: Bibliometric_MAP_LIS.PDF; Bibliometric_LIS_2.PDF
See also: Bibliometric Knowledge Organization
Birger Hjørland
Last edited: 17-07-2006