Bibliometric Knowledge Organization

Bibliometric techniques, for example author co-citation analysis, may be used to construct an "Atlas of Science" or maps of knowledge fields. According to Bonitz (1983), this idea was first formulated by Wilhelm Ostwald (1919). Eugene Garfield (1981 and 1988) produced a major atlas of biochemistry. A well-known example in information science is the map produced by White & McCain (1998). Many such maps are produced today. Some methodological problems in this field are discussed in Hjørland (2002a).

 

Some attempts have been made to combine bibliometrics with more traditional approaches to knowledge organization and information retrieval. Kessler (1965) examined papers in Physical Review and compared bibliographic coupling (bc) with the analytical subject indexing (asi) assigned to the same documents by the periodical's editors. He found that: (i) groups formed by bc showed high correlation with the asi groups; (ii) groups formed by asi correlate with those formed by bc to a greater or lesser degree, depending on the 'logical size' of the asi category; (iii) coupling strength provides a measure of the strength of relatedness between pairs of papers.
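Coupling strength, as used in studies of this kind, is commonly operationalized as the number of references two papers have in common. The following is a minimal sketch under that assumption; the paper identifiers, reference lists and the function name coupling_strength are invented for illustration and do not come from Kessler's data.

```python
from itertools import combinations


def coupling_strength(refs_a, refs_b):
    """Bibliographic coupling strength: the number of shared references."""
    return len(set(refs_a) & set(refs_b))


# Hypothetical reference lists, invented for illustration only.
papers = {
    "P1": ["R1", "R2", "R3"],
    "P2": ["R2", "R3", "R4"],
    "P3": ["R5"],
}

# Coupling strength for every pair of papers.
strengths = {
    (a, b): coupling_strength(papers[a], papers[b])
    for a, b in combinations(sorted(papers), 2)
}

for pair, strength in strengths.items():
    print(pair, strength)
```

In this toy example P1 and P2 share two references and are therefore coupled with strength 2, while P3 is coupled to neither. Groups of papers could then be formed by applying a threshold to these strengths, which is one way of obtaining groups that can be compared with subject indexing.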

 

Salton (1971), Rees-Potter (1989, 1991), Pao & Worthen (1989) and Pao (1993) are other contributors to bibliometrics as an approach to knowledge organization (KO). Rees-Potter described a method for semi-automatically constructing and maintaining a thesaurus. The method uses citation and co-citation analysis and is based on the hypothesis that highly cited papers are concept symbols. More recently, Schneider (2004) and Schneider & Borlund (2004) made an important contribution to this field by demonstrating that thesaurus terms may be produced by means of bibliometric methods. One of the innovations in this research is the application of an advanced parser to identify noun phrases in small windows around citations in the text. This method is clearly an example of literary warrant, and as such a very explicit application of the principle.

 

" . . the case study of periodontology clearly demonstrates that the applied bibliometric methods of co-citation analysis and citation context analysis are able to select important candidate thesaurus terms. . . . We believe that the special selection procedures inherent in the methodical steps of the two components ensure that a significant number of the selected primary candidate thesaurus terms turn out to be important index terms. Hence, the conclusion is that the applied bibliometric methods are very suitable for selection of candidate thesaurus terms in the specialty area of periodontology. "

(Schneider, 2004, p. 323).
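The idea of looking for candidate terms in a small window of text around a citation can be illustrated with a much simplified sketch. It is not Schneider's procedure: no parser is used, noun phrases are not identified, and the citation marker format "(Name, Year)", the window size and the sample sentence are all assumptions made for illustration.

```python
import re

# Assumed marker format: "(Name, 1999)" or "(Name & Name, 1999)".
CITATION = re.compile(r"\(([^()]+?),\s*(\d{4})\)")


def citation_contexts(text, window=5):
    """Return (cited work, preceding words) for each citation marker found."""
    contexts = []
    for match in CITATION.finditer(text):
        # Words that occur before the citation marker.
        preceding = text[:match.start()].split()
        cited = f"{match.group(1)} {match.group(2)}"
        contexts.append((cited, preceding[-window:]))
    return contexts


# Invented sentence and invented citation, for illustration only.
sample = (
    "Plaque control reduces chronic gingival inflammation (Doe, 1999) "
    "in most patients."
)
contexts = citation_contexts(sample, window=3)
print(contexts)
```

A real application would pass each window to a noun-phrase parser and aggregate the resulting phrases per cited document, so that a highly cited paper accumulates the terms for which it serves as a concept symbol.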

 

 

"Data coverage issues also raises other questions such as the following:

We have no ready answers to these questions, but suggest that they are important topics for discussion and further research". (Börner, Chen & Boyack, 2003, p. 217).

 

A number of scholars have addressed the question of whether bibliographic coupling and/or co-citation are good indicators of subject relatedness. Small (1973) found that bibliographic coupling and co-citation analysis produced significantly different patterns, and suggested that bibliographic coupling is a less reliable indicator of subject similarity than co-citation. Small mentions different kinds of relations, not all of which are clearly defined in his paper. Co-citations may 1) be analogous to a measure of descriptor or word association (p. 265), 2) indicate relationships that are strongly recognized by people in the specialty (and which may be explicitly recognized in the papers), 3) measure subject similarity (p. 267), 4) reflect "semantic" relations among cited papers, and 5) identify the "core" literature in a specialty. No data or speculation is provided concerning the validity and reliability of these measures of subject relatedness, or the conditions under which bibliographic coupling or co-citation may be a good indicator of it. Concepts such as "subject relatedness" and "semantic relations" are discussed only vaguely, without any hint of how they might be operationalized empirically. Papers are not directly semantically related, because a semantic relation is a relation between concepts, not between papers. However, papers use concepts, and the concepts they use may be more or less semantically related. Also, as mentioned above, references to other papers are often used as "concept symbols". One way to approach this problem may be to see traditional subject relations as kinds of intellectual organization of knowledge, and to see disciplines, specialties and citation networks as kinds of social organization of knowledge.
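Co-citation is usually counted as the number of citing papers whose reference lists contain both of two cited documents. A minimal sketch of that count follows; the reference lists and the function name cocitation_counts are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of cited documents, how many papers cite both."""
    counts = Counter()
    for refs in reference_lists:
        # Each citing paper contributes at most one co-citation per pair.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical reference lists of three citing papers.
citing = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C"],
]

counts = cocitation_counts(citing)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Here A and B are co-cited twice, B and C twice, and A and C once. Note that the count is a property of the citing literature, so it may change as new citing papers appear, whereas the reference list of a published paper is fixed.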

 

What are the principal advantages and limitations of using bibliometric methods for knowledge organization? First, it is important to understand what the bibliometric methods are opposed to. What are the principal differences between bibliometric and "classical" methods of knowledge organization? Traditional systems of knowledge organization have, for example, classified countries according to their current jurisdictions, animals according to zoological taxonomy and chemical substances according to the periodic system of physics and chemistry. In other words, traditional systems are constructed on the basis of ontological models of reality produced by scientists or other professionals. They represent intellectual classifications of parts of the world. Bibliometric methods, on the other hand, model patterns in scientific communication and organization. They are social models, displaying social structures among scientists and scholars.

 

Intellectual and social models may overlap to a greater or lesser degree. If they overlap substantially, bibliometric methods may challenge traditional methods of knowledge organization. They probably overlap more in some domains and less in others.

 

This issue has been discussed by Hjørland (e.g. 1993, 1997, 1998, 2002 and 2004), who found bibliometric methods valuable but also found it improbable that traditional classification structures such as geographical maps or the periodic system can be produced by any empirical analysis of the citation patterns of documents. However, empirical studies are needed to further illuminate such connections. The relation between intellectual knowledge organization and social knowledge organization is theoretically related to the debate between social constructivists, who claim that scientists construct ontology, and realists, who claim that the sciences are structured according to given ontological structures discovered by science. To the degree that the social constructivist point of view is true, bibliometric maps should be the most useful. To the degree that the scientific realist point of view is true, ontological models produced by specialists should turn out to be the most useful models for knowledge organization. In any case, the two approaches may both contribute valuable structures and thus supplement each other.

 

Two considerations are important in relation to citation indexes as tools for KO: 1) The level of indexing depth is partly determined by the number of terms assigned to each document. In citation indexing this corresponds to the number of references in a given paper. On average, scientific papers contain 10-15 references, which provides a rather high level of depth. 2) The references, which function as access points, are provided by those with the highest subject expertise: the experts writing in the leading journals. This expertise is much higher than what library catalogs or bibliographical databases are typically able to draw on.
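The way references function as access points can be sketched as an inverted file, in which each cited reference plays the role of an index term pointing to the papers that cite it. The sketch below also computes the average number of references per paper as a rough proxy for indexing depth. The data and the function names build_citation_index and mean_depth are invented for illustration and do not describe any particular citation index.

```python
from collections import defaultdict


def build_citation_index(papers):
    """Invert reference lists: cited reference -> set of citing papers."""
    index = defaultdict(set)
    for paper, refs in papers.items():
        for ref in refs:
            index[ref].add(paper)
    return index


def mean_depth(papers):
    """Average number of distinct references (access points) per paper."""
    return sum(len(set(refs)) for refs in papers.values()) / len(papers)


# Hypothetical papers and their reference lists.
papers = {
    "P1": ["R1", "R2"],
    "P2": ["R2", "R3", "R4"],
    "P3": ["R2"],
}

index = build_citation_index(papers)
depth = mean_depth(papers)

print(sorted(index["R2"]))  # papers retrievable via the access point R2
print(depth)
```

Searching on R2 retrieves all three papers, in the same way that searching on a descriptor retrieves the documents indexed with it. The toy average of two references per paper is far below the 10-15 mentioned above; it only shows how the measure would be computed.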

 

 

 

 

Bibliographic references as index entries / subject access points

Advantages

Disadvantages

  • Citations are provided by highly qualified subject specialists.
  • The number of references reflects the indexing depth and specificity (scientific papers average about 10 references per article).
  • Citation indexing is a highly dynamic form of subject representation.
  • References are distributed throughout papers, which allows the structure of the paper to be used in the contextual interpretation of citations.
  • Scientific papers form a kind of self-organizing system.

 

 

 

 

 

 

Literature:


Bernstam, E. V.; Herskovic, J. R.; Aphinyanaphongs, Y.; Aliferis, C. F.; Sriram, M. G. & Hersh, W. R. (2006). Using citation data to improve retrieval from MEDLINE. Journal of the American Medical Informatics Association, 13(1), 96-105.


Bonitz, M. (1983). Wie lassen sich die Frontgebiete der Forschung bestimmen? : 'ISI Atlas of Science' für Biochemie und Molekularbiologie. Zentralblatt für Bibliothekswesen, 97(7), 295‑296.

 

Börner, K., Chen, C. M. & Boyack, K. W. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37, 179-255.

Figures in color:

http://web.archive.org/web/20030623065959/http://www.asis.org/Publications/ARIST/Vol37/BornerFigures.htm

 

Gabel, J. (2006). Improving Information Retrieval of Subjects Through Citation-Analysis. Knowledge Organization, 33(2), 86-95.

 

Garfield, E. (1981). Introducing the ISI Atlas of Science: Biochemistry and Molecular Biology, 1978-1980. Current Contents, (42), 5-13.

 

Garfield, E. (1988). The Encyclopedic ISI‑Atlas‑of‑Science Launches Three new Sections: Biochemistry, Immunology, and Animal- & Plant Sciences. Current Contents, (7), 3‑8.

 

Harter, S. P.; Nisonger, T. E. & Weng, A. W. (1993). Semantic relations between cited and citing articles in library and information science journals.  Journal of the American Society for Information Science, 44(9), 543-552.  

 

Hjørland, B. (1993). Emnerepræsentation og informationssøgning. Bidrag til en teori på kundskabsteoretisk grundlag. Göteborg: Valfrid, Distributionsföreningen för inst Biblioteks­högskolan vid Högskolan i Borås och Centrum för biblioteks- och informationsvetenskap vid Göteborgs Universitet.

 

Hjørland, B. (1997). Information Seeking and Subject Representation. An Activity-theoretical approach to Information Science. Westport & London: Greenwood Press.

 

Hjørland, B. (1998). Information retrieval, text composition, and semantics. Knowledge Organization, 25(1/2), 16-31.

 

Hjørland, B. (2002a). Domain analysis in information science. Eleven approaches - traditional as well as innovative. Journal of Documentation, 58(4), 422-462. http://web.archive.org/web/20040721022850/http://www.db.dk/bh/publikationer/Filer/JDOC_2002_Eleven_approaches.pdf

 

Hjørland, B. (2002b). The Methodology of Constructing Classification Schemes. A Discussion of the State-of-the-Art. In: Proceedings of the Seventh International ISKO Conference, 10-13 July 2002, Granada, Spain. Ed. by María J. López-Huertas. Würzburg, Germany: Ergon Verlag, pp. 450-456. (Advances in Knowledge Organization, Vol. 8).

 

Hjørland, B. (2004). Arguments for Philosophical Realism in Library and Information Science. Library Trends, 52(3), 488-506.  http://www.db.dk/bh/Realism_Library%20Trends.pdf

 

Kessler, M. M. (1965). Comparison of the results of bibliographic coupling and analytic subject indexing. American Documentation, 16(3), 223-233.

 

Leydesdorff, L. (2006). Can scientific journals be classified in terms of aggregated journal-journal citation relations using the Journal Citation Reports? Journal of the American Society for Information Science and Technology, 57(5),  601-613.

 

Ostwald, W. (1919). Die chemische Literatur und die Organisation der Wissenschaft. In: W. Ostwald & C. Drucker (Hrsg.), Handbuch der allgemeinen Chemie, Bd. 1, p. 92-. Leipzig.

 

Pao, M. L. (1993). Term and Citation Retrieval: A Field Study. Information Processing & Management, 29(1), 95-112.

 

Pao, M. L. & Worthen, D. B. (1989). Retrieval effectiveness by semantic and pragmatic relevance. Journal of the American Society for Information Science, 40(4), 226-235.

 

Qin, J. (1999).  Discovering semantic patterns in bibliographically coupled documents. Library Trends, 48(1), 109-132.

 

Rees-Potter, L. K. (1989). Dynamic thesaural systems: a bibliometric study of terminological and conceptual change in sociology and economics with application to the design of dynamic thesaural systems. Information Processing & Management, 25(6), 677-691.

 

Rees-Potter, L. K. (1991). Dynamic thesauri: the cognitive function. Tools for knowledge organization and the human interface. Proceedings of the 1st International ISKO Conference, Darmstadt, 14-17 August 1990. Part 2, 1991, 145-150. 

 

Salton, G. (1971). Automatic indexing using bibliographic citations. Journal of Documentation, 27(2), 98-110.

 

Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24(4), 265-269.

 

Schneider, J. W. (2004). Verification of bibliometric methods' applicability for thesaurus construction. Aalborg: Royal School of Library and Information Science. (PhD-dissertation). Available at:  http://biblis.db.dk/archimages/199.pdf

 

Schneider, J. & Borlund, P. (2004). Introduction to bibliometrics for construction and maintenance of thesauri: methodical considerations. Journal of Documentation, 60(5), 524-549.

 

White, H. D., & McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972-1995. Journal of the American Society for Information Science, 49(4), 327-355.

 

Åström, F. (2002). Visualizing Library and Information Science concept spaces through keyword and citation based maps and clusters. In: Bruce, Fidel, Ingwersen & Vakkari (Eds.). Emerging frameworks and methods: Proceedings of the fourth international conference on conceptions of Library and Information Science (CoLIS4), pp. 185-197. Greenwood Village: Libraries Unlimited. Two figures: Bibliometric_MAP_LIS.PDF; Bibliometric_LIS_2.PDF

 

 

See also: Atlas of Science; Citation Indexing (Core Concepts in LIS); KeyWord Plus; Literary warrant (Core Concepts in LIS).

 

 

 

 

 

Birger Hjørland

Last edited: 28-02-2007


Questions:

 

  1. Discuss the problem of selecting journals for making a bibliometric map. (Maps are very vulnerable to the choice of journals.) Can journals be selected by objective criteria? If not, what kind of criteria should be used?

  2. Bibliometric mapping is based on an assumption of a semantic relation between citing and cited papers. However, semantic relations are relations between concepts, not papers. How does a paper become equivalent to a concept? (Help: A keyword represents a conceptualization of a paper, or a conceptualization of part of or an aspect of a conceptualization of a paper).

  3. Consider who is doing the citing. Should we expect leading scientific journals to have a citation pattern similar to that of, for example, textbooks? (Other kinds of publications, e.g. newspapers, are typically without citations. What is the implication for constructing and using bibliometric maps?)

  4. Schneider (2004) verified that bibliometric methods can, at least partially, produce the same terms as appear in a manually constructed thesaurus. However, a manually constructed thesaurus is based on the producer's selection, knowledge, etc. Discuss whether both methods may be in need of some kind of verification. Can they verify each other?