Weak Correlation Between Circulation and Citation Numbers Suggests that both Data Points should be Considered when Deselecting Print Monographs
A Review of:
White, B. (2017). Citations and circulation counts: Data sources for monograph deselection in research library collections. College & Research Libraries, 78(1), 53–65. https://doi.org/10.5860/crl.78.1.53
Objective – To facilitate evidence-based deselection of print monographs, this study examines the extent to which past circulation correlates with future circulation, and the extent to which the borrowing of print monographs correlates with their citation.
Design – Collections assessment project that drew on circulation data, last-use dates, and citation data, analyzed statistically using Spearman’s rank correlation coefficient.
Setting – An academic library in New Zealand.
Subjects – Two ranges of books were chosen for the study: 591 (Specific Topics in Zoology) and 324 (The Political Process). From these ranges, monographs published prior to 2001 were selected as the study sample.
Methods – This project relied on two data sources: circulation data from the Library’s ILS and citation data from Scopus. All data was downloaded to an Excel spreadsheet in preparation for analysis. For each title, the researcher recorded the call number, authors and editors, title and subtitle, publication date, circulation count, date of last check-in, total number of citations, number of citations from publications released in 2010 or later, and number of citations from institution-affiliated documents. Renewal data was omitted, as it did not provide evidence of additional instances of use.
Where multiple copies of a specific title appeared in the data set, the researcher totalled all circulations and recorded the most recent check-in date. Some titles in the study sample were so generic that it was impossible to determine whether citation data from Scopus referred to the monograph in the library collection. These titles were eliminated from the study.
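The consolidation of multiple copies described above can be sketched as follows. This is a minimal illustration only: the original analysis was carried out in Excel, and the record layout, the function name `consolidate`, and the sample values here are hypothetical.

```python
from datetime import date

# Hypothetical copy-level rows: (title, circulation count, last check-in date or None).
# None stands for a copy that has never been checked in.
copies = [
    ("Title A", 3, date(2009, 5, 1)),
    ("Title A", 2, date(2012, 11, 20)),
    ("Title B", 0, None),
]

def consolidate(copies):
    """Merge copy-level rows into one row per title.

    Circulations are summed across copies and the most recent
    check-in date is kept, as the Methods section describes.
    """
    titles = {}
    for title, circulations, last_check_in in copies:
        row = titles.setdefault(title, {"circulations": 0, "last_check_in": None})
        row["circulations"] += circulations
        if last_check_in is not None and (
            row["last_check_in"] is None or last_check_in > row["last_check_in"]
        ):
            row["last_check_in"] = last_check_in
    return titles

by_title = consolidate(copies)
```

Keeping a missing check-in date as `None` rather than substituting a placeholder date keeps never-borrowed titles distinguishable in later steps.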
Once data collection was complete, the researcher calculated two additional data elements: the number of months since the last check-in date and the number of citations from items published before 2010. Data in the Excel spreadsheet was analyzed using Spearman’s rank correlation coefficient to determine the relationship between past and future usage and between circulation and citation data.
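The derived fields and the correlation step can be illustrated with a short sketch. It assumes that pre-2010 citations are obtained by subtracting the 2010-onward count from the total, which the summary does not state explicitly, and it computes Spearman’s rho as the Pearson correlation of tie-averaged ranks. All function names and sample values are hypothetical and are not taken from the study.

```python
from datetime import date
from math import sqrt

def months_since(last_check_in, as_of):
    """Whole calendar months between the last check-in and the analysis date."""
    return (as_of.year - last_check_in.year) * 12 + (as_of.month - last_check_in.month)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-title values, for illustration only.
total_citations = [0, 12, 3, 0, 7]
citations_2010_on = [0, 5, 3, 0, 1]
# Assumed derivation: pre-2010 citations = total minus 2010-onward citations.
citations_pre_2010 = [t - r for t, r in zip(total_citations, citations_2010_on)]
circulations = [1, 9, 0, 0, 4]

rho = spearman_rho(circulations, total_citations)
```

A rank-based coefficient suits this kind of data because circulation and citation counts are heavily skewed and contain many ties at zero; averaging the ranks of tied values keeps those ties from distorting the result.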
Main Results – Findings indicated that circulation and citation data are highly skewed. Many monographs in the study sample had never been borrowed and had few citations, while a small number of “celebrity titles” were borrowed or cited at a much higher rate than other monographs in the same classification.
Further, results indicated that historic circulation numbers are imperfect predictors of the probability that a book will be borrowed in the future. At the collection level, books that circulated heavily in the past tend to be borrowed more often than average. At the title level, however, high past circulation makes future borrowing more probable but is not a robust indicator of it.
An investigation of whether historic citation counts serve as an indicator of future citation confirmed previously established trends: monographs not heavily cited in the past are less likely to be cited in the future. The study also found a weak correlation between local-institution monograph citation counts and total citation counts.
Finally, the results demonstrated a weak correlation between circulation and citation data. As a group, well-cited books are borrowed more often than others, but at the individual title level, the relationship is too variable for either data set to predict the other reliably. As such, circulation data and citation data cannot be used as proxies for each other.
Conclusion – Neither circulation nor citation data can stand as a full proxy for the value of a title. However, both provide information that reflects the status of a title within the scholarly community. For this reason, citation data should be given equal weight with circulation figures. The two data points measure different phenomena, and the weak correlation between them suggests that both are required to inform decisions about deselecting print monographs.
Copyright (c) 2019 Melissa Goertzen
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The Creative Commons-Attribution-Noncommercial-Share Alike License 4.0 International applies to all works published by Evidence Based Library and Information Practice. Authors will retain copyright of the work.