Investigating Document Type Discrepancies between OpenAlex and the Web of Science

Authors

DOI:

https://doi.org/10.29173/cais1943

Keywords:

Bibliometrics, Research evaluation, OpenAlex, Web of Science, Library and Information Science, Open data, Metadata, Document type

Abstract

Bibliometrics, whether used for research or research evaluation, relies on large multidisciplinary databases of research outputs and citation indices. The Web of Science (WoS) was the main supporting infrastructure of the field for more than 30 years, until several new competitors emerged. OpenAlex, launched in 2022, stands out for its openness and extensive coverage. While OpenAlex may reduce or eliminate barriers to accessing bibliometric data, one of the concerns hindering its broader adoption for research and research evaluation is the quality of its metadata. This study aims to assess the metadata quality of works in OpenAlex and WoS, focusing on document type accuracy. We observe that over 4% of the publications indexed in both OpenAlex and WoS appear to be misclassified as research articles or reviews, and that the vast majority (about 97%) of these errors occur in OpenAlex. By addressing discrepancies and misattributions in document types, this research seeks to enhance awareness of data quality issues that could impact bibliometric research and evaluation outcomes.


Enquête sur les divergences de types de documents entre OpenAlex et le Web of Science

Résumé
La bibliométrie, qu’elle soit utilisée pour la recherche ou pour l’évaluation de la recherche, repose sur de vastes bases de données multidisciplinaires regroupant des publications scientifiques et des indices de citation. Le Web of Science (WoS) a été la principale infrastructure du domaine pendant plus de 30 ans, jusqu’à l’émergence de plusieurs nouveaux concurrents. OpenAlex, lancé en 2022, se démarque par son ouverture et sa couverture étendue. Bien qu’OpenAlex puisse réduire, voire éliminer, les barrières d’accès aux données bibliométriques, l’une des préoccupations qui entravent son adoption plus large pour la recherche et l’évaluation de la recherche est la qualité de ses métadonnées. Cette étude a pour but d’évaluer la qualité des métadonnées des travaux dans OpenAlex et dans WoS, en se concentrant sur l’exactitude du type de document. Nous observons que plus de 4 % des publications indexées à la fois dans OpenAlex et dans WoS semblent être classées à tort comme articles de recherche ou articles de synthèse, et que la grande majorité (environ 97 %) de ces erreurs se trouvent dans OpenAlex. En relevant ces divergences et erreurs d’attribution dans les types de documents, cette recherche vise à sensibiliser aux problèmes de qualité des données susceptibles d’influencer les résultats de la recherche bibliométrique et de l’évaluation.

Mots-clés
Bibliométrie; évaluation de la recherche; OpenAlex; Web of Science; bibliothéconomie et sciences de l’information; données ouvertes; métadonnées; type de document


References

Alonso-Alvarez, P., & van Eck, N. J. (2024). Coverage and metadata availability of African publications in OpenAlex: A comparative analysis. arXiv. https://doi.org/10.48550/arXiv.2409.01120

Alperin, J. P., Portenoy, J., Demes, K., Larivière, V., & Haustein, S. (2024). An analysis of the suitability of OpenAlex for bibliometric analyses. arXiv. https://doi.org/10.48550/arXiv.2404.17663

Barcelona Declaration. (2024). Barcelona Declaration on Open Research Information. https://barcelona-declaration.org/

Bordignon, F. (2024). Is OpenAlex a revolution or a challenge for bibliometrics/bibliometricians? https://enpc.hal.science/hal-04520837

Céspedes, L., Kozlowski, D., Pradier, C., Sainte-Marie, M. H., Shokida, N. S., Benz, P., Poitras, C., Ninkov, A. B., Ebrahimy, S., Ayeni, P., Filali, S., Li, B., & Larivière, V. (2024). Evaluating the Linguistic Coverage of OpenAlex: An Assessment of Metadata Accuracy and Completeness. arXiv. https://doi.org/10.48550/arXiv.2409.10633

Culbert, J. H., Hobert, A., Jahn, N., Haupka, N., Schmidt, M., Donner, P., & Mayr, P. (2024). Reference coverage analysis of OpenAlex compared to Web of Science and Scopus. arXiv. https://doi.org/10.48550/arXiv.2401.16359

Delgado-Quirós, L., & Ortega, J. L. (2024). Completeness degree of publication metadata in eight free-access scholarly databases. Quantitative Science Studies, 5(1), 31–49. https://doi.org/10.1162/qss_a_00286

Garfield, E., & Sher, I. H. (1963). New factors in the evaluation of scientific literature through citation indexing. American Documentation, 14(3), 195–201. https://doi.org/10.1002/asi.5090140304

Haupka, N., Culbert, J. H., Schniedermann, A., Jahn, N., & Mayr, P. (2024). Analysis of the publication and document types in OpenAlex, Web of Science, Scopus, PubMed and Semantic Scholar. arXiv. https://doi.org/10.48550/arXiv.2406.15154

Maddi, A., Maisonobe, M., & Boukacem-Zeghmouri, C. (2024). Geographical and disciplinary coverage of open access journals: OpenAlex, Scopus and WoS. arXiv. https://doi.org/10.48550/arXiv.2411.03325

Mazoni, A., & Costas, R. (2024). Towards the democratisation of open research information for scientometrics and science policy: the Campinas experience. https://www.leidenmadtrics.nl/articles/towards-the-democratisation-of-open-research-information-for-scientometrics-and-science-policy-the-campinas-experience

Narin, F. (1976). Evaluative bibliometrics: The use of publication and citation analysis in the evaluation of scientific activity. Computer Horizons, Washington, D.C.

Ortega, J. L., & Delgado‐Quirós, L. (2024). The indexation of retracted literature in seven principal scholarly databases: A coverage comparison of Dimensions, OpenAlex, PubMed, Scilit, Scopus, The Lens and Web of Science. Scientometrics, 129(7), 3769–3785. https://doi.org/10.1007/s11192-024-05034-y

Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. arXiv. https://doi.org/10.48550/arXiv.2205.01833

Schares, E. (2024). Comparing Funder Metadata in OpenAlex and Dimensions. https://doi.org/10.31274/b8136f97.ccc3dae4

Scheidsteger, T., & Haunschild, R. (2023). Which of the metadata with relevance for bibliometrics are the same and which are different when switching from Microsoft Academic Graph to OpenAlex? Profesional de la información, 32(2). https://doi.org/10.3145/epi.2023.mar.09

Shi, J., Nason, M., Tullney, M., & Alperin, J. (2025). Identifying metadata quality issues across cultures. College & Research Libraries, 86(1). https://doi.org/10.5860/crl.86.1.101

Simard, M.-A., Basson, I., Hare, M., Larivière, V., & Mongeon, P. (2024). The open access coverage of OpenAlex, Scopus and Web of Science. arXiv. https://doi.org/10.48550/arXiv.2404.01985

van Eck, N. J., Waltman, L., & Neijssel, M. (2024, October 9). Launch of the CWTS Leiden Ranking Open Edition 2024. Leiden Madtrics. https://www.leidenmadtrics.nl/articles/launch-of-the-cwts-leiden-ranking-open-edition-2024

Visser, M., van Eck, N. J., & Waltman, L. (2021). Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. Quantitative Science Studies, 2(1), 20–41. https://doi.org/10.1162/qss_a_00112

Zhang, L., Cao, Z., Shang, Y., Sivertsen, G., & Huang, Y. (2024). Missing institutions in OpenAlex: Possible reasons, implications, and solutions. Scientometrics. https://doi.org/10.1007/s11192-023-04923-y

Published

2025-05-23

How to Cite

Mongeon, P., Hare, M., Krause, G., Marjoram, R., Riddle, P., Toupin, R., & Wilson, S. (2025). Investigating document type discrepancies between OpenAlex and the Web of Science. Proceedings of the Annual Conference of CAIS / Actes du congrès annuel de l’ACSI. https://doi.org/10.29173/cais1943

Issue

Section

Articles