Prior Steps into Knowledge Mapping: Text Mining Application and Comparison
DOI:
https://doi.org/10.29173/istl2736
Abstract
Bibliometrics is increasingly used by the knowledge community and librarians to analyze patterns in knowledge. In practice, data exported from databases that provide bibliometric information is not always clean, so pre-processing is required. Several previous studies have shown that bibliometric analysis begins with a simple pre-processing step. The goal of this research is to use text mining during pre-processing to find the base terms of the keywords that appear, essentially constructing a controlled vocabulary for a bibliographic dataset. The method used in this study is to clean keywords by stemming them with RapidMiner software. Bibliometrix was used to compare the results. A total of 85 keywords were merged into their base terms. Using the resulting process, this study finds differences between the network built from raw data and the network built from pre-processed data, which in turn lead to differences in the resulting analysis. The process can also be reused in a variety of real-world situations.
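As a rough illustration of the idea of merging keyword variants into base terms, the following Python sketch groups raw keywords under a normalized key. It is not the RapidMiner process used in the study, and the suffix rules are a deliberately crude stand-in for a real stemmer such as Porter or Snowball. The function names (simple_stem, normalize_keyword, build_vocabulary) and the sample keywords are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def simple_stem(word: str) -> str:
    """Crude plural stripper for illustration only (NOT Porter/Snowball)."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"      # libraries -> library
    if word.endswith("sses"):
        return word[:-2]            # classes -> class
    if (word.endswith("s") and len(word) > 3
            and not word.endswith(("ss", "us", "is"))):
        return word[:-1]            # networks -> network
    return word                     # analysis, mining, ... stay unchanged


def normalize_keyword(keyword: str) -> str:
    """Lowercase, treat hyphens as spaces, and stem each token."""
    tokens = keyword.lower().replace("-", " ").split()
    return " ".join(simple_stem(token) for token in tokens)


def build_vocabulary(keywords):
    """Map each base term to the sorted raw variants that reduce to it."""
    groups = defaultdict(set)
    for keyword in keywords:
        groups[normalize_keyword(keyword)].add(keyword)
    return {term: sorted(variants) for term, variants in groups.items()}


if __name__ == "__main__":
    # Hypothetical author keywords, as they might appear in a raw export.
    raw = ["Bibliometrics", "bibliometric", "Text mining",
           "text-mining", "Libraries", "library"]
    for term, variants in build_vocabulary(raw).items():
        print(f"{term}: {variants}")
```

Mapping every variant to one base term before building a keyword network means that, for example, "Text mining" and "text-mining" count as a single node rather than two, which is the kind of difference between raw and pre-processed networks the abstract describes.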
License
Copyright (c) 2023 Faizhal Arif Santosa

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
While ISTL has always been open access and authors have always retained the copyright of their papers without restrictions, articles in issues prior to no. 75 were not licensed with Creative Commons licenses. Since issue no. 75 (Winter 2014), ISTL has licensed its work through Creative Commons licenses. Please refer to the Copyright and Licensing Information page for more information.