Peer-Reviewed

Automatic Classification of Computing Literatures via Article and Reference Correlation

Received: 17 September 2022     Accepted: 29 September 2022     Published: 21 October 2022
Abstract

Automatic literature classification via machine learning has attracted increasing attention in various research circles, especially the computing community, because of the availability of a large body of research articles in diverse fields. Existing works have largely drawn features from segments of articles, such as abstracts, contents and metadata, with little or no attention to references. This paper posits that correlating article and reference features would enhance the performance of machine learning algorithms. We therefore exploited the correlation between the TF-IDF of articles and the TF-IDF of their references, using association rule-based and cosine similarity-based correlation methods, to classify computing literature. We focused on the Adekunle Ajasin University Research Repository. The research articles in the repository were labelled by experienced computing professionals according to the ACM and Denning taxonomies. Logistic Regression, Support Vector Machine and Multilayer Perceptron Neural Network classifiers with N-gram features were explored. For the ACM taxonomy, the highest accuracy and F1-score were 0.56 and 0.41, respectively, for association rule-based correlation; 0.62 and 0.51 for similarity-based correlation; and 0.59 and 0.46 for the existing article-based classification. For Denning’s taxonomy, the highest accuracy and F1-score were 0.41 and 0.40, respectively, for association rule-based correlation; 0.41 and 0.36 for similarity-based correlation; and 0.38 and 0.37 for the existing article-based classification. These results suggest that both correlation methods have better prospects than the popular abstract-based classification method for the automatic classification of computing literature.
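To make the similarity-based correlation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each article and its reference list are available as plain text, builds TF-IDF vectors over a shared vocabulary, and takes the cosine similarity between an article and its own references as a correlation feature. The function names, the whitespace tokenizer, the smoothed IDF formula and the toy texts are all illustrative assumptions; the abstract does not state how the resulting values are combined with N-gram features or passed to the classifiers.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document.

    Assumption: raw term frequency and the smoothed IDF
    ln((1 + N) / (1 + df)) + 1; the paper's exact weighting is not given.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}
    return [{term: tf * idf[term] for term, tf in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def article_reference_similarity(article_texts, reference_texts):
    """Cosine similarity between each article and its own reference text."""
    # Fit TF-IDF on articles and references together so that both sides
    # share one vocabulary and one set of IDF weights.
    vectors = tfidf_vectors(list(article_texts) + list(reference_texts))
    n = len(article_texts)
    return [cosine(vectors[i], vectors[n + i]) for i in range(n)]


# Toy data (hypothetical, for illustration only).
articles = [
    "neural network image classification",
    "database query optimization index",
]
references = [
    "deep neural network training image recognition",
    "compiler design parsing grammar",
]
sims = article_reference_similarity(articles, references)
```

In this toy example the first article shares terms with its references and receives a positive similarity, whereas the second shares none and receives zero. Such a value could be appended to an article's feature vector before training a classifier, although that particular combination is an assumption of this sketch.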

Published in American Journal of Computer Science and Technology (Volume 5, Issue 4)
DOI 10.11648/j.ajcst.20220504.12
Page(s) 204-209
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2022. Published by Science Publishing Group

Keywords

Computing, Research Articles, Machine Learning, Classification, Reference Features

Cite This Article
    APA Style

    Oluwafemi Oriola, Lawrence Ojo, Ojonoka Atawodi. (2022). Automatic Classification of Computing Literatures via Article and Reference Correlation. American Journal of Computer Science and Technology, 5(4), 204-209. https://doi.org/10.11648/j.ajcst.20220504.12

    ACS Style

    Oluwafemi Oriola; Lawrence Ojo; Ojonoka Atawodi. Automatic Classification of Computing Literatures via Article and Reference Correlation. Am. J. Comput. Sci. Technol. 2022, 5(4), 204-209. doi: 10.11648/j.ajcst.20220504.12

    AMA Style

    Oluwafemi Oriola, Lawrence Ojo, Ojonoka Atawodi. Automatic Classification of Computing Literatures via Article and Reference Correlation. Am J Comput Sci Technol. 2022;5(4):204-209. doi: 10.11648/j.ajcst.20220504.12

    @article{10.11648/j.ajcst.20220504.12,
      author = {Oluwafemi Oriola and Lawrence Ojo and Ojonoka Atawodi},
      title = {Automatic Classification of Computing Literatures via Article and Reference Correlation},
      journal = {American Journal of Computer Science and Technology},
      volume = {5},
      number = {4},
      pages = {204-209},
      doi = {10.11648/j.ajcst.20220504.12},
      url = {https://doi.org/10.11648/j.ajcst.20220504.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajcst.20220504.12},
     year = {2022}
    }
    


    TY  - JOUR
    T1  - Automatic Classification of Computing Literatures via Article and Reference Correlation
    AU  - Oluwafemi Oriola
    AU  - Lawrence Ojo
    AU  - Ojonoka Atawodi
    Y1  - 2022/10/21
    PY  - 2022
    N1  - https://doi.org/10.11648/j.ajcst.20220504.12
    DO  - 10.11648/j.ajcst.20220504.12
    T2  - American Journal of Computer Science and Technology
    JF  - American Journal of Computer Science and Technology
    JO  - American Journal of Computer Science and Technology
    SP  - 204
    EP  - 209
    PB  - Science Publishing Group
    SN  - 2640-012X
    UR  - https://doi.org/10.11648/j.ajcst.20220504.12
    VL  - 5
    IS  - 4
    ER  - 


Author Information
  • Department of Computer Science, Adekunle Ajasin University, Akungba-Akoko, Nigeria

  • Department of Computer Science, Adekunle Ajasin University, Akungba-Akoko, Nigeria

  • School of Computing, University of Southern Mississippi, Hattiesburg, US
