Peer-Reviewed

Assisting Access to COVID-19 Information Through Deep Learning Based Machine Translation: Attention Mechanism Via Bidirectional GRU

Received: 13 September 2021     Accepted: 29 September 2021     Published: 12 October 2021
Abstract

The recent COVID-19 crisis has increased the need for effective international communication and information sharing across many fields. One obstacle to meeting this need is language. Existing machine translation systems are trained on general-context data, so they work well on everyday text but are often inaccurate on texts such as COVID-19 related research, while manual translation is slow and laborious. As a result, the exchange of information is delayed. To overcome this language barrier, this project aimed to create a model that is effective specifically for translating COVID-19 crisis related data. Two models were created: the first was trained only on the TAUS English-French Corona Crisis Corpus, and the second used transfer learning, being trained first on the Kaggle English-French corpus and then on the TAUS corpus. Each model consisted of four bidirectional GRU layers and used RMSprop as the optimizer. The models were evaluated using the BLEU score. The first model achieved a higher BLEU score than the second, supporting the hypothesis that loosely related datasets decrease the quality of translation. Further research will evaluate this model on different language pairs and use datasets from other specialized fields.
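For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a sentence-level BLEU score can be computed. It is not the author's evaluation code: the abstract does not state the n-gram order, smoothing, or tokenization used, so the choices here (a single reference, uniform weights up to 4-grams, no smoothing, whitespace tokens) are assumptions, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU: one reference, uniform weights, no smoothing.

    These settings are illustrative assumptions, not the paper's setup.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, any zero precision makes BLEU zero.
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


# Made-up example sentences, tokenized on whitespace.
reference = "the vaccine reduces the risk of severe disease".split()
truncated = reference[:6]  # every n-gram matches, but the output is too short

score_exact = sentence_bleu(reference, reference)
score_short = sentence_bleu(reference, truncated)
print(score_exact, score_short)
```

In this sketch a perfect match scores highest, and the truncated candidate is reduced only by the brevity penalty because all of its n-grams appear in the reference. Corpus-level BLEU, which papers usually report, aggregates n-gram counts over all sentences before taking the precisions rather than averaging sentence scores.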

Published in American Journal of Data Mining and Knowledge Discovery (Volume 6, Issue 1)
DOI 10.11648/j.ajdmkd.20210601.12
Page(s) 9-15
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2021. Published by Science Publishing Group

Keywords

Seq2seq, COVID-19, Bi-GRU, Machine Translation, NLP

Cite This Article
  • APA Style

    Daniel Chang. (2021). Assisting Access to COVID-19 Information Through Deep Learning Based Machine Translation: Attention Mechanism Via Bidirectional GRU. American Journal of Data Mining and Knowledge Discovery, 6(1), 9-15. https://doi.org/10.11648/j.ajdmkd.20210601.12


    ACS Style

    Daniel Chang. Assisting Access to COVID-19 Information Through Deep Learning Based Machine Translation: Attention Mechanism Via Bidirectional GRU. Am. J. Data Min. Knowl. Discov. 2021, 6(1), 9-15. doi: 10.11648/j.ajdmkd.20210601.12


    AMA Style

    Daniel Chang. Assisting Access to COVID-19 Information Through Deep Learning Based Machine Translation: Attention Mechanism Via Bidirectional GRU. Am J Data Min Knowl Discov. 2021;6(1):9-15. doi: 10.11648/j.ajdmkd.20210601.12


  • @article{10.11648/j.ajdmkd.20210601.12,
      author = {Daniel Chang},
      title = {Assisting Access to COVID-19 Information Through Deep Learning Based Machine Translation: Attention Mechanism Via Bidirectional GRU},
      journal = {American Journal of Data Mining and Knowledge Discovery},
      volume = {6},
      number = {1},
      pages = {9-15},
      doi = {10.11648/j.ajdmkd.20210601.12},
      url = {https://doi.org/10.11648/j.ajdmkd.20210601.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajdmkd.20210601.12},
      year = {2021}
    }

  • TY  - JOUR
    T1  - Assisting Access to COVID-19 Information Through Deep Learning Based Machine Translation: Attention Mechanism Via Bidirectional GRU
    AU  - Daniel Chang
    Y1  - 2021/10/12
    PY  - 2021
    N1  - https://doi.org/10.11648/j.ajdmkd.20210601.12
    DO  - 10.11648/j.ajdmkd.20210601.12
    T2  - American Journal of Data Mining and Knowledge Discovery
    JF  - American Journal of Data Mining and Knowledge Discovery
    JO  - American Journal of Data Mining and Knowledge Discovery
    SP  - 9
    EP  - 15
    PB  - Science Publishing Group
    SN  - 2578-7837
    UR  - https://doi.org/10.11648/j.ajdmkd.20210601.12
    VL  - 6
    IS  - 1
    ER  - 


Author Information
  • North London Collegiate School Jeju, Jeju, Korea
