Open Access Article

Comparative Analysis of Transformer Based Pre-Trained NLP Models

Saurav Singla, Ramachandra N.

Section:Research Paper, Product Type: Journal Paper
Volume-8 , Issue-11 , Page no. 40-44, Nov-2020

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v8i11.4044

Online published on Nov 30, 2020

Copyright © Saurav Singla, Ramachandra N. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: Saurav Singla, Ramachandra N., “Comparative Analysis of Transformer Based Pre-Trained NLP Models,” International Journal of Computer Sciences and Engineering, Vol.8, Issue.11, pp.40-44, 2020.

MLA Style Citation: Saurav Singla, Ramachandra N. "Comparative Analysis of Transformer Based Pre-Trained NLP Models." International Journal of Computer Sciences and Engineering 8.11 (2020): 40-44.

APA Style Citation: Saurav Singla, Ramachandra N. (2020). Comparative Analysis of Transformer Based Pre-Trained NLP Models. International Journal of Computer Sciences and Engineering, 8(11), 40-44.

BibTex Style Citation:
@article{Singla_2020,
author = {Saurav Singla and Ramachandra N.},
title = {Comparative Analysis of Transformer Based Pre-Trained NLP Models},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {11 2020},
volume = {8},
Issue = {11},
month = {11},
year = {2020},
issn = {2347-2693},
pages = {40-44},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=5259},
doi = {https://doi.org/10.26438/ijcse/v8i11.4044},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v8i11.4044
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=5259
TI - Comparative Analysis of Transformer Based Pre-Trained NLP Models
T2 - International Journal of Computer Sciences and Engineering
AU - Saurav Singla
AU - Ramachandra N.
PY - 2020
DA - 2020/11/30
PB - IJCSE, Indore, INDIA
SP - 40
EP - 44
IS - 11
VL - 8
SN - 2347-2693
ER -


Abstract

Transformer-based, self-supervised pre-trained models have transformed transfer learning in natural language processing (NLP) with deep learning. The self-attention mechanism has made transformers especially effective for transfer learning across a broad range of NLP tasks. Among these tasks, sentiment analysis identifies people's opinions towards a topic, product, or service. In this project we analyse the performance of self-supervised models for multi-class sentiment analysis on a non-benchmark dataset, using BERT, RoBERTa, and ALBERT. These models differ in design but share the same objective: leveraging a huge amount of text data to build a general language-understanding model. We fine-tuned each model for sentiment analysis with a proposed architecture and evaluated performance using the f1-score and the AUC (area under the ROC curve). The BERT model with the proposed architecture performed best, with the highest f1-score of 0.85, followed by RoBERTa (f1-score = 0.80) and ALBERT (f1-score = 0.78). This analysis indicates that the BERT model with the proposed architecture is best suited for multi-class sentiment analysis on a non-benchmark dataset.
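
The paper's exact proposed architecture and dataset are not reproduced on this page. As a rough illustration of the workflow the abstract describes (fine-tuning a pre-trained transformer for multi-class sentiment analysis and scoring it with f1 and AUC), the following is a minimal sketch assuming the Hugging Face transformers library, PyTorch, and scikit-learn. The model name, three-class label set, placeholder data, and hyperparameters are illustrative assumptions, not the authors' configuration.

# Minimal sketch: fine-tune a pre-trained transformer for multi-class sentiment
# analysis and report macro f1 and one-vs-rest AUC. Model name, labels, data,
# and hyperparameters are illustrative assumptions, not the paper's setup.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from sklearn.metrics import f1_score, roc_auc_score

MODEL_NAME = "bert-base-uncased"   # swap for "roberta-base" or "albert-base-v2"
NUM_LABELS = 3                     # e.g. negative / neutral / positive

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LABELS
)

def encode(texts, labels):
    """Tokenize raw texts and bundle them with labels in a TensorDataset."""
    enc = tokenizer(texts, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")
    return TensorDataset(enc["input_ids"], enc["attention_mask"],
                         torch.tensor(labels))

# Placeholder data; in practice, load the sentiment dataset here.
train_ds = encode(["great product", "terrible service", "it was okay"], [2, 0, 1])
test_ds = encode(["not bad at all", "worst purchase ever", "it does the job"], [2, 0, 1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):                       # illustrative epoch count
    for input_ids, attention_mask, labels in DataLoader(train_ds, batch_size=16):
        out = model(input_ids=input_ids, attention_mask=attention_mask,
                    labels=labels)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Evaluation: macro f1 on predicted classes, one-vs-rest AUC on class probabilities.
model.eval()
probs, preds, gold = [], [], []
with torch.no_grad():
    for input_ids, attention_mask, labels in DataLoader(test_ds, batch_size=16):
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
        p = torch.softmax(logits, dim=-1)
        probs.extend(p.tolist())
        preds.extend(p.argmax(dim=-1).tolist())
        gold.extend(labels.tolist())

print("f1 (macro):", f1_score(gold, preds, average="macro"))
print("AUC (OvR): ", roc_auc_score(gold, probs, multi_class="ovr"))

Replacing MODEL_NAME with "roberta-base" or "albert-base-v2" runs the same loop for the other two models compared in the study.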

Key-Words / Index Term

NLP, Transfer learning, Sentiment analysis, BERT, RoBERTa, ALBERT

References

[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, “Attention is all you need”, 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, pp.5998-6008, 2017.
[2] Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding”, In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp.4171-4186, 2019.
[3] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov, “RoBERTa: A robustly optimized BERT pretraining approach”, 2019, arXiv preprint arXiv:1907.11692.
[4] Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut, “ALBERT: A lite BERT for self-supervised learning of language representations”, 2019, arXiv preprint arXiv:1909.11942.
[5] Matthias Aßenmacher, Christian Heumann, “On the comparability of pre-trained language models”, CEUR Workshop Proceedings, Vol.2624.
[6] Cristóbal Colón-Ruiz, Isabel Segura-Bedmar, “Comparing deep learning architectures for sentiment analysis on drug reviews", Journal of Biomedical Informatics, Volume 110, 2020, 103539, ISSN 1532-0464.
[7] Thanapapas Horsuwan, Kasidis Kanwatchara, Peerapon Vateekul, Boonserm Kijsirikul, "A Comparative Study of Pretrained Language Models on Thai Social Text Categorization", 2019, arXiv:1912.01580v1.
[8] Carlos Aspillaga, Andres Carvallo, Vladimir Araujo,"Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks", Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), European Language Resources Association (ELRA), Marseille, pp. 1882–1894, 2020.
[9] Vishal Shirsat, Rajkumar Jagdale, Kanchan Shende, Sachin N. Deshmukh, Sunil Kawale, “Sentence Level Sentiment Analysis from News Articles and Blogs using Machine Learning Techniques”, International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, 2019.
[10] Avinash Kumar, Savita Sharma, Dinesh Singh, "Sentiment Analysis on Twitter Data using a Hybrid Approach", International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, 2019.