TY - GEN
T1 - Empirical Study of Tweets Topic Classification Using Transformer-Based Language Models
AU - Mandal, Ranju
AU - Chen, Jinyan
AU - Becken, Susanne
AU - Stantic, Bela
N1 - Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Social media opens up a great opportunity for policymakers to analyze and understand a large volume of online content for decision-making purposes. People’s opinions and experiences shared on social media platforms such as Twitter are extremely significant because of their volume, variety, and veracity. However, processing and retrieving useful information from natural language content is very challenging because of its ambiguity and complexity. Recent advances in Natural Language Understanding (NLU), more specifically Transformer-based architectures, solve sequence-to-sequence modeling tasks while handling long-range dependencies efficiently, and Transformer-based models are setting new performance benchmarks across a wide variety of NLU tasks. In this paper, we apply Transformer-based sequence modeling to the topic classification of short texts from tourist/user-posted tweets. Multiple BERT-like state-of-the-art sequence modeling approaches to topic/target classification are investigated on the Great Barrier Reef tweet dataset, and the findings obtained can be valuable for researchers working on classification with large datasets and a large number of target classes.
AB - Social media opens up a great opportunity for policymakers to analyze and understand a large volume of online content for decision-making purposes. People’s opinions and experiences shared on social media platforms such as Twitter are extremely significant because of their volume, variety, and veracity. However, processing and retrieving useful information from natural language content is very challenging because of its ambiguity and complexity. Recent advances in Natural Language Understanding (NLU), more specifically Transformer-based architectures, solve sequence-to-sequence modeling tasks while handling long-range dependencies efficiently, and Transformer-based models are setting new performance benchmarks across a wide variety of NLU tasks. In this paper, we apply Transformer-based sequence modeling to the topic classification of short texts from tourist/user-posted tweets. Multiple BERT-like state-of-the-art sequence modeling approaches to topic/target classification are investigated on the Great Barrier Reef tweet dataset, and the findings obtained can be valuable for researchers working on classification with large datasets and a large number of target classes.
KW - Deep learning
KW - Natural language processing
KW - Target classification
KW - Topic classification
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85104792796&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-73280-6_27
DO - 10.1007/978-3-030-73280-6_27
M3 - Conference contribution
AN - SCOPUS:85104792796
SN - 9783030732790
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 340
EP - 350
BT - Intelligent Information and Database Systems - 13th Asian Conference, ACIIDS 2021, Proceedings
A2 - Nguyen, Ngoc Thanh
A2 - Chittayasothorn, Suphamit
A2 - Niyato, Dusit
A2 - Trawiński, Bogdan
PB - Springer Science and Business Media Deutschland GmbH
T2 - 13th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2021
Y2 - 7 April 2021 through 10 April 2021
ER -