Scalable and Advanced Framework for Hate Speech Detection on Social Media Using BERT and GPT-2 through Encoder and Decoder Architectures
- 1 Department of Computer Science, Jamia Millia Islamia, New Delhi, India
- 2 Department of Artificial Intelligence and Machine Learning, New Delhi Institute of Management, New Delhi, India
Abstract
Hate speech is a major problem on social media platforms. Every day, numerous instances of hateful behavior based on race, ethnicity, religion, or gender are witnessed on social media. Most leading platforms, such as Instagram, Facebook, Twitter, and Reddit, have strong community guidelines that condemn and restrict the exchange of hateful language or content in any form. Despite these guidelines, some instances go unnoticed because of the subtlety of the language and expression, which motivates the need for robust automated hate speech detection techniques that can flag such content and ensure a safer environment for users from all walks of life. The transformer model is built from encoder and decoder blocks: a model that uses only the encoder stack is known as Bidirectional Encoder Representations from Transformers (BERT), while a model that uses only the decoder stack is known as the Generative Pre-trained Transformer (GPT). In this study, we propose a method that uses a pre-trained BERT model for hate speech detection on Twitter data. The dataset contains tweets belonging to three classes: hate speech (0), offensive language (1), and neither (2). We evaluated the proposed model on this dataset both without data augmentation and with data augmentation using Generative Pre-trained Transformer-2 (GPT-2). The results show that augmenting the data with GPT-2 enhances the performance of the BERT model, which achieves 81% accuracy, compared with training on the un-augmented data.
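A minimal sketch of the pipeline the abstract describes is shown below, assuming the Hugging Face `transformers` library; it is not the authors' released code, and the model checkpoints, prompts, and hyperparameters are illustrative assumptions. GPT-2 generates additional tweet-like examples from seed tweets, and a pre-trained BERT model is fine-tuned (head shown here) to label tweets as hate speech (0), offensive language (1), or neither (2).

```python
# Illustrative sketch only: checkpoints ("gpt2", "bert-base-uncased") and
# generation settings are assumptions, not details reported in the paper.
import torch
from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer,
    BertTokenizer, BertForSequenceClassification,
)

# --- Data augmentation with GPT-2 (assumed prompt-based continuation) ---
gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

def augment(seed_tweet: str, n: int = 2) -> list[str]:
    """Generate n synthetic tweet variants conditioned on a seed tweet."""
    ids = gpt2_tok(seed_tweet, return_tensors="pt").input_ids
    out = gpt2.generate(
        ids, do_sample=True, top_p=0.95, max_length=48,
        num_return_sequences=n, pad_token_id=gpt2_tok.eos_token_id,
    )
    return [gpt2_tok.decode(o, skip_special_tokens=True) for o in out]

# --- Three-class BERT classifier (classification head trained on the tweets) ---
bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3  # 0: hate speech, 1: offensive, 2: neither
)

def classify(tweet: str) -> int:
    """Return the predicted class index for a single tweet."""
    enc = bert_tok(tweet, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = bert(**enc).logits
    return int(logits.argmax(dim=-1))
```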
DOI: https://doi.org/10.3844/jcssp.2025.584.594
Copyright: © 2025 Usman, Nabeela Hasan and Syed Mohammad Khurshid Quadri. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- BERT
- GPT-2
- Hate Speech
- Machine Learning
- Social Media