Enhancing Sentiment Analysis for Malayalam With mBERT: A Profoundly Transparent and Accurate Approach Using LIME

Anitha R.; K. S. Anil Kumar; Rajeev R. R.; Ansil Shafee; Manju G.; Reshmi L. B.

doi:10.3844/jcssp.2026.1666.1678

Research Article Open Access

Anitha R.¹, K. S. Anil Kumar², Rajeev R. R.³, Ansil Shafee¹, Manju G.⁴ and Reshmi L. B.¹

¹ Department of Futures Studies, University of Kerala, Thiruvananthapuram, India
² KSMDB College, Sasthamcotta, Kollam, India
³ ICFOSS, Thiruvananthapuram, India
⁴ Department of Computer Science, P. M. Govt. College, Chalakudy, India

Abstract

Overlapping sentiment boundaries, intensifiers, and intricate morphological structures make sentiment analysis in Malayalam particularly difficult, and traditional machine learning techniques struggle to produce consistent results. This paper presents an explainable sentiment analysis framework that fine-tunes a Multilingual Bidirectional Encoder Representations from Transformers (mBERT) model on a novel, manually curated constituency-level dataset annotated under two schemes: five-class (very positive, positive, neutral, negative, and very negative) and three-class (positive, neutral, and negative). In contrast to previous research that focuses solely on accuracy, our method incorporates Local Interpretable Model-Agnostic Explanations (LIME) to identify the linguistic cues that most strongly influence sentiment prediction in Malayalam, including intensifiers, negations, and context-dependent modifiers. Despite the inherent linguistic complexity, the proposed model performed consistently across both settings, achieving 61.78% precision for three-class classification and 61.47% for five-class classification. More significantly, the LIME-based interpretability analysis highlights how Malayalam-specific features affect classification results, providing a clear and linguistically grounded benchmark for low-resource sentiment analysis. To the best of our knowledge, this is the first explainable, transformer-based sentiment analysis framework for Malayalam that combines BERT with LIME and is underpinned by a curated constituency-level dataset. The work also lays the groundwork for further studies on interpretable deep learning in underrepresented languages and sets a new standard for NLP in low-resource languages in terms of both performance and explainability.
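
The abstract does not describe how the LIME step is implemented, so the following is only a minimal, dependency-free sketch of the general idea behind LIME for text: randomly drop tokens, query the classifier on each perturbed input, and fit a locally weighted linear surrogate whose coefficients indicate each token's influence. Everything here is an assumption made for illustration: `toy_positive_prob` is a hypothetical keyword scorer standing in for the fine-tuned mBERT model, the English tokens are placeholders for Malayalam input, and the kernel width, sample count, and ridge penalty are arbitrary. It is not the `lime` library and not the authors' pipeline.

```python
import math
import random


def toy_positive_prob(tokens):
    """Hypothetical stand-in for a fine-tuned classifier: returns P(positive)."""
    scores = {"good": 2.0, "very": 0.5, "not": -3.0}  # illustrative values only
    z = sum(scores.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lime_style_weights(tokens, predict, n_samples=500, kernel_width=0.75,
                       ridge=1e-3, seed=0):
    """Fit a locally weighted linear surrogate over token-presence masks."""
    rng = random.Random(seed)
    d = len(tokens)
    rows, targets, weights = [], [], []
    for i in range(n_samples):
        # The first sample is the unperturbed input; the rest drop random tokens.
        mask = [1] * d if i == 0 else [rng.randint(0, 1) for _ in range(d)]
        kept = [t for t, keep in zip(tokens, mask) if keep]
        dist = 1.0 - sum(mask) / d  # fraction of tokens removed
        weights.append(math.exp(-(dist ** 2) / kernel_width ** 2))
        rows.append(mask + [1])  # trailing 1 is the intercept column
        targets.append(predict(kept))
    # Weighted ridge regression: (X^T W X + ridge * I) beta = X^T W y
    p = d + 1
    a = [[ridge if j == k else 0.0 for k in range(p)] for j in range(p)]
    b = [0.0] * p
    for x, y, w in zip(rows, targets, weights):
        for j in range(p):
            b[j] += w * x[j] * y
            for k in range(p):
                a[j][k] += w * x[j] * x[k]
    beta = solve(a, b)
    return list(zip(tokens, beta[:d]))  # drop the intercept


if __name__ == "__main__":
    for token, weight in lime_style_weights(["not", "very", "good"],
                                            toy_positive_prob):
        print(f"{token}: {weight:+.3f}")
```

In this toy setting the negation token should receive a negative coefficient and the positive adjective a positive one, which is the kind of cue-level attribution the abstract refers to for intensifiers, negations, and modifiers. In practice the surrogate would be fitted against the real model's class probabilities rather than a keyword scorer.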

Journal of Computer Science

Volume 22 No. 5, 2026, 1666-1678

DOI: https://doi.org/10.3844/jcssp.2026.1666.1678

Submitted On: 8 July 2025 Published On: 2 June 2026

How to Cite: R., A., Kumar, K. S. A., R., R. R., Shafee, A., G., M. & B., R. L. (2026). Enhancing Sentiment Analysis for Malayalam With mBERT: A Profoundly Transparent and Accurate Approach Using LIME. Journal of Computer Science, 22(5), 1666-1678. https://doi.org/10.3844/jcssp.2026.1666.1678

Copyright: © 2026 Anitha R., K. S. Anil Kumar, Rajeev R. R., Ansil Shafee, Manju G. and Reshmi L. B. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Sentiment Analysis
Malayalam
BERT
Explainable AI
LIME
NLP