Research Article Open Access

KSWN (Kannada SentiWordNet): Developing a Sentiment Lexicon for Kannada using Translation and Word Embedding Techniques

Rashmi Kariganuru Bheemarao1, Hassan Sadashiva Guruprasad1 and Shambhavi Bangalore Ravi2
  • 1 Department of Information Science and Engineering, B.M.S. College of Engineering, Bangalore, Visvesvaraya Technological University, Belagavi, Karnataka, India
  • 2 Department of Computer Science and Engineering (DS), B.M.S. College of Engineering, Bangalore, Visvesvaraya Technological University, Belagavi, Karnataka, India

Abstract

Opinion mining has gained significant attention in recent years, driven largely by the enormous growth of online content. However, identifying the opinions expressed in comments and reviews remains highly challenging for Indian regional languages because annotated datasets are scarce. Opinion mining has predominantly been conducted in English, with recent efforts extending to Hindi and other languages. A primary resource in opinion mining is SentiWordNet, which aids in analyzing opinions by providing sentiment scores for words. No comparable resource exists for Kannada, so this study proposes KSWN (Kannada SentiWordNet), a Kannada sentiment lexicon built by translating several English sentiment lexicons. Two Kannada annotators verified the translated lexicon, with inter-annotator agreement (Cohen's Kappa) of 0.84 for positive words and 0.79 for negative words. Because the initial Kannada SentiWordNet may not cover all sentiment-bearing words, word embeddings are employed to capture semantic similarity. The initial lexicon serves as a seed list for tagging a new corpus: words in the new corpus are annotated by matching them against the seed list. New words with similar sentiment profiles are identified by applying similarity measures to the embedded word representations. These newly identified words are then added to the lexicon, further enriching it for sentiment analysis tasks.
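
The embedding-based expansion step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, the threshold, or the embedding model, so cosine similarity, the cutoff of 0.9, the function name expand_lexicon, and the toy two-dimensional vectors with placeholder tokens (w_good, w_bad, and so on) are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_lexicon(seed, embeddings, threshold=0.9):
    """Propose labels for words that are not yet in the seed lexicon.

    A word inherits the label of its most similar seed word, but only if
    that similarity reaches `threshold` (an assumed cutoff).
    """
    additions = {}
    for word, vec in embeddings.items():
        if word in seed:
            continue  # already annotated by direct match with the seed list
        best_label, best_sim = None, threshold
        for seed_word, label in seed.items():
            seed_vec = embeddings.get(seed_word)
            if seed_vec is None:
                continue  # seed word has no embedding in this corpus
            sim = cosine(vec, seed_vec)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label is not None:
            additions[word] = best_label
    return additions


# Hypothetical toy data: placeholder tokens stand in for Kannada words.
SEED = {"w_good": "positive", "w_bad": "negative"}
EMBEDDINGS = {
    "w_good": [0.9, 0.1],
    "w_bad": [0.1, 0.9],
    "w_great": [0.8, 0.2],
    "w_awful": [0.2, 0.8],
    "w_table": [0.7, 0.7],  # equally close to both seeds, below the cutoff
}

new_entries = expand_lexicon(SEED, EMBEDDINGS)
print(new_entries)
```

In this toy setup, words close to a seed word inherit its polarity, while a word that is not close enough to any seed word is left out. In practice the cutoff would need tuning, and proposed additions would likely still need review by annotators.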

Journal of Computer Science
Volume 21 No. 6, 2025, 1482-1489

DOI: https://doi.org/10.3844/jcssp.2025.1482.1489

Submitted On: 13 December 2024
Published On: 4 July 2025

How to Cite: Bheemarao, R. K., Guruprasad, H. S. & Ravi, S. B. (2025). KSWN (Kannada SentiWordNet): Developing a Sentiment Lexicon for Kannada using Translation and Word Embedding Techniques. Journal of Computer Science, 21(6), 1482-1489. https://doi.org/10.3844/jcssp.2025.1482.1489

Keywords

  • Natural Language Processing
  • SentiWordNet
  • Word Embeddings
  • Kannada Language
  • Sentiment Analysis
  • Lexicon Development