Optimized Feature Selection Approach for Semi-Supervised Sentiment Analysis of E-Commerce Feedback
- 1 School of Computer Science and Engineering, GIET University, Odisha, India
- 2 Department of Computer Science Engineering, Siksha O Anusandhan (Deemed to be) University, Odisha, India
- 3 School of Computer Application, KIIT (Deemed to be) University, Odisha, India
- 4 School of Computer Engineering, KIIT (Deemed to be) University, Odisha, India
Abstract
In this globalized world, people readily buy products online. To judge the quality of a product or brand, they usually examine its reviews, which is tedious to do manually. The wide use of social media also encourages users to share their views on products on a global platform. Machine learning techniques can help solve this product selection problem. In this study, we use sentiment analysis to analyze the reviews and select the best features. We apply the support vector machine and Naïve Bayes machine learning algorithms to the binary classification of reviews, that is, deciding whether a review is favorable or not (positive or negative). The problem with real-time review analysis is that not all of the reviews considered for analysis are labeled. We therefore use a semi-supervised machine learning technique to retrieve the missing information from e-commerce product reviews, for better information and improved accuracy. Additionally, we aim to address the issue of sentiment polarity categorization, boost productivity, and gain a deeper understanding of how sentiment analysis may be used to inform business decisions. This research can thus help consumers interpret product reviews and judge product quality based on the data, i.e., the reviews. The study is carried out with two popular semi-supervised methods, self-training and co-training, implemented on an e-commerce dataset. The results show that the optimized co-training model with support vector machine and Naïve Bayes classifiers performs better than the self-training model with a support vector machine classifier on a dataset that contains both labeled and unlabeled data.
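For readers unfamiliar with self-training, the idea can be illustrated with a minimal sketch. It is not the authors' implementation: the study uses a support vector machine for self-training, whereas this sketch swaps in a tiny multinomial Naïve Bayes classifier so that it runs with the Python standard library alone. The function names (`train_nb`, `predict_proba`, `self_train`), the confidence threshold, and the round limit are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized text."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    class_counts = Counter(labels)      # label -> number of documents
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return {label: posterior probability} for a single review."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # ignore tokens never seen in training
                # add-one (Laplace) smoothing
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.75, max_rounds=5):
    """Illustrative self-training loop (threshold and rounds are assumed values).

    Each round: train on the labeled set, pseudo-label the unlabeled reviews
    the model is confident about, move them to the labeled set, and repeat.
    """
    docs = [text for text, _ in labeled]
    labels = [label for _, label in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(docs, labels)
        remaining, added = [], 0
        for text in pool:
            probs = predict_proba(model, text)
            label, confidence = max(probs.items(), key=lambda kv: kv[1])
            if confidence >= threshold:
                docs.append(text)
                labels.append(label)
                added += 1
            else:
                remaining.append(text)
        pool = remaining
        if added == 0 or not pool:
            break  # nothing confident left to pseudo-label
    return train_nb(docs, labels), pool
```

Calling `self_train(labeled, unlabeled)` with a list of `(text, label)` pairs and a list of unlabeled review strings returns the final model together with any reviews that never reached the confidence threshold. Co-training follows the same loop but uses two classifiers that pseudo-label data for each other.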
DOI: https://doi.org/10.3844/jcssp.2025.363.379
Copyright: © 2025 Alok Kumar Jena, Kakita Murali Gopal, Abinash Tripathy and Nibedan Panda. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- E-Commerce Reviews
- Self-Training
- Co-Training
- Natural Language Processing
- Machine Learning
- Data-Driven Decisions