A Study on Emotion Analysis and Music Recommendation Using Transfer Learning
- 1 Department of Computer Science, Symbiosis Centre for Information Technology, Pune, India
- 2 Department of Data Science and Data Analytics, Symbiosis Centre for Information Technology, India
Abstract
As more and more people access and consume music through streaming platforms and digital services, music recommendation has grown in importance within the music industry. Given the abundance of music available, recommendation algorithms are essential for guiding users toward new music and for creating individualized listening experiences. People frequently seek out music that matches their current or desired emotional state, so emotions can strongly influence music recommendations, and recommendation algorithms can take them into account when deciding which songs or playlists to suggest. Facial expressions are frequently used to gauge a person's mood, and modern technology makes it possible to capture recognizable facial features as inputs using a webcam or other external device. Transfer learning is increasingly used to enhance emotion recognition and music recommendation systems: with the explosion of data and the availability of large pre-trained models, it has become a powerful method for leveraging prior knowledge to improve model performance and reduce the need for large volumes of labeled data. Hence, the objective of this study is to understand how transfer learning affects the accuracy of detecting emotions from facial expressions and how music recommendations can be personalized based on the detected emotions. The study recommends songs by detecting users' facial expressions, using the FER2013 dataset for emotion recognition, extended with additional images sourced from Google for each category. A basic CNN and fine-tuned pre-trained ResNet50V2, VGG16, and EfficientNetB0 models are trained on the dataset for emotion detection and compared. The music recommendation system is built on a Spotify songs dataset extracted through the Spotify Web API. It uses k-means clustering to group tracks by emotion and generates song recommendations from the emotion predictions of the fine-tuned ResNet50V2 model, which achieved the highest training accuracy (77.16%) and validation accuracy (69.04%). The findings reveal that a transfer learning approach can effectively identify emotions from facial expressions and shows potential for music recommendation, improving recommendation tasks and offering a useful way to help users discover new music that fits the intended emotional state.
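To make the transfer-learning setup summarized above concrete, the following is a minimal Keras sketch of fine-tuning a pre-trained ResNet50V2 for seven-class emotion recognition. It assumes FER2013-style face images have been resized to 224×224 RGB; the head architecture, layer sizes, learning rates, and unfreezing cutoff are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: two-stage transfer learning with ResNet50V2 (assumed configuration).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 7  # FER2013 emotion categories

# Load ResNet50V2 pre-trained on ImageNet without its classification head.
base = tf.keras.applications.ResNet50V2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False  # stage 1: freeze the pre-trained backbone

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),   # illustrative head size
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)

# Stage 2: unfreeze the top of the backbone and fine-tune at a lower rate.
base.trainable = True
for layer in base.layers[:-30]:  # illustrative cutoff; earlier layers stay frozen
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```

The recommendation side can likewise be sketched with scikit-learn: k-means groups tracks by their audio features, and the predicted emotion selects a cluster to sample from. The file name, feature columns, and the choice of k = 7 (mirroring the seven emotion classes) are assumptions for illustration.

```python
# Sketch: k-means clustering of Spotify audio features (assumed columns).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

tracks = pd.read_csv("spotify_tracks.csv")  # hypothetical export via the Spotify Web API
features = ["valence", "energy", "danceability", "tempo", "acousticness"]

# Standardize features so no single scale (e.g., tempo) dominates the clustering.
X = StandardScaler().fit_transform(tracks[features])
kmeans = KMeans(n_clusters=7, n_init=10, random_state=42).fit(X)
tracks["cluster"] = kmeans.labels_

def recommend(emotion_cluster: int, n: int = 10) -> pd.DataFrame:
    """Return up to n tracks from the cluster mapped to the predicted emotion."""
    pool = tracks[tracks["cluster"] == emotion_cluster]
    return pool.sample(min(n, len(pool)))
```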
DOI: https://doi.org/10.3844/jcssp.2023.707.726
Copyright: © 2023 Krishna Kumar Singh and Payal Dembla. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Transfer Learning
- Emotion Prediction
- Music Recommendation System