Abstract
In the evolving landscape of digital music consumption, accurate mood classification plays a crucial role in enhancing the user experience on streaming platforms. While services like Spotify and SoundCloud have introduced mood-based features, inconsistencies in labeling often result in disjointed listening experiences. This project proposes a music mood recognition algorithm based on audio signal processing and machine learning, focusing in particular on clustering techniques and neural networks.
This study uses a dataset of 389 songs across eight genres from the Free Music Archive (FMA). Surface-level musical features such as tempo, key, and scale, together with deeper spectral representations such as the spectral centroid and Mel-spectrograms, are extracted through extensive preprocessing, including denoising, trimming, normalization, and the Short-Time Fourier Transform (STFT). These feature sets are analyzed both jointly and separately to identify the best-performing models.
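The extraction pipeline described above can be sketched as follows. This is a minimal illustration using NumPy and SciPy on a synthetic tone; the library calls, window size, and sample rate are assumptions for demonstration, not the thesis's actual code.

```python
import numpy as np
from scipy.signal import stft

sr = 22050                                     # assumed sample rate (Hz)
t = np.linspace(0, 1.0, sr, endpoint=False)
signal = 0.5 * np.sin(2 * np.pi * 440 * t)     # synthetic stand-in for a loaded song

# Peak-normalize, then apply the Short-Time Fourier Transform (STFT)
signal = signal / np.max(np.abs(signal))
freqs, frames, Z = stft(signal, fs=sr, nperseg=1024)
magnitude = np.abs(Z)                          # spectrogram, shape (freq_bins, n_frames)

# Spectral centroid: magnitude-weighted mean frequency per frame
centroid = (freqs[:, None] * magnitude).sum(axis=0) / magnitude.sum(axis=0)
print(round(float(centroid.mean()), 1))        # close to 440 Hz for a pure 440 Hz tone
```

A Mel-spectrogram would add one further step, projecting the magnitude spectrogram onto a Mel-scaled filterbank before taking the log.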
Two machine learning approaches are employed: unsupervised clustering (K-Means and Fuzzy C-Means) to group songs into mood categories without labels, and a supervised neural network trained with sparse categorical cross-entropy and optimized via Stochastic Gradient Descent. Principal Component Analysis (PCA) is used to reduce feature dimensionality. Mood categories are defined by energy and emotion levels (Energetic/Relaxing × Happy/Sad), forming four mood groups.
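The unsupervised branch of the pipeline can be sketched with scikit-learn: PCA for dimensionality reduction followed by K-Means with four clusters, mirroring the Energetic/Relaxing × Happy/Sad mood grid. The random feature matrix below is a stand-in for the extracted audio features, and all parameter choices are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 389 "songs" with 20 stand-in features each (e.g. tempo, key, spectral statistics)
X = rng.normal(size=(389, 20))

X_reduced = PCA(n_components=2).fit_transform(X)   # reduce feature dimensionality
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_reduced)
print(sorted(set(labels)))                         # four mood clusters: [0, 1, 2, 3]
```

Fuzzy C-Means differs from this hard assignment by giving each song a membership weight in every cluster, which is what allows overlapping emotional content to be represented.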
The results show that K-Means produces clearer, more interpretable clusters, while the neural network, though flexible, exhibits overfitting, reaching a test accuracy of only 43.59%. Fuzzy clustering captures the overlapping emotional content in music more effectively than hard clustering. Limitations include the dataset size, the subjectivity of mood labeling, and the absence of lyrical or contextual data.
This project contributes a potential framework for mood-based recommendation, offering feasible applications in playlist generation and music discovery for users and artists alike. Future work may explore deep learning architectures, context-aware personalization, and integration of lyrics for a more comprehensive emotional modeling of music.
Advisor
Long, Colby
Department
Statistical and Data Sciences
Recommended Citation
Pham, My, "Audio Signal Analysis for Songs’ Mood Recognition by Clustering and Neural Network" (2025). Senior Independent Study Theses. Paper 11332.
https://openworks.wooster.edu/independentstudy/11332
Disciplines
Analysis | Data Science | Other Music | Other Physics | Statistical Models
Keywords
audio signal analysis, spectral analysis, audio semantic analysis, neural network, clustering, fuzzy clustering, k-means clustering
Publication Date
2025
Degree Granted
Bachelor of Arts
Document Type
Senior Independent Study Thesis
© Copyright 2025 My Pham