Balfagih, Zain
Ahmed, Fatima
2024-01-10
2024-01-10
2023-01-19
http://hdl.handle.net/20.500.14131/1342

As automation becomes more widespread, data has become an increasingly valuable asset. This is particularly true for streaming platforms and recommendation systems, which often use content metadata to make recommendations based on a user's past viewing history. However, creating this metadata can be time-consuming and expensive, as it typically involves manual annotation by specialists. Moreover, sufficient information may not always be available, which can lead to lower-quality recommendations.

In this study, a deep learning model was developed to recognize emotions in speech. The model combined Mel-Frequency Cepstral Coefficients (MFCCs) for feature extraction with a Long Short-Term Memory (LSTM) layer to capture contextual information. It was trained and tested on the Toronto Emotional Speech Set (TESS). The results showed that the model achieved high accuracy in emotion recognition, exceeding 95% training accuracy and 97% validation accuracy. The authors suggest that such a model could improve the ability of AI systems to understand and respond to human emotions, potentially enhancing the user experience in tasks such as voice commands, messaging, and recommendation systems.

en
Speech emotion recognition
Thesis
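The abstract names MFCCs as the feature-extraction step but gives no implementation details. The following is a minimal, self-contained NumPy sketch of the standard MFCC pipeline (frame, window, power spectrum, mel filterbank, log, DCT); the sampling rate, frame size, hop length, and filter counts are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """MFCCs via: frame -> window -> power spectrum -> mel filterbank -> log -> DCT.
    All parameter defaults are illustrative assumptions, not the thesis's settings."""
    # Slice the signal into overlapping frames and apply a Hann window
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Power spectrum of each frame (one-sided FFT)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Triangular mel filterbank spanning 0 .. sr/2
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)

    # Log mel energies, then DCT-II to decorrelate; keep the first n_mfcc coefficients
    logmel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2 * n_mels)))
    return logmel @ dct.T  # shape: (n_frames, n_mfcc)

# Example: MFCCs of a one-second synthetic 440 Hz tone
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
feats = mfcc(np.sin(2 * np.pi * 440 * t))
print(feats.shape)  # (61, 13)
```

In the architecture the abstract describes, a matrix like `feats` (one MFCC vector per frame) would form the per-timestep input sequence to the LSTM layer, which models how the features evolve over the utterance.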