Speech Emotion Analysis of Short English Readings based on the CAM-SPAT Model

Qingyuan Li

doi:10.12694/scpe.v24i4.2454

Authors

Qingyuan Li, School of Foreign Languages, Nanchang Institute of Technology, Nanchang, 330044, China

DOI:

https://doi.org/10.12694/scpe.v24i4.2454

Keywords:

read aloud speech, emotion, CAM-SPAT, deep learning, feature extraction

Abstract

Speech sentiment analysis has developed rapidly alongside advances in technology, and its fields of application continue to expand. Multimodal models have become a key research focus because of their ability to better predict emotions. To help English learners improve their oral proficiency, this study proposes a deep-learning-based emotion analysis model for short English text readings and applies it to analyze emotions in English reading. A cross-modal attention mechanism based on a prediction-assisted task was developed to detect emotional states in English read-aloud speech, and a two-layer attention-based bidirectional long short-term memory (BiLSTM) network was built to classify those emotions. The results show that the classification model achieved a mean F1 value of 98.54% and the detection model a mean F1 value of 85.13%; the speech emotion analysis model's mean F1 value was 73.55, which was not significantly different from the mean of the professionals' ratings. The significance of the study lies in providing English learners with a method and pathway for improving their oral English proficiency.
