Average Model Cost: $0.0000

Number of Runs: 22,049

wav2vec2-large-robust-12-ft-emotion-msp-dim is a dimensional speech emotion recognition model based on Wav2vec 2.0. It takes a raw audio signal as input and predicts three emotion dimensions (arousal, dominance, and valence), each in a range of 0 to 1. The model was created by fine-tuning Wav2Vec2-Large-Robust on the MSP-Podcast dataset and is pruned to 12 transformer layers. Alongside the three predictions, it returns the pooled states of the last transformer layer. It is suited to speech applications that need continuous emotion estimates rather than discrete emotion labels.
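The output described above can be handled downstream roughly as follows. This is a minimal sketch, not the model's own API: the `EmotionScores` class and `to_emotion_scores` helper are hypothetical names, the ordering arousal, dominance, valence is assumed from the order in which the description lists them, and the clipping to 0..1 is a defensive choice that the description does not specify. The input numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EmotionScores:
    """Dimensional emotion scores, each nominally in the range 0 to 1."""
    arousal: float
    dominance: float
    valence: float


def to_emotion_scores(raw: Sequence[float]) -> EmotionScores:
    """Map a length-3 model output to named, clipped scores.

    Assumption: values are ordered arousal, dominance, valence.
    """
    if len(raw) != 3:
        raise ValueError(f"expected 3 values, got {len(raw)}")
    # Clip defensively in case a regression output lands slightly outside 0..1.
    arousal, dominance, valence = (min(1.0, max(0.0, float(v))) for v in raw)
    return EmotionScores(arousal, dominance, valence)


# Hypothetical output for one utterance (made-up numbers, not real predictions).
scores = to_emotion_scores([0.72, 0.65, 1.03])
print(scores)
```

Naming the three dimensions explicitly avoids index mix-ups when the scores are passed to later stages, such as thresholding on arousal or plotting valence over time.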




