Superb
Rank:
Average Model Cost: $0.0000
Number of Runs: 32,972
Models by this creator
hubert-base-superb-ks
The hubert-base-superb-ks model is an audio classification model for keyword spotting. It is based on HuBERT (hubert-base-ls960) and fine-tuned for the SUPERB Keyword Spotting (KS) task on the Speech Commands dataset, classifying short utterances into a predefined set of keyword classes.
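The listing does not preserve a usage snippet for this model; below is a minimal sketch of keyword spotting with the Hugging Face audio-classification pipeline, assuming the transformers library (with a PyTorch backend) is installed. The one-second silent waveform is a synthetic stand-in; use a real 16 kHz recording in practice.

```python
import numpy as np
from transformers import pipeline

# Load the SUPERB keyword-spotting checkpoint from the Hugging Face Hub.
classifier = pipeline("audio-classification", model="superb/hubert-base-superb-ks")

# The model expects 16 kHz audio; here, one second of synthetic silence.
waveform = np.zeros(16000, dtype=np.float32)

# top_k=5 returns the five highest-scoring keyword classes.
predictions = classifier(waveform, top_k=5)
for p in predictions:
    print(f'{p["label"]}: {p["score"]:.3f}')
```

The pipeline resamples file inputs automatically, but raw NumPy arrays like the one above are assumed to already be at the model's 16 kHz sampling rate.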
$-/run
16.2K
Huggingface
wav2vec2-base-superb-er
$-/run
4.8K
Huggingface
hubert-large-superb-er
$-/run
3.8K
Huggingface
wav2vec2-base-superb-ks
Wav2Vec2-Base for Keyword Spotting
Model description: This is a ported version of S3PRL's Wav2Vec2 for the SUPERB Keyword Spotting task. The base model is wav2vec2-base, which is pretrained on 16 kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16 kHz. For more information, refer to SUPERB: Speech processing Universal PERformance Benchmark.
Task and dataset description: Keyword Spotting (KS) detects preregistered keywords by classifying utterances into a predefined set of words. The task is usually performed on-device for fast response times, so accuracy, model size, and inference time are all crucial. SUPERB uses the widely used Speech Commands dataset v1.0, which consists of ten keyword classes, a class for silence, and an unknown class to cover false positives. For the original model's training and evaluation instructions, refer to the S3PRL downstream task README.
Usage examples: The model can be used via the Audio Classification pipeline or invoked directly.
Eval results: The evaluation metric is accuracy.
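The code that originally accompanied the "use the model directly" example is not preserved in this listing; the following is a sketch of direct use with the transformers sequence-classification head, assuming transformers and torch are installed. The silent waveform is synthetic; substitute a real 16 kHz recording.

```python
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

# Load the ported SUPERB KS checkpoint and its matching feature extractor.
model = Wav2Vec2ForSequenceClassification.from_pretrained("superb/wav2vec2-base-superb-ks")
extractor = Wav2Vec2FeatureExtractor.from_pretrained("superb/wav2vec2-base-superb-ks")

# One second of 16 kHz audio (synthetic silence for illustration).
waveform = np.zeros(16000, dtype=np.float32)
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, num_classes)

# Map the highest-scoring logit back to its class name.
predicted = model.config.id2label[int(logits.argmax(dim=-1))]
print(predicted)
```

The class set mirrors the Speech Commands task described above: ten keywords plus silence and unknown.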
$-/run
2.4K
Huggingface
hubert-base-superb-ic
Hubert-Base for Intent Classification
Model description: This is a ported version of S3PRL's Hubert for the SUPERB Intent Classification task. The base model is hubert-base-ls960, which is pretrained on 16 kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16 kHz. For more information, refer to SUPERB: Speech processing Universal PERformance Benchmark.
Task and dataset description: Intent Classification (IC) classifies utterances into predefined classes to determine the intent of speakers. SUPERB uses the Fluent Speech Commands dataset, where each utterance is tagged with three intent labels: action, object, and location. For the original model's training and evaluation instructions, refer to the S3PRL downstream task README.
Usage examples: The model can be invoked directly.
Eval results: The evaluation metric is accuracy.
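The direct-usage example referenced above was stripped from this listing; here is a sketch under the same assumptions as the other examples (transformers and torch installed, synthetic 16 kHz input). Because Fluent Speech Commands tags each utterance with three labels, the single logit vector must be sliced into action, object, and location groups; the slice boundary used below is an assumption for illustration, not taken from the listing.

```python
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, HubertForSequenceClassification

# Load the ported SUPERB IC checkpoint and its matching feature extractor.
model = HubertForSequenceClassification.from_pretrained("superb/hubert-base-superb-ic")
extractor = Wav2Vec2FeatureExtractor.from_pretrained("superb/hubert-base-superb-ic")

# One second of 16 kHz audio (synthetic silence; use a real utterance in practice).
waveform = np.zeros(16000, dtype=np.float32)
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one vector covering all three label groups

# Take the argmax within each label group's slice of the logits; the
# ":6" boundary for the action group is a hypothetical example.
action_id = int(torch.argmax(logits[:, :6], dim=-1))
print(action_id)
```

The object and location predictions follow the same pattern, each with an argmax over its own slice of the logit vector.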
$-/run
2.2K
Huggingface
hubert-base-superb-er
$-/run
1.3K
Huggingface
wav2vec2-base-superb-sid
$-/run
781
Huggingface
wav2vec2-large-superb-sid
$-/run
548
Huggingface
wav2vec2-base-superb-ic
$-/run
509
Huggingface
hubert-base-superb-sid
$-/run
332
Huggingface