Superb

Average Model Cost: $0.0000

Number of Runs: 32,972

Models by this creator

hubert-base-superb-ks

The hubert-base-superb-ks model is an audio classification model for the SUPERB Keyword Spotting (KS) task. It is based on the HuBERT base model and fine-tuned on the Speech Commands dataset used by SUPERB, so it classifies short 16 kHz speech utterances into a small set of preregistered keywords plus silence and unknown classes. A usage sketch follows below.
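A minimal sketch of calling this checkpoint through the Hugging Face transformers audio-classification pipeline, assuming transformers is installed and audio decoding is available; the file name speech.wav is a placeholder for any 16 kHz mono recording:

```python
from transformers import pipeline

# Load the SUPERB Keyword Spotting checkpoint into an audio-classification pipeline.
classifier = pipeline("audio-classification", model="superb/hubert-base-superb-ks")

# "speech.wav" is a placeholder path; use any 16 kHz mono speech clip.
predictions = classifier("speech.wav", top_k=5)
for p in predictions:
    print(p["label"], round(p["score"], 3))
```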

$-/run · 16.2K runs · Huggingface

wav2vec2-base-superb-ks

Wav2Vec2-Base for Keyword Spotting

Model description
This is a ported version of S3PRL's Wav2Vec2 for the SUPERB Keyword Spotting task. The base model is wav2vec2-base, which is pretrained on 16 kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16 kHz. For more information refer to SUPERB: Speech processing Universal PERformance Benchmark.

Task and dataset description
Keyword Spotting (KS) detects preregistered keywords by classifying utterances into a predefined set of words. The task is usually performed on-device for fast response time, so accuracy, model size, and inference time are all crucial. SUPERB uses the widely used Speech Commands dataset v1.0, which consists of ten keyword classes, a silence class, and an unknown class to absorb false positives. For the original model's training and evaluation instructions, refer to the S3PRL downstream task README.

Usage examples
You can use the model via the audio-classification pipeline or call it directly; a usage sketch follows this card.

Eval results
The evaluation metric is accuracy.

BibTeX entry and citation info
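The card's original code snippets did not survive extraction, so the following is a sketch of the "use the model directly" path, assuming torchaudio for loading a placeholder local file named speech.wav and the standard transformers classes for Wav2Vec2 sequence-classification checkpoints:

```python
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

# Load the keyword-spotting checkpoint and its matching feature extractor.
model = Wav2Vec2ForSequenceClassification.from_pretrained("superb/wav2vec2-base-superb-ks")
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("superb/wav2vec2-base-superb-ks")

# "speech.wav" is a placeholder path; the model expects 16 kHz mono input.
waveform, sample_rate = torchaudio.load("speech.wav")
if sample_rate != 16000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

inputs = feature_extractor(
    waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt"
)
with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = int(torch.argmax(logits, dim=-1))
print(model.config.id2label[predicted_id])  # one of the ten keywords, silence, or unknown
```

The same checkpoint also works with the audio-classification pipeline shown for hubert-base-superb-ks above.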

$-/run · 2.4K runs · Huggingface

hubert-base-superb-ic

Hubert-Base for Intent Classification

Model description
This is a ported version of S3PRL's Hubert for the SUPERB Intent Classification task. The base model is hubert-base-ls960, which is pretrained on 16 kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16 kHz. For more information refer to SUPERB: Speech processing Universal PERformance Benchmark.

Task and dataset description
Intent Classification (IC) classifies utterances into predefined classes to determine the intent of speakers. SUPERB uses the Fluent Speech Commands dataset, where each utterance is tagged with three intent labels: action, object, and location. For the original model's training and evaluation instructions, refer to the S3PRL downstream task README.

Usage examples
You can use the model directly; a usage sketch follows this card.

Eval results
The evaluation metric is accuracy.

BibTeX entry and citation info
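A minimal sketch of direct usage, under the assumption (not confirmed by this page) that the checkpoint emits a single logit vector covering the Fluent Speech Commands label groups in the order 6 action, 14 object, and 4 location classes; the file name command.wav is a placeholder:

```python
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, HubertForSequenceClassification

# Load the intent-classification checkpoint and its feature extractor.
model = HubertForSequenceClassification.from_pretrained("superb/hubert-base-superb-ic")
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("superb/hubert-base-superb-ic")

# "command.wav" is a placeholder path; the model expects 16 kHz mono speech.
waveform, sample_rate = torchaudio.load("command.wav")
if sample_rate != 16000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

inputs = feature_extractor(
    waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt"
)
with torch.no_grad():
    logits = model(**inputs).logits  # a single vector spanning all three label groups

# Assumed layout: indices 0-5 action, 6-19 object, 20-23 location.
id2label = model.config.id2label
action = id2label[int(torch.argmax(logits[:, :6], dim=-1))]
obj = id2label[int(torch.argmax(logits[:, 6:20], dim=-1)) + 6]
location = id2label[int(torch.argmax(logits[:, 20:24], dim=-1)) + 20]
print(action, obj, location)
```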

$-/run · 2.2K runs · Huggingface

Similar creators