Whisper Diarization

thomasmol

Whisper-diarization is a model that transcribes audio files and labels each speaker. It accepts audio as a base64 string or a URL and returns a transcription in which each part is attributed to a speaker. Speaker diarization is the process of determining which speaker is talking at each point in an audio recording.
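As a rough illustration of the two input formats mentioned above, the sketch below prepares either a URL or a base64-encoded local file for submission. The function name `build_audio_input` and the payload keys `file_url` and `file_string` are assumptions made for this example, not confirmed parameter names; check the model's API spec before relying on them.

```python
import base64
from pathlib import Path


def build_audio_input(source: str) -> dict:
    """Build a hypothetical input payload from a URL or a local file path.

    The keys "file_url" and "file_string" are illustrative assumptions,
    not names confirmed by the model's API spec.
    """
    # Remote audio: pass the URL through unchanged.
    if source.startswith(("http://", "https://")):
        return {"file_url": source}

    # Local audio: read the raw bytes and encode them as base64 text.
    audio_bytes = Path(source).read_bytes()
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {"file_string": encoded}
```

Base64 encoding increases the payload size by about a third, so a URL is usually the lighter option for long recordings.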

Use cases

Whisper-diarization has several possible use cases for a technical audience. In speech recognition systems, it can transcribe recordings with multiple speakers, which suits transcription services and voice assistants. It can support the analysis of call center recordings and customer service conversations, helping organizations extract insights from those interactions. It can also be integrated into video editing software to generate closed captions with speaker labels automatically, improving accessibility for people with hearing impairments. Other potential products include voice-controlled meeting transcription software, voice assistants that distinguish between different users, and speech analytics tools for research or market analysis.
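To make the closed-caption use case concrete, here is a minimal sketch that turns diarized segments into SRT captions with speaker labels. The segment shape (a dict with `start`, `end`, `speaker`, and `text` keys, times in seconds) is an assumption for illustration; the model's actual output format may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list) -> str:
    """Render hypothetical diarized segments as SRT caption blocks.

    Each segment is assumed to be a dict with "start", "end",
    "speaker", and "text" keys.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Prefix the caption text with the speaker label in brackets.
        caption = f"[{seg['speaker']}] {seg['text'].strip()}"
        blocks.append(f"{index}\n{start} --> {end}\n{caption}\n")
    # SRT separates caption blocks with a blank line.
    return "\n".join(blocks)
```

For example, `segments_to_srt([{"start": 0.0, "end": 1.25, "speaker": "SPEAKER_00", "text": "Hello"}])` yields a single caption block running from `00:00:00,000` to `00:00:01,250`.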

Audio-to-Text

Pricing

Cost per run: $- USD
Avg run time: - seconds
Prediction hardware: Nvidia T4 GPU

Creator Models

No other models by this creator

Try it!

You can use this area to play around with demo applications that incorporate the Whisper Diarization model. These demos are maintained and hosted externally by third-party creators. If you see an error, message me on Twitter.

Currently, there are no demos available for this model.

Overview

Summary of this model and related resources.

Property      Value
Creator       thomasmol
Model Name    Whisper Diarization
Description   Transcribes any audio file (file, base64 or URL) with speaker diarization. ...
Tags          Audio-to-Text
Model Link    View on Replicate
API Spec      View on Replicate
Github Link   View on Github
Paper Link    View on Arxiv

Popularity

How popular is this model, by number of runs? How popular is the creator, by the sum of all their runs?

Property      Value
Runs          7,116
Model Rank
Creator Rank

Cost

How much does it cost to run this model? How long, on average, does it take to complete a run?

Property                 Value
Cost per Run             $-
Prediction Hardware      Nvidia T4 GPU
Average Completion Time  -