Tango

declare-lab/tango

Tango (Text to Audio using iNstruction-Guided diffusiOn) is a text-to-audio model from declare-lab. It takes a text prompt and generates coherent, natural-sounding audio by combining a language model with a diffusion-based acoustic model. The text prompt guides the diffusion process, so a more specific prompt gives more control over the generated audio. Potential applications include audio content generation, virtual assistants, and text-to-speech-style systems.

Use cases

Tango has several potential use cases. One is generating audio content from written material, for example audio for audiobooks or podcasts. Because the output is guided by the text prompt, the model could also suit applications that need control over the audio, such as virtual assistants or voice-guided navigation systems; a virtual assistant could, for instance, use Tango to produce audio responses to user queries. The model could also support language learning tools that provide spoken translations or pronunciation guidance. These are possibilities rather than demonstrated deployments, and how well the output fits should be checked for each application.

Text-to-Audio

Pricing

Cost per run: $0.2047 USD
Avg run time: 89 seconds
Prediction hardware: Nvidia A100 (40GB) GPU

Creator Models

| Model | Cost | Runs |
| --- | --- | --- |
| Mustango | $? | 452 |

Overview

Summary of this model and related resources.

| Property | Value |
| --- | --- |
| Creator | declare-lab |
| Model Name | Tango |
| Description | Text to Audio using iNstruction-Guided diffusiOn |
| Tags | Text-to-Audio |
| Model Link | View on Replicate |
| API Spec | View on Replicate |
| Github Link | View on Github |
| Paper Link | View on Arxiv |

Popularity

How popular is this model, by number of runs? How popular is the creator, by the sum of all their runs?

| Property | Value |
| --- | --- |
| Runs | 8,808 |
| Model Rank | |
| Creator Rank | |

Cost

How much does it cost to run this model? How long, on average, does it take to complete a run?

| Property | Value |
| --- | --- |
| Cost per Run | $0.2047 |
| Prediction Hardware | Nvidia A100 (40GB) GPU |
| Average Completion Time | 89 seconds |
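
The figures above can be turned into a rough budget estimate. The following is a minimal sketch, not part of any Replicate or Tango API: the `RunPricing` class and its method names are hypothetical, and it assumes every run costs the listed average ($0.2047) and takes the listed average time (89 seconds), executed one after another. Real runs will vary with prompt and settings.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RunPricing:
    """Hypothetical helper holding the averages listed on this page."""

    cost_per_run_usd: float = 0.2047  # listed average cost per run
    avg_run_seconds: float = 89.0     # listed average completion time

    @property
    def usd_per_second(self) -> float:
        # Implied per-second rate, derived from the two listed averages.
        return self.cost_per_run_usd / self.avg_run_seconds

    def estimate(self, runs: int) -> Tuple[float, float]:
        """Return (total cost in USD, total run time in hours) for `runs` sequential runs."""
        if runs < 0:
            raise ValueError("runs must be non-negative")
        total_usd = runs * self.cost_per_run_usd
        total_hours = runs * self.avg_run_seconds / 3600
        return total_usd, total_hours


pricing = RunPricing()
total_usd, total_hours = pricing.estimate(100)
print(f"100 runs: ${total_usd:.2f}, about {total_hours:.2f} hours")
print(f"Implied rate: ${pricing.usd_per_second:.4f} per second")
```

With the listed averages, 100 runs come to about $20.47 and roughly 2.47 hours of sequential run time, and the implied rate is about $0.0023 per second.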