indic-parler-tts
ai4bharat
indic-parler-tts is a text-to-speech model built by ai4bharat that brings multilingual voice synthesis to 21 languages: 20 Indian languages plus English. This extension of Parler-TTS Mini was trained on 1,806 hours of multilingual data to deliver natural-sounding speech across languages from Assamese to Urdu.
Model Inputs and Outputs
The model transforms text and voice descriptions into high-quality synthetic speech. Using natural language prompts, users can control voice characteristics like gender, speaking rate, and audio quality.
Inputs
Text transcript - The content to be converted to speech
Voice description - Natural language prompt describing desired voice qualities
Language selection - Choice of supported Indian language
Outputs
Audio file - High-quality synthesized speech matching the specified characteristics
Sampling rate - The sample rate of the generated waveform, needed to save or play back the audio correctly
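The inputs and outputs above can be sketched in code. This is a minimal sketch, not an official recipe: it assumes the `parler_tts`, `transformers`, `torch`, and `soundfile` packages are installed and that the checkpoint is published on the Hugging Face Hub as `ai4bharat/indic-parler-tts`. The function name `synthesize` and the default output path are illustrative. The sketch has no separate language argument, on the assumption that the language follows from the transcript itself.

```python
def synthesize(text, description, out_path="indic_tts_out.wav",
               model_id="ai4bharat/indic-parler-tts"):
    """Generate speech for `text` in the voice described by `description`.

    Returns the output path and the sampling rate used to write the file.
    """
    # Heavy dependencies are imported lazily so this module loads without them.
    import torch
    import soundfile as sf
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(model_id).to(device)

    # The transcript and the voice description use separate tokenizers.
    prompt_tokenizer = AutoTokenizer.from_pretrained(model_id)
    description_tokenizer = AutoTokenizer.from_pretrained(
        model.config.text_encoder._name_or_path
    )

    desc = description_tokenizer(description, return_tensors="pt").to(device)
    prompt = prompt_tokenizer(text, return_tensors="pt").to(device)

    audio = model.generate(
        input_ids=desc.input_ids,
        attention_mask=desc.attention_mask,
        prompt_input_ids=prompt.input_ids,
        prompt_attention_mask=prompt.attention_mask,
    )

    # Write the waveform at the rate reported by the model configuration.
    waveform = audio.cpu().numpy().squeeze()
    sampling_rate = model.config.sampling_rate
    sf.write(out_path, waveform, sampling_rate)
    return out_path, sampling_rate


# Example call (downloads the checkpoint on first use):
# synthesize("नमस्ते, आप कैसे हैं?",
#            "A female speaker delivers clear speech at a moderate pace.")
```

Loading the model inside the function keeps the sketch short. An application that synthesizes many utterances would load the model and tokenizers once and reuse them.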
Capabilities
The enhanced prompt tokenizer handles all 21 supported languages, including Hindi, Tamil, and Telugu. Because the model is fine-tuned from Parler-TTS Mini, it maintains quality while expanding language support. Byte fallback lets the tokenizer handle unseen characters gracefully by encoding them as raw bytes rather than discarding them.
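To make the byte fallback idea concrete, here is a toy, character-level sketch. It is not the model's actual tokenizer, which works on subword pieces. The function names and the tiny vocabulary are hypothetical, and the `<0xNN>` token spelling follows the convention used by SentencePiece-style tokenizers. A character missing from the vocabulary is split into its UTF-8 bytes, so nothing is lost and the text can be reconstructed exactly.

```python
def tokenize_with_byte_fallback(text, vocab):
    """Keep known characters as tokens; split unknown ones into UTF-8 byte tokens."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            # Unknown character: emit one <0xNN> token per UTF-8 byte.
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens


def detokenize(tokens):
    """Invert tokenize_with_byte_fallback by reassembling the raw bytes."""
    raw = bytearray()
    for tok in tokens:
        if len(tok) == 6 and tok.startswith("<0x") and tok.endswith(">"):
            raw.append(int(tok[3:5], 16))
        else:
            raw.extend(tok.encode("utf-8"))
    return raw.decode("utf-8")


# Toy vocabulary: only the characters of "नमस्ते" plus a space are known.
vocab = set("नमस्ते ")
tokens = tokenize_with_byte_fallback("नमस्ते ॐ", vocab)
# "ॐ" (U+0950) is not in the vocabulary, so it becomes three byte tokens.
print(tokens)
print(detokenize(tokens))
```

The trade-off is sequence length: each unseen character costs several tokens, but the model never has to collapse it into a single unknown token.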
What can I use it for?
This model opens possibilities for multilingual content creation, accessibility tools, and Indian language technology applications. Companies can integrate it into virtual assistants, e-learning platforms, or automated customer service systems. The related Parler-TTS Mini Expresso model points toward emotion-aware speech synthesis within the same model family.
Things to try
Experiment with code-switching between languages or mix formal and conversational styles. Test different voice descriptions to find the right character for your application. Try generating multiple versions of the same text with varied emotional tones and speaking rates to understand the model's expressive range.
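One way to organize such experiments is to generate a grid of voice descriptions and synthesize the same text once per description. The sketch below only builds the description strings. The helper name `description_variants`, the sentence template, and the attribute values are illustrative assumptions, not wording prescribed by the model.

```python
from itertools import product


def description_variants(genders, rates, tones):
    """Build one voice description per combination of the given attributes."""
    variants = []
    for gender, rate, tone in product(genders, rates, tones):
        variants.append(
            f"A {gender} speaker delivers {tone} speech at a {rate} pace. "
            "The recording is of very high quality, with the voice "
            "sounding clear and close up."
        )
    return variants


descriptions = description_variants(
    genders=["female", "male"],
    rates=["slow", "moderate", "fast"],
    tones=["calm", "expressive"],
)
for d in descriptions:
    print(d)
```

Each string can then be passed as the voice description to whichever synthesis call you use, keeping the transcript fixed so that differences in the audio come from the description alone.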