Ai4bharat

Models by this creator

indic-parler-tts

ai4bharat

indic-parler-tts is a text-to-speech model built by ai4bharat that brings multilingual voice synthesis to 21 languages: 20 Indian languages plus English. An extension of Parler-TTS Mini, it was trained on 1,806 hours of multilingual data to deliver natural-sounding speech across languages from Assamese to Urdu.

Model Inputs and Outputs

The model turns a text transcript and a voice description into synthetic speech. Through natural language prompts, users can control voice characteristics such as gender, speaking rate, and audio quality.

Inputs

- **Text transcript** - The content to be converted to speech
- **Voice description** - Natural language prompt describing the desired voice qualities
- **Language selection** - Choice of supported language for the transcript

Outputs

- **Audio file** - Synthesized speech matching the specified characteristics
- **Sampling rate** - The sampling rate of the generated audio, needed to save or play it back correctly

Capabilities

The enhanced prompt tokenizer handles all supported languages, including Hindi, Tamil, and Telugu. By fine-tuning a Parler-TTS Mini based checkpoint, the model maintains quality while expanding language support. The tokenizer's byte fallback lets it handle unseen characters gracefully by breaking them into bytes rather than mapping them to an unknown token.

What can I use it for?

This model suits multilingual content creation, accessibility tools, and Indian language technology applications. Companies can integrate it into virtual assistants, e-learning platforms, or automated customer service systems. The Parler-TTS Mini Expresso architecture provides a foundation for emotion-aware speech synthesis.

Things to try

Experiment with code-switching between languages, or mix formal and conversational styles. Test different voice descriptions to find the right character for your application. Try generating several versions of the same text with varied emotional tones and speaking rates to understand the model's expressive range.
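The input and output flow described above (transcript plus voice description in, audio plus sampling rate out) can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' reference code: it assumes the `parler_tts` package from the Parler-TTS project, `transformers`, `torch`, and `soundfile` are installed, and that the checkpoint is published on the Hugging Face Hub as `ai4bharat/indic-parler-tts` (access may require accepting the model's terms). `build_description` and `synthesize` are hypothetical helper names introduced for illustration. No separate language argument is passed here; the sketch assumes the language is implied by the transcript itself.

```python
def build_description(gender="female", rate="moderate", quality="very high"):
    """Compose a natural-language voice description (hypothetical helper)."""
    return (
        f"A {gender} speaker delivers speech at a {rate} pace. "
        f"The recording is of {quality} quality, with the voice sounding clear and close up."
    )


def synthesize(text, description, out_path="indic_tts_out.wav",
               model_id="ai4bharat/indic-parler-tts"):
    """Generate speech for `text` in the voice described by `description`.

    Assumption: `model_id` is the Hub ID of the checkpoint discussed above.
    """
    # Heavy imports stay inside the function so the helper above works without them.
    import torch
    import soundfile as sf
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(model_id).to(device)

    # The transcript and the description use separate tokenizers.
    prompt_tokenizer = AutoTokenizer.from_pretrained(model_id)
    description_tokenizer = AutoTokenizer.from_pretrained(
        model.config.text_encoder._name_or_path
    )

    desc = description_tokenizer(description, return_tensors="pt").to(device)
    prompt = prompt_tokenizer(text, return_tensors="pt").to(device)

    generation = model.generate(
        input_ids=desc.input_ids,
        attention_mask=desc.attention_mask,
        prompt_input_ids=prompt.input_ids,
        prompt_attention_mask=prompt.attention_mask,
    )

    # The waveform is written at the sampling rate stored in the model config.
    audio = generation.cpu().numpy().squeeze()
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path, model.config.sampling_rate


if __name__ == "__main__":
    # Hindi transcript; swap in any supported language.
    path, sr = synthesize(
        "नमस्ते, आप कैसे हैं?",
        build_description(gender="male", rate="slow"),
    )
    print(f"Wrote {path} at {sr} Hz")
```

Calling `synthesize` several times with the same transcript and different `build_description` arguments is one way to compare voices, speaking rates, and recording qualities side by side.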

Updated 12/8/2024

Audio-to-Text