openvoice

Maintainer: cjwbw

Total Score: 9

Last updated: 5/19/2024

  • Model Link: View on Replicate
  • API Spec: View on Replicate
  • Github Link: View on Github
  • Paper Link: View on Arxiv


Model overview

The openvoice model, developed by the team at MyShell, is a versatile instant voice cloning AI that can accurately clone the tone color and generate speech in multiple languages and accents. It offers flexible control over voice styles, such as emotion and accent, as well as other style parameters like rhythm, pauses, and intonation. The model also supports zero-shot cross-lingual voice cloning, allowing it to generate speech in languages not present in the training dataset.

The openvoice model builds upon several excellent open-source projects, including TTS, VITS, and VITS2. It has powered the instant voice cloning capability of myshell.ai since May 2023 and has been used tens of millions of times by users worldwide, driving explosive growth on the platform.

Model inputs and outputs

Inputs

  • Audio: The reference audio used to clone the tone color.
  • Text: The text to be spoken by the cloned voice.
  • Speed: The speed scale of the output audio.
  • Language: The language of the audio to be generated.

Outputs

  • Output: The generated audio in the cloned voice.
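A minimal sketch of how such a call might look with the Replicate Python client is shown below. The model identifier, version pinning, and exact input field names are assumptions based on the inputs listed above, so verify them against the model's API page before relying on them.

```python
# Hypothetical call via the Replicate Python client (pip install replicate);
# requires REPLICATE_API_TOKEN in the environment. Field names mirror the
# inputs listed above but are assumptions -- verify against the API spec.
import replicate

output = replicate.run(
    "cjwbw/openvoice",  # assumed model identifier; pin an exact version in practice
    input={
        "audio": open("reference_voice.mp3", "rb"),  # reference audio whose tone color is cloned
        "text": "Hello there! This sentence is spoken in the cloned voice.",
        "language": "EN",  # language of the generated speech
        "speed": 1.0,      # speed scale of the output audio
    },
)

# Output: the generated audio in the cloned voice (typically a URL or file-like object).
print(output)
```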

Capabilities

The openvoice model excels at accurate tone color cloning, flexible voice style control, and zero-shot cross-lingual voice cloning. It can generate speech in multiple languages and accents, while allowing for granular control over voice styles, including emotion and accent, as well as other parameters like rhythm, pauses, and intonation.

What can I use it for?

The openvoice model can be used for a variety of applications, such as:

  • Instant voice cloning for audio, video, or gaming content
  • Customized text-to-speech for assistants, chatbots, or audiobooks
  • Multilingual voice acting and dubbing
  • Voice conversion and style transfer

Things to try

With the openvoice model, you can experiment with different input reference audios to clone a wide range of voices and accents. You can also play with the style parameters to create unique and expressive speech outputs. Additionally, you can explore the model's cross-lingual capabilities by generating speech in languages not present in the training data.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models


openvoice

Maintainer: chenxwh

Total Score: 33

The openvoice model is a versatile instant voice cloning model developed by the team at MyShell.ai. As detailed in their paper and on the website, the key advantages of openvoice are accurate tone color cloning, flexible voice style control, and zero-shot cross-lingual voice cloning. This model has been powering the instant voice cloning capability on the MyShell platform since May 2023, with tens of millions of uses by global users.

The openvoice model is similar to other voice cloning models like voicecraft and realistic-voice-cloning, which also focus on creating realistic voice clones. However, openvoice stands out with its advanced capabilities in voice style control and cross-lingual cloning. The model is also related to speech recognition models like whisper and whisperx, which have different use cases focused on transcription.

Model inputs and outputs

The openvoice model takes three main inputs: the input text, a reference audio file, and the desired language. The text is what will be spoken by the cloned voice, the reference audio provides the tone color to clone, and the language specifies the language of the generated speech.

Inputs

  • Text: The input text that will be spoken by the cloned voice
  • Audio: A reference audio file that provides the tone color to be cloned
  • Language: The desired language of the generated speech

Outputs

  • Audio: The generated audio with the cloned voice speaking the input text

Capabilities

The openvoice model excels at accurately cloning the tone color and vocal characteristics of the reference audio, while also enabling flexible control over the voice style, such as emotion and accent. Notably, the model can perform zero-shot cross-lingual voice cloning, meaning it can generate speech in languages not seen during training.

What can I use it for?

The openvoice model can be used for a variety of applications, such as creating personalized voice assistants, dubbing foreign language content, or generating audio for podcasts and audiobooks. By leveraging the model's ability to clone voices and control style, users can create unique and engaging audio content tailored to their needs.

Things to try

One interesting thing to try with the openvoice model is to experiment with different reference audio files and see how the cloned voice changes. You can also try adjusting the style parameters, such as emotion and accent, to create different variations of the cloned voice. Additionally, the model's cross-lingual capabilities allow you to generate speech in languages you may not be familiar with, opening up new creative possibilities.


whisper

Maintainer: cjwbw

Total Score: 49

whisper is a large, general-purpose speech recognition model developed by OpenAI. It is trained on a diverse dataset of audio and can perform a variety of speech-related tasks, including multilingual speech recognition, speech translation, and spoken language identification. The whisper model is available in different sizes, with the larger models offering better accuracy at the cost of increased memory and compute requirements. The maintainer, cjwbw, has also created several other models, such as stable-diffusion-2-1-unclip, anything-v3-better-vae, and dreamshaper, that explore different approaches to image generation and manipulation.

Model inputs and outputs

The whisper model is a sequence-to-sequence model that takes audio as input and produces a text transcript as output. It can handle a variety of audio formats, including FLAC, MP3, and WAV files. The model can also be used to perform speech translation, where the input audio is in one language and the output text is in another language.

Inputs

  • audio: The audio file to be transcribed, in a supported format such as FLAC, MP3, or WAV.
  • model: The size of the whisper model to use, with options ranging from tiny to large.
  • language: The language spoken in the audio, or None to perform language detection.
  • translate: A boolean flag to indicate whether the output should be translated to English.

Outputs

  • transcription: The text transcript of the input audio, in the specified format (e.g., plain text).

Capabilities

The whisper model is capable of performing high-quality speech recognition across a wide range of languages, including less common languages. It can also handle various accents and speaking styles, making it a versatile tool for transcribing diverse audio content. The model's ability to perform speech translation is particularly useful for applications where users need to consume content in a language they don't understand.

What can I use it for?

The whisper model can be used in a variety of applications, such as:

  • Transcribing audio recordings for content creation, research, or accessibility purposes.
  • Translating speech-based content, such as videos or podcasts, into multiple languages.
  • Integrating speech recognition and translation capabilities into chatbots, virtual assistants, or other conversational interfaces.
  • Automating the captioning or subtitling of video content.

Things to try

One interesting aspect of the whisper model is its ability to detect the language spoken in the audio, even if it's not provided as an input. This can be useful for applications where the language is unknown or variable, such as transcribing multilingual conversations. Additionally, the model's performance can be fine-tuned by adjusting parameters like temperature, patience, and suppressed tokens, which can help improve accuracy for specific use cases.
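As a rough illustration of the inputs above, here is a hedged sketch using the Replicate Python client; the identifier and field names are assumptions drawn from the list rather than the exact API schema.

```python
# Hypothetical whisper call; requires REPLICATE_API_TOKEN and `pip install replicate`.
# Field names follow the inputs listed above and may differ from the real schema.
import replicate

result = replicate.run(
    "cjwbw/whisper",  # assumed identifier; pin an exact version in practice
    input={
        "audio": open("interview.mp3", "rb"),
        "model": "large",   # larger checkpoints are more accurate but slower
        "translate": True,  # translate the speech to English instead of transcribing
        # "language" is omitted here so the model falls back to language detection
    },
)

print(result["transcription"])  # key name assumed from the output listed above
```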


voicecraft

Maintainer: cjwbw

Total Score: 2

VoiceCraft is a token infilling neural codec language model, maintained on Replicate by cjwbw. It achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data including audiobooks, internet videos, and podcasts. Unlike voice cloning models that require high-quality reference audio, VoiceCraft can clone an unseen voice with just a few seconds of reference.

Model inputs and outputs

VoiceCraft is a versatile model that can be used for both speech editing and zero-shot text-to-speech. For speech editing, the model takes in the original audio, the transcript, and target edits to the transcript. For zero-shot TTS, the model only requires a few seconds of reference audio and the target transcript.

Inputs

  • Original audio: The audio file to be edited or used as a reference for TTS
  • Original transcript: The transcript of the original audio, which can be generated automatically with a model like WhisperX
  • Target transcript: The desired transcript for the edited or synthesized audio
  • Reference audio duration: The duration of the original audio to use as a reference for zero-shot TTS

Outputs

  • Edited audio: The audio with the specified edits applied
  • Synthesized audio: The audio generated from the target transcript using the reference audio

Capabilities

VoiceCraft is capable of high-quality speech editing and zero-shot text-to-speech. It can seamlessly blend new content into existing audio, enabling tasks like adding or removing words, changing the speaker's voice, or modifying emotional tone. For zero-shot TTS, VoiceCraft can generate natural-sounding speech in the voice of the reference audio, without any fine-tuning or additional training.

What can I use it for?

VoiceCraft can be used in a variety of applications, such as podcast production, audiobook creation, video dubbing, and voice assistant development. With its ability to edit and synthesize speech, creators can efficiently produce high-quality audio content without the need for extensive post-production work or specialized recording equipment. Additionally, VoiceCraft can be used to create personalized text-to-speech applications, where users can have their content read aloud in a voice of their choice.

Things to try

One interesting thing to try with VoiceCraft is to use it for speech-to-speech translation. By providing the model with an audio clip in one language and the transcript in the target language, it can generate the translated audio in the voice of the original speaker. This can be particularly useful for international collaborations or accessibility purposes. Another idea is to explore the model's capabilities for audio restoration and enhancement. By providing VoiceCraft with a low-quality audio recording and the desired improvements, it may be able to generate a higher-quality version of the audio, while preserving the original speaker's voice.
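To make the input/output description above concrete, here is a hedged sketch of a speech-editing call through the Replicate Python client; the identifier and every field name are assumptions rather than the model's documented schema.

```python
# Hypothetical VoiceCraft speech-editing call; names are assumptions based on the
# inputs described above (original audio, original/target transcript, reference duration).
import replicate

edited = replicate.run(
    "cjwbw/voicecraft",  # assumed identifier; pin an exact version in practice
    input={
        "orig_audio": open("narration.wav", "rb"),                    # audio to edit or clone from
        "orig_transcript": "Welcome to the show, everyone.",          # e.g. produced by WhisperX
        "target_transcript": "Welcome back to the show, everyone.",   # desired edited speech
        "cut_off_sec": 3.0,  # assumed name for the reference-audio duration (zero-shot TTS)
    },
)

print(edited)  # edited or synthesized audio, typically returned as a URL
```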


video-retalking

Maintainer: cjwbw

Total Score: 65

video-retalking is a system developed by researchers at Tencent AI Lab and Xidian University that enables audio-based lip synchronization and expression editing for talking head videos. It builds on prior work like Wav2Lip, PIRenderer, and GFP-GAN to create a pipeline for generating high-quality, lip-synced videos from talking head footage and audio. Unlike models like voicecraft, which focuses on speech editing, or tokenflow, which aims for consistent video editing, video-retalking is specifically designed for synchronizing lip movements with audio.

Model inputs and outputs

video-retalking takes two main inputs: a talking head video and an audio file. The model then generates a new video with the facial expressions and lip movements synchronized to the provided audio. This allows users to edit the appearance and emotion of a talking head video while preserving the original audio.

Inputs

  • Face: Input video file of a talking head.
  • Input Audio: Input audio file to synchronize with the video.

Outputs

  • Output: The generated video with synchronized lip movements and expressions.

Capabilities

video-retalking can generate high-quality, lip-synced videos even in the wild, meaning it can handle real-world footage without the need for extensive pre-processing or manual alignment. The model disentangles the task into three key steps: generating a canonical face expression, synchronizing the lip movements to the audio, and enhancing the photo-realism of the final output.

What can I use it for?

video-retalking can be a powerful tool for content creators, video editors, and anyone looking to edit or enhance talking head videos. Its ability to preserve the original audio while modifying the visual elements opens up possibilities for a wide range of applications, such as:

  • Dubbing or re-voicing videos in different languages
  • Adjusting the emotion or expression of a speaker
  • Repairing or improving the lip sync in existing footage
  • Creating animated avatars or virtual presenters

Things to try

One interesting aspect of video-retalking is its ability to control the expression of the upper face using pre-defined templates like "smile" or "surprise". This allows for more nuanced expression editing beyond just lip sync. Additionally, the model's sequential pipeline means each step can be examined and potentially fine-tuned for specific use cases.
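A hedged sketch of how the two inputs above might be supplied through the Replicate Python client follows; the identifier and field names are assumptions rather than the documented schema.

```python
# Hypothetical video-retalking call; requires REPLICATE_API_TOKEN and `pip install replicate`.
# Field names mirror the Face / Input Audio inputs above but are assumptions.
import replicate

synced = replicate.run(
    "cjwbw/video-retalking",  # assumed identifier; pin an exact version in practice
    input={
        "face": open("talking_head.mp4", "rb"),        # input talking-head video
        "input_audio": open("dubbed_line.wav", "rb"),  # audio to lip-sync the video to
    },
)

print(synced)  # URL (or file) of the re-synced, expression-edited video
```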
