Versatile Audio Super-resolution at Scale, which upsamples audio files to 48 kHz. This model also accepts longer audio inputs.

## Model overview

`audiosr-long-audio` is a versatile audio super-resolution model created by Sakemin. It upsamples audio files to 48 kHz and can handle longer audio inputs than other models. Related models include the [audio-super-resolution](https://aimodels.fyi/models/replicate/audio-super-resolution-nateraw) model, as well as Sakemin's [musicgen-fine-tuner](https://aimodels.fyi/models/replicate/musicgen-fine-tuner-sakemin) and [musicgen-remixer](https://aimodels.fyi/models/replicate/musicgen-remixer-sakemin) models.

## Model inputs and outputs

The `audiosr-long-audio` model accepts an audio file to be upsampled, an optional random seed, the number of DDIM (Denoising Diffusion Implicit Models) inference steps, a guidance scale value, and a flag that truncates batches for long inputs. The model outputs a URI pointing to the upsampled audio file.

### Inputs
- **Input File**: The audio file to be upsampled, provided as a URI.
- **Seed**: A random seed value, which can be left blank to randomize the seed.
- **Ddim Steps**: The number of DDIM inference steps, with a default of 50 and a range of 10 to 500.
- **Guidance Scale**: The scale for classifier-free guidance, with a default of 3.5 and a range of 1 to 20.
- **Truncated Batches**: A boolean flag to enable truncating batches to 5.12 seconds, which is essential for handling long audio files due to memory constraints.
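
To get a feel for what truncating to 5.12-second batches implies, the sketch below counts how many windows a recording of a given length would be split into. It assumes simple non-overlapping windows with the final partial window counted as a full one, which this document does not confirm. `count_windows` and `samples_per_window` are hypothetical helpers for illustration, not part of the model.

```python
import math

WINDOW_SECONDS = 5.12  # batch length quoted in the Truncated Batches description


def count_windows(duration_seconds: float, window_seconds: float = WINDOW_SECONDS) -> int:
    """Number of non-overlapping windows needed to cover a clip.

    Assumption: windows do not overlap and the final partial window is
    counted as a full one. The model's real chunking may differ.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return math.ceil(duration_seconds / window_seconds)


def samples_per_window(sample_rate_hz: int, window_seconds: float = WINDOW_SECONDS) -> int:
    """Number of samples in one window at the given sample rate."""
    return round(sample_rate_hz * window_seconds)


if __name__ == "__main__":
    # A three-minute track and the size of one window at the 48 kHz output rate.
    print(count_windows(180.0))
    print(samples_per_window(48_000))
```

Under these assumptions, each window is processed on its own, so the memory needed at any one time depends on the window size rather than the full length of the file.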

### Outputs
- **Output**: The upsampled audio file, provided as a URI.
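
The inputs and output above could be wired together roughly as follows. This is a minimal sketch, not the model's official client code. The model reference `sakemin/audiosr-long-audio` and the snake_case input keys are assumptions inferred from the names on this page, and the `True` default for `truncated_batches` is a choice made here for long inputs. Check the model's API page for the exact identifier, version, and schema.

```python
from typing import Any, Dict, Optional

# Assumed model identifier; a ":<version>" suffix may be required.
MODEL_REF = "sakemin/audiosr-long-audio"


def build_inputs(
    input_file: str,
    seed: Optional[int] = None,
    ddim_steps: int = 50,
    guidance_scale: float = 3.5,
    truncated_batches: bool = True,  # default chosen for this sketch, not documented
) -> Dict[str, Any]:
    """Validate values against the documented ranges and build the payload.

    The key names are assumptions derived from the input labels above.
    """
    if not 10 <= ddim_steps <= 500:
        raise ValueError("ddim_steps must be between 10 and 500")
    if not 1 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be between 1 and 20")
    payload: Dict[str, Any] = {
        "input_file": input_file,
        "ddim_steps": ddim_steps,
        "guidance_scale": guidance_scale,
        "truncated_batches": truncated_batches,
    }
    if seed is not None:  # omitting the seed lets the model randomize it
        payload["seed"] = seed
    return payload


def upsample(input_file: str, **options: Any) -> Any:
    """Run the model through the Replicate Python client.

    Requires the `replicate` package and a REPLICATE_API_TOKEN environment
    variable. The import is lazy so build_inputs works without the client.
    """
    import replicate

    return replicate.run(MODEL_REF, input=build_inputs(input_file, **options))
```

Calling `upsample("https://example.com/clip.wav", seed=42)` (a placeholder URL) would submit the job. The return value is expected to reference the upsampled file, as described under Outputs.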

## Capabilities

The `audiosr-long-audio` model upsamples audio files to a 48 kHz sample rate while aiming to preserve the quality and fidelity of the original audio. This makes it a useful tool for enhancing music, podcasts, or voice recordings that are only available at lower sample rates.

## What can I use it for?

The `audiosr-long-audio` model can be employed in a variety of audio-related projects and applications. For example, musicians and audio engineers could use it to upscale their recorded tracks, improving the overall sound quality. Content creators, such as podcasters or video producers, could also leverage this model to enhance the audio in their productions. Additionally, the model's ability to handle longer audio inputs makes it suitable for processing larger audio files, such as full-length albums or long-form interviews.

## Things to try

One interesting aspect of the `audiosr-long-audio` model is its flexibility in handling different kinds and lengths of audio. Experiment with various types of content, from music to speech, to see how the model performs. Additionally, try adjusting the DDIM steps and guidance scale within their documented ranges (10 to 500 and 1 to 20), ideally with a fixed seed so that runs stay comparable, to find settings that suit your use case.
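
One way to organize such an experiment is to enumerate a small grid of settings up front. The sketch below only generates the combinations and does not call the model. `parameter_grid` is a hypothetical helper, and the candidate values are arbitrary examples rather than recommendations.

```python
from itertools import product
from typing import Dict, Iterable, List


def parameter_grid(
    ddim_steps: Iterable[int], guidance_scales: Iterable[float]
) -> List[Dict[str, float]]:
    """Return every combination that falls inside the documented ranges.

    Out-of-range values (steps outside 10-500, scale outside 1-20) are
    skipped rather than raising an error.
    """
    grid = []
    for steps, scale in product(ddim_steps, guidance_scales):
        if 10 <= steps <= 500 and 1 <= scale <= 20:
            grid.append({"ddim_steps": steps, "guidance_scale": scale})
    return grid


# Example candidates; the values are illustrative only.
grid = parameter_grid([25, 50, 100], [2.0, 3.5, 5.0])

if __name__ == "__main__":
    for settings in grid:
        print(settings)
```

Each dictionary can then be merged into the model inputs for one run. Higher DDIM step counts mean more inference steps per run, so a small grid is a sensible starting point.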