musicgen-looper

Maintainer: andreasjansson

Total Score: 46

Last updated 5/19/2024

  • Model Link: View on Replicate
  • API Spec: View on Replicate
  • Github Link: View on Github
  • Paper Link: View on Arxiv

Model overview

The musicgen-looper is a Cog implementation of MusicGen, a simple and controllable music generation model developed by Facebook Research. Unlike existing music generation models such as MusicLM, MusicGen does not require a self-supervised semantic representation and generates all four audio codebooks in a single pass. By introducing a small delay between the codebooks, MusicGen can predict them in parallel, reducing generation to roughly 50 auto-regressive steps per second of audio. The model was trained on 20,000 hours of licensed music, including an internal dataset of 10,000 high-quality tracks as well as music from ShutterStock and Pond5.

The musicgen-looper model is similar to other music generation models such as music-inpainting-bert, cantable-diffuguesion, and looptest in its ability to generate music from text prompts. Its key differentiator, however, is its focus on producing fixed-BPM loops.

Model inputs and outputs

The musicgen-looper model takes in a text prompt describing the desired music, along with parameters that control the generation process, such as tempo, seed, and sampling settings. It outputs one or more WAV files containing the generated audio loop; a sketch of a typical API call follows the input and output lists below.

Inputs

  • Prompt: A description of the music you want to generate.
  • BPM: Tempo of the generated loop in beats per minute.
  • Seed: Seed for the random number generator. If not provided, a random seed will be used.
  • Top K: Reduces sampling to the k most likely tokens.
  • Top P: Reduces sampling to tokens with cumulative probability of p. When set to 0 (default), top_k sampling is used.
  • Temperature: Controls the "conservativeness" of the sampling process. Higher temperature means more diversity.
  • Classifier Free Guidance: Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to the inputs.
  • Max Duration: Maximum duration of the generated loop in seconds.
  • Variations: Number of variations to generate.
  • Model Version: Selects the model to use for generation.
  • Output Format: Specifies the output format for the generated audio (currently only WAV is supported).

Outputs

  • WAV file: The generated audio loop.
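
To make the interface concrete, here is a minimal sketch of calling the model through the Replicate Python client. The input keys are snake_cased versions of the parameter names listed above and `<version>` is a placeholder, so check the API spec linked at the top of the page for the exact key names and the current version id.

```python
import replicate

# Hypothetical call: input keys mirror the parameters listed above and may
# differ from the live API spec; <version> is a placeholder, not a real id.
output = replicate.run(
    "andreasjansson/musicgen-looper:<version>",
    input={
        "prompt": "melodic acid techno with a rolling bassline",
        "bpm": 130,                      # tempo of the generated loop
        "seed": -1,                      # -1 -> random seed
        "top_k": 250,                    # sample from the 250 most likely tokens
        "top_p": 0.0,                    # 0 -> fall back to top-k sampling
        "temperature": 1.0,              # higher -> more diverse output
        "classifier_free_guidance": 3,   # higher -> closer adherence to the prompt
        "max_duration": 10,              # maximum loop length in seconds
        "variations": 2,                 # number of loop variations to return
        "output_format": "wav",
    },
)
print(output)  # URL(s) pointing to the generated WAV loop(s)
```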

Capabilities

The musicgen-looper model can generate a wide variety of musical styles and textures from text prompts, from tense, dissonant strings to plucked string patterns. By adjusting the BPM, the sampling parameters (top-k, top-p, temperature), and classifier-free guidance, users can fine-tune the generated output to match their desired style and mood.

What can I use it for?

The musicgen-looper model could be useful for a variety of applications, such as:

  • Soundtrack generation: Generating background music or sound effects for videos, games, or other multimedia projects.
  • Music composition: Providing a starting point or inspiration for composers and musicians to build upon.
  • Audio manipulation: Experimenting with different prompts and parameters to create unique and interesting musical textures.

The model's ability to generate fixed-BPM loops makes it particularly well-suited for applications where a seamless, loopable audio track is required.
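
As a rough rule of thumb (not something the model documentation specifies), you can choose max_duration so that it covers a whole number of bars at the requested BPM, which makes the result easier to loop in a DAW:

```python
def bar_aligned_duration(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    """Length in seconds of `bars` bars at `bpm`, assuming `beats_per_bar` beats per bar."""
    return bars * beats_per_bar * 60.0 / bpm

# Example: 8 bars of 4/4 at 140 BPM last 8 * 4 * 60 / 140 ≈ 13.7 seconds,
# so max_duration=14 leaves the model enough room for a clean 8-bar loop.
print(round(bar_aligned_duration(140, 8), 1))  # 13.7
```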

Things to try

One interesting aspect of the musicgen-looper model is its ability to generate variations on a given prompt. By adjusting the "Variations" parameter, users can explore how the model interprets and reinterprets a prompt in different ways. This could be a useful tool for composers and musicians looking to generate a diverse set of ideas or explore the model's creative boundaries.
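
For example, a single prediction with variations set to 4 should return several related loops. The sketch below collects them; the exact shape of the output (a list of file URLs versus a mapping of variation names to URLs, plain strings versus file-like objects) is not specified here, so the normalization step is an assumption to adjust against the API spec.

```python
import urllib.request

import replicate

output = replicate.run(
    "andreasjansson/musicgen-looper:<version>",  # placeholder version id
    input={"prompt": "dub techno chords with tape hiss", "bpm": 120, "variations": 4},
)

# Normalize to a flat list, whether the model returns a list or a dict of outputs.
items = list(output.values()) if isinstance(output, dict) else list(output)
for i, item in enumerate(items):
    # str(item) yields the URL for both plain strings and Replicate file objects.
    urllib.request.urlretrieve(str(item), f"loop_variation_{i:02d}.wav")
```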

Another interesting feature is the model's use of classifier-free guidance, which pushes the generated output to adhere more closely to the input prompt. By experimenting with different guidance levels, users can find the right balance between fidelity to the prompt and variety in the output.
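
A simple way to explore this is to hold the prompt and seed fixed and sweep the guidance value. The same caveats as the earlier sketches apply: parameter names are taken from the input list above and `<version>` is a placeholder.

```python
import replicate

prompt = "tense, staccato strings with plucked dissonant accents"
for cfg in (1, 3, 6, 10):  # from loose to strict adherence to the prompt
    output = replicate.run(
        "andreasjansson/musicgen-looper:<version>",  # placeholder version id
        input={
            "prompt": prompt,
            "bpm": 90,
            "seed": 42,                       # fixed seed so only guidance changes
            "classifier_free_guidance": cfg,
        },
    )
    print(f"cfg={cfg}: {output}")
```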



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

musicgen

Maintainer: meta

Total Score: 1.7K

musicgen is a simple and controllable model for music generation developed by Meta. Unlike existing methods like MusicLM, musicgen doesn't require a self-supervised semantic representation and generates all 4 codebooks in one pass. By introducing a small delay between the codebooks, the authors show they can predict them in parallel, thus having only 50 auto-regressive steps per second of audio. musicgen was trained on 20K hours of licensed music, including an internal dataset of 10K high-quality music tracks and music data from ShutterStock and Pond5.

Model inputs and outputs

musicgen takes in a text prompt or melody and generates corresponding music. The model's inputs include a description of the desired music, an optional input audio file to influence the generated output, and various parameters to control the generation process like temperature, top-k, and top-p sampling. The output is a generated audio file in WAV format.

Inputs

  • Prompt: A description of the music you want to generate.
  • Input Audio: An optional audio file that will influence the generated music. If "continuation" is set to true, the generated music will be a continuation of the input audio. Otherwise, it will mimic the input audio's melody.
  • Duration: The duration of the generated audio in seconds.
  • Continuation Start/End: The start and end times of the input audio to use for continuation.
  • Various generation parameters: Settings like temperature, top-k, top-p, etc. to control the diversity and quality of the generated output.

Outputs

  • Generated Audio: A WAV file containing the generated music.

Capabilities

musicgen can generate a wide variety of music styles and genres based on the provided text prompt. For example, you could ask it to generate "tense, staccato strings with plucked dissonant strings, like a scary movie soundtrack" and it would produce corresponding music. The model can also continue or mimic the melody of an input audio file, allowing for more coherent and controlled music generation.

What can I use it for?

musicgen could be used for a variety of applications, such as:

  • Background music generation: Automatically generating custom music for videos, games, or other multimedia projects.
  • Music composition assistance: Helping musicians and composers come up with new musical ideas or sketches to build upon.
  • Audio creation for content creators: Allowing YouTubers, podcasters, and other content creators to easily add custom music to their projects.

Things to try

One interesting aspect of musicgen is its ability to generate music in parallel by predicting the different codebook components separately. This allows for faster generation compared to previous autoregressive music models. You could try experimenting with different generation parameters to find the right balance between generation speed, diversity, and quality for your use case. Additionally, the model's ability to continue or mimic input audio opens up possibilities for interactive music creation workflows, where users could iterate on an initial seed melody or prompt to refine the generated output.
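
As an illustration of the melody-conditioning workflow described above, here is a hedged sketch using the Replicate Python client. The input keys follow the names in this summary and may not match the live API spec exactly, and seed_melody.wav is a placeholder file.

```python
import replicate

# Hypothetical sketch: generate music that mimics the melody of a local audio file.
with open("seed_melody.wav", "rb") as melody:
    output = replicate.run(
        "meta/musicgen",
        input={
            "prompt": "tense, staccato strings with plucked dissonant strings, like a scary movie soundtrack",
            "input_audio": melody,    # melody to mimic
            "continuation": False,    # False -> mimic the melody; True -> continue the audio
            "duration": 12,           # seconds of generated audio
        },
    )
print(output)  # URL of the generated WAV file
```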

music-inpainting-bert

Maintainer: andreasjansson

Total Score: 7

The music-inpainting-bert model is a custom BERT model developed by Andreas Jansson that can jointly inpaint both melody and chords in a piece of music. This model is similar to other models created by Andreas Jansson, such as cantable-diffuguesion for Bach chorale generation and harmonization, stable-diffusion-wip for inpainting in Stable Diffusion, and clip-features for extracting CLIP features.

Model inputs and outputs

The music-inpainting-bert model takes as input beat-quantized chord labels and beat-quantized melodic patterns, and can output a completion of the melody and chords. The inputs are represented using a look-up table, where melodies are split into beat-sized chunks and quantized to 16th notes.

Inputs

  • Notes: Notes in tinynotation, with each bar separated by '|'. Use '?' for bars you want in-painted.
  • Chords: Chords (one chord per bar), with each bar separated by '|'. Use '?' for bars you want in-painted.
  • Tempo: Tempo in beats per minute.
  • Time Signature: The time signature.
  • Sample Width: The number of potential predictions to sample from. The higher the value, the more chaotic the output.
  • Seed: The random seed, with -1 for a random seed.

Outputs

  • Mp3: The generated music as an MP3 file.
  • Midi: The generated music as a MIDI file.
  • Score: The generated music as a score.

Capabilities

The music-inpainting-bert model can be used to jointly inpaint both melody and chords in a piece of music. This can be useful for tasks like music composition, where the model can be used to generate new musical content or complete partial compositions.

What can I use it for?

The music-inpainting-bert model can be used for a variety of music-related projects, such as:

  • Generating new musical compositions by providing partial input and letting the model fill in the gaps
  • Completing or extending existing musical pieces by providing a starting point and letting the model generate the rest
  • Experimenting with different musical styles and genres by providing prompts and exploring the model's outputs

Things to try

One interesting thing to try with the music-inpainting-bert model is to provide partial input with a mix of known and unknown elements, and see how the model fills in the gaps. This can be a great way to spark new musical ideas or explore different compositional possibilities.
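
To make the bar-wise '?' convention concrete, here is a rough, hypothetical sketch. The tinynotation snippet, the input key names, and the `<version>` placeholder are all assumptions to check against the model's API spec.

```python
import replicate

# Hypothetical sketch: ask the model to in-paint bars 3 and 4 of a melody and its chords.
output = replicate.run(
    "andreasjansson/music-inpainting-bert:<version>",  # placeholder version id
    input={
        "notes": "c4 d e f | g2 g2 | ? | ?",   # '?' marks bars to be in-painted
        "chords": "C | G | ? | ?",             # one chord per bar
        "tempo": 100,
        "time_signature": "4/4",
        "sample_width": 10,
        "seed": -1,
    },
)
print(output)  # expected: links to the generated MP3, MIDI, and score
```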

looptest

Maintainer: allenhung1025

Total Score: 50

The looptest model is a four-bar drum loop generation model developed by allenhung1025. It is part of a benchmarking initiative for audio-domain music generation using the FreeSound Loop Dataset, as described in a paper accepted by the International Society for Music Information Retrieval Conference 2021. The model is capable of generating drum loop samples that can be used as building blocks for music production. It is related to similar models like musicgen-choral, musicgen-remixer, musicgen, musicgen-stereo-chord, and musicgen-chord which also focus on generating various types of musical content.

Model inputs and outputs

The looptest model takes a single input, a seed value, which can be used to control the randomness of the generated output. The output is a URI pointing to the generated four-bar drum loop audio file.

Inputs

  • Seed: An integer value used to control the randomness of the generated output. Setting this to -1 will use a random seed.

Outputs

  • Output: A URI pointing to the generated four-bar drum loop audio file.

Capabilities

The looptest model is capable of generating four-bar drum loop samples that can be used as building blocks for music production. The model has been trained on the FreeSound Loop Dataset and can generate diverse and realistic-sounding drum loops.

What can I use it for?

The looptest model can be used to quickly generate drum loop samples for use in music production, sound design, or other audio-related projects. The generated loops can be used as is or can be further processed and manipulated to fit specific needs. The model can be particularly useful for producers, musicians, and sound designers who need a fast and easy way to generate drum loop ideas.

Things to try

One interesting thing to try with the looptest model is to generate a series of drum loops with different seed values and then explore how the loops vary in terms of rhythm, groove, and overall character. This can help users understand the model's capabilities and find drum loops that fit their specific musical needs.
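
The seed-sweep idea from the "Things to try" paragraph above can be scripted in a few lines. As with the other sketches, `<version>` is a placeholder and the single seed input is taken from the description here, so verify both against the API spec.

```python
import replicate

# Hypothetical sketch: generate five drum loops from different seeds and print their URIs.
for seed in range(5):
    output = replicate.run(
        "allenhung1025/looptest:<version>",  # placeholder version id
        input={"seed": seed},
    )
    print(f"seed={seed}: {output}")
```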

clip-features

Maintainer: andreasjansson

Total Score: 55.9K

The clip-features model, developed by Replicate creator andreasjansson, is a Cog model that outputs CLIP features for text and images. This model builds on the powerful CLIP architecture, which was developed by researchers at OpenAI to learn about robustness in computer vision tasks and test the ability of models to generalize to arbitrary image classification in a zero-shot manner. Similar models like blip-2 and clip-embeddings also leverage CLIP capabilities for tasks like answering questions about images and generating text and image embeddings.

Model inputs and outputs

The clip-features model takes a set of newline-separated inputs, which can either be strings of text or image URIs starting with http[s]://. The model then outputs an array of named embeddings, where each embedding corresponds to one of the input entries.

Inputs

  • Inputs: Newline-separated inputs, which can be strings of text or image URIs starting with http[s]://.

Outputs

  • Output: An array of named embeddings, where each embedding corresponds to one of the input entries.

Capabilities

The clip-features model can be used to generate CLIP features for text and images, which can be useful for a variety of downstream tasks like image classification, retrieval, and visual question answering. By leveraging the powerful CLIP architecture, this model can enable researchers and developers to explore zero-shot and few-shot learning approaches for their computer vision applications.

What can I use it for?

The clip-features model can be used in a variety of applications that involve understanding the relationship between images and text. For example, you could use it to:

  • Perform image-text similarity search, where you can find the most relevant images for a given text query, or vice versa.
  • Implement zero-shot image classification, where you can classify images into categories without any labeled training data.
  • Develop multimodal applications that combine vision and language, such as visual question answering or image captioning.

Things to try

One interesting aspect of the clip-features model is its ability to generate embeddings that capture the semantic relationship between text and images. You could try using these embeddings to explore the similarities and differences between various text and image pairs, or to build applications that leverage this cross-modal understanding. For example, you could calculate the cosine similarity between the embeddings of different text inputs and the embedding of a given image, as demonstrated in the provided example code. This could be useful for tasks like image-text retrieval or for understanding the model's perception of the relationship between visual and textual concepts.
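
Below is a rough sketch of that cosine-similarity comparison. The output is assumed to be a list of records with "input" and "embedding" fields (matching the "array of named embeddings" described above), the image URL is a placeholder, and `<version>` stands in for the current version id.

```python
import numpy as np
import replicate

inputs = "\n".join([
    "a photo of a cat",
    "a photo of a dog",
    "https://example.com/cat.jpg",   # placeholder image URL
])

# Hypothetical call: one embedding is returned per newline-separated input.
records = replicate.run("andreasjansson/clip-features:<version>", input={"inputs": inputs})
embeddings = {r["input"]: np.asarray(r["embedding"]) for r in records}

image_vec = embeddings["https://example.com/cat.jpg"]
for text in ("a photo of a cat", "a photo of a dog"):
    text_vec = embeddings[text]
    similarity = float(text_vec @ image_vec / (np.linalg.norm(text_vec) * np.linalg.norm(image_vec)))
    print(f"{text}: cosine similarity {similarity:.3f}")
```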
