The dance-diffusion model is a set of tools developed by the Harmonai team to train a generative model on arbitrary audio samples. It builds on similar models such as Stable Diffusion, Audio-LDM, and MusicGen, which explore text-to-image, text-to-audio, and music generation using diffusion models.

## Model inputs and outputs

The dance-diffusion model lets you generate audio samples by specifying a few key parameters: the number of diffusion steps, the length of the generated audio, and the batch size. The generated output is a URI that you can use to access the created audio file.

### Inputs

- **Steps**: The number of steps to use in the generation process, up to a maximum of 150.
- **Length**: The desired length of the generated audio in seconds.
- **Batch Size**: The number of samples to generate at once.
- **Model Name**: The specific model to use; the default is "maestro-150k".

### Outputs

- **Output**: A URI pointing to the generated audio file.

## Capabilities

The dance-diffusion model can be used to generate a wide variety of audio samples, from music to sound effects and more. Because it trains on arbitrary audio data, the model can learn to generate unique and creative outputs that capture the essence of the training data.

## What can I use it for?

The dance-diffusion model could be useful for a range of applications, such as music production, audio design for games and films, or generative art installations. The ability to fine-tune the model on specific audio samples also opens up possibilities for personalized content creation or data augmentation for machine learning tasks.

## Things to try

One interesting aspect of the dance-diffusion model is its potential for audio interpolation and transformation. By adjusting the model inputs, you could experiment with generating audio that smoothly transitions between different styles or sound sources, or even create hybrid audio outputs that blend characteristics from multiple samples.
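The inputs listed above can be assembled into a request payload before calling the model. The sketch below is illustrative only: the parameter names, the 150-step cap, and the "maestro-150k" default come from the description above, while the `build_inputs` helper and its default values are assumptions, not part of the model's own API.

```python
def build_inputs(steps=100, length=8, batch_size=1, model_name="maestro-150k"):
    """Validate and assemble an input payload for a generation request.

    Hypothetical helper: only the field names, the 150-step maximum,
    and the default model name are taken from the model description.
    """
    if not 1 <= steps <= 150:
        raise ValueError("steps must be between 1 and 150")
    if length <= 0:
        raise ValueError("length (in seconds) must be positive")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "steps": steps,
        "length": length,
        "batch_size": batch_size,
        "model_name": model_name,
    }

# Example: request a single 10-second sample with the default model.
payload = build_inputs(steps=120, length=10)
```

A payload like this would then be passed to whatever client you use to run the model; the returned output is a URI pointing to the generated audio file.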
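The idea of smoothly transitioning between generated samples can be prototyped outside the model itself. The sketch below shows one generic post-processing approach, a linear crossfade between two waveforms; it is not part of dance-diffusion, just a minimal illustration of blending two audio arrays you might have generated.

```python
import numpy as np

def crossfade(a, b, fade_samples):
    """Join waveform `a` into waveform `b` with a linear crossfade.

    `a` and `b` are 1-D sample arrays; `fade_samples` is the length of
    the overlapping region where `a` fades out and `b` fades in.
    """
    fade_out = np.linspace(1.0, 0.0, fade_samples)
    fade_in = 1.0 - fade_out
    blended = a[-fade_samples:] * fade_out + b[:fade_samples] * fade_in
    return np.concatenate([a[:-fade_samples], blended, b[fade_samples:]])
```

For example, crossfading the end of one generated clip into the start of another yields a single clip whose length is the sum of both minus the overlap.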


Updated 5/21/2024