Stability AI

Rank:

Average Model Cost: $0.0000

Number of Runs: 2,792,802

Models by this creator

sd-vae-ft-mse

The sd-vae-ft-mse model is an improved autoencoder that has been fine-tuned for image reconstruction tasks. It is a variant of the kl-f8 autoencoder that has been trained on a combination of the LAION-Aesthetics and LAION-Humans datasets. The model has two versions: ft-EMA and ft-MSE. ft-EMA uses exponential moving average weights and the same loss configuration as the original checkpoint, while ft-MSE uses MSE reconstruction loss with a small LPIPS loss. Both versions have been evaluated on the COCO 2017 and LAION-Aesthetics datasets and have shown improved performance compared to the original autoencoder. The fine-tuned models can be used as drop-in replacements for the existing autoencoder.

$-/run

902.5K

Huggingface

stable-diffusion-2-inpainting

The stable-diffusion-2-inpainting model is a diffusion-based text-to-image model specialized for inpainting: it regenerates masked regions of an image based on a text prompt. It is a Latent Diffusion Model that uses a fixed, pretrained text encoder. Trained on a large-scale dataset, it inherits that dataset's limitations and biases, and it is intended for research purposes only; it should not be used to create harmful or offensive content. Several checkpoints are provided for different purposes, and the model's environmental impact in terms of CO2 emissions has been estimated as well.

$-/run

184.3K

Huggingface

sd-vae-ft-ema

The sd-vae-ft-ema model is a fine-tuned VAE (Variational Autoencoder) decoder that is intended to be used with the diffusers library. It is a variant of the kl-f8 autoencoder that has been trained on a dataset containing images of humans to improve the reconstruction of faces. The model comes in two versions: ft-EMA and ft-MSE. The ft-EMA version was trained for 313198 steps with EMA (Exponential Moving Average) weights and uses the same loss configuration as the original checkpoint. The ft-MSE version was trained for an additional 280k steps using a different loss that emphasizes MSE (Mean Squared Error) reconstruction. Both versions only fine-tune the decoder part and can be used as a replacement for the existing autoencoder. The model has been evaluated on the COCO 2017 and LAION-Aesthetics datasets, and visualizations of reconstructions on 256x256 images are provided.

$-/run

106.6K

Huggingface

stable-diffusion-x4-upscaler

The Stable Diffusion x4 Upscaler model is a text-guided latent upscaling diffusion model. It is trained on a subset of the LAION dataset and can be used to generate and modify high-resolution images based on text prompts. The model takes both textual input and a noise level parameter as inputs. It has limitations in achieving perfect photorealism, rendering legible text, and generating complex compositions. It may also exhibit biases and limitations when used with non-English prompts. The model is intended for research purposes only and should not be used to generate harmful or offensive content. It has been trained with a focus on safety and has been filtered for explicit and inappropriate material.

$-/run

103.7K

Huggingface

sdxl-vae

sdxl-vae is the improved VAE (Variational Autoencoder) used by Stable Diffusion XL. A VAE is a neural network that encodes high-dimensional data, such as images, into a lower-dimensional latent space and decodes it back. In a latent diffusion model, the VAE maps images into the latent space where diffusion runs and decodes the generated latents back into pixels. sdxl-vae keeps the same architecture as the earlier kl-f8 autoencoders but achieves better reconstruction quality, and it can be used as a drop-in replacement VAE in SDXL-based pipelines.

$-/run

97.9K

Huggingface

Similar creators