[](#ootdiffusion)OOTDiffusion
=============================

[Our OOTDiffusion GitHub repository](https://github.com/levihsu/OOTDiffusion)

 [Try out OOTDiffusion](https://huggingface.co/spaces/levihsu/OOTDiffusion)

(Thanks to [ZeroGPU](https://huggingface.co/zero-gpu-explorers) for providing A100 GPUs)

> **OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on** \[[arXiv paper](https://arxiv.org/abs/2403.01779)\]  
> [Yuhao Xu](http://levihsu.github.io/), [Tao Gu](https://github.com/T-Gu), [Weifeng Chen](https://github.com/ShineChen1024), [Chengcai Chen](https://www.researchgate.net/profile/Chengcai-Chen)  
> Xiao-i Research

Our model checkpoints trained on [VITON-HD](https://github.com/shadow2496/VITON-HD) (half-body) and [Dress Code](https://github.com/aimagelab/dress-code) (full-body) have been released

*    We support ONNX for [humanparsing](https://github.com/GoGoDuck912/Self-Correction-Human-Parsing) now. Most environmental issues should have been addressed : )
*   Please also download [clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14) into _**checkpoints**_ folder
*   We've only tested our code and models on Linux (Ubuntu 22.04)

[![demo](/levihsu/OOTDiffusion/resolve/main/images/demo.png)](/levihsu/OOTDiffusion/blob/main/images/demo.png) [![workflow](/levihsu/OOTDiffusion/resolve/main/images/workflow.png)](/levihsu/OOTDiffusion/blob/main/images/workflow.png)

[](#citation)Citation
---------------------

    @article{xu2024ootdiffusion,
      title={OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on},
      author={Xu, Yuhao and Gu, Tao and Chen, Weifeng and Chen, Chengcai},
      journal={arXiv preprint arXiv:2403.01779},
      year={2024}
    }

## Model overview

The `OOTDiffusion` model is a powerful image-to-image AI model developed by Yuhao Xu, Tao Gu, Weifeng Chen, and Chengcai Chen from Xiao-i Research. It is built on top of the [Latent Diffusion](https://arxiv.org/abs/2112.10752) architecture and aims to enable controllable virtual try-on applications. The model is similar to other diffusion-based text-to-image generation models like [Stable Diffusion](https://aimodels.fyi/models/huggingFace/stable-diffusion-v1-5-runwayml), but it has been specifically optimized for the task of clothing transfer and virtual try-on.

## Model inputs and outputs

### Inputs
- **Clothing Image**: An image of the clothing item that the user wants to try on.
- **Person Image**: An image of the person who will be wearing the clothing.
- **Semantic Map**: A segmentation map that provides information about the different parts of the person's body.

### Outputs
- **Composite Image**: An image that shows the person wearing the clothing item, with the clothing seamlessly integrated into the image.

## Capabilities

The `OOTDiffusion` model is capable of generating high-quality composite images that show a person wearing a clothing item, even in cases where the clothing and person images were not originally aligned. The model is able to handle a variety of clothing types and styles, and can generate realistic-looking results that take into account the person's body shape and pose.

## What can I use it for?

The `OOTDiffusion` model is well-suited for applications that involve virtual try-on, such as online clothing stores or fashion design tools. By allowing users to see how a particular clothing item would look on them, the model can help improve the shopping experience and reduce the number of returns. Additionally, the model could be used in the fashion industry for prototyping and design purposes, allowing designers to quickly visualize how their creations would look on different body types.

## Things to try

One interesting thing to try with the `OOTDiffusion` model is to experiment with different clothing styles and body types. By providing the model with a diverse set of inputs, you can see how it handles different scenarios and generates unique composite images. Additionally, you could try incorporating the model into a larger system or application, such as an e-commerce platform or a design tool, to see how it performs in a real-world setting.