control-lora

Maintainer: stabilityai

Total Score: 794

Last updated 5/17/2024

  • Model Link: View on HuggingFace
  • API Spec: View on HuggingFace
  • Github Link: No Github link provided
  • Paper Link: No paper link provided

Model overview

The control-lora model, developed by Stability AI, is a set of low-rank adaptation (LoRA) models that add image-based control to Stable Diffusion XL in a more efficient and compact form, making controlled generation practical on a wider range of consumer GPUs. These Control-LoRA models were created by applying low-rank, parameter-efficient fine-tuning to ControlNet, a popular image control model. The Control-LoRA models come in two variants, Rank 256 and Rank 128, which shrink the original 4.7GB ControlNet models down to around 738MB and 377MB, respectively. They have been trained on a diverse range of image concepts and aspect ratios, making them versatile for various image generation tasks.
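To see why halving the rank roughly halves the adapter size, here is a minimal Python sketch of the parameter count of a low-rank update; the layer dimensions are hypothetical and not taken from the actual ControlNet architecture:

```python
# A rank-r LoRA update replaces a full d x k weight delta (d * k parameters)
# with two factors A (d x r) and B (r x k), i.e. r * (d + k) parameters.
def update_params(d: int, k: int, rank: int | None = None) -> int:
    if rank is None:           # full update of this layer
        return d * k
    return rank * (d + k)      # low-rank (LoRA) update

d, k = 1280, 1280              # hypothetical layer shape, for illustration only
print(update_params(d, k), update_params(d, k, rank=256), update_params(d, k, rank=128))
# 1638400 655360 327680 -> rank 128 is half the size of rank 256,
# mirroring the ~377MB vs ~738MB checkpoints mentioned above.
```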

The Control-LoRA models include several specialized variants, such as the MiDaS and ClipDrop Depth, Canny Edge, Photograph and Sketch Colorizer, and Revision models. These variants leverage different image processing techniques like depth estimation, edge detection, and CLIP embeddings to guide the image generation process.
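For example, the Canny Edge variant expects an edge map as its conditioning image. A minimal sketch of building one with OpenCV might look like the following; the thresholds are common defaults, not values prescribed by the Control-LoRA release:

```python
import cv2
import numpy as np
from PIL import Image

# Turn an ordinary photo into a Canny edge map to use as the control image.
image = np.array(Image.open("input.jpg").convert("RGB"))
edges = cv2.Canny(image, 100, 200)          # low/high thresholds; tune per image
edges = np.stack([edges] * 3, axis=-1)      # replicate to 3 channels for the pipeline
Image.fromarray(edges).save("canny_condition.png")
```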

Similar models include the sdxl-controlnet-lora, lcm-lora-sdxl, sdxl-controlnet, sdxl-controlnet-depth, and the stable-diffusion-xl-refiner-1.0 models, all of which explore different approaches to incorporating control and refinement into Stable Diffusion models.

Model inputs and outputs

Inputs

  • Image: The Control-LoRA models accept various types of input images, such as depth maps, edge maps, and sketches, to guide the image generation process.
  • Text prompt: The models can be conditioned on text prompts to generate images that match the specified concepts.

Outputs

  • Generated image: The primary output of the Control-LoRA models is a generated image that reflects the input image and text prompt.

Capabilities

The Control-LoRA models excel at generating images whose visual structure is controlled by the input image. For example, the Depth-based variant generates images guided by a grayscale depth map, which encodes variations in proximity. The Canny Edge variant uses the edges extracted from an image to structure the final output. The Colorizer variants can colorize black-and-white photographs as well as sketches. The Revision model uses CLIP embeddings to produce images conceptually similar to the input, allowing multiple image and text prompts to be blended.
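As a sketch of how a depth conditioning image could be produced, the snippet below uses the public MiDaS models via torch.hub; the exact preprocessing Stability AI used for the depth variant may differ, so treat this as an approximation:

```python
import numpy as np
import torch
from PIL import Image

# Load a small MiDaS depth model and its matching input transform.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform
midas.eval()

img = np.array(Image.open("input.jpg").convert("RGB"))
with torch.no_grad():
    depth = midas(transform(img)).squeeze().cpu().numpy()

# Normalize to an 8-bit grayscale map (resize to the input resolution if needed).
depth = (255 * (depth - depth.min()) / (depth.max() - depth.min())).astype("uint8")
Image.fromarray(depth).save("depth_condition.png")
```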

What can I use it for?

The Control-LoRA models can be particularly useful for applications that require fine-grained control over the image generation process, such as design, creative tools, and research on generative models. The compact model size and efficient inference also make these models suitable for deployment on a wider range of consumer GPUs, expanding the accessibility of advanced image generation capabilities.

Things to try

One interesting aspect of the Control-LoRA models is their ability to be combined with other LoRA adapters, such as the Papercut LoRA, to generate styled images in just a few inference steps. This opens up possibilities for exploring the synergies between different control mechanisms and stylization techniques in a computationally efficient way.
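A minimal sketch of that adapter-combining pattern with diffusers is shown below, pairing the LCM-LoRA speed adapter with the Papercut style LoRA; the repository and file names follow the public diffusers LoRA guide and may change, and note that the Control-LoRA control weights themselves ship in a ControlNet-style format that is typically loaded through ControlNet tooling rather than this path:

```python
import torch
from diffusers import LCMScheduler, StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Load two adapters and blend them: LCM-LoRA for few-step sampling,
# Papercut for the visual style.
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl", adapter_name="lcm")
pipe.load_lora_weights(
    "TheLastBen/Papercut_SDXL", weight_name="papercut.safetensors", adapter_name="papercut"
)
pipe.set_adapters(["lcm", "papercut"], adapter_weights=[1.0, 0.8])

image = pipe(
    "papercut, a fox in a forest",
    num_inference_steps=4,      # few steps thanks to the LCM adapter
    guidance_scale=1.0,
).images[0]
image.save("papercut_fox.png")
```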

Additionally, the Control-LoRA models can be used in conjunction with ControlNet and other image-to-image techniques, as demonstrated in the examples provided. Experimenting with different input images, prompts, and inference parameters can lead to a wide range of creative and novel outputs.
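For the ControlNet route, here is a hedged sketch with diffusers and a standard SDXL Canny ControlNet; it illustrates the general controlled-generation workflow rather than the Control-LoRA checkpoints themselves, which are usually loaded through tools such as ComfyUI:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

canny = load_image("canny_condition.png")   # e.g. the edge map built earlier
image = pipe(
    "a futuristic greenhouse city at dusk, highly detailed",
    image=canny,
    controlnet_conditioning_scale=0.7,      # how strongly the edges steer the result
    num_inference_steps=30,
).images[0]
image.save("controlled_output.png")
```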




Related Models

sd-controlnet-canny

Maintainer: lllyasviel

Total Score: 146

The sd-controlnet-canny model is a version of the ControlNet neural network structure developed by Lvmin Zhang and Maneesh Agrawala. ControlNet is designed to add extra conditional control to large diffusion models like Stable Diffusion. This particular checkpoint is trained to condition the diffusion model on Canny edge detection. Similar models include controlnet-canny-sdxl-1.0, which is a ControlNet trained on the Stable Diffusion XL base model, and control_v11p_sd15_openpose, which uses OpenPose pose detection as the conditioning input.

Model inputs and outputs

Inputs

  • Image: The ControlNet model takes an image as input, which is used to condition the Stable Diffusion text-to-image generation.

Outputs

  • Generated image: The output of the pipeline is a generated image that combines the text prompt with the Canny edge conditioning provided by the input image.

Capabilities

The sd-controlnet-canny model can be used to generate images that are guided by the edge information in the input image. This allows for more precise control over the generated output compared to using Stable Diffusion alone. By providing a Canny edge map, you can influence the placement and structure of elements in the final image.

What can I use it for?

The sd-controlnet-canny model can be useful for a variety of applications that require more controlled text-to-image generation, such as product visualization, architectural design, and technical illustration. The edge conditioning can help ensure the generated images adhere to specific structural requirements.

Things to try

One interesting aspect of the sd-controlnet-canny model is the ability to experiment with different levels of conditioning strength. By adjusting the controlnet_conditioning_scale parameter, you can find the right balance between the text prompt and the Canny edge input, fine-tuning the generation process to your specific needs. Additionally, you can try using the model in combination with other ControlNet checkpoints, such as those trained on depth estimation or segmentation, to layer multiple conditioning inputs and create even more precise and tailored text-to-image generations.
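A minimal sketch of that workflow with diffusers is shown below; the SD 1.5 base checkpoint ID follows the original model card and may have moved on the Hub, and lowering controlnet_conditioning_scale weakens the edge guidance relative to the prompt:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

canny_map = load_image("canny_condition.png")
image = pipe(
    "bird perched on a branch, detailed photograph",
    image=canny_map,
    controlnet_conditioning_scale=1.0,   # try 0.5-1.0 to balance prompt vs. edges
    num_inference_steps=20,
).images[0]
image.save("bird.png")
```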

lora-training

Maintainer: khanon

Total Score: 95

The lora-training model is a collection of various LoRA (Low-Rank Adaptation) models trained by maintainer khanon on characters from the mobile game Blue Archive. LoRA is a technique for fine-tuning large generative models such as Stable Diffusion in an efficient and effective way. This model library includes LoRAs for characters like Arona, Chise, Fubuki, and more. The preview images demonstrate the inherent style of each LoRA, generated using ControlNet with an OpenPose input.

Model inputs and outputs

Inputs

  • Images of characters from the mobile game Blue Archive

Outputs

  • Stylized, high-quality images of the characters based on the specific LoRA model used

Capabilities

The lora-training model allows users to generate stylized, character-focused images based on the LoRA models provided. Each LoRA has its own unique artistic style, allowing for a range of outputs. The maintainer has provided sample images to showcase the capabilities of each model.

What can I use it for?

The lora-training model can be used to create custom, stylized images of Blue Archive characters for a variety of purposes, such as fan art, character illustrations, or even asset creation for games or other digital projects. The LoRA models can be easily integrated into tools like Stable Diffusion to generate new images or modify existing ones.

Things to try

Experiment with different LoRA models to see how they affect the output. Try combining multiple LoRAs or using them in conjunction with other image generation techniques like ControlNet. Explore how the prompts and settings affect the final image, and see if you can push the boundaries of what's possible with these character-focused LoRAs.
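As a rough sketch of that integration, loading one of these character LoRAs into a Stable Diffusion 1.5 pipeline with diffusers could look like the following; the checkpoint file name is hypothetical, and the base model choice is only illustrative (anime-focused checkpoints are commonly used instead):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical local file name for one of the downloaded character LoRAs.
pipe.load_lora_weights(".", weight_name="arona-blue-archive.safetensors")

image = pipe(
    "arona, 1girl, blue archive, detailed illustration",
    num_inference_steps=25,
).images[0]
image.save("arona.png")
```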

stable-diffusion

Maintainer: stability-ai

Total Score: 107.9K

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. Developed by Stability AI, it can create detailed visuals from simple text prompts. The model has several versions, with each newer version trained for longer and producing higher-quality images than the previous ones. The main advantage of Stable Diffusion is its ability to generate highly detailed and realistic images from a wide range of textual descriptions, which makes it a powerful tool for creative applications. The model has been trained on a large and diverse dataset, enabling it to handle a broad spectrum of subjects and styles.

Model inputs and outputs

Inputs

  • Prompt: The text prompt that describes the desired image. This can be a simple description or a more detailed, creative prompt.
  • Seed: An optional random seed value to control the randomness of the image generation process.
  • Width and Height: The desired dimensions of the generated image, which must be multiples of 64.
  • Scheduler: The algorithm used to generate the image, with options like DPMSolverMultistep.
  • Num Outputs: The number of images to generate (up to 4).
  • Guidance Scale: The scale for classifier-free guidance, which controls the trade-off between image quality and faithfulness to the input prompt.
  • Negative Prompt: Text that specifies things the model should avoid including in the generated image.
  • Num Inference Steps: The number of denoising steps to perform during the image generation process.

Outputs

  • Array of image URLs: The generated images are returned as an array of URLs pointing to the created images.

Capabilities

Stable Diffusion is capable of generating a wide variety of photorealistic images from text prompts. It can create images of people, animals, landscapes, architecture, and more, with a high level of detail and accuracy. The model is particularly skilled at rendering complex scenes and capturing the essence of the input prompt. One of its key strengths is its ability to handle diverse prompts, from simple descriptions to more creative and imaginative ideas: it can generate images of fantastical creatures, surreal landscapes, and even abstract concepts with impressive results.

What can I use it for?

Stable Diffusion can be used for a variety of creative applications, such as:

  • Visualizing ideas and concepts for art, design, or storytelling
  • Generating images for use in marketing, advertising, or social media
  • Aiding in the development of games, movies, or other visual media
  • Exploring and experimenting with new ideas and artistic styles

The model's versatility and high-quality output make it a valuable tool for anyone looking to bring their ideas to life through visual art. By combining the power of AI with human creativity, Stable Diffusion opens up new possibilities for visual expression and innovation.

Things to try

One interesting aspect of Stable Diffusion is its ability to generate images with a high level of detail and realism. Users can experiment with prompts that combine specific elements, such as "a steam-powered robot exploring a lush, alien jungle," to see how the model handles complex and imaginative scenes. Additionally, support for different image sizes and resolutions lets users explore the limits of its capabilities: by generating images at various scales, you can see how the model handles the level of detail required for different use cases, such as high-resolution artwork or smaller social media graphics.
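A hedged sketch of calling this model through the Replicate Python client is shown below; the input names follow the list above, but the accepted values and defaults are defined by whichever model version is current, so you may need to pin an explicit version hash:

```python
import replicate

output = replicate.run(
    "stability-ai/stable-diffusion",
    input={
        "prompt": "a steam-powered robot exploring a lush, alien jungle",
        "width": 768,
        "height": 512,
        "num_outputs": 1,
        "scheduler": "DPMSolverMultistep",
        "guidance_scale": 7.5,
        "negative_prompt": "blurry, low quality",
        "num_inference_steps": 50,
    },
)
print(output)   # a list of URLs pointing to the generated images
```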

sdxl-controlnet-lora

Maintainer: batouresearch

Total Score: 412

The sdxl-controlnet-lora model is an implementation of Stability AI's SDXL text-to-image model with support for ControlNet and Replicate's LoRA technology. This model is developed and maintained by batouresearch, and is similar to other SDXL-based models like instant-id-multicontrolnet and sdxl-lightning-4step. The key difference is the addition of ControlNet, which allows the model to generate images based on a provided control image, such as a Canny edge map.

Model inputs and outputs

The sdxl-controlnet-lora model takes a text prompt, an optional input image, and various settings as inputs. It outputs one or more generated images based on the provided prompt and settings.

Inputs

  • Prompt: The text prompt describing the image to generate.
  • Image: An optional input image to use as a control or base image for the generation process.
  • Seed: A random seed value to use for generation.
  • Img2Img: A flag to enable the img2img generation pipeline, which uses the input image as both the control and base image.
  • Strength: The strength of the img2img denoising process, ranging from 0 to 1.
  • Negative Prompt: An optional negative prompt to guide the generation away from certain undesired elements.
  • Num Inference Steps: The number of denoising steps to take during the generation process.
  • Guidance Scale: The scale for classifier-free guidance, which controls the influence of the text prompt on the generated image.
  • Scheduler: The scheduler algorithm to use for the generation process.
  • LoRA Scale: The additive scale for the LoRA weights, which can be used to fine-tune the model's behavior.
  • LoRA Weights: The URL of the Replicate LoRA weights to use for the generation.

Outputs

  • Generated Images: One or more images generated based on the provided inputs.

Capabilities

The sdxl-controlnet-lora model is capable of generating high-quality, photorealistic images based on text prompts. The addition of ControlNet support allows the model to generate images based on a provided control image, such as a Canny edge map, enabling more precise control over the generated output. The LoRA technology further enhances the model's flexibility by allowing for easy fine-tuning and customization.

What can I use it for?

The sdxl-controlnet-lora model can be used for a variety of image generation tasks, such as creating concept art, product visualizations, or custom illustrations. The ability to use a control image can be particularly useful for tasks like image inpainting, where the model can generate content to fill in missing or damaged areas of an image. Additionally, the fine-tuning capabilities enabled by LoRA can make the model well-suited for specialized applications or personalized use cases.

Things to try

One interesting thing to try with the sdxl-controlnet-lora model is experimenting with different control images and LoRA weight sets to see how they affect the generated output. You could, for example, try using a Canny edge map, a depth map, or a segmentation mask as the control image, and see how the model's interpretation of the prompt changes. Additionally, you could explore using LoRA to fine-tune the model for specific styles or subject matter, and see how that impacts the generated images.
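A hedged sketch of invoking the model via the Replicate Python client is shown below; the input keys are inferred from the list above, the LoRA weights URL is a hypothetical placeholder, and the exact schema is defined by the model's current version on Replicate:

```python
import replicate

output = replicate.run(
    "batouresearch/sdxl-controlnet-lora",
    input={
        "prompt": "an ornate art-nouveau greenhouse at golden hour",
        "image": open("canny_condition.png", "rb"),   # control image (e.g. Canny edges)
        "img2img": False,
        "guidance_scale": 7.0,
        "num_inference_steps": 30,
        "lora_scale": 0.8,
        # Hypothetical placeholder; point this at your own Replicate LoRA weights.
        "lora_weights": "https://example.com/my-lora.tar",
    },
)
print(output)   # generated image URL(s)
```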
