supir

Maintainer: cjwbw

Total Score: 101

Last updated: 6/19/2024
  • Model Link: View on Replicate
  • API Spec: View on Replicate
  • Github Link: View on Github
  • Paper Link: View on Arxiv


Model overview

supir is an image restoration model focused on practicing model scaling for photo-realistic image restoration in the wild. It is developed by cjwbw and leverages the LLaVA-13b model for captioning. supir can produce high-quality, photo-realistic images well suited to a variety of applications, such as photo editing, digital art, and visual content creation.

Model inputs and outputs

supir takes in a low-quality input image and a set of parameters to generate a high-quality, restored image. The model can handle various types of image degradation, including noise, blur, and compression artifacts, and can produce results with impressive detail and fidelity.

Inputs

  • Image: A low-quality input image to be restored.
  • Seed: A random seed to control the stochastic behavior of the model.
  • S Cfg: The classifier-free guidance scale, which controls the trade-off between sample fidelity and sample diversity.
  • S Churn: The churn hyper-parameter of the EDM sampling scheduler.
  • S Noise: The noise hyper-parameter of the EDM sampling scheduler.
  • Upscale: The upsampling ratio to be applied to the input image.
  • A Prompt: A positive prompt that describes the desired characteristics of the output image.
  • N Prompt: A negative prompt that describes characteristics to be avoided in the output image.
  • Min Size: The minimum resolution of the output image.
  • Edm Steps: The number of steps for the EDM sampling scheduler.
  • Use Llava: A boolean flag to determine whether to use the LLaVA-13b model for captioning.
  • Color Fix Type: The type of color correction to be applied to the output image.
  • Linear Cfg: A boolean flag to control the linear increase of the classifier-free guidance scale.
  • Linear S Stage2: A boolean flag to control the linear increase of the strength of the second stage of the model.
  • Spt Linear Cfg: The starting point for the linear increase of the classifier-free guidance scale.
  • Spt Linear S Stage2: The starting point for the linear increase of the strength of the second stage.

Outputs

  • Output: A high-quality, photo-realistic image generated by the supir model.
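As a rough sketch of how these inputs might be assembled for a prediction call: the snake_case key names, the default values, and the `cjwbw/supir:<version-hash>` model string below are assumptions inferred from the parameter list above, not confirmed by the source.

```python
# Hypothetical helper for building a supir input payload.
# Key names and defaults are guesses based on the documented parameters.

def build_supir_input(image_url, upscale=2, seed=None, **overrides):
    """Assemble the input payload for a supir prediction (sketch)."""
    payload = {
        "image": image_url,           # low-quality image to restore
        "upscale": upscale,           # upsampling ratio
        "s_cfg": 7.5,                 # classifier-free guidance scale (assumed default)
        "edm_steps": 50,              # EDM sampler steps (assumed default)
        "use_llava": True,            # caption the input with LLaVA-13b
        "color_fix_type": "Wavelet",  # color correction mode
    }
    if seed is not None:
        payload["seed"] = seed        # fix the random seed for reproducibility
    payload.update(overrides)         # e.g. a_prompt, n_prompt, min_size
    return payload

# A prediction could then be run with the Replicate Python client
# (requires the `replicate` package and a REPLICATE_API_TOKEN):
#
#   import replicate
#   output = replicate.run(
#       "cjwbw/supir:<version-hash>",   # placeholder, not a real version id
#       input=build_supir_input("https://example.com/blurry.png"),
#   )
```

Any parameter from the list above that is not covered by a keyword argument can be passed through `overrides`.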

Capabilities

supir is capable of generating high-quality, photo-realistic images from low-quality inputs. The model can handle a wide range of image degradation and can produce results with impressive detail and fidelity. Additionally, supir leverages the LLaVA-13b model for captioning, which can provide useful information about the generated images.

What can I use it for?

supir can be used for a variety of applications, such as photo editing, digital art, and visual content creation. The model's ability to restore low-quality images and produce high-quality, photo-realistic results makes it well-suited for tasks like repairing old photographs, enhancing low-resolution images, and creating high-quality visuals for various media. Additionally, the model's captioning capabilities can be useful for tasks like image annotation and description.

Things to try

One interesting aspect of supir is its ability to handle different types of image degradation. You can experiment with the model's performance by trying different input images with varying levels of noise, blur, and compression artifacts. Additionally, you can play with the various model parameters, such as the classifier-free guidance scale and the strength of the second stage, to see how they affect the output quality and fidelity.
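The linear guidance options read as a simple per-step schedule: when Linear Cfg is enabled, the guidance scale ramps from Spt Linear Cfg at the first step to S Cfg at the last. A minimal sketch of that reading of the parameter descriptions (not code taken from the model itself):

```python
def cfg_schedule(s_cfg, edm_steps, linear_cfg=False, spt_linear_cfg=1.0):
    """Guidance scale per sampling step (sketch).

    With linear_cfg, interpolate from spt_linear_cfg at step 0 up to
    s_cfg at the final step; otherwise hold s_cfg constant throughout.
    """
    if not linear_cfg or edm_steps == 1:
        return [s_cfg] * edm_steps
    return [
        spt_linear_cfg + (s_cfg - spt_linear_cfg) * i / (edm_steps - 1)
        for i in range(edm_steps)
    ]

constant = cfg_schedule(7.5, 4)  # [7.5, 7.5, 7.5, 7.5]
ramped = cfg_schedule(7.5, 4, linear_cfg=True, spt_linear_cfg=4.5)
# ramps 4.5 -> 7.5 across the 4 steps: [4.5, 5.5, 6.5, 7.5]
```

Starting the ramp low keeps early denoising steps close to the degraded input, with prompt guidance strengthening as the image sharpens.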



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

supir-v0f

Maintainer: cjwbw

Total Score: 7

The supir-v0f model is part of the SUPIR (Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild) family of models developed by cjwbw. Unlike the SUPIR model, which uses the LLaVA-13b language model for captioning, supir-v0f does not incorporate LLaVA-13b. It can be contrasted with the SUPIR-v0Q model, which uses default training settings and aims for high generalization and high image quality in most cases; supir-v0f is trained with light-degradation settings, and its Stage1 encoder retains more detail when facing light degradations.

Model inputs and outputs

The supir-v0f model takes a low-quality input image and upscales it to a higher resolution while restoring it to be photo-realistic. The model is designed to handle a variety of input degradations, such as low resolution, noise, and JPEG artifacts, and can produce high-quality, detailed, and color-corrected output images.

Inputs

  • Image: A low-quality input image to be restored and upscaled.
  • Upscale: The ratio by which the input image should be upscaled.
  • S Cfg: The classifier-free guidance scale for the prompts used to guide the restoration process.
  • S Churn: The churn hyper-parameter of the EDM sampling scheduler.
  • S Noise: The noise hyper-parameter of the EDM sampling scheduler.
  • A Prompt: An additive positive prompt to guide the restoration process.
  • N Prompt: A fixed negative prompt to guide the restoration process.
  • S Stage1: The control strength of the first stage of the restoration process.
  • S Stage2: The control strength of the second stage of the restoration process.
  • Edm Steps: The number of steps for the EDM sampling scheduler.
  • Color Fix Type: The type of color correction to apply to the output image.
  • Linear Cfg: Whether to linearly increase the classifier-free guidance scale during restoration.
  • Linear S Stage2: Whether to linearly increase the S Stage2 value during restoration.
  • Spt Linear Cfg: The starting point for the linear increase of the classifier-free guidance scale.
  • Spt Linear S Stage2: The starting point for the linear increase of S Stage2.

Outputs

  • Output: A high-quality, photo-realistic image restored and upscaled from the low-quality input.

Capabilities

The supir-v0f model produces high-quality, detailed, and color-corrected output images from low-quality inputs. It can handle a variety of degradations, such as low resolution, noise, and JPEG artifacts, and is particularly effective at retaining detail under light degradations.

What can I use it for?

The supir-v0f model can be used for a variety of photo-realistic image restoration and upscaling tasks, such as restoring old photos, enhancing low-quality images from mobile devices, or improving the visual quality of AI-generated images. It is particularly useful for projects that require high-fidelity, detailed, and color-corrected images, such as photography, video production, or visual design.

Things to try

One interesting aspect of supir-v0f is how effectively its Stage1 encoder handles light degradations. Try input images with varying levels of degradation to see whether supir-v0f outperforms the SUPIR-v0Q model in those cases. You can also explore how the different hyper-parameters, such as the guidance scale, churn, and noise values, affect the quality and fidelity of the output images.


supir-v0q

Maintainer: cjwbw

Total Score: 81

The supir-v0q model is a powerful AI-based image restoration system developed by cjwbw. It is designed for practicing model scaling to achieve photo-realistic image restoration in the wild. The model builds on several state-of-the-art components, including the SDXL CLIP Encoder, SDXL base 1.0_0.9vae, and the LLaVA CLIP and LLaVA v1.5 13B models. Compared to similar models like GFPGAN, Real-ESRGAN, Animagine-XL-3.1, and LLaVA-13B, supir-v0q showcases enhanced generalization and high-quality image restoration capabilities.

Model inputs and outputs

The supir-v0q model takes low-quality input images and generates high-quality, photo-realistic output images. The model supports upscaling the input by a specified ratio and offers various options for controlling the restoration process, such as the classifier-free guidance scale, noise parameters, and the strength of the two-stage restoration pipeline.

Inputs

  • Image: The low-quality input image to be restored.
  • Upscale: The upsampling ratio to apply to the input image.
  • S Cfg: The classifier-free guidance scale for the prompts.
  • S Churn: The churn hyper-parameter of the EDM sampling scheduler.
  • S Noise: The noise hyper-parameter of the EDM sampling scheduler.
  • A Prompt: The additive positive prompt for the input image.
  • N Prompt: The fixed negative prompt for the input image.
  • S Stage1: The control strength of the first stage of the restoration pipeline.
  • S Stage2: The control strength of the second stage of the restoration pipeline.
  • Edm Steps: The number of steps for the EDM sampling scheduler.
  • Color Fix Type: The type of color correction to apply: "None", "AdaIn", or "Wavelet".

Outputs

  • Output: The high-quality, photo-realistic image restored from the input.

Capabilities

The supir-v0q model restores low-quality images to high-quality, photo-realistic outputs. It can handle a wide range of degradations, including noise, blur, and compression artifacts, while preserving fine details and natural textures. Its two-stage restoration pipeline, combined with controllable hyper-parameters, allows fine-tuning to the desired level of image quality and fidelity.

What can I use it for?

The supir-v0q model is useful for a variety of applications, such as:

  • Photo Restoration: Restoring old, damaged, or low-quality photographs to high-quality, professional-looking images.
  • Image Enhancement: Improving the quality of images captured with low-end cameras or devices, making them more visually appealing.
  • Creative Workflows: Enhancing reference images or source materials used in digital art, animation, and visual effects.
  • Content Creation: Generating high-quality images for websites, social media, marketing materials, and other content-driven applications.

Creators and businesses working in these areas may find supir-v0q a valuable tool for improving the visual quality and impact of their projects.

Things to try

Experiment with the input parameters to fine-tune the restoration process. For example, adjust the upscaling ratio, the classifier-free guidance scale, or the strength of the two-stage pipeline to reach the desired level of image quality and fidelity, and compare the color correction options to find the one that best suits your material.
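The "AdaIn" color fix option refers to adaptive-instance-normalization-style color matching: shifting each channel of the restored image so its statistics match the input. A minimal sketch of that idea, assuming per-channel mean/std matching; this is an illustration, not the model's actual implementation:

```python
import numpy as np

def adain_color_fix(restored, reference, eps=1e-5):
    """Match each channel of `restored` to the mean/std of `reference`.

    Both arrays are float images of shape (H, W, C). Normalizing the
    restored image's channels and rescaling them to the reference
    statistics removes global color drift introduced by restoration.
    """
    r_mean = restored.mean(axis=(0, 1), keepdims=True)
    r_std = restored.std(axis=(0, 1), keepdims=True)
    ref_mean = reference.mean(axis=(0, 1), keepdims=True)
    ref_std = reference.std(axis=(0, 1), keepdims=True)
    return (restored - r_mean) / (r_std + eps) * ref_std + ref_mean
```

The "Wavelet" option instead blends frequency bands, keeping low-frequency color from the input and high-frequency detail from the restoration.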


daclip-uir

Maintainer: cjwbw

Total Score: 1

The daclip-uir model, created by cjwbw, performs universal image restoration. It is based on the Degradation-Aware CLIP (DA-CLIP) architecture, which controls vision-language models for diverse image restoration tasks. The model can handle a wide range of degradations, such as motion blur, haze, JPEG compression, low light, noise, rain, and snow. It outperforms many single-task image restoration models and can be applied to real-world mixed-degradation images, similar to Real-ESRGAN. Compared with other models from the same maintainer, such as supir, supir-v0f, cogvlm, and supir-v0q, daclip-uir leverages vision-language models to provide more robust and versatile restoration.

Model inputs and outputs

Inputs

  • Image: The input image to be restored, which can exhibit degradations such as motion blur, haze, JPEG compression, low light, noise, rain, or snow.

Outputs

  • Restored Image: A high-quality, restored version of the input image.

Capabilities

The daclip-uir model performs universal image restoration across the full range of degradations listed above. Its use of vision-language models lets it adapt to different restoration tasks and produce high-quality results.

What can I use it for?

The daclip-uir model can be used for a variety of image restoration applications, such as:

  • Enhancing low-resolution or degraded images for social media, e-commerce, or photography.
  • Improving the visual quality of surveillance footage or security-camera images.
  • Restoring historical or archived images for digital preservation.
  • Enhancing medical images, such as X-rays or MRI scans, for improved diagnosis and analysis.
  • Improving images captured in challenging environmental conditions, such as hazy or rainy weather.

Things to try

Try inputting images affected by different degradations (motion blur, haze, JPEG compression, low light, noise, rain, or snow) and observe how well the model recovers a high-quality image. You can also explore its performance on real-world mixed-degradation images, similar to the Real-ESRGAN project, to see how it handles restoration in the wild.


swinir

Maintainer: jingyunliang

Total Score: 5.7K

swinir is an image restoration model based on the Swin Transformer architecture, developed by researchers at ETH Zurich. It achieves state-of-the-art performance on a variety of restoration tasks, including classical image super-resolution, lightweight image super-resolution, real-world image super-resolution, grayscale and color image denoising, and JPEG compression artifact reduction. The model is trained on diverse datasets such as DIV2K, Flickr2K, and OST, and outperforms previous state-of-the-art methods by up to 0.45 dB while reducing the parameter count by up to 67%.

Model inputs and outputs

swinir takes in an image and performs the requested restoration task. The model can handle different input sizes and scales, and supports super-resolution, denoising, and JPEG artifact reduction.

Inputs

  • Image: The input image to be restored.
  • Task type: The restoration task to perform: classical super-resolution, lightweight super-resolution, real-world super-resolution, grayscale denoising, color denoising, or JPEG artifact reduction.
  • Scale factor: The desired upscaling factor for super-resolution tasks.
  • Noise level: The noise level for denoising tasks.
  • JPEG quality: The JPEG quality factor for artifact reduction tasks.

Outputs

  • Restored image: The output image with the requested restoration applied, such as a high-resolution, denoised, or artifact-free version of the input.

Capabilities

swinir performs a wide range of restoration tasks with state-of-the-art performance. For example, it can take a low-resolution, noisy, or JPEG-compressed image and output a high-quality, clean, artifact-free version. The model works well on a variety of image types, including natural scenes, faces, and text-heavy images.

What can I use it for?

swinir can be used in a variety of applications that require high-quality image restoration, such as:

  • Enhancing the resolution and quality of low-quality images for social media, e-commerce, or photography.
  • Improving the visual fidelity of images generated by GFPGAN or Codeformer for better face restoration.
  • Reducing noise and artifacts in images captured in low-light or poor conditions.
  • Preprocessing images for downstream computer vision tasks like object detection or classification.

Things to try

One interesting thing to try with swinir is restoring real-world images degraded by several factors at once, such as low resolution, noise, and JPEG artifacts; its ability to handle diverse degradation types makes it a practical tool for restoration in the wild. Another experiment is to compare swinir's performance to other state-of-the-art restoration models like SuperPR or Swin2SR on a range of benchmark datasets and tasks, to understand the relative strengths and weaknesses of the different approaches.
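The dB figures quoted for swinir are PSNR (peak signal-to-noise ratio), the standard fidelity metric on these benchmarks. A quick reference implementation for 8-bit images:

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """PSNR in dB between two equally sized images given as nested lists."""
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# Two 2x2 images differing by 10 everywhere: MSE = 100,
# so PSNR = 10 * log10(255^2 / 100) ≈ 28.13 dB.
print(psnr([[0, 0], [0, 0]], [[10, 10], [10, 10]]))
```

On this logarithmic scale, a 0.45 dB gain corresponds to roughly a 10% reduction in mean squared error, since 10^(0.45/10) ≈ 1.11.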
