Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1
============================================

*   Main Page: [Fengshenbang](https://fengshenbang-lm.com/)
*   Github: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)

Brief Introduction
------------------

The first open-source Chinese Stable Diffusion anime model, trained on 1 million filtered Chinese anime image-text pairs. For details, see [IDEA Research Institute Fengshenbang team released the first open-source Chinese anime Stable Diffusion model](https://zhuanlan.zhihu.com/p/598766181); for more text2img examples, see the [Taiyi-Anime handbook](https://docs.qq.com/doc/DUFRMZ25wUFRWaEl0).

Model Taxonomy
--------------

| Demand | Task | Series | Model | Parameter | Extra |
| --- | --- | --- | --- | --- | --- |
| Special | Multimodal | Taiyi | Stable Diffusion | 1B | Chinese |

Model Information
-----------------

We trained this Chinese anime model in two stages on two anime datasets (1 million low-quality samples and 10,000 high-quality samples), starting from our pretrained model [IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1](https://huggingface.co/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1). Training took 100 hours on 4 x A100 GPUs. This is a preliminary version; we will keep updating it and releasing the updates as open source. Feedback and discussion are welcome.

### Result

The first tip is to use a super-resolution model to boost image quality:



    Prompt: [comma-separated Chinese tags, not preserved]
    Negative prompt: [comma-separated Chinese tags, not preserved]
    Steps: 50, Sampler: Euler a, CFG scale: 7, Seed: 3900970600, Size: 512x512, Model hash: 7ab6852a
    

The generated image is 512 \* 512 and 318 KB:

[![](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/t-shirt-girl.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/t-shirt-girl.png)
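The settings line in the block above comes from webui. As a rough sketch of how the same settings could be carried over to `diffusers`, the snippet below parses that line and maps it onto `StableDiffusionPipeline` arguments. The helper names (`parse_webui_settings`, `to_diffusers_kwargs`, `generate`) are hypothetical, and treating "Euler a" as `EulerAncestralDiscreteScheduler` is an assumption. webui and `diffusers` draw their starting noise differently, so the same seed should not be expected to reproduce the image exactly.

```python
MODEL_ID = "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1"

# The settings line exactly as it appears in the webui output.
SETTINGS_LINE = (
    "Steps: 50, Sampler: Euler a, CFG scale: 7, Seed: 3900970600, "
    "Size: 512x512, Model hash: 7ab6852a"
)


def parse_webui_settings(line: str) -> dict:
    """Parse 'Key: value, Key: value' into a dict of strings."""
    settings = {}
    for part in line.split(","):
        key, _, value = part.partition(":")
        settings[key.strip()] = value.strip()
    return settings


def to_diffusers_kwargs(settings: dict) -> dict:
    """Map webui setting names onto StableDiffusionPipeline call arguments."""
    width, height = (int(n) for n in settings["Size"].split("x"))
    return {
        "num_inference_steps": int(settings["Steps"]),
        "guidance_scale": float(settings["CFG scale"]),
        "width": width,
        "height": height,
    }


def generate(prompt: str, negative_prompt: str, settings: dict):
    """Run one generation with the parsed settings (needs a CUDA GPU)."""
    # Heavy imports are kept local so the parsing helpers work without a GPU stack.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    # Assumption: webui's "Euler a" sampler corresponds to this scheduler.
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    generator = torch.Generator("cuda").manual_seed(int(settings["Seed"]))
    result = pipe(
        prompt,
        negative_prompt=negative_prompt,
        generator=generator,
        **to_diffusers_kwargs(settings),
    )
    return result.images[0]
```

Calling `generate(prompt, negative_prompt, parse_webui_settings(SETTINGS_LINE))` with your own Chinese prompt strings returns a PIL image that can be saved with `.save(...)`.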

In webui, open the Extras tab and upscale the image with the R-ESRGAN 4x+ Anime6B model:

[![](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/upscale-model.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/upscale-model.png)

The upscaled image is 2048 \* 2048 and 2.6 MB, a 4x upscale from 512 \* 512 to 2048 \* 2048:

[![](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/t-shirt-girl-upscale.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/t-shirt-girl-upscale.png)
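Outside webui, a quick way to get a same-size baseline to compare against the super-resolution result is a plain Lanczos resize with Pillow. This is a stand-in, not R-ESRGAN: it interpolates pixels rather than running a learned upscaler, so it will usually look softer. The function names and file paths are hypothetical.

```python
def upscaled_size(size: tuple, factor: int = 4) -> tuple:
    """Return the (width, height) after scaling both sides by `factor`."""
    width, height = size
    return (width * factor, height * factor)


def lanczos_upscale(src_path: str, dst_path: str, factor: int = 4) -> tuple:
    """Resize an image with Lanczos interpolation and save it.

    Baseline only: this does not run R-ESRGAN or any other learned model.
    """
    from PIL import Image  # local import so the size helper has no dependencies

    with Image.open(src_path) as img:
        out = img.resize(upscaled_size(img.size, factor), Image.LANCZOS)
        out.save(dst_path)
        return out.size
```

For example, `lanczos_upscale("t-shirt-girl.png", "t-shirt-girl-lanczos.png")` would write a 2048 \* 2048 file from a 512 \* 512 input.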

These examples were generated by a model running in webui.



First, some img2img examples:

[![](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/liuyifei_and_huge.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/liuyifei_and_huge.png)
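The img2img examples above were produced in webui. As a hedged sketch of the same idea in `diffusers`, the snippet below uses `StableDiffusionImg2ImgPipeline`; it is an assumption that this checkpoint loads into that pipeline class the same way it loads into `StableDiffusionPipeline`. The function names, the default `strength`, and the 512 \* 512 resize are illustrative choices. `effective_steps` mirrors how `diffusers` shortens the denoising schedule for img2img: a lower `strength` keeps more of the input image and runs fewer steps.

```python
MODEL_ID = "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1"


def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps img2img actually runs for a given strength."""
    return min(int(num_inference_steps * strength), num_inference_steps)


def img2img(init_path: str, prompt: str, strength: float = 0.5, steps: int = 50):
    """Redraw an input photo in anime style (needs a CUDA GPU)."""
    # Heavy imports are local so effective_steps can be used on its own.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    ).to("cuda")
    # Resize to the resolution used by the examples in this card.
    init_image = Image.open(init_path).convert("RGB").resize((512, 512))
    result = pipe(
        prompt=prompt,
        image=init_image,  # older diffusers releases named this `init_image`
        strength=strength,
        num_inference_steps=steps,
        guidance_scale=7.5,
    )
    return result.images[0]
```

With `steps=50`, a `strength` of 0.5 runs 25 denoising steps and 0.75 runs 37, so raising `strength` moves the output further from the source photo.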



Next, some text2img examples:

| prompt1 | prompt2 |
| --- | --- |
| [![boy](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/boy.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/boy.png) | [![girl](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/girl.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/girl.png) |
| [![outdoor](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/outdoor.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/outdoor.png) | [![indoor](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/indoor.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/indoor.png) |
| [![village](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/villege.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/villege.png) | [![city](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/city.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/city.png) |
| [![cat](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/cat.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/cat.png) | [![rabbit](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/resolve/main/result_examples/rabbit.png)](/IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1/blob/main/result_examples/rabbit.png) |

Each image was generated from a prompt of comma-separated Chinese tags. The cat and rabbit prompts each weight one tag at 1.5 using the `(tag:1.5)` syntax.
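The cat and rabbit prompts use webui's `(tag:1.5)` emphasis syntax to raise the weight of one tag. To make the convention concrete, here is a minimal sketch of a parser that splits a comma-separated prompt into `(tag, weight)` pairs. It is illustrative only and is not webui's own parser: it handles just the explicit `(tag:weight)` form, not nested parentheses or square brackets. The function name and the sample tags in the usage note are hypothetical.

```python
import re

# Matches a whole tag of the form "(text:1.5)".
_WEIGHTED_TAG = re.compile(r"^\((.+):([0-9]*\.?[0-9]+)\)$")


def parse_weighted_prompt(prompt: str) -> list:
    """Split a comma-separated tag prompt into (tag, weight) pairs.

    Unweighted tags get weight 1.0. Both ASCII and full-width commas
    are accepted as separators, since Chinese prompts may use either.
    """
    pairs = []
    for raw in re.split(r"[,，]", prompt):
        tag = raw.strip()
        if not tag:
            continue  # skip empty slots such as ",,"
        match = _WEIGHTED_TAG.match(tag)
        if match:
            pairs.append((match.group(1).strip(), float(match.group(2))))
        else:
            pairs.append((tag, 1.0))
    return pairs
```

For instance, `parse_weighted_prompt("cat,(white fur:1.5),garden")` yields the three tags with weights 1.0, 1.5, and 1.0. Plain `diffusers` pipelines treat the parentheses as ordinary prompt text, so these weights only take effect in webui or with a separate prompt-weighting helper.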

Usage
-----

### Configure webui

We highly recommend using this model through webui, which provides a visual interface plus some advanced retouching features.

[Taiyi Stable Diffusion WebUI](https://github.com/IDEA-CCNL/stable-diffusion-webui/blob/master/README.md)

### Half precision FP16 (CUDA)

Adding `torch_dtype=torch.float16` and `device_map="auto"` loads the FP16 weights quickly and speeds up inference. See [the optimization docs](https://huggingface.co/docs/diffusers/main/en/optimization/fp16#half-precision-weights) for more information.

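    # !pip install git+https://github.com/huggingface/accelerate
    import torch
    from diffusers import StableDiffusionPipeline

    # Let cuDNN auto-tune convolution kernels for repeated same-size inputs.
    torch.backends.cudnn.benchmark = True
    # Load the weights in half precision, then move the pipeline to the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    # Illustrative prompt of comma-separated Chinese tags
    # (1 girl, green hair, sweater, outdoors, snowing).
    prompt = "1个女孩,绿色头发,毛衣,户外,下雪"
    image = pipe(prompt, guidance_scale=7.5).images[0]
    image.save("1.png")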
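The snippet above assumes a CUDA GPU. On a machine without one, loading in `float16` and calling `.to("cuda")` will fail, so a loader can fall back to full precision on CPU. The following is a minimal sketch of that fallback; the function names are hypothetical, and enabling attention slicing on GPU is an optional memory-saving choice rather than something the authors prescribe.

```python
MODEL_ID = "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1"


def pick_device_and_dtype(cuda_available: bool) -> tuple:
    """Choose FP16 on GPU and FP32 on CPU, returned as (device, dtype name)."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def load_pipeline(model_id: str = MODEL_ID):
    """Load the pipeline with a precision that matches the available hardware."""
    # Heavy imports are local so pick_device_and_dtype has no dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    pipe = pipe.to(device)
    if device == "cuda":
        # Trades a little speed for lower peak GPU memory.
        pipe.enable_attention_slicing()
    return pipe
```

CPU inference in FP32 works but is much slower, so this fallback is mainly useful for checking that a script runs end to end.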
    

### Handbook for Taiyi

*   [handbook](https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/examples/stable_diffusion_chinese/taiyi_handbook.md)
*   [v1.1](https://docs.qq.com/doc/DWklwWkVvSFVwUE9Q)
*   [v1.0](https://docs.qq.com/doc/DUFRMZ25wUFRWaEl0)

### How to finetune

[finetune code](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/finetune_taiyi_stable_diffusion)

### DreamBooth

[DreamBooth code](https://github.com/IDEA-CCNL/Fengshenbang-LM/tree/main/fengshen/examples/stable_diffusion_dreambooth)

Citation
--------

If you use this resource in your work, please cite our [paper](https://arxiv.org/abs/2209.02970):

    @article{fengshenbang,
      author    = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
      title     = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
      journal   = {CoRR},
      volume    = {abs/2209.02970},
      year      = {2022}
    }
    

You can also cite our [website](https://github.com/IDEA-CCNL/Fengshenbang-LM/):

    @misc{Fengshenbang-LM,
      title={Fengshenbang-LM},
      author={IDEA-CCNL},
      year={2021},
      howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
    }

## Model overview

The `Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1` model is the first open-source Chinese Stable Diffusion anime model, trained on 1 million low-quality and 10,000 high-quality Chinese anime image-text pairs. Developed by the IDEA-CCNL team, it builds on the pre-trained [Taiyi-Stable-Diffusion-1B-Chinese-v0.1](https://aimodels.fyi/models/huggingFace/taiyi-stable-diffusion-1b-chinese-v01-idea-ccnl) model and fine-tunes it further on anime-specific data.

## Model inputs and outputs

### Inputs
- **Textual Prompts**: The model takes Chinese text prompts that describe the desired image, typically written as comma-separated tags.

### Outputs
- **Generated Images**: The model outputs anime-style images that match the provided prompts.

## Capabilities

The `Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1` model generates anime-style illustrations from Chinese prompts. The examples in this card cover characters, indoor and outdoor scenes, villages and cities, and animals, in both text2img and img2img modes. Because the model is fine-tuned from Taiyi-Stable-Diffusion-1B-Chinese-v0.1, it may still respond to prompts beyond anime themes, though this card shows no such examples and the authors describe the release as a preliminary version.

## What can I use it for?

This model can be particularly useful for artists, designers, and content creators who want to generate high-quality Chinese anime-style illustrations. The model can be used to ideate new characters, scenes, and narratives, or to create visual assets for games, animations, and other multimedia projects. The open-source nature of the model also makes it accessible for educational and research purposes, enabling further exploration and development of text-to-image AI capabilities.

## Things to try

One thing to keep in mind with the `Taiyi-Stable-Diffusion-1B-Anime-Chinese-v0.1` model is that it was trained on Chinese image-text pairs, so Chinese prompts are the natural starting point. Behavior with English or mixed-language prompts is not documented in this card, so test it before relying on it. Users can also play to the model's strength in anime-style art by writing detailed, descriptive prompts that capture the desired aesthetic and narrative elements.