OpenDalleV1.1
=============

My newest and best current model is available here: [https://huggingface.co/dataautogpt3/ProteusV0.2](https://huggingface.co/dataautogpt3/ProteusV0.2)

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01611_.png)

Prompt

black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01609_.jpeg)

Prompt

(impressionistic realism by csybgh), a 50 something male, working in banking, very short dyed dark curly balding hair, Afro-Asiatic ancestry, talks a lot but listens poorly, stuck in the past, wearing a suit, he has a certain charm, bronze skintone, sitting in a bar at night, he is smoking and feeling cool, drunk on plum wine, masterpiece, 8k, hyper detailed, smokey ambiance, perfect hands AND fingers

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01556_.jpeg)

Prompt

an anime female general laughing, with a military cap, evil smile, sadistic, grim

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01519_.jpeg)

Prompt

John Berkey Style page,ral-oilspill, There is no road ahead,no land, Strangely,the river is still flowing,crossing the void into the mysterious unknown, The end of nothingness,a huge ripple,it is a kind of wave,and it is the law of time that lasts forever in that void, At the end of the infinite void,there is a colorful world,very hazy and mysterious,and it cannot be seen clearly,but it is real, And that's where the river goes

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01817_(1).png)

Prompt

Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne d'Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01882_.png)

Prompt

cinematic film still of Kodak Motion Picture Film: (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01542_.jpeg)

Prompt

in the style of artgerm, comic style,3D model, mythical seascape, negative space, space quixotic dreams, temporal hallucination, psychedelic, mystical, intricate details, very bright neon colors, (vantablack background:1.5), pointillism, pareidolia, melting, symbolism, very high contrast, chiaroscuro

Negative Prompt

bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image

![](https://huggingface.co/dataautogpt3/OpenDalleV1.1/resolve/main/ComfyUI_01528_.jpeg)

Prompt

((OpenDAlle!)text logo:1), ~\*~aesthetic~\*~

OpenDalle v1.1 on Hugging Face - It's Here!

Realism & Style: improved

We're talking about a major glow-up in the realism and style department. Expect images that not only hit the bullseye with your prompts but also bring that extra zing of artistic flair. It's like your prompts went to art school!

Prompt Loyalty: Our Heartbeat

The soul of OpenDalle? Sticking to your prompts like glue. v1.1 takes your words and turns them into visual masterpieces that are just what you pictured, maybe even better.

Where We Stand: The Cool Middle Kid

Here's the scoop: OpenDalle v1.1 is proudly strutting a notch above SDXL. While DALLE-3 is still the big cheese, we're hot on its heels. Think of us as the cool, savvy middle sibling, rocking both brains and beauty.

Settings for OpenDalle v1.1
---------------------------

Use these settings for the best results with OpenDalle v1.1:

CFG Scale: 7 to 8

Steps: 60 to 70 steps for more detail, 35 steps for faster results.

Sampler: DPM2

Scheduler: Normal or Karras
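
The settings above are written as UI labels. As a minimal sketch, assuming you use diffusers rather than a UI, they could be mapped to pipeline arguments as shown below. The names `SamplerSettings`, `opendalle_settings`, and `generate` are hypothetical helpers, the value 7.5 is simply the midpoint of the recommended CFG range, and treating `KDPM2DiscreteScheduler` as the counterpart of "DPM2" is an assumption based on the diffusers scheduler naming, not something the model card states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerSettings:
    guidance_scale: float     # "CFG Scale" in most UIs
    num_inference_steps: int  # "Steps"
    use_karras_sigmas: bool   # "Scheduler: Karras" (True) or "Normal" (False)


def opendalle_settings(fast: bool = False, karras: bool = True) -> SamplerSettings:
    """Return the recommended settings as diffusers-style values."""
    # 7.5 is the midpoint of the recommended 7 to 8 range (an illustrative choice).
    # 35 steps is the "faster" value; 60 is the low end of the "more detail" range.
    return SamplerSettings(
        guidance_scale=7.5,
        num_inference_steps=35 if fast else 60,
        use_karras_sigmas=karras,
    )


def generate(prompt: str, settings: SamplerSettings):
    """Generate one image; needs torch, diffusers, and a CUDA GPU."""
    # Imported here so the helpers above work without the heavy dependencies.
    import torch
    from diffusers import AutoPipelineForText2Image, KDPM2DiscreteScheduler

    pipe = AutoPipelineForText2Image.from_pretrained(
        "dataautogpt3/OpenDalleV1.1", torch_dtype=torch.float16
    ).to("cuda")
    # Assumption: KDPM2DiscreteScheduler corresponds to the "DPM2" sampler.
    pipe.scheduler = KDPM2DiscreteScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=settings.use_karras_sigmas
    )
    return pipe(
        prompt,
        guidance_scale=settings.guidance_scale,
        num_inference_steps=settings.num_inference_steps,
    ).images[0]
```

Calling `generate("a prompt", opendalle_settings(fast=True))` would then trade some detail for speed by running 35 steps.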

Use it with diffusers
---------------------

    import torch
    from diffusers import AutoPipelineForText2Image

    # Load the model in half precision and move it to the GPU
    pipeline = AutoPipelineForText2Image.from_pretrained(
        'dataautogpt3/OpenDalleV1.1', torch_dtype=torch.float16
    ).to('cuda')

    # Generate one image from the prompt
    prompt = 'black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed'
    image = pipeline(prompt).images[0]

Non-Commercial Personal Use License Agreement

For dataautogpt3/OpenDalleV1.1

1.  Introduction

This Non-Commercial Personal Use License Agreement ("Agreement") is between Alexander Izquierdo ("Licensor") and the individual or entity ("Licensee") using the Stable Diffusion model with unique merging method and tuning ("Model") hosted on the Hugging Face repository named OpenDalleV1.1.

2.  Grant of License

a. Licensor hereby grants to Licensee a non-exclusive, non-transferable, non-sublicensable license to use the Model for personal, non-commercial purposes.

b. "Personal, non-commercial purposes" are defined as use that does not involve any form of compensation or monetary gain. This includes, but is not limited to, academic research, educational use, and hobbyist projects.

c. The Licensee is permitted to modify, merge, and use the Model for personal projects, provided that such use adheres to the terms of this Agreement.

3.  Ownership and Intellectual Property Rights

a. The Licensor explicitly retains all rights, title, and interest in and to the unique merging method used in the Model. This merging method is the proprietary creation and intellectual property of the Licensor.

b. The Licensee shall not claim ownership, reverse engineer, or attempt to recreate the merging method for any purpose.

c. The Licensor retains all rights, title, and interest in and to the Model, including any modifications or improvements made by the Licensee.

d. The Licensee agrees to attribute the Licensor in any academic or public display of the Model or derivative works.

4.  Restrictions

a. The Licensee shall not use the Model or the merging method for any commercial purposes.

b. The Licensee shall not distribute, sublicense, lease, or lend the Model or the merging method to any third party.

c. The Licensee shall not publicly display, perform, or communicate the Model, the merging method, or any derivative works thereof without the prior written consent of the Licensor.

5.  Termination

This Agreement will terminate automatically if the Licensee breaches any of its terms and conditions.

6.  Disclaimer of Warranties

The Model and the merging method are provided "as is," and the Licensor makes no warranties, express or implied, regarding their performance, reliability, or suitability for any purpose.

7.  Limitation of Liability

The Licensor shall not be liable for any damages arising out of or related to the use or inability to use the Model or the merging method.

8.  General Provisions

a. This Agreement constitutes the entire agreement between the parties and supersedes all prior agreements and understandings, whether written or oral, relating to its subject matter.

b. Any amendment to this Agreement must be in writing and signed by both parties.

c. This Agreement shall be governed by the laws of Maryland.

IN WITNESS WHEREOF, the parties have executed this Agreement as of the Effective Date.

## Model overview

`OpenDalleV1.1` is a text-to-image generation model developed by [dataautogpt3](https://aimodels.fyi/creators/huggingFace/dataautogpt3). Despite its name, it is a merged and tuned Stable Diffusion XL (SDXL) model rather than a derivative of OpenAI's DALL-E, and it showcases strong prompt adherence and semantic understanding. Compared to base SDXL, `OpenDalleV1.1` seems to be a step above in terms of prompt comprehension, edging closer to the abilities of DALL-E 3. Similar models like [open-dalle-v1.1](https://aimodels.fyi/models/huggingFace/open-dalle-v11-lucataco) and [proteus-v0.1](https://aimodels.fyi/models/huggingFace/proteus-v01-lucataco) also demonstrate advancements in this area, with `proteus-v0.1` further refining prompt adherence and stylistic capabilities.

## Model inputs and outputs

`OpenDalleV1.1` is a text-to-image generation model that takes textual prompts as input and generates corresponding images as output. The model can handle a wide range of prompts, from describing detailed scenes and characters to more abstract concepts.

### Inputs
- **Textual prompts**: Detailed descriptions of the desired image, including elements like subject, style, mood, and composition.

### Outputs
- **Generated images**: High-quality, visually striking images that reflect the provided textual prompts.

## Capabilities

`OpenDalleV1.1` demonstrates impressive capabilities in translating textual inputs into detailed and cohesive visual outputs. The model can generate images across a diverse range of genres, from realistic scenes to fantastical and imaginative concepts. It shows a strong understanding of complex prompts, effectively capturing the intended mood, style, and composition.

## What can I use it for?

`OpenDalleV1.1` can be a valuable tool for a variety of applications, such as:

- **Content creation**: Generating unique, on-demand visuals for blog posts, social media, or other digital content.
- **Conceptual design**: Exploring and visualizing ideas, concepts, and prototypes in fields like art, fashion, and product design.
- **Personalized imagery**: Creating custom images based on individual preferences or interests.
- **Rapid prototyping**: Quickly generating visual assets for product development, user interface designs, or other iterative design processes.

## Things to try

One interesting aspect of `OpenDalleV1.1` is its ability to generate images that blend realistic and fantastical elements. By incorporating prompts that combine specific details with more imaginative components, users can explore the model's capacity to create visually striking and thought-provoking artworks. Experimenting with different prompt structures and exploring the model's response to various styles and subject matter can uncover its full potential.

This is outdated! The newest version, 1.1, can be found here: [https://huggingface.co/dataautogpt3/OpenDalleV1.1](https://huggingface.co/dataautogpt3/OpenDalleV1.1)
----------------------------------------------------------------------------------------------------------------

OpenDalle
=========

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/GBvPhMyWoAAp_fT.jpeg)

Prompt

\-

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/GBvMRyqXMAAX8jj.jpeg)

Prompt

\-

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/GBuwGoJXUAA89jm.jpeg)

Prompt

\-

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/GBvRG6FXcAEOvcG.jpeg)

Prompt

panther head coming out of smoke, dark, moody, detailed, shadows

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/ComfyUI_00497_.jpeg)

Prompt

Manga from the early 1990s, characterized by its surreal aesthetic. The artwork is depicted in matte colors and created using a digital medium. Notable illustrators include Junji Ito, Yoshiyuki Sadamoto, and Rumiko Takahashi.

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/ComfyUI_00318_.jpeg)

Prompt

in the style of artgerm, comic style,3D model, mythical seascape, negative space, space quixotic dreams, temporal hallucination, psychedelic, mystical, intricate details, very bright neon colors, (vantablack background:1.5), pointillism, pareidolia, melting, symbolism, very high contrast, chiaroscuro

Negative Prompt

bad quality, bad anatomy, worst quality, low quality, low resolution, extra fingers, blur, blurry, ugly, wrong proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image, embedding:ac\_neg1,

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/ComfyUI_00488_.jpeg)

Prompt

Contemporary poster art featuring a profile captured in a detailed lithograph with fine coal texture, tar and vinyl color palette, set against a Chiaroscuro environment with layered depth composition, etched outlines within a chromatic Renaissance setting, continent fictional astrology elements in a Chiaroscuro daydream shelter, circuitry tone resembling emphatic expanded horror themes, utilizing both palette knife and brush strokes, matte finish, realized in cinematic abstractions, 8K resolution, 36.5 mm

Negative Prompt

nude, naked, porn, ugly, tiling, extra hands, extra drawn feet, Extra fingers, poorly drawn face, (oversaturated: 2), (saturated: 1.6), big contrast, contrast white burn, white spots overexposed, over saturated, extra limbs, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, closed eyes, text, logo embedding:ac\_neg1,

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/ComfyUI_00284_.jpeg)

Prompt

Super Closeup Portrait, action shot, Profoundly dark whiteish meadow, glass flowers, Stains, space grunge style, Jeanne d Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd,

Negative Prompt

bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image

![](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/ComfyUI_00265_.jpeg)

Prompt

cinematic film still of Kodak Motion Picture Film: (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy,

Negative Prompt

bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image

I'm thrilled to share an update on a recent project of mine. After some dedicated work, I've developed a highly effective text-to-image model. This innovation results from integrating the DPO model from Hugging Face with several advanced counterparts, including Juggernaut7XL, ALBEDOXL, MEARGEHEAVEN, and a model of my own design. The outcome is a unique fusion that showcases exceptional prompt adherence and semantic understanding. It seems to be a step above base SDXL and a step closer to DALLE-3 in terms of prompt comprehension. Notably, this model excels in interpreting and adhering to the given prompts, focusing more on semantic accuracy than on ultra-high-fidelity image generation. It is also available on Civitai: `https://civitai.com/models/238116/opendalle`

Settings for OpenDalle v1.0
---------------------------

Use these settings for the best results with OpenDalle v1.0:

CFG Scale: 7 to 8

Steps: 60 to 70 steps for more detail, 35 steps for faster results.

Sampler: DPM2

Scheduler: Normal or Karras

`*.safetensors` for AUTOMATIC1111, ComfyUI, InvokeAI
----------------------------------------------------

[Download \*.safetensors file](https://huggingface.co/dataautogpt3/OpenDalle/resolve/main/OpenDalle.safetensors?download=true)

Use it with diffusers
---------------------

    import torch
    from diffusers import AutoPipelineForText2Image

    # Load the model in half precision and move it to the GPU
    pipeline = AutoPipelineForText2Image.from_pretrained(
        'dataautogpt3/OpenDalle', torch_dtype=torch.float16
    ).to('cuda')

    # Generate one image from the prompt
    prompt = 'Manga from the early 1990s, characterized by its surreal aesthetic. The artwork is depicted in matte colors and created using a digital medium. Notable illustrators include Junji Ito, Yoshiyuki Sadamoto, and Rumiko Takahashi.'
    image = pipeline(prompt).images[0]

Non-Commercial Personal Use License Agreement

For dataautogpt3/OpenDalle

1.  Introduction

This Non-Commercial Personal Use License Agreement ("Agreement") is between Alexander Izquierdo ("Licensor") and the individual or entity ("Licensee") using the Stable Diffusion model with unique merging method and tuning ("Model") hosted on the Hugging Face repository named OpenDalle.

2.  Grant of License

a. Licensor hereby grants to Licensee a non-exclusive, non-transferable, non-sublicensable license to use the Model for personal, non-commercial purposes.

b. "Personal, non-commercial purposes" are defined as use that does not involve any form of compensation or monetary gain. This includes, but is not limited to, academic research, educational use, and hobbyist projects.

c. The Licensee is permitted to modify, merge, and use the Model for personal projects, provided that such use adheres to the terms of this Agreement.

3.  Ownership and Intellectual Property Rights

a. The Licensor explicitly retains all rights, title, and interest in and to the unique merging method used in the Model. This merging method is the proprietary creation and intellectual property of the Licensor.

b. The Licensee shall not claim ownership, reverse engineer, or attempt to recreate the merging method for any purpose.

c. The Licensor retains all rights, title, and interest in and to the Model, including any modifications or improvements made by the Licensee.

d. The Licensee agrees to attribute the Licensor in any academic or public display of the Model or derivative works.

4.  Restrictions

a. The Licensee shall not use the Model or the merging method for any commercial purposes.

b. The Licensee shall not distribute, sublicense, lease, or lend the Model or the merging method to any third party.

c. The Licensee shall not publicly display, perform, or communicate the Model, the merging method, or any derivative works thereof without the prior written consent of the Licensor.

5.  Termination

This Agreement will terminate automatically if the Licensee breaches any of its terms and conditions.

6.  Disclaimer of Warranties

The Model and the merging method are provided "as is," and the Licensor makes no warranties, express or implied, regarding their performance, reliability, or suitability for any purpose.

7.  Limitation of Liability

The Licensor shall not be liable for any damages arising out of or related to the use or inability to use the Model or the merging method.

8.  General Provisions

a. This Agreement constitutes the entire agreement between the parties and supersedes all prior agreements and understandings, whether written or oral, relating to its subject matter.

b. Any amendment to this Agreement must be in writing and signed by both parties.

c. This Agreement shall be governed by the laws of Maryland.

IN WITNESS WHEREOF, the parties have executed this Agreement as of the Effective Date.

## Model overview

`OpenDalle` is a text-to-image model developed by [dataautogpt3](https://aimodels.fyi/creators/huggingFace/dataautogpt3). It aims to approach the results of OpenAI's DALL-E models with openly downloadable weights, released under a non-commercial personal use license. According to its author, `OpenDalle` is a step above the base SDXL model and closer to DALL-E 3 in terms of prompt comprehension and adherence.

The latest version, [OpenDalleV1.1](https://aimodels.fyi/models/huggingFace/opendallev11-dataautogpt3), showcases exceptional prompt adherence and semantic understanding, generating high-quality images that closely match the provided text prompts. Compared to earlier versions, OpenDalleV1.1 has improved realism and artistic flair, producing visuals that capture the essence of the prompts with more vivid detail and creative flourish.

## Model inputs and outputs

### Inputs
- **Text prompts:** The model takes in text descriptions or prompts that provide instructions for the desired image generation.

### Outputs
- **Generated images:** `OpenDalle` outputs images that correspond to the provided text prompts. The generated visuals can range from photorealistic representations to surreal, artistic interpretations of the input text.

## Capabilities
`OpenDalle` demonstrates impressive capabilities in generating diverse and visually compelling images from a wide variety of text prompts. The model can produce detailed and imaginative visuals, spanning from realistic scenes to fantastical, dream-like compositions. For example, the model can generate images of a "panther head coming out of smoke, dark, moody, detailed, shadows" or a "manga from the early 1990s, characterized by its surreal aesthetic."

## What can I use it for?
`OpenDalle` can be a powerful tool for creative projects, such as illustrations, concept art, and visual storytelling. The model's ability to translate text into vivid, imaginative imagery can be leveraged in various applications, including but not limited to:

- Generating artwork and visuals for use in design, marketing, and entertainment
- Assisting with ideation and concept development for creative projects
- Providing visual references and inspiration for artists and designers
- Experimenting with and exploring the intersection of language and visual representation

While `OpenDalle` offers impressive capabilities, users should be aware of the model's limitations and potential biases, as described in the [OpenDalleV1.1 model card](https://aimodels.fyi/models/huggingFace/opendallev11-dataautogpt3).

## Things to try
One interesting aspect of `OpenDalle` is its ability to blend different artistic styles and genres in the generated images. By incorporating prompts that reference specific illustrators, aesthetic movements, or creative techniques, users can explore the model's capacity to synthesize diverse visual elements into cohesive, visually engaging compositions.

For example, prompts that combine references to "artgerm" (a renowned digital artist), "comic style," and "mythical seascape" can result in striking, surreal images that blend comic book aesthetics with fantastical, dreamlike elements. Experimenting with such prompts can help uncover the model's versatility and unlock new creative possibilities.

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03087_.png)

Prompt

black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/GEN8-iTXcAA-okN.jpeg)

Prompt

(impressionistic realism by csybgh), a 50 something male, working in banking, very short dyed dark curly balding hair, Afro-Asiatic ancestry, talks a lot but listens poorly, stuck in the past, wearing a suit, he has a certain charm, bronze skintone, sitting in a bar at night, he is smoking and feeling cool, drunk on plum wine, masterpiece, 8k, hyper detailed, smokey ambiance, perfect hands AND fingers

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03060_.png)

Prompt

high quality pixel art, a pixel art silhouette of an anime space-themed girl in a space-punk steampunk style, lying in her bed by the window of a spaceship, smoking, with a rustic feel. The image should embody epic portraiture and double exposure, featuring an isolated landscape visible through the window. The colors should primarily be dynamic and action-packed, with a strong use of negative space. The entire artwork should be in pixel art style, emphasizing the characters shape and set against a white background. Silhouette

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03017_.png)

Prompt

The image features an older man, a long white beard and mustache, He has a stern expression, giving the impression of a wise and experienced individual. The mans beard and mustache are prominent, adding to his distinguished appearance. The close-up shot of the mans face emphasizes his facial features and the intensity of his gaze.

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03045.png)

Prompt

Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne d'Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/3.png)

Prompt

cinematic film still of Kodak Motion Picture Film: (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03061_.png)

Prompt

in the style of artgerm, comic style,3D model, mythical seascape, negative space, space quixotic dreams, temporal hallucination, psychedelic, mystical, intricate details, very bright neon colors, (vantablack background:1.5), pointillism, pareidolia, melting, symbolism, very high contrast, chiaroscuro

Negative Prompt

bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03092_.png)

Prompt

1980s anime portrait of a character glitching. His face is separated from his body by heavy static. His face is deformed by pain. Dream-like, analog horror, glitch, terrifying

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03297_.png)

Prompt

(("Proteus"):text\_logo:1)

![](https://huggingface.co/dataautogpt3/ProteusV0.2/resolve/main/ComfyUI_03483_.png)

Prompt

dan seagrave, dante, Abandon All Hope, Ye Who Enter Here, hell religious art purgatory zdzislaw Beksinski, abyss inferno, lost, wanderer

## ProteusV0.2

Merged with RealCartoonXL, at a weight of just 0.5% out of 100%, to fix the model's inability to understand tags related to anime or cartoon styles. The merge was done with custom scripts using slerp-like methods.
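
For readers unfamiliar with slerp (spherical linear interpolation), here is a minimal sketch of the textbook formula applied to two flattened weight vectors. It is not the author's custom script, whose details are not published; the function name `slerp` and the list-based representation are illustrative assumptions.

```python
import math


def slerp(a: list[float], b: list[float], t: float) -> list[float]:
    """Spherical linear interpolation from a (t=0) to b (t=1)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    sin_omega = 0.0
    if norm_a > 0.0 and norm_b > 0.0:
        # Angle between the two vectors, clamped against rounding error.
        cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
        omega = math.acos(max(-1.0, min(1.0, cos_omega)))
        sin_omega = math.sin(omega)
    if sin_omega < 1e-6:
        # Zero-length or (anti)parallel vectors: fall back to linear interpolation.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]
```

With `t = 0.005`, matching the 0.5% weight mentioned above, the result stays very close to the first vector and picks up only a small contribution from the second.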

Version 0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.

Proteus
-------

Proteus serves as a sophisticated enhancement over OpenDalleV1.1, leveraging its core functionalities to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. To achieve this, it was fine-tuned using approximately 220,000 GPTV captioned images from copyright-free stock images (with some anime included), which were then normalized. Additionally, DPO (Direct Preference Optimization) was employed through a collection of 10,000 carefully selected high-quality, AI-generated image pairs.
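
As a rough illustration of what DPO optimizes, the sketch below computes the generic DPO loss for a single preference pair from log-probabilities. This is an assumption-laden simplification: the card does not describe the training code, and diffusion-specific variants of DPO replace these log-probabilities with denoising-error terms. The function name `dpo_loss` and the default `beta` are hypothetical.

```python
import math


def dpo_loss(policy_win: float, policy_lose: float,
             ref_win: float, ref_lose: float, beta: float = 0.1) -> float:
    """Generic DPO loss for one (preferred, rejected) pair of log-probabilities."""
    # How much more the tuned model favors the preferred sample over the
    # rejected one, relative to the frozen reference model.
    margin = (policy_win - ref_win) - (policy_lose - ref_lose)
    # -log(sigmoid(beta * margin)), written as a numerically stable softplus.
    z = -beta * margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))
```

The loss equals `log(2)` when the tuned model and the reference agree, and it falls as the tuned model shifts probability toward the preferred image of each pair.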

In pursuit of optimal performance, numerous LORA (Low-Rank Adaptation) models are trained independently before being selectively incorporated into the principal model via dynamic application methods. These techniques involve targeting particular segments within the model while avoiding interference with other areas during the learning phase. Consequently, Proteus exhibits marked improvements in portraying intricate facial characteristics and lifelike skin textures, all while sustaining commendable proficiency across various aesthetic domains, notably surrealism, anime, and cartoon-style visualizations.
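
The idea of folding a LoRA into only selected parts of a model can be sketched in plain Python as follows. This is a toy illustration of the standard LoRA update `W + scale * (B @ A)` applied to layers whose names match a prefix; the function names, the dictionary layout, and the prefix-based targeting are all assumptions, not the author's dynamic application method.

```python
def matmul(x: list[list[float]], y: list[list[float]]) -> list[list[float]]:
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weights: dict, loras: dict, targets: tuple, scale: float = 1.0) -> dict:
    """Return a copy of weights with LoRA deltas added to targeted layers only.

    weights: layer name -> matrix W of shape (out, in)
    loras:   layer name -> (B, A) with B of shape (out, r) and A of shape (r, in)
    targets: layer-name prefixes that are allowed to change
    """
    merged = {}
    for name, w in weights.items():
        if name in loras and any(name.startswith(prefix) for prefix in targets):
            b, a = loras[name]
            delta = matmul(b, a)  # low-rank update B @ A
            merged[name] = [
                [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
                for w_row, d_row in zip(w, delta)
            ]
        else:
            # Untargeted layers are copied unchanged.
            merged[name] = [row[:] for row in w]
    return merged
```

For example, `merge_lora(weights, loras, targets=("attn.",), scale=0.5)` would update only layers whose names start with `attn.` and leave every other layer as it was.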

Settings for ProteusV0.2
------------------------

Use these settings for the best results with ProteusV0.2:

CFG Scale: 7 to 8

Steps: 20 to 60 steps for more detail, 20 steps for faster results.

Sampler: DPM++ 2M SDE

Scheduler: Karras

Resolution: 1280x1280 or 1024x1024
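
"DPM++ 2M SDE" with the "Karras" scheduler is a UI label. As a hedged sketch, assuming the usual diffusers naming, it could be expressed as `DPMSolverMultistepScheduler` with the overrides below. The helper names `dpmpp_2m_sde_karras_overrides` and `build_proteus_pipeline` are hypothetical, and the mapping is an assumption rather than something the model card specifies.

```python
def dpmpp_2m_sde_karras_overrides() -> dict:
    """Scheduler config overrides assumed to match 'DPM++ 2M SDE' + 'Karras'."""
    return {
        "algorithm_type": "sde-dpmsolver++",  # the SDE variant of DPM-Solver++
        "use_karras_sigmas": True,            # Karras noise schedule
    }


def build_proteus_pipeline():
    """Load ProteusV0.2 with the scheduler above; needs torch, diffusers, and a CUDA GPU."""
    # Imported here so the helper above can be used without the heavy dependencies.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "dataautogpt3/ProteusV0.2", torch_dtype=torch.float16
    )
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, **dpmpp_2m_sde_karras_overrides()
    )
    return pipe.to("cuda")
```

The returned pipeline can then be called with `width=1024, height=1024`, a `guidance_scale` between 7 and 8, and 20 to 60 `num_inference_steps`, following the settings listed above.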

Please also consider using these keywords to improve your prompts: best quality, HD, `~*~aesthetic~*~`.

If you are having trouble coming up with prompts, you can use this GPT I put together to help you refine them: [https://chat.openai.com/g/g-RziQNoydR-diffusion-master](https://chat.openai.com/g/g-RziQNoydR-diffusion-master)

Use it with diffusers
---------------------

    import torch
    from diffusers import (
        StableDiffusionXLPipeline, 
        KDPM2AncestralDiscreteScheduler,
        AutoencoderKL
    )
    
    # Load VAE component
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", 
        torch_dtype=torch.float16
    )
    
    # Configure the pipeline
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "dataautogpt3/ProteusV0.2", 
        vae=vae,
        torch_dtype=torch.float16
    )
    pipe.scheduler = KDPM2AncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe.to('cuda')
    
    # Define prompts and generate image
    prompt = "black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed"
    negative_prompt = "nsfw, bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image"
    
    image = pipe(
        prompt, 
        negative_prompt=negative_prompt, 
        width=1024,
        height=1024,
        guidance_scale=7.5,
        num_inference_steps=50
    ).images[0]
    

Please support my work by donating at [https://www.buymeacoffee.com/DataVoid](https://www.buymeacoffee.com/DataVoid) or following me on [https://twitter.com/DataPlusEngine](https://twitter.com/DataPlusEngine)

## Model overview

[`ProteusV0.2`](https://aimodels.fyi/creators/huggingFace/dataautogpt3) is an AI model developed by [dataautogpt3](https://aimodels.fyi/creators/huggingFace/dataautogpt3) that excels at generating high-quality, detailed images from text prompts. It is a refinement of the OpenDalleV1.1 model, further improving prompt adherence and stylistic capabilities. Compared to similar models like [OpenDalleV1.1](https://aimodels.fyi/models/huggingFace/opendallev11-dataautogpt3) and [Counterfeit-V2.0](https://aimodels.fyi/models/huggingFace/counterfeit-v20-gsdf), ProteusV0.2 demonstrates more accurate interpretation of prompts and a wider range of stylistic outputs.

## Model inputs and outputs

ProteusV0.2 is a text-to-image AI model that takes natural language prompts as input and generates corresponding images. The model has shown impressive results in capturing the essence of prompts and producing highly detailed, visually striking outputs.

### Inputs
- Text prompts describing the desired image, including details about subjects, styles, and attributes

### Outputs
- High-resolution, photorealistic images that match the provided text prompts
- Images in a variety of styles, from realistic to impressionistic and surreal

## Capabilities

ProteusV0.2 has demonstrated excellent capabilities in interpreting complex text prompts and generating corresponding images with a high degree of detail and accuracy. The model excels at producing visually stunning artwork in diverse genres, from fantastical creatures to detailed portraits and scenes.

## What can I use it for?

ProteusV0.2 can be a valuable tool for a wide range of applications, including:

- **Concept art and visual development**: Generate striking visuals to support creative projects, such as game development, film production, or product design.
- **Illustration and digital art**: Create unique, high-quality illustrations and digital artwork without the need for manual drawing skills.
- **Marketing and advertising**: Produce eye-catching visuals for social media, websites, and other marketing materials.
- **Educational and research purposes**: Use the model to explore the intersection of language and visual representation, or to create educational materials.

## Things to try

One interesting aspect of ProteusV0.2 is its ability to interpret and adhere to prompts in a nuanced way, capturing subtle details and stylistic elements. Try experimenting with prompts that incorporate specific artistic references, such as the styles of famous painters or illustrators. You can also explore the model's capabilities in generating detailed, photorealistic images by including detailed descriptors in your prompts.

![](https://huggingface.co/dataautogpt3/TempestV0.1/resolve/main/steakj.png)

Prompt

Food photography style photo RAW,piece of fried grilled meat, splashes of ketchup and mustard sauce, (rosemary), spices, exceptional shallow depth-of-field capabilities, atmospheric haze blur,vivid colors,high quality textures of materials, volumetric textures, coating textures, metal textures . Appetizing, professional, culinary, high-resolution, commercial, highly detailed

![](https://huggingface.co/dataautogpt3/TempestV0.1/resolve/main/dragon.png)

Prompt

amazing quality, sci-fi, desert landscape, a massive dragon crashing through the dunes, epic scene, fabulous, knight running away

![](https://huggingface.co/dataautogpt3/TempestV0.1/resolve/main/final_output_02499_.png)

Prompt

Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne d'Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd, cinematic, 2k

![](https://huggingface.co/dataautogpt3/TempestV0.1/resolve/main/liz.jpeg)

Prompt

lizard

[](#who-needs-upscalers)Who needs upscalers?
--------------------------------------------

The TempestV0.1 Initiative is a powerhouse in image generation, leveraging an unparalleled dataset of over 6 million images. The collection's vast scale, with resolutions from 1400x2100 to 4800x7200, encompasses 200GB of high-quality content.

With a groundbreaking 3 million iterations in its training cycle, TempestV0.1 underscores the rigorous effort input by its creator. This training intensity notably eclipses that of all other contemporary models. TempestV0.1 shatters the conventional limits of image generation, particularly in delivering unparalleled detail and texture.

Due to the difference in distribution, LoRAs will not work with this model at higher resolutions.

* * *

[](#what-is-the-diffrence-between-base-and-artistic)What is the difference between Base and Artistic?
----------------------------------------------------------------------------------------------------

Base is the pure, 100% trained model without any special LoRAs or other models thrown into the mix. Suggested settings for Base are:

cfg: 7 to 8

steps: 60 to 80

Artistic is less cohesive overall at larger sizes but has much more flair and stylistic promise (Proteus + Tempest = Artistic Tempest). Suggested settings for Artistic are:

cfg: 3 to 8

steps: 60 to 80

Both of these checkpoints have their place and are kept separate for ease of understanding.

Supported sizes: 2048x1024, 1920x1088
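
The recommended ranges above can be captured in a small lookup so that a script rejects out-of-range settings before a long render. This is a minimal sketch: `PRESETS`, `SUPPORTED_SIZES`, and `check_settings` are hypothetical names, not part of any TempestV0.1 tooling; only the numbers come from the settings listed above.

```python
# Hypothetical helper: the names are illustrative; only the numeric ranges
# and sizes come from the settings listed above.
PRESETS = {
    "base": {"cfg": (7.0, 8.0), "steps": (60, 80)},
    "artistic": {"cfg": (3.0, 8.0), "steps": (60, 80)},
}
SUPPORTED_SIZES = {(2048, 1024), (1920, 1088)}


def check_settings(checkpoint, cfg, steps, width, height):
    """Return a list of human-readable problems; an empty list means OK."""
    preset = PRESETS[checkpoint]
    problems = []
    cfg_lo, cfg_hi = preset["cfg"]
    if not cfg_lo <= cfg <= cfg_hi:
        problems.append(f"cfg {cfg} outside {cfg_lo} to {cfg_hi}")
    steps_lo, steps_hi = preset["steps"]
    if not steps_lo <= steps <= steps_hi:
        problems.append(f"steps {steps} outside {steps_lo} to {steps_hi}")
    if (width, height) not in SUPPORTED_SIZES:
        problems.append(f"size {width}x{height} is not a listed size")
    return problems
```

Calling `check_settings("artistic", 3.5, 70, 2048, 1024)` returns an empty list, while a 1024x1024 request is flagged because it is not one of the listed sizes.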

* * *

## Model Overview

The `TempestV0.1` Initiative is a powerhouse in image generation, leveraging an unparalleled dataset of over 6 million images. The collection's vast scale, with resolutions from 1400x2100 to 4800x7200, encompasses 200GB of high-quality content. With a groundbreaking 3 million iterations in its training cycle, `TempestV0.1` underscores the rigorous effort input by its creator. This training intensity notably eclipses that of all other contemporary models. `TempestV0.1` shatters the conventional limits of image generation, particularly in delivering unparalleled detail and texture.

The `ProteusV0.2` model serves as a sophisticated enhancement over OpenDalleV1.1, leveraging its core functionalities to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. To achieve this, it was fine-tuned using approximately 220,000 GPTV captioned images from copyright-free stock images (with some anime included), which were then normalized. Additionally, DPO (Direct Preference Optimization) was employed through a collection of 10,000 carefully selected high-quality, AI-generated image pairs.

## Model Inputs and Outputs

`TempestV0.1` accepts a wide range of text prompts to generate high-quality, detailed images. The model demonstrates exceptional capabilities in producing photorealistic food imagery, cinematic sci-fi scenes, and intricate character portraits.

### Inputs
- **Detailed text prompts**: The model responds well to prompts that provide specific details about the desired image, such as the subject, style, lighting, materials, and other visual elements.
- **Artistic descriptions**: In addition to realistic prompts, the model can also interpret more abstract, artistic text to generate compelling, visually striking images.

### Outputs
- **High-resolution images**: The training data spans resolutions from 1400x2100 to 4800x7200; the output sizes listed as supported are 2048x1024 and 1920x1088.
- **Diverse visual styles**: The model is adept at generating images in a wide range of styles, from photorealistic to fantastical and surreal.
- **Intricate textures and materials**: The model excels at rendering complex textures, such as metal, glass, and clothing, with a high level of realism.

## Capabilities

`TempestV0.1` demonstrates impressive capabilities in generating high-quality, detailed images across a variety of domains. The model's exceptional performance in photorealistic food imagery is showcased in the example of a "piece of fried grilled meat, with splashes of ketchup and mustard sauce, and exceptional shallow depth-of-field capabilities." Additionally, the model's ability to create cinematic sci-fi scenes is exemplified by the "epic scene of a massive dragon crashing through desert dunes."

The model also showcases its prowess in producing intricate character portraits, as seen in the example of a "Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne d'Arc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd, cinematic, 2k."

## What can I use it for?

The `TempestV0.1` model can be a powerful tool for a variety of applications, particularly in the field of digital art and content creation. Creators and artists can leverage the model's capabilities to generate high-quality, visually striking images for use in illustrations, concept art, product design, and various other creative endeavors.

Additionally, the model's impressive performance in photorealistic imagery makes it a potential asset for industries such as food photography, product visualization, and even architectural visualization. Businesses and professionals in these fields may find the `TempestV0.1` model to be a valuable resource for enhancing their visual content and optimizing their workflows.

## Things to try

One interesting aspect of the `TempestV0.1` model is its ability to generate images with a unique sense of atmosphere and mood. By carefully crafting prompts that evoke a particular emotional or environmental tone, users can create images that are not only visually striking but also convey a deeper, more immersive narrative.

For example, experimenting with prompts that incorporate elements of mystery, tension, or wonder can result in images that captivate the viewer and spark their imagination. Similarly, exploring prompts that blend realistic and fantastical elements can lead to the creation of distinctive, genre-blending visuals that challenge conventional boundaries.

Another intriguing avenue to explore with the `TempestV0.1` model is the potential for combining its capabilities with other AI-powered tools or techniques, such as 3D modeling, animation, or interactive experiences. By integrating the model's image generation prowess with complementary technologies, users may discover new and innovative ways to push the boundaries of visual storytelling and interactive content.

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/upscaled_image.png)

Prompt

Anime full body portrait of a swordsman holding his weapon in front of him. He is facing the camera with a fierce look on his face. Anime key visual (best quality, HD, ~+~aesthetic~+~:1.2)

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/2.png)

Prompt

spacious,circular underground room,{dirtied and bloodied white tiles},amalgamation,flesh,plastic,dark fabric,core,pulsating heart,limbs,human-like arms,twisted angelic wings,arms,covered in skin,feathers,scales,undulate slowly,unseen current,convulsing,head area,chaotic,mass of eyes,mouths,no human features,smaller forms,cherubs,demons,golden wires,surround,holy light,tv static effect,golden glow,shadows,terrifying essence,overwhelming presence,nightmarish,landscape,sparse,cavernous,eerie,dynamic,motion,striking,awe-inspiring,nightmarish,nightmarish,nightmare,horrifying,bio-mechanical,body horror,amalgamation

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/3.png)

Prompt

A robot holding a sign saying 'The Application did not respond' in red colors

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/4.png)

Prompt

A photograph of Hughyen in his early twenties, (an inspiring artist whose art focuses on glitching images and vaporwave color gradients with unexpected conflicting compositions:0.5)

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/7.png)

Prompt

Anime mugshot of a tough woman. She is holding a prison sign that reads "Proteus". Her face is censored. Anime key visual (best quality, HD, ~+~aesthetic~+~:1.2)

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/5.png)

Prompt

Glitch art. 1980s anime, vintage, analogue horror. ((static and noise)), chromatic aberration

![](https://huggingface.co/dataautogpt3/ProteusV0.3/resolve/main/6.png)

Prompt

Masterpiece, glitch, holy holy holy, fog, by DarkIncursio

## ProteusV0.3: The Anime Update

Proteus V0.3 has been advanced with an additional 200,000 anime-related images, further refined by a selection of 15,000 aesthetically pleasing images, enhancing its lighting effects significantly. This upgrade preserves its understanding of prompts and maintains its photorealistic and stylistic capabilities without suffering from catastrophic forgetting.

[](#proteus)Proteus
-------------------

Proteus serves as a sophisticated enhancement over OpenDalleV1.1, leveraging its core functionalities to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. To achieve this, it was fine-tuned using approximately 220,000 GPTV captioned images from copyright-free stock images (with some anime included), which were then normalized. Additionally, DPO (Direct Preference Optimization) was employed through a collection of 10,000 carefully selected high-quality, AI-generated image pairs.

In pursuit of optimal performance, numerous LORA (Low-Rank Adaptation) models are trained independently before being selectively incorporated into the principal model via dynamic application methods. These techniques involve targeting particular segments within the model while avoiding interference with other areas during the learning phase. Consequently, Proteus exhibits marked improvements in portraying intricate facial characteristics and lifelike skin textures, all while sustaining commendable proficiency across various aesthetic domains, notably surrealism, anime, and cartoon-style visualizations.
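
The model card does not describe its "dynamic application methods" in detail. As a generic illustration only, merging a LoRA into selected layers usually means adding a scaled low-rank product to those layers' weights (W' = W + s·BA) and leaving the others untouched. The sketch below uses toy matrices; `merge_loras`, the layer names, and the scale are all hypothetical and are not the author's procedure.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_loras(weights, loras, targets, scale=1.0):
    """Return new weights with W + scale * (B @ A) applied to target layers only.

    weights: {layer_name: matrix}; loras: {layer_name: (B, A)} where B is
    d x r and A is r x k. Layers not named in `targets` are copied unchanged.
    """
    merged = {}
    for name, w in weights.items():
        if name in targets and name in loras:
            b, a = loras[name]
            delta = matmul(b, a)
            merged[name] = [
                [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
                for w_row, d_row in zip(w, delta)
            ]
        else:
            merged[name] = [row[:] for row in w]
    return merged


# Toy 2x2 layers with rank-1 updates; the layer names are made up.
weights = {"attn": [[1.0, 0.0], [0.0, 1.0]], "mlp": [[2.0, 2.0], [2.0, 2.0]]}
loras = {
    "attn": ([[1.0], [2.0]], [[0.5, 0.5]]),
    "mlp": ([[1.0], [1.0]], [[1.0, 1.0]]),
}
merged = merge_loras(weights, loras, targets={"attn"}, scale=0.5)
```

Only `attn` receives the update here; `mlp` has a LoRA available but is not targeted, so its weights are copied as-is.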

[](#settings-for-proteusv03)Settings for ProteusV0.3
----------------------------------------------------

Use these settings for the best results with ProteusV0.3:

CFG Scale: Use a CFG scale of 7 to 8

Steps: 20 to 60 steps; use more steps for more detail, 20 steps for faster results.

Sampler: DPM++ 2M SDE

Scheduler: Karras

Resolution: 1280x1280 or 1024x1024

Please also consider using these keywords to improve your prompts: best quality, HD, `~*~aesthetic~*~`.

If you are having trouble coming up with prompts, you can use this GPT I put together to help you refine them: [https://chat.openai.com/g/g-RziQNoydR-diffusion-master](https://chat.openai.com/g/g-RziQNoydR-diffusion-master)
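
The recommended keywords and ranges can be applied programmatically. The following is a minimal sketch: `with_quality_keywords` and `v03_call_kwargs` are hypothetical helper names, and the choice of 7.0 as the default CFG is an assumption taken from inside the listed range.

```python
# Hypothetical helper names; the keywords and numeric ranges are the ones
# recommended in the settings above.
QUALITY_KEYWORDS = ("best quality", "HD", "~*~aesthetic~*~")


def with_quality_keywords(prompt):
    """Append each recommended keyword unless the prompt already has it."""
    parts = [prompt.strip().rstrip(",")]
    lowered = prompt.lower()
    for keyword in QUALITY_KEYWORDS:
        if keyword.lower() not in lowered:
            parts.append(keyword)
    return ", ".join(parts)


def v03_call_kwargs(prompt, detailed=False, size=1024):
    """Build keyword arguments for a pipeline call within the listed ranges."""
    if size not in (1024, 1280):
        raise ValueError("listed resolutions are 1024x1024 and 1280x1280")
    return {
        "prompt": with_quality_keywords(prompt),
        "guidance_scale": 7.0,                          # listed range: 7 to 8
        "num_inference_steps": 60 if detailed else 20,  # listed range: 20 to 60
        "width": size,
        "height": size,
    }
```

With a loaded pipeline, the result can be passed as `pipe(**v03_call_kwargs("a red fox"), negative_prompt=...)`.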

[](#use-it-with--diffusers)Use it with diffusers
-----------------------------------------------------

    import torch
    from diffusers import (
        StableDiffusionXLPipeline, 
        KDPM2AncestralDiscreteScheduler,
        AutoencoderKL
    )
    
    # Load VAE component
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", 
        torch_dtype=torch.float16
    )
    
    # Configure the pipeline
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "dataautogpt3/ProteusV0.3", 
        vae=vae,
        torch_dtype=torch.float16
    )
    pipe.scheduler = KDPM2AncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe.to('cuda')
    
    # Define prompts and generate image
    prompt = "black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed"
    negative_prompt = "nsfw, bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image"
    
    image = pipe(
        prompt, 
        negative_prompt=negative_prompt, 
        width=1024,
        height=1024,
        guidance_scale=7,
        num_inference_steps=20
    ).images[0]
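
The settings list names DPM++ 2M SDE with the Karras schedule, while the snippet above configures `KDPM2AncestralDiscreteScheduler`. For readers who want the listed sampler, the diffusers scheduler documentation maps "DPM++ 2M SDE Karras" to `DPMSolverMultistepScheduler` with the two options below. This is a sketch assuming a recent diffusers release; `use_listed_sampler` is a hypothetical helper name, and the model card does not say which of the two schedulers gives better results.

```python
# Options the diffusers scheduler docs give for "DPM++ 2M SDE Karras".
DPMPP_2M_SDE_KARRAS = {
    "algorithm_type": "sde-dpmsolver++",
    "use_karras_sigmas": True,
}


def use_listed_sampler(pipe):
    """Swap the pipeline's scheduler for DPM++ 2M SDE with Karras sigmas.

    Hypothetical helper; `pipe` is a loaded StableDiffusionXLPipeline.
    The import is local so this module loads without diffusers installed.
    """
    from diffusers import DPMSolverMultistepScheduler

    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, **DPMPP_2M_SDE_KARRAS
    )
    return pipe
```

Call `use_listed_sampler(pipe)` after `from_pretrained` and before generating.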
    

## ProteusV0.3: The Anime Update

[Proteus](https://aimodels.fyi/creators/huggingFace/dataautogpt3) has been advanced with an additional 200,000 anime-related images, further refined by a selection of 15,000 aesthetically pleasing images, enhancing its lighting effects significantly. This upgrade preserves its understanding of prompts and maintains its photorealistic and stylistic capabilities without suffering from catastrophic forgetting.

## Model inputs and outputs

Proteus V0.3 accepts a wide range of prompts, from detailed anime character descriptions to surreal, nightmare-inspired landscapes. The model can generate high-quality, photorealistic images that capture the essence of the prompt, with impressive attention to detail and stylistic flair.

### Inputs
- Detailed text prompts describing anime characters, scenes, and environments
- Prompts incorporating artistic elements like "best quality", "HD", and "~*~aesthetic~*~"
- Prompts exploring darker, more unsettling themes like "body horror", "nightmarish", and "bio-mechanical"

### Outputs
- Stunning, photorealistic anime-style character portraits
- Captivating, surreal landscapes and environments
- Unsettling, nightmare-inspired amalgamations of organic and mechanical elements

## Capabilities

Proteus V0.3 demonstrates a significant leap forward in its ability to understand and translate intricate text prompts into visually striking images. The model excels at capturing the essence of anime-inspired characters and scenes, infusing them with a heightened sense of realism and cinematic flair.

One of the model's standout capabilities is its handling of dark, unsettling themes. Proteus V0.3 can seamlessly blend organic and mechanical elements, creating truly nightmarish visions that push the boundaries of what is possible in text-to-image generation.

## What can I use it for?

Proteus V0.3 is an excellent choice for artists, illustrators, and creative professionals looking to bring their anime-inspired ideas to life. The model's versatility allows for a wide range of applications, from character design and concept art to worldbuilding and visual development.

Additionally, the model's ability to explore darker, more surreal themes makes it a valuable tool for horror enthusiasts, indie game developers, and anyone seeking to push the boundaries of visual storytelling.

## Things to try

Experiment with blending Proteus V0.3's anime-inspired capabilities with other artistic styles and themes. Try prompts that combine the model's strengths in character portrayal with elements of surrealism, sci-fi, or gothic horror. Explore the limits of the model's ability to capture unsettling, nightmarish visions while maintaining a sense of visual cohesion and artistic flair.

Additionally, consider pairing Proteus V0.3 with other [Proteus](https://aimodels.fyi/creators/huggingFace/dataautogpt3) models or the [OpenDalleV1.1](https://aimodels.fyi/models/huggingFace/opendallev11-dataautogpt3) model to create even more diverse and compelling visual outputs.

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/GGuziQaXYAAudCW.png)

Prompt

3 fish in a fish tank wearing adorable outfits, best quality, hd

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/upscaled_image%20(1).webp)

Prompt

a woman sitting in a wooden chair in the middle of a grass field on a farm, moonlight, best quality, hd, anime art

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/GGvDC_qWUAAcuQA.jpeg)

Prompt

Masterpiece, glitch, holy holy holy, fog, by DarkIncursio

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/upscaled_image2.png)

Prompt

jpeg Full Body Photo of a weird imaginary Female creatures captured on celluloid film, (((ghost))),heavy rain, thunder, snow, water's surface, night, expressionless, Blood, Japan God,(school), Ultra Realistic, ((Scary)),looking at camera, screem, plaintive cries, Long claws, fangs, scales,8k, HDR, 500px, mysterious and ornate digital art, photic, intricate, fantasy aesthetic.

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/upscaled_image.png)

Prompt

The divine tree of knowledge, an interplay between purple and gold, floats in the void of the sea of quanta, the tree is made of crystal, the void is made of nothingness, strong contrast, dim lighting, beautiful and surreal scene. wide shot

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/old.png)

Prompt

The image features an older man, a long white beard and mustache, He has a stern expression, giving the impression of a wise and experienced individual. The mans beard and mustache are prominent, adding to his distinguished appearance. The close-up shot of the mans face emphasizes his facial features and the intensity of his gaze.

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/upscaled_image4.png)

Prompt

Ghost in the Shell Stand Alone Complex

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/collage.png)

Prompt

(impressionistic realism by csybgh), a 50 something male, working in banking, very short dyed dark curly balding hair, Afro-Asiatic ancestry, talks a lot but listens poorly, stuck in the past, wearing a suit, he has a certain charm, bronze skintone, sitting in a bar at night, he is smoking and feeling cool, drunk on plum wine, masterpiece, 8k, hyper detailed, smokey ambiance, perfect hands AND fingers

![](https://huggingface.co/dataautogpt3/ProteusV0.4/resolve/main/collage2.png)

Prompt

black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed

## ProteusV0.4: The Style Update

This update enhances stylistic capabilities, similar to Midjourney's approach, rather than advancing prompt comprehension. Methods used do not infringe on any copyrighted material.

[](#proteus)Proteus
-------------------

Proteus serves as a sophisticated enhancement over OpenDalleV1.1, leveraging its core functionalities to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. To achieve this, it was fine-tuned using approximately 220,000 GPTV captioned images from copyright-free stock images (with some anime included), which were then normalized. Additionally, DPO (Direct Preference Optimization) was employed through a collection of 10,000 carefully selected high-quality, AI-generated image pairs.

In pursuit of optimal performance, numerous LORA (Low-Rank Adaptation) models are trained independently before being selectively incorporated into the principal model via dynamic application methods. These techniques involve targeting particular segments within the model while avoiding interference with other areas during the learning phase. Consequently, Proteus exhibits marked improvements in portraying intricate facial characteristics and lifelike skin textures, all while sustaining commendable proficiency across various aesthetic domains, notably surrealism, anime, and cartoon-style visualizations.

Fine-tuned and trained on a total of 400k+ images at this point.

[](#settings-for-proteusv04)Settings for ProteusV0.4
----------------------------------------------------

Use these settings for the best results with ProteusV0.4:

CFG Scale: Use a CFG scale of 4 to 6

Steps: 20 to 60 steps; use more steps for more detail, 20 steps for faster results.

Sampler: DPM++ 2M SDE

Scheduler: Karras

Resolution: 1280x1280 or 1024x1024

Please also consider using these keywords to improve your prompts: best quality, HD, `~*~aesthetic~*~`.

If you are having trouble coming up with prompts, you can use this GPT I put together to help you refine them: [https://chat.openai.com/g/g-RziQNoydR-diffusion-master](https://chat.openai.com/g/g-RziQNoydR-diffusion-master)
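
Because this version targets style and its recommended CFG range is lower, one practical way to find a good setting for a given prompt is a small grid over the listed ranges with a fixed seed. A minimal sketch: `settings_grid` is a hypothetical helper, and the particular grid points are an assumption picked from inside the listed ranges.

```python
from itertools import product


def settings_grid(cfg_values=(4.0, 5.0, 6.0), step_values=(20, 40, 60)):
    """Yield one keyword-argument dict per (CFG, steps) combination.

    Hypothetical helper: the defaults stay inside the ranges listed above
    (CFG 4 to 6, 20 to 60 steps); the exact grid points are illustrative.
    """
    for cfg, steps in product(cfg_values, step_values):
        yield {"guidance_scale": cfg, "num_inference_steps": steps}


grid = list(settings_grid())
```

Each dict can be merged into a pipeline call, for example `pipe(prompt, width=1024, height=1024, **settings)`, while passing the same seeded `generator` each time so that only CFG and step count vary between images.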

[](#use-it-with--diffusers)Use it with diffusers
-----------------------------------------------------

    import torch
    from diffusers import (
        StableDiffusionXLPipeline, 
        KDPM2AncestralDiscreteScheduler,
        AutoencoderKL
    )
    
    # Load VAE component
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", 
        torch_dtype=torch.float16
    )
    
    # Configure the pipeline
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "dataautogpt3/ProteusV0.4", 
        vae=vae,
        torch_dtype=torch.float16
    )
    pipe.scheduler = KDPM2AncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    pipe.to('cuda')
    
    # Define prompts and generate image
    prompt = "black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed"
    negative_prompt = "nsfw, bad quality, bad anatomy, worst quality, low quality, low resolutions, extra fingers, blur, blurry, ugly, wrongs proportions, watermark, image artifacts, lowres, ugly, jpeg artifacts, deformed, noisy image"
    
    image = pipe(
        prompt, 
        negative_prompt=negative_prompt, 
        width=1024,
        height=1024,
        guidance_scale=4,
        num_inference_steps=20
    ).images[0]
    

## `ProteusV0.4`: The Style Update

This update to the Proteus model enhances its stylistic capabilities, similar to the approach taken by Midjourney, rather than advancing its prompt comprehension. The methods used do not infringe on any copyrighted material.

[Proteus](https://aimodels.fyi/creators/huggingFace/dataautogpt3) serves as a sophisticated enhancement over OpenDalleV1.1, leveraging its core functionalities to deliver superior outcomes. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities. To achieve this, Proteus was fine-tuned using approximately 220,000 GPTV captioned images from copyright-free stock images (with some anime included), which were then normalized. Additionally, DPO (Direct Preference Optimization) was employed through a collection of 10,000 carefully selected high-quality, AI-generated image pairs.

### Inputs

- Textual prompts describing the desired image
- Negative prompts to exclude certain elements

### Outputs

- High-quality, visually stunning images generated based on the input prompts

## Capabilities

Proteus V0.4 showcases enhanced stylistic capabilities compared to previous versions, allowing for the creation of a wide range of visually appealing images across various genres, including surrealism, anime, and cartoon-style art. The model demonstrates the ability to generate intricate facial details and lifelike skin textures, as well as striking lighting effects and atmospheric elements.

## What can I use it for?

The `ProteusV0.4` model can be leveraged for a variety of creative projects, such as:

- Concept art and illustrations for games, films, or books
- Generative art installations and experiments
- Social media content creation
- Visualizing ideas and abstract concepts

## Things to try

Consider experimenting with different prompt structures and keywords to explore the full range of Proteus V0.4's stylistic capabilities. Try incorporating artistic styles, genres, or specific visual elements to see how the model responds and generates unique, visually striking imagery.

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/i.png)

Prompt

score\_9, Side View of a Roman Warrior pierced By a spear, cinimatic

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_bqhxr_00199_.png)

Prompt

a knight fighting a dragon, epic cinimatic

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_dxhdq_00573_.png)

Prompt

score\_9, score\_8\_up, score\_7\_up, score\_6\_up, score\_5\_up, score\_4\_up, photorealistic, ocean,

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_bqhxr_00036_.png)

Prompt

score\_9, score\_8\_up, score\_7\_up, score\_6\_up, score\_5\_up, score\_4\_up, powerful aura, imposing, anime style, 1 guy, cast in shadow, red glowing eyes, manic smile

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_bqhxr_00022_.png)

Prompt

A dark, moody portrait of the holy mary juggling spheres, sacred geometry, dark background, golden ratio composition, hyper realistic, high resolution, photography, in the style of Roberto Ferri and Emil Melmoth

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_bqhxr_00052_.png)

Prompt

score\_9, score\_8\_up, score\_7\_up, score\_6\_up, score\_5\_up, score\_4\_up, Vegeta, iconic Saiyan prince from DBZ, (powerful stance:1.3), (muscle definition:1.2), in mid-battle roar, (Super Saiyan transformation:1.5), crackling aura of energy enveloping him, dynamic background showcasing a devastated battlefield reminiscent of Namek or Earth during epic confrontations; elements of Akira Toriyama's signature art style blended seamlessly with high saturation and bold lines to capture the intensity and raw power synonymous with Dragon Ball Z; dramatic lighting casting strong shadows to enhance Vegeta's chiseled features and battle torn armor; camera angle low and looking up to emphasize his dominance and unyielding spirit.

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/tiger.png)

Prompt

tiger

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_bqhxr_00605_.png)

Prompt

the hulk, score\_9, score\_8\_up, score\_7\_up, score\_6\_up, score\_5\_up, score\_4\_up

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/GIlhXbZWgAAAVdi.jpeg)

Prompt

score\_9, Side View of a Roman Warrior pierced By a spear, cinimatic

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_kcmuc_00530_.png)

Prompt

score\_9, miku

![](https://huggingface.co/dataautogpt3/Proteus-RunDiffusion/resolve/main/ComfyUI_temp_dxhdq_00102_.png)

Prompt

cute anime girl

[](#introducing-proteus-rundiffusion)Introducing Proteus-RunDiffusion
---------------------------------------------------------------------

[https://discord.gg/EDQD3Zpwvc](https://discord.gg/EDQD3Zpwvc)

In the development of Proteus-RunDiffusion, our team embarked on an exploratory project aimed at advancing the capabilities of AI in art creation. Our journey, inspired by the broad achievements of models like Pony Diffusion v6 XL CLIP, led us to experiment with the CLIP architecture in novel ways. Through a serendipitous process of trial, error, and discovery, we developed a unique approach to retraining CLIP that we hadn't initially set out to achieve. This approach inadvertently unlocked new potentials in character recognition, natural language processing, and, most notably, the versatility of artistic expression.

[https://rundiffusion.com/proteus-rundiffusion#view-generation-samples](https://rundiffusion.com/proteus-rundiffusion#view-generation-samples)

The cornerstone of our discovery, which we refer to as "style unlocking," emerged unexpectedly. This breakthrough allows models that were previously limited to specific genres or styles, such as anime, to generate art across a broader spectrum, including high-fidelity photorealism. This was a result of our reimagined CLIP model's ability to interpret and understand prompts in ways that surpass the original boundaries of style and genre.

We have observed that this retraining has also led to significant improvements in handling CFG scaling, effectively broadening the usable CFG range to between 3 and 50 without the previous limitations or failures. This enhancement opens up new avenues for creative expression and technical reliability in AI-generated art.

In terms of usage, we recommend a CLIP setting of -2 along with a strategic use of light negatives for optimizing the artistic output of Proteus-RunDiffusion. The CFG setting can vary depending on the project, with 8.5 being ideal for standard requests and 3.5 for more artistic explorations. The model supports and encourages experimentation with various tags, offering users the freedom to explore their creative visions in depth.
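
The recommendations above, together with the score tags seen in several gallery prompts, can be collected in a small helper. This is a sketch under assumptions: `rundiffusion_settings` and `score_tags` are hypothetical names, and how the "CLIP setting of -2" maps onto a particular UI or library parameter is not specified here, so the value is carried through unchanged.

```python
# Hypothetical helper names; the numbers come from the paragraphs above.
CFG_RANGE = (3.0, 50.0)           # range reported to work after retraining
CFG_BY_MODE = {"standard": 8.5, "artistic": 3.5}
CLIP_SETTING = -2                 # the "CLIP setting of -2" recommended above


def rundiffusion_settings(mode="standard", cfg=None):
    """Return recommended settings, optionally overriding CFG within range."""
    if cfg is None:
        cfg = CFG_BY_MODE[mode]
    lo, hi = CFG_RANGE
    if not lo <= cfg <= hi:
        raise ValueError(f"cfg {cfg} is outside the reported {lo} to {hi} range")
    return {"cfg": cfg, "clip": CLIP_SETTING}


def score_tags(lowest=4):
    """Build the score-tag prefix used in several gallery prompts."""
    tags = ["score_9"] + [f"score_{n}_up" for n in range(8, lowest - 1, -1)]
    return ", ".join(tags)
```

For example, `score_tags() + ", photorealistic, ocean"` reproduces the tag prefix of one of the gallery prompts, and `score_tags(9)` yields the shorter `score_9` form.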

Using Proteus-RunDiffusion: Expect a Different Experience

When you start using Proteus-RunDiffusion, be ready for it to behave differently from other AI art models you've used. It's been designed in a unique way, which means it will respond to your prompts and commands in its own style. This difference is part of what makes it special, but it also means there's a learning curve. You'll need some time to get familiar with how it works and what it can do. So, as you begin, keep an open mind and be prepared to adjust your approach.

Importantly, we want to clarify that our development of Proteus-RunDiffusion was inspired by existing works but does not directly incorporate or rework specific components from models like Pony Diffusion's CLIP. Our advancements are the result of our proprietary research and development efforts, aimed at enhancing the creative possibilities and compatibility across different AI art generation platforms.

[](#there-will-be-a-upcoming-human-preference-study-and-research-publication)There will be an upcoming Human Preference Study and Research Publication
-----------------------------------------------------------------------------------------------------------------------------------------------------

## Introducing Proteus-RunDiffusion

Proteus-RunDiffusion is a sophisticated text-to-image AI model developed by [dataautogpt3](https://aimodels.fyi/creators/huggingFace/dataautogpt3) that builds upon the core functionality of OpenDalleV1.1. Key areas of advancement include heightened responsiveness to prompts and augmented creative capacities.

## Model inputs and outputs

Proteus-RunDiffusion takes text prompts as input and generates high-quality, visually striking images in response. The model demonstrates a strong understanding of prompt instructions, translating them into detailed, photorealistic or stylized renditions across a wide range of genres and aesthetics.

### Inputs
- **Text prompts**: Descriptions of the desired image, which can incorporate various artistic styles, subjects, and creative elements.

### Outputs
- **Images**: Unique, AI-generated visual representations that capture the essence of the input prompt.

## Capabilities
Proteus-RunDiffusion exhibits marked improvements in portraying intricate facial characteristics, lifelike skin textures, and a commendable proficiency across diverse aesthetic domains, including surrealism, anime, and cartoon-style visualizations. The model's capabilities are showcased through the varied examples in the provided description, ranging from cinematic scenes to fantastical creatures and stylized portraits.

## What can I use it for?
[Proteus-RunDiffusion](https://aimodels.fyi/models/huggingFace/proteus-rundiffusion-dataautogpt3) can be utilized for a wide range of creative projects, from conceptual art and digital illustrations to visual storytelling and imaginative worldbuilding. Its ability to blend realism with stylistic flair makes it a valuable tool for hobbyists, artists, and designers seeking to bring their creative visions to life.

## Things to try
Experiment with prompts that combine various artistic styles, subjects, and descriptive elements to see the breadth of Proteus-RunDiffusion's capabilities. Additionally, consider exploring the model's settings and parameters, such as adjusting the CFG scale, number of steps, and sampling methods, to achieve different levels of detail and aesthetic outcomes.
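One way to organize such experiments is to enumerate the combinations up front and render one image per entry. The sketch below is only an illustration: the function name `sweep` is hypothetical, and the sampler strings are placeholder labels to be replaced with whatever names your front end uses.

```python
from itertools import product


def sweep(cfg_scales, step_counts, samplers):
    """Yield one settings dict per combination, in a stable order."""
    for cfg, steps, sampler in product(cfg_scales, step_counts, samplers):
        yield {"cfg": cfg, "steps": steps, "sampler": sampler}


# Two CFG values x two step counts x two samplers = 8 runs.
grid = list(sweep([3.5, 8.5], [20, 40], ["euler", "dpmpp_2m"]))
print(len(grid))  # 8
```

Keeping the seed and prompt fixed across the grid makes it easier to attribute visual differences to a single parameter.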

![](https://huggingface.co/dataautogpt3/PixArt-Sigma-900M/resolve/main/assets/smoke.png)

Prompt

high quality pixel art, a pixel art silhouette of an anime space-themed girl in a space-punk steampunk style, lying in her bed by the window of a spaceship, smoking, with a rustic feel. The image should embody epic portraiture and double exposure, featuring an isolated landscape visible through the window. The colors should primarily be dynamic and action-packed, with a strong use of negative space. The entire artwork should be in pixel art style, emphasizing the characters shape and set against a white background. Silhouette

![](https://huggingface.co/dataautogpt3/PixArt-Sigma-900M/resolve/main/assets/cat.png)

Prompt

black fluffy gorgeous dangerous cat animal creature, large orange eyes, big fluffy ears, piercing gaze, full moon, dark ambiance, best quality, extremely detailed

![](https://huggingface.co/dataautogpt3/PixArt-Sigma-900M/resolve/main/assets/swordswoman.png)

Prompt

Super Closeup Portrait, action shot, Profoundly dark whitish meadow, glass flowers, Stains, space grunge style, Jeanne dArc wearing White Olive green used styled Cotton frock, Wielding thin silver sword, Sci-fi vibe, dirty, noisy, Vintage monk style, very detailed, hd

![](https://huggingface.co/dataautogpt3/PixArt-Sigma-900M/resolve/main/assets/japanese.png)

Prompt

cinematic film still of Kodak Motion Picture Film (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy

![](https://huggingface.co/dataautogpt3/PixArt-Sigma-900M/resolve/main/assets/oldman.png)

Prompt

The image features an older man, a long white beard and mustache, He has a stern expression, giving the impression of a wise and experienced individual. The mans beard and mustache are prominent, adding to his distinguished appearance. The close-up shot of the mans face emphasizes his facial features and the intensity of his gaze.cinematic film still of Kodak Motion Picture Film (Sharp Detailed Image) An Oscar winning movie for Best Cinematography a woman in a kimono standing on a subway train in Japan Kodak Motion Picture Film Style, shallow depth of field, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy

![](https://huggingface.co/dataautogpt3/PixArt-Sigma-900M/resolve/main/assets/animegirl.png)

Prompt

anime girl

[](#pixart-sigma-900m-enhanced-text-to-image-model)PixArt Sigma 900M: Enhanced Text-to-Image Model
==================================================================================================

PixArt Sigma 900M is a text-to-image generation model based on the PixArt Sigma architecture. This version has been expanded to 900M parameters, up from the original 600M base model.

[](#key-features)Key Features
=============================

- 900M parameters (300M more than the base model)
- Improved image generation quality

[](#technical-details)Technical Details
=======================================

- Architecture: PixArt Sigma variant
- Parameter Count: 900 million
- Base Model: 600M PixArt Sigma
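The parameter figures above imply a 50% increase over the base model. The short calculation below works this out and adds a rough weight-size estimate. It uses the nominal counts only, and the 2 bytes per parameter (16-bit weights) is an assumption; the estimate covers just the 900M parameters described here, not any text encoder or VAE used alongside them.

```python
BASE_PARAMS = 600_000_000      # nominal base model size
EXPANDED_PARAMS = 900_000_000  # nominal expanded model size

added = EXPANDED_PARAMS - BASE_PARAMS
relative_increase = added / BASE_PARAMS


def weight_bytes(n_params, bytes_per_param=2):
    """Approximate storage for the weights alone (assumes 16-bit values by default)."""
    return n_params * bytes_per_param


print(added)                                # 300000000
print(f"{relative_increase:.0%}")           # 50%
print(weight_bytes(EXPANDED_PARAMS) / 1e9)  # 1.8 (GB)
```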

[](#credits)Credits
===================

Original PixArt Sigma model by PIXART-Σ ([https://pixart-alpha.github.io/PixArt-sigma-project/](https://pixart-alpha.github.io/PixArt-sigma-project/))

Based on the adaptation of PixArt Sigma by jimmycarter (Jimmy Carter) ([https://huggingface.co/jimmycarter](https://huggingface.co/jimmycarter))

## PixArt-Sigma-900M: Enhanced Text-to-Image Model

The `PixArt-Sigma-900M` is a text-to-image generation model developed by [dataautogpt3](https://aimodels.fyi/creators/huggingFace/dataautogpt3). It is an enhanced version of the PixArt Sigma architecture, capable of generating high-quality, detailed pixel art and scene images from text prompts.

Similar models include the [ProteusV0.2](https://aimodels.fyi/models/huggingFace/proteusv02-dataautogpt3) and [All-In-One-Pixel-Model](https://aimodels.fyi/models/huggingFace/all-in-one-pixel-model-publicprompts), which also focus on generating pixel art and styled images. The [PixArt-Sigma-XL-2-1024-MS](https://aimodels.fyi/models/huggingFace/pixart-sigma-xl-2-1024-ms-pixart-alpha) model from PixArt-alpha is another related model that uses a transformer-based approach for text-to-image generation.

## Model inputs and outputs

The `PixArt-Sigma-900M` model takes text prompts as input and generates corresponding pixel art or scene images as output. The model has been trained on a large dataset of pixel art and styled images, allowing it to produce highly detailed and visually striking results.

### Inputs
- **Text prompts**: The model accepts text prompts that describe the desired image, such as "a pixel art silhouette of an anime space-themed girl in a space-punk steampunk style, lying in her bed by the window of a spaceship, smoking, with a rustic feel."

### Outputs
- **Pixel art and scene images**: The model generates high-quality pixel art and scene images that match the provided text prompts. The images can range from detailed character portraits to complex, multi-layered environments.

## Capabilities

The `PixArt-Sigma-900M` model excels at generating visually appealing and intricate pixel art and scene images from text prompts. It can capture a wide range of styles, from anime and space-themed imagery to dark, moody atmospheres. The model's attention to detail and ability to translate text into cohesive visual compositions make it a powerful tool for artists, designers, and creative professionals.

## What can I use it for?

The `PixArt-Sigma-900M` model can be a valuable asset for various creative projects and applications, such as:

- **Generating concept art and illustrations**: The model can be used to create pixel art and scene images for use in concept art, game development, or other visual media.
- **Enhancing design and creative workflows**: The model can be integrated into design tools or creative applications to assist designers and artists in rapid prototyping and ideation.
- **Educational and training purposes**: The model can be used in educational settings or as part of training materials to demonstrate the capabilities of text-to-image generation and pixel art creation.

## Things to try

Experiment with the `PixArt-Sigma-900M` model by providing a wide range of text prompts, from specific character descriptions to abstract, imaginative scenes. Try prompts that combine different styles, genres, or themes to see how the model handles more complex compositions. Additionally, consider using the model in combination with other image editing or post-processing tools to refine and enhance the generated outputs.
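Prompt combinations can be generated systematically instead of being written by hand. The following is a minimal sketch: `combine_prompts` is a hypothetical helper, and the ordering of style, subject, and mood in the template is an arbitrary choice, not a format the model is known to prefer.

```python
from itertools import product


def combine_prompts(subjects, styles, moods):
    """Build one prompt string for every subject/style/mood combination."""
    return [
        f"{style}, {subject}, {mood}"
        for subject, style, mood in product(subjects, styles, moods)
    ]


prompts = combine_prompts(
    subjects=["an anime space-themed girl", "a black fluffy cat"],
    styles=["pixel art", "cinematic film still"],
    moods=["dark, moody atmosphere"],
)
for p in prompts:
    print(p)
```

The first prompt produced here is `pixel art, an anime space-themed girl, dark, moody atmosphere`; each list can be extended independently to widen the comparison.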