Do you still use older models?
Who here still uses older models, and what for? I still get a ton of use out of SD 1.4 and 1.5. They make great start images.
https://redd.it/1pkwms0
@rStableDiffusion
Chroma on its own kinda sux due to speed and image quality. Z-image kinda sux regarding artistic styles. Both of them together kinda rule: a small 768x1024, 10-step Chroma image, then a 2K Z-image refiner pass.
https://redd.it/1plaaeo
@rStableDiffusion
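The two-stage recipe above (a small Chroma render, then a roughly 2K Z-image refine) mostly comes down to picking a refiner resolution that keeps the aspect ratio. A minimal sketch of that resolution math, assuming the usual constraint that diffusion-model dimensions be a multiple of 64 (the post doesn't state this):

```python
# Hedged sketch of the resolution math for the two-stage recipe:
# stage 1 renders a small 768x1024 image, stage 2 refines at ~2K.
# The multiple-of-64 constraint is an assumption, not from the post.

def refiner_size(width, height, target_long_edge=2048, multiple=64):
    """Scale (width, height) so the long edge is ~target_long_edge,
    rounding each side to the nearest multiple of `multiple`."""
    scale = target_long_edge / max(width, height)
    snap = lambda v: max(multiple, round(v * scale / multiple) * multiple)
    return snap(width), snap(height)

# 768x1024 portrait -> 1536x2048 for the refiner pass
print(refiner_size(768, 1024))
```

The snapping step matters because most latent-diffusion refiners reject or crop dimensions that aren't divisible by their VAE's downscale factor.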
What makes Z-image so good?
I'm a bit of a noob when it comes to AI and image generation. Mostly I just watch different models generate images, like Qwen or SD. I use Nano Banana as a hobby.
The question I had was: what makes Z-image so good? I know it can run efficiently on older GPUs and generate good images, but what prevents other models from doing the same?
tl;dr: what is Z-image doing differently? Better training, better weights?
Also: what is the Z-image Base that everyone is talking about? The next version of Z-image?
https://redd.it/1pldusz
@rStableDiffusion
Increase Your Level of Detail With Detail Daemon Nodes and Generate Images at 4K With Z Img Turbo and DyPE
https://youtu.be/8rRCfhj4GBo
https://redd.it/1plh6dm
@rStableDiffusion
YouTube
ComfyUI Tutorial : Increase Your Details and Generate Images at 4K With Z Img Turbo #comfyui
In this tutorial I will show you how to take your images to the next level using two workflows. The first one is based on DyPE nodes, which let you generate images at 4K resolution without upscaling, going beyond the model's training…
Use Qwen3-VL-8B for Image-to-Image Prompting in Z-Image!
Knowing that Z-image uses Qwen3-VL-4B as its text encoder, I've been using Qwen3-VL-8B for image-to-image prompting: it writes detailed denoscriptions of images, which I then feed to Z-image.
I tested all the Qwen3-VL models from 2B to 32B and found that the denoscription quality is similar for 8B and above. Z-image seems to really love long, detailed prompts, and in my testing it just prefers prompts from the Qwen3 series of models.
P.S. I strongly believe that some of the TechLinked videos were used in the training dataset; otherwise it's uncanny how closely Z-image managed to reproduce the images from the text denoscription alone.
Prompt: "This is a medium shot of a man, identified by a lower-third graphic as Riley Murdock, standing in what appears to be a modern studio or set. He has dark, wavy hair, a light beard and mustache, and is wearing round, thin-framed glasses. He is directly looking at the viewer. He is dressed in a simple, dark-colored long-sleeved crewneck shirt. His expression is engaged and he appears to be speaking, with his mouth slightly open. The background is a stylized, colorful wall composed of geometric squares in various shades of blue, white, and yellow-orange, arranged in a pattern that creates a sense of depth and visual interest. A solid orange horizontal band runs across the upper portion of the background. In the lower-left corner, a graphic overlay displays the name "RILEY MURDOCK" in bold, orange, sans-serif capital letters on a white rectangular banner, which is accented with a colorful, abstract geometric design to its left. The lighting is bright and even, typical of a professional video production, highlighting the subject clearly against the vibrant backdrop. The overall impression is that of a presenter or host in a contemporary, upbeat setting. Riley Murdock, presenter, studio, modern, colorful background, geometric pattern, glasses, dark shirt, lower-third graphic, video production, professional, engaging, speaking, orange accent, blue and yellow wall."
Original screenshot
Image generated from the text denoscription alone
Image generated from the text denoscription alone
Image generated from the text denoscription alone
https://redd.it/1pli1np
@rStableDiffusion
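The image-to-prompt step above is just "show the screenshot to a Qwen3-VL model, ask for an exhaustive denoscription, and paste that denoscription into Z-image as the prompt." A minimal sketch of the request builder, using the multimodal chat-message format that transformers' `apply_chat_template` accepts for vision-language models; the instruction wording is an assumption, since the post doesn't give the exact captioning prompt:

```python
# Hedged sketch: build the chat message asking a Qwen3-VL model for a
# long, detailed image denoscription to reuse verbatim as a Z-image prompt.
# The instruction text is an assumption, not taken from the post.

def build_caption_request(image_path):
    """Return a chat-message list in the multimodal format that
    vision-language processors (e.g. Qwen-VL's) accept."""
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": image_path},
            {"type": "text",
             "text": "Describe this image in exhaustive detail: subjects, "
                     "composition, colors, lighting, text overlays, and mood."},
        ],
    }]

messages = build_caption_request("screenshot.png")
```

Actually generating the denoscription would then go through the model's processor and `generate` call; the key design point from the post is just that the denoscription is passed to Z-image unedited, since the model appears tuned to Qwen3-style long captions.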
Excuse me, WHO MADE THIS NODE??? Please elaborate, how can we use this node?
https://redd.it/1pli7p9
@rStableDiffusion
The upcoming Z-image base will be a unified model that handles both image generation and editing.
https://redd.it/1pllpaf
@rStableDiffusion
Creating a person LoRA for Z-Image Turbo, for beginners, with AI-Toolkit
https://redd.it/1plojo7
@rStableDiffusion