Is Qwen Image Edit 2511 just better with the 4-step Lightning LoRA?
I have been testing the FP8 version of Qwen Image Edit 2511 with the official ComfyUI workflow, the er_sde sampler and the beta scheduler, and I've got mixed feelings compared to 2509 so far. When changing a single element of a base image, I've found the new version more prone to altering the overall scene (background, character's pose or face), which I consider an undesired effect. It also has the stronger blurring that was already discussed. On a positive note, there are fewer occurrences of ignored prompts.
Someone posted (I can't find it again, maybe it was deleted?) that moving from the 4-step LoRA back to the regular workflow does not improve image quality, even when going all the way to the original recommendation of 40 steps at CFG 4 with BF16 weights, especially regarding the blur.
So I added the 4-step LoRA to my workflow, and I got better prompt comprehension and rendering in almost every test I've done. Why is that? I always thought of these Lightning LoRAs as a fine-tune that trades prompt adherence or image detail for faster generation, but I really couldn't see those drawbacks. What am I missing? Are there still use cases for regular Qwen Edit with standard parameters?
Now, my use of Qwen Image Edit mostly involves short prompts that change one thing in an image at a time. Maybe things are different with longer, more detailed prompts? What's your experience so far?
That said, I won't complain: it means I can get better results in less time. Though it makes me wonder whether an expensive graphics card is still worth it. 😁
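To be concrete, here are the two setups I'm comparing, written out as plain values. This is illustrative only, not an actual ComfyUI API; the CFG value for the Lightning run is an assumption based on how these step-distilled LoRAs are usually run, not something measured here.

```python
# Illustrative comparison of the two sampler setups discussed above.
# These are plain dicts, not ComfyUI node definitions.

standard_run = {
    "steps": 40,
    "cfg": 4.0,
    "sampler": "er_sde",
    "scheduler": "beta",
    "loras": [],
}

lightning_run = {
    "steps": 4,
    "cfg": 1.0,   # assumed: step-distilled LoRAs are typically run near CFG 1; check the LoRA's recommended settings
    "sampler": "er_sde",
    "scheduler": "beta",
    "loras": ["Qwen-Image-Edit-2511 4-step Lightning LoRA"],
}
```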
https://redd.it/1pwtqan
@rStableDiffusion
Z-image Nunchaku is here!
https://github.com/nunchaku-tech/nunchaku/releases/tag/v1.1.0
https://redd.it/1pwyxwd
@rStableDiffusion
GitHub
Release Nunchaku v1.1.0 · nunchaku-tech/nunchaku
What's Changed
chore: support cu130 wheels; bump the version to v1.1.0dev by @lmxyy in #798
feat: custom attention backend by @yuleil in #595
feat: Add support for compute capability 12.1 (Gra...
Is there any AI upsampler that is 100% true to the low-res image?
There is a way to guarantee that an upsampled image is faithful to the low-res image: when you downsample it again, you get the original back, pixel for pixel. Many possible images have this property, including some that just look blurry. But every AI upsampler I've tried that adds in details does NOT have this property; it makes at least minor changes. Is there any I can use that I can be sure DOES have this property? I know it would have to be trained differently than they usually are. That's what I'm asking for.
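The property being asked for is a data-consistency constraint, and it can be enforced as a post-processing step on any upscaler's output instead of retraining: project the AI result back onto the set of images whose downsample equals the low-res input. A minimal sketch, assuming the downsampler is a plain box/average filter at an integer scale factor (a different downsampling kernel would need a different correction):

```python
import numpy as np

def enforce_consistency(upscaled, lowres, scale):
    """Shift each scale x scale block of `upscaled` so that box-downsampling
    the result reproduces `lowres` exactly (per-block mean correction)."""
    h, w = lowres.shape[:2]
    up = upscaled.astype(np.float64).reshape(h, scale, w, scale, -1)
    lo = lowres.astype(np.float64).reshape(h, w, -1)
    block_means = up.mean(axis=(1, 3))               # (h, w, channels)
    up += (lo - block_means)[:, None, :, None, :]    # add each block's residual to every pixel in it
    return up.reshape(h * scale, w * scale, -1)

# Caveat: the guarantee only holds in floating point; clipping to [0, 255]
# or re-quantizing to 8-bit afterwards can break pixel-perfect equality.
```

Averaging each scale x scale block of the returned image then gives back the low-res input exactly (up to float rounding), regardless of what the upscaler hallucinated inside each block.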
https://redd.it/1px0rd7
@rStableDiffusion
Will there be a quantization of TRELLIS2, or low-VRAM workflows for it? Did anyone make it work under 16 GB of VRAM?
https://redd.it/1px8q8r
@rStableDiffusion
Wan 2.2 More Consistent Multipart Video Generation via FreeLong - ComfyUI Node
https://www.youtube.com/watch?v=wZgoklsVplc
https://redd.it/1px9t51
@rStableDiffusion
YouTube
Wan 2.2 Longer Video Generation via FreeLong - ComfyUI-LongLook
Pushing Wan 2.2's motion limits. Generate longer length videos with more consistent direction.
Support me if you like this by buying me a coffee: https://buymeacoffee.com/lorasandlenses
Introducing LongLook - a ComfyUI node pack that implements FreeLong…
The LoRAs just keep coming! This time it's an exaggerated impasto/textured painting style.
https://redd.it/1px705k
@rStableDiffusion