Why do I get better results with the Qwen Image Edit 4-step LoRA than with the original 20 steps?
The 4-step version takes less time and the output is better. Aren't more steps supposed to produce a better image? I'm not familiar with this stuff, but I thought slower/bigger/more steps would mean better results. Yet with 4 steps it renders everything accurately, including text and the second image I uploaded, whereas at 20 steps the text and the second image I asked it to include get distorted.
https://redd.it/1pt6fdn
@rStableDiffusion
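A likely explanation (hedged, since results vary per workflow): the 4-step LoRA is a step-distilled "Lightning"-style adapter, trained so that a few large denoising jumps land on a clean image with guidance effectively disabled. The base 20-step schedule is not automatically better, and a mismatched sampler/CFG combination is a common cause of distorted text and reference images. Below is a minimal diffusers-style sketch of the few-step setup; the pipeline class is diffusers' QwenImageEditPipeline, but the LoRA path and the exact parameter names are assumptions to check against your installed version:

```python
# Hedged sketch: running a step-distilled "Lightning"-style LoRA at 4 steps.
# Assumes diffusers' QwenImageEditPipeline; the LoRA path is a placeholder.
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")

# A step-distilled LoRA is trained so that 4 big denoising jumps land on a
# clean image; it is NOT the base model merely run with fewer steps.
pipe.load_lora_weights("path/or/repo-of-4step-lora")  # placeholder

image = load_image("input.png")
out = pipe(
    image=image,
    prompt="replace the sign text with 'OPEN'",
    num_inference_steps=4,   # what the LoRA was distilled for
    true_cfg_scale=1.0,      # distilled models expect guidance off
).images[0]
out.save("edited.png")
```

Conversely, running the distilled LoRA at 20 steps, or the base model at 4, tends to degrade output, since each setup is tuned to its own schedule.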
Got a nice ZIT workflow to work and it produces great images on 8GB VRAM
https://redd.it/1ptfx6n
@rStableDiffusion
Last week in Image & Video Generation
I curate a weekly multimodal AI roundup; here are the open-source diffusion highlights from last week:
TurboDiffusion - 100-205x Speed Boost
Accelerates video diffusion models by 100-205 times through architectural optimizations.
Open source with full code release for real-time video generation.
[GitHub](https://github.com/thu-ml/TurboDiffusion) | [Paper](https://arxiv.org/pdf/2512.16093)
https://reddit.com/link/1ptggkm/video/azgwbpu4pu8g1/player
Qwen-Image-Layered - Layer-Based Generation
Decomposes images into editable RGBA layers with open weights.
Enables precise control over semantic components during generation.
Hugging Face | Paper | Demo
https://reddit.com/link/1ptggkm/video/jq1ujox5pu8g1/player
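For a feel of what layered output buys you, here is a minimal sketch, assuming the model hands back a stack of RGBA layer files (the filenames are hypothetical): edit one semantic layer, then re-composite with Pillow.

```python
# Hedged sketch of consuming layered RGBA output: edit one semantic layer,
# then re-composite. Layer filenames are hypothetical placeholders.
from PIL import Image, ImageEnhance

layers = [Image.open(p).convert("RGBA")
          for p in ("layer0_bg.png", "layer1_subject.png", "layer2_text.png")]

# Editing one layer leaves the others untouched -- the point of the
# decomposition.
layers[1] = ImageEnhance.Brightness(layers[1]).enhance(1.3)

canvas = Image.new("RGBA", layers[0].size)
for layer in layers:
    canvas = Image.alpha_composite(canvas, layer)  # stack back-to-front
canvas.save("recomposited.png")
```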
LongVie 2 - 5-Minute Video Diffusion
Generates 5-minute continuous videos with controllable elements.
Open weights and code for extended video generation.
[Paper](https://huggingface.co/papers/2512.13604) | [GitHub](https://github.com/Vchitect/LongVie)
https://reddit.com/link/1ptggkm/video/8kr7ue8pqu8g1/player
WorldPlay (Tencent) - Interactive 3D World Generation
Generates interactive 3D worlds with geometric consistency.
Model available for local deployment.
Website | Model
https://reddit.com/link/1ptggkm/video/dggrhxqyqu8g1/player
Generative Refocusing - Depth-of-Field Control
Controls focus and depth of field in generated or existing images.
Open source implementation for bokeh and focus effects.
[Website](https://generative-refocusing.github.io/) | [Demo](https://huggingface.co/spaces/nycu-cplab/Genfocus-Demo) | [Paper](https://arxiv.org/abs/2512.16923) | [GitHub](https://github.com/rayray9999/Genfocus)
https://reddit.com/link/1ptggkm/video/a9jjbir6pu8g1/player
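As a rough illustration of the underlying idea, not the Genfocus implementation: given a depth map, synthetic refocus can be approximated by blending a sharp image with a blurred copy, weighted by each pixel's distance from the chosen focal plane. A minimal NumPy/Pillow sketch, assuming a depth map normalized to [0, 1]:

```python
# Hedged sketch of the general idea (not the Genfocus code): fake refocus by
# blending a sharp image with a blurred copy, weighted by distance from the
# chosen focal depth. Assumes a depth map normalized to [0, 1].
import numpy as np
from PIL import Image, ImageFilter

img = np.asarray(Image.open("photo.png").convert("RGB"), dtype=np.float32)
depth = np.asarray(Image.open("depth.png").convert("L"), dtype=np.float32) / 255.0

focal_depth, aperture = 0.4, 6.0  # focus plane and blur strength (made up)
blurred = np.asarray(
    Image.fromarray(img.astype(np.uint8)).filter(ImageFilter.GaussianBlur(8)),
    dtype=np.float32,
)

# Pixels far from the focal plane get more of the blurred copy.
w = np.clip(np.abs(depth - focal_depth) * aperture, 0.0, 1.0)[..., None]
out = img * (1.0 - w) + blurred * w
Image.fromarray(out.astype(np.uint8)).save("refocused.png")
```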
DeContext - Protection Against Unwanted Edits
Protects images from manipulation by diffusion models like FLUX.
Open source tool for adding imperceptible perturbations that block edits.
Website | Paper | GitHub
https://preview.redd.it/iuyeboy8pu8g1.png?width=1427&format=png&auto=webp&s=6e451e1336fcb8d5cebab46956605d42ecce8604
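The general mechanism behind this family of tools (hedged; this is not DeContext's actual training objective) is an adversarial perturbation: a small, eps-bounded change that pushes an encoder's features away from the original so downstream edits latch onto the wrong representation. A generic PGD-style sketch in PyTorch, where `encoder` stands in for any differentiable feature model:

```python
# Hedged sketch of the general mechanism (not DeContext's actual objective):
# a PGD-style loop that adds an eps-bounded perturbation pushing an encoder's
# features away from the original, so downstream edits misfire.
import torch

def protect(image, encoder, eps=8 / 255, steps=40, alpha=1 / 255):
    """image: (1,3,H,W) in [0,1]; encoder: any differentiable feature model."""
    target = encoder(image).detach()
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        # Negative MSE: minimizing this loss maximizes feature distance.
        loss = -torch.nn.functional.mse_loss(encoder(image + delta), target)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # descend on -distance
            delta.clamp_(-eps, eps)             # keep it imperceptible
            delta.grad = None
    return (image + delta).clamp(0, 1).detach()
```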
Flow Map Trajectory Tilting - Test-Time Scaling
Improves diffusion outputs at test time using flow maps.
Adjusts generation trajectories without retraining models.
[Paper](https://arxiv.org/abs/2511.22688) | [Website](https://flow-map-trajectory-tilting.github.io/)
https://preview.redd.it/7huqzj9bpu8g1.png?width=1140&format=png&auto=webp&s=baf5ee057c6c69d2cb1566f0a743c73419de99ad
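For context on what "test-time scaling" means in its simplest form (plainly not the paper's flow-map method): spend extra inference compute to pick a better sample, with no retraining. A minimal best-of-N sketch where `pipe` and `scorer` are placeholders for any generator and quality metric:

```python
# Hedged sketch of the simplest test-time scaling baseline (best-of-N with a
# scorer), NOT the paper's flow-map trajectory tilting.
import torch

def best_of_n(pipe, prompt, scorer, n=8):
    best_img, best_score = None, float("-inf")
    for seed in range(n):
        g = torch.Generator("cuda").manual_seed(seed)
        img = pipe(prompt, generator=g).images[0]  # one full sampling run
        score = scorer(img, prompt)                # e.g. CLIP similarity
        if score > best_score:
            best_img, best_score = img, score
    return best_img
```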
StereoPilot - 2D to Stereo 3D
Converts 2D videos to stereo 3D with open model and code.
Full source release for VR content creation.
Website | Model | GitHub
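The classic non-learned baseline here is depth-image-based rendering: shift each pixel horizontally in proportion to its depth to synthesize the second eye's view. A minimal NumPy sketch (hedged; StereoPilot's learned method will handle occlusions far better than this forward warp):

```python
# Hedged sketch of the classic DIBR baseline behind 2D-to-stereo conversion,
# not StereoPilot's learned method.
import numpy as np

def synthesize_right_view(rgb, depth, max_disparity=16):
    """rgb: (H,W,3) uint8; depth: (H,W) float in [0,1], 1 = near."""
    h, w, _ = rgb.shape
    right = np.zeros_like(rgb)
    shift = (depth * max_disparity).astype(np.int32)  # near => big shift
    cols = np.arange(w)[None, :].repeat(h, axis=0)
    rows = np.arange(h)[:, None].repeat(w, axis=1)
    new_cols = np.clip(cols - shift, 0, w - 1)        # shift pixels left
    right[rows, new_cols] = rgb                       # forward warp
    return right  # disocclusion holes remain; real methods inpaint them
```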
LongCat-Video-Avatar - "An expressive avatar model built upon LongCat-Video"
[Website](https://meigen-ai.github.io/LongCat-Video-Avatar/) | [GitHub](https://github.com/meituan-longcat/LongCat-Video) | [Paper](https://arxiv.org/abs/2510.22200) | [ComfyUI](https://huggingface.co/Kijai/LongCat-Video_comfy/tree/main/Avatar)
TRELLIS 2 - 3D generative model designed for high-fidelity image-to-3D generation
Model | Demo (I saw someone playing with this in Comfy but forgot to save the post)
Wan 2.6 was released last week but only to the API providers for now.
Check out the full newsletter for more demos, papers, and resources.
* Reddit post limits stopped me from adding the rest of the videos/demos.
https://redd.it/1ptggkm
@rStableDiffusion
AI Livestream of a Simple Corner Store that updates via audience prompts
https://www.youtube.com/live/j03bfyZ5GV4
https://redd.it/1ptimcb
@rStableDiffusion
YouTube: Perpetual Rainy Night at Cyberpunk Corner Store (ASMR)
A "relaxing" night scene at a corner store in a cyberpunk city. The scene perpetually grows based on viewer comments, which the streamer reads live or afterward.
Block Edit & Save your LoRAs In ComfyUI - LoRA Loader Scheduling Nodes and a few extra goodies for Xmas. Z-image/Flux/Wan/SDXL/QWEN/SD1.5
https://www.youtube.com/watch?v=3CdyGxvYeHo
https://redd.it/1pthc20
@rStableDiffusion
Realtime LoRA Toolkit V2.0 out now:
Edit LoRAs block by block (a rough sketch of the idea follows below)
Save edited LoRAs
Schedule LoRA strength during generation
Tweak base model blocks
Save the revised base model
For: Z-image, Wan, Qwen, Flux, SDXL and SD 1.5
Other additions: clipboard image input…
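Under the hood, block-wise LoRA editing amounts to scaling the low-rank tensors that belong to a given block. A hedged sketch using safetensors; key naming varies by model family and trainer (kohya-style `lora_up`/`lora_down` shown here), so the block prefixes below are placeholders:

```python
# Hedged sketch of block-wise LoRA editing: scale the LoRA tensors belonging
# to chosen blocks and save the result. Key prefixes are placeholders; real
# key names depend on the model family and training tool.
from safetensors.torch import load_file, save_file

state = load_file("my_lora.safetensors")

block_strengths = {"transformer_blocks.0.": 0.0,   # mute block 0 entirely
                   "transformer_blocks.7.": 1.5}   # boost block 7

for key, tensor in state.items():
    for prefix, s in block_strengths.items():
        if prefix in key and key.endswith("lora_up.weight"):
            # Scaling only the up-projection scales the whole low-rank delta
            # (delta_W = up @ down), i.e. a per-block strength.
            state[key] = tensor * s

save_file(state, "my_lora_edited.safetensors")
```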
What model or LoRA should I use to generate images that are closest to this style?
https://redd.it/1ptnqzp
@rStableDiffusion
I built a "Real vs AI" Turing Test using Vibe Coding (featuring Flux & Midjourney v6). Human accuracy is dropping fast.
https://redd.it/1pttd83
@rStableDiffusion
Qwen3-TTS Steps Up: Voice Cloning and Voice Design! (link to blog post)
https://qwen.ai/blog?id=qwen3-tts-vc-voicedesign
https://redd.it/1pttnlk
@rStableDiffusion