r/StableDiffusion – Telegram
How are people combining Stable Diffusion with conversational workflows?

I’ve seen more discussions lately about pairing Stable Diffusion with text-based systems, like using an AI chatbot to help refine prompts, styles, or iteration logic before image generation.
For those experimenting with this kind of setup:
Do you find conversational layers actually improve creative output, or is manual prompt tuning still better?
I'm interested in hearing practical experiences rather than tool recommendations or promotions.

https://redd.it/1pyuowm
@rStableDiffusion
FYI: You can train a Wan 2.2 LoRA with 16 GB of VRAM.

I've seen a lot of posts where people are doing initial image generation in Z-Image-Turbo and then animating it in Wan 2.2. If you're doing that solely because you prefer the aesthetics of Z-Image-Turbo, then carry on.

But if you're doing it out of perceived resource constraints, it may help to know that you can train LoRAs for Wan 2.2 in ostris/ai-toolkit with 16 GB of VRAM. Just start with the default 24GB config file and add these parameters under the model section of your config:

layer_offloading: true
layer_offloading_text_encoder_percent: 0.6
layer_offloading_transformer_percent: 0.6
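
For context, here's roughly where those keys sit in the file. This is just a sketch of the usual ai-toolkit YAML layout; the placeholder values are mine, not from the post, so keep whatever your default 24GB config already has:

      model:
        name_or_path: "..."   # keep your config's existing value
        arch: "..."           # keep your config's existing value
        # the three offloading keys added for 16GB cards:
        layer_offloading: true
        layer_offloading_text_encoder_percent: 0.6
        layer_offloading_transformer_percent: 0.6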


You can raise or lower the offloading percentages to find what works for your setup. Of course, your batch size, gradient accumulation, and resolution all have to be reasonable as well (e.g., I used batch_size: 2, gradient_accumulation: 2, resolution: 512).
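
Those trainer values live in the train and datasets sections of the same config. Again, this is a sketch of the typical ai-toolkit layout, not the exact file; the folder path is a placeholder and anything not shown stays at your config's defaults:

      train:
        batch_size: 2
        gradient_accumulation: 2
        # leave steps, lr, optimizer, etc. at the 24GB config defaults
      datasets:
        - folder_path: "/path/to/your/dataset"   # placeholder path
          resolution: [512]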

I've only tested two different LoRA runs for Wan 2.2, but so far it trains more easily and, IMO, looks more natural than Z-Image-Turbo, which tends to come across as trying too hard to be realistic and gritty.

https://redd.it/1pz0w56
@rStableDiffusion