r/StableDiffusion – Telegram
KaniTTS2 - open-source 400M TTS model with voice cloning, runs in 3GB VRAM. Pretrain code included.

https://redd.it/1r4svm5
@rStableDiffusion
ACEStep1.5 LoRA + Prompt Blending & Temporal Latent Noise Mask in ComfyUI: Think Daft Punk Chorus and Dr Dre verse

https://redd.it/1r4ops9
Dear QWEN Team - Happy New Year!

Thank you for all your contributions to the Open Source community over the past year. You guys are awesome!

Please enjoy a blessed new year celebration, and we can't wait to see what cool stuff you have in store for us in the Year of the Horse!

Have a great time - 新年快樂 (Happy New Year)!

https://redd.it/1r51lct
Quants for FireRed-Image-Edit 1.0: FP8 / NVFP4

https://preview.redd.it/6irwlbb4qhjg1.png?width=1328&format=png&auto=webp&s=d7061447c977b6f11afdcbdca779216037f7d006

I just created quantized models for the new FireRed-Image-Edit 1.0.

They work with the standard Qwen-Edit workflow, text encoder, and VAE.

Here you can download the FP8 and NVFP4 versions.

Happy Prompting!

https://huggingface.co/Starnodes/quants

https://huggingface.co/FireRedTeam/FireRed-Image-Edit-1.0
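For context on what FP8 quantization does to the weights: E4M3 (the usual FP8 format for inference) keeps only 3 mantissa bits per value, with a max normal value around 448. Below is a toy pure-Python sketch of that rounding, just to illustrate the precision loss; it is a hypothetical helper that ignores subnormals and special values, not the actual tooling used to produce these quants:

```python
import math

def quantize_e4m3(x, scale=1.0):
    """Round one float to the nearest FP8 E4M3-representable value
    (3 mantissa bits, max normal ~448). Simplified illustration:
    ignores subnormals, NaN, and infinity."""
    v = x / scale
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    mag = min(abs(v), 448.0)         # clamp to the E4M3 max normal
    e = math.floor(math.log2(mag))   # unbiased exponent of the value
    step = 2.0 ** (e - 3)            # spacing of representable values here
    return sign * round(mag / step) * step * scale

weights = [0.7431, -1.25, 0.0912, 3.3]
fp8 = [quantize_e4m3(w) for w in weights]
```

For example, 0.7431 lands on 0.75 and -1.25 is exactly representable, which is why FP8 checkpoints stay close to the originals for most weights while quartering the storage versus FP32.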

https://redd.it/1r4pmby
Training a LoRA on a 5060 Ti 16GB: is this the best speed, or is there a way to reduce iteration time?
https://redd.it/1r559f0
SDXL is still the undisputed king of nsfw content

When will this change? Yes, you might get an extra arm and have to regenerate a couple of times, but you get what you ask for. I have high hopes for Flux Klein, but progress is slow.

https://redd.it/1r55ib0
ComfyUI - AceStep v1.5 is amazing

I thought I'd take a break from image generation and look at the new audio side of ComfyUI: ACE-Step 1.5 Music Generation (1.7B).

This is my best effort so far:

https://www.youtube.com/watch?v=SfloXIUf1C0

Lyrics are in the video header.

Settings: song duration 180 s, 150 BPM, 100 steps, CFG 1.1, Euler sampler, simple scheduler, denoise 1.00.
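One practical implication of those settings: at 150 BPM, a 180-second song gives you a fixed budget of beats and bars that the lyrics have to fill. A small back-of-the-envelope helper (hypothetical, not part of ComfyUI or ACE-Step) to work that out:

```python
# Settings as reported in the post.
SETTINGS = {
    "duration_s": 180, "bpm": 150, "steps": 100,
    "cfg": 1.1, "sampler": "euler", "scheduler": "simple",
    "denoise": 1.0,
}

def beats_and_bars(duration_s, bpm, beats_per_bar=4):
    """Total beats in the song, and the equivalent bars assuming 4/4 time."""
    beats = duration_s * bpm / 60
    return beats, beats / beats_per_bar

beats, bars = beats_and_bars(SETTINGS["duration_s"], SETTINGS["bpm"])
```

Here that works out to 450 beats, or 112.5 bars in 4/4, which is a useful sanity check when pacing verse/chorus sections in the lyric prompt.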

https://redd.it/1r4yk19
🚨 SeansOmniTagProcessor V2: batch folder / single video file options + UI overhaul + now running Qwen3-VL-8B-Abliterated 🖼️ LoRA dataset maker for video/images 🎥
https://redd.it/1r5crcy
Tried Z-Image Turbo on 32GB RAM + RTX 3050 via ForgeUI — consistently ~6–10s per 1080p image

Hey folks, been tinkering with SD setups and wanted to share some real-world performance numbers in case it helps others in the same hardware bracket.
Hardware:
• RTX 3050 (laptop GPU)
• 32 GB RAM
• Running everything through ForgeUI + Z-Image Turbo
Workflow:
• 1080p outputs
• Default-ish Turbo settings (sped up sampling + optimized caching)
• No crazy overclocking, just stable system config
Results:
I’m getting pretty consistent ~6–10 seconds per image at 1080p depending on the prompt complexity and sampler choice. Even with denser prompts and CFG bumped up, the RTX 3050 still holds its own surprisingly well with Turbo processing.
Before this I was bracing for 20–30s renders, but the combined ForgeUI + Z-Image Turbo setup feels like a legit game changer for this class of GPU.
Curious to hear from folks with similar rigs:
• Is that ~6–10s/1080p what you’re seeing?
• Any specific Turbo settings that squeeze out more performance without quality loss?
• How do your artifacting/noise results compare at faster speeds?
• Anyone paired this with other UIs like Automatic1111 or NMKD and seen big diffs?
Appreciate any tips or shared benchmarks!
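For anyone wanting to compare numbers across rigs apples-to-apples, a minimal timing harness helps; this is a generic Python sketch, not tied to ForgeUI or any specific API, where `generate` is a stand-in for whatever your UI's generation call is:

```python
import time

def bench(generate, n=5):
    """Call generate() n times and return the average wall-clock
    seconds per call. `generate` is a placeholder for a real
    image-generation call (or an HTTP request to your UI's API)."""
    times = []
    for _ in range(n):
        t0 = time.perf_counter()
        generate()
        times.append(time.perf_counter() - t0)
    return sum(times) / len(times)

# Demo with a dummy CPU workload in place of a real diffusion call.
avg = bench(lambda: sum(i * i for i in range(100_000)), n=3)
```

Averaging several runs matters because the first generation usually includes model load and compilation overhead; discard or separate the warm-up run when reporting seconds per image.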

https://redd.it/1r58tz2