r/StableDiffusion – Telegram
I'M BACK FINALLY WITH AN UPDATE! 12GB GGUF LTX-2 WORKFLOWS FOR T2V/I2V/V2V/IA2V/TA2V!!! ALL WITH SUPER COOL STUFF AND THINGS!

https://redd.it/1ql1fc8
@rStableDiffusion
ModelSamplingAuraFlow cranked as high as 100 fixes almost every single face adherence, anatomy, and resolution issue I've experienced with Flux2 Klein 9b fp8. I see no reason why it wouldn't help the other Klein variants. Stupid simple workflow in comments, without subgraphs or disappearing noodles.
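
For anyone wondering what a shift of 100 actually does to the schedule, here is a rough sketch. It assumes the node remaps each flow timestep t in [0, 1] as shift * t / (1 + (shift - 1) * t), which is my understanding of ComfyUI's discrete-flow sampling and not something the post confirms. The function names are made up for illustration.

```python
def time_snr_shift(shift: float, t: float) -> float:
    """Remap a flow timestep t in [0, 1]; assumed formula, see note above.

    With shift == 1 the schedule is unchanged; larger values push
    intermediate timesteps toward 1 (the high-noise end).
    """
    return shift * t / (1.0 + (shift - 1.0) * t)


def shifted_schedule(shift: float, steps: int) -> list[float]:
    """Evenly spaced timesteps from 1 down to 0, remapped by the shift."""
    return [time_snr_shift(shift, 1.0 - i / steps) for i in range(steps + 1)]


if __name__ == "__main__":
    for shift in (1.0, 3.0, 100.0):
        print(shift, [round(s, 3) for s in shifted_schedule(shift, 4)])
```

If that formula holds, a very high shift keeps almost every step near full noise and leaves the final descent to the last step or two, which might be why composition and anatomy settle differently.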

https://redd.it/1ql0vwj
@rStableDiffusion
Voice Clone Studio, powered by Qwen3-TTS, with Whisper for automatic transcription.

Hey Guys,

I played around with the release of Qwen3-TTS and made a standalone version that exposes most of its features, using Gradio.

I've included Whisper support, so you can provide your own audio samples and automatically generate the matching text for them in a "Prep Sample" section. This section lets you review previously saved Voice Samples, import and trim audio, or delete unused samples.

I've also added a Voice Design section, but I use it a bit differently from the Qwen3-TTS demos. You design the voice you want and, when you're happy with the result, save it as a Voice Sample. It can then be reused indefinitely in the first tab, using the Qwen3-TTS base model. If you prefer to design and simply save the resulting output directly, there is an option for that as well.

It uses caching: when a voice sample is used, the resulting cache is saved to disk, which makes subsequent queries with that sample faster.
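
The general idea of that kind of disk cache can be sketched as below. This is not the project's actual code; `CACHE_DIR`, `sample_key`, `load_or_build`, and the `build` callback are hypothetical names, and the real app may key or store its cache differently. The sketch hashes the sample audio plus its transcript and reuses a pickled result when the same pair shows up again.

```python
import hashlib
import pickle
from pathlib import Path

CACHE_DIR = Path("voice_cache")  # hypothetical location, not from the project


def sample_key(audio_path, transcript: str) -> str:
    """Derive a stable key from the audio bytes and the transcript text."""
    h = hashlib.sha256()
    h.update(Path(audio_path).read_bytes())
    h.update(b"\0")  # separator so audio and text cannot run together
    h.update(transcript.encode("utf-8"))
    return h.hexdigest()


def load_or_build(audio_path, transcript: str, build, cache_dir=CACHE_DIR):
    """Return the cached result for this sample, computing it on a miss.

    `build(audio_path, transcript)` stands in for the expensive step
    (for example, preparing a voice prompt); it must return a picklable value.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{sample_key(audio_path, transcript)}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    result = build(audio_path, transcript)
    with cache_file.open("wb") as f:
        pickle.dump(result, f)
    return result
```

Keying on content instead of the file name means a renamed sample still hits the cache, while an edited or re-trimmed one gets rebuilt. Only unpickle cache files you wrote yourself, since `pickle.load` can run arbitrary code.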

You can find it here: https://github.com/FranckyB/Voice-Clone-Studio

This project was mostly for myself, but I thought it could prove useful to some. 😊

Perhaps a ComfyUI node would be more direct, but I liked the idea of having a simple UI where your prepared Samples remain and can be easily selected with a drag and drop.

https://redd.it/1qlfl48
@rStableDiffusion
"Chroma2-Kaleidoscope", based on Flux Klein 4B Base, is up on Hugging Face! It's probably not very usable yet, though, as implied by the "IT'S STILL WIP GUYS CHILL!!" note on the model card.
https://redd.it/1qlv6u3
@rStableDiffusion