SkyWork have released their image model with editing capabilities. Both base and DMD-distilled versions are released. Some impressive examples in the paper.
https://redd.it/1ql0bol
@rStableDiffusion
ModelSamplingAuraFlow cranked as high as 100 fixes almost every single face adherence, anatomy, and resolution issue I've experienced with Flux2 Klein 9b fp8. I see no reason why it wouldn't help the other Klein variants. Stupid simple workflow in comments, without subgraphs or disappearing noodles.
https://redd.it/1ql0vwj
@rStableDiffusion
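For context on what "cranked as high as 100" changes: the ModelSamplingAuraFlow node in ComfyUI takes a `shift` value that remaps the flow-matching noise schedule. Below is a minimal sketch, assuming the commonly used time-shift formula `shift * t / (1 + (shift - 1) * t)`. The function name is hypothetical and nothing here comes from the post's workflow.

```python
def shifted_sigma(t: float, shift: float) -> float:
    """Remap a noise level t in [0, 1] with a time-shift factor.

    Assumption: this mirrors the usual flow-matching time shift;
    it is an illustration, not code from the linked workflow.
    """
    if shift == 1.0:
        return t  # a shift of 1 leaves the schedule unchanged
    return shift * t / (1.0 + (shift - 1.0) * t)


if __name__ == "__main__":
    # Compare how the schedule midpoint moves as the shift grows.
    for shift in (1.0, 3.0, 100.0):
        print(shift, round(shifted_sigma(0.5, shift), 4))
```

Under this formula, a shift of 100 maps the schedule midpoint (t = 0.5) to roughly 0.99, so nearly all sampling steps are spent at high noise levels. That may be why large values affect overall structure so strongly.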
Voice Clone Studio, powered by Qwen3-TTS and Whisper for auto transcribe.
Hey Guys,
I played around with the release of Qwen3-TTS and made a standalone version that exposes most of its features, using Gradio.
I've included Whisper support, so you can provide your own audio samples and automatically generate the matching text for them in a "Prep Sample" section. This section allows you to review previously saved Voice Samples, import and trim audio or delete unused samples.
I've also added a Voice Design section, but I use it a bit differently from the Qwen3-TTS demos. You design the voice you want and, when happy with the result, you save it as a Voice Sample instead. This way, it can then be used indefinitely with the first tab, using the Qwen3-TTS base model. If you prefer to design and simply save the resulting output directly, there is an option for that as well.
It uses caching: when a voice sample is used, the resulting cache is saved to disk, which makes subsequent queries with that sample faster.
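The disk caching described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `cached_voice_prompt`, the `compute` callback, and the cache layout are all hypothetical, and it assumes the cached result can be pickled.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable


def cached_voice_prompt(
    sample_path: Path,
    cache_dir: Path,
    compute: Callable[[Path], Any],
) -> Any:
    """Return the processed voice prompt for a sample, reusing a disk cache.

    `compute` stands in for whatever expensive step turns an audio sample
    into a reusable prompt (hypothetical; supplied by the caller).
    """
    sample_path = Path(sample_path)
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Key the cache on the file contents, so an edited sample is recomputed.
    digest = hashlib.sha256(sample_path.read_bytes()).hexdigest()
    cache_file = cache_dir / f"{digest}.pkl"

    if cache_file.exists():
        with cache_file.open("rb") as fh:
            return pickle.load(fh)  # cache hit: skip the expensive step

    result = compute(sample_path)
    with cache_file.open("wb") as fh:
        pickle.dump(result, fh)  # cache miss: store for later queries
    return result
```

Keying on a content hash rather than the file name is one possible design choice. It means a trimmed or re-recorded sample is not served a stale cache entry.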
You can find it here: https://github.com/FranckyB/Voice-Clone-Studio
This project was mostly for myself, but I thought it could prove useful to some. 😊
Perhaps a ComfyUI node would be more direct, but I liked the idea of having a simple UI where your prepared Samples remain and can be easily selected with drag and drop.
https://redd.it/1qlfl48
@rStableDiffusion
Flux.2 Klein INPAINT Segment Edit For Accurate Image Edit
https://youtu.be/JUjFiyyQrx8
https://redd.it/1qlhxms
@rStableDiffusion
YouTube
Comfyui Tutorial : Flux 2 Klein Segment Inpaint Edit #comfyui #flux2klein #comfyuitutorial
In this tutorial, I show how to do a segment edit using the mask inpainting method, which focuses on only part of the image rather than editing the whole image. The workflow uses the SAM 3 model for mask creation and the LanPaint KSampler for creating stunning…
🎙️ A New Voice Has Arrived — Qwen3-TTS Custom Node for ComfyUI Is Here
https://www.reddit.com/gallery/1qljtix
https://redd.it/1qljuul
@rStableDiffusion
Flux klein 9b works great out of the box with default comfy workflow
https://redd.it/1qlig5f
@rStableDiffusion
"Chroma2-Kaleidoscope" based on Flux Klein 4B Base is up on HuggingFace! Probably not very usable yet as implied by the "IT'S STILL WIP GUYS CHILL!!" model card note though.
https://redd.it/1qlv6u3
@rStableDiffusion