r/StableDiffusion – Telegram
This media is not supported in your browser
VIEW IN TELEGRAM
I used Flux-Schnell to generate card art in real time as the player progresses
https://redd.it/1pnlvsk
@rStableDiffusion
Chatterbox Turbo Released Today

I didn't see another post on this, but the open source TTS was released today.

https://huggingface.co/collections/ResembleAI/chatterbox-turbo

I tested it with a recording of my voice and in 5 seconds it was able to create a pretty decent facsimile of my voice.

https://redd.it/1pnozbo
@rStableDiffusion
LORA Training - Sample every 250 steps - Best practices in sample prompts?

I am experimenting with LORA training (characters), always learning new things and leveraging some great insights I find in this community.
Generally my dataset is composed of 30 high definition photos with different environment/clothing and camera distance. I am aiming at photorealism.

I do not see often discussions about which prompts should be used during training to check the LORA's quality progression.
I generate a LORA every 250 steps and I normally produce 4 images.
My approach is:

1) An image with prompt very similar to one of the dataset images (just to see how different the resulting image is from the dataset)

2) An image putting the character in a very different environment/clothing/expression (to see how the model can cope with variations)

3) A close-up portrait of my character with white background (to focus on face details)

4) An anime close-up portrait of my character in Ghibli style (to quickly check if the LORA is overtrained: when images start getting out photographic rather than anime, I know I overtrained)

I have no idea if this is a good approach or not.
What do you normally do? What prompts do you use?

P.S. I have noticed that the subsequent image generation in ComfyUI is much better quality than the samples generated during training (I do not really know why) but nevertheless, even if in low quality, samples are anyway useful to check the training progression.

https://redd.it/1pnx20s
@rStableDiffusion
Prompt Manager, now with Qwen3VL support and multi image input.

Hey Guys,

Thought I'd share the new updates to my Prompt Manager Add-On.

Added Qwen3VL support, both Instruct and Thinking Variant.
Added option to output the prompt in JSON format.
After seeing community discussions about its advantages.
Added ComfyUI preferences option to set default preferred Models.
Falls back to available models if none are specified.
Integrated several quality-of-life improvements contributed by GitHub user, BigStationW, including:
Support for Thinking Models.
Support for up to 5 images in multi-image queries.
Faster job cancellation.
Option to output everything to Console for debugging.

For Basic Workflow, you can just use the Generator Node, it has an image input and the option to select if you want Image analysis or Prompt Generation.

But for more control, you can add the Options node to get an extra 4 inputs and then use "Analyze Image with Prompt" for something like this:

https://preview.redd.it/o5rer26b8j7g1.png?width=2330&format=png&auto=webp&s=77177fbdc0bbb0931af4e2f715a2d568c9aa9a27

I'll admit, I kind of flew past the initial idea of this Add-On 😅.
I'll eventually have to decide if I rename it to something more fitting.

For those that hadn't seen my previous post. This works with a preinstalled copy of Llama.cpp. I did so, as Llama.cpp is very simple to install (1 command line). This way, I don't risk creating conflicts with ComfyUI. This add-on will then simply Start and Stop Llama.cpp as it needs it.

https://redd.it/1pnxxcr
@rStableDiffusion