[Update] AI Image Tagger: added visual node editor, R-4B support, smart templates, and more
Hey everyone,
a while back I shared my [AI Image Tagger project](https://www.reddit.com/r/StableDiffusion/comments/1nwvhp1/made_a_free_tool_to_autotag_images_alpha_looking/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button), a simple batch captioning tool built around BLIP.
I’ve been working on it since then, and there’s now a pretty big update with a bunch of new stuff and general improvements.
**Main changes:**
* Added a visual node editor, so you can build your own processing pipelines (like Input → Model → Output).
* Added support for the R-4B model, which gives more detailed and reasoning-based captions. BLIP is still there if you want something faster.
* Introduced Smart Templates (called Conjunction nodes) to combine AI outputs and custom prompts into structured captions.
* Added real-time stats – shows processing speed and ETA while it’s running.
* Improved batch processing – handles larger sets of images more efficiently and uses less memory.
* Added flexible export – outputs as a ZIP with embedded metadata.
* Supports multiple precision modes: float32, float16, 8-bit, and 4-bit.
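For anyone wondering what those precision modes usually boil down to, here's a generic loading sketch using Hugging Face transformers and bitsandbytes. It's purely illustrative and not necessarily the project's actual code; `MODEL_ID` and `load_captioner` are placeholders for the example.

```python
# Generic sketch of the four precision modes via transformers + bitsandbytes.
# Not the project's actual loading code; MODEL_ID and load_captioner are placeholders.
import torch
from transformers import BlipProcessor, BlipForConditionalGeneration, BitsAndBytesConfig

MODEL_ID = "Salesforce/blip-image-captioning-large"  # BLIP backend; R-4B would load similarly

def load_captioner(precision: str = "float16"):
    processor = BlipProcessor.from_pretrained(MODEL_ID)
    kwargs = {}
    if precision == "float16":
        kwargs["torch_dtype"] = torch.float16
    elif precision == "8-bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_8bit=True)
        kwargs["device_map"] = "auto"
    elif precision == "4-bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_4bit=True)
        kwargs["device_map"] = "auto"
    elif precision != "float32":
        raise ValueError(f"unknown precision: {precision}")
    model = BlipForConditionalGeneration.from_pretrained(MODEL_ID, **kwargs)
    return processor, model
```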
I designed this pipeline to leverage an LLM for producing detailed, multi-perspective image descriptions, refining the results across several iterations.
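To make the node idea a bit more concrete, here's a toy version of what a graph like Input → Model → Conjunction → Output ends up doing. The function and field names are invented for the example; the real node editor is visual and doesn't use this code.

```python
# Toy version of an Input -> Model -> Conjunction -> Output graph.
# Names are invented for the example, not the app's internals.
def conjunction(template: str, **parts: str) -> str:
    # "Smart Template" node: merge model output(s) and custom prompt text
    return template.format(**parts)

def run_pipeline(image_paths, caption_fn):
    captions = {}
    for path in image_paths:                    # Input node: the image batch
        raw = caption_fn(path)                  # Model node: BLIP or R-4B caption
        captions[path] = conjunction(
            "{caption}, {extra}",               # Conjunction node template
            caption=raw,
            extra="high quality, detailed",     # custom prompt text
        )
    return captions                             # Output node: written out as a ZIP with metadata
```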
Everything’s open-source (MIT) here:
[https://github.com/maxiarat1/ai-image-captioner](https://github.com/maxiarat1/ai-image-captioner)
If you tried the earlier version, this one should feel a lot smoother and gives you much more flexibility and visual control. I’d appreciate any feedback or ideas for other node types to add next.
https://preview.redd.it/4cqaztbdj4wf1.png?width=3870&format=png&auto=webp&s=96dcc926d8a6746c9a2cc8504a93502868850adc
Feedback is especially welcome on model performance and node editor usability.
https://redd.it/1oazq7n
@rStableDiffusion
PSA: Ditch the high noise lightx2v
This isn't some secret knowledge, but I only really tested it today, and if you're like me, maybe I'm the one to get this idea into your head: ditch the lightx2v LoRA for the high-noise model. At least for I2V, which is what I'm testing now.
I had gotten frustrated by the slow movement and bad prompt adherence, so today I decided to try running the high-noise model naked. I had always assumed it would need too many steps and take way too long, but that's not really the case. I've settled on a 6/4 split: 6 steps with the high-noise model without lightx2v, then 4 steps with the low-noise model with lightx2v. It just feels so much better. It does take a little longer (about 6 minutes for the whole generation), but the quality boost is worth it. Do it. It feels like a whole new model to me.
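For reference, this is roughly what the two-sampler setup looks like as KSamplerAdvanced settings in ComfyUI. The step numbers match what I described above; the cfg, sampler, and scheduler values are just example placeholders, so use whatever you normally run.

```python
# Rough sketch of the 6/4 split as two KSamplerAdvanced passes in ComfyUI.
# Step numbers match the post; cfg/sampler/scheduler values are only examples.
TOTAL_STEPS = 10

high_noise_pass = {                 # Wan 2.2 high-noise model, NO lightx2v LoRA
    "add_noise": "enable",
    "steps": TOTAL_STEPS,
    "cfg": 3.5,                     # example value
    "sampler_name": "euler",
    "scheduler": "simple",
    "start_at_step": 0,
    "end_at_step": 6,
    "return_with_leftover_noise": "enable",
}

low_noise_pass = {                  # Wan 2.2 low-noise model + lightx2v LoRA
    "add_noise": "disable",
    "steps": TOTAL_STEPS,
    "cfg": 1.0,                     # distilled pass usually runs near CFG 1
    "sampler_name": "euler",
    "scheduler": "simple",
    "start_at_step": 6,
    "end_at_step": TOTAL_STEPS,
    "return_with_leftover_noise": "disable",
}
```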
https://redd.it/1ob3uaa
@rStableDiffusion
LucidFlux image restoration — broken workflows or am I dumb? 😅
https://redd.it/1ob1iuo
@rStableDiffusion
More Nunchaku SVDQuants available - Jib Mix Flux, Fluxmania, CyberRealistic and PixelWave
Hey everyone! Since my last post got great feedback, I've finished my SVDQuant pipeline and cranked out a few more models:
* [Jib Mix Flux V12](https://huggingface.co/spooknik/Jib-Mix-Flux-SVDQ)
* CyberRealistic Flux V2.5
* [Fluxmania Legacy](https://huggingface.co/spooknik/Fluxmania-SVDQ)
* Pixelwave schnell 04 (Int4 coming within 24 hours)
Update on Chroma: Unfortunately, it won't work with Deepcompressor/Nunchaku out of the box due to differences in the model architecture. I attempted a Flux/Chroma merge to get around this, but the results weren't promising. I'll wait for official Nunchaku support before tackling it.
Requests welcome! Drop a comment if there's a model you'd like to see as an SVDQuant - I might just make it happen.
*(Ko-Fi in my profile if you'd like to buy me a coffee ☕)*
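If anyone wants to try these outside ComfyUI, loading an SVDQuant with Nunchaku in diffusers looks roughly like the snippet below. I'm going from memory of the Nunchaku README here, so treat the class name, repo layout, and arguments as assumptions and check the Nunchaku docs before copying it.

```python
# Assumption-heavy sketch: loading an SVDQuant transformer with Nunchaku and
# plugging it into a diffusers FluxPipeline. API names are from memory of the
# Nunchaku README and may differ in current releases; verify before using.
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    "spooknik/Jib-Mix-Flux-SVDQ"   # one of the quants linked above
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe("cinematic portrait photo, soft light", num_inference_steps=25).images[0]
image.save("out.png")
```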
https://redd.it/1oe6bcz
@rStableDiffusion
"Conflagration" Wan22 FLF ComfyUI
https://youtu.be/gQC-60yFfVU
https://redd.it/1oe2k9h
@rStableDiffusion
LTXV 2.0 is out
https://website.ltx.video/blog/introducing-ltx-2
https://redd.it/1oe3le4
@rStableDiffusion