[Update] AI Image Tagger: added visual node editor, R-4B support, smart templates, and more
Hey everyone,
a while back I shared my [AI Image Tagger project](https://www.reddit.com/r/StableDiffusion/comments/1nwvhp1/made_a_free_tool_to_autotag_images_alpha_looking/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button), a simple batch captioning tool built around BLIP.
I’ve been working on it since then, and there’s now a pretty big update with a bunch of new stuff and general improvements.
**Main changes:**
* Added a visual node editor, so you can build your own processing pipelines (like Input → Model → Output).
* Added support for the R-4B model, which gives more detailed and reasoning-based captions. BLIP is still there if you want something faster.
* Introduced Smart Templates (called Conjunction nodes) to combine AI outputs and custom prompts into structured captions.
* Added real-time stats – shows processing speed and ETA while it’s running.
* Improved batch processing – handles larger sets of images more efficiently and uses less memory.
* Added flexible export – outputs as a ZIP with embedded metadata.
* Supports multiple precision modes: float32, float16, 8-bit, and 4-bit.
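For anyone wondering what those precision modes usually boil down to, here's a generic loading sketch using Hugging Face transformers and bitsandbytes. It's purely illustrative and not necessarily the project's actual code; `MODEL_ID` and `load_captioner` are placeholders for the example.

```python
# Generic sketch of the four precision modes via transformers + bitsandbytes.
# Not the project's actual loading code; MODEL_ID and load_captioner are placeholders.
import torch
from transformers import BlipProcessor, BlipForConditionalGeneration, BitsAndBytesConfig

MODEL_ID = "Salesforce/blip-image-captioning-large"  # BLIP backend; R-4B would load similarly

def load_captioner(precision: str = "float16"):
    processor = BlipProcessor.from_pretrained(MODEL_ID)
    kwargs = {}
    if precision == "float16":
        kwargs["torch_dtype"] = torch.float16
    elif precision == "8-bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_8bit=True)
        kwargs["device_map"] = "auto"
    elif precision == "4-bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_4bit=True)
        kwargs["device_map"] = "auto"
    elif precision != "float32":
        raise ValueError(f"unknown precision: {precision}")
    model = BlipForConditionalGeneration.from_pretrained(MODEL_ID, **kwargs)
    return processor, model
```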
I designed this pipeline to leverage an LLM for producing detailed, multi-perspective image descriptions, refining the results across several iterations.
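To make the node idea a bit more concrete, here's a toy version of what a graph like Input → Model → Conjunction → Output ends up doing. The function and field names are invented for the example; the real node editor is visual and doesn't use this code.

```python
# Toy version of an Input -> Model -> Conjunction -> Output graph.
# Names are invented for the example, not the app's internals.
def conjunction(template: str, **parts: str) -> str:
    # "Smart Template" node: merge model output(s) and custom prompt text
    return template.format(**parts)

def run_pipeline(image_paths, caption_fn):
    captions = {}
    for path in image_paths:                    # Input node: the image batch
        raw = caption_fn(path)                  # Model node: BLIP or R-4B caption
        captions[path] = conjunction(
            "{caption}, {extra}",               # Conjunction node template
            caption=raw,
            extra="high quality, detailed",     # custom prompt text
        )
    return captions                             # Output node: written out as a ZIP with metadata
```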
Everything’s open-source (MIT) here:
[https://github.com/maxiarat1/ai-image-captioner](https://github.com/maxiarat1/ai-image-captioner)
If you tried the earlier version, this one should feel a lot smoother and gives you much more flexibility and visual control. I’d appreciate any feedback or ideas for other node types to add next.
https://preview.redd.it/4cqaztbdj4wf1.png?width=3870&format=png&auto=webp&s=96dcc926d8a6746c9a2cc8504a93502868850adc
Feedback is especially welcome on model performance and node editor usability.
https://redd.it/1oazq7n
@rStableDiffusion
PSA: Ditch the high noise lightx2v
This isn't some secret knowledge, but I only really tested it today, and if you're like me, maybe I'm the one to get this idea into your head: ditch the lightx2v LoRA for the high-noise model. At least for I2V, which is what I'm testing now.
I had gotten frustrated by the slow movement and bad prompt adherence, so today I decided to try running the high-noise model naked. I had always assumed it would need too many steps and take way too long, but that's not really the case. I've settled on a 6/4 split: 6 steps with the high-noise model without lightx2v, then 4 steps with the low-noise model with lightx2v. It just feels so much better. It does take a little longer (about 6 minutes for the whole generation), but the quality boost is worth it. Do it. It feels like a whole new model to me.
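For reference, this is roughly what the two-sampler setup looks like as KSamplerAdvanced settings in ComfyUI. The step numbers match what I described above; the cfg, sampler, and scheduler values are just example placeholders, so use whatever you normally run.

```python
# Rough sketch of the 6/4 split as two KSamplerAdvanced passes in ComfyUI.
# Step numbers match the post; cfg/sampler/scheduler values are only examples.
TOTAL_STEPS = 10

high_noise_pass = {                 # Wan 2.2 high-noise model, NO lightx2v LoRA
    "add_noise": "enable",
    "steps": TOTAL_STEPS,
    "cfg": 3.5,                     # example value
    "sampler_name": "euler",
    "scheduler": "simple",
    "start_at_step": 0,
    "end_at_step": 6,
    "return_with_leftover_noise": "enable",
}

low_noise_pass = {                  # Wan 2.2 low-noise model + lightx2v LoRA
    "add_noise": "disable",
    "steps": TOTAL_STEPS,
    "cfg": 1.0,                     # distilled pass usually runs near CFG 1
    "sampler_name": "euler",
    "scheduler": "simple",
    "start_at_step": 6,
    "end_at_step": TOTAL_STEPS,
    "return_with_leftover_noise": "disable",
}
```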
https://redd.it/1ob3uaa
@rStableDiffusion
LucidFlux image restoration — broken workflows or am I dumb? 😅
https://redd.it/1ob1iuo
@rStableDiffusion
More Nunchaku SVDQuants available - Jib Mix Flux, Fluxmania, CyberRealistic and PixelWave
Hey everyone! Since my last post got great feedback, I've finished my SVDQuant pipeline and cranked out a few more models:
* [Jib Mix Flux V12](https://huggingface.co/spooknik/Jib-Mix-Flux-SVDQ)
* CyberRealistic Flux V2.5
* [Fluxmania Legacy](https://huggingface.co/spooknik/Fluxmania-SVDQ)
* Pixelwave schnell 04 (Int4 coming within 24 hours)
Update on Chroma: Unfortunately, it won't work with Deepcompressor/Nunchaku out of the box due to differences in the model architecture. I attempted a Flux/Chroma merge to get around this, but the results weren't promising. I'll wait for official Nunchaku support before tackling it.
Requests welcome! Drop a comment if there's a model you'd like to see as an SVDQuant - I might just make it happen.
*(Ko-Fi in my profile if you'd like to buy me a coffee ☕)*
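If anyone wants to try these outside ComfyUI, loading an SVDQuant with Nunchaku in diffusers looks roughly like the snippet below. I'm going from memory of the Nunchaku README here, so treat the class name, repo layout, and arguments as assumptions and check the Nunchaku docs before copying it.

```python
# Assumption-heavy sketch: loading an SVDQuant transformer with Nunchaku and
# plugging it into a diffusers FluxPipeline. API names are from memory of the
# Nunchaku README and may differ in current releases; verify before using.
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    "spooknik/Jib-Mix-Flux-SVDQ"   # one of the quants linked above
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe("cinematic portrait photo, soft light", num_inference_steps=25).images[0]
image.save("out.png")
```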
https://redd.it/1oe6bcz
@rStableDiffusion
"Conflagration" Wan22 FLF ComfyUI
https://youtu.be/gQC-60yFfVU
https://redd.it/1oe2k9h
@rStableDiffusion
LTXV 2.0 is out
https://website.ltx.video/blog/introducing-ltx-2
https://redd.it/1oe3le4
@rStableDiffusion