r/StableDiffusion – Telegram
I’m making an open-source, ComfyUI-integrated video editor, and I want to know if you’d find it useful

https://redd.it/1oapyzg
@rStableDiffusion
Introducing InSubject 0.5, a QwenEdit LoRA trained for creating highly consistent characters/objects w/ just a single reference - samples attached, link + dataset below
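
For anyone wanting to try a LoRA like this, the usual diffusers pattern is sketched below. The pipeline class and base repo id are my assumptions about Qwen-Image-Edit support in diffusers, and the LoRA path is a placeholder for the weights linked in the post; check the post for the actual files and recommended settings.

```python
# A hedged sketch of the usual diffusers pattern for a QwenEdit LoRA like InSubject.
# The pipeline class and base repo id are assumptions; the LoRA path is a placeholder.
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("path/to/insubject-0.5.safetensors")  # placeholder path

reference = load_image("subject_reference.png")  # the single reference image
result = pipe(
    image=reference,
    prompt="the same character sitting in a cafe, reading a newspaper",
    num_inference_steps=30,
).images[0]
result.save("consistent_subject.png")
```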

https://redd.it/1oayez0
@rStableDiffusion
I built an (open-source) UI for Stable Diffusion focused on workflow and ease of use - Meet PrismXL!

Hey everyone,

Like many of you, I've spent countless hours exploring the incredible world of Stable Diffusion. Along the way, I found myself wanting a tool that felt a bit more... fluid. Something that combined powerful features with a clean, intuitive interface that didn't get in the way of the creative process.

So, I decided to build it myself. I'm excited to share my passion project with you all: PrismXL.

It's a standalone desktop GUI built from the ground up with PySide6 and Diffusers, currently running the fantastic Juggernaut-XL-v9 model.
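
For context, here's a minimal sketch of the kind of Diffusers call a GUI like this wraps; this is not PrismXL's actual code, and the Hugging Face repo id is my assumption about where Juggernaut-XL-v9 is hosted.

```python
# A minimal sketch of the Diffusers call a GUI like this wraps; not PrismXL's
# actual code. The Hugging Face repo id below is an assumption.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "RunDiffusion/Juggernaut-XL-v9",  # assumed repo id; point at your local checkpoint if it differs
    torch_dtype=torch.float16,
).to("cuda")

generator = torch.Generator("cuda").manual_seed(42)  # reproducible "custom seed"
image = pipe(
    prompt="a glass prism splitting light into a nebula, highly detailed",
    negative_prompt="blurry, low quality",
    num_inference_steps=30,   # "steps" slider
    guidance_scale=7.0,       # "CFG scale" slider
    width=1024,
    height=1024,
    generator=generator,
).images[0]
image.save("prismxl_demo.png")
```

In a Qt app, a call like this would typically run on a worker thread, with results shipped back to the UI via signals so the interface stays responsive.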

https://preview.redd.it/wipjdf7u14wf1.png?width=1280&format=png&auto=webp&s=55af37370bbc1efe20a5e034dfb31e9694f73d7e

https://preview.redd.it/5sdlvf7u14wf1.png?width=1280&format=png&auto=webp&s=8cd3f9589d2d9001bfc6712765abb8a2d69a2c17

My goal wasn't to reinvent the wheel, but to refine the experience. Here are some of the core features I focused on:

* **Clean, Modern UI:** A fully custom, frameless interface with movable sections. You can drag and drop the "Prompt," "Advanced Options," and other panels to arrange your workspace exactly how you like it.
* **Built-in Spell Checker:** The prompt and negative prompt boxes have a built-in spell checker with a correction suggestion menu (right-click on a misspelled word). No more re-running a 50-step generation because of a simple typo!
* **Prompt Library:** Save your favorite or most complex prompts with a name. You can easily search, edit, and "cast" them back into the prompt box.
* **Live Render Preview:** For 512x512 generations, you can enable a live preview that shows you the image as it's being refined at each step. It's fantastic for getting a feel for your image's direction early on (see the sketch after this list).
* **Grid Generation & Zoom:** Easily generate a grid of up to 4 images to compare subtle variations. The image viewer includes a zoom-on-click feature and thumbnails for easy switching.
* **User-Friendly Controls:** All the essentials are there: steps, CFG scale, CLIP skip, custom seeds, and a wide range of resolutions, all presented with intuitive sliders and dropdowns.
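
The live render preview maps naturally onto diffusers' per-step callback. Below is one plausible way to implement it (reusing `pipe` from the sketch above), not necessarily how PrismXL does it; decoding latents every step is slow, so a real UI might decode only every few steps.

```python
# A sketch of a per-step live preview using diffusers' step callback.
import torch

def preview_callback(pipe, step, timestep, callback_kwargs):
    latents = callback_kwargs["latents"]
    with torch.no_grad():
        decoded = pipe.vae.decode(
            latents.to(pipe.vae.dtype) / pipe.vae.config.scaling_factor
        ).sample
    pil = pipe.image_processor.postprocess(decoded)[0]
    pil.save(f"preview_step_{step:03d}.png")  # a GUI would emit this via a Qt signal instead
    return callback_kwargs  # the callback must return the kwargs dict

image = pipe(
    prompt="a lighthouse in a storm",
    num_inference_steps=30,
    callback_on_step_end=preview_callback,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```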

Why another GUI?

I know there are some amazing, feature-rich UIs out there. PrismXL is my take on a tool that’s designed to be approachable for newcomers without sacrificing the control that power users need. It's about reducing friction and keeping the focus on creativity. I've poured a lot of effort into the small details of the user experience.

This is a project born out of a love for the technology and the community around it. I've just added a "Terms of Use" dialog on the first launch as a simple safeguard, but my hope is to eventually open-source it once I'm confident in its stability and have a good content protection plan in place.

I would be incredibly grateful for any feedback you have. What do you like? What's missing? What could be improved?

You can check out the project and find the download link on GitHub:

**https://github.com/dovvnloading/Sapphire-Image-GenXL**

Thanks for taking a look. I'm excited to hear what you think and to continue building this with the community in mind! Happy generating!

https://redd.it/1oawx5t
@rStableDiffusion
[Update] AI Image Tagger: added a visual node editor, R-4B support, smart templates, and more

Hey everyone,

A while back I shared my [AI Image Tagger project](https://www.reddit.com/r/StableDiffusion/comments/1nwvhp1/made_a_free_tool_to_autotag_images_alpha_looking/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button), a simple batch captioning tool built around BLIP.

I’ve been working on it since then, and there’s now a pretty big update with a bunch of new stuff and general improvements.

**Main changes:**

* Added a visual node editor, so you can build your own processing pipelines (like Input → Model → Output).
* Added support for the R-4B model, which gives more detailed and reasoning-based captions. BLIP is still there if you want something faster.
* Introduced Smart Templates (called Conjunction nodes) to combine AI outputs and custom prompts into structured captions.
* Added real-time stats – shows processing speed and ETA while it’s running.
* Improved batch processing – handles larger sets of images more efficiently and uses less memory.
* Added flexible export – outputs as a ZIP with embedded metadata.
* Supports multiple precision modes: float32, float16, 8-bit, and 4-bit (see the sketch after this list).
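
Those precision modes map naturally onto standard transformers loading options. Here's a hedged sketch using BLIP as the captioner; the project's actual loader may differ, and the 8-bit/4-bit paths assume the bitsandbytes package and a CUDA GPU.

```python
# A sketch of how the listed precision modes could map onto transformers
# loading options; the project's actual loader may differ.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration, BitsAndBytesConfig

MODEL_ID = "Salesforce/blip-image-captioning-base"

def load_blip(precision: str) -> BlipForConditionalGeneration:
    if precision in ("float32", "float16"):
        dtype = torch.float32 if precision == "float32" else torch.float16
        return BlipForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype=dtype).to("cuda")
    # 8-bit / 4-bit: bitsandbytes places weights on the GPU itself, so no .to("cuda")
    quant = BitsAndBytesConfig(load_in_8bit=(precision == "8bit"),
                               load_in_4bit=(precision == "4bit"))
    return BlipForConditionalGeneration.from_pretrained(
        MODEL_ID, quantization_config=quant, device_map="auto")

processor = BlipProcessor.from_pretrained(MODEL_ID)
model = load_blip("float16")

inputs = processor(images=Image.open("photo.jpg"), return_tensors="pt").to("cuda")
out = model.generate(pixel_values=inputs.pixel_values.half(), max_new_tokens=40)
print(processor.decode(out[0], skip_special_tokens=True))
```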

I designed this pipeline to leverage an LLM for producing detailed, multi-perspective image descriptions, refining the results across several iterations.
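
In pseudocode, that refinement loop might look like the sketch below; `query_model` is a hypothetical stand-in for whichever captioner or LLM backend a node wires in, not a function from the project.

```python
# An illustrative sketch of the iterative refinement idea; `query_model` is a
# hypothetical stand-in for the model backend a node wires in.
def refine_caption(image, rounds: int = 3) -> str:
    caption = query_model(image, prompt="Describe this image in detail.")
    for _ in range(rounds - 1):
        caption = query_model(
            image,
            prompt=(
                "Here is a draft description:\n"
                f"{caption}\n"
                "Revise it: correct mistakes and add details about "
                "composition, lighting, and style that the draft missed."
            ),
        )
    return caption
```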

Everything’s open-source (MIT) here:
[https://github.com/maxiarat1/ai-image-captioner](https://github.com/maxiarat1/ai-image-captioner)

If you tried the earlier version, this one should feel a lot smoother and more flexible. I’d appreciate any feedback or ideas for other node types to add next, especially regarding model performance and node editor usability.

https://preview.redd.it/4cqaztbdj4wf1.png?width=3870&format=png&auto=webp&s=96dcc926d8a6746c9a2cc8504a93502868850adc

https://redd.it/1oazq7n
@rStableDiffusion