I’m making an open-source, ComfyUI-integrated video editor, and I want to know if you’d find it useful
https://redd.it/1oapyzg
@rStableDiffusion
Introducing InSubject 0.5, a QwenEdit LoRA trained for creating highly consistent characters/objects w/ just a single reference - samples attached, link + dataset below
https://redd.it/1oayez0
@rStableDiffusion
I built an (open-source) UI for Stable Diffusion focused on workflow and ease of use - Meet PrismXL!
Hey everyone,
Like many of you, I've spent countless hours exploring the incredible world of Stable Diffusion. Along the way, I found myself wanting a tool that felt a bit more... fluid. Something that combined powerful features with a clean, intuitive interface that didn't get in the way of the creative process.
So, I decided to build it myself. I'm excited to share my passion project with you all: PrismXL.
It's a standalone desktop GUI built from the ground up with PySide6 and Diffusers, currently running the fantastic Juggernaut-XL-v9 model.
https://preview.redd.it/wipjdf7u14wf1.png?width=1280&format=png&auto=webp&s=55af37370bbc1efe20a5e034dfb31e9694f73d7e
https://preview.redd.it/5sdlvf7u14wf1.png?width=1280&format=png&auto=webp&s=8cd3f9589d2d9001bfc6712765abb8a2d69a2c17
My goal wasn't to reinvent the wheel, but to refine the experience. Here are some of the core features I focused on:
* **Clean, Modern UI:** A fully custom, frameless interface with movable sections. You can drag and drop the "Prompt," "Advanced Options," and other panels to arrange your workspace exactly how you like it.
* **Built-in Spell Checker:** The prompt and negative prompt boxes have a built-in spell checker with a correction suggestion menu (right-click a misspelled word). No more re-running a 50-step generation because of a simple typo!
* **Prompt Library:** Save your favorite or most complex prompts with a description. You can easily search, edit, and "cast" them back into the prompt box.
* **Live Render Preview:** For 512x512 generations, you can enable a live preview that shows the image as it's refined at each step (see the sketch after this list). It's fantastic for getting a feel for your image's direction early on.
* **Grid Generation & Zoom:** Easily generate a grid of up to 4 images to compare subtle variations. The image viewer includes zoom-on-click and thumbnails for easy switching.
* **User-Friendly Controls:** All the essentials are there: steps, CFG scale, CLIP skip, custom seeds, and a wide range of resolutions, all presented with intuitive sliders and dropdowns.
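For anyone curious how a per-step preview like that can be wired up, here's a minimal sketch using Diffusers' `callback_on_step_end` hook. This is an illustration of the general technique, not PrismXL's actual code, and the model repo id is an assumption:

```python
# Hedged sketch of a per-step live preview with Diffusers' callback_on_step_end.
# Not PrismXL's actual code; a real GUI would push the preview into a widget
# instead of saving files, and would likely only decode every N steps.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "RunDiffusion/Juggernaut-XL-v9",  # assumed Hugging Face repo id for Juggernaut-XL-v9
    torch_dtype=torch.float16,
).to("cuda")

def preview_callback(pipeline, step, timestep, callback_kwargs):
    # Decode the intermediate latents into a rough RGB preview.
    # Note: the stock SDXL VAE can overflow in fp16; an fp16-fixed VAE helps.
    latents = callback_kwargs["latents"].to(pipeline.vae.dtype)
    with torch.no_grad():
        decoded = pipeline.vae.decode(
            latents / pipeline.vae.config.scaling_factor, return_dict=False
        )[0]
    preview = pipeline.image_processor.postprocess(decoded, output_type="pil")[0]
    preview.save(f"preview_step_{step:03d}.png")
    return callback_kwargs  # must return the kwargs dict

image = pipe(
    prompt="a glass prism splitting a beam of light, studio photo",
    num_inference_steps=30,
    callback_on_step_end=preview_callback,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
image.save("final.png")
```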
Why another GUI?
I know there are some amazing, feature-rich UIs out there. PrismXL is my take on a tool that’s designed to be approachable for newcomers without sacrificing the control that power users need. It's about reducing friction and keeping the focus on creativity. I've poured a lot of effort into the small details of the user experience.
This is a project born out of a love for the technology and the community around it. I've just added a "Terms of Use" dialog on the first launch as a simple safeguard, but my hope is to eventually open-source it once I'm confident in its stability and have a good content protection plan in place.
I would be incredibly grateful for any feedback you have. What do you like? What's missing? What could be improved?
You can check out the project and find the download link on GitHub:
**https://github.com/dovvnloading/Sapphire-Image-GenXL**
Thanks for taking a look. I'm excited to hear what you think and to continue building this with the community in mind! Happy generating
https://redd.it/1oawx5t
@rStableDiffusion
[Update] AI Image Tagger, added Visual Node Editor, R-4B support, smart templates and more
Hey everyone,
a while back I shared my [AI Image Tagger project](https://www.reddit.com/r/StableDiffusion/comments/1nwvhp1/made_a_free_tool_to_autotag_images_alpha_looking/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button), a simple batch captioning tool built around BLIP.
I’ve been working on it since then, and there’s now a pretty big update with a bunch of new stuff and general improvements.
**Main changes:**
* Added a visual node editor, so you can build your own processing pipelines (like Input → Model → Output).
* Added support for the R-4B model, which gives more detailed and reasoning-based captions. BLIP is still there if you want something faster.
* Introduced Smart Templates (called Conjunction nodes) to combine AI outputs and custom prompts into structured captions.
* Added real-time stats – shows processing speed and ETA while it’s running.
* Improved batch processing – handles larger sets of images more efficiently and uses less memory.
* Added flexible export – outputs as a ZIP with embedded metadata.
* Supports multiple precision modes: float32, float16, 8-bit, and 4-bit.
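As a rough illustration of how those four precision modes usually map onto Hugging Face loading options, here's a hedged sketch built around the BLIP captioner mentioned above. It's a generic example, not the tool's actual implementation:

```python
# Hedged sketch: selecting a precision mode when loading a BLIP captioner.
# float32/float16 use plain dtype casting; 8-bit/4-bit use bitsandbytes
# quantization via transformers. Not the tool's actual code.
import torch
from transformers import BlipProcessor, BlipForConditionalGeneration, BitsAndBytesConfig

def load_captioner(precision: str = "float16",
                   model_id: str = "Salesforce/blip-image-captioning-large"):
    kwargs = {}
    if precision == "float32":
        kwargs["torch_dtype"] = torch.float32
    elif precision == "float16":
        kwargs["torch_dtype"] = torch.float16
    elif precision == "8bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_8bit=True)
    elif precision == "4bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
        )
    else:
        raise ValueError(f"unknown precision mode: {precision}")

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(
        model_id, device_map="auto", **kwargs
    )
    return processor, model

# Example: caption one image in 4-bit mode.
if __name__ == "__main__":
    from PIL import Image
    processor, model = load_captioner("4bit")
    inputs = processor(images=Image.open("photo.jpg"), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=40)
    print(processor.decode(out[0], skip_special_tokens=True))
```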
I designed this pipeline to leverage an LLM for producing detailed, multi-perspective image descriptions, refining the results across several iterations.
Everything’s open-source (MIT) here:
[https://github.com/maxiarat1/ai-image-captioner](https://github.com/maxiarat1/ai-image-captioner)
If you tried the earlier version, this one should feel a lot smoother and more flexible. I’d appreciate any feedback or ideas for other node types to add next.
https://preview.redd.it/4cqaztbdj4wf1.png?width=3870&format=png&auto=webp&s=96dcc926d8a6746c9a2cc8504a93502868850adc
Feedback and suggestions are welcome, especially regarding model performance and node editor usability.
https://redd.it/1oazq7n
@rStableDiffusion
PSA: Ditch the high noise lightx2v
This isn't secret knowledge, but I only really tested it today, and if you're like me, maybe I'm the one to finally get the idea into your head: ditch the lightx2v LoRA on the high-noise model. At least for I2V, which is what I'm testing right now.
I'd gotten frustrated by the slow movement and bad prompt adherence, so today I tried running the high-noise model bare. I always assumed it would need too many steps and take way too long, but that's not really the case. I've settled on a 6/4 split: 6 steps with the high-noise model without lightx2v, then 4 steps with the low-noise model with lightx2v. It just feels so much better. It does take a little longer (about 6 minutes for the whole generation), but the quality boost is worth it. Do it. It feels like a whole new model to me.
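If it helps to see the split written out, here's a tiny conceptual sketch of that 6/4 schedule. The names are illustrative placeholders, not an exported workflow; in ComfyUI this roughly corresponds to the start/end step settings on two advanced sampler nodes:

```python
# Conceptual sketch of the 6/4 split: 10 total steps, the high-noise model runs
# the first 6 without lightx2v, the low-noise model finishes the last 4 with it.
# Model/LoRA names are placeholders, not exact file names.
TOTAL_STEPS = 10

stages = [
    # (model,              lora,        start_step, end_step)
    ("wan2.2_high_noise",  None,        0,          6),
    ("wan2.2_low_noise",   "lightx2v",  6,          TOTAL_STEPS),
]

for model, lora, start, end in stages:
    lora_note = f"with {lora}" if lora else "no speed LoRA"
    print(f"{model}: steps {start}-{end}, {lora_note}")
```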
https://redd.it/1ob3uaa
@rStableDiffusion
LucidFlux image restoration — broken workflows or am I dumb? 😅
https://redd.it/1ob1iuo
@rStableDiffusion