r/StableDiffusion – Telegram
SVI: One simple change fixed my slow motion and lack of prompt adherence...
https://redd.it/1q45liy
@rStableDiffusion
LTXV2 Pull Request In Comfy, Coming Soon? (weights not released yet)

https://github.com/comfyanonymous/ComfyUI/pull/11632

Looking at the PR, it seems to support audio and uses Gemma 3 12B as the text encoder.

The previous LTX models had speed but were nowhere near the quality of Wan 2.2 14B.

LTX 0.9.7 actually followed prompts quite well and had a good way of handling infinite-length generation in Comfy: you just put in prompts delimited by a '|' character. The dev team behind LTX clearly cares; the workflows are nicely organised, they release distilled and non-distilled versions on the same day, and so on.
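For anyone who hasn't used that workflow, a '|'-delimited prompt looks roughly like this (the segment text here is made up purely for illustration; each segment covers a successive stretch of the video, as described above):

```python
# Hypothetical example of the '|'-delimited multi-prompt string LTX 0.9.7
# accepted for longer generations in ComfyUI. Segment wording is invented.
prompt = (
    "a woman walks along a beach at sunset"
    " | she kneels down and picks up a seashell"
    " | the camera pulls back to a wide aerial shot"
)
print(prompt)
```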

There seems to be something about Wan 2.2 that makes it avoid body horror and keep coherence when doing more complex things. Smaller/faster models like Wan 5B, Hunyuan 1.5, and even the old Wan 1.3B CAN produce really good results, but 90% of the time you'll get weird body horror or artifacts somewhere in the video, whereas with Wan 2.2 it feels more like 20%.

On top of that, some of the models break down a lot quicker at lower resolutions, so you're forced into higher res and partially lose the speed benefits, or they have a high-quality but stupidly slow VAE (HY 1.5 and Wan 5B are like this).

I hope LTX can achieve that while being faster, or improve on Wan (more consistent, less dice-roll prompt following, similar to Qwen Image/Z Image, which seems plausible given Gemma as the text encoder) while being the same speed.

https://redd.it/1q49ulp
@rStableDiffusion
I open-sourced a tool that turns any photo into a playable Game Boy ROM using AI

https://redd.it/1q4pgaa
@rStableDiffusion
I’m the Co-founder & CEO of Lightricks. We just open-sourced LTX-2, a production-ready audio-video AI model. AMA.

Hi everyone. **I’m Zeev Farbman, Co-founder & CEO of Lightricks.**

I’ve spent the last few years working closely with our team on [LTX-2](https://ltx.io/model), a production-ready audio–video foundation model. This week, we did a full open-source release of LTX-2, including weights, code, a trainer, benchmarks, LoRAs, and documentation.

Open releases of multimodal models are rare, and when they do happen, they’re often hard to run or hard to reproduce. We built LTX-2 to be something you can actually use: it runs locally on consumer GPUs and powers real products at Lightricks.

**I’m here to answer questions about:**

* Why we decided to open-source LTX-2
* What it took to ship an open, production-ready AI model
* Tradeoffs around quality, efficiency, and control
* Where we think open multimodal models are going next
* Roadmap and plans

Ask me anything!
I’ll answer as many questions as I can, with some help from the LTX-2 team.

*Verification:*

[Lightricks CEO Zeev Farbman](https://preview.redd.it/3oo06hz2x4cg1.jpg?width=2400&format=pjpg&auto=webp&s=4c3764327c90a1af88b7e056084ed2ac8f87c60b)



https://redd.it/1q7dzq2
@rStableDiffusion
The LTX-2 team literally challenging the Alibaba Wan team; this was shared on their official X account :)

https://redd.it/1q7kygr
@rStableDiffusion
Someone posted today about Sage Attention 3, so I tested it, and here are my results.

Hardware: RTX 5090 + 64GB DDR4 RAM.

Test: same input image, same prompt, 121 frames, 16 fps, 720x1280

1. Lightx2v high/low models (not loras) + sage attention node set to auto: 160 seconds
2. Lightx2v high/low models (not loras) + sage attention node set to sage3: 85 seconds
3. Lightx2v high/low models (not loras) + no sage attention: 223 seconds
4. Full WAN 2.2 fp16 models, no loras + sage 3: 17 minutes
5. Full WAN 2.2 fp16, no loras, no sage attention: 24.5 minutes

Quality best to worst: 5 > 1&2 > 3 > 4
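
For context, here are the same timings turned into rough speedup factors (just dividing the numbers above, nothing re-measured):

```python
# Speedups implied by the timings in this post (seconds); pure arithmetic
# on the reported numbers, not a new benchmark.
lightx2v = {"no sage": 223, "sage auto": 160, "sage3": 85}
wan_fp16 = {"no sage": 24.5 * 60, "sage3": 17 * 60}

print("lightx2v: sage3 vs no sage     ->", round(lightx2v["no sage"] / lightx2v["sage3"], 2), "x")      # ~2.62x
print("lightx2v: sage auto vs no sage ->", round(lightx2v["no sage"] / lightx2v["sage auto"], 2), "x")  # ~1.39x
print("wan fp16: sage3 vs no sage     ->", round(wan_fp16["no sage"] / wan_fp16["sage3"], 2), "x")      # ~1.44x
```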

I'm too lazy to upload all the generations, but here's what's important:

Test 4 (Wan 2.2 fp16 + sage3): https://files.catbox.moe/a3eosn.mp4 (quality speaks for itself)

Test 2 (lightx2v + sage 3): https://files.catbox.moe/nd9dtz.mp4

Test 3 (lightx2v, no sage attention): https://files.catbox.moe/ivhy68.mp4

hope this helps.

Edit: if anyone wants to test this, here is how I installed sage3 and got it running in ComfyUI portable:

**Note 1:** do this at your own risk. I personally keep multiple running copies of ComfyUI portable in case anything goes wrong.

**Note 2:** this assumes you already have Triton installed, which you should if you're using SageAttention 2.2.

1. Download the wheel that matches your CUDA, PyTorch, and Python versions from here: https://github.com/mengqin/SageAttention/releases/tag/20251229
2. Place the wheel in your .\python_embeded\ folder
3. Run this in a command prompt: "ComfyUI\python_embeded\python.exe -m pip install full_wheel_name.whl"
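
If you're not sure which wheel matches your setup, a quick sketch like the one below (run with the embedded Python) will print the versions you need to match and can confirm the install afterwards. I'm assuming the wheel installs under the usual sageattention package name; if that differs, adjust the import.

```python
# check_sage.py -- run with: ComfyUI\python_embeded\python.exe check_sage.py
# Prints the Python / PyTorch / CUDA versions to match when picking a wheel,
# then tries to import sageattention to confirm the install worked.
import sys
import torch

print("Python :", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)

try:
    import sageattention  # assumed package name; check the wheel if this fails
    print("SageAttention import OK")
except ImportError as exc:
    print("SageAttention not importable:", exc)
```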

https://redd.it/1q7yzsp
@rStableDiffusion
Thanks to Kijai, LTX-2 GGUFs are now up. Even Q6 is better quality than FP8 imo.

https://redd.it/1q8590s
@rStableDiffusion