Update — FP4 Infrastructure Verified (Oct 31 2025)

Quick follow-up to my previous post about running SageAttention 3 on an RTX 5080 (Blackwell) under WSL2 + CUDA 13.0 + PyTorch 2.10 nightly.

After digging into the internal API, I confirmed that the hidden FP4 quantization hooks (`scale_and_quant_fp4`, `enable_blockscaled_fp4_attn`, etc.) are fully wired up at the Python level, even though the low-level CUDA kernels are not yet active.
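For anyone who wants to check their own build, a quick probe along these lines is enough (the module name and attribute spellings are assumptions on my part; adjust them to whatever your install actually exports):

```python
import importlib

# Candidate FP4 entry points to look for.  The spellings below are my
# assumption; change them to match the symbols your build exposes.
CANDIDATE_HOOKS = ("scale_and_quant_fp4", "enable_blockscaled_fp4_attn")

def probe_fp4_hooks(module_name: str = "sageattention") -> dict:
    """Return the candidate FP4 hooks actually exposed by the module."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return {}
    return {name: getattr(mod, name) for name in CANDIDATE_HOOKS if hasattr(mod, name)}

if __name__ == "__main__":
    hooks = probe_fp4_hooks()
    print("FP4 hooks found:", ", ".join(hooks) if hooks else "none")
```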

I built an experimental FP4 quantization layer and integrated it directly into `nodes_model_loading.py`.
The system initializes correctly, runs on Blackwell, and logs the tensor output and VRAM profile with the FP4 hooks active.
However, true FP4 compute isn’t yet functional, as the CUDA backend still defaults to FP8/FP16 paths.
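Conceptually the experimental layer is just per-block scale-and-quantize onto the FP4 (E2M1) grid, emulated in full precision because the real kernels can't be reached yet. A rough sketch of the idea, not the exact code from the fork (block size and names are placeholders):

```python
import torch

# E2M1 (FP4) magnitudes: 0, 0.5, 1, 1.5, 2, 3, 4, 6.  The grid below is
# the symmetric set of representable values; quantization is emulated in
# full precision since the Blackwell FP4 kernels are not yet reachable.
_POS = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = torch.cat([-_POS.flip(0)[:-1], _POS])

def fake_quant_fp4(x: torch.Tensor, block: int = 16) -> torch.Tensor:
    """Simulate block-scaled FP4: scale each block onto the grid, snap, rescale."""
    assert x.numel() % block == 0, "pad the tensor so it divides into blocks"
    grid = FP4_GRID.to(x.device)
    flat = x.float().reshape(-1, block)
    # One scale per block, mapping the block max onto the largest FP4 value (6).
    scale = flat.abs().amax(dim=1, keepdim=True) / 6.0 + 1e-12
    # Snap every element to the nearest grid point (memory-heavy but simple).
    idx = (flat / scale).unsqueeze(-1).sub(grid).abs().argmin(dim=-1)
    return (grid[idx] * scale).reshape(x.shape).to(x.dtype)
```

In the experimental node the output of this step still feeds the existing BF16/FP8 attention path, which is why it counts as a fallback rather than true FP4 compute.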


---

Proof of Execution

```
attention mode override: sageattn3
FP4 quantization applied to transformer
FP4 API fallback to BF16/FP8 pipeline
Max allocated memory: 9.95 GB
Prompt executed in 341.08 seconds
```
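For reference, the memory and timing figures come from PyTorch's standard CUDA counters, gathered roughly like this (a hypothetical helper, not the exact logging code):

```python
import time
import torch

def run_and_profile(fn, *args, **kwargs):
    """Time a callable and report peak CUDA memory for the run."""
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    peak_gb = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f"Max allocated memory: {peak_gb:.2f} GB")
    print(f"Prompt executed in {elapsed:.2f} seconds")
    return result
```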


---

Next Steps

- Wait for full NV-FP4 exposure in future CUDA / PyTorch releases
- Continue testing with non-quantized WAN 2.2 models
- Publish an FP4-ready fork once reproducibility is verified


Full build logs and technical details are on GitHub:
Repository: github.com/k1n0F/sageattention3-blackwell-wsl2

https://redd.it/1oktwaz