Local Dream 2.2.0 - batch mode and history

The new version of Local Dream has been released, with two new features:
- you can now run batch generation (serial, one image after another),
- you can review and save previously generated images, per model!

The new version can be downloaded for Android from here:
https://github.com/xororz/local-dream/releases/tag/v2.2.0

https://redd.it/1omhokh
@rStableDiffusion

GitHub

Release v2.2.0 · xororz/local-dream

Support single-click serial batch generation feature (#111)
Added generation history feature for each model, displaying only the most recent 100 images
Display the most recent 20 historical generat...

7 views 15:40

r/StableDiffusion

Is SD 1.5 still relevant? Are there any cool models?

The other day I was testing things I had generated on our company's old infrastructure (for a year and a half, the only infrastructure we had was a single 2080 Ti...), and with the more advanced infrastructure we have now, something like SDXL (Turbo) or SD 1.5 will cost next to nothing to run.

But I'm afraid that, with all these new advanced models around, these older models aren't as satisfying as they were in the past. So I'm asking: if you still use these models, which checkpoints are you using?

https://redd.it/1omkh9h
@rStableDiffusion

7 views 16:40

r/StableDiffusion

Back to 1.5 and QR Code Monster

https://redd.it/1ommqxs
@rStableDiffusion

7 views 17:40

r/StableDiffusion

It turns out WDDM driver mode makes our RAM-to-GPU transfers far slower than TCC or MCDM mode. Has anyone figured out how to bypass NVIDIA's software-level restrictions?

I noticed this issue while working on Qwen Image model training.

We are getting a massive speed loss on Windows compared to Linux whenever we do large data transfers between RAM and the GPU. It is all due to block swapping, which moves model blocks back and forth between RAM and the GPU.

The hit is so large that Linux runs 2x faster than Windows, or even more.
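
For anyone who wants to reproduce the comparison, here is a minimal sketch of a RAM-to-GPU copy benchmark. It is not the code from the pull request linked in this post, only an illustration. It assumes PyTorch with CUDA is installed; the helper names `bandwidth_gib_s` and `measure_h2d` and the default buffer size are made up for this example.

```python
import time


def bandwidth_gib_s(num_bytes, seconds):
    """Convert a byte count and an elapsed time into GiB per second."""
    return num_bytes / seconds / 2**30


def measure_h2d(size_mib=512, pinned=False, repeats=5):
    """Time host-to-device copies of one buffer (hypothetical helper).

    Assumes PyTorch with a CUDA device; torch is imported lazily so that
    bandwidth_gib_s stays usable on machines without it.
    """
    import torch

    if not torch.cuda.is_available():
        raise RuntimeError("no CUDA device available")

    num_bytes = size_mib * 2**20
    # pin_memory=True allocates page-locked host memory.
    host = torch.empty(num_bytes, dtype=torch.uint8, pin_memory=pinned)
    dev = torch.empty(num_bytes, dtype=torch.uint8, device="cuda")

    dev.copy_(host)  # warm-up copy, not timed
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(repeats):
        # non_blocking only has an effect when the source is pinned
        dev.copy_(host, non_blocking=pinned)
    torch.cuda.synchronize()  # wait for all copies before stopping the clock
    elapsed = time.perf_counter() - start

    return bandwidth_gib_s(num_bytes * repeats, elapsed)
```

Calling `measure_h2d(pinned=False)` and `measure_h2d(pinned=True)` on the same machine under Windows and under Linux should show whether the gap described here appears on your setup. Actual numbers depend on the GPU, driver, and PCIe link, so none are quoted.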

Tests were run on the same GPU: an RTX 5090.

You can read more info here: https://github.com/kohya-ss/musubi-tuner/pull/700

It turns out that if we enable TCC mode on Windows, it reaches the same speed as Linux.

However, NVIDIA blocks this at the driver level.

I found a Chinese article showing that by changing just a few letters, via patching nvlddmkm.sys, TCC mode becomes fully working on consumer GPUs. However, this option is extremely hard and complex for average users.

Everything I found says it is due to the WDDM driver mode.

Moreover, it seems Microsoft has added this feature: MCDM

https://learn.microsoft.com/en-us/windows-hardware/drivers/display/mcdm-architecture

And as far as I understand, MCDM mode should reach the same speed as well.

Has anyone managed to fix this issue? Is anyone able to set the mode to MCDM or TCC on consumer GPUs?
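
To check which driver model a GPU is currently running under, a small sketch like the following can help. It assumes `nvidia-smi` is on the PATH and that its `driver_model.current` and `driver_model.pending` query fields are available in your driver version; the function names and the sample line in the usage note are illustrative only.

```python
import csv
import io
import subprocess

# Query GPU name plus current and pending driver model (Windows-only fields;
# other platforms typically report them as not available).
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_model.current,driver_model.pending",
    "--format=csv,noheader",
]


def parse_driver_models(csv_text):
    """Parse nvidia-smi CSV output into a list of dicts, one per GPU."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, current, pending = (f.strip() for f in fields)
        rows.append({"name": name, "current": current, "pending": pending})
    return rows


def query_driver_models():
    """Run nvidia-smi and return the parsed driver model per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_driver_models(result.stdout)
```

For example, `parse_driver_models("NVIDIA GeForce RTX 5090, WDDM, WDDM\n")` yields one entry whose `current` value is `"WDDM"`. This only reports the mode; it does not change it.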

This is a largely hidden issue in the community. Fixing it would probably speed up inference as well.

Using WSL2 makes absolutely no difference. I tested it.

https://redd.it/1ommmek
@rStableDiffusion

GitHub

feat: add use_pinned_memory option for block swap in multiple models by kohya-ss · Pull Request #700 · kohya-ss/musubi-tuner

Add --use_pinned_memory_for_block_swap for each training script to enable pinned memory. Will work with Windows and Linux, but tested with Windows only.
Qwen-Image fine tuning is tested.

5 views 18:40
