Thanks to Kijai, LTX-2 GGUFs are now up. Even Q6 is better quality than FP8, in my opinion.
https://redd.it/1q8590s
@rStableDiffusion
WanGP now has support for audio and image to video input with LTX2!
https://github.com/deepbeepmeep/Wan2GP
https://redd.it/1q85ubt
@rStableDiffusion
Z-Image IMG2IMG for Characters: Endgame V3 - Ultimate Photorealism
https://redd.it/1q87a3o
@rStableDiffusion
Tips on Running LTX2 on Low VRAM (8GB, a Little Less, or More)
There seems to be a lot of confusion here about how to run LTX2 on 8GB VRAM or other low-VRAM setups. I have been running it in a completely stable setup on an 8GB-VRAM 4060 (Mobile) laptop with 64 GB RAM, generating 10-second videos at 768 x 768 within 3 minutes. In fact, I got most of my info from someone who was running the same stuff on 6GB VRAM and 32GB RAM. When done correctly, this throws out videos faster than Flux used to make single images. In my experience, the following things are critical; ignoring any of them results in failures.
Use the workflow provided by ComfyUI in its latest updates (LTX2 Image to Video). None of the versions provided by third-party references worked for me. Use the same models in it (the distilled LTX2), with one substitution for Gemma:
Use the fp8 version of Gemma (the one provided in the workflow is too heavy): download it separately, then expand the workflow and change the clip to that version.
Increase the pagefile to 128 GB, as the model, clip, etc. take roughly 90 to 105 GB of RAM + virtual memory to load. RAM alone, no matter how much, is usually never enough. This is the biggest failure point if skipped.
Use the flags: Low VRAM (for 8GB or less) or Reserve VRAM (for more than 8GB) in the executable file.
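If you launch ComfyUI from the command line (or edit the .bat file in the portable build), the switches corresponding to those options appear to be ComfyUI's standard CLI flags; verify the exact names against your installed version with `python main.py --help`. A sketch of the launch lines:

```shell
# Assumed ComfyUI CLI flags -- check `python main.py --help` on your install.

# For 8GB VRAM or less: aggressively offload model weights to system RAM
python main.py --lowvram

# For cards with a bit more than 8GB: keep a fixed amount of VRAM free
# (value in GB) instead of full low-VRAM mode
python main.py --reserve-vram 2
```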
Start with 480 x 480 and gradually work up to see what limit your hardware allows.
Finally, this:
In ComfyUI\comfy\ldm\lightricks\embeddings_connector.py
replace:
hidden_states = torch.cat((hidden_states, learnable_registers[hidden_states.shape[1]:].unsqueeze(0).repeat(hidden_states.shape[0], 1, 1)), dim=1)
with:
hidden_states = torch.cat((hidden_states, learnable_registers[hidden_states.shape[1]:].unsqueeze(0).repeat(hidden_states.shape[0], 1, 1).to(hidden_states.device)), dim=1)
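For intuition on what that line does and why it can fail: it pads the token sequence out to a fixed register length, and with partial offloading the `learnable_registers` tensor may still sit on the CPU while `hidden_states` is already on the GPU, so `torch.cat` raises a device-mismatch error until the `.to(hidden_states.device)` cast is added. A shape-level NumPy sketch of the padding step (toy sizes and names, not taken from the LTX2 source):

```python
# NumPy stand-in for the patched torch line above; sizes are illustrative.
import numpy as np

batch, seq_len, dim, num_registers = 2, 5, 4, 8  # assumed toy dimensions

hidden_states = np.zeros((batch, seq_len, dim))        # (B, T, D) tokens
learnable_registers = np.ones((num_registers, dim))    # (R, D) registers

# Take the registers beyond the current sequence length, add a batch axis,
# and tile across the batch -- mirroring unsqueeze(0).repeat(B, 1, 1).
pad = np.tile(learnable_registers[seq_len:][None, :, :], (batch, 1, 1))

# Concatenate along the sequence axis: (B, T, D) + (B, R-T, D) -> (B, R, D).
# In torch, this is the step that fails if the two tensors live on
# different devices, hence the .to(hidden_states.device) in the fix.
padded = np.concatenate((hidden_states, pad), axis=1)
print(padded.shape)  # (2, 8, 4)
```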
.... Did all this after a day of banging my head against the wall and nearly giving up, then found this info in multiple places. With all of the above in place, I did not have a single issue.
https://redd.it/1q87hdn
@rStableDiffusion
A 20-second LTX2 video on a 3090 in only 2 minutes at 720p. Wan2GP, not Comfy this time.
https://redd.it/1q8e2g8
@rStableDiffusion
Stop using T2V & Best Practices IMO (LTX Video / ComfyUI Guide)
https://redd.it/1q8dxon
@rStableDiffusion
How Many Male Genital Pics Does Z-Turbo Need for a Lora to work? Sheesh.
Trying to make a LoRA that can generate people with male genitalia. I gathered about 150 photos to train in AI Toolkit, and so far the results are pure nightmare fuel... Is this going to take 1,000+ pictures to train? Any tips from those who have had success in this realm?
https://redd.it/1q8olqf
@rStableDiffusion
Another single 60-second test in LTX-2 with a more dynamic scene
https://redd.it/1q8plrd
@rStableDiffusion
All sorts of LTX-2 workflows. It's getting messy. Can we have Workflow Link + Description of what it achieves in the comments here, in a single place?
Everyone with a workflow, maybe comment/link it with a description/example?
https://redd.it/1q8o0d0
@rStableDiffusion
SDXL → Z-Image → SeedVR2, while the world burns with LTX-2 videos, here are a few images.
https://redd.it/1q8w47s
@rStableDiffusion
Open Source Needs Competition, Not Brain-Dead “WAN Is Better” Comments
Sometimes I wonder whether all these “WAN vs anything else, WAN is better” comments aren’t just a handful of organized Chinese users trying to tear down any competing model 😆
or (here’s the sad truth) whether they’re simply a bunch of idiots ready to spit on everything, even on what’s handed to them for free right under their noses, who
haven’t understood the importance of the competition that drives progress in this open-source sector. That competition is ESSENTIAL, and we’re all hanging by a thread, begging for production-ready tools that can compete with big corporations.
WAN and LTX are two different things:
one was trained to create video and audio together.
I don’t know if you even have the faintest idea of how complex that is.
Just ENCOURAGE OPEN-SOURCE COMPETITION: help if you can, give polite comments and testing, then add your new toy to your arsenal! WTF. God, you piss me off so much with those nasty fingers always ready to type bullshit against everything.
https://redd.it/1q8wt2b
@rStableDiffusion