Does anybody know why Forge Couple isn't generating the 2 characters?
https://redd.it/1oinxh1
@rStableDiffusion
Just dropped Kani TTS English - a 400M TTS model that's 5x faster than realtime on RTX 4080
https://huggingface.co/nineninesix/kani-tts-400m-en
https://redd.it/1oiv6p8
@rStableDiffusion
Wan prompting tricks, change scene, FLF
So I've been experimenting with this great img2vid model, and there are some tricks I found useful that I want to share:
1. You can use "immediately cut to the scene...", "the scene changes and <scene/action description>", "the scene cuts", "cut to the next scene", and similar phrases if you want to use your favourite image as a reference and make drastic changes QUICKLY, getting more useful frames per generation (an example prompt is sketched after this list). Inspired by some LoRAs; it also works most of the time with LoRAs not originally trained for scene changes, and even without LoRAs, though the scene-change startup time may vary. LoRAs and their set strengths also have a visible effect on this.
Also, I usually start at least two runs with the same settings but different random seeds; it helps with iterating.
2. FLF (first/last frame) can be used to make this effect even stronger(!) and more predictable. It works best if your first-frame image and last-frame image are, composition-wise, close to what you want (even just rotating the same image makes a huge difference), so Wan effectively tries to merge them immediately. It's closer to having TWO startup references.
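To illustrate the phrasing from point 1, a prompt along these lines is the kind of thing I mean (the scene description after the cut is just an example, fill in whatever you want):
"The woman looks at the camera and smiles. Immediately cut to the scene: she is walking down a rainy neon-lit street at night, the camera slowly tracking her from the side."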
These are my experiments with the BASE Q5KM model. Basically, it's similar to what the Lynx model does (but I failed to get it running, along with most KJ workflows, hence this improvisation).
121 frames work just fine.
Let's discuss and share similar findings.
https://redd.it/1oiw57z
@rStableDiffusion
Update to Repo for my AI Toolkit Fork + New Yaml Settings for I2V motion training
Hi, a PR has already been submitted to Ostris, but yeah... my last one hasn't even been looked at. So here is my fork repo:
https://github.com/relaxis/ai-toolkit
Changes:
1. Automagic now trains a separate LR per LoRA (high noise and low noise) when it detects MoE training, and the LR outputs now print to the log and the terminal. You can also give each LoRA its own optimizer parameters:
optimizer_params:
  lr_bump: 0.000005              # old
  min_lr: 0.000008               # old
  max_lr: 0.0003                 # old
  beta2: 0.999
  weight_decay: 0.0001
  clip_threshold: 1
  high_noise_lr_bump: 0.00001    # new
  high_noise_min_lr: 0.00001     # new
  high_noise_max_lr: 0.0003      # new
  low_noise_lr_bump: 0.000005    # new
  low_noise_min_lr: 0.00001      # new
  low_noise_max_lr: 0.0003       # new
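(As the old/new comments indicate, the plain lr_bump / min_lr / max_lr values are the shared ones, while the high_noise_* and low_noise_* keys set the corresponding values for each LoRA once MoE training is detected.)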
2. Changed the resolution bucket logic. Previously this followed SDXL bucket logic; now you can specify a pixel count per frame. Higher-dimension videos and images can be trained as long as they fit within the specified pixel count (this allows higher-resolution, low-VRAM video training below your cutoff resolution).
resolution:
  - 512
max_pixels_per_frame: 262144
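For reference, the budget arithmetic (the cap is per frame, as the key name suggests):
# 512 x 512 = 262,144 px  -> exactly the budget
# 640 x 384 = 245,760 px  -> fits, can train at this size
# 768 x 512 = 393,216 px  -> over the budget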
https://redd.it/1oiyuzr
@rStableDiffusion
Labubu Generator: Open the Door to Mischief, Monsters, and Your Imagination (Qwen Image LoRA, Civitai Release, Training Details Included)
https://redd.it/1oj3lgt
@rStableDiffusion