realistic elf or something.
## Base vs Distil vs Turbo
They're good for different things. I'm generally a fan of base models, so most of the workflows I post are (or will be) for base models. They generally give the highest quality, but they're much slower and can be finicky to use at times.
### What is distillation?
It's basically a method of narrowing a model's focus so that it converges on what you want faster and more consistently. This lets a distil generate images in fewer steps, and more reliably, for whatever subject/topic it was distilled towards. Distils often also come pre-negatived (in a sense, don't @ me), so you can use 1.0 CFG and no negative prompt. Distils can be full models or simple LoRAs.
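To see why 1.0 CFG and "no negative prompt" go together, here is a minimal sketch of the standard classifier-free guidance mix. The function name and the toy numbers are made up for illustration; a real sampler does this on latent tensors, not short lists.

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance mix:
    start from the negative/unconditional prediction and move
    `scale` times the distance towards the positive-prompt prediction."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# Toy stand-ins for the model's two noise predictions (made-up numbers).
cond = [0.25, -0.5, 1.0]    # prediction with the positive prompt
uncond = [1.0, 0.5, -2.0]   # prediction with the negative (or empty) prompt

at_one = cfg_combine(cond, uncond, 1.0)    # scale 1.0: negative branch cancels out
at_four = cfg_combine(cond, uncond, 4.0)   # scale 4.0: pushed away from the negative

print(at_one)   # identical to `cond`
print(at_four)
```

At a scale of 1.0 the result is just the positive-prompt prediction, so whatever you type into the negative prompt has no influence on the image. That's the sense in which a distil that runs at 1.0 CFG "can't have" a negative prompt.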
The downside of this is that the model becomes more narrow, making it less creative and less capable outside of the areas it was focused on during distillation. For many models it also reduces the quality of image outputs, sometimes massively. Models like Qwen and Flux have god-awful quality when distilled (especially human skin), but luckily Z-image distils pretty well and only loses a little bit of quality. Generally, the fewer steps the distil needs the lower the quality is. 4-step distils usually have very poor quality compared to base, while 8+ step distils are usually much more balanced.
Z-image turbo is just an official distil, and it's focused on general realism and human-centric shots. It's also designed to run in around 10 steps, allowing it to maintain pretty high quality.
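For a rough feel of where the speed difference comes from, here is a back-of-the-envelope sketch that counts model evaluations. It rests on two assumptions that are not from this post: the sampler calls the model once per step, and the frontend skips the negative-prompt pass entirely when CFG is exactly 1.0 (many do, but check yours). The base-model settings are hypothetical, picked only for comparison.

```python
def model_evaluations(steps, cfg):
    """Rough count of model forward passes for one image.

    Assumptions: one call per step for the positive prompt, plus a
    second call per step for the negative branch whenever cfg != 1.0.
    """
    passes_per_step = 1 if cfg == 1.0 else 2
    return steps * passes_per_step

# Turbo: around 10 steps at 1.0 CFG, as described above.
turbo = model_evaluations(steps=10, cfg=1.0)

# Base: 30 steps at 4.0 CFG is a hypothetical setting, not a recommendation.
base = model_evaluations(steps=30, cfg=4.0)

print(turbo, base, base / turbo)
```

Under those assumptions the hypothetical base run needs several times more model calls than the turbo run, which is roughly the gap you feel in practice. Real timings also depend on the sampler, resolution, and hardware.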
So, if you're just doing human-centric shots and don't mind a small quality drop, Z-image turbo will work just fine for you. You'll want to use a different workflow though - let me know if you'd like me to upload mine.
Below are the typical pros and cons of base models and distils. These are pretty much always true, but not always a 'big deal' depending on the model. As I said above, Z-image distils pretty well so it's not too bad, but be careful which one you use - tons of distils are terrible at human skin and make people look plastic (z-image turbo is fine).
Base model pros:
- Generally gives the highest quality outputs with the finest details, once you get the hang of it
- Creative and flexible

Base model cons:
- Very slow
- Usually requires a lengthy negative prompt to get good results
- Creativity has a downside; you'll often need to generate something several times to get a result you like
- More prone to mistakes when compared to the focus areas of distils
  - e.g. z-image base is more likely to mess up hands/fingers or distant faces compared to z-image turbo

Distil pros:
- Fast generations
- Good at whatever it was focused on (e.g. people-centric photography for z-image turbo)
- Doesn't need a negative prompt (usually)

Distil cons:
- Bad at whatever it wasn't focused on, compared to base
- Usually bad at facial expressions (not able to do 'extreme' ones like anger properly)
- Generally less creative, less flexible (not always a downside)
- Lower quality images, sometimes by a lot and sometimes only by a little - depends on the model, the specific distil, and the subject matter
- Can't have a negative prompt (usually)
  - You can get access to negative prompts using NAG (not covered in this post)
https://redd.it/1qzncrz
@rStableDiffusion
What is up with the "plastic mouths" that LTX-2 generates when using i2v with your own audio? Info in comments.
https://redd.it/1qzqgbm
@rStableDiffusion
Just created my first Flux.2 Klein 9B style LoRA and I'm impressed with its text and adherence abilities
https://redd.it/1qzvhaw
@rStableDiffusion
Ace Step 1.5 could open up a booming market for huge, comprehensive music LoRAs
I'm still settling into my initial Ace Step 1.5 setup, but I'm getting some pretty high-quality sound out of the software as I gain familiarity with the parameters and prompting conventions. All that's missing to bring Ace Step much closer to Udio is a huge database of the music we already like.
Personally, I'm not talking about - nor do I have any interest in - "ethical" databases.
I would be delighted to pay for well-trained LoRAs. I don't know how big Ace Step LoRAs can be or how many songs they can hold, or if they can be combined, but I'm eager to find out more about that stuff. As of yet I'm not even sure how to load/implement LoRAs but I'll figure it out.
It seems that training music LoRAs might be a bit more involved than training AI image LoRAs, so I don't know if I should expect to see a CivitAI-style gallery of frequent releases such as the huge & still growing collections of SDXL models and LoRAs.
Anyway, I'm really looking forward to what the community produces. I haven't been this excited about Music AI since I discovered Udio almost two years ago.
https://redd.it/1qzutkp
@rStableDiffusion
Did Ace Step 1.5 just get better? Someone merged the Turbo and SFT models
https://huggingface.co/Aryanne/acestep-v15-test-merges/blob/main/acestep_v1.5_merge_sft_turbo_ta_0.5.safetensors
IMO it sounds even better than the base turbo one. Let me know what you think.
https://redd.it/1r0bfof
@rStableDiffusion