r/StableDiffusion – Telegram
[SD1.5] This image was entirely generated by AI, not human-prompted (explanation in the comments)
https://i.imgur.com/E0bv2qo.png

https://redd.it/1pxg7n7
@rStableDiffusion
(ComfyUI with 5090) Free resources used to generate infinitely long 2K@32fps videos w/LoRAs

I want to share what's possible on a single RTX 5090 in ComfyUI. In theory it's possible to generate infinitely long, coherent 2K videos at 32fps with custom LoRAs and prompts at any timestamp. My 50-sec video was crisp, with beautiful motion, no distortion or blur, and character consistency with my start image throughout.

Stats on a 50-sec generation:

SVI 2.0 Pro (WAN 2.2 A14B I2V):

50-second video (765 frames): Generate 1280x720 = 1620 secs [SageAttn2 and Torch Compile w/latest lightx2v]

SeedVR2 v2.5.24 (ema_7b_fp16):

50-second video (765 frames): Upscale 1280x720 to 2560x1440 = 1984 secs [SageAttn2 and Triton; Torch Compile could be used here as well, I just forgot]

Rife VFI (rife49):

50-second video (1530 frames): Frame Interpolation 16fps to 32fps = 450 secs

Video Combine:

50-second video (1530 frames): Combine frames = 313 secs

Total = 4367 secs (~73 mins) for a crisp and beautiful (no slow motion) 2560x1440 video at 32fps.
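
As a sanity check on the numbers above, here's a minimal Python sketch of the timing and frame arithmetic (stage names and timings are copied from the list; the 16fps base rate and the 2x RIFE doubling follow from the reported frame counts):

    # Back-of-the-envelope check of the reported pipeline timings.
    stages = {
        "SVI 2.0 Pro (WAN 2.2 A14B I2V), generate 1280x720": 1620,
        "SeedVR2 v2.5.24, upscale 1280x720 -> 2560x1440": 1984,
        "RIFE VFI (rife49), interpolate 16fps -> 32fps": 450,
        "Video Combine": 313,
    }
    total = sum(stages.values())
    print(f"total: {total} s (~{total / 60:.0f} min)")  # 4367 s (~73 min)

    # RIFE doubles the frame count, so fps doubles at constant duration.
    frames_in, fps_in = 765, 16
    frames_out, fps_out = frames_in * 2, fps_in * 2
    print(f"{frames_out} frames at {fps_out}fps = {frames_out / fps_out:.1f} s")  # ~47.8 s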

I might drop a video later in a new post, and if enough people would like a ComfyUI workflow, I will share it.

https://preview.redd.it/0deireppyw9g1.png?width=1058&format=png&auto=webp&s=e65f460de8343b620aca5c2764b38e3a054ce5b8



https://redd.it/1pxn75c
@rStableDiffusion
Best model for anime and ComfyUI workflows...

Recommend me a good model for anime images. I've heard Illustrious is pretty good, but I'm using a basic workflow in ComfyUI and my images come out distorted, especially the faces.

https://redd.it/1pxpp4l
@rStableDiffusion
What’s the best model for each use case?

From my understanding, SDXL (primarily Illustrious) is still the de facto model for anime, Qwen seems to be the best at prompt adherence, and Z Image is the pick for realism (as well as fast iteration). Is this more or less the right use case for each model? And if so, when should other models be used for those tasks, for example WAN as a refiner, Qwen for anime, and so on?

https://redd.it/1pxsyua
@rStableDiffusion
Joined the cool kids with a 5090. Pro audio engineer here looking to connect with other audiophiles for resources - Collaborative thread, will keep OP updated for reference.

Beyond ecstatic!

Looking to build a resource list for all things audio. I've used and "abused" all the commercial offerings, and I'm hoping to dig deep into open source and take my projects to the next level.

What do you love using, and for what? Mind sharing your workflows?

https://redd.it/1pxqstv
@rStableDiffusion
Best anime upscaler?

I've tried waifu2x GUI, Ultimate SD Upscale, Upscayl and some other upscale models, but they don't seem to work well or add much quality; the bad details just become more apparent. I'm trying to upscale NovelAI-generated images. I don't mind if the image changes slightly, as long as noise and artifacts are removed and faces/eyes are improved.

https://redd.it/1pxygmh
@rStableDiffusion
Are there viable careers for Generative AI skills?

I've been learning how to use generative AI for a couple of months now, primarily using ComfyUI to generate images and videos, and I've gotten pretty comfortable with it. I initially started as a way to expand my skill set, since I was recently laid off and haven't had much luck landing a new role in an industry I've worked in for 15+ years.

I've been wondering if there's a way to make some income off this. I know people are selling adult content on Patreon and DeviantArt, but I'm not looking to get into that, and honestly it seems it's already extremely oversaturated.

On the one hand, there seems to be a lot of potential to replace content such as video ads, which typically have expensive production costs, with more economical AI options; on the other hand, there seems to be a lot of aversion to AI-generated content in general. Some companies that do seem to be using generative AI are using licensed tools that are easy to use, so they just do it in-house vs. hiring an experienced 3rd party. Tools such as Nano Banana also don't require any local setup or expensive computer hardware for these companies.

In other words, being able to set up an AI client locally and use open-source models like Z-Turbo doesn't really have any demand. So I'm wondering: should I keep investing my time in learning this, or pursue something else?

https://redd.it/1pxzny8
@rStableDiffusion
QWEN EDIT 2511 seems to be a downgrade when doing small edits with two images.

I've been doing clothes swaps for a local shop, so I have two target models (male and female) and use the clothing images from their supplier. I could extract the clothes first, but with 2509 it's been working fine to keep them on the source person and prompt the model to extract the clothes and place them on image 1.

BUT, with 2511, after hours of playing, it will not only transfer the clothes (very well) but also the skin tone of the source model! This means the outputs end up with darker, tanned arms or midriff than the person's original skin!

Never had this issue with 2509. I've tried adding things like "do not change skin tone" etc., but it insists on bringing it over with the clothes.

As a test I did an interim edit converting the original clothing model/person to a gray mannequin, and guess what, the person ends up with gray skin haha! Again, absolutely fine with 2509.

https://redd.it/1pxzbh2
@rStableDiffusion
People who are using an LLM to enhance prompts, what is your system prompt?

I'm mostly interested in image prompts, and I'd appreciate anyone willing to share theirs.
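
Not an answer from the thread, but as a starting point, here's a minimal sketch of prompt enhancement against a local OpenAI-compatible endpoint; the URL, model name, and system prompt below are illustrative assumptions (e.g. an Ollama-style setup), not anyone's actual settings:

    import requests

    URL = "http://localhost:11434/v1/chat/completions"  # hypothetical local endpoint
    MODEL = "llama3.1"  # placeholder model name

    # Example system prompt for image-prompt enhancement -- illustrative only.
    SYSTEM_PROMPT = (
        "You expand short image ideas into detailed Stable Diffusion prompts. "
        "Describe subject, setting, lighting, camera and style as comma-separated "
        "tags. Output only the prompt, no commentary."
    )

    def enhance(idea: str) -> str:
        # Standard OpenAI chat-completions request shape.
        resp = requests.post(URL, json={
            "model": MODEL,
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": idea},
            ],
            "temperature": 0.7,
        }, timeout=120)
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"].strip()

    print(enhance("a lighthouse at dusk, stormy sea"))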

https://redd.it/1pxzyfc
@rStableDiffusion