InfinityStar - new model
https://huggingface.co/FoundationVision/InfinityStar
We introduce InfinityStar, a unified spacetime autoregressive framework for high-resolution image and dynamic video synthesis. Building on the recent success of autoregressive modeling in both vision and language, our purely discrete approach jointly captures spatial and temporal dependencies within a single architecture. This unified design naturally supports a variety of generation tasks such as text-to-image, text-to-video, image-to-video, and long-duration video synthesis via straightforward temporal autoregression. Through extensive experiments, InfinityStar scores 83.74 on VBench, outperforming all autoregressive models by large margins, even surpassing diffusion competitors like HunyuanVideo. Without extra optimizations, our model generates a 5s, 720p video approximately 10× faster than leading diffusion-based methods. To our knowledge, InfinityStar is the first discrete autoregressive video generator capable of producing industrial-level 720p videos. We release all code and models to foster further research in efficient, high-quality video generation.
weights on HF
https://huggingface.co/FoundationVision/InfinityStar/tree/main
InfinityStarInteract_24K_iters
infinitystar_8b_480p_weights
infinitystar_8b_720p_weights
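If useful, here's a minimal sketch for pulling one of these weight folders with the huggingface_hub library; the repo id and folder names come from the listing above, and which folder you fetch is up to you:

```python
# Minimal sketch: download one InfinityStar weight folder from Hugging Face.
# Requires: pip install huggingface_hub
from huggingface_hub import snapshot_download

# Fetch only the 720p weights; swap the pattern for the 480p or Interact folder.
local_dir = snapshot_download(
    repo_id="FoundationVision/InfinityStar",
    allow_patterns=["infinitystar_8b_720p_weights/*"],
)
print(f"Weights downloaded to {local_dir}")
```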
https://redd.it/1ov05oq
@rStableDiffusion
ComfyUI Tutorial Series Ep 70: Nunchaku Qwen Loras - Relight, Camera Angle & Scene Change
https://www.youtube.com/watch?v=9sD5Ekavjgo
https://redd.it/1ov8r21
@rStableDiffusion
Video description: In this episode, we can finally use Loras with the Nunchaku Qwen model in ComfyUI. I'll show you 9 powerful Loras that help you edit and transform images in new ways. Learn how to change camera angles, relight your subjects, remove shadows, blend two images…
Advice: Building Believable Customer Avatars with FaceSeek
I came up with a useful trick for anyone creating story-based content or faceless brands.
Before inserting my AI-generated faces into videos or thumbnails, I run them through FaceSeek to see how real they look.
I create characters in Midjourney and then upload the image to FaceSeek. If it can't find a close match, I assume the face is unique enough to use.
If the search returns similar-looking people, I tweak the prompt a bit and regenerate until it passes (the loop is sketched below).
This way I avoid publishing AI faces that could closely resemble a real person.
I'm not endorsing the product; it's just a handy way to sanity-check your content if you're storytelling with AI images.
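FaceSeek doesn't publish an API that I'm aware of, so the loop above can only be sketched with stubs: generate_face() and faceseek_match_score() below are hypothetical stand-ins for the manual Midjourney and FaceSeek steps, and MATCH_THRESHOLD is an assumed cutoff.

```python
# Hypothetical sketch of the vetting loop described above; nothing here is a real
# Midjourney or FaceSeek API. The stubs just make the control flow runnable.
import random

def generate_face(prompt: str) -> bytes:
    """Stub for the manual Midjourney step: returns image bytes for `prompt`."""
    return prompt.encode()

def faceseek_match_score(face: bytes) -> float:
    """Stub for the manual FaceSeek step: 0.0 = no match found, 1.0 = exact match."""
    return random.random()

MATCH_THRESHOLD = 0.5  # assumed cutoff: above this, too close to a real person

def vet_face(prompt: str, max_tries: int = 5) -> bytes | None:
    """Regenerate until FaceSeek finds no close real-world match, then keep the face."""
    for _ in range(max_tries):
        face = generate_face(prompt)
        if faceseek_match_score(face) < MATCH_THRESHOLD:
            return face  # unique enough to use
        prompt += ", more distinctive facial features"  # nudge the prompt and retry
    return None  # no sufficiently unique face within the budget
```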
https://redd.it/1ov9mov
@rStableDiffusion
XDiT finally releases their ComfyUI node for parallel multi-GPU workers.
https://redd.it/1ov9ns1
@rStableDiffusion
Continuing to update the solution for converting 3D images into realistic photos in Qwen
https://redd.it/1ovccgz
@rStableDiffusion