Wan2.1 Mocha Video Character One-Click Replacement
https://reddit.com/link/1ogkacm/video/5banxduzggxf1/player
Workflow download:
https://civitai.com/models/2075972?modelVersionId=2348984
Project address: https://orange-3dv-team.github.io/MoCha/
Controllable replacement of a video character with a user-provided one remains a challenging problem due to the lack of qualified paired-video data. Prior works have predominantly adopted a reconstruction-based paradigm reliant on per-frame masks and explicit structural guidance (e.g., pose, depth). This reliance, however, renders them fragile in complex scenarios involving occlusions, rare poses, character-object interactions, or complex illumination, often resulting in visual artifacts and temporal discontinuities. In this paper, we propose MoCha, a novel framework that bypasses these limitations: it requires only a single first-frame mask and re-renders the character by unifying different conditions into a single token stream. Further, MoCha adopts a condition-aware RoPE to support multi-reference images and variable-length video generation. To overcome the data bottleneck, we construct a comprehensive data synthesis pipeline to collect qualified paired-training videos. Extensive experiments show that our method substantially outperforms existing state-of-the-art approaches.
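The abstract's "condition-aware RoPE" idea — reference-image tokens and video tokens sharing one token stream but carrying distinct positional indices — can be sketched as follows. This is a minimal illustration under assumptions (the helper names and the negative-offset scheme are hypothetical, not MoCha's actual implementation):

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    """Standard RoPE rotation angles for integer positions."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions, inv_freq)  # shape: (n_tokens, dim // 2)

def apply_rope(x, positions):
    """Rotate consecutive feature pairs of x (n_tokens, dim) by positional angles."""
    ang = rope_angles(positions, x.shape[-1])
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def condition_aware_positions(n_ref_tokens, n_video_tokens, ref_offset=-1024):
    """Hypothetical scheme: place reference-image tokens in a disjoint
    (negative) index range so they never collide with video-frame positions,
    regardless of how long the generated video is."""
    ref_pos = ref_offset + np.arange(n_ref_tokens)
    vid_pos = np.arange(n_video_tokens)
    return np.concatenate([ref_pos, vid_pos])

# 8 reference-image tokens + 32 video tokens in one unified stream
dim = 64
tokens = np.random.randn(8 + 32, dim)
pos = condition_aware_positions(8, 32)
rotated = apply_rope(tokens, pos)
print(rotated.shape)  # (40, 64)
```

Because the reference and video ranges never overlap, appending more video frames (variable-length generation) or more reference images only extends each range; attention can still distinguish condition tokens from content tokens purely by position.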
https://redd.it/1ogkacm
@rStableDiffusion
You can try it out directly via the link below; if the results are good, you can deploy it locally: https://www.runninghub.ai/post/1981739348...
Introducing The Arca Gidan Prize, an art competition focused on open models. It's an excuse to push yourself + models, but 4 winners get to fly to Hollywood to show their piece - sponsored by Comfy/Banodoco
https://redd.it/1ogrpmt
@rStableDiffusion
Wan 2.2 - Why the "slow" motion?
Hi,
Every video I generate with Wan 2.2 somehow has "slow" motion, which is an easy tell that the video is AI-generated.
Is there a way to get faster movements that look more natural?
https://redd.it/1ogt5ug
@rStableDiffusion
Holy crap. For me, Chroma Radiance is like 10 times better than Qwen.
https://redd.it/1ogwi51
@rStableDiffusion