📲 Efficient VLMs 📲
👉CoPE-VideoLM is a codec-aware tokenization framework for VLMs that replaces dense RGB encoding with lightweight structured representations derived from codec primitives. Tokens -93% / time-to-first-token -86%! Code announced💙
👉Review https://t.ly/3_GqN
👉Paper https://arxiv.org/pdf/2602.13191
👉Project https://sayands.github.io/cope/
👉Repo TBA
🔥11❤5👏1
🐙Dex4D: Task-Agnostic Track🐙
👉Dex4D by CMU is a novel approach for unseen objects and poses, scene layouts, backgrounds, & task trajectories. Code under Apache 2.0💙
👉Review https://t.ly/ZGx9T
👉Paper arxiv.org/pdf/2602.15828
👉Project dex4d.github.io/
👉Sim github.com/Dex4D/Dex4D-Simulation
👉Vision github.com/Dex4D/Dex4D-Vision
👉HW https://github.com/Dex4D/Dex4D-Hardware
❤8🔥1👏1
🚤Video Neural Compression🚤
👉TeCoNeRV adapts INR hypernetworks to compress videos efficiently at higher resolutions. Impressive: +5.35dB PSNR, -36% bitrate & 1.5-3× faster. Code announced💙
👉Review https://t.ly/0AtCK
👉Paper arxiv.org/pdf/2602.16711
👉Project namithap10.github.io/teconerv/
👉Repo github.com/namithap10/TeCoNeRV/
🔥10❤4👏2👍1
🔥New SOTA Planar Tracking🔥
👉WOFTSAM by the Visual Recognition Group (CTU) is a novel planar tracker that combines robust long-term segmentation by SAM2 with 8 degrees-of-freedom homography pose estimation. Repo under BY-NC-SA 4.0💙
👉Review https://t.ly/VUOe5
👉Paper https://lnkd.in/dZfc_DhQ
👉Repo https://lnkd.in/dAcneJGn
🔥8👍4❤2👏1🤯1🤣1🍾1
🫸 World-Grounded Hand-Obj🫸
👉WHOLE jointly reconstructs coherent hand and object motion in the world space by guiding a generative motion prior. Code announced💙
👉Review https://t.ly/c5w8h
👉Paper https://arxiv.org/pdf/2602.22209
👉Project https://judyye.github.io/whole-www/
👉Repo TBA
❤2👍2🔥1😍1
🧱Solaris: generative #Minecraft🧱
👉NYU unveils Solaris, a multiplayer video world model in Minecraft that generates consistent first-person observations for two players simultaneously. Impressive work. Repo & Dataset💙
👉Review https://t.ly/VrcrT
👉Paper https://arxiv.org/pdf/2602.22208
👉Project https://solaris-wm.github.io/
👉Repo https://github.com/solaris-wm/
🔥6❤2👍2👏1
🦜Geometry-Aware 4D Head🦜
👉GeoDiff4D is a novel framework that reconstructs animatable 4D head avatars from a single portrait image through geometry-aware diffusion. Code announced💙
👉Review https://t.ly/J9L-t
👉Paper https://lnkd.in/ddpv-78g
👉Project https://lnkd.in/d-vhukyj
👉Repo https://lnkd.in/dzd6mnFv
❤5👏3🤯2👍1🔥1
🍓Fully Offline Mobile-VTON🍓
👉A novel, high-quality, privacy-preserving framework that enables fully offline virtual try-on on commodity mobile devices using only a single user image and a garment image. Repo announced, to be released💙
👉Review https://t.ly/dsrIn
👉Paper arxiv.org/pdf/2603.00947
👉Project zhenchenwan.github.io/Mobile-VTON/
👉Repo https://github.com/tmllab/2026_CVPR_Mobile-VTON
❤9🤯3🔥2
🪿All Point Clouds-One Encoder🪿
👉Utonia is a step toward a one-from-all, one-for-all point cloud encoder. It pretrains a single encoder on diverse point cloud data and reuses it as a reliable backbone for downstream tasks. Code under Apache 2.0💙
👉Review https://t.ly/yqSyZ
👉Paper https://arxiv.org/pdf/2603.03283
👉Project pointcept.github.io/Utonia/
👉Repo https://github.com/Pointcept/Utonia
❤5👏2🔥1
🐪DuoMo: Dual Motion Diffusion🐪
👉DuoMo by Meta is a novel generative method that recovers human motion in world-space coordinates from unconstrained videos with noisy or incomplete observations. Code announced💙
👉Review https://t.ly/dnA3K
👉Paper arxiv.org/pdf/2603.03265
👉Project yufu-wang.github.io/duomo/
👉Repo TBA
❤5🤯2👍1👏1
🍙Any Resolution, Any Geometry🍙
👉Ultra Resolution Geometry Transformer (URGT) for depth–normal estimation at arbitrary resolutions (e.g. 4K, 6K, 8K). New SOTA. Repo under MIT💙
👉Review https://t.ly/HXg1n
👉Paper arxiv.org/pdf/2603.03026
👉Project dreamaker-mrc.github.io/Any-Resolution-Any-Geometry/
👉Repo github.com/Dreamaker-MrC/Any-Resolution-Any-Geometry
🔥7❤5👍1
Would it be useful for you to see a few (verified) AI job postings in this channel?
Anonymous Poll
64%
💚YES, why not?!
36%
❌ NO, only damn AI & Papers
❤2