👢Generative View Stitching 👢
👉GVS is a novel approach that enables collision-free, camera-guided video generation along predefined trajectories; it's a non-autoregressive alternative to video-length extrapolation. Full repo under MIT💙
👉Review https://t.ly/TiN_5
👉Paper https://arxiv.org/pdf/2510.24718
👉Project https://andrewsonga.github.io/gvs/
👉Repo github.com/andrewsonga/generative_view_stitching
🔪Tracking Object Transformations🔪
👉"Track Any State": tracking objects through transformations while detecting/describing state changes. Repo & Dataset available under MIT💙
👉Review https://t.ly/NPyW4
👉Paper https://lnkd.in/d4pA3bXJ
👉Project https://lnkd.in/dgbNfCuj
👉Repo https://lnkd.in/dtVWq2z7
👉"Track Any State": tracking objects through transformations while detecting/describing state changes. Repo & Dataset available under MIT💙
👉Review https://t.ly/NPyW4
👉Paper https://lnkd.in/d4pA3bXJ
👉Project https://lnkd.in/dgbNfCuj
👉Repo https://lnkd.in/dtVWq2z7
🔥20❤7🤯3👏2👍1
🎸Another BRIXEL in the Wall 🎸
👉BRIXEL allows the user to produce high-resolution feature maps using the DINOv3 backbone without requiring large amounts of compute. Repo released💙
👉Review https://t.ly/fZPwC
👉Paper arxiv.org/pdf/2511.05168
👉Repo github.com/alexanderlappe/BRIXEL
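👉Rough idea in code — a minimal sketch of the general recipe (a lightweight head lifts frozen ViT patch features to higher resolution, supervised by the backbone's own features on a higher-resolution crop), not BRIXEL's exact architecture; all module and tensor names are illustrative:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureUpsampler(nn.Module):
    """Lightweight head that lifts a coarse ViT patch-feature grid to a
    finer one. Purely illustrative, not BRIXEL's actual design."""
    def __init__(self, dim: int = 768, hidden: int = 256, scale: int = 4):
        super().__init__()
        self.scale = scale
        self.refine = nn.Sequential(
            nn.Conv2d(dim, hidden, 3, padding=1), nn.GELU(),
            nn.Conv2d(hidden, dim, 3, padding=1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, h, w) patch features from a frozen backbone
        up = F.interpolate(feats, scale_factor=self.scale,
                           mode="bilinear", align_corners=False)
        return up + self.refine(up)  # residual refinement

# Distillation-style signal: match the frozen backbone's features computed
# on a higher-resolution view of the same image (placeholder tensors here).
head = FeatureUpsampler()
lowres_feats = torch.randn(2, 768, 16, 16)    # e.g. backbone(img_224)
highres_target = torch.randn(2, 768, 64, 64)  # e.g. backbone(img_896), detached
loss = F.mse_loss(head(lowres_feats), highres_target)
loss.backward()
```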
🐼Pixel-Dense Embedding🐼
👉FlowFeat is a novel high-resolution, multi-task feature representation that embeds a distribution of plausible apparent motions, or motion profiles. Repo available💙
👉Review https://t.ly/aUx_U
👉Paper arxiv.org/pdf/2511.07696
👉Project tum-vision.github.io/flowfeat
👉Repo github.com/tum-vision/flowfeat
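👉To make "motion profiles" concrete: a naive per-pixel motion descriptor can be built by stacking optical flows from a reference frame to several offsets (here with torchvision's RAFT). This only illustrates the concept; FlowFeat distills such motion statistics into a single feed-forward encoder:
```python
import torch
from torchvision.models.optical_flow import raft_small, Raft_Small_Weights

weights = Raft_Small_Weights.DEFAULT
model = raft_small(weights=weights).eval()
preprocess = weights.transforms()

@torch.no_grad()
def motion_profile(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, 3, H, W) in [0, 1], H and W divisible by 8.
    Returns a (2*(T-1), H, W) per-pixel stack of flows from frame 0."""
    ref, flows = frames[0:1], []
    for t in range(1, frames.shape[0]):
        img1, img2 = preprocess(ref, frames[t:t + 1])
        flows.append(model(img1, img2)[-1][0])  # final RAFT iterate, (2, H, W)
    return torch.cat(flows, dim=0)

profile = motion_profile(torch.rand(4, 3, 360, 640))  # -> (6, 360, 640)
```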
🚨 Announcement 🚨
I’ve received numerous reports of people blatantly copying my content on LinkedIn just to get a few likes.
Let me be very clear: I put a great deal of time and effort into reviewing papers and creating original, meaningful content. It’s disappointing to see professionals (some of whom are even members of this group or my connections) resorting to plagiarism instead of contributing their own ideas.
👉 Starting today, I’ll be removing these connections from LinkedIn and banning such individuals from this group.
📢 I also encourage everyone to report these cases whenever you come across them. Every single report helps stop this bad habit and keeps our community fair, respectful, and authentic.
🟩 Foundational Humanoid 🟩
👉#NVIDIA unveils SONIC, a novel foundational model for high-precision teleoperation & interactive control (running, jumping, crawling) with natural, human-like movements. Code announced💙
👉Review https://t.ly/_3wnt
👉Paper https://lnkd.in/dctfShu8
👉Project https://lnkd.in/d_inmA2p
🔥Depth Anything 3 is out🔥
👉ByteDance unveils Depth Anything 3 (DA3), a model that predicts spatially consistent geometry from arbitrary visual inputs, with or without known camera poses. Repo under Apache 2.0💙
👉Review https://t.ly/AOPu7
👉Paper arxiv.org/pdf/2511.10647
👉Project https://lnkd.in/dnByyn2z
👉Repo https://lnkd.in/daCVz_4a
👉Demo https://lnkd.in/dKUZiJt
🌩️ It's "Time-to-Move" 🌩️
👉Technion + NVIDIA unveil Time-to-Move (TTM), a training-free, plug-and-play framework for motion- and appearance-controlled video generation with I2V diffusion models (Wan 2.2, CogVideoX, & Stable Video Diffusion). Impressive results!
👉Review https://t.ly/0pwXm
👉Paper https://lnkd.in/dxD3uHYb
👉Project https://lnkd.in/dcE5juyM
👉Repo https://lnkd.in/dMMUjybJ
⌚ Multi-Shot Video Segmentation ⌚
👉Fudan tackles the underexplored task of multi-shot video object segmentation (MVOS). Benchmark and repo (an extension of SAM) available under Apache 2.0💙
👉Review https://t.ly/WBW00
👉Paper https://arxiv.org/pdf/2511.13715
👉Project https://henghuiding.com/SAAS/
👉Repo https://github.com/FudanCVL/SAAS
🔥 SAM 3/3D are OUT!! 🔥
👉#META released SAM 3, a unified model for detection, segmentation, and tracking of objects in images & video using text, exemplar, and visual prompts. Repo/models under proprietary license💙
👉Review https://t.ly/lnRZN
👉Paper https://t.ly/5tq9N
👉Project https://ai.meta.com/sam3/
👉Demo: https://segment-anything.com
👉Repo https://github.com/facebookresearch/sam3
🍯Unwrapping of 3D Meshes🍯
👉PartUV is a novel part-based UV unwrapping method for 3D meshes; it combines learned part priors with geometric cues to generate a compact set of part-aligned charts. Repo released💙
👉Review https://t.ly/8dNIY
👉Paper arxiv.org/pdf/2511.16659
👉Project www.zhaoningwang.com/PartUV/
👉Repo github.com/EricWang12/PartUV
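👉For context, the classic non-learned baseline is chart-based unwrapping with xatlas; PartUV's contribution is segmenting the mesh into parts first so the charts align with them. A minimal baseline sketch (mesh path illustrative):
```python
import trimesh
import xatlas

mesh = trimesh.load("mesh.obj", force="mesh")

# xatlas cuts the surface into charts and packs them into a UV atlas.
vmapping, indices, uvs = xatlas.parametrize(mesh.vertices, mesh.faces)

# vmapping: per-output-vertex index into the original vertex array
# indices:  (F, 3) triangles over the re-indexed vertices
# uvs:      (V', 2) coordinates in [0, 1]^2
unwrapped = trimesh.Trimesh(vertices=mesh.vertices[vmapping],
                            faces=indices, process=False)
unwrapped.visual = trimesh.visual.TextureVisuals(uv=uvs)
unwrapped.export("mesh_unwrapped.obj")
```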
🍕 Upsample Anything 🍕
👉Upsample Anything is a novel, universal, training-free upsampler built on lightweight test-time optimization. No code yet, but it's a relevant paper💙
👉Review https://t.ly/7LE6G
👉Paper https://lnkd.in/dsUfdtih
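👉Since no code is out, here's a generic sketch of the idea (not the paper's exact objective): upsample a low-res feature map by test-time optimization, staying consistent with the input features while an image-guided smoothness term keeps edges sharp:
```python
import torch
import torch.nn.functional as F

def tto_upsample(feat_lr, image, iters=200, lam=0.1, lr=0.05):
    """feat_lr: (1, C, h, w); image: (1, 3, H, W). Generic sketch only."""
    H, W = image.shape[-2:]
    feat_hr = F.interpolate(feat_lr, size=(H, W), mode="bilinear",
                            align_corners=False).clone().requires_grad_(True)
    opt = torch.optim.Adam([feat_hr], lr=lr)
    # Edge-aware weights: allow feature jumps across strong image gradients.
    gx = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
    gy = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
    wx, wy = torch.exp(-10 * gx), torch.exp(-10 * gy)
    for _ in range(iters):
        opt.zero_grad()
        down = F.adaptive_avg_pool2d(feat_hr, feat_lr.shape[-2:])
        data = F.mse_loss(down, feat_lr)  # consistency with the input features
        tv = (wx * (feat_hr[..., :, 1:] - feat_hr[..., :, :-1]).abs()).mean() + \
             (wy * (feat_hr[..., 1:, :] - feat_hr[..., :-1, :]).abs()).mean()
        (data + lam * tv).backward()
        opt.step()
    return feat_hr.detach()

out = tto_upsample(torch.randn(1, 64, 32, 32), torch.rand(1, 3, 256, 256))
```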
🦞Single Synthetic Image per Class🦞
👉MIT unveils Linear Gradient Matching (H/T Torralba), a novel distillation method that uses a single synthetic image per class to train linear classifiers (and more). Repo available💙
👉Review https://t.ly/dD3un
👉Paper arxiv.org/pdf/2511.16674
👉Project linear-gradient-matching.github.io/
👉Repo github.com/GeorgeCazenavette/linear-gradient-matching
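👉The core trick in one screen: optimize one synthetic point per class so that a linear classifier's gradient on the synthetic batch matches its gradient on real data. A minimal sketch with random features as stand-ins for real embeddings (hyperparameters illustrative):
```python
import torch
import torch.nn.functional as F

D, K, N = 512, 10, 4096                        # feature dim, classes, real samples
real_x, real_y = torch.randn(N, D), torch.randint(0, K, (N,))
syn_x = torch.randn(K, D, requires_grad=True)  # one synthetic point per class
syn_y = torch.arange(K)
opt = torch.optim.Adam([syn_x], lr=0.01)

for step in range(500):
    W = (torch.randn(K, D) * 0.01).requires_grad_(True)  # fresh linear probe
    g_real, = torch.autograd.grad(
        F.cross_entropy(real_x @ W.t(), real_y), W)
    g_syn, = torch.autograd.grad(
        F.cross_entropy(syn_x @ W.t(), syn_y), W, create_graph=True)
    # Push the synthetic-batch gradient toward the real-data gradient.
    loss = 1 - F.cosine_similarity(g_syn.flatten(), g_real.flatten(), dim=0)
    opt.zero_grad(); loss.backward(); opt.step()
```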
🧪 EfficientSAM3 is out 🧪
👉Bristol announces EfficientSAM3, a family of efficient models built on Progressive Hierarchical Distillation that transfers capability from SAM3 to lightweight students. Code coming (in sync with SAM3 release)💙
👉Review https://t.ly/bfXP2
👉Paper arxiv.org/pdf/2511.15833
👉Project simonzeng7108.github.io/efficientsam3/
👉Repo github.com/SimonZeng7108/efficientsam3
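👉Until the code lands, the transfer step itself is plain teacher→student feature distillation; a generic sketch (stand-in conv modules, not the SAM3 API or the paper's progressive schedule):
```python
import torch
import torch.nn.functional as F

teacher = torch.nn.Conv2d(3, 256, 16, stride=16).eval()  # frozen stand-in
student = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 16, stride=16),
    torch.nn.Conv2d(64, 256, 1),                          # project to teacher dim
)
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

images = torch.rand(8, 3, 224, 224)
with torch.no_grad():
    t_feat = teacher(images)             # teacher features, no gradients
s_feat = student(images)
loss = F.mse_loss(s_feat, t_feat) + \
       (1 - F.cosine_similarity(s_feat.flatten(1), t_feat.flatten(1)).mean())
opt.zero_grad(); loss.backward(); opt.step()
```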
🌩️ Cloud4D in time 🌩️
👉Cloud4D: physically realistic 3D cloud fields reconstructed from ground-based cameras at 25 m spatial and 5 s temporal resolution. Repo coming, data released💙
👉Review https://t.ly/w7Zly
👉Paper arxiv.org/pdf/2511.19431
👉Project cloud4d.jacob-lin.com/
👉Data https://drive.google.com/drive/folders/1QU_0kIUXIVt8h3uqygBeaF3Gvr_L5SdX?usp=drive_link
👉Repo TBA
🍓MotionV2V: Editing Motion in Video🍓
👉 Google unveils motion edits, a new approach to editing videos by controlling the change in motion from the original to the edited video with diffusion models. Impressive results. Repo to be released soon💙
👉Review https://t.ly/s0sIT
👉Paper https://arxiv.org/pdf/2511.20640
👉Project https://ryanndagreat.github.io/MotionV2V/
👉Repo https://github.com/RyannDaGreat/MotionV2V
🔥 Smell Like Vision Spirit 🔥
👉New York Smells is a novel large-scale dataset of paired vision and olfaction captured in the wild, enabling the new task of cross-modal learning between smell and sight. With the lights out, it's less dangerous. Dataset available💙
👉Review https://t.ly/Ycn_B
👉Paper arxiv.org/pdf/2511.20544
👉Project smell.cs.columbia.edu/