🎭New 2D Landmarks SOTA🎭
👉Flawless AI unveils FaceLift, a novel semi-supervised approach that learns 3D landmarks by directly lifting (visible) hand-labeled 2D landmarks into 3D, preserving alignment with the 2D landmark definitions, with no need for 3D landmark datasets. No code announced🥹
👉Review https://t.ly/lew9a
👉Paper arxiv.org/pdf/2405.19646
👉Project davidcferman.github.io/FaceLift
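👉A minimal illustrative sketch of the core idea, lifting 2D landmarks by supervising a 3D prediction only through its 2D reprojection on visible annotations. Hypothetical names and a weak-perspective camera are assumptions for illustration, not from the paper:

```python
def weak_perspective_project(points3d, scale=1.0, tx=0.0, ty=0.0):
    """Project 3D landmarks to 2D with a weak-perspective camera:
    (x, y, z) -> (s*x + tx, s*y + ty). Depth z is dropped."""
    return [(scale * x + tx, scale * y + ty) for (x, y, z) in points3d]

def reprojection_loss(pred3d, labeled2d, visibility, scale=1.0):
    """Mean squared 2D error over *visible* hand-labeled landmarks only:
    this is how a lifted 3D prediction can be supervised without any
    3D ground truth."""
    proj = weak_perspective_project(pred3d, scale)
    total, n = 0.0, 0
    for (px, py), (lx, ly), vis in zip(proj, labeled2d, visibility):
        if vis:
            total += (px - lx) ** 2 + (py - ly) ** 2
            n += 1
    return total / max(n, 1)
```

A prediction whose visible landmarks reproject exactly onto the labels incurs zero loss; occluded landmarks contribute nothing.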
🐳 MultiPly: in-the-wild Multi-People 🐳
👉MultiPly: novel framework to reconstruct multiple people in 3D from monocular in-the-wild videos. It's the new SOTA on publicly available datasets and in-the-wild videos. Source Code announced, coming💙
👉Review https://t.ly/_xjk_
👉Paper arxiv.org/pdf/2406.01595
👉Project eth-ait.github.io/MultiPly
👉Repo github.com/eth-ait/MultiPly
👹AI and the Everything in the Whole Wide World Benchmark👹
👉Last week Yann LeCun said, in effect, that LLMs will not reach human intelligence. The claim: current #deeplearning is not ready for "general AI", and a "radical alternative" is necessary to create a "superintelligence".
👉Review https://t.ly/isdxM
👉News https://lnkd.in/dFraieZS
👉Paper https://lnkd.in/da-7PnVT
📞FacET: VideoCall Change Your Expression📞
👉Columbia University unveils FacET: a system for discovering behavioral differences between conversing face-to-face (F2F) and on video calls (VCs).
👉Review https://t.ly/qsQmt
👉Paper arxiv.org/pdf/2406.00955
👉Project facet.cs.columbia.edu/
👉Repo (empty) github.com/stellargo/facet
🚙 UA-Track: Uncertainty-Aware MOT🚙
👉UA-Track: novel Uncertainty-Aware 3D MOT framework which tackles the uncertainty problem from multiple aspects. Code announced, not released yet.
👉Review https://t.ly/RmVSV
👉Paper https://arxiv.org/pdf/2406.02147
👉Project https://liautoad.github.io/ua-track-website
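👉A minimal sketch of what "uncertainty-aware" association means in MOT: scaling the matching cost by each track's predicted variance, so uncertain tracks tolerate larger position errors. Pure illustration with hypothetical names, not the UA-Track implementation:

```python
def mahalanobis_sq(det, track_mean, track_var):
    """Squared Mahalanobis distance between a detection and a track
    under a diagonal covariance: a large variance (high uncertainty)
    shrinks the cost of the same position error."""
    return sum((d - m) ** 2 / v for d, m, v in zip(det, track_mean, track_var))

def associate(dets, tracks, gate=9.0):
    """Greedy uncertainty-aware matching: each detection goes to the
    lowest-cost track inside the gate, or starts a new track (None)."""
    matches = []
    for det in dets:
        best, best_cost = None, gate
        for tid, (mean, var) in tracks.items():
            cost = mahalanobis_sq(det, mean, var)
            if cost < best_cost:
                best, best_cost = tid, cost
        matches.append(best)
    return matches
```

The gate (here 9.0, roughly a 3-sigma radius) rejects matches whose normalized distance is implausible under the track's own uncertainty.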
🧊 Universal 6D Pose/Tracking 🧊
👉Omni6DPose is a novel dataset for 6D Object Pose with 1.5M+ annotations. Extra: GenPose++, the novel SOTA in category-level 6D estimation/tracking thanks to two pivotal improvements.
👉Review https://t.ly/Ywgl1
👉Paper arxiv.org/pdf/2406.04316
👉Project https://lnkd.in/dHBvenhX
👉Lib https://lnkd.in/d8Yc-KFh
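👉At its core, a 6D pose is a 3x3 rotation plus a 3-vector translation applied to object points. A pure-Python sketch of that operation, for orientation only (not from the GenPose++ codebase):

```python
def apply_pose(R, t, points):
    """Apply a 6D pose (3x3 rotation matrix R, translation t) to 3D
    points: p' = R @ p + t. Estimating (R, t) per frame, and the delta
    between frames, is what 6D pose estimation/tracking amounts to."""
    out = []
    for p in points:
        out.append(tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i]
                         for i in range(3)))
    return out
```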
👗 SOTA Multi-Garment VTOn Editing 👗
👉#Google (+UWA) unveils M&M VTO, a novel mix 'n' match virtual try-on that takes as input multiple garment images, a text description of the garment layout, and an image of a person. It's the new SOTA both qualitatively and quantitatively. Impressive results!
👉Review https://t.ly/66mLN
👉Paper arxiv.org/pdf/2406.04542
👉Project https://mmvto.github.io
👑 Kling AI vs. OpenAI Sora 👑
👉Kling: the ultimate Chinese text-to-video model - rival to #OpenAI’s Sora. No papers or tech info to check, but stunning results from the official site.
👉Review https://t.ly/870DQ
👉Paper ???
👉Project https://kling.kuaishou.com/
🍉 MASA: MOT Anything By SAM 🍉
👉MASA: Matching Anything by Segmenting Anything, a pipeline to learn object-level associations from unlabeled images of any domain. A universal instance-appearance model for matching any object in any domain. Source code in June 💙
👉Review https://t.ly/pKdEV
👉Paper https://lnkd.in/dnjuT7xm
👉Project https://lnkd.in/dYbWzG4E
👉Code https://lnkd.in/dr5BJCXm
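👉The association step such a universal appearance model enables can be sketched as cosine matching of per-instance embeddings across frames. A toy illustration with hypothetical names, not the MASA code:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_instances(embs_t, embs_t1, threshold=0.5):
    """Associate instances across two frames by highest cosine
    similarity of their appearance embeddings; instances whose best
    match falls below the threshold are treated as new objects."""
    matches = {}
    for i, e in enumerate(embs_t):
        sims = [cosine(e, f) for f in embs_t1]
        j = max(range(len(sims)), key=sims.__getitem__)
        if sims[j] >= threshold:
            matches[i] = j
    return matches
```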
🎹 PianoMotion10M for gen-hands 🎹
👉PianoMotion10M: 116 hours of piano-playing videos from a bird's-eye view with 10M+ annotated hand poses. A big contribution to hand motion generation. Code & Dataset released💙
👉Review https://t.ly/_pKKz
👉Paper arxiv.org/pdf/2406.09326
👉Code https://lnkd.in/dcBP6nvm
👉Project https://lnkd.in/d_YqZk8x
👉Dataset https://lnkd.in/dUPyfNDA
📫MeshPose: DensePose+HMR📫
👉MeshPose: novel approach to jointly tackle DensePose and Human Mesh Reconstruction in a single network. A natural fit for #AR applications requiring real-time mobile inference.
👉Review https://t.ly/a-5uN
👉Paper arxiv.org/pdf/2406.10180
👉Project https://meshpose.github.io/
🌵 RobustSAM for Degraded Images 🌵
👉RobustSAM, the evolution of SAM for degraded images: enhancing SAM's performance on low-quality pics while preserving promptability & zero-shot generalization. Dataset & Code released💙
👉Review https://t.ly/mnyyG
👉Paper arxiv.org/pdf/2406.09627
👉Project robustsam.github.io
👉Code github.com/robustsam/RobustSAM
🧤HOT3D Hand/Object Tracking🧤
👉#Meta opens a novel egocentric dataset for 3D hand & object tracking. A new benchmark for vision-based understanding of 3D hand-object interactions. Dataset available 💙
👉Review https://t.ly/cD76F
👉Paper https://lnkd.in/e6_7UNny
👉Data https://lnkd.in/e6P-sQFK
💦 Self-driving in wet conditions 💦
👉BMW SemanticSpray: a novel dataset containing scenes in wet surface conditions captured by camera, LiDAR and radar. Camera: 2D Boxes | LiDAR: 3D Boxes, Semantic Labels | Radar: Semantic Labels.
👉Review https://t.ly/8S93j
👉Paper https://lnkd.in/dnN5MCZC
👉Project https://lnkd.in/dkUaxyEF
👉Data https://lnkd.in/ddhkyXv8
🌱 TokenHMR : new 3D human pose SOTA 🌱
👉TokenHMR: a novel HPS method balancing 2D keypoint supervision with 3D pose accuracy, thus leveraging Internet data without known camera parameters. It's the new SOTA by a large margin.
👉Review https://t.ly/K9_8n
👉Paper arxiv.org/pdf/2404.16752
👉Project tokenhmr.is.tue.mpg.de/
👉Code github.com/saidwivedi/TokenHMR
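👉The "token" in TokenHMR refers to representing pose as discrete tokens. A deliberately simplified sketch of the underlying vector-quantization idea, element-wise over a tiny scalar codebook (hypothetical, far simpler than the paper's tokenizer):

```python
def tokenize(pose, codebook):
    """Quantize each continuous pose value to the index of its nearest
    codebook entry, turning a pose vector into a discrete token sequence
    that a transformer can model."""
    tokens = []
    for v in pose:
        idx = min(range(len(codebook)), key=lambda k: abs(codebook[k] - v))
        tokens.append(idx)
    return tokens

def detokenize(tokens, codebook):
    """Map token indices back to (quantized) continuous pose values."""
    return [codebook[t] for t in tokens]
```

Round-tripping loses precision by design: the codebook acts as a learned prior that keeps decoded poses on the valid-pose manifold.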
🤓Glasses-Removal in Videos🤓
👉Lightricks unveils a novel method that takes an input video of a person wearing glasses and removes the glasses while preserving identity. It works even with reflections, heavy makeup, and blinks. Code announced, not yet released.
👉Review https://t.ly/Hgs2d
👉Paper arxiv.org/pdf/2406.14510
👉Project https://v-lasik.github.io/
👉Code github.com/v-lasik/v-lasik-code
🧬Event-driven SuperResolution🧬
👉USTC unveils EvTexture, the first VSR method that utilizes event signals for texture enhancement. It leverages high-freq details of events to better recover texture in VSR. Code available💙
👉Review https://t.ly/zlb4c
👉Paper arxiv.org/pdf/2406.13457
👉Code github.com/DachunKai/EvTexture
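👉The raw ingredient of event-guided VSR is the event stream itself: asynchronous per-pixel brightness changes. A minimal sketch of accumulating events into a frame-like map of high-frequency detail (illustrative only, not the EvTexture pipeline):

```python
def accumulate_events(events, h, w):
    """Accumulate an event stream of (x, y, polarity) triples into a
    2D frame: each pixel sums signed brightness changes. The result
    concentrates on edges and fine texture, which is exactly the
    high-frequency signal event-guided super-resolution exploits."""
    frame = [[0] * w for _ in range(h)]
    for x, y, pol in events:
        frame[y][x] += 1 if pol > 0 else -1
    return frame
```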
🐻StableNormal: Stable/Sharp Normal🐻
👉Alibaba unveils StableNormal, a novel method which tailors the diffusion priors for monocular normal estimation. Hugging Face demo is available💙
👉Review https://t.ly/FPJlG
👉Paper https://arxiv.org/pdf/2406.16864
👉Demo https://huggingface.co/Stable-X
🍦Geometry Guided Depth🍦
👉Niantic's DoubleTake: depth estimation and #3D reconstruction that can take as input, where available, previously-made estimates of the scene's geometry
👉Review https://lnkd.in/dMgakzWm
👉Paper https://arxiv.org/pdf/2406.18387
👉Repo (empty) https://github.com/nianticlabs/DoubleTake
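👉The idea of feeding a previously-made geometry estimate back in can be sketched as a confidence-weighted blend of a prior depth map with a fresh one. A toy per-pixel version with hypothetical names, not the paper's method:

```python
def fuse_depth(new_depth, prior_depth, prior_conf):
    """Blend a fresh depth estimate with a previously-made one,
    pixel-wise, weighting the prior by its confidence in [0, 1].
    Pixels without a usable prior (confidence 0) keep the fresh
    estimate unchanged."""
    return [c * p + (1.0 - c) * d
            for d, p, c in zip(new_depth, prior_depth, prior_conf)]
```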
🌮MeshAnything with Transformers🌮
👉MeshAnything converts any 3D representation into Artist-Created Meshes (AMs), i.e., meshes created by human artists. It can be combined with various 3D asset production pipelines, such as 3D reconstruction and generation, to transform their results into AMs that can be seamlessly applied in the 3D industry. Source Code available💙
👉Review https://t.ly/HvkD4
👉Paper arxiv.org/pdf/2406.10163
👉Code github.com/buaacyw/MeshAnything
🌾LLaNA: NeRF-LLM assistant🌾
👉UniBO unveils LLaNA: a novel Multimodal-LLM that understands and reasons on an input NeRF. It directly processes the NeRF weights and performs tasks such as captioning, Q&A, & zero-shot classification of NeRFs.
👉Review https://t.ly/JAfhV
👉Paper arxiv.org/pdf/2406.11840
👉Project andreamaduzzi.github.io/llana/
👉Code & Data coming