🦖 T-Rex 2: a new SOTA is out! 🦖
👉A novel (VERY STRONG) open-set object detection model. Strong zero-shot capabilities, suitable for various scenarios with a single set of weights. Demo and Source Code released💙
👉Review https://t.ly/fYw8D
👉Paper https://lnkd.in/dpmRh2zh
👉Project https://lnkd.in/dnR_jPcR
👉Code https://lnkd.in/dnZnGRUn
👉Demo https://lnkd.in/drDUEDYh
🔥23👍3🤯2❤1🤩1
💄TinyBeauty: 460 FPS Make-up💄
👉TinyBeauty: only 80K parameters to achieve the SOTA in virtual makeup without intricate face prompts. Up to 460 FPS on mobile!
👉Review https://t.ly/LG5ok
👉Paper https://arxiv.org/pdf/2403.15033.pdf
👉Project https://tinybeauty.github.io/TinyBeauty/
👍7🤯4😍2⚡1🔥1💩1
☔ AiOS: All-in-One-Stage Humans ☔
👉All-in-one-stage framework for SOTA expressive pose and shape recovery of multiple humans, without an additional human detection step.
👉Review https://t.ly/ekNd4
👉Paper https://arxiv.org/pdf/2403.17934.pdf
👉Project https://ttxskk.github.io/AiOS/
👉Code/Demo (announced)
❤6👍1👏1
🏀 MAVOS Object Segmentation 🏀
👉MAVOS is a transformer-based VOS w/ a novel, optimized and dynamic long-term modulated cross-attention memory. Code & Models announced (BSD 3-Clause)💙
👉Review https://t.ly/SKaRG
👉Paper https://lnkd.in/dQyifKa3
👉Project github.com/Amshaker/MAVOS
🔥10👍2❤1🥰1
💦 ObjectDrop: automagical objects removal 💦
👉#Google unveils ObjectDrop, the new SOTA in photorealistic object removal and insertion. Focus on shadows and reflections, impressive!
👉Review https://t.ly/ZJ6NN
👉Paper https://arxiv.org/pdf/2403.18818.pdf
👉Project https://objectdrop.github.io/
👍14🤯8❤4🔥3🍾2
🪼 Universal Mono Metric Depth 🪼
👉ETH unveils UniDepth: metric 3D scenes from single images alone, across domains. A novel, universal and flexible solution for monocular metric depth estimation (MMDE). Source code released💙
👉Review https://t.ly/5C8eq
👉Paper arxiv.org/pdf/2403.18913.pdf
👉Code github.com/lpiccinelli-eth/unidepth
🔥10👍1🤣1
🔘 RELI11D: Multimodal Humans 🔘
👉RELI11D is a high-quality multimodal human motion dataset combining LiDAR, an IMU system, an RGB camera, and an event camera. Dataset & Source Code to be released soon💙
👉Review https://t.ly/5EG6X
👉Paper https://lnkd.in/ep6Utcik
👉Project https://lnkd.in/eDhNHYBb
❤3🔥2
🔥 ECoDepth: SOTA Diffusive Mono-Depth 🔥
👉New SIDE model using a diffusion backbone conditioned on ViT embeddings. It's the new SOTA in SIDE. Source Code released 💙
👉Review https://t.ly/s2pbB
👉Paper https://lnkd.in/eYt5yr_q
👉Code https://lnkd.in/eEcyPQcd
🔥11👍4❤3⚡1
AI with Papers - Artificial Intelligence & Deep Learning
🦕 DINO-based Video Tracking 🦕
👉The Weizmann Institute announced the new SOTA in point-tracking via pre-trained DINO features. Source code announced (not yet released)💙
👉Review https://t.ly/_GIMT
👉Paper https://lnkd.in/dsGVDcar
👉Project dino-tracker.github.io/…
Official Pytorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video” (ECCV 2024) - AssafSinger94/dino-tracker
👍10❤2
🕷️ Gen-NeRF2NeRF Translation 🕷️
👉GenN2N: unified NeRF-to-NeRF translation for editing tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc.
👉Review https://t.ly/VMWAH
👉Paper arxiv.org/pdf/2404.02788.pdf
👉Project xiangyueliu.github.io/GenN2N/
👉Code github.com/Lxiangyue/GenN2N
🤯4❤3🥰1
👆iSeg: Interactive 3D Segmentation👆
👉 iSeg: interactive segmentation technique for 3D shapes operating entirely in 3D. It accepts both positive/negative clicks directly on the shape's surface, indicating inclusion & exclusion of regions.
👉Review https://t.ly/tyFnD
👉Paper https://lnkd.in/dydAz8zp
👉Project https://lnkd.in/de-h6SRi
👉Code (coming)
❤7👏2🔥1
👗 Neural Bodies with Clothes 👗
👉Neural-ABC is a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for ID, clothing, shape, and pose.
👉Review https://t.ly/Un1wc
👉Project https://lnkd.in/dhDG6FF5
👉Paper https://lnkd.in/dhcfK7jZ
👉Code https://lnkd.in/dQvXWysP
🔥7👍2👏1
🔌 BodyMAP: human body & pressure 🔌
👉#Nvidia (+CMU) unveils BodyMAP, the new SOTA in predicting body mesh (3D pose & shape) and 3D applied pressure on the human body. Source Code released, Dataset coming 💙
👉Review https://t.ly/8926S
👉Project bodymap3d.github.io/
👉Paper https://lnkd.in/gCxH4ev3
👉Code https://lnkd.in/gaifdy3q
❤8🤯4⚡1👍1🔥1
🧞 XComposer2: 4K Vision-Language 🧞
👉InternLM-XComposer2-4KHD brings LVLM resolution capabilities up to 4K HD (3840×1600) and beyond. Authors: Shanghai AI Lab, CUHK, SenseTime & Tsinghua. Source Code & Models released 💙
👉Review https://t.ly/GCHsz
👉Paper arxiv.org/pdf/2404.06512.pdf
👉Code github.com/InternLM/InternLM-XComposer
🥰7⚡2👍1
⚛️ Flying w/ Photons: Neural Render ⚛️
👉Novel neural rendering technique that synthesizes videos of light propagating through a scene from novel, moving camera viewpoints. Picosecond time resolution!
👉Review https://t.ly/ZqL3a
👉Paper arxiv.org/pdf/2404.06493.pdf
👉Project anaghmalik.com/FlyingWithPhotons/
👉Code github.com/anaghmalik/FlyingWithPhotons
🤯6⚡3❤2👍1🤣1
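👉To get a feel for what picosecond time resolution means in the post above, here is a back-of-the-envelope sketch (not from the paper) of how far light travels in vacuum over a few picoseconds. The function name `light_travel_mm` is hypothetical; the only hard fact used is the SI value of the speed of light.

```python
# Speed of light in vacuum, in metres per second (exact by SI definition).
C_M_PER_S = 299_792_458


def light_travel_mm(picoseconds: float) -> float:
    """Distance light covers in vacuum during the given time, in millimetres."""
    seconds = picoseconds * 1e-12
    return C_M_PER_S * seconds * 1e3  # metres -> millimetres


if __name__ == "__main__":
    for ps in (1, 10, 100):
        print(f"{ps:>4} ps -> {light_travel_mm(ps):.3f} mm")
```

In one picosecond light moves roughly 0.3 mm, so capturing light in flight at this time scale corresponds to sub-millimetre steps of the wavefront.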
☄️ Tracking Any 2D Pixels in 3D ☄️
👉 SpatialTracker lifts 2D pixels to 3D using monocular depth, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories.
👉Review https://t.ly/B28Cj
👉Paper https://lnkd.in/d8ers_nm
👉Project https://lnkd.in/deHjtZuE
👉Code https://lnkd.in/dMe3TvFT
❤10🔥5⚡1👏1
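👉The first step mentioned in the post above, lifting 2D pixels to 3D with a depth value, can be sketched with the standard pinhole camera model. This is a minimal illustration and not the SpatialTracker code: the intrinsics (`fx`, `fy`, `cx`, `cy`), the sample pixels, and the helper names are all assumptions, and the depths would in practice come from a monocular depth estimator. The triplane representation and the transformer updates are not shown.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def lift_pixel(u: float, v: float, depth: float,
               fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project pixel (u, v) with depth along the optical axis to camera space."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def lift_pixels(pixels: Sequence[Tuple[float, float]], depths: Sequence[float],
                fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Lift a list of pixels, pairing each with its estimated depth."""
    return [lift_pixel(u, v, d, fx, fy, cx, cy)
            for (u, v), d in zip(pixels, depths)]


# Hypothetical intrinsics for a 640x480 camera (assumed values).
fx = fy = 500.0
cx, cy = 320.0, 240.0

# Two example pixels, both with an assumed depth of 2 metres.
pixels = [(320.0, 240.0), (420.0, 240.0)]
depths = [2.0, 2.0]

points = lift_pixels(pixels, depths, fx, fy, cx, cy)
print(points)
```

A pixel at the principal point lands on the optical axis, while a pixel 100 px to the right at 2 m depth ends up about 0.4 m to the side; tracking then operates on such 3D points instead of raw image coordinates.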
🪐YOLO-CIANNA: Neural Astro🪐
👉 CIANNA is a general-purpose deep learning framework for (but not only for) astronomical data analysis. Source Code released 💙
👉Review https://t.ly/441XS
👉Paper arxiv.org/pdf/2402.05925.pdf
👉Code github.com/Deyht/CIANNA
👉Wiki github.com/Deyht/CIANNA/wiki
👍7⚡5❤4🔥2🥰2
🧤Neuro MusculoSkeletal-MANO🧤
👉SJTU unveils MusculoSkeletal-MANO, a novel musculoskeletal system with a learnable parametric hand model. Source Code announced 💙
👉Review https://t.ly/HOQrn
👉Paper arxiv.org/pdf/2404.10227.pdf
👉Project https://ms-mano.robotflow.ai/
👉Code announced (no repo yet)
🔥3⚡1❤1👍1👏1
⚽SoccerNet: Athlete Tracking⚽
👉SoccerNet Challenge is a novel high-level computer vision task specific to sports analytics. It aims at recognizing the state of a sports game, i.e., identifying and localizing all individuals (players, referees, etc.) on the field.
👉Review https://t.ly/Mdu9s
👉Paper arxiv.org/pdf/2404.11335.pdf
👉Code github.com/SoccerNet/sn-gamestate
❤9👍8🔥3⚡2🤯1
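👉Localizing individuals "on the field", as described in the post above, means turning image detections into pitch coordinates. One common way to do this, assumed here and not stated in the post, is a planar homography from the image to the pitch plane. The matrix `H`, the detections, and the function name `image_to_pitch` below are made up for illustration; this is not the sn-gamestate API.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def image_to_pitch(H: Matrix3, u: float, v: float) -> Tuple[float, float]:
    """Map an image point (u, v) to pitch coordinates using a 3x3 homography."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (x / w, y / w)


# Made-up homography: scales pixels to metres and shifts the origin.
# A real one would come from camera calibration against the pitch lines.
H = [
    [0.05, 0.0, -10.0],
    [0.0, 0.05, -5.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical detections: (role, u, v) with (u, v) the ground-contact point.
detections = [("player", 400.0, 300.0), ("referee", 200.0, 100.0)]

game_state: List[Tuple[str, Tuple[float, float]]] = [
    (role, image_to_pitch(H, u, v)) for role, u, v in detections
]
print(game_state)
```

Each entry pairs a role with a position in metres on the pitch, which is the kind of record a game-state representation needs per individual.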
🎲 Articulated Objs from MonoClips 🎲
👉REACTO is the new SOTA for reconstructing general articulated 3D objects from a single monocular video.
👉Review https://t.ly/REuM8
👉Paper https://lnkd.in/d6PWagij
👉Project https://lnkd.in/dpg3x4tm
👉Repo https://lnkd.in/dRZWj6_N
🤯6👍1🔥1👏1
🪼 All You Need is SAM (+Flow) 🪼
👉Oxford unveils the new SOTA for moving object segmentation via SAM + Optical Flow. Two novel models & Source Code announced 💙
👉Review https://t.ly/ZRYtp
👉Paper https://lnkd.in/d4XqkEGF
👉Project https://lnkd.in/dHpmx3FF
👉Repo coming: https://github.com/Jyxarthur/
❤12👍7🔥2🤯2