🫒 Omni Urban Scene Reconstruction 🫒
👉OmniRe is a novel holistic approach for efficiently reconstructing HD dynamic urban scenes from on-device logs. It can simulate reconstructed scenarios with actors in real time (~60 Hz). Code released💙
👉Review https://t.ly/SXVPa
👉Paper arxiv.org/pdf/2408.16760
👉Project ziyc.github.io/omnire/
👉Code github.com/ziyc/drivestudio
💄Interactive Drag-based Editing💄
👉CSE unveils InstantDrag: a novel pipeline designed to enhance editing interactivity and speed, taking only an image and a drag instruction as input. Source Code announced, coming💙
👉Review https://t.ly/hy6SL
👉Paper arxiv.org/pdf/2409.08857
👉Project joonghyuk.com/instantdrag-web/
👉Code github.com/alex4727/InstantDrag
🌭Hand-Object interaction Pretraining🌭
👉Berkeley unveils HOP, a novel approach for learning general robot manipulation priors from 3D hand-object interaction trajectories.
👉Review https://t.ly/FLqvJ
👉Paper https://arxiv.org/pdf/2409.08273
👉Project https://hgaurav2k.github.io/hop/
🧸Motion Instruction Fine-Tuning🧸
👉MotIF is a novel method that fine-tunes pre-trained VLMs to distinguish nuanced robotic motions with different shapes and semantic groundings. A work by MIT, Stanford, and CMU. Source Code announced, coming💙
👉Review https://t.ly/iJ2UY
👉Paper https://arxiv.org/pdf/2409.10683
👉Project https://motif-1k.github.io/
👉Code coming
⚽ SoccerNet 2024 Results ⚽
👉SoccerNet is the annual video-understanding challenge for football, aiming to advance research across multiple themes in the sport. The 2024 results are out!
👉Review https://t.ly/DUPgx
👉Paper arxiv.org/pdf/2409.10587
👉Repo github.com/SoccerNet
👉Project www.soccer-net.org/
🌏 JoyHallo: Mandarin Digital Human 🌏
👉JD Health tackles the challenges of audio-driven video generation in Mandarin, a task complicated by the language's intricate lip movements and the scarcity of HQ datasets. Impressive results (-> audio ON). Code & Models available💙
👉Review https://t.ly/5NGDh
👉Paper arxiv.org/pdf/2409.13268
👉Project jdh-algo.github.io/JoyHallo/
👉Code github.com/jdh-algo/JoyHallo
🎢 Robo-quadruped Parkour🎢
👉LAAS-CNRS unveils a novel RL approach for agile, parkour-like skills: walking, climbing high steps, leaping over gaps, and crawling under obstacles. Data and Code available💙
👉Review https://t.ly/-6VRm
👉Paper arxiv.org/pdf/2409.13678
👉Project gepetto.github.io/SoloParkour/
👉Code github.com/Gepetto/SoloParkour
🩰 Dressed Humans in the wild 🩰
👉ETH (+ #Microsoft) unveils ReLoo: novel high-quality 3D reconstruction of humans dressed in loose garments from monocular in-the-wild clips, with no prior assumptions about the garments. Source Code announced, coming 💙
👉Review https://t.ly/evgmN
👉Paper arxiv.org/pdf/2409.15269
👉Project moygcc.github.io/ReLoo/
👉Code github.com/eth-ait/ReLoo
🌾 New SOTA Edge Detection 🌾
👉CUP (+ ESPOCH) unveils the new SOTA for Edge Detection (NBED): consistently superior performance across multiple benchmarks, even against models with huge computational cost and complex training. Source Code released💙
👉Review https://t.ly/zUMcS
👉Paper arxiv.org/pdf/2409.14976
👉Code github.com/Li-yachuan/NBED
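For context on the task, here is a classic Sobel edge detector in plain Python — a minimal baseline illustrating the problem NBED addresses, not the NBED method itself:

```python
# Classic Sobel edge detection: gradient magnitude per pixel.
# Baseline illustration only -- NOT the NBED architecture.

def sobel_edges(img):
    """Return a gradient-magnitude edge map for a 2D grayscale image (list of lists)."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal-gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical-gradient kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1] for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1] for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

On a synthetic image with a vertical step, the magnitude peaks at the boundary columns and is zero in flat regions.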
👩🦰 SOTA Gaussian Haircut 👩🦰
👉ETH et al. unveil Gaussian Haircut, the new SOTA in hair reconstruction via a dual representation (classic + 3D Gaussian). Code and Model announced💙
👉Review https://t.ly/aiOjq
👉Paper arxiv.org/pdf/2409.14778
👉Project https://lnkd.in/dFRm2ycb
👉Repo https://lnkd.in/d5NWNkb5
🍇SPARK: Real-time Face Capture🍇
👉Technicolor Group unveils SPARK, a novel high-precision 3D face capture that uses a collection of unconstrained videos of a subject as prior information. New SOTA, able to handle unseen poses, expressions and lighting. Impressive results. Code & Model announced💙
👉Review https://t.ly/rZOgp
👉Paper arxiv.org/pdf/2409.07984
👉Project kelianb.github.io/SPARK/
👉Repo github.com/KelianB/SPARK/
🦴 One-Image Object Detection 🦴
👉Delft University (+Hensoldt Optronics) introduces OSSA, a novel unsupervised domain adaptation method for object detection that utilizes a single, unlabeled target image to approximate the target domain style. Code released💙
👉Review https://t.ly/-li2G
👉Paper arxiv.org/pdf/2410.00900
👉Code github.com/RobinGerster7/OSSA
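As a rough illustration of approximating a domain's "style" from feature statistics — the classic AdaIN-style trick, not necessarily OSSA's exact mechanism — one can shift and rescale source features to match the channel statistics of a single target image:

```python
# Channel-wise feature-statistics alignment (AdaIN-style).
# Hypothetical sketch of single-image style approximation, not the OSSA formulation.

def align_stats(source_feats, target_feats, eps=1e-5):
    """Shift/scale each source channel to match the target channel's mean and std.

    source_feats, target_feats: dict mapping channel name -> list of activations.
    """
    aligned = {}
    for ch, vals in source_feats.items():
        t = target_feats[ch]
        s_mean = sum(vals) / len(vals)
        s_std = (sum((v - s_mean) ** 2 for v in vals) / len(vals)) ** 0.5
        t_mean = sum(t) / len(t)
        t_std = (sum((v - t_mean) ** 2 for v in t) / len(t)) ** 0.5
        # normalize the source channel, then re-scale to the target statistics
        aligned[ch] = [(v - s_mean) / (s_std + eps) * t_std + t_mean for v in vals]
    return aligned
```

After alignment, each source channel carries the target's first- and second-order statistics, which is one simple way to mimic a target domain's style from one unlabeled image.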
🛳️ EVER Ellipsoid Rendering 🛳️
👉UCSD & Google present EVER, a novel method for real-time differentiable emission-only volume rendering. Unlike 3DGS, it does not suffer from popping artifacts or view-dependent density, achieving ∼30 FPS at 720p on an #NVIDIA RTX 4090.
👉Review https://t.ly/zAfGU
👉Paper arxiv.org/pdf/2410.01804
👉Project half-potato.gitlab.io/posts/ever/
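The emission-only rendering model composites emitted radiance weighted by ray transmittance, with no scattering term. A minimal numerical sketch for one ray, using the standard quadrature form (EVER itself integrates ellipsoid primitives exactly; this is just the textbook discretization):

```python
import math

# Emission-only volume rendering along one ray:
#   C = sum_i T_i * (1 - exp(-sigma_i * dt)) * c_i,  T_i = prod_{j<i} exp(-sigma_j * dt)
# Illustrative quadrature only, not EVER's primitive-based solver.

def render_ray(densities, emissions, dt):
    """Accumulate color and remaining transmittance over ray segments of length dt."""
    transmittance = 1.0
    color = 0.0
    for sigma, c in zip(densities, emissions):
        alpha = 1.0 - math.exp(-sigma * dt)  # opacity of this segment
        color += transmittance * alpha * c   # emission seen through the medium so far
        transmittance *= 1.0 - alpha         # light surviving past the segment
    return color, transmittance
```

An empty ray returns zero color with full transmittance; a fully opaque first segment returns its emission color with near-zero transmittance.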
🔥 "Deep Gen-AI" Full Course 🔥
👉A fresh Stanford course on the probabilistic foundations and algorithms of deep generative models, with an overview of the evolution of generative AI in #computervision, language and more...
👉Review https://t.ly/ylBxq
👉Course https://lnkd.in/dMKH9gNe
👉Lectures https://lnkd.in/d_uwDvT6
🐏 EFM3D: 3D Ego-Foundation 🐏
👉#META presents EFM3D, the first benchmark for 3D object detection and surface regression on high-quality annotated egocentric data from Project Aria. Datasets & Code released💙
👉Review https://t.ly/cDJv6
👉Paper arxiv.org/pdf/2406.10224
👉Project www.projectaria.com/datasets/aeo/
👉Repo github.com/facebookresearch/efm3d
🥦Gaussian Splatting VTON🥦
👉GS-VTON is a novel image-prompted 3D virtual try-on (VTON) method which, by leveraging 3DGS as the 3D representation, transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced💙
👉Review https://t.ly/sTPbW
👉Paper arxiv.org/pdf/2410.05259
👉Project yukangcao.github.io/GS-VTON/
👉Repo github.com/yukangcao/GS-VTON
💡Diffusion Models Relighting💡
👉#Netflix unveils DifFRelight, a novel diffusion-based method for free-viewpoint facial relighting: precise lighting control and high-fidelity relit facial images from flat-lit inputs.
👉Review https://t.ly/fliXU
👉Paper arxiv.org/pdf/2410.08188
👉Project www.eyelinestudios.com/research/diffrelight.html
🥎POKEFLEX: Soft Object Dataset🥎
👉PokeFlex from ETH is a dataset that includes 3D textured meshes, point clouds, RGB & depth maps of deformable objects. Pretrained models & dataset announced💙
👉Review https://t.ly/GXggP
👉Paper arxiv.org/pdf/2410.07688
👉Project https://lnkd.in/duv-jS7a
👉Repo
🔥 DEPTH ANY VIDEO is out! 🔥
👉DAV is a novel foundation model for image/video depth estimation. The new SOTA for accuracy & consistency, up to 150 FPS!
👉Review https://t.ly/CjSz2
👉Paper arxiv.org/pdf/2410.10815
👉Project depthanyvideo.github.io/
👉Code github.com/Nightmare-n/DepthAnyVideo
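Temporal consistency — one of the two axes DAV claims SOTA on — can be probed with a simple proxy metric: the mean absolute per-pixel change between consecutive depth maps. A hypothetical illustration, not DAV's evaluation protocol:

```python
# Proxy metric for temporal stability of predicted video depth.
# Hypothetical sketch -- not the evaluation protocol used by the DAV authors.

def temporal_consistency(depth_frames):
    """Mean absolute per-pixel change between consecutive depth maps.

    depth_frames: list of equally-sized 2D depth maps (lists of lists).
    Lower is more temporally stable (0.0 for a perfectly static sequence).
    """
    total, count = 0.0, 0
    for prev, curr in zip(depth_frames, depth_frames[1:]):
        for row_p, row_c in zip(prev, curr):
            for a, b in zip(row_p, row_c):
                total += abs(a - b)
                count += 1
    return total / count if count else 0.0
```

A static scene with stable predictions scores 0.0; flicker between frames raises the score.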
🪞Robo-Emulation via Video Imitation🪞
👉OKAMI (UT & #Nvidia) is a novel foundation method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.
👉Review https://t.ly/_N29-
👉Paper arxiv.org/pdf/2410.11792
👉Project https://lnkd.in/d6bHF_-s
🔥 CoTracker3 by #META is out! 🔥
👉#Meta (+VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data 🤯🤯🤯
👉Review https://t.ly/TcRIv
👉Paper arxiv.org/pdf/2410.11831
👉Project cotracker3.github.io/
👉Code github.com/facebookresearch/co-tracker