🐏 EFM3D: 3D Ego-Foundation 🐏
👉#META presents EFM3D, the first benchmark for 3D object detection and surface regression on high-quality annotated egocentric data from Project Aria. Datasets & code released💙
👉Review https://t.ly/cDJv6
👉Paper arxiv.org/pdf/2406.10224
👉Project www.projectaria.com/datasets/aeo/
👉Repo github.com/facebookresearch/efm3d
🥦Gaussian Splatting VTON🥦
👉GS-VTON is a novel image-prompted 3D virtual try-on (VTON) method which, by using 3DGS as the 3D representation, transfers pre-trained knowledge from 2D VTON models to 3D while improving cross-view consistency. Code announced💙
👉Review https://t.ly/sTPbW
👉Paper arxiv.org/pdf/2410.05259
👉Project yukangcao.github.io/GS-VTON/
👉Repo github.com/yukangcao/GS-VTON
💡Diffusion Models Relighting💡
👉#Netflix unveils DifFRelight, a novel diffusion-based method for free-viewpoint facial relighting: precise lighting control and high-fidelity relit facial images from flat-lit inputs.
👉Review https://t.ly/fliXU
👉Paper arxiv.org/pdf/2410.08188
👉Project www.eyelinestudios.com/research/diffrelight.html
🥎POKEFLEX: Soft Object Dataset🥎
👉PokeFlex, from ETH, is a dataset of deformable objects that includes 3D textured meshes, point clouds, and RGB & depth maps. Pretrained models & dataset announced💙
👉Review https://t.ly/GXggP
👉Paper arxiv.org/pdf/2410.07688
👉Project https://lnkd.in/duv-jS7a
👉Repo
🔥 DEPTH ANY VIDEO is out! 🔥
👉DAV is a novel foundation model for image/video depth estimation. The new SOTA for accuracy & consistency, at up to 150 FPS!
👉Review https://t.ly/CjSz2
👉Paper arxiv.org/pdf/2410.10815
👉Project depthanyvideo.github.io/
👉Code github.com/Nightmare-n/DepthAnyVideo
🪞Robo-Emulation via Video Imitation🪞
👉OKAMI (UT & #Nvidia) is a novel method that generates a manipulation plan from a single RGB-D video and derives a policy for execution.
👉Review https://t.ly/_N29-
👉Paper arxiv.org/pdf/2410.11792
👉Project https://lnkd.in/d6bHF_-s
🔥 CoTracker3 by #META is out! 🔥
👉#Meta (+VGG Oxford) unveils CoTracker3, a new tracker that outperforms the previous SoTA by a large margin using only 0.1% of the training data 🤯🤯🤯
👉Review https://t.ly/TcRIv
👉Paper arxiv.org/pdf/2410.11831
👉Project cotracker3.github.io/
👉Code github.com/facebookresearch/co-tracker
🦠 Neural Metamorphosis 🦠
👉NU Singapore unveils NeuMeta, which transforms neural nets by letting a single model adapt on the fly to different sizes, generating the right weights when needed.
👉Review https://t.ly/DJab3
👉Paper arxiv.org/pdf/2410.11878
👉Project adamdad.github.io/neumeta
👉Code github.com/Adamdad/neumeta
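The weights-on-demand idea can be illustrated with a toy sketch. The tiny coordinate-based generator below is purely hypothetical (NeuMeta's actual model is an INR over a smoothed weight manifold, and all names here are illustrative); it only shows how one set of parameters can emit weight matrices of arbitrary size:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "weight generator": a tiny MLP f(coords) -> weight value.
# NeuMeta's real model is far more elaborate; this only illustrates
# sampling one generator at several target widths.
W1 = rng.normal(size=(3, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)); b2 = np.zeros(1)

def generate_weight_matrix(rows, cols):
    # Normalized (row, col) coordinates plus a width feature form the query;
    # the generator returns one value per coordinate, so any size works.
    r, c = np.meshgrid(np.linspace(0, 1, rows),
                       np.linspace(0, 1, cols), indexing="ij")
    coords = np.stack([r.ravel(), c.ravel(),
                       np.full(rows * cols, cols / 64)], axis=1)
    h = np.tanh(coords @ W1 + b1)
    return (h @ W2 + b2).reshape(rows, cols)

# The same parameters yield layers of different sizes on demand.
w_small = generate_weight_matrix(8, 8)
w_large = generate_weight_matrix(32, 32)
```

The point of the sketch is that network size becomes a query argument rather than a fixed allocation.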
☀️ GS + Depth = SOTA ☀️
👉DepthSplat is the new SOTA in depth estimation & novel view synthesis. Its key feature is the cross-task interaction between Gaussian Splatting & depth estimation. Source code to be released soon💙
👉Review https://t.ly/87HuH
👉Paper arxiv.org/abs/2410.13862
👉Project haofeixu.github.io/depthsplat/
👉Code github.com/cvg/depthsplat
🔥BitNet: code of 1-bit LLM released🔥
👉BitNet by #Microsoft, announced in late 2023, is a 1-bit Transformer architecture designed for LLMs. It introduces BitLinear, a drop-in replacement for the nn.Linear layer that trains 1-bit weights from scratch. Source code just released 💙
👉Review https://t.ly/3G2LA
👉Paper arxiv.org/pdf/2310.11453
👉Code https://lnkd.in/duPADJVb
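A minimal sketch of the BitLinear idea, assuming the paper's description of sign-binarized weights with an absmean scale. This hypothetical NumPy illustration is not Microsoft's released implementation, which also quantizes activations and uses a straight-through estimator during training:

```python
import numpy as np

def binarize_weights(w):
    # 1-bit quantization as described in the BitNet paper:
    # center the weights, take the sign, and keep a per-tensor
    # scale beta (the mean absolute value of the centered weights).
    alpha = w.mean()
    w_centered = w - alpha
    beta = np.abs(w_centered).mean()
    return np.sign(w_centered), beta

def bitlinear_forward(x, w, bias=None):
    # Inference-time sketch of a BitLinear layer: matmul against
    # {-1, +1} weights, rescaled by beta to approximate x @ w.T.
    w_bin, beta = binarize_weights(w)
    y = x @ w_bin.T * beta
    if bias is not None:
        y = y + bias
    return y
```

Since the weights collapse to signs, the matmul reduces to additions and subtractions, which is where the memory and compute savings come from.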
🧿 Look Ma, no markers 🧿
👉#Microsoft unveils the first technique for marker-free, high-quality reconstruction of the COMPLETE human body, including eyes and tongue, without requiring any calibration, manual intervention or custom hardware. Impressive results! Training repo & dataset released💙
👉Review https://t.ly/5fN0g
👉Paper arxiv.org/pdf/2410.11520
👉Project microsoft.github.io/SynthMoCap/
👉Repo github.com/microsoft/SynthMoCap
🪁 PL2Map: efficient neural 2D-3D 🪁
👉PL2Map is a novel neural network tailored for efficient representation of complex point & line maps, a natural representation of 2D-3D correspondences.
👉Review https://t.ly/D-bVD
👉Paper arxiv.org/pdf/2402.18011
👉Project https://thpjp.github.io/pl2map
👉Code https://github.com/ais-lab/pl2map
🌻 Plant Camouflage Detection🌻
👉PlantCamo is the first dataset for plant camouflage detection: 1,250 images with camouflage characteristics. Source code released 💙
👉Review https://t.ly/pYFX4
👉Paper arxiv.org/pdf/2410.17598
👉Code github.com/yjybuaa/PlantCamo
⛈️ SMITE: SEGMENT IN TIME ⛈️
👉SFU unveils SMITE: a novel model that, given only one or a few fine-grained segmentation references, can segment different unseen videos consistently with those references. Dataset & code (under Apache 2.0) announced 💙
👉Review https://t.ly/w6aWJ
👉Paper arxiv.org/pdf/2410.18538
👉Project segment-me-in-time.github.io/
👉Repo github.com/alimohammadiamirhossein/smite
🫐 Blendify: #Python + Blender 🫐
👉Blendify is a lightweight Python framework that provides a high-level API for creating & rendering scenes with #Blender, simplifying data augmentation & synthesis. Source code released💙
👉Review https://t.ly/l0crA
👉Paper https://arxiv.org/pdf/2410.17858
👉Code https://virtualhumans.mpi-inf.mpg.de/blendify/
🔥 D-FINE: new SOTA Detector 🔥
👉D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. New SOTA on MS COCO with additional data. Code & models available 💙
👉Review https://t.ly/aw9fN
👉Paper https://arxiv.org/pdf/2410.13842
👉Code https://github.com/Peterande/D-FINE
AI with Papers - Artificial Intelligence & Deep Learning
🔫 Free-Moving Reconstruction 🔫
👉EPFL (+#MagicLeap) unveils a novel approach for reconstructing a free-moving object from a monocular RGB clip: free interaction with objects in front of a moving camera without relying on any prior, with the sequence optimized globally…
👉Repo github.com/HaixinShi/fmov_pose (official implementation of Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera, AAAI 2025)
🍜 REM: Segment What You Describe 🍜
👉REM is a framework for segmenting concepts in video that can be described via language. Suitable for rare & non-object dynamic concepts such as waves, smoke, etc. Code & data announced 💙
👉Review https://t.ly/OyVtV
👉Paper arxiv.org/pdf/2410.23287
👉Project https://miccooper9.github.io/projects/ReferEverything/
☀️ Universal Relightable Avatars ☀️
👉#Meta unveils URAvatar, photorealistic & relightable avatars from a phone scan with unknown illumination. Stunning results!
👉Review https://t.ly/U-ESX
👉Paper arxiv.org/pdf/2410.24223
👉Project junxuan-li.github.io/urgca-website
🏣 CityGaussianV2: Large-Scale City 🏣
👉A novel approach for large-scale scene reconstruction that addresses critical challenges in geometric accuracy and efficiency: 10x compression, 25% faster & 50% less memory! Source code released💙
👉Review https://t.ly/Xgn59
👉Paper arxiv.org/pdf/2411.00771
👉Project dekuliutesla.github.io/CityGaussianV2/
👉Code github.com/DekuLiuTesla/CityGaussian