😻 GARField: Group Anything 😻
👉 GARField is a novel approach for decomposing #3D scenes into a hierarchy of semantically meaningful groups from posed image inputs.
👉Review https://t.ly/6Hkeq
👉Paper https://lnkd.in/d28mfRcZ
👉Project https://lnkd.in/dzYdRNKy
👉Repo (coming) https://lnkd.in/d2VeRJCS
🔥 Depth Anything: new SOTA 🔥
👉Depth Anything: the new SOTA in monocular depth estimation (MDE), trained jointly on 1.5M labeled images and 62M+ unlabeled images. Minimal usage sketch below.
👉Review https://t.ly/tCBwO
👉Paper https://lnkd.in/djx-9k2J
👉Project https://lnkd.in/dYetqZFa
👉Repo https://lnkd.in/d87CrUGv
👉Demo🤗 https://lnkd.in/dJhvKBep
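👉Hedged mini-sketch (Python) of what inference could look like via the 🤗 depth-estimation pipeline; the checkpoint id in the code is an assumption, check the repo/demo for the exact name:
```python
# Hedged usage sketch (not the official demo code): monocular depth inference via
# the Hugging Face "depth-estimation" pipeline. The checkpoint id below is an
# assumption -- check the repo/demo for the exact model name.
from transformers import pipeline
from PIL import Image

depth_estimator = pipeline(
    task="depth-estimation",
    model="LiheYoung/depth-anything-small-hf",  # assumed checkpoint id
)

image = Image.open("room.jpg")       # any RGB image
result = depth_estimator(image)

depth_map = result["depth"]          # PIL image with the predicted depth
depth_map.save("room_depth.png")
```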
🎭 ULTRA-Realistic Avatar 🎭
👉Novel 3D avatar with enhanced geometric fidelity and superior-quality physically based rendering (PBR) textures, free of unwanted lighting.
👉Review https://t.ly/B3BEu
👉Project https://lnkd.in/dkUQHFEV
👉Paper https://lnkd.in/dtEQxrBu
👉Code coming 🩷
🔥Lumiere: SOTA video-gen🔥
👉#Google unveils Lumiere: Space-Time Diffusion Model for Realistic Video Generation. The new SOTA across tasks: Text-to-Video, Video Stylization, Cinemagraphs & Video Inpainting.
👉Review https://t.ly/nalJR
👉Paper https://lnkd.in/d-PvrGjT
👉Project https://t.ly/gK8hz
🧪 SUPIR: SOTA restoration 🧪
👉SUPIR is the new SOTA in image restoration: restoring blurry objects, defining material textures, and adjusting the restoration based on high-level semantics.
👉Review https://t.ly/wgObH
👉Project https://supir.xpixel.group/
👉Paper https://lnkd.in/dZPYcUuq
👉Demo coming 🩷 but no code announced :(
🫧 SAM + Open Models 🫧
👉Grounded SAM: Grounding DINO as an open-set detector combined with SAM. It can seamlessly integrate with other open-world models to accomplish more intricate visual tasks. Hedged sketch below.
👉Review https://t.ly/FwasQ
👉Paper arxiv.org/pdf/2401.14159.pdf
👉Code github.com/IDEA-Research/Grounded-Segment-Anything
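👉Hedged sketch of the detector-plus-SAM combo: a text-prompted detector proposes boxes and SAM turns them into masks. detect_boxes() is a hypothetical stand-in for Grounding DINO; only the segment_anything calls follow the official SAM API, and the checkpoint path is a placeholder:
```python
# Hedged sketch of the open-set detector + SAM combo. detect_boxes() is a
# hypothetical stand-in for Grounding DINO; the SAM calls use the official
# segment_anything package. Checkpoint path is a placeholder.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def detect_boxes(image_rgb: np.ndarray, prompt: str):
    """Hypothetical open-set detector (e.g. Grounding DINO) returning xyxy boxes."""
    raise NotImplementedError  # swap in a real text-prompted detector here

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # placeholder path
predictor = SamPredictor(sam)

image_rgb = np.zeros((480, 640, 3), dtype=np.uint8)    # replace with a real RGB image
predictor.set_image(image_rgb)

masks = []
for box in detect_boxes(image_rgb, prompt="a dog"):
    m, _, _ = predictor.predict(box=np.asarray(box), multimask_output=False)
    masks.append(m[0])                                 # boolean HxW mask per box
```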
👢"Virtual Try-All" by #Amazon 👢
👉#Amazon announces “Diffuse to Choose”: diffusion-based, image-conditioned inpainting for virtual try-on (VTON). Virtually place any e-commerce item in any setting.
👉Review https://t.ly/at07Y
👉Paper https://lnkd.in/dxR7nGtd
👉Project diffuse2choose.github.io/
🦩 WildRGB-D: Objects in the Wild 🦩
👉#NVIDIA unveils a novel RGB-D object dataset captured in the wild: ~8,500 recorded objects, ~20,000 RGB-D videos, 46 categories with corresponding masks and 3D point clouds. Generic back-projection sketch below.
👉Review https://t.ly/WCqVz
👉Data github.com/wildrgbd/wildrgbd
👉Paper arxiv.org/pdf/2401.12592.pdf
👉Project wildrgbd.github.io/
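👉A generic sketch (not the dataset's own tooling) of how an RGB-D frame lifts to a 3D point cloud with the pinhole model; file names, depth units and intrinsics are placeholders, see the repo for the actual layout:
```python
# Generic RGB-D -> point-cloud back-projection (pinhole model), NOT the dataset's
# own tooling. File names, depth units and intrinsics are placeholders; check the
# data repo for the actual layout and camera parameters.
import numpy as np
import imageio.v3 as iio

rgb     = iio.imread("frame_rgb.png")                       # HxWx3 uint8
depth   = iio.imread("frame_depth.png").astype(np.float32)  # HxW, assumed millimetres
depth_m = depth / 1000.0

fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0                 # placeholder intrinsics

h, w = depth_m.shape
u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth_m
x = (u - cx) * z / fx
y = (v - cy) * z / fy

valid  = z > 0
points = np.stack([x[valid], y[valid], z[valid]], axis=-1)  # Nx3 camera-frame points
colors = rgb[valid]                                         # matching Nx3 colours
```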
🌋EasyVolcap: Accelerating Neural Volumetric Video🌋
👉Novel #PyTorch library for accelerating neural volumetric video: capturing, reconstruction & rendering.
👉Review https://t.ly/8BISl
👉Paper arxiv.org/pdf/2312.06575.pdf
👉Code github.com/zju3dv/EasyVolcap
🐙 Rock-Track announced! 🐙
👉Rock-Track: the evolution of Poly-MOT, the previous SOTA Tracking-By-Detection framework for 3D MOT. Generic association sketch below.
👉Review https://t.ly/hC0ak
👉Repo, coming: https://lnkd.in/dtDkPwCC
👉Paper coming
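👉A generic Tracking-By-Detection association step, sketched only to illustrate the framework family (not Rock-Track's actual algorithm): Hungarian matching of live tracks to fresh 3D detections on centroid distance:
```python
# Generic tracking-by-detection association step, NOT Rock-Track's actual method:
# match existing tracks to new 3D detections with the Hungarian algorithm on
# centroid distance, then report unmatched detections (candidates for new tracks).
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_centers: np.ndarray, det_centers: np.ndarray, max_dist: float = 2.0):
    """Return (track_idx, det_idx) matches and the indices of unmatched detections."""
    if len(track_centers) == 0 or len(det_centers) == 0:
        return [], list(range(len(det_centers)))
    cost = np.linalg.norm(track_centers[:, None, :] - det_centers[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    matches = [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
    matched = {c for _, c in matches}
    unmatched = [i for i in range(len(det_centers)) if i not in matched]
    return matches, unmatched

# toy example: two live tracks, three fresh detections
tracks = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
dets   = np.array([[0.2, 0.1, 0.0], [5.1, 0.0, 0.1], [20.0, 0.0, 0.0]])
print(associate(tracks, dets))   # -> ([(0, 0), (1, 1)], [2])
```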
🧠350+ Free #AI Courses by #Google🧠
👉350+ free courses from #Google to become a professional in #AI & #Cloud. The full catalog (900+) includes a variety of activities: videos, documents, labs, coding, and quizzes. 15+ supported languages. No excuse.
✅𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈
✅𝐈𝐧𝐭𝐫𝐨 𝐭𝐨 𝐋𝐋𝐌𝐬
✅𝐂𝐕 𝐰𝐢𝐭𝐡 𝐓𝐅
✅𝐃𝐚𝐭𝐚, 𝐌𝐋, 𝐀𝐈
✅𝐑𝐞𝐬𝐩𝐨𝐧𝐬𝐢𝐛𝐥𝐞 𝐀𝐈
👉Review: https://t.ly/517Dr
👉Full list: https://www.cloudskillsboost.google/catalog?page=1
🍋 Diffutoon: new SOTA video 🍋
👉Diffutoon is a cartoon shading approach aiming to transform photorealistic videos into anime style. It can handle exceptionally high resolutions and rapid motions. Source code released!
👉Review https://t.ly/sim2O
👉Paper https://lnkd.in/dPcSnAUu
👉Code https://lnkd.in/d9B_dGrf
👉Project https://lnkd.in/dpcsJcX2
🥓 RANSAC -> PARSAC (neural) 🥓
👉Neural PARSAC: estimating multiple vanishing points (V), fundamental matrices (F) or homographies (H) at the speed of light! Source Code released 💙 Classic RANSAC baseline sketched below.
👉Review https://t.ly/r9ngg
👉Paper https://lnkd.in/dadQ4Qec
👉Code https://lnkd.in/dYp6gADd
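👉For contrast, a classic single-instance RANSAC homography fit with OpenCV; PARSAC itself predicts multiple model instances in a single forward pass, which this baseline does not do. The correspondences here are random placeholders:
```python
# Classic single-instance RANSAC homography baseline (OpenCV), shown for contrast:
# PARSAC's point is to predict multiple model instances (V / F / H) in one forward
# pass instead of iterating like this. The correspondences are random placeholders.
import numpy as np
import cv2

rng = np.random.default_rng(0)
src_pts = (rng.random((100, 1, 2)) * 640).astype(np.float32)   # placeholder matches
dst_pts = (rng.random((100, 1, 2)) * 640).astype(np.float32)

H, inlier_mask = cv2.findHomography(
    src_pts, dst_pts,
    method=cv2.RANSAC,
    ransacReprojThreshold=3.0,
)
n_inliers = int(inlier_mask.sum()) if inlier_mask is not None else 0
print("estimated H:\n", H, "\ninliers:", n_inliers)
```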
↘️ SEELE: "moving" the subjects ➡️
👉Subject repositioning: manipulating an input image to reposition one of its subjects to a desired location while preserving the image’s fidelity. SEELE is a single diffusion model addressing this novel generative sub-task.
👉Review https://t.ly/4FS4H
👉Paper arxiv.org/pdf/2401.16861.pdf
👉Project yikai-wang.github.io/seele/
🎉 ADΔER: Event-Camera Suite 🎉
👉ADΔER: a novel, unified framework for event-based video. Encoder / transcoder / decoder for ADΔER (Address, Decimation, Δt Event Representation) video streams. Source code (Rust) released 💙
👉Review https://t.ly/w5_KC
👉Paper arxiv.org/pdf/2401.17151.pdf
👉Repo github.com/ac-freeman/adder-codec-rs
🚦(add) Anything in Any Video🚦
👉 XPeng Motors announces Anything in Any Scene: novel #AI for realistic video simulation that seamlessly inserts any object into an existing dynamic video. Strong emphasis on realism: the objects in the bounding boxes are inserted, they don't exist in the original footage. Source Code released 💙
👉Review https://t.ly/UYhl0
👉Code https://lnkd.in/gyi7Dhkn
👉Paper https://lnkd.in/gXyAJ6GZ
👉Project https://lnkd.in/gVA5vduD
🍬 ABS: SOTA collision-free 🍬
👉ABS (Agile But Safe): learning-based control framework for agile and collision-free locomotion of quadrupedal robots. Source Code announced (coming) 💙
👉Review https://t.ly/AYu-Z
👉Paper arxiv.org/pdf/2401.17583.pdf
👉Project agile-but-safe.github.io/
👉Repo github.com/LeCAR-Lab/ABS
🏇 Bootstrapping TAP 🏇
👉#DeepMind shows how large-scale, unlabeled, uncurated real-world data can improve TAP (Tracking Any Point) with minimal architectural changes, via a self-supervised student-teacher setup. Source Code released 💙 Generic student-teacher sketch below.
👉Review https://t.ly/-S_ZL
👉Paper arxiv.org/pdf/2402.00847.pdf
👉Code https://github.com/google-deepmind/tapnet
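👉A generic student-teacher building block (PyTorch) of the kind described, not DeepMind's exact recipe: the teacher is an EMA of the student and supplies pseudo-labels on unlabeled data; models and shapes are placeholders:
```python
# Generic student-teacher building block, NOT DeepMind's exact recipe: the teacher
# is an exponential moving average (EMA) of the student, and the student is trained
# to match the teacher's predictions on unlabeled data. Models/shapes are placeholders.
import copy
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, decay: float = 0.999):
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)

student = torch.nn.Linear(16, 2)          # placeholder point-prediction head
teacher = copy.deepcopy(student).eval()   # frozen EMA copy of the student

optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
features = torch.randn(8, 16)             # placeholder per-point features

with torch.no_grad():
    target = teacher(features)            # pseudo-labels from the teacher
loss = F.huber_loss(student(features), target)

optimizer.zero_grad()
loss.backward()
optimizer.step()
ema_update(teacher, student)              # teacher slowly tracks the student
```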
💥Py4AI 2x Speakers, 2x Tickets💥
✅Doubling the speakers (6 -> 12!)
✅A new track (2 tracks in parallel)
✅A new batch of 100 tickets!
👉 More: https://t.ly/WmVrM
🪵 HASSOD Object Detection 🪵
👉 HASSOD: fully self-supervised object detection and instance segmentation. The new SOTA, able to understand part-to-whole object composition as humans do.
👉Review https://t.ly/66qHF
👉Paper arxiv.org/pdf/2402.03311.pdf
👉Project hassod-neurips23.github.io/
👉Repo github.com/Shengcao-Cao/HASSOD
🌵 G-Splatting Portraits 🌵
👉From monocular, casually captured videos, Rig3DGS rigs 3D Gaussian Splatting to enable the creation of re-animatable portrait videos with control over facial expressions, head pose and viewing direction.
👉Review https://t.ly/fq71w
👉Paper https://arxiv.org/pdf/2402.03723.pdf
👉Project shahrukhathar.github.io/2024/02/05/Rig3DGS.html