🔥 Depth Anything: new SOTA 🔥
👉Depth Anything: the new SOTA in monocular depth estimation (MDE), trained jointly on 1.5M labeled images and 62M+ unlabeled images.
👉Review https://t.ly/tCBwO
👉Paper https://lnkd.in/djx-9k2J
👉Project https://lnkd.in/dYetqZFa
👉Repo https://lnkd.in/d87CrUGv
👉Demo🤗 https://lnkd.in/dJhvKBep
🎭 ULTRA-Realistic Avatar 🎭
👉Novel 3D avatar with enhanced geometric fidelity and superior-quality physically based rendering (PBR) textures, free of unwanted lighting.
👉Review https://t.ly/B3BEu
👉Project https://lnkd.in/dkUQHFEV
👉Paper https://lnkd.in/dtEQxrBu
👉Code coming 🩷
🔥Lumiere: SOTA video-gen🔥
👉#Google unveils Lumiere: a Space-Time Diffusion Model for Realistic Video Generation. It's the new SOTA across tasks: Text-to-Video, Video Stylization, Cinemagraphs & Video Inpainting.
👉Review https://t.ly/nalJR
👉Paper https://lnkd.in/d-PvrGjT
👉Project https://t.ly/gK8hz
🧪 SUPIR: SOTA restoration 🧪
👉SUPIR is the new SOTA in image restoration: it restores blurry objects, recovers their material textures, and adjusts the restoration based on high-level semantics.
👉Review https://t.ly/wgObH
👉Project https://supir.xpixel.group/
👉Paper https://lnkd.in/dZPYcUuq
👉Demo coming 🩷 but no code announced :(
🫧 SAM + Open Models 🫧
👉Grounded SAM combines Grounding DINO as an open-set detector with SAM. It can seamlessly integrate with other open-world models to accomplish more intricate visual tasks.
👉Review https://t.ly/FwasQ
👉Paper arxiv.org/pdf/2401.14159.pdf
👉Code github.com/IDEA-Research/Grounded-Segment-Anything
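The two-stage pattern above — a text-prompted open-set detector proposing boxes that are handed to SAM as box prompts — can be sketched as follows. Note that `detect_boxes` and `segment_with_box` are hypothetical stand-ins for the real Grounding DINO / SAM models, kept only to show the data flow:

```python
import numpy as np

# Hypothetical stand-ins: the real Grounding DINO maps (image, text) to
# boxes, and the real SAM turns a box prompt into a mask. Shapes mimic
# the actual pipeline; the internals here are purely illustrative.
def detect_boxes(image, text_prompt):
    """Pretend open-set detector: returns (x0, y0, x1, y1) boxes for the prompt."""
    return [(2, 2, 6, 6)]  # one dummy detection

def segment_with_box(image, box):
    """Pretend SAM: returns a binary mask for the given box prompt."""
    x0, y0, x1, y1 = box
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = True
    return mask

def grounded_segment(image, text_prompt):
    """Grounded-SAM pattern: text -> open-set boxes -> per-box SAM masks."""
    boxes = detect_boxes(image, text_prompt)
    return [segment_with_box(image, b) for b in boxes]

image = np.zeros((8, 8, 3))
masks = grounded_segment(image, "a cat")
```

The point of the design is that segmentation stays promptable: any model that emits boxes (or points) can drive SAM without retraining either component.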
👢"Virtual Try-All" by #Amazon 👢
👉#Amazon announces "Diffuse to Choose": diffusion-based, image-conditioned inpainting for VTON (virtual try-on). Virtually place any e-commerce item in any setting.
👉Review https://t.ly/at07Y
👉Paper https://lnkd.in/dxR7nGtd
👉Project diffuse2choose.github.io/
🦩 WildRGB-D: Objects in the Wild 🦩
👉#NVIDIA unveils a novel RGB-D object dataset captured in the wild: ~8,500 recorded objects, ~20,000 RGB-D videos, 46 categories with corresponding masks and 3D point clouds.
👉Review https://t.ly/WCqVz
👉Data github.com/wildrgbd/wildrgbd
👉Paper arxiv.org/pdf/2401.12592.pdf
👉Project wildrgbd.github.io/
🌋EasyVolcap: Accelerating Neural Volumetric🌋
👉Novel #PyTorch library for accelerating neural volumetric video: capturing, reconstruction & rendering
👉Review https://t.ly/8BISl
👉Paper arxiv.org/pdf/2312.06575.pdf
👉Code github.com/zju3dv/EasyVolcap
🐙 Rock-Track announced! 🐙
👉Rock-Track: the evolution of Poly-MOT, the previous SOTA tracking-by-detection framework for 3D MOT.
👉Review https://t.ly/hC0ak
👉Repo, coming: https://lnkd.in/dtDkPwCC
👉Paper coming
🧠350+ Free #AI Courses by #Google🧠
👉350+ free courses from #Google to become a professional in #AI & #Cloud. The full catalog (900+) includes a variety of activities: videos, documents, labs, coding, and quizzes. 15+ supported languages. No excuse.
✅𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈
✅𝐈𝐧𝐭𝐫𝐨 𝐭𝐨 𝐋𝐋𝐌𝐬
✅𝐂𝐕 𝐰𝐢𝐭𝐡 𝐓𝐅
✅𝐃𝐚𝐭𝐚, 𝐌𝐋, 𝐀𝐈
✅𝐑𝐞𝐬𝐩𝐨𝐧𝐬𝐢𝐛𝐥𝐞 𝐀𝐈
👉Review: https://t.ly/517Dr
👉Full list: https://www.cloudskillsboost.google/catalog?page=1
🍋 Diffutoon: new SOTA video 🍋
👉Diffutoon is a cartoon shading approach that transforms photorealistic videos into anime style. It can handle exceptionally high resolutions and rapid motion. Source code released!
👉Review https://t.ly/sim2O
👉Paper https://lnkd.in/dPcSnAUu
👉Code https://lnkd.in/d9B_dGrf
👉Project https://lnkd.in/dpcsJcX2
🥓 RANSAC -> PARSAC (neural) 🥓
👉Neural PARSAC: estimating multiple vanishing points (V), fundamental matrices (F) or homographies (H) at the speed of light! Source Code released 💙
👉Review https://t.ly/r9ngg
👉Paper https://lnkd.in/dadQ4Qec
👉Code https://lnkd.in/dYp6gADd
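For context on what PARSAC speeds up: classic RANSAC fits a model by repeatedly sampling minimal sets and keeping the hypothesis with the most inliers — the loop a neural sampler replaces. A minimal, self-contained line-fitting sketch on toy data (not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def ransac_line(points, iters=200, thresh=0.1):
    """Classic RANSAC: fit y = a*x + b robustly against outliers."""
    best_inliers, best_model = 0, None
    n = len(points)
    for _ in range(iters):
        # Minimal sample: two distinct points define a candidate line.
        i, j = rng.choice(n, size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # vertical candidate, skip in this parametrization
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Score: count points within `thresh` vertical distance of the line.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = int((residuals < thresh).sum())
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (a, b)
    return best_model, best_inliers

# 80 near-perfect points on y = 2x + 1, plus 20 uniform outliers.
xs = rng.uniform(0, 10, 80)
inlier_pts = np.stack([xs, 2 * xs + 1 + rng.normal(0, 0.02, 80)], axis=1)
outlier_pts = rng.uniform(0, 10, (20, 2))
pts = np.concatenate([inlier_pts, outlier_pts])

(a, b), n_in = ransac_line(pts)
```

PARSAC's pitch is removing this sequential sample-and-score loop: a network predicts sample weights and instance assignments so many models (V/F/H) can be estimated in parallel.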
↘️ SEELE: "moving" the subjects ➡️
👉Subject repositioning: manipulating an input image to reposition one of its subjects to a desired location while preserving the image's fidelity. SEELE is a single diffusion model that addresses this novel generative sub-task.
👉Review https://t.ly/4FS4H
👉Paper arxiv.org/pdf/2401.16861.pdf
👉Project yikai-wang.github.io/seele/
🎉 ADΔER: Event-Camera Suite 🎉
👉ADΔER: a novel, unified framework for event-based video: encoder/transcoder/decoder for ADΔER (Address, Decimation, Δt Event Representation) video streams. Source code (Rust) released 💙
👉Review https://t.ly/w5_KC
👉Paper arxiv.org/pdf/2401.17151.pdf
👉Repo github.com/ac-freeman/adder-codec-rs
🚦(add) Anything in Any Video🚦
👉 XPeng Motors announced Anything in Any Scene: novel #AI for realistic video simulation that seamlessly inserts any object into an existing dynamic video. Strong emphasis on realism: the objects in the bounding boxes don't exist. Source Code released 💙
👉Review https://t.ly/UYhl0
👉Code https://lnkd.in/gyi7Dhkn
👉Paper https://lnkd.in/gXyAJ6GZ
👉Project https://lnkd.in/gVA5vduD
🍬 ABS: SOTA collision-free 🍬
👉ABS (Agile But Safe): a learning-based control framework for agile, collision-free locomotion on quadrupedal robots. Source Code announced (coming) 💙
👉Review https://t.ly/AYu-Z
👉Paper arxiv.org/pdf/2401.17583.pdf
👉Project agile-but-safe.github.io/
👉Repo github.com/LeCAR-Lab/ABS
🏇 Bootstrapping TAP 🏇
👉#Deepmind shows how large-scale, unlabeled, uncurated real-world data can improve TAP (Tracking Any Point) with minimal architectural changes, via a self-supervised student-teacher setup. Source Code released 💙
👉Review https://t.ly/-S_ZL
👉Paper arxiv.org/pdf/2402.00847.pdf
👉Code https://github.com/google-deepmind/tapnet
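The student-teacher idea — a slowly updated (EMA) teacher pseudo-labeling unlabeled data for a fast-learning student — can be illustrated on a toy regression problem. This is a generic sketch of the setup, not #Deepmind's actual TAP training recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "models": each is just a weight vector for y = w . x.
true_w = np.array([1.5, -0.5])   # ground truth, used for the rare labeled samples
student_w = np.zeros(2)
teacher_w = np.zeros(2)
ema_decay = 0.99
lr = 0.1

for step in range(5000):
    x = rng.normal(size=2)
    if step % 10 == 0:
        y = true_w @ x           # occasional labeled example
    else:
        y = teacher_w @ x        # pseudo-label from the slow teacher
    # SGD step on the student's squared error against the (pseudo-)label.
    student_w -= lr * 2 * (student_w @ x - y) * x
    # The teacher tracks the student via an exponential moving average.
    teacher_w = ema_decay * teacher_w + (1 - ema_decay) * student_w
```

Even with only 10% labeled data, both networks drift toward the ground truth: the labeled samples anchor the student, and the EMA teacher smooths its noisy trajectory into stable pseudo-labels.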
💥Py4AI 2x Speakers, 2x Tickets💥
✅Doubling the speakers (6 -> 12!)
✅A new track (2 tracks in parallel)
✅A new batch of 100 tickets!
👉 More: https://t.ly/WmVrM
🪵 HASSOD Object Detection 🪵
👉 HASSOD: fully self-supervised detection and instance segmentation. The new SOTA, able to understand part-to-whole object composition the way humans do.
👉Review https://t.ly/66qHF
👉Paper arxiv.org/pdf/2402.03311.pdf
👉Project hassod-neurips23.github.io/
👉Repo github.com/Shengcao-Cao/HASSOD
🌵 G-Splatting Portraits 🌵
👉From monocular, casual video captures, Rig3DGS rigs 3D Gaussian Splatting to create re-animatable portrait videos with control over facial expressions, head pose and viewing direction.
👉Review https://t.ly/fq71w
👉Paper https://arxiv.org/pdf/2402.03723.pdf
👉Project shahrukhathar.github.io/2024/02/05/Rig3DGS.html
🌆 Up to 69x Faster SAM 🌆
👉EfficientViT-SAM is a new family of accelerated Segment Anything Models. It keeps SAM's lightweight prompt encoder and mask decoder while replacing the heavy image encoder with EfficientViT. Up to 69x faster; source code released. Authors: Tsinghua, MIT & #Nvidia
👉Review https://t.ly/zGiE9
👉Paper arxiv.org/pdf/2402.05008.pdf
👉Code github.com/mit-han-lab/efficientvit