🥸Imagen: far beyond DALL·E 2🥸
👉#Google: unprecedented photorealism and deep level of language understanding
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic thresholding diffusion sampling
✅Efficient U-Net, efficient++ variant
✅DrawBench, a new text-to-image benchmark
✅The new SOTA, COCO FID of 7.27
More: https://bit.ly/3lVtkbz
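A rough sketch of the dynamic thresholding idea mentioned above, as it is usually described: at each sampling step, pick a high percentile `s` of the absolute pixel values of the predicted image, and if `s` exceeds 1, clip to `[-s, s]` and divide by `s`. The function name, the flat-list input, the nearest-rank percentile and the default of 0.995 are all assumptions made for illustration, not details from the post.

```python
def dynamic_threshold(x0_pred, percentile=0.995):
    """Clip a predicted image to [-s, s] and rescale by s.

    x0_pred: flat list of predicted pixel values (nominally in [-1, 1]).
    percentile: fraction in (0, 1]; 0.995 is an illustrative default.
    """
    abs_vals = sorted(abs(v) for v in x0_pred)
    # Nearest-rank percentile of the absolute values (illustrative choice).
    idx = min(len(abs_vals) - 1, int(percentile * len(abs_vals)))
    # s never drops below 1, so in-range predictions pass through unchanged.
    s = max(1.0, abs_vals[idx])
    return [max(-s, min(s, v)) / s for v in x0_pred]


if __name__ == "__main__":
    print(dynamic_threshold([0.5, -2.0, 4.0, 1.0], percentile=0.5))
```

Unlike static clipping to `[-1, 1]`, the rescaling step pulls saturated pixels back inward rather than just cutting them off.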
🔥9🤯6👍1
🪤Tracking over SOTA detectors🪤
👉Lightweight Python lib for real-time 2D object tracking 💥
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Layer of tracking over SOTA detectors
✅Suitable for complex video processing
✅Source code under BSD 3-Clause
✅Maintained by Tryolabs team
More: https://bit.ly/3wKtGqg
👍7🔥3🤩3
🥷🏿 FCA: #3D Neural Camouflage 🥷🏿
👉#3D full-camouflage adversarial patch to fool neural detectors
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Attack via a differentiable neural renderer
✅E2E physical adversarial attack
✅Envs, vehicles & detectors
✅Source code available!
More: https://bit.ly/38kKyfa
👍5🔥3🤯2👏1
🍋 One-Shot Object Pose 🍋
👉A novel one-shot object pose estimator
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Visual localization pipeline for object pose
✅Handling novel objects without CAD model
✅Novel graph attention for 2D-3D matching
✅Large dataset for one-shot object pose
More: https://bit.ly/3MTogjJ
🔥11❤4👍2🤯2
☄️STEVE: Slot-TransformEr for VidEos☄️
👉STEVE: unsupervised model for object-centric learning in videos
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Adoption of a slot decoder (SLATE)
✅SLATE with slot-level recurrence model
✅Complex and naturalistic videos
✅Significantly outperforms previous SOTA
More: https://bit.ly/3PNxxM3
🔥7👍1🤯1
🦔 CogVideo: insane text-to-clip 🦔
👉CogVideo: 9B-parameter model, the world's first large-scale open-source text-to-video 😵
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Largest open-source T2C transformer
✅Finetuning of text-to-image model
✅Multi-frame-rate hierarchical training
✅From pretrained model CogView2
More: https://bit.ly/3Gzfl4n
🔥9👍6
🦄Time-Aware Neural Voxels🦄
👉TiNeuVox: "NeRF" with time-aware voxel features 😵
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Dynamic scene w/ optimizable structure
✅Temporal information in radiance net
✅Small/large motion w/ single-res of feats
✅192× faster than previous Hyper-NeRF
More: https://bit.ly/3wR4O08
👍11🔥2🤯1
🫐Neural Anomaly Detection by AWS🫐
👉Ultra-competitive inference and SOTA for both detection and localization
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Locally aggregated, mid-level feats patch
✅Maximizing nominal information at test time
✅Reducing biases towards ImageNet classes
✅Image-level anomaly AUROC of up to 99.6%
More: https://bit.ly/3t7Ndjg
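For readers unfamiliar with the AUROC figure quoted above, here is a minimal sketch of how an image-level AUROC can be computed from per-image anomaly scores. It uses the pairwise (Mann-Whitney) definition; the function name and the toy scores are made up for illustration and are not from the paper.

```python
def auroc(scores, labels):
    """Probability that a random anomalous image gets a higher score
    than a random nominal one (ties count as half a win).

    scores: per-image anomaly scores, higher means more anomalous.
    labels: 1 for anomalous images, 0 for nominal ones.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one nominal image")
    # Count correctly ordered (anomalous, nominal) pairs.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for four test images.
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A value of 1.0 means every anomalous image outranks every nominal one; 0.5 is chance level.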
🔥7🤯3👍2
🛹 Project Skate from Google #AI 🛹
👉#AI tool to analyze the skateboarder's tricks in real-time
More: https://bit.ly/3zbQS3M
🔥15🤩3👏1
🧬Neural Text2Human Generation🧬
👉Text-driven neural human generation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Full-body from a given human pose
✅Hierarchical texture-aware codebook
✅DeepFashion -> 44k Hi-Res images
✅Code and models available!
More: https://bit.ly/3Mdnpt0
🔥15👍1
🧨EfficientFormers: 1.6ms inference 🧨
👉Transformers as fast as MobileNet? Snap shows that on #iphone!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Low latency on mobile, high performance!
✅Revisiting the design of ViT through latency
✅New dimension-consistent design paradigm
✅EfficientFormers: a new ViT for mobile!
More: https://bit.ly/3MdgW15
🔥16👍1🤯1
🐢 Transformer-Based Sens-Fusion 🐢
👉Updating TransFuser (CVPR21): image + LiDAR representations with self-attention
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Existing approach can't handle traffic 😢
✅Novel multi-modal fusion transformer
✅The new SOTA in driving performance
✅Reducing avg collisions per KM by 48%
✅Insights on current limitations of E2E
More: https://bit.ly/391dmd6
👍11🔥2
🧘🏻‍♂️YogNet: neural yoga assistant🧘🏻‍♂️
👉Multi-person yoga neural expert for 20 asanas
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CNNs & reg.LSTMs + 3D-CNNs
✅Multi-person asanas in real-time
✅YAR: dataset for yoga & posture
✅1206 videos, 2D RGB camera
More: https://bit.ly/3NncVbE
❤13👍1
🔴 Geogram: geometric algos in C++ 🔴
👉Novel open-source programming library with (research) geometric algorithms in C++
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Geometry Processing from #INRIA
✅30+ papers from SIGGRAPH, etc.
✅Grants: GOODSHAPE & VORPALINE
✅Code (mostly C++) under BSD 3
More: https://bit.ly/3mhS4L7
🔥6👍3❤1
🍏 Open Source Vision from #Apple 🍏
👉CVNets: open-source (not a joke) lib for neural vision.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅PyTorch-based neural lib. for vision
✅Train 2−4× longer w/ augmentations
✅Plug-and-play components for CV
✅Source code under a custom license
More: https://bit.ly/39d1dSj
👍9
🏇🏻Neural Clips by #Nvidia: INSANE 🏇🏻
👉Neural generation with changes in camera viewpoint & content that arises over time 🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel hierarchical generator architecture
✅Temp. receptive field + temporal embed.
✅Multi-res. with super-resolution network
✅SOTA in long clip with motion & changes
✅Code, data & models in August 2022 🏖️
More: https://bit.ly/3zroWsC
🤯9👎2❤1
⚽ Zero to #Messi with #deeplearning ⚽
👉EA unveils a neural system to learn multiple soccer juggling skills 😍
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Learning difficult soccer juggling skills
✅Layer-wise mixture-of-experts architecture
✅Specialization arises naturally
✅Adaptive random walk training strategy
More: https://bit.ly/3mwRaL2
🔥7👍3
🏖️ HumanNeRF: source code is out! 🏖️
👉Pausing the video at any frame and rendering the subject from arbitrary views!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Synthesizing photorealistic humans
✅Synthesizing details, i.e. cloth & face
✅Volumetric canonical T-pose
✅Skeletal rigid/non-rigid decomposition
More: https://bit.ly/3NEkTNY
🤯17🔥5👍2
🎒 EG3D: source code is out! 🎒
👉#Nvidia just opened EG3D: real time multi-view faces w/ HQ #3D geometry!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Tri-plane-based 3D GAN framework
✅Pose-correlated attribute (expression)
✅SOTA in uncond. 3D-aware synthesis
✅Source code & models NOW available!
More: https://bit.ly/3aOfHs0
🔥7🤯6👍4❤2
🔥One Millisecond Backbone. Fire!🔥
👉MobileOne by #Apple: efficient mobile backbone with inference <1 ms on #iPhone12!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅75.9% top-1 accuracy on ImageNet
✅38× faster than MobileFormer net
✅Classification, detection & segmentation
✅Source code & model available soon!
More: https://bit.ly/3tsT7f2
❤24👍2
🧨 Scaling Transformers to GigaPixels!🧨
👉Novel ViT called Hierarchical Image Pyramid Transformer (HIPT) -> Scaling to GigaPixels!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Gigapixel whole-slide imaging (WSI)
✅Leveraging natural hier. structure of WSI
✅Self-supervised Hi-Res representations
✅Source code and models available!
More: https://bit.ly/3xLuzkg
🤯16👍1