💀 OSSO: Skeletal Shape from Outside 💀
👉Anatomic skeleton of a person from 3D surface of body 🦴
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Max Planck + IMATI-CNR + INRIA
✅DXA images to obtain #3D shape
✅External body to internal skeleton
More: https://bit.ly/3v7Z5TQ
👍4🤯2🔥1😱1
🎷 Pix2Seq: object detection by #Google 🎷
👉A novel framework to perform object detection as a language modeling task
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Obj. detection as a lang-modeling task
✅BBs/labels -> seq. of discrete tokens
✅Encoder-decoder (one token at a time)
✅Code under Apache License 2.0
More: https://bit.ly/3F49PX3
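The "BBs/labels -> seq. of discrete tokens" idea can be sketched roughly as below. This is a minimal illustration, not the Pix2Seq code: the helper names, the `[ymin, xmin, ymax, xmax, class]` ordering, the bin count, and the choice to place class tokens right after the coordinate bins are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax) in pixels


def quantize(coord: float, size: float, n_bins: int) -> int:
    """Map a pixel coordinate in [0, size] to an integer bin in [0, n_bins - 1]."""
    ratio = min(max(coord / size, 0.0), 1.0)  # clamp to the image
    return int(round(ratio * (n_bins - 1)))


def boxes_to_tokens(
    boxes: Sequence[Box],
    labels: Sequence[int],
    img_h: float,
    img_w: float,
    n_bins: int = 1000,
) -> List[int]:
    """Flatten boxes and class ids into one sequence of discrete tokens.

    Assumed (hypothetical) vocabulary layout: ids 0..n_bins-1 are coordinate
    bins, and class id c is stored as token n_bins + c.
    """
    tokens: List[int] = []
    for (ymin, xmin, ymax, xmax), label in zip(boxes, labels):
        tokens += [
            quantize(ymin, img_h, n_bins),
            quantize(xmin, img_w, n_bins),
            quantize(ymax, img_h, n_bins),
            quantize(xmax, img_w, n_bins),
            n_bins + label,
        ]
    return tokens


if __name__ == "__main__":
    # One box covering a whole 100x200 image, class id 3, with 11 bins.
    print(boxes_to_tokens([(0, 0, 100, 200)], [3], img_h=100, img_w=200, n_bins=11))
```

A decoder trained on such sequences would then emit one token at a time, and the boxes are recovered by inverting the quantization.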
👍8🤯3🔥1😱1🎉1🤩1
🌹 Generalizable Neural Performer 🌹
👉General neural framework to synthesize free-viewpoint images of arbitrary human performers
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Free-viewpoint synthesis of humans
✅Implicit Geometric Body Embedding
✅Screen-Space Occlusion-Aware Blending
✅GeneBody: 4M frames, multi-view cams
More: https://cutt.ly/SGcnQzn
👍5🔥1🤯1
🚌 Tire-defect inspection 🚌
👉Unsupervised detection of defects in tires using neural networks
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Impurity, same material as tire
✅Impurity, with different material
✅Damage by temp/pressure
✅Crack or etched material
More: https://bit.ly/37GX1JT
❤5👍3🤩1
🧋#4D Neural Fields🧋
👉4D N.F. visual representations from monocular RGB-D 🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅4D scene completion (occlusions)
✅Scene completion in cluttered scenes
✅Novel #AI for contextual point clouds
✅Data, code, models under MIT license
More: https://cutt.ly/6GveKiJ
👍6🤯2🔥1🥰1
👔Largest dataset of human-object interactions 👔
👉BEHAVE by Google: largest dataset of human-object interactions
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅8 subjects, 20 objects, 5 envs.
✅321 clips with 4 Kinect RGB-D
✅Masks and segmented point clouds
✅3D SMPL & mesh registration
✅Textured scan reconstructions
More: https://bit.ly/3Lx6NNo
👏5👍4🔥2❤1😱1🤩1
🦴ENARF-GAN Neural Articulations🦴
👉Unsupervised method for 3D geometry-aware representation of articulated objects
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Novel efficient neural representation
✅Tri-planes deformation fields for training
✅Novel GAN for articulated representations
✅Controllable 3D from real unlabeled pic
More: https://bit.ly/3xYqedN
🤯3👍2❤1🔥1🥰1
🖲️ HuMMan: 4D human dataset 🖲️
👉HuMMan: 4D dataset with 1000 humans, 400k sequences & 60M frames 🤯
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RGB, pt-clouds, keypts, SMPL, texture
✅Mobile device in the sensor suite
✅500+ actions to cover movements
More: https://bit.ly/3vTRW8Z
🥰2😱2👍1🤯1
🔥Neighborhood Attention Transformer 🔥
👉A novel transformer for both image classification and downstream vision tasks
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Neighborhood Attention (NA)
✅Neighborhood Attention Transformer, NAT
✅Faster training/inference, good throughput
✅Checkpoints, train, #CUDA kernel available
More: https://bit.ly/3F5aVSo
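A rough 1D sketch of what "Neighborhood Attention" means: each query attends only to a fixed-size window of nearby tokens instead of the whole sequence. This is a plain-Python toy, not the released #CUDA kernel; the function name is hypothetical, and the border handling (shifting the window so it stays inside the sequence) is an assumption of this example.

```python
import math
from typing import List

Vector = List[float]


def neighborhood_attention_1d(
    q: List[Vector], k: List[Vector], v: List[Vector], window: int
) -> List[Vector]:
    """Single-head attention where token i only sees `window` nearby tokens."""
    n = len(q)
    assert 0 < window <= n, "window must fit inside the sequence"
    scale = 1.0 / math.sqrt(len(q[0]))
    out: List[Vector] = []
    for i in range(n):
        # Center the window on i, then shift it so it stays in bounds
        # (assumption: every token keeps exactly `window` neighbors).
        start = min(max(i - window // 2, 0), n - window)
        idx = range(start, start + window)
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in idx]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * v[j][d] for w, j in zip(weights, idx))
             for d in range(len(v[0]))]
        )
    return out


if __name__ == "__main__":
    zeros = [[0.0] for _ in range(5)]           # uniform attention weights
    values = [[float(i)] for i in range(5)]
    print(neighborhood_attention_1d(zeros, zeros, values, window=3))
```

With all-zero queries the weights are uniform, so each output is just the mean of the values in that token's window. Cost grows with sequence length times window size rather than quadratically.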
🤯4👍3🔥1😱1
🔥🔥FANs: Fully Attentional Networks🔥🔥
👉#Nvidia unveils the fully attentional networks (FANs)
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Efficient fully attentional design
✅Semantic seg. & object detection
✅Model/source code soon available!
More: https://bit.ly/3vtpITs
🔥7🤯3👍2❤1
👨🏼🎨 Open-Source DALL·E 2 is out 👨🏼🎨
👉#PyTorch implementation of DALL·E 2, #OpenAI's latest text-to-image neural net.
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅SOTA for text-to-image generation
✅Source code/model under MIT License
✅"Medieval painting of wifi not working"
More: https://bit.ly/3vzsff6
🤯14👍6😁1
⛺ViTPose: Transformer for Pose⛺
👉ViTPose from ViTAE, ViT for human pose
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Plain/nonhierarchical ViT for pose
✅Deconv-layers after ViT for keypoints
✅Just the baseline is the new SOTA
✅Source code & models available soon!
More: https://bit.ly/3MJ0kz1
👍5🤯4🔥1🥰1
🧳 Unsupervised HD Motion Transfer 🧳
👉Novel e2e unsupervised motion transfer for image animation
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅TPS motion estimation + Dropout
✅Novel E2E unsupervised motion transfer
✅Optical flow + multi-res. occlusion mask
✅Code and models under MIT license
More: https://bit.ly/3MGNPns
🔥8👍6🤯4❤2😱2
🚤 Neural Self-Calibration in the wild 🚤
👉 Learning algorithm to regress calibration params from in-the-wild clips
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Params purely from self-supervision
✅S.S. depth/pose learning as objective
✅POV, fisheye, catadioptric: no changes
✅SOTA results on EuRoC MAV dataset
More: https://bit.ly/3w1n6LB
👍8🤩2🔥1🥰1🤯1
🦅 ConDor: S.S. Canonicalization 🦅
👉Self-Supervised Canonicalization for full/partial 3D point clouds
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅RRC + Stanford + KAIST + Brown
✅On top of Tensor Field Networks (TFNs)
✅Unseen 3D -> equivariant canonical
✅Co-segmentation, NO supervision
✅Code and model under MIT license
More: https://bit.ly/3MNDyGa
🔥4👍1🤩1
🦀 Event-aided Direct Sparse Odometry 🦀
👉EDS: direct monocular visual odometry using events/frames
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Mono 6-DOF visual odometry + events
✅Direct photometric bundle adjustment
✅Camera motion tracking by sparse pixels
✅A new dataset with HQ events and frames
More: https://bit.ly/3s9FiBN
🔥5👍3🤯1😱1
🫀BlobGAN: Blob-Disentangled Scene🫀
👉Unsupervised, mid-level (blobs) generation of scenes
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Spatial, depth-ordered Gaussian blobs
✅Reaching for supervised level, and more
✅Source under BSD-2 "Simplified" License
More: https://bit.ly/3kRyGnj
🔥8👍1🥰1🤯1😱1
🦕E2EVE editor via pre-trained artist🦕
👉E2EVE generates a new version of the source image that resembles the "driver" one
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅Blending regions by driver image
✅E2E cond-probability of the edits
✅S.S. augmenting in target domain
✅Implemented as SOTA transformer
✅Code/models available (soon)
More: https://bit.ly/3P9TDYW
🤯5👍2🤩2❤1🔥1
🐶 Bringing pets into the #metaverse 🐶
👉ARTEMIS: pipeline for generating articulated neural pets for virtual worlds
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅ARTiculated, appEarance, Mo-synthesIS
✅Motion control, animation & rendering
✅Neural-generated (NGI) animal engine
✅SOTA animal mocap + neural control
More: https://bit.ly/3LZSLDU
❤4👍2🥰2🤩1
😍Animated hand in 1972, damn romantic😍
👉Q: is #VR the technology that developed least in the last 30 years? 🤔
More: https://bit.ly/3snxNaq
👍7❤3🤯1
⏏️Ensembling models for GAN training⏏️
👉Pretrained vision models to improve GAN training. FID improves by 1.5 to 2×!
𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
✅CV models as ensemble of discriminators
✅Improving GAN in limited / large-scale set
✅10k samples match StyleGAN2 w/ 1.6M
✅Source code / models under MIT license
More: https://bit.ly/3wgUVsr
🤯6🔥2