AI with Papers - Artificial Intelligence & Deep Learning – Telegram
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT
👔Largest dataset of human-object interactions👔

👉BEHAVE by Google: largest dataset of human-object interactions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
8 subjects, 20 objects, 5 envs.
321 clips captured with 4 Kinect RGB-D cameras
Masks and segmented point clouds
3D SMPL & mesh registration
Textured scan reconstructions

More: https://bit.ly/3Lx6NNo
🦴ENARF-GAN Neural Articulations🦴

👉Unsupervised method for 3D geometry-aware representation of articulated objects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel efficient neural representation
Tri-planes deformation fields for training
Novel GAN for articulated representations
Controllable 3D from real unlabeled pic

More: https://bit.ly/3xYqedN
🖲️ HuMMan: 4D human dataset 🖲️

👉HuMMan: 4D dataset with 1000 humans, 400k sequences & 60M frames 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
RGB, pt-clouds, keypts, SMPL, texture
Mobile device in the sensor suite
500+ actions to cover movements

More: https://bit.ly/3vTRW8Z
🔥Neighborhood Attention Transformer 🔥

👉A novel transformer for both image classification and downstream vision tasks

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Neighborhood Attention (NA)
Neighborhood Attention Transformer, NAT
Faster training/inference, good throughput
Checkpoints, train, #CUDA kernel available

More: https://bit.ly/3F5aVSo
🔥🔥FANs: Fully Attentional Networks🔥🔥

👉#Nvidia unveils the fully attentional networks (FANs)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Efficient fully attentional design
Semantic seg. & object detection
Model/source code soon available!

More: https://bit.ly/3vtpITs
👨🏼‍🎨 Open-Source DALL·E 2 is out 👨🏼‍🎨

👉#Pytorch implementation of DALL-E 2, #OpenAI's latest text-to-image neural net.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
SOTA for text-to-image generation
Source code/model under MIT License
"Medieval painting of wifi not working"

More: https://bit.ly/3vzsff6
ViTPose: Transformer for Pose

👉ViTPose, from the ViTAE team: a ViT for human pose estimation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Plain/nonhierarchical ViT for pose
Deconv-layers after ViT for keypoints
Even the plain baseline sets a new SOTA
Source code & models available soon!

More: https://bit.ly/3MJ0kz1
🧳 Unsupervised HD Motion Transfer 🧳

👉Novel e2e unsupervised motion transfer for image animation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
TPS motion estimation + Dropout
Novel E2E unsupervised motion transfer
Optical flow + multi-res. occlusion mask
Code and models under MIT license

More: https://bit.ly/3MGNPns
🚤 Neural Self-Calibration in the wild 🚤

👉 Learning algorithm to regress calibration params from in-the-wild clips

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Params purely from self-supervision
S.S. depth/pose learning as objective
POV, fisheye, catadioptric: no changes
SOTA results on EuRoC MAV dataset

More: https://bit.ly/3w1n6LB
🦅 ConDor: S.S. Canonicalization 🦅

👉Self-Supervised Canonicalization for full/partial 3D point clouds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
RRC + Stanford + KAIST + Brown
On top of Tensor Field Networks (TFNs)
Unseen 3D -> equivariant canonical
Co-segmentation, NO supervision
Code and model under MIT license

More: https://bit.ly/3MNDyGa
🦀 Event-aided Direct Sparse Odometry 🦀

👉EDS: direct monocular visual odometry using events/frames

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Mono 6-DOF visual odometry + events
Direct photometric bundle adjustment
Camera motion tracking by sparse pixels
A new dataset with HQ events and frames

More: https://bit.ly/3s9FiBN
🫀BlobGAN: Blob-Disentangled Scene🫀

👉Unsupervised, mid-level (blobs) generation of scenes

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Spatial, depth-ordered Gaussian blobs
Reaching for supervised level, and more
Source under BSD-2 "Simplified" License

More: https://bit.ly/3kRyGnj
🦕E2EVE editor via pre-trained artist🦕

👉E2EVE generates a new version of the source image that resembles the "driver" one

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Blending regions by driver image
E2E cond-probability of the edits
S.S. augmenting in target domain
Implemented as SOTA transformer
Code/models available (soon)

More: https://bit.ly/3P9TDYW
🐶 Bringing pets in #metaverse 🐶

👉ARTEMIS: pipeline for generating articulated neural pets for virtual worlds

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ARTiculated, appEarance, Mo-synthesIS
Motion control, animation & rendering
Neural-generated (NGI) animal engine
SOTA animal mocap + neural control

More: https://bit.ly/3LZSLDU
😍Animated hand in 1972, damn romantic😍

👉Q: is #VR the technology that developed least in the last 30 years? 🤔

More: https://bit.ly/3snxNaq
⏏️Ensembling models for GAN training⏏️

👉Pretrained vision models improve GAN training: FID gets better by 1.5 to 2×!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
CV models as ensemble of discriminators
Improves GANs in both limited-data and large-scale settings
10k samples match StyleGAN2 trained w/ 1.6M
Source code / models under MIT license

More: https://bit.ly/3wgUVsr
🤯Cooperative Driving + AUTOCASTSIM🤯

👉COOPERNAUT: cross-vehicle perception for vision-based cooperative driving

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
UTexas + #Stanford + #Sony #AI
LiDAR encoded into compact point-based representations
Network-augmented simulator
Source code and models available

More: https://bit.ly/3sr5HLk
💄NeuralHDHair: 3D Neural Hair💄

👉NeuralHDHair: fully automatic system for modeling HD hair from a single image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
IRHairNet for hair geometric features
GrowingNet: 3D hair strands in parallel
VIFu: novel voxel-aligned implicit function
SOTA in 3D hair modeling from single pic

More: https://bit.ly/38iR0mQ
🐡DyNeRF: Neural 3D Video Synthesis🐡

👉#Meta unveils DyNeRF, a novel approach for rendering HQ 3D video

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel NeRF based on temporal latent codes
Novel training based on hierarchical step
Datasets of time-synch/calibrated clips
License: Attribution-NonCommercial 4.0 Int. (CC BY-NC 4.0)

More: https://bit.ly/3MlBRA9
🍋GATO: agent for multiple tasks🍋

👉The same network with the same weights can play Atari, caption pics, chat, and more🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
General-purpose agent, multiple tasks
Multi-modal-task, multi-embodiment
Inspired by large-scale language models

More: https://bit.ly/3LbBOWb
🪐NeRF powered by keypoints🪐

👉ETHZ + META unveil how to encode relative spatial #3D info via sparse 3D keypoints

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Sparse 3D keypoints for SOTA avatars
Unseen subjects from 2-3 views
Never-before-seen iPhone captures

More: https://bit.ly/39NQqhe