AI with Papers - Artificial Intelligence & Deep Learning – Telegram
15.8K subscribers
146 photos
260 videos
14 files
1.36K links
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT
🎩 SinNeRF: Single Image NeRF 🎩

👉Neural Radiance Field from a single view only

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
UATX + UIUC + UOregon + Picsart AI
"Looking only once" approach
semi-supervised learning process
Geometry/semantic pseudo-labels
SOTA in novel-view synthesis

More: https://bit.ly/3ujMZqF
👍7🔥2👏1🤯1
🔥 Transformer-based Tracking 🔥

👉Tracker via Transformer-based model prediction module

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Tracking by Transformer prediction
Extending model predictor for BBs
SOTA on three public benchmarks
Code/models under GNU License 3.0

More: https://bit.ly/3ucYvUI
🔥9🤯2🥰1
👗 In-The-Wild Virtual Try-On 👗

👉StyleGAN-based architecture for appearance flow estimation in VTON application

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Global appearance flow estimation
Robust to person/garment misalignments
"In-the-wild": person with natural poses
Code under CC BY-NC-SA 4.0 license

More: https://bit.ly/3LPR9wl
👏63🔥1🤔1🤯1
🎇DALL·E 2 just announced!🎇

👉DALL·E 2 to create realistic images and art from natural language

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
More realistic/accurate, 4x res.
Better caption matching
Not available yet, waiting list!

More: https://bit.ly/3j9v3bR
🔥12🤯5👍2🤩1
👋Forecasting interactions via attention👋

👉Predicting the hand motion trajectory and the future contact points on the next active object

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Object-Centric Transformer (OCT)
Self-attention Transformer mechanism
Framework to handle uncertainty
SOTA on Epic-Kitchens and EGTEA

More: https://bit.ly/3v3PpbI
👍4🔥2👏1🤔1🤯1
🍇SmeLU: Smooth Activation Function🍇

👉Google unveils a new smooth activation function: easy to implement, cheap & less error-prone

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Smooth to mitigate irreproducibility
Cheap function, better than GELU/Swish
0-1 slope through quadratic middle region
SmeLU as convolution of ReLU with box
Best reproducibility-accuracy tradeoff

More: https://bit.ly/3xcskXm
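
A minimal sketch of the piecewise form these highlights describe: zero on the left, identity on the right, and a quadratic middle region where the slope rises from 0 to 1. The half-width `beta`, its default value, and all function names are assumptions made for illustration, not the authors' reference code. `relu_box_average` checks the "convolution of ReLU with box" highlight numerically.

```python
def relu(x: float) -> float:
    """Standard ReLU, used only for the box-average check below."""
    return max(0.0, x)


def smelu(x: float, beta: float = 1.0) -> float:
    """Smooth ReLU sketch: 0 left of -beta, x right of +beta, quadratic between."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    # Quadratic middle region; matches value and slope at both ends.
    return (x + beta) ** 2 / (4.0 * beta)


def smelu_slope(x: float, beta: float = 1.0) -> float:
    """Derivative of the sketch above: rises linearly from 0 to 1 on [-beta, beta]."""
    if x <= -beta:
        return 0.0
    if x >= beta:
        return 1.0
    return (x + beta) / (2.0 * beta)


def relu_box_average(x: float, beta: float = 1.0, steps: int = 2000) -> float:
    """Average of ReLU over the window [x - beta, x + beta] (midpoint rule).

    This is ReLU convolved with a unit-area box of width 2 * beta.
    """
    width = 2.0 * beta
    h = width / steps
    total = sum(relu(x - beta + (i + 0.5) * h) for i in range(steps))
    return total * h / width


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, smelu(x), smelu_slope(x), round(relu_box_average(x), 6))
```

The closed form and the box average agree up to numerical integration error, which is one way to see why the function is smooth: averaging ReLU over a window rounds off its kink at zero.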
😱8👍41🔥1😁1🤯1
📍Hyper-Dense Landmarks at 150FPS📍

👉#Microsoft unveils the SOTA in dense landmarking + #3D reconstruction. MAGIC.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Accurately predicts 10× as many landmarks as usual
Synthetic data, perfect annotations
NO appearance, light, diff-rendering
#3D @150+FPS with a single CPU thread
SOTA in monocular 3D reconstruction

More: https://bit.ly/37pQS40
👍6🔥4🤯1
☀️SunStage: Selfie with the Sun☀️

👉Accurate/tailored reconstruction of facial geometry/reflectance

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel personalized scanning
Disentanglement of scene params
Geometry, materials, lighting, poses
Photorealistic with a single selfie video

More: https://bit.ly/36W1Oqx
🔥3👏2🥰1
📫 Generative Neural Avatars 📫

👉3D shapes of people in a variety of garments with corresponding skinning weights

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ETH + Uni-Tübingen + Max Planck
Animatable #3D human in garment
Directly from raw posed 3D scans
NO canonical, registration, manual w.
Geometric detail in clothing deformation

More: https://bit.ly/3M7mCdB
👏3🔥2👍1
🗨️Conversational program synthesis🗨️

👉Conversational synthesis to translate English into executable code

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Conversational program synthesis
New multi-turn programming benchmark
Open Custom library: JAXFORMER
Source code under BSD-3 license

More: https://bit.ly/3jjWWhk
🤯4🥰2🔥1😱1
🧯Long Video Diffusion Models🧯

👉#Google unveils a novel diffusion model for video generation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Straightforward extension of 2D UNet
Longer videos via new conditional generation
SOTA in unconditional generation

More: https://bit.ly/35Y2rzg
🔥4🎉2🤩1
🚙 AutoRF: #3D objects in-the-wild 🚙

👉From #Meta: #3D object from just a single in-the-wild image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Novel view synthesis from in-the-wild
Normalized, object-centric representation
Disentangling shape, appearance & pose
Exploiting BBs & panoptic segmentation
Shape/appearance properties for objects

More: https://bit.ly/3O4ONeQ
🤯7😱2🔥1
🌠GAN-based Darkest Dataset🌠

👉Berkeley + #Intel announce first photorealistic dataset under starlight (no moon, <0.001 lx)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
"Darkest" dataset ever seen
Moonless, no external illumination
GAN-tuned physics-based model
Clips with dancing, volleyball, flags...

More: https://bit.ly/3LXxMkN
👍3🤯2🔥1
🤖Populating with digital humans🤖

👉ETHZ unveils GAMMA to populate the #3D scene with digital humans

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
GenerAtive Motion primitive MArkers
Realistic, controllable, infinite motions
Tree-based search to preserve quality
SOTA in realistic/controllable motion

More: https://bit.ly/3OgY4AG
😱5👍4🔥2👏1🤯1🤩1
🔥#AIwithPapers: we are ~2,000!🔥

💙💛 Simply amazing. Thank you all 💙💛

😈 Invite your friends -> https://news.1rj.ru/str/AI_DeepLearning
18🔥8🥰4👍3
😼GARF: Gaussian Activated NeRF😼

👉GARF: Gaussian Activated R.F. for Hi-Fi reconstruction/pose

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
NeRF from imperfect camera poses
NO hyper-parameter tuning/initialization
Theoretical insight on Gaussian activation
Unlocking NeRF for real-world application?

More: https://bit.ly/36bvdfU
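
A rough sketch of what a "Gaussian activated" coordinate network could look like: a plain MLP whose hidden units use a Gaussian bump instead of ReLU. The `sigma` value, layer sizes, random initialization and every function name here are hypothetical choices for illustration; the paper's actual architecture and training setup are not reproduced.

```python
import math
import random


def gaussian_activation(x: float, sigma: float = 0.1) -> float:
    """Gaussian bump exp(-x^2 / (2 sigma^2)); equals 1 at x = 0, decays to 0."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def dense(inputs, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def make_layer(n_in: int, n_out: int, rng: random.Random):
    """Random weights and biases in [-1, 1] (an arbitrary init for this demo)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def gaussian_mlp(coords, layers, sigma: float = 0.1):
    """Apply Gaussian activations on hidden layers and a linear output layer."""
    h = list(coords)
    for weights, biases in layers[:-1]:
        h = [gaussian_activation(v, sigma) for v in dense(h, weights, biases)]
    weights, biases = layers[-1]
    return dense(h, weights, biases)


if __name__ == "__main__":
    rng = random.Random(0)
    # 3D point in, 4 values out (e.g. colour + density in a NeRF-style model).
    layers = [make_layer(3, 16, rng), make_layer(16, 16, rng), make_layer(16, 4, rng)]
    print(gaussian_mlp([0.1, -0.2, 0.3], layers))
```

The only change from a standard MLP is the activation function, which takes values in (0, 1] and is symmetric around zero.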
👍4🤩21👏1🤯1
🎭Novel pre-training strategy for #AI🎭

👉EPFL unveils the Multi-modal Multi-task Masked Autoencoders (MultiMAE)

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Multimodal: additional modal. over RGB
Multi-task: multiple outputs over RGB
General: MultiMAE by pseudo-labeling
Classification, segmentation, depth
Code under NonCommercial 4.0 Int.

More: https://bit.ly/3jRhNsN
🔥7🤯2👍1👏1
🧪 A new SOTA in Dataset Distillation 🧪

👉A new approach by Matching Training Trajectories is out!

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Distilling data "to match" a bigger dataset
Distilled data to guide a network
Trajectories of experts from real data
SOTA + distilling higher-res visual data

More: https://bit.ly/3JwYOxW
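
One way to read "matching training trajectories" is as a normalized distance in parameter space: train a student on the distilled data from an expert checkpoint, then measure how close it lands to a later expert checkpoint. The sketch below computes only that distance on flat parameter lists. The function names and the exact normalization are assumptions for illustration, and the paper's formulation may differ in detail.

```python
def squared_distance(a, b) -> float:
    """Squared Euclidean distance between two flat parameter vectors."""
    if len(a) != len(b):
        raise ValueError("parameter vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def trajectory_matching_loss(student_end, expert_start, expert_target) -> float:
    """Distance from the student's final parameters to the expert's target,
    divided by how far the expert itself moved from start to target.

    0.0 means the student reached the expert's target exactly;
    1.0 means the student ended no closer than the starting point.
    """
    expert_travel = squared_distance(expert_start, expert_target)
    if expert_travel == 0.0:
        raise ValueError("expert start and target checkpoints are identical")
    return squared_distance(student_end, expert_target) / expert_travel


if __name__ == "__main__":
    start, target = [0.0, 0.0], [1.0, 2.0]
    print(trajectory_matching_loss([1.0, 2.0], start, target))  # student hit the target
    print(trajectory_matching_loss([0.5, 1.0], start, target))  # student got halfway
```

In a full pipeline the distilled images would be optimized by backpropagating such a loss through the student's training steps; that part is left out here.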
👍5🔥1🤯1
🧤 Two-Hand tracking via GCN 🧤

👉The first-ever GCN for two interacting hands in a single RGB image

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Reconstruction by GCN mesh regression
PIFA: pyramid attention for local occlusion
CHA: cross hand attention for interaction
SOTA + generalization to in-the-wild scenarios
Source code available under GNU 🤯

More: https://bit.ly/3KH5FWO
👏10👍4🤯1
🕹️Video K-Net, SOTA in Segmentation🕹️

👉Simple, strong, and unified framework for fully end-to-end video panoptic segmentation

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Learnable kernels from K-Net
K-Net learns to segment & track
Appearance / cross-T kernel interaction
New SOTA without bells and whistles 🤷‍♂️

More: https://bit.ly/3uEEZQR
👍6🔥1🤯1