AI with Papers - Artificial Intelligence & Deep Learning – Telegram
15.8K subscribers
146 photos
260 videos
14 files
1.36K links
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT
🪨Controllable #3D Adversarial Face🪨

👉#Meta (+CMU) on decoupling identity/expression + granular control over expressions

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Supervised auto-encoder + GAN
UV texture maps + 3D faces
Control expression, preserving identity
Code under X11 License

More: https://bit.ly/3AVE80q
👍6
🥑 DALL·E: Outpainting via #NLP 🥑

👉Extending any original image, creating large-scale images in any aspect ratio

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Extending an image beyond its borders
Visual elements in the same style as the input
Driving the image "story" in new directions
Shadows, reflections & textures w/ context

More: https://bit.ly/3eoH8uD
🔥20🤯71
🌪️ TimeLapse++: Video Temporal Pyramid🌪️

👉Multi-scale lens to view the passage of time: far beyond a "classic" timelapse

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Inspired by "old-school" spatial pyramids
Video Spectrogram to go through pyramid
Months/years of data in a few seconds!
Multi-temporal freq., no aliasing

More: https://bit.ly/3TKnYPS
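
To make the "no aliasing" point concrete, here is a minimal sketch of a temporal pyramid built the way a classic spatial Gaussian pyramid is: low-pass filter along time, then keep every second sample. This is only an analogy under stated assumptions, not the paper's actual method; the function names `blur_time` and `temporal_pyramid`, the 5-tap binomial kernel, and the single-pixel intensity series are all hypothetical choices for illustration.

```python
def blur_time(signal):
    """Low-pass filter a time series with a 5-tap binomial kernel (assumed choice)."""
    kernel = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - 2, 0), n - 1)  # clamp indices at both ends
            acc += w * signal[j]
        out.append(acc)
    return out


def temporal_pyramid(signal, levels):
    """Return [level 0, level 1, ...]; each level halves the temporal rate."""
    pyramid = [list(signal)]
    for _ in range(levels):
        if len(pyramid[-1]) < 2:
            break
        smoothed = blur_time(pyramid[-1])  # blur first so fast flicker is removed
        pyramid.append(smoothed[::2])      # then subsample by a factor of 2
    return pyramid


# Hypothetical input: one pixel that flickers 0, 1, 0, 1, ... (fastest possible change).
flicker = [float(i % 2) for i in range(64)]
pyr = temporal_pyramid(flicker, levels=3)
naive = flicker[::2]  # frame skipping without blurring, as in a plain timelapse

print([len(level) for level in pyr])
print(pyr[1][1:4], naive[:3])
```

Plain frame skipping keeps only the dark frames here, so the pixel looks constantly black; blurring before subsampling gives the average brightness instead.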
🤯6👍21
🫐 Stable Diffusion Video is out! 🫐

👉A free notebook to generate videos by interpolating the latent space of SD.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Blueberry to strawberry spaghetti
Dream items from same prompt
Morph different prompts (seeds)
Built on a script by A. Karpathy

More: https://bit.ly/3ey8632
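
A rough idea of what "interpolating the latent space" can look like: one common way to move between two Gaussian noise latents is spherical linear interpolation (slerp), which keeps intermediate vectors at a similar norm to the endpoints. Whether the notebook does exactly this is an assumption; the names `slerp` and `latent_walk`, the latent size, and the threshold are hypothetical, and the decoding step through Stable Diffusion is left out entirely.

```python
import math
import random


def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical linear interpolation between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v0, v1))
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    cos_theta = dot / (norm0 * norm1)
    if abs(cos_theta) > dot_threshold:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def latent_walk(z_start, z_end, num_frames):
    """Return num_frames (>= 2) latents going from z_start to z_end."""
    steps = [i / (num_frames - 1) for i in range(num_frames)]
    return [slerp(t, z_start, z_end) for t in steps]


# Hypothetical stand-ins for two noise latents (a real SD latent is much larger).
random.seed(0)
dim = 256
z_a = [random.gauss(0.0, 1.0) for _ in range(dim)]
z_b = [random.gauss(0.0, 1.0) for _ in range(dim)]
frames = latent_walk(z_a, z_b, num_frames=8)
```

Each entry of `frames` would then be passed to the diffusion sampler as the starting noise for one video frame; morphing between prompts would interpolate the text embeddings in the same way.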
🤯15👍1
🦎 VMT: Video Mask Transfiner 🦎

👉Novel highly efficient ViT structure for video instance segmentation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
HD & more temporally stable mask
Higher resolution features for VIS
Detecting error-prone spatio-temporal regions
Auto-refinement on training data!

More: https://bit.ly/3RKXtb4
🤯91
🤯 #StableDiffusion + #Dallemini = BOOM! 🤯

👉A #colab notebook that combines Stable Diffusion + DALL-E Mini (Craiyon)

More: https://bit.ly/3TTOshR
🔥9👏5😢1
🐠VIS - Deformable Transformers 🐠

👉DeVIS: a VIS method with the efficiency and performance of deformable ViT

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Temporal multi-scale deformable attention
Instance-aware object queries
Mask: deformable attention + multi-scale feature maps
Improved multi-cue clip tracking
SOTA on YouTube-VIS 2021/OVIS

More: https://bit.ly/3TQv1Xc
🔥81👍1
🌈 X-NeRF: Cross-Spectral NeRF 🌈

👉Cross-Spectral NeRF from cameras with different light-spectrum sensitivities

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
First ever cross-spectral NeRF
Avoiding non-trivial calib/match
Normalized Cross-Device Coords
Novel dataset w/ RGB, MS, & IR

More: https://bit.ly/3RqHnUo
👍7
👹TT-GNeRF: generative NeRF for Faces👹

👉TT-GNeRF: a novel 3D-aware GAN based on generative NeRF for faces

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
ETH + Uni_Trento + #Snap 🤯
DAEM for disentanglement of 3D model
"Training-as-Init, Optimizing-for-Tuning"
Consistency++, preserving non-target ROI
Unsupervised optimization of geometry

More: https://bit.ly/3ARZmMw
🔥41👍1
🎪 SOTA in Arbitrary Shape Text Detection 🎪

👉Novel unified coarse-to-fine Transformer for arbitrary shape text detection

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Coarse-to-fine arbitrary text detection
Accurate text detection, NO post-process
Boundary proposal generation mechanism
Innovative boundary transformer (iterative)
Boundary energy loss (BEL) for refinement

More: https://bit.ly/3D6Ryt4
8👍2😢1
🐲 Open-Source Self-Driving projects 🐲

👉A free repo with many autonomous vehicle-related projects

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Basic/Advanced Lane/Line Detection
Driving behavior by training & validating
Autopilot: predicting steering angle

More: https://bit.ly/3qqJ7RB
🔥22👍1
🥤K-VIL: Keypoint-based visual imitation🥤

👉K-VIL: auto-incremental extraction of object-centric task representation.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Efficient task-relevant keypoints
Embodiment-independent tasks
Adaptation of tasks to new scenes
Input: only a small set of demo clips
Novel keypoint-based controller

More: https://bit.ly/3eIrxpP
🔥7👍1
💜 #Selfdriving in the '80s. Damn Romantic 💜

👉The first self-driving car with people on board, 1986. So slow and lovely.

More: https://bit.ly/3BtRDon
9👏4👍3
🏵️ TORAS: SOTA #AI for annotation 🏵️

👉TORAS: web-based AI-powered, cooperative, annotation platform.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
SOTA AI tools -> significant speedup
"Recipes" to define how to annotate
Repo with folder structure for storage
Also on-prem for (commercial) firms

More: https://bit.ly/3L78YI2
🔥9🤯2👍1
💮MAXIM: Multi-Axis MLP for Vision💮

👉#Google open-sources MAXIM, a multi-axis MLP for low-level vision

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Denoising, deblurring, dehazing, etc
Multi-axis gated MLP, linear complexity
Cross gating block, separate features
SOTA results on several datasets!

More: https://bit.ly/3Dmp8LI
🔥121👎1
🔥 A Survey on Diffusion Models 🔥

👉A comprehensive review of denoising diffusion models in #computervision 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Overview on diffusion models
Hot trend in generative AI
A multi-perspective categorization
Current limitations / new directions

More: https://bit.ly/3RYG5zP
5👍3🔥1
🉐#AI finds where IG photos are taken🉐

👉Brilliant work by Depoorter, a Belgian artist who works on #privacy, #AI & #socialmedia

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Recorded open cameras for weeks
Scraped all #Instagram photos
Matching Instagram vs. footage

More: https://bit.ly/3eL5dfc
😱18👍13🥰2
🈯SAMURAI: in-the-wild Shape/Material🈯

👉#Google SAMURAI: shape, BRDF, per-image pose & illumination. Relightable #3D assets for #AR/#VR.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Parametrization for varying distances
Camera multiplex optimization
Posterior scaling of input images
Explicit mesh extraction with BRDF
Code/data soon available ->#NeurIPS

More: https://bit.ly/3BKWgf3
👍8🔥1
🟨 Lang<->Pics in 100+ Languages 🟨

👉#Google PaLI: unified lang-image #AI to perform tasks in 109 languages 🤯

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
PaLI: Pathways Lang & Image model
Answering, captioning, reasoning, etc
From Eng. to 109 lang. understanding
The new SOTA on several datasets

More: https://bit.ly/3QMslHC
🔥6👍1💯1
🍐PeRFception: Largest Implicit-Representation Dataset🍐

👉#Nvidia, a new frontier in data collection via Plenoxels: same info, -96.4% in size.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
POSTECH + NVIDIA + Caltech = 🤯
Size: -96.4% from original dataset!
2D/3D image/object class/semantic
Ready-to-use pipeline for implicit dataset

More: https://bit.ly/3eW9hJA
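
For a sense of scale, the "-96.4% in size" figure can be turned into a compression factor with a line of arithmetic. The helper name `compression_ratio` and the 1000 GB starting size below are hypothetical, used only to illustrate the percentage; they are not figures from the paper.

```python
def compression_ratio(size_reduction_percent):
    """Convert a size reduction in percent into an 'N times smaller' factor."""
    remaining_fraction = 1.0 - size_reduction_percent / 100.0
    return 1.0 / remaining_fraction


ratio = compression_ratio(96.4)
print(f"{ratio:.1f}x smaller")  # prints "27.8x smaller"

# Hypothetical example: a 1000 GB source dataset shrunk by 96.4%.
original_gb = 1000.0
compressed_gb = original_gb / ratio
print(f"{compressed_gb:.0f} GB")  # prints "36 GB"
```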
9❤‍🔥1👍1😍1