AI with Papers - Artificial Intelligence & Deep Learning
15.8K subscribers
148 photos
260 videos
14 files
1.37K links
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT
🪵 HASSOD Object Detection 🪵

👉 HASSOD: fully self-supervised object detection and instance segmentation. The new SOTA, able to understand part-to-whole object composition the way humans do.

👉Review https://t.ly/66qHF
👉Paper arxiv.org/pdf/2402.03311.pdf
👉Project hassod-neurips23.github.io/
👉Repo github.com/Shengcao-Cao/HASSOD
🌵 G-Splatting Portraits 🌵

👉From monocular/casual video captures, Rig3DGS rigs 3D Gaussian Splatting to enable the creation of re-animatable portrait videos with control over facial expressions, head pose and viewing direction.

👉Review https://t.ly/fq71w
👉Paper https://arxiv.org/pdf/2402.03723.pdf
👉Project shahrukhathar.github.io/2024/02/05/Rig3DGS.html
🌆 Up to 69x Faster SAM 🌆

👉EfficientViT-SAM is a new family of accelerated Segment Anything Models. It keeps SAM's lightweight prompt encoder and mask decoder while replacing the heavy image encoder with EfficientViT. Up to 69x faster; source code released. Authors: Tsinghua, MIT & #Nvidia

👉Review https://t.ly/zGiE9
👉Paper arxiv.org/pdf/2402.05008.pdf
👉Code github.com/mit-han-lab/efficientvit
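The encoder-swap idea above can be sketched in a few lines. This is a conceptual toy, not the official API; all class names here are hypothetical placeholders:

```python
# Conceptual sketch of EfficientViT-SAM's design: keep SAM's prompt
# encoder and mask decoder, swap only the heavy image encoder.
# All classes are illustrative stand-ins, not the released code.

class HeavyViTEncoder:
    def encode(self, image):
        # stand-in for SAM's ViT-H image encoder (the expensive part)
        return [sum(image)]  # toy "embedding"

class EfficientViTEncoder:
    def encode(self, image):
        # stand-in for the lightweight EfficientViT backbone
        return [sum(image)]

class SegmentAnything:
    def __init__(self, image_encoder):
        self.image_encoder = image_encoder  # the only swapped component

    def predict(self, image, prompt):
        embedding = self.image_encoder.encode(image)
        # prompt encoder + mask decoder stay unchanged in EfficientViT-SAM
        return [e + prompt for e in embedding]

slow = SegmentAnything(HeavyViTEncoder())
fast = SegmentAnything(EfficientViTEncoder())
image, prompt = [1, 2, 3], 10
# same interface and output format, different backbone cost
assert slow.predict(image, prompt) == fast.predict(image, prompt)
```

Because only the image encoder changes, downstream prompting code needs no modification, which is what makes the speedup a drop-in win.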
🌴 Direct-a-Video Generation 🌴

👉Direct-a-Video is a text-to-video generation framework that allows users to individually or jointly control the camera movement and/or object motion

👉Review https://t.ly/dZSLs
👉Paper arxiv.org/pdf/2402.03162.pdf
👉Project https://direct-a-video.github.io/
🍇 Graph Neural Network in TF 🍇

👉#Google TensorFlow-GNN: novel library to build Graph Neural Networks on TensorFlow. Source Code released under Apache 2.0 license 💙

👉Review https://t.ly/TQfg-
👉Code github.com/tensorflow/gnn
👉Blog blog.research.google/2024/02/graph-neural-networks-in-tensorflow.html
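The core abstraction TF-GNN implements at scale is message passing over a graph. A minimal sketch of one such round in plain Python (not the tensorflow_gnn API):

```python
# Hedged sketch of mean-aggregation message passing, the basic operation
# a GNN layer performs. Plain Python for clarity; TF-GNN does this over
# batched GraphTensors with trainable weights.

def gnn_layer(node_feats, edges):
    """One round of mean-aggregation message passing.
    node_feats: {node: float}, edges: list of (src, dst) pairs."""
    incoming = {n: [] for n in node_feats}
    for src, dst in edges:
        incoming[dst].append(node_feats[src])  # message = neighbor feature
    return {
        n: (node_feats[n] + sum(msgs) / len(msgs)) / 2 if msgs else node_feats[n]
        for n, msgs in incoming.items()
    }

feats = {"a": 1.0, "b": 3.0, "c": 5.0}
updated = gnn_layer(feats, [("a", "b"), ("c", "b")])
# "b" averages its own feature with the mean of its neighbors a and c:
# (3 + (1 + 5) / 2) / 2 = 3.0; nodes with no incoming edges are unchanged
```

Stacking several such rounds lets information propagate multiple hops, which is the behavior the library's layers expose with learnable transformations.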
🆔 Magic-Me: ID-Specific Video 🆔

👉#ByteDance VCD: with just a few images of a specific identity it can generate temporal consistent videos aligned with the given prompt

👉Review https://t.ly/qjJ2O
👉Paper arxiv.org/pdf/2402.09368.pdf
👉Project magic-me-webpage.github.io
👉Code github.com/Zhen-Dong/Magic-Me
🔥 Breaking: GEMINI 1.5 is out 🔥

👉Gemini 1.5 just announced: standard 128,000 token context window, up to 1 MILLION tokens via AI-Studio and #Vertex AI in private preview 🫠

👉Review https://t.ly/Vblvx
👉More: https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/#build-experiment
☀️ One2Avatar: Pic -> 3D Avatar ☀️

👉#Google presents a new approach to generate animatable photo-realistic avatars from a single image (or just a few). Impressive results.

👉Review https://t.ly/AS1oc
👉Paper arxiv.org/pdf/2402.11909.pdf
👉Project zhixuany.github.io/one2avatar_webpage/
🪟 BOG: Fine Geometric Views 🪟

👉 #Google (+Tübingen) unveils Binary Opacity Grids, a novel method to reconstruct triangle meshes from multi-view images able to capture fine geometric detail such as leaves, branches & grass. New SOTA, real-time on Google Pixel 8 Pro (and similar).

👉Review https://t.ly/E6T0W
👉Paper https://lnkd.in/dQEq3zy6
👉Project https://lnkd.in/dYYCadx9
👉Demo https://lnkd.in/d92R6QME
🦥Neuromorphic Video Binarization🦥

👉 University of HK unveils the new SOTA in event-based neuromorphic binary reconstruction: stunning results on QR codes, barcodes & text. Real-time, CPU-only, up to 10,000 FPS!

👉Review https://t.ly/V-NFa
👉Paper arxiv.org/pdf/2402.12644.pdf
👉Project github.com/eleboss/EBR
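The basic mapping from an event stream to a binary image can be sketched as integrating per-pixel event polarities and thresholding. This only illustrates the event→binary idea; the actual EBR method is considerably more sophisticated:

```python
import numpy as np

# Hedged sketch of event-based binarization: accumulate brightness-change
# events per pixel, then threshold. Function name and threshold are
# illustrative, not from the EBR repo.

def events_to_binary(events, shape, thresh=0.0):
    """events: list of (x, y, polarity) with polarity in {-1, +1}."""
    acc = np.zeros(shape, dtype=float)
    for x, y, p in events:
        acc[y, x] += p                      # integrate polarity per pixel
    return (acc > thresh).astype(np.uint8)  # 1 = bright, 0 = dark

events = [(0, 0, +1), (0, 0, +1), (1, 0, -1)]
img = events_to_binary(events, (2, 2))
# pixel (0,0) accumulated +2 -> 1; pixel (1,0) accumulated -1 -> 0
```

Because each event touches one pixel with one addition, this style of reconstruction is trivially cheap per event, which is how CPU-only throughput in the thousands of FPS becomes plausible.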
🩻 Pose via Ray Diffusion 🩻

👉Novel distributed representation of camera pose that treats a camera as a bundle of rays. Naturally suited for set-level transformers, it's the new SOTA on camera pose estimation. Source code released 💙

👉Review https://t.ly/qBsFK
👉Paper arxiv.org/pdf/2402.14817.pdf
👉Project jasonyzhang.com/RayDiffusion
👉Code github.com/jasonyzhang/RayDiffusion
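The "camera as a bundle of rays" representation can be sketched with Plücker coordinates (direction plus moment) for a set of pixels. A simplified pinhole setup with identity intrinsics; names are illustrative, not from the released code:

```python
import numpy as np

# Hedged sketch of the ray-bundle camera representation the paper
# diffuses over: each pixel becomes a Plücker ray (direction d, moment
# m = o x d), so pose is encoded distributively across many rays.

def camera_to_rays(R, t, pixels):
    """R: 3x3 world-to-camera rotation, t: translation, pixels: Nx2
    normalized pixel coords. Returns (directions, moments), each Nx3."""
    origin = -R.T @ t                        # camera center in world frame
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    dirs = (R.T @ homog.T).T                 # back-project rays to world
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    moments = np.cross(origin, dirs)         # m = o x d fixes ray position
    return dirs, moments

R, t = np.eye(3), np.zeros(3)
dirs, moments = camera_to_rays(R, t, np.array([[0.0, 0.0]]))
# camera at the origin -> all moments are zero; central ray points along +z
```

Since every ray is a small, homogeneous token, the bundle drops naturally into a set-level transformer, which is the property the paper exploits.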
🗃️ MATH-Vision Dataset 🗃️

👉MATH-V is a curated dataset of 3,040 high-quality math problems with visual contexts sourced from real math competitions. Dataset released 💙

👉Review https://t.ly/gmIAu
👉Paper arxiv.org/pdf/2402.14804.pdf
👉Project mathvision-cuhk.github.io/
👉Code github.com/mathvision-cuhk/MathVision
🫅FlowMDM: Human Composition🫅

👉FlowMDM, a diffusion-based approach capable of generating seamlessly continuous sequences of human motion from textual descriptions.

👉Review https://t.ly/pr2g_
👉Paper https://lnkd.in/daYRftdF
👉Project https://lnkd.in/dcRkv5Pc
👉Repo https://lnkd.in/dw-3JJks
🎷EMO: talking/singing Gen-AI 🎷

👉EMO: audio-driven portrait-video generation. It produces vocal avatar videos with expressive facial expressions and varied head poses. Input: a single frame; video duration = length of the input audio.

👉Review https://t.ly/4IYj5
👉Paper https://lnkd.in/dGPX2-Yc
👉Project https://lnkd.in/dyf6p_N3
👉Repo (empty) github.com/HumanAIGC/EMO
💌 Multi-LoRA Composition 💌

👉Two novel training-free approaches to image composition: LoRA Switch and LoRA Composite, for integrating any number of elements in an image through multi-LoRA composition. Source code released 💙

👉Review https://t.ly/GFy3Z
👉Paper arxiv.org/pdf/2402.16843.pdf
👉Code github.com/maszhongming/Multi-LoRA-Composition
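The difference between the two strategies can be sketched with a toy "denoiser": Switch rotates which LoRA is active per denoising step, while Composite averages the guidance from every LoRA at each step. Names and the scalar arithmetic are illustrative only:

```python
# Hedged sketch of the two training-free multi-LoRA strategies.
# A LoRA is modeled as a scalar additive update to a toy denoiser.

def denoise(x, lora):
    return x + lora          # stand-in for one UNet step with one LoRA

def lora_switch(x, loras, steps):
    for i in range(steps):
        x = denoise(x, loras[i % len(loras)])   # rotate the active LoRA
    return x

def lora_composite(x, loras, steps):
    for _ in range(steps):
        # average the per-LoRA updates: collective guidance each step
        x += sum(denoise(x, l) - x for l in loras) / len(loras)
    return x

loras = [1.0, 3.0]
assert lora_switch(0.0, loras, 4) == 8.0       # 1 + 3 + 1 + 3
assert lora_composite(0.0, loras, 4) == 8.0    # mean 2 per step x 4
```

In this linear toy both strategies coincide; in a real diffusion model they differ, because Switch keeps each step faithful to one LoRA's style while Composite blends all LoRAs' guidance continuously.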
💥 MM-AU: Video Accident 💥

👉MM-AU - Multi-Modal Accident Understanding: 11,727 videos with temporally aligned descriptions, 2.23M+ bounding boxes, and 58,650 pairs of video-based accident reasons. Data & code announced 💙

👉Review https://t.ly/a-jKI
👉Paper arxiv.org/pdf/2403.00436.pdf
👉Dataset http://www.lotvsmmau.net/MMAU/demo
🔥 SOTA: Stable Diffusion 3 is out! 🔥

👉Stable Diffusion 3 is the new SOTA in text-to-image generation (based on human preference evaluations). New Multimodal Diffusion Transformer (MMDiT) architecture uses separate sets of weights for image & language, improving text understanding/spelling capabilities. Weights & Source Code to be released 💙

👉Review https://t.ly/a1koo
👉Paper https://lnkd.in/d4i-9Bte
👉Blog https://lnkd.in/d-bEX-ww
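The MMDiT idea of separate weights per modality feeding one joint attention can be sketched in numpy. Dimensions and names are illustrative only, not the SD3 architecture:

```python
import numpy as np

# Hedged sketch of an MMDiT-style block: text and image tokens get
# separate projection weights but attend jointly over the concatenated
# sequence, so the two modalities mix inside attention.

rng = np.random.default_rng(0)
d = 8
W_img = rng.normal(size=(d, d))   # image-token weights
W_txt = rng.normal(size=(d, d))   # separate text-token weights

def mmdit_attention(img_tokens, txt_tokens):
    q_img = img_tokens @ W_img        # modality-specific projections...
    q_txt = txt_tokens @ W_txt
    q = np.vstack([q_img, q_txt])     # ...then one joint sequence
    scores = q @ q.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over all tokens
    return attn @ q                   # image and text tokens mix here

img = rng.normal(size=(4, d))
txt = rng.normal(size=(3, d))
out = mmdit_attention(img, txt)
assert out.shape == (7, d)   # joint sequence: 4 image + 3 text tokens
```

Keeping per-modality weights lets each stream specialize (one reason cited for the improved text rendering), while the shared attention keeps the two streams aligned.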
🧵E-LoFTR: new Feats-Matching SOTA🧵

👉A novel LoFTR-inspired algorithm for efficiently producing semi-dense matches across images: up to 2.5× faster than LoFTR and superior to the previous SOTA pipeline (SuperPoint + LightGlue). Code announced.

👉Review https://t.ly/7SPmC
👉Paper https://arxiv.org/pdf/2403.04765.pdf
👉Project https://zju3dv.github.io/efficientloftr/
👉Repo https://github.com/zju3dv/efficientloftr
🦁StableDrag: Point-based Editing🦁

👉#Tencent unveils StableDrag, a novel point-based image editing framework combining a discriminative point-tracking method with a confidence-based latent enhancement strategy for motion supervision. Source code announced, but still no repo.

👉Review https://t.ly/eUI05
👉Paper https://lnkd.in/dz8-ymck
👉Project stabledrag.github.io/
🏛️ PIXART-Σ: 4K Generation 🏛️

👉PixArt-Σ is a novel Diffusion Transformer model (DiT) capable of directly generating images at 4K resolution. Authors: #Huawei, Dalian, HKU & HKUST. Demos available, code announced 💙

👉Review https://t.ly/Cm2Qh
👉Paper arxiv.org/pdf/2403.04692.pdf
👉Project pixart-alpha.github.io/PixArt-sigma-project/
👉Repo (empty) github.com/PixArt-alpha/PixArt-sigma
🤗-Demo https://huggingface.co/spaces/PixArt-alpha/PixArt-alpha