🟩 Foundational Humanoid 🟩
👉#NVIDIA unveils SONIC, a novel foundation model for high-precision teleoperation and interactive control (running, jumping, crawling) with natural, human-like movements. Code announced💙
👉Review https://t.ly/_3wnt
👉Paper https://lnkd.in/dctfShu8
👉Project https://lnkd.in/d_inmA2p
🔥Depth Anything 3 is out🔥
👉ByteDance unveils Depth Anything 3 (DA3), a model that predicts spatially consistent geometry from arbitrary visual inputs, with or without known camera poses. Repo under Apache 2.0💙
👉Review https://t.ly/AOPu7
👉Paper arxiv.org/pdf/2511.10647
👉Project https://lnkd.in/dnByyn2z
👉Repo https://lnkd.in/daCVz_4a
👉Demo https://lnkd.in/dKUZiJt
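Models in this lineage predict geometry that, without known camera poses, is typically defined only up to an affine ambiguity. The standard way to compare such predictions against metric ground truth (common affine-invariant depth evaluation practice, not DA3's own code) is a least-squares scale-and-shift alignment:

```python
import numpy as np

def align_scale_shift(pred, gt):
    """Least-squares alignment of an affine-invariant depth prediction to
    metric ground truth: solves min over (s, t) of ||s * pred + t - gt||^2."""
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt.ravel(), rcond=None)
    return s * pred + t
```

After alignment, the usual metrics (AbsRel, delta < 1.25) are computed on the aligned map.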
🌩️ It's "Time-to-Move" 🌩️
👉Technion + NVIDIA unveil Time-to-Move (TTM), a training-free, plug-and-play framework for motion- and appearance-controlled video generation with I2V diffusion models (Wan 2.2, CogVideoX, and Stable Video Diffusion). Impressive results!
👉Review https://t.ly/0pwXm
👉Paper https://lnkd.in/dxD3uHYb
👉Project https://lnkd.in/dcE5juyM
👉Repo https://lnkd.in/dMMUjybJ
⌚ Multi-Shot Video Segmentation ⌚
👉Fudan tackles the underexplored task of multi-shot video object segmentation (MVOS). Benchmark and repo (an extension of SAM) available under Apache 2.0💙
👉Review https://t.ly/WBW00
👉Paper https://arxiv.org/pdf/2511.13715
👉Project https://henghuiding.com/SAAS/
👉Repo https://github.com/FudanCVL/SAAS
🔥 SAM 3/3D are OUT!! 🔥
👉#META released SAM 3, a unified model for detection, segmentation, and tracking of objects in images & video using text, exemplar, and visual prompts. Repo/models under a proprietary license💙
👉Review https://t.ly/lnRZN
👉Paper https://t.ly/5tq9N
👉Project https://ai.meta.com/sam3/
👉Demo https://segment-anything.com
👉Repo https://github.com/facebookresearch/sam3
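SAM 3's actual prompt encoders and mask decoder are far richer, but the core intuition behind a visual prompt can be illustrated with a toy feature-similarity sketch — every name and threshold below is hypothetical, not Meta's API:

```python
import numpy as np

def point_prompt_mask(features, point, tau=0.8):
    """Toy promptable segmentation: select every pixel whose feature is
    cosine-similar to the feature at the prompted (row, col) location."""
    f = features / np.maximum(
        np.linalg.norm(features, axis=-1, keepdims=True), 1e-8)
    query = f[point[0], point[1]]      # unit feature at the prompt
    similarity = f @ query             # cosine-similarity map (H, W)
    return similarity >= tau           # boolean mask
```

A real system replaces the raw threshold with a learned mask decoder, and text/exemplar prompts with learned prompt embeddings in the same space.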
🍯Unwrapping of 3D Meshes🍯
👉PartUV is a novel part-based UV unwrapping method for 3D meshes; it combines learned part priors with geometric cues to generate a compact set of part-aligned charts. Repo released💙
👉Review https://t.ly/8dNIY
👉Paper arxiv.org/pdf/2511.16659
👉Project www.zhaoningwang.com/PartUV/
👉Repo github.com/EricWang12/PartUV
🍕 Upsample Anything 🍕
👉Upsample Anything is a novel universal, training-free upsampler based on lightweight test-time optimization. No code, but a relevant paper💙
👉Review https://t.ly/7LE6G
👉Paper https://lnkd.in/dsUfdtih
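The paper's exact objective isn't spelled out in this blurb; as a sketch of the general test-time-optimization recipe (optimize a high-res map so it pools back to the low-res input, with edge-aware smoothness from a guidance image — all parameters and the specific energy are my assumptions, not the paper's):

```python
import numpy as np

def avgpool2(x):
    """2x2 average pooling."""
    return 0.25 * (x[::2, ::2] + x[1::2, ::2] + x[::2, 1::2] + x[1::2, 1::2])

def upsample_tto(f_lr, guide, iters=200, lr=0.5, lam=0.1):
    """Test-time-optimized 2x upsampling of a low-res map `f_lr`, with
    edge-aware smoothness taken from a high-res guidance image `guide`."""
    f = np.kron(f_lr, np.ones((2, 2)))            # nearest-neighbour init
    # smoothness weights shrink where the guide has strong edges
    wx = np.exp(-np.abs(np.diff(guide, axis=1)) / 0.1)
    wy = np.exp(-np.abs(np.diff(guide, axis=0)) / 0.1)
    for _ in range(iters):
        # data term: the pooled estimate must reproduce the low-res input
        r = avgpool2(f) - f_lr
        g = 0.25 * np.kron(r, np.ones((2, 2)))    # adjoint of avgpool2
        # edge-aware smoothness term (weighted graph Laplacian)
        dx, dy = np.diff(f, axis=1), np.diff(f, axis=0)
        gs = np.zeros_like(f)
        gs[:, 1:] += wx * wx * dx
        gs[:, :-1] -= wx * wx * dx
        gs[1:, :] += wy * wy * dy
        gs[:-1, :] -= wy * wy * dy
        f -= lr * (g + lam * gs)
    return f
```

Because nothing is trained, the same loop applies to any modality (depth, features, logits) by swapping `f_lr`.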
🦞Single Synthetic Image per Class🦞
👉MIT unveils Linear Gradient Matching (H/T Torralba), a novel distillation method for training linear classifiers (and more) from a single synthetic image per class. Repo available💙
👉Review https://t.ly/dD3un
👉Paper arxiv.org/pdf/2511.16674
👉Project linear-gradient-matching.github.io/
👉Repo github.com/GeorgeCazenavette/linear-gradient-matching
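My reading of the blurb: distill a dataset into one synthetic point per class by matching the classifier gradient it induces to the gradient induced by real data. A toy numpy sketch of that gradient-matching loop (finite-difference gradients, tiny dimensions, a fixed random probe classifier — everything here is illustrative, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ce_grad(X, Y, W):
    """Gradient of mean cross-entropy of linear classifier W on (X, Y)."""
    return X.T @ (softmax(X @ W) - Y) / len(X)

# toy "real" data: two Gaussian blobs in a 4-D feature space
X_real = np.concatenate([rng.normal(+1, 1, (50, 4)),
                         rng.normal(-1, 1, (50, 4))])
Y_real = np.repeat(np.eye(2), 50, axis=0)

# one synthetic point per class, plus a fixed random probe classifier
X_syn, Y_syn = rng.normal(0, 1, (2, 4)), np.eye(2)
W = rng.normal(0, 0.1, (4, 2))

def match_loss(X_s):
    """Squared distance between synthetic and real classifier gradients."""
    return np.sum((ce_grad(X_s, Y_syn, W) - ce_grad(X_real, Y_real, W)) ** 2)

loss_before = match_loss(X_syn)
eps, lr = 1e-5, 0.2
for _ in range(400):                     # descend on the synthetic points
    g = np.zeros_like(X_syn)
    for i in range(X_syn.size):          # finite-difference gradient
        d = np.zeros(X_syn.size)
        d[i] = eps
        d = d.reshape(X_syn.shape)
        g.flat[i] = (match_loss(X_syn + d) - match_loss(X_syn - d)) / (2 * eps)
    X_syn = X_syn - lr * g
loss_after = match_loss(X_syn)
```

The distilled points are then usable to train a fresh linear classifier whose gradients mimic training on the full set.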
🧪 EfficientSAM3 is out 🧪
👉Bristol announces EfficientSAM3, a family of efficient models built on Progressive Hierarchical Distillation, which transfers capability from SAM3 to lightweight students. Code coming (in sync with the SAM3 release)💙
👉Review https://t.ly/bfXP2
👉Paper arxiv.org/pdf/2511.15833
👉Project simonzeng7108.github.io/efficientsam3/
👉Repo github.com/SimonZeng7108/efficientsam3
🌩️ Cloud4D in time 🌩️
👉Cloud4D reconstructs physically realistic 3D cloud fields from ground-based cameras at 25 m spatial and 5 s temporal resolution. Repo coming, data released💙
👉Review https://t.ly/w7Zly
👉Paper arxiv.org/pdf/2511.19431
👉Project cloud4d.jacob-lin.com/
👉Data https://drive.google.com/drive/folders/1QU_0kIUXIVt8h3uqygBeaF3Gvr_L5SdX?usp=drive_link
👉Repo TBA
🍓MotionV2V: Editing Motion in Video🍓
👉Google unveils motion edits, a new approach to editing videos by controlling the change in motion from the original to the edited video using diffusion models. Impressive results. Repo to be released soon💙
👉Review https://t.ly/s0sIT
👉Paper https://arxiv.org/pdf/2511.20640
👉Project https://ryanndagreat.github.io/MotionV2V/
👉Repo https://github.com/RyannDaGreat/MotionV2V
🔥 Smell Like Vision Spirit 🔥
👉New York Smells is a novel large-scale dataset of paired vision and olfaction captured in the wild, enabling the new task of cross-modal learning between smell and sight. With the lights out, it's less dangerous. Dataset available💙
👉Review https://t.ly/Ycn_B
👉Paper arxiv.org/pdf/2511.20544
👉Project smell.cs.columbia.edu/
🕶️ Seeing without Pixels 🕶️
👉Is it possible to perceive a video's content without seeing its pixels, from the camera trajectory alone? DeepMind (+ UTexas) is the first to systematically investigate this seemingly implausible question💙
👉Review https://t.ly/Ymd1c
👉Paper arxiv.org/pdf/2511.21681
👉Project sites.google.com/view/seeing-without-pixels
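The paper's actual pipeline isn't described in this blurb; as a flavour of why trajectory alone is informative, even two hand-rolled statistics of a camera-position sequence (mean speed, mean turning angle) can separate, say, a smooth dolly shot from jittery handheld footage. A hypothetical sketch:

```python
import numpy as np

def trajectory_features(positions):
    """Two simple statistics of a camera path given as (T, 3) positions:
    mean per-step speed and mean turning angle between consecutive steps."""
    v = np.diff(positions, axis=0)            # per-step displacement
    speed = np.linalg.norm(v, axis=1)
    a, b = v[:-1], v[1:]
    denom = np.maximum(
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1), 1e-8)
    cos = np.clip((a * b).sum(axis=1) / denom, -1.0, 1.0)
    return speed.mean(), np.arccos(cos).mean()
```

Feeding such features (or the raw trajectory) to a classifier is one plausible baseline for the content-from-trajectory question the paper studies.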
🌵Instance-Level Video Generation🌵
👉InstanceV is the first video generation framework designed specifically for instance-level control at the architectural level. Code & data announced💙
👉Review https://t.ly/y_TBT
👉Paper arxiv.org/pdf/2511.23146
👉Project aliothchen.github.io/projects/InstanceV/
👉Repo TBA
🥭3D Point Motion Editing🥭
👉Edit-by-Track enables precise video motion editing via 3D point tracks. By specifying desired 3D trajectories, users can seamlessly control joint camera and object motion, remove objects, and transfer motion between videos. No code announced, but a relevant paper💙
👉Review https://t.ly/GJHJ5
👉Paper arxiv.org/pdf/2512.02015
👉Project edit-by-track.github.io/
🦄 Native Unified Multimodal 🦄
👉META unveils TUNA, a novel unified multimodal model (UMM) that builds a unified continuous visual representation by cascading a VAE encoder with a representation encoder. This unified space enables SOTA end-to-end processing of images/videos for both understanding and generation. Code under legal review💙
👉Review https://t.ly/7wmKP
👉Paper https://lnkd.in/djT4WGEU
👉Project https://tuna-ai.org/
👉Repo github.com/wren93/tuna
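The blurb names only the composition — a VAE encoder cascaded with a representation encoder into one continuous space. A toy numpy sketch of that cascade; every class name, dimension, and weight below is hypothetical, not Meta's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

class VAEEncoder:
    """Toy stand-in: maps pixels to a continuous latent grid (8x downsample)."""
    def __init__(self, c_lat=4):
        self.w = rng.normal(0, 0.02, (3 * 8 * 8, c_lat))

    def __call__(self, img):               # img: (H, W, 3) with H, W % 8 == 0
        H, W, _ = img.shape
        p = img.reshape(H // 8, 8, W // 8, 8, 3).transpose(0, 2, 1, 3, 4)
        return p.reshape(H // 8, W // 8, -1) @ self.w

class ReprEncoder:
    """Toy stand-in: lifts VAE latents into a semantic feature space."""
    def __init__(self, c_lat=4, c_feat=16):
        self.w = rng.normal(0, 0.5, (c_lat, c_feat))

    def __call__(self, z):
        return np.tanh(z @ self.w)

def unified_encode(img, vae, enc):
    """The cascade: pixels -> continuous VAE latents -> semantic features."""
    return enc(vae(img))
```

The point of the cascade is that one representation then serves both directions: understanding heads read the semantic features, while generation can decode back through the VAE latents.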
✌️SOTA Generative SLP✌️
👉Stable Signer is a new sign language production (SLP) generative model. It redefines SLP as a hierarchical, end-to-end generation task comprising text understanding (Prompt2Gloss, Text2Gloss) and Pose2Vid. Repo with data💙
👉Review https://t.ly/yKZhn
👉Paper arxiv.org/pdf/2512.04048
👉Project stablesigner.github.io/
👉Data github.com/SignLLM/Prompt2Sign/tree/main/tools-new-2025
🐘TTSC for 3D Generative🐘
👉SpaceControl is the new SOTA training-free, test-time method for explicit spatial control of 3D generation. Repo announced💙
👉Review https://t.ly/1zrah
👉Paper https://lnkd.in/dEWh3vep
👉Project https://lnkd.in/dScftUmm
👉Repo TBA
🎷Layered PSD Diffusion🎷
👉OmniPSD produces layered PSD files with transparent alpha channels, separating text, foreground elements, and background into clean RGBA layers that can be edited directly in design tools. Online demo💙
👉Review https://t.ly/YNRAC
👉Paper arxiv.org/pdf/2512.09247
👉Project showlab.github.io/OmniPSD/
👉Demo https://www.lovart.ai/it
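What makes layered RGBA output directly editable is that the flattened image is just the standard "over" composite of the layers. A minimal sketch of that operator (conventions vary; this assumes straight-alpha inputs and accumulates a premultiplied result, which equals the flattened image when the bottom layer is opaque, as a PSD canvas usually is):

```python
import numpy as np

def composite_over(layers):
    """Flatten RGBA layers (listed bottom to top) with the 'over' operator.
    Each layer is an (H, W, 4) array with straight alpha in [0, 1]."""
    out = np.zeros_like(layers[0], dtype=float)
    for layer in layers:
        a = layer[..., 3:4]
        out[..., :3] = layer[..., :3] * a + out[..., :3] * (1.0 - a)
        out[..., 3:4] = a + out[..., 3:4] * (1.0 - a)
    return out
```

Editing a single layer and re-running this loop is exactly what "directly editable" buys over a flat image.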
🧱Pixel Art Volumetric Rendering🧱
👉Voxify3D is a novel differentiable two-stage framework bridging 3D mesh optimization with 2D pixel art supervision. Repo announced💙
👉Review https://t.ly/qPyNl
👉Paper https://lnkd.in/du5ikJGN
👉Project https://lnkd.in/dpiAjj5m
👉Repo TBA