✨The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
📝 Summary:
This paper examines the gap between SAM2 and SAM3: SAM2 relies on spatial prompts for geometric segmentation, whereas SAM3 is a concept-driven multimodal model built on a unified vision-language architecture, marking a new class of foundation model for concept-driven segmentation.
🔹 Publication Date: Published on Dec 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06032
• PDF: https://arxiv.org/pdf/2512.06032
• Github: https://github.com/Applied-AI-Research-Lab/The-SAM2-to-SAM3-Gap-in-the-Segment-Anything-Model-Family
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#ImageSegmentation #FoundationModels #ComputerVision #MultimodalAI #AIResearch
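To make the interface gap concrete, here is a purely illustrative sketch contrasting SAM2-style spatial prompting with SAM3-style concept prompting. These classes and the `describe` helper are hypothetical stand-ins, not the real SAM2/SAM3 APIs.

```python
# Hypothetical prompt containers: geometry tells a SAM2-style model *where*,
# language tells a SAM3-style model *what*. Not the actual SAM APIs.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SpatialPrompt:                # SAM2-style spatial prompt
    points: List[Tuple[int, int]] = field(default_factory=list)
    boxes: List[Tuple[int, int, int, int]] = field(default_factory=list)

@dataclass
class ConceptPrompt:                # SAM3-style concept prompt
    phrase: str = ""                # e.g. "all red traffic cones"

def describe(prompt) -> str:
    if isinstance(prompt, SpatialPrompt):
        return f"segment the object at {len(prompt.points)} click(s) / {len(prompt.boxes)} box(es)"
    if isinstance(prompt, ConceptPrompt):
        return f"segment every instance matching the concept: '{prompt.phrase}'"
    raise TypeError("unknown prompt type")

print(describe(SpatialPrompt(points=[(120, 80)])))
print(describe(ConceptPrompt(phrase="yellow school bus")))
```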
✨Small-Gain Nash: Certified Contraction to Nash Equilibria in Differentiable Games
📝 Summary:
Small-Gain Nash (SGN) certifies convergence in differentiable games where traditional methods fail. It constructs a custom weighted block metric under which the pseudo-gradient becomes strongly monotone, even when it is non-monotone in the Euclidean metric. This yields a structural convergence certificate with safe step-sizes.
🔹 Publication Date: Published on Dec 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06791
• PDF: https://arxiv.org/pdf/2512.06791
• Project Page: https://huggingface.co/papers?q=projected%20Euler
• Github: https://github.com/AashVed/SmallGainNash
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
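A minimal numerical sketch (not the authors' code) of the idea: a game whose pseudo-gradient is not monotone in the Euclidean metric can become strongly monotone under a weighted block metric W, certifying contraction of plain pseudo-gradient (Euler) steps toward the Nash equilibrium. The specific 2-player linear game and the weights below are illustrative choices.

```python
import numpy as np

# Pseudo-gradient F(z) = J z for a toy 2-player game with cross-coupling.
J = np.array([[1.0, 3.0],
              [0.0, 1.0]])
F = lambda z: J @ z

sym = lambda A: 0.5 * (A + A.T)
print("min eig, Euclidean sym(J):  %+.3f" % np.linalg.eigvalsh(sym(J)).min())      # < 0: not monotone
W = np.diag([1.0, 4.0])                                   # weighted block metric (illustrative weights)
print("min eig, weighted sym(W J): %+.3f" % np.linalg.eigvalsh(sym(W @ J)).min())  # > 0: strongly monotone in W

# Contraction in practice: the W-norm to the equilibrium z* = 0 shrinks at every
# Euler step, even though the Euclidean analysis alone would not certify this.
z, eta = np.array([2.0, -1.0]), 0.1
for k in range(5):
    z = z - eta * F(z)
    print(f"step {k + 1}: ||z||_W = {np.sqrt(z @ W @ z):.4f}")
```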
✨One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
📝 Summary:
FAE adapts pretrained visual representations for image generation using a simple framework with a single attention layer and dual decoders. It bridges the gap between understanding features and generation latents, achieving strong performance and fast learning on various benchmarks.
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07829
• PDF: https://arxiv.org/pdf/2512.07829
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
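A hedged PyTorch sketch of the structure described above: a single attention layer pools frozen encoder features into generation latents, and two decoders map those latents back to pixels and to the encoder features. Shapes, head counts, and decoder designs are illustrative assumptions, not the paper's exact FAE.

```python
import torch
import torch.nn as nn

class TinyFAEAdapter(nn.Module):
    def __init__(self, enc_dim=768, lat_dim=256, n_heads=8, n_queries=64):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, enc_dim) * 0.02)
        self.attn = nn.MultiheadAttention(enc_dim, n_heads, batch_first=True)  # the single attention layer
        self.to_latent = nn.Linear(enc_dim, lat_dim)
        # Dual decoders: one reconstructs pixel patches, one predicts the encoder features back.
        self.pixel_decoder = nn.Sequential(nn.Linear(lat_dim, 1024), nn.GELU(), nn.Linear(1024, 3 * 16 * 16))
        self.feature_decoder = nn.Sequential(nn.Linear(lat_dim, 1024), nn.GELU(), nn.Linear(1024, enc_dim))

    def forward(self, enc_tokens):                       # enc_tokens: (B, N, enc_dim) from a frozen encoder
        q = self.queries.unsqueeze(0).expand(enc_tokens.size(0), -1, -1)
        pooled, _ = self.attn(q, enc_tokens, enc_tokens)
        z = self.to_latent(pooled)                        # generation latents
        return self.pixel_decoder(z), self.feature_decoder(z)

tokens = torch.randn(2, 196, 768)                         # stand-in for frozen ViT features
pix, feat = TinyFAEAdapter()(tokens)
print(pix.shape, feat.shape)
```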
✨SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
📝 Summary:
A three-stage framework, SPARK, uses a generator and a verifier to create synthetic training data for process reward models, enabling reference-free reinforcement learning that surpasses ground-truth methods.
🔹 Publication Date: Published on Dec 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03244
• PDF: https://arxiv.org/pdf/2512.03244
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
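An illustrative sketch (not the paper's pipeline) of the generator/verifier pattern for synthetic process-reward data: a generator proposes a step-by-step solution, a verifier labels each step, and the resulting (prefix, label) pairs supervise a process reward model. The toy generator, verifier, and stopping convention below are placeholders.

```python
import random

def toy_generator(problem: str, n_steps: int = 4):
    """Placeholder for an LLM that emits reasoning steps one at a time."""
    return [f"{problem} :: step {i + 1}" for i in range(n_steps)]

def toy_verifier(steps, flaw_rate: float = 0.25):
    """Placeholder for a verifier model; here it flags steps at random."""
    return [0 if random.random() < flaw_rate else 1 for _ in steps]

def build_prm_examples(problem: str):
    steps, labels = toy_generator(problem), []
    labels = toy_verifier(steps)
    examples, prefix = [], []
    for step, label in zip(steps, labels):
        prefix.append(step)
        examples.append({"prefix": list(prefix), "step_label": label})
        if label == 0:          # common convention: stop crediting after the first flawed step
            break
    return examples

random.seed(0)
for ex in build_prm_examples("solve 12 * (3 + 4)"):
    print(ex["step_label"], "<-", ex["prefix"][-1])
```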
✨Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge
📝 Summary:
A vision-action policy using correlated noise for flow matching and learnable mixed-layer attention wins the 2025 BEHAVIOR Challenge with high performance across diverse household tasks.
🔹 Publication Date: Published on Dec 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06951
• PDF: https://arxiv.org/pdf/2512.06951
• Project Page: https://behavior.stanford.edu/challenge/
• Github: https://github.com/IliaLarchenko/behavior-1k-solution
🔹 Models citing this paper:
• https://huggingface.co/IliaLarchenko/behavior_submission
✨ Datasets citing this paper:
• https://huggingface.co/datasets/IliaLarchenko/behavior_224_rgb
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
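A hedged sketch of "correlated noise for flow matching" over an action chunk: instead of i.i.d. Gaussian noise per timestep, sample temporally correlated (AR(1)) noise so nearby action steps share noise structure, then form the usual flow-matching interpolant and velocity target. The AR(1) choice and the value of rho are illustrative assumptions, not necessarily the authors' exact scheme.

```python
import numpy as np

def correlated_noise(horizon, dim, rho=0.9, rng=np.random.default_rng(0)):
    eps = rng.standard_normal((horizon, dim))
    noise = np.empty_like(eps)
    noise[0] = eps[0]
    for t in range(1, horizon):
        # AR(1) with unit marginal variance, so it is a drop-in replacement for iid noise
        noise[t] = rho * noise[t - 1] + np.sqrt(1 - rho**2) * eps[t]
    return noise

def flow_matching_pair(actions, t, rng=np.random.default_rng(0)):
    """actions: (horizon, dim) clean action chunk; t: scalar in (0, 1)."""
    x0 = correlated_noise(*actions.shape, rng=rng)   # temporally correlated noise sample
    xt = (1 - t) * x0 + t * actions                  # linear interpolant
    target_velocity = actions - x0                   # regression target for the policy
    return xt, target_velocity

chunk = np.zeros((16, 7))                            # toy 16-step, 7-DoF action chunk
xt, v = flow_matching_pair(chunk, t=0.3)
print(xt.shape, v.shape)
```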
🤖🧠 IndicWav2Vec: Building the Future of Speech Recognition for Indian Languages
🗓️ 09 Dec 2025
📚 AI News & Trends
India is one of the most linguistically diverse countries in the world, home to over 1,600 languages and dialects. Yet, speech technology for most of these languages has historically lagged behind due to limited data and resources. While English and a handful of global languages have benefited immensely from advancements in automatic speech recognition (ASR), ...
#IndicWav2Vec #SpeechRecognition #IndianLanguages #ASR #LinguisticDiversity #AIResearch
✨JEPA as a Neural Tokenizer: Learning Robust Speech Representations with Density Adaptive Attention
📝 Summary:
This paper introduces a two-stage self-supervised framework combining JEPA and DAAM to learn robust speech representations. It uses masked prediction, FSQ, and HiFi-GAN for efficient, highly compressed, and language-model-friendly tokenization that outperforms existing audio codecs.
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07168
• PDF: https://arxiv.org/pdf/2512.07168
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
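A minimal sketch of Finite Scalar Quantization (FSQ), the quantizer named in the summary: each latent dimension is bounded and rounded to a small fixed number of levels, and the straight-through estimator keeps the operation differentiable. The level counts below are illustrative; the paper's configuration may differ.

```python
import torch

def fsq(z, levels=(7, 7, 7, 5, 5, 5)):
    """z: (..., len(levels)) latent; returns quantized latent and integer codes."""
    L = torch.tensor(levels, dtype=z.dtype, device=z.device)
    half = (L - 1) / 2
    bounded = torch.tanh(z) * half                            # squash each dim into [-(L-1)/2, (L-1)/2]
    quantized = torch.round(bounded)
    quantized = bounded + (quantized - bounded).detach()      # straight-through gradient
    codes = (quantized.detach() + half).long()                # integer code per dimension, in [0, L-1]
    return quantized, codes

z = torch.randn(2, 10, 6, requires_grad=True)                 # (batch, time, fsq_dims)
q, codes = fsq(z)
q.sum().backward()                                            # gradients flow to z via straight-through
print(codes.min().item(), codes.max().item(), z.grad.shape)
```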
✨Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
📝 Summary:
Wan-Move brings precise, scalable motion control to video generation. It projects object trajectories into latent space, creating motion-aware features to guide existing models without architectural changes. This yields high-quality 480p videos with motion control rivaling commercial tools.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08765
• PDF: https://arxiv.org/pdf/2512.08765
• Project Page: https://wan-move.github.io/
🔹 Models citing this paper:
• https://huggingface.co/Ruihang/Wan-Move-14B-480P
✨ Datasets citing this paper:
• https://huggingface.co/datasets/Ruihang/MoveBench
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
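A hedged sketch of the core idea named above: project user-specified object trajectories onto the latent grid of a video generator, producing motion-aware feature maps that can condition generation without architectural changes. The rasterization scheme and displacement encoding here are illustrative guesses, not the paper's exact method.

```python
import numpy as np

def trajectories_to_latent_motion(trajs, frames, lat_h, lat_w, img_h, img_w):
    """trajs: list of (frames, 2) pixel-space (x, y) tracks -> (frames, lat_h, lat_w, 2) displacement field."""
    motion = np.zeros((frames, lat_h, lat_w, 2), dtype=np.float32)
    for tr in trajs:
        for t in range(1, frames):
            x, y = tr[t]
            gx = min(int(x / img_w * lat_w), lat_w - 1)      # map pixel coords to a latent cell
            gy = min(int(y / img_h * lat_h), lat_h - 1)
            motion[t, gy, gx] = tr[t] - tr[t - 1]            # per-frame displacement written at that cell
    return motion

# toy example: one object moving left-to-right across a 480x832 frame
frames, img_h, img_w = 8, 480, 832
traj = np.stack([np.linspace(50, 750, frames), np.full(frames, 240)], axis=1)
motion = trajectories_to_latent_motion([traj], frames, lat_h=30, lat_w=52, img_h=img_h, img_w=img_w)
print(motion.shape, motion[1].max())
```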
✨TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
📝 Summary:
TrackingWorld provides dense 3D tracking of pixels in a world-centric coordinate system by upsampling sparse 2D tracks and optimizing camera poses and 3D coordinates.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08358
• PDF: https://arxiv.org/pdf/2512.08358
• Project Page: https://igl-hkust.github.io/TrackingWorld.github.io/
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
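A small geometry sketch of the world-centric idea in the summary: given a 2D pixel track, per-frame depth, and camera poses, each tracked point can be back-projected to camera coordinates and mapped into a shared world frame, so camera motion and object motion are disentangled. The intrinsics and poses here are toy values; the paper additionally optimizes them, which this sketch does not.

```python
import numpy as np

def lift_track_to_world(uv, depth, K, cam_to_world):
    """uv: (T, 2) pixel track, depth: (T,), K: (3, 3), cam_to_world: (T, 4, 4) -> (T, 3) world points."""
    T = uv.shape[0]
    ones = np.ones((T, 1))
    rays = (np.linalg.inv(K) @ np.concatenate([uv, ones], axis=1).T).T   # (T, 3) unit-depth rays
    pts_cam = rays * depth[:, None]                                      # scale by metric depth
    pts_cam_h = np.concatenate([pts_cam, ones], axis=1)                  # homogeneous camera-frame points
    return np.einsum("tij,tj->ti", cam_to_world, pts_cam_h)[:, :3]

T = 5
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
uv = np.tile([[320.0, 240.0]], (T, 1))            # a point fixed at the image center...
depth = np.full(T, 2.0)
poses = np.stack([np.eye(4) for _ in range(T)])
poses[:, 0, 3] = np.linspace(0, 1, T)             # ...while the camera translates along x
print(lift_track_to_world(uv, depth, K, poses))   # world-frame positions shift with the camera translation
```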
✨TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models
📝 Summary:
TreeGRPO, a novel RL framework, enhances training efficiency for generative models by using a tree-structured denoising process, leading to faster training and better performance.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08153
• PDF: https://arxiv.org/pdf/2512.08153
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
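A hedged sketch of the tree-advantage idea: denoising trajectories branch into a tree, leaf samples receive rewards, each internal node's value is the mean of its descendants' rewards, and a GRPO-style advantage is computed per branch by normalizing against its sibling group. The tiny hard-coded tree and rewards are illustrative only.

```python
from statistics import mean

# node -> children; leaves carry rewards from the reward model
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
leaf_reward = {"a1": 0.9, "a2": 0.7, "b1": 0.2, "b2": 0.4}

def node_value(node):
    if node in leaf_reward:
        return leaf_reward[node]
    return mean(node_value(c) for c in tree[node])

def sibling_advantages(parent):
    """GRPO-style: each child's advantage is its value minus the sibling-group mean."""
    values = {c: node_value(c) for c in tree[parent]}
    baseline = mean(values.values())
    return {c: v - baseline for c, v in values.items()}

for parent in ["root", "a", "b"]:
    print(parent, "->", sibling_advantages(parent))
```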
✨Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
📝 Summary:
LivingSwap enhances video face swapping by using keyframes and reference guidance to maintain identity and fidelity over long sequences, reducing manual effort and achieving state-of-the-art results.
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07951
• PDF: https://arxiv.org/pdf/2512.07951
• Project Page: https://aim-uofa.github.io/LivingSwap
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨EcomBench: Towards Holistic Evaluation of Foundation Agents in E-commerce
📝 Summary:
EcomBench is a benchmark that evaluates agent performance in real-world e-commerce environments through deep information retrieval, multi-step reasoning, and cross-source knowledge integration.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08868
• PDF: https://arxiv.org/pdf/2512.08868
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨DeepCode: Open Agentic Coding
📝 Summary:
DeepCode, a fully autonomous framework, addresses the challenges of document-to-codebase synthesis by optimizing information flow through source compression, structured indexing, knowledge injection, ...
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07921
• PDF: https://arxiv.org/pdf/2512.07921
• Github: https://github.com/HKUDS/DeepCode
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models
📝 Summary:
ThreadWeaver, a framework for adaptive parallel reasoning, achieves accuracy comparable to sequential models while reducing inference latency through parallel trajectory generation and trie-based training.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07843
• PDF: https://arxiv.org/pdf/2512.07843
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Modular Neural Image Signal Processing
📝 Summary:
A modular neural ISP framework provides high rendering accuracy, scalability, and flexibility for diverse photo-editing operations with competitive results.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08564
• PDF: https://arxiv.org/pdf/2512.08564
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation
📝 Summary:
DualVLN is a dual-system model for vision-language navigation. It integrates a VLM global planner with a fast local policy for smooth actions, enabling robust real-time control and long-horizon planning in dynamic environments.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08186
• PDF: https://arxiv.org/pdf/2512.08186
• Project Page: https://internrobotics.github.io/internvla-n1-dualvln.github.io/
• Github: https://github.com/InternRobotics/InternNav
🔹 Models citing this paper:
• https://huggingface.co/InternRobotics/InternVLA-N1-System2
• https://huggingface.co/InternRobotics/InternVLA-N1-w-NavDP
• https://huggingface.co/InternRobotics/InternVLA-N1-DualVLN
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs
📝 Summary:
This paper introduces a principled method to adapt autoregressive LLMs into block-wise diffusion models, enabling efficient parallel generation. This adaptation retains pretrained knowledge, achieving state-of-the-art performance for 7B diffusion LLMs, and avoids expensive training from scratch.
🔹 Publication Date: Published on Dec 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06776
• PDF: https://arxiv.org/pdf/2512.06776
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#LLM #DiffusionModels #AI #ParallelGeneration #MachineLearning
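A hedged toy of block-wise diffusion decoding as described above: the sequence is generated block by block (autoregressive across blocks, like a standard LLM), while tokens inside the current block are filled in parallel over a few unmasking rounds (diffusion-style within the block). The random "denoiser" and the keep-schedule stand in for a real adapted model.

```python
import random

VOCAB, MASK = list(range(10)), -1

def toy_denoiser(context, block):
    """Placeholder for the adapted LLM: proposes a token for every masked slot in the block."""
    return [random.choice(VOCAB) if tok == MASK else tok for tok in block]

def generate(n_blocks=3, block_size=4, rounds=2, seed=0):
    random.seed(seed)
    sequence = []
    for _ in range(n_blocks):
        block = [MASK] * block_size
        for r in range(rounds):
            proposal = toy_denoiser(sequence, block)
            # keep a growing fraction of positions each round (confidence would decide this in practice)
            keep = int(block_size * (r + 1) / rounds)
            for i in range(keep):
                block[i] = proposal[i]
        sequence.extend(block)                 # the finished block becomes context for the next one
    return sequence

print(generate())
```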
✨Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs
📝 Summary:
New benchmarks reveal MLLMs struggle with cross-modal inconsistency, failing to reason consistently across image, text, and mixed modalities with the same information. Visual characteristics like color and resolution significantly impact performance, even when text recognition is perfect.
🔹 Publication Date: Published on Dec 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08923
• PDF: https://arxiv.org/pdf/2512.08923
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#MLLMs #CrossModalAI #AIResearch #ComputerVision #NLP
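A small sketch of the kind of consistency metric such a benchmark implies: ask the same question about the same content rendered as text, as an image, and as a mix, then report how often the model's answers agree across modalities. The stub "model" below returns canned answers purely to make the metric concrete; it is not part of the paper.

```python
from itertools import combinations

def toy_model(question, content, modality):
    """Placeholder MLLM: pretend it misreads low-resolution images."""
    return "42" if modality != "image_lowres" else "24"

def cross_modal_consistency(question, content, modalities):
    answers = {m: toy_model(question, content, m) for m in modalities}
    pairs = list(combinations(modalities, 2))
    agree = sum(answers[a] == answers[b] for a, b in pairs)
    return answers, agree / len(pairs)

answers, score = cross_modal_consistency(
    "What is the product of 6 and 7?", "6 * 7", ["text", "image", "image_lowres", "mixed"])
print(answers)
print(f"pairwise consistency: {score:.2f}")
```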
✨Predicting Time-Dependent Flow Over Complex Geometries Using Operator Networks
📝 Summary:
A Deep Operator Network predicts unsteady flow velocity fields over complex geometries with up to 1000X speedup over traditional simulations. It accurately captures near-term transients but shows error accumulation in fine-scale wakes.
🔹 Publication Date: Published on Dec 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04434
• PDF: https://arxiv.org/pdf/2512.04434
• Github: https://github.com/baskargroup/TimeDependent-DeepONet
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#DeepLearning #FluidDynamics #AI #CFD #MachineLearning
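A hedged PyTorch sketch of the operator-network structure named above (DeepONet): a branch net encodes the input function (e.g., an inflow profile or geometry samples) and a trunk net encodes query coordinates (x, y, t); their dot product gives the predicted field value at that point. Layer sizes and sensor counts are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    def __init__(self, n_sensors=64, coord_dim=3, p=32):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, 64), nn.Tanh(), nn.Linear(64, p))
        self.trunk = nn.Sequential(nn.Linear(coord_dim, 64), nn.Tanh(), nn.Linear(64, p))

    def forward(self, u_sensors, coords):
        # u_sensors: (B, n_sensors) samples of the input function
        # coords:    (B, n_query, coord_dim) space-time query points
        b = self.branch(u_sensors)                      # (B, p)
        t = self.trunk(coords)                          # (B, n_query, p)
        return torch.einsum("bp,bqp->bq", b, t)         # (B, n_query) predicted field values

model = TinyDeepONet()
u = torch.randn(4, 64)                                  # e.g., inflow profile sampled at 64 sensor points
xyt = torch.rand(4, 100, 3)                             # 100 query points (x, y, t) per case
print(model(u, xyt).shape)                              # torch.Size([4, 100])
```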
✨MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
📝 Summary:
MIND-V generates long-horizon, physically plausible robotic manipulation videos. This hierarchical framework uses semantic reasoning and an RL-based physical alignment strategy to synthesize robust, coherent actions, addressing data scarcity.
🔹 Publication Date: Published on Dec 7
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06628
• PDF: https://arxiv.org/pdf/2512.06628
• Project Page: https://github.com/Richard-Zhang-AI/MIND-V
• Github: https://github.com/Richard-Zhang-AI/MIND-V
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#Robotics #VideoGeneration #ReinforcementLearning #AI #MachineLearning
✨OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
📝 Summary:
OneStory generates coherent multi-shot videos by modeling global cross-shot context. It uses a Frame Selection module and an Adaptive Conditioner for next-shot generation, leveraging pretrained models and a new dataset. This achieves state-of-the-art narrative coherence for long-form video storytelling.
🔹 Publication Date: Published on Dec 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07802
• PDF: https://arxiv.org/pdf/2512.07802
• Project Page: https://zhaochongan.github.io/projects/OneStory/
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VideoGeneration #AI #DeepLearning #ComputerVision #GenerativeAI