ML Research Hub – Telegram
ML Research Hub
32.9K subscribers
4.63K photos
285 videos
24 files
5K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
IVRA: Improving Visual-Token Relations for Robot Action Policy with Training-Free Hint-Based Guidance

📝 Summary:
IVRA improves spatial understanding in VLA models by training-free injection of vision encoder affinity signals into language model layers at inference time. This enhances geometric structure and robot action policies. It shows consistent performance gains across diverse 2D and 3D manipulation ta...

🔹 Publication Date: Published on Jan 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.16207
• PDF: https://arxiv.org/pdf/2601.16207
• Project Page: https://jongwoopark7978.github.io/IVRA

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#Robotics #VisionLanguageModels #SpatialAI #RobotLearning #DeepLearning
Prometheus: Unified Knowledge Graphs for Issue Resolution in Multilingual Codebases

📝 Summary:
Prometheus is a multi-agent system that uses a unified knowledge graph of code repositories to resolve real-world issues across multiple programming languages. It improves upon existing methods by handling diverse languages and real-world scenarios.

🔹 Publication Date: Published on Jul 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.19942
• PDF: https://arxiv.org/pdf/2507.19942
• Github: https://github.com/Pantheon-temple/Prometheus

==================================

#KnowledgeGraphs #MultiAgentSystems #CodeAnalysis #SoftwareEngineering #AI
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

📝 Summary:
This paper presents an agentic framework that translates dialogue into cinematic videos. ScripterAgent generates a script from dialogue, which DirectorAgent uses to orchestrate video models for long-horizon coherence. The system improves script faithfulness and reveals a trade-off in current video ge...

🔹 Publication Date: Published on Jan 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17737
• PDF: https://arxiv.org/pdf/2601.17737
• Project Page: https://xd-mu.github.io/ScriptIsAllYouNeed/
• Github: https://github.com/Tencent/digitalhuman/tree/main/ScriptAgent

==================================

#AIAgents #VideoGeneration #GenerativeAI #MultimodalAI #DeepLearning
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

📝 Summary:
Elastic Attention dynamically adjusts transformer sparsity ratios during inference using a lightweight Attention Router. This resolves static sparsity limitations in existing models, boosting efficiency and performance for long-context LLMs with minimal training.

🔹 Publication Date: Published on Jan 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17367
• PDF: https://arxiv.org/pdf/2601.17367
• Github: https://github.com/LCM-Lab/Elastic-Attention

==================================

#Transformers #LLMs #Sparsity #DeepLearning #EfficientAI
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

📝 Summary:
This paper surveys how LLMs are transforming data preparation tasks like cleaning, integration, and enrichment. It details the shift from rule-based to prompt-driven approaches, outlining techniques, benefits, and challenges, along with future research directions.

🔹 Publication Date: Published on Jan 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17058
• PDF: https://arxiv.org/pdf/2601.17058
• Github: https://github.com/weAIDB/awesome-data-llm

==================================

#LLMs #DataPreparation #DataCleaning #DataScience #AI
VIBEVOICE-ASR Technical Report

📝 Summary:
VibeVoice-ASR is a unified end-to-end speech understanding framework that processes long-form audio in a single pass while supporting multilingual, code-switching, and domain-specific context injectio...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18184
• PDF: https://arxiv.org/pdf/2601.18184

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

📝 Summary:
Scientific image synthesis using logic-driven frameworks like ImgCoder improves multimodal reasoning by addressing visual-logic divergence through structured generation and evaluation benchmarks.

🔹 Publication Date: Published on Jan 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17027
• PDF: https://arxiv.org/pdf/2601.17027
• Project Page: https://scigenbench.github.io/
• Github: https://github.com/SciGenBench/SciGenBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation

📝 Summary:
AR-Omni is a unified autoregressive model for any-to-any multimodal generation using a single Transformer. It generates text, images, and streaming speech without relying on expert components. The model addresses key challenges like modality imbalance and achieves strong real-time quality.

🔹 Publication Date: Published on Jan 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17761
• PDF: https://arxiv.org/pdf/2601.17761
• Project Page: https://modalitydance.github.io/AR-Omni

🔹 Models citing this paper:
https://huggingface.co/ModalityDance/AR-Omni-Pretrain-v0.1
https://huggingface.co/ModalityDance/AR-Omni-Chat-v0.1

Datasets citing this paper:
https://huggingface.co/datasets/ModalityDance/AR-Omni-Instruct-v0.1

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts

📝 Summary:
Imbalanced expert routing in Mixture-of-Experts models leads to computational inefficiencies in expert parallelism, which are addressed by a dynamic rerouting algorithm that balances workload and redu...

🔹 Publication Date: Published on Jan 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17111
• PDF: https://arxiv.org/pdf/2601.17111
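
To make the load-balancing idea in the summary above concrete, here is a minimal sketch of a generic greedy "least-loaded first" rerouting step. It is not the paper's algorithm: the per-device token counts, the fixed `capacity`, and the function name `reroute_excess` are all hypothetical, chosen only for illustration.

```python
import heapq


def reroute_excess(device_loads, capacity):
    """Greedy illustration (hypothetical, not the paper's method).

    Tokens above `capacity` on an overloaded device are moved to whichever
    device currently holds the fewest tokens, until the excess is gone or
    no device has spare room.
    Returns (new_loads, transfers), where each transfer is (src, dst, n_tokens).
    """
    loads = list(device_loads)
    transfers = []
    # Min-heap of (current_load, device) for devices that still have spare room.
    heap = [(load, dev) for dev, load in enumerate(loads) if load < capacity]
    heapq.heapify(heap)

    for src, load in enumerate(device_loads):
        excess = load - capacity
        while excess > 0 and heap:
            dst_load, dst = heapq.heappop(heap)
            moved = min(excess, capacity - dst_load)
            transfers.append((src, dst, moved))
            loads[src] -= moved
            loads[dst] += moved
            excess -= moved
            # Keep the destination available only if it still has room.
            if loads[dst] < capacity:
                heapq.heappush(heap, (loads[dst], dst))
    return loads, transfers


new_loads, transfers = reroute_excess([10, 2, 0, 4], capacity=4)
print(new_loads)   # [4, 4, 4, 4]
print(transfers)   # [(0, 2, 4), (0, 1, 2)]
```

In this toy run, device 0 starts with 10 tokens and hands 4 to the empty device 2 and 2 to device 1, so every device ends at the assumed capacity of 4. If the total load exceeds the combined capacity, the remaining excess simply stays on the source device.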

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal

📝 Summary:
An agentic framework for automatic academic rebuttal generation that decomposes reviews, retrieves evidence, plans rebuttal strategies, and generates persuasive responses with human-level performance ...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18081
• Hugging Face Collection: https://huggingface.co/collections/HakHan/drpg-rebuttalagent
• PDF: https://arxiv.org/pdf/2601.18081
• Github: https://github.com/ulab-uiuc/DRPG-RebuttalAgent/tree/master

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
iFSQ: Improving FSQ for Image Generation with 1 Line of Code

📝 Summary:
Finite Scalar Quantization with improved activation mapping enables unified modeling of discrete and continuous image generation approaches, revealing optimal representation balance and performance ch...

🔹 Publication Date: Published on Jan 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17124
• PDF: https://arxiv.org/pdf/2601.17124
• Github: https://github.com/Tencent-Hunyuan/iFSQ
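
For readers unfamiliar with Finite Scalar Quantization, the sketch below shows the baseline idea as it is commonly described: squash each latent channel into a bounded range, then round it to a small integer grid. It does not reproduce iFSQ's change to the activation mapping, which the truncated summary above does not specify. The `tanh` bounding, the restriction to odd level counts, and the function names are assumptions made for illustration.

```python
import math


def fsq_quantize(z, levels):
    """Baseline FSQ-style quantization (illustrative sketch, odd levels only).

    Each scalar z[i] is bounded to (-half, half) with tanh, where
    half = levels[i] // 2, and then rounded to the nearest integer,
    giving one of levels[i] possible values per channel.
    """
    if len(z) != len(levels):
        raise ValueError("z and levels must have the same length")
    codes = []
    for value, num_levels in zip(z, levels):
        if num_levels % 2 == 0:
            raise ValueError("this sketch only handles odd level counts")
        half = num_levels // 2
        bounded = half * math.tanh(value)
        codes.append(round(bounded))
    return codes


def code_to_index(codes, levels):
    """Map per-channel integer codes to a single codebook index (mixed radix)."""
    index = 0
    for code, num_levels in zip(codes, levels):
        index = index * num_levels + (code + num_levels // 2)
    return index


codes = fsq_quantize([0.0, 10.0, -10.0], [5, 5, 5])
print(codes)                            # [0, 2, -2]
print(code_to_index(codes, [5, 5, 5]))  # 70
```

With three channels of 5 levels each, the implicit codebook has 5 * 5 * 5 = 125 entries and no learned codebook vectors. In a trained model, the rounding step is usually paired with a straight-through gradient estimator, which this plain-Python sketch omits.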

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Self-Refining Video Sampling

📝 Summary:
Self-refining video sampling improves motion coherence and physics alignment by using a pre-trained video generator as its own denoising autoencoder for iterative refinement with uncertainty-aware reg...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18577
• PDF: https://arxiv.org/pdf/2601.18577
• Project Page: https://agwmon.github.io/self-refine-video/
• Github: https://github.com/agwmon/self-refine-video

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval

📝 Summary:
CGPT improves table retrieval by using LLM-generated synthetic queries for contrastive fine-tuning of embedding models through semantically diverse partial table construction.

🔹 Publication Date: Published on Jan 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.15849
• PDF: https://arxiv.org/pdf/2601.15849
• Github: https://github.com/yumeow0122/CGPT

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents

📝 Summary:
RouteMoA reduces computational costs and latency in mixture-of-agents frameworks by using dynamic routing with lightweight scoring and judgment mechanisms.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18130
• PDF: https://arxiv.org/pdf/2601.18130

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SkyReels-V3 Technique Report

📝 Summary:
SkyReels-V3 is a unified multimodal video generation model that supports reference image-to-video, video-to-video extension, and audio-guided video generation through diffusion Transformers and in-con...

🔹 Publication Date: Published on Jan 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17323
• PDF: https://arxiv.org/pdf/2601.17323

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
UI Remix: Supporting UI Design Through Interactive Example Retrieval and Remixing

📝 Summary:
UI Remix is an interactive system that supports mobile UI design through example-driven workflows using a multimodal retrieval-augmented generation model, enabling iterative design adaptation with sou...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18759
• PDF: https://arxiv.org/pdf/2601.18759

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback

📝 Summary:
Deep search agents trained on synthetic question-answer pairs generated through an iterative agent-based pipeline demonstrate improved performance and adaptability across different search environments...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18202
• PDF: https://arxiv.org/pdf/2601.18202

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Agentic Very Long Video Understanding

📝 Summary:
An agentic framework using entity scene graphs enables long-horizon video understanding with structured search, temporal reasoning, and cross-modal capabilities for extended visual and audio interpret...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18157
• PDF: https://arxiv.org/pdf/2601.18157

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment

📝 Summary:
Meta Reward Modeling reformulates personalized reward modeling as a meta-learning problem to enable efficient adaptation to individual users with limited feedback.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18731
• PDF: https://arxiv.org/pdf/2601.18731
• Github: https://github.com/ModalityDance/MRM

🔹 Models citing this paper:
https://huggingface.co/ModalityDance/MRM-Reddit150-V2
https://huggingface.co/ModalityDance/MRM-Reddit100-V2
https://huggingface.co/ModalityDance/MRM-Reddit150-V1

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
daVinci-Dev: Agent-native Mid-training for Software Engineering

📝 Summary:
This paper introduces agentic mid-training for LLMs, bridging static data and dynamic development environments. Using agent-native data with contextually and environmentally native trajectories, it outperforms prior work on SWE-Bench Verified with fewer tokens.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18418
• PDF: https://arxiv.org/pdf/2601.18418
• Github: https://github.com/GAIR-NLP/daVinci-Dev

🔹 Models citing this paper:
https://huggingface.co/GAIR/daVinci-Dev-72B
https://huggingface.co/GAIR/daVinci-Dev-32B-MT
https://huggingface.co/GAIR/daVinci-Dev-72B-MT

Datasets citing this paper:
https://huggingface.co/datasets/GAIR/daVinci-Dev

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research