ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

📝 Summary:
This paper presents an agentic framework translating dialogue into cinematic videos. ScripterAgent generates a script from dialogue, which DirectorAgent uses to orchestrate video models for long-horizon coherence. The system improves script faithfulness and reveals a trade-off in current video ge...

🔹 Publication Date: Published on Jan 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17737
• PDF: https://arxiv.org/pdf/2601.17737
• Project Page: https://xd-mu.github.io/ScriptIsAllYouNeed/
• Github: https://github.com/Tencent/digitalhuman/tree/main/ScriptAgent

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AIAgents #VideoGeneration #GenerativeAI #MultimodalAI #DeepLearning
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

📝 Summary:
Elastic Attention dynamically adjusts transformer sparsity ratios during inference using a lightweight Attention Router. This resolves static sparsity limitations in existing models, boosting efficiency and performance for long-context LLMs with minimal training.

🔹 Publication Date: Published on Jan 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17367
• PDF: https://arxiv.org/pdf/2601.17367
• Project Page: https://github.com/LCM-Lab/Elastic-Attention
• Github: https://github.com/LCM-Lab/Elastic-Attention

==================================

#Transformers #LLMs #Sparsity #DeepLearning #EfficientAI
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

📝 Summary:
This paper surveys how LLMs are transforming data preparation tasks like cleaning, integration, and enrichment. It details the shift from rule-based to prompt-driven approaches, outlining techniques, benefits, and challenges, along with future research directions.

🔹 Publication Date: Published on Jan 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17058
• PDF: https://arxiv.org/pdf/2601.17058
• Project Page: https://github.com/weAIDB/awesome-data-llm
• Github: https://github.com/weAIDB/awesome-data-llm

==================================

#LLMs #DataPreparation #DataCleaning #DataScience #AI
VIBEVOICE-ASR Technical Report

📝 Summary:
VibeVoice-ASR is a unified end-to-end speech understanding framework that processes long-form audio in a single pass while supporting multilingual, code-switching, and domain-specific context injectio...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18184
• PDF: https://arxiv.org/pdf/2601.18184

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

📝 Summary:
Scientific image synthesis using logic-driven frameworks like ImgCoder improves multimodal reasoning by addressing visual-logic divergence through structured generation and evaluation benchmarks.

🔹 Publication Date: Published on Jan 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17027
• PDF: https://arxiv.org/pdf/2601.17027
• Project Page: https://scigenbench.github.io/
• Github: https://github.com/SciGenBench/SciGenBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation

📝 Summary:
AR-Omni is a unified autoregressive model for any-to-any multimodal generation using a single Transformer. It generates text, images, and streaming speech without relying on expert components. The model addresses key challenges like modality imbalance and achieves strong real-time quality.

🔹 Publication Date: Published on Jan 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17761
• PDF: https://arxiv.org/pdf/2601.17761
• Project Page: https://modalitydance.github.io/AR-Omni

🔹 Models citing this paper:
https://huggingface.co/ModalityDance/AR-Omni-Pretrain-v0.1
https://huggingface.co/ModalityDance/AR-Omni-Chat-v0.1

🔹 Datasets citing this paper:
https://huggingface.co/datasets/ModalityDance/AR-Omni-Instruct-v0.1

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts

📝 Summary:
Imbalanced expert routing in Mixture-of-Experts models leads to computational inefficiencies in expert parallelism, which are addressed by a dynamic rerouting algorithm that balances workload and redu...

🔹 Publication Date: Published on Jan 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17111
• PDF: https://arxiv.org/pdf/2601.17111

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
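
A rough illustration of the load-balancing idea in the post above, not the paper's algorithm (the summary is truncated and does not give it). This sketch assumes each device has a token count and a fixed capacity, and greedily moves excess tokens from overloaded devices to whichever device is currently least loaded. The function name `rebalance` and the `capacity` parameter are hypothetical.

```python
import heapq


def rebalance(loads, capacity):
    """Greedy least-loaded rebalancing (illustrative only).

    loads: list of token counts, one per device.
    capacity: hypothetical per-device limit.
    Returns (new_loads, moves), where each move is (src, dst, amount).
    """
    new_loads = list(loads)
    # Min-heap of (load, device) for devices that still have spare room.
    heap = [(load, dev) for dev, load in enumerate(new_loads) if load < capacity]
    heapq.heapify(heap)
    moves = []
    for src, load in enumerate(loads):
        excess = load - capacity
        while excess > 0 and heap:
            dst_load, dst = heapq.heappop(heap)  # least-loaded device
            amount = min(excess, capacity - dst_load)
            new_loads[src] -= amount
            new_loads[dst] += amount
            moves.append((src, dst, amount))
            excess -= amount
            # Keep the destination available if it still has spare room.
            if new_loads[dst] < capacity:
                heapq.heappush(heap, (new_loads[dst], dst))
    return new_loads, moves


if __name__ == "__main__":
    print(rebalance([10, 2, 0, 4], capacity=4))
```

If total load exceeds total capacity, the leftover excess simply stays on the overloaded device; a real system would need a policy for that case.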
DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal

📝 Summary:
An agentic framework for automatic academic rebuttal generation that decomposes reviews, retrieves evidence, plans rebuttal strategies, and generates persuasive responses with human-level performance ...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18081
• PDF: https://arxiv.org/pdf/2601.18081
• Github: https://github.com/ulab-uiuc/DRPG-RebuttalAgent/tree/master
• Hugging Face Collection: https://huggingface.co/collections/HakHan/drpg-rebuttalagent

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
iFSQ: Improving FSQ for Image Generation with 1 Line of Code

📝 Summary:
Finite Scalar Quantization with improved activation mapping enables unified modeling of discrete and continuous image generation approaches, revealing optimal representation balance and performance ch...

🔹 Publication Date: Published on Jan 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17124
• PDF: https://arxiv.org/pdf/2601.17124
• Github: https://github.com/Tencent-Hunyuan/iFSQ

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
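
For readers new to the term in the iFSQ post above, here is a minimal sketch of baseline Finite Scalar Quantization as it is commonly described: each latent channel is squashed into a bounded range with tanh and rounded to one of a few integer levels. This is not the iFSQ modification, which the truncated summary does not specify. The function name `fsq_quantize` is hypothetical, and the sketch handles odd level counts only and omits the straight-through gradient used in training.

```python
import math


def fsq_quantize(z, levels):
    """Baseline FSQ forward pass on one latent vector (illustrative only).

    z: list of floats, one per channel.
    levels: list of odd ints, number of quantization levels per channel.
    Returns integer codes in [-(L - 1) / 2, (L - 1) / 2] for each channel.
    """
    if len(z) != len(levels):
        raise ValueError("z and levels must have the same length")
    codes = []
    for value, num_levels in zip(z, levels):
        if num_levels % 2 == 0:
            raise ValueError("this sketch supports odd level counts only")
        half = (num_levels - 1) // 2
        bounded = math.tanh(value) * half  # squash into (-half, half)
        codes.append(round(bounded))       # snap to the nearest integer level
    return codes


if __name__ == "__main__":
    print(fsq_quantize([0.0, 10.0, -10.0], [5, 5, 5]))
```

The implied codebook size is the product of the per-channel level counts, so `[5, 5, 5]` gives 125 possible codes with no learned codebook.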
Self-Refining Video Sampling

📝 Summary:
Self-refining video sampling improves motion coherence and physics alignment by using a pre-trained video generator as its own denoising autoencoder for iterative refinement with uncertainty-aware reg...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18577
• PDF: https://arxiv.org/pdf/2601.18577
• Project Page: https://agwmon.github.io/self-refine-video/
• Github: https://github.com/agwmon/self-refine-video

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
CGPT: Cluster-Guided Partial Tables with LLM-Generated Supervision for Table Retrieval

📝 Summary:
CGPT improves table retrieval by using LLM-generated synthetic queries for contrastive fine-tuning of embedding models through semantically diverse partial table construction.

🔹 Publication Date: Published on Jan 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.15849
• PDF: https://arxiv.org/pdf/2601.15849
• Github: https://github.com/yumeow0122/CGPT

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents

📝 Summary:
RouteMoA reduces computational costs and latency in mixture-of-agents frameworks by using dynamic routing with lightweight scoring and judgment mechanisms.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18130
• PDF: https://arxiv.org/pdf/2601.18130

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SkyReels-V3 Technique Report

📝 Summary:
SkyReels-V3 is a unified multimodal video generation model that supports reference image-to-video, video-to-video extension, and audio-guided video generation through diffusion Transformers and in-con...

🔹 Publication Date: Published on Jan 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.17323
• PDF: https://arxiv.org/pdf/2601.17323

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
UI Remix: Supporting UI Design Through Interactive Example Retrieval and Remixing

📝 Summary:
UI Remix is an interactive system that supports mobile UI design through example-driven workflows using a multimodal retrieval-augmented generation model, enabling iterative design adaptation with sou...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18759
• PDF: https://arxiv.org/pdf/2601.18759

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback

📝 Summary:
Deep search agents trained on synthetic question-answer pairs generated through an iterative agent-based pipeline demonstrate improved performance and adaptability across different search environments...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18202
• PDF: https://arxiv.org/pdf/2601.18202

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Agentic Very Long Video Understanding

📝 Summary:
An agentic framework using entity scene graphs enables long-horizon video understanding with structured search, temporal reasoning, and cross-modal capabilities for extended visual and audio interpret...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18157
• PDF: https://arxiv.org/pdf/2601.18157

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment

📝 Summary:
Meta Reward Modeling reformulates personalized reward modeling as a meta-learning problem to enable efficient adaptation to individual users with limited feedback.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18731
• PDF: https://arxiv.org/pdf/2601.18731
• Github: https://github.com/ModalityDance/MRM

🔹 Models citing this paper:
https://huggingface.co/ModalityDance/MRM-Reddit150-V2
https://huggingface.co/ModalityDance/MRM-Reddit100-V2
https://huggingface.co/ModalityDance/MRM-Reddit150-V1

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
daVinci-Dev: Agent-native Mid-training for Software Engineering

📝 Summary:
This paper introduces agentic mid-training for LLMs, bridging static data and dynamic development environments. Using agent-native data with contextually and environmentally native trajectories, it outperforms prior work on SWE-Bench Verified with fewer tokens.

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18418
• PDF: https://arxiv.org/pdf/2601.18418
• Github: https://github.com/GAIR-NLP/daVinci-Dev

🔹 Models citing this paper:
https://huggingface.co/GAIR/daVinci-Dev-72B
https://huggingface.co/GAIR/daVinci-Dev-32B-MT
https://huggingface.co/GAIR/daVinci-Dev-72B-MT

🔹 Datasets citing this paper:
https://huggingface.co/datasets/GAIR/daVinci-Dev

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

📝 Summary:
Research examines factors influencing out-of-domain performance in reinforcement learning agents, identifying state information richness and planning complexity as key determinants, while proposing a ...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18217
• PDF: https://arxiv.org/pdf/2601.18217

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

📝 Summary:
A self-improvement framework enables pretrained language models to generate automated curricula for solving previously unsolvable problems by leveraging latent knowledge and meta-reinforcement learnin...

🔹 Publication Date: Published on Jan 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.18778
• PDF: https://arxiv.org/pdf/2601.18778
• Project Page: https://ssundaram21.github.io/soar/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research