ML Research Hub – Telegram
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Rethinking Chain-of-Thought Reasoning for Videos

📝 Summary:
This paper demonstrates that concise chains of thought and reduced visual tokens enable efficient video reasoning in MLLMs. The proposed framework improves inference speed and performance, showing that long, human-like reasoning is not necessary for effective video understanding.

🔹 Publication Date: Published on Dec 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09616
• PDF: https://arxiv.org/pdf/2512.09616
• Github: https://github.com/LaVi-Lab/Rethink_CoT_Video

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Smart Timing for Mining: A Deep Learning Framework for Bitcoin Hardware ROI Prediction

📝 Summary:
MineROI-Net is a Transformer model predicting Bitcoin ASIC hardware profitability within one year, addressing acquisition timing. It achieves 83.7% accuracy, outperforming baselines, and precisely identifies profitable or unprofitable periods to reduce financial risk.

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05402
• PDF: https://arxiv.org/pdf/2512.05402

==================================

#DeepLearning #Bitcoin #CryptoMining #FinancialModeling #AIResearch
GimbalDiffusion: Gravity-Aware Camera Control for Video Generation

📝 Summary:
GimbalDiffusion offers precise text-to-video camera control by using absolute, gravity-aligned coordinates. The framework defines interpretable camera trajectories, improving robustness and motion diversity over methods based on relative coordinates.

🔹 Publication Date: Published on Dec 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09112
• PDF: https://arxiv.org/pdf/2512.09112
• Project Page: https://lvsn.github.io/GimbalDiffusion/

==================================

#VideoGeneration #AI #DiffusionModels #ComputerVision #DeepLearning
Towards a Science of Scaling Agent Systems

📝 Summary:
A quantitative framework for agent system scaling using empirical coordination metrics identifies optimal multi-agent strategies based on task properties.

🔹 Publication Date: Published on Dec 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08296
• PDF: https://arxiv.org/pdf/2512.08296

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
🤖🧠 How to Run and Fine-Tune Kimi K2 Thinking Locally with Unsloth

🗓️ 11 Dec 2025
📚 AI News & Trends

The demand for efficient and powerful large language models (LLMs) continues to rise as developers and researchers seek new ways to optimize reasoning, coding, and conversational AI performance. One of the most impressive open-source AI systems available today is Kimi K2 Thinking, created by Moonshot AI. Through collaboration with Unsloth, users can now fine-tune and ...

#KimiK2Thinking #Unsloth #LLMs #LargeLanguageModels #AI #FineTuning
The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality

📝 Summary:
The FACTS Leaderboard is a new comprehensive benchmark evaluating LLMs' factual accuracy. It uses four sub-leaderboards (image-based, closed-book, search-augmented, and document-grounded) to assess factuality holistically with automated judges.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10791
• PDF: https://arxiv.org/pdf/2512.10791
• Project Page: https://www.kaggle.com/benchmarks/google/facts

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Evaluating Gemini Robotics Policies in a Veo World Simulator

📝 Summary:
A generative evaluation system using a frontier video model (Veo) enables comprehensive policy evaluation in robotics, including nominal performance, out-of-distribution generalization, and safety che...

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10675
• PDF: https://arxiv.org/pdf/2512.10675

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Fed-SE: Federated Self-Evolution for Privacy-Constrained Multi-Environment LLM Agents

📝 Summary:
Fed-SE, a Federated Self-Evolution framework, enhances LLM agents in privacy-constrained environments through local parameter-efficient fine-tuning and global aggregation in a low-rank subspace.

🔹 Publication Date: Published on Dec 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.08870
• PDF: https://arxiv.org/pdf/2512.08870

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

📝 Summary:
This study systematically explores reinforcement learning for text-to-3D generation, examining reward designs and RL algorithms and introducing a new benchmark. It develops AR3D-R1, the first RL-enhanced text-to-3D model, demonstrating RL's effectiveness across 3D generation stages.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10949
• PDF: https://arxiv.org/pdf/2512.10949
• Github: https://github.com/Ivan-Tang-3D/3DGen-R1

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

📝 Summary:
The Outcome-based Process Verifier (OPV) improves the verification of complex reasoning chains in large language models by combining outcome-based and process-based verification with iterative active ...

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10756
• PDF: https://arxiv.org/pdf/2512.10756

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos

📝 Summary:
MoCapAnything is a reference-guided framework that reconstructs rotation-based animations from monocular video for arbitrary rigged 3D assets, enabling cross-species retargeting and scalable 3D motion...

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10881
• PDF: https://arxiv.org/pdf/2512.10881

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Stronger Normalization-Free Transformers

📝 Summary:
Derf, a novel point-wise normalization function, outperforms existing alternatives across various domains, enhancing generalization without increased fitting capacity.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10938
• PDF: https://arxiv.org/pdf/2512.10938

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

📝 Summary:
OPV, an iterative active learning framework with Rejection Fine-Tuning, enhances verification of long reasoning chains in large language models, achieving state-of-the-art results and improving accura...

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10739
• PDF: https://arxiv.org/pdf/2512.10739

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale

📝 Summary:
Real-world AI software engineering demands coding agents that can reason over massive repositories, maintain durable memory across and within long sessions, and robustly coordinate complex toolchains ...

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10398
• PDF: https://arxiv.org/pdf/2512.10398
• Github: https://github.com/facebook/confucius

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task

📝 Summary:
A spatiotemporal reasoning framework enhances multimodal large language models for video question answering by strategically scheduling tools to improve spatial and temporal understanding.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10359
• PDF: https://arxiv.org/pdf/2512.10359
• Github: https://github.com/fansunqi/VideoTool

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos

📝 Summary:
A video-to-video translation framework converts human-object interaction videos into realistic robot manipulation videos using unpaired training data and a generative model.

🔹 Publication Date: Published on Dec 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09406
• PDF: https://arxiv.org/pdf/2512.09406
• Project Page: https://showlab.github.io/H2R-Grounder/
• Github: https://github.com/showlab/H2R-Grounder

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

📝 Summary:
InternGeometry, an LLM agent, surpasses human performance on IMO geometry problems. It uses iterative proposition verification and a dynamic memory mechanism, combined with Complexity-Boosting Reinforcement Learning, to achieve this with very limited training data.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10534
• PDF: https://arxiv.org/pdf/2512.10534

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction

📝 Summary:
VQRAE, a Vector Quantization Representation AutoEncoder, unifies multimodal understanding, generation, and reconstruction using a unified tokenizer with continuous semantic features and discrete token...

🔹 Publication Date: Published on Nov 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.23386
• PDF: https://arxiv.org/pdf/2511.23386

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
From Macro to Micro: Benchmarking Microscopic Spatial Intelligence on Molecules via Vision-Language Models

📝 Summary:
A benchmark framework evaluates Vision-Language Models in understanding microscopic spatial relationships, showing potential but highlighting the need for domain-specific knowledge integration.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10867
• PDF: https://arxiv.org/pdf/2512.10867

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MoRel: Long-Range Flicker-Free 4D Motion Modeling via Anchor Relay-based Bidirectional Blending with Hierarchical Densification

📝 Summary:
MoRel is a 4D Gaussian Splatting framework for long-range dynamic videos. It uses Anchor Relay-based Bidirectional Blending and Hierarchical Densification to achieve temporally consistent, flicker-free reconstruction with efficient memory use.

🔹 Publication Date: Published on Dec 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09270
• PDF: https://arxiv.org/pdf/2512.09270
• Project Page: https://cmlab-korea.github.io/MoRel/
• Github: https://github.com/CMLab-Korea/MoRel-arXiv

==================================

#GaussianSplatting #4DMotionModeling #ComputerVision #DeepLearning #NeuralRendering
MOA: Multi-Objective Alignment for Role-Playing Agents

📝 Summary:
MOA is a reinforcement-learning framework for role-playing agents that uses multi-objective optimization and thought-augmented rollout. It simultaneously improves multiple skills like domain knowledge and linguistic style, addressing limitations of prior methods. MOA outperforms strong baselines,...

🔹 Publication Date: Published on Dec 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.09756
• PDF: https://arxiv.org/pdf/2512.09756

==================================

#AI #ReinforcementLearning #MultiObjectiveOptimization #RolePlayingAgents #MachineLearning