ML Research Hub – Telegram
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR

📝 Summary:
LLMs trained for step-by-step reasoning tend to become verbose because RLVR pipelines often filter out easy problems. This work shows that retaining and modestly up-weighting moderately easy problems acts as an implicit length regularizer. The approach cuts output verbosity by half while maintaining accuracy, wi...
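
A minimal sketch of the idea in the summary above: instead of filtering easy problems out of the RLVR training pool, keep the moderately easy ones and sample them slightly more often. The function names, the pass-rate thresholds (`easy_lo`, `easy_hi`), and the boost factor are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def sampling_weight(pass_rate, easy_lo=0.7, easy_hi=0.95, easy_boost=1.5):
    """Hypothetical weighting rule; thresholds and boost are illustrative only."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must lie in [0, 1]")
    # With a binary verifiable reward, problems that are always or never
    # solved give identical rewards across rollouts, so they are dropped here.
    if pass_rate == 0.0 or pass_rate == 1.0:
        return 0.0
    # Moderately easy problems are kept and modestly up-weighted
    # rather than filtered out.
    if easy_lo <= pass_rate <= easy_hi:
        return easy_boost
    return 1.0


def build_sampling_distribution(pass_rates):
    """Turn per-problem pass rates into normalized sampling probabilities."""
    weights = [sampling_weight(p) for p in pass_rates]
    total = sum(weights)
    if total == 0:
        raise ValueError("no problem carries a learning signal")
    return [w / total for w in weights]


if __name__ == "__main__":
    # Four problems: never solved, hard, moderately easy, always solved.
    print(build_sampling_distribution([0.0, 0.3, 0.8, 1.0]))
```

Because the moderately easy problems tend to have short correct solutions, sampling them more often is what would nudge the policy toward shorter outputs under this reading of the summary.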

🔹 Publication Date: Published on Nov 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01937
• PDF: https://arxiv.org/pdf/2511.01937
• Github: https://github.com/MBZUAI-Paris/Frugal-AI-Math

🔹 Models citing this paper:
https://huggingface.co/MBZUAI-Paris/Frugal-Math-4B

Datasets citing this paper:
https://huggingface.co/datasets/MBZUAI-Paris/frugal-maths-data-split-v1

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#LLM #AI #ReinforcementLearning #FrugalAI #MathematicalReasoning
BRAINS: A Retrieval-Augmented System for Alzheimer's Detection and Monitoring

📝 Summary:
BRAINS is an LLM-based system for Alzheimer's detection and monitoring. It integrates cognitive assessments and a case retrieval module for risk assessment and disease severity classification. Evaluations demonstrate its effectiveness as a scalable, explainable, early-stage detection tool.

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02490
• PDF: https://arxiv.org/pdf/2511.02490

==================================

#Alzheimers #LLM #AI #MedicalAI #EarlyDetection
Kimi Linear: An Expressive, Efficient Attention Architecture

📝 Summary:
Kimi Linear is a new hybrid linear attention architecture that outperforms full attention in both quality and efficiency across diverse scenarios. It combines Kimi Delta Attention with Multi-Head Latent Attention, reducing KV cache usage by up to 75% and boosting decoding throughput by up to 6x.
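
A rough back-of-the-envelope check of the 75% figure, assuming (this ratio is not stated in the summary) that the hybrid stack interleaves three linear-attention layers with every full-attention layer, and that only full-attention layers keep a KV cache that grows with sequence length. The helper name is hypothetical.

```python
def kv_cache_fraction(linear_per_full: int) -> float:
    """Fraction of a full-attention model's KV cache that remains when only
    1 layer in every (linear_per_full + 1) layers is full attention.

    Assumes linear-attention layers hold a fixed-size state and therefore
    contribute nothing to the sequence-length-dependent cache.
    """
    if linear_per_full < 0:
        raise ValueError("linear_per_full must be non-negative")
    return 1.0 / (linear_per_full + 1)


if __name__ == "__main__":
    # Assumed 3:1 ratio of linear to full-attention layers.
    reduction = 1.0 - kv_cache_fraction(3)
    print(f"KV cache reduction: {reduction:.0%}")  # KV cache reduction: 75%
```

The sketch ignores the constant-size recurrent state of the linear layers and any extra compression inside the full-attention layers, so it only shows where a figure of this size could come from.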

🔹 Publication Date: Published on Oct 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26692
• PDF: https://arxiv.org/pdf/2510.26692
• Github: https://github.com/MoonshotAI/Kimi-Linear

🔹 Models citing this paper:
https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Base
https://huggingface.co/aiqtech/Kimi-Linear-48B-A3B-Instruct

Spaces citing this paper:
https://huggingface.co/spaces/Speedofmastery/orynxml-agents

==================================

#AttentionMechanisms #LLM #AIResearch #DeepLearning #ModelEfficiency
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

📝 Summary:
PaddleOCR-VL is a new 0.9B vision-language model for document parsing. It uses a NaViT-style visual encoder and ERNIE-4.5, achieving state-of-the-art performance across 109 languages with minimal resources and fast inference. This model is highly suitable for practical deployment.

🔹 Publication Date: Published on Oct 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.14528
• PDF: https://arxiv.org/pdf/2510.14528
• Github: https://github.com/PaddlePaddle/PaddleOCR

🔹 Models citing this paper:
https://huggingface.co/PaddlePaddle/PaddleOCR-VL
https://huggingface.co/PaddlePaddle/PP-DocLayoutV2
https://huggingface.co/lvyufeng/PaddleOCR-VL-0.9B

Spaces citing this paper:
https://huggingface.co/spaces/PaddlePaddle/PaddleOCR-VL_Online_Demo
https://huggingface.co/spaces/markobinario/PaddleOCR-VL_Online_Demo
https://huggingface.co/spaces/waytoAGI/PaddleOCR-VL_Online_Demo

==================================

#OCR #VisionLanguageModel #DocumentAI #DeepLearning #AI
Emu3.5: Native Multimodal Models are World Learners

📝 Summary:
Emu3.5 is a large-scale multimodal world model predicting next states in vision and language. It uses reinforcement learning and Discrete Diffusion Adaptation for efficient inference, delivering strong performance in multimodal tasks and world exploration.

🔹 Publication Date: Published on Oct 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26583
• PDF: https://arxiv.org/pdf/2510.26583
• Project Page: https://emu.world/
• Github: https://github.com/baaivision/Emu3.5

🔹 Models citing this paper:
https://huggingface.co/BAAI/Emu3.5
https://huggingface.co/BAAI/Emu3.5-Image
https://huggingface.co/BAAI/Emu3.5-VisionTokenizer

==================================

#MultimodalAI #WorldModels #ReinforcementLearning #ComputerVision #NLP
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

📝 Summary:
DeepAnalyze-8B is an agentic LLM that autonomously completes the entire data science pipeline, from raw data to research reports. It employs curriculum-based training and data-grounded trajectory synthesis, outperforming larger, workflow-based agents. This open-source model advances autonomous da...

🔹 Publication Date: Published on Oct 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.16872
• PDF: https://arxiv.org/pdf/2510.16872
• Project Page: https://ruc-deepanalyze.github.io/
• Github: https://github.com/ruc-datalab/DeepAnalyze

🔹 Models citing this paper:
https://huggingface.co/RUC-DataLab/DeepAnalyze-8B

Datasets citing this paper:
https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
https://huggingface.co/datasets/fantos/DataScience-Instruct-500K

==================================

#LLM #DataScience #AgenticAI #AutonomousAI #AI
TradingAgents: Multi-Agents LLM Financial Trading Framework

📝 Summary:
TradingAgents is a multi-agent LLM framework that simulates real-world trading firms with specialized, collaborative agents. This approach significantly improves trading performance metrics like cumulative returns and Sharpe ratio compared to baseline models.
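
For readers unfamiliar with the two metrics named above, here is a small sketch using their standard textbook definitions. The annualization factor of 252 trading days and the zero default risk-free rate are assumptions; the summary does not say which conventions the paper uses.

```python
import math


def cumulative_return(returns):
    """Compound a sequence of simple per-period returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample standard
    deviation, scaled by sqrt(periods_per_year).

    risk_free_rate is an annual rate, spread evenly across periods.
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(variance)
    if std == 0:
        raise ValueError("returns have zero variance")
    return mean / std * math.sqrt(periods_per_year)


if __name__ == "__main__":
    daily = [0.01, -0.005, 0.007, 0.002, -0.003]  # made-up daily returns
    print(cumulative_return(daily))
    print(sharpe_ratio(daily))
```

A higher Sharpe ratio means more return per unit of volatility, which is why it is usually reported alongside raw cumulative returns when comparing trading strategies.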

🔹 Publication Date: Published on Dec 28, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.20138
• PDF: https://arxiv.org/pdf/2412.20138
• Github: https://github.com/tauricresearch/tradingagents

Spaces citing this paper:
https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList
https://huggingface.co/spaces/Ervin2077/qiu

==================================

#TradingAgents #MultiAgentLLM #FinancialTrading #AlgorithmicTrading #AI
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

📝 Summary:
OmniFlatten is a novel end-to-end GPT model enabling real-time natural full-duplex spoken dialogue. It achieves this by post-training a text LLM with a multi-stage process for speech-text generation, without modifying the original architecture.

🔹 Publication Date: Published on Oct 23, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2410.17799
• PDF: https://arxiv.org/pdf/2410.17799
• Github: https://github.com/karpathy/nanogpt

==================================

#GPT #VoiceAI #NLP #LLM #DeepLearning
olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

📝 Summary:
olmOCR is an open-source toolkit that uses a fine-tuned vision language model to convert PDFs into clean, structured text. It enables large-scale, cost-effective extraction of trillions of tokens for training language models.

🔹 Publication Date: Published on Feb 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.18443
• PDF: https://arxiv.org/pdf/2502.18443
• Github: https://github.com/allenai/olmocr

Datasets citing this paper:
https://huggingface.co/datasets/davanstrien/test-olmocr2
https://huggingface.co/datasets/davanstrien/newspapers-olmocr2
https://huggingface.co/datasets/stckmn/ocr-output-Directive017-1761355297

==================================

#OCR #VLMs #LLM #DataExtraction #OpenSource
MedRAX: Medical Reasoning Agent for Chest X-ray

📝 Summary:
MedRAX is a new AI agent that integrates CXR analysis tools and multimodal large language models. It answers complex medical queries without extra training, achieving state-of-the-art performance.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.02673
• PDF: https://arxiv.org/pdf/2502.02673
• Github: https://github.com/bowang-lab/medrax

Spaces citing this paper:
https://huggingface.co/spaces/asbamit/MedRAX-main

==================================

#AI #MedicalAI #LLM #Radiology #DeepLearning
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

📝 Summary:
Mem0 is a memory-centric architecture with graph-based memory that enhances long-term conversational coherence in LLMs by efficiently extracting and consolidating information. It outperforms existing memory systems in accuracy, achieving a 26% relative improvement over OpenAI on the LLM-as-a-Judge metric, and significantly reduces comp...

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.19413
• PDF: https://arxiv.org/pdf/2504.19413
• Github: https://github.com/mem0ai/mem0

==================================

#AI #LLM #AIAgents #LongTermMemory #GraphMemory
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

📝 Summary:
IndexTTS enhances XTTS and Tortoise for TTS, improving naturalness and zero-shot voice cloning. It features hybrid character-pinyin modeling for Chinese and optimized vector quantization, resulting in more controllable usage, faster inference, and superior performance compared to other systems.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.05512
• PDF: https://arxiv.org/pdf/2502.05512
• Github: https://github.com/index-tts/index-tts

🔹 Models citing this paper:
https://huggingface.co/IndexTeam/IndexTTS-2
https://huggingface.co/IndexTeam/Index-TTS
https://huggingface.co/Toxzic/indextts-colab

Spaces citing this paper:
https://huggingface.co/spaces/IndexTeam/IndexTTS
https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
https://huggingface.co/spaces/jairwaal/image

==================================

#TextToSpeech #ZeroShotLearning #VoiceCloning #AI #MachineLearning