ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
BioBench: A Blueprint to Move Beyond ImageNet for Scientific ML Benchmarks

📝 Summary:
ImageNet accuracy poorly predicts performance on scientific imagery. BioBench is a new ecology vision benchmark unifying diverse tasks, kingdoms, and modalities with 3.1M images, offering a better evaluation for scientific ML.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16315
• PDF: https://arxiv.org/pdf/2511.16315
• Project Page: https://samuelstevens.me/biobench
• Github: https://github.com/samuelstevens/biobench

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#BioBench #MachineLearning #ComputerVision #ScientificML #Ecology
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

📝 Summary:
EntroPIC stabilizes entropy during long-term LLM training by adaptively tuning loss coefficients with Proportional-Integral Control. This novel method ensures efficient exploration and prevents sub-optimal behaviors, leading to stable and optimal reinforcement learning for LLMs.

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15248
• PDF: https://arxiv.org/pdf/2511.15248
• Project Page: https://huggingface.co/spaces/yangkaiSIGS/entropic
• Github: https://github.com/yk7333/EntroPIC

🔹 Models citing this paper:
https://huggingface.co/hunterbown/shannon-control-unit

🔹 Spaces citing this paper:
https://huggingface.co/spaces/yangkaiSIGS/entropic

#LLM #MachineLearning #ReinforcementLearning #ControlTheory #DeepLearning
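
As a rough illustration of the Proportional-Integral idea in the EntroPIC summary above, here is a minimal sketch of a PI controller that adjusts a single loss coefficient based on how far measured entropy sits from a target. This is not the paper's implementation: the class name, the gains `kp` and `ki`, the clamp range, and the choice of one scalar coefficient are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EntropyPIController:
    """Toy PI controller (hypothetical, not the EntroPIC code)."""

    target_entropy: float    # assumed set point for policy entropy
    kp: float = 0.5          # proportional gain (assumed value)
    ki: float = 0.1          # integral gain (assumed value)
    coef_min: float = 0.0    # lower clamp for the coefficient (assumed)
    coef_max: float = 1.0    # upper clamp for the coefficient (assumed)
    integral: float = 0.0    # running sum of past errors

    def update(self, measured_entropy: float) -> float:
        """Return a new loss coefficient from the latest entropy reading."""
        # Positive error: entropy is below target, so push the coefficient up.
        error = self.target_entropy - measured_entropy
        # The integral term accumulates error so a persistent gap keeps growing the output.
        self.integral += error
        raw = self.kp * error + self.ki * self.integral
        # Clamp so the coefficient stays in a usable range.
        return min(self.coef_max, max(self.coef_min, raw))


controller = EntropyPIController(target_entropy=1.0)
first = controller.update(0.6)   # entropy below target
second = controller.update(0.6)  # same gap again, integral term has grown
```

With the gap held constant, the second coefficient is larger than the first because the integral term keeps accumulating. The sketch has no anti-windup logic, which a real training loop would likely need.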
FinTRec: Transformer Based Unified Contextual Ads Targeting and Personalization for Financial Applications

📝 Summary:
FinTRec is a transformer-based framework for financial recommendation systems. It handles complex user interactions and multiple products, outperforming traditional tree models. This unified approach improves performance and reduces costs.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14865
• PDF: https://arxiv.org/pdf/2511.14865

#FinTech #RecommendationSystems #Transformers #AI #MachineLearning
Generalist Foundation Models Are Not Clinical Enough for Hospital Operations

📝 Summary:
Lang1, a specialized clinical language model, significantly outperforms generalist models in predicting hospital operational metrics after supervised finetuning. This suggests that effective healthcare AI requires in-domain pretraining and finetuning for specialized tasks.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13703
• PDF: https://arxiv.org/pdf/2511.13703

#HealthcareAI #ClinicalNLP #LLM #HospitalOperations #AIResearch
Boosting Medical Visual Understanding From Multi-Granular Language Learning

📝 Summary:
MGLL enhances visual understanding by improving multi-label and cross-granularity alignment in image-text pretraining, outperforming existing methods in complex domains like medical imaging.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15943
• PDF: https://arxiv.org/pdf/2511.15943
• Project Page: https://github.com/HUANGLIZI/MGLL
• Github: https://github.com/HUANGLIZI/MGLL

#MedicalAI #ComputerVision #DeepLearning #NLP #ImageTextPretraining
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

📝 Summary:
Agent0 is a self-evolving framework that trains LLM agents without human data. It uses two competing agents and tool integration in a multi-step co-evolution process. This significantly boosts reasoning capabilities, improving math reasoning by 18% and general reasoning by 24% on benchmarks.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16043
• PDF: https://arxiv.org/pdf/2511.16043
• Github: https://github.com/aiming-lab/Agent0

#LLMAgents #SelfEvolvingAI #ToolIntegration #AIResearch #Reasoning
MobiAgent: A Systematic Framework for Customizable Mobile Agents

📝 Summary:
MobiAgent is a comprehensive mobile agent system designed to improve real-world task execution accuracy and efficiency. It uses MobiMind models, the AgentRR framework, and MobiFlow benchmarking, plus an AI-assisted data collection pipeline. MobiAgent achieves state-of-the-art performance in mobil...

🔹 Publication Date: Published on Aug 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.00531
• PDF: https://arxiv.org/pdf/2509.00531
• Github: https://github.com/IPADS-SAI/MobiAgent/releases/download/v1.0/Mobiagent.apk

🔹 Models citing this paper:
https://huggingface.co/IPADS-SAI/MobiMind-Grounder-3B
https://huggingface.co/IPADS-SAI/MobiMind-Decider-7B
https://huggingface.co/IPADS-SAI/MobiMind-Mixed-7B

#MobileAgents #AI #DeepLearning #Robotics #Automation
Code2Video: A Code-centric Paradigm for Educational Video Generation

📝 Summary:
Code2Video is a code-centric agent framework generating educational videos via executable Python code. It uses three collaborative agents to improve coherence and interpretability, outperforming direct code generation by 40% and matching human-crafted tutorials.

🔹 Publication Date: Published on Oct 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.01174
• PDF: https://arxiv.org/pdf/2510.01174
• Project Page: https://showlab.github.io/Code2Video/
• Github: https://github.com/showlab/code2video

#AI #VideoGeneration #EducationalTech #CodeGeneration #DeepLearning
Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics

📝 Summary:
Enterprise Deep Research (EDR) is a multi-agent system for automated report generation and real-time data analysis in enterprises. It integrates specialized agents, tools, and a reflection mechanism for adaptive research. EDR outperforms state-of-the-art systems on open benchmarks without human ste...

🔹 Publication Date: Published on Oct 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.17797
• PDF: https://arxiv.org/pdf/2510.17797
• Github: https://github.com/SalesforceAIResearch/enterprise-deep-research

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Salesforce/EDR-200

#MultiAgentSystems #EnterpriseAI #DataAnalytics #AIResearch #AutomatedReporting
Hulu-Med: A Transparent Generalist Model towards Holistic Medical Vision-Language Understanding

📝 Summary:
Hulu-Med is a transparent medical vision-language model unifying diverse data modalities like text, 2D/3D images, and video. It achieves state-of-the-art performance across 30 clinical benchmarks with efficient training, promoting accessible AI.

🔹 Publication Date: Published on Oct 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.08668
• PDF: https://arxiv.org/pdf/2510.08668
• Github: https://github.com/ZJUI-AI4H/Hulu-Med

🔹 Models citing this paper:
https://huggingface.co/ZJU-AI4H/Hulu-Med-32B
https://huggingface.co/ZJU-AI4H/Hulu-Med-7B
https://huggingface.co/ZJU-AI4H/Hulu-Med-14B

#MedicalAI #VisionLanguageModel #MultimodalAI #HealthcareAI #AIResearch
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

📝 Summary:
GraphGen is a framework that enhances synthetic data generation for LLMs by constructing fine-grained knowledge graphs. It targets high-value knowledge gaps, uses multi-hop sampling, and style-controlled generation to create diverse and accurate QA pairs. This approach outperforms conventional me...

🔹 Publication Date: Published on May 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.20416
• PDF: https://arxiv.org/pdf/2505.20416
• Project Page: https://huggingface.co/spaces/chenzihong/GraphGen
• Github: https://github.com/open-sciencelab/GraphGen

🔹 Datasets citing this paper:
https://huggingface.co/datasets/chenzihong/GraphGen-Data

🔹 Spaces citing this paper:
https://huggingface.co/spaces/chenzihong/GraphGen

#LLMs #KnowledgeGraphs #SyntheticData #FineTuning #NLP
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

📝 Summary:
Skywork R1V is a multimodal reasoning model that efficiently extends large language models to visual tasks. It achieves this via efficient transfer, enhanced visual-text alignment, and adaptive Chain-of-Thought optimization, delivering competitive benchmark performance.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.05599
• PDF: https://arxiv.org/pdf/2504.05599
• Project Page: https://huggingface.co/papers?q=lightweight%20visual%20projector
• Github: https://github.com/SkyworkAI/Skywork-R1V

🔹 Models citing this paper:
https://huggingface.co/Skywork/Skywork-R1V-38B
https://huggingface.co/Skywork/Skywork-R1V2-38B
https://huggingface.co/Skywork/Skywork-R1V2-38B-AWQ

#MultimodalAI #ChainOfThought #LLMs #ComputerVision #AIResearch
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

📝 Summary:
OpenMMReasoner introduces a two-stage SFT+RL training approach with rigorous data curation. This method significantly enhances multimodal reasoning, improving performance by 11.6% over baselines across nine benchmarks.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16334
• PDF: https://arxiv.org/pdf/2511.16334
• Project Page: https://evolvinglmms-lab.github.io/OpenMMReasoner/
• Github: https://github.com/EvolvingLMMs-Lab/OpenMMReasoner

🔹 Models citing this paper:
https://huggingface.co/OpenMMReasoner/OpenMMReasoner-RL
https://huggingface.co/OpenMMReasoner/OpenMMReasoner-ColdStart

🔹 Datasets citing this paper:
https://huggingface.co/datasets/OpenMMReasoner/OpenMMReasoner-SFT-874K
https://huggingface.co/datasets/OpenMMReasoner/OpenMMReasoner-RL-74K

#MultimodalAI #ReinforcementLearning #LLMs #AIResearch #DeepLearning
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

📝 Summary:
GeoVista is a new agentic model for geolocalization that integrates tool invocation and reinforcement learning. It achieves high performance on the new GeoBench benchmark, surpassing open-source models and matching closed-source models.

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15705
• PDF: https://arxiv.org/pdf/2511.15705
• Project Page: https://ekonwang.github.io/geo-vista/
• Github: https://github.com/ekonwang/GeoVista

#Geolocalization #AI #ReinforcementLearning #ComputerVision #AIAgents
SAM 3: Segment Anything with Concepts

📝 Summary:
SAM 3 is a unified model that achieves state-of-the-art results in promptable concept segmentation and tracking. It uses concept prompts to detect, segment, and track objects, doubling accuracy over existing systems. The model and a new benchmark are open-sourced.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16719
• PDF: https://arxiv.org/pdf/2511.16719
• Project Page: https://ai.meta.com/sam3/
• Github: https://github.com/facebookresearch/sam3

#ComputerVision #ImageSegmentation #ObjectTracking #AI #DeepLearning
RynnVLA-002: A Unified Vision-Language-Action and World Model

📝 Summary:
RynnVLA-002 unifies a Vision-Language-Action (VLA) model and a world model, enabling joint learning of environmental dynamics and action planning. This mutual enhancement leads to superior performance, achieving 97.4% success in simulation and a 50% boost in real-world robot tasks.

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17502
• PDF: https://arxiv.org/pdf/2511.17502
• Github: https://github.com/alibaba-damo-academy/RynnVLA-002

#VisionLanguageAction #WorldModels #Robotics #AI #DeepLearning
Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination

📝 Summary:
Video-R4 is a video reasoning LMM that improves text-rich video QA through iterative visual rumination. It simulates human behavior by iteratively selecting, zooming, and re-encoding frames to update its reasoning. This approach achieves state-of-the-art results on various QA tasks.

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17490
• PDF: https://arxiv.org/pdf/2511.17490
• Project Page: https://yunlong10.github.io/Video-R4/
• Github: https://github.com/yunlong10/Video-R4

#VideoReasoning #LMM #MultimodalAI #DeepLearning #VideoQA
WorldGen: From Text to Traversable and Interactive 3D Worlds

📝 Summary:
WorldGen transforms text prompts into interactive 3D worlds. It combines LLM reasoning with procedural and diffusion-based 3D generation to efficiently create coherent, navigable environments for gaming and simulation.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16825
• PDF: https://arxiv.org/pdf/2511.16825
• Project Page: https://www.meta.com/blog/worldgen-3d-world-generation-reality-labs-generative-ai-research/

#3DGeneration #GenerativeAI #LLMs #VirtualWorlds #AIResearch
Planning with Sketch-Guided Verification for Physics-Aware Video Generation

📝 Summary:
SketchVerify improves video motion planning by iteratively refining candidate trajectories using lightweight sketch-based verification. This training-free method enhances physical realism and consistency more efficiently than full video generation.

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17450
• PDF: https://arxiv.org/pdf/2511.17450
• Project Page: https://sketchverify.github.io/

#VideoGeneration #MotionPlanning #AI #ComputerVision #PhysicsSimulation