ML Research Hub

🔹 Title: MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

📝 Summary:
MinerU2.5 is a document parsing model using a two-stage coarse-to-fine strategy. It first analyzes layout on downsampled images, then recognizes content on native-resolution crops. This achieves state-of-the-art accuracy with high efficiency.

🔹 Publication Date: Published on Sep 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.22186
• PDF: https://arxiv.org/pdf/2509.22186
• Project Page: https://opendatalab.github.io/MinerU/
• Github: https://github.com/opendatalab/MinerU

🔹 Models citing this paper:
• https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B
• https://huggingface.co/freakynit/MinerU2.5-2509-1.2B
• https://huggingface.co/Mungert/MinerU2.5-2509-1.2B-GGUF

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test
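
The coarse-to-fine idea in the summary can be sketched in a few lines. This is only an illustration, not MinerU2.5's code: `detect_layout` is a hypothetical stand-in for the layout stage, the page is a toy grayscale grid, and the downsampling factor is an arbitrary assumption.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale page as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def downsample(page: Image, factor: int) -> Image:
    """Keep every `factor`-th pixel in both directions (nearest neighbour)."""
    return [row[::factor] for row in page[::factor]]


def detect_layout(thumb: Image) -> List[Box]:
    """Hypothetical stage-1 stand-in: one box around all non-zero pixels."""
    coords = [(x, y) for y, row in enumerate(thumb) for x, v in enumerate(row) if v]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def to_native(box: Box, factor: int) -> Box:
    """Map a thumbnail-space box back to native-resolution coordinates."""
    left, top, right, bottom = box
    return (left * factor, top * factor, right * factor, bottom * factor)


def crop(page: Image, box: Box) -> Image:
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]


def parse(page: Image, factor: int = 4) -> List[Image]:
    """Stage 1 on the cheap thumbnail, stage 2 on native-resolution crops."""
    thumb = downsample(page, factor)
    return [crop(page, to_native(box, factor)) for box in detect_layout(thumb)]
```

In this toy version, only the cropped regions would ever reach a (much more expensive) recognition model at full resolution.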

==================================

For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT

136 views, 08:43

ML Research Hub

🔹 Title: MinerU: An Open-Source Solution for Precise Document Content Extraction

📝 Summary:
MinerU is an open-source solution for high-precision document content extraction. It leverages fine-tuned models and pre/postprocessing rules to achieve consistent accuracy across diverse document types, addressing challenges in existing tools.

🔹 Publication Date: Published on Sep 27, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.18839
• PDF: https://arxiv.org/pdf/2409.18839
• Space: https://huggingface.co/spaces/Echo9k/PDF_reader
• Github: https://github.com/opendatalab/MinerU

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test

98 views, 08:44

ML Research Hub

🔹 Title: IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

📝 Summary:
IndexTTS enhances XTTS and Tortoise for superior naturalness and zero-shot voice cloning. It uses hybrid character-pinyin modeling and optimized VQ, offering controllable, efficient TTS with better performance.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.05512
• PDF: https://arxiv.org/pdf/2502.05512
• Github: https://github.com/index-tts/index-tts

🔹 Models citing this paper:
• https://huggingface.co/IndexTeam/IndexTTS-2
• https://huggingface.co/IndexTeam/Index-TTS
• https://huggingface.co/Toxzic/indextts-colab

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/IndexTeam/IndexTTS
• https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
• https://huggingface.co/spaces/jairwaal/image

102 views, 08:45

ML Research Hub

🔹 Title: DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

📝 Summary:
DeepAnalyze-8B is an agentic LLM that autonomously completes the entire data science pipeline. It uses curriculum-based training and data-grounded trajectory synthesis. DeepAnalyze-8B outperforms prior workflow-based agents on various data tasks.

🔹 Publication Date: Published on Oct 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.16872
• PDF: https://arxiv.org/pdf/2510.16872
• Project Page: https://ruc-deepanalyze.github.io/
• Github: https://github.com/ruc-datalab/DeepAnalyze

🔹 Models citing this paper:
• https://huggingface.co/RUC-DataLab/DeepAnalyze-8B

🔹 Datasets citing this paper:
• https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
• https://huggingface.co/datasets/fantos/DataScience-Instruct-500K

119 views, 08:45

ML Research Hub

🔹 Title: DeepAgent: A General Reasoning Agent with Scalable Toolsets

📝 Summary:
DeepAgent is an end-to-end deep reasoning agent that autonomously performs thinking, tool discovery, and action execution. It uses memory folding and an RL strategy, ToolPO, to learn tool use and manage interactions. DeepAgent significantly outperforms baselines on diverse tool-use and application ...

🔹 Publication Date: Published on Oct 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.21618
• PDF: https://arxiv.org/pdf/2510.21618
• Github: https://github.com/RUC-NLPIR/DeepAgent

99 views, 08:46

ML Research Hub

🔹 Title: ReCode: Unify Plan and Action for Universal Granularity Control

📝 Summary:
ReCode unifies LLM agent planning and action through recursive code generation. It treats plans as functions decomposed into primitive actions, enabling dynamic granularity control. This boosts performance and data efficiency.

🔹 Publication Date: Published on Oct 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.23564
• PDF: https://arxiv.org/pdf/2510.23564
• Github: https://github.com/FoundationAgents/ReCode
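
A toy sketch of "plans as functions decomposed into primitive actions". The lookup table below is a hypothetical stand-in for the LLM call that would generate each expansion, and all step names are made up for illustration; the ReCode repository is the reference for the real method.

```python
from typing import Dict, List

# Hypothetical primitive actions the environment can execute directly.
PRIMITIVES = {"move", "pick", "place"}

# Hypothetical stand-in for the LLM: maps a placeholder plan to its sub-steps.
DECOMPOSE: Dict[str, List[str]] = {
    "tidy_desk": ["store_item", "store_item"],
    "store_item": ["move", "pick", "move", "place"],
}


def execute(step: str, trace: List[str], depth: int = 0, max_depth: int = 8) -> None:
    """Run `step`: emit it if primitive, otherwise expand it and recurse."""
    if step in PRIMITIVES:
        trace.append(step)
        return
    if depth >= max_depth:
        raise RecursionError(f"plan '{step}' did not reach primitives")
    for sub_step in DECOMPOSE[step]:
        execute(sub_step, trace, depth + 1, max_depth)
```

Calling `execute("tidy_desk", trace)` fills `trace` with primitive actions only; the same function handles coarse plans and fine-grained actions, which is the granularity control the summary refers to.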

92 views, 08:47

ML Research Hub

🔹 Title: WebDancer: Towards Autonomous Information Seeking Agency

📝 Summary:
This paper presents WebDancer, a four-stage training paradigm for autonomous information-seeking agents. It combines browsing data construction, trajectory sampling, supervised fine-tuning, and reinforcement learning to achieve strong performance on challenging benchmarks.

🔹 Publication Date: Published on May 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.22648
• PDF: https://arxiv.org/pdf/2505.22648
• Github: https://github.com/Alibaba-NLP/WebAgent

🔹 Models citing this paper:
• https://huggingface.co/Alibaba-NLP/WebDancer-32B

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/frucht/Alibaba-NLP-WebDancer-32B

103 views, 08:48

ML Research Hub

🔹 Title: Scaling Agents via Continual Pre-training

📝 Summary:
AgentFounder proposes Agentic Continual Pre-training to build powerful agentic foundation models. This resolves post-training optimization issues, achieving state-of-the-art agentic performance with strong tool-use.

🔹 Publication Date: Published on Sep 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13310
• PDF: https://arxiv.org/pdf/2509.13310
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/

83 views, 08:49

ML Research Hub

🔹 Title: WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

📝 Summary:
WebWeaver is a dual-agent framework for open-ended deep research. It uses adaptive planning to create dynamic outlines and focused synthesis to write reports, avoiding long-context issues. This approach achieves state-of-the-art results on OEDR benchmarks.

🔹 Publication Date: Published on Sep 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13312
• PDF: https://arxiv.org/pdf/2509.13312
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/

80 views, 08:49

ML Research Hub

🔹 Title: ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

📝 Summary:
ReSum enhances LLM web agents by using periodic context summarization to overcome context window limitations. It converts interaction histories into compact reasoning states, enabling indefinite exploration for knowledge-intensive tasks. This paradigm achieves significant performance improvements...

🔹 Publication Date: Published on Sep 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13313
• PDF: https://arxiv.org/pdf/2509.13313
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/
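
The periodic-summarization loop can be sketched as follows. Everything here is an assumption for illustration: the character budget as the trigger, the `summarize` callable standing in for a summarizer model, and the function name itself. The source does not say how ReSum decides when to summarize.

```python
from typing import Callable, List


def run_with_summaries(
    observations: List[str],
    summarize: Callable[[List[str]], str],
    budget_chars: int,
) -> List[str]:
    """Append each observation; when the history exceeds the budget,
    replace the whole history with a single compact summary."""
    context: List[str] = []
    for observation in observations:
        context.append(observation)
        if sum(len(message) for message in context) > budget_chars:
            context = [summarize(context)]
    return context


def toy_summarize(messages: List[str]) -> str:
    """Toy summarizer (an assumption): just reports how much it compressed."""
    return f"summary of {len(messages)} messages"
```

Because the history is reset to a short state each time the budget is hit, the loop can keep running past the point where the raw interaction log would no longer fit in the context window.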

92 views, 08:50

ML Research Hub

🔹 Title: WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

📝 Summary:
WebSailor is a post-training method that enables open-source AI models to match the performance of proprietary agents in complex information-seeking tasks. It does this by instilling the ability to systematically reduce uncertainty, closing a key capability gap.

🔹 Publication Date: Published on Sep 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13305
• PDF: https://arxiv.org/pdf/2509.13305
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/

117 views, 08:50

ML Research Hub

🔹 Title: WebSailor: Navigating Super-human Reasoning for Web Agent

📝 Summary:
WebSailor is a post-training method that teaches open-source LLMs to reduce extreme uncertainty in complex information-seeking tasks. It matches the superhuman reasoning of proprietary agents, closing the capability gap.

🔹 Publication Date: Published on Jul 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.02592
• PDF: https://arxiv.org/pdf/2507.02592
• Project Page: https://github.com/Alibaba-NLP/WebAgent
• Github: https://github.com/Alibaba-NLP/WebAgent

187 views, 08:51

ML Research Hub

✨ Title: ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

📝 Summary:
ThinkMorph is a unified model that enhances multimodal reasoning by generating complementary text-image steps that manipulate visual content with coherent verbal logic. It achieves significant performance gains, generalizes effectively, and demonstrates emergent multimodal intelligence, including...

🔹 Publication Date: Published on Oct 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27492
• PDF: https://arxiv.org/pdf/2510.27492
• Project Page: https://thinkmorph.github.io/
• Github: https://github.com/ThinkMorph/ThinkMorph

🔹 Models citing this paper:
• https://huggingface.co/ThinkMorph/ThinkMorph-7B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/ThinkMorph/Jigsaw_Assembly
• https://huggingface.co/datasets/ThinkMorph/Visual_Search
• https://huggingface.co/datasets/ThinkMorph/Chart_Refocus

144 views, 09:53

ML Research Hub

✨ Title: OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

📝 Summary:
OS-Sentinel is a new hybrid framework that improves safety detection for mobile AI agents. It combines a Formal Verifier with a VLM-based Contextual Judge to identify both explicit system violations and contextual risks, showing significant performance gains.

🔹 Publication Date: Published on Oct 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.24411
• PDF: https://arxiv.org/pdf/2510.24411
• Github: https://github.com/OS-Copilot/OS-Sentinel

114 views, 09:53

ML Research Hub

✨ Title: INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

📝 Summary:
This paper compares FP and INT quantization, challenging the trend towards FP. It finds fine-grained MXINT8 outperforms FP in 8-bit formats for accuracy and efficiency. For 4-bit, FP often leads, but INT can surpass it, suggesting fine-grained INT offers a better balance for future AI accelerators.

🔹 Publication Date: Published on Oct 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.25602
• PDF: https://arxiv.org/pdf/2510.25602
• Github: https://github.com/ChenMnZ/INT_vs_FP
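
To see why "fine-grained" matters for integer formats, here is a minimal sketch of block-wise INT8 quantization with one shared power-of-two scale per block. The block size of 32 and the power-of-two scale follow the general microscaling (MX) idea as an assumption; this is not the paper's code, and all function names are illustrative.

```python
import math
from typing import List, Tuple


def quantize_block(block: List[float], bits: int = 8) -> Tuple[float, List[int]]:
    """Quantize one block to signed integers with a shared power-of-two scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    amax = max(abs(v) for v in block)
    if amax == 0:
        return 1.0, [0] * len(block)
    # Smallest power-of-two scale that keeps amax / scale within qmax.
    scale = 2.0 ** math.ceil(math.log2(amax / qmax))
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in block]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    return [scale * c for c in codes]


def quantize(values: List[float], block_size: int = 32) -> List[Tuple[float, List[int]]]:
    """Split `values` into blocks and quantize each with its own scale."""
    return [
        quantize_block(values[i:i + block_size])
        for i in range(0, len(values), block_size)
    ]
```

With `values = [0.01] * 32 + [100.0] * 32`, a single scale for the whole tensor rounds every 0.01 to zero, while per-block scales keep both the small and the large values representable.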

92 views, 09:53

ML Research Hub

✨ Title: π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

📝 Summary:
π_RL enables online RL fine-tuning for flow-based VLA models, overcoming their unique RL challenges. It uses novel algorithms to significantly boost VLA model performance and generalization on robotic tasks.

🔹 Publication Date: Published on Oct 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.25889
• PDF: https://arxiv.org/pdf/2510.25889
• Project Page: https://rlinf.readthedocs.io/en/latest/rst_source/examples/pi0.html
• Github: https://github.com/RLinf/RLinf

94 views, 09:54

ML Research Hub

✨ Title: Continuous Autoregressive Language Models

📝 Summary:
LLM efficiency is hampered by sequential token generation. Continuous Autoregressive Language Models (CALM) address this by predicting continuous vectors, each representing multiple tokens. This significantly reduces generative steps, boosting efficiency and establishing a scalable path for ultra-e...

🔹 Publication Date: Published on Oct 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27688
• PDF: https://arxiv.org/pdf/2510.27688
• Project Page: https://shaochenze.github.io/blog/2025/CALM/
• Github: https://shaochenze.github.io/blog/2025/CALM

🔹 Models citing this paper:
• https://huggingface.co/cccczshao/CALM-M
• https://huggingface.co/cccczshao/CALM-L
• https://huggingface.co/cccczshao/CALM-XL

128 views, 09:54

ML Research Hub

✨ Title: Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

📝 Summary:
Spatial-SSRL is a self-supervised reinforcement learning method that enhances LVLM spatial understanding. It uses five pretext tasks derived from RGB or RGB-D images to generate verifiable signals, avoiding costly human supervision. This approach significantly improves spatial reasoning while mai...

🔹 Publication Date: Published on Oct 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27606
• PDF: https://arxiv.org/pdf/2510.27606
• Github: https://github.com/InternLM/Spatial-SSRL

🔹 Models citing this paper:
• https://huggingface.co/internlm/Spatial-SSRL-7B

✨ Datasets citing this paper:
• https://huggingface.co/datasets/internlm/Spatial-SSRL-81k

139 views, 09:54

ML Research Hub

✨ Title: HyperClick: Advancing Reliable GUI Grounding via Uncertainty Calibration

📝 Summary:
GUI agents are overconfident and unreliable in grounding. HyperClick improves reliability through a dual reward mechanism that calibrates spatial confidence, reducing overconfidence. It achieves state-of-the-art performance for dependable GUI automation.

🔹 Publication Date: Published on Oct 31

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27266
• PDF: https://arxiv.org/pdf/2510.27266
• Github: https://github.com/xiaomi-research/hyperclick

86 views, 09:54

ML Research Hub

✨ Title: Defeating the Training-Inference Mismatch via FP16

📝 Summary:
RL fine-tuning of LLMs is unstable due to a numerical mismatch caused by BF16's rounding errors. We found that simply using FP16 effectively resolves this issue, leading to more stable optimization, faster convergence, and stronger performance. This simple change requires no model or algorithm mod...

🔹 Publication Date: Published on Oct 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26788
• PDF: https://arxiv.org/pdf/2510.26788
• Github: https://github.com/sail-sg/Precision-RL
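
The rounding gap between the two formats is easy to check by hand: FP16 stores 10 mantissa bits, BF16 only 7. The sketch below rounds a Python float to each format using only the standard library. It illustrates the precision difference and nothing more; it does not reproduce the paper's experiments, and the helper names are made up. The BF16 helper assumes finite values well inside the float32 range.

```python
import struct


def to_fp16(x: float) -> float:
    """Round to IEEE half precision via the struct 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Round to bfloat16: keep the top 16 bits of the float32 encoding,
    rounding the discarded low 16 bits to nearest, ties to even."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even bias
    bits &= 0xFFFF0000                   # drop the low 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rounding_errors(x: float) -> tuple:
    """Absolute rounding error of x in (FP16, BF16)."""
    return abs(to_fp16(x) - x), abs(to_bf16(x) - x)
```

For a probability-like value such as 0.1234567, the BF16 error is several times larger than the FP16 error, which gives a feel for how small per-token discrepancies can arise when two code paths round differently.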

98 views, 09:55