🔹 Title: MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
📝 Summary:
MinerU2.5 is a document parsing model using a two-stage coarse-to-fine strategy. It first analyzes layout on downsampled images, then recognizes content on native-resolution crops. This achieves state-of-the-art accuracy with high efficiency.
🔹 Publication Date: Published on Sep 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.22186
• PDF: https://arxiv.org/pdf/2509.22186
• Project Page: https://opendatalab.github.io/MinerU/
• Github: https://github.com/opendatalab/MinerU
🔹 Models citing this paper:
• https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B
• https://huggingface.co/freakynit/MinerU2.5-2509-1.2B
• https://huggingface.co/Mungert/MinerU2.5-2509-1.2B-GGUF
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test
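The coarse-to-fine strategy in the summary can be sketched roughly as follows. This is a minimal illustration, not MinerU2.5's actual code: `detect_layout` and `recognize` are hypothetical stand-ins for the model's two stages, the page is a plain nested list of pixels, and the downsampling factor is an arbitrary assumption.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]           # grayscale page as rows of pixel values
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive


def downsample(page: Image, factor: int) -> Image:
    """Keep every `factor`-th pixel in both directions (nearest neighbour)."""
    return [row[::factor] for row in page[::factor]]


def crop(page: Image, box: Box) -> Image:
    """Cut a region out of the page at its original resolution."""
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]


def parse_page(
    page: Image,
    detect_layout: Callable[[Image], List[Box]],  # hypothetical stage-1 model
    recognize: Callable[[Image], object],         # hypothetical stage-2 model
    factor: int = 4,                              # assumed downsampling factor
) -> List[object]:
    # Stage 1: layout analysis on a cheap, downsampled copy of the page.
    thumbnail = downsample(page, factor)
    results = []
    for left, top, right, bottom in detect_layout(thumbnail):
        # Map the box back to native coordinates before cropping.
        native_box = (left * factor, top * factor, right * factor, bottom * factor)
        # Stage 2: content recognition on the native-resolution crop.
        results.append(recognize(crop(page, native_box)))
    return results
```

The point of the split is that the expensive recognition step only ever sees small native-resolution crops, while the full page is processed once at low resolution.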
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
arXiv.org
MinerU2.5: A Decoupled Vision-Language Model for Efficient...
We introduce MinerU2.5, a 1.2B-parameter document parsing vision-language model that achieves state-of-the-art recognition accuracy while maintaining exceptional computational efficiency. Our...
🔹 Title: MinerU: An Open-Source Solution for Precise Document Content Extraction
📝 Summary:
MinerU is an open-source solution for high-precision document content extraction. It leverages fine-tuned models and pre/postprocessing rules to achieve consistent accuracy across diverse document types, addressing challenges in existing tools.
🔹 Publication Date: Published on Sep 27, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.18839
• PDF: https://arxiv.org/pdf/2409.18839
• Github: https://github.com/opendatalab/MinerU
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test
==================================
🔹 Title: IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
📝 Summary:
IndexTTS enhances XTTS and Tortoise for superior naturalness and zero-shot voice cloning. It uses hybrid character-pinyin modeling and optimized VQ, offering controllable, efficient TTS with better performance.
🔹 Publication Date: Published on Feb 8
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.05512
• PDF: https://arxiv.org/pdf/2502.05512
• Github: https://github.com/index-tts/index-tts
🔹 Models citing this paper:
• https://huggingface.co/IndexTeam/IndexTTS-2
• https://huggingface.co/IndexTeam/Index-TTS
• https://huggingface.co/Toxzic/indextts-colab
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/IndexTeam/IndexTTS
• https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
• https://huggingface.co/spaces/jairwaal/image
==================================
🔹 Title: DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
📝 Summary:
DeepAnalyze-8B is an agentic LLM that autonomously completes the entire data science pipeline. It uses curriculum-based training and data-grounded trajectory synthesis. DeepAnalyze-8B outperforms prior workflow-based agents on various data tasks.
🔹 Publication Date: Published on Oct 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.16872
• PDF: https://arxiv.org/pdf/2510.16872
• Project Page: https://ruc-deepanalyze.github.io/
• Github: https://github.com/ruc-datalab/DeepAnalyze
🔹 Models citing this paper:
• https://huggingface.co/RUC-DataLab/DeepAnalyze-8B
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
• https://huggingface.co/datasets/fantos/DataScience-Instruct-500K
==================================
🔹 Title: DeepAgent: A General Reasoning Agent with Scalable Toolsets
📝 Summary:
DeepAgent is an end-to-end deep reasoning agent that autonomously performs thinking, tool discovery, and action execution. It uses memory folding and an RL strategy, ToolPO, to learn tool use and manage interactions. DeepAgent significantly outperforms baselines on diverse tool-use and application ...
🔹 Publication Date: Published on Oct 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.21618
• PDF: https://arxiv.org/pdf/2510.21618
• Github: https://github.com/RUC-NLPIR/DeepAgent
==================================
🔹 Title: ReCode: Unify Plan and Action for Universal Granularity Control
📝 Summary:
ReCode unifies LLM agent planning and action through recursive code generation. It treats plans as functions decomposed into primitive actions, enabling dynamic granularity control. This boosts performance and data efficiency.
🔹 Publication Date: Published on Oct 27
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.23564
• PDF: https://arxiv.org/pdf/2510.23564
• Github: https://github.com/FoundationAgents/ReCode
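The idea of treating plans as functions that are recursively decomposed into primitive actions can be sketched as below. This is a toy illustration under stated assumptions, not ReCode's implementation: the primitive names, the `TOY_EXPANSIONS` table (standing in for an LLM that would generate the sub-steps), and the depth limit are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical primitive actions the environment can execute directly.
PRIMITIVES = {"goto", "pick", "place"}


def run(
    step: str,
    expand: Callable[[str], List[str]],  # stand-in for LLM code generation
    trace: List[str],
    depth: int = 0,
    max_depth: int = 5,                  # assumed safety limit
) -> None:
    """Execute `step` if it is primitive, otherwise expand it and recurse."""
    name = step.split("(")[0]
    if name in PRIMITIVES:
        trace.append(step)  # "executing" here just records the action
        return
    if depth >= max_depth:
        raise RecursionError(f"could not ground {step!r} within {max_depth} levels")
    for sub_step in expand(step):
        run(sub_step, expand, trace, depth + 1, max_depth)


# Toy expansion table: a high-level plan and its finer-grained sub-plans.
TOY_EXPANSIONS: Dict[str, List[str]] = {
    "tidy_desk()": ["move(cup, shelf)", "move(book, shelf)"],
    "move(cup, shelf)": ["goto(cup)", "pick(cup)", "goto(shelf)", "place(cup)"],
    "move(book, shelf)": ["goto(book)", "pick(book)", "goto(shelf)", "place(book)"],
}

trace: List[str] = []
run("tidy_desk()", TOY_EXPANSIONS.__getitem__, trace)
```

Granularity is controlled by where expansion stops: a step is only broken down further if it is not already something the environment can execute.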
==================================
🔹 Title: WebDancer: Towards Autonomous Information Seeking Agency
📝 Summary:
This paper presents WebDancer, a four-stage training paradigm for autonomous information seeking agents. It combines data construction, supervised fine-tuning, and reinforcement learning to achieve strong performance on challenging benchmarks.
🔹 Publication Date: Published on May 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2505.22648
• PDF: https://arxiv.org/pdf/2505.22648
• Github: https://github.com/Alibaba-NLP/WebAgent
🔹 Models citing this paper:
• https://huggingface.co/Alibaba-NLP/WebDancer-32B
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/frucht/Alibaba-NLP-WebDancer-32B
==================================
🔹 Title: Scaling Agents via Continual Pre-training
📝 Summary:
AgentFounder proposes Agentic Continual Pre-training to build powerful agentic foundation models. This resolves post-training optimization issues, achieving state-of-the-art agentic performance with strong tool-use.
🔹 Publication Date: Published on Sep 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13310
• PDF: https://arxiv.org/pdf/2509.13310
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/
==================================
🔹 Title: WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research
📝 Summary:
WebWeaver is a dual-agent framework for open-ended deep research. It uses adaptive planning to create dynamic outlines and focused synthesis to write reports, avoiding long-context issues. This approach achieves state-of-the-art results on OEDR benchmarks.
🔹 Publication Date: Published on Sep 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13312
• PDF: https://arxiv.org/pdf/2509.13312
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/
==================================
🔹 Title: ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
📝 Summary:
ReSum enhances LLM web agents by using periodic context summarization to overcome context window limitations. It converts interaction histories into compact reasoning states, enabling indefinite exploration for knowledge-intensive tasks. This paradigm achieves significant performance improvements...
🔹 Publication Date: Published on Sep 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13313
• PDF: https://arxiv.org/pdf/2509.13313
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/
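Periodic context summarization can be pictured with a small sketch like the one below. It is only an illustration of the general idea: the `summarize` callable stands in for an LLM summarizer, and the word-count budget is a hypothetical proxy for a real token limit; neither is taken from the paper.

```python
from typing import Callable, List


def step_with_summary(
    history: List[str],
    observation: str,
    summarize: Callable[[List[str]], str],  # hypothetical LLM summarizer
    budget_words: int = 50,                 # assumed stand-in for a token budget
) -> List[str]:
    """Append an observation; if the history outgrows the budget,
    replace the whole history with a single compact summary."""
    history = history + [observation]
    used = sum(len(entry.split()) for entry in history)
    if used > budget_words:
        history = [summarize(history)]
    return history
```

Because the history is collapsed whenever it exceeds the budget, the agent can keep exploring without its context growing without bound; how much is lost depends entirely on the quality of the summarizer.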
==================================
🔹 Title: WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
📝 Summary:
WebSailor is a post-training method that enables open-source AI models to match the performance of proprietary agents in complex information-seeking tasks. It does this by instilling the ability to systematically reduce uncertainty, closing a key capability gap.
🔹 Publication Date: Published on Sep 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13305
• PDF: https://arxiv.org/pdf/2509.13305
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/
==================================
🔹 Title: WebSailor: Navigating Super-human Reasoning for Web Agent
📝 Summary:
WebSailor is a post-training method that teaches open-source LLMs to reduce extreme uncertainty in complex information-seeking tasks. It matches the superhuman reasoning of proprietary agents, closing the capability gap.
🔹 Publication Date: Published on Jul 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.02592
• PDF: https://arxiv.org/pdf/2507.02592
• Project Page: https://github.com/Alibaba-NLP/WebAgent
• Github: https://github.com/Alibaba-NLP/WebAgent
==================================
✨ Title: ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
📝 Summary:
ThinkMorph is a unified model that enhances multimodal reasoning by generating complementary text-image steps that manipulate visual content with coherent verbal logic. It achieves significant performance gains, generalizes effectively, and demonstrates emergent multimodal intelligence, including...
🔹 Publication Date: Published on Oct 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27492
• PDF: https://arxiv.org/pdf/2510.27492
• Project Page: https://thinkmorph.github.io/
• Github: https://github.com/ThinkMorph/ThinkMorph
🔹 Models citing this paper:
• https://huggingface.co/ThinkMorph/ThinkMorph-7B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/ThinkMorph/Jigsaw_Assembly
• https://huggingface.co/datasets/ThinkMorph/Visual_Search
• https://huggingface.co/datasets/ThinkMorph/Chart_Refocus
==================================
arXiv.org
ThinkMorph: Emergent Properties in Multimodal Interleaved...
Multimodal reasoning requires iterative coordination between language and vision, yet it remains unclear what constitutes a meaningful interleaved chain of thought. We posit that text and image...
✨ Title: OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
📝 Summary:
OS-Sentinel is a new hybrid framework that improves safety detection for mobile AI agents. It combines a Formal Verifier with a VLM-based Contextual Judge to identify both explicit system violations and contextual risks, showing significant performance gains.
🔹 Publication Date: Published on Oct 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.24411
• PDF: https://arxiv.org/pdf/2510.24411
• Github: https://github.com/OS-Copilot/OS-Sentinel
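A hybrid check of this kind, deterministic rules for explicit violations plus a model-based judge for contextual risk, could be wired together roughly as follows. Everything here is a hypothetical sketch: the rule names, the system-state keys, and the `contextual_judge` callable (a stand-in for a VLM) are invented for illustration and are not OS-Sentinel's API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical rule set: (rule name, predicate over a system-state dict).
RULES: List[Tuple[str, Callable[[Dict], bool]]] = [
    ("sends_sms_without_consent",
     lambda s: s.get("sms_sent", False) and not s.get("user_confirmed", False)),
    ("touches_protected_path",
     lambda s: any(p.startswith("/data/system") for p in s.get("files_written", []))),
]


def formal_verifier(state: Dict) -> List[str]:
    """Return the names of all deterministic rules the state violates."""
    return [name for name, violated in RULES if violated(state)]


def assess_step(
    state: Dict,
    screen_text: str,
    contextual_judge: Callable[[str], bool],  # stand-in for a VLM-based judge
) -> Dict:
    """Flag a step as unsafe if either the rules or the judge object."""
    violations = formal_verifier(state)
    contextual_risk = contextual_judge(screen_text)
    return {
        "unsafe": bool(violations) or contextual_risk,
        "violations": violations,
        "contextual_risk": contextual_risk,
    }
```

The two components cover different failure modes: the rules catch violations that are visible in system state, while the judge is meant to catch risks that only show up in what is on screen.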
==================================
✨ Title: INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
📝 Summary:
This paper compares FP and INT quantization, challenging the trend towards FP. It finds fine-grained MXINT8 outperforms FP in 8-bit formats for accuracy and efficiency. For 4-bit, FP often leads, but INT can surpass it, suggesting fine-grained INT offers a better balance for future AI accelerators.
🔹 Publication Date: Published on Oct 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.25602
• PDF: https://arxiv.org/pdf/2510.25602
• Github: https://github.com/ChenMnZ/INT_vs_FP
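To make "fine-grained" INT8 quantization concrete, here is a minimal NumPy sketch of block-wise quantization with one shared scale per small block. The block size of 32 and the power-of-two scale are assumptions modelled on microscaling-style formats; the function names are hypothetical and this is not the paper's code.

```python
import numpy as np


def quantize_blockwise_int8(x: np.ndarray, block: int = 32):
    """Quantize a 1-D float array to int8 with one power-of-two scale per block."""
    if x.size % block != 0:
        raise ValueError("array length must be a multiple of the block size")
    blocks = x.reshape(-1, block).astype(np.float32)
    max_abs = np.abs(blocks).max(axis=1, keepdims=True).astype(np.float64)
    safe_max = np.where(max_abs > 0, max_abs, 1.0)  # avoid log2(0) on all-zero blocks
    # Smallest power-of-two scale such that max_abs / scale <= 127.
    scale = np.exp2(np.ceil(np.log2(safe_max / 127.0))).astype(np.float32)
    q = np.clip(np.rint(blocks / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Map int8 codes back to float32 using the per-block scales."""
    return q.astype(np.float32) * scale
```

Because each block gets its own scale, a single outlier only coarsens the 32 values that share its block instead of the whole tensor, which is the usual argument for fine-grained integer formats.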
==================================
✨ Title: π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
📝 Summary:
π_RL enables online RL fine-tuning for flow-based VLA models, overcoming their unique RL challenges. It uses novel algorithms to significantly boost VLA model performance and generalization on robotic tasks.
🔹 Publication Date: Published on Oct 29
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.25889
• PDF: https://arxiv.org/pdf/2510.25889
• Project Page: https://rlinf.readthedocs.io/en/latest/rst_source/examples/pi0.html
• Github: https://github.com/RLinf/RLinf
==================================
✨ Title: Continuous Autoregressive Language Models
📝 Summary:
LLM efficiency is hampered by sequential token generation. Continuous Autoregressive Language Models (CALM) address this by predicting continuous vectors, each representing multiple tokens. This significantly reduces generative steps, boosting efficiency and establishing a scalable path for ultra-e...
🔹 Publication Date: Published on Oct 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27688
• PDF: https://arxiv.org/pdf/2510.27688
• Project Page: https://shaochenze.github.io/blog/2025/CALM/
• Github: https://shaochenze.github.io/blog/2025/CALM
🔹 Models citing this paper:
• https://huggingface.co/cccczshao/CALM-M
• https://huggingface.co/cccczshao/CALM-L
• https://huggingface.co/cccczshao/CALM-XL
==================================
✨ Title: Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
📝 Summary:
Spatial-SSRL is a self-supervised reinforcement learning method that enhances LVLM spatial understanding. It uses five pretext tasks derived from RGB or RGB-D images to generate verifiable signals, avoiding costly human supervision. This approach significantly improves spatial reasoning while mai...
🔹 Publication Date: Published on Oct 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27606
• PDF: https://arxiv.org/pdf/2510.27606
• Github: https://github.com/InternLM/Spatial-SSRL
🔹 Models citing this paper:
• https://huggingface.co/internlm/Spatial-SSRL-7B
✨ Datasets citing this paper:
• https://huggingface.co/datasets/internlm/Spatial-SSRL-81k
==================================
✨ Title: HyperClick: Advancing Reliable GUI Grounding via Uncertainty Calibration
📝 Summary:
GUI agents are overconfident and unreliable in grounding. HyperClick improves reliability through a dual reward mechanism that calibrates spatial confidence and reduces overconfidence. It achieves state-of-the-art performance for dependable GUI automation.
🔹 Publication Date: Published on Oct 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27266
• PDF: https://arxiv.org/pdf/2510.27266
• Github: https://github.com/xiaomi-research/hyperclick
==================================
✨ Title: Defeating the Training-Inference Mismatch via FP16
📝 Summary:
RL fine-tuning of LLMs is unstable due to a numerical mismatch between training and inference caused by BF16's rounding errors. We found that simply using FP16 effectively resolves this issue, leading to more stable optimization, faster convergence, and stronger performance. This simple change requires no model or algorithm mod...
🔹 Publication Date: Published on Oct 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26788
• PDF: https://arxiv.org/pdf/2510.26788
• Github: https://github.com/sail-sg/Precision-RL
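The precision gap behind this claim is easy to see numerically: FP16 stores 10 explicit mantissa bits, BF16 only 7, so BF16 rounds more coarsely at the same magnitude. The sketch below only illustrates that gap and does not reproduce the paper's training setup. NumPy has no native bfloat16, so `to_bf16` is a hypothetical helper that emulates it by rounding float32 bit patterns to their upper 16 bits.

```python
import numpy as np


def to_bf16(x) -> np.ndarray:
    """Emulate bfloat16: round float32 values to their top 16 bits
    (round-to-nearest, ties to even) and return the result as float32."""
    bits = np.atleast_1d(np.asarray(x, dtype=np.float32)).view(np.uint32)
    rounding_bias = ((bits >> 16) & 1) + np.uint32(0x7FFF)
    rounded = (bits + rounding_bias) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)


def rounding_errors(x):
    """Absolute rounding error of each value when stored as FP16 vs. BF16."""
    x32 = np.atleast_1d(np.asarray(x, dtype=np.float32))
    fp16_err = np.abs(x32.astype(np.float16).astype(np.float32) - x32)
    bf16_err = np.abs(to_bf16(x32) - x32)
    return fp16_err, bf16_err


# 1 + 2**-9 fits exactly in FP16 but falls between two BF16 values.
fp16_err, bf16_err = rounding_errors(1.0 + 2.0 ** -9)
```

For values in [1, 2), neighbouring FP16 numbers are 2**-10 apart while neighbouring BF16 numbers are 2**-7 apart, so the worst-case rounding error differs by a factor of eight. The trade-off is range: FP16 has 5 exponent bits against BF16's 8.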
==================================