🔹 Title: π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
🔹 Publication Date: Published on Oct 29, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.25889
• PDF: https://arxiv.org/pdf/2510.25889
• Project Page: https://rlinf.readthedocs.io/en/latest/rst_source/examples/pi0.html
• Github: https://github.com/RLinf/RLinf
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
🔹 Title: Mask-to-Height: A YOLOv11-Based Architecture for Joint Building Instance Segmentation and Height Classification from Satellite Imagery
🔹 Publication Date: Published on Oct 31, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.27224
• PDF: https://arxiv.org/pdf/2510.27224
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Agent Lightning: Train ANY AI Agents with Reinforcement Learning
📝 Summary:
Agent Lightning is a flexible RL framework for training LLMs in any AI agent. It uniquely decouples agent execution from training, allowing seamless integration with diverse existing agents with minimal code changes. This enables robust training for complex interactions and shows stable performance.
🔹 Publication Date: Published on Aug 5, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03680
• PDF: https://arxiv.org/pdf/2508.03680
• Project Page: https://www.microsoft.com/en-us/research/project/agent-lightning/
• Github: https://github.com/microsoft/agent-lightning
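The decoupling idea can be illustrated with a toy sketch (all names here are illustrative, not Agent Lightning's actual API): the agent runs as ordinary code, and a collector records (prompt, response, reward) transitions that an RL trainer can consume separately.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Transition:
    prompt: str
    response: str
    reward: float

@dataclass
class TraceCollector:
    """Records transitions emitted by any agent, keeping the
    RL training loop separate from agent execution."""
    transitions: List[Transition] = field(default_factory=list)

    def record(self, prompt: str, response: str, reward: float) -> None:
        self.transitions.append(Transition(prompt, response, reward))

def run_agent(task: str, llm: Callable[[str], str], collector: TraceCollector) -> str:
    # The agent stays an ordinary function; instrumentation is one extra call.
    response = llm(task)
    reward = 1.0 if "42" in response else 0.0  # stand-in reward signal
    collector.record(task, response, reward)
    return response

collector = TraceCollector()
run_agent("What is 6 * 7?", lambda p: "The answer is 42.", collector)
print(len(collector.transitions), collector.transitions[0].reward)  # → 1 1.0
```

Because the collector only sees transitions, any existing agent can be instrumented this way without restructuring it around a trainer.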
==================================
🔹 Title: Kimi Linear: An Expressive, Efficient Attention Architecture
📝 Summary:
Kimi Linear is a new hybrid linear attention architecture that, for the first time, outperforms full attention across various contexts. It achieves superior performance and efficiency, reducing KV cache and increasing decoding throughput, making it a powerful drop-in replacement.
🔹 Publication Date: Published on Oct 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26692
• PDF: https://arxiv.org/pdf/2510.26692
• Github: https://github.com/MoonshotAI/Kimi-Linear
🔹 Models citing this paper:
• https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
• https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Base
• https://huggingface.co/aiqtech/Kimi-Linear-48B-A3B-Instruct
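A minimal numpy sketch of the generic linear-attention recurrence behind such architectures (illustrative, not Kimi Linear's specific mechanism): decoding keeps a fixed-size state instead of a KV cache that grows with sequence length.

```python
import numpy as np

def linear_attention_step(state, q, k, v):
    """One decoding step of (unnormalized) linear attention: the running
    state accumulates k v^T, so memory stays O(d*d) per head instead of
    growing with sequence length like a softmax KV cache."""
    state = state + np.outer(k, v)  # update recurrent state
    out = q @ state                 # read with the current query
    return state, out

d = 4
rng = np.random.default_rng(0)
state = np.zeros((d, d))
for _ in range(1000):               # 1000 decode steps
    q, k, v = rng.normal(size=(3, d))
    state, out = linear_attention_step(state, q, k, v)
print(state.shape)  # → (4, 4): the state never grows
```

The constant-size state is what yields the reduced KV cache and higher decoding throughput claimed above.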
==================================
🔹 Title: Emu3.5: Native Multimodal Models are World Learners
📝 Summary:
Emu3.5 is a multimodal world model natively predicting vision and language states. Trained on vast video data, it uses Discrete Diffusion Adaptation for 20x faster image inference. It excels at multimodal generation, world modeling, and performs competitively.
🔹 Publication Date: Published on Oct 30, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26583
• PDF: https://arxiv.org/pdf/2510.26583
• Project Page: https://emu.world/
• Github: https://github.com/baaivision/Emu3.5
🔹 Models citing this paper:
• https://huggingface.co/BAAI/Emu3.5
• https://huggingface.co/BAAI/Emu3.5-Image
• https://huggingface.co/BAAI/Emu3.5-VisionTokenizer
==================================
🔹 Title: olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models
📝 Summary:
olmOCR is an open-source toolkit using a fine-tuned vision language model to convert diverse PDFs into clean, structured plain text. It preserves formatting like tables and equations, and is optimized for cost-effective large-scale batch processing, unlocking tokens for language model training.
🔹 Publication Date: Published on Feb 25, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.18443
• PDF: https://arxiv.org/pdf/2502.18443
• Github: https://github.com/allenai/olmocr
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/davanstrien/test-olmocr2
• https://huggingface.co/datasets/davanstrien/newspapers-olmocr2
• https://huggingface.co/datasets/stckmn/ocr-output-Directive017-1761355297
==================================
🔹 Title: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
📝 Summary:
PaddleOCR-VL is a compact 0.9B vision-language model for multilingual document parsing. It achieves state-of-the-art performance on 109 languages with minimal resources and fast inference. It efficiently recognizes complex elements like text and tables.
🔹 Publication Date: Published on Oct 16, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.14528
• PDF: https://arxiv.org/pdf/2510.14528
• Github: https://github.com/PaddlePaddle/PaddleOCR
🔹 Models citing this paper:
• https://huggingface.co/PaddlePaddle/PaddleOCR-VL
• https://huggingface.co/PaddlePaddle/PP-DocLayoutV2
• https://huggingface.co/lvyufeng/PaddleOCR-VL-0.9B
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/PaddlePaddle/PaddleOCR-VL_Online_Demo
• https://huggingface.co/spaces/markobinario/PaddleOCR-VL_Online_Demo
• https://huggingface.co/spaces/waytoAGI/PaddleOCR-VL_Online_Demo
==================================
🔹 Title: Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
📝 Summary:
Concerto combines 3D self-distillation and 2D-3D joint embedding to learn superior spatial features. It significantly outperforms existing self-supervised models and achieves new state-of-the-art results in scene understanding and open-world perception.
🔹 Publication Date: Published on Oct 27, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.23607
• PDF: https://arxiv.org/pdf/2510.23607
• Project Page: https://pointcept.github.io/Concerto/
• Github: https://github.com/Pointcept/Pointcept
🔹 Models citing this paper:
• https://huggingface.co/Pointcept/Concerto
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/Pointcept/Concerto
==================================
🔹 Title: LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
📝 Summary:
LlamaFactory is a unified framework that enables efficient, no-code fine-tuning of over 100 large language models. It simplifies adapting LLMs to various tasks using a web-based user interface.
🔹 Publication Date: Published on Mar 20, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2403.13372
• PDF: https://arxiv.org/pdf/2403.13372
• Project Page: https://huggingface.co/spaces/hiyouga/LLaMA-Board
• Github: https://github.com/hiyouga/LLaMA-Factory
🔹 Models citing this paper:
• https://huggingface.co/AELLM/Llama-3.2-Chibi-3B
• https://huggingface.co/GXMZU/Qwen3-14B-ai-expert-250925
• https://huggingface.co/XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/Justinrune/LLaMA-Factory
• https://huggingface.co/spaces/featherless-ai/try-this-model
• https://huggingface.co/spaces/Darok/Featherless-Feud
==================================
🔹 Title: Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
📝 Summary:
Mem0 is a memory-centric architecture, with an enhanced graph-based version, that improves LLMs long-term conversational coherence. It surpasses other memory systems in accuracy and drastically cuts computational costs, enabling more reliable AI agents.
🔹 Publication Date: Published on Apr 28, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.19413
• PDF: https://arxiv.org/pdf/2504.19413
• Github: https://github.com/mem0ai/mem0
==================================
🔹 Title: PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold
📝 Summary:
PokeeResearch-7B is a 7B-parameter deep research agent. It uses Reinforcement Learning from AI Feedback and chain-of-thought reasoning to enhance robustness. This achieves state-of-the-art performance on deep research benchmarks.
🔹 Publication Date: Published on Oct 17, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.15862
• PDF: https://arxiv.org/pdf/2510.15862
• Github: https://github.com/Pokee-AI/PokeeResearchOSS
🔹 Models citing this paper:
• https://huggingface.co/PokeeAI/pokee_research_7b
• https://huggingface.co/Mungert/pokee_research_7b-GGUF
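The RLAIF loop can be sketched with a rule-based stand-in for the judge (hypothetical; in practice the judge is another LLM prompted with a rubric), whose scores become the rewards for the policy update.

```python
def ai_judge(question, answer):
    """Stand-in AI-feedback judge scoring an answer in [0, 1].
    A real judge would be an LLM applying a rubric."""
    score = 0.0
    if "because" in answer:           # rewards visible reasoning
        score += 0.5
    if answer.strip().endswith("."):  # rewards a completed answer
        score += 0.5
    return score

def rlaif_rewards(question, candidates):
    # Convert judge scores into RL rewards for the policy update.
    return [(c, ai_judge(question, c)) for c in candidates]

rewards = rlaif_rewards(
    "Why is the sky blue?",
    ["Rayleigh scattering.",
     "It looks blue because short wavelengths scatter more."],
)
print(max(rewards, key=lambda r: r[1])[0])
# → the candidate with visible reasoning wins
```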
==================================
🔹 Title: TradingAgents: Multi-Agents LLM Financial Trading Framework
📝 Summary:
TradingAgents introduces a multi-agent LLM framework for stock trading, simulating real-world firms with specialized agent roles. This collaborative system, featuring analysts and traders, significantly improves trading performance metrics. It outperforms baseline models in cumulative returns and Sharpe ratio.
🔹 Publication Date: Published on Dec 28, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.20138
• PDF: https://arxiv.org/pdf/2412.20138
• Github: https://github.com/tauricresearch/tradingagents
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList
• https://huggingface.co/spaces/Ervin2077/qiu
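The role-specialization idea as a toy pipeline, with rule-based stand-ins for the LLM agents (all names and thresholds here are hypothetical): analysts vote, a trader aggregates, and a risk manager can veto.

```python
def fundamental_analyst(data):
    return 1 if data["earnings_growth"] > 0 else -1

def sentiment_analyst(data):
    return 1 if data["news_sentiment"] > 0.5 else -1

def risk_manager(signal, data):
    # Veto the trade when volatility is too high.
    return 0 if data["volatility"] > 0.4 else signal

def trader(data, analysts):
    votes = sum(a(data) for a in analysts)
    signal = 1 if votes > 0 else (-1 if votes < 0 else 0)
    return risk_manager(signal, data)

market = {"earnings_growth": 0.12, "news_sentiment": 0.8, "volatility": 0.2}
decision = trader(market, [fundamental_analyst, sentiment_analyst])
print(decision)  # → 1 (buy)
```

In the paper each role is an LLM agent debating over market data; the structure above only shows how specialized roles compose into one decision.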
==================================
🔹 Title: OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
📝 Summary:
OmniFlatten is an end-to-end GPT model enabling real-time, natural full-duplex spoken dialogue. It uses a multi-stage post-training scheme to adapt a text-based LLM for speech and text generation without altering its original architecture, handling complex conversation dynamics with low latency.
🔹 Publication Date: Published on Oct 23, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2410.17799
• PDF: https://arxiv.org/pdf/2410.17799
==================================
🔹 Title: MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
📝 Summary:
MinerU2.5 is a document parsing model using a two-stage coarse-to-fine strategy. It first analyzes layout on downsampled images, then recognizes content on native-resolution crops. This achieves state-of-the-art accuracy with high efficiency.
🔹 Publication Date: Published on Sep 26, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.22186
• PDF: https://arxiv.org/pdf/2509.22186
• Project Page: https://opendatalab.github.io/MinerU/
• Github: https://github.com/opendatalab/MinerU
🔹 Models citing this paper:
• https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B
• https://huggingface.co/freakynit/MinerU2.5-2509-1.2B
• https://huggingface.co/Mungert/MinerU2.5-2509-1.2B-GGUF
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test
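The coarse-to-fine strategy in a minimal numpy sketch (illustrative, not MinerU's actual models): stage one scans a downsampled page for layout, stage two crops the native-resolution region for recognition.

```python
import numpy as np

def parse_page(page, scale=4):
    """Two-stage coarse-to-fine sketch: find a content box on a
    downsampled page, then return a native-resolution crop of it."""
    coarse = page[::scale, ::scale]      # stage 1: cheap layout pass
    ys, xs = np.nonzero(coarse > 0)      # toy "layout detector"
    y0, y1 = ys.min() * scale, (ys.max() + 1) * scale
    x0, x1 = xs.min() * scale, (xs.max() + 1) * scale
    crop = page[y0:y1, x0:x1]            # stage 2: full-res crop
    return (y0, x0, y1, x1), crop

page = np.zeros((64, 64))
page[16:32, 8:40] = 1.0                  # one "text block"
box, crop = parse_page(page)
print(box, crop.shape)  # → (16, 8, 32, 40) (16, 32)
```

Layout analysis touches only 1/scale² of the pixels, while recognition still sees full resolution, which is the source of the efficiency claim above.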
==================================
🔹 Title: MinerU: An Open-Source Solution for Precise Document Content Extraction
📝 Summary:
MinerU is an open-source solution for high-precision document content extraction. It leverages fine-tuned models and pre/postprocessing rules to achieve consistent accuracy across diverse document types, addressing challenges in existing tools.
🔹 Publication Date: Published on Sep 27, 2024
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.18839
• PDF: https://arxiv.org/pdf/2409.18839
• Github: https://github.com/opendatalab/MinerU
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/opendatalab/MinerU
• https://huggingface.co/spaces/xiaoye-winters/MinerU-API
• https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test
==================================
🔹 Title: IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
📝 Summary:
IndexTTS enhances XTTS and Tortoise for superior naturalness and zero-shot voice cloning. It uses hybrid character-pinyin modeling and optimized VQ, offering controllable, efficient TTS with better performance.
🔹 Publication Date: Published on Feb 8, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.05512
• PDF: https://arxiv.org/pdf/2502.05512
• Github: https://github.com/index-tts/index-tts
🔹 Models citing this paper:
• https://huggingface.co/IndexTeam/IndexTTS-2
• https://huggingface.co/IndexTeam/Index-TTS
• https://huggingface.co/Toxzic/indextts-colab
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/IndexTeam/IndexTTS
• https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
• https://huggingface.co/spaces/jairwaal/image
==================================
🔹 Title: DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
📝 Summary:
DeepAnalyze-8B is an agentic LLM that autonomously completes the entire data science pipeline. It uses curriculum-based training and data-grounded trajectory synthesis. DeepAnalyze-8B outperforms prior workflow-based agents on various data tasks.
🔹 Publication Date: Published on Oct 19, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.16872
• PDF: https://arxiv.org/pdf/2510.16872
• Project Page: https://ruc-deepanalyze.github.io/
• Github: https://github.com/ruc-datalab/DeepAnalyze
🔹 Models citing this paper:
• https://huggingface.co/RUC-DataLab/DeepAnalyze-8B
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
• https://huggingface.co/datasets/fantos/DataScience-Instruct-500K
==================================