ML Research Hub – Telegram
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Kimi Linear: An Expressive, Efficient Attention Architecture

📝 Summary:
Kimi Linear is a new hybrid linear attention architecture that outperforms full attention in both quality and efficiency across diverse scenarios. It combines Kimi Delta Attention with Multi-Head Latent Attention, reducing the KV cache by up to 75% and boosting decoding throughput by up to 6x.
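
As a back-of-the-envelope illustration (not from the paper) of why a hybrid linear/full-attention stack shrinks the KV cache: if only 1 in 4 layers keeps a growing per-token cache while the rest use a fixed-size recurrent state, the cache drops by roughly the reported 75%. Layer count, head count, and head size below are illustrative assumptions.

```python
# Hypothetical sizing sketch: hybrid stack where 1 in 4 layers keeps a KV cache.

def kv_cache_bytes(n_layers, seq_len, n_kv_heads, head_dim, bytes_per_elem=2):
    """KV cache for standard attention: keys and values stored for every token."""
    return n_layers * seq_len * n_kv_heads * head_dim * 2 * bytes_per_elem

full_layers = 32                        # hypothetical all-full-attention baseline
hybrid_cache_layers = full_layers // 4  # only 1 in 4 layers keeps a growing cache

baseline = kv_cache_bytes(full_layers, seq_len=128_000, n_kv_heads=8, head_dim=128)
hybrid = kv_cache_bytes(hybrid_cache_layers, seq_len=128_000, n_kv_heads=8, head_dim=128)

print(f"baseline: {baseline / 1e9:.1f} GB, hybrid: {hybrid / 1e9:.1f} GB "
      f"({1 - hybrid / baseline:.0%} smaller)")   # -> 75% smaller
```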

🔹 Publication Date: Published on Oct 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26692
• PDF: https://arxiv.org/pdf/2510.26692
• Github: https://github.com/MoonshotAI/Kimi-Linear

🔹 Models citing this paper:
https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Base
https://huggingface.co/aiqtech/Kimi-Linear-48B-A3B-Instruct

Spaces citing this paper:
https://huggingface.co/spaces/Speedofmastery/orynxml-agents

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AttentionMechanisms #LLM #AIResearch #DeepLearning #ModelEfficiency
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

📝 Summary:
PaddleOCR-VL is a new 0.9B vision-language model for document parsing. It uses a NaViT-style visual encoder and ERNIE-4.5, achieving state-of-the-art performance across 109 languages with minimal resources and fast inference. This model is highly suitable for practical deployment.
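
A rough weight-memory estimate shows why a 0.9B-parameter model is easy to deploy. The precisions below are generic assumptions, not measurements from the paper, and activations plus KV cache add further overhead.

```python
# Approximate weight footprint of a 0.9B-parameter model at common precisions.
PARAMS = 0.9e9
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1}

for precision, nbytes in BYTES_PER_PARAM.items():
    print(f"{precision}: ~{PARAMS * nbytes / 2**30:.1f} GiB of weights")
# fp32: ~3.4 GiB, fp16/bf16: ~1.7 GiB, int8: ~0.8 GiB
```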

🔹 Publication Date: Published on Oct 16, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.14528
• PDF: https://arxiv.org/pdf/2510.14528
• Github: https://github.com/PaddlePaddle/PaddleOCR

🔹 Models citing this paper:
https://huggingface.co/PaddlePaddle/PaddleOCR-VL
https://huggingface.co/PaddlePaddle/PP-DocLayoutV2
https://huggingface.co/lvyufeng/PaddleOCR-VL-0.9B

Spaces citing this paper:
https://huggingface.co/spaces/PaddlePaddle/PaddleOCR-VL_Online_Demo
https://huggingface.co/spaces/markobinario/PaddleOCR-VL_Online_Demo
https://huggingface.co/spaces/waytoAGI/PaddleOCR-VL_Online_Demo

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#OCR #VisionLanguageModel #DocumentAI #DeepLearning #AI
Emu3.5: Native Multimodal Models are World Learners

📝 Summary:
Emu3.5 is a large-scale multimodal world model that natively predicts the next state across vision and language. It is post-trained with reinforcement learning and uses Discrete Diffusion Adaptation (DiDA) for efficient inference, delivering strong performance on multimodal tasks and world exploration.
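
A conceptual sketch (not Emu3.5's actual API) of next-state prediction over one interleaved sequence in which text tokens and discrete image tokens share a vocabulary; `model.predict_next` is a hypothetical placeholder.

```python
# Hypothetical autoregressive loop over an interleaved vision-language sequence.
def predict_next_state(model, text_tokens, image_tokens, steps=64):
    sequence = list(text_tokens) + list(image_tokens)  # one flat multimodal context
    generated = []
    for _ in range(steps):
        next_token = model.predict_next(sequence)      # hypothetical call
        sequence.append(next_token)
        generated.append(next_token)
    # Text tokens are detokenized as usual; image tokens would be decoded by a
    # vision tokenizer such as the released BAAI/Emu3.5-VisionTokenizer.
    return generated
```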

🔹 Publication Date: Published on Oct 30, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26583
• PDF: https://arxiv.org/pdf/2510.26583
• Project Page: https://emu.world/
• Github: https://github.com/baaivision/Emu3.5

🔹 Models citing this paper:
https://huggingface.co/BAAI/Emu3.5
https://huggingface.co/BAAI/Emu3.5-Image
https://huggingface.co/BAAI/Emu3.5-VisionTokenizer

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MultimodalAI #WorldModels #ReinforcementLearning #ComputerVision #NLP
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

📝 Summary:
DeepAnalyze-8B is an agentic LLM that autonomously completes the entire data science pipeline, from raw data to research reports. It employs curriculum-based training and data-grounded trajectory synthesis, outperforming larger, workflow-based agents. This open-source model advances autonomous data science.
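
A minimal sketch for trying the released checkpoint, assuming it exposes a standard causal-LM interface in transformers; the model card's chat/agent template and tool wiring are omitted here and should be followed in practice.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RUC-DataLab/DeepAnalyze-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

prompt = "You are given sales.csv. Plan an analysis and draft a short report outline."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```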

🔹 Publication Date: Published on Oct 19, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.16872
• PDF: https://arxiv.org/pdf/2510.16872
• Project Page: https://ruc-deepanalyze.github.io/
• Github: https://github.com/ruc-datalab/DeepAnalyze

🔹 Models citing this paper:
https://huggingface.co/RUC-DataLab/DeepAnalyze-8B

Datasets citing this paper:
https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
https://huggingface.co/datasets/fantos/DataScience-Instruct-500K

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#LLM #DataScience #AgenticAI #AutonomousAI #AI
TradingAgents: Multi-Agents LLM Financial Trading Framework

📝 Summary:
TradingAgents is a multi-agent LLM framework that simulates real-world trading firms with specialized, collaborative agents. This approach significantly improves trading performance metrics like cumulative returns and Sharpe ratio compared to baseline models.
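
Since the headline metrics are cumulative return and Sharpe ratio, here is how both are conventionally computed from daily returns (annualized over 252 trading days); the numbers are illustrative, not results from the paper.

```python
import numpy as np

daily_returns = np.array([0.004, -0.002, 0.006, 0.001, -0.003, 0.005])

cumulative_return = np.prod(1 + daily_returns) - 1
sharpe_ratio = np.sqrt(252) * daily_returns.mean() / daily_returns.std(ddof=1)

print(f"cumulative return: {cumulative_return:.2%}, annualized Sharpe: {sharpe_ratio:.2f}")
```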

🔹 Publication Date: Published on Dec 28, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.20138
• PDF: https://arxiv.org/pdf/2412.20138
• Github: https://github.com/tauricresearch/tradingagents

Spaces citing this paper:
https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList
https://huggingface.co/spaces/Ervin2077/qiu

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#TradingAgents #MultiAgentLLM #FinancialTrading #AlgorithmicTrading #AI
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

📝 Summary:
OmniFlatten is a novel end-to-end GPT model enabling real-time natural full-duplex spoken dialogue. It achieves this by post-training a text LLM with a multi-stage process for speech-text generation, without modifying the original architecture.
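
A conceptual sketch of the "flattening" idea: parallel streams (user speech tokens, assistant text tokens, assistant speech tokens) are interleaved chunk by chunk into one sequence a decoder-only LM can model. Stream names and chunk size are illustrative assumptions, not the paper's exact scheme.

```python
def flatten_streams(streams, chunk_size=4):
    """Interleave equal-length token streams into a single flat sequence."""
    flat = []
    length = max(len(tokens) for tokens in streams.values())
    for start in range(0, length, chunk_size):
        for name, tokens in streams.items():
            flat.extend((name, tok) for tok in tokens[start:start + chunk_size])
    return flat

streams = {
    "user_speech": list(range(100, 108)),
    "asst_text":   list(range(200, 208)),
    "asst_speech": list(range(300, 308)),
}
print(flatten_streams(streams)[:12])  # chunks of each stream alternate in one sequence
```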

🔹 Publication Date: Published on Oct 23, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2410.17799
• PDF: https://arxiv.org/pdf/2410.17799

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#GPT #VoiceAI #NLP #LLM #DeepLearning
olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

📝 Summary:
olmOCR is an open-source toolkit that uses a fine-tuned vision language model to convert PDFs into clean, structured text. It enables large-scale, cost-effective extraction of trillions of tokens for training language models.

🔹 Publication Date: Published on Feb 25, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.18443
• PDF: https://arxiv.org/pdf/2502.18443
• Github: https://github.com/allenai/olmocr

Datasets citing this paper:
https://huggingface.co/datasets/davanstrien/test-olmocr2
https://huggingface.co/datasets/davanstrien/newspapers-olmocr2
https://huggingface.co/datasets/stckmn/ocr-output-Directive017-1761355297

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#OCR #VLMs #LLM #DataExtraction #OpenSource
MedRAX: Medical Reasoning Agent for Chest X-ray

📝 Summary:
MedRAX is a new AI agent that integrates CXR analysis tools and multimodal large language models. It answers complex medical queries without extra training, achieving state-of-the-art performance.

🔹 Publication Date: Published on Feb 4, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.02673
• PDF: https://arxiv.org/pdf/2502.02673
• Github: https://github.com/bowang-lab/medrax

Spaces citing this paper:
https://huggingface.co/spaces/asbamit/MedRAX-main

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #MedicalAI #LLM #Radiology #DeepLearning
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

📝 Summary:
Mem0 is a memory-centric architecture with graph-based memory that enhances long-term conversational coherence in LLMs by efficiently extracting and consolidating information. It outperforms existing memory systems in accuracy, achieving a 26% improvement over OpenAI's memory, and significantly reduces computational overhead.
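
A minimal sketch with the open-source client from the linked repo (pip install mem0ai). Method signatures can differ between versions, and the default configuration assumes an LLM backend (e.g. an OpenAI API key) is available.

```python
from mem0 import Memory

memory = Memory()

# Extract and consolidate facts from a conversation turn.
memory.add("I prefer window seats and usually fly out of SFO.", user_id="alice")

# Later turns retrieve only the relevant memories instead of replaying the full history.
results = memory.search("Book a flight for Alice", user_id="alice")
print(results)
```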

🔹 Publication Date: Published on Apr 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.19413
• PDF: https://arxiv.org/pdf/2504.19413
• Github: https://github.com/mem0ai/mem0

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #LLM #AIAgents #LongTermMemory #GraphMemory
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

📝 Summary:
IndexTTS builds on XTTS and Tortoise, improving naturalness and zero-shot voice cloning. It features hybrid character-pinyin modeling for Chinese and optimized vector quantization, resulting in more controllable usage, faster inference, and performance superior to comparable systems.
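
A sketch of the hybrid character-pinyin idea: keep Chinese characters as input but attach pinyin so rare or polyphonic characters are pronounced correctly. This only shows how such an input could be built with the pypinyin library; it is not IndexTTS's actual front end.

```python
from pypinyin import Style, lazy_pinyin

text = "银行在长沙"
pinyin = lazy_pinyin(text, style=Style.TONE3)  # e.g. ['yin2', 'hang2', 'zai4', 'chang2', 'sha1']

# Pair each character with its pinyin to form a hybrid token stream.
hybrid = [f"{char}({py})" for char, py in zip(text, pinyin)]
print(" ".join(hybrid))
```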

🔹 Publication Date: Published on Feb 8, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.05512
• PDF: https://arxiv.org/pdf/2502.05512
• Github: https://github.com/index-tts/index-tts

🔹 Models citing this paper:
https://huggingface.co/IndexTeam/IndexTTS-2
https://huggingface.co/IndexTeam/Index-TTS
https://huggingface.co/Toxzic/indextts-colab

Spaces citing this paper:
https://huggingface.co/spaces/IndexTeam/IndexTTS
https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
https://huggingface.co/spaces/jairwaal/image

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#TextToSpeech #ZeroShotLearning #VoiceCloning #AI #MachineLearning
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

📝 Summary:
MinerU2.5 is a new 1.2B-parameter VLM for document parsing. It uses a coarse-to-fine, two-stage strategy: global layout analysis on downsampled images, then targeted content recognition on native-resolution crops. This achieves state-of-the-art accuracy efficiently for high-resolution documents.
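
A conceptual sketch of that coarse-to-fine strategy: layout analysis runs on a downsampled page, then each region is recognized from a crop of the original high-resolution image. `detect_layout` and `recognize_region` stand in for the model's two stages and are hypothetical placeholders, not MinerU's API.

```python
from PIL import Image

def parse_page(page_path, detect_layout, recognize_region, thumb_size=(1024, 1024)):
    page = Image.open(page_path)

    # Stage 1: global layout analysis on a cheap, downsampled view.
    thumb = page.copy()
    thumb.thumbnail(thumb_size)
    scale_x, scale_y = page.width / thumb.width, page.height / thumb.height
    regions = detect_layout(thumb)  # expected: [(x0, y0, x1, y1, kind), ...]

    # Stage 2: targeted recognition on native-resolution crops.
    results = []
    for x0, y0, x1, y1, kind in regions:
        box = (int(x0 * scale_x), int(y0 * scale_y), int(x1 * scale_x), int(y1 * scale_y))
        results.append((kind, recognize_region(page.crop(box), kind)))
    return results
```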

🔹 Publication Date: Published on Sep 26, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.22186
• PDF: https://arxiv.org/pdf/2509.22186
• Project Page: https://opendatalab.github.io/MinerU/
• Github: https://github.com/opendatalab/MinerU

🔹 Models citing this paper:
https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B
https://huggingface.co/freakynit/MinerU2.5-2509-1.2B
https://huggingface.co/Mungert/MinerU2.5-2509-1.2B-GGUF

Spaces citing this paper:
https://huggingface.co/spaces/opendatalab/MinerU
https://huggingface.co/spaces/xiaoye-winters/MinerU-API
https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VisionLanguageModel #DocumentAI #DeepLearning #ComputerVision #AIResearch
PyTorch Distributed: Experiences on Accelerating Data Parallel Training

📝 Summary:
This paper details PyTorch's distributed data parallel module, which accelerates large-scale model training. It uses techniques like gradient bucketing and computation-communication overlap to achieve near-linear scalability with 256 GPUs.
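
A minimal single-node DDP sketch showing the mechanisms the paper describes: gradients are grouped into buckets (bucket_cap_mb) and all-reduced while the backward pass is still running. Launch with `torchrun --nproc_per_node=<num_gpus> script.py`.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 1024).cuda(local_rank)
ddp_model = DDP(model, device_ids=[local_rank], bucket_cap_mb=25)  # 25 MB gradient buckets

optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
inputs = torch.randn(32, 1024, device=local_rank)

loss = ddp_model(inputs).sum()
loss.backward()   # bucket all-reduces overlap with the remaining backward computation
optimizer.step()
dist.destroy_process_group()
```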

🔹 Publication Date: Published on Jun 28, 2020

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2006.15704
• PDF: https://arxiv.org/pdf/2006.15704
• Github: https://github.com/pytorch/pytorch/blob/master/torch/nn/parallel/distributed.py

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#PyTorch #DistributedTraining #DeepLearning #Scalability #HPC
MinerU: An Open-Source Solution for Precise Document Content Extraction

📝 Summary:
MinerU is an open-source tool that provides high-precision document content extraction. It uses fine-tuned models and pre/postprocessing rules to consistently achieve high performance across diverse document types.

🔹 Publication Date: Published on Sep 27, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2409.18839
• PDF: https://arxiv.org/pdf/2409.18839
• Github: https://github.com/opendatalab/MinerU

Spaces citing this paper:
https://huggingface.co/spaces/opendatalab/MinerU
https://huggingface.co/spaces/xiaoye-winters/MinerU-API
https://huggingface.co/spaces/ApeAITW/MinerU_2.5_Test

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#DocumentExtraction #OpenSource #DataScience #NLP #AI