ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Title: iFlyBot-VLA Technical Report

📝 Summary:
iFlyBot-VLA is a large vision-language-action (VLA) model built on a latent action model and a dual-level action representation. This design enhances 3D perception and reasoning, yielding superior performance across diverse manipulation tasks.
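
A minimal sketch of what a dual-level action head could look like, assuming one discrete latent-action branch and one continuous control branch; all module names and dimensions are illustrative assumptions, not the paper's code.

```python
# Hypothetical dual-level action head: a discrete latent-action
# vocabulary paired with continuous low-level commands. Sizes are
# illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class DualLevelActionHead(nn.Module):
    def __init__(self, hidden_dim=1024, n_latent_actions=256, action_dim=7):
        super().__init__()
        # High-level branch: classify over a learned latent-action codebook.
        self.latent_head = nn.Linear(hidden_dim, n_latent_actions)
        # Low-level branch: regress continuous robot commands (e.g. 7-DoF).
        self.action_head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.GELU(),
            nn.Linear(hidden_dim, action_dim),
        )

    def forward(self, h):
        # h: (batch, hidden_dim) pooled VLA backbone features
        return self.latent_head(h), self.action_head(h)

head = DualLevelActionHead()
latent_logits, actions = head(torch.randn(2, 1024))
print(latent_logits.shape, actions.shape)  # (2, 256) (2, 7)
```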

🔹 Publication Date: Published on Nov 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01914
• PDF: https://arxiv.org/pdf/2511.01914
• Project Page: https://xuwenjie401.github.io/iFlyBot-VLA.github.io/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT
Title: VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models

📝 Summary:
This paper introduces VidEmo, a new video emotion foundation model that uses an affective cues-guided reasoning framework. It is trained on the Emo-CFG dataset and achieves competitive performance in emotion understanding and face perception tasks.

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02712
• PDF: https://arxiv.org/pdf/2511.02712
• Project Page: https://zzcheng.top/VidEmo

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT
Title: ChartM^3: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension

📝 Summary:
A new automated code-driven pipeline, ChartM^3, generates diverse datasets for complex chart understanding via RAG and CoT. This improves MLLM reasoning and generalization, enabling smaller models to match larger ones in complex chart comprehension.
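
To make the pipeline idea concrete, here is a hedged, minimal sketch of code-driven chart-QA generation: a chart is rendered from a programmatic spec, and the question's answer is computed from that same spec, so ground truth is exact. The sample schema and helper names are assumptions, not the paper's format.

```python
# Minimal code-driven chart-QA data generator in the spirit of ChartM^3.
import json
import random
import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt

def make_sample(idx):
    cats = ["A", "B", "C", "D"]
    vals = [random.randint(10, 100) for _ in cats]
    fig, ax = plt.subplots()
    ax.bar(cats, vals)
    ax.set_title(f"Synthetic chart {idx}")
    fig.savefig(f"chart_{idx}.png")
    plt.close(fig)
    # The answer is derived from the generating code itself, so it is exact.
    answer = cats[vals.index(max(vals))]
    return {
        "image": f"chart_{idx}.png",
        "question": "Which category has the highest value?",
        "answer": answer,
        "values": dict(zip(cats, vals)),  # ground truth for CoT steps
    }

samples = [make_sample(i) for i in range(3)]
print(json.dumps(samples[0], indent=2))
```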

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02415
• PDF: https://arxiv.org/pdf/2511.02415

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT
Title: Discriminately Treating Motion Components Evolves Joint Depth and Ego-Motion Learning

📝 Summary:
DiMoDE introduces a discriminative treatment of motion components for robust joint depth and ego-motion learning. By leveraging geometric constraints and restructuring the learning process, it improves accuracy and achieves state-of-the-art performance.

🔹 Publication Date: Published on Nov 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01502
• PDF: https://arxiv.org/pdf/2511.01502

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT
Title: VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

📝 Summary:
VCode introduces a benchmark for generating SVG code from images, preserving symbolic meaning for visual reasoning. Frontier VLMs struggle with this visual-centric task. VCoder, an agentic framework, improves performance using iterative revision and visual tools.
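
A rough sketch of the propose-render-critique loop that the VCoder description suggests; every function below is a hypothetical placeholder for a real VLM or rasterizer call, not the project's actual code.

```python
# Sketch of an agentic revise-and-render loop: propose SVG, render it,
# compare against the target, and feed the discrepancy back.

def propose_svg(target_desc, feedback=""):
    # Placeholder for a VLM call that writes SVG code.
    return f'<svg xmlns="http://www.w3.org/2000/svg"><!-- {feedback} --></svg>'

def render(svg_code):
    # Placeholder: a real system would rasterize (e.g. with cairosvg).
    return svg_code

def critique(target_desc, rendering):
    # Placeholder for a VLM comparing the rendering to the target;
    # returns (done, feedback).
    return True, "shapes match"

def vcoder_loop(target_desc, max_rounds=5):
    feedback = ""
    for _ in range(max_rounds):
        svg = propose_svg(target_desc, feedback)
        done, feedback = critique(target_desc, render(svg))
        if done:
            return svg
    return svg

print(vcoder_loop("a red circle over a blue square"))
```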

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02778
• PDF: https://arxiv.org/pdf/2511.02778
• Project Page: https://csu-jpg.github.io/VCode/
• Github: https://github.com/CSU-JPG/VCode

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VCode #MultimodalAI #SVG #VisualReasoning #VLMs
Title: When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought

📝 Summary:
MIRA is a new benchmark for evaluating models that use intermediate visual images to enhance reasoning. It includes 546 multimodal problems requiring models to generate and utilize visual cues. Experiments show models achieve a 33.7% performance gain with visual cues compared to text-only prompts.
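
A hedged sketch of the two prompting conditions such a benchmark compares; `query_model` and the problem schema are hypothetical stand-ins for any VLM API.

```python
# Compare accuracy with and without an intermediate visual cue.

def query_model(question, images):
    # Placeholder VLM call; returns an answer string.
    return "stub-answer"

def evaluate(problems, with_visual_cues):
    correct = 0
    for p in problems:
        imgs = [p["input_image"]]
        if with_visual_cues:
            imgs.append(p["cue_image"])  # e.g. an auxiliary sketch/diagram
        if query_model(p["question"], imgs) == p["answer"]:
            correct += 1
    return correct / max(len(problems), 1)

problems = [{"question": "q", "answer": "stub-answer",
             "input_image": "a.png", "cue_image": "b.png"}]
print(evaluate(problems, with_visual_cues=True))   # 1.0 with the stub
print(evaluate(problems, with_visual_cues=False))  # also 1.0; real VLMs differ
```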

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02779
• PDF: https://arxiv.org/pdf/2511.02779

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VisualReasoning #ChainOfThought #MultimodalAI #AIBenchmark #ComputerVision
Title: When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs

📝 Summary:
A new framework explains MLLM conflict resolution by decomposing modality following into relative reasoning uncertainty and inherent modality preference. Modality following decreases with relative uncertainty. Inherent preference is measured at the balance point, offering mechanistic insights.
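
A toy illustration (not the paper's code) of the decomposition: measure each modality's reasoning uncertainty as predictive entropy, then take their difference as the relative uncertainty that governs modality following.

```python
import math

def entropy(probs):
    # Shannon entropy of a unimodal answer distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical answer distributions from vision-only and text-only runs.
p_vision = [0.7, 0.2, 0.1]
p_text = [0.4, 0.35, 0.25]

# Relative reasoning uncertainty: negative means vision is more certain.
rel_uncertainty = entropy(p_vision) - entropy(p_text)
print(f"H(vision)={entropy(p_vision):.3f}  H(text)={entropy(p_text):.3f}")
print(f"relative uncertainty={rel_uncertainty:.3f}")
# The paper's "balance point" is where this difference is ~0; the modality
# the model still follows there reveals its inherent preference.
```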

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02243
• PDF: https://arxiv.org/pdf/2511.02243

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MLLMs #MultimodalAI #LLM #DeepLearning #AIResearch
Title: Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR

📝 Summary:
LLMs trained for step-by-step reasoning tend to grow verbose, in part because RLVR pipelines often filter out easy problems. This work shows that retaining and modestly up-weighting moderately easy problems acts as an implicit length regularizer, roughly halving output verbosity while maintaining accuracy.
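
One way the retain-and-up-weight idea could be expressed, as a minimal sketch; the solve-rate thresholds and weight below are illustrative assumptions, not values from the paper.

```python
# Instead of filtering easy prompts out of the RLVR batch, keep
# moderately easy ones and modestly up-weight them so short, correct
# traces keep appearing in training.
def sample_weight(solve_rate, easy_lo=0.6, easy_hi=0.9, up_weight=1.5):
    """solve_rate: fraction of sampled rollouts that are correct."""
    if solve_rate >= 0.99:            # trivially solved: little signal
        return 0.0
    if easy_lo <= solve_rate <= easy_hi:
        return up_weight              # moderately easy: implicit length regularizer
    return 1.0

for r in (0.3, 0.7, 0.995):
    print(r, sample_weight(r))
```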

🔹 Publication Date: Published on Nov 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01937
• PDF: https://arxiv.org/pdf/2511.01937
• Github: https://github.com/MBZUAI-Paris/Frugal-AI-Math

🔹 Models citing this paper:
https://huggingface.co/MBZUAI-Paris/Frugal-Math-4B

🔹 Datasets citing this paper:
https://huggingface.co/datasets/MBZUAI-Paris/frugal-maths-data-split-v1

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#LLM #AI #ReinforcementLearning #FrugalAI #MathematicalReasoning
Title: BRAINS: A Retrieval-Augmented System for Alzheimer's Detection and Monitoring

📝 Summary:
BRAINS is an LLM-based system for Alzheimer's detection and monitoring. It integrates cognitive assessments and a case retrieval module for risk assessment and disease severity classification. Evaluations demonstrate its effectiveness as a scalable, explainable, early-stage detection tool.
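
A minimal sketch of the retrieval-augmented pattern described, assuming a simple nearest-neighbor case retriever feeding an LLM prompt; the embedding function, case schema, and labels are stand-ins, not the system's actual components.

```python
# Retrieve similar past cases for a new cognitive assessment, then build
# a prompt for severity classification.

def embed(text):
    # Placeholder for a real sentence-embedding model.
    return [float(len(w)) for w in text.split()[:4]] + [0.0] * 4

def top_k_cases(query_vec, case_bank, k=3):
    def dist(case):
        return sum((a - b) ** 2 for a, b in zip(query_vec, case["vec"]))
    return sorted(case_bank, key=dist)[:k]

case_bank = [
    {"vec": embed("MMSE 18 recall impaired"), "label": "moderate"},
    {"vec": embed("MMSE 27 mild word-finding"), "label": "mild"},
]

assessment = "MMSE 19, impaired delayed recall, disorientation to date"
similar = top_k_cases(embed(assessment), case_bank)
prompt = (
    "Classify disease severity given the assessment and similar cases.\n"
    f"Assessment: {assessment}\n"
    f"Similar cases: {[c['label'] for c in similar]}"
)
print(prompt)  # would be sent to the LLM
```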

🔹 Publication Date: Published on Nov 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02490
• PDF: https://arxiv.org/pdf/2511.02490

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#Alzheimers #LLM #AI #MedicalAI #EarlyDetection
Title: Kimi Linear: An Expressive, Efficient Attention Architecture

📝 Summary:
Kimi Linear is a new hybrid linear attention architecture that outperforms full attention in performance and efficiency across diverse scenarios. It leverages Kimi Delta Attention and Multi-Head Latent Attention, reducing KV cache by up to 75% and boosting decoding throughput by 6x.
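
To see where the KV-cache savings come from, here is a generic linear-attention recurrence (not Kimi's KDA, which adds a learned delta-rule decay on top of this basic form): the per-head state is a fixed-size matrix rather than a cache that grows with sequence length.

```python
import torch

def linear_attention_step(q, k, v, state):
    # state: (d_k, d_v) running sum of k v^T; O(1) memory per head,
    # versus an O(sequence_length) KV cache for full softmax attention.
    state = state + torch.outer(k, v)
    out = q @ state            # (d_v,)
    return out, state

d_k, d_v = 64, 64
state = torch.zeros(d_k, d_v)
for t in range(128):           # stream 128 tokens with fixed memory
    q, k, v = torch.randn(d_k), torch.randn(d_k), torch.randn(d_v)
    out, state = linear_attention_step(q, k, v, state)
print(out.shape, state.shape)  # torch.Size([64]) torch.Size([64, 64])
```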

🔹 Publication Date: Published on Oct 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26692
• PDF: https://arxiv.org/pdf/2510.26692
• Github: https://github.com/MoonshotAI/Kimi-Linear

🔹 Models citing this paper:
https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Base
https://huggingface.co/aiqtech/Kimi-Linear-48B-A3B-Instruct

🔹 Spaces citing this paper:
https://huggingface.co/spaces/Speedofmastery/orynxml-agents

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AttentionMechanisms #LLM #AIResearch #DeepLearning #ModelEfficiency
Title: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

📝 Summary:
PaddleOCR-VL is a new 0.9B vision-language model for document parsing. It uses a NaViT-style visual encoder and ERNIE-4.5, achieving state-of-the-art performance across 109 languages with minimal resources and fast inference. This model is highly suitable for practical deployment.

🔹 Publication Date: Published on Oct 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.14528
• PDF: https://arxiv.org/pdf/2510.14528
• Github: https://github.com/PaddlePaddle/PaddleOCR

🔹 Models citing this paper:
https://huggingface.co/PaddlePaddle/PaddleOCR-VL
https://huggingface.co/PaddlePaddle/PP-DocLayoutV2
https://huggingface.co/lvyufeng/PaddleOCR-VL-0.9B

🔹 Spaces citing this paper:
https://huggingface.co/spaces/PaddlePaddle/PaddleOCR-VL_Online_Demo
https://huggingface.co/spaces/markobinario/PaddleOCR-VL_Online_Demo
https://huggingface.co/spaces/waytoAGI/PaddleOCR-VL_Online_Demo

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#OCR #VisionLanguageModel #DocumentAI #DeepLearning #AI
Title: Emu3.5: Native Multimodal Models are World Learners

📝 Summary:
Emu3.5 is a large-scale multimodal world model predicting next states in vision and language. It uses reinforcement learning and Discrete Diffusion Adaptation for efficient inference, delivering strong performance in multimodal tasks and world exploration.

🔹 Publication Date: Published on Oct 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.26583
• PDF: https://arxiv.org/pdf/2510.26583
• Project Page: https://emu.world/
• Github: https://github.com/baaivision/Emu3.5

🔹 Models citing this paper:
https://huggingface.co/BAAI/Emu3.5
https://huggingface.co/BAAI/Emu3.5-Image
https://huggingface.co/BAAI/Emu3.5-VisionTokenizer

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MultimodalAI #WorldModels #ReinforcementLearning #ComputerVision #NLP
Title: DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

📝 Summary:
DeepAnalyze-8B is an agentic LLM that autonomously completes the entire data science pipeline, from raw data to research reports. It employs curriculum-based training and data-grounded trajectory synthesis, outperforming larger, workflow-based agents. This open-source model advances autonomous data science.
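
A hedged sketch of the end-to-end agentic pattern described: one LLM plans and executes the data-science stages in sequence, grounding each step in prior outputs. Stage names and the `llm` call are invented for illustration.

```python
STAGES = ["profile data", "clean data", "analyze", "model", "write report"]

def llm(prompt):
    # Placeholder for the DeepAnalyze-8B call.
    return f"[output for: {prompt[:40]}...]"

def run_pipeline(dataset_path):
    context = f"Dataset: {dataset_path}"
    for stage in STAGES:
        result = llm(f"{context}\nNext stage: {stage}")
        context += f"\n{stage}: {result}"  # ground the next step in prior outputs
    return context

print(run_pipeline("sales.csv"))
```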

🔹 Publication Date: Published on Oct 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.16872
• PDF: https://arxiv.org/pdf/2510.16872
• Project Page: https://ruc-deepanalyze.github.io/
• Github: https://github.com/ruc-datalab/DeepAnalyze

🔹 Models citing this paper:
https://huggingface.co/RUC-DataLab/DeepAnalyze-8B

🔹 Datasets citing this paper:
https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
https://huggingface.co/datasets/fantos/DataScience-Instruct-500K

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#LLM #DataScience #AgenticAI #AutonomousAI #AI
Title: TradingAgents: Multi-Agents LLM Financial Trading Framework

📝 Summary:
TradingAgents is a multi-agent LLM framework that simulates real-world trading firms with specialized, collaborative agents. This approach significantly improves trading performance metrics like cumulative returns and Sharpe ratio compared to baseline models.
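
The two headline metrics can be computed from a daily return series as follows; this is the standard formulation, not code from the TradingAgents repo.

```python
import math

def cumulative_return(daily_returns):
    total = 1.0
    for r in daily_returns:
        total *= (1.0 + r)
    return total - 1.0

def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods=252):
    n = len(daily_returns)
    excess = [r - risk_free_daily for r in daily_returns]
    mean = sum(excess) / n
    var = sum((r - mean) ** 2 for r in excess) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods)  # annualized

rets = [0.002, -0.001, 0.003, 0.0015, -0.0005]
print(f"cumulative: {cumulative_return(rets):.4%}")
print(f"sharpe: {sharpe_ratio(rets):.2f}")
```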

🔹 Publication Date: Published on Dec 28, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2412.20138
• PDF: https://arxiv.org/pdf/2412.20138
• Github: https://github.com/tauricresearch/tradingagents

🔹 Spaces citing this paper:
https://huggingface.co/spaces/shanghengdu/LLM-Agent-Optimization-PaperList
https://huggingface.co/spaces/Ervin2077/qiu

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#TradingAgents #MultiAgentLLM #FinancialTrading #AlgorithmicTrading #AI
Title: OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

📝 Summary:
OmniFlatten is a novel end-to-end GPT model enabling real-time natural full-duplex spoken dialogue. It achieves this by post-training a text LLM with a multi-stage process for speech-text generation, without modifying the original architecture.

🔹 Publication Date: Published on Oct 23, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2410.17799
• PDF: https://arxiv.org/pdf/2410.17799

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#GPT #VoiceAI #NLP #LLM #DeepLearning
Title: olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

📝 Summary:
olmOCR is an open-source toolkit that uses a fine-tuned vision language model to convert PDFs into clean, structured text. It enables large-scale, cost-effective extraction of trillions of tokens for training language models.
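
A hedged usage sketch based on the repo's documented batch-pipeline entry point; the module path and flags are assumptions that may differ across olmocr versions, so check the README before running.

```python
import subprocess

# Invoke the olmocr batch pipeline on a local PDF (entry point and flags
# assumed from the repo's README; verify against your installed version).
subprocess.run(
    [
        "python", "-m", "olmocr.pipeline",
        "./workspace",            # working/output directory
        "--pdfs", "paper.pdf",    # one or more input PDFs
    ],
    check=True,
)
```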

🔹 Publication Date: Published on Feb 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2502.18443
• PDF: https://arxiv.org/pdf/2502.18443
• Github: https://github.com/allenai/olmocr

🔹 Datasets citing this paper:
https://huggingface.co/datasets/davanstrien/test-olmocr2
https://huggingface.co/datasets/davanstrien/newspapers-olmocr2
https://huggingface.co/datasets/stckmn/ocr-output-Directive017-1761355297

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#OCR #VLMs #LLM #DataExtraction #OpenSource