ML Research Hub – Telegram
ML Research Hub
32.7K subscribers
4.01K photos
229 videos
23 files
4.32K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Experience-Guided Adaptation of Inference-Time Reasoning Strategies

📝 Summary:
Experience-Guided Reasoner (EGuR) dynamically generates and optimizes complete computational strategies at inference time using accumulated experience. It adapts LLM calls, tools, and control logic, improving accuracy by up to 14 percent and reducing costs by up to 111x.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11519
• PDF: https://arxiv.org/pdf/2511.11519

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#LLM #AI #Reasoning #Optimization #MachineLearning
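The experience-guided idea above can be sketched as a store of past (task type, strategy) outcomes that picks the next strategy. A minimal illustration, assuming hypothetical strategy names and a simple success-rate rule; this is not EGuR's actual mechanism:

```python
from collections import defaultdict

class ExperienceStore:
    """Accumulates outcomes per (task type, strategy) and picks the best.

    Illustrative only: EGuR generates and optimizes full computational
    strategies; here a strategy is just a label.
    """
    def __init__(self, strategies):
        self.strategies = strategies
        self.stats = defaultdict(lambda: [0, 0])  # (task, strat) -> [wins, trials]

    def record(self, task_type, strategy, success):
        s = self.stats[(task_type, strategy)]
        s[0] += int(success)
        s[1] += 1

    def best_strategy(self, task_type):
        def rate(strat):
            wins, trials = self.stats[(task_type, strat)]
            return wins / trials if trials else 0.5  # optimistic prior for untried
        return max(self.strategies, key=rate)
```

For example, after recording a success for "tool_use" and a failure for "direct" on math tasks, `best_strategy("math")` returns "tool_use".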
From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models

📝 Summary:
Tool-augmented LLMs exhibit Tool-Induced Myopia (TIM), treating tool outputs as substitutes for true reasoning. This improves final-answer accuracy but significantly degrades reasoning quality. A proposed framework realigns these models to use tools as assistive evidence, enhancing both accuracy an...

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10899
• PDF: https://arxiv.org/pdf/2511.10899

==================================

#LLM #AIResearch #Reasoning #ToolAugmentation #AIHallucinations
miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward

📝 Summary:
An analysis of miniF2F showed that errors in problem statements held AI systems to 36% accuracy. Correcting these errors produced miniF2F-v2, raising accuracy to 70%. High-quality benchmarks like miniF2F-v2 are crucial for evaluating progress in formal reasoning.

🔹 Publication Date: Published on Nov 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.03108
• PDF: https://arxiv.org/pdf/2511.03108
• Github: https://github.com/roozbeh-yz/miniF2F_v2

Datasets citing this paper:
https://huggingface.co/datasets/roozbeh-yz/miniF2F_v2

==================================

#AI #FormalReasoning #Benchmarks #MachineLearning #Dataset
GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models

📝 Summary:
GGBench is a new benchmark for evaluating geometric generative reasoning in unified multimodal models. It addresses a critical gap by assessing integrated cognitive processes, requiring language comprehension and precise visual generation to actively construct solutions. This sets a rigorous stan...

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11134
• PDF: https://arxiv.org/pdf/2511.11134

==================================

#GGBench #MultimodalAI #GeometricReasoning #GenerativeAI #AIResearch
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

📝 Summary:
A parallel multimodal diffusion framework, MMaDA-Parallel, enhances cross-modal alignment and semantic consistency in thinking-aware image synthesis by addressing error propagation issues in sequentia...

🔹 Publication Date: Published on Nov 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09611
• PDF: https://arxiv.org/pdf/2511.09611
• Project Page: https://tyfeld.github.io/mmadaparellel.github.io/
• Github: https://github.com/tyfeld/MMaDA-Parallel

🔹 Models citing this paper:
https://huggingface.co/tyfeld/MMaDA-Parallel-A
https://huggingface.co/tyfeld/MMaDA-Parallel-M

==================================

#MultimodalAI #DiffusionModels #ImageSynthesis #LLM #AIResearch
UFO^3: Weaving the Digital Agent Galaxy

📝 Summary:
UFO^3 unifies diverse digital devices into a single orchestration fabric, enabling AI agents to collaborate seamlessly across platforms. It models tasks dynamically for asynchronous execution, achieving efficient, resilient, and accurate cross-device task orchestration with improved parallelism a...

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11332
• PDF: https://arxiv.org/pdf/2511.11332
• Project Page: https://microsoft.github.io/UFO/
• Github: https://github.com/microsoft/UFO/

==================================

#AIAgents #TaskOrchestration #DistributedSystems #EdgeAI #MultiAgentSystems
Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models

📝 Summary:
VLMs degrade under test-time domain shifts. Spectrum-Aware Test-Time Steering (STS) is a lightweight method that adapts VLM latent representations by steering them using textual embedding subspaces, without backpropagation. STS surpasses the state of the art while offering faster inference and lower memory use.

🔹 Publication Date: Published on Nov 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09809
• PDF: https://arxiv.org/pdf/2511.09809

==================================

#VisionLanguageModels #ZeroShotGeneralization #DomainAdaptation #DeepLearning #AI
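The steering idea above can be sketched as projecting a latent onto the span of text embeddings and nudging it toward that projection, with no gradients involved. A minimal sketch, assuming an orthonormal text basis and a hypothetical blending weight `alpha`; the paper's actual STS procedure may differ:

```python
# Gradient-free latent steering toward a textual embedding subspace.
# Vectors are plain Python lists for illustration.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_onto(v, basis):
    """Project v onto the span of (assumed orthonormal) basis vectors."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(v, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def steer(latent, text_basis, alpha=0.5):
    """Blend the latent with its projection on the text subspace.

    alpha=0 leaves the latent unchanged; alpha=1 replaces it with the
    projection. No backpropagation is needed at any point.
    """
    proj = project_onto(latent, text_basis)
    return [(1 - alpha) * z + alpha * p for z, p in zip(latent, proj)]
```

For example, steering `[1.0, 1.0]` toward the subspace spanned by `[1.0, 0.0]` with `alpha=0.5` yields `[1.0, 0.5]`: the off-subspace component is halved.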
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

📝 Summary:
Uni-MoE-2.0-Omni is an open-source omnimodal large model improving multimodal understanding, reasoning, and generation. It uses dynamic MoE and progressive training to achieve state-of-the-art results across 85 benchmarks, outperforming leading models like Qwen2.5-Omni.

🔹 Publication Date: Published on Nov 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12609
• PDF: https://arxiv.org/pdf/2511.12609
• Project Page: https://idealistxy.github.io/Uni-MoE-v2.github.io/
• Github: https://github.com/HITsz-TMG/Uni-MoE

🔹 Models citing this paper:
https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Omni
https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Base
https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Image

==================================

#OmnimodalAI #LLMs #MixtureOfExperts #MultimodalLearning #AIResearch
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning

📝 Summary:
GroupRank introduces a novel groupwise reranking paradigm addressing limitations of pointwise and listwise methods. It processes queries with document groups to assign comparative relevance scores, combining flexibility with global context. Trained via reinforcement learning and synthesized data,...

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11653
• PDF: https://arxiv.org/pdf/2511.11653

==================================

#Reranking #ReinforcementLearning #InformationRetrieval #MachineLearning #DataScience
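The groupwise idea above can be sketched as scoring candidates jointly within groups so scores are comparative, then merging groups. A toy illustration with a stand-in scoring function (GroupRank's actual scorer is an RL-trained LLM, not the term-overlap heuristic used here):

```python
import math

def chunk(items, size):
    """Split candidates into fixed-size groups."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def group_scores(query, group, raw_score):
    """Softmax-normalize raw scores so they are comparative within a group."""
    raw = [raw_score(query, d) for d in group]
    m = max(raw)
    exps = [math.exp(r - m) for r in raw]
    z = sum(exps)
    return [e / z for e in exps]

def groupwise_rerank(query, docs, raw_score, group_size=4):
    """Score each group jointly, then merge all groups into one ranking."""
    scored = []
    for group in chunk(docs, group_size):
        scored += list(zip(group, group_scores(query, group, raw_score)))
    return [d for d, _ in sorted(scored, key=lambda x: x[1], reverse=True)]
```

Note the design trade-off the paper targets: pointwise scoring loses comparative context, listwise scoring is inflexible for long lists; grouping keeps within-group comparison while remaining composable across groups.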
TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models

📝 Summary:
TiViBench is a new benchmark assessing image-to-video models' reasoning across four dimensions and 24 tasks. Commercial models show stronger reasoning potential. VideoTPO, a test-time strategy, significantly enhances performance, advancing reasoning in video generation.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13704
• PDF: https://arxiv.org/pdf/2511.13704
• Project Page: https://haroldchen19.github.io/TiViBench-Page/
• Github: https://haroldchen19.github.io/TiViBench-Page/

==================================

#VideoGeneration #AIBenchmark #ComputerVision #DeepLearning #AIResearch
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image

📝 Summary:
PhysX-Anything generates simulation-ready physical 3D assets from single images, crucial for embodied AI. It uses a novel VLM-based model and an efficient 3D representation, enabling direct use in robotic policy learning.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13648
• PDF: https://arxiv.org/pdf/2511.13648
• Project Page: https://physx-anything.github.io/
• Github: https://github.com/ziangcao0312/PhysX-Anything

Datasets citing this paper:
https://huggingface.co/datasets/Caoza/PhysX-Mobility

==================================

#EmbodiedAI #3DReconstruction #Robotics #ComputerVision #AIResearch
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model

📝 Summary:
Part-X-MLLM is a 3D multimodal large language model that unifies diverse 3D tasks by generating structured programs from RGB point clouds and language prompts. It outputs part-level data and edit commands, enabling state-of-the-art 3D generation and editing through one interface.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13647
• PDF: https://arxiv.org/pdf/2511.13647
• Project Page: https://chunshi.wang/Part-X-MLLM/

==================================

#3D #MLLM #GenerativeAI #ComputerVision #AIResearch
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation

📝 Summary:
OlmoEarth is a novel multimodal spatio-temporal foundation model for Earth observation data. It employs new self-supervised learning methods to achieve state-of-the-art performance on many tasks. It is deployed as a platform for non-profits and NGOs.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13655
• PDF: https://arxiv.org/pdf/2511.13655
• Project Page: https://olmoearth.allenai.org/
• Github: https://github.com/allenai/olmoearth_pretrain

==================================

#EarthObservation #FoundationModels #AI #RemoteSensing #SelfSupervisedLearning
Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

📝 Summary:
Live-SWE-agent is the first live software engineering agent that autonomously and continuously evolves itself on-the-fly during runtime. It starts with basic tools and refines its own implementation while solving problems. It achieves 75.4% on SWE-bench Verified and 45.8% on SWE-Bench Pro, outper...

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13646
• PDF: https://arxiv.org/pdf/2511.13646

==================================

#SoftwareEngineering #AI #AutonomousAgents #SelfEvolvingAI #LiveSWEagent
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

📝 Summary:
WebCoach introduces a self-evolving framework for web agents with persistent cross-session memory. It uses a WebCondenser, External Memory Store, and a Coach to learn from past experiences without retraining. This significantly improves task success and enables smaller models to match larger LLM ...

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12997
• PDF: https://arxiv.org/pdf/2511.12997

==================================

#WebAgents #AI #MachineLearning #LLM #MemoryAI
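The cross-session memory loop above can be sketched as condense, store, retrieve, inject. Component names follow the summary, but the keyword-overlap retrieval and the condensing rule are illustrative stand-ins, not WebCoach's actual implementation:

```python
class ExternalMemoryStore:
    """Persists condensed experiences across sessions; no retraining."""
    def __init__(self):
        self.entries = []  # list of (keyword set, advice string)

    def add(self, keywords, advice):
        self.entries.append((set(keywords), advice))

    def retrieve(self, task, k=2):
        """Return the k entries whose keywords best overlap the task."""
        words = set(task.lower().split())
        ranked = sorted(self.entries, key=lambda e: len(e[0] & words), reverse=True)
        return [advice for _, advice in ranked[:k]]

def condense(trajectory):
    """WebCondenser stand-in: reduce a session log to keywords plus a tip."""
    keywords = {w.lower() for step in trajectory for w in step.split()}
    return keywords, f"Previously solved in {len(trajectory)} steps."

def coach(task, store):
    """Coach stand-in: inject retrieved experience into the agent's prompt."""
    hints = store.retrieve(task)
    return task if not hints else task + "\nHints: " + " ".join(hints)
```

Because guidance arrives through the prompt rather than through weight updates, the same loop can wrap a small model, which is how the summary's claim of smaller models matching larger ones becomes plausible.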
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

📝 Summary:
MiroThinker v1.0 is an open-source research agent introducing 'interactive scaling.' It trains models with reinforcement learning for deeper agent-environment interactions, performing up to 600 tool calls per task. This achieves state-of-the-art performance and establishes interaction depth as a ...

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11793
• PDF: https://arxiv.org/pdf/2511.11793
• Project Page: https://dr.miromind.ai/
• Github: https://github.com/MiroMindAI/MiroThinker

🔹 Models citing this paper:
https://huggingface.co/miromind-ai/MiroThinker-v1.0-72B
https://huggingface.co/miromind-ai/MiroThinker-v1.0-8B
https://huggingface.co/miromind-ai/MiroThinker-v1.0-30B

==================================

#MiroThinker #ResearchAgents #ReinforcementLearning #OpenSourceAI #LLM
P1: Mastering Physics Olympiads with Reinforcement Learning

📝 Summary:
P1 is a family of open-source physics reasoning models trained via reinforcement learning. P1-235B-A22B achieved Gold-medal performance at IPhO 2025 and won 12 other competitions. These models also show strong generalizability on other reasoning tasks.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13612
• PDF: https://arxiv.org/pdf/2511.13612
• Project Page: https://prime-rl.github.io/P1/
• Github: https://github.com/PRIME-RL/P1

==================================

#ReinforcementLearning #Physics #AI #MachineLearning #OpenSource
MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model

📝 Summary:
MicroVQA++ is a new high-quality microscopy VQA dataset built via a three-stage process. This includes HiCQA-Graph, a novel filtering method using NLI, CLIP, and MLLM signals. The dataset enables strong microscopy reasoning performance for MLLMs.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11407
• PDF: https://arxiv.org/pdf/2511.11407
• Github: https://github.com/ieellee/MicroVQA-PlusPlus

==================================

#MLLM #Microscopy #VQA #AIResearch #Dataset
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

📝 Summary:
SoCE is a novel model souping technique that boosts LLM performance. It uses non-uniform weighted averaging of expert models identified for specific benchmark categories, unlike uniform methods. This leads to state-of-the-art results and improved robustness.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13254
• PDF: https://arxiv.org/pdf/2511.13254

==================================

#LLMs #ModelSouping #MachineLearning #AI #StateOfTheArt
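The "simple arithmetic" above is a weighted average of parameter vectors from expert checkpoints. A minimal sketch, assuming the per-expert weights are already given (SoCE derives them from per-benchmark-category performance, which is not modeled here):

```python
def soup(experts, weights):
    """Non-uniform model soup: weighted average of parameter lists.

    experts: list of parameter vectors (one per expert checkpoint),
    weights: one weight per expert, summing to 1. Uniform souping is
    the special case weights = [1/n] * n.
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    merged = [0.0] * len(experts[0])
    for params, w in zip(experts, weights):
        merged = [m + w * p for m, p in zip(merged, params)]
    return merged
```

For example, `soup([[1.0, 2.0], [3.0, 4.0]], [0.75, 0.25])` returns `[1.5, 2.5]`, weighting the first expert three times as heavily as the second; in practice the same elementwise average is applied to every tensor in the checkpoints.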