Experience-Guided Adaptation of Inference-Time Reasoning Strategies
📅 Publication date: Nov 14 2025
📑 Paper
🔗 Code: N/A
📝 Description:
Experience-Guided Reasoner dynamically generates and optimizes computational strategies at inference time, adapting to problems using accumulated experience and improving accuracy and efficiency.
Dynamic Reflections: Probing Video Representations with Text Alignment
📅 Publication date: Nov 4 2025
📑 Paper
🔗 Code: N/A
Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models
📅 Publication date: Nov 12 2025
📑 Paper
🔗 Code: N/A
P1: Mastering Physics Olympiads with Reinforcement Learning
📅 Publication date: Nov 17 2025
📑 Paper PDF
🔗 Code Repository
INDIBATOR: Diverse and Fact-Grounded Individuality for Multi-Agent Debate in Molecular Discovery
📅 Publication Date: Feb 2 2026
📑 Paper: https://arxiv.org/pdf/2602.01815
🔗 Code: N/A
📝 Description:
Multi-agent systems for molecular discovery that use individualized scientist profiles based on publication and molecular history outperform traditional role-based approaches.
OVD: On-policy Verbal Distillation
📅 Publication Date: Jan 29
📑 Paper: https://arxiv.org/pdf/2601.21968
🔗 Code: https://OVD.github.io
📝 Description:
On-policy Verbal Distillation (OVD) enables efficient knowledge transfer from teacher to student models by replacing token-level probability matching with trajectory matching.
PISA: Piecewise Sparse Attention Is Wiser for Efficient Diffusion Transformers
📅 Publication Date: Feb 1
📑 Paper: https://arxiv.org/pdf/2602.01077
🔗 Code: https://github.com/xie-lab-ml/piecewise-sparse-attention
📝 Description:
PISA is a novel sparse attention method that improves diffusion transformer efficiency by approximating non-critical attention blocks instead of discarding them, achieving faster processing.
DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains
📅 Publication date: Nov 14 2025
📑 Paper PDF
🔗 Code: N/A
📝 Description:
A new benchmark DiscoX and evaluation system Metric-S are introduced to assess discourse-level and expert-level Chinese-English translation, highlighting the challenges in achieving professional-grade machine translation.
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
📅 Publication date: Nov 18 2025
📑 Paper: https://arxiv.org/pdf/2511.14460.pdf
🔗 Code: https://github.com/0russwest0/Agent-R1
📝 Description:
A new training framework for RL-based LLM Agents is introduced, extending MDP methodology and demonstrating effectiveness on Multihop QA tasks.
UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity
📅 Publication date: Nov 17 2025
📑 Paper: https://arxiv.org/pdf/2511.13714.pdf
🔗 Code: https://github.com/yujunwei04/UnSAMv2
Virtual Width Networks
📅 Publication date: Nov 14 2025
📑 Paper: https://arxiv.org/pdf/2511.11238.pdf
🔗 Code: N/A
📝 Description:
Virtual Width Networks (VWN) enhance model efficiency by expanding representational width without increasing computational cost, accelerating optimization and improving loss reduction.
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model
📅 Publication date: Nov 17 2025
📑 Paper: https://arxiv.org/pdf/2511.13647.pdf
🔗 Code: https://github.com/AiEson/Part-X-MLLM
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
📅 Publication date: Jul 23 2025
📑 Paper PDF
🔗 Code: N/A
📝 Description:
InstructVLA is an end-to-end vision-language-action model that enhances manipulation performance while preserving vision-language reasoning through multimodal training and mixture-of-experts adaptation.
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
📅 Publication date: Aug 5 2025
📑 Paper PDF
🔗 Code: N/A
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation
📅 Publication date: Nov 17 2025
📑 Paper PDF
🔗 Code Repository
Influence Guided Sampling for Domain Adaptation of Text Retrievers
📅 Publication Date: Jan 29
📑 Paper: https://arxiv.org/pdf/2601.21759
🔗 Code: N/A
📝 Description:
A reinforcement learning-based sampling framework adaptively reweights training datasets to improve embedding model performance while reducing GPU costs.
Kronos: A Foundation Model for the Language of Financial Markets
📅 Publication Date: Aug 2, 2025
📑 Paper: https://arxiv.org/pdf/2508.02739
🔗 Code: https://github.com/shiyu-coder/Kronos
📝 Description:
Kronos is a novel foundation model for financial K-line data, employing a specialized tokenizer and autoregressive pre-training on a massive dataset. It significantly outperforms existing models in forecasting, volatility prediction, and generating synthetic financial data.