ML Research Hub

✨CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

📝 Summary:
CritiCal, a novel training method using natural language critiques, significantly improves LLM confidence calibration. This method outperforms other approaches, including GPT-4o, enhancing reliability and generalization across tasks.
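
As a rough illustration of what "confidence calibration" measures, the sketch below computes expected calibration error (ECE), a common calibration metric. Whether the paper reports ECE is an assumption here, and the function name, bin count, and inputs are hypothetical. This is not the CritiCal training method itself.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: floats in [0, 1], the model's stated confidence per answer.
    correct: booleans, whether each answer was actually right.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Each bin contributes in proportion to how many answers it holds.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# An overconfident model: says 0.9 but is right only half the time.
overconfident = expected_calibration_error([0.9, 0.9], [True, False])
# A well-matched model: says 1.0 and is always right.
calibrated = expected_calibration_error([1.0, 1.0], [True, True])
```

A lower value means stated confidence tracks actual accuracy more closely.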

🔹 Publication Date: Published on Oct 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.24505
• PDF: https://arxiv.org/pdf/2510.24505

==================================

For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT

#LLM #ConfidenceCalibration #MachineLearning #NLP #AIResearch

422 views · 11:27

ML Research Hub

✨HAFixAgent: History-Aware Automated Program Repair Agent

📝 Summary:
HAFixAgent enhances automated program repair for complex multi-hunk bugs by incorporating repository history. It significantly improves bug-fixing effectiveness over existing agent-based systems while maintaining efficiency. This offers a practical approach for history-aware agentic APR.

🔹 Publication Date: Published on Nov 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01047
• PDF: https://arxiv.org/pdf/2511.01047

#AutomatedProgramRepair #SoftwareEngineering #AI #BugFixing #CodeRepair

420 views · 16:27

ML Research Hub

✨VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks

📝 Summary:
VeriCoT is a neuro-symbolic method to validate LLM Chain-of-Thought reasoning. It formalizes CoT steps into first-order logic for automated verification of consistency. This improves LLM reliability by identifying flawed reasoning and enhancing overall accuracy.

🔹 Publication Date: Published on Nov 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.04662
• PDF: https://arxiv.org/pdf/2511.04662

#LLM #ChainOfThought #NeuroSymbolic #AI #Logic

416 views · 17:28

ML Research Hub

✨VGGT: Visual Geometry Grounded Transformer

📝 Summary:
VGGT is a novel feed-forward neural network that efficiently infers multiple key 3D scene attributes from single or multiple views. It outperforms existing specialized models without requiring post-processing, achieving state-of-the-art results across several 3D computer vision tasks. VGGT also s...

🔹 Publication Date: Published on Mar 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2503.11651
• PDF: https://arxiv.org/pdf/2503.11651
• Project Page: https://vgg-t.github.io/
• Github: https://github.com/facebookresearch/vggt

🔹 Models citing this paper:
• https://huggingface.co/facebook/VGGT-1B
• https://huggingface.co/facebook/VGGT-1B-Commercial

✨ Spaces citing this paper:
• https://huggingface.co/spaces/facebook/vggt
• https://huggingface.co/spaces/Pointcept/Concerto
• https://huggingface.co/spaces/HanzhouLiu/Stylos_Demo

#3DComputerVision #Transformers #DeepLearning #ComputerVision #AI

arXiv abstract excerpt: "We present VGGT, a feed-forward neural network that directly infers all key 3D attributes of a scene, including camera parameters, point maps, depth maps, and 3D point tracks, from one, a few, or..."

406 views · 19:29

ML Research Hub

✨Real-Time Reasoning Agents in Evolving Environments

📝 Summary:
AI agents struggle with real-time reasoning in dynamic environments, failing to balance logical judgments with timely responses. This paper introduces Real-Time Reasoning Gym and AgileThinker. AgileThinker combines reactive and planning approaches to effectively balance reasoning depth and respon...

🔹 Publication Date: Published on Nov 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.04898
• PDF: https://arxiv.org/pdf/2511.04898
• Project Page: https://realtimegym.saltlab.stanford.edu
• Github: https://github.com/SALT-NLP/RealtimeGym

#AI #RealTimeAI #AutonomousAgents #DynamicEnvironments #MachineLearning

376 views · 23:30

ML Research Hub

✨HaluMem: Evaluating Hallucinations in Memory Systems of Agents

📝 Summary:
HaluMem is a new benchmark that evaluates memory hallucinations in AI systems by localizing them to specific stages: extraction, updating, and question answering. It uses large human-AI interaction datasets. Findings show current systems accumulate hallucinations during extraction and updating, w...

🔹 Publication Date: Published on Nov 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.03506
• PDF: https://arxiv.org/pdf/2511.03506
• Github: https://github.com/MemTensor/HaluMem

✨ Datasets citing this paper:
• https://huggingface.co/datasets/IAAR-Shanghai/HaluMem

#AIHallucinations #AIAgents #MemorySystems #LLM #AIResearch

273 views · 04:00

ML Research Hub

✨RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services

📝 Summary:
RedOne 2.0 is an SNS-oriented LLM trained with a progressive, RL-prioritized post-training paradigm for rapid and stable adaptation to social networking challenges. This 4B model significantly improves over a 7B baseline and achieves an 8.74 performance lift from base models with less data, demon...

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07070
• PDF: https://arxiv.org/pdf/2511.07070

#LLM #SocialNetworking #ReinforcementLearning #NLP #DeepLearning

207 views · 04:01

ML Research Hub

✨RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization

📝 Summary:
RLoop is a self-improving framework addressing Reinforcement Learning overfitting and generalization issues. It uses iterative policy initialization and Rejection-sampling Fine-Tuning to convert diverse policy variations into robust performance gains, boosting accuracy and mitigating catastrophic...

🔹 Publication Date: Published on Nov 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.04285
• PDF: https://arxiv.org/pdf/2511.04285

#ReinforcementLearning #MachineLearning #AI #DeepLearning #Generalization

206 views · 04:01

ML Research Hub

✨Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs

📝 Summary:
MoE LLMs have suboptimal routers that cause significant performance gaps. Routing Manifold Alignment (RoMA) aligns routing weights with task embeddings by adding a regularization term during lightweight finetuning of the routers. This improves generalization by encouraging similar samples to share expert c...

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07419
• PDF: https://arxiv.org/pdf/2511.07419
• Github: https://github.com/tianyi-lab/RoMA

#LLMs #MixtureOfExperts #DeepLearning #AI #MachineLearning

206 views · 04:01

ML Research Hub

✨DIMO: Diverse 3D Motion Generation for Arbitrary Objects

📝 Summary:
DIMO is a generative approach that creates diverse 3D motions for arbitrary objects from a single image. It distills motion patterns from video models into a latent space and uses neural key point trajectories to drive 3D object models. This enables sampling of diverse motions and applications such as motion interpolation.

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07409
• PDF: https://arxiv.org/pdf/2511.07409

#DIMO #3DMotion #GenerativeAI #ComputerVision #DeepLearning

204 views · 04:02

ML Research Hub

✨IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction

📝 Summary:
IterResearch improves long-horizon reasoning by reformulating it as a Markov Decision Process with strategic workspace reconstruction. This novel paradigm overcomes context suffocation, achieving substantial performance gains and unprecedented interaction scaling, and also serves as an effective ...

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07327
• PDF: https://arxiv.org/pdf/2511.07327

#ReinforcementLearning #AI #MachineLearning #AIagents #MDP

204 views · 05:02

ML Research Hub

✨MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

📝 Summary:
MVU-Eval is a new comprehensive benchmark for evaluating Multi-Video Understanding in Multimodal Large Language Models. It addresses a critical gap in existing single-video benchmarks and reveals significant performance limitations in current MLLMs for multi-video scenarios.

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07250
• PDF: https://arxiv.org/pdf/2511.07250
• Project Page: https://huggingface.co/datasets/MVU-Eval-Team/MVU-Eval-Data
• Github: https://github.com/NJU-LINK/MVU-Eval

#MLLMs #VideoUnderstanding #AI #Benchmarking #ComputerVision

165 views · 05:02

ML Research Hub

✨The Station: An Open-World Environment for AI-Driven Discovery

📝 Summary:
The Station is an open-world multi-agent AI environment enabling autonomous scientific discovery. Agents engage in full scientific journeys, achieving state-of-the-art results across diverse benchmarks. This new paradigm fosters emergent behaviors and novel method development, moving beyond rigid...

🔹 Publication Date: Published on Nov 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06309
• PDF: https://arxiv.org/pdf/2511.06309
• Github: https://github.com/dualverse-ai/station

#AI #MultiAgentSystems #ScientificDiscovery #OpenWorldAI #AutonomousAI

❤1

156 views · 05:02

ML Research Hub

✨Robot Learning from a Physical World Model

📝 Summary:
PhysWorld enables robots to learn accurate manipulation from AI-generated videos by integrating video generation with physical world modeling. This approach grounds visual guidance into physically executable actions, eliminating the need for real robot data.

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07416
• PDF: https://arxiv.org/pdf/2511.07416
• Project Page: https://pointscoder.github.io/PhysWorld_Web/
• Github: https://github.com/PointsCoder/OpenReal2Sim

#RobotLearning #Robotics #AI #PhysicalModeling #MachineLearning

139 views · 05:03

ML Research Hub

✨DigiData: Training and Evaluating General-Purpose Mobile Control Agents

📝 Summary:
DigiData provides a diverse, high-quality dataset for training mobile control agents with complex goals from app feature exploration. DigiData-Bench offers dynamic AI-powered evaluation protocols, improving agent assessment beyond common metrics.

🔹 Publication Date: Published on Nov 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07413
• PDF: https://arxiv.org/pdf/2511.07413
• Project Page: https://facebookresearch.github.io/DigiData

#MobileAgents #ArtificialIntelligence #MachineLearning #Datasets #AgentTraining

❤1

169 views · 05:03

ML Research Hub

✨SWE-fficiency: Can Language Models Optimize Real-World Repositories on Real Workloads?

📝 Summary:
SWE-fficiency is a new benchmark evaluating how language models optimize real-world software repositories for performance on actual workloads. Agents must identify bottlenecks and generate correct code patches matching expert speedup. Current agents significantly underperform, struggling with loc...

🔹 Publication Date: Published on Nov 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06090
• PDF: https://arxiv.org/pdf/2511.06090
• Project Page: https://swefficiency.com/

#LLM #SoftwareOptimization #PerformanceTuning #AIagents #Benchmarking

175 views · 05:03

ML Research Hub

✨LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs

📝 Summary:
LUT-LLM is an FPGA accelerator for LLM inference that leverages on-chip memory to shift computation from arithmetic to memory-based operations via table lookups. This innovative approach achieves 1.66x lower latency than AMD MI210 and 1.72x higher energy efficiency than NVIDIA A100 for a 1.7B LLM.
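
The general idea of trading arithmetic for table lookups can be sketched as follows. This is a toy illustration only, not the accelerator's actual design: the 4-bit codebooks, table layout, and function names are all assumptions made for the example.

```python
# Hypothetical 4-bit codebooks mapping integer codes to real values.
# A real system would choose its own quantization scheme.
LEVELS = 16
weight_values = [(i - 8) / 8.0 for i in range(LEVELS)]
activation_values = [(i - 8) / 4.0 for i in range(LEVELS)]

# Precompute every weight * activation product once: 16 x 16 = 256 entries.
PRODUCT_TABLE = [[w * a for a in activation_values] for w in weight_values]


def lut_dot(weight_codes, activation_codes):
    """Dot product of two quantized vectors using only lookups and adds."""
    return sum(PRODUCT_TABLE[w][a] for w, a in zip(weight_codes, activation_codes))


def reference_dot(weight_codes, activation_codes):
    """The same dot product computed with ordinary multiplications."""
    return sum(
        weight_values[w] * activation_values[a]
        for w, a in zip(weight_codes, activation_codes)
    )


w_codes = [0, 8, 15, 3]
a_codes = [3, 8, 12, 15]
lut_result = lut_dot(w_codes, a_codes)
ref_result = reference_dot(w_codes, a_codes)
```

Because the table is small and fixed, every multiply at inference time becomes a memory read, which is the kind of trade the summary describes.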

🔹 Publication Date: Published on Nov 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06174
• PDF: https://arxiv.org/pdf/2511.06174

#LLM #FPGA #AI #DeepLearning #AIHardware

237 views · 05:03

ML Research Hub

✨DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation

📝 Summary:
This study develops a two-stage reinforcement learning method for competitive code generation. It uses tailored data curation and a hard-focus curriculum, achieving state-of-the-art performance on competitive programming benchmarks.

🔹 Publication Date: Published on Nov 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06307
• PDF: https://arxiv.org/pdf/2511.06307

#ReinforcementLearning #CodeGeneration #DataCuration #MachineLearning #AIResearch

❤1

209 views · 06:04

ML Research Hub

✨SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization

📝 Summary:
SofT-GRPO is a novel algorithm that enhances soft-thinking in LLMs by integrating Gumbel noise and Gumbel-Softmax. This method successfully reinforces soft-thinking policies, enabling LLMs to outperform discrete-token reinforcement learning approaches, especially on complex tasks.
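
For readers unfamiliar with the Gumbel-Softmax trick mentioned above, here is a minimal standard-library sketch of the generic technique. It is not the SofT-GRPO algorithm; the function name, temperature value, and example logits are assumptions for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed (soft) one-hot sample over the categories in `logits`."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise is g = -log(-log(u)); add it to the logit.
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Seeded generator so repeated runs draw the same noise.
sample = gumbel_softmax([2.0, 1.0, 0.1], temperature=0.5, rng=random.Random(0))
```

Lower temperatures push the output toward a hard one-hot choice, while higher temperatures keep it closer to a smooth mixture over tokens.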

🔹 Publication Date: Published on Nov 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.06411
• PDF: https://arxiv.org/pdf/2511.06411

🔹 Models citing this paper:
• https://huggingface.co/zz1358m/SofT-GRPO-master

#LLM #ReinforcementLearning #AI #MachineLearning #DeepLearning

177 views · 06:04

ML Research Hub

✨Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models

📝 Summary:
Diffusion-SDPO improves text-to-image quality by fixing a flaw in standard DPO where preferred output error can increase. It uses a safeguarded update to adaptively scale the loser gradient, ensuring the preferred output's error never increases. This leads to consistent quality gains across bench...

🔹 Publication Date: Published on Nov 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.03317
• PDF: https://arxiv.org/pdf/2511.03317
• Github: https://github.com/AIDC-AI/Diffusion-SDPO

🔹 Models citing this paper:
• https://huggingface.co/AIDC-AI/Diffusion-SDPO

#DiffusionModels #DPO #TextToImage #GenerativeAI #AI

194 views · 07:04
