✨SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System
📝 Summary:
SciEducator is a self-evolving multi-agent system designed for scientific video understanding and education. It integrates professional knowledge and step-wise reasoning to interpret scientific activities and produce multimodal educational content. SciEducator significantly outperforms existing m...
🔹 Publication Date: Published on Nov 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17943
• PDF: https://arxiv.org/pdf/2511.17943
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#MultiAgentSystems #AIEducation #VideoUnderstanding #EdTech #AIResearch
✨SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
📝 Summary:
SSA is a new training framework for sparse attention in LLMs that aligns sparse and full attention outputs. It achieves state-of-the-art performance and stronger sparsity, and it improves long-context extrapolation, allowing flexible compute-performance trade-offs.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20102
• PDF: https://arxiv.org/pdf/2511.20102
#LLM #SparseAttention #DeepLearning #AIResearch #ModelEfficiency
✨Cognitive Foundations for Reasoning and Their Manifestation in LLMs
📝 Summary:
LLMs underutilize cognitive elements and meta-cognitive controls, leading to reasoning gaps. A new framework shows models fail to spontaneously deploy successful strategies. Test-time guidance significantly improves their performance on complex problems.
🔹 Publication Date: Published on Nov 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16660
• PDF: https://arxiv.org/pdf/2511.16660
• Project Page: https://tinyurl.com/cognitive-foundations
• Github: https://github.com/pkargupta/cognitive_foundations/
#LLMs #CognitiveAI #Reasoning #ArtificialIntelligence #DeepLearning
✨Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking
📝 Summary:
LLM forecasting ability varies significantly across domains and question types. Their predictive performance depends heavily on context, external knowledge, and how questions are asked.
🔹 Publication Date: Published on Nov 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18394
• PDF: https://arxiv.org/pdf/2511.18394
#LLM #Forecasting #AI #MachineLearning #DataScience
✨Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution
📝 Summary:
A new task, ORS3D, is introduced for embodied agents, requiring language understanding, 3D grounding, and efficient parallel task scheduling. The ORS3D-60K dataset and GRANT, an embodied LLM with a scheduling token mechanism, enable agents to minimize total completion time.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19430
• PDF: https://arxiv.org/pdf/2511.19430
• Project Page: https://h-embodvis.github.io/GRANT/
• Github: https://github.com/H-EmbodVis/GRANT
🔹 Models citing this paper:
• https://huggingface.co/H-EmbodVis/GRANT
✨ Datasets citing this paper:
• https://huggingface.co/datasets/H-EmbodVis/ORS3D-60K
#EmbodiedAI #LLM #Robotics #TaskScheduling #AIResearch
✨MagicWorld: Interactive Geometry-driven Video World Exploration
📝 Summary:
MagicWorld improves interactive video world models by integrating 3D geometry for structural stability and historical retrieval to prevent error accumulation. This allows for continuous, consistent scene evolution driven by user actions from a single image.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18886
• PDF: https://arxiv.org/pdf/2511.18886
• Project Page: https://vivocameraresearch.github.io/magicworld/
🔹 Models citing this paper:
• https://huggingface.co/LuckyLiGY/MagicWorld
#ComputerVision #3DGeometry #GenerativeAI #DeepLearning #VideoGeneration
✨Diverse Video Generation with Determinantal Point Process-Guided Policy Optimization
📝 Summary:
DPP-GRPO combines Determinantal Point Processes and Group Relative Policy Optimization to enhance text-to-video diversity. It explicitly rewards varied generations, improving overall diversity without sacrificing quality or prompt fidelity.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20647
• PDF: https://arxiv.org/pdf/2511.20647
• Project Page: https://diverse-video.github.io/
#VideoGeneration #GenerativeAI #DeepLearning #ReinforcementLearning #ComputerVision
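As background for the DPP idea in the summary above: a determinantal point process scores a set of items by the determinant of their similarity kernel, so near-duplicate items push the score toward zero. The sketch below only illustrates that property. It is not the paper's reward; the cosine-similarity kernel, the toy embeddings, and all function names are assumptions made for illustration.

```python
import math


def cosine_kernel(vectors):
    """Gram matrix of cosine similarities between embedding vectors."""
    unit = []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        unit.append([x / norm for x in v])
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular kernel: the set contains redundant items
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return det


def dpp_diversity(vectors):
    """Hypothetical diversity score: determinant of the similarity kernel."""
    return determinant(cosine_kernel(vectors))


# Toy embeddings (assumed): three orthogonal items vs. a set with two near-duplicates.
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
redundant = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
print(dpp_diversity(diverse) > dpp_diversity(redundant))
```

A score like this could be added as one term of a group-level reward, which is the general shape the summary describes; how the paper actually combines it with GRPO is not shown here.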
✨Concept-Aware Batch Sampling Improves Language-Image Pretraining
📝 Summary:
Concept-Aware Batch Sampling (CABS) improves vision-language models by flexibly curating training data online based on specific concept distributions. Using the DataConcept dataset, CABS significantly enhances CLIP and SigLIP model performance across 28 benchmarks. It offers an effective open-sourc...
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20643
• PDF: https://arxiv.org/pdf/2511.20643
#MachineLearning #ComputerVision #NLP #DeepLearning #AIResearch
✨Uplifting Table Tennis: A Robust, Real-World Application for 3D Trajectory and Spin Estimation
📝 Summary:
This paper proposes a robust two-stage pipeline for accurate 3D table tennis ball motion analysis from monocular video. It separates perception and 2D-to-3D uplifting, training with real 2D and robust synthetic 3D data to create a practical real-world application.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20250
• PDF: https://arxiv.org/pdf/2511.20250
• Project Page: https://kiedani.github.io/WACV2026/index.html
#ComputerVision #3DReconstruction #SportsTech #MachineLearning #RealWorldAI
✨STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow
📝 Summary:
STARFlow-V introduces a normalizing flow-based model for end-to-end video generation, offering robust causal prediction and high quality. It achieves strong visual fidelity and temporal consistency using a global-local latent architecture and flow-score matching, establishing NFs as a promising a...
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20462
• PDF: https://arxiv.org/pdf/2511.20462
• Project Page: https://starflow-v.github.io
• Github: https://github.com/apple/ml-starflow
#VideoGeneration #NormalizingFlow #GenerativeAI #MachineLearning #DeepLearning
✨CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
📝 Summary:
CLaRa improves retrieval-augmented generation by using unified embedding-based compression and joint end-to-end optimization. It introduces SCP for semantic compression and trains both reranker and generator with a single loss, achieving state-of-the-art results.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18659
• PDF: https://arxiv.org/pdf/2511.18659
• Github: https://github.com/apple/ml-clara
🔹 Models citing this paper:
• https://huggingface.co/probejie/CLaRa-Base
• https://huggingface.co/probejie/CLaRa-E2E
• https://huggingface.co/probejie/CLaRa-Instruct
#RAG #MachineLearning #GenerativeAI #NLP #DeepLearning
✨ROOT: Robust Orthogonalized Optimizer for Neural Network Training
📝 Summary:
ROOT is a robust optimizer for LLMs addressing dimensional fragility and outlier noise. It employs adaptive Newton iterations for precise orthogonalization and proximal optimization to suppress noise, yielding improved stability, faster convergence, and better performance.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20626
• PDF: https://arxiv.org/pdf/2511.20626
• Github: https://github.com/huawei-noah/noah-research
#Optimizer #NeuralNetworks #LLMs #DeepLearning #MachineLearning
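The summary above mentions Newton iterations for orthogonalization. As background, the classical Newton-Schulz iteration X ← 1.5·X − 0.5·X·Xᵀ·X drives the singular values of a pre-scaled matrix toward 1, giving an approximately orthogonal factor. The sketch below is the textbook fixed-coefficient version, not ROOT's adaptive variant; the function names, the Frobenius-norm scaling, and the step count are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(g, steps=30):
    """Approximate the orthogonal polar factor of a full-rank matrix g."""
    # Scale by the Frobenius norm so every singular value is at most 1,
    # which keeps the iteration inside its convergence region.
    fro = math.sqrt(sum(v * v for row in g for v in row))
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [
            [1.5 * x[i][j] - 0.5 * xxtx[i][j] for j in range(len(x[0]))]
            for i in range(len(x))
        ]
    return x


# Toy "gradient" matrix (assumed values).
grad = [[3.0, 1.0], [0.0, 2.0]]
ortho = newton_schulz_orthogonalize(grad)
gram = matmul(ortho, transpose(ortho))  # should be close to the identity
print([[round(v, 6) for v in row] for row in gram])
```

Very small singular values grow by only about 1.5x per step, so ill-conditioned inputs need more iterations than the default used here.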
✨NVIDIA Nemotron Parse 1.1
📝 Summary:
Nemotron-Parse-1.1 is a lightweight OCR and document parsing model with improved capabilities. It excels in general OCR, markdown, structured tables, and text extraction from images using an encoder-decoder architecture. The model achieves competitive accuracy and is publicly released.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20478
• PDF: https://arxiv.org/pdf/2511.20478
🔹 Models citing this paper:
• https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.1
• https://huggingface.co/nvidia/NVIDIA-Nemotron-Parse-v1.1-TC
✨ Spaces citing this paper:
• https://huggingface.co/spaces/prithivMLmods/NVIDIA-Nemotron-Parse-OCR
#OCR #DocumentParsing #DeepLearning #AI #NVIDIA
✨Terminal Velocity Matching
📝 Summary:
Terminal Velocity Matching (TVM) generalizes flow matching for high-fidelity generative modeling. It achieves state-of-the-art ImageNet performance with minimal steps, e.g., 1.99 FID in 4 NFEs, through improved diffusion transition modeling and adapted transformers.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19797
• PDF: https://arxiv.org/pdf/2511.19797
#GenerativeAI #FlowMatching #DeepLearning #ComputerVision #DiffusionModels
✨Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
📝 Summary:
Inferix is a next-gen inference engine for immersive world simulation, generating high-quality interactive videos. It uses semi-autoregressive block-diffusion with LLM-style KV Cache for efficient, stable generation, enabling real-time world dynamics.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20714
• PDF: https://arxiv.org/pdf/2511.20714
• Github: https://github.com/alibaba-damo-academy/Inferix
#WorldSimulation #DiffusionModels #GenerativeAI #AIResearch #RealtimeAI
✨Latent Collaboration in Multi-Agent Systems
📝 Summary:
LatentMAS enables LLM agents to collaborate directly in latent space, surpassing text-based communication. This boosts reasoning quality, accuracy, and efficiency (speed and token usage) without extra training.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20639
• PDF: https://arxiv.org/pdf/2511.20639
• Github: https://github.com/Gen-Verse/LatentMAS
#LLM #MultiAgentSystems #LatentSpace #AIAgents #ArtificialIntelligence
✨Monet: Reasoning in Latent Visual Space Beyond Images and Language
📝 Summary:
Monet is a new framework enabling MLLMs to reason directly in latent visual space using continuous embeddings as intermediate visual thoughts. It addresses training challenges with a three-stage distillation pipeline and introduces VLPO, yielding stronger results on visual reasoning tasks.
🔹 Publication Date: Published on Nov 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.21395
• PDF: https://arxiv.org/pdf/2511.21395
• Github: https://github.com/NOVAglow646/Monet
#MLLM #VisualReasoning #LatentSpace #AI #DeepLearning
✨Revisiting Generalization Across Difficulty Levels: It's Not So Easy
📝 Summary:
This paper shows that large language models do not consistently generalize across different task difficulties. Training on only easy or hard data is insufficient for broad improvement. This highlights the need for diverse difficulty levels in both training and evaluation datasets for LLMs.
🔹 Publication Date: Published on Nov 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.21692
• PDF: https://arxiv.org/pdf/2511.21692
• Github: https://github.com/BatsResearch/Cross-Difficulty
#LLM #AIResearch #MachineLearning #Generalization #DatasetDesign
✨Frequency-Adaptive Sharpness Regularization for Improving 3D Gaussian Splatting Generalization
📝 Summary:
This paper introduces Frequency-Adaptive Sharpness Regularization (FASR) to improve 3DGS generalization in novel view synthesis. FASR adaptively adjusts regularization based on local image frequency, preventing overfitting and reconstructing fine details better than prior methods.
🔹 Publication Date: Published on Nov 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17918
• PDF: https://arxiv.org/pdf/2511.17918
• Project Page: https://bbangsik13.github.io/FASR
#3DGS #NeuralRendering #ComputerVision #DeepLearning #AI
✨MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
📝 Summary:
MobileVLA-R1 is a unified framework for quadruped robots that improves vision-language-action through supervised chain-of-thought alignment and GRPO reinforcement learning. This two-stage training enhances reasoning and control stability. It achieves superior performance in complex environments, ...
🔹 Publication Date: Published on Nov 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17889
• PDF: https://arxiv.org/pdf/2511.17889
• Project Page: https://aigeeksgroup.github.io/MobileVLA-R1/
• Github: https://github.com/AIGeeksGroup/MobileVLA-R1
✨ Datasets citing this paper:
• https://huggingface.co/datasets/AIGeeksGroup/MobileVLA-CoT
#Robotics #VisionLanguageModels #ReinforcementLearning #MobileRobots #AI
✨SPHINX: A Synthetic Environment for Visual Perception and Reasoning
📝 Summary:
SPHINX is a synthetic environment for visual perception and reasoning that uses procedurally generated puzzles to evaluate large vision-language models. It shows that current state-of-the-art models perform poorly, but reinforcement learning with verifiable rewards substantially improves accuracy.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20814
• PDF: https://arxiv.org/pdf/2511.20814
• Github: https://github.com/xashru/sphinx
✨ Datasets citing this paper:
• https://huggingface.co/datasets/xashru/sphinx
#AI #ComputerVision #ReinforcementLearning #VisionLanguageModels #SyntheticEnvironments