✨PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding
📝 Summary:
PhysChoreo generates physically realistic and controllable videos from a single image. It reconstructs part-aware physical properties and simulates dynamic behavior, outperforming existing methods.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20562
• PDF: https://arxiv.org/pdf/2511.20562
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VideoGeneration #PhysicalSimulation #ComputerVision #DeepLearning #AIResearch
✨Fara-7B: An Efficient Agentic Model for Computer Use
📝 Summary:
FaraGen creates synthetic datasets for computer use agents, solving a data scarcity problem. This data trains Fara-7B, a small on-device model that perceives computers via screenshots and outperforms larger models on diverse web tasks.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19663
• PDF: https://arxiv.org/pdf/2511.19663
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AIAgents #OnDeviceAI #SyntheticData #MachineLearning #ComputerVision
✨Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning
📝 Summary:
Agent0-VL is a self-evolving vision-language agent that integrates tool usage into both reasoning and self-evaluation. It uses a Solver and Verifier in a self-evolving cycle for continuous improvement without human annotation or external rewards, achieving a 12.5% performance gain.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19900
• PDF: https://arxiv.org/pdf/2511.19900
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AIAgents #VisionLanguage #SelfEvolvingAI #ToolAugmentedAI #AIResearch
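A minimal sketch of the solver/verifier self-evolution pattern described in the summary, assuming generic callable interfaces; the names `solve`, `verify`, and the scoring loop are hypothetical placeholders, not Agent0-VL's actual implementation.

```python
# Minimal sketch of a solver/verifier self-evolution loop (hypothetical
# interfaces; not the paper's actual code).
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Attempt:
    answer: str
    score: float   # verifier's self-assessed quality, no external reward

def self_evolve(task: str,
                solver: Callable[[str, List[Attempt]], str],
                verifier: Callable[[str, str], float],
                rounds: int = 3) -> Attempt:
    """Alternate solving and verifying; keep the best-scored attempt."""
    history: List[Attempt] = []
    best = Attempt(answer="", score=float("-inf"))
    for _ in range(rounds):
        answer = solver(task, history)          # solver may call visual tools
        score = verifier(task, answer)          # verifier may also call tools
        attempt = Attempt(answer, score)
        history.append(attempt)                 # feedback for the next round
        if score > best.score:
            best = attempt
    return best

# Toy usage with stub functions standing in for the VLM solver/verifier.
if __name__ == "__main__":
    solver = lambda task, hist: f"attempt-{len(hist) + 1} for: {task}"
    verifier = lambda task, ans: float(len(ans) % 7)  # stand-in scoring
    print(self_evolve("count objects in the image", solver, verifier))
```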
✨Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
📝 Summary:
VISTA-Gym is a scalable training environment that enhances vision-language models' (VLMs) tool-integrated visual reasoning using reinforcement learning. It unifies diverse multimodal tasks and provides standardized visual tools. VISTA-R1, trained with VISTA-Gym, significantly outperforms leading baselines.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19773
• PDF: https://arxiv.org/pdf/2511.19773
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VLMs #ReinforcementLearning #ToolIntegratedAI #MultimodalAI #AIResearch
✨UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers
📝 Summary:
Video diffusion transformers struggle with video length extrapolation due to attention dispersion, causing quality degradation and repetition. UltraViCo suppresses attention for tokens beyond the training window, improving quality and reducing repetition. This extends the extrapolation limit from...
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20123
• PDF: https://arxiv.org/pdf/2511.20123
• Project Page: https://thu-ml.github.io/UltraViCo.github.io/
• Github: https://github.com/thu-ml/DiT-Extrapolation
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VideoAI #DiffusionModels #Transformers #GenerativeAI #DeepLearning
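A short sketch of the core idea as stated in the summary: attenuate attention to key positions that lie beyond the training window. The suppression factor and where it is applied are assumptions for illustration; the paper's exact scaling and schedule may differ.

```python
# Sketch of suppressing attention to tokens beyond the training window
# (assumed mechanism based on the summary above).
import torch

def windowed_attention(q, k, v, train_len: int, suppress: float = 0.1):
    """Scaled dot-product attention that attenuates scores for key
    positions past the training window length `train_len`."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5          # [B, Tq, Tk]
    t_k = k.size(-2)
    # 1.0 inside the training window, `suppress` (<1) beyond it.
    scale = torch.ones(t_k, device=q.device)
    scale[train_len:] = suppress
    scores = scores + scale.log()          # multiplicative in probability space
    attn = scores.softmax(dim=-1)
    return attn @ v

# Example: a sequence twice the training length.
B, T, D, train_len = 1, 64, 32, 32
q, k, v = (torch.randn(B, T, D) for _ in range(3))
print(windowed_attention(q, k, v, train_len).shape)  # torch.Size([1, 64, 32])
```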
✨ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding
📝 Summary:
ReDirector presents a camera-controlled video retake generation method using Rotary Camera Encoding (RoCE). This novel camera-conditioned RoPE phase shift improves dynamic object localization and static background preservation across variable-length videos and diverse camera trajectories.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19827
• PDF: https://arxiv.org/pdf/2511.19827
• Project Page: https://byeongjun-park.github.io/ReDirector/
• Github: https://byeongjun-park.github.io/ReDirector/
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VideoGeneration #ComputerVision #AIResearch #CameraControl #VideoEditing
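An illustrative sketch of adding a camera-conditioned phase shift to rotary position embeddings (RoPE). How RoCE actually derives the shift from the camera trajectory is not specified in the summary, so `phase_from_camera` below is a hypothetical stand-in rather than the paper's formulation.

```python
# Illustrative camera-conditioned phase shift on top of standard RoPE.
import torch

def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0):
    """Standard RoPE angles: positions [T] -> [T, dim/2]."""
    inv_freq = 1.0 / base ** (torch.arange(0, dim, 2).float() / dim)
    return positions.float()[:, None] * inv_freq[None, :]

def apply_rope(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    """Rotate feature pairs of x [T, dim] by the given angles [T, dim/2]."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = angles.cos(), angles.sin()
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def camera_conditioned_rope(x, positions, camera_params, dim):
    # Hypothetical: map per-frame camera parameters to a per-position phase
    # offset, then add it to the usual RoPE angles.
    phase_from_camera = camera_params @ torch.randn(camera_params.size(-1), dim // 2)
    return apply_rope(x, rope_angles(positions, dim) + phase_from_camera)

T, dim = 8, 16
x = torch.randn(T, dim)
camera = torch.randn(T, 6)   # e.g. per-frame pose parameters
print(camera_conditioned_rope(x, torch.arange(T), camera, dim).shape)  # [8, 16]
```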
✨VQ-VA World: Towards High-Quality Visual Question-Visual Answering
📝 Summary:
VQ-VA World introduces a data-centric framework and benchmark for Visual Question-Visual Answering, generating images from visual questions. This significantly improves open-source models, narrowing the performance gap with proprietary systems.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20573
• PDF: https://arxiv.org/pdf/2511.20573
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VQA #GenerativeAI #DataCentricAI #ComputerVision #MachineLearning
✨Soft Adaptive Policy Optimization
📝 Summary:
SAPO improves RL training stability for LLMs. It uses a smooth adaptive gate to attenuate off-policy updates, unlike hard clipping. This selectively down-weights problematic tokens, leading to improved training stability and higher performance.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20347
• PDF: https://arxiv.org/pdf/2511.20347
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#ReinforcementLearning #LLMs #PolicyOptimization #DeepLearning #AI
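A small sketch contrasting PPO-style hard clipping of the importance ratio with a smooth gate that down-weights off-policy tokens, which is the qualitative idea in the summary. The sigmoid gate, temperature, and threshold below are illustrative assumptions, not SAPO's exact objective.

```python
# Hard clipping vs. a smooth adaptive gate on the per-token importance ratio.
import torch

def hard_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO surrogate: hard-clips the importance ratio."""
    clipped = ratio.clamp(1 - eps, 1 + eps)
    return torch.minimum(ratio * advantage, clipped * advantage)

def soft_gate_objective(ratio, advantage, eps=0.2, temp=0.05):
    """Smoothly down-weights tokens whose ratio drifts off-policy instead
    of cutting their gradient off at a hard boundary."""
    # Gate ~1 near ratio=1, decays smoothly as |ratio - 1| exceeds eps.
    gate = torch.sigmoid((eps - (ratio - 1.0).abs()) / temp)
    return gate * ratio * advantage

ratio = torch.linspace(0.5, 1.5, 5)
adv = torch.ones_like(ratio)
print(hard_clip_objective(ratio, adv))
print(soft_gate_objective(ratio, adv))
```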
✨GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface
📝 Summary:
GLiNER2 is an efficient, unified transformer framework supporting named entity recognition, text classification, and structured data extraction. It offers competitive performance and improved accessibility over LLMs, all in a CPU-efficient, compact model.
🔹 Publication Date: Published on Jul 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.18546
• PDF: https://arxiv.org/pdf/2507.18546
• Github: https://github.com/fastino-ai/GLiNER2
🔹 Models citing this paper:
• https://huggingface.co/fastino/gliner2-base-v1
• https://huggingface.co/fastino/gliner2-large-v1
✨ Spaces citing this paper:
• https://huggingface.co/spaces/fastino/gliner2-official-demo
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#InformationExtraction #NER #NLP #DeepLearning #AI
✨GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
📝 Summary:
GigaEvo is an open-source framework for LLM-guided evolutionary computation, providing modular tools for complex optimization. It enhances reproducibility of AlphaEvolve-inspired methods with detailed implementations, validated on challenging problems like Heilbronn triangle placement.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17592
• PDF: https://arxiv.org/pdf/2511.17592
• Project Page: https://airi-institute.github.io/gigaevo-cover/
• Github: https://github.com/FusionBrainLab/gigaevo-core
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#LLM #EvolutionaryAlgorithms #Optimization #OpenSource #AI
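A minimal sketch of an LLM-guided evolutionary loop in the spirit of AlphaEvolve-style systems: an LLM proposes mutated candidates, a fitness function scores them, and the best survive. The `llm_mutate` callable is a hypothetical stand-in for an LLM call and is not GigaEvo's actual API.

```python
# Toy LLM-guided evolutionary loop (illustrative; not GigaEvo's interface).
import random
from typing import Callable, List, Tuple

def evolve(seed_programs: List[str],
           fitness: Callable[[str], float],
           llm_mutate: Callable[[str], str],
           generations: int = 10,
           population_size: int = 8) -> Tuple[str, float]:
    population = [(p, fitness(p)) for p in seed_programs]
    for _ in range(generations):
        # Tournament-select a parent, ask the "LLM" for a mutated candidate,
        # then keep only the best individuals.
        parent = max(random.sample(population, k=min(3, len(population))),
                     key=lambda pf: pf[1])[0]
        child = llm_mutate(parent)
        population.append((child, fitness(child)))
        population.sort(key=lambda pf: pf[1], reverse=True)
        population = population[:population_size]
    return population[0]

# Toy usage: "programs" are strings, fitness counts a target character,
# and a random edit stands in for an LLM mutation.
if __name__ == "__main__":
    fit = lambda s: float(s.count("x"))
    mutate = lambda s: s + random.choice("xyz")
    best, score = evolve(["x"], fit, mutate, generations=20)
    print(best, score)
```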
✨MajutsuCity: Language-driven Aesthetic-adaptive City Generation with Controllable 3D Assets and Layouts
📝 Summary:
MajutsuCity is a language-driven framework for generating 3D urban scenes, offering high structural consistency, stylistic diversity, and controllability. It uses a four-stage pipeline and an interactive editing agent, significantly outperforming existing methods.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20415
• PDF: https://arxiv.org/pdf/2511.20415
• Project Page: https://longhz140516.github.io/MajutsuCity/
• Github: https://github.com/LongHZ140516/MajutsuCity
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#GenerativeAI #3DModeling #CityGeneration #ComputerGraphics #DeepLearning
✨DiffSeg30k: A Multi-Turn Diffusion Editing Benchmark for Localized AIGC Detection
📝 Summary:
DiffSeg30k is a 30k-image dataset with pixel-level annotations for localized AI-generated content detection. It moves AIGC detection to semantic segmentation, enabling fine-grained edit localization. Segmentation models prove to be strong whole-image classifiers of diffusion edits, showing cross-genera...
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19111
• PDF: https://arxiv.org/pdf/2511.19111
• Project Page: https://huggingface.co/datasets/Chaos2629/Diffseg30k
✨ Datasets citing this paper:
• https://huggingface.co/datasets/Chaos2629/Diffseg30k
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#AIGCDetection #SemanticSegmentation #DiffusionModels #ComputerVision #MachineLearning
✨OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation
📝 Summary:
OmniAlpha is the first unified multi-task generative framework for RGBA image generation and editing. It uses a Diffusion Transformer with a novel MSRoPE-BiL method and a new AlphaLayers dataset. OmniAlpha consistently outperforms specialized models across 21 tasks, achieving superior results in ...
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20211
• PDF: https://arxiv.org/pdf/2511.20211
• Github: https://github.com/Longin-Yu/OmniAlpha
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#GenerativeAI #DiffusionModels #ImageGeneration #ComputerVision #DeepLearning
✨Yo'City: Personalized and Boundless 3D Realistic City Scene Generation via Self-Critic Expansion
📝 Summary:
Yo'City is an agentic framework for personalized, infinitely expandable 3D city scene generation. It leverages large models with hierarchical planning, a self-critic image synthesis loop, and relationship-guided expansion for spatially coherent growth. Yo'City outperforms existing methods.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18734
• PDF: https://arxiv.org/pdf/2511.18734
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#3DGeneration #GenerativeAI #CityGeneration #ProceduralGeneration #ComputerGraphics
✨SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System
📝 Summary:
SciEducator is a self-evolving multi-agent system designed for scientific video understanding and education. It integrates professional knowledge and step-wise reasoning to interpret scientific activities and produce multimodal educational content. SciEducator significantly outperforms existing methods.
🔹 Publication Date: Published on Nov 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17943
• PDF: https://arxiv.org/pdf/2511.17943
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#MultiAgentSystems #AIEducation #VideoUnderstanding #EdTech #AIResearch
✨SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space
📝 Summary:
SSA is a new training framework for sparse attention in LLMs that aligns sparse and full attention outputs. It achieves state-of-the-art performance, stronger sparsity, and improves long-context extrapolation, allowing flexible compute-performance trade-offs.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20102
• PDF: https://arxiv.org/pdf/2511.20102
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#LLM #SparseAttention #DeepLearning #AIResearch #ModelEfficiency
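A sketch of one way to align sparse-attention outputs with full-attention outputs in feature space: an auxiliary MSE term that pulls the sparse path toward a stop-gradient full-attention teacher. This is an assumed formulation for illustration; SSA's exact alignment objective may differ.

```python
# Auxiliary feature-space alignment between sparse and full attention
# (assumed formulation; the task loss would be added elsewhere).
import torch
import torch.nn.functional as F

def attention(q, k, v, mask=None):
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    return scores.softmax(dim=-1) @ v

def ssa_alignment_loss(q, k, v, sparse_mask, alpha=1.0):
    """Pull the sparse path's outputs toward full-attention outputs."""
    with torch.no_grad():
        full_out = attention(q, k, v)             # teacher: full attention
    sparse_out = attention(q, k, v, sparse_mask)  # student: sparse attention
    return alpha * F.mse_loss(sparse_out, full_out)

B, T, D = 2, 16, 32
q, k, v = (torch.randn(B, T, D, requires_grad=True) for _ in range(3))
# Example sparse pattern: local window of width 4 (causality omitted here).
idx = torch.arange(T)
mask = (idx[None, :] - idx[:, None]).abs() < 4
print(ssa_alignment_loss(q, k, v, mask[None].expand(B, T, T)))
```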
✨Cognitive Foundations for Reasoning and Their Manifestation in LLMs
📝 Summary:
LLMs underutilize cognitive elements and meta-cognitive controls, leading to reasoning gaps. A new framework shows models fail to spontaneously deploy successful strategies. Test-time guidance significantly improves their performance on complex problems.
🔹 Publication Date: Published on Nov 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16660
• PDF: https://arxiv.org/pdf/2511.16660
• Project Page: https://tinyurl.com/cognitive-foundations
• Github: https://github.com/pkargupta/cognitive_foundations/
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#LLMs #CognitiveAI #Reasoning #ArtificialIntelligence #DeepLearning
✨Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking
📝 Summary:
LLM forecasting ability varies significantly across domains and question types. Their predictive performance depends heavily on context, external knowledge, and how questions are asked.
🔹 Publication Date: Published on Nov 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18394
• PDF: https://arxiv.org/pdf/2511.18394
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#LLM #Forecasting #AI #MachineLearning #DataScience
✨Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution
📝 Summary:
A new task, ORS3D, is introduced for embodied agents, requiring language understanding, 3D grounding, and efficient parallel task scheduling. The ORS3D-60K dataset and GRANT, an embodied LLM with a scheduling token mechanism, enable agents to minimize total completion time.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19430
• PDF: https://arxiv.org/pdf/2511.19430
• Project Page: https://h-embodvis.github.io/GRANT/
• Github: https://github.com/H-EmbodVis/GRANT
🔹 Models citing this paper:
• https://huggingface.co/H-EmbodVis/GRANT
✨ Datasets citing this paper:
• https://huggingface.co/datasets/H-EmbodVis/ORS3D-60K
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#EmbodiedAI #LLM #Robotics #TaskScheduling #AIResearch
✨MagicWorld: Interactive Geometry-driven Video World Exploration
📝 Summary:
MagicWorld improves interactive video world models by integrating 3D geometry for structural stability and historical retrieval to prevent error accumulation. This allows for continuous, consistent scene evolution driven by user actions from a single image.
🔹 Publication Date: Published on Nov 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18886
• PDF: https://arxiv.org/pdf/2511.18886
• Github: https://vivocameraresearch.github.io/magicworld/
🔹 Models citing this paper:
• https://huggingface.co/LuckyLiGY/MagicWorld
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#ComputerVision #3DGeometry #GenerativeAI #DeepLearning #VideoGeneration
✨Diverse Video Generation with Determinantal Point Process-Guided Policy Optimization
📝 Summary:
DPP-GRPO combines Determinantal Point Processes and Group Relative Policy Optimization to enhance text-to-video diversity. It explicitly rewards varied generations, improving overall diversity without sacrificing quality or prompt fidelity.
🔹 Publication Date: Published on Nov 25
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.20647
• PDF: https://arxiv.org/pdf/2511.20647
• Project Page: https://diverse-video.github.io/
• Github: https://diverse-video.github.io/
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#VideoGeneration #GenerativeAI #DeepLearning #ReinforcementLearning #ComputerVision
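A sketch of a DPP-style diversity score over a group of generations: the log-determinant of a similarity kernel grows as the group becomes more varied, so it can act as a group-level diversity signal. How DPP-GRPO folds such a score into the GRPO reward is simplified away here, and the RBF kernel choice is an assumption.

```python
# DPP-style diversity score: higher log-det => more diverse group.
import torch

def dpp_diversity(features: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """features: [N, D] embeddings of N generated videos.
    Returns log det of an RBF similarity kernel (higher = more diverse)."""
    f = torch.nn.functional.normalize(features, dim=-1)
    sq_dist = torch.cdist(f, f).pow(2)
    kernel = torch.exp(-sq_dist)                      # similar pairs -> ~1
    kernel = kernel + eps * torch.eye(f.size(0))      # numerical stability
    return torch.logdet(kernel)

# Near-duplicate generations score lower than well-spread ones.
dup = torch.randn(1, 64).repeat(4, 1) + 0.01 * torch.randn(4, 64)
spread = torch.randn(4, 64)
print(dpp_diversity(dup).item(), "<", dpp_diversity(spread).item())
```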