✨Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
📝 Summary:
EvoSynth is a new framework that autonomously engineers and evolves novel, code-based jailbreak methods for LLMs, moving beyond prompt refinement. It uses self-correction to create diverse and highly successful attacks, achieving an 85.5% attack success rate (ASR) against robust models.
🔹 Publication Date: Published on Nov 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12710
• PDF: https://arxiv.org/pdf/2511.12710
==================================
For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT
#LLMs #JailbreakAttacks #AISecurity #EvolutionaryAlgorithms #AIResearch
✨Dynamic Reflections: Probing Video Representations with Text Alignment
📝 Summary:
This work presents the first comprehensive study on video-text representation alignment. It reveals alignment depends on data richness and correlates with downstream task performance, suggesting its value for general video understanding. This introduces video-text alignment as a zero-shot method ...
🔹 Publication Date: Published on Nov 4
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.02767
• PDF: https://arxiv.org/pdf/2511.02767
• Project Page: https://video-prh.github.io/
#VideoUnderstanding #TextAlignment #VideoTextAI #ZeroShotLearning #RepresentationLearning
✨Back to Basics: Let Denoising Generative Models Denoise
📝 Summary:
Denoising diffusion models should predict clean images directly, not noise, leveraging the data manifold assumption. The paper introduces JiT, a model using simple, large-patch Transformers that achieves competitive generative results on ImageNet.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13720
• PDF: https://arxiv.org/pdf/2511.13720
• Github: https://github.com/LTH14/JiT
#DiffusionModels #GenerativeAI #DeepLearning #ComputerVision #AIResearch
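The difference between predicting noise and predicting the clean image, mentioned in the JiT summary above, can be illustrated with a minimal sketch. It assumes a generic forward process of the form x_t = a * x0 + s * eps; the coefficients `a` and `s` and all function names are hypothetical and are not taken from the JiT code base. Under that assumption the two targets are algebraically interchangeable, so the choice is about what the network is asked to output, not about what information is available.

```python
import random

def add_noise(x0, eps, a, s):
    """Forward process under the assumed form x_t = a * x0 + s * eps."""
    return [a * xi + s * ei for xi, ei in zip(x0, eps)]

def x0_from_eps(x_t, eps_hat, a, s):
    """Turn a noise prediction into a clean-image estimate."""
    return [(xt - s * e) / a for xt, e in zip(x_t, eps_hat)]

def eps_from_x0(x_t, x0_hat, a, s):
    """Turn a clean-image prediction into a noise estimate."""
    return [(xt - a * x) / s for xt, x in zip(x_t, x0_hat)]

rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(8)]   # toy "clean image"
eps = [rng.gauss(0.0, 1.0) for _ in range(8)]     # Gaussian noise
a, s = 0.6, 0.8                                   # hypothetical schedule values

x_t = add_noise(x0, eps, a, s)
x0_rec = x0_from_eps(x_t, eps, a, s)    # a perfect noise prediction gives back x0
eps_rec = eps_from_x0(x_t, x0, a, s)    # a perfect clean prediction gives back eps
```

With perfect predictions both routes recover the same quantities; the summary's claim concerns which target is easier for a network to learn, which this toy example does not test.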
✨Genomic Next-Token Predictors are In-Context Learners
📝 Summary:
In-context learning (ICL) emerges organically in genomic sequences through large-scale predictive training, mirroring its behavior in language models. This first evidence suggests ICL is a general phenomenon of large-scale modeling, not exclusive to human language.
🔹 Publication Date: Published on Nov 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12797
• PDF: https://arxiv.org/pdf/2511.12797
#Genomics #InContextLearning #AI #MachineLearning #LLMs
✨A Decentralized Retrieval Augmented Generation System with Source Reliabilities Secured on Blockchain
📝 Summary:
This paper proposes a decentralized RAG system using a blockchain-based mechanism to score data source reliability. It dynamically evaluates sources, boosting performance by 10.7% compared to centralized systems and achieving 56% cost savings in unreliable environments.
🔹 Publication Date: Published on Nov 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.07577
• PDF: https://arxiv.org/pdf/2511.07577
• Github: https://github.com/yining610/Reliable-dRAG
#RAG #Blockchain #DecentralizedAI #GenerativeAI #AIResearch
✨UFO^3: Weaving the Digital Agent Galaxy
📝 Summary:
UFO^3 unifies diverse digital devices into a single orchestration fabric, enabling AI agents to collaborate seamlessly across platforms. It models tasks dynamically for asynchronous execution, achieving efficient, resilient, and accurate cross-device task orchestration with improved parallelism a...
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11332
• PDF: https://arxiv.org/pdf/2511.11332
• Project Page: https://microsoft.github.io/UFO/
• Github: https://github.com/microsoft/UFO/
#AIAgents #TaskOrchestration #DistributedSystems #EdgeAI #MultiAgentSystems
✨UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity
📝 Summary:
UnSAMv2 enables continuous segmentation granularity control for the SAM model without human annotations. It uses self-supervised learning on unlabeled data to discover mask-granularity pairs and a novel control embedding. UnSAMv2 significantly enhances SAM-2's performance across various segmentati...
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13714
• PDF: https://arxiv.org/pdf/2511.13714
• Project Page: https://yujunwei04.github.io/UnSAMv2-Project-Page/
• Github: https://github.com/yujunwei04/UnSAMv2
✨ Spaces citing this paper:
• https://huggingface.co/spaces/yujunwei04/UnSAMv2
#AI #ComputerVision #SelfSupervisedLearning #ImageSegmentation #DeepLearning
✨OpenUS: A Fully Open-Source Foundation Model for Ultrasound Image Analysis via Self-Adaptive Masked Contrastive Learning
📝 Summary:
OpenUS is an open-source ultrasound foundation model built on a large public dataset. It uses a vision Mamba backbone and a novel self-adaptive masking framework to enhance pre-training, enabling label-efficient fine-tuning for various ultrasound (US) tasks.
🔹 Publication Date: Published on Nov 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11510
• PDF: https://arxiv.org/pdf/2511.11510
• Github: https://github.com/XZheng0427/OpenUS
#OpenSource #FoundationModel #UltrasoundAI #MachineLearning #MedicalImaging
✨Assessing LLMs for Serendipity Discovery in Knowledge Graphs: A Case for Drug Repurposing
📝 Summary:
SerenQA evaluates LLMs for discovering surprising, valuable serendipitous answers in scientific knowledge graphs, focusing on drug repurposing. It uses a new serendipity metric. Experiments show LLMs struggle to produce genuinely surprising insights.
🔹 Publication Date: Published on Nov 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12472
• PDF: https://arxiv.org/pdf/2511.12472
• Project Page: https://cwru-db-group.github.io/serenQA
• Github: https://github.com/CWRU-DB-Group/DrugKG
#LLM #KnowledgeGraphs #DrugRepurposing #AI #Serendipity
✨SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
📝 Summary:
SafeGRPO introduces a self-rewarded, rule-governed framework for multimodal safety alignment in MLLMs. It integrates verifiable reward construction and step-guided safety thinking to improve robustness against compositional risks and enhance reasoning stability.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12982
• PDF: https://arxiv.org/pdf/2511.12982
#MLLMs #AISafety #MultimodalAI #ReinforcementLearning #AIResearch
✨Error-Driven Scene Editing for 3D Grounding in Large Language Models
📝 Summary:
DEER-3D improves 3D LLM grounding by iteratively editing and retraining models. It diagnoses predicate-level errors, then generates targeted 3D scene edits as counterfactuals to enhance spatial understanding and accuracy.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14086
• PDF: https://arxiv.org/pdf/2511.14086
• Github: https://github.com/zhangyuejoslin/Deer-3D
#LLMs #3DGrounding #ComputerVision #DeepLearning #AIResearch
✨ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
📝 Summary:
ATLAS is a new, high-difficulty, multidisciplinary benchmark for LLMs, featuring 800 original problems across seven scientific fields. It addresses current benchmark limitations with complex, open-ended answers and aims to differentiate advanced scientific reasoning, serving as a ruler for AGI pr...
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14366
• PDF: https://arxiv.org/pdf/2511.14366
#LLM #AGI #AIResearch #ScientificReasoning #Benchmark
✨Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution
📝 Summary:
Orion is a visual agent framework that orchestrates specialized computer vision tools to execute complex visual workflows. It achieves competitive performance on benchmarks and enables autonomous, tool-driven visual reasoning.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14210
• PDF: https://arxiv.org/pdf/2511.14210
#ComputerVision #AIagents #VisualReasoning #MultimodalAI #DeepLearning
✨A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space
📝 Summary:
CoTyle introduces code-to-style image generation, creating consistent visual styles from numerical codes. It is the first open-source academic method for this task, using a discrete style codebook and a text-to-image diffusion model for diverse, reproducible styles.
🔹 Publication Date: Published on Nov 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10555
• PDF: https://arxiv.org/pdf/2511.10555
• Project Page: https://Kwai-Kolors.github.io/CoTyle/
• Github: https://github.com/Kwai-Kolors/CoTyle
✨ Spaces citing this paper:
• https://huggingface.co/spaces/Kwai-Kolors/CoTyle
#ImageGeneration #DiffusionModels #NeuralStyle #ComputerVision #DeepLearning
✨MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs
📝 Summary:
MVI-Bench introduces a new benchmark to evaluate the robustness of Large Vision-Language Models (LVLMs) against misleading visual inputs. It utilizes a hierarchical taxonomy and a novel metric to uncover significant vulnerabilities in state-of-the-art LVLMs.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14159
• PDF: https://arxiv.org/pdf/2511.14159
• Github: https://github.com/chenyil6/MVI-Bench
#LVLMs #ComputerVision #AIrobustness #MachineLearning #AI
✨REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding
📝 Summary:
Text-only self-reflection is insufficient for long-form video understanding. REVISOR is a new framework enabling MLLMs to perform multimodal introspective reflection across text and visual modalities. This significantly enhances reasoning for long videos without extra fine-tuning, achieving stron...
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13026
• PDF: https://arxiv.org/pdf/2511.13026
#MultimodalAI #VideoUnderstanding #MLLMs #AIResearch #ComputerVision
✨Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
📝 Summary:
This paper clarifies how RL applies to LLM agents by extending the MDP framework. It introduces Agent-R1, a modular and flexible training framework, and demonstrates its effectiveness on multi-hop QA tasks.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14460
• PDF: https://arxiv.org/pdf/2511.14460
• Github: https://github.com/0russwest0/Agent-R1
#LLMAgents #ReinforcementLearning #AI #DeepLearning #NLP
✨Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark
📝 Summary:
Current video model benchmarks fail to assess Chain-of-Frames (CoF) reasoning, which is crucial for world simulators. Gen-ViRe is a new benchmark that decomposes CoF reasoning into cognitive subtasks, offering the first quantitative assessment. It reveals poor reasoning depth despite impressive visual quali...
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13853
• PDF: https://arxiv.org/pdf/2511.13853
#AI #WorldSimulators #VisualReasoning #GenerativeAI #Benchmarks
✨Agent READMEs: An Empirical Study of Context Files for Agentic Coding
📝 Summary:
This study analyzed 2303 agent context files, finding them complex and evolving like config code. Developers prioritize functional details but rarely specify non-functional requirements like security or performance. This suggests a gap in guardrails for agent-written code quality.
🔹 Publication Date: Published on Nov 17
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12884
• PDF: https://arxiv.org/pdf/2511.12884
#AIAgents #SoftwareEngineering #CodeQuality #LLMs #AIResearch
✨UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
📝 Summary:
UniMoE-Audio unifies speech and music generation using a novel Dynamic-Capacity Mixture-of-Experts framework. It addresses data imbalance and task conflicts through a hybrid expert design and a three-stage training process, achieving state-of-the-art performance and synergistic cross-domain learning.
🔹 Publication Date: Published on Oct 15
🔹 Paper Links:
• arXiv Page: https://arxivexplained.com/papers/unimoe-audio-unified-speech-and-music-generation-with-dynamic-capacity-moe
• PDF: https://arxiv.org/pdf/2510.13344
• Project Page: https://mukioxun.github.io/Uni-MoE-site/home.html
• Github: https://github.com/HITsz-TMG/Uni-MoE/blob/master/UniMoE-Audio
🔹 Models citing this paper:
• https://huggingface.co/HIT-TMG/UniMoE-Audio-Preview
#SpeechGeneration #MusicGeneration #MixtureOfExperts #GenerativeAI #DeepLearning
✨OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
📝 Summary:
OmniZip is a training-free framework that addresses the computational bottleneck in omnimodal LLMs by dynamically compressing audio-visual tokens. It uses audio retention scores to guide video token pruning, achieving 3.42X inference speedup and 1.4X memory reduction without performance loss.
🔹 Publication Date: Published on Nov 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14582
• PDF: https://arxiv.org/pdf/2511.14582
• Github: https://github.com/KD-TAO/OmniZip
#OmnimodalLLM #TokenCompression #LLMs #AI #ModelEfficiency
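The OmniZip summary above says only that audio retention scores guide video token pruning. One way such a scheme could look is sketched below. It is an illustration under stated assumptions, not OmniZip's actual algorithm: the function name, the per-window layout, the saliency scores, and the rule "keep a fraction of video tokens equal to the audio retention value" are all hypothetical.

```python
import math

def prune_video_tokens(video_scores, audio_retention, min_keep=1):
    """Keep the top-scoring video tokens in each time window.

    video_scores:    list of windows, each a list of per-token saliency values
                     (hypothetical input).
    audio_retention: one value in [0, 1] per window; a higher value means the
                     window keeps a larger share of its video tokens
                     (an assumed rule, not taken from the paper).
    Returns the kept token indices for each window, in original order.
    """
    kept = []
    for scores, retention in zip(video_scores, audio_retention):
        # Budget for this window, clamped to [min_keep, number of tokens].
        budget = max(min_keep, math.ceil(retention * len(scores)))
        budget = min(budget, len(scores))
        # Indices of the highest-scoring tokens, restored to temporal order.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        kept.append(sorted(ranked[:budget]))
    return kept

windows = [[0.1, 0.9, 0.4, 0.7], [0.3, 0.2, 0.8, 0.5]]
print(prune_video_tokens(windows, [0.5, 0.25]))  # prints [[1, 3], [2]]
```

Because the budget is computed per window, windows with low audio retention shrink the most, which is where a training-free method of this kind would save compute.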