ML Research Hub – Telegram
ML Research Hub
32.7K subscribers
4.01K photos
229 videos
23 files
4.32K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization

📝 Summary:
SafeGRPO introduces a self-rewarded, rule-governed framework for multimodal safety alignment in MLLMs. It integrates verifiable reward construction and step-guided safety thinking to improve robustness against compositional risks and enhance reasoning stability.
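The "verifiable reward construction" idea can be sketched as a tiny rule-based scorer: each rule is a check that can be verified mechanically, and the checks compose into a scalar reward. Everything below (rule phrases, tag names, weights) is an illustrative assumption, not the paper's actual rule set:

```python
def rule_reward(response: str, is_unsafe_query: bool) -> float:
    """Score a response with simple, mechanically verifiable safety rules."""
    score = 0.0
    refused = any(p in response.lower() for p in ("i can't help", "i cannot assist"))
    if is_unsafe_query:
        score += 1.0 if refused else -1.0      # must refuse unsafe requests
    else:
        score += 1.0 if not refused else -0.5  # must not over-refuse benign ones
    if "<think>" in response and "</think>" in response:
        score += 0.5                           # reward explicit safety thinking
    return score
```

A GRPO-style trainer would then use such scores to rank sampled responses within a group, with no learned reward model in the loop.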

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12982
• PDF: https://arxiv.org/pdf/2511.12982

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MLLMs #AISafety #MultimodalAI #ReinforcementLearning #AIResearch
Error-Driven Scene Editing for 3D Grounding in Large Language Models

📝 Summary:
DEER-3D improves 3D LLM grounding by iteratively editing and retraining models. It diagnoses predicate-level errors, then generates targeted 3D scene edits as counterfactuals to enhance spatial understanding and accuracy.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14086
• PDF: https://arxiv.org/pdf/2511.14086
• Github: https://github.com/zhangyuejoslin/Deer-3D

==================================

#LLMs #3DGrounding #ComputerVision #DeepLearning #AIResearch
ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

📝 Summary:
ATLAS is a new, high-difficulty, multidisciplinary benchmark for LLMs, featuring 800 original problems across seven scientific fields. It addresses current benchmark limitations with complex, open-ended answers and aims to differentiate advanced scientific reasoning, serving as a ruler for AGI progress.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14366
• PDF: https://arxiv.org/pdf/2511.14366

==================================

#LLM #AGI #AIResearch #ScientificReasoning #Benchmark
Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution

📝 Summary:
Orion is a visual agent framework that orchestrates specialized computer vision tools to execute complex visual workflows. It achieves competitive performance on benchmarks and enables autonomous, tool-driven visual reasoning.
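Tool orchestration of this kind can be pictured as a planner dispatching steps to a registry of specialized vision tools. The registry, tool names, and workflow format below are hypothetical stand-ins, not Orion's actual API:

```python
# Hypothetical tool registry; real systems would wrap detectors, OCR, etc.
TOOLS = {
    "detect": lambda img: f"boxes({img})",
    "ocr":    lambda img: f"text({img})",
}

def run_workflow(image: str, plan: list[str]) -> list[str]:
    """Execute a planned sequence of vision tools on one image."""
    results = []
    for step in plan:
        results.append(TOOLS[step](image))  # dispatch each step to its tool
    return results
```

The agent's job is then to produce a good `plan` for each query and to reason over the accumulated `results`.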

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14210
• PDF: https://arxiv.org/pdf/2511.14210

==================================

#ComputerVision #AIagents #VisualReasoning #MultimodalAI #DeepLearning
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

📝 Summary:
CoTyle introduces code-to-style image generation, creating consistent visual styles from numerical codes. It is the first open-source academic method for this task, using a discrete style codebook and a text-to-image diffusion model for diverse, reproducible styles.
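A discrete style codebook can be sketched as a fixed, seeded lookup table, so the same numerical code always reproduces the same style vector. The codebook size, embedding dimension, and seeding below are illustrative assumptions, not CoTyle's actual design:

```python
import numpy as np

def style_embedding(style_code: int, codebook_size: int = 1024, dim: int = 8):
    """Map a numerical style code to a reproducible style vector."""
    rng = np.random.default_rng(0)                    # shared, fixed codebook
    codebook = rng.standard_normal((codebook_size, dim))
    return codebook[style_code % codebook_size]       # deterministic lookup
```

The retrieved vector would condition a text-to-image diffusion model, which is what makes a style reproducible from a single integer.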

🔹 Publication Date: Published on Nov 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.10555
• PDF: https://arxiv.org/pdf/2511.10555
• Project Page: https://Kwai-Kolors.github.io/CoTyle/
• Github: https://github.com/Kwai-Kolors/CoTyle

Spaces citing this paper:
https://huggingface.co/spaces/Kwai-Kolors/CoTyle

==================================

#ImageGeneration #DiffusionModels #NeuralStyle #ComputerVision #DeepLearning
MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs

📝 Summary:
MVI-Bench introduces a new benchmark to evaluate the robustness of Large Vision-Language Models against misleading visual inputs. It uses a hierarchical taxonomy and a novel metric to uncover significant vulnerabilities in state-of-the-art LVLMs.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14159
• PDF: https://arxiv.org/pdf/2511.14159
• Github: https://github.com/chenyil6/MVI-Bench

==================================

#LVLMs #ComputerVision #AIrobustness #MachineLearning #AI
REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

📝 Summary:
Text-only self-reflection is insufficient for long-form video understanding. REVISOR is a new framework enabling MLLMs to perform multimodal introspective reflection across text and visual modalities. This significantly enhances reasoning for long videos without extra fine-tuning, achieving strong results.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13026
• PDF: https://arxiv.org/pdf/2511.13026

==================================

#MultimodalAI #VideoUnderstanding #MLLMs #AIResearch #ComputerVision
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

📝 Summary:
This paper clarifies RL for LLM Agents by extending the MDP framework. It introduces Agent-R1, a modular and flexible training framework, demonstrating its effectiveness on Multihop QA tasks.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14460
• PDF: https://arxiv.org/pdf/2511.14460
• Github: https://github.com/0russwest0/Agent-R1

==================================

#LLMAgents #ReinforcementLearning #AI #DeepLearning #NLP
Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark

📝 Summary:
Current video model benchmarks fail to assess Chain-of-Frames (CoF) reasoning, which is crucial for world simulators. Gen-ViRe is a new benchmark that decomposes CoF reasoning into cognitive subtasks, offering the first quantitative assessment. It reveals poor reasoning depth despite impressive visual quality.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13853
• PDF: https://arxiv.org/pdf/2511.13853

==================================

#AI #WorldSimulators #VisualReasoning #GenerativeAI #Benchmarks
Agent READMEs: An Empirical Study of Context Files for Agentic Coding

📝 Summary:
This study analyzed 2303 agent context files, finding them complex and evolving like config code. Developers prioritize functional details but rarely specify non-functional requirements like security or performance. This suggests a gap in guardrails for agent-written code quality.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.12884
• PDF: https://arxiv.org/pdf/2511.12884

==================================

#AIAgents #SoftwareEngineering #CodeQuality #LLMs #AIResearch
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE

📝 Summary:
UniMoE-Audio unifies speech and music generation using a novel Dynamic-Capacity Mixture-of-Experts framework. It addresses data imbalance and task conflicts through a hybrid expert design and a three-stage training strategy, achieving state-of-the-art performance and synergistic cross-domain learning.

🔹 Publication Date: Published on Oct 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.13344
• PDF: https://arxiv.org/pdf/2510.13344
• Project Page: https://mukioxun.github.io/Uni-MoE-site/home.html
• Github: https://github.com/HITsz-TMG/Uni-MoE/blob/master/UniMoE-Audio

🔹 Models citing this paper:
https://huggingface.co/HIT-TMG/UniMoE-Audio-Preview

==================================

#SpeechGeneration #MusicGeneration #MixtureOfExperts #GenerativeAI #DeepLearning
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models

📝 Summary:
OmniZip is a training-free framework that addresses the computational bottleneck in omnimodal LLMs by dynamically compressing audio-visual tokens. It uses audio retention scores to guide video token pruning, achieving 3.42X inference speedup and 1.4X memory reduction without performance loss.
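Audio-guided pruning of this kind can be sketched as keeping only the video tokens whose aligned audio retention score is highest. The scoring and keep ratio below are illustrative assumptions, not OmniZip's actual method:

```python
import numpy as np

def prune_video_tokens(video_tokens, audio_scores, keep_ratio=0.3):
    """Keep the video tokens with the top audio retention scores."""
    k = max(1, int(len(video_tokens) * keep_ratio))
    keep = np.argsort(audio_scores)[-k:]   # indices of the k highest scores
    keep = np.sort(keep)                   # restore temporal order
    return [video_tokens[i] for i in keep]
```

Because this is pure selection over precomputed scores, it needs no training, which matches the training-free framing of the paper.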

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14582
• PDF: https://arxiv.org/pdf/2511.14582
• Github: https://github.com/KD-TAO/OmniZip

==================================

#OmnimodalLLM #TokenCompression #LLMs #AI #ModelEfficiency
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models

📝 Summary:
Think-at-Hard (TaH) improves LLM reasoning by dynamically refining only hard tokens. It uses a neural decider to identify them and LoRA for focused refinement, boosting performance with minimal overhead.
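A minimal stand-in for the neural decider is an entropy threshold on the token's predictive distribution: high-entropy tokens are flagged as hard and sent for another latent iteration. Both the entropy criterion and the threshold value are assumptions for illustration, not the paper's learned decider:

```python
import math

def is_hard_token(probs, threshold=1.0):
    """Flag a token as 'hard' when its predictive entropy (nats) exceeds a threshold."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy > threshold
```

Confident predictions (mass on one token) fall well below the threshold, while near-uniform distributions exceed it and would trigger refinement.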

🔹 Publication Date: Published on Nov 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.08577
• PDF: https://arxiv.org/pdf/2511.08577
• Github: https://github.com/thu-nics/TaH

==================================

#LLM #AI #MachineLearning #NaturalLanguageProcessing #Reasoning
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

📝 Summary:
Uni-MoE introduces a sparse Multimodal Mixture of Experts LLM efficiently handling diverse data types. It uses modality-specific encoders and a progressive training strategy, reducing performance bias and improving collaboration across modalities.
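Sparse MoE routing of the kind Uni-MoE builds on typically activates only a few experts per token via top-k gating. A generic sketch (not the paper's exact router):

```python
import numpy as np

def topk_gate(logits, k=2):
    """Sparse MoE routing: softmax over the top-k expert logits, zero elsewhere."""
    topk = np.argsort(logits)[-k:]                   # k largest expert logits
    gates = np.zeros_like(logits, dtype=float)
    exp = np.exp(logits[topk] - logits[topk].max())  # numerically stable softmax
    gates[topk] = exp / exp.sum()
    return gates
```

Only the experts with nonzero gates run, which is what keeps compute sparse as the expert count grows.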

🔹 Publication Date: Published on May 18, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2405.11273
• PDF: https://arxiv.org/pdf/2405.11273
• Github: https://github.com/hitsz-tmg/umoe-scaling-unified-multimodal-llms

==================================

#MultimodalAI #LLMs #MixtureOfExperts #DeepLearning #AIResearch
AraLingBench: A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models

📝 Summary:
AraLingBench is a human-annotated benchmark evaluating Arabic LLM linguistic competence using expert-designed questions. It reveals models achieve surface proficiency but lack deep understanding, often relying on memorization rather than true comprehension.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14295
• PDF: https://arxiv.org/pdf/2511.14295

Datasets citing this paper:
https://huggingface.co/datasets/hammh0a/AraLingBench

==================================

#ArabicNLP #LLMEvaluation #AIResearch #LanguageModels #NLPBenchmarking
Mitigating Label Length Bias in Large Language Models

📝 Summary:
Large Language Models exhibit a label length bias with multi-token class labels. This paper introduces Normalized Contextual Calibration (NCC) to mitigate this issue by normalizing and calibrating predictions at the full-label level. NCC significantly improves performance and reliability across diverse tasks.
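Full-label normalization and calibration can be sketched as follows, assuming you have each candidate label's log-probability under the real prompt and under a content-free prompt; the exact NCC formulation in the paper may differ:

```python
def ncc_score(label_logprob, label_len, contextfree_logprob):
    """Length-normalize a multi-token label's log-prob (geometric mean over
    tokens), then subtract the same label's normalized log-prob under a
    content-free prompt to remove the model's label prior."""
    normalized = label_logprob / label_len
    prior = contextfree_logprob / label_len
    return normalized - prior

def classify(scores):
    """Pick the label with the highest calibrated score."""
    return max(scores, key=scores.get)
```

Without the length normalization, longer labels like "very negative" are systematically penalized simply for containing more tokens.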

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14385
• PDF: https://arxiv.org/pdf/2511.14385

==================================

#LLM #AI #NLP #BiasInAI #MachineLearning
Φeat: Physically-Grounded Feature Representation

📝 Summary:
Φeat is a new self-supervised visual backbone that captures material identity like reflectance and mesostructure. It learns robust features invariant to external physical factors such as shape and lighting, promoting physics-aware perception.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11270
• PDF: https://arxiv.org/pdf/2511.11270

==================================

#ComputerVision #SelfSupervisedLearning #DeepLearning #FeatureLearning #PhysicsAwareAI
Large Language Models Meet Extreme Multi-label Classification: Scaling and Multi-modal Framework

📝 Summary:
This paper improves Extreme Multi-label Classification (XMC) by using larger decoder-only models and introduces ViXML, a vision-enhanced framework. ViXML efficiently integrates visual information, significantly outperforming text-only models and achieving new state-of-the-art results.

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13189
• PDF: https://arxiv.org/pdf/2511.13189

==================================

#LLM #XMC #MultiModalAI #MachineLearning #AIResearch
A Brain Wave Encodes a Thousand Tokens: Modeling Inter-Cortical Neural Interactions for Effective EEG-based Emotion Recognition

📝 Summary:
RBTransformer, a Transformer-based model, improves EEG-based emotion recognition by modeling inter-cortical neural dynamics. It uses Band Differential Entropy tokens and multi-head attention. This approach significantly outperforms existing state-of-the-art methods across multiple datasets and dimensions.
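Band Differential Entropy has a standard closed form under a Gaussian assumption, DE = ½·ln(2πe·σ²), computed per band-filtered EEG segment. A minimal sketch of that formula (band filtering and tokenization details omitted):

```python
import math

def band_differential_entropy(signal):
    """Differential entropy of a band-filtered EEG segment, assuming the
    samples are approximately Gaussian: DE = 0.5 * ln(2*pi*e*var)."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    return 0.5 * math.log(2 * math.pi * math.e * var)
```

One such scalar per electrode and frequency band is what gets assembled into the model's input tokens.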

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.13954
• PDF: https://arxiv.org/pdf/2511.13954
• Github: https://github.com/nnilayy/RBTransformer

==================================

#EEG #EmotionRecognition #Transformers #Neuroscience #MachineLearning
Proactive Hearing Assistants that Isolate Egocentric Conversations

📝 Summary:
A proactive hearing assistant system automatically identifies and isolates the wearer's conversation partners from binaural audio. It uses a dual-model AI architecture that adapts to conversational dynamics in real time, improving speech clarity without user prompts.

🔹 Publication Date: Published on Nov 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.11473
• PDF: https://arxiv.org/pdf/2511.11473
• Project Page: https://proactivehearing.cs.washington.edu/
• Github: https://github.com/guilinhu/proactive_hearing_assistant

🔹 Models citing this paper:
https://huggingface.co/guilinhu/proactive_hearing

Datasets citing this paper:
https://huggingface.co/datasets/guilinhu/libri_conversation

==================================

#HearingTech #AI #SpeechEnhancement #AssistiveTechnology #AudioProcessing
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards

📝 Summary:
NORA-1.5, an enhanced vision-language-action model with a flow-matching-based action expert and reward-driven post-training, improves performance and reliability in both simulated and real-world settings.

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14659
• PDF: https://arxiv.org/pdf/2511.14659
• Project Page: https://declare-lab.github.io/nora-1.5
• Github: https://github.com/declare-lab/nora-1.5

🔹 Models citing this paper:
https://huggingface.co/declare-lab/nora-1.5

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research