ML Research Hub
32.7K subscribers
4.01K photos
229 videos
23 files
4.32K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models

📝 Summary:
Downscaling multimodal models disproportionately harms visual capabilities, perception included, relative to LLM abilities. This paper introduces visual extraction tuning combined with step-by-step reasoning to improve smaller models' efficiency and performance.
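
As a rough illustration of the two-stage idea (extract visual evidence first, then reason over it), here is a minimal sketch; `vlm_generate`, both prompts, and the overall wiring are hypothetical stand-ins, not the paper's method or API:

```python
# Hypothetical two-stage inference: extract visual evidence, then reason over it.
# `vlm_generate` stands in for any small multimodal model's generation call.

def vlm_generate(image, prompt: str) -> str:
    """Placeholder for a small VLM call; swap in a real model here."""
    return f"[model output for: {prompt[:40]}...]"

def answer_with_extraction(image, question: str) -> str:
    # Stage 1: have the model verbalize the relevant visual evidence.
    extraction_prompt = (
        "List the objects, attributes, and spatial relations in the image "
        "that are relevant to this question: " + question
    )
    visual_facts = vlm_generate(image, extraction_prompt)

    # Stage 2: reason step by step over the extracted facts.
    reasoning_prompt = (
        f"Visual facts:\n{visual_facts}\n\n"
        f"Question: {question}\nThink step by step, then answer."
    )
    return vlm_generate(image, reasoning_prompt)

print(answer_with_extraction(image=None, question="How many red cups are on the table?"))
```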

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17487
• PDF: https://arxiv.org/pdf/2511.17487
• Project Page: https://web.stanford.edu/~markendo/projects/downscaling_intelligence
• Github: https://github.com/markendo/downscaling_intelligence

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MultimodalAI #SmallModels #ComputerVision #EfficientAI #AIResearch
Diversity Has Always Been There in Your Visual Autoregressive Models

📝 Summary:
To combat diversity collapse in Visual Autoregressive models, DiverseVAR modifies feature maps without retraining. This restores generative diversity while maintaining high synthesis quality.
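
The paper's key point is that diversity can be recovered by editing feature maps at inference time, with no retraining. The sketch below only illustrates that general mechanism with a PyTorch forward hook that perturbs intermediate features; the hook and noise model are assumptions, not DiverseVAR's actual transform:

```python
import torch
import torch.nn as nn

# Toy stand-in for one block of a visual autoregressive model.
block = nn.Linear(16, 16)

def diversify_hook(module, inputs, output, noise_scale=0.1):
    # Perturb the feature map at inference time; returning a tensor from a
    # forward hook replaces the block's output, so no weights change.
    return output + noise_scale * torch.randn_like(output)

handle = block.register_forward_hook(diversify_hook)

x = torch.randn(4, 16)               # a small batch of token features
with torch.no_grad():
    features = block(x)              # features are modified without retraining
print(features.std(dim=0).mean())    # spread across the batch, a crude diversity proxy
handle.remove()
```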

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17074
• PDF: https://arxiv.org/pdf/2511.17074

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VisualAI #GenerativeModels #ModelDiversity #MachineLearning #ComputerVision
Loomis Painter: Reconstructing the Painting Process

📝 Summary:
This paper proposes a unified diffusion model framework for generating consistent, high-fidelity multi-media painting processes. It uses semantic control and cross-medium style augmentation to replicate human artistic workflows, supported by a new dataset and evaluation metrics.

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17344
• PDF: https://arxiv.org/pdf/2511.17344
• Project Page: https://markus-pobitzer.github.io/lplp/
• Github: https://github.com/Markus-Pobitzer/wlp

🔹 Models citing this paper:
https://huggingface.co/Markus-Pobitzer/wlp-lora

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#DiffusionModels #GenerativeAI #AIArt #ComputerGraphics #MachineLearning
MergeDNA: Context-aware Genome Modeling with Dynamic Tokenization through Token Merging

📝 Summary:
MergeDNA models genomic sequences with a hierarchical architecture and dynamic Token Merging to adaptively chunk bases. This addresses varying information density and lack of a fixed vocabulary, achieving superior performance on DNA benchmarks and multi-omics tasks.
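
A toy sketch of dynamic token merging in the spirit described here: adjacent base-token embeddings are greedily pooled when they are similar enough, so information-dense regions keep more tokens than repetitive ones. The threshold, greedy rule, and averaging are illustrative assumptions, not MergeDNA's learned merging:

```python
import torch
import torch.nn.functional as F

def merge_adjacent_tokens(embeddings: torch.Tensor, threshold: float = 0.5):
    """Greedily merge neighboring token embeddings whose cosine similarity
    exceeds `threshold`, averaging them into the running chunk.
    embeddings: (seq_len, dim). Returns a shorter (new_len, dim) tensor."""
    merged = [embeddings[0]]
    for vec in embeddings[1:]:
        sim = F.cosine_similarity(merged[-1].unsqueeze(0), vec.unsqueeze(0)).item()
        if sim > threshold:
            merged[-1] = (merged[-1] + vec) / 2  # fold into the running chunk
        else:
            merged.append(vec)                   # start a new chunk
    return torch.stack(merged)

base_embeddings = torch.randn(128, 32)   # e.g. per-base embeddings of a DNA window
chunks = merge_adjacent_tokens(base_embeddings)
print(base_embeddings.shape, "->", chunks.shape)
```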

🔹 Publication Date: Published on Nov 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14806
• PDF: https://arxiv.org/pdf/2511.14806

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#Genomics #Bioinformatics #MachineLearning #DNA #MultiOmics
Insights from the ICLR Peer Review and Rebuttal Process

📝 Summary:
Peer reviews and rebuttals from ICLR 2024-2025 were analyzed with LLMs to understand score changes. Initial scores and co-reviewer ratings strongly predict score changes, and rebuttals aid borderline papers. These insights aim to improve the review process.
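
To make the predictive claim concrete, here is a minimal least-squares sketch that regresses score change on a reviewer's initial score and the co-reviewer mean; the feature set mirrors the summary, but the arrays are synthetic placeholders, not ICLR data:

```python
import numpy as np

def fit_score_change_model(initial, co_reviewer_mean, score_change):
    """Least squares: score_change ~ b0 + b1*initial + b2*co_reviewer_mean."""
    X = np.column_stack([np.ones_like(initial), initial, co_reviewer_mean])
    coeffs, *_ = np.linalg.lstsq(X, score_change, rcond=None)
    return coeffs  # [intercept, weight of own initial score, weight of co-reviewer mean]

# Synthetic placeholder arrays (one entry per review); not ICLR data.
rng = np.random.default_rng(0)
initial = rng.uniform(3, 8, size=200)
co_mean = rng.uniform(3, 8, size=200)
change = 0.4 * (co_mean - initial) + rng.normal(0.0, 0.3, size=200)  # toy relationship

print(fit_score_change_model(initial, co_mean, change))
```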

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15462
• PDF: https://arxiv.org/pdf/2511.15462
• Project Page: https://github.com/papercopilot/iclr-insights

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#PeerReview #LLM #ICLR #AcademicResearch #MachineLearning
InstructMix2Mix: Consistent Sparse-View Editing Through Multi-View Model Personalization

📝 Summary:
InstructMix2Mix (I-Mix2Mix) improves multi-view image editing from sparse inputs, a setting where edits often lack consistency across views. It distills a 2D diffusion model into a multi-view diffusion model, leveraging the latter's 3D prior for cross-view coherence. This framework significantly enhances multi-view consistency and per-...
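
A heavily simplified sketch of the distillation idea: a multi-view "student" is trained to match a frozen 2D editing "teacher" on each view. The toy conv networks and plain MSE objective are assumptions; the paper's actual models, its use of the multi-view 3D prior, and its loss are more involved:

```python
import torch
import torch.nn as nn

# Toy stand-ins: a frozen 2D editing "teacher" and a multi-view "student".
teacher = nn.Conv2d(3, 3, kernel_size=3, padding=1).eval()
student = nn.Conv2d(3, 3, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

views = torch.rand(4, 3, 64, 64)          # four sparse views of one scene
with torch.no_grad():
    teacher_edits = teacher(views)         # per-view 2D edits (can disagree across views)

student_edits = student(views)             # student edits all views jointly (here: batched)
loss = nn.functional.mse_loss(student_edits, teacher_edits)  # distillation objective
loss.backward()
optimizer.step()
print(float(loss))
```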

🔹 Publication Date: Published on Nov 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.14899
• PDF: https://arxiv.org/pdf/2511.14899
• Project Page: https://danielgilo.github.io/instruct-mix2mix/
• Github: https://danielgilo.github.io/instruct-mix2mix/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MultiViewEditing #DiffusionModels #ComputerVision #3DVision #ImageSynthesis
Computer-Use Agents as Judges for Generative User Interface

📝 Summary:
This paper introduces a framework where Computer-Use Agents (CUAs) act as judges for coding language models (Coders) that automatically design GUIs. The goal is to optimize interfaces for CUA efficiency and task solvability, rather than human aesthetics, using a new benchmark called AUI-Gym.
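
The core loop can be pictured as: a Coder proposes a GUI, a computer-use agent tries tasks on it, and the agent's success rate is the judgment signal. The sketch below uses stub functions for both models; names and the success criterion are illustrative only:

```python
from typing import Callable, List

def score_gui_by_agent(gui_code: str,
                       tasks: List[str],
                       run_agent: Callable[[str, str], bool]) -> float:
    """Score a generated GUI by how many tasks a computer-use agent can
    complete on it, rather than by human aesthetic judgment."""
    successes = sum(run_agent(gui_code, task) for task in tasks)
    return successes / max(len(tasks), 1)

# Stubs standing in for a real Coder model and a real computer-use agent.
def coder_generate(spec: str) -> str:
    return f"<html><!-- GUI generated for: {spec} --></html>"

def dummy_agent(gui_code: str, task: str) -> bool:
    return "GUI" in gui_code  # placeholder success criterion

gui = coder_generate("a form for booking a meeting room")
print(score_gui_by_agent(gui, ["open form", "submit booking"], dummy_agent))
```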

🔹 Publication Date: Published on Nov 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.15567
• PDF: https://arxiv.org/pdf/2511.15567
• Project Page: https://showlab.github.io/AUI/
• Github: https://github.com/showlab/AUI/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AIAgents #GUIDesign #GenerativeAI #AIevaluation #LanguageModels
M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark

📝 Summary:
M3-Bench is a new benchmark evaluating multimodal LLM agent tool use in complex, multi-hop workflows requiring visual grounding and tool dependencies. It introduces a similarity-driven alignment method and interpretable metrics. Evaluations show significant gaps in current MLLMs, especially in ar...
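
A rough sketch of what a similarity-driven alignment score between a predicted and a reference tool call could look like; the character-frequency embedding is a deliberately crude placeholder for a real text encoder, and the benchmark's actual metric is not specified here:

```python
import numpy as np

def align_score(pred_call: dict, ref_call: dict, embed) -> float:
    """Similarity-driven alignment: compare predicted and reference tool calls
    by cosine similarity of their serialized forms instead of exact match."""
    a, b = embed(str(pred_call)), embed(str(ref_call))
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder embedding: character-frequency vectors (a real setup would use a text encoder).
def char_embed(s: str) -> np.ndarray:
    v = np.zeros(128)
    for ch in s.lower():
        v[ord(ch) % 128] += 1
    return v

pred = {"tool": "image_crop", "args": {"box": [0, 0, 100, 100]}}
ref = {"tool": "image_crop", "args": {"box": [0, 0, 120, 100]}}
print(align_score(pred, ref, char_embed))
```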

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17729
• PDF: https://arxiv.org/pdf/2511.17729
• Github: https://github.com/EtaYang10th/Open-M3-Bench

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MLLM #LLMAgents #AI #Benchmarking #ToolUse
General Agentic Memory Via Deep Research

📝 Summary:
GAM is a novel framework for AI memory addressing information loss in static systems. It applies just-in-time (JIT) principles, pairing a memorizer with a researcher to create optimized contexts at runtime. This improves memory efficiency and task completion, leveraging LLMs and reinforcement learning.
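
A minimal sketch of the memorizer/researcher split under JIT principles: raw events are stored cheaply, and a task-specific context is assembled only when needed. The keyword-overlap retrieval stands in for the LLM- and RL-driven components; class and method names are assumptions:

```python
class Memorizer:
    """Keeps lightweight records of past interactions instead of a fixed summary."""
    def __init__(self):
        self.records = []

    def store(self, event: str):
        self.records.append(event)

class Researcher:
    """Builds an optimized context for the current task at runtime (JIT);
    trivial keyword overlap stands in for LLM-driven deep research."""
    def build_context(self, memory: Memorizer, task: str, k: int = 3) -> str:
        task_words = set(task.lower().split())
        scored = sorted(memory.records,
                        key=lambda r: len(set(r.lower().split()) & task_words),
                        reverse=True)
        return "\n".join(scored[:k])

memory = Memorizer()
for event in ["user prefers dark mode", "deploy failed on Friday", "dataset lives in s3://bucket"]:
    memory.store(event)
print(Researcher().build_context(memory, task="why did the deploy fail?"))
```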

🔹 Publication Date: Published on Nov 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18423
• PDF: https://arxiv.org/pdf/2511.18423
• Github: https://github.com/VectorSpaceLab/general-agentic-memory

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #LLMs #ReinforcementLearning #AIMemory #DeepLearning
In-Video Instructions: Visual Signals as Generative Control

📝 Summary:
This paper introduces In-Video Instruction for controllable image-to-video generation. It embeds visual signals such as text or arrows directly into frames as instructions, offering precise, spatially aware control over object actions. Experiments show video models reliably execute these visual cues.
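
Conceptually, the conditioning frame itself carries the instruction. The Pillow sketch below overlays an arrow and a short text cue on a frame; the colors, layout, and helper name are illustrative, and the real method's rendering choices may differ:

```python
from PIL import Image, ImageDraw

def add_in_frame_instruction(frame: Image.Image, start, end, text: str) -> Image.Image:
    """Overlay an arrow and a short text cue on a frame so the video model
    reads the instruction visually rather than from a separate prompt."""
    frame = frame.copy()
    draw = ImageDraw.Draw(frame)
    draw.line([start, end], fill=(255, 0, 0), width=4)                          # arrow shaft
    x, y = end
    draw.polygon([(x, y), (x - 12, y - 6), (x - 12, y + 6)], fill=(255, 0, 0))  # arrowhead
    draw.text((start[0], start[1] - 20), text, fill=(255, 0, 0))                # text cue
    return frame

frame = Image.new("RGB", (512, 512), "white")      # placeholder first frame
conditioned = add_in_frame_instruction(frame, start=(100, 300), end=(300, 300),
                                       text="move the cup here")
conditioned.save("conditioned_frame.png")
```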

🔹 Publication Date: Published on Nov 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19401
• PDF: https://arxiv.org/pdf/2511.19401
• Project Page: https://fangggf.github.io/In-Video/
• Github: https://fangggf.github.io/In-Video/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VideoGeneration #GenerativeAI #ComputerVision #AIResearch #DeepLearning
AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning

📝 Summary:
AutoEnv and AutoEnv-36 provide a standardized framework and dataset for measuring cross-environment agent learning. Their evaluations show that fixed learning methods do not scale across diverse environments, highlighting current limitations in agent generalization.

🔹 Publication Date: Published on Nov 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19304
• PDF: https://arxiv.org/pdf/2511.19304

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #MachineLearning #AgentLearning #Generalization #ReinforcementLearning
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

📝 Summary:
DeCo is a frequency-decoupled pixel diffusion framework that improves image generation by separating high-frequency details and low-frequency semantics. It uses a lightweight pixel decoder for details and a DiT for semantics, achieving superior efficiency and quality over existing pixel diffusion...
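
The frequency decoupling can be pictured as splitting an image into a low-frequency part (semantics, handled by the DiT) and a high-frequency residual (details, handled by the pixel decoder). The FFT low-pass split below is a generic stand-in, not DeCo's exact decomposition:

```python
import torch

def frequency_split(img: torch.Tensor, cutoff: float = 0.1):
    """Split a (C, H, W) image into low- and high-frequency parts via an FFT
    low-pass mask; a generic stand-in for frequency decoupling."""
    C, H, W = img.shape
    freq = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    yy, xx = torch.meshgrid(torch.linspace(-0.5, 0.5, H),
                            torch.linspace(-0.5, 0.5, W), indexing="ij")
    mask = ((yy ** 2 + xx ** 2).sqrt() < cutoff).float()       # keep only low frequencies
    low = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real
    high = img - low                                            # residual high-frequency detail
    return low, high   # low -> semantics branch (DiT), high -> pixel decoder branch

img = torch.rand(3, 64, 64)
low, high = frequency_split(img)
print(low.shape, high.shape, torch.allclose(low + high, img, atol=1e-5))
```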

🔹 Publication Date: Published on Nov 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19365
• PDF: https://arxiv.org/pdf/2511.19365
• Project Page: https://zehong-ma.github.io/DeCo/
• Github: https://github.com/Zehong-Ma/DeCo

🔹 Models citing this paper:
https://huggingface.co/zehongma/DeCo

🔹 Spaces citing this paper:
https://huggingface.co/spaces/zehongma/DeCo

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#ImageGeneration #DiffusionModels #ComputerVision #DeepLearning #DeCo
Budget-Aware Tool-Use Enables Effective Agent Scaling

📝 Summary:
Tool-augmented agents struggle to scale with more tool calls due to a lack of budget awareness. This paper introduces Budget Tracker for continuous budget awareness and BATS for adaptive planning, dynamically adjusting strategy based on remaining resources. These methods significantly improve cos...
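
A minimal sketch of the budget-awareness idea: track how much of the tool-call budget is left and let the planner change strategy accordingly. The class, thresholds, and strategy strings are illustrative assumptions, not the paper's Budget Tracker or BATS:

```python
class BudgetTracker:
    """Tracks remaining tool-call budget so the agent can plan around it."""
    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.used = 0

    def charge(self, n: int = 1):
        self.used += n

    @property
    def remaining_fraction(self) -> float:
        return max(0.0, 1.0 - self.used / self.max_calls)

def choose_strategy(tracker: BudgetTracker) -> str:
    # Adaptive planning in the spirit of budget-aware tool use: explore while
    # budget is plentiful, consolidate and answer as it runs out.
    if tracker.remaining_fraction > 0.5:
        return "explore: issue broad searches and parallel tool calls"
    if tracker.remaining_fraction > 0.2:
        return "narrow: verify only the most promising leads"
    return "wrap up: stop calling tools and synthesize an answer"

tracker = BudgetTracker(max_calls=10)
for step in range(10):
    print(step, choose_strategy(tracker))
    tracker.charge()
```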

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17006
• PDF: https://arxiv.org/pdf/2511.17006

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AIAgents #ToolUse #ResourceManagement #AgentScaling #AIResearch
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios

📝 Summary:
UltraFlux overcomes diffusion transformer failures at 4K resolution and diverse aspect ratios through data-model co-design. It uses enhanced positional encoding, VAE improvements, gradient rebalancing, and aesthetic curriculum learning to achieve superior 4K text-to-image generation, outperformin...

🔹 Publication Date: Published on Nov 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.18050
• PDF: https://arxiv.org/pdf/2511.18050
• Project Page: https://github.com/W2GenAI-Lab/UltraFlux
• Github: https://github.com/W2GenAI-Lab/UltraFlux

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#TextToImage #GenerativeAI #4KGeneration #DiffusionModels #AIResearch
Controllable Layer Decomposition for Reversible Multi-Layer Image Generation

📝 Summary:
Controllable Layer Decomposition (CLD) enables fine-grained, controllable separation of raster images into editable RGBA layers, overcoming traditional compositing limitations. Using LD-DiT and MLCA, CLD surpasses existing methods in quality and control. It produces layers directly usable in design...
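
One way to sanity-check decomposed RGBA layers is to re-composite them with the standard "over" operator and compare against the source image. The NumPy sketch below shows that check on toy layers; it is generic compositing, not CLD's model:

```python
import numpy as np

def composite_over(layers):
    """Composite float RGBA layers (H, W, 4), bottom to top, accumulating
    premultiplied color with the standard 'over' operator; values in [0, 1]."""
    h, w, _ = layers[0].shape
    out_rgb = np.zeros((h, w, 3))
    out_a = np.zeros((h, w, 1))
    for layer in layers:
        rgb, a = layer[..., :3], layer[..., 3:4]
        out_rgb = rgb * a + out_rgb * (1 - a)
        out_a = a + out_a * (1 - a)
    return out_rgb, out_a

# Two toy layers: an opaque grey background and a semi-transparent red square.
bg = np.concatenate([np.full((64, 64, 3), 0.8), np.ones((64, 64, 1))], axis=-1)
fg = np.zeros((64, 64, 4))
fg[16:48, 16:48] = [1.0, 0.0, 0.0, 0.5]
rgb, alpha = composite_over([bg, fg])
print(rgb.shape, float(alpha.min()), float(alpha.max()))
```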

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16249
• PDF: https://arxiv.org/pdf/2511.16249
• Github: https://github.com/monkek123King/CLD

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#ImageGeneration #DeepLearning #ComputerVision #ImageEditing #LayerDecomposition
PRInTS: Reward Modeling for Long-Horizon Information Seeking

📝 Summary:
PRInTS is a generative process reward model that improves AI agents' information-seeking. It provides dense scoring of step quality and summarizes long trajectories to manage context. PRInTS enhances agent performance, matching or surpassing frontier models with a smaller backbone.
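
A schematic sketch of dense step scoring with trajectory summarization: each step is scored against a compressed view of the preceding steps so long trajectories stay within context. The reward and summarizer here are trivial stubs, not the learned PRInTS model:

```python
from typing import Callable, List

def score_trajectory(steps: List[str],
                     step_reward: Callable[[str, str], float],
                     summarize: Callable[[List[str]], str],
                     window: int = 4) -> List[float]:
    """Score each information-seeking step against a running summary of the
    earlier trajectory, so long histories stay inside the model's context."""
    scores = []
    for i, step in enumerate(steps):
        context = summarize(steps[max(0, i - window):i])  # compressed history
        scores.append(step_reward(context, step))
    return scores

# Stubs standing in for a learned generative process reward model.
dummy_reward = lambda context, step: float(len(step) > 20)      # placeholder heuristic
dummy_summarize = lambda history: " | ".join(h[:30] for h in history)

steps = ["search: who founded the lab?",
         "open result and extract the founder's name and founding year",
         "cross-check the year against a second source"]
print(score_trajectory(steps, dummy_reward, dummy_summarize))
```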

🔹 Publication Date: Published on Nov 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.19314
• PDF: https://arxiv.org/pdf/2511.19314

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#RewardModeling #InformationSeeking #AIagents #GenerativeAI #MachineLearning
Plan-X: Instruct Video Generation via Semantic Planning

📝 Summary:
Plan-X improves instruction-aligned video generation by integrating a Semantic Planner with diffusion models. The planner generates semantic tokens that guide video synthesis, reducing visual hallucinations. This framework combines language models for reasoning with diffusion models for photoreal...

🔹 Publication Date: Published on Nov 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17986
• PDF: https://arxiv.org/pdf/2511.17986
• Project Page: https://byteaigc.github.io/Plan-X/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VideoGeneration #DiffusionModels #AI #ComputerVision #DeepLearning
Target-Bench: Can World Models Achieve Mapless Path Planning with Semantic Targets?

📝 Summary:
Target-Bench evaluates world models for mapless robot path planning to semantic targets in real-world environments. It reveals that off-the-shelf models perform poorly, but fine-tuning significantly improves their planning capability.
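
Benchmarks like this typically report episode-level navigation metrics. The sketch below computes success rate and an SPL-style score over toy episodes; the episode fields and whether Target-Bench uses these exact metrics are assumptions:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Episode:
    target: str           # semantic target, e.g. "kitchen sink"
    reached: bool         # did the planned path end at the target?
    path_length: float    # meters traveled
    shortest_path: float  # oracle shortest path in meters

def success_rate(episodes: List[Episode]) -> float:
    return sum(e.reached for e in episodes) / len(episodes)

def spl(episodes: List[Episode]) -> float:
    """Success weighted by Path Length, a common navigation metric."""
    return sum(e.reached * e.shortest_path / max(e.path_length, e.shortest_path)
               for e in episodes) / len(episodes)

episodes = [Episode("kitchen sink", True, 12.0, 10.0),
            Episode("office chair", False, 20.0, 8.0)]
print(success_rate(episodes), spl(episodes))
```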

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17792
• PDF: https://arxiv.org/pdf/2511.17792
• Project Page: https://target-bench.github.io/
• Github: https://github.com/TUM-AVS/target-bench

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#Robotics #PathPlanning #WorldModels #ArtificialIntelligence #MachineLearning