ML Research Hub

✨Scaling Zero-Shot Reference-to-Video Generation

📝 Summary:
Saber is a scalable zero-shot framework for reference-to-video generation that uses video-text pairs to learn identity-consistent representations and outperforms models trained with explicit reference...

🔹 Publication Date: Published on Dec 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06905
• PDF: https://arxiv.org/pdf/2512.06905
• Project Page: https://franciszzj.github.io/Saber/

==================================

For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

✨Rethinking Training Dynamics in Scale-wise Autoregressive Generation

📝 Summary:
Self-Autoregressive Refinement (SAR) improves the quality of autoregressive generative models by addressing exposure bias through Stagger-Scale Rollout and Contrastive Student-Forcing Loss, leading to...

🔹 Publication Date: Published on Dec 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06421
• PDF: https://arxiv.org/pdf/2512.06421
• Project Page: https://gengzezhou.github.io/SAR/

==================================

✨Embodied Referring Expression Comprehension in Human-Robot Interaction

📝 Summary:
A large-scale dataset and multimodal model improve embodied interaction comprehension in robots by addressing perspective bias and enhancing multimodal signal integration.

🔹 Publication Date: Published on Dec 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06558
• PDF: https://arxiv.org/pdf/2512.06558

==================================

✨Unified Video Editing with Temporal Reasoner

📝 Summary:
VideoCoF, a Chain-of-Frames approach, improves video editing precision and instruction-to-region mapping by using reasoning tokens without requiring user-provided masks.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07469
• PDF: https://arxiv.org/pdf/2512.07469
• Project Page: https://videocof.github.io/
• Github: https://github.com/knightyxp/VideoCoF

🔹 Models citing this paper:
• https://huggingface.co/XiangpengYang/VideoCoF

==================================

✨Relational Visual Similarity

📝 Summary:
Vision-language models fine-tuned on anonymized image captions can capture relational similarity between images, a capability lacking in current visual similarity metrics.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07833
• PDF: https://arxiv.org/pdf/2512.07833
• Project Page: https://thaoshibe.github.io/relsim/
• Github: https://github.com/thaoshibe/relsim

==================================

✨Group Representational Position Encoding

📝 Summary:
GRAPE is a unified positional encoding framework that combines multiplicative rotations and additive logit biases, extending existing methods like RoPE and ALiBi.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07805
• PDF: https://model-architectures.github.io/GRAPE/GRAPE.pdf
• Project Page: https://model-architectures.github.io/GRAPE/
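
For readers unfamiliar with the two baselines named in the summary, here is a minimal Python sketch of a RoPE-style rotation (multiplicative) and an ALiBi-style distance penalty (additive) applied to a single attention logit. It is not GRAPE's formulation, which this post does not spell out. The function names, the single `slope` value, and the symmetric `abs` distance are assumptions made for illustration.

```python
import math


def rope_rotate(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (RoPE-style)."""
    d = len(vec)
    assert d % 2 == 0, "dimension must be even so it splits into 2D pairs"
    out = []
    for i in range(0, d, 2):
        # Pair i/2 uses frequency base^(-i/d); the angle grows linearly with pos.
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def alibi_bias(q_pos, k_pos, slope=0.5):
    """Additive logit bias that grows linearly with distance (ALiBi-style).

    `slope` is a hypothetical single value; ALiBi itself uses one slope per head.
    """
    return -slope * abs(q_pos - k_pos)


def attention_logit(q, k, q_pos, k_pos, slope=0.5):
    """Scaled dot product of rotated q and k, plus the additive distance bias."""
    qr = rope_rotate(q, q_pos)
    kr = rope_rotate(k, k_pos)
    dot = sum(a * b for a, b in zip(qr, kr))
    return dot / math.sqrt(len(q)) + alibi_bias(q_pos, k_pos, slope)


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.1]
# Both terms depend only on the relative offset, so these two logits agree
# up to floating-point error.
print(attention_logit(q, k, 3, 1), attention_logit(q, k, 5, 3))
```

The rotation acts on the query and key vectors before the dot product, while the bias is added to the logit afterwards, which is the multiplicative versus additive distinction the summary draws.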

==================================

✨Voxify3D: Pixel Art Meets Volumetric Rendering

📝 Summary:
Voxify3D is a two-stage framework that combines 3D mesh optimization with 2D pixel art supervision to generate high-quality voxel art with semantic preservation, pixel-art aesthetics, and discrete col...

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07834
• PDF: https://arxiv.org/pdf/2512.07834
• Project Page: https://yichuanh.github.io/Voxify-3D/

==================================

✨On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

📝 Summary:
A controlled experimental framework isolates and evaluates the contributions of pre-training, mid-training, and reinforcement learning in improving language model reasoning, demonstrating the necessit...

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07783
• PDF: https://arxiv.org/pdf/2512.07783
• Github: https://github.com/Interplay-LM-Reasoning/Interplay-LM-Reasoning

==================================

✨Vector Quantization using Gaussian Variational Autoencoder

📝 Summary:
Gaussian Quant (GQ) converts a Gaussian VAE to a VQ-VAE without training, outperforming previous VQ-VAEs and Gaussian VAE discretization methods across different architectures.

🔹 Publication Date: Published on Dec 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06609
• PDF: https://arxiv.org/pdf/2512.06609
• Github: https://github.com/Stability-AI/generative-models

🔹 Models citing this paper:
• https://huggingface.co/xutongda/GQModel
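
As background for the term "vector quantization", here is a minimal Python sketch of the generic nearest-codeword lookup used in VQ-style models. It is not the GQ procedure, whose details are not given in this post. The `quantize` function and the toy codebook are hypothetical and for illustration only.

```python
def quantize(z, codebook):
    """Return (index, codeword) of the codebook entry nearest to latent `z`.

    Distance is squared Euclidean; ties resolve to the lowest index.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    idx = min(range(len(codebook)), key=lambda i: sqdist(z, codebook[i]))
    return idx, codebook[idx]


# Toy 2D codebook with three entries (made-up values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
print(quantize([0.9, 1.2], codebook))  # prints (1, [1.0, 1.0])
```

The continuous latent is replaced by a discrete index into the codebook, which is what lets downstream models treat images as token sequences.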

==================================

✨VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

📝 Summary:
VideoVLA uses a multi-modal Diffusion Transformer to predict actions and visual outcomes from language and image inputs, enabling strong generalization in robotic manipulation tasks.

🔹 Publication Date: Published on Dec 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06963
• PDF: https://arxiv.org/pdf/2512.06963
• Project Page: https://videovla-nips2025.github.io/

==================================

✨Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

📝 Summary:
NPR is a teacher-free framework enabling LLMs to perform genuine parallel reasoning. It uses self-distilled training and a new optimization algorithm. This achieves significant performance gains and speedups on reasoning benchmarks.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07461
• PDF: https://arxiv.org/pdf/2512.07461
• Project Page: https://bigai-nlco.github.io/Native-Parallel-Reasoner

==================================

✨Distribution Matching Variational AutoEncoder

📝 Summary:
DMVAE explicitly aligns VAE latent distributions with arbitrary reference distributions, generalizing beyond fixed priors. This improves modeling efficiency and image synthesis fidelity, with SSL-derived distributions showing excellent balance.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07778
• PDF: https://arxiv.org/pdf/2512.07778
• Github: https://github.com/sen-ye/dmvae

==================================

#VAE #DeepLearning #GenerativeAI #ImageSynthesis #ArtificialIntelligence

✨Multi-view Pyramid Transformer: Look Coarser to See Broader

📝 Summary:
MVP is a scalable multi-view transformer that reconstructs large 3D scenes from many images. It uses a dual hierarchy of local-to-global inter-view and fine-to-coarse intra-view processing. This achieves efficient, state-of-the-art 3D scene reconstruction quality.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07806
• PDF: https://arxiv.org/pdf/2512.07806
• Project Page: https://gynjn.github.io/MVP/
• Github: https://github.com/Gynjn/MVP

==================================

#3DReconstruction #ComputerVision #Transformers #DeepLearning #AI

✨UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

📝 Summary:
UnityVideo is a unified framework enhancing video generation by integrating multiple modalities and training paradigms. It uses dynamic noising and a modality switcher for comprehensive world understanding. This improves video quality, consistency, and zero-shot generalization to new data.

🔹 Publication Date: Published on Dec 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.07831
• PDF: https://arxiv.org/pdf/2512.07831
• Project Page: https://jackailab.github.io/Projects/UnityVideo/
• Github: https://github.com/dvlab-research/UnityVideo

==================================

#VideoGeneration #MultimodalAI #GenerativeAI #DeepLearning #AIResearch

✨ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

📝 Summary:
ReCamDriving generates camera-controlled novel-trajectory videos using dense 3DGS renderings and a two-stage training approach, achieving state-of-the-art results in controllability and consistency.

🔹 Publication Date: Published on Dec 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03621
• PDF: https://arxiv.org/pdf/2512.03621
• Project Page: https://recamdriving.github.io/

==================================

✨VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

📝 Summary:
VG-Refiner improves visual reasoning by addressing unreliable tool outputs. It uses a two-stage think-rethink mechanism and refinement reward to correct poor tool results. This significantly improves accuracy and correction ability in referring and grounding tasks.

🔹 Publication Date: Published on Dec 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06373
• PDF: https://arxiv.org/pdf/2512.06373
• Github: https://github.com/VoyageWang/VG-Refiner

==================================

#VisualReasoning #ReinforcementLearning #ComputerVision #AIResearch #MachineLearning

✨Decouple to Generalize: Context-First Self-Evolving Learning for Data-Scarce Vision-Language Reasoning

📝 Summary:
DoGe is a framework that addresses data scarcity in vision-language models. It decouples context learning from problem solving, using a curriculum to improve reward signals and data diversity. This enhances generalization and performance.

🔹 Publication Date: Published on Dec 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06835
• PDF: https://arxiv.org/pdf/2512.06835

==================================

#VisionLanguage #DataScarcity #MachineLearning #AIResearch #DeepLearning

✨GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

📝 Summary:
GLM-4.1V-Thinking is a vision-language model using a reasoning-centric training framework. It achieves state-of-the-art multimodal reasoning across various tasks like STEM and long document understanding. The model outperforms larger models and competes with closed-source systems like GPT-4o.

🔹 Publication Date: Published on Jul 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2507.01006
• Explainer (Arxivexplained): https://arxivexplained.com/papers/glm-41v-thinking-towards-versatile-multimodal-reasoning-with-scalable-reinforcement-learning
• PDF: https://arxiv.org/pdf/2507.01006
• Github: https://github.com/THUDM/GLM-4.1V-Thinking

🔹 Models citing this paper:
• https://huggingface.co/zai-org/GLM-4.1V-9B-Thinking
• https://huggingface.co/zai-org/GLM-4.5V
• https://huggingface.co/zai-org/GLM-4.6V-Flash

🔹 Spaces citing this paper:
• https://huggingface.co/spaces/zai-org/GLM-4.1V-9B-Thinking-Demo
• https://huggingface.co/spaces/zai-org/GLM-4.1V-9B-Thinking-API-Demo
• https://huggingface.co/spaces/akhaliq/anycoder

==================================

#GLM41VThinking #MultimodalAI #VisionLanguageModels #ReinforcementLearning #AIResearch

✨Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

📝 Summary:
Reinforcement Learning enhances decoding-based regression by introducing sequence-level rewards. This overcomes token-level limitations, improving precision and generalization. It establishes a robust and accurate paradigm for numerical prediction.

🔹 Publication Date: Published on Dec 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.06533
• PDF: https://arxiv.org/pdf/2512.06533

==================================

#ReinforcementLearning #MachineLearning #Regression #DataScience #AI

✨DZ-TDPO: Non-Destructive Temporal Alignment for Mutable State Tracking in Long-Context Dialogue

📝 Summary:
DZ-TDPO addresses state inertia in long-context dialogue using dynamic KL constraints and temporal attention bias. It achieves state-of-the-art win rates and robust zero-shot generalization, resolving user intent conflicts while preserving model capabilities.

🔹 Publication Date: Published on Dec 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.03704
• PDF: https://arxiv.org/pdf/2512.03704
• Github: https://github.com/lyj20071013/DZ-TDPO

==================================

#DialogueSystems #NLP #MachineLearning #StateTracking #LongContext