ML Research Hub
32.6K subscribers
3.89K photos
210 videos
23 files
4.18K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning

📝 Summary:
Skyra, a specialized multimodal large language model, detects and explains visual artifacts in AI-generated videos using a novel dataset and two-stage training strategy, outperforming existing methods...

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15693
• PDF: https://arxiv.org/pdf/2512.15693
• Project Page: https://joeleelyf.github.io/Skyra/
• Github: https://github.com/JoeLeelyf/Skyra

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing

📝 Summary:
Jacobi Forcing is a progressive distillation method that enables efficient parallel decoding of transformer-based models while maintaining performance, significantly reducing inference latency.

🔹 Publication Date: Published on Dec 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14681
• PDF: https://arxiv.org/pdf/2512.14681
• Github: https://github.com/hao-ai-lab/JacobiForcing

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
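
The summary above does not describe the decoding procedure itself. As a minimal sketch, assuming the "Jacobi" in the name refers to Jacobi-style fixed-point iteration over a block of draft tokens, the toy below shows why such iteration reproduces greedy autoregressive output. The `next_token` function is a hypothetical stand-in for a model's greedy argmax, and nothing here reflects the paper's distillation procedure.

```python
def next_token(prefix):
    """Hypothetical deterministic 'model': maps a prefix to one token id."""
    return (sum(prefix) * 31 + len(prefix)) % 50


def greedy_decode(prompt, n):
    """Standard autoregressive decoding: one token per step."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(prompt):]


def jacobi_decode(prompt, n, init_token=0):
    """Jacobi fixed-point iteration over a block of n draft tokens.

    Every position is recomputed from the previous iterate. With a real
    model, all n positions would come from a single batched forward pass.
    """
    guess = [init_token] * n
    iterations = 0
    while True:
        iterations += 1
        new = [next_token(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point: each token agrees with its prefix
            return guess, iterations
        guess = new


prompt = [3, 1, 4]
tokens, iterations = jacobi_decode(prompt, n=8)
print(tokens == greedy_decode(prompt, 8), iterations)
```

After k iterations at least the first k tokens match the greedy output, so the loop stops within n + 1 iterations. Any speedup comes from cases where several tokens become correct in the same iteration.
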
DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models

📝 Summary:
DiffusionVL, a family of diffusion vision language models derived from autoregressive models through fine-tuning, achieves performance improvements and faster inference speeds compared to existing mod...

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15713
• PDF: https://arxiv.org/pdf/2512.15713

🔹 Models citing this paper:
https://huggingface.co/hustvl/DiffusionVL-Qwen2.5VL-3B
https://huggingface.co/hustvl/DiffusionVL-Qwen2.5VL-7B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition

📝 Summary:
Qwen-Image-Layered decomposes images into semantically disentangled RGBA layers using a diffusion model, enabling independent editing of each layer and improving decomposition quality and consistency....

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15603
• PDF: https://arxiv.org/pdf/2512.15603

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
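
To make "independent editing of each layer" concrete, here is a small sketch of how a stack of RGBA layers recombines into one image using the standard "over" compositing operator on straight (non-premultiplied) alpha. It illustrates layered compositing in general and is not the paper's decomposition model. The pixel values and function names are made up for illustration.

```python
def over(top, bottom):
    """Composite one straight-alpha RGBA pixel over another (channels in [0, 1])."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        # Alpha-weighted average of the top and bottom colour channels.
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers):
    """Composite a list of pixels given bottom-to-top into a single pixel."""
    result = (0.0, 0.0, 0.0, 0.0)  # start from a fully transparent canvas
    for layer in layers:
        result = over(layer, result)
    return result


background = (0.0, 0.0, 1.0, 1.0)   # opaque blue
foreground = (1.0, 0.0, 0.0, 0.5)   # half-transparent red
print(flatten([background, foreground]))   # (0.5, 0.0, 0.5, 1.0)

# Editing one layer (red -> green) leaves the background layer untouched.
recolored = (0.0, 1.0, 0.0, 0.5)
print(flatten([background, recolored]))    # (0.0, 0.5, 0.5, 1.0)
```
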
Step-GUI Technical Report

📝 Summary:
A self-evolving training pipeline with the Calibrated Step Reward System and GUI-MCP protocol improve GUI automation efficiency, accuracy, and privacy in real-world scenarios.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15431
• PDF: https://arxiv.org/pdf/2512.15431

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Robust and Calibrated Detection of Authentic Multimedia Content

📝 Summary:
A resynthesis framework enhances deepfake detection by verifying authenticity with low false positive rates and robustness against efficient adversaries, supporting multiple modalities.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15182
• PDF: https://arxiv.org/pdf/2512.15182

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Is Nano Banana Pro a Low-Level Vision All-Rounder? A Comprehensive Evaluation on 14 Tasks and 40 Datasets

📝 Summary:
Nano Banana Pro excels in subjective visual quality across low-level vision tasks without fine-tuning but struggles with traditional reference-based quantitative metrics due to generative model stocha...

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15110
• PDF: https://arxiv.org/pdf/2512.15110
• Project Page: https://lowlevelbanana.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning

📝 Summary:
The paper proposes SAGE, a multi-turn reasoning system for video that mimics human behavior, using synthetic data and reinforcement learning to improve performance on long videos.

🔹 Publication Date: Published on Dec 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.13874
• PDF: https://arxiv.org/pdf/2512.13874
• Project Page: https://praeclarumjj3.github.io/sage/
• Github: https://github.com/allenai/SAGE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
In Pursuit of Pixel Supervision for Visual Pre-training

📝 Summary:
Pixio, an enhanced masked autoencoder, demonstrates competitive performance across various downstream tasks using pixel-space self-supervised learning, outperforming latent-space approaches.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15715
• PDF: https://arxiv.org/pdf/2512.15715
• Project Page: https://github.com/facebookresearch/pixio
• Github: https://github.com/facebookresearch/pixio

🔹 Models citing this paper:
https://huggingface.co/facebook/pixio-vitb16
https://huggingface.co/facebook/pixio-vitl16
https://huggingface.co/facebook/pixio-vit1b16

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
End-to-End Training for Autoregressive Video Diffusion via Self-Resampling

📝 Summary:
Resampling Forcing is a teacher-free framework to train autoregressive video diffusion models. It uses self-resampling to simulate inference errors and history routing for efficient long video generation. This approach improves temporal consistency and achieves comparable performance to teacher-b...

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15702
• PDF: https://arxiv.org/pdf/2512.15702

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
LikeBench: Evaluating Subjective Likability in LLMs for Personalization

📝 Summary:
LikeBench introduces a multi-session evaluation framework to measure the likability of LLMs by their ability to adapt to user preferences across multiple dimensions, demonstrating that strong memory p...

🔹 Publication Date: Published on Dec 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.13077
• PDF: https://arxiv.org/pdf/2512.13077

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

https://news.1rj.ru/str/addlist/8_rRW2scgfRhOTc0

https://news.1rj.ru/str/Codeprogrammer
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning

📝 Summary:
IC-Effect is an instruction-guided DiT framework for precise video VFX editing. It synthesizes complex effects with spatial-temporal consistency by leveraging contextual learning, a two-stage training strategy, and sparse tokenization, outperforming existing models.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15635
• PDF: https://arxiv.org/pdf/2512.15635

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
WAY: Estimation of Vessel Destination in Worldwide AIS Trajectory

📝 Summary:
A novel deep learning architecture, WAY, uses nested sequence structures and spatial grids for accurate long-term vessel destination estimation from AIS data, incorporating CASP blocks and Gradient Dr...

🔹 Publication Date: Published on Dec 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.13190
• PDF: https://arxiv.org/pdf/2512.13190
• Github: https://github.com/sadPororo/WAY

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

📝 Summary:
MMSI-Video-Bench is a comprehensive benchmark for video-based spatial intelligence in MLLMs, revealing significant gaps between human and AI performance and highlighting challenges in geometric reason...

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10863
• PDF: https://arxiv.org/pdf/2512.10863
• Github: https://github.com/InternRobotics/MMSI-Video-Bench

🔹 Datasets citing this paper:
https://huggingface.co/datasets/rbler/MMSI-Video-Bench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Hybrid Attribution Priors for Explainable and Robust Model Training

📝 Summary:
A novel framework, Class-Aware Attribution Prior (CAP), enhances language model interpretability and robustness by guiding the model to capture fine-grained class distinctions and combining with exist...

🔹 Publication Date: Published on Dec 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14719
• PDF: https://arxiv.org/pdf/2512.14719

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SS4D: Native 4D Generative Model via Structured Spacetime Latents

📝 Summary:
SS4D synthesizes dynamic 3D objects from monocular video using a native 4D generative model with structured spacetime latents, ensuring high fidelity, temporal coherence, and structural consistency.

🔹 Publication Date: Published on Dec 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14284
• PDF: https://arxiv.org/pdf/2512.14284
• Project Page: https://lizb6626.github.io/SS4D/
• Github: https://github.com/Lizb6626/SS4D/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VOYAGER: A Training Free Approach for Generating Diverse Datasets using LLMs

📝 Summary:
Voyager is a novel, training-free method that iteratively generates diverse synthetic datasets from LLMs. It uses determinantal point processes to optimize diversity, significantly outperforming baselines with a 1.5-3x improvement.

🔹 Publication Date: Published on Dec 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.12072
• PDF: https://arxiv.org/pdf/2512.12072

==================================

#LLMs #SyntheticData #DataScience #MachineLearning #AI
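
The Voyager summary mentions determinantal point processes (DPPs) as the diversity mechanism. A rough sketch of the underlying idea, assuming a linear (Gram) kernel over item embeddings and a plain greedy selection rule: a subset scores higher when the determinant of its kernel submatrix is larger, which penalizes near-duplicate items. The embeddings and function names below are hypothetical, and this is not claimed to be the paper's algorithm.

```python
def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def gram(vectors):
    """Linear kernel: L[i][j] is the dot product of items i and j."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def greedy_dpp(vectors, k):
    """Greedily add the item that maximizes det of the chosen kernel submatrix."""
    kernel = gram(vectors)
    chosen = []
    for _ in range(k):
        best, best_det = None, 0.0
        for i in range(len(vectors)):
            if i in chosen:
                continue
            idx = chosen + [i]
            d = det([[kernel[a][b] for b in idx] for a in idx])
            if d > best_det:
                best, best_det = i, d
        if best is None:  # every remaining item is redundant
            break
        chosen.append(best)
    return chosen


# Items 0 and 1 are near-duplicates; item 2 points in a different direction.
embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
print(greedy_dpp(embeddings, k=2))  # [0, 2]
```
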
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

📝 Summary:
HyperVL is an efficient multimodal large language model for edge devices. It uses image tiling, a Visual Resolution Compressor, and Dual Consistency Learning to reduce memory, latency, and power. HyperVL maintains performance, making it practical for on-device inference.

🔹 Publication Date: Published on Dec 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14052
• PDF: https://arxiv.org/pdf/2512.14052

==================================

#HyperVL #MLLM #EdgeAI #EfficientAI #OnDeviceAI
Towards Seamless Interaction: Causal Turn-Level Modeling of Interactive 3D Conversational Head Dynamics

📝 Summary:
TIMAR is a new causal framework for 3D conversational head generation. It models dialogue using interleaved audio-visual contexts to predict continuous head dynamics, improving coherence and expressive variability. Experiments show TIMAR significantly reduces errors and improves performance.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15340
• PDF: https://arxiv.org/pdf/2512.15340
• Project Page: https://github.com/CoderChen01/towards-seamleass-interaction/blob/main/README.md
• Github: https://github.com/CoderChen01/towards-seamleass-interaction

==================================

#ConversationalAI #3DAnimation #HumanComputerInteraction #CausalModeling #AI