ML Research Hub
32.7K subscribers
3.93K photos
217 videos
23 files
4.22K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning

📝 Summary:
ReVSeg enhances video object segmentation. It uses sequential reasoning within pretrained vision language models, optimized by reinforcement learning. This achieves state-of-the-art results and provides interpretable reasoning.

🔹 Publication Date: Published on Dec 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02835
• PDF: https://arxiv.org/pdf/2512.02835
• Project Page: https://clementine24.github.io/ReVSeg/
• Github: https://github.com/Clementine24/ReVSeg

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#VideoSegmentation #ReinforcementLearning #VisionLanguageModels #ComputerVision #DeepLearning
ProPhy: Progressive Physical Alignment for Dynamic World Simulation

📝 Summary:
ProPhy is a two-stage framework that enhances video generation by explicitly incorporating physics-aware conditioning and anisotropic generation. It uses a Mixture-of-Physics-Experts mechanism to extract fine-grained physical priors, improving physical consistency and realism in dynamic world sim...

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05564
• PDF: https://arxiv.org/pdf/2512.05564

==================================

#VideoGeneration #PhysicsAI #DynamicSimulation #DeepLearning #ComputerVision
World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

📝 Summary:
C3 is an uncertainty quantification method for training controllable video models that provides dense confidence estimation and out-of-distribution detection, addressing hallucination issues.

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05927
• PDF: https://arxiv.org/pdf/2512.05927
• Project Page: https://c-cubed-uq.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Self-Improving VLM Judges Without Human Annotations

📝 Summary:
A framework for self-training a Vision-Language Model (VLM) judge using self-synthesized data improves judge accuracy on VL-RewardBench, surpassing larger models in several dimensions.

🔹 Publication Date: Published on Dec 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05145
• PDF: https://arxiv.org/pdf/2512.05145

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

📝 Summary:
This paper introduces Entropy Ratio Clipping (ERC) to stabilize reinforcement learning. ERC uses the entropy ratio between policies as a global metric, imposing constraints to address distributional shifts overlooked by PPO-Clip. Experiments show consistent performance improvements.

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05591
• PDF: https://arxiv.org/pdf/2512.05591

==================================

#ReinforcementLearning #MachineLearning #DeepLearning #AI #ERC
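
Editor's sketch for the ERC post above: the summary does not give the paper's formula, so this is only one plausible reading, not the authors' method. It assumes the entropy ratio is the entropy of the new policy divided by the entropy of the old policy for each sample, and that samples whose ratio leaves a band are dropped from the usual PPO-Clip surrogate. The function names, the `low`/`high` bounds, and the sample dictionary layout are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ppo_clip_term(ratio, advantage, eps=0.2):
    """Standard PPO-Clip surrogate for a single sample."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def entropy_ratio(new_probs, old_probs, tiny=1e-12):
    """Assumed definition: H(new policy) / H(old policy) for one sample."""
    return entropy(new_probs) / max(entropy(old_probs), tiny)


def erc_objective(samples, eps=0.2, low=0.9, high=1.1):
    """Mean PPO-Clip surrogate, zeroing samples whose entropy ratio
    falls outside [low, high]. The bounds are hypothetical values."""
    total = 0.0
    for s in samples:
        r_h = entropy_ratio(s["new_probs"], s["old_probs"])
        if low <= r_h <= high:
            total += ppo_clip_term(s["ratio"], s["advantage"], eps)
    return total / len(samples)


samples = [
    # Entropy unchanged: the sample contributes its PPO-Clip term.
    {"new_probs": [0.5, 0.5], "old_probs": [0.5, 0.5], "ratio": 1.5, "advantage": 1.0},
    # Entropy collapsed: the importance ratio alone (1.1) would not be clipped,
    # but the entropy-ratio check removes the sample.
    {"new_probs": [0.9, 0.1], "old_probs": [0.5, 0.5], "ratio": 1.1, "advantage": 1.0},
]
objective = erc_objective(samples)
```

The second sample illustrates the kind of shift the summary says PPO-Clip overlooks: the importance ratio of the sampled action stays inside the clip range while the whole distribution has sharpened.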
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

📝 Summary:
MoRe4D generates high-quality 4D scenes from a single image by jointly performing motion generation and geometric reconstruction. It uses a diffusion-based 4D Scene Trajectory Generator and depth-guided motion normalization for consistent dynamic details.

🔹 Publication Date: Published on Dec 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05044
• PDF: https://arxiv.org/pdf/2512.05044
• Project Page: https://ivg-yanranzhang.github.io/MoRe4D/
• Github: https://github.com/Zhangyr2022/MoRe4D

==================================

#4DSynthesis #3DReconstruction #MotionGeneration #ComputerVision #GenerativeAI
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

📝 Summary:
COOPER is a unified MLLM that integrates depth and segmentation modalities to enhance spatial intelligence. It uses adaptive interleaved reasoning, improving spatial reasoning by 6.91%. Learning to generate auxiliary modalities also strengthens spatial understanding, boosting distance and size es...

🔹 Publication Date: Published on Dec 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04563
• PDF: https://arxiv.org/pdf/2512.04563
• Github: https://github.com/zhangzef/COOPER

🔹 Models citing this paper:
https://huggingface.co/Starrrrrry/COOPER-AMG
https://huggingface.co/Starrrrrry/COOPER

Datasets citing this paper:
https://huggingface.co/datasets/Starrrrrry/COOPER_Train_Set

==================================

#MLLM #SpatialIntelligence #ComputerVision #AI #DeepLearning
From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks

📝 Summary:
CAPO, a curriculum advantage policy optimization, enhances reinforcement learning for large language models by strategically introducing positive and negative advantage signals, improving reasoning ca...

🔹 Publication Date: Published on Dec 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02580
• PDF: https://arxiv.org/pdf/2512.02580

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AI & Human Co-Improvement for Safer Co-Superintelligence

📝 Summary:
The focus should be on collaborative co-improvement between humans and AI systems to achieve safer and accelerated AI research and development. Self-improvement is a goal currentl...

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05356
• PDF: https://arxiv.org/pdf/2512.05356

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling

📝 Summary:
SpaceControl enables explicit spatial control of 3D generation using various geometric inputs, outperforming existing methods in geometric faithfulness while maintaining visual quality.

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05343
• PDF: https://arxiv.org/pdf/2512.05343

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

📝 Summary:
PaCo-RL is a reinforcement learning framework for consistent image generation. It introduces PaCo-Reward for human-aligned consistency evaluation and PaCo-GRPO for efficient RL optimization. The framework achieves state-of-the-art consistency with improved training efficiency.

🔹 Publication Date: Published on Dec 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04784
• PDF: https://arxiv.org/pdf/2512.04784
• Project Page: https://x-gengroup.github.io/HomePage_PaCo-RL/
• Github: https://x-gengroup.github.io/HomePage_PaCo-RL

🔹 Models citing this paper:
https://huggingface.co/X-GenGroup/PaCo-Reward-7B
https://huggingface.co/X-GenGroup/PaCo-Reward-7B-Lora
https://huggingface.co/X-GenGroup/PaCo-FLUX.1-dev-Lora

==================================

#ReinforcementLearning #ImageGeneration #AI #DeepLearning #GenerativeAI
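
Editor's sketch for the PaCo-RL post above: the summary names PaCo-GRPO but does not describe its changes, so the block below shows only the generic group-relative advantage step that GRPO-style methods build on, not PaCo-GRPO itself. The reward values stand in for scores that a reward model such as PaCo-Reward might assign to several images generated from one prompt. The function name and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one sampling group to zero mean and unit scale.

    Each advantage says how much better or worse a sample scored than its
    siblings generated from the same prompt, so no value network is needed.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption here
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical consistency scores for three images from the same prompt.
advantages = group_relative_advantages([0.2, 0.5, 0.8])
```

When every sample in a group receives the same reward, all advantages are zero and the group contributes no learning signal, which is why the small `eps` is needed to avoid dividing by zero.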
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards

📝 Summary:
RealGen is a photorealistic text-to-image framework addressing AI artifacts in current models. It uses an LLM for prompt optimization and a diffusion model, enhanced by a Detector Reward mechanism that quantifies artifacts and assesses realism. RealGen significantly outperforms other models, achi...

🔹 Publication Date: Published on Nov 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.00473
• PDF: https://arxiv.org/pdf/2512.00473
• Project Page: https://yejy53.github.io/RealGen/
• Github: https://yejy53.github.io/RealGen/

==================================

#TextToImage #GenerativeAI #DiffusionModels #AIResearch #ComputerVision
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

📝 Summary:
TwinFlow is a 1-step generative model framework that enhances inference efficiency without requiring fixed pretrained teacher models or standard adversarial networks, achieving high performance on tex...

🔹 Publication Date: Published on Dec 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05150
• PDF: https://arxiv.org/pdf/2512.05150

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
I'm pleased to invite you to join my private Signal group.

All my resources will be free and unrestricted there. My goal is to build a clean community exclusively for smart programmers, and I believe Signal is the most suitable platform for this: it is the second most popular app after WhatsApp in the US, which makes it a good fit for us as programmers.

https://signal.group/#CjQKIPcpEqLQow53AG7RHjeVk-4sc1TFxyym3r0gQQzV-OPpEhCPw_-kRmJ8LlC13l0WiEfp
SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

📝 Summary:
SCAIL is a framework that improves character animation to studio-grade quality. It uses a novel 3D pose representation and a diffusion-transformer with full-context pose injection, achieving state-of-the-art realism and reliability.

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05905
• PDF: https://arxiv.org/pdf/2512.05905

🔹 Models citing this paper:
https://huggingface.co/zai-org/SCAIL-Preview

==================================

#CharacterAnimation #AI #3DAnimation #DeepLearning #ComputerGraphics
TimesNet-Gen: Deep Learning-based Site Specific Strong Motion Generation

📝 Summary:
TimesNet-Gen is a time-domain deep learning model that effectively synthesizes site-specific strong ground motion records. It uses a station-specific latent bottleneck and outperforms a spectrogram-based baseline, improving earthquake risk assessment.

🔹 Publication Date: Published on Dec 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04694
• PDF: https://arxiv.org/pdf/2512.04694
• Project Page: https://huggingface.co/spaces/Barisylmz/TimesNet-Gen
• Github: https://github.com/brsylmz23/TimesNet-Gen/tree/main

Spaces citing this paper:
https://huggingface.co/spaces/Barisylmz/TimesNet-Gen

==================================

#DeepLearning #EarthquakeEngineering #Seismology #GroundMotion #AI
InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write

📝 Summary:
InkSight converts offline handwriting to digital ink using novel reading and writing priors. This approach effectively derenders text from diverse photos, generalizing beyond its training and requiring less paired data than prior methods.

🔹 Publication Date: Published on Feb 8, 2024

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2402.05804
• PDF: https://arxiv.org/pdf/2402.05804
• Project Page: https://charlieleee.github.io/publication/inksight
• Github: https://github.com/google-research/inksight

🔹 Models citing this paper:
https://huggingface.co/Derendering/InkSight-Small-p

Datasets citing this paper:
https://huggingface.co/datasets/Derendering/InkSight-Derenderings

Spaces citing this paper:
https://huggingface.co/spaces/Derendering/Model-Output-Playground

==================================

#HandwritingConversion #ComputerVision #DeepLearning #AIResearch #DocumentDigitization
SQ-format: A Unified Sparse-Quantized Hardware-friendly Data Format for LLMs

📝 Summary:
The SQ-format is a unified sparse-quantized data format for LLM post-training quantization. It improves accuracy and efficiency balance by combining sparse and low-precision matrix multiplications. This enables better performance and throughput, especially for outlier activations, supporting next...

🔹 Publication Date: Published on Dec 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05409
• PDF: https://arxiv.org/pdf/2512.05409

==================================

#LLMs #Quantization #SparseML #HardwareAcceleration #AIResearch
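
Editor's sketch for the SQ-format post above: the summary does not specify the data layout, so this is not the SQ-format itself. It only illustrates the general idea of pairing a sparse high-precision path with a dense low-precision path: the few largest-magnitude activations are kept exactly, and the rest are quantized to a small signed integer range. All function names, the choice of `k`, and the 4-bit symmetric quantizer are hypothetical.

```python
def split_outliers(x, k):
    """Keep the k largest-magnitude entries exactly as {index: value};
    return the remaining entries as a dense list with those slots zeroed."""
    top = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k]
    sparse = {i: x[i] for i in top}
    dense = [0.0 if i in sparse else v for i, v in enumerate(x)]
    return sparse, dense


def quantize(values, bits=4):
    """Symmetric quantization to signed integers sharing one scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax or 1.0  # all-zero input
    return [round(v / scale) for v in values], scale


def sq_dot(x, w, k=1, bits=4):
    """Dot product as a low-precision dense part plus a sparse exact part."""
    sparse, dense = split_outliers(x, k)
    q, scale = quantize(dense, bits)
    low_precision = scale * sum(qi * wi for qi, wi in zip(q, w))
    high_precision = sum(v * w[i] for i, v in sparse.items())
    return low_precision + high_precision


x = [0.1, -0.2, 8.0, 0.3]  # one outlier activation dominates the range
w = [1.0, 1.0, 1.0, 1.0]
exact = sum(a * b for a, b in zip(x, w))
err_with_split = abs(sq_dot(x, w, k=1) - exact)
err_without_split = abs(sq_dot(x, w, k=0) - exact)
```

With `k=0` the outlier sets the quantization scale and the small activations all round to zero; moving the outlier to the sparse path lets the remaining values use the full integer range, so `err_with_split` is smaller than `err_without_split` on this input.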