ML Research Hub – Telegram
32.6K subscribers
3.92K photos
217 videos
23 files
4.22K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

📝 Summary:
RePlan, a plan-then-execute framework, enhances instruction-based image editing by combining a vision-language planner with a diffusion editor, achieving superior performance in complex and intricate ...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16864
• PDF: https://arxiv.org/pdf/2512.16864
• Project Page: https://replan-iv-edit.github.io/
• Github: https://github.com/dvlab-research/RePlan

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

AdaTooler-V: Adaptive Tool-Use for Images and Videos

📝 Summary:
AdaTooler-V, a multimodal large language model, adaptively uses vision tools based on reinforcement learning, improving performance and reducing unnecessary tool invocations in visual reasoning tasks....

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16918
• PDF: https://arxiv.org/pdf/2512.16918
• Github: https://github.com/CYWang735/AdaTooler-V

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

📝 Summary:
N3D-VLM integrates native 3D perception and reasoning in vision-language models, enabling precise 3D localization and spatial understanding with a large-scale dataset.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16561
• PDF: https://arxiv.org/pdf/2512.16561
• Github: https://github.com/W-Ted/N3D-VLM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text

📝 Summary:
WorldCanvas generates coherent, controllable world events by integrating text, trajectories, and reference images. This multimodal approach surpasses text-only or image-to-video methods, creating videos with preserved object identity and temporal consistency. It advances world models from passive...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16924
• PDF: https://arxiv.org/pdf/2512.16924
• Project Page: https://worldcanvas.github.io/
• Github: https://github.com/pPetrichor/WorldCanvas

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers

📝 Summary:
Log-linear Sparse Attention (LLSA) improves the efficiency of diffusion transformers by reducing computational costs for long token sequences through a hierarchical structure, enhancing training speed...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16615
• PDF: https://arxiv.org/pdf/2512.16615
• Github: https://github.com/SingleZombie/LLSA

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

Coupled Variational Reinforcement Learning for Language Model General Reasoning

📝 Summary:
CoVRL, a hybrid approach combining variational inference and reinforcement learning, enhances language model reasoning by coupling prior and posterior distributions, improving performance and coherenc...

🔹 Publication Date: Published on Dec 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.12576
• PDF: https://arxiv.org/pdf/2512.12576

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks

📝 Summary:
VenusBench-GD is a comprehensive, multi-platform GUI grounding benchmark with a hierarchical evaluation. It reveals that general models excel at basic tasks, but specialized models are still better at advanced tasks, despite overfitting.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16501
• PDF: https://arxiv.org/pdf/2512.16501
• Project Page: https://ui-venus.github.io/VenusBench-GD/

Datasets citing this paper:
https://huggingface.co/datasets/inclusionAI/VenusBench-GD

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion

📝 Summary:
REGLUE, a unified latent diffusion framework, enhances image synthesis by jointly modeling VAE latents, patch-level VFM semantics, and global tokens, improving semantic supervision and convergence.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16636
• PDF: https://arxiv.org/pdf/2512.16636

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction

📝 Summary:
FlashPortrait is a diffusion-based video transformer for long-portrait animation that ensures ID consistency and achieves 6x acceleration through a dynamic sliding-window scheme and higher-order laten...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16900
• PDF: https://arxiv.org/pdf/2512.16900
• Project Page: https://francis-rings.github.io/FlashPortrait/
• Github: https://github.com/Francis-Rings/FlashPortrait

🔹 Models citing this paper:
https://huggingface.co/FrancisRing/FlashPortrait

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

Insight Miner: A Time Series Analysis Dataset for Cross-Domain Alignment with Natural Language

📝 Summary:
Insight Miner, a large-scale multimodal model, generates high-quality time-series descriptions using a novel agentic workflow and outperforms existing models with the help of the TS-Insights dataset. ...

🔹 Publication Date: Published on Dec 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.11251
• PDF: https://arxiv.org/pdf/2512.11251

Datasets citing this paper:
https://huggingface.co/datasets/zhykoties/time-series-language-alignment

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

📝 Summary:
A novel feed-forward framework, Make-It-Poseable, reformulates character posing as a latent-space transformation problem, using a latent posing transformer and dense pose representation to achieve sup...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16767
• PDF: https://arxiv.org/pdf/2512.16767
• Project Page: https://jasongzy.github.io/Make-It-Poseable/
• Github: https://github.com/jasongzy/Make-It-Poseable

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research

Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image

📝 Summary:
MMRB2 is a new benchmark for multimodal reward models, evaluating them on interleaved image and text tasks using 4,000 expert-annotated preferences. It shows top models like Gemini 3 Pro achieve 75-80% accuracy, still below human performance, highlighting areas for improvement in these models.
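
As a rough illustration of what "accuracy" on annotated preferences usually means, here is a minimal sketch: a reward model counts as correct on a pair when it scores the human-preferred response above the rejected one. This is an assumption about the general setup, not MMRB2's actual evaluation code. The names `PreferencePair`, `preference_accuracy`, and `toy_score` are hypothetical, and the toy scorer works on text only, whereas the benchmark covers interleaved text and images.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical record layout: (prompt, preferred_response, rejected_response).
PreferencePair = Tuple[str, str, str]


def preference_accuracy(
    pairs: Sequence[PreferencePair],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of pairs where the scorer ranks the preferred response higher."""
    if not pairs:
        raise ValueError("need at least one preference pair")
    correct = sum(
        1
        for prompt, chosen, rejected in pairs
        if score(prompt, chosen) > score(prompt, rejected)
    )
    return correct / len(pairs)


def toy_score(prompt: str, response: str) -> float:
    # Stand-in for a reward model: counts words shared with the prompt.
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    return float(len(prompt_words & response_words))


# Two made-up pairs; the toy scorer gets the first right and the second wrong.
pairs = [
    ("describe the red car", "a red car on a road", "a blue boat"),
    ("draw a cat", "a dog", "a cat sleeping"),
]
accuracy = preference_accuracy(pairs, toy_score)
```

Under this definition a scorer that guesses at random lands near 50% on balanced pairs, which is one way to read the reported 75-80% range.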

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16899
• PDF: https://arxiv.org/pdf/2512.16899
• Github: https://github.com/facebookresearch/MMRB2/tree/main

==================================

#MultimodalAI #RewardModels #AIbenchmark #MachineLearning #AIResearch

Vibe Spaces for Creatively Connecting and Expressing Visual Concepts

📝 Summary:
Vibe Blending uses Vibe Space, a hierarchical graph manifold, to create coherent and creative image hybrids. It learns geodesics in feature spaces, outperforming current methods in creativity and coherence as rated by humans.

🔹 Publication Date: Published on Dec 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14884
• PDF: https://arxiv.org/pdf/2512.14884
• Project Page: https://huzeyann.github.io/VibeSpace-webpage/
• Github: https://github.com/huzeyann/VibeSpace

==================================

#ImageGeneration #ComputerVision #AI #MachineLearning #CreativeAI

FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering

📝 Summary:
FrameDiffuser is an autoregressive neural rendering framework. It generates temporally consistent, photorealistic frames using G-buffer data and its own previous output. This achieves interactive speed and high quality compared to prior methods.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16670
• PDF: https://arxiv.org/pdf/2512.16670

==================================

#NeuralRendering #DiffusionModels #ComputerGraphics #RealtimeRendering #DeepLearning

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

📝 Summary:
JustRL uses a minimal single-stage RL approach with fixed hyperparameters to achieve state-of-the-art performance on 1.5B reasoning models. It uses less compute and shows stable training, suggesting that complex RL methods for LLMs may be unnecessary and can even hinder exploration.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16649
• PDF: https://arxiv.org/pdf/2512.16649

==================================

#ReinforcementLearning #LLMs #DeepLearning #AIResearch #ModelScaling

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

📝 Summary:
Vision-Language-Action (VLA) models integrate visual, linguistic, and action capabilities for autonomous driving. They aim for interpretable and human-aligned policies, addressing prior system limitations. This paper characterizes VLA paradigms, datasets, and future challenges.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16760
• PDF: https://arxiv.org/pdf/2512.16760
• Project Page: https://worldbench.github.io/vla4ad
• Github: https://github.com/worldbench/awesome-vla-for-ad

==================================

#VLAModels #AutonomousDriving #AI #DeepLearning #Robotics

Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs

📝 Summary:
This paper benchmarks SpeechLLMs against cascaded systems for speech-to-text translation. It finds that cascaded systems are more reliable overall, while SpeechLLMs match them only in select cases. Integrating an LLM is essential for high-quality speech translation.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16378
• PDF: https://arxiv.org/pdf/2512.16378
• Github: https://github.com/sarapapi/hearing2translate

==================================

#SpeechTranslation #LLMs #NLP #AIResearch #DeepLearning

🚀 Master Data Science & Programming!

Unlock your potential with this curated list of Telegram channels. Whether you need books, datasets, interview prep, or project ideas, we have the perfect resource for you. Join the community today!


🔰 Machine Learning with Python
Learn Machine Learning with hands-on Python tutorials, real-world code examples, and clear explanations for researchers and developers.
https://news.1rj.ru/str/CodeProgrammer

🔖 Machine Learning
Machine learning insights, practical tutorials, and clear explanations for beginners and aspiring data scientists. Follow the channel for models, algorithms, coding guides, and real-world ML applications.
https://news.1rj.ru/str/DataScienceM

🧠 Code With Python
This channel delivers clear, practical content for developers, covering Python, Django, data structures, and algorithms – perfect for learning, coding, and mastering key programming skills.
https://news.1rj.ru/str/DataScience4

🎯 PyData Careers | Quiz
Python Data Science jobs, interview tips, and career insights for aspiring professionals.
https://news.1rj.ru/str/DataScienceQ

💾 Kaggle Data Hub
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.
https://news.1rj.ru/str/datasets1

🧑‍🎓 Udemy Coupons | Courses
The first channel on Telegram that offers free Udemy coupons.
https://news.1rj.ru/str/DataScienceC

😀 ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.
https://news.1rj.ru/str/DataScienceT

💬 Data Science Chat
An active community group for discussing data challenges and networking with peers.
https://news.1rj.ru/str/DataScience9

🐍 Python Arab| بايثون عربي
The largest Arabic-speaking group for Python developers to share knowledge and help.
https://news.1rj.ru/str/PythonArab

🖊 Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
https://news.1rj.ru/str/DataScienceN

📺 Free Online Courses | Videos
Free online courses covering data science, machine learning, analytics, programming, and essential skills for learners.
https://news.1rj.ru/str/DataScienceV

📈 Data Analytics
Dive into the world of Data Analytics – uncover insights, explore trends, and master data-driven decision making.
https://news.1rj.ru/str/DataAnalyticsX

🎧 Learn Python Hub
Master Python with step-by-step courses – from basics to advanced projects and practical applications.
https://news.1rj.ru/str/Python53

⭐️ Research Papers
Professional Academic Writing & Simulation Services
https://news.1rj.ru/str/DataScienceY

━━━━━━━━━━━━━━━━━━
Admin: @HusseinSheikho
EasyV2V: A High-quality Instruction-based Video Editing Framework

📝 Summary:
EasyV2V is a framework for instruction-based video editing that combines diverse data sources, leverages pretrained text-to-video models with LoRA fine-tuning, and uses unified spatiotemporal control. This innovative approach achieves state-of-the-art results in video editing.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16920
• PDF: https://arxiv.org/pdf/2512.16920
• Project Page: https://snap-research.github.io/easyv2v/

==================================

#VideoEditing #AI #DeepLearning #ComputerVision #TextToVideo

Bidirectional Normalizing Flow: From Data to Noise and Back

📝 Summary:
Bidirectional Normalizing Flow (BiFlow) improves generative modeling by learning an approximate noise-to-data inverse, removing the need for exact invertibility. This allows flexible architectures, yielding better generation quality and accelerating sampling by up to two orders of magnitude.

🔹 Publication Date: Published on Dec 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.10953
• PDF: https://arxiv.org/pdf/2512.10953

==================================

#NormalizingFlows #GenerativeAI #MachineLearning #DeepLearning #DataScience

Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision

📝 Summary:
Nemotron-Math is a new large mathematical reasoning dataset with diverse styles and Python tool integration, generated from gpt-oss-120b. It combines competition problems with real-world queries, achieving state-of-the-art performance and accelerating long-context training.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15489
• PDF: https://arxiv.org/pdf/2512.15489

Datasets citing this paper:
https://huggingface.co/datasets/nvidia/Nemotron-Math-v2
https://huggingface.co/datasets/nvidia/Nemotron-Math-Proofs-v1

==================================

#NemotronMath #MathematicalReasoning #LargeLanguageModels #AIDataset #DeepLearning