ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Adaptation of Agentic AI

📝 Summary:
This paper presents a framework for agent and tool adaptation in agentic AI systems, clarifying design strategies and identifying open challenges for improving AI capabilities.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16301
• PDF: https://arxiv.org/pdf/2512.16301
• Github: https://github.com/pat-jj/Awesome-Adaptation-of-Agentic-AI

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
TabReX : Tabular Referenceless eXplainable Evaluation

📝 Summary:
TabReX is a reference-less framework using graph-based reasoning to evaluate the quality of tables generated by LLMs, offering structural and factual fidelity scores.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15907
• PDF: https://arxiv.org/pdf/2512.15907
• Project Page: https://coral-lab-asu.github.io/TabReX/
• Github: https://github.com/CoRAL-ASU/TabReX

==================================

Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model

📝 Summary:
Seedance 1.5 pro, a dual-branch Diffusion Transformer model, achieves high-quality audio-visual synchronization and generation through cross-modal integration, post-training optimizations, and an acce...

🔹 Publication Date: Published on Dec 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.13507
• PDF: https://arxiv.org/pdf/2512.13507
• Project Page: https://seed.bytedance.com/seedance1_5_pro

==================================

Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation

📝 Summary:
A panoramic metric depth foundation model using DINOv3-Large and a three-stage pseudo-label pipeline achieves robust performance across diverse real-world scenes.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16913
• PDF: https://arxiv.org/pdf/2512.16913
• Project Page: https://insta360-research-team.github.io/DAP

==================================

Next-Embedding Prediction Makes Strong Vision Learners

📝 Summary:
Generative pretraining using next embedding prediction outperforms traditional self-supervised methods in visual learning tasks, achieving high accuracy on ImageNet and effective transfer to semantic ...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16922
• PDF: https://arxiv.org/pdf/2512.16922

🔹 Models citing this paper:
https://huggingface.co/SixAILab/nepa-base-patch14-224-sft
https://huggingface.co/SixAILab/nepa-large-patch14-224
https://huggingface.co/SixAILab/nepa-base-patch14-224

==================================

Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

📝 Summary:
Alchemist, a meta-gradient-based framework, automatically selects high-quality subsets from large-scale text-image datasets to improve visual quality and training efficiency in Text-to-Image models.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16905
• PDF: https://arxiv.org/pdf/2512.16905

==================================

StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

📝 Summary:
StereoPilot, a feed-forward model leveraging a learnable domain switcher and cycle consistency loss, synthesizes high-quality stereo video directly without depth maps, outperforming existing methods i...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16915
• PDF: https://arxiv.org/pdf/2512.16915

==================================

Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

📝 Summary:
Reinforcement learning with verifiable rewards improves LLM reasoning through spurious rewards and entropy minimization, despite seemingly paradoxical effects, by reducing clipping bias and policy ent...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16912
• PDF: https://arxiv.org/pdf/2512.16912

==================================

Differences That Matter: Auditing Models for Capability Gap Discovery and Rectification

📝 Summary:
AuditDM, an automated framework using reinforcement learning, identifies and rectifies failure modes in multimodal LLMs by generating challenging examples, leading to improved performance across bench...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16921
• PDF: https://arxiv.org/pdf/2512.16921
• Project Page: https://auditdm.github.io/
• Github: https://auditdm.github.io/

==================================

EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration

📝 Summary:
EmoCaliber, a confidence-aware Multimodal Large Language Model, enhances Visual Emotion Comprehension by verbalizing confidence in emotion predictions, leading to improved reliability and accuracy.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15528
• PDF: https://arxiv.org/pdf/2512.15528
• Github: https://github.com/wdqqdw/EmoCaliber

==================================

Generative Refocusing: Flexible Defocus Control from a Single Image

📝 Summary:
Generative Refocusing uses DeblurNet and BokehNet for high-quality single-image refocusing. Its semi-supervised training with real bokeh images and EXIF metadata enables controllable bokeh and text-guided adjustments, outperforming current methods.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16923
• PDF: https://arxiv.org/pdf/2512.16923
• Project Page: https://generative-refocusing.github.io/
• Github: https://github.com/rayray9999/Genfocus

==================================

RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

📝 Summary:
RePlan, a plan-then-execute framework, enhances instruction-based image editing by combining a vision-language planner with a diffusion editor, achieving superior performance in complex and intricate ...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16864
• PDF: https://arxiv.org/pdf/2512.16864
• Project Page: https://replan-iv-edit.github.io/
• Github: https://github.com/dvlab-research/RePlan

==================================

AdaTooler-V: Adaptive Tool-Use for Images and Videos

📝 Summary:
AdaTooler-V, a multimodal large language model, adaptively uses vision tools based on reinforcement learning, improving performance and reducing unnecessary tool invocations in visual reasoning tasks....

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16918
• PDF: https://arxiv.org/pdf/2512.16918
• Github: https://github.com/CYWang735/AdaTooler-V

==================================

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

📝 Summary:
N3D-VLM integrates native 3D perception and reasoning in vision-language models, enabling precise 3D localization and spatial understanding with a large-scale dataset.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16561
• PDF: https://arxiv.org/pdf/2512.16561
• Github: https://github.com/W-Ted/N3D-VLM

==================================

The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text

📝 Summary:
WorldCanvas generates coherent, controllable world events by integrating text, trajectories, and reference images. This multimodal approach surpasses text-only or image-to-video methods, creating videos with preserved object identity and temporal consistency. It advances world models from passive...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16924
• PDF: https://arxiv.org/pdf/2512.16924
• Project Page: https://worldcanvas.github.io/
• Github: https://github.com/pPetrichor/WorldCanvas

==================================

Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers

📝 Summary:
Log-linear Sparse Attention (LLSA) improves the efficiency of diffusion transformers by reducing computational costs for long token sequences through a hierarchical structure, enhancing training speed...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16615
• PDF: https://arxiv.org/pdf/2512.16615
• Github: https://github.com/SingleZombie/LLSA

==================================

Coupled Variational Reinforcement Learning for Language Model General Reasoning

📝 Summary:
CoVRL, a hybrid approach combining variational inference and reinforcement learning, enhances language model reasoning by coupling prior and posterior distributions, improving performance and coherenc...

🔹 Publication Date: Published on Dec 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.12576
• PDF: https://arxiv.org/pdf/2512.12576

==================================

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks

📝 Summary:
VenusBench-GD is a comprehensive, multi-platform GUI grounding benchmark with a hierarchical evaluation. It reveals that general-purpose models excel at basic tasks, while specialized models remain better at advanced tasks despite overfitting.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16501
• PDF: https://arxiv.org/pdf/2512.16501
• Project Page: https://ui-venus.github.io/VenusBench-GD/

🔹 Datasets citing this paper:
https://huggingface.co/datasets/inclusionAI/VenusBench-GD

==================================

REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion

📝 Summary:
REGLUE, a unified latent diffusion framework, enhances image synthesis by jointly modeling VAE latents, patch-level VFM semantics, and global tokens, improving semantic supervision and convergence.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16636
• PDF: https://arxiv.org/pdf/2512.16636

==================================

FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction

📝 Summary:
FlashPortrait is a diffusion-based video transformer for long-portrait animation that ensures ID consistency and achieves 6x acceleration through a dynamic sliding-window scheme and higher-order laten...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16900
• PDF: https://arxiv.org/pdf/2512.16900
• Project Page: https://francis-rings.github.io/FlashPortrait/
• Github: https://github.com/Francis-Rings/FlashPortrait

🔹 Models citing this paper:
https://huggingface.co/FrancisRing/FlashPortrait

==================================

Insight Miner: A Time Series Analysis Dataset for Cross-Domain Alignment with Natural Language

📝 Summary:
Insight Miner, a large-scale multimodal model, generates high-quality time-series descriptions using a novel agentic workflow and outperforms existing models with the help of the TS-Insights dataset. ...

🔹 Publication Date: Published on Dec 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.11251
• PDF: https://arxiv.org/pdf/2512.11251

🔹 Datasets citing this paper:
https://huggingface.co/datasets/zhykoties/time-series-language-alignment

==================================
