ML Research Hub – Telegram
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Are We on the Right Way to Assessing LLM-as-a-Judge?

📝 Summary:
Sage is a human-free evaluation suite that assesses the consistency of LLM-as-a-Judge using rational choice theory. It reveals significant reliability problems in current top LLM judges, especially on difficult cases. The study suggests that fine-tuning, explicit rubrics, and panel judging can boost consistency.
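
One rational-choice property a judge can be checked against without human labels is transitivity: if it prefers A over B and B over C, it should also prefer A over C. The sketch below counts violating triples. The function name, the data layout, and the choice of transitivity as the check are illustrative assumptions, not Sage's actual metrics.

```python
from itertools import permutations


def intransitive_triples(prefers):
    """Count preference cycles a > b > c > a in a pairwise table.

    prefers[(a, b)] is True when the judge picked a over b.
    Hypothetical layout, chosen for illustration only.
    """
    items = sorted({x for pair in prefers for x in pair})
    cycles = 0
    for a, b, c in permutations(items, 3):
        # Every 3-cycle appears in three rotations; count only the
        # rotation that starts at its smallest item.
        if a < b and a < c and prefers.get((a, b)) and prefers.get((b, c)) and prefers.get((c, a)):
            cycles += 1
    return cycles


# Toy votes: the judge says A > B, B > C, but also C > A.
votes = {("A", "B"): True, ("B", "C"): True, ("C", "A"): True}
print(intransitive_triples(votes))
```

A judge with a consistent ranking would yield zero cycles on the same items.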

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16041
• PDF: https://arxiv.org/pdf/2512.16041

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#LLMEvaluation #LLMReliability #AIResearch #GenAI #NLP
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

📝 Summary:
Canon layers are lightweight architectural components that enhance language model reasoning depth and breadth by promoting horizontal information flow. They improve performance across various architectures, validated in synthetic tasks and real-world pretraining.
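
The summary does not spell out how Canon layers are built. As a rough picture of "horizontal information flow" only, the sketch below mixes each token's hidden vector with those of a few preceding tokens. The function name, the fixed weights, and the causal-window form are all assumptions for illustration, not the paper's definition.

```python
def causal_neighbor_mix(hidden, weights):
    """Add to each token a weighted sum of its preceding tokens' vectors.

    hidden:  list of per-token vectors (lists of floats)
    weights: weights[k - 1] scales the token k positions back
             (hypothetical fixed scalars; a trained layer would learn these)
    """
    mixed = []
    for t, vec in enumerate(hidden):
        out = list(vec)  # residual path: keep the token's own state
        for k, w in enumerate(weights, start=1):
            if t - k >= 0:  # never look ahead, so the mix stays causal
                out = [o + w * p for o, p in zip(out, hidden[t - k])]
        mixed.append(out)
    return mixed


# Three tokens with one-dimensional states, looking back two positions.
print(causal_neighbor_mix([[1.0], [2.0], [4.0]], [0.5, 0.25]))
```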

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17351
• PDF: https://arxiv.org/pdf/2512.17351
• Project Page: https://physics.allen-zhu.com/part-4-architecture-design/part-4-1
• Github: https://github.com/facebookresearch/PhysicsLM4

==================================

#LanguageModels #LLM #AIArchitecture #DeepLearning #NLP
Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

📝 Summary:
Turn-PPO improves multi-turn reinforcement learning for LLM agents by using a turn-level MDP for advantage estimation. This PPO variant outperforms GRPO and standard PPO, addressing limitations in long-horizon reasoning. It demonstrates effectiveness on WebShop and Sokoban datasets.
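
To make "turn-level advantage estimation" concrete, here is a minimal sketch that runs generalized advantage estimation (GAE) with one step per turn instead of one step per token, then copies each turn's advantage to its tokens. Using GAE, the default `gamma` and `lam`, and both function names are assumptions; the summary only states that a turn-level MDP is used.

```python
def turn_level_gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation where each step is one whole turn.

    rewards[t]: reward received after turn t
    values[t]:  critic's value estimate at the start of turn t
    The value after the final turn is taken as 0 (episode ends).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def broadcast_to_tokens(advantages, tokens_per_turn):
    """Give every token in a turn that turn's advantage."""
    return [adv for adv, n in zip(advantages, tokens_per_turn) for _ in range(n)]


# Two turns, reward only at the end, critic predicting zero everywhere.
adv = turn_level_gae([0.0, 1.0], [0.0, 0.0], gamma=0.5, lam=1.0)
print(adv, broadcast_to_tokens(adv, [2, 1]))
```

Estimating at the turn level keeps the credit-assignment horizon equal to the number of turns rather than the number of generated tokens.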

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17008
• PDF: https://arxiv.org/pdf/2512.17008

==================================

#LLM #ReinforcementLearning #AI #MachineLearning #AgenticAI
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

📝 Summary:
A novel framework, Robust-R1, enhances multimodal large language models' robustness to visual degradations through explicit modeling, supervised fine-tuning, reward-driven alignment, and dynamic reaso...

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17532
• PDF: https://arxiv.org/pdf/2512.17532
• Project Page: https://jqt.me/index.html

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence

📝 Summary:
The proposed Egocentric2Embodiment pipeline translates human egocentric videos into structured training data for robots, enhancing their egocentric understanding and task performance.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16793
• PDF: https://arxiv.org/pdf/2512.16793
• Project Page: https://zgc-embodyai.github.io/PhysBrain/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

📝 Summary:
StageVAR accelerates visual autoregressive models by recognizing that early stages are critical, while later detail-refinement stages can be pruned or approximated. This plug-and-play framework achieves up to 3.4x speedup with minimal quality loss, outperforming existing methods.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16483
• PDF: https://arxiv.org/pdf/2512.16483
• Github: https://github.com/sen-mao/StageVAR

==================================

#ComputerVision #DeepLearning #ModelAcceleration #AI #NeuralNetworks
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing

📝 Summary:
This paper proposes a framework using a semantic-pixel reconstruction objective to adapt encoder features for generation. It creates a compact, semantically rich latent space, leading to state-of-the-art image reconstruction and improved text-to-image generation and editing.

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17909
• PDF: https://arxiv.org/pdf/2512.17909
• Project Page: https://jshilong.github.io/PS-VAE-PAGE/
• Github: https://jshilong.github.io/PS-VAE-PAGE/

==================================

#TextToImage #ImageGeneration #DeepLearning #ComputerVision #AIResearch
GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

📝 Summary:
GroundingME is a new benchmark revealing significant visual grounding gaps in MLLMs, which often hallucinate instead of rejecting ungroundable queries. State-of-the-art models only reach 45.1% accuracy, raising safety concerns. Data-mixture training shows promise in improving their ability to rec...

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17495
• PDF: https://arxiv.org/pdf/2512.17495
• Project Page: https://groundingme.github.io/

==================================

#MLLMs #VisualGrounding #AISafety #AIResearch #Benchmarking
HERBench: A Benchmark for Multi-Evidence Integration in Video Question Answering

📝 Summary:
HERBench is a new VideoQA benchmark designed to test multi-evidence integration across time, revealing significant challenges for current Video-LLMs. It requires models to fuse at least three visual cues from distinct segments, with state-of-the-art models performing poorly due to retrieval and f...

🔹 Publication Date: Published on Dec 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.14870
• PDF: https://arxiv.org/pdf/2512.14870
• Project Page: https://herbench.github.io/
• Github: https://github.com/DanBenAmi/HERBench

Datasets citing this paper:
https://huggingface.co/datasets/DanBenAmi/HERBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges

📝 Summary:
This survey offers a structured guide to Vision-Language-Action (VLA) models in robotics. It breaks down five key challenges: representation, execution, generalization, safety, and datasets, serving as a roadmap for researchers.

🔹 Publication Date: Published on Dec 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.11362
• PDF: https://arxiv.org/pdf/2512.11362
• Project Page: https://suyuz1.github.io/Survery/
• Github: https://suyuz1.github.io/VLA-Survey-Anatomy/

==================================

#VLAModels #Robotics #ArtificialIntelligence #VisionLanguage #AIResearch
RadarGen: Automotive Radar Point Cloud Generation from Cameras

📝 Summary:
RadarGen synthesizes realistic automotive radar point clouds from camera images using diffusion models. It incorporates depth, semantic, and motion cues for physical plausibility, enabling scalable multimodal simulation and improving perception models.

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17897
• PDF: https://arxiv.org/pdf/2512.17897

==================================

#AutomotiveRadar #PointClouds #DiffusionModels #ComputerVision #AutonomousDriving
3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework

📝 Summary:
3D-RE-GEN reconstructs single images into modifiable 3D textured mesh scenes with comprehensive backgrounds. It uses a compositional generative framework and novel optimization for artist-ready, physically realistic layouts, achieving state-of-the-art performance.

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17459
• PDF: https://arxiv.org/pdf/2512.17459
• Project Page: https://3dregen.jdihlmann.com/
• Github: https://github.com/cgtuebingen/3D-RE-GEN

==================================

#3DReconstruction #GenerativeAI #ComputerVision #DeepLearning #ComputerGraphics
Meta-RL Induces Exploration in Language Agents

📝 Summary:
LaMer, a Meta-RL framework, enhances LLM agents' exploration and adaptation in RL tasks. It significantly improves their performance and generalization across diverse environments, demonstrating Meta-RL's effectiveness for robust adaptation in language agents.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16848
• PDF: https://arxiv.org/pdf/2512.16848

==================================

#MetaRL #LLMAgents #ReinforcementLearning #NLP #AI
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

📝 Summary:
This paper introduces LongShOTBench, a diagnostic benchmark for long-form multimodal video understanding with open-ended questions and agentic tool use. It also presents LongShOTAgent, an agentic system for video analysis. Results show state-of-the-art models struggle significantly, highlighting ...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16978
• PDF: https://arxiv.org/pdf/2512.16978
• Project Page: https://mbzuai-oryx.github.io/LongShOT/
• Github: https://github.com/mbzuai-oryx/longshot

Datasets citing this paper:
https://huggingface.co/datasets/MBZUAI/longshot-bench

==================================

#VideoAI #MultimodalAI #AgenticAI #AIbenchmark #AIResearch
MineTheGap: Automatic Mining of Biases in Text-to-Image Models

📝 Summary:
MineTheGap automatically finds prompts that cause Text-to-Image models to generate biased outputs. It uses a genetic algorithm and a novel bias score to identify and rank biases, aiming to reduce redundancy and improve output diversity.
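
As a rough illustration of searching prompts with a genetic algorithm, the sketch below runs a selection-and-mutation loop (the simplest form, with no crossover). The paper's bias score and mutation operators are not described here, so `score` and `mutate` are hypothetical callables supplied by the caller, and the toy versions at the bottom stand in for them.

```python
import random


def evolve_prompts(population, score, mutate, generations=10, keep=4, seed=0):
    """Keep the top-scoring prompts each round and refill with mutated copies."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:keep]  # elitism: the best prompts always survive
        children = [mutate(rng.choice(parents), rng) for _ in range(size - keep)]
        population = parents + children
    return sorted(population, key=score, reverse=True)


# Toy stand-ins (NOT a real bias score): reward prompts containing more "a"s.
def toy_score(prompt):
    return prompt.count("a")


def toy_mutate(prompt, rng):
    return prompt + rng.choice(["a", "b"])


best = evolve_prompts(["a", "b", "ab", "bb", "ba", "aa"], toy_score, toy_mutate, generations=5, keep=2)
print(best[0])
```

Because the top `keep` prompts are carried over unchanged, the best score in the population never decreases from one generation to the next.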

🔹 Publication Date: Published on Dec 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.13427
• PDF: https://arxiv.org/pdf/2512.13427

==================================

#AIbias #TextToImage #GenerativeAI #ResponsibleAI #MachineLearning
🚀 Master Data Science & Programming!

Unlock your potential with this curated list of Telegram channels. Whether you need books, datasets, interview prep, or project ideas, we have the perfect resource for you. Join the community today!


🔰 Machine Learning with Python
Learn Machine Learning with hands-on Python tutorials, real-world code examples, and clear explanations for researchers and developers.
https://news.1rj.ru/str/CodeProgrammer

🔖 Machine Learning
Machine learning insights, practical tutorials, and clear explanations for beginners and aspiring data scientists. Follow the channel for models, algorithms, coding guides, and real-world ML applications.
https://news.1rj.ru/str/DataScienceM

🧠 Code With Python
This channel delivers clear, practical content for developers, covering Python, Django, data structures, and algorithms – perfect for learning, coding, and mastering key programming skills.
https://news.1rj.ru/str/DataScience4

🎯 PyData Careers | Quiz
Python Data Science jobs, interview tips, and career insights for aspiring professionals.
https://news.1rj.ru/str/DataScienceQ

💾 Kaggle Data Hub
Your go-to hub for Kaggle datasets – explore, analyze, and leverage data for Machine Learning and Data Science projects.
https://news.1rj.ru/str/datasets1

🧑‍🎓 Udemy Coupons | Courses
The first channel on Telegram to offer free Udemy coupons.
https://news.1rj.ru/str/DataScienceC

😀 ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.
https://news.1rj.ru/str/DataScienceT

💬 Data Science Chat
An active community group for discussing data challenges and networking with peers.
https://news.1rj.ru/str/DataScience9

🐍 Python Arab| بايثون عربي
The largest Arabic-speaking group for Python developers to share knowledge and help one another.
https://news.1rj.ru/str/PythonArab

🖊 Data Science Jupyter Notebooks
Explore the world of Data Science through Jupyter Notebooks—insights, tutorials, and tools to boost your data journey. Code, analyze, and visualize smarter with every post.
https://news.1rj.ru/str/DataScienceN

📺 Free Online Courses | Videos
Free online courses covering data science, machine learning, analytics, programming, and essential skills for learners.
https://news.1rj.ru/str/DataScienceV

📈 Data Analytics
Dive into the world of Data Analytics – uncover insights, explore trends, and master data-driven decision making.
https://news.1rj.ru/str/DataAnalyticsX

🎧 Learn Python Hub
Master Python with step-by-step courses – from basics to advanced projects and practical applications.
https://news.1rj.ru/str/Python53

⭐️ Research Papers
Professional Academic Writing & Simulation Services
https://news.1rj.ru/str/DataScienceY

━━━━━━━━━━━━━━━━━━
Admin: @HusseinSheikho
Bolmo: Byteifying the Next Generation of Language Models

📝 Summary:
Bolmo introduces competitive byte-level language models by efficiently converting existing subword models. This byteification overcomes subword limitations, matching performance with minimal training. Bolmo makes byte-level LMs practical.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15586
• PDF: https://arxiv.org/pdf/2512.15586

🔹 Models citing this paper:
https://huggingface.co/allenai/Bolmo-7B
https://huggingface.co/allenai/Bolmo-1B

Datasets citing this paper:
https://huggingface.co/datasets/allenai/bolmo_mix

==================================

#LanguageModels #ByteLevelLMs #NLP #DeepLearning #AIResearch
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

📝 Summary:
DataFlow is an LLM-driven framework for unified, high-quality data preparation. It automates pipeline generation from natural language, significantly boosting LLM performance across diverse tasks like math, code, and text. DataFlow ensures reproducible data and provides a scalable foundation for AI.

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16676
• PDF: https://arxiv.org/pdf/2512.16676
• Project Page: https://github.com/OpenDCAI/DataFlow
• Github: https://github.com/OpenDCAI/DataFlow

Datasets citing this paper:
https://huggingface.co/datasets/OpenDCAI/dataflow-demo-code
https://huggingface.co/datasets/OpenDCAI/dataflow-demo-Text2SQL
https://huggingface.co/datasets/OpenDCAI/dataflow-instruct-10k

==================================

#LLM #DataPreparation #DataCentricAI #WorkflowAutomation #AIResearch
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

📝 Summary:
LLMs poorly estimate human cognitive difficulty for educational tasks. Scaling models does not improve alignment with humans; they converge to a machine consensus and fail to simulate student struggles or show introspection.

🔹 Publication Date: Published on Dec 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.18880
• PDF: https://arxiv.org/pdf/2512.18880
• Github: https://github.com/MingLiiii/Difficulty_Alignment

==================================

#LLM #EducationalAI #ItemDifficulty #HumanAIAlignment #AIResearch
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

📝 Summary:
The Prism Hypothesis posits that semantic encoders capture low-frequency meaning, while pixel encoders retain high-frequency details. Unified Autoencoding (UAE) leverages this with a frequency-band modulator to harmonize both into a single latent space. This achieves state-of-the-art performance on imag...

🔹 Publication Date: Published on Dec 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.19693
• PDF: https://arxiv.org/pdf/2512.19693
• Github: https://github.com/WeichenFan/UAE

==================================

#DeepLearning #ComputerVision #Autoencoders #RepresentationLearning #AIResearch