ML Research Hub – Telegram
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
🔹 Title: Spotlight on Token Perception for Multimodal Reinforcement Learning

🔹 Publication Date: Published on Oct 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.09285
• PDF: https://arxiv.org/pdf/2510.09285
• Project Page: https://huggingface.co/collections/chamber111/vppo-data-68e7aaafe1bffbab844d341b
• Github: https://github.com/huaixuheqing/VPPO-RL

🔹 Datasets citing this paper:
https://huggingface.co/datasets/chamber111/VPPO-Eval
https://huggingface.co/datasets/chamber111/VPPO_ViRL39K_train
https://huggingface.co/datasets/chamber111/VPPO_MMK12_validation

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT
🔹 Title: ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

🔹 Publication Date: Published on Oct 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.11652
• PDF: https://arxiv.org/pdf/2510.11652

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

🔹 Publication Date: Published on Oct 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.11341
• PDF: https://arxiv.org/pdf/2510.11341
• Project Page: https://hmwang2002.github.io/release/internnoscript/
• Github: https://github.com/hmwang2002/InternSVG

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation

🔹 Publication Date: Published on Oct 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.07624
• PDF: https://arxiv.org/pdf/2510.07624
• Github: https://github.com/abenechehab/nll_to_po

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

🔹 Publication Date: Published on Oct 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.11052
• PDF: https://arxiv.org/pdf/2510.11052

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs

🔹 Publication Date: Published on Oct 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.09871
• PDF: https://arxiv.org/pdf/2510.09871
• Project Page: https://github.com/nafisenik/CoBia
• Github: https://github.com/nafisenik/CoBia

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Through the Perspective of LiDAR: A Feature-Enriched and Uncertainty-Aware Annotation Pipeline for Terrestrial Point Cloud Segmentation

🔹 Publication Date: Published on Oct 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.06582
• PDF: https://arxiv.org/pdf/2510.06582
• Project Page: https://fz-rit.github.io/through-the-lidars-eye/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: The Curious Case of Factual (Mis)Alignment between LLMs' Short- and Long-Form Answers

🔹 Publication Date: Published on Oct 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.11218
• PDF: https://arxiv.org/pdf/2510.11218
• Github: https://github.com/WorldHellow/SLAQ

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models

🔹 Publication Date: Published on Oct 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.10606
• PDF: https://arxiv.org/pdf/2510.10606
• Github: https://github.com/dvlab-research/ViSurf

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: SwarmSys: Decentralized Swarm-Inspired Agents for Scalable and Adaptive Reasoning

🔹 Publication Date: Published on Oct 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.10047
• PDF: https://arxiv.org/pdf/2510.10047

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: World-To-Image: Grounding Text-to-Image Generation with Agent-Driven World Knowledge

🔹 Publication Date: Published on Oct 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.04201
• PDF: https://arxiv.org/pdf/2510.04201
• Github: https://github.com/mhson-kyle/World-To-Image

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model

🔹 Publication Date: Published on Oct 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.11496
• PDF: https://arxiv.org/pdf/2510.11496

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

🔹 Publication Date: Published on Oct 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.01427
• PDF: https://arxiv.org/pdf/2510.01427
• Github: https://github.com/LongfeiYun17/falconer

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Multimodal Policy Internalization for Conversational Agents

🔹 Publication Date: Published on Oct 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.09474
• PDF: https://arxiv.org/pdf/2510.09474
• Project Page: https://mikewangwzhl.github.io/TriMPI/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections

🔹 Publication Date: Published on Oct 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.09023
• PDF: https://arxiv.org/pdf/2510.09023

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: MultiCOIN: Multi-Modal COntrollable Video INbetweening

🔹 Publication Date: Published on Oct 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.08561
• PDF: https://arxiv.org/pdf/2510.08561

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: oMeBench: Towards Robust Benchmarking of LLMs in Organic Mechanism Elucidation and Reasoning

🔹 Publication Date: Published on Oct 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.07731
• PDF: https://arxiv.org/pdf/2510.07731
• Github: https://github.com/skylarkie/oMeBench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: VLM-Guided Adaptive Negative Prompting for Creative Generation

🔹 Publication Date: Published on Oct 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.10715
• PDF: https://arxiv.org/pdf/2510.10715
• Project Page: https://shelley-golan.github.io/VLM-Guided-Creative-Generation/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🤖🧠 Thinking with Camera 2.0: A Powerful Multimodal Model for Camera-Centric Understanding and Generation

🗓️ 14 Oct 2025
📚 AI News & Trends

In the rapidly evolving field of multimodal AI, bridging the gaps between vision, language, and geometry is one of the frontier challenges. Traditional vision-language models excel at describing what is in an image (“a cat on a sofa”, “a red car on the road”) but struggle to reason about how the image was captured: the camera’s ...

#MultimodalAI #CameraCentricUnderstanding #VisionLanguageModels #AIResearch #ComputerVision #GenerativeModels
🤖🧠 Granite-Speech-3.3-8B: IBM’s Next-Gen Speech-Language Model for Enterprise AI

🗓️ 14 Oct 2025
📚 AI News & Trends

In the fast-growing field of speech and language AI, IBM continues to make strides with its Granite model family, a suite of open, enterprise-grade AI models that combine accuracy, safety, and efficiency. The latest addition to this ecosystem, Granite-Speech-3.3-8B, marks a significant milestone in automatic speech recognition (ASR) and automatic speech translation (AST) technology. Released ...

#SpeechAI #LanguageModel #EnterpriseAI #ASR #SpeechTranslation #GraniteModel
🤖🧠 LLaMAX2 by Nanjing University, HKU, CMU & Shanghai AI Lab: A Breakthrough in Translation-Enhanced Reasoning Models

🗓️ 14 Oct 2025
📚 AI News & Trends

The world of large language models (LLMs) has evolved rapidly, producing advanced systems capable of reasoning, problem-solving, and creative text generation. However, a persistent challenge has been balancing translation quality with reasoning ability. Most translation-enhanced models excel in linguistic diversity but falter in logical reasoning or coding tasks. Addressing this crucial gap, the research paper ...

#LLaMAX2 #TranslationEnhanced #ReasoningModels #LargeLanguageModels #NanjingUniversity #HKU