ML Research Hub
32.7K subscribers
4K photos
228 videos
23 files
4.31K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Streaming Video Instruction Tuning

📝 Summary:
We present Streamo, a real-time streaming video LLM that serves as a general-purpose interactive assistant. Unlike existing online video models that focus narrowly on question answering or captioning,...
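
🔹 Illustrative sketch (Python, not from the paper): a toy loop showing what "streaming" inference means in this setting, where frames are ingested as they arrive, a rolling context is maintained, and the user can query the assistant mid-stream. The class and method names here are hypothetical placeholders, not Streamo's API.

# Toy streaming video assistant loop (hypothetical; not Streamo's actual interface).
from collections import deque

class ToyStreamingAssistant:
    def __init__(self, max_context_frames=64):
        # Rolling buffer of the most recent frame features.
        self.context = deque(maxlen=max_context_frames)

    def process_frame(self, frame_features):
        # In a real system this would be a vision encoder's output for the new frame.
        self.context.append(frame_features)

    def answer(self, question):
        # Placeholder "model": a real streaming video LLM would attend over the
        # buffered frame context and decode a response token by token.
        return f"(toy) answering {question!r} with {len(self.context)} buffered frames"

assistant = ToyStreamingAssistant()
for t in range(100):                      # frames arriving in real time
    assistant.process_frame({"t": t})
    if t == 50:                           # user interrupts mid-stream with a question
        print(assistant.answer("What just happened?"))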

🔹 Publication Date: Published on Dec 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21334
• PDF: https://arxiv.org/pdf/2512.21334

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics

📝 Summary:
The rapid proliferation of Large Language Models (LLMs) and diverse specialized benchmarks necessitates a shift from fragmented, task-specific metrics to a holistic, competitive ranking system that ef...
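
🔹 Illustrative sketch (Python, not the paper's exact algorithm): a generic Swiss-system loop in which models with similar running scores are repeatedly paired, each pairing is decided head-to-head (here by average benchmark score), and the accumulated points give a single ranking. The toy benchmark table and pairing rule are assumptions for illustration only.

# Generic Swiss-system ranking sketch (illustrative; not the paper's method).
import random

def swiss_rank(models, head_to_head, rounds=5, seed=0):
    # head_to_head(a, b) returns the winner's name, or None for a draw.
    rng = random.Random(seed)
    scores = {m: 0.0 for m in models}
    for _ in range(rounds):
        # Pair models with similar current scores (random jitter breaks ties).
        order = sorted(models, key=lambda m: (-scores[m], rng.random()))
        for a, b in zip(order[0::2], order[1::2]):
            winner = head_to_head(a, b)
            if winner is None:
                scores[a] += 0.5
                scores[b] += 0.5
            else:
                scores[winner] += 1.0
    return sorted(models, key=lambda m: -scores[m]), scores

# Toy head-to-head rule: higher mean score across shared benchmarks wins.
bench = {"model_A": [0.7, 0.6], "model_B": [0.8, 0.5], "model_C": [0.9, 0.9], "model_D": [0.4, 0.5]}
def h2h(a, b):
    sa, sb = sum(bench[a]) / len(bench[a]), sum(bench[b]) / len(bench[b])
    return None if sa == sb else (a if sa > sb else b)

ranking, scores = swiss_rank(list(bench), h2h)
print(ranking)
print(scores)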

🔹 Publication Date: Published on Dec 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21010
• PDF: https://arxiv.org/pdf/2512.21010

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

📝 Summary:
TurboDiffusion significantly accelerates video generation by 100-200x while maintaining quality. It achieves this speedup through attention acceleration, step distillation, and W8A8 quantization. Experiments confirm the substantial speedup on a single GPU.
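
🔹 Illustrative sketch (Python/NumPy, not TurboDiffusion's kernels): the W8A8 component can be pictured as symmetric int8 quantization of both weights and activations with per-tensor scales, an integer matmul, and a rescale back to float. The shapes and scaling scheme below are generic assumptions chosen only to show the idea.

# Toy symmetric W8A8 quantization (illustrative; not the paper's actual kernels).
import numpy as np

def quantize_int8(x):
    scale = np.abs(x).max() / 127.0 + 1e-12          # per-tensor symmetric scale
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)   # "weights"
A = rng.standard_normal((8, 64)).astype(np.float32)    # "activations"

Wq, w_scale = quantize_int8(W)
Aq, a_scale = quantize_int8(A)

# Integer matmul accumulated in int32, then rescaled back to float.
Y_int = Aq.astype(np.int32) @ Wq.astype(np.int32).T
Y_approx = Y_int.astype(np.float32) * (a_scale * w_scale)
Y_ref = A @ W.T
print("max abs error vs. fp32:", float(np.abs(Y_approx - Y_ref).max()))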

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16093
• PDF: https://jt-zhang.github.io/files/TurboDiffusion_Technical_Report.pdf
• Github: https://github.com/thu-ml/TurboDiffusion

🔹 Models citing this paper:
https://huggingface.co/TurboDiffusion/TurboWan2.2-I2V-A14B-720P
https://huggingface.co/TurboDiffusion/TurboWan2.1-T2V-1.3B-480P
https://huggingface.co/TurboDiffusion/TurboWan2.1-T2V-14B-720P

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming

📝 Summary:
High-resolution video generation, while crucial for digital media and film, is computationally bottlenecked by the quadratic complexity of diffusion models, making practical inference infeasible. To a...
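
🔹 Illustrative sketch (Python, illustrative numbers only): the quadratic bottleneck mentioned above is easy to quantify, since full self-attention compares every spatio-temporal token with every other one, so doubling the spatial resolution (4x the tokens) multiplies attention cost by roughly 16x. The token counts below are assumptions for the arithmetic, not figures from the paper.

# Back-of-the-envelope attention cost vs. resolution (assumed token counts).
def attention_pairs(frames, tokens_per_frame):
    n = frames * tokens_per_frame
    return n * n                     # self-attention scales with the square of token count

base = attention_pairs(frames=16, tokens_per_frame=1024)    # e.g. a 32x32 latent grid per frame
hires = attention_pairs(frames=16, tokens_per_frame=4096)   # doubling H and W -> 4x tokens per frame

print(f"high-resolution attention cost is about {hires / base:.0f}x the base cost")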

🔹 Publication Date: Published on Dec 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21338
• PDF: https://arxiv.org/pdf/2512.21338
• Project Page: http://haonanqiu.com/projects/HiStream.html
• Github: https://github.com/arthur-qiu/HiStream

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

📝 Summary:
VLMs exhibit a significant popularity bias, performing better on famous items via memorization rather than general understanding. We introduce YearGuessr, a large multi-modal dataset and benchmark, confirming VLMs struggle with unrecognized subjects.
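
🔹 Illustrative sketch (Python, hypothetical records): the kind of popularity-split analysis such a benchmark enables, computing the ordinal error (here, mean absolute error in years) separately for popular and obscure items. The example rows and the is_popular flag are made up for illustration; they are not YearGuessr's actual schema or results.

# Popularity-split ordinal error (illustrative data; not the real dataset schema).
from statistics import mean

predictions = [
    # (true_year, predicted_year, is_popular)
    (1889, 1890, True), (1931, 1930, True), (1973, 1975, True),
    (1889, 1921, False), (1931, 1902, False), (1973, 1950, False),
]

def mae(rows):
    return mean(abs(true - pred) for true, pred, _ in rows)

popular = [r for r in predictions if r[2]]
obscure = [r for r in predictions if not r[2]]
print("MAE on popular items:", mae(popular))   # low error may reflect memorization
print("MAE on obscure items:", mae(obscure))   # higher error suggests weaker genuine estimation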

🔹 Publication Date: Published on Dec 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21337
• PDF: https://arxiv.org/pdf/2512.21337
• Project Page: https://sytwu.github.io/BeyondMemo/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations

📝 Summary:
Recent advances in pretraining general foundation models have significantly improved performance across diverse downstream tasks. While autoregressive (AR) generative models like GPT have revolutioniz...
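
🔹 Illustrative sketch (PyTorch, not the paper's architecture): the idea named in the title, learning representations by predicting the next frame autoregressively, reduced to a toy objective in which a frame encoder feeds a causal temporal model that regresses the next frame's embedding; the encoder is what gets reused downstream. All layer choices and sizes here are arbitrary assumptions.

# Toy next-frame-prediction pretraining objective (illustrative only).
import torch
import torch.nn as nn

class ToyNextFramePredictor(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.encoder = nn.Linear(3 * 16 * 16, dim)           # stand-in for a real frame encoder
        self.temporal = nn.GRU(dim, dim, batch_first=True)   # causal model over past frames
        self.head = nn.Linear(dim, dim)                      # predicts the next frame's embedding

    def forward(self, frames):                 # frames: (B, T, 3*16*16) flattened toy frames
        z = self.encoder(frames)               # (B, T, dim)
        h, _ = self.temporal(z)                # summary of frames up to each step t
        pred_next = self.head(h[:, :-1])       # predictions for steps 1..T-1
        target = z[:, 1:].detach()             # next-frame embeddings as regression targets
        return nn.functional.mse_loss(pred_next, target)

model = ToyNextFramePredictor()
loss = model(torch.randn(4, 8, 3 * 16 * 16))   # 4 clips, 8 frames each
loss.backward()
print(float(loss))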

🔹 Publication Date: Published on Dec 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21004
• PDF: https://arxiv.org/pdf/2512.21004
• Github: https://github.com/Singularity0104/NExT-Vid

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior

📝 Summary:
Tokenizers provide the fundamental basis through which text is represented and processed by language models (LMs). Despite the importance of tokenization, its role in LM performance and behavior is po...
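
🔹 Illustrative sketch (Python, Hugging Face transformers): one basic measurement such a suite enables, tokenizing identical text with different tokenizers and comparing token counts. The model names below are common public examples chosen for illustration; they are not necessarily the tokenizers studied in the paper.

# Comparing tokenizers on identical text (model names are examples only).
from transformers import AutoTokenizer

text = "Tokenization choices quietly shape what a language model can represent."
for name in ["gpt2", "bert-base-uncased", "t5-small"]:
    tok = AutoTokenizer.from_pretrained(name)
    ids = tok.encode(text, add_special_tokens=False)
    print(f"{name:20s} {len(ids):3d} tokens  first tokens: {tok.convert_ids_to_tokens(ids)[:5]}")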

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20757
• PDF: https://arxiv.org/pdf/2512.20757
• Github: https://github.com/r-three/Tokenizers

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation

📝 Summary:
DreaMontage is a framework for generating seamless, expressive, long-duration one-shot videos from diverse inputs. It integrates an intermediate-conditioning DiT, a tailored DPO for smoothness, and a segment-wise auto-regressive inference strategy for long sequences.
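
🔹 Illustrative sketch (Python, toy pseudologic): the segment-wise auto-regressive idea in its simplest form, where each new segment is generated conditioned on the tail frames of the previous one so the long video stays continuous. The functions below are placeholders, not the paper's sampler or conditioning scheme.

# Toy segment-wise autoregressive generation loop (placeholders, not the real pipeline).
def generate_segment(condition_frames, length=16):
    # Stand-in for a DiT/diffusion sampler conditioned on previous tail frames.
    start = condition_frames[-1] + 1 if condition_frames else 0
    return list(range(start, start + length))      # "frames" are just indices here

def generate_long_video(num_segments=4, overlap=4):
    video, tail = [], []
    for _ in range(num_segments):
        segment = generate_segment(tail)           # condition on the previous segment's tail
        video.extend(segment)
        tail = segment[-overlap:]                  # carry the last few frames forward
    return video

print(len(generate_long_video()), "frames generated as one continuous toy sequence")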

🔹 Publication Date: Published on Dec 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.21252
• PDF: https://arxiv.org/pdf/2512.21252
• Project Page: https://dreamontage.github.io/DreaMontage/

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
🔥 NEW YEAR 2026 – PREMIUM

Nature papers: $400

Q1 and Q2 papers: $300

Q3 and Q4 papers: $200

Doctoral thesis (complete): $500

M.S. thesis: $300

Paper simulation: $150

Contact me: @Omidyzd62
Multi-hop Reasoning via Early Knowledge Alignment

📝 Summary:
Early Knowledge Alignment (EKA) improves iterative RAG by aligning LLMs with relevant knowledge before planning. This enhances retrieval, reduces errors, and boosts both performance and efficiency.
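
🔹 Illustrative sketch (Python, generic iterative-RAG pseudologic): the control flow described above, with a knowledge-alignment step before any planning, then an iterative retrieve-and-plan loop. All function names and the toy corpus are placeholders, not the paper's implementation.

# Generic iterative RAG with an up-front knowledge-alignment step (illustrative only).
def multi_hop_answer(question, retrieve, align, plan, answer, max_hops=3):
    # retrieve/align/plan/answer stand in for retriever and LLM calls.
    knowledge = align(question, retrieve(question))   # align with relevant knowledge BEFORE planning
    context = []
    for _ in range(max_hops):
        step = plan(question, knowledge, context)     # next sub-question, or None if done
        if step is None:
            break
        context.append((step, retrieve(step)))
    return answer(question, knowledge, context)

# Toy stand-ins so the sketch runs end to end.
docs = {"which river runs through paris": "the Seine", "capital of france": "Paris"}
retrieve = lambda q: docs.get(q.lower(), "")
align = lambda q, d: d
plan = lambda q, k, ctx: None if ctx else "which river runs through paris"
answer = lambda q, k, ctx: f"toy answer built from knowledge={k!r} and hops={ctx!r}"

print(multi_hop_answer("Which river runs through the capital of France?", retrieve, align, plan, answer))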

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20144
• PDF: https://arxiv.org/pdf/2512.20144

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#MultiHopReasoning #LLM #RAG #KnowledgeAlignment #AI
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

📝 Summary:
SWE-EVO is a new benchmark for AI coding agents that evaluates them on long-horizon, multi-step software evolution tasks spanning many files. It reveals a significant gap in current models' abilities, with even top models achieving a resolution rate of only 21 percent. This highlights their struggle with sust...
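
🔹 Illustrative sketch (Python, Hugging Face datasets): one way to start inspecting the accompanying dataset linked below. The split name is an assumption and the column layout is whatever the dataset actually ships; nothing here reflects the benchmark's documented schema.

# Peek at the benchmark data (split name is an assumption, not documented).
from datasets import load_dataset

ds = load_dataset("Fsoft-AIC/SWE-EVO", split="test")   # split may differ; check the dataset card
print(ds)                                              # prints the real columns and row count
print(ds[0])                                           # inspect one task instance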

🔹 Publication Date: Published on Dec 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.18470
• PDF: https://arxiv.org/pdf/2512.18470

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Fsoft-AIC/SWE-EVO

==================================

For more data science resources:
https://news.1rj.ru/str/DataScienceT

#AICoding #SoftwareEvolution #Benchmarking #LLMs #AIResearch