ML Research Hub

✨SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models

📝 Summary:
SecureCode v2.0 is a production-grade dataset of 1215 security-focused coding examples. It trains AI models to generate secure code by providing real-incident examples with vulnerable and secure implementations, attacks, defense, and operational security context across 11 languages, using a conve...

🔹 Publication Date: Published on Dec 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.18542
• PDF: https://arxiv.org/pdf/2512.18542
• Project Page: https://perfecxion.ai/
• Github: https://github.com/scthornton/securecode-v2

==================================

For more data science resources:
✓ https://news.1rj.ru/str/DataScienceT

#Cybersecurity #CodeSecurity #AI #CodeGeneration #Dataset

179 views · 23:23

✨Step-DeepResearch Technical Report

📝 Summary:
Step-DeepResearch is an end-to-end agent for deep research, using a data synthesis strategy and progressive training. It achieves expert-level capabilities, outperforming existing models and rivaling SOTA closed-source models with cost-efficiency. It also introduces ADR-Bench for realistic Chines...

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20491
• PDF: https://arxiv.org/pdf/2512.20491

#AI #MachineLearning #DeepResearch #AIagent #SOTA

120 views · 03:00

✨Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

📝 Summary:
This paper decomposes LLM policies into internal layer and modular policies, revealing distinct reasoning patterns across layers. It finds that early layers explore while top layers refine. Motivated by this, Bottom-up Policy Optimization (BuPO) is proposed to optimize internal layer policies for superior...

🔹 Publication Date: Published on Dec 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.19673
• PDF: https://arxiv.org/pdf/2512.19673

#LLM #PolicyOptimization #DeepLearning #AIResearch #NLP

128 views · 03:00

✨SAM Audio: Segment Anything in Audio

📝 Summary:
SAM Audio is a foundation model for general audio separation. It unifies text, visual, and temporal-span prompts, achieving state-of-the-art performance across diverse audio types. It also introduces a new real-world separation benchmark.

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.18099
• PDF: https://arxiv.org/pdf/2512.18099
• Project Page: https://ai.meta.com/samaudio/
• Github: https://github.com/facebookresearch/sam-audio

🔹 Models citing this paper:
• https://huggingface.co/facebook/sam-audio-large
• https://huggingface.co/facebook/sam-audio-small
• https://huggingface.co/facebook/sam-audio-base

✨ Spaces citing this paper:
• https://huggingface.co/spaces/lpeterl/sam-audio-webui
• https://huggingface.co/spaces/Arrcttacsrks/SAM-Audio-Demo
• https://huggingface.co/spaces/chippie1/SAM-Audio-Demo

#AudioSeparation #FoundationModels #AI #DeepLearning #SAMAudio

107 views · 03:01

✨QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models

📝 Summary:
QuantiPhy is a benchmark that quantitatively assesses state-of-the-art vision perception models' ability to reason about physical properties such as size, velocity, and acceleration from video observa...

🔹 Publication Date: Published on Dec 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.19526
• PDF: https://arxiv.org/pdf/2512.19526

✨ Datasets citing this paper:
• https://huggingface.co/datasets/PaulineLi/QuantiPhy-validation

#AI #DataScience #MachineLearning #HuggingFace #Research

124 views · 03:01

✨GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

📝 Summary:
GLM-4.5, a Mixture-of-Experts large language model with 355B parameters, achieves strong performance across agentic, reasoning, and coding tasks using multi-stage training and reinforcement learning. ...

🔹 Publication Date: Published on Aug 8

🔹 Paper Links:
• ArxivLens Page: https://arxivlens.com/PaperView/Details/glm-4-5-agentic-reasoning-and-coding-arc-foundation-models-126-7b914dd8
• PDF: https://arxiv.org/pdf/2508.06471
• Github: https://github.com/zai-org/GLM-4.5

🔹 Models citing this paper:
• https://huggingface.co/zai-org/GLM-4.5
• https://huggingface.co/zai-org/GLM-4.6
• https://huggingface.co/zai-org/GLM-4.5-Air

✨ Spaces citing this paper:
• https://huggingface.co/spaces/enzostvs/deepsite
• https://huggingface.co/spaces/akhaliq/anycoder
• https://huggingface.co/spaces/hadadxyz/ai

#AI #DataScience #MachineLearning #HuggingFace #Research

150 views · 03:01

✨SpatialTree: How Spatial Abilities Branch Out in MLLMs

📝 Summary:
SpatialTree introduces a 4-level cognitive hierarchy and benchmark for evaluating MLLM spatial abilities. It reveals distinct skill dependencies and strong cross-level transfer from low to high-level abilities. A novel auto-think strategy consistently enhances performance across all spatial levels.

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20617
• PDF: https://arxiv.org/pdf/2512.20617
• Project Page: https://spatialtree.github.io/

#AI #DataScience #MachineLearning #HuggingFace #Research

114 views · 04:01

✨SemanticGen: Video Generation in Semantic Space

📝 Summary:
SemanticGen addresses slow convergence and computational costs in video generation by using a two-stage diffusion model approach that first generates semantic features and then VAE latents, leading to...

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20619
• PDF: https://arxiv.org/pdf/2512.20619
• Project Page: https://jianhongbai.github.io/SemanticGen/
• Github: https://jianhongbai.github.io/SemanticGen/

#AI #DataScience #MachineLearning #HuggingFace #Research

138 views · 04:01

✨Reinforcement Learning for Self-Improving Agent with Skill Library

📝 Summary:
A novel RL framework, SAGE, enhances LLM-based agents' self-improvement capabilities by systematically incorporating skills from a skill library, leading to better performance and efficiency in new en...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17102
• PDF: https://arxiv.org/pdf/2512.17102

#AI #DataScience #MachineLearning #HuggingFace #Research

154 views · 04:01

✨Active Intelligence in Video Avatars via Closed-loop World Modeling

📝 Summary:
Video avatars currently lack agency for autonomous goal pursuit. ORCA introduces a framework for active intelligence, using a closed-loop Observe-Think-Act-Reflect cycle and a dual-system architecture for strategic reasoning and action. It enables robust, goal-directed task completion, transformi...

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20615
• PDF: https://arxiv.org/pdf/2512.20615
• Project Page: https://xuanhuahe.github.io/ORCA/
• Github: https://xuanhuahe.github.io/ORCA/

#AI #DataScience #MachineLearning #HuggingFace #Research

157 views · 05:02

✨FaithLens: Detecting and Explaining Faithfulness Hallucination

📝 Summary:
FaithLens is a cost-efficient model for detecting and explaining faithfulness hallucinations in LLM outputs. It uses synthesized training data and rule-based reinforcement learning. FaithLens outperforms advanced models like GPT-4.1 on 12 tasks while providing high-quality explanations.

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20182
• PDF: https://arxiv.org/pdf/2512.20182
• Github: https://github.com/S1s-Z/FaithLens

#AI #DataScience #MachineLearning #HuggingFace #Research

127 views · 06:02

✨Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

📝 Summary:
A multi-perspective validation framework using LLMs for thematic analysis combines ensemble validation with Cohen's Kappa and cosine similarity to enhance reliability and extract consensus themes from...
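
As a rough illustration of the two reliability metrics named in the summary, the sketch below computes Cohen's Kappa between two coders' theme labels and the cosine similarity between two vectors. The function names, the toy labels, and the use of plain Python lists as embeddings are assumptions made for illustration; none of it is taken from the paper or its tool.

```python
import math
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items both coders labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical theme labels assigned by two LLM coders to four excerpts.
coder_1 = ["trust", "trust", "cost", "cost"]
coder_2 = ["trust", "cost", "cost", "cost"]
kappa = cohens_kappa(coder_1, coder_2)

# Hypothetical theme embeddings; a real pipeline would get these from a model.
similarity = cosine_similarity([1.0, 0.0], [1.0, 0.0])
```

Kappa corrects raw agreement for the agreement expected by chance, while cosine similarity compares the direction of two vectors regardless of their length, which is why the two measures complement each other.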

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20352
• PDF: https://arxiv.org/pdf/2512.20352
• Project Page: https://azalab-llm-tool.vercel.app/
• Github: https://github.com/NileshArnaiya/LLM-Thematic-Analysis-Tool

#AI #DataScience #MachineLearning #HuggingFace #Research

159 views · 06:02

✨INTELLECT-3: Technical Report

📝 Summary:
INTELLECT-3, a large Mixture-of-Experts model trained with reinforcement learning, achieves top performance across various benchmarks and is supported by an open-source RL infrastructure framework. AI...

🔹 Publication Date: Published on Dec 18

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.16144
• PDF: https://arxiv.org/pdf/2512.16144

#AI #DataScience #MachineLearning #HuggingFace #Research

204 views · 06:02

✨MemEvolve: Meta-Evolution of Agent Memory Systems

📝 Summary:
MemEvolve, a meta-evolutionary framework, enhances self-evolving memory systems by jointly evolving agents' experiential knowledge and memory architecture, leading to improved performance and generali...

🔹 Publication Date: Published on Dec 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.18746
• PDF: https://arxiv.org/pdf/2512.18746

#AI #DataScience #MachineLearning #HuggingFace #Research

221 views · 07:02

✨LongVideoAgent: Multi-Agent Reasoning with Long Videos

📝 Summary:
A multi-agent framework with a master LLM, grounding agent, and vision agent enhances long-video QA by improving temporal grounding and extracting visual details. This RL-trained system outperforms non-agent baselines on new datasets.

🔹 Publication Date: Published on Dec 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.20618
• PDF: https://arxiv.org/pdf/2512.20618
• Github: https://longvideoagent.github.io/

#MultiAgentSystems #LLM #VideoUnderstanding #ComputerVision #AI

208 views · 09:03

✨Toxicity Ahead: Forecasting Conversational Derailment on GitHub

📝 Summary:
A novel LLM framework uses a two-step prompting pipeline to predict conversational derailment on GitHub. It generates Summaries of Conversation Dynamics to forecast toxicity, achieving high F1-scores and outperforming baselines for proactive moderation.

🔹 Publication Date: Published on Dec 17

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.15031
• PDF: https://arxiv.org/pdf/2512.15031

#LLM #ToxicityDetection #ContentModeration #GitHub #MachineLearning

199 views · 10:03

✨Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems

📝 Summary:
Simulstream is an open-source toolkit for evaluating and demonstrating streaming speech-to-text translation. It supports long-form audio, incremental decoding, and re-translation, plus offers an interactive demo interface.

🔹 Publication Date: Published on Dec 19

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.17648
• PDF: https://arxiv.org/pdf/2512.17648
• Project Page: https://pypi.org/project/simulstream/

#SpeechToText #MachineTranslation #NLP #OpenSource #StreamingAI

198 views · 11:03

✨Scaling Laws for Code: Every Programming Language Matters

📝 Summary:
This paper explores scaling laws for multilingual code pre-training, finding interpreted languages benefit more from scaling. It proposes an optimal token allocation strategy for programming languages based on utility and synergy, outperforming uniform distribution.

🔹 Publication Date: Published on Dec 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.13472
• PDF: https://arxiv.org/pdf/2512.13472

#CodeAI #MachineLearning #ProgrammingLanguages #ScalingLaws #LLMs

142 views · 14:04

✨FlipVQA-Miner: Cross-Page Visual Question-Answer Mining from Textbooks

📝 Summary:
FlipVQA-Miner automates high-quality QA and VQA extraction from textbooks. It combines layout-aware OCR with LLM-based semantic parsing. This provides accurate, real-world data for LLM training, avoiding synthetic samples and improving reasoning.

🔹 Publication Date: Published on Nov 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.16216
• PDF: https://arxiv.org/pdf/2511.16216
• Github: https://github.com/OpenDCAI/DataFlow

#VQA #LLM #OCR #DataExtraction #AIResearch

152 views · 14:04
