ML Research Hub
32.7K subscribers
3.99K photos
226 videos
23 files
4.29K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync

Paper: https://arxiv.org/pdf/2412.09262v1.pdf

Code: https://github.com/bytedance/LatentSync

https://news.1rj.ru/str/DataScienceT 💎
👍4
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation

Paper: https://arxiv.org/pdf/2409.13731v3.pdf

Code: https://github.com/openspg/kag

Datasets: 2WikiMultiHopQA

https://news.1rj.ru/str/DataScienceT 🎁
👍4
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization

Paper: https://arxiv.org/pdf/2410.06244v1.pdf

Code: https://github.com/jwmao1/story-adapter

https://news.1rj.ru/str/DataScienceT 📊
👍3
Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations

Paper: https://arxiv.org/pdf/2408.15232v2.pdf

Code: https://github.com/stanford-oval/storm

https://news.1rj.ru/str/DataScienceT ⚠️
👍4
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Paper: https://arxiv.org/pdf/2501.03218v1.pdf

Code: https://github.com/mark12ding/dispider

Dataset: Video-MME

https://news.1rj.ru/str/DataScienceT 🎙
👍3
AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans

Paper: https://arxiv.org/pdf/2407.02418v2.pdf

Code: https://github.com/GabrieleLozupone/AXIAL

Dataset: ADNI

https://news.1rj.ru/str/DataScienceT 🧠
3👍3
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding

🖥 Github: https://github.com/opengvlab/piip

📕 Paper: https://arxiv.org/abs/2501.07783v1

⭐️ Dataset: https://paperswithcode.com/dataset/gqa

https://news.1rj.ru/str/DataScienceT 🧠
👍41
FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

Paper: https://arxiv.org/pdf/2501.08225v1.pdf

Code: https://github.com/ybybzhang/framepainter

https://news.1rj.ru/str/DataScienceT ✈️
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

Paper: https://arxiv.org/pdf/2407.15811v1.pdf

Code: https://github.com/sonyresearch/micro_diffusion

Datasets: MS COCO

https://news.1rj.ru/str/DataScienceT 🧠
1
MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation

Paper: https://arxiv.org/pdf/2501.06713v2.pdf

Code: https://github.com/hkuds/minirag

https://news.1rj.ru/str/DataScienceT 🧠
3👍2
Continual Forgetting for Pre-trained Vision Models (CVPR 2024)

🖥 Github: https://github.com/bjzhb666/GS-LoRA

📕 Paper: https://arxiv.org/abs/2501.09705v1

🧠 Dataset: https://paperswithcode.com/dataset/coco

https://news.1rj.ru/str/DataScienceT 🧠
1👍1
UnCommon Objects in 3D

We introduce Uncommon Objects in 3D (uCO3D), a new object-centric dataset for 3D deep learning and 3D generative AI. uCO3D is the largest publicly available collection of high-resolution videos of objects with 3D annotations that ensures full 360° coverage. uCO3D is significantly more diverse than MVImgNet and CO3Dv2, covering more than 1,000 object categories. It is also of higher quality, thanks to extensive quality checks of both the collected videos and the 3D annotations. Like comparable datasets, uCO3D contains annotations for 3D camera poses, depth maps, and sparse point clouds. In addition, each object is equipped with a caption and a 3D Gaussian Splat reconstruction. We train several large 3D models on MVImgNet, CO3Dv2, and uCO3D, and obtain superior results with uCO3D, showing that it is the better choice for learning applications.

Paper: https://arxiv.org/pdf/2501.07574v1.pdf

Code: https://github.com/facebookresearch/uco3d

Dataset: MS COCO

https://news.1rj.ru/str/DataScienceT 🐻‍❄️
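
To make the annotation list above concrete, here is a minimal PyTorch-style sketch of a per-object record carrying the fields the abstract names (video frames, depth maps, camera poses, sparse point clouds, captions, categories). All class and field names are hypothetical illustrations, not the official API of the facebookresearch/uco3d repository.

```python
# Hypothetical sketch only: field names and the loader class are illustrative,
# not taken from the official uco3d codebase.
from dataclasses import dataclass

import torch
from torch.utils.data import Dataset


@dataclass
class ObjectFrame:
    image: torch.Tensor        # (3, H, W) RGB frame from the object video
    depth: torch.Tensor        # (H, W) depth map aligned with the frame
    camera_pose: torch.Tensor  # (4, 4) camera extrinsics for the frame
    points: torch.Tensor       # (N, 3) sparse point cloud of the object
    caption: str               # free-text caption attached to the object
    category: str              # one of the 1,000+ object categories


class UCO3DStyleDataset(Dataset):
    """Wraps pre-decoded ObjectFrame records (video decoding assumed done elsewhere)."""

    def __init__(self, records):
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        rec = self.records[idx]
        return rec.image, rec.depth, rec.camera_pose, rec.caption
```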
3👍1
The GAN is dead; long live the GAN! A Modern GAN Baseline

There is a widespread claim that GANs are difficult to train, and GAN architectures in the literature are littered with empirical tricks. We provide evidence against this claim and build a modern GAN baseline in a more principled manner. First, we derive a well-behaved regularized relativistic GAN loss that addresses the mode-dropping and non-convergence issues previously tackled with a bag of ad hoc tricks. We analyze the loss mathematically and prove that it admits local convergence guarantees, unlike most existing relativistic losses. Second, the new loss allows us to discard all ad hoc tricks and replace the outdated backbones used in common GANs with modern architectures. Using StyleGAN2 as an example, we present a roadmap of simplification and modernization that results in a new minimalist baseline, R3GAN. Despite its simplicity, our approach surpasses StyleGAN2 on FFHQ, ImageNet, CIFAR, and Stacked MNIST, and compares favorably against state-of-the-art GANs and diffusion models.

Paper: https://arxiv.org/pdf/2501.05441v1.pdf

Code: https://github.com/brownvc/r3gan

Dataset: CIFAR-10

https://news.1rj.ru/str/DataScienceT 😵‍💫
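
As a rough illustration of the "regularized relativistic GAN loss" the abstract refers to, below is a minimal PyTorch sketch of a relativistic pairing loss combined with R1/R2-style gradient penalties. The function names, the softplus formulation, and the penalty weighting are assumptions made for illustration and are not copied from the R3GAN code.

```python
# Illustrative sketch, not the official R3GAN implementation.
import torch
import torch.nn.functional as F


def relativistic_d_loss(D, real, fake, gamma=10.0):
    """Discriminator loss: relativistic pairing term plus R1/R2 gradient penalties."""
    real = real.detach().requires_grad_(True)
    fake = fake.detach().requires_grad_(True)
    d_real, d_fake = D(real), D(fake)

    # Relativistic pairing: train D on the difference of critic scores.
    pair_loss = F.softplus(-(d_real - d_fake)).mean()

    # R1 penalizes the gradient norm on real samples; R2 does the same on fakes.
    grad_real, = torch.autograd.grad(d_real.sum(), real, create_graph=True)
    grad_fake, = torch.autograd.grad(d_fake.sum(), fake, create_graph=True)
    r1 = grad_real.pow(2).flatten(1).sum(dim=1).mean()
    r2 = grad_fake.pow(2).flatten(1).sum(dim=1).mean()

    return pair_loss + 0.5 * gamma * (r1 + r2)


def relativistic_g_loss(D, real, fake):
    """Generator loss: the same pairing term with the roles of real and fake swapped."""
    return F.softplus(-(D(fake) - D(real))).mean()
```

The gamma weight here is a hypothetical knob that would be tuned per dataset; the point of the sketch is only to show how a pairing-based loss and the two gradient penalties combine into a single objective.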
4👍2