RIML Lab – Telegram
RIML Lab
Robust and Interpretable Machine Learning Lab,
Prof. Mohammad Hossein Rohban,
Sharif University of Technology

https://youtube.com/@rimllab

twitter.com/MhRohban

https://www.linkedin.com/company/robust-and-interpretable-machine-learning-lab/
🩻 Medical Imaging Journal Club

Join us this week as we explore advances in anomaly detection using diffusion models, with a focus on their application to real-world medical imaging data. We’ll examine a novel paper that leverages weakly supervised learning and DDIMs (Denoising Diffusion Implicit Models) for generating detailed and reliable anomaly maps — without requiring pixel-level annotations.

This Week’s Presentation:

🔹 Title: Diffusion Models for Medical Anomaly Detection

🔸 Presenter: Mobina Poulaei

🌀 Abstract:
This paper presents a novel, weakly supervised method for medical anomaly detection based on denoising diffusion implicit models. Unlike conventional GANs or autoencoders, the proposed framework preserves fine image details while performing image-to-image translation from pathological to healthy domains. It utilizes a deterministic noise-encoding scheme along with classifier guidance to reconstruct healthy-looking versions of medical scans. The resulting pixel-wise anomaly maps, derived from comparing original and reconstructed images, demonstrate precise localization of pathological regions.
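The final step described in the abstract, deriving a pixel-wise anomaly map by comparing the original scan with its healthy reconstruction, can be sketched in a few lines. This is a minimal illustration assuming a simple absolute difference; the paper's actual comparison may be more elaborate, and the function name is ours, not the authors':

```python
import numpy as np

def anomaly_map(original, reconstruction):
    """Pixel-wise anomaly map: absolute difference between the input
    image and its 'healthy' reconstruction. High values mark regions
    the model had to change, i.e. candidate pathology."""
    diff = np.abs(original.astype(np.float64) - reconstruction.astype(np.float64))
    if diff.ndim == 3:          # average over channels if present
        diff = diff.mean(axis=-1)
    return diff

# Toy example: the 'lesion' is wherever the two images disagree.
orig = np.zeros((8, 8))
orig[2:4, 2:4] = 1.0            # pathological region in the input
recon = np.zeros((8, 8))        # healthy-looking reconstruction
amap = anomaly_map(orig, recon)  # nonzero only inside the lesion
```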

Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 11:00 AM – 12:00 PM
- 🌐 Location: Online at
vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🪢 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

🌟 This Week's Presentation:

📌 Title:

We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function

🎙️ Presenter: Amir Kasaei

🧠 Abstract:
Text-to-image diffusion models, particularly Stable Diffusion, have significantly advanced computer vision by enabling high-quality image synthesis from textual prompts. However, their performance often degrades when handling complex prompts involving multiple attributes or objects. This work investigates the root causes of this limitation, focusing on the role of the CLIP text encoder. The study identifies a phenomenon of attribute bias in the text embedding space and reveals a contextual issue in the handling of padding embeddings, which leads to concept entanglement. To address these challenges, the authors propose Magnet, a novel, training-free method that enhances attribute disentanglement through the use of positive and negative binding vectors, supported by a neighbor-based strategy to improve accuracy. Experimental results demonstrate that Magnet significantly boosts both image synthesis quality and attribute binding precision, with minimal computational cost, and effectively supports the generation of unconventional or abstract visual concepts.

📄 Paper:
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function

Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
Call for Research Assistants in Large Language Model Projects

If you are familiar with Large Language Models (LLMs), you are invited to join our research projects as a research assistant. These projects focus on advanced topics in reasoning with large language models and are jointly supervised by Dr. Rohban and Dr. Jafari. The projects are conducted within the RIML and INL laboratories at Sharif University of Technology, and may also be considered as undergraduate thesis projects, if applicable. If you are interested, please complete the following form:
Registration Form
📢 Research Assistant Positions Available

The Robust and Interpretable Machine Learning (RIML) Lab and the Trustworthy and Secure Artificial Intelligence Lab (TSAIL) at the Computer Engineering Department of Sharif University of Technology are seeking highly motivated and talented research assistants to join our team. This collaborative project is jointly supervised by Dr. Rohban and Dr. Sadeghzadeh.

🔍 Position Overview
We are working on cutting-edge research in the field of generative models, with a focus on robustness, interpretability, and trustworthiness. As a research assistant, you will contribute to impactful projects at the intersection of theory and real-world applications.

🧠 Required Qualifications

- Solid background in machine learning, artificial intelligence, and generative models
- Hands-on experience with generative models and their practical applications
- Proficiency in Python and frameworks such as PyTorch
- Strong communication skills and the ability to work well in a collaborative research environment

📝 How to Apply
If you are interested in joining our team, please complete the application form and upload your CV using the following link:
👉 Application Form

📚 Suggested Background Reading
To better understand the context of our research, we recommend reviewing the following papers:

1. http://arxiv.org/abs/2410.15618
2. http://arxiv.org/abs/2305.10120

We look forward to your application!
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on unlearning in deep generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle unlearning tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: Categorical Reparameterization with Gumbel-Softmax


🔸 Presenter: Aryan Komaei

🌀 Abstract:
This paper addresses the challenge of using categorical variables in stochastic neural networks, which traditionally struggle with backpropagation due to non-differentiable sampling. The authors propose the Gumbel-Softmax distribution as a solution — a differentiable approximation of categorical variables that allows for efficient gradient-based optimization. The key benefit is that it can be smoothly annealed to behave like a true categorical distribution. The method outperforms previous gradient estimators in tasks like structured prediction and generative modeling, and also enables significant speedups in semi-supervised classification.
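The trick the abstract describes is short enough to sketch directly: add Gumbel(0, 1) noise to the class logits, then apply a temperature-scaled softmax, giving a differentiable "soft" categorical sample. A minimal NumPy sketch (the function name and interface are illustrative, not from the paper):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable approximation of a categorical sample.

    As tau -> 0 the output approaches a one-hot vector (a true
    categorical sample); larger tau flattens it toward uniform,
    which is the annealing knob the paper exploits."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(1e-10, 1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))                      # Gumbel(0, 1) noise
    y = (np.asarray(logits, dtype=np.float64) + g) / tau
    y = y - y.max()                              # numerical stability
    e = np.exp(y)
    return e / e.sum()

sample = gumbel_softmax([2.0, 1.0, 0.1], tau=0.5)  # soft one-hot over 3 classes
```

In a framework with autodiff (e.g. PyTorch's `torch.nn.functional.gumbel_softmax`), the same construction lets gradients flow through the sampling step.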

Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 11:00 AM - 12:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🔐 ML Security Journal Club

This Week's Presentation:

🔹 Title: Jailbreaking Text-to-image Generative Models

🔸 Presenter: Arian Komaei

🌀 Abstract:
This paper introduces SneakyPrompt, an automated attack framework designed to bypass safety filters in text-to-image generative models like Stable Diffusion and DALL·E 2. These models are often equipped with safety filters to prevent the generation of harmful or NSFW (Not-Safe-for-Work) images. SneakyPrompt exploits these systems by using reinforcement learning to perturb blocked prompts in a way that circumvents the filters.

📄 Paper: SneakyPrompt: Jailbreaking Text-to-image Generative Models


Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🪢 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

🌟 This Week's Presentation:

📌 Title:

Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples


🎙️ Presenter: Mobina Poulaei

🧠 Abstract:
This work tackles a key challenge in diffusion models: the misalignment between generated images and their text prompts. While Direct Preference Optimization (DPO) has been used to improve alignment, it struggles with visual inconsistency between training samples. To address this, the authors propose D-Fusion, a method that creates visually consistent, DPO-trainable image pairs using mask-guided self-attention fusion. D-Fusion also preserves denoising trajectories necessary for optimization. Experiments show that it effectively improves prompt-image alignment across multiple reinforcement learning settings.

📄 Paper:
D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples

Session Details:
- 📅 Date: Tuesday, August 5
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🔐 ML Security Journal Club

This Week's Presentation:

🔹 Title: Jailbreaking Text-to-image Generative Models

🔸 Presenter: Arian Komaei

🌀 Abstract:
This paper introduces GhostPrompt, an automated jailbreak framework targeting text-to-image (T2I) generation models to bypass integrated safety filters for not-safe-for-work (NSFW) content. Unlike previous token-level perturbation methods, GhostPrompt leverages large language models (LLMs) with multimodal feedback for semantic-level adversarial prompt generation. It combines Dynamic Optimization, an iterative feedback-driven process for generating aligned adversarial prompts, with Adaptive Safety Indicator Injection, which strategically embeds benign visual cues to evade image-level detection. The framework achieves a 99% bypass rate against ShieldLM-7B (up from 12.5% with SneakyPrompt), improves CLIP scores, reduces processing time, and generalizes to unseen models, including GPT-4.1 and DALL·E 3. The work reveals critical vulnerabilities in current multimodal safety systems and calls for further AI safety research under controlled-access protocols.

📄 Paper: GhostPrompt: Jailbreaking Text-to-image Generative Models based on Dynamic Optimization


Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 6:30 - 7:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🪢 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

🌟 This Week's Presentation:

📌 Title:
Fast Noise Initialization for Temporally Consistent Video Generation


🎙️ Presenter: Ali Aghayari

🧠 Abstract:
Video generation has advanced rapidly with diffusion models, but ensuring temporal consistency remains challenging. Existing methods like FreeInit address this by iteratively refining noise during inference, though at a significant computational cost. To overcome this, the authors introduce FastInit, a fast noise initialization method powered by a Video Noise Prediction Network (VNPNet). Given random noise and a text prompt, VNPNet produces refined noise in a single forward pass, eliminating the need for iteration. This approach greatly improves efficiency while maintaining high temporal consistency across frames. Trained on a large-scale dataset of text prompts and noise pairs, FastInit consistently enhances video quality in experiments with various text-to-video models. By offering both speed and stability, FastInit provides a practical solution for real-world video generation. The code and dataset will be released publicly.

📄 Paper:
FastInit: Fast Noise Initialization for Temporally Consistent Video Generation

Session Details:
- 📅 Date: Tuesday, August 19
- 🕒 Time: 11:00 AM - 12:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
Call for Research Assistants in Large Language Model Projects

If you are familiar with Large Language Models (LLMs), you are invited to join our research projects as a research assistant. These projects focus on advanced topics in large language models.
The projects are conducted within the RIML Laboratory and may also be considered as undergraduate thesis projects, if applicable.
For an introduction to the topic, you can read:
Learning to Generate Research Idea with Dynamic Control
If you are interested, please complete the following form:
Registration Form
If you face any problems, contact @Moein_Salimi
31st Session of the Large Language Models Club
📚 Topic: Uncertainty Estimation in Deep Networks
Speaker:
Dr. Yasin Abbasi, former AI researcher at DeepMind
Time: Wednesday 1404/06/26 (September 17, 2025), 15:00
Session link:
https://vc.sharif.edu/rohban
YouTube (session videos)
Twitter
Add the event to Google Calendar
Journal club website
Everyone is invited to attend this session.
#LLM_Club
@LLM_CLUB
🪢 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

🌟 This Week's Presentation:

📄 Paper:
Minority-Focused Text-to-Image Generation via Prompt Optimization

🧠 Abstract:
This paper introduces a new framework for improving the generation of minority samples with pretrained text-to-image diffusion models. Minority instances—defined as samples in low-density regions of text-conditioned data distributions—are valuable for applications like data augmentation and creative AI but are underrepresented in current models, which tend to focus on high-density regions. To address this imbalance, the authors propose an online prompt optimization method that preserves semantic content while guiding the emergence of desired properties. They further adapt this approach with a specialized likelihood-based objective to better capture minority features. Experimental results across multiple diffusion models show that the method substantially improves the quality and diversity of generated minority samples compared to existing techniques.

🎙️ Presenter: Amir Kasaei

Session Details:
- 📅 Date: Tuesday
- 🕒 Time: 4:00 PM - 5:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🔐 ML Security Journal Club

This Week's Presentation:

🔹 Title: Safe Generative AI Workshop @ NeurIPS 2024

🔸 Presenter: Arian Komaei

🌀 Abstract:
In the past two years, generative AI has been the major driving force behind the development of advanced AI products such as ChatGPT-4, AlphaFold, and Stable Diffusion. While these technologies have significantly improved productivity for many, they have also raised serious safety concerns, yet no workshop has focused on this topic over the same period. This workshop, which emphasizes AI safety concerns related to the use of generative AI, is therefore much needed by the community. Generative AI, including large language models, vision-language models, diffusion models, and many more, has significantly aided both academia and industry. In scientific discovery, these contributions span experimental design, hypothesis formulation, theoretical reasoning, and observation organization. In commercial applications, generative models such as large language models and diffusion algorithms have changed the lifestyles and workflows of billions around the world. This workshop aims to convene experts from various fields to address these challenges and explore potential solutions.


Session Details:
- 📅 Date: Sunday
- 🕒 Time: 4:00 - 5:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🪢 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

🌟 This Week's Presentation:

📌 Title:
Compositional Visual Reasoning: Why It Matters and What Holds Us Back

🧠 Abstract:
Compositional visual reasoning is a key challenge in multimodal AI, focusing on enabling machines to break down visual scenes into meaningful parts, connect them with concepts, and perform multi-step logical inference. In this session, we will introduce the foundations of visual reasoning and discuss why compositionality is crucial for achieving robustness, interpretability, and cognitive alignment in AI systems. We will also highlight major challenges, including hallucinations, difficulty in maintaining semantic fidelity, and the limitations of current reasoning strategies. The aim is to provide a clear picture of the problem space and motivate deeper exploration in future sessions.

📄 Paper:
Explain Before You Answer: A Survey on Compositional Visual Reasoning

🎙 Presenter:
Amir Kasaei

Session Details:
- 📅 Date: Tuesday, September 23
- 🕒 Time: 4:00 PM - 5:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
🔐 ML Security Journal Club

This Week's Presentation:

🔹 Title: Unlearning Diffusion Models

🔸 Presenter: Arian Komaei

🌀 Abstract:
This paper introduces Single Layer Unlearning Gradient (SLUG), a new method for removing unwanted information from trained models efficiently. Unlike traditional unlearning approaches that require costly updates across many layers, SLUG updates only one carefully chosen layer using a single gradient step. The method relies on layer importance and gradient alignment to identify the optimal layer, preserving model performance while unlearning targeted content. Experiments show that SLUG works effectively across models like CLIP, Stable Diffusion, and vision-language models, handling both concrete concepts (e.g., objects, identities) and abstract ones (e.g., artistic styles). Compared to existing approaches, SLUG achieves similar unlearning results but with much lower computational cost, making it a practical solution for efficient and precise targeted unlearning.
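The core loop the abstract describes, score every layer, pick one, and apply a single gradient step to it alone, can be sketched on a toy model. Everything below is illustrative: the layer names, gradients, and the importance proxy (gradient norm relative to weight norm) are our simplifications; the paper combines layer importance with gradient alignment on real networks such as CLIP.

```python
import numpy as np

# Toy "model": two named layers, each just a weight vector, plus
# hypothetical per-layer gradients of the unlearning (forgetting) loss.
rng = np.random.default_rng(0)
layers = {"layer_a": rng.normal(size=4), "layer_b": rng.normal(size=4)}
grads = {"layer_a": np.array([0.01, 0.0, 0.01, 0.0]),
         "layer_b": np.array([1.0, -2.0, 0.5, 1.5])}

def layer_importance(w, g):
    # Simplified importance proxy: how strongly the forgetting loss
    # pulls on this layer, relative to the layer's own scale.
    return np.linalg.norm(g) / (np.linalg.norm(w) + 1e-12)

# Select the single most responsive layer...
target = max(layers, key=lambda k: layer_importance(layers[k], grads[k]))
# ...and update only that layer, with one gradient step.
lr = 0.1
layers[target] = layers[target] - lr * grads[target]
```

The appeal of the single-layer, single-step design is that every other layer is left untouched, which is what preserves overall model performance while the targeted content is removed.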

📄 Paper: Targeted Unlearning with Single Layer Unlearning Gradient


Session Details:
- 📅 Date: Sunday
- 🕒 Time: 4:00 - 5:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
Research Team Formation: ML Trustworthiness & Speech Language Models

We are currently forming a research team for a project in the field of ML Trustworthiness and Speech Language Models.
Our goal is to publish the outcomes of this research in top-tier machine learning conferences. Additionally, active team members who contribute meaningfully to the project will receive recommendation letters from faculty members.

If you are interested in these topics and have sufficient time to dedicate to research, please fill out the form below:
Form Link

To learn more about related works previously conducted in our lab, you can visit the following links:
• Dr. Mohammad Hossein Rohban – Google Scholar
• PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers (CVPR 2025)

_We look forward to collaborating with you!_
🚀 Open RA Positions – Reinforcement Learning (Generalization & Sample Efficiency)

We have a few Research Assistant (RA) openings on Generalization and Sample Efficiency in Reinforcement Learning (RL). Selected candidates will work directly with Dr. Rohban and the project supervisor.

The project focuses on improving RL agents’ generalization beyond training environments using contrastive learning. While the choice of positive/negative samples greatly impacts training (see: https://arxiv.org/abs/2102.10960), anchor selection remains an unexplored area (related works: https://arxiv.org/abs/2004.04136 and https://arxiv.org/abs/1511.05952).

We’re looking for highly motivated researchers (B.Sc. or higher) with:
1️⃣ Strong background in Python and Git
2️⃣ Proficiency in Deep Learning & Reinforcement Learning (having taken or audited both courses)
3️⃣ At least 3 months of prior research experience
4️⃣ Self-motivated, independent, and a quick learner
5️⃣ On-site presence in the lab, with weekly meetings with Dr. Rohban and regular reports to the project supervisor

🕘 Deadline: Wednesday, October 20th, 2025 – 9:00 AM (Tehran time)
📄 Apply here: https://forms.gle/88SfwtwZvQ2JCZ7X7
📢 Research Assistant Positions Available

The Robust and Interpretable Machine Learning (RIML) Lab and the Trustworthy and Secure Artificial Intelligence Lab (TSAIL) at the Computer Engineering Department of Sharif University of Technology are seeking highly motivated and talented research assistants to join our team. This collaborative project is jointly supervised by Dr. Rohban and Dr. Sadeghzadeh.


🔍 Position Overview
We are working on cutting-edge research in the field of generative models, with a focus on robustness, interpretability, and trustworthiness. As a research assistant, you will contribute to impactful projects at the intersection of theory and real-world applications.

🧠 Required Qualifications

- Solid background in machine learning, artificial intelligence, and generative models
- Hands-on experience with generative models and their practical applications
- Proficiency in Python and frameworks such as PyTorch
- Strong communication skills and the ability to work well in a collaborative research environment

📝 How to Apply
If you are interested in joining our team, please complete the application form and upload your CV using the following link:
👉 Application Form

📚 Suggested Background Reading
To better understand the context of our research, we recommend reviewing the following papers:

1. http://arxiv.org/abs/2410.15618
2. http://arxiv.org/abs/2305.10120

⚠️ Note 1: We do not accept applicants who currently have a full-time job or those who are students with a part-time job.

⚠️ Note 2: The target of these projects is submission to ICML and ECCV at the end of this Shamsi year. Therefore, time is limited, and participants must have at least 20–30 hours of free time per week to dedicate to the projects.

We look forward to your application!
Call for Research Assistants in Large Language Model Projects

If you are familiar with LLMs, you are invited to join our research projects as a research assistant. This project focuses on abductive reasoning in LLMs and aims to prepare a submission for ACL 2026.
For an introduction to the topic, you can read:
GEAR: A General Evaluation Framework for Abductive Reasoning
If you are interested, please complete the following form:
Registration Form