RIML Lab – Telegram
2.86K subscribers · 46 photos · 25 videos · 7 files · 144 links
Robust and Interpretable Machine Learning Lab,
Prof. Mohammad Hossein Rohban,
Sharif University of Technology

https://youtube.com/@rimllab

twitter.com/MhRohban

https://www.linkedin.com/company/robust-and-interpretable-machine-learning-lab/
🧠 RL Journal Club: This Week's Session

🤝 We invite you to join us for this week's RL Journal Club session, where we will dive into a minimalist approach to offline reinforcement learning. In this session, we will explore how simplifying algorithms can lead to more robust and efficient models in RL, challenging the necessity of complex modifications commonly seen in recent advancements.

This Week's Presentation:

🔹 Title: Revisiting the Minimalist Approach to Offline Reinforcement Learning
🔸 Presenter: Professor Mohammad Hossein Rohban
🌀 Abstract: This presentation will delve into the trade-offs between simplicity and performance in offline RL algorithms. We will review the minimalist approach proposed in the paper, which re-evaluates core algorithmic features and shows that simpler models can achieve performance on par with more intricate methods. The discussion will include experimental results that demonstrate how stripping away complexity can lead to more effective learning, providing fresh insights into the design of RL systems.

The presentation will be based on the following paper:

▪️ Revisiting the Minimalist Approach to Offline Reinforcement Learning (https://arxiv.org/abs/2305.09836)

Session Details:

📅 Date: Tuesday
🕒 Time: 4:00 - 5:00 PM
🌐 Location: Online at https://vc.sharif.edu/ch/rohban
📍 For in-person attendance, please message me on Telegram at @alirezanobakht78

☝️ Note: The discussion is open to everyone, but we can only host students of Sharif University of Technology in person.

💯 Join us for an insightful session where we rethink how much complexity is truly necessary for effective offline reinforcement learning! Don't miss this chance to deepen your understanding of RL methodologies.

✌️ We look forward to your participation!
#RLJClub #JClub #RIML #SUT #AI #RL
Forwarded from Rayan AI Course
🧠 Free registration now open for the Rayan International AI Competition | Sharif University of Technology

🪙 Over $35,000 in cash prizes
🎓 Publication of the top 10 teams' results in leading international AI conferences/journals
🗓 Competition starts on Mehr 26, 1403 (October 17, 2024)

💬 Topics covered under Trustworthiness in Deep Learning:
💬 Model Poisoning
💬 Compositional Generalization
💬 Zero-Shot Anomaly Detection

👀 The Rayan International AI Competition, themed Trustworthy AI, is organized by Sharif University of Technology with the support of the Vice-Presidency for Science and Technology. The competition runs in three stages (two online and one in-person), starting on Mehr 26.

⭐️ To support the top teams reaching the third stage, Rayan will cover their travel and accommodation costs and will publish the top teams' scientific results in a leading conference or journal in the field, with team members credited as authors. These participants compete in the third phase for the $35,000 prize pool.

👥 Teams consist of 2 to 4 members.

💬 Registration is completely free until the end of Mehr 25 via:
ai.rayan.global

🌐Linkedin
🌐@Rayan_AI_Contest
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: A semiotic methodology for assessing the compositional effectiveness of generative text-to-image models

🔸 Presenter: Amir Kasaei

🌀 Abstract:
A new methodology for evaluating text-to-image generation models is being proposed, addressing limitations in current evaluation techniques. Existing methods, which use metrics such as fidelity and CLIPScore, often combine criteria like position, action, and photorealism in their assessments. This new approach adapts model analysis from visual semiotics, establishing distinct visual composition criteria. It highlights three key dimensions: plastic categories, multimodal translation, and enunciation, each with specific sub-criteria. The methodology is tested on Midjourney and DALL·E, providing a structured framework that can be used for future quantitative analyses of generated images.

📄 Paper: A semiotic methodology for assessing the compositional effectiveness of generative text-to-image models

Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban


We look forward to your participation! ✌️
🚨Open Position: Visual Compositional Generation Research 🚨

We are excited to announce an open research position for a project under Dr. Rohban at the RIML Lab (Sharif University of Technology). The project focuses on improving text-to-image generation in diffusion-based models by addressing compositional challenges.

🔍 Project Description:

Large-scale diffusion-based models excel at text-to-image (T2I) synthesis, but still face issues like object missing and improper attribute binding. This project aims to study and resolve these compositional failures to improve the quality of T2I models.

Key Papers:
- T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional T2I Generation
- Attend-and-Excite: Attention-Based Semantic Guidance for T2I Diffusion Models
- If at First You Don’t Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
- ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
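Several of the papers above steer generation through cross-attention. The core idea of the Attend-and-Excite line can be sketched as a gradient step that boosts attention to the least-attended subject token, countering the "object missing" failure. A hedged sketch with illustrative shapes; real implementations hook into the UNet's attention layers:

```python
import torch

def attend_and_excite_step(latents, attn_maps, token_ids, step_size=0.1):
    """One attention-based guidance step (sketch, not the paper's exact code).

    attn_maps: cross-attention over text tokens, shape (tokens, H, W),
    differentiable w.r.t. `latents`. We nudge the latent so the
    *least*-attended subject token gains attention mass.
    """
    losses = [1.0 - attn_maps[t].max() for t in token_ids]
    loss = torch.stack(losses).max()          # focus on the worst token
    grad = torch.autograd.grad(loss, latents)[0]
    return latents - step_size * grad         # gradient step on the latent
```

In the actual methods this step is interleaved with denoising, so each diffusion step starts from a latent that already attends to every subject in the prompt.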

🎯 Requirements:

- Must: PyTorch, deep learning
- Recommended: Transformers and diffusion models
- Ability to dedicate significant time to the project


🗓 Important Dates:

- Application Deadline: 2024/10/12 (23:59 UTC+3:30)

📌 Apply here:
Application Form

For questions:
📧 a.kasaei@me.com
💬 @amirkasaei

@RIMLLab
#research_application
#open_position
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization

🔸 Presenter: Amir Kasaei

🌀 Abstract:
Recent advancements in diffusion models, like Stable Diffusion, have shown impressive image generation capabilities, but ensuring precise alignment with text prompts remains a challenge. This presentation introduces Initial Noise Optimization (InitNO), a method that refines initial noise to improve semantic accuracy in generated images. By evaluating and guiding the noise using cross-attention and self-attention scores, the approach effectively enhances image-prompt alignment, as demonstrated through rigorous experimentation.
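The mechanism described above can be sketched as a small optimization loop over the starting noise: minimize an attention-derived loss, then re-standardize so the noise stays close to N(0, I), a stand-in for the paper's noise-distribution constraint. The interface is hypothetical; the real method scores candidates with the UNet's cross- and self-attention maps:

```python
import torch

def optimize_initial_noise(noise, attention_loss, steps=10, lr=0.05):
    """Initial-noise optimization in the spirit of InitNO (sketch).

    `attention_loss(z)` is assumed to return a differentiable scalar
    measuring how badly the noise z would serve the prompt.
    """
    z = noise.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        attention_loss(z).backward()
        opt.step()
        with torch.no_grad():                 # keep z roughly unit Gaussian
            z.copy_((z - z.mean()) / z.std().clamp_min(1e-8))
    return z.detach()
```

Because only the seed noise is changed, the diffusion model itself needs no retraining, which is what makes this family of methods cheap to apply.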


📄 Paper: InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization

Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban


We look forward to your participation! ✌️
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: Backdooring Bias into Text-to-Image Models

🔸 Presenter: Mehrdad Aksari Mahabadi

🌀 Abstract:
This paper investigates the misuse of text-conditional diffusion models, particularly text-to-image models, which create visually appealing images based on user descriptions. While these images generally represent harmless concepts, they can be manipulated for harmful purposes like propaganda. The authors show that adversaries can introduce biases through backdoor attacks, affecting even well-meaning users. Despite users verifying image-text alignment, the attack remains hidden by preserving the text's semantic content while altering other image features to embed biases, amplifying them by 4-8 times. The study reveals that current generative models make such attacks cost-effective and feasible, with costs ranging from 12 to 18 units. Various triggers, objectives, and biases are evaluated, with discussions on mitigations and future research directions.

📄 Paper: Backdooring Bias into Text-to-Image Models

Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban


We look forward to your participation! ✌️
Unfortunately, today's session will not be held.
Future sessions will be announced through this channel.
Research Position at the Sharif Information Systems and Data Science Center
 
Project Description: Anomaly detection in time series across various datasets, including autonomous-vehicle batteries, predictive maintenance, and estimation of remaining useful life (RUL) once an anomaly is detected, particularly in electric-vehicle batteries. The paper deadline for this project is the end of February. The project also involves federated learning algorithms, so that multiple local devices can each perform anomaly detection, RUL estimation, and predictive maintenance.
 
Technical Requirements: Two electrical or computer engineering students with strong skills in deep learning, robustness concepts, time-series anomaly detection, and federated learning algorithms, along with a creative mindset and strong, clean implementation skills.
 
Benefits: Access to a new, well-equipped lab, and research under the supervision of three professors in Electrical and Computer Engineering.

Dr. Babak Khalaj
Dr. Siavash Ahmadi
Dr. Mohammad Hossein Rohban

Please send your CV with the subject line "Research Position in Time Series Anomaly Detection" to data-icst@sharif.edu.
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control

🔸 Presenter: Arshia Hemmat

🌀 Abstract:
This presentation introduces advancements in addressing compositional challenges in text-to-image (T2I) generation models. Current diffusion models often struggle to associate attributes accurately with the intended objects based on text prompts. To address this, a new Edge Prediction Vision Transformer (EPViT) is introduced for improved image-text alignment evaluation. Additionally, the proposed Focused Cross-Attention (FCA) mechanism uses syntactic constraints from input sentences to enhance visual attention maps. DisCLIP embeddings further disentangle multimodal embeddings, improving attribute-object alignment. These innovations integrate seamlessly into state-of-the-art diffusion models, enhancing T2I generation quality without additional model training.
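The Focused Cross-Attention idea in the abstract reduces, at its core, to masking attention over text tokens using syntactic structure and renormalizing. A minimal sketch with illustrative shapes; the paper derives the mask from a parse of the prompt:

```python
import torch

def focused_cross_attention(attn, syntax_mask):
    """Sketch of the FCA mechanism described above (not the paper's code).

    attn: attention weights, shape (queries, tokens).
    syntax_mask: (tokens,) with 1 where syntax permits attention, else 0.
    Zeroes out disallowed tokens and renormalizes each query's weights.
    """
    masked = attn * syntax_mask                               # keep allowed tokens
    return masked / masked.sum(-1, keepdim=True).clamp_min(1e-8)
```

The renormalization matters: without it, queries whose mass fell mostly on masked tokens would attend to almost nothing, rather than redistributing attention to the syntactically linked object.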

📄 Paper: Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control


Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban


We look forward to your participation! ✌️
🚨 Open Research Position: Visual Anomaly Detection

We announce that there is an open research position in the RIML lab at Sharif University of Technology, supervised by Dr. Rohban.

🔍 Project Description:
Industrial inspection and quality control are among the most prominent applications of visual anomaly detection. In this context, the model is given a training set of solely normal samples to learn their distribution. During inference, any sample that deviates from this established normal distribution, should be recognized as an anomaly.
This project aims to improve the capabilities of existing models, allowing them to detect intricate anomalies that extend beyond conventional defects.

Introductory Paper:
Deep Industrial Image Anomaly Detection: A Survey
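A common baseline in the surveyed literature scores a test image by nearest-neighbor distance to a memory bank of features extracted from normal training images. A hedged sketch of that scoring step (illustrative, not any specific paper's method):

```python
import torch

def anomaly_score(test_feats, normal_bank, k=1):
    """Nearest-neighbor anomaly score over a bank of normal features.

    test_feats: (n, d) features of a test image's patches.
    normal_bank: (m, d) features collected from normal samples.
    Returns the image-level score: the max over patches of the mean
    distance to each patch's k nearest normal features.
    """
    dists = torch.cdist(test_feats, normal_bank)       # (n, m) pairwise distances
    knn = dists.topk(k, largest=False).values.mean(-1) # per-patch k-NN distance
    return knn.max().item()                            # most anomalous patch wins
```

Anything far from every normal feature scores high, which matches the project framing: the model only ever sees normal samples and must flag deviations at inference time.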

Requirements:
- Good understanding of deep learning concepts
- Fluency in Python, PyTorch
- Willingness to dedicate significant time

Submit your application here:
Application Form

Application Deadline:
2024/11/22 (23:59 UTC+3:30)

If you have any questions, contact:
@sehbeygi79
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: Counting Understanding in Vision-Language Models

🔸 Presenter: Arash Marioriyad

🌀 Abstract:
Counting-related challenges represent some of the most significant compositional understanding failure modes in vision-language models (VLMs) such as CLIP. While humans, even in early stages of development, readily generalize over numerical concepts, these models often struggle to accurately interpret numbers beyond three, with the difficulty intensifying as the numerical value increases. In this presentation, we explore the counting-related limitations of VLMs and examine the proposed solutions within the field to address these issues.

📄 Papers:
- Teaching CLIP to Count to Ten (ICCV, 2023)
- CLIP-Count: Towards Text-Guided Zero-Shot Object Counting (ACM-MM, 2023)
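The simplest way to probe the counting failure discussed above is to compare an image embedding against prompts like "a photo of three cats" for a range of counts and take the argmax. A hedged sketch assuming L2-normalized embeddings; this is an illustrative probe, not the method of either cited paper:

```python
import torch

def clip_count_estimate(image_emb, count_text_embs):
    """Zero-shot count probe in the CLIP style (illustrative).

    image_emb: (d,) normalized image embedding.
    count_text_embs: (N, d) normalized embeddings of count prompts,
    row i corresponding to the count i + 1.
    """
    sims = image_emb @ count_text_embs.T   # cosine similarity to each count prompt
    return int(sims.argmax()) + 1          # counts start at 1
```

Probes of exactly this shape are what expose the failure mode: accuracy is decent for one to three objects and degrades sharply beyond that, motivating the counting-specific training in the two papers.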


Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:00 - 6:00 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban


We look forward to your participation! ✌️
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing

🔸 Presenter: Dr. Rohban

🌀 Abstract:
This innovative framework addresses the limitations of current image generation models in handling intricate text prompts and ensuring reliability through verification and self-correction mechanisms. Coordinated by a multimodal large language model (MLLM) agent, GenArtist integrates a diverse library of tools, enabling seamless task decomposition, step-by-step execution, and systematic self-correction. With its tree-structured planning and advanced use of position-related inputs, GenArtist achieves state-of-the-art performance, outperforming models like SDXL and DALL-E 3. This session will delve into the system’s architecture and its groundbreaking potential for advancing image generation and editing tasks.
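The decompose-and-dispatch pattern described above can be reduced to a small loop: a planner emits a sequence of tool calls, and each tool transforms the current image. A minimal sketch with a hypothetical interface, not GenArtist's actual API:

```python
def run_tool_agent(plan, tools, image=None):
    """Minimal agent loop: step-by-step execution of a tool plan (sketch).

    plan: list of (tool_name, kwargs) pairs produced by a planner
    (in GenArtist, an MLLM); each tool maps an image to a new image.
    """
    for tool_name, kwargs in plan:            # step-by-step execution
        image = tools[tool_name](image, **kwargs)
    return image
```

The framework's tree-structured planning and self-correction sit on top of this loop: a verifier tool can reject a step and have the planner re-expand from the failing node.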


📄 Paper: GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing


Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 3:30 - 4:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
Research Week 1403.pdf
2.4 MB
Greetings. Please find attached the slides from the Research Week presentation on the NeurIPS-accepted paper from RIML. I have also posted a thread explaining the paper here: https://x.com/MhRohban/status/1867803097596338499
Forwarded from Arash
📣 TA Application Form

🤖 Deep Reinforcement Learning
🧑🏻‍🏫 Dr. Mohammad Hossein Rohban
Deadline: December 31st

https://docs.google.com/forms/d/e/1FAIpQLSduvRRAnwi6Ik9huMDFWOvZqAWhr7HHlHjXdZbst55zSv5Hmw/viewform
📣 TA Application Form

🤖 Course: System-2 AI
🧑🏻‍🏫 Instructors: Dr. Rohban, Dr. Soleymani, Mr. Samiei
Deadline: January 23rd

https://docs.google.com/forms/d/e/1FAIpQLSewqI25q5c3DcsdcCzhCVg42motC2S-bg_xuuPWZ0wA60rYHQ/viewform?usp=dialog
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step


🔸 Presenter: Amir Kasaei

🌀 Abstract:
This paper explores the use of Chain-of-Thought (CoT) reasoning to improve autoregressive image generation, an area not widely studied. The authors propose three techniques: scaling computation for verification, aligning preferences with Direct Preference Optimization (DPO), and integrating these methods for enhanced performance. They introduce two new reward models, PARM and PARM++, which adaptively assess and correct image generations. Their approach improves the Show-o model, achieving a +24% gain on the GenEval benchmark and surpassing Stable Diffusion 3 by +15%.


📄 Paper: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
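The simplest form of the test-time verification the paper scales up is best-of-N selection: sample several candidates and keep the one a reward model scores highest. A hedged sketch where `generate` and `reward` are placeholders for an image generator and a PARM-like verifier:

```python
import torch

def best_of_n(generate, reward, prompt, n=4):
    """Reward-model verification via best-of-N sampling (illustrative).

    Samples n candidate generations for the prompt and returns the one
    the reward model prefers.
    """
    candidates = [generate(prompt) for _ in range(n)]
    scores = torch.tensor([float(reward(prompt, c)) for c in candidates])
    return candidates[int(scores.argmax())]
```

PARM and PARM++ refine this idea by scoring intermediate generation steps rather than only final outputs, which is what lets the method verify and correct *during* autoregressive generation.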


Session Details:
- 📅 Date: Sunday
- 🕒 Time: 5:30 - 6:30 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
Unfortunately, today's session will not be held.
Future sessions will be announced through this channel.
💠 Compositional Learning Journal Club

Join us this week for an in-depth discussion on Compositional Learning in the context of cutting-edge text-to-image generative models. We will explore recent breakthroughs and challenges, focusing on how these models handle compositional tasks and where improvements can be made.

This Week's Presentation:

🔹 Title: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step


🔸 Presenter: Amir Kasaei

🌀 Abstract:

This paper explores the use of Chain-of-Thought (CoT) reasoning to improve autoregressive image generation, an area not widely studied. The authors propose three techniques: scaling computation for verification, aligning preferences with Direct Preference Optimization (DPO), and integrating these methods for enhanced performance. They introduce two new reward models, PARM and PARM++, which adaptively assess and correct image generations. Their approach improves the Show-o model, achieving a +24% gain on the GenEval benchmark and surpassing Stable Diffusion 3 by +15%.


📄 Paper: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step


Session Details:
- 📅 Date: Wednesday
- 🕒 Time: 2:15 - 3:15 PM
- 🌐 Location: Online at vc.sharif.edu/ch/rohban

We look forward to your participation! ✌️
Research Position at the Sharif Center for Information Systems and Data Science:

We are seeking several highly skilled students for a project targeting a NeurIPS deadline, focused on predictive maintenance for batteries and bearings.
Candidates should have strong skills in precise implementation and in integrating new ideas into architectures such as contrastive learning, transformers, PINNs (physics-informed neural networks), and diffusion models, to rapidly advance the research group's work.

The project is under the direct collaboration of Dr. Babak Khalaj, Dr. Siavash Ahmadi, and Dr. Mohammad Hossein Rohban.

To apply and submit your CV, please contact via email: seyedreza.shiyade@gmail.com
Postdoctoral Research Position Available

The Robust and Interpretable Machine Learning (RIML) Lab at the Computer Engineering Department of Sharif University of Technology is seeking a number of highly motivated and talented postdoctoral researchers to join our team. The successful candidate will work on cutting-edge research involving Large Language Model (LLM) Agents.

Duration:
• 1-2 years, with the possibility of extension based on performance and funding

Responsibilities:
• Conduct innovative research on LLM Agents
• Collaborate with a multidisciplinary team of researchers
• Publish high-quality research papers in top-tier conferences and journals
• Mentor graduate and undergraduate students
• Present research findings at international conferences and workshops

Qualifications:
• Ph.D. in Computer Science, Computer Engineering, or a related field, earned within the last 2 years
• Strong background in natural language processing, machine learning, and artificial intelligence
• Experience with large language models and their applications
• Excellent programming skills (e.g., Python and PyTorch)
• Strong publication record in relevant areas
• Excellent communication and teamwork skills

Interested candidates should submit the following documents to rohban@sharif.edu by Feb. 7th:
• A cover letter describing your research interests and career goals
• A detailed CV, including a list of publications
• Contact information for at least two references

For more information about our recent research topics, please see my Google Scholar profile: https://scholar.google.com/citations?hl=en&user=pRyJ6FkAAAAJ&view_op=list_works&sortby=pubdate