SQL Tutorial
Learn to answer questions with data using SQL. No coding experience necessary.
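For a taste of what "answering questions with data" looks like, here is a tiny sketch using Python's built-in sqlite3 module; the orders table and its values are invented purely for illustration and are not part of the tutorial.

    import sqlite3

    # Toy in-memory database; the table and rows are illustrative only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)],
    )

    # Question: how much has each customer spent in total?
    query = """
        SELECT customer, SUM(amount) AS total_spent
        FROM orders
        GROUP BY customer
        ORDER BY total_spent DESC
    """
    for customer, total in conn.execute(query):
        print(customer, total)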
Link: Site
Navigational hashtags: #armknowledgesharing #armsites #armcourses
General hashtags: #sql
@data_science_weekly
👍6
Recommenders
Recommenders' objective is to assist researchers, developers, and enthusiasts in prototyping, experimenting with, and bringing to production a range of classic and state-of-the-art recommendation systems.
Recommenders is a project under the Linux Foundation of AI and Data.
This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on five key tasks:
- Prepare Data: Preparing and loading data for each recommendation algorithm.
- Model: Building models using various classical and deep learning recommendation algorithms such as Alternating Least Squares (ALS) or eXtreme Deep Factorization Machines (xDeepFM).
- Evaluate: Evaluating algorithms with offline metrics.
- Model Select and Optimize: Tuning and optimizing hyperparameters for recommendation models.
- Operationalize: Operationalizing models in a production environment on Azure.
Several utilities are provided in recommenders to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several state-of-the-art algorithms are included for self-study and customization in your own applications. See the Recommenders documentation.
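As a rough sketch of those utilities, the snippet below loads MovieLens and makes a train/test split. It follows the repository's quick-start notebooks; the module paths and parameters shown are assumptions that may differ between releases.

    from recommenders.datasets import movielens
    from recommenders.datasets.python_splitters import python_random_split

    # Load the MovieLens 100k ratings into a pandas DataFrame.
    data = movielens.load_pandas_df(
        size="100k",
        header=["userID", "itemID", "rating", "timestamp"],
    )

    # Random 75/25 split into train and test sets.
    train, test = python_random_split(data, ratio=0.75)
    print(len(train), len(test))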
For a more detailed overview of the repository, please see the documents on the wiki page.
For some of the practical scenarios where recommendation systems have been applied, see scenarios.
Link: Repository
Navigational hashtags: #armknowledgesharing #armrepo
General hashtags: #recsys #recommendersystems #recommenders
@data_science_weekly
👍4
CS50’s Introduction to Programming with Python by Harvard
An introduction to programming using a language called Python. Learn how to read and write code as well as how to test and “debug” it. Designed for students with or without prior programming experience who’d like to learn Python specifically.
Learn about functions, arguments, and return values (oh my!); variables and types; conditionals and Boolean expressions; and loops. Learn how to handle exceptions, find and fix bugs, and write unit tests; use third-party libraries; validate and extract data with regular expressions; model real-world entities with classes, objects, methods, and properties; and read and write files.
Hands-on opportunities for lots of practice. Exercises inspired by real-world programming problems.
No software required except for a web browser, or you can write code on your own PC or Mac.
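For readers who have never written Python, here is a tiny example touching a few of the topics listed above (a function, a conditional, a loop, and exception handling); it is illustrative only and not taken from the course.

    def average(numbers):
        """Return the arithmetic mean of a non-empty list of numbers."""
        if not numbers:
            raise ValueError("need at least one number")
        return sum(numbers) / len(numbers)

    grades = []
    for raw in ["90", "85", "oops", "77"]:
        try:
            grades.append(int(raw))  # may raise ValueError on bad input
        except ValueError:
            print(f"Skipping invalid grade: {raw}")

    print(f"Average grade: {average(grades):.1f}")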
Link: Course
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #python
@data_science_weekly
👍3
Interpreting Machine Learning Models With SHAP. A Guide With Python Examples And Theory On Shapley Values by Christoph Molnar
Machine learning models are transforming fields from healthcare diagnostics to climate change prediction through their predictive performance. However, these complex models often lack interpretability, which is becoming more essential than ever for debugging, fostering trust, and communicating model insights.
Introducing SHAP, the Swiss army knife of machine learning interpretability:
- SHAP can be used to explain individual predictions.
- By combining explanations for individual predictions, SHAP lets you study overall model behavior.
- SHAP is model-agnostic – it works with any model, from simple linear regression to deep learning.
- With its flexibility, SHAP can handle various data formats, whether it’s tabular, image, or text.
- The Python package shap makes the application of SHAP for model interpretation easy.
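A minimal sketch of what working with the shap package typically looks like; the model and dataset are placeholders chosen for illustration, not examples from the book.

    import shap
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import RandomForestRegressor

    # Toy model purely for illustration.
    X, y = load_diabetes(return_X_y=True, as_frame=True)
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

    # shap.Explainer dispatches to an appropriate explainer for the model.
    explainer = shap.Explainer(model, X)
    shap_values = explainer(X.iloc[:200])

    shap.plots.waterfall(shap_values[0])  # explain one individual prediction
    shap.plots.beeswarm(shap_values)      # study overall model behavior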
This book will be your comprehensive guide to mastering the theory and application of SHAP. It begins with the method's fascinating origins in game theory and explores what splitting taxi costs has to do with explaining machine learning predictions. Starting from a simple linear regression model, the book progressively introduces SHAP for more complex models. You'll learn the ins and outs of the most popular explainable AI method and how to apply it using the shap package.
In a world where interpretability is key, this book is your roadmap to mastering SHAP and to building machine learning models that are not only accurate but also interpretable.
Links:
- Paperback
- eBook
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #ml #machinelearning #shap #interpretability #python #shapley #shapleyvalues
@data_science_weekly
👍8
Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
Even bad code can function. But if code isn’t clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn’t have to be that way.
Noted software expert Robert C. Martin presents a revolutionary paradigm with Clean Code: A Handbook of Agile Software Craftsmanship. Martin, who has helped bring agile principles from a practitioner’s point of view to tens of thousands of programmers, has teamed up with his colleagues from Object Mentor to distill their best agile practice of cleaning code “on the fly” into a book that will instill within you the values of a software craftsman and make you a better programmer―but only if you work at it.
What kind of work will you be doing? You’ll be reading code―lots of code. And you will be challenged to think about what’s right about that code and what’s wrong with it. More importantly, you will be challenged to reassess your professional values and your commitment to your craft.
Clean Code is divided into three parts. The first describes the principles, patterns, and practices of writing clean code. The second part consists of several case studies of increasing complexity. Each case study is an exercise in cleaning up code―of transforming a code base that has some problems into one that is sound and efficient. The third part is the payoff: a single chapter containing a list of heuristics and “smells” gathered while creating the case studies. The result is a knowledge base that describes the way we think when we write, read, and clean code.
Readers will come away from this book understanding:
- How to tell the difference between good and bad code
- How to write good code and how to transform bad code into good code
- How to create good names, good functions, good objects, and good classes
- How to format code for maximum readability
- How to implement complete error handling without obscuring code
- How to unit test and practice test-driven development
- What “smells” and heuristics can help you identify bad code
This book is a must for any developer, software engineer, project manager, team lead, or systems analyst with an interest in producing better code.
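A tiny before-and-after in the spirit of the naming and small-function advice above (my own toy example, not code from the book):

    # Before: unclear names, a magic number, and no statement of intent.
    def proc(d):
        r = []
        for x in d:
            if x[1] > 18:
                r.append(x[0].title())
        return r

    # After: intention-revealing names and a named constant.
    ADULT_AGE = 18

    def adult_names(people):
        """Return capitalised names of people older than ADULT_AGE."""
        return [name.title() for name, age in people if age > ADULT_AGE]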
Link: Paperback
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #development #cleancode
@data_science_weekly
👍8
A new perspective on Shapley values, part I: Intro to Shapley and SHAP by Edden Gerber
This post is the first in a series of two posts about explaining statistical models with Shapley values.
There are two main reasons you might want to read it:
1. To learn about Shapley values and the SHAP python library.
This is what this post is about after all. The explanations it provides are far from exhaustive, and contain nothing that cannot be gathered from other online sources, but it should still serve as a good quick intro or bonus reading on this subject.
2. As an introduction or refresher before reading the next post about Naive Shapley values.
The next post is my attempt at a novel contribution to the topic of Shapley values in machine learning. You may already be familiar with SHAP and Shapley values and just be glancing over this post to make sure we’re on common ground, or you may be here to clear up something confusing from the next post.
Link: Post
Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #shap #shapley #interpretation #ml #python
@data_science_weekly
👍7
Robotics Course by Hugging Face 🤗
This free course will take you on a journey from classical robotics to modern learning-based approaches, teaching you to understand, implement, and apply machine learning techniques to real robotic systems.
This course is based on the Robot Learning Tutorial, which is a comprehensive guide to robot learning for researchers and practitioners. Here, we are attempting to distill the tutorial into a more accessible format for the community.
This first unit will help you onboard. You’ll see the course syllabus and learning objectives, understand the structure and prerequisites, meet the team behind the course, learn about LeRobot and the surrounding Hugging Face ecosystem, and explore the community resources that support your journey.
This course bridges theory and practice in robotics. It's designed for students interested in understanding how machine learning is transforming robotics. Whether you're new to robotics or looking to understand learning-based approaches, this course will guide you step by step.
What to expect from this course?
Across the course you will study classical robotics foundations and modern learning‑based approaches, learn to use LeRobot, work with real robotics datasets, and implement state‑of‑the‑art algorithms. The emphasis is on practical skills you can apply to real robotic systems.
At the end of this course, you'll understand:
- How robots learn from data
- Why learning-based approaches are transforming robotics
- How to implement these techniques using modern tools like LeRobot
What's the syllabus?
Here is the general syllabus for the robotics course. Each unit builds on the previous ones to give you a comprehensive understanding of Robotics.
- Course Introduction. Welcome, prerequisites, and course overview
- Introduction to Robotics. Why Robotics matters and LeRobot ecosystem
- Classical Robotics. Traditional approaches and their limitations
- Reinforcement Learning. How robots learn through trial and error
- Imitation Learning. Learning from demonstrations and behavioral cloning
- Foundation Models. Large-scale models for general robotics
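If you want to peek at LeRobot before enrolling, the sketch below loads one of its demo datasets from the Hugging Face Hub; the import path and dataset id are assumptions based on LeRobot's public examples and may change between versions.

    # Assumed import path; recent LeRobot releases may expose this differently.
    from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

    # "lerobot/pusht" is one of the small demo datasets on the Hub (assumed id).
    dataset = LeRobotDataset("lerobot/pusht")

    print(len(dataset), "frames")  # behaves like a torch Dataset
    sample = dataset[0]            # dict of observation and action tensors
    print(sample.keys())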
Link: Course
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #robotics #rf #reinforcementlearning #foundationalmodels #hf #huggingface
@data_science_weekly
👍6
A new perspective on Shapley values, part II: The Naïve Shapley method by Edden Gerber
Why should you read this post?
1. For insight into Shapley values and the SHAP tool.
Most other sources on these topics are explanations based on existing primary sources (e.g. academic papers and the SHAP documentation). This post is an attempt to gain some understanding through an empirical approach.
2. To learn about an alternative approach to computing Shapley values that, under some (limited) circumstances, may be preferable to SHAP.
If you are unfamiliar with Shapley values or SHAP, or want a short recap of how the SHAP explainers work, check out the previous post. In a hurry? The author has emphasized the key sentences in bold to assist your speed-reading.
Link: Post
Navigational hashtags: #armknowledgesharing #armsites
General hashtags: #shap #shapley #interpretation #ml #python
@data_science_weekly
👍6
Machine Learning Systems. Principles and Practices of Engineering Artificially Intelligent Systems by Vijay Janapa Reddi (Harvard University)
Machine Learning Systems provides a systematic framework for understanding and engineering machine learning (ML) systems.
This textbook bridges the gap between theoretical foundations and practical engineering, emphasizing the systems perspective required to build effective AI solutions.
Unlike resources that focus primarily on algorithms and model architectures, this book highlights the broader context in which ML systems operate, including data engineering, model optimization, hardware-aware training, and inference acceleration.
Readers will develop the ability to reason about ML system architectures and apply enduring engineering principles for building flexible, efficient, and robust machine learning systems.
Links:
- Book
- Site
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #mlsd #machinelearning #machinelearningsystemdesign
@data_science_weekly
👍6
CS231n: Deep Learning for Computer Vision by Stanford
Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems.
This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Additionally, the final assignment will give them the opportunity to train and apply multi-million parameter networks on real-world vision problems of their choice.
Through multiple hands-on assignments and the final course project, students will acquire the toolset for setting up deep learning tasks and practical engineering tricks for training and fine-tuning deep neural networks.
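As a taste of the kind of code the assignments build toward, here is a minimal PyTorch training step for a small image classifier; the architecture and the fake batch are placeholders, not course material.

    import torch
    import torch.nn as nn

    # Tiny classifier for 32x32 RGB images (CIFAR-like), purely illustrative.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(16 * 16 * 16, 10),
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    # One training step on a random batch of 8 images.
    images = torch.randn(8, 3, 32, 32)
    labels = torch.randint(0, 10, (8,))

    logits = model(images)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"loss: {loss.item():.3f}")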
Links:
- Course Materials
- Useful Notes
- Videos (2025)
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #cv #computervision #nn #neuralnetworks
@data_science_weekly
👍7
Machine Learning System Design Interview by Ali Aminian and Alex Xu
Machine learning system design interviews are the most difficult to tackle of all technical interview questions. This book provides a reliable strategy and knowledge base for approaching a broad range of ML system design questions. It provides a step-by-step framework for tackling an ML system design question. It includes many real-world examples to illustrate the systematic approach, with detailed steps you can follow.
This book is an essential resource for anyone interested in ML system design, whether they are beginners or experienced engineers. If you are preparing for an ML interview, it is written specifically for you.
What’s inside?
- An insider’s take on what interviewers really look for and why.
- A 7-step framework for solving any ML system design interview question.
- 10 real ML system design interview questions with detailed solutions.
- 211 diagrams that visually explain how various systems work.
Table Of Contents
Chapter 1 Introduction and Overview
Chapter 2 Visual Search System
Chapter 3 Google Street View Blurring System
Chapter 4 YouTube Video Search
Chapter 5 Harmful Content Detection
Chapter 6 Video Recommendation System
Chapter 7 Event Recommendation System
Chapter 8 Ad Click Prediction on Social Platforms
Chapter 9 Similar Listings on Vacation Rental Platforms
Chapter 10 Personalized News Feed
Chapter 11 People You May Know
Links:
- Paper version
- Digital version
- Solutions
Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #mlsd #machinelearning #machinelearningsystemdesign
@data_science_weekly
👍8
A/B Testing course by Skoltech
Each of us regularly makes decisions. The optimal solution is often not obvious, and the cost of error is high. A/B tests are the most accurate way to choose the best option.
A/B experiments are used to test the effectiveness of new drugs and are also widely used in business. Companies that use A/B experiments make more accurate decisions, allowing them to stay ahead of the competition.
Mathematical statistics is the foundation of A/B tests. It provides mathematically sound criteria for testing hypotheses. This allows us to be confident in the accuracy of our results.
Upon completion of this course, you'll be able to design experiments and evaluate them using A/B testing techniques, including advanced ones such as variance reduction and ratio metric analysis. If you're a manager, you'll learn the full A/B testing pipeline, its key steps, and the typical mistakes people make when conducting A/B tests.
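The simplest instance of that statistical machinery is a two-sample test on a metric. A minimal SciPy sketch on synthetic data (not course code):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    # Synthetic per-user revenue for control (A) and treatment (B).
    control = rng.normal(loc=10.0, scale=3.0, size=5_000)
    treatment = rng.normal(loc=10.2, scale=3.0, size=5_000)

    # Welch's t-test: is the difference significant at the 5% level?
    t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    print("significant" if p_value < 0.05 else "not significant")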
Table Of Contents
Week 1. A/B Tests. Introduction
Week 2. Statistics Basics. Parametric Estimation. Bootstrapping
Week 3. Statistics Basics. Hypothesis Testing
Week 4. A/B Tests. Basic Level
Week 5. A/B Tests. Increasing Sensitivity. Review of Modern Methods
Link: Course
Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #statistics #ab #abtesting
@data_science_weekly
👍5
System Design 101
Explain complex systems using visuals and simple terms.
Whether you're preparing for a System Design Interview or you simply want to understand how systems work beneath the surface, we hope this repository will help you achieve that.
Link: Repo
Navigational hashtags: #armknowledgesharing #armrepo
General hashtags: #systemdesign
@data_science_weekly
👍6
Machine Learning Q and AI. 30 Essential Questions and Answers on Machine Learning and AI by Sebastian Raschka
If you’re ready to venture beyond introductory concepts and dig deeper into machine learning, deep learning, and AI, the question-and-answer format of Machine Learning Q and AI will make things fast and easy for you, without a lot of mucking about.
Born out of questions often fielded by author Sebastian Raschka, the direct, no-nonsense approach of this book makes advanced topics more accessible and genuinely engaging. Each brief, self-contained chapter journeys through a fundamental question in AI, unraveling it with clear explanations, diagrams, and hands-on exercises.
WHAT'S INSIDE:
FOCUSED CHAPTERS: Key questions in AI are answered concisely, and complex ideas are broken down into easily digestible parts.
WIDE RANGE OF TOPICS: Raschka covers topics ranging from neural network architectures and model evaluation to computer vision and natural language processing.
PRACTICAL APPLICATIONS: Learn techniques for enhancing model performance, fine-tuning large models, and more.
You’ll also explore how to:
• Manage the various sources of randomness in neural network training
• Differentiate between encoder and decoder architectures in large language models
• Reduce overfitting through data and model modifications
• Construct confidence intervals for classifiers and optimize models with limited labeled data
• Choose between different multi-GPU training paradigms and different types of generative AI models
• Understand performance metrics for natural language processing
• Make sense of the inductive biases in vision transformers
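As an illustration of the first bullet in the list above, managing the sources of randomness in a PyTorch training run usually comes down to fixing a handful of seeds; a minimal, hedged sketch (not code from the book):

    import random
    import numpy as np
    import torch

    def seed_everything(seed: int = 42) -> None:
        """Fix the common sources of randomness in a PyTorch training run."""
        random.seed(seed)        # Python's built-in RNG
        np.random.seed(seed)     # NumPy (shuffling, augmentation)
        torch.manual_seed(seed)  # PyTorch CPU and CUDA generators
        # Trade speed for reproducibility in cuDNN convolution algorithms.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

    seed_everything(42)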
If you’ve been on the hunt for the perfect resource to elevate your understanding of machine learning, Machine Learning Q and AI will make it easy for you to painlessly advance your knowledge beyond the basics.
Link: Site
Navigational hashtags: #armknowledgesharing #armbook
General hashtags: #ml #machinelearning #nlp #cv #dl #nn #neuralnetworks #deeplearning #computervision #naturallanguageprocessing
@data_science_weekly
👍10
PyTorch internals
This talk is for those of you who have used PyTorch, and thought to yourself, "It would be great if I could contribute to PyTorch," but were scared by PyTorch's behemoth of a C++ codebase. I'm not going to lie: the PyTorch codebase can be a bit overwhelming at times. The purpose of this talk is to put a map in your hands: to tell you about the basic conceptual structure of a "tensor library that supports automatic differentiation", and give you some tools and tricks for finding your way around the codebase. I'm going to assume that you've written some PyTorch before, but haven't necessarily delved deeper into how a machine learning library is written.
The talk is in two parts: in the first part, I'm going to first introduce you to the conceptual universe of a tensor library. I'll start by talking about the tensor data type you know and love, and give a more detailed discussion about what exactly this data type provides, which will lead us to a better understanding of how it is actually implemented under the hood. If you're an advanced user of PyTorch, you'll be familiar with most of this material. We'll also talk about the trinity of "extension points", layout, device and dtype, which guide how we think about extensions to the tensor class. In the live talk at PyTorch NYC, I skipped the slides about autograd, but I'll talk a little bit about them in these notes as well.
The second part grapples with the actual nitty gritty details involved with actually coding in PyTorch. I'll tell you how to cut your way through swaths of autograd code, what code actually matters and what is legacy, and also all of the cool tools that PyTorch gives you for writing kernels.
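A tiny concrete companion to the first part's discussion of storage, strides, and the dtype/device/layout extension points:

    import torch

    x = torch.arange(6).reshape(2, 3)
    print(x.size(), x.stride())   # torch.Size([2, 3]) (3, 1)

    # A transpose is a view: same underlying storage, different strides.
    y = x.t()
    print(y.stride())                    # (1, 3)
    print(x.data_ptr() == y.data_ptr())  # True, no data was copied

    # The "trinity" of extension points the talk describes.
    print(x.dtype, x.device, x.layout)   # torch.int64 cpu torch.strided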
Link: Site
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #dl #deeplearning #pytorch
@data_science_weekly
👍5
Deep Learning Tuning Playbook by Google
This document helps you train deep learning models more effectively. Although this document emphasizes hyperparameter tuning, it also touches on other aspects of deep learning training, such as training pipeline implementation and optimization.
This document assumes your machine learning task is either a supervised learning problem or a similar problem (for example, self-supervised learning). That said, some of the advice in this document may also apply to other types of machine learning problems.
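The playbook leans heavily on searching explicit hyperparameter spaces (it recommends quasi-random search for exploration); below is a simpler plain random-search sketch of the same idea, where train_and_evaluate is a placeholder for your own training pipeline.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_trial():
        """Sample one configuration from an explicit search space."""
        return {
            "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform in [1e-5, 1e-1]
            "weight_decay": 10 ** rng.uniform(-6, -2),
            "batch_size": int(rng.choice([64, 128, 256])),
        }

    def train_and_evaluate(config):
        # Placeholder: run your pipeline and return a validation metric.
        return -abs(np.log10(config["learning_rate"]) + 3)  # fake score, best near 1e-3

    trials = [sample_trial() for _ in range(20)]
    scores = [train_and_evaluate(cfg) for cfg in trials]
    print("best config:", trials[int(np.argmax(scores))])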
Links:
- GitHub
- Site
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #dl #deeplearning #google
@data_science_weekly
👍3
Tech Interview Cheat Sheet
This list is meant to be both a quick guide and reference for further research into these topics. It's basically a summary of that comp sci course you never took or forgot about, so there's no way it can cover everything in depth.
Link: Site
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #interview #techinterview #interviewprep #interviewpreparation
@data_science_weekly
👍2
A/B Testing & Experimentation Roadmap
This roadmap is for analysts, data scientists, and product folks who want to go from “I know what an A/B test is” to running trustworthy, advanced online experiments (CUPED, sequential testing, quasi-experiments, Bayesian, etc.).
It’s organized by topics. You don’t have to go strictly top-to-bottom, but earlier sections are foundations for later ones.
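As a flavour of the advanced end of the roadmap, CUPED reduces metric variance using a pre-experiment covariate. A minimal NumPy sketch on synthetic data, using the standard adjustment theta = cov(Y, X) / var(X):

    import numpy as np

    rng = np.random.default_rng(7)
    n = 10_000

    # Pre-experiment covariate X and in-experiment metric Y, correlated.
    pre = rng.normal(100, 20, size=n)
    post = 0.8 * pre + rng.normal(0, 10, size=n)

    # CUPED: Y_cuped = Y - theta * (X - mean(X)), theta = cov(Y, X) / var(X).
    theta = np.cov(post, pre)[0, 1] / np.var(pre, ddof=1)
    post_cuped = post - theta * (pre - pre.mean())

    print(f"variance before: {post.var():.1f}, after CUPED: {post_cuped.var():.1f}")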
Link: GitHub
Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #statistics #abtesting #ab
@data_science_weekly
👍2