A Machine Learning Model Monitoring Checklist: 7 Things to Track
Once the model is deployed in production, you need to ensure it keeps performing and that you have accounted for data/model drift and other changes affecting accuracy and precision. Here comes model monitoring! This article looks into the specifics of model monitoring and explores open-source tools that you can start using today. It also features a short, 7-step checklist to help you make machine learning work in the real world.
https://bit.ly/3slqg96
Subscribe to our weekly digest — https://bit.ly/3smtHwp
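As a quick illustration of one item on such a checklist (not from the article itself), here is a minimal data-drift check: compare a feature's training distribution against a recent production window with a two-sample Kolmogorov-Smirnov test.

```python
# Minimal data-drift check: compare a feature's training distribution
# against recent production values with a two-sample KS test.
# Illustrative sketch only, not code from the linked article.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_values = rng.normal(loc=0.0, scale=1.0, size=5_000)  # reference window
prod_values = rng.normal(loc=0.3, scale=1.0, size=1_000)   # recent production window

statistic, p_value = ks_2samp(train_values, prod_values)
if p_value < 0.01:
    print(f"Possible drift (KS={statistic:.3f}, p={p_value:.4f}), investigate.")
else:
    print(f"No significant drift (KS={statistic:.3f}, p={p_value:.4f}).")
```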
Hi folks, DataScience Digest is back on track. In fact, our readers got the first email newsletter just yesterday. Interested in weekly updates about AI and ML too?
Kindly subscribe here: https://bit.ly/3acYHc4.
The newsletter is sent out every Wednesday. Stay tuned!
Weekly Awesome Tricks And Best Practices From Kaggle
Kaggle is a go-to destination for data scientists and ML engineers for a reason. It features tons of valuable resources and hosts competitions covering pretty much every existing and emerging topic in the industry. But how do you get the most out of the platform? Check out this article with tips, tricks, and best practices on using Kaggle during a typical data science workflow.
https://bit.ly/3agM0No
Subscribe to our weekly newsletter — https://bit.ly/3wTjKdg
Paper Review: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Microsoft Research Asia has presented a new vision Transformer, Swin Transformer, that can serve as a general-purpose backbone for computer vision, much like CNNs do in vision and Transformers do in natural language processing. The author provides a detailed review of the paper, exploring the do’s and don’ts of the new approach and the possibilities it offers for developing a unified architecture for CV and NLP tasks.
https://bit.ly/32nLZTu
Subscribe to our weekly newsletter — https://bit.ly/3tvunB8
Transferable Visual Words: Exploiting the Semantics of Anatomical Patterns for Self-supervised Learning
In this paper, Fatemeh Haghighi and co-authors introduce a new concept called "transferable visual words" (TransVW), designed to improve annotation efficiency for deep learning in medical image analysis. Learn about the team’s extensive experiments and the advantages TransVW has demonstrated. The research is available as code, pre-trained models, and curated visual words.
Paper — https://bit.ly/3gjtZlj
Code — https://bit.ly/32ms9rP
Subscribe to our weekly newsletter — https://bit.ly/3ahhZNd
How Graph Neural Networks (GNN) Work: Introduction to Graph Convolutions from Scratch
The title of this one is quite self-explanatory: the author explores graph neural networks and graph convolutions to explain how they work and how you can apply them, in theory and in practice, in your projects. All points are illustrated with code for convenience.
https://bit.ly/3mY2cYS
@DataScienceDigest
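For a taste of what "from scratch" means here, below is a minimal NumPy sketch of a single vanilla graph-convolution step (symmetrically normalized adjacency times features times weights). It illustrates the general idea only and is not the author's exact code.

```python
# A single vanilla graph-convolution step: H' = ReLU(D^-1/2 (A+I) D^-1/2 X W).
# Minimal NumPy sketch of the general idea, not the article's exact code.
import numpy as np

def gcn_layer(A, X, W):
    A_hat = A + np.eye(A.shape[0])                # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt      # symmetric normalization
    return np.maximum(A_norm @ X @ W, 0)          # propagate + ReLU

# Tiny example: 4 nodes, 3 input features, 2 output features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
X = np.random.randn(4, 3)
W = np.random.randn(3, 2)
print(gcn_layer(A, X, W).shape)  # (4, 2)
```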
OpenCV Face Detection with Haar Cascades
Face detection is one of the most popular Computer Vision use cases (at least, as perceived by the general public). Learning how to use OpenCV and Haar cascades can be critical if you want to go deeper into the field, and this detailed tutorial provides a fresh and easy start for new learners. Just follow the instructions step by step and see the results in action.
https://bit.ly/3v5C3KB
@DataScienceDigest
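If you want a preview before opening the tutorial, the core OpenCV calls look roughly like this (a minimal sketch that assumes a local photo.jpg, not the tutorial's exact code):

```python
# Minimal Haar-cascade face detection with OpenCV.
# Sketch for orientation only; the tutorial walks through this in detail.
import cv2

cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

image = cv2.imread("photo.jpg")                  # any local image path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                  minSize=(30, 30))

for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.jpg", image)
print(f"Detected {len(faces)} face(s)")
```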
Lviv Data Science Summer School
Hi folks,
I’m pleased to invite you all to enroll in the Lviv Data Science Summer School and delve into advanced methods and tools of Data Science and Machine Learning, including domains such as CV, NLP, Healthcare, Social Network Analysis, and Urban Data Science. The courses are practice-oriented and geared towards undergraduates, Ph.D. students, and young professionals (intermediate level). The program runs July 19–30 and will be hosted online. Make sure to apply — spots are filling up fast!
https://bit.ly/2Qc0QOx
@DataScienceDigest
Interpretable Machine Learning: A Guide for Making Black Box Models Explainable
The book by Christoph Molnar goes deep to explain how to make supervised machine learning models more interpretable. You’ll start by exploring the concepts of interpretability to learn about simple, interpretable models such as decision trees, decision rules, and linear regression. Then, you’ll look into general model-agnostic methods for interpreting black box models like feature importance and accumulated local effects and explaining individual predictions with Shapley values and LIME. The book focuses on ML models for tabular data and less on computer vision and natural language processing tasks. Reading the book is recommended for machine learning practitioners, data scientists, statisticians, and anyone else interested in making machine learning models interpretable.
https://bit.ly/3sH8Ofq
@DataScienceDigest
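To make the Shapley-value part concrete, here is a minimal sketch (not from the book) of explaining a tree-based model with the shap library on a toy regression dataset:

```python
# Minimal sketch of explaining a tree-based model with Shapley values (shap).
# Illustrative only; the book covers the theory behind these values.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])  # per-feature contributions

# Which features push predictions up or down across the sample
shap.summary_plot(shap_values, X.iloc[:100])
```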
Zero-Shot Learning: Can You Classify an Object Without Seeing It Before?
Developing machine learning models that can make predictions on data they have never seen before has become an important research area called zero-shot learning. Humans tend to be pretty good at recognizing things we have never seen before, and zero-shot learning offers a possible path toward mimicking this powerful capability.
https://bit.ly/3xxMF7c
@DataScienceDigest
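As a hands-on aside that is not from the article: the Hugging Face transformers library ships a zero-shot classification pipeline built on NLI models, which lets you classify text into labels the model never saw during training.

```python
# Zero-shot text classification with an NLI-based model.
# Side illustration of the idea; the linked article is about the concept itself.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The new GPU delivers twice the throughput at half the power draw.",
    candidate_labels=["hardware", "cooking", "politics", "sports"],
)
print(result["labels"][0], result["scores"][0])  # most likely label and its score
```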
Shedding Light on Fairness in AI with a New Data Set
Bias and fairness in AI are hotly debated topics. To address the problem, Facebook AI has created Casual Conversations, a new dataset of 45,186 videos of participants having non-scripted conversations, to help AI researchers identify and evaluate the fairness of their computer vision and audio models across subgroups of age, gender, apparent skin tone, and ambient lighting.
https://bit.ly/3tRt3bX
@DataScienceDigest
VideoGPT: Video Generation using VQ-VAE and Transformers
In this research paper, Wilson Yan et al. present VideoGPT, a simple architecture for scaling likelihood-based generative modeling to natural videos. Despite its simplicity, it can generate samples competitive with advanced GAN models for video generation, as well as high-fidelity natural videos from UCF-101 and the Tumblr GIF (TGIF) dataset.
Paper - https://bit.ly/3aHbpAa
Code - https://bit.ly/32NQQxw
Demo - https://bit.ly/3dUzuFF
@DataScienceDigest
Data Science Digest — 28.04.21
The new issue of DataScience Digest is here! Hop in to learn about the latest articles, tutorials, research papers, and books on Data Science, AI, ML, and Big Data. All sections are prioritized for your convenience. Enjoy!
https://bit.ly/3nrBYOT
Join 👉 @DataScienceDigest
Token Labeling: Training a 85.4% Top-1 Accuracy Vision Transformer with 56M Parameters on ImageNet
In this paper, Zihang Jiang, Qibin Hou et al. explore vision transformers applied to ImageNet classification. They have developed new training techniques to demonstrate that by slightly tuning the structure of vision transformers and introducing token labeling, the models can achieve better results than their CNN counterparts and other transformer-based classification models.
Paper - https://bit.ly/32VtFRQ
Code - https://bit.ly/3eDxAbn
Boosting Natural Language Processing with Wikipedia
In this hands-on tutorial, Nicola Melluso explains how you can take advantage of Wikipedia to improve your Natural Language Processing models. To illustrate how it works, he takes NLP tasks such as Named Entity Recognition and Topic Modeling and walks through them step by step, explaining how to collect and process data, build and train the models, and more.
https://bit.ly/3tfdiul
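As a rough, generic illustration of the idea of enriching extracted entities with Wikipedia metadata, here is a small sketch using spaCy and the wikipedia package. Note that this is not the tutorial's actual approach, just a flavour of the concept.

```python
# Generic sketch: enrich spaCy named entities with Wikipedia categories.
# Not the tutorial's actual approach, only an illustration of the idea.
import spacy
import wikipedia

nlp = spacy.load("en_core_web_sm")
doc = nlp("Tesla opened a new factory near Berlin last year.")

for ent in doc.ents:
    try:
        page = wikipedia.page(ent.text, auto_suggest=False)
        print(ent.text, ent.label_, "->", page.categories[:3])
    except wikipedia.exceptions.WikipediaException:
        print(ent.text, ent.label_, "-> no unambiguous Wikipedia page")
```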
NLP Profiler
A simple but useful NLP library created by @neomatrix369. It enables Data Science practitioners to easily profile datasets with one, two, or more text columns. Given a dataset and the name of a column containing text data, the library returns either high-level insights or low-level/granular statistical information about the text in that column. Check out the library and let us know what you think.
https://bit.ly/3xEgSBp
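Basic usage is essentially a one-liner over a pandas DataFrame. The sketch below follows the project's README as we recall it, so double-check the repo for the current API.

```python
# Profile a text column with nlp_profiler (based on the project's README;
# verify against the repo for the current API).
import pandas as pd
from nlp_profiler.core import apply_text_profiling

df = pd.DataFrame({"text": ["What a great day!", "This is terrible...", "Meh."]})
profiled_df = apply_text_profiling(df, "text")  # adds granular per-row stats
print(profiled_df.columns.tolist())
```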
Deep Learning for Audio with the Speech Commands Dataset
If you want to learn how to train a simple model on the Speech Commands audio dataset, this article by Peter Gao is for you. He explains how to choose a dataset and handle data, how to train, test, and tune the model, and, most importantly, how to do error analysis (and analyze failure cases) to improve model performance over time.
https://bit.ly/2SbUIpR
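If you'd rather poke at the data before reading, torchaudio ships a ready-made loader for Speech Commands. The sketch below is independent of the article's own setup.

```python
# Load the Speech Commands dataset with torchaudio
# (independent of the article's setup).
import torchaudio

dataset = torchaudio.datasets.SPEECHCOMMANDS(root="./data", download=True)
waveform, sample_rate, label, speaker_id, utterance_number = dataset[0]
print(label, sample_rate, waveform.shape)  # e.g. a one-second 16 kHz clip
```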
skweak: Weak Supervision Made Easy for NLP
In this paper, Pierre Lison et al. present skweak, a versatile, Python-based software toolkit to help NLP developers apply weak supervision to a wide range of NLP tasks. The toolkit makes it easy to implement a large spectrum of labeling functions (such as heuristics, gazetteers, neural models, or linguistic constraints) on text data, apply them on a corpus, and aggregate their results in a fully unsupervised fashion.
Paper — https://bit.ly/3tk0ORU
Code — https://bit.ly/33aEmAj
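To give a flavour of what a labelling function looks like, here is a small sketch adapted from the patterns in skweak's README; verify the exact names against the current API in the repo.

```python
# Sketch of a skweak labelling function + aggregation, adapted from the
# patterns in the project's README (verify against the current API).
import spacy
from skweak import heuristics, aggregation

def money_detector(doc):
    # Yield (start_token, end_token, label) spans for simple money mentions
    for tok in doc[1:]:
        if tok.text[0].isdigit() and tok.nbor(-1).is_currency:
            yield tok.i - 1, tok.i + 1, "MONEY"

lf = heuristics.FunctionAnnotator("money", money_detector)

nlp = spacy.load("en_core_web_sm")
doc = lf(nlp("The company paid $750 in fees last year."))

# Aggregate the (possibly many) labelling functions with an unsupervised HMM
hmm = aggregation.HMM("hmm", ["MONEY"])
hmm.fit_and_aggregate([doc])
print(doc.spans["hmm"])  # aggregated spans, stored under the annotator's name
```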
Face Detection Tips, Suggestions, and Best Practices
In this tutorial, Adrian Rosebrock continues to explore the topic of face detection. You will learn his tips, suggestions, and best practices for achieving high face detection accuracy with OpenCV and dlib. Though the tutorial is mostly theoretical, it features code and tons of useful links inside.
https://bit.ly/3ehR0na
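On the dlib side, the workhorse is usually the HOG-based frontal face detector; for orientation, the basic calls look like this (a sketch assuming a local photo.jpg, not the tutorial's code):

```python
# Minimal dlib frontal-face detection (HOG-based), sketch only.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()

image = cv2.imread("photo.jpg")                  # any local image path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
rects = detector(gray, 1)                        # 1 = upsample once for small faces

for r in rects:
    cv2.rectangle(image, (r.left(), r.top()), (r.right(), r.bottom()), (0, 255, 0), 2)
cv2.imwrite("faces_dlib.jpg", image)
print(f"Detected {len(rects)} face(s)")
```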
Data Science Digest — 05.05.21
The new issue of DataScience Digest is here! Hop in to learn about the latest articles, tutorials, research papers, and projects on Data Science, AI, ML, and Big Data. All sections are prioritized for your convenience. Enjoy!
https://bit.ly/33mYRd3
Join 👉 @DataScienceDigest
Improving Model Performance Through Human Participation
In this article, Preetam Joshi (Netflix) and Mudit Jain (Google) explore the complex topic of human-AI cooperation. Specifically, they explain how adding human input to the model inference loop (human-in-the-loop) at inference time can improve final precision and recall, and how to incorporate that feedback in practice.
https://bit.ly/3eUWm7b
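The basic pattern is easy to sketch: route low-confidence predictions to a human reviewer and auto-accept the rest. Below is a generic illustration, not the authors' actual system.

```python
# Generic human-in-the-loop routing sketch: low-confidence predictions go to
# a reviewer, confident ones are accepted automatically. Not the authors' system.
def route_prediction(label, confidence, threshold=0.9):
    """Return the final label and whether a human was involved."""
    if confidence >= threshold:
        return label, False                  # auto-accept
    reviewed_label = ask_human(label)        # hypothetical review step/queue
    return reviewed_label, True

def ask_human(suggested_label):
    # Placeholder for a real review UI or labeling queue
    return suggested_label

final_label, needed_review = route_prediction("spam", confidence=0.72)
print(final_label, needed_review)            # routed to review because 0.72 < 0.9
```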