Stanford MLSys Seminar Series
In this seminar series, we want to take a look at the frontier of machine learning systems, and how machine learning changes the modern programming stack. Our goal is to help curate a curriculum of awesome work in ML systems to help drive research focus to interesting questions.
https://stanford.io/3dFVl1J
#dataset
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation
MSeg is a composite dataset that unifies semantic segmentation datasets from different domains. To build it, the authors reconciled the taxonomies and brought the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
https://bit.ly/2Hfqc9A
Druid - Interactive Analytics At Scale
Druid is one of the most useful and popular tools in the Big Data world. This OLAP system lets you process, store, and query data efficiently, which explains the demand for Druid among Big Data processing tools.
Together with Vladimir Iordanov, we will talk about how Druid works, what it consists of, and what it can do. Vladimir will introduce Druid's components and describe the cluster architecture and how data is processed. We will also see in practice how to work with it.
About the speaker:
🔸 Vladimir Iordanov is a BigData Tech Lead at Lohika and has worked on Big Data projects for more than six years. His latest project was directly related to Apache Druid. On that project, Vladimir gained extensive experience supporting and developing for this system in production. He will share that experience at BigData Odesa #TechTalks.
🔹Where: online
🔹When: 17.12.2020 at 19:00
🔹Talk language: Russian
🔹Registration is required
🔹Admission: donation. We would be grateful if you donated any amount you are comfortable with to the “Корпорация монстров” charity foundation. Ways to donate are listed on the registration page.
Registration for the online broadcast => https://docs.google.com/forms/d/e/1FAIpQLSdALs-5FgvDKwoAVqJkbvYKOR0UTLrWTvMszDUGl1HIylISrA/viewform
As always, an interesting topic and a useful evening await you. You will learn how to apply Druid effectively in your projects when building a big data processing system.
Join us!
Hey folks!
I know, I know… You’ve all been wondering for quite a while if this digest is actually dead or kinda not exactly. I’m here to say that it’s alive and kicking, and I’m — finally — back.
Now, it makes sense to explain where I’ve been all this time. Well, I have two words for you: Covid-19 and a child (probably shouldn’t use these in one sentence).
With quarantines imposed, most people had to sit at home and watch TV series… But that wasn’t and isn’t the case if you’re a developer. Just WFH, baby. And speaking of babies: my son was born in May 2020, so my plate was extra full.
Anyway, the situation has stabilized, and I’m back. So, what can you expect to see next?
- Daily updates on all things AI/ML/Data on Telegram and other social media
- Weekly email newsletter with Top AI news and resources
- Quizzes and surveys
- Webinars
- Fun stuff
- etc
I hope this sorted things out a bit. Stay tuned and have a great one, all!
P. S. Updates will start arriving soon.
Best regards,
Dmitry Spodarets
@DataScienceDigest
Recommendation Algorithms & System Designs of YouTube, Spotify, Airbnb, Netflix, TikTok, and Uber
Ever wondered how top technology companies can so accurately predict their customers’ next step? Look no further! In this article, you will find a collection of recommendation algorithms and system designs that they may be using. Note that the author collected all the info from open sources and draws on his personal experience; there is no guarantee that the designs are 100% correct.
https://bit.ly/3m0y0Mo
@DataScienceDigest
17 Types of Similarity and Dissimilarity Measures Used in Data Science
Measuring similarity and dissimilarity is important in any data science work, simply because it lets you “see” how close or distant data objects are to each other. In this article, the author takes a deep dive into 17 types of similarity and dissimilarity measures to help you navigate the various metrics and their applications in Data Science. The amount of work done by the author is massive, so the article is definitely worth checking out.
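To make the idea concrete, here is a minimal Python sketch of four widely used measures: Euclidean distance, Manhattan distance, cosine similarity, and Jaccard similarity. It is an illustration only; the function names are hypothetical and the selection is an assumption, not the article's own list or code.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def jaccard_similarity(s, t):
    """Share of common elements between two non-empty collections."""
    s, t = set(s), set(t)
    return len(s & t) / len(s | t)
```

Distances grow as objects move apart, while similarities grow as they get closer, so the two families are read in opposite directions.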
https://bit.ly/3ub1YQM
Join 👉 @DataScienceDigest
R vs. Python vs. Julia
Data Scientists are used to writing code in R and Python, but Julia, a newer programming language for Data Science, may become their language of choice. Julia promises C-like performance without compromising the way Data Scientists write code and interact with data, and brings a refreshing programming mindset to the DS community. Eager to learn more about Julia and how it stacks up against R and Python? Check out this article!
https://bit.ly/3maeaOR
Join 👉 @DataScienceDigest
Mixing Normal Images and Adversarial Images When Training CNNs
This tutorial will guide you on the path to Computer Vision mastery. As the first step, you will learn how to generate batches that mix normal images and adversarial images during training to improve your model’s ability to generalize and defend against adversarial attacks. The tutorial is thorough and covers everything CNN beginners may need to start preparing images for their networks.
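The batch-mixing idea can be sketched without any deep learning framework. The example below is a toy under stated assumptions: a logistic model stands in for the CNN, the attack is the fast gradient sign method (FGSM, one common choice that may differ from the tutorial's), and all names (`fgsm`, `mixed_batches`, `eps`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(w, b, x, y):
    """Gradient of binary cross-entropy w.r.t. the input x (logistic model)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def fgsm(w, b, x, y, eps=0.1):
    """Move each feature by eps in the direction that increases the loss."""
    grad = input_gradient(w, b, x, y)
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def mixed_batches(data, w, b, batch_size=4, eps=0.1, seed=0):
    """Yield batches made of half clean and half adversarial samples."""
    rng = random.Random(seed)
    half = batch_size // 2
    while True:
        clean = rng.sample(data, half)
        attacked = [(fgsm(w, b, x, y, eps), y) for x, y in rng.sample(data, half)]
        batch = clean + attacked
        rng.shuffle(batch)
        yield batch
```

In real training, `w` and `b` would be the current model parameters, so the adversarial half of each batch is regenerated against the model as it learns.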
https://bit.ly/39yTvPg
Join 👉 @DataScienceDigest
Unit Testing Python Code in Jupyter Notebooks
Writing unit tests appears to be the natural way of doing things when you work on production code, library code, or have to engage with test-driven environments. But what about writing unit tests for your Jupyter notebooks? In some cases, you don’t need it, but when you go beyond data exploration, you definitely do — you just have to maintain and monitor your code in the long run. Learn how unit tests can help you make your Data Science work more efficient, and then try it in practice. You won’t be disappointed!
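As a rough illustration of what a notebook-friendly test cell can look like, here is a minimal sketch using the standard `unittest` module. The `normalize` function and the `run_tests` helper are hypothetical examples, not code from the article.

```python
import unittest


def normalize(values):
    """Scale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


class TestNormalize(unittest.TestCase):
    def test_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])


def run_tests(test_case):
    """Run one TestCase class in-process and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests(TestNormalize)
```

Running the suite through a loader avoids the command-line parsing that `unittest.main()` performs, which tends to trip over the notebook kernel's own arguments; `unittest.main(argv=[""], exit=False)` is a common alternative.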
https://bit.ly/3sPHr3W
Join 👉 @DataScienceDigest
Awesome AI-ML-DL Repository
If you are as interested in all things AI/ML/DL as we are, check out this awesome repository by @neomatrix369 that features study notes and a curated list of helpful resources on the topic. The repo is created by and for engineers, developers, data scientists, and all other professionals looking to sharpen their mastery of AI, ML, and DL. As with other repos on GitHub, you can easily contribute to, watch, star, fork, and share the repo with others in the tech community.
https://bit.ly/3upveTN
Zero-Shot Text-to-Image Generation
Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. Such assumptions may involve complex architectures, auxiliary losses, or side information supplied during training. Discover a simpler method of text-to-image generation based on a transformer that autoregressively models the text and image tokens as a single stream of data. Learn why, with sufficient data and scale, it is competitive with previous domain-specific models when evaluated in a zero-shot fashion.
https://bit.ly/3fNxuQI
Join 👉 @DataScienceDigest
Hey folks,
I need to ask you something.
We plan to launch a series of free webinars in May, and I’d appreciate it if you shared your thoughts on which topics we should focus on first. What are you most interested in? Feel free to share as many topics and ideas as you like in the comments section!
P. S. We’re also looking for speakers. So, if you have some insights to share with the community, we’re ready to provide the platform. Kindly ping me (@dmitryspodarets) to discuss next steps and details.
Thank you!
Best regards,
Dmitry Spodarets
@DataScienceDigest
The Applied Machine Learning Course at Cornell Tech
Starting from the very basics, the course covers the most important ML algorithms and how to apply them in practice. The slides are Jupyter notebooks with programmatically generated figures so that readers can tweak parameters and regenerate the figures themselves. The course explores topics such as how to prioritize model improvements, diagnose overfitting, perform error analysis, visualize loss curves, etc.
Course Videos: https://bit.ly/3dLKZha
Course Materials: https://bit.ly/2Rmlj3e
Join 👉 @DataScienceDigest
Transfer Learning and Data Augmentation Applied to the Simpsons Image Dataset
In this article, the author uses the Simpsons characters dataset to experiment with data augmentation for transfer learning. From image filtering to splitting the data into training, validation, and test sets, a series of experiments is conducted to address the problem of small datasets and overfitting. Check out this step-by-step guide to learn about the results and the final metrics that the experiments yielded.
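A minimal sketch of the split-then-augment pattern, assuming images are plain nested lists and that a horizontal flip is an acceptable augmentation. The fractions, seed, and function names are hypothetical and are not taken from the article.

```python
import random


def split_dataset(samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle once, then cut into train / validation / test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_frac)
    n_test = round(len(shuffled) * test_frac)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


def horizontal_flip(image):
    """Mirror an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]


def augment(train):
    """Add a flipped copy of every training image, keeping its label."""
    return train + [(horizontal_flip(img), label) for img, label in train]
```

Augmenting only the training part keeps the validation and test sets free of near-duplicates, so the reported metrics still reflect unseen images.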
https://bit.ly/3wQuqJN
Join 👉 @DataScienceDigest
How to Deploy Machine Learning / Deep Learning Models to the Web
Machine learning and deep learning models should not exist in a vacuum (that is, only in theoretical environments). They need to be deployed in production and used by businesses and customers. In this article, the author provides a step-by-step guide on deploying models to the web and accessing them as a REST API using Heroku and GitHub. You will also learn how to access that API using the Python requests module and cURL.
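The request/response shape of such an API can be sketched with the Python standard library alone. Everything here is an assumption made for illustration: the `/predict` path, the JSON fields, and the fixed-weight `predict` function standing in for a trained model. The article's own stack (Heroku, the requests module) is not reproduced.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(raw_body):
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(raw_body)
    return json.dumps({"prediction": predict(payload["features"])}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def call_api(url, features):
    """Client side: POST features as JSON and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"features": features}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def demo():
    """Serve on a free local port, call the API once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return call_api(f"http://127.0.0.1:{server.server_port}/predict", [2.0, 4.0])
    finally:
        server.shutdown()
        server.server_close()
```

Against a hosted deployment, the same POST could be sent with the requests module or cURL, as the article describes; the standard library is used here only so the sketch has no dependencies.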
https://bit.ly/2OKpXqI
Join 👉 @DataScienceDigest
A Machine Learning Model Monitoring Checklist: 7 Things to Track
Once the model is deployed in production, you need to ensure that it keeps performing well and that you have accounted for data/model drift and other changes affecting accuracy and precision. This is where model monitoring comes in! This article looks into the specifics of model monitoring and explores open-source tools that you can start using today. It also features a short, 7-step checklist to help you make machine learning work in the real world.
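One simple way to put a number on data drift is the Population Stability Index (PSI), which compares how a feature is distributed in a reference sample and in recent production data. The sketch below is a generic illustration, not a tool or metric taken from the article; the bin count and the `eps` floor for empty bins are assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI of zero means the two binned distributions match. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, but that threshold should be tuned per feature.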
https://bit.ly/3slqg96
Subscribe to our weekly digest — https://bit.ly/3smtHwp
Hi folks, DataScience Digest is back on track. In fact, our readers got the first email newsletter just yesterday. Interested in weekly updates about AI and ML, too?
Kindly subscribe here: https://bit.ly/3acYHc4.
The newsletter is sent out every Wednesday. Stay tuned!
Weekly Awesome Tricks And Best Practices From Kaggle
Kaggle is a go-to destination for data scientists and ML engineers for a reason. It features tons of valuable resources and hosts competitions covering pretty much every existing or emerging topic in the industry. But how do you get the most out of the platform? Check out this article with tips, tricks, and best practices on using Kaggle during a typical data science workflow.
https://bit.ly/3agM0No
Subscribe to our weekly newsletter — https://bit.ly/3wTjKdg
Paper Review: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Microsoft Research Asia has presented a brand new vision Transformer called Swin Transformer that can serve as a general-purpose backbone for computer vision, much as CNNs do in vision and Transformers do in natural language processing. The author provides a detailed review of the paper, exploring the strengths and weaknesses of the new approach and the possibilities it offers for developing a unified architecture for CV and NLP tasks.
https://bit.ly/32nLZTu
Subscribe to our weekly newsletter — https://bit.ly/3tvunB8
Transferable Visual Words: Exploiting the Semantics of Anatomical Patterns for Self-supervised Learning
In this paper, Fatemeh Haghighi and the team of authors introduce a new concept called “transferable visual words” (TransVW), which is designed to help achieve annotation efficiency for deep learning in medical image analysis. Learn about the team’s extensive experiments and the advantages that TransVW has demonstrated. The research is available as code, pre-trained models, and curated visual words.
Paper — https://bit.ly/3gjtZlj
Code — https://bit.ly/32ms9rP
Subscribe to our weekly newsletter — https://bit.ly/3ahhZNd
How Graph Neural Networks (GNN) Work: Introduction to Graph Convolutions from Scratch
The title of this one is quite self-explanatory: the author explores graph neural networks and graph convolutions to explain how they work and how you can apply them in theory and practice in your projects. All points are illustrated with code for convenience.
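For a feel of what a graph convolution computes, here is a dependency-free sketch of a single layer using the symmetric normalization popularized by Kipf and Welling: add self-loops, normalize the adjacency matrix by node degree, average neighbor features, then apply a linear map and ReLU. It is not the article's code, and the helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Compute D^(-1/2) (A + I) D^(-1/2) for an adjacency matrix A."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbors, transform, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in transformed]
```

For a two-node graph with one edge, each node ends up with the average of its own and its neighbor's features before the linear map is applied.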
https://bit.ly/3mY2cYS
@DataScienceDigest