Data Phoenix – Telegram
Data Phoenix
Data Phoenix is your best friend in learning and growing in the data world!
We publish digests, organize events, and help expand the frontiers of your knowledge in ML, CV, NLP, and other aspects of AI. Idea and implementation: @dmitryspodarets
Unit Testing Python Code in Jupyter Notebooks

Writing unit tests feels like the natural way of doing things when you work on production code, library code, or in test-driven environments. But what about writing unit tests for your Jupyter notebooks? For pure data exploration you may not need them, but once you go beyond exploration and have to maintain and monitor your code in the long run, you definitely do. Learn how unit tests can make your data science work more efficient, and then try it in practice. You won’t be disappointed!
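As a minimal sketch of the idea (the function and tests below are hypothetical, not taken from the article), a notebook cell can define a function and run a unittest suite in place without exiting the kernel:

```python
import unittest

def total_price(row):
    """Toy feature-engineering function defined in a notebook cell (hypothetical)."""
    return row["price"] * row["qty"]

class TestTotalPrice(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(total_price({"price": 2.0, "qty": 3}), 6.0)

    def test_zero_quantity(self):
        self.assertEqual(total_price({"price": 9.99, "qty": 0}), 0)

# argv/exit keep unittest from reading Jupyter's own argv and killing the kernel
unittest.main(argv=["ignored"], exit=False)
```

If you prefer pytest, packages such as ipytest offer a similar in-notebook workflow.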

https://bit.ly/3sPHr3W

Join 👉 @DataScienceDigest
​​Awesome AI-ML-DL Repository

If you are interested in all things AI/ML/DL, as we are, check out this awesome repository by @neomatrix369 that features study notes and a curated list of helpful resources on the topic. The repo is created by and for engineers, developers, data scientists, and other professionals looking to sharpen their mastery of AI, ML, and DL. As with other repos on GitHub, you can easily contribute to, watch, star, fork, and share it with the tech community.

https://bit.ly/3upveTN
​​Zero-Shot Text-to-Image Generation

One of the major tasks of text-to-image generation is finding better modeling assumptions for training on a fixed dataset. Such assumptions involve complex architectures, auxiliary losses, or side information that make it more difficult to handle data. Discover a new method of text-to-image generation based on a transformer that autoregressively models the text and image tokens as a single stream of data. Learn why it is competitive for at-scale data compared to previous domain-specific models when evaluated in a zero-shot fashion.
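The single-stream idea can be sketched in a few lines (the token ids below are made up; the actual model uses BPE text tokens and discrete-VAE image codes):

```python
# Text and image tokens are concatenated into one sequence and modeled
# autoregressively: each position predicts the next token, regardless of modality.
text_tokens = [12, 407, 3]            # hypothetical BPE ids for a caption
image_tokens = [1024 + 88, 1024 + 5]  # hypothetical image codes, offset past the text vocab
stream = text_tokens + image_tokens

# Training pairs for a next-token objective over the single stream:
pairs = [(stream[:t], stream[t]) for t in range(1, len(stream))]
for context, target in pairs:
    print(context, "->", target)
```

Nothing in the objective distinguishes the modalities; the boundary between text and image is just a position in the stream.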

https://bit.ly/3fNxuQI

Join 👉 @DataScienceDigest
​​Hey folks,

I need to ask you something.

We plan to launch a series of free webinars in May, and I’d appreciate it if you shared your thoughts on which topics we should focus on first. What are you interested in the most? Feel free to share as many topics and ideas as you like in the comments section!

P. S. We’re also looking for speakers. So, if you have some insights to share with the community, we’re ready to provide the platform. Kindly ping me (@dmitryspodarets) to discuss the next steps and details.

Thank you!

Best regards,
Dmitry Spodarets
@DataScienceDigest
​​The Applied Machine Learning Course at Cornell Tech

Starting from the very basics, the course covers the most important ML algorithms and how to apply them in practice. The slides are Jupyter notebooks with programmatically generated figures so that readers can tweak parameters and regenerate the figures themselves. The course explores topics such as how to prioritize model improvements, diagnose overfitting, perform error analysis, visualize loss curves, etc.
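As a tiny illustration of one of those diagnostics, reading train vs. validation loss curves (the numbers below are invented):

```python
def overfitting_gap(train_losses, val_losses):
    """Crude overfitting signal: validation loss rising while training loss keeps falling."""
    train_still_improving = train_losses[-1] < train_losses[-2]
    val_getting_worse = val_losses[-1] > min(val_losses)
    return train_still_improving and val_getting_worse

train = [1.2, 0.8, 0.5, 0.3, 0.2]
val = [1.3, 0.9, 0.7, 0.75, 0.9]    # starts climbing after epoch 3: overfitting
print(overfitting_gap(train, val))  # → True
```

In practice you would plot both curves per epoch, but the signal is exactly this divergence.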

Course Videos: https://bit.ly/3dLKZha
Course Materials: https://bit.ly/2Rmlj3e

Join 👉 @DataScienceDigest
​​Transfer Learning and Data Augmentation Applied to the Simpsons Image Dataset

In this article, the author uses the Simpsons characters dataset to experiment with data augmentation for transfer learning. From image filtering to splitting the data into training, test, and validation sets, a series of experiments is conducted to address the problems of small datasets and overfitting. Check out this step-by-step guide to learn about the results and the final metrics the experiments yielded.
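As a toy illustration of what a single augmentation does (a horizontal flip on a 2×3 "image"; not code from the article, which uses full image-processing libraries):

```python
def hflip(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

image = [[1, 2, 3],
         [4, 5, 6]]
print(hflip(image))  # → [[3, 2, 1], [6, 5, 4]]
```

Real pipelines chain many such transforms (flips, crops, rotations, color jitter) to multiply a small dataset.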

https://bit.ly/3wQuqJN

Join 👉 @DataScienceDigest
​​How to Deploy Machine Learning / Deep Learning Models to the Web

Machine and deep learning models should not exist in a vacuum (purely theoretical environments). They need to be deployed to production and used by businesses and customers. In this article, the author provides a step-by-step guide to deploying models to the web and accessing them as a REST API using Heroku and GitHub. You will also learn how to access that API using the Python requests module and cURL.

https://bit.ly/2OKpXqI

Join 👉 @DataScienceDigest
​​A Machine Learning Model Monitoring Checklist: 7 Things to Track

Once a model is deployed to production, you need to ensure it keeps performing and that you have accounted for data/model drift and other changes affecting accuracy and precision. This is where model monitoring comes in! This article looks into the specifics of model monitoring and explores open-source tools you can start using today. It also features a short, 7-step checklist to help you make machine learning work in the real world.
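One of the things such a checklist tracks, feature drift, can be approximated with a very simple test (the threshold and data below are made up):

```python
def mean_shift(reference, live, threshold=0.2):
    """Flag drift when a feature's live mean moves more than `threshold`
    standard deviations away from its reference (training-time) mean."""
    ref_mean = sum(reference) / len(reference)
    ref_var = sum((x - ref_mean) ** 2 for x in reference) / len(reference)
    ref_std = ref_var ** 0.5
    live_mean = sum(live) / len(live)
    return abs(live_mean - ref_mean) > threshold * ref_std

reference = [10, 11, 9, 10, 12, 8, 10, 11]
print(mean_shift(reference, [10, 11, 10, 9]))   # → False (stable)
print(mean_shift(reference, [14, 15, 16, 15]))  # → True (drifted)
```

Production monitors use sturdier statistics (PSI, KS tests) per feature, but the idea is the same: compare live data against a reference window and alert on divergence.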

https://bit.ly/3slqg96

Subscribe to our weekly digest — https://bit.ly/3smtHwp
Hi folks, DataScience Digest is back on track. In fact, our readers got the first email newsletter just yesterday. Interested in weekly updates about AI and ML too?

Kindly subscribe here: https://bit.ly/3acYHc4.

The newsletter is sent out every Wednesday. Stay tuned!
​​Weekly Awesome Tricks And Best Practices From Kaggle

Kaggle is a go-to destination for data scientists and ML engineers for a reason. It features tons of valuable resources and hosts competitions covering pretty much every existing and emerging topic in the industry. But how do you get the most out of the platform? Check out this article with tips, tricks, and best practices for using Kaggle in a typical data science workflow.

https://bit.ly/3agM0No

Subscribe to our weekly newsletter — https://bit.ly/3wTjKdg
​​Paper Review: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Microsoft Research Asia has presented a brand-new vision Transformer called Swin Transformer that can serve as a general-purpose backbone for computer vision, much like conventional CNNs do in vision and Transformers in natural language processing. The author provides a detailed review of the paper, exploring the do’s and don’ts of the new approach and the possibilities it offers for developing a unified architecture for CV and NLP tasks.
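The "windows" in question are just non-overlapping tiles of the feature map within which attention is computed; a generic partition step (a sketch, not the official Swin code) looks like:

```python
import numpy as np

def window_partition(x, window):
    """Split an (H, W, C) feature map into non-overlapping (window, window, C) tiles."""
    h, w, c = x.shape
    x = x.reshape(h // window, window, w // window, window, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window, window, c)

feat = np.arange(4 * 4 * 1).reshape(4, 4, 1)  # toy 4x4 map, 1 channel
windows = window_partition(feat, 2)
print(windows.shape)  # → (4, 2, 2, 1)
```

Swin's trick is to shift this grid by half a window every other layer so information flows between neighbouring tiles.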

https://bit.ly/32nLZTu

Subscribe to our weekly newsletter — https://bit.ly/3tvunB8
​​Transferable Visual Words: Exploiting the Semantics of Anatomical Patterns for Self-supervised Learning

In this paper, Fatemeh Haghighi and the team of authors introduce a new concept called “transferable visual words” (TransVW), designed to help achieve annotation efficiency for deep learning in medical image analysis. Learn about the team’s extensive experiments and the advantages TransVW has demonstrated. The research is available as code, pre-trained models, and curated visual words.

Paper — https://bit.ly/3gjtZlj
Code — https://bit.ly/32ms9rP

Subscribe to our weekly newsletter — https://bit.ly/3ahhZNd
​​How Graph Neural Networks (GNN) Work: Introduction to Graph Convolutions from Scratch

The title of this one is quite self-explanatory: the author explores graph neural networks and graph convolutions to explain how they work and how you can apply them, in theory and in practice, in your projects. All points are illustrated with code for convenience.
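The basic graph convolution such introductions build up to can be sketched in a few lines of NumPy (toy graph and identity weights chosen for readability):

```python
import numpy as np

# One graph-convolution layer in its simplest form: H' = D^-1 (A + I) H W,
# i.e. every node averages its own and its neighbours' features, then projects.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)    # adjacency of a 3-node path graph
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                # node features
W = np.eye(2)                             # identity weights for readability

A_hat = A + np.eye(3)                     # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalize by degree
H_next = D_inv @ A_hat @ H @ W
print(H_next)
```

Stacking such layers (with learned `W` and a nonlinearity between them) lets information propagate over increasingly distant neighbourhoods.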

https://bit.ly/3mY2cYS

@DataScienceDigest
​​OpenCV Face Detection with Haar Cascades

Face detection is one of the most popular computer vision use cases (at least as perceived by the general public). Learning how to use OpenCV and Haar cascades can be critical if you want to go deeper into the field — and this detailed tutorial provides a fresh and easy start for new learners. Just follow the instructions step by step and see the results in action.

https://bit.ly/3v5C3KB

@DataScienceDigest
​​Lviv Data Science Summer School

Hi folks,
I’m pleased to invite you all to enroll in the Lviv Data Science Summer School, to delve into advanced methods and tools of Data Science and Machine Learning, including such domains as CV, NLP, Healthcare, Social Network Analysis, and Urban Data Science. The courses are practice-oriented and geared towards undergraduates, Ph.D. students, and young professionals (intermediate level). The program runs July 19–30 and will be hosted online. Make sure to apply — spots are filling up fast!

https://bit.ly/2Qc0QOx

@DataScienceDigest
​​Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

The book by Christoph Molnar goes deep to explain how to make supervised machine learning models more interpretable. You’ll start by exploring the concepts of interpretability to learn about simple, interpretable models such as decision trees, decision rules, and linear regression. Then, you’ll look into general model-agnostic methods for interpreting black box models like feature importance and accumulated local effects and explaining individual predictions with Shapley values and LIME. The book focuses on ML models for tabular data and less on computer vision and natural language processing tasks. Reading the book is recommended for machine learning practitioners, data scientists, statisticians, and anyone else interested in making machine learning models interpretable.
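To make the Shapley idea concrete, here is an exact brute-force computation on a toy linear "black box" (real tools such as SHAP approximate this for models with many features):

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings in which features are switched from baseline to x."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]
            value = model(current)
            phi[i] += (value - prev) / len(orderings)
            prev = value
    return phi

model = lambda v: 2 * v[0] + 3 * v[1]   # toy linear "black box"
print(shapley_values(model, [1.0, 1.0], [0.0, 0.0]))  # → [2.0, 3.0]
```

For a linear model the values recover the coefficients, and in general they always sum to `model(x) - model(baseline)`, which is what makes them attractive for explaining individual predictions.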

https://bit.ly/3sH8Ofq

@DataScienceDigest
​​Zero-Shot Learning: Can You Classify an Object Without Seeing It Before?

Developing machine learning models that can make predictions on data they have never seen before has become an important research area called zero-shot learning. Humans tend to be pretty good at recognizing things we have never seen before, and zero-shot learning offers a possible path toward mimicking this powerful capability.
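A classic recipe for this is attribute-based classification; here is a toy sketch (the attribute vectors are invented) where a "zebra" can be recognized without any zebra training images, purely from its description:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical attribute vectors: [has_stripes, has_hooves, domestic]
class_attributes = {
    "zebra": [1.0, 1.0, 0.0],   # described, but never seen in training
    "horse": [0.0, 1.0, 1.0],
    "tiger": [1.0, 0.0, 0.0],
}

def zero_shot_classify(image_attributes):
    """Pick the class whose description best matches the predicted attributes."""
    return max(class_attributes, key=lambda c: cosine(image_attributes, class_attributes[c]))

print(zero_shot_classify([0.9, 0.8, 0.1]))  # → zebra
```

In a real system the attribute vector would be predicted from the image (or replaced by a semantic embedding of the class name), but the matching step is the same.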

https://bit.ly/3xxMF7c

@DataScienceDigest
​​Shedding Light on Fairness in AI with a New Data Set

Bias and fairness in AI are hotly debated topics. To address the problem, Facebook AI has created Casual Conversations, a new dataset consisting of 45,186 videos of participants having unscripted conversations, to help AI researchers identify and evaluate the fairness of their computer vision and audio models across subgroups of age, gender, apparent skin tone, and ambient lighting.

https://bit.ly/3tRt3bX

@DataScienceDigest
​​VideoGPT: Video Generation using VQ-VAE and Transformers

In this research paper, Wilson Yan et al. present VideoGPT, a simple architecture for scaling likelihood-based generative modeling to natural videos. Despite its simplicity, it can generate samples competitive with advanced GAN models for video generation, as well as high-fidelity natural images from UCF-101 and the Tumblr GIF dataset (TGIF).
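The VQ step at the heart of VQ-VAE can be sketched as a nearest-neighbour lookup in a learned codebook (the codebook and encoder outputs below are made up):

```python
import numpy as np

# Snap each encoder output to its nearest codebook vector and keep the
# index as a discrete token; a transformer then models these token sequences.
codebook = np.array([[0.0, 0.0],
                     [1.0, 1.0],
                     [2.0, 0.0]])

def quantize(z):
    """Return (token ids, quantized vectors) for encoder outputs z of shape (N, d)."""
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K)
    ids = dists.argmin(axis=1)
    return ids, codebook[ids]

z = np.array([[0.1, -0.2], [0.9, 1.2]])
ids, zq = quantize(z)
print(ids)  # → [0 1]
```

Those discrete ids are exactly what the GPT-style prior in VideoGPT is trained to predict autoregressively.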

Paper - https://bit.ly/3aHbpAa
Code - https://bit.ly/32NQQxw
Demo - https://bit.ly/3dUzuFF

@DataScienceDigest
​​Data Science Digest — 28.04.21

The new issue of DataScience Digest is here! Hop in to learn about the latest articles, tutorials, research papers, and books on Data Science, AI, ML, and Big Data. All sections are prioritized for your convenience. Enjoy!

https://bit.ly/3nrBYOT

Join 👉 @DataScienceDigest
​​Token Labeling: Training an 85.4% Top-1 Accuracy Vision Transformer with 56M Parameters on ImageNet

In this paper, Zihang Jiang, Qibin Hou, et al. explore vision transformers applied to ImageNet classification. They develop new training techniques and demonstrate that, by slightly tuning the structure of vision transformers and introducing token labeling, the models can achieve better results than their CNN counterparts and other transformer-based classification models.
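The token-labeling objective can be sketched as the usual class-token loss plus a dense per-patch loss, where every patch token gets its own machine-generated label (all numbers below are invented):

```python
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy of a single softmax prediction against an integer label."""
    logits = logits - logits.max()                    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

cls_logits = np.array([2.0, 0.1, -1.0])     # class-token prediction, true class 0
patch_logits = np.array([[1.5, 0.2, 0.0],   # one prediction per patch token
                         [0.1, 2.2, 0.3]])
patch_labels = [0, 1]                        # dense, location-specific labels

loss = cross_entropy(cls_logits, 0) + np.mean(
    [cross_entropy(l, y) for l, y in zip(patch_logits, patch_labels)]
)
print(float(loss))
```

The extra dense term gives every token a training signal instead of only the class token, which is the core of the reported gains.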

Paper - https://bit.ly/32VtFRQ
Code - https://bit.ly/3eDxAbn