Data Phoenix – Telegram
Data Phoenix
Data Phoenix is your best friend in learning and growing in the data world!
We publish a digest, organize events, and help expand the frontiers of your knowledge in ML, CV, NLP, and other aspects of AI. Idea and implementation: @dmitryspodarets
Understanding coordinate systems and DICOM for deep learning medical image analysis

An introduction to key concepts for deep learning in medical imaging, such as coordinate systems and DICOM data extraction, from a machine learning perspective.

https://bit.ly/3hSi0sB
Awesome GPT-3

This evolving GPT-3 collection includes links to some of the best demos and tutorials around the web. This is a great rabbit hole for anyone interested in understanding how GPT-3 works and where it's going.

https://bit.ly/30LQozy
Object Detection from 9 FPS to 650 FPS in 6 Steps

This article is a practical deep dive into making a specific deep learning model (Nvidia’s SSD300) run fast on a powerful GPU server, but the general principles apply to all GPU programming. The SSD300 is an object-detection model trained on COCO, so its output is a set of bounding boxes with probabilities for 81 object classes.

https://bit.ly/34YwqTd
The 2020 Data & AI Landscape

In this post, you will learn about:
- Key trends in data infrastructure
- Key trends in analytics & enterprise AI
- The 2020 landscape
- Who’s in, who’s out: noteworthy IPOs, M&A, and additions

https://bit.ly/378UCol
Using reinforcement learning to personalize AI-accelerated MRI scans

Our early experiments with the fastMRI data set show that our models outperform the previous active MRI acquisition state of the art over a broad range of acceleration factors.

https://bit.ly/3j1tOsz
Putting ML in Production

A guide and case study on MLOps for software engineers, data scientists and product managers. Deploy ML to production for a real product with live data using open source tools.

https://bit.ly/2GXFb89
Stanford MLSys Seminar Series

In this seminar series, we want to take a look at the frontier of machine learning systems, and how machine learning changes the modern programming stack. Our goal is to help curate a curriculum of awesome work in ML systems to help drive research focus to interesting questions.

https://stanford.io/3dFVl1J
#dataset
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation

MSeg is a composite dataset that unifies semantic segmentation datasets from different domains. In this dataset, authors reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.

https://bit.ly/2Hfqc9A
Druid - Interactive Analytics At Scale

Druid is one of the most useful and popular tools in the Big Data world. This OLAP system lets you process, store, and query data efficiently, which explains the demand for Druid among Big Data processing tools.
Together with Vladimir Iordanov, we will talk about how Druid works, what it consists of, and what it can do. Vladimir will introduce Druid's components and explain the cluster architecture and how data processing works. We will also get to see in practice how to work with it.

About the speaker:
🔸 Vladimir Iordanov is a BigData Tech Lead at Lohika and has spent more than six years working on Big Data projects. His most recent project was directly related to Apache Druid, and during it he gained extensive experience supporting and developing for this system in production. He will share that experience at BigData Odesa #TechTalks.

🔹Where: online
🔹When: 17.12.2020 at 19:00
🔹Talk language: Russian
🔹Registration is required
🔹Admission: donation. We would be grateful if you donated any amount you are comfortable with to the charity foundation “Корпорация монстров”. Ways to donate are listed on the registration page.

Register for the online broadcast => https://docs.google.com/forms/d/e/1FAIpQLSdALs-5FgvDKwoAVqJkbvYKOR0UTLrWTvMszDUGl1HIylISrA/viewform

As always, expect an interesting topic and a useful evening. You will learn how to apply Druid effectively in your own projects when building a big data processing system.

Join us!
Hey folks!

I know, I know… You’ve all been wondering for quite a while if this digest is actually dead or kinda not exactly. I’m here to say that it’s alive and kicking, and I’m — finally — back.

Now, it makes sense to explain where I’ve been all this time. Well, I have two words for you: Covid-19 and a child (probably shouldn’t use these in one sentence).

With quarantines imposed, most people had to sit at home and watch TV series… But it wasn’t and isn’t the case if you’re a developer. Just WFH, baby. And to the babies — my son was born in May 2020, so my plate was extra full.

Anyway, the situation has stabilized, and I’m back. So, what can you expect to see next?
- Daily updates on all things AI/ML/Data on Telegram and other social media
- Weekly email newsletter with Top AI news and resources
- Quizzes and surveys
- Webinars
- Fun stuff
- etc

I hope this sorted things out a bit. Stay tuned and have a great one, all!

P. S. Updates will start arriving soon.

Best regards,
Dmitry Spodarets
@DataScienceDigest
Recommendation Algorithms & System Designs of YouTube, Spotify, Airbnb, Netflix, TikTok, and Uber

Ever wondered how top technology companies can so accurately predict their customers’ next step? Look no further! In this article, you will find a collection of recommendation algorithms and system designs that they may be using. Note that the author collected all the info from open sources and drew on his personal experience; there is no guarantee that the designs are 100% correct.

https://bit.ly/3m0y0Mo

@DataScienceDigest
17 Types of Similarity and Dissimilarity Measures Used in Data Science

Measuring similarity and dissimilarity is important in any data science work, simply because it lets you “see” how close or distant data objects are to each other. In this article, the author takes a deep dive into 17 types of similarity and dissimilarity measures to help you navigate the various metrics and their applications in Data Science. The amount of work done by the author is massive, so the article is definitely worth checking out.
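
To make "close" and "distant" concrete, here is a minimal sketch of four widely used measures in plain Python. Which of these appear among the article's 17, and whether its definitions match exactly, is an assumption; the function names and sample vectors are made up for illustration.

```python
import math

def euclidean_distance(a, b):
    """Straight-line (L2) distance: 0 means identical, larger means farther apart."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def jaccard_similarity(s, t):
    """Shared elements divided by all distinct elements of two collections."""
    s, t = set(s), set(t)
    return len(s & t) / len(s | t)

# Illustrative data: v points in the same direction as u but is twice as long.
u = [3.0, 4.0]
v = [6.0, 8.0]

print(euclidean_distance(u, v))          # 5.0
print(manhattan_distance(u, v))          # 7.0
print(cosine_similarity(u, v))           # 1.0
print(jaccard_similarity("abc", "bcd"))  # 0.5
```

Note how the two vectors are far apart by Euclidean distance yet maximally similar by cosine similarity, which is why the choice of measure depends on whether magnitude or direction matters for the task.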

https://bit.ly/3ub1YQM

Join 👉 @DataScienceDigest
R vs. Python vs. Julia

Data Scientists are used to writing code with R and Python, but a new programming language for Data Science, Julia, can be their new option of choice. Julia promises C-like performance without compromising the way Data Scientists write code and interact with data, and brings a refreshing programming mindset to the DS community. Eager to learn more about Julia and how it stands against R and Python? Check out this article!

https://bit.ly/3maeaOR

Mixing Normal Images and Adversarial Images When Training CNNs

This tutorial will guide you on the path to Computer Vision mastery. As the first step, you will learn how to generate batches that mix normal images and adversarial images during the training process to improve your model’s ability to generalize and defend against adversarial attacks. The tutorial is thorough and covers everything CNN beginners may need to start preparing images for their networks.

https://bit.ly/39yTvPg

Unit Testing Python Code in Jupyter Notebooks

Writing unit tests appears to be the natural way of doing things when you work on production code, library code, or have to engage with test-driven environments. But what about writing unit tests for your Jupyter notebooks? In some cases, you don’t need it, but when you go beyond data exploration, you definitely do — you just have to maintain and monitor your code in the long run. Learn how unit tests can help you make your Data Science work more efficient, and then try it in practice. You won’t be disappointed!
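
As a rough idea of what this can look like, here is a minimal sketch using the standard-library unittest module inside a notebook cell. The linked article may use a different tool; the `normalize` helper and the test names are hypothetical stand-ins for your own notebook code.

```python
import unittest

def normalize(values):
    """Scale numbers linearly into [0, 1]; hypothetical helper standing in for notebook code."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]

class TestNormalize(unittest.TestCase):
    def test_bounds(self):
        # Smallest value maps to 0, largest to 1, midpoint to 0.5.
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input_raises(self):
        with self.assertRaises(ValueError):
            normalize([1, 1, 1])

def run_tests():
    """Build the suite explicitly and run it, returning the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNormalize)
    return unittest.TextTestRunner(verbosity=2).run(suite)

result = run_tests()
```

The suite is built and run explicitly because a bare `unittest.main()` would try to parse the notebook kernel's command-line arguments and exit the process when it finishes. Re-running the cell after each change to `normalize` gives a quick regression check without leaving the notebook.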

https://bit.ly/3sPHr3W

Awesome AI-ML-DL Repository

If you are as interested in all things AI/ML/DL as we are, check out this awesome repository by @neomatrix369 that features study notes and a curated list of helpful resources on the topic. The repo is created by and for engineers, developers, data scientists, and all other professionals looking to sharpen their mastery of AI, ML, and DL. As with other repos on GitHub, you can easily contribute to, watch, star, fork, and share the repo with others in the tech community.

https://bit.ly/3upveTN
Zero-Shot Text-to-Image Generation

One of the major tasks of text-to-image generation is finding better modeling assumptions for training on a fixed dataset. Such assumptions involve complex architectures, auxiliary losses, or side information that make it more difficult to handle data. Discover a new method of text-to-image generation based on a transformer that autoregressively models the text and image tokens as a single stream of data. Learn why, given sufficient data and scale, it is competitive with previous domain-specific models when evaluated in a zero-shot fashion.

https://bit.ly/3fNxuQI

Hey folks,

I need to ask you something.

We plan to launch a series of free webinars in May, and I’d appreciate it if you shared your thoughts on which topics we should focus on first. What are you most interested in? Feel free to share as many topics and ideas as you like in the comments section!

P. S. We’re also looking for speakers. So, if you have some insights to share with the community, we’re ready to provide the platform. Kindly ping me (@dmitryspodarets) to discuss next steps and details.

Thank you!

Best regards,
Dmitry Spodarets
@DataScienceDigest
The Applied Machine Learning Course at Cornell Tech

Starting from the very basics, the course covers the most important ML algorithms and how to apply them in practice. The slides are Jupyter notebooks with programmatically generated figures so that readers can tweak parameters and regenerate the figures themselves. The course explores topics such as how to prioritize model improvements, diagnose overfitting, perform error analysis, visualize loss curves, etc.

Course Videos: https://bit.ly/3dLKZha
Course Materials: https://bit.ly/2Rmlj3e

Transfer Learning and Data Augmentation Applied to the Simpsons Image Dataset

In this article, the author uses the Simpsons characters dataset to experiment with data augmentation for transfer learning. From image filtering to splitting, testing, and validating datasets, a series of experiments is conducted to address the problem of small datasets and overfitting. Check out this step-by-step guide to learn about the results and the final metrics that the experiments yielded.

https://bit.ly/3wQuqJN

How to Deploy Machine Learning / Deep Learning Models to the Web

Machine and deep learning models should not exist in a vacuum (theoretical environments). They need to be deployed in production and used by businesses and customers. In this article, the author provides a step-by-step guide on deploying models to the web and accessing them as a REST API using Heroku and GitHub. You will also learn how to access that API using the Python requests module and curl.
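
To give a feel for the client side, here is a minimal sketch of calling a prediction endpoint over HTTP. Everything specific in it is an assumption: the `/predict` path, the `{"features": [...]}` payload, and the `{"prediction": ...}` response are hypothetical, and a tiny local server that just sums the features stands in for the deployed model. The standard-library urllib is used in place of the requests module so the sketch runs without extra installs.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class PredictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a deployed model: returns the sum of the features."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": sum(payload["features"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet

def predict(url, features):
    """POST the features as JSON and return the decoded JSON response."""
    data = json.dumps({"features": features}).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    # An empty ProxyHandler bypasses any system proxy for this local demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())

def demo():
    """Start the stand-in server on a free local port, query it once, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0 = pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/predict"
        return predict(url, [1.0, 2.0, 3.5])
    finally:
        server.shutdown()
        server.server_close()

print(demo())  # {'prediction': 6.5}
```

With the requests module, the body of `predict` would reduce to `requests.post(url, json={"features": features}).json()`, and the same call from a shell would be `curl -X POST -H "Content-Type: application/json" -d '{"features": [1.0, 2.0, 3.5]}' <url>`, with the URL of your own deployment substituted in.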

https://bit.ly/2OKpXqI
