Data Phoenix – Telegram
Data Phoenix is your best friend in learning and growing in the data world!
We publish a digest, organize events, and help expand the frontiers of your knowledge in ML, CV, NLP, and other areas of AI. Idea and implementation: @dmitryspodarets
How to Version Control Jupyter Notebooks - https://nextjournal.com/schmudde/how-to-version-control-jupyter
Jupyter notebooks integrate metadata, source code, formatted text, and rich media into a single document, which makes them poor candidates for conventional version control systems. This article explores a variety of ways to version control your notebooks, including built-in solutions and external tools.
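One widely used way to make notebooks friendlier to version control is to clear cell outputs before committing, so that diffs show only source changes. The snippet below is a minimal sketch of that idea using only the Python standard library; it is not taken from the linked article. The function name `strip_outputs` and the sample notebook are hypothetical, and the sketch assumes the nbformat 4 JSON layout, in which each code cell has `outputs` and `execution_count` fields.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs and execution counts cleared."""
    # Deep-copy through a JSON round-trip so the caller's dict is left untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook (hypothetical content, nbformat 4 layout assumed).
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

stripped = strip_outputs(example)
# Sorted keys and fixed indentation keep the serialized form stable between commits.
print(json.dumps(stripped, indent=1, sort_keys=True))
```

In practice you would load the dict from an .ipynb file with `json.load` and write the stripped version back before committing; dedicated tools exist for this, and the sketch only illustrates the principle.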
Which Deep Learning Framework is Growing Fastest?

To answer that question, I looked at the number of job listings on Indeed, Monster, LinkedIn, and SimplyHired. I also evaluated changes in Google search volume, GitHub activity, Medium articles, ArXiv articles, and Quora topic followers. Overall, these sources paint a comprehensive picture of growth in demand, usage, and interest.

Link: https://towardsdatascience.com/which-deep-learning-framework-is-growing-fastest-3f77f14aa318
Open Questions about Generative Adversarial Networks

Problem 1: What are the trade-offs between GANs and other generative models?
Problem 2: What sorts of distributions can GANs model?
Problem 3: How can we scale GANs beyond image synthesis?
Problem 4: What can we say about the global convergence of the training dynamics?
Problem 5: How should we evaluate GANs and when should we use them?
Problem 6: How does GAN training scale with batch size?
Problem 7: What is the relationship between GANs and adversarial examples?

Link: https://distill.pub/2019/gan-open-problems/
Visualising Model Response with easyalluvial

This article will show how you can use alluvial plots to visualise model response in up to 4 dimensions. easyalluvial generates an artificial data space using fixed values for unplotted variables or uses the partial dependence plotting method. It is model agnostic but offers some convenient wrappers for caret models.

Link: https://www.datisticsblog.com/2019/04/visualising-model-response-with-easyalluvial/
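The partial dependence method mentioned above can be sketched in a few lines: for each value of the feature of interest, overwrite that feature in every row of the data and average the model's predictions. easyalluvial itself is an R package; the Python below is only an illustrative sketch, and `partial_dependence`, `toy_model`, and the sample rows are hypothetical names and data, not part of easyalluvial or caret.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """For each grid value, set `feature` to it in every row and average the predictions."""
    curve = []
    for value in grid:
        # Copy each row with the chosen feature overwritten; other features keep observed values.
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(row) for row in modified))
    return curve


def toy_model(row):
    """Stand-in for a fitted model: any callable mapping a row to a prediction works."""
    return 2.0 * row["x1"] + row["x2"] ** 2


rows = [
    {"x1": 0.0, "x2": 1.0},
    {"x1": 1.0, "x2": 2.0},
    {"x1": 2.0, "x2": 3.0},
]

pd_x1 = partial_dependence(toy_model, rows, "x1", [0.0, 1.0, 2.0])
print(pd_x1)
```

Because the function only calls `predict`, the approach is model agnostic: the curve for `x1` rises by 2.0 per unit here because that is the toy model's coefficient, while the effect of `x2` is averaged out.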
Essential Guide to keep up with AI/ML/CV

These fields are booming. To avoid getting rusty, you have to follow updates constantly. Here is the essential guide on how to keep up with the important news, papers, discussions, and tutorials. This guide is by no means exhaustive, so contributions are truly welcome.

Link: https://github.com/BAILOOL/DoYouEvenLearn
Random Forests for Complete Beginners

The definitive guide to Random Forests and Decision Trees. You will learn what Random Forests are and how they work from the ground up.

Link: https://victorzhou.com/blog/intro-to-random-forests/
Forecasting: Principles and Practice

This interactive textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly.

Link: http://bit.ly/2KXjtlS
If you want to share any useful links in our digest, please send them here: http://bit.ly/2KZzJml
A Repository of Conversational Datasets

This repository provides tools to create reproducible datasets for training and evaluating models of conversational response. This includes:
- Reddit: 3.7 billion comments structured in threaded conversations.
- OpenSubtitles: over 400 million lines from movie and television subtitles (available in English and other languages).
- Amazon QA: over 3.6 million question-response pairs in the context of Amazon products.

Link: http://bit.ly/2KVq8wG
Best of arXiv.org for AI, Machine Learning, and Deep Learning — March 2019

This article reviews research papers from the arXiv.org preprint server on compelling subjects in AI, machine learning, and deep learning, drawn from disciplines including statistics, mathematics, and computer science, and provides a useful "best of" list for the past month.

Link: http://bit.ly/2L2PRTS
Interactive web-based data visualization with R, plotly, and shiny

An interactive book in which you'll gain insight and practical skills for creating interactive and dynamic web graphics for data analysis from R.

Link: http://bit.ly/2L8IzhA
Better NLP library

A library that builds on state-of-the-art NLP libraries to make it easier to work with textual data. It uses spaCy, textacy, Gensim, and a number of other Python libraries to make extracting NLP information from text easier, comparable, and measurable. Contributions are welcome; it's just a pull request away.

Link: https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/better-nlp
Introducing d3-regression

This article introduces d3-regression, a D3.js module for calculating statistical regressions from two-dimensional data. It is dependency-free, and its API exposes configurable functions that transform input data, in the style of other D3 modules.

Link: http://bit.ly/2LbaQUK
Aroma: Using machine learning for code recommendation

In this article, you will learn about Aroma, a code-to-code search and recommendation tool that uses machine learning (ML) to make it easier to gain insights from big codebases.

Link: http://bit.ly/2La3T6u
An End-to-End Project on Time Series Analysis and Forecasting with Python

This article explains how to analyze non-stationary time series data, such as economic, weather, stock price, and retail sales data. You will learn different approaches to forecasting a retail sales time series.

Link: http://bit.ly/2LaZT5F
AutoML for Data Augmentation

In this article, you will learn about DeepAugment — an AutoML tool focusing on data augmentation. It utilizes Bayesian optimization for discovering data augmentation strategies tailored to your image dataset.

Link: http://bit.ly/2W9H9o5
Using Reinforcement Learning to Design a Better Rocket Engine

In this article, you will learn how to use reinforcement learning to develop innovative solutions in rocket engine development. You will see how ML techniques can be applied to the manufacturing industry and learn more about the role of the Machine Learning Product Manager.

Link: http://bit.ly/2VdABIQ
Top 5 Interesting Applications of GANs for Every Machine Learning Enthusiast!

In this article, you will learn about five intriguing applications of GANs that are prevalent in the industry: GANs for Image Editing, Using GANs for Security, Generating Data using GANs, GANs for Attention Prediction, GANs for 3D Object Generation. Links to research papers for each GAN application are included.

Link: http://bit.ly/2GQgsyr
A Recommendation Model with PyTorch

This article outlines the idea of Probabilistic Matrix Factorization and its use in recommendation systems, implemented in PyTorch.

Link: http://bit.ly/2GZcIuf
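For readers new to the idea, Probabilistic Matrix Factorization models each rating as the dot product of a user vector and an item vector; with Gaussian priors on both, finding the most probable factors amounts to minimizing squared error plus an L2 penalty. The snippet below is a minimal sketch of that objective trained with plain stochastic gradient descent in standard-library Python. It is not the article's PyTorch code, and `train_pmf`, the hyperparameters, and the toy ratings are all hypothetical.

```python
import random


def train_pmf(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit user and item factors by SGD on squared error with L2 regularization.

    `ratings` is a list of (user_index, item_index, rating) triples.
    Returns (U, V, history), where history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    # Small random initial factors, mirroring zero-mean Gaussian priors.
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for user, item, rating in ratings:
            prediction = sum(a * b for a, b in zip(U[user], V[item]))
            error = rating - prediction
            squared_error += error * error
            for k in range(rank):
                u_k, v_k = U[user][k], V[item][k]
                # Gradient step: the reg terms come from the Gaussian priors.
                U[user][k] += lr * (error * v_k - reg * u_k)
                V[item][k] += lr * (error * u_k - reg * v_k)
        history.append(squared_error / len(ratings))
    return U, V, history


# Toy data: 3 users, 3 items, 7 observed ratings (hypothetical values).
ratings = [
    (0, 0, 5.0), (0, 1, 3.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 2.0), (2, 2, 5.0),
]

U, V, history = train_pmf(ratings, n_users=3, n_items=3)


def predict(user, item):
    """Predicted rating for any user/item pair, including unobserved ones."""
    return sum(a * b for a, b in zip(U[user], V[item]))


print("first epoch MSE:", history[0])
print("last epoch MSE:", history[-1])
print("prediction for unobserved pair (1, 1):", predict(1, 1))
```

The same loop translates to PyTorch by storing the factors as embedding tables and letting an optimizer apply the updates; the explicit gradient step is kept here so the role of the regularization term stays visible.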