Data Phoenix – Telegram
Data Phoenix is your best friend in learning and growing in the data world!
We publish a digest, organize events, and help expand the frontiers of your knowledge in ML, CV, NLP, and other aspects of AI. Idea and implementation: @dmitryspodarets
Another 10 Free Must-Read Books for Machine Learning and Data Science - https://www.kdnuggets.com/2019/03/another-10-free-must-read-books-for-machine-learning-and-data-science.html
In this article, you will find a few books on elementary machine learning, a few on general machine learning topics of interest such as feature engineering and model interpretability, an intro to deep learning, a book on Python programming, a pair of data visualization entries, and two reinforcement learning books.
Six Easy Ways to Run Your Jupyter Notebook in the Cloud - https://www.dataschool.io/cloud-services-for-jupyter-notebook/
This article will review six services you can use to easily run your Jupyter notebook in the cloud. All of them have the following characteristics: they don't require you to install anything on your local machine; they are completely free (or they have a free plan); they give you access to the Jupyter Notebook environment; they allow you to import and export notebooks using the standard .ipynb file format; they support the Python language (and most support other languages as well).
Computer Vision Tutorial: A Step-by-Step Introduction to Image Segmentation Techniques - https://www.analyticsvidhya.com/blog/2019/04/introduction-image-segmentation-techniques-python/
In this article, you will learn the concept of image segmentation. It is a powerful computer vision technique that builds upon the idea of object detection and takes us to a whole new level of working with image data.
How to Version Control Jupyter Notebooks - https://nextjournal.com/schmudde/how-to-version-control-jupyter
Jupyter notebooks integrate metadata, source code, formatted text, and rich media into a single document, which makes them poor candidates for conventional version control systems. This article explores a variety of ways to version control your notebooks, including built-in solutions and external tools.
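To make the version-control problem concrete, here is a minimal sketch of one common workaround: clearing code-cell outputs and execution counters before committing, so that diffs show only source changes. It is not necessarily one of the approaches the linked article covers. The helper name strip_outputs and the example notebook are hypothetical; the only assumption about the file format is the standard nbformat 4 layout, in which a notebook is a JSON document with a "cells" list.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs and counters cleared."""
    # Deep copy through a JSON round trip so the input is left untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook in the nbformat 4 layout (illustrative data only).
example = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

cleaned = strip_outputs(example)
# The code cell keeps its source but no longer carries output or a run counter.
print(json.dumps(cleaned["cells"][1], indent=1, sort_keys=True))
```

Writing the cleaned dict back with json.dump before each commit keeps the stored file free of rich-media output, at the cost of losing rendered results in the repository.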
Which Deep Learning Framework is Growing Fastest?

To answer that question, I looked at the number of job listings on Indeed, Monster, LinkedIn, and SimplyHired. I also evaluated changes in Google search volume, GitHub activity, Medium articles, ArXiv articles, and Quora topic followers. Overall, these sources paint a comprehensive picture of growth in demand, usage, and interest.

Link: https://towardsdatascience.com/which-deep-learning-framework-is-growing-fastest-3f77f14aa318
Open Questions about Generative Adversarial Networks

Problem 1: What are the trade-offs between GANs and other generative models?
Problem 2: What sorts of distributions can GANs model?
Problem 3: How can we scale GANs beyond image synthesis?
Problem 4: What can we say about the global convergence of the training dynamics?
Problem 5: How should we evaluate GANs and when should we use them?
Problem 6: How does GAN training scale with batch size?
Problem 7: What is the relationship between GANs and adversarial examples?

Link: https://distill.pub/2019/gan-open-problems/
Visualising Model Response with easyalluvial

This article will show how you can use alluvial plots to visualise model response in up to 4 dimensions. easyalluvial generates an artificial data space using fixed values for unplotted variables or uses the partial dependence plotting method. It is model agnostic but offers some convenient wrappers for caret models.

Link: https://www.datisticsblog.com/2019/04/visualising-model-response-with-easyalluvial/
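The partial dependence method mentioned in the post above can be sketched in a few lines. This is a language-independent illustration written in Python, not the easyalluvial API (easyalluvial is an R package). The names partial_dependence and toy_model and the sample rows are hypothetical: for each grid value, the chosen feature is overwritten in every row and the model's predictions are averaged.

```python
from statistics import mean
from typing import Callable, List, Sequence


def partial_dependence(
    predict: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction over all rows with one feature forced to each grid value."""
    result = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)       # copy so the original data is untouched
            modified[feature] = value  # force the feature of interest
            predictions.append(predict(modified))
        result.append(mean(predictions))
    return result


def toy_model(row: Sequence[float]) -> float:
    """Stand-in for a fitted model: linear in feature 0, quadratic in feature 1."""
    return 2.0 * row[0] + row[1] * row[1]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
pd_values = partial_dependence(toy_model, rows, feature=0, grid=[0.0, 1.0, 2.0])
print(pd_values)
```

Because toy_model is linear in feature 0 with slope 2, consecutive values in pd_values differ by 2, which is the marginal effect the method is meant to expose.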
Essential Guide to keep up with AI/ML/CV

These fields are booming these days. In order not to become rusty, one has to constantly follow the updates. Here is the essential guide on how to keep up with the important news/papers/discussions/tutorials. This guide is by no means an exhaustive one so contributions are truly welcome.

Link: https://github.com/BAILOOL/DoYouEvenLearn
Random Forests for Complete Beginners

The definitive guide to Random Forests and Decision Trees. You will learn what Random Forests are and how they work from the ground up.

Link: https://victorzhou.com/blog/intro-to-random-forests/
Forecasting: Principles and Practice

This interactive textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly.

Link: http://bit.ly/2KXjtlS
If you want to share any useful links in our digest, please send them here: http://bit.ly/2KZzJml
A Repository of Conversational Datasets

This repository provides tools to create reproducible datasets for training and evaluating models of conversational response. This includes:
— Reddit — 3.7 billion comments structured in threaded conversations.
— OpenSubtitles — over 400 million lines from movie and television subtitles (available in English and other languages).
— Amazon QA — over 3.6 million question-response pairs in the context of Amazon products.

Link: http://bit.ly/2KVq8wG
Best of arXiv.org for AI, Machine Learning, and Deep Learning — March 2019

This article will review research papers appearing on the arXiv.org preprint server for compelling subjects relating to AI, machine learning and deep learning — from disciplines including statistics, mathematics and computer science — and provide you with a useful «best of» list for the past month.

Link: http://bit.ly/2L2PRTS
Interactive web-based data visualization with R, plotly, and shiny

An interactive book in which you'll gain insight and practical skills for creating interactive and dynamic web graphics for data analysis from R.

Link: http://bit.ly/2L8IzhA
Better NLP library

A library that wraps state-of-the-art NLP libraries to make it easier to work with textual data. It uses spaCy, Textacy, Gensim, and a number of other Python libraries to make extracting NLP information from text easier, comparable, and measurable. Contributions are welcome; they are just a pull request away.

Link: https://github.com/neomatrix369/awesome-ai-ml-dl/tree/master/examples/better-nlp
Introducing d3-regression

This article will introduce you to d3-regression, a D3.js module for calculating statistical regressions from two-dimensional data. It is dependency-free, and its API exposes configurable functions that transform input data, as other D3 modules do.

Link: http://bit.ly/2LbaQUK
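For readers new to the idea, here is a minimal sketch of the simplest regression that can be computed from two-dimensional data: an ordinary least-squares line. It is written in Python for illustration and does not reproduce the d3-regression API; the function name linear_regression and the sample points are hypothetical.

```python
from typing import Sequence, Tuple


def linear_regression(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points that lie exactly on y = 2x + 1.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
slope, intercept = linear_regression(points)
print(slope, intercept)  # prints: 2.0 1.0
```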
Aroma: Using machine learning for code recommendation

In this article, you will learn about Aroma - a code-to-code search and recommendation tool that uses machine learning (ML) to simplify gaining insights from big codebases.

Link: http://bit.ly/2La3T6u