How to Perform #FeatureSelection with Categorical Data
After completing this tutorial, you will know:
* The breast cancer predictive modeling problem with categorical inputs and binary #classification target variable.
* How to evaluate the importance of categorical features using the chi-squared and mutual information statistics.
* How to perform feature selection for categorical data when fitting and evaluating a classification model.
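As a rough illustration of the chi-squared and mutual-information scoring above, here is a minimal scikit-learn sketch. The toy data and category names are invented; the tutorial itself works with the breast cancer dataset.
```python
# Score ordinal-encoded categorical features against a binary target
# with chi-squared and mutual information (toy data, not the tutorial's).
import numpy as np
from sklearn.preprocessing import OrdinalEncoder, LabelEncoder
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

# Toy categorical inputs (3 features) and a binary target.
X_raw = np.array([
    ["low",  "red",   "yes"],
    ["high", "blue",  "no"],
    ["mid",  "red",   "yes"],
    ["low",  "green", "no"],
    ["high", "red",   "yes"],
    ["mid",  "blue",  "no"],
])
y_raw = np.array(["benign", "malignant", "benign",
                  "malignant", "benign", "malignant"])

# Encode categories as non-negative integers (required by chi2).
X = OrdinalEncoder().fit_transform(X_raw)
y = LabelEncoder().fit_transform(y_raw)

# Keep the two best features according to the chi-squared test.
selector = SelectKBest(score_func=chi2, k=2)
X_best = selector.fit_transform(X, y)
print("chi2 scores:", selector.scores_)

# Mutual information scores for the same (discrete) features.
print("MI scores:", mutual_info_classif(X, y, discrete_features=True))
```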
Link
🔭 @DeepGravity
Researchers from #Microsoft have created #Icebreaker, a #DeepGenerativeModel with a new element-wise information acquisition method that uses AI to aid decision making and minimize #data requirements. Learn how it helps acquire information at a lower cost.
Link
🔭 @DeepGravity
#Facebook #AI Residency program
The AI Residency program will pair you with an AI Researcher and Engineer who will jointly guide your project. With the team, you will pick a research problem of mutual interest and then devise new deep learning techniques to solve it. We also encourage collaborations beyond the assigned mentors. The research will be communicated to the academic community through papers submitted to top academic venues (for example, NeurIPS, ICML, ICLR, CVPR, ICCV, ACL, EMNLP), as well as open-source code releases and/or product impact.
Link
#Job
🔭 @DeepGravity
ETH Zurich supports excellent Master's students with two scholarship programmes
Computer Scientist - 3D Geometry Processing, France
Postdoctoral Research Fellow in Intelligent Information Media Laboratory, Toyota Technological Institute, Japan (TTI-J)
Postdoctoral Position in Systems/Computational Neuroscience, Drexel University
Open Postdoc position in Reinforcement Learning, at Inria SequeL, Lille (France)
Postdoctoral position in Artificial Intelligence, The Fondation Mathématique Jacques Hadamard
Postdoctoral position in Machine Learning and Clinical Informatics, Nemati Lab at UC San Diego Health
PhD openings in machine learning at UC San Diego
Postdoctoral fellow in AI and real-time signal processing for Brain-Computer Interface clinical applications at CEA, Grenoble and Paris-Saclay, France
Postdoc position in Electrical and Computer Engineering, including research in computer vision, machine learning, and robotics, at Brown University School of Engineering
Post-doctoral position in skin image processing at Vanderbilt University, Nashville, TN, USA
#Job
🔭 @DeepGravity
How to Develop #MultilayerPerceptronModels for #TimeSeries Forecasting
After completing this tutorial, you will know:
How to develop #MLP models for univariate time series forecasting.
How to develop MLP models for multivariate time series forecasting.
How to develop MLP models for multi-step time series forecasting.
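As a minimal sketch of the univariate case, the snippet below frames a series as supervised windows and fits a small Keras MLP. The window size and layer sizes are arbitrary choices for illustration, not the tutorial's exact configuration.
```python
# One-step univariate forecasting with a small MLP in Keras (toy series).
import numpy as np
from tensorflow import keras

def make_windows(series, n_steps):
    """Split a 1-D series into (n_steps inputs, next-value target) pairs."""
    X, y = [], []
    for i in range(len(series) - n_steps):
        X.append(series[i:i + n_steps])
        y.append(series[i + n_steps])
    return np.array(X), np.array(y)

series = np.arange(100, dtype="float32")      # toy series: 0, 1, 2, ...
X, y = make_windows(series, n_steps=3)

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(3,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=200, verbose=0)

# Forecast the value following [97, 98, 99]; it should approach 100
# as training converges.
print(model.predict(np.array([[97.0, 98.0, 99.0]]), verbose=0))
```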
Link
🔭 @DeepGravity
Spark #NLP 101: LightPipeline
A Pipeline is specified as a sequence of stages, and each stage is either a Transformer or an Estimator. These stages are run in order, and the input DataFrame is transformed as it passes through each stage. Now let’s see how this can be done in Spark NLP using Annotators and Transformers.
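A minimal sketch of the idea, assuming spark-nlp and pyspark are installed: build a Pipeline from Spark NLP annotators, then wrap the fitted model in a LightPipeline to annotate plain strings. The article's own example may use different annotators.
```python
# Build a Spark NLP Pipeline and wrap it in a LightPipeline for
# fast single-string inference.
import sparknlp
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import Tokenizer
from pyspark.ml import Pipeline

spark = sparknlp.start()

# Stages: raw text -> document annotation -> tokens.
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

pipeline = Pipeline(stages=[document_assembler, tokenizer])

# No trainable stages here, so fit on an empty DataFrame, then wrap the
# resulting PipelineModel in a LightPipeline to annotate plain strings.
empty_df = spark.createDataFrame([[""]]).toDF("text")
light = LightPipeline(pipeline.fit(empty_df))

print(light.annotate("Spark NLP makes pipelines easy to use."))
```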
Link
🔭 @DeepGravity
#MachineLearning #CheatSheet
This cheat sheet contains many classical equations and diagrams on machine learning, which will help you quickly recall knowledge and ideas on machine learning.
The cheat sheet will also appeal to someone who is preparing for a job interview related to machine learning.
Link
🔭 @DeepGravity
When #Bayes, #Ockham, and #Shannon come together to define #MachineLearning
A beautiful idea, which binds together concepts from statistics, information theory, and philosophy to lay down the foundation of machine learning.
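The idea in question is Minimum Description Length. As a standard identity (not quoted from the article), MAP estimation is exactly a minimum-code-length problem, which is where Bayes, Ockham, and Shannon meet:
```latex
% -log P(h):        description length of the hypothesis (Ockham's razor)
% -log P(D | h):    description length of the data given the hypothesis (Shannon)
\hat{h}_{\mathrm{MAP}}
  = \arg\max_{h} P(h \mid D)
  = \arg\max_{h} P(D \mid h)\, P(h)
  = \arg\min_{h} \bigl[ -\log_2 P(D \mid h) - \log_2 P(h) \bigr]
```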
Link
🔭 @DeepGravity
A Recipe for Training #NeuralNetworks
Some few weeks ago I posted a tweet on “the most common neural net mistakes”, listing a few common gotchas related to training neural nets. The tweet got quite a bit more engagement than I anticipated (including a webinar :)). Clearly, a lot of people have personally encountered the large gap between “here is how a convolutional layer works” and “our convnet achieves state of the art results”.
Link
🔭 @DeepGravity
How Much Over-parameterization Is Sufficient to Learn Deep #ReLU Networks?
A recent line of research on #DeepLearning focuses on the extremely over-parameterized setting, and shows that when the network width is larger than a high-degree polynomial of the training sample size n and the inverse of the target accuracy ϵ^-1, deep neural networks learned by (stochastic) gradient descent enjoy nice optimization and generalization guarantees. Very recently, it has been shown that under a certain margin assumption on the training data, a polylogarithmic width condition suffices for two-layer ReLU networks to converge and generalize (Ji and Telgarsky, 2019). However, how much over-parameterization is sufficient to guarantee optimization and generalization for deep neural networks still remains an open question. In this work, we establish sharp optimization and generalization guarantees for deep ReLU networks. Under various assumptions made in previous work, our optimization and generalization guarantees hold with network width polylogarithmic in n and ϵ^-1. Our results push the study of over-parameterized deep neural networks towards more practical settings.
Link
🔭 @DeepGravity
Noise Robust Generative Adversarial Networks
#GenerativeAdversarialNetworks (#GANs) are neural networks that learn data distributions through adversarial training. In intensive studies, recent GANs have shown promising results for reproducing training data. However, when the training data are noisy, they reproduce the noise just as faithfully. As an alternative, we propose a novel family of GANs called noise-robust GANs (NR-GANs), which can learn a clean image generator even when training data are noisy. In particular, NR-GANs can solve this problem without having complete noise information (e.g., the noise distribution type, noise amount, or signal-noise relation). To achieve this, we introduce a noise generator and train it along with a clean image generator. As it is difficult to generate an image and noise separately without constraints, we propose distribution and transformation constraints that encourage the noise generator to capture only the noise-specific components. In particular, considering such constraints under different assumptions, we devise two variants of NR-GANs for signal-independent noise and three variants of NR-GANs for signal-dependent noise. On three benchmark datasets, we demonstrate the effectiveness of NR-GANs in noise-robust image generation. Furthermore, we show the applicability of NR-GANs in image denoising.
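A rough sketch of the core idea for the additive, signal-independent case: a clean-image generator and a separate noise generator are sampled jointly, and the discriminator only ever sees their sum against noisy real images. Architectures, losses, and the paper's distribution/transformation constraints are omitted, and all names below are illustrative, not the authors' code.
```python
# Sketch of an NR-GAN-style forward pass: fake sample = clean image + noise.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Stand-in generator mapping a latent vector to a 32x32 grayscale image."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 32 * 32), nn.Tanh(),
        )
    def forward(self, z):
        return self.net(z).view(-1, 1, 32, 32)

g_image = TinyGenerator()   # learns the clean image distribution
g_noise = TinyGenerator()   # learns the noise distribution

z_x = torch.randn(8, 64)
z_n = torch.randn(8, 64)

# The sample shown to the discriminator is the sum of both generators'
# outputs, mirroring the assumption that real data = image + noise.
fake_noisy = g_image(z_x) + g_noise(z_n)
print(fake_noisy.shape)  # torch.Size([8, 1, 32, 32])
```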
Link
🔭 @DeepGravity
A Quick Guide to #FeatureEngineering
Feature engineering plays a key role in machine learning, data mining, and data analytics. This article provides a general definition for feature engineering, together with an overview of the major issues, approaches, and challenges of the field.
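A tiny, invented example of what feature engineering means in practice: deriving new columns (ratios, date parts, one-hot encodings) that expose structure the raw data only implies. The column names below are made up for illustration.
```python
import pandas as pd

df = pd.DataFrame({
    "price":   [250000, 180000, 420000],
    "sqft":    [2000,   1500,   3000],
    "city":    ["Austin", "Dallas", "Austin"],
    "sold_on": pd.to_datetime(["2019-03-01", "2019-07-15", "2019-11-30"]),
})

df["price_per_sqft"] = df["price"] / df["sqft"]   # ratio feature
df["sold_month"] = df["sold_on"].dt.month         # date decomposition
df = pd.get_dummies(df, columns=["city"])         # one-hot encoding
print(df.head())
```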
Link
🔭 @DeepGravity
#Keras inventor #Chollet charts a new direction for #AI: a Q&A
#Google scientist François Chollet has made a lasting contribution to AI in the wildly popular Keras application programming interface. He now hopes to move the field toward a new approach to intelligence. He talked with ZDNet about what he hopes to accomplish.
Link
🔭 @DeepGravity
Introduction to Applied #LinearAlgebra – Vectors, Matrices, and Least Squares
by Stephen Boyd and Lieven Vandenberghe
#Cambridge University Press
Link
#Book
🔭 @DeepGravity
#LSTM: A Search Space Odyssey
Klaus Greff, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen #Schmidhuber
Abstract—Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful fANOVA framework. In total, we summarize the results of 5400 experimental runs (≈ 15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
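For reference, the vanilla LSTM the study takes as its baseline can be written as follows (standard formulation, not copied from the paper); the forget gate f_t and the output activation tanh(c_t) are the two components the paper identifies as most critical.
```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```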
Link
🔭 @DeepGravity
Learning Efficient Video Representation with #Video Shuffle Networks
3D #CNN has shown strong ability in learning spatiotemporal representations in recent video recognition tasks. However, inflating 2D convolution to 3D inevitably introduces additional computational costs, making it cumbersome in practical deployment. We consider whether there is a way to equip the conventional 2D convolution with temporal #vision without expanding its kernel. To this end, we propose the video shuffle, a parameter-free plug-in component that efficiently reallocates the inputs of 2D convolution so that its receptive field can be extended to the temporal dimension. In practice, video shuffle first divides each frame feature into multiple groups and then aggregates the grouped features via a temporal shuffle operation. This allows the following 2D convolution to aggregate global spatiotemporal features. The proposed video shuffle can be flexibly inserted into popular 2D #CNNs, forming the Video Shuffle Networks (VSN). With a simple yet efficient implementation, VSN performs surprisingly well on temporal modeling benchmarks. In experiments, VSN not only gains non-trivial improvements on Kinetics and Moments in Time, but also achieves state-of-the-art performance on the Something-Something-V1 and Something-Something-V2 datasets.
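A hedged sketch of what a parameter-free temporal shuffle can look like: channels of each frame are split into groups and groups are exchanged across time with a transpose, so a subsequent 2D convolution sees features from several frames. This illustrates the idea only; it is not the paper's exact operator, and the constraint that the group count equals the frame count is a simplification made here.
```python
import torch

def temporal_shuffle(x, groups):
    # x: (batch, time, channels, height, width)
    n, t, c, h, w = x.shape
    assert c % groups == 0 and t == groups, \
        "illustrative constraint: one channel group per frame"
    x = x.view(n, t, groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 2).contiguous()           # swap the time and group axes
    return x.view(n, t, c, h, w)                 # each frame now mixes other frames' groups

x = torch.randn(2, 4, 8, 16, 16)                 # 4 frames, 8 channels each
print(temporal_shuffle(x, groups=4).shape)       # torch.Size([2, 4, 8, 16, 16])
```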
Link
🔭 @DeepGravity
How to Visualize Filters and Feature Maps in #ConvolutionalNeuralNetworks
After completing this tutorial, you will know:
* How to develop a visualization for specific filters in a convolutional neural network.
* How to develop a visualization for specific feature maps in a convolutional neural network.
* How to systematically visualize feature maps for each block in a #deep convolutional neural network.
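A minimal Keras sketch of the two visualizations above: reading the filter weights of one convolutional layer, and extracting the feature maps that layer produces for an image. VGG16 is used here for convenience; the tutorial may use a different model or layer names, and the input image below is a random stand-in.
```python
import numpy as np
from tensorflow import keras
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

model = VGG16(weights="imagenet")

# 1) Filters: weights of the first conv layer, shape (3, 3, 3, 64).
filters, biases = model.get_layer("block1_conv1").get_weights()
print("filter tensor shape:", filters.shape)

# 2) Feature maps: a truncated model whose output is that layer's activation.
feature_model = keras.Model(inputs=model.inputs,
                            outputs=model.get_layer("block1_conv1").output)
image = np.random.rand(1, 224, 224, 3).astype("float32") * 255  # stand-in image
feature_maps = feature_model.predict(preprocess_input(image), verbose=0)
print("feature map shape:", feature_maps.shape)  # (1, 224, 224, 64)
```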
Link
🔭 @DeepGravity