Deep Gravity – Telegram
Deep Gravity
AI

Contact:
DeepL.Gravity@gmail.com
Merging Deterministic #PolicyGradient Estimations with Varied #Bias - #Variance Tradeoff for Effective #DeepReinforcementLearning

Deep reinforcement learning (#DRL) on #MarkovDecisionProcess (#MDPs) with continuous action spaces is often approached by directly updating parametric policies along the direction of estimated policy gradients (PGs). Previous research revealed that the performance of these PG algorithms depends heavily on the bias-variance tradeoff involved in estimating and using PGs. A notable approach towards balancing this tradeoff is to merge both on-policy and off-policy gradient estimations for the purpose of training stochastic policies. However, this method cannot be utilized directly by sample-efficient off-policy PG algorithms such as #DeepDeterministicPolicyGradient (#DDPG) and #twindelayedDDPG (#TD3), which have been designed to train deterministic policies. It is hence important to develop new techniques to merge multiple off-policy estimations of deterministic PG (DPG). Driven by this research question, this paper introduces elite #DPG, which will be estimated differently from conventional DPG to emphasize the variance reduction effect at the expense of increased learning bias. To mitigate the extra bias, policy consolidation techniques will be developed to distill policy behavioral knowledge from elite trajectories and use the distilled generative model to further regularize policy training. Moreover, we will study both theoretically and experimentally two different DPG merging methods, i.e., interpolation merging and two-step merging, with the aim of inducing varied bias-variance tradeoffs through combined use of both conventional DPG and elite DPG. Experiments on six benchmark control tasks confirm that these two merging methods can noticeably improve the learning performance of TD3, significantly outperforming several state-of-the-art #DRL algorithms.
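
The abstract does not spell out how interpolation merging is defined, so the following is only a rough sketch of one plausible reading: a convex combination of two gradient estimates, followed by an ordinary gradient ascent step. The function names, the `weight` parameter and the flat-list representation of gradients are all assumptions made for illustration, not the paper's implementation.

```python
def interpolate_gradients(conventional, elite, weight):
    """Blend two gradient estimates element by element.

    weight = 0 returns the conventional estimate, weight = 1 the elite one.
    This convex combination is an assumed reading of "interpolation merging".
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(conventional) != len(elite):
        raise ValueError("gradient estimates must have the same length")
    return [(1.0 - weight) * g + weight * e
            for g, e in zip(conventional, elite)]


def gradient_ascent_step(params, gradient, step_size):
    """Move the policy parameters along the (merged) gradient direction."""
    return [p + step_size * g for p, g in zip(params, gradient)]


# Hypothetical numbers, chosen only to show the call pattern.
conventional_dpg = [1.0, 2.0]
elite_dpg = [3.0, 6.0]
merged = interpolate_gradients(conventional_dpg, elite_dpg, weight=0.5)
new_params = gradient_ascent_step([0.0, 0.0], merged, step_size=0.1)
```

Under this reading, moving `weight` towards 1 leans on the lower-variance, higher-bias elite estimate, while moving it towards 0 falls back to the conventional one.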

Link

🔭 @DeepGravity
An official #PyTorch implementation of “Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation” (#NeurIPS 2019) by Risto Vuorio*, Shao-Hua Sun*, Hexiang Hu, and Joseph J. Lim

This project is an implementation of Multimodal Model-Agnostic #MetaLearning via Task-Aware Modulation, which is published in NeurIPS 2019. Please contact Shao-Hua Sun for any questions.

Model-agnostic meta-learners aim to acquire meta-prior parameters from a distribution of tasks and adapt to novel tasks with few gradient updates. Yet, seeking a common initialization shared across the entire task distribution substantially limits the diversity of the task distributions that they are able to learn from. We propose a multimodal MAML (MMAML) framework, which is able to modulate its meta-learned prior according to the identified mode, allowing more efficient fast adaptation. An illustration of the proposed framework is as follows.

Link

🔭 @DeepGravity
#Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions

Powerful #GenerativeAdversarialNetworks (#GAN) have been developed to automatically synthesize realistic images from text. However, most existing tasks are limited to generating simple images such as flowers from captions. In this work, we extend this problem to the less addressed domain of face generation from fine-grained textual descriptions of faces, e.g., "A person has curly hair, oval face, and mustache". We are motivated by the potential of automated face generation to impact and assist critical tasks such as criminal face reconstruction. Since current datasets for the task are either very small or do not contain captions, we generate captions for images in the CelebA dataset by creating an algorithm to automatically convert a list of attributes to a set of captions. We then model the highly multi-modal problem of text to face generation as learning the conditional distribution of faces (conditioned on text) in the same latent space. We utilize the current state-of-the-art GAN (DC-GAN with GAN-CLS loss) for learning conditional multi-modality. The presence of more fine-grained details and variable length of the captions makes the problem easier for a user but more difficult to handle compared to the other text-to-image tasks. We flipped the labels for real and fake images and added noise in the discriminator. Generated images for diverse textual descriptions show promising results. In the end, we show how the widely used inception score is not a good metric to evaluate the performance of generative models used for synthesizing faces from text.

Link

🔭 @DeepGravity
#DBSN: Measuring Uncertainty through #Bayesian Learning of #DeepNeuralNetwork Structures

Bayesian neural networks (BNNs) introduce uncertainty estimation to #deep networks by performing Bayesian inference on network weights. However, such models bring inference challenges, and furthermore BNNs with weight uncertainty rarely achieve superior performance to standard models. In this paper, we investigate a new line of Bayesian deep learning by performing Bayesian reasoning on the structure of deep neural networks. Drawing inspiration from neural architecture search, we define the network structure as gating weights on the redundant operations between computational nodes, and apply stochastic variational inference techniques to learn the structure distributions of networks. Empirically, the proposed method substantially surpasses advanced deep neural networks across a range of classification and segmentation tasks. More importantly, our approach also preserves the benefits of Bayesian principles, producing better uncertainty estimates than strong baselines including MC dropout and variational #BNNs algorithms (e.g. noisy EK-FAC).

Link

🔭 @DeepGravity
Algorithmic Improvements for #DeepReinforcement #Learning applied to Interactive Fiction

Text-based games are a natural challenge domain for deep reinforcement learning algorithms. Their state and action spaces are combinatorially large, their reward function is sparse, and they are partially observable: the agent is informed of the consequences of its actions through textual feedback. In this paper we emphasize this latter point and consider the design of a deep reinforcement learning agent that can play from feedback alone. Our design recognizes and takes advantage of the structural characteristics of text-based games. We first propose a contextualisation mechanism, based on accumulated reward, which simplifies the learning problem and mitigates partial observability. We then study different methods that rely on the notion that most actions are ineffectual in any given situation, following Zahavy et al.'s idea of an admissible action. We evaluate these techniques in a series of text-based games of increasing difficulty based on the TextWorld framework, as well as the iconic game Zork. Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games.

Link

🔭 @DeepGravity
#Google Introduces New Metrics for #AI-Generated Audio and Video Quality

Google AI researchers published two new metrics for measuring the quality of audio and video generated by deep-learning networks, the Fréchet Audio Distance (FAD) and Fréchet Video Distance (FVD). The metrics have been shown to have a high correlation with human evaluations of quality.
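
Both metrics build on the Fréchet distance between two Gaussians, each fitted to embeddings of real and of generated samples. As a minimal sketch of that idea, the snippet below computes the distance under a simplifying assumption that each Gaussian has a diagonal covariance, in which case it reduces to a per-dimension sum of squared differences in means and in standard deviations. The function name and the toy embedding vectors are hypothetical; the published metrics use full covariance matrices and embeddings taken from pretrained networks.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diag(real, generated):
    """Squared Frechet distance between two diagonal Gaussians.

    Each argument is a list of equal-length embedding vectors. A Gaussian
    with diagonal covariance is fitted to each set (an assumption made here
    to avoid a matrix square root), and the distance is accumulated per
    dimension: (mean gap)^2 + (standard deviation gap)^2.
    """
    dims = len(real[0])
    total = 0.0
    for d in range(dims):
        r = [vec[d] for vec in real]
        g = [vec[d] for vec in generated]
        mean_gap = fmean(r) - fmean(g)
        std_gap = math.sqrt(pvariance(r)) - math.sqrt(pvariance(g))
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy "embeddings": the second set is the first shifted by 1 in each dimension.
real_embeddings = [[0.0, 0.0], [2.0, 2.0]]
shifted_embeddings = [[1.0, 1.0], [3.0, 3.0]]
distance = frechet_distance_diag(real_embeddings, shifted_embeddings)
```

A set compared with itself gives a distance of zero, and the value grows as the generated statistics drift away from the real ones.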

Link

🔭 @DeepGravity
Introducing TensorBoard.dev: a new way to share your #ML experiment results

TensorBoard, TensorFlow’s visualization toolkit, is often used by researchers and engineers to visualize and understand their ML experiments. It enables tracking experiment metrics, visualizing models, profiling ML programs, visualizing hyperparameter tuning experiments, and much more.

#TensorBoard

Link

🔭 @DeepGravity
Procgen Benchmark
We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

#OpenAI

Link

🔭 @DeepGravity
Major trends in #NLP : a review of 20 years of #ACL research

The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) is starting this week in Florence, Italy. We took the opportunity to review major research trends in the animated NLP space and formulate some implications from the business perspective. The article is backed by a statistical and (guess what) NLP-based analysis of ACL papers from the last 20 years.

Link

🔭 @DeepGravity
Generalized Coefficient of Correlation for Non-Linear Relationships

What is the best correlation coefficient R(X, Y) to measure non-linear dependencies between two variables X and Y? Let's say that you want to assess whether there is a linear or quadratic relationship between X and Y. One way to do it is to perform a polynomial regression such as Y = a + bX + cX^2, and then measure the standard coefficient of correlation between the predicted and observed values. How good is this approach?
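
The approach described above can be sketched in a few lines of plain Python: fit Y = a + bX + cX^2 by least squares, then take the Pearson correlation between predicted and observed values. The helper names and the toy data are assumptions for illustration, and the normal-equations solver is a bare-bones choice that suits small, well-conditioned examples only.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the top.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares coefficients (a, b, c) of Y = a + bX + cX^2."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    lhs = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear(lhs, rhs)


def pearson(u, v):
    """Standard (linear) coefficient of correlation."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    norm = math.sqrt(sum((p - mu) ** 2 for p in u)
                     * sum((q - mv) ** 2 for q in v))
    return cov / norm


def polynomial_correlation(xs, ys):
    """Correlation between quadratic-fit predictions and observations."""
    a, b, c = fit_quadratic(xs, ys)
    predicted = [a + b * x + c * x * x for x in xs]
    return pearson(predicted, ys)


# Toy data with a purely quadratic, symmetric relationship.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x for x in xs]
linear_r = pearson(xs, ys)
quadratic_r = polynomial_correlation(xs, ys)
```

On this toy data the ordinary correlation between X and Y vanishes because the parabola is symmetric around zero, while the correlation between predicted and observed values is close to 1, which is the kind of dependency the plain coefficient misses.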

Link

🔭 @DeepGravity
Research Fellow in Deep Reinforcement Learning for Machine Theory of Mind @ Oxford Brookes

New Post-doc Opening at U. of Toronto on Deep Learning / RL for Traffic Prediction and Control

Seeking PostDoctoral Fellow in Machine Learning (Survival Prediction; Medical Informatics), University of Alberta

2 full-time academic position vacancies on Data Science and related topics in ULB, Brussels, Belgium

Postdoc at Monash University (Melbourne) for probabilistic & deep learning

MERL is seeking a motivated and qualified individual to conduct research in safe reinforcement learning (RL) and deep learning algorithms for robotics applications.

Fully-funded Post Doctoral Position at InterDigital, Information Theory for Understanding and Designing Flexible Deep Neural Networks

AI Scientist positions at AI Singapore

RL and LfD research positions (now including interns) at Bosch / UT Austin, focusing on autonomous vehicles

Looking for an Integrated Master's cum PhD studentship position across the globe in the areas of Artificial Intelligence, Machine Learning, Data Science, Natural Language Processing

Permanent academic position - Lecturer/Senior Lecturer/Reader in Media & Data Science, University of Glasgow, School of Computing Science

PhD positions in Machine Learning in ECE at George Washington University, USA

2 PhD Candidates in Computer Science, paluno - The Ruhr Institute for Software Technology, Universität Duisburg-Essen

3-year fully funded PhD position on Multimodal Machine Learning for Mental Health (CNRS GREYC, France)

Research Fellow / Senior Research Fellow at the intersection of machine learning and robotics

Two postdoctoral positions are available in the lab of Carlos Fernandez-Granda at the Courant Institute and Center for Data Science at NYU

#Job

🔭 @DeepGravity
#DeepLearning models tend to increase their accuracy as the amount of training data increases, whereas traditional #MachineLearning models such as #SVM and Naive #Bayes classifiers stop improving after a saturation point.

Link

🔭 @DeepGravity
#VariationalAutoencoder Theory

The Variational Autoencoder has taken the #MachineLearning community by storm since Kingma and Welling’s seminal paper was released in 2013.

Link

🔭 @DeepGravity
#DecisionTree vs #RandomForest vs #GradientBoostingMachines: Explained Simply

Decision Trees, Random Forests and Boosting are among the top 16 #data science and machine learning tools used by data scientists. The three methods are similar, with a significant amount of overlap. In a nutshell:

* A decision tree is a simple decision-making diagram.
* Random forests are a large number of trees, combined (using averages or "majority rules") at the end of the process.
* Gradient boosting machines also combine decision trees, but start the combining process at the beginning, instead of at the end.
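
The difference between "combining at the end" and "combining from the beginning" can be sketched with one-split regression trees (stumps) in plain Python. Everything here is a simplified illustration under stated assumptions: the function names, the 1-D toy data, the number of trees and the learning rate are all hypothetical, and real libraries use deeper trees, feature subsampling and many further refinements.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree that minimises squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    if best is None:  # no valid split (all x equal): predict the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Random-forest style: independent trees, averaged at the end."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]  # bootstrap sample
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


def fit_boosting(xs, ys, n_trees=25, learning_rate=0.5):
    """Boosting style: each tree is fitted to the errors left so far."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    trees = []
    for _ in range(n_trees):
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        residuals = [r - learning_rate * tree(x)
                     for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy 1-D regression problem.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
forest = fit_forest(xs, ys)
boosted = fit_boosting(xs, ys)
forest_mse = mean_squared_error(forest, xs, ys)
boosted_mse = mean_squared_error(boosted, xs, ys)
```

In the forest, no tree knows about the others until the final average; in boosting, every new tree depends on what the previous ones got wrong, which is the "start combining at the beginning" idea from the list above.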

Link

🔭 @DeepGravity