#Transform-Invariant #ConvolutionalNeuralNetworks for Image #Classification and Search
Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models are still poorly invariant to spatial transformations of images. Intuitively, with sufficient layers and parameters, hierarchical combinations of convolution (matrix multiplication and non-linear activation) and pooling operations should be able to learn a robust mapping from transformed input images to transform-invariant representations. In this paper, we propose randomly transforming (rotating, scaling, and translating) the feature maps of CNNs during the training stage. This prevents #CNN models from forming complex dependencies on the specific rotation, scale, and translation levels of the training images. Rather, each convolutional kernel learns to detect a feature that is generally helpful for producing the transform-invariant answer, given the combinatorially large variety of transform levels of its input feature maps. In this way, we require no extra training supervision and no modification to the optimization process or the training images. We show that random transformation significantly improves CNNs on many benchmark tasks, including small-scale image recognition, large-scale image recognition, and image retrieval. The code is available at https://github.com/jasonustc/caffe-multigpu/tree/TICNN.
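As a concrete illustration of the idea, here is a minimal PyTorch-style sketch (the module name, transform ranges, and placement are assumptions, not the paper's exact Caffe recipe): during training, each feature map in the batch is warped by a random rotation, scale, and translation before the next convolution; at test time the layer is the identity.
```python
# Training-time random affine warp of intermediate feature maps; identity at
# test time. Module name and transform ranges are illustrative assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class RandomFeatureTransform(nn.Module):
    def __init__(self, max_angle=math.pi / 6, scale_range=(0.9, 1.1), max_shift=0.1):
        super().__init__()
        self.max_angle = max_angle
        self.scale_range = scale_range
        self.max_shift = max_shift

    def forward(self, x):                      # x: (N, C, H, W) feature maps
        if not self.training:
            return x
        n = x.size(0)
        # Sample a random rotation angle, isotropic scale, and shift per sample.
        angle = (torch.rand(n, device=x.device) * 2 - 1) * self.max_angle
        scale = torch.empty(n, device=x.device).uniform_(*self.scale_range)
        shift = (torch.rand(n, 2, device=x.device) * 2 - 1) * self.max_shift
        cos, sin = torch.cos(angle) * scale, torch.sin(angle) * scale
        # Assemble 2x3 affine matrices and resample the feature maps.
        theta = torch.zeros(n, 2, 3, device=x.device)
        theta[:, 0, 0], theta[:, 0, 1], theta[:, 0, 2] = cos, -sin, shift[:, 0]
        theta[:, 1, 0], theta[:, 1, 1], theta[:, 1, 2] = sin, cos, shift[:, 1]
        grid = F.affine_grid(theta, list(x.size()), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```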
Paper
🔭 @DeepGravity
#AI Transformation Playbook, How to lead your company into the AI era, by Andrew #Ng
This AI Transformation Playbook draws on insights gleaned from leading the #Google Brain team and the Baidu AI Group, which played leading roles in transforming both Google and Baidu into great AI companies. It is possible for any enterprise to follow this Playbook and become a strong AI company, though these recommendations are tailored primarily for larger enterprises with a market cap/valuation from $500M to $500B.
Link
🔭 @DeepGravity
When Does Label Smoothing Help?
Rafael Müller, Simon Kornblith, Geoffrey #Hinton
The #generalization and learning speed of a multi-class neural network can often be significantly improved by using soft targets that are a weighted average of the hard targets and the uniform distribution over labels. Smoothing the labels in this way prevents the network from becoming over-confident, and label smoothing has been used in many state-of-the-art models, including image classification, language translation, and speech recognition. Despite its widespread use, label smoothing is still poorly understood. Here we show empirically that, in addition to improving generalization, label smoothing improves model calibration, which can significantly improve beam search. However, we also observe that if a teacher network is trained with label smoothing, knowledge distillation into a student network is much less effective. To explain these observations, we visualize how label smoothing changes the representations learned by the penultimate layer of the network. We show that label smoothing encourages the representations of training examples from the same class to group in tight clusters. This results in a loss of information in the logits about resemblances between instances of different classes, which is necessary for distillation, but does not hurt generalization or calibration of the model's predictions.
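For reference, a minimal sketch of the soft targets described above (the function name and the smoothing weight alpha=0.1 are illustrative): the target distribution is (1 - alpha) times the one-hot labels plus alpha times the uniform distribution, trained with cross-entropy. Setting alpha=0 recovers standard cross-entropy.
```python
# Label smoothing: soft targets as a weighted average of the one-hot (hard)
# targets and the uniform distribution over labels.
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, labels, alpha=0.1):
    n_classes = logits.size(-1)
    one_hot = F.one_hot(labels, n_classes).float()
    targets = (1.0 - alpha) * one_hot + alpha / n_classes  # smoothed targets
    # Cross-entropy against the soft targets.
    return -(targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```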
Paper
🔭 @DeepGravity
#DeepSpeech 0.6: Mozilla’s #Speech_to_Text Engine Gets Fast, Lean, and Ubiquitous
The #MachineLearning team at #Mozilla continues work on DeepSpeech, an automatic speech recognition (ASR) engine which aims to make speech recognition technology and trained models openly available to developers. DeepSpeech is a deep learning-based ASR engine with a simple API. We also provide pre-trained English models.
Our latest release, version v0.6, offers the highest quality, most feature-packed model so far. In this overview, we’ll show how DeepSpeech can transform your applications by enabling client-side, low-latency, and privacy-preserving speech recognition capabilities.
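A minimal usage sketch of the Python API (hedged: constructor and decoder arguments changed between 0.x releases, and the file paths here are placeholders, so check the signatures of your installed version):
```python
# Transcribe a 16 kHz, 16-bit mono WAV file with a pre-trained model.
# Model path and beam width are placeholders, not canonical values.
import wave
import numpy as np
import deepspeech

model = deepspeech.Model('output_graph.pbmm', 500)  # model file, beam width
with wave.open('audio.wav', 'rb') as w:
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
print(model.stt(audio))  # returns the transcript as a string
```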
Link
🔭 @DeepGravity
Beyond #Accuracy: #Precision and #Recall
Precision is defined as the number of true positives divided by the number of true positives plus the number of false positives. False positives are cases the model incorrectly labels as positive that are actually negative, or in our example, individuals the model classifies as terrorists who are not. While recall expresses the ability to find all relevant instances in a dataset, precision expresses the proportion of the data points the model labels as relevant that actually are relevant.
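The definitions above translate directly into code; a minimal sketch for binary labels (1 = positive, names illustrative):
```python
def precision_recall(y_true, y_pred):
    # Count true positives, false positives, and false negatives.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are real
    recall = tp / (tp + fn) if tp + fn else 0.0     # of real positives, how many were found
    return precision, recall
```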
Link
🔭 @DeepGravity
A Gentle Introduction to KFold Cross-Validation
KFold vs StratifiedKFold
Just a small notebook to point out that KFold and StratifiedKFold may not do what you think.
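A minimal sketch of the pitfall on made-up toy data: with sorted labels and no shuffling, plain KFold can put an entire class into the test fold, while StratifiedKFold preserves the class ratio in every fold.
```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.arange(20).reshape(-1, 1)
y = np.array([0] * 10 + [1] * 10)  # sorted labels: the worst case for plain KFold

for name, cv in [("KFold", KFold(n_splits=2)),
                 ("StratifiedKFold", StratifiedKFold(n_splits=2))]:
    for _, test_idx in cv.split(X, y):
        print(name, "test labels:", y[test_idx])
# KFold's first test fold contains only class 0; StratifiedKFold keeps a 50/50 mix.
```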
🔭 @DeepGravity
Deciphering interaction fingerprints from protein molecular surfaces using geometric #DeepLearning
Abstract
Predicting interactions between proteins and other biomolecules solely based on structure remains a challenge in biology. A high-level representation of protein structure, the molecular surface, displays patterns of chemical and geometric features that fingerprint a protein’s modes of interactions with other biomolecules. We hypothesize that proteins participating in similar interactions may share common fingerprints, independent of their evolutionary history. Fingerprints may be difficult to grasp by visual analysis but could be learned from large-scale datasets. We present MaSIF (molecular surface interaction fingerprinting), a conceptual framework based on a geometric deep learning method to capture fingerprints that are important for specific biomolecular interactions. We showcase MaSIF with three prediction challenges: protein pocket-ligand prediction, protein–protein interaction site prediction and ultrafast scanning of protein surfaces for prediction of protein–protein complexes. We anticipate that our conceptual framework will lead to improvements in our understanding of protein function and design.
Paper
🔭 @DeepGravity
Machine Unlearning
Once users have shared their data online, it is generally difficult for them to revoke access and ask for the data to be deleted. Machine learning (ML) exacerbates this problem because any model trained with said data may have memorized it, putting users at risk of a successful privacy attack exposing their information. Yet, having models unlearn is notoriously difficult. After a data point is removed from a training set, one often resorts to entirely retraining downstream models from scratch. We introduce SISA training, a framework that decreases the number of model parameters affected by an unlearning request and caches intermediate outputs of the training algorithm to limit the number of model updates that need to be computed to have these parameters unlearn. This framework reduces the computational overhead associated with unlearning, even in the worst-case setting where unlearning requests are made uniformly across the training set. In some cases, we may have a prior on the distribution of unlearning requests that will be issued by users. We may take this prior into account to partition and order data accordingly and further decrease overhead from unlearning. Our evaluation spans two datasets from different application domains, with corresponding motivations for unlearning. Under no distributional assumptions, we observe that SISA training improves unlearning for the Purchase dataset by 3.13x, and 1.658x for the SVHN dataset, over retraining from scratch. We also validate how knowledge of the unlearning distribution provides further improvements in retraining time by simulating a scenario where we model unlearning requests that come from users of a commercial product that is available in countries with varying sensitivity to privacy. Our work contributes to practical data governance in machine learning.
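A heavily simplified sketch of the sharding half of SISA (the "slicing" of shards and the caching of training checkpoints are omitted, and the base learner is an arbitrary stand-in): train one independent model per shard and aggregate by majority vote, so an unlearning request retrains only the shard that held the deleted point.
```python
# Sharded training with per-shard unlearning; labels must be non-negative ints.
import numpy as np
from sklearn.linear_model import LogisticRegression

class SISAEnsemble:
    def __init__(self, n_shards=4):
        self.n_shards = n_shards

    def fit(self, X, y):
        self.X, self.y = X, y
        # Interleaved sharding: index j goes to shard j % n_shards.
        self.shards = [list(range(i, len(X), self.n_shards))
                       for i in range(self.n_shards)]
        self.models = [LogisticRegression().fit(X[idx], y[idx])
                       for idx in self.shards]
        return self

    def unlearn(self, point_idx):
        # Retrain only the shard containing the deleted point.
        s = point_idx % self.n_shards
        self.shards[s].remove(point_idx)
        idx = self.shards[s]
        self.models[s] = LogisticRegression().fit(self.X[idx], self.y[idx])

    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.models])  # (n_models, n)
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```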
Paper
🔭 @DeepGravity
Join millions of teachers and students around the globe by doing a one-hour coding activity during Computer Science Education Week this December 9-15
Link
🔭 @DeepGravity
In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors
We propose to study the generalization error of a learned predictor ĥ in terms of that of a surrogate (potentially randomized) classifier that is coupled to ĥ and designed to trade empirical risk for control of generalization error. In the case where ĥ interpolates the data, it is interesting to consider theoretical surrogate classifiers that are partially derandomized or rerandomized, e.g., fit to the training data but with modified label noise. We show that replacing ĥ by its conditional distribution with respect to an arbitrary σ-field is a viable method to derandomize. We give an example, inspired by the work of Nagarajan and Kolter (2019), where the learned classifier ĥ interpolates the training data with high probability, has small risk, and, yet, does not belong to a nonrandom class with a tight uniform bound on two-sided generalization error. At the same time, we bound the risk of ĥ in terms of a surrogate that is constructed by conditioning and shown to belong to a nonrandom class with uniformly small generalization error.
Link
🔭 @DeepGravity
Multi-Task #ReinforcementLearning without Interference
While deep reinforcement learning systems have demonstrated impressive results in domains ranging from game playing to robotic control, sample efficiency remains a major challenge, particularly as these algorithms learn individual tasks from scratch. Multi-task and goal-conditioned reinforcement learning have emerged as promising approaches for sharing structure across multiple tasks to enable more efficient learning. However, challenges in optimization have prevented such methods from realizing efficiency gains over learning tasks independently from scratch. Motivated by these challenges, we develop a general approach that can change the multi-task optimization landscape to alleviate conflicting gradients across tasks. In particular, we introduce two instantiations of this approach, one architectural and one algorithmic, that prevent gradients for different tasks from interfering with one another. On two challenging multi-task RL problems, we find that our approaches lead to greater final performance and learning efficiency than prior approaches.
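One common way to realize non-interfering gradients, sketched here purely as an illustration (this mirrors the gradient-projection idea popularized in the related "gradient surgery" line of work, not necessarily this paper's exact instantiations): when two task gradients conflict, drop the component of one along the other.
```python
# If g_i conflicts with g_j (negative inner product), project g_i onto the
# plane normal to g_j; otherwise leave it unchanged.
import torch

def deconflict(g_i, g_j):
    dot = torch.dot(g_i, g_j)
    if dot < 0:  # gradients point in conflicting directions
        g_i = g_i - (dot / g_j.norm() ** 2) * g_j
    return g_i
```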
Paper
🔭 @DeepGravity
Meta-gradient updates for training return functions for #ReinforcementLearning systems
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for reinforcement learning. The embodiments described herein apply meta-learning (and in particular, meta-gradient reinforcement learning) to learn an optimum return function G so that the training of the system is improved. This provides a more effective and efficient means of training a reinforcement learning system as the system is able to converge on an optimum set of one or more policy parameters θ more quickly by training the return function G as it goes. In particular, the return function G is made dependent on the one or more policy parameters θ and a meta-objective function J′ is used that is differentiated with respect to the one or more return parameters η to improve the training of the return function G.
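In the notation of the abstract, the two-level update can be sketched as follows (a hedged reconstruction in the style of meta-gradient reinforcement learning; the patent's exact formulation may differ):
```latex
% Inner update: policy parameters follow an update built from the
% parametrized return G_\eta computed on experience \tau.
\theta' = \theta + f(\tau, \theta, \eta)
% Meta update: differentiate the meta-objective J' through \theta' with
% respect to the return parameters \eta via the chain rule.
\Delta\eta \propto -\,\frac{\partial J'(\tau', \theta', \bar{\eta})}{\partial \eta}
            = -\,\frac{\partial J'}{\partial \theta'}\,
                 \frac{\partial \theta'}{\partial \eta}
```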
#Google
#DeepMind
Paper
🔭 @DeepGravity
#DeepMind ’s Dreamer #AI learns from the past to predict the future
Some AI systems achieve goals in challenging environments by drawing on representations of the world informed by past experiences. They generalize these to novel situations, enabling them to complete tasks even in settings they haven’t encountered before. As it turns out, reinforcement learning — a training technique that employs rewards to drive software policies toward goals — is particularly well-suited to learning world models that summarize an agent’s experience, and by extension to facilitating the learning of novel behaviors.
Article
🔭 @DeepGravity
Improved Few-Shot Visual Classification, by Peyman Bateni et al.
Few-shot learning is a fundamental task in computer vision that carries the promise of alleviating the need for exhaustively labeled data. Most few-shot learning approaches to date have focused on progressively more complex neural feature extractors and classifier adaptation strategies, as well as the refinement of the task definition itself. In this paper, we explore the hypothesis that a simple class-covariance-based distance metric, namely the Mahalanobis distance, adopted into a state of the art few-shot learning approach (CNAPS) can, in and of itself, lead to a significant performance improvement. We also discover that it is possible to learn adaptive feature extractors that allow useful estimation of the high dimensional feature covariances required by this metric from surprisingly few samples. The result of our work is a new "Simple CNAPS" architecture which has up to 9.2% fewer trainable parameters than CNAPS and performs up to 6.1% better than state of the art on the standard few-shot image classification benchmark dataset.
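A minimal sketch of the core classification rule in feature space (shrinkage toward the identity stands in for the paper's blended class/task covariance estimate; the function name and regularizer eps are illustrative):
```python
# Classify queries by squared Mahalanobis distance to class means, using a
# per-class covariance regularized toward the identity. Needs at least two
# support examples per class for np.cov to be defined.
import numpy as np

def mahalanobis_classify(query, support, labels, eps=1.0):
    classes = np.unique(labels)
    dists = []
    for c in classes:
        xc = support[labels == c]
        mu = xc.mean(axis=0)
        cov = np.cov(xc, rowvar=False) + eps * np.eye(support.shape[1])
        inv = np.linalg.inv(cov)
        d = query - mu
        dists.append(np.einsum('nd,de,ne->n', d, inv, d))  # (x-mu)^T S^-1 (x-mu)
    return classes[np.argmin(np.stack(dists, axis=1), axis=1)]
```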
Paper
🔭 @DeepGravity
#DeepDoubleDescent : Where Bigger Models and More Data Hurt
A very interesting paper by #Harvard University and #OpenAI. Abstract: We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size,…
YouTube
Deep Double Descent
This video explores a new study on double descent evident in Deep Learning models such as CNNs, ResNets and Transformers. The double descent phenomenon is an...