Self Supervised Boy
Reading papers on self/semi/weak supervised DL methods. Papers here: https://www.notion.so/Self-Supervised-Boy-papers-reading-751aa85ffca948d28feacc45dc3cb0c0
contact me @martolod
Channel created
Who are you? I'm a PhD student doing DL research, mostly on weak/self-supervision, and on unsupervised things as well.
What happens here? I write reviews of the papers I read.
Why the hell? Because it lets me practice writing and understand the papers I read more deeply.
So what? I'll be happy if it's somehow interesting to someone else. Anyway, here's my archive: https://www.notion.so/Self-Supervised-Boy-papers-reading-751aa85ffca948d28feacc45dc3cb0c0.
Self-training über alles. Another paper on self-training from Quoc Le's group.
They compare self-training with supervised and self-supervised pre-training on different tasks. Self-training seemingly works better, while pre-training can even hurt final quality when enough labeled data is available or strong augmentation is applied.
The main practical takeaway: self-training adds quality even on top of pre-training, so it may be worth self-training your baseline models to get a better start.
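A minimal sketch of such a self-training round (my simplification, not the paper's exact recipe; the teacher and student models and the data loaders are assumed placeholders):

import torch

def pseudo_label(teacher, unlabeled_loader, threshold=0.9):
    # Collect confident teacher predictions as pseudo-labels.
    teacher.eval()
    pseudo = []
    with torch.no_grad():
        for x in unlabeled_loader:
            probs = torch.softmax(teacher(x), dim=1)
            conf, labels = probs.max(dim=1)
            keep = conf > threshold  # keep only confident samples
            pseudo.append((x[keep], labels[keep]))
    return pseudo

def self_train_step(student, optimizer, criterion, x, y):
    # One ordinary supervised step, run on real and pseudo-labeled batches alike.
    student.train()
    optimizer.zero_grad()
    loss = criterion(student(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()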
More detailed with tables here: https://www.notion.so/Rethinking-Pre-training-and-Self-training-e00596e346fa4261af68db7409fbbde6
Source here: https://arxiv.org/pdf/2006.06882.pdf
Unsupervised segmentation with autoregressive models. The authors propose to scan the image in different scanning orders and require that nearby pixels produce close embeddings regardless of the scanning order.
SoTA across unsupervised segmentation benchmarks.
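A rough sketch of that consistency idea (my own simplification: I emulate a second scanning order by flipping the input, which reverses each raster row; ar_encoder is an assumed autoregressive encoder returning per-pixel embeddings):

import torch
import torch.nn.functional as F

def scan_consistency_loss(ar_encoder, image):
    # Embeddings of the same pixel under two scanning orders should match.
    emb_a = ar_encoder(image)                        # (B, C, H, W), raster order
    emb_b = ar_encoder(torch.flip(image, dims=[3]))  # reversed scanning order
    emb_b = torch.flip(emb_b, dims=[3])              # align pixels back
    return 1 - F.cosine_similarity(emb_a, emb_b, dim=1).mean()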
More detailed with images and losses here: https://www.notion.so/Autoregressive-Unsupervised-Image-Segmentation-211c6e8ec6174fe9929e53e5140e1024
Source here: https://arxiv.org/pdf/2007.08247.pdf
One more update on the teacher-student paradigm from Quoc Le's group.
Now the teacher is continuously updated to direct the student towards the optimum w.r.t. the labeled data. At each step, the update gradient for the teacher model is taken as the gradient towards the current pseudo-labels, and is then scaled according to the cosine similarity between two gradients of the student model: from unlabeled and from labeled data.
Achieves a new SoTA on ImageNet (+1.6% top-1 accuracy).
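A very rough sketch of that scaling factor as I read it (placeholder models and losses; the actual derivation in the paper is more careful):

import torch
import torch.nn.functional as F

def teacher_update_scale(student, criterion, x_unlab, pseudo_y, x_lab, y_lab):
    # Cosine similarity between student gradients on pseudo-labeled and labeled
    # data, used to scale the teacher's gradient towards the current pseudo-labels.
    g_unlab = torch.autograd.grad(criterion(student(x_unlab), pseudo_y),
                                  student.parameters())
    g_lab = torch.autograd.grad(criterion(student(x_lab), y_lab),
                                student.parameters())
    flat = lambda grads: torch.cat([g.reshape(-1) for g in grads])
    return F.cosine_similarity(flat(g_unlab), flat(g_lab), dim=0)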

More detailed with formulas here: https://www.notion.so/Meta-Pseudo-Label-b83ac7b7086e47e1bef749bc3e8e2124
Source here: https://arxiv.org/pdf/2003.10580.pdf
An oral from ICLR 2021 on using the teacher-student setup for cross-domain transfer learning. The teacher is trained on the labeled data and produces pseudolabels for the unlabeled data in the target domain. This lets the student learn useful in-domain representations and gain 2.9% accuracy on one-shot learning with relatively low training effort.

With more fluff here: https://www.notion.so/Self-training-for-Few-shot-Transfer-Across-Extreme-Task-Differences-bfe820f60b4b474796fd0a5b6b6ad312
Source here: https://openreview.net/pdf?id=O3Y56aqpChA
One more oral from ICLR 2021. Theoretical this time, so there's no way I can set up a detailed overview.

Key points:
1) The authors altered the definition of a neighbourhood. Instead of measuring the distance between samples directly, they call x' a neighbour of x if there is an augmentation A such that the distance |A(x) - x'| is below a threshold (written out after this list).
2) Assumption 1: any small subset of in-class samples expands (by adding its neighbours) into a larger in-class subset of samples.
3) Assumption 2: the probability of x' being a neighbour of x while the two have different ground-truth labels is low, almost negligible.
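The neighbourhood from point 1 written out (my notation, not the paper's exact one):

\mathcal{N}(x) = \{\, x' : \exists\, A \in \mathcal{A} \ \text{such that}\ \lVert A(x) - x' \rVert \le r \,\}

and the expansion from point 2, roughly: for any small in-class subset S, P(\mathcal{N}(S)) \ge c \cdot P(S) for some c > 1.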

The authors show that these are sufficient conditions for consistency regularisation in self-supervised and transfer learning to give good results.

This nicely complements the previous paper on transfer learning, where the authors showed how consistency regularisation helps. It also ties in nicely with the work on smart augmentation strategies.

source: https://openreview.net/pdf?id=rC8sJ4i6kaH
A spotlight from ICLR 2021 by Schmidhuber's group. It proposes an unsupervised keypoint localisation algorithm with an RL application on Atari.

A very clear and simple idea:
1. Compress the image with a VAE and use features from some intermediate layer of the encoder later on.
2. Try to predict each feature vector from its surrounding vectors. If the prediction error is high, we have found some important object (sketched after this list).
3. Compress the error map of the image as a mixture of Gaussians with fixed covariance, each centre representing one keypoint.
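A sketch of the predictability step 2 (my simplification; features would come from an intermediate VAE encoder layer, and local_predictor is an assumed small network, e.g. a 1x1 Conv1d mapping a flattened neighbourhood back to a feature vector):

import torch
import torch.nn.functional as F

def predictability_error_map(features, local_predictor, k=3):
    # High prediction error = hard to predict from context = likely an object.
    B, C, H, W = features.shape
    # Gather the k x k neighbourhood around each position and drop the centre.
    patches = F.unfold(features, kernel_size=k, padding=k // 2)  # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H * W)
    centre = k * k // 2
    context = torch.cat([patches[:, :, :centre], patches[:, :, centre + 1:]], dim=2)
    pred = local_predictor(context.reshape(B, -1, H * W))        # assumed (B, C, H*W)
    target = features.view(B, C, H * W)
    return ((pred - target) ** 2).mean(dim=1).view(B, H, W)      # per-pixel error map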

SoTA on Atari games, more robust to input noise.

It could probably also be used outside the simple Atari setting, if you have enough data to train on and take later layers of the encoder.

With colorful images here: https://www.notion.so/Unsupervised-Object-Keypoint-Learning-Using-Local-Spatial-Predictability-ddcf36a856ff4e389050b3089cd710bc
Source here: https://openreview.net/pdf?id=GJwMHetHc73
Yet another paper from ICLR 2021. This one proposes an advanced method of pseudolabel generation.

In a few words: we simultaneously train an encoder-decoder model to predict the segmentation on supervised data, and to produce pseudolabels and predictions on unsupervised data that are consistent with each other independently of the augmentation.
As the pseudolabel they use a specially calibrated Grad-CAM from the encoder part of the model, fused with the prediction of the decoder part, again via a fancy procedure.

With some more fluff and notes here.
Source here.
Pretty simple keypoint localisation pipeline with self-supervision constraints for unlabeled data. Again from ICLR 2021.

The key ideas are:
1. Add a classification task for the type of keypoint as a function of the localisation network features. This is usually not required because of the fixed order of keypoints in the model predictions, but this small additional loss actually boosts performance more than the next two constraints.
2. Add the constraint that localising keypoints on a spatially augmented image should give the same result as spatially augmenting the localisation map (sketched after this list).
3. Add the constraint that the representation vectors of keypoints should be invariant to augmentation.
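A sketch of constraint 2, the equivariance one (my code, not the paper's; localiser is assumed to return per-keypoint heatmaps, and warp applies one fixed spatial transform to images and maps alike):

import torch

def equivariance_loss(localiser, image, warp):
    # Localising on a warped image should equal warping the localisation map.
    heatmaps = localiser(image)              # (B, K, H, W), one map per keypoint
    warped_maps = warp(heatmaps)             # transform the prediction
    maps_of_warped = localiser(warp(image))  # predict on the transformed input
    return ((warped_maps - maps_of_warped) ** 2).mean()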

And here they are, getting SoTA results on several challenging datasets, even when 100% of the dataset is used as labeled data.

With a bit more information here.
Source here.
Yet another simple approach leading to unsupervised segmentation. Mostly useful as pre-training, though.

The proposed pipeline first mines salient object areas (with any available framework, possibly a supervised one) and then runs contrastive learning on pixel embeddings inside those regions. In this second step, each individual pixel embedding is attracted to the mean embedding of its object and pushed away from the mean embeddings of other objects. This detail distinguishes it from some previously proposed pipelines and allows training at a larger scale, because the number of loss pairs grows more slowly.
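A sketch of that pixel-to-mean attraction and repulsion (my simplification; pixel_emb is an (N, D) tensor of L2-normalised embeddings of pixels from one object, own_mean and other_means the normalised region means):

import torch
import torch.nn.functional as F

def region_contrastive_loss(pixel_emb, own_mean, other_means, tau=0.1):
    # Pull pixel embeddings to their object's mean, push away from other means.
    pos = pixel_emb @ own_mean / tau         # (N,) similarity to own region
    neg = pixel_emb @ other_means.t() / tau  # (N, M) similarity to other regions
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)
    targets = torch.zeros(len(pixel_emb), dtype=torch.long)  # positive at index 0
    return F.cross_entropy(logits, targets)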

Less briefly and with some external links here.
Source here.
A bit old (NeurIPS 2019), but an interesting take on saliency prediction.

Instead of using a direct mixture of different unsupervised salient-region prediction algorithms and focusing on the fusion strategy, the authors propose to use distillation in neural networks as a way to refine each algorithm's predictions separately. The paper goes through several steps of distillation, self-training and the use of a moving average to stabilise the predictions of each method separately. After these steps, the authors employ the accumulated averages as labels for the final network training.
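The moving-average part fits in a couple of lines (my reading; alpha and the update schedule are assumptions):

def update_running_labels(running, new_prediction, alpha=0.9):
    # Exponential moving average of one method's predictions,
    # later used as the labels for the final network.
    return alpha * running + (1 - alpha) * new_prediction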

Slightly more words here.
Source here.
Forwarded from Gradient Dude
Facebook open-sourced a library for state-of-the-art self-supervised learning: VISSL.

+ It contains reproducible reference implementations of SOTA self-supervision approaches (like SimCLR, MoCo, PIRL, SwAV, etc.) and their components, which can be reused. It also supports supervised training.
+ It is easy to train a model on single-GPU, multi-GPU and multi-node setups. Seamless scaling to large-scale data and model sizes with FP16, LARC, etc.

Finally somebody unified all recent works in one modular framework. I don't know about you, but I'm very happy 😌!

VISSL: https://vissl.ai/
Blogpost: https://ai.facebook.com/blog/seer-the-start-of-a-more-powerful-flexible-and-accessible-era-for-computer-vision
Tutorials in Google Colab: https://vissl.ai/tutorials/
A new paper from Yann LeCun, who continues to introduce new features inspired by biology.

For each batch of input samples we produce two batches of vector representations, which differ only in the (randomly sampled) augmentation. From these we can calculate the cross-correlation matrix of the representations (cross-correlation between the two sets of augmentations, that is). The loss itself pushes this matrix to be as close to the identity as possible. Intuitively, this makes the representations invariant to the augmentation, since the main diagonal tends to 1, and non-redundant and non-trivial, since the values off the main diagonal tend to 0.
The authors show via rigorous ablation tests how this helps to (1) ease the requirements for a large batch size, (2) avoid the fuss of negative mining, and (3) take advantage of the dimensionality of the representation.
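The loss in a few lines, as I understand it (the normalisation details and the lambda value are my assumptions):

import torch

def barlow_twins_style_loss(z_a, z_b, lam=5e-3):
    # z_a, z_b: (N, D) representations of the same batch under two augmentations.
    N, D = z_a.shape
    z_a = (z_a - z_a.mean(0)) / z_a.std(0)  # normalise each dimension over the batch
    z_b = (z_b - z_b.mean(0)) / z_b.std(0)
    c = (z_a.T @ z_b) / N                   # (D, D) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()               # invariance term
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # redundancy reduction
    return on_diag + lam * off_diag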

Expanded in more detail here.
Much more discussion and a great overview of the area in the source.

P.S. It is always such a pleasure to read papers like this, where the authors propose concepts so clear that they have much more space left for discussion.
Interactive Weak Supervision paper from ICLR 2021.

In contrast to classical active learning, where experts are queried to assess individual samples, the idea of this paper is to have them assess automatically generated labeling heuristics. The authors argue that since experts are good at writing such heuristics from scratch, they should be able to judge auto-generated ones. To rank the heuristics that have not been assessed yet, the authors propose to train an ensemble of models to predict the assessor's mark for a heuristic. As input for these models they use a fingerprint of the heuristic: its concatenated predictions on some subset of the data.
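The fingerprint idea in code (my sketch; heuristic is a labeling function and probe_set an assumed fixed subset of unlabeled samples):

import numpy as np

def heuristic_fingerprint(heuristic, probe_set):
    # Represent a labeling heuristic by its concatenated predictions on a fixed
    # probe set; an ensemble then regresses expert marks from these fingerprints.
    return np.concatenate([np.atleast_1d(heuristic(x)) for x in probe_set])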

There are no very fancy results, there are some concerns raised by the reviewers, and there is some strange notation in this paper. Yet the idea looks interesting to me.

With a bit deeper description (and one unanswered question) here.
Source (and rebuttal comments with important links) there.