echoinside
ML in computer graphics and random stuff.
Any feedback: @fogside
The Cycles-X rendering engine is available in an experimental branch, and it works significantly faster than Cycles on both CPU and GPU.
There is much to be done. We expect it will take at least 6 months until this work is part of an official Blender release.
- Official blog post
#blender
New Fb Oculus avatars
Appearing first in three games for Quest.
- source
Oculus is beginning to roll out redesigned avatars that are more expressive and customizable than those that launched in 2016.
By the end of 2021, Oculus will have opened its new avatar SDK to all developers, and these VR personas will be supported in Facebook Horizon, the company’s own expansive social VR playground. Though, games are just one application for these refreshed avatars. Oculus says the avatar you create will eventually appear in some form within the Facebook app, Messenger, Instagram, and more, but only if you choose to.
#avatars #VR
Softwrap - Dynamics For Retopology.
- available on blendermarket
Softwrap works by running a custom softbody simulation while snapping in a way similar to the shrinkwrap modifier.
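Not the addon's actual code, just a minimal NumPy sketch of that idea as I read it: each step applies edge-spring (softbody) forces and then pulls every vertex toward its nearest point on the target, the way shrinkwrap snapping does. The point-cloud target and all names here are stand-ins for illustration.

```python
import numpy as np

def softwrap_step(verts, edges, rest_len, target_pts,
                  stiffness=0.5, snap=0.2, dt=0.1):
    """One toy relaxation step: springs keep the retopo mesh coherent,
    a shrinkwrap-like term pulls vertices onto the target."""
    forces = np.zeros_like(verts)

    # Softbody part: simple edge springs toward their rest length.
    i, j = edges[:, 0], edges[:, 1]
    d = verts[j] - verts[i]
    dist = np.linalg.norm(d, axis=1, keepdims=True) + 1e-9
    f = stiffness * (dist - rest_len[:, None]) * d / dist
    np.add.at(forces, i, f)
    np.add.at(forces, j, -f)

    # Snapping part: pull each vertex toward its closest target point.
    # (The real addon snaps to the target *surface*; points are a stand-in.)
    nearest = target_pts[np.argmin(
        ((verts[:, None, :] - target_pts[None, :, :]) ** 2).sum(-1), axis=1)]
    forces += snap * (nearest - verts)

    return verts + dt * forces

# Toy usage: a 2-vertex "mesh" relaxing onto a small point cloud.
verts = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0]])
edges = np.array([[0, 1]])
rest_len = np.array([1.0])
target = np.random.rand(100, 3)
for _ in range(50):
    verts = softwrap_step(verts, edges, rest_len, target)
```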
#simulation #physics #tools #blender
Forwarded from Denis Sexy IT 🤖
A logical follow-up to the neural network that became popular for how cool it is at animating photos of faces: moving black-and-white photos, portraits, and memes are all the output of an algorithm called First Order Model.

Even though the algorithm works well, animating anything other than faces with it is rather difficult, even though it does support that: the glitches and artifacts create a rather unpleasant effect.

And now, thanks to a group of researchers, memes will soon be animatable at full height: the new algorithm can already figure out which body parts in a photo would move, and how, based on the driving video. I put together a cut of the videos, everything is clear from it (the animated figurine turned out especially creepy).

Project page:
https://snap-research.github.io/articulated-animation/
(the project code will be released later)
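For reference, the core trick behind this family of methods (First Order Model and its articulated successor) is to estimate a transform per keypoint/region in both the source photo and the driving frame, compose them into a dense warp, and deform the source image with it. Below is a hedged toy sketch of that warping step in PyTorch; the region masks, per-region affine transforms, and shapes are my assumptions, not the authors' code.

```python
import torch

def region_warp_grid(affine_src, affine_drv, masks, h, w):
    """Toy version of the warping idea: each region k has an affine
    transform estimated in the source frame and in the driving frame;
    composing them gives a per-region warp, and soft region masks blend
    the warps into one dense flow field used to deform the source image.

    affine_src, affine_drv: (K, 2, 3) affine matrices per region
    masks: (K, H, W) soft assignment of pixels to regions (sums to 1)
    """
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1)        # (H, W, 3)

    # Map driving-frame coordinates back into the source frame, per region:
    # p_src = A_src * A_drv^{-1} * p_drv  (homogeneous 2D affine).
    warps = []
    for a_s, a_d in zip(affine_src, affine_drv):
        a_d_inv = torch.inverse(torch.cat([a_d, torch.tensor([[0., 0., 1.]])]))
        a = torch.cat([a_s, torch.tensor([[0., 0., 1.]])]) @ a_d_inv  # (3, 3)
        warps.append((grid @ a.T)[..., :2])                           # (H, W, 2)
    warps = torch.stack(warps)                                        # (K, H, W, 2)

    # Blend per-region warps with the soft masks into one sampling grid,
    # ready to be fed to F.grid_sample(source_image, grid).
    return (masks.unsqueeze(-1) * warps).sum(0)                       # (H, W, 2)

# Toy usage: 3 regions on a 64x64 grid, identity transforms -> identity warp.
K, H, W = 3, 64, 64
eye = torch.eye(2, 3).expand(K, 2, 3)
masks = torch.softmax(torch.randn(K, H, W), dim=0)
grid = region_warp_grid(eye, eye, masks, H, W)
```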
Sketch-based Normal Map Generation with Geometric Sampling
* pdf, abs
A normal map is an important and efficient way to represent complex 3D models. A designer may benefit from the auto-generation of high-quality and accurate normal maps from freehand sketches in 3D content creation. This paper proposes a deep generative model for generating normal maps from users' sketches with geometric sampling. Our generative model is based on a Conditional Generative Adversarial Network with curvature-sensitive point sampling of the conditional masks. Used as additional network input, the sampled points help eliminate ambiguity in the generation results. In addition, we adopt a U-Net-structured discriminator to help the generator train better. It is verified that the proposed framework can generate more accurate normal maps.
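A hedged sketch of how I read the setup: a pix2pix-style conditional GAN step where the condition is the sketch plus sparse normal samples taken at high-curvature points. G, D, the channel layout, and the sampling scheme are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F

def cgan_step(G, D, sketch, normal_gt, curvature, k=2048, lambda_l1=100.0):
    """One conditional-GAN training step where the condition is the sketch
    plus sparse normal hints sampled at curvature-weighted points (my
    reading of 'curvature-sensitive point sampling of conditional masks').

    sketch:    (B, 1, H, W)  input line drawing
    normal_gt: (B, 3, H, W)  ground-truth normal map
    curvature: (B, 1, H, W)  curvature magnitude used as sampling weight
    """
    b, _, h, w = sketch.shape

    # Sample k pixels per image with probability proportional to curvature.
    probs = curvature.flatten(1) + 1e-8
    idx = torch.multinomial(probs / probs.sum(1, keepdim=True), k)
    mask = torch.zeros(b, h * w, device=sketch.device).scatter_(1, idx, 1.0)
    mask = mask.view(b, 1, h, w)

    # Condition = sketch + sparse normal hints at the sampled points.
    cond = torch.cat([sketch, mask, mask * normal_gt], dim=1)
    fake = G(cond)

    # Standard conditional-GAN losses (D takes condition + image channels).
    real_logits = D(torch.cat([cond, normal_gt], dim=1))
    fake_logits = D(torch.cat([cond, fake.detach()], dim=1))
    d_loss = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) \
           + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))

    gen_logits = D(torch.cat([cond, fake], dim=1))
    g_loss = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits)) \
           + lambda_l1 * F.l1_loss(fake, normal_gt)
    return g_loss, d_loss
```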
#gan #sketch
NVIDIA Omniverse Audio2Face is now available in open beta. Unfortunately, it works only on Windows right now, and it requires an RTX GPU. For some reason I think this kind of product would be much more consumer-friendly as a web app, like Mixamo or the MetaHuman Creator.
- download
- tutorial
Audio2Face simplifies animation of a 3D character to match any voice-over track, whether you’re animating characters for a game, film, real-time digital assistants, or just for fun. You can use the app for interactive real-time applications or as a traditional facial animation authoring tool. Run the results live or bake them out, it’s up to you.
#speech2animation
dualFace: Two-Stage Drawing Guidance for Freehand Portrait Sketching (CVMJ)
[JAIST, University of Tokyo]
* youtube
* project page
* code
* abs, pdf
In this paper, we propose dualFace, a portrait drawing interface that assists users with different levels of drawing skill in completing recognizable and authentic face sketches. dualFace consists of two-stage drawing assistance that provides global and local visual guidance: global guidance, which helps users draw contour lines of portraits (i.e., the geometric structure), and local guidance, which helps users draw the details of facial parts (conforming to the user-drawn contour lines), inspired by traditional artist workflows in portrait drawing. In the global-guidance stage, the user draws several contour lines, and dualFace then retrieves several relevant images from an internal database and displays the suggested face contour lines over the background of the canvas. In the local-guidance stage, we synthesize detailed portrait images with a deep generative model from the user-drawn contour lines and use the synthesized results as detailed drawing guidance.
#sketch #retrieval #face
Sceneformer: Indoor Scene Generation with Transformers
[Technical University of Munich]
* project page
* github
* pdf, abs
We address the task of indoor scene generation by generating a sequence of objects, along with their locations and orientations, conditioned on a room layout. Large-scale indoor scene datasets allow us to extract patterns from user-designed indoor scenes and generate new scenes based on these patterns. Existing methods rely on the 2D or 3D appearance of these scenes in addition to object positions, and make assumptions about the possible relations between objects. In contrast, we do not use any appearance information and implicitly learn object relations using the self-attention mechanism of transformers. Our method is also flexible, as it can be conditioned not only on the room layout but also on text descriptions of the room, using only the cross-attention mechanism of transformers.
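A hedged sketch of the autoregressive idea: a transformer decoder emits one discretized object attribute token at a time (category, position, orientation, ...) while cross-attending to an encoded room layout or text condition. Vocabulary layout, sizes, and the start token are my assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ToySceneformer(nn.Module):
    """Minimal autoregressive scene generator in the spirit of Sceneformer:
    the decoder predicts the next object-attribute token while
    cross-attending to a room/text condition."""

    def __init__(self, vocab=512, d_model=256, n_layers=4, n_heads=8):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(1024, d_model)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, seq, cond):
        # seq:  (B, T) previously generated attribute tokens
        # cond: (B, S, d_model) encoded room layout / text description
        t = seq.size(1)
        x = self.tok(seq) + self.pos(torch.arange(t, device=seq.device))
        causal = torch.triu(
            torch.full((t, t), float("-inf"), device=seq.device), diagonal=1)
        h = self.decoder(x, cond, tgt_mask=causal)   # cross-attends to cond
        return self.head(h)                          # next-token logits

# Greedy sampling: feed tokens back until the scene is long enough
# (a real model would stop at an end-of-scene token).
model = ToySceneformer()
cond = torch.randn(1, 16, 256)                       # fake room encoding
seq = torch.zeros(1, 1, dtype=torch.long)            # assumed start token
for _ in range(30):
    nxt = model(seq, cond)[:, -1].argmax(-1, keepdim=True)
    seq = torch.cat([seq, nxt], dim=1)
```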
#indoor
Explaining in Style: Training a GAN to explain a classifier in StyleSpace
[Google research]
* project page
* pdf
Image classification models can depend on multiple different semantic attributes of the image. An explanation of the decision of the classifier needs to both discover and visualize these properties. Here we present StylEx, a method for doing this, by training a generative model to specifically explain multiple attributes that underlie classifier decisions. We apply StylEx to multiple domains, including animals, leaves, faces and retinal images. For these, we show how an image can be modified in different ways to change its classifier output. Our results show that the method finds attributes that align well with semantic ones, generate meaningful image-specific explanations, and are human-interpretable as measured in user-studies.
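A hedged sketch of the core search: for a given image's style code, perturb each StyleSpace coordinate, measure how much the classifier's probability for the target class changes, and keep the most influential coordinates for visualization. The generator/classifier interfaces below are assumptions, not the paper's actual procedure.

```python
import torch

@torch.no_grad()
def top_style_attributes(generator, classifier, style_code, target_class,
                         shift=2.0, topk=5):
    """For one image's style code, rank StyleSpace coordinates by how much
    shifting them changes the classifier's output for the target class.

    style_code: (1, D) flat StyleSpace vector of the image to explain
                (assumed interface: generator(style_code) -> image,
                 classifier(image) -> class logits)
    """
    base = classifier(generator(style_code)).softmax(-1)[0, target_class]
    effects = torch.zeros(style_code.size(1))
    for d in range(style_code.size(1)):
        perturbed = style_code.clone()
        perturbed[0, d] += shift
        p = classifier(generator(perturbed)).softmax(-1)[0, target_class]
        effects[d] = (p - base).abs()
    scores, idx = effects.topk(topk)
    return idx, scores   # coordinates to visualize by sweeping +/- shift
```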
#gan
New book on geometric deep learning
https://geometricdeeplearning.com/
#courses
KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
[University of Oxford, UC Berkeley, Stanford University, Google Research]
* project page
* demo
* pdf, abs
* code (planned for release on April 30)

We present KeypointDeformer, a novel unsupervised method for shape control through automatically discovered 3D keypoints. Our approach produces intuitive and semantically consistent control of shape deformations. Moreover, our discovered 3D keypoints are consistent across object category instances despite large shape variations. Since our method is unsupervised, it can be readily deployed to new object categories without requiring expensive annotations for 3D keypoints and deformations. Our method also works on real-world 3D scans of shoes from Google scanned objects.
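A simplified stand-in for the shape-control part: the paper propagates keypoint displacements through a cage-based deformation, while the toy below just blends keypoint offsets into the mesh with Gaussian weights; it only illustrates why semantically consistent keypoints make editing intuitive.

```python
import torch

def deform_with_keypoints(verts, kp_src, kp_tgt, sigma=0.2):
    """Dragging a discovered 3D keypoint moves nearby surface vertices.
    (Gaussian-weighted blending stands in for the paper's learned cage.)

    verts:  (V, 3) mesh vertices
    kp_src: (K, 3) keypoints predicted on the shape
    kp_tgt: (K, 3) keypoints after the user's edit
    """
    disp = kp_tgt - kp_src                                        # (K, 3)
    d2 = ((verts[:, None, :] - kp_src[None, :, :]) ** 2).sum(-1)  # (V, K)
    w = torch.softmax(-d2 / (2 * sigma ** 2), dim=1)              # soft influence
    return verts + w @ disp

# Example: pull one keypoint upward and watch nearby vertices follow.
verts = torch.rand(1000, 3)
kp = torch.rand(8, 3)
kp_edit = kp.clone()
kp_edit[0, 2] += 0.3
new_verts = deform_with_keypoints(verts, kp, kp_edit)
```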
#3d #unsupervised
Total Relighting:
Learning to Relight Portraits for Background Replacement
[Google Research]
* youtube
* project page
* pdf
We propose a novel system for portrait relighting and background replacement. Our technique includes alpha matting, relighting, and compositing. We demonstrate that each of these stages can be tackled in a sequential pipeline without the use of priors (e.g. background or illumination) and with no specialized acquisition techniques, using only a single RGB portrait image and a novel, target HDR lighting environment as inputs. We train our model using relit portraits of subjects captured in a light stage computational illumination system, which records multiple lighting conditions, high quality geometry, and accurate alpha mattes. To perform realistic relighting for compositing, we introduce a novel per-pixel lighting representation in a deep learning framework, which explicitly models the diffuse and the specular components of appearance.
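The learned parts (matting and the relighting network) are beyond a snippet, but the last stages of the pipeline are plain image math, so here is a hedged sketch of the diffuse + specular appearance split the abstract mentions and of the final compositing; shapes and the exact formulation are assumptions.

```python
import numpy as np

def shade(albedo, diffuse_irradiance, specular):
    """Per-pixel appearance split: an explicit diffuse term modulated by
    albedo plus a specular term (all (H, W, 3) maps; the paper's exact
    formulation is more involved)."""
    return albedo * diffuse_irradiance + specular

def composite(relit_fg, alpha, background):
    """Final stage: standard alpha compositing of the relit foreground over
    the new background rendered from the same target HDR environment."""
    return alpha * relit_fg + (1.0 - alpha) * background

# Toy shapes: everything is just per-pixel arithmetic at this point.
h, w = 4, 4
relit = shade(np.random.rand(h, w, 3), np.random.rand(h, w, 3), np.random.rand(h, w, 3))
out = composite(relit, np.random.rand(h, w, 1), np.random.rand(h, w, 3))
```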
#single_image #relighting #matting
Sculpting in the browser: https://stephaneginier.com/sculptgl #web #tools
Emerging Properties in Self-Supervised Vision Transformers (DINO)
[Facebook AI Research, Inria, Sorbonne University]
(probably everyone already noticed)
* pdf, abs
* github
* blog
* YC review
tl;dr: DINO is self-distillation with no labels.
Non-contrastive SSL that looks like BYOL with some tweaks: local and global image crops are sent to the student network, while only global crops are sent to the teacher (the multi-crop trick); a cross-entropy loss is applied between teacher and student representations. Tested with ResNets and Vision Transformers, achieving better results with the latter. Works extremely well as a feature extractor for kNN and simple linear models. It is also shown that ViTs extract impressive class-specific segmentation maps in this unsupervised setting, which look much better than in the supervised setting.
Also, batch size matters, and multi-crop matters. For example, performance is 72.5% after 46 hours of training without multi-crop (i.e. 2×224²), while DINO in the 2×224² + 10×96² crop setting reaches 74.6% in only 24 hours. Memory usage is 15.4 GB vs 9.3 GB. The best result, 76.1% top-1 accuracy, is achieved using 16 GPUs for 3 days, which is still a better compute budget than in previous unsupervised works.
It is possible to use this method with small batches and without multi-crop, but this setup wasn't studied well. According to the authors, results with smaller batch sizes (bs = 128) are slightly below the default training setup of bs = 1024 and would certainly require re-tuning hyperparameters such as the momentum rates. They also explored training a model with a batch size of 8, reaching 35.2% after 50 epochs, showing the potential for training large models that barely fit an image per GPU.
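Condensed training step, roughly following the paper (see the github link above for the real thing): the student sees global + local crops, the teacher only global crops; the loss is cross-entropy between centered/sharpened teacher outputs and student outputs; the teacher is an EMA of the student. A hedged sketch, not the official implementation:

```python
import torch
import torch.nn.functional as F

def dino_step(student, teacher, global_crops, local_crops, center,
              t_s=0.1, t_t=0.04, momentum=0.996, center_m=0.9):
    """One DINO-style update.

    global_crops: list of 2 tensors (B, C, 224, 224)
    local_crops:  list of ~10 tensors (B, C, 96, 96)
    center:       running center of teacher outputs, shape (1, K)
    """
    with torch.no_grad():
        t_raw = [teacher(x) for x in global_crops]
        t_out = [F.softmax((t - center) / t_t, dim=-1) for t in t_raw]
    s_out = [student(x) / t_s for x in global_crops + local_crops]

    loss, n = 0.0, 0
    for ti, t in enumerate(t_out):
        for si, s in enumerate(s_out):
            if si == ti:                                  # skip identical views
                continue
            loss = loss + (-t * F.log_softmax(s, dim=-1)).sum(-1).mean()
            n += 1
    loss = loss / n

    with torch.no_grad():
        # EMA teacher and running center (in the real code these happen
        # after the optimizer step).
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(momentum).add_(ps, alpha=1 - momentum)
        center.mul_(center_m).add_(
            torch.cat(t_raw).mean(0, keepdim=True), alpha=1 - center_m)
    return loss
```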
See images in comments.
#self_supervised
Vector Neurons: A General Framework for SO(3)-Equivariant Networks
[Stanford University, NVIDIA Research, Google Research, University of Toronto]
* pdf, abs
* github
* project page

Invariance and equivariance to the rotation group have been widely discussed in the 3D deep learning community for pointclouds. We introduce a general framework built on top of what we call Vector Neuron representations for creating SO(3)-equivariant neural networks for pointcloud processing. Extending neurons from 1D scalars to 3D vectors, our vector neurons enable a simple mapping of SO(3) actions to latent spaces thereby providing a framework for building equivariance in common neural operations -- including linear layers, non-linearities, pooling, and normalizations. Due to their simplicity, vector neurons are versatile and, as we demonstrate, can be incorporated into diverse network architecture backbones.
We also show for the first time a rotation equivariant reconstruction network.
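To see why the construction is so simple, here is a hedged sketch of a vector-neuron linear layer plus a cut-down version of the vector non-linearity: features are (batch, channels, 3, points) and weights mix channels only, so a global rotation applied along the 3D axis commutes with the layers. The paper's non-linearity is a bit more general; this is my simplified reading.

```python
import torch
import torch.nn as nn

class VNLinear(nn.Module):
    """Linear layer over vector-valued channels: the weight mixes channels
    only, so rotating every 3D vector by R commutes with the layer."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(c_out, c_in) / c_in ** 0.5)

    def forward(self, x):                    # x: (B, C_in, 3, N)
        return torch.einsum("oc,bcdn->bodn", self.weight, x)

class VNReLU(nn.Module):
    """Simplified vector non-linearity: learn a direction per channel and
    clip the component of each feature vector that points against it."""
    def __init__(self, c):
        super().__init__()
        self.dir = VNLinear(c, c)

    def forward(self, x):                    # x: (B, C, 3, N)
        d = self.dir(x)
        d = d / (d.norm(dim=2, keepdim=True) + 1e-8)
        dot = (x * d).sum(dim=2, keepdim=True)
        return torch.where(dot >= 0, x, x - dot * d)

# Equivariance check: rotating the input rotates the output identically.
layer = nn.Sequential(VNLinear(8, 16), VNReLU(16))
x = torch.randn(2, 8, 3, 100)
R, _ = torch.linalg.qr(torch.randn(3, 3))
y1 = torch.einsum("ij,bcjn->bcin", R, layer(x))
y2 = layer(torch.einsum("ij,bcjn->bcin", R, x))
print(torch.allclose(y1, y2, atol=1e-5))     # expected: True
```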
#3d #point_clouds #implicit_geometry