echoinside
ML in computer graphics and random stuff.
Any feedback: @fogside
MobileStyleGAN: A Lightweight Convolutional Neural Network for High-Fidelity Image Synthesis
* pdf
* github

We introduce the MobileStyleGAN architecture, which has x3.5 fewer parameters and is x9.5 less computationally complex than StyleGAN2, while providing comparable quality. In contrast to previous works, we present an end-to-end wavelet-based CNN architecture for generative networks. We show that integrating wavelet-based methods into GANs allows more lightweight networks to be designed and provides a smoother latent space.
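To make the wavelet idea concrete, here is a minimal NumPy sketch (my illustration, not the paper's code) of a one-level 2D Haar transform: a generator working in this domain predicts the four sub-bands, and a fixed, exactly invertible transform maps them back to pixels.
```python
# Minimal 2D Haar transform and its exact inverse (illustration only).
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar transform of an HxW array (H, W even)."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0  # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0  # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0     # low-low: coarse image
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0     # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0     # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0     # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    img = np.empty((a.shape[0] * 2, a.shape[1]))
    img[0::2, :], img[1::2, :] = a + d, a - d
    return img

img = np.random.rand(8, 8)
assert np.allclose(haar_idwt2(*haar_dwt2(img)), img)  # lossless round trip
```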

* Similar work with code
#gan #wavelet #mobile
Pixel Codec Avatars
[Facebook Reality Labs Research]
* pdf
* TUM AI Lecture Series - Photorealistic Telepresence (Yaser Sheikh)
Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances. In this work, we present Pixel Codec Avatars (PiCA): a deep generative model of 3D human faces that achieves state-of-the-art reconstruction performance while being computationally efficient and adaptive to the rendering conditions during execution. Our model combines two core ideas:
(1) a fully convolutional architecture for decoding spatially varying features, and
(2) a rendering-adaptive per-pixel decoder.
Both techniques are integrated via a dense surface representation that is learned in a weakly-supervised manner from low-topology mesh tracking over training images. We demonstrate that PiCA improves reconstruction over existing techniques across testing expressions and views on persons of different gender and skin tone. Importantly, we show that the PiCA model is much smaller than the state-of-the-art baseline model, and makes multi-person telecommunication possible: on a single Oculus Quest 2 mobile VR headset, 5 avatars are rendered in real time in the same scene.
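A hypothetical sketch of idea (2) as I read it: instead of decoding a full texture map, a small MLP decodes color only at the pixels the rasterizer actually covers, so the cost adapts to how much of the screen the face occupies. All shapes and layer sizes below are invented for illustration.
```python
# Hypothetical sketch of rendering-adaptive per-pixel decoding
# (not the authors' model): decode color only at covered pixels.
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(16 + 3, 64), nn.ReLU(), nn.Linear(64, 3))

H, W = 256, 256
feat = torch.randn(H, W, 16)   # rasterized per-pixel surface features
view = torch.randn(H, W, 3)    # per-pixel view direction
mask = torch.rand(H, W) > 0.5  # pixels actually covered by the avatar

rgb = torch.zeros(H, W, 3)
inputs = torch.cat([feat[mask], view[mask]], dim=-1)  # visible pixels only
rgb[mask] = mlp(inputs)  # decoding cost scales with covered pixels
```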
#VR #3D #avatars
Velocity Skinning for Real-time Stylized Skeletal Animation
* project page
* web demo
* pdf

Velocity Skinning is a simple technique for adding exaggerated deformation, triggered by skeletal velocity, on top of standard skinning animation (a minimal sketch follows the list below).
The technique:
- Runs in real time and can be implemented as a single-pass vertex shader
- Works out of the box on existing skinning data by reusing skinning weights
- Allows non-linear-time editing from instantaneous pose and velocity information
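A minimal NumPy sketch of my reading of the core idea (not the authors' shader; the linear drag term and the gain `k` are my simplifications):
```python
# Velocity skinning, minimally: after linear blend skinning, offset each
# vertex against its blended bone velocity to fake a floppy drag effect.
import numpy as np

def velocity_skin(v_lbs, weights, bone_vel, k=0.05):
    """
    v_lbs:    (V, 3) vertex positions after standard skinning
    weights:  (V, B) existing skinning weights, reused as-is
    bone_vel: (B, 3) linear velocity of each bone
    k:        exaggeration gain
    """
    v_vel = weights @ bone_vel  # per-vertex blended velocity
    return v_lbs - k * v_vel    # drag opposite to the motion

verts = np.random.rand(4, 3)
weights = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]])
bone_vel = np.array([[0.0, 2.0, 0.0], [1.0, 0.0, 0.0]])
print(velocity_skin(verts, weights, bone_vel))
```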

#physics #simulation #skinning #shader
https://developer.nvidia.com/cuda-python
Available with Anaconda and Numba
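For a quick taste, here is a minimal Numba CUDA kernel (an illustrative vector add; assumes a CUDA-capable GPU and an installed numba package):
```python
# Minimal Numba CUDA example: an elementwise vector add.
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)   # absolute thread index
    if i < x.size:     # guard threads past the end of the array
        out[i] = x[i] + y[i]

n = 1 << 20
x = np.ones(n, dtype=np.float32)
y = np.full(n, 2.0, dtype=np.float32)
out = np.empty_like(x)

threads = 256
blocks = (n + threads - 1) // threads
add_kernel[blocks, threads](x, y, out)  # arrays are transferred automatically
print(out[:4])  # [3. 3. 3. 3.]
```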
#tools
The MetaHuman Creator is a web-based app for creating ultra-realistic virtual humans and exporting them to UE4.
It's now open for early access.
Sign up here:
https://www.unrealengine.com/metahuman-creator
- UPD:
50 preset MetaHuman characters are already available inside Bridge, ready for both Unreal Engine and Maya.

source
Forwarded from PHYGITAL+CREATIVE
Another ML tool for animation: Cascadeur announced a new version yesterday, two years after the last one!

It's a toolkit for animating 3D characters, and its intuitive design helps you get up to speed quickly even if you have never worked with similar software.

Physics-based and deep-learning tools simplify the work: for example, adjusting timings and trajectories and setting the most natural-looking poses. That way you can avoid unnecessary routine and focus entirely on the creative side.

#ml
DECOR-GAN: 3D Shape Detailization by Conditional Refinement
[Simon Fraser University, Adobe Research, IIT Bombay]
* pdf
* github
* video-demo
* seminar
We introduce a deep generative network for 3D shape detailization. Given a low-resolution coarse voxel shape, our network refines it, via voxel upsampling, into a higher-resolution shape enriched with geometric details. The output shape preserves the overall structure (or content) of the input, while its detail generation is conditioned on an input "style code" corresponding to a detailed exemplar. Our 3D detailization via conditional refinement is realized by a generative adversarial network, coined DECOR-GAN. The network utilizes a 3D CNN generator for upsampling coarse voxels and a 3D PatchGAN discriminator to enforce local patches of the generated model to be similar to those in the training detailed shapes. During testing, a style code is fed into the generator to condition the refinement.
#gan #3D
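A hypothetical PyTorch sketch of the DECOR-GAN conditioning setup described above (the layer sizes and the broadcast-and-concatenate style injection are my guesses, not the authors' code):
```python
# Hypothetical sketch: a 3D CNN upsamples coarse voxels, conditioned on a
# style code broadcast over the grid. Illustration only.
import torch
import torch.nn as nn

class TinyDetailizer(nn.Module):
    def __init__(self, style_dim=8, ch=32):
        super().__init__()
        self.inp = nn.Conv3d(1 + style_dim, ch, 3, padding=1)
        self.up = nn.Sequential(
            nn.ConvTranspose3d(ch, ch, 4, stride=2, padding=1),  # 2x upsample
            nn.ReLU(),
            nn.ConvTranspose3d(ch, ch, 4, stride=2, padding=1),  # 4x total
            nn.ReLU(),
            nn.Conv3d(ch, 1, 3, padding=1),
        )

    def forward(self, coarse, style):
        # Broadcast the style code over the voxel grid and concatenate.
        B, _, D, H, W = coarse.shape
        s = style.view(B, -1, 1, 1, 1).expand(-1, -1, D, H, W)
        x = torch.relu(self.inp(torch.cat([coarse, s], dim=1)))
        return torch.sigmoid(self.up(x))  # refined occupancy grid

coarse = torch.rand(1, 1, 16, 16, 16)   # low-res occupancy grid
style = torch.randn(1, 8)               # "style code" from a detailed exemplar
fine = TinyDetailizer()(coarse, style)  # shape: (1, 1, 64, 64, 64)
```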
GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds
[Nvidia, Cornell University]

* project page
* youtube

- GANcraft is a powerful tool for converting semantic block worlds to photorealistic worlds without the need for ground truth data.
- Existing methods perform poorly on the task due to the lack of viewpoint consistency and photorealism.
- GANcraft performs well in this challenging world-to-world setting where the ground truth is unavailable and the distribution mismatch between a Minecraft world and internet photos is significant.
- We introduce a new training scheme which uses pseudo-ground truth. This improves the quality of the results significantly.
- We introduce a hybrid neural rendering pipeline which is able to represent large and complex scenes efficiently.
- We are able to control the appearance of the GANcraft results by using style-conditioning images.
#gan #spade #neural_rendering
https://digitalhumans.com/sophie/
Here you can try to talk to Sophie.
I think it's very funny, even though her voice is too robotic. But it gives the feeling that if this issue were solved, it could turn into a very nice experience.
Also, they have Einstein.
https://tinytools.directory/
🦊
Really cool collection of tools for visual interactive projects
- author
#tools
Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
[MIT, ChipBrain, Amazon]
- pdf
- cleanlab (for label cleaning)
- 10 common benchmark datasets were investigated: ImageNet, CIFAR-10, CIFAR-100, Caltech-256, Quickdraw, MNIST, Amazon Reviews, IMDB, 20 News Groups, AudioSet

We identify label errors in the test sets of 10 common benchmark datasets.
Label errors are identified using confident learning algorithms and then human-validated via crowdsourcing.
Errors in test sets are numerous and widespread: we estimate an average of 3.4% errors across the 10 datasets, where for example 2916 label errors comprise 6% of the ImageNet validation set.
Surprisingly, we find that lower capacity models may be practically more useful than higher capacity models in real-world datasets with high proportions of erroneously labeled data. For example, on ImageNet with corrected labels: ResNet-18 outperforms ResNet-50.
On CIFAR-10 with corrected labels: VGG-11 outperforms VGG-19.
- related article
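For reference, a minimal sketch of applying confident learning with cleanlab (2.x API; the toy data and model here are mine):
```python
# Flag likely label errors with cleanlab's confident-learning filter.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.filter import find_label_issues

X, labels = make_classification(n_samples=2000, n_classes=3,
                                n_informative=5, random_state=0)
rng = np.random.default_rng(0)
flip = rng.choice(len(labels), size=100, replace=False)
labels[flip] = (labels[flip] + 1) % 3  # inject some label noise

# Out-of-sample predicted probabilities via cross-validation.
pred_probs = cross_val_predict(LogisticRegression(max_iter=1000),
                               X, labels, cv=5, method="predict_proba")

# Indices whose given label disagrees with the model's confident predictions.
issue_idx = find_label_issues(labels=labels, pred_probs=pred_probs,
                              return_indices_ranked_by="self_confidence")
print(f"{len(issue_idx)} suspect labels; worst first: {issue_idx[:10]}")
```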
So, I got early access to MetaHuman Creator and just want to share a few of my impressions.
First of all, this thing seems to work like GeForce Now: it does all the expensive computation on their side and streams a live interactive video to your browser. That's why the maximum session length is one hour.
Currently MH lets you make a character by morphing toward presets and moving face-rig parts manually.
It lets you make renders at different fidelity levels and preview LODs, but I didn't find any export option.
Many people were curious about its ability to represent particular personalities.
I made some attempts, starting from myself ofc 😊. There's definitely something similar at the end, but it feels like something very important is missing. I was struggling to find the right shapes for the eyes, jawline, and eyebrows. And yes, I'm not a 3D artist, so it's quite easy for me to miss something. So, here is a review by one 3D artist.
So, in its current state MH lets you quickly make high-fidelity NPCs with quite a significant variety of options, but it is not easy to recreate a specific person in it.
UPD:
another artist's review
Few-shot Image Generation via Cross-domain Correspondence
[Adobe Research, UC Davis, UC Berkeley]
* project page
* pdf
* code
Training generative models, such as GANs, on a target domain containing limited examples (e.g., 10) can easily result in overfitting. In this work, we seek to utilize a large source domain for pretraining and transfer the diversity information from source to target. We propose to preserve the relative similarities and differences between instances in the source via a novel cross-domain distance consistency loss. To further reduce overfitting, we present an anchor-based strategy to encourage different levels of realism over different regions in the latent space. With extensive results in both photorealistic and non-photorealistic domains, we demonstrate qualitatively and quantitatively that our few-shot model automatically discovers correspondences between source and target domains and generates more diverse and realistic images than previous methods.
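A hypothetical sketch of the cross-domain distance-consistency idea as I read it (the feature extraction and loss details here are simplified):
```python
# Hypothetical distance-consistency loss: the pairwise-similarity structure
# of a batch should match between the frozen source generator and the
# adapted target generator. Simplified, not the authors' implementation.
import torch
import torch.nn.functional as F

def dist_consistency(feat_src, feat_tgt):
    """feat_*: (N, D) features of the same N latents through each generator."""
    def sim_logprobs(f):
        s = F.cosine_similarity(f.unsqueeze(1), f.unsqueeze(0), dim=-1)  # (N, N)
        off_diag = ~torch.eye(len(f), dtype=torch.bool)  # drop self-similarity
        return F.log_softmax(s[off_diag].view(len(f), -1), dim=-1)
    # KL between target and source similarity distributions, row by row.
    return F.kl_div(sim_logprobs(feat_tgt), sim_logprobs(feat_src).exp(),
                    reduction="batchmean")

feat_src = torch.randn(8, 128)
feat_tgt = feat_src + 0.1 * torch.randn(8, 128)
print(dist_consistency(feat_src, feat_tgt))  # small when structure matches
```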
#gan #one_shot_learning