echoinside
https://stephaneginier.com/sculptgl Sculpting in the browser #web #tools
Emerging Properties in Self-Supervised Vision Transformers (DINO)
[Facebook AI Research, Inria, Sorbonne University]
(probably everyone already noticed)
* pdf, abs
* github
* blog
* YC review
tl;dr: DINO — self-distillation with no labels.
Non-contrastive SSL which looks like BYOL with some tweaks: local & global image crops are sent to the student network, while only global crops are sent to the teacher (the multi-crop trick); cross-entropy loss between teacher & student representations. Tested with ResNets and Vision Transformers, achieving better results with the latter. Works extremely well as a feature extractor for kNN and simple linear models. It is also shown that ViTs extract impressive class-specific segmentation maps in this unsupervised setting, which look much better than in the supervised setting.
Also, batch size matters, and multi-crop matters.
For example, performance is 72.5% after 46 hours of training without multi-crop (i.e. 2×224^2), while DINO in the 2×224^2 + 10×96^2 crop setting reaches 74.6% in only 24 hours. Memory usage is 15.4 GB vs 9.3 GB. The best result of 76.1% top-1 accuracy is achieved using 16 GPUs for 3 days, and that is still a better computational budget than in previous unsupervised works. It is possible to use this method with small batches without multi-crop, but this setup wasn't studied well:
Results with the smaller batch sizes (bs = 128) are slightly below our default training setup of bs = 1024 and would certainly require re-tuning hyperparameters, such as the momentum rates. We have explored training a model with a batch size of 8, reaching 35.2% after 50 epochs, showing the potential for training large models that barely fit an image per GPU.
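The core recipe (cross-entropy between a sharpened, centered teacher and the student, plus an EMA teacher) is compact enough to sketch. A minimal PyTorch sketch, with assumed names and temperatures and random logits standing in for network outputs; the real implementation also sums the loss over all student/teacher crop pairs:

```python
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center, tau_s=0.1, tau_t=0.04):
    # cross-entropy between output distributions: the teacher is centered
    # and sharpened (low temperature) and detached, which is how DINO
    # avoids collapse without negative pairs
    t = F.softmax((teacher_out - center) / tau_t, dim=-1).detach()
    s = F.log_softmax(student_out / tau_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

@torch.no_grad()
def ema_update(teacher_params, student_params, m=0.996):
    # teacher weights are an exponential moving average of the student's
    for pt, ps in zip(teacher_params, student_params):
        pt.mul_(m).add_(ps, alpha=1 - m)

# toy usage with random "logits" standing in for crop embeddings
student_out = torch.randn(4, 16)   # crops passed through the student
teacher_out = torch.randn(4, 16)   # global crops through the teacher
loss = dino_loss(student_out, teacher_out, center=torch.zeros(16))
```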
See images in comments.
#self_supervised
YouTube
DINO: Emerging Properties in Self-Supervised Vision Transformers (Facebook AI Research Explained)
#dino #facebook #selfsupervised
Self-Supervised Learning is the final frontier in Representation Learning: Getting useful features without any labels. Facebook AI's new system, DINO, combines advances in Self-Supervised Learning for Computer Vision with…
Vector Neurons: A General Framework for SO(3)-Equivariant Networks
[Stanford University, NVIDIA Research, Google Research, University of Toronto]
* pdf, abs
* github
* project page
Invariance and equivariance to the rotation group have been widely discussed in the 3D deep learning community for pointclouds. We introduce a general framework built on top of what we call Vector Neuron representations for creating SO(3)-equivariant neural networks for pointcloud processing. Extending neurons from 1D scalars to 3D vectors, our vector neurons enable a simple mapping of SO(3) actions to latent spaces thereby providing a framework for building equivariance in common neural operations -- including linear layers, non-linearities, pooling, and normalizations. Due to their simplicity, vector neurons are versatile and, as we demonstrate, can be incorporated into diverse network architecture backbones.
We also show for the first time a rotation equivariant reconstruction network.
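The key trick is easy to demonstrate: a vector-neuron linear layer mixes only the channel dimension and leaves the 3D vector dimension untouched, so it commutes with any rotation. A small numpy sketch (shapes and names are illustrative, not the authors' API):

```python
import numpy as np

def vn_linear(X, W):
    # X: (C_in, 3) vector-valued features, W: (C_out, C_in);
    # the layer acts only on channels, never on the 3D coordinates
    return W @ X

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
W = rng.normal(size=(4, 8))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix

# equivariance: rotating the input rotates the output the same way
lhs = vn_linear(X @ R, W)
rhs = vn_linear(X, W) @ R
assert np.allclose(lhs, rhs)
```

The same argument carries over to the paper's vector non-linearities and poolings, which are built to preserve this commutation property.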
#3d #point_clouds #implicit_geometry
* youtube
* paper
A work that is pleasing in its elegant simplicity and effectiveness. What was done: a set of human scans was registered to the SMPL parametric model, yielding a dataset of {a mesh with an RGB value for each vertex, a set of photos with different known parameters…
The code is now available: https://github.com/sergeyprokudin/smplpix
And it is also possible to use SMPLpix renderer with DECA to make only face renders.
#smpl #neural_rendering
GitHub
GitHub - sergeyprokudin/smplpix: SMPLpix: Neural Avatars from 3D Human Models
Project Starline: Feel like you're there, together
- blogpost
Very impressive press release from Google that pushes the boundaries of telepresence.
Imagine looking through a sort of magic window, and through that window, you see another person, life-size and in three dimensions. You can talk naturally, gesture and make eye contact.
There's not much information about the technical part yet. The project has been under development for a few years, and some trial deployments are planned this year.
To make this experience possible, we are applying research in computer vision, machine learning, spatial audio and real-time compression. We've also developed a breakthrough light field display system that creates a sense of volume and depth that can be experienced without the need for additional glasses or headsets.
#sdf #tools #implicit_geometry The author of the video, @ephtracy, is sharing progress on Twitter on his own 3D editor that works with signed distance fields. In Blender, for example, you can model hard-surface objects with metaballs, which gives a similar…
Small release for win64
https://ephtracy.github.io/index.html?page=magicacsg
review: https://youtu.be/rgwNsNCpbhg?t=208
#sdf #tools
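For context on what "working with signed distance fields" buys you: boolean (CSG) modeling operations reduce to simple min/max over distance functions. A hedged toy sketch of the idea, not MagicaCSG's actual implementation:

```python
import numpy as np

def sphere_sdf(p, center, radius):
    # signed distance: negative inside the sphere, positive outside
    return np.linalg.norm(p - center, axis=-1) - radius

def csg_union(a, b):
    return np.minimum(a, b)

def csg_subtract(a, b):
    return np.maximum(a, -b)

p = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
d = csg_union(sphere_sdf(p, np.zeros(3), 1.0),
              sphere_sdf(p, np.array([2.0, 0.0, 0.0]), 0.5))
# d[0] == -1.0 (inside the big sphere), d[1] == -0.5 (inside the small one)
```

Smooth-blended versions of these operators are what give SDF editors their characteristic organic look.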
ephtracy.github.io
MagicaVoxel
MagicaVoxel Official Website
I was quite lazy with the channel, but I'm coming back! 👻✨
Look at this impressive idea — non-adversarial domain adaptation of pretrained generator using CLIP loss.
We have already seen examples of image stylization driven by a CLIP loss and a text description, as well as image generation with a CLIP loss within some domain. But in this work, the generator itself is changed to produce images from a mixed domain. No data from this new domain is required.
* github
* colab
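A plausible core of such a method is a "directional" CLIP loss: align the CLIP-space direction from source images to adapted images with the direction from the source text prompt to the target one. The sketch below is an assumption about the formulation, with random tensors standing in for frozen CLIP embeddings:

```python
import torch
import torch.nn.functional as F

def directional_clip_loss(e_gen, e_src, e_txt_target, e_txt_source):
    # align the image-space edit direction with the text-space direction;
    # all inputs are assumed to be CLIP embeddings (here: random stand-ins)
    d_img = F.normalize(e_gen - e_src, dim=-1)
    d_txt = F.normalize(e_txt_target - e_txt_source, dim=-1)
    return 1 - (d_img * d_txt).sum(dim=-1).mean()

torch.manual_seed(0)
e_gen, e_src = torch.randn(4, 512), torch.randn(4, 512)   # adapted vs frozen generator outputs
e_txt_tgt, e_txt_src = torch.randn(512), torch.randn(512) # e.g. "sketch" vs "photo"
loss = directional_clip_loss(e_gen, e_src, e_txt_tgt, e_txt_src)
```

Only the adapted generator's weights would receive gradients; the source generator and CLIP stay frozen.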
OpenCV with Roboflow launched Modelplace, a marketplace for AI models.
I wonder why nobody did this before. There's also AWS Marketplace, but it doesn't look convenient for creators.
If I missed some similar marketplace, please let me know in comments. ☺️
Modelplace doesn't look polished enough yet — you need to contact them to publish a model, and there aren't many models available. But maybe we will have something like Blender Market for ML creators at some point. The direction and proposed features of Modelplace look quite promising.
https://modelplace.ai/models
CLIPDraw
There's one known work on optimizing curves to match some classifier's predictions, and there are many other works aiming at imitating artists' strokes. But now this idea has been extended to optimization against CLIP prompts. The style looks completely different, but it also depends on your prompt. For example, it looks like adding "watercolor painting" to your prompt improves the final visual look (in my opinion). There's also an option to exclude some keywords if you see something not great in your final image 🐸.
I think it's quite fun to play with. And it's a good base for making your own optimization script. It's so nice when you can optimize something in low resolution and then just render it in higher resolution without quality loss.
Here is the result of "Watercolor painting of sunflower in van Gogh style". I also like this finger and signature. 🌻
- paper
- colab
- twitter
#art
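The optimization loop itself is simple: drawing parameters (e.g. Bézier control points) are updated by gradient descent through a differentiable objective. A toy sketch where a plain L2 target stands in for the real pipeline (a differentiable vector renderer plus CLIP similarity to the prompt):

```python
import torch

def quad_bezier(ctrl, t):
    # quadratic Bezier curve: (1-t)^2 P0 + 2 t (1-t) P1 + t^2 P2
    p0, p1, p2 = ctrl
    return (((1 - t) ** 2)[:, None] * p0
            + (2 * t * (1 - t))[:, None] * p1
            + (t ** 2)[:, None] * p2)

torch.manual_seed(0)
ctrl = torch.randn(3, 2, requires_grad=True)  # control points of one stroke
target = torch.tensor([1.0, 1.0])             # stand-in for the "prompt"
opt = torch.optim.Adam([ctrl], lr=0.1)
t = torch.linspace(0, 1, 32)

losses = []
for _ in range(200):
    loss = ((quad_bezier(ctrl, t) - target) ** 2).mean()  # stand-in for CLIP loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

Because the result is a vector object, it can be re-rendered at any resolution after optimization, which is exactly the "optimize in low res, render in high res" property mentioned above.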
#promo
For Russian speakers
———
A 3-day intensive on modular synthesis in VCV Rack, a free and open-source modular synthesizer emulator. We will cover the basics of synthesis and learn to use modular synths to make music from pop to noise.
WHEN:
August 23-29 (dates to be confirmed)
PROGRAM
• General principles of synthesis
• Types of synthesis
• Logic elements and interfaces
• Practice
FORMAT
Online + possibly offline in Moscow
TO PARTICIPATE
• Needed: a computer with VCV Rack installed, minimal experience working with sound
• Not needed: a music school diploma or a black belt in Ableton
PRICE
1200 RUB. Payment via @selfoscillation_bot
All proceeds go to help the promo group @LESXXV, which is out of pocket after equipment was stolen at a festival
COURSE CHAT, INFO
https://news.1rj.ru/str/joinchat/TDfLE5JMTLgwYzgy
ABOUT THE LECTURER
@ferluht, member of vk.com/ed9m_8, AI researcher and modular synthesis devotee
Examples of tracks made entirely in VCV Rack:
https://open.spotify.com/album/0z2sluqa7HFYNipwYAugEX
https://youtu.be/Y2RphGohREE
https://vk.com/video-97759962_456239065
If you’re not an attendee but you want to be, Unity have teamed up with SIGGRAPH to offer free Basic-tier access to everyone. Use code UNITY21. More details:
https://unity.com/event/siggraph-2021
Unity
Unity Real-Time Development Platform | 3D, 2D, VR & AR Engine
Create and grow real-time 3D games, apps, and experiences for entertainment, film, automotive, architecture, and more. Get started with Unity today.
Finally, somebody merged NeRF and CLIP ☺️🔥
The author is one of the creators of Diet NeRF.
- original twitter thread
- some other attempts to shape point clouds with CLIP
Text to 3D implemented with CLIP + NeRF (an evolution of DietNeRF!)
Prompt: "a 3d render of a jenga tower in unreal engine"

Large Steps in Inverse Rendering of Geometry 😱😱😱
[EPFL, SIGGRAPH Asia 2021]
- twitter thread
- project page
- pdf
Inverse reconstruction from images to meshes just made a huge step forward.
No code yet, and the example inputs in the paper are still far from realistic captured images. But it still looks very cool.
We propose a simple and practical alternative that casts differentiable rendering into the framework of preconditioned gradient descent. Our preconditioner biases gradient steps towards smooth solutions without requiring the final solution to be smooth. In contrast to Jacobi-style iteration, each gradient step propagates information among all variables, enabling convergence using fewer and larger steps.
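The preconditioning idea can be illustrated on a toy problem: instead of stepping along the raw gradient g, step along d = (I + λL)^{-1} g for a Laplacian L, which diffuses a gradient concentrated at one vertex to all of them. A numpy sketch on a 1D path graph (illustrative only; the paper works with mesh Laplacians and a more efficient solver):

```python
import numpy as np

# path-graph Laplacian on n vertices (stand-in for a mesh Laplacian)
n, lam = 6, 10.0
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0

g = np.zeros(n)
g[2] = 1.0                                    # gradient at a single vertex
d = np.linalg.solve(np.eye(n) + lam * L, g)   # preconditioned step direction
# d still peaks at vertex 2 but is nonzero everywhere: one step moves all
# vertices, biasing the descent towards smooth solutions
```

A plain gradient step would move only vertex 2, which is why naive inverse rendering of meshes tends to crumple.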
Our method is not restricted to meshes and can also accelerate the reconstruction of other representations, where smooth solutions are generally expected. We demonstrate its superior performance in the context of geometric optimization and texture reconstruction.

Forwarded from Gradient Dude
🔥StyleGAN3 by NVIDIA!
Do you remember the awesome smooth results by Alias-Free GAN I wrote about earlier? The authors have finally posted the code and now you can build your amazing projects on it.
I don't know about you, but my hands are already itching to try it out.
🛠 Source code
🌀 Project page
⚙️ Colab
Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
[Nvidia]
- code
- project page
- pdf
- twitter
We demonstrate near-instant training of neural graphics primitives on a single GPU for multiple tasks. In the gigapixel image task, we represent an image by a neural network. The SDF task learns a signed distance function in 3D space whose zero level set represents a 2D surface. NeRF [Mildenhall et al. 2020] uses 2D images and their camera poses to reconstruct a volumetric radiance-and-density field that is visualized using ray marching. Lastly, the neural volume task learns a denoised radiance and density field directly from a volumetric path tracer. In all tasks, our encoding and its efficient implementation provide clear benefits: instant training, high quality, and simplicity. Our encoding is task-agnostic: we use the same implementation and hyperparameters across all tasks and only vary the hash table size, which trades off quality and performance.
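The centerpiece, the multiresolution hash encoding, is easy to sketch at a single level: integer grid coordinates are hashed into a fixed-size table of trainable feature vectors. The primes below are the ones used in the paper's spatial hash; the other names and sizes are illustrative, and the real encoding trilinearly interpolates the 8 surrounding corners and concatenates many resolution levels:

```python
import numpy as np

# per-dimension primes of the spatial hash from the paper
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def spatial_hash(coords, table_size):
    # coords: (N, 3) non-negative integer grid coordinates
    h = np.zeros(len(coords), dtype=np.uint64)
    for dim in range(3):
        h ^= coords[:, dim].astype(np.uint64) * PRIMES[dim]
    return (h % np.uint64(table_size)).astype(np.int64)

table_size, feat_dim, resolution = 2 ** 14, 2, 64
table = np.random.default_rng(0).normal(size=(table_size, feat_dim))  # trainable
x = np.random.default_rng(1).uniform(size=(5, 3))        # points in [0, 1]^3
idx = spatial_hash(np.floor(x * resolution).astype(np.int64), table_size)
features = table[idx]   # (5, feat_dim) features fed to a tiny MLP
```

Hash collisions are simply left to the training process to resolve, which is part of why the method is so fast and simple.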