Graph Machine Learning for Visual Computing Tutorial @ CVPR 2022
The biggest CV conference features a dedicated tutorial on using Graph ML in Computer Vision tasks, e.g., video understanding, scene graphs in 3D vision, and more!
💫 The lineup of speakers is stellar: Petar Veličković (DeepMind), Matthias Fey (KumoAI / TU Dortmund), Bernard Ghanem (KAUST), Federico Tombari (Google), Fabian Manhardt (Google), Judith Fan (UCSD), Luca Carlone (MIT), and Rajat Talak (MIT).
Tune in next Monday, June 20th, 1:00pm - 5:30pm Central Daylight Time - there will be an open Zoom link 🖥️
gml4vc.github.io
CVPR Tutorial on Graph Machine Learning for Visual Computing
Graph Machine Learning at Airbnb
Devin Soni from the Airbnb engineering team wrote a Medium post on scaling GNNs to industrial-scale graphs. In summary, they use the frameworks of SIGN (Scalable Inception Graph Neural Networks) and SGC (Simplified GCN): SIGN pre-computes powers of the adjacency matrix before optimization, whereas SGC collapses the per-layer weights and non-linearities into a single feature propagation step. Conceptually, the approach should be quite fast and scalable, although the post does not report experimental numbers. Still, it is great to see recent advances in scaling GNNs reach industrial use-cases!
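To make the pre-computation idea concrete, here is a minimal sketch of SIGN-style training (our own illustration, not Airbnb's code; all names are made up): the graph is used once, offline, to diffuse features over several hops, and the downstream model is a plain MLP that never touches the adjacency matrix during training.

```python
import torch
import torch.nn as nn

def sign_precompute(adj_norm: torch.Tensor, x: torch.Tensor, num_hops: int) -> torch.Tensor:
    """adj_norm: [N, N] normalized adjacency (dense or sparse), x: [N, F] node features."""
    feats = [x]
    for _ in range(num_hops):
        x = adj_norm @ x                 # one more hop of feature diffusion
        feats.append(x)
    return torch.cat(feats, dim=-1)      # [N, F * (num_hops + 1)], computed once offline

class SignMLP(nn.Module):
    """Downstream model: no message passing at training time, just an MLP."""
    def __init__(self, in_dim: int, hidden: int, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, precomputed_feats: torch.Tensor) -> torch.Tensor:
        return self.net(precomputed_feats)
```

Because the expensive graph diffusion happens once as a pre-processing step, training reduces to mini-batched MLP training - which is what makes the recipe attractive at industrial scale.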
Medium
Graph Machine Learning at Airbnb
How Airbnb is leveraging graph neural networks to up-level our machine learning
Graph Neural Networks Are The Next Big Thing
says Swami Sivasubramanian, VP of Data and ML Services at AWS, in his re:MARS 2022 keynote. Couldn't agree more! Watch the keynote to learn more about how AWS accelerates graph learning tasks.
Monday Special: Learnable Neural Priority Queues in GNNs
Message passing propagates messages to all nodes in a neighborhood. Messages might be weighted with a fixed normalization constant (as in GCNs), with a learnable scalar (GAT), or with a composition function over node and edge features (MPNN). Still, you send messages to all neighboring nodes. Some neighbor samplers (like the one in classic GraphSAGE) allow subsampling K nodes in a neighborhood to send messages to, but all K subsampled nodes still receive the message.
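For reference, these weighting schemes differ only in how the coefficient on each incoming message is formed (standard textbook formulations, not specific to any of the papers below):

```latex
% GCN: fixed symmetric normalization by node degrees
h_i' = \sigma\Big( \sum_{j \in \mathcal{N}(i)} \tfrac{1}{\sqrt{d_i d_j}} \, W h_j \Big)

% GAT: a learnable attention scalar \alpha_{ij} per edge
h_i' = \sigma\Big( \sum_{j \in \mathcal{N}(i)} \alpha_{ij} \, W h_j \Big)

% MPNN: learned message M and update U functions over node and edge features
h_i' = U\Big( h_i, \sum_{j \in \mathcal{N}(i)} M(h_i, h_j, e_{ij}) \Big)
```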
Is there a way to guide a GNN to send messages only to a fraction of the neighbors while retaining the same performance? This could also save a lot of computation, e.g., when propagating through hub nodes. How do we select those important neighbors, then?
Classical graph search algorithms like Dijkstra's and A* employ priority queues that essentially rank edges according to a certain heuristic: we take the top-ranked edge, add it to the path, and continue. Can we use something similar for GNNs?
Recently, a few fresh mid-2022 works proposed to learn priority queues explicitly or implicitly:
- Learning to Efficiently Propagate for Reasoning on Knowledge Graphs by Zhu et al. proposes A*Net with a neural priority function. Essentially, the edge index is constructed dynamically at each GNN layer, starting from the “root” node. The priority function takes the representations of the current node and edge features and produces a sigmoid distribution over neighboring nodes and edges, from which the top-K are selected, added to the edge index, and used for message passing (see the sketch after this list). The strategy brings 5-7x reductions in the number of computed messages and 2-5x reductions in GPU RAM (depending on the graph).
- Learning Adaptive Propagation for Knowledge Graph Reasoning by Zhang et al. proposes AdaProp, which also builds the edge index dynamically. At each layer, AdaProp still computes all messages in the neighborhood and then applies the differentiable Gumbel Top-K trick with the Straight-Through Estimator to select K edges. Those edges are added to the edge index for the next message passing layer. AdaProp does not save as many messages as A*Net but converges somewhat faster (and should be harder to train due to the noisy Gumbel Top-K + STE).
- Learning heuristics for A* by Numeroso et al. approaches the problem from the Neural Algorithmic Reasoning viewpoint with the encode-process-decode architecture. Instead of dynamically building the edge index, they attach an additional decoder that predicts heuristic values (alongside the standard node state predictor) and add a regularization term for possibly unbounded predicted heuristics. Here, all messages are still computed during training, but inference can be sped up based on the predicted heuristics.
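To make the selection step concrete, here is a rough, framework-agnostic PyTorch sketch of keeping only the top-K highest-priority edges before message passing. It is only loosely inspired by A*Net - the module name, the scoring MLP, and the global (rather than per-node) top-K are our own illustrative choices, not the authors' implementation:

```python
import torch
import torch.nn as nn

class TopKEdgeSelector(nn.Module):
    """Scores candidate edges with a learned priority and keeps the K best."""
    def __init__(self, node_dim: int, edge_dim: int, k: int):
        super().__init__()
        self.k = k
        self.score = nn.Sequential(nn.Linear(2 * node_dim + edge_dim, 64),
                                   nn.ReLU(), nn.Linear(64, 1))

    def forward(self, h, edge_index, edge_attr):
        src, dst = edge_index                                   # each of shape [E]
        logits = self.score(torch.cat([h[src], h[dst], edge_attr], dim=-1)).squeeze(-1)
        priority = torch.sigmoid(logits)                        # priority of each candidate edge
        keep = torch.topk(priority, min(self.k, priority.numel())).indices
        # Only the surviving edges take part in the next message passing layer.
        return edge_index[:, keep], edge_attr[keep], priority[keep]
```

Messages are then computed only for the surviving edges, which is where the compute and memory savings come from; the papers above differ mainly in how they make this discrete top-K step trainable (a sigmoid priority trained end-to-end vs. Gumbel Top-K with the Straight-Through Estimator).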
Check the papers for more details - definitely worth the time! Illustration: saved messages in the A*Net.
OpenFold & Open Molecular Software Foundation
News from the sister field where Geometric Deep Learning is the main workhorse: OpenFold is a new, non-profit AI research consortium fostering free and open-source tools for biology and drug discovery. OpenFold is founded by the lab of Mohammed AlQuraishi at Columbia University, Arzeda, Cyrus Biotechnology, Prescient Design, and Outpace Bio.
The consortium's first big release is, fittingly, OpenFold - (citing the authors) “a trainable reproduction” of AlphaFold 2 in PyTorch, with the aim of opening up all the training data and model weights.
The OpenFold consortium is designed to be “OpenAI in drug discovery”, let’s hope they will be a bit more open than OpenAI itself about their models and code 😉
openfold.io
OpenFold Consortium
EURO Meets NeurIPS 2022 Vehicle Routing Competition
“The EURO Meets NeurIPS 2022 Vehicle Routing Competition aims to bring together researchers from operations research (OR) and machine learning (ML) to address the vehicle routing problem with time windows (VRPTW) as well as a dynamic VRPTW.”
Recently, we have been observing a surge in applying GNNs to Combinatorial Optimization problems (like the Traveling Salesman Problem) - here is a top challenge combining combinatorial optimization and dynamic graphs. The data and problem description are already available.
Recap: Fields Medal & Graph Theory, Origins of Geometric Deep Learning
1. The Fields Medal is often considered “the Nobel Prize in mathematics”. This year, the International Mathematical Union (IMU) announced 4 awardees: brilliant mathematicians Hugo Duminil-Copin (Université de Genève and IHÉS), June Huh (Princeton), James Maynard (Oxford), and Maryna Viazovska (EPFL).
It is heartwarming for the channel's editors that June Huh's research has direct connections to graph theory - first, he proved the 40-years-unsolved Read's conjecture about chromatic polynomials, which count the ways to properly color a graph, then studied those polynomials even deeper and generalized the framework to matroids. Check out the wonderful Quanta Magazine article dedicated to June and his research.
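For a quick refresher on the objects involved (standard textbook facts, not taken from the Quanta article):

```latex
% Chromatic polynomial of a triangle K_3: k colors for the first vertex,
% k-1 for the second, k-2 for the third
P(K_3, k) = k(k-1)(k-2) = k^3 - 3k^2 + 2k

% Read's conjecture, proved by Huh in the stronger log-concave form:
% the absolute values a_i of the coefficients of P(G, k) satisfy
a_i^2 \ge a_{i-1} \, a_{i+1}
```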
2. Just in case you have all your browser tabs closed and are looking for something new to read - Michael Bronstein comes to the rescue and publishes a new blog post on the origins of Geometric Deep Learning. It is the first in a series of articles tracing the history of geometry from the Greeks to GNNs.
🔥 New Course: An Introduction to Group Equivariant Deep Learning
Erik Bekkers from the University of Amsterdam created a fantastic new course covering the most up-to-date flavor of GNNs, namely equivariant and group-equivariant GNNs. The course consists of 3 lectures: it starts with an introduction to group theory, gradually builds up to equivariance and steerable kernels, and covers tensor products and irreducible representations (hello, Wigner matrices). After the course, you won't be afraid of cryptic abbreviations like SO(3) or E(n)!
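As a one-line teaser of the course's central notion (this is the standard definition, not a snippet from the lectures): a map f is equivariant to a group G if transforming the input and then applying f gives the same result as applying f and then transforming the output,

```latex
f\big(\rho_{\mathrm{in}}(g)\, x\big) = \rho_{\mathrm{out}}(g)\, f(x) \qquad \forall g \in G,
```

where \rho_in and \rho_out are representations of G acting on the input and output spaces; invariance is the special case where \rho_out(g) is the identity.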
The course includes a YouTube playlist, slides, lecture notes, and Colab notebooks to play around with the real code.
If you got inspired by this topic, we highly recommend the upcoming course by Joey Bose (Mila and McGill) on Geometry and Generative Models, which goes even deeper - from manifolds (hyperbolic, spherical, product) to normalizing flows, ODEs, and denoising diffusion models.
uvagedl
UvA - An Introduction to Group Equivariant Deep Learning
Materials for the group equivariant deep learning course
Graph ML Workshops and Summer Schools 🇬🇧 🇨🇦 🇨🇭🇮🇹
This week is surprisingly well-packed with in-person meetings of the GraphML community featuring top speakers and lecturers. We expect all the materials to be recorded and made available online.
- London Geometry and ML Summer School (🇬🇧)
- Deep Exploration of non-Euclidean Data with Geometric and Topological Representation Learning (🇨🇦)
- Swiss Equivariant Machine Learning Workshop (🇨🇭)
Also, in 2 weeks there will be the Italian Summer School on Geometric DL (🇮🇹).
www.logml.ai
TensorFlow GNN
TensorFlow GNN (TF-GNN) is a new scalable library for Graph Neural Networks in TensorFlow. It is designed from the ground up to support the kind of rich heterogeneous graph data that occurs in real-life use-cases. Many production models at Google use TF-GNN, and it has recently been released as an open-source project. Google has also published a paper that describes the TF-GNN data model, its Keras modeling API, and related capabilities such as graph sampling, distributed training, and accelerator support. A new version was just pushed to GitHub.
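For a flavor of the heterogeneous data model, here is a rough sketch of building a tiny GraphTensor with two node sets and one edge set. It follows the public tensorflow_gnn intro docs as we understand them - treat the exact names and arguments as assumptions to verify against the current release:

```python
import tensorflow as tf
import tensorflow_gnn as tfgnn

# Two papers, three authors, and four "writes" edges from authors to papers.
graph = tfgnn.GraphTensor.from_pieces(
    node_sets={
        "paper": tfgnn.NodeSet.from_fields(
            sizes=tf.constant([2]),
            features={"hidden_state": tf.random.normal([2, 16])}),
        "author": tfgnn.NodeSet.from_fields(
            sizes=tf.constant([3]),
            features={"hidden_state": tf.random.normal([3, 16])}),
    },
    edge_sets={
        "writes": tfgnn.EdgeSet.from_fields(
            sizes=tf.constant([4]),
            adjacency=tfgnn.Adjacency.from_indices(
                source=("author", tf.constant([0, 1, 1, 2])),
                target=("paper", tf.constant([0, 0, 1, 1])))),
    },
)
print(graph.node_sets["paper"]["hidden_state"].shape)  # (2, 16)
```

Keras layers from tfgnn.keras can then consume such GraphTensors for message passing, and the sampling tooling produces them in exactly this format for mini-batch training.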
GitHub
GitHub - tensorflow/gnn: TensorFlow GNN is a library to build Graph Neural Networks on the TensorFlow platform.
TensorFlow GNN is a library to build Graph Neural Networks on the TensorFlow platform. - tensorflow/gnn
ICML 2022 - Graph Workshops
ICML starts today with a full week of tutorials, main talks, and workshops. While we are preparing a blog post about interesting graph papers, you can already check the contents of the graph-related workshops to be held on Friday and Saturday.
- Topology, Algebra, and Geometry in Machine Learning (TAG in ML)
- Knowledge Retrieval and Language Models (KRLM)
- Beyond Bayes: Paths Towards Universal Reasoning Systems
- Machine Learning in Computational Design
Origins of Geometric Deep Learning - Part 2 and 3
A while ago, we referenced the first article of Michael Bronstein's series on the Origins of Geometric DL. Recently, the series got new episodes: Part 2 focuses on the high hopes for the perceptron, the curse of dimensionality, and the first AI winters. Part 3 introduces the first architectures with baked-in geometric priors - the neocognitron (a precursor of convnets) and convolutional neural networks.
As always, Michael did a great and meticulous job of finding original references and adding some comments to them - often the references section is as interesting and informative as the main text! 🍿
Medium
Towards Geometric Deep Learning II: The Perceptron Affair
In the second post of our series “Towards Geometric Deep Learning” we discuss how the criticism of Perceptrons led to geometric insights
ESMFold: Protein Language Models Solve Folding, Too
Today, the Meta AI Protein Team announced ESMFold - a protein folding model that uses representations taken straight from a protein LM. Meta AI has been working on BERT-style protein language models for a while, e.g., they created the family of ESM models that are currently SOTA in masked protein sequence prediction tasks.
“A key difference between ESMFold and AlphaFold2 is the use of language model representations to remove the need for explicit homologous sequences (in the form of an MSA) as input.”
To this end, the authors design a new family of protein LMs, ESM-2. ESM-2 models are much more parameter-efficient than ESM-1b, e.g., the 150M ESM-2 is on par with the 650M ESM-1b, and the 15B ESM-2 leaves all ESM-1 models far behind. Having pre-trained the LM, ESMFold applies Folding Trunk blocks (simplified Evoformer blocks from AlphaFold 2) on top of its representations and yields 3D structure predictions.
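If you want to poke at the language-model side yourself, the fair-esm package exposes pre-trained ESM-2 checkpoints. The sketch below follows the repository's README-style usage for extracting per-residue embeddings - the checkpoint name, layer index, and toy sequence are assumptions to double-check against the repo, and this is not the ESMFold pipeline itself:

```python
import torch
import esm  # pip install fair-esm

# Load a pre-trained ESM-2 checkpoint (650M parameters) and its tokenizer.
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("protein1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")]  # toy sequence
labels, strs, tokens = batch_converter(data)

# Per-residue representations from the final (33rd) transformer layer.
with torch.no_grad():
    out = model(tokens, repr_layers=[33])
residue_embeddings = out["representations"][33]  # [batch, seq_len, 1280]
```

In ESMFold, representations like these feed the folding trunk instead of the MSA-based features that AlphaFold 2 relies on.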
ESMFold outperforms AlphaFold and RoseTTAFold when the models are given only a single-sequence input without MSAs, and it is also much faster! Check out the attached illustration with the architecture and charts.
“On a single NVIDIA V100 GPU, ESMFold makes a prediction on a protein with 384 residues in 14.2 seconds, 6X faster than a single AlphaFold2 model. On shorter sequences we see a ~60X improvement. … ESMFold can be run reasonably quickly on CPU, and an Apple M1 Macbook Pro makes the same prediction in just over 5 minutes.”
Finally, ESMFold shows remarkable scaling properties:
“We see non-linear improvements in protein structure predictions as a function of model scale, and observe a strong link between how well the language model understands a sequence (as measured by perplexity) and the structure prediction that emerges.”
Are you already converted to the church of Scale Is All You Need - AGI Is Coming? 😉
Upcoming Graph Workshops
If you are finishing a project and would like to probe your work and get the first round of reviews, consider submitting to recently announced workshops:
- Federated Learning with Graph Data (FedGraph) @ CIKM 2022 - deadline August 15
- Trustworthy Learning on Graphs (TrustLOG) @ CIKM 2022 - deadline September 2
- New Frontiers in Graph Learning (GLFrontiers) @ NeurIPS 2022 - deadline September 15
- Symmetry and Geometry in Neural Representations (NeurReps) @ NeurIPS 2022 - deadline September 22
Google
FedGraph2022
Graph Machine Learning @ ICML 2022
In case you missed all the ICML’22 fun, we prepared a comprehensive overview of graph papers published at the conference: 35+ papers in 10 categories:
- Generation: Denoising Diffusion Is All You Need
- Graph Transformers
- Theory and Expressive GNNs
- Spectral GNNs
- Explainable GNNs
- Graph Augmentation: Beyond Edge Dropout
- Algorithmic Reasoning and Graph Algorithms
- Knowledge Graph Reasoning
- Computational Biology: Molecular Linking, Protein Binding, Property Prediction
- Cool Graph Applications
Medium
Graph Machine Learning @ ICML 2022
Recent advancements and hot trends, July 2022 edition
Towards Geometric Deep Learning IV: Chemical Precursors of GNNs
In the final post of the series, Michael Bronstein covers the role of chemistry and computational chemistry in developing the mathematical concepts that were later used to create GNNs. For instance, the problem patent offices faced when registering a new drug - comparing a new molecule with those in the existing database - was tackled first with string representations, then with molecular fingerprints, and finally with the WL test and its modern variants.
Medium
Towards Geometric Deep Learning IV: Chemical Precursors of GNNs
In the last post in the “Towards Geometric Deep Learning” series, we look at early prototypes of GNNs in the field of chemistry.
Geometric Deep Learning Course: 2022 Update
The go-to GDL course by Michael M. Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković has just been updated! There are new materials in the introduction and the graph transformers section, more on category theory (don't forget your vegetables 🥦), differential geometry, and topology, as well as a new set of invited speakers covering recent hot topics from subgraph GNNs to AlphaFold 2.
Geometricdeeplearning
GDL Course
Grids, Groups, Graphs, Geodesics, and Gauges
Geometric DL News: 200M proteins in AlphaFold DB, Euclidean nets, Italian GDL Summer School, Diffusers
This week brought us a bunch of news and new materials:
- DeepMind announced the expansion of the AlphaFold DB to 200 million protein structures. Celebrating the 1-year anniversary of the groundbreaking AlphaFold 2 release, DeepMind notes the huge success of the system among scientists all over the world - more than 500,000 researchers from 190 countries have accessed AlphaFold predictions - and sketches further plans to apply the outcomes in areas such as drug discovery, fusion, and climate change.
- Mario Geiger (MIT) and Tess Smidt (MIT) released an updated version of the write-up on e3nn - the most popular Python library for building Euclidean Neural Networks and the basis for many cool new works like Steerable GNNs and SE(3)-Transformers. The write-up includes simple intuitions behind spherical harmonics, tensor products, irreducible representations, and other key building blocks - if you work on equivariant architectures, you probably already do it with e3nn 😉 (see the small sketch after this list).
- 🇮🇹 The First Italian School on Geometric Deep Learning released all slides and Colab notebooks on equivariance, topology, differential geometry, and other topics covered by top speakers including Michael Bronstein, Cristian Bodnar, Maurice Weiler, Pim de Haan, and Francesco Di Giovanni.
- Following the hottest 2022 trend, HuggingFace 🤗 aims to tame the wilds of diffusion models and releases Diffusers 🧨 - a single library to build and train diffusion models of all modalities: image generation, text generation, and, of course, graph generation! A PR with GeoDiff, a SOTA molecule generation model from ICLR 2022, is already prepared 🚀
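As promised above, a small taste of e3nn's building blocks (a minimal sketch based on the library's public API; the specific irreps, shapes, and variable names are just an illustrative example):

```python
import torch
from e3nn import o3

# Irreducible representations: one scalar (0e) and one vector (1o) per node.
irreps_in = o3.Irreps("1x0e + 1x1o")
irreps_sh = o3.Irreps.spherical_harmonics(lmax=2)      # 0e + 1o + 2e
irreps_out = o3.Irreps("4x0e + 2x1o")

# Equivariant tensor product mixing node features with spherical harmonics of
# relative positions - the core operation behind many equivariant GNN layers.
tp = o3.FullyConnectedTensorProduct(irreps_in, irreps_sh, irreps_out)

x = irreps_in.randn(10, -1)                            # features for 10 nodes
rel_pos = torch.randn(10, 3)                           # e.g., edge vectors
sh = o3.spherical_harmonics(irreps_sh, rel_pos, normalize=True)
out = tp(x, sh)                                        # still equivariant
print(out.shape)                                       # [10, irreps_out.dim]
```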
Google DeepMind
AlphaFold reveals the structure of the protein universe
Today, in partnership with EMBL’s European Bioinformatics Institute (EMBL-EBI), we’re now releasing predicted structures for nearly all catalogued proteins known to science, which will expand the...
Sampling from Large Heterogeneous Graphs with TF-GNN
In this new blog post, Brandon Mayer and Bryan Perozzi go into detail on how to organize scalable neighborhood sampling over large heterogeneous graphs (with many node and edge types), using the OGB MAG dataset (2M nodes, 20M edges) as an example. Sampling is defined via Apache Beam configs and can fetch data right from the Google Cloud Platform through the Dataflow engine.
Recently, we covered the release of TensorFlow-GNN (TF-GNN), a new framework by Google to train GNNs on very large graphs that often do not fit into main memory. Today’s post is a more hands-on tutorial with particular code examples you could try yourself 🛠️.
Google Cloud Blog
Scalable Heterogeneous Graph Sampling with GCP and Dataflow For Graph Neural Networks. | Google Cloud Blog
KDD 2022
KDD 2022, one of the premier graph & data mining venues, will take place in Washington, DC in two weeks (Aug 14-18). As always, the published program of Research Track and Applied Data Science Track papers is full of graph papers, so check them out.
Furthermore, there will be a rich selection of workshops:
- International Workshop on Mining and Learning with Graphs (MLG) (co-located with DLG)
- Deep Learning on Graphs: Methods and Applications (DLG-KDD’22) (co-located with MLG)
- International Workshop on Knowledge Graphs: Open Knowledge Network
- International Workshop on Data Mining in Bioinformatics (BIOKDD 2022)
And even more tutorials:
- Trustworthy Graph Learning: Reliability, Explainability, and Privacy Protection (Tencent AI)
- Graph-based Representation Learning for Web-scale Recommender Systems (Twitter)
- Algorithmic Fairness on Graphs: Methods and Trends (U. Illinois at Urbana-Champaign)
- Toward Graph Minimally-Supervised Learning (Arizona State University)
- Accelerated GNN training with DGL and RAPIDS cuGraph in a Fraud Detection Workflow (NVIDIA)
- Graph Neural Networks: Foundation, Frontiers and Applications
- Temporal Graph Learning for Financial World: Algorithms, Scalability, Explainability & Fairness (MasterCard)
- Efficient Machine Learning on Large-Scale Graphs (TigerGraph)
- Frontiers of Graph Neural Networks with DIG (Texas A&M University)
- Graph Neural Networks in Life Sciences: Opportunities and Solutions (Amazon)
www.mlgworkshop.org
MLG 2022 - 17th International Workshop on Mining and Learning with Graphs
MLG 2022, 17th International Workshop on Mining and Learning with Graphs, co-located with KDD 2022, Washington, DC, USA
New Software and Library Updates
August is a notoriously quiet month without big news, but there is something new in graph software:
- Uni-Fold - a re-implementation of AlphaFold and AlphaFold-Multimer in PyTorch. The authors emphasize that this is the first open-source repo for training AlphaFold-Multimer and that their AlphaFold implementation can be trained 2x faster than the original.
- PyKEEN 1.9 features new tools for adding textual representations to KG embedding models and brings significant speedups for NodePiece on large graphs (5M nodes / 30M edges in 10 minutes on a laptop) thanks to the METIS partitioning algorithm and GPU-accelerated BFS (a minimal usage sketch follows after this list).
- GRAPE - a Rust/Python library for graph processing and embedding with many compbio datasets integrated.
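As mentioned above, here is a minimal PyKEEN sketch - the pipeline call is the library's standard entry point, but the specific dataset, model, and hyperparameters are illustrative assumptions rather than anything from the release notes:

```python
from pykeen.pipeline import pipeline

# Train a NodePiece-based link prediction model on a benchmark KG.
result = pipeline(
    dataset="FB15k237",                    # any built-in dataset or custom triples factory
    model="NodePiece",                     # tokenizes entities into anchors/relations
    training_kwargs=dict(num_epochs=20),
    random_seed=42,
)
result.save_to_directory("nodepiece_fb15k237")
print(result.metric_results.to_df().head())  # rank-based link prediction metrics
```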
GitHub
GitHub - dptech-corp/Uni-Fold: An open-source platform for developing protein models beyond AlphaFold.
An open-source platform for developing protein models beyond AlphaFold. - dptech-corp/Uni-Fold