Graph Machine Learning – Telegram
Everything about graph theory, computer science, machine learning, etc.


If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo.

Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Exploring Complexity: How Graph Data Science is pushing new boundaries: panel

A panel on graph data science is open for registration online. See the organizers' message below:

🔎 Exploring complexity is a challenge for all of us. The abundance of data still does not help us make better decisions; we need to unbundle it, understand the context, and find the latent relationships.

📣 On a mission to simplify this process, we at OpenCredo will be hosting a panel discussion with leaders in the Graph Data Science space to discuss the impact of graphs. Whether you are new to this space or want to listen to inspiring leaders of the community, this is a great opportunity!

- Dan McCreary, Distinguished engineer at Optum (he has great blog posts at [https://lnkd.in/eUFZMNh3])
- Paco Nathan, Evil Mad Scientists, a contributor to AI/ML/Graph space (some of his amazing work is at [https://derwen.ai/report])
- Alessandro Negro, Chief Scientist at GraphAware (recently published Graph Powered Machine Learning [https://lnkd.in/eQycHze2])

💫 Our CTO/CEO Nicki Watt will be hosting the panel, and we are excited to have a great insight into how Graph is pushing the boundaries.
MICCAI 2021 graph papers

Here is a guest post by Mahsa Ghorbani about applications of graphs to medical images.

A few weeks ago, the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) published the list of accepted papers and their reviews. About 20 of the papers are graph-related, which shows the impact of graph-based methods in medical applications. Here are two example papers:

- GKD: Semi-supervised Graph Knowledge Distillation for Graph-Independent Inference
Recently, graph neural networks have shown great success in analyzing multi-modal medical data (such as demographic and imaging data) associated with a group of patients for disease prediction. However, conventional methods perform poorly when the graph modality is not available at inference time. GKD proposes a novel semi-supervised method inspired by the knowledge distillation framework to address this issue. The teacher block distills all the available information in the training data and then transfers it to the student network, which is trained on the input features rather than the filtered ones. Therefore, the student works well on test data even without a graph between the test samples. GKD also uses a modified label-propagation algorithm in the teacher block to balance neighborhood features and node features.

- Early Detection of Liver Fibrosis Using Graph Convolutional Networks
Fibrosis refers to the deposition of collagen in tissue, which can lead to organ dysfunction and even organ failure. Typically, histochemical stains are used to visualize collagen. Detecting the early onset of fibrosis is critical for identifying long-term damage and potential loss of organ function. This paper uses a collagen segmentation method to extract a collagen map from an input histopathological image and then decomposes it into a set of tiles. The tiles are then clustered, and each cluster is classified by visually inspecting a few of its samples. The tiles clustered as dense collagen are used as centers, and each tile is connected to the closest center (Voronoi tessellation). After a set of graph convolutional layers, an attention mechanism aggregates the tile features and predicts the fibrosis stage of the input image.
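The graph construction step in the liver fibrosis paper (each tile linked to its closest dense-collagen center) can be sketched roughly as follows. This is only an illustration: the function name, the use of tile centroid coordinates, and Euclidean distance are assumptions, and the paper's actual features and edge rules may differ.

```python
import math


def connect_tiles_to_centers(tile_xy, center_ids):
    """Link every non-center tile to its nearest center tile.

    tile_xy    -- list of (x, y) tile centroids (assumed representation)
    center_ids -- indices of tiles clustered as dense collagen
    Returns a list of (tile, center) edges.
    """
    centers = set(center_ids)
    edges = []
    for tile, xy in enumerate(tile_xy):
        if tile in centers:
            continue  # centers are the hubs; they get no outgoing edge here
        # Nearest center by Euclidean distance (an assumption of this sketch).
        nearest = min(center_ids, key=lambda c: math.dist(xy, tile_xy[c]))
        edges.append((tile, nearest))
    return edges


# Toy example: five tiles on a line, tiles 0 and 3 are centers.
tiles = [(0, 0), (1, 0), (9, 0), (10, 0), (4, 0)]
edges = connect_tiles_to_centers(tiles, center_ids=[0, 3])
```

Assigning each tile to its nearest center partitions the tiles into Voronoi cells, which is why the post describes the step as a Voronoi tessellation.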
Probabilistic Symmetries and Invariant Neural Networks

A recent survey on neural networks that are invariant/equivariant under group actions (of which GNNs are a special class). Among other things, it does a good job of covering the significant works of the 20th century that laid the foundation for invariant neural networks.
Fresh picks from ArXiv
This week on ArXiv: GNNs for imbalanced case, relationships behind KGs, and software optimization of GNNs 💻

If I forgot to mention your paper, please shoot me a message and I will update the post.

GNNs
* Distance-wise Prototypical Graph Neural Network in Node Imbalance Classification with Tyler Derr
* Multi-view Contrastive Graph Clustering NeurIPS 2021
* Learning to Learn Graph Topologies NeurIPS 2021

Studies
* What is Learned in Knowledge Graph Embeddings?
* Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective
WSDM2022 Challenge from DGL Team

A really nice competition by DGL on temporal link prediction on two large-scale graph datasets. The dates are Oct 15 - Jan 20. Prize pool: $3500 + WSDM registration.
Monday Theory: Logical Expressiveness of Hypergraph GNNs

A prominent spotlight ICLR'20 paper by Barcelo et al. proved several important expressiveness bounds for message passing GNNs on graphs with ordinary binary edges, i.e., edges that connect two nodes. Using a well-known correspondence between First Order Logic formulae and the WL test, the authors show that Aggregate-Combine GNNs are bounded by ALCQ, a common Description Logic fragment of FOL. Aggregate-Combine-Readout GNNs (GNNs with global pooling) are bounded by the FOC-2 fragment of FOL, i.e., First Order Logic with counting quantifiers and at most 2 variables.

A new anonymous ICLR'22 submission extends this framework to hypergraphs, i.e., graphs where (hyper)edges connect B > 2 nodes. This time, the authors resort to higher-order k-WL tests and find a natural connection between k-WL and the expressiveness of hypergraph networks. Three cool contributions:

1. Networks on hypergraphs with B-ary hyperedges are bounded by the FOC-B fragment of FOL. That is, a logical formula can now have up to B variables.
2. The framework includes several relational architectures such as Neural Logic Machines, Deep Sets, Transformers, and GNNs.
3. The authors estimate theoretical bounds on the minimum depth and arity of hypergraph architectures for common GRL tasks. For instance, identifying bipartiteness of an n-node graph requires log(n) layers of 3-ary GNNs (or 2-WL kernels).

Experiments were conducted on fairly small toy graphs of 10-80 nodes, which is explained by the need to train hypergraph nets on all possible permutations of nodes in hyperedges (something that currently scales badly). Still, most of the hypotheses are confirmed by the experiments, so check out the full paper if you're into logic+GNN studies!
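For readers who have not met the WL test mentioned above, here is a minimal sketch of 1-WL color refinement, the test that bounds ordinary message passing GNNs. It is an illustration, not code from either paper; the function name, the fixed number of rounds, and the use of SHA-256 to compress colors are choices made for this sketch.

```python
import hashlib
from collections import Counter


def wl_signature(adj, rounds=3):
    """Run 1-WL color refinement on a graph given as {node: [neighbors]}.

    Returns the multiset of final node colors as a Counter. Two graphs with
    different signatures are certainly non-isomorphic; equal signatures
    mean that 1-WL cannot tell them apart.
    """
    colors = {v: "0" for v in adj}  # every node starts with the same color
    for _ in range(rounds):
        # New color = hash of (own color, sorted multiset of neighbor colors).
        colors = {
            v: hashlib.sha256(
                (colors[v] + "|" + ",".join(sorted(colors[u] for u in adj[v]))).encode()
            ).hexdigest()
            for v in adj
        }
    return Counter(colors.values())


# A 6-cycle (bipartite) versus two disjoint triangles (not bipartite).
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
same = wl_signature(hexagon) == wl_signature(two_triangles)
```

Both graphs are 2-regular, so every node keeps receiving the same color and `same` is true even though only the hexagon is bipartite. This is the kind of failure that motivates moving to higher-order tests and higher-arity networks for tasks such as bipartiteness.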
Fresh picks from ArXiv
This week on ArXiv: uncertainty for node classification, GNNs for tabular data, and cryptocurrency graph 🤑

If I forgot to mention your paper, please shoot me a message and I will update the post.

GNNs
* A Scalable AutoML Approach Based on Graph Neural Networks
* Optimizing Sparse Matrix Multiplications for Graph Neural Networks
* Graph Embedding with Hierarchical Attentive Membership WSDM 2022
* Unbiased Graph Embedding with Biased Graph Observations
* Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification with Stephan Günnemann
* Robustness of Graph Neural Networks at Scale with Stephan Günnemann
* VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization
* Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods
* Nested Graph Neural Networks with Pan Li
* Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features
* Tackling Oversmoothing of GNNs with Contrastive Learning
* On the Power of Edge Independent Graph Models
* On Provable Benefits of Depth in Training Graph Convolutional Networks
* UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation CIKM'2021
* Finding a Concise, Precise, and Exhaustive Set of Near Bi-Cliques in Dynamic Graphs WSDM 2022
* How to transfer algorithmic reasoning knowledge to learn new algorithms? with Petar Velickovic and Jian Tang


Software
* retworkx: A High-Performance Graph Library for Python
* NeuroComb: Improving SAT Solving with Graph Neural Networks
* Graph? Yes! Which one? Help!

Datasets
* Towards a Taxonomy of Graph Learning Datasets
* Cryptocurrencies Activity as a Complex Network: Analysis of Transactions Graphs
Graph papers at NeurIPS 2021

There are ~140 graph papers, which can be found here; that is about 6% of all 2334 papers. More statistics can be found here.
Papers with Code newsletter: GNN

The 19th issue of the Papers with Code newsletter covers a brief update on the latest developments in graph neural networks (GNNs), a list of recent applications of GNNs, and the top trending papers for October 2021 on Papers with Code.
From Mila with 💌 and graphs

A prolific week for Mila researchers:

- Michael Galkin released a new review of Knowledge Graph papers from EMNLP 2021. For those of us who didn't make it to the Dominican Republic, you can experience the premium Punta Cana content about applications of graphs in language modeling, KG construction, entity linking, and question answering.
- Best Long Paper award at EMNLP 2021 went to Visually Grounded Reasoning across Languages and Cultures by the team from Cambridge, Copenhagen, and Mila

Mila and Mila-affiliated folks run a good number of reading groups you might find useful: in addition to the GRL Reading Group and the LoGaG Reading Group, there are ones on Neural AI, Out-of-Distribution Generalization, Quantum & AI, and ML4Code.
Complex and Simple Models of Multidimensional Data : from graphs to neural networks

A mini-workshop on applications of graphs in biology. 1 December, free, but registration is mandatory.
Introducing TensorFlow Graph Neural Networks

A new API for TF2 to build GNNs. It would be interesting to see how it compares to PyG and DGL libraries.
Graph Neural Networks through the lens of Differential Geometry and Algebraic Topology

And Michael is back with the first post in a series on the connection between GML and differential geometry and algebraic topology. We've been waiting for this!
Successful Phase I Cancer Vaccine Trials Powered by Graph ML

Transgene and NEC Corporation published a press release on the successful Phase I trials of TG4050, a neoantigen cancer vaccine tested on ovarian cancer and head and neck cancer. The release outlines NEC's Neoantigen Prediction System, which is based on Graph ML algorithms.

We reached out to Mathias Niepert, Chief Research Scientist of NEC Labs, to shed a bit more light on the Graph ML setup, and he kindly provided a few interesting details! Mathias says:

The main graph ML method is derived from Embedding Propagation, which is a GNN that’s trained in an unsupervised way and, crucially, is able to handle and impute missing data in embedding space. The most relevant papers are Learning Graph Representations with Embedding Propagation (NeurIPS 2017) and Learning Representations of Missing Data for Predicting Patient Outcomes.

A major challenge is that for each neoantigen we have some measurements but not all. Obtaining some of these requires expensive tests and some have to be collected from previous biomedical studies. One ends up with several very different feature types (requiring different ML encoders) and, for each such feature type, we only sometimes have a value. The graph-based ML method helps to impute and learn a unifying embedding space.

The graph itself is created based on specific similarity measures between proteins and is not given a-priori. Having a general graph, the task is to rank peptide candidates which would be most efficient for a given patient.

From a probe of a patient's cancer and healthy cells, you get several tens of thousands of neoantigen candidates. To manufacture a personalized vaccine, you have to narrow this down to several dozen candidates. These candidates should have two properties: (1) they are likely to elicit an immune response, and (2) they differ from healthy cell antigens. You end up scoring the neoantigens with the ML method, taking the top K, and synthesizing the vaccine based on these. Graph ML is one component of a pretty complex system.

Mathias would like to emphasize that this is based on the work of several people at NEC and most credit should go to the domain experts who have collected the data and adapted and applied the graph ML methods to this problem.
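The "score, then take the top K" step described above can be illustrated with a toy sketch. Everything here is hypothetical: the candidate names, the two per-candidate numbers, and combining them by a simple product are assumptions for illustration only, not NEC's scoring method, which is learned by the ML system.

```python
import heapq


def rank_neoantigens(candidates, k):
    """Return the ids of the k highest-scoring candidates, best first.

    candidates -- {candidate_id: (immunogenicity, dissimilarity)}, where both
                  values are hypothetical model outputs in [0, 1]:
                  likelihood of eliciting an immune response, and how
                  different the candidate is from healthy cell antigens.
    """
    def score(candidate_id):
        immunogenicity, dissimilarity = candidates[candidate_id]
        # Product is an assumed way to require both properties at once.
        return immunogenicity * dissimilarity

    return heapq.nlargest(k, candidates, key=score)


# Toy data, invented for this example.
toy_candidates = {
    "pepA": (0.90, 0.80),
    "pepB": (0.95, 0.10),
    "pepC": (0.60, 0.90),
    "pepD": (0.30, 0.30),
}
selected = rank_neoantigens(toy_candidates, k=2)
```

In this toy data, pepB has the highest immunogenicity but is dropped because it is too similar to healthy antigens, which is the reason both properties have to enter the ranking.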
Fresh picks from ArXiv
This week on ArXiv: feature propagation to alleviate missing node features, new SOTA for molecular prediction, and benchmarks on GNN explanations 👴
If I forgot to mention your paper, please shoot me a message and I will update the post.

GNNs
* Multi-fidelity Stability for Graph Representation Learning with Joan Bruna
* AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020
* Demystifying Graph Neural Network Explanations
* Unsupervised Learning for Identifying High Eigenvector Centrality Nodes: A Graph Neural Network Approach
* On the Unreasonable Effectiveness of Feature propagation in Learning on Graphs with Missing Node Features with Michael Bronstein
* Directional Message Passing on Molecular Graphs via Synthetic Coordinates with Stephan Günnemann
Over-squashing, Bottlenecks, and Graph Ricci curvature

A second post by Michael Bronstein, this time about the over-squashing effect, where exponentially many neighbors aggregate their information into a fixed-size vector, causing information loss. In this post, Michael connects over-squashing with Ricci curvature, a well-studied notion of the curvature of a space in differential geometry.
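To get a feel for the "exponentially many neighbors" part, here is a small sketch that counts how many nodes fall within r hops of the root of a complete binary tree. The tree and the helper names are assumptions chosen for illustration; the post itself deals with general graphs and curvature.

```python
from collections import deque


def complete_binary_tree(depth):
    """Adjacency lists of a complete binary tree, heap-indexed (root is 0)."""
    n = 2 ** (depth + 1) - 1
    adj = {i: [] for i in range(n)}
    for child in range(1, n):
        parent = (child - 1) // 2
        adj[child].append(parent)
        adj[parent].append(child)
    return adj


def nodes_within(adj, source, hops):
    """Count nodes reachable from `source` in at most `hops` steps (BFS)."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return len(seen)


tree = complete_binary_tree(depth=5)
receptive_field = [nodes_within(tree, 0, r) for r in range(6)]
```

The receptive field of the root roughly doubles with every extra message passing layer, while the vector that has to summarize it stays the same size.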