Direction Improves Graph Learning Measuring Homophily in Directed Graphs A Toy Example Dir-GNN: Directed Graph Neural Network Experimental Results

Artificial Intelligence

Direction Improves Graph Learning Measuring Homophily in Directed Graphs A Toy Example Dir-GNN: Directed Graph Neural Network Experimental Results

admin

June 10, 2023

Direction Improves Graph Learning
Measuring Homophily in Directed Graphs
A Toy Example
Dir-GNN: Directed Graph Neural Network
Experimental Results

Many interesting real-world graphs, encountered in modelling social, transportation, financial transactions, or academic citation networks, are directed. The direction of the perimeters often conveys crucial insights, otherwise lost if one considers only the connectivity pattern of the graph.

In contrast, most Graph Neural Networks (GNNs) which have made remarkable strides in a wide range of graph ML applications, operate under the idea that the input graph is undirected. Making the input graph undirected has change into so prevalent through the years that PyTorch-Geometric, one of the vital popular GNN libraries, features a general utility function that robotically makes graphs undirected when loading datasets [2].

This inclination towards undirected graphs comes from two “primordial sins” of GNNs. First, undirected graphs have symmetric Laplacians with orthogonal eigenvectors offering a natural generalisation of the Fourier transform, on which early spectral GNNs relied to operate properly. Second, early datasets used to benchmark GNNs were predominantly homophilic graphs [3], akin to Cora and Pubmed [4]. On such datasets disregarding the direction by converting the directed graph into an undirected one appears to be advantageous, early evidence whereof has helped cement the “undirected” paradigm.

Direction is essentially useless in homophilic graphs (left), an statement that has led to the vast majority of current GNNs disregarding this information. In contrast, within the heterophilic setting (right), the usage of direction can bring large gains (10% to fifteen%) if used accurately, as proposed in our Dir-GNN framework.

We challenge this established order in our recent paper [5], showing that directionality can bring extensive gains in heterophilic settings.

The homophily of a graph is often measured because the fraction of neighbours with the identical label because the node itself, averaged across all nodes (node homophily). For directed graphs, we propose the weighted node homophily:

h(S) = 1/n Σᵤ ( Σᵥ sᵤᵥ * [yᵤ = yᵥ] ) / Σᵥ sᵤᵥ

where denotes the indicator function, n is the variety of nodes, and is a general adjacency matrix, which might be picked up as 𝐀or𝐀ᵀ, or as higher-order matrices, akin to 𝐀𝐀ᵀor 𝐀² for directed graphs, or because the symmetric matrix 𝐀ᵤ= (𝐀+ 𝐀ᵀ) / 2 and its higher-order counterpart 𝐀ᵤ², if the graph is taken into account as undirected.

Even when 1-hop neighbours are heterophilic [6], the situation may change when going to farther nodes. In comparison with the undirected case, there are 4 distinct 2-hops in directed graphs represented by the matrices 𝐀²,(𝐀ᵀ)², 𝐀𝐀ᵀ, and𝐀ᵀ𝐀, which may manifest different levels of (weighted) homophily.

Given that GNNs operate through multiple-hop aggregations, they will leverage the homophily of any 2-hop (and even further hops) of a graph. To have a comprehensive metric capturing the utmost homophily a GNN can leverage in principle, we introduce the notion of effective homophily, defined as the utmost weighted node homophily at any hop of the graph.

Empirically, we observe that the effective homophily of directed homophilic datasets is left unchanged when making the graph undirected. In heterophilic graphs, in contrast, this conversion decreases the effective homophily by almost 30% on average.

We compare the weighted homophily of each directed and undirected diffusion matrices for a wide range of each homophilic and heterophilic datasets. The effective homophily of the directed graph (hᵤ⁽ᵉᶠᶠ⁾) is way larger than that of the undirected graph (h⁽ᵉᶠᶠ⁾) for heterophilic datasets, suggesting a possible gain from using directionality effectively.

In synthetic experiments, we again observe that the effective homophily of directed stochastic block model graphs is consistently higher in comparison with their undirected counterparts. Interestingly, this gap widens for graphs which are less homophilic.

Particularly, we observe that 𝐀𝐀ᵀ and𝐀ᵀ𝐀consistently seem like the “most homophilic matrices” for heterophilic graphs.

To provide an intuition about why that is the case, imagine we try to predict the publication 12 months of a particular academic paper, for example, the Kipf & Welling 2016 GCN paper, given the directed citation network and the 12 months of publication of the opposite papers. Consider two different sorts of 2-hop relationships: one where we take a look at papers cited by the papers that our paper of interest v cites (represented by the vth row of the matrix 𝐀²), and one other where we take a look at papers that cite the identical sources as our paper (represented by (𝐀𝐀ᵀ)ᵥ).

In the primary case (𝐀²), allow us to start from the GCN paper and follow its citations twice. We land on a paper by Frasconi et al. from 1998. This older paper doesn’t give us much helpful details about when our GCN paper was published since it is simply too far up to now.

Toy example of a directed citation network.

Within the second case (𝐀𝐀ᵀ), we start with the GCN paper, follow a citation, after which come back to a paper that cites the identical source, just like the 2017 GAT paper. This paper is way closer to our GCN paper’s publication 12 months and thus provides a greater clue. More generally, nodes that share more references, like in our second example, may have higher scores in 𝐀𝐀ᵀ, and thus contribute more to our final prediction.

Now, consider an undirected 2-hop relationship (𝐀ᵤ²), which is just the common of the 4 possible 2-hop matrices. This includes our first type (like Frasconi et al.), which was not very helpful. Due to this fact, the highly useful 𝐀𝐀ᵀ matrix gets diluted by less informative matrices, like 𝐀², resulting in a less homophilic operator, leading to a less reliable predictor overall.

While now we have used a citation network in our example, this intuition applies more broadly. In a social network, for example, an influencer’s characteristics usually tend to resemble those of users who share many followers with them, represented by 𝐀ᵀ𝐀. Similarly, in a transaction network, two accounts sending money to the identical set of accounts (captured by 𝐀𝐀ᵀ), are more likely to exhibit similar behaviour.

In an effort to leverage directionality effectively, we propose the Directed Graph Neural Network (Dir-GNN) framework, which extends MPNNs to directed graphs by performing separate aggregations over the in- and out-neighbours of a node:

ᵤ⁽ᵏ⁾ᵢₙ = AGGᵢₙ({{ᵥ⁽ᵏ⁻¹⁾, ᵤ⁽ᵏ⁻¹⁾) : (u,v) ∈ E }})

ᵤ⁽ᵏ⁾ₒᵤₜ = AGGₒᵤₜ({{ᵥ⁽ᵏ⁻¹⁾, ᵤ⁽ᵏ⁻¹⁾) : (v,u) ∈ E }})

ᵤ⁽ᵏ⁾ = COM(ᵤ⁽ᵏ⁻¹⁾, ᵤ⁽ᵏ⁾ᵢₙ, ᵤ⁽ᵏ⁾ₒᵤₜ)

where the aggregation maps AGGᵢₙ and AGGₒᵤₜ, in addition to the mixture maps COM are learnable (normally a small neural network). Importantly, AGGᵢₙ and AGGₒᵤₜ can have independent sets of parameters to permit for various aggregations over in- and out-edges.

Interestingly, this procedural pattern resembles the one implemented by a natural extension of the classical Weisefiler-Lehman graph isomorphism test (1-WL) to directed graphs [7]. This connection is instrumental: when it comes to discriminative power, we show that Dir-GNN is strictly more powerful than standard MPNNs, which either convert the graph to undirected or propagate messages only within the direction of the perimeters.

Our framework can be flexible: it is simple to define directed counterparts of specific architectures akin to GCN, GraphSAGE or GAT. For instance, we will define Dir-GCN as:

𝐗⁽ᵏ⁾ = σ(𝐒ₒᵤₜ𝐗⁽ᵏ⁻¹⁾𝐖ₒᵤₜ⁽ᵏ⁾ + (𝐒ₒᵤₜ)ᵀ𝐗⁽ᵏ⁻¹⁾𝐖ᵢₙ⁽ᵏ⁾)

where 𝐒ₒᵤₜ= ₒᵤₜ⁻¹ᐟ² 𝐀 ᵢₙ⁻¹ᐟ² and ᵢₙ and ₒᵤₜ represent the diagonal in- and out-degree matrices, respectively.

We also show that Dir-GNN, when iteratively applied over multiple layers, results in more homophilic aggregations. Unlike other models, Dir-GNN can access the 4 2-hop matrices 𝐀²,(𝐀ᵀ)², 𝐀𝐀ᵀ and𝐀ᵀ𝐀 and learn to weigh them in a different way. In contrast, a model operating on the undirected graph has only access to 𝐀ᵤ², while models propagating information exclusively along in- or out-edges are limited to (𝐀ᵀ)² and 𝐀² respectively.

Dir-GNN, because of its separate aggregation of the 2 directions, is due to this fact the one model operating on 𝐀𝐀ᵀ and𝐀ᵀ𝐀, which now we have shown to be probably the most homophilic matrices and due to this fact probably the most reliable predictor.

We first compared GraphSAGE and its directed extension (Dir-SAGE) on an artificial task requiring directionality information. The outcomes confirm that only Dir-SAGE(in+out), with access to each in- and out-edges, is capable of almost perfectly solve the duty. The model acting on the undirected version of the graph performs no higher than probability, while the models acting on only in- or out-edges perform similarly obtaining around 75% accuracy.

When examining the performance of GraphSAGE and its Dir- extensions on an artificial task requiring directionality information, only Dir-SAGE (in+out), which utilises information from each directions, is able to solving the duty.

We further validated our approach with an ablation study comparing GCN, GraphSAGE and GAT base models with their Dir- extensions. On heterophilic datasets, using directionality brings exceptionally large gains (10% to twenty% absolute) in accuracy across all three base GNN models. Furthermore, Dir-GNN beats state-of-the-art models designed especially for heterophilic graphs.

These results suggest that, when present, using the sting direction can significantly improve learning on heterophilic graphs. In contrast, discarding it’s so harmful that not even complex architectures could make up for this loss of knowledge.

Latest state-of-the-art results on heterophilic graphs, through the use of direction correctly.

On the opposite hand, on homophilic datasets using directionality leaves the performance unchanged (and even barely hurts). That is consistent with our findings that using directionality as in our framework generally increases the effective homophily of heterophilic datasets, while leaving it almost unchanged for homophilic datasets.

LEAVE A REPLY Cancel reply