You're reading from Graph Machine Learning Learn about the latest advancements in graph data to build robust machine learning models

Product type Paperback

Published in Jul 2025

Publisher Packt

ISBN-13 9781803248066

Length 434 pages

Edition 2nd Edition

Languages

Python

Tools

PyTorch

Concepts

Machine Learning

Authors (3):

Aldo Marzullo

Enrico Deusebio

Claudio Stamile

View More author details

Table of Contents (20) Chapters

Preface

1. Part 1: Introduction to Graph Machine Learning

2. Getting Started with Graphs FREE CHAPTER

3. Graph Machine Learning

4. Neural Networks and Graphs

5. Part 2: Machine Learning on Graphs

6. Unsupervised Graph Learning

7. Supervised Graph Learning

8. Solving Common Graph-Based Machine Learning Problems

9. Part 3: Practical Applications of Graph Machine Learning

10. Social Network Graphs

11. Text Analytics and Natural Language Processing Using Graphs

12. Graph Analysis for Credit Card Transactions

13. Building a Data-Driven Graph-Powered Application

14. Part 4: Advanced topics in Graph Machine Learning

15. Temporal Graph Machine Learning

16. GraphML and LLMs

17. Novel Trends on Graphs

18. Index

19. Other Books You May Enjoy

Download a Free PDF Copy of This Book

Graph properties

As we have already learned, a graph is a mathematical model that is used for describing relationships between entities. However, each complex network presents intrinsic properties. Such properties can be measured by particular metrics, and each measure may characterize one or several local and global aspects of the graph.

In a graph for a social network such as X (formerly known as Twitter), for example, users (represented by the nodes of the graph) are connected to each other. However, there are users who are more connected than others (influencers). On the Reddit social graph, users with similar characteristics tend to group into communities.

We have already mentioned some of the basic features of graphs, such as the number of nodes and edges in a graph, which constitute the size of the graph itself. Those properties already provide a good description of the structure of a network. Think about the Facebook graph, for example: it can be described in terms of the number of nodes and edges. Such numbers easily allow it to be distinguished from a much smaller network (for example, the social structure of an office) but fail to characterize more complex dynamics (for example, how similar nodes are connected). To this end, more advanced graph-derived metrics can be considered, which can be grouped into four main categories, outlined as follows:

Integration metrics: These measure how nodes tend to be interconnected with each other.
Segregation metrics: These quantify the presence of groups of interconnected nodes, known as communities or modules, within a network.
Centrality metrics: These assess the importance of individual nodes inside a network.
Resilience metrics: These can be thought of as a measure of how much a network can maintain and adapt its operational performance when facing failures or other adverse conditions.

Those metrics are defined as global when expressing a measure of an overall network. On the other hand, local metrics measure values of individual network elements (nodes or edges). In weighted graphs, each property may or may not account for the edge weights, leading to weighted and unweighted metrics.

In the following section, we describe some of the most used metrics that measure global and local properties. For simplicity, unless specified differently in the text, we illustrate the global unweighted version of the metric. In several cases, this is obtained by averaging the local unweighted properties of the node.

Integration metrics

In this section, some of the most frequently used integration metrics will be described.

Distance, path, and shortest path

The concept of distance in a graph is often related to the number of edges to traverse to reach a target node from a given source node.

Consider a source node i and a target node j. The set of edges connecting node i to node j is called a path. When studying complex networks, we are often interested in finding the shortest path between two nodes. The shortest path between a source node i and a target node j is the path having the lowest number of edges compared to all the possible paths between i and j. The diameter of a network is the number of edges contained in the longest shortest path among all possible shortest paths.

Look at the following figure. There are different paths to reach Tokyo from Dublin. However, one of them is the shortest (the edges on the shortest path are highlighted):

Figure 1.14: The shortest path between two nodes

The shortest_path function of the NetworkX Python library enables users to quickly compute the shortest path between two nodes in a graph. Consider the following code, in which a seven-node graph is created using networkx. For clarity and simplicity in creating the graph structure, we will use numerical identifiers for the nodes, even though they represent cities:

G = nx.Graph()
nodes = {1:'Dublin',2:'Paris',3:'Milan',4:'Rome',5:'Naples',
         6:'Moscow',7:'Tokyo'}
G.add_nodes_from(nodes.keys())
G.add_edges_from([(1,2),(1,3),(2,3),(3,4),(4,5),(5,6),(6,7),(7,5)])

The shortest path between a source node (for example, 'Dublin', identified by key 1) and a target node (for example, 'Tokyo', identified by key 7) can be obtained as follows:

path = nx.shortest_path(G,source=1,target=7)

This should output the following:

[1,3,4,5,7]

Here, [1,3,4,5,7] are the nodes contained in the shortest path between 'Tokyo' and 'Dublin'.

Characteristic path length

Let’s assume we have a fully connected graph. The characteristic path length is defined as the average of all the shortest path lengths between all possible pairs of nodes. If is the average path length between node i and all the other nodes, the characteristic path length is computed as follows:

Here, V is the set of nodes in the graph, and q = |V| represents its order. This equation calculates the average distance across the entire network by summing the shortest path lengths from each node to every other node and normalizing it by the total number of pairs, q(q–1). The characteristic path length is a crucial measure of how efficiently information spreads across a network. Networks with shorter characteristic path lengths facilitate quick information transfer, thereby reducing communication costs. This concept is particularly important in fields such as social network analysis, where understanding the speed of information dissemination can provide insights into network dynamics. Characteristic path length can be computed through NetworkX using the following function:

nx.average_shortest_path_length(G)

This should give us the following number, quantifying the average shortest path length:

2.1904761904761907

In the following figure, two examples of graphs are depicted. As observable, fully connected graphs have a lower average shortest path length compared to circular graphs. Indeed, in a fully connected graph, the number of edges to traverse to reach a node from another is, on average, less than the one in a circular graph, where multiple edges need to be traversed.

Figure 1.15: Characteristic path length of a fully connected graph (left) and a circular graph (right)

Notice that this metric cannot always be defined since it is not possible to compute a path among all the nodes in disconnected graphs. For this reason, network efficiency is also widely used.

Global and local efficiency

Global efficiency is the average of the inverse shortest path length for all pairs of nodes. Such a metric can be seen as a measure of how efficiently information is exchanged across a network. Consider that is the shortest path between a node i and a node j. The network efficiency is defined as follows:

The contribution to the efficiency for pairs of disconnected nodes is 0, such that disconnected pairs can be dropped from the summation above.

Efficiency has a maximum value of 1 when a graph is fully connected, while it has a minimum value of 0 for completely disconnected graphs. Intuitively, the shorter the path, the lower the measure.

The local efficiency of a node can be computed by considering only the neighborhood of the node in the calculation, without the node itself. In the formula, is the neighborhood of the node i and represents the number of neighbors of node i. The local coefficient is computed as:

Global efficiency is computed in NetworkX using the following command:

nx.global_efficiency(G)

The output should be as follows:

0.6111111111111109

Average local efficiency is computed in NetworkX using the following command:

nx.local_efficiency(G)

The output should be as follows:

0.6666666666666667

In the following figure, two examples of graphs are depicted. As observed, a fully connected graph on the left presents a higher level of efficiency compared to a circular graph on the right. In a fully connected graph, each node can be reached from any other node in the graph, and information is exchanged rapidly across the network. However, in a circular graph, several nodes should instead be traversed to reach the target node, making it less efficient:

Figure 1.16: Global efficiency of a fully connected graph (left) and a circular graph (right)

Integration metrics describe the connection among nodes. However, more information about the presence of groups can be extracted by considering segregation metrics.

Segregation metrics

In this section, some of the most common segregation metrics will be described.

Clustering coefficient

The clustering coefficient is a measure of how closely nodes are grouped together. It is defined as the fraction of triangles (complete subgraph of three nodes and three edges) around a node and is equivalent to the fraction of the node’s neighbors that are neighbors of each other. In the formula, let be the number of neighbors of a node i and let be the number of edges that exist between these neighbors. So ) will be the maximum possible number of edges that could exist among the neighbors of node 𝑖. The local clustering coefficient can then be calculated as follows:

Therefore, the global clustering coefficient is computed by averaging the clustering coefficients for all nodes:

The global clustering coefficient is computed in networkx using the following command:

nx.average_clustering(G)

This should output the following:

0.6666666666666667

The local clustering coefficient is computed in networkx using the following command:

nx.clustering(G)

This should output the following:

{1: 1.0,
2: 1.0,
3: 0.3333333333333333,
4: 0,
5: 0.3333333333333333,
6: 1.0,
7: 1.0}

The output is a Python dictionary containing, for each node (identified by the respective key), the corresponding value. In the graph shown in Figure 1.17, two clusters of nodes can be easily identified. By computing the clustering coefficient for each single node, it can be observed that Rome has the lowest value. Tokyo and Moscow, as well as Paris and Dublin, are instead very well connected within their respective groups (notice the size of each node is drawn proportionally to each node’s clustering coefficient). The graph can be seen in the following figure:

Figure 1.17: Local clustering coefficient representation

Modularity

Modularity was designed to quantify the division of a network into aggregated sets of highly interconnected nodes, commonly known as modules, communities, groups, or clusters. The main idea is that networks having high modularity will show dense connections within the module and sparse connections between modules.

Consider a social network such as Reddit: members of communities related to video games tend to interact much more with other users in the same community, talking about recent news, favorite consoles, and so on. However, they will probably interact less with users talking about fashion. Differently from many other graph metrics, modularity is often computed by means of optimization algorithms.

We will discuss this metric in more detail in Chapter 6, Solving Common Graph-Based Machine Learning Problems, when discussing the algorithms used for community extractions and identifications in more depth. For now, it is sufficient to understand that high modularity indicates a strong community structure, where many connections exist within communities and fewer connections exist between them. Low modularity, instead, suggests that the network does not have a strong community structure (we can say that the distribution of the edges is more random).

Modularity in NetworkX is computed using the modularity function of the networkx.algorithms.community module, as follows:

import networkx.algorithms.community as nx_comm
nx_comm.modularity(G, communities=[{1,2,3}, {4,5,6,7}])

Here, the second argument—communities—is a list of sets, each representing a partition of the graph. The output should be as follows:

0.3671875

Segregation metrics help to understand the presence of groups. However, each node in a graph has its own importance. To quantify this, we can use centrality metrics, which are discussed in the next sections.

Centrality metrics

In this section, some of the most common centrality metrics will be described. Centrality metrics are extremely useful for identifying the most relevant nodes in a network. As a result, these quantities may be the most used metrics when filtering and targeting nodes and edges (e.g., finding influencers, critical points of failures, etc.).

Degree centrality

One of the most common and simple centrality metrics is the degree centrality metric. This is directly connected with the degree of a node, measuring the number of incident edges on a certain node i.

Intuitively, the more a node is connected to another node, the more its degree centrality will assume high values. Note that if a graph is directed, the in-degree centrality and out-degree centrality should be considered for each node, related to the number of incoming and outcoming edges, respectively. This number is then normalized by the graph’s size to obtain a number between 0 and 1. Degree centrality is computed in NetworkX by using the following command:

nx.degree_centrality(G)

The output should be as follows:

{1: 0.3333333333333333, 2: 0.3333333333333333, 3: 0.5, 4: 0.3333333333333333, 5: 0.5, 6: 0.3333333333333333, 7: 0.3333333333333333}

Closeness centrality

The closeness centrality metric attempts to quantify how close a node is (well connected) to other nodes. More formally, it refers to the average distance of a node i to all other nodes in the network. If is the shortest path between node i and node j, the closeness centrality is defined as follows:

Here, V is the set of nodes in the graph. Closeness centrality can be computed in NetworkX using the following command:

nx.closeness_centrality(G)

The output should be as follows:

{1: 0.4, 2: 0.4, 3: 0.5454545454545454, 4: 0.6, 5: 0.5454545454545454, 6: 0.4, 7: 0.4}

Betweenness centrality

The betweenness centrality metric evaluates how much a node acts as a bridge between other nodes. Even if a node has a low degree or closeness centrality, it can still be strategically connected because of a high betweenness centrality, if it helps to keep the whole network connected.

If is the total number of shortest paths between node w and node j and L_wj(i) is the total number of shortest paths between w and j passing through node i, then the betweenness centrality is defined as follows:

If we observe the formula, we can notice that the higher the number of shortest paths passing through node i, the higher the value of the betweenness centrality. Betweenness centrality is computed in networkx by using the following command:

nx.betweenness_centrality(G)

The output should be as follows:

{1: 0.0, 2: 0.0, 3: 0.5333333333333333, 4: 0.6, 5: 0.5333333333333333, 6: 0.0, 7: 0.0}

In Figure 1.18, we illustrate the difference between degree centrality, closeness centrality, and betweenness centrality. Milan and Naples have the highest degree centrality. Rome has the highest closeness centrality since it is the closest to any other node. It also shows the highest betweenness centrality because of its crucial role in connecting the two visible clusters and keeping the whole network connected.

You can see the differences here:

Figure 1.18: Degree centrality (left), closeness centrality (center), and betweenness centrality (right)

Finally, we will mention resilience metrics, which enable us to measure the vulnerability of a graph—that is, how susceptible a network is to disconnection or functional failure when certain nodes are removed.

Resilience metrics

There are several metrics that measure a network’s resilience. Assortativity is one of the most used.

Assortativity coefficient

Assortativity is used to quantify the tendency of nodes being connected to similar nodes, which can impact the network’s ability to withstand failures or “attacks.” High assortativity indicates that nodes of similar degrees are more likely to be connected, leading to a resilient structure where the failure of some nodes does not significantly disrupt overall connectivity. Conversely, networks with low assortativity tend to have nodes connecting with dissimilar degrees, making them more vulnerable to targeted attacks on high-degree nodes, as shown in the figure below:

Figure 1.19: Disassortative (left) and assortative (right) graphs

There are several ways to measure such correlations. One of the most used methods is the Pearson correlation coefficient between the degrees of directly connected nodes (nodes on two opposite ends of a link). The coefficient assumes positive values when there is a correlation between nodes of a similar degree, while it assumes negative values when there is a correlation between nodes of a different degree. Assortativity using the Pearson correlation coefficient is computed in NetworkX by using the following command:

nx.degree_pearson_correlation_coefficient(G)

The output should be as follows:

-0.6

Social networks are mostly assortative. However, the so-called influencers (famous singers, football players, fashion bloggers, etc.) tend to be followed (incoming edges) by several standard users, while tending to not be connected with each other and showing disassortative behavior.

It is important to note that the properties previously presented are just a subset of the many metrics available for describing graphs. We have chosen to focus on these specific metrics because they offer foundational insights into graph theory and are frequently used in practical applications. A wider set of metrics and algorithms can be found at https://networkx.org/documentation/stable/reference/algorithms/.