You're reading from Graph Machine Learning Learn about the latest advancements in graph data to build robust machine learning models

Product type Paperback

Published in Jul 2025

Publisher Packt

ISBN-13 9781803248066

Length 434 pages

Edition 2nd Edition

Languages

Python

Tools

PyTorch

Concepts

Machine Learning

Authors (3):

Aldo Marzullo

Enrico Deusebio

Claudio Stamile

View More author details

Table of Contents (20) Chapters

Preface

1. Part 1: Introduction to Graph Machine Learning

2. Getting Started with Graphs FREE CHAPTER

3. Graph Machine Learning

4. Neural Networks and Graphs

5. Part 2: Machine Learning on Graphs

6. Unsupervised Graph Learning

7. Supervised Graph Learning

8. Solving Common Graph-Based Machine Learning Problems

9. Part 3: Practical Applications of Graph Machine Learning

10. Social Network Graphs

11. Text Analytics and Natural Language Processing Using Graphs

12. Graph Analysis for Credit Card Transactions

13. Building a Data-Driven Graph-Powered Application

14. Part 4: Advanced topics in Graph Machine Learning

15. Temporal Graph Machine Learning

16. GraphML and LLMs

17. Novel Trends on Graphs

18. Index

19. Other Books You May Enjoy

Download a Free PDF Copy of This Book

Hands-on examples

Now that we understand the basic concepts and notions about graphs and network analysis, it is time to dive into some practical examples that will help us start to put into practice the general concepts we have learned so far. In this section, we will present some examples and toy problems that are generally used to study the properties of networks, as well as benchmark performances and the effectiveness of networks’ algorithms.

Simple graphs

We will start by looking at some very simple examples of networks. Fortunately, networkx already comes with a number of graphs already implemented, ready to be used and played with. Let’s start by creating a fully connected undirected graph with n nodes, as follows:

complete = nx.complete_graph(n=7)

This has edges and a clustering coefficient C=1. Although fully connected graphs are not very interesting on their own, they represent a fundamental building block that may arise within larger graphs. A fully connected subgraph of n nodes within a larger graph is generally referred to as a clique of size n.

A clique, C, in an undirected graph is defined as the subset of its vertices, , such that every two distinct vertices in the subset are adjacent. This is equivalent to the condition that the induced subgraph of G induced by C is a fully connected graph.

Cliques represent one of the basic concepts in graph theory and are often also used in mathematical problems where relationships need to be encoded. Besides, they also represent the simplest unit when constructing more complex graphs. On the other hand, the task of finding cliques of a given size n in larger graphs (clique problem) is of great interest and it can be shown that it is a nondeterministic polynomial-time complete (NP-complete) problem, often studied in computer science. In other words, finding a solution is computationally difficult, and the required time increases exponentially as the size of the graph increases.

Some simple examples of networkx graphs can be seen in the following figure:

Figure 1.20: Simple examples of graphs with networkx: (left) fully connected graph; (center) lollipop graph; (right) barbell graph

In Figure 1.20, we show a fully connected graph (left). These graphs represent well-integrated groups, such as small teams in an organization, where every member interacts with every other member. This graph can help analyze communication patterns. It also contains two other simple examples containing cliques that can be easily generated with networkx, outlined as follows:

A lollipop graph formed by a clique of size m and a branch of n nodes, as shown in the following code snippet:
```
lollipop = nx.lollipop_graph(m=7, n=3)
```
A barbell graph formed by two cliques of size m1 connected by a path of size m2, which resembles the sample graph we used previously to characterize some of the global and local properties. The code to generate this is shown in the following snippet:
```
barbell = nx.barbell_graph(m1=7, m2=4)
```

Such simple graphs are basic building blocks that can be used to generate more complex networks by combining them. Merging subgraphs is very easy with NetworkX and can be done with just a few lines of code, as shown in the following code snippet, where the three graphs are merged into a single graph and some random edges are placed to connect them:

def get_random_node(graph):
    return np.random.choice(graph.nodes)
allGraphs = nx.compose_all([complete, barbell, lollipop])
allGraphs.add_edge(get_random_node(lollipop), get_random_node(lollipop))
allGraphs.add_edge(get_random_node(complete), get_random_node(barbell))

Other very simple graphs (that can then be merged and played around with) can be found at https://networkx.org/documentation/stable/reference/generators.html#module-networkx.generators.classic.

Generative graph models

Although creating simple subgraphs and merging them is a way to generate new graphs of increasing complexity, networks may also be generated by means of probabilistic models and/or generative models that let a graph grow by itself. Such graphs usually share interesting properties with real networks and have long been used to create benchmarks and synthetic graphs, especially in times when the amount of data available was not as overwhelming as today. Here, we present some examples of random generated graphs, briefly describing the models that underlie them.

Watts and Strogatz (1998)

This model was introduced by the authors to study the behavior of small-world networks. A small-world network is characterized by a high clustering coefficient and a short average path length, meaning that most nodes can be reached from any other node through a small number of intermediate connections. This structure often mirrors real-world social networks, where individuals are typically connected through a few mutual acquaintances, allowing for rapid information dissemination. The graph is generated by first displacing n nodes in a ring and connecting each node with its k neighbors. Each edge of such a graph then has a probability p of being rewired to a randomly chosen node. By ranging p, the Watts and Strogatz model allows a shift from a regular network (p=0) to a completely random network (p=1). In between, graphs exhibit small-world features; that is, they tend to bring this model closer to social network graphs. These kinds of graphs can be easily created with the following command:

graph = nx.watts_strogatz_graph(n=20, k=5, p=0.2)

An illustration of a graph before and after rewiring can be found below:

Figure 1.21: A sample graph before (left) and after (right) rewiring with p=0.2

Barabási-Albert (1999)

The model proposed by Albert and Barabási is based on a generative model that allows the creation of random scale-free networks by using a preferential attachment schema, where a network is created by progressively adding new nodes and attaching them to already existing nodes, with a preference for nodes that have more neighbors. Mathematically speaking, the underlying idea of this model is that the probability for a new node to be attached to an existing node i depends on the degree of the i-th node, , and the sum of the degrees of all existing nodes in the network, , according to the following formula:

Thus, nodes with a large number of edges (hubs) tend to develop even more edges, whereas nodes with few links are unlikely to develop other links (periphery). Networks generated by this model exhibit a power-law distribution for the connectivity (that is, degree) between nodes. Such a behavior is also found in real networks, such as the World Wide Web (WWW) network and the actor collaboration network (connections between actors based on their collaborations in films and television shows).

Interestingly, this model illustrates that it is the popularity of a node (how many edges it already has) rather than its intrinsic node properties that influences the creation of new connections. The initial model has then been extended (and this is the version that is available on NetworkX) to also allow the preferential attachment of new edges or the rewiring of existing edges.

The Barabási-Albert model is illustrated in the following figure:

Figure 1.22: Barabási-Albert model (left) with 20 nodes; distribution of connectivity with n=100.000 nodes (right), showing the scale-free power law distribution

In Figure 1.22, we showed an example of the Barabási-Albert model for a small network, where you can already observe the emergence of hubs (on the left), as well as the probability distribution of the degree of the nodes, which exhibits a scale-free power-law behavior (on the right). The preceding distribution can easily be replicated in networkx, as in the following code snippet, where n and m are the number of nodes and edges, respectively, p is the probability value for adding an edge between existing nodes, and q is the probability value of the rewiring of existing edges (with p + q < 1):

ba_model = nx.extended_barabasi_albert_graph(n=100,m=1,p=0,q=0)
degree = dict(nx.degree(ba_model)).values()
bins = np.round(np.logspace(np.log10(min(degree)), np.log10(max(degree)), 10))
cnt = Counter(np.digitize(np.array(list(degree)), bins))