Topological Data Analysis of
Complex Spatial Systems
Mason A. Porter (@masonporter)
Department of Mathematics, UCLA
FIG. 8. Visualization of the three different spatial partitions of Peru’s provinces on a map. (a) Broad climate partition
into coast (yellow), mountains (brown), and jungle (green); (b) detailed climate partition, in which we start with the
broad partition and then further divide the coast and mountains into northern coast, central coast, southern coast, northern
mountains, central mountains, and southern mountains; and (c) the administrative partition of Peru. We obtained province
boundaries from [82] and plot the maps in MATLAB.
We use the term “spatial partitions” to describe partitions that have high z-Rand scores in
comparison to the manual climate or administrative partitions. For multilayer networks, we
also compare the algorithmic partitions to partitions that contain a planted temporal change in
community structure. For these comparisons, we group the multilayer nodes into ones that occur
before or after a “critical” time tc (i.e., partitions into a “pre-tc” community and a “post-tc”
community). We test the set $t = \{1, 1+D, 1+2D, \ldots, 1+D \times (\lfloor T/D \rfloor - 1)\}$ of times that we
use to create the multilayer network, and we report the time with the highest z-Rand score as the
critical time tc. We also test for pairs of critical times (yielding a partition into three communities)
by examining all possible pairs of critical times tc1 and tc2 in the same manner. We use the term
“temporal partitions” to describe algorithmic partitions of the disease-correlation networks that
yield high z-Rand scores in these comparisons.
4.3.1 Modularity Maximization Using the NG Null Model. We first study the community
structures of the 700 overlapping static networks formed by taking t = {1,2,...,700} and using
D = 80. (There are 779 time points in total.) The community structures that we obtain from maximizing
modularity have a strong spatial organization, as suggested by the high z-Rand scores
when compared to topographical partitions. As one can see in Fig. 9(a), in which we plot the
z-Rand scores versus the centers of the time windows that correspond to the static networks, the
spatial organization is especially evident starting in the year 2000. In our subsequent figures,
time points that we indicate on the axes also correspond to the centers of the associated time
windows.
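A minimal sketch of the window bookkeeping in the paragraph above, assuming that the window that starts at time t covers the D consecutive time points t, t + 1, ..., t + D - 1 and that its center is the midpoint of that range. The function names are hypothetical; only the numbers 779, 80, and 700 come from the text.

```python
def window_starts(num_time_points, width):
    """Start times t = 1, 2, ... of every full window [t, t + width - 1]."""
    return list(range(1, num_time_points - width + 2))


def window_center(start, width):
    """Midpoint of the window [start, start + width - 1] (an assumed convention)."""
    return start + (width - 1) / 2


starts = window_starts(779, 80)
print(len(starts))            # 700 overlapping windows
print(starts[0], starts[-1])  # 1 700
print(window_center(1, 80))   # 40.5
```

With 779 time points and D = 80, the last full window starts at 779 - 80 + 1 = 700, which matches the 700 static networks in the text.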
As one can see from a plot of number of epidemic cases over time (see Fig. 7), this transition
seems to occur near the time of the largest countrywide epidemic in the data, and the subsequent
period includes recurring yearly epidemics that were linked to climatic patterns in prior
Spatial Systems
• Space has a major influence on
the structures of networks and
other complex systems
• Useful reference: Marc
Barthelemy, Morphogenesis of
Spatial Networks, 2018
Slime molds and fungal networks
• See, e.g., work by Mark
Fricker and collaborators
Leaf-Venation
Patterns
(Eleni Katifori and
collaborators)
Spiders: Spinning Webs While Under the Influence
The border between
Belgium and the
Netherlands at Baarle-
Nassau/Baarle-Hertog
Topological Data Analysis
• Algorithmic methods to study high-dimensional data in a
quantitative manner
• Data from point clouds, networks, etc.
• Examine the “shape” of data
• Persistent homology
• Mathematical formalism for studying topological invariants
• Fast algorithms
• Persistent structures: a way to cope with noise in data
• Allows examination of “higher-order” (beyond pairwise) interactions in data
My TDA “Origin Story”
• Available at:
https://twitter.com/masonporter/status/1200127512371556352
• The short short version
• In college (1994–1998), I saw some algebraic topology, but it looked very abstract and it seemed
more on the ‘purer’ (i.e., theoretical) end of mathematics
• As a postdoc (2002–2005) at Georgia Tech, I saw Konstantin Mischaikow using ideas from
computational topology on experimental data (e.g., in fluid mechanics)
• In late 2012, I noticed that Konstantin and I were both working on granular materials (him with
TDA, me with network analysis), so I contacted him and we arranged a visit.
• We did a project on TDA on spreading processes on networks (Taylor et al., Nature Commun., 2015)
• An Oxford doctoral student saw that paper and wanted to work on TDA with me.
• Another Oxford student saw a couple of previous papers of mine with TDA and wanted to work on that with
me.
• Rinse, wash, and repeat — and unintentionally now I do a lot of TDA stuff.
Topological Data Analysis
(in practice, usually persistent homology)
• Michelle Feng, Abigail Hickok, Yacoub H. Kureh, Mason A. Porter,
Chad M. Topaz, “Connecting the Dots: Discovering the “Shape” of
Data”, Frontiers for Young Minds, 2021
• Chad Topaz’s introductory article (for a general audience) in DSWeb
• https://dsweb.siam.org/The-Magazine/Article/topological-data-analysis
• Nina Otter, MAP, Ulrike Tillmann, Peter Grindrod, and Heather A.
Harrington [2017], “A Roadmap for the Computation of Persistent
Homology”, European Physical Journal — Data Science, Vol. 6: 17
SIAM News
Jan–Feb 2020 issue
Example: Point-Cloud Data
[Figure from: Michelle Feng, Abigail Hickok, Yacoub H. Kureh, MAP, & Chad M. Topaz,
“Connecting the Dots: Discovering the “Shape” of Data”, Frontiers for Young Minds, 2021]
Topological Data Analysis
of 2D Voting Data
Michelle Feng & MAP [2021], “Persistent Homology of Geospatial Data:
A Case Study with Voting”, SIAM Review, Vol. 63, No. 1: 67–99
Quantifying “Political Islands”
How do we detect red voters in a sea of blue?
(Or light blue voters in a sea of dark blue?)
Precinct-Level Voting Data
• How do people vote?
• How can we identify
geographical or temporal
patterns in voting?
• Our paper: geographical
• Can we automatically
characterize 2D
geographical outliers?
• Voting data (compiled by Los
Angeles Times) for all
California precincts in 2016
election
TDA and Voting Data
• Topological methods allow us to find and
identify holes, if we have a nice enough
space to search for those holes
• They also allow us to relate the presence
of holes to global structure
• Want to find “political islands”
• Red voters in a sea of blue, etc.
• Consider these as “holes” in a manifold in which
all precincts vote similarly
• Maybe we can also say something about
the structure of a county?
Barcodes
• A method of visualizing the PH of a
point cloud
• Each interval represents a feature in
dimension n
• Left endpoint = “birth” of a feature
• Right endpoint = “death” of a feature
• Visually, long features are “more
persistent”
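A minimal sketch of how one might read a barcode, using made-up intervals rather than output from any data set in these slides. The helper names (persistence, ascii_barcode) are hypothetical, and the text rendering only stands in for a real plot.

```python
def persistence(interval):
    """Length of a (birth, death) interval."""
    birth, death = interval
    return death - birth


def ascii_barcode(intervals, width=40):
    """Render finite (birth, death) intervals as text bars, longest first."""
    scale = width / max(death for _, death in intervals)
    lines = []
    for birth, death in sorted(intervals, key=persistence, reverse=True):
        start = round(birth * scale)
        end = max(round(death * scale), start + 1)  # draw at least one character
        lines.append(" " * start + "-" * (end - start) + f"  [{birth}, {death})")
    return lines


# Illustrative intervals only (not computed from data).
example = [(0.0, 1.0), (0.0, 0.2), (0.3, 0.9)]
for line in ascii_barcode(example):
    print(line)
```

The longest bar appears first, so the "more persistent" features are the ones at the top of the printout.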
Persistence Diagrams
• Another way of visualizing PH
• Put the filtration on both the
horizontal and vertical axes
• If a feature is born at b and dies at d,
we place a point at (b,d)
• The height above the diagonal
indicates persistence
• Pink circles: H0
• Blue squares: H1
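A short worked relation for the statement that height above the diagonal indicates persistence. The numerical values are illustrative only. For a feature that is born at b and dies at d (with d ≥ b), the vertical height of the point (b, d) above the diagonal is the persistence, and the perpendicular distance to the diagonal is the same quantity scaled by a constant:

```latex
\[
  \operatorname{pers}(b, d) = d - b ,
  \qquad
  \operatorname{dist}\bigl((b, d), \{y = x\}\bigr) = \frac{d - b}{\sqrt{2}} .
\]
% Illustrative values: b = 0.3 and d = 0.9 give pers = 0.6
% and a perpendicular distance of about 0.42.
```

Either measure gives the same ranking of features, so points far from the diagonal are the persistent ones and points near it are short-lived.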
Distance-Based Constructions:
Vietoris–Rips (VR) and Alpha Complexes
• VR complex [Jigglypuff]
• Surround each point in a point cloud with balls of radius ε
• For a set of n + 1 points, if the pairwise distance between any two points is
less than ε, build an n-simplex. The resulting simplicial complex is X_ε
• Alpha complex
• Compute the Delaunay triangulation of the point cloud
• X_ε is the simplicial complex formed by the set of edges and triangles
whose radii are at most ε
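A minimal sketch of the VR rule stated above: a set of points spans a simplex when every pairwise distance is less than the scale parameter. This is a brute-force enumeration for small examples, not one of the fast algorithms that TDA packages use. The function name and the example points are hypothetical.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, eps, max_dim=2):
    """Simplices (tuples of point indices) of the VR complex at scale eps."""
    n = len(points)
    # Pairs of points that are closer than eps (strict inequality, as on the slide).
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) < eps
    }
    simplices = [(i,) for i in range(n)]
    for size in range(2, max_dim + 2):
        for subset in combinations(range(n), size):
            if all(pair in close for pair in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


# Corners of a unit square (illustrative data).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(len(vietoris_rips(square, 1.1)))  # 8: four vertices and four sides
print(len(vietoris_rips(square, 1.5)))  # 14: diagonals and four triangles appear
```

At scale 1.1, the four sides form a loop with no filling triangles. At scale 1.5, the diagonals enter and the loop is filled, which is the birth and death of a one-dimensional feature.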
Summary: Distance-Based Constructions
• Advantages
• Easy to construct
• Fast algorithms, built into many packages
• Easy to interpret
• Embedded in Euclidean space, built-in parameter selection
• Disadvantages
• Which parameter values are appropriate ones?
• Persistence doesn’t always measure what we want it to
• Sensitive to rescaling
• Requires data in point-cloud form
Adjacency Construction
• Use network adjacency to define simplices
• If n + 1 nodes are all pairwise adjacent, define an n–simplex
• Given appropriate node data (or edge data), we construct a
filtration
• Note that filtration is not determined by distance
• In our data, filtration corresponds to strength of precinct
preference for a specific candidate
• For example, we can find light-blue precincts in a sea of dark blue
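A minimal sketch of the adjacency construction, under two assumptions that the slide does not state: each node carries a single number (a hypothetical "preference strength"), and a simplex enters the filtration when the last of its nodes enters (the maximum over its nodes). The function names, node labels, and values are all made up for illustration.

```python
from itertools import combinations


def clique_simplices(adjacency, max_dim=2):
    """Simplices spanned by sets of pairwise-adjacent nodes."""
    nodes = sorted(adjacency)
    simplices = [(v,) for v in nodes]
    for size in range(2, max_dim + 2):
        for subset in combinations(nodes, size):
            if all(v in adjacency[u] for u, v in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


def filtration_values(simplices, node_value):
    """Assumed rule: a simplex appears once all of its nodes have appeared."""
    return {s: max(node_value[v] for v in s) for s in simplices}


# Hypothetical precinct adjacency (symmetric) and node values.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
node_value = {"A": 0.1, "B": 0.4, "C": 0.2, "D": 0.7}

simplices = clique_simplices(adjacency)
values = filtration_values(simplices, node_value)
print(len(simplices))            # 9: four nodes, four edges, one triangle
print(values[("A", "B", "C")])   # 0.4
```

No distances appear anywhere in this sketch, which is the point of the construction: the filtration comes from node data, and the simplices come from adjacency.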
Summary: Adjacency Construction
• Advantages
• Does not depend on distance scaling
• Suitable for networks that aren’t easy or natural to embed in
Euclidean space
• Disadvantages
• Still only works on discrete data
• Sensitive to choices of construction of the underlying network
• It requires the nodes or edges to have associated data to construct
a meaningful filtration
Level-Set VR Construction
• Use data in surface form
• Take map of all precincts with similar voting patterns, and consider the outer
contour to be the 0 level set of some 3D object
• Evolve the surface outward with forces on a triangular grid according to the
level-set equation
• Take the collection of filled grid cells to be 2-simplices (and take grid lines to
be edges; and take points to be vertices)
• The filtration is given by the time steps of the evolution
Level Sets and PH
• The level-set method is a very fast method for front
propagation
• Persistence corresponds to the size of a feature: larger
holes take longer to fill
• We’re still “thickening” a point cloud (as in VR complexes),
except that we start with a manifold
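A rough sketch of why larger holes take longer to fill. It swaps in a much simpler technique than the one on the slides: instead of solving the level-set equation on a triangular grid, it grows a filled region by one cell per step on a square grid (4-neighbor dilation). The grid, the holes, and the function names are hypothetical.

```python
def grow_one_step(filled, rows, cols):
    """Add every grid cell that shares a side with a filled cell."""
    grown = set(filled)
    for r, c in filled:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                grown.add((rr, cc))
    return grown


def fill_times(filled, rows, cols):
    """Map each cell to the first step at which the growing region covers it."""
    if not filled:
        raise ValueError("the initial region must contain at least one cell")
    filled = set(filled)
    times = {cell: 0 for cell in filled}
    step = 0
    while len(filled) < rows * cols:
        step += 1
        grown = grow_one_step(filled, rows, cols)
        for cell in grown - filled:
            times[cell] = step
        filled = grown
    return times


# A filled 7 x 11 region with a 1 x 1 hole and a 3 x 3 hole (illustrative).
rows, cols = 7, 11
small_hole = {(3, 2)}
large_hole = {(r, c) for r in range(2, 5) for c in range(6, 9)}
region = {(r, c) for r in range(rows) for c in range(cols)} - small_hole - large_hole

times = fill_times(region, rows, cols)
print(max(times[cell] for cell in small_hole))  # 1
print(max(times[cell] for cell in large_hole))  # 2
```

If one uses the step count as the filtration parameter, the small hole disappears after one step and the large hole after two, so the larger hole is the more persistent feature.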
Summary: Level-Set VR Construction
• Advantages
• We can use the underlying shape of a map
• We maintain some notion of geographic size of holes via the mesh size
• Faster than previous VR method on large data sets
• Disadvantages
• Difficult to associate generators of holes with the original precincts on the map
• Potentially not well-suited to less granular data
• Captures geographic features (e.g., bodies of water) that may not be desirable
A key point from the MF + MAP paper
• Our new constructions allow us to distinguish short-persistence
features that occur only for a narrow range of distance scales (e.g.,
voting behaviors in densely populated cities) from short-persistence
noise by incorporating information about other spatial relationships
between precincts
• Note: “Short persistence” with respect to the usual filtrations that
don’t take geospatial nature of the problem into account
Persistent Homology on
Other Spatial Data
Michelle Feng & MAP [2020], Physical Review Research, Vol. 2, No. 3: 033426
Spiders Spinning Under the Influence
• The Marshall Space Flight Center studied the webs of spiders that were exposed to
various chemicals. (There is a NASA Tech Brief from 1995.)
• Earlier work, starting in 1948 by Swiss pharmacologist Peter N. Witt
• They concluded that more toxic chemicals resulted in more deformed spiderwebs
PH with Level-Set
Complexes on
Spiderwebs
Pink circles: H0
Blue squares: H1
Street Networks in Cities
Los Angeles
(gridlike)
(a) Aleppo and (b) Barcelona
(interrupted grids)
(a) Nanyang and (b) London
(not gridlike)
Analysis of Spatiotemporal Anomalies Using
Persistent Homology: Case Studies with
COVID-19 Data
Abigail Hickok, Deanna Needell, and MAP, arXiv:2107.09188
COVID-19 Data Sets
• COVID-19 per capita vaccination rates in the different zip
codes of New York City
• Fully vaccinated people on 23 February 2021
• COVID-19 case rates in neighborhoods in the city of Los
Angeles
• Running 14-day mean per capita case rate from 25 April
2020 through 25 April 2021
Constructing a Simplicial Complex
• (1) Construct a 2D simplicial complex
for each region.
• (2) Glue their boundaries together in a
way that respects the geographical
region boundaries.
A More Complicated Situation
Filtration Functions (example: sublevel filtration)
Per capita vaccination
rate (NYC) or running
14-day mean of per
capita case rate (LA)
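A minimal sketch of a sublevel filtration, restricted to connected components (H0) on a graph of regions rather than the full 2D simplicial complex in the slides. Each region has a value (for example, a rate), regions enter in order of increasing value, and when two components merge, the one that was born later dies. The graph, the values, and the function name are hypothetical.

```python
def sublevel_h0(values, edges):
    """(birth, death) pairs of connected components in the sublevel filtration."""
    neighbors = {v: set() for v in values}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    parent = {}  # union-find forest over the nodes that have entered so far
    birth = {}   # birth value of the component rooted at each root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    pairs = []
    for v in sorted(values, key=values.get):
        parent[v] = v
        birth[v] = values[v]
        for u in neighbors[v]:
            if u not in parent:
                continue  # u has a larger value and has not entered yet
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            # The younger component (larger birth value) dies at this merge.
            young, old = (ru, rv) if birth[ru] > birth[rv] else (rv, ru)
            if birth[young] < values[v]:  # skip zero-length intervals
                pairs.append((birth[young], values[v]))
            parent[young] = old

    # Components that never merge persist forever.
    roots = {find(v) for v in values}
    pairs.extend((birth[r], float("inf")) for r in roots)
    return pairs


# Five regions in a row with illustrative values.
values = {"a": 1, "b": 3, "c": 2, "d": 5, "e": 0}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
print(sorted(sublevel_h0(values, edges)))
```

In this example, each interval starts at a region whose value is lower than those of its neighbors, and it ends at the value where that region's component joins an older one.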
Case Study: Vaccination Rates in NYC
• Each point in the PD
corresponds to a zip code
(which we label by Borough)
that has a higher vaccination
rate than its neighboring zip
codes.
• We use “vineyards” to examine
the birth and death of features
over time.
• A continuous “stack” of PDs through
time. Points in the PD trace out curves
(“vines”) through time.
Conclusions
• Topological data analysis (TDA), such as by computing persistent homology
(PH), can give insights into large-scale structures in networks and other
complex systems
• Important: going beyond pairwise interactions in networks
• Persistent homology of spatial and spatiotemporal data
• By looking at 2D data, we can do systematic comparisons between different types of
constructions (topologically, fewer things can happen)
• Incorporate information from applications of interest into PH approaches
• Short-persistence features versus short-persistence noise: Need to think
carefully about how one constructs simplicial complexes
• Serendipity in research: You can end up writing a lot of papers on a topic
without intending to make it a big part of your research program
• Students and postdocs driving you into new research areas is the best thing ever™

Topological Data Analysis of Complex Spatial Systems

  • 1. C Topological Data Analysis of Complex Spatial Systems Mason A. Porter (@masonporter) Department of Mathematics, UCLA
  • 2. 22 of 45 (a) (b) (c) FIG. 8. Visualization of the three different spatial partitions of Peru’s provinces on a map. (a) Broad climate partition into coast (yellow), mountains (brown), and jungle (green); (b) detailed climate partition, in which we start with the broad partition and then further divide the coast and mountains into northern coast, central coast, southern coast, northern mountains, central mountains, and southern mountains; and (c) the administrative partition of Peru. We obtained province boundaries from [82] and plot the maps in MATLAB. We use the term “spatial partitions” to describe partitions that have high z-Rand scores in comparison to the manual climate or administrative partitions. For multilayer networks, we also compare the algorithmic partitions to partitions that contain a planted temporal change in community structure. For these comparisons, we group the multilayer nodes into ones that occur before or after a “critical” time tc (i.e., partitions into a “pre-tc” community and a “post-tc” community). We test the set t = 1,1+D,1+2D,...,1+D ⇥ bT D c 1 of times that we use to create the multilayer network, and we report the time with the highest z-Rand score as the critical time tc. We also test for pairs of critical times (yielding a partition into three communities) by examining all possible pairs of critical times tc1 and tc2 in the same manner. We use the term “temporal partitions” to describe algorithmic partitions of the disease-correlation networks that yield high z-Rand scores in these comparisons. 4.3.1 Modularity Maximization Using the NG Null Model. We first study the community structures of the 700 overlapping static networks formed by taking t = {1,2,...,700} and using D = 80. (There are 779 time points in total.) The community structures that we obtain from max- imizing modularity have a strong spatial organization, as suggested by the high z-Rand scores when compared to topographical partitions. As one can see in Fig. 
9(a), in which we plot the z-Rand scores versus the centers of the time windows that correspond to the static networks, the spatial organization is especially evident starting in the year 2000. In our subsequent figures, time points that we indicate on the axes also correspond to the centers of the associated time windows. As one can see from a plot of number of epidemic cases over time (see Fig. 7), this transition seems to occur near the time of the largest countrywide epidemic in the data, and the subse- quent period includes recurring yearly epidemics that were linked to climatic patterns in prior Spatial Systems • Space has a major influence on the structures of networks and other complex systems • Useful reference: Marc Barthelemy, Morphogenesis of Spatial Networks, 2018
  • 3. Slime molds and fungal networks • See, e.g., work by Mark Fricker and collaborators
  • 5. Spiders: Spinning Webs While Under the Influence
  • 6. The border between Belgium and the Netherlands at Baarle- Nassau/Baarle-Hertog
  • 7. Topological Data Analysis • Algorithmic methods to study high-dimensional data in a quantitative manner • Data from point clouds, networks, etc. • Examine the “shape” of data • Persistent homology • Mathematical formalism for studying topological invariants • Fast algorithms • Persistent structures: a way to cope with noise in data • Allows examination of “higher-order” (beyond pairwise) interactions in data
  • 8. My TDA “Origin Story” • Available at: https://twitter.com/masonporter/status/1200127512371556352 • The short short version • In college (1994–1998), I saw some algebraic topology, but it looked very abstract and it seemed more on the ‘purer’ (i.e., theoretical) end of mathematics • As a postdoc (2002–2005) at Georgia Tech, I saw Konstantin Mischaikow using ideas from computational topology on experimental data (e.g., in fluid mechanics) • In late 2012, I noticed that Konstantin and I were both working on granular materials (him with TDA, me with network analysis), so I contacted him and we arranged a visit. • We did a project on TDA on spreading processes on networks (Taylor et al., Nature Commun., 2015) • An Oxford doctoral student saw that paper and wanted to work on TDA with me. • Another Oxford student saw a couple of previous papers of mine with TDA and wanted to work on that with me. • Rinse, wash, and repeat — and unintentionally now I do a lot of TDA stuff.
  • 9. Topological Data Analysis (in practice, usually persistent homology) • Michelle Feng, Abigail Hickok, Yacoub H. Kureh, Mason A. Porter, Chad M. Topaz, “Connecting the Dots: Discovering the “Shape” of Data”, Frontiers for Young Minds, 2021 • Chad Topaz’s introductory article (for a general audience) in DSWeb • https://dsweb.siam.org/The-Magazine/Article/topological-data-analysis • Nina Otter, MAP, Ulrike Tillmann, Peter Grindrod, and Heather A. Harrington [2017], “A Roadmap for the Computation of Persistent Homology”, European Physical Journal — Data Science, Vol. 6: 17
  • 11. Example: Point-Cloud Data [Figure from: Michelle Feng, Abigail Hickock, Yacoub H. Kureh, MAP, & Chad M. Topaz, “Connecting the Dots: Discovering the “Shape” of Data”, Frontiers for Young Minds, 2021]
  • 12. C Topological Data Analysis of 2D Voting Data Michelle Feng & MAP [2021], “Persistent Homology of Geospatial Data: A Case Study with Voting”, SIAM Review, Vol. 63, No. 1: 67–99
  • 13. Quantifying “Political Islands” How do we detect red voters in a sea of blue? (Or light blue voters in a sea of dark blue?)
  • 14. Precinct-Level Voting Data • How do people vote? • How can we identify geographical or temporal patterns in voting? • Our paper: geographical • Can we automatically characterize 2D geographical outliers? • Voting data (compiled by Los Angeles Times) for all California precincts in 2016 election
  • 15. TDA and Voting Data • Topological methods allow us to find and identify holes, if we have a nice enough space to search for those holes • They also allow us to relate the presence of holes to global structure • Want to find “political islands” • Red voters in a sea of blue, etc. • Consider these as “holes” in a manifold in which all precincts vote similarly • Maybe we can also say something about the structure of a county?
  • 16. Barcodes • A method of visualizing the PH of a point cloud • Each interval represents a feature in dimension n • Left endpoint = “birth” of a feature • Right endpoint = “death” of a feature • Visually, long features are “more persistent”
  • 17. Persistence Diagrams • Another way of visualizing PH • Put the filtration on both the horizontal and vertical axes • If a feature is born at b and dies at d, we place a point at (b,d) • The height above the diagonal indicates persistence • Pink circles: H0 • Blue squares: H1
  • 18. Distance-Based Constructions: Vietoris–Rips (VR) and Alpha Complexes •VR complex [Jigglypuff] • Surround each point in a point cloud with balls of radius ! • For a set of n + 1 points, if the pairwise distance between any two points is less than !, build an n–simplex. The resulting simplicial complex is X! •Alpha complex • Compute the Delaunay triangulation of the point cloud • X! is the simplicial complex formed by the set of edges and triangles whose radii are at most !
  • 20. Summary: Distance-Based Constructions •Advantages • Easy to construct • Fast algorithms, built into many packages • Easy to interpret • Embedded in Euclidean space, built-in parameter selection •Disadvantages • Which parameter values are appropriate ones? • Persistence doesn’t always measure what we want it to • Sensitive to rescaling • Requires data in point-cloud form
  • 21. Adjacency Construction • Use network adjacency to define simplices • If n + 1 nodes are all pairwise adjacent, define an n–simplex • Given appropriate node data (or edge data), we construct a filtration • Note that filtration is not determined by distance • In our data, filtration corresponds to strength of precinct preference for a specific candidate • For example, we can find light-blue precincts in a sea of dark blue
  • 22. Summary: Adjacency Construction • Advantages • Does not depend on distance scaling • Suitable for networks that aren’t easy or natural to embed in Euclidean space • Disadvantages • Still only works on discrete data • Sensitive to choices of construction of the underlying network • It requires the nodes or edges to have associated data to construct a meaningful filtration
  • 23. Level-Set VR Construction • Use data in surface form • Take map of all precincts with similar voting patterns, and consider the outer contour to be the 0 level set of some 3D object • Evolve the surface outward with forces on a triangular grid according to the level-set equation • Take the collection of filled grid cells to be 2-simplices (and take grid lines to be edges; and take points to be vertices) • The filtration is given by the time steps of the evolution
  • 24. Level Sets and PH • The level-set method is a very fast method for front propagation • Persistence corresponds to the size of a feature: larger holes take longer to fill • We’re still “thickening” a point cloud (as in VR complexes), except that we start with a manifold
  • 25. Summary: Level-Set VR Construction • Advantages • We can use the underlying shape of a map • We maintain some notion of geographic size of holes via the mesh size • Faster than previous VR method on large data sets • Disadvantages • Difficult to associate generators of holes with the original precincts on the map • Potentially not well-suited to less granular data • Captures geographic features (e.g., bodies of water) that may not be desirable
  • 27. A key point from the MF + MAP paper • Our new constructions allow us to distinguish short-persistence features that occur only for a narrow range of distance scales (e.g., voting behaviors in densely populated cities) from short-persistence noise by incorporating information about other spatial relationships between precincts • Note: “Short persistence” is with respect to the usual filtrations, which don’t take the geospatial nature of the problem into account
  • 28. Persistent Homology on Other Spatial Data • Michelle Feng & MAP [2020], Physical Review Research, Vol. 2, No. 3: 033426
  • 29. Spiders Spinning Under the Influence • The Marshall Space Flight Center studied the webs of spiders that were exposed to various chemicals. (There is a NASA Tech Brief from 1995.) • Earlier work, starting in 1948 by Swiss pharmacologist Peter N. Witt • They concluded that more toxic chemicals resulted in more deformed spiderwebs
  • 30. PH with Level-Set Complexes on Spiderwebs • Pink circles: H0 • Blue squares: H1
  • 33. Los Angeles (gridlike) • (a) Aleppo and (b) Barcelona (interrupted grids) • (a) Nanyang and (b) London (not gridlike)
  • 34. Analysis of Spatiotemporal Anomalies Using Persistent Homology: Case Studies with COVID-19 Data • Abigail Hickok, Deanna Needell, and MAP, arXiv:2107.09188
  • 35. COVID-19 Data Sets • COVID-19 per capita vaccination rates in the different zip codes of New York City • Fully vaccinated people on 23 February 2021 • COVID-19 case rates in neighborhoods in the city of Los Angeles • Running 14-day mean per capita case rate from 25 April 2020 through 25 April 2021
  • 36. Constructing a Simplicial Complex • (1) Construct a 2D simplicial complex for each region. • (2) Glue their boundaries together in a way that respects the geographical region boundaries.
  • 37. A More Complicated Situation
  • 38. Filtration Functions (example: sublevel filtration) • Per capita vaccination rate (NYC) or running 14-day mean of per capita case rate (LA)
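A minimal sketch of a sublevel filtration, restricted to 0-dimensional persistence (connected components) on a region-adjacency graph. Regions enter in increasing order of their function value, and when two components merge, the younger one dies (the elder rule). The function name `sublevel_h0` and the five-region chain are hypothetical. This is not the simplicial complex from the cited paper, and it does not compute H1.

```python
def sublevel_h0(values, edges):
    """Return sorted (birth, death) pairs of components under the sublevel filtration."""
    parent = {}  # union-find forest over the regions added so far
    birth = {}   # birth value of the component rooted at each node

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    neighbors = {v: [] for v in values}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    pairs = []
    for v in sorted(values, key=values.get):  # add regions by increasing value
        parent[v] = v
        birth[v] = values[v]
        for w in neighbors[v]:
            if w not in parent:
                continue  # neighbor has not entered the filtration yet
            rv, rw = find(v), find(w)
            if rv == rw:
                continue
            # Elder rule: the component with the larger birth value dies.
            young, old = (rv, rw) if birth[rv] > birth[rw] else (rw, rv)
            if birth[young] < values[v]:  # skip zero-persistence pairs
                pairs.append((birth[young], values[v]))
            parent[young] = old
    # Components that never merge persist forever.
    pairs.extend((birth[r], float("inf")) for r in {find(v) for v in parent})
    return sorted(pairs)


# Hypothetical chain of five regions with made-up per capita rates.
rates = {"A": 0.1, "B": 0.5, "C": 0.2, "D": 0.6, "E": 0.3}
borders = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
diagram = sublevel_h0(rates, borders)
print(diagram)
```

In this sketch, each finite pair is born at a local minimum of the function and dies at the value where its component joins an older one.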
  • 39. Case Study: Vaccination Rates in NYC • Each point in the PD corresponds to a zip code (which we label by borough) that has a higher vaccination rate than its neighboring zip codes.
  • 40. • We use “vineyards” to examine the birth and death of features over time. • A vineyard is a continuous “stack” of PDs through time. Points in the PD trace out curves (“vines”) through time.
  • 42. Conclusions • Topological data analysis (TDA), such as by computing persistent homology (PH), can give insights into large-scale structures in networks and other complex systems • Important: going beyond pairwise interactions in networks • Persistent homology of spatial and spatiotemporal data • By looking at 2D data, we can do systematic comparisons between different types of constructions (topologically, fewer things can happen) • Incorporate information from applications of interest into PH approaches • Short-persistence features versus short-persistence noise: Need to think carefully about how one constructs simplicial complexes • Serendipity in research: You can end up writing a lot of papers on a topic without intending to make it a big part of your research program • Students and postdocs driving you into new research areas is the best thing ever™