Dynamic
crowding
maps
Carpita
Metulini
The project
Data
Methodology
Application
Conclusions
References
Dynamic crowding maps with mobile phone
big data
Maurizio Carpita1, Rodolfo Metulini2
1. Data Methods and Systems Statistical Laboratory - Department of
Economics and Management, University of Brescia
2. Department of Economics and Statistics - University of Salerno
Third international conference on Data Science & Social
Research
December 10-11, 2020
The project
• Ongoing project (until 06/2022):
This talk describes the work conducted together with Prof. Roberto Ranzi and Dr. Matteo Balistrocchi (Department of Civil, Environmental, Architectural Engineering and Mathematics, UNIBS) in the context of the MoSoRe project.
Regione Lombardia, Call HUB Research & Innovation: Infrastrutture e servizi per la Mobilità Sostenibile e Resiliente - MoSoRe@Unibs, ID 1180965 - POR FESR 2014-2020
• Scientific output:
1 Metulini, R., Carpita, M. (2020), A Spatio-Temporal Indicator for
City Users based on Mobile Phone Signals and Administrative Data -
Social Indicators Research, 1-21. DOI: 10.1007/s11205-020-02355-2
2 Balistrocchi, M., Metulini, R., Carpita, M., and Ranzi, R.: Dynamic
maps of people exposure to floods based on mobile phone data.
Natural Hazards and Earth System Sciences, 2020, in press. DOI:
10.5194/nhess-2020-201.
The context of application
• Floods are natural phenomena whose hazards afflict nearly 20 million people worldwide (Kellens et al., 2013), posing a serious challenge to the protection of human lives.
• Urbanization drives dramatic increases in people's exposure and vulnerability to floods, since most recent urban developments are located in flood-prone areas.
• Effective emergency management plans are intended to provide communities with early warnings and reliable real-time information.
• We provide a detailed and reliable picture of the real-time spatiotemporal variability of the flood risk by proxying it with dynamic crowding maps from mobile phone data for reference groups of days.
Data
• Erlang mobile phone measures (Erlang, 1909): average number of mobile phone users (MPU) bearing a SIM connected to the network, recorded at constant time steps with reference to a georeferenced grid of square cells.
Available for Telecom Italia Mobile (TIM) for the period from 04/2014 to 08/2016 thanks to a collaboration with the Statistical Office of the Comune di Brescia.
• Census data from ISTAT, reporting the residential population (as of 01/01/2016) by age, for each sezione di censimento (SC, census section).
The set-up
• To detect MPU spatiotemporal variability we define the subject of
our analysis: the daily density profiles (DDP).
• Let $e_{it}$ be the number of MPU in the $i$-th grid cell in a generic time interval $t$,
• let $I_r = \{i_1, ..., i_m\}$ be the set of grid cells in the region $r$ of interest,
• let $T_d = \{t_1, ..., t_o\}$ be the set of time intervals in a day $d$.
• $DDP_{rd}$ can be defined as the vector of the sums of MPU (one sum for each considered time instant) in region $r$ and day $d$ (length = $o$):
$$DDP_{rd} = \left( \sum_{l=1}^{m} e_{i_l,t_1},\ \sum_{l=1}^{m} e_{i_l,t_2},\ \ldots,\ \sum_{l=1}^{m} e_{i_l,t_o} \right)$$
• Goal: classifying the occurrences in the time series of $DDP_{rd}$ related to the set $d = \{d_1, ..., d_n\}$ of $n$ analyzed days. In other words, clustering similar $DDP_{rd}$.
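The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the array layout (days, time intervals, cells), the function name `daily_density_profiles` and the toy numbers are hypothetical and do not come from the project code.

```python
import numpy as np

def daily_density_profiles(e, region_cells):
    """Sum MPU counts over the cells of a region, for each day and time interval.

    e            : array of shape (n_days, o, n_cells) with the counts e_it
                   (this layout is an assumption made for the sketch)
    region_cells : indices of the cells belonging to I_r
    Returns an array of shape (n_days, o); row d is DDP_rd.
    """
    e = np.asarray(e, dtype=float)
    return e[:, :, region_cells].sum(axis=2)

# Toy data: 2 days, 3 time intervals, 4 cells (values are made up)
e = np.arange(24).reshape(2, 3, 4)
ddp = daily_density_profiles(e, region_cells=[0, 2])
print(ddp)
```

Each row of `ddp` is one daily density profile, so the rows are the objects that the clustering steps would group.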
Issues
• Our dataset amounts to n observations (days) and p = m ∗ o features per day (cells ∗ quarters).
Consider one year of data (n = 365): o = 96 (quarters per day), m = 400 (grid cells of the sample area), hence p = 400 ∗ 96 = 38,400.
• The number of features is larger than the number of observations, so we are in a high-dimensional data setup (Donoho, 2000).
• Traditional techniques (Arabie and De Soete, 1996) may not return robust results with high-dimensional data, for example because of the curse of dimensionality (Keogh and Mueen, 2017).
• Bouveyron et al. (2007) addressed this issue with regard to clustering. However, as suggested by Jović et al. (2015), a suitable solution is a preliminary data reduction strategy.
• The Histogram of Oriented Gradients (HOG) approach is used for data reduction.
The Strategy ...
( ... to take into account days’ similarity)
Step | Type | Aim | Method | Features
1 | data reduction + clustering | find similar raster images | HOG + k-means clustering | HOG features
2 | clustering | find similar functional curves | functional model-based clustering | DDP features
3 | population assessment | estimate city users | spatial match of MPU and census data | DDP features + population
4 | visualization | find reference daily profiles | functional box plots | DDP features
The Strategy
Step | Type | Aim | Method | Features
1 | data reduction + clustering | find similar raster images | HOG + k-means clustering | HOG features
HOG data reduction
• For a given $t$, let $i_t = \{e_{i_1,t}, e_{i_2,t}, ..., e_{i_m,t}\}$ be the MPU vector of region $r$ at time instant $t$ (dimension $m$).
• Aim: to reduce $i_t$ to a smaller vector of values $\kappa_{it}$ (of dimension smaller than $m$) that retains the relevant information contained in $i_t$.
• To do so, the set $i_t$, separately for each $t$, undergoes a histogram of oriented gradients (HOG) feature extraction (Dalal and Triggs, 2005).
• More precisely, the scaled vector $z_{it} = \{e_{i,t} / \max_{i \in I_r}(e_{i,t})\}$, $\forall i \in I_r$, undergoes HOG.
• HOG method:
1 split the $m$ cells of the grid into $S$ smaller grids $G_1, ..., G_S$ ($G_i \cap G_j = \emptyset$, $\forall i = 1, ..., S$ and $\forall j = 1, ..., S$ with $i \neq j$), where $\sqrt{S}$ is a parameter to be chosen,
2 for each grid $G_i$, the gradient direction and magnitude matrices are computed (Dalal and Triggs, 2005),
3 from the two gradient matrices, the histogram of gradients is determined, with $k$ equal bins ($k$ being a parameter to be chosen).
• $\kappa_{it}$ is stacked over the subscript $t$ to derive (for region $r$ and day $d$) the vector of features $\kappa_d$ (dimension $S * k * o$), $d = 1, ..., n$.
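For reference, the quantities in steps 2 and 3 are commonly written as follows. This is a sketch of the usual HOG definitions, not a formula taken from the slides: the symbols $g_x$, $g_y$, $M$, $\theta$, $B_b$ and the cell position $(u,v)$ are introduced here for illustration, and the use of unsigned directions in $[0, \pi)$ is an assumption.

$$M(u,v) = \sqrt{g_x(u,v)^2 + g_y(u,v)^2}, \qquad \theta(u,v) = \operatorname{atan2}\big(g_y(u,v),\, g_x(u,v)\big) \bmod \pi$$

$$\kappa_{G_s,t}(b) = \sum_{(u,v) \in G_s} M(u,v)\, \mathbf{1}\{\theta(u,v) \in B_b\}, \qquad B_b = \left[\tfrac{(b-1)\pi}{k}, \tfrac{b\pi}{k}\right), \quad b = 1, ..., k$$

Here $g_x$ and $g_y$ are the horizontal and vertical differences of the scaled counts within sub-grid $G_s$. Each sub-grid contributes $k$ histogram values, so one time instant yields $S * k$ features and one day yields $S * k * o$, which matches the dimension of $\kappa_d$ stated above.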
HOG data reduction ... explained
From n x n raster data to $X_t$, a matrix representing the number of people in each cell at time $t$:
$$X_t = \begin{pmatrix} 93 & 124 & 77 & \cdots & \cdots \\ 217 & 55 & 94 & \cdots & \cdots \\ 24 & 77 & 109 & \cdots & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots \end{pmatrix}$$
1 standardize MPU data;
2 split the matrix into sub-matrices;
3 for each sub-matrix, compute the matrices of gradients (using the Sobel operator);
4 assign each value of the direction matrix to one of the k bins of the histogram, using its magnitude as weight, to produce the vector of features;
5 stack into a vector the features of all quarters of the day.
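The five steps above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the edge-replicated borders, the unsigned directions in [0, π) and the default values `s_side=3`, `k=4` are all assumptions made for the example.

```python
import numpy as np

def sobel_gradients(x):
    """Horizontal and vertical Sobel responses (borders replicated: an assumption)."""
    p = np.pad(x, 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return gx, gy

def hog_features(x, s_side=3, k=4):
    """HOG vector of length s_side**2 * k for one raster x (one time interval)."""
    x = np.asarray(x, dtype=float)
    peak = x.max()
    z = x / peak if peak > 0 else x                      # step 1: scale by the maximum
    feats = []
    for rows in np.array_split(np.arange(z.shape[0]), s_side):      # step 2: sub-matrices
        for cols in np.array_split(np.arange(z.shape[1]), s_side):
            sub = z[np.ix_(rows, cols)]
            gx, gy = sobel_gradients(sub)                # step 3: gradients
            magnitude = np.hypot(gx, gy)
            direction = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned direction
            hist, _ = np.histogram(direction, bins=k, range=(0.0, np.pi),
                                   weights=magnitude)    # step 4: weighted histogram
            feats.append(hist)
    return np.concatenate(feats)

def day_features(rasters, s_side=3, k=4):
    """Step 5: stack the HOG vectors of all time intervals of a day."""
    return np.concatenate([hog_features(x, s_side, k) for x in rasters])

# Toy raster with made-up counts on a 20 x 20 grid
rng = np.random.default_rng(0)
x = rng.integers(0, 200, size=(20, 20))
print(hog_features(x).shape)
```

With 96 quarters per day, `day_features` would return a vector of length `s_side**2 * k * 96` for each day.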
First step clustering
• Days are clustered in terms of how MPU are distributed over region r according to the cell index i, i.e., according to similarity in the raster image.
• The objects to be clustered are the n days, and κd contains the S ∗ k ∗ o (with S ∗ k ∗ o < m ∗ o) features for day d, ∀d = d1, ..., dn.
• The k-means clustering method (Hartigan and Wong, 1979) is adopted (after having tested against the curse of dimensionality).
• According to the Hartigan and Wong criterion, the number of clusters H is chosen by minimizing the ratio between the total within sum of squares and the total sum of squares for different values of H.
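The ratio used to choose H can be computed as in the sketch below. Two assumptions are worth flagging: plain Lloyd iterations are used here as a stand-in for the Hartigan and Wong algorithm, and the data are synthetic. Since the ratio typically keeps shrinking as H grows, in practice one would likely inspect how it evolves over a range of candidate values rather than take its raw minimum; that reading is also an assumption.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest centroid for every row of X."""
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def kmeans(X, n_clusters, n_iter=100, seed=0):
    """Lloyd-style k-means (stand-in for Hartigan-Wong); returns labels, centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = centers.copy()
        for h in range(n_clusters):
            members = X[labels == h]
            if len(members):                 # keep the old centroid if a cluster empties
                new_centers[h] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

def within_total_ratio(X, labels, centers):
    """Total within-cluster sum of squares divided by the total sum of squares."""
    X = np.asarray(X, dtype=float)
    within = ((X - centers[labels]) ** 2).sum()
    total = ((X - X.mean(axis=0)) ** 2).sum()
    return within / total

# Synthetic feature matrix: 40 "days" in two well-separated groups
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(100.0, 1.0, (20, 5))])
ratios = {H: within_total_ratio(X, *kmeans(X, H)) for H in range(1, 5)}
print(ratios)
```

In the application, the rows of `X` would be the day-level HOG vectors κd.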
The Strategy
Step | Type | Aim | Method | Features
2 | clustering | find similar functional curves | functional model-based clustering | DDP features
Second step clustering
• Aim: to consider similarity in the functional form of the $DDP_{rd}$, viewed as functional curves.
• We consider $DDP_{rd}$ as the collection of functional observations $x_{rd}(T_d)$, $T_d \in (t_1, ..., t_o)$ (length $o$; e.g., the value at $t_1$ is $\sum_{l=1}^{m} e_{i_l,t_1}$), with $d$ varying in $d = \{d_1, ..., d_n\}$.
• We adopt a model-based functional data clustering method (MB-FAC, Bouveyron et al., 2015), which provides estimated curves with specific parameters, to group the days $d$ (the objects to cluster) in terms of the $o$ values of $DDP_{rd}$ (the clustering variables).
• We adopt the following path:
1 functional data outlier detection by likelihood ratio test (LRT) to remove anomalous $DDP_{rd}$, as proposed by Febrero-Bande et al. (2008);
2 the clustering method of Bouveyron et al. (2015), using the funFEM package in R.
• The method suits high-dimensional data: it employs a subspace clustering criterion (Agrawal et al., 1998), i.e., it considers just the minimum number of variables needed for grouping objects.
The Strategy
Step | Type | Aim | Method | Features
3 | population assessment | estimate city users | spatial match of MPU and census data | DDP features + population
Population assessment - I
• Aim: to estimate the total number of people (city users), given that MPU data are available for just one mobile phone company.
• We compute an estimate of the market share of the provider company, to correct the DDPrd,
• by comparing the number of residents from archives with the number of TIM users in a residential area in late evening hours (assuming that, in late evening hours, residential sezioni di censimento (SC) are populated only by residents).
• The MPU grid is made of square cells while SCs are irregular polygons → the number of TIM users belonging to each SC needs to be retrieved by intersecting the two sources.
• The portion of each cell belonging to the SC polygon was calculated, in order to count how many TIM users are present in each polygon, using the function extract in the raster package in R.
Population assessment - II
• Let $Cell_j$, $\forall j = 1, 2, ..., J_{SC}$, be the cells of the sample area; the ratio
$$A_j = \frac{\mathrm{area}(SC \cap Cell_j)}{\mathrm{area}(Cell_j)}$$
represents how much of $Cell_j$ is included in the chosen SC;
• let $TUC_j$ be the MPU in $Cell_j$; the estimate of the number of MPU in the SC is
$$ETU_{SC} = \sum_{j} TUC_j * A_j .$$
• The estimated company market share in the SC is given by
$$ETMS_{SC} = \frac{ETU_{SC}}{P_{SC}}$$
where $P_{SC}$ is the number of residents in that SC (children and elderly people excluded).
• The median ($me(.)$) of $ETMS_{SC}$ can be used as a proxy for the company market share at city level;
• the city users estimate is given by
$$\widehat{DDP}_{rd} = \frac{DDP_{rd}}{me(ETMS_{SC})}$$
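The chain of formulas above can be illustrated with a short numerical sketch. All numbers are made up and the function names are hypothetical; the overlap shares A_j are taken as given here, whereas in the project they come from intersecting the grid with the SC polygons.

```python
import numpy as np

def estimated_users_in_sc(tuc, overlap_share):
    """ETU_SC = sum over j of TUC_j * A_j, for one SC."""
    return float(np.dot(tuc, overlap_share))

def market_share(etu_sc, residents_sc):
    """ETMS_SC = ETU_SC / P_SC."""
    return etu_sc / residents_sc

def city_users(ddp, shares):
    """Corrected profile: DDP_rd divided by the median SC-level market share."""
    return np.asarray(ddp, dtype=float) / np.median(shares)

# Toy example (hypothetical values)
tuc = [40.0, 10.0, 20.0]      # TIM users in three cells touching the SC
a = [1.0, 0.5, 0.25]          # share of each cell lying inside the SC
etu = estimated_users_in_sc(tuc, a)                # 40 + 5 + 5 = 50
share = market_share(etu, residents_sc=200.0)      # 50 / 200 = 0.25
estimate = city_users([100.0, 150.0], shares=[0.2, share, 0.3])
print(etu, share, estimate)
```

Using the median over SCs rather than a single SC makes the city-level share less sensitive to sections where the late-evening assumption fails.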
The Strategy
Step | Type | Aim | Method | Features
4 | visualization | find reference daily profiles | functional box plots | DDP features
Visualization
• Consider $DDP_{rd}$ to be a functional curve $x_{rd}(T_d)$ displaying, on the y-axis, the sum of MPU in region $r$ and day $d$ against, on the x-axis, the time instants $T_d \in (t_1, ..., t_o)$.
• Functional box plots (FBP, Sun and Genton, 2011) can be used to display the profile for each final cluster.
• For cluster $h$, let $d_h = \{d_{1h}, ..., d_{nh}\}$ be the group of days belonging to cluster $h$, and let $\widehat{DDP}_{rd,h} = [\widehat{DDP}_{rd_1,h}, ..., \widehat{DDP}_{rd_n,h}]$ be the matrix of dimension $o * n_h$ with one $\widehat{DDP}_{rd}$ of cluster $h$ in each column.
• By considering each $\widehat{DDP}_{rd}$ a curve, the FBP representing the profile plot of the total number of people (that we call city users) in different hours (with DB), for cluster $h$, is computed using the matrix $\widehat{DDP}_{rd,h}$.
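The ingredients of a functional box plot can be sketched as follows. The slide does not spell out which depth is used, so this example assumes the modified band depth computed over pairs of curves; it returns only the median curve and the envelope of the 50% deepest curves, and leaves out whiskers and outlier flagging. Function names and data are hypothetical.

```python
import numpy as np
from itertools import combinations

def modified_band_depth(curves):
    """Modified band depth (bands from pairs of curves).

    curves : array of shape (o, n), one curve per column, as in the matrix above.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[1]
    depth = np.zeros(n)
    pairs = list(combinations(range(n), 2))
    for j, k in pairs:
        lower = np.minimum(curves[:, j], curves[:, k])[:, None]
        upper = np.maximum(curves[:, j], curves[:, k])[:, None]
        inside = (curves >= lower) & (curves <= upper)   # shape (o, n)
        depth += inside.mean(axis=0)     # share of time spent inside this band
    return depth / len(pairs)

def functional_boxplot_summary(curves):
    """Median curve and envelope of the 50% deepest curves."""
    curves = np.asarray(curves, dtype=float)
    depth = modified_band_depth(curves)
    order = np.argsort(depth)[::-1]                      # deepest curve first
    central = order[: int(np.ceil(curves.shape[1] / 2))]
    return {
        "median": curves[:, order[0]],
        "lower": curves[:, central].min(axis=1),
        "upper": curves[:, central].max(axis=1),
    }

# Toy cluster: five parallel daily profiles observed at four time instants
base = np.array([1.0, 3.0, 2.0, 5.0])
curves = np.column_stack([base + c for c in range(5)])
summary = functional_boxplot_summary(curves)
print(summary["median"])
```

Applied to the matrix of one cluster, the median curve would serve as that cluster's reference daily profile.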
Case study description
• WGS 84 UTM 32 N coordinates: 5,040,920–5,049,980N, 585,970–592,970E (area about 64 km²) centred on the Mandolossa-Gandovere network (grid of 20x20 150 m² cells)
• MPU data at 15-minute intervals (quarters) over the period July 1st, 2015 - August 10th, 2016.
• After imputing missing quarters and removing the full day when too many quarters were missing, we ended up with 360 valid days.
• HOG parameters: √S = 3, h = 4.
• The interest is in the residential and industrial parts of 4 specific areas (Moie di Sotto, Villaggio Badia and Fantasina, southern Gandovere canal, Roncadelle)
First step clustering: results
Figure: Spine-plots representing the first-step clustering of days along (a)
months and (b) days of the week (green: all days mostly occurring in July,
August and September; blue: working days mostly occurring from February to
June; red: working days mostly occurring from October to January; yellow:
weekends mostly occurring from October to June)
Representation: results
Figure: Functional box plots of exposed people (“city users”) inside residential
areas: (a) Moie di Sotto, (b) Villaggio Badia and Fantasina, (c) southern
Gandovere canal, (d) Roncadelle. Cluster 1 (July, August, September, C1),
Cluster 2 (working-days from October to June, C2), Cluster 3 (week-ends from
October to June, C3)
Representation: results - II
Figure: Functional box plots of exposed people (“city users”) inside
industrial-commercial settlements: (a) Moie di Sotto, (b) Villaggio Badia and
Fantasina, (c) southern Gandovere canal, (d) Roncadelle. Cluster 1 (July,
August, September, C1), Cluster 2 (working-days from October to June, C2),
Cluster 3 (week-ends from October to June, C3)
Discussion
• The combination of:
1 the high spatial resolution (150 m²) and short time step (15 min) of the data, and
2 the application of the proposed statistical strategy, designed for high-dimensional data,
permits:
1 reliable population assessments even for small areas, and
2 a precise evaluation of the temporal dynamics of city users in the sample area.
• Functional box plot results are meaningful:
1 working days and weekends show different temporal dynamics when they belong to working months (October to June),
2 daily dynamics in the summer months (July, August and September, holidays in Italy) must be regarded as different from the others,
3 working days and weekends feature more similar daily density profiles during such months.
References - I
1 Agrawal, R., Gehrke, J., Gunopulos, D., and Raghavan, P.: Automatic
subspace clustering of high dimensional data for data mining applications,
in: Proceedings of the 1998 ACM SIGMOD International Conference on
Management of Data, ACM Press, 94–105, doi:10.1145/276304.276314,
1998
2 Arabie, P., and De Soete, G.: Clustering and classification, World Scientific,
doi:10.1142/1930, 1996
3 Bouveyron, C., Girard, S., and Schmid, C.: High-dimensional data
clustering, Comput. Stat. Data An., 52(1), 502–519,
doi:10.1016/j.csda.2007.02.009, 2007
4 Bouveyron, C., Côme, E., & Jacques, J. (2015). The discriminative
functional mixture model for a comparative analysis of bike sharing
systems. The Annals of Applied Statistics, 9(4), 1726-1760.
5 Dalal, N., and Triggs, B.: Histograms of oriented gradients for human
detection, in: Proceedings of the International Conference on Computer
Vision & Pattern Recognition (CVPR ’05), doi:10.1109/CVPR.2005.177,
2005.
6 Donoho, D. L.: High-dimensional data analysis: The curses and blessings of
dimensionality, AMS Math Challenges Lecture, 1–32, 2000
7 Erlang, A. K. (1909). The theory of probabilities and telephone
conversations. Nyt. Tidsskr. Mat. Ser. B, 20, 33-39.
References - II
1 Febrero-Bande, M., Galeano, P., & Gonzalez-Manteiga, W. (2008). Outlier
detection in functional data by depth measures, with application to identify
abnormal NOx levels. Environmetrics: The official journal of the
International Environmetrics Society, 19(4), 331-345.
2 Hartigan, J. A., and Wong, M. A.: Algorithm AS 136: A k-means
clustering algorithm, J. R. Stat. Soc., Series C (Applied Statistics), 28(1),
100–108, doi:10.2307/2346830, 1979.
3 Kellens, W., Terpstra, T., and De Maeyer, P.: Perception and
communication of flood risks: A systematic review of empirical research,
Risk Anal., 33/1, 24–49, doi:10.1111/j.1539-6924.2012.01844.x, 2013
4 Keogh, E., and Mueen, A.: Curse of dimensionality, in: Sammut, C., and
Webb, G. I. (eds.), Encyclopedia of Machine Learning and Data Mining,
314–315, Springer, doi:10.1007/978-1-4899-7687-1-192, 2017.
5 Jović, A., Brkić, K., and Bogunović, N.: A review of feature selection methods
with applications, in: Proceedings of the 38th International Convention on
Information and Communication Technology, Electronics and
Microelectronics (MIPRO), 1200–1205, doi:10.1109/MIPRO.2015.7160458,
2015
6 Sun, Y., & Genton, M. G. (2011). Functional boxplots. Journal of
Computational and Graphical Statistics, 20(2), 316-334.
Supplemental
Figure: Snapshots of a dynamic map showing the spatiotemporal distribution of
mobile phone users (MPU) observed at 12 pm on 17/11/2015 (Wednesday); base
map: Lombardy Regional Technical Map CTR 1:5000 provided by Lombardy
Region (www.geoportale.regione.lombardia.it).
Figure: Example of dissimilarity among raster images.
Figure: Example of similarity among raster images.
Figure: Example of similarity and dissimilarity in the functional form. Curves
with the same color are similar; curves with different colors are
dissimilar.
Figure: Example of weighting scheme to assign the number of TIM users to SC
110, located at latitude 45.544 N and longitude 10.217 E
Carpita metulini 111220_dssr_bari_version2

  • 1. Dynamic crowding maps Carpita Metulini The project Data Methodology Application Conclusions References Dynamic crowding maps with mobile phone big data Maurizio Carpita1, Rodolfo Metulini2 1. Data Methods and Systems Statistical Laboratory - Department of Economics and Management, University of Brescia 2. Department of Economics and Statistics - University of Salerno Third international conference on Data Science & Social Research December 10-11, 2020
  • 2. Dynamic crowding maps Carpita Metulini The project Data Methodology Application Conclusions References The project • Ongoing project (’till 06/2022): This talk describes the works conducted together with Prof. Roberto Ranzi and Dr. Matteo Balistrocchi (Department of Civil, Environmental, Architectural Engineering and Mathematics, UNIBS) in the context of MoSoRe project Regione Lombardia, Call HUB Research & Innovation: Infrastrutture e servizi per la Mobilità Sostenibile e Resiliente - MoSoRe@UnibsID 1180965 - POR FESR 2014-2020 • Scientific output: 1 Metulini, R., Carpita, M., (2020), A Spatio-Temporal Indicator for City Users based on Mobile Phone Signals and Administrative Data - Social Indicator Research, 1-21. DOI: 10.1007/s11205-020-02355-2 2 Balistrocchi, M., Metulini, R., Carpita, M., and Ranzi, R.: Dynamic maps of people exposure to floods based on mobile phone data. Natural Hazards and Earth System Sciences, 2020, in press. DOI: 10.5194/nhess-2020-201.
  • 3. The context of application
      • Floods are natural phenomena whose hazards afflict nearly 20 million people worldwide (Kellens et al., 2013), posing a serious challenge to the protection of human lives.
      • Urbanization dramatically increases people's exposure and vulnerability to floods, since most recent urban developments are located in flood-prone areas.
      • Effective emergency management plans are intended to provide communities with early warnings and reliable real-time information.
      • We provide a detailed and reliable picture of the real-time spatiotemporal variability of flood risk by proxying it with dynamic crowding maps built from mobile phone data for reference groups of days.
  • 4. Data
      • Erlang mobile phone measures (Erlang, 1909): the average number of mobile phone users (MPU) bearing a SIM connected to the network, recorded at constant time steps with reference to a georeferenced grid of square cells. Available for Telecom Italia Mobile (TIM) for the period from 04/2014 to 08/2016, thanks to a collaboration with the Statistical Office of the Comune di Brescia.
      • Census data from ISTAT, reporting the residential population (01/01/2016) by age for each sezione di censimento (SC, census section).
  • 5. The set-up
      • To detect the spatiotemporal variability of MPU we define the subject of our analysis: the daily density profiles (DDP).
      • Let $e_{it}$ be the number of MPU in the $i$-th grid cell in a generic time interval $t$,
      • let $I_r = \{i_1, \dots, i_m\}$ be the set of grid cells in the region of interest $r$,
      • let $T_d = \{t_1, \dots, t_o\}$ be the set of time intervals in a day $d$.
      • $DDP_{rd}$ is the vector (of length $o$) of the sums of MPU in region $r$ and day $d$, one sum for each time instant considered:
        $$DDP_{rd} = \left( \sum_{l=1}^{m} e_{i_l,t_1},\ \sum_{l=1}^{m} e_{i_l,t_2},\ \dots,\ \sum_{l=1}^{m} e_{i_l,t_o} \right)$$
      • Goal: to classify the occurrences in the time series of $DDP_{rd}$ over the set $\{d_1, \dots, d_n\}$ of the $n$ analyzed days; in other words, to cluster similar $DDP_{rd}$.
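The definition of $DDP_{rd}$ on slide 5 can be illustrated with a minimal Python sketch. The function name, the argument names and the nested-list layout `e[i][t]` are assumptions made for illustration only; the slides do not describe how the data are stored.

```python
def daily_density_profile(e, region_cells, day_intervals):
    """Return DDP_rd as a list with one entry per time interval of the day.

    e             -- hypothetical layout: e[i][t] is the MPU count of cell i in interval t
    region_cells  -- indices of the cells in region r (the set I_r)
    day_intervals -- indices of the time intervals in day d (the set T_d)
    """
    # For each interval, sum the MPU counts over all cells of the region.
    return [sum(e[i][t] for i in region_cells) for t in day_intervals]


# Toy data: 3 cells observed over 3 intervals; the region holds cells 0 and 2.
e = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
profile = daily_density_profile(e, region_cells=[0, 2], day_intervals=[0, 1, 2])
```

With quarter-hour data, `day_intervals` would hold the 96 interval indices of one day, so each profile has length o = 96.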
  • 6. Issues
      • Our dataset amounts to $n$ observations (days) and $p = m \cdot o$ features per day (cells × quarters). Consider one year of data ($n = 365$) with $o = 96$ (quarters per day) and $m = 400$ (grid cells of the sample area): then $p = 400 \cdot 96 = 38{,}400$.
      • The number of features is larger than the number of observations, so we are in a high-dimensional data setup (Donoho, 2000).
      • Traditional techniques (Arabie and De Soete, 1996) may not return robust results on high-dimensional data, for example because of the curse of dimensionality (Keogh and Mueen, 2017).
      • Bouveyron et al. (2007) addressed this issue with regard to clustering. However, as suggested by Jović et al. (2015), a suitable solution is a preliminary data reduction strategy.
      • The Histogram of Oriented Gradients (HOG) approach is used for data reduction.
  • 7. The Strategy ... ( ... to take into account days' similarity)
      Step | Type                        | Aim                            | Method                               | Features
      1    | data reduction + clustering | find similar raster images     | HOG + k-means clustering             | HOG features
      2    | clustering                  | find similar functional curves | functional model-based clustering    | DDP features
      3    | population assessment       | estimate city users            | spatial match of MPU and census data | DDP features + population
      4    | visualization               | find reference daily profiles  | functional box plots                 | DDP features
  • 8. The Strategy, step 1. Type: data reduction + clustering. Aim: find similar raster images. Method: HOG + k-means clustering. Features: HOG features.
  • 9. HOG data reduction
      • For a given $t$, let $i_t = \{e_{i_1,t}, e_{i_2,t}, \dots, e_{i_m,t}\}$ be the MPU vector of region $r$ at time instant $t$ (dimension $m$).
      • Aim: to reduce $i_t$ to a smaller vector of values $\kappa_{i_t}$ (of dimension $m' < m$) that retains the relevant information contained in $i_t$.
      • To do so, $i_t$ undergoes, separately for each $t$, a histogram of oriented gradients (HOG) feature extraction (Dalal and Triggs, 2005).
      • The vector that undergoes HOG is the max-scaled $z_{i_t} = \{ e_{i,t} / \max_{i \in I_r}(e_{i,t}) \}, \ \forall i \in I_r$.
      • HOG method:
          1. split the $m$ cells of the grid into $S$ smaller grids $G_1, \dots, G_S$ ($G_i \cap G_j = \emptyset$, $\forall i, j = 1, \dots, S$ with $i \neq j$); $\sqrt{S}$ is a parameter to be chosen;
          2. for each grid $G_i$, compute the direction and magnitude gradient matrices (Dalal and Triggs, 2005);
          3. from the two gradient matrices, determine the histogram of gradients with $k$ equal bins ($k$ is a parameter to be chosen).
      • $\kappa_{i_t}$ is stacked over the subscript $t$ to derive, for region $r$ and day $d$, the vector of features $\kappa_d$ (dimension $S \cdot k \cdot o$), $d = 1, \dots, n$.
  • 10. HOG data reduction ... explained
      From $n \times n$ raster data ... to $X_t$, a matrix whose entries are the number of people in each cell at time $t$:
      $$X_t = \begin{pmatrix} 93 & 124 & 77 & \cdots \\ 217 & 55 & 94 & \cdots \\ 24 & 77 & 109 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
      1. standardize the MPU data;
      2. split the matrix into sub-matrices;
      3. for each sub-matrix, compute the matrices of gradients (using the Sobel operator);
      4. assign each value of the direction matrix to one of the $k$ bins of the histogram, using its magnitude as weight, to produce the vector of features;
      5. stack the features of all quarters of the day into a single vector.
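Steps 1 to 4 of the HOG reduction can be sketched for a single raster in plain Python. This is an illustrative sketch, not the authors' implementation. The following choices are assumptions that the slides do not specify: the function and parameter names, unsigned orientations in [0, π), hard assignment of each cell to one bin (no interpolation between bins), edge replication inside each sub-matrix when applying the Sobel kernels, and ignoring leftover rows or columns when the grid side is not a multiple of `blocks_per_side`.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def hog_features(raster, blocks_per_side, n_bins):
    """HOG-style features of one square raster (a list of equal-length rows)."""
    n = len(raster)
    peak = max(max(row) for row in raster) or 1  # guard against an all-zero map
    z = [[value / peak for value in row] for row in raster]  # step 1: scale by the maximum
    size = n // blocks_per_side  # side of each sub-matrix (step 2)
    features = []
    for block_row in range(blocks_per_side):
        for block_col in range(blocks_per_side):
            r0, c0 = block_row * size, block_col * size
            histogram = [0.0] * n_bins
            for r in range(r0, r0 + size):
                for c in range(c0, c0 + size):
                    gx = gy = 0.0
                    # Step 3: Sobel gradients, neighbours clamped to the sub-matrix.
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr = min(max(r + dr, r0), r0 + size - 1)
                            cc = min(max(c + dc, c0), c0 + size - 1)
                            gx += SOBEL_X[dr + 1][dc + 1] * z[rr][cc]
                            gy += SOBEL_Y[dr + 1][dc + 1] * z[rr][cc]
                    # Step 4: bin the direction, weighted by the magnitude.
                    angle = math.atan2(gy, gx) % math.pi
                    bin_index = min(int(angle / math.pi * n_bins), n_bins - 1)
                    histogram[bin_index] += math.hypot(gx, gy)
            features.extend(histogram)
    return features


# Toy 4x4 raster whose values change only along the columns.
ramp = [[0, 1, 2, 4] for _ in range(4)]
ramp_features = hog_features(ramp, blocks_per_side=2, n_bins=4)
```

Step 5 would then concatenate the feature lists of all quarters of a day, for example `[f for raster in day_rasters for f in hog_features(raster, 3, 4)]`, where `day_rasters` is a hypothetical list holding the rasters of one day.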
  • 11. First step clustering
      • Days are clustered in terms of how MPU are distributed over region $r$ according to index $i$, i.e. according to similarity in the raster image.
      • The objects to be clustered are the $n$ days, and $\kappa_d$ contains the $S \cdot k \cdot o$ features (with $S \cdot k \cdot o < m \cdot o$) for day $d$, $\forall d = d_1, \dots, d_n$.
      • The k-means clustering method (Hartigan and Wong, 1979) is adopted (after testing for the curse of dimensionality).
      • According to the Hartigan and Wong criterion, the number of clusters $H$ is chosen by minimizing the ratio between the total within sum of squares and the total sum of squares for different values of $H$.
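The ratio used to choose H can be made concrete with a small sketch. Note the substitution: the code below uses plain Lloyd iterations with a seeded random start, not the Hartigan and Wong (1979) algorithm cited on the slide, and all function names are hypothetical. Because the ratio generally shrinks as H grows, in practice it is usually compared across several values of H to see where the improvement levels off; that reading is ours and is not stated on the slide.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Lloyd's k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = [min(range(n_clusters), key=lambda h: sq_dist(p, centers[h]))
                  for p in points]
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = []
        for h in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == h]
            new_centers.append(mean_vector(members) if members else centers[h])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def within_to_total_ratio(points, labels, centers):
    """Total within-cluster sum of squares divided by the total sum of squares."""
    grand_mean = mean_vector(points)
    total = sum(sq_dist(p, grand_mean) for p in points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return within / total


# Toy "days" with two features each, forming two well-separated groups.
days = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
ratios = {}
for n_clusters in (1, 2, 3):
    labels, centers = kmeans(days, n_clusters)
    ratios[n_clusters] = within_to_total_ratio(days, labels, centers)
```

In the setting of the slides, each point would be the HOG feature vector $\kappa_d$ of one day.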
  • 12. The Strategy, step 2. Type: clustering. Aim: find similar functional curves. Method: functional model-based clustering. Features: DDP features.
  • 13. Second step clustering
      • Aim: to consider similarity in the functional form of the $DDP_{rd}$, viewed as functional curves.
      • We consider $DDP_{rd}$ as the collection of functional observations $x_{rd}(T_d)$, $T_d \in (t_1, \dots, t_o)$ (length $o$; e.g. $\sum_{l=1}^{m} e_{i_l,t_1}$ at $t_1$), with $d$ varying in $\{d_1, \dots, d_n\}$.
      • We adopt a model-based functional data clustering method (MB-FAC, Bouveyron et al., 2015), which provides an estimated curve with specific parameters, to group the days $d$ (the objects to cluster) in terms of the $o$ values of $DDP_{rd}$ (the clustering variables).
      • We adopt the following path:
          1. functional data outlier detection by likelihood ratio test (LRT) to remove anomalous $DDP_{rd}$, as proposed by Febrero-Bande et al. (2008);
          2. the Bouveyron et al. (2015) clustering method, using the funFEM package in R.
      • The method is suited to high-dimensional data: it employs a sub-space clustering criterion (Agrawal et al., 1998), i.e. it considers just the minimum number of variables needed for grouping objects.
  • 14. The Strategy, step 3. Type: population assessment. Aim: estimate city users. Method: spatial match of MPU and census data. Features: DDP features + population.
  • 15. Population assessment - I
      • Aim: to estimate the total number of people (city users), given that MPU data are available for just one mobile phone company.
      • We compute an estimate of the market share of the provider company to correct the $DDP_{rd}$,
      • by comparing the number of residents from archives with the number of TIM users in a residential area in late evening hours (assuming that, in late evening hours, residential sezioni di censimento (SC) are populated only by residents).
      • The MPU grid is made of square cells while SCs are irregular polygons → the number of TIM users belonging to each SC needs to be retrieved by intersecting the two sources.
      • The portion of each cell belonging to the SC polygon was calculated in order to count how many TIM users are present in each polygon, using the extract function of the raster package in R.
  • 16. Population assessment - II
      • Let $Cell_j$, $\forall j = 1, 2, \dots, J_{SC}$, be the cells of the sample area; the ratio
        $$A_j = \frac{\mathrm{area}(SC \cap Cell_j)}{\mathrm{area}(Cell_j)}$$
        represents how much of $Cell_j$ is included in the chosen SC;
      • let $TUC_j$ be the MPU in $Cell_j$; the estimated number of MPU in the SC is
        $$ETU_{SC} = \sum_{j} TUC_j \cdot A_j .$$
      • The estimated company market share in the SC is given by
        $$ETMS_{SC} = \frac{ETU_{SC}}{P_{SC}}$$
        where $P_{SC}$ is the number of residents of that SC (children and elderly people excluded).
      • The median ($me(\cdot)$) of $ETMS_{SC}$ can be used as a proxy for the company market share at city level;
      • the city users estimate is given by
        $$\widehat{DDP}_{rd} = \frac{DDP_{rd}}{me(ETMS_{SC})}$$
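The chain of formulas on slide 16 (area-weighted users, market share per SC, median correction) is simple arithmetic and can be sketched as follows. Function names and the toy numbers are illustrative assumptions; the overlap shares $A_j$ are taken as given here, whereas the slides obtain them with the extract function of the R raster package.

```python
from statistics import median


def estimated_users(cell_users, overlap_shares):
    """ETU_SC: sum over cells of TUC_j * A_j."""
    return sum(users * share for users, share in zip(cell_users, overlap_shares))


def estimated_market_share(cell_users, overlap_shares, residents):
    """ETMS_SC: estimated users in the SC divided by its resident count P_SC."""
    return estimated_users(cell_users, overlap_shares) / residents


def city_users(ddp, market_shares):
    """Corrected profile: every DDP entry divided by the median market share."""
    share = median(market_shares)
    return [value / share for value in ddp]


# Toy SC overlapping three cells: fully, by half, and not at all.
etu = estimated_users([40, 20, 10], [1.0, 0.5, 0.0])
etms = estimated_market_share([40, 20, 10], [1.0, 0.5, 0.0], residents=200)
# Hypothetical market shares for three SCs, then the corrected profile.
corrected = city_users([100, 150], [0.2, etms, 0.5])
```

Using the median rather than the mean keeps the city-level proxy robust to SCs where the late-evening assumption fails, which is presumably why the slide chooses it.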
  • 17. The Strategy, step 4. Type: visualization. Aim: find reference daily profiles. Method: functional box plots. Features: DDP features.
  • 18. Visualization
      • Consider $DDP_{rd}$ as a functional curve $x_{rd}(T_d)$ that displays, on the y-axis, the sum of MPU in region $r$ and day $d$ against, on the x-axis, the time instants $T_d \in (t_1, \dots, t_o)$.
      • Functional box plots (FBP, Sun and Genton, 2011) can be used to display the profile for each final cluster.
      • For cluster $h$, let $d_h = \{d_{1h}, \dots, d_{n_h h}\}$ be the group of days belonging to cluster $h$, and let $\widehat{DDP}_{rd,h} = [\widehat{DDP}_{rd_1,h}, \dots, \widehat{DDP}_{rd_{n_h},h}]$ be the matrix of dimension $o \times n_h$ with one $DDP_{rd}$ of cluster $h$ in each column.
      • Treating each $DDP_{rd}$ as a curve, the FBP for cluster $h$, i.e. the profile plot of the total number of people (which we call city users) at different hours (with DB), is computed using the matrix $\widehat{DDP}_{rd,h}$.
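A functional box plot orders curves by a depth measure, takes the deepest curve as the median, and draws the envelope of the 50% deepest curves as the central region (Sun and Genton, 2011). The sketch below computes these ingredients without plotting. It assumes the modified band depth with bands formed by pairs of curves; the slides do not say which depth variant was used, and the function names are hypothetical.

```python
from itertools import combinations
from math import ceil


def modified_band_depth(curves):
    """For each curve, the average share of time points lying inside the
    band delimited by every pair of curves in the sample."""
    n_points = len(curves[0])
    pairs = list(combinations(range(len(curves)), 2))
    depths = []
    for curve in curves:
        inside = 0
        for j, k in pairs:
            for t in range(n_points):
                low = min(curves[j][t], curves[k][t])
                high = max(curves[j][t], curves[k][t])
                if low <= curve[t] <= high:
                    inside += 1
        depths.append(inside / (len(pairs) * n_points))
    return depths


def central_region(curves, depths, share=0.5):
    """Return (median curve, lower envelope, upper envelope) of the deepest curves."""
    n_keep = ceil(share * len(curves))
    order = sorted(range(len(curves)), key=lambda i: depths[i], reverse=True)
    deepest = [curves[i] for i in order[:n_keep]]
    n_points = len(curves[0])
    lower = [min(c[t] for c in deepest) for t in range(n_points)]
    upper = [max(c[t] for c in deepest) for t in range(n_points)]
    return curves[order[0]], lower, upper


# Toy cluster of five flat daily profiles observed at three time instants.
profiles = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]]
depths = modified_band_depth(profiles)
median_curve, lower, upper = central_region(profiles, depths)
```

In the notation of slide 18, `profiles` would hold the columns of the matrix for cluster h, one corrected daily profile per day.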
  • 19. Case study description
      • WGS 84 UTM 32 N coordinates: 5,040,920–5,049,980 N, 585,970–592,970 E (area about 64 km²), centred on the Mandolossa-Gandovere network (grid of 20x20 150 m² cells),
      • at 15-minute intervals (quarters) over the period July 1st, 2015 - August 10th, 2016.
      • After imputing missing quarters and removing whole days when too many quarters were missing, we ended up with 360 valid days.
      • HOG parameters: $\sqrt{S} = 3$, h = 4.
      • The focus is on the residential and industrial parts of four specific areas (Moie di Sotto, Villaggio Badia and Fantasina, southern Gandovere canal, Roncadelle).
  • 20. First step clustering: results. Figure: Spine plots representing the first-step clustering of days along (a) months and (b) days of the week (green: all days, mostly occurring in July, August and September; blue: working days, mostly occurring from February to June; red: working days, mostly occurring from October to January; yellow: weekends, mostly occurring from October to June).
  • 21. Representation: results. Figure: Functional box plots of exposed people ("city users") inside residential areas: (a) Moie di Sotto, (b) Villaggio Badia and Fantasina, (c) southern Gandovere canal, (d) Roncadelle. Cluster 1 (July, August, September, C1), Cluster 2 (working days from October to June, C2), Cluster 3 (weekends from October to June, C3).
  • 22. Representation: results - II. Figure: Functional box plots of exposed people ("city users") inside industrial-commercial settlements: (a) Moie di Sotto, (b) Villaggio Badia and Fantasina, (c) southern Gandovere canal, (d) Roncadelle. Cluster 1 (July, August, September, C1), Cluster 2 (working days from October to June, C2), Cluster 3 (weekends from October to June, C3).
  • 23. Discussion
      • The combination of:
          1. the high spatial resolution (150 m²) and short time step (15 minutes) of the data, and
          2. the application of the proposed statistical strategy, designed for high-dimensional data,
        permits:
          1. reliable population assessment even for small areas, and
          2. a precise evaluation of the temporal dynamics of city users in the sample area.
      • The functional box plot results are meaningful:
          1. working days and weekends show different temporal dynamics when they belong to working months (October to June);
          2. daily dynamics in the summer months (July, August and September, the holiday period in Italy) must be regarded as different from the others;
          3. working days and weekends feature more similar daily density profiles during the summer months.
  • 24. References - I
      1. Agrawal, R., Gehrke, J., Gunopulos, D., and Raghavan, P.: Automatic subspace clustering of high dimensional data for data mining applications, in: Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, ACM Press, 94–105, doi:10.1145/276304.276314, 1998.
      2. Arabie, P., and De Soete, G.: Clustering and classification, World Scientific, doi:10.1142/1930, 1996.
      3. Bouveyron, C., Girard, S., and Schmid, C.: High-dimensional data clustering, Comput. Stat. Data An., 52(1), 502–519, doi:10.1016/j.csda.2007.02.009, 2007.
      4. Bouveyron, C., Come, E., and Jacques, J.: The discriminative functional mixture model for a comparative analysis of bike sharing systems, The Annals of Applied Statistics, 9(4), 1726–1760, 2015.
      5. Dalal, N., and Triggs, B.: Histograms of oriented gradients for human detection, in: Proceedings of the International Conference on Computer Vision & Pattern Recognition (CVPR '05), doi:10.1109/CVPR.2005.177, 2005.
      6. Donoho, D. L.: High-dimensional data analysis: The curses and blessings of dimensionality, AMS Math Challenges Lecture, 1–32, 2000.
      7. Erlang, A. K.: The theory of probabilities and telephone conversations, Nyt. Tidsskr. Mat. Ser. B, 20, 33–39, 1909.
  • 25. References - II
      1. Febrero-Bande, M., Galeano, P., and Gonzalez-Manteiga, W.: Outlier detection in functional data by depth measures, with application to identify abnormal NOx levels, Environmetrics: The official journal of the International Environmetrics Society, 19(4), 331–345, 2008.
      2. Hartigan, J. A., and Wong, M. A.: Algorithm AS 136: A k-means clustering algorithm, J. R. Stat. Soc., Series C (Applied Statistics), 28(1), 100–108, doi:10.2307/2346830, 1979.
      3. Kellens, W., Terpstra, T., and De Maeyer, P.: Perception and communication of flood risks: A systematic review of empirical research, Risk Anal., 33/1, 24–49, doi:10.1111/j.1539-6924.2012.01844.x, 2013.
      4. Keogh, E., and Mueen, A.: Curse of dimensionality, in: Sammut, C., and Webb, G. I. (eds.), Encyclopedia of Machine Learning and Data Mining, 314–315, Springer, doi:10.1007/978-1-4899-7687-1-192, 2017.
      5. Jović, A., Brkić, K., and Bogunović, N.: A review of feature selection methods with applications, in: Proceedings of the 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 1200–1205, doi:10.1109/MIPRO.2015.7160458, 2015.
      6. Sun, Y., and Genton, M. G.: Functional boxplots, Journal of Computational and Graphical Statistics, 20(2), 316–334, 2011.
  • 26. Supplemental. Figure: Snapshot of a dynamic map showing the spatial distribution of mobile phone users (MPU) at 12pm, 17/11/2015 (Wednesday); base map: Lombardy Regional Technical Map CTR 1:5000 provided by Lombardy Region (www.geoportale.regione.lombardia.it).
  • 27. Supplemental. Figure: Example of dissimilarity among raster images. Figure: Example of similarity among raster images.
  • 28. Supplemental. Figure: Example of similarity and dissimilarity in the functional form. Curves with the same color are similar; curves with different colors are dissimilar.
  • 30. Supplemental. Figure: Example of the weighting scheme used to assign the number of TIM users to SC 110, located at latitude 45.544 N and longitude 10.217 E.