Machine learning algorithm for classification of activities of daily living (ADL)
Siddharth Chakravarty
Overview
ADL: What is it?
How is it monitored?
Technology landscape
Problem statement
Approach
Results
ADL: Activities of Daily Living
A technique for classifying human activities can be a useful tool, not only to classify and monitor our activities but also to improve overall quality of life.
Image Source: connectingcleveland.net/wp-content/uploads/2014/07/Daily_Life_Wallpaper.jpg
State-of-the-art technology for ADL monitoring
•  An accelerometer records the acceleration it experiences in the X, Y and Z directions.
•  Accelerometers are the most commonly used type of sensor for activity recognition, both in wearable sensors and in other consumer electronic devices ranging from iPhones to the Wii.
[Figure: accelerometer activity over a time scale for walking, running, jumping, sweeping and sleeping. Image courtesy: Jawbone]
Technology landscape
Major players
•  Fitbit: recently announced a next-generation device
•  Jawbone: acquired BodyMedia to complement its technology
•  Microsoft
•  Samsung
•  Pebble
•  Misfit: acquired by Fossil
•  Facebook: acquired Finland-based fitness app maker ProtoGeo
Many more…
Clearly there is a need to accurately monitor and classify ADLs, not just for recreation but also for other applications.
Capabilities compared: heart rate, walking, running, sleeping, ADL
Heart rate monitors: X
Pedometers: X X X
Phone: X X
Wearable: X X X X
Problem statement
The objective of the project was to develop a machine learning algorithm, using a support vector machine (SVM), to predict ADLs.
Emerging Trends in Data Analytics
Image courtesy: SARK7
The automatic recognition of a set of Activities of Daily Living is among the most challenging research fields in Ambient Intelligence. The main challenge with wearable technology is the classification of use-case ADLs.
Data science toolkit
[Diagram: what is the problem statement (prediction, classification or clustering)? Methods shown: linear regression, logistic regression, decision trees, random forest, support vector machines, K-means and hierarchical clustering.]
Support vector machines: SVM is a popular and efficient method for classification and regression. Currently, four R packages contain SVM-related software. For this project the e1071 R package was chosen because it supports multi-class classification problems.
The Dataset
Dataset for ADL Recognition with Wrist-worn Accelerometer
•  The Dataset for ADL Recognition with Wrist-worn Accelerometer is a public collection of labeled accelerometer recordings intended for the creation and validation of acceleration models of simple ADLs.
•  It is provided by the UCI Machine Learning Repository, Center for Machine Learning and Intelligent Systems.
•  The data were collected using a single tri-axial accelerometer attached to the right wrist of the volunteer. The collection was carried out by Barbara Bruno, Fulvio Mastrogiovanni and Antonio Sgorbissa of the Laboratory for Ambient Intelligence and Mobile Robotics, DIBRIS, University of Genova.
•  The dataset is composed of recordings of 11 simple ADLs performed by a total of 16 volunteers.
Volunteer statistics:
Gender: 11 male, 5 female
Age (years): min 19, max 81, avg 57.4
Weight (kg): min 56, max 85, avg 72.7
Workflow: Problem statement → Data → Data wrangling → Data exploration → Build models → Testing and classification
Activities: brush_teeth, climb_stairs, comb_hair, descend_stairs, drink_glass, getup_bed, liedown_bed, pour_water, sitdown_chair, standup_chair, walk
[Image: wrist-worn accelerometer. Image courtesy: Chalkbeat Colorado]
Overview of dataset
Accelerometer-2011-04-11-13-28-18-brush_teeth-f1.txt
-  Refers to an accelerometer recording taken on April 11, 2011, starting at 13:28:18.
-  The recording is of the HMP (human motion primitive) "brush_teeth" executed by the volunteer with ID "f1".
The dataset provides equations to convert raw acceleration into real acceleration values and to reduce noise.
Raw accelerometer data for the task "brush_teeth": the columns represent the acceleration measured in the X, Y and Z directions.
Each folder contains the raw data for one activity. Each file in the dataset follows the naming convention:
Accelerometer-[START_TIME]-[ADL]-[VOLUNTEER]
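As an illustration of the naming convention, a file name can be split into its three parts with a regular expression. This is a minimal Python sketch, not the code used in the project (the slides mention R). The function name `parse_recording_name`, the `.txt` suffix and the volunteer pattern (one letter followed by digits, as in "f1") are assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed layout: Accelerometer-[START_TIME]-[ADL]-[VOLUNTEER].txt, with
# START_TIME written as YYYY-MM-DD-HH-MM-SS as in the example file name.
_NAME = re.compile(
    r"^Accelerometer-(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})-([a-z_]+)-([a-z]\d+)\.txt$"
)


def parse_recording_name(filename):
    """Return (start_time, adl, volunteer) parsed from a recording file name."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    stamp, adl, volunteer = match.groups()
    return datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S"), adl, volunteer


start, adl, volunteer = parse_recording_name(
    "Accelerometer-2011-04-11-13-28-18-brush_teeth-f1.txt"
)
print(start.isoformat(), adl, volunteer)  # 2011-04-11T13:28:18 brush_teeth f1
```

Parsing the label from the file name is one way to attach the activity (the "Act" column) to each recording when consolidating the files.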
Feature extraction
•  A key challenge for any classification ML algorithm is feature extraction, i.e. finding the parameters that uniquely distinguish each class (activity).
[Figure: Ax, Ay and Az acceleration traces for three activities: drink glass, pour water and climb stairs]
Feature extraction approaches
Group              Methods
Time domain        Mean [1], Std. deviation [2], Variance [3], MAD [4], Entropy [5]
Frequency domain   Fast Fourier transform [6], Discrete cosine transform [7]
Other              Principal component analysis [8], Linear discriminant analysis [9], Singular value decomposition
•  The ML algorithm implemented for this project uses the discrete cosine transform and singular value decomposition approaches to classify ADLs.
Data wrangling
•  Consolidate into a single CSV file.
•  Convert raw data into acceleration and apply filter.
•  Calculate components of SVD: u, v, d.
•  Sum vector magnitude (VM) = √(ax² + ay² + az²)
Data exploration
•  VM and SVD histograms
•  Scale data
•  Under-sample and over-sample data
•  Split data
Build and select model
•  Build model in SVM using training data.
•  Calculate values for cost and gamma.
•  Cross-validate model on training data.
Validate model
•  Use test data to validate model.
•  F-measure
•  Accuracy
•  Sensitivity
•  Specificity
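The validation measures listed above can all be derived from a confusion matrix. The following Python sketch shows one way to compute them. It assumes rows are predictions and columns are reference classes, and it uses a made-up three-class matrix rather than the project's results. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(cm, labels):
    """Accuracy plus per-class sensitivity, specificity and F-measure.

    cm[i][j] is the number of samples predicted as labels[i] whose
    reference (true) class is labels[j].
    """
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(len(labels))) / total
    per_class = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        fp = sum(cm[k]) - tp                 # predicted as this class, but wrong
        fn = sum(row[k] for row in cm) - tp  # this class, predicted as another
        tn = total - tp - fp - fn
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        # F-measure written directly in counts: 2TP / (2TP + FP + FN).
        f_measure = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
        per_class[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f_measure": f_measure,
        }
    return accuracy, per_class


# Toy 3-class matrix (made-up counts, not taken from the slides).
labels = ["walk", "sitdown_chair", "standup_chair"]
cm = [
    [8, 1, 0],
    [1, 7, 2],
    [1, 2, 8],
]
accuracy, per_class = classification_metrics(cm, labels)
print(round(accuracy, 3), per_class["walk"])
```

In a multi-class setting each class is scored one-vs-rest, which is why sensitivity and specificity are reported per activity rather than as a single number.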
CONSOLIDATING INFORMATION
•  The original data comprise 479,288 observations of ax, ay and az, distributed among the 11 activities.
•  The data are consolidated to 827 observations that contribute the maximum variance in the data.
   -  VM refers to the sum of vector magnitudes
   -  SVD1 and SVD2 refer to the d1 and d2 components of the singular value decomposition
   -  Act refers to the activity
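A minimal Python sketch of the VM feature follows. It assumes that VM is the per-sample magnitude √(ax² + ay² + az²) summed over all samples of one recording, which is one reading of "sum of vector magnitude". The function name and the sample values are made up for illustration.

```python
import math


def sum_vector_magnitude(samples):
    """Sum of sqrt(ax^2 + ay^2 + az^2) over the (ax, ay, az) samples of one recording."""
    return sum(math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples)


# Three made-up samples with magnitudes 5, 2 and 3.
recording = [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0), (1.0, 2.0, 2.0)]
vm = sum_vector_magnitude(recording)
print(vm)  # 10.0
```

Collapsing each recording to a single number in this way would be consistent with reducing many raw observations to one row per recording.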
What does the data look like?
[Figure: histograms of VM, SVD1 and SVD2]
Two models are evaluated:
1.  Model 1 uses VM as the only predictor of ADL.
2.  Model 2 uses VM and SVD (SVD1 and SVD2) as predictors of ADL.
DATA SAMPLING
•  The original data set is imbalanced, with fewer data points for activities such as brush_teeth, comb_hair and liedown_bed.
•  The SMOTE function is used to handle the imbalanced classification problem. It generates synthetic data points for the under-represented classes to address the class imbalance.
•  The data set is then split into training (70%) and test (30%) data.
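The core idea behind SMOTE is to create a synthetic minority sample on the line segment between an existing minority sample and one of its nearest minority neighbours. The Python sketch below illustrates only that interpolation step. It is not the SMOTE function used in the project, and the function name `smote_like`, the neighbour count `k` and the toy points are assumptions.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of the chosen sample within the minority class.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        u = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + u * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class: the four corners of the unit square (made-up values).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6)
print(len(new_points))  # 6
```

Because each synthetic point is a convex combination of two real points, it always stays inside the region already occupied by the minority class.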
Data after balancing and scaling
[Figure: histograms of VM, SVD1 and SVD2 after balancing and scaling]
Scaling prevents attributes with larger numeric ranges from dominating those with smaller numeric ranges. It also reduces numerical difficulties during the calculation.
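The split and scaling steps can be sketched as follows in Python. The slides do not say which scaling was applied, so min-max scaling to [0, 1] is an assumption here, as are the function names and the toy rows. The scaling ranges are learned from the training rows and then reused for the test rows, so that both sets are transformed identically.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and cut them into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fit_min_max(rows):
    """Return the (min, max) of every column."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, ranges):
    """Map every column to [0, 1] using previously fitted ranges."""
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, (lo, hi) in zip(row, ranges))
        for row in rows
    ]


# Toy rows with two features on very different scales (made-up values).
rows = [(float(i), 100.0 * i) for i in range(10)]
train, test = train_test_split(rows)
ranges = fit_min_max(train)
train_scaled = apply_min_max(train, ranges)
test_scaled = apply_min_max(test, ranges)
print(len(train), len(test))  # 7 3
```

Test values may fall slightly outside [0, 1] when a test row lies beyond the range seen in training; that is expected with this approach.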
Tuning of SVM parameters for Model 1 (VM only)
•  The RBF kernel is chosen because it can handle non-linear relations between class and attributes.
•  Cost (C) and gamma (γ) are the two key parameters of the SVM model.
•  Initial values of cost and gamma are estimated by running a coarse grid search with the tune function.
•  Following the coarse grid search, the values of cost and gamma are fine-tuned.
•  The tune function uses 10-fold cross-validation.
•  Parameter tuning of 'svm':
   -  Sampling method: 10-fold cross-validation
   -  Best parameters:
      -  gamma: 20
      -  cost: 900
   -  Best performance: 0.715 (1 - error)
•  The model performance is evaluated first on the training data set and then on the test data.
[Figure: contour map showing the error rate for different cost and gamma values]
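The coarse-then-fine search over cost and gamma can be sketched in Python as below. This is not the tune function from the slides. `cv_error` is a hypothetical callable standing in for the 10-fold cross-validated error of an RBF SVM, and `toy_error` is a made-up error surface used only to make the example runnable. The coarse grid uses the exponentially spaced values suggested in the practical guide by Hsu, Chang and Lin.

```python
import math


def grid_search(cv_error, costs, gammas):
    """Return (error, cost, gamma) with the lowest error on the grid."""
    return min((cv_error(c, g), c, g) for c in costs for g in gammas)


def coarse_to_fine(cv_error):
    # Coarse pass: cost = 2^-5 ... 2^15 and gamma = 2^-15 ... 2^3, step 2^2.
    coarse_costs = [2.0 ** e for e in range(-5, 16, 2)]
    coarse_gammas = [2.0 ** e for e in range(-15, 4, 2)]
    _, c0, g0 = grid_search(cv_error, coarse_costs, coarse_gammas)
    # Fine pass: quarter-power steps within one coarse step of the best point.
    steps = [s / 4 for s in range(-8, 9)]
    fine_costs = [c0 * 2.0 ** s for s in steps]
    fine_gammas = [g0 * 2.0 ** s for s in steps]
    return grid_search(cv_error, fine_costs, fine_gammas)


def toy_error(cost, gamma):
    """Made-up smooth error surface with its minimum at cost=2^9.5, gamma=2^-2.5."""
    return 0.1 + 0.01 * ((math.log2(cost) - 9.5) ** 2 + (math.log2(gamma) + 2.5) ** 2)


error, cost, gamma = coarse_to_fine(toy_error)
print(round(error, 3), round(math.log2(cost), 2), round(math.log2(gamma), 2))
```

Searching in powers of two keeps the coarse pass cheap while still covering several orders of magnitude; the fine pass then spends its evaluations only near the most promising region.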
Model performance
ROC curves and confusion matrix (CM)
[Figure: ROC curves (tpr against fpr) and CM for the training data set and for the test data set]
TRAINING CONFUSION MATRIX
Reference Prediction brush_teeth climb_stairs comb_hair descend_stairs drink_glass getup_bed liedown_bed pour_water sitdown_chair standup_chair walk
85 brush_teeth 85 0 0 0 0 0 0 0 0 0 0
138 climb_stairs 0 118 0 13 5 9 2 7 0 1 0
47 comb_hair 0 0 42 0 0 0 0 0 0 0 0
60 descend_stairs 0 2 0 35 0 5 0 3 0 0 0
135 drink_glass 0 2 0 1 123 1 0 2 1 1 0
153 getup_bed 0 3 0 3 5 104 2 9 1 2 0
40 liedown_bed 0 2 0 0 1 2 35 0 0 0 0
147 pour_water 0 14 0 7 5 7 2 120 2 1 2
136 sitdown_chair 0 0 0 1 0 1 0 0 129 20 0
144 standup_chair 0 0 0 1 0 0 1 0 12 120 0
161 walk 0 0 4 0 0 2 0 0 0 0 164
TEST DATA SET CONFUSION MATRIX
Reference Prediction brush_teeth climb_stairs comb_hair descend_stairs drink_glass getup_bed liedown_bed pour_water sitdown_chair standup_chair walk
36 brush_teeth 34 0 0 0 0 0 0 0 0 0 0
60 climb_stairs 0 42 0 4 2 5 0 6 0 0 1
19 comb_hair 0 0 14 0 2 0 0 0 0 0 1
26 descend_stairs 0 1 0 10 2 2 0 2 0 0 0
59 drink_glass 0 2 1 1 41 3 0 3 2 2 0
56 getup_bed 0 1 0 2 9 33 0 7 1 0 0
18 liedown_bed 0 4 0 0 0 3 15 1 2 0 0
60 pour_water 0 6 0 7 1 7 0 41 1 0 3
62 sitdown_chair 0 1 0 1 0 1 0 0 47 11 0
62 standup_chair 0 2 0 0 1 0 3 0 9 49 0
71 walk 2 1 4 1 1 2 0 0 0 0 66
VM model test summary
•  The initial results using VM to predict ADL were satisfactory, but performance on the test data deviated from performance on the training data.
•  Training performance could be improved further by tuning cost and gamma, i.e. by using higher cost and gamma values.
•  Before implementing Model 2, we need to check for collinearity (strong correlation between two or more predictor variables).
•  The vif (variance inflation factor) function is used to check for collinearity among the three variables.
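For reference, the variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. The Python sketch below computes it for the three-predictor case by solving the 2x2 normal equations directly. It is not the vif function used in the project, and the predictor values are made up.

```python
def vif(columns, j):
    """Variance inflation factor of predictor j, given exactly three predictors."""
    n = len(columns[0])
    centered = [[v - sum(col) / n for v in col] for col in columns]
    y = centered[j]
    a, b = [centered[i] for i in range(3) if i != j]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    saa, sbb, sab = dot(a, a), dot(b, b), dot(a, b)
    say, sby, syy = dot(a, y), dot(b, y), dot(y, y)
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("the other two predictors are perfectly collinear")
    # Least-squares coefficients of y on a and b (centred, so no intercept).
    coef_a = (say * sbb - sby * sab) / det
    coef_b = (sby * saa - say * sab) / det
    r2 = (coef_a * say + coef_b * sby) / syy
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


# Toy predictors (made-up values): three mutually uncorrelated columns.
vm = [-1.0, 1.0, -1.0, 1.0]
svd1 = [-1.0, -1.0, 1.0, 1.0]
svd2 = [1.0, -1.0, -1.0, 1.0]
independent = vif([vm, svd1, svd2], 2)

# Replace the third column with values close to vm + svd1.
nearly_sum = [-2.0, 0.1, 0.0, 2.0]
collinear = vif([vm, svd1, nearly_sum], 2)
print(independent, collinear > 10)  # 1.0 True
```

A VIF near 1 indicates that a predictor carries information not explained by the others, while a large value signals collinearity; thresholds of 5 or 10 are common rules of thumb.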
Tuning of SVM parameters: Model 2 (VM and SVD)
•  Initial values of cost and gamma for the SVM model are estimated by running a coarse grid search with the tune function.
•  Following the coarse grid search, the values of cost and gamma are fine-tuned.
•  Each pair of cost and gamma values is cross-validated (10-fold).
•  Parameter tuning of 'svm':
   -  Sampling method: 10-fold cross-validation
   -  Best parameters:
      -  gamma: 18
      -  cost: 800
   -  Best performance: 0.863 (1 - error)
[Figure: error rate for different gamma and cost values]
ROC curves and Confusion matrix (CM)
TEST DATA SET
Reference Prediction brush_teeth climb_stairs comb_hair descend_stairs drink_glass getup_bed liedown_bed pour_water sitdown_chair standup_chair walk
36 brush_teeth 35 0 0 0 0 0 0 0 0 0 0
58 climb_stairs 0 49 0 0 0 1 0 2 1 1 3
15 comb_hair 0 0 15 0 0 1 0 0 0 0 0
25 descend_stairs 0 1 0 23 0 0 0 0 0 1 0
57 drink_glass 0 0 0 0 56 3 0 7 0 0 0
65 getup_bed 0 0 0 0 1 53 0 3 2 0 1
16 liedown_bed 0 0 0 0 0 0 16 0 1 0 0
62 pour_water 0 0 0 0 0 0 0 46 0 0 0
57 sitdown_chair 0 0 0 0 0 0 0 0 66 0 0
61 standup_chair 0 0 0 1 0 0 0 0 3 47 0
69 walk 1 1 0 0 1 6 1 0 3 2 73
TRAINING DATA SET
Reference Prediction brush_teeth climb_stairs comb_hair descend_stairs drink_glass getup_bed liedown_bed pour_water sitdown_chair standup_chair walk
85 brush_teeth 85 0 0 0 0 0 0 0 0 0 0
138 climb_stairs 0 152 0 0 0 0 0 0 0 0 0
47 comb_hair 0 0 50 0 0 0 0 0 0 0 0
60 descend_stairs 0 0 0 54 0 0 0 0 0 0 0
135 drink_glass 0 0 0 0 154 0 0 0 0 0 0
153 getup_bed 0 0 0 0 0 134 0 0 0 0 0
40 liedown_bed 0 0 0 0 0 0 36 0 0 0 0
147 pour_water 0 0 0 0 0 0 0 127 0 0 0
136 sitdown_chair 0 0 0 0 0 0 0 0 140 0 0
144 standup_chair 0 0 0 13 0 0 0 0 0 146 0
161 walk 0 0 0 0 0 0 0 0 0 0 153
[Figure: ROC curves (tpr against fpr) and CM for the training data set and for the test data set]
Summary
•  The results from the study demonstrate an improved method for detecting ADLs.
•  The VM and SVD features, when used in conjunction, help to classify activities that would otherwise be misclassified.
•  The study needs to be extended to other publicly available ADL data sets.
•  While publicly available data sets will help tune and validate the model, its efficacy can only be validated using data sets from industry.
•  The range of applications for a machine learning algorithm in this setting is broad:
–  VR band
–  Health monitoring
–  Activity monitoring
–  Quality of activity, not just quantity
–  Remote point of care devices
–  Fall sensors
References
1.  J. Parkka, M. Ermes, P. Korpipaa, J. Mantyjarvi, J. Peltola, and I. Korhonen, “Activity classification using realistic data from wearable sensors,” IEEE Transactions on Information Technology in Biomedicine, vol. 10, no. 1, pp. 119–128, 2006.
2.  E. M. Tapia, S. S. Intille, W. Haskell, K. Larson, J. Wright, A. King, and R. Friedman, “Real-time recognition of physical activities and their intensities using wireless accelerometers and a heart monitor,” in International Symposium on Wearable Computers, 2007.
3.  O. D. Lara, A. J. Perez, M. A. Labrador, and J. D. Posada, “Centinela: A human activity recognition system based on acceleration and vital sign data,” Journal on Pervasive and Mobile Computing, 2011.
4.  Y.-P. Chen, J.-Y. Yang, S.-N. Liou, G.-Y. Lee, and J.-S. Wang, “Online classifier construction algorithm for human activity detection using a tri-axial accelerometer,” Applied Mathematics and Computation, vol. 205, no. 2, pp. 849–860, 2008.
5.  M. Ermes, J. Parkka, and L. Cluitmans, “Advancing from offline to online activity recognition with wearable sensors,” in Engineering in Medicine and Biology Society, 30th Annual International Conference of the IEEE, pp. 4451–4454, 2008.
6.  L. Bao and S. S. Intille, “Activity recognition from user-annotated acceleration data,” in Pervasive, pp. 1–17, 2004.
7.  K. Altun and B. Barshan, “Human activity recognition using inertial/magnetic sensor units,” in Human Behavior Understanding, Lecture Notes in Computer Science, pp. 38–51, Springer Berlin / Heidelberg, 2010.
8.  Z. He and L. Jin, “Activity recognition from acceleration data based on discrete cosine transform and SVM,” in IEEE International Conference on Systems, Man and Cybernetics, pp. 5041–5044, 2009.
9.  Z. He, Z. Liu, L. Jin, L.-X. Zhen, and J.-C. Huang, “Weightlessness feature; a novel feature for single tri-axial accelerometer based activity recognition,” in 19th International Conference on Pattern Recognition, pp. 1–4, 2008.
10.  Y.-P. Chen, J.-Y. Yang, S.-N. Liou, G.-Y. Lee, and J.-S. Wang, “Online classifier construction algorithm for human activity detection using a tri-axial accelerometer,” Applied Mathematics and Computation, vol. 205, no. 2, pp. 849–860, 2008.
11.  C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A Practical Guide to Support Vector Classification,” Department of Computer Science, National Taiwan University, Taipei 106, Taiwan.
12.  A. P. Bradley, “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognition, vol. 30, no. 7, pp. 1145–1159, 1997.
13.  T. Sing, O. Sander, N. Beerenwinkel, and T. Lengauer, “ROCR: visualizing classifier performance in R,” Bioinformatics, vol. 21, no. 20, pp. 3940–3941, 2005.
14.  Ó. D. Lara and M. A. Labrador, “A Survey on Human Activity Recognition using Wearable Sensors,” Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620.
15.  D. J. Hand and R. J. Till, “A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems,” Machine Learning, vol. 45, no. 2, pp. 171–186, 2001.

More Related Content

Viewers also liked (20)

PPTX
Why Your Big Data Project Will Fail, and How to Avoid It
303Computing
 
PPTX
Authentic lesson on hamlet
Dedré Engelbrecht
 
PDF
W209 - Interactivity, Games, and Gamification: A Research-Based Approach to E...
Karl Kapp
 
PPTX
Big Data and Classification
303Computing
 
PPT
ppt.gif
butest
 
PPT
MachineLearning.ppt
butest
 
PPTX
Pseudo-Genetic Machine Learning Algorithm
David Juboor
 
PPTX
Fall detection
Lippo Group Digital
 
PPTX
Fall Detection System for the Elderly based on the Classification of Shimmer ...
Moiz Ahmed
 
PDF
Decision Tree Ensembles - Bagging, Random Forest & Gradient Boosting Machines
Deepak George
 
PPTX
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Parth Khare
 
PDF
Rで学ぶ逆変換(逆関数)法
Nagi Teramo
 
PDF
Action Learning: A Conceptual Introduction
Dr. Greg Waddell
 
PPT
Artificial Intelligence
Muhammad Ahad
 
PPTX
Random forest
Musa Hawamdah
 
PDF
Understanding Random Forests: From Theory to Practice
Gilles Louppe
 
PPTX
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
Edureka!
 
PDF
Azure Machine Learning tutorial
Giacomo Lanciano
 
KEY
Machine Learning on Big Data
Max Lin
 
PDF
Introduction to Machine Learning and Deep Learning
Terry Taewoong Um
 
Why Your Big Data Project Will Fail, and How to Avoid It
303Computing
 
Authentic lesson on hamlet
Dedré Engelbrecht
 
W209 - Interactivity, Games, and Gamification: A Research-Based Approach to E...
Karl Kapp
 
Big Data and Classification
303Computing
 
ppt.gif
butest
 
MachineLearning.ppt
butest
 
Pseudo-Genetic Machine Learning Algorithm
David Juboor
 
Fall detection
Lippo Group Digital
 
Fall Detection System for the Elderly based on the Classification of Shimmer ...
Moiz Ahmed
 
Decision Tree Ensembles - Bagging, Random Forest & Gradient Boosting Machines
Deepak George
 
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Parth Khare
 
Rで学ぶ逆変換(逆関数)法
Nagi Teramo
 
Action Learning: A Conceptual Introduction
Dr. Greg Waddell
 
Artificial Intelligence
Muhammad Ahad
 
Random forest
Musa Hawamdah
 
Understanding Random Forests: From Theory to Practice
Gilles Louppe
 
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
Edureka!
 
Azure Machine Learning tutorial
Giacomo Lanciano
 
Machine Learning on Big Data
Max Lin
 
Introduction to Machine Learning and Deep Learning
Terry Taewoong Um
 

Similar to Machine learning algorithm for classification of activity of daily life’s (20)

PDF
Learning from data
Govind Kanshi
 
PDF
Xavier amatriain, dir algorithms netflix m lconf 2013
MLconf
 
PDF
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
Xavier Amatriain
 
PPTX
Tariku Bokila SVMA Presentation.pptx ddd
TarikuBokila1
 
PDF
A survey on heart stroke prediction
drubosaha
 
PDF
A Survey on Stroke Prediction
MohammadRakib8
 
PDF
Fundamentals of data science presentation
topuri1218
 
PDF
Data analysis_PredictingActivity_SamsungSensorData
Karen Yang
 
PPTX
Classifiers Optimization Using Swarm Algorithms
Aboul Ella Hassanien
 
PDF
A survey of modified support vector machine using particle of swarm optimizat...
Editor Jacotech
 
PPTX
Elderly activity recognition and classification for application in assisted l...
Sara Egidi
 
PPTX
svm.pptx
PriyadharshiniG41
 
PPTX
classification algorithms in machine learning.pptx
jasontseng19
 
PPTX
Support vector machine
zekeLabs Technologies
 
PDF
Big & Personal: the data and the models behind Netflix recommendations by Xa...
BigMine
 
PDF
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Applied Machin...
Thomas Ploetz
 
PDF
Machine Learning Algorithms Introduction.pdf
Vinodh58
 
PPT
Learning On The Border:Active Learning in Imbalanced classification Data
萍華 楊
 
PDF
Chapter 02-logistic regression
Raman Kannan
 
PDF
Application of combined support vector machines in process fault diagnosis
Dr.Pooja Jain
 
Learning from data
Govind Kanshi
 
Xavier amatriain, dir algorithms netflix m lconf 2013
MLconf
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
Xavier Amatriain
 
Tariku Bokila SVMA Presentation.pptx ddd
TarikuBokila1
 
A survey on heart stroke prediction
drubosaha
 
A Survey on Stroke Prediction
MohammadRakib8
 
Fundamentals of data science presentation
topuri1218
 
Data analysis_PredictingActivity_SamsungSensorData
Karen Yang
 
Classifiers Optimization Using Swarm Algorithms
Aboul Ella Hassanien
 
A survey of modified support vector machine using particle of swarm optimizat...
Editor Jacotech
 
Elderly activity recognition and classification for application in assisted l...
Sara Egidi
 
classification algorithms in machine learning.pptx
jasontseng19
 
Support vector machine
zekeLabs Technologies
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
BigMine
 
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Applied Machin...
Thomas Ploetz
 
Machine Learning Algorithms Introduction.pdf
Vinodh58
 
Learning On The Border:Active Learning in Imbalanced classification Data
萍華 楊
 
Chapter 02-logistic regression
Raman Kannan
 
Application of combined support vector machines in process fault diagnosis
Dr.Pooja Jain
 
Ad

More from Siddharth Chakravarty (7)

PDF
au·ton·o·mous Trends | Tech | Future
Siddharth Chakravarty
 
PDF
VaR analysis for Mutual funds
Siddharth Chakravarty
 
PDF
Yelp Valuation
Siddharth Chakravarty
 
PDF
Stock pitch NVIDIA
Siddharth Chakravarty
 
PPTX
Money Laundering
Siddharth Chakravarty
 
PPTX
ETFs- A new investment frontier
Siddharth Chakravarty
 
PPTX
Holt Challenge 2017 - Adidas vs. Puma
Siddharth Chakravarty
 
au·ton·o·mous Trends | Tech | Future
Siddharth Chakravarty
 
VaR analysis for Mutual funds
Siddharth Chakravarty
 
Yelp Valuation
Siddharth Chakravarty
 
Stock pitch NVIDIA
Siddharth Chakravarty
 
Money Laundering
Siddharth Chakravarty
 
ETFs- A new investment frontier
Siddharth Chakravarty
 
Holt Challenge 2017 - Adidas vs. Puma
Siddharth Chakravarty
 
Ad

Recently uploaded (20)

PDF
Blockchain Transactions Explained For Everyone
CIFDAQ
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
PDF
Persuasive AI: risks and opportunities in the age of digital debate
Speck&Tech
 
PDF
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
PDF
Human-centred design in online workplace learning and relationship to engagem...
Tracy Tang
 
PPTX
Building a Production-Ready Barts Health Secure Data Environment Tooling, Acc...
Barts Health
 
PPTX
Extensions Framework (XaaS) - Enabling Orchestrate Anything
ShapeBlue
 
PDF
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
PDF
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
PPT
Interview paper part 3, It is based on Interview Prep
SoumyadeepGhosh39
 
PDF
Windsurf Meetup Ottawa 2025-07-12 - Planning Mode at Reliza.pdf
Pavel Shukhman
 
PDF
HCIP-Data Center Facility Deployment V2.0 Training Material (Without Remarks ...
mcastillo49
 
PPTX
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
PPTX
Top Managed Service Providers in Los Angeles
Captain IT
 
PDF
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
PDF
Français Patch Tuesday - Juillet
Ivanti
 
PPTX
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
PDF
Wojciech Ciemski for Top Cyber News MAGAZINE. June 2025
Dr. Ludmila Morozova-Buss
 
PDF
Predicting the unpredictable: re-engineering recommendation algorithms for fr...
Speck&Tech
 
PDF
TrustArc Webinar - Data Privacy Trends 2025: Mid-Year Insights & Program Stra...
TrustArc
 
Blockchain Transactions Explained For Everyone
CIFDAQ
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
Persuasive AI: risks and opportunities in the age of digital debate
Speck&Tech
 
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
Human-centred design in online workplace learning and relationship to engagem...
Tracy Tang
 
Building a Production-Ready Barts Health Secure Data Environment Tooling, Acc...
Barts Health
 
Extensions Framework (XaaS) - Enabling Orchestrate Anything
ShapeBlue
 
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
Interview paper part 3, It is based on Interview Prep
SoumyadeepGhosh39
 
Windsurf Meetup Ottawa 2025-07-12 - Planning Mode at Reliza.pdf
Pavel Shukhman
 
HCIP-Data Center Facility Deployment V2.0 Training Material (Without Remarks ...
mcastillo49
 
Webinar: Introduction to LF Energy EVerest
DanBrown980551
 
Top Managed Service Providers in Los Angeles
Captain IT
 
Why Orbit Edge Tech is a Top Next JS Development Company in 2025
mahendraalaska08
 
Français Patch Tuesday - Juillet
Ivanti
 
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
Wojciech Ciemski for Top Cyber News MAGAZINE. June 2025
Dr. Ludmila Morozova-Buss
 
Predicting the unpredictable: re-engineering recommendation algorithms for fr...
Speck&Tech
 
TrustArc Webinar - Data Privacy Trends 2025: Mid-Year Insights & Program Stra...
TrustArc
 

Machine learning algorithm for classification of activity of daily life’s

  • 1. Machine learning algorithm for classification of Activity of daily life’s Siddharth Chakravarty
  • 2. Overview ADL- What is it? How is it monitored? Technology landscape Problem statement Approach Results
  • 3. ADL- Activity of Daily life’s A technique for classification of human activities can be a useful tool to not only classify and monitor our activities, but also improve overall quality of life Image Source: connectingcleveland.net/wp-content/uploads/2014/07/Daily_Life_Wallpaper.jpg
  • 4. State of the art technology ADL monitoring 4 •  An accelerometer essentially records the acceleration it experiences is the X,Y and Z direction. •  Accelerometers are the most commonly used type of sensor for activity recognition with wearable sensors and other consumer electronic devices ranging from Iphones to Wii Walking Running Jumping Sweeping Sleeping Time scale Image courtesy: Jawbone
  • 5. Technology landscape Major players •  Fitbit - recently announced next generation device •  Jawbone- acquired bodymedia to complement it’s technology •  Microsoft •  Samsung •  Pebble •  Misfit– Acquired by Fossil •  Facebook- acquired Finland-based fitness app maker Protogeo Many more…. Clearly there is a need for accurately monitor and classify ADL’s not just for recreation, but also for other applications. Heart rate Walking Running Sleeping ADL Heart rate monitors X Pedometers X X X Phone X X Wearable X X X X
  • 6. Problem statement 6 The objective of the project was to develop machine learning algorithm using SVM to predict ADL’. Emerging Trends in Data Analytics Image courtesy SARK7 The automatic recognition of a set of Activities of Daily Living, is among the most challenging research fields in Ambient Intelligence. Main challenge with wearable technology is classification of use case ADL’
  • 7. Data science toolkit What’ the problem statement? Prediction ? Classification? Linear regression K-means Hierarchical clustering Logistic regression Decision Trees Support vector machines Logistic regression Random forestDecision Trees Clustering Clustering ? Support vector machines SVM is a most popular and efficient classification and regression method. Currently four R packages contain SVM related software. For this project the e1071 R package was chosen that supports multi-level classification problems
  • 8. The Dataset $Dataset for ADL Recognition with Wrist-worn Accelerometer •  The Dataset for ADL Recognition with Wrist-worn Accelerometer is a public collection of labeled accelerometer data recordings to be used for the creation and validation of acceleration models of simple ADL . •  It was provided by UCI machine learning repository, Center for Machine learning and Intelligent systems. •  The data was collected by using single tri-axial accelerometer attached to the right-wrist of the volunteer. It was carried out by Barbara Bruno, Fulvio Mastrogiovanni, Antonio Sgorbissa from the Laboratory for Ambient Intelligence and Mobile Robotics ,DIBRIS, University of Genova •  The Dataset composed of the recordings of 11 simple ADL performed by a total of 16 volunteers. Gender Age Weight M F Min Avg. Max Min Avg. Max 11 5 19 81 57.4 56 85 72.7 Problem statement $Data Data Wrangling Data Exploration Build Models Testing and classification brush_teeth getup_bed walk climb_stairs liedown_bed comb_hair pour_water descend_stairs sitdown_chair drink_glass standup_chair Wrist-worn Accelerometer Image courtesy: Chalkbeat Colorado
  • 9. Overview of dataset Accelerometer-2011-04-11-13-28-18-brush_teeth-f1.txt -  Refers to an accelerometer recording that was taken on March 11, 2011, starting from 13:28.18 p.m. -  The recording refers to the HMP ”brush_teeth" executed by the volunteer with ID "f1". Provides equations to convert raw acceleration into real acceleration values and reduce noise Raw data from accelerometer for the task “brush_teeth”. The columns represents the acceleration measured in X,Y and Z direction. Each folder contains raw data for the activity.Each file in the dataset has the following naming convention: Accelerometer-[START_TIME]- [ADL]-[VOLUNTEER]
  • 10. Feature extraction •  A key challenge for any classification ML algorithm is feature extraction -.i.e. unique parameters that distinguishes each class (Activity) Ax Ay Az Ax Ay Az Drink glass Ax Ay Az Pour water Pour waterClimb stairs
  • 11. Feature extraction approaches Group Methods Time domain Mean 1, Std. Deviation 2, Variance 3, MAD 4, Entropy 5 Frequency domain Fast Fourier transform 6, Discrete cosine transform 7 Other Principal component analysis 8, Linear discriminant analysis 9, Singular value decomposition •  The ML algorithm implemented for this project uses Discrete cosine transformation and Singular value decomposition approaches to classify ADL. Data wrangling •  Consolidate into a single CSV file. •  Convert raw data into acceleration and apply filter •  Calculate components of SVD: uvd •  Sum vector magnitude (VM) =√(ax)2+(ay)2+(az)2 Data exploration •  VM and SVD histograms •  Scale data •  Under-sample and Oversample data •  Split data Build and select model •  Build model in SVM using training data •  Calculate values for Cost and Gamma •  Cross validate model on training data. Validate model •  Use test data to validate model. •  F measure •  Accuracy •  Sensitivity •  Selectivity
  • 12. CONSOLIDATING INFORMATION •  The original data comprises 479,288 observations of ax, ay, az, distributed among the 11 Activities •  The data is consolidated to 827 observations that contribute to maximum variance in the data. §  VM refers to sum of vector magnitude §  SVD 1 and SVD2 refer to d1 and d2 components of Singular value decomposition §  Act refers to the activities
  • 13. What does the data look like? Problem stateme nt $Data Data Wrangling Data Exploration Build Models Testing and classificatio n Histogram of VM Histogram of SVD1 Histogram of SVD2 Two models are evaluated 1.  Model 1 uses VD as the dependent variable to predict ADL 2.  Model 2 uses VD and SVD to predict ADL
  • 14. DATA SAMPLING •  The original data set is unbalanced with fewer data points for activities such as brush_teeth, comb_hair, and liedown_bed •  The SMOTE function is used to handle unbalanced classification problem. It generates synthetic data sets that addresses the class unbalance problem. •  Following which the data set is split into training (70%) and test data (30%)
  • 15. Data after balancing and scaling Histogram of VM Histogram of SVD1 Histogram of SVD2 Scaling avoids attributes in greater numeric ranges dominating those in smaller numeric ranges. Another advantage is it reduces numerical difficulties during the calculation
  • 16. Tuning of SVM parameters- for Model 1(VD only) •  RBF kernel is chosen as it can handle non- linear relations between class and attributes •  Cost (C) and Gamma (γ) parameters are the two key parameters for the SVM model •  Initial values of cost and gamma for the SVM model are estimated using the “tune function”, and running a coarse grid search. •  Following the initial coarse grid search, the value of cost and gamma were fine tuned •  The tune function implemented uses a 10 fold cross validation •  Parameter tuning of ‘svm’: –  Sampling method: 10-fold cross validation –  Best parameters: Ø Gamma:20 Ø  cost:900 –  Best performance: 0.715 (1 – error) •  The model performance is first evaluated on training data set, followed by test data. Error Gamma Cost Contour map showing the error rate for different cost and gamma values
  • 17. Model performance ROC curves and Confusion matrix (CM) fpr tpr fpr tpr ROC and CM for train data set ROC and CM for test data set TRAINING CONFUSION MATRIX Reference Prediction brush_ teeth climb_ stairs comb_ hair Descend _stairs drink_ glass getup_ bed liedown_ bed pour_ water sitdown_ chair Standup _chair walk 85 brush_teeth 85 0 0 0 0 0 0 0 0 0 0 138 climb_stairs 0 118 0 13 5 9 2 7 0 1 0 47 comb_hair 0 0 42 0 0 0 0 0 0 0 0 60 descend_stairs 0 2 0 35 0 5 0 3 0 0 0 135 drink_glass 0 2 0 1 123 1 0 2 1 1 0 153 getup_bed 0 3 0 3 5 104 2 9 1 2 0 40 liedown_bed 0 2 0 0 1 2 35 0 0 0 0 147 pour_water 0 14 0 7 5 7 2 120 2 1 2 136 sitdown_chair 0 0 0 1 0 1 0 0 129 20 0 144 standup_chair 0 0 0 1 0 0 1 0 12 120 0 161 walk 0 0 4 0 0 2 0 0 0 0 164 TEST DATA SET CONFUSION MATRIX Reference Prediction brush_ teeth climb_ stairs comb_ hair descend_ stairs drink_ glass getup_ bed liedown_ bed pour_ water sitdown_ chair Standup _chair walk 36 brush_teeth 34 0 0 0 0 0 0 0 0 0 0 60 climb_stairs 0 42 0 4 2 5 0 6 0 0 1 19 comb_hair 0 0 14 0 2 0 0 0 0 0 1 26 descend_stairs 0 1 0 10 2 2 0 2 0 0 0 59 drink_glass 0 2 1 1 41 3 0 3 2 2 0 56 getup_bed 0 1 0 2 9 33 0 7 1 0 0 18 liedown_bed 0 4 0 0 0 3 15 1 2 0 0 60 pour_water 0 6 0 7 1 7 0 41 1 0 3 62 sitdown_chair 0 1 0 1 0 1 0 0 47 11 0 62 standup_chair 0 2 0 0 1 0 3 0 9 49 0 71 walk 2 1 4 1 1 2 0 0 0 0 66
  • 18. VD model test summary
    •  The initial results using VD to predict ADLs were satisfactory, but performance on the test data set deviated from performance on the training data set.
    •  The training model performance could be further improved by tuning the values of cost and gamma, i.e. by using higher cost and gamma values.
    •  Prior to implementing Model 2, we need to check for collinearity (strong correlation between two or more predictor variables).
    •  The vif function is used to check for collinearity among the three variables.
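The collinearity check can be illustrated with a from-scratch variance inflation factor (VIF) calculation. This is a sketch in Python rather than the R vif function mentioned on the slide, and it relies on the standard identity that the VIF of predictor j equals the j-th diagonal entry of the inverse of the predictors' correlation matrix. The helper names and the two small data sets are hypothetical.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    devs = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in devs]
    k = len(columns)
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (fine for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Hypothetical data: three mutually uncorrelated predictors (VIF close to 1).
uncorrelated = [[1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
# Hypothetical data: the second column nearly duplicates the first (large VIF).
collinear = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 11], [5, 3, 4, 1, 2]]

print(vif(uncorrelated))
print(vif(collinear))
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity, though the threshold is a convention rather than a hard rule.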
  • 19. Tuning of SVM parameters: Model 2 (VD and SVM)
    •  The initial values of cost and gamma for the SVM model are estimated with the “tune” function by running a coarse grid search.
    •  Following the coarse grid search, the values of cost and gamma were fine-tuned.
    •  Each set of cost and gamma values is cross-validated (10-fold).
    •  Parameter tuning of ‘svm’:
       –  Sampling method: 10-fold cross-validation
       –  Best parameters: gamma = 18, cost = 800
       –  Best performance: 0.863 (1 – error)
    Figure: contour map of the error for different cost and gamma values (axes: gamma and cost).
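The 10-fold cross-validation applied to each (cost, gamma) pair can be sketched as below. This is an illustrative Python outline, not the tune function itself: `kfold_indices` and `cross_val_error` are hypothetical helpers, and a trivial majority-class classifier stands in for the SVM so that the example needs no external library.

```python
from collections import Counter


def kfold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_error(X, y, fit, predict, k=10):
    """Average misclassification rate over k held-out folds."""
    fold_errors = []
    for held_out in kfold_indices(len(y), k):
        held = set(held_out)
        train_idx = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        wrong = sum(predict(model, X[i]) != y[i] for i in held_out)
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / k


# Stand-in classifier (assumption): always predicts the most frequent training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Hypothetical toy data: features are unused by the majority classifier.
X_toy = [[float(i)] for i in range(20)]
y_toy = ["walk"] * 15 + ["sitdown_chair"] * 5

cv_error = cross_val_error(X_toy, y_toy, fit_majority, predict_majority, k=10)
print("10-fold CV error:", cv_error)
```

The folds here are contiguous for determinism; in practice the samples would normally be shuffled (or stratified by class) before splitting, and `fit` would train an SVM with the cost and gamma under evaluation.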
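  • 20. ROC curves and confusion matrix (CM)
    Figures: ROC curves (tpr vs. fpr) and CM for the train data set and for the test data set.

    TEST DATA SET (rows: prediction; columns: reference; first column: count listed beside each class on the slide)
    (count) | Prediction | brush_teeth | climb_stairs | comb_hair | descend_stairs | drink_glass | getup_bed | liedown_bed | pour_water | sitdown_chair | standup_chair | walk
    36 | brush_teeth | 35 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
    58 | climb_stairs | 0 | 49 | 0 | 0 | 0 | 1 | 0 | 2 | 1 | 1 | 3
    15 | comb_hair | 0 | 0 | 15 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
    25 | descend_stairs | 0 | 1 | 0 | 23 | 0 | 0 | 0 | 0 | 0 | 1 | 0
    57 | drink_glass | 0 | 0 | 0 | 0 | 56 | 3 | 0 | 7 | 0 | 0 | 0
    65 | getup_bed | 0 | 0 | 0 | 0 | 1 | 53 | 0 | 3 | 2 | 0 | 1
    16 | liedown_bed | 0 | 0 | 0 | 0 | 0 | 0 | 16 | 0 | 1 | 0 | 0
    62 | pour_water | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 46 | 0 | 0 | 0
    57 | sitdown_chair | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 66 | 0 | 0
    61 | standup_chair | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 3 | 47 | 0
    69 | walk | 1 | 1 | 0 | 0 | 1 | 6 | 1 | 0 | 3 | 2 | 73

    TRAINING DATA SET (rows: prediction; columns: reference; first column: count listed beside each class on the slide)
    (count) | Prediction | brush_teeth | climb_stairs | comb_hair | descend_stairs | drink_glass | getup_bed | liedown_bed | pour_water | sitdown_chair | standup_chair | walk
    85 | brush_teeth | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
    138 | climb_stairs | 0 | 152 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
    47 | comb_hair | 0 | 0 | 50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
    60 | descend_stairs | 0 | 0 | 0 | 54 | 0 | 0 | 0 | 0 | 0 | 0 | 0
    135 | drink_glass | 0 | 0 | 0 | 0 | 154 | 0 | 0 | 0 | 0 | 0 | 0
    153 | getup_bed | 0 | 0 | 0 | 0 | 0 | 134 | 0 | 0 | 0 | 0 | 0
    40 | liedown_bed | 0 | 0 | 0 | 0 | 0 | 0 | 36 | 0 | 0 | 0 | 0
    147 | pour_water | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 127 | 0 | 0 | 0
    136 | sitdown_chair | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 140 | 0 | 0
    144 | standup_chair | 0 | 0 | 0 | 13 | 0 | 0 | 0 | 0 | 0 | 146 | 0
    161 | walk | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 153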
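The fpr/tpr curves referred to on this slide can be reproduced in principle with a one-vs-rest ROC computation per activity. The sketch below is a plain-Python illustration under stated assumptions: it takes a decision score per sample for one class (for example "walk" against all other activities), sweeps the threshold, and integrates the curve with the trapezoid rule. The function names, scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) points for a binary problem; labels are 1 (class) or 0 (rest)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples sharing a score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for one activity versus the rest.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_curve = roc_points(toy_scores, toy_labels)
print(toy_curve)
print("AUC:", auc(toy_curve))
```

For an 11-class problem this would be repeated once per activity; the averaged pairwise generalisation by Hand and Till is a separate technique and is not implemented here.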
  • 21. Summary
    •  The results from the study demonstrate an improved method for detecting ADLs.
    •  The VD and SVM factors, when used in conjunction, help to classify activities that would otherwise be misclassified.
    •  The study needs to be extended to other publicly available ADL data sets.
    •  While publicly available data sets will help tune and validate the model, the efficacy of the model can only be validated using data sets from the industry.
    •  Machine learning algorithms have a wide range of applications in this setting:
       –  VR band
       –  Health monitoring
       –  Activity monitoring
       –  Quality of activity, not just quantity
       –  Remote point-of-care devices
       –  Fall sensors
  • 22. References
    1.  J. Parkka, M. Ermes, P. Korpipaa, J. Mantyjarvi, J. Peltola, and I. Korhonen, “Activity classification using realistic data from wearable sensors,” IEEE Transactions on Information Technology in Biomedicine, vol. 10, no. 1, pp. 119–128, 2006.
    2.  E. M. Tapia, S. S. Intille, W. Haskell, K. Larson, J. Wright, A. King, and R. Friedman, “Real-time recognition of physical activities and their intensities using wireless accelerometers and a heart monitor,” in International Symposium on Wearable Computers, 2007.
    3.  O. D. Lara, A. J. Perez, M. A. Labrador, and J. D. Posada, “Centinela: A human activity recognition system based on acceleration and vital sign data,” Journal on Pervasive and Mobile Computing, 2011.
    4.  Y.-P. Chen, J.-Y. Yang, S.-N. Liou, G.-Y. Lee, and J.-S. Wang, “Online classifier construction algorithm for human activity detection using a tri-axial accelerometer,” Applied Mathematics and Computation, vol. 205, no. 2, pp. 849–860, 2008.
    5.  M. Ermes, J. Parkka, and L. Cluitmans, “Advancing from offline to online activity recognition with wearable sensors,” in Engineering in Medicine and Biology Society, 30th Annual International Conference of the IEEE, pp. 4451–4454, 2008.
    6.  L. Bao and S. S. Intille, “Activity recognition from user-annotated acceleration data,” in Pervasive, pp. 1–17, 2004.
    7.  K. Altun and B. Barshan, “Human activity recognition using inertial/magnetic sensor units,” in Human Behavior Understanding, Lecture Notes in Computer Science, pp. 38–51, Springer Berlin / Heidelberg, 2010.
    8.  Z. He and L. Jin, “Activity recognition from acceleration data based on discrete cosine transform and SVM,” in IEEE International Conference on Systems, Man and Cybernetics, pp. 5041–5044, 2009.
    9.  Z. He, Z. Liu, L. Jin, L.-X. Zhen, and J.-C. Huang, “Weightlessness feature; a novel feature for single tri-axial accelerometer based activity recognition,” in 19th International Conference on Pattern Recognition, pp. 1–4, 2008.
    10.  Y.-P. Chen, J.-Y. Yang, S.-N. Liou, G.-Y. Lee, and J.-S. Wang, “Online classifier construction algorithm for human activity detection using a tri-axial accelerometer,” Applied Mathematics and Computation, vol. 205, no. 2, pp. 849–860, 2008.
    11.  C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A Practical Guide to Support Vector Classification,” Department of Computer Science, National Taiwan University, Taipei 106, Taiwan.
    12.  A. P. Bradley, “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognition, vol. 30, no. 7, pp. 1145–1159, 1997.
    13.  T. Sing, O. Sander, N. Beerenwinkel, and T. Lengauer, “ROCR: visualizing classifier performance in R,” Bioinformatics, vol. 21, no. 20, pp. 3940–3941, 2005.
    14.  Ó. D. Lara and M. A. Labrador, “A Survey on Human Activity Recognition using Wearable Sensors,” Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620.
    15.  D. J. Hand and R. J. Till, “A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems,” Machine Learning, vol. 45, no. 2, pp. 171–186, 2001.