Thesis presentation

Feature Extraction and Selection for Background Modeling
and Foreground Detection
Caroline Pacheco do E. Silva
MIA, Universit´e de La Rochelle
Ph.D. European Label
Supervisors: Carl Fr´elicot (Pr) and Thierry Bouwmans (MCF-HDR)
May 10, 2017 - La Rochelle, France
1 / 66

Summary
Introduction
Purpose of background subtraction.
Background subtraction: Process / Applications / Challenges.
Background subtraction methods / Visual features.
Role and the importance of features.
Contributions for background subtraction.
A novel texture descriptor (XCS-LBP).
Two ensemble learning approaches (pixel-based and superpixel-based) for feature
selection.
Collaborative external research for the Ph.D European Label.
A novel joint color-texture descriptor for dynamic texture recognition.
Conclusions and future perspectives.
2 / 66

Purpose of a background subtraction algorithm
Background Subtraction (BS) is a set of methods that aim to differentiate the
moving objects in the scene from background
Video input Moving object detection result.
3 / 66

Background subtraction process
How it works?
Ini$alize
background model
Background model
maintenance
Foreground
detec$on
Model
First frames
Background subtrac$on model
Detec$on moving objects
The main process of traditional background subtraction algorithm.
4 / 66

Applications
Background subtraction is often the ﬁrst step in many computer vision
applications
Trafﬁc monitoring Intrusion detection
Detection of free parking spaces.
5 / 66

Background subtraction challenges
Baseline
Shadow
Bad weather
Thermal
Dynamic background
Camera jitter
Intermittent object motion
Turbulence
Low framerate
Night scenes
PTZ cameras
PTZ cameras
Night scenes
Most investigated (“Solved”) Investigated (Medium) Less investigated (“Unsolved”)
* Pierre-Marc Jodoin. Motion Detection: Unsolved Issues and [Potential] Solutions. Scene
Background Modeling and Initialization (SBMI), ICIAP, 2015. 6 / 66

Background subtraction methods
A large number of algorithms have been proposed for background subtraction over the
last few years [Sobral and Vacavant, 2014], [Bouwmans, 2014], [Xu et al., 2016]:
Basic methods (i.e.
[Cucchiara et al., 2001])
Statistical methods (i.e.
[Stauffer and Grimson, 1999])
Non-parametric methods (i.e.
[Elgammal et al., 2000])
Fuzzy based methods (i.e.
[El Baf et al., 2008])
Neural and neuro-fuzzy methods (i.e.
[Maddalena and Petrosino, 2012])
At the present time, no
algorithm seems to be able to
simultaneously address
different challenges found in
real environments.
The majority of BS methods are focused on sophisticated learning models, while
visual features have been received relatively little attention.
7 / 66

Common visual features used in background subtraction
Sensor-based features
Color and intensity features: Very discriminative, but they have several
limitations in the presence of illumination changes, camouflage and shadows.
Depth features: Two cameras are needed to obtain the disparity or the depth. The
depth features deal with the camouflage.
Computed-based features (Image transformation)
Edge features: Handle the local illumination changes, but also the ghost leave
when waking foreground objects begin to move.
Motion features: Usually obtained via optical flow. Main drawback is its
computation time.
Texture features: Suitable to illumination changes and to shadows.
Most used Our focus
8 / 66

Summary
Introduction
Role and the importance of features
Contributions for background subtraction
selection.
Collaborative external research for the Ph.D European Label
Conclusions and future perspectives
9 / 66

Can features help improve the segmentation of moving objects? If so, why?
Dynamic background
Motion features?
Camouﬂage
Depth features?
The suitable choice of features in background modeling can improve the segmentation of
moving objects, however, the properties of each feature must be taken into consideration.
10 / 66

Remarks from the literature [Bouwmans, Silva et al., 2016]
Not focused on robust features.
One feature for the whole scene.
11 / 66

Remarks from the literature [Bouwmans et al., 2016]
Ph.D. motivations
Develop new robust features that improve BS.
Use feature selection to ﬁnd the best feature subset that improves BS.
12 / 66

Remarks from the literature [Bouwmans et al., 2016]
Ph.D. motivations
Develop new robust features that improve BS.
Use feature selection to ﬁnd the best feature subset that improves BS.
Our proposals
A novel texture-based descriptor, called eXtended Center-Symmetric Local
Binary Pattern (XCS-LBP)
Two ensemble learning approaches for feature selection to select suitable
features.
13 / 66

Summary
Introduction
selection.
14 / 66

Ordinary LBP [Heikkil¨a et al., 2004]
s(g0 − gc )20 +
s(g1 − gc )21 +
s(g2 − gc )22 +
s(g3 − gc )23 +
s(g4 − gc )24 +
s(g5 − gc )25 +
s(g6 − gc )26 +
s(g7 − gc )27
gc
g6
gc
g7
g0
g1
g2
g3
g4
g5g5
g6
gC
s(g0 − gc )20 +
s(g1 − gc )21 +
s(g2 − gc )22 +
s(g3 − gc )23 +
s(g4 − gc )24 +
s(g5 − gc )25 +
s(g6 − gc )26 +
s(g
7
− g
c
)27
Main advantage: robust to illumination variations.
Challenging situations: shadows, noises, and dynamic scenes.
15 / 66
LBPP ,R = ∑P−1
i=0 s (gi −gc )2i

LBP variants
Ordinary LBP-based (#9 papers): consists of the variants with small change in
its thresholding scheme from ordinary LBP.
Center-Symmetric LBP-based (#3 papers): are based on descriptions which
generates more compact binary patterns by working only with the
center-symmetric pairs of the pixels. (Our focus)
Ternary LBP-based (#5 papers): are robust for local noises by introducing a
small tolerative range.
Spatio-Temporal LBP-based (#3 papers): are variants that extend the ordinary
LBP from spatial domain to spatio-temporal domain.
Hybrid LBP-based (#4 papers): combine two or more characteristics of the
above categories, which usually results in a descriptor even more powerful.
16 / 66

Center-Symmetric LBP (CS-LBP) descriptors [Heikkil¨a et al., 2009]
g6
g7
g0
g1
g2
g3
g4
g5g5
g6
s(g0 − g4)20+
s(g1 − g5)21+
s(g2 − g6)22+
s(g3 − g7)23
Main advantage: more compact binary patterns.
Limitations: not sufﬁciently robust for BS.
17 / 66
CS −LBPP,R (c) = ∑
(P/2)−1
i=0 s(gi −gi+(P/2))2i
center-symmetric pairs of pixels

eXtended Center-Symmetric Local Binary Pattern (XCS-LBP) descriptor [Silva et al., 2015]
Proposed descriptor:
gc
g6
gc
g7
g0
g1
g2
g3
g4
g5g5
g6
gC
g6
g7
g0
g1
g2
g3
g4
g5g5
g6 gc
gc
gc
gc
s[( g0 − g4 )+ gc +( g0 − gc ) · (gc − g4 )] 20 +
s[( g1 − g5 )+ gc +( g1 − gc ) · (gc − g5 )] 21 +
s[( g2 − g6 )+ gc +( g2 − gc ) · (gc − g6 )] 22 +
s[( g3 − g7 )+ gc +( g3 − gc ) · (gc − g7 )] 23
1 CS-LBP + central pixel.
2 Product of the difference between the symmetric pixels and the central pixel.
Main advantages: less sensitive to noisy pixels and produces a short histogram.
18 / 66
XCS −LBPP,R (c) = ∑
(P/2)−1
i=0 s (g1(i,c)+g2(i,c))2i

Experimental results [Silva et al., 2015]
Compared Descriptors
Ordinary LBP [Ojala et al., 2002],
CS-LBP [Heikkil¨a et al., 2009],
CS-LDP [Xue et al., 2011] and
XCS-LBP [Silva et al., 2015].
Popular BS Methods
Adaptive Background Learning (ABL) (also know as Running Average).
Gaussian Mixture Models (GMM).
Dataset
The BMC (Background Models Challenge) dataset of Vacavant et al. (2012) was
chosen, because it contains several videos of outdoor situations (urban scenes).
19 / 66

Validation / Qualitative results [Silva et al., 2015]
Background subtraction results using the ABL method on synthetic scenes
Rotary (frame #1140) – scenes 122, 222, 322, 422 and 522
Original frame
Ground truth
LBP [Ojala et al., 2002]
CS-LBP [Heikkil¨a et al., 2009]
CS-LDP [Xue et al., 2011]
XCS-LBP [Silva et al., 2015]
Different weather conditions: cloudy, sunny, foggy, wind with noise.
20 / 66

Validation / Qualitative results [Silva et al., 2015]
Background subtraction results using the GMM method on synthetic scenes
Street (frame #301) – scenes 112, 212, 312, 412 and 512
Original frame
Ground truth
LBP [Ojala et al., 2002]
CS-LBP [Heikkil¨a et al., 2009]
CS-LDP [Xue et al., 2011]
XCS-LBP [Silva et al., 2015]
Different weather conditions: cloudy, sunny, foggy, wind with noise.
21 / 66

Validation / Quantitative results [Silva et al., 2015]
Table: Performance of the different descriptors on synthetic videos of the BMC using the ABL
method.
Scenes Descriptor Recall Precision F-score
Rotary 122
LBP [Ojala et al., 2002] 0.682 0.564 0.618
CS-LBP [Heikkil¨a et al., 2009] 0.832 0.520 0.640
CS-LDP [Xue et al., 2011] 0.809 0.523 0.635
XCS-LBP [Silva et al., 2015] 0.850 0.784 0.816
Rotary 222
LBP [Ojala et al., 2002] 0.611 0.505 0.553
CS-LDP [Xue et al., 2011] 0.753 0.510 0.608
Rotary 322
LBP [Ojala et al., 2002] 0.603 0.505 0.550
CS-LDP [Xue et al., 2011] 0.733 0.507 0.600
Rotary 422
LBP [Ojala et al., 2002] 0.573 0.502 0.535
CS-LDP [Xue et al., 2011] 0.733 0.508 0.600
Rotary 522
LBP [Ojala et al., 2002] 0.610 0.505 0.553
CS-LDP [Xue et al., 2011] 0.745 0.509 0.605
22 / 66

Validation / Quantitative results [Silva et al., 2015]
Table: Performance of the different descriptors on synthetic videos of the BMC using the GMM
method.
Scenes Descriptor Recall Precision F-score
Street 112
LBP [Ojala et al., 2002] 0.940 0.674 0.785
CS-LDP [Xue et al., 2011] 0.938 0.656 0.772
Street 212
LBP [Ojala et al., 2002] 0.676 0.642 0.659
CS-LDP [Xue et al., 2011] 0.694 0.577 0.630
Street 312
LBP [Ojala et al., 2002] 0.684 0.633 0.657
CS-LDP [Xue et al., 2011] 0.729 0.581 0.647
Street 412
LBP [Ojala et al., 2002] 0.619 0.566 0.591
CS-LDP [Xue et al., 2011] 0.659 0.539 0.593
Street 512
LBP [Ojala et al., 2002] 0.662 0.566 0.610
CS-LDP [Xue et al., 2011] 0.689 0.551 0.612
23 / 66

Validation / Speed comparison [Silva et al., 2015]
MacBook Pro (OS X 10.9.4, 2.2 GHz Intel Core i7 and 8 GB - 1333 MHz DDR3)
with MATLAB R2013a.
Elapsed CPU times needed to segment the foreground masks by ABL and GMM
methods, averaged over nine real videos of BMC dataset.
The reference is the fastest descriptor (original LBP), and the times are
divided by LBP ones.
XCS-LBP shows slightly better time performance than both CS-LBP and
CS-LDP.
Table: Elapsed CPU times (averaged on the nine real-world videos of the BMC) over LBP times
Descriptor CS-LBP CS-LDP XCS-LBP
ABL 1.10 1.12 1.09
GMM 1.06 1.07 1.05
24 / 66

Summary
Introduction
A novel texture descriptor (XCS-LBP)
selection.
25 / 66

Why is it interesting today?
X1
X2
X3
X4X5
Each region can be represented by different features such as: texture, color,
texture-color, motion and edge.
Most of BS methods
Use a single feature for the whole scene.
Key challenges
A deep knowledge of the scene is needed.
Most of the relevant features can be automatically selected by feature selection.
26 / 66

Approaches for feature selection
Traditional approaches:
Filter-based evaluates the relevance of the features based on a statistical measure
estimated directly from the data.
Wrapper-based employs a classification algorithm as a “black box” for selecting a set
of relevant features.
Embedded-based the feature selection is incorporated as part of the classification
algorithm.
Ensemble for feature selection provides a powerful tool to combine a set of
models [Bolón-Canedo et al., 2014]. (Our focus)
27 / 66

Previous BS works based on feature selection approaches
Methods Authors/Date Strategy Level Features
Traditional
[Li et al., 2004] Bayes decision rule Pixel RGB, gradient, and color co-occurrence
[Javed et al., 2015] Means and variances criterion Region RGB, gray, LBP, gradients, and HOG
[Braham and Van Droogenbroeck, 2015] Performance metric Pixel RGB, HSV, and YCbCr
Ensemble-based
[Grabner et al., 2006] AdaBoost Region Haar-like features, HOG, and LBP
[Parag et al., 2006] RealBoost Pixel RGB, gray, and gradients
[Grabner et al., 2008] AdaBoost Region Haar-like features
[Klare and Sarkar, 2009] Ensemble of Mixture of Gaussians Pixel RGB, gradients, and Haar-like features.
Most of these works have used multi-class approaches. However, the BS can be
considered an one-class classiﬁcation (OCC) problem.
Usually only exemplars of one-class elements are available (i.e. the background
component is always present), whereas the other classes are unknown (i.e.
foreground objects can appear/disappear several times in the scene).
28 / 66

Online weighted one-class random subspace (OWOC-RS) ensemble for feature selection
Step 1. Generating multiple base models
Frames containing only
background scene
29 / 66

background scene
Features set (p) Features set (p)
30 / 66

background scene
weighted random
subspace
p*1 ⊂ p
P*2 ⊂ p
p*3 ⊂ p
p*M ⊂ p
ω1
ω3
ω2

ωM
31 / 66

background scene
weighted random
subspace
p*1 ⊂ p
P*2 ⊂ p
p*3 ⊂ p
p*M ⊂ p
ω1
ω3
Ψ1
Ψ2
Ψ3
build M base
classiﬁers
ω2

ΨM
ωM
32 / 66

background scene
weighted random
subspace
p*1 ⊂ p
P*2 ⊂ p
p*3 ⊂ p
p*M ⊂ p
ω1
ω3
Ψ1
Ψ2
Ψ3
build M base
classiﬁers
ω2

ΨM
ωM
M background
models
Training step
33 / 66

Step 2. Adaptive Importance (AI)
Frames containing background/
foreground scene
Final
predic6on
Ψ1
ᵦ1

Ψ2
ΨL
L best base classiﬁers
Adap*ve Importance
Computa*on (ᵦ)
ground truth data
ᵦ2

ᵦL

Training step
Accuracy(Ψl ) = 1 −errorl (classiﬁcation error).
34 / 66

Step 3. Background/foreground separation
Final
predic,on
Ψ1
ᵦ1

Ψ2
ΨL
heuris,c model
update
ᵦ2

ᵦL

Threshold the weighted sum of the output of the best base classiﬁers.
35 / 66

Experimental results [Silva et al., 2016]
Dataset
MVS dataset [Benezeth et al., 2014] which consists of a set of 5 video sequences
containing 7 multispectral and color video sequence (RGB).
Parameter settings
The pool of classifiers was homogeneous and consisted of 10 base classifiers of
the same type (IWOC-SVM with RBF kernel).
6 kind of features: Color features (R,G,B, H,S,V and gray-scale), texture feature
(XCS-LBP), color-texture (OC-LBP), edge feature (gradient orientation and
magnitude), motion feature (optical flow) and multispectral features (7 spectral
narrow bands).
36 / 66

Qualitative results [Silva et al., 2016]
MVS dataset
Original Frame Ground truth OWOC-RS [Silva et al., 2016]
Figure: The true positives (TP) pixels are in white, true negatives (TN) pixels in black, false
positives (FP) pixels in red and false negatives (FN) pixels in green.
MVS scenes: dynamic background, illumination changes, camouﬂage effects and intermittent object motion.
37 / 66

Quantitative results [Silva et al., 2016]
Table: Performance of the different methods using the MVS dataset.
Videos Method Precision Recall F-score
Scene
01
MD (RGB)[Benezeth et al., 2014] 0.6536 0.6376 0.6536
MD (MSB)[Benezeth et al., 2014] 0.7850 0.8377 0.8105
Pooling (MSB)[Benezeth et al., 2014] 0.7475 0.8568 0.7984
Proposed 0.8500 0.9580 0.9008
Scene
02
Proposed 0.8277 0.8245 0.8727
Scene
03
Proposed 0.9056 0.9953 0.9483
Scene
04
MD(RGB)[Benezeth et al., 2014] 0.8402 0.7929 0.8158
Proposed 0.9534 0.8374 0.8997
Scene
05
Proposed 0.7316 0.8392 0.8400
*MD = Mahalanobis distance
38 / 66

Importance of the features [Silva et al., 2016]
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Importance
Gray
Red
Green
Blue
Hue
Saturation
Value
XCS−LBP
OCLBPRR
OCLBPGG
CLBPBB
OCLBPRG
OCLBPRB
OCLBPGB
GradientX
GradientY
GradientMagnitude
GradientDirection
OpticalFlow
MS1
MS2
MS3
MS4
MS5
MS6
MS7
most (+): Gradient Direction less (-): OCLBP-GB
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Importance
Gray
Red
Green
Blue
Hue
Saturation
Value
XCS−LBP
OCLBPRR
OCLBPGG
CLBPBB
OCLBPRG
OCLBPRB
OCLBPGB
GradientX
GradientY
GradientMagnitude
GradientDirection
OpticalFlow
MS1
MS2
MS3
MS4
MS5
MS6
MS7
most (+): OCLBP-BB,RR,RG and less (-): Multispectral and color features
39 / 66

0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Importance
Gray
Red
Green
Blue
Hue
Saturation
Value
XCS−LBP
OCLBPRR
OCLBPGG
CLBPBB
OCLBPRG
OCLBPRB
OCLBPGB
GradientX
GradientY
GradientMagnitude
GradientDirection
OpticalFlow
MS1
MS2
MS3
MS4
MS5
MS6
MS7
most (+): MS1,MS2 and MS6 with Color, Gradient X features less (-): XCS-LBP and MS4
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Importance
Gray
Red
Green
Blue
Hue
Saturation
Value
XCS−LBP
OCLBPRR
OCLBPGG
CLBPBB
OCLBPRG
OCLBPRB
OCLBPGB
GradientX
GradientY
GradientMagnitude
GradientDirection
OpticalFlow
MS1
MS2
MS3
MS4
MS5
MS6
MS7
most (+): OCLBP-GG,RR less (-): Hue, Optical ﬂow and multispectral features
40 / 66

0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Importance
Gray
Red
Green
Blue
Hue
Saturation
Value
XCS−LBP
OCLBPRR
OCLBPGG
CLBPBB
OCLBPRG
OCLBPRB
OCLBPGB
GradientX
GradientY
GradientMagnitude
GradientDirection
OpticalFlow
MS1
MS2
MS3
MS4
MS5
MS6
MS7
most (+): MS1,MS2 and MS6 with Color, Gradient X features less (-): XCS-LBP and MS4
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Importance
Gray
Red
Green
Blue
Hue
Saturation
Value
XCS−LBP
OCLBPRR
OCLBPGG
CLBPBB
OCLBPRG
OCLBPRB
OCLBPGB
GradientX
GradientY
GradientMagnitude
GradientDirection
OpticalFlow
MS1
MS2
MS3
MS4
MS5
MS6
MS7
most (+): OCLBP-GG,RR less (-): Hue, Optical ﬂow and multispectral features
For each scene there are one or more appropriate features.
41 / 66

Summary
Introduction
A novel texture descriptor (XCS-LBP)
Two ensemble learning approaches (pixel-based and superpixel-based) for
feature selection.
42 / 66

Remarks of the previous approach (pixel-based)
Advantage
The experimental results showed the potential of the proposed approach to
select the best features for distinct regions in a video sequence.
Limitations
Only reaches the highest accuracy when the number of features is huge.
Computationally expensive (pixel-based).
43 / 66

Superpixel-based online wagging one-class (Superpixel-OWAOC) ensemble for feature
selection
background scene
Features set (p)
Features set (p)
ρ1 ρ2 … ρN
ω1
ω1
ω1
weighted diﬀerent
features

ρ1
ρ1
ρN
build N classiﬁer
pools
Ψ1 , Ψ2,
Ψ3 , Ψ4,…,ΨM

Ψ1 , Ψ2,
Ψ3 , Ψ4,…,ΨM

Ψ1 , Ψ2,
Ψ3 , Ψ4,…,ΨM

Ψ4
Ψ1
ΨM
the best model
(smallest error)
Training step
44 / 66

selection
Step 2. Adaptive Importance Computation and Ensemble Pruning (AIC-EP)
Frames containing background/
foreground scene
Training step

Final
predic6on

Ψ4
ᵦ4

select the L base classiﬁers with
the best (ᵦ)
Adap3ve Importance
Computa3on (ᵦ) and
Ensemble Pruning
heuris3c model
update
ᵦ1

ᵦL

Ψ1
ΨL
Different from the previous approach, here we eliminate the base classiﬁers with very low
importance.
45 / 66

selection
Step 3. Background/foreground separation
Final
predic,on
Ψ1
ᵦ1

Ψ2
ΨL
heuris,c model
update
ᵦ2

ᵦL

46 / 66

Experimental results
Datasets
RGB-D dataset [Camplani and Salgado, 2013].
MVS dataset [Benezeth et al., 2014].
Parameter Settings
The pool of classiﬁers was homogeneous and consisted of base classiﬁers of the
same type (IWOC-SVM with RBF kernel).
4 kind of features: gray-scale, XCS-LBP, depth, and multispectral.
47 / 66

Validation / Quantitative results [Silva et al., 2017a]
Original Frame
MVS
RGB-D
Ground truth Superpixel-OWAOC [Silva et al., 2017]
Figure: The true positives (TP) pixels are in white, true negatives (TN) pixels in black, false
positives (FP) pixels in red and false negatives (FN) pixels in green.
MVS: dynamic background, illumination changes, camouﬂage effects and shadows.
RGB-D: shadows and illumination changes.
48 / 66

Validation / Quantitative results [Silva et al., 2017a]
Table: Performance using the RGB-D dataset.
Videos Method Precision Recall F-score
ColCamSeq
IWOC-SVM 0.9898 0.6706 0.7995
OWOC-RS [Silva et al., 2016] 0.8887 0.7555 0.8167
Superpixel-OWAOC (proposed) 0.9859 0.8041 0.8858
DCamSeq
IWOC-SVM 0.9255 0.8172 0.8680
GenSeq
IWOC-SVM 0.7427 0.7513 0.7470
ShSeq
IWOC-SVM 0.6024 0.6385 0.6199
Our two ensemble learning approaches presented better performance than traditional
classiﬁcations using only one classiﬁer.
49 / 66

Importance of the features [Silva et al., 2017a]
RGB-D dataset
Original Frame Feature Maps Histogram of Features Importance
50 / 66

Importance of the features [Silva et al., 2017a]
RGB-D dataset
Original Frame Feature Maps Histogram of Features Importance
51 / 66

What are the differences?
Methods Authors/Date Strategy Level Features
Traditional
[Li et al., 2004] Bayes decision rule Pixel RGB, gradient, and color co-occurrence
[Javed et al., 2015] Means and variances criterion Region RGB, gray, LBP, gradients, and HOG
[Braham and Van Droogenbroeck, 2015] Performance metric Pixel RGB, HSV, and YCbCr
Ensemble-based
[Grabner et al., 2006] AdaBoost Region Haar-like features, HOG, and LBP
[Parag et al., 2006] RealBoost Pixel RGB, gray, and gradients
[Grabner et al., 2008] AdaBoost Region Haar-like features
[Klare and Sarkar, 2009] Ensemble of Mixture of Gaussians Pixel RGB, gradients, and Haar-like features.
Pixel approach [Silva et al., 2016] Weighted Random Subspace Pixel 26 features (e.g HSV,multispectral, etc.)
Superpixel approach [Silva et al., 2017a] Wagging for feature selection Cluster gray, XCS-LBP, and depth
Most of these ensemble learning for feature selection works have used boosting and its
variants
52 / 66

Table of contents
Introduction
selection.
53 / 66

Collaborative research with Jordi Gonzalez at CVC (Barcelona, Spain)
Dynamic textures are motion patterns, i.e. image sequences of moving scenes that
present certain stationarity properties not only in space but also in their dynamics over
time [Doretto et al., 2003].
54 / 66

3D joint color-texture descriptor [Silva et al., 2017b]
c
o
n
c
a
t
e
n
a
t
e
XY
XT
YT
RR-LBP
XY
XT
YT
GG-LBP
0 50 100 150 200 250
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
0 50 100 150 200 250
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
XY
XT
YT
BB-LBP
XY
XT
YT
RG-LBP
XY
XT
YT
RB-LBP
0 50 100 150 200 250
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
0 50 100 150 200 250
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
0 50 100 150 200 250
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
XY
XT
YT
GB-LBP
Space-time opponent color histograms
OCLBP-TOP histogram
0 100 200 300 400 500 600 700
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
Original sequence video
A novel Opponent Color Local Binary Pattern from Three Orthogonal Planes
(OCLBP-TOP) descriptor.
55 / 66

Validation / Preliminary results [Silva et al., 2017b]
Table: Overall classification results (%)
YUPENN Feature
Descriptors Dyntex++ (%) Dynamic Scenes (%) Size
OCLBP (2004) [Mäenpää and Pietikäinen, 2004] 70.14 77.85 1 536
LBP-TOP (2007) [Zhao and Pietikäinen, 2007] 71.88 85.37 768
OCLBP-TOP [proposed] 80.58 86.90 4 608
LGBP-TOP (2013) [Almaev and Valstar, 2013] 68.69 84.47 50 976
LGBP-TOP + PCA 52.08 63.57 768
OCLBP-TOP + PCA [proposed] 73.04 84.76 768
HOG/HOF (2008) [Laptev et al., 2008] 72.75 78.80 288
GIST3D (2012) [Solmaz et al., 2012] 70.43 63.33 34 816
work in progress
56 / 66

Summary
1 Introduction
2 Role and the importance of features
3 Contributions for background subtraction
selection.
4 Collaborative external research for the Ph.D European Label
5 Conclusions and future perspectives
57 / 66

Conclusions
We presented an eXtended Center-Symmetric Local Binary Pattern
(XCS-LBP) descriptor produces shorter histogram, tolerant to illumination changes,
and robust to noise.
We proposed two ensemble learning (pixel-based and superpixel-based)
methods to select the best features for distinct regions in a video sequence.
We extended the spatial color-texture OCLBP descriptor to the spatio-temporal
domain. It extracts more detailed information from the video sequence to be
analyzed.
58 / 66

Future perspectives
Texture and color-texture features
Extend the XCS-LBP to include temporal properties.
Reduce the computation time of the proposed OCLBP-TOP.
Feature selection for background subtraction
Extend the proposed approach by developing a new mechanism
to update the importance of each feature without ground-truth data.
59 / 66

Publications
Journal papers (3)
Bouwmans, T. and Silva, C. and Marghes, C. and Zitouni, S. and Bhaskar, H. and Frélicot, C.
“On the Role and the Importance of Features for Background Modeling and Foreground
Detection”. Computer Science Review, 2016 (submitted).
Silva, C. and Gonzàlez, J. and Bouwmans, T. and Frélicot, C. “3D joint color-texture
descriptor for dynamic texture recognition”. IET Computer Vision, 2017 (in revision).
Silva, C. and Bouwmans, T. and Frélicot, C. “Superpixel-based incremental wagging
one-class ensemble for feature selection in foreground/background separation”. Pattern
Recognition Letters (PRL), 2017 (submitted).
Book chapter (1)
Silva, C. and Bouwmans, T. and Frélicot, C. “Features and Strategies Issues”. Chapter on
the handbook “Background Subtraction for Moving Object Detection: Theory and Practices”,
2017 (in progress)
60 / 66

Publications
Conferences (2)
Silva, C. and Bouwmans, T. and Fr´elicot, C. “An eXtended Center-Symmetric Local Binary
Pattern for Background Modeling and Subtraction in Videos”. In the Proceedings of the 10th
International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory
and Applications (VISAPP), Berlin, Germany (oral presentation), March, 2015.
Silva, C. and Bouwmans, T. and Fr´elicot, C. “Online Weighted One-Class Ensemble for
Feature Selection in Background/Foreground Separation”. In the Proceedings of the 23rd
International Conference on Pattern Recognition (ICPR), Cancun, Mexico (oral
presentation), December, 2016.
Websites
Behance.net project: https://www.behance.net/carolinepacheco
Caroline Silva’s homepage: http://lolynepacheco.wixsite.com/carolinesilva
Source Code
XCS-LBP Descriptor: http://lolynepacheco.wix.com/carolinesilva
LBPLibrary: https://github.com/carolinepacheco/lbplibrary
61 / 66

LBP Library
The LBP Library is a collection of eleven Local Binary Patterns (LBP) algorithms
developed for background subtraction problem. The algorithms were implemented in
C++ based on OpenCV.
List of the algorithms available in the LBP Library:
BG-LBP (BackGround Local Binary Pattern) by Davarpanah et al. (2015)
CS-LBP (First-order Center-Symmetric Local Binary Patterns) by Heikkil¨a et al. (2006)
CS-LDP (Second-order Center-Symmetric Local Derivative Pattern) by Xue et al. (2011)
CS-SILTP (Center-Symmetric Scale Invariant Local Ternary Patterns) by Wu et al. (2013)
E-LBP (Extended LBP or Circular LBP) by Mdakane and Bergh (2012)
OC-LBP (Opponent Color Local Binary Pattern) by Maenpaa and Pietikainen (2004)
O-LBP (Original LBP) by Ojala et al. (2001)
SCS-LBP (Spatial extended Center-Symmetric Local Binary Pattern) by Xue et al. (2010)
SI-LTP (Scale Invariant Local Ternary Pattern) by Liao et al. (2010)
VAR-LBP (Variance-based LBP) by Ojala et al. (2002)
XCS-LBP (eXtended Center-Symmetric Local Binary Pattern) by Silva et al. (2015)
62 / 66

Thank you for your attention!!!
63 / 66

I
References
[Almaev and Valstar, 2013] Almaev, T. and Valstar, M. (2013). Local gabor binary patterns from three orthogonal planes for automatic facial expression recognition. In Humaine
Association Conference on Affective Computing and Intelligent Interaction (ACII), pages 356–361.
[Benezeth et al., 2014] Benezeth, Y., Sidibe, D., and Thomas, J. B. (2014). Background subtraction with multispectral video sequences. In IEEE International Conference on
Robotics and Automation (ICRA).
[Bolón-Canedo et al., 2014] Bolón-Canedo, V., Sánchez-Maroño, N., Alonso-Betanzos, A., Ben´ıtez, J., and Herrera, F. (2014). A review of microarray datasets and applied feature
selection methods. Information Sciences, pages 111–135.
[Bouwmans, 2014] Bouwmans, T. (2014). Traditional and recent approaches in background modeling for foreground detection: An overview. In Computer Science Review, pages
31–66.
[Bouwmans et al., 2016] Bouwmans, T., Silva, C., Marghes, C., Zitouni, S., Bhaskar, H., and Frélicot, C. (2016). On the role and the importance of features for background model
and foreground detection. In Computer Science Review.
[Braham and Van Droogenbroeck, 2015] Braham, M. and Van Droogenbroeck, M. (2015). A generic feature selection method for background subtraction using global foreground
models. In Advanced Concepts for Intelligent Vision Systems (ACIVS), pages 717–728.
[Camplani and Salgado, 2013] Camplani, M. and Salgado, L. (2013). Background foreground segmentation with RGB-D kinect data: an efficient combination of classifiers. Journa
on Visual Communication and Image Representation (JVCIR).
[Cucchiara et al., 2001] Cucchiara, R., Grana, C., Piccardi, M., and Prati, A. (2001). Detecting objects, shadows and ghosts in video streams by exploiting color and motion
information. In International Conference on Image Analysis and Processing, pages 360–365.
[Doretto et al., 2003] Doretto, G., Chiuso, A., Wu, Y., and Soatto, S. (2003). Dynamic textures. In International Journal of Computer Vision (IJCV), pages 91–109.
[El Baf et al., 2008] El Baf, F., Bouwmans, T., and Vachon, B. (2008). A fuzzy approach for background subtraction. In IEEE International Conference on Image Processing (ICIP)
pages 2648–2651.
[Elgammal et al., 2000] Elgammal, A., Harwood, D., and Davis, L. (2000). Non-parametric model for background subtraction. In European Conference on Computer Vision (ECCV
pages 751–767.
[Grabner et al., 2008] Grabner, H., Leistner, C., and Bischof, H. (2008). Time dependent on-line boosting for robust background modeling. International Joint Conference on
Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP).
[Grabner et al., 2006] Grabner, H., Roth, P., Grabner, M., and Bischof, H. (2006). Autonomous learning a robust background model for change detection. IEEE International
Workshop on Performance Evaluation of Tracking and Surveillance (PETS).
[Heikkilä et al., 2004] Heikkilä, M., Pietikäinen, M., and Heikkilä, J. (2004). A texture-based method for detecting moving objects. In British Machine Vision Conference (BMVC),
pages 1–10.
[Heikkilä et al., 2009] Heikkilä, M., Pietikäinen, M., and Schmid, C. (2009). Description of interest regions with local binary patterns. Pattern Recognition (PR), pages 425–436.
64 / 66

II
References
[Javed et al., 2015] Javed, S., Sobral, A., Bouwmans, T., and Jung, S. K. (2015). OR-PCA with dynamic feature selection for robust background subtraction. In ACM Symposium o
Applied Computing, pages 86–91.
[Klare and Sarkar, 2009] Klare, B. and Sarkar, S. (2009). Background subtraction in varying illuminations using an ensemble based on an enlarged feature set. In IEEE Conferen
on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 66–73.
[Laptev et al., 2008] Laptev, I., Marszaaek, M., Schmid, C., and Rozenfeld, B. (2008). Learning realistic human actions from movies. In IEEE Conference on Computer Vision and
Pattern Recognition (CVPR).
[Li et al., 2004] Li, L., Huang, W., I., G., and Tian, Q. (2004). Statistical modeling of complex backgrounds for foreground object detection. IEEE Transactions on Image Processin
pages 1459–1472.
[Maddalena and Petrosino, 2012] Maddalena, L. and Petrosino, A. (2012). The SOBS algorithm: What are the limits? In IEEE Computer Society Conference on Computer Vision
and Pattern Recognition Workshops, pages 21–26.
[Mäenpää and Pietikäinen, 2004] Mäenpää, T. and Pietikäinen, M. (2004). Classification with color and texture: jointly or separately? Pattern Recognition (PR), pages 16291–164
[Ojala et al., 2002] Ojala, T., Pietikäinen, M., and Mäenpää, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), pages 971–987.
[Parag et al., 2006] Parag, T., Elgammal, A., and Mittal, A. (2006). A framework for feature selection for background subtraction. In IEEE Conference on Computer Vision and Patt
Recognition (CVPR), pages 1916–1923.
[Silva et al., 2015] Silva, C., Bouwmans, T., and Frélicot, C. (2015). An extended center-symmetric local binary pattern for background modeling and subtraction in videos. In
International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP), pages 1–8.
[Silva et al., 2016] Silva, C., Bouwmans, T., and Frélicot, C. (2016). Online weighted one-class ensemble for feature selection in background/foreground separation. In nternationa
Conference on Pattern Recognition (ICPR), pages 1–6.
[Silva et al., 2017a] Silva, C., Bouwmans, T., and Frélicot, C. (2017a). Superpixel-based incremental wagging one-class ensemble for feature selection in foreground/background
separation. In Pattern Recognition Letters (PRL), pages 1–7.
[Silva et al., 2017b] Silva, C., Gonzàlez, J., Bouwmans, T., and Frélicot, C. (2017b). 3d joint color-texture descriptor for dynamic texture recognition. IET Computer Vision.
[Sobral and Vacavant, 2014] Sobral, A. and Vacavant, A. (2014). A comprehensive review of background subtraction algorithms evaluated with synthetic and real videos. Comput
Vision and Image Understanding (CVIU), pages 4–21.
[Solmaz et al., 2012] Solmaz, B., Modiri, S. A., and Shah, M. (2012). Classifying web videos using a global video descriptor. Machine Vision and Applications (MVA).
[Stauffer and Grimson, 1999] Stauffer, C. and Grimson, W. (1999). Adaptive background mixture models for real-time tracking. In IEEE Computer Computer Vision and Pattern
Recognition (CVPR), pages 246–252.
65 / 66

III
References
[Xu et al., 2016] Xu, Y., Dong, J., Zhang, B., and Xu, D. (2016). Background modeling methods in video analysis: A review and comparative evaluation. CAAI Transactions on
Intelligence Technology, pages 43–60.
[Xue et al., 2011] Xue, G., Song, L., Sun, J., and Wu, M. (2011). Hybrid center-symmetric local pattern for dynamic background subtraction. In IEEE International Conference on
Multimedia and Expo (ICME), pages 1–6.
[Zhao and Pietik¨ainen, 2007] Zhao, G. and Pietik¨ainen, M. (2007). Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE
Transactions on Pattern Analysis and Machine Intelligence (TPAMI), pages 915–928.
66 / 66

Thesis presentation

More Related Content

What's hot (19)

Similar to Thesis presentation (20)

Recently uploaded (20)

Thesis presentation