Course Program
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:15-17.30 Recent Advances: Dual-decomposition, higher-order, etc.
             (Carsten Rother + Pawan Kumar)

   All material will be available online (after the conference):
   http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
Discrete Models in Computer Vision

              Carsten Rother
       Microsoft Research Cambridge
Overview

• Introduce factor graph notation
• Categorization of models in Computer Vision:
  – 4-connected MRFs
  – Highly-connected MRFs
  – Higher-order MRFs
Markov Random Field Models for Computer Vision
Model:
   discrete or continuous variables?
   discrete or continuous space?
   dependence between variables?
   …

Inference:
   Graph Cut (GC)
   Belief Propagation (BP)
   Tree-Reweighted Message Passing (TRW)
   Iterated Conditional Modes (ICM)
   Cutting-plane
   Dual-decomposition
   …

Applications:
   2D/3D image segmentation
   Object recognition
   3D reconstruction
   Stereo matching
   Image denoising
   Texture synthesis
   Pose estimation
   Panoramic stitching
   …

Learning:
   Exhaustive search (grid search)
   Pseudo-likelihood approximation
   Training in pieces
   Max-margin
   …
Recap: Image Segmentation


[Figure: input image z and its MAP segmentation.]

Maximum-a-posteriori (MAP):  x* = argmax_x P(x|z) = argmin_x E(x)

P(x|z) ~ P(z|x) P(x)                  Posterior; Likelihood; Prior

P(x|z) ~ exp{-E(x)}                   Gibbs distribution

E: {0,1}^n → R
E(x) = ∑_i θi(xi) + w ∑_{(i,j)∈N4} θij(xi,xj)          Energy
       (unary terms)   (pairwise terms)
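
The following is a minimal Python sketch (not part of the slides) of how such an energy can be evaluated for a binary labelling on a 4-connected grid; the unary costs and the weight w are illustrative stand-ins.

# Minimal sketch: evaluating E(x) = sum_i theta_i(x_i) + w * sum_{(i,j) in N4} |x_i - x_j|
# for a binary labelling x on a 4-connected grid. All values are illustrative.
import numpy as np

def energy(x, unary, w=1.0):
    """x: (H, W) array of {0,1} labels; unary: (H, W, 2) costs theta_i(x_i)."""
    H, W = x.shape
    E = unary[np.arange(H)[:, None], np.arange(W)[None, :], x].sum()
    # Ising pairwise term |x_i - x_j| over 4-connected edges
    E += w * np.abs(x[1:, :] - x[:-1, :]).sum()   # vertical edges
    E += w * np.abs(x[:, 1:] - x[:, :-1]).sum()   # horizontal edges
    return E

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    unary = rng.random((4, 4, 2))
    x = (unary[..., 0] > unary[..., 1]).astype(int)  # pixel-wise best label
    print("E(x) =", energy(x, unary, w=0.5))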
Min-Marginals
       (uncertainty of MAP-solution)

Definition: ψ_{v;i} = min_{x : xv = i} E(x)

[Figure: input image, MAP solution, and min-marginals for the foreground
 label (bright = very certain).]
  Can be used in several ways:
  • Insights on the model
  • For optimization (TRW, comes later)
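
A brute-force illustration (not from the slides) of the min-marginal definition on a toy chain model; the energy used here is a hypothetical stand-in, and exhaustive enumeration is only feasible for very small problems.

# Minimal sketch: brute-force min-marginals psi_{v;i} = min_{x : x_v = i} E(x).
import itertools
import numpy as np

def min_marginals(energy_fn, n_vars, n_labels=2):
    psi = np.full((n_vars, n_labels), np.inf)
    for x in itertools.product(range(n_labels), repeat=n_vars):
        e = energy_fn(x)
        for v, label in enumerate(x):
            psi[v, label] = min(psi[v, label], e)
    return psi

if __name__ == "__main__":
    # toy chain energy: unary preferences plus an Ising coupling
    unary = np.array([[0.0, 1.0], [0.5, 0.2], [1.0, 0.0]])
    def E(x):
        return sum(unary[v, x[v]] for v in range(3)) + \
               sum(abs(x[v] - x[v + 1]) for v in range(2))
    print(min_marginals(E, n_vars=3))   # low values = very certain (cf. the slide)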
Introducing Factor Graphs
Write probability distributions as Graphical models:

       - Directed graphical model
       - Undirected graphical model (… what Andrew Blake used)
       - Factor graphs

References:
       - Pattern Recognition and Machine Learning [Bishop ‘08, book, chapter 8]
       - several lectures at the Machine Learning Summer School 2009
         (see video lectures)
Factor Graphs
P(x) ~ θ(x1,x2,x3) θ(x2,x4) θ(x3,x4) θ(x3,x5)          “4 factors”

P(x) ~ exp{-E(x)}                                       Gibbs distribution
E(x) = θ(x1,x2,x3) + θ(x2,x4) + θ(x3,x4) + θ(x3,x5)

[Figure: factor graph over the unobserved/latent/hidden variables x1,…,x5;
 a factor node connects exactly the variables that appear in the same factor.]
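
A small sketch (not from the slides) of how this particular factor graph could be stored and evaluated in Python; the factor tables are random placeholders.

# Minimal sketch: a factor graph as a list of (variable-indices, table) pairs,
# matching the 4 factors on this slide, with binary variables x1,...,x5.
import itertools
import numpy as np

rng = np.random.default_rng(0)
factors = [
    ((0, 1, 2), rng.random((2, 2, 2))),   # theta(x1,x2,x3)
    ((1, 3),    rng.random((2, 2))),      # theta(x2,x4)
    ((2, 3),    rng.random((2, 2))),      # theta(x3,x4)
    ((2, 4),    rng.random((2, 2))),      # theta(x3,x5)
]

def energy(x):
    """E(x) = sum of factor tables evaluated at the relevant sub-assignment."""
    return sum(table[tuple(x[v] for v in scope)] for scope, table in factors)

def unnormalised_prob(x):
    return np.exp(-energy(x))             # Gibbs distribution, up to 1/Z

best = min(itertools.product([0, 1], repeat=5), key=energy)
print("MAP assignment:", best, "with energy", energy(best))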
Definition “Order”
Definition “Order”:
The arity (number of variables) of the largest factor.

      P(x) ~ θ(x1,x2,x3) θ(x2,x4) θ(x3,x4) θ(x3,x5)
               arity 3      arity 2

[Figure: the factor graph over x1,…,x5 from the previous slide; its order is 3.]
Examples - Order

• 4-connected, pairwise MRF:
  E(x) = ∑_{(i,j)∈N4} θij(xi,xj)                  → order 2 (“pairwise energy”)

• higher(8)-connected, pairwise MRF:
  E(x) = ∑_{(i,j)∈N8} θij(xi,xj)                  → order 2

• higher-order MRF:
  E(x) = ∑_{(i,j)∈N4} θij(xi,xj) + θ(x1,…,xn)     → order n (“higher-order energy”)
Example: Image segmentation
      P(x|z) ~ exp{-E(x)}
         E(x) = ∑_i θi(xi,zi) + ∑_{(i,j)∈N4} θij(xi,xj)

[Figure: factor graph of a 4-connected grid; the zi are observed variables,
 the xi are unobserved (latent) variables.]
Most simple inference technique:
     ICM (iterated conditional modes)

Goal: x* = argmin_x E(x)
E(x) = θ12(x1,x2) + θ13(x1,x3) + θ14(x1,x4) + θ15(x1,x5) + …

[Figure: star-shaped factor graph; x1 in the centre is connected to x2,…,x5.]
Most simple inference technique:
         ICM (iterated conditional modes)

Goal: x* = argmin_x E(x)
E(x) = θ12(x1,x2) + θ13(x1,x3) + θ14(x1,x4) + θ15(x1,x5) + …

[Figure: the same star-shaped graph; the grey nodes x2,…,x5 are treated as
 observed, so x1 is updated conditioned on its neighbours. Results: noisy
 input, ICM solution, global minimum.]

ICM can get stuck in local minima!
Simulated Annealing: accept a move even if the energy increases
(with a certain probability).
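
A minimal ICM sketch in Python (not from the slides), assuming a 4-connected grid with an Ising pairwise term; real implementations differ in visitation order and stopping criteria.

# Minimal sketch of ICM: repeatedly set each pixel to the label that minimises
# the energy while all its neighbours are kept fixed ("treated as observed").
import numpy as np

def icm(unary, w=1.0, n_sweeps=10):
    H, W, L = unary.shape
    x = unary.argmin(axis=2)                       # pixel-wise initialisation
    for _ in range(n_sweeps):
        changed = False
        for i in range(H):
            for j in range(W):
                costs = unary[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        # Ising pairwise term |x_i - x_j|
                        costs += w * np.abs(np.arange(L) - x[ni, nj])
                best = costs.argmin()
                if best != x[i, j]:
                    x[i, j] = best
                    changed = True
        if not changed:                            # local minimum reached
            break
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    print(icm(rng.random((6, 6, 2)), w=0.5))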
Overview
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs
Stereo matching
[Figure: left image (a), right image (b), and ground-truth depth; disparity
 labels range from d=0 to d=4.]

• Images rectified
• Ignore occlusion for now

Energy:
    E(d): {0,…,D-1}^n → R
    Labels: di (depth/shift per pixel)
Stereo matching - Energy
Energy:
    E(d): {0,…,D-1}^n → R
    E(d) = ∑_i θi(di) + ∑_{(i,j)∈N4} θij(di,dj)

Unary:
    θi(di) = | li - r(i-di) |
    “SAD: sum of absolute differences”
    (many other matching costs possible, e.g. NCC, …)

[Figure: pixel i in the left image is matched against pixel i-di in the
 right image, e.g. i-2 for di = 2 (left/right image strips).]

Pairwise:
    θij(di,dj) = g(|di-dj|)
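
A sketch (not from the slides) of the SAD unary term as a cost volume over all disparities, assuming grey-scale, rectified images; the border handling here is a simplistic placeholder.

# Minimal sketch: SAD unary costs theta_i(d) = |l_i - r_{i-d}| for d = 0..D-1.
import numpy as np

def sad_cost_volume(left, right, D):
    """left, right: (H, W) float images; returns (H, W, D) unary costs."""
    H, W = left.shape
    cost = np.full((H, W, D), np.inf)
    for d in range(D):
        # pixel (y, x) in the left image is compared with (y, x-d) in the right
        cost[:, d:, d] = np.abs(left[:, d:] - right[:, :W - d])
    return cost

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left, right = rng.random((5, 8)), rng.random((5, 8))
    vol = sad_cost_volume(left, right, D=4)
    print("winner-take-all disparities:\n", vol.argmin(axis=2))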
Stereo matching - prior

                    θij(di,dj) = g(|di-dj|)

[Figure: cost g as a function of |di-dj|; a linear penalty without truncation
 (global minimum computable).]

                                      [Olga Veksler PhD thesis,
                                       Daniel Cremers et al.]
Stereo matching - prior

                    θij(di,dj) = g(|di-dj|)

[Figure: cost g as a function of |di-dj|; left: no truncation (global min.),
 right: with truncation (NP-hard optimization), a discontinuity-preserving
 potential [Blake & Zisserman ’83, ’87].]

                                      [Olga Veksler PhD thesis,
                                       Daniel Cremers et al.]
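
For illustration (not from the slides), a (truncated) linear choice of g; lam and trunc are hypothetical parameters.

# Minimal sketch: a (truncated) linear pairwise potential g(|di - dj|).
# Truncation makes the prior discontinuity-preserving but the optimisation
# NP-hard in general (cf. the slide).
import numpy as np

def g(delta, lam=1.0, trunc=None):
    """delta = |di - dj|; trunc=None gives the un-truncated (convex) prior."""
    delta = np.abs(np.asarray(delta, dtype=float))
    if trunc is None:
        return lam * delta
    return lam * np.minimum(delta, trunc)

if __name__ == "__main__":
    d = np.arange(6)
    print("no truncation:  ", g(d, lam=0.5))
    print("with truncation:", g(d, lam=0.5, trunc=2))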
Stereo matching
    see http://vision.middlebury.edu/stereo/




[Figure: results on a Middlebury pair -
 no MRF, pixel-independent (WTA);
 no horizontal links (efficient, since independent chains);
 pairwise MRF [Boykov et al. ‘01];
 ground truth.]
Texture synthesis


[Figure: input texture and synthesized output. Each output pixel takes a
 label xi ∈ {0,1} choosing between two overlapping source patches a and b.
 Good case: a and b agree along the seam; bad case: a visible seam between
 neighbouring pixels i and j.]

E: {0,1}^n → R
E(x) = ∑_{(i,j)∈N4} |xi-xj| [ |ai-bi| + |aj-bj| ]

                                        [Kwatra et al., Siggraph ‘03]
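
A sketch (not from the slides) of evaluating this seam energy for a given labelling, assuming a and b are two overlapping grey-scale patches.

# Minimal sketch: seam cost E(x) = sum_{(i,j) in N4} |x_i - x_j| (|a_i - b_i| + |a_j - b_j|),
# where x_i in {0,1} says whether pixel i is copied from patch a or patch b.
import numpy as np

def seam_energy(x, a, b):
    """x: (H, W) labels in {0,1}; a, b: (H, W) overlapping grey patches."""
    mismatch = np.abs(a - b)                      # |a_i - b_i| per pixel
    e = 0.0
    # vertical and horizontal 4-connected edges
    e += (np.abs(x[1:, :] - x[:-1, :]) * (mismatch[1:, :] + mismatch[:-1, :])).sum()
    e += (np.abs(x[:, 1:] - x[:, :-1]) * (mismatch[:, 1:] + mismatch[:, :-1])).sum()
    return e

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.random((6, 6)), rng.random((6, 6))
    x = np.zeros((6, 6), dtype=int); x[:, 3:] = 1  # straight vertical seam
    print("seam cost:", seam_energy(x, a, b))      # low where a and b agree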
Video Synthesis




[Figure/video: input clip and synthesized output; video (duplicated).]
Panoramic stitching
Panoramic stitching
AutoCollage




[Figure: AutoCollage results.]
http://research.microsoft.com/en-us/um/cambridge/projects/autocollage/
                                                   [Rother et al., Siggraph ‘05]
Recap: 4-connected MRFs

• A lot of useful vision systems are based on
  4-connected pairwise MRFs.

• Possible Reason (see Inference part):
  a lot of fast and good (globally optimal)
  inference methods exist
Overview
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs
Why larger connectivity?
We have seen…
• “Knock-on” effect (each pixel influences every other pixel)
• Many good systems

What is missing:
1. Modelling real-world texture (images)
2. Reduce discretization artefacts
3. Encode complex prior knowledge
4. Use non-local parameters
Reason 1: Texture modelling



[Figure: training images; test image; test image with 60% noise.
 Results: MRF 4-connected (neighbours); MRF 4-connected; MRF 9-connected
 (7 attractive, 2 repulsive interactions).]
Reason 1: Texture Modelling

[Figure: input texture and synthesized output.]
                         [Zalesny et al. ‘01]
Reason2: Discretization artefacts

[Figure: two example paths on a pixel grid.]

Length of the paths:
     Eucl.    4-con.    8-con.
     5.65     6.28      5.08
     8        6.28      6.75

                 Larger connectivity can model the true Euclidean
                 length (other metrics are also possible).

                                              [Boykov et al. ‘03, ‘05]
Reason2: Discretization artefacts




[Figure: segmentation results - 4-connected Euclidean; 8-connected Euclidean;
 8-connected geodesic.]

                 Higher connectivity can model the true Euclidean length.
                                                  [Boykov et al. ‘03, ‘05]
3D reconstruction




[Figure: multi-view 3D reconstruction results. Slide credits: Daniel Cremers]
Reason 3: Encode complex prior knowledge:
                      Stereo with occlusion




       E(d): {1,…,D}^{2n} → R
       Each pixel is connected to D pixels in the other image.

[Figure: D×D pairwise potential θlr(dl,dr) between a left-view pixel dl and a
 right-view pixel dr, with a “match” diagonal; example labels on the images:
 d=1 (∞ cost), d=10 (match), d=20 (0 cost); left view / right view.]
Stereo with occlusion




[Figure: ground truth; stereo with occlusion [Kolmogorov et al. ‘02];
 stereo without occlusion [Boykov et al. ‘01].]
Reason 4: Use Non-local parameters:
   Interactive Segmentation (GrabCut)




[Figure: interactive segmentation from user scribbles [Boykov and Jolly ’01]
 and from a bounding box - GrabCut [Rother et al. ’04].]
A meeting with the Queen
Reason 4: Use Non-local parameters:
         Interactive Segmentation (GrabCut)


Model the segmentation and the colour model jointly:

E(x,w): {0,1}^n × {GMMs} → R
E(x,w) = ∑_i θi(xi,w) + ∑_{(i,j)∈N4} θij(xi,xj)

An object is a compact set of colours:
[Figure: foreground/background colour distributions in red/green colour
 space, modelled by the GMM parameters w.]

                                               [Rother et al., Siggraph ’04]
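
A very rough sketch (not from the slides) of the alternation behind E(x,w): GrabCut itself fits GMMs and solves for x with graph cut, whereas this toy version uses a single Gaussian per class and updates x from the unary terms only, so it only illustrates the joint-estimation idea (and assumes both classes stay non-empty).

# Minimal sketch: alternate between (a) fitting the colour model w to the
# current segmentation x and (b) updating x given w.
import numpy as np

def fit_gaussian(pixels):
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels.T) + 1e-6 * np.eye(3)
    return mu, cov

def neg_log_likelihood(z, mu, cov):
    d = z - mu
    inv = np.linalg.inv(cov)
    return 0.5 * np.einsum('...i,ij,...j->...', d, inv, d) \
         + 0.5 * np.log(np.linalg.det(cov)) + 1.5 * np.log(2 * np.pi)

def grabcut_like(z, x_init, n_iters=5):
    x = x_init.copy()
    for _ in range(n_iters):
        w = [fit_gaussian(z[x == k]) for k in (0, 1)]           # step (a)
        unaries = np.stack([neg_log_likelihood(z, *w[k]) for k in (0, 1)], -1)
        x = unaries.argmin(axis=-1)                             # step (b)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.random((20, 20, 3)); z[5:15, 5:15] += 0.8           # bright "object"
    x0 = np.zeros((20, 20), dtype=int); x0[4:16, 4:16] = 1      # user box
    print(grabcut_like(z, x0).sum(), "foreground pixels")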
Reason 4: Use Non-local parameters:
                 Segmentation and Recognition
Goal: segment the test image.

[Figure: test image with the object to segment.]

Large set of example segmentations (exemplars) T(1), T(2), T(3), …
(up to 2,000,000 exemplars)

               E(x,w): {0,1}^n × {Exemplars} → R
               E(x,w) = ∑_i |T(w)i - xi| + ∑_{(i,j)∈N4} θij(xi,xj)
                          “Hamming distance”

                                                     [Lempitsky et al., ECCV ’08]
Reason 4: Use Non-local parameters:
     Segmentation and Recognition




[Figure: segmentation results on the UIUC dataset; 98.8% accuracy.]

                                           [Lempitsky et al., ECCV ’08]
Overview
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs
Why Higher-order Functions?
In general θ(x1,x2,x3) ≠ θ(x1,x2) + θ(x1,x3) + θ(x2,x3)

Reasons for higher-order MRFs:

1. Even better image (texture) models:
   –   Field of Experts [FoE, Roth et al. ‘05]
   –   Curvature [Woodford et al. ‘08]

2. Use global priors:
   –   Connectivity [Vicente et al. ‘08, Nowozin et al. ‘09]
   –   Encode better training statistics [Woodford et al. ‘09]
Reason1: Better Texture Modelling


[Figure: training images; test image; test image with 60% noise.
 Result pairwise MRF (9-connected): higher-order structure not preserved.
 Result higher-order MRF.]
                                              [Rother et al., CVPR ‘09]
Reason 2: Use global Prior
Foreground object must be connected:




[Figure: user input; standard MRF - removes noise (+) but shrinks the
 boundary (-); result with the connectivity prior.]

  E(x) = P(x) + h(x)   with   h(x) = ∞ if the foreground is not 4-connected,
                                     0 otherwise

                                                            [Vicente et al. ’08,
                                                             Nowozin et al. ‘09]
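
A sketch (not from the slides) of evaluating the global term h(x) with a breadth-first search over the foreground mask.

# Minimal sketch: h(x) = 0 if the foreground of x is one 4-connected
# component, infinity otherwise.
from collections import deque
import numpy as np

def h(x):
    """x: (H, W) binary mask."""
    fg = np.argwhere(x == 1)
    if len(fg) == 0:
        return 0.0
    seen = np.zeros_like(x, dtype=bool)
    q = deque([tuple(fg[0])]); seen[tuple(fg[0])] = True
    while q:                                     # BFS over foreground pixels
        i, j = q.popleft()
        for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
            if 0 <= ni < x.shape[0] and 0 <= nj < x.shape[1] \
                    and x[ni, nj] == 1 and not seen[ni, nj]:
                seen[ni, nj] = True
                q.append((ni, nj))
    return 0.0 if seen.sum() == len(fg) else np.inf

if __name__ == "__main__":
    x = np.zeros((5, 5), dtype=int); x[1:3, 1:3] = 1
    print(h(x))                 # 0.0 (connected)
    x[4, 4] = 1
    print(h(x))                 # inf (two components)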
Reason 2: Use global Prior
    What is the prior of a MAP-MRF solution?

   Training image: 60% black, 40% white.

   MAP solution:                  prior(x) = 0.6^8 ≈ 0.017
   Others are less likely, e.g.:  prior(x) = 0.6^5 · 0.4^3 ≈ 0.005

               The MRF is a bad prior, since the marginal statistics of the
               input are ignored!

               Introduce a global term which controls the global statistics:

[Figure: noisy input; ground truth; pairwise MRF with increased prior
 strength; global gradient prior.]
                                                     [Woodford et al., ICCV ‘09]
                                                     (see poster on Friday)
Summary
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs


  …. all useful models,
     but how do I optimize them?
Course Program
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:15-17.30 Recent Advances: Dual-decomposition, higher-order, etc.
             (Carsten Rother + Pawan Kumar)

   All material will be available online (after the conference):
   http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
END
unused slides …
Markov Property


[Figure: grid MRF; the Markov blanket of xi.]

• Markov Property: each variable is only connected to a few others,
  i.e. many pixels are conditionally independent.
• This makes inference easier (possible at all).
• But still… every pixel can influence any other pixel (knock-on effect).
Recap: Factor Graphs


• Factor graphs are a very good representation since they directly
  reflect the given energy.

• The MRF (Markov) property means many pixels are
  conditionally independent.

• Still … all pixels influence each other (knock-on effect).
Interactive Segmentation - Tutorial example


[Figure: input image z = (R,G,B)^n with user scribbles; goal: a binary
 segmentation x ∈ {0,1}^n.]

    Given z and the unknown (latent) variables x:

    P(x|z) = P(z|x) P(x) / P(z)  ~  P(z|x) P(x)
    Posterior     Likelihood        Prior
    probability   (data-dependent)  (data-independent)

   Maximum a Posteriori (MAP): x* = argmax_x P(x|z)
Likelihood   P(x|z) ~ P(z|x) P(x)

[Figure: foreground and background colour distributions in red/green
 colour space.]
Likelihood             P(x|z) ~ P(z|x) P(x)




[Figure: per-pixel log-likelihood images log P(zi|xi=0) and log P(zi|xi=1).]

Maximum likelihood:
x* = argmax_x P(z|x) = argmax_x ∏_i P(zi|xi)
Prior   P(x|z) ~ P(z|x) P(x)




[Figure: two neighbouring pixels xi and xj.]

P(x) = 1/f ∏_{(i,j)∈N} θij(xi,xj)

f = ∑_x ∏_{(i,j)∈N} θij(xi,xj)             “partition function”

θij(xi,xj) = exp{-|xi-xj|}                 “Ising prior”

    (exp{-1} = 0.36;  exp{0} = 1)
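
For illustration (not from the slides), the Ising prior and its partition function f computed by brute force for a tiny 4-variable chain.

# Minimal sketch: Ising prior P(x) = 1/f * prod_{(i,j) in N} exp{-|x_i - x_j|}
# with the partition function f evaluated by exhaustive enumeration.
import itertools
import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]                 # neighbourhood N (a chain)

def theta(xi, xj):
    return np.exp(-abs(xi - xj))                 # exp{-1}=0.36, exp{0}=1

def unnormalised(x):
    return np.prod([theta(x[i], x[j]) for i, j in edges])

f = sum(unnormalised(x) for x in itertools.product([0, 1], repeat=4))
print("partition function f =", f)
print("P(0,0,0,0) =", unnormalised((0, 0, 0, 0)) / f)   # smooth labellings are most likely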
Posterior distribution
P(x|z) ~ P(z|x) P(x)

  Posterior “Gibbs” distribution:

  θi(xi,zi)  = P(zi|xi=1) xi + P(zi|xi=0) (1-xi)        likelihood
  θij(xi,xj) = |xi-xj|                                   prior

  P(x|z) = 1/f(z,w) exp{-E(x,z,w)}
  f(z,w) = ∑_x exp{-E(x,z,w)}
  E(x,z,w) = ∑_i θi(xi,zi) + w ∑_{i,j} θij(xi,xj)        Energy
            (unary terms)     (pairwise terms)

 Note: the likelihood can be an arbitrary function of the data.
Energy minimization
 P(x|z) = 1/f(z,w) exp{-E(x,z,w)}
-log P(x|z) = -log(1/f(z,w)) + E(x,z,w)
 f(z,w) = ∑_x exp{-E(x,z,w)}

 x* = argmin_x E(x,z,w)
   The MAP solution coincides with the minimum-energy solution.

[Figure: segmentations - MAP / global minimum of E vs. maximum likelihood (ML).]
Weight prior and likelihood



[Figure: segmentations for increasing pairwise weight w = 0, 10, 40, 200.]

E(x,z,w) = ∑_i θi(xi,zi) + w ∑_{i,j} θij(xi,xj)
Moving away from a pure prior …



[Figure: Ising cost vs. contrast-sensitive cost as a function of ||zi-zj||².]

E(x,z,w) = ∑_i θi(xi,zi) + w ∑_{i,j} θij(xi,xj,zi,zj)

θij(xi,xj,zi,zj) = |xi-xj| exp{-ß||zi-zj||²}

          ß = 2 ( Mean(||zi-zj||²) )⁻¹

          “Going from a Markov Random Field to a
           Conditional Random Field”
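
A sketch (not from the slides) of computing ß and the contrast-sensitive weights; it uses the common convention ß = (2·Mean(||zi-zj||²))⁻¹ (as in GrabCut), which the slide's formula presumably intends, and looks at horizontal edges only for brevity.

# Minimal sketch: beta from the mean squared colour difference of neighbouring
# pixels, and the contrast-sensitive weight exp{-beta * ||z_i - z_j||^2}.
import numpy as np

def contrast_weights(z):
    """z: (H, W, 3) image; returns beta and the weights for horizontal N4 edges."""
    diff2 = ((z[:, 1:, :] - z[:, :-1, :]) ** 2).sum(axis=2)   # ||z_i - z_j||^2
    beta = 1.0 / (2.0 * diff2.mean())                          # assumed convention
    return beta, np.exp(-beta * diff2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.random((8, 8, 3)); z[:, 4:, :] += 1.0              # strong vertical edge
    beta, w = contrast_weights(z)
    print("beta =", beta)
    print("weights across the strong edge are small:", w[:, 3].mean())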
Tree vs. loopy graphs

Markov blanket of xi: all variables which are in the same factor as xi.

[Figure: loopy grid graph containing xi vs. a tree (with root) and a chain.]
                                            [Felzenszwalb, Huttenlocher ‘01]

Loopy graphs:                          Trees and chains:
- MAP is (in general) NP-hard          • MAP is tractable
  (see the inference part)             • Marginals, e.g. P(foot), are tractable
- Marginals P(xi) are also NP-hard
Stereo matching - prior
                                       θij(di,dj) = g(|di-dj|)

[Figure: left image; cost g as a function of |di-dj| for the Potts model;
 resulting disparity map (smooth disparities).]

                                               [Olga Veksler PhD thesis]
Modelling texture [Zalesny et al. ‘01]

[Figure: input texture and synthesis results with “unary only”,
 “8-connected MRF”, and “13-connected MRF” models.]
Reason2: Discretization artefacts
θij(xi,xj) = ∆a / (2 · dist(xi,xj)) · |xi - xj|       (e.g. ∆a = π/4; edge lengths 1, √2)

[Figure: two example paths on a pixel grid.]

Length of the path:
     4-con.    8-con.    true Eucl.
     6.28      5.08       5.65
     6.28      6.75       8

       Larger connectivity can model the true Euclidean length
       (also any Riemannian metric, e.g. geodesic length, can be modelled).
                                                    [Boykov et al. ‘03, ‘05]
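
A sketch (not from the slides) of these edge weights for a 4- and an 8-neighbourhood, assuming ∆a is the angular step between the edge directions and dist the Euclidean edge length.

# Minimal sketch: Cauchy-Crofton-style edge weights w_e = delta_a / (2 * dist)
# for the half-plane of edge directions of a given neighbourhood.
import numpy as np

def edge_weights(offsets):
    """offsets: list of (dy, dx) neighbour offsets covering a half-plane."""
    dists = np.array([np.hypot(dy, dx) for dy, dx in offsets])
    delta_a = np.pi / len(offsets)          # angular resolution of the family
    return delta_a / (2.0 * dists)

print("4-connected :", edge_weights([(0, 1), (1, 0)]))                    # delta_a = pi/2
print("8-connected :", edge_weights([(0, 1), (1, 1), (1, 0), (1, -1)]))   # delta_a = pi/4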
References: Higher-order Functions
• In general θ(x1,x2,x3) ≠ θ(x1,x2) + θ(x1,x3) + θ(x2,x3)

   Field of Experts Model (2x2; 5x5)
   [Roth, Black CVPR ‘05]
   [Potetz, CVPR ‘07]

   Minimize Curvature (3x1)
   [Woodford et al. CVPR ‘08]

   Large Neighbourhood (10x10 -> whole image)
   [Rother, Kolmogorov, Minka & Blake, CVPR ‘06]
   [Vicente, Kolmogorov, Rother, CVPR ‘08]
   [Komodakis, Paragios, CVPR ‘09]
   [Rother, Kohli, Feng, Jia, CVPR ‘09]
   [Woodford, Rother, Kolmogorov, ICCV ‘09]
   [Vicente, Kolmogorov, Rother, ICCV ‘09]
   [Ishikawa, CVPR ‘09]
   [Ishikawa, ICCV ‘09]
Conditional Random Field (CRF)
E(x) = ∑_i θi(xi,zi) + ∑_{(i,j)∈N4} θij(xi,xj,zi,zj)

       with θij(xi,xj,zi,zj) = |xi-xj| exp(-ß||zi-zj||²)

[Figure: factor graph with observed zi, zj and latent xi, xj; plot of the
 Ising cost and the contrast-sensitive cost θij as functions of ||zi-zj||².]

           Definition CRF: all factors may depend on the data z.
           This is no problem for inference (but it does complicate
           parameter learning).
