transformations2 and fitting on image processing.pptx

Transformations and Fitting
EECS 442 – David Fouhey
Winter 2023, University of Michigan
https://web.eecs.umich.edu/~fouhey/teaching/EECS442_W23/

Administrivia
• Discussion this week = office hours

So Far
1. How do we find distinctive / easy to locate
features? (Harris/Laplacian of Gaussian)
2. How do we describe the regions around
them? (histogram of gradients)
3. How do we match features? (L2 distance)
4. How do we handle outliers? (RANSAC)

Today
As promised: warping one image to another

Why Mosaic?
• Compact Camera FOV = 50 x 35°
Slide credit: Brown & Lowe

Why Mosaic?
• Human FOV = 200 x 135°

Why Mosaic?
• Human FOV = 200 x 135°
• Panoramic Mosaic = 360 x 180°

Why Bother With This Math?
Slide credit: A. Efros

Homework 1 Style
Translation only via alignment

Image Transformations
Image filtering: change range of image
$g(x) = T(f(x))$
Image warping: change domain of image
$g(x) = f(T(x))$

Image Transformations
Image filtering: change range of image
$g(x, y) = T(f(x, y))$
Image warping: change domain of image
$g(x, y) = f(T(x, y))$

Parametric (Global) warping
Examples of parametric warps: translation, rotation, aspect, affine, perspective, cylindrical

Parametric (Global) Warping
T is a coordinate changing machine: it maps p = (x,y) to p’ = (x’,y’)
$\mathbf{p}' = T(\mathbf{p})$
Note: T is the same for all points, has relatively few
parameters, and does not depend on image content

Parametric (Global) Warping
T maps p = (x,y) to p’ = (x’,y’)
Today we’ll deal with linear warps
$\mathbf{p}' \equiv \mathbf{T}\mathbf{p}$
T: matrix; p, p’: 2D points. Start with normal points
and =, then do homogeneous coords and ≡

Scaling
× 2
Scaling multiplies each component (x,y) by a scalar.
Uniform scaling is the same for all components.
Note the corner goes from (1,1) to (2,2)

Scaling
Non-uniform scaling multiplies each component by
a different scalar.
x × 2,
y × 0.5

Scaling
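What does T look like?
$x' = ax$
$y' = by$
Let’s convert to a matrix:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \underbrace{\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}}_{\text{scaling matrix } S} \begin{bmatrix} x \\ y \end{bmatrix}$$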
What’s the inverse of S?
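A minimal NumPy sketch of the scaling matrix and its inverse. The helper name `scaling_matrix` and the example values are illustrative assumptions, not part of the slides; the inverse simply scales by 1/a and 1/b, which needs both factors to be non-zero.

```python
import numpy as np

def scaling_matrix(a, b):
    """2x2 matrix that scales x by a and y by b (hypothetical helper)."""
    return np.array([[a, 0.0],
                     [0.0, b]])

S = scaling_matrix(2.0, 0.5)
# The inverse undoes each scale factor separately (requires a != 0 and b != 0).
S_inv = scaling_matrix(1.0 / 2.0, 1.0 / 0.5)

corner = np.array([1.0, 1.0])
scaled = S @ corner          # x is doubled, y is halved
restored = S_inv @ scaled    # back to the original corner
```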

2D Rotation
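Rotation Matrix
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
But wait! Aren’t sin/cos non-linear?
x’ is a linear combination/function of x, y
x’ is not a linear function of θ
What’s the inverse of $R_\theta$? $I = R_\theta^T R_\theta$, so $R_\theta^{-1} = R_\theta^T$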
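A quick numerical check of the rotation facts above, as a sketch: the helper `rotation_matrix` and the angle are assumptions chosen for illustration. It rotates a point and confirms that the transpose acts as the inverse.

```python
import numpy as np

def rotation_matrix(theta):
    """2x2 counter-clockwise rotation by theta radians (hypothetical helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R = rotation_matrix(np.pi / 2)
p = np.array([1.0, 0.0])
p_rot = R @ p                 # a quarter turn takes (1, 0) to roughly (0, 1)

# Orthogonality: R^T R is the identity, so the transpose is the inverse.
identity_check = R.T @ R
```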

Things You Can Do With 2x2
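Identity / No Transformation
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
Shear
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$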

Things You Can Do With 2x2
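2D Mirror About Y-Axis
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
(Figure: before and after)
2D Mirror About X,Y
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
(Figure: before and after)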

What’s Preserved?
Projections of parallel 3D
lines are not necessarily
parallel, so not parallelism
3D lines project to 2D lines
so lines are preserved
Distant objects are smaller
so size is not preserved

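What’s Preserved With a 2x2
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = T \begin{bmatrix} x \\ y \end{bmatrix}$$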
After multiplication by T (irrespective of T)
• Origin is origin: 0 = T0
• Lines are lines
• Parallel lines are parallel

Things You Can’t Do With 2x2
What about translation?
x’ = x + tx, y’ = y+ty
+(2,2)
How do we make it linear?

Homogeneous Coordinates Again
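What about translation?
x’ = x + tx, y’ = y + ty
+(2,2)
$$\begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix} \equiv \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \equiv \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$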
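To make the homogeneous trick concrete, here is a small sketch of translation as a 3x3 matrix, using the +(2,2) shift from the slide. The helper name `translation_matrix` is an assumption for illustration.

```python
import numpy as np

def translation_matrix(tx, ty):
    """3x3 homogeneous matrix that shifts by (tx, ty) (hypothetical helper)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

T = translation_matrix(2.0, 2.0)
p = np.array([1.0, 1.0, 1.0])   # the 2D point (1, 1) with a trailing 1
p_new = T @ p                   # the shifted point, still with last entry 1
```

Appending the 1 is what lets the third column of the matrix add a constant offset, which a 2x2 matrix cannot do.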

Representing 2D Transformations
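How do we represent a 2D transformation?
Let’s pick scaling
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \equiv \begin{bmatrix} s_x & 0 & a \\ 0 & s_y & b \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
What’s a, b, d, e, f?
a = 0, b = 0, d = 0, e = 0, f = 1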

Affine Transformations
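Affine: linear transformation plus translation
In general (without homogeneous coordinates)
$\mathbf{x}' = \mathbf{A}\mathbf{x} + \mathbf{b}$
With homogeneous coordinates (the last column holds the translation t):
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Will the last coordinate w’ always be 1?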

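Matrix Composition
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \underbrace{\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}}_{T(t_x, t_y)} \underbrace{\begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{R(\theta)} \underbrace{\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{S(s_x, s_y)} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$
We can combine transformations via matrix
multiplication.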
Does order matter?
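One way to answer the order question is to try it. The sketch below, with hypothetical helper names `translate`, `rotate`, and `scale`, composes the same rotation and translation in both orders and applies them to one point; the results differ, so order matters in general.

```python
import numpy as np

def translate(tx, ty):
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def rotate(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def scale(sx, sy):
    return np.diag([sx, sy, 1.0])

p = np.array([1.0, 0.0, 1.0])

# The rightmost matrix is applied to the point first.
rotate_then_translate = translate(2.0, 0.0) @ rotate(np.pi / 2)
translate_then_rotate = rotate(np.pi / 2) @ translate(2.0, 0.0)
a = rotate_then_translate @ p   # (1,0) rotates to (0,1), then shifts to (2,1)
b = translate_then_rotate @ p   # (1,0) shifts to (3,0), then rotates to (0,3)

# The full T R S chain from the slide: scale, then rotate, then translate.
M = translate(2.0, 0.0) @ rotate(np.pi / 2) @ scale(2.0, 2.0)
c = M @ p                       # (1,0) -> (2,0) -> (0,2) -> (2,2)
```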

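What’s Preserved With Affine
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \equiv \mathbf{T} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Lines are lines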

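Homogeneous Equivalence
(Figure: [x,y,w] and λ[x,y,w] plotted on x, y, z axes)
Two homogeneous coordinates are
equivalent if they are proportional
to each other. Not = !
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} \equiv \begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} \leftrightarrow \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \lambda \begin{bmatrix} u' \\ v' \\ w' \end{bmatrix}, \quad \lambda \neq 0$$
Triple bar (≡): equivalent. Double bar (=): equals.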
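A sketch of how the equivalence test could be coded. It relies on the fact that two non-zero 3-vectors are proportional exactly when their cross product is zero; the function name `homogeneous_equivalent` and the tolerance are assumptions for illustration.

```python
import numpy as np

def homogeneous_equivalent(a, b, tol=1e-9):
    """True if non-zero 3-vectors a and b satisfy a = lambda * b with lambda != 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0 or nb == 0:
        return False  # the zero vector is not a valid homogeneous coordinate
    # Proportional vectors have a (numerically) zero cross product.
    return bool(np.linalg.norm(np.cross(a, b)) <= tol * na * nb)

same = homogeneous_equivalent([1, 2, 1], [3, 6, 3])       # proportional
different = homogeneous_equivalent([1, 2, 1], [1, 2, 2])  # not proportional
```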

Perspective Transformations
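Let the bottom row be something other than [0,0,1]
Called a perspective/projective transformation or a
homography
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$
Can compute [x’,y’,w’] via matrix multiplication.
How do we get a 2D point?
(x’/w’, y’/w’)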
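The two steps above (multiply, then divide by w’) can be sketched as follows. The function name `apply_homography` and the example matrix are assumptions; the code also assumes w’ is non-zero for every point.

```python
import numpy as np

def apply_homography(H, xy):
    """Map an (N, 2) array of 2D points through a 3x3 homography H."""
    xy = np.asarray(xy, dtype=float)
    ones = np.ones((xy.shape[0], 1))
    homog = np.hstack([xy, ones]) @ H.T     # each row is [x', y', w']
    return homog[:, :2] / homog[:, 2:3]     # divide by w' (assumed non-zero)

# Example with a bottom row that is not [0, 0, 1].
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])
pts = np.array([[0.0, 0.0],
                [2.0, 2.0]])
out = apply_homography(H, pts)   # (2, 2) gets w' = 2, so it lands on (1, 1)
```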

Perspective Transformations
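Let the bottom row be something other than [0,0,1]
Called a perspective/projective transformation or a
homography
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$
How many degrees of freedom?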

How Many Degrees of Freedom?
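Can always scale a homogeneous coordinate by a non-zero value
Perspective:
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \frac{1}{i} \begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} \equiv \frac{1}{i} \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$
Homography can always be re-scaled by λ≠0
Typically pick it so last entry is 1 (possible when i ≠ 0).
Of the 9 entries, the overall scale is free, which leaves 8 degrees of freedom.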
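A small numerical illustration of the rescaling argument, with made-up matrix entries: dividing H by its last entry changes the homogeneous output by a scale factor but leaves the resulting 2D point unchanged.

```python
import numpy as np

H = np.array([[2.0, 0.0, 4.0],
              [0.0, 2.0, 6.0],
              [0.0, 0.2, 2.0]])
H_normalized = H / H[2, 2]     # divide by i so the last entry is 1 (needs i != 0)

p = np.array([1.0, 5.0, 1.0])
q1 = H @ p                     # homogeneous output of the original H
q2 = H_normalized @ p          # same output scaled by 1/i
point1 = q1[:2] / q1[2]        # dividing by w' removes the scale
point2 = q2[:2] / q2[2]
```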

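What’s Preserved With Perspective
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \equiv \mathbf{T} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Lines are lines
• Ratios between distances are not preserved in general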

Transformation Families
In general: transformations are a nested set of groups
Diagram credit: R. Szeliski

What Can Homographies Do?
Homography example 1: any two views
of a planar surface
Figure Credit: S. Lazebnik

Homography example 2: any images from two
cameras sharing a camera center
Figure Credit: S. Lazebnik

Homography sort of example “3”: far away
scene that can be approximated by a plane
Figure credit: Brown & Lowe

Fun With Homographies
Original image
St. Petersburg
photo by A. Tikhonov
Virtual camera rotations
Slide Credit: A. Efros

Analyzing Patterns
Homography
Automatically rectified floor
The floor (enlarged)
Slide from A. Criminisi

Analyzing Patterns
Automatic rectification
From Martin Kemp, The Science of Art
(manual reconstruction)
Slide from A. Criminisi

Fitting Transformations
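Setup: have pairs of correspondences $(x_i, y_i) \leftrightarrow (x_i', y_i')$ related by M, t
$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \mathbf{M} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \mathbf{t}$$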
Slide Credit: S. Lazebnik

Fitting Transformation
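Affine Transformation: M,t
Data: $(x_i, y_i, x_i', y_i')$ for $i = 1, \ldots, k$
Model: $[x_i', y_i'] = \mathbf{M}[x_i, y_i] + \mathbf{t}$
Objective function: $\arg\min_{\mathbf{M}, \mathbf{t}} \sum_{i=1}^{k} \| [x_i', y_i'] - (\mathbf{M}[x_i, y_i] + \mathbf{t}) \|^2$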

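Given correspondences: $[x_i', y_i'] \leftrightarrow [x_i, y_i]$
$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$
Set up two equations per point
$$\begin{bmatrix} \vdots \\ x_i' \\ y_i' \\ \vdots \end{bmatrix} = \begin{bmatrix} & & \vdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \vdots & & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_x \\ t_y \end{bmatrix}$$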

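$$\begin{bmatrix} \vdots \\ x_i' \\ y_i' \\ \vdots \end{bmatrix} = \begin{bmatrix} & & \vdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \vdots & & & \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_x \\ t_y \end{bmatrix}$$
2 equations per point, 6 unknowns: the matrix has 2k rows and 6 columns
How many points do we need to properly
constrain the problem?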

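$$\underbrace{\begin{bmatrix} \vdots \\ x_i' \\ y_i' \\ \vdots \end{bmatrix}}_{\mathbf{b}\ (2k)} = \underbrace{\begin{bmatrix} & & \vdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \vdots & & & \end{bmatrix}}_{\mathbf{A}\ (2k \times 6)} \underbrace{\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_x \\ t_y \end{bmatrix}}_{\mathbf{x}\ (6)}$$
Want: b = Ax (x contains all parameters)
Overconstrained, so solve $\arg\min_{\mathbf{x}} \| \mathbf{A}\mathbf{x} - \mathbf{b} \|^2$
How?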
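One possible answer to "How?" is ordinary least squares. The sketch below builds A and b with two rows per correspondence and solves with `np.linalg.lstsq`; the function name `fit_affine` and the synthetic test data are assumptions for illustration.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine fit dst ~ M @ src + t from (k, 2) arrays, k >= 3."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    k = src.shape[0]
    A = np.zeros((2 * k, 6))
    A[0::2, 0:2] = src       # x' rows: [x, y, 0, 0, 1, 0]
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = src       # y' rows: [0, 0, x, y, 0, 1]
    A[1::2, 5] = 1.0
    b = dst.reshape(-1)      # [x'_1, y'_1, x'_2, y'_2, ...]
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    M = params[:4].reshape(2, 2)   # [[m1, m2], [m3, m4]]
    t = params[4:]                 # [tx, ty]
    return M, t

# Synthetic check: generate points from a known affine map and recover it.
M_true = np.array([[1.0, 0.5], [-0.5, 2.0]])
t_true = np.array([3.0, -1.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = src @ M_true.T + t_true
M_est, t_est = fit_affine(src, dst)
```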

Homography: H
Data: $(x_i, y_i, x_i', y_i')$ for $i = 1, \ldots, k$
Model: $[x_i', y_i', 1] \equiv \mathbf{H}[x_i, y_i, 1]$
Objective function: It’s complicated

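$$\mathbf{p}_i = \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}$$
$$\mathbf{A}\mathbf{h} = \begin{bmatrix} \mathbf{0}^T & -\mathbf{p}_1^T & y_1' \mathbf{p}_1^T \\ \mathbf{p}_1^T & \mathbf{0}^T & -x_1' \mathbf{p}_1^T \\ & \vdots & \\ \mathbf{0}^T & -\mathbf{p}_k^T & y_k' \mathbf{p}_k^T \\ \mathbf{p}_k^T & \mathbf{0}^T & -x_k' \mathbf{p}_k^T \end{bmatrix} \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{bmatrix} = \mathbf{0}$$
A has 2k rows (k points → 2k) and 9 columns; $\mathbf{h}_1$ is row 1 of H, and likewise for $\mathbf{h}_2$, $\mathbf{h}_3$.
What do we use from last time?
$$\mathbf{h}^* = \arg\min_{\|\mathbf{h}\| = 1} \| \mathbf{A}\mathbf{h} \|^2$$
Eigenvector of $\mathbf{A}^T\mathbf{A}$ with smallest eigenvalue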
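A minimal sketch of this algebraic fit. It uses the SVD of A instead of an explicit eigendecomposition of A^T A; the right singular vector with the smallest singular value is the same vector. The function name `fit_homography` and the test data are assumptions, and the rows are stacked in two groups rather than interleaved, which does not change the solution.

```python
import numpy as np

def fit_homography(src, dst):
    """Algebraic fit of H with dst ~ H @ src, from (k, 2) arrays with k >= 4."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    k = src.shape[0]
    P = np.hstack([src, np.ones((k, 1))])        # rows are p_i^T = [x_i, y_i, 1]
    xp, yp = dst[:, 0:1], dst[:, 1:2]
    zeros = np.zeros((k, 3))
    rows_a = np.hstack([zeros, -P, yp * P])      # [0^T, -p^T, y' p^T]
    rows_b = np.hstack([P, zeros, -xp * P])      # [p^T, 0^T, -x' p^T]
    A = np.vstack([rows_a, rows_b])              # shape (2k, 9)
    # Unit vector minimizing ||A h||: last right singular vector of A.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)                  # rows are h1, h2, h3

# Synthetic check against a known homography.
H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -2.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]])
src_h = np.hstack([src, np.ones((5, 1))]) @ H_true.T
dst = src_h[:, :2] / src_h[:, 2:3]
H_est = fit_homography(src, dst)
H_est = H_est / H_est[2, 2]                      # fix the free scale for comparison
```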

In Practice
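$$\mathbf{p}_i = \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}$$
$$\mathbf{A}\mathbf{h} = \begin{bmatrix} \mathbf{0}^T & -\mathbf{p}_1^T & y_1' \mathbf{p}_1^T \\ \mathbf{p}_1^T & \mathbf{0}^T & -x_1' \mathbf{p}_1^T \\ & \vdots & \\ \mathbf{0}^T & -\mathbf{p}_k^T & y_k' \mathbf{p}_k^T \\ \mathbf{p}_k^T & \mathbf{0}^T & -x_k' \mathbf{p}_k^T \end{bmatrix} \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{bmatrix} = \mathbf{0}$$
A has 2k rows (k points → 2k) and 9 columns; $\mathbf{h}_1$ is row 1 of H.
A should consist of lots of {x, y, x’, y’, 0, and 1}.
If it fails, assume you mistyped.
Re-type differently and compare all entries.
Debug first with transformations you know.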

Small Nagging Detail
$\| \mathbf{A}\mathbf{h} \|^2$ doesn’t measure model fit (it’s an algebraic error
that’s mainly just convenient to minimize)
Also, there’s a least-squares setup that’s wrong but
often works.
Really want geometric error:
$$\sum_{i=1}^{k} \| [x_i', y_i'] - T([x_i, y_i]) \|^2 + \| [x_i, y_i] - T^{-1}([x_i', y_i']) \|^2$$
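The geometric error above can be evaluated directly for a candidate homography. This is a sketch under the assumption that T is a 3x3 invertible matrix H; the names `project` and `symmetric_transfer_error` are hypothetical.

```python
import numpy as np

def project(H, xy):
    """Apply a 3x3 homography to (N, 2) points and return (N, 2) points."""
    homog = np.hstack([xy, np.ones((xy.shape[0], 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def symmetric_transfer_error(H, src, dst):
    """Sum over matches of ||dst - H(src)||^2 + ||src - H^-1(dst)||^2."""
    H_inv = np.linalg.inv(H)
    forward = np.sum((dst - project(H, src)) ** 2, axis=1)
    backward = np.sum((src - project(H_inv, dst)) ** 2, axis=1)
    return float(np.sum(forward + backward))

# A pure translation by (2, 2) as the candidate transformation.
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 2.0],
              [0.0, 0.0, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 2.0]])
dst_exact = src + 2.0
err_exact = symmetric_transfer_error(H, src, dst_exact)

dst_noisy = dst_exact.copy()
dst_noisy[0, 0] += 1.0          # move one match by one pixel in x
err_noisy = symmetric_transfer_error(H, src, dst_noisy)
```

Unlike ||Ah||, this quantity is measured in squared pixels, so a one-pixel slip in one match contributes 1 in each direction.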

Small Nagging Detail
In RANSAC, we always take just enough points to
fit. Why might this not make a big difference when
fitting a model with RANSAC?
Solution: initialize with algebraic (min ||Ah||), optimize
with geometric using standard non-linear optimizer

Image Warping
(Figure: T(x,y) maps the source image f(x,y) to the transformed image g(x’,y’))
Given a coordinate transform (x’,y’) = T(x,y) and a
source image f(x,y), how do we compute a
transformed image g(x’,y’) = f(x,y), i.e. g(T(x,y)) = f(x,y)?

Forward Warping
(Figure: T(x,y) maps pixel (x,y) of f(x,y) to pixel (x’,y’) of g(x’,y’))
Send the value at each pixel (x,y) to
the new pixel (x’,y’) = T([x,y])

Forward Warping
(Figure: T(x,y) maps pixel (x,y) of f(x,y) to a location between pixels x’-1, x’, x’+1 and y’-1, y’, y’+1 of g(x’,y’))
If you don’t hit an exact pixel, give the value to each of
the neighboring pixels (“splatting”).

Forward Warping
Suppose T(x,y) scales by a factor of 3.
Hmmmm.
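A toy sketch of what goes wrong in this case: forward-warping with a scale of 3 writes only every third output pixel in each direction, leaving holes. The function name `forward_warp_scale` and the tiny test image are assumptions for illustration.

```python
import numpy as np

def forward_warp_scale(f, s):
    """Forward-warp image f by integer scale s, sending f[y, x] to g[s*y, s*x]."""
    h, w = f.shape
    g = np.zeros((h * s, w * s), dtype=f.dtype)
    filled = np.zeros(g.shape, dtype=bool)   # tracks which output pixels were hit
    for y in range(h):
        for x in range(w):
            g[s * y, s * x] = f[y, x]
            filled[s * y, s * x] = True
    return g, filled

f = np.ones((4, 4))
g, filled = forward_warp_scale(f, 3)
hole_fraction = 1.0 - filled.mean()   # 16 of 144 output pixels are hit, 8/9 are holes
```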

Inverse Warping
(Figure: $T^{-1}$ maps pixel (x’,y’) of g(x’,y’) back to a location (x,y) in f(x,y))
Find out where each pixel g(x’,y’) should get its value
from, and steal it.
Note: requires ability to invert T

Inverse Warping
(Figure: $T^{-1}$ maps pixel (x’,y’) of g(x’,y’) to a location between pixels x-1, x, x+1 and y-1, y, y+1 of f(x,y))
If you don’t hit an exact pixel, figure out how to take it
from the neighbors.
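A sketch of inverse warping where "taking it from the neighbors" is done with bilinear interpolation. That choice is an assumption (nearest neighbor would also fit the slide), as are the names `bilinear_sample` and `inverse_warp`. The loops are written for clarity, not speed.

```python
import numpy as np

def bilinear_sample(f, x, y):
    """Sample image f at real-valued (x, y); returns 0 outside the image."""
    h, w = f.shape
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return 0.0
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0                       # fractional offsets in [0, 1)
    top = (1 - ax) * f[y0, x0] + ax * f[y0, x1]
    bottom = (1 - ax) * f[y1, x0] + ax * f[y1, x1]
    return (1 - ay) * top + ay * bottom

def inverse_warp(f, T_inv, out_shape):
    """Fill every output pixel (x', y') from f at T^-1(x', y')."""
    g = np.zeros(out_shape)
    for yp in range(out_shape[0]):
        for xp in range(out_shape[1]):
            x, y, w = T_inv @ np.array([xp, yp, 1.0])
            g[yp, xp] = bilinear_sample(f, x / w, y / w)
    return g

f = np.arange(16, dtype=float).reshape(4, 4)      # f[y, x] = 4*y + x
T = np.diag([3.0, 3.0, 1.0])                      # scale by 3, as in the forward example
g = inverse_warp(f, np.linalg.inv(T), (10, 10))
```

Because every output pixel looks up its own source location, scaling by 3 no longer leaves holes.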

Mosaicing
(Figure: Warped Input 1 (I1) and Warped Input 2 (I2). Image Credit: A. Efros)
Can warp an image. Pixels that don’t have a
corresponding pixel in the image are set to a
chosen value (often 0)

Mosaicing
(Figure: Warped Input 1 (I1), Warped Input 2 (I2), the weight α, and the blend αI1 + (1-α)I2. Image Credit: A. Efros)

Mosaicing
(Figure: Warped Input 1 (I1), Warped Input 2 (I2), the weight α, and the blend αI1 + (1-α)I2)
Can also warp an image containing 1s. Pixels
that don’t have a corresponding pixel in the
image are set to a chosen value (often 0)
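A sketch of how the warped all-ones images can drive the blend. Deriving α as image 1's share of the valid pixels is one simple assumption, not necessarily what the course uses; the names `blend` and `alpha_from_masks` and the 1x4 toy images are hypothetical.

```python
import numpy as np

def blend(I1, I2, alpha):
    """Per-pixel blend alpha * I1 + (1 - alpha) * I2."""
    return alpha * I1 + (1.0 - alpha) * I2

def alpha_from_masks(mask1, mask2):
    """Alpha = share of image 1 among valid pixels; 0 where neither image is valid."""
    total = mask1 + mask2
    return np.divide(mask1, total, out=np.zeros_like(total, dtype=float), where=total > 0)

# Toy 1x4 "warped" images; 0 marks pixels with no source pixel.
I1 = np.array([[10.0, 10.0, 10.0, 0.0]])
I2 = np.array([[0.0, 20.0, 20.0, 20.0]])
mask1 = np.array([[1.0, 1.0, 1.0, 0.0]])   # warped image of 1s for input 1
mask2 = np.array([[0.0, 1.0, 1.0, 1.0]])   # warped image of 1s for input 2

alpha = alpha_from_masks(mask1, mask2)      # 1 where only I1, 0.5 in the overlap, 0 where only I2
mosaic = blend(I1, I2, alpha)
```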

Putting it Together
How do you make a panorama?
Step 1: Find “features” to match
Step 2: Describe Features
Step 3: Match by Nearest Neighbor
Step 4: Fit H via RANSAC
Step 5: Blend Images

Putting It Together 1
• (Multi-scale) Harris; or
• Laplacian of Gaussian
Find corners/blobs

Describe Regions Near Features
Build histogram of gradient orientations (SIFT)
(But in practice use OpenCV)
$x_q \in \mathbb{R}^{128}$

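Match Features Based On Region
Query descriptor $x_q \in \mathbb{R}^{128}$; candidate descriptors $x_1, x_2, \ldots \in \mathbb{R}^{128}$
Sort by distance to $x_q$: $\| x_q - x_1 \| < \| x_q - x_2 \| < \| x_q - x_3 \|$
Accept match if the ratio $\| x_q - x_1 \| / \| x_q - x_2 \|$ is small:
nearest neighbor is far closer than 2nd
nearest neighbor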
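The ratio test can be sketched as below. The threshold `max_ratio=0.7` is a placeholder assumption (the slide gives no value), the function name `ratio_test_matches` is hypothetical, and the 2D "descriptors" stand in for 128-dimensional ones to keep the example readable.

```python
import numpy as np

def ratio_test_matches(query, database, max_ratio=0.7):
    """Return (query_index, database_index) pairs that pass the ratio test."""
    matches = []
    for qi, xq in enumerate(query):
        dists = np.linalg.norm(database - xq, axis=1)   # L2 distance to every candidate
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        # Accept only if the nearest neighbor is far closer than the 2nd nearest.
        if dists[nearest] < max_ratio * dists[second]:
            matches.append((qi, int(nearest)))
    return matches

database = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 1.0]])
query = np.array([[0.1, 0.0],     # clearly closest to database[0]
                  [10.0, 0.5]])   # equally close to database[1] and database[2]
matches = ratio_test_matches(query, database)
```

The second query is rejected because its two nearest neighbors are equally far away, so the match is ambiguous.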

Fit transformation H via RANSAC
for trial in range(Ntrials):
    Pick sample
    Fit model
    Check if more inliers
Re-fit model with most inliers
Fit step: $\arg\min_{\|\mathbf{h}\| = 1} \| \mathbf{A}\mathbf{h} \|^2$
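The loop above can be turned into a runnable sketch. Everything specific here is an assumption made for illustration: the names `fit_homography`, `project`, and `ransac_homography`, the 4-point sample size, the trial count, the 2-pixel inlier threshold, and the synthetic data.

```python
import numpy as np

def fit_homography(src, dst):
    """Algebraic fit: unit h minimizing ||A h||, via SVD."""
    k = src.shape[0]
    P = np.hstack([src, np.ones((k, 1))])
    zeros = np.zeros((k, 3))
    A = np.vstack([np.hstack([zeros, -P, dst[:, 1:2] * P]),
                   np.hstack([P, zeros, -dst[:, 0:1] * P])])
    return np.linalg.svd(A)[2][-1].reshape(3, 3)

def project(H, xy):
    homog = np.hstack([xy, np.ones((xy.shape[0], 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, n_trials=200, inlier_thresh=2.0, seed=0):
    rng = np.random.default_rng(seed)
    k = src.shape[0]
    best_inliers = np.zeros(k, dtype=bool)
    for _ in range(n_trials):
        sample = rng.choice(k, size=4, replace=False)      # pick sample
        H = fit_homography(src[sample], dst[sample])       # fit model
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < inlier_thresh                      # NaN errors count as outliers
        if inliers.sum() > best_inliers.sum():             # check if more inliers
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("RANSAC did not find enough inliers")
    H = fit_homography(src[best_inliers], dst[best_inliers])  # re-fit with most inliers
    return H, best_inliers

# Synthetic matches: 30 points from a known homography, the first 8 corrupted.
rng = np.random.default_rng(1)
H_true = np.array([[1.0, 0.1, 5.0], [-0.05, 0.9, 10.0], [1e-4, 2e-4, 1.0]])
src = rng.uniform(0.0, 100.0, size=(30, 2))
dst = project(H_true, src)
dst[:8] += rng.uniform(30.0, 80.0, size=(8, 2))
H_est, inliers = ransac_homography(src, dst)
```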

Warp images together
Resample images with inverse
warping and blend
(but in practice, just call opencv for
inverse warping)

A pencil of rays contains all views
real
camera
synthetic
camera
Can generate any synthetic camera view
as long as it has the same center of projection!

Analyzing Patterns
St. Lucy Altarpiece, D. Veneziano
What is the (complicated)
shape of the floor pattern?
Automatically rectified floor

Analyzing Patterns
From Martin Kemp, The Science of Art
(manual reconstruction)
Automatic rectification

Homography Derivation
• This has gotten cut in favor of showing more of
the setup.
• The key to the set-up is to try to move towards
a setup where you can pull [h1,h2,h3] out, or
where each row is a linear equation in
[h1,h2,h3]

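Want:
$$\begin{bmatrix} x_i' \\ y_i' \\ w_i' \end{bmatrix} \equiv \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ w_i \end{bmatrix}$$
Recall: $\mathbf{a} \equiv \mathbf{b} \leftrightarrow \mathbf{a} = \lambda \mathbf{b} \leftrightarrow \mathbf{a} \times \mathbf{b} = \mathbf{0}$
In turn, with $\mathbf{p}_i = [x_i, y_i, 1]^T$:
$$\mathbf{H}\mathbf{p}_i \equiv \begin{bmatrix} \mathbf{h}_1^T \\ \mathbf{h}_2^T \\ \mathbf{h}_3^T \end{bmatrix} \mathbf{p}_i \equiv \begin{bmatrix} \mathbf{h}_1^T \mathbf{p}_i \\ \mathbf{h}_2^T \mathbf{p}_i \\ \mathbf{h}_3^T \mathbf{p}_i \end{bmatrix}$$
In the end want:
$$\begin{bmatrix} x_i' \\ y_i' \\ w_i' \end{bmatrix} \times \begin{bmatrix} \mathbf{h}_1^T \mathbf{p}_i \\ \mathbf{h}_2^T \mathbf{p}_i \\ \mathbf{h}_3^T \mathbf{p}_i \end{bmatrix} = \mathbf{0}$$
Why cross products? Cross products have explicit forms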

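Want:
$$\begin{bmatrix} x_i' \\ y_i' \\ w_i' \end{bmatrix} \times \begin{bmatrix} \mathbf{h}_1^T \mathbf{p}_i \\ \mathbf{h}_2^T \mathbf{p}_i \\ \mathbf{h}_3^T \mathbf{p}_i \end{bmatrix} = \mathbf{0}$$
Cross-product:
$$\begin{bmatrix} y_i' \mathbf{h}_3^T \mathbf{p}_i - w_i' \mathbf{h}_2^T \mathbf{p}_i \\ w_i' \mathbf{h}_1^T \mathbf{p}_i - x_i' \mathbf{h}_3^T \mathbf{p}_i \\ x_i' \mathbf{h}_2^T \mathbf{p}_i - y_i' \mathbf{h}_1^T \mathbf{p}_i \end{bmatrix} = \mathbf{0}$$
Re-arrange and put 0s in:
$$\begin{bmatrix} \mathbf{h}_1^T \mathbf{0} - w_i' \mathbf{h}_2^T \mathbf{p}_i + y_i' \mathbf{h}_3^T \mathbf{p}_i \\ w_i' \mathbf{h}_1^T \mathbf{p}_i + \mathbf{h}_2^T \mathbf{0} - x_i' \mathbf{h}_3^T \mathbf{p}_i \\ -y_i' \mathbf{h}_1^T \mathbf{p}_i + x_i' \mathbf{h}_2^T \mathbf{p}_i + \mathbf{h}_3^T \mathbf{0} \end{bmatrix} = \mathbf{0}$$
Note: calculate this explicitly. It looks ugly, but do it by doing [a,b,c] × [a’,b’,c’] then re-substituting. You want to be able to right-multiply by [h1,h2,h3]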

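Equation:
$$\begin{bmatrix} \mathbf{h}_1^T \mathbf{0} - w_i' \mathbf{h}_2^T \mathbf{p}_i + y_i' \mathbf{h}_3^T \mathbf{p}_i \\ w_i' \mathbf{h}_1^T \mathbf{p}_i + \mathbf{h}_2^T \mathbf{0} - x_i' \mathbf{h}_3^T \mathbf{p}_i \\ -y_i' \mathbf{h}_1^T \mathbf{p}_i + x_i' \mathbf{h}_2^T \mathbf{p}_i + \mathbf{h}_3^T \mathbf{0} \end{bmatrix} = \mathbf{0}$$
Pull out h:
$$\begin{bmatrix} \mathbf{0}^T & -w_i' \mathbf{p}_i^T & y_i' \mathbf{p}_i^T \\ w_i' \mathbf{p}_i^T & \mathbf{0}^T & -x_i' \mathbf{p}_i^T \\ -y_i' \mathbf{p}_i^T & x_i' \mathbf{p}_i^T & \mathbf{0}^T \end{bmatrix} \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{bmatrix} = \mathbf{0}$$
Only two linearly independent equations
Yank out h once you have all the coefficients.
If you’re head-scratching about the two equations: it wasn’t obvious to me at
first glance either that the three equations aren’t linearly independent.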
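The dependence can be checked numerically. In this sketch (the function name `cross_product_rows` and the example point are assumptions), the three rows for one correspondence have rank 2, and weighting them by x’, y’, w’ gives the zero row, because a cross product with a vector is always orthogonal to that vector.

```python
import numpy as np

def cross_product_rows(p, q):
    """3x9 matrix of cross-product equations for one match p <-> q = (x', y', w')."""
    xp, yp, wp = q
    z = np.zeros(3)
    return np.array([
        np.concatenate([z, -wp * p, yp * p]),    # [0^T, -w' p^T,  y' p^T]
        np.concatenate([wp * p, z, -xp * p]),    # [w' p^T, 0^T,  -x' p^T]
        np.concatenate([-yp * p, xp * p, z]),    # [-y' p^T, x' p^T, 0^T]
    ])

p = np.array([2.0, 3.0, 1.0])
q = np.array([5.0, 7.0, 1.0])
rows = cross_product_rows(p, q)
rank = np.linalg.matrix_rank(rows)
# x' * row1 + y' * row2 + w' * row3 cancels term by term.
dependency = q[0] * rows[0] + q[1] * rows[1] + q[2] * rows[2]
```

This is why only the first two rows are kept per point in the 2k-row system.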

Simplification: Two-band Blending
• Brown & Lowe, 2003
• Only use two bands: high freq. and low freq.
• Blend low freq. smoothly
• Blend high freq. with no smoothing: binary alpha
Figure Credit: Brown & Lowe

2-band “Laplacian Stack” Blending
Low frequency (λ > 2 pixels)
High frequency (λ < 2 pixels)
