-
Visual Information Processing (視覚情報処理論)
Offered by: Graduate School of Interdisciplinary Information Studies, Wed, 5th period [16:50-18:35]
-
Schedule
• 9/26 Introduction (Prof. Oishi)
• 10/3 Patch-based Object Recognition (1) (Dr. Kagesawa)
• 10/10 Patch-based Object Recognition (2) (Dr. Kagesawa)
• 10/17 Computer Vision basics (1) (Prof. Oishi)
• 10/24 Computer Vision basics (2) (Prof. Oishi)
• 10/31 Image and Video Inpainting (1) (Dr. Roxas) (in English)
• 11/7 Image and Video Inpainting (2) (Dr. Roxas) (in English)
• 11/14 (Cancelled)
• 11/21 Vision for Robotics Applications (1) (Dr. Sato)
• 11/28 Vision for Robotics Applications (2) (Dr. Sato)
• 12/5 3D Data Visualization (1) (Dr. Okamoto)
• 12/12 3D Data Visualization (2) (Dr. Okamoto)
• 12/19 3D Data Processing (1) (Prof. Oishi)
• 1/9 3D Data Processing (2) (Prof. Oishi)
-
Computer Vision Paradigm (Marr)
2.5D Image
2D Image
3D representation
Integration
Brightness Texture Line drawing Stereo Motion
Observer oriented
3D Feature Extraction(shape-from-x)
Object oriented 3D Model
-
Digital image processing (2D)
-
What is a digital image?
Analog information (film, painting, real world)
Digital image
• Digital camera
• Smartphone
• PC data, IT
• Digital broadband
Discretization & Sampling
-
Sampling
Discrete segmentation of analog data
Analog data (both time and value are continuous)
Sampled data (time is discrete)
Sampling interval
-
Sampling: 2D digital image
Image resolution is defined by the sampling interval
-
What is a pixel?
Unit of a 2D digital image (spatial sampling)
Digital image: M x N pixels
[Figure: pixel grid with rows m = 0, 1, ..., M-1 and columns n = 0, 1, ..., N-1]
-
Sampling: Resolution
320 x 240 pixels
160 x 120 pixels
80 x 60 pixels
40 x 30 pixels
-
Quantization
Sampled values are discretized
Sampled data (time is discrete)
Quantization bits: 3 bit = 8 levels, 8 bit = 256 levels
Digital data (both time and value are discrete)
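As a rough illustration of quantization, the sketch below maps a sample in the range 0 to 1 onto one of 2^bits integer levels. The function name `quantize`, the 0-to-1 input range, and the sample values are assumptions made for this example only.

```python
def quantize(value, bits):
    """Map a sample in [0.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    # Clamp to the valid range, then scale; 1.0 maps to the highest level.
    clamped = min(max(value, 0.0), 1.0)
    return min(int(clamped * levels), levels - 1)

samples = [0.0, 0.12, 0.5, 0.87, 1.0]
coarse = [quantize(s, 3) for s in samples]  # 3 bit: levels 0..7
fine = [quantize(s, 8) for s in samples]    # 8 bit: levels 0..255
print(coarse)
print(fine)
```

With 3 bits, neighboring samples such as 0.0 and 0.12 fall into the same level; with 8 bits they stay distinct.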
-
Quantization: 2-D digital image
Number of colors depends on the quantization bit depth
0 0 0 1 1 1 1
0 0 1 2 2 2 1
0 0 2 3 3 2 1
0 2 3 5 3 2 0
0 2 3 3 3 2 0
0 1 2 2 2 0 0
0 1 1 1 0 0 0
Color is represented by numbers
-
Color representation
How many colors do we need?
4 colors (2 bit)
16 colors (4 bit)
256 colors (8 bit)
16.7 million colors (24 bit)
-
High Dynamic Range Imaging: HDRI
-
Exposure time - Intensity [Mathias Eitz, Claudia Stripf, High Dynamic Range Imaging, 2007]
Under Exposure Over Exposure
-
Dynamic range
Human
Camera
-
Multiple capturing
-
Camera response function
Exposure Exposure
-
Estimation of camera response function
Capture multiple images with different exposure times
-
Computation of response curve
$Z_{ij} = f(E_i t_j)$
$f^{-1}(Z_{ij}) = E_i t_j$
$\ln f^{-1}(Z_{ij}) = \ln E_i + \ln t_j$
[Figure: response curve plotted against log exposure]
$Z_{ij}$: pixel value
$f$: camera response function
$E_i$: radiance
$t_j$: exposure time
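The log relation above can be turned around: if the inverse response is already known, each exposure gives one estimate of ln E_i, and the estimates can be averaged. The sketch below assumes a hypothetical linear camera, f(x) = 255 x, clipped at 255. It is not the response-curve estimation itself, and the names `g_linear` and `log_radiance` are invented for the example.

```python
import math

def g_linear(z):
    """Assumed inverse response ln f^-1(Z) for a linear camera f(x) = 255 x."""
    return math.log(z / 255.0)

def log_radiance(pixel_values, exposure_times, g=g_linear):
    """Average ln E_i = g(Z_ij) - ln t_j over exposures j, skipping clipped values."""
    estimates = [g(z) - math.log(t)
                 for z, t in zip(pixel_values, exposure_times)
                 if 0 < z < 255]
    return sum(estimates) / len(estimates)

# One pixel observed at three exposure times (seconds); the longest one saturates.
times = [0.01, 0.1, 1.0]
true_radiance = 2.0
Z = [min(255.0, 255.0 * true_radiance * t) for t in times]

recovered = math.exp(log_radiance(Z, times))
print(recovered)
```

The saturated measurement carries no usable information, which is why several exposure times are captured.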
-
Displaying HDRI
HDRI
LDRI
-
Tone mapping
• Linear mapping
• Logarithmic mapping
• Global Reinhard operator
$L_d(x, y) = \dfrac{L(x, y)}{1 + L(x, y)}$
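A minimal sketch of the global Reinhard operator, applied per pixel to a list of luminance values. The input values and the 8-bit display scaling are assumptions for illustration.

```python
def reinhard(luminance):
    """Global Reinhard operator: maps [0, inf) into [0, 1)."""
    return luminance / (1.0 + luminance)

# HDR luminances spanning several orders of magnitude (made-up values).
hdr = [0.0, 0.5, 1.0, 10.0, 1000.0]
ldr = [reinhard(v) for v in hdr]

# Scale to 8-bit display values.
display = [round(255 * v) for v in ldr]
print(display)
```

Dark values are mapped almost linearly, while very bright values are compressed toward the top of the display range.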
-
Results of tone mapping
without tone mapping with tone mapping
-
HDRI Video [Kalantari et al. Patch-Based High Dynamic Range Video, TOG 2013]
-
Filtering
-
Filtering
Pre-processing for Computer Vision
• Noise reduction
• Image enhancement
• Feature extraction
FILTER ?
-
Spatial – Frequency filter
Processing in spatial domain
• Neighboring pixels
Processing in frequency domain
• Using Fourier Transform
-
Image Noise
Noise source
• Capturing
• Compression/Transfer
-
Mean filter
Replace value with mean of neighboring points
Window size: 3 x 3 (5 x 5) (7 x 7)
Kernel: every weight is 1 / 9
[Figure: 5 x 5 input image, 3 x 3 kernel, and output image]
Example for one window: 10/9 + 8/9 + 8/9 + 8/9 + 5/9 + 0/9 + 9/9 + 7/9 + 7/9 = 62/9 ≈ 6.9, so the output value is 7
-
Mean filter
Weighted average
Kernel: 4 / 16 at the center, 2 / 16 for the four edge neighbors, 1 / 16 for the four corners
[Figure: 5 x 5 input image, 3 x 3 weighted kernel, and output image]
Example for one window: 40/16 + 16/16 + 16/16 + 16/16 + 18/16 + 5/16 + 0/16 + 7/16 + 7/16 = 125/16 ≈ 7.8, so the output value is 8
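The plain and weighted averages can be checked with a few lines of code. This sketch applies both 3 x 3 kernels to the nine window values used in the slides' example. The row-by-row arrangement of the window (center 10) and the function name `apply_kernel` are assumptions.

```python
def apply_kernel(window, kernel):
    """Weighted sum of a 3 x 3 window with a 3 x 3 kernel."""
    return sum(window[r][c] * kernel[r][c]
               for r in range(3) for c in range(3))

# Window values from the worked example; the arrangement is assumed.
window = [[7, 8, 5],
          [9, 10, 8],
          [7, 8, 0]]

box = [[1 / 9] * 3 for _ in range(3)]
weighted = [[1 / 16, 2 / 16, 1 / 16],
            [2 / 16, 4 / 16, 2 / 16],
            [1 / 16, 2 / 16, 1 / 16]]

mean_value = apply_kernel(window, box)           # 62 / 9
weighted_value = apply_kernel(window, weighted)  # 125 / 16
print(round(mean_value), round(weighted_value))
```

The weighted kernel gives the bright center pixel more influence, so its result stays closer to the original value of 10.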
-
Mean filter (Smoothing, Averaging) for Gaussian noise
Noise image (5% Gaussian)
Average    Weighted average
-
Mean filter
Ex. Shot noise
Noise image (Random binary)
Average    Weighted average
-
Non-linear filter
Maximum filter
• Replace target value with maximum value in a window
Minimum filter
• Replace target value with minimum value in a window
Median filter
-
Median filter
Replace target value with median value in a window
[Figure: 5 x 5 input image and output image]
Example for one 3 x 3 window: 7 8 5 9 10 8 7 8 0 → sort → 10 9 8 8 8 7 7 5 0 → median 8
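The sort-and-pick-the-middle step can be sketched as follows, using the nine window values from the example and, as a second assumed illustration, a made-up 1-D signal with shot noise. The helper `median_filter_1d` is hypothetical and leaves the border samples unchanged.

```python
from statistics import median

# Window values from the worked example.
window_values = [7, 8, 5, 9, 10, 8, 7, 8, 0]
print(sorted(window_values, reverse=True))
print(median(window_values), max(window_values), min(window_values))

def median_filter_1d(signal, size=3):
    """Replace each interior sample with the median of its neighborhood."""
    half = size // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        out[i] = median(signal[i - half:i + half + 1])
    return out

# Made-up signal with two shot-noise samples (255 and 0).
noisy = [5, 5, 255, 5, 5, 6, 0, 6]
print(median_filter_1d(noisy))
```

Isolated outliers disappear completely, whereas a mean filter would only spread them over their neighbors.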
-
Median filter
3 x 3 Filter
Shot noise    Gaussian noise
-
Edge detection
-
Edge types
• Step edge
• Roof edge
• Peak edge
[Figure: intensity profile along x for each edge type]
-
1-D edge differential
First and second order differentials
Fig. from Digital Image Processing (Springer)
Original signal
First order
Second order
-
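Gradient-based
Operator of first order differential
$\nabla f = \left( \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y} \right) = (f_x, f_y)$
Discrete difference equation
2 x 2 size
$f_x(m, n) = f(m+1, n) - f(m, n)$
$f_y(m, n) = f(m, n+1) - f(m, n)$
3 x 3 size
$f_x(m, n) = f(m+1, n) - f(m-1, n)$
$f_y(m, n) = f(m, n+1) - f(m, n-1)$
Strength and direction of edge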
-
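Gradient-based
Operators
• Roberts
$D_{\backslash} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad D_{/} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$
• Prewitt
$D_y = \begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}, \quad D_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}$
• Sobel
$D_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}, \quad D_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}$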
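As an illustration of how such operators yield edge strength and direction, the sketch below applies Sobel kernels at one pixel of a small image with a vertical step edge. The sign convention of the kernels, the test image, and the helper names are assumptions. Strength is taken as the gradient magnitude and direction as atan2(gy, gx).

```python
import math

# Sobel kernels; the sign convention is an assumption of this sketch.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def correlate_at(image, kernel, row, col):
    """Sum of kernel weights times the 3 x 3 neighborhood centered at (row, col)."""
    return sum(image[row + i - 1][col + j - 1] * kernel[i][j]
               for i in range(3) for j in range(3))

def gradient(image, row, col):
    """Return (strength, direction in radians) of the edge at (row, col)."""
    gx = correlate_at(image, SOBEL_X, row, col)
    gy = correlate_at(image, SOBEL_Y, row, col)
    return math.hypot(gx, gy), math.atan2(gy, gx)

# Made-up image: dark on the left, bright on the right (vertical step edge).
image = [[0, 0, 10, 10] for _ in range(4)]

strength, direction = gradient(image, 1, 1)
print(strength, direction)
```

The horizontal derivative responds strongly and the vertical one is zero, so the gradient points across the step.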
-
Gradient-based
Prewitt operator
Dx Dy
-
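Laplacian operator
Operator of second order differential
Strength of edge is estimated
$D_x^2 = D_x * D_x = \begin{pmatrix} 1 & -2 & 1 \end{pmatrix}$
$D_y^2 = D_y * D_y = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$
$\nabla^2 = D_x^2 + D_y^2 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}$
4 direction: $\nabla^2 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}$
8 direction: $\nabla^2 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{pmatrix}$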
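A short sketch of the second-order operator: convolving the first-difference kernel [1, -1] with itself gives [1, -2, 1], and the 4- and 8-direction kernels respond with zero on flat regions and non-zero values next to a step. The test windows and function names are made up for this example.

```python
def convolve_1d(a, b):
    """Full discrete convolution of two short sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

second_difference = convolve_1d([1, -1], [1, -1])
print(second_difference)

LAPLACIAN_4 = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
LAPLACIAN_8 = [[1, 1, 1], [1, -8, 1], [1, 1, 1]]

def respond(window, kernel):
    """Kernel response for one 3 x 3 window."""
    return sum(window[r][c] * kernel[r][c]
               for r in range(3) for c in range(3))

flat = [[5, 5, 5] for _ in range(3)]      # no edge
step = [[0, 0, 10] for _ in range(3)]     # bright column on the right

print(respond(flat, LAPLACIAN_4), respond(flat, LAPLACIAN_8))
print(respond(step, LAPLACIAN_4), respond(step, LAPLACIAN_8))
```

Both kernels sum to zero, which is why a constant region produces no response.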
-
Laplacian operator
4 direction 8 direction
-
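Laplacian of Gaussian
Differential operations are sensitive to noise
Gaussian filter (noise reduction) -> Laplacian operator
Gaussian
$G(x, y) = \dfrac{1}{2\pi\sigma^2} e^{-(x^2 + y^2) / 2\sigma^2}$
Laplacian of Gaussian
$\nabla^2 G(x, y) = \dfrac{x^2 + y^2 - 2\sigma^2}{2\pi\sigma^6} e^{-(x^2 + y^2) / 2\sigma^2}$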
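To make the Laplacian of Gaussian concrete, the sketch below samples the closed-form expression on a small grid. The kernel size of 9, sigma = 1, and the function name `log_kernel` are choices made for this example.

```python
import math

def log_kernel(size, sigma):
    """Sample the Laplacian of Gaussian on a size x size grid centered at 0."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            value = ((r2 - 2 * sigma ** 2) / (2 * math.pi * sigma ** 6)
                     * math.exp(-r2 / (2 * sigma ** 2)))
            row.append(value)
        kernel.append(row)
    return kernel

kernel = log_kernel(9, 1.0)
center = kernel[4][4]    # x = 0, y = 0: negative peak
on_ring = kernel[5][5]   # x = 1, y = 1: x^2 + y^2 = 2 sigma^2, zero crossing
outside = kernel[4][7]   # x = 3, y = 0: positive lobe
print(center, on_ring, outside)
```

The sign change at x^2 + y^2 = 2σ^2 is what makes zero crossings of the filtered image usable as edge locations.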
-
Laplacian of Gaussian
LoG operator
[Figure: two result panels labeled 1 and 2]
-
Line drawing analysis
-
Line drawing extraction
Original image Differential image Line drawing image
-
3D Information from Line Drawing
Given
• Line drawing (2D)
Find
• 3D object that projects to given lines
• Why do you think it's a cube, not a painted pancake?
-
Line types
convex concave
occluding occluding
-
Labeling a Line Drawing
Easy to label lines for this solid
→ Now invert this in order to understand shape
-
Enumerating Possible Line Labeling without Constraints
• 9 lines
• 4 labels each
→ 4x4x4x4x4x4x4x4x4 = 262,144 possibilities
We want just one reality; must reduce surplus possibilities
→ Need constraints (by 3D relationship)
-
Huffman & Clowes Junction Dictionary
Any other arrangements cannot arise
Have reduced configurations from 208 to 12
• L-type - 6
• ARROW-type - 3
• FORK-type - 3
-
Constraints on Labeling
Without constraints: 262,144 possibilities
Consider constraints → 3x3x3x6x6x6x3 = 17,496 possibilities
We can reduce more by coherency/consistency along line.
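The two counts can be reproduced directly. In this sketch the list of per-junction choices is read off the product in the slide. Which junction is which type is not modeled here.

```python
from math import prod

# 9 lines, 4 labels each, no constraints.
unconstrained = 4 ** 9

# Allowed labelings per junction, taken from the product 3x3x3x6x6x6x3.
junction_choices = [3, 3, 3, 6, 6, 6, 3]
constrained = prod(junction_choices)

print(unconstrained, constrained)
print(round(unconstrained / constrained, 1))
```

The junction dictionary alone cuts the search space by roughly a factor of 15, before any consistency along shared lines is enforced.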
-
Labeling by Constraint Propagation
“Waltz filtering”
• By coherence rule, line label constrains neighbors
• Propagate constraint through common vertex
• Usually begin on boundary
• May need to backtrack
-
Impossible objects
No consistent labeling
But some do have a consistent labeling
• What’s wrong here?
-
Limitations of Line Labeling
Only qualitative; only gets topology
Something wrong
-
Color theory
-
Color Theory for Computer Vision
Color in several domains:
• Physics
• Human vision
• Psychophysics
• Perception
• Computer Vision
Color problems in Computer Vision:
• Color for segmentation
• Color for reflection physics
-
Color spectrum
Intensity at each wavelength
-
RGB image
RGB color model
r=255, g=5, b=10
DSC (Digital Still Camera)
Spectrum is compressed to three color values
Response function
-
Illumination
Spectrum is richer than RGB
-
Are RGB enough?
Metamerism
5900K light
Sodium light
Standard illumination
D50 light
-
Spectral distribution measurement
-
Interference Camera
Spectrum varies along the position
Interference filter
-
Panoramic Multispectral Imaging System
LCTF Capturing System
Automatic Pan/Tilt Platform
-
LCTF Capturing system
Components: target scene, LCTF, monochromatic CCD camera, PC
Captured over time t (s): 400nm to 720nm
400nm, 404nm, 408nm, 416nm, ..., 712nm, 716nm, 720nm
[Tominaga et al. 00]
-
Tumulus and hill
-
Under what conditions was it painted?
• under sunlight?
• under torch?
© U Tokyo / Toppan / Kyushu National Museum
-
Simulation Results
Simulation results suggest that
• Painted most likely under sunlight
• Painted first, then the tumulus was covered
[Figure: simulated appearance under torch and under sunlight]
-
Point light source(Incandescent)
Spectral measurement sensor
Target object: Tomato
RGB camera
Data analysis
Spectral measurement of Aging process
-
Measurement time: every 12 hours for 14 days
[Figure: Temporal variation. x-axis: wavelength (nm), 380 to 680; y-axis: spectral reflectance, 0 to 0.25]
-
Principal component analysis
1st principal component, proportion: 61.1%
2nd principal component, proportion: 23.3%
[Figure: principal component coefficients vs. wavelength (nm), 380 to 680, for the 1st and 2nd principal components]
[Figure: 1st and 2nd principal component scores vs. days, 0 to 25, each with a polynomial regression curve Y(t); the coefficients are illegible in this transcript]
-
Reflectance image
?
Color image Texture image Reconstructed image
3D model rendering
-
Human vision: Retina
The retina has 4 types of cells
• “red” cone cell
• “green” cone cell
• “blue” cone cell
• Rod cell
Cone cells: color; rod cell: intensity
-
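Human vision
Visible range: 380nm to 760nm
Cone sensitivities: $C_r(\lambda)$, $C_g(\lambda)$, $C_b(\lambda)$
Response of red cone = $\int_{380}^{760} C_r(\lambda) E(\lambda)\, d\lambda$
Response of green cone = $\int_{380}^{760} C_g(\lambda) E(\lambda)\, d\lambda$
Response of blue cone = $\int_{380}^{760} C_b(\lambda) E(\lambda)\, d\lambda$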
-
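Color space
$\text{red} = \int C_r(\lambda) E(\lambda)\, d\lambda$
$\text{green} = \int C_g(\lambda) E(\lambda)\, d\lambda$
$\text{blue} = \int C_b(\lambda) E(\lambda)\, d\lambda$
If we approximate the spectral power distribution $E(\lambda)$ by a vector, it's a matrix multiplication.
$\begin{pmatrix} \text{red} \\ \text{green} \\ \text{blue} \end{pmatrix} = \begin{pmatrix} C_r(\lambda) \\ C_g(\lambda) \\ C_b(\lambda) \end{pmatrix} E(\lambda)$
Dimensions: $(3 \times 1) = (3 \times \infty)(\infty \times 1)$
spectral space: infinitely many dimensions
color space: 3 dimensions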
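The matrix view can be sketched by sampling the spectrum at a finite number of wavelengths. Everything numeric below is assumed for illustration: the 10 nm sampling step, the bell-shaped sensitivity curves (not measured cone data), and the two test spectra.

```python
import math

# Sample the visible range every 10 nm (a discretization chosen for this sketch).
STEP = 10
wavelengths = list(range(380, 761, STEP))

def bump(center, width=40.0):
    """Hypothetical bell-shaped sensitivity curve, not measured cone data."""
    return [math.exp(-((w - center) / width) ** 2) for w in wavelengths]

# 3 x N matrix: one row per response function C_r, C_g, C_b.
C = [bump(600), bump(550), bump(450)]

def to_color(spectrum):
    """Multiply the 3 x N matrix by the N x 1 spectrum (a discretized integral)."""
    return [sum(c * e for c, e in zip(row, spectrum)) * STEP for row in C]

flat = [1.0] * len(wavelengths)                       # equal power everywhere
reddish = [1.0 if w >= 600 else 0.1 for w in wavelengths]

flat_color = to_color(flat)
reddish_color = to_color(reddish)
print(flat_color)
print(reddish_color)
```

However many samples the spectrum has, the result always has three components, which is the dimensionality reduction the slide describes.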
-
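Alternate color space
other isomorphic color spaces formed by linear transforms
$\begin{pmatrix} \text{red} \\ \text{green} \\ \text{blue} \end{pmatrix} = \begin{pmatrix} C_r(\lambda) \\ C_g(\lambda) \\ C_b(\lambda) \end{pmatrix} E(\lambda)$
define new axes
$\begin{pmatrix} A \\ B \\ C \end{pmatrix} = M \begin{pmatrix} \text{red} \\ \text{green} \\ \text{blue} \end{pmatrix} = M \begin{pmatrix} C_r(\lambda) \\ C_g(\lambda) \\ C_b(\lambda) \end{pmatrix} E(\lambda) = \begin{pmatrix} a(\lambda) \\ b(\lambda) \\ c(\lambda) \end{pmatrix} E(\lambda)$
linear transform $M$ (3 x 3) gives new axes
$a(\lambda), b(\lambda), c(\lambda)$: new response functions
[Figure: new axes A, B, C drawn in the red-green-blue space]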
-
Psychophysical color (X-Y-Z)
international standard color space agreed upon by the Commission Internationale de l'Éclairage (CIE)
• particular linear transform of human cone responses
• Two spectral distributions that result in the same values in the space appear indistinguishable
• all colors have positive x, y, z
Each point in X-Y-Z is a different color
Chromaticity
x = X / (X+Y+Z) ≈ R / (R+G+B)
y = Y / (X+Y+Z) ≈ G / (R+G+B)
z = Z / (X+Y+Z) ≈ B / (R+G+B)
since x+y+z = 1, z = 1-(x+y) is redundant; usually plotted on an x-y diagram
Each point is many XYZ colors
-
Chromaticity diagram
r = R / (R+G+B)
g = G / (R+G+B)
b = B / (R+G+B)
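A minimal sketch of the chromaticity computation, showing that scaling a color's intensity leaves its chromaticity unchanged. The function name and sample colors are assumptions. Black has no defined chromaticity, so the sketch raises an error for it.

```python
def chromaticity(R, G, B):
    """Return (r, g, b) with r + g + b = 1; undefined for R = G = B = 0."""
    total = R + G + B
    if total == 0:
        raise ValueError("chromaticity is undefined for black")
    return (R / total, G / total, B / total)

dim = chromaticity(100, 50, 50)
bright = chromaticity(200, 100, 100)   # same color, twice the intensity
print(dim, bright)
```

Because the third coordinate is always one minus the other two, two numbers are enough to plot a chromaticity.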
-
Color perception
How do people describe color?
NOT “X-Y-Z” nor “R-G-B”!
People use cylindrical coordinates: hue, saturation, brightness
[Figure: cylinder with axes B (brightness), H (hue), S (saturation)]
One plane of constant brightness
[Figure: hue circle with blue, violet, red, yellow, green around white; S is the radius, H the angle]
hue + saturation form polar coordinates
relationship to red-green-blue
-
Hue-Saturation-Brightness (Value) Space
[Figure: HSV solid with black and white on the brightness axis and hue around it; blue is marked]
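For a quick look at the cylindrical coordinates, Python's standard `colorsys` module converts RGB values in the range 0 to 1 to hue, saturation, and value. The helper `describe` and the choice to express hue in degrees are assumptions of this sketch.

```python
import colorsys

def describe(r, g, b):
    """Return (hue in degrees, saturation, value) for RGB components in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return round(h * 360), round(s, 3), round(v, 3)

print(describe(1.0, 0.0, 0.0))   # pure red
print(describe(0.0, 0.0, 1.0))   # pure blue
print(describe(1.0, 0.5, 0.5))   # pink: same hue as red, lower saturation
print(describe(0.5, 0.5, 0.5))   # gray: zero saturation
```

Pink and red share a hue and differ only in saturation, which matches how people describe the two colors.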
-
Photometric properties
-
Observed color
$I(\lambda) = S(\lambda) E(\lambda)$
$I(\lambda)$: observed
$S(\lambda)$: surface reflectance
$E(\lambda)$: illumination
-
Role of Color in Robot Vision
1. Feature space for 2D segmentation
more features → better discrimination
2. Color physics of reflection
What physical information can color provide?
-
Color reflection physics
surface reflection and body reflection
[Figure: incident light at the air/body interface splits into surface reflection and body reflection; the body reflection is scattered by internal pigment]
-
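Separating reflection components by color
Pixel color vectors are $\mathbf{C} = m_s \mathbf{C}_s + m_b \mathbf{C}_b$
• Make a histogram, fit a parallelogram
• Project each pixel onto the vectors $\mathbf{C}_s, \mathbf{C}_b$
• Determine $m_s, m_b$ everywhere
[Klinker 88]
[Figure: parallelogram in R-G-B space spanned by $m_s \mathbf{C}_s$ (surface reflection) and $m_b \mathbf{C}_b$ (body reflection)]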
-
Color space analysis
[Derivation: the reflected light, surface part plus body part, is integrated against the sensor responses $r(\lambda)$, $g(\lambda)$, $b(\lambda)$; the intermediate steps are illegible in this transcript]
$\mathbf{C} = \begin{pmatrix} R \\ G \\ B \end{pmatrix} = m_s \mathbf{C}_s + m_b \mathbf{C}_b$
$\mathbf{C}_b$: body color vector in RGB space
$\mathbf{C}_s$: surface color vector in RGB space
Color vector at a pixel is a linear combination of the surface and body reflection color vectors
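Given the two color vectors, the weights of the linear combination at a pixel can be recovered by least squares. The sketch below solves the 2 x 2 normal equations by hand. The symbols m_s and m_b for the weights, the example vectors (white illuminant, reddish body), and the function names are assumptions; this is not Klinker et al.'s full procedure, which also has to estimate the two vectors.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def decompose(C, Cs, Cb):
    """Least-squares weights (m_s, m_b) such that C ≈ m_s * Cs + m_b * Cb."""
    a, b, d = dot(Cs, Cs), dot(Cs, Cb), dot(Cb, Cb)
    p, q = dot(Cs, C), dot(Cb, C)
    det = a * d - b * b            # zero only if Cs and Cb are parallel
    m_s = (p * d - q * b) / det
    m_b = (a * q - b * p) / det
    return m_s, m_b

Cs = [1.0, 1.0, 1.0]   # assumed surface (illuminant) color: white
Cb = [0.8, 0.1, 0.1]   # assumed body color: reddish

# Synthetic pixel: 0.3 of surface reflection plus 0.5 of body reflection.
pixel = [0.3 * s + 0.5 * b for s, b in zip(Cs, Cb)]

m_s, m_b = decompose(pixel, Cs, Cb)
print(m_s, m_b)
```

A pixel with a large surface weight lies on a highlight, while the body term carries the object's own color.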
-
Dichromatic Reflection Model
surface reflection has SPD of incident light
body reflection has SPD of body color
reflected brightness $L(\lambda)$ = surface reflection (SPD of incident light) + body reflection (SPD of body color)
-
Klinker et al.’s method
Steps:
1. Color segmentation
2. T-shape identification
-
Separation Results
-
Chromaticity-Intensity Space
a. Specular image
b. Spatial Intensity space
c. Chromaticity Intensity space
-
Iteration Framework
-
Result: a single object
Input image Specular-free image
-
Separation Result
Diffuse reflection component
Specular reflection component
-
Separation using High Frequency Illumination
[S.K. Nayar et al. SIGGRAPH 2006]
-
Summary
• 2D digital image processing
• Edge detection
• Line drawing analysis
• Color theory
• Photometric properties