Depth-Based Real Time Head Motion Tracking Using 3D Template Matching
Student: Tu Yu
Advisor: Dr. Ming Ouhyoung
Outline
• Introduction
• System Overview
• Algorithm
• Results & System Demo
• Conclusion & Future Work
Introduction
What's Head Motion Tracking?
[Figure: the six degrees of freedom of head motion: yaw, pitch, and roll rotations plus X, Y, and Z translations.]
Real-time reconstruction of this 6-DoF motion vector, given a stream of video input.
Why Head Motion Tracking
Applications: animation, video games, gaze orientation, dancing.
Related Work
• Color-image-based methods ("Head Pose Estimation in Computer Vision: A Survey", E. Murphy-Chutorian and M. M. Trivedi, PAMI 2009):
  – Appearance template methods
  – Feature tracking methods
  – Detector arrays
  – Nonlinear regression methods
  – Manifold embedding methods
  – Flexible models
  – Geometric methods
  – Hybrid methods
• All are too sensitive to illumination variations!
Depth Camera
• SwissRanger SR4000
  – Mesa Imaging (2006)
  – $9000
• CamCube 2.0
  – PMD Technologies (2002)
  – $12000
• Kinect by Microsoft
  – $149.99
• Xtion Pro by Asus
  – $189
  – $300
Related Work
• "Real-time Performance-Based Facial Animation", T. Weise et al., SIGGRAPH 2011
• "Real Time Head Pose Estimation with Random Regression Forests", G. Fanelli et al., CVPR 2011
System Overview
Flow Chart
User Acting → Depth Data Acquisition → Real-time Head Pose Estimation → Avatar Control
• Method 1 (Least Square Error Method): Nose Detection → Inverse Rotation → Least Square Error Method
• Method 2 (Iterative Optimization Method): Sampling → Iterative Optimization Method
Algorithm: Method 1, Least Square Error Method
Nose Detection
[Figure: the nose tip is the z-minimum of the depth map; pixels with depth within Zmin + 5 along the Z-axis form the nose region.]
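As a sketch, the z-min nose search could look as follows; `detect_nose`, the window tuple, and the millimetre units are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

def detect_nose(depth, window, z_band=5.0):
    """Locate the nose tip inside a search window of a depth map (mm):
    take the minimum depth z_min, then average all pixels whose depth
    lies within [z_min, z_min + z_band] (the Zmin + 5 slab from the
    slide). Names and units are illustrative assumptions."""
    x0, y0, x1, y1 = window
    roi = depth[y0:y1, x0:x1].astype(float)
    roi[roi <= 0] = np.inf              # treat zero depth as invalid
    z_min = roi.min()
    # pixels in the thin slab near z_min belong to the nose tip
    ys, xs = np.where(roi <= z_min + z_band)
    # centroid of the slab gives a stable sub-pixel nose position
    return x0 + xs.mean(), y0 + ys.mean(), float(z_min)
```

With a frontal face the nose is the surface closest to the camera, so the z-minimum inside the window is a cheap, illumination-independent detector.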
Inverse Rotation
Nose Tracking Flow Chart
Inputs: previous nose position Nose_{t−1}(x, y), previous pose angles (θ_{t−1}, ψ_{t−1}, φ_{t−1}), and the current depth map D_t.
Steps: searching-window setting → inverse rotation → z-min searching → Nose_t(x, y).
Least Square Plane Fitting
Ax + By + Cz = 1
→ Yaw & Pitch!
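The plane fit is a linear least-squares problem; `fit_plane`, `yaw_pitch`, and the arctan2 angle conventions are assumptions of this sketch:

```python
import numpy as np

def fit_plane(points):
    """Least-squares fit of A*x + B*y + C*z = 1 to an (N, 3) point set:
    solve the overdetermined system P @ [A, B, C]^T = 1. The vector
    (A, B, C) is then a normal of the fitted face plane."""
    P = np.asarray(points, dtype=float)
    coeffs, *_ = np.linalg.lstsq(P, np.ones(len(P)), rcond=None)
    return coeffs                                  # (A, B, C)

def yaw_pitch(normal):
    """Yaw and pitch (radians) of the plane normal relative to the
    camera z-axis: yaw from the x-z components, pitch from y-z.
    The exact angle convention is an assumption of this sketch."""
    a, b, c = normal
    return np.arctan2(a, c), np.arctan2(b, c)
```

A face plane parallel to the image plane has normal (0, 0, 1) and therefore zero yaw and pitch.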
Least Square Ellipse Fitting
→ Roll!
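Roll could be recovered along the following lines; note this sketch substitutes a PCA principal-axis fit for the least-squares ellipse fit on the slide (both estimate the tilt of the elongated head region), and the function name is hypothetical:

```python
import numpy as np

def roll_from_points(pts2d):
    """Roll angle (radians) from 2-D head-region points: the major axis
    of the point cloud, found as the dominant eigenvector of the
    scatter matrix, tilts with the head. A PCA stand-in for the
    slide's least-squares ellipse fit."""
    P = np.asarray(pts2d, dtype=float)
    P = P - P.mean(axis=0)                 # centre the points
    w, v = np.linalg.eigh(P.T @ P)         # eigenvalues in ascending order
    major = v[:, np.argmax(w)]             # major-axis direction
    return np.arctan2(major[1], major[0])
```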
Flickering Issue
• Dynamic weighted average filter ("Real-time Performance-Based Facial Animation", T. Weise et al., SIGGRAPH 2011)
[Figure: normalized weights ω for frames i, i−1, …, i−4, decaying with frame age.]
w_j = exp(−j · H · max_{l∈[1,k]} ‖t_i − t_{i−l}‖)
t_i* = (Σ_{j=0}^{k} w_j t_{i−j}) / (Σ_{j=0}^{k} w_j)
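For a scalar pose track the dynamic weighted average can be sketched as below; the signature of `smooth` and the default gain `H` are assumptions:

```python
import math

def smooth(history, H=1.0):
    """Dynamic weighted average of the last k+1 pose values, newest
    first: history[j] = t_{i-j}. Each weight is
    w_j = exp(-j * H * max_l |t_i - t_{i-l}|), so fast head motion
    shrinks the weight of older frames (little smoothing, low latency)
    while a still head weights all frames evenly (strong flicker
    suppression)."""
    t_i = history[0]
    m = max(abs(t_i - t) for t in history[1:])
    ws = [math.exp(-j * H * m) for j in range(len(history))]
    return sum(w * t for w, t in zip(ws, history)) / sum(ws)
```

For a still head, `smooth([5.0, 5.0, 5.0])` is the plain average; during fast motion, `smooth([10.0, 0.0, 0.0])` stays close to the newest value.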
Algorithm: Method 2, Iterative Optimization Method
Flow Chart (Method 2 path): User Acting → Depth Data Acquisition → Sampling → Iterative Optimization Method → Avatar Control
Model Point Cloud
Sample Point Cloud
Energy Function
• Motion vector a = (θ, ψ, φ, t_x, t_y, t_z)
• E(a, P_S, P_M) = Σ_{p∈P_S} min d(p′, P_M), where p′ is p transformed by a
• Result: a* = argmin_a E(a, P_S, P_M)
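The energy can be evaluated directly; the X-Y-Z rotation order and the brute-force nearest-neighbour search are simplifying assumptions of this sketch:

```python
import numpy as np

def transform(points, a):
    """Apply motion vector a = (theta, psi, phi, tx, ty, tz): rotate
    about the x, y, z axes (radians; the rotation order is an
    assumption of this sketch), then translate."""
    th, ps, ph, tx, ty, tz = a
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(th), -np.sin(th)],
                   [0, np.sin(th),  np.cos(th)]])
    Ry = np.array([[ np.cos(ps), 0, np.sin(ps)],
                   [0, 1, 0],
                   [-np.sin(ps), 0, np.cos(ps)]])
    Rz = np.array([[np.cos(ph), -np.sin(ph), 0],
                   [np.sin(ph),  np.cos(ph), 0],
                   [0, 0, 1]])
    return np.asarray(points, float) @ (Rz @ Ry @ Rx).T + np.array([tx, ty, tz])

def energy(a, P_S, P_M):
    """E(a, P_S, P_M): sum, over the transformed sample points p', of
    the distance to the nearest model point (brute-force search)."""
    P = transform(P_S, a)
    d = np.linalg.norm(P[:, None, :] - np.asarray(P_M, float)[None, :, :], axis=2)
    return float(d.min(axis=1).sum())
```

The energy is zero exactly when every transformed sample point lands on a model point, which is why minimizing it aligns the sample cloud with the template.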
Optimization
• Gradient Descent Algorithm: a_{k+1} = a_k − α_k ∇E(a_k)
• But the energy is non-differentiable!
• 12 possible gradient directions (the signed 6-DoF unit vectors):
  ∇₁ = (1,0,0,0,0,0)ᵀ, ∇₂ = (−1,0,0,0,0,0)ᵀ, ∇₃ = (0,1,0,0,0,0)ᵀ, ∇₄ = (0,−1,0,0,0,0)ᵀ,
  ∇₅ = (0,0,1,0,0,0)ᵀ, ∇₆ = (0,0,−1,0,0,0)ᵀ, ∇₇ = (0,0,0,1,0,0)ᵀ, ∇₈ = (0,0,0,−1,0,0)ᵀ,
  ∇₉ = (0,0,0,0,1,0)ᵀ, ∇₁₀ = (0,0,0,0,−1,0)ᵀ, ∇₁₁ = (0,0,0,0,0,1)ᵀ, ∇₁₂ = (0,0,0,0,0,−1)ᵀ
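Because the energy is non-differentiable, the 12 signed unit directions are probed instead of a true gradient; a greedy sketch, where the fixed step size and stopping rule are simplifications of the thesis's iterative scheme:

```python
import numpy as np

def minimize(E, a0, step=0.01, iters=100):
    """Greedy descent over the 6-DoF motion vector: at each iteration
    evaluate E along the 12 signed unit directions +/- e_i, move along
    the direction with the largest decrease, and stop when none of the
    12 probes improves the energy."""
    a = np.asarray(a0, dtype=float)
    dirs = [s * e for e in np.eye(6) for s in (1.0, -1.0)]   # the 12 "gradients"
    for _ in range(iters):
        best, best_e = None, E(a)
        for d in dirs:
            e = E(a + step * d)
            if e < best_e:
                best, best_e = d, e
        if best is None:
            break                     # no probe improves: local minimum
        a = a + step * best
    return a
```

Probing signed axis directions sidesteps the non-differentiability at the cost of 12 energy evaluations per step.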
Results of Method 1
Results of Method 2
Our Method with Whole Database
Our Method with 70% of Database (Crash Cases Excluded)
Our Method with 50% of Database (Bad Initializations Excluded)
System Demo
Conclusion
• Contribution:
– Two novel algorithms.
– Not affected by varying lighting conditions.
– Real-time responses without GPU acceleration.
– Outperforms the state-of-the-art approach (G. Fanelli et al., CVPR 2011).
• Future Work:
– Tracking without nose.
– Combine with NITE skeleton tracking.
– Facial Expression Recognition / Retargeting.
Thank You Very Much