Face recognition and deep learning by Dr. สรรพฤทธิ์ มฤคทัต...
![Page 2: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/2.jpg)
Standard procedure
• Image capturing: camera, webcam, surveillance
• Face detection: locate faces in the image
• Face alignment: normalize size, rectify rotation
• Face matching
  • 1:1 Face verification
  • 1:N Face recognition
![Page 3: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/3.jpg)
Viola-Jones Haar-like detector (OpenCV haarcascade_frontalface_alt2.xml)
• Face size: ~35x35 to 80x80 pixels
• Difficult cases: too small, occlusion, rotation
Recognition = compare these faces to known faces
![Page 4: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/4.jpg)
Controlled environment: face size 218x218 pixels
Viola-Jones eye detector
• Before alignment: eye distance = 81 pixels, eye angle = -0.7 degrees
• After alignment: face size = 180x200 pixels, eye distance = 100 pixels, eye angle = 0 degrees
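The alignment step can be illustrated with a minimal sketch. It assumes the eye detector returns one (x, y) centre per eye; the coordinates below are hypothetical and were chosen only so that they reproduce the 81-pixel distance and -0.7 degree angle quoted above. The function names are illustrative, not from the talk.

```python
import math

def eye_geometry(left_eye, right_eye, target_distance=100.0):
    """Return eye distance, eye angle (degrees) and the scale factor
    needed to reach the target eye distance."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    distance = math.hypot(dx, dy)
    angle_deg = math.degrees(math.atan2(dy, dx))
    scale = target_distance / distance
    return distance, angle_deg, scale

def align_point(point, left_eye, angle_deg, scale):
    """Rotate a point by -angle around the left eye, then scale it.
    The result is expressed relative to the left eye."""
    theta = math.radians(-angle_deg)
    x = point[0] - left_eye[0]
    y = point[1] - left_eye[1]
    xr = x * math.cos(theta) - y * math.sin(theta)
    yr = x * math.sin(theta) + y * math.cos(theta)
    return (scale * xr, scale * yr)

# Hypothetical detections (not taken from the slides).
left_eye, right_eye = (60.0, 100.0), (141.0, 99.0)
distance, angle_deg, scale = eye_geometry(left_eye, right_eye)
aligned_right_eye = align_point(right_eye, left_eye, angle_deg, scale)
print(round(distance), round(angle_deg, 1), aligned_right_eye)
```

After the transform the right eye lies on the horizontal axis at the target distance from the left eye, which is what "eye distance = 100 pixels, eye angle = 0 degrees" describes. A real pipeline would apply the same rotation and scale to the whole image and then crop it.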
![Page 5: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/5.jpg)
![Page 6: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/6.jpg)
Comparing faces
• Face image
  • Bitmap of size 180x200 pixels
  • Grayscale (0-255)
  • 36,000 values/face image
• Given 2 face images x1 and x2
  • $x_1(x,y) - x_2(x,y)$
  • $|x_1(x,y) - x_2(x,y)|$
  • $(x_1(x,y) - x_2(x,y))^2$
• What should be used?
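One way to explore the question is to sum each candidate over all pixels. The sketch below assumes NumPy and uses random pixel data in place of real faces. It shows that signed differences can cancel out completely even when two images are clearly different, while the absolute and squared forms cannot.

```python
import numpy as np

def pixel_differences(x1, x2):
    """Sum the three candidate per-pixel measures over the whole image."""
    # Cast first: subtracting uint8 arrays directly would wrap around.
    diff = x1.astype(np.int64) - x2.astype(np.int64)
    return {
        "signed_sum": int(diff.sum()),
        "absolute_sum": int(np.abs(diff).sum()),
        "squared_sum": int((diff ** 2).sum()),
    }

rng = np.random.default_rng(0)
# 200 rows x 180 columns = 36,000 grayscale values (synthetic data).
x1 = rng.integers(0, 256, size=(200, 180), dtype=np.uint8)
# A mirrored copy: same pixel values, different positions.
x2 = x1[:, ::-1].copy()

result = pixel_differences(x1, x2)
print(result)
```

The signed sum is exactly zero here because both images contain the same values in a different arrangement, so it cannot serve as a dissimilarity measure on its own.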
![Page 7: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/7.jpg)
Basic Maths
• 1 Face image = 1 vector
  • 36,000 dimensions (d)
  • matrix with 1 column
• Distance
  • Euclidean distance
  • Norm-p distance
  • Norm-1 distance
  • Norm-infinity distance
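For reference, the standard definitions of these distances are written out below. The notation is an assumption, since the slide's formulas did not survive extraction: $x_{1,i}$ denotes the $i$-th component of face vector $x_1$, and $d$ is the number of dimensions.

```latex
\begin{align*}
\text{Euclidean (norm-2):}\quad & \|x_1 - x_2\|_2 = \sqrt{\sum_{i=1}^{d} (x_{1,i} - x_{2,i})^2} \\
\text{Norm-}p\text{:}\quad & \|x_1 - x_2\|_p = \Big( \sum_{i=1}^{d} |x_{1,i} - x_{2,i}|^p \Big)^{1/p} \\
\text{Norm-1:}\quad & \|x_1 - x_2\|_1 = \sum_{i=1}^{d} |x_{1,i} - x_{2,i}| \\
\text{Norm-infinity:}\quad & \|x_1 - x_2\|_\infty = \max_{1 \le i \le d} |x_{1,i} - x_{2,i}|
\end{align*}
```

Norm-1 and the Euclidean distance are the $p = 1$ and $p = 2$ cases of the norm-p distance, and the norm-infinity distance is its limit as $p \to \infty$.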
![Page 8: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/8.jpg)
Pixel importance and projection
• Not all pixels have the same importance
• Pixel with low variation -> not important
• Pixel with large variation -> could be important
Projection: when $\|w\| = 1$, $w^T x$ is the projection of x on axis w
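A small sketch of the projection, assuming NumPy. The helper normalizes w first so that the unit-length condition always holds. The two-dimensional vectors are toy values, not face data.

```python
import numpy as np

def project(x, w):
    """Scalar projection of x on the axis w (w is normalized to length 1)."""
    w = w / np.linalg.norm(w)
    return float(w @ x)

x = np.array([3.0, 4.0])
on_first_axis = project(x, np.array([1.0, 0.0]))   # keeps only the first coordinate
on_diagonal = project(x, np.array([1.0, 1.0]))     # (3 + 4) / sqrt(2)
print(on_first_axis, on_diagonal)
```

Choosing w as a coordinate axis keeps a single pixel. Any other unit vector mixes several pixels into one value.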
![Page 9: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/9.jpg)
Subspace projection
• What should be the axis w?
• How many axes do we need?
![Page 10: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/10.jpg)
Principal Component Analysis PCA (1)
• Basic idea
• Measure of information = variance
• Variance of $z_1, \dots, z_N$ for real numbers $z_t$
• Given a set of face vectors $x_1, \dots, x_N$ and axis $w$, the variance of $w^T x_1, \dots, w^T x_N$ is $w^T C w$, where $C$ is the covariance matrix
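The slide's formulas were lost in extraction. A standard derivation of the variance along an axis is sketched below. The $1/N$ normalization and the symbols $\bar{z}$ and $\bar{x}$ for the means are assumptions; the original may use $1/(N-1)$.

```latex
\begin{align*}
\bar{z} &= \frac{1}{N} \sum_{t=1}^{N} z_t, \qquad
\operatorname{Var}(z) = \frac{1}{N} \sum_{t=1}^{N} (z_t - \bar{z})^2 \\
\intertext{With $z_t = w^T x_t$ and $\bar{x} = \frac{1}{N} \sum_{t=1}^{N} x_t$, so that $\bar{z} = w^T \bar{x}$:}
\operatorname{Var}(w^T x) &= \frac{1}{N} \sum_{t=1}^{N} \big( w^T (x_t - \bar{x}) \big)^2
 = w^T \Big( \frac{1}{N} \sum_{t=1}^{N} (x_t - \bar{x})(x_t - \bar{x})^T \Big) w
 = w^T C w
\end{align*}
```

Here $C = \frac{1}{N} \sum_{t=1}^{N} (x_t - \bar{x})(x_t - \bar{x})^T$ is the $d \times d$ covariance matrix.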
![Page 11: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/11.jpg)
Principal Component Analysis PCA (2)
• Best axis w is obtained by maximizing $w^T C w$ with constraint $\|w\| = 1$
• w is an eigenvector of C: $Cw = a w$
• Variance $w^T C w = a$ is the eigenvalue corresponding to w
• PCA
  • Construct covariance matrix C
  • Eigen-decompose C
  • Select the m eigenvectors with the largest eigenvalues
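The three PCA steps can be sketched with NumPy as follows. This is a minimal illustration on synthetic data, assuming one vector per row of X and a 1/N normalization of the covariance matrix. The function name `pca` is illustrative.

```python
import numpy as np

def pca(X, m):
    """X is an N x d matrix with one data vector per row.
    Returns the mean, the m leading axes (d x m) and their eigenvalues."""
    mean = X.mean(axis=0)
    Xc = X - mean
    C = Xc.T @ Xc / X.shape[0]             # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh: symmetric input, ascending order
    order = np.argsort(eigvals)[::-1][:m]  # indices of the m largest eigenvalues
    return mean, eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(1)
# Synthetic data: 200 samples in 5 dimensions with unequal spread per axis.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
mean, W, variances = pca(X, m=2)

# Variance of the data projected on the first axis equals its eigenvalue.
projected_variance = np.var((X - mean) @ W[:, 0])
print(variances, projected_variance)
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because the covariance matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.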
![Page 12: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/12.jpg)
Eigenface (1)
• What is the problem with face data?
• Solution: dot matrix
• dxd matrix -> NxN matrix
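The slide does not spell the problem out. Presumably it is that with d = 36,000 the d x d covariance matrix is far too large to decompose directly, while the number of images N is small. Under that assumption, a minimal NumPy sketch of the dot-matrix approach follows, using small synthetic dimensions so the result can be checked against the full covariance matrix.

```python
import numpy as np

def eigenfaces_via_dot_matrix(X, m):
    """X is N x d (one face vector per row), with N much smaller than d.
    Eigen-decompose the N x N dot matrix instead of the d x d covariance.
    m should be at most N - 1, since centered data has rank at most N - 1."""
    N = X.shape[0]
    mean = X.mean(axis=0)
    Xc = X - mean
    G = Xc @ Xc.T / N                      # N x N dot (Gram) matrix
    vals, vecs = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1][:m]
    vals, vecs = vals[order], vecs[:, order]
    # If G v = a v, then C (Xc^T v) = a (Xc^T v): map back to d dimensions.
    W = Xc.T @ vecs
    W /= np.linalg.norm(W, axis=0)         # unit-length eigenfaces
    return mean, W, vals

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 50))              # synthetic stand-in: N = 10, d = 50
mean, W, vals = eigenfaces_via_dot_matrix(X, m=3)

# Check against the full covariance matrix (feasible here because d is small).
Xc = X - mean
C = Xc.T @ Xc / X.shape[0]
print(np.allclose(C @ W, W * vals))
```

Both matrices share the same non-zero eigenvalues, so nothing is lost by working with the smaller one.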
![Page 13: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/13.jpg)
Eigenface (2)
• We work with vectors of projected values
[Figure: face vectors x1, x2, …, x40 and x, labelled "Enrollment" and "Template"]
![Page 14: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/14.jpg)
Eigenface (3)
• Vector of raw intensity: 36,000 dimensions
• Vector of Eigenface coefficients: 10 dimensions
• Eigenface with large eigenvalue = large variation
• Eigenface with small eigenvalue = noise
![Page 15: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/15.jpg)
Related techniques
• Fisherface (LDA)
• Nullspace LDA
• Laplacianface
• Locality Sensitive Discriminant Analysis
• 2DPCA
• 2DLDA
• 2DPCA+2DLDA
![Page 16: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/16.jpg)
Result on ORL (~10 years ago)

| Techniques | Accuracy (%) | #dim |
|---|---|---|
| Eigenface | 90-95 | 200 |
| Fisherface | 91-97 | 50 |
| NLDA | 92-97 | 40 |
| Laplacianface | 89-95 | 50 |
| LSDA | 91-97 | 50 |
| 2DPCA | 91.5 | |
| 2DLDA | 90.5 | |
| 2DPCA+2DLDA | 93.5 | |
![Page 17: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/17.jpg)
Limitations
• Occlusion: glasses, beard
• Lighting condition
• Facial expression
• Pose
• Make-up
![Page 18: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/18.jpg)
Evaluation
• Accuracy: find closest template and check the ID
• Verification (access control)
  • Live captured image vs. stored image
  • We have distance -> Should we accept or not?
  • False Accept (FA) vs. False Reject (FR)
• From a set of face images
  • Compute distances between all pairs
  • Select threshold T that gives 0 FA and X FR
• Number of tries
[Figure: distance axis with threshold T]
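The threshold selection can be sketched as below. The decision rule (accept when the distance is strictly below T) and the distance values are assumptions made for illustration. "Genuine" means a pair of images of the same person and "impostor" a pair of different people.

```python
import numpy as np

def zero_fa_threshold(genuine, impostor):
    """Pick T so that no impostor pair is accepted, then count false rejects.
    Assumed rule: accept a pair when its distance is strictly below T."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    T = impostor.min()                              # largest T with 0 FA
    false_accepts = int((impostor < T).sum())       # 0 by construction
    false_rejects = int((genuine >= T).sum())       # the "X FR" on the slide
    return T, false_accepts, false_rejects

# Hypothetical distances between pairs of face templates.
genuine_distances = [0.2, 0.3, 0.5, 0.9]
impostor_distances = [0.7, 1.1, 1.4]
T, fa, fr = zero_fa_threshold(genuine_distances, impostor_distances)
print(T, fa, fr)
```

With these numbers one genuine pair (distance 0.9) is rejected. Allowing several tries gives such a user another chance to pass without loosening the threshold.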
![Page 19: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/19.jpg)
Labeled Faces in the Wild
• Large number of subjects (>5,000)
• Unconstrained conditions
• Human performance 97-99%
• Traditional methods fail
• New alignment technique: funneling
![Page 20: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/20.jpg)
LFW results
Use outside data to train the model
![Page 21: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/21.jpg)
Deep Learning
![Page 22: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/22.jpg)
Neural Network timeline
McCulloch & Pitts Neuron model (1943)
Perceptron limitation (1969)
Backprop algorithm (1970s-80s)
SVM (1992)
Deep Learning (2006)
![Page 23: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/23.jpg)
• Return of Neural Network
• Focus on Deep Structure
• Take advantage of today's computing power
![Page 24: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/24.jpg)
Neural Networks (1)
• Neurons are connected via synapses
• A neuron receives signals from other neurons
• When the activation reaches a threshold, it fires a signal to other neurons
http://en.wikipedia.org/wiki/Neuron
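A threshold unit in the spirit of the McCulloch & Pitts model gives a minimal picture of this behaviour. The weights and threshold below are illustrative values chosen so that the unit computes a logical AND. They are not taken from the talk.

```python
def neuron_fires(inputs, weights, threshold):
    """Weighted sum of incoming signals; fire when it reaches the threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return activation >= threshold

# With weights (1, 1) and threshold 2 the unit fires only when both inputs are on.
and_table = {
    (a, b): neuron_fires((a, b), weights=(1, 1), threshold=2)
    for a in (0, 1)
    for b in (0, 1)
}
print(and_table)
```

Modern networks replace the hard threshold with a differentiable activation so that gradients can be computed.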
![Page 25: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/25.jpg)
Neural Networks (2)
• Universal Approximator
• Classical structure: MLP
• #hidden nodes, learning rate
• Backprop algorithm
• Gradient
  • Direction of change that increases value of objective function
  • Vector of partial derivatives with respect to each parameter
• Works on all structures and all objective functions, as long as they are differentiable
• Stopping criteria, local optima, vanishing/exploding gradients
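To make the gradient idea concrete, here is a small sketch assuming NumPy. It estimates the vector of partial derivatives by finite differences, as a stand-in for what backprop computes analytically. Since the gradient points toward increasing values, the sketch steps against it to minimize a loss. The quadratic loss and the learning rate are illustrative.

```python
import numpy as np

def numerical_gradient(f, params, eps=1e-6):
    """Central-difference estimate of the partial derivative for each parameter."""
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = eps
        grad[i] = (f(params + step) - f(params - step)) / (2 * eps)
    return grad

def loss(p):
    # Toy objective with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

params = np.array([0.0, 0.0])
learning_rate = 0.1
for _ in range(200):                      # fixed iteration count as stopping criterion
    params = params - learning_rate * numerical_gradient(loss, params)
print(params)
```

This toy loss has a single minimum. On a real network the same update can stall in a local optimum, or the gradients can shrink or blow up across layers, which are the issues listed in the last bullet.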
![Page 26: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/26.jpg)
Deep Learning
• 2006 Hinton et al.: layer by layer construction -> pre-training
• Stack of RBMs, Stack of Autoencoders
• Convolutional NN (CNN)
• Shared weights
• Take advantage of GPU
![Page 27: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/27.jpg)
CNN today
• Common components
• Convolution layer, Max-pooling layer
• ReLU
• Drop-out, Sampling+flip training data
• GPU
• Tools: Caffe, TensorFlow, Theano, Torch
• Structure: LeNet, AlexNet, GoogLeNet
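The common components can be sketched in plain NumPy to show what each one does. This is an illustration only, not how the listed tools implement them. As in most deep learning frameworks, the "convolution" here is a cross-correlation, and the same small kernel is reused at every image position, which is the shared-weights idea.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: one kernel slides over the whole image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max-pooling; trailing rows/columns that do not fit are dropped."""
    h = x.shape[0] // size * size
    w = x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def horizontal_flip(image):
    """Mirror the image left-right, a simple way to augment training data."""
    return image[:, ::-1]

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
kernel = np.array([[-1.0, 1.0]])                   # responds to left-to-right increase
features = max_pool(relu(conv2d(image, kernel)))
flipped_features = max_pool(relu(conv2d(horizontal_flip(image), kernel)))
print(features.shape, features.max(), flipped_features.max())
```

The toy image increases from left to right, so the kernel responds everywhere. After flipping, the responses become negative and ReLU removes them, so a flipped training sample presents a different input to the network.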
![Page 28: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/28.jpg)
LeNet
![Page 29: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/29.jpg)
AlexNet
![Page 30: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/30.jpg)
GoogLeNet
![Page 31: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/31.jpg)
Microsoft deep residual network (ResNet): 152 layers!
![Page 32: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/32.jpg)
DeepID (Sun et al. CVPR 2014)
• 160 dim, 60 regions, flipped
• 19,200 dimensions!! (160 x 60 x 2)
• Input to other model
• CelebFaces
• Refine training
![Page 33: Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC](https://reader031.vdocuments.pub/reader031/viewer/2022013106/58754bbc1a28abb8208b7649/html5/thumbnails/33.jpg)
Learning technique for deep structure
Big data
Computing power: GPU, etc.