Robust Real-Time Face Detection. Paul Viola, Michael Jones. International Journal of Computer Vision 57(2), 137-154, 2004. Advisor: Dr. Hsin-Chih Lin. Presenter: Chen-Yu Lin. Date: 2007-12-18.
Outline: Introduction; The boosting algorithm for classifier learning; Feature selection; Weak learner construction; The strong classifier; Results; Conclusion
Introduction: A machine learning approach for visual object detection, capable of processing images extremely rapidly while achieving high detection rates. Three key contributions: a new image representation, the integral image; a learning algorithm based on AdaBoost; a method for combining classifiers, the classifier cascade.
Features: rectangle features reminiscent of Haar basis functions, following Papageorgiou et al. (1998).
Integral Image: the sum of the pixels within rectangle D can be computed with four array references: D = 4 + 1 - (2 + 3), where 1, 2, 3, 4 denote the integral-image values at the corner points bounding D.
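The four-corner identity above can be sketched in a few lines of NumPy (the function names are mine, not from the paper):

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows then columns: ii[y, x] = sum of img above and left of (y, x), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using four corner lookups.

    With corners labeled as on the slide (1 = above-left of the rectangle,
    2 = above-right, 3 = below-left, 4 = below-right), this is D = 4 + 1 - (2 + 3).
    """
    total = ii[bottom, right]                      # corner 4
    if top > 0:
        total -= ii[top - 1, right]                # corner 2
    if left > 0:
        total -= ii[bottom, left - 1]              # corner 3
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]             # corner 1
    return total

img = np.arange(16, dtype=np.int64).reshape(4, 4)
ii = integral_image(img)
assert rect_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()  # any rectangle in 4 lookups
```

This is what makes rectangle features cheap: any rectangle sum costs four lookups regardless of its size.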
AdaBoost: a supervised training process. Each round selects the single weak classifier with the lowest weighted error, then re-weights the training examples so that later rounds focus on the examples still misclassified.
Attentional Cascade: cf. Rowley et al. (1998), who used two neural networks, a fast prescreening network followed by a slower, more accurate one.
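A sketch of how a cascade evaluates one sub-window (the stage format and the toy weak classifiers below are illustrative assumptions, not the trained detector):

```python
def cascade_classify(window_features, stages):
    """Evaluate a cascade: each stage is (alphas, weak_classifiers, threshold).

    A window must pass every stage; a single rejection ends evaluation early,
    which is what keeps the average cost per sub-window so low.
    """
    for alphas, weaks, threshold in stages:
        score = sum(a * h(window_features) for a, h in zip(alphas, weaks))
        if score < threshold:
            return False  # rejected: stop immediately, no later stages run
    return True  # passed all stages -> reported as a face

# Toy stages (hypothetical weak classifiers thresholding one feature value):
stages = [
    ([1.0], [lambda f: 1 if f[0] > 0.5 else 0], 0.5),
    ([0.6, 0.4], [lambda f: 1 if f[1] > 0.2 else 0,
                  lambda f: 1 if f[0] > 0.8 else 0], 0.5),
]
assert cascade_classify([0.9, 0.3], stages) is True
assert cascade_classify([0.1, 0.9], stages) is False  # rejected by stage 1
```

The early stages use very few features, so the vast majority of sub-windows are discarded almost for free.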
Results: A 38-layer cascaded classifier was trained to detect frontal upright faces. Training set: 4916 hand-labeled faces at resolution 24x24 (face class), and 9544 images containing no faces, yielding about 350 million sub-windows (non-face class). The first five layers of the detector use 1, 10, 25, 25, and 50 features; the total number of features over all layers is 6061.
Results: Each classifier in the cascade was trained with: Faces: the 4916 faces plus their vertical mirror images, 9832 images in total. Non-face sub-windows: 10,000 (size 24x24).
Results outline: Speed of the final detector; Image processing; Scanning the detector; Integration of multiple detections; Experiments on a real-world test set
Speed of the final detector: the speed is directly related to the number of features evaluated per scanned sub-window. On the MIT+CMU test set, an average of 10 features out of the total 6061 are evaluated per sub-window. On a 700 MHz Pentium III, the detector processes a 384 x 288 pixel image in about 0.067 seconds.
Image processing: to minimize the effect of different lighting conditions, each sub-window is variance-normalized, x <- (x - m)/σ, where σ is the standard deviation, m the mean, and x a pixel value; m and σ are computed efficiently using integral images (of the image and of its square).
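A minimal sketch of the per-window normalization (computed directly on a small window for clarity; in the detector, m and σ would come from two integral-image lookups per window):

```python
import numpy as np

def variance_normalize(window):
    """Variance-normalize a sub-window: x <- (x - m) / sigma.

    After this step the window has zero mean and unit standard deviation,
    which removes overall brightness and contrast differences.
    """
    w = window.astype(np.float64)
    m = w.mean()
    sigma = np.sqrt(np.maximum(w.var(), 1e-12))  # guard against flat windows
    return (w - m) / sigma

win = np.array([[10.0, 20.0], [30.0, 40.0]])
norm = variance_normalize(win)
assert abs(norm.mean()) < 1e-9 and abs(norm.std() - 1.0) < 1e-9
```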
Scanning the detector: the final detector is scanned across the image at multiple scales and locations. Locations are obtained by shifting the window by some number of pixels: if the current scale is s, the window is shifted by [sΔ] pixels, where Δ is the base step size and [ ] denotes rounding.
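The multi-scale scan can be sketched as a generator of window positions. The base size 24, scale factor 1.25, and Δ = 1.0 follow the paper's reported settings; the function itself is my illustration:

```python
def scan_windows(img_w, img_h, base=24, scale_step=1.25, delta=1.0):
    """Yield (x, y, size) sub-windows at all scales and locations.

    The detector window (24x24 at base scale) is enlarged by a factor of 1.25
    per level; at scale s the window is shifted by round(s * delta) pixels, so
    the step grows with the window: coarse steps for big windows, fine for small.
    """
    s = 1.0
    while int(base * s) <= min(img_w, img_h):
        size = int(base * s)
        step = max(1, int(round(s * delta)))
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                yield (x, y, size)
        s *= scale_step

wins = list(scan_windows(48, 48))
assert (0, 0, 24) in wins
assert all(x + sz <= 48 and y + sz <= 48 for x, y, sz in wins)
```

Scaling the detector rather than the image means the same two integral images serve every scale.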
Integration of multiple detections: multiple detections usually occur around each face, and around some types of false positives. A post-processing step therefore combines overlapping detections into a single detection: two detections belong to the same subset if their bounding regions overlap, and each subset is reported as one final detection.
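The overlap-merging post-process can be sketched as follows, a simplified grouping by pairwise overlap in which each subset collapses to the average of its boxes (a minimal sketch of the idea, not the paper's exact implementation):

```python
def merge_detections(boxes):
    """Group overlapping boxes (x1, y1, x2, y2) and average each group."""
    def overlaps(a, b):
        return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

    groups = []  # each group: a list of boxes belonging to the same subset
    for box in boxes:
        merged_into = None
        for g in groups:
            if any(overlaps(box, other) for other in g):
                if merged_into is None:
                    g.append(box)
                    merged_into = g
                else:            # box bridges two groups -> fuse them
                    merged_into.extend(g)
                    g.clear()
        groups = [g for g in groups if g]
        if merged_into is None:
            groups.append([box])
    # one averaged box per subset
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]

dets = [(10, 10, 34, 34), (12, 11, 36, 35), (100, 100, 124, 124)]
merged = merge_detections(dets)
assert len(merged) == 2  # two overlapping boxes fused, one kept apart
```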
Experiments on a Real-World Test Set
Conclusion: The authors developed the fastest known face detector for gray-scale images. The paper brings together new algorithms, representations, and insights that are quite generic. The test set includes faces under a very wide range of conditions, including illumination, scale, pose, and camera variation.
Thanks! (End of presentation.)
Introduction (cont.): the attentional operator is trained to detect examples of a particular class, a supervised training process, and a face classifier is constructed. In the domain of face detection, an early classifier in the cascade can accept under 1% false negatives while allowing up to 40% false positives.
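These per-stage rates compound multiplicatively across the cascade, which a two-line calculation makes concrete (the 10-stage depth here is illustrative, not from this slide):

```python
# Per-stage targets compound across a cascade:
# overall detection rate D = d ** K, overall false-positive rate F = f ** K.
d, f, K = 0.99, 0.40, 10  # per-stage rates from the slide; illustrative depth K

D = d ** K   # ~0.904: a 1% miss rate per stage costs ~10% detection overall
F = f ** K   # ~1.05e-4: a 40% per-stage FP rate shrinks to ~1 in 10,000

assert 0.90 < D < 0.91
assert 1.0e-4 < F < 1.1e-4
```

This is why per-stage goals that look weak (40% false positives) still yield a very selective detector overall.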
Example: training points x_1 = [1, 1], x_2 = [2, 2], x_3 = [2, 1], x_4 = [3, 2] with labels y_1 = 1, y_2 = 1, y_3 = 0, y_4 = 0; rounds t = 1 to 3. Initial weights (t = 1): w_{1,i} = [w_{1,1}, w_{1,2}, w_{1,3}, w_{1,4}] = [1/4, 1/4, 1/4, 1/4].
Normalize weights (t = 1): w_{1,i} = (1/4)/(1/4 + 1/4 + 1/4 + 1/4) = 1/4 for i = 1, ..., 4.
The error is evaluated with respect to w (t = 1): ε_1 = 1/4·|1-1| + 1/4·|0-1| + 1/4·|0-0| + 1/4·|0-0| = 1/4; ε_2 = 1/4·|0-1| + 1/4·|1-1| + 1/4·|0-0| + 1/4·|1-0| = 1/2.
Choose the lowest error ε_j (t = 1): choose h_1. Update the weights with β_1 = ε_1/(1 - ε_1) = (1/4)/(1 - 1/4) = 1/3: w_{2,1} = 1/4·β_1^(1-0) = 1/12, w_{2,2} = 1/4·β_1^(1-1) = 1/4, w_{2,3} = 1/4·β_1^(1-0) = 1/12, w_{2,4} = 1/4·β_1^(1-0) = 1/12.
Normalize weights (t = 2), total 1/12 + 1/4 + 1/12 + 1/12 = 1/2: w_{2,1} = (1/12)/(1/2) = 1/6, w_{2,2} = (1/4)/(1/2) = 1/2, w_{2,3} = (1/12)/(1/2) = 1/6, w_{2,4} = (1/12)/(1/2) = 1/6.
The error is evaluated with respect to w (t = 2): ε_1 = 1/6·|1-1| + 1/2·|0-1| + 1/6·|0-0| + 1/6·|0-0| = 1/2; ε_2 = 1/6·|0-1| + 1/2·|1-1| + 1/6·|0-0| + 1/6·|1-0| = 1/3.
Choose the lowest error ε_j (t = 2): choose h_2. Update the weights with β_2 = ε_2/(1 - ε_2) = (1/3)/(1 - 1/3) = 1/2: w_{3,1} = 1/6·β_2^(1-1) = 1/6, w_{3,2} = 1/2·β_2^(1-0) = 1/4, w_{3,3} = 1/6·β_2^(1-0) = 1/12, w_{3,4} = 1/6·β_2^(1-1) = 1/6.
Normalize weights (t = 3), total 1/6 + 1/4 + 1/12 + 1/6 = 2/3: w_{3,1} = (1/6)/(2/3) = 1/4, w_{3,2} = (1/4)/(2/3) = 3/8, w_{3,3} = (1/12)/(2/3) = 1/8, w_{3,4} = (1/6)/(2/3) = 1/4.
The error is evaluated with respect to w (t = 3): ε_1 = 1/4·|1-1| + 3/8·|0-1| + 1/8·|0-0| + 1/4·|0-0| = 3/8; ε_2 = 1/4·|0-1| + 3/8·|1-1| + 1/8·|0-0| + 1/4·|1-0| = 1/2.
Choose the lowest error ε_j (t = 3): choose h_1. β_3 = ε_1/(1 - ε_1) = (3/8)/(1 - 3/8) = 3/5.
The final strong classifier: α_t = log(1/β_t), so α_1 = log 3 ≈ 0.4771, α_2 = log 2 ≈ 0.3010, α_3 = log(5/3) ≈ 0.2218 (base-10 logs). Classify as class 1 when log3·h_1(x) + log2·h_2(x) + log(5/3)·h_1(x) ≥ (1/2)·Σ α_t = 1/2. On the training points (columns h_1, h_2, h_1): 1 0 1 -> class 1 (T); 0 0 0 -> class 0 (T); 0 1 0 -> class 0 (F). Test point (1, 100): h_1 = h_2 = 1, so the score is 0.4771 + 0.3010 + 0.2218 = 1 ≥ 1/2 => class 1.
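The three rounds above can be reproduced in code. The two weak classifiers h1 and h2 are my reconstruction of the thresholds implied by the slides' outputs (h1 splits on the first coordinate, h2 on the second):

```python
import math

# Training data from the slides and the two candidate weak classifiers.
X = [(1, 1), (2, 2), (2, 1), (3, 2)]
Y = [1, 1, 0, 0]
h1 = lambda x: 1 if x[0] < 2 else 0   # reconstructed: 1, 0, 0, 0 on X
h2 = lambda x: 1 if x[1] >= 2 else 0  # reconstructed: 0, 1, 0, 1 on X
H = [h1, h2]

w = [0.25] * 4
chosen, alphas = [], []
for t in range(3):
    w = [wi / sum(w) for wi in w]                       # normalize weights
    errs = [sum(wi * abs(h(x) - y) for wi, x, y in zip(w, X, Y)) for h in H]
    j = errs.index(min(errs))                           # lowest weighted error
    beta = errs[j] / (1 - errs[j])
    chosen.append(H[j])
    alphas.append(math.log10(1 / beta))
    # down-weight correctly classified examples: w <- w * beta^(1 - |h(x) - y|)
    w = [wi * beta ** (1 - abs(H[j](x) - y)) for wi, x, y in zip(w, X, Y)]

def strong(x):
    score = sum(a * h(x) for a, h in zip(alphas, chosen))
    return 1 if score >= 0.5 * sum(alphas) else 0

assert [round(a, 4) for a in alphas] == [0.4771, 0.3010, 0.2218]
assert strong((1, 100)) == 1  # the test point from the slides
```

Running this yields exactly the β and α values computed by hand above, which is a useful check on the arithmetic.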
(Figure: detection rate vs. false positive rate for detectors with different numbers of features.)