A Bayesian Framework for Unsupervised One-Shot Learning of Object Categories
Li Fei-Fei (1), Rob Fergus (1,2), Pietro Perona (1)
1. California Institute of Technology  2. Oxford University
Single object vs. object category (single-object recognition: Lowe et al., Schmid et al.)
Algorithm                                    Training examples   Categories
Rowley et al.                                ~500                Faces
Schneiderman et al.                          ~2,000              Faces, Cars
Viola et al.                                 ~10,000             Faces
Burl et al.; Weber et al.; Fergus et al.     200~400             Faces, Motorbikes, Spotted cats, Airplanes, Cars
Words of wisdom from statisticians
[Learning-curve plot: training and test error vs. number of training examples N, both converging toward the Bayes error; rule of thumb: N* ~ 5 x (# of params)]
Goal One-Shot learning of object categories.
There are about 30,000 visual categories. Learning 4.5 categories per day would take 18 years, yet by age 6 a child has learned roughly all 30,000 (~13.5/day). P. Bruegel, 1559
S. Savarese, 2003
P. Bruegel, 1562
Sneak preview

Algorithm                                  Training examples   Categories                                         Results (error)
Burl et al.; Weber et al.; Fergus et al.   200~400             Faces, Motorbikes, Spotted cats, Airplanes, Cars   5.6-10%
Bayesian One-Shot                          1~5                 Faces, Motorbikes, Spotted cats, Airplanes         8-15%
Outline of main points
- Intuition: use prior knowledge
- Approach: full Bayesian
- Implementation: probabilistic model, variational EM
- Experiments
Prior knowledge about objects I & II: appearance and shape [Figures: likely vs. unlikely part appearances and configurations]
Bayesian framework
Decision: P(object | test, train) vs. P(clutter | test, train)
Bayes rule: P(object | test, train) ∝ p(test | object, train) p(object)
Expansion by parametrization: p(test | object, train) = ∫ p(test | θ, object) p(θ | object, train) dθ
Bayesian framework
Decision: P(object | test, train) vs. P(clutter | test, train)
Bayes rule: P(object | test, train) ∝ p(test | object, train) p(object)
Expansion by parametrization: p(test | object, train) = ∫ p(test | θ, object) p(θ | object, train) dθ
Previous work (ML): p(θ | object, train) ≈ δ(θ - θ_ML)
Model (θ) space
[Figure: under ML, each model's posterior collapses to a point estimate, p(θ_i | ·) = δ(θ - θ_i^ML), for models θ_1, θ_2, ..., θ_n]
Bayesian framework
Decision: P(object | test, train) vs. P(clutter | test, train)
Bayes rule: P(object | test, train) ∝ p(test | object, train) p(object)
Expansion by parametrization: p(test | object, train) = ∫ p(test | θ, object) p(θ | object, train) dθ
One-shot learning: p(θ | object, train) ∝ p(train | θ, object) p(θ)
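The effect of marginalizing over θ rather than plugging in a point estimate can be illustrated numerically. A minimal sketch, assuming a hypothetical 1D Gaussian model with known observation variance (not the paper's constellation model): with a single training example, the full Bayesian predictive is broader than the ML plug-in, reflecting parameter uncertainty.

```python
# Toy 1D illustration (illustrative numbers, not the constellation model):
# one training observation from a Gaussian with known variance sigma2,
# and an assumed Gaussian prior on the unknown mean theta.
sigma2 = 1.0                   # known observation variance
prior_mu, prior_v = 0.0, 4.0   # prior mean and variance for theta
x_train = 2.0                  # the single ("one-shot") training example

# ML plug-in: theta_ML = x_train, so the predictive is N(x_train, sigma2).
ml_mean, ml_var = x_train, sigma2

# Full Bayesian: by conjugacy the posterior p(theta | train) is Gaussian,
# and integrating N(x | theta, sigma2) against it gives another Gaussian
# whose variance is the observation variance plus the posterior variance.
post_v = 1.0 / (1.0 / prior_v + 1.0 / sigma2)
post_mu = post_v * (prior_mu / prior_v + x_train / sigma2)
bayes_mean, bayes_var = post_mu, sigma2 + post_v

print(ml_var, bayes_var)  # the Bayesian predictive is wider
```

Note how the prior also pulls the predictive mean toward prior_mu; with more training data both effects shrink and the two methods agree.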
Model (θ) space
[Figure: models θ_1, θ_2, ..., θ_n represented as distributions over model space]
Representing θ
Main issues: measuring the similarity of parts; representing the configuration of parts
Fischler & Elschlager 73; Yuille 91; Brunelli & Poggio 93; Lades, v.d. Malsburg et al. 93; Cootes, Lanitis, Taylor et al. 95; Amit & Geman 95, 99; Perona et al. 95, 96, 98, 00, 03; Agarwal & Roth 02
Model structure
Each object model θ: Gaussian shape pdf + Gaussian part appearance pdf
Model distribution p(θ): conjugate priors (models θ_1, θ_2, ..., θ_n in model (θ) space)
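Conjugacy is what makes the posterior update tractable. A hedged sketch of the scalar case, assuming a 1D Gaussian likelihood with a Normal-Inverse-Gamma prior (the 1D analogue of the Normal-Wishart priors used for full Gaussian models; function and parameter names are illustrative):

```python
# Conjugate (Normal-Inverse-Gamma) update for a 1D Gaussian with unknown
# mean and variance: the posterior stays in the same family, so learning
# reduces to updating four hyperparameters in closed form.
def nig_update(mu0, kappa0, alpha0, beta0, data):
    n = len(data)
    xbar = sum(data) / n
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    ss = sum((x - xbar) ** 2 for x in data)
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

# One-shot update: a single training example still yields a proper
# posterior, because the prior (here, hyperparameters that could have been
# fit on unrelated categories) carries the rest of the information.
print(nig_update(0.0, 1.0, 2.0, 2.0, [3.0]))
```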
Variational EM (Attias, Hinton, Minka, etc.)
Random initialization → E-step / M-step iterations → new θ's → new estimate of p(θ | train), combined with prior knowledge of p(θ)
[Figure: models learned by the maximum likelihood method vs. the Bayesian learning method]
Summary of main points
- Goal: one-shot learning
- Intuition: use general prior knowledge
- Approach: full Bayesian; marginalize over all θ
- Implementation: probability models, variational EM
Experiments
Training: 1-6 images
Testing: 50 fg / 50 bg images (randomly drawn); decide object present/absent
Datasets: faces, airplanes, spotted cats, motorbikes [www.vision.caltech.edu]
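The present/absent decision in the test protocol follows the framework slides: compare P(object | test, train) with P(clutter | test, train). A hedged sketch, with illustrative log-likelihood numbers rather than real constellation-model densities:

```python
# Declare the object present when the posterior ratio favours "object".
# In log space: log-likelihood ratio + log prior ratio > 0.
def object_present(loglik_object, loglik_clutter, log_prior_ratio=0.0):
    return (loglik_object - loglik_clutter + log_prior_ratio) > 0.0

print(object_present(-10.0, -12.0))  # True: the object model fits better
print(object_present(-15.0, -12.0))  # False: the clutter model fits better
```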
[Example images: faces, motorbikes, airplanes, spotted cats]
No manual preprocessing: no labeling, no segmentation, no alignment
Experiments: obtaining priors
Prior distribution learned from the other categories (e.g. airplanes, spotted cats, motorbikes); cf. Miller et al. 00
One more word about priors
Where the prior distribution can come from: data, expertise, non-informative priors, others
Number of training examples
[Plots: recognition performance vs. number of training examples, one per category]
Algorithm                                  Training examples   Categories                                         Results (error)
Burl et al.; Weber et al.; Fergus et al.   200~400             Faces, Motorbikes, Spotted cats, Airplanes, Cars   5.6-10%
Bayesian One-Shot                          1~5                 Faces, Motorbikes, Spotted cats, Airplanes         8-15%
Summary
- Learning categories with one example is possible
- Decreased the number of training examples from ~300 to 1~5
- Bayesian treatment: priors from unrelated categories are useful
Future work
- Incremental learning
- More categories (e.g. >100)
- More on priors
Acknowledgements
David MacKay (Cambridge University), Brian Ripley (Oxford University), Yaser Abu-Mostafa (Caltech), Andrew Zisserman (Oxford University)

Related work:
- Leung, Burl, Perona, ICCV 95: set up constellation model
- Burl, Leung, Perona, CVPR 98: affine invariance in recognition
- Weber, Welling, Perona, ECCV 00 and CVPR 00: unsupervised learning; two-stage learning of shape & appearance
- Fergus, Perona, Zisserman, CVPR 03: simultaneous learning of shape & appearance; scale invariance
- Fei-Fei, Fergus, Perona, ICCV 03: full Bayesian treatment; one-shot learning