What makes Big Visual Data hard? Quint Buchholz Alexei (Alyosha) Efros Carnegie Mellon University
My Goals 1. To make you fall in love with Big Visual Data She is a fickle, coy mistress but holds the key to achieving real visual understanding 2. To ask for help in tackling this Big Interdisciplinary Problem
Driven by Visual Data Texture Synthesis Dating Historical Images Seeing Through Water Unsupervised Object Discovery Action Recognition Illumination Estimation Inferring 3D from 2D Geo-location
Texture: microcosm of Big Data radishes rocks yogurt
Texture Synthesis
Classical Texture Synthesis Synthesis Novel texture Parametric Texture Model This is hard! Analysis Sample texture
Throwing away too much too soon? input texture synthesized texture
Non-parametric Approach Synthesis Novel texture Analysis Sample texture
[Efros & Leung, 99, Efros & Freeman 01] p non-parametric sampling Input image
Texture Growing
Portilla & Simoncelli Xu, Guo & Shum input image Wei & Levoy Our algorithm
Two Kinds of Things in the World Navier-Stokes Equation + weather + location +
Lots of data available
Unreasonable Effectiveness of Data Parts of our world can be explained by elegant mathematics: physics, chemistry, astronomy, etc. But much cannot: [Halevy, Norvig, Pereira 2009] psychology, genetics, economics, visual understanding? Enter: The Magic of Data Great advances in several fields: e.g. speech recognition, machine translation, Google
The A.I. for the postmodern world
The Good News Really stupid algorithms + Lots of Data = Unreasonable Effectiveness
140 billion images 6 billion added monthly 6 billion images 1 billion images served daily 72 hours uploaded every minute 3.5 trillion photographs 90% of net traffic will be visual!
Physics Dating Drugs Scientific Experiments Psychology Social Graphs Collaborative Filters Genetics Web Text Medical Data Data Mining Search Disease Tracking Policy Economic Data Business Intelligence Business Data Marketing Visual Data?
Bad News Visual Data is difficult to handle text: clean, segmented, compact, 1D, indexable Visual data: Noisy, unsegmented, high entropy, 2D/3D
Computing distances is hard CLIME - CRIME = hamming distance of 1 letter y y x - x = Euclidian distance of 5 units - = Grayvalue distance of 50 values - =?
How similar are two pictures?? =
Medici Fountain, Paris
INDEXING VIA VISUAL WORDS
VISUAL WORD MATCHING [SIFT: Lowe, 2004]
letter VISUAL WORD MATCHING [SIFT: Lowe, 2004]
Medici Fountain, Paris (winter)
Visual Garbage Heap It irritated him that the dog of 3:14 in the afternoon, seen in profile, should be indicated by the same noun as the dog of 3:15, seen frontally My memory, sir, is like a garbage heap. -- from Funes the Memorious Jorge Luis Borges Organizing the Garbage Heap : Finding visual correspondences across data Mining Visual Data Connecting visual data to enable understanding (Visual Memex)
Improving Visual Correspondence
Improving Visual Correspondence
Lots of Tiny Images 80 million tiny images: a large dataset for nonparametric object and scene recognition Antonio Torralba, Rob Fergus and William T. Freeman. PAMI 2008.
Lots Of Images A. Torralba, R. Fergus, W.T.Freeman. PAMI 2008
Lots Of Images A. Torralba, R. Fergus, W.T.Freeman. PAMI 2008
Lots Of Images
Automatic Colorization Grayscale input High resolution Colorization of input using average A. Torralba, R. Fergus, W.T.Freeman. 2008
[Hays & Efros, SIGGRAPH 07]
Scene Descriptor
Scene Descriptor Scene Gist Descriptor (Oliva and Torralba 2001)
2 Million Flickr Images
200 scene matches
Improving Visual Correspondence
Improving Visual Correspondence
Visual Data has a Long Tail The rare is common!
LEARNING BETTER VISUAL CORRESPONDENCES ABHINAV SRIVASTAVA, TOMASZ MALISIEWICZ, ABHINAV GUPTA, ALEXEI EFROS SIGGRAPH ASIA 11
Input Query Top Matches
Input Query Top Matches
Input Query Top Matches
IMPORTANT PARTS? Input Query Important Parts
Input Query Top Matches
72
Way more efficient approaches: [Ramanan et al 2012, Durand et al 2012]
SEARCH USING PAINTINGS GIST Input Painting Bag-of-Words Tiny Images Our Approach HOG
SEARCH USING PAINTINGS Input Painting Top Matches
SEARCH USING PAINTINGS Input Painting Top Matches
SEARCH USING SKETCHES Tiny Images Input Sketch GIST Bag-of-Words Our Approach 81
SEARCH USING SKETCHES
APPLICATIONS
RE-PHOTOGRAPHY Computational Re-photography (Bae et al., 2010) Historical Image of Boston Station Re-photographed Image
RE-PHOTOGRAPHY Computational Re-photography (Bae et al., 2010) Historical Image of Boston Station Re-photographed Image Then & Now View
INTERNET RE-PHOTOGRAPHY Computational Re-photography (Bae et al., 2010) Historical Image of Boston Station Re-photographed Image Then & Now View Our Approach Search 10,000 Flickr Images of Boston Historical Image of Boston Station Top Match
INTERNET RE-PHOTOGRAPHY Computational Re-photography (Bae et al., 2010) Historical Image of Boston Station Re-photographed Image Our Approach Then & Now View Historical Image of Boston Station Top Match From 10,000 Flickr Images Then & Now View
WHERE WAS THE PAINTER STANDING? Input Painting
PAINTING2GPS Input Painting Retrieval set 10,000 Geo-tagged Flickr Images 100 top matches used to estimation
PAINTING2GPS Input Painting Estimated Geo-location Estimated using 100 top matches
VISUAL SCENE EXPLORATION
VISUAL SCENE EXPLORATION 96
Query image FINDING SIMILAR IMAGES
PAIRWISE SIMILARITY MATRIX...........
TRAVERSING THE GRAPH