Big Data Has the Answers: Now All We Need Are the Questions!
Steve Sonka
Recent Headlines
- The big-data revolution in US health care: Accelerating value and innovation
- How big data can revolutionize pharmaceutical R&D
- Weather Channel Now Also Forecasts What You'll Buy
- In growing field of big data, jobs go unfilled
- Is Big Data an Economic Big Dud?
- 'Big data' will change the way you farm!!!
- Monsanto Buys Climate Corp For $930 Million (10/02/13)
- Monsanto Pitches Standards for Farm Data (1/31/14)
- Big Data Comes to the Farm, Sowing Mistrust (2/26/14)
Web-Enabled Toothbrushes Join the Internet of Things
Wall Street Journal: March 2, 2014, 10:35 p.m. ET
Hype Cycle
Vertical axis: Expectations
Phases: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity
Annotations along the curve:
- R&D
- First-generation products, high price, lots of customization needed
- Mass media hype begins
- Activity beyond early adopters
- Negative press begins
- Less than 5% of the potential audience has adopted fully
- Second-generation products, some services
- Third-generation products, out of the box, product suites
- High-growth adoption phase starts: 20% to 30% of the potential audience has adopted the innovation
*** InfoAg 1995 ***
A mid-1990s view of precision farming from the CCNetAg group
SURPRISE IS INEVITABLE: BEING UNPREPARED ISN'T!
Five Questions
1. What's the "it" about Big Data?
2. How will Big Data transform ag (if it does)?
3. Is it enough to know "what" -- does "why" matter?
4. Where's the easy(ier) money?
5. Where is Ag & Big Data on the Hype Cycle?
The Term of Today: BIG DATA
Big Data (via Wikipedia)
- Amazon.com handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers.
- Walmart handles more than 1 million customer transactions every hour; its databases hold more than 2.5 petabytes (2560 terabytes) of data, 167 times the information contained in all the books in the US Library of Congress.
- Facebook handles 50 billion photos from its user base.
- Windermere Real Estate uses anonymous GPS signals from nearly 100 million drivers to help new home buyers determine commute times.
- The volume of business data worldwide doubles every 1.2 years.
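The two numeric claims on this slide are simple arithmetic. The sketch below is illustrative only: it assumes binary prefixes (1 PB = 1024 TB, which is what makes 2.5 PB equal 2560 TB) and steady exponential growth at the quoted doubling period. The function names are hypothetical, not from the talk.

```python
DOUBLING_PERIOD_YEARS = 1.2  # from the slide: business data doubles every 1.2 years


def petabytes_to_terabytes(petabytes):
    """Convert petabytes to terabytes using binary prefixes (assumption)."""
    return petabytes * 1024


def growth_factor(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Multiplicative growth after `years`, assuming steady doubling."""
    return 2 ** (years / doubling_period)


walmart_tb = petabytes_to_terabytes(2.5)   # 2560.0 TB, matching the slide
decade_growth = growth_factor(10)          # growth multiple over ten years
```

Under these assumptions, a 1.2-year doubling period compounds to roughly a 322-fold increase over a decade.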
Dimensions of Big Data: 3 Vs and an A
3 Vs: Variety, Velocity, Volume
DATA! Not Your Father's Oldsmobile!
- Financial transactions
- Movements of a cursor on a webpage
- Turns of a screw in a manufacturing process
- Tracking of web pages examined by a customer
- Text
- Photos of plants
- GPS locations
- Your eye's movements as you read this text
- Conversations on cell phones
- Fan speed, temperature, and humidity in a factory
- Images of plant growth taken from drones or from satellites
- Questions
Dimensions of Big Data: 3 Vs and an A
3 Vs: Variety, Velocity, Volume
A: Analytics
GIGO Is Now GInnGO
GIGO: Garbage In, Garbage Out
GInnGO: Garbage In, not necessarily Garbage Out
Identification of Free Parking Lots
Goal: Find Free Parking Lots on UIUC Campus
Y: Free parking lot (no charge after 5pm)
N: Not free parking lot
Free Parking Lot Identification: Results of Extended EM vs Baselines
Experiment setup: 106 parking lots of interest, 46 of which are indeed free; 30 participants; 901 marks collected
More Examples: Traffic Regulator Detection
Unreliable Sensors!
Observed signals: Long Wait, Short Wait
Traffic Regulator Detection
Math Formulation: A Maximum Likelihood Estimation Problem
Maximize the log-likelihood by appropriate selection of truth values for claims.
Log-likelihood function of the EM scheme:
$$
\ell_{em}(x;\theta) = \sum_{j=1}^{N} \left\{ z_j \left[ \sum_{i=1}^{M} \Big( S_iC_j \log a_i + (1 - S_iC_j) \log(1 - a_i) \Big) + \log d \right] + (1 - z_j) \left[ \sum_{i=1}^{M} \Big( S_iC_j \log b_i + (1 - S_iC_j) \log(1 - b_i) \Big) + \log(1 - d) \right] \right\}
$$
where $z_j = 1$ when measured variable $j$ is true and $0$ otherwise.
Expectation Maximization
Expectation Step (E-Step): $Z(t, j) = f(a^{(t)}, b^{(t)}, d^{(t)})$
Maximization Step (M-Step)
Iterate
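The E-step/M-step loop can be sketched in a few lines. This is a minimal illustration and not the speaker's implementation. It assumes that `reports[i][j]` plays the role of the indicator S_iC_j (source i asserted claim j), that a_i and b_i are the probabilities that source i asserts a claim given the claim is true or false, and that d is the prior probability that a claim is true. The closed-form M-step updates are the ones obtained by setting the derivatives of the log-likelihood on the Math Formulation slide to zero. The function names, the voting initialization, and the clamping constant are hypothetical choices.

```python
import math

EPS = 1e-9  # keeps probabilities away from 0 and 1 so log() stays finite


def clamp(p):
    """Restrict a probability to the open interval (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def em_truth_discovery(reports, n_iter=50):
    """Estimate claim truth and source reliability by EM.

    reports[i][j] is 1 if source i asserted claim j, else 0.
    Returns (z, a, b, d): z[j] is the estimated probability claim j is true.
    """
    m, n = len(reports), len(reports[0])
    # Initial guess (assumption): share of sources asserting each claim.
    z = [sum(reports[i][j] for i in range(m)) / m for j in range(n)]
    a, b, d = [0.5] * m, [0.5] * m, 0.5
    for _ in range(n_iter):
        # M-step: closed-form maximizers of the expected log-likelihood.
        z_sum = sum(z)
        d = clamp(z_sum / n)
        a = [clamp(sum(z[j] * reports[i][j] for j in range(n)) / max(z_sum, EPS))
             for i in range(m)]
        b = [clamp(sum((1 - z[j]) * reports[i][j] for j in range(n)) / max(n - z_sum, EPS))
             for i in range(m)]
        # E-step: posterior probability that each claim is true, in log space.
        new_z = []
        for j in range(n):
            log_true, log_false = math.log(d), math.log(1 - d)
            for i in range(m):
                if reports[i][j]:
                    log_true += math.log(a[i])
                    log_false += math.log(b[i])
                else:
                    log_true += math.log(1 - a[i])
                    log_false += math.log(1 - b[i])
            top = max(log_true, log_false)
            p_true = math.exp(log_true - top)
            p_false = math.exp(log_false - top)
            new_z.append(p_true / (p_true + p_false))
        z = new_z
    return z, a, b, d
```

In the parking-lot setting, each participant would be a source and each lot a claim ("this lot is free"). Thresholding `z[j]` at 0.5 gives a Y/N label per lot, and `a` and `b` indicate how much each participant's marks can be trusted.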
Remember the KNOWLEDGE ECONOMY!
Information Technology and Strategic Change
Jeffrey Sampler: How information technology redefines industries...
Look at individual transactions:
(1) Separability
(2) Aggregation potential
Transaction Characteristics
(1) Separability: extent to which information can be separated from the associated transactions
1950s: 0 or 1
Today: 011010100 110111000
Transaction Characteristics
(2) Aggregation potential: extent to which the information's value is leverageable beyond the original transaction
[Graphic: the bit pattern 011010100 110111000 repeated many times, labeled "Analytics"]
How will Big Data transform ag?
Separability: extent to which information can be separated from the associated transactions
Is It Enough to Know WHAT?
Society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what. This overturns centuries of established practices and challenges our most basic understanding of how to make decisions and comprehend reality.
The Truth about Nutrition
- The Japanese eat very little fat and suffer fewer heart attacks than do the British or Americans.
- The French eat a lot of fat and suffer fewer heart attacks than do the British or Americans.
- The Italians drink excessive amounts of red wine and suffer fewer heart attacks than do the British or Americans.
- The Japanese drink very little red wine and suffer fewer heart attacks than do the British or Americans.
- The Germans drink a lot of beer and eat lots of sausages and suffer fewer heart attacks than do the British or Americans.
The Truth about Nutrition
SO... EAT what you want. Apparently it's SPEAKING ENGLISH that kills you!
Small Data: The Hidden Key to Today's Agriculture!
Small Data: Why
Big Data: What
The Big Synergy!
Telematics
Telematics connects the farm firm
Optimal Fertilizer Distribution
Optimal Grain Movement
Where's the Easy(ier) Money?
A mid-1990s view of precision farming from the CCNetAg group
A Big Data Vision
Hype Cycle(s)
Vertical axis: Expectations; horizontal axis: Time
Phases: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity
Curves: 1995 -- Precision Ag; 2014 -- Big Data
Position markers: 2014, 20??