Why the Big Deal about Big Data?




Why the Big Deal about Big Data?
Ed Lazowska
Bill & Melinda Gates Chair in Computer Science & Engineering
Founding Director, eScience Institute
University of Washington
Technology Alliance Insight to Impact, March 2015
http://lazowska.cs.washington.edu/ta.pdf, pptx

Today
- A quick tutorial on exponentials
- Big Data and Smart Everything
- Some closer-in examples
- Components of the ecosystem
- Computer Science: The ever-expanding sphere

Every aspect of computing has experienced exponential improvement
- Processing capacity
- Storage capacity
- Network bandwidth
- Sensors
- Astonishingly, even algorithms in some cases!

Exponentials are rare. We're not used to them, so they catch us unaware.
Repeated doubling: 1, 2, 4, 8, ..., 128, 256, ..., 65,536, ..., 16,777,216, ..., 4,294,967,296, ..., 9,223,372,036,854,780,000
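The doubling sequence on this slide can be reproduced in a few lines of Python. This is a minimal sketch: the function name `doublings` is illustrative and does not come from the talk. It prints the values that appear on the slide, plus the exact value after 63 doublings.

```python
def doublings(steps):
    """Return [1, 2, 4, ...]: the starting value 1 followed by `steps` doublings."""
    return [2 ** n for n in range(steps + 1)]

seq = doublings(63)

# Print the points of the sequence that the slide shows.
for n in (0, 1, 2, 3, 7, 8, 16, 24, 32, 63):
    print(f"after {n:2d} doublings: {seq[n]:,}")
```

After 63 doublings the exact value is 9,223,372,036,854,775,808. The figure on the slide, 9,223,372,036,854,780,000, is the same number rounded to 15 significant digits.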

In Computer Science, we can exploit these exponential improvements in two ways
- Constant capability at exponentially decreasing cost
- Exponentially increasing capability at constant cost
[Figure: Storage price per MB, USD (semi-log plot), for RAM, disk, and flash. Credit: John McCallum / Havard Blok]
[Figure: Microprocessor performance, MIPS (semi-log plot). Credit: Ray Kurzweil]

The 1970s to today: 1970 Ford Mustang vs. 2014 Ford Mustang
- Size: roughly comparable
- Speed: roughly comparable
- Efficiency (MPG): roughly comparable
- Value (cost relative to performance): roughly comparable

The 1970s to today: 1971 Intel 4004 (2,300 transistors) vs. 2014 Intel Xeon (4,300,000,000 transistors)
- Size: area occupied by a transistor reduced by 1,000,000x
- Speed: operations per second increased by 100,000x
- Efficiency (operations per watt): improved by 6,750x
- Value (dollars per instruction executed): improved by 2,700x

The 1970s to today: 1970 Ford Mustang vs. 2014 Intel Xeon
What if cars had improved as rapidly as microprocessors?
- Size: A car would be smaller than an ant! (About 1/5th of an inch long!)
- Speed: A car would go 6,000,000 miles per hour! (San Francisco to New York in 1.7 seconds!)
- Efficiency: A car would get 100,000 miles per gallon! (San Francisco to New York on 1/2 cup of fuel!)
- Cost: A car would cost less than $10!
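The parenthetical figures in the car analogy can be checked with simple arithmetic. The sketch below uses the speed, efficiency, and area-reduction numbers quoted on the slides. The coast-to-coast distance (2,900 miles) and the car length (188 inches) are assumptions chosen for illustration and do not come from the talk.

```python
import math

# Figures quoted on the slides.
AREA_SHRINK = 1_000_000     # transistor area reduced by 1,000,000x
CAR_SPEED_MPH = 6_000_000   # speed of the hypothetical car
CAR_MPG = 100_000           # fuel efficiency of the hypothetical car

# Assumptions for illustration only (not from the talk).
SF_TO_NY_MILES = 2_900      # rough San Francisco to New York distance
CAR_LENGTH_INCHES = 188     # rough length of a full-size coupe
CUPS_PER_GALLON = 16        # US customary units

# A 1,000,000x reduction in area is a 1,000x reduction in each linear dimension.
linear_shrink = math.sqrt(AREA_SHRINK)
length_inches = CAR_LENGTH_INCHES / linear_shrink

trip_seconds = SF_TO_NY_MILES / CAR_SPEED_MPH * 3600
fuel_cups = SF_TO_NY_MILES / CAR_MPG * CUPS_PER_GALLON

print(f"length: {length_inches:.3f} in")
print(f"trip time: {trip_seconds:.2f} s")
print(f"fuel used: {fuel_cups:.2f} cups")
```

Under these assumptions the results land close to the slide's figures: roughly a fifth of an inch, about 1.7 seconds, and a little under half a cup of fuel.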

Today, these exponential improvements in technology and algorithms are enabling a big data revolution
- A proliferation of sensors
  - Think about the sensors on your phone
- More generally, the creation of almost all information in digital form
  - It doesn't need to be transcribed in order to be processed
- Dramatic cost reductions in storage
  - You can afford to keep all the data
- Dramatic increases in network bandwidth
  - You can move the data to where it's needed

- Dramatic cost reductions and scalability improvements in computation
  - With Amazon Web Services, 1000 computers for 1 day costs the same as 1 computer for 1000 days
- Dramatic algorithmic breakthroughs
  - Machine learning, data mining: fundamental advances in computer science and statistics
- Ever more powerful models producing ever-increasing volumes of data that must be analyzed
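The cloud pricing point above follows from billing that is linear in machine-time. The sketch below assumes a flat price per machine-day with no volume discounts, minimum charges, or data-transfer fees. The rate used is a made-up placeholder, not an actual Amazon Web Services price, and `rental_cost_cents` is an illustrative name.

```python
def rental_cost_cents(machines, days, cents_per_machine_day):
    """Total cost under flat per-machine-day pricing (assumed, not a real price list)."""
    return machines * days * cents_per_machine_day

RATE = 240  # hypothetical price in cents per machine-day

wide = rental_cost_cents(machines=1000, days=1, cents_per_machine_day=RATE)
tall = rental_cost_cents(machines=1, days=1000, cents_per_machine_day=RATE)

print(f"1000 machines for 1 day:  ${wide / 100:,.2f}")
print(f"1 machine for 1000 days: ${tall / 100:,.2f}")
```

The cost is identical either way, so under this pricing model the only thing that changes is how long you wait for the answer: one day instead of nearly three years, provided the work can be split across machines.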

So, exactly what is meant by big data? Credit: Dan Ariely, Duke University

Serious answer: big data is enabling computer scientists to put the smarts into everything
- Smart homes
- Smart cars
- Smart health
- Smart robots
- Smart crowds and human-computer systems
- Smart education
- Smart interaction (virtual and augmented reality)
- Smart cities
- Smart discovery

Smart homes (the leaf nodes of the smart grid)
Shwetak Patel, University of Washington, 2011 MacArthur Fellow

Smart cars
- DARPA Grand Challenge
- DARPA Urban Challenge
- Google Self-Driving Car

Smart health
- Larry Smarr: quantified self
- Evidence-based medicine
- P4 medicine

Smart robots

Smart crowds and human-computer systems
Zoran Popovic, UW Computer Science & Engineering
David Baker, UW Biochemistry

Smart education
Zoran Popovic, UW Computer Science & Engineering

Smart interaction

Smart cities

Smart discovery (data-intensive discovery, or eScience)
Nearly every field of discovery is transitioning from data poor to data rich
- Oceanography: OOI
- Astronomy: LSST
- Physics: LHC
- Biology: Sequencing
- Neuroscience: EEG, fMRI
- Sociology: The Web
- Economics: POS terminals

Some closer-in examples of big data in action
Collaborative filtering
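The slide names collaborative filtering without describing a method, so the following is only one common variant, shown as a hedged sketch: user-based filtering, where a missing rating is predicted as a similarity-weighted average of other users' ratings. The ratings table, user names, and function names are invented for illustration and do not come from the talk.

```python
import math

# Hypothetical ratings (user -> {item: rating}); invented for illustration.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "D": 4},
    "cat": {"A": 1, "C": 5, "D": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with nonzero similarity rated the item.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        weight = cosine(ratings[user], their)
        num += weight * their[item]
        den += weight
    return num / den if den else None

print(predict("ann", "D"))
```

In this toy data, ann's tastes resemble bob's more than cat's, so the predicted rating for item D is pulled toward bob's rating of 4 rather than cat's rating of 2.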

Fraud detection

Price prediction

Hospital re-admission prediction

Travel time prediction and route recommendation under specific circumstances

Coaching / play calling in all sports

Speech recognition

Machine translation
- Speech -> text
- Text -> text translation
- Text -> speech in speaker's voice
http://www.youtube.com/watch?v=nu-nlqqfckg&t=7m30s (7:30 to 8:40)

Presidential campaigning

Electoral forecasting

Secret government surveillance of American citizens
- Hemisphere Project
- 26 years of records of every call that passed through an AT&T switch
- New records added at a rate of 4 billion per day

Secret government surveillance of foreign heads of state

Large Scale Deep Learning
Jeff Dean, Google Senior Fellow
Joint work with many colleagues at Google
Deep Learning: a form of Machine Learning
- A modern reincarnation of Artificial Neural Networks from the 1980s and 1990s
- Made practical by vast amounts of data (e.g., billions of images on the web) and vast computing resources
- Fully automated: general algorithms are trained and then turned loose

Generating Image Captions from Pixels
- Human: Three different types of pizza on top of a stove.
- Model sample 1: Two pizzas sitting on top of a stove top oven.
- Model sample 2: A pizza sitting on top of a pan on top of a stove.

Generating Image Captions from Pixels
- Human: A tennis player getting ready to serve the ball.
- Model: A man holding a tennis racquet on a tennis court.

Generating Image Captions from Pixels
- Model sample 1: A close up of a child holding a stuffed animal.
- Model sample 2: A baby is asleep next to a teddy bear.
Arthur C. Clarke: "Any sufficiently advanced technology is indistinguishable from magic."

Components of the ecosystem
Infrastructure/Platforms

Tools Elastic Map Reduce = Hadoop

Verticals/Services
- Real estate
- Traffic
- Government data
- IT operations
- Business expense management
- IT management
- Predictive analytics for businesses

Sensor systems

Intensive users

The open data movement: Civic data for civic good

Computer Science: The ever-expanding sphere Credit: Alfred Spector, Google

High Demand Fields in WA State, Baccalaureate Level & Above
WSAC / SBCTC / WTECB, October 2013
[Figure: bar chart, scale 0 to 5000, comparing Current Completions with Additional Annual Completions Needed, 2016-21, for Computer Science; Engineering; Health Professions*; Research, Science, Technical*]
*Gap exists at the graduate and/or professional level only
Data from Table 2 of the report linked at http://www.wsac.wa.gov/sites/default/files/2013.11.16.skills.report.pdf

Is this a great time or what? http://lazowska.cs.washington.edu/ta.pdf, pptx