How To Understand The Big Data Paradigm
- Diane Allison
- 5 years ago
Transcription
1 Big Data and Its Empiricist Foundations. Teresa Scantamburlo
2 The evolution of Data Science: The mechanization of induction; The business of data; The Big Data paradigm (data + computation); Critical analysis; Tentative solutions (?); Open problems
3 Statistical Learning Theory: The question is how a machine, a computer, can learn from examples (= inductive inference and generalization ability). The machine is shown particular examples (x_1, y_1), ..., (x_n, y_n) of a specific task, where x_i ∈ X (instances) and y_i ∈ Y (labels). Its goal is to infer a general rule f : X → Y (classifier) which can both explain the examples it has seen already and which can generalize to new examples. von Luxburg and Schölkopf, Statistical Learning Theory: Models, Concepts and Results, 2011
4 Statistical Learning Theory
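The learning setup on slide 3 can be made concrete with a small sketch. It is not taken from the slides: the data, the one-dimensional instance space, and the names `learn_threshold` and `empirical_error` are all assumptions chosen for illustration. The learner sees labelled examples (x_i, y_i), picks the threshold rule f with the lowest error on them, and is then checked on examples it has not seen.

```python
from typing import Callable, List, Tuple

# One example is a pair (x_i, y_i), with x_i in X = real numbers and y_i in Y = {0, 1}.
Example = Tuple[float, int]
Classifier = Callable[[float], int]


def empirical_error(f: Classifier, examples: List[Example]) -> float:
    """Fraction of the given examples that the rule f labels incorrectly."""
    return sum(1 for x, y in examples if f(x) != y) / len(examples)


def learn_threshold(examples: List[Example]) -> Classifier:
    """Infer a rule f(x) = 1 if x >= t else 0 with minimal error on the examples.

    This is a deliberately tiny hypothesis class (hypothetical, for illustration).
    """
    xs = sorted(x for x, _ in examples)
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = (
        [xs[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
        + [xs[-1] + 1.0]
    )
    best_t = min(
        candidates,
        key=lambda t: empirical_error(lambda x: int(x >= t), examples),
    )
    return lambda x: int(x >= best_t)


# Made-up training examples (seen by the machine) and test examples (new to it).
train = [(0.5, 0), (1.0, 0), (1.5, 0), (3.0, 1), (3.5, 1), (4.0, 1)]
test = [(0.8, 0), (1.2, 0), (3.2, 1), (5.0, 1)]

f = learn_threshold(train)
train_error = empirical_error(f, train)  # how well f explains the seen examples
test_error = empirical_error(f, test)    # how well f generalizes to new ones
print(train_error, test_error)
```

The gap between `train_error` and `test_error` is the quantity statistical learning theory is concerned with: a rule that only explains the examples it has seen has not yet been shown to generalize.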
5 The Business of Data: Big Data is not simply denoted by volume. Some characterizing features: velocity, being created in or near real-time; variety, being structured and unstructured in nature; exhaustive in scope, striving to capture entire populations or systems (n=all); relational in nature, containing common fields that enable the conjoining of different data sets; fine-grained in resolution; flexible, holding the traits of extensionality (can add new fields easily) and scaleability (can expand in size rapidly). R. Kitchin, Big data, new epistemologies and paradigm shifts, 2014
6 The Big Data Paradigm: Big Data is less about data that is big than it is about a capacity to search, aggregate, and cross-reference large data sets. Big Data as a socio-technical phenomenon. It rests on the interplay of: Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets. Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims. Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy. d. boyd and K. Crawford, Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon, 2012
7 The end of theory: This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves. C. Anderson, The end of theory: The data deluge makes the scientific method obsolete, 2008
8 The triumph of correlations: Big Data encourages a growing respect for correlation, which comes to be appreciated as a more informative and plausible form of knowledge than the more definite, but also more elusive, causal explanation. In the words of Mayer-Schönberger and Cukier (2013): the correlations may not tell us precisely why something is happening, but they alert us that it is happening. And in many situations this is good enough. S. Leonelli, What Difference Does Quantity Make? On The Epistemology of Big Data in Biology, 2014
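A small numerical sketch may help show what a correlation does and does not say. It is not from the slides: the numbers are invented, and the scenario (temperature driving both ice cream sales and sunburn cases) is a standard textbook illustration assumed here. The correlation coefficient signals that two quantities move together, yet carries no information about why.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented data: daily temperature is a hidden common cause of both series.
temperature = [18, 20, 22, 25, 27, 30, 32, 35]
noise_a = [1, -2, 0, 2, -1, 1, -2, 1]
noise_b = [0.5, -0.5, 0, 0.5, -0.5, 0, 0.5, -0.5]

ice_cream_sales = [3 * t + e for t, e in zip(temperature, noise_a)]
sunburn_cases = [0.5 * t - 5 + e for t, e in zip(temperature, noise_b)]

# A strong correlation between the two observed series...
r_ice_sunburn = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation(ice cream, sunburn) = {r_ice_sunburn:.3f}")
# ...even though neither causes the other; temperature drives both.
```

Someone who only has the two observed series would be alerted that they rise and fall together, which may be "good enough" for prediction, but would need knowledge from outside the data to explain the pattern.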
9 Empiricism Reborn: There is a powerful and attractive set of ideas at work in the empiricist epistemology that runs counter to the deductive approach that is hegemonic within modern science: Big Data can capture a whole domain and provide full resolution; there is no need for a priori theory, models or hypotheses; through the application of agnostic data analytics the data can speak for themselves free of human bias or framing, and any patterns and relationships within Big Data are inherently meaningful and truthful; meaning transcends context or domain-specific knowledge, thus can be interpreted by ... R. Kitchin, Big data, new epistemologies and paradigm shifts, 2014
10 Some reactions: Claims to objectivity and accuracy are misleading; Bigger data are not always better data; Taken out of context, Big Data loses its meaning; Just because it is accessible does not make it ethical; Limited access to Big Data creates new digital divides. d. boyd and K. Crawford, Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon, 2012
11 An interesting analysis: Both data analysis models and theoretical scientific models are there to solve a problem, one to solve a problem of data analysis, the other to solve a problem of describing an empirical phenomenon. D.M. Bailer-Jones and C.A.L. Bailer-Jones, Modelling data: Analogies in neural networks, simulated annealing and genetic algorithms, 2002
12 An interesting analysis. Data analysis models: Beyond the goal of accurate prediction, the scientific insight that computational data models give in a specific case may be limited. Data analysis techniques are not specific to the type of data that are modelled. The techniques are designed to be independent of specific applications: they are application-neutral. Theoretical scientific models: A theoretical scientific model is, in contrast, specific to a type of phenomenon. The theoretical concepts and laws that give shape to the theoretical model are chosen on the basis of the physical properties of the phenomenon to be modelled. D.M. Bailer-Jones and C.A.L. Bailer-Jones, Modelling data: Analogies in neural networks, simulated annealing and genetic algorithms, 2002
13 An interesting analysis. D.M. Bailer-Jones and C.A.L. Bailer-Jones, Modelling data: Analogies in neural networks, simulated annealing and genetic algorithms, 2002
14 A tentative reconciliation: In contrast to new forms of empiricism, data-driven science seeks to hold to the tenets of the scientific method, but is more open to using a hybrid combination of abductive, inductive and deductive approaches to advance the understanding of a phenomenon. It seeks to incorporate a mode of induction into the research design, though explanation through induction is not the intended end-point (as with empiricist approaches). It forms a new mode of hypothesis generation before a deductive approach is employed. The epistemological strategy adopted within data-driven science is to use guided knowledge discovery techniques to identify potential questions (hypotheses) worthy of further examination and testing. R. Kitchin, Big data, new epistemologies and paradigm shifts, 2014
15 A philosophical interpretation: The mechanization of induction; The business of data; The Big Data paradigm (data + computation); Critical analysis; Tentative solutions (?); Open problems?
16 Hume's Legacy: Hume's anti-rationalism polemic contributed to introducing a gap between knowledge of the world and pure reasoning (Hume's fork). Knowledge of the world = a product of repeated perceptions. Imagination becomes accustomed to foreseeing the order of events. Note that this expectation subsumes a feeling of inevitability, somehow replacing the rejected rational necessity. It arises in the mind spontaneously and naturally, without the involvement of reason, merely because the mind is acted upon by the same objects in the same way repeatedly. Induction is replaced at the level of a non-rational feeling whose reliability is left to the vivacity and the freshness of data perception. So, removing any degree of rationality (or logos) within content experiences, we are led to reinforce the degree of connections
17 Open problems: Induction: abstraction and generalization? Induction: models of data and models of phenomena?
