Big Data: A Critical Analysis
Transcription
1 DAIS - Università Ca' Foscari Venezia. Teresa Scantamburlo. Big Data: A Critical Analysis. 23rd April 2015, Politecnico di Milano
2 Outline: The Realm of Big Data; Big Data definitions; Big Data paradigm; Examples (research and applications); Philosophical assumptions; The empiricist approach; Critical aspects; Hume's legacy and mechanized induction; Open problems; Models of data vs. models of phenomena; The role of induction in cognitive activity
3 The Realm of Big Data
4 Digital Footprints
5 Internet of Things
6 The Age of Big Data. We are witnessing an exceptional growth of flows of information: we are entering the age of big data. The term "big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse (McKinsey Global Institute, 2011). "It's a revolution," says Gary King, director of Harvard's Institute for Quantitative Social Science. "We're really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched" (New York Times, 2012).
7 Big Data Innovations. 1. We can analyse far more data; in some cases we can process all of it relating to a particular phenomenon (comprehensiveness). 2. Big data is messy, varies in quality, and is distributed among countless servers around the world. With big data we'll be satisfied with a sense of general direction rather than knowing a phenomenon to the inch, the penny, the atom (messiness). 3. In a big data world we don't have to be fixated on causality; we can discover patterns and correlations, which may not tell us why something is happening but alert us that something is happening (correlation). (Mayer-Schönberger and Cukier, 2013)
8 Characterizing Features. VELOCITY: being created in or near real time; VARIETY: being structured and unstructured in nature; EXHAUSTIVE IN SCOPE: striving to capture entire populations or systems (n = all); RELATIONAL: containing common fields that enable the conjoining of different data sets; FINE-GRAINED in resolution; FLEXIBLE: holding the traits of extensionality (new fields can be added easily) and scalability (size can expand rapidly). (R. Kitchin, 2014)
9 Big Data Paradigm. Big data as a socio-technical phenomenon: it refers not only to very large data sets and the tools and procedures used to manipulate and analyse them, but also to a computational turn in thought and research. It is a profound change at the levels of epistemology and ethics. Big data reframes key questions about the constitution of knowledge, the process of research, how we should engage with information, and the nature and the categorisation of reality (d. boyd and K. Crawford, 2012)
10 Computational X. Big data and analytics are fostering the emergence of new signposts, "Computational + X", and the development of new research areas: Computational Social Science, Computational Biology, Computational Physics, Computational Chemistry, Computational Economics, Computational Medicine, Computational Law, Computational Linguistics, Digital Humanities, Computer Ethics... This trend can be viewed as a result of what has been called info-computationalism, a framework based on two fundamental concepts: information as a structure (the fabric of the universe) and computation as its dynamics (G. Dodig Crnkovic, 2010)
11 Big Data Business. Conferences (new and old); journals, books, etc.; education (courses, summer schools); research centres; research projects; companies and start-ups
12 Computational Social Science. The main computational social science areas are: automated information extraction systems; social network analysis; social geographic information systems (GIS); complexity modelling; social simulation models. The Wisdom of Crowds: "If you put together a big enough and diverse group of people and ask them to make decisions affecting matters of general interest, that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or well informed he is" (J. Surowiecki, 2004)
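The wisdom-of-crowds claim on this slide can be illustrated numerically. The following is only a minimal sketch, assuming that each individual's guess is the true value plus independent, unbiased Gaussian noise; the function name and all parameter values are hypothetical and do not come from Surowiecki. If the errors were correlated or shared a common bias, averaging would not remove them.

```python
import random
import statistics

def crowd_vs_individual(true_value=100.0, crowd_size=1000, noise_sd=20.0, seed=0):
    """Compare the error of the averaged guess with the average individual error.

    Assumption: guesses are independent and unbiased (Gaussian noise around
    the true value). Real crowds may violate both conditions.
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(crowd_size)]
    # Error of the aggregated (mean) guess.
    crowd_error = abs(statistics.fmean(guesses) - true_value)
    # Mean absolute error of the individual guesses.
    individual_errors = [abs(g - true_value) for g in guesses]
    mean_individual_error = statistics.fmean(individual_errors)
    return crowd_error, mean_individual_error

crowd_error, individual_error = crowd_vs_individual()
print(f"error of the crowd average: {crowd_error:.2f}")
print(f"average individual error:   {individual_error:.2f}")
```

The error of the averaged guess can never exceed the average individual error (a consequence of the triangle inequality); how much smaller it is depends on the independence and diversity of the guesses, which is exactly the condition the quotation stresses.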
13 Disease Detection. "By processing hundreds of billions of individual searches from five years of Google web search logs, our system generates more comprehensive models for use in influenza surveillance, with regional and state-level estimates of influenza-like illness (ILI) activity in the United States." (J. Ginsberg et al., 2009) [Figure: Influenza-like illness (ILI) activity in the United States. Red = prediction by U.S. Centers for Disease Control and Prevention; Black = prediction by aggregating historical logs]
14 Mass Media Analysis. The contents of English-language online news over 5 years have been analysed to explore the impact of the Fukushima disaster on the media coverage of nuclear power. This big data study, based on millions of news articles, involves the extraction of narrative networks, association networks, and sentiment time series. The key finding is that media attitude towards nuclear power changed significantly in the wake of the Fukushima disaster. (T. Lansdall-Welfare et al., 2014) [Figure panels: before disaster; after disaster]
15 Recruiting System. Some companies are using big data to recruit new employees or to predict which employees are likely to flourish or fail. With data mining techniques we could, e.g.: estimate the specific numerical value of sales; predict production time or tenure period; rank employees. For example, Applicant Tracking Systems (ATS) software can score and sort resumes and other job application materials from a central database and rank applicants in order to achieve the best fit between a job opening and available job candidates (Data & Society Research Institute, 2014). BetterWorks (a company in Palo Alto) makes office software that blends aspects of social media, fitness tracking and video games into a system meant to keep employees more engaged with their work and one another (New York Times, 2015)
16 Crime Fighting. The Chicago Police Department conducted a research project that looked at data collected by the police department to see if big data analytics could be applied in police work. "We could in fact leverage data science across police administrative data and use it as a framework to use predictive data to prevent violence." London's Metropolitan Police Service is using new software which pulls large amounts of data in use by the police service and puts it through an advanced analytics engine to predict when criminals are likely to strike. By analysing five years' worth of data, it is hoped that an accurate prediction of when, or whether, a criminal will re-offend can be made.
17 Philosophical Assumptions
18 The End of Theory This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behaviour, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves. (C. Anderson, 2008) Scientists no longer have to make educated guesses, construct hypotheses and models, and test them with data-based experiments and examples. Instead, they can mine the complete set of data for patterns that reveal effects, producing scientific conclusions without further experimentation. (M. Prensky, 2009)
19 The Effectiveness of Data We should stop acting as if our goal is to author extremely elegant theories, and instead embrace complexity and make use of the best ally we have: the unreasonable effectiveness of data. The biggest successes in natural-language-related machine learning have been statistical speech recognition and statistical machine translation. The reason for these successes is not that these tasks are easier than other tasks...the reason is that a large training set of the input-output behaviour that we seek to automate is available to us in the wild. (A. Halevy, P. Norvig and F. Pereira, 2009)
20 The Triumph of Correlations. "There is now a better way. Petabytes allow us to say: 'Correlation is enough.' We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot... Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all." (C. Anderson, 2008) "The correlations may not tell us precisely why something is happening, but they alert us that it is happening. And in many situations this is good enough." (V. Mayer-Schönberger and K. Cukier, 2013)
21 Empiricism Reborn. In summary, the main tenets of the empiricist approach to big data are: big data can capture a whole domain and provide full resolution; there is no need for a priori theory, models or hypotheses; through the application of agnostic data analytics the data can speak for themselves free of human bias or framing, and any patterns and relationships within big data are inherently meaningful and truthful; meaning transcends context or domain-specific knowledge, and thus can be interpreted by anyone who can decode a statistic or data visualization. (R. Kitchin, 2014)
22 Empiricism & Hume's Legacy. The debate between rationalism and empiricism. Rationalists: concepts and knowledge are gained independently of sense experience. Empiricists: sense experience is the ultimate source of all our concepts and knowledge. Hume's view of knowledge: it arises in the mind spontaneously and naturally, without the involvement of reason, merely because the mind is acted upon by the same objects in the same way repeatedly
23 Alternative Approaches. There are alternative approaches to empiricism. They view big data and analytics as a positive contribution to scientific practice without considering them an oracle or a conclusive solution. Data-driven science is a hybrid combination of abductive, inductive and deductive approaches to advance the understanding of a phenomenon. It forms a new mode of hypothesis generation before a deductive approach is employed. The epistemological strategy adopted within data-driven science is to use guided knowledge discovery techniques to identify potential questions (hypotheses) worthy of further examination and testing (R. Kitchin, 2014)
24 Objective Science? "In reality, working with Big Data is still subjective, and what it quantifies does not necessarily have a closer claim on objective truth" (e.g., consider social media) (d. boyd and K. Crawford, 2012). "Big data is not self-explanatory. And yet the specific methodologies for interpreting the data are open to all sorts of philosophical debate. Can the data represent an objective truth or is any interpretation necessarily biased by some subjective filter or the way that data is cleaned?" (Bollier, 2010) Critical aspects of objectivity and accuracy: biases and subjective choices; large data sets and data errors; knowing the weaknesses in the data
25 Quality vs. Quantity? Big data offers the humanistic disciplines a new way to claim the status of quantitative science and objective method. Big data may support the mistaken belief that qualitative researchers are in the business of interpreting stories and quantitative researchers are in the business of producing facts. Big data risks re-inscribing established divisions in the long-running debates about scientific method and the legitimacy of social science and humanistic inquiry. (d. boyd and K. Crawford, 2012)
26 Data Out of Context. Because large data sets can be modelled, data are often reduced to what can fit into a mathematical model. Yet, taken out of context, data lose meaning and value. The rise of social network sites prompted an industry-driven obsession with the social graph. (d. boyd and K. Crawford, 2012) Critical aspects of data and contextual information: social graphs are not equivalent to personal networks (e.g., consider the notion of tie strength); not every connection is equivalent to every other connection; conveyed information may change over the network
27 Ethical Implications. Being in public is not the same as being public: it is problematic for researchers to justify their actions as ethical simply because the data are accessible. Just because content is publicly accessible does not mean that it was meant to be consumed by just anyone (problem of accountability and informed consent). Limited access to big data creates a new digital divide: "Some companies restrict access to their data entirely; others sell the privilege of access for a fee; and others offer small data sets to university-based researchers... The current ecosystem around big data creates a new kind of digital divide: the big data rich and the big data poor." (d. boyd and K. Crawford, 2012)
28 Is Big Data Unfair? "As we're on the cusp of using machine learning for rendering basically all kinds of consequential decisions about human beings in domains such as education, employment, advertising, health care and policing, it is important to understand why machine learning is not, by default, fair or just in any meaningful way. This runs counter to the widespread misbelief that algorithmic decisions tend to be fair, because, you know, math is about equations and not skin colour." (M. Hardt, 2014) "After all, as the former CPD [Chicago Police Department] computer experts point out, the algorithms in themselves are neutral. 'This program had absolutely nothing to do with race but multi-variable equations.' Meanwhile, the potential benefits of predictive policing are profound." (Gillian Tett, financial reporter)
29 Discriminatory Impact. Inequalities might be conveyed in various ways, and potential harms are directly concerned with the inner structure of algorithmic decision procedures. Big-data-driven decision making could have discriminatory effects even in the absence of discriminatory intent. Further concerns are expressed about an opaque decision-making environment and an impenetrable set of algorithms. "Approached without care, data mining can reproduce existing patterns of discrimination, inherit the prejudice of prior decision-makers, or simply reflect the widespread biases that persist in society. It can even have the perverse result of exacerbating existing inequalities." (S. Barocas and A.D. Selbst, 2014)
30 How Discrimination Occurs. Machine learning and data mining represent a form of statistical discrimination. Basically, they aim to end up with classifications/groupings which make sense. In machine learning procedures there are several mechanisms/steps which can play a role in the production of discriminatory results: defining the target variable and class labels; training data; feature selection; proxies; masking (S. Barocas and A.D. Selbst, 2014)
31 Machine Learning. The field of machine learning studies how a machine/computer can learn specific tasks by following specified learning algorithms. As opposed to artificial intelligence, it does not try to explain or generate intelligent behaviour; its goal is to discover mechanisms by which very specific tasks can be learned by a computer (inductive inference and generalization ability). Statistical Learning Theory Framework: the machine is shown particular examples $(x_i, y_i)$ of a specific task, where $x_i \in \mathcal{X}$ (instances) and $y_i \in \mathcal{Y}$ (labels). Its goal is to infer a general rule (classifier) which can both explain the examples it has seen already and generalize to new examples. (von Luxburg and Schölkopf, 2011)
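The "proxies" mechanism listed on this slide can be made concrete with a toy computation. This is only an illustrative sketch: the records, the postcode feature and the decision rule are all invented for the example and are not taken from Barocas and Selbst. It shows how a rule that never reads group membership can still produce very different selection rates when an input feature is correlated with the group.

```python
from collections import Counter

# Hypothetical toy records (group, postcode), invented for illustration only.
# Postcode is strongly correlated with group membership, so it acts as a proxy.
records = (
    [("A", "north")] * 8 + [("A", "south")] * 2
    + [("B", "north")] * 2 + [("B", "south")] * 8
)

def select(postcode):
    """A decision rule that never looks at group membership."""
    return postcode == "north"

def selection_rates(rows):
    """Fraction of each group selected by the rule."""
    selected = Counter()
    total = Counter()
    for group, postcode in rows:
        total[group] += 1
        selected[group] += int(select(postcode))
    return {group: selected[group] / total[group] for group in total}

rates = selection_rates(records)
print(rates)
```

On these invented records the rule selects 80% of group A and 20% of group B, even though the group label is never an input; dropping a sensitive attribute does not by itself remove its influence.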
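The learning setting described on this slide (labelled examples in, general rule out, evaluation on unseen examples) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of von Luxburg and Schölkopf: the task (label +1 when the two coordinates sum to more than 1) is hypothetical, the data are synthetic, and a nearest-centroid classifier stands in for "a learning algorithm".

```python
import random

def make_examples(n, rng):
    """Synthetic (instance, label) pairs for a hypothetical task."""
    examples = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = 1 if x[0] + x[1] > 1.0 else -1
        examples.append((x, y))
    return examples

def fit_centroids(examples):
    """Infer a rule from examples: the mean point of each class."""
    sums, counts = {}, {}
    for (x1, x2), y in examples:
        s1, s2 = sums.get(y, (0.0, 0.0))
        sums[y] = (s1 + x1, s2 + x2)
        counts[y] = counts.get(y, 0) + 1
    return {y: (sums[y][0] / counts[y], sums[y][1] / counts[y]) for y in sums}

def classify(centroids, x):
    """Predict the label whose centroid is closest to x."""
    def squared_distance(c):
        return (c[0] - x[0]) ** 2 + (c[1] - x[1]) ** 2
    return min(centroids, key=lambda y: squared_distance(centroids[y]))

def accuracy(centroids, examples):
    hits = sum(classify(centroids, x) == y for x, y in examples)
    return hits / len(examples)

rng = random.Random(0)
training_examples = make_examples(200, rng)  # examples the machine is shown
new_examples = make_examples(1000, rng)      # unseen examples from the same task

rule = fit_centroids(training_examples)
training_accuracy = accuracy(rule, training_examples)
test_accuracy = accuracy(rule, new_examples)
print(f"accuracy on seen examples: {training_accuracy:.2f}")
print(f"accuracy on new examples:  {test_accuracy:.2f}")
```

The gap between the two accuracies is what the slide calls generalization ability: a rule is only useful if it performs well on examples it has not seen, and that can be checked only because the new examples come from the same source as the training ones.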
32 Statistical Learning Theory
33 Defining Target Variable. The proper specification of the target variable is not always obvious. In some problems, defining the outcome of interest can be difficult. There are different degrees of difficulty: spam detection (simple binary classification); credit scoring ("creditworthy" is a more problematic category); employment decisions (the definition of a good employee is not given). General lesson: "while critics of data mining have tended to focus on inaccurate classifications (false positives and false negatives), as much if not more danger resides in the definition of the class label itself and the subsequent labelling of examples from which rules are inferred" (S. Barocas and A.D. Selbst, 2014)
34 Training Data. Discriminatory training data leads to discriminatory models. This may happen in two ways. Labelling examples: the analyst introduces biases and prejudices in the choice of examples (the classifier will reproduce the prejudices embedded in the examples). Prior prejudice can also be inherited through the ongoing behaviour of users taken as inputs to data mining. Data collection: disadvantaged groups are less involved in the formal economy and its data-generating activities, because they have unequal access to and relatively less fluency in the technology necessary to engage online, or because they are less profitable customers or important constituents and therefore less interesting as targets of observation (S. Barocas and A.D. Selbst, 2014)
35 FATML at NIPS and ICML. FAT ML = Fairness, Accountability and Transparency in Machine Learning. Present at NIPS 2014 and ICML 2015. Organizers: S. Barocas, S. Friedler, M. Hardt, J. Kroll, S. Venkatasubramanian, H. Wallach
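The first mechanism on this slide, prejudice embedded in labelled examples, can be shown with a deliberately simple learner. The sketch below is hypothetical throughout: the hiring history is invented, and a majority-vote rule per feature combination stands in for a trained classifier. It is meant only to show that a learner which faithfully fits past decisions also fits the bias in them.

```python
from collections import Counter, defaultdict

# Hypothetical hiring history (group, qualified, hired), invented for
# illustration. All applicants are equally qualified, but past decision-makers
# hired group A far more often than group B.
history = (
    [("A", True, True)] * 9 + [("A", True, False)] * 1
    + [("B", True, True)] * 3 + [("B", True, False)] * 7
)

def fit_majority_rule(rows):
    """Learn, for each (group, qualified) pair, the most common past decision."""
    votes = defaultdict(Counter)
    for group, qualified, hired in rows:
        votes[(group, qualified)][hired] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

rule = fit_majority_rule(history)
for key in sorted(rule):
    print(key, "->", "hire" if rule[key] else "reject")
```

The learned rule hires qualified applicants from group A and rejects equally qualified applicants from group B. It is perfectly "accurate" with respect to the majority of historical labels, which is why accuracy on past data says little about fairness.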
36 Open Problems
37 The Rationale of Data Science. The development of data science poses several questions about the meaning and the role of inductive inference in research activities and decision making. Some open problems concern: data science and the philosophical accounts of induction (Hume's legacy and different perspectives); the role of inductive inference in the models of data (abstraction) and in the models of phenomena (generalization); models of data in scientific practice and other human activities (e.g. practical reasoning)
38 Models of Data. Data analysis models: beyond the goal of accurate prediction, the scientific insight that computational data models give in a specific case may be limited. Data analysis techniques are not specific to the type of data that are modelled. The techniques are designed to be independent of specific applications: they are application-neutral. Theoretical scientific models: a theoretical scientific model is, in contrast, specific to a type of phenomenon. The theoretical concepts and laws that give shape to the theoretical model are chosen on the basis of the physical properties of the phenomenon to be modelled. (D.M. Bailer-Jones and C.A.L. Bailer-Jones, 2002)
39 Models of Data (D.M. Bailer-Jones and C.A.L. Bailer-Jones, 2002)
40 References
C. Anderson, The end of theory: The data deluge makes the scientific method obsolete, 2008
S. Barocas and A.D. Selbst, Big Data's Disparate Impact, 2014
D.M. Bailer-Jones and C.A.L. Bailer-Jones, Modelling data: Analogies in neural networks, simulated annealing and genetic algorithms, 2002
D. Bollier, The promise and the peril of big data, 2010
d. boyd and K. Crawford, Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon, 2012
G. Dodig Crnkovic, Biological information and natural computation, 2010
S. Leonelli, What Difference Does Quantity Make? On The Epistemology of Big Data in Biology, 2014
R. Kitchin, Big data, new epistemologies and paradigm shifts, 2014
A. Halevy, P. Norvig and F. Pereira, The Unreasonable Effectiveness of Data, 2009
V. Mayer-Schönberger and K. Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, 2013
M. Hardt, How big data is unfair. Understanding sources of unfairness in data driven decision making, 2014
41 Thanks!
How To Understand The Big Data Paradigm
Big Data and Its Empiricist Founda4ons Teresa Scantamburlo The evolu4on of Data Science The mechaniza4on of induc4on The business of data The Big Data paradigm (data + computa4on) Cri4cal analysis Tenta4ve
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
Big Data Hope or Hype?
Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big
Measurement and measures. Professor Brian Oldenburg
Measurement and measures Professor Brian Oldenburg Learning objectives 1. To identify similarities/differences between qualitative & quantitative measures 2. To identify steps involved in choosing and/or
Collaborations between Official Statistics and Academia in the Era of Big Data
Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI [email protected] What
Workshop Discussion Notes: Housing
Workshop Discussion Notes: Housing Data & Civil Rights October 30, 2014 Washington, D.C. http://www.datacivilrights.org/ This document was produced based on notes taken during the Housing workshop of the
Information Visualization WS 2013/14 11 Visual Analytics
1 11.1 Definitions and Motivation Lot of research and papers in this emerging field: Visual Analytics: Scope and Challenges of Keim et al. Illuminating the path of Thomas and Cook 2 11.1 Definitions and
Data Isn't Everything
June 17, 2015 Innovate Forward Data Isn't Everything The Challenges of Big Data, Advanced Analytics, and Advance Computation Devices for Transportation Agencies. Using Data to Support Mission, Administration,
Healthcare data analytics. Da-Wei Wang Institute of Information Science [email protected]
Healthcare data analytics Da-Wei Wang Institute of Information Science [email protected] Outline Data Science Enabling technologies Grand goals Issues Google flu trend Privacy Conclusion Analytics
The Networked Nature of Algorithmic Discrimination
OCTOBER 2014 The Networked Nature of Algorithmic Discrimination DANAH BOYD PRINCIPAL RESEARCHER, MICROSOFT RESEARCH; FOUNDER, DATA & SOCIETY RESEARCH INSTITUTE KAREN LEVY POSTDOCTORAL ASSOCIATE, INFORMATION
ICT Perspectives on Big Data: Well Sorted Materials
ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in
CREDIT TRANSFER: GUIDELINES FOR STUDENT TRANSFER AND ARTICULATION AMONG MISSOURI COLLEGES AND UNIVERSITIES
CREDIT TRANSFER: GUIDELINES FOR STUDENT TRANSFER AND ARTICULATION AMONG MISSOURI COLLEGES AND UNIVERSITIES With Revisions as Proposed by the General Education Steering Committee [Extracts] A. RATIONALE
Business Intelligence and Decision Support Systems
Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley
Bioethics Program Program Goals and Learning Outcomes
Bioethics Program Program Goals and Learning Outcomes Program Goals 1. Students will develop a solid knowledge base in areas of Biology including cell biology, evolution, genetics, and molecular biology.
Overview. Triplett (1898) Social Influence - 1. PSYCHOLOGY 305 / 305G Social Psychology. Research in Social Psychology 2005
PSYCHOLOGY 305 / 305G Social Psychology Research in Social Psychology 2005 Overview Triplett s study of social influence (1897-1898) Scientific Method Experimental Advantages & Disadvantages Non-experimental
Making Critical Connections: Predictive Analytics in Government
Making Critical Connections: Predictive Analytics in Improve strategic and tactical decision-making Highlights: Support data-driven decisions. Reduce fraud, waste and abuse. Allocate resources more effectively.
CHAPTER THREE: METHODOLOGY. 3.1. Introduction. emerging markets can successfully organize activities related to event marketing.
Event Marketing in IMC 44 CHAPTER THREE: METHODOLOGY 3.1. Introduction The overall purpose of this project was to demonstrate how companies operating in emerging markets can successfully organize activities
College of Arts and Sciences: Social Science and Humanities Outcomes
College of Arts and Sciences: Social Science and Humanities Outcomes Communication Information Mgt/ Quantitative Skills Valuing/Ethics/ Integrity Critical Thinking Content Knowledge Application/ Internship
Making critical connections: predictive analytics in government
Making critical connections: predictive analytics in government Improve strategic and tactical decision-making Highlights: Support data-driven decisions using IBM SPSS Modeler Reduce fraud, waste and abuse
School of Advanced Studies Doctor Of Management In Organizational Leadership. DM 004 Requirements
School of Advanced Studies Doctor Of Management In Organizational Leadership The mission of the Doctor of Management in Organizational Leadership degree program is to develop the critical and creative
CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE
CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE Michael Diederich, Microsoft CMG Research & Insights Introduction The rise of social media platforms like Facebook and Twitter has created new
Big Data for Development: What May Determine Success or failure?
Big Data for Development: What May Determine Success or failure? Emmanuel Letouzé [email protected] OECD Technology Foresight 2012 Paris, October 22 Swimming in Ocean of data Data deluge Algorithms
Big Data / Privacy: Pick One?
Big Data / Privacy: Pick One? A. Michael Froomkin University of Miami School of Law [email protected] 1 Privacy, Quickly Has multiple elements including control of access to body, to thoughts, protection
MIDLAND ISD ADVANCED PLACEMENT CURRICULUM STANDARDS AP ENVIRONMENTAL SCIENCE
Science Practices Standard SP.1: Scientific Questions and Predictions Asking scientific questions that can be tested empirically and structuring these questions in the form of testable predictions SP.1.1
International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]
Data Mining. Toon Calders TU Eindhoven
The Dangers of Data Mining Toon Calders TU Eindhoven Motivation for Data Mining: the Data Flood Huge amounts of data are available in digital form Internet IP Traffic logs Scientific data Customer profiles
Data Driven Discovery In the Social, Behavioral, and Economic Sciences
Data Driven Discovery In the Social, Behavioral, and Economic Sciences Simon Appleford, Marshall Scott Poole, Kevin Franklin, Peter Bajcsy, Alan B. Craig, Institute for Computing in the Humanities, Arts,
CFSD 21 ST CENTURY SKILL RUBRIC CRITICAL & CREATIVE THINKING
Critical and creative thinking (higher order thinking) refer to a set of cognitive skills or strategies that increases the probability of a desired outcome. In an information- rich society, the quality
Integrated Social and Enterprise Data = Enhanced Analytics
ORACLE WHITE PAPER, DECEMBER 2013 THE VALUE OF SOCIAL DATA Integrated Social and Enterprise Data = Enhanced Analytics #SocData CONTENTS Executive Summary 3 The Value of Enterprise-Specific Social Data
Research Methods Carrie Williams, (E-mail: [email protected]), Grand Canyon University
Research Methods Carrie Williams, (E-mail: [email protected]), Grand Canyon University ABSTRACT This paper discusses three common research approaches, qualitative, quantitative, and mixed methods,
Big Data Discovery: Five Easy Steps to Value
Big Data Discovery: Five Easy Steps to Value Big data could really be called big frustration. For all the hoopla about big data being poised to reshape industries from healthcare to retail to financial
CLUSTER ANALYSIS WITH R
CLUSTER ANALYSIS WITH R [cluster analysis divides data into groups that are meaningful, useful, or both] LEARNING STAGE ADVANCED DURATION 3 DAY WHAT IS CLUSTER ANALYSIS? Cluster Analysis or Clustering
Competencies for Secondary Teachers: Computer Science, Grades 4-12
1. Computational Thinking CSTA: Comp. Thinking 1.1 The ability to use the basic steps in algorithmic problemsolving to design solutions (e.g., problem statement and exploration, examination of sample instances,
Big Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
Machine Learning and Data Mining. Fundamentals, robotics, recognition
Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,
Opportunities and Limitations of Big Data
Opportunities and Limitations of Big Data Karl Schmedders University of Zurich and Swiss Finance Institute «Big Data: Little Ethics?» HWZ-Darden-Conference June 4, 2015 On fortune.com this morning: Apple's
Privacy: Legal Aspects of Big Data and Information Security
Privacy: Legal Aspects of Big Data and Information Security Presentation at the 2 nd National Open Access Workshop 21-22 October, 2013 Izmir, Turkey John N. Gathegi University of South Florida, Tampa,
Big Data in Communication Research: Its Contents and Discontents
Journal of Communication ISSN 0021-9916 AFTERWORD Big Data in Communication Research: Its Contents and Discontents Malcolm R. Parks Department of Communication, University of Washington, Seattle, WA, 98195,
Undergraduate Psychology Major Learning Goals and Outcomes i
Undergraduate Psychology Major Learning Goals and Outcomes i Goal 1: Knowledge Base of Psychology Demonstrate familiarity with the major concepts, theoretical perspectives, empirical findings, and historical
T-61.6010 Non-discriminatory Machine Learning
T-61.6010 Non-discriminatory Machine Learning Seminar 1 Indrė Žliobaitė Aalto University School of Science, Department of Computer Science Helsinki Institute for Information Technology (HIIT) University
Exhibit Inquiry Sheets
Exhibit Inquiry Sheets Notes to Students: While you travel through A Question of Truth, we recommend that you refer to the small map at the top of each page to help you find the various exhibit areas,
Big Data. Fast Forward. Putting data to productive use
Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize
School of Advanced Studies Doctor Of Management In Organizational Leadership/information Systems And Technology. DM/IST 004 Requirements
School of Advanced Studies Doctor Of Management In Organizational Leadership/information Systems And Technology The mission of the Information Systems and Technology specialization of the Doctor of Management
From Data to Foresight:
Laura Haas, IBM Fellow IBM Research - Almaden From Data to Foresight: Leveraging Data and Analytics for Materials Research 1 2011 IBM Corporation The road from data to foresight is long? Consumer Reports
Analysing Qualitative Data
Analysing Qualitative Data Workshop Professor Debra Myhill Philosophical Assumptions It is important to think about the philosophical assumptions that underpin the interpretation of all data. Your ontological
Uncovering Value in Healthcare Data with Cognitive Analytics. Christine Livingston, Perficient Ken Dugan, IBM
Uncovering Value in Healthcare Data with Cognitive Analytics Christine Livingston, Perficient Ken Dugan, IBM Conflict of Interest Christine Livingston Ken Dugan Has no real or apparent conflicts of interest
Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
The Big Data methodology in computer vision systems
The Big Data methodology in computer vision systems Popov S.B. Samara State Aerospace University, Image Processing Systems Institute, Russian Academy of Sciences Abstract. I consider the advantages of
Sanjeev Kumar. contribute
RESEARCH ISSUES IN DATA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 1. Introduction The field of data mining and knowledge discovery is emerging as a
Theoretical Perspective
Preface Motivation Manufacturers of digital products have become a driver of the world's economy. This claim is confirmed by data from the European and American stock markets. Digital products are distributed
IDC EXECUTIVE BRIEF
IDC EXECUTIVE BRIEF Enabling Better Decisions Through Unified Access to Information November 2008 Global Headquarters: 5 Speen Street Framingham,
BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
Visualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
LEARNING OUTCOMES FOR THE PSYCHOLOGY MAJOR
LEARNING OUTCOMES FOR THE PSYCHOLOGY MAJOR Goal 1. Knowledge Base of Psychology Demonstrate familiarity with the major concepts, theoretical perspectives, empirical findings, and historical trends in psychology.
Chapter 5. Warehousing, Data Acquisition, Data. Visualization
Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives
SURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
Kindergarten to Grade 4 Manitoba Foundations for Scientific Literacy
Kindergarten to Grade 4 Manitoba Foundations for Scientific Literacy The Five Foundations Manitoba Foundations for Scientific Literacy To develop scientifically literate students, science learning experiences
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Building and deploying effective data science teams. Nikita Lytkin, Ph.D.
Building and deploying effective data science teams Nikita Lytkin, Ph.D. Introduction Ph.D. in Computer Science, Machine Learning (Rutgers University) Postdoc in Machine Learning for Genomics (NYU School
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
RESEARCH PROCESS AND THEORY
INTERDISCIPLINARY RESEARCH PROCESS AND THEORY ALLEN F. REPKO The University of Texas at Arlington SAGE Los Angeles London New Delhi Singapore Washington DC Detailed Contents Preface Acknowledgments About
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction
How to Develop a Research Protocol
How to Develop a Research Protocol Goals & Objectives: To explain the theory of science To explain the theory of research To list the steps involved in developing and conducting a research protocol Outline:
Chapter 2 Conceptualizing Scientific Inquiry
Chapter 2 Conceptualizing Scientific Inquiry 2.1 Introduction In order to develop a strategy for the assessment of scientific inquiry in a laboratory setting, a theoretical construct of the components
Learning is a very general term denoting the way in which agents:
What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);
CHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful
Sentiment Analysis on Big Data
SPAN White Paper Sentiment Analysis on Big Data Machine Learning Approach Several sources on the web provide deep insight about people's opinions on the products and services of various companies. Social
DOCTOR OF BUSINESS ADMINISTRATION POLICY
DOCTOR OF BUSINESS ADMINISTRATION POLICY Section 1 Purpose and Content (1) This document outlines the specific course requirements of the Doctor of Business Administration (DBA) at the University of Western
Research Methods: Qualitative Approach
Research Methods: Qualitative Approach Sharon E. McKenzie, PhD, MS, CTRS, CDP Assistant Professor/Research Scientist Coordinator Gerontology Certificate Program Kean University Dept. of Physical Education,
Georgia Department of Education
Epidemiology Curriculum The Georgia Performance Standards are designed to provide students with the knowledge and skills for proficiency in science. The Project 2061's Benchmarks for Science Literacy is
Scholars Journal of Arts, Humanities and Social Sciences
Scholars Journal of Arts, Humanities and Social Sciences Sch. J. Arts Humanit. Soc. Sci. 2014; 2(3B):440-444 Scholars Academic and Scientific Publishers (SAS Publishers) (An International Publisher for
Organizing Your Approach to a Data Analysis
Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize
Using Big Data Analytics to
Using Big Data Analytics to Improve Government Performance Arun Chandrasekaran Gartner is a registered trademark of Gartner, Inc. or its affiliates. This publication may not be reproduced or distributed
2014-15 College-wide Goal Assessment Plans (SoA&S Assessment Coordinator September 24, 2015)
2014-15 College-wide Goal Assessment Plans (SoA&S Assessment Coordinator September 24, 2015) College-wide Goal 1: Intellectual Engagement PG1 Students will demonstrate the ability to think critically and
How Big Data is Different
FALL 2012 VOL.54 NO.1 Thomas H. Davenport, Paul Barth and Randy Bean How Big Data is Different Brought to you by Please note that gray areas reflect artwork that has been intentionally removed. The substantive
What do Big Data & HAVEn mean? Robert Lejnert HP Autonomy
What do Big Data & HAVEn mean? Robert Lejnert HP Autonomy Much higher Volumes. Processed with more Velocity. With much more Variety. Is Big Data so big? Big Data Smart Data Project HAVEn: Adaptive Intelligence
DON'T GET LOST IN THE FOG OF BIG DATA
DON'T GET LOST IN THE FOG OF BIG DATA MERCER'S LESSONS FOR SUCCESS IN WORKFORCE ANALYTICS If 2013 has produced a breakthrough technology phrase, it is big data, a fairly vague but forceful term that features
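Big Data: Opportunities & Challenges, Myths & Truths. Source: course materials of Prof. 廖世偉, National Taiwan University
Big Data: Opportunities & Challenges, Myths & Truths. Source: course materials of Prof. 廖世偉, National Taiwan University. A 13-year-old US student uses Big Data to find bullying hotspots: Puri set up the website Bullyvention, which analyzes Twitter to find words related to bullying, combined with geographic location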
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical information: Teacher: Alberto Ceselli
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process "When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean, neither more nor less." Lewis Carroll [87, p. 214] In
Predictive Analytics Certificate Program
Information Technologies Programs Predictive Analytics Certificate Program Accelerate Your Career Offered in partnership with: University of California, Irvine Extension s professional certificate and
BIG DATA FOR DEVELOPMENT: A PRIMER
June 2013 BIG DATA FOR DEVELOPMENT: A PRIMER Harnessing Big Data For Real-Time Awareness WHAT IS BIG DATA? Big Data is an umbrella term referring to the large amounts of digital data continually generated
Event Summary: The Social, Cultural, & Ethical Dimensions of Big Data
Event Summary: The Social, Cultural, & Ethical Dimensions of Big Data March 17, 2014 - New York, NY http://www.datasociety.net/initiatives/2014-0317/ This event summary attempts to capture the broad issues
Qualitative Data Collection and Analysis
Qualitative Data Collection and Analysis In this lecture Overview of observations, diary studies, field studies Interviewing in detail Interviews that are done incorrectly are lost data Externalizing and
ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam
ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure has improved the ability to collect data Virtually every aspect of business is now open
