Big Data: A Critical Analysis


1 DAIS - Università Ca' Foscari Venezia. Teresa Scantamburlo. Big Data: A Critical Analysis. 23rd April 2015, Politecnico di Milano

2 Outline
- The Realm of Big Data: Big Data definitions; Big Data paradigm; Examples (Research and Applications)
- Philosophical assumptions: The empiricist approach; Critical aspects; Hume's legacy and mechanized induction
- Open problems: Models of data vs. models of phenomena; The role of induction in cognitive activity

3 The Realm of Big Data

4 Digital Footprints

5 Internet of Things

6 The Age of Big Data We are witnessing an exceptional growth of flows of information: we are entering the age of big data. The term "big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse (McKinsey Global Institute, 2011). "It's a revolution," says Gary King, director of Harvard's Institute for Quantitative Social Science. "We're really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched" (New York Times, 2012).

7 Big Data Innovations
1. We can analyse far more data; in some cases we can process all of it relating to a particular phenomenon (comprehensiveness).
2. Big data is messy, varies in quality, and is distributed among countless servers around the world. With big data we'll be satisfied with a sense of general direction rather than knowing a phenomenon to the inch, the penny, the atom (messiness).
3. In a big data world we don't have to be fixated on causality; we can discover patterns and correlations, which may not tell us why something is happening but alert us that something is happening (correlation).
(Mayer-Schönberger and Cukier, 2013)

8 Characterizing Features
- VELOCITY: being created in or near real-time
- VARIETY: being structured and unstructured in nature
- EXHAUSTIVE IN SCOPE: striving to capture entire populations or systems (n=all)
- RELATIONAL: containing common fields that enable the conjoining of different data sets
- FINE-GRAINED in resolution
- FLEXIBLE: holding the traits of extensionality (can add new fields easily) and scalability (can expand in size rapidly)
(R. Kitchin, 2014)

9 Big Data Paradigm Big data as a socio-technical phenomenon: it refers not only to very large data sets and the tools and procedures used to manipulate and analyse them, but also to a computational turn in thought and research. It is a profound change at the levels of epistemology and ethics. Big data reframes key questions about the constitution of knowledge, the process of research, how we should engage with information, and the nature and the categorisation of reality (d. boyd and K. Crawford, 2012)

10 Computational X Big data and analytics are fostering the emergence of new signposts, "Computational + X", and the development of new research areas:
- Computational Social Science
- Computational Biology
- Computational Physics
- Computational Chemistry
- Computational Economics
- Computational Medicine
- Computational Law
- Computational Linguistics
- Digital Humanities
- Computer Ethics
- ...
This trend can be viewed as a result of what has been called infocomputationalism, a framework based on two fundamental concepts: information as a structure (the fabric of the universe) and computation as its dynamics (G. Dodig Crnkovic, 2010)

11 Big Data Business
- Conferences (new and old)
- Journals, books, etc.
- Education (courses, summer schools)
- Research centres
- Research projects
- Companies and start-ups

12 Computational Social Science The main computational social science areas are:
- automated information extraction systems and social network analysis
- social geographic information systems (GIS)
- complexity modelling
- social simulation models
The Wisdom of Crowds: "If you put together a big enough and diverse group of people and ask them to make decisions affecting matters of general interest, that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or well informed he is" (J. Surowiecki, 2004)
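The statistical intuition behind the wisdom-of-crowds claim can be sketched with a small simulation. This is only an illustration under strong assumptions that are not in the slides: every guess is independent and unbiased, the true value and noise level are made up, and the function name `crowd_vs_individual` is hypothetical. If the guesses shared a common bias, averaging would not remove it.

```python
import random
import statistics


def crowd_vs_individual(true_value=100.0, crowd_size=1000, noise_sd=20.0, seed=0):
    """Compare the error of the averaged guess with a typical individual's error.

    Assumption: each guess is the true value plus independent, zero-mean noise.
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(crowd_size)]
    # Error of the aggregated (mean) guess.
    crowd_error = abs(statistics.mean(guesses) - true_value)
    # Average absolute error of a single member of the crowd.
    individual_error = statistics.mean(abs(g - true_value) for g in guesses)
    return crowd_error, individual_error


crowd_error, individual_error = crowd_vs_individual()
print(f"crowd error: {crowd_error:.2f}, typical individual error: {individual_error:.2f}")
```

Under these assumptions the averaged guess is far closer to the true value than a typical individual guess, which is the effect the quote describes.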

13 Disease Detection "By processing hundreds of billions of individual searches from five years of Google web search logs, our system generates more comprehensive models for use in influenza surveillance, with regional and state-level estimates of influenza-like illness (ILI) activity in the United States." (J. Ginsberg et al., 2009)
Figure: Influenza-like illness (ILI) activity in the United States. Red = prediction by U.S. Centers for Disease Control and Prevention; black = prediction by aggregating historical logs.
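The idea of estimating ILI activity from search logs can be sketched as a one-variable regression. The sketch below follows the general shape of the logit-linear model described by Ginsberg et al., relating the share of ILI-related queries to the share of ILI-related physician visits. All numbers are synthetic and invented for illustration, and the helper names (`fit_line`, `predict_ili`) are hypothetical, not part of any published system.

```python
import math


def logit(p):
    """Map a proportion in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Inverse of logit: map a real number back to a proportion."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic weekly data (made up): share of ILI-related queries and
# share of physician visits that were ILI-related.
query_fraction = [0.0010, 0.0015, 0.0022, 0.0030, 0.0041, 0.0035, 0.0020]
ili_visit_rate = [0.010, 0.014, 0.021, 0.028, 0.039, 0.033, 0.019]

b0, b1 = fit_line([logit(q) for q in query_fraction],
                  [logit(p) for p in ili_visit_rate])


def predict_ili(q):
    """Estimate the ILI visit rate from a query fraction with the fitted line."""
    return inv_logit(b0 + b1 * logit(q))


print(f"estimated ILI visit rate at q=0.0025: {predict_ili(0.0025):.4f}")
```

The fit captures a correlation between two time series and nothing more, which is exactly the kind of model the later slides on correlation and causation discuss.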

14 Mass Media Analysis The contents of English-language online news over 5 years have been analysed to explore the impact of the Fukushima disaster on the media coverage of nuclear power. This big data study, based on millions of news articles, involves the extraction of narrative networks, association networks, and sentiment time series. The key finding is that media attitude towards nuclear power has significantly changed in the wake of the Fukushima disaster. (T. Lansdall-Welfare et al., 2014)
Figure panels: before disaster; after disaster.

15 Recruiting system Some companies are using big data to recruit new employees or to predict which employees are likely to flourish or fail. With data mining techniques we could, e.g.:
- estimate the specific numerical value of sales
- predict production time or tenure period
- rank employees
For example, Applicant Tracking Systems (ATS) software can score and sort resumes and other job application materials from a central database and rank applicants in order to achieve the best fit between a job opening and available job candidates (Data & Society Research Institute, 2014). BetterWorks (a company in Palo Alto) makes office software that blends aspects of social media, fitness tracking and video games into a system meant to keep employees more engaged with their work and one another (New York Times, 2015)

16 Crime Fighting The Chicago Police Department conducted a research project that looked at data collected by the police department to see if Big Data analytics could be applied in police work. "We could in fact leverage data science across police administrative data and use it as a framework to use predictive data to prevent violence." ( London's Metropolitan Police Service is using new software which pulls large amounts of data in use by the police service and puts it through an advanced analytics engine to predict when criminals are likely to strike. By analysing five years' worth of data, it is hoped that an accurate prediction of when / if a criminal will re-offend can be made. (

17 Philosophical Assumptions

18 The End of Theory This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behaviour, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves. (C. Anderson, 2008) Scientists no longer have to make educated guesses, construct hypotheses and models, and test them with data-based experiments and examples. Instead, they can mine the complete set of data for patterns that reveal effects, producing scientific conclusions without further experimentation. (M. Prensky, 2009)

19 The Effectiveness of Data We should stop acting as if our goal is to author extremely elegant theories, and instead embrace complexity and make use of the best ally we have: the unreasonable effectiveness of data. The biggest successes in natural-language-related machine learning have been statistical speech recognition and statistical machine translation. The reason for these successes is not that these tasks are easier than other tasks...the reason is that a large training set of the input-output behaviour that we seek to automate is available to us in the wild. (A. Halevy, P. Norvig and F. Pereira, 2009)

20 The Triumph of Correlations "There is now a better way. Petabytes allow us to say: 'Correlation is enough.' We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot... correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all." (C. Anderson, 2008) "The correlations may not tell us precisely why something is happening, but they alert us that it is happening. And in many situations this is good enough." (V. Mayer-Schönberger and K. Cukier, 2013)
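One caveat to "correlation is enough" can be made concrete with a short simulation: when many unrelated variables are screened against a target, some of them correlate strongly by chance alone. The sketch below uses purely random, independent series, so every correlation it finds is spurious by construction. The function names and the sizes chosen (20 observations, 5 versus 5000 candidate variables) are illustrative assumptions, not taken from the slides.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_variables, n_observations=20, seed=0):
    """Largest |correlation| between a random target and unrelated random variables."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_observations)]
    best = 0.0
    for _ in range(n_variables):
        # Each candidate is generated independently of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_observations)]
        best = max(best, abs(pearson(candidate, target)))
    return best


few = strongest_chance_correlation(5)
many = strongest_chance_correlation(5000)
print(f"best of 5 variables: {few:.2f}, best of 5000 variables: {many:.2f}")
```

The more variables are searched, the stronger the best chance correlation becomes, even though no variable has any relation to the target. This is one reason why a pattern found by mining alone still needs a hypothesis and a test.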

21 Empiricism Reborn In summary, the main tenets of the empiricist approach to big data are:
- big data can capture a whole domain and provide full resolution;
- there is no need for a priori theory, models or hypotheses;
- through the application of agnostic data analytics the data can speak for themselves free of human bias or framing, and any patterns and relationships within big data are inherently meaningful and truthful;
- meaning transcends context or domain-specific knowledge, and thus can be interpreted by anyone who can decode a statistic or data visualization.
(R. Kitchin, 2014)

22 Empiricism & Hume's Legacy The debate between rationalism and empiricism:
- Rationalists: concepts and knowledge are gained independently of sense experience
- Empiricists: sense experience is the ultimate source of all our concepts and knowledge
Hume's view of knowledge: it arises in the mind spontaneously and naturally, without the involvement of reason, merely because the mind is acted upon by the same objects in the same way repeatedly

23 Alternative Approaches There are alternative approaches to empiricism. They view big data and analytics as a positive contribution to scientific practice without considering them an oracle or a conclusive solution. Data-driven science is a hybrid combination of abductive, inductive and deductive approaches to advance the understanding of a phenomenon. It forms a new mode of hypothesis generation before a deductive approach is employed. The epistemological strategy adopted within data-driven science is to use guided knowledge discovery techniques to identify potential questions (hypotheses) worthy of further examination and testing (R. Kitchin, 2014)

24 Objective Science? In reality, working with Big Data is still subjective, and what it quantifies does not necessarily have a closer claim on objective truth (e.g., consider social media) (d. boyd and K. Crawford, 2012). "Big data is not self-explanatory. And yet the specific methodologies for interpreting the data are open to all sorts of philosophical debate. Can the data represent an objective truth or is any interpretation necessarily biased by some subjective filter or the way that data is cleaned?" (Bollier, 2010) Critical aspects on objectivity and accuracy:
- biases and subjective choices
- large data sets and data errors
- knowing the weaknesses in the data

25 Quality vs. Quantity? Big data offers the humanistic disciplines a new way to claim the status of quantitative science and objective method. Big data may support the mistaken belief that qualitative researchers are in the business of interpreting stories and quantitative researchers are in the business of producing facts. Big data risks re-inscribing established divisions in the long-running debates about scientific method and the legitimacy of social science and humanistic inquiry. (d. boyd and K. Crawford, 2012)

26 Data Out of Context Because large data sets can be modelled, data are often reduced to what can fit into a mathematical model. Yet, taken out of context, data lose meaning and value. The rise of social network sites prompted an industry-driven obsession with the "social graph". (d. boyd and K. Crawford, 2012) Critical aspects on data and contextual information:
- social graphs are not equivalent to personal networks (e.g. consider the notion of tie strength)
- not every connection is equivalent to every other connection
- conveyed information may change over the network

27 Ethical implications Being in public is not the same as being public: it is problematic for researchers to justify their actions as ethical simply because the data are accessible. Just because content is publicly accessible does not mean that it was meant to be consumed by just anyone (problem of accountability and informed consent). Limited access to big data creates a new digital divide: "Some companies restrict access to their data entirely; others sell the privilege of access for a fee; and others offer small data sets to university-based researchers... the current ecosystem around big data creates a new kind of digital divide: the big data rich and the big data poor." (d. boyd and K. Crawford, 2012)

28 Is Big Data Unfair? "As we're on the cusp of using machine learning for rendering basically all kinds of consequential decisions about human beings in domains such as education, employment, advertising, health care and policing, it is important to understand why machine learning is not, by default, fair or just in any meaningful way. This runs counter to the widespread misbelief that algorithmic decisions tend to be fair, because, you know, math is about equations and not skin colour." (M. Hardt, 2014) "After all, as the former CPD [Chicago Police Department] computer experts point out, the algorithms in themselves are neutral. 'This program had absolutely nothing to do with race but multi-variable equations.' Meanwhile, the potential benefits of predictive policing are profound." (Gillian Tett, financial reporter)

29 Discriminatory impact Inequalities might be conveyed in various ways, and potential harms are directly concerned with the inner structure of algorithmic decision procedures. Big data driven decision making could have discriminatory effects even in the absence of discriminatory intent. Further concerns are expressed about an opaque decision-making environment and an impenetrable set of algorithms. "Approached without care, data mining can reproduce existing patterns of discrimination, inherit the prejudice of prior decision-makers, or simply reflect the widespread biases that persist in society. It can even have the perverse result of exacerbating existing inequalities." (S. Barocas and A.D. Selbst, 2014)

30 How Discrimination Occurs Machine learning and data mining represent a form of statistical discrimination. Basically they aim to end up with classifications/groupings which make sense. In machine learning procedures there are several mechanisms/steps which can play a role in the production of discriminatory results:
- Defining the Target Variable and Class Labels
- Training Data
- Feature Selection
- Proxies
- Masking
(S. Barocas and A.D. Selbst, 2014)

31 Machine Learning The field of machine learning studies how a machine/computer can learn specific tasks by following specified learning algorithms. As opposed to artificial intelligence, it does not try to explain or generate intelligent behaviour; its goal is to discover mechanisms by which very specific tasks can be learned by a computer (inductive inference and generalization ability). Statistical Learning Theory Framework: the machine is shown particular examples $(x_i, y_i)$ of a specific task, where $x_i \in X$ (instances) and $y_i \in Y$ (labels). Its goal is to infer a general rule (classifier) which can both explain the examples it has seen already and generalize to new examples. (von Luxburg and Schölkopf, 2011)
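The framework on this slide (labelled examples in, general rule out, judged on new examples) can be sketched in a few lines. The nearest-centroid classifier used here is only a simple stand-in chosen for brevity; it is not the method discussed by von Luxburg and Schölkopf. The data are synthetic and all function names are hypothetical.

```python
import random


def make_examples(n, rng):
    """Draw labelled examples (x, y): x is a 2-D point, y is the label 0 or 1."""
    examples = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = (2.0, 2.0) if y == 1 else (-2.0, -2.0)
        x = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
        examples.append((x, y))
    return examples


def train_nearest_centroid(examples):
    """Infer a rule from examples: assign a point to the class with the closest mean."""
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for (x1, x2), y in examples:
        sums[y][0] += x1
        sums[y][1] += x2
        sums[y][2] += 1
    centroids = {y: (s[0] / s[2], s[1] / s[2]) for y, s in sums.items()}

    def classify(x):
        return min(centroids,
                   key=lambda y: (x[0] - centroids[y][0]) ** 2
                   + (x[1] - centroids[y][1]) ** 2)

    return classify


def error_rate(classify, examples):
    """Fraction of examples on which the rule disagrees with the label."""
    return sum(classify(x) != y for x, y in examples) / len(examples)


rng = random.Random(0)
training_set = make_examples(200, rng)   # examples the machine has seen
test_set = make_examples(1000, rng)      # new examples from the same source

classifier = train_nearest_centroid(training_set)
train_error = error_rate(classifier, training_set)
test_error = error_rate(classifier, test_set)
print(f"training error: {train_error:.3f}, test error: {test_error:.3f}")
```

The training error measures how well the rule explains the examples it has seen; the test error estimates how well it generalizes. The estimate is only meaningful because the test examples come from the same source as the training examples, which is the inductive assumption the later slides question.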

32 Statistical Learning Theory

33 Defining Target Variable The proper specification of the target variable is not always obvious. In some problems defining the outcome of interest could be difficult. There are different degrees of difficulty:
- Spam detection (simple binary classification)
- Credit scoring ("creditworthy" is a more problematic category)
- Employment decisions (the definition of a good employee is not given)
General lesson: "while critics of data mining have tended to focus on inaccurate classifications (false positives and false negatives), as much if not more danger resides in the definition of the class label itself and the subsequent labelling of examples from which rules are inferred" (S. Barocas and A.D. Selbst, 2014)

34 Training Data Discriminatory training data leads to discriminatory models. This may happen in two ways: Labelling examples: the analyst introduces biases and prejudices in the choice of examples (the classifier will reproduce the prejudices embedded in the examples). But prior prejudice can be inherited by on-going behaviour of users taken as inputs to data mining. Data collection: disadvantaged groups are less involved in the formal economy and its data-generating activities, because they have unequal access to and relatively less fluency in the technology necessary to engage online, or because they are less profitable customers or important constituents and therefore less interesting as targets of observation (S. Barocas and A.D. Selbst, 2014)
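How biased labels carry over into a model can be shown with a deliberately tiny, fully synthetic example. In the sketch below, past decision-makers hired every qualified applicant from group A but only 3 in 10 qualified applicants from group B. The learner never sees the group, only the neighbourhood (which acts as a proxy for it) and the qualification flag. The data, the group and neighbourhood names, and the per-cell majority rule are all illustrative assumptions, not a description of any real system.

```python
from collections import defaultdict


def make_history():
    """Synthetic past hiring decisions: (group, neighbourhood, qualified, hired)."""
    records = []
    for i in range(100):
        home_a = "south" if i % 10 == 0 else "north"  # group A lives mostly in the north
        home_b = "north" if i % 10 == 0 else "south"  # group B lives mostly in the south
        # Every qualified applicant from group A was hired.
        records.append(("A", home_a, True, True))
        # Only 3 in 10 qualified applicants from group B were hired.
        records.append(("B", home_b, True, i % 10 < 3))
        # Unqualified applicants from either group were never hired.
        records.append(("A", home_a, False, False))
        records.append(("B", home_b, False, False))
    return records


def train_majority_rule(records):
    """Learn hire/no-hire per (neighbourhood, qualified) cell; group is never used."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [number hired, total]
    for _group, neighbourhood, qualified, hired in records:
        cell = counts[(neighbourhood, qualified)]
        cell[0] += int(hired)
        cell[1] += 1
    return {key: hired * 2 > total for key, (hired, total) in counts.items()}


def predicted_hire_rate(rule, records, group):
    """Share of qualified applicants in `group` that the learned rule would hire."""
    qualified = [r for r in records if r[0] == group and r[2]]
    return sum(rule[(r[1], r[2])] for r in qualified) / len(qualified)


history = make_history()
rule = train_majority_rule(history)
rate_a = predicted_hire_rate(rule, history, "A")
rate_b = predicted_hire_rate(rule, history, "B")
print(f"qualified applicants hired by the learned rule: A={rate_a:.0%}, B={rate_b:.0%}")
```

Although the group attribute is excluded from the features, the learned rule hires qualified applicants from the north and rejects equally qualified applicants from the south, so the prejudice embedded in the labels is reproduced, and in this toy case amplified, through the proxy.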

35 FAT ML at NIPS and ICML FAT ML = Fairness, Accountability and Transparency in Machine Learning. Present at NIPS 2014 and ICML 2015. Organizers: S. Barocas, S. Friedler, M. Hardt, J. Kroll, S. Venkatasubramanian, H. Wallach

36 Open Problems

37 The Rationale of Data Science The development of data science poses several questions about the meaning and the role of inductive inference in research activities and decision making. Some open problems regard:
- Data science and the philosophical accounts of induction (Hume's legacy and different perspectives)
- The role of inductive inference in the models of data (abstraction) and in the models of phenomena (generalization)
- Models of data in scientific practice and other human activities (e.g. practical reasoning)

38 Models of Data
- Data analysis models: beyond the goal of accurate prediction, the scientific insight that computational data models give in a specific case may be limited. Data analysis techniques are not specific to the type of data that are modelled. The techniques are designed to be independent of specific applications: they are application-neutral.
- Theoretical scientific models: a theoretical scientific model is, in contrast, specific to a type of phenomenon. The theoretical concepts and laws that give shape to the theoretical model are chosen on the basis of the physical properties of the phenomenon to be modelled.
(D.M. Bailer-Jones and C.A.L. Bailer-Jones, 2002)

39 Models of Data (D.M. Bailer-Jones and C.A.L. Bailer-Jones, 2002)

40 References
- C. Anderson, The end of theory: The data deluge makes the scientific method obsolete, 2008
- S. Barocas and A.D. Selbst, Big Data's Disparate Impact, 2014
- D.M. Bailer-Jones and C.A.L. Bailer-Jones, Modelling data: Analogies in neural networks, simulated annealing and genetic algorithms, 2002
- D. Bollier, The promise and the peril of big data, 2010
- d. boyd and K. Crawford, Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon, 2012
- G. Dodig Crnkovic, Biological information and natural computation, 2010
- S. Leonelli, What Difference Does Quantity Make? On The Epistemology of Big Data in Biology, 2014
- R. Kitchin, Big data, new epistemologies and paradigm shifts, 2014
- A. Halevy, P. Norvig and F. Pereira, The Unreasonable Effectiveness of Data, 2009
- V. Mayer-Schönberger and K. Cukier, Big Data: A Revolution that Will Change How We Live, 2013
- M. Hardt, How big data is unfair. Understanding sources of unfairness in data driven decision making, 2014

41 Thanks!

How To Understand The Big Data Paradigm

How To Understand The Big Data Paradigm Big Data and Its Empiricist Founda4ons Teresa Scantamburlo The evolu4on of Data Science The mechaniza4on of induc4on The business of data The Big Data paradigm (data + computa4on) Cri4cal analysis Tenta4ve

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Big Data Hope or Hype?

Big Data Hope or Hype? Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big

More information

Measurement and measures. Professor Brian Oldenburg

Measurement and measures. Professor Brian Oldenburg Measurement and measures Professor Brian Oldenburg Learning objectives 1. To identify similarities/differences between qualitative & quantitative measures 2. To identify steps involved in choosing and/or

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI [email protected] What

More information

Workshop Discussion Notes: Housing

Workshop Discussion Notes: Housing Workshop Discussion Notes: Housing Data & Civil Rights October 30, 2014 Washington, D.C. http://www.datacivilrights.org/ This document was produced based on notes taken during the Housing workshop of the

More information

Information Visualization WS 2013/14 11 Visual Analytics

Information Visualization WS 2013/14 11 Visual Analytics 1 11.1 Definitions and Motivation Lot of research and papers in this emerging field: Visual Analytics: Scope and Challenges of Keim et al. Illuminating the path of Thomas and Cook 2 11.1 Definitions and

More information

Data Isn't Everything

Data Isn't Everything June 17, 2015 Innovate Forward Data Isn't Everything The Challenges of Big Data, Advanced Analytics, and Advance Computation Devices for Transportation Agencies. Using Data to Support Mission, Administration,

More information

Healthcare data analytics. Da-Wei Wang Institute of Information Science [email protected]

Healthcare data analytics. Da-Wei Wang Institute of Information Science wdw@iis.sinica.edu.tw Healthcare data analytics Da-Wei Wang Institute of Information Science [email protected] Outline Data Science Enabling technologies Grand goals Issues Google flu trend Privacy Conclusion Analytics

More information

The Networked Nature of Algorithmic Discrimination

The Networked Nature of Algorithmic Discrimination OCTOBER 2014 The Networked Nature of Algorithmic Discrimination DANAH BOYD PRINCIPAL RESEARCHER, MICROSOFT RESEARCH; FOUNDER, DATA & SOCIETY RESEARCH INSTITUTE KAREN LEVY POSTDOCTORAL ASSOCIATE, INFORMATION

More information

ICT Perspectives on Big Data: Well Sorted Materials

ICT Perspectives on Big Data: Well Sorted Materials ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in

More information

CREDIT TRANSFER: GUIDELINES FOR STUDENT TRANSFER AND ARTICULATION AMONG MISSOURI COLLEGES AND UNIVERSITIES

CREDIT TRANSFER: GUIDELINES FOR STUDENT TRANSFER AND ARTICULATION AMONG MISSOURI COLLEGES AND UNIVERSITIES CREDIT TRANSFER: GUIDELINES FOR STUDENT TRANSFER AND ARTICULATION AMONG MISSOURI COLLEGES AND UNIVERSITIES With Revisions as Proposed by the General Education Steering Committee [Extracts] A. RATIONALE

More information

Business Intelligence and Decision Support Systems

Business Intelligence and Decision Support Systems Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley

More information

Bioethics Program Program Goals and Learning Outcomes

Bioethics Program Program Goals and Learning Outcomes Bioethics Program Program Goals and Learning Outcomes Program Goals 1. Students will develop a solid knowledge base in areas of Biology including cell biology, evolution, genetics, and molecular biology.

More information

Overview. Triplett (1898) Social Influence - 1. PSYCHOLOGY 305 / 305G Social Psychology. Research in Social Psychology 2005

Overview. Triplett (1898) Social Influence - 1. PSYCHOLOGY 305 / 305G Social Psychology. Research in Social Psychology 2005 PSYCHOLOGY 305 / 305G Social Psychology Research in Social Psychology 2005 Overview Triplett s study of social influence (1897-1898) Scientific Method Experimental Advantages & Disadvantages Non-experimental

More information

Making Critical Connections: Predictive Analytics in Government

Making Critical Connections: Predictive Analytics in Government Making Critical Connections: Predictive Analytics in Improve strategic and tactical decision-making Highlights: Support data-driven decisions. Reduce fraud, waste and abuse. Allocate resources more effectively.

More information

CHAPTER THREE: METHODOLOGY. 3.1. Introduction. emerging markets can successfully organize activities related to event marketing.

CHAPTER THREE: METHODOLOGY. 3.1. Introduction. emerging markets can successfully organize activities related to event marketing. Event Marketing in IMC 44 CHAPTER THREE: METHODOLOGY 3.1. Introduction The overall purpose of this project was to demonstrate how companies operating in emerging markets can successfully organize activities

More information

College of Arts and Sciences: Social Science and Humanities Outcomes

College of Arts and Sciences: Social Science and Humanities Outcomes College of Arts and Sciences: Social Science and Humanities Outcomes Communication Information Mgt/ Quantitative Skills Valuing/Ethics/ Integrity Critical Thinking Content Knowledge Application/ Internship

More information

Making critical connections: predictive analytics in government

Making critical connections: predictive analytics in government Making critical connections: predictive analytics in government Improve strategic and tactical decision-making Highlights: Support data-driven decisions using IBM SPSS Modeler Reduce fraud, waste and abuse

More information

School of Advanced Studies Doctor Of Management In Organizational Leadership. DM 004 Requirements

School of Advanced Studies Doctor Of Management In Organizational Leadership. DM 004 Requirements School of Advanced Studies Doctor Of Management In Organizational Leadership The mission of the Doctor of Management in Organizational Leadership degree program is to develop the critical and creative

More information

CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE

CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE CORRALLING THE WILD, WILD WEST OF SOCIAL MEDIA INTELLIGENCE Michael Diederich, Microsoft CMG Research & Insights Introduction The rise of social media platforms like Facebook and Twitter has created new

More information

Big Data for Development: What May Determine Success or failure?

Big Data for Development: What May Determine Success or failure? Big Data for Development: What May Determine Success or failure? Emmanuel Letouzé [email protected] OECD Technology Foresight 2012 Paris, October 22 Swimming in Ocean of data Data deluge Algorithms

More information

Big Data / Privacy: Pick One?

Big Data / Privacy: Pick One? Big Data / Privacy: Pick One? A. Michael Froomkin University of Miami School of Law [email protected] 1 Privacy, Quickly Has multiple elements including control of access to body, to thoughts, protection

More information

MIDLAND ISD ADVANCED PLACEMENT CURRICULUM STANDARDS AP ENVIRONMENTAL SCIENCE

MIDLAND ISD ADVANCED PLACEMENT CURRICULUM STANDARDS AP ENVIRONMENTAL SCIENCE Science Practices Standard SP.1: Scientific Questions and Predictions Asking scientific questions that can be tested empirically and structuring these questions in the form of testable predictions SP.1.1

More information

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]

More information

Data Mining. Toon Calders TU Eindhoven

Data Mining. Toon Calders TU Eindhoven The Dangers of Data Mining Toon Calders TU Eindhoven Motivation for Data Mining: the Data Flood Huge amounts of data are available in digital form Internet IP Traffic logs Scientific data Customer profiles

More information

Data Driven Discovery In the Social, Behavioral, and Economic Sciences

Data Driven Discovery In the Social, Behavioral, and Economic Sciences Data Driven Discovery In the Social, Behavioral, and Economic Sciences Simon Appleford, Marshall Scott Poole, Kevin Franklin, Peter Bajcsy, Alan B. Craig, Institute for Computing in the Humanities, Arts,

More information

CFSD 21 ST CENTURY SKILL RUBRIC CRITICAL & CREATIVE THINKING

CFSD 21 ST CENTURY SKILL RUBRIC CRITICAL & CREATIVE THINKING Critical and creative thinking (higher order thinking) refer to a set of cognitive skills or strategies that increases the probability of a desired outcome. In an information- rich society, the quality

More information

Integrated Social and Enterprise Data = Enhanced Analytics

Integrated Social and Enterprise Data = Enhanced Analytics ORACLE WHITE PAPER, DECEMBER 2013 THE VALUE OF SOCIAL DATA Integrated Social and Enterprise Data = Enhanced Analytics #SocData CONTENTS Executive Summary 3 The Value of Enterprise-Specific Social Data
