Wrapping It Up. Jilles Vreeken. 31 July 2015

Size: px
Start display at page:

Download "Wrapping It Up. Jilles Vreeken. 31 July 2015"

Transcription

1 Wrapping It Up Jilles Vreeken 31 July 2015

2 What did we do? Introduction Patterns Correlation and Causation (Subjective) Interestingness Graphs Wrap-up + <ask-me-anything>

3 Take Home: overall Overview of the hot topics in data mining that Jilles thinks are cool strongly biased sample by interest and available time I wanted to give a general picture of what data mining is, what makes it special, and what s currently happening at the edge of human knowledge

4 Key Take-Home Message Data mining is descriptive not predictive the goal is to give you insight into your data, to offer (parts of) candidate hypotheses, what you do with those is up to you.

5 Take Home: Information Theory Exploratory data analysis wandering around your data, looking for interesting things, without being asked questions you cannot know the answer of. Questions like: What distribution should we assume? How many clusters/factors/patterns do you want? Please parameterize this Bayesian network?

6 Take Home: Patterns Pattern mining aims to provide a simple descriptions of the structures that your data exhibits locally. Mining patterns is easy. Mining interesting patterns that are significant and doing so without redundancy, not so much.

7 Take Home: Information Theory Information Theory is a branch of statistics, concerned with measuring information information = reduction of uncertainty Uncertainty can be quantified in bits Everything new you learn about your data allows you to compress it better

8 Take Home: Correlations Correlations can be spurious and deceiving. Mutual information is a strong notion of interaction. Based on Shannon entropy MI is hard to compute for continuous-valued data without making assumptions on the distribution. Based on cumulative entropy MI can detect non-linear correlations without requiring assumptions.

9 Take Home: Causation Causality is a difficult concept. Standard probabilistic approaches based on likelihood cannot detect causal direction between pairs. Additive noise models and information theoretic measures can. Oh, and storks cause babies.

10 Take Home: Interestingness Interestingness is ultimately subjective Still, to have algorithms that can find potentially interesting things we somehow need to formalize it

11 Take Home: Graph Mining Most graph mining approaches are global and predictive Explain everything in one go real graphs are too complex for that Taking a local and descriptive approach allows for more detailed results, richer problems, easier formalization, efficient solutions very little done so far, many cool open problems

12 Take Home: Dynamic Data Data is rarely static even though many algorithms expect that Streaming algorithms work when data is too big to fit anywhere while dynamic algorithms aim to adjust the answer with the changing data

13 Take Home: Assignments What the hell was he thinking?? I wanted you to learn to read scientific papers without getting lost in details quickly forming high level pictures of complex ideas read critically, seeing through scientific sales-pitches show independent thinking, make ideas your own I was not disappointed.

14 Take Home: TADA Data analysis is important, upcoming, but still very young aims to tackle impossible problems, such as finding interesting things in enormous search spaces is a weird mix of theory and practice: likes to be foundational, yet not afraid of ad hoc and, not unimportant, it s lots of fun.

15 Exam dates The Exam type: when: time: oral August 3 rd and 4 th individual where: E1.7 room 3.01 what: all material discussed in the lectures, plus one assignment (your choice) per topic The Re-Exam type: when: time: oral September 28 th individual where: E1.7 room 3.01

16 Evaluation: I did not like Class should end in time :) The amount of time necessary for every assignment.

17 Evaluation: Suggestions More motivated slide More details on the why? Bit heavy course for 5 ECTS Yes. More practical follow-up to implement/text ideas Maybe Discuss assignments the day it is brought online We can do that.

18 Things to do Master thesis projects in principle: yes! in practice: depending background, motivation, interests, and grades --- plus, on whether I have time interested? mail me Student Research Assistant (HiWi) positions in principle: maybe in practice: depends on background, grades, and in particular your motivation and interests interested? mail me, include CV and grades

19 Sample Topics Graphs - characterising viruses - realistic graph generators - interesting subgraphs - comparing graphs Causality - did X cause Y? - mining causal graphs - what s the cause of this? - predicting the future Useful Patterns - tell me about this - privacy & data generation - pattern-based indexing - noise reduction Rich Data & Text - pattern-based topic models - grammar & compression - rich MaxEnt modelling - outliers in rich data

20 Good Reads Data Analysis: a Bayesian Tutorial D.S. Sivia & J. Skilling (very good, but skip the MaxEnt stuff) Elements of Information Theory Thomas Cover & Joy Thomas (very good textbook) The Information James Gleick (great light reading)

21 Teach us More! Well, ok let me advertise Information Retrieval and Data Mining together with Gerhard Weikum Core Lecture 9 ECTS In addition, Hoang Vu and Mario will likely teach one or two courses next semester Options include: Causal Inference Mining High Dimensional Data Mining (Correlated) Patterns (seminar+lectures) (seminar+lectures) (seminar+lectures)

22 Question Time!

23 Privacy & Data Mining What is your opinion on privacy preserving data mining? Have you ever worked with it? Do you think it is useful, or does it somehow contradicts 'the spirit' of data mining?

24 Text Mining Have you ever worked with text mining? Do you think considering grammar is necessary, or is mere statistics enough?

25 Big Data Does Big Data exist? How big is Big Data? When is the data Big enough? Is more data always better?

26 Mining Massive Data Map Reduce, Hadoop, Big Table, Cassandra, Spark, Dremel, etc, etc engineering or science? Essentially tricks not magic that work well for certain specific problems For KDD 2014, at least 25 out of 150 presentations will be specifically aimed at large scale stuff

27 Mining the Cloud How about data analytics in the cloud?

28 Social Network Analysis Many, many, many papers about social network analysis So far: lots of statistics, not much mining That is, most are about how to model a graph probabilistically, how to fit a given distribution. The Elephant in the Room: what is the graph distribution? Nobody knows. Yet.

29 Your Question Here!

30 Conclusions This concludes TADA 15. I hope you enjoyed the ride.

31 Thank you! This concludes TADA 15. I hope you enjoyed the ride.

So, how do you pronounce. Jilles Vreeken. Okay, now we can talk. So, what kind of data? binary. * multi-relational

So, how do you pronounce. Jilles Vreeken. Okay, now we can talk. So, what kind of data? binary. * multi-relational Simply Mining Data Jilles Vreeken So, how do you pronounce Exploratory Data Analysis Jilles Vreeken Jilles Yill less Vreeken Fray can 17 August 2015 Okay, now we can talk. 17 August 2015 The goal So, what

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What

More information

Data Science at U of U

Data Science at U of U Data Science at U of U Je M. Phillips Assistant Professor, School of Computing Center for Extreme Data Management, Analysis, and Visualization Director, Data Management and Analysis Track University of

More information

Big Data and Scripting

Big Data and Scripting Big Data and Scripting 1, 2, Big Data and Scripting - abstract/organization contents introduction to Big Data and involved techniques schedule 2 lectures (Mon 1:30 pm, M628 and Thu 10 am F420) 2 tutorials

More information

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Machine Learning and Data Mining. Fundamentals, robotics, recognition Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

Contact Recommendations from Aggegrated On-Line Activity

Contact Recommendations from Aggegrated On-Line Activity Contact Recommendations from Aggegrated On-Line Activity Abigail Gertner, Justin Richer, and Thomas Bartee The MITRE Corporation 202 Burlington Road, Bedford, MA 01730 {gertner,jricher,tbartee}@mitre.org

More information

Proseminar Wissenschaftliches Arbeiten. Ulf Leser

Proseminar Wissenschaftliches Arbeiten. Ulf Leser Proseminar Wissenschaftliches Arbeiten Ulf Leser Proseminar We want to teach you how to approach a scientific topic to find scientific literature and discern relevant from irrelevant to systematically

More information

Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health

Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining

More information

Intelligent Learning and Analysis Systems: Data Mining and Knowledge Discovery Prof. Dr. Stefan Wrobel; Dr. Tamas Horvath

Intelligent Learning and Analysis Systems: Data Mining and Knowledge Discovery Prof. Dr. Stefan Wrobel; Dr. Tamas Horvath Intelligent Learning and Analysis Systems: Data Mining and Knowledge Discovery Prof. Dr. Stefan Wrobel; Dr. Tamas Horvath Lecture Survey Fachschaft Informatik October 12, 2015 Turned in Questionnaires:

More information

A Theoretical Framework for Exploratory Data Mining: Recent Insights and Challenges Ahead

A Theoretical Framework for Exploratory Data Mining: Recent Insights and Challenges Ahead A Theoretical Framework for Exploratory Data Mining: Recent Insights and Challenges Ahead Tijl De Bie and Eirini Spyropoulou Intelligent Systems Lab, University of Bristol, UK tijl.debie@gmail.com, enxes@bristol.ac.uk

More information

Name of Module: Big Data ECTS: 6 Module-ID: Person Responsible for Module (Name, Mail address): Angel Rodríguez, arodri@fi.upm.es

Name of Module: Big Data ECTS: 6 Module-ID: Person Responsible for Module (Name, Mail address): Angel Rodríguez, arodri@fi.upm.es Name of Module: Big Data ECTS: 6 Module-ID: Person Responsible for Module (Name, Mail address): Angel Rodríguez, arodri@fi.upm.es University: UPM Departments: DATSI, DLSIIS 1. Prerequisites for Participation

More information

Agreement on. Dual Degree Master Program in Computer Science KAIST. Technische Universität Berlin

Agreement on. Dual Degree Master Program in Computer Science KAIST. Technische Universität Berlin Agreement on Dual Degree Master Program in Computer Science between KAIST Department of Computer Science and Technische Universität Berlin Fakultät für Elektrotechnik und Informatik (Fakultät IV) 1 1 Subject

More information

Learning outcomes. Knowledge and understanding. Competence and skills

Learning outcomes. Knowledge and understanding. Competence and skills Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

More information

DATE: What is Halloween?

DATE: What is Halloween? Questions: What do you already know about Halloween? Read the article below and then answer the questions. What is Halloween? Halloween is a traditional celebration that began in Europe hundreds of years

More information

RESEARCH OPPORTUNITY PROGRAM 299Y PROJECT DESCRIPTIONS 2015 2016 SUMMER

RESEARCH OPPORTUNITY PROGRAM 299Y PROJECT DESCRIPTIONS 2015 2016 SUMMER Project Code: CSC 1S Professor Ronald M. Baecker & Assistant Lab Director Carrie Demmans Epp Computer Science TITLE OF RESEARCH PROJECT: Exploring Language Use in Computer Science Discussion Forums NUMBER

More information

CPSC 340: Machine Learning and Data Mining. Mark Schmidt University of British Columbia Fall 2015

CPSC 340: Machine Learning and Data Mining. Mark Schmidt University of British Columbia Fall 2015 CPSC 340: Machine Learning and Data Mining Mark Schmidt University of British Columbia Fall 2015 Outline 1) Intro to Machine Learning and Data Mining: Big data phenomenon and types of data. Definitions

More information

The Intersection of Big Data, Data Science, and The Internet of Things

The Intersection of Big Data, Data Science, and The Internet of Things The Intersection of Big Data, Data Science, and The Internet of Things Bebo White SLAC National Accelerator Laboratory/ Stanford University bebo@slac.stanford.edu SLAC is a US national laboratory operated

More information

What? So what? NOW WHAT? Presenting metrics to get results

What? So what? NOW WHAT? Presenting metrics to get results What? So what? NOW WHAT? What? So what? Visualization is like photography. Impact is a function of focus, illumination, and perspective. What? NOW WHAT? Don t Launch! Prevent your own disastrous decisions

More information

Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015

Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Lecture: MWF: 1:00-1:50pm, GEOLOGY 4645 Instructor: Mihai

More information

Big Data and Scripting. (lecture, computer science, bachelor/master/phd)

Big Data and Scripting. (lecture, computer science, bachelor/master/phd) Big Data and Scripting (lecture, computer science, bachelor/master/phd) Big Data and Scripting - abstract/organization abstract introduction to Big Data and involved techniques lecture (2+2) practical

More information

1. Programme title and designation Advanced Software Engineering

1. Programme title and designation Advanced Software Engineering PROGRAMME APPROVAL FORM SECTION 1 THE PROGRAMME SPECIFICATION 1. Programme title and designation Advanced Software Engineering 2. Final award Award Title Credit Value MSc Advanced Software Engineering

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Bachelor Exchange Report Sweden Uppsala part A Pieter Jan Tavenier (s2348020)

Bachelor Exchange Report Sweden Uppsala part A Pieter Jan Tavenier (s2348020) Bachelor Exchange Report Sweden Uppsala part A Pieter Jan Tavenier (s2348020) In enjoyed my bachelor exchange in Uppsala, Sweden and I can tell you that within Europe this is definitely one of the best

More information

Big Data Frameworks Course. Prof. Sasu Tarkoma 10.3.2015

Big Data Frameworks Course. Prof. Sasu Tarkoma 10.3.2015 Big Data Frameworks Course Prof. Sasu Tarkoma 10.3.2015 Contents Course Overview Lectures Assignments/Exercises Course Overview This course examines current and emerging Big Data frameworks with focus

More information

Big Data Management and Analytics

Big Data Management and Analytics Big Data Management and Analytics Lecture Notes Winter semester 2015 / 2016 Ludwig-Maximilians-University Munich Prof. Dr. Matthias Renz 2015 Based on lectures by Donald Kossmann (ETH Zürich), as well

More information

The Challenge of Helping Adults Learn: Principles for Teaching Technical Information to Adults

The Challenge of Helping Adults Learn: Principles for Teaching Technical Information to Adults The Challenge of Helping Adults Learn: Principles for Teaching Technical Information to Adults S. Joseph Levine, Ph.D. Michigan State University levine@msu.edu One of a series of workshop handouts made

More information

Software Engineering and Service Design: courses in ITMO University

Software Engineering and Service Design: courses in ITMO University Software Engineering and Service Design: courses in ITMO University Igor Buzhinsky igor.buzhinsky@gmail.com Computer Technologies Department Department of Computer Science and Information Systems December

More information

Why do statisticians "hate" us?

Why do statisticians hate us? Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

15.496 Data Technologies for Quantitative Finance

15.496 Data Technologies for Quantitative Finance Paul F. Mende MIT Sloan School of Management Fall 2014 Course Syllabus 15.496 Data Technologies for Quantitative Finance Course Description. This course introduces students to financial market data and

More information

Program description for the Master s Degree Program in Mathematics and Finance

Program description for the Master s Degree Program in Mathematics and Finance Program description for the Master s Degree Program in Mathematics and Finance : English: Master s Degree in Mathematics and Finance Norwegian, bokmål: Master i matematikk og finans Norwegian, nynorsk:

More information

Interviewer: Sonia Doshi (So) Interviewee: Sunder Kannan (Su) Interviewee Description: Junior at the University of Michigan, inactive Songza user

Interviewer: Sonia Doshi (So) Interviewee: Sunder Kannan (Su) Interviewee Description: Junior at the University of Michigan, inactive Songza user Interviewer: Sonia Doshi (So) Interviewee: Sunder Kannan (Su) Interviewee Description: Junior at the University of Michigan, inactive Songza user Ad Preferences Playlist Creation Recommended Playlists

More information

Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier

Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier Data Mining: Concepts and Techniques Jiawei Han Micheline Kamber Simon Fräser University К MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF Elsevier Contents Foreword Preface xix vii Chapter I Introduction I I.

More information

Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

More information

Brown University Department of Economics Spring 2015 ECON 1620-S01 Introduction to Econometrics Course Syllabus

Brown University Department of Economics Spring 2015 ECON 1620-S01 Introduction to Econometrics Course Syllabus Brown University Department of Economics Spring 2015 ECON 1620-S01 Introduction to Econometrics Course Syllabus Course Instructor: Dimitra Politi Office hour: Mondays 1-2pm (and by appointment) Office

More information

How To Get A Master'S Degree In Mathematics In Norway

How To Get A Master'S Degree In Mathematics In Norway Program description for the Master s Degree Program in Mathematics Name English: Master s Degree program in Mathematics Norwegian, bokmål: Masterprogram i matematikk Norwegian, nynorsk: Masterprogram i

More information

Java/Scala Engineer Internet of Iot Competitors

Java/Scala Engineer Internet of Iot Competitors JOB 1 Sr. Java/Scala Engineer Internet of Things, IoT, is a true digital revolution. Predictions of 20, 50 or 100 billion connected devices in 2020 are pointing to massive changes for people and industries.

More information

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level? Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level? Dr. Frank Lee Chair, ECE/CS/IT New York Institute of Technology Old Westbury, NY 11568 Topics This talk describes:

More information

Five High Order Thinking Skills

Five High Order Thinking Skills Five High Order Introduction The high technology like computers and calculators has profoundly changed the world of mathematics education. It is not only what aspects of mathematics are essential for learning,

More information

Second International Workshop on Preservation of Evolving Big Data - Panel on Big Data Quality

Second International Workshop on Preservation of Evolving Big Data - Panel on Big Data Quality Second International Workshop on Preservation of Evolving Big Data - Panel on Big Data Quality Angela Bonifati University of Lyon 1 Liris CNRS, France March 15, 2016 Angela Bonifati 2nd Diachron Workshop

More information

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini NEW YORK UNIVERSITY ROBERT F. WAGNER GRADUATE SCHOOL OF PUBLIC SERVICE Course Syllabus Spring 2016 Statistical Methods for Public, Nonprofit, and Health Management Section Format Day Begin End Building

More information

Preview. What is a correlation? Las Cucarachas. Equal Footing. z Distributions 2/12/2013. Correlation

Preview. What is a correlation? Las Cucarachas. Equal Footing. z Distributions 2/12/2013. Correlation Preview Correlation Experimental Psychology Arlo Clark-Foos Correlation Characteristics of Relationship to Cause-Effect Partial Correlations Standardized Scores (z scores) Psychometrics Reliability Validity

More information

Exchange Report Linkoping Sweden

Exchange Report Linkoping Sweden Exchange Report Linkoping Sweden Fall semester 2014-2015 Marin Teinsma 20-02-2015 The report is in two parts: a general part (A) and a specific report for each subject (B). A General report 1. Host institution

More information

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

More information

Cosmological Arguments for the Existence of God S. Clarke

Cosmological Arguments for the Existence of God S. Clarke Cosmological Arguments for the Existence of God S. Clarke [Modified Fall 2009] 1. Large class of arguments. Sometimes they get very complex, as in Clarke s argument, but the basic idea is simple. Lets

More information

I GENERAL INFORMATION ABOUT THE SCHOOL

I GENERAL INFORMATION ABOUT THE SCHOOL EXPERIENCE REPORT E-mail: bzeamfeijen@gmail.com Study Program: Organization Studies (OS, Organisatiewetenschappen) Exchange semester: Fall Academic year: 2014/2015 Host University: University of Pécs Country:

More information

INFS5991 BUSINESS INTELLIGENCE METHODS

INFS5991 BUSINESS INTELLIGENCE METHODS Australian School of Business School of Information Systems, Technology and Management INFS5991 BUSINESS INTELLIGENCE METHODS Course Outline Semester 1, 2014 Part A: Course-Specific Information Please

More information

CAs and Turing Machines. The Basis for Universal Computation

CAs and Turing Machines. The Basis for Universal Computation CAs and Turing Machines The Basis for Universal Computation What We Mean By Universal When we claim universal computation we mean that the CA is capable of calculating anything that could possibly be calculated*.

More information

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business Instructor: Kunpeng Zhang (kzhang@rmsmith.umd.edu) Lecture-Discussions:

More information

PS 271B: Quantitative Methods II. Lecture Notes

PS 271B: Quantitative Methods II. Lecture Notes PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.

More information

Unsupervised Data Mining (Clustering)

Unsupervised Data Mining (Clustering) Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

Business Intelligence

Business Intelligence Business Intelligence Data Mining and Data Warehousing Dominik Ślęzak slezak@infobright.com www.infobright.com Research Interests Data Warehouses, Knowledge Discovery, Rough Sets Machine Intelligence,

More information

Top Ten Mistakes in the FCE Writing Paper (And How to Avoid Them) By Neil Harris

Top Ten Mistakes in the FCE Writing Paper (And How to Avoid Them) By Neil Harris Top Ten Mistakes in the FCE Writing Paper (And How to Avoid Them) By Neil Harris Top Ten Mistakes in the FCE Writing Paper (And How to Avoid Them) If you re reading this article, you re probably taking

More information

Network mining for crime/fraud detection. FuturICT CrimEx January 26th, 2012 Jan Ramon

Network mining for crime/fraud detection. FuturICT CrimEx January 26th, 2012 Jan Ramon Network mining for crime/fraud detection FuturICT CrimEx January 26th, 2012 Jan Ramon Overview Administrative data and crime/fraud Data mining and related domains Data mining in large networks Opportunities

More information

Northwestern University BUS_INST 239 Marketing Management Fall 2013. Department of Psychology University Hall, Room 102 Swift Hall (2029 Sheridan Rd.

Northwestern University BUS_INST 239 Marketing Management Fall 2013. Department of Psychology University Hall, Room 102 Swift Hall (2029 Sheridan Rd. BIP 239 Syllabus, pg.1 Northwestern University BUS_INST 239 Marketing Management Fall 2013 Professor Ginger Pennington Class meets: Tues/Th 11am-12:20pm Department of Psychology University Hall, Room 102

More information

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs 1 Big Data Analytics Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs Montevideo, 22 nd November 4 th December, 2015 INFORMATIQUE

More information

Agreement on Dual Degree Master Program in Computer Science. Politechnika Warszawska. Technische Universität Berlin

Agreement on Dual Degree Master Program in Computer Science. Politechnika Warszawska. Technische Universität Berlin Agreement on Dual Degree Master Program in Computer Science between Politechnika Warszawska Faculty of Electronics and Information Technology and Technische Universität Berlin School of Electrical Engineering

More information

Introduction to Marketing

Introduction to Marketing Introduction to Marketing Theocharis Katranis Spring Semester 2013 1 Today s Lecture 1. We will explain the importance of information in gaining insights about the marketplace and customers. 2. We will

More information

A Few Basics of Probability

A Few Basics of Probability A Few Basics of Probability Philosophy 57 Spring, 2004 1 Introduction This handout distinguishes between inductive and deductive logic, and then introduces probability, a concept essential to the study

More information

JA003 Výběrová angličtina pro přírodovědce

JA003 Výběrová angličtina pro přírodovědce JA3 Výběrová angličtina pro přírodovědce Anketa podzim 214 Otázka Název otázky Možnosti Odpovědi 1 2 4 Balance of language and science 3 5 4 4 Balance of language and science From my point of view, balance

More information

Statistics 3202 Introduction to Statistical Inference for Data Analytics 4-semester-hour course

Statistics 3202 Introduction to Statistical Inference for Data Analytics 4-semester-hour course Statistics 3202 Introduction to Statistical Inference for Data Analytics 4-semester-hour course Prerequisite: Stat 3201 (Introduction to Probability for Data Analytics) Exclusions: Class distribution:

More information

Multiple Regression: What Is It?

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline

More information

Curriculum for the basic subject at master s level in. IT and Cognition, the 2013 curriculum. Adjusted 2014

Curriculum for the basic subject at master s level in. IT and Cognition, the 2013 curriculum. Adjusted 2014 D E T H U M A N I S T I S K E F A K U L T E T K Ø B E N H A V N S U N I V E R S I T E T Curriculum for the basic subject at master s level in IT and Cognition, the 2013 curriculum Adjusted 2014 Department

More information

How does the problem of relativity relate to Thomas Kuhn s concept of paradigm?

How does the problem of relativity relate to Thomas Kuhn s concept of paradigm? How does the problem of relativity relate to Thomas Kuhn s concept of paradigm? Eli Bjørhusdal After having published The Structure of Scientific Revolutions in 1962, Kuhn was much criticised for the use

More information

How to get started on research in economics? Steve Pischke June 2012

How to get started on research in economics? Steve Pischke June 2012 How to get started on research in economics? Steve Pischke June 2012 Also see: http://econ.lse.ac.uk/staff/spischke/phds/ Research is hard! It is hard for everyone, even the best researchers. There is

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

Strategies for Winning at Math. Student Success Workshop

Strategies for Winning at Math. Student Success Workshop Strategies for Winning at Math Student Success Workshop Just the Facts Poor performance in math is NOT due to a lack of intelligence. The key to success in math is having the right approach to studying

More information

B669 Sublinear Algorithms for Big Data

B669 Sublinear Algorithms for Big Data B669 Sublinear Algorithms for Big Data Qin Zhang 1-1 Now about the Big Data Big data is everywhere : over 2.5 petabytes of sales transactions : an index of over 19 billion web pages : over 40 billion of

More information

Preface to the Second Edition

Preface to the Second Edition Preface to the Second Edition The second edition of this book contains both basic and more advanced material on non-life insurance mathematics. Parts I and II of the book cover the basic course of the

More information

MASTERING MULTIPLE CHOICE

MASTERING MULTIPLE CHOICE MASTERING MULTIPLE CHOICE The definitive guide to better grades on multiple choice exams Stephen Merritt Mastering Multiple Choice The Definitive Guide to Better Grades on Multiple Choice Exams Stephen

More information

How Theses Get Written: Some Hot Tips

How Theses Get Written: Some Hot Tips How Theses Get Written: Some Hot Tips Dr Steve Easterbrook NASA/WVU Software Research Lab 1 Outline Part 1: Writing your thesis (1) Context: What is a thesis (for)? (2) How Do I Get Started? (3) What Should

More information

Introductory Microeconomics

Introductory Microeconomics Introductory Microeconomics January 7 lecture Economics Definition: The social science concerned with the efficient use of scarce resources to achieve the maximum satisfaction of economic wants. Efficient:

More information

Assessing Data Mining: The State of the Practice

Assessing Data Mining: The State of the Practice Assessing Data Mining: The State of the Practice 2003 Herbert A. Edelstein Two Crows Corporation 10500 Falls Road Potomac, Maryland 20854 www.twocrows.com (301) 983-3555 Objectives Separate myth from reality

More information

Roadmap for Ph.D. Students Aiming for a Successful Career in Science

Roadmap for Ph.D. Students Aiming for a Successful Career in Science Roadmap for Ph.D. Students Aiming for a Successful Career in Science Do you really want to get a Ph.D.? Do you have what it takes to get a Ph.D.? How can you get the most out of joining a Ph.D. program?

More information

可 视 化 与 可 视 计 算 概 论. Introduction to Visualization and Visual Computing 袁 晓 如 北 京 大 学 2015.12.23

可 视 化 与 可 视 计 算 概 论. Introduction to Visualization and Visual Computing 袁 晓 如 北 京 大 学 2015.12.23 可 视 化 与 可 视 计 算 概 论 Introduction to Visualization and Visual Computing 袁 晓 如 北 京 大 学 2015.12.23 2 Visual Analytics Adapted from Jim Thomas s slides 3 Visual Analytics Definition Visual Analytics is the

More information

Academic Regulations for MBA Master of Business Administration

Academic Regulations for MBA Master of Business Administration Academic Regulations for MBA Master of Business Administration September 2014 Academic Regulations for MBA 2014 2 Contents ACADEMIC REGULATATIONS FOR MASTER OF BUSINESS ADMINISTRATION MBA... 3 General...

More information

15.034 Metrics for Managers: Big Data and Better Answers. Fall 2014. Course Syllabus DRAFT. Faculty: Professor Joseph Doyle E62-516 jjdoyle@mit.

15.034 Metrics for Managers: Big Data and Better Answers. Fall 2014. Course Syllabus DRAFT. Faculty: Professor Joseph Doyle E62-516 jjdoyle@mit. 15.034 Metrics for Managers: Big Data and Better Answers Fall 2014 Course Syllabus DRAFT Faculty: Professor Joseph Doyle E62-516 jjdoyle@mit.edu Professor Roberto Rigobon E62-515 rigobon@mit.edu Professor

More information

Preliminary Syllabus for the course of Data Science for Business Analytics

Preliminary Syllabus for the course of Data Science for Business Analytics Preliminary Syllabus for the course of Data Science for Business Analytics Miguel Godinho de Matos 1,2 and Pedro Ferreira 2,3 1 Cato_lica-Lisbon, School of Business and Economics 2 Heinz College, Carnegie

More information

Big data challenges for physics in the next decades

Big data challenges for physics in the next decades Big data challenges for physics in the next decades David W. Hogg Center for Cosmology and Particle Physics, New York University 2012 November 09 punchlines Huge data sets create new opportunities. they

More information

Large-Scale Test Mining

Large-Scale Test Mining Large-Scale Test Mining SIAM Conference on Data Mining Text Mining 2010 Alan Ratner Northrop Grumman Information Systems NORTHROP GRUMMAN PRIVATE / PROPRIETARY LEVEL I Aim Identify topic and language/script/coding

More information

Is a Traditional Drawing Exercise for Plant and Seed Identification Still Effective for Millennial Students?

Is a Traditional Drawing Exercise for Plant and Seed Identification Still Effective for Millennial Students? Is a Traditional Drawing Exercise for Plant and Seed Identification Still Effective for Millennial Students? Marshall Hay and Kevin Donnelly Kansas State University Background History of drawing in education

More information

COURSE PROFILE. Local Credits. Theory+PS+Lab (hour/week) ECTS. Course Name Code Semester Term. Accounting Information Systems MAN552T I I 3 3 6

COURSE PROFILE. Local Credits. Theory+PS+Lab (hour/week) ECTS. Course Name Code Semester Term. Accounting Information Systems MAN552T I I 3 3 6 COURSE PROFILE Course Name Code Semester Term Accounting Information Systems Theory+PS+Lab (hour/week) Local Credits ECTS MAN552T I I 3 3 6 Prerequisites -- Course Language Course Type Turkish Mandatory

More information

Academic presentations

Academic presentations ST810 March 17, 2008 Outline Types of talks Organization Preparing slides Presentation tips Taking questions Types of talks: Conference presentation Usually 15-20 minutes for contributed talks. Maybe time

More information

ICT Perspectives on Big Data: Well Sorted Materials

ICT Perspectives on Big Data: Well Sorted Materials ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in

More information

COURSE PROFILE. The objective of this course is to introduce essential concepts, methods, strategies, and processes that are used in sales management.

COURSE PROFILE. The objective of this course is to introduce essential concepts, methods, strategies, and processes that are used in sales management. COURSE PROFILE Course Name Code Semester Term Theory+PS+Lab (hour/week) Local Credits ECTS Personal Selling and Sales Management MAN 526 2 2 3 3 7 Prerequisites - Course Language Course Type English Required

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

Advanced Statistics & Data Analysis

Advanced Statistics & Data Analysis Advanced Statistics & Data Analysis Instructor: Matthew, Ph.D. Office: Kinard 120 Email: hayesm@winthrop.edu (the best way to reach me) Office Phone: 803-323-2628 Office Hours: Office Hours: T 3:00-4:45;

More information

QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209

QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209 QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209 Rajesh Srivastava, Ph.D. Professor and Chair, Department of Information Systems and Operations Management Lutgert

More information

MIS 484-4 Big Data Information Systems

MIS 484-4 Big Data Information Systems MIS 484-4 Big Data Information Systems Chetan (Chet) Kumar, PhD Associate Professor of Information Systems California State University San Marcos ckumar@csusm.edu COURSE DESCRIPTION The aim of this course

More information

Valentine's Tradition By Kelly Hashway

Valentine's Tradition By Kelly Hashway Name: Amerie stared at the giant cardboard heart at the front of the classroom. Mr. Mickelson had a new project in mind, and from the looks of it, it had to do with Valentine s Day. Can someone tell me

More information

QMB 3302 - Business Analytics CRN 80700 - Fall 2015 T & R 9.30 to 10.45 AM -- Lutgert Hall 2209

QMB 3302 - Business Analytics CRN 80700 - Fall 2015 T & R 9.30 to 10.45 AM -- Lutgert Hall 2209 QMB 3302 - Business Analytics CRN 80700 - Fall 2015 T & R 9.30 to 10.45 AM -- Lutgert Hall 2209 Elias T. Kirche, Ph.D. Associate Professor Department of Information Systems and Operations Management Lutgert

More information

Introduction to data mining

Introduction to data mining Introduction to data mining Ryan Tibshirani Data Mining: 36-462/36-662 January 15 2013 1 Logistics Course website (syllabus, lectures slides, homeworks, etc.): http://www.stat.cmu.edu/~ryantibs/datamining

More information

How to Hand Exams Back to Your Class

How to Hand Exams Back to Your Class How to Hand Exams Back to Your Class by Howard E. Aldrich Kenan Professor of Sociology, Chair of the Management & Society Curriculum, and Adjunct Professor of Business (Kenan-Flagler School of Business)

More information

CREATING A GREAT BANNER AD

CREATING A GREAT BANNER AD CREATING A GREAT BANNER AD We ve put together a guide on banner creation and optimization. This guide will cover how to create a great banner ad for your media buying campaigns. By doing this you can create

More information

CS&SS / STAT 566 CAUSAL MODELING

CS&SS / STAT 566 CAUSAL MODELING CS&SS / STAT 566 CAUSAL MODELING Instructor: Thomas Richardson E-mail: thomasr@u.washington.edu Lecture: MWF 2:30-3:20 SIG 227; WINTER 2016 Office hour: W 3.30-4.20 in Padelford (PDL) B-313 (or by appointment)

More information