Data Analytics



Data Analytics
Prof. Dr.-Ing. Lars Linsen, Prof. Dr. Adalbert FX Wilhelm
Fall 2015

0. Organizational Stuff

0.1 Syllabus and Organization

Course website
http://www.faculty.jacobsuniversity.de/llinsen/teaching/??????.htm (will be accessible through CampusNet)

Course description
This course provides an introduction to data analytics concepts and methods. The objective of the course is to present methods for gaining insight from data and drawing conclusions for analytical reasoning and decision making. The course starts off by giving real-world examples. Abstracting from these examples leads towards a taxonomy for data types, their characteristics, and relations. The course comprises methods for the analytics of text or document data, image data, high-dimensional data, time-series data, and geospatial data. Moreover, concepts for the analysis of hierarchical, unilateral, or bilateral relations are taught. Data visualization methods are used for visual data representations, visual encoding, and interaction mechanisms, leading to an interactive visual analytics process. Automatic analysis components such as data transformation, aggregation, classification, clustering, and outlier detection are an integral part of the analytics process.

Lectures
Times:
- Tuesday, 9:45am to 11:00am
- Thursday, 8:15am to 9:30am
Location: ???

Instructors
Lars Linsen (75%)
- Office: Res I, 128
- Phone: 3196
- E-Mail: l.linsen [@jacobs-university.de]
- Office hours: by appointment
Adalbert FX Wilhelm (25%)
- Office: Res IV, 111
- Phone: 3402
- E-Mail: a.wilhelm [@jacobs-university.de]

Lectures

Week      Date      Instructor
Week 2    Sep 8     Linsen
Week 2    Sep 10    Wilhelm
Week 3    Sep 17    Linsen
Week 4    Sep 24    Linsen
Week 5    Oct 1     Linsen
Week 6    Oct 8     Linsen
Week 7    Oct 15    Linsen
Week 8    Oct 22    Wilhelm
Week 9    Oct 29    Wilhelm
Week 10   Nov 5     Wilhelm
Week 11   Nov 12    Linsen
Week 12   Nov 19    Linsen
Week 13   Nov 26    Linsen
Week 14   Dec 3     Linsen

Tuesdays
Thursday lectures end with an assignment in which the taught material needs to be applied to a real-world problem. Students will work in groups on a solution. Group compositions are intended to change over the duration of the course. Students will present their solutions in the Tuesday slots.

Exams
There will be a written final examination. Date of the exam: tbd (around finals week). There will be no quizzes or midterms.

Grading
- Assignments: 60%
- Final exam: 40%

Literature
- Alexandru Telea: Data Visualization: Principles and Practice. Wellesley, Mass.: AK Peters, 1st edition, 2008.
- Matthew Ward, Georges Grinstein, Daniel Keim: Interactive Data Visualization: Foundations, Techniques, and Applications. AK Peters, 1st edition, 2010.

Goal
This course provides an introduction to data analytics concepts and methods. The objective of the course is to present methods for gaining insight from data and drawing conclusions for analytical reasoning and decision making.

Topics
- Introductory examples
- Taxonomy for data types
- Supervised and unsupervised learning
- Visual analytics
- High-dimensional data analytics
- Aggregation, clustering, and classification
- Text and document data analytics
- Image data analytics
- Relations
- Time-series data analytics
- Geospatial data analytics

1. Introductory Examples and Taxonomy

1.1 Examples for the Digital Era

Social media [LinkedIn]

Twitter

Instagram

Some challenges
- bilateral relations
- huge network
- text & document data
- image data
- time-varying data
- geospatial data
- different heterogeneous sources

Tasks
- detect hot topics
- what goes viral?
- detect trends
- detect changes over time
- detect spatio-temporal patterns
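As a rough illustration of the "detect hot topics" task, the sketch below counts hashtags over a handful of posts and reports the most frequent ones. The posts are made-up sample data and `hashtag_counts` is a hypothetical helper, not something from the slides; a real system would also have to handle time windows, language, and scale.

```python
import re
from collections import Counter

# Hypothetical sample posts, invented purely for illustration.
posts = [
    "Great talk on #bigdata and #visualization today",
    "Reading about #bigdata pipelines",
    "New #visualization tool released",
    "#bigdata everywhere",
]

def hashtag_counts(texts):
    """Count case-insensitive hashtag occurrences over all texts."""
    counts = Counter()
    for text in texts:
        # A hashtag is taken to be '#' followed by word characters.
        counts.update(tag.lower() for tag in re.findall(r"#(\w+)", text))
    return counts

# The two most frequent hashtags serve as a crude "hot topic" signal.
top = hashtag_counts(posts).most_common(2)
print(top)
```

Counting raw frequencies is only the simplest possible proxy; detecting trends or changes over time would require comparing such counts across time intervals.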

Movies online

Netflix competition

Some challenges
- massive data: 500k users, 20k movies, 100m ratings
- many factors affect ratings: actors, directors, genres
- high-dimensional data
- data incomplete
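The figures above already show why the data is incomplete. A short back-of-the-envelope calculation, using only the rounded numbers from the slide, compares the number of ratings that exist with the number that could exist if every user rated every movie:

```python
# Rounded figures quoted on the slide.
users = 500_000
movies = 20_000
ratings = 100_000_000

# Size of the full user-by-movie rating table.
possible = users * movies

# Fraction of that table that actually holds a rating.
density = ratings / possible

print(f"possible ratings: {possible:,}")
print(f"filled fraction:  {density:.2%}")
```

With these numbers only about one in a hundred user-movie pairs has a rating, so any prediction method has to work with a very sparse table.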

Tasks
- detect correlations
- understand correlations
- make predictions
(related to many other applications, cf. online selling, e.g., Amazon)

Human genome

Microarrays

Sequencing

Sequencing costs

Genome data

Genome visualization

Personalized therapy
"10 years from now, each cancer patient is going to want to get a genomic analysis of their cancer and will expect customized therapy based on that information." (Director of The Cancer Genome Atlas, Time Magazine, June 13, 2011)

Connectome
Ramon y Cajal, 1905

Connectome workflow

Ultra-thin electron microscopy sections

Automatic reconstruction

Connectome visualization

Crime prevention

Predictive policing [sueddeutsche.de]

Predictive policing using Tableau

Internet of things

Taxi data

1.2 Big data analytics

Big Data
"Between the dawn of civilization and 2003, we only created five exabytes of information; now we're creating that amount every two days." (Eric Schmidt, Google)

What is Big Data?
- Massive data? How many exabytes?
- Everything we cannot inspect manually.
- It's not just about the amount of data; it's also about the complexity of the data.

The big V's of Big Data

The fourth paradigm

Ubiquitous data
"The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - it's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high-school kids, for college kids. Because now we really do have essentially free and ubiquitous data." (Hal Varian, UC Berkeley)

Job growth

Processing pipeline

1.3 Taxonomy

Data example

Taxonomy
Data samples are items with attributes. Attributes (stored in tables) can be
- quantitative: continuous numbers (real) or discrete numbers (integer),
- ordinal (ordered sets, rating), or
- nominal / categorical (unordered sets).

Taxonomy
- Nominal / categorical: support the = relationship (e.g., oranges, apples)
- Ordinal: obey the < relationship (e.g., small < medium < large)
- Quantitative: can do arithmetic on them (e.g., cm, kg)
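The three attribute types differ in which operations are meaningful. The sketch below is one possible way to model this in Python; the `Size` class and the sample values are illustrative assumptions, not part of the course material.

```python
from enum import Enum
from functools import total_ordering
from statistics import mean

# Nominal / categorical: only equality comparison is meaningful.
fruit_a, fruit_b = "orange", "apple"
same_fruit = fruit_a == fruit_b

# Ordinal: values can be ordered, but arithmetic is not defined.
@total_ordering
class Size(Enum):
    SMALL = 1
    MEDIUM = 2
    LARGE = 3

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented

ordered = sorted([Size.LARGE, Size.SMALL, Size.MEDIUM])

try:
    Size.SMALL + Size.MEDIUM  # adding two sizes has no meaning
    arithmetic_allowed = True
except TypeError:
    arithmetic_allowed = False

# Quantitative: arithmetic such as averaging is meaningful.
heights_cm = [170.0, 180.0, 160.0]  # hypothetical measurements
average_height = mean(heights_cm)

print(same_fruit, [s.name for s in ordered], arithmetic_allowed, average_height)
```

Using a plain `Enum` with an explicit `__lt__` rather than `IntEnum` is a deliberate choice here: `IntEnum` members would silently allow addition, which blurs the line between ordinal and quantitative attributes.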

Data dimensions
- Uni-variate
- Bi-variate
- Tri-variate
- Multi-variate

Special attributes
- Geospatial: location (longitude, latitude)
- Time-varying: attributes change values over time (time series)
- Spatio-temporal: geospatial & time-varying

Other aspects

Complex data attributes
- Text / document (cf. Twitter)
- Image (cf. Instagram)

Relations
Data samples may have unilateral relations, bilateral relations, or hierarchical relations.