Statistics Meets Big Data 統計遇見大數據




Stat3980: Statistics in Banking and Finance
Statistics Meets Big Data 統計遇見大數據
Dr. Aijun Zhang
Spring 2016 @ HKBU

Overview
Course Title: STAT3980/MATH4875 Selected Topics in Statistics: Statistics in Banking and Finance (銀行與金融中的統計應用)
Course Objective: This course aims to provide senior students with statistical methods and applications in banking and finance. Real case studies will be discussed. R/Spark/Python programming techniques will be introduced so that students can get some hands-on experience with data analytics.
Class Schedule: Every Monday, 8:30-11:50am (Early Bird Gets The Worm!)

Assessment
1. Continuous Assessment (30%): In-class assignments (about 3 times) to help practice the basic concepts.
2. Mini-project (30%): Group project (2 to 3 students per group) during the 2nd half of the course. You are expected to work independently on real datasets. Each group will deliver a written report with an oral presentation.
3. Final Examination (40%): Final examination to assess how far you have achieved the intended learning outcomes, especially in the knowledge domain. You are expected to have a thorough understanding of some important statistical methods and machine learning techniques in banking and finance.

Course Outline
Part I: Statistics Meets Big Data
A. Statistics as Data Science
B. Explorative Data Analysis
C. Basic Statistical Models
D. Machine Learning
E. Distributed Computing
Part II: Banking and Finance Applications
A. Quantitative Risk Management
B. Credit Scoring
C. Credit Risk Modeling
D. Rise of Model Risk Management
E. Other Miscellaneous Topics

Reference Texts
Download Free Copy from Gareth's website
HKBU Library Online Access (3rd edition)

Part I: Statistics Meets Big Data
A. Statistics as Data Science
B. Explorative Data Analysis
C. Basic Statistical Models
D. Machine Learning
E. Distributed Computing

What is Statistics?
Statistics is the science of learning from data, and of the collection, organization, analysis, interpretation, and presentation of data. It also includes the planning of data collection in terms of the design of surveys and experiments. (See Wikipedia.)

A Brief History
Unlike mathematics, which has a long history, statistics is said to have started around 1749. The term "statistics" originally designated the systematic collection of demographic and economic data by states. Later it broadened to cover the collection, summary, and analysis of data. Today, statistics is widely employed in government, business and all the sciences. Statistics is going to show more of its power as it meets big data.

Keywords in Statistics

What do statisticians do?
Job Types (what my stats friends are doing):
Financial Analyst/Quant/Programmer on Wall Street, in banks, hedge funds, etc.
Data Analyst/Statistician/Scientist at Google, Yahoo!, LinkedIn
Consultant/Data Specialist/Analyst at McKinsey, IBM
Academic roles in universities and research institutes
Job Market:
NYT 2009 article: For Today's Graduate, Just One Word: Statistics
"I keep saying that the sexy job in the next 10 years will be statisticians," said Hal Varian, chief economist at Google. "And I'm not kidding."
McKinsey 2011 Report: Big data: The next frontier for innovation, competition, and productivity
The United States needs 140,000 to 190,000 more workers with deep analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.

Best Jobs by CareerCast.com
Rank | 2011 | 2012 | 2013 | 2014 | 2015
1 | Software Engineer | Software Engineer | Actuary | Mathematician | Actuary
2 | Mathematician | Actuary | Biomedical Engineer | University Professor | Audiologist
3 | Actuary | HR Manager | Software Engineer | Statistician | Mathematician
4 | Statistician | Dental Hygienist | Audiologist | Actuary | Statistician
5 | Comp. Systems Analyst | Financial Planner | Financial Planner | Audiologist | Biomedical Engineer
6 | Meteorologist | Audiologist | Dental Hygienist | Dental Hygienist | Data Scientist
7 | Biologist | Occupational Therapist | Occupational Therapist | Software Engineer | Dental Hygienist
8 | Historian | Online Ads Manager | Optometrist | Comp. Systems Analyst | Software Engineer
9 | Audiologist | Comp. Systems Analyst | Physical Therapist | Occupational Therapist | Occupational Therapist
10 | Dental Hygienist | Mathematician | Comp. Systems Analyst | Speech Pathologist | Comp. Systems Analyst
Ranked outside the top 10: Statistician (18) in 2012; Mathematician (18) and Statistician (20) in 2013.

Top 10 reasons to be a statistician
1. Statisticians are significant.
2. Estimating parameters is easier than dealing with real life.
3. I always wanted to learn the entire Greek alphabet.
4. The probability a statistics major will get a job is > .9999.
5. If I flunk out I can always transfer to Engineering.
6. We do it with confidence, frequency, and variability.
7. You never have to be right, only close.
8. We're normal and everyone else is skewed.
9. The regression line looks better than the unemployment line.
10. No one knows what we do so we are always right.

More Statistical Jokes
There are three kinds of lies: lies, damned lies, and statistics.
Statistics are like a bikini. What they reveal is suggestive, but what they conceal is vital.
I asked a statistician for her phone number... and she gave me an estimate.
Three statisticians went out hunting, and came across a large deer. The first statistician fired, but missed, by a meter to the left. The second statistician fired, but also missed, by a meter to the right. The third statistician didn't fire, but shouted in triumph, "On the average we got it!"
See here for Dr. Ramseyer's extensive collection of statistical jokes.

Statistics vs. Probability
The two topics used to be studied together; however, statistics and probability are two separate disciplines:
Probability deals with predicting the likelihood of future events. Statistics deals with the analysis of the frequency of past events.
Probability is primarily a theoretical branch of mathematics, which studies the consequences of mathematical definitions. Statistics has evolved into an independent science, which tries to make sense of observations in the real world.
See Wikipedia for a list of probability topics. See Wikipedia for a list of statistics topics.

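Statistics vs. Probability
In plain terms: probability starts from a known distribution of black and white stones in a bucket and asks what you will find in your hand when you grab some (for example, how likely you are to draw a white stone or a black stone). Statistics works the other way round: from what has turned up in your hand over many draws, it infers the distribution of black and white stones in the bucket.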
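
The bucket analogy can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 30/70 bucket composition, the number of draws and the fixed seed are all assumptions made for the example and are not part of the course material. The first function answers the probability question (composition known, chance of a white stone wanted); the third answers the statistics question (only the draws are seen, composition estimated from them).

```python
import random


def prob_white(n_white, n_black):
    """Probability view: the bucket's composition is known,
    so the chance of drawing a white stone follows directly."""
    return n_white / (n_white + n_black)


def simulate_draws(n_white, n_black, n_draws, seed=0):
    """Draw stones one at a time with replacement from the bucket.
    The seed is fixed only to make the illustration repeatable."""
    rng = random.Random(seed)
    bucket = ["white"] * n_white + ["black"] * n_black
    return [rng.choice(bucket) for _ in range(n_draws)]


def estimate_white_fraction(draws):
    """Statistics view: only the draws are observed, and the unknown
    fraction of white stones is estimated by the sample proportion."""
    return sum(1 for stone in draws if stone == "white") / len(draws)


# Hypothetical bucket: 30 white and 70 black stones.
p = prob_white(30, 70)
draws = simulate_draws(30, 70, n_draws=10_000)
p_hat = estimate_white_fraction(draws)
print(f"known probability of white: {p:.3f}")
print(f"estimated from {len(draws)} draws: {p_hat:.3f}")
```

With many draws the estimate should land close to the true fraction, which is the sense in which statistics "reads back" the contents of the bucket from what was observed.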

Statistics as Data Science
Google Trends: data mining, data science, machine learning, big data (search terms in comparison)
Statistics is the science of dealing with data, learning from data, and extracting meaning from data.
Data science is more demanding. It lies at the intersection of statistics/mathematics, hacking skills and substantive expertise; see Drew Conway's Venn diagram for a detailed explanation.

Statistical applications in diverse fields
Statistics is used wherever data exist. Its fields of application are many and very diverse.
John Tukey (1915-2000): "The best thing about being a statistician is that you get to play in everyone's backyard."
Long list of fields of application of statistics: Actuarial science, Agriculture, Bioinformatics, Biostatistics, Business Intelligence, Chemometrics, Clinical Trial, Communication Study, Econometrics, Engineering, Environmetrics, Finance, Genetics, Geostatistics, Hedge Fund, Information Technology, Insurance, Management Science, Manufacturing, Marketing, Medical Statistics, Pharmaceutics, Physics, Politics, Process Control, Psychometrics, Public Health, Quality and Productivity, Reliability, Risk Management, Six Sigma, Social Science, Sports, WWW, ...