# Data Science: Will computer science and informatics eat our lunch?

## Transcription

1 Data Science: Will computer science and informatics eat our lunch? Thomas Lumley, University of Auckland. @tslumley, statschat.org.nz, notstatschat.tumblr.com

2 In the 1920s, the computing labs helped establish statistics on the American continent. Without them, even a modest study was beyond the ability of an individual statistician. At the same time, statistics labs often had the most powerful computing machines within their larger institution. They showed how organized computing could benefit science and provided a place for the earliest of computer scientists to test their ideas. -- Grier The origins of statistical computing, Amstat Online

3 Fig. 2. The Hollerith Electric Tabulating System

4 Iowa State Statistical Computing Service

5 CSIRAC

6 Iowa State Statistical Computing Service

7 Iowa State statistics PhD prelim exam. Two eight-hour written-on-paper exams covering: Theory of Probability and Statistics I; Theory of Probability and Statistics II; Statistical Methods I; Statistical Methods II; Advanced Statistical Methods; Advanced Probability Theory; Advanced Theory of Statistical Inference. They do require a stat computing course: 1 credit/30

8 What is data science? and where can we get some?

9 Data Science is just a fancy name for statistics. Fitting simple models to messy and sometimes large data sets. Combination of standard black-box fitting tools and good graphics. Doesn't require any fundamental knowledge our students don't have. Needs good computing skills, which our students can learn.

10 Need to avoid going overboard with computing. Data Wrangling isn't statistics: cleaning, tidying, querying, reformatting, transforming, getting in and out of databases, ...

11 "Data Science is just a fancy name for statistics." "Data Wrangling isn't statistics." If you value self-consistency, you can hold at most one of these opinions. A/Prof Jenny Bryan, UBC (less than one is good)

12 Data science is statistics in the same way that epidemiology is statistics, opinion polling is statistics, ag. field trials are statistics.

13 I did think, however, that many well-known applied statisticians attacked problems without the necessary mathematical knowledge and manipulative skill. Moreover, I believed that a principal cause of failure among medical research scientists was the lack of basic scientific knowledge in their special chosen field. H. O. Lancaster

14 Computing is easier to steal. Define and explain the relevance to applied statistics of: suffix trees; supernodal Cholesky factorization; column-store database; translation look-aside buffer.

15 Computing is easier to steal. Need to teach our data science students: a bit about databases and SQL; a statistical programming language (eg R); abstractions such as tidy data, sparse, map/reduce; reproducible data analysis (eg rmarkdown); collaborative version control (eg git/github). Force students to work with a wild-caught data set. Permit interested students to learn the high-tech data structures and algorithms stuff.
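Two of the items on this slide, tidy data and a bit of SQL, fit in a few lines. The sketch below is only an illustration and is not from the talk: the blood-pressure records, the column names, and the helper functions `to_tidy` and `mean_by_year` are all made up. It reshapes a "wide" table into one row per observation and then aggregates it with SQL in an in-memory SQLite database.

```python
import sqlite3

# Hypothetical "wide" data: one row per patient, one column per year.
wide = [
    {"patient": "a", "sbp_2014": 120, "sbp_2015": 125},
    {"patient": "b", "sbp_2014": 140, "sbp_2015": 135},
]


def to_tidy(rows):
    """Reshape to tidy form: one (patient, year, sbp) row per observation."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key.startswith("sbp_"):
                year = int(key.split("_")[1])
                tidy.append((row["patient"], year, value))
    return tidy


def mean_by_year(tidy_rows):
    """Load tidy rows into an in-memory SQLite table and aggregate with SQL."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE bp (patient TEXT, year INTEGER, sbp REAL)")
    con.executemany("INSERT INTO bp VALUES (?, ?, ?)", tidy_rows)
    result = con.execute(
        "SELECT year, AVG(sbp) FROM bp GROUP BY year ORDER BY year"
    ).fetchall()
    con.close()
    return result


tidy = to_tidy(wide)
print(mean_by_year(tidy))  # prints [(2014, 130.0), (2015, 130.0)]
```

Once the data are in tidy form the SQL is a one-line `GROUP BY`; the reshaping step is where most of the effort goes.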

16 But we don't know this stuff! [Image: "Let me Google that for you"] The computing folks are way better at dissemination than us. Unlike statistics, the computer can tell you if you get it wrong.

17 Free online courses: The Data Scientist's Toolbox; Getting and Cleaning Data; Exploratory Data Analysis; Reproducible Research; Statistical Inference; Regression Models; Developing Data Products; Data Analysis and Statistical Inference. Books: Dynamic Documents with R and knitr (Yihui Xie); Practical Data Science with R (Nina Zumel, John Mount); Doing Data Science: Straight Talk from the Frontline (Cathy O'Neil & Rachel Schutt). People who make their notes available: UBC STAT 545A and 547M, "Data wrangling, exploration, and analysis with R". Software tools: "Open source environment for deep analysis of large complex data"; "The Power of R with Big Data"; Software Carpentry.

18 What do we have to offer? Popularity? Romance? Excitement?

19 Big, Complex, Messy, Badly Sampled, Creepy. Vital to ask the right questions.

20 Big Data. Computer folks are better at this than us, but statistical insights are important, eg: Noel Cressie: fast computation for spatial models; Bill Cleveland: optimising the divide/recombine strategy.
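As a toy illustration of the divide/recombine idea mentioned on this slide (not Cleveland's actual method or software), the sketch below splits a data set into subsets, computes a small summary on each, and recombines the summaries. The function names `summarise` and `recombine` are invented for this example. For a mean and variance the recombination is exact; the statistically harder cases are the ones where it is only approximate and the choice of division matters.

```python
def summarise(chunk):
    """Per-subset summary: count, sum, and sum of squares."""
    return (len(chunk), sum(chunk), sum(v * v for v in chunk))


def recombine(summaries):
    """Combine per-subset summaries into the overall mean and sample variance."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Toy data: the numbers 1..100, divided into four subsets of 25.
data = [float(v) for v in range(1, 101)]
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]

# Each subset could be summarised on a different machine; only the
# three-number summaries need to be brought back together.
mean, variance = recombine([summarise(c) for c in chunks])
print(mean, variance)
```

The sum-of-squares formula is used here for brevity; it can lose precision on real data, where a numerically stable update would be preferable.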

21 Big Data. Computer folks are better at this than us. Big doesn't mean gigabytes.

22 Complex Data. Models for complex data. Summaries (parameters, estimators) that answer the real questions. Robustness of meaning, not just of power and level.

23 Complex Data: networks. $F(x) \propto 1 - x^{-\alpha}$. Power laws: come from network, queue, Matthew effect processes. Blog links; page views; long-tail sales data; citations to papers; word frequencies; earthquake sizes.

24 Complex Data: networks. $F(x) \propto 1 - x^{-\alpha}$. Power laws: come from network, queue, Matthew effect processes. Blog links; page views; long-tail sales data; citations to papers; word frequencies; earthquake sizes. All fit lognormal better, some much better (Clauset et al, SIAM Rev. 2009).
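The power-law versus lognormal comparison can be sketched by fitting both models by maximum likelihood and comparing log-likelihoods. This is a deliberately simplified version and not the procedure of Clauset et al.: it uses simulated data, fits the whole sample rather than just the tail, and takes the smallest observation as the lower cutoff. The function names are invented for the example.

```python
import math
import random


def pareto_loglik(xs):
    """Maximised log-likelihood of a power law with F(x) = 1 - (x/xmin)^(-alpha).

    Simplifying assumption: xmin is taken to be the sample minimum.
    """
    n = len(xs)
    xmin = min(xs)
    alpha = n / sum(math.log(x / xmin) for x in xs)  # MLE of alpha
    sum_log = sum(math.log(x) for x in xs)
    return n * math.log(alpha) + n * alpha * math.log(xmin) - (alpha + 1) * sum_log


def lognormal_loglik(xs):
    """Maximised log-likelihood of a lognormal distribution."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n
    sigma2 = sum((v - mu) ** 2 for v in logs) / n  # MLE of the variance
    return -sum(logs) - 0.5 * n * math.log(2 * math.pi * sigma2) - 0.5 * n


# Simulated data that really are lognormal (an assumption of this demo).
random.seed(1)
xs = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]

print("power law:", round(pareto_loglik(xs), 1))
print("lognormal:", round(lognormal_loglik(xs), 1))
```

Both models have heavy right tails, so a straight-looking log-log plot does not distinguish them; a likelihood comparison at least puts them on the same footing.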

25 Complex Data: networks. Random graph models for connections: Erdős-Rényi graphs; Exponential Random Graph Models (ERGMs): meaningful parameters, nice likelihood. ERGMs are not consistent under sampling. [Shalizi et al, Ann Stat]

26 Complex Data. Robustness of meaning can be hard: suppose a Wilcoxon test shows X > Y, Y > Z. What does this tell us about: means of X and Z? Medians of X and Z? Wilcoxon test of X and Z?
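One way to see why the answer to all three questions can be "nothing" is a set of non-transitive distributions. The Wilcoxon (Mann-Whitney) test is essentially a test of whether P(X > Y) exceeds one half, and that ordering need not be transitive. The sketch below computes the probabilities exactly for three made-up discrete distributions (a standard non-transitive dice construction, not taken from the talk).

```python
from fractions import Fraction
from itertools import product
from statistics import mean, median

# Three hypothetical populations, each uniform on six values.
X = [2, 2, 4, 4, 9, 9]
Y = [1, 1, 6, 6, 8, 8]
Z = [3, 3, 5, 5, 7, 7]


def prob_greater(a, b):
    """Exact P(A > B) for independent draws from a and b (no ties occur here)."""
    wins = sum(1 for u, v in product(a, b) if u > v)
    return Fraction(wins, len(a) * len(b))


print("P(X > Y) =", prob_greater(X, Y))  # 5/9: X tends to beat Y
print("P(Y > Z) =", prob_greater(Y, Z))  # 5/9: Y tends to beat Z
print("P(Z > X) =", prob_greater(Z, X))  # 5/9: yet Z tends to beat X

print("means:  ", mean(X), mean(Y), mean(Z))        # all equal to 5
print("medians:", median(X), median(Y), median(Z))  # 4.0, 6.0, 5.0
```

Here X "beats" Y in the Wilcoxon sense although the means are equal and the median of X is smaller than the median of Y, and the comparison of X with Z goes the opposite way to what transitivity would suggest.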

27 Messy Data. Good applied statisticians know from messy data. [Figure: plot of diastolic blood pressure against age (years)] "and I'm still pretty sure some of the data is missing, but it could still be here, in this ONE HUNDRED SHEET excel file" (a PhD student on Twitter)

28 Badly Sampled. "Whom the Gods Would Destroy, They First Give Real-time Analytics" [Dan McKinley, Etsy]: "This line of thinking is a trap. It's important to divorce the concepts of operational metrics and product analytics. Confusing how we do things with how we decide which things to do is a fatal mistake." Because: non-representativeness of short time slices.

29 Badly Sampled. Statisticians know about: sampling design; weighting; matching; selection models.
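A minimal sketch of the weighting idea, using invented numbers: a sample that over-represents a small stratum gives a biased unweighted mean, and weighting each observation by the inverse of its sampling fraction corrects it. The strata, population sizes, and values below are hypothetical, and `weighted_mean` is a helper written for this example.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * v) / sum(w)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Hypothetical population of 1000: 900 "urban" units and 100 "rural" units.
# The sample takes 10 from each, so rural units are heavily over-sampled.
strata = {
    "urban": {"N": 900, "sample": [10.0] * 10},
    "rural": {"N": 100, "sample": [50.0] * 10},
}

values, weights = [], []
for stratum in strata.values():
    n = len(stratum["sample"])
    for v in stratum["sample"]:
        values.append(v)
        weights.append(stratum["N"] / n)  # inverse sampling fraction

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
print(unweighted, weighted)  # prints 30.0 14.0
```

The population mean implied by these numbers is 0.9 × 10 + 0.1 × 50 = 14, which the weighted estimate recovers and the unweighted one misses badly.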

30 Creepy. What questions should data answer? [Map: income, Mount Eden. Statistics NZ / Chris McDowall] Based on census meshblock: not actual household data.

31 Creepy (and Evil). What questions should data answer? Familiar issues: bioethics; statistical disclosure/confidentiality. New, but statistical, issues: algorithm audit/accountability. We also talk to social scientists more (not enough).

32 Creepy (and Evil). How do we learn more? [Image: "Let me Google that for you"] Cathy O'Neil (mathbabe.org); Ed Felten; danah boyd.

33 Summary. The hard problems in data science are hard. Many of the computational ones are solved (ish). Many of the unsolved ones are closer to statistics.

34 Data Science Will computer science and informatics eat our lunch? Only if we let them, and it would be bad for data science, too
