Oracle's In-Database Statistical Functions

1 Oracle's In-Database Statistical Functions
Charlie Berger, Sr. Director Product Management, Data Mining Technologies, Oracle Corporation
[Graphic: Oracle 11g DB with Data Warehousing, ETL, OLAP, Statistics, and Data Mining]

2 Synopsis
- Oracle has delivered on a multi-year strategy to transform the database from a data repository into an analytical database by bringing the "analytics" (data mining, text mining, and statistical functions) to the data.
- This new analytical database, integrated with Oracle Business Intelligence EE, opens new doors for better BI:
  - Why did something happen? What corrective actions should be taken?
  - Which factors are influencing your business's key performance indicators?
  - Which things should I target?
  - What will happen in the future, and where should you focus limited resources?
- Overview of the SQL statistical capabilities embedded in Oracle Database
- "Repeat what I was shown" hands-on session

3 Agenda
- Introduction
- Oracle's in-database statistical functions
- Several simple demonstrations
- Opportunities for use cases
- Hands-on exercises
- User stories A, B, C

4 Market Trends: Analytics Provide Competitive Value
Competing on Analytics, by Tom Davenport
  "Some companies have built their very businesses on their ability to collect, analyze, and act on data."
  "Although numerous organizations are embracing analytics, only a handful have achieved this level of proficiency. But analytics competitors are the leaders in their varied fields: consumer products, finance, retail, and travel and entertainment among them."
  "Organizations are moving beyond query and reporting" (IDC 2006)
Super Crunchers, by Ian Ayres
  "In the past, one could get by on intuition and experience. Times have changed. Today, the name of the game is data." (Steven D. Levitt, author of Freakonomics)
  "Data-mining and statistical analysis have suddenly become cool... Dissecting marketing, politics, and even sports, stuff this complex and important shouldn't be this much fun to read." (Wired)

5 Market Trends: Analytics Save Lives
Super Crunchers, by Ian Ayres
  "In December 2004, [Berwick] brazenly announced a plan to save 100,000 lives over the next year and a half. The 100,000 Lives Campaign challenged hospitals to implement six changes in care to prevent avoidable deaths. He noticed that thousands of ICU patients die each year from infections after a central line catheter is placed in their chests. About half of all intensive care patients have central line catheters, and ICU infections are deadly (carrying mortality rates of up to 20 percent). He then looked to see if there was any statistical evidence of ways to reduce the chance of infection. He found a 2004 article in Critical Care Medicine that showed that systematic hand-washing (combined with a bundle of improved hygienic procedures such as cleaning the patient's skin with an antiseptic called chlorhexidine) could reduce the risk of infection from central-line catheters by more than 90 percent. Berwick estimated that if all hospitals just implemented this one bundle of procedures, they might be able to save as many as 25,000 lives per year."
Source: New York Times, August 23, 2007, "Attack of the Super Crunchers: Adventures in Data Mining," by Melissa Lafsky

6 Competitive Advantage of BI & Analytics
[Chart: competitive advantage plotted against degree of intelligence; levels listed from highest to lowest]
Analytics:
- Optimization: What's the best that can happen?
- Predictive modeling: What will happen next?
- Forecasting/extrapolation: What if these trends continue?
- Statistical analysis: Why is this happening?
Access & reporting:
- Alerts: What actions are needed?
- Query/drill down: Where exactly is the problem?
- Ad hoc reports: How many, how often, where?
- Standard reports: What happened?
Source: Competing on Analytics, by T. Davenport & J. Harris

7 Oracle Data Mining & Statistical Functions

8 Definition: Statistics
"There are three kinds of lies: lies, damned lies, and statistics." [1]
[1] This well-known saying is part of a phrase attributed to Benjamin Disraeli and popularized in the U.S. by Mark Twain.

9 Definition: Statistics Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. It is applicable to a wide variety of academic disciplines, from the physical and social sciences to the humanities. Statistics are also used for making informed decisions and misused for other reasons in all areas of business and government.

10 Definitions: Statistics Statistical methods can be used to summarize or describe a collection of data; this is called descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for randomness and uncertainty in the observations, and then used to draw inferences about the process or population being studied; this is called inferential statistics. Both descriptive and inferential statistics comprise applied statistics.

11 Statistical Concepts

12 Statistics & SQL Analytics
- Ranking functions: RANK, DENSE_RANK, CUME_DIST, PERCENT_RANK, NTILE
- Window aggregate functions (moving and cumulative): AVG, SUM, MIN, MAX, COUNT, VARIANCE, STDDEV, FIRST_VALUE, LAST_VALUE
- LAG/LEAD functions: direct inter-row reference using offsets
- Reporting aggregate functions: SUM, AVG, MIN, MAX, VARIANCE, STDDEV, COUNT, RATIO_TO_REPORT
- Statistical aggregates: correlation, linear regression family, covariance
- Linear regression: fitting of an ordinary-least-squares regression line to a set of number pairs; frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions
- Descriptive statistics: average, standard deviation, variance, min, max, median (via PERCENTILE_CONT), mode, group-by and roll-up
- DBMS_STAT_FUNCS: summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- n sigma values, top/bottom 5 values
- Correlations: Pearson's correlation coefficients; Spearman's and Kendall's (both nonparametric)
- Cross tabs: enhanced with % statistics: chi-squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
- Hypothesis testing: Student t-test, F-test, binomial test, Wilcoxon Signed Ranks test, chi-square, Mann-Whitney test, Kolmogorov-Smirnov test, one-way ANOVA
- Distribution fitting: Kolmogorov-Smirnov test, Anderson-Darling test, chi-squared test; Normal, Uniform, Weibull, Exponential
Note: Statistics and SQL Analytics are included in Oracle Database Standard Edition
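As a rough illustration of how several of these function families combine in one query, the sketch below inlines a handful of made-up monthly revenue figures and applies a ranking function, a moving window average, LAG, and RATIO_TO_REPORT. The `monthly_sales` name, its columns, and all values are hypothetical and do not come from the presentation.

```sql
-- Hypothetical data, inlined via DUAL so the query runs without any tables.
WITH monthly_sales AS (
  SELECT 1 AS month_no, 105 AS revenue FROM dual UNION ALL
  SELECT 2, 118 FROM dual UNION ALL
  SELECT 3, 112 FROM dual UNION ALL
  SELECT 4, 131 FROM dual UNION ALL
  SELECT 5, 127 FROM dual UNION ALL
  SELECT 6, 140 FROM dual
)
SELECT month_no,
       revenue,
       -- Ranking function: 1 = highest revenue
       RANK() OVER (ORDER BY revenue DESC) AS revenue_rank,
       -- Window aggregate: moving average over the current and two prior months
       AVG(revenue) OVER (ORDER BY month_no
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3,
       -- LAG: direct reference to the previous row (NULL for the first month)
       revenue - LAG(revenue) OVER (ORDER BY month_no) AS change_vs_prev,
       -- Reporting aggregate: each month's share of the total
       RATIO_TO_REPORT(revenue) OVER () AS share_of_total
FROM monthly_sales
ORDER BY month_no;
```

Each analytic function carries its own OVER clause, so the rank, the moving average, and the row-to-row difference come out of one query over the same rows, with no self-joins.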

13 Descriptive Statistics: MEDIAN & MODE (SQL)
Median: takes numeric or datetime values and returns the middle value
Mode: returns the most common value
A.  SELECT STATS_MODE(EDUCATION) FROM CD_BUYERS;
B.  SELECT MEDIAN(ANNUAL_INCOME) FROM CD_BUYERS;
C.  SELECT EDUCATION, MEDIAN(ANNUAL_INCOME)
    FROM CD_BUYERS
    GROUP BY EDUCATION;
D.  SELECT EDUCATION, MEDIAN(ANNUAL_INCOME)
    FROM CD_BUYERS
    GROUP BY EDUCATION
    ORDER BY MEDIAN(ANNUAL_INCOME) ASC;
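MEDIAN is the 50th-percentile case of PERCENTILE_CONT, so other quantiles can be requested the same way. A minimal sketch, assuming a small inline data set: the `buyers` name and every value are invented for illustration and are not the CD_BUYERS table from the slide.

```sql
-- Hypothetical buyers, inlined via DUAL; names and values are illustrative only.
WITH buyers AS (
  SELECT 'Bachelors' AS education, 52000 AS annual_income FROM dual UNION ALL
  SELECT 'Bachelors', 61000 FROM dual UNION ALL
  SELECT 'Bachelors', 58000 FROM dual UNION ALL
  SELECT 'Masters',   72000 FROM dual UNION ALL
  SELECT 'Masters',   69000 FROM dual UNION ALL
  SELECT 'Masters',   90000 FROM dual UNION ALL
  SELECT 'Masters',   72000 FROM dual
)
SELECT education,
       MEDIAN(annual_income)                                      AS median_income,
       -- Same value as MEDIAN: the interpolated 50th percentile
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY annual_income) AS pct_50,
       -- Any other quantile uses the same syntax
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY annual_income) AS pct_90,
       -- Most frequent value within each group
       STATS_MODE(annual_income)                                  AS most_common_income
FROM buyers
GROUP BY education
ORDER BY education;
```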

14 DBMS_STAT_FUNCS Package: SUMMARY procedure
The SUMMARY procedure is used to summarize a numerical column (ADM_PULSE); the summary is returned as a record of type SummaryType.
    DECLARE
      v_ownername   varchar2(8);
      v_tablename   varchar2(50);
      v_columnname  varchar2(50);
      v_sigma_value number;
      type n_arr1 is varray(5) of number;
      type num_table1 is table of number;
      s1 dbms_stat_funcs.summarytype;
    BEGIN
      v_ownername   := 'cberger';
      v_tablename   := 'LYMPHOMA';
      v_columnname  := 'ADM_PULSE';
      v_sigma_value := 3;
      dbms_stat_funcs.summary(p_ownername   => v_ownername,
                              p_tablename   => v_tablename,
                              p_columnname  => v_columnname,
                              p_sigma_value => v_sigma_value,
                              s             => s1);
    END;
    /

15 DBMS_STAT_FUNCS Package: SUMMARY procedure
The SUMMARY procedure is used to summarize a numerical column (ADM_PULSE); the summary is returned as a record of type SummaryType. This version prints the fields of the returned record.
    set echo off
    connect CBERGER/CBERGER@ora10gr2
    set serveroutput on
    set echo on
    declare
      s DBMS_STAT_FUNCS.SummaryType;
    begin
      DBMS_STAT_FUNCS.SUMMARY('CBERGER','LYMPHOMA','ADM_PULSE',3,s);
      dbms_output.put_line('summary STATISTICS');
      dbms_output.put_line('count: ' || s.count);
      dbms_output.put_line('min: ' || s.min);
      dbms_output.put_line('max: ' || s.max);
      dbms_output.put_line('range: ' || s.range);
      dbms_output.put_line('mean:' || round(s.mean));
      dbms_output.put_line('mode Count: ' || s.cmode.count);
      dbms_output.put_line('mode: ' || s.cmode(1));
      dbms_output.put_line('variance: ' || round(s.variance));
      dbms_output.put_line('stddev: ' || round(s.stddev));
      dbms_output.put_line('quantile 5 ' || s.quantile_5);
      dbms_output.put_line('quantile 25 ' || s.quantile_25);
      dbms_output.put_line('median ' || s.median);
      dbms_output.put_line('quantile 75 ' || s.quantile_75);
      dbms_output.put_line('quantile 95 ' || s.quantile_95);
      dbms_output.put_line('extreme Count: ' || s.extreme_values.count);
      dbms_output.put_line('extremes: ' || s.extreme_values(1));
      dbms_output.put_line('top 3: ' || s.top_5_values(1) || ',' || s.top_5_values(2) || ',' || s.top_5_values(3));
      dbms_output.put_line('bottom 3:' || s.bottom_5_values(5) || ',' || s.bottom_5_values(4) || ',' || s.bottom_5_values(3));
    end;
    /

16 DBMS_STAT_FUNCS Package: SUMMARY procedure
A subset of the output returned after executing the PL/SQL block illustrates what the SUMMARY procedure reports.

17 Summary Statistics and Histograms
Oracle Data Miner (the GUI for the Oracle Data Mining Option) provides graphical histograms with summary statistics.

18 Hypothesis Testing: Parametric Tests
Parametric tests make some assumptions about the data, typically that the data is normally distributed, among other assumptions.
Oracle 10g parametric hypothesis tests include:
- T-test
- F-test
- One-way ANOVA

19 T-Test
T-tests are used to measure the significance of a difference of means. T-tests include the following:
- One-sample T-test
- Paired-samples T-test
- Independent-samples T-test (pooled variances)
- Independent-samples T-test (unpooled variances)

20 Basic Example
Compare the difference in blood pressure between people who eat meat frequently and those who don't.

21 T-Test Functions: STATS_T_TEST_*
The t-test functions are:
- STATS_T_TEST_ONE: a one-sample t-test
- STATS_T_TEST_PAIRED: a two-sample, paired t-test (also known as a crossed t-test)
- STATS_T_TEST_INDEP: a t-test of two independent groups with the same variance (pooled variances)
- STATS_T_TEST_INDEPU: a t-test of two independent groups with unequal variance (unpooled variances)

22 One-Sample T-Test
Query compares the mean of SURVIVAL_TIME_MO to the assumed value of 35:
    SELECT avg(survival_time_mo) group_mean,
           stats_t_test_one(survival_time_mo, 35, 'STATISTIC') t_observed,
           stats_t_test_one(survival_time_mo, 35) two_sided_p_value
    FROM LYMPHOMA;
Returns the observed t value and its related two-sided significance.

23 Paired Samples T-Test
Query compares the mean of LOGWT for pig weights in Week 3 to Week 8, grouped by DIET:
    SELECT substr(diet,1,1) as diet,
           avg(logwt3) logwt3_mean,
           avg(logwt8) logwt8_mean,
           stats_t_test_paired(logwt3, LOGWT8, 'STATISTIC') t_observed,
           stats_t_test_paired(logwt3, LOGWT8) two_sided_p_value
    FROM CBERGER.PIGLETS3
    GROUP BY ROLLUP(DIET)
    ORDER BY 5 ASC;
Returns the observed t value and its related two-sided significance.

24 Independent Samples T-Test (Pooled Variances)
Query compares the mean of AMOUNT_SOLD between men and women within CUST_INCOME_LEVEL ranges. The gender codes in SH.CUSTOMERS are uppercase ('M', 'F'), so the DECODE literals must be uppercase as well:
    SELECT substr(cust_income_level,1,22) income_level,
           avg(decode(cust_gender,'M',amount_sold,null)) sold_to_men,
           avg(decode(cust_gender,'F',amount_sold,null)) sold_to_women,
           stats_t_test_indep(cust_gender, amount_sold, 'STATISTIC', 'F') t_observed,
           stats_t_test_indep(cust_gender, amount_sold) two_sided_p_value
    FROM sh.customers c, sh.sales s
    WHERE c.cust_id = s.cust_id
    GROUP BY rollup(cust_income_level)
    ORDER BY 1;

25 Independent Samples T-Test (Pooled Variances)

26 F-Test
Query compares the variance in SIZE_TUMOR_MM between males (GENDER = '0') and females (GENDER = '1'):
    SELECT variance(decode(gender,'0', SIZE_TUMOR_MM, null)) var_tumor_men,
           variance(decode(gender,'1', SIZE_TUMOR_MM, null)) var_tumor_women,
           stats_f_test(gender, SIZE_TUMOR_MM, 'STATISTIC', '1') f_statistic,
           stats_f_test(gender, SIZE_TUMOR_MM) two_sided_p_value
    FROM CBERGER.LYMPHOMA;
Returns the observed f value and its two-sided significance.

27 F-Test
Query computes the one-way ANOVA F ratio for SIZE_REDUCTION across TREATMENT_PLAN values, grouped by GENDER:
    SELECT GENDER,
           stats_one_way_anova(treatment_plan, SIZE_REDUCTION, 'F_RATIO') f_ratio,
           stats_one_way_anova(treatment_plan, SIZE_REDUCTION, 'SIG') p_value,
           AVG(SIZE_REDUCTION)
    FROM CBERGER.LYMPHOMA
    GROUP BY GENDER
    ORDER BY GENDER;
Returns the observed F ratio and its significance for each gender.

28 One-Way ANOVA In statistics, analysis of variance (ANOVA, or sometimes A.N.O.V.A.) is a collection of statistical models, and their associated procedures, in which the observed variance is partitioned into components due to different explanatory variables. Example Group A is given vodka, Group B is given gin, and Group C is given a placebo. All groups are then tested with a memory task. A one-way ANOVA can be used to assess the effect of the various treatments (that is, the vodka, gin, and placebo).
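The vodka/gin/placebo example can be sketched directly in SQL with STATS_ONE_WAY_ANOVA. The `memory_test` name and the scores below are invented purely for illustration; only the function and its return-value keywords are Oracle's.

```sql
-- Hypothetical memory-task scores for three treatment groups, inlined via DUAL.
WITH memory_test AS (
  SELECT 'vodka' AS treatment, 11 AS score FROM dual UNION ALL
  SELECT 'vodka',   13 FROM dual UNION ALL
  SELECT 'vodka',   10 FROM dual UNION ALL
  SELECT 'vodka',   12 FROM dual UNION ALL
  SELECT 'gin',     12 FROM dual UNION ALL
  SELECT 'gin',     14 FROM dual UNION ALL
  SELECT 'gin',     11 FROM dual UNION ALL
  SELECT 'gin',     13 FROM dual UNION ALL
  SELECT 'placebo', 17 FROM dual UNION ALL
  SELECT 'placebo', 16 FROM dual UNION ALL
  SELECT 'placebo', 18 FROM dual UNION ALL
  SELECT 'placebo', 15 FROM dual
)
SELECT -- Degrees of freedom between groups and within groups
       STATS_ONE_WAY_ANOVA(treatment, score, 'DF_BETWEEN') AS df_between,
       STATS_ONE_WAY_ANOVA(treatment, score, 'DF_WITHIN')  AS df_within,
       -- Ratio of between-group to within-group mean squares
       STATS_ONE_WAY_ANOVA(treatment, score, 'F_RATIO')    AS f_ratio,
       -- Significance of the observed F ratio
       STATS_ONE_WAY_ANOVA(treatment, score, 'SIG')        AS p_value
FROM memory_test;
```

The first argument is the grouping (independent) variable and the second is the measured value; a small p_value would suggest that at least one treatment mean differs from the others.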

29 One-Way ANOVA
Query compares the average SIZE_REDUCTION within different TREATMENT_PLANs, grouped by LYMPH_TYPE:
    SELECT LYMPH_TYPE,
           stats_one_way_anova(treatment_plan, SIZE_REDUCTION, 'F_RATIO') f_ratio,
           stats_one_way_anova(treatment_plan, SIZE_REDUCTION, 'SIG') p_value
    FROM CBERGER.LYMPHOMA
    GROUP BY LYMPH_TYPE
    ORDER BY 1;
Returns the one-way ANOVA F ratio and significance for each LYMPH_TYPE.

30 Hypothesis Testing (Nonparametric)
Nonparametric tests are used when certain assumptions about the data are questionable. This may include the difference between samples that are not normally distributed. All tests involving ordinal scales (in which data is ranked) are nonparametric.
Nonparametric tests supported in Oracle Database 10g:
- Binomial test
- Wilcoxon Signed Ranks test
- Mann-Whitney test
- Kolmogorov-Smirnov test
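The four tests listed above map to the SQL functions STATS_BINOMIAL_TEST, STATS_WSR_TEST, STATS_MW_TEST, and STATS_KS_TEST. The sketch below calls each one once on a tiny inline data set; the `trial` name, its columns, and all values are hypothetical and chosen only to make the query self-contained.

```sql
-- Hypothetical paired scores for two groups, inlined via DUAL.
WITH trial AS (
  SELECT 'A' AS grp, 12 AS before_score, 15 AS after_score, 'Y' AS passed FROM dual UNION ALL
  SELECT 'A', 14, 13, 'N' FROM dual UNION ALL
  SELECT 'A', 11, 16, 'Y' FROM dual UNION ALL
  SELECT 'A', 13, 17, 'Y' FROM dual UNION ALL
  SELECT 'B', 15, 19, 'Y' FROM dual UNION ALL
  SELECT 'B', 10, 12, 'N' FROM dual UNION ALL
  SELECT 'B', 16, 21, 'Y' FROM dual UNION ALL
  SELECT 'B', 12, 18, 'Y' FROM dual
)
SELECT -- Binomial test: is the share of passed = 'Y' consistent with p = 0.5?
       STATS_BINOMIAL_TEST(passed, 'Y', 0.5, 'TWO_SIDED_PROB')    AS binomial_p,
       -- Wilcoxon Signed Ranks: paired before/after comparison
       STATS_WSR_TEST(before_score, after_score, 'STATISTIC')     AS wsr_z,
       STATS_WSR_TEST(before_score, after_score, 'TWO_SIDED_SIG') AS wsr_p,
       -- Mann-Whitney: compares after_score between the two values of grp
       STATS_MW_TEST(grp, after_score, 'STATISTIC')               AS mw_z,
       STATS_MW_TEST(grp, after_score, 'TWO_SIDED_SIG')           AS mw_p,
       -- Kolmogorov-Smirnov: are the two groups drawn from the same distribution?
       STATS_KS_TEST(grp, after_score, 'STATISTIC')               AS ks_d,
       STATS_KS_TEST(grp, after_score, 'SIG')                     AS ks_p
FROM trial;
```

For the Mann-Whitney and Kolmogorov-Smirnov functions the first argument must take exactly two distinct values, since it splits the rows into the two samples being compared. With only eight rows the p-values are not meaningful; the point is the calling pattern.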

31 Customer Example
"...Our experience suggests that Oracle 10g Statistics and Data Mining features can reduce development effort of analytical systems by an order of magnitude."
Sumeet Muju, Senior Member of Professional Staff, SRA International (SRA supports NIH bioinformatics development projects)

32 Correlation Functions
The CORR_S and CORR_K functions support nonparametric or rank correlation (finding correlations between expressions that are ordinal scaled). Correlation coefficients take on a value ranging from -1 to 1, where:
- 1 indicates a perfect relationship
- -1 indicates a perfect inverse relationship
- 0 indicates no relationship
The following query determines whether there is a correlation between the AGE and WEIGHT of people, using Spearman's correlation:
    select CORR_S(AGE, WEIGHT) coefficient,
           CORR_S(AGE, WEIGHT, 'TWO_SIDED_SIG') p_value,
           substr(treatment_plan, 1, 15) as TREATMENT_PLAN
    from CBERGER.LYMPHOMA
    GROUP BY TREATMENT_PLAN;

33 Cross Tabulations
This query analyzes the strength of the association between TREATMENT_PLAN and GENDER, grouped by LYMPH_TYPE, using a cross tabulation:
    SELECT LYMPH_TYPE,
           stats_crosstab(gender, TREATMENT_PLAN, 'CHISQ_OBS') chi_squared,
           stats_crosstab(gender, TREATMENT_PLAN, 'CHISQ_SIG') p_value,
           stats_crosstab(gender, TREATMENT_PLAN, 'PHI_COEFFICIENT') phi_coefficient
    FROM CBERGER.LYMPHOMA
    GROUP BY LYMPH_TYPE
    ORDER BY 1;
Returns the observed chi-squared value, its significance (p_value), and the phi coefficient for each LYMPH_TYPE.

34 Cross Tabulations
The STATS_CROSSTAB function takes as arguments two expressions (the two variables being analyzed) and a value that determines which statistic to return. These values include the following:
- CHISQ_OBS (observed value of chi-squared)
- CHISQ_SIG (significance of observed chi-squared)
- CHISQ_DF (degrees of freedom for chi-squared)
- PHI_COEFFICIENT (phi coefficient)
- CRAMERS_V (Cramer's V statistic)
- CONT_COEFFICIENT (contingency coefficient)
- COHENS_K (Cohen's kappa)
The function returns the value specified by the third argument (default is CHISQ_SIG).

35 Distribution-Fitting Functions
Distribution-fitting functions in Oracle Database 10g (part of the DBMS_STAT_FUNCS package) include the following:
- NORMAL_DIST_FIT
- UNIFORM_DIST_FIT
- POISSON_DIST_FIT
- WEIBULL_DIST_FIT
- EXPONENTIAL_DIST_FIT
These functions test how well a sample of values fits a particular distribution. An IN parameter of each function specifies which of the tests to use to measure the fit.
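A minimal sketch of a normality check, reusing the CBERGER.LYMPHOMA table and ADM_PULSE column from the SUMMARY slides. The positional argument order (owner, table, column, test type, mean, standard deviation, significance) reflects the documented DBMS_STAT_FUNCS.NORMAL_DIST_FIT procedure as best understood here and should be treated as an assumption to verify against the package reference for your release; the `v_*` variable names are illustrative.

```sql
SET SERVEROUTPUT ON
DECLARE
  v_mean  NUMBER;  -- IN OUT: left NULL here (assumed to be estimated from the data)
  v_stdev NUMBER;  -- IN OUT: left NULL here (assumed to be estimated from the data)
  v_sig   NUMBER;  -- OUT: significance of the goodness-of-fit test
BEGIN
  -- The fourth argument selects the test used to measure the fit.
  DBMS_STAT_FUNCS.NORMAL_DIST_FIT(
    'CBERGER', 'LYMPHOMA', 'ADM_PULSE', 'SHAPIRO_WILKS',
    v_mean, v_stdev, v_sig);

  DBMS_OUTPUT.PUT_LINE('mean:         ' || ROUND(v_mean, 2));
  DBMS_OUTPUT.PUT_LINE('stddev:       ' || ROUND(v_stdev, 2));
  DBMS_OUTPUT.PUT_LINE('significance: ' || v_sig);
END;
/
```

A small significance value would suggest that ADM_PULSE departs from a normal distribution, in which case the nonparametric tests may be the safer choice.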


37 Opportunities for Use Cases
Control charts: set flags on your data, e.g. when a value is above 3 sigma.

38 Opportunities for Use Cases
Construction of a control chart:
1. Calculate means and ranges for each sample
2. Chart
3. Apply out-of-control rules, e.g. outside of 3 sigma
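Steps 1 and 3 above can be sketched in SQL. This is a simplified variant, not the classical X-bar/R construction: it takes the control limits as the grand mean plus or minus three standard deviations of the sample means, rather than deriving them from the average range and tabulated constants. The `measurements` name and all readings are hypothetical.

```sql
-- Hypothetical readings: several measurements per sample, inlined via DUAL.
WITH measurements AS (
  SELECT 1 AS sample_id, 10.0 AS reading FROM dual UNION ALL
  SELECT 1, 10.2 FROM dual UNION ALL
  SELECT 1,  9.9 FROM dual UNION ALL
  SELECT 2, 10.1 FROM dual UNION ALL
  SELECT 2, 10.3 FROM dual UNION ALL
  SELECT 2,  9.8 FROM dual UNION ALL
  SELECT 3, 11.4 FROM dual UNION ALL
  SELECT 3, 11.6 FROM dual UNION ALL
  SELECT 3, 11.1 FROM dual
),
per_sample AS (
  -- Step 1: mean and range for each sample
  SELECT sample_id,
         AVG(reading)                AS sample_mean,
         MAX(reading) - MIN(reading) AS sample_range
  FROM measurements
  GROUP BY sample_id
),
limits AS (
  -- Center line and sigma computed across all sample means
  SELECT sample_id, sample_mean, sample_range,
         AVG(sample_mean)    OVER () AS center_line,
         STDDEV(sample_mean) OVER () AS sigma
  FROM per_sample
)
-- Step 3: flag samples whose mean falls outside the 3-sigma limits
SELECT sample_id, sample_mean, sample_range,
       center_line - 3 * sigma AS lower_limit,
       center_line + 3 * sigma AS upper_limit,
       CASE WHEN ABS(sample_mean - center_line) > 3 * sigma
            THEN 'OUT OF CONTROL' ELSE 'IN CONTROL' END AS status
FROM limits
ORDER BY sample_id;
```

With only a few samples, an outlier inflates sigma enough that nothing is flagged; in practice the limits are usually computed from a longer in-control baseline period and then applied to new samples.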




42 In-Database Statistics Advantages
- Data remains in the database at all times, with appropriate access security control mechanisms (fewer moving parts)
- Straightforward inclusion within interesting and arbitrarily complex queries
- Real-world scalability, available for mission-critical applications
[Graphic: Oracle 10g DB with Data Warehousing, ETL, OLAP, Statistics, and Data Mining]

43 Industry Analysts
Predictive Analytics: Extending the Value of Your Data Warehousing Investment, by Wayne W. Eckerson
  "According to our survey, most organizations plan to significantly increase the analytic processing within a data warehouse database in the next three years, particularly for model building and scoring, which show 88% climbs. The amount of data preparation done in databases will only climb 36% in that time, but it will be done by almost two-thirds of all organizations (60%), double the rate of companies planning to use the database to create or score analytical models."
  "...it's surprising that about one-third of organizations plan to build analytical models in databases within three years. 'We leverage the data warehouse database when possible,' says one analytics manager. He says most analysts download a data sample to their desktop and then upload it to the data warehouse once it's completed. 'Ultimately, however, everything will run in the data warehouse,' the manager says."

44 Analytics: In-Database vs. External Engine
In-database analytics engine (Oracle 11g DB: Data Warehousing, ETL, OLAP, Statistics, Data Mining)
1. Engine: basic statistics (free), data mining, text mining
2. Costs (ODM: $20K per CPU): simplified environment, single server, security
3. IT platform: SQL (standard), Java (standard)
External analytical engine
1. Engine: basic statistics, data mining, text mining (separate: SAS EM for Text), advanced statistics
2. Costs (SAS EM: $150K/5 users): duplicates data, annual renewal fee (AUF) (~45% each year)
3. IT platform: SAS code (proprietary)


46 SAS In-Database Processing 3-Year Road Map
"The goal of the SAS In-Database initiative is to achieve deeper technical integration with database providers, but also blends the best SAS data integration and analytics with the core strengths of databases."
"Like all DBMS client applications, the SAS engine often must load and extract data over a network to and from the DBMS. This presents a series of challenges:
- Network bottlenecks between SAS and the DBMS constrain access to large volumes of data.
- The best practice today is to read data into the SAS environment for processing. For highly repeatable processes, this might not be efficient because it takes time to transfer the data and resources are used to temporarily store in the SAS environment.
- In some cases, the results of the SAS processing must be transferred back to the DBMS for final storage, which further increases the cost.
Addressing this challenge can result in improved resource utilization and enable companies to answer business questions more quickly."
Oracle Data Mining is available today.
Source: SAS In-Database Processing White Paper, October 2007

47 SAS In-Database Processing 3-Year Road Map
"It boils down to this simple equation: Less data movement = faster analytics, and faster analytics = faster delivery of real-time BI throughout an enterprise."
Source: Use SAS to get more power out of your database: Move key components of BI, analytics and data integration processes from the server or desktop to inside the database and help shorten your time to intelligence

48 IDC Worldwide Business Analytics Software Oracle

49 References
1. Back to Basics Understanding and Visualising Variation in Data. Pete Ceuppens, Robert Shaw, Zhiping You. AstraZeneca R&D.
2. QuickStart: Oracle Statistics Release 10gR2. Charlie Berger, Oracle Corporation. April,
3. Oracle Database SQL Reference 10g Release 2 (10.2) Part Number: B December
4. Applied Linear Statistical Models. John Neter, William Wasserman, Michael H. Kutner. IRWIN
5. Mathematical Statistics with Applications. Mendenhall, Scheaffer, Wackerly. Duxbury Press, Boston, MA
6. Oracle Database Data Warehousing Guide 10g Release 2 (10.2) Part Number: B December
7. Oracle Technology Network:
Source: Oracle 10gR2 Statistics Functions, OLSUG08 Workshop, Henri B. Tuthill, AstraZeneca & Charlie Berger, Oracle

50 Hands-on Exercises Quick Start Statistics

51 More Information
- Oracle Data Mining 10g: oracle.com/technology/products/bi/odm/index.html
- Oracle Statistical Functions
- Oracle Business Intelligence Solutions: oracle.com/bi
Contact Information:

52 Questions & Answers

53 This presentation is for informational purposes only and may not be incorporated into a contract or agreement.

Oracle Data Mining In-Database Data Mining Made Easy!

Oracle Data Mining In-Database Data Mining Made Easy! Oracle Data Mining In-Database Data Mining Made Easy! Charlie Berger Sr. Director Product Management, Data Mining and Advanced Analytics Oracle Corporation [email protected] www.twitter.com/charliedatamine

More information

Exadata V2 + Oracle Data Mining 11g Release 2 Importing 3 rd Party (SAS) dm models

Exadata V2 + Oracle Data Mining 11g Release 2 Importing 3 rd Party (SAS) dm models Exadata V2 + Oracle Data Mining 11g Release 2 Importing 3 rd Party (SAS) dm models Charlie Berger Sr. Director Product Management, Data Mining Technologies Oracle Corporation [email protected]

More information

Predictive Analytics for Better Business Intelligence

Predictive Analytics for Better Business Intelligence Oracle 11g DB Data Warehousing ETL OLAP Statistics Predictive Analytics for Better Business Intelligence Data Mining Charlie Berger Sr. Director Product Management, Data Mining Technologies

More information

Statistical Analysis of Gene Expression Data With Oracle & R (- data mining)

Statistical Analysis of Gene Expression Data With Oracle & R (- data mining) Statistical Analysis of Gene Expression Data With Oracle & R (- data mining) Patrick E. Hoffman Sc.D. Senior Principal Analytical Consultant [email protected] Agenda (Oracle & R Analysis) Tools Loading

More information

OLSUG Workshop Oracle Data Mining

OLSUG Workshop Oracle Data Mining OLSUG Workshop Oracle Data Mining Charlie Berger Sr. Director of Product Mgmt, Life Sciences and Data Mining Oracle Corporation [email protected] Dr. Lutz Hamel Asst. Professor, Computer Science

More information

Big Data Analytics with Oracle Advanced Analytics In-Database Option

Big Data Analytics with Oracle Advanced Analytics In-Database Option Big Data Analytics with Oracle Advanced Analytics In-Database Option Charlie Berger Sr. Director Product Management, Data Mining and Advanced Analytics [email protected] www.twitter.com/charliedatamine

More information

The Oracle Data Mining Machine Bundle: Zero to Predictive Analytics in Two Weeks Collaborate 15 IOUG

The Oracle Data Mining Machine Bundle: Zero to Predictive Analytics in Two Weeks Collaborate 15 IOUG The Oracle Data Mining Machine Bundle: Zero to Predictive Analytics in Two Weeks Collaborate 15 IOUG Presentation #730 Tim Vlamis and Dan Vlamis Vlamis Software Solutions 816-781-2880 www.vlamis.com Presentation

More information

SQL - the best analysis language for Big Data!

SQL - the best analysis language for Big Data! SQL - the best analysis language for Big Data! NoCOUG Winter Conference 2014 Hermann Bär, [email protected] Data Warehousing Product Management, Oracle 1 The On-Going Evolution of SQL Introduction

More information

Seamless Access from Oracle Database to Your Big Data

Seamless Access from Oracle Database to Your Big Data Seamless Access from Oracle Database to Your Big Data Brian Macdonald Big Data and Analytics Specialist Oracle Enterprise Architect September 24, 2015 Agenda Hadoop and SQL access methods What is Oracle

More information

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. 1 Copyright 2011, Oracle and/or its affiliates. FPO In-Database Analytics: Predictive Analytics, Data Mining, Exadata & Business Intelligence Charlie Berger Sr. Director Product Management, Data Mining

More information

Blazing BI: the Analytic Options to the Oracle Database. ODTUG Kscope 2013

Blazing BI: the Analytic Options to the Oracle Database. ODTUG Kscope 2013 Blazing BI: the Analytic Options to the Oracle Database ODTUG Kscope 2013 Dan Vlamis Tim Vlamis Vlamis Software Solutions 816-781-2880 http://www.vlamis.com Copyright 2013, Vlamis Software Solutions, Inc.

More information

Semantic and Data Mining Technologies. Simon See, Ph.D.,

Semantic and Data Mining Technologies. Simon See, Ph.D., Semantic and Data Mining Technologies Simon See, Ph.D., Introduction to Semantic Web and Business Use Cases 2 Lots of Scientific Resources NAR 2009 over 1170 databases Reuse, Recycling, Repurposing Paul

More information

Oracle Big Data SQL Architectural Deep Dive

Oracle Big Data SQL Architectural Deep Dive Oracle Big Data SQL Architectural Deep Dive Dan McClary, Ph.D. Big Data Product Management Oracle Safe Harbor Statement The following is intended to outline our general product direction. It is intended

More information

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA

IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

STATISTICAL ANALYSIS WITH EXCEL COURSE OUTLINE

STATISTICAL ANALYSIS WITH EXCEL COURSE OUTLINE STATISTICAL ANALYSIS WITH EXCEL COURSE OUTLINE Perhaps Microsoft has taken pains to hide some of the most powerful tools in Excel. These add-ins tools work on top of Excel, extending its power and abilities

More information

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Charlie Berger, MS Eng, MBA Sr. Director Product Management, Data Mining and Advanced Analytics [email protected] www.twitter.com/charliedatamine

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University DATA ANALYSIS QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University Quantitative Research What is Statistics? Statistics (as a subject) is the science

More information

Reporting Statistics in Psychology

Reporting Statistics in Psychology This document contains general guidelines for the reporting of statistics in psychology research. The details of statistical reporting vary slightly among different areas of science and also among different

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. [email protected]

More information

Analyzing Research Data Using Excel

Analyzing Research Data Using Excel Analyzing Research Data Using Excel Fraser Health Authority, 2012 The Fraser Health Authority ( FH ) authorizes the use, reproduction and/or modification of this publication for purposes other than commercial

More information

SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

More information

Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

More information

The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information
