5 Correlation and Data Exploration
- Russell Stevenson
Correlation

In Unit 3, we ran correlation analyses on data from studies of the acquisition order and acquisition difficulty of English morphemes for both child and adult learners of L2 English. We used a Spearman Rank Order Correlation Test to compare the acquisition orders of different groups of learners and found statistically significant relationships (i.e. p < 0.05). We also used a Pearson Correlation Test to find out whether the morpheme acquisition difficulties were similar across groups of learners. The results were mixed: some showed statistically significant relationships but others did not.

Correlation tests tell us how much two variables vary together. Figure 1 shows scatterplots of pairs of variables with different correlation strengths (r = 0.90, r = 0.50 and r = 0.00), each with a regression line and its 95% confidence interval. (Regression is another statistical technique that is closely related to correlation. We shall look at it later.) The 95% confidence intervals show the range of regression lines that are plausible given the sample. The further apart they are, the less precise our regression line is likely to be.

Figure 1. Scatterplots of variables at different correlation coefficients (r) with 95% confidence intervals

In the scatterplot on the left (Figure 1), there is a very strong relationship (r = 0.90) between the variables. All the points are close to the regression line, and most of them are also in the bottom left and top right quadrants. The 95% confidence interval is also relatively narrow. In the middle scatterplot the relationship is not as strong (r = 0.50). The points are more spread out and further from the regression line, but most are still in the bottom left and top right quadrants. The 95% confidence interval is also wider. In the scatterplot on the right, there is no relationship between the variables (r = 0.00). The points are randomly scattered over the graph, and there are roughly the same number in each of the four quadrants. The regression line cannot be seen because it is now a horizontal line that goes through the mean of y. The 95% confidence interval is also the widest.

Data Exploration

Data exploration means looking at our data in detail so that we can find its characteristics. It is an essential step before carrying out any statistical tests, although researchers in the field of SLA often seem to forget it. The first step is often to calculate the descriptive statistics: the mean, the median, the minimum, the maximum, the range, the standard deviation, the 95% confidence interval of the mean, skewness, kurtosis and the standard error. Graphic techniques are also being used more and more for data exploration: histograms, density plots, box plots, and scatterplots with regression lines and/or smoothed trend (loess or lowess) lines and confidence intervals.

Exploring the critical period hypothesis (DeKeyser, 2000)

In our quick look at correlations (above), one of our assumptions was that the relationship between the two variables is linear (a straight line). However, the Critical Period Hypothesis (CPH) claims that the relationship between Age of Acquisition (AoA) and ultimate attainment is non-linear (see Unit 5, Figure 1). In this section, we will explore the data from one study that claimed to support the CPH and see if it suggests that the relationship is non-linear.
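The linearity assumption can be illustrated with a small sketch. The vectors below are invented for illustration and are not taken from any of the studies discussed; they only show how Pearson's r and Spearman's rho behave when a relationship is a straight line, curved but always rising, or U-shaped.

```r
# Invented illustration data (NOT from DeKeyser, 2000, or the Unit 3 studies)
x <- -10:10
y.linear  <- 2 * x + 3    # a straight line
y.curved  <- exp(x / 3)   # always rising, but not a straight line
y.ushaped <- x^2          # falls and then rises

# Pearson's r measures how well a straight line describes the relationship
r.linear  <- cor(x, y.linear)
r.curved  <- cor(x, y.curved)
r.ushaped <- cor(x, y.ushaped)

# Spearman's rho uses ranks, so it only asks whether y rises whenever x rises
rho.curved <- cor(x, y.curved, method = "spearman")

round(c(r.linear = r.linear, r.curved = r.curved,
        rho.curved = rho.curved, r.ushaped = r.ushaped), 2)
```

Pearson's r is 1 for the straight line and 0 for the U-shaped data, even though y.ushaped is completely determined by x. For the curved data, Spearman's rho is 1 because the ranks agree perfectly, while Pearson's r is below 1. A low r therefore does not by itself mean that two variables are unrelated, which is one reason to plot the data first.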
We will begin by making a scatterplot of the data with a regression line and its 95% confidence interval, and a loess (smoothed trend) line and its 95% confidence interval. The graph will look like Figure 2. To make this graph, you will need to have the ggplot2 package installed in R. (If it is not installed, follow the instructions in Appendix A to install it or, if you cannot install it, follow the instructions in Appendix B to create the graph with the built-in plotting functions.)

Figure 2. Scatterplot of scores on a grammaticality judgement task (GJT) and age of acquisition (AoA) with regression and loess lines and their 95% confidence intervals produced using the ggplot function in the ggplot2 package (data from DeKeyser, 2000)

The data you will need is in a file called dekeyser.txt. This file begins with a header, which contains the names of the variables (AoA, GJT, Status), and below it are three columns of data. The columns (and variable names) are separated by an invisible tab character (or "\t" in R). The first few lines of the file look like this:

"AoA"   "GJT"   "Status"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"

First, you will need to read the data into R and store it in a variable. There are several ways to do this, but the following one is the most similar to other software. The command has several parts. dekeyser is the name of the data frame that you are going to store the data in (you can choose another name if you prefer). read.table() is the function that actually reads the data. file.choose() is another function that opens a file dialogue box, similar to those in other Windows and Mac programs. header = TRUE indicates that the first line of the file is a header and NOT data. The final argument, sep = "\t", indicates that the columns are separated by a tab character. Type the following (without "> ") and choose the file dekeyser.txt.

> dekeyser <- read.table(file.choose(), header = TRUE, sep = "\t")

If you get an error message, just try again. Now let's see what things look like. Type:

> head(dekeyser)
  AoA GJT   Status
1   …   … Under 15
2   …   … Under 15
3   …   … Under 15
4   …   … Under 15
5   …   … Under 15
6   …   … Under 15

You should see the first six lines of the data. The first row (AoA GJT Status) is your header. You can also see that R has added row numbers (1 to 6) at the beginning of each row of data.

Now that the data has been imported, we can start to plot the graph. The first thing to do is to load the ggplot2 package. This is done with the library() function:

> library(ggplot2)

Next, we use the ggplot() function to plot the graph. ggplot() is a bit different from other functions we have used: a plot is made up of parts joined by a + symbol.

> ggplot(data = dekeyser, aes(AoA, GJT)) + geom_point() + geom_smooth(method = "lm") + geom_smooth(colour = "red")

The first part, ggplot(), initialises the plot but does not draw anything. In this example, it has two arguments. data = dekeyser tells ggplot to use the data frame called dekeyser, and aes(AoA, GJT) tells it to use the AoA and GJT variables (notice the order is x-axis, y-axis). Variable names in R are case-sensitive, so they must be typed exactly as they appear in the header. geom_point() plots the points. geom_smooth() draws lines (and their 95% confidence intervals) calculated from the data. geom_smooth(method = "lm") draws a straight regression line, which is specified by the argument method = "lm". geom_smooth(colour = "red") draws a red loess trend line on the graph. Notice that the method does not need to be specified for a loess trend line because it is the default for data sets with fewer than 1,000 observations.

When you have finished, you can unload the package:

> detach(package:ggplot2)

Interpretation

What does the graph we have produced tell us about the data? Is there any evidence for a Critical Period?

Figure 3. Regression and loess lines with 95% confidence intervals of the DeKeyser (2000) data.

I think the most important things we need to look at are the regression line (blue) and the confidence intervals for the loess line. If the regression line goes outside the confidence intervals for the loess line, then there may be evidence for a Critical Period. In this case, we can see that it is outside the confidence intervals from about 20 to 24 years old. This, however, seems to be very late, as the Critical Period is assumed to end at puberty, which is usually thought to be from 13 to 15 years old. In other words, this data does not appear to support the Critical Period Hypothesis. Of course, more sophisticated statistical techniques are needed to show whether this is likely to be true or not, but a visual analysis of the data can also be extremely helpful.

Assignments

Create similar graphs using the files dekeyserisr.txt, dekeyserus.txt and FlegeSimple.txt. Is there any evidence of a Critical Period?
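The visual check described under Interpretation can also be approximated numerically, which may help when comparing the assignment graphs. The sketch below is only an illustration under stated assumptions: the data frame sim is simulated (it is NOT the DeKeyser data), the names sim, age, score, grid and outside are invented here, and the band is the same pointwise 95% interval that Appendix B draws by hand.

```r
# Simulated stand-in data (NOT the DeKeyser data): scores fall steeply
# until age 15 and then level off, plus random noise.
set.seed(1)
age   <- rep(1:40, each = 2)
score <- ifelse(age < 15, 200 - 4 * age, 140) + rnorm(length(age), sd = 5)
sim   <- data.frame(age = age, score = score)

# Fit a straight regression line and a loess trend line to the same data
fit.lm <- lm(score ~ age, data = sim)
fit.lo <- loess(score ~ age, data = sim)

# Evaluate both fits on a grid of ages inside the observed range
grid    <- data.frame(age = seq(min(sim$age), max(sim$age), by = 0.5))
lm.line <- predict(fit.lm, newdata = grid)
lo      <- predict(fit.lo, newdata = grid, se = TRUE)

# Pointwise 95% confidence band for the loess line
half  <- qt(0.975, lo$df) * lo$se.fit
lower <- lo$fit - half
upper <- lo$fit + half

# TRUE wherever the regression line lies outside the loess band
outside <- lm.line < lower | lm.line > upper
grid$age[outside]
```

The last line lists the ages at which the straight line leaves the loess band. Like the graph itself, this is an exploratory aid and not a formal test of non-linearity.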
Appendix A Installing packages in R

This section shows you how to install packages in R. For instructions on installing R, refer to:

The easiest way to install new packages in R is to use the menus. First, click Packages (パッケージ) and select Install package(s) (パッケージのインストール). A list of servers will appear (see below).
Next, select the server from which to download the package. Here, the default server (0-Cloud) has been selected. If you prefer, you may scroll down and select a server in Japan. After you do this, a list of packages appears. Scroll down this list until you find ggplot2. Select it and click OK. The package will be installed automatically.
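The same installation can also be done from the R console instead of the menus. The helper below is a minimal sketch: ensure_package is a hypothetical name introduced here (it is not part of R or of this handout), and the repos value is the address of the 0-Cloud server. install.packages() and requireNamespace() are standard R functions.

```r
# Hypothetical helper (not part of R): install a package only if it is missing.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  # requireNamespace() returns FALSE if the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report (invisibly) whether the package is available now
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Usage (needs an internet connection if ggplot2 is not yet installed):
# ensure_package("ggplot2")
```

Checking first means the command can be run again later without downloading the package a second time.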
Appendix B Plotting the data without the ggplot2 package

Using the built-in functions for plotting data (e.g., plot(), abline() and lines()) is more complicated than using the functions in the ggplot2 package. The steps to produce Figure 2 are explained below.

Figure 4. Scatterplot of scores on a grammaticality judgement task (GJT) and age of acquisition (AoA) with regression and loess lines and their 95% confidence intervals

The data you will need is in a file called dekeyser.txt. This file begins with a header, which contains the names of the variables (AoA, GJT, Status), and below it are three columns of data. The columns (and variable names) are separated by an invisible tab character (or "\t" in R). The first few lines of the file look like this:

"AoA"   "GJT"   "Status"
8       170     "Under 15"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"
…       …       "Under 15"

First, you will need to read the data into R and store it in a variable. There are several ways to do this, but the following one is the most similar to other software. The command has several parts. dekeyser is the name of the variable that you are going to store the data in (you can choose another name if you prefer). read.table() is the function that actually reads the data. file.choose() is another function that opens a file dialogue box, similar to those in other Windows and Mac programs. header = TRUE indicates that the first line of the file is a header and NOT data. The final argument, sep = "\t", indicates that the columns are separated by a tab character. Type the following (without the leading "> ") and choose the file dekeyser.txt.

> dekeyser <- read.table(file.choose(), header = TRUE, sep = "\t")

If you get an error message, just try again. Now let's see what things look like. Type:

> head(dekeyser)
  AoA GJT   Status
1   …   … Under 15
2   …   … Under 15
3   …   … Under 15
4   …   … Under 15
5   …   … Under 15
6   …   … Under 15

You should see the first six lines of the data. The first row (AoA GJT Status) is your header. You can also see that R has added row numbers (1 to 6) at the beginning of each row of data.

The variable dekeyser is different from the vector variables that we used before. It is a data frame variable. In order to use the variables in it like vectors, type the following:

> attach(dekeyser)

Now we shall do some calculations that the graphics functions need in order to draw the lines. The first one, lm(), calculates the regression line for GJT (y-axis) and AoA (x-axis) and stores it in dekeyser.lm. Variable names in R are case-sensitive, so GJT and AoA must be typed exactly as they appear in the header. [After you have done this, type dekeyser.lm to see what the regression line data looks like.]

> dekeyser.lm <- lm(GJT ~ AoA)

The next command stores a sequence of numbers in newx. [After you have done it, type newx to see what it looks like.]

> newx <- seq(0, 45, 0.1)

The next command, predict.lm(), calculates the predicted values of GJT and their confidence intervals and stores them in pred. Because AoA does not have many values, the 95% confidence lines may not be very smooth. newx is used instead of the original AoA values in order to make smoother lines. [Once again, you can type pred to see what this data looks like.]

> pred <- predict.lm(dekeyser.lm, newdata = data.frame(AoA = newx), interval = "confidence")

Now we can start to plot the graph. First, we plot the points, the regression line and its 95% confidence intervals.

> plot(GJT ~ AoA, bty = "n", col = "grey", ylim = c(80, 210))
> abline(dekeyser.lm)
> lines(pred[, 2] ~ newx, lty = 2, col = "grey")
> lines(pred[, 3] ~ newx, lty = 2, col = "grey")

The next step is to do the calculations for the loess trend line and its confidence intervals. The sequence is similar to that for the regression line but, because there are differences in the structure of pred (used for the regression line) and pred2, the arguments used for drawing the lines are different.

> dekeyser.lo <- loess(GJT ~ AoA)
> newx <- seq(0, 45, 0.1)
> pred2 <- predict(dekeyser.lo, newdata = data.frame(AoA = newx), se = TRUE)
> lines(pred2$fit ~ newx, col = "red4")
> lines(pred2$fit - qt(0.975, pred2$df) * pred2$se.fit ~ newx, lty = 2, col = "pink3")
> lines(pred2$fit + qt(0.975, pred2$df) * pred2$se.fit ~ newx, lty = 2, col = "pink3")
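The difference in structure between pred and pred2 can be inspected directly. The sketch below uses R's built-in cars data set (speed and dist) purely as a stand-in so that it runs without dekeyser.txt; the object names cars.lm and cars.lo are invented for this illustration.

```r
# Stand-in data: the built-in 'cars' data set, NOT the DeKeyser data
cars.lm <- lm(dist ~ speed, data = cars)
cars.lo <- loess(dist ~ speed, data = cars)

newx <- seq(5, 25, 0.5)

# For a regression, predict() with interval = "confidence" returns a matrix
pred <- predict(cars.lm, newdata = data.frame(speed = newx),
                interval = "confidence")
colnames(pred)   # "fit" "lwr" "upr"

# For loess, predict() with se = TRUE returns a list
pred2 <- predict(cars.lo, newdata = data.frame(speed = newx), se = TRUE)
names(pred2)     # "fit" "se.fit" "residual.scale" "df"
```

This is why the regression limits can be taken straight from columns 2 and 3 of pred (lwr and upr), whereas the loess limits have to be built from the fitted values, the standard errors and the t quantile qt(0.975, pred2$df).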
Summary of commands

> attach(dekeyser)
> dekeyser.lm <- lm(GJT ~ AoA)
> newx <- seq(0, 45, 0.1)
> pred <- predict.lm(dekeyser.lm, newdata = data.frame(AoA = newx), interval = "confidence")
> plot(GJT ~ AoA, bty = "n", col = "grey", ylim = c(80, 210))
> abline(dekeyser.lm)
> lines(pred[, 2] ~ newx, lty = 2, col = "grey")
> lines(pred[, 3] ~ newx, lty = 2, col = "grey")
> dekeyser.lo <- loess(GJT ~ AoA)
> newx <- seq(0, 45, 0.1)
> pred2 <- predict(dekeyser.lo, newdata = data.frame(AoA = newx), se = TRUE)
> lines(pred2$fit ~ newx, col = "red4")
> lines(pred2$fit - qt(0.975, pred2$df) * pred2$se.fit ~ newx, lty = 2, col = "pink3")
> lines(pred2$fit + qt(0.975, pred2$df) * pred2$se.fit ~ newx, lty = 2, col = "pink3")
> detach(dekeyser)
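The reading and fitting steps can be rehearsed without dekeyser.txt, and without attach(), by passing the data frame through the data argument. The sketch below is an illustration only: every value in toy is simulated, the names toy, toy2 and toy.file are invented here, and the "Over 15" label is an assumption about how the second group might be coded.

```r
# Simulated stand-in data (NOT the DeKeyser data); "Over 15" is an assumed label
set.seed(42)
toy <- data.frame(AoA = 1:40)
toy$GJT    <- round(200 - 1.5 * toy$AoA + rnorm(40, sd = 8))
toy$Status <- ifelse(toy$AoA < 15, "Under 15", "Over 15")

# Write a tab-separated file with a quoted header, like the one described above
toy.file <- tempfile(fileext = ".txt")
write.table(toy, toy.file, sep = "\t", row.names = FALSE)

# Read it back with the same arguments used for dekeyser.txt
toy2 <- read.table(toy.file, header = TRUE, sep = "\t")
head(toy2)

# Fit both models with data = toy2 instead of attach(toy2)
toy.lm <- lm(GJT ~ AoA, data = toy2)
toy.lo <- loess(GJT ~ AoA, data = toy2)

newx  <- seq(1, 40, 0.1)
pred  <- predict(toy.lm, newdata = data.frame(AoA = newx),
                 interval = "confidence")
pred2 <- predict(toy.lo, newdata = data.frame(AoA = newx), se = TRUE)
```

With data = toy2 added to the plot() call, the abline() and lines() commands from the summary can then be used unchanged, and no detach() is needed at the end.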
Absorbance Spectrophotometry: Analysis of FD&C Red Food Dye #40 Calibration Curve Procedure Note: there is a second document that goes with this one! 2046 - Absorbance Spectrophotometry. Make sure you
More informationOVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE
OVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-110012 1. INTRODUCTION R is a free software environment for statistical computing
More informationUCL Depthmap 7: Data Analysis
UCL Depthmap 7: Data Analysis Version 7.12.00c Outline Data analysis in Depthmap Although Depthmap is primarily a graph analysis tool, it does allow you to investigate data that you produce. This tutorial
More informationHomework 11. Part 1. Name: Score: / null
Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is
More informationBeginner s Matlab Tutorial
Christopher Lum lum@u.washington.edu Introduction Beginner s Matlab Tutorial This document is designed to act as a tutorial for an individual who has had no prior experience with Matlab. For any questions
More informationMicrosoft Excel Tutorial
Microsoft Excel Tutorial Microsoft Excel spreadsheets are a powerful and easy to use tool to record, plot and analyze experimental data. Excel is commonly used by engineers to tackle sophisticated computations
More informationUsing Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data
Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationStatistics. Measurement. Scales of Measurement 7/18/2012
Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does
More informationMetroBoston DataCommon Training
MetroBoston DataCommon Training Whether you are a data novice or an expert researcher, the MetroBoston DataCommon can help you get the information you need to learn more about your community, understand
More informationUCINET Visualization and Quantitative Analysis Tutorial
UCINET Visualization and Quantitative Analysis Tutorial Session 1 Network Visualization Session 2 Quantitative Techniques Page 2 An Overview of UCINET (6.437) Page 3 Transferring Data from Excel (From
More informationSection 3 Part 1. Relationships between two numerical variables
Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
More informationGraphics in R. Biostatistics 615/815
Graphics in R Biostatistics 615/815 Last Lecture Introduction to R Programming Controlling Loops Defining your own functions Today Introduction to Graphics in R Examples of commonly used graphics functions
More informationSimple Linear Regression, Scatterplots, and Bivariate Correlation
1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.
More informationExercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
More informationUpdates to Graphing with Excel
Updates to Graphing with Excel NCC has recently upgraded to a new version of the Microsoft Office suite of programs. As such, many of the directions in the Biology Student Handbook for how to graph with
More informationSPSS Introduction. Yi Li
SPSS Introduction Yi Li Note: The report is based on the websites below http://glimo.vub.ac.be/downloads/eng_spss_basic.pdf http://academic.udayton.edu/gregelvers/psy216/spss http://www.nursing.ucdenver.edu/pdf/factoranalysishowto.pdf
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
More informationII. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
More informationWorkspaces Creating and Opening Pages Creating Ticker Lists Looking up Ticker Symbols Ticker Sync Groups Market Summary Snap Quote Key Statistics
Getting Started Workspaces Creating and Opening Pages Creating Ticker Lists Looking up Ticker Symbols Ticker Sync Groups Market Summary Snap Quote Key Statistics Snap Report Price Charts Comparing Price
More informationTutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller
Tutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller Most of you want to use R to analyze data. However, while R does have a data editor, other programs such as excel are often better
More informationGeneral instructions for the content of all StatTools assignments and the use of StatTools:
General instructions for the content of all StatTools assignments and the use of StatTools: An important part of Business Management 330 is learning how to conduct statistical analyses and to write text
More informationIBM SPSS Statistics 20 Part 1: Descriptive Statistics
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 1: Descriptive Statistics Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationTI-Inspire manual 1. I n str uctions. Ti-Inspire for statistics. General Introduction
TI-Inspire manual 1 I n str uctions Ti-Inspire for statistics General Introduction TI-Inspire manual 2 General instructions Press the Home Button to go to home page Pages you will use the most #1 is a
More informationIntroduction and usefull hints for the R software
What is R Statistical software and programming language Freely available (inluding source code) Started as a free re-implementation of the S-plus programming language Introduction and usefull hints for
More informationThe correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative
More information