Exploring Data and Descriptive Statistics (using R)
|
|
|
- Karin Tyler
- 9 years ago
- Views:
Transcription
1 Data Analysis 101 Workshops Exploring Data and Descriptive Statistics (using R) Oscar Torres-Reyna Data Consultant
2 Agenda What is R Transferring data to R Excel to R Basic data manipulation Frequencies Crosstabulations Scatterplots/Histograms Exercise 1: Data from ICPSR using the Online Learning Center. Exercise 2: Data from the World Development Indicators & Global Development Finance from the World Bank This document is created from the following: 2
3 What is R? R is a programming language use for statistical analysis and graphics. It is based S plus. [see project.org/] Multiple datasets open at the same time R is offered as open source (i.e. free) Download R at project.org/ A dataset is a collection of several pieces of information called variables (usually arranged by columns). A variable can have one or several values (information for one or several cases). Other statistical packages are SPSS, SAS and Stata. 3
4 Other data formats Features Stata SPSS SAS R Data extensions *.dta *.sav, *.por (portable file) *.sas7bcat, *.sas#bcat, *.xpt (xport files) *.Rdata User interface Programming/point-and-click Mostly point-and-click Programming Programming Data manipulation Very strong Moderate Very strong Very strong Data analysis Powerful Powerful Powerful/versatile Powerful/versatile Graphics Very good Very good Good Excellent Cost Program extensions Output extension Affordable (perpetual licenses, renew only when upgrade) Expensive (but not need to renew until upgrade, long term licenses) Expensive (yearly renewal) Open source *.do (do-files) *.sps (syntax files) *.sas *.txt (log files) *.log (text file, any word processor can read it), *.smcl (formated log, only Stata can read it). *.spo (only SPSS can read it) (various formats) *.R, *.txt(log files, any word processor can read) 4
5 Stat/Transfer: Transferring data from one format to another (available in the DSS lab) 1) Select the current format of the dataset 2) Browse for the dataset 3) Select Stata or the data format you need 4) It will save the file in the same directory as the original but with the appropriate extension (*.dta for Stata) 5) Click on Transfer 5
6 This is the R screen in Multiple-Document Interface (MDI) 6
7 This is the R screen in Single-Document Interface (SDI) To make the SDI the default, you can select the SDI during installation of R, or edit the Rconsole configuration file in R's etc directory, changing the line MDI = yes to MDI = no. Alternatively, you can create a second desktop icon for R to run R in SDI mode: Make a copy of the R icon by right clicking on the icon and dragging it to a new location on the desktop. Release the mouse button and select Copy Here. Right click on the new icon and select Properties. Edit the Target field on the Shortcut tab to read "C:\Program Files\R\R 2.5.1\bin\Rgui.exe" sdi (including the quotes exactly as shown, and assuming that you've installed R to the default location). Then edit the shortcut name on the General tab to read something like R SDI. [John Fox, 1E/installation.html#SDI] 7
8 Working directory getwd() # Shows the working directory (wd) setwd(choose.dir()) # Select the working directory interactively setwd("c:/myfolder/data") # Changes the wd setwd("h:\\myfolder\\data") # Changes the wd Creating directories/downloading from the internet dir() # Lists files in the working directory dir.create("c:/test") # Creates folder test in drive c: setwd("c:/test") # Changes the working directory to c:/test # Download file students.csv from the internet. download.file(" "C:/test/students.xls", method="auto", quiet=false, mode = "wb", cacheok = TRUE) 8
9 Installing/loading packages/user written programs install.packages("abc") # This will install the package -ABC--. A window will pop-up, select a # mirror site to download from (the closest to where you are) and click ok. library(abc) # Load the package -ABC- to your workspace # Install the following packages: install.packages("foreign") library(foreign) install.packages("car") install.packages("hmisc") install.packages("reshape") # Full list of packages by subject area 2+2 Log(10) c(1, 1) + c(1, 1) Operations/random numbers x <- rnorm(10, mean=0, sd=1) # Creates 10 random numbers (normal dist.), syntax rnorm(n, mean, sd) x x <- data.frame(x) x <- matrix(x) 9
10 # Save the commands used during the session savehistory(file="mylog.rhistory") # Load the commands used in a previous session loadhistory(file="mylog.rhistory") # Display the last 25 commands history() Keeping track of your work # You can read mylog.rhistory with any word processor. Notice that the file has to have the extension *.Rhistory Getting help?plot # Get help for an object, in this case for the -plot function. You can also type: help(plot)??regression # Search the help pages for anything that has the word "regression". You can also type: help.search("regression") apropos("age") # Search the word "age" in the objects available in the current R session. help(package=car) # View documentation in package car. You can also type: library(help="car ) help(dataabc) # Access codebook for a dataset called DataABC in the package ABC args(log) # Description of the command. 10
11 Example of a dataset in Excel. Variables are arranged by columns and cases by rows. Each variable has more than one value Path to the file: 11
12 From Excel to *.csv In Excel go to File->Save as and save the Excel file as *.csv: You may get the following messages, click OK and YES 12
13 Data from *.csv (copy and paste) # Select the table from the excel file, copy, go to the R Console and type: mydata <- read.table("clipboard", header=true, sep="\t") summary(mydata) edit(mydata) Data from *.csv (interactively) mydata <- read.csv(file.choose(), header = TRUE) Data from *.csv mydata <- read.csv("c:\mydata\mydatafile.csv", header=true) mydata <- read.csv(" header=true) Data from *.txt (space, tab, comma separated) # If you have spaces and missing data is coded as -9, type: mydata <- read.table(("c:/myfolder/abc.txt", header=true, sep="\t", na.strings = "-9") Data to *.txt (space, tab, comma separated) write.table(mydata, file = "test.txt", sep = "\t") 13
14 install.packages("foreign") Data from Stata # Need to install package -foreign - first (you do this only once) library(foreign) # Load package -foreign-- mydata.stata <- read.dta(" mydata.stata <- read.dta(" convert.factors=true, convert.dates=true, convert.underscore=true, warn.missing.labels=true) # Where (source: type?read.dta) # convert.dates. Convert Stata dates to Date class # convert.factors. Use Stata value labels to create factors? (version 6.0 or later). # convert.underscore. Convert "_" in Stata variable names to "." in R names? # warn.missing.labels. Warn if a variable is specified with value labels and those value labels are not present in the file. Data to Stata write.dta(mydata, file = "test.dta") # Direct export to Stata write.foreign(mydata, codefile="test1.do", datafile="test1.raw", package="stata") # Provide a dofile to read the *.raw data 14
15 Data from SPSS install.packages("foreign") # Need to install package -foreign - first (you do this only once) library(foreign) # Load package -foreign-- mydata.spss <- read.spss(" to.data.frame = TRUE, use.value.labels=true, use.missings = to.data.frame) # Where: # # to.data.frame return a data frame. # # use.value.labels Convert variables with value labels into R factors with those levels. # # use.missings logical: should information on user-defined missing values be used to set the corresponding values to NA. Source: type?read.spss Data to SPSS # Provides a syntax file (*.sps) to read the *.raw data file write.foreign(mydata, codefile="test2.sps", datafile="test2.raw", package= SPSS") 15
16 Data from SAS # To read SAS XPORT format (*.xpt). Package -foreign-- install.packages("foreign") # Need to install package -foreign - first (you do this only once) library(foreign) # Load package -foreign-- mydata.sas <- read.xport("c:/myfolder/mydata.xpt") # NOTE: Does not work for files available online # Using package -Hmisc install.packages( Hmisc") # Need to install package -Hmisc - first (you do this only once) library(hmisc) mydata.sas <- sasxport.get(" # It works Data to SAS # It provides a syntax file (*.sas) to read the *.raw data write.foreign(mydata, codefile="test2.sas", datafile="test2.raw", package= SAS") NOTE: As an alternative, you can use SAS Universal Viewer (freeware from SAS) to read SAS files and save them as *.csv. Saving the file as *.csv removes variable/value labels, make sure you have the codebook available. 16
17 Data from ACII Record form mydata.dat <-read.fwf(file=" width=c(7, -16, 2, 2, -4, 2, -10, 2, -110, 3, -6, 2), col.names=c("w","y","x1","x2","x3", "age", "sex"), n=1090) # Reading ASCII record form, numbers represent the width of variables, negative sign excludes variables not wanted (you must include these). # To get the width of the variables you must have a codebook for the data set available (see an example below). # To get the widths for unwanted spaces use the formula: Start of var(t+1) End of var(t) - 1 *Thank you to Scott Kostyshak for useful advice/code. Data locations usually available in codebooks Var Rec Start End Format var F7.2 var F2.0 var A2 var F2.0 var A2 var A3 var A2 17
18 Data from R load("mydata.rdata") load("mydata.rda") /* Add path to data if necessary */ Data to R save.image("mywork.rdata") # Saving all objects to file *.RData save(object1, object2, file= mywork.rda") # Saving selected objects 18
19 Exploring data summary(mydata) # Provides basic descriptive statistics and frequencies. edit(mydata) # Open data editor str(mydata) # Provides the structure of the dataset names(mydata) # Lists variables in the dataset head(mydata) # First 6 rows of dataset head(mydata, n=10)# First 10 rows of dataset head(mydata, n= -10) # All rows but the last 10 tail(mydata) # Last 6 rows tail(mydata, n=10) # Last 10 rows tail(mydata, n= -10) # All rows but the first 10 mydata[1:10, ] # First 10 rows mydata[1:10,1:3] # First 10 rows of data of the first 3 variables 19
20 Exploring the workspace objects() # Lists the objects in the workspace ls() # Same as objects() remove() # Remove objects from the workspace rm(list=ls()) #clearing memory space detach(package:abc) # Detached packages when no longer need them search() # Shows the loaded packages library() # Shows the installed packages dir() # show files in the working directory 20
21 Missing data rowsums(is.na(mydata)) # Number of missing per row colsums(is.na(mydata)) # Number of missing per column/variable rowmeans(is.na(mydata))*length(mydata) # No. of missing per row (another way) # length = num. of variables/elements in an object # Convert to missing data mydata[mydata$age=="& ","age"] <- NA # NOTE: Notice hidden spaces. mydata[mydata$age==999,"age"] <- NA # The function complete.cases() returns a logical vector indicating which cases are complete. # list rows of data that have missing values mydata[!complete.cases(mydata),] # The function na.omit() returns the object with listwise deletion of missing values. # Creating a new dataset without missing data mydata1 <- na.omit(mydata) 21
22 Replacing a value mydata1 <- na.omit(mydata) mydata1[mydata1$sat==1787,"sat"] < mydata1[mydata1$country=="bulgaria","country"] <- "US" Renaming variables # Using base commands fix(mydata) # Rename interactively. names(mydata)[3] <- "First" # Using library -reshape-- library(reshape) mydata <- rename(mydata, c(last.name="last")) mydata <- rename(mydata, c(first.name="first")) mydata <- rename(mydata, c(student.status="status")) mydata <- rename(mydata, c(average.score..grade.="score")) mydata <- rename(mydata, c(height..in.="height")) mydata <- rename(mydata, c(newspaper.readership..times.wk.="read")) 22
23 # Use factor() for nominal data Value labels mydata$sex <- factor(mydata$sex, levels = c(1,2), labels = c("male", "female")) # Use ordered() for ordinal data mydata$var2 <- ordered(mydata$var2, levels = c(1,2,3,4), labels = c("strongly agree", "Somewhat agree", "Somewhat disagree", "Strongly disagree")) # As a new variable. mydata$var8 <- ordered(mydata$var2, levels = c(1,2,3,4), labels = c("strongly agree", "Somewhat agree", "Somewhat disagree", "Strongly disagree")) Reordering labels levels(mydata$major) # Syntax for reorder(categorical variable, numeric variable, desired statistic) mydata$major = with(mydata, reorder(major,read,mean)) # Order goes from low to high levels(mydata$major) attr(mydata$major, 'scores') # Reorder creates an attribute called scores (with the statistic # used to reorder the labels, in this case the mean values. # Mean of reading time: 4.4 for Econ, 5.3 for Math, 4.9 for Politics (using students.xls) 23
24 Creating ids/sequence of numbers # Creating a variable with a sequence of numbers or to index # Creating a variable with a sequence of numbers from 1 to n (where n is the total number of observations) mydata$id <- seq(dim(mydata)[1]) # Creating a variable with the total number of observations mydata$total <- dim(mydata)[1] /* Creating a variable with a sequence of numbers from 1 to n per category (where n is the total number of observations in each category)(1) */ mydata <- mydata[order(mydata$group),] idgroup <- tapply(mydata$group, mydata$group, function(x) seq(1,length(x),1)) mydata$idgroup <- unlist(idgroup) (1) Thanks to Alex Acs for the code 24
25 Recoding variables library(car) mydata$age.rec <- recode(mydata$age, "18:19='18to19'; 20:29='20to29'; 30:39='30to39'") mydata$age.rec <- as.factor(mydata$age.rec) Sort mydata.sorted <- mydata[order(mydata$gender),] mydata.sorted1 <- mydata[order(mydata$gender, -mydata$sat),] mydata$age.rec <- NULL mydata$var1 <- mydata$var2 <- NULL Deleting variables (see subset next page) (see subset next page) Deleting rows 25
26 Subsetting variables mydata2 <- mydata[,1:14] # Selecting the first 14 variables mydata2 <- mydata[c(1:14)] sat <- mydata[c("last", "First", "SAT")] sat1 <- mydata[c(2,3,11)] select <- mydata[c(1:3, 12:14)] # Type names(select) to verify select1 <- mydata[c(-(1:3), -(12:14))] # Excluding variables Subsetting observations mydata2 <- mydata2[1:30,] # Selecting the first 30 observations mydata3a <- mydata[which(mydata$gender=='female' & mydata$sat > 1800), ] Subsetting variables/observations mydata2 <- mydata2[1:30,1:14] # Selecting the first 30 observations and first 14 variables Subsetting using subset mydata3 <- subset(mydata2, Age >= 20 & Age <= 30) mydata4 <- subset(mydata2, Age >= 20 & Age <= 30, select=c(id, First, Last, Age)) mydata5 <- subset(mydata2, Gender=="Female" & Status=="Graduate" & Age >= 30) mydata6 <- subset(mydata2, Gender=="Female" & Status=="Graduate" & Age == 30) 26
27 table(mydata$gender) table(mydata$read) Categorical data: Frequencies/Crosstabs # Two-way tables readgender <- table(mydata$read,mydata$gender) readgender addmargins(readgender) # Adding row/col margins prop.table(readgender,1) # Row proportions round(prop.table(readgender,1), 2) # Round col prop to 2 digits round(100*prop.table(readgender,1), 2) # Round col prop to 2 digits (percents) addmargins(round(prop.table(readgender,1), 2),2) # Round col prop to 2 digits prop.table(readgender,2) # Column proportions round(prop.table(readgender,2), 2) # Round column prop to 2 digits round(100*prop.table(readgender,2), 2) # Round column prop to 2 digits (percents) addmargins(round(prop.table(readgender,2), 2),1) # Round col prop to 2 digits prop.table(readgender) # Tot proportions round(prop.table(readgender),2) # Tot proportions rounded round(100*prop.table(readgender),2) # Tot proportions rounded chisq.test(readgender) # Do chisq test Ho: no relathionship fisher.test(readgender) # Do fisher'exact test Ho: no relationship install.packages("vcd") library(vcd) assocstats(readgender) # First two are assoc measures, last three show degree of association. # 3-way crosstabs table3 <- xtabs(~read+major+gender, data=mydata) table3 ftable(table3a) # NOTE: Chi-sqr = sum (obs-exp)^2/exp. Degrees of freedom for Chi-sqr are (r-1)*(c-1) # NOTE: Chi-sqr contribution = (obs-exp)^2/exp # Cramer's V = sqrt(chi-sqr/n*min). Where N is sample size and min is a the minimum of (r-1) or (c-1) 27
28 Categorical data: Frequencies/Crosstabs using gmodels library(gmodels) mydata$ones <- 1 # Create a new variable of ones CrossTable(mydata$Major,digits=2) # Shows horizontal CrossTable(mydata$Major,digits=2, max.width=1) # Shows vertical CrossTable(mydata$Major,mydata$ones, digits=2) CrossTable(mydata$Gender,mydata$ones, digits=2) CrossTable(mydata$Major,mydata$Gender,digits=2, expected=true,dnn=c("major","gender")) CrossTable(mydata$Major,mydata$Gender,digits=2, chisq=true, dnn=c("major","gender")) CrossTable(mydata$Major,mydata$Gender,digits=2, dnn=c("major","gender")) CrossTable(mydata$Major,mydata$Gender, format=c("spss"), digits=1) chisq.test(mydata$major,mydata$gender) # Null hipothesis: no association # NOTE: Expected value = (row total * column total)/overall total (or total sample size). Value we would expect if all cell were represented proportionally, which indicates no association between variables. This is are we getting what we expect or not. If so then nothing is new. If not then something is going on # NOTE: Chi-sqr = sum (obs-exp)^2/exp Degrees of freedom for Chi-sqr are (r-1)*(c-1) # NOTE: Chi-sqr contribution = (obs-exp)^2/exp # Cramer's V = sqrt(chi-sqr/n*min) Where N is sample size and min is a the minimun of (r-1) or (c-1) 28
29 Measures of association X 2 (chi square) tests for relationships between variables. The null hypothesis (Ho) is that there is no relationship. To reject this we need a Pr < 0.05 (at 95% confidence). Here both chi2 are significant. Therefore we conclude that there is some relationship between perceptions of the economy and gender. lrchi2 reads the same way. Cramer s V is a measure of association between two nominal variables. It goes from 0 to 1 where 1 indicates strong association (for rxc tables). In 2x2 tables, the range is 1 to 1. Here the V is 0.15, which shows a small association. Fisher s exact test is used when there are very few cases in the cells (usually less than 5). It tests the relationship between two variables. The null is that variables are independent. Here we reject the null and conclude that there is some kind of relationship between variables Source: Plotting frequencies barplot(margin.table(readgender,1)) barplot(margin.table(readgender,2)) 29
30 install.packages("pastecs") Descriptive Statistics using pastecs library(pastecs) stat.desc(mydata) stat.desc(mydata[,c("age","sat","score","height", "Read")]) stat.desc(mydata[,c("age","sat","score")], basic=true, desc=true, norm=true, p=0.95) stat.desc(mydata[10:14], basic=true, desc=true, norm=true, p=0.95) 30
31 mean(mydata) # Mean of all numeric variables mean(mydata$sat) with(mydata, mean(sat)) Descriptive Statistics median(mydata$sat) var(mydata$sat) # Variance sd(mydata$sat) # Standard deviation max(mydata$sat) # Max value min(mydata$sat) # Min value range(mydata$sat) # Range quantile(mydata$sat) # Quantiles 25% quantile(mydata$sat, c(.3,.6,.9)) # Customized quantiles fivenum(mydata$sat) # Boxplot elements. From help: "Returns Tukey's five number summary (minimum, # lower-hinge, median, upper-hinge, maximum) for the input data ~ boxplot" length(mydata$sat) # Num of observations when a variable is specify length(mydata) # Number of variables when a dataset is specify which.max(mydata$sat) # From help: "Determines the location, i.e., index of the (first) minimum or maximum of a numeric vector" which.min(mydata$sat) # From help: "Determines the location, i.e., index of the (first) minimum or maximum of a numeric vector # Mode by frequencies table(mydata$country) max(table(mydata$country)) names(sort(-table(mydata$country)))[1] 31
32 Descriptive Statistics # Descriptive statistics by groups using --tapply-- mean <- tapply(mydata$sat,mydata$gender, mean) # Add na.rm=true to remove missing values in the estimation sd <- tapply(mydata$sat,mydata$gender, sd) median <- tapply(mydata$sat,mydata$gender, median) max <- tapply(mydata$sat,mydata$gender, max) cbind(mean, median, sd, max) round(cbind(mean, median, sd, max),digits=1) t1 <- round(cbind(mean, median, sd, max),digits=1) t1 # Descriptive statistics by groups using --aggregate aggregate(mydata[c("age","sat")],by=list(sex=mydata$gender), mean, na.rm=true) aggregate(mydata[c("age","sat")],mydata["gender"], mean, na.rm=true) aggregate(mydata,by=list(sex=mydata$gender), mean, na.rm=true) aggregate(mydata,by=list(sex=mydata$gender, major=mydata$major, status=mydata$status), mean, na.rm=true) aggregate(mydata$sat,by=list(sex=mydata$gender, major=mydata$major, status=mydata$status), mean, na.rm=true) aggregate(mydata[c("sat")],by=list(sex=mydata$gender, major=mydata$major, status=mydata$status), mean, na.rm=true) 32
33 Histograms library(car) head(prestige) hist(prestige$income) hist(prestige$income, col="green") with(prestige, hist(income)) # Histogram of income with a nicer title. # Applying Freedman/Diaconis rule p.120 ("Algorithm that chooses bin widths and locations automatically, based on the sample size and the spread of the data" with(prestige, hist(income, breaks="fd", col="green")) box() hist(prestige$income, breaks="fd") # Conditional histograms par(mfrow=c(1, 2)) hist(mydata$sat[mydata$gender=="female"], breaks="fd", main="female", xlab="sat",col="green") hist(mydata$sat[mydata$gender=="male"], breaks="fd", main="male", xlab="sat", col="green") # Braces indicate a compound command allowing several commands with 'with' command par(mfrow=c(1, 1)) with(prestige, { hist(income, breaks="fd", freq=false, col="green") lines(density(income), lwd=2) lines(density(income, adjust=0.5),lwd=1) rug(income) }) 33
34 # Histograms overlaid Histograms hist(mydata$sat, breaks="fd", col="green") hist(mydata$sat[mydata$gender=="male"], breaks="fd", col="gray", add=true) legend("topright", c("female","male"), fill=c("green","gray")) # Check satgender <- table(mydata$sat,mydata$gender) satgender x <- rnorm(100) hist(x, freq=f) curve(dnorm(x), add(t) Histogram with normal curve overlay h <- hist(x, plot=f) ylim <- range(0. h$density, dnorm(0)) hist(x, freq=f, ylim=ylim) curve(dnorm(x), add=t) 34
35 Scatterplots # Scatterplots. Useful to 1) study the mean and variance functions in the regression of y on x p.128; 2)to identify outliers and leverage points. # plot(x,y) plot(mydata$sat) # Index plot plot(mydata$age, mydata$sat) plot(mydata$age, mydata$sat, main= Age/SAT", xlab= Age", ylab= SAT", col="red") abline(lm(mydata$sat~mydata$age), col="blue") # regression line (y~x) lines(lowess(mydata$age, mydata$sat), col="green") # lowess line (x,y) identify(mydata$age, mydata$sat, row.names(mydata)) # On row.names to identify. "All data frames have a row names attribute, a character vector of length the number of rows with no duplicates nor missing values." (source link below). # "Use attr(x, "row.names") if you need an integer value.)" mydata$names <- paste(mydata$last, mydata$first) row.names(mydata) <- mydata$names plot(mydata$sat, mydata$age) identify(mydata$sat, mydata$age, row.names(mydata)) 35
36 Scatterplots # Rule on span for lowess, big sample smaller (~0.3), small sample bigger (~0.7) library(car) scatterplot(sat~age, data=mydata) scatterplot(sat~age, id.method="identify", data=mydata) scatterplot(sat~age, id.method="identify", boxplots= FALSE, data=mydata) scatterplot(prestige~income, span=0.6, lwd=3, id.n=4, data=prestige) # By groups scatterplot(sat~age Gender, data=mydata) scatterplot(sat~age Gender, id.method="identify", data=mydata) scatterplot(prestige~income type, boxplots=false, span=0.75, data=prestige) scatterplot(prestige~income type, boxplots=false, span=0.75, col=gray(c(0,0.5,0.7)), data=prestige) 36
37 Scatterplots (multiple) scatterplotmatrix(~ prestige + income + education + women, span=0.7, id.n=0, data=prestige) pairs(prestige) # Pariwise plots. Scatterplots of all variables in the dataset pairs(prestige, gap=0, cex.labels=0.9) # gap controls the space between subplot and cex.labels the font size (Dalgaard:186) library(car) 3D Scatterplots scatter3d(prestige ~ income + education, id.n=3, data=duncan) 37
38 plot(vocabulary ~ education, data=vocab) Scatterplots (for categorical data) plot(jitter(vocabulary) ~ jitter(education), data=vocab) plot(jitter(vocabulary, factor=2) ~ jitter(education, factor=2), data=vocab) # cex makes the point half the size, p. 134 plot(jitter(vocabulary, factor=2) ~ jitter(education, factor=2), col="gray", cex=0.5, data=vocab) with(vocab, { abline(lm(vocabulary ~ education), lwd=3, lty="dashed") lines(lowess(education, vocabulary, f=0.2), lwd=3) }) Useful links to graphics
39 Exercises
40 Using the ICPSR Online Learning Center, go to guide on Civic Participation and Demographics in Rural China (1990) Exercise 1 Got to the tab Dataset and download the data ( We ll focus on the first exercise on Age and Participation and use the following variables: Respondent's year of birth (M1001) Village meeting attendance (M3090) Activities: Create the variable age for each respondent Create the variable agegroup with the following categories: 16 35, and Questions: What percentage of respondents reported attending a local village meeting? Of those attending a meeting, which age group was most likely to report attending a village meeting? Of those attending a meeting, which group was most likely to report no village meeting attendance? Source: Inter university Consortium for Political and Social Research. Civic Participation and Demographics in Rural China: A Data Driven Learning Guide. Ann Arbor, MI: Inter university Consortium for Political and Social Research [distributor], July, Doi: /China 40
41 Exercise 2 Got to the World Development Indicators (WDI) & Global Development Finance (GDF) from the World Bank (access from the library s Articles and Databases, Direct link to WDI/GDF Get data for the United States and all available years on: Long term unemployment (% of total unemployment) Long term unemployment, female (% of female unemployment) Long term unemployment, male (% of male unemployment) Inflation, consumer prices (annual %) GDP per capita (constant 2000 US$) GDP per capita growth (annual %) See here to arrange the data as panel data For an example of how panel data looks like click here: Activities: Rename the variables and explore the data (use describe, summarize) Create a variable called crisis where it takes the value of 17 for the following years: 1960, 1961, 1969, 1970, 1973, 1974, 1975, 1981, 1982, 1990, 1991, 2001, 2007, 2008, Replace missing with zeros (source: nber.org). Set as time series (see Create a line graph with unemployment rate (total, female and males) and crisis by year. Questions: What do you see? Who tends to be more affected by the economic recessions? 41
42 References/Useful links DSS Online Training Section Princeton DSS Libguides John Fox s site Quick R UCLA Resources to learn and use R UCLA Resources to learn and use Stata DSS Stata DSS R 42
43 References/Recommended books An R Companion to Applied Regression, Second Edition / John Fox, Sanford Weisberg, Sage Publications, 2011 Data Manipulation with R / Phil Spector, Springer, 2008 Applied Econometrics with R / Christian Kleiber, Achim Zeileis, Springer, 2008 Introductory Statistics with R / Peter Dalgaard, Springer, 2008 Complex Surveys. A guide to Analysis Using R / Thomas Lumley, Wiley, 2010 Applied Regression Analysis and Generalized Linear Models / John Fox, Sage, 2008 R for Stata Users / Robert A. Muenchen, Joseph Hilbe, Springer, 2010 Introduction to econometrics / James H. Stock, Mark W. Watson. 2nd ed., Boston: Pearson Addison Wesley, Data analysis using regression and multilevel/hierarchical models / Andrew Gelman, Jennifer Hill. Cambridge ; New York : Cambridge University Press, Econometric analysis / William H. Greene. 6th ed., Upper Saddle River, N.J. : Prentice Hall, Designing Social Inquiry: Scientific Inference in Qualitative Research / Gary King, Robert O. Keohane, Sidney Verba, Princeton University Press, Unifying Political Methodology: The Likelihood Theory of Statistical Inference / Gary King, Cambridge University Press, 1989 Statistical Analysis: an interdisciplinary introduction to univariate & multivariate methods / Sam Kachigan, New York : Radius Press, c1986 Statistics with Stata (updated for version 9) / Lawrence Hamilton, Thomson Books/Cole,
Getting Started in R~Stata Notes on Exploring Data
Getting Started in R~ Notes on Exploring Data (v. 1.0) Oscar Torres-Reyna [email protected] Fall 2010 http://dss.princeton.edu/training/ What is R/? What is R? R is a language and environment for statistical
Merge/Append using R (draft)
Merge/Append using R (draft) Oscar Torres-Reyna Data Consultant [email protected] January, 2011 http://dss.princeton.edu/training/ Intro Merge adds variables to a dataset. This document will use merge
Introduction to RStudio
Introduction to RStudio (v 1.3) Oscar Torres-Reyna [email protected] August 2013 http://dss.princeton.edu/training/ Introduction RStudio allows the user to run R in a more user-friendly environment.
A Short Guide to R with RStudio
Short Guides to Microeconometrics Fall 2013 Prof. Dr. Kurt Schmidheiny Universität Basel A Short Guide to R with RStudio 1 Introduction 2 2 Installing R and RStudio 2 3 The RStudio Environment 2 4 Additions
Introduction to PASW Statistics 34152-001
Introduction to PASW Statistics 34152-001 V18 02/2010 nm/jdr/mr For more information about SPSS Inc., an IBM Company software products, please visit our Web site at http://www.spss.com or contact: SPSS
Using SPSS, Chapter 2: Descriptive Statistics
1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,
Data Preparation & Descriptive Statistics (ver. 2.7)
Data Preparation & Descriptive Statistics (ver. 2.7) Oscar Torres-Reyna Data Consultant [email protected] http://dss.princeton.edu/training/ Basic definitions For statistical analysis we think of data
Importing Data into R
1 R is an open source programming language focused on statistical computing. R supports many types of files as input and the following tutorial will cover some of the most popular. Importing from text
InfiniteInsight 6.5 sp4
End User Documentation Document Version: 1.0 2013-11-19 CUSTOMER InfiniteInsight 6.5 sp4 Toolkit User Guide Table of Contents Table of Contents About this Document 3 Common Steps 4 Selecting a Data Set...
Getting Started in Data Analysis using Stata
Getting Started in Data Analysis using Stata (v. 6.0) Oscar Torres-Reyna [email protected] December 2007 http://dss.princeton.edu/training/ Stata Tutorial Topics What is Stata? Stata screen and general
There are six different windows that can be opened when using SPSS. The following will give a description of each of them.
SPSS Basics Tutorial 1: SPSS Windows There are six different windows that can be opened when using SPSS. The following will give a description of each of them. The Data Editor The Data Editor is a spreadsheet
Data exploration with Microsoft Excel: analysing more than one variable
Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical
Introduction Course in SPSS - Evening 1
ETH Zürich Seminar für Statistik Introduction Course in SPSS - Evening 1 Seminar für Statistik, ETH Zürich All data used during the course can be downloaded from the following ftp server: ftp://stat.ethz.ch/u/sfs/spsskurs/
IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the
Data Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
How to Use a Data Spreadsheet: Excel
How to Use a Data Spreadsheet: Excel One does not necessarily have special statistical software to perform statistical analyses. Microsoft Office Excel can be used to run statistical procedures. Although
IBM SPSS Direct Marketing 23
IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release
Data Analysis. Using Excel. Jeffrey L. Rummel. BBA Seminar. Data in Excel. Excel Calculations of Descriptive Statistics. Single Variable Graphs
Using Excel Jeffrey L. Rummel Emory University Goizueta Business School BBA Seminar Jeffrey L. Rummel BBA Seminar 1 / 54 Excel Calculations of Descriptive Statistics Single Variable Graphs Relationships
SPSS Manual for Introductory Applied Statistics: A Variable Approach
SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All
IBM SPSS Direct Marketing 22
IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release
Using R for Windows and Macintosh
2010 Using R for Windows and Macintosh R is the most commonly used statistical package among researchers in Statistics. It is freely distributed open source software. For detailed information about downloading
Scatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
Installing R and the psych package
Installing R and the psych package William Revelle Department of Psychology Northwestern University August 17, 2014 Contents 1 Overview of this and related documents 2 2 Install R and relevant packages
IBM SPSS Statistics 20 Part 1: Descriptive Statistics
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 1: Descriptive Statistics Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the
Getting Started with R and RStudio 1
Getting Started with R and RStudio 1 1 What is R? R is a system for statistical computation and graphics. It is the statistical system that is used in Mathematics 241, Engineering Statistics, for the following
R with Rcmdr: BASIC INSTRUCTIONS
R with Rcmdr: BASIC INSTRUCTIONS Contents 1 RUNNING & INSTALLATION R UNDER WINDOWS 2 1.1 Running R and Rcmdr from CD........................................ 2 1.2 Installing from CD...............................................
4 Other useful features on the course web page. 5 Accessing SAS
1 Using SAS outside of ITCs Statistical Methods and Computing, 22S:30/105 Instructor: Cowles Lab 1 Jan 31, 2014 You can access SAS from off campus by using the ITC Virtual Desktop Go to https://virtualdesktopuiowaedu
January 26, 2009 The Faculty Center for Teaching and Learning
THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i
An introduction to IBM SPSS Statistics
An introduction to IBM SPSS Statistics Contents 1 Introduction... 1 2 Entering your data... 2 3 Preparing your data for analysis... 10 4 Exploring your data: univariate analysis... 14 5 Generating descriptive
Exploratory data analysis (Chapter 2) Fall 2011
Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,
Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering
Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques
Introduction to SPSS (version 16) for Windows
Introduction to SPSS (version 16) for Windows Practical workbook Aims and Learning Objectives By the end of this course you will be able to: get data ready for SPSS create and run SPSS programs to do simple
IBM SPSS Statistics for Beginners for Windows
ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning
Novell ZENworks Asset Management 7.5
Novell ZENworks Asset Management 7.5 w w w. n o v e l l. c o m October 2006 USING THE WEB CONSOLE Table Of Contents Getting Started with ZENworks Asset Management Web Console... 1 How to Get Started...
Introduction to IBM SPSS Statistics
CONTENTS Arizona State University College of Health Solutions College of Nursing and Health Innovation Introduction to IBM SPSS Statistics Edward A. Greenberg, PhD Director, Data Lab PAGE About This Document
Data analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
SPSS: Getting Started. For Windows
For Windows Updated: August 2012 Table of Contents Section 1: Overview... 3 1.1 Introduction to SPSS Tutorials... 3 1.2 Introduction to SPSS... 3 1.3 Overview of SPSS for Windows... 3 Section 2: Entering
Can SAS Enterprise Guide do all of that, with no programming required? Yes, it can.
SAS Enterprise Guide for Educational Researchers: Data Import to Publication without Programming AnnMaria De Mars, University of Southern California, Los Angeles, CA ABSTRACT In this workshop, participants
Gamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
An Introduction to SPSS. Workshop Session conducted by: Dr. Cyndi Garvan Grace-Anne Jackman
An Introduction to SPSS Workshop Session conducted by: Dr. Cyndi Garvan Grace-Anne Jackman Topics to be Covered Starting and Entering SPSS Main Features of SPSS Entering and Saving Data in SPSS Importing
An introduction to using Microsoft Excel for quantitative data analysis
Contents An introduction to using Microsoft Excel for quantitative data analysis 1 Introduction... 1 2 Why use Excel?... 2 3 Quantitative data analysis tools in Excel... 3 4 Entering your data... 6 5 Preparing
Drawing a histogram using Excel
Drawing a histogram using Excel STEP 1: Examine the data to decide how many class intervals you need and what the class boundaries should be. (In an assignment you may be told what class boundaries to
Create Custom Tables in No Time
SPSS Custom Tables 17.0 Create Custom Tables in No Time Easily analyze and communicate your results with SPSS Custom Tables, an add-on module for the SPSS Statistics product line Share analytical results
The Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
OVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE
OVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-110012 1. INTRODUCTION R is a free software environment for statistical computing
SPSS 12 Data Analysis Basics Linda E. Lucek, Ed.D. [email protected] 815-753-9516
SPSS 12 Data Analysis Basics Linda E. Lucek, Ed.D. [email protected] 815-753-9516 Technical Advisory Group Customer Support Services Northern Illinois University 120 Swen Parson Hall DeKalb, IL 60115 SPSS
Data exploration with Microsoft Excel: univariate analysis
Data exploration with Microsoft Excel: univariate analysis Contents 1 Introduction... 1 2 Exploring a variable s frequency distribution... 2 3 Calculating measures of central tendency... 16 4 Calculating
MBA 611 STATISTICS AND QUANTITATIVE METHODS
MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain
SPSS Resources. 1. See website (readings) for SPSS tutorial & Stats handout
Analyzing Data SPSS Resources 1. See website (readings) for SPSS tutorial & Stats handout Don t have your own copy of SPSS? 1. Use the libraries to analyze your data 2. Download a trial version of SPSS
STATGRAPHICS Online. Statistical Analysis and Data Visualization System. Revised 6/21/2012. Copyright 2012 by StatPoint Technologies, Inc.
STATGRAPHICS Online Statistical Analysis and Data Visualization System Revised 6/21/2012 Copyright 2012 by StatPoint Technologies, Inc. All rights reserved. Table of Contents Introduction... 1 Chapter
Ohio University Computer Services Center August, 2002 Crystal Reports Introduction Quick Reference Guide
Open Crystal Reports From the Windows Start menu choose Programs and then Crystal Reports. Creating a Blank Report Ohio University Computer Services Center August, 2002 Crystal Reports Introduction Quick
Directions for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...
Data Analysis Stata (version 13)
Data Analysis Stata (version 13) There are many options for learning Stata (www.stata.com). Stata s help facility (accessed by a pull-down menu or by command or by clicking on?) consists of help files
Introduction to StatsDirect, 11/05/2012 1
INTRODUCTION TO STATSDIRECT PART 1... 2 INTRODUCTION... 2 Why Use StatsDirect... 2 ACCESSING STATSDIRECT FOR WINDOWS XP... 4 DATA ENTRY... 5 Missing Data... 6 Opening an Excel Workbook... 6 Moving around
SPSS (Statistical Package for the Social Sciences)
SPSS (Statistical Package for the Social Sciences) What is SPSS? SPSS stands for Statistical Package for the Social Sciences The SPSS home-page is: www.spss.com 2 What can you do with SPSS? Run Frequencies
How to set the main menu of STATA to default factory settings standards
University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be
Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1
Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce
GeoGebra Statistics and Probability
GeoGebra Statistics and Probability Project Maths Development Team 2013 www.projectmaths.ie Page 1 of 24 Index Activity Topic Page 1 Introduction GeoGebra Statistics 3 2 To calculate the Sum, Mean, Count,
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller Getting to know the data An important first step before performing any kind of statistical analysis is to familiarize
Introduction to SPSS 16.0
Introduction to SPSS 16.0 Edited by Emily Blumenthal Center for Social Science Computation and Research 110 Savery Hall University of Washington Seattle, WA 98195 USA (206) 543-8110 November 2010 http://julius.csscr.washington.edu/pdf/spss.pdf
R FOR SAS AND SPSS USERS. Bob Muenchen
R FOR SAS AND SPSS USERS Bob Muenchen I thank the many R developers for providing such wonderful tools for free and all the r help participants who have kindly answered so many questions. I'm especially
Using Excel for Data Manipulation and Statistical Analysis: How-to s and Cautions
2010 Using Excel for Data Manipulation and Statistical Analysis: How-to s and Cautions This document describes how to perform some basic statistical procedures in Microsoft Excel. Microsoft Excel is spreadsheet
SPSS Introduction. Yi Li
SPSS Introduction Yi Li Note: The report is based on the websites below http://glimo.vub.ac.be/downloads/eng_spss_basic.pdf http://academic.udayton.edu/gregelvers/psy216/spss http://www.nursing.ucdenver.edu/pdf/factoranalysishowto.pdf
GETTING YOUR DATA INTO SPSS
GETTING YOUR DATA INTO SPSS UNIVERSITY OF GUELPH LUCIA COSTANZO [email protected] REVISED SEPTEMBER 2011 CONTENTS Getting your Data into SPSS... 0 SPSS availability... 3 Data for SPSS Sessions... 4
Federal Employee Viewpoint Survey Online Reporting and Analysis Tool
Federal Employee Viewpoint Survey Online Reporting and Analysis Tool Tutorial January 2013 NOTE: If you have any questions about the FEVS Online Reporting and Analysis Tool, please contact your OPM point
5 Correlation and Data Exploration
5 Correlation and Data Exploration Correlation In Unit 3, we did some correlation analyses of data from studies related to the acquisition order and acquisition difficulty of English morphemes by both
Introduction to Exploratory Data Analysis
Introduction to Exploratory Data Analysis A SpaceStat Software Tutorial Copyright 2013, BioMedware, Inc. (www.biomedware.com). All rights reserved. SpaceStat and BioMedware are trademarks of BioMedware,
S P S S Statistical Package for the Social Sciences
S P S S Statistical Package for the Social Sciences Data Entry Data Management Basic Descriptive Statistics Jamie Lynn Marincic Leanne Hicks Survey, Statistics, and Psychometrics Core Facility (SSP) July
Exercises on using R for Statistics and Hypothesis Testing Dr. Wenjia Wang
Exercises on using R for Statistics and Hypothesis Testing Dr. Wenjia Wang School of Computing Sciences, UEA University of East Anglia Brief Introduction to R R is a free open source statistics and mathematical
Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.
1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis
Hands-on Exercise Using DataFerrett American Community Survey Public Use Microdata Sample (PUMS)
Hands-on Exercise Using DataFerrett American Community Survey Public Use Microdata Sample (PUMS) U.S. Census Bureau Michael Burns (425) 495-3234 DataFerrett Help http://dataferrett.census.gov/ 1-866-437-0171
Appendix III: SPSS Preliminary
Appendix III: SPSS Preliminary SPSS is a statistical software package that provides a number of tools needed for the analytical process planning, data collection, data access and management, analysis,
R Commander Tutorial
R Commander Tutorial Introduction R is a powerful, freely available software package that allows analyzing and graphing data. However, for somebody who does not frequently use statistical software packages,
Query 4. Lesson Objectives 4. Review 5. Smart Query 5. Create a Smart Query 6. Create a Smart Query Definition from an Ad-hoc Query 9
TABLE OF CONTENTS Query 4 Lesson Objectives 4 Review 5 Smart Query 5 Create a Smart Query 6 Create a Smart Query Definition from an Ad-hoc Query 9 Query Functions and Features 13 Summarize Output Fields
Describing, Exploring, and Comparing Data
24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter
Visualizing Data. Contents. 1 Visualizing Data. Anthony Tanbakuchi Department of Mathematics Pima Community College. Introductory Statistics Lectures
Introductory Statistics Lectures Visualizing Data Descriptive Statistics I Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the
Using Stata for Categorical Data Analysis
Using Stata for Categorical Data Analysis NOTE: These problems make extensive use of Nick Cox s tab_chi, which is actually a collection of routines, and Adrian Mander s ipf command. From within Stata,
Data Analysis in SPSS. February 21, 2004. If you wish to cite the contents of this document, the APA reference for them would be
Data Analysis in SPSS Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Heather Claypool Department of Psychology Miami University
SAS Analyst for Windows Tutorial
Updated: August 2012 Table of Contents Section 1: Introduction... 3 1.1 About this Document... 3 1.2 Introduction to Version 8 of SAS... 3 Section 2: An Overview of SAS V.8 for Windows... 3 2.1 Navigating
Participant Guide RP301: Ad Hoc Business Intelligence Reporting
RP301: Ad Hoc Business Intelligence Reporting State of Kansas As of April 28, 2010 Final TABLE OF CONTENTS Course Overview... 4 Course Objectives... 4 Agenda... 4 Lesson 1: Reviewing the Data Warehouse...
SPSS for Simple Analysis
STC: SPSS for Simple Analysis1 SPSS for Simple Analysis STC: SPSS for Simple Analysis2 Background Information IBM SPSS Statistics is a software package used for statistical analysis, data management, and
Using Excel for Statistical Analysis
2010 Using Excel for Statistical Analysis Microsoft Excel is spreadsheet software that is used to store information in columns and rows, which can then be organized and/or processed. Excel is a powerful
Directions for Frequency Tables, Histograms, and Frequency Bar Charts
Directions for Frequency Tables, Histograms, and Frequency Bar Charts Frequency Distribution Quantitative Ungrouped Data Dataset: Frequency_Distributions_Graphs-Quantitative.sav 1. Open the dataset containing
Appendix 2.1 Tabular and Graphical Methods Using Excel
Appendix 2.1 Tabular and Graphical Methods Using Excel 1 Appendix 2.1 Tabular and Graphical Methods Using Excel The instructions in this section begin by describing the entry of data into an Excel spreadsheet.
The Chi-Square Test. STAT E-50 Introduction to Statistics
STAT -50 Introduction to Statistics The Chi-Square Test The Chi-square test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed
Working with SPSS. A Step-by-Step Guide For Prof PJ s ComS 171 students
Working with SPSS A Step-by-Step Guide For Prof PJ s ComS 171 students Contents Prep the Excel file for SPSS... 2 Prep the Excel file for the online survey:... 2 Make a master file... 2 Clean the data
Getting started with qplot
Chapter 2 Getting started with qplot 2.1 Introduction In this chapter, you will learn to make a wide variety of plots with your first ggplot2 function, qplot(), short for quick plot. qplot makes it easy
TI-Inspire manual 1. Instructions. Ti-Inspire for statistics. General Introduction
TI-Inspire manual 1 General Introduction Instructions Ti-Inspire for statistics TI-Inspire manual 2 TI-Inspire manual 3 Press the On, Off button to go to Home page TI-Inspire manual 4 Use the to navigate
Microsoft Excel 2010 Part 3: Advanced Excel
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Microsoft Excel 2010 Part 3: Advanced Excel Winter 2015, Version 1.0 Table of Contents Introduction...2 Sorting Data...2 Sorting
SPSS and AM statistical software example.
A detailed example of statistical analysis using the NELS:88 data file and ECB, to perform a longitudinal analysis of 1988 8 th graders in the year 2000: SPSS and AM statistical software example. Overall
INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21
INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21 This section covers the basic structure and commands of SPSS for Windows Release 21. It is not designed to be a comprehensive review of the most important
Analysis of categorical data: Course quiz instructions for SPSS
Analysis of categorical data: Course quiz instructions for SPSS The dataset Please download the Online sales dataset from the Download pod in the Course quiz resources screen. The filename is smr_bus_acd_clo_quiz_online_250.xls.
Graphics in R. Biostatistics 615/815
Graphics in R Biostatistics 615/815 Last Lecture Introduction to R Programming Controlling Loops Defining your own functions Today Introduction to Graphics in R Examples of commonly used graphics functions
Tutorial: Get Running with Amos Graphics
Tutorial: Get Running with Amos Graphics Purpose Remember your first statistics class when you sweated through memorizing formulas and laboriously calculating answers with pencil and paper? The professor
Advanced Excel for Institutional Researchers
Advanced Excel for Institutional Researchers Presented by: Sandra Archer Helen Fu University Analysis and Planning Support University of Central Florida September 22-25, 2012 Agenda Sunday, September 23,
Introduction to STATA 11 for Windows
1/27/2012 Introduction to STATA 11 for Windows Stata Sizes...3 Documentation...3 Availability...3 STATA User Interface...4 Stata Language Syntax...5 Entering and Editing Stata Commands...6 Stata Online
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
ECONOMICS 351* -- Stata 10 Tutorial 2. Stata 10 Tutorial 2
Stata 10 Tutorial 2 TOPIC: Introduction to Selected Stata Commands DATA: auto1.dta (the Stata-format data file you created in Stata Tutorial 1) or auto1.raw (the original text-format data file) TASKS:
SPSS The Basics. Jennifer Thach RHS Assessment Office March 3 rd, 2014
SPSS The Basics Jennifer Thach RHS Assessment Office March 3 rd, 2014 Why use SPSS? - Used heavily in the Social Science & Business world - Ability to perform basic to high-level statistical analysis (i.e.
