Important Things to Know about Stata
|
|
|
- Ross McLaughlin
- 9 years ago
- Views:
Transcription
1 Important Things to Know about Stata
2 Accessing Stata Stata 14.0 is available in all clusters and classrooms on campus. You may also purchase it at a substantial discount through Notre Dame s GradPlan. For more information, go to The version you will probably need is Stata/IC. A First Look at Stata When you open Stata, you will see four windows (if you see a Properties window, just close it). The four windows are Review, Results, Command, and Variables. Let s start with the Command window, at the bottom. This is where you enter any commands you give to Stata (outside of a dofile, described below). Once you enter a command, it appears in the Review window. This can be a useful record of what you have done so far, and double-clicking on a command in the Review window executes that command again. Single-clicking on a command will bring it into the Command window for you to modify. Once Stata has executed your command, the results appear in the Results window. All summary statistics, regression results, etc. will be displayed here. Finally, the Variables window lists the variables available in your data set. In the toolbar at the top, you will see several icons. Beyond the standard (open, save, print...), the most important are those used to access the do-file generator and the data editor. The do-file icon looks like a pencil writing on a piece of paper. The data editor is a pencil on a spreadsheet. The Data Editor: If you want to look at your data, the best way to do this is using the data editor (or the data browser which allows you to look at but not alter your data). Clicking on the icon for the data editor will bring up your data, and you can look at all of your observations. Often a useful thing to do in the data editor is to Sort a variable. This can help you organize your data and spot problems. To sort in Stata 13, go to the Data menu at the top of the editor, and then hover on Sort and choose an option. From the drop-down menu, choose the variable or variables you wish to sort on, and then click OK. Do Files: Stata can be used interactively just type in a command at the command line, and Stata executes that command. Nonetheless, it can be very helpful to have a file of commands that are executed, rather than simply typing them in one at a time. Such a file of commands is called a do file, and you should think of do-files as programs that you write for Stata. One extremely helpful way to use a do-file is for data cleaning. You do not want to have to create the same new variables, drop observations, merge in new data, etc., line-by-line every time you want to start working with your data. A do-file can be written to do the same thing every time, very quickly. To write a do-file, open the window and type commands, notepad-style. Then save the file, with a name like myfile.do. To execute it you open the do-file in the do-file window, click on tools
3 and then execute (do). Alternatively, you could type the command do C:/kbuckles/myfile. You can also run parts of a do-file from the do-file window by highlighting the commands you want to run and then clicking on tools and execute (do). You will find it useful to put comments in your do file. Any line starting with * is interpreted as a comment and is not executed. You may run into trouble if you have very long commands. For legibility, it is easiest to split them across two lines, but then Stata thinks the command is incomplete. To get around this, you can end a line of text with /*, and start the next line with */. This tells Stata that the two lines are part of the same command. For example, this is a single command: reg matchs fipsocc fipsres age mrace3 meduc6 married order mpre5 /* */ adequacy pldel attend, r Getting Help in Stata To get more information on any of the commands discussed here, you can use Stata s help feature. Use the Stata Command... option in Help for information about a specific command. For example, the help file for summarize or sum will give you information on the summarize command, its syntax and options. If you are not sure of the command names, you can try the Search... option. Finally, Stata has online help available at If all else fails, google can occasionally turn up useful user-generated help. You can also work through Stata tutorials just start the first one up by typing: tutorial intro. Getting Started with Data Directory: A useful thing to do at the beginning of any Stata session (and in the first lines of any do-file) is to set the directory. This tells Stata where to find all of the data, do, or log files that you will be using. For example, suppose you put all of your files in a folder on the C:\ drive, named kbuckles\ec At the beginning of your session, type cd C:\ kbuckles\ec As long as the files are in this folder, you only have to type the file name to access them. If I want to use a data set named alcohol.dta, I only have to type use alcohol.dta to access it. If I didn t set the directory first, I d have to type use C:\ kbuckles\ec43550\alcohol.dta. Using and Saving Data Sets You can always open a data set from within Stata by typing use mydata where mydata is the name of the data set (or use C:\kbuckles\ec43550\mydata if you have not set the directory). If you can t remember the name, you can always choose open from the menu and browse your disk to find it. Stata will then type the appropriate command in the command window for you. If you have made changes to the data set, you can save the new data. Just type save mynewdata. If mynewdata
4 already exists, Stata will not let you overwrite it you will have to throw out the old one first. Alternatively, you can type save, replace to overwrite it. Be very careful using the replace option. Once Stata has overwritten a file, it is gone! You can usually also open a Stata data set just by clicking on it from your file directory. This doesn t always work, for example if the data set was made on Windows and you re using a Mac, or the data set is too big for your default memory allocation. Log Files: Log files are extremely important. They are used to store the work you do in order to view it later, show it to someone, or print it. You will want to start a log file almost any time you start working in Stata. To do so, type log using logfilename.log at the command line (or include this command in your do-file). This will create a file called logfilename.log that will save the Stata commands you execute, as well as the output. If you do not open a log file, your work will not be saved. You can name your log file anything you want, but if it is not something.log, it will be in a strange Stata format rather than in a text format that can easily be viewed and edited in Word. If you want to overwrite a log file (for example, if you run a do-file that has errors in it, then fix the errors and run it again), you can use log using logfilename.log, replace. Be careful with the replace option, though, as it will overwrite the existing file without warning you. Very important in order to save the log file, you have to type log close when you are done. If you do not close the log file and close Stata, your log file will be lost! Basic Commands for Data Analysis Descriptive Commands: It is always a good idea to know what the data you are using are like. Typing describe will tell you what the names of the variables are, whether they are strings or numbers, and if the data set has been labeled, what the variable is. Creating labels for your variables is a good idea. While some variables can be given a fairly mnemonic name, for others it is useful to see a more in-depth description. You can type label var myvar description of what myvar is to assign a label to the variable myvar. You can also label values of categorical variables to make self-explanatory tables. For example, suppose you have a variable named sex that is 1 if the individual is male and 2 if they are female. You can type label define sexlabel 1 male 2 female and then label values sex sexlabel to associate the correct category labels with the numbers, but the variable is still a number. In addition to knowing what kind of variables you have, you will want to see if the data make sense. Looking at some simple summary statistics is a good way to do this. Typing summarize (or just sum) will give you the number of nonmissing observations, the mean, the standard deviation, the minimum and the maximum for every variable. If you just want to look at one or two variables, type sum var1 or sum var1 var2. Sometimes you want more detail about the distribution of a variable. Typing sum var1, detail will give you additional statistics, such as the median. For categorical variables, you may just want to look at a frequency distribution. The tabulate (or just tab) command is what you need. Typing tab x1 will give you the frequency distribution of the
5 variable named x1. Two-way tables are also possible. Typing tab x1 x2 will just give you the frequencies. To get per row, column and cell percentages, type tab x1 x2, row col cell instead. Note that cross-tabs with too many cells are not possible. You only want to do this for variables with a fairly limited number of values. If a variable has a value label, the labels, rather than the numbers will appear in the table. More complicated tables can be made using the table command. Assuming you have the variables wage, sex and year in your data, you could type table year sex, c(mean wage) to look at average wages for males and females by year. Other descriptive statistics (up to 5 total) can be included in parentheses. For example, table year sex, c(mean wage median wage n wage std wage max wage) would add in the median wage, the number of observations, the standard deviation and the maximum. The descriptives don t have to be all about one variable. Suppose you also have an education variable. You could type table year sex, c(mean wage std wage mean education std education) to get the mean and standard deviation of both wages and education by sex and year. Data can also be summarized graphically. There are many graph commands in Stata, and the best way to see them all is to do a command search on graph using the help menu. However, two important ones for this class are hist and scatter. The hist command makes a histogram of a variable. To look at a histogram for the variable x1, you just type hist x1. Sometimes you will want to break the histogram into smaller or larger intervals, so you need to change the bin number. The bin size is the number of bars in the histogram. For example: hist x1, bin(30) [the bin can range from 2 to 50, and has a default of 5.] If you want a plot of one variable against another (a scatterplot), use the scatter command, such as scatter y1 x1. Like all graph commands, there are a lot of options you can choose the dot size, color, etc. You may have to do some of these things to make a graph that makes sense. Be patient, and use the help menu. Stata will not save graphs in the log file! You can print them when they appear on the screen by clicking on file and then print graph. You can also save the graph to a separate file that can be printed later by typing, for example: hist x1, bin(30) saving(graph1) or simply by clicking on file and then save graph. Then, if you want to see the graph later you can type graph use graph1, which brings the graph back up, and you can print as before. Finally, you can copy graphs directly into a Word document by clicking on edit then copy graph, and then going to the document and pasting the graph. Creating New Variables To create new variables, use the generate command (or gen) followed by the name of the new variable and an expression defining it. This command is best described by examples: gen xplusy=x+y gen halfx=x/2 (or gen halfx=x*.5) gen xsquared=x^2 (or gen xsquared=x**2 either form of exponentiation works) gen logx=log(x) (or gen logx=ln(x), as either form implies the natural log) There are a few special types of variables for which need additional discussion. Dummy variables can be created by using an expression that evaluates to either true (=1) or false (=0). Suppose that you want a dummy variable equal to 1 for males, 0 for females and currently you have a variable named sex that is 1 for females and 2 for males. You could type gen male=sex==2, which tell Stata that male = 1 when sex is exactly equal to 2. You could also do gen male=sex~=1, which tells Stata that male = 1 when sex is not equal to 1. Finally, you could do this with a couple of
6 commands by first creating a new variable, with all values equal 0: gen male=0. You then replace the zeroes with ones for the males: replace male=1 if sex==2. It is important to note that for expressions that are either true or false, Stata requires double equal signs (==). The single equal sign is only for assigning a value. Not equal is ~=. Other possibilities are: > (greater than), < (less than), >= (greater or equal than), <= (less or equal than). These same expressions can be used with the if qualifier (described below). To combine expressions, use & for and, and use for or. So, typing gen whitemale=(male==1 & white==1) will create a dummy for white males and typing gen blkorhisp=(black==1 hispan==1) will create a dummy for being black or hispanic. Dummy variables can also be created for categorical variables by using an option to the tabulate command. Suppose that marstat was a variable equal to 0 for never married, 1 for married, living with spouse, and 2 for married, not living with spouse. Typing tabulate marstat, gen(mardum) would result in both a frequency tabulation of marstat, and the creation of three dummy variables: mardum1 is 1 if marstat=0, and is 0 otherwise; mardum2 is 1 if marstat=1, and is 0 otherwise; mardum3 is 1 if marstat=2, and is 0 otherwise. Lagged variables are also easy to create, as long as you know the data are in the correct order. Stata has an internal variable _n that is the observation number. Typing gen lagx=x[_n-1] will create the lag. To make sure data is in order, you need the sort command. Suppose this is time series data with one observation per year. Assuming you have a variable named year, just type sort year before generating the lag. Panel data is trickier. Suppose you have 5 years of data for each of 50 states. Then type sort state year and then by state: gen lagx=x[_n-1] to properly make lags for each state. If you do not start with by state: then the first observation of the next state will assign the last observation of the previous state as the lag. A fancier, more powerful version of gen is egen. It is most useful for when you want to attach a summary statistic to each observation in the data set. For example, suppose you have stock data and want a market return. Typing egen mktreturn=mean(return) will create the variable mktreturn that is equal to the mean of all of the firm returns in the data set. Alternatively, typing egen indreturn=mean(return), by(industry) would create a new variable that is the mean return for each firm within a firm's industry. Egen has many other options; its help file is very useful. In all cases if the variable you want to define already exists, and you want to redefine it, you need to use the replace command instead of generate. Alternatively, you could type drop newx and then gen newx=expression. You cannot use gen for a variable that exists, and you cannot use replace for a variable that does not exist. Using the if qualifier Any Stata command can be carried out for just a subset of the data by using the if qualifier. For example, you would type sum x y z if age>=18 to get descriptive statistics only for adults. An alternative way to define a dummy variable for males would the two-step process of gen male=1 if sex==2 and replace male=0 if sex==1. Essentially, you just add if expression to the end of any command. The one exception is that the if qualifer comes before options, that is, before a comma. Thus, to get detailed summary statistics for just adults, type sum x y z if age>=18, detail or to do a regression (see below) on just adults, but with standard errors corrected for possible
7 heteroskedasticity, type reg x y z if age>=18, robust. It is important to remember that Stata considers missing values, represented by a period, to be the highest possible number. Suppose you want to create an income variable that is only defined for people with positive earnings. If you type gen incwearn=income if earnings>0 you will include those with missing earnings. Instead, you should type gen incwearn=income if earnings>0 & earnings ~=. to properly define your variable. Running Regressions The syntax for running regressions is similar for most specifications. The first word is the type of regression you want to run. The second word is the dependent variable, followed by a list of independent variables. Last, you can type a comma followed by options. Some examples: reg y x1 x2 reg means do OLS here you re regressing y on x1 and x2 reg y x1 x2, level(90) the level(x) option changes the reported confidence bands to x% reg y x1 x2, nocons the nocons option runs the regression with no constant term reg y x1 x2, robust the robust option corrects the standard errors for heteroskedasticity reg y x1 x2, cluster(state) the cluster option corrects for heteroskedasticity & within state correlation reg y x1 x2 [w=z] WLS where z is the weight predict yhat predicts fitted values after the last estimation command predict uhat, resid (or predicts residuals after an estimation command. Note that it is the option predict uhat, r) resid that makes it the residual, not naming it uhat instead of yhat! test x1=x2 run after reg y x1 x2, tests whether coefficients on x and z are the same test x1 x2 run after reg y x1 x2, tests whether coefficient on x and z each equal 0 (jointly) probit y x1 x2 run a probit regression, reporting coefficient estimates dprobit y x1 x2 same as probit, but reports changes in probabilities instead logit y x1 x2 run a logit regression, reporting coefficient estimates tobit y x1 x2, ul run a tobit regression where y is topcoded, reporting coefficient estimates heckman y x1, select(z x1) run a heckman selection corrected regression of y on x1, identified by z areg y x1, absorb(year) regression including year fixed-effects (allows robust option) xtreg y x1, fe i(year) alternative regression including year fixed-effects (no options allowed) xtreg y x1, re i(year) regression including random effects for year reg y x1 (z) regresses y on x1, using z as an instrument for x1 ivreg y (x1=z) alternative form of above IV regression reg y x1 x2 (z x2) regresses y on x1 x2, using z as an instrument for x1 ivreg y (x1=z) x2 alternative form of above IV regression tsset timevar prepares to use the following time series commands prais y x1 x2 performs Prais-Winsten estimation (vs OLS from reg y x1 x2) prais y x1 x2, corc use the Cochrane-Orcutt method instead newey y x1 x2, lag(4) t(year) performs Newey-West estimation assuming 4 years of correlation You can include similar variables in a regression (or in other commands) by using an asterisk as a wild card. For example, if you create 8 region dummies using the tab region, gen(regdum) command, you can include the 8 dummies by typing regdum* in the covariate list. Using Stata as a Statistical Table and Calculator You can use the display command along with any expression that you would use in a gen command to use Stata as a calculator. Typing display 2+2 will result in 4 being displayed and display 3*4
8 will result in 12, etc. This command is most useful in concert with the built-in statistical functions that let you skip looking up critical values in a table and can directly calculate p-values for you. For example, if you have a t-statistic that is 1.4 from a model with 25 degrees of freedom, typing display ttail(25, 1.4) will display.087, which is the p-value for a 1-sided t-test (just double it for a 2-sided test). Typing display invttail(25,.087) would display 1.4. Similar functions are Ftail(ndf, ddf, F) and its counterpart invftail(ndf, ddf, p) for an F-statistic, F, with ndf and ddf numerator and denominator degrees of freedom respectively, and chi2tail(df, x) and invchi2tail(df, p) for a chi-squared statistic, x. Bringing Data into Stata (Note: you can skip the next two sections until you need to do this for our class, I will provide you with data that is already in Stata format.) Sometimes you will be starting with a text file or a spreadsheet (like Excel) and want to convert it into Stata (.dta) format. If you start with a spreadsheet (which is probably easiest), you should save it as a text file, tabdelimited or comma-separated. You can then read it into Stata using the insheet command, for example: insheet using mydata.csv. If you have data that are too big for one Excel sheet, you could do it piece by piece and then append the data sets. Usually when you have a large data set, you will need to read in the data directly using either infile or infix. The best choice depends on your data, so you will want to type help infile and help infix to see how each command works. Consider the simplest case where you have a text file organized with rows as observations and columns as variables. Let s say your data file is named data.txt and the top of your data file looks like this: m f You will need a dictionary file to tell Stata how to read the text file. You need to give this file the extension.dct, for example mydict.dct. The file should look like this: dictionary using data.txt { float age age of respondent float income in thousands of dollars str1 sex m = male, f = female } The first column (float or str1, here) tells Stata what kind of variable to look for float is a number, and str# is a string (meaning letters or a combination of letters and numbers) of length #. Here, your sex variable takes on the values m and f. Note that float is the default you don t actually have to type it. The second column is the name of the variable that Stata will use note that Stata is case-sensitive. The third column is optional, and contains labels for the variables that
9 are reminders to you or information for anyone else using your data set. The last curly bracket is important don t leave it off! Once you ve written and saved the dictionary file, you open Stata, and type the command infile using mydict.dct. Stata will then read the in data.txt using the dictionary from mydict.dct. It will tell you how many observations it read in, what the variables are, etc. Be sure that this agrees with your expectations! You can also type data directly into Stata, but I wouldn t recommend it. If you re typing data in, use a spreadsheet like Excel, save the data as text, and proceed as above. Bringing Data Sets Together in Stata If you want to combine multiple data sets, you can do so using the append command. For example, suppose you have two years of CPS data (CPS98.dta and CPS99.dta) that you want to combine and work with together. You would open up CPS98.dta, and then add the other by typing append using CPS99.dta. You might also want to merge a data set onto another, to bring in the additional variables in the second data set. For example, suppose you have CPS data that identifies the respondent s state of residence (called stateres ). You want to merge in state-level unemployment rates to use as a control variable in your regressioins. The unemployment rates are in a separate data set called unemp.dta. First, make sure that the state variable in unemp.dta is also called stateres and is coded exactly the same way as in your original data set. Sort the stateres variable by opening unemp.dta and typing sort stateres. Save unemp.dta. Then open your CPS data and sort stateres there as well. Then merge in the unemployment rates by typing _merge stateres using unemp.dta. In addition to adding the unemployment rates, Stata will also create a new variable called _merge Once you have done this, it is helpful to tab _merge to see whether all the observations in your original data set were matched to an unemployment rate or not. Use the help function for merge to help you understand the results of the tab _merge commend. More Sources of Support If you need help beyond this document, or the resources listed above, please contact me or James Ng, the Economics librarian at [email protected]. He has extensive experience working with data in Stata and should be very helpful. If you are interested in ordering any additional Stata resources, such as user manuals, you can check out the bookstore at and see what is available. Items of possible interest include a textbook-style reference called Statistics with Stata and a Getting Started manual for Windows or Mac.
Getting started with the Stata
Getting started with the Stata 1. Begin by going to a Columbia Computer Labs. 2. Getting started Your first Stata session. Begin by starting Stata on your computer. Using a PC: 1. Click on start menu 2.
Stata Tutorial. 1 Introduction. 1.1 Stata Windows and toolbar. Econometrics676, Spring2008
1 Introduction Stata Tutorial Econometrics676, Spring2008 1.1 Stata Windows and toolbar Review : past commands appear here Variables : variables list Command : where you type your commands Results : results
Appendix III: SPSS Preliminary
Appendix III: SPSS Preliminary SPSS is a statistical software package that provides a number of tools needed for the analytical process planning, data collection, data access and management, analysis,
Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1
Bill Burton Albert Einstein College of Medicine [email protected] April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce
SAS Analyst for Windows Tutorial
Updated: August 2012 Table of Contents Section 1: Introduction... 3 1.1 About this Document... 3 1.2 Introduction to Version 8 of SAS... 3 Section 2: An Overview of SAS V.8 for Windows... 3 2.1 Navigating
ECONOMICS 351* -- Stata 10 Tutorial 2. Stata 10 Tutorial 2
Stata 10 Tutorial 2 TOPIC: Introduction to Selected Stata Commands DATA: auto1.dta (the Stata-format data file you created in Stata Tutorial 1) or auto1.raw (the original text-format data file) TASKS:
Introduction Course in SPSS - Evening 1
ETH Zürich Seminar für Statistik Introduction Course in SPSS - Evening 1 Seminar für Statistik, ETH Zürich All data used during the course can be downloaded from the following ftp server: ftp://stat.ethz.ch/u/sfs/spsskurs/
How to set the main menu of STATA to default factory settings standards
University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be
SPSS and AM statistical software example.
A detailed example of statistical analysis using the NELS:88 data file and ECB, to perform a longitudinal analysis of 1988 8 th graders in the year 2000: SPSS and AM statistical software example. Overall
Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:
Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
SPSS Explore procedure
SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,
Directions for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...
ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS
DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.
INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21
INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21 This section covers the basic structure and commands of SPSS for Windows Release 21. It is not designed to be a comprehensive review of the most important
There are six different windows that can be opened when using SPSS. The following will give a description of each of them.
SPSS Basics Tutorial 1: SPSS Windows There are six different windows that can be opened when using SPSS. The following will give a description of each of them. The Data Editor The Data Editor is a spreadsheet
A Short Guide to Stata 13
Short Guides to Microeconometrics Fall 2015 Prof. Dr. Kurt Schmidheiny Universität Basel A Short Guide to Stata 13 1 Introduction 2 2 The Stata Environment 2 3 Where to get help 3 4 Additions to Stata
Statgraphics Getting started
Statgraphics Getting started The aim of this exercise is to introduce you to some of the basic features of the Statgraphics software. Starting Statgraphics 1. Log in to your PC, using the usual procedure
SAS: A Mini-Manual for ECO 351 by Andrew C. Brod
SAS: A Mini-Manual for ECO 351 by Andrew C. Brod 1. Introduction This document discusses the basics of using SAS to do problems and prepare for the exams in ECO 351. I decided to produce this little guide
Tutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller
Tutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller Most of you want to use R to analyze data. However, while R does have a data editor, other programs such as excel are often better
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
Excel Tutorial. Bio 150B Excel Tutorial 1
Bio 15B Excel Tutorial 1 Excel Tutorial As part of your laboratory write-ups and reports during this semester you will be required to collect and present data in an appropriate format. To organize and
Using SPSS, Chapter 2: Descriptive Statistics
1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,
Basics of STATA. 1 Data les. 2 Loading data into STATA
Basics of STATA This handout is intended as an introduction to STATA. STATA is available on the PCs in the computer lab as well as on the Unix system. Throughout, bold type will refer to STATA commands,
Data Analysis Stata (version 13)
Data Analysis Stata (version 13) There are many options for learning Stata (www.stata.com). Stata s help facility (accessed by a pull-down menu or by command or by clicking on?) consists of help files
Longitudinal Data Analysis: Stata Tutorial
Part A: Overview of Stata I. Reading Data: Longitudinal Data Analysis: Stata Tutorial use Read data that have been saved in Stata format. infile Read raw data and dictionary files. insheet Read spreadsheets
Intermediate PowerPoint
Intermediate PowerPoint Charts and Templates By: Jim Waddell Last modified: January 2002 Topics to be covered: Creating Charts 2 Creating the chart. 2 Line Charts and Scatter Plots 4 Making a Line Chart.
IBM SPSS Statistics 20 Part 1: Descriptive Statistics
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 1: Descriptive Statistics Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the
SPSS for Simple Analysis
STC: SPSS for Simple Analysis1 SPSS for Simple Analysis STC: SPSS for Simple Analysis2 Background Information IBM SPSS Statistics is a software package used for statistical analysis, data management, and
EXST SAS Lab Lab #4: Data input and dataset modifications
EXST SAS Lab Lab #4: Data input and dataset modifications Objectives 1. Import an EXCEL dataset. 2. Infile an external dataset (CSV file) 3. Concatenate two datasets into one 4. The PLOT statement will
INTRODUCTION TO SPSS FOR WINDOWS Version 19.0
INTRODUCTION TO SPSS FOR WINDOWS Version 19.0 Winter 2012 Contents Purpose of handout & Compatibility between different versions of SPSS.. 1 SPSS window & menus 1 Getting data into SPSS & Editing data..
SPSS Manual for Introductory Applied Statistics: A Variable Approach
SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All
Introduction to STATA 11 for Windows
1/27/2012 Introduction to STATA 11 for Windows Stata Sizes...3 Documentation...3 Availability...3 STATA User Interface...4 Stata Language Syntax...5 Entering and Editing Stata Commands...6 Stata Online
EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.
EXCEL Tutorial: How to use EXCEL for Graphs and Calculations. Excel is powerful tool and can make your life easier if you are proficient in using it. You will need to use Excel to complete most of your
Introduction to SPSS 16.0
Introduction to SPSS 16.0 Edited by Emily Blumenthal Center for Social Science Computation and Research 110 Savery Hall University of Washington Seattle, WA 98195 USA (206) 543-8110 November 2010 http://julius.csscr.washington.edu/pdf/spss.pdf
Microsoft Excel 2010 Part 3: Advanced Excel
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Microsoft Excel 2010 Part 3: Advanced Excel Winter 2015, Version 1.0 Table of Contents Introduction...2 Sorting Data...2 Sorting
5. Correlation. Open HeightWeight.sav. Take a moment to review the data file.
5. Correlation Objectives Calculate correlations Calculate correlations for subgroups using split file Create scatterplots with lines of best fit for subgroups and multiple correlations Correlation The
SPSS 12 Data Analysis Basics Linda E. Lucek, Ed.D. [email protected] 815-753-9516
SPSS 12 Data Analysis Basics Linda E. Lucek, Ed.D. [email protected] 815-753-9516 Technical Advisory Group Customer Support Services Northern Illinois University 120 Swen Parson Hall DeKalb, IL 60115 SPSS
SPSS: Getting Started. For Windows
For Windows Updated: August 2012 Table of Contents Section 1: Overview... 3 1.1 Introduction to SPSS Tutorials... 3 1.2 Introduction to SPSS... 3 1.3 Overview of SPSS for Windows... 3 Section 2: Entering
An introduction to IBM SPSS Statistics
An introduction to IBM SPSS Statistics Contents 1 Introduction... 1 2 Entering your data... 2 3 Preparing your data for analysis... 10 4 Exploring your data: univariate analysis... 14 5 Generating descriptive
Descriptive Statistics
Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. It includes the construction of graphs, charts and tables, as well various descriptive measures such
Module 2 Basic Data Management, Graphs, and Log-Files
AGRODEP Stata Training April 2013 Module 2 Basic Data Management, Graphs, and Log-Files Manuel Barron 1 and Pia Basurto 2 1 University of California, Berkeley, Department of Agricultural and Resource Economics
Gestation Period as a function of Lifespan
This document will show a number of tricks that can be done in Minitab to make attractive graphs. We work first with the file X:\SOR\24\M\ANIMALS.MTP. This first picture was obtained through Graph Plot.
Data Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
An Introduction to SPSS. Workshop Session conducted by: Dr. Cyndi Garvan Grace-Anne Jackman
An Introduction to SPSS Workshop Session conducted by: Dr. Cyndi Garvan Grace-Anne Jackman Topics to be Covered Starting and Entering SPSS Main Features of SPSS Entering and Saving Data in SPSS Importing
SPSS Step-by-Step Tutorial: Part 1
SPSS Step-by-Step Tutorial: Part 1 For SPSS Version 11.5 DataStep Development 2004 Table of Contents 1 SPSS Step-by-Step 5 Introduction 5 Installing the Data 6 Installing files from the Internet 6 Installing
Scatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
Getting started in Excel
Getting started in Excel Disclaimer: This guide is not complete. It is rather a chronicle of my attempts to start using Excel for data analysis. As I use a Mac with OS X, these directions may need to be
Directions for Frequency Tables, Histograms, and Frequency Bar Charts
Directions for Frequency Tables, Histograms, and Frequency Bar Charts Frequency Distribution Quantitative Ungrouped Data Dataset: Frequency_Distributions_Graphs-Quantitative.sav 1. Open the dataset containing
Data analysis and regression in Stata
Data analysis and regression in Stata This handout shows how the weekly beer sales series might be analyzed with Stata (the software package now used for teaching stats at Kellogg), for purposes of comparing
2: Entering Data. Open SPSS and follow along as your read this description.
2: Entering Data Objectives Understand the logic of data files Create data files and enter data Insert cases and variables Merge data files Read data into SPSS from other sources The Logic of Data Files
Figure 1. An embedded chart on a worksheet.
8. Excel Charts and Analysis ToolPak Charts, also known as graphs, have been an integral part of spreadsheets since the early days of Lotus 1-2-3. Charting features have improved significantly over the
Coins, Presidents, and Justices: Normal Distributions and z-scores
activity 17.1 Coins, Presidents, and Justices: Normal Distributions and z-scores In the first part of this activity, you will generate some data that should have an approximately normal (or bell-shaped)
The Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
SPSS (Statistical Package for the Social Sciences)
SPSS (Statistical Package for the Social Sciences) What is SPSS? SPSS stands for Statistical Package for the Social Sciences The SPSS home-page is: www.spss.com 2 What can you do with SPSS? Run Frequencies
CHARTS AND GRAPHS INTRODUCTION USING SPSS TO DRAW GRAPHS SPSS GRAPH OPTIONS CAG08
CHARTS AND GRAPHS INTRODUCTION SPSS and Excel each contain a number of options for producing what are sometimes known as business graphics - i.e. statistical charts and diagrams. This handout explores
The Center for Teaching, Learning, & Technology
The Center for Teaching, Learning, & Technology Instructional Technology Workshops Microsoft Excel 2010 Formulas and Charts Albert Robinson / Delwar Sayeed Faculty and Staff Development Programs Colston
January 26, 2009 The Faculty Center for Teaching and Learning
THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i
SPSS Introduction. Yi Li
SPSS Introduction Yi Li Note: The report is based on the websites below http://glimo.vub.ac.be/downloads/eng_spss_basic.pdf http://academic.udayton.edu/gregelvers/psy216/spss http://www.nursing.ucdenver.edu/pdf/factoranalysishowto.pdf
MS Excel Template Building and Mapping for Neat 5
MS Excel Template Building and Mapping for Neat 5 Neat 5 provides the opportunity to export data directly from the Neat 5 program to an Excel template, entering in column information using receipts saved
Projects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
Forecasting in STATA: Tools and Tricks
Forecasting in STATA: Tools and Tricks Introduction This manual is intended to be a reference guide for time series forecasting in STATA. It will be updated periodically during the semester, and will be
TIPS FOR DOING STATISTICS IN EXCEL
TIPS FOR DOING STATISTICS IN EXCEL Before you begin, make sure that you have the DATA ANALYSIS pack running on your machine. It comes with Excel. Here s how to check if you have it, and what to do if you
SHORT COURSE ON Stata SESSION ONE Getting Your Feet Wet with Stata
SHORT COURSE ON Stata SESSION ONE Getting Your Feet Wet with Stata Instructor: Cathy Zimmer 962-0516, [email protected] 1) INTRODUCTION a) Who am I? Who are you? b) Overview of Course i) Working with
SECTION 5: Finalizing Your Workbook
SECTION 5: Finalizing Your Workbook In this section you will learn how to: Protect a workbook Protect a sheet Protect Excel files Unlock cells Use the document inspector Use the compatibility checker Mark
Microsoft Excel Tips & Tricks
Microsoft Excel Tips & Tricks Collaborative Programs Research & Evaluation TABLE OF CONTENTS Introduction page 2 Useful Functions page 2 Getting Started with Formulas page 2 Nested Formulas page 3 Copying
einstruction CPS (Clicker) Instructions
Two major approaches to run Clickers a. Anonymous b. Tracked Student picks any pad as s/he enters classroom; Student responds to question, but pad is not linked to student; Good for controversial questions,
Advanced Excel Charts : Tables : Pivots : Macros
Advanced Excel Charts : Tables : Pivots : Macros Charts In Excel, charts are a great way to visualize your data. However, it is always good to remember some charts are not meant to display particular types
5 Correlation and Data Exploration
5 Correlation and Data Exploration Correlation In Unit 3, we did some correlation analyses of data from studies related to the acquisition order and acquisition difficulty of English morphemes by both
IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the
Excel Charts & Graphs
MAX 201 Spring 2008 Assignment #6: Charts & Graphs; Modifying Data Due at the beginning of class on March 18 th Introduction This assignment introduces the charting and graphing capabilities of SPSS and
2Creating Reports: Basic Techniques. Chapter
2Chapter 2Creating Reports: Chapter Basic Techniques Just as you must first determine the appropriate connection type before accessing your data, you will also want to determine the report type best suited
Introduction to SPSS (version 16) for Windows
Introduction to SPSS (version 16) for Windows Practical workbook Aims and Learning Objectives By the end of this course you will be able to: get data ready for SPSS create and run SPSS programs to do simple
Introduction to the TI Connect 4.0 software...1. Using TI DeviceExplorer...7. Compatibility with graphing calculators...9
Contents Introduction to the TI Connect 4.0 software...1 The TI Connect window... 1 Software tools on the Home screen... 2 Opening and closing the TI Connect software... 4 Using Send To TI Device... 4
The KaleidaGraph Guide to Curve Fitting
The KaleidaGraph Guide to Curve Fitting Contents Chapter 1 Curve Fitting Overview 1.1 Purpose of Curve Fitting... 5 1.2 Types of Curve Fits... 5 Least Squares Curve Fits... 5 Nonlinear Curve Fits... 6
Learning SPSS: Data and EDA
Chapter 5 Learning SPSS: Data and EDA An introduction to SPSS with emphasis on EDA. SPSS (now called PASW Statistics, but still referred to in this document as SPSS) is a perfectly adequate tool for entering
Data Analysis. Using Excel. Jeffrey L. Rummel. BBA Seminar. Data in Excel. Excel Calculations of Descriptive Statistics. Single Variable Graphs
Using Excel Jeffrey L. Rummel Emory University Goizueta Business School BBA Seminar Jeffrey L. Rummel BBA Seminar 1 / 54 Excel Calculations of Descriptive Statistics Single Variable Graphs Relationships
Psy 210 Conference Poster on Sex Differences in Car Accidents 10 Marks
Psy 210 Conference Poster on Sex Differences in Car Accidents 10 Marks Overview The purpose of this assignment is to compare the number of car accidents that men and women have. The goal is to determine
4 Other useful features on the course web page. 5 Accessing SAS
1 Using SAS outside of ITCs Statistical Methods and Computing, 22S:30/105 Instructor: Cowles Lab 1 Jan 31, 2014 You can access SAS from off campus by using the ITC Virtual Desktop Go to https://virtualdesktopuiowaedu
Using Word 2007 For Mail Merge
Using Word 2007 For Mail Merge Introduction This document assumes that you are familiar with using Word for word processing, with the use of a computer keyboard and mouse and you have a working knowledge
MAIL MERGE TUTORIAL. (For Microsoft Word 2003-2007 on PC)
MAIL MERGE TUTORIAL (For Microsoft Word 2003-2007 on PC) WHAT IS MAIL MERGE? It is a way of placing content from a spreadsheet, database, or table into a Microsoft Word document Mail merge is ideal for
Introduction to PASW Statistics 34152-001
Introduction to PASW Statistics 34152-001 V18 02/2010 nm/jdr/mr For more information about SPSS Inc., an IBM Company software products, please visit our Web site at http://www.spss.com or contact: SPSS
Describing, Exploring, and Comparing Data
24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter
MetroBoston DataCommon Training
MetroBoston DataCommon Training Whether you are a data novice or an expert researcher, the MetroBoston DataCommon can help you get the information you need to learn more about your community, understand
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Installing and Using the Monte Carlo Simulation Excel Add-in. Software for
Installing and Using the Monte Carlo Simulation Excel Add-in Software for Introductory Econometrics By Humberto Barreto and Frank M. Howland [email protected] and [email protected] (765) 658-4531 and
Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition
Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application
CentralMass DataCommon
CentralMass DataCommon User Training Guide Welcome to the DataCommon! Whether you are a data novice or an expert researcher, the CentralMass DataCommon can help you get the information you need to learn
Excel 2007 A Beginners Guide
Excel 2007 A Beginners Guide Beginner Introduction The aim of this document is to introduce some basic techniques for using Excel to enter data, perform calculations and produce simple charts based on
Intro to Mail Merge. Contents: David Diskin for the University of the Pacific Center for Professional and Continuing Education. Word Mail Merge Wizard
Intro to Mail Merge David Diskin for the University of the Pacific Center for Professional and Continuing Education Contents: Word Mail Merge Wizard Mail Merge Possibilities Labels Form Letters Directory
Tutorial 3: Working with Tables Joining Multiple Databases in ArcGIS
Tutorial 3: Working with Tables Joining Multiple Databases in ArcGIS This tutorial will introduce you to the following concepts: Identifying Attribute Data Sources Converting Tabular Data into GIS Databases
Creating and Using Forms in SharePoint
Creating and Using Forms in SharePoint Getting started with custom lists... 1 Creating a custom list... 1 Creating a user-friendly list name... 1 Other options for creating custom lists... 2 Building a
Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering
Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques
A Short Guide to R with RStudio
Short Guides to Microeconometrics Fall 2013 Prof. Dr. Kurt Schmidheiny Universität Basel A Short Guide to R with RStudio 1 Introduction 2 2 Installing R and RStudio 2 3 The RStudio Environment 2 4 Additions
S P S S Statistical Package for the Social Sciences
S P S S Statistical Package for the Social Sciences Data Entry Data Management Basic Descriptive Statistics Jamie Lynn Marincic Leanne Hicks Survey, Statistics, and Psychometrics Core Facility (SSP) July
Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab Exercise #5 Analysis of Time of Death Data for Soldiers in Vietnam
Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab Exercise #5 Analysis of Time of Death Data for Soldiers in Vietnam Objectives: 1. To use exploratory data analysis to investigate
GeoGebra Statistics and Probability
GeoGebra Statistics and Probability Project Maths Development Team 2013 www.projectmaths.ie Page 1 of 24 Index Activity Topic Page 1 Introduction GeoGebra Statistics 3 2 To calculate the Sum, Mean, Count,
IBM SPSS Statistics for Beginners for Windows
ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning
How to Make the Most of Excel Spreadsheets
How to Make the Most of Excel Spreadsheets Analyzing data is often easier when it s in an Excel spreadsheet rather than a PDF for example, you can filter to view just a particular grade, sort to view which
