Trade Flows and Trade Policy Analysis. October 2013 Dhaka, Bangladesh




Witada Anukoonwattaka (ESCAP) and Cosimo Beverelli (WTO)

Introduction to Stata

Content

a. Datasets used in Introduction to Stata
b. Resources
c. Importing data into Stata
d. Do files
e. Commands for variable management and descriptive statistics
f. Macros
g. Loops

a. Datasets used in Introduction to Stata

To apply some of the Stata commands described in this presentation, we will use two datasets:
- WDI.dta: a very small subset of the World Development Indicators
- WB_ES.dta: derived from the World Bank Enterprise Surveys

You can find the datasets in the directory Stata_material\data\IntroductionStata\

b. Resources

- Stata help and the Stata manuals
- A variety of books covering Stata
- Web resources:
  - Germán Rodríguez's webpage: data management, graphics and programming
  - UCLA IDRE webpage: very comprehensive, covering all sorts of topics (data management, analysis, ...) with many examples
  - FAQ
  - Statalist: typically accessed via a Google search

c. Importing data into Stata

- insheet using filename, clear delimit(";") names
  Typically used for text files that are comma- or tab-separated; delimit() sets any other separator.
  Rarely used alternatives: infix (fixed-column format); infile (free format)
- Stata 12: import excel using filename, sheet("Ex1") firstrow
  Reads Excel files directly into Stata.
  Lets you specify the variables, cell range and worksheet to import.
- Stat/Transfer
  Specialised software to transfer data between formats.
- Copy and paste
  Sometimes the most efficient way, e.g. when you do not want to write a do file.
  Caution: the accuracy of copied numbers depends on
  1. how the data are formatted in Excel, i.e. how many digits are shown, and
  2. your settings in Stata (use set type double before copying).
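The import commands above can be tried without any external data. The following is a minimal sketch that writes Stata's built-in auto dataset to a semicolon-delimited text file and to an Excel file, then reads both back. The file names auto_semicolon.txt and auto.xlsx are made up for illustration, and the sketch assumes Stata 12 or later and a writable working directory.

```stata
* Assumptions: Stata 12+, writable working directory, hypothetical file names.
sysuse auto, clear                       // built-in example dataset

* Write a semicolon-delimited text file, then read it back with insheet
outsheet using auto_semicolon.txt, delimiter(";") replace
insheet using auto_semicolon.txt, delimiter(";") names clear
describe, short

* Write an Excel file, then read one worksheet back with import excel
sysuse auto, clear
export excel using auto.xlsx, sheet("Ex1") firstrow(variables) replace
import excel using auto.xlsx, sheet("Ex1") firstrow clear
describe, short
```

In both cases describe, short should report the same number of observations and variables as the original dataset, which is a quick check that the delimiter and the header row were read as intended.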

d. Do files

- If you work with Stata, (almost) always use do files, e.g. one do file for creating your master dataset and one do file for regressions.
- Do files can also be used to set globals and directories, or to run a series of do files one after another.
- Typical commands at the beginning of each do file:

    clear all                      /* removes all data from memory */
    set more off, perm             /* prevents Stata from pausing while running a do file */
    capture log close              /* closes a log file if one is open */
    cd directory                   /* sets the working directory, e.g. C:\Research\data\ */
    log using filename, replace    /* useful for long do files, allows printing */
    capture log close              /* at the end of a do file that is logged */
    use "dataset.dta", clear       /* opens the dataset; the quotes are not necessarily needed */

  Note that use takes the option clear (not replace) to discard the data currently in memory.
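Put together, these commands give a do file skeleton along the following lines. This is only a sketch: the file name example_master.do and the log name are hypothetical, the cd line is commented out because the directory is specific to each user, and the built-in auto dataset stands in for your own data.

```stata
* example_master.do: hypothetical skeleton of a logged do file
clear all
set more off
capture log close                 // no error if no log is currently open
* cd "C:\Research\data"           // uncomment and adapt to your own directory
log using example_master.log, replace text

sysuse auto, clear                // stand-in for: use "dataset.dta", clear
summarize price mpg

log close                         // end of the logged section
```

Using capture before log close at the top means the do file can be re-run after a crash without stopping on "no log file open".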

e. Commands for variable management and descriptive statistics

- generate newvar=exp [if]
  Creates a new variable.
- replace oldvar=exp [if]
  Replaces the values of an existing variable.
- rename old_varname new_varname
  Renames a variable; alternative for several variables at once: renvars varlist (user-written command)
- To drop or keep variables, use drop varlist or keep varlist.
- To drop or keep observations, use drop if or keep if.
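A short sketch of these commands on the built-in auto dataset (the new variable names price_k, price_thousand and expensive are invented for the example):

```stata
sysuse auto, clear

generate price_k = price/1000                 // create a new variable
generate byte expensive = 0
* Missing values count as larger than any number, so exclude them explicitly
replace expensive = 1 if price > 10000 & !missing(price)

rename price_k price_thousand                 // rename a variable

drop if missing(rep78)                        // drop observations
keep make price price_thousand expensive foreign   // keep variables
describe
```

The !missing() guard matters in practice: without it, observations with a missing price would be flagged as expensive.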

e. Commands for variable management and descriptive statistics (cont'd)

- describe
  Provides information on the dataset (#obs, #vars, size) and on the variables (type, labels).
- sum(marize) varlist
  Provides #obs, mean, std. dev., min., max.
- tab(ulate) var1 var2
  Provides one- or two-way tables of frequencies.
    tab cou sector
  With a single variable, the option generate() creates dummy variables.
- table rowvar (colvar), contents()
  Provides frequencies by default. The option contents() allows for other statistics.
    table cou sector, contents(mean sales sum d_exp)
    bysort cou: table sector, contents(mean sales sum d_exp)
- tabstat varlist, statistics() by()
  Another command to calculate summary statistics.
    tabstat sales if cou=="usa", by(sector)
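The same commands can be tried on the built-in auto dataset, as in the sketch below. The table command is left out on purpose, because its syntax differs between Stata releases; the prefix rep_ for the dummies is an arbitrary choice.

```stata
sysuse auto, clear

describe                                  // dataset and variable information
summarize price mpg                       // #obs, mean, std. dev., min, max
tabulate foreign rep78                    // two-way table of frequencies
tabulate rep78, generate(rep_)            // one-way table plus dummies rep_1 ... rep_5
tabstat price mpg, statistics(mean sd min max) by(foreign)
```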

e. Commands for variable management and descriptive statistics (cont'd)

Commands to identify missing values and duplicates:

- inspect varlist
    e.g. inspect cou
- codebook varlist
    e.g. codebook cou
- duplicates (report/drop/tag/list) varlist
  Reports, drops, tags or lists observations that are identical in all variables, or identical in the variables specified by varlist.
- unique varlist (user-written command)
  Reports the number of unique values for varlist.
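As an illustration on the built-in auto dataset, where rep78 has a few missing values (the variable name dup is made up; misstable is a built-in command added here as a complement, not one from the slides):

```stata
sysuse auto, clear

inspect rep78                             // counts of missing, negative, zero, positive
codebook rep78                            // range, unique values, missing count
misstable summarize                       // built-in overview of variables with missings

duplicates report foreign rep78           // how many observations share the same values
duplicates tag foreign rep78, generate(dup)   // dup = number of other identical observations
tabulate dup

* unique is user-written; it would first need: ssc install unique
```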

e. Commands for variable management and descriptive statistics (cont'd)

egen newvar=function(varlist or other argument)
Frequently used command to create new variables; see Stata help.

Frequently used egen functions:

    bysort cou sector: egen sales_sec=total(sales), missing
    bysort cou sector: egen sales_sec_mean=mean(sales)
    egen exp_tot=rowtotal(exp_intermediate exp_final)
    egen id_cluster=group(cou sector)
    egen cou_sec=concat(cou sector)

Further functions include: max, min, count, tag, ...
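A runnable sketch of the same egen functions, with the built-in auto dataset standing in for WB_ES.dta (all new variable names are invented for the example):

```stata
sysuse auto, clear

* Group totals and means, repeated on every observation of the group
bysort foreign rep78: egen price_tot = total(price), missing
bysort foreign rep78: egen price_avg = mean(price)

* Row-wise sum across variables (missing values are treated as zero)
egen len_turn = rowtotal(length turn)

* Numeric group identifier and a flag for one observation per group
egen cell = group(foreign rep78)
egen byte tag1 = tag(foreign rep78)

list foreign rep78 price_tot price_avg if tag1, noobs
```

Listing only the tagged observations shows one line per group, which is often a convenient way to inspect group-level results without collapsing the data.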

e. Commands for variable management and descriptive statistics (cont'd)

collapse (mean) varlist (sum) varlist, by(varlist)

- Creates an aggregate dataset by e.g. averaging or summing variables across the dimension identified in by().
- All variables not included in the command are dropped, and the original observations are replaced by one observation per by() group.
- Useful in analysis when moving to a higher level of aggregation, e.g. aggregating trade flows from HS 6-digit to HS 2-digit.
- Useful for calculating descriptive statistics before exporting them to Excel using outsheet or export excel.

egen can be used to create aggregates within the disaggregated dataset:

    bysort cou sector: egen sales_sec=total(sales), missing

Keeping only the group variables and the aggregate and then dropping duplicates gives the same result as collapse:

    keep cou sector sales_sec
    duplicates drop
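The equivalence between the two routes can be checked on the built-in auto dataset. This is a sketch with invented variable names; the two listings should show the same two rows, up to the storage precision of the created variables.

```stata
sysuse auto, clear

* Route 1: collapse to one observation per group
preserve
collapse (mean) mean_price = price (sum) sum_weight = weight, by(foreign)
list, noobs
restore

* Route 2: egen within the disaggregated data, then remove duplicates
bysort foreign: egen mean_price = mean(price)
bysort foreign: egen sum_weight = total(weight)
keep foreign mean_price sum_weight
duplicates drop
list, noobs
```

preserve and restore keep the original data available after collapse, which is useful because collapse replaces the dataset in memory.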

e. Commands for variable management and descriptive statistics (cont'd)

- destring varlist, replace force
  Converts a string variable to a numeric variable. Useful when numbers are imported as strings into Stata. With force, non-numeric entries become missing.
- The other way round, numeric to string: tostring varlist, gen(newvar)
- String functions: generate newvar=function()
  Allow you to manipulate string variables; see Stata help. Some useful functions are:
  - abbrev() shortens the string to the indicated number of characters
  - length() returns the length of the string, i.e. the number of characters
  - subinstr() replaces or deletes particular substrings
  - substr() extracts a substring based on its position
  - upper() (lower()) changes the entire string to upper case (lower case)
  - trim() removes leading and trailing blanks of the string
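A small self-contained sketch of these functions, using three invented string entries of the kind that often result from importing spreadsheets (all variable names are hypothetical):

```stata
clear
input str12 raw
"  1,250 "
"300"
"n/a"
end

generate clean = trim(raw)                       // remove leading/trailing blanks
replace clean = subinstr(clean, ",", "", .)      // delete all thousands separators
generate len = length(clean)                     // number of characters
generate first2 = substr(clean, 1, 2)            // first two characters
generate up = upper(clean)                       // upper-case copy

destring clean, generate(value) force            // "n/a" becomes missing
tostring value, generate(value_str)              // back to string
list
```

Cleaning the string first and only then calling destring keeps the use of force to a minimum, so that genuinely unexpected entries are not silently turned into missing values.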

e. Commands for variable management and descriptive statistics (cont'd)

reshape wide (long) stub, i() j() options

- Reshapes the dataset from long to wide format and vice versa.
- Data dimensions such as country, year or sector are normally put in long format.
- stub refers to variable names in reshape wide and to the common prefixes of variable names in reshape long.
- i() are the identifying dimensions; j() is the dimension to change.
- Exercise: Open WDI.dta and reshape it first long and then wide.

To merge datasets use either merge or joinby:

    merge {1:1 | m:1 | 1:m | m:m} varlist using filename, update keepusing(varlist)
    joinby varlist using filename, unmatched(both) update

- merge is used to add further variables to the observations in the master data.
- joinby forms all pairwise combinations of observations within groups defined by varlist.
- Exercise: Open WB_ES.dta and merge it with WDI.dta.
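Since the course datasets are not reproduced here, the following sketch uses two tiny invented datasets (country-level gdp and firm-level sales, all names and numbers hypothetical) to show reshape, merge and joinby in sequence. It is meant to be run as a whole, because temporary files only exist while the do file runs.

```stata
* Country-level data in wide format (invented numbers)
clear
input str3 cou gdp2010 gdp2011
"USA" 100 110
"JPN"  50  52
end

reshape long gdp, i(cou) j(year)          // one observation per country-year
list, noobs
tempfile gdp_long
save `gdp_long'

reshape wide gdp, i(cou) j(year)          // back to one observation per country
tempfile gdp_wide
save `gdp_wide'

* Firm-level data: several firms per country (invented numbers)
clear
input str3 cou firm sales
"USA" 1 10
"USA" 2 20
"JPN" 1  5
"DEU" 1  7
end
tempfile firms
save `firms'

* merge: add country variables to each firm (many firms, one country record)
merge m:1 cou using `gdp_wide'
tabulate _merge                           // DEU has no match in the using data

* joinby: every firm paired with every year of its country
use `firms', clear
joinby cou using `gdp_long', unmatched(both)
tabulate _merge
list, noobs
```

After merge the number of firm observations is unchanged, whereas after joinby each matched firm appears once per year of its country.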

Difference between merge and joinby

[Figure not reproduced.]

f. Macros

See Stata help and Germán Rodríguez's webpage.

- Macros are names associated with some text.
- The commands global and local assign strings to global and local macro names:

    global mname [=exp | :extended_fcn | "[string]" | `"[string]"']
    local lclname [=exp | :extended_fcn | "[string]" | `"[string]"']

- Global macros, once defined, are available anywhere in Stata.
- Local macros work only within the do file in which they are defined. Simplest example:

    local c USA JPN

- Globals and locals have a variety of uses:
  - To define the directories for this class, i.e. directory_definition.do
  - They are used in loops (see next slides).
  - A set of explanatory variables can be grouped under one macro name.
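A minimal sketch of how macros are defined and referenced, using the built-in auto dataset (the macro names controls, depvar and n are arbitrary). Locals are referenced with a left quote and an apostrophe, globals with a dollar sign.

```stata
sysuse auto, clear

local controls mpg weight length          // group explanatory variables under one name
global depvar price                       // global: visible outside this do file too

regress $depvar `controls'

local n : word count `controls'           // extended macro function
display "Number of controls: `n'"
display "Dependent variable: $depvar"
```

Because locals disappear when the do file ends, the lines defining and using them have to be run together rather than one at a time from the do-file editor.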

g. Loops

See Stata help and Germán Rodríguez's webpage.

- Two main commands: foreach and forvalues
- foreach loops through strings of text; forvalues loops through numbers.
- Syntax:

    foreach lname {in | of listtype} list {
        commands referring to `lname'
    }

    forvalues lname = range {
        commands referring to `lname'
    }

g. Loops (cont'd)

Examples for loops in WB_ES.dta:

    foreach k in USA JPN {                /* Loop over any_list */
        egen sales_`k'=total(sales) if cou=="`k'"
    }

    vallist cou, local(c)                 /* vallist (user-written) shows values and creates local */
    foreach k of local c {                /* Loop over a local macro */
        capture drop sales_`k'
        egen sales_`k'=total(sales) if cou=="`k'"
    }

    forvalues k=1(1)3 {                   /* Loop over sector codes */
        egen total_`k'=total(sales) if sector==`k'
    }

foreach can also be used to loop over variables and numbers:

    foreach k of varlist varlist
    foreach k of numlist numlist
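The same loop patterns can be tried on the built-in auto dataset, as sketched below. Here the built-in levelsof replaces the user-written vallist for collecting the values of a variable into a local; the variable prefix price_tot_ is invented.

```stata
sysuse auto, clear

* Loop over variables
foreach v of varlist price mpg weight {
    quietly summarize `v'
    display "`v': mean = " %9.2f r(mean)
}

* Loop over numbers
forvalues r = 1(1)5 {
    quietly count if rep78 == `r'
    display "rep78 = `r': " r(N) " cars"
}

* Loop over the values of a variable stored in a local macro
levelsof foreign, local(groups)
foreach g of local groups {
    capture drop price_tot_`g'
    egen price_tot_`g' = total(price) if foreign == `g'
}
```

As in the slide examples, capture drop inside the loop lets the do file be re-run without stopping on "variable already defined".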