Monday, October 20, 2008

Short Notes on Stata (version 10) Data Functions and Stuff

How Stata records dates and times

Dates and times are called %t values. %t values are numerical and integral. The integral value records the number of time units that have passed from an agreed-upon base, which for Stata is the start of 1960 (01jan1960). Coding and interpretation of date and time (%t) values are as follows:

                       ----- Numerical value & interpretation ------
    Format   Meaning   Value = -1      Value = 0       Value = 1
    -------+---------+---------------+---------------+---------------
    %td      days      31dec1959       01jan1960       02jan1960
    -----------------------------------------------------------------

For a %td value, a 1-unit change represents 1 day. Integer 4,569 represents 05jul1972 because that date occurred 4,569 days after 01jan1960. Integer -4,569 represents 29jun1947 because that date occurred 4,569 days before 01jan1960.

Inputting date and time data

Date and time variables are best read as strings. Then use one of the string-to-numeric conversion functions to convert the string representation to the appropriate %t value:

    Format   String-to-numeric conversion function
    -------+-----------------------------------------
    %td      date(string, mask)
    %tw      weekly(string, mask)
    %tm      monthly(string, mask)
    %ty      yearly(string, mask)
    %tg      no function necessary; read as numeric
    -------------------------------------------------

In the above functions, string is the variable or value containing the string representation to be converted, and mask specifies the order in which the components occur. For the %td function date(), string might be "August 21, 2005" or "8-21-2005" and mask might be "MDY", meaning that the elements occur in the order month, day, and year.
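As a quick check of the ideas above, the following sketch converts the two example strings with date() and then displays the integers 4,569 and -4,569 under the %td format. The third string, "21aug2005", is an illustrative value not taken from the notes; it is included only to show a different mask.

```stata
* Convert string dates to %td values; the mask gives the component order
display date("August 21, 2005", "MDY")
display %td date("8-21-2005", "MDY")

* Same date with day first (illustrative string, not from the notes)
display %td date("21aug2005", "DMY")

* A %td value is a count of days from 01jan1960
display %td 4569
display %td -4569
```

The first command prints the raw day count; the others print the same kind of value as a readable date because a %td format precedes the expression.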

Constructing date and time values from numerical components

If you had numeric variables M, D, and Y containing month number, day of month, and year (in the first observation, the variables might contain 12, 15, and 2006), you could code

    . generate mydate = mdy(M, D, Y)

to obtain a new %td variable containing the date (which would be 15dec2006 in the first observation). Variable names in Stata are case sensitive, so the arguments must match the names M, D, and Y exactly.

The date-from-numerical-components functions are

    Format   Function
    -------+------------------------------------------
    %td      mdy(M, D, Y)
    --------------------------------------------------

where the result, td, is a %td value, and M, D, and Y are month, day, and year values with

    1 <= M <= 12
    1 <= D <= 31
    0100 <= Y <= 9999
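A minimal sketch of the mdy() approach on a small made-up dataset follows. The three observations are hypothetical; the last one holds an impossible date (February 30) to show that mdy() returns a missing value when the components do not form a valid date.

```stata
* Hypothetical data: month, day, and year stored as separate numeric variables
clear
input M D Y
12 15 2006
 7  5 1972
 2 30 2001
end

* Build the %td value; the result is missing when the date is invalid
generate mydate = mdy(M, D, Y)

* Attach a display format so the day counts are shown as dates
format mydate %td
list
```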

Formatting date and time values

A variable's values are formatted to indicate (1) the units used and (2) how the variable is to be displayed:

    . generate mydate = date(datestr, "DMY")

    . list mydate in 1

         +--------+
         | mydate |
         |--------|
      1. |  17096 |
         +--------+

    . format mydate %td

    . list mydate in 1

         +-----------+
         |    mydate |
         |-----------|
      1. | 22oct2006 |
         +-----------+

The %t formats result in the following output:

    Format   Example of output
    -------+----------------------------
    %td      05jul1972
    ------------------------------------

You can specify how dates and times are to be formatted. Rather than 05jul1972, you could have July 5, 1972, or rather than 05jul1972 21:38:02, you could have 7-5-72 9:38 p.m. This reformatting is done by adding codes to the end of %tc, %tC, %td, etc. In fact, the default %tc, %tC, %td, ..., formats actually mean

    Format   Implied (fully specified) format
    -------+---------------------------------
    %td      %tdDDmonCCYY
    -----------------------------------------

The formatting codes are

    Code      Meaning            Output
    -----------------------------------------------------------------
    CC        century-1          01-99
    cc        century-1          1-99
    YY        2-digit year       00-99
    yy        2-digit year       0-99
    JJJ       day within year    001-366
    jjj       day within year    1-366
    Mon       month              Jan, Feb, ..., Dec
    Month     month              January, February, ..., December
    mon       month              jan, feb, ..., dec
    month     month              january, february, ..., december
    NN        month              01-12
    nn        month              1-12
    DD        day within month   01-31
    dd        day within month   1-31
    DAYNAME   day of week        Sunday, Monday, ... (aligned)
    Dayname   day of week        Sunday, Monday, ... (unaligned)
    Day       day of week        Sun, Mon, ...
    Da        day of week        Su, Mo, ...
    day       day of week        sun, mon, ...
    da        day of week        su, mo, ...
    h         half               1-2
    q         quarter            1-4
    WW        week               01-52
    ww        week               1-52
    am        show am or pm      am or pm
    a.m.      show a.m. or p.m.  a.m. or p.m.
    AM        show AM or PM      AM or PM
    A.M.      show A.M. or P.M.  A.M. or P.M.
    .         display period     .
    ,         display comma      ,
    :         display colon      :
    -         display hyphen     -
    _         display space
    /         display slash      /
    \         display backslash  \
    +         separator (see note)
    -----------------------------------------------------------------
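To see how the codes combine, the sketch below displays one date, 5 July 1972, under several formats built from the table. The particular combinations are illustrative choices, not formats taken from the notes. If the codes behave as tabulated, the second command should print 7-5-72.

```stata
* Default %td format, equivalent to %tdDDmonCCYY
display %td mdy(7, 5, 1972)

* Month and day without leading zeros, 2-digit year, hyphen separators
display %tdnn-dd-YY mdy(7, 5, 1972)

* Full month name, day, and 4-digit year (CC + YY), separated by spaces
display %tdMonth_dd_CCYY mdy(7, 5, 1972)

* ISO-style ordering with zero-padded month and day
display %tdCCYY-NN-DD mdy(7, 5, 1972)
```

The same format strings can be attached permanently to a variable with the format command, as in format mydate %tdCCYY-NN-DD.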

Experimenting with the date and time functions

The best way to become familiar with Stata's date and time functions is to experiment with the display command.

    . display date("5-12-1998", "MDY")
    14011

    . display %td date("5-12-1998", "MDY")
    12may1998

Remember, when you work with display, you can specify a format in front of the expression to specify how the result is to be formatted.

Translating run-together dates such as 20060125

Functions clock(), Clock(), and date() will translate dates and times that are run together, such as 20060125, 060125, and 20060125110215 (which is 25jan2006 11:02:15). There is nothing special that you have to do:

    . display %td date("20060125", "YMD")
    25jan2006

    . display %td date("060125", "20YMD")
    25jan2006

    . display %tc clock("20060125110215", "YMDhms")
    25jan2006 11:02:15

In a data context, you could type

    . gen startdate = date(startdatestr, "YMD")

    . gen double starttime = clock(starttimestr, "YMDhms")

Remember to read the original data into a string. If you read the data as numeric, the best advice is to read the data again. Numbers such as 20060125 and 20060125110215 will be rounded unless they are stored as doubles.
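The rounding problem can be seen directly, and the string route is easy to try on a toy dataset. In this sketch the variable name starttimestr follows the notes, while the two observations and the use of substr() to take the date part of the string are illustrative assumptions.

```stata
* A float cannot represent 20060125 exactly, so the stored value is off
display %12.0f float(20060125)

* Hypothetical data read as strings, so no digits are lost
clear
input str14 starttimestr
"20060125110215"
"19720705213802"
end

* The date part is the first 8 characters; the full string gives the time
generate startdate = date(substr(starttimestr, 1, 8), "YMD")
generate double starttime = clock(starttimestr, "YMDhms")

format startdate %td
format starttime %tc
list
```

Storing starttime as a double matters for the same reason: %tc values are too large to be held exactly in a float.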