Introduction to Stata
Jordi Muñoz (UAB)

Session 4: Descriptive statistics and exporting Stata results

In this session we are going to work with descriptive statistics in Stata. First, we present a short introduction to the basic statistical contents of the session, and then we explain how to obtain them in Stata.

1. Short introduction to descriptive statistics

Descriptive statistics are used to describe the contents and properties of a given variable. With a single number, or a limited set of numbers, we can easily see how a variable is distributed in our sample or population of interest.

Average
It is the best-known descriptive statistic, equal to the sum of all cases divided by the number of cases:

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

Weighted average
Every observation is weighted by a given value that represents the importance of its contribution to the final average. It is calculated just like the average, but multiplying each observation by its weight and dividing by the overall sum of weights:

\[ \bar{X}_w = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i} \]

Median
It is the central value of a variable: it has as many cases below it as above it. More formally, it is the value of the distribution that satisfies the condition that half of the values are lower than or equal to it and the other half are higher than or equal to it. If the number of cases is even, the median equals the average of the two central values.

Mode
It is the most common value of the variable.

Percentiles and quartiles
Quartiles are an extension of the median: they are the values that have 25%, 50%, and 75% of the cases below them, respectively. Percentiles are, in turn, a generalization of the same idea: percentile p has p% of the values below it and (100-p)% above it.

Variance
The variance expresses how spread out a distribution is. It equals the mean of the squared deviations of the variable from its mean:

\[ s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2 \]

Standard deviation
The standard deviation is the square root of the variance:

\[ s = \sqrt{s^2} \]

The standard deviation is important because it has some interesting properties. It is the most widely used dispersion statistic. In general, we can take as a reference point what we know about the normal distribution: 95% of the cases are within approximately +/- 2 standard deviations of the mean, and 99.73% within +/- 3 standard deviations.

Range
The range of a variable equals the difference between the largest and smallest values, and expresses its amplitude.

\[ R = \max - \min \]
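The statistics defined so far can be checked by hand against Stata's stored results. The following is a minimal sketch, not part of the original session material: it assumes the auto example dataset that ships with Stata, and the choice of mpg, weight and rep78 as variables is purely illustrative. Note that Stata's summarize reports the sample variance, which divides by n-1 rather than n, so it will differ slightly from the formula above in small samples.

```stata
* Sketch only: the auto dataset and the variables used are illustrative assumptions.
sysuse auto, clear

* Mean, minimum and maximum; the range is computed from the stored results
summarize mpg
scalar rng = r(max) - r(min)
display "mean  = " r(mean)
display "range = " rng

* Weighted mean: each observation weighted by the variable weight
summarize mpg [aweight = weight]
display "weighted mean = " r(mean)

* Median, quartiles, variance and standard deviation need the detail option
summarize mpg, detail
display "Q1       = " r(p25)
display "median   = " r(p50)
display "Q3       = " r(p75)
display "variance = " r(Var)
display "sd       = " r(sd)
* The standard deviation is the square root of the variance
display "sqrt(variance) = " sqrt(r(Var))

* Mode: tabulate shows the frequencies, egen stores the most common value
tabulate rep78
egen mode_rep78 = mode(rep78)
display "mode of rep78 = " mode_rep78[1]
```

If a variable has several equally common values, egen's mode() function returns missing unless one of its tie-breaking options (for example minmode) is specified.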
Interquartile range
The range might be affected by extreme values, and therefore misrepresent the amplitude. Instead we can use the interquartile range, which equals the difference between the third and first quartiles. Half of the cases lie within the interquartile range.

\[ IQR = Q_3 - Q_1 \]

Skewness
It measures the symmetry of the distribution. It takes the normal distribution as a reference point, because it is perfectly symmetrical. A normally distributed variable would have a skewness of 0. Otherwise the skewness can be:
1. Positive: a longer tail to the right, more observations on the left and, therefore, few high values. Also called right-skewed.
2. Negative: a longer tail to the left, more observations on the right and few low values. Also called left-skewed.

Descriptive statistics in Stata

Stata can present all this information with the command summarize:

Summarize
The command summarize variable1 variable2 (etc.) reports the number of valid observations, the mean, the standard deviation, and the minimum and maximum values of the variables. If we want some additional information, we can use the option detail:
Detail
Typing summarize variable1 variable2, detail makes Stata display the mean, standard deviation, minimum and maximum, percentiles, variance and skewness.

Descriptive statistics tables

The summarize command is useful for summarizing the whole sample. Although we can combine it with the if qualifier and the by prefix to get descriptives of sub-samples, it is not the most appropriate command for that. Stata has several useful commands for building tables of descriptives by groups:

Tabulate, summarize
tabulate groupvariable, summarize(variable1) shows a frequency table of the groups defined by the variable groupvariable, with the mean and standard deviation of variable1 for each group.

Tabstat
It is a more powerful command, since we can include in the table a wider choice of descriptive statistics for more than one variable.
tabstat variable1 variable2, stats(mean med sd min max) by(variablegroup) format(%9.2f)

Exporting Stata results

Stata produces results in the main window, but often we want to export them to a spreadsheet or a Word document. This requires some additional work.

Log files
The Stata results window does not store the whole session, but just the last part. If we want to store the whole output we should use a log file. We can open and name it through an icon in the main window, but the same can also be done using commands:

Open a log file: log using file.log
This opens a log file with the specified name, which will store all our activity. We can choose the format .log (plain text) or .smcl (formatted). If we want to work with an existing file, we can either overwrite it (option , replace) or use the option , append, which adds the new results at the end of the file.

Close the log file: log close
This closes the log file.

Suspend the log file: Sometimes we might want to suspend the storing of results and resume it later. The commands log off and log on will do the trick.

View the log file: view file.log
Check the status of the log file: We might easily forget whether a log file is open or not. In this case, we can just type log in the command line and Stata will tell us.

Copy results
Whether or not we use a log file, to export our results to Word or Excel we will commonly use the copy-paste functions. From Stata we can copy the relevant results by highlighting them, right-clicking on them and choosing one of the following options:

Copy
Copies the selection as text. It can be pasted into a word processor, but if we want to preserve the alignment of the tables we have to use the Courier or Courier New fonts and choose a small font size (10, 9, 8, depending on the table).

Copy table
This is the most useful option: it copies the selection as a table. If the table fits in the document, it will appear aligned by tabs, so we can easily convert it into a Word table. However, this option is best suited for using Excel as an intermediate step. We have to export one table at a time and, if possible, select the minimum number of elements.

Copy table as html
Can be useful in some contexts.

Copy image
Copies the table as an image to the clipboard. Only useful if, for whatever reason, we wish to keep exactly the same appearance as in Stata.

Advanced commands
In this introductory course we are not going to deal with these commands in detail, but in any case it is useful to know that there are several commands that can produce publication-quality tables directly from Stata, ready to be used in our papers. These commands can save us a lot of time.

Tabout
It is the most complete command, a full table creation program. It takes some effort to learn, but then it pays off. We can install it using the command ssc install tabout, and find a tutorial at www.ianwatson.com.au/stata/tabout_tutorial.pdf.

Esttab
For more advanced analysis, mainly regression models, the command esttab will be useful, because it easily creates .rtf documents with the tables we need.
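The commands covered in this session can be combined in a short do-file. The following is a hedged sketch rather than material from the course itself: the auto example dataset, the variables price, mpg, weight and foreign, and the file names session4_example.log and session4_models.rtf are all assumptions chosen for illustration. The last part assumes that esttab has been installed as part of the estout package from SSC.

```stata
* Sketch only: dataset, variables and file names are illustrative assumptions.
sysuse auto, clear

* Open a plain-text log file, overwriting any earlier file with the same name
log using session4_example.log, replace text

* Whole-sample descriptives, first the short and then the detailed version
summarize price mpg
summarize price mpg, detail

* Temporarily suspend logging: output produced here is not stored
log off
describe
log on

* Descriptives by group: frequency table with mean and standard deviation
tabulate foreign, summarize(price)

* Descriptives by group: several statistics for several variables
tabstat price mpg, statistics(mean median sd min max) by(foreign) format(%9.2f)

* Report the status of the log file, then close it and open it in the Viewer
log
log close
view session4_example.log

* Regression table exported to an .rtf document with esttab.
* Assumes a prior one-off installation with: ssc install estout
regress price mpg weight
esttab using session4_models.rtf, replace
```

Run from the do-file editor, this leaves a plain-text log that can be opened in any word processor, which is an alternative to copying each table by hand from the results window.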