The Statistical Interpretation of Degrees of Freedom
Author(s): William J. Moonan
Source: The Journal of Experimental Education, Vol. 21, No. 3 (Mar., 1953)
Published by: Taylor & Francis, Ltd.
THE STATISTICAL INTERPRETATION OF DEGREES OF FREEDOM

WILLIAM J. MOONAN
University of Minnesota, Minneapolis, Minnesota

1. Introduction

THE CONCEPT of "degrees of freedom" has a very simple nature, but this simplicity is not generally exemplified in statistical textbooks. It is the purpose of this paper to discuss and define the statistical aspects of degrees of freedom and thereby clarify the meaning of the term. This shall be accomplished by considering a very elementary statistical problem of estimation and progressing onward through more difficult but common problems until finally a multivariate problem is used. The available literature which is devoted to degrees of freedom is very limited. Some of these references are given in the bibliography and they contain algebraic, geometrical, physical and rational interpretations. The main emphasis in this article will be found to be on discovering the degrees of freedom associated with certain standard errors of common and useful significance tests, and on showing that, for some models, parameters are estimated directly or indirectly by certain degrees of freedom. The procedures given here may be put forth completely in the system of estimation which utilizes the principle of least squares. The applications given here are special cases of this system.

2. In most statistical problems it is assumed that $n$ random variables are available for some analysis. With these variables, it is possible to construct certain functions called statistics with which estimations and tests of hypotheses are made. Associated with these statistics are numbers of degrees of freedom. To elaborate and explain what this means, let us start out with a very simple situation. Suppose we have two random variables, $y_1$ and $y_2$. If we pursue an objective of statistics, which is called the reduction of data, we might construct the linear function, $Y_1 = \tfrac{1}{2} y_1 + \tfrac{1}{2} y_2$. This function estimates the mean of the population from which the random variables were drawn. For that matter so does any other linear function of the form, $Y_1 = a_{11} y_1 + a_{12} y_2$, where the $a$'s are real equal numbers.
When the coefficients of the random variables are equal to the reciprocal of the number of them, the statistic defined is the sample mean. This statistic may be chosen here for logical reasons, but its specification really comes from the theory of estimation mentioned before. We also could construct another linear function of the random variables, $Y_2 = \tfrac{1}{2} y_1 - \tfrac{1}{2} y_2$. This contrast statistic is a measure of how well our observations agree since it yields a measure of the average difference of the variables. These statistics, $Y_1$ and $Y_2$, have the valuable property that they contain all the available information relevant to discerning characteristics of the population from which the $y$'s were drawn. This is true because it is possible to reconstruct the original random variables from them. Clearly, $Y_1 + Y_2 = y_1$ and $Y_1 - Y_2 = y_2$. We discern that we have constructed a pair of statistics which are reducible to the original variables, but they state the information contained in the variables in a more useful form. There are certain other characteristics worth noticing. The sum of the coefficients of the random variables of $Y_2$ equals zero, and the sum of the products of the corresponding coefficients of the random variables of $Y_1$ and $Y_2$ equals zero. That is, $(\tfrac{1}{2})(\tfrac{1}{2}) + (\tfrac{1}{2})(-\tfrac{1}{2}) = 0$. This latter property is known as the quasi-orthogonality of $Y_1$ and $Y_2$. This property is analogous to the property of independence which is associated with the random variables.

In changing our random variables to these statistics we have performed a quasi-orthogonal transformation. Quasi-orthogonal transformations are of special interest because the statistics to which they lead have valuable properties. In particular, if our data are composed of random variables from a normal population, these statistics are independent in the probability sense (i.e., stochastically independent) or, in other words, they are uncorrelated. That remark has a rational interpretation which says that the statistics used are not overlapping in the information they reveal about the data. As long as we preserve the property of orthogonality we will be able to reproduce the original random variables at will.
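The two-variable case above can be checked numerically. The following is only a minimal sketch: the observations 24 and 18 are hypothetical values chosen for illustration, and the helper names `transform` and `dot` are not part of the paper.

```python
from fractions import Fraction as F

# Coefficient rows of the two linear functions of section 2.
A = [
    [F(1, 2), F(1, 2)],    # Y1: the sample mean
    [F(1, 2), F(-1, 2)],   # Y2: the contrast
]

def dot(u, v):
    """Sum of products of corresponding entries."""
    return sum(a * b for a, b in zip(u, v))

def transform(coeffs, y):
    """Apply each coefficient row to the observation vector y."""
    return [dot(row, y) for row in coeffs]

y = [F(24), F(18)]           # hypothetical observations
Y1, Y2 = transform(A, y)

# The original variables are recovered as Y1 + Y2 and Y1 - Y2.
recovered = [Y1 + Y2, Y1 - Y2]

# Quasi-orthogonality: products of corresponding coefficients sum to zero.
cross = dot(A[0], A[1])

# The coefficients of the contrast sum to zero.
contrast_sum = sum(A[1])

print(Y1, Y2, recovered, cross, contrast_sum)
```

Exact fractions are used so that the zero checks are not blurred by rounding.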
This reproductive property is guaranteed when the coefficients of the random variables of the statistics are mutually orthogonal (i.e., every statistic is orthogonal to every other one). Since the determinant of such coefficients does not vanish when this is true, our equations (statistics) have a solution which is the explicit designation of the original random variables. The determinant for this problem is

$$\begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & -\tfrac{1}{2} \end{vmatrix} = (\tfrac{1}{2})(-\tfrac{1}{2}) - (\tfrac{1}{2})(\tfrac{1}{2}) = -\tfrac{1}{2}. \tag{1}$$

There is another valuable property of quasi-orthogonal transformations which we shall come to a little later.

3. If we have three observations, we can construct three mutually quasi-orthogonal statistics. Again we might let $Y_1$ be the mean of the random variables, with $Y_2$ and $Y_3$ as contrast statistics. Specifically, let $Y_1 = \tfrac{1}{3} y_1 + \tfrac{1}{3} y_2 + \tfrac{1}{3} y_3$. There exist two other mutually quasi-orthogonal linear statistics which might be chosen, and it can be said that we enjoy the freedom of two choices in the statistics we actually use to summarize the data. We could let

$$Y_2 = \tfrac{1}{2} y_1 - \tfrac{1}{2} y_2 + 0\,y_3; \qquad Y_3 = -\tfrac{1}{3} y_1 - \tfrac{1}{3} y_2 + \tfrac{2}{3} y_3, \tag{2}$$

or, alternatively, a second pair of contrasts,

(3) [the coefficients of this alternative pair $Y_2$, $Y_3$ are illegible in the source].

(It can be shown that there exists an infinity of possible choices!) Either pair of the statistics which we have chosen, together with $Y_1$, can be shown to reproduce the random variables $y_1$, $y_2$ and $y_3$. As a consequence, they possess all the information that the original variables do. In general, if we have $n$ random variables, we might construct a statistic representing the sample mean (which estimates $\theta$) and have $n - 1$ choices or degrees of freedom for other mutually quasi-orthogonal linear statistics to summarize the data. Each degree of freedom then corresponds to a mutually quasi-orthogonal linear function of the random variables. In general, the term degree of freedom does not necessarily refer to a linear function which is orthogonal to all the others which are or may be constructed; however, in common usage it usually does refer to quasi-orthogonal linear functions.

When the observational model we are working with contains only one parameter which is estimated by a linear function, there is little purpose in specifying the remaining degrees of freedom in the form of contrasts. For instance, if our model is $y_i = \theta + e_i$, where $e_i$ is normally distributed with zero mean and variance $\sigma^2$, i.e., $N(0, \sigma^2)$, and $i = 1, \ldots, n$, we would also like to estimate $\sigma^2$. Unfortunately, this parameter is not estimated directly by the linear functions other than $Y_1$. Before proceeding, the other property of quasi-orthogonal transformations will be discussed.
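The count of $n - 1$ free choices can be made concrete with a short sketch. The contrasts built below are Helmert-type rows, which are only one of the infinitely many admissible choices and are not the particular contrasts used in the paper; the function names and the data vector (24, 18, 36) are assumptions made for the illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    """Sum of products of corresponding entries."""
    return sum(a * b for a, b in zip(u, v))

def mean_and_contrasts(n):
    """Return n coefficient rows: the sample mean plus n - 1 contrasts.

    The contrasts are Helmert-type rows (an assumed, illustrative choice).
    """
    rows = [[F(1, n)] * n]                        # Y1: the sample mean
    for k in range(1, n):
        # Mean of the first k observations minus observation k + 1.
        rows.append([F(1, k)] * k + [F(-1)] + [F(0)] * (n - k - 1))
    return rows

def reconstruct(rows, Y):
    """Recover y from the statistics Y, relying on mutual quasi-orthogonality."""
    n = len(rows)
    return [
        sum(rows[j][i] * Y[j] / dot(rows[j], rows[j]) for j in range(n))
        for i in range(n)
    ]

n = 3
rows = mean_and_contrasts(n)
y = [F(24), F(18), F(36)]                         # illustrative observations
Y = [dot(r, y) for r in rows]

# Every pair of distinct rows has a zero sum of coefficient products.
off_diagonal = [dot(rows[j], rows[k]) for j in range(n) for k in range(j + 1, n)]

print(len(rows) - 1, off_diagonal, reconstruct(rows, Y))
```

The reconstruction divides each statistic by the sum of its squared coefficients, which is why the rows need only be quasi-orthogonal rather than orthonormal.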
One might inquire about the relationship of the number called the sum of the squares of the $y_i$'s to the sum of squares of the $Y_j$'s. If we require this number to be invariant, then

$$\sum_{j=1}^{n} Y_j^2 = \sum_{i=1}^{n} y_i^2. \tag{4}$$

For two statistics, we can write in matrix notation,

$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \quad \text{i.e., } Y = Ay. \tag{5}$$

Then, $Y'Y = (Ay)'(Ay) = y'A'Ay$. Now if $\sum_j Y_j^2 = Y'Y$ is to equal $\sum_i y_i^2 = y'y$, then $A'A$ is a two-row, two-column matrix with ones in the main diagonal, i.e.,

$$A'A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

If a matrix $A'$, when multiplied by its transpose $A$, equals a unit matrix, then $A'$ is called an orthogonal matrix, and the $y_i$'s which are transformed to the $Y_j$'s by this matrix are said to be orthogonally transformed. You will notice that the matrix of the coefficients of $Y_1$ and $Y_2$ of section 2 is not an orthogonal matrix, since

$$A'A = \begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix}.$$

If the coefficients of the $y$'s had been $1/\sqrt{2}$'s instead of $\tfrac{1}{2}$'s, then $A'$ would be an orthogonal matrix. Because the matrix of our transformations does not fulfill the accepted mathematical definition of orthogonal transformations, but one very much like them, they are termed, for the purposes of this paper, quasi-orthogonal transformations. However, it seems unnatural to beginning students to define $Y_1$ as $Y_1 = \tfrac{1}{\sqrt{2}} y_1 + \tfrac{1}{\sqrt{2}} y_2$. Actually, for $Y_1$ any linear function with positive and equal coefficients would serve as well as $Y_1$ itself, for they would be logically equivalent and mathematically reducible to the usual definition of the sample mean. If we are to use the common-sense statistics, obviously something must be done in order to preserve the property (4). One thing that can be done is to change our definition of what the sum of squares of the $j$th linear function, $Y_j$, would be. Let us define the sum of squares associated with the linear function $Y_j$ to be

$$SS(Y_j) = \frac{(a_{j1} y_1 + a_{j2} y_2 + \cdots + a_{jn} y_n)^2}{a_{j1}^2 + a_{j2}^2 + \cdots + a_{jn}^2}. \tag{6}$$
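The effect of definition (6) can be seen in a small numerical sketch for the two statistics of section 2. The observations 24 and 18 are hypothetical, and the function name `ss` is an assumption introduced here for the illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    """Sum of products of corresponding entries."""
    return sum(a * b for a, b in zip(u, v))

def ss(a, y):
    """Sum of squares of one linear function, following definition (6)."""
    return dot(a, y) ** 2 / dot(a, a)

a1 = [F(1, 2), F(1, 2)]     # coefficients of Y1 (sample mean)
a2 = [F(1, 2), F(-1, 2)]    # coefficients of Y2 (contrast)
y = [F(24), F(18)]          # hypothetical observations

total = dot(y, y)                             # sum of squares of the y's
raw = dot(a1, y) ** 2 + dot(a2, y) ** 2       # numerators of (6) only
scaled = ss(a1, y) + ss(a2, y)                # definition (6) in full

print(total, raw, scaled)
```

With these values the unscaled squares fall short of the total, while dividing by the sum of squared coefficients restores the invariance required by (4).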
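Using this definition instead of just the numerator of it, property (4) will be preserved. As an illustration of this formula let $j = 1$ and $Y_1 = \tfrac{1}{3} y_1 + \tfrac{1}{3} y_2 + \tfrac{1}{3} y_3$; then

$$SS(Y_1) = \frac{(\tfrac{1}{3} y_1 + \tfrac{1}{3} y_2 + \tfrac{1}{3} y_3)^2}{(\tfrac{1}{3})^2 + (\tfrac{1}{3})^2 + (\tfrac{1}{3})^2} = \frac{(y_1 + y_2 + y_3)^2}{3}, \tag{7}$$

or, for $n$ random variables, $SS(Y_1) = \left(\sum_{i=1}^{n} y_i\right)^2 / n$. Further, if $y_1 = 24$, $y_2 = 18$ and $y_3 = 36$, then $SS(Y_1) = 2028$, and if we use (2), then $SS(Y_2) = 18$ and $SS(Y_3) = 150$. Note that $SS(Y_1) + SS(Y_2) + SS(Y_3) = 2196$ and that $\sum_{i=1}^{3} y_i^2 = 2196$. Thus the sum of squares of the linear functions equals the sum of squares of the random variables. These results can, of course, be generalized to the $n$ variable case. Clearly, the sum of squares of the two linear functions $Y_2$ and $Y_3$ equals the total sum of squares of the random variables minus the sum of squares associated with $Y_1$, so:

$$SS(Y_2) + SS(Y_3) = \sum_{i=1}^{3} y_i^2 - SS(Y_1) = \sum_{i=1}^{3} y_i^2 - \frac{\left(\sum_{i=1}^{3} y_i\right)^2}{3} = \sum_{i=1}^{3} (y_i - \bar{y})^2, \tag{8}$$

or, in general,

$$SS(Y_2) + \cdots + SS(Y_n) = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} = \sum_{i=1}^{n} (y_i - \bar{y})^2. \tag{9}$$

Now define the sample variance of a set of linear functions as the average of the sums of squares associated with the contrast linear functions. We see that for the special case where $n = 3$, our divisor for this average will be 2 because there are two sums of squares to be averaged in (8). This argument accounts for the degrees of freedom divisor which has been traditionally difficult to explain to beginning students in the formula

$$s^2 = \frac{\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2 / n}{n - 1} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}. \tag{10}$$

The statistic $Y_1$ accounts for one degree of freedom in the numerator of the formula for Student's $t$, and the denominator is a function of (10) and is associated with $n - 1$ degrees of freedom. Note that it is not necessary to construct the contrast degrees of freedom to obtain the sums of squares associated with them.

4. The problem just presented is of a simple analysis of variance (anova) type and leads to the test of the hypothesis, $\theta = \theta_0$. The next logical elaboration would be to consider Fisher's $t$ test of the hypothesis, $\theta_1 = \theta_2$. The observation model is $y_{ki} = \theta_k + e_{ki}$, where $i = 1, \ldots, n_k$; $k = 1, 2$ and the $e_{ki}$ are $N(0, \sigma_1^2 = \sigma_2^2)$. The orthogonal linear functions which estimate the parameters $\theta_1$ and $\theta_2$ are $Y_1$ and $Y_2$, respectively. Then,

$$Y_1 = \frac{1}{n_1} y_{11} + \cdots + \frac{1}{n_1} y_{1 n_1} + 0\,y_{21} + \cdots + 0\,y_{2 n_2},$$
$$Y_2 = 0\,y_{11} + \cdots + 0\,y_{1 n_1} + \frac{1}{n_2} y_{21} + \cdots + \frac{1}{n_2} y_{2 n_2},$$

$$SS(Y_3) + \cdots + SS(Y_{n_1 + n_2}) = \sum_{k=1}^{2} \sum_{i=1}^{n_k} y_{ki}^2 - SS(Y_1) - SS(Y_2) = \sum_{k=1}^{2} \sum_{i=1}^{n_k} y_{ki}^2 - \frac{\left(\sum_i y_{1i}\right)^2}{n_1} - \frac{\left(\sum_i y_{2i}\right)^2}{n_2} = \sum_{i=1}^{n_1} (y_{1i} - \bar{y}_1)^2 + \sum_{i=1}^{n_2} (y_{2i} - \bar{y}_2)^2, \tag{11}$$

and if we average these sums of squares, the appropriate denominator will be $n_1 + n_2 - 2$. The numerator of Fisher's $t$ is $Y_1 - Y_2$ under the null hypothesis $\theta_1 = \theta_2$, and the denominator is a function of (11) and is associated with $n_1 + n_2 - 2$ degrees of freedom.

5. As another example, we might consider the regression model, $y_i = \theta + \beta (x_i - \bar{x}) + e_i$, where $i = 1, \ldots, n$ and the $e_i$ are $N(0, \sigma^2_{y \cdot x})$. The linear functions of interest are $Y_1 = \frac{1}{n} \sum_i y_i$ and $Y_2 = (x_1 - \bar{x}) y_1 + \cdots + (x_n - \bar{x}) y_n$. For these functions, $Y_1$ is used to estimate the mean, $\theta$, and $Y_2$, being an average product of the deviation $x$'s and concomitant $y$'s, leads to an estimate of the unknown constant of proportionality, $\beta$. This is rationally and algebraically true, since if $y_i$ and $(x_i - \bar{x})$ tend to proportionately increase and decrease simultaneously or inversely, $Y_2$ will tend to increase absolutely. However, if $y_i$ and $(x_i - \bar{x})$ do not proportionately rise and fall simultaneously or inversely, $Y_2$ will tend to be zero. This can be shown by the following table. In this table, several sets of $x$'s designated by $x_{ik}$, $k = 1, \ldots, 3$, each of which have the same mean, 4, are substituted in $Y_{2k}$ together with their corresponding $y_i$'s. The values of the $Y_{2k}$ are given in the bottom line of Table I.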
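TABLE I. EVALUATION OF $Y_2$ FOR CHANGING VALUES OF $X$ IN THE SIMPLE REGRESSION MODEL
[Table body not recoverable from the source.]

Using (6) we find

$$SS(Y_1) = \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} \quad \text{and} \quad SS(Y_2) = \frac{\left[\sum_{i=1}^{n} (x_i - \bar{x}) y_i\right]^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}. \tag{12}$$

Consequently, to find the sample estimate of $\sigma^2_{y \cdot x}$,

$$SS(Y_3) + \cdots + SS(Y_n) = \sum_{i=1}^{n} y_i^2 - SS(Y_1) - SS(Y_2) = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_i y_i\right)^2}{n} - \frac{\left[\sum_i (x_i - \bar{x}) y_i\right]^2}{\sum_i (x_i - \bar{x})^2} = \sum_{i=1}^{n} (y_i - \bar{y})^2 - b \sum_{i=1}^{n} (x_i - \bar{x}) y_i = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \tag{13}$$

where $b$ is the usual regression coefficient for predicting $y$ from a knowledge of $x$ and $\hat{y}$ is the predicted value of $y$. Again, to find the variance associated with these sums of squares we divide their sum by the number of degrees of freedom from which these sums of squares were derived. This number is $n - 2$. Under the null hypothesis, $\beta = 0$, the denominator of the $t$ test, $t = b / \mathrm{S.E.}_b$, has $(n - 2)$ degrees of freedom and the numerator is associated with one degree of freedom.

6. It is fairly laborious to calculate the $SS(Y_j)$, and because of this it is desirable to have a method whereby the sum of squares associated with several linear functions may be conveniently found. The proof of the method is fairly long and will not be reproduced here. Its exposition will have to suffice. Let $a_j$ be the coefficient vector of the random variables of the $j$th degree of freedom and let $y$ be the observation vector, $(y_1, y_2, \ldots, y_n)$. With these values construct the following system of equations:

$$\begin{aligned} p_1 (a_1 \cdot a_1) + p_2 (a_1 \cdot a_2) + \cdots + p_m (a_1 \cdot a_m) &= (a_1 \cdot y) \\ p_1 (a_2 \cdot a_1) + p_2 (a_2 \cdot a_2) + \cdots + p_m (a_2 \cdot a_m) &= (a_2 \cdot y) \\ &\;\;\vdots \\ p_1 (a_m \cdot a_1) + p_2 (a_m \cdot a_2) + \cdots + p_m (a_m \cdot a_m) &= (a_m \cdot y) \end{aligned} \tag{14}$$

When these equations are solved, by whatever method is convenient, the sum of squares for the $m$ degrees of freedom, $Y_1, Y_2, \ldots, Y_m$ $(m < n)$, is given by

$$p_1 (a_1 \cdot y) + p_2 (a_2 \cdot y) + \cdots + p_m (a_m \cdot y). \tag{15}$$

The method reveals the correct sum of squares whether or not the degrees of freedom are mutually orthogonal, but we shall illustrate it for the orthogonal case. Consider again (2) and then let $a_2 = (\tfrac{1}{2}, -\tfrac{1}{2}, 0)$, $a_3 = (-\tfrac{1}{3}, -\tfrac{1}{3}, \tfrac{2}{3})$ and $y = (y_1, y_2, y_3) = (24, 18, 36)$. Corresponding to (14) we have

$$\begin{aligned} p_2 (\tfrac{1}{2}) + p_3 (0) &= 3 \\ p_2 (0) + p_3 (\tfrac{2}{3}) &= 10. \end{aligned} \tag{16}$$

Therefore $p_2 = 6$ and $p_3 = 15$; then $SS(Y_2) + SS(Y_3) = 6(3) + (15)(10) = 168$.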
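The procedure of (14) and (15) lends itself to a short computational sketch. The code below is not the author's procedure as printed, only one way of carrying it out: the elimination routine is a generic exact solver, the coefficient vectors $a_2 = (\tfrac{1}{2}, -\tfrac{1}{2}, 0)$ and $a_3 = (-\tfrac{1}{3}, -\tfrac{1}{3}, \tfrac{2}{3})$ are taken as those implied by the right-hand sides 3 and 10 of (16), and the non-orthogonal pair `b2`, `b3` is a hypothetical choice added to illustrate the claim that orthogonality is not required.

```python
from fractions import Fraction as F

def dot(u, v):
    """Sum of products of corresponding entries."""
    return sum(a * b for a, b in zip(u, v))

def solve(M, b):
    """Solve M p = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        # Bring a row with a non-zero pivot into position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def pooled_ss(vectors, y):
    """Sum of squares for m degrees of freedom via systems (14) and (15)."""
    gram = [[dot(u, v) for v in vectors] for u in vectors]   # left side of (14)
    rhs = [dot(u, y) for u in vectors]                       # right side of (14)
    p = solve(gram, rhs)
    return p, dot(p, rhs)                                    # expression (15)

y = [F(24), F(18), F(36)]

# Orthogonal case: the contrasts assumed for (2).
a2 = [F(1, 2), F(-1, 2), F(0)]
a3 = [F(-1, 3), F(-1, 3), F(2, 3)]
p_orth, ss_orth = pooled_ss([a2, a3], y)

# Hypothetical non-orthogonal pair spanning the same two degrees of freedom.
b2 = [F(1), F(-1), F(0)]
b3 = [F(0), F(1), F(-1)]
p_skew, ss_skew = pooled_ss([b2, b3], y)

print(p_orth, ss_orth, ss_skew)
```

Both pairs give the same pooled sum of squares, which agrees with the total sum of squares of the observations less the part associated with the mean.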
In some previous work in section 3, we found $SS(Y_2) = 18$ and $SS(Y_3) = 150$, so this result checks. In this problem, $Y_1$ was neglected in order to show that (14) is quite general for any $m < n$.

7. All of these principles may be easily generalized to the multivariate case. What is needed is to use matrix variables instead of the single ones we have been using. Using the Least Squares Principle, the ideas presented here (and many others) have been applied to multivariate analysis of variance in reference number 4. The following and last example is taken from this source. Suppose $y_1^1 = 11$, $y_2^1 = 5$, $y_3^1 = 8$; $y_1^2 = 2$, $y_2^2 = 6$, $y_3^2 = 13$. Here the superscripts indicate which variate is being considered (these numbers are not to be confused with powers), and the subscripts designate the variables. Also let

$$Y_1^{\alpha} = \tfrac{1}{3} y_1^{\alpha} + \tfrac{1}{3} y_2^{\alpha} + \tfrac{1}{3} y_3^{\alpha}, \qquad Y_2^{\alpha} = \tfrac{1}{2} y_1^{\alpha} - \tfrac{1}{2} y_2^{\alpha} + 0\,y_3^{\alpha}, \qquad Y_3^{\alpha} = -\tfrac{1}{3} y_1^{\alpha} - \tfrac{1}{3} y_2^{\alpha} + \tfrac{2}{3} y_3^{\alpha},$$

where $\alpha = 1, 2$. We have, corresponding to (14), for the first variate

$$\begin{aligned} p_1^1 (\tfrac{1}{3}) + p_2^1 (0) + p_3^1 (0) &= 8 \\ p_1^1 (0) + p_2^1 (\tfrac{1}{2}) + p_3^1 (0) &= 3 \\ p_1^1 (0) + p_2^1 (0) + p_3^1 (\tfrac{2}{3}) &= 0. \end{aligned} \tag{17}$$

Therefore, $p_1^1 = 24$, $p_2^1 = 6$, $p_3^1 = 0$, and using (15) we find $(24)(8) + (6)(3) + (0)(0) = 210$, which is equal to $11^2 + 5^2 + 8^2$. For the second variate

$$\begin{aligned} p_1^2 (\tfrac{1}{3}) + p_2^2 (0) + p_3^2 (0) &= 7 \\ p_1^2 (0) + p_2^2 (\tfrac{1}{2}) + p_3^2 (0) &= -2 \\ p_1^2 (0) + p_2^2 (0) + p_3^2 (\tfrac{2}{3}) &= 6. \end{aligned} \tag{18}$$

Solving, we get $p_1^2 = 21$, $p_2^2 = -4$, $p_3^2 = 9$ and, corresponding to (15), $(21)(7) + (-4)(-2) + (9)(6) = 209$, which is equal to $2^2 + 6^2 + 13^2$. The sum of cross-products of these three vector degrees of freedom for the two variates may be found in one of two ways; either $(24)(7) + (6)(-2) + (0)(6) = 156$ or $(21)(8) + (-4)(3) + (9)(0) = 156$. Both results are equal to $(11)(2) + (5)(6) + (8)(13)$. The matrix

$$\begin{pmatrix} 210 & 156 \\ 156 & 209 \end{pmatrix}$$

corresponds to the total sum of squares and cross-products for the bivariate sample observations which have been transformed by the vector degrees of freedom $Y_j^{\alpha}$, $j = 1, 2, 3$; $\alpha = 1, 2$. We note that the sums of squares and cross-products of the variables for each variate are preserved by the orthogonal vector set of degrees of freedom. This simple problem serves to illustrate this invariance property for a multivariate case.

8. Summary

We have seen that certain statistical problems are formulated in terms of linear functions of the random variables. These linear functions, called degrees of freedom, served the purpose of presenting the data in a more usable form because the functions led directly or indirectly to estimates of the parameters of the observation model and the estimate of variance of the observations. Moreover, these estimates may be used to test hypotheses about the population parameters by the standard statistical tests.

Modern statistical usage of the concept of degrees of freedom had its inception in Student's classic work, reference 7, which is often considered the paper which was necessary to the development of modern statistics. Fisher, beginning with his frequency distribution study, reference 2, has generalized this work in his many contributions to the general theory of regression analysis.
This paper has resulted from an attempt to bring clarification to the statistical interpretation of degrees of freedom. The author feels that this attempt will not be altogether successful, for there remain many questions which students may or should ask that have not been answered here. A satisfactory exposition could be given by a complete presentation of the theory of least squares which is slanted towards the problems of modern regression theory of the analysis of variance type. This discussion would appropriately take book form, however.

REFERENCES

1. Cramer, Harald, Mathematical Methods of Statistics (Princeton, N.J.: Princeton University Press, 1946).
2. Fisher, Ronald A., "Frequency Distribution of the Values of the Correlation Coefficient in Samples from an Indefinitely Large Population," Biometrika, X (1915).
3. Johnson, Palmer O., Statistical Methods in Research (New York: Prentice-Hall, Inc., 1948).
4. Moonan, William J., The Generalization of the Principles of Some Modern Experimental Designs for Educational and Psychological Research. Unpublished thesis, University of Minnesota, Minneapolis, Minnesota.
5. Rulon, Phillip J., "Matrix Representation of Models for the Analysis of Variance and Covariance," Psychometrika, XIV (1949).
6. Snedecor, George W., Statistical Methods (Ames, Iowa: Collegiate Press, 1946).
7. Student, "The Probable Error of the Mean," Biometrika, VI (1908), pp. 1-25.
8. Tukey, John W., "Standard Methods of Analyzing Data," Proceedings: Computation Seminar (New York: International Business Machines Corporation, 1949).
9. Walker, John W., "Degrees of Freedom," Journal of Educational Psychology, XXI (1940).
10. Walker, Helen M., Mathematics Essential for Elementary Statistics (New York: Henry Holt and Co., 1951).
11. Yates, Frank, The Design and Analysis of Factorial Experiments, Imperial Bureau of Soil Science, Technical Communication No. 35, Harpenden, England: 1937.