Introduction to the article Degrees of Freedom.


 Ethelbert Curtis
 1 years ago
 Views:
Transcription
1 Introduction to the article Degree of Freedom. The article by Walker, H. W. Degree of Freedom. Journal of Educational Pychology. 3(4) (940) 5369, wa trancribed from the original by Chri Olen, George Wahington High School, Cedar Rapid, Iowa. Chri ha made every attempt to reproduce the "look and feel" of the article a well a the article itelf, and did not attempt in any way to update the ymbol to more "modern" notation. Three typographical error were found in the paper. Thee error are noted in the paragraph below. The article, except for pagination and placement of diagram, i a it originally appear. The trancribed page are not numbered to avoid confuion with pagination in the original article. Typographical error: () In the ection on tditribution (the 7th of thee note) the lat entence hould read The curve i alway ymmetrical, but i le peaked than the normal when n i mall. () In the ection (b) Variance of Regreed Value about Total Mean (the th page of x thee note) x and y are revered in the expreion Y% M y = r ( X M x ). It hould y read Y% M = r ( X M ) y x x y (3) In the ection Tet Baed on Ratio of Two Variance (the 4th page of thee ( ) r r note), the econd entence, we may divide by obtaining N r ( r ) r ( N ) hould read we may divide by obtaining. N r r ( N ). r Another poible confuion to modern ear may come in the ection entitled "Fditribution and zditribution." The zditribution mentioned i NOT the tandardized normal ditribution, but i a ditribution known a "Fiher' z ditribution." A potential problem in reading thi file (other than not having Word!) i [that] the equation, which were inerted uing MathType from Deign Science. Chri ued Math Type 4.0, and if you have anything le it could be a problem. A Math Type reader program can be downloaded from the web. [ Follow the path to upport. ]
2 Degree of Freedom. Journal of Educational Pychology. 3(4) (940) DEGREES OF FREEDOM HELEN M. WALKER Aociate Profeor of Education, Teacher College, Columbia Univerity A concept of central importance to modern tatitical theory which few textbook have attempted to clarify i that of "degree of freedom." For the mathematician who read the original paper in which tatitical theory i now making uch rapid advance, the concept i a familiar one needing no particular explanation. For the peron who i unfamiliar with Ndimenional geometry or who know the contribution to modern ampling theory only from econdhand ource uch a textbook, thi concept often eem almot mytical, with no practical meaning. Tippett, one of the few textbook writer who attempt to make any general explanation of the concept, begin hi account (p. 64) with the entence, "Thi conception of degree of freedom i not altogether eay to attain, and we cannot attempt a full jutification of it here; but we hall how it reaonablene and hall illutrate it, hoping that a a reult of familiarity with it ue the reader will appreciate it." Not only do mot text omit all mention of the concept but many actually give incorrect formula and procedure becaue of ignoring it. In the work of modern tatitician, the concept of degree of freedom i not found before "Student'" paper of 908, it wa firt made explicit by the writing of R. A. Fiher, beginning with hi paper of 95 on the ditribution of the correlation coefficient, and ha only within the decade or o received general recognition. Neverthele the concept wa familiar to Gau and hi atronomical aociate. In hi claical work on the Theory of the Combination of Obervation (Theoria Combinationi Obervationum Erroribu Minimi Obnoxiae) and alo in a work generalizing the theory of leat quare with reference to the combination of obervation (Ergänzung zur Theorie der den kleinten Fehlern unterworfen Combination der Beobachtungen, 86), he tate both in word and by formula that the number of obervation i to be decreaed by the number of unknown etimated from the data to erve a divior in etimating the tandard error x of a et of obervation, or in our terminology σ = where r i the number of N r parameter to be etimated from the data. The preent paper i an attempt to bridge the gap between mathematical theory and common practice, to tate a imply a poible what degree of freedom repreent, why the concept i important, and how the appropriate number may be readily determined. The treatment ha been made a nontechnical a poible, but thi i a cae where the mathematical notion i impler than any nonmathematical interpretation of it. The paper
3 will be developed in four ection: (I) The freedom of movement of a point in pace when ubject to certain limiting condition, (II) The repreentation of a tatitical ample by a ingle point in Ndimenional pace, (III) The import of the concept of degree of freedom, and (IV) Illutration of how to determine the number of degree of freedom appropriate for ue in certain common ituation. I. THE FREEDOM OF MOVEMENT OF A POINT IN SPACE WHEN SUBJECT TO CERTAIN LIMITING CONDITIONS A a preliminary introduction to the idea, it may be helpful to conider the freedom of motion poeed by certain familiar object, each of which i treated a if it were a mere moving point without ize. A drop of oil liding along a coil pring or a bead on a wire ha only one degree of freedom for it can move only on a onedimenional path, no matter how complicated the hape of that path may be. A drop of mercury on a plane urface ha two degree of freedom, moving freely on a twodimenional urface. A moquito moving freely in threedimenional pace, ha three degree of freedom. Conidered a a moving point, a railroad train move backward and forward on a linear path which i a onedimenional pace lying on a twodimenional pace, the earth' urface, which in turn lie within a threedimenional univere. A ingle coördinate, ditance from ome origin, i ufficient to locate the train at any given moment of time. If we conider a fourdimenional univere in which one dimenion i of time and the other three dimenion of pace, two coördinate will be needed to locate the train, ditance in linear unit from a patial origin and ditance in time unit from a time origin. The train' path which had only one dimenion in a pace univere ha two dimenion in a pacetime univere. A canoe or an automobile move over a twodimenional urface which lie upon a threedimenional pace, i a ection of a threedimenional pace. At any given moment, the poition of the canoe, or auto, can be given by two coördinate. Referred to a fourdimenional pacetime univere, three coördinate would be needed to give it location, and it path would be a pace of three dimenion, lying upon one of four. In the ame ene an airplane ha three degree of freedom in the uual univere of pace, and can be located only if three coördinate are known. Thee might be latitude, longitude, and altitude; or might be altitude, horizontal ditance from ome origin, and an angle; or might be direct ditance from ome origin, and two direction angle. If we conider a given intant of time a a ection through the pacetime univere, the airplane move in a fourdimenional path and can be located by four coördinate, the three previouly named and a time coördinate. The degree of freedom we have been conidering relate to the motion of a point, or freedom of tranlation. In mechanic freedom of rotation would be equally important. A point, which ha poition only, and no ize, can be tranlated but not rotated. A real canoe can turn over, a real airplane can turn on it axi or make a noe dive, and o thee real bodie have degree of freedom of rotation a well a of tranlation. The parallelim
4 between the ampling problem we are about to dicu and the movement of bodie in pace can be brought out more clearly by dicuing freedom of tranlation, and diregarding freedom of rotation, and that ha been done in what follow. If you are aked to chooe a pair of number (x, y) at random, you have complete freedom of choice with regard to each of the two number, have two degree of freedom. The number pair may be repreented by the coördinate of a point located in the x, y plane, which i a twodimenional pace. The point i free to move anywhere in the horizontal direction parallel to the xx' axi, and i alo free to move anywhere in the vertical direction, parallel to the yy' axi. There are two independent variable and the point ha two degree of freedom. Now uppoe you are aked to chooe a pair of number whoe um i 7. It i readily apparent that only one number can be choen freely, the econd being fixed a oon a the firt i choen. Although there are two variable in the ituation, there i only one independent variable. The number of degree of freedom i reduced from two to one by the impoition of the condition x + y = 7. The point i not now free to move anywhere in the xy plane but i contrained to remain on the line whoe graph i x + y = 7, and thi line i a onedimenional pace lying in the original twodimenional pace. Suppoe you are aked to chooe a pair of number uch that the um of their quare i 5. Again it i apparent that only one number can be choen arbitrarily, the econd being fixed a oon a the firt i choen. The point repreented by a pair of number mut lie on a circle with center at the origin and radiu 5. Thi circle i a onedimenional pace lying in the original twodimenional plane. The point can move only forward or backward along thi circle, and ha one degree of freedom only. There were two number to be choen (N = ) ubject to one limiting relationhip (r = ) and the reultant number of degree of freedom i N r = =. Suppoe we imultaneouly impoe the two condition x + y = 7 and x + y = 5. If we olve thee equation algebraically we get only two poible olution, x = 3, y = 4, or x = 4, y = 3. Neither variable can be choen at will. The point, once free to move in two direction, i now contrained by the equation x + y = 7 to move only along a traight line, and i contrained by the equation x + y = 5 to move only along the circumference of a circle, and by the two together i confined to the interection of that line and circle. There i no freedom of motion for the point. N = and r =. The number of degree of freedom i N r = = 0. Conider now a point (x, y, z) in threedimenional pace (N = 3). If no retriction are placed on it coördinate, it can move with freedom in each of three direction, ha three degree of freedom. All three variable are independent. If we et up the retriction x + y + z = c, where c i any contant, only two of the number can be freely choen, only two are independent obervation. For example, let x y z = 0. If now we chooe, ay, x = 7 and y = 9, then z i forced to be. The equation x y z = c i the equation of a plane, a twodimenional pace cutting acro the original three
5 dimenional pace, and a point lying on thi pace ha two degree of freedom. bn r = 3 =. g If the coördinate of the (x, y, z) point are made to conform to the condition x + y + z = k, the point will be forced to lie on the urface of a phere whoe center i at the origin and whoe radiu i k. The urface of a phere i a twodimenional pace. (N = 3, r =, N r = 3 =.). If both condition are impoed imultaneouly, the point can lie only on the interection of the phere and the plane, that i, it can move only along the circumference of a circle, which i a onedimenional figure lying in the original pace of three dimenion. ( N r = 3 =.) Conidered algebraically, we note that olving the pair of equation in three variable leave u a ingle equation in two variable. There can be complete freedom of choice for one of thee, no freedom for the other. There i one degree of freedom. The condition x = y = z i really a pair of independent condition, x = y and x = z, the condition y = z being derived from the other two. Each of thee i the equation of a plane, and their interection give a traight line through the origin making equal angle with the three axe. If x = y = z, it i clear that only one variable can be choen arbitrarily, there i only one independent variable, the point i contrained to move along a ingle line, there i one degree of freedom. Thee idea mut be generalized for N larger than 3, and thi generalization i necearily abtract. Too ardent an attempt to viualize the outcome lead only to confuion. Any et of N number determine a ingle point in Ndimenional pace, each number providing one of the N coördinate of that point. If no relationhip i impoed upon thee number, each i free to vary independently of the other, and the number of degree of freedom i N. Every neceary relationhip impoed upon them reduce the number of degree of freedom by one. Any equation of the firt degree connecting the N variable i the equation of what may be called a hyperplane (Better not try to viualize!) and i a pace of N dimenion. If, for example, we conider only point uch that the um of their coördinate i contant, X = c, we have limited the point to an N pace. If we conider only point uch that ( X M) = k, the locu i the urface of a hyperhpere with center at the origin and raidu equal to k. Thi urface i called the locu of the point and i a pace of N r dimenion lying within the original N pace. The number of degree of freedom would be N r. II. THE REPRESENTATION OF A, STATISTICAL SAMPLE BY A POINT IN NDIMENSIONAL SPACE If any N number can be repreented by a ingle point in a pace of N dimenion, obviouly a tatitical ample of N cae can be o repreented by a ingle ample point. Thi device, firt employed by R. A. Fiher in 95 in a celebrated paper ( Frequency
6 ditribution of the value of the correlation coefficient in ample from an indefinitely large population ) ha been an enormouly fruitful one, and mut be undertood by thoe who hope to follow recent development. Let u conider a ample pace of N dimenion, with the origin taken at the true population mean, which we will call µ o that X µ = x, X µ = x, etc., where X, X,... X N are the raw core of the N individual in the ample. Let M be the mean and the tandard deviation of a ample of N cae. Any et of N obervation determine a ingle ample point, uch a S. Thi point ha N degree of freedom if no condition are impoed upon it coördinate. All ample with the ame mean will be repreented by ample point lying on the hyperplane ( X µ ) + ( X µ ) ( XN µ ) = N( M µ ), or X = NM, a pace of N dimenion. If all cae in a ample were exactly uniform, the ample point would lie upon the line X µ = X µ = X µ = = X µ = M µ which i the line OR in Fig., a ( ) ( ) ( ) ( ) 3... N line making equal angle with all the coördinate axe. Thi line cut the plane X = NM at right angle at a point we may call A. Therefore, A i a point whoe coördinate are each equal to M µ. By a wellknown geometric relationhip, Fig.
7 ( µ ) ( µ ) ( µ ) OS = X + X X N OA ( µ ) = N M OS = OA + AS ( µ ) ( µ ) AS = X N M = X NM = N Therefore, OA= ( M µ ) N and AS = N. The ratio OA OS i thu M µ and i proportional to the ratio of the amount by which a ample mean deviate from the population mean to it own tandard error. The fluctuation of thi ratio from ample to ample produce what i known a the tditribution. For computing the variability of the core in a ample around a population mean which i known a priori, there are available N degree of freedom becaue the point S move in Ndimenional pace about O; but for computing the variability of thee ame core about the mean of their own ample, there are available only N degree of freedom, becaue one degree ha been expended in the computation of that mean, o that the point S move about A in a pace of only N dimenion. Fiher ha ued thee patial concept to derive the ampling ditribution of the correlation coefficient. The full derivation i outide the cope of thi paper but certain apect are of interet here. When we have N individual each meaured in two trait, it i cutomary to repreent the N pair of number by a correlation diagram of N point in twodimenional pace. The ame data can, however, be repreented by two point in Ndimenional pace, one point repreenting the N value of X and the other the N value of Y. In thi frame of reference the correlation coefficient can be hown to be equal to the coine of the angle between the vector to the two point, and to have N degree of freedom. III. THE IMPORT OF THE CONCEPT If the normal curve adequately decribed all ampling ditribution, a ome elementary treatie eem to imply, the concept of degree of freedom would be relatively unimportant, for thi number doe not appear in the equation of the normal curve, the hape of the curve being the ame no matter what the ize of the ample. In certain other important ampling ditribution  a for example the Poion  the ame thing i true, that the hape of the ditriution i independent of the number of degree of freedom involved. Modern tatitical analyi, however, make much ue of everal very important ampling ditribution for which the hape of the curve change with the effective ize of the ample. In the equation of uch curve, the number of degree of freedom appear a a parameter (called n in the equation which follow) and probability table built from thee curve mut be entered with the correct value of n. If a mitake i made in determining n from the data, the wrong probability value will be obtained from the table, and the ignificance of the tet employed will be wrongly interpreted. The
8 Chiquare ditribution, the tditribution, and the F and z ditribution are now commonly ued even in elementary work, and the table for each of thee mut be entered with the appropriate value of n. Let u now look at a few of thee equation to ee the rôle played in them by the number of degree of freedom. In the formula which follow, C repreent a contant whoe value i determined in uch a way a to make the total area under the curve equal to unity. Although thi contant involve the number of degree of freedom, it doe not need to be conidered in reading probability table becaue, being a contant multiplier, it doe not affect the proportion of area under any given egment of the curve, but erve only to change the cale of the entire figure. Normal Curve. y = Ce x σ The number of degree of freedom doe not appear in the equation, and o the hape of the curve i independent of it. The only variable to be hown in a probability table are x/ and y or ome function of y uch a a probability value. Chiquare. ( χ ) n x y = C e The number of degree of freedom appear in the exponent. When n =, the curve i Jhaped. When n =, the equation reduce to y= Ce and ha the form of the poitive half of a normal curve. The curve i alway poitively kewed, but a n increae it become more and more nearly like the normal, and become approximately normal when n i 30 or o. A probability table mut take account of three variable, the ize of Chiquare, the number of degree of freedom, and the related probability value. tditribution x y= C3 + ( n+ ) t n The number of degree of freedom appear both in the exponent and in the fraction t / n. The curve i alway ymmetrical, but i more peaked than the normal when n i
9 mall. Thi curve alo approache the normal form a n increae. A table of probability value mut be entered with the computed value of t and alo with the appropriate value of n. A few elected value will how the comparion between etimate of ignificance read from a table of the normal curve and a ttable. For a normal curve, the proportion of area in both tail of the curve beyond 3 i.007. For a tditribution the proportion i a follow: n p Again, for a normal curve, the point uch that.0 of the area i in the tail i.56 from the mean. For a tditribution, the poition of thi point i a follow: n x / σ Fditribution and zditribution n F y= C and y = C 4 n + n 5 n + n z ( nf + n ) ( ne + n ) e nz In each of thee equation, which provide the table ued in analyi of variance problem, there occur not only the computed value of F (or of z), but alo the two parameter n and n, n being the number of degree of freedom for the mean quare in the numerator of F and n the number of degree of freedom for that in the denominator. Becaue a probability table mut be entered with all three, uch a table often how the value for elected probability value only. The table publihed by Fiher give value for p =.05, p =.0, and p =.00; thoe by Snedecor give p =.05 and p =0.
10 Sampling Ditribution of r. Thi i a complicated equation involving a parameter the true correlation in the population,?; the oberved correlation in the ample, r; and the number of degree of freedom. If ρ = 0 the ditribution i ymmetrical. If ρ 0 and n i large, the ditribution become normal. If ρ 0 and n i mall the curve i definitely kewed. David' Table of the Correlation Coefficient (Iued by the Biometrika Office, Univerity College, London, 938) mut be entered with all three parameter. IV. DETERMINING THE APPROPRIATE NUMBER OF DEGREES OF FREEDOM A univeral rule hold: the number of degree of freedom i alway equal to the number of obervation minu the number of neceary relation obtaining among thee obervation. In geometric term, the number of obervation i the dimenionality of the original pace and each relationhip repreent a ection through that pace retricting the ample point to a pace of one lower dimenion. Impoing a relationhip upon the obervation i equivalent to etimating a parameter from them. For example, the relationhip X = NM indicate that the mean of the population ha been etimated from obervaiton. The number of degree of freedom i alo equal to the number of independent obervation, which i the number of original obervation minu the number of parmeter etimated from them. Standard Error of a Mean. Thi i σ mean = σ N when i known for the population. A i eldom known a priori, we are uually forced to make ue of the oberved tandard deviation in the ample, which we will call. In thi cae σ meanb N, one degree of freedom being lot becaue deviation have been taken around the ample mean, o that we have impoed one limiting relationhip, X = NM, and have thu retricted the ample point to a hyperplane of N dimenion. Without any reference to geometry, it can be hown by an algebraic olution that NB σ N. (The ymbol B i to be read "tend to equal" or "approximate.") Goodne of Fit of Normal Curve to a Set of Data.The number of obervation i the number of interval in the frequency ditribution for which an oberved frequency i compared with the frequency to be expected on the aumption of a normal ditribution. If thi normal curve ha an arbitrary mean and tandard deviation agreed upon in advance, the number of degree of freedom with which we enter the Chiquare table to tet goodne of fit i one le than the number of interval. In thi cae one retriction i impoed; namely f = f ' f a theoretical frequency. If, where f i an oberved and ' however, a i more common, the theoretical curve i made to conform to the oberved data in it mean and tandard deviation, two additional retriction are impoed; namely
11 fx = fx ' and f ( X M ) = f '( X M ), o that the number of degree of freedom i three le than the number of interval compared. It i clear that when the curve are made to agree in mean and tandard deviation, the dicrepancy between oberved and theoretical frequencie will be reduced, o the number of degree of freedom in relation to which that dicrepancy i interpreted hould alo be reduced. Relationhip in a Contingency Table.Suppoe we wih to tet the exitence of a relationhip between trait A, for which there are three categorie, and trait B, for which there are five, a hown in Fig.. We have fifteen cell in the table, giving u fifteen obervation, inamuch a an "obervation" i now the frequency in a ingle cell. If we want to ak whether there i ufficient evidence to believe that in the population from which thi ample i drawn A and B are independent, we need to know the cell frequencie which would be expected under that hypothei. There are then fifteen comparion to be made between oberved frequencie and expected frequencie. But. are all fifteen of thee comparion independent? If we had a priori information a to how the trait would be ditributed theoretically, then all but one of the cell comparion would be independent, the lat cell frequency being fixed in order to make up the proper total of one hundred fifty, and the degree of freedom would be 5 = 4. Thi i the ituation Karl Pearon had in mind when he firt developed hi Chiquare tet of goodne of fit, and Table XII in Vol. I of hi Table for Statitician and Biometrician i made up on the aumption that the number of degree of freedom i one le than the number of obervation. To ue it when that i not the cae we merely readjut the valuof n with which we enter the table. In practice we almot never have a priori etimate of theoretical frequencie, but mut obtain them from the obervation themelve, thu impoing retriction on the number of independent obervation and reducing the degree of freedom available for etimating reliability. In thi cae, if we etimate the theoretical frequencie from the f ' = 0 40 /50 and other in imilar fahion. data, we would etimate the frequency ( )( ) Getting the expected cell frequencie from the oberved marginal frequencie impoe the following relationhip: ( a) f + f + f + f + f = f + f + f + f + f = f3+ f3+ f33+ f43+ f53 = 50 () b f + f + f = 0 3 f + f + f = 0 3 f + f + f = f + f + f = f + f + f = () c f + f f + f f =
12 A A A3 A A A3 B B f f f3 0 B B f f f3 0 B B3 f3 f3 f33 35 B B4 f4 f4 f43 30 B B5 f5 f5 f FIG. Oberved joint frequency ditri FIG. 3.Oberved marginal frequencie of bution of two trait A and B. two trait A and B. At firt ight, there eem to be nine relationhip, but it i immediately apparent that (c) i not a new one, for it can be obtained either by adding the three (a) equation or the five (b) equation. Alo any one of the remaining eight can be obtained by appropriate manipulation of the other even. There are then only even independent neceary relationhip impoed upon the cell frequencie by requiring them to add up to the oberved marginal total. Thu n = 5 7 = 8 and if we compute Chiquare, we mut enter the Chiquare table with eight degree of freedom. The ame reult can be obtained by noting that two entrie in each row and four in each column can be choen arbitrarily and there i then no freedom of choice for the remaining entrie. In general in a contingency table, if c = number of column and r = number of row, the number of degree of freedom i n = bc gb r g or n = rc br + c g. Variance in a Correlation Table.Suppoe we have a catter diagram with c column, the frequencie in the variou column being n, n,... n c, the mean value of Y for the column being m, m,... m c, and the regreion value of Y etimated from X beingy %%% n, Y,... Y c. Thu for any given column, the um of the Y' i i fy = nm i i. For the entire table N = n+ n n c, c n i NM = fy, o that NM = nm + nm nm. Now we may be intereted in the variance of all the core about the total mean, of all the core about their own column mean, of all the core about the regreion line, of regreed value about the total mean, of column mean about the total mean, or of column mean about the regreion line, and we may be intereted in comparing two uch variance. It i neceary to know how many degree of freedom are available for uch comparion. c c
13 (a) Total Variance.For the variance of all core about the total mean, thi i N = ( Y M ), we have N obervation and only one retriction; namely, N fy = NM. Thu there are N degree, of freedom. (b) Variance of Regreed Value about Total Mean.The equation for the regreed x value being Y% M y = r ( X M x ), it i clear that a oon a x i known, y i alo y known. The ample point can move only on a traight line. There i only one degree of freedom available for the variance of regreed value. Y (c) Variance of Score about Regreion Line.There are N reidual of the form Y r xy. % and their variance i the quare of the tandard error of etimate, or y ( ) There are N obervation and two retriction; namely, and ( Y ) = 0 f Y% ( ) = y( xy ) f Y Y% N r. Thu there are N degree of freedom available. (d) Variance of Score about Column Mean.If from each core we ubtract not the regreion value but the mean of the column in which it tand, the variance of the E where E i the correlation ratio obtained from reidual thu obtained will be y ( ) the ample. There are N uch reidual. For each column we have the retriction n i fy = nm i i, making c retriction in all. The number of degree of freedom for the variance within column i therefore N c. (e) Variance of Column Mean about Total MeanTo compute thi variance we have c obervation, i.e., the mean of c column, retricted by the ingle relation c NM = nm, and therefore have c degree of freedom. The variance itelf can be proved to be i i y E, and repreent the variance among the mean of column, (f) Variance of Column Mean about Regreion Line.If for each column we find the difference mi % Yi between the column mean and the regreion value, and then find
14 c f ( ) i mi Y% i, the reult will be ( E r ) which i a variance repreenting the N y departure of the mean from linearity. There i one uch difference for each column, giving u c obervation, and thee obervation are retricted by the two relationhip c c f i( m Y i % i ) = 0 and f ( m i i Y % i) = N y ( E r ). Therefore, we have c degree of freedom. The following cheme how thee relationhip in ummary form: Source of variation Formula Degree of Freedom (d) Score about column mean... ( E ) N c (e) Mean about total mean..... E c (a) Total..... N (c) Score about regreion line... (b) Regreed value about total mean... (a) Total... ( r ) r N N (d) Score about column mean.... (f) Column mean about regreion line... (c) Score about regreion line... (b) Regreed value about total mean... (f) Column mean about regreion line... (e) Column mean about total mean.. (b) Regreed value about total mean.. (f) Column mean about regreion line. (d) Score about column mean.. (a) Total.. ( E ) ( ) ( r ) E r r ( ) E r E r ( ) ( E ) E r N c c N c c c N c N It i apparent that thee variance have additive relationhip and that their repective degree of freedom have exactly the ame additive relationhip.
15 Tet Baed on Ratio of Two Variance.From any pair of thee additive variance, we may make an important tatitical tet. Thu, to tet whether linear correlation exit ( ) ( ) r r r N in the population or not, we may divide by obtaining. To N r tet whether a relationhip meaureable by the correlation ratio exit in the population, E ( E ) E N c we may divide by obtaining. To tet whether c N c E c ( E r ) E r correlation i linear, we may divide by or may c r c ( ) ( ) r obtaining ( ) E r E E r N c divide by obtaining. In each cae, the reulting c N c E c value i referred to Snedecor' Ftable which mut be entered with the appropriate number of degree of freedom for each variance. Or we may find the logarithm of the ratio to the bae e, take half of it, and refer the reult to Fiher ztable, which alo mut be entered with the appropriate number of degree of freedom for each variance. Partial Correlation.For a coefficient of correlation of zero order, there are N degree of freedom. Thi i obviou, ince a traight regreion line can be fitted to any two point without reidual, and the firt two obervation furnih no etimate of the ize of r. For each variable that i held contant in a partial correlation, one additional degree of freedom i lot, o that for a correlation coefficient of the pth order, the degree of freedom are N p. Thi place a limit upon the number of meaningful interrelationhip which can be obtained from a mall ample. A an extreme illutration, uppoe twentyfive variable have been meaured for a ample of twentyfive cae only, and all the intercorrelation computed, a well a all poible partial correlation the partial of the twentythird order will of neceity be either + or, and thu are meaningle. Each uch partial will be aociated with 5 3 degree of freedom. If the partial were not + or the error variance fantatic ituation. σ ( r ) N p would become infinite, a
16 BIBLIOGRAPHY Dawon, S.: An Introduction to the Computation of Statitic. Univerity of London Pre, 933, p. 4. No general dicuion. Give rule for χ only. Ezekiel, M.: Method of Correlation Analyi. John Wiley & Son, 930, p.. Fiher, R. A.: "Frequency ditribution of the value of the correlation coefficient in ample from an indefinitely large population." Biometrika, Vol. x, 95, pp Firt application of ndimenional geometry to ampling theory. Fiher, R. A.: Statitical Method for Reearch Worker. Oliver and Boyd. Thi ha now gone through even edition. The term "degree of freedom" doe not appear in the index, but the concept occur contantly throughout the book. Goulden, C. H.: Method of Statitical Analyi. John Wiley and Son, Inc., 939. See index. Guilford, J. P.: Pychometric Method. McGrawHill, 936, p Mill, F. C.: Statitical Method Applied to Economic and Buine. Henry Holt & Co., nd ed., 938. See index. Rider, P.R.: "A urvey of the theory of mall ample." Annal of Mathematic. Vol. xxxi, 930, pp Publihed a a eparate monograph by Princeton Univerity Pre, $.00. Give geometric approach to ampling ditribution. Rider, P. R.: An Introduction to Modern Statitical Method. John Wiley and Son, Inc., 939. See index. While there i no general explanation of the meaning of degree of freedom, thi book give a careful and detailed explanation of how the number of degree of freedom i to be found in a large variety of ituation. Snedecor, G. W.: Statitical Method. Collegiate Pre, Inc., 937, 938. See index. Snedecor, G. W.: Calculation and Interpretation of Analyi of Variance and Covariance. Collegiate Pre, Inc., 934, pp Tippett, L. H. C.: The Method of Statitic. William and Norgate, Ltd., 93. One of the few attempt to treat the concept of degree of freedom in general term, but without geometric background, i made on page Yule and Kendall: Introduction to the Theory of Statitic. Charle Griffin & Co. London, 937, pp , 436.
Method of Moments Estimation in Linear Regression with Errors in both Variables J.W. Gillard and T.C. Iles
Method of Moment Etimation in Linear Regreion with Error in both Variable by J.W. Gillard and T.C. Ile Cardiff Univerity School of Mathematic Technical Paper October 005 Cardiff Univerity School of Mathematic,
More informationAsset Pricing: A Tale of Two Days
Aet Pricing: A Tale of Two Day Pavel Savor y Mungo Wilon z Thi verion: June 2013 Abtract We how that aet price behave very di erently on day when important macroeconomic new i cheduled for announcement
More informationA family of chaotic pure analog coding schemes based on baker s map function
Liu et al. EURASIP Journal on Advance in Signal Proceing 5 5:58 DOI.86/36345439 RESEARCH Open Acce A family of chaotic pure analog coding cheme baed on baker map function Yang Liu * JingLi Xuanxuan
More informationTwo Trees. John H. Cochrane University of Chicago. Francis A. Longstaff The UCLA Anderson School and NBER
Two Tree John H. Cochrane Univerity of Chicago Franci A. Longtaff The UCLA Anderon School and NBER Pedro SantaClara The UCLA Anderon School and NBER We olve a model with two i.i.d. Luca tree. Although
More informationWho Will Follow You Back? Reciprocal Relationship Prediction
Who Will Follow You Back? Reciprocal Relationhip Prediction John Hopcroft Department of Computer Science Cornell Univerity Ithaca NY 4853 jeh@c.cornell.edu Tiancheng Lou Intitute for Interdiciplinary Information
More informationSome Recent Advances on Spectral Methods for Unbounded Domains
COMMUICATIOS I COMPUTATIOAL PHYSICS Vol. 5, o. 24, pp. 195241 Commun. Comput. Phy. February 29 REVIEW ARTICLE Some Recent Advance on Spectral Method for Unbounded Domain Jie Shen 1, and LiLian Wang
More informationHealth Insurance and Social Welfare. Run Liang. China Center for Economic Research, Peking University, Beijing 100871, China,
Health Inurance and Social Welfare Run Liang China Center for Economic Reearch, Peking Univerity, Beijing 100871, China, Email: rliang@ccer.edu.cn and Hao Wang China Center for Economic Reearch, Peking
More informationHumidity Fixed Points of Binary Saturated Aqueous Solutions
JOURNAL OF RESEARCH of the National Bureau of StandardA. Phyic and Chemitry Vol. 81 A, No. 1, JanuaryFebruary 1977 Humidity Fixed Point of Binary Saturated Aqueou Solution Lewi Greenpan Intitute for
More informationWarp Field Mechanics 101
Warp Field Mechanic 101 Dr. Harold Sonny White NASA Johnon Space Center 2101 NASA Parkway, MC EP4 Houton, TX 77058 email: harold.white1@naa.gov Abtract: Thi paper will begin with a hort review of the
More informationIncorporating Domain Knowledge into Topic Modeling via Dirichlet Forest Priors
via Dirichlet Foret Prior David ndrzeewi andrzee@c.wic.edu Xiaoin Zhu erryzhu@c.wic.edu Mar raven craven@biotat.wic.edu Department of omputer Science, Department of iotatitic and Medical Informatic Univerity
More informationMULTIPLE SINK LOCATION PROBLEM AND ENERGY EFFICIENCY IN LARGE SCALE WIRELESS SENSOR NETWORKS
MULTIPLE SINK LOCATION PROBLEM AND ENERGY EFFICIENCY IN LARGE SCALE WIRELESS SENSOR NETWORKS by Eylem İlker Oyman B.S. in Computer Engineering, Boğaziçi Univerity, 1993 B.S. in Mathematic, Boğaziçi Univerity,
More informationThe ImportExport Paradigm for HighQuality College Courses
Public Policy Editor: Stephen Ruth ruth@gmu.edu The ImportExport Paradigm for HighQuality College Coure An Anwer to Tuition Throughthe Roof Cot Spiral? Stephen Ruth George Maon Univerity Three new
More informationEXISTENCE AND NONEXISTENCE OF SOLUTIONS TO ELLIPTIC EQUATIONS WITH A GENERAL CONVECTION TERM
EXISTENCE AND NONEXISTENCE OF SOLUTIONS TO ELLIPTIC EQUATIONS WITH A GENERAL CONVECTION TERM SALOMÓN ALARCÓN, JORGE GARCÍAMELIÁN AND ALEXANDER QUAAS Abtract. In thi paper we conider the nonlinear elliptic
More informationWHICH SCORING RULE MAXIMIZES CONDORCET EFFICIENCY? 1. Introduction
WHICH SCORING RULE MAXIMIZES CONDORCET EFFICIENCY? DAVIDE P. CERVONE, WILLIAM V. GEHRLEIN, AND WILLIAM S. ZWICKER Abstract. Consider an election in which each of the n voters casts a vote consisting of
More informationR&D of diamond films in the Frontier Carbon Technology Project and related topics
Diamond and Related Material 1 (003) 33 40 R&D of diamond film in the Frontier Carbon Technology Project and related topic a, a a a b Koji Kobahi *, Yohiki Nihibayahi, Yohihiro Yokota, Yutaka Ando, Takehi
More informationCore Academic Skills for Educators: Mathematics
The Praxis Study Companion Core Academic Skills for Educators: Mathematics 5732 www.ets.org/praxis Welcome to the Praxis Study Companion Welcome to The Praxis Study Companion Prepare to Show What You Know
More informationIntroduction to Linear Regression
14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression
More informationWith t support a l of th may b possibl
E R With t upport a l of th may b poibl C Over 170 Student from Imperial attended the Student' Union lobby of Parliament yeterday afternoon. The lobby wa upported by the Rector and the College' Governing
More informationAn analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 TwoWay ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
More informationOPRE 6201 : 2. Simplex Method
OPRE 6201 : 2. Simplex Method 1 The Graphical Method: An Example Consider the following linear program: Max 4x 1 +3x 2 Subject to: 2x 1 +3x 2 6 (1) 3x 1 +2x 2 3 (2) 2x 2 5 (3) 2x 1 +x 2 4 (4) x 1, x 2
More informationMathematics: Content Knowledge
The Praxis Study Companion Mathematics: Content Knowledge 5161 www.ets.org/praxis Welcome to the Praxis Study Companion Welcome to the Praxis Study Companion Prepare to Show What You Know You have been
More informationAn Introduction to Regression Analysis
The Inaugural Coase Lecture An Introduction to Regression Analysis Alan O. Sykes * Regression analysis is a statistical tool for the investigation of relationships between variables. Usually, the investigator
More informationThe InStat guide to choosing and interpreting statistical tests
Version 3.0 The InStat guide to choosing and interpreting statistical tests Harvey Motulsky 19902003, GraphPad Software, Inc. All rights reserved. Program design, manual and help screens: Programming:
More informationAdvanced Problems. Core Mathematics
Advanced Problems in Core Mathematics Stephen Siklos Fourth edition, October 2008 ABOUT THIS BOOKLET This booklet is intended to help you to prepare for STEP examinations. It should also be useful as preparation
More informationWhat are plausible values and why are they useful?
What are plausible values and why are they useful? Matthias von Davier Educational Testing Service, Princeton, New Jersey, United States Eugenio Gonzalez Educational Testing Service, Princeton, New Jersey,
More informationMisunderstandings between experimentalists and observationalists about causal inference
J. R. Statist. Soc. A (2008) 171, Part 2, pp. 481 502 Misunderstandings between experimentalists and observationalists about causal inference Kosuke Imai, Princeton University, USA Gary King Harvard University,
More informationReview of basic statistics and the simplest forecasting model: the sample mean
Review of basic statistics and the simplest forecasting model: the sample mean Robert Nau Fuqua School of Business, Duke University August 2014 Most of what you need to remember about basic statistics
More informationMINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationMC39i Siemens Cellular Engine. Version: 01.02 DocID: MC39i_HD_V01.02
MC39i Siemen Cellular Engine Verion: 01.02 DocID: MC39i_HD_V01.02 Document Name: MC39i Hardware Interface Decription Verion: 01.02 Date: November 12, 2003 DocId: Statu: MC39i_HD_V01.02 General Note Product
More informationPRINCIPAL COMPONENT ANALYSIS
1 Chapter 1 PRINCIPAL COMPONENT ANALYSIS Introduction: The Basics of Principal Component Analysis........................... 2 A Variable Reduction Procedure.......................................... 2
More information