THE PROBABLE ERROR OF A MEAN. Introduction


By STUDENT

Any experiment may be regarded as forming an individual of a population of experiments which might be performed under the same conditions. A series of experiments is a sample drawn from this population.

Now any series of experiments is only of value in so far as it enables us to form a judgment as to the statistical constants of the population to which the experiments belong. In a greater number of cases the question finally turns on the value of a mean, either directly, or as the mean difference between the two quantities.

If the number of experiments be very large, we may have precise information as to the value of the mean, but if our sample be small, we have two sources of uncertainty: (1) owing to the error of random sampling the mean of our series of experiments deviates more or less widely from the mean of the population, and (2) the sample is not sufficiently large to determine what is the law of distribution of individuals. It is usual, however, to assume a normal distribution, because, in a very large number of cases, this gives an approximation so close that a small sample will give no real information as to the manner in which the population deviates from normality: since some law of distribution must be assumed it is better to work with a curve whose area and ordinates are tabled, and whose properties are well known. This assumption is accordingly made in the present paper, so that its conclusions are not strictly applicable to populations known not to be normally distributed; yet it appears probable that the deviation from normality must be very extreme to lead to serious error. We are concerned here solely with the first of these two sources of uncertainty.

The usual method of determining the probability that the mean of the population lies within a given distance of the mean of the sample is to assume a normal distribution about the mean of the sample with a standard deviation equal to $s/\sqrt{n}$, where $s$ is the standard deviation of the sample, and to use the tables of the probability integral.
But, as we decrease the number of experiments, the value of the standard deviation found from the sample of experiments becomes itself subject to an increasing error, until judgments reached in this way may become altogether misleading.

In routine work there are two ways of dealing with this difficulty: (1) an experiment may be repeated many times, until such a long series is obtained that the standard deviation is determined once and for all with sufficient accuracy. This value can then be used for subsequent shorter series of similar experiments. (2) Where experiments are done in duplicate in the natural course of the work, the mean square of the difference between corresponding pairs is equal to the standard deviation of the population multiplied by $\sqrt{2}$. We can thus combine
together several series of experiments for the purpose of determining the standard deviation. Owing however to secular change, the value obtained is nearly always too low, successive experiments being positively correlated.

There are other experiments, however, which cannot easily be repeated very often; in such cases it is sometimes necessary to judge of the certainty of the results from a very small sample, which itself affords the only indication of the variability. Some chemical, many biological, and most agricultural and large-scale experiments belong to this class, which has hitherto been almost outside the range of statistical inquiry.

Again, although it is well known that the method of using the normal curve is only trustworthy when the sample is large, no one has yet told us very clearly where the limit between large and small samples is to be drawn.

The aim of the present paper is to determine the point at which we may use the tables of the probability integral in judging of the significance of the mean of a series of experiments, and to furnish alternative tables for use when the number of experiments is too few.

The paper is divided into the following nine sections:

I. The equation is determined of the curve which represents the frequency distribution of standard deviations of samples drawn from a normal population.
II. There is shown to be no kind of correlation between the mean and the standard deviation of such a sample.
III. The equation is determined of the curve representing the frequency distribution of a quantity z, which is obtained by dividing the distance between the mean of a sample and the mean of the population by the standard deviation of the sample.
IV. The curve found in I is discussed.
V. The curve found in III is discussed.
VI. The two curves are compared with some actual distributions.
VII. Tables of the curves found in III are given for samples of different size.
VIII and IX. The tables are explained and some instances are given of their use.
X. Conclusions.
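The difficulty with the usual method can be seen in a small simulation. The sketch below is illustrative only and is not part of the paper: it assumes a standard normal population, takes s with divisor n as in the text, and uses the familiar 1.96 multiplier for a nominal 95 per cent limit. The function name, seed and number of trials are arbitrary choices.

```python
import math
import random

def normal_method_coverage(n, trials=10000, seed=1):
    """Fraction of simulated samples of size n whose mean lies within
    1.96 * s / sqrt(n) of the population mean (here 0), where s is the
    sample standard deviation computed with divisor n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
        if abs(mean) <= 1.96 * s / math.sqrt(n):
            hits += 1
    return hits / trials

# If the normal method were trustworthy, each figure would be near 0.95.
for n in (4, 10, 50):
    print(n, round(normal_method_coverage(n), 3))
```

For very small samples the observed fraction falls well short of the nominal 95 per cent, and it approaches that figure only as n grows, which is the behaviour the paper sets out to quantify.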
Section I

Samples of n individuals are drawn out of a population distributed normally, to find an equation which shall represent the frequency of the standard deviations of these samples.

If $s$ be the standard deviation found from a sample $x_1, x_2, \ldots, x_n$ (all these being measured from the mean of the population), then

\[ s^2 = \frac{S(x_1^2)}{n} - \left(\frac{S(x_1)}{n}\right)^2 = \frac{S(x_1^2)}{n} - \frac{S(x_1^2)}{n^2} - \frac{2S(x_1x_2)}{n^2}. \]
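Summing for all samples and dividing by the number of samples we get the mean value of $s^2$, which we will write $\bar{s}^2$:

\[ \bar{s}^2 = \frac{n\mu_2}{n} - \frac{n\mu_2}{n^2} = \frac{\mu_2(n-1)}{n}, \]

where $\mu_2$ is the second moment coefficient in the original normal distribution of $x$: since $x_1$, $x_2$, etc. are not correlated and the distribution is normal, products involving odd powers of $x_1$ vanish on summing, so that $2S(x_1x_2)/n^2$ is equal to 0.

If $M'_R$ represent the Rth moment coefficient of the distribution of $s^2$ about the end of the range where $s^2 = 0$,

\[ M'_1 = \mu_2\frac{(n-1)}{n}. \]

Again

\[ s^4 = \left\{\frac{S(x_1^2)}{n} - \left(\frac{S(x_1)}{n}\right)^2\right\}^2 = \left(\frac{S(x_1^2)}{n}\right)^2 - \frac{2S(x_1^2)}{n}\left(\frac{S(x_1)}{n}\right)^2 + \left(\frac{S(x_1)}{n}\right)^4 \]
\[ = \frac{S(x_1^4)}{n^2} + \frac{2S(x_1^2x_2^2)}{n^2} - \frac{2S(x_1^4)}{n^3} - \frac{4S(x_1^2x_2^2)}{n^3} + \frac{S(x_1^4)}{n^4} + \frac{6S(x_1^2x_2^2)}{n^4} + \text{other terms involving odd powers of } x_1, \text{ etc.,} \]

which will vanish on summation.

Now $S(x_1^4)$ has $n$ terms, but $S(x_1^2x_2^2)$ has $\tfrac{1}{2}n(n-1)$, hence summing for all samples and dividing by the number of samples, we get

\[ M'_2 = \frac{\mu_4}{n} + \mu_2^2\frac{(n-1)}{n} - \frac{2\mu_4}{n^2} - 2\mu_2^2\frac{(n-1)}{n^2} + \frac{\mu_4}{n^3} + 3\mu_2^2\frac{(n-1)}{n^3} \]
\[ = \frac{\mu_4}{n^3}\{n^2 - 2n + 1\} + \frac{\mu_2^2}{n^3}(n-1)\{n^2 - 2n + 3\}. \]

Now since the distribution of $x$ is normal, $\mu_4 = 3\mu_2^2$, hence

\[ M'_2 = \mu_2^2\frac{(n-1)}{n^3}\{3n - 3 + n^2 - 2n + 3\} = \mu_2^2\frac{(n-1)(n+1)}{n^2}. \]

In a similar tedious way I find

\[ M'_3 = \mu_2^3\frac{(n-1)(n+1)(n+3)}{n^3} \]

and

\[ M'_4 = \mu_2^4\frac{(n-1)(n+1)(n+3)(n+5)}{n^4}. \]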
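The four moment coefficients about the zero end of the range can be checked roughly by simulation. The following is only a sketch, not a proof and not part of the paper: it assumes samples of n = 5 from a normal population with second moment 1, and the function names, seed and number of trials are illustrative choices.

```python
import random

def raw_moments_of_s2(n, mu2=1.0, trials=100000, seed=2):
    """Monte Carlo estimates of the first four moments of s^2 about zero,
    where s^2 is the sample variance with divisor n."""
    rng = random.Random(seed)
    sigma = mu2 ** 0.5
    sums = [0.0, 0.0, 0.0, 0.0]
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(xs) / n
        s2 = sum(x * x for x in xs) / n - mean * mean
        power = 1.0
        for r in range(4):
            power *= s2              # s2, s2^2, s2^3, s2^4 in turn
            sums[r] += power
    return [total / trials for total in sums]

def student_raw_moments(n, mu2=1.0):
    """The formulae of the text: each moment is the previous one multiplied
    by mu2*(n-1)/n, mu2*(n+1)/n, mu2*(n+3)/n, mu2*(n+5)/n in turn."""
    out, prod = [], 1.0
    for r in range(4):
        prod *= mu2 * (n - 1 + 2 * r) / n
        out.append(prod)
    return out

simulated = raw_moments_of_s2(5)
predicted = student_raw_moments(5)
for r, (sim, pred) in enumerate(zip(simulated, predicted), start=1):
    print("moment", r, "simulated", round(sim, 3), "formula", round(pred, 3))
```

The higher moments are estimated less precisely than the lower ones, so agreement for the third and fourth should be expected only to within a few per cent at this number of trials.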
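The law of formation of these moment coefficients appears to be a simple one, but I have not seen my way to a general proof.

If now $M_R$ be the Rth moment coefficient of $s^2$ about its mean, we have

\[ M_2 = \mu_2^2\frac{(n-1)}{n^2}\{(n+1) - (n-1)\} = 2\mu_2^2\frac{(n-1)}{n^2}, \]

\[ M_3 = \mu_2^3\left\{\frac{(n-1)(n+1)(n+3)}{n^3} - \frac{3(n-1)}{n}\cdot\frac{2(n-1)}{n^2} - \frac{(n-1)^3}{n^3}\right\} \]
\[ = \mu_2^3\frac{(n-1)}{n^3}\{n^2 + 4n + 3 - 6n + 6 - n^2 + 2n - 1\} = 8\mu_2^3\frac{(n-1)}{n^3}, \]

\[ M_4 = \frac{\mu_2^4}{n^4}\{(n-1)(n+1)(n+3)(n+5) - 32(n-1)^2 - 12(n-1)^3 - (n-1)^4\} \]
\[ = \frac{\mu_2^4(n-1)}{n^4}\{n^3 + 9n^2 + 23n + 15 - 32n + 32 - 12n^2 + 24n - 12 - n^3 + 3n^2 - 3n + 1\} = \frac{12\mu_2^4(n-1)(n+3)}{n^4}. \]

Hence

\[ \beta_1 = \frac{M_3^2}{M_2^3} = \frac{8}{n-1}, \qquad \beta_2 = \frac{M_4}{M_2^2} = \frac{3(n+3)}{n-1}, \]
\[ \therefore\ 2\beta_2 - 3\beta_1 - 6 = \frac{1}{n-1}\{6(n+3) - 24 - 6(n-1)\} = 0. \]

Consequently a curve of Prof. Pearson's Type III may be expected to fit the distribution of $s^2$.

The equation referred to an origin at the zero end of the curve will be

\[ y = Cx^p e^{-\gamma x}, \]

where

\[ \gamma = \frac{2M_2}{M_3} = \frac{4\mu_2^2(n-1)\,n^3}{8n^2\mu_2^3(n-1)} = \frac{n}{2\mu_2} \]

and

\[ p = \frac{4}{\beta_1} - 1 = \frac{n-1}{2} - 1 = \frac{n-3}{2}. \]

Consequently the equation becomes

\[ y = Cx^{\frac{n-3}{2}} e^{-\frac{nx}{2\mu_2}}, \]

which will give the distribution of $s^2$.

The area of this curve is $C\int_0^\infty x^{\frac{n-3}{2}} e^{-\frac{nx}{2\mu_2}}\,dx = I$ (say). The first moment coefficient about the end of the range will therefore be

\[ \frac{C\int_0^\infty x^{\frac{n-1}{2}} e^{-\frac{nx}{2\mu_2}}\,dx}{I} = \frac{\left[-C\,\frac{2\mu_2}{n}\,x^{\frac{n-1}{2}} e^{-\frac{nx}{2\mu_2}}\right]_{x=0}^{x=\infty}}{I} + \frac{C\int_0^\infty \frac{n-1}{n}\,\mu_2\,x^{\frac{n-3}{2}} e^{-\frac{nx}{2\mu_2}}\,dx}{I}. \]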
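The first part vanishes at each limit and the second is equal to

\[ \frac{\frac{n-1}{n}\,\mu_2 I}{I} = \frac{n-1}{n}\,\mu_2, \]

and we see that the higher moment coefficients will be formed by multiplying successively by $\frac{n+1}{n}\mu_2$, $\frac{n+3}{n}\mu_2$, etc., just as appeared to be the law of formation of $M'_2$, $M'_3$, $M'_4$, etc.

Hence it is probable that the curve found represents the theoretical distribution of $s^2$; so that although we have no actual proof we shall assume it to do so in what follows.

The distribution of $s$ may be found from this, since the frequency of $s$ is equal to that of $s^2$ and all that we must do is to compress the base line suitably.

Now if $y_1 = \phi(s^2)$ be the frequency curve of $s^2$ and $y_2 = \psi(s)$ be the frequency curve of $s$, then

\[ y_1\,d(s^2) = y_2\,ds, \qquad y_2\,ds = 2y_1 s\,ds, \qquad \therefore\ y_2 = 2sy_1. \]

Hence

\[ y_2 = 2Cs(s^2)^{\frac{n-3}{2}} e^{-\frac{ns^2}{2\mu_2}} \]

is the distribution of $s$. This reduces to

\[ y_2 = 2Cs^{n-2} e^{-\frac{ns^2}{2\sigma^2}}. \]

Hence $y = Ax^{n-2} e^{-\frac{nx^2}{2\sigma^2}}$ will give the frequency distribution of standard deviations of samples of n, taken out of a population distributed normally with standard deviation $\sigma$.

The constant $A$ may be found by equating the area of the curve as follows:

\[ \text{Area} = A\int_0^\infty x^{n-2} e^{-\frac{nx^2}{2\sigma^2}}\,dx. \qquad \left(\text{Let } I_p \text{ represent } \int_0^\infty x^p e^{-\frac{nx^2}{2\sigma^2}}\,dx.\right) \]

Then

\[ I_p = \frac{\sigma^2}{n}\int_0^\infty x^{p-1}\frac{d}{dx}\left(-e^{-\frac{nx^2}{2\sigma^2}}\right)dx = \frac{\sigma^2}{n}\left[-x^{p-1}e^{-\frac{nx^2}{2\sigma^2}}\right]_{x=0}^{x=\infty} + \frac{\sigma^2}{n}(p-1)\int_0^\infty x^{p-2}e^{-\frac{nx^2}{2\sigma^2}}\,dx = \frac{\sigma^2}{n}(p-1)I_{p-2}, \]

since the first part vanishes at both limits.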
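By continuing this process we find

\[ I_{n-2} = \left(\frac{\sigma^2}{n}\right)^{\frac{n-2}{2}}(n-3)(n-5)\cdots 3\cdot 1\; I_0 \]

or

\[ I_{n-2} = \left(\frac{\sigma^2}{n}\right)^{\frac{n-3}{2}}(n-3)(n-5)\cdots 4\cdot 2\; I_1 \]

according as n is even or odd.

But $I_0$ is

\[ \int_0^\infty e^{-\frac{nx^2}{2\sigma^2}}\,dx = \sqrt{\frac{\pi}{2n}}\,\sigma, \]

and $I_1$ is

\[ \int_0^\infty x e^{-\frac{nx^2}{2\sigma^2}}\,dx = \left[-\frac{\sigma^2}{n}e^{-\frac{nx^2}{2\sigma^2}}\right]_{x=0}^{x=\infty} = \frac{\sigma^2}{n}. \]

Hence if n be even,

\[ A = \frac{\text{Area}}{(n-3)(n-5)\cdots 3\cdot 1\,\sqrt{\frac{\pi}{2}}\left(\frac{\sigma^2}{n}\right)^{\frac{n-1}{2}}}, \]

while if n be odd

\[ A = \frac{\text{Area}}{(n-3)(n-5)\cdots 4\cdot 2\left(\frac{\sigma^2}{n}\right)^{\frac{n-1}{2}}}. \]

Hence the equation may be written

\[ y = \frac{N}{(n-3)(n-5)\cdots 3\cdot 1}\sqrt{\frac{2}{\pi}}\left(\frac{n}{\sigma^2}\right)^{\frac{n-1}{2}} x^{n-2}e^{-\frac{nx^2}{2\sigma^2}} \quad (n \text{ even}) \]

or

\[ y = \frac{N}{(n-3)(n-5)\cdots 4\cdot 2}\left(\frac{n}{\sigma^2}\right)^{\frac{n-1}{2}} x^{n-2}e^{-\frac{nx^2}{2\sigma^2}} \quad (n \text{ odd}) \]

where $N$ as usual represents the total frequency.

Section II

To show that there is no correlation between (a) the distance of the mean of a sample from the mean of the population and (b) the standard deviation of a sample with normal distribution.

(1) Clearly positive and negative positions of the mean of the sample are equally likely, and hence there cannot be correlation between the absolute value of the distance of the mean from the mean of the population and the standard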
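deviation, but (2) there might be correlation between the square of the distance and the square of the standard deviation. Let

\[ u^2 = \left(\frac{S(x_1)}{n}\right)^2 \quad\text{and}\quad s^2 = \frac{S(x_1^2)}{n} - \left(\frac{S(x_1)}{n}\right)^2. \]

Then if $m'_1$, $M'_1$ be the mean values of $u^2$ and $s^2$, we have by the preceding part

\[ M'_1 = \mu_2\frac{(n-1)}{n} \quad\text{and}\quad m'_1 = \frac{\mu_2}{n}. \]

Now

\[ u^2s^2 = \frac{S(x_1^2)}{n}\left(\frac{S(x_1)}{n}\right)^2 - \left(\frac{S(x_1)}{n}\right)^4 = \frac{\left(S(x_1^2)\right)^2}{n^3} + \frac{2S(x_1x_2)\cdot S(x_1^2)}{n^3} - \frac{S(x_1^4)}{n^4} - \frac{6S(x_1^2x_2^2)}{n^4} - \text{other terms of odd order} \]

which will vanish on summation.

Summing for all values and dividing by the number of cases we get

\[ R_{u^2s^2}\,\sigma_{u^2}\sigma_{s^2} + m'_1M'_1 = \frac{\mu_4}{n^2} + \mu_2^2\frac{(n-1)}{n^2} - \frac{\mu_4}{n^3} - 3\mu_2^2\frac{(n-1)}{n^3}, \]

where $R_{u^2s^2}$ is the correlation between $u^2$ and $s^2$. Putting $\mu_4 = 3\mu_2^2$,

\[ R_{u^2s^2}\,\sigma_{u^2}\sigma_{s^2} + \mu_2^2\frac{(n-1)}{n^2} = \mu_2^2\frac{(n-1)}{n^3}\{3 + n - 3\} = \mu_2^2\frac{(n-1)}{n^2}. \]

Hence $R_{u^2s^2}\,\sigma_{u^2}\sigma_{s^2} = 0$, or there is no correlation between $u^2$ and $s^2$.

Section III

To find the equation representing the frequency distribution of the means of samples of n drawn from a normal population, the mean being expressed in terms of the standard deviation of the sample.

We have $y = \frac{C}{\sigma^{n-1}}\,s^{n-2}e^{-\frac{ns^2}{2\sigma^2}}$ as the equation representing the distribution of $s$, the standard deviation of a sample of n, when the samples are drawn from a normal population with standard deviation $\sigma$. Now the means of these samples of n are distributed according to the equation¹

\[ y = \frac{\sqrt{n}\,N}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{nx^2}{2\sigma^2}}, \]

and we have shown that there is no correlation between $x$, the distance of the mean of the sample, and $s$, the standard deviation of the sample.

¹ Airy, Theory of Errors of Observations, Part II, §6.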
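Now let us suppose $x$ measured in terms of $s$, i.e. let us find the distribution of $z = x/s$.

If we have $y_1 = \phi(x)$ and $y_2 = \psi(z)$ as the equations representing the frequency of $x$ and of $z$ respectively, then

\[ y_1\,dx = y_2\,dz = y_2\frac{dx}{s}, \qquad \therefore\ y_2 = sy_1. \]

Hence

\[ y = \frac{N\sqrt{n}\,s}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{ns^2z^2}{2\sigma^2}} \]

is the equation representing the distribution of $z$ for samples of n with standard deviation $s$.

Now the chance that $s$ lies between $s$ and $s + ds$ is

\[ \frac{\int_s^{s+ds}\frac{C}{\sigma^{n-1}}\,s^{n-2}e^{-\frac{ns^2}{2\sigma^2}}\,ds}{\int_0^\infty\frac{C}{\sigma^{n-1}}\,s^{n-2}e^{-\frac{ns^2}{2\sigma^2}}\,ds} \]

which represents the $N$ in the above equation.

Hence the distribution of $z$ due to values of $s$ which lie between $s$ and $s + ds$ is

\[ y = \frac{\int_s^{s+ds}\frac{C}{\sigma^{n}}\sqrt{\frac{n}{2\pi}}\,s^{n-1}e^{-\frac{ns^2(1+z^2)}{2\sigma^2}}\,ds}{\int_0^\infty\frac{C}{\sigma^{n-1}}\,s^{n-2}e^{-\frac{ns^2}{2\sigma^2}}\,ds} = \sqrt{\frac{n}{2\pi}}\;\frac{\int_s^{s+ds}\frac{C}{\sigma^{n}}\,s^{n-1}e^{-\frac{ns^2(1+z^2)}{2\sigma^2}}\,ds}{\int_0^\infty\frac{C}{\sigma^{n-1}}\,s^{n-2}e^{-\frac{ns^2}{2\sigma^2}}\,ds} \]

and summing for all values of $s$ we have as an equation giving the distribution of $z$

\[ y = \sqrt{\frac{n}{2\pi}}\;\frac{\int_0^\infty\frac{C}{\sigma^{n}}\,s^{n-1}e^{-\frac{ns^2(1+z^2)}{2\sigma^2}}\,ds}{\int_0^\infty\frac{C}{\sigma^{n-1}}\,s^{n-2}e^{-\frac{ns^2}{2\sigma^2}}\,ds}. \]

By what we have already proved this reduces to

\[ y = \frac{1}{2}\cdot\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{5}{4}\cdot\frac{3}{2}\,(1+z^2)^{-\frac{n}{2}}, \quad\text{if n be odd} \]

and to

\[ y = \frac{1}{\pi}\cdot\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{4}{3}\cdot\frac{2}{1}\,(1+z^2)^{-\frac{n}{2}}, \quad\text{if n be even.} \]

Since this equation is independent of $\sigma$ it will give the distribution of the distance of the mean of a sample from the mean of the population expressed in terms of the standard deviation of the sample for any normal population.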
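As a numerical check on the constants in the equation for z, the sketch below builds the product of ratios for a given n and integrates the curve over the whole range, which should give unity. It is illustrative only: the function names and the midpoint-rule integration in terms of θ (with z = tan θ) are assumptions of this example, not the paper's procedure.

```python
import math

def z_constant(n):
    """Product (n-2)/(n-3) * (n-4)/(n-5) * ..., ending in (3/2)*(1/2) for
    odd n and in (2/1)*(1/pi) for even n."""
    c = 0.5 if n % 2 else 1.0 / math.pi
    k = n - 2
    while k >= 2:
        c *= k / (k - 1)
        k -= 2
    return c

def z_density(z, n):
    """Ordinate of the frequency curve of z for samples of n."""
    return z_constant(n) * (1.0 + z * z) ** (-n / 2.0)

def total_area(n, steps=2000):
    """Integrate the curve over all z by substituting z = tan(theta) and
    applying the midpoint rule on -pi/2 < theta < pi/2."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h
        z = math.tan(theta)
        # dz = d(theta) / cos^2(theta)
        total += z_density(z, n) / math.cos(theta) ** 2 * h
    return total

for n in range(2, 11):
    print(n, round(z_constant(n), 6), round(total_area(n), 6))
```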
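Section IV. Some Properties of the Standard Deviation Frequency Curve

By a similar method to that adopted for finding the constant we may find the mean and moments: thus the mean is at $I_{n-1}/I_{n-2}$, which is equal to

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{2}{1}\sqrt{\frac{2}{\pi}}\,\frac{\sigma}{\sqrt{n}}, \quad\text{if n be even,} \]

or

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{3}{2}\sqrt{\frac{\pi}{2}}\,\frac{\sigma}{\sqrt{n}}, \quad\text{if n be odd.} \]

The second moment about the end of the range is

\[ \frac{I_n}{I_{n-2}} = \frac{(n-1)\sigma^2}{n}. \]

The third moment about the end of the range is equal to

\[ \frac{I_{n+1}}{I_{n-2}} = \frac{I_{n+1}}{I_{n-1}}\cdot\frac{I_{n-1}}{I_{n-2}} = \sigma^2 \times \text{the mean}. \]

The fourth moment about the end of the range is equal to

\[ \frac{I_{n+2}}{I_{n-2}} = \frac{(n-1)(n+1)}{n^2}\,\sigma^4. \]

If we write the distance of the mean from the end of the range $D\sigma/\sqrt{n}$ and the moments about the end of the range $\nu_1$, $\nu_2$, etc., then

\[ \nu_1 = \frac{D\sigma}{\sqrt{n}}, \quad \nu_2 = \frac{n-1}{n}\,\sigma^2, \quad \nu_3 = \frac{D\sigma^3}{\sqrt{n}}, \quad \nu_4 = \frac{n^2-1}{n^2}\,\sigma^4. \]

From this we get the moments about the mean:

\[ \mu_2 = \frac{\sigma^2}{n}(n - 1 - D^2), \]
\[ \mu_3 = \frac{\sigma^3}{n\sqrt{n}}\{nD - 3(n-1)D + 2D^3\} = \frac{\sigma^3D}{n\sqrt{n}}\{2D^2 - 2n + 3\}, \]
\[ \mu_4 = \frac{\sigma^4}{n^2}\{n^2 - 1 - 4D^2n + 6(n-1)D^2 - 3D^4\} = \frac{\sigma^4}{n^2}\{n^2 - 1 - D^2(3D^2 - 2n + 6)\}. \]

It is of interest to find out what these become when n is large.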
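In order to do this we must find out what is the value of $D$.

Now Wallis's expression for $\pi$ derived from the infinite product value of $\sin x$ is

\[ \frac{\pi}{2}(2n+1) = \frac{2^2\cdot 4^2\cdot 6^2\cdots(2n)^2}{1^2\cdot 3^2\cdot 5^2\cdots(2n-1)^2}. \]

If we assume a quantity $\theta\ \left(= a_0 + \frac{a_1}{n} + \text{etc.}\right)$ which we may add to the $2n+1$ in order to make the expression approximate more rapidly to the truth, it is easy to show that $\theta = -\frac{1}{2} + \frac{1}{16n} - \text{etc.}$, and we get²

\[ \frac{\pi}{2}\left(2n + \frac{1}{2} + \frac{1}{16n}\right) = \frac{2^2\cdot 4^2\cdot 6^2\cdots(2n)^2}{1^2\cdot 3^2\cdot 5^2\cdots(2n-1)^2}. \]

From this we find that whether n be even or odd $D^2$ approximates to $n - \frac{3}{2} + \frac{1}{8n}$ when n is large.

Substituting this value of $D$ we get

\[ \mu_2 = \frac{\sigma^2}{2n}\left(1 - \frac{1}{4n}\right), \quad \mu_3 = \frac{\sigma^3\sqrt{1 - \frac{3}{2n} + \frac{1}{8n^2}}}{4n^2}, \quad \mu_4 = \frac{3\sigma^4}{4n^2}\left(1 + \frac{1}{2n} - \frac{1}{16n^2}\right). \]

Consequently the value of the standard deviation of a standard deviation which we have found, $\frac{\sigma}{\sqrt{2n}}\sqrt{1 - \frac{1}{4n}}$, becomes the same as that found for the normal curve by Prof. Pearson, $\sigma/\sqrt{2n}$, when n is large enough to neglect the $1/4n$ in comparison with 1.

Neglecting terms of lower order than $1/n$, we find

\[ \beta_1 = \frac{2n-3}{n(4n-3)}, \qquad \beta_2 = 3\left(1 - \frac{1}{2n}\right)\left(1 + \frac{1}{2n}\right). \]

Consequently, as n increases, $\beta_2$ very soon approaches the value 3 of the normal curve, but $\beta_1$ vanishes more slowly, so that the curve remains slightly skew.

Diagram I shows the theoretical distribution of the standard deviations found from samples of 10.

Section V. Some Properties of the Curve

\[ y = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\left\{\begin{array}{ll}\frac{4}{3}\cdot\frac{2}{\pi} & \text{if n be even}\\[2pt] \frac{5}{4}\cdot\frac{3}{2}\cdot\frac{1}{2} & \text{if n be odd}\end{array}\right\}(1+z^2)^{-\frac{n}{2}} \]

Writing $z = \tan\theta$ the equation becomes $y = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\cos^n\theta$, which affords an easy way of drawing the curve. Also $dz = d\theta/\cos^2\theta$.

² This expression will be found to give a much closer approximation to $\pi$ than Wallis's.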
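Hence to find the area of the curve between any limits we must find

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\int\cos^{n-2}\theta\,d\theta \]
\[ = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\left\{\frac{n-3}{n-2}\int\cos^{n-4}\theta\,d\theta + \left[\frac{\cos^{n-3}\theta\sin\theta}{n-2}\right]\right\} \]
\[ = \frac{n-4}{n-5}\cdots\text{etc.}\int\cos^{n-4}\theta\,d\theta + \frac{1}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\left[\cos^{n-3}\theta\sin\theta\right], \]

and by continuing the process the integral may be evaluated.

For example, if we wish to find the area between 0 and $\theta$ for n = 8 we have

\[ \text{Area} = \frac{6}{5}\cdot\frac{4}{3}\cdot\frac{2}{1}\cdot\frac{1}{\pi}\int_0^\theta\cos^6\theta\,d\theta \]
\[ = \frac{4}{3}\cdot\frac{2}{\pi}\int_0^\theta\cos^4\theta\,d\theta + \frac{1}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\cos^5\theta\sin\theta \]
\[ = \frac{\theta}{\pi} + \frac{1}{\pi}\cos\theta\sin\theta + \frac{1}{3}\cdot\frac{2}{\pi}\cos^3\theta\sin\theta + \frac{1}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\cos^5\theta\sin\theta \]

and it will be noticed that for n = 10 we shall merely have to add to this same expression the term

\[ \frac{1}{7}\cdot\frac{6}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\cos^7\theta\sin\theta. \]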
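The reduction above lends itself to a short routine that builds the area term by term, starting from the n = 2 curve (area θ/π) for even n or the n = 3 curve (area ½ sin θ) for odd n. This is a sketch with illustrative function names; adding 0.5 to obtain the area from the lower end of the curve relies on the symmetry of the curve about z = 0.

```python
import math

def area_zero_to_theta(theta, n):
    """Area of the z curve between 0 and theta = atan(z) for samples of n,
    accumulated one cos^(m-1) * sin term at a time."""
    if n % 2 == 0:
        area = theta / math.pi           # n = 2
        coeff = 1.0 / math.pi            # constant of the n = 2 curve
        m = 2
    else:
        area = 0.5 * math.sin(theta)     # n = 3
        coeff = 0.5                      # constant of the n = 3 curve
        m = 3
    c, s = math.cos(theta), math.sin(theta)
    while m < n:
        # Passing from m to m + 2 adds (constant for m)/(m - 1) * cos^(m-1) * sin.
        area += coeff / (m - 1) * c ** (m - 1) * s
        coeff *= m / (m - 1)             # constant for m + 2
        m += 2
    return area

def table_value(z, n):
    """Area of the curve from the lower end up to z."""
    return 0.5 + area_zero_to_theta(math.atan(z), n)

def area_n8_explicit(theta):
    """The four-term expression written out in the text for n = 8."""
    c, s = math.cos(theta), math.sin(theta)
    return (theta / math.pi + c * s / math.pi
            + (1 / 3) * (2 / math.pi) * c ** 3 * s
            + (1 / 5) * (4 / 3) * (2 / math.pi) * c ** 5 * s)

print(round(table_value(1.0, 6), 4))
```

Carrying on the loop beyond n = 10 gives values for larger samples in the same way.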
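The tables at the end of the paper give the area between $-\infty$ and $z$ (or $\theta = -\frac{\pi}{2}$ and $\theta = \tan^{-1}z$). This is the same as 0.5 + the area between $\theta = 0$ and $\theta = \tan^{-1}z$, and as the whole area of the curve is equal to 1, the tables give the probability that the mean of the sample does not differ by more than z times the standard deviation of the sample from the mean of the population.

The whole area of the curve is equal to

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\int_{-\frac{\pi}{2}}^{+\frac{\pi}{2}}\cos^{n-2}\theta\,d\theta \]

and since all the parts between the limits vanish at both limits this reduces to 1.

Similarly, the second moment coefficient is equal to

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\int_{-\frac{\pi}{2}}^{+\frac{\pi}{2}}\cos^{n-2}\theta\tan^2\theta\,d\theta = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\int_{-\frac{\pi}{2}}^{+\frac{\pi}{2}}(\cos^{n-4}\theta - \cos^{n-2}\theta)\,d\theta = \frac{n-2}{n-3} - 1 = \frac{1}{n-3}. \]

Hence the standard deviation of the curve is $1/\sqrt{n-3}$. The fourth moment coefficient is equal to

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\int_{-\frac{\pi}{2}}^{+\frac{\pi}{2}}\cos^{n-2}\theta\tan^4\theta\,d\theta = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\text{etc.}\times\int_{-\frac{\pi}{2}}^{+\frac{\pi}{2}}(\cos^{n-6}\theta - 2\cos^{n-4}\theta + \cos^{n-2}\theta)\,d\theta \]
\[ = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5} - \frac{2(n-2)}{n-3} + 1 = \frac{3}{(n-3)(n-5)}. \]

The odd moments are of course zero, as the curve is symmetrical, so

\[ \beta_1 = 0, \qquad \beta_2 = \frac{3(n-3)}{n-5} = 3 + \frac{6}{n-5}. \]

Hence as n increases the curve approaches the normal curve whose standard deviation is $1/\sqrt{n-3}$.

$\beta_2$, however, is always greater than 3, indicating that large deviations are more common than in the normal curve.

I have tabled the area for the normal curve with standard deviation $1/\sqrt{7}$ so as to compare with my curve for n = 10 (see the table in Section VII). It will be seen that odds laid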
according to either table would not seriously differ till we reach z = 0.8, where the odds are about 50 to 1 that the mean is within that limit: beyond that the normal curve gives a false feeling of security; for example, according to the normal curve it is 99,986 to 14 (say 7000 to 1) that the mean of the population lies between $-\infty$ and $+1.3s$, whereas the real odds are only 99,819 to 181 (about 550 to 1).

Now 50 to 1 corresponds to three times the probable error in the normal curve and for most purposes it would be considered significant; for this reason I have only tabled my curves for values of n not greater than 10, but have given the n = 9 and n = 10 tables to one further place of decimals. They can be used as foundations for finding values for larger samples.⁴

⁴ E.g. if n = 11, to the corresponding value for n = 9, we add $\frac{7}{8}\cdot\frac{5}{6}\cdot\frac{3}{4}\cdot\frac{1}{2}\cdot\frac{1}{2}\cos^8\theta\sin\theta$; if n = 13 we add as well $\frac{9}{10}\cdot\frac{7}{8}\cdot\frac{5}{6}\cdot\frac{3}{4}\cdot\frac{1}{2}\cdot\frac{1}{2}\cos^{10}\theta\sin\theta$, and so on.

The table for n = 2 can be readily constructed by looking out $\theta = \tan^{-1}z$ in Chambers's tables and then $0.5 + \theta/\pi$ gives the corresponding value.

Similarly $\frac{1}{2}\sin\theta + 0.5$ gives the values when n = 3.

There are two points of interest in the n = 2 curve. Here $s$ is equal to half the distance between the two observations, $\tan^{-1}\frac{s}{s} = \frac{\pi}{4}$, so that between $+s$ and $-s$ lies $2\times\frac{\pi}{4}\times\frac{1}{\pi}$ or half the probability, i.e. if two observations have been made and we have no other information, it is an even chance that the mean of the (normal) population will lie between them. On the other hand the second
moment coefficient is

\[ \frac{1}{\pi}\int_{-\frac{\pi}{2}}^{+\frac{\pi}{2}}\tan^2\theta\,d\theta = \frac{1}{\pi}\Big[\tan\theta - \theta\Big]_{-\frac{\pi}{2}}^{+\frac{\pi}{2}} = \infty, \]

or the standard deviation is infinite while the probable error is finite.

Section VI. Practical Test of the foregoing Equations

Before I had succeeded in solving my problem analytically, I had endeavoured to do so empirically. The material used was a correlation table containing the height and left middle finger measurements of 3000 criminals, from a paper by W. R. Macdonell (Biometrika, i, p. 219). The measurements were written out on 3000 pieces of cardboard, which were then very thoroughly shuffled and drawn at random. As each card was drawn its numbers were written down in a book, which thus contains the measurements of 3000 criminals in a random order. Finally, each consecutive set of 4 was taken as a sample (750 in all) and the mean, standard deviation, and correlation⁵ of each sample determined. The difference between the mean of each sample and the mean of the population was then divided by the standard deviation of the sample, giving us the z of Section III.

This provides us with two sets of 750 standard deviations and two sets of 750 z's on which to test the theoretical results arrived at. The height and left middle finger correlation table was chosen because the distribution of both was approximately normal and the correlation was fairly high. Both frequency curves, however, deviate slightly from normality, the constants being for height $\beta_1 = 0.0026$, $\beta_2 = 3.176$, and for left middle finger lengths $\beta_1 = 0.0030$, $\beta_2 = 3.140$, and in consequence there is a tendency for a certain number of larger standard deviations to occur than if the distributions were normal. This, however, appears to make very little difference to the distribution of z.

Another thing which interferes with the comparison is the comparatively large groups in which the observations occur. The heights are arranged in 1 inch groups, the standard deviation being only 2.54 inches,
while the finger lengths were originally grouped in millimetres, but unfortunately I did not at the time see the importance of having a smaller unit and condensed them into 2 millimetre groups, in terms of which the standard deviation is 2.74.

⁵ I hope to publish the results of the correlation work shortly.

Several curious results follow from taking samples of 4 from material disposed in such wide groups. The following points may be noticed:

(1) The means only occur as multiples of 0.25.

(2) The standard deviations occur as the square roots of the following types of numbers: n, n + 0.19, n + 0.25, n + 0.50, n + 0.69.

(3) A standard deviation belonging to one of these groups can only be associated with a mean of a particular kind; thus a standard deviation of $\sqrt{2}$ can
only occur if the mean differs by a whole number from the group we take as origin, while $\sqrt{1.69}$ will only occur when the mean is at n ± 0.25.

(4) All the four individuals of the sample will occasionally come from the same group, giving a zero value for the standard deviation. Now this leads to an infinite value of z and is clearly due to too wide a grouping, for although two men may have the same height when measured by inches, yet the finer the measurements the more seldom will they be identical, till finally the chance that four men will have exactly the same height is infinitely small. If we had smaller grouping the zero values of the standard deviation might be expected to increase, and a similar consideration will show that the smaller values of the standard deviation would also be likely to increase, such as 0.436, when 3 fall in one group and 1 in an adjacent group, or 0.50 when 2 fall in two adjacent groups. On the other hand, when the individuals of the sample lie far apart, the argument of Sheppard's correction will apply, the real value of the standard deviation being more likely to be smaller than that found owing to the frequency in any group being greater on the side nearer the mode.

These two effects of grouping will tend to neutralize the effect on the mean value of the standard deviation, but both will increase the variability.

Accordingly, we find that the mean value of the standard deviation is quite close to that calculated, while in each case the variability is sensibly greater. The fit of the curve is not good, both for this reason and because the frequency is not evenly distributed owing to effects (2) and (3) of grouping. On the other hand, the fit of the curve giving the frequency of z is very good, and as that is the only practical point the comparison may be considered satisfactory.

The following are the figures for height:

Mean value of standard deviations: Calculated 2.027 ± 0.02; Observed 2.026; Difference = −0.001
Standard deviation of standard deviations: Calculated […] ± […]; Observed […]; Difference […]

Comparison of Fit.
Theoretical Equation: $y = \frac{16\times 750}{\sqrt{2\pi}\,\sigma^3}\,x^2e^{-\frac{2x^2}{\sigma^2}}$

[Table: scale in terms of standard deviations of population; calculated frequency; observed frequency; difference.]

Whence $\chi^2 = 48.06$, P = […] (about).

In tabling the observed frequency, values between […] and […] were included in one group, while between […] and […] they were divided over the two groups. As an instance of the irregularity due to grouping I may mention
that there were 31 cases of standard deviations 1.30 (in terms of the grouping) which is about 0.51 in terms of the standard deviation of the population, and they were therefore divided over the groups 0.4 to 0.5 and 0.5 to 0.6. Had they all been counted in groups 0.5 to 0.6 $\chi^2$ would have fallen to 20.85 and P would have risen to […]. The $\chi^2$ test presupposes random sampling from a frequency following the given law, but this we have not got owing to the interference of the grouping.

When, however, we test the z's where the grouping has not had so much effect, we find a close correspondence between the theory and the actual result.

There were three cases of infinite values of z which, for the reasons given above, were given the next largest values which occurred, namely +6 or −6. The rest were divided into groups of 0.1; 0.04, 0.05 and 0.06 being divided between the two groups on either side.

The calculated value for the standard deviation of the frequency curve was 1 (±0.0171), while the observed was […]. The value of the standard deviation is really infinite, as the fourth moment coefficient is infinite, but as we have arbitrarily limited the infinite cases we may take as an approximation $1/\sqrt{1500}$ from which the value of the probable error given above is obtained. The fit of the curve is as follows:

Comparison of Fit. Theoretical Equation: $y = \frac{2N}{\pi}\cos^4\theta$, $z = \tan\theta$

[Table: scale of z; calculated frequency; observed frequency; difference.]

Whence $\chi^2 = 12.44$, P = […].

This is very satisfactory, especially when we consider that as a rule observations are tested against curves fitted from the mean and one or more other moments of the observations, so that considerable correspondence is only to be expected; while this curve is exposed to the full errors of random sampling, its constants having been calculated quite apart from the observations.

The left middle finger samples show much the same features as those of the height, but as the grouping is not so large compared to the variability the curves fit the observations more closely.
Diagrams III⁶ and IV give the standard deviations and the z's for this set of samples. The results are as follows:

⁶ There are three small mistakes in plotting the observed values in Diagram III, which make the fit appear worse than it really is.
Mean value of standard deviations: Calculated 2.186 ± 0.023; Observed 2.179; Difference = −0.007
Standard deviation of standard deviations: Calculated 0.9224 ± […]; Observed […]; Difference = […]

Comparison of Fit. Theoretical Equation: $y = \frac{16\times 750}{\sqrt{2\pi}\,\sigma^3}\,x^2e^{-\frac{2x^2}{\sigma^2}}$

[Table: scale in terms of standard deviations of population; calculated frequency; observed frequency.]

Whence $\chi^2 = 21.80$, P = […].

Value of standard deviation: Calculated 1 (±0.017); Observed 0.98; Difference = […]

Comparison of Fit. Theoretical Equation: $y = \frac{2N}{\pi}\cos^4\theta$, $z = \tan\theta$

[Table: scale of z; calculated frequency; observed frequency; difference.]

Whence $\chi^2 = 7.39$, P = 0.92.

A very close fit.

We see then that if the distribution is approximately normal our theory gives us a satisfactory measure of the certainty to be derived from a small sample in both the cases we have tested; but we have an indication that a fine grouping is of advantage. If the distribution is not normal, the mean and the standard deviation of a sample will be positively correlated, so although both will have greater variability, yet they will tend to counteract one another, a mean deviating largely from the general mean tending to be divided by a larger standard deviation. Consequently, I believe that the table given in Section VII below may be used in estimating the degree of certainty arrived at by the mean of a few experiments, in the case of most laboratory or biological work where the distributions are as a rule of a cocked hat type and so sufficiently nearly normal.
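The spirit of the card-drawing test can be reproduced in a few lines. The sketch below is not a reconstruction of the original material: it draws 750 samples of 4 from a continuous normal population (so there is no grouping), takes the population standard deviation as 2.54 purely for illustration, and uses an arbitrary seed. The two theoretical figures come from the formulae for n = 4: the mean of s is $\sqrt{2/\pi}\,\sigma$, and the area of the z curve between −1 and +1 is $2(\frac{1}{4} + \frac{1}{2\pi})$.

```python
import math
import random

def sampling_experiment(samples=750, n=4, sigma=2.54, seed=1908):
    """Draw `samples` samples of size n from a normal population with mean 0
    and return the sample standard deviations (divisor n) and z = mean / s."""
    rng = random.Random(seed)
    sds, zs = [], []
    for _ in range(samples):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
        sds.append(s)
        zs.append(mean / s)
    return sds, zs

sds, zs = sampling_experiment()
mean_sd = sum(sds) / len(sds)
inside = sum(1 for z in zs if abs(z) < 1.0) / len(zs)

# Theory for n = 4: mean of s, and area of the z curve between -1 and +1.
theory_mean_sd = math.sqrt(2.0 / math.pi) * 2.54
theory_inside = 2.0 * (0.25 + 0.5 / math.pi)

print("mean s.d.: observed", round(mean_sd, 3), "theory", round(theory_mean_sd, 3))
print("|z| < 1:   observed", round(inside, 3), "theory", round(theory_inside, 3))
```

With only 750 samples the observed figures will wander about the theoretical ones by roughly the amounts the probable errors in the text suggest.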
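Section VII. Tables of

\[ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\left\{\begin{array}{ll}\frac{3}{2}\cdot\frac{1}{2} & \text{n odd}\\[2pt] \frac{2}{1}\cdot\frac{1}{\pi} & \text{n even}\end{array}\right\}\int_{-\frac{\pi}{2}}^{\tan^{-1}z}\cos^{n-2}\theta\,d\theta \]

for values of n from 4 to 10 inclusive

Together with $\frac{\sqrt{7}}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{7x^2}{2}}\,dx$ for comparison when n = 10

[Table: z (= x/s); n = 4; n = 5; n = 6; n = 7; n = 8; n = 9; n = 10; for comparison, $\frac{\sqrt{7}}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{7x^2}{2}}\,dx$.]

Explanation of Tables

The tables give the probability that the value of the mean, measured from the mean of the population, in terms of the standard deviation of the sample, will lie between $-\infty$ and z. Thus, to take the table for samples of 6, the probability of the mean of the population lying between $-\infty$ and once the standard deviation of the sample is 0.9622, the odds are about 24 to 1 that the mean of the population lies between these limits.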
The probability is therefore 0.0378 that it is greater than once the standard deviation and 0.0756 that it lies outside ±1.0 times the standard deviation.

Illustration of Method

Illustration I. As an instance of the kind of use which may be made of the tables, I take the following figures from a table by A. R. Cushny and A. R. Peebles in the Journal of Physiology for 1904, showing the different effects of the optical isomers of hyoscyamine hydrobromide in producing sleep. The average number of hours' sleep gained by the use of the drug is tabulated below.

The conclusion arrived at was that in the usual doses 2 was, but 1 was not, of value as a soporific.

Additional hours' sleep gained by the use of hyoscyamine hydrobromide

Patient    1 (Dextro-)    2 (Laevo-)    Difference (2 − 1)
Mean       +0.75          +2.33         +1.58
s.d.       1.70           1.90          1.17

First let us see what is the probability that 1 will on the average give increase of sleep; i.e. what is the chance that the mean of the population of which these experiments are a sample is positive. +0.75/1.70 = 0.44, and looking out z = 0.44 in the table for ten experiments we find by interpolating between 0.8697 and 0.9161 that 0.44 corresponds to 0.8873, or the odds are 0.887 to 0.113 that the mean is positive.

That is about 8 to 1, and would correspond in the normal curve to about 1.8 times the probable error. It is then very likely that 1 gives an increase of sleep, but it would occasion no surprise if the results were reversed by further experiments.

If now we consider the chance that 2 is actually a soporific we have the mean increase of sleep = 2.33/1.90 or 1.23 times the s.d. From the table the probability corresponding to this is 0.9974, i.e. the odds are nearly 400 to 1 that such is the case. This corresponds to about 4.15 times the probable error in the normal curve. But I take it that the real point of the authors was that 2 is better than 1. This we must test by making a new series, subtracting 1 from 2. The mean value of this series is +1.58 (that is, 2.33 − 0.75), while the s.d. is 1.17, the mean value being +1.35 times the s.d. From the table, the probability is 0.9985, or the odds are about 666 to one that 2 is the better soporific.
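The arithmetic of this illustration can be retraced as follows. The per-patient figures do not survive in the table above, so the lists in this sketch are an assumption: they are the values under which the Cushny and Peebles data are commonly reproduced (for example as the `sleep` data set distributed with R), and they agree with the means and standard deviations quoted in the text. The function names are illustrative, and the probability is computed directly from the curve for n = 10 rather than by interpolation in the table, so it need not match an interpolated figure to the last place.

```python
import math

# ASSUMED per-patient values (hours of sleep gained); not preserved above.
dextro = [0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0]
laevo = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]
diff = [round(b - a, 1) for a, b in zip(dextro, laevo)]

def mean_sd(xs):
    """Mean and standard deviation with divisor n, as used in the paper."""
    n = len(xs)
    m = sum(xs) / n
    return m, math.sqrt(sum((x - m) ** 2 for x in xs) / n)

def z_probability(z, n):
    """Area of the z curve from the lower end up to z, for even n, using
    the series theta/pi + cos*sin/pi + ... described in Section V."""
    if n % 2 != 0:
        raise ValueError("this sketch handles even n only")
    theta = math.atan(z)
    c, s = math.cos(theta), math.sin(theta)
    area, coeff, m = theta / math.pi, 1.0 / math.pi, 2
    while m < n:
        area += coeff / (m - 1) * c ** (m - 1) * s
        coeff *= m / (m - 1)
        m += 2
    return 0.5 + area

results = {}
for name, xs in (("1", dextro), ("2", laevo), ("2 - 1", diff)):
    m, sd = mean_sd(xs)
    z = m / sd
    p = z_probability(z, len(xs))
    results[name] = (m, sd, z, p)
    print(name, "mean", round(m, 2), "s.d.", round(sd, 2),
          "z", round(z, 2), "P", round(p, 4), "odds", round(p / (1 - p)))
```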
The low value of 0