Catalogue no. 1-001-XIE Survey Metodology December 004
How to obtain more information Specific inquiries about tis product and related statistics or services sould be directed to: Business Survey Metods Division, Statistics Canada, Ottawa, Ontario, K1A 0T6 (telepone: 1 800 63-1136). For information on te wide range of data available from Statistics Canada, you can contact us by calling one of our toll-free numbers. You can also contact us by e-mail or by visiting our website. National inquiries line 1 800 63-1136 National telecommunications device for te earing impaired 1 800 363-769 Depository Services Program inquiries 1 800 700-1033 Fax line for Depository Services Program 1 800 889-9734 E-mail inquiries infostats@statcan.ca Website www.statcan.ca Information to access te product Tis product, catalogue no. 1-001-XIE, is available for free. To obtain a single issue, visit our website at www.statcan.ca and select Our Products and Services. Standards of service to te public Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner and in te official language of teir coice. To tis end, te Agency as developed standards of service tat its employees observe in serving its clients. To obtain a copy of tese service standards, please contact Statistics Canada toll free at 1 800 63-1136. Te service standards are also publised on www.statcan.ca under About Statistics Canada > Providing services to Canadians.
Statistics Canada Business Survey Metods Division Survey Metodology December 004 Publised by autority of te Minister responsible for Statistics Canada Minister of Industry, 006 All rigts reserved. Te content of tis electronic publication may be reproduced, in wole or in part, and by any means, witout furter permission from Statistics Canada, subject to te following conditions: tat it be done solely for te purposes of private study, researc, criticism, review or newspaper summary, and/or for non-commercial purposes; and tat Statistics Canada be fully acnowledged as follows: Source (or Adapted from, if appropriate): Statistics Canada, year of publication, name of product, catalogue number, volume and issue numbers, reference period and page(s). Oterwise, no part of tis publication may be reproduced, stored in a retrieval system or transmitted in any form, by any means electronic, mecanical or potocopy or for any purposes witout prior written permission of icensing Services, Client Services Division, Statistics Canada, Ottawa, Ontario, Canada K1A 0T6. April 006 Catalogue no. 1-001-XIE ISSN 149-091 Frequency: semi-annual Ottawa Cette publication est disponible en français sur demande (n o 1-001-XIF au catalogue). Note of appreciation Canada owes te success of its statistical system to a long-standing partnersip between Statistics Canada, te citizens of Canada, its businesses, governments and oter institutions. Accurate and timely statistical information could not be produced witout teir continued cooperation and goodwill.
4 Gunning and Horgan: A New Algoritm for te Construction of Stratum Boundaries Vol. 30, No., pp. 159-166 A New Algoritm for te Construction of Stratum Boundaries in Sewed Populations Patricia Gunning and Jane M. Horgan 1 Abstract A simple and practicable algoritm for constructing stratum boundaries in suc a way tat te coefficients of variation are equal in eac stratum is derived for positively sewed populations. Te new algoritm is sown to compare favourably wit te cumulative root frequency metod (Dalenius and Hodges 1957) and te avallée and Hidiroglou (1988) approximation metod for estimating te optimum stratum boundaries. Key Words: Efficiency; Geometric progression; Neyman allocation; Stratification. 1. Introduction A stratified random sampling design is a sampling plan in wic a population is divided into mutually exclusive strata, and simple random samples are drawn from eac stratum independently. Te essential objective of stratification is to construct strata to allow for efficient estimation. In wat follows X represents te nown stratification or auxiliary variable wile Y represents te unnown study variable. Suppose tere are strata, containing N elements from wic a sample of size n is to be cosen independently from eac stratum ( 1 ). We write N 1 N and n 1 n. In te case of te stratified mean estimate, y st N y, (1) N 1 were y is te mean of te sample elements in te t stratum, we need to coose te breas in order to minimise its variance were V y ( y ) 1, S st y N N n N 1 N ( Yi Y ) i 1 N S n is te standard deviation of Y restricted to stratum and, and is te mean. Y 1 N N Y i i 1,, () Dalenius (1950) derived equations for determining boundaries wen stratifying variables by size, so tat () is minimised, but tese equations proved troublesome to solve because of dependencies among te components. Since ten tere ave been numerous attempts to obtain efficient approximations to tis optimum solution. Te first suc approximation, suggested by Dalenius and Hodges (1957, 1959), constructs te strata by taing equal intervals on te cumulative function of te square root of te frequencies; tis metod is still often used today. Ecman s rule (1959) of iteratively equalising te product of stratum weigts and stratum ranges was found to require arduous calculations, and is less used tan te metod of Dalenius and Hodges metod (Nicolini 001). avallée and Hidiroglou (1988) derived an iterative procedure for stratifying sewed populations into a tae-all stratum and a number of tae-some strata suc tat te sample size is minimised for a given level of reliability. Oter recent contributions include Hedlin (000) wo revisited Eman s rule, Dorfman and Valliant (000) wo compared modelbased stratified sampling wit balanced sampling, and Rivest (00) wo constructed a generalisation of te avallée and Hidiroglou algoritm by providing models accounting for te discrepancy between te stratification variable and te survey variable. In te present paper we propose an algoritm wic is muc simpler to implement tan any of tose currently available. It is based on an observation by Cocran (1961), tat wit near optimum boundaries te coefficients of variation are often found to be approximately te same in all strata. He concluded owever tat computing and setting equal te standard deviations of te strata would be too complicated to be feasible in practice. In wat follows we sow tat, for sewed distributions, te coefficients of variation can be approximately equalised between strata 1. Patricia Gunning, Scool of Computing, Dublin City University, Dublin 9, Ireland; Jane M. Horgan, Scool of Computing, Dublin City University, Dublin 9, Ireland.
Survey Metodology, December 004 5 using te geometric progression. Tis new algoritm is derived in section. Section 3 compares te efficiency of te new approximation wit te cumulative root frequency and te avallée and Hidiroglou approximations. We summarise our findings in section 4.. An Alternative Metod of Stratum Construction To stratify a population by size is to subdivide it into intervals, wit endpoints, 0 < 1 <, <. Ideally, te division sould be based on te survey variable Y. Suc a construction is of course not possible since Y is unnown; if it were nown we would not need to estimate it. In practice terefore we use a nown auxiliary variable X, wic is correlated wit te survey variable. In order to mae te breas ( 0, 1,, ) for any given 0 and, we see to mae te CV Sx / X te same for 1,,, : S X x1 1 S x S x. (3) X X Now S x is te standard deviation and X te mean of X in stratum : If we mae te assumption tat te distribution witin eac stratum is approximately uniformly distributed we may write + 1 X, (4) 1 S x ( 1 ). (5) 1 As an approximation to te coefficients of variation, tis gives wit equal ( 1) 1 CV (6) ( + ) 1 CV terefore we must ave + 1 + 1 + + 1 1 Tis new and exotic recurrence relation reduces owever to someting familiar:. (7) ; (8) + 1 1 te stratum boundaries are te terms of a geometric progression. ar ( 0,1,..., ). (9) Tus a 0, te minimum value of te variable, and ar, te maximum value of te variable. It follows 1/ tat te constant ratio can be calculated as r ( / 0 ). For a numerical example tae ; 5 ; 50,000 : (10) 4 0 4 tus 5.10 ( 0, 1,, 3, 4) and te strata form te ranges 5 50; 50 500; 500 5,000; 5,000 50,000. (11) Tis is clearly an extremely simple metod of obtaining stratum breas. Te relationsip in (8) depends on te assumption tat te distributions witin strata are uniform. Tis may be justified by te following euristic argument. Wen te parent distribution is positively sewed, ten te low values of te variable ave a ig incidence, wic decreases as te variable values increase, wic maes it appropriate to tae small intervals at te beginning and large intervals at te end. Tis is wat appens wit a geometric series of constant ratio greater tan one. In te lower range of te variable, te strata are narrow so tat an assumption of rectangular distribution in tem is not unreasonable. As te value of te variable increases, te stratum widt increases geometrically. Tis coincides wit te decreased rate of cange of te incidence of te positively sewed variable, so ere also te assumption of uniformity is reasonable. Tis algoritm will of course not wor for normal distributions. Also since te boundaries increase geometrically, it will not wor well wit variables tat ave very low starting points: tis will lead to too many small strata; te rule breas down completely wen te lower end point is zero. We expect te best results wen te distribution is igly positively sewed and te upper part contains a small percentage of te total frequency. 3. Te Performance of te Algoritm 3.1 Some Real Positively Sewed Populations To test our algoritm, we implement it on four specific populations, wic are sewed wit positive tail: Our first population (Population 1) is an accounting population of debtors in an Iris firm, detailed in Horgan (003). In addition, we use tree of te sewed populations tat Cocran (1961) invoed to illustrate te efficiency of
6 Gunning and Horgan: A New Algoritm for te Construction of Stratum Boundaries te cumulative root frequency metod of stratum construction. Tese are: Te population in tousands of US cities (Population ); Te number of students in four-year US colleges (Population 3); Te resources in millions of dollars of a large commercial ban in te US (Population 4). Tere were five oter populations in te Cocran paper, wic turned out to be unsuitable for use wit our algoritm. In tree cases te variable was a proportion: agricultural loans, real estate loans and independent loans expressed as a percentage of te total amount of ban loans. Anoter, a population of farms in wic te variable ranged from 1 to 18, was essentially discrete. Yet anoter, a population of income tax returns, was not sufficiently sewed: it owed its sewness to te top 0.05% of te population, and wen tis was removed, or put in a tae-all stratum, te sewness disappeared. Tese four populations are illustrated and summarised in Figure 1 and Table 1 in decreasing order of sewness. 0.5 Population 1 Population 0.5 0.4 0.4 Probability 0.3 0. Probability 0.3 0. 0.1 0.1 0.0 0.0 40-100 100-00 00-500 500-1,000 1,000-,000 Accounts,000-5,000 5,000-10,000 >10,000 10-0 0-40 40-60 60-80 60-100 100-10 10-140 US Cities 140-160 160-180 180-00 0.5 Population 3 Population 4 0.5 0.4 0.4 0.3 0. 0.3 0. 0.1 0.1 0.0 0.0 00-400 400-600 600-800 800-1,000 1,000-1,00 1,00-1,400 1,400-1,600 1,600-1,800 1,800-,000,000-5,000 5,000-10,000 70-100 100-00 00-300 300-400 400-500 500-600 600-700 700-800 800-900 900-1,000 Probability Probability US Students Ban Resources Figure 1. Populations
Survey Metodology, December 004 7 Te new algoritm is implemented on tese populations, and compared wit te cumulative root frequency (cum f ) and te avallée-hidiroglou metods of stratum construction. 3. Comparison wit te Cumulative Root Frequency Metod We first compare te performance of te new algoritm wit cum f by dividing te populations summarised in Table 1 into 3, 4 and 5 strata, using bot metods to mae te breas. Te results are given in Tables, 3 and 4. A cursory examination of te coefficients of variation in Tables, 3 and 4 suggests tat, in most cases, te geometric metod is more successful tan cum f in obtaining nearequal strata CV. For example in Population 1, wic as te greatest sewness, te CV differ substantially from eac oter wen cum f is used to mae te breas, wile te geometric metod appears to acieve near-equal CV in all cases of 3, 4 and 5 strata: te best results are obtained wit 5. In te oter tree populations, te CV are not as diverse wit cum f, but tey still appear more variable tan tose obtained wit te geometric metod of stratum construction. Te CV wit te geometric metod are more omogeneous wen 4 or 5 tan wen 3; tis is to be expected since te validity of te assumption of uniformity of te distribution of elements witin stratum is strengtened wit increased number of strata. A more detailed analysis of te variability of te CV between strata is given in Table 5, were te standard deviation of te CV is calculated for eac design. Table 1 Summary Statistics for Real Populations Population N Range Sewness Mean Variance 1 3,369 40 8,000 6.44 838.64 3,511,87 1,038 10 00.88 3.57 94 3 677 00 10,000.46 1,563.00 3,36,60 4 357 70 1,000.08 5.6 36,74 Table Te Geometric vs te Cum f : Stratum Breas wit 3 and n 100 Stratification Stratum Population Metod CV 1 3 1 Geometric 0.0600 354 3,15 N,334 1,88 189 n 9 46 45 CV 0.71 0.68 0.64 Cum f 0.0600 558,36 N,339 735 95 n 19 17 64 CV 0.70 0.4 0.76 Geometric 0.070 6 7 N 701 43 94 n 36 9 35 CV 0.8 0.3 0.33 Cum f 0.069 8 66 N 79 08 101 n 40 38 CV 0.9 0.5 0.34 3 Geometric 0.0317 76,645 N 53 31 103 n 9 38 53 CV 0.3 0.37 0.39 Cum f 0.08 1,179 3,69 N 456 15 69 n 37 35 8 CV 0.41 0.31 0.7 4 Geometric 0.0184 168 405 N 11 93 53 n 7 7 46 CV 0.3 0.4 0.30 Cum f 0.0198 16 441 N 07 107 43 n 5 39 36 CV 0.3 0.30 0.7
8 Gunning and Horgan: A New Algoritm for te Construction of Stratum Boundaries Table 3 Te Geometric vs te Cum f : Stratum Breas wit 4 and n 100 Stratification Stratum Population Metod CV 1 3 4 1 Geometric 0.0430 05 1,057 5,443 N 1,416 1,38 483 88 n 6 40 3 CV 0.45 0.44 0.48 0.50 Cum f 0.0480 558 1,117,795 N,339 483 35 n 3 5 10 6 CV 0.70 0.19 0.7 0.69 Geometric 0.0194 0 43 93 00 N 459 398 130 51 n 31 5 CV 0. 0.0 0. 0. Cum f 0.013 19 38 85 N 393 48 155 6 n 15 6 30 9 CV 0.0 0.17 0.5 0.6 3 Geometric 0.014 56 1,386 3,653 N 138 343 17 69 n 5 7 6 4 CV 0.7 0.6 0.6 0.7 Cum f 0.030 690,160 5,100 N 35 319 75 48 n 13 43 1 3 CV 0.31 0.33 0.9 0.19 4 Geometric 0.014 134 61 504 N 156 109 63 9 n 0 3 9 8 CV 0.18 0.19 0.19 0.0 Cum f 0.0143 16 55 488 N 07 58 57 35 n 33 9 3 35 CV 0.3 0.11 0.18 0.4 Table 4 Te Geometric vs te Cum f : Stratum Breas wit 5 and n 100 Stratification Stratum Population Metod CV 1 3 4 5 1 Geometric 0.0360 147 549,037 7,55 N 1,054 1,67 73 65 51 n 14 7 33 4 CV 0.37 0.38 0.40 0.37 0.41 Cum f 0.0349 79 838 1,677 4,193 N 1,644 1,010 33 49 134 n 9 14 7 15 55 CV 0.5 0.30 0.0 0.5 0.57 Geometric 0.0144 17 3 59 108 N 364 418 130 87 39 n 18 8 17 0 17 CV 0.18 0.14 0.15 0.16 0.15 Cum f 0.0186 8 38 57 104 N 79 9 89 88 40 n 58 4 7 16 15 CV 0.8 0.08 0.11 0.16 0.16 3 Geometric 0.0184 433 941,043 4,434 N 100 55 1,989 74 56 n 16 7 0 35 CV 0. 0.1 0.4 0.1 0.1 Cum f 0.01 1,179 1,669 3,139 6,079 n 50 3 17 15 15 CV 0.40 0.09 0.0 0.19 0.13 4 Geometric 0.0110 118 00 339 576 N 114 116 64 39 4 n 1 0 4 18 4 CV 0.14 0.14 0.17 0.1 0.16 Cum f 0.0119 16 55 395 67 N 07 58 37 36 19 n 44 11 10 19 16 CV 0.3 0.11 0.10 0.13 0.11
Survey Metodology, December 004 9 Table 5 Te Variability of te CV for te Geometric and te Cum f Metods Strata Population 1 3 4 3 Geometric 0.035 0.050 0.036 0.038 Cum f 0.181 0.045 0.07 0.035 4 Geometric 0.07 0.010 0.006 0.008 Cum f 0.76 0.04 0.06 0.059 5 Geometric 0.018 0.015 0.013 0.00 Cum f 0.166 0.076 0.119 0.054 We see from Table 5 tat, wit just two exceptions, te standard deviations of te CV are substantially lower wit te geometric metod of stratum construction tan wit cum f. In te two cases were te cumulative root as a lower standard deviation tan te geometric, te differences between tem is not great, and occur wit te smallest number of strata, 3, in Populations and 4. We may conclude terefore tat te new algoritm is successful in breaing te strata in suc a way tat te CV are near equal. Wat remains is to investigate weter te geometric breas lead to more efficient estimation tan cum f. To do tis, te two metods are compared in terms of te relative efficiency or variance ratio obtained wit n 100 allocated optimally among te strata using Neyman allocation (Neyman 1934): Te relative efficiency is defined as n N Sx n. N S (1) i 1 i xi cum ( xst ) geom ( x ), st V e f f cum, geom (13) V were V cum ( xst ) and V geom ( xst ) are te variances of te mean respectively wit te cumulative root frequency and te geometric metods, wit n 100 and n allocated as in (1) for eac of te stratification metods. In sample size planning te relative efficiencies may be interpreted as te proportionate increase or decrease in te sample size wit cum f to obtain te same precision as tat of te geometric metod wit n 100. Te variance calculations are based on te auxiliary variable X, and since tis is assumed to be igly correlated wit te unnown survey variable Y, we can assume te relative efficiency e f f, given in (13), will be a reasonable approximation of te relative efficiency of Y. Table 6 gives te variance ratio wen te number of strata 3, 4 and 5. From Table 6 we see tat, wile tis new metod is not always more efficient tan te cumulative root frequency metod of stratum construction, wen it is, it is substantially so, and wen it is not it is only marginally worse. For example, large gains in efficiency are observed wen 5 in Populations, 3 and 4: ere te relative efficiencies are 1.69, 1.33 and 1.17 respectively indicating tat samples of sizes n 169, 133 and 117 are required wit cum f to obtain te sample precision as tat of te geometric metod wit n 100. Table 6 Efficiencies of te Cum f Relative to te Geometric Metod Strata Population 1 3 4 3 0.97 0.99 0.79 1.16 4 1.3 1.19 1.16 1.04 5 0.94 1.69 1.33 1.17 We also see from Table 6 tat wile tere are four cases were te relative efficiency is less tan 1, wit one exception, all are greater tan 0.9. Te exception is Population 3 wit 3, te smallest number of strata; te relative efficiency in tis case is 0.79. 3.3 Comparison wit te avallée and Hidiroglou Algoritm Wit te avallée-hidiroglou algoritm, te optimum boundaries, 1 1 are cosen to minimise te sample size n for a given level of precision. Te requirement on precision is usually stated by requiring te coefficient of variation to be equal to some specified level between 1% 10%. Obtaining te minimum n is an iterative process, and te SAS code used for implementing it was obtained from te web at ttp://www.ulval.ca/pages/lpr/. To compare te performance of te new metod wit avallée-hidiroglou, te CVs from te geometric algoritm given in Tables, 3 and 4 are used as input for te avallée- Hidiroglou algoritm, and te sample sizes required to obtain te same precision as tat of te geometric metod wit n 100 are computed. Te results are given in Table 7. Te first ting to notice from Table 7 is tat te sample size required wit te avallée-hidiroglou algoritm to obtain te same precision as te geometric metod is greater tan 100 in all but four cases. In Population wit 5 strata, it is necessary to increase te sample size by 36% to
10 Gunning and Horgan: A New Algoritm for te Construction of Stratum Boundaries n 136, to obtain te same precision as te geometric metod wit n 100. Wit tree and four strata, sample sizes of n 11 and 113 are required in Population 1, and samples sizes of n 13 and n 117 are required in Population, to obtain te same precision as te geometric metod. Wen te sample size falls below n 100, te drop is not as large. In Population 4, wit four and five strata, n 93 and n 99 respectively, and in Population 1 wit 5 strata a sample size of n 90 will suffice wit te avallée-hidiroglou algoritm to obtain te same precision as te geometric metod. Te results in Table 7 migt appear to indicate tat te geometric metod outperforms te avallée-hidiroglou metod in terms of te minimum sample size required for a specified precision. We observe owever tat te geometric metod does not give a tae-all stratum. If tis is required it is more appropriate to use te avallée-hidiroglou to obtain te strata. Often, in financial applications te top stratum is decided judgementally; for example US state taxing autorities typically decide teir tae-all stratum based on a total percentage of purcase amounts (Fal, Rotz and Young 003). If after suc a tae-all stratum as been removed te sewness remains, te geometric metod is probably te easier and more efficient way of obtaining te remaining strata. Table 7 Boundaries and Sample Size Required wit te avallée-hidiroglou Metod to Obtain te Same CV as te Geometric Metod wen n 100 3 Strata Population n CV 1 3 1 11 0.0600 1,48 8,676 N,867 464 38 n 4 41 38 CV 0.87 0.57 0.37 13 0.070 35 10 N 795 0 41 n 47 35 41 CV 0.31 0.31 0.17 3 107 0.0317 1,398 4,197 N 481 135 61 n 8 18 61 CV 0.41 0.30 0.4 4 100 0.0184 17 361 N 1 85 60 n 18 60 CV 0.3 0.1 0.3 4 Strata 1 3 4 1 113 0.0430 44 1,88 8,411 N,086 915 37 41 n 16 1 35 41 CV 0.64 0.41 0.45 38 117 0.0194 19 37 95 N 393 40 176 49 n 13 1 34 49 CV 0.19 0.16 0.8 0.1 3 103 0.014 740 1,505 3,819 N 56 34 118 69 n 9 10 15 69 CV 0.3 0.18 0.5 0.7 4 93 0.014 117 188 359 N 111 11 74 60 n 7 9 17 60 CV 0.14 0.1 0.19 0.3 5 Strata 1 3 4 5 1 90 0.0360 34 1,153 3,431 10,301 N 1,846 993 357 147 6 n 1 14 17 1 6 CV 0.58 0.34 0.31 0.31 0.3 136 0.0144 14 1 35 80 N 189 70 336 164 79 n 4 7 16 30 79 CV 0.1 0.10 0.1 0.4 0.30 3 105 0.0184 51 869 1,577 3,675 N 133 180 185 110 69 n 4 5 10 17 69 CV 0.7 0.15 0.16 0.3 0.7 4 99 0.0119 99 130 189 339 N 70 68 85 71 63 n 4 4 8 0 63 CV 0.10 0.08 0.10 0.18 0.33
Survey Metodology, December 004 11 4. Summary Tis paper derives a simple algoritm for te construction of stratum boundaries in positively sewed populations, for wic it is sown tat te stratum breas may be obtained using te geometric distribution. Te proposed metod is easier to implement tan approximations previously proposed. Comparisons wit te commonly used cumulative root frequency metod using four positively sewed real populations divided into tree, four and five strata, sowed substantial gains in te precision of te estimator of te mean; te greatest gains occurring wen te number of strata was five. Comparisons wit te avallée-hidiroglou metod indicated tat a greater sample size was required to obtain te same precision as te geometric metod is most cases; te greatest increase in te required sample size occurred wit te largest number of strata. One limitation of te new algoritm compared to te avallée-hidiroglou metod of stratum construction is tat it does not determine a tae-all top stratum. Acnowledgements Tis wor was supported by a grant from te Iris Researc Council for Science, Engineering and Tecnology. We are indebted to te referees for teir elpful suggestions wic ave greatly improved te original paper. References Cocran, W.G. (1961). Comparison of metods for determining stratum boundaries. Bulletin of te International Statistical Institute, 3,, 345-358. Dalenius, T. (1950). Te problem of optimum stratification. Sandinavis Atuarietidsrift, 03-13. Dalenius, T., and Hodges, J.. (1957). Te coice of stratification points. Sandinavis Atuarietidsrift, 198-03. Dalenius, T., and Hodges, J.. (1959). Minimum variance stratification. Journal of te American Statistical Association, 88-101. Dorfman, A.H., and Valliant, R. (000). Stratification by size revisited. Journal of Official Statistics, 16, 139-154. Ecman, G. (1959). An approximation useful in univariate stratification. Te Annals of Matematical Statistics, 30, 19-9. Fal, E., Rotz, W. and Young,..P. (003). Stratified sampling for sales and use tax igly sewed data-determination of te certainty stratum cut-off amount. Proceedings of te Section on Statistical Computing, American Statistical Association, 66-7. Hedlin, D. (000). A procedure for stratification by an extended eman rule. Journal of Official Statistics, 16, 15-9. Horgan, J.M. (003). A list sequential sampling sceme wit applications in financial auditing. IMA Journal of Management Matematics, 14, 1-18. avallée, P., and Hidiroglou, M. (1988). On te stratification of sewed populations. Survey Metodology, 14, 33-43. Neyman, J. (1934). On te two different aspects of te representative metod: Te metod of stratified sampling and te metod of purposive selection. Journal of te Royal Statistics Society, 97, 558-606. Nicolini, G. (001). A metod to define strata boundaries. Woring Paper 01-001-marzo, Departimento di Economia Politica e Aziendale, Universita degli Studi di Milano. Rivest,.-P. (00). A generalization of te avallée-hidiroglou algoritm for stratification in business surveys. Survey Metodology, 8, 191-198.