Implementation of Controlled Selection in the National Compensation Survey Redesign October 2008


Lawrence R. Ernst¹, Christopher J. Guciardo², Yoel Izsak³, Jonathan J. Lisic², Chester H. Ponikowski²

¹ Bureau of Labor Statistics, 2 Massachusetts Ave., N.E., Room 1950, Washington, DC 20212, Ernst.Lawrence@bls.gov
² Bureau of Labor Statistics, 2 Massachusetts Ave., N.E., Room 3160, Washington, DC 20212
³ National Agricultural Statistics Service, 1400 Independence Ave., S.W., Room 6337, Washington, DC 20250

Abstract

Izsak et al. (2005) included a proposal to allocate the number of sample establishments among the sampling cells using a controlled selection procedure, where cells are area PSUs × industry sampling strata × sampling panels. Since then the procedure has been implemented, but with a number of modifications not discussed in Izsak et al. (2005). These modifications and possible future changes are discussed in this paper. They include: weighting changes necessitated by the use of controlled selection; complications caused by rounding issues and how they were overcome; complexities caused by the need to allocate over five sampling panels; and use of a real-valued minimum allocation for each sampling cell in the controlled selection process in order to avoid very large sample weights and accompanying increases in variances.

Key Words: Allocation, sampling panels, rounding, sampling cells, area PSUs, industry sampling strata.

1. Introduction

This paper covers some issues regarding the most recent redesign of the National Compensation Survey (NCS), a compensation program conducted by the Bureau of Labor Statistics, that were not completely covered in previous papers, such as Izsak et al. (2005). In particular, this paper covers some aspects of the use of controlled selection in the sample allocation process. This paper is best read in conjunction with Izsak et al. (2005), and, to the extent that seems reasonable, we will not repeat material that overlaps with that paper. Other papers that discuss some aspects of the NCS sample redesign include Ernst et al. (2002), Ernst, Guciardo, and Izsak (2004), Ernst, Izsak, and Paben (2004), and Izsak et al. (2003).

A key step in the selection of sample establishments for the NCS is the allocation of the NCS Wage Only sample and the Employment Cost Index (ECI) sample among 152 area strata × 20 industry strata for the government sector and 152 area strata × 23 industry strata × five sample panels for the private sector. The allocations are determined by solving several controlled selection problems using the methodology described in Causey, Cox, and Ernst (1985). The use of this methodology for the NCS application was first proposed in Izsak et al. (2005). However, the controlled selection procedure actually implemented required a number of modifications that were not discussed in that paper. They include: weighting changes necessitated by the use of controlled selection; complications caused by rounding issues and how they were overcome; complexities caused by the need to allocate over five sampling panels; and use of a real-valued minimum allocation for each sampling cell in the controlled selection process in order to avoid very large sample weights and accompanying increases in variances.

A short description of the two-dimensional controlled selection problem in general is presented in Section 2. The specific formulations of the controlled selection problems used in the NCS application are discussed in Section 3. Section 4 presents some modifications needed to ensure that the values of the internal cells of the controlled selection problem sum to the necessary marginals, and that this additivity is not destroyed either by rounding or by the fact that the original controlled selection arrays are generally real-valued, not integer-valued. Weighting changes necessitated by the use of controlled selection are presented in Section 5. Finally, Section 6 discusses the use of real-valued minimum allocations for the sampling cells in the controlled selection process, in order to avoid very large sample weights and accompanying increases in variances, along with other options for modifying the allocation process.

2. Controlled Selection Problems

Much of this paper is concerned with obtaining allocations of portions of the NCS sample among sample area × industry stratum cells by construction of two-dimensional tabular, that is additive, arrays, each of which constitutes a controlled selection problem, and then solving each controlled selection problem using a modification of the method of Causey, Cox, and Ernst (1985). Note that a two-dimensional controlled selection problem in the context of this paper is a two-dimensional additive array $S = (s_{ij})$ of dimensions $(M+1) \times (N+1)$, where $M$ is the number of sample areas and $N$ is the number of industry strata. $S$ satisfies the following conditions. Each internal cell value, that is, a value for a cell that is in neither the last row nor the last column, is the expected number of sample units to be selected in the corresponding sample area × industry stratum cell. A row marginal is the value for a cell in the last column of a row, and a column marginal is the value for a cell in the last row of a column. These two marginals are, respectively, the expected number of sample units in an area and in an industry. The cell value in the final row and column is the total sample size. The cell values, except in some cases for the grand total, are generally real-valued, not integer-valued.

A solution to the controlled selection problem $S$ is a set of $l$ integer-valued two-dimensional additive arrays, $N_1 = (n_{ij1}), \ldots, N_l = (n_{ijl})$, of the same dimensions as $S$, and associated probabilities, $p_1, \ldots, p_l$, such that for each cell $ij$, including marginals, in each array $N_k$

    $|n_{ijk} - s_{ij}| < 1$    (2.1)

and for each $i, j$

    $\sum_{k=1}^{l} p_k n_{ijk} = s_{ij}$    (2.2)

A solution to a controlled selection problem can be obtained by solving a sequence of transportation problems through a recursive procedure described in Causey, Cox, and Ernst (1985, Sec. 4.1). Once a solution is obtained to a controlled selection problem, one of the arrays $N_k$ is chosen from among the arrays $N_1, \ldots, N_l$ using the associated probabilities $p_1, \ldots, p_l$. The chosen $N_k$ determines the allocation to each sampling cell. Note that by (2.1) and (2.2) it follows that for the solution to our controlled selection problems:

    The number of sample units in each sample area, in each industry stratum, and in each sample area × industry stratum cell is within one of the desired number for every possible sample.    (2.3)

    The expected number of sample units in each of the domains listed in (2.3) over all possible samples is the desired number.    (2.4)
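Conditions (2.1) and (2.2), and the final random choice among the arrays, can be illustrated with a minimal sketch. The 2 × 2 example array, the function names `is_solution` and `select_allocation`, and the numerical tolerance are assumptions made for illustration and are not taken from the NCS production system. The check that the probabilities sum to one is implied by the definition above rather than stated in it.

```python
import random

def is_solution(S, arrays, probs, tol=1e-9):
    """Check a candidate solution against conditions (2.1) and (2.2).

    S      : additive array including marginals (list of rows)
    arrays : candidate integer arrays N_1, ..., N_l, same shape as S
    probs  : associated probabilities p_1, ..., p_l
    """
    if abs(sum(probs) - 1.0) > tol:          # implied: the p_k form a distribution
        return False
    for i in range(len(S)):
        for j in range(len(S[0])):
            # (2.1): every array is within one of the target in every cell
            if any(abs(N[i][j] - S[i][j]) >= 1 for N in arrays):
                return False
            # (2.2): the expected cell value equals the target
            expected = sum(p * N[i][j] for p, N in zip(probs, arrays))
            if abs(expected - S[i][j]) > tol:
                return False
    return True

def select_allocation(arrays, probs, rng=random):
    """Choose one array N_k with probability p_k."""
    return rng.choices(arrays, weights=probs, k=1)[0]

# Hypothetical problem: 2 areas x 2 industries, last row and column are marginals.
S = [[0.5, 0.5, 1.0],
     [0.5, 0.5, 1.0],
     [1.0, 1.0, 2.0]]
N1 = [[1, 0, 1], [0, 1, 1], [1, 1, 2]]
N2 = [[0, 1, 1], [1, 0, 1], [1, 1, 2]]

solution_ok = is_solution(S, [N1, N2], [0.5, 0.5])
chosen = select_allocation([N1, N2], [0.5, 0.5], random.Random(2008))
```

In this toy case every possible sample has exactly one unit per area and one per industry, which is property (2.3), while each internal cell receives a unit half the time, which is property (2.4).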
However, the methodology given in Causey, Cox, and Ernst (1985) may fail to yield a solution to a controlled selection problem because of rounding error in the computation of the cell values, since the rounding error can destroy the additivity of the tabular array. In particular, this can occur for the controlled selection problems associated with the NCS allocations to be described, since the cell values for each of the initial controlled selection arrays are real numbers as opposed to integers. We will explain in Section 4 how the problems arising from rounding error can be avoided for these controlled selection problems by converting the original controlled selection problem into a sequence of rounding problems involving integer arithmetic. Before doing so, however, we describe in Section 3 how the original real-valued tabular arrays were constructed for our NCS allocation problems.

3. Formulation of the Tabular Arrays for NCS Allocations

There are five types of tabular arrays that require the solution of controlled selection problems in conjunction with the NCS allocations. We first consider the arrays for the government sector. For this sector there are two controlled selection problems, one for the ECI allocation and one for the NCS Wage Only allocation, with the latter problem created by subtracting the ECI allocation array from the total NCS Wage allocation array. Note, in particular, that although a controlled selection problem for total government NCS Wage is constructed, it is not solved directly. For reasons explained in Izsak et al. (2005), the Wage Only array is solved instead, and the sum of the solution to the Wage Only array and the ECI allocation array is used as a solution to the total NCS Wage allocation array.

The preliminary government ECI allocation array is obtained by taking the entire ECI government sample count, which is an integer, and allocating it among the 20 government sampling industries proportional to PSU weighted employment, with the industry allocations being real-valued with no rounding. Each of the 20 ECI government industry totals is then allocated among the 152 sample areas, again proportional to PSU weighted employment. This creates a preliminary two-dimensional, real-valued controlled selection problem whose internal cells have dimensions 152 × 20. The preliminary total NCS Wage government allocation is obtained similarly to the ECI array, but in the opposite order: the total government NCS Wage sample is first allocated among the 152 areas and then within each area to the 20 industries.

The allocation arrays for government ECI and total government NCS Wage are then modified in two ways. First, for any cell in which the total NCS Wage allocation is less than the ECI allocation, the NCS Wage allocation is raised to the ECI allocation. In addition, any cell for which either the ECI or the total NCS Wage allocation exceeds the number of frame units has its cell allocation lowered to the number of frame units for that survey or surveys. Then, for each of these surveys, the remaining sample in each area is reallocated among the remaining cells in the area proportional to frame employment, and the allocation adjustment process is iterated until no more adjustments are necessary. The modified marginals are then obtained by summing. Finally, as stated earlier in this section, the controlled selection array for ECI government is subtracted from the controlled selection array for total NCS Wage government to obtain a controlled selection array for NCS Wage Only government, and the controlled selection procedure is used to obtain integer allocations for ECI and NCS Wage Only government.

For the private sector there are three types of controlled selection problems used in the allocation process. First, the total NCS Wage private sector sample and the ECI sample over five panels (Izsak et al. 2005) are allocated among the sampling cells. This is analogous to the government sector, except that there are 23 industries in the private sector. Also, when allocating the private sector ECI sample among the industries, the sample allocation to each industry is based on historical allocations that in turn are based on frame employments, variances, and response rates for the industries, instead of being proportional to PSU weighted employment. A topic for further research is possible adjustment of the proportions of the ECI allocated to each industry to take into account changes in these quantities by industry over time. An additional difference for the private sector is that for the NCS Wage sample there is a minimum allocation for the Pay Agent areas (Izsak et al. 2005) and a maximum allocation for the three largest areas. When allocating this sample among the areas, any area with an allocation below the minimum or above the maximum has its allocation adjusted to the minimum or maximum, respectively, with the remaining sample allocated to the remaining areas proportional to PSU weighted employment, and the allocation adjustment process is iterated if necessary.
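The iterated adjustment described above (allocate proportionally, pin any area that violates a bound at that bound, then reallocate the remainder among the other areas) can be sketched as follows. The function name `allocate_with_bounds`, the example employment figures, and the choice to pin minimum and maximum violations in the same pass are assumptions for illustration. The paper does not spell out these implementation details.

```python
def allocate_with_bounds(total, employment, lower, upper):
    """Allocate `total` proportional to `employment`, subject to per-area bounds.

    Areas whose proportional share falls outside [lower, upper] are pinned at
    the violated bound; the remaining sample is reallocated proportionally
    among the remaining areas, and the process repeats until no area violates
    a bound. Allocations stay real-valued (no rounding).
    """
    n = len(employment)
    fixed = {}                                  # area index -> pinned allocation
    while True:
        free = [i for i in range(n) if i not in fixed]
        remaining = total - sum(fixed.values())
        free_employment = sum(employment[i] for i in free)
        alloc = {i: remaining * employment[i] / free_employment for i in free}
        out_of_range = {}
        for i in free:
            if alloc[i] < lower[i]:
                out_of_range[i] = lower[i]
            elif alloc[i] > upper[i]:
                out_of_range[i] = upper[i]
        if not out_of_range:
            break
        fixed.update(out_of_range)
    alloc.update(fixed)
    return [alloc[i] for i in range(n)]

# Hypothetical example: four areas, a maximum of 40 for the largest area and a
# minimum of 10 for the smallest (standing in for a Pay Agent area minimum).
INF = float("inf")
result = allocate_with_bounds(
    total=100,
    employment=[50.0, 30.0, 15.0, 5.0],
    lower=[0, 0, 0, 10],
    upper=[40, INF, INF, INF],
)
```

Here the first pass gives 50, 30, 15 and 5; the first and last areas are pinned at 40 and 10, and the remaining 50 units are split 2:1 between the two middle areas. The same loop, with the number of frame units as the upper bound and frame employment as the size measure, mirrors the frame-count adjustment described for the government sector.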
The total private sector samples for ECI and NCS Wage are integer-valued, but the cell allocations are real-valued for all other cells. Controlled selection is not used to directly allocate either the total private NCS Wage Only sample or the ECI sample over five panels.

The first private sector controlled selection array is for the ECI five panel noncertainty units, as explained in Izsak et al. (2005) and summarized here. We first obtain the allocation to the ECI noncertainty five panel sample in each cell by subtracting the number of five panel ECI certainties from the total ECI five panel sample size in the cell, and allocating the remaining ECI sample between five panel noncertainties and single panel units, proportional to PSU weighted employment. The marginals for five panel ECI noncertainties are obtained by summing the internal cells. Controlled selection is then performed on the tabular array for ECI five panel noncertainties. The grand totals for the ECI five panel noncertainties and the single panel units are integer-valued, but the cell allocations are real-valued for all other cells.

The final two controlled selection problems are for single panel private sector noncertainty units for ECI and NCS Wage Only. These are handled similarly to the two controlled selection problems for the government sector, with the following exceptions. The ECI single panel cell allocation for the first of the five panels is obtained by taking the total ECI allocation for the cell, subtracting the number of ECI five panel sample units, and dividing by 5. The NCS Wage single panel cell allocation for the first of the five panels is handled similarly. The NCS Wage Only allocation for this panel is obtained by subtraction, as was done for the government sector. Originally it was intended that the resulting controlled selection arrays for the ECI and NCS Wage Only single panels would each be used to independently obtain five controlled roundings corresponding to the five single panels for ECI and NCS Wage Only.
However, for the first single panel sample, a sample cut took place before the sample selection. The result of the sample cut was a reduction of the allocation in each ECI sampling cell in the ECI controlled selection array by a fixed percentage, and a reduction in each NCS Wage sampling cell in the NCS Wage controlled selection array by a different fixed percentage. These sample reductions were performed with the constraint that the NCS Wage sample for each Pay Agent area not be reduced below the Pay Agent area minimum. For the second single panel sample, an additional sample cut took place, based on the reduced expected allocation in each cell after the cut for the first single panel sample, and the cell allocations were also modified by the use of updated frame counts for the sampling cells. These two changes resulted in a different controlled selection array. Thus the controlled selection problems differed between the first and second single panels, and will continue to differ if there are further sample adjustments corresponding to the other single panel samples or if updated frames are used in forming the controlled selection arrays. Note that the altered cell allocations used in the controlled selection problems for the single panel samples are generally real-valued, not integer-valued, even for the grand total, because of the division by 5.

4. Modifying Controlled Selection Problems to Avoid Rounding Errors

In this section we explain how modifications are made to avoid rounding errors that would otherwise destroy the additivity of the controlled selection arrays. These modifications involve conversion of the initial controlled selection array. Note that, in general, for a two-dimensional tabular array $S = (s_{ij})$, a controlled rounding of $S$ to a positive integer base $b$ is a tabular array $N = (n_{ij})$ where, for each $i, j$, $n_{ij}$ is a positive integer multiple of $b$ for which $|n_{ij} - s_{ij}| < b$. If no base is specified, then base 1 is understood, that is, $N = (n_{ij})$ is an integer-valued tabular array satisfying (2.1).

For each of the controlled selection problems $S = (s_{ij})$ described in Section 3, the array $S$ is generally not integer-valued, which can lead to lack of additivity and an inability to solve the transportation problems needed to obtain controlled roundings of $S$. To overcome these difficulties, we convert the array $S$ with dimensions $(M+1) \times (N+1)$ to an integer-valued additive array $S' = (s'_{ij})$ with dimensions $(M+2) \times (N+2)$, incorporating an approach in Cox and Ernst (1982). The marginals of $S'$ are positive integer multiples of a base $10^{\varepsilon}$, where $\varepsilon$ is a positive integer that depends on the number of places of accuracy desired. $\varepsilon = 4$ was used for the controlled selection problems considered for NCS. To obtain $S'$, first let

    $s'_{ij} = \mathrm{floor}(10^{\varepsilon} s_{ij}, 1), \quad i = 1, \ldots, M, \; j = 1, \ldots, N$    (4.1)
    $s'_{i(N+2)} = \mathrm{ceiling}\left(\sum_{j=1}^{N} s'_{ij}, 10^{\varepsilon}\right), \quad i = 1, \ldots, M$    (4.2)
    $s'_{i(N+1)} = s'_{i(N+2)} - \sum_{j=1}^{N} s'_{ij}, \quad i = 1, \ldots, M$    (4.3)
    $s'_{(M+2)j} = \mathrm{ceiling}\left(\sum_{i=1}^{M} s'_{ij}, 10^{\varepsilon}\right), \quad j = 1, \ldots, N+1$    (4.4)
    $s'_{(M+1)j} = s'_{(M+2)j} - \sum_{i=1}^{M} s'_{ij}, \quad j = 1, \ldots, N+1$    (4.5)
    $s'_{i(N+2)} = \sum_{j=1}^{N+1} s'_{ij}, \quad i = M+1, M+2$    (4.6)

where $\mathrm{floor}(x, y)$ is the largest integer multiple of $y$ not exceeding $x$ and $\mathrm{ceiling}(x, y)$ is the smallest integer multiple of $y$ that is not less than $x$.

We will illustrate the controlled selection process by an example. The controlled selection arrays used in production in NCS have $M = 152$, $N = 20$ or $N = 23$, and $\varepsilon = 4$; to keep the illustrative example manageable in size we take $M = 2$, $N = 3$, and $\varepsilon = 3$.

There are a number of ways to set up a controlled selection problem. The approach used in (4.1)-(4.6) clearly ensures that $S'$ is an additive array with integer cell values. In Figure 1 the first two arrays presented are the original $S$ for the illustrative example and $S'$ calculated using (4.1)-(4.6). However, there is one drawback to calculating $S'$ with this approach, which will be addressed at the end of the section. Other approaches for modifying $S$ may not work at all. For example, one approach would be to simply round the allocation of each cell in $S$.
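The construction in (4.1)-(4.6), and the way simple cell-by-cell rounding can break additivity, can be sketched as below. The 2 × 3 array of internal cell values is a hypothetical stand-in chosen so that the grand total is exactly 7; it is not the array shown in Figure 1. The function names and the use of exact `Fraction` arithmetic are assumptions made for illustration. Indices are 0-based, so column `N` is the paper's column $N+1$ and column `N + 1` is the paper's column $N+2$.

```python
from fractions import Fraction
import math

def ceiling_to(x, base):
    """Smallest integer multiple of `base` that is not less than integer x."""
    return -(-x // base) * base

def build_s_prime(S_internal, eps):
    """Build the (M+2) x (N+2) integer additive array S' from the internal cells of S."""
    M, N = len(S_internal), len(S_internal[0])
    base = 10 ** eps
    Sp = [[0] * (N + 2) for _ in range(M + 2)]
    for i in range(M):
        for j in range(N):
            Sp[i][j] = math.floor(base * S_internal[i][j])    # (4.1)
        row_sum = sum(Sp[i][:N])
        Sp[i][N + 1] = ceiling_to(row_sum, base)              # (4.2)
        Sp[i][N] = Sp[i][N + 1] - row_sum                     # (4.3)
    for j in range(N + 1):
        col_sum = sum(Sp[i][j] for i in range(M))
        Sp[M + 1][j] = ceiling_to(col_sum, base)              # (4.4)
        Sp[M][j] = Sp[M + 1][j] - col_sum                     # (4.5)
    for i in (M, M + 1):
        Sp[i][N + 1] = sum(Sp[i][:N + 1])                     # (4.6)
    return Sp

def is_additive(A):
    """True if the last column holds row sums and the last row holds column sums."""
    rows_ok = all(sum(row[:-1]) == row[-1] for row in A)
    cols_ok = all(sum(row[j] for row in A[:-1]) == A[-1][j] for j in range(len(A[0])))
    return rows_ok and cols_ok

# Hypothetical internal cells (M = 2, N = 3); each row sums to 3.5, total 7.
S_internal = [[Fraction(5, 3), Fraction(2, 3), Fraction(7, 6)],
              [Fraction(4, 3), Fraction(5, 6), Fraction(4, 3)]]
S_prime = build_s_prime(S_internal, eps=3)

# Naive alternative: round every cell to 3 places; the first row no longer adds up.
naive_row_sum = sum(round(x, 3) for x in S_internal[0])       # 3501/1000
naive_row_marginal = round(sum(S_internal[0]), 3)             # 7/2
```

For this input the internal cells of `S_prime` sum to 6997 rather than 7000, which is the same kind of grand-total shortfall as the one discussed for the paper's own example later in this section.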
Simple cell-by-cell rounding generally will not work, because the array obtained from rounding $S$ will typically not be additive.

We obtain a solution to the controlled selection problem $S'$ by iteratively constructing a sequence of arrays $A_k$, $k = 1, \ldots, l$, of dimensions $(M+2) \times (N+2)$, with the marginals of $A_k$ an integer multiple of a base $b_k$, with $N_k$ a controlled rounding of $A_k$ to the base $b_k$, and with $p_k$ the probability of selection of $N_k$. The set of controlled roundings and associated probabilities satisfies (2.2) without rounding error. The only rounding error occurs in the conversion from $S$ to $S'$.

We begin by letting $A_1 = (a_{ij1}) = S'$ and $b_1 = 10^{\varepsilon}$. Then we obtain a controlled rounding $N_1 = (n_{ij1})$ of $A_1 = (a_{ij1})$ to the base $b_1$. Next we divide each cell in $N_1$ by $b_1$ and round to the nearest integer to obtain $N'_1$. Since each cell of $N_1$ is an integer multiple of $b_1$, there should be no rounding error in obtaining $N'_1$ beyond the rounding error in obtaining $S'$. Having obtained $A_1$, $N_1$, $N'_1$, and $b_1$, we proceed to explain how for $k > 1$ we obtain by recursion $b_k$, $p_{k-1}$, $A_k$, $N_k$, $N'_k$. We let

    $b_k = \max\{|n_{ij(k-1)} - a_{ij(k-1)}| : i = 1, \ldots, M+1, \; j = 1, \ldots, N+1\}$    (4.7)
    $p_{k-1} = (b_{k-1} - b_k) / b_1$    (4.8)
    $a_{ijk} = (n_{ij(k-1)} / b_{k-1}) \, b_k + a_{ij(k-1)} - n_{ij(k-1)}$    (4.9)
    $N_k$ be a controlled rounding of $A_k$ to the base $b_k$    (4.10)
    $N'_k = (n'_{ijk})$ be the array defined by $n'_{ijk} = n_{ijk} / b_k$, with the quotient rounded to the nearest integer    (4.11)

Eventually we reach a $k$ for which $b_{k+1} = 0$. Then $l = k$ and $p_l = b_l / b_1$, with the tabular arrays $N'_1 = (n'_{ij1}), \ldots, N'_l = (n'_{ijl})$ and associated probabilities $p_1, \ldots, p_l$ constituting a solution to the controlled selection problem $S' / 10^{\varepsilon}$. For the illustrative example, $l = 7$; $b_1, \ldots, b_8$ are, respectively, 1000, 542, 434, 351, 141, 18, 2, and 0 by (4.7); and $p_1, \ldots, p_7$ are, respectively, 0.458, 0.108, 0.083, 0.210, 0.123, 0.016, and 0.002. $A_k$, $N_k$, $k = 1, \ldots, 7$, are given in Figure 1. $N'_1, \ldots, N'_7$ are not presented in the figure but are obvious by (4.11).

Note that for the procedure just described, $S'$ and $A_k$, $k = 1, \ldots, l$, are completely additive integer-valued tabular arrays, which is the key to ensuring that the necessary controlled roundings can be obtained. There is rounding error in the construction of $S'$ from $S$, but it does not destroy any necessary additivity, the loss of which could lead to difficulties in performing the controlled roundings. The rounding error in the construction of $S'$ is at most 1 for any internal cell, which is equivalent to a rounding error of $10^{-\varepsilon}$ in the original $S$.

The one drawback of the approach using (4.1)-(4.6) is that, if the grand total for the original controlled selection array $S$ is exactly an integer, it does not guarantee that each of the controlled roundings associated with $S'$ will lead to that grand total. This may or may not be a concern. In the example, the original expected value of the grand total for $S$ is 7, but the grand total for $S'/1000$ is $\sum_{i=1}^{2} \sum_{j=1}^{3} s'_{ij} / 1000 = 6.998$ and the grand total corresponding to $N'_7$ is $\sum_{i=1}^{2} \sum_{j=1}^{3} n'_{ij7} = 6$, while the grand total corresponding to all the other $N'_k$ is 7.

If this rounding error in the grand total is a concern, it can be avoided as follows. First make the following modification in the construction of $S'$. In (4.1)-(4.6) replace $\varepsilon$ wherever it occurs with $\varepsilon + \delta$, where $\delta$ is the smallest positive integer for which $10^{\delta}$ is greater than the number of internal cells in $S$. Thus $\delta = 1$ in the illustrative example, since there are 6 internal cells in $S$.
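One step of the recursion (4.7)-(4.11), and the probabilities implied by a sequence of bases, can be sketched as follows. The sketch assumes the controlled roundings $N_k$ are supplied by hand, since finding them requires solving a transportation problem that is not shown here. The tiny 3 × 3 array with base 10 is a hypothetical example that omits the extra slack row and column of $S'$, and the function names are illustrative. The maximum in `recursion_step` is taken over all cells, which gives the same value as (4.7) whenever the final row and column are exact multiples of the base.

```python
from fractions import Fraction

def recursion_step(A_k, N_k, b_k):
    """Given A_k and a controlled rounding N_k to base b_k,
    return b_{k+1} by (4.7), N'_k by (4.11), and A_{k+1} by (4.9)."""
    b_next = max(abs(n - a)
                 for row_n, row_a in zip(N_k, A_k)
                 for n, a in zip(row_n, row_a))
    # Cells of N_k are exact multiples of b_k, so integer division is exact.
    N_prime = [[n // b_k for n in row] for row in N_k]
    A_next = [[q * b_next + a - n for q, a, n in zip(row_q, row_a, row_n)]
              for row_q, row_a, row_n in zip(N_prime, A_k, N_k)]
    return b_next, N_prime, A_next

def selection_probabilities(bases):
    """p_k = (b_k - b_{k+1}) / b_1 for a base sequence that ends in 0."""
    b1 = bases[0]
    return [Fraction(bases[k] - bases[k + 1], b1) for k in range(len(bases) - 1)]

# Hypothetical problem with b_1 = 10: every internal cell has expected value 0.5.
A1 = [[5, 5, 10], [5, 5, 10], [10, 10, 20]]
N1 = [[10, 0, 10], [0, 10, 10], [10, 10, 20]]     # a controlled rounding to base 10
b2, N1_prime, A2 = recursion_step(A1, N1, 10)
N2 = A2                                           # A2 is already a multiple of b2
b3, N2_prime, _ = recursion_step(A2, N2, b2)      # b3 == 0, so l = 2
toy_probs = selection_probabilities([10, b2, b3])

# Base sequence reported in the paper for the illustrative example.
paper_probs = selection_probabilities([1000, 542, 434, 351, 141, 18, 2, 0])
expected_grand_total = 7 * (1 - paper_probs[-1]) + 6 * paper_probs[-1]
```

With the paper's bases the probabilities come out as 0.458, 0.108, 0.083, 0.210, 0.123, 0.016 and 0.002, and weighting a grand total of 7 for the first six arrays and 6 for the seventh gives 6.998, in agreement with the grand total of $S'/1000$.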
The resulting array, denoted $S''$, is presented in Figure 2. Then construct a new tabular array $S'$ from $S''$ by first performing a controlled rounding of $S''$ to the base $10^{\delta}$, with the additional requirement that $s''_{(M+1)(N+1)}$ be rounded up, not down, in this controlled rounding, and then dividing each cell in the controlled rounding by $10^{\delta}$ and rounding to the nearest integer to obtain $S'$. For the illustrative example, $S'$ is as given in Figure 2. We then let $A_1 = S'$ and proceed using (4.7)-(4.11) as before. The additional requirement that $s''_{(M+1)(N+1)}$ is always rounded up in the controlled rounding can always be satisfied, as explained in Cox and Ernst (1982). In particular, for the illustrative example, $s'_{4,3} = 1000$, from which it follows that $n_{4,3,k} = 1000$ and $\sum_{i=1}^{2} \sum_{j=1}^{3} n'_{ijk} = 7$ for all $k$.

Another issue is the situation in which the controlled roundings have to be selected in a coordinated fashion for two controlled selection problems. In particular, this problem arises when we have separate controlled selection problems for ECI and NCS Wage Only, and we wish to minimize the number of sampling cells for which the sum of the two rounded allocations differs by more than 1 from the expected number of total NCS Wage units in the sampling cell. (This can occur if both surveys are rounded in the same direction.) In that case, the construction of $S'$ is done independently for each controlled selection problem and we set $b_1 = 10^{\varepsilon}$. The recursive computation of $b_k$, $p_{k-1}$, $A_k$, $N_k$, $N'_k$ for $k > 1$ is done separately and independently for the two surveys using (4.7)-(4.11), with the following exceptions:

    After obtaining the controlled rounding $N_k$ for ECI, the corresponding controlled rounding for NCS Wage Only is obtained by using an objective function that minimizes the number of cells for which the sum of the roundings for the two surveys differs from the expected value by more than 1.    (4.12)

    After $b_k$ is computed separately for each survey, the minimum of these two values of $b_k$ is taken as the $b_k$ to use in (4.7)-(4.11) for both surveys.    (4.13)

See Izsak et al. (2005) for more information on the coordinated selection of the controlled roundings for the two surveys.

5. Base Sample Weights

The procedure for obtaining sample weights is typically more complex when using controlled selection, where the allocation to each sampling cell is not fixed, than for sampling problems where the sample cell allocation is fixed. Consider a population of $N$ units with weights $w_i$ and values $y_i$, $i = 1, \ldots, N$, population total $Y = \sum_{i=1}^{N} y_i$, and estimator of total $\hat{Y} = \sum_{i=1}^{N} w_i y_i$. A sufficient condition for this set of weights to result in unbiased estimates of totals is that $E(w_i) = 1$ for each unit $i$ (Ernst 1989), since if this condition is met we have $E(\hat{Y}) = Y$. The simplest case in which this condition is met occurs when the probability of selection $p_i$ of unit $i$ can be calculated for each unit. In this case, if we let $w_i = 1/p_i$ when unit $i$ is in sample and $w_i = 0$ otherwise, the set of weights satisfies $E(w_i) = 1$ for all $i$.

The calculation of $p_i$ is typically easier when the allocation of the number of units to each sampling cell is fixed than when it is not. However, there are many situations where this allocation is not fixed. In particular, when controlled selection is used, there are generally two possible allocations for the cell containing each unit, which are consecutive integers that we denote by $j$ and $j+1$. In this case, there are two possible selection probabilities for unit $i$, $p_{ij}$ and $p_{i(j+1)}$, corresponding to allocations of $j$ units and $j+1$ units, respectively, to the sampling cell containing unit $i$. Then, provided $j \neq 0$, the corresponding weight is $w_i = 1/p_{ij}$ if unit $i$ is in sample and the allocation to the cell containing unit $i$ is $j$ units; $w_i = 1/p_{i(j+1)}$ if unit $i$ is in sample and the allocation to the cell containing unit $i$ is $j+1$ units; and $w_i = 0$ if unit $i$ is not in sample. Under these conditions $E(w_i) = 1$, since $E(w_i) = 1$ conditional on the allocation to unit $i$'s cell being $j$ units and also conditional on this allocation being $j+1$ units. (Note that for the NCS Wage sample we can sometimes have three possible allocations for a cell, for reasons discussed at the very end of Section 4 and in more detail in Izsak et al. (2005), but the same weighting idea works in that case too.)
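The claim that conditional weights give an unbiased estimator of the total can be checked by brute-force enumeration for a single small cell. The sketch below assumes simple random sampling without replacement within the cell, so that $p_{ij} = j/F$ for a cell with $F$ frame units. This is a simplifying assumption made for the example, not a description of how NCS selects establishments. The cell values, allocation probabilities, and function name are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def expected_estimate(values, alloc_probs):
    """Exact expectation of the weighted total for one cell.

    values      : y-values of the frame units in the cell
    alloc_probs : {allocation: probability that the cell receives that allocation}
    Within the cell, units are drawn by simple random sampling (an assumption),
    and each sampled unit gets the weight 1 / p conditional on the allocation.
    """
    F = len(values)
    expectation = Fraction(0)
    for alloc, prob_alloc in alloc_probs.items():
        p_unit = Fraction(alloc, F)           # p_ij given an allocation of `alloc` units
        weight = 1 / p_unit                   # w_i = 1 / p_ij
        samples = list(combinations(range(F), alloc))
        for sample in samples:
            estimate = sum(weight * values[i] for i in sample)
            expectation += prob_alloc * Fraction(1, len(samples)) * estimate
    return expectation

# Hypothetical cell: 4 frame units, allocation is j = 1 with probability 3/5
# and j + 1 = 2 with probability 2/5 (expected allocation 1.4).
cell_values = [10, 20, 30, 40]
cell_expectation = expected_estimate(cell_values, {1: Fraction(3, 5), 2: Fraction(2, 5)})
```

The expectation equals the true cell total of 100 because the estimator is unbiased under each allocation separately, which is the conditional argument given in the text.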
Now, if $j = 0$, then unit $i$ is in sample if and only if the allocation to the cell containing unit $i$ is 1 unit and unit $i$ is selected conditional on this allocation. We let $r_i$ be the probability of the former condition being met, while the probability of the latter condition being met is $p_{i1}$. That is, $r_i$ is the value of the entry in the controlled selection array in the sampling cell containing unit $i$. Consequently, if we let $w_i = 1/(r_i p_{i1})$ when unit $i$ is selected and $w_i = 0$ otherwise, then $E(w_i) = 1$. Thus $r_i$ is the probability that the allocation to the cell containing unit $i$ is 1 unit, and $1/r_i$ is the weighting adjustment that accounts for the fact that when the allocation to a cell is 0 units, the cell will not contribute to the estimates.

$r_i$ is calculated differently for the ECI and the NCS Wage samples. To calculate $r_i$ for the controlled selection problem for ECI, simply set it to the controlled selection value for each internal cell for which the controlled selection value is less than 1, and set $r_i = 1$ for all other internal cells. As discussed in Section 3, controlled selection is used in choosing the ECI sample for the government sector, the ECI five panel noncertainty units for the private sector, and the ECI single panel units for the private sector.

For the NCS Wage sample, the calculation of $r_i$ is more complex, because the allocation to the Wage sample in the cell containing unit $i$ is greater than 0 if either the index sample or the Wage Only sample has an allocation to that cell that is greater than 0. For each Wage sample to be selected there corresponds an index controlled selection problem and a Wage Only controlled selection problem. For each such pair of controlled selection problems there corresponds a set of pairs of controlled roundings, one with the index allocation to each cell and the other with the Wage Only allocation, with each pair having an associated probability. For each cell, $r_i$ is the sum of these associated probabilities over all pairs of controlled roundings for which either the index allocation or the Wage Only allocation to the cell is greater than 0.
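The two ways of computing $r_i$, and the resulting weight when $j = 0$, can be sketched as follows. The function names, the list-of-tuples representation of the pairs of controlled roundings, and the numbers in the example are assumptions made for illustration.

```python
from fractions import Fraction

def r_eci(cell_value):
    """ECI: r is the controlled selection value if it is below 1, otherwise 1."""
    return min(cell_value, 1)

def r_wage(rounding_pairs):
    """NCS Wage: sum the probabilities of the pairs of controlled roundings in
    which the index (ECI) allocation or the Wage Only allocation is positive.

    rounding_pairs : list of (probability, index_allocation, wage_only_allocation)
                     for one sampling cell
    """
    return sum(prob for prob, index_alloc, wage_only_alloc in rounding_pairs
               if index_alloc > 0 or wage_only_alloc > 0)

def weight_when_j_is_zero(r, p_i1):
    """w_i = 1 / (r_i * p_i1) for a selected unit in a cell allocated 0 or 1 units."""
    return 1 / (r * p_i1)

# Hypothetical cell: three pairs of controlled roundings with their probabilities.
pairs = [(Fraction(1, 2), 0, 0),      # neither survey allocates a unit to the cell
         (Fraction(3, 10), 1, 0),     # index sample gets one unit
         (Fraction(1, 5), 0, 1)]      # Wage Only sample gets one unit
r_cell = r_wage(pairs)                                   # 3/10 + 1/5 = 1/2
w_cell = weight_when_j_is_zero(r_cell, Fraction(1, 4))   # p_i1 = 1/4 is assumed
expected_w = r_cell * Fraction(1, 4) * w_cell            # P(selected) * weight
```

The last line makes the unbiasedness condition explicit: the unit is in sample with probability $r_i p_{i1}$ and then carries weight $1/(r_i p_{i1})$, so its expected weight is 1.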
The sum is taken over these pairs because $r_i$ is the probability that the NCS Wage allocation to the cell is positive. Controlled selection is used in choosing the NCS Wage Only sample for the government sector and for the single panel private sector. (Note that in the case when it is possible for a cell allocation to be any of 0, 1, or 2 units, then $w_i = 1/(r_i p_{i2})$ when the allocation is 2 units and unit $i$ is selected.)

6. Minimum Expected Cell Allocations

Now, using the notation of the previous section, if $j = 0$ and $r_i$ is very small, then the general tendency would be for $w_i$ to be very large, which typically leads to large variance estimates. To overcome this problem we considered requiring a minimum value for $r_i$ and conducted an empirical investigation comparing five allocation options. Three of these options differed only in the minimum value required for $r_i$. The three values considered in the empirical investigation were $r_i = 0.00$ (that is, no minimum), $r_i = 0.01$, and $r_i = 0.05$, which were labeled Options 1, 4 and 3, respectively. Two other options were also considered, Options 2 and 5, neither of which uses minimum values for $r_i$. In Option 2, unlike any of the other options, minimum allocations for Pay Agent areas were not considered, while for Option 5, unlike the other options, we did not remove the Wage Only sample from nonmetropolitan areas.

We compared the variances of the five options. For national estimates, among the three options that differed only in the value of $r_i$, Options 3 and 4 had the lowest variances. Option 3 did slightly better than Option 4, likely because of the higher minimum weight adjustment factor, but we preferred Option 4 since we would prefer having minimums be as small and unobtrusive as possible. However, Option 5 had the lowest national variances among all five options, which we believe is due to the fact that this option is the only one that retains Wage Only units for nonmetropolitan areas. Option 2 had lower national variances than Option 1. This is to be expected, since the removal of the Pay Agent area minimums should lower national variances. However, Option 2 produced higher national variances than Options 3 and 4, since the latter two options use cell minimums.

For the group of Pay Agent areas, variances were fairly similar over the five options, with a slightly higher variance for Option 2, which is to be expected because of the removal of minimum thresholds for Pay Agent areas for this option, and a slightly lower variance for Option 4. For metropolitan areas excluding Pay Agent areas, the average variances were fairly even across the different methods, with Option 5 producing the highest variances and Option 4 producing the lowest. The higher variances for Option 5 for these areas appeared to result from the fact that, since it is the only option that does not remove micropolitan and outside CBSA county clusters from the NCS Wage Only sample, this option has the largest sample for these two types of areas and the smallest sample for metropolitan areas. In micropolitan and outside CBSA county clusters, Option 5 performed the best for the reason just explained, with no clear pattern for the other four options.
The variances for these four options jumped around quite a bit, in part, we believe, because of the relatively small sample allocated to nonmetropolitan areas for these options. Note that since each of the options yielded different controlled selection problems, different controlled roundings were used for each of the options. This fact may have resulted in a substantial increase in the variability of the variance estimates for the different options.

It was decided to adopt Option 4. Option 2 was rejected because it removed Pay Agent area minimums, which increased the variances for those areas, without producing the lowest variances for any type of area. Option 5 was rejected because of our emphasis on reducing the variances of metropolitan area estimates. Among the other three options, Options 3 and 4 generally produced lower variances than Option 1 for most domains because of the use of minimum real-valued cell allocations. Option 3 actually produced a slightly lower national variance estimate than Option 4, but the difference was very small and may have been at least partially due to the specific controlled rounding that was selected for each option. In general we thought that, when in doubt, it is better to use a minimum that is as small and unobtrusive as possible while still avoiding very large weight adjustment factors, and it was felt that Option 4 best met this requirement.

References

Causey, B. D., Cox, L. H., and Ernst, L. R. (1985). Applications of Transportation Theory to Statistical Problems. Journal of the American Statistical Association, 80, 903-909.

Cox, L. H., and Ernst, L. R. (1982). Controlled Rounding. INFOR, 20, 423-432.

Ernst, L. R. (1989). Weighting Issues for Longitudinal Household and Family Estimates. Panel Surveys, 139-159. New York: John Wiley.

Ernst, L. R., Guciardo, C. J., Ponikowski, C. H., and Tehonica, J. (2002). Sample Allocation and Selection for the National Compensation Survey. 2002 Proceedings of the American Statistical Association, Section on Survey Research Methods, [CD-ROM], Alexandria, VA: American Statistical Association.

Ernst, L. R., Guciardo, C. J., and Izsak, Y. (2004). Evaluation of Unique Aspects of the Sample Design for the National Compensation Survey. 2004 Proceedings of the American Statistical Association, Section on Survey Research Methods, [CD-ROM], Alexandria, VA: American Statistical Association.

Ernst, L. R., Izsak, Y., and Paben, S. P. (2004). Use of Overlap Maximization in the Redesign of the National Compensation Survey. 2004 Proceedings of the American Statistical Association, Section on Survey Research Methods, [CD-ROM], Alexandria, VA: American Statistical Association.

Izsak, Y., Ernst, L. R., Paben, S. P., Ponikowski, C. H., and Tehonica, J. (2003). Redesign of the National Compensation Survey. 2003 Proceedings of the American Statistical Association, Section on Survey Research Methods, [CD-ROM], Alexandria, VA: American Statistical Association.

Izsak, Y., Ernst, L. R., McNulty, E., Paben, S. P., Ponikowski, C. H., Springer, G., and Tehonica, J. (2005). Update on the Redesign of the National Compensation Survey. 2005 Proceedings of the American Statistical Association, Section on Survey Research Methods, [CD-ROM], Alexandria, VA: American Statistical Association.

Any opinions expressed in this paper are those of the authors and do not constitute policy of the Bureau of Labor Statistics.

Figure 1. Tabular Arrays for Illustrative Example
[Arrays S, A, A_1 through A_7 and N_1 through N_7; numeric entries not recoverable from the extracted text.]

Figure 2. Initial Tabular Arrays for Modified Illustrative Example
[Arrays S and A; numeric entries not recoverable from the extracted text.]