Survey Process with Significance Editing: Foundations, Perspectives, and Plans for Development

Joseph S. Kosler, Ph.D.
National Agricultural Statistics Service, United States Department of Agriculture, 1400 Independence Avenue SW, Washington, D.C., USA 20250

Abstract

There has been a change in business model at the National Agricultural Statistics Service (NASS) of the United States Department of Agriculture (USDA), precipitated in part by the global economic crisis of 2008. NASS has created a research prototype for significance editing (Johanson, 2012) which is being developed into a full-scale production system. The NASS change in business model included a sweeping change of the information technology systems used for processing establishment reports (Nealon and Gleaton, 2012). These circumstances provided a foundation for exploring Fellegi-Holt methodology (Fellegi and Holt, 1976) and designing a fully automated system for significance editing. Incorporation of the significance editing system is contingent upon performance, to be assessed through a series of post-production tests begun September 2012 with the quarterly Hogs and Pigs Survey. If successful, the new system for significance editing, called SignEdit, would combine with existing NASS production systems to form automated process support for participating surveys. Cost savings attributable to the SignEdit System could not be estimated at this time. The system is expected to improve the consistency of editing and imputation across establishment reports. Also, the system is expected to alert agricultural statisticians to any imputed value that might have an impact of practical significance on a state-level indication. The focus on manual review for high-impact reports and automation for low-impact reports is expected to maintain the quality and caliber of NASS estimates while providing a substantial reduction in the time needed for processing.
The recent changes to NASS production systems (Nealon and Gleaton, 2012) enable process control for surveys in the sense that it has become possible to automate the capture of process data (i.e. paradata) as well as the reaction to process data (e.g. adaptive design) for the purposes of controlling the grade (i.e. caliber), quality (e.g. compliance with publication standards), and financial cost of survey estimates. The term survey process control refers to the combined use of methods from survey production, survey management, industrial quality control, and statistical science to enact process control for sample surveys.

Key Words: significance edit, selective edit, process control, automated imputation

1. The SignEdit System for Significance Editing

This paper was submitted to share enthusiasm about a system for significance editing that has been approved for development at the National Agricultural Statistics Service (NASS) of the United States Department of Agriculture (USDA). The system for significance editing is called the SignEdit System and was created within the Editing and Imputation Research Section (EIRS) of the Research and Development Division (RDD). Dale Atkinson and Asa Manning (Manning and Atkinson, 2009) described the EIRS
effort to create an automated edit and imputation system in the context of the NASS RDD research program at the Conference of European Statisticians hosted by the United Nations Statistical Commission, and specified expectations that the system should run at the enterprise level in parallel with data collection.

NASS investigated use of the Banff System, commercialized by Statistics Canada, beginning in 2009. The primary concern with the Banff System was that it was intended for use after completion of the reporting cycle. By design, the Banff System expected the national sample at the time of automated editing and imputation. The SignEdit System was designed to run in parallel with data collection to meet requirements of timeliness for NASS publications.

The SignEdit System performs the tasks of detection and correction of errors in farm reports with a NASS custom program (SAS 9.2) for editing and imputation. The program relies on SAS Macro to capture best practices of the agricultural statistician. The program relies on Banff System procedures for error localization, donor imputation, mean imputation, and imputation by substitution of previously reported data (PRD). By design, the SignEdit System includes a viewer (not shown due to size) to assist the agricultural statistician with manual review, a dedicated database for interaction with enterprise production systems, and the custom program. The custom program was modularized and generalized to enable addition of new surveys to the system, and to meet requirements of the NASS centralized computing environment (Nealon and Gleaton, 2012). The research prototype was originally coded by James Johanson, who has substantial experience as an agricultural statistician performing interactive edits for participating surveys; preliminary results were presented at ICES IV in Montreal (Johanson, 2012).
A diagram of the proposed system is provided as Figure 1, where the custom program is named the SignEdit Processor. The system was named the SignEdit System for development purposes. Literature review indicates the terms significance edit and selective edit are used somewhat interchangeably. Both terms refer to a procedure which can be used for transforming manual imputation with interactive editing into an automated procedure which focuses resources on high-impact reports. There is abundant literature on selective editing and significance editing describing attempts to apply the methodology. Selective and/or significance editing procedures have been used successfully at Statistics Sweden (Lawrence and McKenzie, 2000), Statistics New Zealand (Seyb et al., 2009), the Australian Bureau of Statistics (Farwell, 2006), and Statistics Netherlands (de Waal et al., 2009).

Significance editing for NASS includes four steps which perform the tasks of automated edit and imputation, selective editing, and outlier detection. The term significance editing was adopted to distinguish selective editing as a core function of the system. The four steps are described in Sections 1.1 through 1.4.

1.1 Automated Edit

The SignEdit System for significance editing includes tools for detection and correction of errors. The correction of errors is performed through automated imputation tools (Section 1.2). The detection of errors is performed using error localization (de Waal, 2009) as prescribed for minimizing the number of errors to be corrected (Fellegi and Holt, 1976). For performing the task of error localization, NASS chose the ERRORLOC procedure bundled with Banff System application software as commercialized by Statistics Canada. The SignEdit Processor (Figure 1) was founded on Banff's Proc ERRORLOC. Existing edits are supplied as a system of linear equations (e.g. X1 < b and X1 + X2 = c, where X1 and X2 are survey variables).
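The minimum-change principle behind Proc ERRORLOC can be illustrated with a small sketch (hypothetical Python, not NASS or Banff code, and a brute-force search over candidate values rather than the linear-programming treatment of the edit system): error localization finds a smallest set of fields whose revision can satisfy every edit.

```python
from itertools import combinations, product

def passes_edits(record, edits):
    """True when the record satisfies every edit rule."""
    return all(edit(record) for edit in edits)

def can_satisfy(record, edits, subset, domains):
    """True when changing only the fields in `subset` can satisfy all edits."""
    for values in product(*(domains[f] for f in subset)):
        trial = dict(record)
        trial.update(zip(subset, values))
        if passes_edits(trial, edits):
            return True
    return False

def error_localize(record, edits, domains):
    """Fellegi-Holt minimum-change search: smallest field set to revise.

    Brute force over subsets for illustration only; `domains` maps each
    field to hypothetical candidate values.
    """
    fields = list(record)
    for size in range(len(fields) + 1):
        for subset in combinations(fields, size):
            if can_satisfy(record, edits, subset, domains):
                return set(subset)
    return set(fields)

# Edits from the text, with illustrative constants b = 50 and c = 60:
edits = [lambda r: r["X1"] < 50,             # X1 < b
         lambda r: r["X1"] + r["X2"] == 60]  # X1 + X2 = c
record = {"X1": 70, "X2": 20}                # violates both edits
domains = {"X1": range(50), "X2": range(100)}
print(error_localize(record, edits, domains))  # {'X1'}: revising X1 alone suffices
```
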
[Figure 1 near here: a process-flow diagram tracing farm reports from the collection modes (paper keyed from image via OCR, call center CATI with Blaise, self-report/CAPI by personal computer, and tablet field enumeration) through office handling, databases, the Metadata Repository, Edit Repository, and Question Repository, the Work in Progress database, and the SignEdit Processor, SignEdit Database, and SignEdit Viewer, to the data warehouse, analysis, summary, indications, Board estimation, and published estimates.]

Figure 1: Process Flow for Farm Reports through an Integrated Significance Editing System

Notes 1) through 7) are listed below:

1) OCR indicates paper forms scanned electronically using optical character recognition technology.

2) Blaise 5.0 (Statistics Netherlands) supports computer assisted telephone interviewing (CATI) through the NASS National Operations Center (NOC). Blaise 5.0 also supports
interactive edit and imputation by detecting errors and displaying warnings for each establishment report. Agricultural statisticians use the critical error messages and warnings generated by Blaise 5.0 during manual review of establishment reports.

3) Computer assisted personal interviewing (CAPI) is supported through internet browser questionnaires and data retrieval technology (Nealon and Gleaton, 2012). Office handling indicates special handling of a farm report through a NASS field office.

4) NASS field enumerators are equipped with iPad tablets to assist data collection. For security, a unique system was developed to enable data entry from remote locations without storing data on the device itself (Kleweno and Hird, 2012).

5) Database technology and innovative design are at the heart of the NASS centralized computing environment (Nealon and Gleaton, 2012). Database repositories for metadata, edit logic, and questionnaires are integrated with database support for other NASS production systems such as the SignEdit System (Manning and Atkinson, 2009).

6) WIP indicates a work-in-progress database which was created to provide access to survey data for analytic purposes (Nealon and Gleaton, 2012). Production groups throughout NASS can access the WIP database (non-transactional) throughout a reporting cycle without accessing production databases (transactional) directly. WIP is designed to hold up to 5 years of previously reported data (PRD), and a survey is initialized with 2 years of PRD.

7) The SignEdit System for significance editing includes a custom program (SAS 9.2), a database (MySQL), and a viewer (not shown due to size). Research prototypes for the program and viewer (SAS/AF) are completed. Development prototypes for the program and viewer (.NET) are in progress. A production beta for the database has been approved and the logical design was completed; further work is needed for a physical design and build of the database and support services.
Institutional knowledge of the agricultural statistician and procedures established by the survey administrator were captured and coded into one survey instrument as instructions. Statistical methodology and survey methodology (e.g. Fellegi-Holt methodology and selective edit methodology) for fundamental processing of farm reports were captured and coded into one significance editor, a core program which was designed to receive instructions from the survey instrument. While there is one significance editor for every major group of surveys (e.g. livestock surveys), there is a separate survey instrument for each participating survey (e.g. the quarterly hogs and pigs survey and the annual hogs and pigs survey). The SignEdit Processor (Figure 1) comprises significance editors and survey instruments as generalized and modularized computer program code. Two significance editors are planned for production use: one for livestock surveys, and one for quarterly agricultural surveys. Administrative changes to edits and the list of questions asked (i.e. variables to be processed) are managed centrally through transactional databases. Changes to edit logic may be managed as changes to SignEdit System settings managed centrally through the SignEdit Database (Figure 1).

The task of outlier detection was coded as range edits on Hidiroglou-Berthelot (H-B) effects (Hidiroglou and Berthelot, 1986). The H-B effects were formulated for an
individual farm as an H-B score amplified by the maximum reported value (current quarter versus previous quarter): H-B effect = score x maximum. The H-B effects for the reported Number of Producers from the quarterly Hogs and Pigs Survey were plotted against limits as shown in Figure 2. An effect falling outside the charted limits was regarded as an outlier and labeled with a farm code, with the interpretation that a sizable quarter-to-quarter change was recorded for that farm, and that farm is potentially large enough to impact the final indication: the H-B score was relatively large, and the farm was relatively large. The limits were set for each survey variable on the basis of process history (four quarters available for initial research, Figure 2). Wider limits were used for outlier detection intended to restrict the donor pool without manual intervention (e.g. the 0.25th and 99.75th percentiles). Narrower limits were used to identify potential outliers which would require a manual review (e.g. the 1st and 99th percentiles). Two concerns governed decision-making for the established limits: 1) the risk of eliminating a valid donor from the donor pool was minimized, supporting the need to gather sufficient depth for the donor pool for successful imputation in early hours or early days of hourly batch runs; and 2) the volume of potential outliers to be reviewed manually was minimized, as the intention of the system is to reduce workload, and as an interactive data analysis system (IDAS) is available for detecting outliers later in the survey production process (Manning and Atkinson, 2009). Planned work for development testing of the production beta system includes assessment of whether or not outlier detection using H-B effects succeeds in identifying outliers that could not be detected with the IDAS system.

Figure 2: Farm Reports Identified as Outliers Using H-B Effects Plotted against Limits
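Read one way, the H-B effect and its percentile limits can be sketched as follows (illustrative Python only; the production computation is in SAS, and the tuning constants of Hidiroglou and Berthelot (1986) are omitted from this sketch):

```python
import statistics

def hb_effects(pairs):
    """H-B effects for (previous, current) value pairs of one variable.

    Sketch of the formulation in the text: a centered ratio score
    amplified by the maximum reported value, effect = score x maximum.
    """
    ratios = [cur / prev for prev, cur in pairs]
    med = statistics.median(ratios)
    effects = []
    for (prev, cur), r in zip(pairs, ratios):
        # Symmetric score: positive for increases, negative for decreases.
        score = 1 - med / r if r >= med else r / med - 1
        effects.append(score * max(prev, cur))
    return effects

def outside_limits(effects, lo_pct, hi_pct):
    """Flag effects outside empirical percentile limits (e.g. 1 and 99)."""
    srt = sorted(effects)
    n = len(srt)
    lo = srt[int(n * lo_pct / 100)]
    hi = srt[min(n - 1, int(n * hi_pct / 100))]
    return [e < lo or e > hi for e in effects]

# Five farms: previous-quarter vs current-quarter reported values.
pairs = [(100, 105), (200, 190), (50, 52), (100, 300), (80, 82)]
effects = hb_effects(pairs)
# The (100, 300) farm dominates: a large ratio change on a large farm.
```

With only a few farms the empirical percentile limits are degenerate; in production the limits were set per variable from four quarters of process history.
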
1.2 Automated Imputation

The SignEdit System for significance editing includes tools for detection and correction of errors. The correction of errors is performed through automated imputation tools bundled with Banff System application software as commercialized by Statistics Canada. NASS chose to use the Banff System's DONORIMPUTATION, PRORATE, and ESTIMATOR procedures to perform donor imputation, mean imputation, and imputation by substitution of previously reported data (PRD).

The order of the imputation procedures was chosen to meet production needs specific to the survey and does not match recommendations of Statistics Canada. For the livestock significance editor, the order of imputation procedures was coded as donor imputation first, mean imputation second, and PRD imputation third. Statistics Canada recommends beginning with a model-based imputation and using donor imputation last. The difference between NASS usage of the imputation procedures and Statistics Canada's general recommendation is partially explained by administrative circumstances: NASS is one of many data collection agencies in a decentralized federal statistics environment, list frames are built and managed without registry data, administrative data and survey data collected by other agencies that might serve as covariates for a model-based imputation are not readily available to NASS for research and development for all surveys, and efforts to minimize response burden for American citizens inhibit duplicate collection of information across agencies. Due to seasonal effects and well-known long-term cycles in livestock production, imputation of a farm's report based on a pool of donors from the current reporting cycle was preferred to direct substitution of data from a previous report from that farm. Preliminary results using a research prototype of the SignEdit Processor (Figure 1), and data from the quarterly Hogs and Pigs Survey, were presented at ICES IV in Montreal (Johanson, 2012).
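A minimal sketch of that coded order follows (hypothetical Python data structures standing in for the Banff DONORIMPUTATION, ESTIMATOR, and PRORATE machinery; all names are illustrative): try a donor from the current cycle first, then a current-cycle mean, then the farm's PRD.

```python
def impute_item(item, reported, donor_pool, cycle_means, prd):
    """Return (value, method) for one item on one farm report.

    Order coded for the livestock significance editor: donor first,
    mean second, PRD substitution third.  `donor_pool` maps item ->
    values from similar responding farms in the current cycle,
    `cycle_means` maps item -> current-cycle mean, and `prd` maps
    item -> the farm's previously reported value.
    """
    if reported is not None:
        return reported, "reported"          # no imputation needed
    if donor_pool.get(item):
        return donor_pool[item][0], "donor"  # donor from current cycle
    if item in cycle_means:
        return cycle_means[item], "mean"     # current-cycle mean
    return prd.get(item), "prd"              # fall back to the farm's PRD

donor_pool = {"market_hogs": [450]}
cycle_means = {"market_hogs": 500, "breeding_hogs": 60}
prd = {"market_hogs": 420, "breeding_hogs": 55, "pig_crop": 210}
print(impute_item("market_hogs", None, donor_pool, cycle_means, prd))    # (450, 'donor')
print(impute_item("breeding_hogs", None, donor_pool, cycle_means, prd))  # (60, 'mean')
print(impute_item("pig_crop", None, donor_pool, cycle_means, prd))       # (210, 'prd')
```
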
Changes to farm reports made by Banff imputation procedures were assessed on the basis of closeness of agreement with manual changes made by agricultural statisticians. Planned work for development testing of the production beta system includes assessment of hourly runs concurrent with data collection, in terms of both donor pool management and the quality of donor-imputed values.

1.3 Selective Edit

The SignEdit System for significance editing includes tools for selective editing. The implementation plan is focused on providing support to the agricultural statisticians in field offices who currently perform 100% inspections of farm reports. After being processed through tools for automated detection and correction of errors, each farm report in the critical stream receives a unit score for purposes of performing a selective edit. It is important to recognize that not all farm reports are subjected to the selective edit. Blank reports (i.e. unit nonresponse) and farm reports requiring special administrative handling, for instance, are not included in the critical stream for significance editing. To reduce the costs associated with manual 100% inspection of farm reports, the farm reports may be sorted in priority order according to potential impact on final indications. In this way, manual inspection (or manual review) can be limited to high-impact farm reports and farm reports identified as potential outliers. The portion of farm reports that may be excluded from manual review due to selective editing cannot be assessed at this time.

The priority order of farm reports by potential impact on final indications is achieved through computation of an item score for each response on a farm's report. The item score is formulated as the magnitude of a change made through automated imputation
relative to the final indication for that item: item score = (final weight x |imputed value - reported value|) / (final indication). The final weight and final indication were taken from the previous quarter for purposes of processing during data collection. The sort order of the farm reports was vital to selective editing; however, the value of a farm report's unit score was not meaningful on its own. It was speculated that no analytic loss of practical significance would be generated by substitution of previous-quarter weights and indications for current-quarter weights and indications in the computation of the item score for a given farm. Planned work for development testing of the production beta system includes assessment of the sort order of the farm reports, but no example of analytic loss due to use of previous-quarter weights and indications has surfaced in research testing.

The priority order of farm reports by potential impact on final indications is based on a unit score. The SignEdit System is designed to use a unit score calculated as the maximum item score across a farm's report. This unit score is easily interpreted by the agricultural statistician in terms of the survey question (i.e. variable or item) that had the largest change due to automated imputation. The SignEdit Viewer (Figure 1) provides a graphical display of the farm's unit score relative to other farms, and provides a color-coded spreadsheet listing of automated changes that led to a high unit score. On the basis of original reported data and automated imputed values, the agricultural statistician can perform a comparison. Agricultural statisticians have a desktop interface to centralized resources for looking at other information that might support a determination of action on the farm's report. The automated changes may be retained or replaced with values determined by the agricultural statistician.
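The item score and unit score reduce to a few lines (an illustrative Python sketch with hypothetical item names; the SignEdit Processor performs this computation in SAS using previous-quarter weights and indications):

```python
def item_score(final_weight, reported, imputed, final_indication):
    """Item score from the text: the weighted magnitude of the automated
    change relative to the (previous-quarter) final indication."""
    return final_weight * abs(imputed - reported) / final_indication

def unit_score(report, weights, indications):
    """Unit score: the maximum item score across a farm's report.

    `report` maps item -> (reported, imputed); the maximizing item tells
    the statistician which question drove the score.
    """
    return max(
        item_score(weights[item], reported, imputed, indications[item])
        for item, (reported, imputed) in report.items()
    )

weights = {"market_hogs": 10.0, "breeding_hogs": 10.0}
indications = {"market_hogs": 10000.0, "breeding_hogs": 2000.0}
report = {"market_hogs": (100, 120), "breeding_hogs": (50, 50)}
print(unit_score(report, weights, indications))  # 0.02, driven by market_hogs
```
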
This manual review and determination of action on the farm's report are performed dynamically through support of transactional databases and the Blaise 5.0 (Statistics Netherlands) system for interactive editing. The establishment of a threshold for selective edit manual review would resolve the agricultural statistician's question of what top portion of farm reports (in priority order) require manual review to ensure that automated imputation has generated no practical impact on indications. Unit scores calculated as the sum of item scores, or the geometric mean of item scores, may be utilized for establishing thresholds due to their statistical properties. In particular, the maximum item score has a large variance relative to the sum or geometric mean of the item scores. Planned work for development testing of the production beta system includes assessment of selective edit thresholds for production use; however, thresholds are not a requirement of the system, as the priority sort order may suffice to focus manual editing on high-impact farm reports.

1.4 Quality

The SignEdit System for significance editing includes a quality component by design to provide capability for automated capture and display of metadata (about the data) and paradata (about the survey process). Metadata and paradata are produced at the record level (farm report), item level (individual responses), and run level (farm reports batched for hourly runs). Paradata may be produced as aggregates across farm reports (e.g. a table of counts of imputed items by imputation type for each survey question). Paradata may also be produced as aggregates across items (e.g. a record-level count of imputed values). When using the Banff System, or SAS procedures bundled with the Banff System (Statistics Canada), metadata are captured by a log file. This log file may be parsed to automate use of its contents.
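A toy sketch of such run-level paradata aggregates (hypothetical Python names; in production the counts are computed in SAS 9.2 for storage in the SignEdit Database):

```python
from collections import Counter

def run_level_counts(imputation_flags):
    """Sufficient counts for run-level imputation rates.

    `imputation_flags` maps (farm, item) -> imputation type ('donor',
    'mean', 'prd'), or None when the value was reported.  Storing only
    the numerator and denominator counts keeps the stored paradata
    small while still supporting any rate needed for monitoring.
    """
    by_type = Counter(t for t in imputation_flags.values() if t is not None)
    return {"by_type": dict(by_type),
            "items_imputed": sum(by_type.values()),
            "items_total": len(imputation_flags)}

flags = {("farm1", "market_hogs"): "donor",
         ("farm1", "breeding_hogs"): None,
         ("farm2", "market_hogs"): "mean",
         ("farm2", "breeding_hogs"): "donor"}
counts = run_level_counts(flags)
print(counts["by_type"])                               # {'donor': 2, 'mean': 1}
print(counts["items_imputed"], counts["items_total"])  # 3 4
```
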
However, NASS chose to compute metadata and paradata as needed within the SAS System, and to store those metadata and paradata in a transactional database. The SignEdit Database (Figure 1) is a production database dedicated to significance editing. The centralized design of the NASS processing
environment (Nealon and Gleaton, 2012) allows the SignEdit Database to be integrated with other production databases. Survey setup paradata (e.g. reporting cycle dates, internal system parameters and codes, variable names) may be obtained electronically from the Metadata Repository System (MRS) and Question Repository System (QRS) when the survey reporting cycle is initialized. For efficiency of database interaction, the volume (or density) of the metadata and paradata stored in the SignEdit System was reduced by sufficiency (referring to the statistical properties of sufficient statistics) to counts that may be needed for system monitoring or analysis of automated editing and imputation. For example, to be able to produce imputation rates by imputation type, the numerator and denominator counts for each rate (e.g. the number of items imputed by donor imputation and the number of items imputed altogether) were computed using SAS 9.2 to be stored in the SignEdit Database. By design, the counts are stored at the run level.

The SignEdit Database is in the design stage. A logical design was completed as stated above. A physical design is underway as a collaborative effort with the information technology division (ITD). The build for the database is planned using MySQL and following schemas proven successful in previously created databases for such systems (e.g. MRS, QRS, and a system for Generalized Imputation). NASS database design was described thoroughly in a paper written with NASS ITD and recently accepted for publication in the Journal of Official Statistics (Nealon and Gleaton, 2012). Additional discussion of NASS centralized computing systems, including details of quality management, cultural transformation (i.e. resistance to change), and modes of project deployment, was presented at the 2012 Conference of European Statisticians sponsored by the United Nations Economic Commission for Europe (Kleweno and Hird, 2012).

2.
Survey Process

There has been a change in business model at NASS precipitated in part by the global economic crisis of 2008. NASS has traditionally operated a network of field offices throughout the United States and Puerto Rico. Agricultural statisticians located in each field office have traditionally processed farm reports for manual imputation. Prior to centralization of the computing environment (Nealon and Gleaton, 2012), a 100% manual inspection of farm reports was conducted for each regular production survey using application software for automated interactive editing (e.g. Blaise and SAS). The 100% manual inspection, in combination with statistical methodology (e.g. IDAS outlier identification) and management methodology, constituted an environment of statistical quality control.

The change in business model has ushered in sweeping changes to the computing environment which enable the addition of automated reaction to process data. Adaptive design was recently implemented for the County Agricultural Production Survey (CAPS) to enable stopping rules at the county level. The stopping rules were enacted using the metadata repository system (MRS) among other existing production databases integrated into a centralized environment for processing farm reports (methodology not published; project discussed briefly in Kleweno and Hird, 2012). The recent changes to NASS production systems enable process control for surveys in the sense that it has become possible to automate the capture of process data (i.e. paradata) as well as the reaction to process data (e.g. adaptive design) for the purposes of controlling the grade (i.e. caliber), quality (e.g. compliance with publication standards), and financial cost of survey estimates. The term survey process control refers to the combined use of methods from survey production, survey management, industrial quality control, and statistical science to enact process control for sample surveys.
Figure 3 (below) suggests a model of the survey process whereby survey professionals in research and production
may enact the survey process with active support from automated production systems and process analytical technology for monitoring and adapting use of the production systems.

[Figure 3 near here: a diagram relating the Survey Process (Research, Development, Production) to Process Support (Automated Production Systems and Process Analytical Technology).]

Figure 3: Process Support for Research and Production of Sample Surveys

Notes: 1) This diagram was intended as a prospective model for purposes of discussion only. 2) Survey Process indicates activities of survey professionals. 3) Process Support indicates tools designed to support survey professionals for conducting a survey.

2.1 Process Analytical Technology

Process Analytical Technology (PAT) is a term borrowed from the pharmaceutical industry to indicate the combination of a system for automation, the dashboard display for monitoring that system, and the mechanisms for computation and communication between the system and its dashboard. NASS is currently committed to development of dashboards for monitoring the survey process throughout data collection. Data collection is being migrated from NASS field offices to a centralized call center in the National Operations Center (NOC) in St. Louis, MO. Call center activity at the NOC involves the Blaise 5.0 (Statistics Netherlands) system for data collection and interactive editing. Dashboard development is reliant upon sophisticated communications across integrated databases to enact real-time monitoring of NOC call center activity. The combination of tools for measurement (survey questionnaire, CATI), data processing, database communications, and dashboard display for process monitoring constitutes an example of process analytical technology.
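As a toy illustration of the dashboard side of PAT (hypothetical names and data; NASS dashboards draw on integrated production databases rather than in-process Python), a monitoring statistic such as a response rate by stratum can be accrued from call-record paradata:

```python
from collections import defaultdict

def response_rates_by_stratum(call_records):
    """Accrue a dashboard statistic from call-center paradata.

    `call_records` is an iterable of (stratum, completed) pairs, one per
    sampled unit contacted so far; returns completed/attempted per stratum.
    """
    attempted = defaultdict(int)
    completed = defaultdict(int)
    for stratum, done in call_records:
        attempted[stratum] += 1
        completed[stratum] += int(done)
    return {s: completed[s] / attempted[s] for s in attempted}

# Illustrative paradata: two strata, four contact attempts so far.
records = [("10", True), ("10", False), ("20", True), ("20", True)]
print(response_rates_by_stratum(records))  # {'10': 0.5, '20': 1.0}
```
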
Planning for the future at NASS, we designed the quality component of the SignEdit System to pair SAS computation with dynamic transactional database capabilities, to prepare for creation of tools for monitoring the survey process, informing the survey process, and reacting to process data (paradata) in real time. Prospectively, in-memory computing can be utilized to access the SignEdit Database (Figure 1), collect counts needed for calculated statistics to be displayed (e.g. response rates by stratum accruing hourly; further discussion in Section 1.4), perform the computation in memory with an efficient programming language such as C, and then perform data transfer for only the statistics required for the display. An example of in-memory computing for reaction to process data at NASS is database and display support for adaptive design for the County Agricultural Production Survey (CAPS) in 2011.

2.2 Process as a Quality Paradigm

The change in business model at NASS has enabled a transition from a quality program based on manual inspection with statistical quality control to a quality program based on principles of process control (Table 1). Historically, quality programs have evolved with technology and science as shown in Table 1, which provides a timeline including some primary contributors to the evolution of the quality paradigm and some primary resources used by quality professionals. Examples of manual inspection include manual review of a judgment sample selected by principles of selective editing, and manual inspection of an audit subsample of recorded interviews (e.g. computer assisted recorded interviews, or CARI) for purposes of quality assurance. The use of acceptance sampling for quality assurance (Schilling and Neubauer, 2009) was combined with statistical methods, management methods, and the technical expertise of production groups (e.g. survey process professionals and agricultural statisticians) to form the paradigm of quality program known as statistical quality control, or quality control, as early as the 1920s. Advances in computer science (e.g. IBM and Digital, Inc.) enabled automated capture of process data (paradata) and automated reaction to process data (paradata) to be common industry practices as early as the 1960s. The centralized computing environment at NASS (Nealon and Gleaton, 2012) has the flexibility to support these three quality paradigms for the survey process: manual inspection, quality control, and process control. The SignEdit System was designed to support an environment of process control for the survey process for each participating survey, or survey process control.
Table 1: One Historic Perspective on Process Control

Circa 1920s: Manual Inspection
- Comprises: 100% Inspection; Audit Subsample; Judgment Sample
- Primary Contributors: Walter Shewhart (Bell Labs/AT&T)
- Foundational Resources: Statistical Method from the Viewpoint of Quality Control, Walter Shewhart (1939); Acceptance Sampling in Quality Control, Schilling & Neubauer (2009)

Circa 1960s: Quality Control
- Primary Contributors: George Box; Deming; IBM, Inc.; Digital, Inc.
- Foundational Resources: Quality Control Handbook, Joseph Juran (1951); Statistical Quality Control Handbook, Western Electric (1956)

Circa 1980s: Process Control / Six Sigma
- Primary Contributors: Motorola; General Electric
- Combines with: TQM; Process Capability; Taguchi Method
- Foundational Resources: Statistical Control by Monitoring and Feedback Adjustment, Box and Luceno (1997)

Circa 2000s: Quality by Design
- Primary Contributors: Joseph Juran (1985); CMMI (2006); ICH (online)
- Combines with: Lean Production; GMP Facility; ICH Q8: Design Space vs. One-at-a-Time Attributes; ICH Q9: Quality Risk Management; ICH Q10: Change and Lifecycle Management

Abbreviations: CMMI = Capability Maturity Model Integration; GMP = Good Manufacturing Practices; IBM = International Business Machines, Inc.; ICH = International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use; TQM = Total Quality Management
Acknowledgements

The significance editing project traces back to efforts by a number of individuals in NASS RDD headed by Dale Atkinson. Since 2009, Kay Turner has led the research and development team for this project as the EIRS section head. Team contributors have been Matthew Gregg, Jason Bell, James Johanson, and myself. We are particularly grateful to Michael Hogye for offering his consistent guidance and technical expertise. Wendy Barboza has been actively involved with the project as Branch Chief for the Statistical Methodology Research Branch (SMRB) since Dale Atkinson's retirement. The SignEdit System was designed to provide support to many production groups throughout the agency, and requires substantial computing efforts through collaboration with NASS ITD (information technology). Our research director, Mark Harris, and agency administrator, Cynthia Clark, have offered consistent encouragement and fostered collaboration between key contributors across agency divisions. The SignEdit System requires integration with several production systems and synchronization with numerous production databases. Exceptional information technology support and a sophisticated centralized computing environment have made it possible to advance this project to development of a production beta.

References

Chrissis, M.B., Konrad, M., and Shrum, S. (2006). CMMI: Guidelines for Process Integration and Product Improvement. Second Edition. SEI Series in Software Engineering. Addison-Wesley, Pearson Education, Inc.

de Waal, T., Pannekoek, J., and Scholtus, S. (2009). Handbook of Statistical Data Editing and Imputation. Wiley Handbooks in Survey Methodology. John Wiley & Sons, Inc.

Farwell, K. (2006). The General Application of Significance Editing to Economic Collections. Research Paper, ABS Catalogue No. 1352.0.55.066. Methodology Advisory Committee, Statistical Services Branch, Hobart, Australian Bureau of Statistics. Canberra, 26 November 2004; ABS, 16 February 2006.
Fellegi, I., and Holt, D. (1976). A systematic approach to automatic edit and imputation. Journal of the American Statistical Association, Vol. 71, pp. 17-35.

Hidiroglou, M., and Berthelot, J. (1986). Statistical Editing and Imputation for Periodic Business Surveys. Survey Methodology, Vol. 12, No. 1, pp. 73-83.

Johanson, J. M. (2012). Banff Automated Edit and Imputation on a Hog Survey. Proceedings of the Fourth International Conference on Establishment Surveys, June 11-14, 2012, Montréal, Canada [CD-ROM]: American Statistical Association.

Juran, J. (1985). Juran on Quality by Design. The Free Press, New York, NY.

Kleweno, D., and Hird, P. (2012). New Solutions, Challenges, and Opportunities: CAPI the NASS Way. Conference of European Statisticians, October 31-November 2, 2012, Geneva, Switzerland. Available in November 2012 at http://www.nass.usda.gov/research_and_science/technology/CAPI%20the%20NASS%20Way_New%20Solutions%20Challenges%20and%20Opportunities.pdf.

Lawrence, D., and McKenzie, R. (2000). The General Application of Significance Editing. Journal of Official Statistics, Vol. 16, No. 3, pp. 243-253.

Manning, A., and Atkinson, D. (2009). Toward a Comprehensive Editing and Imputation Structure for NASS: Integrating the Parts. USDA NASS RDD. United Nations Statistical Commission and Economic Commission for Europe, Conference of
European Statisticians, Work Session on Statistical Data Editing. Neuchatel, Switzerland, 5-7 October 2009.

Nealon, J., and Gleaton, E. (2012). Consolidation and Standardization of Survey Operations at a Decentralized Federal Statistical Agency. Journal of Official Statistics. Accepted for publication in 2012, publication pending.

Schilling, E. G., and Neubauer, D. V. (2009). Acceptance Sampling in Quality Control. Second Edition. CRC Press, Chapman and Hall.

Seyb, A., Stewart, J., Chiang, G., Tinkler, I., Kupferman, L., Cox, V., and Allen, D. (2009). Automated Editing and Imputation System for Administrative Financial Data in New Zealand. United Nations Statistical Commission and Economic Commission for Europe, Conference of European Statisticians, Work Session on Statistical Data Editing, Working Paper No. 5. Neuchatel, Switzerland, 5-7 October 2009.