TOMASZ KLIMANEK
USING INDIRECT ESTIMATION WITH SPATIAL AUTOCORRELATION IN SOCIAL SURVEYS IN POLAND¹




PRZEGLĄD STATYSTYCZNY, SPECIAL ISSUE 1, 2012

1. BACKGROUND

First attempts at applying various approaches to parameter estimation for small areas in Poland were undertaken after the International Conference on Small Area Statistics held in Warsaw in 1992 (eds. Kalton, Kordos, Platek, 1993). There were only a few applications of small area estimation (SAE) methods to measure the scope of unemployment, poverty and household structure, and in agriculture-related surveys (Kordos, Paradysz, 2000). Further applications and an examination of the properties of standard indirect estimators were undertaken within the EURAREA² project after the IASS Satellite Conference held in Riga in 1999.

The first important study to apply SAE methodology in the Labour Force Survey (LFS) was conducted in 2003. The authors (Bracha, Lednicki and Wieczorkowski, 2003) estimated totals and rates of several labour market characteristics by region, subregion and poviat (NUTS2, NUTS3 and NUTS4). They used direct, synthetic and composite estimators.

The second important study was conducted in 2004 by E. Gołata. The study was intended to rely on the experience of the EURAREA project. The Polish database, the so-called super-population labelled POLDATA, was created on the basis of 3 data sources: the 1995 Micro-census, the 1995 Household Budget Survey and the Local Data Bank. POLDATA represented as closely as possible the characteristics of Poland in 1995 with respect to the new administrative division of the country, which was introduced in January 1999. For the purposes of applying the standard estimators, the proportion of ILO unemployed (in the whole population over 15) in an area was estimated (Gołata, 2004).

One should also mention the study conducted in 2004 by Kubacki. The parameter of interest in the study was unemployment size at the NUTS2 and NUTS4 levels. Registered unemployment constituted the additional data source used in the study (with covariates: the number of unemployed persons, the number of employed persons, the number of economically inactive persons, the number of dwellings and the number of persons aged 15 and above; Kubacki, 2004). Both design-based and model-based estimators were applied: design-based estimators including post-stratification methods (both ratio and regression estimators), synthetic estimators (both ratio and regression estimators), and model-based estimators including the empirical Bayes (EB) estimator and the hierarchical Bayes (HB) estimator.

Recent years have seen a growing interest in new possibilities and tools developed to meet the growing demand for estimates at the local level. Special projects were carried out in the Central Statistical Office (CSO) in cooperation with university researchers. The projects covered different subjects, for example the labour market, household structure and business statistics, including small business. In view of the Population Census 2011, which was in progress at the time, special research was undertaken within the Central Statistical Office and the newly established Centre for Small Area Statistics in Poznań. It was aimed at examining administrative registers, their quality and their usefulness as sources of auxiliary variables in small domain estimation. However, the application of SAE methods in official statistics in Poland is still not part of normal practice.

¹ The results presented in the paper are the outputs of Work Package 5 (Case studies) of the EUROSTAT project ESSnet on SAE 61001.2009.003-2009.859 (2009-2012).

² The EURAREA project no. IST-2000-26290, entitled Enhancing Small Area Estimation Techniques to meet European needs, is part of the 5th framework programme for research, technological development and demonstration of the EU. Its main co-ordinator is the ONS (Office for National Statistics), UK.

2. GENERAL SETUP

The aim of the Author was to continue explorations of estimating labour market characteristics for small domains. First, the data infrastructure referring to economic activity and unemployment is presented and discussed. The intention was to include all variable categories which experience has shown to be effective. In the case of ILO³ unemployment the standard variables are: age, sex, education, employment status and housing. Data from the following sources were used in the study: the Register of Unemployed, the Vital Statistics Register and the Tax Register. Tax Register data are used indirectly, via the Commuting Survey, which was based on data from the Tax Register; the first edition of the survey is available for 2006.
Taking into account special features of the labour market in Poland, especially its high territorial differentiation, various estimation techniques are analysed. Theoretical approaches to estimation with spatial effects proposed by R. Chambers & A. Saei (2003), together with the SAS software provided by the EURAREA project, are considered and adjusted to fit specific arrangements. The standard EURAREA estimators used in the study are: direct (for comparative reasons), the generalised regression estimator (GREG), synthetic and EBLUP. EBLUPGREG, which takes into account the spatial correlation structure, was also applied.

³ ILO unemployment: unemployment defined according to the International Labour Organization, which is comparable among European and non-European countries. According to this definition the unemployed in general comprise all persons above a specified age who during the reference period were: without work, that is, were not in paid employment or self-employment during the reference period; currently available for work, that is, were available for paid employment or self-employment during the reference period; and seeking work, that is, had taken specific steps in a specified recent period to seek paid employment or self-employment.

The structure of the economically active population and, especially, unemployment and its structure are of exceptional social interest in Poland. Since the very beginning of the 1990s unemployment has assumed alarming dimensions and has been characterised by great territorial differentiation at the national as well as the regional level. This characteristic is due to structural differences in the economy and regional inequalities shaped by distinct historical experiences as well as by the transformation process. Regularities observed at the national level cannot, in most cases, be generalised, and they differ from region to region. For example, in June 2011 the highest registered unemployment rate in Poland was observed in the Warmia-Masuria Province (19.5%) and the lowest in the Wielkopolska Province (9.1%); province (voivodship) refers to the NUTS2 level according to the Eurostat territorial division. This situation requires advanced studies reflecting regional specificities.

Data available from the Labour Force Survey enable estimation of the employed and the unemployed for the whole country by age, sex and place of residence (urban and rural areas). At the regional level (NUTS2), however, only aggregated data can be obtained from the LFS, differentiated by sex and place of residence (urban and rural areas), but not by age. The goal is then to estimate the percentage of unemployed people in the population aged 15 and older⁴ at the NUTS3 level, based on data from the Labour Force Survey for the 1st quarter of 2008. There had been previous attempts to estimate unemployment at the NUTS4 level, but they were not very satisfactory.

⁴ This is different from the unemployment rate, but such an assumption significantly simplifies the computations, especially as far as the MSE of the estimators is concerned.

3. DESCRIPTION OF THE SURVEY AND EMPLOYMENT-RELATED POPULATION FLOW STUDY

Labour Force Survey

The LFS methodology is based on the definitions of the economically active population, the employed and the unemployed adopted by the Thirteenth International Conference of Labour Statisticians (October 1982) and recommended by the International Labour Office. The survey concentrates on the economic activity of the population, i.e. the fact of being employed, unemployed or economically inactive in the reference week.

The Labour Force Survey is a probability sample survey. Sampling for the LFS follows a two-stage household sampling design. The primary sampling units, subject to first-stage selection, are census units called census clusters (CCs) in towns, while in rural areas they are enumeration districts (EDs)⁵. Second-stage sampling units are dwellings⁶. The primary sampling units (PSUs) are sampled with stratification. Strata correspond to provinces (voivodships). Strata within provinces were created depending on the size of a place; rural areas were included in the smallest ones.

The estimation process consists of defining the appropriate generalizing factors, referred to as weights. This is achieved in three steps. The first step provides primary weights, which are basically the reciprocals of the selection probabilities for ultimate sampling units (i.e. dwellings); this compensates for the disproportionate construction of the sample. The secondary weights are calculated in the next step by dividing primary weights by R, where the rate R depends on the category of the place of residence of a given dwelling (the rural area or one of the five town classes mentioned above). The secondary weights are also final for the results concerning households and families. Final weights for the results concerning the population are calculated in the third step. The purpose of this step is to adjust the LFS results to the current demographic estimates. This is done by calculating the so-called modifiers for each of 48 categories defined by place of residence (urban/rural), sex and 12 age groups. Final weights result from multiplying secondary weights by the appropriate modifiers.

The variances of complex estimators obtained in the LFS cannot be estimated with ordinary methods, and special, approximate procedures must be employed. Since 2003 one of the most popular approximate methods, based on resampling (the bootstrap), has been used for this purpose. A detailed description of the bootstrap procedure applied in variance estimation for LFS estimates in Poland is presented in Bracha et al. (2003).
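The three weighting steps described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CSO production procedure: the text does not give the formula for the modifiers, so the sketch assumes a simple ratio adjustment (demographic estimate divided by the sum of secondary weights in the cell), and all function names and toy figures are hypothetical.

```python
from collections import defaultdict


def primary_weight(selection_prob):
    """Step 1: reciprocal of the dwelling's selection probability."""
    return 1.0 / selection_prob


def secondary_weight(primary, rate):
    """Step 2: divide the primary weight by the rate R of the place-of-residence class."""
    return primary / rate


def cell_modifiers(secondary_weights, cells, demographic_totals):
    """Step 3 (ASSUMED form): ratio of the demographic estimate to the weighted
    sample total in each (place of residence, sex, age group) cell."""
    weighted = defaultdict(float)
    for w, cell in zip(secondary_weights, cells):
        weighted[cell] += w
    return {cell: demographic_totals[cell] / total for cell, total in weighted.items()}


def final_weights(selection_probs, rates, cells, demographic_totals):
    """Final weight = secondary weight times the modifier of the person's cell."""
    secondary = [
        secondary_weight(primary_weight(p), r)
        for p, r in zip(selection_probs, rates)
    ]
    modifiers = cell_modifiers(secondary, cells, demographic_totals)
    return [w * modifiers[c] for w, c in zip(secondary, cells)]


# Hypothetical toy data: four persons falling into two of the 48 cells.
probs = [0.01, 0.02, 0.01, 0.005]
rates = [0.8, 0.9, 0.8, 0.75]
cells = [
    ("urban", "M", "15-19"),
    ("urban", "M", "15-19"),
    ("rural", "F", "20-24"),
    ("rural", "F", "20-24"),
]
totals = {("urban", "M", "15-19"): 200.0, ("rural", "F", "20-24"): 500.0}

weights = final_weights(probs, rates, cells, totals)
```

Under the assumed ratio adjustment, the final weights in each cell sum to the demographic estimate for that cell, which is the property the third step is meant to achieve.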
Tax Register: general characteristics of an employment-related population flow study in 2006

The use of administrative registers in Polish public statistics is at an initial stage. The only larger study in which they played a supporting role was the 2011 National Census, which relied on information from various sources in order to collect certain data (reducing the burden on respondents), to update the sampling frame for sample surveys and to update the database of buildings and dwellings. Wider access to administrative databases provides an opportunity for Polish public statistics to develop methods of using them in statistical reporting and, consequently, to upgrade the statistical infrastructure.

⁵ In rural areas the application of smaller first-stage sampling units is useful for organizational reasons, but it negatively influences the precision of results for these areas. In order to improve this, the principle of the so-called overrepresentation of rural areas was applied, i.e. the number of dwellings sampled from rural areas is about 10% higher than the number resulting from the so-called proportional allocation (related to the number of dwellings in the whole population).

⁶ Sampling of primary sampling units and dwellings is conducted on the basis of the Domestic Territorial Division Register, which includes, among others, a list of territorial statistical units and a list of dwelling addresses within CCs and EDs.

A study of employment-related population flow was conducted in the Statistical Office in Poznań in 2009, on the basis of data from the tax system collected in the POLTAX database. The study was intended to provide estimates of the volume and directions of commuter traffic involving people in paid employment, using data as of 31 December 2006. The results and the methodological details of the study have been made partially available in the Regional Data Bank and in the book entitled Commuting in Poland, edited by K. Kruszka (Poznań, October 2010).

Registers, after verification and cleaning, turned out to be a good source of information about the structure of economic activity from the territorial perspective. Additionally, the data set included sex and age (a variable derived from the variable "birth date"), which enabled another breakdown. One disadvantage is their incompleteness. In addition, they do not cover all the characteristics reported in other studies (such as education, class of place of residence, etc., which are included in the LFS); this means that these data sets cannot fully replace the previously used measurement tools. Nevertheless, registers can play a supporting role and provide a good source of auxiliary variables in indirect estimation of labour market characteristics.

4. APPROACH TO THE PROBLEM

The problem was to estimate the percentage of unemployed people at a lower level of aggregation than that presented in the CSO's publications. Bearing in mind that data available from the Labour Force Survey enable direct estimates of the employed and the unemployed for the whole country by age, sex and place of residence, and for aggregated data at the regional level (NUTS2), it was decided to try to obtain small area estimates at the NUTS3 level. Another problem was to evaluate the results.
The applied software enables us to compute MSEs for the studied estimators, but there was a serious difficulty in validating the estimates against population values. It was decided that, despite small differences between the definitions of LFS unemployment and registered unemployment, the latter would be used as a kind of benchmark.

The natural choice was to use the EURAREA code. The variable "status on the labour market" was recoded into a binary variable taking the value 1 if the person was unemployed and 0 otherwise. In this way the target variable was created. As potential covariates in the model we chose: commuting to work, place of residence, sex and 6 age groups. All covariates were recoded into binary variables, producing nine variables:

X1 commuting to work (1 if a person commutes to work, 0 otherwise),
X2 place of residence (1 if a person lives in a rural area, 0 otherwise),
X3 sex (1 if a person is a male, 0 otherwise),
X4 age group (1 if a person is up to 20 years of age, 0 otherwise),
X5 age group (1 if a person is 20-24 years of age, 0 otherwise),
X6 age group (1 if a person is 25-34 years of age, 0 otherwise),
X7 age group (1 if a person is 35-44 years of age, 0 otherwise),
X8 age group (1 if a person is 45-54 years of age, 0 otherwise),
X9 age group (1 if a person is over 55 years of age, 0 otherwise).

Then we used stepwise selection to obtain the following model (two variables were excluded from the model: X6 was not significant and X9 was excluded because of collinearity):

Table 1. Model parameter estimates (domain level) after excluding X6 and X9

Variable    Parameter estimate   Standard error   t Value   Pr > |t|
Intercept   0.376                0.1851           1.1       0.317
X1          -0.6409              0.097            -.86      0.0058
X2          0.00630              0.0145           0.9       0.7700
X3          -0.503               0.45085          -1.11     0.699
X4          .4064                0.63456          3.79      0.0004
X5          -1.11075             0.53435          -.08      0.041
X7          -1.5566              0.3008           -4.17     0.0001
X8          1.00178              0.4339           4.1       0.0001

Its goodness of fit is as follows:

Table 2. Model goodness of fit (domain level) after excluding X6 and X9

Root MSE         0.0136     R-Square   0.7191
Dependent Mean   0.055      Adj R-Sq   0.685
Coeff Var        .37504

Although another two variables (X2 and X3) are not significant, we decided to include them in the model because, in our opinion, they are of special importance (place of residence and sex).

5. METHODS

The seven standard estimators were applied:
synthetic population-level estimator NSMEAN,
direct estimator,

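GREG with a standard linear regression model,
synthetic estimator considered under two different models: a) a linear two-level model with individual data, b) a linear model with area-level covariates and a pooled sample estimate of within-area variance,
EBLUP estimator using models: a) a linear two-level model with individual data, b) a linear model with area-level covariates and a pooled sample estimate of within-area variance.

The eighth method was also an EBLUP estimator, but based on the assumption of the existence of spatial autocorrelation: EBLUPGREG SPATIAL, a linear two-level model with individual data taking into account the spatial correlation structure.

Direct estimator

\[ \hat{\bar{Y}}_d^{\mathrm{DIRECT}} = \frac{1}{\hat{N}_d} \sum_{i \in u_d} w_i y_i, \tag{1} \]

where $\hat{N}_d = \sum_{i \in u_d} w_i$ and $w_i = 1/\pi_i$.

MSE estimator:

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d^{\mathrm{DIRECT}}\big) = \frac{1}{\hat{N}_d^{2}} \sum_{i \in u_d} w_i (w_i - 1)\big(y_i - \hat{\bar{Y}}_d^{\mathrm{DIRECT}}\big)^{2} \tag{2} \]

(assuming that $\pi_{ij} = \pi_i \pi_j$ for all $i \neq j$).

GREG estimator

\[ y_i = x_i^{T} \beta + \varepsilon_i, \qquad E(\varepsilon_i) = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma_\varepsilon^{2}, \tag{3} \]

\[ \hat{\bar{Y}}_d^{\mathrm{GREG}} = \frac{1}{\hat{N}_d} \sum_{i \in s_d} \frac{y_i}{\pi_i} + \Big( \bar{X}_{d\cdot} - \frac{1}{\hat{N}_d} \sum_{i \in s_d} \frac{x_i}{\pi_i} \Big)^{T} \hat{\beta}, \tag{4} \]

where $\hat{N}_d = \sum_{i \in s_d} 1/\pi_i$ and $\hat{\beta}$ is estimated by least squares weighted by the weights resulting from the sampling process:

\[ \hat{\beta} = \Big( \sum_{i \in u} w_i x_i x_i^{T} \Big)^{-1} \sum_{i \in u} w_i x_i y_i. \tag{5} \]

Writing $r_i = y_i - x_i^{T} \hat{\beta}$ and $\hat{Y}_d^{\mathrm{GREG}} = \sum_{i \in u_d} w_i g_i y_i$, with weights

\[ g_i = 1 + \big( \bar{X}_{d\cdot} - \bar{x}_{d\cdot} \big)^{T} \Big( \sum_{i \in u} w_i x_i x_i^{T} \Big)^{-1} x_i, \]

the MSE estimator is

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d^{\mathrm{GREG}}\big) = \sum_{i \in u_d} \sum_{j \in u_d} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} \, \frac{g_i r_i}{\pi_i} \, \frac{g_j r_j}{\pi_j}. \tag{6} \]

SYNTHETIC ESTIMATORS

MODEL A

Standard two-level linear model:

\[ y_{id} = x_{id}^{T} \beta + u_d + e_{id}, \qquad u_d \sim \mathrm{iid}\ N(0, \sigma_u^{2}), \quad e_{id} \sim \mathrm{iid}\ N(0, \sigma_e^{2}), \tag{7} \]

\[ \hat{\bar{Y}}_d^{\mathrm{SYNTH}} = \bar{X}_{d\cdot}^{T} \hat{\beta}, \qquad \bar{X}_{d\cdot} = \big( \bar{X}_{d\cdot,1}, \ldots, \bar{X}_{d\cdot,p} \big)^{T}. \tag{8} \]

The estimator does not respect sampling weights.

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d^{\mathrm{SYNTH}}\big) = \hat{\sigma}_u^{2} + \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot}, \tag{9} \]

where $\hat{V}$ is the estimated covariance matrix of the regression coefficients $\hat{\beta}$.

MODEL B

The model for domain $d$ is as follows:

\[ \bar{y}_{d\cdot} = \bar{x}_{d\cdot}^{T} \beta + \xi_d, \tag{10} \]

where $\xi_d \sim \mathrm{iid}\ N(0, \sigma_u^{2} + \sigma_e^{2}/n_d)$ and $n_d$ denotes the sample size for area $d$.

The variance $\sigma_e^{2}$ is estimated according to the formula:

\[ \hat{\sigma}_e^{2} = \frac{1}{n - n_a} \sum_{d} \sum_{i \in u_d} \big( y_{id} - \bar{y}_{d\cdot} \big)^{2}, \tag{11} \]

where $n$ is the sample size and $n_a$ is the number of domains in the sample.

This is a one-level regression model, with $\beta$ and $\sigma_u^{2}$ estimated iteratively from:

\[ \hat{\beta} = \big( x^{T} D^{-1} x \big)^{-1} x^{T} D^{-1} y, \tag{12} \]

where $y$ is the vector of the sample elements $\bar{y}_{d\cdot}$, $x$ is the matrix with rows consisting of $\bar{x}_{d\cdot}^{T}$, and $D$ is the diagonal matrix with iteratively updated values $\hat{\sigma}_u^{2} + \hat{\sigma}_e^{2}/n_d$ on the diagonal.

\[ \hat{\bar{Y}}_d^{\mathrm{SYNTH}} = \bar{X}_{d\cdot}^{T} \hat{\beta}, \tag{13} \]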
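The two computational ingredients of Model B, the pooled within-area variance (11) and the weighted least-squares step (12), can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not part of the paper: the regression has an intercept and a single covariate (so the normal equations are solved in closed form), $\hat{\sigma}_u^{2}$ is held fixed rather than updated iteratively (the text does not state the update rule), and all names and numbers are hypothetical.

```python
def pooled_within_variance(samples_by_domain):
    """Equation (11): pooled sample estimate of the within-area variance."""
    n = sum(len(ys) for ys in samples_by_domain.values())
    n_a = len(samples_by_domain)
    ss = 0.0
    for ys in samples_by_domain.values():
        mean = sum(ys) / len(ys)
        ss += sum((y - mean) ** 2 for y in ys)
    return ss / (n - n_a)


def gls_step(ybar, xbar, n_d, sigma_u2, sigma_e2):
    """One evaluation of equation (12) for an intercept plus one covariate.

    D is diagonal with entries sigma_u2 + sigma_e2 / n_d; sigma_u2 is treated
    as given here (ASSUMPTION: no iterative update is performed).
    """
    d = [sigma_u2 + sigma_e2 / n for n in n_d]
    s0 = sum(1.0 / di for di in d)
    s1 = sum(x / di for x, di in zip(xbar, d))
    s2 = sum(x * x / di for x, di in zip(xbar, d))
    t0 = sum(y / di for y, di in zip(ybar, d))
    t1 = sum(x * y / di for x, y, di in zip(xbar, ybar, d))
    det = s0 * s2 - s1 * s1
    intercept = (s2 * t0 - s1 * t1) / det
    slope = (s0 * t1 - s1 * t0) / det
    return intercept, slope


def synthetic_b(intercept, slope, population_mean_x):
    """Equation (13): synthetic estimate from the population covariate mean."""
    return intercept + slope * population_mean_x


# Hypothetical toy data: binary unemployment indicator in two domains.
toy = {"a": [1, 0, 1, 0], "b": [0, 0, 1]}
sigma_e2_hat = pooled_within_variance(toy)

# Hypothetical domain means lying exactly on a line, so GLS must recover it.
xbar = [0.1, 0.3, 0.2, 0.5]
ybar = [0.02 + 0.5 * x for x in xbar]
b0, b1 = gls_step(ybar, xbar, n_d=[40, 25, 60, 10], sigma_u2=0.0004, sigma_e2=0.05)
```

In a full implementation the step would be repeated, re-estimating $\hat{\sigma}_u^{2}$ and rebuilding $D$ until the values stabilise; that loop is deliberately left out because its details are not given in the text.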

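\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d^{\mathrm{SYNTH}}\big) = \hat{\sigma}_u^{2} + \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot}, \tag{14} \]

where $\hat{V}$ is the estimate of the covariance matrix $(x^{T} D^{-1} x)^{-1}$.

EBLUP ESTIMATORS

MODEL A

\[ \hat{\bar{Y}}_d^{\mathrm{EBLUP}} = w_d^{\mathrm{EBLUP}} \hat{\bar{Y}}_d^{\mathrm{GREG}} + \big(1 - w_d^{\mathrm{EBLUP}}\big) \hat{\bar{Y}}_d^{\mathrm{SYNTH}}, \tag{15} \]

\[ w_d^{\mathrm{EBLUP}} = \frac{\hat{\sigma}_u^{2}}{\hat{\sigma}_u^{2} + \hat{\sigma}_e^{2}/n_d}. \tag{16} \]

In more detail, the model may be presented as follows.

EBLUP using MODEL A

\[ \hat{\bar{Y}}_d = \gamma_d \big( \bar{y}_{d\cdot} - \bar{x}_{d\cdot}^{T} \hat{\beta} \big) + \bar{X}_{d\cdot}^{T} \hat{\beta}, \tag{17} \]

where:

\[ \gamma_d = \frac{\hat{\sigma}_u^{2}}{\hat{\sigma}_u^{2} + \hat{\sigma}_e^{2}/n_d}, \tag{18} \]

$\bar{y}_{d\cdot}$ and $\bar{x}_{d\cdot}$ are the corresponding sample means of $y$ and the covariates for domain $d$, and $\hat{\beta}$, $\hat{\sigma}_e^{2}$, $\hat{\sigma}_u^{2}$ are parameters estimated using the standard two-level linear model.

MSE estimators

ESTIMATOR 1

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d\big) = \gamma_d \frac{\hat{\sigma}_e^{2}}{n_d} + (1 - \gamma_d)^{2} \, \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot}. \tag{19} \]

ESTIMATOR 2

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d\big) = \gamma_d \frac{\hat{\sigma}_e^{2}}{n_d} + (1 - \gamma_d)^{2} \, \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot} + 2 \Big( \frac{\hat{\sigma}_e^{2}}{n_d} \Big)^{2} \Big( \hat{\sigma}_u^{2} + \frac{\hat{\sigma}_e^{2}}{n_d} \Big)^{-3} \Big[ \widehat{\mathrm{Var}}(\hat{\sigma}_u^{2}) + \frac{\hat{\sigma}_u^{4}}{\hat{\sigma}_e^{4}} \widehat{\mathrm{Var}}(\hat{\sigma}_e^{2}) - 2 \frac{\hat{\sigma}_u^{2}}{\hat{\sigma}_e^{2}} \widehat{\mathrm{Cov}}(\hat{\sigma}_u^{2}, \hat{\sigma}_e^{2}) \Big], \tag{20} \]

where:

\[ \frac{\widehat{\mathrm{Var}}(\hat{\sigma}_e^{2})}{\hat{\sigma}_e^{4}} = \frac{2}{m_1 - m_2 - p}, \qquad \frac{\widehat{\mathrm{Cov}}(\hat{\sigma}_e^{2}, \hat{\sigma}_u^{2})}{\hat{\sigma}_e^{2} \hat{\sigma}_u^{2}} = \frac{2}{m_1 - m_2 - p}, \]

\[ \widehat{\mathrm{Var}}(\hat{\sigma}_u^{2}) = \widehat{\mathrm{Var}}(\hat{\rho}) \, \widehat{\mathrm{Var}}(\hat{\sigma}_e^{2}) + \hat{\sigma}_e^{4} \, \widehat{\mathrm{Var}}(\hat{\rho}) + \hat{\rho}^{2} \, \widehat{\mathrm{Var}}(\hat{\sigma}_e^{2}), \]

\[ \widehat{\mathrm{Var}}(\hat{\rho}) = 2 \Big/ \sum_{d} \frac{n_d^{2}}{(1 + n_d \hat{\rho})^{2}}, \]

$m_1$ is the number of selected units, $m_2$ is the number of domains, $\rho = \sigma_u^{2}/\sigma_e^{2}$ is the variance ratio, and $\hat{V}$ is the estimated covariance matrix of the regression coefficients $\hat{\beta}$. $\widehat{\mathrm{Var}}(\hat{\sigma}_e^{2})$ is the estimated variance of $\hat{\sigma}_e^{2}$ and $\widehat{\mathrm{Var}}(\hat{\sigma}_u^{2})$ is the estimated variance of $\hat{\sigma}_u^{2}$.

Estimator 1 is a corresponding approximation if the number of domains is large. Estimator 2 may be applied in any case.

MODEL B

\[ \hat{\bar{Y}}_d^{\mathrm{EBLUP}} = w_d^{\mathrm{EBLUP}} \hat{\bar{Y}}_d^{\mathrm{DIRECT}} + \big(1 - w_d^{\mathrm{EBLUP}}\big) \hat{\bar{Y}}_d^{\mathrm{SYNTH}}, \tag{21} \]

\[ w_d^{\mathrm{EBLUP}} = \frac{\hat{\sigma}_u^{2}}{\hat{\sigma}_u^{2} + \hat{\psi}_d}. \tag{22} \]

EBLUP using MODEL B

\[ \hat{\bar{Y}}_d^{\mathrm{EBLUP}} = \gamma_d \hat{\bar{Y}}_d^{\mathrm{DIRECT}} + (1 - \gamma_d) \bar{X}_{d\cdot}^{T} \hat{\beta}, \tag{23} \]

where

\[ \gamma_d = \frac{\hat{\sigma}_u^{2}}{\hat{\sigma}_u^{2} + \hat{\sigma}_e^{2}/n_d}, \tag{24} \]

\[ \hat{\beta} = \big( x^{T} D^{-1} x \big)^{-1} x^{T} D^{-1} y, \tag{25} \]

and where $y$ is the vector of the sample elements $\bar{y}_{d\cdot}$, $x$ is the matrix with rows consisting of $\bar{x}_{d\cdot}^{T}$, and $D$ is the diagonal matrix with iteratively updated values $\hat{\sigma}_u^{2} + \hat{\sigma}_e^{2}/n_d$ on the diagonal.

MSE estimators

ESTIMATOR 1

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d\big) = \gamma_d \hat{\psi}_d + (1 - \gamma_d)^{2} \, \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot}, \tag{26} \]

where $\hat{\psi}_d = \hat{\sigma}_e^{2}/n_d$ is an estimator of the residual variance inside domains.

ESTIMATOR 2

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d\big) = \gamma_d \hat{\psi}_d + (1 - \gamma_d)^{2} \, \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot} + \hat{\psi}_d^{2} \big( \hat{\sigma}_u^{2} + \hat{\psi}_d \big)^{-3} \widehat{\mathrm{Var}}(\hat{\sigma}_u^{2}). \tag{27} \]

ESTIMATOR 3

\[ \widehat{\mathrm{MSE}}\big(\hat{\bar{Y}}_d\big) = \gamma_d \hat{\psi}_d + (1 - \gamma_d)^{2} \, \bar{X}_{d\cdot}^{T} \hat{V} \bar{X}_{d\cdot} + 2 \hat{\psi}_d^{2} \big( \hat{\sigma}_u^{2} + \hat{\psi}_d \big)^{-3} \widehat{\mathrm{Var}}(\hat{\sigma}_u^{2}), \tag{28} \]

where $\hat{V}$ is the estimated covariance matrix of the regression coefficients $\hat{\beta}$ and $\widehat{\mathrm{Var}}(\hat{\sigma}_u^{2})$ is an estimate of the variance of $\hat{\sigma}_u^{2}$.

EBLUP using spatial correlation (EBLUPGREG SPATIAL)

The software prepared by ISTAT was based on a re-formulation of the expressions contained in Saei and Chambers (2003) in order to obtain more efficient SAS code. The model considered is the general linear mixed model

\[ y = X\beta + Zu + e, \tag{29} \]

where: $X$ and $Z$ are known matrices of order $N \times P$ and $N \times DOM$ respectively; $X$ is the matrix of the population values of the covariates and $Z$ is the incidence matrix for the spatial random area effect; $e$ and $u$ are vectors of random variables whose means and variance-covariance matrices are given respectively by the pairs $N[0, \sigma_e^{2} I_N]$ and $N[0, \sigma_u^{2} A]$, $I_N$ being the identity matrix of order $N$ and $A$ a square matrix of order $DOM$ allowing a spatial correlation structure to be included in the model. The generic element $a(d, d')$ of $A$ is given by

\[ a(d, d') = \Big[ 1 + \delta(d, d') \exp\Big( \frac{\mathrm{dist}(d, d')}{\alpha} \Big) \Big]^{-1}, \tag{30} \]

\[ \delta(d, d') = \begin{cases} 0 & \text{for } d = d', \\ 1 & \text{for } d \neq d', \end{cases} \]

where $\mathrm{dist}(d, d')$ is the Euclidean distance between the centroids of areas $d$ and $d'$.

6. SOFTWARE

The software used in the study was SAS. We started by using PROC SURVEYMEANS to compute the direct estimator. Then the EURAREA code was implemented. Not only were the seven standard estimators computed, but the EBLUPGREG software with its options was also tested to find out whether to allow for spatial autocorrelation or not. SAS was also used to compute the coordinates of the centroids, which are of special importance in estimation conducted via EBLUPGREG taking into account spatial dependence.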
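To make the role of the centroids concrete, the spatial structure matrix $A$ of equation (30) can be built directly from centroid coordinates. The study itself used SAS; the Python sketch below is only an illustration, and the function name, the coordinates and the value of $\alpha$ are hypothetical.

```python
import math


def spatial_matrix(centroids, alpha):
    """Build A from equation (30).

    centroids: list of (x, y) coordinates, one per area.
    alpha: positive scale parameter of the spatial correlation.
    Diagonal elements equal 1 (delta = 0); off-diagonal elements equal
    1 / (1 + exp(dist / alpha)), which decreases as areas get further apart.
    """
    m = len(centroids)
    a = [[0.0] * m for _ in range(m)]
    for i, (xi, yi) in enumerate(centroids):
        for j, (xj, yj) in enumerate(centroids):
            if i == j:
                a[i][j] = 1.0
            else:
                dist = math.hypot(xi - xj, yi - yj)  # Euclidean distance
                a[i][j] = 1.0 / (1.0 + math.exp(dist / alpha))
    return a


# Hypothetical centroids of three areas and a hypothetical alpha.
A = spatial_matrix([(0.0, 0.0), (3.0, 4.0), (0.0, 10.0)], alpha=5.0)
```

With this form, off-diagonal elements are always below 0.5 and approach 0 for distant areas, so a larger $\alpha$ corresponds to spatial correlation that decays more slowly with distance. For very large ratios of distance to $\alpha$ the exponential may overflow in floating point, so real coordinates would need to be expressed in suitable units.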

Figure 1. Centroids of small domains NUTS4

7. RESULTS

Figures 2 and 3 show that the spatial pattern of the direct estimates for the NUTS3 level fits the pattern of the values coming from the administrative registers quite well.

Figure 2. Direct estimates

Figure 3. Registered unemployment

The following figures (Fig. 4 to Fig. 10) give an overview of how the applied estimators reproduce registered unemployment. Figure 4 shows that direct estimates of ILO (International Labour Organization) unemployment are in most cases lower for the NUTS3 areas than registered unemployment. Figure 5 presents quite a different situation: in the case of the model-assisted estimator (GREG), the unemployment estimated from the LFS is higher than the registered one. However, if the ranges of unemployment are compared, one can see that direct estimates have a slightly narrower range than GREG. Registered unemployment ranges from 1.41% to 10.77%, direct estimates range from 1.01% to 7.89% and GREG estimates from .9% to 10.89%.

The Synth A estimator has the smallest variance, but its bias is unacceptable. Synth B estimates are quite similar to the direct ones. However, the range of the estimates obtained with Synth B is very narrow: from 2.35% to 5.94%. One could therefore conclude that the smoothing of the estimates is too strong.

The comparison of the EBLUP estimators based on different assumptions revealed that they should be of special interest in applications of small area methodology in the Labour Force Survey. The results suggest that their bias is relatively small, with relatively small variation. For instance, the range of estimates produced by EBLUPGREG SPATIAL (which takes into consideration the spatial structure of the data) is from 4.50% to 10.34%. One should notice that even though some basic assumptions relating to the nature of the variable are violated (the number of unemployed people aged 15 and more is not a continuous variable and its distribution could not be normal), the EBLUPs behave quite well. The estimates obtained with the EBLUP A and EBLUP B models show the same patterns as their direct components: EBLUP A is a linear combination of GREG and SYNTH A, while EBLUP B is a linear combination of DIRECT and SYNTH B.

Figure 4. Comparison of registered unemployment rate and direct estimates

Figure 5. Comparison of registered unemployment rate and GREG estimates

Figure 6. Comparison of registered unemployment rate and Synth A estimates

Figure 7. Comparison of registered unemployment rate and Synth B estimates

We also compared the MSEs of the studied estimators. The software produced in the EURAREA project computes the MSE of the seven standard estimators and also of the spatial version of the EBLUPGREG estimator. However, the approach presented here involves two simplifications. First, for the DIRECT and GREG estimators the MSE consists mainly of variance, as these two estimators are design-unbiased. Second, the reader applying the EURAREA code should realize that the quality assessment measures are computed assuming simple random sampling, whereas the sampling design applied in the LFS is in fact stratified and two-stage.

Figure 8. Comparison of registered unemployment rate and EBLUP A estimates
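The remark that EBLUP B is a linear combination of DIRECT and SYNTH B, and the observation that purely synthetic estimates are smoothed too strongly, can be illustrated numerically with the composite form of equations (21)-(24). The sketch below is a minimal illustration, not the EURAREA code; the variance components and the estimates fed into it are hypothetical values chosen for readability.

```python
def shrinkage_weight(sigma_u2, sigma_e2, n_d):
    """gamma_d = sigma_u^2 / (sigma_u^2 + psi_d), with psi_d = sigma_e^2 / n_d."""
    psi_d = sigma_e2 / n_d
    return sigma_u2 / (sigma_u2 + psi_d)


def eblup_b(direct, synthetic, sigma_u2, sigma_e2, n_d):
    """Composite estimate: gamma_d * direct + (1 - gamma_d) * synthetic."""
    gamma_d = shrinkage_weight(sigma_u2, sigma_e2, n_d)
    return gamma_d * direct + (1.0 - gamma_d) * synthetic


# Hypothetical variance components and domain-level estimates (as proportions).
SIGMA_U2 = 0.0004
SIGMA_E2 = 0.05

gamma_small = shrinkage_weight(SIGMA_U2, SIGMA_E2, n_d=50)
gamma_mid = shrinkage_weight(SIGMA_U2, SIGMA_E2, n_d=125)
gamma_large = shrinkage_weight(SIGMA_U2, SIGMA_E2, n_d=1000)

estimate_mid = eblup_b(direct=0.08, synthetic=0.04,
                       sigma_u2=SIGMA_U2, sigma_e2=SIGMA_E2, n_d=125)
```

The weight on the direct component grows with the domain sample size, so domains with few sampled persons are pulled towards the synthetic value while well-sampled domains stay close to their direct estimate. This is why the EBLUP estimates follow the pattern of their direct components without reproducing all of their variability.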

Figure 9. Comparison of registered unemployment rate and EBLUP B estimates
Figure 10. Comparison of registered unemployment rate and EBLUPGREG SPATIAL estimates
Figure 11. Distribution of MSEs (NUTS3 ordered according to increasing sample size)
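Figure 11 orders the NUTS3 areas by sample size. The sketch below suggests why that ordering is informative. It assumes simple random sampling without a finite population correction and uses the decomposition MSE = variance + bias squared. The rate, sample sizes, bias and variance values are hypothetical and are not results from the survey.

```python
def srs_rate_variance(p, n):
    """Variance of an estimated proportion under simple random sampling.

    Assumption: with-replacement approximation, no finite population
    correction and no design effect for stratified two-stage sampling.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return p * (1.0 - p) / n


def mse(variance, bias):
    """Mean squared error as variance plus squared bias."""
    return variance + bias ** 2


sample_sizes = [50, 200, 800]   # hypothetical area sample sizes
true_rate = 0.10                # hypothetical unemployment rate

# Design-unbiased estimator: the MSE is pure variance and shrinks with n.
direct_mse = [mse(srs_rate_variance(true_rate, n), bias=0.0) for n in sample_sizes]

# Synthetic-type estimator: small, roughly constant variance plus a fixed bias.
synthetic_mse = [mse(0.00005, bias=0.03) for _ in sample_sizes]

for n, d, s in zip(sample_sizes, direct_mse, synthetic_mse):
    print(f"n={n:4d}  direct MSE={d:.6f}  synthetic MSE={s:.6f}")
```

Under these assumed values the direct estimator is worse than the synthetic one only for the smallest sample, which is the qualitative pattern one would look for when reading the MSEs against sample size.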

The last figure (Fig. 11) presents the distribution of MSEs, with the NUTS3 areas ordered by increasing sample size. The highest MSE values are associated with the SYNTH A estimator; in this case bias makes an important contribution to the MSE. The design-unbiased estimators DIRECT and GREG show quite high variance, which is slightly smaller for GREG. However, as the sample size increases, the variance of the DIRECT estimator decreases. For SYNTH the variance is roughly constant. The best performance is achieved by the EBLUPs: they are quite similar to one another and have the smallest MSE.

Poznan University of Economy

REFERENCES

[1] Bracha C., Lednicki B., Wieczorkowski R., (2003), Estimation of Data from the Polish Labour Force Surveys by poviats (counties) in 1995-2002 (in Polish), Central Statistical Office of Poland, Warsaw. http://www.stat.gov.pl/cps/re/xbcr/gus/publ estymacja anych z ba na poziomie pow la lat 1995 00.pf
[2] Chandra H., Salvati N., Chambers R., (2009), Small Area Estimation for Spatially Correlated Populations: A Comparison of Direct and Indirect Model-Based Methods, Southampton Statistical Sciences Research Institute, Methodology Working Paper M07/09, University of Southampton.
[3] D'Alò M., Falorsi S., Solari F., (2004), EURAREA Documentation on SAS/IML program on Linear Mixed Model with Spatial Correlated Area Effects in Small Area Estimation, EURAREA Deliverable 3.3.2.
[4] EURAREA Project Reference Volume, http://www.statistics.gov.uk/eurarea.
[5] EURAREA EBLUPGREG Software Documentation, Statistics Finland, EURAREA Consortium, Deliverables D2.3.2, D3.3.2, 2004.
[6] Gołata E., (2004), Estymacja pośrednia bezrobocia na lokalnym rynku pracy, Wydawnictwo AE w Poznaniu, Poznań.
[7] Kruszka K., (ed.), (2010), Commuting in Poland, Statistical Office in Poznań, Poznań.
[8] Kubacki J., (2004), Application of the Hierarchical Bayes Estimation to the Polish Labour Force Survey, Statistics in Transition, 6 (5), 785-796, Warsaw.
[9] Saei A., Chambers R., (2003), Small Area Estimation: A Review of Methods Based on the Application of Mixed Models, University of Southampton.
[10] Saei A., Chambers R., (2004), Small Area Estimation Under Linear and Generalized Linear Mixed Models With Time and Area Effects, University of Southampton.

USING INDIRECT ESTIMATION WITH SPATIAL AUTOCORRELATION IN SOCIAL SURVEYS IN POLAND

Abstract

The article presents a possible application of indirect estimation methods (including a method that accounts for spatial correlation) to estimate some characteristics of the labour market in the population of people aged 15 and over at the NUTS3 level in Poland in 2008. This is a more detailed level of spatial aggregation than that found in publications of the Central Statistical Office based on Labour Force Survey results. The second aim of the article is to compare the precision measures of the direct estimator with those of the EBLUP estimator (empirical best linear unbiased predictor) and the EBLUPGREG SPATIAL estimator (which takes spatial correlation into account).

Key words: small area statistics, spatial autocorrelation, unemployment, Labour Force Survey (LFS)