An Exploratory Spatial Data Analysis of Income and Education Inequality in Pakistan
Sofia Ahmed
Joint Doctoral Program in International Economics SIS/CIFREM
October 2009

This draft is preliminary and incomplete, not for citation.

Abstract

Econometric studies of income inequality generally treat regions as independent entities, ignoring the likely possibility of spatial interaction, particularly within a country. This interaction may cause spatial dependency or clustering, which is referred to as spatial autocorrelation. This chapter analyzes the relationship between the spatial clustering of income and education in the districts of Pakistan by employing exploratory spatial data analysis (ESDA) techniques. Global and local measures of spatial autocorrelation were computed using the Moran's I index to obtain estimates of the existing spatial autocorrelation in income and education levels across districts. The results reveal a surprising absence of knowledge spillovers in terms of education attainment rates across districts close to large cities with high education attainment rates. On the other hand, district-wise incomes reveal a clear spatial autocorrelation pattern whereby high income districts tend to be neighbors of other high income districts. By detecting outliers and clusters, ESDA allows policy makers to focus on the geography of inequalities, hence highlighting the need to pursue spatial analysis at lower geographical units such as the district level instead of the common practice of provincial analysis in Pakistan.
Introduction

This dissertation analyzes the spatial evolution of income inequality and its causes in Pakistan. As a first step, this chapter investigates whether spatial clustering of income and average education levels can explain the distribution of income across Pakistani districts. The technique used is exploratory spatial data analysis (ESDA), which describes and visualizes spatial distributions, identifies spatial outliers, detects agglomerations and local spatial autocorrelations, and highlights the types of spatial heterogeneity (Haining 1990; Bailey and Gatrell 1995; Anselin 1988; Le Gallo and Ertur 2003; Oort 2004, 107).

The chapter is organized as follows. Section 1 describes the data; Section 2 gives an overview of the methodology; Section 3 explains the global and local techniques for detecting spatial autocorrelation; Section 4 analyzes the results of applying the ESDA techniques to district income and education data; finally, Section 5 summarizes and evaluates.

1. Data

This study uses micro data from the Pakistan Social and Living Standards Measurement survey (PSLM). It is annually produced by the Federal Bureau of Statistics (FBS) of Pakistan since It is the only socio-economic micro data set that is representative at both the provincial and the district level. Data collection at the district level was a plausible initiative as it was required for planning in the context of decentralization which began in Moreover, the sample size of the district level data is substantially larger than that of the provincial level data contained in micro data surveys such as the Household Income and Expenditure Survey (HIES) and the Labour Force Survey (LFS) of Pakistan. This enables researchers to draw socio-economic information that is representative at lower administrative levels as well.
Currently, this study only utilizes the PSLM survey of , but it aims to extend the estimations over a period from 2000 to 2009 to study the temporal changes along with the spatial changes. The PSLM survey for provides district level welfare indicators for a sample size of about households. The data is statistically comparable with the Pakistan Census Data (1998), with some margin of sampling error. It provides data on districts in all four provinces of Pakistan, namely Punjab, Sindh, North West Frontier Province (NWFP), and Balochistan. The federally administered tribal areas (FATA region) along the Afghan border in the north west and Azad Kashmir are not included in the data.

The PSLM is divided into two parts. The first part contains data on socio-economic characteristics such as education, health, population welfare, immunization, pre/post natal care, family planning, water supply, and sanitation. The second part contains household income and expenditure data. The quality and reliability of the PSLM data is ensured through cross checking of field work at various stages. Regional/field offices carry out initial data editing before the data is handed over to the Federal Bureau of Statistics in Islamabad, where it undergoes various consistency checks by the data entry programs.

2. Methodology

Because data is abundantly collected at a provincial or a rural/urban disaggregation, most socio-economic studies on Pakistan are province based. Pakistani provinces, however, have extreme internal diversity in terms of their economic structures, cultures, languages, natural resources and geography. Hence regional policy making requires analyzing socio-economic issues at an even smaller geographical disaggregation. For this reason, the spatial unit of analysis in this study is the district.

In terms of geographical disaggregation, Pakistan (excluding the Federally Administered Tribal Area (FATA) region and Azad Kashmir) has 4 levels consisting of 4 provinces, 107 districts, 377 sub-districts, and villages. A lower level unit of analysis is not used for two main reasons. Firstly, data on regional scales below the district level in Pakistan suffers from reliability issues. The second issue is more technical. In order to give information on the 45,653 villages of Pakistan instead of 107 districts, the project would need a distance matrix with $\frac{45{,}653 \times 45{,}654}{2} = 1{,}042{,}121{,}031$ free elements to be evaluated, hence the utilization of district level data.
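The size argument above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `symmetric_matrix_elements` is hypothetical, and it assumes the quoted figure counts the distinct entries of a symmetric matrix including its diagonal, since 1,042,121,031 equals n(n+1)/2 for n = 45,653.

```python
def symmetric_matrix_elements(n: int, include_diagonal: bool = True) -> int:
    """Number of distinct entries in a symmetric n x n matrix."""
    off_diagonal = n * (n - 1) // 2  # one entry per unordered pair of units
    return off_diagonal + n if include_diagonal else off_diagonal


# Village-level versus district-level distance matrices.
villages = symmetric_matrix_elements(45_653)
districts = symmetric_matrix_elements(107)
print(villages, districts)
```

If the diagonal is excluded (a unit's distance to itself is zero by definition), the village-level count drops by exactly 45,653, which does not change the order of magnitude of the problem.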
2.1 Why a spatial economic analysis?

A fundamental concept in geography is that proximate locations often share more similarities than locations far apart. This idea is commonly referred to as Tobler's first law of geography and is incorporated in spatial modeling, which typically aims to look for associations instead of trying to develop explanations (Haining 2003, p. 358). Classical statistical inference, such as conventional regression, is inadequate for an in-depth spatial analysis since it fails to take into account spatial effects and problems of spatial data analysis such as spatial autocorrelation, identification of spatial outliers, edge effects, the modifiable areal unit problem and lack of spatial independence¹. These reasons necessitate the use of spatial exploratory and explanatory methods that explicitly take spatial effects into account.

2.2 Spatial effects

Spatial effects can be divided into two main kinds: spatial dependence and spatial heterogeneity. Spatial heterogeneity refers to instability in the behavior of the relationships under study. This implies that parameters and functional relationships vary across space and are not homogeneous throughout data sets. Spatial dependence, by contrast, refers to the lack of independence between observations often present in cross-sectional data sets. It can be considered as a functional relationship between what happens at one point in space and what happens at another. If the Euclidean sense of space is extended to include general space (consisting of policy space, inter-personal distance, social networks etc.), spatial dependence becomes a phenomenon with a wide range of applications in the social sciences. Two factors can lead to it. First, measurement errors may exist for observations in contiguous spatial units.
The second reason can be the use of inappropriate functional frameworks in the presence of different spatial processes (such as diffusion, exchange and transfer, interaction and dispersal), as a result of which what happens at one location is partly determined by what happens elsewhere in the system under analysis.

¹ Modifiable Areal Unit Problem: When attributes of a spatially homogeneous phenomenon (e.g. people) are aggregated into districts, the resulting values (e.g. totals, rates and ratios) are influenced by the choice of the district boundaries just as much as by the underlying spatial patterns of the phenomenon.
Assuming stationarity or structural stability over space is highly unrealistic when the variable under study is observed at different locations across space. Along the lines of temporal autocorrelation often found in time series data, spatial autocorrelation also violates the standard assumption of independence among observations. Hence standard regression analysis that does not compensate for spatial dependency can yield biased estimators and unreliable significance tests. As a remedy, spatial autocorrelation statistics have been devised to detect, measure and analyze the degree of dependency among observations.

2.3 Quantifying spatial effects

Spatial dependence puts forward the need to determine which spatial units in a system are related, how spatial dependence occurs between them, and what kind of influence they exercise on each other. Formally these questions are answered by using the concept of neighborhood, expressed in terms of distance or contiguity. Boundaries of spatial units can be used to determine contiguity or adjacency, which can be of several orders (e.g. first order contiguity or more). Contiguity can be defined as linear contiguity (i.e. counties that share a border with the county of interest and lie immediately to its left or right), rook contiguity (i.e. counties that share a common side with the county of interest), bishop contiguity (i.e. counties that share a vertex with the county of interest), double rook contiguity (i.e. two counties to the north, south, east and west of the county of interest), and queen contiguity (i.e. counties that share a common side or a vertex with the county of interest) (LeSage 1999). Other common conceptualizations of spatial relationships include inverse distance, travel time, fixed distance bands, and k-nearest neighbors.
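The contiguity rules listed above are easiest to see on a regular lattice. The following is a minimal sketch, not the procedure used in this study: real districts are irregular polygons, and the function name `lattice_contiguity` and the 3 x 3 grid are assumptions made purely for illustration.

```python
import numpy as np

# Offsets (row, col) that define each contiguity rule on a regular lattice.
OFFSETS = {
    "rook": [(-1, 0), (1, 0), (0, -1), (0, 1)],       # shared side
    "bishop": [(-1, -1), (-1, 1), (1, -1), (1, 1)],   # shared vertex only
}
OFFSETS["queen"] = OFFSETS["rook"] + OFFSETS["bishop"]  # side or vertex


def lattice_contiguity(n_rows, n_cols, rule="queen"):
    """Binary contiguity matrix for the cells of an n_rows x n_cols grid."""
    n = n_rows * n_cols
    W = np.zeros((n, n), dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in OFFSETS[rule]:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    W[i, rr * n_cols + cc] = 1
    return W


W_rook = lattice_contiguity(3, 3, "rook")
W_queen = lattice_contiguity(3, 3, "queen")
# Neighbour counts of the centre cell (index 4) under each rule.
print(W_rook[4].sum(), W_queen[4].sum())
```

Both matrices are symmetric with a zero diagonal; the queen matrix is simply the element-wise sum of the rook and bishop matrices.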
The most popular way of representing a type of contiguity or adjacency is binary contiguity (Cliff and Ord 1973, 1981), expressed in a spatial weight matrix (W). In spatial econometrics, W describes the composition of the spatial relationships among different points in space. The spatial weight matrix enables us to relate a variable at one point in space to the observations for that variable in other spatial units of the system. It is used as a variable while modeling spatial effects contained in the data. Generally it is based on either distance or contiguity between spatial units. Consider below a spatial weight matrix for three units:
$$W = \begin{bmatrix} 0 & w_{12} & w_{13} \\ w_{21} & 0 & w_{23} \\ w_{31} & w_{32} & 0 \end{bmatrix}$$

where $w_{ij}$ may be the inverse distance between two units i and j, or it may take the value 1 if they share a border or a vertex and 0 otherwise. The W matrix displays the properties of a spatial system and can be used to gauge the prominence of a spatial unit within the system. The usual expectation is that values at adjacent locations will be similar.

2.4 The spatial weight matrix for Pakistan

The choice of the W matrix representation and its conceptualization has to be carefully based on theoretical reasoning and the historical factors underlying the concept or phenomenon under study. For example, for cluster detection and influence analysis inverse distance is the most appropriate measure, but when we are assessing the geographic distribution of a region's commuters, travel time or cost would be a better choice.

This paper employs two W matrices for Pakistan. The first is a simple binary contiguity W matrix (BC) based on the concept of queen contiguity, i.e. if a district i shares a border or a vertex with another district j, they are considered neighbors and $w_{ij}$ takes the value 1, and 0 otherwise. This matrix is also zero along its diagonal, implying that a district cannot be a neighbor to itself. Hence it is a symmetric binary matrix with a dimension of 81x81 (81 being the total number of districts being analyzed)². This matrix captures the influence of geographically adjacent neighbors on each other. A simple binary contiguity matrix is a standard starting point and its influence is often compared with other types of W matrices.

² The total number of districts in Pakistan is 104, but the PSLM covers 81 districts. For Balochistan, the geographical unit of analysis is the division, since the data is available only at the division level. Divisions were one level greater than districts and one level smaller than provinces. As a geographical unit, they got eliminated in In the subsequent years however, Balochistan is also district representative in the PSLMs as compared to division representation only.

The second W matrix developed for Pakistan is based on the inverse average road distance between the centroid of a district and the centroid of its nearest district(s) with a large size city (ID matrix). If the nearest district is not the neighbor of a district with a large size city, then the value of $w_{ij}$ is the distance from the centroid of that district to the centroid of the district containing the provincial capital city of the province in which that district is located. This matrix is a symmetric non-binary matrix, again with a dimension of 81x81. Out of the 81 districts being studied, only 14 come under the category of a district with a large size city as per the classification of the coding scheme for the PSLM survey. These include Islamabad as the federal capital city; Lahore, Faisalabad, Rawalpindi, Multan, Gujranwala, Sargodha, Sialkot, and Bahawalpur in Punjab; Karachi, Hyderabad and Sukkur in Sindh; Peshawar in the North West Frontier Province; and Quetta in Balochistan.

Road distance is selected instead of train distance, which is normally used in most studies on urban area analysis, because in Pakistan the road network is much better developed than the railway network. As a result, Pakistan's transport system is primarily dependent on road transport, which makes up 90 percent of national passenger traffic and 96 percent of freight movement every year (The Economic Survey of Pakistan p. 225).

Inverse distance matrices have more explanatory power as partitions of geographic space, especially when the phenomenon under study involves the exchange or transfer of information and knowledge (in our case wages and education). An inverse distance matrix establishes a decay function that weighs the effect of events in geographically proximate units more heavily than those in geographically distant units. Since a country is not a plain piece of land, Euclidean distance calculations, or distance as the crow flies, make little economic sense when we are trying to investigate the effect of distance from districts with a large city on regional wages. The density of a country's infrastructure network is an important influence. For this reason we have used the Google Maps distance calculation service. It provides not only the Euclidean or straight line distance between districts using their longitude and latitude information, but also the maximum and minimum road distance from one district to another, taking into consideration the existing road network of Pakistan. The weight used in this paper is the inverse of the average of the maximum and minimum road distances between the centroids of two districts.
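To make the construction concrete, the sketch below builds inverse-distance weights from the average of a minimum and a maximum road distance and then row-standardizes them (each weight divided by its row sum). It is a generic illustration under stated assumptions, not the 81x81 ID matrix of this study: the function names, the three units and their distances in kilometres are all hypothetical, and every unit is assumed to be linked to every other unit.

```python
import numpy as np


def inverse_distance_weights(d_min, d_max):
    """Inverse of the average of minimum and maximum road distances."""
    d_min = np.asarray(d_min, dtype=float)
    d_max = np.asarray(d_max, dtype=float)
    d_avg = (d_min + d_max) / 2.0
    W = np.zeros_like(d_avg)
    off_diag = ~np.eye(d_avg.shape[0], dtype=bool)
    W[off_diag] = 1.0 / d_avg[off_diag]  # diagonal stays 0: no self-neighbours
    return W


def row_standardize(W):
    """Divide each weight by its row sum; rows without neighbours stay zero."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)


# Hypothetical road distances (km) between the centroids of three units.
d_min = [[0, 100, 300], [100, 0, 200], [300, 200, 0]]
d_max = [[0, 140, 340], [140, 0, 280], [340, 280, 0]]

W_raw = inverse_distance_weights(d_min, d_max)
W_std = row_standardize(W_raw)
print(W_std.round(3))
```

The raw matrix is symmetric, but the row-standardized version generally is not, because each row is scaled by a different row sum.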
Finally, both matrices are row-standardized, i.e. each weight is divided by its row sum. Row standardization is recommended whenever the distribution of the variables under consideration is potentially biased due to errors in sampling design or due to an imposed aggregation scheme.

3. Exploratory spatial data analysis

This paper applies exploratory spatial data analysis techniques to district-wise data on wages, employment and education. Before estimating the spatial econometric models, the presence of spatial dependence has to be detected. This is done by using exploratory spatial data analysis (ESDA). The technique employed in this study is Moran's I statistic. The global Moran's I demonstrates the spatial association of data collected from points in space and measures similarities and dissimilarities in observations across space in the whole system (Anselin 1995). However, in the presence of uneven spatial clustering, the Local Indicators of Spatial Association are utilized. They measure the contribution of individual spatial units to the global Moran's I statistic (Anselin 1995). The study also generates Moran scatterplots to demonstrate the spatial distribution of district wage and education levels across Pakistan.

3.1 Measures of spatial autocorrelation

i) Global spatial autocorrelation

Spatial autocorrelation occurs when the spatial distribution of the variable of interest exhibits a systematic pattern (Cliff and Ord 1981). Positive (negative) spatial autocorrelation occurs when a geographical area tends to be surrounded by neighbors with similar (dissimilar) values of the variable of interest. As previously mentioned, this paper utilizes Moran's I statistic to detect the global spatial autocorrelation present in the data³.
Moran's I is the most widely used measure for detecting and explaining spatial clustering, not only because of its interpretative simplicity but also because it can be decomposed into a local statistic and provides graphical evidence of the presence or absence of spatial clustering. It is defined as

$$I = \frac{n}{S_0} \, \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (1)$$

where $y_i$ is the observation of variable $y$ in location i, $\bar{y}$ is the mean of the observations across all locations, n is the total number of geographical units or locations, and $w_{ij}$ is one of the elements of the weights matrix, indicating the spatial relationship between location i and location j. $S_0$ is a scaling factor equal to the sum of all the elements of the W matrix:

$$S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \qquad (2)$$

$S_0$ is equal to n for row-standardized weights matrices (which is the preferred way to implement the Moran's I statistic), since each row then adds up to 1. The first term in equation (1) then becomes equal to 1 and Moran's I simplifies to a ratio of spatial cross products to variance. Under the null hypothesis of no spatial autocorrelation, the theoretical mean of Moran's I is given by

$$E(I) = -\frac{1}{n-1} \qquad (3)$$

The expected value is thus negative and tends to zero as the sample size increases, as it is only a function of n (the sample size). Moran's I ranges from -1 (perfect spatial dispersion) to +1 (perfect spatial correlation), while a value of 0 indicates a random spatial pattern. If Moran's I is larger than its expected value, then the distribution of y displays positive spatial autocorrelation, i.e. the value of y at each location i tends to be similar to values of y at spatially contiguous locations. However, if I is smaller than its expected value, then the distribution of y is characterized by negative spatial autocorrelation, implying that the value of y at each location i tends to be different from the value of y at spatially contiguous locations. Inference is based on z-values computed as

$$z_I = \frac{I - E(I)}{\mathrm{sd}(I)} \qquad (4)$$

i.e. the expected value of I is subtracted from I and the result is divided by its standard deviation. The theoretical variance of Moran's I depends on the assumptions made about the data and the nature of spatial autocorrelation. This paper presents the results under the randomization assumption, i.e. each value observed could equally have occurred at all locations⁴. Under this assumption $z_I$ asymptotically follows a normal distribution, so that its significance can be evaluated using a standard normal table (Anselin 1992a). A positive (negative) z-value for Moran's I accompanied by a low p-value indicates significant positive (negative) spatial autocorrelation⁵.

³ Other well known measures of spatial autocorrelation include Geary's c statistic and Getis and Ord's G statistic; see Anselin (1995a, p. 22-23).

Finally, the results of Moran's I depend on the specification of the weights matrix. Interpretations change depending on whether the matrix is based on physical distance or economic distance. However, a pattern of decreasing spatial autocorrelation with increasing orders of contiguity (distance decay) is commonly witnessed in most spatial autoregressive processes regardless of the matrix specification (Oort 2004, 109).

ii) Local spatial autocorrelation

Since Moran's I as a global statistic is based on simultaneous measurements from many locations, it only provides broad spatial association measurements, ignores location-specific details, and cannot identify which local spatial clusters (or hot spots) contribute the most to the global statistic. As a remedy, local statistics, commonly referred to as Local Indicators of Spatial Association (LISA), have been developed in exploratory spatial data analysis and are used along with graphic visualization of the spatial clustering in a Moran scatterplot. The Moran scatterplot is derived from the global Moran's I statistic.
Recall that the Moran's I formula, when we use a row-standardized matrix, can be written as

$$I = \frac{\sum_{i} z_i \sum_{j} w_{ij} z_j}{\sum_{i} z_i^2} \qquad (5)$$

This is similar to the formula for a coefficient of the linear regression b, with the exception of $\sum_j w_{ij} z_j$, which is the so-called spatial lag of location i. Therefore I is formally equivalent to the regression coefficient in a regression of a location's spatial lag (Wz) on the value at the location itself (z). This interpretation is used by the Moran scatterplot, enabling us to visualize Moran's I in a scatterplot of Wz versus z, where $z = y - \bar{y}$ (optionally scaled by its standard deviation, which leaves the slope unchanged). Moran's I is then the slope of the regression line contained in the scatterplot. A lack of fit in this scatterplot indicates local spatial associations (local pockets/non-stationarity). The scatterplot is centered on 0 and is divided into four quadrants that represent different types of spatial association.

However, graphical evidence alone does not give the significance levels of the spatial clustering, for which we complement the Moran scatterplot with a local statistic. Local statistics or indicators can reveal the locations that display significant deviation from spatial randomness in the presence of global spatial autocorrelation (hot spots) and the significant outliers in a diagnostic analysis for local stability. Anselin (1995b) defines a LISA as a statistic that satisfies the following two requirements:

1) The LISA for each observation gives an indication of the spatial clustering of similar values around that observation;
2) The sum of all LISAs for all observations is proportional to a global indicator of spatial association.

We use the local Moran's I statistic, which satisfies the above requirements, for our analysis. Each local Moran's I for a particular location indicates the extent of spatial clustering around it, and the sum of all local Moran's I values is proportional to the global Moran's I (with a row-standardized W, the sum equals n times the global statistic). The local Moran's I can be defined as:

$$I_i = \frac{z_i}{m_2} \sum_{j} w_{ij} z_j, \qquad m_2 = \frac{1}{n} \sum_{i} z_i^2 \qquad (6)$$

The null hypothesis tested in this case is that there is no association between the value observed at a location i and the values observed at its neighbors, i.e. the values of $I_i$ are zero. Positive (negative) local spatial autocorrelation exists when we obtain positive (negative) values for $I_i$ and its z-scores, which indicate the clustering of similar (dissimilar) values of y around location i.

⁴ The other two assumptions are the assumption of a normal distribution of the variables in question (normality assumption) and a randomization approach using a reference distribution for I that is generated empirically (permutation assumption). For details and formulas of the randomization assumption, see Sokal et al. (1998).

⁵ Negative spatial autocorrelation reflects a lack of clustering, even more so than a random pattern does. The checkerboard pattern is an example of perfect negative spatial autocorrelation.
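The relationships described in this section (global Moran's I as a ratio of cross products to variance, its reading as the slope of the Moran scatterplot, and its decomposition into local statistics) can be verified numerically. The following is a minimal sketch assuming the standard Anselin-type local statistic scaled by the second moment of the deviations; the function names and the four-unit chain of neighbours are hypothetical, and no inference (variance or p-values) is attempted.

```python
import numpy as np


def global_moran(y, W):
    """Global Moran's I: (n / S0) * z'Wz / z'z with z = y - mean(y)."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    z = y - y.mean()
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)


def local_moran(y, W):
    """Local Moran's I_i = (z_i / m2) * sum_j w_ij z_j, with m2 = z'z / n."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    z = y - y.mean()
    m2 = (z @ z) / len(y)
    return z * (W @ z) / m2


def moran_scatter_slope(y, W):
    """OLS slope of the spatial lag Wz on z (the Moran scatterplot line)."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    lag = np.asarray(W, dtype=float) @ z
    slope, _intercept = np.polyfit(z, lag, 1)
    return slope


# Four units on a line, each linked to its immediate neighbours;
# the weights are already row-standardized (every row sums to 1).
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

I = global_moran(y, W)
I_local = local_moran(y, W)
expected_I = -1.0 / (len(y) - 1)  # mean of I under no spatial autocorrelation
print(I, moran_scatter_slope(y, W), I_local.sum() / len(y), expected_I)
```

With a row-standardized W the first three printed numbers coincide: the global statistic, the scatterplot slope, and the average of the local statistics.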
4. Global spatial autocorrelation: District Incomes

In this section we briefly summarize the results of the estimation exercises carried out to detect spatial autocorrelation in the district-wise wage rates. This is the starting point of the analysis before we proceed towards a spatial econometric analysis of the determinants of varying intra district wages across Pakistani districts in the subsequent chapters⁶. The variable has been obtained from the micro data set by estimating the district-wise average log wage and then comparing the averages across districts. As a robustness measure, we have estimated the global and local measures of spatial autocorrelation using two W matrices instead of one. Table 1 shows the result of Moran's test for average log district wage rates using the two weights matrices. In both cases, the null hypothesis of no spatial dependence is rejected at the 1% significance level.

Table 1: Global autocorrelation results for l_wage. Columns: weight matrix I, weight matrix II. Rows: Moran's I, E(I), Sd(I), Z, p-value.

⁶ In this preliminary version of this chapter, incomes are considered synonymous with the district monthly wages of all salaried persons interviewed in the survey, just to check for the presence of spatial autocorrelation. Later a more comprehensive definition of income will be adopted, which will encompass wages, income in kind, transfers and pensions.
4.1 Local spatial autocorrelation: District Incomes

The Moran scatterplot (in Figures 1 and 2) provides a more disaggregated view of the nature of the global autocorrelation. It provides information not only on the presence of clusters in the data but also on the outliers contained in it. The scatterplot is divided into four quadrants, each of which represents a different type of spatial association:

The upper right quadrant represents spatial clustering of a district with a high average wage rate around neighbors that also have high average wages. This quadrant is also called the High-High zone (HH) since z and Wz both have high values. In general these are locations that have a positive value for the local Moran's I.

The upper left quadrant represents a district with a low average wage rate surrounded by neighbors that have high average wages. This quadrant is also called the Low-High zone (LH) since z is low while Wz is high, indicating a low outlier among neighbors with high values. In general these are locations that have a negative value for the local Moran's I.

The lower left quadrant represents spatial clustering of a district with a low average wage rate around neighbors that also have low average wages. This quadrant is also called the Low-Low zone (LL) since z and Wz both have low values. In general these are locations that have a positive value for the local Moran's I, since both terms of the cross product are negative.

The lower right quadrant represents a district with a high average wage rate surrounded by neighbors that have low average wages. This quadrant is also called the High-Low zone (HL) since z is high while Wz is low, indicating a high outlier among neighbors with low values. In general these are locations that have a negative value for the local Moran's I.
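The four quadrants can be assigned mechanically from the signs of the standardized value z and its spatial lag Wz. The sketch below is illustrative only: the function name `moran_quadrants`, the four-unit chain of neighbours and the input values are hypothetical, and ties at exactly zero are arbitrarily treated as "high".

```python
import numpy as np


def moran_quadrants(y, W):
    """Label each unit HH, LH, LL or HL from the signs of z and Wz."""
    y = np.asarray(y, dtype=float)
    z = (y - y.mean()) / y.std()           # standardized values
    lag = np.asarray(W, dtype=float) @ z   # spatial lag of each unit
    labels = np.where(z >= 0,
                      np.where(lag >= 0, "HH", "HL"),
                      np.where(lag >= 0, "LH", "LL"))
    return z, lag, labels


# Row-standardized weights for four units on a line.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])

# Two clusters: high values next to high, low values next to low.
_, _, clustered = moran_quadrants([10, 9, 2, 1], W)
# One high value surrounded by low values.
z, lag, outlier = moran_quadrants([1, 10, 1, 1], W)
print(clustered.tolist(), outlier.tolist())
```

The sign of the product z times Wz matches the sign of the local Moran's I: positive in the HH and LL quadrants, negative in the LH and HL quadrants.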
Figure 1: Spatial autocorrelation of district incomes using the binary contiguity matrix. (Moran scatterplot of l_wage, Moran's I = 0.688; horizontal axis z, vertical axis Wz; points labeled by district.)

Figure 1 uses the binary contiguity W matrix to produce a positive global Moran's I (z-score = 8.391), represented by the slope of the black line. On a local level it is confirmed by the shape and the direction of the scatterplot. There are relatively few extreme outliers or atypical locations that deviate from the global pattern of positive spatial autocorrelation. Figure 2 (below) uses the inverse distance matrix to produce the scatterplot. Compared to the previous scatterplot, it has a lower value for the global Moran's I (z-score = 4.655) since the clusters here are not based on geographic contiguity but on geographic proximity. Hence we conclude that for the year , there exists statistically significant local spatial autocorrelation for district wages.
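As a rough consistency check on the figures quoted above, the z-score formula can be inverted to recover the standard deviation of Moran's I implied by each reported pair of I and z, taking n = 81 districts. This is a back-of-the-envelope sketch: the implied standard deviations and p-values it prints are derived quantities, not results reported in the text, and the helper names are hypothetical.

```python
import math


def expected_moran(n):
    """Mean of Moran's I under the null of no spatial autocorrelation."""
    return -1.0 / (n - 1)


def implied_sd(i_obs, z, n):
    """Standard deviation of I implied by z = (I - E(I)) / sd(I)."""
    return (i_obs - expected_moran(n)) / z


def two_sided_p(z):
    """Two-sided p-value of z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


n = 81  # districts analyzed
for name, i_obs, z in [("binary contiguity", 0.688, 8.391),
                       ("inverse distance", 0.495, 4.655)]:
    print(name, expected_moran(n), round(implied_sd(i_obs, z, n), 4),
          two_sided_p(z))
```

Both z-scores lie far in the tail of the standard normal distribution, which is consistent with the rejection of the null hypothesis at the 1% level.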
Figure 2: Spatial autocorrelation of district incomes using the inverse distance matrix. (Moran scatterplot of l_wage, Moran's I = 0.495; horizontal axis z, vertical axis Wz; points labeled by district.)

While the Moran scatterplot provides information mainly on the clusters, we use the detailed estimates of the local Moran's I provided in Appendix A for outlier detection. Lahore, Karachi and Peshawar, the three provincial capitals of Punjab, Sindh and NWFP, all emerge with low or negative but statistically insignificant z-scores. This indicates that wage rates in their neighboring districts are lower than theirs, but there is no indication of spillovers in terms of labor remuneration, and we cannot reject the null hypothesis of no spatial association between them and their neighboring districts at a 95% confidence level.

The LISAs for the log of average district wages and the Moran scatterplot produce three main statistically significant clusters when we use the inverse distance matrix. All three of them belong to Punjab. While the cluster of Rawalpindi, Islamabad and Chakwal falls into the High-High zone, the cluster of Vehari, Bahawalnagar, Bahawalpur, Muzaffargarh and R Y Khan falls into the Low-Low zone, i.e. comparatively lower wage rates in and around these districts in the province of Punjab. These clusters are evidence of spillovers between these districts.
4.2 Global spatial autocorrelation: District Education Attainment

We have also carried out an analysis of the average district-wise education attainment level, which is measured as the average number of schooling years completed in a district. It is expected that neighbors of districts with high education attainment should also have high educational awareness and hence similar, if not equal, attainment levels. We again made use of the global and local versions of Moran's I along with a Moran scatterplot using the two weights matrices.

Table 2: Global autocorrelation results for education attainment. Columns: weight matrix I, weight matrix II. Rows: Moran's I, E(I), Sd(I), Z, p-value.

The analysis shows that the average knowledge spillover is weak in most districts that are neighbors to districts with high education attainment levels. This finding of virtually no knowledge spillovers becomes even more pronounced when neighbors are defined in terms of inverse distances rather than contiguous units. Hence the data shows more outliers than clusters. Therefore, contrary to the economic prediction of spillovers, having Karachi as a neighbor may translate into higher education incentives for its neighboring districts but is not actually translating into higher education levels.
Figure 3: Spatial autocorrelation of district education levels using the binary contiguity matrix
[Moran scatterplot (Moran's I = 0.180) for yrsed main; axes: z and Wz; points labeled by district]

The spatial pattern of autocorrelation is quite diffuse when we use the BC matrix. The positive Moran's I value indicates that neighboring districts share similar values of average education attainment, but overall autocorrelation is still weak. Karachi and Thatta emerge as the most significant outliers when we analyze the local Moran's I values using the BC and the ID matrices. However, while Karachi falls into the High-Low zone, Thatta falls into the Low-High zone [7]. Similarly, under both neighborhood structures Islamabad, Rawalpindi, Abbottabad, Chakwal and Jhelum emerge as a statistically significant cluster of districts with high average education attainment levels.

The global spatial autocorrelation under the ID matrix is negative but close to 0 and statistically insignificant. This indicates that we cannot reject the null hypothesis of no spatial association, i.e. the pattern of average education rates across districts is consistent with spatial randomness [8].

[7] The results for districts with significant spatial autocorrelation are reported in Appendix A part (c).
[8] The Moran scatterplot for the average district education attainment level is provided in Appendix A part (e).
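The chapter reports E(I), Sd(I), z and two-tailed p-values for the null hypothesis of no spatial association. A commonly used alternative way to test the same null is a permutation test, sketched below purely as an illustration and not as the procedure used for the reported results. The twelve-unit chain, the values and the function names are hypothetical.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I = (n / S0) * (z' W z) / (z' z), with z = x - mean(x)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

def permutation_test(x, w, n_perm=999, seed=0):
    """Two-sided pseudo p-value: reshuffle values over locations and recompute I."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = morans_i(x, w)
    sims = np.array([morans_i(rng.permutation(x), w) for _ in range(n_perm)])
    center = sims.mean()               # close to the theoretical E(I) = -1/(n-1)
    extreme = np.sum(np.abs(sims - center) >= abs(observed - center))
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical chain of 12 units with a smooth spatial trend in the values.
n = 12
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0    # binary contiguity between adjacent units

values = np.arange(n, dtype=float)
observed, p_value = permutation_test(values, w)
print(f"Moran's I = {observed:.3f}, pseudo p-value = {p_value:.3f}")
```

A large pseudo p-value means that the observed I is indistinguishable from what random reshuffling of the values across locations produces, which is the sense in which a pattern is called spatially random.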
5. Conclusion

This chapter presents the initial results of applying ESDA techniques to district-wise income and education data. The main preliminary finding of this cross-sectional analysis is twofold: the distribution of district-wise income exhibits a significant tendency to cluster in space (i.e. spatial autocorrelation is present), whereas the distribution of education is spatially random.

The chapter remains incomplete, however, without extending the analysis over time to examine how spatial autocorrelation in district incomes and education varies. The immediate next step is therefore to append PSLM data sets from 2000 till [...]. If the absence of knowledge spillovers persists over the years, we will carry out a political economy analysis of the reasons for regional disparities in education.

Moreover, the detection of significant spatial autocorrelation in income levels across districts calls for a spatial econometric analysis that takes it into account. The presence of clusters and outliers supports the use of the spatial lag model to capture the spillover of income between districts. However, missing data on district incomes or omitted variables could also necessitate the use of a spatial error model (which reflects spatial autocorrelation in measurement errors) in analyzing the effect of inequality on district income levels. The next chapter will consider these issues in detail.
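For reference, the two specifications mentioned above are usually written as follows in the spatial econometrics literature (e.g. Anselin 1988b). The notation is an assumption, since the chapter does not define it: y is the vector of district incomes, X the matrix of explanatory variables, W a spatial weights matrix such as the BC or ID matrix, and rho and lambda the spatial autoregressive parameters.

```latex
% Spatial lag model: the spatially lagged dependent variable Wy enters directly,
% so income in neighboring districts affects a district's own income.
y = \rho W y + X\beta + \varepsilon

% Spatial error model: spatial dependence enters only through the disturbance,
% for example because of omitted variables or measurement error that are
% themselves spatially correlated.
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

In both cases the error term epsilon is assumed to have mean zero and constant variance. A significant rho points to substantive spillovers, while a significant lambda points to spatially correlated unobservables.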
Bibliography

Anselin, Luc (1988b). Spatial Econometrics: Methods and Models. Dordrecht: Kluwer Academic Press.
Anselin, Luc (1995a). SpaceStat: A Software Program for the Analysis of Spatial Data (version 1.80). Morgantown: Regional Research Institute, West Virginia University.
Anselin, Luc (1995b). "Local Indicators of Spatial Association - LISA." Geographical Analysis 27: p. [...]
Anselin, Luc (1996). "The Moran Scatterplot as an ESDA Tool to Assess Local Instability in Spatial Association." In M. Fischer, H. J. Scholten and D. Unwin (eds.), Spatial Analytical Perspectives on GIS. London: Taylor and Francis.
Anselin, Luc (2003a). "Spatial Externalities, Spatial Multipliers and Spatial Econometrics." International Regional Science Review 26: p. [...]
Arbia, Giuseppe (2006). Spatial Econometrics: Statistical Foundations and Applications to Regional Convergence. Berlin, Heidelberg: Springer-Verlag.
Haining, Robert (2003). Spatial Data Analysis: Theory and Practice. Cambridge: Cambridge University Press.
Le Gallo, Julie and Ertur, Cem (2003). "An Exploratory Spatial Data Analysis of European Regional Disparities, [...]." In Bernard Fingleton (ed.), European Regional Growth (Advances in Spatial Sciences). Springer.
Van Oort, Frank G. (2004). Urban Growth and Innovation: Spatially Bounded Externalities in the Netherlands. Aldershot: Ashgate.
Wooldridge, Jeffrey M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.
Appendix A: Measures of local spatial autocorrelation

a) Local spatial autocorrelation using the binary contiguity weights matrix [9]
Moran's Ii (l_wage)
Columns: dist, Ii, E(Ii), sd(Ii), z, p-value*
Districts reported: Lodhran, Karachi, Lahore, Peshawar, Okara, Islamabad, Bahawalnagar, Rawalpindi, Sahiwal, D G Khan, Bahawalpur, Vehari, Layyah, R Y Khan, Rajanpur, Khanewal, Muzaffargarh
*2-tail test

[9] The local Moran statistics have been computed for each of the 81 districts and are available on request. Only the statistics for the main city districts and the statistically significant ones are reported here.
b) Local spatial autocorrelation using the inverse distance matrix
Moran's Ii (l_wage)
Columns: dist, Ii, E(Ii), sd(Ii), z, p-value*
Districts reported: Vehari, Karachi, Lahore, Peshawar, Chakwal, Bahawalnagar, Muzaffargarh, R Y Khan, Bahawalpur, Rajanpur, Islamabad, Rawalpindi
*2-tail test
(c) Local spatial autocorrelation of district education using the inverse distance matrix
Moran's Ii (yrsed main)
Columns: dist, Ii, E(Ii), sd(Ii), z, p-value*
Districts reported: Karachi, Thatta, Jhelum, Abbottabad, Chakwal, Sialkot, Haripur, Rawalpindi, Islamabad

(d) Local spatial autocorrelation of district education using the binary contiguity matrix
Moran's Ii (yrsed main)
Columns: dist, Ii, E(Ii), sd(Ii), z, p-value*
Districts reported: Karachi, Thatta, Jhelum, Chakwal, Rajanpur, Sialkot, Abbottabad, Islamabad, Rawalpindi
*2-tail test
(e) Spatial autocorrelation of district education levels using the ID matrix
[Moran scatterplot (Moran's I = [...]) for yrsed main; axes: z and Wz; points labeled by district]
e) District Map of Pakistan
