Estimating Global Migration Flow Tables Using Place of Birth Data Guy J. Abel Wittgenstein Centre (IIASA, VID/ÖAW, WU) Vienna Institute of Demography/Austrian Academy of Sciences 1 Introduction International migration flow data often lacks adequate measurements of volumes, direction and completeness. These pitfalls limit comparative studies of migration and constrain cross national population projections to use net migration measures or inadequate data. This paper aims to address these issues at a global level, presenting estimates of bilateral flow tables between 191 countries. A methodology to estimate flow tables of migration transitions for the globe is illustrated in two parts. First, a methodology to derive flows from sequential stock tables is developed. Second, the methodology is applied to recently released World Bank migration stock tables between 196 and 2 (Özden et al., 211) estimating a set of four decadal global migration flow tables. The results of the applied methodology are discussed with reference to comparable estimates of global net migration flows of the United Nations and models for international migration flows. The proposed methodology adds to the limited previous literature on linking migration flows to stocks. The estimated flow tables represent a first-of-a-kind set of comparable global origin-destination flow data. 2 Data International migration flow data provided by national statistical offices are not comparable. Tables of bilateral international migration flow data have severe missing data problems and inconsistent definitions, see for example Nowok et al. (26). Efforts to estimate European migration tables of comparable flow data have been partially successful, see for example Abel (21); Beer et al. (21); Raymer et al. (212). However, these methodologies rely on a reasonable percentage of double counted flows, i.e., reported values from both the sending and receiving counties. The application of these estimation methods to global data does not appear to be possible, as the availability of reported migration flows from non-european countries remains scarce. In comparison to flow data, international migrant stock data is both easier to collect and compare. The World Bank recently released a global bilateral foreign born migrant stock database for the last five census rounds (Özden et al., 211). The data is primarily based on place of birth responses to Census questions or details collected from population registers. In order to construct a set of complete bilateral tables, issues of definitions, changes in geography, 1
aggregated data and missing values were addressed by the World Bank. The resulting tables represent the most comparable global data set of past international migration stocks available. 3 Methodology Bilateral migration data is commonly represented in square tables. Values within the table vary, depending on definitions used in data collection or the research question at hand. Values in non-diagonal cells represent some form of movement, for example a migration flow or a foreign born stock between a specified set of R regions or areas. Values in diagonal cells represent some form non-moving population, or those that move within a region, and are often not presented. Table 1: Dummy Example of Place of Birth Migrant Stock Data Place of Birth Data in Stock Tables: Place of Residence (t) Place of Residence (t + 1) Place of Birth A 1 1 1 111 Place of Birth A 95 1 6 111 B 55 555 5 5 665 B 8 55 75 5 665 C 8 4 8 4 96 C 9 3 8 4 96 D 2 25 2 2 265 D 4 45 18 265 Sum 1155 72 88 245 3 Sum 116 68 935 225 3 Place of Birth Data in Flow Tables: Place of Birth=A Place of Birth=B A 1 A 55 B 1 B 555 C 1 C 5 D D 5 Sum 95 1 6 111 Sum 8 55 75 5 665 Place of Birth=C Place of Birth=D A 8 A 2 B 4 B 25 C 8 C 2 D 4 D 2 Sum 9 3 8 4 96 Sum 4 45 18 265 2
Consider two migrant stock tables in consecutive years (t and t + 1) in the top panel of Table 1. The rows represent places of birth from four regions (A to D). The columns represent place of residence, which also range from regions A to D. Hence, non-diagonal entries represent the number of foreign born migrants in each area of residence, whilst diagonal entries contain the number of native born residents. In this hypothetical data there are no births or deaths. This results in two noticeable features in the tables. First, the row totals in each time period remain the same, as the number people born in each region cannot increase or decrease. Second, differences in cells must implicitly be driven solely by migration. These migrations occurs by individuals changing their place of residence (moving across columns), whilst their place of birth (row) characteristic remains fixed. To derive a corresponding set of flows that are constrained to meet the stocks tables, we can alternatively consider the top panel of a Table 1 as a set of R birth place specific migration flow tables where the marginal totals are known, shown in the bottom panel of Table 1. These are formed by considering each row of the two consecutive stock tables as a set of separate margins of a migration flow table. Place of residence totals at time t from the stock data now become origin margin (row) totals for each birth place specific population. Similarly, place of residence totals at time t + 1 from the stock data now become destination margin (column) totals for each birth place specific population. As the row totals from the stock tables were equal, the row and column margins in each of the birth place specific migration flow tables in Table 1 are also equal. The cells in each of these tables may be considered as missing data from a log-linear model, which are commonly used in the estimation of migration flow tables when only marginal totals are known, see for example Willekens (1999). Estimates for the missing non-diagonal cells, constrained to match the marginal totals in the bottom panel of Table 1 can the be obtained by fitting a quasi-independent log-linear model using the contained maximization routine outlined in Abel (212). Rather than being estimated, diagonal cells representing non-movers, are fixed to their maximum values possible, given the known marginal totals. As a result estimated flows represent the minimum number of migration transition flows required to meet the corresponding stock tables. The full estimates of both the fixed diagonal cells, and the estimated non-diagonal cells are shown in the top panel of Table 2. Summing over all birth place dimensions and deleting non-movers in the diagonal elements allows us to obtain a traditional flow table of migrant transitions from origin i to destination j during the time period t to t + 1 in the bottom panel of Table 2. In reality, natural changes from births and deaths in the population occur, causing differences in the row totals between subsequent years. In addition, members of the population that are alive in both time periods, may have not been recorded in one or both periods. These problems can be controlled for using standard demographic accounting methods (see Abel (212) for more details). 3
Table 2: Estimates of Migrant Transition Flow Tables Based on Stock Data in Table 1, with Known Diagonals Estimates of --Place of Birth Flow Tables: Place of Birth=A Place of Birth=B A 95 5 1 A 55 55 B 1 1 B 25 55 25 555 C 1 1 C 5 5 D D 5 5 Sum 95 1 6 111 Sum 8 55 75 5 665 Place of Birth=C Place of Birth=D A 8 8 A 2 2 B 1 3 4 B 25 25 C 8 8 C 1 1 2 D 4 4 D 1 1 18 2 Sum 9 3 8 4 96 Sum 4 45 18 265 Estimates of Total - Flow Table: A B C D Sum A 5 5 B 35 25 6 C 1 1 2 D 1 1 2 Sum 55 2 75 15 4 Results Place of birth data, published by the World Bank were used to provide foreign born migration stock tables at the start of each the last five decade. Using this data, the conditional maximisation routine was run to calculate four 191 191 tables of global migrant transition flows over 1-year periods (see http://goo.gl/b29ld for a spreadsheet of these estimates). Due to the large amount of data estimated, some form of dimension reduction is required in order to asses estimates. In this paper, the estimated migrant transition flows from the applied methodology are discussed with reference to comparable estimates of global net migration flows of the United 4
Nations and models for international migration. Figure 1: Scatter Plot of Estimated Net 1-year Migrant Transition Flow Rates vs Derived UN Rates (per ) 196s QAT 197s Estimated Net Rate per 5 5 4 2 2 PSE AFG NCL BHS CIV BRN GLP BHRPYF AFG ALB BEL AGOARG ARM AUS ABW BRB BGD AUT BLR BEN BOL AZE BTN BIH BDI BWA KHM BGR CMR BRA CAN ISR GMB COM COL TCD CHL COG CHN CPV CRI CAF COD CYP HRV CUB CZE GAB DNK DEU FRA GHA DOM SLV EGY ETH FIN GTM ECU GNB EST GRC GUY MYSKAZ HND KOR HTI ISL KEN HUN GIN FJI PRK LAO ITA IRQ IDN IND IRN JPN KGZ LSO JAM DZA BFA GNQ LBN LBRLBY LUX MTQ LVA MUS MDV MKD MEX MLI MDG MRT MWI FSM LTU MDA MAR MOZ MMR MNG NAM NZL OMN PAN NIC NER NPL NGA NOR PNG SAU LCA STP SWZSWE TKM PRY SVNSLB SVK SDN ROU RUS NLD ANT PHL SOM PAK PRT PER POL REU RWA SGP SLE ZAF ERI SEN MLT WSM SCG ESP SYR LKA TJK CHE TTOTON TUR GBR THA TZA TLS UKR USA UGA TWN SUR URY VUT VEN TGO TUN YEM ZWE VNM ZMB UZB GEO PRI GRD VCT BLZ IRL HKG SOMLBR LBN GUF JOR MAC MKD OMN LUX CZE PYF AUS ABW AUT BGD CMR BEN DZA ARG AGO ARM AZE BHS BLR BWA BEL BTN BFA BOL BRA BDI CAN GAB FSM NCL HRV NLD COG CRI EST TCD CHL CHN CAF DNK CIV COL COD DOM CUB KHM COM BGR ECU FRA GHA FIN DEU GRC LVA ETH GTM GNB HND EGY HUN ISR LBY NER JPN IDN IND ITA ISL ERI ALB GIN HKG IRN KAZ IRQ HTI PRK KEN LTU MLI MRT MDV NPL MOZ MDG KOR IRL GEO BRB CYP SLV FJI KGZ LAO MLT MYS MUS MEX MDA MNG MAR MMR NGA NAM NZL NOR SVN CHE SEN PRT RWA PER PRY PAN POL PNG RUS SVK SLB ESP SWE SYR SDN ZAF PAK PHL SLE LKA TJK TWN SCG SWZ TZA TGO THA TLS TUN TUR GBR UKR USA PSEYEM VUT UZB URY VNM UGA VEN ZMB ZWE TKM JOR STP NICROU LSO PRI VCT JAM LCA TTO ANT BIH BRN BLZ CPV GRD TON WSM GUY SUR DJI GLP KWT SAU MTQ BHR REU DJI GMB SGP MAC MWI GNQ GUF KWT 1 5 MAC HKG KWT AFG ALB AUS BLZETH BIH BGR COG AGO DZA ARG AUT BEL BOL BLR BRA ARM ABW KHM BDI BFA BGD AZE BEN CMR CAN TCD COL CAF CHL CHN COD CZE HRV DEU GTM CRI SLV DOM CUB EGY KAZ ECU DNK FIN FRA PYF GEO FJI HND KOR KGZ GIN GHA HTI HUN ISL IDN CYP GRC IND IRN IRL IRQ ITA LTU KEN JPN ISR LAO NAM LVA PRK BWA ERI BHS CPV JAM LBN LSO MDG MWI MEX MRT MNG MYS MDV MDA MUS MAR MLI NIC MMR MLT NER MOZ NPL NGA PAK TLSTGO YEM UGA PRY ROU PNG PHL RWA POL PER RUS NLD NCL STP SEN SVK SCG SLE PAN SGP ESP NZL ZAF SLB SVN LCALKA SWZ SYR CHE GBR TWN TKM UKR SWE TJK TZA THA SDN USA TTO URY TUN TUR PRI UZB VEN ZWE VNM ZMB VUT PRT GUY FSM REU ANT JOR BRB TON VCT PSE MKD WSM MTQ 2 2 4 6 5 5 1 198s 199s QAT 4 2 QAT SAU BHR BRN GUF GMB COM EST GAB LBY CIV DJI NOR LUX OMN LBR GNB BTN SOM GNQ SUR GRDGLP ARM KWT BIH 2 TON ANT ALB QATMAC GUF BRN SGP PSELUXISR BLZ BHR MLT GAB OMN LSO GMB AUT AUS BGD DZA BHS ARG AGO BLR BWA CAN CYP BTN BEL KHM DNK DJI HKG MUSCIV CRI BDIBGR EGY COM BRB BOL CMR BRA TCD CHL BEN BFA COL COD CHN FRA ITADEU GRC GIN MWI LBY MNG GNB IRN IDN GHA PRK HRV IND JPN KEN ETH ISL HUN CAF CUB CZE PYF FIN ERI GTM DOM ECU IRQ HND HTI IRL JOR LBN MAR MDV KOR LAO MMR NPL MDG MRT NLD PRT RWA SAU ESP SWZ ROUNOR NGA RUS NAM MYS MLI MOZ NZL PAK PNG NER PANNCL SLE LKA PER SEN PHL SDN PRY FSMMDA COG NICPOL PRI SLB SWE CHE TWN GBR USA SYR TLS THA TZA VUT ZWE URY TUN VNM UKR TUR SVK SVN SOM TKM TGO UGA VEN ZMB ZAF YEM UZB SCG MKD GNQ TJK MEX SLV FJI AZE REULBR KGZ GEO KAZ GRD EST LVA LTU LCA TTO SUR JAM CPV WSM STP VCT GUY MTQ GLP AFG ABW 4 2 2 4 3 2 1 1 2 3 United Nations Net Rate per A scatter plot comparing the estimated net rates (on the y-axis) with the derived United Nations (United Nations Population Division, 211) net rate (x-axis) is shown in Figure 1 from each decade. Noticeable is the general linear trend along the x = y line, indicating a broad conformity of the estimated with the derived UN net rates. Noticeable is the general linear trend along the x = y line, indicating a broad conformity of the estimated with the derived UN net rates. This pattern is confirmed in separate regressions for each decade of the estimated rate on the derived United Nations rate and an intercept, discussed in more detail, alongside 5
possible explanations for the outlier values in Abel (212). Kim and Cohen (21) investigated non-economic predictors of reported international migration bilateral flows. Using a log normal regression model, geographic, demographic and social and historical determinants were estimated twice, once using migration flow data by origin into 17 destinations countries and using migration flow data by destinations from 13 origin countries from data provided by the United Nations United Nations Population Division (29). Comparison of the parameter values from the same model fitted to the estimated flows and the parameter values from Kim and Cohen (21) show some broad similarities in direction and size. Discrepancies (fully explored in Abel (212)) are predominately explained by the march larger pool of flow data available from the estimates, which additionally includes migration between traditionally less developed countries. 5 Conclusion Comparable international migration flow data are needed by researchers to better understand people s movements and identify patterns. Policy makers can also use comparable international migration flow data to help forecast populations better, where migration can often play an important role. The methodology outlined in this paper provides a relatively simple yet powerful technique to estimate global migration flow tables, exploiting newly available global stock data. References Abel, G. J. (21). Estimation of international migration flow tables in Europe. Journal of the Royal Statistical Society: Series A (Statistics in Society) 173 (4), 797 825. Abel, G. J. (212). Estimating Global Migration Flow Tables Using Place of Birth. Vienna Institute of Demography Working Paper (1/212). Beer, J., J. Raymer, R. van der Erf, and L. van Wissen (21, November). Overcoming the Problems of Inconsistent International Migration data: A New Method Applied to Flows in Europe. European Journal of Population/Revue européenne de Démographie 26 (4), 459 481. Kim, K. and J. E. Cohen (21). Determinants of International Migration Flows to and from Industrialized Countries: A Panel Data Approach Beyond Gravity1. International Migration Review 44 (4), 899 932. Nowok, B., D. Kupiszewska, and M. Poulain (26). Statistics on International Migration Flows. In M. Poulain, N. Perrin, and A. Singleton (Eds.), Towards the Harmonisation of European Statistics on International Migration (THESIM), Chapter 8, pp. 23 233. Louvain-La-Neuve, Belguim: UCL Presses Universitaires de Louvain. Özden, c., C. R. Parsons, M. Schiff, and T. L. Walmsley (211, March). Where on Earth is Everybody? The Evolution of Global Bilateral Migration 1962. World Bank Economic Review, 12 56. 6
Raymer, J., G. J. Abel, and P. W. F. Smith (27, October). Combining census and registration data to estimate detailed elderly migration flows in England and Wales. Journal of the Royal Statistical Society: Series A (Statistics in Society) 17 (4), 891 98. Raymer, J., J. J. Forster, P. W. F. Smith, J. Bijak, and A. Wiśniowski (212). Integrated Modelling of European Migration: Background, Specification and Results. NORFACE Migration Discussion Paper (4). United Nations Population Division (29). International Migration Flows to and from Selected Countries: The 28 Revision. United Nations Population Division (211). World Population Prospects: The 21 Revision, Highlights and Advance Tables. Working Paper ESA/P/WP (22). Willekens, F. (1999). Modeling Approaches to the Indirect Estimation of Migration Flows: From Entropy to EM. Mathematical Population Studies 7 (3), 239 78. 7