Language as Identity: Dialectal Distance and Labor Migration in China




Yuyun Liu, Xianxiang Xu
Sun Yat-sen University
Jan 3, 06

Abstract

This paper explores how dialectal (linguistic) distance affects labor migration. Using a special sample of labor migrants in China, who speak both a native dialect and Putonghua (the standard pronunciation of Chinese), we identify the true reason that dialectal distance deters migration: a weaker feeling of identification with destinations that are dialectally further away, rather than the higher cost of learning the destination dialect to remove communication obstacles. Once dialectal distance is large enough to cross dialectal super-groups, it reduces migration probability by about 0%. We also find a positive effect of dialectal distance, which comes from the diversified complementary skills and ways of thinking brought in by migrants from dialectally further cities; its magnitude is about 30% when dialectal distance increases by one level within a dialectal super-group. Finally, the deterrent from the identification effect becomes stronger if a worker has a stronger social network, because the network raises the opportunity cost of migration. These findings are robust.

Keywords: Labor Migration; Chinese Dialectal Distance; Identification Effect; Complementary Effect

JEL Classification Numbers: J60, J6, Z0

1 Introduction

This paper explores how dialectal (linguistic) distance affects labor migration. Recent studies show that migration rates increase with linguistic proximity and with a migrant's linguistic competence in the destination country (Adsera and Pytlikova, 2015): if a migrant's native language is linguistically closer to the destination language, it is easier for him to learn (Chiswick and Miller, 2005; Isphording and Otten, 0), and linguistic competence is strongly associated with a migrant's earnings and social outcomes in the destination (Bleakley and Chin, 2004, 2010). Linguistic distance could therefore reduce labor migration through the mechanism of linguistic communication.

However, the negative relationship between linguistic distance and migration rate could also be explained by another mechanism, the identification effect. Research in social psychology shows that contact between similar people occurs at a higher rate than among dissimilar people (McPherson et al., 2001). Language, as an important dimension of ethnic identity and membership (Pendakur and Pendakur, 2002), could lead to more migration between two countries that are linguistically closer because of perceived similarities (Huston and Levinger, 1978), that is, a feeling of identification. Pendakur and Pendakur (2002) found that, conditional on knowledge of a majority language, knowledge of a minority language is associated with lower earnings, and that this could be attributed to ethnicity operating either through the immigrant working in an ethno-linguistic labor market enclave or through labor market discrimination against minorities. Which mechanism, then, explains why linguistic distance inhibits labor migration: linguistic communication or the identification effect? By analogy, when a US company rejects a foreigner's job application, is the true reason limited English skill because he comes from an Asian country, or simply that he comes from an Asian country?

It is hard to distinguish the identification effect from the linguistic communication effect. If one country is linguistically closer to another, both effects are positive: it is easier for people to learn each other's language and so remove communication obstacles to migration, and the closer language leads to more identification with the other country because of perceived similarities in language, which could attract more labor migration between the two countries. Even if we could measure an individual's linguistic competence in the destination country, it would still be hard to say whether high competence is due to a smaller linguistic distance between the destination and source countries, or to stronger identification with a destination country that is linguistically closer to the source country.

In this paper, we deal with this problem by using a special sample of labor migration in China. Chinese people speak both their native dialect and Putonghua, the standard pronunciation of Chinese. Each dialect has its own pronunciation and historical origin, which generates both linguistic obstacles among dialects and distinct dialectal identities. Dialect is an indirect indicator of where one's ancestors came from, and so is an important dimension of identity in China. Putonghua, however, can eliminate oral communication obstacles among dialects, and thus helps us to distinguish the identification effect from the linguistic communication effect. Moreover, although dialects differ in pronunciation, Chinese characters are shared no matter which dialect one speaks, which rules out communication obstacles in writing. Using the sample of labor migration in China to study the influence of dialectal distance therefore helps us to separate the two effects.

This paper measures dialectal distance following current studies (Spolaore and Wacziarg, 2009; Adsera and Pytlikova, 2015) by counting the steps to the common node on the Chinese dialectal family tree from the Language Atlas of China (1987). Measurement along the dialectal family tree gives the pair-wise dialectal distance of counties. Using population shares in 1982, we then calculate the population-share-weighted pair-wise dialectal distance of 78 prefectures, which can be matched with the birthplace (source) and destination of labor migration. We measure labor migration by a discrete variable: whether an individual has ever migrated from his birthplace to another prefecture or province and resided there for more than 6 months. We identify the birthplace prefecture and destination prefecture of each migrant's first-time migration in the China Labor-force Dynamic Survey 2012 (CLDS) hosted by the Center for Social Survey at Sun Yat-Sen University. By matching these with dialectal distance, we construct a micro-dataset of labor migration across dialects in China.

Two points about this dataset need explanation. First, we measure the dialectal distance between a migrant's birthplace and destination, rather than between the migrant himself and his destination (that is, his linguistic competence in the destination). We do so to rule out two sources of endogeneity: a migrant may learn the destination's language on his own initiative in order to migrate there, and linguistic competence could be correlated with unobserved ability in the residual. Second, we focus on each migrant's first-time migration, because this helps us to exclude the impact of other dialects. In addition, first-time migration implies that the source prefecture is the birthplace prefecture.

Even though dialectal distance could deter the supply of migrants through the identification and linguistic communication effects, it could on the other hand raise the demand for migrants through a complementary effect. Migrants from dialectally further cities could bring in diversified complementary skills and ways of thinking that are beneficial in certain production processes (Alesina and La Ferrara, 2005), so firms demand and attract migrants from dialectally further cities. Hong and Page (2001) show theoretically that a team with diversified cognition but limited skills will perform better than a homogeneous team with higher skills. Hambrick et al. (1996) find that heterogeneous teams could gain more market power and profit, although they react more slowly than a homogeneous competitor. Trax et al. (2012) use a sample of German firms and find that the birthplace diversity of foreign workers has a positive effect on firm performance. With samples of Australian firms and workers, Böheim et al. (2012) use an IV approach and find that the birthplace diversity of workers raises their wages. Alesina et al. (2013) also show a positive relationship between birthplace diversity and economic development.

Results Summary

With both positive and negative effects at work, we empirically estimate the impact on labor migration by introducing dialectal distance as well as its square term, and to distinguish the identification effect from the linguistic communication effect we also control for individuals' Putonghua skill.

The results show a robust inverted-U relationship between labor migration and dialectal distance after controlling for Putonghua skill and other variables, including individual characteristics, the average GDP level of destination and birthplace, provincial dummies, and geographic factors. The relationship also holds when we take measurement error into consideration, estimate under various estimation methods, and exclude reverse causality and problems in the econometric specification. Two conclusions follow. First, dialectal distance promotes migration as long as two prefectures are within the same dialectal super-group; beyond that, it begins to inhibit migration. Second, the inhibition comes from the identification effect rather than the linguistic communication effect: individuals tend not to migrate across dialectal super-groups not because they do not understand the dialect there, but because they identify less with cities outside their own dialectal super-group.

To explain how the inverted-U pattern is formed, we introduce both the identification effect and the complementary effect into a simple model of a worker's migration decision. Assuming that individuals are risk averse and that the marginal effect of complementarity declines, the inverted-U pattern can be derived. We also test whether workers from different dialects have different skills, as support for the complementary effect. Using average years of education as an indicator of skills, the fixed effects of dialects are jointly significant.

Further findings emerge when we compare the effects of dialectal distance across samples. The identification and complementary effects are both stronger for urban workers than for rural workers. If a worker has a stronger social network, the deterrent from the identification effect becomes stronger, because the network raises the opportunity cost of migration. Over time, the identification effect becomes weaker and the complementary effect stronger.

Related Literature

Several strands of research are related to this paper. The first is the role of language in labor migration. This paper distinguishes the identification effect from the linguistic communication effect in explaining the negative relationship between linguistic distance and migration rate (Adsera and Pytlikova, 2015).
Current studies emphasize that if a country is linguistically further from another, it is harder for people to learn each other's language (Chiswick and Miller, 2005; Isphording and Otten, 0), which deters labor migration between them. This paper shows that the true reason is weaker identification, at least in the sample of labor migration in China (Pendakur and Pendakur, 2002). This paper also uncovers the positive complementary effect: migrants from dialectally further cities could bring in diversified complementary skills and ways of thinking that are beneficial in certain production processes (Alesina and La Ferrara, 2005), so firms demand and attract migrants from dialectally further cities.

Second, this paper is related to the effect of the relatedness of human traits, and asks at what point the relatedness of human traits begins to inhibit social interaction. Biologically, genetic distance from the technology frontier is associated with economic development and the diffusion of technology (Spolaore and Wacziarg, 2009, 2013, 2014). Culturally, trust affects economic development (Knack and Keefer, 1997) and individual performance (Butler et al., 2014); cultural biases, arising from different religions, a history of conflict, and genetic distance, harm bilateral trust and trade, portfolio investment, and direct investment between two countries (Guiso et al., 2009); and international migration rates increase with linguistic proximity (Adsera and Pytlikova, 2015). This paper finds that dialect, as an indirect measure of where one's ancestors came from hundreds or thousands of years ago, still has an impact on labor migration today.

Third, this paper contributes a new explanation, dialectal distance, for labor migration in China. Individual characteristics, such as primary education, are associated with the decision to migrate (Zhao, 1997). Unbalanced economic development is the main reason for labor migration in China (Cheng, 2006).
Institutions also serve as a determinant: the land allotment system provides security for the rural labor force to go out to work (Li, 2007), and the reform of the household registration system encourages labor migration in the long run (Sun, 0). Social networks and clan networks are also helpful for labor migration. After controlling for these factors, this paper robustly finds an effect of dialectal distance.

This paper is closest to Chen et al. (2014) and Adsera and Pytlikova (2015). The former study how migrants' skill in the Shanghai dialect (Wu) affects their income, and use the dialectal distance from a migrant's birthplace to Shanghai as an instrumental variable for that skill. Unlike Chen et al. (2014), our paper measures pair-wise dialectal distances across China and explores their effect on labor migration. The latter study the role of linguistic proximity, as well as widely spoken languages, linguistic enclaves and language-based immigration policy requirements, in international migration. Rather than emphasizing only the communication function of language (Adsera and Pytlikova, 2015), this paper distinguishes the identification effect from the linguistic communication effect.

Paper Structure

The next section introduces the background of Chinese dialects and Putonghua, which shows why we can distinguish the identification effect from the communication effect. Section 3 presents the econometric strategy and data. Section 4 reports empirical results on how dialectal distance affects migration. Section 5 explains how the inverted-U pattern is formed. Section 6 compares results across samples. Section 7 summarizes and discusses avenues for future research.

2 Background: Dialects and Putonghua in China

2.1 Why is dialect an important identity?

The historical formation of Chinese dialects shows why dialect is an important dimension of identification in China.

Figure 1 is here

Ancient China could be roughly divided into three parts: the northern nomadic region (region 1 in Figure 1), the central plains (region 2 in Figure 1), and South China (the remaining white areas in Figure 1).
The northern nomadic region was sparsely populated, mainly by non-Han ethnic groups (nomads); the central plains were settled by the Han ethnic group; and South China, which lies south of the Qinling Mountains, was settled by southern minorities. Climate shocks in the northern nomadic region, especially drought, led to famine among the nomads and drove them across the boundary that separated them from the Han ethnic group, the Great Wall, to survive by raiding the people of the central plains (Bai and Kung, 2011). Throughout Chinese history, wars between the Han ethnic group and the northern nomads were therefore common, and the nomads even took control of the central plains in some dynasties, such as Qing-China (80).

These wars led large Han families to migrate from the central plains to South China, bringing the language of the central plains with them. South China is mountainous and rugged, so it could protect them from wars. Once they had migrated into a city, the large Han families, who were more developed and more numerous, usually dominated and soon became the new hosts, and their language became dominant too. The natives, less able to resist, had to compromise and learn the new language. The language brought from the central plains was thus preserved and evolved slowly in South China. In the central plains and the northern nomadic region, by contrast, frequent wars led to integration and sped up the evolution of the language of the central plains. So each time large Han families migrated to South China, they brought a newly evolved language that was quite different from the previous one. Moreover, the mountainous landscape isolated the languages from one another geographically, which helped maintain their variety.

These languages have become the different dialects of South China today. For example, the Yue dialect (the Chinese spoken in Hong Kong) formed from the language brought in during the Northern and Southern Dynasties (386-589 AD), and the Hakka dialect evolved from the language of the central plains in the Song Dynasty (960-1279 AD), according to Zhou (1997, 2006). The formation of dialects in South China was thus driven by the historical migration of large Han families in different dynasties. In the northern nomadic region and the central plains, by contrast, thousands of years of integration resulted in a unified language, Mandarin (Figure 1). The historical formation of dialects means that the dialect you speak partly shows where your ancestors came from, which is why dialect serves as an important dimension of identification in China.

2.2 The promotion of Putonghua

The promotion of Putonghua has unified the standard pronunciation of Chinese among most Chinese people, and helps to eliminate the linguistic obstacles among dialects. It began with the establishment of the Chinese Language Reform Association in 1949, which was in charge of normalizing the pronunciation of Putonghua, Pinyin. In 1958, the Scheme of the Chinese Phonetic Alphabet (Pinyin) was authorized by the fifth session of the first National People's Congress. Since 1982 the promotion of Putonghua has been written into the Constitution of the People's Republic of China: Putonghua is required in schools, media, and other public institutions; children learn Pinyin in primary school; learning Putonghua is a right of Chinese citizens; and learning Putonghua is also encouraged in minority areas.
By 2000, more than 50% of the population could speak Putonghua fluently, and by 00 the share exceeded 70%. The usage rates of Putonghua on different occasions are about 0% at home, 30% in markets, 50% in hospitals, 60% in government offices and 50% at work. The promotion of Putonghua has thus made Putonghua widely used (Xie, 0), and plays a key role in eliminating linguistic obstacles among dialects.

3 Econometric Strategy and Data

3.1 Econometric Strategy

Equation (1) shows our econometric strategy for studying how dialectal distance affects whether individual $i$ migrates. $M_{i,m,n} = 1$ stands for migration across prefectures (or above) from prefecture $m$ to prefecture $n$, $m \neq n$; otherwise $M_{i,m,n} = 0$ and $m = n$. $P[\cdot]$ is the probability. $d_{m,n}$ is the dialectal distance between prefectures $m$ and $n$, and $d_{m,n}^2$ is its square term. $Putonghua_i$ are variables indicating $i$'s Putonghua skill. $Z_i$ is a series of individual characteristics including gender, age, political status, education and education of parents, household registration status, industry category and firm category. $P_m$ and $P_n$ are characteristics of prefectures $m$ and $n$ respectively, including the average GDP level and dummy variables for the province each belongs to. Which probability model to choose depends on the distribution of residuals, $F(\cdot)$; we estimate a Probit model, which is one of the most widely used, as well as a Logit model and a Linear Probability Model (LPM).

$$P[M_{i,m,n} = 1 \mid d_{m,n}, Z_i, P_m, P_n] = F(d_{m,n}, d_{m,n}^2, Putonghua_i, Z_i, P_m, P_n) \qquad (1)$$

Footnote 1: We measure migration at prefecture level rather than a finer one, say county level, because prefecture is the finest level at which we can identify an individual's birthplace and destination most accurately.

To distinguish the identification effect from the linguistic communication effect, we estimate the impact of dialectal distance on individuals' migration decisions while controlling for their Putonghua skill. The logic is as follows. Widely used Putonghua can eliminate the linguistic obstacles among dialects. If dialectal distance deters labor migration through the linguistic communication effect, then as an individual's Putonghua skill rises, the linguistic obstacle should shrink or disappear, and the negative effect should lose significance or its coefficient should weaken. The same logic applies to another indicator, the distance between an individual's native dialect and Putonghua: if the dialect of one's birthplace is closer to Putonghua, the linguistic obstacle is also reduced. If we still see a significant negative effect after controlling for these two aspects, it should be attributed to the identification effect rather than the linguistic communication effect.

We also introduce the square term of dialectal distance to identify the complementary effect. Dialectal distance could deter labor migration through the identification effect and promote it through the complementary effect, so including both dialectal distance and its square term helps us to identify the two opposing effects.

3.2 Data of dialectal distance

We measure dialectal distance in three steps. First, we draw the Chinese dialectal family tree according to the classification in the Language Atlas of China. Chinese dialects can be classified into 0 super-groups at the coarsest level, 0 groups, and 05 sub-groups at the finest level, as Figure 2 shows.
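As an aside on equation (1) of Section 3.1, the role of the square term can be made concrete with a minimal sketch. It assumes a Probit link and a linear index in distance, squared distance and one Putonghua indicator; the controls $Z_i$, $P_m$ and $P_n$ are omitted for brevity, and every coefficient below is a hypothetical value chosen only to produce an inverted U, not an estimate from the paper.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, the link F(.) of a Probit model."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def migration_probability(d: float, putonghua: float, beta: dict) -> float:
    """P[M = 1 | d, Putonghua] with a Probit index in d and d squared."""
    index = (beta["const"]
             + beta["d"] * d
             + beta["d_sq"] * d ** 2
             + beta["putonghua"] * putonghua)
    return normal_cdf(index)


def turning_point(beta: dict) -> float:
    """Distance at which the index peaks (requires beta['d_sq'] < 0)."""
    return -beta["d"] / (2.0 * beta["d_sq"])


# Hypothetical coefficients, for illustration only.
beta = {"const": -1.5, "d": 0.6, "d_sq": -0.15, "putonghua": 0.1}

for d in (0.0, 1.0, 2.0, 3.0):
    print(d, round(migration_probability(d, putonghua=1.0, beta=beta), 3))
print("turning point:", turning_point(beta))
```

With a positive coefficient on distance and a negative coefficient on its square, the predicted probability first rises and then falls in distance; the turning point is where the two opposing effects balance.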
Figure 2 is here

Second, by counting the steps to the common node on the family tree, we define the pair-wise dialectal distance of counties. Each county can be matched uniquely with a sub-group according to the Dictionary of Chinese Dialect, so we define the dialectal distance of county pairs along the family tree with the following rules. If two counties belong to the same sub-group, it takes 0 steps to reach the common dialectal sub-group, so we define the dialectal distance to be 0; if they belong to the same group but different sub-groups, it takes 1 step to reach the common dialectal group, so we define it as 1; if they belong to the same super-group but different groups, it takes 2 steps and the distance is defined as 2; if they belong to different super-groups, the distance is 3.

Third, we calculate dialectal distance at prefecture level by weighting the dialectal distances of county pairs with population shares. We use the population share of each county relative to its prefecture in 1982, the latest census year before the survey of the Dictionary of Chinese Dialect, to mitigate endogeneity caused by the impact of migration on population shares. Specifically, the population-share-weighted dialectal distance of two prefectures $m$ and $n$, $d(m,n)$, is calculated as follows, consistent with current studies (Spolaore and Wacziarg, 2009):

$$d(m,n) = \sum_{a=1}^{I} \sum_{b=1}^{J} S\_m_a \times S\_n_b \times d_{a,b} \qquad (2)$$

where $S\_m_a$ is the population share of county $a$ in prefecture $m$, $S\_n_b$ is the population share of county $b$ in prefecture $n$, and $d_{a,b}$ is the dialectal distance between county $a$ and county $b$. $S\_m_a$ thus represents the probability that an individual $i$ from prefecture $m$ was born in county $a$, and $S\_n_b$ represents the probability that an individual $j$ in prefecture $n$, met by individual $i$, was born in county $b$ of prefecture $n$. As a result, equation (2) is a probability-weighted formula whose economic intuition is the expected dialectal distance when an arbitrary individual $i$ from prefecture $m$ meets an arbitrary individual $j$ in prefecture $n$; it reflects $i$'s prejudgment of the dialectal distance between two cities.

Figure 3 is here

We take Figure 3 as an example of how to calculate the dialectal distance between two prefectures. Prefecture $m$ and prefecture $n$ are each made up of two counties, $a_1$, $a_2$ and $b_1$, $b_2$, whose population shares are 0.3, 0.7 and 0.4, 0.6 respectively. The dialectal distance between $a_1$ and $b_1$ is 1, between $a_1$ and $b_2$ is 2, between $a_2$ and $b_1$ is 0, and between $a_2$ and $b_2$ is 3. By equation (2), the distance between prefecture $m$ and prefecture $n$ is 1.74. Using this method, we have measured the pair-wise Chinese dialectal distance of 78 Han Chinese speaking prefectures in China.

Table I is here

In Table I, we show the pair-wise dialectal distance of 4 municipalities and first-tier cities, from which some features emerge.
1) Symmetry. The dialectal distance of prefecture $m$ and prefecture $n$ is equal to that of prefecture $n$ and prefecture $m$.
2) The diagonals are not necessarily zero. The dialectal distance between a prefecture and itself is zero only if all the counties in the prefecture belong to the same sub-group.
3) Continuity. Population weighting has transformed discontinuous dialectal distances into continuous ones.

3.3 Data of labor migration

To measure whether a worker has ever migrated, we use micro survey data from the China Labor-force Dynamic Survey 2012 (CLDS) by the Center for Social Survey at Sun Yat-Sen University.
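Returning briefly to the distance measure of Section 3.2, the step-counting rule and the population-share weighting can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the tuple encoding of a county's classification as (super-group, group, sub-group), and the placeholder labels are assumptions. The example uses the Figure 3 shares (0.3, 0.7 and 0.4, 0.6) with county-pair distances 1, 2, 0 and 3, the assignment that reproduces the paper's worked arithmetic.

```python
def county_distance(x, y):
    """Steps to the common node for two counties.

    x and y are (super_group, group, sub_group) labels, read hierarchically.
    """
    if x[0] != y[0]:
        return 3  # different super-groups
    if x[1] != y[1]:
        return 2  # same super-group, different groups
    if x[2] != y[2]:
        return 1  # same group, different sub-groups
    return 0      # same sub-group


def prefecture_distance(shares_m, shares_n, dist):
    """Population-share-weighted distance between two prefectures.

    shares_m, shares_n: dicts mapping county id to its population share.
    dist: function (county_a, county_b) returning the county-pair distance.
    """
    return sum(s_a * s_b * dist(a, b)
               for a, s_a in shares_m.items()
               for b, s_b in shares_n.items())


# Figure 3 example: distances supplied directly as a lookup table.
shares_m = {"a1": 0.3, "a2": 0.7}
shares_n = {"b1": 0.4, "b2": 0.6}
pair = {("a1", "b1"): 1, ("a1", "b2"): 2, ("a2", "b1"): 0, ("a2", "b2"): 3}
d_mn = prefecture_distance(shares_m, shares_n, lambda a, b: pair[(a, b)])
print(round(d_mn, 2))  # 0.12 + 0.36 + 0 + 1.26

# A prefecture's distance to itself is non-zero when its counties differ.
labels = {"x": ("SG", "G", "S1"), "y": ("SG", "G", "S2")}  # placeholder labels
own = {"x": 0.5, "y": 0.5}
d_own = prefecture_distance(
    own, own, lambda a, b: county_distance(labels[a], labels[b]))
print(d_own)
```

The second example illustrates why the diagonal of the prefecture distance table need not be zero: two equally sized counties in different sub-groups of the same group give a within-prefecture distance of 0.5.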
According to the survey report [3], the sample is built on interviewees' birthplaces rather than their current locations, which helps to reduce sample selection problems. The survey uses stratified random sampling and covers 9 provinces and 8 counties in China, with population weights corresponding to the Sixth National Population Census of the People's Republic of China.

In this paper we focus on workers' first-time migration, because it is more exogenous and cleaner. Before an individual's first migration, he has, first, no prior migration experience, so it is relatively unlikely that he learned the destination's dialect in order to migrate there, which supports exogeneity; second, he has had no contact with dialects outside his birthplace, which excludes interference from other dialects and helps us identify the effect of the dialectal distance between birthplace and destination more cleanly.

How do we measure migration using CLDS? There are 16,253 observations in CLDS, 2,386 of whom have migrated to another county (or beyond) and resided there for more than 6 months [4]; the remaining 13,867 have never been to other cities or resided there for less than 6 months. Because CLDS identifies the source and destination of migration at the prefecture level at the finest, we measure labor migration at the prefecture level: if an individual migrated across prefectures, $M_{i,m,n} = 1$; otherwise $M_{i,m,n} = 0$.

[2] 0.3*0.4*1 + 0.3*0.6*2 + 0.7*0.4*0 + 0.7*0.6*3 = 1.74
[3] Center for Social Survey in Sun Yat-Sen University, China Labor-force Dynamic Survey: 2013 report, Social Sciences Academic Press (2013).
[4] The percentage of migrated individuals in the whole sample is 14.7%, quite close to the 18.5% reported by the Statistical Communiqué of the People's Republic of China on the 2014 National Economic and Social Development.

We match the birthplace and destination of each worker's first migration with the pairwise dialectal distances of the 78 prefectures. We first clean the first-migration data to guard against recording errors. After cleaning, 1,774 of the 2,386 migrants can be matched with a dialectal distance. The rest are dropped because of recording errors or because people in either the birthplace or the destination speak minority languages rather than Han Chinese dialects. Among the 1,774 matched migrants, 1,555 migrated across prefectures and the remaining 219 migrated across counties within a prefecture. Accordingly, the dialectal distance for an individual who migrated from his birthplace to another prefecture is the distance between the two prefectures; for an individual who migrated across counties within his birthplace prefecture, it is the distance between the prefecture and itself; and for those who did not migrate it is 0. In the end we use 15,641 (= 13,867 + 1,555 + 219) observations in the empirical analysis.

Figure 4 is here

To check whether data cleaning and matching generate sample bias, we compare the distribution across provinces of the migrant sample (1,774) and of the whole sample (16,253). In Figure 4, black bars represent the percentage of interviewees from each province in the whole sample, and white bars represent the percentage of migrants from each province in the migrant sample; the two distributions almost coincide. A t-test of the difference in the two sample means does not reject equal means (the p value is 1). So data cleaning and matching did not cause serious sample bias.
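The matching rule above, together with the population-weighted prefecture distance of the Figure 3 example, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the status labels, and the within-prefecture distance between $a_1$ and $a_2$ (set to 1) are assumptions made for the example; only the Figure 3 shares and cross-prefecture distances come from the text.

```python
from itertools import product


def prefecture_distance(shares_m, shares_n, county_distance):
    """Expected dialectal distance when a random person from prefecture m
    meets a random person from prefecture n (sum of S_a * S_b * d_ab)."""
    return sum(
        s_a * s_b * county_distance[(a, b)]
        for (a, s_a), (b, s_b) in product(shares_m.items(), shares_n.items())
    )


# Figure 3 example: population shares and county-pair distances from the text.
shares_m = {"a1": 0.3, "a2": 0.7}
shares_n = {"b1": 0.4, "b2": 0.6}
d_between = {("a1", "b1"): 1, ("a1", "b2"): 2, ("a2", "b1"): 0, ("a2", "b2"): 3}
d_mn = prefecture_distance(shares_m, shares_n, d_between)

# Distance between prefecture m and itself. The a1-a2 distance of 1 is a
# hypothetical value (the text does not give it); it shows why the diagonal
# of Table I need not be zero.
d_within_m = {("a1", "a1"): 0, ("a1", "a2"): 1, ("a2", "a1"): 1, ("a2", "a2"): 0}
d_mm = prefecture_distance(shares_m, shares_m, d_within_m)


def assigned_distance(status, d_cross, d_self):
    """Dialectal distance attached to an observation by migration status.
    The status labels are hypothetical names for the three cases in the text."""
    if status == "across_prefectures":
        return d_cross
    if status == "within_prefecture":
        return d_self
    return 0.0  # never migrated


print(round(d_mn, 2), round(d_mm, 2))
```

With the Figure 3 numbers the cross-prefecture distance reproduces the 1.74 of the worked example, and the function is symmetric in its two prefecture arguments as long as the county-level distances are.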
Figure 5 is here

We also explore the temporal distribution of the migrant sample, shown in Figure 5. Overall, during 1936-2012 the number of migrants grows, remaining at a high level especially after 2000, which is consistent with the yearly variation in migrants reported by the National Population Census (Duan et al., 2008).

3.4 Descriptive Statistics of Labor Migration across Dialects

First, we look at the distribution of labor migration within and across dialects at the dialectal super-group, group and sub-group levels respectively (Figure 6). At the sub-group level, more people migrate across dialects than within a dialect, which suggests that dialectal distance promotes migration across dialects when it is relatively small. At the group level, there is little difference between migration across and within dialects. At the super-group level, the number of migrants within a dialect is much larger than the number across dialects, which suggests that dialectal distance hinders migration across dialects when it is relatively large.

Second, we look into each dialectal super-group to see the percentage of workers who migrate within it (diagonal entries) and from it to another super-group (Table II). Rows show the super-group that migrants come from, columns show the super-group they migrate to, the items in each row from column 1 to column 9 are relative percentages, and column 10 is their sum. In Mandarin, 68% of migrants move within the super-group, and none of the shares going to other super-groups exceeds 0%. In the other super-groups, except for Hui and Xiang, whose populations account for only 0.5% and .8% of the whole population, the diagonal entries are always larger than the off-diagonal ones. So Table II shows that, at the super-group level, dialectal distance hinders labor migration across dialects.

Third, we examine, for each province, what share of its migrants move to another province in the same dialectal super-group (Figure 7). Percentages are calculated roughly by matching each province with its dominant dialectal super-group. As Figure 7 shows, in most provinces more than 50% of migrants move to provinces in the same super-group, which again shows the inhibiting role of dialectal distance at the super-group level.

In sum, the descriptive statistics reveal two features of labor migration across dialects. First, dialectal distance facilitates migration across dialects when it is relatively small; second, at the super-group level, it inhibits migration across dialects.

4 Empirical Result

4.1 Basic Result

We estimate equation () with a Probit model to test the influence of dialectal distance on labor migration across prefectures.
As mentioned above, the estimation sample includes 15,641 interviewees. The dependent variable is a binary variable equal to 1 for migration across prefectures and 0 otherwise. The independent variables of interest are dialectal distance, its square term, and Putonghua skill. Control variables are individual characteristics (gender, age, political status, education, parents' education and household registration status), industry and firm characteristics, the average GDP level of birthplace and destination, and dummy variables for the provinces of birthplace and destination respectively. Standard errors are clustered at the provincial level. Descriptive statistics and data sources for all variables are in Table III.

In Table IV, columns 1 to 3 gradually introduce dialectal distance and its square term, individuals' Putonghua speaking skill, and the distance between Putonghua and the dialect of the individual's birthplace. The coefficient on dialectal distance is significantly positive and that on its square term is significantly negative at the 1% level, both before and after controlling for Putonghua. This indicates an inverted-U relationship between dialectal distance and the probability of migration: as dialectal distance increases, the migration probability first increases and then decreases. Taking the derivative of the migration probability with respect to d, we find that on average the positive marginal effect is around 0% and the negative marginal effect is about %. We also calculate the turning point of dialectal distance, which is around .3 and lies within a dialectal super-group and between dialectal groups.
Intuitively, when dialectal distance is relatively small, say within a dialectal super-group, a one-level increase raises the migration probability by 0% through the complementary effect; when dialectal distance becomes larger, say across dialectal super-groups, a one-level increase lowers the migration probability by %, and this hindrance comes from the identification effect rather than a linguistic communication effect, since Putonghua skill is controlled for. In columns 4 to 6 we gradually add control variables: individual characteristics, the average GDP level of birthplace and destination, and provincial dummies. Column 7 is estimated with a Logit model and column 8 with a Linear Probability Model (LPM). The results show a robust inverted-U pattern and a significant negative effect after controlling for Putonghua.
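How a Probit specification with a linear and a squared distance term produces an inverted U, a turning point and sign-changing marginal effects can be sketched as below. The coefficients are hypothetical placeholders, not the estimates in Table IV, and the index omits Putonghua skill and all other controls for brevity.

```python
import math


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def migration_probability(d, b0, b1, b2):
    """Probit probability of migrating, with index b0 + b1*d + b2*d^2."""
    return normal_cdf(b0 + b1 * d + b2 * d * d)


def marginal_effect(d, b0, b1, b2):
    """dP/dd = phi(index) * (b1 + 2*b2*d)."""
    index = b0 + b1 * d + b2 * d * d
    return normal_pdf(index) * (b1 + 2.0 * b2 * d)


def turning_point(b1, b2):
    """Distance at which the index, and hence the probability, peaks (b2 < 0)."""
    return -b1 / (2.0 * b2)


# Hypothetical coefficients chosen only to illustrate the shape.
B0, B1, B2 = -1.5, 0.9, -0.3
d_star = turning_point(B1, B2)

for d in (0.5, d_star, 2.5):
    print(d, migration_probability(d, B0, B1, B2), marginal_effect(d, B0, B1, B2))
```

Because the normal CDF is increasing, the probability peaks where the quadratic index peaks, so the turning point depends only on the two distance coefficients; the size of the marginal effect, by contrast, also depends on where the index is evaluated.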

The coefficients on the control variables are consistent with existing studies. The coefficients on education are significantly positive, showing that individuals with more education are more likely to migrate. The significantly positive coefficients on the GDP level of the destination show that more developed cities attract more migrants, while the significantly negative coefficients on the GDP level of the birthplace show that workers tend to migrate out of less developed cities.

In Table IV, migrants who moved across counties within the same prefecture were treated like those who did not migrate across prefectures, with $M_{i,m,n} = 0$. These observations, however, differ from those who did not migrate at all. To use the information on migration within a prefecture, we expand the dependent variable from a binary variable into a three-valued variable in the following analysis. Specifically, $M_{i,m,n} = 0$ for those who did not migrate at all, $M_{i,m,n} = 1$ for migration across counties within the same prefecture, and $M_{i,m,n} = 2$ for migration across prefectures. Correspondingly, our estimation method becomes an Ordered Probit model, because the new dependent variable is ordered.

Table V reports the estimates relating labor migration to dialectal distance under the Ordered Probit model, where migration is a three-valued variable. The inverted-U pattern remains robust both before and after controlling for Putonghua and the other controls, and the turning point still lies within the same dialectal super-group and across dialectal groups. The only difference is the magnitude of the average positive and negative effects: the former is around 30% rather than 0%, and the latter is around 0% instead of %.
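The three-valued coding and the category probabilities of an ordered probit can be sketched as follows. The coefficients and cut points are hypothetical and the index contains only the two distance terms; this illustrates the model form and is not a reproduction of Table V.

```python
import math


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def code_migration(migrated, crossed_prefecture):
    """0 = never migrated, 1 = across counties within a prefecture,
    2 = across prefectures."""
    if not migrated:
        return 0
    return 2 if crossed_prefecture else 1


def ordered_probit_probabilities(index, cut1, cut2):
    """Return (P[M=0], P[M=1], P[M=2]) for a latent index and two cut points."""
    if cut1 >= cut2:
        raise ValueError("cut points must be strictly increasing")
    p0 = normal_cdf(cut1 - index)
    p2 = 1.0 - normal_cdf(cut2 - index)
    p1 = 1.0 - p0 - p2
    return p0, p1, p2


# Hypothetical values, for illustration only.
B1, B2 = 0.9, -0.3
CUT1, CUT2 = 1.0, 1.4


def latent_index(d):
    """Latent migration propensity with a linear and a squared distance term."""
    return B1 * d + B2 * d * d


probs = ordered_probit_probabilities(latent_index(1.5), CUT1, CUT2)
print(probs)
```

With a negative coefficient on the squared term, the probability of the top category (migration across prefectures) rises and then falls in d, mirroring the inverted U in the binary model.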
A possible explanation for the larger magnitudes under the Ordered Probit model is that treating within-prefecture migrants as non-migrants underestimates the impact of dialectal distance on these migrants.

4.2 Measurement Errors

We now consider possible measurement errors in dialectal distance, which could come from three sources.

First, defining dialectal distance by counting the steps needed to reach a common node could underestimate the distance between dialectal groups and between dialectal super-groups. In the basic results, dialectal distance follows a 0-1-2-3 pattern from counting steps, which may not be large enough to capture the differences between groups and super-groups, so we also define it in a 0-1-10-100 pattern. Specifically, the dialectal distance between two counties is 0 if they belong to the same sub-group, 1 if they belong to different sub-groups in the same group, 10 if they belong to different groups in the same super-group, and 100 if they belong to different super-groups. The result is shown in column 1 of Table VI.

Second, weighting by population share could make dialectal distance endogenous. In the basic results, we use each county's 1982 share of its prefecture's population as the weight when calculating the pairwise dialectal distances among prefectures. However, the migration recorded in CLDS spans 1936 to 2012, so migration before 1982 could affect the population shares and lead to endogeneity. To avoid this problem, we give equal weight to each county instead of using population shares. Column 2 of Table VI shows the result.

Third, the classification of the dialect family tree may itself be biased.

According to research on Chinese dialects (Yuan, 2001; Li, 2001), 1) the classification of northern dialects is finer than that of southern dialects, which results in overestimating dialectal distance within northern dialects; 2) in Fujian province mountains account for more than 90% of the area, and in Zhejiang province the proportions of mountainous, water and farming area are 0.7, 0.1 and 0.2 respectively, so geographical isolation could produce larger dialectal distances than we measure; that is, dialectal distance within the Wu and Min super-groups could be underestimated. Our solutions are 1) controlling for a dummy variable for whether an individual migrated within northern dialects, to mitigate the overestimation of dialectal distance within northern dialects (column 3 in Table VI), and 2) dropping observations of migration within the Wu dialect and within the Min dialect, to mitigate the underestimation of dialectal distance within Wu and Min (column 4 in Table VI).

4.3 Other Control Variables

Even though dialectal distance, owing to its historical formation, is exogenous to migration today, the relationship between them could be explained by common third factors, especially geographic factors. That is, geographic factors might have shaped both the historical formation of dialects and migration today, rather than dialectal distance having affected migration. To rule out this possibility, we introduce the following geographic factors: longitude, latitude, geographic distance, slope, Relief Degree of Land Surface (RDLS) and provincial neighbors. We control for geographic factors along three dimensions that could affect both dialectal distance and labor migration.
First, if two prefectures are geographically farther apart, the dialectal distance between them is likely to be larger and migration between them smaller, so we control for the longitude and latitude of destination and birthplace respectively (column 1 in Table VII), and also for geographic distance and its square term (column 2 in Table VII). Second, if either prefecture is geographically isolated by its terrain, dialectal distance would be larger and migration between them smaller, so we control for the slope and RDLS of destination and birthplace respectively (columns 3 and 4 in Table VII). Third, if two prefectures belong to neighboring provinces, dialectal distance is likely to be smaller and migration larger, so we control for a dummy variable for whether the two prefectures belong to neighboring provinces (column 7 in Table VII).

4.4 Other Concerns

Reverse causality is another possible explanation for the relationship between dialectal distance and migration: migration might affect dialectal distance rather than the other way round. To rule this out, we exclude observations that migrated before 1987, when the Language Atlas of China was surveyed. The result is in column 1 of Table VIII.

To make sure that the impact of dialectal distance on migration is consistent with migration being motivated by the search for job opportunities, we exclude observations that migrated for other reasons, such as marriage, joining family members, or others (column 2 in Table VIII).

Another important concern is the setting of our econometric strategy. In the analysis above, if an individual did not migrate at all, the dependent variable is $M_{i,m,n} = 0$ and dialectal distance is also 0, which could generate the positive effect when dialectal distance is small. To rule out this possibility, we exclude observations that did not migrate at all and restrict the sample to migrants only. Column 3 in Table VIII shows that the inverted-U pattern remains robust when controlling for Putonghua and the other control variables.

Besides, the dialectal distance between two different prefectures is likely to be larger than that between a prefecture and itself. Thus the dialectal distance of migrants across prefectures is mechanically likely to be larger than that of individuals who did not cross prefectures, and the positive effect could be driven by this. To rule out this possibility, we measure migration at the provincial level rather than the prefecture level, because the dialectal distance between two different provinces is less likely to exceed that between a province and itself than is the case at the prefecture level. Specifically, if birthplace and destination belong to different provinces, it is migration across provinces and the variable equals 2; if they belong to the same province, it is migration within a province and the variable equals 1; if the individual did not migrate, it equals 0. Column 4 in Table VIII shows this result.

So far, we have found a robust inverted-U relationship between labor migration and dialectal distance. It holds after controlling for Putonghua and other control variables (individual characteristics, the average GDP level of destination and birthplace, provincial dummies and geographic factors), after taking measurement errors into consideration, under various estimation methods, and after excluding reverse causality and problems in the setting of the econometric strategy.
These results show two points. First, dialectal distance promotes migration as long as two prefectures are within the same dialectal super-group; otherwise it begins to inhibit migration. Second, the inhibition comes from the identification effect rather than a linguistic communication effect: individuals tend not to migrate across dialectal super-groups not because they do not understand the dialect there but because they identify less with cities outside their own dialectal super-group. Then, with both a complementary effect and an identification effect, how is the inverted-U pattern formed? We answer this question in the following section.

5 How Is Inverted U Pattern Formed?

5.1 Model

Consider the migration decision of a representative individual i whose birthplace is city A and for whom city B is an arbitrary destination:

$$M = \begin{cases} 1 & \text{if } E[U] > \bar{U} \\ 0 & \text{if } E[U] \le \bar{U} \end{cases} \qquad (3)$$

where $\bar{U}$ is a constant measuring i's reservation utility if i does not migrate, and $E[U]$ measures i's expected utility from migrating to city B. If $E[U] > \bar{U}$, i will migrate, i.e. $M = 1$; if $E[U] \le \bar{U}$, i will not migrate, i.e. $M = 0$.

i's utility from migrating to city B is determined by income there, y, with $U = U(y)$. According to the Permanent Income Hypothesis (Friedman, 1956), i's income in city B consists of two parts: $y = y_p + y_t$, where $y_p$ is permanent income, such as wages, and $y_t$ is a temporary gain or loss from being robbed, stolen from or cheated, or from receiving others' help, etc.

Let the dialectal distance between A and B be $d \in [0, \bar{d}]$. On the one hand, as the complementary effect suggests, different dialects are associated with different ways of thinking and different skills, and workers from dialectally more distant cities can bring diversity that benefits firms in certain production processes, so migrants find it easier to get a job if they migrate to dialectally more distant cities. We therefore assume that dialectal distance raises an individual's permanent income $y_p$ and, without loss of generality, that the marginal effect declines. As permanent income can also be affected by other factors such as policy, economic development and institutions, denoted Z, we have $y_p = y_p(Z, d)$, satisfying $y_p(Z, d) > 0$, $\partial y_p(Z, d)/\partial d > 0$ and $\partial^2 y_p(Z, d)/\partial d^2 < 0$.

On the other hand, as the identification effect suggests, dialect is an important dimension of identification: individuals identify more with cities that speak closer dialects and less with cities that speak quite different dialects. Lower identification shows up as a fear of a higher risk of being stolen from, cheated or robbed, as well as of unexpected gains, when migrating to cities with quite different dialects, so dialectal distance enlarges the variance of temporary income. We assume that temporary income $y_t$ is normally distributed and that its variance increases with the dialectal distance d, i.e. $y_t \sim N(0, \sigma^2 d)$, where $\sigma$ is constant. Assuming a CARA utility function [5], $U(y) = -(1/\gamma)\exp(-\gamma y)$, where $\gamma$ is the risk aversion parameter, and given Z and d, i's expected utility in the destination can be expressed as

$$E[U(y) \mid d, Z] = -\frac{1}{\gamma}\exp\left(-\gamma\, y_p(Z, d) + \frac{\gamma^2 \sigma^2 d}{2}\right).$$

Taking its derivative with respect to d, we get

$$\frac{\partial E[U(y) \mid d, Z]}{\partial d} = \frac{\partial y_p(Z, d)}{\partial d}\exp\left(-\gamma\, y_p(Z, d) + 0.5\gamma^2\sigma^2 d\right) - 0.5\gamma\sigma^2\exp\left(-\gamma\, y_p(Z, d) + 0.5\gamma^2\sigma^2 d\right), \qquad (4)$$

which reveals the trade-off between the complementary effect and the identification effect.
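The step from the CARA utility function to the expected-utility expression uses the moment-generating function of a normal variable. A short derivation is sketched below, assuming income $y = y_p(Z,d) + y_t$ with $y_t \sim N(0, \sigma^2 d)$ and risk-aversion parameter $\gamma$; these symbol names are assumptions adopted for the sketch.

```latex
% For X ~ N(0, s^2):  E[exp(tX)] = exp(t^2 s^2 / 2).
% Apply it with t = -\gamma and s^2 = \sigma^2 d.
\begin{align*}
E[U(y)\mid d,Z]
  &= -\frac{1}{\gamma}\,E\!\left[\exp\!\big(-\gamma\,(y_p(Z,d)+y_t)\big)\right] \\
  &= -\frac{1}{\gamma}\,\exp\!\big(-\gamma\,y_p(Z,d)\big)\,E\!\left[\exp(-\gamma\,y_t)\right] \\
  &= -\frac{1}{\gamma}\,\exp\!\Big(-\gamma\,y_p(Z,d)+\tfrac{1}{2}\gamma^{2}\sigma^{2}d\Big).
\end{align*}
% Differentiating with respect to d (chain rule on the exponent):
\begin{align*}
\frac{\partial E[U(y)\mid d,Z]}{\partial d}
  &= -\frac{1}{\gamma}\,
     \Big(-\gamma\,\frac{\partial y_p(Z,d)}{\partial d}+\tfrac{1}{2}\gamma^{2}\sigma^{2}\Big)
     \exp\!\Big(-\gamma\,y_p(Z,d)+\tfrac{1}{2}\gamma^{2}\sigma^{2}d\Big) \\
  &= \frac{\partial y_p(Z,d)}{\partial d}\,
     \exp\!\Big(-\gamma\,y_p(Z,d)+\tfrac{1}{2}\gamma^{2}\sigma^{2}d\Big)
   \;-\; \tfrac{1}{2}\gamma\sigma^{2}\,
     \exp\!\Big(-\gamma\,y_p(Z,d)+\tfrac{1}{2}\gamma^{2}\sigma^{2}d\Big).
\end{align*}
```

Since the exponential factor is strictly positive, the sign of the derivative is the sign of $\partial y_p/\partial d - \tfrac{1}{2}\gamma\sigma^2$, which is what pins down the peak.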
The first term on the right-hand side of equation (4) shows the complementary effect, the mechanism through which dialectal distance affects utility via permanent income. The second term shows the identification effect, which works through temporary income. Under our assumptions, there must exist some d, denoted $d^*$, satisfying $\partial y_p(Z, d^*)/\partial d = 0.5\gamma\sigma^2$. Thus, when $d < d^*$, we have $\partial E[U(y) \mid d, Z]/\partial d > 0$, namely the complementary effect dominates the identification effect at lower dialectal distance; when $d > d^*$, we have $\partial E[U(y) \mid d, Z]/\partial d < 0$, namely the identification effect dominates at higher dialectal distance. This means that as the dialectal distance between i's birthplace and destination increases, i's expected utility in the destination first increases and then decreases beyond a certain value. As Figure 8 shows, where the horizontal axis is dialectal distance and the vertical axis is expected utility, there is an inverted-U-shaped relationship between $E[U(y) \mid d, Z]$ and d, and at $d^*$ i's expected utility reaches its peak, denoted $U^* = E[U(y) \mid d^*, Z]$.

[5] The conclusion is similar if we use other risk-averse utility functions, but CARA is more convenient for calculation.

Finally, we explore how dialectal distance affects i's migration decision. We denote the probability of migration from A to B as $P[M \mid d, Z]$. As the reservation utility is constant [6] and according to equation (), the probability $P[M \mid d, Z]$ is also associated with dialectal distance, as the following proposition shows.

Proposition [7]: If $\bar{U} \ge U^*$, then $P[M \mid d, Z] = 0$; if $\bar{U} < U^*$, then $\partial P[M \mid d, Z]/\partial d > 0$ for $d < d^*$, $\partial P[M \mid d, Z]/\partial d = 0$ for $d = d^*$, and $\partial P[M \mid d, Z]/\partial d < 0$ for $d > d^*$.

The economic intuition of the proposition is natural. Dialectal distance is associated with both the complementary effect and the identification effect, which results in an inverted-U-shaped pattern of workers' migration across dialects. Specifically, when dialectal distance is relatively small, the complementary effect dominates, so dialectal distance promotes migration across dialects; when dialectal distance is quite large, the identification effect dominates, so dialectal distance hinders migration across dialects. The reason is that, as dialectal distance increases, the complementary effect strengthens and encourages migration by raising permanent income, but its marginal gain declines; at the same time, the identification effect also becomes stronger and discourages migration by enlarging the variance of temporary income, and because individuals are risk-averse, this speeds up the decline in utility.
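A numerical illustration of the trade-off described above is given below. It assumes a specific concave permanent-income function, $y_p(d) = \text{base} + \text{gain}\cdot\sqrt{d}$, and hypothetical parameter values (risk aversion `gamma`, variance scale `sigma2`); neither the functional form nor the numbers come from the paper. Under these assumptions the peak has the closed form $d^* = (\text{gain}/(\gamma\sigma^2))^2$, which the grid search should reproduce.

```python
import math


def permanent_income(d, base=1.0, gain=1.0):
    """Assumed concave permanent income: increasing in d, with declining marginal gain."""
    return base + gain * math.sqrt(d)


def expected_utility(d, gamma=1.0, sigma2=0.8, base=1.0, gain=1.0):
    """CARA expected utility when temporary income is N(0, sigma2 * d)."""
    exponent = -gamma * permanent_income(d, base, gain) + 0.5 * gamma ** 2 * sigma2 * d
    return -(1.0 / gamma) * math.exp(exponent)


def analytic_peak(gamma=1.0, sigma2=0.8, gain=1.0):
    """Solve gain / (2*sqrt(d)) = 0.5 * gamma * sigma2 for d."""
    return (gain / (gamma * sigma2)) ** 2


# Grid search over distances between 0 and 3 in steps of 0.001.
grid = [i / 1000 for i in range(0, 3001)]
d_peak = max(grid, key=expected_utility)

print(d_peak, analytic_peak())
```

Raising `gamma` or `sigma2` in this sketch moves the peak toward smaller distances, which corresponds to a stronger identification effect relative to the complementary effect.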
As a result, by trading off the two effects, an individual's migration probability exhibits an inverted-U relationship with dialectal distance. In the following parts of this paper, we test this pattern empirically.

[6] If i does not migrate, the dialectal distance d is naturally 0, and income depends only on Z, not on d, so it does not vary with d.
[7] Please refer to the appendix for the proof.

5.2 Empirical support

If the positive effect comes from the complementary effect, that is, people from different dialects have different skills and ways of thinking, is there evidence that dialect-specific fixed effects on people's skills are significant? We proxy skills by average years of education. We regress average years of education on dialectal super-group dummies (Figure 9a), dialectal group dummies (Figure 9b) and dialectal sub-group dummies separately, across 334 counties in China in the year 2000. The super-group, group and sub-group dummies are all jointly significant at the 99.9% confidence level. These results show that, on average, people from different dialects have correspondingly different education levels.

6 Comparison among Samples

6.1 Comparison by Individual Characteristics

Table IX compares the effect of dialectal distance on migration among samples with different individual characteristics: urban vs rural household registration status, male vs female, different ages and different education levels. Compared with rural workers, both the identification effect and the complementary effect are larger for urban workers. One explanation is that, because of the development gap between urban and rural areas, urban workers identify more strongly with their birthplace, since their social networks there are more useful in job search than those of rural workers, and an urban worker can more easily acquire skills with a comparative advantage in the destination if he migrates. The effects of dialectal distance differ little under the other comparisons.

6.2 Comparison by Social Networks

Table X compares whether dialectal distance works differently for people with different social networks. We measure social networks in three ways: the number of people with whom one can talk about one's innermost thoughts, the number of people with whom one can discuss important matters, and the number of people from whom one can borrow money (more than RMB 5,000). Under each comparison the positive effects differ little, but people with stronger social networks face a larger hindrance from dialectal distance.
A possible explanation is that people with stronger social networks may be better at building networks and so have stronger networks in their birthplace than those with weaker networks; these birthplace networks strengthen identification with the birthplace and lower identification with the destination, and thus discourage migration.

6.3 Comparison by Time Span

Table XI compares the effect of dialectal distance on migration over time. We successively drop migrants who migrated before 1990, 1995, 2000 and 2005. The results show that the positive effect becomes larger and larger while the negative effect becomes weaker and weaker. One explanation could be that the development of information technology has shortened the distance among people, so that people rely less on identity to obtain information and resources, and that the spread of the internet offers more information that encourages people to migrate to dialectally more distant places.

7 Conclusion

This paper explores how dialectal (linguistic) distance affects labor migration. Using a special sample of migrants in China who speak both their native dialect and Putonghua (the standard pronunciation of Chinese), we estimate the impact on labor migration by introducing dialectal distance and its square term, and, to distinguish the identification effect from the linguistic communication effect, we also control for individuals' Putonghua skill. The results show a robust inverted-U relationship between labor migration and dialectal distance. It holds after controlling for Putonghua and other control variables (individual characteristics, the average GDP level of destination and birthplace, provincial dummies and geographic factors), after taking measurement errors into consideration, under various estimation methods, and after excluding reverse causality and problems in the setting of the econometric strategy.

Two conclusions follow from these results. First, dialectal distance promotes migration as long as two prefectures are within the same dialectal super-group; otherwise, it begins to inhibit migration. Second, the inhibition comes from the identification effect rather than the linguistic communication effect: individuals tend not to migrate across dialectal super-groups not because they do not understand the dialect there, but because they identify less with cities outside their own dialectal super-group.

To explain how the inverted-U pattern is formed, we introduce both the identification effect and the complementary effect into a simple model of a worker's migration decision. Assuming that individuals are risk averse and that the marginal complementary effect declines, the inverted-U pattern can be derived. We have also tested whether workers from different dialects have different skills, as support for the complementary effect.
Using average years of education as an indicator of skills, we find that the fixed effects of dialects are jointly significant.

Further findings emerge when we compare the effects of dialectal distance across samples. Both the identification effect and the complementary effect are stronger for urban workers than for rural workers. If a worker has a stronger social network, the hindrance from the identification effect becomes stronger, because the network increases the opportunity cost of migration. Over time, the identification effect becomes weaker and the complementary effect stronger.

This paper reveals that the true reason dialectal distance hinders migration is weaker identification with dialectally more distant destinations, rather than the higher cost of learning the local dialect to remove obstacles to linguistic communication.

This paper also answers at what point the relatedness of human traits begins to inhibit social interaction. Dialect, as an indirect measure of where one's ancestors came from hundreds or thousands of years ago, still has an impact on labor migration today. If two populations are not that far from each other in human traits, say within the same dialectal super-group, dissimilarity encourages social interaction between them, such as labor migration, because of possible complementarity in skills and thoughts. If their human traits are quite different, say they belong to two different dialectal super-groups, little social interaction happens between them. The point at which relatedness begins to hinder social interaction between two populations is determined by the historical origin of the populations hundreds or thousands of years ago.

References

Adsera Alicia, and Mariola Pytlikova, The role of language in shaping international migration, The Economic Journal 5 (05): F49 F8. Alesina Alberto, and Eliana La Ferrara, Ethnic Diversity and Economic Performance, Journal of Economic Literature 43.3 (005): 76. Alesina Alberto, Johann Harnoss, and Hillel Rapoport, Birthplace diversity and economic prosperity, National Bureau of Economic Research No. w8699 (03): -54. Bai Ying, and James Kai-sing Kung, Climate shocks and Sino-nomadic conflict, Review of Economics and Statistics 93.3 (0): 970-98. Bleakley Hoyt, and Aimee Chin, Language skills and earnings: Evidence from childhood immigrants, Review of Economics and Statistics 86 no. (004): 48-496. Bleakley Hoyt, and Aimee Chin, Age at arrival, English proficiency, and social assimilation among US immigrants, American Economic Journal: Applied Economics, (00): 65 9. Böheim René, G. Horvath, and Karin Mayr, Birthplace diversity of the workforce and productivity spill-overs in firms, WIFO No. 438 (0): -35. Butler Jeffrey, Paola Giuliano, and Luigi Guiso, The Right Amount of Trust, UCLA (04), mimeo. Center for Social Survey in Sun Yat-Sen University, China Labor-force Dynamic Survey: 03 report (Zhongguo Laodongli Dongtai Diaocha: 03 Baogao), Social Sciences Academic Press (03). Chen Zhao, Ming Lu and Le Xu, Returns to dialect: Identity exposure through language in the Chinese labor market, China Economic Review, 30 (04): 7-43. Cheng Mingwang, Qinghua Shi and Jianxia, Yang. From Malthus to Solow: An Explanation for the Motivation and Obstacles Effecting Farmer Labor Emigration in China, Economic Research Journal 4 (006):68-78. Chiswick Barry R., and Paul W. Miller, Linguistic distance: a quantitative measure of the distance between English and other languages, Journal of Multilingual and Multicultural Development, 6(005):. 
Duan Chengrong, Ge Yang, Fei Zhang and Xuehe Lu, Nine Trends of China's Floating Population after Reform and Opening up (Gaige Kaifang Yilai Zhongguo Liudong Renkou Biandong De Jiu Da Qushi), Population Research 6 (2008): 30-43.
Feng Zhiming, Yan Tang, Yanzhao Yang, and Dan Zhang, The Relief Degree of Land Surface in China and Its Correlation with Population Distribution, Acta Geographica Sinica 6 no. 0 (2007): 073-08.
Feng Zhiming, Yanzhao Yang, Zhen You, and Jinghua Zhang, Research on The Suitability of Population Distribution at The County Level in China, Acta Geographica Sinica 6 (2014): 73-737.
Friedman M., A Theory of the Consumption Function, Princeton, NJ: Princeton University Press (1956).
Guiso Luigi, Paola Sapienza, and Luigi Zingales, Cultural Biases in Economic Exchange, The Quarterly Journal of Economics 4.3 (2009): 095-3.
Hambrick Donald C., Theresa Seung Cho, and Ming-Jer Chen, The Influence of Top Management Team Heterogeneity on Firms' Competitive Moves, Administrative Science Quarterly 4 (1996): 659-684.
Hong Lu, and Scott E. Page, Problem Solving by Heterogeneous Agents, Journal of Economic Theory 97 (2001): 3-63.
Huston Ted L., and George Levinger, Interpersonal attraction and relationships, Annual Review of Psychology 9 (1978): 5-56.
Knack Stephen, and Philip Keefer, Does Social Capital Have an Economic Payoff? A Cross-Country Investigation, Quarterly Journal of Economics (1997): 5-88.
Li Rulong, Chinese Dialects (Hanyu Fangyan Xue), Higher Education Press (00).
Li Shi, A Gray Landscape of China's Economic Development (Zhongguo Jingji Fazhan zhong de Yi Dao Huise de Fengjing Xian), Economic Research Journal (2007): 54-57.
McPherson Miller, Lynn Smith-Lovin, and James M. Cook, Birds of a feather: Homophily in social networks, Annual Review of Sociology (2001): 45-444.
Pendakur Krishna, and Ravi Pendakur, Language as Both Human Capital and Ethnicity, International Migration Review 36 (2002): 47-77.
Li Rong, Language Atlas of China (Zhongguo Yuyan Ditu Ji), Hong Kong: Longman Group (Far East) Ltd. Australian Academy of the Humanities, Chinese Academy of Social Sciences, and Dept. of Linguistics Australian National University, Pacific Linguistics, Series C 0 (1987).
Spolaore Enrico, and Romain Wacziarg, The diffusion of development, The Quarterly Journal of Economics 4 (2009): 469-59.
Spolaore Enrico, and Romain Wacziarg, How Deep Are the Roots of Economic Development, Journal of Economic Literature 5. (2013): 35-369.
Spolaore Enrico, and Romain Wacziarg, Long-term barriers to economic development, in Handbook of Economic Growth, Volume A (2014): -76.
Sun Wenkai, Chongen Bai, and Peichu Xie, The effect on rural labor mobility from registration system reform in China, Economic Research Journal (0): 8-4.
Xie Junying, The Survey of the Current Situation of Putonghua Popularization, Applied Linguistics 3 (0): -0.
Xu Baohua and Miyada Ichiro, Dictionary of Chinese Dialect (Hanyu Fangyan Da Cidian), Zhonghua Book Company (1999).
Yuan Jiahua, Chinese dialect outline (Hanyu Fangyan Gaiyao), Language Press (00), Second Edition.
Zhao Yaohui, China's Rural Labor Migration and the Role of Education (Zhongguo Nongcun Laodongli Liudong ji Jiaoyu zai Qizhong de Zuoyong), Economic Research Journal (1997): 37-4, 73.
Zhou Zhenhe, Research on Cultural Regions in China History (Zhongguo Lishi Wenhua Quyu Yanjiu), Fudan University Press (1997).
Zhou Zhenhe and Rujie You, Dialect and Chinese Culture (Fangyan yu Zhongguo Wenhua), Shanghai People's Press (2006).
Isphording Ingo E and Sebastian Otten, Linguistic distance and the language fluency of immigrants, Ruhr Economic Paper 74 (0).
Trax Michaela, Stephan Brunow and Jens Suedekum, Cultural diversity and plant-level productivity, Regional Science and Urban Economics 53 (2015): 85-96.

Appendix: Proof of Proposition

i) If U U*, P[ M d, Z] 0 holds obviously.
ii) If U U*,. P[ M d, Z ] P[ E[ U ] U ] d d
According to the assumptions on the utility function and the determinants of income, the expected utility of migrating from A to B, E[ U ( y) d, Z ], satisfies E U y d, Z / exp - y, d + d /.
Taking the derivative of E [ U ( y ) d, Z ] with respect to d, we have [ y (, d) / ]exp[- y(, d)+ d / ] d (5) E[ U ( y) d, Z ] d
Besides, exp[- y(, d)+ d / ] 0, [ yd(, d) / ] d 0 0, [ yd (, d) / ] d d / 0, [ yd (, d) / ] d d* 0. Thus, E[ U ( y) d, Z ] satisfies E[ U ( y) d ] E d 0 0 d, [ U ( y ) d, Z ] 0 d d d.
Taking the second-order derivative of E[ U ( y) d, Z ] with respect to d, we get E[ U ( y) d, Z] =[ ydd (, d ) ( yd (, d ) / ) ] exp[- y (, d )+ / ] d
Since y() satisfies y(, d) 0, y (, d) 0, y (, d) 0, it is easy to see that the last expression is less than d dd 0. Thus, E[ U ( y) d, Z ] d is monotonically decreasing in d. Together with continuity and the intermediate value theorem, there must be a unique d * satisfying 0 E[ U ( y) d, Z ] d d d*, i.e. y d. d (, *) / 0
So we get: when d d*, 0 and E[ U ( y) d, Z ] d 0 ; when d d*, P[ E[ U ] U ] d 0 and E[ U ( y) d, Z ] d 0 ; when d d*, P[ E[ U ] U ] d 0 and E[ U ( y) d, Z ] d 0. P[ E[ U ] U ] d
This completes the proof.

Figures and Tables

Figure 1. Figure 1a: Three Parts of Ancient China (Qing China, 1820). Figure 1b: Atlas of Chinese Dialectal Super-Groups Today (1987).
Note: 1. Region 1 is the northern nomadic region, Region 2 is the central plains, and the remaining white area is south China. Source: Bai & Kung (2011). 2. Chinese dialectal super-groups include Mandarin, Jin, Wu, Hui, Gan, Xiang, Min, Hakka, Yue (Hong Kong Chinese) and Ping. Source: Language Atlas of China, Li (1987).

Figure 2. Chinese Dialects Family Tree
Note: Source is Language Atlas of China, Li (1987).

Figure 3. An Example to Show the Calculation of Dialectal Distance between Two Prefectures
Note: Prefecture m consists of counties a1 and a2, with population shares 0.3 and 0.7; prefecture n consists of counties b1 and b2, with population shares 0.4 and 0.6. The dialectal distance between a1 and b1 is 1, between a1 and b2 is 2, between a2 and b1 is 0, and between a2 and b2 is 3. Weighting each county pair by the product of its population shares, as in the distance equation of the main text, the distance between prefecture m and prefecture n is 0.12×1 + 0.18×2 + 0.28×0 + 0.42×3 = 1.74.
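The weighting in the Figure 3 example can be checked numerically. The sketch below is a minimal illustration, assuming the prefecture-level distance is the sum over county pairs of the product of the two population shares times the pairwise dialectal distance, and reading the four pairwise distances as 1, 2, 0 and 3 (the reading that reproduces the reported 1.74). The function name and data layout are hypothetical and are not taken from the paper.

```python
def prefecture_distance(shares_m, shares_n, pair_distance):
    """Population-share-weighted dialectal distance between two prefectures.

    shares_m, shares_n: dict mapping county -> population share in its prefecture.
    pair_distance: dict mapping (county_in_m, county_in_n) -> dialectal distance.
    Assumed formula: sum over i, j of share_i * share_j * d_ij.
    """
    return sum(
        s_i * s_j * pair_distance[(i, j)]
        for i, s_i in shares_m.items()
        for j, s_j in shares_n.items()
    )


# Values from the Figure 3 example (pairwise distances read as 1, 2, 0, 3).
shares_m = {"a1": 0.3, "a2": 0.7}
shares_n = {"b1": 0.4, "b2": 0.6}
pair_distance = {
    ("a1", "b1"): 1,
    ("a1", "b2"): 2,
    ("a2", "b1"): 0,
    ("a2", "b2"): 3,
}

d_mn = prefecture_distance(shares_m, shares_n, pair_distance)
print(round(d_mn, 2))  # 1.74
```

Swapping the first two pairwise distances would give 1.68 instead, which is why the 1, 2, 0, 3 reading is the one used here.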

Figure 4. Distribution of Interviewees over Provinces in China (bar chart by province; series: Whole Sample, Migrants Sample)
Note: Figure 4 reports the percentage of interviewees from each province in the whole sample (black bars, 653 interviewees) and the percentage of migrated interviewees from each province in the migrant sample (white bars, 774 interviewees). Unit is %. Data source: China Labor-force Dynamic Survey (0).

Figure 5. Temporal Distribution of Migrants
Note: Figure 5 shows the percentage of migrants in each year relative to the whole migrant sample. Unit is %. Data source: sample of 774 migrants collected from China Labor-force Dynamic Survey (0) by authors.

Figure 6. Migration Within and Across Dialects at Different Dialectal Levels (series: Across dialects, Within a dialect; categories: Super-groups, Groups, Sub-groups)
Note: Figure 6 shows the distribution of labor migration within and across dialects at the dialectal super-group level, group level and sub-group level respectively. Data source: sample of 774 migrants collected from China Labor-force Dynamic Survey (0) by authors.

Figure 7. Percentage of Migrants from Each Province to Other Provinces That Lie in the Same Dialectal Super-Group
Provinces and their dominant dialectal super-groups, as labeled in the figure: Mandarin (Jilin, Jiangsu, Beijing, Tianjin, Hebei, Liaoning, Heilongjiang, Anhui, Shandong, Henan, Hubei, Guangxi, Chongqing, Sichuan, Guizhou, Yunnan, Shaanxi, Gansu, Qinghai, Ningxia, Xinjiang); Jin (Shanxi, Inner Mongolia); Gan (Jiangxi); Min (Fujian, Hainan); Wu (Shanghai, Zhejiang); Xiang (Hunan); Yue (Guangdong).
Note: Figure 7 shows the approximate percentage of labor migration from each province to other provinces that lie in the same dialectal super-group; each province is matched with its dominant dialectal super-group. Data source: Sixth National Population Census of the People's Republic of China.

Figure 8. Dialectal Distance and Expected Utility in Destination (horizontal axis: dialectal distance d)

Figure 9. Figure 9a: Dialectal Super-Group Fixed Effect of Education (years). Figure 9b: Dialectal Group Fixed Effect of Education (years).
Note: Figure 9a shows the average years of education of each dialectal super-group relative to Hakka, from a regression of average years of education on dialectal super-group dummies and a constant across 334 counties in China in the year 2000. P-values are in parentheses. Standard errors are clustered at the province level. The dialectal super-group fixed effects are jointly significant at the 99.9% confidence level. R-squared is 0.05. Figure 9b shows the average years of education of each dialectal group relative to Mindong, from a regression of average years of education on dialectal group dummies and a constant across 334 counties in China in the year 2000. P-values are in parentheses. Standard errors are clustered at the province level. The dialectal group fixed effects are jointly significant at the 99.9% confidence level. R-squared is 0.70. We also regress average years of education on dialectal sub-group dummies and a constant across 334 counties in China in the year 2000, but do not report the results here. The dialectal sub-group dummies are jointly significant at the 99.9% confidence level. R-squared is 0.35.

TABLE I
Pair-Wise Dialectal Distances Between Municipalities and First-Tier Cities
(columns: Beijing, Shanghai, Tianjin, Chongqing, Guangzhou, Shenzhen)
Beijing  0.34  3  .88  3  3
Shanghai  3  0  3  3  3  3
Tianjin  .88  3  0.355  3  3
Chongqing  3  0.33  3  3
Guangzhou  3  3  3  3  0  .03
Shenzhen  3  3  3  3  .03  .3

TABLE II
Migration Across and Within Each Dialectal Super-Group
(rows: super-group migrants come from; columns: super-group they migrate to)
from \ to  Mandarin  Jin  Gan  Hui  Hakka  Min  Wu  Xiang  Yue  Total
Mandarin  0.68  0.0  0.0  0.0  0.06  0.03  0.0  0.00  0.08
Jin  0.30  0.50  0.00  0.0  0.00  0.00  0.0  0.00  0.00
Gan  0.5  0.00  0.8  0.04  0.6  0.06  0.3  0.00  0.7
Hui  0.  0.00  0.00  0.  0.00  0.00  0.44  0.00  0.
Hakka  0.0  0.00  0.0  0.00  0.56  0.  0.04  0.00  0.4
Min  0.06  0.00  0.03  0.00  0.8  0.35  0.0  0.00  0.8
Wu  0.4  0.0  0.0  0.0  0.0  0.0  0.37  0.00  0.04
Xiang  0.6  0.00  0.8  0.0  0.4  0.0  0.0  0.04  0.
Yue  0.6  0.0  0.00  0.00  0.3  0.03  0.0  0.00  0.64
Note: Table II reports, for each dialectal super-group, the share of its migrants who migrate within it (diagonal entries) and the shares who migrate from it to each other super-group. Rows show the dialectal super-group migrants come from, and columns show the dialectal super-group they migrate to. The entries in columns 1 to 9 of each row are relative shares, and column 10 is their sum. Data source: sample of 774 migrants collected from China Labor-force Dynamic Survey (0) by authors.

TABLE III
Descriptive Statistics of All Variables
Variable  Source  Observation  Mean  Variance  Min  Max

Migration
across prefectures 0/1 (0=no, 1=yes)  CLDS  564  0.099  0.99  0
across prefectures 0/1/2 (0=not migrate, 1=within a prefecture, 2=across prefectures) [1]  CLDS  564  0.3  0.605  0
across prefectures or within a prefecture 0/1 (0=within a prefecture, 1=across prefectures)  CLDS  774  0.877  0.39  0
across provinces 0/1/2 (0=not migrate, 1=within province, 2=across provinces)  CLDS  564  0.75  0.57  0

Dialectal Distance
weighted by population share in 1982 and defined in 0-1-2-3 pattern  By authors  564  0.05  0.680  0  3
weighted by population share in 1982 and defined in 0-1-10-100 pattern  By authors  564  5.000  0.43  0  00
weighted by equal weights and defined in 0-1-2-3 pattern  By authors  564  0.05  0.679  0  3

Evaluation of Putonghua Skill [2]  CLDS  56  3.464  .80  5
very fluently=5 [3]  CLDS  459 [4]  0.55  0.436  0
fluently and with dialectal accent=4  CLDS  585  0.36  0.465  0
not that fluently=3  CLDS  365  0.47  0.354  0
could understand but not speak=2  CLDS  357  0.04  0.403  0
can neither speak nor understand=1  CLDS  56  0.078  0.69  0
Distance between Putonghua and Dialect of Birthplace  By authors  377  .330  0.55  .45  3

Individual Characteristics
gender [5]  CLDS  564  0.47  0.499  0
age (years)  CLDS  563  4.770  4.43  5  86
political status [6]  CLDS  564  .63  0.547  3
education [7]  CLDS  564  .43  .485  0  7
education of father  CLDS  564  .07  .00  0  5
education of mother  CLDS  564  0.665  0.937  0  5
household registration status [8]  CLDS  564  0.66  0.37  0

Characteristics of Birthplace and Destination Prefectures
average GDP level of destination (100 billion)  China Statistical Yearbook for Regional Economy 00-00  503  .37  .804  0  0.68
average GDP level of birthplace (100 billion)  China Statistical Yearbook for Regional Economy 00-00  503  .87  .588  0  0.68
geographic distance (1000 km)  By authors  564  0.065  0.69  0  3.47
longitude of destination  Google earth  984  4.04  6.495  8.068  3.005
longitude of birthplace  Google earth  984  3.904  6.37  8.99  3.59
latitude of destination  Google earth  984  3.803  6.387  8.55  50.45
latitude of birthplace  Google earth  984  3.88  6.55  8.55  50.45
slope of destination  Feng et al. (2007, 2014)  96  4.857  4.09  0.444
slope of birthplace  Feng et al. (2007, 2014)  376  5.34  4.607  0.444
relief degree of land surface (RDLS) of destination  Feng et al. (2007, 2014)  96  0.574  0.699  0  3.60
relief degree of land surface (RDLS) of birthplace  Feng et al. (2007, 2014)  376  0.637  0.794  0  3.559

Social Network of Interviewees
# people with whom can talk about something deep in heart [9]  CLDS  307  .488  .045  5
# people with whom can talk about important things  CLDS  307  .365  0.990  5
# people from whom can borrow money (>RMB 5000)  CLDS  307  .8  .5  5

Note:
[1] Migration across Prefectures 0/1/2: 2 for migration across prefectures, 1 for migration across counties in the same prefecture, 0 for non-migration.
[2] Putonghua skill is evaluated by the interviewer after the interview on a discrete scale from 1 to 5: 5 points for very fluently, 4 points for fluently and with dialectal accent, 3 points for not that fluently, 2 points for those who could understand but couldn't speak Putonghua, and 1 point for those who can neither speak nor understand.
[3] These are 5 dummy variables indicating whether an individual's Putonghua skill equals the respective score.
[4] 459 is the number of observations whose Putonghua skill is very fluently; the same applies to the other levels of Putonghua skill.
[5] Gender: 1 for male, 0 for female.
[6] Political Status: 3 for members of the Communist Party, 2 for members of the Democratic Party, 1 for non-partisan.
[7] Education: uneducated equals 0, did not graduate from primary school equals 1, graduated from primary school equals 2, from middle school equals 3, from high middle school equals 4, undergraduates equal 5, master degree equals 6, and PhD degree equals 7.
[8] Household Registration Status: 1 for urban, 0 for rural.
[9] 1 represents none, 2 for 1 to 3 people, 3 for 4 to 6 people, 4 for 7 to 9 people, and 5 for more than 10 people; similarly hereinafter.

TABLE IV
Basic results
Dependent variable: migration across prefectures or not (1/0)
Estimators: Probit, Logit, LPM
Columns  (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
Dialectal Distance d  5.4***  5.04***  5.3***  5.97***  5.057***  5.555***  .775***  .06***
  (0.4)  (0.403)  (0.40)  (0.399)  (0.443)  (0.664)  (.663)  (0.037)
Square Term of Dialectal Distance d  -.54***  -.44***  -.03***  -.96***  -.087***  -.64***  -.4***  -0.4***
  (0.35)  (0.3)  (0.30)  (0.9)  (0.40)  (0.85)  (0.445)  (0.03)
Putonghua skill  0.3***  0.5***  0.073  0.09*  0.066  0.35  0.00
  (0.043)  (0.043)  (0.053)  (0.053)  (0.067)  (0.6)  (0.00)
Distance between Putonghua and Dialect of Birthplace  -0.063  -0.074  -0.09  -0.48  -0.84  -0.00
  (0.095)  (0.00)  (0.0)  (0.70)  (0.564)  (0.005)
Average GDP level of birthplace  -0.67***  -0.345***  -.0***  -0.0***
  (0.069)  (0.9)  (0.376)  (0.005)
Average GDP level of destination  0.57***  0.40***  .07***  0.0***
  (0.068)  (0.34)  (0.380)  (0.005)
Individual characteristics  NO  NO  NO  YES  YES  YES  YES  YES
Provincial dummies  NO  NO  NO  NO  NO  YES  YES  YES
Constant  -.449***  -.937***  -.77***  -.836***  -.960***  -3.355***  -6.559***  0.5*
  (0.067)  (0.49)  (0.307)  (0.374)  (0.374)  (0.478)  (.75)  (0.069)
Inflection Point of d  .3  .3  .  .  .33  .39  .44  .3
Positive Marginal Effect  .%  .8%  .4%  .3%  0.5%  0.3%  9.%  ----
Negative Marginal Effect  .%  .%  .0%  .0%  .%  0.8%  0.5%  ----
Observations  5,64  5,6  3,57  3,5  3,5  3,5  3,5  3,5
R-squared  0.838  0.84  0.854  0.856  0.86  0.879  0.879  0.886
Note:

Table IV reports the estimation results relating labor migration to dialectal distance. The dependent variable is a binary variable for cross-prefecture migration (=1) or not (=0). The independent variables of interest are the dialectal distance d between one's birthplace and the destination of first-time migration, its square term, and the individual's Putonghua skill. Control variables are individual characteristics, including gender, age, political status, education and education of parents, household registration status, a categorical variable of industry characteristics according to the category of One Yard Industry Code in China, a categorical variable of firm characteristics according to company ownership, the average GDP level of birthplace and destination, and dummy variables for the provinces to which the birthplace and destination belong. Positive and negative marginal effects are the average amounts by which the probability of labor migration increases and decreases, respectively, when dialectal distance increases by one level. The inflection point of d is the turning point of the quadratic in d, calculated as the coefficient on d divided by twice the coefficient on its square term, with the sign reversed. Standard errors are in parentheses and are clustered at the provincial level; *** p<0.01, ** p<0.05, * p<0.1. There are 5,64 observations after data cleaning and matching; the number declines in later columns because of missing values of some variables. We do not report the positive and negative marginal effects under LPM, because the predicted probability can exceed 1 owing to an inherent limitation of the LPM; the same applies hereinafter.
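For a specification that is quadratic in d, the turning point follows from setting the derivative of b1·d + b2·d² to zero, which gives d* = −b1/(2·b2). The sketch below is a minimal illustration of that calculation; the coefficient values are made up for the example and are not estimates from the tables, and the function name is hypothetical.

```python
def turning_point(beta_d, beta_d2):
    """Turning point d* = -beta_d / (2 * beta_d2) of beta_d*d + beta_d2*d**2."""
    if beta_d2 == 0:
        raise ValueError("no turning point without a quadratic term")
    return -beta_d / (2.0 * beta_d2)


def index(d, beta_d, beta_d2):
    """Quadratic part of the latent index as a function of dialectal distance."""
    return beta_d * d + beta_d2 * d ** 2


# Hypothetical coefficients for illustration only.
beta_d, beta_d2 = 5.0, -1.25
d_star = turning_point(beta_d, beta_d2)
print(d_star)  # 2.0

# Inverted U: the index is lower on either side of the turning point.
below = index(d_star - 0.5, beta_d, beta_d2)
peak = index(d_star, beta_d, beta_d2)
above = index(d_star + 0.5, beta_d, beta_d2)
print(below < peak > above)  # True
```

With a positive coefficient on d and a negative coefficient on its square, the turning point is positive, and the index rises with distance below it and falls above it.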

TABLE V
Basic results: expanding dependent variable
Dependent variable: migration across prefectures, within prefecture or not (2/1/0)
Estimators: Ordered Probit, Ordered Logit, LPM
Columns  (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)
Dialectal Distance d  5.874***  5.83***  6.05***  6.040***  5.867***  6.338***  3.89***  .4***
  (0.48)  (0.47)  (0.495)  (0.500)  (0.538)  (0.75)  (.77)  (0.063)
Square Term of Dialectal Distance d  -.39***  -.379***  -.44***  -.438***  -.364***  -.480***  -3.66***  -0.538***
  (0.57)  (0.53)  (0.60)  (0.6)  (0.7)  (0.)  (0.458)  (0.0)
Putonghua skill  0.***  0.097***  0.07*  0.080*  0.056  0.  0.004
  (0.038)  (0.037)  (0.04)  (0.04)  (0.040)  (0.089)  (0.003)
Distance between Putonghua and Dialect of Birthplace  -0.066  -0.088  -0.089  -0.75  -0.39  -0.04
  (0.097)  (0.0)  (0.04)  (0.06)  (0.553)  (0.0)
Individual characteristics  NO  NO  NO  YES  YES  YES  YES  YES
Average GDP level of birthplace and destination  NO  NO  NO  NO  YES  YES  YES  YES
Provincial dummies  NO  NO  NO  NO  NO  YES  YES  YES
Inflection Point of d  .  .  .0  .0  .5  .4  .9  .06
Positive Marginal Effect  3.8%  3.6%  3.7%  3.8%  3.0%  3.%  33.5%  ----
Negative Marginal Effect  .8%  .4%  .8%  .5%  .6%  .6%  9.4%  ----
Observations  5,64  5,6  3,57  3,5  3,5  3,5  3,5  3,5
R-squared  0.743  0.746  0.756  0.758  0.763  0.779  0.784  0.893
Note: Table V reports estimation results relating labor migration to dialectal distance, where the dependent variable is a three-valued variable for non-migration (=0), migration across counties but within a prefecture (=1), and migration across prefectures (=2). Other descriptions are the same as in Table IV. A constant is included but not reported.

TABLE VI
Possible measurement errors
Dependent variable: migration across prefectures: 2/1/0
Columns  (1) Defined by 0-1-10-100  (2) Equal weights  (3) North dialects  (4) Wu and Min dialects
Dialectal Distance d  0.9***  5.88***  5.00***  6.565***
  (0.044)  (0.579)  (0.54)  (0.679)
Square Term of Dialectal Distance d  -0.00***  -.33***  -0.984***  -.50***
  (0.000)  (0.7)  (0.8)  (0.06)
Putonghua skill  0.06  0.058  0.094*  0.04
  (0.040)  (0.038)  (0.05)  (0.03)
Distance of Putonghua & Dialect of Birthplace  -0.06  -0.45  0.030  -0.9
  (0.69)  (0.5)  (0.47)  (0.88)
North  3.460***
  (0.394)
Individual characteristics  YES  YES  YES  YES
GDP of birthplace & destination  YES  YES  YES  YES
Provincial dummies  YES  YES  YES  YES
Inflection Point of d  66.67  .0  .64  .5
Positive Marginal Effect  .%  3.0%  5.9%  3.0%
Negative Marginal Effect  .6%  9.6%  .6%  9.8%
Observations  3,5  3,5  3,5  3,066
R-squared  0.598  0.78  0.834  0.799
Note: Table VI reports estimation results relating dialectal distance to labor migration when possible measurement errors are taken into consideration. In column 1, the definition of dialectal distance between two counties is changed to the 0-1-10-100 pattern. In column 2, the dialectal distance between two prefectures is the simple mean of the dialectal distances over all combinations of one county from each prefecture, instead of the population-share-weighted mean. In column 3 we further introduce a dummy variable for migrating within northern dialects, in case dialectal distances within northern dialects are overestimated. In column 4 we drop observations of migration within the Wu dialect and within the Min dialect, in case dialectal distances within the Wu and Min dialects are underestimated. In column 5, migration is measured at provincial level. We have also introduced all the control variables of column 6 in Table V but do not report them here. Other descriptions are the same as for column 6 in Table V.
Positive and negative marginal effects are the average amounts by which the probability of labor migration increases and decreases, respectively, when dialectal distance increases by one unit rather than one level.
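The two alternative measures in this table can be illustrated on the two-prefecture example from Figure 3. The sketch below is a minimal illustration under stated assumptions: the pairwise levels are read as 1, 2, 0 and 3, the 0-1-10-100 pattern is taken to map the levels 0, 1, 2, 3 to 0, 1, 10, 100, and equal weights are taken to mean a simple average over all county pairs. The names `RESCALE`, `weighted` and `equal_weights` are hypothetical.

```python
from statistics import mean

# Assumed mapping from the 0-1-2-3 levels to the 0-1-10-100 pattern.
RESCALE = {0: 0, 1: 1, 2: 10, 3: 100}

# Two-prefecture example: population shares and pairwise distance levels.
shares_m = {"a1": 0.3, "a2": 0.7}
shares_n = {"b1": 0.4, "b2": 0.6}
levels = {("a1", "b1"): 1, ("a1", "b2"): 2, ("a2", "b1"): 0, ("a2", "b2"): 3}


def weighted(shares_m, shares_n, dist):
    """Population-share-weighted mean of pairwise distances."""
    return sum(shares_m[i] * shares_n[j] * d for (i, j), d in dist.items())


def equal_weights(dist):
    """Simple mean over all county pairs, ignoring population shares."""
    return mean(dist.values())


rescaled = {pair: RESCALE[level] for pair, level in levels.items()}

print(round(weighted(shares_m, shares_n, levels), 2))    # 1.74
print(round(weighted(shares_m, shares_n, rescaled), 2))  # 43.92
print(equal_weights(levels))                             # 1.5
```

The rescaled measure stretches the gap between levels, which is why a one-unit change under the 0-1-10-100 pattern is not comparable to a one-level change under the 0-1-2-3 pattern.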

TABLE VII
Other variables: geographic factors
Dependent variable: migration across prefectures, within prefecture or not (2/1/0)
Columns  (1)  (2)  (3)  (4)  (5)  (6)  (7)
Dialectal Distance d  6.385***  .06***  6.53***  6.50***  6.46***  .056***  6.97***
  (0.679)  (0.056)  (0.79)  (0.700)  (0.680)  (0.057)  (0.78)
Square Term of Dialectal Distance d  -.475***  -0.58***  -.447***  -.444***  -.48***  -0.56***  -.57***
  (0.85)  (0.08)  (0.)  (0.06)  (0.84)  (0.09)  (0.5)
Putonghua skill  0.060  0.004  0.065  0.063  0.066  0.004  0.054
  (0.04)  (0.003)  (0.04)  (0.043)  (0.045)  (0.003)  (0.040)
Distance between Putonghua and Dialect of Birthplace  -0.367  -0.0  -0.93  -0.55  -0.49*  -0.03  -0.58
  (0.39)  (0.0)  (0.8)  (0.4)  (0.39)  (0.03)  (0.07)
Longitude of destination  -0.07  -0.089
  (0.098)  (0.097)
Longitude of birthplace  0.68*  0.4**
  (0.097)  (0.099)
Latitude of destination  0.39  0.3
  (0.09)  (0.03)
Latitude of birthplace  -0.433**  -0.49**
  (0.08)  (0.07)
Geographic distance  0.676***  0.660***
  (0.6)  (0.)
Square term of geographic distance  -0.66***  -0.60***
  (0.05)  (0.049)
Slope of birthplace  -0.9*  -0.06  -0.007
  (0.069)  (0.094)  (0.07)
Slope of destination  0.43**  0.05  0.03
  (0.063)  (0.087)  (0.07)
Relief degree of land surface (RDLS) of birthplace  -.08***  -.04*  -0.04
  (0.308)  (0.578)  (0.03)
Relief degree of land surface (RDLS) of destination  .7***  .9**  0.007
  (0.306)  (0.600)  (0.0)
Dummy for whether destination and birthplace are located in neighbor provinces  9.533***
  (0.404)
Individual characteristics  YES  YES  YES  YES  YES  YES  YES
Average GDP level of birthplace and destination  YES  YES  YES  YES  YES  YES  YES
Provincial dummies  YES  YES  YES  YES  YES  YES  YES
Inflection Point of d  .6  .99  .6  .6  .7  .99  .06
Positive Marginal Effect  33.8%  ----  3.9%  3.3%  33.5%  ----  9.3%
Negative Marginal Effect  .4%  ----  .3%  .3%  .3%  ----  0.9%
Observations  ,73  3,5  ,73  ,73  ,73  ,73  3,5
R-squared  0.783  0.896  0.780  0.780  0.784  0.896  0.787
Note: Table VII reports estimation results relating labor migration to dialectal distance when more geographic control variables are introduced. We have also introduced all the control variables of column 6 in Table V but do not report them here. Other descriptions are the same as for column 6 in Table V. Column 2 and column 6 are estimated under LPM because the iteration of the Ordered Probit model does not converge once geographic distance is introduced.

TABLE VIII
Other concerns
Dependent variable in columns 1 to 2: migration across prefectures, within prefecture or not: 2/1/0
Columns  (1) Reverse Causality  (2) Migrate for Job  (3) Setting of Econometric Strategy (sample of migrants)  (4) Migration across provinces: 2/1/0
Dialectal Distance d  6.469***  6.554***  0.578***  5.389***
  (0.839)  (0.99)  (0.060)  (0.430)
Square Term of Dialectal Distance d  -.58***  -.550***  -0.8***  -.56***
  (0.47)  (0.97)  (0.06)  (0.07)
Putonghua Skill  0.095  0.044  0.003  0.053
  (0.047)  (0.054)  (0.00)  (0.037)
Distance of Putonghua & Dialect of Birthplace  0.00  -0.309  0.03  -0.3
  (0.67)  (0.)  (0.09)  (0.6)
Individual Characteristics  YES  YES  YES  YES
GDP of Birthplace and Destination  YES  YES  YES  YES
Provincial Dummies  YES  YES  YES  YES
Inflection Point of d  .3  .  .5  .33
Positive Marginal Effect  30.%  30.5%  55.9%  34.5%
Negative Marginal Effect  4.4%  3.3%  7.8%  .4%
Observations  ,87  ,64  ,53  3,5
R-squared  0.84  0.85  0.43  0.734
Note: Table VIII reports estimation results relating labor migration to dialectal distance. Column 1 drops observations that migrated before 1987, to rule out the endogeneity that would arise if migration before 1987 influenced the dialectal distance calculated from the 1987 survey of the Language Atlas of China. Column 2 drops observations whose purpose of migration is not a job. Column 3 uses only the sample of 774 migrants, where migration across prefectures equals 1 and migration within a prefecture equals 0, in case it is our econometric setting that drives the inverted-U pattern. In column 4, the dependent variable is a three-valued variable where migration across provinces equals 2, migration within a province equals 1 and non-migration equals 0. We have also introduced all the control variables of column 6 in Table V but do not report them here. Other descriptions are the same as for column 6 in Table V.

TABLE IX
Comparison among samples
Dependent variable: migration across prefectures, within prefecture or not (2/1/0)
Columns  (1) Hukou urban  (2) Hukou rural  (3) male  (4) female  (5) age (8,45]  (6) age (45,60]  (7) age others  (8) education middle school  (9) education high middle school
Dialectal Distance d  8.456***  6.93***  6.97***  6.69***  6.086***  6.9***  9.790***  6.48***  6.3***
  (0.70)  (0.87)  (0.63)  (.93)  (0.93)  (0.6)  (.078)  (0.764)  (0.777)
Square Term of Dialectal Distance d  -.7***  -.437***  -.408***  -.56***  -.43***  -.677***  -.34***  -.5***  -.358***
  (0.4)  (0.4)  (0.99)  (0.36)  (0.74)  (0.03)  (0.639)  (0.6)  (0.6)
Putonghua skill  -0.065  0.073*  0.095**  0.03  0.073  -0.00  0.9  0.064  0.04
  (0.099)  (0.043)  (0.044)  (0.053)  (0.05)  (0.054)  (0.08)  (0.055)  (0.05)
Distance between Putonghua and Dialect of Birthplace  -.05***  -0.05  -0.45*  0.35  -0.073  -0.68  0.4  -0.030  -0.538**
  (0.44)  (0.65)  (0.46)  (0.30)  (0.73)  (0.484)  (0.40)  (0.77)  (0.69)
Individual characteristics  YES  YES  YES  YES  YES  YES  YES  YES  YES
Average GDP level of birthplace and destination  YES  YES  YES  YES  YES  YES  YES  YES  YES
Provincial dummies  YES  YES  YES  YES  YES  YES  YES  YES  YES
Inflection Point of d  .99  .5  .0  .9  .5  .06  .09  .3  .5
Positive Marginal Effect  4.4%  30.9%  3.4%  30.5%  30.8%  30.3%  33.5%  3.6%  30.8%
Negative Marginal Effect  3.%  .4%  .3%  0.6%  0.6%  3.6%  5.6%  3.6%  3.9%
Observations  ,30  0,849  6,99  6,95  6,54  4,690  ,  9,873  3,78
R-squared  0.796  0.787  0.79  0.77  0.79  0.778  0.80  0.787  0.779
Note: Table IX reports estimation results relating labor migration to dialectal distance across different subsamples. We have also introduced all the control variables of column 6 in Table V but do not report them here. Other descriptions are the same as for column 6 in Table V. Hukou status is the household registration status.

TABLE X
Comparison among samples of different social networks
Dependent variable: migration across prefectures, within prefecture or not (2/1/0)
Columns (1)-(2): # people that can talk about things deep in heart with, ≥4 and <4
Columns (3)-(4): # people that can talk about important things with, ≥4 and <4
Columns (5)-(6): # people that can borrow money (>RMB 5000) from, ≥4 and <4
Columns  (1)  (2)  (3)  (4)  (5)  (6)
Dialectal Distance d  7.648***  5.876***  7.973***  5.908***  7.04***  6.***
  (0.58)  (0.970)  (0.55)  (0.894)  (0.57)  (0.86)
Square Term of Dialectal Distance d  -.767***  -.309***  -.809***  -.34***  -.698***  -.385***
  (0.53)  (0.33)  (0.47)  (0.94)  (0.65)  (0.73)
Putonghua skill  0.097  0.0  0.3***  0.03  0.06  0.058
  (0.066)  (0.047)  (0.040)  (0.055)  (0.059)  (0.050)
Distance between Putonghua and Dialect of Birthplace  0.036  -0.397**  0.59  -0.430***  -0.44  -0.67
  (0.78)  (0.80)  (0.3)  (0.56)  (0.33)  (0.56)
Individual characteristics  YES  YES  YES  YES  YES  YES
GDP of birthplace and destination  YES  YES  YES  YES  YES  YES
Provincial dummies  YES  YES  YES  YES  YES  YES
Inflection Point of d  .6  .4  .0  .0  .06  .
Positive Marginal Effect  3.0%  3.5%  30.4%  3.3%  3.4%  3.0%
Negative Marginal Effect  4.5%  9.7%  6.7%  9.5%  5.%  9.5%
Observations  5,884  7,67  5,38  7,833  5,5  7,96
R-squared  0.787  0.790  0.794  0.787  0.787  0.786
Note: Table X reports estimation results relating labor migration to dialectal distance by comparing individuals' social networks. Within each comparison, the left column covers those with stronger social networks and the right column covers those with weaker social networks. We have also introduced all the control variables of column 6 in Table V but do not report them here. Other descriptions are the same as for column 6 in Table V.

TABLE XI
Comparison among samples of different time periods
Dependent variable: migration across prefectures, within prefecture or not: 2/1/0
Columns  (1) >1990  (2) >1995  (3) >2000  (4) >2005
Dialectal Distance d  6.64***  6.884***  6.879***  7.53***
  (0.90)  (0.853)  (0.84)  (0.847)
Square Term of Dialectal Distance d  -.543***  -.634***  -.664***  -.74***
  (0.68)  (0.57)  (0.5)  (0.6)
Putonghua skill  0.077*  0.066  0.068  0.09**
  (0.044)  (0.04)  (0.049)  (0.04)
Distance of Putonghua & Dialect of Birthplace  -0.08  0.069  0.8  -0.4
  (0.6)  (0.70)  (0.73)  (0.05)
Individual characteristics  YES  YES  YES  YES
Average GDP level of birthplace and destination  YES  YES  YES  YES
Provincial dummies  YES  YES  YES  YES
Inflection Point of d  .4  .  .07  .05
Positive Marginal Effect  30.0%  30.%  30.%  30.4%
Negative Marginal Effect  .6%  .%  8.8%  6.3%
Observations  ,709  ,533  ,69  ,98
R-squared  0.838  0.844  0.84  0.83
Note: Table XI reports estimation results relating labor migration to dialectal distance across different time periods. We successively drop migrants who moved before the years 1990, 1995, 2000 and 2005. We have also introduced all the control variables of column 6 in Table V but do not report them here. Other descriptions are the same as for column 6 in Table V.