Outward Direct Investment and Firm Productivity: Evidence from Chinese Firms

Wei Tian† Miaojie Yu‡

November 7, 2014

Abstract

This paper examines how the productivity of Chinese firms affects their outward direct investment (ODI), using two novel data sets on the ODI decision and ODI volume. Because there are far fewer ODI firms than non-ODI firms, the estimations correct for rare-events bias and show that high-productivity firms are more likely to invest abroad. Conditional on firms engaging in ODI, a 10 percentage point increase in firm productivity leads to a 3.87 percent increase in firm ODI. By estimating an endogenous threshold of income in host countries, the threshold regressions find support for the Linder hypothesis on ODI volume to high-income countries.

JEL: F13, P51

Keywords: Outward Direct Investment, Firm Productivity, Linder Hypothesis, Rare-Events Corrections, Threshold Estimates

We thank Qiang Chen, Yiping Huang, Anders Johansson, Zhiyuan Li, Kalina Manova, Ding Sai, Heiwai Tang, Yang Yao, and the participants in the seminar at the Chinese Academy of Social Sciences, the first China Trade Research Group (CTRG), and the CCER-CERC international conference at the Stockholm School of Economics. Financial support from the National Natural Science Foundation of China (No. 71001030) is gratefully acknowledged. All errors are ours.

† School of International Trade and Economics, University of International Business and Economics, Beijing, China. Email: wei.tian08@gmail.com.

‡ Corresponding author. China Center for Economic Research (CCER), National School of Development, Peking University, Beijing 100871, China. Phone: +86-10-6275-3109, Fax: +86-10-6275-1474, Email: mjyu@nsd.pku.edu.cn.

1 Introduction

This paper investigates the connection between productivity and outward direct investment (ODI) of Chinese firms. China is the second largest economy in the world, and its ODI has surged in the new century and plays an increasingly important role in global direct investment. With a 50 percent annual growth rate, China's ODI has become economically significant enough to affect international investment (Rosen and Hanemann, 2009). China's non-financial ODI increased from $29.9 billion in 2002 to $326.5 billion in 2011, a more than tenfold increase during the period.[1] In 2012, China's ODI flow of $87.8 billion accounted for around 6.5 percent of the global foreign direct investment (FDI) flows of $1.35 trillion. China's non-financial ODI ranks third in the world, following the United States and Japan, and first among developing countries.

The productivity of Chinese firms has also increased dramatically in the new century. Chinese firms have undergone at least 2.7 percent weighted average annual productivity growth from 1998 to 2006 (Brandt et al., 2012). The trend of rapid productivity growth was only slightly affected by the negative shock from the recent financial crisis (Feenstra et al., 2014). This raises two questions: Is firm productivity crucial to firms' ODI decision? If so, to what extent does firm productivity foster firm ODI?

Our main findings in this paper are threefold. First, by using a comprehensive ODI decision data set covering the universe of Chinese ODI manufacturing firms during 2000-08, we find that the higher a firm's productivity, the higher the probability that the firm engages in ODI. This finding is qualitatively confirmed by firm ODI data from Zhejiang province, one of the largest provincial sources of ODI in China, in 2006-08. Since only a very small proportion of firms in our large panel sample engaged in ODI activity, we correct for the downward bias of rare-events estimation.
We use Zhejiang's ODI data and find a large marginal effect of firm productivity on the decision to engage in ODI. Second and equally important, as our Zhejiang ODI data set provides information on each firm's ODI volume and investment destinations, we are able to explore the intensive margin effect of a firm's productivity on its ODI. After controlling for endogeneity of firm productivity, we find strong evidence that firm productivity fosters ODI and that the effect is statistically and economically significant. Conditional on the firm's engaging in ODI, a 10 percentage point increase in firm productivity leads to a 3.87 percent increase in firm ODI. Third, by allowing for firm heterogeneity in the choice of host destinations, we find that the role of a firm's productivity in its ODI flow differs by destination income. By estimating an endogenous threshold of income in host countries, our threshold regressions find support for the Linder hypothesis on ODI volume to high-income countries.

[1] China's non-financial investment (i.e., greenfield investment) outweighs the country's financial investment (i.e., investment from mergers and acquisitions). In 2011, China's non-financial investment accounted for 91.8 percent of its entire foreign investment. Thus, we focus on greenfield ODI in this paper.

Our paper makes the following two contributions to the literature. First, it enriches our understanding of Chinese ODI behavior. In contrast to the fairly large micro-level literature on Chinese exports (see Qiu and Xue (2014) for a recent survey), not many papers investigate China's ODI, especially from the firm-level perspective, in large part because of lack of data. Only recently has China's government (more precisely, China's Ministry of Commerce) released the universal, nationwide, firm-level ODI decision data (i.e., which firms engage in ODI activity). With this data set, we are now able to explore whether the well-accepted Melitz-type effects apply to China, the largest trading country in the world today. Based on Melitz (2003), Helpman et al. (2004) predict that, to enter foreign markets through foreign affiliates, firms have to pay extra high fixed costs to cover additional expenses, such as investigating the foreign market regulatory environment. Only profitable, high-productivity firms can do so. Our binary estimates find that the theoretical predictions of Helpman et al. (2004) work well in China. Thus, different from the mixed findings on Chinese exports and firm productivity,[2] we confirm that the standard Melitz findings apply to Chinese ODI firms.
More important, we explore the intensive margin of firm ODI flow, which is completely absent in previous studies because of the unavailability of data. As introduced in detail in the next section, although the Ministry of Commerce of China released the list of ODI firms (henceforth, the ODI decision data set), the data set does not report each firm's ODI volume in all years. To overcome this data challenge, we access a confidential ODI data set compiled by the Department of Commerce of Zhejiang province, which reports firms' ODI volume in addition to all other information covered in the ODI decision data set. Thanks to this novel data set, we are able to explore the intensive margin of firm ODI in China.

[2] Lu (2010) finds that Chinese exporters are less productive. However, Dai et al. (2012) and Yu (2014) argue that this finding arose from the presence of China's processing exporters, which are less productive than non-exporters and non-processing exporters. Once processing exporters are excluded, Chinese exporters are more productive than non-exporters, in line with the theoretical predictions of Melitz (2003).

Second, our paper contributes to the literature on empirical identification. We adopt razor-edge econometric techniques to deal with the related empirical challenges; the techniques can be applied to other projects facing similar problems or data constraints. One empirical challenge is rare-events estimation bias. As there are far fewer ODI firms than non-ODI firms in our ODI data sets (i.e., national ODI decision data and Zhejiang's ODI flow data), conventional binary estimators, such as logit or probit, would underestimate firms' ODI probability, as discussed in detail below. We adopt the rare-events logit method proposed by King and Zeng (2001, 2002) to correct for possible estimation bias. We find that the marginal effect of firm productivity on ODI probability with rare-events corrections is much larger than that without the corrections.

Another econometric innovation is that we use the endogenous threshold regressions developed by Hansen (1999, 2000). Recent studies find that the conventional Linder (1961) export hypothesis can extend to and work for ODI: high-income countries usually absorb more ODI (Fajgelbaum et al., forthcoming). We are particularly interested in whether firm productivity has a heterogeneous impact on firm ODI volume by destination income. The empirical challenge is where to draw the line between high-income and low-income host countries. We take a different approach from previous studies, which set the cutoff lines at a predetermined level adopted from the World Bank.
We instead allow firms to choose their endogenous cutoffs based on their productivity performance. Hence, we are able to estimate the endogenous average income threshold for firms' ODI decision. Our threshold regressions find strong support for the Linder hypothesis for ODI volume to high-income countries.

The present study is related to three strands of the literature on ODI. The first strand is firm heterogeneity of productivity and ODI. Inspired by Melitz (2003), Helpman et al. (2004) develop the concentration-proximity trade-off initiated by Markusen (1984) to find firms' sorting behavior: low-productivity firms self-select to sell in domestic markets, whereas high-productivity firms sell in domestic and foreign markets. However, only the most productive firms self-select to engage in ODI. The sorting pattern is mainly determined by the trade-off between transportation cost and the fixed cost of ODI. By assuming that firm production requires headquarter services and manufactured components, Antràs and Helpman (2004) show that a firm's productivity ranking influences the firm's choice between outsourcing and ODI, which is confirmed by Federico (2009), who uses Italian manufacturing firm-level data. Yet, the sorting pattern proposed by Helpman et al. (2004) is challenged by Bhattacharya et al. (2010), who use data on the Indian software industry. Different from those findings in the services industry, we find that the predictions of Helpman et al. (2004) work well for Chinese firms.

The second strand is related to the literature on the nexus between exports and ODI. Early works, such as Froot and Stein (1991), find that currency depreciation in the host country attracts more ODI because of the declining investment cost in the host country. In search of the relationship between exports and ODI, Blonigen (2001) finds a possible substitution between Japanese ODI to the United States and Japanese exports of final goods to the United States in the automobile market, although intermediate goods are complementary. Recent works examine this nexus beyond the traditional concentration-proximity models. For instance, Oldenski (2012) explores the role of communication of complex information in the traditional proximity-concentration model of the decision between exports and ODI. She finds evidence that firms would prefer exporting if the activities require complex within-firm communication.
Instead, firms would prefer ODI if goods and services require direct communication with consumers. Based on Russ (2007), Ramondo et al. (2014) find that countries with less volatile fluctuations are served relatively more by foreign affiliates than by exporters. Similarly, inspired by Jovanovic (1982), Conconi et al. (2014) find that firms are more likely to export rather than engage in ODI when they face uncertainty about foreign market demand. So exporting and (horizontal) FDI may be complements in a dynamic setup, although they are substitutes in the static setting.

The third related strand of the literature is research on China's ODI. Because of the unavailability of micro-level data, previous works, such as Rosen and Hanemann (2009), examine the industrial characteristics of ODI but abstract away from the role of firm activity. Huang and Wang (2011) argue that Chinese ODI firms have different objectives for their investment. Echoing this, Kolstad and Wiig (2012) find that Chinese ODI is attracted to three types of destinations: countries with lower institutional quality, countries that are rich in natural resources, and large markets. The most recent related works explore what determines the ODI of Chinese firms. Using the same universal nationwide ODI decision data set, Wang et al. (2012) find that government support and the industrial structure of Chinese firms play an important role in explaining the ODI decision of Chinese firms. Chen and Tang (2014) also find that firm productivity and the probability of firm ODI are positively correlated, yet, because of lack of data, they remain silent on the intensive margin of firms' ODI. Our present paper aims to fill this gap and take a step further to explore the income heterogeneity of firms' ODI.

The rest of the paper is organized as follows. Section 2 describes our data sample, followed by a careful scrutiny of measures of firm productivity. Section 3 examines the importance of firm productivity in the firm's ODI decision. Section 4 explores the role of firm productivity on the intensive margin of ODI flows. Section 5 discusses the firm's investment destination and Section 6 concludes.

2 Data and Measures

We rely on three data sets. The first data set provides the list of ODI firms in China since 1980. This data set is crucial for understanding firms' ODI decision. However, the data set does not report any ODI values.
To examine the role of the intensive margin, we rely on the second firm-level ODI data set, which contains information on the universal firm-level ODI greenfield activity in Zhejiang province of China. Finally, we merge firm-level manufacturing production data with the two ODI data sets to explore the nexus between ODI and firm productivity.

2.1 ODI Decision Data

The nationwide data set of Chinese firms' ODI decisions was obtained from the Ministry of Commerce of China (MOC). MOC requires every Chinese ODI firm to report its detailed outward investment activity since 1980. To invest abroad, any Chinese firm, whether a state-owned enterprise (SOE) or a private firm, must apply to MOC or its predecessor, the Ministry of Foreign Trade and Economic Cooperation of China, for approval and registration. MOC requires such firms to provide the following information: the firm's name, the names of the firm's foreign subsidiaries, the type of ownership (i.e., SOE or private firm), the investment mode (e.g., trading-oriented affiliates, mining-oriented affiliates), and the amount of foreign investment (in U.S. dollars). Once a firm's application is approved by MOC, MOC releases the information mentioned above, as well as additional information, including the date of approval and the date of registration abroad. All such information can be downloaded freely from the MOC webpage except the amount of the firm's outward investment, which may be considered sensitive and confidential information by the firms.

The first year covered by the MOC data is 1980. Since 1980, MOC has released information on new ODI firms every year. Thus, the nationwide ODI decision data report ODI starters by year. However, since this data set does not report firms' ODI flows, researchers are not able to explore the intensive margin of firm ODI with this data set.

2.2 ODI Flow Data

To explore the intensive margin, we use our second data set, which is compiled by the Department of Commerce of Zhejiang Province. The most novel aspect of this data set is that it includes data on firms' ODI flows (in current U.S. dollars).
The data set covers all firms with headquarters located (and registered) in Zhejiang and is a short, unbalanced panel from 2006 to 2008. In addition, the data set provides each firm's name, the city where it has its headquarters, type of ownership, industry classification, investment destination countries, the stock share held by the Chinese parent company, and the category of FDI (i.e., greenfield or merger and acquisition (M&A)). The database even reports specific modes of investment: foreign affiliates, foreign resource utilization, marketing network, processing trade, consulting service, real estate, research and development (R&D) center, trading company, and other unspecified types.

Although this data set seems ideal for examining the role of the intensive margin of firms' ODI, the disadvantage is also obvious: the data set is for only one province in China.[3] Regrettably, as is the case for many other researchers, we cannot access similar databases from other provinces. Still, we believe that Zhejiang's firm-level ODI flow data are a good proxy for understanding Chinese firms' ODI flow in general, for the following reasons.

First, the ODI flow from Zhejiang province is outstanding in the whole of China. Firms in Zhejiang have engaged in ODI since 1982. Such firms were the pioneers of Chinese ODI activity. As reported by MOC, only around 10 firms began to engage in ODI before 1982. Since then, Zhejiang has maintained a fast growth rate similar to that of other large eastern provinces, such as Guangdong, Jiangsu, and Shandong. In 2008, Zhejiang had 2,809 ODI firms (including greenfield firms and M&A firms), accounting for 21 percent of all ODI firms in China, and became the largest province in the number of FDI firms. In terms of FDI flow, Zhejiang's ODI also maintained a high plateau, ranking at the very top in the entire country from 2006 to 2009. As shown in Appendix Table 1, Zhejiang's ODI accounted for 16 percent of the country's ODI flow and became the largest ODI province in 2010.

Second, the distribution of type of ownership of ODI firms in Zhejiang province is consistent with that across the country. According to the Statistical Bulletin of China's Outward Foreign Direct Investment (2009) of the Ministry of Commerce, 95 percent of all Chinese ODI firms are directly managed by local governments and most are private firms.
In Zhejiang province, 70 percent of FDI firms are private firms.

[3] To our knowledge, almost no previous work has been able to access nationwide universal outward FDI flow data, except Wang et al. (2012), who use nationwide firm-level outward FDI data to investigate the driving force of outward FDI of Chinese firms. However, that study uses data only from 2006 to 2007; hence, it cannot explore the possible effect of the financial crisis in 2008.

Third, the distribution of Zhejiang ODI firms' destinations is similar to that of the whole country. Up to 2009, Chinese ODI firms invested in 177 countries (regimes) and 71.4 percent of ODI volume was invested in Asia. Hong Kong is the most important destination for Chinese ODI firms.[4] This observation also applies to Zhejiang's ODI firms. Most FDI firms in Zhejiang invest in Asia, Europe, and North America. Hong Kong and the United States are the two destinations with the largest investments. The most common investment mode is to set up production affiliates and create a marketing network by establishing a trade-oriented office.

Fourth, the industrial distribution of Zhejiang's ODI firms is similar to that for the whole of China. According to the Statistical Bulletin of China's Outward Foreign Direct Investment (2009), the top two sectors for Chinese ODI firms are manufacturing and retail and wholesale, which accounted for 30.2 percent and 21.9 percent, respectively, of China's total ODI in 2009. These sectors are especially relevant for Zhejiang province's ODI firms. Manufacturing is the most important sector in which Zhejiang's firms invest abroad. In particular, Zhejiang's firms invest mostly in the garment, machinery, textile, mining, and electronics industries, in descending order. The lower module of Table 1 shows the number of ODI firms in 2006-08, resulting in a total of 1,270 ODI firm-year observations in the database.

[Insert Table 1 Here]

2.3 Firm-Level Production Data

Our last database is the firm-level production data compiled by China's National Bureau of Statistics in an annual survey of manufacturing enterprises. The data set covers 162,885 firms in 2000 and around 410,000 firms in 2008 and, on average, accounts for 95 percent of China's total annual output in all manufacturing sectors. The data set includes two types of manufacturing firms: all SOEs, and non-SOEs whose annual sales are more than RMB 5 million (or equivalently $830,000 under the current exchange rate).
The data set is particularly useful for calculating measured total factor productivity (TFP), since it provides more than 100 firm-level variables listed in the main accounting statements, such as sales, capital, labor, and intermediate inputs.

[4] Note that some Chinese ODI firms may use Hong Kong as an international investment entrepôt, since Hong Kong is a popular "tax haven." Such a phenomenon is beyond the scope of the present paper, although it would be interesting for future research.

As highlighted by Feenstra et al. (2014) and Yu (2014), some samples in this firm-level production data set are noisy and somewhat misleading, largely because of misreporting by some firms. To guarantee that our estimation sample is reliable and accurate, we screen the sample and omit outliers by adopting the following criteria. First, we eliminate a firm if it has fewer than eight employees, since such an entity would otherwise be identified as self-employed. Second, a firm is included only if its key financial variables (e.g., gross value of industrial output, sales, total assets, and net value of fixed assets) are present. Third, we include firms based on the requirements of the Generally Accepted Accounting Principles (GAAP).[5]

2.4 Data Merge

We then merge the two firm-level ODI data sets (i.e., nationwide ODI decision data and Zhejiang's ODI flow data) with the manufacturing production database. Although the data sets share a common variable (the firm's identification number), their coding systems are completely different. Hence, we use alternative methods to merge the three data sets. The matching procedure involves three steps. First, we match the three data sets (i.e., firm production data, nationwide ODI decision data, and Zhejiang ODI flow data) by using each firm's Chinese name and year. If a firm has exactly the same Chinese name in a particular year in all three data sets, it is considered an identical firm.
Still, this method could miss some firms, since the Chinese name of an identical company may not have exactly the same Chinese characters in the two data sets, although the names share some common strings.[6] Our second step is to decompose a firm name into several strings referring to its location, industry, business type, and specific name, respectively. If a company has all identical strings, such a firm in the three data sets is classified as an identical firm.[7] Finally, to avoid possible mistakes, all approximate string-matching procedures are double-checked manually.

[5] In particular, an observation is included in the sample only if the following conditions hold: (1) total assets are higher than liquid assets; (2) total assets are larger than the total fixed assets and the net value of fixed assets; (3) the established time is valid (i.e., the opening month should be between January and December); and (4) the firm's sales are higher than the required threshold of RMB 5 million.

By using these two methods, we match Zhejiang's manufacturing firms with Zhejiang's ODI flow firms. As shown in the lower module of Table 1, of 1,270 ODI firm-year observations in Zhejiang province from 2006 to 2008, 407 ODI firms are engaged in manufacturing sectors, suggesting that around two-thirds of Zhejiang's ODI firms either operate in service sectors or are trading intermediaries (Ahn et al., 2010). Table 2 reports the summary statistics of firms' characteristics for nationwide manufacturing firms and Zhejiang's manufacturing firms, respectively.

[Insert Table 2 Here]

2.5 TFP Measures

The main interest of this paper is to investigate how firm productivity affects firm ODI. Hence, it is crucial to measure firm productivity accurately. In the literature, the most convenient way to measure productivity is to adopt labor productivity, which is appropriate when the research interest is in labor productivity and wages (Trefler, 2004). However, such an index faces several serious pitfalls because it ignores the role of capital used by a firm, which is crucial in determining the firm's output (De Loecker, 2011). Thus, we use total factor productivity (TFP) to measure firm productivity.
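
To illustrate why labor productivity can be misleading when firms differ in capital intensity, the following minimal sketch compares two hypothetical firms under an assumed Cobb-Douglas technology. It is not part of our estimation procedure: the capital and labor elasticities (0.3 and 0.7), the firm figures, and the function names are all illustrative assumptions.

```python
import math


def labor_productivity(output: float, labor: float) -> float:
    """Output per worker; ignores how much capital the firm uses."""
    return output / labor


def log_tfp(output: float, capital: float, labor: float,
            alpha_k: float = 0.3, alpha_l: float = 0.7) -> float:
    """Log TFP as the residual of a Cobb-Douglas production function.

    ln TFP = ln Y - alpha_k * ln K - alpha_l * ln L.
    The elasticities alpha_k and alpha_l are assumed values chosen only
    for illustration; they are not estimates from any data set.
    """
    return (math.log(output)
            - alpha_k * math.log(capital)
            - alpha_l * math.log(labor))


# Two hypothetical firms with the same output and employment but
# different capital stocks (all numbers are made up).
firm_a = {"output": 1000.0, "capital": 500.0, "labor": 100.0}
firm_b = {"output": 1000.0, "capital": 2000.0, "labor": 100.0}

lp_a = labor_productivity(firm_a["output"], firm_a["labor"])
lp_b = labor_productivity(firm_b["output"], firm_b["labor"])
tfp_a = log_tfp(**firm_a)
tfp_b = log_tfp(**firm_b)

print(f"labor productivity: A={lp_a:.1f}, B={lp_b:.1f}")
print(f"log TFP:            A={tfp_a:.3f}, B={tfp_b:.3f}")
```

In this sketch the two firms have identical labor productivity, yet firm A produces the same output with a quarter of the capital and therefore has the higher log TFP. Labor productivity alone would rank the two firms as equally productive.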
Traditionally, TFP is measured by the estimated Solow residual between the true data on output and its fitted value using the OLS approach. However, the OLS approach suffers from two problems, namely, simultaneity bias and selection bias.[8] Following Amiti and Konings (2007) and Yu (2014) in assuming a Cobb-Douglas production function, we adopt the augmented Olley-Pakes semi-parametric approach to deal with simultaneity bias and selection bias in measured TFP. In particular, we tailor the standard Olley-Pakes approach to fit the data for China with the following extensions.

[6] For example, "Ningbo Hangyuan communication equipment trading company" shown in the ODI data set and "(Zhejiang) Ningbo Hangyuan communication equipment trading company" shown in the National Bureau of Statistics of China production data set are the same company but do not have exactly the same Chinese characters.

[7] In the example above, the location fragment is "Ningbo," the industry is "communication equipment," the business type is "trading company," and the specific name is "Hangyuan."

First, we estimate the production function for exporting and non-exporting firms separately in each industry. The idea is that different industries may use different technology; hence, firm TFP must be estimated for each industry. Equally important, even within an industry, exporting firms may use completely different technology than non-exporting firms. For example, some processing exporters only receive imported material passively (Feenstra and Hanson, 2005) and hence do not have their own technology choice. We hence include an export dummy to allow different TFP realization between exporting firms and non-exporting firms.[9]

Second, we use deflated prices at the industry level to measure TFP. Previous studies, such as De Loecker (2011), stressed the estimation bias of using monetary terms to measure output when estimating the production function. In that way, one actually estimates an accounting identity. Hence, we use different price deflators for inputs and outputs. Admittedly, it would be ideal to adopt firm-specific prices as the deflators. Unfortunately, the firm-level data set does not provide sufficient information to measure prices of products. Following previous studies, such as Goldberg et al. (2010), we adopt industry-level input and output deflators for TFP measures. As in Brandt et al.
(2012), the output de ators are constructed using "reference 8 Note that these two biases arise for the following reasons: First, only a few parts of TFP changes can be observed by the rm early enough for it to change its input decision to maximize pro t. Thus, the rm s TFP may have reverse endogeneity in its input factors. The rm s maximized choice becomes biased without this consideration. Second, the rm s dynamic behavior also generates selection bias. With international competition, low-productivity rms will collapse and exit the market, whereas high-productivity rms will stay in the market (Melitz, 2003). The observations in the panel data set include rms that have already survived. Low-productivity rms that have collapsed are excluded from the sample, suggesting that the samples in the estimations are not randomly selected, hence generating estimation bias. 9 Note that we are not able to include a processing dummy in the full data sample, since processing information is only available in the released China s customs data set during the period 2000 06, but our sample extends to 2008. 11

price" information from China s Statistical Yearbooks, whereas input de ators are constructed based on output de ators and China s national Input-Output Table (2002). Third, it is important to construct the real investment variable when using the Olley-Pakes (1996) approach. 10 As usual, we adopt the perpetual inventory method to investigate the law of motion for real capital and real investment. The nominal and real capital stocks are constructed as in Brandt et al. (2012). Rather than assigning an arbitrary number for the depreciation ratio, we use the exact rm s real depreciation provided by the Chinese rm-level data set. 11 Figure 1 shows that the average annual productivity of ODI rms is larger than that of non-odi rms during 2000 08. [Insert Figure 1 Here] 3 Productivity and the Extensive Margin of ODI This section discusses how a rm s productivity a ects the rm s decision to engage in ODI (i.e., the extensive margin). Before running regressions, we provide several preliminary statistical tests to enrich our understanding of the di erence in productivity between ODI and non-odi rms, following a careful scrutiny of the e ect of rm productivity on the decision to engage in ODI. 3.1 Descriptive Analysis on Productivity Di erences Previous studies, like Helpman et al. (2004), nd that rms sales decision can be sorted by their productivity. Low-productivity rms serve in domestic markets, high-productivity rms export, and higher-productivity rms engage in ODI. Eaton et al. (2011) also nd that higherproductivity rms are usually larger. If so, we would observe that, compared with non-odi rms, ODI rms on average are larger, more productive, and export more. The rst module of 10 In the literature, the Levinsohn and Petrin (2003) approach is also popular in constructing TFP in which materials (i.e., intermediate inputs) are used as a proxy variable. This approach is appropriate for rms in countries not using a large amount of imported intermediate inputs. 
However, such an approach may not directly apply to China, given that Chinese rms substantially rely on imported intermediate inputs, which have prices that are signi cantly di erent from those of domestic intermediate inputs (Helpern et al., 2011). 11 Note that even with the presence of exporting behavior, the data still exhibit a monotonic relationship between TFP and investment. 12

Table 3 checks the difference between non-ODI and ODI firms in TFP, labor, sales, and exports. Compared with non-ODI firms, ODI firms are more productive, hire more workers, sell more, and export more. The t-values for these differences are strongly significant at conventional statistical levels.

[Insert Table 3 Here]

The upper part of Table 4 shows that ODI firms are more productive than non-ODI firms in each year of the sample period 2000-08. [12] The productivity difference between ODI firms and non-ODI firms roughly declines over the period (especially after 2004), suggesting that ODI firms may not enjoy much productivity gain via learning from investing. To see whether productivity is an important cause of firms' ODI, we compare several key firm characteristics between non-ODI firms and ODI starters in the lower part of Table 3. Once again, we observe that ODI starters have higher productivity, hire more workers, sell more, and even export more than firms that have never engaged in ODI, indicating that ODI firms already had high productivity before engaging in ODI.

[12] Note that TFP in 2008 is calculated and estimated differently. As in Feenstra et al. (2014), we use deflated firm value added to measure production and exclude intermediate inputs (materials) from the factor inputs. However, we are not able to use value added to estimate firm TFP in 2008, since it is absent from the data set. We instead use industrial output to replace value added in 2008. Thus, we have to be cautious in comparing TFP in 2008 with TFP in previous years.

However, the simple t-test comparisons thus far may not be sufficient to conclude that ODI firms are more productive than their counterparts, since ODI firms may be very different from non-ODI firms in terms of size (number of employees and sales) and experience in foreign markets. As seen from Table 3, ODI firms on average have larger exports, suggesting that ODI firms have already penetrated foreign markets. We follow Imbens (2004) and perform propensity score matching (PSM), using the number of firm employees, firm sales, and firm exports as covariates. Each ODI firm is matched to its most similar non-ODI firm. Since some observations have identical propensity scores, the sort order of the data could affect the results. We thus sort the data randomly before applying PSM. Table 4 reports the estimates of the average treatment effect on the treated (ATT) by year. The ATT coefficient for all ODI manufacturing firms is 0.114 and highly statistically significant, suggesting that, overall, productivity for ODI firms is higher than that for similar non-ODI firms during the period 2000-08.

[Insert Table 4 Here]

3.2 Extensive Margin of ODI

To examine whether firm productivity plays a key role in the firm's decision to engage in ODI, we consider the following empirical specification:

\[ \Pr(ODI_{it} = 1) = \beta_0 + \beta_1 \ln TFP_{it} + \theta X_{it} + \varpi_i + \eta_t + \varepsilon_{it}, \tag{1} \]

where $\ln TFP_{it}$ is the log productivity of firm $i$ in year $t$. $X_{it}$ denotes other firm characteristics, such as firm size (proxied by the log of firm employment) and types of ownership (i.e., multinational firms or SOEs). [13] For instance, SOEs might be less likely to invest abroad because of low efficiency (Hsieh and Klenow, 2009). In addition, larger firms are more likely to invest abroad because they may have an additional advantage in realizing increasing returns to scale. Inspired by Oldenski (2012), we also include a firm export indicator in the estimations, since an exporting firm could find it easier to invest abroad, given that it would have an information advantage on foreign markets compared with non-exporting firms. Finally, the error term is decomposed into three components: (1) firm-specific fixed effects $\varpi_i$ to control for time-invariant factors such as firm location [14]; (2) year-specific fixed effects $\eta_t$ to control for firm-invariant factors such as Chinese RMB appreciation; and (3) an idiosyncratic effect $\varepsilon_{it}$ with normal distribution $\varepsilon_{it} \sim N(0, \sigma_i^2)$ to control for other unspecified factors.

[13] Here, a firm that has investment from foreign countries or Hong Kong/Macao/Taiwan is defined as a foreign firm, following Feenstra et al. (2014).
[14] We first estimate using firm-specific fixed effects and then industry-specific fixed effects, given that our sample has only a three-year time span and the within-group variation would be reduced too much with the implementation of firm-specific fixed effects.

We start from a simple linear probability model (LPM) to conduct our empirical analysis. The attractiveness of the LPM is that it makes it easier to handle the fixed effects. Many nonlinear estimators, including probit, may be biased in the presence of incidental parameters (Angrist and Pischke, 2009). The well-known drawback of the LPM is that there is no justification for why the specification is linear. In addition, the predicted probability could be less than zero or greater than one, which does not make sense. With these caveats in mind, we report the LPM estimates with two-digit Chinese industry classification (CIC) level industry-specific fixed effects in column 1 of Table 5 and with firm-specific fixed effects in column 2. [15] We then compare these estimates with the probit and logit estimates using two-digit CIC-level fixed effects in columns 3 and 4, respectively. [16]

The first four columns of Table 5 show that higher-productivity firms are more likely to engage in ODI. As expected, larger firms are more likely to invest abroad, whereas SOEs are less likely to do so. Exporting firms are more likely to engage in ODI by employing their information advantage (Oldenski, 2012). A striking finding is that multinational firms are less likely to engage in ODI activity. One possible reason is that multinational firms are more likely to participate in global value chains or processing trade and hence focus more on exporting (Yu, 2014).

[Insert Table 5 Here]

3.3 Estimates with Rare Events Corrections

Our estimations above may still face some bias. As observed from Tables 1 and 2, of the total 1,526,167 observations, only 0.44 percent involve outward investment. Thus, our sample exhibits the features of rare events, which occur infrequently but may have important economic implications. As highlighted by King and Zeng (2001, 2002), standard econometric methods such as logit and probit would underestimate the probability of rare events, although maximum likelihood estimators are still consistent.
[15] Note that the coefficients of the export (SOE, foreign) indicator are still identified in the fixed-effects estimates, since exporting firms (SOEs, foreign firms) could switch to non-exporting firms (non-SOEs, non-foreign firms) during the sample period.
[16] Note that the coefficients shown in the probit estimates are not marginal effects.

To see this, consider a simplified logit regression of the ODI dummy on firm TFP:

\[ \Pr(ODI_{it} = 1) = \Lambda(\beta_1 \ln TFP_{it}) = \frac{\exp(\beta_1 \ln TFP_{it})}{1 + \exp(\beta_1 \ln TFP_{it})}, \tag{2} \]

where $\Lambda(\cdot)$ is the logistic cumulative distribution function (henceforth CDF). Since $\hat{\beta}_1 > 0$, as shown in columns (1)-(4) of Table 5, the probability that $ODI_{it} = 1$ is positively associated with firm TFP: most of the zero-ODI observations lie to the left and the observations with $ODI_{it} = 1$ lie to the right, with little overlap. Since there are around 1.5 million observations with zero ODI, the standard binary estimates can easily estimate the illustrated probability density curve essentially without error, as shown by the solid line in Figure 2. [17] However, since only 0.44 percent of the observations have positive ODI, any standard binary estimate of the dashed density line for firm TFP when $ODI_{it} = 1$ will be poor. Because the minimum of the observed rare ODI sample is larger than that of the unobserved ODI population, the cutting point that best classifies non-ODI and ODI observations would be too far from the density of observations with $ODI_{it} = 1$. This will cause a systematic bias toward the left tail and result in an underestimation of the rare events with $ODI_{it} = 1$ (see King and Zeng (2001, 2002) for a detailed discussion).

As recommended by King and Zeng (2001, 2002), the rare-events estimation bias can be corrected as follows. We first estimate the finite-sample bias of the coefficients, $\mathrm{bias}(\hat{\beta})$, to obtain the bias-corrected estimates $\hat{\beta} - \mathrm{bias}(\hat{\beta})$, where $\hat{\beta}$ denotes the coefficients obtained from the conventional logistic estimates. [18] Column 5 in Table 5 reports the logit estimates with rare-events corrections. The coefficient of firm TFP is slightly larger than its counterpart in column 4, suggesting that the estimation bias is not so severe. The coefficient of the SOE indicator in column 5 is relatively larger, in absolute terms, than its counterpart in column 4, whereas the coefficients of the other variables do not show much change.
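The bias correction described above can be sketched numerically. The Python sketch below is not the authors' code. It assumes the unweighted form of the King and Zeng (2001) finite-sample bias formula, $\mathrm{bias}(\hat{\beta}) = (X'WX)^{-1}X'W\xi$ with $W = \mathrm{diag}\{\hat{\pi}_i(1-\hat{\pi}_i)\}$, $\xi_i = 0.5\,Q_{ii}(2\hat{\pi}_i - 1)$ and $Q = X(X'WX)^{-1}X'$, and it runs on simulated data rather than the ODI sample. All function names (`solve`, `fit_logit`, `kz_bias`) and the simulated parameters are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def sigmoid(z):
    """Numerically stable logistic CDF."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fitted_probs(X, beta):
    return [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in X]

def weighted_gram(X, w):
    """Return X' W X for a diagonal weight vector w."""
    k = len(X[0])
    return [[sum(wi * row[a] * row[c] for wi, row in zip(w, X))
             for c in range(k)] for a in range(k)]

def fit_logit(X, y, iters=50):
    """Conventional logit MLE via Newton-Raphson, starting from zero."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = fitted_probs(X, beta)
        score = [sum((yi - pi) * row[a] for yi, pi, row in zip(y, p, X))
                 for a in range(k)]
        info = weighted_gram(X, [pi * (1 - pi) for pi in p])
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta

def kz_bias(X, beta):
    """Unweighted King-Zeng bias estimate: (X'WX)^{-1} X'W xi."""
    p = fitted_probs(X, beta)
    w = [pi * (1 - pi) for pi in p]
    info = weighted_gram(X, w)
    k = len(beta)
    xi = []
    for row, pi in zip(X, p):
        q_ii = sum(r * s for r, s in zip(row, solve(info, list(row))))
        xi.append(0.5 * q_ii * (2 * pi - 1))
    rhs = [sum(wi * row[a] * x for wi, row, x in zip(w, X, xi)) for a in range(k)]
    return solve(info, rhs)

if __name__ == "__main__":
    # Simulated rare-events data (assumed parameters, not the paper's sample).
    rng = random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    X = [[1.0, x] for x in xs]
    y = [1 if rng.random() < sigmoid(-4.0 + x) else 0 for x in xs]
    beta_hat = fit_logit(X, y)
    bias = kz_bias(X, beta_hat)
    corrected = [b - c for b, c in zip(beta_hat, bias)]
    print("logit:", beta_hat, "bias-corrected:", corrected)
```

In an intercept-only model this formula reduces to $(\hat{\pi} - 0.5)/(n\hat{\pi}(1-\hat{\pi}))$, which is negative whenever events are rare, so the corrected intercept is larger than the uncorrected one. That matches the direction of the underestimation argument in the text.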
An alternative approach to correct possible rare-events estimation errors is to use the complementary log-log model. [19] The idea is that the distributions underlying standard binary nonlinear models, such as probit and logit, are symmetric about the origin, so the speed of convergence toward the probability that $ODI_{it} = 1$ is the same as that for $ODI_{it} = 0$. This violates the feature of rare events, which exhibit faster convergence toward the probability that $ODI_{it} = 1$. The complementary log-log model can address this issue, since the model has a left-skewed extreme value distribution, which also exhibits a faster convergence speed toward the probability that $ODI_{it} = 1$ (Cameron and Trivedi, 2005). The complementary log-log model in column 6 in Table 5 shows that the coefficient of firm TFP is fairly close to its counterparts in the conventional logit estimates and the rare-events logit estimates, suggesting once again that the estimation bias caused by the "rare events" property is not so severe in our estimates. One possible reason is that the number of observations with positive ODI is still quite large in absolute terms (N = 6,673), although it accounts for only a small proportion of the whole sample.

[17] To illustrate the idea in a simple way, the distribution curves are drawn to be normal, although this need not be the case.
[18] Chen (2014) also adopts this method to explore how negative climate shocks (e.g., severe drought, locust plagues) affected peasant uprisings.
[19] The CDF of the complementary log-log model is $C(x'\beta) = 1 - \exp(-\exp(x'\beta))$, with marginal effect $\exp(-\exp(x'\beta))\exp(x'\beta)$.

Finally, to examine the entry of ODI firms, we include only firms that never engaged in ODI and ODI starters in columns 7 to 9 in Table 5. As before, we start from the logit estimation as a benchmark for the subsequent rare-events logit estimates and complementary log-log estimates. As seen from columns 7 to 9, high-productivity firms are still found to be more likely to invest abroad.

3.4 Endogeneity of Firm Productivity

The specifications in Table 5 face a possible endogeneity problem. Firms that engage in outward investment may be able to absorb better technology or gain managerial efficiency from host countries (Oldenski, 2012), which in turn boosts firm productivity. We use two approaches to address such reverse causality.

First, to mitigate the endogeneity issue, we adopt an instrumental variable approach in which firms' R&D expenses serve as the instrument. The economic rationale is straightforward. With more investment in new technology, firms will have higher productivity (Bustos, 2011). However, firms with more R&D expenses will not necessarily have more ODI: the simple correlation between firm ODI and firm R&D expenses in the sample is close to zero (0.07). In the first-stage estimation, we regress firm TFP on log R&D expenses (the excluded instrument) [20] and on the other included variables, such as the SOE, foreign, and exporter indicators and log labor. The bottom module of Table 6 shows that log firm R&D expenses are positively correlated with firm TFP and that the coefficient is strongly significant at conventional statistical levels. As suggested by King and Zeng (2001, 2002), the fitted value of firm TFP obtained in the first stage then serves as a new variable in the second-stage nonlinear binary estimates, including logit in column 1, rare-events logit in column 2, and complementary log-log in column 3. The standard errors of all coefficients must therefore be bootstrapped. After correcting for the rare-events bias, the coefficient of fitted firm TFP in the rare-events logit in column 2 is found to be slightly larger than the regular logit estimate, suggesting that regular binary estimates face a downward bias.

There is still a concern about whether firm R&D expenses strictly satisfy the exogeneity requirement of an instrumental variable, since firm TFP might in turn affect the firm's R&D expenses in a dynamic catch-up scenario: low-productivity firms may spend more on R&D to boost their productivity so they can catch up with high-productivity firms. More important, mainstream firm-heterogeneity trade theory, such as Melitz (2003), suggests that firm productivity is ex ante randomly drawn from a distribution. But the conventional TFP measure is essentially a Solow residual and hence an ex post measure. So there is a possible mismatch between the theoretical foundation and the empirical evidence. Inspired by Feenstra et al. (2014), we address this issue by constructing an alternative measure of firm ex ante TFP.
To motivate this from the Olley-Pakes specification, consider a Cobb-Douglas gross production function:

\[ \ln Y_{it} = \alpha_K \ln K_{it} + \alpha_L \ln L_{it} + \nu_{it}, \tag{3} \]

where $Y_{it}$, $K_{it}$, and $L_{it}$ represent value added, capital, and labor for firm $i$ in year $t$, respectively. The standard Olley-Pakes approach is to take the difference between log gross output and log factor inputs times their estimated coefficients:

\[ TFP^{OP}_{it} = \ln Y_{it} - \hat{\alpha}_K \ln K_{it} - \hat{\alpha}_L \ln L_{it}. \]

Clearly, firm productivity $TFP^{OP}_{it}$ is correlated with the ex post productivity shock $\nu_{it}$ caused by possible technology spillover and managerial efficiency (Qiu and Yu, 2014). We can instead construct an ex ante productivity measure $TFP^{OP2}_{it}$ by taking advantage of firm investment $I_{it}$ (the proxy variable used in the conventional Olley-Pakes estimates), which depends on the ex ante productivity measure in a polynomial functional form: $I_{it} = g(TFP^{OP2}_{it}, \ln K_{it})$. We invert this equation to obtain $TFP^{OP2}_{it} = g^{-1}(I_{it}, \ln K_{it})$. Feenstra et al. (2014) show that the ex ante TFP measure is independent of the ex post productivity shock $\nu_{it}$ and more consistent with the spirit of Melitz (2003), which requires that firms randomly draw productivity at the very beginning. Columns (4) to (6) of Table 6 replace the conventional Olley-Pakes TFP with the ex ante TFP measure and still obtain results similar to those before. [21]

[20] Since many observations in the sample have zero R&D expenses, we define the variable of log firm R&D here as log(1 + R&D), where R&D refers to the firm's R&D expenses, to allow observations with zero R&D expenses in the sample.
[21] Note that the number of observations in columns 4 to 6 in Table 6 is greater than in columns 1 to 3 because data on firm R&D expenses are unavailable for 2008.

We now turn to the marginal effects of our estimates to gauge the economic magnitude of the effect of firm productivity. As shown in Table 6, the estimated coefficients of firm TFP vary from 0.280 to 0.334, suggesting that a one-unit increase in firm productivity would raise the odds of a firm engaging in ODI by 32 to 39 percent, given that exp(0.28) = 1.32 and exp(0.33) = 1.39. As firm average TFP increases from 3.11 in 2000 to 4.97 in 2008, firm productivity improvement raises the odds of firms engaging in ODI by 60 to 72 percent during this period.

[Insert Table 6 Here]

4 Intensive Margin of ODI

Thus far, we can safely conclude that high-productivity Chinese manufacturing firms are more likely to engage in ODI. We now turn to explore the role of firm productivity in ODI flow. Since we only have Zhejiang province's ODI flow data, we start by examining whether our previous findings based on nationwide ODI decision data hold for Zhejiang's ODI manufacturing firms. Table 7 picks up this task. By controlling for firm-specific fixed effects, the linear probability model estimates in column 1 confirm that Zhejiang's high-productivity manufacturing firms are more likely to engage in ODI during the sample period 2006-08. The logit estimates in column 2 yield similar findings with a slightly larger coefficient of firm TFP.

Of 102,785 manufacturing firms during the sample period, there are only 407 ODI manufacturing firms, as shown in the lower module of Table 1. That is, the probability of ODI is only 0.39 percent, suggesting that firm ODI activity is also a rare event in Zhejiang province during the sample period and the standard logit estimation results may have a downward bias. In Table 7, we again correct for such bias by using rare-events logit estimates in column 3 and complementary log-log estimates in column 4. The estimated coefficients of firm productivity are much larger than their counterparts in columns 1 and 2, indicating that the downward bias in the regular estimates is fairly large. The increases in the odds caused by firm productivity are similar to their counterparts in Table 5.

[Insert Table 7 Here]

To examine the effects of firm productivity on ODI flow, we introduce a bivariate sample selection model, or equivalently, a Type-2 Tobit model (Cameron and Trivedi, 2005). The Type-2 Tobit specification includes (i) an ODI participation equation,

\[ ODI_{it} = \begin{cases} 0 & \text{if } U_{it} < 0 \\ 1 & \text{if } U_{it} \geq 0, \end{cases} \tag{4} \]

where $U_{it}$ denotes a latent variable faced by firm $i$; and (ii) an "outcome" equation whereby the firm's ODI flow is modeled as a linear function of other variables. In particular, we use a logit model to estimate the following selection equation:

\[ \Pr(ODI_{ijt} = 1) = \Pr(U_{it} \geq 0) = \Lambda\big(\beta_0 + \beta_1 \widehat{\ln TFP}_{it} + \beta_2 SOE_{it} + \beta_3 FIE_{it} + \beta_4 FX_{it} + \beta_5 \ln L_{it} + \varpi_j + \eta_t\big), \tag{5} \]

where $\Lambda(\cdot)$ is the logistic CDF. In addition to the logarithm of firm productivity, a firm's ODI decision is also affected by other factors, such as the firm's ownership (whether it is an SOE or a multinational firm), export status ($FX_{it}$ equals one if a firm exports and zero otherwise), and size (measured by the logarithm of the number of employees).

Our estimations here include three steps. Because ODI firms may improve their productivity via investment abroad, in the first step we run a preliminary regression in which the dependent variable, $\ln TFP_{it}$, is regressed on firm log R&D expenses, the other variables that appear in the ODI participation equation, and firm-level indicators. As in Feenstra et al. (2014), we use the predicted value $\widehat{\ln TFP}_{it}$ to proxy firm productivity in Eq. (5). For the second step, our Type-2 Tobit model requires an excluded variable that affects the firm's ODI decision but does not affect its ODI flow (Cameron and Trivedi, 2005). Here the firm's export indicator ($FX_{it}$) serves this purpose, since the literature finds that a firm's export status matters for its ODI decision but may not be directly related to ODI flow, because exports of intermediate goods and exports of final goods affect firm ODI flow in different directions (Blonigen, 2001). The simple correlation between ODI flow and export status is close to zero (0.04), which supports using the export indicator as an excluded variable in the third-step Heckman estimates. For the third step, we include the two-digit CIC industry dummies and year dummies (the industry- and year-specific terms in Eq. (5)) to control for other unspecified factors. Note that we also use rare-events logit estimates in the ODI decision equation to avoid the possible downward bias that occurs in the regular logit estimates.

Table 8 reports the estimation results for the bivariate sample selection model. From the second-step rare-events logit estimates of Eq. (5), as shown in column 2, high-productivity firms are more likely to engage in ODI.
Similarly, as predicted, exporting firms are more likely to engage in ODI. We then include the inverse Mills ratio computed from the second-step rare-events logit estimates as an additional regressor in the third-step Heckman estimation in column 3. We also include firm export share (i.e., firm exports over firm sales) as an additional regressor. It turns out that the estimated coefficients have exactly the same signs as obtained