1 A simple two period model


A QUICK REVIEW TO INTERTEMPORAL MAXIMIZATION PROBLEMS

1.1 The intertemporal problem

The intertemporal consumption decision can be analyzed in much the same way as an atemporal problem. The consumer chooses between consumption of the same good at different dates. He lives for two periods and derives utility from his stream of consumption, as given by $u(c_0, c_1)$. The consumer takes the gross real interest rate $R$ as given. His budget at time 0 is $y$ (expressed in terms of period 0 goods). Assume that the only source of income in period 1 is the income from savings at time 0.

At date 0, the consumer chooses between the consumption level $c_0$ and savings $s_0$. Savings are in the form of consumption goods at date 0. The budget constraint at date 0 is

$$c_0 + s_0 \le y. \tag{1}$$

At date 1, the consumer chooses the consumption level $c_1$. The income in that period only comes from savings in the previous period. Hence, the budget constraint is

$$c_1 \le R s_0. \tag{2}$$

The consumer's maximization problem is thus

$$\max\; u(c_0, c_1) \quad \text{s.t. (1), (2)}.$$

The difference between this problem and an atemporal problem (choice between two different goods at the same date) is that there are two budget constraints. This is artificial though: with strictly increasing utility, the two budget constraints hold with equality and can be combined as

$$c_0 + \frac{c_1}{R} = y, \tag{3}$$

so that the maximization problem is given by

$$\max\; u(c_0, c_1) \quad \text{s.t. (3)}. \tag{P}$$

This is similar to a standard utility maximization problem subject to a budget constraint, with the characteristic that the trade-off is between current and future consumption, and the relative price of the date 1 good to the date 0 good is $1/R$ (it is not equal to 1 even though the two goods have the same physical characteristics). The consumer can thus substitute consumption across time.

A special (yet very common) case for $u(\cdot,\cdot)$ is

$$u(c_0, c_1) = U(c_0) + \beta U(c_1), \qquad 0 < \beta < 1.$$

This intertemporal utility function assumes that the consumer derives utility from consumption in each period and that intertemporal utility is a weighted sum of the utility levels in the two periods [time-additive utility function]. $\beta$ is called the discount factor.

Solving (P), we get

$$\frac{u_1}{u_2} = R, \tag{4}$$

$$y - c_0 - \frac{c_1}{R} = 0. \tag{5}$$

Condition (4) equates the marginal rate of substitution (the relative value of current consumption to future consumption, i.e. the rate at which consumers are willing to trade one type of consumption for another) with the marginal rate of transformation (i.e. the rate at which the two types of consumption can be exchanged in the market).

1.2 The determinants of savings

There are three important determinants of savings: the income profile, the rate of return to savings (the real interest rate) and the agent's patience toward future consumption. We enrich the model by allowing the household to have income at date 1 in addition to the income from savings (it could be labor income, for example). Then, the budget constraint at date 1 is

$$c_1 \le R s_0 + y_1.$$

The household's budget constraint at date 0 is (1) with $y = y_0$. The intertemporal constraint is then

$$c_0 + \frac{c_1}{R} \le y_0 + \frac{y_1}{R}. \tag{6}$$

The level of savings is given by $s_0 = y_0 - c_0$. The optimal $c_0$ is given by (4). Since the intertemporal budget constraint (6) binds, $c_1 = y_1 + R(y_0 - c_0)$. Substituting this and using the time additivity of $u$, we get

$$\frac{U'(c_0)}{U'\big(y_1 + R(y_0 - c_0)\big)} = R\beta. \tag{7}$$

Income profile

By the income profile, we refer to the household's income derived from human capital, such as labor income, not counting income derived from savings. Hence, the income profile is $(y_0, y_1)$.
To illustrate how the income profile affects savings, assume that $R = 1$ and that the household does not discount future utility ($\beta = 1$). In that case, condition (7) becomes (assuming a strictly concave $U(\cdot)$)

$$c_0 = y_1 + R(y_0 - c_0).$$

The household achieves the same amount of consumption every period. This is due to the "consumption smoothing motive". In that simple case, we can explicitly solve for $s_0$:

$$s_0 = \frac{y_0 - y_1}{2}.$$

If the household has a flat income profile (and $(R, \beta) = (1, 1)$), the optimal level of savings is 0; if the household has an increasing income profile ($y_1 > y_0$), the optimal level of savings is negative, and vice versa.

Real interest rate

Let us now look at the rate of return to savings, i.e. the real interest rate. To isolate its role, let us assume again that $\beta = 1$ and, for the moment, that the household has a smooth income profile ($y_0 = y_1 = y$). In that case, when the real interest rate is $R = 1$, the optimal level of savings is 0. Assume, instead, that $R > 1$. Condition (7) becomes

$$U'(c_0) = R\,U'\big(y + R(y - c_0)\big).$$

Unlike the previous case, the optimal consumption levels are different in the two periods. Given the smooth income profile and no time discounting, a higher real rate entices the household to save more. This intertemporal substitution effect is due to the fact that an increase in the real interest rate effectively makes consumption cheaper at date 1 than at date 0 (the real interest rate is the relative price of date 0 goods to date 1 goods).

The increase in the real interest rate can also induce an income effect on savings (which does not appear in this example since, before the increase in $R$, the household's savings are 0). If, instead, the household had nonzero savings, then an increase in $R$ would affect the household's interest income from savings $R s_0$, and change future income. If the household has positive savings, an increase in the interest rate increases future income relative to current income.
Anticipating the rising income profile, the household reduces savings in order to smooth consumption. On the other hand, if the household has negative savings, an increase in the interest rate increases the household's interest payment at date 1. In this case, future income falls relative to current income and the household increases savings in order to smooth consumption. Therefore, income and intertemporal substitution effects work in the same direction when the household has negative savings, but in opposite directions when the household has positive savings.

A measure of the strength of intertemporal substitution is the elasticity of intertemporal substitution. Because $R$ is the relative price between goods at different dates, we can define the intertemporal elasticity as

$$-\frac{d \ln (c_0/c_1)}{d \ln R}.$$

Another way to compute the elasticity is

$$-\frac{d \ln (c_0/c_1)}{d \ln (u_1/u_2)},$$

which comes from substituting the first-order condition $R = u_1/u_2$. For the time-additive utility function with $U(c) = \dfrac{c^{1-\sigma} - 1}{1-\sigma}$, the elasticity of substitution is $1/\sigma$ (the case $\sigma \to 1$ corresponds to logarithmic utility).

Exercise: Suppose that $U(c) = \dfrac{c^{1-\sigma} - 1}{1-\sigma}$, where $\sigma > 0$. Hence, the elasticity of substitution is $1/\sigma$. Assume $\beta = 1$, $R \ge 1$, $y_0 > 0$ and $y_1 > 0$. Prove the following results:

(i) The optimal level of savings is $s_0 = \dfrac{y_0 R^{1/\sigma} - y_1}{R^{1/\sigma} + R}$.

(ii) $s_0 \le 0$ always implies $ds_0/dR > 0$.

(iii) If $R = 1$ and $y_0 > y_1$, then $ds_0/dR > 0$ if and only if $\sigma < (y_0 + y_1)/(y_0 - y_1)$.

Solution:

(i) The household's problem is to solve

$$\max_{c_0, c_1}\; \frac{c_0^{1-\sigma} - 1}{1-\sigma} + \frac{c_1^{1-\sigma} - 1}{1-\sigma} \quad \text{s.t.} \quad c_0 + \frac{c_1}{R} = y_0 + \frac{y_1}{R}.$$

The first order conditions are

$$\frac{u_1}{u_2} = \left(\frac{c_0}{c_1}\right)^{-\sigma} = R, \qquad c_0 + \frac{c_1}{R} = y_0 + \frac{y_1}{R}.$$

The first condition gives $c_1 = R^{1/\sigma} c_0$. Substituting into the budget constraint and rearranging, this results in

$$s_0 = y_0 - c_0 = \frac{y_0 R^{1/\sigma} - y_1}{R^{1/\sigma} + R}.$$

(ii) From the above, one can compute $ds_0/dR$. After some algebra,

$$\frac{ds_0}{dR} = \frac{-s_0\left(R + R^{1/\sigma}\right) + \frac{1}{\sigma} R^{1/\sigma}\left(y_0 + \frac{y_1}{R}\right)}{\left(R + R^{1/\sigma}\right)^2}.$$

Hence, if $s_0 \le 0$, $ds_0/dR > 0$.

(iii) When $R = 1$,

$$\frac{ds_0}{dR} = \frac{-2 s_0 + \frac{1}{\sigma}(y_0 + y_1)}{4} = \frac{-(y_0 - y_1) + \frac{1}{\sigma}(y_0 + y_1)}{4}.$$

Thus $ds_0/dR > 0$ if and only if $-(y_0 - y_1) + \frac{1}{\sigma}(y_0 + y_1) > 0$, i.e. when

$$\sigma < \frac{y_0 + y_1}{y_0 - y_1},$$

given that $y_0 > y_1$. Notice that with an increasing income profile ($y_0 < y_1$), $ds_0/dR > 0$ for all $\sigma$.
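The results of the exercise can be checked numerically. The sketch below is only an illustration: the function names and the income values are hypothetical, chosen so that the threshold $(y_0 + y_1)/(y_0 - y_1)$ equals 2. It evaluates the closed form from part (i) and compares the analytical slope from part (ii) with a finite-difference approximation.

```python
def optimal_savings(y0, y1, R, sigma):
    """Closed-form s0 from part (i): CRRA utility with beta = 1."""
    a = R ** (1.0 / sigma)
    return (y0 * a - y1) / (a + R)


def analytic_slope(y0, y1, R, sigma):
    """ds0/dR as written in part (ii) of the solution."""
    a = R ** (1.0 / sigma)
    s0 = optimal_savings(y0, y1, R, sigma)
    return (-s0 * (R + a) + (a / sigma) * (y0 + y1 / R)) / (R + a) ** 2


def numeric_slope(y0, y1, R, sigma, step=1e-6):
    """Central finite-difference approximation of ds0/dR."""
    up = optimal_savings(y0, y1, R + step, sigma)
    down = optimal_savings(y0, y1, R - step, sigma)
    return (up - down) / (2.0 * step)


# Hypothetical decreasing income profile: the threshold on sigma is (3 + 1) / (3 - 1) = 2.
y0, y1 = 3.0, 1.0
for sigma in (0.5, 1.5, 2.5, 4.0):
    print(sigma, analytic_slope(y0, y1, 1.0, sigma), numeric_slope(y0, y1, 1.0, sigma))
```

With these values the slope is positive for the two values of $\sigma$ below 2 and negative for the two values above it, as part (iii) predicts.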

What is the point of the exercise? We know the following:

As R ↑              s_0 ≤ 0      s_0 ≥ 0
Subst. effect       s_0 ↑        s_0 ↑
Income effect       s_0 ↑        s_0 ↓

Question (ii) allowed us to check analytically that when savings are negative, an increase in the real interest rate always induces an increase in savings, as both the income and substitution effects work in the (same) direction of increasing savings. In question (iii), we assume a decreasing income profile and hence positive savings (when $R = 1$). That is the case when the income and substitution effects work in opposite directions. We established analytically that, for the substitution effect to dominate the income effect, $\sigma$ has to be small enough, and thus the elasticity of substitution high enough. The insight from this simple model will be useful when we look at a more general one in the upcoming chapter on general equilibrium, real business cycle models.

Impatience towards the future

The degree of impatience is measured by the discount factor. When $\beta$ is low, the household is more impatient. For illustration, assume that the income profile is flat, the real interest rate is equal to 1, and that $\beta < 1$. Then condition (7) becomes

$$U'(c_0) = \beta\, U'(2y - c_0),$$

which implies that $c_0 > y$. In fact, the lower $\beta$, the higher $c_0$ is.

Let us illustrate in a different manner how savings fall when the degree of impatience increases. Suppose the household chooses the same level of consumption in the two periods, i.e. that $c_0 = c_1 = y$. Let the household reduce savings by a small positive amount $\varepsilon$ and increase consumption at date 0 by the same amount. Utility at date 0 is now approximately $U(y) + \varepsilon U'(y)$, while utility at date 1 is now approximately $U(y) - \varepsilon U'(y)$. Taking the discounting into account, total utility increases by

$$\varepsilon (1 - \beta)\, U'(y) > 0.$$
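The impatience condition $U'(c_0) = \beta U'(2y - c_0)$ has no closed form for a general $U$, but it is easy to solve numerically. The following sketch assumes logarithmic utility, $U(c) = \ln c$, purely for illustration (the text does not impose a functional form here), and uses bisection; the function names are hypothetical.

```python
def marginal_utility(c):
    """U'(c) for the assumed utility U(c) = ln(c)."""
    return 1.0 / c


def date0_consumption(y, beta, iterations=200):
    """Solve U'(c0) = beta * U'(2y - c0) for c0 in (0, 2y) by bisection."""
    lo, hi = 1e-12, 2.0 * y - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The gap is strictly decreasing in c0 because U is strictly concave.
        gap = marginal_utility(mid) - beta * marginal_utility(2.0 * y - mid)
        if gap > 0:
            lo = mid  # marginal utility at date 0 still too high: consume more
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Flat income profile y = 1, R = 1: c0 rises above y as beta falls.
for beta in (1.0, 0.9, 0.7, 0.5):
    print(beta, date0_consumption(1.0, beta))
```

For log utility the condition can also be solved by hand, giving $c_0 = 2y/(1+\beta)$, which is a convenient check on the bisection.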

Chapter I BUSINESS CYCLES

1 A General Real Business Cycle Model

1.1 The model

The idea is to view economic fluctuations as the result of external shocks hitting the economy. These shocks can come from different sources. The real business cycle (RBC) literature focuses on shocks to the productive capacity of the economy as the main source of fluctuations. Hence the reference to real shocks, as opposed to the nominal or monetary shocks which had been the focus of previous literatures. In that sense, RBC models are a very definite departure from previous models. They try to see how much of the fluctuations are normal reactions to real shocks, in the absence of any market imperfections. The previous literatures tended to assume some kind of market imperfection, and to explain cycles with these imperfections.

The idea is to model an artificial competitive economy where the aggregates determined in equilibrium (output, investment, consumption, labor) are derived from agents' maximizing behavior at the microeconomic level. Instead of assuming relationships at the macroeconomic level, aggregate outcomes are derived from the optimal behavior of individual agents.

Agents:
Large number of identical households (of measure 1). Each household is small enough that it cannot influence market outcomes through its behavior (competitive economy) and lives forever. Large number of identical small firms. Both types of agents are price takers, i.e. they believe they can buy or sell any quantity of the good or labor without affecting the price. Households are expected utility maximizers, while firms are profit maximizers. The population size is assumed constant.

Technology:
Each period, households provide labor and rent capital to firms, at a wage rate $w_t$ and a rental rate $r_t$, respectively. The production function is

$$Y_t = e^{z_t} f(H_t, K_t), \tag{1}$$

where $Y_t$, $H_t$, and $K_t$ are output, labor and capital in period $t$, respectively. $z_t$ is a random shock to the economy's productive capacity. Assume that the stochastic process is characterized as follows:

$$z_{t+1} = \rho z_t + \varepsilon_{t+1}, \tag{2}$$

where $0 < \rho < 1$ and $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$. Each period, a new shock hits the economy. That shock is known before any decision is taken.

$f$ is defined on $\mathbb{R}_+^2$ with values in $\mathbb{R}_+$, and it is increasing, concave, and continuously differentiable in $H$ and $K$. It is also homogeneous of degree 1, i.e. $f(aH, aK) = a f(H, K)$: if the inputs of capital and labor are doubled, output is also doubled. In addition, $f(H, 0) = f(0, K) = 0$. Also, $f_1(H, K) \to +\infty$ as $H \to 0$ and $f_2(H, K) \to +\infty$ as $K \to 0$.

Endowments:

Households are initially endowed with $k_0$ and start every period with a new total time endowment equal to 1, to be allocated between labor and leisure.

Utility functions:
For the households, the utility derived from infinite sequences of consumption and labor choices, $\{c_t\}_{t=0}^{+\infty}$ and $\{h_t\}_{t=0}^{+\infty}$, is

$$u\left[\{c_t\}_{t=0}^{+\infty}, \{h_t\}_{t=0}^{+\infty}\right] = \sum_{t=0}^{+\infty} \beta^t\, U(c_t, 1 - h_t),$$

where the discount factor $\beta$ satisfies $0 < \beta < 1$. The one period utility function $U$ is defined from $\mathbb{R}_+^2$ to $\mathbb{R}$ and is assumed to be increasing, strictly concave and continuously differentiable in both arguments. Moreover,

$$\lim_{c \to 0} U_1(c, 1-h) = \lim_{h \to 1} U_2(c, 1-h) = +\infty,$$

$$\lim_{c \to +\infty} U_1(c, 1-h) = \lim_{h \to 0} U_2(c, 1-h) = 0.$$

These restrictions, as well as the restrictions on the production function $f$, ensure that there exists an interior solution [1].

[1] These conditions are referred to as the Inada conditions.

Resource constraint:
Each period, capital depreciates at a rate $\delta$. The resource constraint is

$$c_t + i_t = c_t + k_{t+1} - (1 - \delta) k_t = w_t h_t + r_t k_t,$$

where $i_t$ is the household's investment in period $t$. This constraint states that households need to allocate their period income (from renting capital and working) between consumption and investment.

The problem has already been set up as a competitive equilibrium problem. It could also be solved as a social planner problem, given that, in this particular case, the two are equivalent. However, for the sake of generality, we will solve the problem as an equilibrium problem. The method retained can also be used in cases where the welfare theorems do not apply (distortions, externalities...). The concept used is called Recursive Competitive Equilibrium. We will solve the problem using dynamic programming and start by defining the state variables, as well as the control variables.

Individual state variable: $k_t$ (capital owned by an individual household)
Aggregate state variables: $z_t$, $K_t$ (total capital in the economy)
Control variables: $c_t$, $h_t$, $i_t$

We will start by describing the maximization problems faced by the firms and the households.

Firm's decision problem:

Every period, the firm chooses capital and labor to maximize profits. The firm, as opposed to the worker, faces a one period problem. Because it does not have any claim to either labor or capital, it does not have to consider how its choices this period will affect its decision next period. Hence, the firm's problem is

$$\max_{h_t, k_t}\; \Pi_t = e^{z_t} f(h_t, k_t) - w_t h_t - r_t k_t, \qquad \forall t.$$

Differentiating with respect to $h_t$ and $k_t$, we obtain

$$e^{z_t} f_1(h_t, k_t) = w_t, \tag{3}$$

$$e^{z_t} f_2(h_t, k_t) = r_t. \tag{4}$$

These two equations define the individual factor demands, given factor prices. Note that, since the production function is assumed to exhibit constant returns to scale, firms make zero profits in equilibrium [2]. The aggregate variables $H_t$ and $K_t$ satisfy (3) and (4) if the number of firms is normalized to 1. Hence, (3) and (4) imply that

$$w_t = \hat{w}(z_t, H_t, K_t), \tag{5}$$

$$r_t = \hat{r}(z_t, H_t, K_t). \tag{6}$$

Equations (5)-(6) represent the market clearing conditions, pinning down prices.

[2] By the application of Euler's theorem for homogeneous functions of degree 1, $f(h, k) = h f_1(h, k) + k f_2(h, k)$.

Household's decision problem:
Households are expected utility maximizers. Hence, they solve [3]

$$\max\; E\left[\sum_{t=0}^{+\infty} \beta^t\, u(c_t, 1 - h_t)\right]$$

$$\text{s.t.} \quad c_t + k_{t+1} - (1 - \delta) k_t = w_t h_t + r_t k_t, \quad \forall t,$$

given stochastic processes for $w_t$, $r_t$ and given $k_0$.

[3] Because firms make zero profit in equilibrium, we do not need to add dividends from ownership of firms in the household's budget constraint.

The households need to form expectations on the future behavior of the variables relevant to their intertemporal decision making. We will assume that households expect the wage and capital rental rates to be functions of $z_t$, $K_t$ and $H_t$, as indicated by the solution to the firm's problem, and hence that they know (5) and (6) (remember that $h_t$ is a control variable, but $H_t$ is not). Assuming that the other households also behave optimally, the individuals also know, or form expectations on,

$$H_t = H(z_t, K_t), \tag{7}$$

$$K_{t+1} = K(z_t, K_t), \tag{8}$$

$$z_{t+1} = z(z_t, \varepsilon_{t+1}). \tag{9}$$

Using (7), the households expect the future wage and capital rental rates to be functions of the aggregate state variables only:

$$w_t = w(z_t, K_t), \tag{10}$$

$$r_t = r(z_t, K_t). \tag{11}$$

Once these expectations are formed, the households are able to solve their maximization problem for consumption, investment and labor supplied. Expectations are rational in the sense that $K_{t+1} = K(z_t, K_t)$, $z_{t+1} = z(z_t, \varepsilon_{t+1})$, $w_t = w(z_t, K_t)$ and $r_t = r(z_t, K_t)$ are known by the agents. But be careful: what is known by the agents is the law of motion for $K_{t+1}$ and $z_{t+1}$, as well as factor prices as functions of the aggregate state variables, not the exact future sequences. The agents' predictions are correct on average, but are not assumed to be correct every period!

Due to the recursive nature of the problem, the individual household's decision problem can be written as [4]:

$$v(z_t, k_t, K_t) = \max_{h_t, k_{t+1}} \Big\{ u\big(w_t h_t + r_t k_t + (1 - \delta) k_t - k_{t+1},\; 1 - h_t\big) + \beta E_\varepsilon\big[v(z_{t+1}, k_{t+1}, K_{t+1}) \mid z_t\big] \Big\} \tag{12}$$

where

$$w_t = w(z_t, K_t), \quad r_t = r(z_t, K_t), \quad K_{t+1} = K(z_t, K_t), \quad z_{t+1} = z(z_t, \varepsilon_{t+1}),$$

and $z_0$, $k_0$, $K_0$ are given. The solution to this problem is

$$h_t = h(z_t, k_t, K_t), \qquad k_{t+1} = k(z_t, k_t, K_t).$$

[4] Because the budget constraint will always be binding, there are really only two decision variables: how much labor to supply and how much to invest. Once investment is chosen, consumption is automatically given. Also, choosing investment is equivalent to choosing how much capital to bring to the next period ($k_{t+1}$).

DEFINITION: A recursive competitive equilibrium is a list of a value function $V(z_t, k_t, K_t)$, individual decision rules $h_t(z_t, k_t, K_t)$ and $k_{t+1}(z_t, k_t, K_t)$ for the representative household, aggregate laws of motion $H_t(z_t, K_t)$ and $K_{t+1}(z_t, K_t)$, and factor price functions $w_t(z_t, K_t)$ and $r_t(z_t, K_t)$ such that:

(i) the household's problem is satisfied, i.e. $h_t(z_t, k_t, K_t)$ and $k_{t+1}(z_t, k_t, K_t)$ solve (12);

(ii) the firm's problem is satisfied and markets clear, i.e. $e^{z_t} f_1(H_t, K_t) = w_t(z_t, K_t)$ and $e^{z_t} f_2(H_t, K_t) = r_t(z_t, K_t)$;

(iii) individual and aggregate decisions are consistent [5], i.e. $h_t(z_t, K_t, K_t) = H_t(z_t, K_t)$ and $k_{t+1}(z_t, K_t, K_t) = K_{t+1}(z_t, K_t)$;

(iv) the aggregate resource constraint is satisfied, i.e. $C(z_t, K_t) + I(z_t, K_t) = Y(z_t, K_t)$.

[5] This is because all households are identical. The typical household must be typical in equilibrium. However, it cannot be imposed on the decision maker. Prices move to make it desirable to the household.

The concept used to construct the equilibrium is illustrated in Figure 1.

[Figure 1: Recursive competitive equilibrium concept. Diagram labels: Expectations, optimization, Behavior, mechanism/institution, Outcome, Equilibrium.]

1.2 Calibration

The concept presented above abstracts away from considerations of growth in the economy. Implicitly, we only looked at fluctuations around a steady state growth path. However, this is not a problem, because one can always start from an economy that is allowed to grow along a steady state path, transform it into a stationary economy, and solve for the Recursive Competitive Equilibrium. Hence, the techniques outlined can be used to study both how an economy grows over time and how it fluctuates around its growth trend. We present below how the problem can be made stationary (in later applications, we will start directly from the stationary version of the economy).

Restriction on the production function

The steady state growth path is defined as the path where all rates of growth are constant, except for labor, which is bounded above (by $\bar{H}$). Assume that there is a permanent component to technological change, $X_t$, in addition to the temporary one, $z_t$. We restrict ourselves to the case where the permanent change is labor augmenting [6], that is, $X_t$ affects the efficiency of labor:

$$f(H_t, K_t, X_t) = f(X_t H_t, K_t).$$

[6] For steady state growth to be feasible, the permanent technological change needs to be labor augmenting.

Along the steady state growth path, $H_t = \bar{H}$. Let us call $\gamma$ the (gross) growth rates. Hence, $\gamma_X = X_{t+1}/X_t$, $\gamma_K = K_{t+1}/K_t$, $\gamma_C = C_{t+1}/C_t$, $\gamma_I = I_{t+1}/I_t$, $\gamma_Y = Y_{t+1}/Y_t$. Writing the resource constraint for any dates $t$ and $t+1$ gives us that

$$C_t + I_t = Y_t, \qquad \gamma_C C_t + \gamma_I I_t = \gamma_Y Y_t,$$

and hence

$$\gamma_C C_t + \gamma_I I_t = \gamma_Y C_t + \gamma_Y I_t.$$

Therefore, for all $t$,

$$(\gamma_C - \gamma_Y)\, C_t + (\gamma_I - \gamma_Y)\, I_t = 0.$$

For that to hold for all $t$, it has to be the case that $C_t$ and $I_t$ grow at the same rate. If this were not the case, the above equation could not hold for all $t$, unless the two coefficients were equal to zero, which was ruled out by assumption. Hence it must be the case that $\gamma_C = \gamma_I$, which in turn implies that $(\gamma_C - \gamma_Y)(C_t + I_t) = 0$. Thus $\gamma_C = \gamma_Y$.

We know that $I_t = K_{t+1} - (1 - \delta) K_t$. Therefore

$$\gamma_K = 1 - \delta + \frac{I_t}{K_t},$$

and thus $I_t/K_t$ is constant for all $t$, and as a result $\gamma_K = \gamma_I$.

Finally, $Y_t = f(X_t H_t, K_t) = f(X_t \bar{H}, K_t)$. Because of the constant returns to scale assumption on $f$,

$$\gamma_Y = \frac{Y_{t+1}}{Y_t} = \frac{f(X_{t+1} \bar{H}, K_{t+1})}{f(X_t \bar{H}, K_t)} = \frac{f(\gamma_X X_t \bar{H}, \gamma_K K_t)}{f(X_t \bar{H}, K_t)}.$$

The only way for this expression to be constant for all $t$ is that $\gamma_X = \gamma_K$. Hence [7]

$$\gamma_X = \gamma_I = \gamma_Y = \gamma_C = \gamma_K.$$

[7] Hence a steady state growth path requires that the growth rates be constant. This is enough to conclude that they are not only constant, but also equal.

This matches the empirical observations that (i) real output grows at a more or less constant rate, (ii) the stock of real capital grows at a more or less constant rate, greater than the rate of growth of the labor input, and (iii) the growth rates of real output and the stock of capital tend to be about the same [8].

[8] Of course, there has been a marked break in the trend of productivity growth around 1973, a phenomenon for which a satisfactory explanation still has to be found. Nevertheless, the facts mentioned still hold. The growth rates may have changed, but the relationships between them are still valid.
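The equality of growth rates along the steady state growth path can be illustrated numerically. The sketch below is only an example under stated assumptions: it takes a Cobb-Douglas form for $f$ with labor-augmenting technology, and the values of $\gamma$, $\delta$, $\bar{H}$ and the initial capital stock are hypothetical. It lets $X_t$ and $K_t$ grow at the same gross rate and reports the implied growth rates of output, investment and consumption.

```python
def output(x, h, k, theta=2.0 / 3.0):
    """Assumed Cobb-Douglas f(X*H, K) with labor-augmenting technology X."""
    return (x * h) ** theta * k ** (1.0 - theta)


def _ratios(series):
    """Gross growth rates s[t+1] / s[t] of a list of positive numbers."""
    return [series[t + 1] / series[t] for t in range(len(series) - 1)]


def growth_rates(gamma=1.02, delta=0.1, h_bar=0.3, k0=2.0, periods=5):
    """Growth rates of Y, I and C when X and K both grow at the gross rate gamma."""
    x = [gamma ** t for t in range(periods + 2)]
    k = [k0 * gamma ** t for t in range(periods + 2)]
    y = [output(x[t], h_bar, k[t]) for t in range(periods + 1)]
    i = [k[t + 1] - (1.0 - delta) * k[t] for t in range(periods + 1)]  # I = K' - (1 - delta) K
    c = [y[t] - i[t] for t in range(periods + 1)]                      # resource constraint
    return _ratios(y), _ratios(i), _ratios(c)


gamma_y, gamma_i, gamma_c = growth_rates()
print(gamma_y, gamma_i, gamma_c)
```

All three lists come out equal to the common growth rate of $X_t$ and $K_t$, up to rounding error, because $f$ is homogeneous of degree 1 and labor is held fixed at $\bar{H}$.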

Restriction on the utility function

The dynamic programming problem can be solved for a variety of utility functions. We would like to use the data to restrict the class of utility functions to be considered. We will use the observation that the long-run aggregate labor supply is constant to pick a particular class of utility functions. A general treatment can be found in King, Plosser & Rebelo (JME 1988). The question asked is: what kind of utility function do we need to be consistent with balanced growth and a constant labor supply in the long run? Suppose you solve the following maximization problem:

$$V(k_t) = \max_{k_{t+1}, h_t} \big\{ u(c_t, 1 - h_t) + \beta V(k_{t+1}) \big\}$$

$$\text{s.t.} \quad c_t + k_{t+1} - (1 - \delta) k_t = f(X_t h_t, k_t).$$

You are interested in finding restrictions on the utility function that are consistent with a steady state growth path, where labor is constant at $\bar{h}$ and the other variables grow at the same rate $\gamma$ as the technological progress $X_t$. The efficiency conditions are:

$$u_1(c, 1 - \bar{h}) = \beta \left[f_2 + 1 - \delta\right] u_1(\gamma c, 1 - \bar{h}), \tag{13}$$

$$\gamma^t f_1\, u_1(c, 1 - \bar{h}) = u_2(c, 1 - \bar{h}). \tag{14}$$

These are the standard efficiency conditions. The marginal products of labor ($f_1$) and capital ($f_2$) are constant along the steady state growth path, due to the constant returns to scale assumption on the production function. $X_t$ was normalized to $\gamma^t$. By differentiating (13) with respect to $c$ and writing the expression $\dfrac{c\, u_{11}(c, 1 - \bar{h})}{u_1(c, 1 - \bar{h})}$ at $c_t = c$ and $c_{t+1} = \gamma c$, one can verify that $\dfrac{c\, u_{11}(c, 1 - \bar{h})}{u_1(c, 1 - \bar{h})}$ must remain constant along the SSGP as consumption increases (and equal to $-\sigma$). Solving the differential equation, one gets that:

$$u(c, 1 - h) = c^{1-\sigma} a(1 - h) + b(1 - h) \quad \text{if } \sigma \ne 1,$$

$$u(c, 1 - h) = \ln(c)\, d(1 - h) + e(1 - h) \quad \text{if } \sigma = 1.$$

Similarly, writing (14) at $c_t = c$ and $c_{t+1} = \gamma c$ and using the functional form just found for $u$, one can verify that one needs the restrictions that $b'(1 - h) = 0$ if $\sigma \ne 1$ and $d'(1 - h) = 0$ if $\sigma = 1$.
Hence:

$$u(c, 1 - h) = c^{1-\sigma} a(1 - h) \quad \text{if } \sigma \ne 1,$$

$$u(c, 1 - h) = \ln(c) + e(1 - h) \quad \text{if } \sigma = 1.$$

For a CRRA utility function, the income effect ($h \downarrow$) of an increase in the wage cancels out the substitution effect ($h \uparrow$), which matches the empirical result that the long-run aggregate supply of labor is constant, even though real wages have increased over time.

With this class of utility functions, we can now compute the intertemporal elasticity of substitution (IES), as well as the intratemporal elasticity of substitution (ES). Suppose you are solving the following standard equilibrium problem. The household's maximization problem is (state variable: $k_t$; control variables: $h_t$, $k_{t+1}$):

$$\max \sum_{t=0}^{+\infty} \beta^t\, u(c_t, 1 - h_t) \quad \text{s.t.} \quad w_t h_t + r_t k_t = c_t + k_{t+1} - (1 - \delta) k_t.$$

In dynamic programming terms, this can be rewritten as

$$V(k_t) = \max_{k_{t+1}, h_t} \big\{ u(c_t, 1 - h_t) + \beta V(k_{t+1}) \big\}.$$

This gives us the following first order conditions:

$$U_c\, w = U_l, \tag{15}$$

$$U_c^t = \beta\, U_c^{t+1} (1 + r_{t+1} - \delta). \tag{16}$$

Hence, the marginal rate of substitution between consumption this period and leisure this period is $U_l/U_c = w$, and the marginal rate of substitution between consumption this period and consumption next period is $U_c^t / (\beta U_c^{t+1}) = 1 + r_{t+1} - \delta$. We can now define the two elasticities as:

$$IES = \frac{d \ln (c_{t+1}/c_t)}{d \ln MRS_{c_t, c_{t+1}}}, \qquad ES = \frac{d \ln (c_t/l_t)}{d \ln MRS_{c_t, l_t}}.$$

With a CRRA utility function, $U_c = c^{-\sigma} g(l)$ and $U_l = \dfrac{c^{1-\sigma}}{1-\sigma}\, g'(l)$. Hence,

$$MRS_{c_t, c_{t+1}} = \frac{1}{\beta} \left(\frac{c_{t+1}}{c_t}\right)^{\sigma} \frac{g(l_t)}{g(l_{t+1})} \qquad \text{and} \qquad MRS_{c_t, l_t} = \frac{1}{1-\sigma}\, \frac{c_t}{l_t}\, \frac{l_t\, g'(l_t)}{g(l_t)}.$$

Thus,

$$IES = \frac{1}{\sigma}, \qquad ES = 1.$$

Making the problem stationary

Now, for a variable $A_t$, define $a_t^s = A_t / X_t$, except for $L_t$ and $H_t$, which are not normalized (we know that all other variables grow at the same rate along the steady state growth path). Since the utility function is CRRA, we can rewrite the maximization problem as

$$\max \sum_{t=0}^{+\infty} \beta^t\, \frac{C_t^{1-\sigma}}{1-\sigma}\, g(L_t) \quad \text{s.t.} \quad Y_t = C_t + K_{t+1} - (1 - \delta) K_t$$

and further, in terms of the stationary variables,

$$\max \sum_{t=0}^{+\infty} \beta^t\, \frac{\left(c_t^s X_0 \gamma_x^t\right)^{1-\sigma}}{1-\sigma}\, g(l_t^s) \quad \text{s.t.} \quad e^{z_t} f(h_t^s, k_t^s) = c_t^s + \gamma_x k_{t+1}^s - (1 - \delta) k_t^s.$$

The Bellman equations are

$$V(k_t^s, z_t) = \max_{k_{t+1}^s, h_t^s} \left\{ \frac{(c_t^s)^{1-\sigma}}{1-\sigma}\, g(l_t^s) + \beta^* E\, V(k_{t+1}^s, z_{t+1}) \right\},$$

where $\beta^* = \beta \gamma_x^{1-\sigma}$. The problem thus redefined is stationary and can be solved using dynamic programming.

Choosing the parameters of the model

We can now start the calibration. Let us start by defining the capital and labor shares of output. Take a production function

$$y = Z f(h, k).$$

Then by differentiation with respect to time, we obtain

$$\dot{y} = \dot{Z} f(h, k) + Z f_1(h, k)\, \dot{h} + Z f_2(h, k)\, \dot{k},$$

and thus

$$\frac{\dot{y}}{y} = \frac{\dot{Z}}{Z} + \frac{Z f_1(h, k)\, h}{y}\, \frac{\dot{h}}{h} + \frac{Z f_2(h, k)\, k}{y}\, \frac{\dot{k}}{k}. \tag{17}$$

$Z f_1(h, k) h / y$ and $Z f_2(h, k) k / y$ are the labor and capital shares of output (the income from production going to the owners of labor and capital). Notice that, because of the perfect competition assumption, $Z f_1$ is the wage received by the worker and $Z f_2$ is the rental rate received by the owner of capital. By the constant returns to scale and perfect competition assumptions, the two shares add up to 1. Since everything else is observable, one can calculate a time series for $z$ using (17).

It is found, looking at data, that the capital and labor shares of output have been approximately constant over time [9], even while their relative prices have changed. This suggests a Cobb-Douglas production function

$$f(h_t, k_t) = h_t^{\theta} k_t^{1-\theta}.$$

To be consistent with observed values of the labor and capital shares of output, $\theta \approx 2/3$.

[9] This is also true across countries.

We know that the utility function must be CRRA. The only parameter we have to choose is $1/\sigma$ (the intertemporal elasticity of substitution). Most empirical studies point towards a value of $\sigma$ between 1 and 2. As the artificial economy is not very sensitive to the exact value of $\sigma$, a value of $\sigma = 1$ is generally chosen. Hence, the form generally retained for the utility function is

$$u(c, 1 - h) = (1 - \alpha) \ln(c) + \alpha \ln(1 - h).$$

By solving for the first order conditions, we can obtain the deterministic steady state. The household's maximization problem is

$$\max\; E\left[\sum_{t=0}^{+\infty} \beta^t \big[(1 - \alpha) \ln(c_t) + \alpha \ln(1 - h_t)\big]\right] \quad \text{s.t.} \quad c_t + k_{t+1} - (1 - \delta) k_t = f(h_t, k_t), \quad \forall t.$$

The first order conditions are

$$u_1(c_t, 1 - h_t) = \beta V'(k_{t+1}),$$

$$f_1(h_t, k_t)\, u_1(c_t, 1 - h_t) = u_2(c_t, 1 - h_t),$$

and from the envelope theorem

$$V'(k_t) = u_1(c_t, 1 - h_t) \left[f_2(h_t, k_t) + 1 - \delta\right].$$

Hence

$$u_1(c_t, 1 - h_t) = \beta\, u_1(c_{t+1}, 1 - h_{t+1}) \left[f_2(h_{t+1}, k_{t+1}) + 1 - \delta\right].$$

Given our assumptions on the production function, this implies that, in steady state,

$$\frac{\alpha}{1 - h} = \theta\, \frac{1 - \alpha}{c} \left(\frac{k}{h}\right)^{1-\theta}, \qquad \frac{1 - \alpha}{c} = \beta\, \frac{1 - \alpha}{c} \left[(1 - \theta) \left(\frac{h}{k}\right)^{\theta} + 1 - \delta\right],$$

or

$$\frac{\alpha}{1 - h} = \theta\, \frac{y}{h}\, \frac{1 - \alpha}{c}, \tag{18}$$

$$1 = \beta \left[(1 - \theta)\, \frac{y}{k} + 1 - \delta\right]. \tag{19}$$

We also know that in steady state

$$i = \delta k.$$

The depreciation rate is chosen to match the average investment to capital ratio in the economy. Using (19) and the average value for the output to capital ratio, $\beta$ can be determined. Microeconomic evidence points toward a value of $h \approx .3$. Finally, given the average value for $y/c$, $\alpha$ can be found using (18). As you can see, calibrated values come either from the requirement that the steady state values match the corresponding average values in the economy or from independently conducted microeconomic studies.

To complete the calibration of the model, we need to determine values for $\rho$ and $\sigma_\varepsilon$ (see (2)). Using (1), we see that

$$z_{t+1} - z_t = (\ln Y_{t+1} - \ln Y_t) - \theta (\ln H_{t+1} - \ln H_t) - (1 - \theta)(\ln K_{t+1} - \ln K_t). \tag{20}$$

From there, the series $\{z_t\}$ observed in the economy can be calculated. The residuals calculated are quite persistent, hence a very high value for $\rho$ is generally retained (typically $\rho = .95$). Knowing $\rho$, the standard deviation of the error terms ($\sigma_\varepsilon$) can be determined.

1.3 Defining and measuring the business cycle

Business cycles are generally considered as deviations from a trend. The question is then how to define the trend. Once the trend is known, fluctuations can be easily calculated. Several methods can be used (linear trend, piecewise linear trend). Most authors use a technique known as the Hodrick-Prescott filter. Consider the series of real output $\{y_t\}$, for example. It can be decomposed as the sum of a growth component $y_t^g$ and a cyclical component $y_t^c$. The problem is to choose a trend that minimizes the cyclical component, while still retaining a smooth trend. In other words, the problem is equivalent to

$$\min \sum_{t=0}^{T} (y_t^c)^2 \quad \text{s.t.} \quad \sum_{t=0}^{T} \left[\left(y_{t+1}^g - y_t^g\right) - \left(y_t^g - y_{t-1}^g\right)\right]^2 \text{ not too big}.$$

Take $\lambda$ [10], a parameter reflecting the relative variance of the growth component to the cyclical component. Then the problem is to choose $\{y_t^g\}$ to minimize the loss function

$$\min_{\{y_t^g\}} \sum_{t=0}^{T} (y_t - y_t^g)^2 + \lambda \sum_{t=0}^{T} \left[\left(y_{t+1}^g - y_t^g\right) - \left(y_t^g - y_{t-1}^g\right)\right]^2.$$

The point of the exercise is to trade off the extent to which the growth component tracks the actual series against the smoothness of the trend. Notice that for $\lambda = 0$, $y_t^g = y_t$, and for $\lambda \to +\infty$, the growth component is a purely linear time trend. For quarterly data, a value of $\lambda = 1{,}600$ is generally retained [11]. Using this method, we can get a smooth, time-varying trend. The rationale behind that choice is that it eliminates fluctuations at frequencies lower than once every eight years (business cycles are generally considered as fluctuations around the growth path with a period of three to five years).

[10] Penalty coefficient.
[11] $\lambda = 400$ for annual data.
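The minimization above has a closed-form solution, which the following sketch computes with NumPy. It is an illustrative implementation rather than the routine used to produce the tables in these notes; the function name `hp_filter` is hypothetical, and the penalty sum is taken over the interior dates only, since the second difference is not defined at the endpoints. Writing $D$ for the second-difference operator, the first-order condition of the quadratic loss is $(I + \lambda D'D)\, y^g = y$.

```python
import numpy as np


def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) minimizing sum (y - g)^2 + lam * sum (second difference of g)^2."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # (n - 2) x n second-difference operator: row t gives g[t] - 2 g[t+1] + g[t+2].
    d = np.zeros((n - 2, n))
    for t in range(n - 2):
        d[t, t:t + 3] = [1.0, -2.0, 1.0]
    # First-order condition of the quadratic loss: (I + lam * D'D) g = y.
    trend = np.linalg.solve(np.eye(n) + lam * d.T @ d, y)
    return trend, y - trend


# Example: a noisy upward-trending series (made-up data).
rng = np.random.default_rng(0)
series = 0.01 * np.arange(80) + 0.02 * rng.standard_normal(80)
trend, cycle = hp_filter(series, lam=1600.0)
print(cycle.std())
```

A dense matrix is used for readability; for long series a sparse or banded solver would be the more natural choice. The two limiting cases mentioned in the text are easy to check: with `lam=0.0` the trend reproduces the series, and a series that is exactly linear in time has a zero cyclical component for any `lam`.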

The standard detrending procedure is the following. Let $X_t$ be the series of interest. Take $\ln X_t$. HP-filter $\ln X_t$. Take the standard deviation of the filtered $\ln X_t$. This is the percentage standard deviation of $X_t$. Relative percentage standard deviations are usually expressed relative to the percentage standard deviation of GDP [12].

[12] Take a series $X_t = (1 + \alpha_t) T_t$. $T_t$ represents the trend component, while $\alpha_t$ represents the percentage deviation from trend. Hence, after taking logs, $\ln X_t = \ln(1 + \alpha_t) + \ln T_t$. If one were to HP-filter $\ln X_t$, one would get a trend term and a deviation term. What the procedure explained above amounts to is isolating the deviation term. By HP-filtering $\ln X_t$, one tries to pick up $\ln(1 + \alpha_t) \approx \alpha_t$.

Once the U.S. time series have been detrended, business cycle facts can be presented. Several measures are of interest. First, we can look at the amplitude of fluctuations in the data. Second, we can measure the correlation of aggregate variables with real GDP. That allows us to verify whether a particular variable is pro- or countercyclical with respect to $y_t$. Third, we can look at cross-correlations over time to see if one variable tends to lead or lag another variable. Below is a table summarizing the U.S. business cycle data from 1954 to ...

Variable x_t                              Std Dev    Corr(x_{t-1}, y_t)   Corr(x_t, y_t)   Corr(x_{t+1}, y_t)
GNP                                       1.72%      ...                  ...              ...
Consumption: Non-Durables & Services      0.86%      ...                  ...              ...
Investment                                5.34%      ...                  ...              ...
Non-Farm Hours                            1.59%      ...                  ...              ...
Productivity [13]                         0.90%      ...                  ...              ...

[13] GNP/Hours

The magnitudes of fluctuations in output and hours are similar. This confirms the general consensus that the effects of business cycles are most clearly felt in the labor market. Consumption is the smoothest of the series [14]. Investment fluctuates the most. Productivity is slightly procyclical, but varies less than output.

[14] This should not be surprising given the concavity of the utility function.

This can now be compared with the same measures as simulated with the model. The model has been simulated 100 times, each simulation lasting 150 periods (or the length of the observation period) [15]. The simulated data were HP-filtered to give the same representation as the U.S. data. The results of the simulated economy are provided below:

[15] For each simulation, the first observations have been discarded, in order to get rid of dependence on initial values. However, the simulation still produced 150 observations.

Variable x_t                            Std Dev   Corr(x_{t-1}, y_t)   Corr(x_t, y_t)   Corr(x_{t+1}, y_t)
GNP                                     1.35%
Consumption: Non-Durables & Services    0.33%
Investment                              5.95%
Non-Farm Hours                          0.77%
Productivity [16]                       0.61%

Performance of simulated model: One question can be answered using these simulations: assuming the economy is perfectly competitive, how much of the variation in output can be explained by optimal adjustments to purely real shocks to the productive capacity of the economy? This question is an interesting one because, before the advent of RBC theory, all models assumed that fluctuations were due to nominal (monetary) shocks, and that fluctuations were how an economy with market imperfections reacted to these nominal shocks. In that sense, RBC is a radical departure from the previous literature. In the artificial economy, output fluctuates less than in the U.S. economy (a standard deviation of 1.35% against 1.72%), but a large share of the fluctuations can still be accounted for without assuming any kind of market imperfection. The investment time series is very volatile, both in the U.S. and in the artificial economy. All time series are procyclical in the U.S. and this is reflected in the model economy. This, however, is not surprising, given that there is only one source of uncertainty in the economy, $z_t$. The model does not perform as well for hours of work or productivity, which suggests that some elements of the labor market are missing. Another shortcoming is that the consumption series is not volatile enough in the model economy.

Interpretation of the model: There are several channels through which a shock is propagated in the economy. The shocks considered affect the productive capacity of the economy, and they are propagated by the manner in which optimizing agents (at the micro level) react and alter their economic decisions (investment and consumption).
Because their utility functions are concave (they are risk averse), households smooth their consumption over their lifetime, so that a change in output manifests itself partly through a change in consumption and partly through a change in investment. As households try to avoid wide fluctuations in consumption, it is not surprising to find, in the model and in the data, that investment is more volatile than consumption. Of course, variations in investment now affect future output. Hence shocks are transmitted through time. Finally, households substitute leisure across periods in response to a rise or fall in wages in this period (due to a rise or fall in $z_t$, and thus in labor productivity).

It is interesting to point out the role of capital accumulation. Suppose, for the sake of argument, that output is only a function of labor (i.e. abstract from capital). Then the household's problem becomes static. Assume that a positive productivity shock hits the economy. Then output $y_t$ and the wage $w_t$ increase proportionately, and a change in $z_t$ has the same secular effect as a trend. The income and substitution effects then cancel out and labor $h_t$ is constant (but consumption $c_t$ increases in proportion to $w_t$). With capital, however, investment would increase and consumption would not increase as much. Thus, it is also efficient to lower consumption and raise labor hours relative to the no-investment case.

[16] GNP/Hours.

Another point that is often stressed in the RBC literature is how the persistence of productivity shocks affects the model. Suppose $\rho = 0$. Assume a one-time (temporary) shock to the economy. The marginal productivity of labor increases this period, and the representative household faces an unusually high opportunity cost of taking leisure this period. While there are offsetting income and substitution effects, the model's preferences were chosen so that a permanent increase in the real wage generates exactly offsetting income and substitution effects, so that labor is left unchanged following such an increase. An implication is that labor has to rise in response to a temporary productivity increase. With a temporary shock, there is a much smaller income effect and there is a great incentive to substitute intertemporally, since the current wage is high relative to expected future wages. On net, the positive labor response amplifies the productivity shock. The agents must decide what to do with this additional income. One possibility is to consume it all in one period. This would be inefficient, given that the marginal utility of consumption is decreasing, which induces a preference for smooth consumption paths. It is optimal to increase consumption both today and tomorrow. When there is serial correlation in the productivity shocks ($\rho > 0$), the same mechanism is at work, but the effects are drawn out over time.

Conclusion: In conclusion, the artificial economy performs relatively well, but can be improved along certain lines, particularly the labor market. Also, the model can be generalized by adding new sources of uncertainty, such as government spending shocks or monetary shocks, or any other type of shock.
We will look at how the model can be improved by adding new shocks, how it performs when including monetary shocks, and how it can be modified to capture essential aspects of the labor market. Although we will not study the case, RBC theory can be used to study economies with heterogeneous agents [17]. You should retain from this chapter that RBC theory is a rigorous methodology that is flexible enough to study a wide range of problems.

[17] Agents characterized by age or skill, or subject to idiosyncratic shocks.

1.4 An application of RBC theory

We present here a paper entitled "Variance properties of Solow's productivity residual and their cyclical implications" (Finn, JEDC 1995). It is intended first as an application of the methodology just learned, but also to show that the concept of RBC can be extended to very different situations. We will see other examples of that later, when we study the labor market.

The question is: what constitutes a technology shock? Theoretically, it is any real shock that influences the productive capacity of an economy, since it enters as a multiplicative factor in the production function. When we compute the Solow residuals, we obtain a time series for $\{z_t\}$, but it does not tell us what these $z_t$ stand for. Weather shocks may be considered as (negative) technology shocks (they decrease the output produced from a given amount of labor and capital inputs). What else may constitute a temporary technology shock? The paper looks at how energy price shocks [18], which are not part of the standard model, affect the economy.

The impetus for the paper is the observation that the correlation between the growth rates of the Solow residual and oil prices is 0.55 and the correlation between the growth rates of the Solow residual and total government spending is [...]. It seems that the Solow residuals, as computed, may include more than pure technology shocks and are influenced by such events as energy price shocks and, to a lesser degree, government spending shocks. The paper investigates whether energy shocks influence economic outcomes in a way that cannot be captured by a production function whose only inputs are labor and capital.

The channel through which energy prices influence the productive capacity of the economy is capital utilization. The capital rented by firms can now be used more or less intensively. Energy costs (and depreciation) depend on how intensively the machinery is being used. For example, one may expect that if energy prices increase, capital utilization (hours of service per period or speed of utilization per hour) will decrease and hence output will decrease.

The model makes the standard assumptions that the economy is perfectly competitive, with a representative firm, a representative household and a government. The production function exhibits constant returns to scale and labor-augmenting technological change. The utility function has constant relative risk aversion and unitary intertemporal elasticity of substitution. The defining characteristics of this model are variable energy costs and variable depreciation costs. Energy prices are exogenous (open economy). Capital utilization is endogenous, that is, agents optimally choose how intensively to use their capital. The economy is hit by stochastic shocks to technology, energy prices and government spending.
Firm's problem: The production function is Cobb-Douglas in labor input ($l_t$) and capital services. Capital services are equal to the physical capital rented ($k_t$) times the rate of capital utilization ($h_t$). Capital utilization is defined as hours of service per period or speed of utilization per hour. Firms pay for what they use in production: $l_t$ and $(k_t h_t)$. Hence, firms maximize profits, taking wage rates ($w_t$) and capital services rental rates ($r_t$) as given:

$$\max_{l_t,\,(k_t h_t)} \; y_t - w_t l_t - r_t (k_t h_t)$$

where

$$y_t = f(z_t l_t, k_t h_t) = (z_t l_t)^{\theta} (k_t h_t)^{1-\theta}$$

Household's problem: The household maximizes its expected discounted lifetime utility, subject to its budget constraint. The novelty is that energy costs and depreciation are functions of how intensively the capital is being used. In particular, both energy costs and depreciation are increasing and convex functions of capital utilization. A higher rate of utilization increases wear and tear and causes capital to depreciate faster. Using capital more intensively also increases energy costs. Because of wear and tear, the machinery is not as energy efficient, so that a higher rate of capital utilization also increases energy usage faster. Hence, we have

$$k_{t+1} = [1 - \delta(h_t)]\, k_t + i_t, \qquad \delta(h_t) = \frac{h_t^{\omega}}{\omega}, \quad \omega > 1$$

This is the law of motion with endogenous depreciation ($i_t$ is investment at date $t$). Energy usage per unit of capital is given by

$$\frac{e_t}{k_t} = a(h_t) = \frac{h_t^{\nu}}{\nu}, \quad \nu > 1$$

Hence, the household's maximization problem is

$$\max \; E_0 \left\{ \sum_{t=0}^{+\infty} \beta^t \left( \ln c_t + \gamma \ln(1 - l_t) \right) \right\}$$

$$\text{s.t.} \quad w_t l_t + (1 - \tau)\, r_t k_t h_t = c_t + i_t + x_t + p_t e_t$$

where $c_t$ is consumption, $x_t$ is a lump-sum tax and $p_t$ is the exogenous energy price. $\tau$ is a tax on capital income.

[18] Oil price shocks come to mind immediately.

Government: The economy is subject to exogenous government purchases $g_t$, and the government budget balances every period:

$$g_t = \tau r_t k_t h_t + x_t$$

Stochastic nature of the economy:

$$\ln z_{t+1} = \ln z_t + \ln \bar{z} + u_{z,t+1}$$
$$\ln \hat{g}_{t+1} = \rho_g \ln \hat{g}_t + (1 - \rho_g) \ln \bar{g} + u_{g,t+1}$$
$$\ln p_{t+1} = \rho_p \ln p_t + (1 - \rho_p) \ln \bar{p} + u_{p,t+1}$$
$$g_t = \hat{g}_t z_t, \qquad 0 < \rho_g, \rho_p < 1$$
$$u_t = (u_{z,t},\, u_{g,t},\, u_{p,t})', \qquad E(u_t) = 0$$

where $u$ is governed by a Markov process $\Phi(u_{t+1} \mid u_t)$. Hence $\ln \bar{z}$ is the mean growth of $z_t$, $\ln \bar{g}$ is the mean of $\ln \hat{g}_t$, and $\ln \bar{p}$ is the mean of $\ln p_t$. Since $z_t$ is not stationary and $\hat{g}_t$ is stationary, $g_t$ is also not stationary (it grows with the size of the economy). Movements in $z_t$ generate permanent movements in $g_t$, but movements in $\hat{g}_t$ only generate temporary fluctuations in $g_t$.

Defining the recursive competitive equilibrium: The firm chooses $l_t$ and $(k_t h_t)$ such that:

$$w_t = z_t f_1(z_t l_t, k_t h_t)$$
$$r_t = f_2(z_t l_t, k_t h_t)$$

Using dynamic programming to solve the household's problem, we have to define the problem's state and control variables:

Individual state variable: $k_t$
Aggregate state variables: $K_t$, $u_t$
Control variables: $l_t$, $h_t$, $k_{t+1}$

Bellman's equation can be written as:

$$V(k_t, K_t, u_t) = \max_{l_t,\, h_t,\, k_{t+1}} \left\{ u(c_t, l_t) + \beta E_t \left[ V(k_{t+1}, K_{t+1}, u_{t+1}) \mid u_t \right] \right\}$$

Definition: A recursive competitive equilibrium is a list of aggregate laws of motion $L_t(K_t, u_t)$, $H_t(K_t, u_t)$, $K_{t+1}(K_t, u_t)$, individual decision rules $l_t(k_t, K_t, u_t)$, $h_t(k_t, K_t, u_t)$, $k_{t+1}(k_t, K_t, u_t)$, factor prices $w_t(K_t, u_t)$, $r_t(K_t, u_t)$ and lump-sum taxes $x_t(K_t, u_t)$, such that:

1) Firms maximize and markets clear:
$$w_t(K_t, u_t) = z_t f_1(z_t L_t(K_t, u_t), K_t H_t(K_t, u_t))$$
$$r_t(K_t, u_t) = f_2(z_t L_t(K_t, u_t), K_t H_t(K_t, u_t))$$

2) Households maximize, i.e. Bellman's equation is satisfied by $l_t(k_t, K_t, u_t)$, $h_t(k_t, K_t, u_t)$, $k_{t+1}(k_t, K_t, u_t)$.

3) Consistency of individual and aggregate behavior:
$$l_t(K_t, K_t, u_t) = L_t(K_t, u_t)$$
$$h_t(K_t, K_t, u_t) = H_t(K_t, u_t)$$
$$k_{t+1}(K_t, K_t, u_t) = K_{t+1}(K_t, u_t)$$

4) The government budget balances, i.e.:
$$g_t = x_t(K_t, u_t) + \tau\, r_t(K_t, u_t)\, K_t H_t(K_t, u_t)$$

Once the equilibrium has been defined, one can look at the first order conditions. The first order conditions with respect to labor $l_t$, capital utilization $h_t$, and $k_{t+1}$ are:

$$w_t\, u_1(c_t, l_t) = -u_2(c_t, l_t) \qquad (\text{FOC}[l_t])$$

Marginal benefit (in consumption terms) of working one more hour = marginal disutility of working one more hour.

$$(1 - \tau)\, r_t k_t = \delta'(h_t)\, k_t + p_t k_t\, a'(h_t) \qquad (\text{FOC}[h_t])$$

After-tax marginal return to an increase in $h_t$ = marginal depreciation + marginal energy cost.

$$u_1(c_t, l_t) = \beta E_t \left\{ u_1(c_{t+1}, l_{t+1}) \left[ (1 - \tau)\, r_{t+1} h_{t+1} + 1 - \delta(h_{t+1}) - p_{t+1}\, a(h_{t+1}) \right] \right\} \qquad (\text{FOC}[k_{t+1}])$$

This last condition states that the marginal benefit of current consumption is equal to the discounted expected marginal benefit of consumption next period, including the returns from delaying consumption. Remark that, in the usual case, the last FOC would be:

$$u_1(c_t, l_t) = \beta E_t \left\{ u_1(c_{t+1}, l_{t+1}) \left[ (1 - \tau)\, r_{t+1} + 1 - \delta \right] \right\}$$

In summary, the equilibrium is defined by 13 equations:
1) Firm's labor efficiency condition
2) Firm's capital services efficiency condition
3) Household's labor efficiency condition
4) Household's capital utilization efficiency condition
5) Household's capital accumulation efficiency condition
6) Domestic resource constraint [19]
7) Production function equation
8) Law of motion of capital
9) Energy usage constraint
10) Government budget constraint
11) Stochastic equation governing technological shocks
12) Stochastic equation governing energy price shocks
13) Stochastic equation governing government spending shocks

[19] Obtained by combining the household's budget constraint and the government budget constraint: $y_t - p_t e_t = c_t + i_t + g_t$.

Now that the equilibrium equations have been established, we can see how energy shocks are diffused through the economy. However, it is VERY IMPORTANT to recognize that, given the general equilibrium nature of the model, all the effects of a particular type of shock take place simultaneously. Hence, it may be difficult to disentangle these effects. Nonetheless, it is interesting to figure out all the channels through which a shock affects the economy. Ultimately, it will be necessary to simulate the model to account for the general equilibrium effects.

Diffusion of a positive energy price shock:
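Before tracing the shock through the full system, the capital utilization condition FOC[$h_t$] can be explored on its own. With $\delta(h) = h^{\omega}/\omega$ and $a(h) = h^{\nu}/\nu$, dividing the condition by $k_t$ gives $(1-\tau) r_t = h_t^{\omega-1} + p_t h_t^{\nu-1}$. The sketch below solves this for $h_t$ by bisection, holding the rental rate $r_t$ fixed. That is a partial-equilibrium simplification, since $r_t$ itself responds to $h_t$ in the model. The parameter values and the function names are illustrative assumptions, not the paper's calibration.

```python
def marginal_cost(h, p, omega, nu):
    """Marginal depreciation plus marginal energy cost per unit of capital."""
    return h ** (omega - 1.0) + p * h ** (nu - 1.0)


def utilization(r, p, tau=0.35, omega=1.25, nu=1.66, lo=1e-12, hi=10.0, tol=1e-12):
    """Solve (1 - tau) * r = marginal_cost(h) for h by bisection.

    marginal_cost is increasing in h because omega > 1 and nu > 1,
    so the root inside the bracket [lo, hi] is unique.
    """
    target = (1.0 - tau) * r
    if not marginal_cost(lo, p, omega, nu) < target < marginal_cost(hi, p, omega, nu):
        raise ValueError("no solution inside the bracket [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_cost(mid, p, omega, nu) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# A higher energy price lowers utilization; a higher rental rate raises it.
h_base = utilization(r=0.5, p=1.0)
h_high_price = utilization(r=0.5, p=2.0)
h_high_rent = utilization(r=0.6, p=1.0)
```

Because both cost terms are convex in $h$, the right-hand side is increasing in $h$, which is why a rise in $p_t$ must be met by a fall in $h_t$ when $r_t$ is held fixed.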

Assume that there is a positive energy price shock, i.e. that $p_t \uparrow$. This creates a negative income effect (from the domestic resource constraint). Consumption and leisure being normal goods, this implies that $c_t \downarrow$ and $l_t \uparrow$. The capital utilization efficiency condition, and the fact that depreciation and energy costs are convex functions of utilization, imply that $h_t \downarrow$. Also, due to the fall in capital utilization, $y_t$ is reduced, which reinforces the income effect [20]. In that sense, a positive energy price shock is equivalent, in a way, to a negative technology shock. In conclusion, positive energy price shocks tend to reduce capital utilization.

Diffusion of a positive government spending shock: Assume a positive government spending shock, $g_t \uparrow$. This again results in a negative income effect, so that $c_t \downarrow$ and $l_t \uparrow$. If labor increases, the marginal productivity of capital services increases and, from the capital utilization efficiency condition and the convexity of depreciation and energy costs, $h_t \uparrow$. Finally, with $h_t$ and $l_t$ increasing, total output $y_t$ tends to increase, which partly offsets the negative shock due to increased government spending [21]. In conclusion, positive government spending shocks tend to increase capital utilization.

[20] Notice that, due to the general equilibrium effects, the marginal productivity of labor diminishes and the household's labor efficiency condition implies that households substitute leisure for labor, hence that $l_t$ tends to decrease (mitigating the increase in $l_t$ previously mentioned).
[21] Again to account for the general equilibrium effects, with labor increasing, the marginal productivity of labor decreases (mitigated by the increase in $h_t$) and, from the substitution effect, $l_t$ decreases ($w_t$ is the price of leisure, so that if $w_t$ decreases, leisure increases).

The objective of the project is to determine a true technology shock, i.e. a measure of an economy's ability to produce output from a given quantity of inputs, regardless of energy prices or government spending. The usual calculation of Solow residuals would give us

$$\ln SR_t = \left[ \ln y_t - \theta \ln l_t - (1 - \theta) \ln k_t \right] / \theta$$

where $k_t$ is capital as reported in the National Income and Product Accounts. This measure of capital is calculated assuming constant depreciation. However, this does not fit with the model. To calculate a proper measure of $k_t$ in accordance with the setup, we use the capital utilization efficiency condition and the capital stock law of motion to get series on $h_t$ and $k_t$. Hence, the true technology shocks are given by

$$\ln z_t = \left[ \ln y_t - \theta \ln l_t - (1 - \theta)(\ln k_t + \ln h_t) \right] / \theta$$

When technology shocks are computed in this fashion, the correlation between the growth rates of the technology shocks and oil prices on the one hand, and the correlation between the growth rates of the technology shocks and total government spending on the other hand, are very close to zero [22].

[22] And statistically insignificant.

2 Market imperfections and business cycles

2.1 The Lucas imperfect information model

We will now study an economy that has all the ingredients of a competitive economy (rationally behaving agents, price-taking behavior, market clearing), but has the characteristic that agents are imperfectly informed about the state of the economy. There will be preference shocks (affecting the relative price between goods) and monetary shocks. However, the monetary shocks are unobserved. In particular, producers who observe a change in the price of their product do not know whether it is a relative price increase (i.e. an increase in the price of their product relative to the aggregate price level) or a general price increase, reflected in the aggregate price level. Hence, they have to make production decisions based on imperfect information. In particular, they need to form expectations about the overall price level based on the only information they get, that is, the price of their own commodity. The treatment of the Lucas imperfect information model comes from the Romer textbook.

In order to focus on the central problem, which is to determine how agents make their decisions when faced with uncertainty about the relative demand for their product, the model makes a number of simplifying assumptions. Agents use their own labor to produce a good indexed by $i$. They sell that good, taking the market price as given. The agent's production function is

$$Y_i = L_i$$

where $L_i$ is labor. The individual consumes his entire real income $P_i Y_i / P$, where $P$ is the aggregate price level, or the price of a market basket of goods [23]. His utility is determined by consumption and leisure [24]:

$$U_i = \frac{P_i L_i}{P} - \frac{1}{\gamma} L_i^{\gamma}, \qquad \gamma > 1$$

[23] Or an index of prices of all goods.
[24] This assumes constant marginal utility of consumption and increasing marginal disutility of work.

The case of perfect information

For comparison, first suppose that the producer has perfect information on both the price $P_i$ he faces and the aggregate price level $P$. Taking prices as given, the agent chooses $L_i$ to maximize his utility. The first order condition is

$$\frac{P_i}{P} - L_i^{\gamma - 1} = 0$$

Hence

$$L_i = \left( \frac{P_i}{P} \right)^{\frac{1}{\gamma - 1}}$$

Taking logarithms [25]:

$$l_i = \frac{1}{\gamma - 1} (p_i - p) \qquad (21)$$

This implies that production of good $i$ increases with $p_i - p$, the relative price of good $i$. That determines the individual's supply of good $i$. We now need to look at the demand side. It is assumed that the demand per producer of good $i$ depends on real income, the good's relative price and a random shock to preferences [26]:

$$q_i = y + z_i - \eta (p_i - p) \qquad (22)$$

where $y$ is log aggregate real income, $z_i$ is a shock to the demand for good $i$ [27], $\eta$ is the elasticity of demand for each good, and $q_i$ is the demand per producer of good $i$ [28]. It is assumed that the $z_i$'s have zero mean across sectors, i.e. that they are pure relative demand shocks (or preference shocks), hence $E(z_i) = 0$. It is assumed that $y = \bar{q}_i$ (the average across goods of the $q_i$'s) and $p = \bar{p}_i$ (the price index is defined as the average across goods of the $p_i$'s). Aggregate demand is given by

$$y = m - p \qquad (23)$$

where $m$ can be considered as the money in circulation [29].

Equilibrium in the market for good $i$ requires that the market clears. Hence,

$$\frac{1}{\gamma - 1} (p_i - p) = y + z_i - \eta (p_i - p)$$

or

$$p_i = \frac{\gamma - 1}{1 + \eta (\gamma - 1)} (y + z_i) + p$$

Averaging over the $p_i$'s, and using $E(z_i) = 0$ and $p = \bar{p}_i$, we get that [30]

$$y = 0$$

and hence, from (23),

$$m = p$$

So, in the case of perfect information, changes in the money supply are fully reflected in $p$ and all individual prices $p_i$ rise proportionally. However, output is unchanged. Hence, not surprisingly, money is neutral in the case of perfect information.

[25] Lowercase letters correspond to the natural logarithms of uppercase letters.
[26] For the particular functional form assumed for the utility function, this is an approximation.
[27] This is a relative demand shock and can be interpreted as a preference shock.
[28] Total (log) demand $= \ln N + y + z_i - \eta (p_i - p)$.
[29] See Romer p. 269, on interpreting $y = m - p$, where $m$ can be viewed as a generic variable affecting aggregate demand, not necessarily money.
[30] Do not forget that $y$ is log output.

The case of imperfect information

Producers observe $P_i$, but not $P$. However, since their optimal decision depends on $P$, they need to make inferences about the aggregate price level, based solely on the price of the good they produce and sell. Define $r_i = p_i - p$ (the relative price of good $i$). What matters to the individual is $r_i$, but what he sees is $p_i$. From (21), we have

$$l_i = \frac{1}{\gamma - 1} E[r_i \mid p_i] \qquad (24)$$

Notice that the maximization problem faced by the producer is $\max E[U_i \mid P_i]$, given $P_i$ and expectations of $P_i / P$, which gives

$$L_i^{\gamma - 1} = E[P_i / P \mid P_i]$$

or

$$(\gamma - 1)\, l_i = \ln E[P_i / P \mid P_i] \qquad (25)$$

This is different from (24), which states that $(\gamma - 1)\, l_i = E[\ln(P_i / P) \mid P_i]$. We know that, due to the concavity of the logarithm function, these are not equal. However, Lucas showed that if one assumes that $\ln(P_i / P) = E[\ln(P_i / P) \mid P_i] + u_i$, where $u_i$ is normal with mean zero and a standard deviation that is independent of $P_i$, then the labor supplies defined in (24) and (25) only differ by a constant. In (24), expectations are rational in the sense that the subjective probability distribution of $r_i$ given $p_i$ is equal to its objective distribution.

We now look in detail at how producers form their expectations. It is assumed that the monetary shock $m$ and the relative demand shock $z_i$ are normally distributed and independent:

$$z_i \sim N(0, V_z), \qquad m \sim N(E[m], V_m), \qquad z_i \perp m$$
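The constant separating (24) and (25) can be made explicit using the mean of a lognormal variable. The sketch below writes $\sigma_u^2$ for the variance of $u_i$, a symbol that is introduced here for illustration and does not appear in the notes. It treats $u_i$ as independent of $P_i$ and uses $E[e^{u_i}] = e^{\sigma_u^2/2}$ for $u_i \sim N(0, \sigma_u^2)$.

```latex
\begin{aligned}
E\!\left[ P_i/P \mid P_i \right]
  &= E\!\left[ \exp\!\big( E[\ln(P_i/P) \mid P_i] + u_i \big) \,\middle|\, P_i \right] \\
  &= \exp\!\big( E[r_i \mid p_i] \big)\, E\!\left[ e^{u_i} \right]
   = \exp\!\left( E[r_i \mid p_i] + \tfrac{1}{2}\sigma_u^2 \right), \\[4pt]
(\gamma - 1)\, l_i
  &= \ln E\!\left[ P_i/P \mid P_i \right]
   = E[r_i \mid p_i] + \tfrac{1}{2}\sigma_u^2 .
\end{aligned}
```

Under these assumptions the labor supply in (25) equals the one in (24) plus the constant $\sigma_u^2 / (2(\gamma - 1))$, so working with (24) does not change how $l_i$ responds to $p_i$.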

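We will assume for now (but confirm later) that $p$ and $r_i$ are normally distributed and independent:

$$p \sim N(E(p), V_p), \qquad r_i \sim N(E(r_i), V_{r_i}), \qquad p \perp r_i$$

Hence $p_i$ is also normally distributed:

$$p_i \sim N(E(p_i), V_{p_i}), \qquad E(p_i) = E(p) + E(r_i), \qquad V_{p_i} = V_p + V_{r_i}$$

We will assume for now (but confirm later) that

$$E(p) = E(m), \qquad E(r_i) = 0,$$

and that $V_p$ and $V_{r_i}$ can be expressed as functions of $V_m$ and $V_z$.

Since $r_i$ and $p_i$ are jointly normally distributed, the expectation of one conditional on the other is linear in the conditioning variable. This can be seen as follows. Assume two random variables $(X, Y)$ are jointly normally distributed. Hence their pdf is

$$f_{X,Y}(x, y) = \frac{1}{2 \pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}}\, e^{-q/2}$$

$$q = \frac{1}{1 - \rho^2} \left[ \left( \frac{x - \mu_1}{\sigma_1} \right)^2 - 2 \rho \left( \frac{x - \mu_1}{\sigma_1} \right) \left( \frac{y - \mu_2}{\sigma_2} \right) + \left( \frac{y - \mu_2}{\sigma_2} \right)^2 \right]$$

$q$ can be rewritten as

$$q = \left( \frac{x - \mu_1}{\sigma_1} \right)^2 + \left( \frac{y - b}{\sigma_2 \sqrt{1 - \rho^2}} \right)^2 \qquad \text{where } b = \mu_2 + \rho \frac{\sigma_2}{\sigma_1} (x - \mu_1)$$

Now, we can rewrite the joint pdf as

$$f_{X,Y}(x, y) = \left[ \frac{1}{\sqrt{2 \pi}\, \sigma_1}\, e^{-\frac{1}{2} \left( \frac{x - \mu_1}{\sigma_1} \right)^2} \right] \left[ \frac{1}{\sqrt{2 \pi}\, \sigma_2 \sqrt{1 - \rho^2}}\, e^{-\frac{1}{2} \left( \frac{y - b}{\sigma_2 \sqrt{1 - \rho^2}} \right)^2} \right] = f_X(x)\, f_{Y \mid X}(y \mid x)$$

Hence, we see that the conditional pdf is given by the term in the second bracket. It is a normal distribution:

$$f_{Y \mid X}(y \mid x) \sim N\!\left( \mu_2 + \rho \frac{\sigma_2}{\sigma_1} (x - \mu_1),\; \sigma_2^2 (1 - \rho^2) \right)$$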

Applying this result, we get

$$E[r_i \mid p_i] = E(r_i) + \rho_{r_i, p_i} \frac{\sigma_{r_i}}{\sigma_{p_i}} (p_i - E(p_i))$$

Since $E(r_i) = 0$ and $E(p_i) = E(p)$, we have that

$$E[r_i \mid p_i] = \rho_{r_i, p_i} \frac{\sigma_{r_i}}{\sigma_{p_i}} (p_i - E(p))$$

where $\rho_{r_i, p_i} = \dfrac{cov(r_i, p_i)}{\sigma_{r_i} \sigma_{p_i}}$. The covariance can be calculated, using the independence of $r_i$ and $p$:

$$\begin{aligned}
cov(r_i, p_i) &= E[(r_i - E(r_i))(p_i - E(p_i))] \\
&= E[r_i (p_i - E(p))] \\
&= E[r_i (r_i + p - E(p))] \\
&= E[r_i^2] + E[r_i (p - E(p))] \\
&= E[r_i^2] = \sigma_{r_i}^2
\end{aligned}$$

Hence,

$$E[r_i \mid p_i] = \frac{\sigma_{r_i}^2}{\sigma_{r_i} \sigma_{p_i}} \frac{\sigma_{r_i}}{\sigma_{p_i}} (p_i - E(p)) = \frac{\sigma_{r_i}^2}{\sigma_{p_i}^2} (p_i - E(p))$$

Hence,

$$E[r_i \mid p_i] = \frac{V_{r_i}}{V_{r_i} + V_p} (p_i - E[p]) \qquad (26)$$

Equation (26) states that if the observed $p_i$ equals its mean, then the guessed $r_i$ also equals its mean, and that the observed $p_i$ and the expected $r_i$ are always on the same side of their respective means. Finally, we can see that if the variance of $r_i$ goes to zero, $p_i$ is a good expectation of $p$ ($E(p \mid p_i) = p_i$), and if the variance of $p$ goes to zero, all the variation in $p_i$ is due to $r_i$ ($E(p \mid p_i) = E(p)$).

By substitution of (26) into (24), we get the individual labor supply:

$$l_i = \frac{1}{\gamma - 1} \frac{V_{r_i}}{V_{r_i} + V_p} (p_i - E[p]) = b\, (p_i - E[p]) \qquad (27)$$

Hence, averaging across producers:

$$y = b\, (p - E[p]) \qquad (28)$$

This is the famous Lucas supply curve. It implies that, in the aggregate, the deviation of output from its average value (zero in the model) increases with the surprise in the price level ($p - E[p]$). If the price level were perfectly known every period, there would not be any deviation in output (money neutrality). It is easy to verify that the higher $V_{r_i}$, the higher the illusion effect, and the higher $V_p$, the lower the illusion effect (in fact, $V_{r_i} / (V_{r_i} + V_p)$ represents the share of the variance of $p_i$ that is due to the variance of $r_i$). This gives a microfoundation to the Phillips curve (negative relationship between inflation and unemployment).
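The signal extraction weight in (26) can be checked by simulation: if $p$ and $r_i$ are drawn as independent normals and $p_i = p + r_i$, the least-squares slope of $r_i$ on $p_i$ should be close to $V_{r_i} / (V_{r_i} + V_p)$. The sketch below does this with the standard library only. The function names, the seed and the variances used in the example are arbitrary choices made for illustration.

```python
import random


def signal_extraction_weight(v_r, v_p):
    """Theoretical weight on (p_i - E[p]) in equation (26)."""
    return v_r / (v_r + v_p)


def simulated_weight(v_r, v_p, mean_p=0.0, n=100_000, seed=0):
    """Least-squares slope of r_i on p_i = p + r_i from simulated draws."""
    rng = random.Random(seed)
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for _ in range(n):
        p = rng.gauss(mean_p, v_p ** 0.5)   # aggregate (log) price level
        r = rng.gauss(0.0, v_r ** 0.5)      # relative price of good i
        p_i = p + r                         # what the producer observes
        sum_x += p_i
        sum_y += r
        sum_xx += p_i * p_i
        sum_xy += p_i * r
    cov = sum_xy / n - (sum_x / n) * (sum_y / n)
    var = sum_xx / n - (sum_x / n) ** 2
    return cov / var


theory = signal_extraction_weight(1.0, 3.0)
estimate = simulated_weight(1.0, 3.0)
```

With $V_{r_i} = 1$ and $V_p = 3$, only a quarter of the variance of $p_i$ comes from relative price movements, so producers attribute a quarter of any observed price surprise to $r_i$.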

We can now combine the aggregate supply curve with the aggregate demand curve to calculate the equilibrium aggregate price level. From equations (23) and (28), we obtain:

$$m - p = b\, (p - E[p])$$

Hence

$$p = \frac{1}{b + 1}\, m + \frac{b}{b + 1}\, E[p] \qquad (29)$$

$$y = \frac{b}{b + 1}\, (m - E[p]) \qquad (30)$$

Assuming rational expectations, equation (29) implies that

$$E[p] = \frac{1}{b + 1} E[m] + \frac{b}{b + 1} E[p]$$

Hence

$$E[p] = E[m]$$

Rewriting (29) and (30), we get

$$p = E[m] + \frac{1}{b + 1}\, (m - E[m]) \qquad (31)$$

$$y = \frac{b}{b + 1}\, (m - E[m]) \qquad (32)$$

We can interpret equations (31) and (32). The expected monetary disturbance, $E[m]$, only affects the aggregate price level and has no effect on output. However, the unexpected portion of monetary policy, $(m - E[m])$, affects both the aggregate price level and total output. When $m - E[m] \neq 0$, producers attribute part of the increase in the demand for their own good to there being more money in the economy, and part to an increase in the relative demand for their product. Hence, they increase output. However, when $m$ increases while $m - E[m]$ remains constant, $y$ is unaffected. In that case, there is no money illusion and output is unchanged.

We must now check the validity of some assumptions made along the way. We see from (31) that $p$ is normal ((31) implies that $E(p) = E(m)$ and $V_p = \dfrac{V_m}{(b + 1)^2}$). Substituting (28) into (22), we get $q_i = b\,(p - E[p]) + z_i - \eta (p_i - p)$. From equation (27), we have that $l_i = b\,(p_i - p) + b\,(p - E[p])$. Equating supply of good $i$ with demand for good $i$, we obtain that $b\,(p_i - p) = z_i - \eta (p_i - p)$ and hence that $(p_i - p) = \dfrac{z_i}{b + \eta}$. Hence, $r_i$ is normal, $E(r_i) = 0$ and $V_r = \dfrac{V_z}{(b + \eta)^2}$. One can see that $p$ and $r_i$ are independent (since $m$ and $z_i$ are independent).

We see then that $b$ can be described only as a function of $\gamma$, $\eta$, $V_z$, and $V_m$:

$$b = \frac{1}{\gamma - 1} \cdot \frac{V_z}{V_z + \frac{(b + \eta)^2}{(b + 1)^2} V_m}$$

It is easy to show that the $b$ implicitly defined above increases in $V_z$ and decreases in $V_m$. Notice that when $V_z = 0$, $V_r = 0$ and $E[p \mid p_i] = p_i$, and $b = 0$ (no "illusion").
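Since $b$ appears on both sides of the last equation, it has to be computed numerically. A minimal sketch, assuming a simple bisection on the gap between the two sides: at $b = 0$ the gap is non-positive, and at $b = 1/(\gamma - 1)$ it is non-negative because the variance share is at most one, so a root lies in between. The function name `lucas_slope` and the parameter values in the example are illustrative.

```python
def lucas_slope(gamma, eta, v_z, v_m, tol=1e-12):
    """Find b with b = (1/(gamma-1)) * V_z / (V_z + ((b+eta)/(b+1))**2 * V_m)."""
    if gamma <= 1.0 or eta < 0.0 or v_z < 0.0 or v_m <= 0.0:
        raise ValueError("need gamma > 1, eta >= 0, V_z >= 0 and V_m > 0")

    def gap(b):
        ratio = ((b + eta) / (b + 1.0)) ** 2
        return b - (1.0 / (gamma - 1.0)) * v_z / (v_z + ratio * v_m)

    lo, hi = 0.0, 1.0 / (gamma - 1.0)   # gap(lo) <= 0 <= gap(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Comparative statics claimed in the text: b rises with V_z and falls with V_m.
b_base = lucas_slope(gamma=2.0, eta=2.0, v_z=1.0, v_m=1.0)
b_more_real_noise = lucas_slope(gamma=2.0, eta=2.0, v_z=2.0, v_m=1.0)
b_more_money_noise = lucas_slope(gamma=2.0, eta=2.0, v_z=1.0, v_m=2.0)
```

When $\eta = 1$ the ratio $(b + \eta)/(b + 1)$ equals one and the equation has the closed form $b = V_z / ((\gamma - 1)(V_z + V_m))$, which gives a quick check of the solver.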

2.2 Coordination failures

We first present a very general framework where agents take actions which they cannot coordinate between themselves. As a result of the lack of coordination, multiple equilibria may arise and the economy may be stuck either in a good equilibrium (boom) or a bad equilibrium (recession). We will then study a particular economy where different levels of unemployment may occur because of a lack of coordination between the actions the agents take.

We want to look at situations where agents cannot coordinate their actions. Hence, the concept of Nash equilibrium seems appropriate, since it assumes non-cooperative behavior among agents. This is not due to their unwillingness to cooperate, but rather to their inability to do so. Agents cannot coordinate their actions in a decentralized economy populated by many agents, even though they would benefit from doing so.

We first look at a general game that agents play. Assume that there are $I$ agents [31] who have to choose an action $e_i$ from an interval $[0, E]$. The payoff to agent $i$ of action $e_i$, taking the actions of the other agents $e_{-i}$ as given, is equal to $\sigma(e_i, e_{-i})$. We assume that the payoff functions have all the desired differentiability and continuity properties [32]. In particular, assume that the payoff functions are continuously differentiable and that

$$\frac{\partial^2 \sigma}{\partial e_i^2} < 0.$$

Denote the payoff to agent $i$ of action $e_i$, when all other agents take action $\bar{e}$, by $V(e_i, \bar{e})$. $e_i^*(\bar{e})$ is the optimal response of agent $i$ when all other agents choose action $\bar{e}$. Since we are only looking at symmetric Nash equilibria (SNE), we need to have

$$e_i^*(e) = e$$

Of course, there is the possibility of multiple SNE. Call $S$ the set of all SNE [33]:

$$S = \{ e \in [0, E] : V_1(e, e) = 0 \text{ and } V_{11}(e, e) < 0 \}$$

(assume that $\lim_{e \to 0} V_1(e, e) > 0$ and $\lim_{e \to E} V_1(e, e) < 0$).

Let us now define the set of efficient equilibria. We define symmetric cooperative equilibria (SCE) as actions by all agents that maximize the welfare of the representative agent. The set $\tilde{S}$ of all SCE is given by [34]

$$\tilde{S} = \{ e \in [0, E] : V_1(e, e) + V_2(e, e) = 0 \text{ and } V_{11}(e, e) + 2 V_{12}(e, e) + V_{22}(e, e) < 0 \}$$

Let us define the following:

If $V_2(e_i, \bar{e}) > 0$, the game exhibits positive spillovers (33)
If $V_2(e_i, \bar{e}) < 0$, the game exhibits negative spillovers (34)
If $V_{12}(e_i, \bar{e}) > 0$, the game exhibits strategic complementarity (35)
If $V_{12}(e_i, \bar{e}) < 0$, the game exhibits strategic substitutability (36)

[31] We could look equivalently at $I$ agents or $I$ groups of agents. What matters is that agents have a non-negligible effect on the payoffs of others and act strategically.
[32] For all technical assumptions, see Cooper and John (QJE 1988).
[33] Cooper and John (1988) make the appropriate assumptions for existence of an interior solution.
[34] The authors ensure the existence of an interior solution by assuming that $\lim_{e \to 0} [V_1(e, e) + V_2(e, e)] > 0$ and $\lim_{e \to E} [V_1(e, e) + V_2(e, e)] < 0$.
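These definitions can be made concrete with a hypothetical payoff function, which is an illustration and not taken from Cooper and John. Take $V(e_i, \bar{e}) = e_i\, g(\bar{e}) - e_i^2 / 2$ on $[0, 1]$ with $g(\bar{e}) = 0.1 + 0.8\,(3\bar{e}^2 - 2\bar{e}^3)$. Then $V_1 = g(\bar{e}) - e_i$, so the best response is $e_i^*(\bar{e}) = g(\bar{e})$, $V_{11} = -1 < 0$, $V_{12} = g'(\bar{e}) > 0$ on $(0, 1)$ (strategic complementarity) and $V_2 = e_i\, g'(\bar{e}) > 0$ for $e_i > 0$ (positive spillovers). The sketch below locates the symmetric Nash equilibria as the fixed points of $g$.

```python
def best_response(e_bar):
    """Optimal action e_i*(e_bar) for the hypothetical payoff: g(e_bar)."""
    return 0.1 + 0.8 * (3.0 * e_bar ** 2 - 2.0 * e_bar ** 3)


def payoff(e_i, e_bar):
    """Hypothetical payoff V(e_i, e_bar) = e_i * g(e_bar) - e_i**2 / 2."""
    return e_i * best_response(e_bar) - 0.5 * e_i ** 2


def symmetric_nash_equilibria(n_grid=999, tol=1e-12):
    """Fixed points of the best response on [0, 1], found by scan plus bisection."""
    def excess(e):
        return best_response(e) - e

    roots = []
    grid = [k / n_grid for k in range(n_grid + 1)]
    for lo, hi in zip(grid, grid[1:]):
        f_lo = excess(lo)
        if f_lo * excess(hi) >= 0.0:
            continue                      # no sign change on this cell
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if excess(mid) * f_lo < 0.0:
                hi = mid                  # root lies in [lo, mid]
            else:
                lo = mid                  # root lies in [mid, hi]
        roots.append(0.5 * (lo + hi))
    return roots


sne = symmetric_nash_equilibria()
sne_payoffs = [payoff(e, e) for e in sne]
```

For this payoff there are three interior SNE, and the equilibrium payoff $V(e, e) = e^2 / 2$ is higher at the equilibria with higher actions, so agents would all prefer the high-action equilibrium even though each of the three is consistent with individual optimization.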

Definitions (33) and (34) characterize the positive or negative externalities present in the game. The action that one agent takes may affect the other agents' payoffs. The interactions are at the level of payoffs: an increase in a player's strategy affects other players' payoffs. Definition (35) states that an increase in the action of all other agents increases the marginal return to agent $i$'s action. Hence, an increase in all the other agents' strategies (an increase in $\bar{e}$) causes an increase in agent $i$'s optimal strategy. The interactions are at the level of strategies. Under strategic complementarity, one therefore expects $e_i^*(\bar{e})$ to be an increasing function of $\bar{e}$ [from the implicit function theorem, $\dfrac{d e_i^*}{d \bar{e}} = -\dfrac{V_{12}}{V_{11}} > 0$].

Here are a few results about this game [35]:

Proposition 1: Strategic complementarity is necessary for multiple SNE.

This is because, looking at figure 2 below, multiple solutions (intersections of the reaction function and the 45° line) can only arise if the slope of the reaction function is positive on some range. Call $s$ that slope. Then $V_1(e_i^*(\bar{e}), \bar{e}) = 0$, and differentiating with respect to $\bar{e}$ gives $s\, V_{11}(e_i^*(\bar{e}), \bar{e}) + V_{12}(e_i^*(\bar{e}), \bar{e}) = 0$, which implies that $s = -V_{12}(e_i^*(\bar{e}), \bar{e}) / V_{11}(e_i^*(\bar{e}), \bar{e})$. If we have strategic complementarity on some range, $s$ is positive there, since $V_{11} < 0$.

Proposition 2: If the game exhibits spillovers at $e \in S$, then $e$ is inefficient.

If $V_2(e, e) \neq 0$ (spillovers) and $V_1(e, e) = 0$ ($e \in S$), then $V_1(e, e) + V_2(e, e) \neq 0$ and $e \notin \tilde{S}$. This is because, in an SNE, agents do not take into account the fact that their actions affect others' payoffs (externalities).

Proposition 3: Assume there are several SNE. Also assume positive spillovers for all $e \in [0, E]$. The SNE can be Pareto-ranked by the equilibrium action. Higher action SNE are preferred.

By the Envelope Theorem, $\dfrac{d}{d \bar{e}} V(e_i^*(\bar{e}), \bar{e}) = V_2(e_i^*(\bar{e}), \bar{e})$. If the game exhibits positive spillovers (globally), then $\dfrac{d}{d \bar{e}} V(e_i^*(\bar{e}), \bar{e}) > 0$. Agent $i$'s payoff increases when everybody else increases their action. Equilibria with higher actions are preferred by everybody.
However, SNE with lower welfare for the representative agent are also consistent with optimizing behavior. Such equilibria can be sustained because of the inability of agents to coordinate their actions.

[35] Similar (opposite) results can be obtained if one considers negative spillovers or strategic substitutability.

An example of coordination failure: the low skill trap

We can now look at a particular example of the general setup described above. As the focus is on the strategic interactions between firms and workers, the important assumptions are with respect to how the labor market functions. In particular, we no longer assume that the labor market clears (this anticipates the search literature that we will explore in more detail later). Rather, because of informational problems, finding a firm (or a worker) to match and start production with is a difficult and time-consuming process. There can be several reasons for this. Because workers and firms do not know instantaneously where each other are, they have to search, and this takes time. Or even if they know where they are, they do not necessarily know which ones are actively looking for a partner. Or even if they know where to locate a partner interested in making a match, several workers may apply for a given position, for example, and only one of them can get

[Figure 2: Reaction function. The reaction function $e_i(e)$ is drawn against $e$, together with the 45° line.]

it. To represent all these possible frictions, it is assumed that matching opportunities come randomly for searching workers or firms.

The following model is derived from The Low Skill Trap (Burdett, 1990s). The paper tries to explain the following facts, relative to the English and the German economies:
(i) A greater proportion of the German labor force is skilled,
(ii) There are more skilled job vacancies in Germany than in the U.K.,
(iii) The expected return to training in the U.K. is no higher than in Germany,
(iv) The (long-run) unemployment/vacancy ratio is higher in the U.K. than in Germany.

Workers can be in two states, searching for a firm or matched and producing output. Similarly, firms can be in two states, searching for a worker or matched. To find a worker, firms have to pay a vacancy cost $c$ (per period).

State variables:
  workers: searching or matched
  firms: searching or matched
Decision variables:
  workers: accept a match or not, whether to train or not
  firms: whether to post vacancies, accept a match or not
Types:
  workers: trained or untrained
  firms: all of the same type
Technologies:
  Production technology: $X_1$ for trained workers and $X_2$ for untrained workers ($X_1 > X_2$)
  Meeting technology: $M(U_t, V_t)$ (number of meetings per period, where $U_t$ is the number of unemployed workers and $V_t$ is the number of vacancies). It is generally assumed that the meeting function is increasing and concave in both arguments, and exhibits constant returns to scale.
Utility:
  Linear utility for both workers and firms

Let $\beta_t$ be the proportion of the unemployed workers who are trained (type 1) at date $t$. All workers are born untrained, and must decide whether or not to get training (and incur the costs) BEFORE entering the labor market. For simplicity (this should not alter the main message of the model), once a worker and a firm have made contact, they start production and leave the labor market; that is, these matches are not subject to future breakdowns.
Also denote by $g$ the number of workers who enter the market at date $t$. This is exogenously given. The number of firms entering the market, though, is endogenously determined. It

is because firms need to post a costly vacancy to attract a worker, and firms do so until the expected value of posting a vacancy is driven down to zero. This is a free entry condition for firms.

Steady State Value functions:
  $S_1$: value of search to the trained worker
  $S_2$: value of search to the untrained worker
  $\Pi$: value of posting a vacancy to the firm

The novelty is to assume that meetings between firms and workers only come randomly. Hence, workers and firms face arrival rates of meetings. Call $\alpha_w^1$ and $\alpha_w^2$ the arrival rates of job offers in a given period, for a trained and an untrained worker, respectively. These are arrival rates of firms willing to match with the particular types of workers. Call $\alpha_f$ the arrival rate of meetings for the firms. Denote by $w_1$ and $w_2$ the wages of trained and untrained workers. Assume that (to be checked later)

$X_1 > w_1 > 0$, $X_2 > w_2 > 0$, $w_1 > w_2$

LOOKING AT THE WORKERS' AND THE FIRMS' PROBLEMS IN THE SEARCHING POOL FIRST:

When calculating his value of search, the worker takes the market wage $w_i$ and the arrival rates of job offers as given:

$S_1 = \frac{1}{1+r} \left[ \alpha_w^1 \max\left\{ S_1, \frac{w_1}{r} \right\} + (1 - \alpha_w^1) S_1 \right]$ (37)

$S_1$ is the discounted lifetime expected value of being a searching worker of type 1 (trained) and following an optimal strategy. It says that (at the end of the period, hence the discounting) the worker has a probability $\alpha_w^1$ of meeting a firm, in which case he will follow the optimal strategy of accepting the match if the value of doing so ($w_1/r$) is greater than the value of continuing search ($S_1$). There is also a probability $1 - \alpha_w^1$ that the trained worker does not meet a firm in that period. The value of accepting the match is $w_1/r$, since we assumed that matches last forever. Hence, the worker's decision problem is simple: accept the job if $S_1 \le \frac{w_1}{r}$ and reject it otherwise. This is equivalent to a Bellman equation.
Remember that the only control variable (at this stage) for the worker is whether or not to accept a matching opportunity, given that a meeting took place. And the only state variable for the worker is whether he is searching or matched. Hence, for a searching worker, (37) says that the value derived from the state of search is that the worker gets offers that he optimally chooses from, knowing that he can continue searching in the next period if he chooses to. $S_1$ is the discounted value of making the optimal choice, given that the opportunity arises.
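The recursion in (37) can be solved numerically by iterating on the value of search until it stops changing. The sketch below does this for hypothetical parameter values (`w`, `r` and `alpha` are illustrative numbers, not calibrated ones) and compares the result with the closed form obtained by solving (37) by hand when the offer is accepted, $S_1 = \frac{\alpha_w^1}{r + \alpha_w^1} \frac{w_1}{r}$.

```python
def search_value_by_iteration(w, r, alpha, tol=1e-12, max_iter=100_000):
    """Iterate S <- [alpha * max(S, w/r) + (1 - alpha) * S] / (1 + r) to a fixed point."""
    S = 0.0
    for _ in range(max_iter):
        S_new = (alpha * max(S, w / r) + (1 - alpha) * S) / (1 + r)
        if abs(S_new - S) < tol:
            return S_new
        S = S_new
    raise RuntimeError("value iteration did not converge")

def search_value_closed_form(w, r, alpha):
    """Solution of (37) when the worker accepts every offer."""
    return alpha / (r + alpha) * w / r

# Hypothetical values: wage 1, interest rate 5%, meeting probability 30% per period.
w, r, alpha = 1.0, 0.05, 0.3
S_iter = search_value_by_iteration(w, r, alpha)
S_exact = search_value_closed_form(w, r, alpha)
print(S_iter, S_exact, w / r)
```

With these numbers the value of search stays strictly below the value of a match $w/r$, so accepting the first offer is indeed optimal.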

The general form for a Bellman equation, the one you are used to, is:

$V(\text{state}_t) = \max_{\text{control}_t} \left[ U(\cdot) + \beta E V(\text{state}_{t+1}) \right]$

In the present case, the control variable is discrete: do you accept the match or not? Given that the current state is search for the worker, and since there is no utility received outside the match ($U = 0$), this becomes:

$S_1 = \frac{1}{1+r} E\left[ \max(\text{value of search}, \text{value of a match}) \right]$

There is an expectation term, because the worker may not have the opportunity to make a choice. The value function for the untrained worker is similarly given by:

$S_2 = \frac{1}{1+r} \left[ \alpha_w^2 \max\left\{ S_2, \frac{w_2}{r} \right\} + (1 - \alpha_w^2) S_2 \right]$

Hence, we have that:

$S_1 = \frac{\alpha_w^1}{r + \alpha_w^1} \max\left\{ S_1, \frac{w_1}{r} \right\}$, $S_2 = \frac{\alpha_w^2}{r + \alpha_w^2} \max\left\{ S_2, \frac{w_2}{r} \right\}$ (38)

As $w_1 > 0$ and $w_2 > 0$, trained and untrained workers accept the first offer they receive (it is intuitive and easy to show that assuming $S_i > \frac{w_i}{r}$ leads to a contradiction: (38) would then give $S_i = 0$). Thus:

$S_1 = \frac{\alpha_w^1}{r + \alpha_w^1} \frac{w_1}{r}$ (39)

$S_2 = \frac{\alpha_w^2}{r + \alpha_w^2} \frac{w_2}{r}$ (40)

Taking wages, arrival rates and the proportion of skilled workers as given, the value of search for the firm is given by:

$\Pi = \frac{1}{1+r} \left[ \alpha_f \beta \max\left\{ \Pi, \frac{X_1 - w_1}{r} \right\} + \alpha_f (1-\beta) \max\left\{ \Pi, \frac{X_2 - w_2}{r} \right\} + (1 - \alpha_f) \Pi - c \right]$

Hence:

$(r + \alpha_f) \Pi = \alpha_f \beta \max\left\{ \Pi, \frac{X_1 - w_1}{r} \right\} + \alpha_f (1-\beta) \max\left\{ \Pi, \frac{X_2 - w_2}{r} \right\} - c$

As for the worker, the firm's decision is straightforward: accept the match with a worker of type $i$ if $\Pi \le \frac{X_i - w_i}{r}$ and reject it otherwise.

LOOKING AT THE WORKERS' AND THE FIRMS' PROBLEMS

BEFORE THEY ENTER THE SEARCHING POOL:

The firms' decision is the following: taking wages as given and forming expectations on the proportion of trained workers $\beta$ [36], post vacancies until the value of doing so is driven down to zero, i.e. until $\Pi = 0$. This is the free entry condition. As $X_i - w_i > 0$, $i = 1, 2$, this implies that:

$c = \alpha_f \beta \frac{X_1 - w_1}{r} + \alpha_f (1-\beta) \frac{X_2 - w_2}{r}$ (41)

[36] Of course, as the number of vacancies posted increases, $\alpha_f$ decreases.

Since employers with a vacancy offer a job to all workers met ($X_i - w_i > 0$, $i = 1, 2$), $\alpha_w^1 = \alpha_w^2 = \alpha_w$. Hence,

$S_1 = \frac{\alpha_w}{r + \alpha_w} \frac{w_1}{r}$, $S_2 = \frac{\alpha_w}{r + \alpha_w} \frac{w_2}{r}$

Turning now to the training decision by the workers (which has to be taken before entering the labor market), we see that the return from education $\eta$ is equal to $S_1 - S_2$, where:

$\eta = S_1 - S_2 = \frac{\alpha_w}{r + \alpha_w} \frac{w_1 - w_2}{r}$ (42)

Denote the cost of training to the worker by $e$. Then, the worker's decision is (taking wages as given and forming expectations on meeting rates):
  Undertake training if $\eta > e$
  Do not train if $\eta < e$
  (if $\eta = e$, the workers are indifferent between training or not)

Flows: In steady state, the flows into the labor market are equal to the flows out of the labor market (remember that we assumed that the meeting probabilities were constant through time). Hence:

$g = M(U, V)$ (43)

The probability of a meeting for a worker (firm) is equal to the ratio of the number of meetings to the number of searching workers (firms):

$\alpha_w = \frac{M(U, V)}{U}$ (44)

$\alpha_f = \frac{M(U, V)}{V}$ (45)

Hence,

$g = \alpha_w U = \alpha_f V$ (46)

From (41), (42), and (46), we obtain that:

$V(\beta) = \frac{g}{c} \left[ \beta \frac{X_1 - w_1}{r} + (1-\beta) \frac{X_2 - w_2}{r} \right]$ (47)

$\eta(U) = \frac{g}{g + rU} \frac{w_1 - w_2}{r}$ (48)

Conditional on $w_1$ and $w_2$, a steady state equilibrium can take two forms. Denote by $e$ the cost of education (training). Either it is an interior solution or it is a corner solution. The interior solution $(U^*, V^*, \beta^*)$ is given by

$g = M(U^*, V^*)$, $\eta(U^*) = e$, $V^* = V(\beta^*)$

The corner solution is given by either

$g = M(U^*, V^*)$, $\eta(U^*) \le e$, $V^* = V(0)$

or by

$g = M(U^*, V^*)$, $\eta(U^*) \ge e$, $V^* = V(1)$

We now investigate the possibility of multiple equilibria. Continue assuming that $X_1 - w_1 > X_2 - w_2$ (this seems more appealing than the opposite assumption). From (47), we see that $V(\cdot)$ is linear and increasing in $\beta$, such that $V(0) = \frac{g}{c} \frac{X_2 - w_2}{r}$ and $V(1) = \frac{g}{c} \frac{X_1 - w_1}{r}$. Remark that the iso-matching curve $g = M(U, V)$ is downward sloping, because of the assumptions on $M$. Define $U_0$, $U_1$ such that $M(U_0, V(0)) = M(U_1, V(1)) = g$. Please refer to the figure. First assume that there is $U^* \in [U_1, U_0]$ such that $\eta(U^*) = e$. From the figure, one can check that this corresponds to a unique interior equilibrium [37].

[37] This is not necessarily an interesting equilibrium, since it is unstable, in the sense that if more people somehow decide to acquire more training ($\beta$ increases), then $\eta > e$, and we would move towards the corner solution where $\beta = 1$.

Now looking at the two corner solutions, it is easy to see that

if all workers decide to acquire training ($\beta = 1$), the number of vacancies increases to $V(1)$. As this happens, unemployment decreases. Consequently, the return to education exceeds its cost $e$, because workers have a higher chance to reap benefits from their training. So, firms post more vacancies since the chance of making a higher profit match increases. But also, workers have more incentive to acquire training, since there are more vacancies (job opportunities). Now assume that all workers decide to remain untrained. Then, by the same token, the number of vacancies decreases and workers have no incentive to acquire education. The strategic interaction between workers' and firms' decisions is clear. This is the low skill trap alluded to in the title. Workers do not acquire training because there are not enough job vacancies where they can enjoy the fruits of their training. And firms do not post many vacancies, since the chances of making a high profit match are low, because there are few high skilled workers around. Please note that the expectations are rational in these equilibria. It is easy to see that the high skill equilibrium Pareto dominates the low skill equilibrium. Please remark that if the training costs are low enough or high enough, there is a unique corner solution (either every worker trains or none trains) [38].

Equilibrium wages:

Finally, we need to verify what we assumed along the way: $X_1 > w_1 > 0$, $X_2 > w_2 > 0$, $w_1 > w_2$, and $X_1 - w_1 > X_2 - w_2$. The match between the worker and the firm creates a local surplus, which is due to the fact that the combined value of the match to the two parties is greater than their total value of search: agents search because they enjoy utility once matched (due to the division of output), but they do not enjoy any income during search. So, matching creates some gain from trade that has to be divided between the two parties.
This is where we make the assumption that wages (or the division of output) are negotiated between the two parties. And for an equilibrium, we require that negotiated wages are equal to market wages. The division of gains from trade has long been a topic of great theoretical interest. The most common approach in equilibrium models is to use the Nash bargaining solution, as derived in Nash (1950). This bargaining solution is explained in detail below. Remember that, when negotiating, the two parties take their values of search and values of being matched as given. Applying the Nash bargaining solution, the resulting negotiated wage has to be equal to the market wage.

The Nash bargaining solution:

This is referred to as the axiomatic approach to bargaining. The reason is that Nash was interested in finding a solution with particular properties to a bargaining problem. Let us first define the problem and the properties that the solution must have. There are two agents [39], $j = 1, 2$, with utility functions $u_j$. There is an arbitrary set of outcomes $A$. $D$ is the outcome in case the agents cannot reach an agreement (disagreement or threat point). Define $S = \{ (u_1(a), u_2(a)), a \in A \}$ and $d = (d_1, d_2)$, where $d_j = u_j(D)$. Suppose that $S$ is compact and convex and that $d \in S$. Also assume that there exists $s \in S$ such that $s_j > d_j$, $j = 1, 2$.
  $(S, d)$ is the bargaining problem
  $f : (S, d) \to S$ is a solution to $(S, d)$

[38] It can be shown that if $X_1 - w_1 \le X_2 - w_2$, there is always a unique equilibrium.
[39] In what follows, we are only interested in situations where two agents bargain.

Nash was looking for a solution with the following properties:

(A1) Invariance to utility choices: Given $(S, d)$ and $(S', d')$ defined by $s'_j = \alpha_j s_j + \beta_j$ and $d'_j = \alpha_j d_j + \beta_j$, then $f_j(S', d') = \alpha_j f_j(S, d) + \beta_j$
(A2) Symmetry: If $d_1 = d_2$ and $(s_1, s_2) \in S \Leftrightarrow (s_2, s_1) \in S$, then $f_1(S, d) = f_2(S, d)$
(A3) Independence of irrelevant alternatives: If $(S, d)$ and $(S', d)$ satisfy $S \subset S'$ and $f(S', d) \in S$, then $f(S, d) = f(S', d)$
(A4) Pareto efficiency: Given $(S, d)$, if $s \in S$ and $s' \in S$ and $s'_j > s_j$, $j = 1, 2$, then $f(S, d) \neq s$

Nash (1950) showed that the unique solution to this problem is [40]:

$f(S, d) = \arg\max_{s_1 \ge d_1, s_2 \ge d_2} (s_1 - d_1)(s_2 - d_2)$ (49)

To sketch the proof, it will be useful to draw a graph. By (A1), one can choose the set of possible outcomes $S_1$ such that $d = (0, 0)$ (normalization of the utility functions). Denote by $S_2$ the intersection of $S_1$ and the positive quadrant. Let $(u_1^*, u_2^*) = \arg\max_{s \in S_2} u_1 u_2$. By assumption, $S_2$ is non-empty, compact and convex, which guarantees existence of the maximizers. Uniqueness is obtained from the convexity assumption. By (A1), choose $u_1$, $u_2$ such that $(u_1^*, u_2^*) = (u^*, u^*)$ lies on the 45° line (normalization of the utility functions). Notice that every point of $S_2$ is such that $u_1 + u_2 \le 2u^*$ [41]. Let $B$ be a square, symmetric relative to the 45° line, one side of which is supported by $u_1 + u_2 = 2u^*$, that includes $S$ (of course, it is not unique). It exists since $S$ is bounded. Then by (A2), $f(B, O)$ is located on the 45° line. By (A4), $f(B, O) = (u^*, u^*)$. By (A3), $f(S, O) = f(B, O)$. Hence, given the normalizations performed, $f(S, d)$ is located at $(u^*, u^*)$.

Remarkably, it can be proved that uniqueness of the bargaining solution cannot be obtained with a proper subset of these four axioms.

Final remark: we need to check that $X_1 > w_1 > 0$, $X_2 > w_2 > 0$, $w_1 > w_2$, and $X_1 - w_1 > X_2 - w_2$.
The Nash bargained wage $w_i$ satisfies:

$\max_{w_i} \left( \frac{w_i}{r} - S_i \right) \left( \frac{X_i - w_i}{r} - \Pi \right)$

[40] Remark that if requirement (A2) were dropped, then there would be a continuum of solutions: $f^\theta(S, d) = \arg\max_{s_1 \ge d_1, s_2 \ge d_2} (s_1 - d_1)^\theta (s_2 - d_2)^{1-\theta}$
[41] Suppose that there exists a point $M = (u_1, u_2) \in S_2$ such that $u_1 + u_2 > 2u^*$. Then, there exists a point between $M$ and $N = (u^*, u^*)$ that belongs to $S$, for which $u_1 u_2 > u^{*2}$ (by convexity of $S_2$).
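The wage maximization problem can be checked numerically. For given $S_i$ and $\Pi$, the Nash product is a concave quadratic in $w_i$, and setting its derivative to zero gives $w_i = (X_i + r S_i - r \Pi)/2$. The sketch below compares that value with a brute-force grid search; the numbers used for `X`, `S`, `Pi` and `r` are hypothetical and only serve as an illustration.

```python
def nash_product(w, X, S, Pi, r):
    """Worker's surplus times firm's surplus for a wage w."""
    return (w / r - S) * ((X - w) / r - Pi)

def bargained_wage_grid(X, S, Pi, r, n=200_000):
    """Brute-force maximizer of the Nash product on an even grid over [0, X]."""
    grid = (X * k / n for k in range(n + 1))
    return max(grid, key=lambda w: nash_product(w, X, S, Pi, r))

def bargained_wage_foc(X, S, Pi, r):
    """Wage implied by the first order condition w/r - S = (X - w)/r - Pi."""
    return 0.5 * (X + r * S - r * Pi)

# Hypothetical values: output 1, value of search 10, free entry (Pi = 0), r = 5%.
X, S, Pi, r = 1.0, 10.0, 0.0, 0.05
w_grid = bargained_wage_grid(X, S, Pi, r)
w_foc = bargained_wage_foc(X, S, Pi, r)
print(w_grid, w_foc)
```

At the maximizer both surpluses are positive, so the constraints $s_j \ge d_j$ in (49) do not bind in this example.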

[Figure 3: Graphical transformation used in Nash's proof]

The first order condition implies (one can easily check the second order condition):

$\frac{w_i}{r} - S_i = \frac{X_i - w_i}{r} - \Pi$

since $S_i$ and $\Pi$ are taken as given by the negotiating parties. As $\Pi = 0$ and $S_i$ is given by (39)-(40), we have that:

$w_i = \frac{r + \alpha_w}{2r + \alpha_w} X_i$

One can then check that $X_1 > w_1 > 0$, $X_2 > w_2 > 0$, $w_1 > w_2$, and $X_1 - w_1 > X_2 - w_2$ (the last inequality holds because $X_i - w_i = \frac{r}{2r + \alpha_w} X_i$ and $X_1 > X_2$).

Conclusion: this simple model can be used to shed some light on the remarks made at the beginning. Consider that England is in the low skill equilibrium while Germany is in the high skill equilibrium. Then: (i) Germany's workforce would be more skilled than England's workforce, (ii) there are more skilled job vacancies in Germany than in England, (iii) the expected return to education is lower in England, (iv) the unemployment to vacancy ratio is higher in the U.K. than in Germany.
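The two corner equilibria can be made concrete with a small numerical sketch of equations (47) and (48). Everything specific here is an assumption made for illustration: the meeting function $M(U,V) = UV/(U+V)$ (chosen because it is increasing, concave, has constant returns and keeps meeting probabilities below one), the parameter values, and the fact that wages are held fixed, in line with the "conditional on $w_1$ and $w_2$" statement of the text.

```python
# Hypothetical parameters; wages are taken as given, not bargained.
params = dict(g=1.0, c=3.6, r=0.05, X1=2.0, w1=1.6, X2=1.0, w2=0.8)
e_cost = 12.0  # hypothetical training cost e

def vacancies(beta, p):
    """Equation (47): vacancies consistent with free entry."""
    profit = beta * (p["X1"] - p["w1"]) + (1 - beta) * (p["X2"] - p["w2"])
    return p["g"] / p["c"] * profit / p["r"]

def unemployment(V, g):
    """Solve g = M(U, V) for U under the assumed M(U, V) = U*V / (U + V)."""
    return g * V / (V - g)

def training_return(U, p):
    """Equation (48): return to training eta(U)."""
    return p["g"] / (p["g"] + p["r"] * U) * (p["w1"] - p["w2"]) / p["r"]

def corner(beta):
    """Steady state implied by everybody expecting a share beta of trained workers."""
    V = vacancies(beta, params)
    U = unemployment(V, params["g"])
    return {"beta": beta, "V": V, "U": U, "eta": training_return(U, params)}

low, high = corner(0.0), corner(1.0)
for eq in (low, high):
    print(eq, "U/V =", eq["U"] / eq["V"])

# beta = 0 is an equilibrium if eta <= e; beta = 1 is an equilibrium if eta >= e.
print("low skill trap is an equilibrium: ", low["eta"] <= e_cost)
print("high skill outcome is an equilibrium:", high["eta"] >= e_cost)
```

With these numbers both corners are equilibria for the same training cost, and the low skill one has fewer vacancies, a lower return to training and a higher unemployment to vacancy ratio, which is the pattern listed in the conclusion.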

Chapter II ECONOMIC GROWTH

In this chapter, instead of trying to explain why economic variables fluctuate, we focus our attention on the levels of economic aggregates and their growth rates. In particular, some empirical facts need to be explained [1]:

1 Observed growth facts

1) There is a great wealth disparity between countries. In 1985, average per-capita incomes in the richest 5% of countries and the poorest 5% of countries differed by a factor of 29. The wealth distribution has shifted up. But the disparities tend to persist in both levels of income per capita and growth rates.

2) There have been development miracles and disasters.

3) Barro (QJE 1991) tries to find relationships between growth rates, initial per capita output, and initial human capital levels (proxied by school enrollment rates). The Solow growth model predicts that if countries have the same preference and technology parameters, one should expect convergence: poor countries grow faster than rich ones. However, the data show almost zero correlation between initial per capita output (GDP60 [2]) and subsequent growth rates (GR6085 [3]). However, different countries have different levels of human capital. So, he performs the following regression

$GR6085 = \alpha_0 + \alpha_1 GDP60 + \alpha_2 SEC60 + \alpha_3 PRIM60$, with $\alpha_1 < 0$, $\alpha_2 > 0$, $\alpha_3 > 0$ (all highly statistically significant)

where SEC60 and PRIM60 are secondary and primary enrollment rates in 1960, respectively, which are proxies for initial human capital levels. $\alpha_1 < 0$ implies that with constant human capital levels, there is a strong negative relationship between initial wealth and subsequent growth rates. Since $\alpha_2 > 0$ and $\alpha_3 > 0$, initial human capital levels and subsequent growth rates are positively related, holding other variables constant. This seems to support endogenous growth models, which suggest that a large stock of human capital promotes growth, because it makes physical capital more productive and/or it makes it easier to develop or absorb new ideas, products and technologies [4].
Notice that having a negative coefficient on GDP60 ($\alpha_1 < 0$) and $corr(g_y, \text{initial GDP}) \approx 0$ can be reconciled if countries with a low GDP60 also had a low SEC60 and PRIM60.

4) Barro (1991) also finds that growth rates are positively related to measures of political stability.

5) This comes in addition to the facts mentioned in your first semester class: persistent differences in $Y/N$ and $g_{Y/N}$ across countries, low correlation between $Y$ and $g_{Y/N}$ across countries, stable and similar growth rates in rich countries, unstable and diverse growth rates in poor countries.

[1] Facts (1) and (2) come from Prescott and Parente (FRB Minneapolis Quarterly Review, Spring 1993).
[2] 1960 per capita GDP.
[3] Average growth rate from 1960 to 1985.
[4] Remark though that there is a possible problem with this regression, since the schooling decision is potentially endogenous.

2 Welfare costs: growth vs. fluctuations

In the Jahnsson Lectures series (1987), Lucas argued that business cycles in the post-war period involved very small welfare losses, challenging the presumption that stabilizing cyclical fluctuations was desirable. In particular, he showed that removing all the empirical volatility observed in U.S. time series, regardless of feasibility, would bring small welfare gains, in particular relative to the gains from increasing trend growth rates. Here is how he proceeded.

The argument does not require developing and solving a full economic model. Rather, we just examine and compare expected welfare from given streams of consumption. We can thus evaluate welfare gains/losses from removing volatility from the stream of consumption or increasing trend components. Hence, no particular policy is examined. Rather, we try to determine how much can be gained from a hypothetical policy removing all volatility, for example. In essence, we are asking the representative household for his attitudes towards some purely hypothetical consumption streams. The only things we need to carry out that exercise are parameters describing (i) preferences (over consumption) and (ii) the stochastic consumption process.

Assume households have utility from consumption defined by

$U = E\left\{ \sum_{t=0}^{+\infty} \beta^t \frac{1}{1-\sigma} \left[ c_t^{1-\sigma} - 1 \right] \right\}$, (1)

where $\beta \in (0, 1)$ is a constant discount factor and $\sigma > 0$ is the constant coefficient of relative risk aversion. Let us represent consumption as

$c_t = (1+\lambda)(1+\mu)^t e^{-\frac{1}{2}\sigma_z^2} z_t$, $t = 0, \ldots, +\infty$, (2)

where $\{z_t\}$ is a stationary stochastic process with a stationary distribution given by $\ln(z_t) \sim N(0, \sigma_z^2)$. Then, we have that $E(e^{-\frac{1}{2}\sigma_z^2} z_t) = 1$, so that mean consumption is given by $(1+\lambda)(1+\mu)^t$. At this point $\lambda$ is just a normalizing constant, whose reason for being will be evident soon.

What are empirically reasonable values for $\mu$ and $\sigma_z^2$? For the U.S., the annual growth rate in total consumption is about 3 percent, so that $\mu = 0.03$.
For the post-war period in the U.S., the standard deviation of the log of consumption about trend is about 0.013, so that $\sigma_z^2 = (0.013)^2$. Thus, we may take (2) with $(\lambda, \mu, \sigma_z^2) = (0, 0.03, (0.013)^2)$.

Given any choice of $(\lambda, \mu, \sigma_z^2)$, we could simply calculate the value of (1) under (2) and call the indirect utility function so defined $U(\lambda, \mu, \sigma_z^2)$. Instead, we will use compensating variations in $\lambda$ to evaluate various $\mu$ and $\sigma_z^2$. To evaluate changes in the growth rate, define $f(\mu, \mu_0)$ by

$U(f(\mu, \mu_0), \mu, \sigma_z^2) = U(0, \mu_0, \sigma_z^2)$,

so that $f(\mu, \mu_0)$ is the percentage change in consumption, uniform across all dates and values of the shocks, required to leave the consumer indifferent between the growth rates $\mu$ and $\mu_0$ [5]. One finds that $f(\mu, \mu_0) = \ldots$ [formula not recoverable from the source]. Taking $\beta = 0.95$ and with a base growth rate $\mu_0 = 0.03$, one can generate the following table, evaluating the cost of reducing growth from $\mu_0$ to $\mu$:

[Table of $f(\mu, \mu_0)$ for alternative values of $\mu$: entries not recoverable from the source]

The table says that consumers would require a 20 percent across the board consumption increase to voluntarily accept a reduction in the consumption growth rate from 3 percent to 2 percent.

The costs of economic instability can be measured in an identical way. Define $g(\sigma_z^2)$ by $U(g(\sigma_z^2), \mu, \sigma_z^2) = U(0, \mu, 0)$. Thus $g(\sigma_z^2)$ is the percent increase in consumption, uniform across all dates and values of the shocks, required to leave the consumer indifferent between consumption instability of $\sigma_z^2$ and a perfectly smooth consumption path. One finds that $g$ is given by [6]

$g(\sigma_z^2) \simeq \frac{1}{2} \sigma \sigma_z^2$.

Here again, one can evaluate the cost of consumption instability, as in the table below:

[Table of $g(\sigma_z^2)$ for alternative values of $\sigma$: entries not recoverable from the source]

The table says that eliminating all aggregate consumption variability of this magnitude would be the equivalent in utility terms of an increase in average consumption of less than one tenth of a percent. Just as a reference, total U.S. consumption in 1983 was $2 trillion, so one tenth of a percent is $2 billion or $8.50 per person. This is a relatively small number, especially compared to the welfare gains from increasing growth trends.

[5] With CRRA preferences, $\sigma_z^2$ does not appear as an argument of $f$.
[6] Approximating $\ln(1+\lambda)$ as $\lambda$.
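Both compensating variations can be computed directly from (1) and (2). The sketch below is a reconstruction under stated assumptions, not necessarily the expression used in the lectures. For CRRA utility, equating $U(f, \mu, \sigma_z^2)$ and $U(0, \mu_0, \sigma_z^2)$ gives $(1+f)^{1-\sigma} = \frac{1 - \beta(1+\mu)^{1-\sigma}}{1 - \beta(1+\mu_0)^{1-\sigma}}$, with the log-utility limit $\ln(1+f) = \frac{\beta}{1-\beta} \ln\frac{1+\mu_0}{1+\mu}$ when $\sigma = 1$. For volatility, the lognormal expectation gives $\ln(1+g) = \frac{1}{2}\sigma\sigma_z^2$ exactly. The choice $\sigma = 1$ for the growth calculation is an assumption, made because it reproduces the 20 percent figure quoted in the text.

```python
import math

def growth_compensation(mu, mu0, beta, sigma):
    """f(mu, mu0): uniform consumption increase that offsets growth mu instead of mu0.

    Requires beta * (1 + mu)**(1 - sigma) < 1 so that lifetime utility is finite.
    """
    if abs(sigma - 1.0) < 1e-12:  # log utility limit
        return math.exp(beta / (1 - beta) * math.log((1 + mu0) / (1 + mu))) - 1
    num = 1 - beta * (1 + mu) ** (1 - sigma)
    den = 1 - beta * (1 + mu0) ** (1 - sigma)
    return (num / den) ** (1 / (1 - sigma)) - 1

def volatility_compensation(sigma, var_z):
    """g(var_z): uniform consumption increase that offsets lognormal volatility var_z."""
    return math.exp(0.5 * sigma * var_z) - 1

beta, mu0 = 0.95, 0.03
f_log = growth_compensation(0.02, mu0, beta, sigma=1.0)
print(f"cost of growth falling from 3% to 2% (log utility): {f_log:.3f}")

var_z = 0.013 ** 2
for sigma in (1, 5, 10):
    print(f"sigma = {sigma:2d}: cost of volatility = {volatility_compensation(sigma, var_z):.6f}")
```

Under these assumptions the growth cost is about 0.20 while the volatility cost stays below 0.001 even for a risk aversion coefficient of 10, which is the contrast the section emphasizes.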

Of course, this calculation entailed some simplifications. Yet at the least, it suggests that the welfare consequences of increasing growth rates are larger than those of policies targeting business cycles. Interestingly, this paper has spurred great interest in better assessing the welfare gains from policies aimed at reducing consumption volatility. Some directions of research include recognizing that not all households are identical and incorporating imperfections in capital markets.

3 The Solow model: a quick refresher

There is a representative household and an aggregate production function: $y_t = f(h_t, k_t)$ (CRS, increasing and concave in both arguments). The law of motion for capital is given by: $k_{t+1} - (1-\delta) k_t = s y_t$, which assumes that households are saving at a constant exogenous rate $s$. Assume that households supply labor inelastically ($h_t = 1$). Rewrite $F(k) = f(1, k)$ (then $F' > 0$, $F'' < 0$, $F(0) = 0$, $F'(0) = +\infty$, $F'(+\infty) = 0$). Hence, $k_{t+1} = (1-\delta) k_t + s F(k_t) \equiv g(k_t)$. Given $k_0$, the above relation gives the time paths of $k_t$, $c_t$ and $y_t$. We look at the steady state, i.e. a solution to $g(k^*) = k^*$. By assumption, $g(0) = 0$, $g'(0) > 1$, $g''(k) < 0$, $g'(+\infty) < 1$, guaranteeing two steady states (including a degenerate one at $k = 0$). It is straightforward to show that we have monotonic convergence to $k^*$ and slower growth as $k$ increases.

To actually obtain persistent growth, assume: $y_t = f((1+\mu)^t h_t, k_t) = f((1+\mu)^t, k_t)$. Then, $k_{t+1} = (1-\delta) k_t + s f((1+\mu)^t, k_t)$. Call $\tilde{k}_t = \frac{k_t}{(1+\mu)^t}$. Then: $\tilde{k}_{t+1} = (1 - \tilde{\delta}) \tilde{k}_t + \tilde{s} F(\tilde{k}_t)$, where $1 - \tilde{\delta} = \frac{1-\delta}{1+\mu}$ and $\tilde{s} = \frac{s}{1+\mu}$. We are back to the previous setup, and hence $\tilde{k}_t$ converges to $\tilde{k}^*$. Hence, asymptotically, $k_t$ grows at the gross rate $1 + \mu$ (and so do $c_t$ and $y_t$).

4 The Cass-Koopmans optimal growth model

We will start by refreshing your memories and quickly lay out the standard Cass-Koopmans exogenous growth model. All individuals have the same preferences represented by the utility function $u(c_t) = c_t^\gamma / \gamma$.
As leisure does not enter the utility function, agents devote all their time endowment to labor. The production technology is given by $f(k_t) = A_t^{1-\alpha} k_t^\alpha$ [7]. Looking for the optimal solution, we have to maximize the lifetime discounted utility from consumption of the representative agent:

$\max_{\{k_{t+1}, c_t\}} \sum_{t=0}^{+\infty} \beta^t u(c_t)$ (3)
s.t. $f(k_t) = k_{t+1} - (1-\delta) k_t + c_t$, $k_0$ given

Notice that the problem is deterministic. The state variable is $k_t$ and the only control variable is $k_{t+1}$ [8]. Assume that $A_t = (1+\mu)^t A_0$, i.e. that the technological progress component grows at a rate $\mu$.

[7] This implies that $A_t$, the technological progress, is labor augmenting.
[8] Or equivalently $c_t$.

To render the

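problem stationary, we need to redefine the variables. Hence:

$k_t^* = \frac{k_t}{A_t}$, $c_t^* = \frac{c_t}{A_t}$, $y_t^* = \frac{y_t}{A_t}$

We can now rewrite (3) as:

$\max_{\{k_{t+1}^*, c_t^*\}} \sum_{t=0}^{+\infty} \left( \beta (1+\mu)^\gamma \right)^t u(c_t^*)$
s.t. $(k_t^*)^\alpha = (1+\mu) k_{t+1}^* - (1-\delta) k_t^* + c_t^*$ (4)

The value function is thus:

$V(k_t^*) = \max_{k_{t+1}^*} \left\{ u\left( (k_t^*)^\alpha + (1-\delta) k_t^* - (1+\mu) k_{t+1}^* \right) + \beta (1+\mu)^\gamma V(k_{t+1}^*) \right\}$

The first order condition (on $k_{t+1}^*$) is:

$-(1+\mu) u'\left( (k_t^*)^\alpha + (1-\delta) k_t^* - (1+\mu) k_{t+1}^* \right) + \beta (1+\mu)^\gamma V'(k_{t+1}^*) = 0$

Using the envelope theorem:

$V'(k_t^*) = \left[ \alpha (k_t^*)^{\alpha-1} + (1-\delta) \right] u'\left( (k_t^*)^\alpha + (1-\delta) k_t^* - (1+\mu) k_{t+1}^* \right)$

Hence (MC of savings = MB of savings):

$u'(c_t^*) = \beta (1+\mu)^{\gamma-1} \left[ \alpha (k_{t+1}^*)^{\alpha-1} + (1-\delta) \right] u'(c_{t+1}^*)$

Since $u'(c_t) = c_t^{\gamma-1}$:

$(c_t^*)^{\gamma-1} = \beta (1+\mu)^{\gamma-1} \left[ \alpha (k_{t+1}^*)^{\alpha-1} + (1-\delta) \right] (c_{t+1}^*)^{\gamma-1}$, with $c_{t+1}^* = (k_{t+1}^*)^\alpha + (1-\delta) k_{t+1}^* - (1+\mu) k_{t+2}^*$ (5)

We want to look at the steady state balanced growth path, i.e. when the normalized variables $(c^*, k^*)$ are constant. Of course, this implies that $c_t$, $k_t$ grow at the same rate as the deterministic technological progress. From (4) and (5), we have that:

$(k^*)^\alpha = (\mu + \delta) k^* + c^*$ (6)

$1 = \beta (1+\mu)^{\gamma-1} \left[ \alpha (k^*)^{\alpha-1} + 1 - \delta \right]$ (7)

We have a system of two equations in $c^*$ and $k^*$. Then:

$k^* = \left[ \frac{1}{\alpha} \left( \frac{(1+\mu)^{1-\gamma}}{\beta} - (1-\delta) \right) \right]^{\frac{1}{\alpha-1}}$
$c^* = (k^*)^\alpha - (\mu + \delta) k^*$
$y^* = (k^*)^\alpha$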
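The steady state can be computed and checked numerically. The sketch below uses hypothetical parameter values and names the parameters in words: `alpha` is the capital exponent in production, `beta` the discount factor, `gamma` the utility exponent (so that $u(c) = c^\gamma/\gamma$), `delta` the depreciation rate and `mu` the growth rate of technology. It evaluates the closed form for normalized capital and then verifies that the steady-state Euler equation (7) and the resource constraint (6) hold.

```python
def steady_state(alpha, beta, gamma, delta, mu):
    """Normalized steady state (k, c, y) of the Cass-Koopmans model with growth."""
    k = ((1 / alpha) * ((1 + mu) ** (1 - gamma) / beta - (1 - delta))) ** (1 / (alpha - 1))
    y = k ** alpha
    c = y - (mu + delta) * k
    return k, c, y

def euler_residual(k, alpha, beta, gamma, delta, mu):
    """Right-hand side of (7) minus one; zero at the steady state."""
    return beta * (1 + mu) ** (gamma - 1) * (alpha * k ** (alpha - 1) + 1 - delta) - 1

# Hypothetical values; gamma = -1 corresponds to relative risk aversion 1 - gamma = 2.
alpha, beta, gamma, delta, mu = 0.33, 0.95, -1.0, 0.10, 0.02
k_star, c_star, y_star = steady_state(alpha, beta, gamma, delta, mu)
print(f"k* = {k_star:.4f}, c* = {c_star:.4f}, y* = {y_star:.4f}")
print("Euler residual:", euler_residual(k_star, alpha, beta, gamma, delta, mu))

# A faster technology growth rate lowers normalized capital but not the growth
# rate of the levels, which is mu along the balanced growth path.
k_fast, _, _ = steady_state(alpha, beta, gamma, delta, mu=0.03)
print(f"k* with mu = 0.03: {k_fast:.4f}")
```

The parameter values also satisfy $\beta(1+\mu)^\gamma < 1$, which keeps the discounted sum in the normalized problem finite.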

Notice that, in this model, growth rates are only dependent on the rate at which technology increases ($\mu$). However, the levels of consumption, investment and output also depend on structural parameters such as production technology ($\alpha$), preferences ($\gamma$) and discount rates ($\beta$) [9]. On the steady state balanced growth path, $k_t = A_t k^*$. $A_t$ is the same for all countries. $k^*$ depends on $\alpha$, $\beta$, $\gamma$, $\delta$, and $\mu$. Any cross-country difference, in this model, comes from differences in $\alpha$, $\beta$, $\gamma$, or $\mu$. If you assume that technology is not restricted by borders ($\alpha$, $\mu$) and that households are the same across countries ($\beta$ and $\gamma$), this basic model cannot explain cross-country differences. Fortunately, there are many alternative ways of explaining the various growth experiences of different countries. The next chapter introduces you to some of the possible directions.

5 What do we need to generate economic growth?

5.1 A basic framework to determine what is needed to generate economic growth

What did we learn from the previous two models? So far, growth was only generated exogenously through labor augmenting technological progress. This is clearly not satisfying. Let us look at a more general approach.

Suppose the economy has a constant population of a large number of identical agents with preferences defined as

$\sum_{t=0}^{+\infty} \beta^t u(c_t)$, with $u(c) = \frac{c^{1-\sigma} - 1}{1-\sigma}$, $\sigma > 0$.

For the discussion, the production function will take the form

$F(K_t, X_t) = X_t f(\hat{K}_t)$, where $\hat{K}_t \equiv \frac{K_t}{X_t}$.

The function $F(\cdot)$ will have the usual properties (monotonicity, diminishing marginal returns, constant returns to scale, Inada conditions), which implies that

$\lim_{\hat{K} \to 0} f'(\hat{K}) = +\infty$, $\lim_{\hat{K} \to +\infty} f'(\hat{K}) = 0$,

$F_1(K, X) = f'(\hat{K})$ (8)

$F_2(K, X) = f(\hat{K}) - f'(\hat{K}) \hat{K}$ (9)

The input $K_t$ is physical capital, while the input $X_t$ captures the contribution of labor (we will be more precise soon).

[9] And of course the depreciation rate $\delta$, which has no reason to differ across countries.
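Relations (8) and (9) follow from differentiating $F(K, X) = X f(K/X)$, and they can be checked by finite differences. The sketch below assumes a Cobb-Douglas intensive form $f(\hat{K}) = \hat{K}^{0.3}$, which is only an illustrative choice, and also verifies Euler's theorem for constant returns, $F = F_1 K + F_2 X$.

```python
ALPHA = 0.3  # hypothetical capital exponent

def f(k_hat):
    """Intensive production function (assumed Cobb-Douglas)."""
    return k_hat ** ALPHA

def f_prime(k_hat):
    return ALPHA * k_hat ** (ALPHA - 1)

def F(K, X):
    """Production function F(K, X) = X * f(K / X)."""
    return X * f(K / X)

def central_difference(fn, K, X, wrt, h=1e-6):
    """Numerical partial derivative of fn with respect to 'K' or 'X'."""
    if wrt == "K":
        return (fn(K + h, X) - fn(K - h, X)) / (2 * h)
    return (fn(K, X + h) - fn(K, X - h)) / (2 * h)

K, X = 4.0, 2.0
k_hat = K / X
F1_numeric = central_difference(F, K, X, "K")
F2_numeric = central_difference(F, K, X, "X")
F1_formula = f_prime(k_hat)                    # equation (8)
F2_formula = f(k_hat) - f_prime(k_hat) * k_hat  # equation (9)
print(F1_numeric, F1_formula)
print(F2_numeric, F2_formula)
```

The same check works for any other smooth, concave `f`; only the two function definitions need to change.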

We are looking for additional technological assumptions to generate a balanced growth path, where consumption grows at rate $\mu$. In a competitive equilibrium, the payment to capital is given by

$r_t = F_1(K_t, X_t) = f'(\hat{K}_t)$. (10)

Households maximize lifetime utility subject to a budget constraint:

$\max \sum_{t=0}^{+\infty} \beta^t u(c_t)$
s.t. $c_t + k_{t+1} - (1-\delta) k_t = r_t k_t + \chi_t$

where $\chi_t$ stands for labor income (we will be more precise soon). The first order condition is

$u'(c_t) = \beta u'(c_{t+1}) \left[ r_{t+1} + 1 - \delta \right]$. (11)

Combining (10) and (11), we get

$\left( \frac{c_{t+1}}{c_t} \right)^\sigma = \beta \left[ f'(\hat{K}_{t+1}) + 1 - \delta \right]$ (12)

In conclusion, a constant consumption growth rate must come from a constant rate of return on capital, i.e. a constant $\hat{K}$. Again, capital accumulation cannot by itself sustain long-term growth when the other input $X_t$ is constant over time (for example, if $X_t$ is purely labor, $X_t = L$). The Solow and Cass-Koopmans models took care of that problem by assuming an exogenous, labor-augmenting technological progress. These two models assumed that $X_t = A_t L$, with $A_{t+1} = (1+\mu) A_t$, implying a constant $\hat{K}_t$ along the balanced growth path. Notice that

$r_t = F_1(K_t, X_t) = f'(\hat{K})$,
$w_t = F_2(K_t, X_t) \frac{dX_t}{dL} = \left[ f(\hat{K}) - f'(\hat{K}) \hat{K} \right] A_t$,

which implies that capital is paid a constant rate along the BGP, but payments to labor increase with technological progress.

5.2 Some possible extensions

Externality from spillovers

The externality considered here is that technology grows because of aggregate spillovers coming from firms' production activities (this section is inspired by Arrow (1962) [learning by doing] and Romer (1986)). Here, technological advancement is external to firms. Assume that firms face a fixed labor productivity, which is proportional to the current economy-wide average of physical capital per worker [10]. In particular, assume that

$X_t = \bar{K}_t L$, where $\bar{K}_t = \frac{K_t}{L}$.

[10] Arrow considers that learning from experience is embodied in capital goods.
Romer considers that the spillovers come from firms' investment in knowledge (aggregate stock of knowledge).

The rental rate of capital is given by (10). Of course, we have that $\hat{K}_t = 1$ and, hence, (12) becomes

$\left( \frac{c_{t+1}}{c_t} \right)^\sigma = \beta \left[ f'(1) + 1 - \delta \right]$

Not surprisingly, it turns out that the competitive equilibrium is not Pareto optimal. The social rate of return on capital is given by

$\frac{dF\left( K_t, \frac{K_t}{L} L \right)}{dK_t} = F_1(K_t, K_t) + F_2(K_t, K_t) = f(1)$

where the last equality comes from (8)-(9). Inserting this rate of return into the Planner's problem gives us a higher growth rate

$\left( \frac{c_{t+1}}{c_t} \right)^\sigma = \beta \left[ f(1) + 1 - \delta \right]$

The growth rate is higher because $f(1) > f'(1)$, a fact that is shown by (9).

All factors reproducible - one-sector model

An alternative approach to generating endogenous growth is to assume that all factors of production are reproducible. In particular, assume that human capital $X_t$ can also be produced and depreciates at rate $\delta_X$ ($\delta_K$ for physical capital). The competitive wage is given by

$w_t = F_2(K_t, X_t)$.

Households maximize

$\max \sum_{t=0}^{+\infty} \beta^t u(c_t)$
s.t. $c_t + k_{t+1} - (1-\delta_K) k_t + x_{t+1} - (1-\delta_X) x_t = r_t k_t + w_t x_t$.

The first order condition with respect to human capital $x_{t+1}$ is

$u'(c_t) = \beta u'(c_{t+1}) \left[ w_{t+1} + 1 - \delta_X \right]$. (13)

Since households are investing in both human and physical capital (equations (11) and (13)), the two returns must be equal,

$F_1(K_{t+1}, X_{t+1}) - \delta_K = F_2(K_{t+1}, X_{t+1}) - \delta_X$,

which, using (8)-(9), implies that

$\delta_X - \delta_K = f(\hat{K}_{t+1}) - \left[ 1 + \hat{K}_{t+1} \right] f'(\hat{K}_{t+1})$. (14)

This last equation pins down a time invariant value $\hat{K} = \hat{K}^*$. One can solve (14) for $f'(\hat{K}^*)$ and insert it in (12) to get

$\left( \frac{c_{t+1}}{c_t} \right)^\sigma = \beta \left[ \frac{f(\hat{K}^*)}{1 + \hat{K}^*} + 1 - \frac{\delta_X + \delta_K \hat{K}^*}{1 + \hat{K}^*} \right]$.
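Both extensions can be illustrated with a short calculation. The sketch below assumes $f(\hat{K}) = A \hat{K}^\alpha$ and hypothetical parameter values. It first compares the competitive and planner growth rates in the spillover model, then solves (14) for $\hat{K}^*$ by bisection (the left-hand side minus the right-hand side is monotone in $\hat{K}$ because $f$ is concave) and checks that the final growth formula agrees with (12) evaluated at $\hat{K}^*$.

```python
A, ALPHA = 0.3, 0.4       # hypothetical technology: f(k) = A * k**ALPHA
BETA, SIGMA = 0.96, 2.0   # hypothetical discount factor and risk aversion

def f(k):
    return A * k ** ALPHA

def f_prime(k):
    return A * ALPHA * k ** (ALPHA - 1)

def gross_growth(gross_return):
    """c_{t+1}/c_t implied by (c_{t+1}/c_t)**sigma = beta * gross_return."""
    return (BETA * gross_return) ** (1 / SIGMA)

# 1) Spillover model: private return f'(1) versus social return f(1).
delta = 0.05
competitive = gross_growth(f_prime(1.0) + 1 - delta)
planner = gross_growth(f(1.0) + 1 - delta)
print(f"competitive growth = {competitive:.4f}, planner growth = {planner:.4f}")

# 2) One-sector model with human capital: solve (14) for K_hat by bisection.
def solve_k_hat(delta_X, delta_K, lo=1e-6, hi=1e3, iterations=200):
    gap = lambda k: f(k) - (1 + k) * f_prime(k) - (delta_X - delta_K)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:   # gap is increasing in k, so the root lies below mid
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

delta_X, delta_K = 0.03, 0.05
k_hat = solve_k_hat(delta_X, delta_K)
growth_from_12 = gross_growth(f_prime(k_hat) + 1 - delta_K)
growth_closed_form = gross_growth(
    f(k_hat) / (1 + k_hat) + 1 - (delta_X + delta_K * k_hat) / (1 + k_hat)
)
print(f"K_hat* = {k_hat:.4f}, growth = {growth_from_12:.4f} = {growth_closed_form:.4f}")
```

With equal depreciation rates the solution of (14) under this functional form would be $\hat{K}^* = \alpha/(1-\alpha)$, which gives a quick way to sanity-check the solver.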

6 Miscellaneous: An empirical investigation: why doesn't capital flow from rich to poor countries?

This is derived from a paper by Lucas (same title, AER 1990). The paper starts from the observation that physical capital does not flow from rich to poor countries, where presumably it would earn a higher return. This is puzzling if one considers the basic assumptions on production technology and free trade between countries. Let us take a look at these standard assumptions. Suppose that countries share the same production function (technology spreads across borders),

$$Y = A K^{\alpha} N^{1-\alpha}, \qquad (15)$$

and that physical capital can be traded between countries without restrictions. Then differences in output per worker that are observed across countries must come from differences in the capital to labor ratio [$Y/N = A (K/N)^{\alpha}$]. But then the higher return to capital in one country should cause capital to go where the return is higher. That is why one would expect capital to move from rich to poor countries (if one believes in this simple model).

Let us take an example to illustrate this. It is observed that output per person in the U.S. is 15 times higher than in India. Taking the Cobb-Douglas production function in (15), denote by $y$ the output per worker and $k$ the capital per worker. Then

$$y = A k^{\alpha}$$

and the return to capital $r = dy/dk$ is given by

$$r = \alpha A k^{\alpha-1} = \alpha A^{1/\alpha} y^{(\alpha-1)/\alpha}.$$

Using $\alpha = 0.4$ (average of U.S. and Indian capital shares),

$$\frac{r_I}{r_{U.S.}} = \left( \frac{y_I}{y_{U.S.}} \right)^{-1.5} = 15^{1.5} \approx 58.$$

Thus if capital flows were not restricted, one would expect capital movements from the U.S. to India. Hence, this very basic framework needs to be altered. We will look below at several possible modifications of the neoclassical framework to address this issue.

Human capital in the production function: This is equivalent to considering that labor quality or effectiveness enters the labor input. Two workers producing for the same number of hours may produce different amounts of output. The production function becomes

$$Y = A K^{\alpha} (hN)^{1-\alpha}$$

where $h$ is efficiency units of labor, or human capital. In that case, denoting the output per effective worker ($Y/hN$) and capital per effective worker ($K/hN$) by $y$ and $k$, respectively, and the return to capital by $r$, we have that

$$y = A k^{\alpha}, \qquad r = \alpha A k^{\alpha-1} = \alpha A^{1/\alpha} y^{(\alpha-1)/\alpha}.$$

Lucas reports that, if each country had the same physical capital endowment per worker, an Indian worker would produce 38% of what his American counterpart would [11]. Hence $(h_I/h_{U.S.})^{1-\alpha} = 0.38$, or $h_I/h_{U.S.} \approx 0.2$. This would result in

$$\frac{r_I}{r_{U.S.}} = \left( \frac{y_I}{y_{U.S.}} \right)^{-1.5} = \left[ \text{per capita income ratio (India to U.S.)} \times \frac{h_{U.S.}}{h_I} \right]^{-1.5} \approx 5.2.$$

[11] This comes from Krueger (Economic Journal 1968). A bit more precisely, she estimates how the age/sector/education mix affects productivity in the U.S. (from earnings information). She then uses this information to compute the productivity of an Indian worker (with a different mix of age/sector/education). That allows her to estimate per capita income in the two countries if each country had the same physical capital per worker. The estimate is

$$\frac{[Y/N]_I}{[Y/N]_{U.S.}} = \frac{A (K/N)^{\alpha} h_I^{1-\alpha}}{A (K/N)^{\alpha} h_{U.S.}^{1-\alpha}} = \left( \frac{h_I}{h_{U.S.}} \right)^{1-\alpha} = 0.38.$$

Hence, $h_I/h_{U.S.} = 0.2$.

Of course, this is an improvement over the previous version that did not include human capital. The ratio $r_I/r_{U.S.}$ is lower because higher human capital in the U.S. increases $r_{U.S.}$. But the puzzle remains partially unexplained.

Human capital with externalities: Assume that

$$Y = A K^{\alpha} (hN)^{1-\alpha} h_a^{\gamma},$$

where $h_a$ is the average level of human capital in the economy and $\gamma > 0$. Hence, output depends on the average level of human capital in the economy (and not only on the effective labor used as input). Of course, when deciding how much to invest in human capital, agents do not take into account the fact that their decisions affect $h_a$ in equilibrium. Rather, they take $h_a$ as given. The accumulation of human capital therefore brings a positive externality on the economy. The idea is that the more skilled the people around you are, the more productive you are. For a complete treatment, see Lucas (JME 1988). Again denoting by $y$ and $k$ income per effective worker and capital per effective worker, we have that

$$y = A k^{\alpha} h_a^{\gamma}, \qquad r = \alpha A^{1/\alpha} y^{(\alpha-1)/\alpha} h_a^{\gamma/\alpha}.$$

Note that this is equivalent to having a growing technology component. Lucas estimates the parameter $\gamma = 0.36$. With this parameter, one finds that $r_I/r_{U.S.} \approx 1$ (the parameter is actually estimated and not

calibrated to obtain the ratio of 1!). However, notice that it was assumed that the external benefits of human capital accumulation in the U.S. did not affect productivity in India, hence that there was no knowledge spillover across borders. If there were such spillovers, the final result would not hold so strongly [12]. The previous intuition applies, with the effect of human capital on $r$ even stronger.

[12] The human capital accumulation in the U.S. would also increase $r$ in India.

7 Miscellaneous: can policy differences explain the wide disparity in income levels?

Can policy differences explain the wide disparity observed in income per capita across countries? To answer this question, let us consider a simple neoclassical, exogenous growth model. In principle, policies can differ in many ways. One can think of the relative price of capital goods as reflecting distortions arising from taxes, corruption or import substitution policies. Jones (1995) shows that a high price of investment goods relative to consumption goods is associated with low growth rates over the postwar period. More generally, we will introduce policy differences as differential tax rates on investment. Suppose that all countries have access to the same technology given by

$$Y_j = K_j^{1-\alpha} (A H_j)^{\alpha}. \qquad (16)$$

Capital accumulation is given by $I_{j,t} = K_{j,t+1} - (1-\delta) K_{j,t}$. Households maximize their lifetime utility, with preferences defined by $C_j^{1-\sigma}/(1-\sigma)$. The budget constraint is different in each country. In particular,

$$(1 + \tau_j) I_j + C_j = Y_j.$$

Here, $\tau_j$, which can be interpreted as a tax on investment, varies across countries, for example because of policies or differences in institutions and property rights enforcement. Notice that $1 + \tau_j$ is also the relative price of investment goods (relative to consumption goods): one unit of consumption goods can only be transformed into $1/(1 + \tau_j)$ units of investment goods [13].

[13] Note that, implicitly, $\tau_j I_j$ is wasted, rather than redistributed to agents in the economy.

Attacking the problem as an equilibrium problem, the solution to the household's problem is given by

$$V(k_t) = \max_{k_{t+1}} \left\{ u(c_t) + \beta V(k_{t+1}) \right\} \quad \text{s.t.} \quad (1 + \tau_j) \left[ k_{t+1} - (1-\delta) k_t \right] + c_t = w_t h + r_t k_t.$$

The first order condition is

$$(1 + \tau_j)\, u'(c_t) = \beta \left[ r_{t+1} + (1 + \tau_j)(1 - \delta) \right] u'(c_{t+1}).$$

Profit maximization by firms implies that $r = (1-\alpha) \left( Ah/k \right)^{\alpha}$. Using the functional form we have for $u(\cdot)$, we get that

$$\left( \frac{c_{t+1}}{c_t} \right)^{\sigma} = \beta \left[ \frac{1-\alpha}{1+\tau_j} \left( \frac{A_{t+1} h}{k_{t+1}} \right)^{\alpha} + 1 - \delta \right]$$

(along the steady state growth path, $Ah/k$ is constant). When the growth rate of technological progress $g_A$ is set to 0, the above equation gives us that

$$k = \left[ \frac{1-\alpha}{(1+\tau_j)\left( \frac{1}{\beta} - 1 + \delta \right)} \right]^{1/\alpha} Ah, \qquad (17)$$

implying that those distortions negatively affect capital accumulation and, thus, the respective income levels. However, can simple differences in policies (as represented by $\tau$) be enough to explain wide discrepancies in income per capita across countries? Continue assuming that $g_A = 0$ for simplicity. From equations (16)-(17), one can see that the ratio of per capita income between countries $i$ and $j$ is given by

$$\frac{Y(\tau_i)}{Y(\tau_j)} = \left( \frac{1+\tau_j}{1+\tau_i} \right)^{(1-\alpha)/\alpha}. \qquad (18)$$

Take a value of 1/3 for the capital share of income (i.e. $\alpha = 2/3$). Data suggest that there is a large amount of variation in the relative price of investment goods: in the countries where this price is highest, it is approximately eight times its value in the countries where it is lowest. Using the value $\alpha = 2/3$, equation (18) implies that the output gap between two such countries should be approximately $3 \simeq 8^{1/2}$. Therefore, differences in capital-output ratios or capital-labor ratios caused by taxes or distortions cannot account for the large income differences observed.

8 Endogenous technological change

With this section, we start a very different approach. The two previous chapters were based on some modified version of the neoclassical model. Here, we look at a model, based on Romer (JPE 1990), where technology is both an input in the production function of a final good and the output of a research sector. Hence, technology is endogenously determined. Technological change is defined as improvement in our ability to use inputs. In this sense, the output of the research sector is new blueprints on how to use labor and capital to create goods.
One fundamental characteristic of technological change is that, once the costs of creating better ways to produce goods have been incurred, the technological changes can be implemented repeatedly at no cost. The distinguishing feature of technology as an input is that it is neither a conventional good nor a public good; it is a non-rival, partially excludable good. (A good is a rival good if its use by one agent precludes its use by another; a good is excludable if the owner can prevent others from using it, possibly through the legal

system; conventional goods are both rival and excludable; public goods are non-rival and non-excludable). Technology is non-rival since a blueprint or set of instructions can be used simultaneously by several agents within a firm, and can be used repeatedly at no extra cost. Because of patents, it is partially excludable, and hence there is an incentive to incur the costs of creating new technology.

Here it is important to see how technology (the ability to use inputs to produce output) differs from human capital (the efficiency of labor): technology is a non-rival good, while human capital is a rival good. Since human capital is both a rival and an excludable good, it can be traded in competitive markets. Technology, as a non-rival good, can be accumulated without any limit (per capita) and remains forever, once the costs of creating it have been incurred.

The non-rival aspect of technology has very important consequences for the nature of the market for technology production. Temporarily assume that there are two sectors: (i) sector A, which produces R&D, and (ii) sector B, which uses R&D as one of its inputs. Denote by $F(N, R)$ the production function in sector B, where $N$ are the non-rival inputs and $R$ the rival inputs. Then, $\forall m$, $F(N, mR) = mF(N, R)$. To be convinced, think of $N$ as technology (blueprints) and $R$ as the rival inputs of capital and labor. Then, given a certain know-how $N$, output can be doubled by doubling capital and labor. This is because technology does not have to be doubled. By that argument, $F$ is homogeneous of degree 1 in $R$ and hence $F(N, R) = R F_R(N, R)$. We will see that this situation is not compatible with competitive markets.
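A small numerical illustration of this homogeneity argument, under an assumed example technology $F(N, R) = N^a R$ (the functional form, the parameter values and the helper names are hypothetical, chosen only because the function is linear in the rival input). It checks that doubling $R$ doubles output, and that paying both inputs their marginal products would more than exhaust output.

```python
# Sketch only: F(N, R) = N**a * R is an assumed example, linear in the
# rival input R and increasing in the non-rival input N.

def F(N, R, a=0.5):
    """Assumed technology: homogeneous of degree 1 in the rival input R."""
    return N**a * R

def partial(func, N, R, wrt, h=1e-6):
    """Central finite-difference approximation of a partial derivative."""
    if wrt == "N":
        return (func(N + h, R) - func(N - h, R)) / (2.0 * h)
    return (func(N, R + h) - func(N, R - h)) / (2.0 * h)

def competitive_profit(func, N, R):
    """Output minus factor payments when every input earns its marginal product."""
    F_R = partial(func, N, R, "R")
    F_N = partial(func, N, R, "N")
    return func(N, R) - R * F_R - N * F_N

if __name__ == "__main__":
    N, R = 4.0, 3.0
    print("F(N, 2R) / F(N, R) =", F(N, 2 * R) / F(N, R))
    print("profit under marginal-product pricing =", competitive_profit(F, N, R))
```

Because $F = R F_R$ already exhausts output, any positive payment $N F_N$ to the non-rival input pushes profits below zero.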
Assume perfectly competitive markets for now:
- First, if we were to assume that a firm using a non-rival good $N$ as an input is a price taker in all input and output markets, rents these inputs and pays marginal product to all of them, then this firm could not survive, since profits would be negative: $F(N, R) - R F_R(N, R) - N F_N(N, R) < 0$.
- Second, in sector A, since the marginal cost of supplying the blueprint to an additional user is zero (once the discovery has been made), the rental price of the blueprint in a competitive market should be zero. But then the firm in sector A would not have engaged in research.
- Hence, it is necessary to move away from the assumption of perfectly competitive markets. Instead, we need to account for the fact that technology is a non-rival (and partially excludable) input and only needs to be purchased once.

In what follows, because the R&D firm will be able to sell a patent and assign the right to the technology to a particular firm, it will do so and sell the patent to the highest bidder. In turn, we will assume that the firm in sector B owning the patent is a monopolist in the production of its good (whose production requires the patent). That firm will incur the cost of purchasing the patent once. Because of its monopoly situation, it will be able to sell above marginal cost and recoup its initial cost. Because firms in sector B have to bid for the patent, the discounted value of the stream of profits will be equal to the price of the patent (like a free entry condition). Keep all of this in mind. It will justify the assumption we will make later that intermediate goods producers (who use the blueprints) are monopolists in the production of their own good.

There are three sectors: a research sector, an intermediate good sector and a final good sector. We will look

at each one in turn. For simplicity, total population and labor supply are assumed constant. Total human capital is also assumed constant and the proportion allocated to the market is fixed.

Research Sector: Inputs in the research sector are human capital and the existing technology (stock of knowledge). The output is new knowledge, in the form of new designs for the intermediate good sector. When a new design has been produced, the R&D firm obtains a patent for its invention. It is assumed that if a researcher with human capital $h$ has access to the cumulative number of designs invented up to that time, $A$, his rate of production of new designs is $\dot a = \delta h A$. Summing across researchers, the aggregate rate of design production is

$$\dot A = \delta H_A A, \qquad (19)$$

where $H_A$ is aggregate human capital devoted to the R&D sector [14]. Equation (19) assumes that (i) more human capital in the R&D sector implies a higher rate of inventions, (ii) more designs already invented leads to higher productivity in producing new designs, (iii) the rate of production of new designs is linear in $H_A$ and $A$. We will discuss assumption (iii) later in more detail. Notice that the creation of new designs thus has two effects: it increases productivity in the final goods sector, but also future productivity in the R&D sector. By holding a patent, an inventor has property rights over its use in the intermediate goods sector, but not in the R&D sector. The R&D firm's maximization problem (per unit of time) is

$$\max_{H_A} \; P_A \delta H_A A - w_A H_A.$$

This implies that

$$w_A = P_A \delta A,$$

where $w_A$ is the wage rate paid per unit of human capital provided in the research sector and $P_A$ is the price of new designs.

Final Goods Sector: Inputs are labor, human capital and producer durables (the ones that have been invented to this point). Labor is measured by the number of people working. Human capital reflects the accumulation of education and work experience.
The output can be either consumed or saved as new capital (capital is measured in units of consumption goods). In that sector, the production function is [15]

$$F(H_Y, L, x) = H_Y^{\alpha} L^{\beta} \sum_{i=1}^{+\infty} x_i^{1-\alpha-\beta}$$

[14] Note that if we assumed that the rate of production of new designs was only a function of human capital, we could not generate continued growth.

[15] This implicitly assumes that the different types of capital are not perfect substitutes for each other. An additional unit of $x_i$ has no effect on the marginal productivity of $x_j$.

where $H_Y$ is the human capital allocated to the final goods sector, $L$ is labor services and $\{x_i\}$ is the set of producer durables produced to this point. Because each durable is derived from a new design, for any given $t$, there exists $A(t)$ such that $x_i = 0$ for $i > A(t)$. Since the designs last forever, $A(t)$ is an increasing function. Because $F(H_Y, L, x)$ exhibits constant returns to scale, the final goods sector can be considered as made of one single representative price-taking firm. The firm's maximization problem is

$$\max_{H_Y, L, \{x_i\}} \; H_Y^{\alpha} L^{\beta} \sum_{i=1}^{A} x_i^{1-\alpha-\beta} - w_H H_Y - w_L L - \sum_{i=1}^{A} p_i x_i,$$

where $p_i$ is the price of durable $i$. The first order conditions are:

$$w_H = MP_H, \qquad w_L = MP_L, \qquad p_i = (1-\alpha-\beta)\, x_i^{-(\alpha+\beta)} H_Y^{\alpha} L^{\beta}. \qquad (20)$$

Intermediate Goods Sector: Inputs are the designs provided by the research sector and forgone output (forgone consumption in order to accumulate capital). The output is producer durables to be used by the final goods sector. There is one firm $i$ for each producer durable $x_i$. That firm purchases the design for its product and converts $\eta$ units of final output into one unit of $x_i$ [16]. It rents the product it manufactures at a rental rate $p_i$. Since the firm has a monopoly for its own product, it faces a downward-sloping demand curve for its product, and maximizes profits accordingly. Taking $L$ and $H_Y$ as given, the intermediate goods firm producing durable $x_i$ knows that the demand it faces for its product is given by (20). Having already incurred the one-time cost of purchasing the technology, the firm's problem is to maximize profits $\pi(x)$. Remember that the cost can be recovered because of the firm's monopoly position.

$$\max_x \; \pi(x) = p(x)\, x - r \eta x, \qquad (21)$$

where $p(x)$ is defined by (20) and $r \eta x$ represents the cost of producing $x$ units ($\eta$ units of final good are needed to produce 1 unit of $x$, so $\eta x$ units, rented at the interest rate $r$, are needed to produce $x$ units).
The first order condition for this problem is

$$(1-\alpha-\beta)^2 H_Y^{\alpha} L^{\beta} x^{-(\alpha+\beta)} = r \eta,$$

which implies a price $p$ (from (20)) and a profit $\pi$:

$$p = \frac{r \eta}{1-\alpha-\beta}, \qquad \pi = (\alpha+\beta)\, p\, x,$$

[16] To be precise, the foregone consumption is actually never manufactured. Instead, the resources are used to produce capital goods.

where $x$ is the output determined by the aggregate demand curve. When awarding the new designs to competing bidders, the price $P_A$ will be bid up until it is equal to the discounted value of future profits attributable to that new design. This is equivalent to a free entry condition for the intermediate goods producers. Hence

$$\frac{\pi}{r} = P_A. \qquad (22)$$

Preferences: Preferences are assumed to be CRRA in consumption, that is

$$u(c) = \frac{c^{1-\sigma}}{1-\sigma}.$$

This implies a growth rate of consumption $g_c$:

$$g_c = \frac{\dot c}{c} = \frac{r - \rho}{\sigma}, \qquad (23)$$

where $r$ is the interest rate and $\rho$ is the rate of time preference. This can be quickly verified. Consider that households choose a stream of consumption $\{c(t)\}$. Suppose that at date $t$, households reduce consumption by a small amount $dc$ that they save, and instead increase consumption at date $t + dt$ (using the incremental investment made at date $t$). The marginal impact on lifetime utility must be zero, since households are assumed to maximize lifetime utility. The utility loss at $t$ is $e^{-\rho t} u'(c(t))\, dc = e^{-\rho t} c(t)^{-\sigma} dc$. Consumption at date $t + dt$ is increased by $e^{r\,dt} dc$. The marginal utility of consumption at $t + dt$ is $c(t+dt)^{-\sigma} = c(t)^{-\sigma} e^{-\sigma g_c dt}$. Hence, we must have that $e^{-\rho t} c(t)^{-\sigma} dc = e^{-\rho (t+dt)} e^{r\,dt} dc\; c(t)^{-\sigma} e^{-\sigma g_c dt}$. Eliminating $e^{-\rho t} c(t)^{-\sigma} dc$ on each side, we have that $g_c = (r - \rho)/\sigma$.

Equilibrium: An equilibrium is a path for prices and quantities, such that:
- consumers make consumption and savings decisions, taking interest rates as given,
- workers allocate their human capital across the different sectors, taking $A$, $P_A$, and $w_H$ as given,
- final goods producers choose labor, human capital, and durables, taking prices as given,
- intermediate goods producers set prices to maximize profits, taking interest rates and the downward-sloping demand curve they face as given,
- firms trying to enter a new durable good sector take $P_A$ as given,
- R&D firms sell their new designs to the highest bidder,
- markets clear.
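As a numerical sanity check on the intermediate goods firm's problem (21), the sketch below maximizes profits against the inverse demand (20) by a simple ternary search and compares the result with the markup rule $p = r\eta/(1-\alpha-\beta)$ and with $\pi = (\alpha+\beta)\,p\,x$. All parameter values and function names are illustrative assumptions, not part of the model's calibration.

```python
# Sketch only: parameter values are illustrative; the search bracket
# [lo, hi] is an assumption that happens to contain the optimum here.

def inverse_demand(x, H_Y, L, alpha, beta):
    """Rental price the final goods sector pays for x units, equation (20)."""
    return (1.0 - alpha - beta) * H_Y**alpha * L**beta * x ** (-(alpha + beta))

def profit(x, r, eta, H_Y, L, alpha, beta):
    """Monopoly profit p(x) x - r eta x, equation (21)."""
    return inverse_demand(x, H_Y, L, alpha, beta) * x - r * eta * x

def maximize_profit(r, eta, H_Y, L, alpha, beta, lo=1e-6, hi=1e6, iters=300):
    """Ternary search; valid because profit is concave in x for x > 0."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profit(m1, r, eta, H_Y, L, alpha, beta) < profit(m2, r, eta, H_Y, L, alpha, beta):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    params = dict(r=0.05, eta=1.0, H_Y=2.0, L=5.0, alpha=0.3, beta=0.3)
    x_star = maximize_profit(**params)
    p_star = inverse_demand(x_star, 2.0, 5.0, 0.3, 0.3)
    print("x* =", x_star)
    print("p* =", p_star, " markup rule:", 0.05 * 1.0 / (1.0 - 0.6))
    print("profit =", profit(x_star, **params), " (alpha+beta) p x =", 0.6 * p_star * x_star)
```

The price does not depend on $H_Y$ or $L$: a constant-elasticity demand curve gives a constant markup over the marginal cost $r\eta$.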
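Solving for equilibrium: Use (20) to establish that $x_i = \bar x$, $\forall\, 1 \le i \le A$ (every intermediate goods firm faces the same demand curve and the same cost). Because it takes $\eta x$ units of forgone consumption to produce $x$ units of a durable, $K = \eta \sum_{i=1}^{A} x_i$, and because of symmetry

$$K = \eta A \bar x \qquad (24)$$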

and therefore

$$F(H_Y, L, x) = H_Y^{\alpha} L^{\beta} A \bar x^{1-\alpha-\beta} = \eta^{\alpha+\beta-1} (A H_Y)^{\alpha} (A L)^{\beta} K^{1-\alpha-\beta}. \qquad (25)$$

That implicit production function is CRS in $(A H_Y)$, $(A L)$ and $K$. This is equivalent to labor and human capital augmenting technological progress. As $A$ increases, $K$, $(A H_Y)$, and $(A L)$ will grow at the same rate, ensuring sustained growth. This is to be compared with the standard result from the Solow model that, because of diminishing returns to physical capital (and a constant labor force), growth rates are eventually nil without some exogenous technological progress. Hence, here we have endogenized technological progress (it is a result of R&D activity) and, with $A$ as an implied input in the production function, we do not have the problems due to diminishing marginal returns to physical capital. Hence, the number of intermediate goods increases (more diversity in intermediate goods) and $Y$ (composite output) grows at rate $g_A$.

Calculating growth rates in this economy: We are looking for an equilibrium where $A$, $K$, $C$ and $Y$ grow at a constant rate. For the same kind of reasons as in Chapter 1, $K$ and $A$ will grow at the same rate along the balanced growth path, and so does $Y$ ($A$ is labor augmenting in the bounded inputs and the implicit production function is CRS). If this is the case, the economy is said to be on its balanced growth path.
a) For $g_A$ to be constant, $H_A$ and therefore $H_Y$ must remain constant, since $g_A = \delta H_A$.
b) With $K$ and $A$ growing at the same rate, $\bar x$ is constant.
c) The wage to human capital in the final good sector grows with $A$ ($w_H = \alpha H_Y^{\alpha-1} L^{\beta} A \bar x^{1-\alpha-\beta}$), while the wage to human capital in the R&D sector grows with $A$ if $P_A$ is constant ($w_A = P_A \delta A$).
d) Hence, $H_Y$ and $H_A$ are constant if $P_A$ is constant.
e) $P_A = \pi/r$ is constant since $p$, $\bar x$ and $r$ are constant ($r$ is constant since $g_c = (r - \rho)/\sigma$).
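From (22), we know that

$$P_A = \frac{\pi}{r} = \frac{\alpha+\beta}{r} (1-\alpha-\beta) H_Y^{\alpha} L^{\beta} \bar x^{1-\alpha-\beta}.$$

It must also be the case that wages paid to human capital in the research and the final goods sectors are equal, otherwise either $H_Y$ or $H_A$ would be equal to zero. Hence

$$w_A = P_A \delta A = w_H = \alpha H_Y^{\alpha-1} L^{\beta} A \bar x^{1-\alpha-\beta}.$$

Given the expression for $P_A$ above:

$$H_Y = \frac{1}{\delta} \frac{\alpha}{(\alpha+\beta)(1-\alpha-\beta)}\, r \qquad (26)$$

$$H_A = H - H_Y \qquad (27)$$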

Since $L$ and $H_Y$ are constant, (25) shows that $Y$ grows at the same rate as $A$. The capital accumulation condition $Y = C + \dot K$ implies that [17]

$$\frac{C}{Y} = 1 - \frac{\dot K}{K} \frac{K}{Y}.$$

Hence $C$ and $Y$ grow at the same rate. As (19) implies that $g_A = \dot A / A = \delta H_A$, we have that

$$g = g_c = g_y = g_k = g_A = \delta H_A.$$

Given the expression for $H_A$ above,

$$g = \delta H - \frac{\alpha}{(\alpha+\beta)(1-\alpha-\beta)}\, r.$$

Denote $\frac{\alpha}{(\alpha+\beta)(1-\alpha-\beta)}$ by $\Lambda$. Since $g = g_c = (r - \rho)/\sigma$, $r$ can be calculated and

$$g = \frac{\delta H - \Lambda \rho}{\sigma \Lambda + 1}.$$

Note that $g$ does not depend on $L$. This is because an increase in $L$ increases the returns to human capital in both the research and manufacturing sectors (as the author outlines, however, this is not robust to a change in the functional forms). Notice that when computing equilibrium values for $H_A$ and $H_Y$, we should take into account the constraint that $H_A > 0$. For (26) and (27) to hold, one needs that $H > \Lambda r/\delta$. If $H < \Lambda r/\delta$, $H_A = 0$ and $g = 0$. For strictly positive growth, one needs $H > \Lambda \rho/\delta$. Hence, we get the enclosed graph plotting $g$ and $H_A$ against $H$. This shows that, contrary to an increase in $L$, an increase in $H$ has a (positive) effect on $g$. Figure 1 also shows that if there is not enough total human capital in an economy, that economy may stay trapped at a zero growth rate.

Highlighting some important assumptions:
(a) The R&D sector is the engine of growth. It is therefore important to notice that the current stock of knowledge affects both current and future productivity in that sector. More on that point below.
(b) The introduction of new intermediary goods does not affect the marginal product of existing intermediary goods.
(c) The introduction of new intermediary goods allows the economy to grow forever even though each intermediary good taken separately is subject to diminishing marginal returns.
(d) It is important to notice that the non-reproducible input (labor) is not required in the R&D sector production function.

[17] This implicitly assumes no depreciation of capital. With capital depreciating at a constant rate, say $d$, we have instead that $Y = C + \dot K + dK$. We would still have that $C$ and $Y$ must grow at the same rate.
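The balanced growth path formulas above can be collected in a few lines. The sketch below is a direct transcription of (26), (27) and the expression for $g$, with a corner case for economies below the human capital threshold. Parameter values are illustrative assumptions, and setting $r = \rho$ at the zero-growth corner is an inference from (23), not a statement made in the text.

```python
# Sketch only: illustrative parameters; the function name and the
# returned dictionary layout are hypothetical.

def romer_bgp(H, alpha, beta, delta, rho, sigma):
    """Balanced growth path of the Romer model for total human capital H."""
    Lam = alpha / ((alpha + beta) * (1.0 - alpha - beta))
    g = (delta * H - Lam * rho) / (sigma * Lam + 1.0)
    if g <= 0.0:
        # Too little human capital: nobody does research and growth is zero.
        # r = rho follows from g_c = (r - rho) / sigma = 0 (our inference).
        return {"g": 0.0, "r": rho, "H_Y": H, "H_A": 0.0}
    r = sigma * g + rho            # invert (23)
    H_Y = Lam * r / delta          # equation (26)
    H_A = H - H_Y                  # equation (27)
    return {"g": g, "r": r, "H_Y": H_Y, "H_A": H_A}

if __name__ == "__main__":
    for H in (0.2, 0.375, 1.0, 2.0):
        print(H, romer_bgp(H, alpha=0.3, beta=0.3, delta=0.1, rho=0.03, sigma=2.0))
```

With these numbers the threshold $\Lambda\rho/\delta$ equals 0.375, so growth is zero for the first two values of `H` and strictly positive afterwards. Labor $L$ does not appear among the arguments, in line with the remark that $g$ does not depend on $L$.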

[Figure 1: $g = \delta H_A$ as a function of $H$, drawn for a parameter value normalized to 1. Vertical axis: $g$, $H_A$; horizontal axis: $H$.]

Related contributions: These endogenous growth models have started a new line of literature. An interesting contribution is by Jones (JPE 1995). The author starts by noticing that Romer's model (and other models as well) finds a scale effect, that is, growth rates increase when the human capital allocated to R&D increases. Jones argues that this is not empirically verified. For example, the number of scientists in most developed countries has increased a lot, but average growth rates have been constant at best. He suggests a model where growth rates depend on the growth rate of invention, but not on the level of human capital devoted to R&D. Remember that Romer assumes the rate at which designs are invented follows

$$\dot A = \delta H_A A \quad \text{or} \quad g_A = \delta H_A.$$

Jones' main contribution is to alter this law of motion of technology, while keeping the rest of Romer's model untouched. In particular, the discovery of new ideas is assumed to depend on the number of people in the R&D sector, $L_A$, and the rate $\bar\delta$ at which these scientists discover new ideas. That is [18]:

$$\dot A = \bar\delta L_A.$$

For a given number of scientists in the economy, one would expect $\bar\delta$ to depend on the amount of knowledge already created, $A$. Hence, it is assumed that $\bar\delta = \delta A^{\phi}$. No restriction is imposed on $\phi$ (notice that Romer fixes it at $\phi = 1$). Rather, $\phi > 0$ if there are positive spillovers, that is, if the work done by predecessors makes current researchers more productive. Or as Newton said, "If I have seen further it is by standing on the shoulders of giants". Alternatively, $\phi < 0$ if one assumes that there is some fixed number of inventions, and therefore the probability of discovering a new idea decreases with the stock of knowledge [19] (this is reminiscent of the idea of fishing out, where one's fishing in a pond with a limited number of fish decreases everyone else's likelihood of catching fish, hence imposing a negative externality). These two effects are externalities across time. Also consider the possible waste of research effort due to the fact that two scientists may find an invention at the same time. This would reduce the total number of new inventions at a particular point in time, so that $L_A^{\lambda}$, $0 < \lambda \le 1$, rather than $L_A$ enters the equation (this is referred to as a duplication externality, or "stepping on toes" effect). Hence

$$\dot a = \delta\, l_A A^{\phi} L_A^{\lambda - 1},$$

where $l_A$ is the individual's allocation of time to the R&D sector, and $\dot a$ the individual's output. In equilibrium, of course, the individual allocations $l_A$ add up to $L_A$, so that, on aggregate:

$$\dot A = \delta L_A^{\lambda} A^{\phi}. \qquad (28)$$

One can see that, in this framework, Romer's model is a very specific case ($\phi = 1$, $\lambda = 1$). In the rest, $\phi$ will be assumed to be strictly less than one. With no such specific restrictions on $\phi$ and $\lambda$, one can derive steady state growth, even in the presence of a growing labor force. Equation (28) implies that

$$g = g_A = \delta L_A^{\lambda} A^{\phi - 1}.$$

By differentiating (and using the fact that $g_A$ is constant along the steady state growth path), one obtains

$$g = \frac{\lambda n}{1 - \phi}, \qquad (29)$$

where $n$ is the growth rate of the labor force (along the steady state growth path, labor is allocated in fixed proportion between R&D and production).

The results from Romer and Jones can be compared along a few lines. First, in Jones, the level of resources allocated to R&D does not affect growth, whereas with Romer it does (which is counterfactual, as noticed). Second, in Jones, population growth ($n$) induces growth in per capita variables, which a lot of economists take as counterfactual.

[18] Note that Jones does not distinguish between labor and human capital the way Romer does. However, this does not really affect the argument.

[19] Maybe that is why the average duration of a Ph.D. seems to be increasing!

9 Production technology and growth

In this section, we focus on the nature of the production technology to explain some patterns observed across countries. In particular, we will see that a production function with a high degree of complementarity

between all the tasks necessary to produce the final output helps explain large income differences between countries, smaller firms in poor countries and other observations. The analysis is based on Kremer (1993).

The production of a good generally requires several steps or tasks that are typically performed by different individuals. The value of the final good being produced depends on how successfully each task was performed. When workers are heterogeneous with respect to their skills, the firm has to choose the skill level of the workers it employs. Hiring higher skill workers on average increases the value of final output, but these workers, presumably, are also paid higher wages. Hence, we can see that the nature of the production technology has the potential to affect the assignment of workers to tasks.

The production of one unit of good consists of $n$ tasks. Every task needs exactly one person devoted to it. Hence several low skill workers cannot be substituted for one high skill worker. It is assumed that each task in the production process must be successfully completed for the final product to have its maximum value. Here, skill $q$ is defined as the probability that the task will be successfully completed (hence, it is a number between 0 and 1), or equivalently, the expected percentage of maximum value the final good retains if the worker performs the task. Since the probability of making a mistake is independent across tasks, the production function is defined as

$$E(y) = k^{\alpha} \left( \prod_{i=1}^{n} q_i \right) n B,$$

where $k$ is capital, and $B$ is a normalizing constant (output per worker with a single unit of capital, when all tasks are performed perfectly, $q_i = 1$, $\forall i = 1 \ldots n$). Suppose that there is a fixed supply of capital $k^*$ and a given distribution of skill in the labor force, $\phi(q)$. Assume that leisure does not enter workers' utility and that they supply labor inelastically.
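A minimal sketch of this production function, with illustrative numbers that are not taken from Kremer's paper. It evaluates $E(y)$ for a list of task skills and compares two ways of staffing two firms from the same four workers: matching equal skills together versus mixing them. The function name and the skill values are assumptions made for the illustration.

```python
import math

# Sketch only: skills, capital and parameters are illustrative values.

def expected_output(k, skills, alpha, B):
    """E(y) = k**alpha * (product of task skills) * n * B, with n = number of tasks."""
    n = len(skills)
    return k**alpha * math.prod(skills) * n * B

if __name__ == "__main__":
    k, alpha, B = 1.0, 0.4, 1.0
    q_high, q_low = 0.9, 0.5

    # Two firms, two tasks each, staffed from two high and two low skill workers.
    matched = (expected_output(k, [q_high, q_high], alpha, B)
               + expected_output(k, [q_low, q_low], alpha, B))
    mixed = 2.0 * expected_output(k, [q_high, q_low], alpha, B)

    print("total output, matched skills:", matched)
    print("total output, mixed skills:  ", mixed)
```

With two tasks, the gap between the two totals is proportional to $(q_{high} - q_{low})^2$, so matching workers of equal skill never produces less than mixing them. This is the complementarity that drives the results of this section.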
We look for a competitive equilibrium, that is, an allocation of workers to firms, a wage schedule $w(q)$, and a rental rate of capital $r$, such that firms maximize profits and all markets clear. In this case, the workers' decision problem is trivial since they supply labor inelastically. The markets under consideration are the market for capital and the market for workers of each skill level. Since firms are risk neutral, we can replace $E(y)$ by $y$ and obtain the firm's maximization problem [20]:

$$\max_{k, \{q_i\}_{i=1}^{n}} \; k^{\alpha} \left( \prod_{i=1}^{n} q_i \right) n B - \sum_{i=1}^{n} w(q_i) - r k. \qquad (30)$$

Firms have to choose capital and a skill level for each task $i$, so as to maximize profits, given market prices.

[20] The wage schedule is taken as given, but firms can still perform marginal analysis and look at how output and the wage bill increase when the required skill level is increased at a particular task.

The $n + 1$ first order conditions are:

$$k^{\alpha} \left( \prod_{j=1, j \neq i}^{n} q_j \right) n B - w'(q_i) = 0, \quad \forall i = 1 \ldots n,$$

$$\alpha k^{\alpha-1} \left( \prod_{i=1}^{n} q_i \right) n B - r = 0.$$

Note that the inputs in that production function are complementary, since

$$\frac{d^2 y}{dq_i \, d\left( \prod_{j \neq i} q_j \right)} > 0.$$

Hence, the derivative of the marginal product of the $i$th worker with respect to the skill of all other workers is strictly positive. That is, the (marginal) product of one worker is positively related to the skill levels of ALL other workers. Hence, firms with high skill workers in the first $n - 1$ tasks prefer a high skill worker in the $n$th task, since the marginal product of all the first $n - 1$ workers depends positively on the skill of that last worker. It is in that sense that skill levels are very complementary in the production function. In fact, equilibria can be restricted to the case where all workers in a firm have the same skill level $q$. Indeed, using the first order conditions for the $i$th and the $j$th tasks, one gets:

$$n B k^{\alpha} \prod_{l \neq i} q_l = w'(q_i), \qquad n B k^{\alpha} \prod_{m \neq j} q_m = w'(q_j).$$

By taking the ratio of these two expressions, one obtains that

$$\frac{q_j}{q_i} = \frac{w'(q_i)}{w'(q_j)}, \quad \forall (i, j) \in \{1 \ldots n\}.$$

Hence $q_i w'(q_i) = q_j w'(q_j)$, $\forall (i, j) \in \{1 \ldots n\}$. It will be checked later that $q w'(q)$ is strictly monotonic. Consequently, $q_i = q_j = q$, $\forall (i, j) \in \{1 \ldots n\}$. The first order conditions become:

$$k^{\alpha} q^{n-1} n B = w'(q), \qquad \alpha k^{\alpha-1} q^{n} n B = r.$$

This implies that

$$k = \left( \frac{\alpha q^{n} n B}{r} \right)^{\frac{1}{1-\alpha}}. \qquad (31)$$

This is the demand for capital of the individual firm hiring at skill level $q$, as a function of the rental price $r$. Rewriting the individual firm's production function $y = k^{\alpha} q^{n} n B$ as $q^{n} n B = y / k^{\alpha}$, and the first order condition for capital $\alpha k^{\alpha-1} q^{n} n B = r$ as $q^{n} n B = r k^{1-\alpha} / \alpha$, one gets that $y / k^{\alpha} = r k^{1-\alpha} / \alpha$, or

$$r k = \alpha y. \qquad (32)$$

This implies that a proportion $\alpha$ of output goes to the payment of capital. Of course, this is true at the individual firm level, and on aggregate. In equilibrium, the market for capital clears, or

$$k^* = \int_0^1 \frac{1}{n} \left( \frac{\alpha q^{n} n B}{r} \right)^{\frac{1}{1-\alpha}} d\phi(q). \qquad (33)$$

The density of firms hiring $q$ workers is $1/n$ times the density of such workers. This is the sum of individual demands. One can get $r$ from there (remember that $k$ is individual capital, while $k^*$ is aggregate capital). The first order condition on $q$, $k^{\alpha} q^{n-1} n B = w'(q)$, can be rewritten as

$$\left[ \frac{\alpha q^{n} n B}{r} \right]^{\frac{\alpha}{1-\alpha}} q^{n-1} n B = w'(q),$$

or

$$w'(q) = n B \left( \frac{\alpha n B}{r} \right)^{\frac{\alpha}{1-\alpha}} q^{\frac{n}{1-\alpha} - 1}. \qquad (34)$$

Integrating, one gets

$$w(q) = (1-\alpha) \left[ \frac{\alpha n}{r} \right]^{\frac{\alpha}{1-\alpha}} B^{\frac{1}{1-\alpha}} q^{\frac{n}{1-\alpha}} + c. \qquad (35)$$

Using the first order condition on capital, $\left( \alpha n / r \right)^{\frac{1}{1-\alpha}} = k \left( q^{n} B \right)^{-\frac{1}{1-\alpha}}$, this wage schedule implies that the total wage bill is $n w(q) = n (1-\alpha) k^{\alpha} B q^{n} + nc = (1-\alpha) y + nc$. Therefore, total profits are equal to $y - (1-\alpha) y - nc - \alpha y = -nc$. Profits cannot be negative, so $c \le 0$; since $c$ is the wage of a worker with $q = 0$, it cannot be negative either. Hence $c = 0$. Since firms make zero profit in equilibrium, firms are indifferent as to the skill level of their employees, as long as they are of homogeneous skill. Since $c = 0$, we have that

$$w(q) = (1-\alpha) \left[ \frac{\alpha n}{r} \right]^{\frac{\alpha}{1-\alpha}} B^{\frac{1}{1-\alpha}} q^{\frac{n}{1-\alpha}}. \qquad (36)$$

Thus, we have what we were looking for: an expression for the wage that only depends on $q$ (and model parameters) and an expression for $r$ that only depends on model parameters.

It is important to understand what we just did in order to derive the wage schedule $w(q)$. First, of course, $w(q)$ must be defined for all values of $q$. So, taking $w(q)$ and $r$ as given, firms choose $q$ and $k$ to satisfy the two first order conditions.
Then, knowing how k is related to q and r (equation (31)), we can get a relationship, at all levels of q between w 0 (q), q and r (which is given and has been previously calculated in equation (33)). We can then integrate for w (q) and get (35). By integrating, we nd the functional form for w (q) that allows rms to possibly satisfy their rst order condition for any q. 24
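The zero-profit property of the wage schedule can be checked numerically. The sketch below is only an illustration: it writes `alpha` for the capital share in the production function, and the parameter values (`n`, `B`, `alpha`, `r`) are hypothetical, not taken from the notes. It evaluates capital demand (31) and the wage schedule (36), then confirms that profits are zero at every skill level and that wages are convex in skill.

```python
def capital_demand(q, n, B, alpha, r):
    """Capital demand of a firm hiring n workers of skill q, equation (31)."""
    return (alpha * q**n * n * B / r) ** (1.0 / (1.0 - alpha))


def wage(q, n, B, alpha, r):
    """Wage schedule with integration constant c = 0, equation (36)."""
    return ((1.0 - alpha)
            * (alpha * n / r) ** (alpha / (1.0 - alpha))
            * B ** (1.0 / (1.0 - alpha))
            * q ** (n / (1.0 - alpha)))


def profit(q, n, B, alpha, r):
    """Output minus the wage bill and the rental cost of capital."""
    k = capital_demand(q, n, B, alpha, r)
    y = k**alpha * q**n * n * B
    return y - n * wage(q, n, B, alpha, r) - r * k


# Hypothetical parameter values, chosen only for illustration.
params = dict(n=2, B=1.0, alpha=0.3, r=0.05)
for q in (0.7, 0.8, 0.9):
    print(q, wage(q, **params), profit(q, **params))
```

With these values the printed profits are zero up to rounding error, whatever the skill level, which is why firms are indifferent between homogeneous workforces of different skill.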

Applications:

This production technology may help explain some empirical facts observed across countries.

a) Income and productivity differentials. There are enormous income per capita differentials between the poorest and the richest countries in the world. Lucas (1990) suggests a few explanations, primarily including human capital in the production function (with or without externalities). Hence, different levels of human capital across countries may account for some of the great variability in income observed. With the production technology used in this paper, however, small differences in skill distribution result in large differences in wages or incomes. This is because the production function exhibits a high degree of complementarity in its inputs, the workers' skills. Indeed, one can see from (35) that $w(q)$ is strictly convex in $q$. Therefore a small increase in $q$ results in a large increase in $w(q)$. It is interesting to also notice that the derivative of the marginal product of capital with respect to the skill of all workers is strictly positive,

$$\frac{d^2 y}{dk \, d\big(\prod_i q_i\big)} > 0$$

and hence, that capital and skills are complementary inputs in the production technology. Therefore more capital will be used with higher skill workers, explaining why flows of capital from rich to poor countries are relatively limited: higher skill workers are less likely to waste the rental value of capital.

b) Firms hire workers of different skill levels and produce goods of different quality. Within a given industry, there are firms producing high quality goods with skilled workers and firms that produce low quality goods with unskilled workers. However, both types of firms can survive (profits are zero, for all $q$). This can also be observed across economies, where some countries tend to specialize in high quality goods while others specialize in low quality goods.

c) There is a positive correlation among the wages of workers in different occupations within firms.
This is because, in the equilibrium allocation of skills, firms hire workers of homogeneous skills. Simply put, high-$q$ secretaries work with high-$q$ lawyers and bankers.

d) For a given symmetric distribution of skill $q$, the distribution of income $w$ is skewed to the right, while the distribution of log income $\log w$ inherits the shape of the distribution of $\log q$. This is what is observed empirically. $w(q)$ is strictly increasing and convex in $q$. For the sake of argument, assume that $q$ is uniformly distributed. Denote the mean and the median of the skill distribution by $E(q)$ and $MD(q)$ respectively. Because of the uniform distribution, we have that $E(q) = MD(q)$. Since $w(q)$ is strictly monotonic, $MD(w(q)) = w(MD(q))$. By Jensen's inequality and the strict convexity of $w(q)$, $E(w(q)) > w(E(q)) = w(MD(q)) = MD(w(q))$. Hence, the distribution of income is skewed to the right. The distribution of $\log w$ is symmetric whenever the distribution of $\log q$ is, since $\log w(q) = \frac{n}{1-\alpha} \log q + \text{constant}$.

Endogenizing the choice of technology:

Consider that firms can choose how complex a production technology to implement, as well as what kind of workers to hire. In particular, a complex technology is one with a high $n$, i.e. with multiple tasks, and hence more possibilities to commit mistakes. For simplicity, capital is not necessary for production and all tasks require the same amount of labor. The production function is given by:

$$y = \Big(\prod_{i=1}^{n} q_i\Big) n B(n)$$

where $n$ is the number of tasks to be performed to complete the production process and $B(n)$ is the value of output per task if all tasks are performed correctly. It is assumed that increasing the complexity of the technology increases the value of final output, if all tasks are performed correctly, but that these benefits diminish as technology becomes more and more complex. In other words,

$$B'(n) > 0, \qquad B''(n) < 0$$

Hence, the firm's maximization problem is, taking $w(q)$ as given:

$$\max_{n, \{q_i\}} \Big(\prod_{i=1}^{n} q_i\Big) n B(n) - \sum_{i=1}^{n} w(q_i)$$

Again, equilibria can be restricted to cases where the $q_i$'s are all equal. So, the firm's problem becomes

$$\max_{n, q} \; q^{n} n B(n) - n w(q)$$

The first order conditions are

$$n q^{n-1} B(n) - w'(q) = 0$$

$$(\log q) \, q^{n} n B(n) + q^{n} B(n) + q^{n} n B'(n) - w(q) = 0$$

The first order condition on $q$ implies that $w(q) = q^{n} B(n) + c$. To ensure non-negative profits, $c$ must be equal to 0. Thus,

$$w(q) = q^{n} B(n)$$

The first order condition on $n$ can be simplified as

$$(\log q) B(n) + B'(n) = 0$$

This implies that:

$$\log q = -\frac{B'(n)}{B(n)} \tag{37}$$

Equation (37) implicitly defines $n$ as a function of $q$. The left-hand side increases with $q$, while the right-hand side increases in $n$ (since $B' > 0$ and $B'' < 0$). As a result, the implicitly defined $n$ is an increasing function of $q$. Therefore firms choosing a complex production technology (high $n$) will also choose highly skilled workers (high $q$). Because mistakes are more costly with a highly sophisticated technology, firms prefer to hire workers who are expected to commit fewer mistakes.

Applications:

Even though we are not fully solving for the equilibrium in this case, we can still make use of what was just established.

e) Rich countries specialize in complicated products and firms are larger in rich countries. Rich countries, with high skilled workers, specialize in complex products, i.e. with high $n$.

f) Within a single country, there is a positive correlation between wage and firm size.
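To see how the first order condition on $n$ ties complexity to skill, the sketch below solves $(\log q) B(n) + B'(n) = 0$ by bisection. The functional form $B(n) = n^{\gamma}$ with $0 < \gamma < 1$ is a hypothetical choice made only for illustration (it satisfies $B' > 0$ and $B'' < 0$); the notes do not specify $B$. For this form the solution is $n = -\gamma / \log q$, which the code uses as a cross-check.

```python
import math

GAMMA = 0.5  # hypothetical curvature parameter, 0 < GAMMA < 1


def B(n):
    """Assumed value of output per task: increasing and concave in n."""
    return n ** GAMMA


def B_prime(n):
    """Derivative of the assumed B(n)."""
    return GAMMA * n ** (GAMMA - 1.0)


def foc_n(n, q):
    """Left-hand side of the simplified first order condition on n."""
    return math.log(q) * B(n) + B_prime(n)


def optimal_n(q, lo=1e-6, hi=1e6, iterations=200):
    """Bisection on foc_n; assumes 0 < q < 1 and a root inside [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if foc_n(mid, q) > 0.0:
            lo = mid  # the condition is still positive: complexity is too low
        else:
            hi = mid
    return 0.5 * (lo + hi)


for q in (0.80, 0.90, 0.95):
    print(q, optimal_n(q), -GAMMA / math.log(q))
```

The chosen number of tasks rises with $q$, in line with the statement that firms using complex technologies hire more skilled workers. Here $n$ is treated as a continuous variable, as in the first order condition.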

Chapter III

SOME LABOR MARKET MODELS

1 Some refinements of the standard RBC model applied to the labor market

As we saw, RBC theory is fundamentally a methodology for the study of business cycles. You may also remember that the basic RBC model does not perform very well for the labor market. The point of this chapter is to modify the structure of the basic model in order to replicate some characteristics that are peculiar to the labor market. In particular, we want to improve the basic RBC model along the following lines:

(i) the labor input in the U.S. economy fluctuates roughly twice as much as it does in the model,
(ii) hours of work and productivity show essentially zero correlation in reality, but they tend to move together in the artificial economy,
(iii) total hours worked fluctuate more than productivity in the U.S.

These points are illustrated in the table below:

TABLE 1
Columns: $\sigma_h/\sigma_y$, $\sigma_h/\sigma_{prod}$, $corr(h, prod)$
Rows: U.S. time series; Standard model

What seems to be the problem? One problem with the basic RBC model is that it relies on only one type of shock to generate fluctuations. It is therefore not too surprising that, in general, variables would tend to be highly correlated. In the case of hours of work and productivity, for example, a positive technology shock increasing productivity would also increase hours, as wages would also increase (remember that in the neoclassical framework, average productivity is equal to $\frac{y}{h} = z \left(\frac{k}{h}\right)^{\theta}$ and marginal productivity, or the wage, is equal to $(1-\theta) z \left(\frac{k}{h}\right)^{\theta}$).

The fact that hours vary more than productivity in the U.S. means that the implied short run labor supply elasticity is relatively high. In other words, in the model, households are not sufficiently willing to substitute leisure across periods in response to a given shock. The first two adjustments to the model will therefore attempt to generate a stronger response of hours to a given shock.

Hours and productivity are too highly positively correlated in the model because fluctuations are driven by a single shock. Think of a positive technology shock as a shift of labor demand with no alteration to labor supply. It is then not surprising to find that the real wage and hours are highly correlated. Hence, the last two adjustments will introduce additional shocks to the model.

But first, let us quickly consider the baseline model, from which the various adjustments depart.

1.2 Baseline model

Agents: homogeneous households; firms.

Preferences: defined over consumption and leisure, $u(c, l) = \log(c) + A \log(l) = \log(c) + A \log(1-h)$, with discount factor $\beta$.

Endowments: a unit endowment of time every period, split between leisure ($l_t$) and work hours ($h_t$). Hence, $h_t + l_t = 1$. Households start every period with their capital $k_t$.

Technology: $y_t = e^{z_t} f(h_t, k_t) = e^{z_t} h_t^{1-\theta} k_t^{\theta}$, where $z_t$ is a technology shock observed by agents at the beginning of the period, before decisions are taken. The shocks $z_t$ follow a random process: $z_{t+1} = \rho z_t + \varepsilon_t$.

Skipping a few steps along the way, and solving the problem as a social planner problem (the problem does not have distortions), we have the following maximization problem:

$$\max_{\{c_t\}, \{h_t\}} E \sum_{t=0}^{+\infty} \beta^{t} u(c_t, 1-h_t)$$

$$\text{s.t. } c_t + k_{t+1} - (1-\delta) k_t = e^{z_t} h_t^{1-\theta} k_t^{\theta}$$

Using standard dynamic programming techniques, it is possible to get decision rules $c_t = c_t(z_t, k_t)$ and $h_t = h_t(z_t, k_t)$, and from there, all the values of interest ($i_t$, ...). One can then simulate the model for a large number of periods, repeat the simulation a number of times, and average the statistics of interest over the simulations. The results are provided in table 1 above (with the same statistics for the different variants).
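The building blocks of the baseline model are easy to put into code. The sketch below is not a solution of the planner's problem (that requires the dynamic programming step mentioned above); it only generates a path for the technology shock from a given list of innovations and evaluates output, average productivity and the marginal product of labor for given hours and capital. All numerical values are hypothetical.

```python
import math


def simulate_z(shocks, rho, z0=0.0):
    """Path of z from z_{t+1} = rho * z_t + eps_t, given the innovations eps_t."""
    path = [z0]
    for eps in shocks:
        path.append(rho * path[-1] + eps)
    return path


def output(z, h, k, theta):
    """y = e^z * h^(1 - theta) * k^theta."""
    return math.exp(z) * h ** (1.0 - theta) * k ** theta


def average_productivity(z, h, k, theta):
    """Output per hour, y / h."""
    return output(z, h, k, theta) / h


def wage(z, h, k, theta):
    """Marginal product of labor, (1 - theta) * e^z * (k / h)^theta."""
    return (1.0 - theta) * math.exp(z) * (k / h) ** theta


# A single unit innovation decays geometrically at rate rho.
print(simulate_z([1.0, 0.0, 0.0], rho=0.5))
# With hours and capital held fixed, a higher z raises productivity and the wage together.
for z in (0.0, 0.1):
    print(z, average_productivity(z, 0.3, 10.0, 0.36), wage(z, 0.3, 10.0, 0.36))
```

The wage is always the fraction $1-\theta$ of average productivity, which is one way to see why a single technology shock moves the two series together in the model.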

1.3 First adjustment: non-separable leisure

Consider that the utility from leisure does not only depend on leisure this period, but also on past leisure. More precisely, it depends on a weighted average of current and past leisure. In this case, leisure in one period is a good substitute for leisure in nearby periods. Hence, the agent may be more willing to reduce leisure (that is, increase hours of work) when the technology shock is high, if he derives utility not only from leisure this period, but also from leisure in past periods. With this type of utility function, it is possible to reduce leisure a lot more in a good period, since the marginal utility of leisure at very low values of leisure is not necessarily infinite anymore. In particular, assume the following utility function:

$$\log(c_t) + A \log L_t = \log(c_t) + A \log\Big(\sum_{i=0}^{+\infty} a_i l_{t-i}\Big)$$

where

$$\sum_{i=0}^{+\infty} a_i = 1 \quad \text{and} \quad a_{i+1} = (1-\eta) a_i, \; i \geq 1, \quad 0 < \eta < 1$$

This implies, of course, that $a_i = (1-\eta)^{i-1} a_1$, for $i \geq 1$. Remark that the number of free parameters governing the weights is actually limited to two ($\eta$ and $a_0$). Noticing that $a_0 + \frac{a_1}{\eta} = 1$, one can rewrite $L_t$ as:

$$L_t = a_0 (1-h_t) + a_1 \sum_{i=1}^{+\infty} (1-\eta)^{i-1} (1 - h_{t-i})$$

$$= a_0 (1-h_t) + \frac{a_1}{\eta} - a_1 \sum_{i=1}^{+\infty} (1-\eta)^{i-1} h_{t-i}$$

$$= 1 - a_0 h_t - \eta (1-a_0) \sum_{i=1}^{+\infty} (1-\eta)^{i-1} h_{t-i}$$

Define $B_t$ as

$$B_t = \sum_{i=1}^{+\infty} (1-\eta)^{i-1} h_{t-i}$$

This new variable $B_t$ should be viewed as the contribution of past leisure to $L_t$. With a simple manipulation, it can be shown that:

$$B_{t+1} = (1-\eta) B_t + h_t$$

Thus we have that:

$$L_t = 1 - a_0 h_t - \eta (1-a_0) B_t$$

$$B_{t+1} = (1-\eta) B_t + h_t$$

The problem then becomes:

$$\max_{\{k_{t+1}\}, \{h_t\}} E \left\{ \sum_{t=0}^{+\infty} \beta^{t} \left[ \log(c_t) + A \log\big(1 - a_0 h_t - \eta (1-a_0) B_t\big) \right] \right\}$$

$$\text{s.t. } c_t + k_{t+1} - (1-\delta) k_t = e^{z_t} h_t^{1-\theta} k_t^{\theta}$$

$$B_{t+1} = (1-\eta) B_t + h_t \quad [1]$$

$$z_{t+1} = \rho z_t + \varepsilon_t$$

This problem can be solved using standard dynamic programming (notice that $B_t$ is a state variable) [2]. As the focus is on assessing whether the adjustment considered improves the standard model along the lines mentioned, we look at the table provided below.

TABLE 2
Columns: $\sigma_h/\sigma_y$, $\sigma_h/\sigma_{prod}$, $corr(h, prod)$
Rows: U.S. time series; Standard model; Non-separable utility; Indivisible labor; Government shocks; Home production

It looks like considering non-separable utility improves the standard model. As expected, hours are more volatile because agents are more willing to substitute leisure across time.

[1] It is a law of motion for the state variable $B_t$.

[2] Since time goes back indefinitely, there is no initial period at which $B_t$ would fail to be properly defined. To solve, assume that you start at $B_t = B_0$ (for some $t = 0$). $B_0$ is a given parameter of the problem, and you can solve the problem using dynamic programming. You thus get decision rules and find a steady state that is independent of $B_0$. In fact, when you simulate, you would use an approximation around a steady state (see Cooley, ch. 2).
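The equivalence between the weighted sum defining $L_t$ and the recursive form in $B_t$ can be verified on a short hours history. The sketch below assumes, purely for illustration, that hours were zero before the start of the sample (so the initial stock of past hours is zero); the parameter values are hypothetical.

```python
def past_hours_stock(past_hours, eta):
    """B_t built from B_{t+1} = (1 - eta) * B_t + h_t, starting from B = 0.

    past_hours lists h_{t-m}, ..., h_{t-1} in chronological order.
    """
    stock = 0.0
    for h in past_hours:
        stock = (1.0 - eta) * stock + h
    return stock


def leisure_recursive(hours, a0, eta):
    """L_t = 1 - a0 * h_t - eta * (1 - a0) * B_t, with h_t the last entry of hours."""
    *past, h_t = hours
    return 1.0 - a0 * h_t - eta * (1.0 - a0) * past_hours_stock(past, eta)


def leisure_direct(hours, a0, eta):
    """L_t = sum_i a_i * l_{t-i}, with leisure equal to 1 before the sample starts."""
    a1 = eta * (1.0 - a0)
    *past, h_t = hours
    m = len(past)
    total = a0 * (1.0 - h_t)
    for i in range(1, m + 1):
        weight = a1 * (1.0 - eta) ** (i - 1)
        total += weight * (1.0 - hours[-1 - i])  # hours[-1 - i] is h_{t-i}
    # Weights on periods before the sample, where leisure is assumed to be 1.
    total += a1 * (1.0 - eta) ** m / eta
    return total


hours = [0.30, 0.35, 0.25, 0.40]  # hypothetical history, oldest first
print(leisure_recursive(hours, a0=0.35, eta=0.1))
print(leisure_direct(hours, a0=0.35, eta=0.1))
```

Both functions return the same number, which is the point of introducing $B_t$: a single state variable summarizes the whole history of hours.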

1.4 Second adjustment: indivisible labor

In addition to the facts already mentioned, it is observed that most fluctuations in aggregate hours worked are due to fluctuations in the number employed rather than to fluctuations in hours per employed worker. The second adjustment attempts to also address this issue. The treatment is based on Hansen (JME 1985).

A non-convexity is introduced in the model: workers can either work zero hours or a constant number of hours $\tilde{h}$. This non-convexity is taken as given. In practice, it may arise from the costs of going to work or from the fact that individuals need warm-up time every day before becoming fully productive. This is not specified here. The important point is that the choice of hours is discrete.

As illustrated in Rogerson (JME 1988), the competitive equilibrium in an economy with non-convex choice sets (because of indivisible labor) may involve different agents choosing different allocations of consumption and hours of work (as long as agents are indifferent between the two bundles). Also, there are allocations involving lotteries over employment that dominate the competitive equilibrium allocation [3]. In such an allocation, every agent consumes $c$ and works with some probability $\phi \in [0, 1]$; it dominates the equilibrium allocation where a fraction of agents consume $c_1$ and work $\tilde{h}$ and the rest consume $c_2$ and work zero hours. Hence, realizing the optimal allocation in these economies involves such lotteries to decide which agents work and which agents do not. You may think of these lotteries as convexifying the choice set in the problem. This may look very different from the choice sets you are used to. The next paragraph explains in more detail how these lotteries function.

Every period, a lottery is held determining which agents work and which do not. Agents are only paid if they work. Also assume that agents can buy unemployment insurance in case they do not work.

Here is how the economy works each period. First, the technology shock $z_t$ is realized. Then, households choose a probability $\alpha_t$ of working, rather than hours of work, in the form of a contract with the firm. They may also purchase privately provided unemployment compensation $y_t$ at a price $p(\alpha_t)$, and they decide on consumption $c_{st}$ and investment $i_{st}$, contingent on whether the household works or not ($s = 1$ indicating that the household is working and $s = 2$ indicating that it is not). [Notice that it does not make sense to choose $c$ and $i$ independently of the outcome of the lottery. In fact, agents choose $c$ and $i$ after realization of the lottery and after insurance payments. But when you choose $\alpha$, you are aware that you will have to make that contingent choice.] Their problem is the following (it is very convenient to denote next period values by a prime):

[3] Assuming that agents are expected utility maximizers.

$$V(k, K, z) = \max_{\alpha, y, c_s, i_s} \; \alpha \left\{ u\big(c_1, 1-\tilde{h}\big) + \beta E\left[V(k_1', K', z')\right] \right\} + (1-\alpha) \left\{ u(c_2, 1) + \beta E\left[V(k_2', K', z')\right] \right\}$$

$$\text{s.t. } c_1 + i_1 = w(K, z) \tilde{h} + r(K, z) k - p(\alpha) y$$

$$c_2 + i_2 = y + r(K, z) k - p(\alpha) y$$

$$k_s' = (1-\delta) k + i_s, \quad s = 1, 2$$

The insurance company maximizes expected profits $p(\alpha) y - (1-\alpha) y$. Suppose that competition in the insurance industry forces profits to zero. Then, $p(\alpha) = (1-\alpha)$. In that case, the insurance is actuarially fair. The problem can be rewritten as:

$$V(k, K, z) = \max_{\alpha, y, k_s'} \; \alpha \left\{ u\big(w(K, z) \tilde{h} + r(K, z) k - (1-\alpha) y - k_1' + (1-\delta) k, \; 1-\tilde{h}\big) + \beta E\left[V(k_1', K', z')\right] \right\}$$

$$+ (1-\alpha) \left\{ u\big(y + r(K, z) k - (1-\alpha) y - k_2' + (1-\delta) k, \; 1\big) + \beta E\left[V(k_2', K', z')\right] \right\}$$

The first order conditions with regard to $k_s'$ and $y$ are:

$$u_1\big(c_1, 1-\tilde{h}\big) = \beta E\left[V_1(k_1', K', z')\right]$$

$$u_1(c_2, 1) = \beta E\left[V_1(k_2', K', z')\right]$$

$$u_1\big(c_1, 1-\tilde{h}\big) = u_1(c_2, 1)$$

We assumed preferences that are additively separable in consumption and leisure (as well as strictly concave). Notice that this drastically simplifies the problem. With log utility over consumption, this amounts to assuming a constant relative risk aversion coefficient of 1, which is in the range of estimates for that coefficient. Hence:

$$c_1 = c_2$$

In turn, this implies that:

$$k_1' = k_2'$$

$$i_1 = i_2$$

The first line is obtained from combining the first two first order conditions above, and the second line is due to the fact that $i_s = k_s' - (1-\delta) k$. This means that the two budget constraints are the same. Hence,

$$y = w(K, z) \tilde{h}$$

We have the usual result that the insurance is actuarially fair and agents are fully insured. All agents facing the same problem implies that $k = K$ (everybody has the same capital holdings, which was not obvious from the start, since agents have different stochastic employment histories). In summary, the agent's problem can be greatly simplified as:

$$V(k, K, z) = \max_{\alpha, k'} \left\{ \alpha u\big(c, 1-\tilde{h}\big) + (1-\alpha) u(c, 1) + \beta E\left[V(k', K', z')\right] \right\}$$

Since the utility function is (Log-Log) additively separable, it can be further simplified as:

$$V(k_t, K_t, z_t) = \max_{\alpha_t, k_{t+1}} \left\{ \log(c_t) + \alpha_t A \log\big(1-\tilde{h}\big) + \beta E\left[V(k_{t+1}, K_{t+1}, z_{t+1})\right] \right\}$$

$$\text{s.t. } c_t + k_{t+1} - (1-\delta) k_t = w(K_t, z_t) \alpha_t \tilde{h} + r(K_t, z_t) k_t$$

The new budget constraint reads as if agents were paid according to expected hours. This gives us two first order conditions:

$$\frac{1}{c_t} w(K_t, z_t) \tilde{h} + A \log\big(1-\tilde{h}\big) = 0$$

$$\beta E\left[\frac{1}{c_{t+1}} \big(r(K_{t+1}, z_{t+1}) + 1 - \delta\big)\right] = \frac{1}{c_t}$$

Notice that the main result in the end, for our purpose, is that the problem has been reduced to a usual representative agent problem with the following utility function:

$$\log(c) + \alpha A \log\big(1-\tilde{h}\big)$$

Of course, the agent now chooses $k_{t+1}$ and $\alpha_t$, rather than $k_{t+1}$ and $h_t$ [4]. Notice that the probability $\alpha_t$ is given by [5]:

$$\alpha_t = \frac{h_t}{\tilde{h}}, \quad \text{where } h_t = \text{per-capita hours}$$

Hence, the utility function in the problem is of the following form (up to a constant term):

$$\log(c_t) - B h_t, \quad \text{where } B = -\frac{A \log\big(1-\tilde{h}\big)}{\tilde{h}} > 0$$

[4] The firm's problem is standard. Simply, firm optimization and market clearing imply that $w(z, K) = f_1\big(\alpha \tilde{H}, K\big)$ and $r(z, K) = f_2\big(\alpha \tilde{H}, K\big)$.

[5] When looking at data, $\alpha_t$ is to be interpreted as $\frac{h_t}{\tilde{h}}$.
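The reduction to a utility function that is linear in hours can be illustrated with a few lines of code. The sketch below computes the coefficient $B$ from $A$ and the fixed shift length, checks that the expected leisure term $\alpha A \log(1-\tilde{h})$ equals $-B h$ when per-capita hours are $h = \alpha \tilde{h}$, and backs out consumption from the first order condition on $\alpha$. The values of `A`, `h_tilde`, `alpha` and `w` are hypothetical.

```python
import math


def linear_disutility_coefficient(A, h_tilde):
    """B = -A * log(1 - h_tilde) / h_tilde, positive for 0 < h_tilde < 1."""
    return -A * math.log(1.0 - h_tilde) / h_tilde


def expected_leisure_utility(alpha, A, h_tilde):
    """alpha * A * log(1 - h_tilde): leisure term of the lottery economy."""
    return alpha * A * math.log(1.0 - h_tilde)


def consumption_from_foc(w, A, h_tilde):
    """Solve (1 / c) * w * h_tilde + A * log(1 - h_tilde) = 0 for c."""
    return -w * h_tilde / (A * math.log(1.0 - h_tilde))


A, h_tilde, alpha, w = 2.0, 0.53, 0.6, 1.5  # hypothetical values
B_coef = linear_disutility_coefficient(A, h_tilde)
per_capita_hours = alpha * h_tilde

print(expected_leisure_utility(alpha, A, h_tilde), -B_coef * per_capita_hours)
print(consumption_from_foc(w, A, h_tilde), w / B_coef)
```

The second line of output shows that the first order condition on $\alpha$ is the same as $c = w / B$, the condition one would obtain directly from the linear utility $\log(c) - B h$.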

It is linear in leisure. Hence, even though households have the usual utility function, the problem can be recast as that of a representative agent whose utility function is linear in leisure. You may remember that the problem with the standard model was that households were not sufficiently willing to substitute leisure across periods. With a utility that is no longer strictly concave in leisure, the representative agent does not attempt to smooth out his intertemporal pattern of leisure, and this may result in more substitution of leisure across periods in response to a given technology shock.

This can be verified in table 2. $\sigma_h/\sigma_y$ in the indivisible labor economy is almost the same as in the data. In fact, one could argue that this adjustment has generated too much volatility of hours, when looking at $\sigma_h/\sigma_{prod}$, which is much higher than in the U.S. However, if one also allowed for adjustments in the number of hours per worker, this may reduce the volatility.

Finally, $corr(h, prod)$ is still too high. Actually, indivisible labor was not trying to improve the standard model along that particular line. For that, one needs to add a new type of shock. This is the subject of the next two subsections.

1.5 Third adjustment: an economy with government spending shocks

Assume that there is a government and that government expenditures $g_t$ are financed through non-distortionary lump-sum taxation $T_t$. Assume the following functional form for the random shock to government spending:

$$\log(g_{t+1}) = (1-\lambda) \log(\bar{g}) + \lambda \log(g_t) + \mu_t$$

$$\mu_t \sim N\big(0, \sigma_{\mu}^{2}\big), \qquad \mu_t \perp z_t$$

The value of $\lambda$ reflects how persistent ($\lambda = 1$) or temporary ($\lambda = 0$) these shocks are expected to be [6]. Temporary shocks have a smaller wealth effect (the average of $\frac{g_t}{y_t}$ is used to get $\bar{g}$). The government shocks are supposed to be independent of the technology shocks. The utility function is still given by:

$$u(c, l) = \log(c) + A \log(1-h)$$

which implies that $g$ does not enter the utility function.
Other functional forms could have been chosen that would have included $g$ [7]. With this assumption, the government expenditures are just a drain on resources, since they do not provide any utility to the households, but rather reduce the budget available to allocate between consumption and investment [8]. This is equivalent to a negative wealth effect, and tends to reduce leisure and, hence, increase labor. Remember that there is a second source of uncertainty in the economy: the uncorrelated technology shock. This shock will affect both productivity and hours of work. Since hours of work are also affected by the government shock, it is possible that productivity and labor exhibit low correlation, if the effects of the two shocks on hours work in opposite directions. The maximization problem is the following:

$$\max E \left[ \sum_{t=0}^{+\infty} \beta^{t} u(c_t, 1-h_t) \right]$$

$$\text{s.t. } c_t + k_{t+1} - (1-\delta) k_t + g_t = e^{z_t} h_t^{1-\theta} k_t^{\theta}$$

$$g_t = T_t$$

$$z_{t+1} = \rho z_t + \varepsilon_t$$

$$\log(g_{t+1}) = (1-\lambda) \log(\bar{g}) + \lambda \log(g_t) + \mu_t$$

Remark that if we had set this up as a competitive equilibrium problem (even though there are no distortions), we would have had the following problem to solve. Denote the random shocks by $x_t = (z_t, g_t)$:

$$V(k_t, K_t, x_t) = \max_{k_{t+1}, h_t} \left\{ u(c_t, 1-h_t) + \beta E\left[V(k_{t+1}, K_{t+1}, x_{t+1})\right] \right\}$$

$$\text{s.t. } w_t(K_t, x_t) h_t + r_t(K_t, x_t) k_t = c_t + i_t + T_t$$

$$z_{t+1} = \rho z_t + \varepsilon_t$$

$$\log(g_{t+1}) = (1-\lambda) \log(\bar{g}) + \lambda \log(g_t) + \mu_t$$

$$w_t(K_t, x_t) = (1-\theta) e^{z_t} h_t^{-\theta} k_t^{\theta}$$

$$r_t(K_t, x_t) = \theta e^{z_t} h_t^{1-\theta} k_t^{\theta-1}$$

Given the first order conditions for the firm, the two maximization problems are equivalent.

Intuitively, the shock to government expenditures can be seen as affecting labor supply, while the shock to technology can be seen as affecting labor demand. The first one would tend to produce a negative relationship between hours and productivity, while the second one would tend to produce the opposite. Hence, there is a potential for the two shocks together to produce low correlation between hours of work and productivity, as illustrated in the figures below. Of course, this can only be settled by calibrating and simulating the model. The results from table 2 show that, indeed, the correlation between hours and productivity decreased once we introduced government expenditure shocks. It is still a little too high, but constitutes an improvement. The adjustment did not improve the volatility of hours (but was not really expected to do so).

[6] If $\lambda = 1$, $\ln g_{t+1} = \ln g_t + \mu_t$. Hence $g_{t+1} = g_t e^{\mu_t}$ and $E[g_{t+1} \mid g_t] = g_t E[e^{\mu_t}]$, hence the persistence of shocks. If $\lambda = 0$, $\ln g_{t+1} = \ln \bar{g} + \mu_t$, and $g_{t+1} = \bar{g} e^{\mu_t}$ and $E[g_{t+1} \mid g_t] = \bar{g} E[e^{\mu_t}]$, hence the temporary nature of shocks.

[7] For example, $u(c, l) = \log(C) + A \log(l)$, where: (i) $C = c + g$ ($c$ and $g$ are perfect substitutes) or (ii) $C = \left[\alpha c^{\phi} + (1-\alpha) g^{\phi}\right]^{\frac{1}{\phi}}$.

[8] If, instead, private consumption $c$ and $g$ were perfect substitutes, households would clearly derive the same utility as if there were no government expenditure. They would just reduce $c$ by the amount of government expenditures $g$. Consider the following: Problem A [$u = u(c + g, 1-h)$ and $g > 0$] and Problem B [$u = u(c, 1-h)$ and $g = 0$].

Problem A: $\max_{h, k'} \left\{ u(c + g, 1-h) + \beta E\left[V(k_{t+1}, K_{t+1}, x_{t+1})\right] \right\}$ s.t. $wh + rk = c + k' - (1-\delta) k + T$ and $g = T$ (government budget constraint). The FOCs are: $u_1(c + g, 1-h) = \beta E\left[(r' + 1 - \delta) u_1(c' + g', 1-h')\right]$ and $w u_1(c + g, 1-h) = u_2(c + g, 1-h)$.

Problem B: $\max_{h, k'} \left\{ u(c, 1-h) + \beta E\left[V(k_{t+1}, K_{t+1}, x_{t+1})\right] \right\}$ s.t. $wh + rk = c + k' - (1-\delta) k$. The FOCs are: $u_1(c, 1-h) = \beta E\left[(r' + 1 - \delta) u_1(c', 1-h')\right]$ and $w u_1(c, 1-h) = u_2(c, 1-h)$.

Looking at the various FOCs, one can check that $[c + g]_A = c_B$, or $c_A = c_B - g$. If $c$ and $g$ are perfect substitutes, $g$ substitutes for some consumption. The agents would not mind the government expenditures.
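The role of $\lambda$ in the government spending process can be seen by iterating the law of motion directly. The sketch below takes the innovations $\mu_t$ as a given list rather than drawing them at random, so the two polar cases are easy to inspect; the names `g_bar` and `lam` and all numerical values are hypothetical.

```python
import math


def next_g(g_t, mu_t, g_bar, lam):
    """g_{t+1} from log g_{t+1} = (1 - lam) * log g_bar + lam * log g_t + mu_t."""
    return math.exp((1.0 - lam) * math.log(g_bar) + lam * math.log(g_t) + mu_t)


def simulate_g(shocks, g0, g_bar, lam):
    """Path of government spending for a given list of innovations mu_t."""
    path = [g0]
    for mu in shocks:
        path.append(next_g(path[-1], mu, g_bar, lam))
    return path


g_bar, g0 = 2.0, 4.0
# lam = 0: next period spending depends only on g_bar and the current innovation.
print(next_g(g0, 0.1, g_bar, lam=0.0), g_bar * math.exp(0.1))
# lam = 1: the current level carries over fully (random walk in logs).
print(next_g(g0, 0.1, g_bar, lam=1.0), g0 * math.exp(0.1))
# 0 < lam < 1 with no further innovations: spending returns gradually to g_bar.
print(simulate_g([0.0] * 50, g0, g_bar, lam=0.9)[-1])
```

The closer $\lambda$ is to one, the longer a given innovation affects spending, and therefore the larger the wealth effect on labor supply that households anticipate.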

[Figure 1: Effect of a positive government shock. Axes: real wage (vertical), labor (horizontal). Curves: labor supply, labor demand. Label: Corr(Prod,Hrs).]

[Figure 2: Effect of a positive technology shock. Axes: real wage (vertical), labor (horizontal). Curve: labor demand. Label: Corr(Prod,Hrs) > 0.]

1.6 Fourth adjustment: Home production

Home production is essentially defined as any use of discretionary time that is neither market production (i.e. production with a firm) nor leisure. This would include activities such as maintaining the household, taking care of children, home improvements, etc. Studies report that:

- a typical married couple devotes 25% of discretionary time to home production and 33% to paid activities (market production);
- in the postwar period, investment in household capital (consumer durables and residential structures) exceeds investment in market capital (producer durables and non-residential structures) by 15%;
- estimates of output from home production range from 20% to 50% of measured (market) GNP.

All these facts should convince you of the importance of including home production in our basic RBC model. In addition, it may help us address some of the issues we mentioned at the beginning. Since home production is likely to be affected by different shocks than market production, this allows us to have a second type of shock in the economy.

Agents have preferences over consumption and leisure defined by $u(c, l) = \log(c) + A \log(l)$. However, the consumption good is now a composite of the market produced and the home produced output:

$$c_t = \left[a c_{Mt}^{e} + (1-a) c_{Ht}^{e}\right]^{\frac{1}{e}}$$

where $c_{Mt}$ is consumption of the market produced good and $c_{Ht}$ is consumption of the home produced good. This functional form implies an elasticity of substitution of $1/(1-e)$ between the two goods. The unit time endowment has to be allocated between three activities: leisure, production in the market sector and production in the home sector [9]:

$$l_t + h_{Mt} + h_{Ht} = 1$$

The technology is characterized by a market good production function and a home good production function. Because the capital used in production has to be allocated between the two sectors, we have that:

$$k_t = k_{Mt} + k_{Ht}$$

The production functions are defined by:

$$y_{Mt} = f(z_{Mt}, k_{Mt}, h_{Mt}) = e^{z_{Mt}} k_{Mt}^{\theta} h_{Mt}^{1-\theta}$$

$$y_{Ht} = g(z_{Ht}, k_{Ht}, h_{Ht}) = e^{z_{Ht}} k_{Ht}^{\eta} h_{Ht}^{1-\eta}$$

[9] Since $l_t + h_{Mt} + h_{Ht} = 1$, the two types of work are perfect substitutes.

where:

$$z_{M,t+1} = \rho z_{Mt} + \varepsilon_{Mt}$$

$$z_{H,t+1} = \rho z_{Ht} + \varepsilon_{Ht}$$

$$\varepsilon_{Mt} \sim N\big(0, \sigma_{\varepsilon_M}^{2}\big), \qquad \varepsilon_{Ht} \sim N\big(0, \sigma_{\varepsilon_H}^{2}\big)$$

$$\gamma = corr(\varepsilon_{Mt}, \varepsilon_{Ht})$$

Output has to be divided between consumption and investment. The latter is defined as usual:

$$i_t = k_{t+1} - (1-\delta) k_t$$

Since the composite consumption good is a function of $c_{Mt}$ and $c_{Ht}$, it is necessary to define each term:

$$c_{Mt} + i_t = f(z_{Mt}, k_{Mt}, h_{Mt})$$

$$c_{Ht} = g(z_{Ht}, k_{Ht}, h_{Ht})$$

This implies that investment is only derived from market activities, while the home produced good is entirely for consumption.

The state variables are: $k_t$, $x_t = (z_{Mt}, z_{Ht})$.

The control variables are: $k_{Mt}$, $h_{Mt}$, $h_{Ht}$, $k_{t+1}$.

Once $k_{Mt}$ is chosen, $k_{Ht}$ is automatically determined. Similarly, once $h_{Ht}$ and $h_{Mt}$ are chosen, $l_t$ is the leftover discretionary time. The maximization problem can be written as:

$$V(k_t, x_t) = \max_{k_{Mt}, h_{Mt}, h_{Ht}, k_{t+1}} \left\{ u(c_t, 1 - h_{Mt} - h_{Ht}) + \beta E\left[V(k_{t+1}, x_{t+1})\right] \right\}$$

$$\text{s.t. } c_t = \left[a c_{Mt}^{e} + (1-a) c_{Ht}^{e}\right]^{\frac{1}{e}}$$

$$k_t = k_{Mt} + k_{Ht}$$

$$c_{Mt} + k_{t+1} - (1-\delta) k_t = e^{z_{Mt}} k_{Mt}^{\theta} h_{Mt}^{1-\theta}$$

$$c_{Ht} = e^{z_{Ht}} k_{Ht}^{\eta} h_{Ht}^{1-\eta}$$

$$z_{M,t+1} = \rho z_{Mt} + \varepsilon_{Mt}$$

$$z_{H,t+1} = \rho z_{Ht} + \varepsilon_{Ht}$$
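The composite consumption good and its elasticity of substitution can be checked numerically. The sketch below evaluates the CES aggregator, the marginal rate of substitution between the market and the home good, and recovers the elasticity $1/(1-e)$ from two consumption bundles. The values of `a` and `e` are hypothetical, and the code assumes $e < 1$, $e \neq 0$.

```python
import math


def composite_consumption(c_m, c_h, a, e):
    """c = [a * c_M^e + (1 - a) * c_H^e]^(1 / e)."""
    return (a * c_m**e + (1.0 - a) * c_h**e) ** (1.0 / e)


def marginal_rate_of_substitution(c_m, c_h, a, e):
    """Ratio of the marginal contributions of the market and home goods to c."""
    return (a / (1.0 - a)) * (c_m / c_h) ** (e - 1.0)


def elasticity_of_substitution(a, e, c_m=1.0, c_h_low=1.0, c_h_high=2.0):
    """Change in log(c_H / c_M) per unit change in log(MRS), between two bundles."""
    d_log_ratio = math.log(c_h_high / c_m) - math.log(c_h_low / c_m)
    d_log_mrs = (math.log(marginal_rate_of_substitution(c_m, c_h_high, a, e))
                 - math.log(marginal_rate_of_substitution(c_m, c_h_low, a, e)))
    return d_log_ratio / d_log_mrs


def leisure(h_m, h_h):
    """Time constraint: l = 1 - h_M - h_H."""
    return 1.0 - h_m - h_h


a, e = 0.4, 0.5  # hypothetical share and curvature parameters
print(composite_consumption(3.0, 3.0, a, e))
print(elasticity_of_substitution(a, e), 1.0 / (1.0 - e))
print(leisure(0.33, 0.25))
```

A higher $e$ makes the two goods closer substitutes, so households shift more readily between market and home activity when relative productivities change.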

Looking at table 2, adding home production to the model improves on the baseline model. Indeed, it makes hours more volatile and reduces the correlation between work hours and productivity (it is to be mentioned, though, that the simulation results depend, to some degree, on the values of $e$ and $\gamma$ [10]). The intuition for these results is that now, in addition to market production and leisure, agents can allocate their time to home production. Hence, they can adjust their time spent in the market, in response to a technology shock, without changing their leisure time too much. A low $\gamma$ implies more divergence between $z_H$ and $z_M$ and more opportunities to specialize. In fact, this model would improve the volatility of work hours even with deterministic home production, because there would still be productivity differentials between the two sectors [11]. The home production model lowers $corr(h, prod)$ because two types of shocks affect this economy, as in the previous adjustment. Hence, the ability of the model to replicate U.S. time series along these lines depends critically on the nature of the shocks to home production.

2 Matching models of the labor market

2.1 Baseline model

The treatment below comes from Pissarides (1990). It develops the basic framework for matching models of the labor market and is the building block for many applications. The central idea is that the labor market does not clear at every instant, as in the neoclassical framework, but that finding a partner (worker or firm) is a time consuming, uncoordinated and costly activity.

The number of matches between workers and firms looking for each other is determined by a matching function $M(U, V)$. The function $M(\cdot, \cdot)$ gives the number of matches formed per unit of time, as a function of its two arguments: $U$, the number of unemployed and searching workers, and $V$, the number of vacant and searching firms.
It is assumed to be increasing and concave in both arguments, and to exhibit constant returns to scale [12]. Consider a labor force with two states: a worker is either employed or looking for a job [13]. Assume that the labor force is comprised of $L$ workers. Firms have to post vacancies to find workers. Hence, jobs can also be in two states: vacant, or matched and producing. Denote by $u$ the unemployment rate, by $v$ the job vacancy rate, and by $m$ the matching rate. Thus:

$$U = uL \tag{1}$$

$$V = vL \tag{2}$$

$$M(U, V) = mL \tag{3}$$

The activity in the labor market can be represented as follows:

[10] This is because results depend on the willingness to substitute home consumption for market consumption ($e$) and on the incentive to switch production between the home sector and the market sector ($\gamma$).

[11] The paper mentions that the results are not "too" sensitive to $\gamma$. The average values for time spent at home and at work come from time studies.

[12] You probably noticed that the neoclassical production function and the matching function are assumed to have the same properties. The assumptions on the matching function are made to ensure that on a balanced growth path, the unemployment rate is constant.

[13] This implies that workers cannot be out of the labor force (OLF). In statistics, such as in the Current Population Survey, workers can actually be in three states: (i) employed, (ii) unemployed and looking for a job, or (iii) out of the labor force, i.e. unemployed and not looking for a job. For simplicity, we do not consider that option. In official statistics, the unemployment rate is in fact defined as the proportion of job seekers [(ii)] in the labor force [(i)+(ii)].

[Figure: flows in the labor market. The matched pool (production) and the searching pool (search) are linked by new matches (match forming) and by breakdowns; there is free entry of firms.]

Let $\theta = v/u$ be the market tightness. Denote the workers' and the firms' probability of matching (per unit of time) by $p^w$ and $p^f$, respectively. Because of random matching, they are the same for each worker and each firm, and:

$$p^w = \frac{M(U, V)}{U} \tag{4}$$
$$p^f = \frac{M(U, V)}{V} \tag{5}$$

Given the nature of the matching technology:

$$p^f = \frac{M(U, V)}{V} = M\left(\frac{1}{\theta}, 1\right) = q(\theta) \tag{6}$$
$$p^w = \frac{M(U, V)}{U} = M(1, \theta) = \theta q(\theta) \tag{7}$$

This implies (and this is important for the rest of the analysis) that the matching probabilities are only functions of the market tightness. Because of the assumptions on $M$, $p^f$ is decreasing in $\theta$ and $p^w$ is increasing in $\theta$. That is, firms have more difficulty matching when there are many vacancies per unemployed worker. Similarly, when this is the case, it is easier for workers to find a firm.

We just looked at transitions from unemployment ("search pool") into employment ("matched pool"). Transitions into unemployment are given by a job-specific separation rate $s$ [14]. The origin of the shocks is not made explicit, but they can be associated with structural shifts of demand for the firm's product.

Since transitions are stochastic and derived from Poisson processes, it is possible to compute the average time before the Poisson event, that is, the average employment/unemployment duration for a worker and the average employment/vacancy duration for a firm. As you know, for Poisson processes, these average times are given by the inverse of the Poisson rate:

$$\text{Average Unemployment Duration} = \frac{1}{\theta q(\theta)} \tag{8}$$
$$\text{Average Employment Duration} = \frac{1}{s} \tag{9}$$
$$\text{Average Vacancy Duration} = \frac{1}{q(\theta)} \tag{10}$$

The flows into and out of unemployment have to be equal in steady state, resulting in a constant unemployment rate. Hence:

$$s\,dt\,[L - U] = \theta q(\theta)\,dt\,U \quad \text{or} \quad s(1 - u) = \theta q(\theta)\,u \tag{11}$$

or

$$u = \frac{s}{s + \theta q(\theta)} \tag{12}$$

Remark that this allows us to decompose the unemployment rate into its two components: unemployment duration (how long, on average, one stays unemployed) and unemployment incidence (how often, on average, one becomes unemployed). Duration is given by $1/(\theta q(\theta))$ and incidence by the separation rate $s$ (so that an employment spell lasts $1/s$ on average). Decomposing unemployment this way carries more information than just knowing the unemployment rate. Indeed, a given unemployment rate can be the result of many different combinations of unemployment duration and unemployment incidence.

Firm's problem:

- $S^f$: value of a vacancy to a firm
- $M^f$: value of being matched to a firm
- $r$: discount rate
- $c$: cost of posting a vacancy (per unit of time)

[14] That is, shocks are idiosyncratic and not aggregate shocks.
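As a rough numerical illustration of equations (6) to (12), the sketch below computes the matching rates, the average durations and the steady-state unemployment rate for a given market tightness. The Cobb-Douglas form $M(U,V) = U^{\eta}V^{1-\eta}$, the function names and the parameter values are assumptions made for this example only; nothing here is calibrated to data.

```python
"""Matching rates, durations and steady-state unemployment: a sketch.

Assumption (for illustration only): Cobb-Douglas matching function
M(U, V) = U**eta * V**(1 - eta), so that q(theta) = theta**(-eta).
"""


def firm_matching_rate(theta, eta=0.5):
    """p^f = q(theta) = M(1/theta, 1), equation (6)."""
    return theta ** (-eta)


def worker_matching_rate(theta, eta=0.5):
    """p^w = theta * q(theta) = M(1, theta), equation (7)."""
    return theta * firm_matching_rate(theta, eta)


def average_durations(theta, s, eta=0.5):
    """Inverse Poisson rates, equations (8)-(10)."""
    return {
        "unemployment": 1.0 / worker_matching_rate(theta, eta),
        "employment": 1.0 / s,
        "vacancy": 1.0 / firm_matching_rate(theta, eta),
    }


def steady_state_unemployment(theta, s, eta=0.5):
    """u = s / (s + theta * q(theta)), equation (12)."""
    return s / (s + worker_matching_rate(theta, eta))


if __name__ == "__main__":
    s = 0.1  # hypothetical separation rate
    for theta in (0.5, 1.0, 2.0):
        u = steady_state_unemployment(theta, s)
        d = average_durations(theta, s)
        print(f"theta={theta:.1f}  u={u:.3f}  "
              f"unemployment duration={d['unemployment']:.2f}  "
              f"vacancy duration={d['vacancy']:.2f}")
```

Under these assumed values, a tighter market (higher $\theta$) shortens unemployment spells, lengthens vacancy spells and lowers $u$, in line with the monotonicity of $p^w$ and $p^f$ discussed above.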

Time is continuous. In that case, the value function equations are generally given in flow terms. In papers, the flow equations are rarely explicitly derived. That is because they tend to take the same general form and because, most times, they are actually written in steady state. The flow of value from being in a particular state is generally equal to the instantaneous utility received from being in that state plus the instantaneous probability of changing state times the resulting gain/loss in value from that change. Let us see why in this particular example:

$$S^f(t) = \frac{1}{1 + r\,dt}\Big\{-c\,dt + \big(q(\theta(t))\,dt\big)\,M^f(t + dt) + \big[1 - q(\theta(t))\,dt\big]\,S^f(t + dt)\Big\}$$

As $dt \to 0$, we have that:

$$(1 + r\,dt)\,S^f(t) = -c\,dt + \big(q(\theta(t))\,dt\big)\big[M^f(t) + dt\,\dot{M}^f(t)\big] + \big[1 - q(\theta(t))\,dt\big]\big[S^f(t) + dt\,\dot{S}^f(t)\big] \tag{13}$$

Taking $S^f(t)$ out on both sides, dividing through by $dt$ and letting $dt \to 0$, one gets:

$$rS^f(t) = -c + q(\theta(t))\big[M^f(t) - S^f(t)\big] + \dot{S}^f(t)$$

In steady state, this becomes:

$$rS^f = -c + q(\theta)\big[M^f - S^f\big] \tag{14}$$

This can be interpreted as follows. A vacancy is like an asset. The cost of holding that asset ($rS^f$) is equal to the return on that asset (the carrying cost $-c$ plus the probability $q(\theta)$ of changing state times the net return of doing so, $M^f - S^f$). Notice that we assumed in (13) that, given the choice, firms always prefer matching to remaining vacant. That can be checked in equilibrium. Since firms post vacancies until the value of doing so is driven down to zero, it is enough to verify that $M^f \geq 0$. Intuitively, it is impossible that firms would post costly vacancies and, at the same time, refuse to match. The free entry condition implies that:

$$M^f = \frac{c}{q(\theta)} \tag{15}$$

In equilibrium, the expected value of a filled vacancy is equal to its expected cost, i.e. the cost of posting it per period of time times the average duration of a vacancy.
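Given that match output is $y$ and the firm pays wage $w$, the value of a match to the firm is given by [15]:

$$rM^f = y - w + s\big[S^f - M^f\big] = y - w - sM^f \tag{16}$$

[15] The derivation is as follows:

$$M^f(t) = \frac{1}{1 + r\,dt}\Big\{(y - w(t))\,dt + (s\,dt)(0) + [1 - s\,dt]\,M^f(t + dt)\Big\}$$
$$\Rightarrow (1 + r\,dt)\,M^f(t) = (y - w(t))\,dt + [1 - s\,dt]\big[M^f(t) + dt\,\dot{M}^f(t)\big]$$
$$\Rightarrow rM^f(t) = y - w(t) - sM^f(t) + \dot{M}^f(t)$$
$$\Rightarrow rM^f = y - w - sM^f$$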

In flow terms, the value of a filled position is equal to instantaneous net output plus the instantaneous probability of a separation times the resulting loss. Combining (15) and (16), this can be rewritten as:

$$\frac{y - w}{r + s} = \frac{c}{q(\theta)} \tag{17}$$

Worker's problem:

- $S^w$: value of search to worker (value of being unemployed)
- $M^w$: value of employment to worker
- $b$: worker's income during search (home production, unemployment benefits, value of leisure)

In flow terms [16]:

$$rS^w = b + \theta q(\theta)\big[M^w - S^w\big] \tag{18}$$
$$rM^w = w + s\big[S^w - M^w\big] \tag{19}$$

In flow terms, the value of search is equal to income during search plus the probability of changing states times the net gain (again, it is rightly assumed that workers prefer the state of employment to the state of unemployment). Likewise, the value of employment is the wage received plus the probability of separation times the associated net loss. (18) and (19) can be solved to get:

$$rS^w = \frac{(r + s)\,b + \theta q(\theta)\,w}{r + s + \theta q(\theta)}$$
$$rM^w = \frac{s\,b + (r + \theta q(\theta))\,w}{r + s + \theta q(\theta)}$$

[16] Value of search:

$$S^w(t) = \frac{1}{1 + r\,dt}\Big\{b\,dt + (\theta q(\theta)\,dt)\,M^w(t + dt) + [1 - \theta q(\theta)\,dt]\,S^w(t + dt)\Big\}$$
$$\Rightarrow (1 + r\,dt)\,S^w(t) = b\,dt + (\theta q(\theta)\,dt)\big[M^w(t) + dt\,\dot{M}^w(t)\big] + [1 - \theta q(\theta)\,dt]\big[S^w(t) + dt\,\dot{S}^w(t)\big]$$
$$\Rightarrow rS^w(t) = b + \theta q(\theta)\big[M^w(t) - S^w(t)\big] + \dot{S}^w(t)$$
$$\Rightarrow rS^w = b + \theta q(\theta)\big[M^w - S^w\big]$$

Value of employment:

$$M^w(t) = \frac{1}{1 + r\,dt}\Big\{w(t)\,dt + (s\,dt)\,S^w(t + dt) + [1 - s\,dt]\,M^w(t + dt)\Big\}$$
$$\Rightarrow (1 + r\,dt)\,M^w(t) = w(t)\,dt + (s\,dt)\big[S^w(t) + dt\,\dot{S}^w(t)\big] + [1 - s\,dt]\big[M^w(t) + dt\,\dot{M}^w(t)\big]$$
$$\Rightarrow rM^w(t) = w(t) + s\big[S^w(t) - M^w(t)\big] + \dot{M}^w(t)$$
$$\Rightarrow rM^w = w + s\big[S^w - M^w\big]$$
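Equations (18) and (19) form a linear system in the two unknowns $S^w$ and $M^w$. The sketch below solves that system and checks the result against the two flow equations. The function name and the parameter values are hypothetical and chosen only for illustration.

```python
"""Worker's steady-state values of search and employment: a sketch."""


def worker_values(b, w, r, s, job_finding_rate):
    """Solve (18)-(19) for (S_w, M_w).

    With f = theta * q(theta), the two flow equations are linear:
        (r + f) * S_w - f * M_w       = b
        -s * S_w      + (r + s) * M_w = w
    Cramer's rule gives the closed forms used below.
    """
    f = job_finding_rate
    det = (r + f) * (r + s) - f * s  # simplifies to r * (r + s + f)
    S_w = ((r + s) * b + f * w) / det
    M_w = (s * b + (r + f) * w) / det
    return S_w, M_w


if __name__ == "__main__":
    # Hypothetical parameter values.
    b, w, r, s, f = 0.4, 1.0, 0.05, 0.1, 1.0
    S_w, M_w = worker_values(b, w, r, s, f)
    print(f"r*S_w = {r * S_w:.4f}, r*M_w = {r * M_w:.4f}")
    # Residuals of the flow equations (18) and (19); both should be ~0.
    print(r * S_w - (b + f * (M_w - S_w)))
    print(r * M_w - (w + s * (S_w - M_w)))
```

Whenever $w > b$ the computed $M^w$ exceeds $S^w$, which is the condition under which the worker prefers employment to search.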

To guarantee that a worker indeed prefers working to being unemployed, we need that $w \geq b$.

Wage determination:

Since the total value of matching (to a worker/firm pair) is greater than the total value of search (to that same pair), matching creates a surplus, whose split is determined by bargaining between the worker and the firm. Again, we use the Nash bargaining solution to obtain the negotiated wage. In the negotiations, the market wage is taken as given by the two parties (since all matches are similar, in equilibrium, all must be characterized by the same wage). In equilibrium, the negotiated wage must be equal to the market wage. Conditional on the negotiated wage $w_n$, the value to the firm of being matched and paying the worker $w_n$ is given by $rM^f_n = y - w_n - sM^f_n$, while the value of the same match to the worker is $rM^w_n = w_n + s\,[S^w - M^w_n]$. Notice that $S^w$ depends on the market wage, since it is the expected return to the worker from search. Hence, $w_n$ must maximize the Nash product, where $\beta$ is the worker's bargaining power:

$$\max_{w_n}\ \big(M^w_n - S^w\big)^{\beta}\,\big(M^f_n - S^f\big)^{1 - \beta}$$

The maximum is achieved when:

$$\beta\,\big[M^f_n - S^f\big] = (1 - \beta)\,\big[M^w_n - S^w\big] \tag{20}$$

Hence [17]:

$$\beta\,(y - w_n) = (1 - \beta)\,(w_n - rS^w)$$

or:

$$w_n = \beta y + (1 - \beta)\,rS^w \tag{21}$$

Remark that the wage is a convex combination of match output and the worker's value of search, or his threat point (it should be clear that, in this type of model, the wage is not equal to the worker's marginal product). It is easy to see from there that any parameter increasing the value of search also increases the wage of employed workers. One such example would be unemployment benefits. The wage can be rewritten as:

$$w_n = rS^w + \beta\,(y - rS^w) \tag{22}$$

This can be interpreted as follows. The worker receives his reservation wage ($rS^w$), i.e. the minimum wage he must get for him to accept the match, plus a proportion $\beta$ of the net surplus created by accepting the match ($y - rS^w$). (15) and (20) imply that $M^w - S^w = \frac{\beta}{1 - \beta}\,M^f = \frac{\beta}{1 - \beta}\,\frac{c}{q(\theta)}$.
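From (18), it results that

$$rS^w = b + \theta q(\theta)\,[M^w - S^w] = b + \theta q(\theta)\,\frac{\beta}{1 - \beta}\,\frac{c}{q(\theta)} = b + \frac{\beta}{1 - \beta}\,c\theta.$$

Using (21), $w = \beta y + (1 - \beta)\left(b + \frac{\beta}{1 - \beta}\,c\theta\right)$ and:

$$w = (1 - \beta)\,b + \beta\,[y + c\theta] \tag{23}$$

[17] Comes from (16) and (19), and the above equation.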
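The steady-state condition (12), the free entry condition (17) and the wage equation (23) are enough to pin down $(\theta, u, v, w)$ numerically. The sketch below does so by bisection on $\theta$. It assumes a Cobb-Douglas matching function, so that $q(\theta) = \theta^{-\eta}$; this functional form, the function names and all parameter values are assumptions for illustration, not part of the model as stated so far.

```python
"""Baseline search equilibrium from (12), (17) and (23): a sketch.

Assumption (illustration only): q(theta) = theta**(-eta).
"""


def vacancy_filling_rate(theta, eta):
    return theta ** (-eta)


def wage(theta, y, b, beta, c):
    """Negotiated wage, equation (23)."""
    return (1.0 - beta) * b + beta * (y + c * theta)


def free_entry_gap(theta, y, b, beta, c, r, s, eta):
    """Left-hand side minus right-hand side of (17); decreasing in theta."""
    w = wage(theta, y, b, beta, c)
    return (y - w) / (r + s) - c / vacancy_filling_rate(theta, eta)


def solve_equilibrium(y=1.0, b=0.4, beta=0.5, c=0.3, r=0.05, s=0.1, eta=0.5):
    if y <= b:
        raise ValueError("need y > b for a match to create a surplus")
    args = (y, b, beta, c, r, s, eta)
    lo, hi = 1e-12, 1.0
    while free_entry_gap(hi, *args) > 0.0:  # bracket the root
        hi *= 2.0
    for _ in range(200):  # bisection
        mid = 0.5 * (lo + hi)
        if free_entry_gap(mid, *args) > 0.0:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    u = s / (s + theta * vacancy_filling_rate(theta, eta))  # equation (12)
    return {"theta": theta, "u": u, "v": theta * u,
            "w": wage(theta, y, b, beta, c)}


if __name__ == "__main__":
    for label, kwargs in [("baseline", {}), ("higher b", {"b": 0.5}),
                          ("higher y", {"y": 1.1})]:
        eq = solve_equilibrium(**kwargs)
        print(label, {k: round(val, 4) for k, val in eq.items()})
```

Rerunning the solver with a different `b` or `y` gives a quick way to inspect how the equilibrium values respond to search income and productivity under these assumed parameters.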

Equilibrium:

The equilibrium is given by the triplet $(u, v, w)$ satisfying (12) [steady state], (17) [free entry] and (23) [wage determination]. We can perform a little comparative statics exercise. In particular, it is interesting to look at the effect of changes in productivity ($y$) and worker's search income ($b$) on the equilibrium values of unemployment and wages. Combining (17) and (23), one gets:

$$(1 - \beta)\,(y - b) = \beta c\theta + (r + s)\,\frac{c}{q(\theta)} \tag{24}$$

As the right-hand side of (24) is increasing in $\theta$, an increase in search income $b$ results in a decrease in the equilibrium value $\theta^{eq}$. Looking at (12), it is clear that $u^{eq}$ increases. By totally differentiating (11), we see that $v^{eq}$ decreases. Finally, by differentiating (23) [and using the expression for $d\theta/db$], one gets that $w^{eq}$ increases with $b$. From (24), an increase in productivity $y$ increases $\theta^{eq}$ and $v^{eq}$, while it decreases $u^{eq}$. From (23), we see that $w^{eq}$ also increases.

The intuition is the following: by increasing the value of search to the worker, hence his threat point, an increase in $b$ results in increased market wages ($w \uparrow$). That reduces match profitability to firms, which post fewer vacancies ($v \downarrow$). In steady state, there is an inverse relationship between $u$ and $v$. Hence, $u \uparrow$. The case of an increase in $y$ is different. In that case, both the value of search to the worker and the value of the match increase (matches are more productive, and hence search also has a higher value). So, the increase in $y$ accrues to both the worker and the firm. Thus, both wages and vacancies increase ($w \uparrow$, $v \uparrow$).

2.2 Application to the study of job creation and job destruction

Up to now, the rate of job destruction $s$ was exogenously given. The purpose of this application is to endogenize it. The treatment is based on Mortensen & Pissarides (Restud 1994). But first, let us look at some empirical facts. These facts come from Davis & Haltiwanger (QJE 1992).
The paper looks at employment changes at the plant level, rather than at the aggregate level. The authors look at U.S. manufacturing sector data between 1972 and 1986. The establishments considered have 5 employees or more (99% of the population). An establishment is defined as a plant (physical location rather than firm). In particular, the authors have data on the number of jobs at a given plant over a long period. They are thus able to look at the patterns of job creation and destruction within an establishment. The frequency of observation is one year.

Here is some notation:

- Employment at date $t$, $E_t$: number of workers on the payroll at date $t$,
- Job Creation, $JC_t$: (net) employment gains at plants that expanded between $t-1$ and $t$ within a sector,
- Job Destruction, $JD_t$: (net) employment losses at establishments that shrank between $t-1$ and $t$ within a sector.

Hence:

$$E_t = E_{t-1} + JC_t - JD_t$$

To get the measures in terms of rates ($JCR_t$, $JDR_t$), divide $JC_t$ and $JD_t$ by the average sector size $\frac{1}{2}[E_{t-1} + E_t]$.

- Job Reallocation Rate: $SUM_t = JCR_t + JDR_t$
- Net Employment Growth Rate: $NET_t = JCR_t - JDR_t$

The main results are reported below along two dimensions:

Magnitudes: There is a relatively high amount of simultaneous job creation and job destruction at every phase of the business cycle: between 1972 and 1986, $\text{Mean}(JCR_t) = .092$, $\text{Min}(JCR_t) = .064$, and $\text{Mean}(JDR_t) = .113$, $\text{Min}(JDR_t) = .061$.

Cyclicality: The job creation and job destruction rates are negatively correlated, $\text{Corr}(JCR_t, JDR_t) = -.864$.

The authors also make the point that the patterns of job destruction and job creation cannot be explained by sectoral shifts, that is, shocks to particular industries, with labor being reallocated across industries following these shocks. If this were the case, one would observe some industries with high levels of job destruction, while others would experience high levels of job creation. After narrowing their observations by focusing on specific industries, the authors report that this pattern is not observed. This seems to indicate that the high levels of both job destruction and job creation may be better explained by idiosyncratic shocks to active jobs.

We can now extend the basic Pissarides model and add a job destruction decision (that is, endogenize $s$). The productivity of job/worker pairs is now idiosyncratic. Denote the instantaneous match output by:

$$y = p + \varepsilon$$

where $p$ is the deterministic component of the value of the product and $\varepsilon$ its idiosyncratic random component ($\varepsilon$ can be considered as a productivity shock or a preference shock affecting the product's relative price).

Nature of the random shocks:

The arrival of shocks follows a Poisson process with arrival rate $\lambda$. Conditional on a new shock, $\varepsilon$ is drawn from a distribution $F(z)$ with finite upper support $\varepsilon_u$. This specification is chosen for two reasons. First, assuming a Poisson process maintains tractability. Second, it brings positive persistence to firm size (the lower $\lambda$, the more persistent firm size). Assume that newly created jobs start at the highest productivity $p + \varepsilon_u$. This can be supported by the observation that most job creation comes from existing firms, which have better information about the profitability of new products within their sectors. Of course, this is an extreme assumption. For our purpose, it is not going to change the results of the model. It may matter, however, if we were interested in looking at wages. For example, it would mean that wages in a given occupation always decrease before increasing, which would be inconsistent with the wage-tenure profiles typically observed.

Because of the shocks, there is now a new decision variable for firms and workers. Since matches start at the highest productivity level $\varepsilon_u$, neither firm nor worker will want to separate. However, upon arrival of a new shock $\varepsilon$, the two parties have to decide whether to break the match down or to continue production. This is how the job destruction decision is endogenized. Whether production stops now depends on the current value of $\varepsilon$, instead of being exogenously given.

Decision variables:

Firms: (i) how many vacancies to post, (ii) when to break the match down.

Workers: (i) when to break a match down (as usual, the decision to accept a match is trivial, hence is not included).
Value functions:

Since matches are characterized by their idiosyncratic productivity, the value functions when matched must have $\varepsilon$ as an argument. However, ex ante, all vacancies and unemployed workers are identical. Therefore, the values of search do not depend on $\varepsilon$. We have that:

$$rS^f = -c + q(\theta)\big[M^f(\varepsilon_u) - S^f\big] \tag{25}$$
$$rS^w = b + \theta q(\theta)\big[M^w(\varepsilon_u) - S^w\big] \tag{26}$$
$$rM^w(\varepsilon) = w(\varepsilon) + \lambda \int \big[\max\{M^w(z), S^w\} - M^w(\varepsilon)\big]\,dF(z) \tag{27}$$
$$rM^f(\varepsilon) = p + \varepsilon - w(\varepsilon) + \lambda \int \big[\max\{M^f(z), S^f\} - M^f(\varepsilon)\big]\,dF(z) \tag{28}$$

(27) and (28) can be intuitively explained as follows. In flow terms, the value to the worker of being in a match of productivity $\varepsilon$ is equal to the instantaneous wage received $w(\varepsilon)$ plus the option value of being hit by a new idiosyncratic shock $z$ and taking the optimal decision of staying in the match at the new value of productivity ($M^w(z)$) or breaking the match down and returning to the state of search ($S^w$). A similar reasoning holds for (28) [18].

Wage determination:

As before, wages are negotiated. Given productivity $\varepsilon$, the worker's surplus from the match is given by $M^w(\varepsilon) - S^w$ and the firm's surplus is given by $M^f(\varepsilon) - S^f$. Hence, total match surplus $T(\varepsilon)$ is $M^w(\varepsilon) + M^f(\varepsilon) - S^w - S^f$. Because of the free entry condition for firms, in equilibrium,

$$S^f = 0 \tag{29}$$

As usual, the worker retains a proportion $\beta$ (equal to his bargaining power) of the total surplus [19]. Hence:

$$M^w(\varepsilon) - S^w = \beta\,\big[M^w(\varepsilon) + M^f(\varepsilon) - S^w\big] = \beta\,T(\varepsilon) \tag{30}$$
$$M^f(\varepsilon) = (1 - \beta)\,\big[M^w(\varepsilon) + M^f(\varepsilon) - S^w\big] = (1 - \beta)\,T(\varepsilon) \tag{31}$$

Thus, we can rewrite (27) and (28) as:

$$rM^w(\varepsilon) = w(\varepsilon) + \lambda\beta \int \big[\max\{T(z), 0\} - T(\varepsilon)\big]\,dF(z) \tag{32}$$
$$rM^f(\varepsilon) = p + \varepsilon - w(\varepsilon) + \lambda(1 - \beta) \int \big[\max\{T(z), 0\} - T(\varepsilon)\big]\,dF(z) \tag{33}$$

By combining (26), (32) and (33), one gets [adding (32) and (33) and using the equation for $rS^w$]:

$$rT(\varepsilon) = p + \varepsilon + \lambda \int \big[\max\{T(z), 0\} - T(\varepsilon)\big]\,dF(z) - b - \theta q(\theta)\,\beta\,T(\varepsilon_u)$$

[18] The derivation is as follows:

$$M^w(\varepsilon; t) = \frac{1}{1 + r\,dt}\Big\{w(\varepsilon; t)\,dt + \lambda\,dt \int \max\{M^w(z; t + dt), S^w(t + dt)\}\,dF(z) + (1 - \lambda\,dt)\,M^w(\varepsilon; t + dt)\Big\}$$
$$\Rightarrow (1 + r\,dt)\,M^w(\varepsilon; t) = w(\varepsilon; t)\,dt + \lambda\,dt \int \big[\max\{M^w(z; t + dt), S^w(t + dt)\} - M^w(\varepsilon; t + dt)\big]\,dF(z) + M^w(\varepsilon; t) + dt\,\dot{M}^w(\varepsilon; t)$$
$$\Rightarrow rM^w(\varepsilon; t) = w(\varepsilon; t) + \lambda \int \big[\max\{M^w(z; t + dt), S^w(t + dt)\} - M^w(\varepsilon; t + dt)\big]\,dF(z) + \dot{M}^w(\varepsilon; t)$$

The terms in $dt$ in the integral are negligible as $dt \to 0$, thus:

$$rM^w(\varepsilon; t) = w(\varepsilon; t) + \lambda \int \big[\max\{M^w(z; t), S^w(t)\} - M^w(\varepsilon; t)\big]\,dF(z) + \dot{M}^w(\varepsilon; t)$$

In steady state, $rM^w(\varepsilon) = w(\varepsilon) + \lambda \int \big[\max\{M^w(z), S^w\} - M^w(\varepsilon)\big]\,dF(z)$.

[19] The Nash bargaining solution is given by: $\max_{w_n}\ (M^w - S^w)^{\beta}\,(M^f - S^f)^{1 - \beta}$.

or

$$(r + \lambda)\,T(\varepsilon) = p + \varepsilon - b + \lambda \int \max\{T(z), 0\}\,dF(z) - \theta q(\theta)\,\beta\,T(\varepsilon_u) \tag{34}$$

We can see from (34) that $T(\varepsilon)$ is increasing in $\varepsilon$. Notice that due to the nature of the bargaining procedure retained, breakdowns are privately efficient. Workers decide to break a match down when $M^w(\varepsilon) < S^w$ and firms decide to do so when $M^f(\varepsilon) < 0$. But these situations arise at the same time, since $M^w(\varepsilon) - S^w$ and $M^f(\varepsilon)$ always have the same sign (that of $T(\varepsilon)$). Hence, workers and firms always agree whether to stay in the match or to separate. This implies that matches are (bilaterally) broken down when the total surplus, given the match's current productivity, is negative. Since $T(\varepsilon)$ is increasing in $\varepsilon$, we have a reservation property. Matches break down when the idiosyncratic shock falls below a certain value $\varepsilon_r$, such that $T(\varepsilon_r) = 0$. The reservation shock is defined as:

$$T(\varepsilon_r) = 0 \tag{35}$$

(34) can be rewritten as:

$$(r + \lambda)\,T(\varepsilon) = p + \varepsilon - b + \lambda \int_{\varepsilon_r}^{\varepsilon_u} T(z)\,dF(z) - \theta q(\theta)\,\beta\,T(\varepsilon_u)$$

Integrating by parts, the integral can be rewritten as follows [20]:

$$\int_{\varepsilon_r}^{\varepsilon_u} T(z)\,dF(z) = \big[T(z)\,F(z)\big]_{\varepsilon_r}^{\varepsilon_u} - \int_{\varepsilon_r}^{\varepsilon_u} F(z)\,T'(z)\,dz = T(\varepsilon_u) - \int_{\varepsilon_r}^{\varepsilon_u} F(z)\,T'(z)\,dz$$
$$= \int_{\varepsilon_r}^{\varepsilon_u} T'(z)\,[1 - F(z)]\,dz = \frac{1}{r + \lambda} \int_{\varepsilon_r}^{\varepsilon_u} (1 - F(z))\,dz$$

Hence:

$$(r + \lambda)\,T(\varepsilon) = p + \varepsilon - b + \frac{\lambda}{r + \lambda} \int_{\varepsilon_r}^{\varepsilon_u} (1 - F(z))\,dz - \theta q(\theta)\,\beta\,T(\varepsilon_u)$$

[20] Assuming the cdf is well behaved.
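The integration by parts step can be checked numerically. Since (34) implies $T'(\varepsilon) = 1/(r + \lambda)$ and (35) gives $T(\varepsilon_r) = 0$, the surplus is linear above the reservation shock. The sketch below compares both sides of the identity, assuming (purely for illustration) a uniform distribution $F$ on a hypothetical support and hypothetical values for $r$, $\lambda$ and $\varepsilon_r$.

```python
"""Numerical check of the integration by parts step: a sketch.

Assumptions (illustration only): F is uniform on [eps_lo, eps_hi], and
T(z) = (z - eps_r) / (r + lam), which follows from T' = 1/(r + lam)
and T(eps_r) = 0.
"""


def surplus(z, eps_r, r, lam):
    return (z - eps_r) / (r + lam)


def uniform_cdf(z, lo, hi):
    return (z - lo) / (hi - lo)


def midpoint_integral(f, a, b, n=10_000):
    """Midpoint rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def both_sides(r=0.05, lam=0.2, eps_r=0.3, eps_lo=0.0, eps_hi=1.0):
    density = 1.0 / (eps_hi - eps_lo)  # dF(z) = density * dz
    lhs = midpoint_integral(
        lambda z: surplus(z, eps_r, r, lam) * density, eps_r, eps_hi)
    rhs = midpoint_integral(
        lambda z: 1.0 - uniform_cdf(z, eps_lo, eps_hi), eps_r, eps_hi)
    return lhs, rhs / (r + lam)


if __name__ == "__main__":
    lhs, rhs = both_sides()
    print(f"integral of T dF        = {lhs:.6f}")
    print(f"integral of (1-F)/(r+l) = {rhs:.6f}")
```

The two printed numbers should agree up to floating-point error; any other well-behaved cdf could be substituted for `uniform_cdf`, provided the density used on the left-hand side is changed accordingly.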

From (25), (31) and (35), we get that [21]:

$$p + \varepsilon_r = b + \frac{\beta}{1 - \beta}\,c\theta - \frac{\lambda}{r + \lambda} \int_{\varepsilon_r}^{\varepsilon_u} (1 - F(z))\,dz \tag{36}$$

Using (25), (26), (30) and (31), notice that (36) can be rewritten as:

$$rS^w = p + \varepsilon_r + \frac{\lambda}{r + \lambda} \int_{\varepsilon_r}^{\varepsilon_u} (1 - F(z))\,dz \tag{37}$$

Remark that this last expression is not necessarily very "intuitive". Instead, one can use the integration by parts above to rewrite it as:

$$rS^w = p + \varepsilon_r + \lambda \int_{\varepsilon_r}^{\varepsilon_u} T(z)\,dF(z) \tag{38}$$

This states that, at the reservation shock, the opportunity cost from matching ($rS^w$) is equal to the opportunity cost of search (instantaneous output $p + \varepsilon_r$ plus the option value of being hit by a higher shock, appropriately discounted to account for the match's impermanence). At $\varepsilon_r$, agents are indifferent between search and production.

Condition (36) [or (37), or (38)] is often referred to as the Job Destruction Condition, because it defines $\varepsilon_r$, the reservation shock, for a given $\theta$. It can be interpreted as follows. The left-hand side of (36) is the lowest acceptable price to firms to stay in the job. The first term on the right-hand side is the opportunity cost from matching. The second term is the extent to which a firm is willing to incur a current loss in expectation of future increases in the productivity of the match. Hence, firms hoard labor, that is, they retain some workers even though current output is below the opportunity cost of employment, in expectation of future improvement in match productivity. Firms do that because finding a worker is a costly process.

Now that we have defined the threshold level at which jobs are destroyed (notice that this is equivalent to endogenizing the job destruction rate), we want to look at the job creation condition. As already mentioned, this is given by the free entry condition.
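Rewriting (34) for $\varepsilon$ and $\varepsilon_r$, we get that:

$$T(\varepsilon) = \frac{\varepsilon - \varepsilon_r}{r + \lambda} \tag{39}$$

Using (25) and (31), we get that:

$$\frac{c}{q(\theta)} = (1 - \beta)\,\frac{\varepsilon_u - \varepsilon_r}{r + \lambda} \tag{40}$$

Condition (40) states that the expected vacancy costs ($c/q(\theta)$) are equal to the return from posting a vacancy (firm's share of the surplus times the surplus at job creation).

[21] $rS^w = b + \theta q(\theta)\,[M^w(\varepsilon_u) - S^w] = b + \theta q(\theta)\,\beta\,T(\varepsilon_u) = b + \theta q(\theta)\,\beta\,\frac{1}{1 - \beta}\,M^f(\varepsilon_u) = b + \theta q(\theta)\,\frac{\beta}{1 - \beta}\,\frac{c}{q(\theta)} = b + \frac{\beta}{1 - \beta}\,c\theta$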
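The job destruction condition (36) and the job creation condition (40) are two equations in the two unknowns $(\theta, \varepsilon_r)$. The sketch below solves them numerically under two assumptions that are made here for illustration only: a Cobb-Douglas matching function, so that $q(\theta) = \theta^{-\eta}$, and a uniform distribution $F$ on a hypothetical support. All parameter values are hypothetical as well.

```python
"""Equilibrium (theta, eps_r) from conditions (36) and (40): a sketch.

Assumptions (illustration only): q(theta) = theta**(-eta) and F uniform
on [eps_lo, eps_hi].
"""


def solve_jc_jd(p=0.2, b=0.5, beta=0.5, c=0.3, r=0.05, lam=0.2,
                eta=0.5, eps_lo=0.0, eps_hi=1.0):
    def tightness(eps_r):
        # Job creation (40): c * theta**eta = (1-beta)(eps_hi-eps_r)/(r+lam)
        return ((1.0 - beta) * (eps_hi - eps_r) / (c * (r + lam))) ** (1.0 / eta)

    def hoarding_integral(eps_r):
        # Integral of (1 - F(z)) from eps_r to eps_hi for a uniform F.
        return (eps_hi - eps_r) ** 2 / (2.0 * (eps_hi - eps_lo))

    def jd_gap(eps_r):
        # Left-hand side minus right-hand side of (36), with theta taken
        # from (40). This gap is increasing in eps_r.
        return (p + eps_r - b
                - beta / (1.0 - beta) * c * tightness(eps_r)
                + lam / (r + lam) * hoarding_integral(eps_r))

    if jd_gap(eps_lo) >= 0.0 or jd_gap(eps_hi) <= 0.0:
        raise ValueError("no interior reservation shock for these parameters")
    lo, hi = eps_lo, eps_hi
    for _ in range(200):  # bisection on the reservation shock
        mid = 0.5 * (lo + hi)
        if jd_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    eps_r = 0.5 * (lo + hi)
    return {"eps_r": eps_r, "theta": tightness(eps_r),
            "jd_residual": jd_gap(eps_r)}


if __name__ == "__main__":
    for p in (0.2, 0.3):
        eq = solve_jc_jd(p=p)
        print(f"p={p:.1f}  eps_r={eq['eps_r']:.4f}  theta={eq['theta']:.4f}")
```

With the assumed values, the solver returns an interior reservation shock, and raising `p` lowers `eps_r` while raising `theta`.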

[Figure 3: Job creation and destruction curves]

The equilibrium is characterized graphically by a job creation curve (JC) and a job destruction curve (JD), as follows: (JC) is a downward sloping curve reflecting the fact that a low incidence of match breakdown (low $\varepsilon_r$) induces firms to post more vacancies (high $\theta$). (JD) is an upward sloping curve reflecting the fact that the joint value of the match increases with match idiosyncratic productivity, while the joint value of breakdown (or value of search) increases with market tightness. Hence, the reservation shock, where the joint value of continuation equals the joint value of breakdown, increases with market tightness.

Remark that once the equilibrium values of $\theta$ and $\varepsilon_r$ are known, employment duration and unemployment duration are also known (and, hence, the unemployment rate):

$$\text{Employment duration}: \quad \frac{1}{\lambda F(\varepsilon_r^{eq})}$$
$$\text{Unemployment duration}: \quad \frac{1}{\theta^{eq}\,q(\theta^{eq})}$$
$$\text{Unemployment rate}: \quad \frac{\lambda F(\varepsilon_r^{eq})}{\lambda F(\varepsilon_r^{eq}) + \theta^{eq}\,q(\theta^{eq})}$$

Let us assume for simplicity that the matching function is Cobb-Douglas in $U$ and $V$. This functional form satisfies the assumptions made at the beginning of the paper. That is:

$$M(U, V) = U^{\eta}\,V^{1 - \eta}$$

Hence:

$$q(\theta) = \theta^{-\eta} \tag{41}$$

Using (41) and (40), one can rewrite (36) with only $\varepsilon_r^{eq}$, and by totally differentiating, obtain that:

$$\frac{d\varepsilon_r^{eq}}{dp} < 0 \tag{42}$$
$$\frac{d\theta^{eq}}{dp} > 0 \tag{43}$$

In conclusion, we can use the model to explain the two main facts laid out at the beginning of the chapter. These facts were the following: (i) there is a relatively high amount of simultaneous job creation and job destruction at every phase of the business cycle, (ii) the job creation and job destruction rates are negatively correlated.

First, because of the idiosyncratic nature of the shocks, there is always a high level of job creation and job destruction. Hence, we defined a steady state equilibrium where some jobs are created and others destroyed, these two phenomena occurring at all times.

Second, we can look at the economy at different phases of the business cycle. For that, we can consider that $p$ can take two values $p_2 > p_1$. Changes in $p$ can be considered as aggregate shocks to the economy. We want to use the above model to look at the effect of an aggregate shock on job destruction and job creation. The above model characterized the steady state equilibrium values of $\theta$ and $\varepsilon_r$, taking $p$ as a model parameter. Carrying out a comparative statics exercise, we were able to look at the effect of a change in the parameter $p$ on $\theta^{eq}$ and $\varepsilon_r^{eq}$. These are the results we are going to use to answer the question of how job creation and job destruction vary over the business cycle (that is, when $p$ changes). One should realize, however, that this is somewhat of a simplification. By allowing for changes in $p$, we are not only changing one of the model's exogenous parameters, but also the entire structure of the economy. Comparative statics exercises usually involve looking at how a change in an exogenous parameter affects equilibrium values. Here, by allowing for aggregate shocks, we actually change the problem faced by the agents.
Such shocks are now part of the problem to solve, and should be anticipated by the agents. By looking at an economy where no aggregate shock has been built into the model, and letting $p$ vary, we are looking at "what happens to the steady state of an economy with no anticipated aggregate shock, when there is a new value of $p$?". It is different from looking at "what happens to job destruction and job creation when an aggregate shock hits an economy where agents anticipate the possibility of aggregate shocks taking place?". Mortensen & Pissarides (1994) actually do look at the two cases: (i) they carry out the comparative statics exercise based on a model with no anticipated aggregate shock, and (ii) they alter their model to account for the possibility of aggregate shocks hitting the economy at a Poisson rate $\mu$. It turns out that, even though one method is superior, the qualitative results do not change [22]. Using (42) and (43) and assuming that $p$ decreases from $p_2$ to $p_1$, we see that $\varepsilon_r^{eq}$ increases, hence job destruction increases (immediately, i.e. all the current matches with $\varepsilon$ between $\varepsilon_r(p_2)$ and $\varepsilon_r(p_1)$ are destroyed), and that $\theta^{eq}$ decreases, hence job creation decreases (and $u$ increases). Looking at the case where $p$ increases from $p_1$ to $p_2$ yields the opposite results (except that the impact on job destruction is not immediate anymore). Hence, job destruction and job creation move in opposite directions over the business cycle.

[22] One may argue that the methodological bias from using the first method is not as much of a problem when one considers one-time shocks, or infrequent shocks.

2.3 Application to the study of labor market policies

Some facts

Labor market policies

Labor market policies in place in European countries are very different from the ones in the U.S. There are policies designed for income security, such as unemployment insurance ("UI") and the minimum wage. Unemployment insurance varies between the two economies by its replacement rate (percentage of previous wage received as benefits), the length of the benefits (the replacement rate may also depend on the length of the unemployment spell), benefit ceilings, and eligibility criteria. In addition, the U.S. unemployment insurance system is experience rated, which means that the firm's contribution to the unemployment insurance system is proportional to the number of layoffs the firm initiated (this is implicitly a form of firing costs). In some European countries, the minimum wage is increased annually, while in the U.S. it is only increased by Congress, at much longer intervals.

There also are employment protection policies ("EP"), which are designed to reduce unemployment by imposing firing costs on firms for terminating an employee. There are four basic types of protection measures: severance pay, advance notice, administrative and procedural costs, and legal costs. The severance pay regulation dictates how much compensation the fired worker is to receive. It is a fraction of the worker's current wage which generally increases with the length of tenure. The advance notice regulation stipulates a period before which a firing is to take place and during which the worker is still employed. This also varies with tenure, but generally not to the same extent as severance pay does. Administrative and legal costs are difficult to quantify. These two sets of costs are due to record keeping requirements, and the obligation to inform and consult with worker representatives and/or a third party. These regulations primarily take place in Europe, while in the U.S. these types of costs are essentially nil [23].

Also bear in mind a very important distinction between the different firing costs imposed on the firm. Some regulations, such as severance payments and advance notice, impose costs that are proportional to the dismissed worker's wage and hence proportional to his or her skill. Also, they are transfers from the firm to the worker. Other regulations, such as administrative, procedural and legal ones, impose costs that are essentially fixed, i.e. the same regardless of the dismissed worker's wage. They also have the characteristic of being incurred by the firm, but not received by the worker. It is very important to mention that these requirements are the legal minimum requirements and that, in many cases, additional protection is in place at the firm or industry level. Finally, keep in mind that, typically, these regulations do not come into effect before a certain tenure with the firm.

Empirical facts

The empirical evidence on the effects of EP policies is somewhat inconclusive. Using the number of months of severance payment for blue-collar workers with ten years of tenure as a measure of strictness of employment protection policies, Lazear (QJE 1990) finds a small positive effect of severance pay regulations on the unemployment rate, for a sample of 22 countries over 29 years. However, depending on the specification adopted, the evidence presented is not always statistically significant. For example, regressing the unemployment rate on country means over the entire period for the severance pay proxy, the coefficient becomes statistically insignificant. In addition, he also recognizes in the introduction that unmeasured factors (such as other labor market institutions) differ across countries. Among ten European countries and the U.S., Bertola (EER 1990) finds no strong correlation between the long-term unemployment rate and a general ranking of strictness of employment protection policies, including all forms of firing restrictions and not only severance pay. This seems to point toward the need for detailed modeling of the different forms of employment protection policies.

From a different perspective, looking at the effect of firing regulations on wages, Friesen (ILRR, 1996) studies the wages of workers covered by advance notice and severance pay regulations. Using wage data from the different Canadian provinces, hence subject to different regulations, and controlling for education, tenure, firm size, occupation and industry, she was able to determine that incumbent workers, protected by regulations, extract higher wages than workers not protected by these laws and, also, that starting wages (for non-union workers) appear to fall to offset subsequent wage increases.

[23] It would be difficult to describe all the firing regulations in the different countries. As an example, French employment protection policies are described below in more detail. In France, dismissals can be either for economic reasons or for cause. Terminations for cause can be for serious reason or for very serious misconduct. Firing for very serious misconduct exempts the employer of any cost. For all other reasons, the employer must give advance notice and provide severance payments to the terminated employee. The laws differ slightly, depending on whether the dismissal is individual or collective. However, in both cases, the employee must be notified of the reason for termination. The Ministry of Labor must also be informed of the termination, even though the request is rarely denied, as long as the procedures have been respected. A retraining program for the dismissed worker may be offered. This entire procedure takes up to three weeks. For collective dismissals, the employer must also consult with the union. For collective dismissals of more than ten employees, the regulations require that the firm establish a social plan, including steps designed to facilitate the re-employment of the dismissed work force. For all terminations, workers with between six months and two years of tenure are to be given one month's notice and, in case of a longer tenure, two months' notice. The notice must be three months for engineers, professionals and managers. The legal minimum severance payment is one tenth of the monthly wage per year of seniority. An additional 1/15 of the monthly wage must be added for every year of tenure beyond 10 years of service.

There is less controversy about the effects of UI policies on unemployment. Layard & al (1991) [24] find that cross-country unemployment rates are positively associated with the generosity of UI benefits.

Looking at differences between the U.S. and European labor markets, the following facts emerge. First, unemployment in Europe is characterized by less frequent, but longer unemployment spells than in the U.S. (less skilled workers suffer higher unemployment in Europe, primarily because of higher duration). Second, the following facts are known regarding the unemployment rate time series over the past 35 years: unemployment has not shown any long-term trend in the U.S., while it trended upwards in Europe. In fact, within-country relative changes in the unemployment rates, measured by the percentage difference between average rates in the last half of the 1980s and the last half of the 1970s, are positively correlated across countries with both the generosity of UI benefits and the aggressiveness of EP policies [25].

The model

The framework developed below comes from Millard & Mortensen [26]. It is the standard Mortensen & Pissarides model adapted to the study of various labor market policies. The notation is essentially the same as before. However, we introduce three different policies:

Firing costs:

When a match breaks down, the firm incurs firing costs. To follow the dichotomy developed above, we will consider two types of costs. First, there is a severance payment $T$ that the firm has to pay to the worker. Second, there is a pure firing tax $t$ paid by the firm, but not received by the worker. Notice that, in fact, the severance payment should be proportional to the wage, as per the regulations. Accounting for that fact would change the outcome of the bargaining process, as firms would have to take into account that the severance payments they may have to pay in the future depend on the wage they are currently negotiating.
Instead, we assume that it is fixed ($T$), since this is not central to our purpose and simplifies the analysis. Although Millard & Mortensen do not make a difference between the two types of firing costs, we will see it is important to differentiate between the two.

[24] Unemployment: Macroeconomic Performance and the Labor Market, by Layard, Nickell and Jackman, Oxford University Press, 1991.

[25] For a ranking of strictness of EP policies in OECD countries, see OECD Job Study.

[26] Millard S. and Mortensen D. (1997): The unemployment and welfare effects of labour market policy: A comparison of the U.S. and U.K., in D. Snower and G. de la Dehesa (eds.), Unemployment Policy: How Should Governments Respond to Unemployment? Oxford: Oxford University Press.

Unemployment benefits:

While unemployed, a worker receives a proportion $\rho$ of his/her wage. To keep the recursive nature of the problem, we assume that these benefits are received as long as the worker remains unemployed.

Subsidizing vacancy posting: A subsidy to hiring can be interpreted as government assistance in the job/worker matching process or as a tax credit per worker hired paid to employers. In our model, this is equivalent to reducing the vacancy posting costs $c$.

For simplicity, assume that output is entirely stochastic ($p = 0$). Using notations previously developed, we have that, in flow terms:

$$r M^w(\varepsilon) = w(\varepsilon) + \lambda \int \left[\max\{M^w(z),\, S^w + T\} - M^w(\varepsilon)\right] dF(z) \tag{44}$$

$$r M^f(\varepsilon) = \varepsilon - w(\varepsilon) + \lambda \int \left[\max\{M^f(z),\, -T - t\} - M^f(\varepsilon)\right] dF(z) \tag{45}$$

$$r S^w = b + \rho \bar{w} + \theta q(\theta)\left[M^w(\varepsilon_u) - S^w\right] \tag{46}$$

$$r S^f = -c + q(\theta)\left[M^f(\varepsilon_u) - S^f\right] \tag{47}$$

The flow equations should look familiar. The only difference between (27) and (44) is that, upon a breakdown, the worker now receives a severance payment $T$. The only difference between (28) and (45) is that the firm has to pay $T + t$ after a breakdown. The value function equations imply that all breakdowns are followed by payments of $T$ and $t$, regardless of who initiated the breakdown. As per the regulations, only layoffs are supposed to require payments of $T$ and $t$ (by the firm). Since breakdowns are privately efficient, it is not clear which party initiated the breakdown. We take the stand that firms always decide to separate, and thus that they are responsible for paying $T$ and $t$ [27]. Finally, in (46), the unemployed worker receives unemployment benefits equal to $\rho$ times the average wage $\bar{w}$ (it now makes sense to view $b$ as the value of leisure and home production only). It is assumed that unemployment benefits are based on the average wage and not the wage before the breakdown, to avoid heterogeneity among workers due to their labor market history.
Define the match surplus as:

$$T(\varepsilon) = M^w(\varepsilon) - (S^w + T) + M^f(\varepsilon) - (S^f - T - t) = M^w(\varepsilon) + M^f(\varepsilon) - S^w + t$$

where the last equality uses $S^f = 0$ (free entry). It is clear that severance payments do not affect the total match surplus, since they are a transfer within the match. Firing taxes, for the opposite reason, do influence the match surplus.

[27] We could add quits to the model (at a rate $\delta$) and assume that these quits do not require payments by the firm. That would lessen the effects of firing costs, since these payments would be less likely for the firm.

Proceeding as previously, we can derive a job destruction equation. This condition will have the property

that, at the reservation productivity, the value of a match will be equal to the value of search. Since

$$M^w(\varepsilon) - S^w - T = \beta\, T(\varepsilon) \tag{48}$$

$$M^f(\varepsilon) + T + t = (1-\beta)\, T(\varepsilon) \tag{49}$$

we can rewrite (44) and (45) as:

$$r M^w(\varepsilon) = w(\varepsilon) + \lambda\beta \int \left[\max\{T(z), 0\} - T(\varepsilon)\right] dF(z)$$

$$r M^f(\varepsilon) = \varepsilon - w(\varepsilon) + \lambda(1-\beta) \int \left[\max\{T(z), 0\} - T(\varepsilon)\right] dF(z)$$

Adding up the two equations, we get:

$$r\left[T(\varepsilon) + S^w - t\right] = \varepsilon + \lambda \int \left[\max\{T(z), 0\} - T(\varepsilon)\right] dF(z)$$

or

$$(r+\lambda)\, T(\varepsilon) = \varepsilon + \lambda \int \max\{T(z), 0\}\, dF(z) - r\,(S^w - t) \tag{50}$$

Again, we see that the model exhibits a reservation property. Define the reservation productivity shock $\varepsilon_r$ such that $T(\varepsilon_r) = 0$. Then, rewriting (50) at $\varepsilon$ and $\varepsilon_r$ and subtracting the two, one gets that:

$$T(\varepsilon) = \frac{\varepsilon - \varepsilon_r}{r+\lambda} \tag{51}$$

We see that (51) is actually the same equation as (39). Both say that the total surplus at $\varepsilon$ is the value of output above the reservation productivity, discounted to take into account the match impermanence ($r + \lambda$, not $r$). Of course the value of $\varepsilon_r$ is not the same in the two models. Inserting (51) into (50) evaluated at $\varepsilon_r$, we get:

$$\varepsilon_r + \lambda \int_{\varepsilon_r}^{\varepsilon_u} \frac{z - \varepsilon_r}{r+\lambda}\, dF(z) - r\,(S^w - t) = 0$$

Integrating by parts, this is equivalent to:

$$r\,(S^w - t) = \varepsilon_r + \frac{\lambda}{r+\lambda} \int_{\varepsilon_r}^{\varepsilon_u} \left(1 - F(z)\right) dz = \varepsilon_r + \lambda \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z) \tag{52}$$

(52) is similar to (37). The left-hand side is the opportunity cost of a match, while the right-hand side is the opportunity cost of search (at $\varepsilon_r$). With firing costs, the (total) opportunity cost of a match must reflect the fact that by remaining in the match, the firm avoids the payment of the firing tax ($t$). The severance payment

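($T$) does not enter the equation, because it does not affect the total opportunity cost. So, only $t$ affects the breakdown decision. It will become clear, however, that both $T$ and $t$ affect wages.

We can now look at the job creation condition. Because of the free entry condition, firms post vacancies until the value of doing so is driven to zero. Hence:

$$-c + q(\theta)\, M^f(\varepsilon_u) = 0$$

or

$$\frac{c}{q(\theta)} = (1-\beta)\, T(\varepsilon_u) - T - t = (1-\beta)\, \frac{\varepsilon_u - \varepsilon_r}{r+\lambda} - T - t \tag{53}$$

The difference between (40) and (53) is that (53) reflects the fact that the two types of firing costs adversely affect the firm's surplus. Firms anticipate having to pay firing costs in the future, and that decreases their surplus at match formation. Notice that, contrary to the job destruction condition (52), where only $t$ enters, both $T$ and $t$ enter in (53), because they affect the threat points and hence wage formation.

Since we can expect the firing regulations to have effects on both unemployment and wages, it is interesting to determine $w(\varepsilon)$ [28]. The calculations are provided in the footnotes. We obtain the following expression for the wage:

$$w(\varepsilon) = \beta\varepsilon + (1-\beta)\, r S^w + r\,(T + \beta t) \tag{54}$$

[28] From (48) and (49), we know that

$$\beta\left[M^f(\varepsilon) + T + t\right] = (1-\beta)\left[M^w(\varepsilon) - S^w - T\right].$$

Equations (44) and (45) imply that

$$(r+\lambda)\, M^w(\varepsilon) = w(\varepsilon) + \lambda F(\varepsilon_r)(S^w + T) + \lambda \int_{\varepsilon_r}^{\varepsilon_u} M^w(z)\, dF(z)$$

and

$$(r+\lambda)\, M^f(\varepsilon) = \varepsilon - w(\varepsilon) + \lambda F(\varepsilon_r)(-T - t) + \lambda \int_{\varepsilon_r}^{\varepsilon_u} M^f(z)\, dF(z).$$

Therefore,

$$(r+\lambda)\left(M^w(\varepsilon) - S^w - T\right) = w(\varepsilon) + \left(\lambda F(\varepsilon_r) - r - \lambda\right)(S^w + T) + \lambda \int_{\varepsilon_r}^{\varepsilon_u} M^w(z)\, dF(z)$$

and

$$(r+\lambda)\left[M^f(\varepsilon) + T + t\right] = \varepsilon - w(\varepsilon) + \left(\lambda F(\varepsilon_r) - r - \lambda\right)(-T - t) + \lambda \int_{\varepsilon_r}^{\varepsilon_u} M^f(z)\, dF(z).$$

We have that

$$\int_{\varepsilon_r}^{\varepsilon_u} M^w(z)\, dF(z) = \int_{\varepsilon_r}^{\varepsilon_u} \left(\beta T(z) + S^w + T\right) dF(z) \quad\text{and}\quad \int_{\varepsilon_r}^{\varepsilon_u} M^f(z)\, dF(z) = \int_{\varepsilon_r}^{\varepsilon_u} \left((1-\beta) T(z) - T - t\right) dF(z).$$

By inserting these integrals into the above equations, we get that

$$(r+\lambda)\left(M^w(\varepsilon) - S^w - T\right) = w(\varepsilon) + \left(\lambda F(\varepsilon_r) - r - \lambda\right)(S^w + T) + \lambda\beta \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z) + \lambda\left(1 - F(\varepsilon_r)\right)(S^w + T)$$

and

$$(r+\lambda)\left[M^f(\varepsilon) + T + t\right] = \varepsilon - w(\varepsilon) + \left(\lambda F(\varepsilon_r) - r - \lambda\right)(-T - t) + \lambda(1-\beta) \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z) + \lambda\left(1 - F(\varepsilon_r)\right)(-T - t).$$

This simplifies as

$$(r+\lambda)\left(M^w(\varepsilon) - S^w - T\right) = w(\varepsilon) - r\,(S^w + T) + \lambda\beta \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z)$$

and

$$(r+\lambda)\left[M^f(\varepsilon) + T + t\right] = \varepsilon - w(\varepsilon) - r\,(-T - t) + \lambda(1-\beta) \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z).$$

Using the top equation in this footnote, one gets that

$$\beta\varepsilon - \beta w(\varepsilon) - \beta r\,(-T - t) + \lambda\beta(1-\beta) \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z) = (1-\beta)\, w(\varepsilon) - (1-\beta)\, r\,(S^w + T) + \lambda\beta(1-\beta) \int_{\varepsilon_r}^{\varepsilon_u} T(z)\, dF(z).$$

Hence, $w(\varepsilon) = \beta\varepsilon + (1-\beta)\, r S^w + r\,(T + \beta t)$.

Comparing the wage in (54) with the expression (21) obtained in the baseline (Pissarides) model, one would think that the wage is higher when firing costs are present. Indeed, firing costs enhance the worker's bargaining position (by increasing his/her threat point), and adversely affect the firm's bargaining position, and this may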

result in higher wages. In other terms, during wage negotiations, the worker is able to extract the rent associated with the avoidance of firing costs by firms. However, one has to be careful, since firing costs also affect the value of search $S^w$, which is endogenously determined and depends on market tightness. In particular, firing costs make vacancy posting less desirable to firms, which decreases market tightness and therefore the value of search to workers.

We can look at the effect of firing regulations on labor market outcomes (unemployment and wages). In particular, we can do some preliminary graphical analysis using a job creation/job destruction graph, as with the Mortensen/Pissarides model. Combining (46), (48), (51) and (52), one gets the following job destruction condition [29]:

$$b + \rho\bar{w} + \theta q(\theta)\left[\beta\, \frac{\varepsilon_u - \varepsilon_r}{r+\lambda} + T\right] - r t = \varepsilon_r + \frac{\lambda}{r+\lambda} \int_{\varepsilon_r}^{\varepsilon_u} \left(1 - F(z)\right) dz \tag{55}$$

And from (53):

$$\frac{c}{q(\theta)} = (1-\beta)\, \frac{\varepsilon_u - \varepsilon_r}{r+\lambda} - T - t \tag{56}$$

[29] Assume that the economy is in steady state, and that total steady state employment is $E$. The change in the total wage bill per period $dt$ is equal to $\bar{w}E - (1 - \lambda\, dt)\,\bar{w}E = \lambda\, dt\, \bar{w} E$, since only a proportion $\lambda\, dt$ of existing matches are subject to a new shock. Over the same period, this change in the total wage bill is due to (i) newly formed matches at $w(\varepsilon_u)$, for an amount equal to $(\lambda\, dt)\, E\, F(\varepsilon_r)\, w(\varepsilon_u)$, since only a proportion $F(\varepsilon_r)$ of matches hit by a new shock are broken down and re-created in steady state, and to (ii) continuing matches hit by new shocks (higher than $\varepsilon_r$), for an amount equal to $\lambda\, dt\, E \int_{\varepsilon_r}^{\varepsilon_u} w(x)\, dF(x)$. Hence, in steady state:

$$\lambda\, dt\, \bar{w} E = E \lambda\, dt \left[F(\varepsilon_r)\, w(\varepsilon_u) + \int_{\varepsilon_r}^{\varepsilon_u} w(x)\, dF(x)\right]$$

or,

$$\bar{w} = F(\varepsilon_r)\, w(\varepsilon_u) + \int_{\varepsilon_r}^{\varepsilon_u} w(x)\, dF(x)$$

Using the model: Using a JC/JD graph, we can look separately at what happens if $T > 0$, $t = 0$ or $T = 0$, $t > 0$ [starting from $T = 0$ and $t = 0$].

0) Notice that, because $\bar{w}$ is endogenous, and because we do not know the effect of $T$ and $t$ on $\bar{w}$, the model should really be simulated. Nonetheless, let us try to preview the effects of the various policies.

1) First, as severance payments (only) are introduced in the model ($T > 0$, $t = 0$), we can easily see that both the job destruction and job creation curves shift to the left. This results in lower $\theta$, but has an inconclusive effect on $\varepsilon_r$. This implies a higher unemployment duration, while the effect on unemployment incidence cannot be resolved qualitatively. This makes it difficult to relate the result to the empirical facts from the motivation.

2) Second, as firing taxes (only) are introduced in the model ($T = 0$, $t > 0$), we can see that, again, the job creation curve shifts left. The effect on the job destruction curve is a shift down (one should keep in mind that ignoring the possible effects on $\bar{w}$ might be significant in this case, since $t$ only comes in as $rt$ in the job destruction condition, and thus represents a small change). Hence, no clear conclusion can be drawn.

3) Third, one can look at the effect of firing regulations ($t$, $T$) on wages. From (54), one can see that firing regulations may increase wages, since during the wage negotiations, workers are able to take advantage of the fact that firms want to avoid paying these costs. However, looking at (54), whether or not wages are increased also depends on the value of search. If the regulations have a strong enough negative effect on firms' vacancy posting, $S^w$ might decrease, and wages as well. This is why the model needs to be simulated to be able to draw any conclusion.

4) In conclusion, this version of the model is not very satisfying, since (i) one cannot fully determine the effects of $T$ and $t$ on unemployment and wages and (ii) one does not get any implication on the wage profile from the model. Notice, however, that this model delivers a clear effect on job creation. Firms, being forward looking, expect to have to pay these costs in the future. This reduces their lifetime discounted match value. Therefore, firms post fewer vacancies.

5) What can be done to remedy these problems? Notice that, implicitly, regulations were assumed to come into place at the very beginning of the match, or at match formation.
We present next a model where regulations do not come into effect before a certain tenure with the firm.

Tenure dependent policies

We are going to account for the fact that, typically, employment protection policies do not come into effect before a certain tenure with the firm. In particular, it is assumed that severance payments are not due before the first idiosyncratic shock hits the match. This assumption is made for analytical tractability, but is designed to replicate the fact that severance pay $T$ is not required in the early parts of the tenure with the firm [30]. Also, regulations about unfair dismissals do not take effect until completion of a trial period, whose length varies from a few months to two years, depending on the country. Hence, it will also be assumed that the fixed cost $t$ only comes into consideration after the first shock hits. These assumptions are made to reflect the fact that, in general, firing costs only come into effect after a certain tenure with the firm. This is an important consideration, as it affects wage formation.

[30] In France, for example, no severance payment is due until the worker has been with the firm for two years (8 quarters). Compare that with a Poisson arrival rate for idiosyncratic shocks $\lambda = 0.1$ (as calibrated in Mortensen (JEDC 1994)). Such a value for $\lambda$ implies an average time before arrival of the first shock of 10 quarters.

Alternatively, it could be assumed that the regulations come into

effect following a Poisson process with a different rate $\lambda_2$ than the rate at which idiosyncratic shocks arrive. That way, the artificial link between the arrival of shocks and the arrival of policy effects would be broken. However, assuming the same rate does not affect the main fact we want to illustrate, which is that the firm's bargaining position at match formation is not altered by the regulations, while it is once these are in place.

Because of these assumptions, it is necessary to define (i) the value of a match (to the worker or the firm) at match formation (indexed by zero), or before regulations come into effect, and (ii) the value of a match after the regulations come into effect, hereafter called a continuing match.

$$r M_0^w = w_0 + \lambda \int \left[\max\{M^w(z),\, S^w + T\} - M_0^w\right] dF(z) \tag{57}$$

$$r M_0^f = \varepsilon_u - w_0 + \lambda \int \left[\max\{M^f(z),\, S^f - T - t\} - M_0^f\right] dF(z) \tag{58}$$

$$r M^w(\varepsilon) = w(\varepsilon) + \lambda \int \left[\max\{M^w(z),\, S^w + T\} - M^w(\varepsilon)\right] dF(z) \tag{59}$$

$$r M^f(\varepsilon) = \varepsilon - w(\varepsilon) + \lambda \int \left[\max\{M^f(z),\, S^f - T - t\} - M^f(\varepsilon)\right] dF(z) \tag{60}$$

$$r S^w = b + \rho\bar{w} + \theta q(\theta)\left[M_0^w - S^w\right] \tag{61}$$

$$r S^f = -c + q(\theta)\left[M_0^f - S^f\right] \tag{62}$$

The flow equations can be interpreted in the usual manner. Notice that the option values in (57) and (59) on the one hand, and (58) and (60) on the other hand, do not differ, since by assumption, once the first shock has hit, the regulations come into effect. The difference comes from the fact that $w_0$ is negotiated when the regulations are not in place, while $w(\varepsilon)$ is negotiated once these are in place. And firing costs do affect wage formation, because they negatively affect the firm's bargaining position (through the threat point).

Wage formation

At match formation:

Surplus to the worker: $M_0^w - S^w$
Surplus to the firm: $M_0^f - S^f = M_0^f$
Total surplus: $T_0 = M_0^w + M_0^f - S^w$

Applying the Nash bargaining solution:

$$\max_{w_0}\ \left(M_0^f\right)^{1-\beta} \left(M_0^w - S^w\right)^{\beta} \quad \text{s.t.}\quad M_0^f \geq 0,\ \ M_0^w \geq S^w$$

which implies that:

$$M_0^w - S^w = \beta\, T_0$$

$$M_0^f = (1-\beta)\, T_0$$

For continuing matches:

Surplus to the worker: $M^w(\varepsilon) - S^w - T$
Surplus to the firm: $M^f(\varepsilon) - (S^f - T - t) = M^f(\varepsilon) + T + t$
Total surplus: $T(\varepsilon) = M^w(\varepsilon) + M^f(\varepsilon) - S^w + t$

Applying the Nash bargaining solution:

$$\max_{w(\varepsilon)}\ \left(M^f(\varepsilon) + T + t\right)^{1-\beta} \left(M^w(\varepsilon) - S^w - T\right)^{\beta} \quad \text{s.t.}\quad M^w(\varepsilon) \geq S^w + T,\ \ M^f(\varepsilon) \geq -T - t$$

which implies that:

$$M^w(\varepsilon) - S^w - T = \beta\, T(\varepsilon)$$

$$M^f(\varepsilon) + T + t = (1-\beta)\, T(\varepsilon)$$

Since we have already gone through the algebra, and because the calculations do not add much to the economics, the results are summarized below. As expected in a search and matching model, the decision whether to stay matched or resume search follows a reservation property (of course, the cutoff productivity $\varepsilon_r$ will depend on $T$ and $t$!). The surplus from the match is the output above the reservation productivity, discounted to take into account the match impermanence:

$$T(\varepsilon) = \frac{\varepsilon - \varepsilon_r}{r+\lambda}$$

The job destruction condition is similar to the one we previously derived:

$$r\,(S^w - t) = \varepsilon_r + \frac{\lambda}{r+\lambda} \int_{\varepsilon_r}^{\varepsilon_u} \left(1 - F(z)\right) dz \tag{63}$$

This should not come as a surprise, since it states that, at the reservation productivity $\varepsilon_r$, the total opportunity cost of search (right-hand side) is equal to the total opportunity cost of a match ($r\,(S^w + T)$ to the worker and $r\,(-T - t)$ to the firm).
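As a rough numerical illustration of condition (63), the sketch below solves for the reservation productivity $\varepsilon_r$ by bisection. It rests on assumptions that are not in the text: $F$ is taken to be uniform on $[0, \varepsilon_u]$ with $\varepsilon_u = 1$, the value of search $S^w$ is held fixed (a partial-equilibrium shortcut, since $S^w$ really depends on $\theta$), and the function names and parameter values are hypothetical.

```python
def jd_residual(eps_r, r, lam, S_w, t, eps_u=1.0):
    """Right-hand side minus left-hand side of (63), ASSUMING F uniform on [0, eps_u]."""
    # With F(z) = z / eps_u, the integral of (1 - F(z)) over [eps_r, eps_u]
    # equals (eps_u - eps_r)**2 / (2 * eps_u).
    integral = (eps_u - eps_r) ** 2 / (2.0 * eps_u)
    return eps_r + lam / (r + lam) * integral - r * (S_w - t)


def reservation_productivity(r, lam, S_w, t, eps_u=1.0, tol=1e-12):
    """Solve (63) for eps_r by bisection, holding S_w fixed (an assumption)."""
    lo, hi = 0.0, eps_u
    if jd_residual(lo, r, lam, S_w, t, eps_u) > 0 or jd_residual(hi, r, lam, S_w, t, eps_u) < 0:
        raise ValueError("no interior reservation productivity for these parameters")
    # The residual is strictly increasing in eps_r on [0, eps_u], so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if jd_residual(mid, r, lam, S_w, t, eps_u) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameter values, chosen only so that an interior solution exists.
r, lam, S_w = 0.02, 0.1, 30.0
eps_r_no_tax = reservation_productivity(r, lam, S_w, t=0.0)
eps_r_with_tax = reservation_productivity(r, lam, S_w, t=2.0)
print(eps_r_no_tax, eps_r_with_tax)
```

Under these assumptions the firing tax lowers $\varepsilon_r$, which is the partial effect on job destruction; the severance payment $T$ does not appear in the function at all, mirroring its absence from (63).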

We can compute the total surplus at match formation, $T_0$. Writing (59)-(60) at $\varepsilon = \varepsilon_u$, we get that

$$(r+\lambda)\, M^w(\varepsilon_u) = w(\varepsilon_u) + \lambda \int \max\{M^w(z),\, S^w + T\}\, dF(z)$$

and

$$(r+\lambda)\, M^f(\varepsilon_u) = \varepsilon_u - w(\varepsilon_u) + \lambda \int \max\{M^f(z),\, -T - t\}\, dF(z).$$

Adding these up, we get that

$$(r+\lambda)\left[T(\varepsilon_u) - t + S^w\right] = \varepsilon_u + \lambda \int \max\{M^w(z),\, S^w + T\}\, dF(z) + \lambda \int \max\{M^f(z),\, -T - t\}\, dF(z).$$

By the same token, using (57)-(58), we get that

$$(r+\lambda)\, M_0^w = w_0 + \lambda \int \max\{M^w(z),\, S^w + T\}\, dF(z)$$

and

$$(r+\lambda)\, M_0^f = \varepsilon_u - w_0 + \lambda \int \max\{M^f(z),\, -T - t\}\, dF(z).$$

Adding these up again, one gets that

$$(r+\lambda)\left(T_0 + S^w\right) = \varepsilon_u + \lambda \int \max\{M^w(z),\, S^w + T\}\, dF(z) + \lambda \int \max\{M^f(z),\, -T - t\}\, dF(z).$$

Comparing the two expressions, we get that $T(\varepsilon_u) - t = T_0$, or

$$T_0 = \frac{\varepsilon_u - \varepsilon_r}{r+\lambda} - t \tag{64}$$

and wages are given by [31]:

$$w(\varepsilon) = \beta\varepsilon + (1-\beta)\, r S^w + r\,(T + \beta t) \tag{65}$$

$$w_0 = \beta\varepsilon_u + (1-\beta)\, r S^w - \lambda\,(T + \beta t) \tag{66}$$

[31] The calculations for $w_0$ are very similar to the ones we carried out previously for $w(\varepsilon)$.

To determine the job creation equation, we use again the free entry condition. The difference with the above model is that the surplus at match formation, which is what firms are considering when posting vacancies, has a different expression. We get:

$$\frac{c}{q(\theta)} = (1-\beta)\left[\frac{\varepsilon_u - \varepsilon_r}{r+\lambda} - t\right] \tag{67}$$

We can now use the model to look at the effect of firing regulations on unemployment and wages. Combining (61) and (63) and using (67), we get:

$$\varepsilon_r + \frac{\lambda}{r+\lambda} \int_{\varepsilon_r}^{\varepsilon_u} \left(1 - F(z)\right) dz = b + \rho\bar{w} + \theta q(\theta)\, \beta\left[\frac{\varepsilon_u - \varepsilon_r}{r+\lambda} - t\right] - r t$$

$$\frac{c}{q(\theta)} = (1-\beta)\left[\frac{\varepsilon_u - \varepsilon_r}{r+\lambda} - t\right]$$

Effect of firing taxes on unemployment: Assume $t > 0$ and $T = 0$. When $t > 0$, the surplus at match formation decreases, hence firms post fewer vacancies and $\theta$ decreases (for a given $\varepsilon_r$). This corresponds to a shift down of the JC curve. Looking at the JD condition, we see that the term $\bar{w}$ is also influenced by $t$. First, not taking into account the effect of $t$ on $\bar{w}$, we see that for a given $\varepsilon_r$, $\theta$ increases as $t$ increases. This implies that the JD curve shifts down in reaction to an increase in $t$, which means that firing taxes have the (partial) effect of reducing

job destruction. A calibration and simulation of the model reveals that the effect of $t$ on $\bar{w}$ is to reduce $\bar{w}$ a little, confirming the shift down of the JD curve. The combined effect of the two shifts is to decrease $\varepsilon_r$, that is, to make job separation less frequent. The effect on the equilibrium $\theta$ is qualitatively inconclusive. However, calibration shows that a higher $t$ results in both higher unemployment duration and a lower incidence of unemployment.

Effect of severance payments on unemployment: Assume $t = 0$ and $T > 0$. We see that the JC condition is unaffected by $T$. This is because (i) severance payments are transfers within the match, and (ii) they do not enter the firm's or the worker's threat point when the wage is initially negotiated at match formation. Also, the severance payments do not enter the JD condition, again because they are transfers within the match (for the same reason as above, $\bar{w}$ is not very dependent on $T$). Hence, since neither job creation nor job destruction is much affected by $T$, it does not have much effect on the unemployment rate either.

Effect of $T$ and $t$ on wages and the wage profile: As in the previous version of the model, wages of continuing matches are increased by firing regulations, relative to $w_0$. As seen from (66), wages initially are reduced by the firing regulations (comparing $w_0$ at productivity $\varepsilon_u$ and $w(\varepsilon_u)$ for a continuing match, one can see that $w(\varepsilon_u) - w_0 = (r+\lambda)(T + \beta t) > 0$). In order to get the firms to accept to match, the wage $w_0$ has to be low enough, in anticipation of possible future firing costs. When bargaining at match formation, the firm fully expects that once the regulations come into play, its bargaining position will be weaker than what it is now (i.e. its threat point will be lower). Anticipating this, and given its current bargaining position before the regulations come into play, the firm is able to retain more of the match output.
The worker has to concede more of the output at first, knowing that, later in the match, he or she will be able to extract more from the match output, since the firm will try to avoid paying the firing costs. Hence, we have a steeper wage profile in the presence of firing regulations.

In conclusion, this extended version of the basic Mortensen-Pissarides model delivers the following results:
- Firing taxes $t$ tend to reduce job separation, but still result in longer unemployment,
- Severance payments have little effect on unemployment,
- Firing regulations ($T$ or $t$) result in a steeper wage profile.

A very important consideration when studying firing regulations is to recognize that the different types of firing costs have different effects on unemployment. It is necessary to differentiate between transfer payments and costs that are incurred by only one party to the match [32].

[32] And between policies that impose fixed costs and policies that impose costs proportional to the worker's skill, as these policies have different effects on heterogeneous workers. But this would require a model with heterogeneous workers, of course.

However, much of the debate in the literature focuses on whether adjustment costs are linear or quadratic. Also, when firing costs are considered, empirical studies tend to only include severance payments or advance notice regulations, or a composite index of

strictness of regulations. Of course, procedural costs are hard to quantify. Nevertheless, this model shows that they play an important role in the total effect of firing regulations. Also, the literature tends to focus on the effect of firing costs on unemployment. However, we saw that there are potential effects on wages.

Another potential application of this type of model is to explain why, despite higher firing costs in Europe than in the U.S., job destruction rates are of relatively similar magnitude in these economies. This can be explained by the fact that generous unemployment benefits have the opposite effect on unemployment incidence to that of firing costs. While firing costs tend to reduce unemployment incidence, generous unemployment benefits (i.e. higher benefit replacement rates) tend to increase unemployment incidence. Since the countries with high firing costs (i.e. Europe) are also the countries with the more generous unemployment benefits, that can qualitatively explain similar magnitudes of job destruction rates in Europe and the U.S.

3 Efficiency wage models

The literature on efficiency wage models is concerned with one question in particular: "Given that unemployment is observed in real economies, why don't firms cut their wages in the face of excess supply?" Of course, in the neoclassical model, this is not an issue, since firms are price takers and we know that there will not be any involuntary unemployment in equilibrium, exactly because of market clearing. However, it seems that the neoclassical model is missing something and that market clearing, as an assumption, is not necessarily appropriate when studying the labor market. The efficiency wage models are an attempt to justify why it may not be in a firm's best interest to cut wages. In this framework, firms are wage setters, yet they may not want to cut wages anyway. There are actually several types of efficiency wage models.
Their common characteristic is that the worker's productivity is a function of the wage. As a result, setting a lower wage induces lower worker productivity and may negatively affect profits. Hence, in equilibrium, wages may be above the market clearing level. We first look at a general model, where productivity depends on wages, without stating through which channel. We then look at an efficiency wage model where the relationship between wage and productivity is not assumed, but rather derived.

3.1 Wage setting when productivity depends on wages

Suppose the firm's production function is given by:

$$y = sF\left(e(w)\, L\right)$$

$L$ is the number of workers it employs, $e(w)$ is an effort function and $s$ is a random component. The production function $F$ satisfies the usual conditions:

$$F'(L) > 0, \qquad F''(L) < 0$$

The effort function satisfies:

$$e'(w) > 0, \qquad e''(w) < 0$$

The firm is a wage setter, and thus chooses wages and labor to maximize profits. That is, its problem is to:

$$\max_{w,\,L}\ sF\left(e(w)\, L\right) - wL \quad \text{where } s,\ e(\cdot) \text{ are given}$$

The first order conditions are:

$$sF'\left(e(w)\, L\right) e'(w)\, L - L = 0$$

$$sF'\left(e(w)\, L\right) e(w) - w = 0$$

Hence, dividing the first condition (after cancelling $L$) by the second:

$$\frac{e'\left[w^*(s)\right]\, w^*(s)}{e\left[w^*(s)\right]} = 1$$

Inspection of that condition reveals that the wage is actually independent of the shock $s$. That is, we have complete wage rigidity! The wage is such that the elasticity of effort with respect to the wage is equal to 1 at the optimal wage. Once the wage is determined, labor is given by the fact that firms hire until the marginal product of labor is equal to the wage:

$$\frac{dy}{dL}\left(w^*, L^*\right) = w^*$$

Given the fact that effort (which is an indirect input in the production function) depends on the wage, firms do not reduce the wage below $w^*$, because doing so would reduce profits.

3.2 Efficiency wages: the moral hazard case

This is in fact the most commonly cited reference for efficiency wage models. It comes from Shapiro & Stiglitz (AER 1984). The moral hazard comes from the fact that firms and workers have to enter into an employment contract, determining the wage the worker will receive. However, the firm can only imperfectly monitor the worker's actions, or work effort, once the contract is in place. So, the firm wants to make sure that the employee does not shirk, that is, it wants to make sure that the worker puts in a high work effort. But how can the firm do that, if it cannot monitor its employees easily? The economy is described below:

Agents:
$N$ identical workers
$M$ identical (wage setting) firms

Preferences:
Worker's utility function: $U(c, e)$, where $c$ is consumption and $e$ represents work effort. The worker has the choice between two effort levels: $e \in \{0, \bar{e}\}$ where $\bar{e} > 0$.
The worker gets to consume his or her wage if employed and receives unemployment benefits if not employed. That is:
$c = w$, if employed at $w$
$c = b$, if not employed
Of course, $U$ is increasing in $c$ and decreasing in $e$. It is also assumed that $U$ is separable in $c$ and $e$. Hence:

$$U(c, e) = c - e$$

Firms are profit maximizers.

Flows:
- There is an exogenous (separation) rate $s$ at which worker/firm matches are broken down.
- There is an (imperfect) monitoring technology that allows the firm to see if the worker shirks or not. However, it can only detect shirking, given that it is taking place, with probability $f$ per unit of time. In that case, the worker is fired.
- The rate $a$ at which workers find jobs is endogenously determined.

State variables:
For workers: employed or unemployed

Control variables:
Work effort $e \in \{0, \bar{e}\}$

Value functions:
$V_u$: Value to worker of being unemployed (expected lifetime value)
$V_e^s$: Value to employed worker of shirking
$V_e^n$: Value to employed worker of not shirking

Worker's problem:
In order to make his or her decision, the worker has to take the wage $w_i$ at firm $i$ as given, as well as the value of his alternative state (state of unemployment) $V_u$. In flow terms, the value function equations are given by [33]:

$$r V_e^s = w_i + (s + f)\left(V_u - V_e^s\right) \tag{68}$$

This equation says that the flow of utility from shirking while employed is the instantaneous wage received, not reduced by any work effort, plus the probability $(s + f)$ of becoming unemployed, in which case the worker's expected lifetime value goes from $V_e^s$ to $V_u$. Similarly, the value of not shirking is:

$$r V_e^n = w_i - \bar{e} + s\left(V_u - V_e^n\right) \tag{69}$$

This equation states that the flow of utility from not shirking is the instantaneous wage received minus the disutility from effort, plus the probability $s$ of becoming unemployed due to an exogenous random event, in which case the worker's expected lifetime value goes from $V_e^n$ to $V_u$.

[33] In papers, the flow equations are rarely explicitly derived. That is because they tend to take the same general form and because, most times, they are actually written in steady state. The flow of value from being in a particular state is generally equal to the instantaneous utility received from being in that state plus the instantaneous probability of changing state times the resulting gain/loss in value from that change. Let us see why in this particular example:

$$V_e^s(t) = \frac{1}{1 + r\,dt}\left\{ w(t)\,dt + (s\,dt)\, V_u(t+dt) + (1 - s\,dt)\left[(f\,dt)\, V_u(t+dt) + (1 - f\,dt)\, V_e^s(t+dt)\right] \right\}$$

As $dt \to 0$, we have that:

$$(1 + r\,dt)\, V_e^s(t) = w(t)\,dt + (s\,dt)\left[V_u(t) + dt\,\dot{V}_u(t)\right] + (1 - s\,dt)\left\{ (f\,dt)\left[V_u(t) + dt\,\dot{V}_u(t)\right] + (1 - f\,dt)\left[V_e^s(t) + dt\,\dot{V}_e^s(t)\right] \right\}$$

Taking $V_e^s(t)$ out on both sides, dividing through by $dt$ and letting $dt \to 0$, one gets:

$$r V_e^s(t) = w(t) + (s + f)\left[V_u(t) - V_e^s(t)\right] + \dot{V}_e^s(t)$$

In steady state, this becomes:

$$r V_e^s = w + (s + f)\left[V_u - V_e^s\right]$$
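The limiting argument in the footnote on flow equations can also be checked numerically. The sketch below solves the discrete-time recursion for a period of length `dt` in steady state (with $V_u$ held constant) and compares it with the solution of (68). The function names and parameter values are hypothetical and serve only as an illustration.

```python
def shirker_value_discrete(w, V_u, r, s, f, dt):
    """Stationary solution V of the discrete-time recursion with period length dt:
    V = [w*dt + (s*dt)*V_u + (1 - s*dt)*((f*dt)*V_u + (1 - f*dt)*V)] / (1 + r*dt).
    """
    p_unemployed = s * dt + (1.0 - s * dt) * f * dt   # separated, or caught shirking
    p_employed = (1.0 - s * dt) * (1.0 - f * dt)      # neither event occurs
    # The recursion is linear in V, so it can be solved in closed form.
    return (w * dt + p_unemployed * V_u) / (1.0 + r * dt - p_employed)


def shirker_value_continuous(w, V_u, r, s, f):
    """Steady-state solution of equation (68): r*V = w + (s + f)*(V_u - V)."""
    return (w + (s + f) * V_u) / (r + s + f)


# Hypothetical parameter values.
w, V_u, r, s, f = 1.0, 10.0, 0.05, 0.1, 0.2
limit_value = shirker_value_continuous(w, V_u, r, s, f)
for dt in (1.0, 0.1, 0.01, 1e-6):
    gap = shirker_value_discrete(w, V_u, r, s, f, dt) - limit_value
    print(f"dt = {dt:g}: discrete minus continuous = {gap:.3e}")
```

The gap shrinks with `dt` because the only term separating the two formulas is the second-order product of the two probabilities, which vanishes in the limit.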

Equations (68) and (69) can be solved for $V_e^s$ and $V_e^n$ as functions of the wage $w_i$ at firm $i$ and the value of being unemployed $V_u$:

$$V_e^s = \frac{w_i + (s + f)\, V_u}{r + s + f}$$

$$V_e^n = \frac{w_i - \bar{e} + s\, V_u}{r + s}$$

Remember that the employed worker's decision is whether to shirk or not. Given $w_i$ and $V_u$, the worker prefers to work hard if:

$$V_e^n \geq V_e^s$$

which is equivalent to:

$$w_i \geq r V_u + \frac{r + s + f}{f}\, \bar{e} \tag{70}$$

This condition is generally referred to as the no-shirking condition ("NSC"). Note that the wage necessary to deter shirking increases with $V_u$, the value of the state the worker goes to if caught. It increases with $\bar{e}$, the effort level corresponding to no shirking. Finally, the no-shirking wage decreases with the probability of being caught, $f$. If being fired becomes more likely, the incentive the firm has to give the employee to work hard does not have to be as big.

Firm's problem:

Technology:
When firm $i$'s employment is $L_i$, its effective employment $l_i$ is the number of its workers who are not shirking. Since shirkers are contributing zero effort, their effective labor input is also zero. Then, firm $i$'s production is given by:

$$y_i = F(l_i)$$

where $F(\cdot)$ is strictly increasing and concave.

Each firm must find it optimal to fire shirking workers, since the other alternative, a wage reduction, would only induce the worker to shirk (remember that, unless a worker chooses effort level $\bar{e}$, no output is produced). The firm cannot pay a shirking worker anything other than 0, since a shirker's output is zero. It finds it better to pay the no-shirking wage and get output from the worker.

Remember that firms are wage setters. Taking $V_u$ as given, they can set their wage at the point where the worker is just indifferent between shirking and not. Hence, they choose to hire at the no-shirking wage, and firm $i$ sets its wage $\tilde{w}_i$ such that:

$$\tilde{w}_i = r V_u + \frac{r + s + f}{f}\, \bar{e}$$
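To make the no-shirking condition (70) concrete, the following sketch evaluates the two value functions and confirms that the worker is exactly indifferent at the no-shirking wage. The parameter values and function names are hypothetical; $V_u$ is taken as given, as in the worker's problem.

```python
def value_shirking(w, V_u, r, s, f):
    """V_e^s solved from (68)."""
    return (w + (s + f) * V_u) / (r + s + f)


def value_not_shirking(w, V_u, r, s, e_bar):
    """V_e^n solved from (69)."""
    return (w - e_bar + s * V_u) / (r + s)


def no_shirking_wage(V_u, r, s, f, e_bar):
    """Lowest wage satisfying the no-shirking condition (70)."""
    return r * V_u + (r + s + f) / f * e_bar


# Hypothetical parameter values.
V_u, r, s, f, e_bar = 10.0, 0.05, 0.1, 0.2, 0.3
w_nsc = no_shirking_wage(V_u, r, s, f, e_bar)

for w in (w_nsc - 0.1, w_nsc, w_nsc + 0.1):
    gain = value_not_shirking(w, V_u, r, s, e_bar) - value_shirking(w, V_u, r, s, f)
    print(f"w = {w:.4f}: V_e^n - V_e^s = {gain:+.4f}")
```

Below the threshold the gain from working hard is negative, so the worker shirks; above it the gain is positive, which is why the wage-setting firm has no reason to pay more than the threshold.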

The employment choice $L_i$ by firm $i$ has to be such that the marginal product of labor in firm $i$ is equal to the wage set:

$$F'(L_i) = \tilde{w}_i$$

Equilibrium:
Up to now, $V_u$ and the market wage $w$ were taken as given. Of course, these are determined in equilibrium. We are looking for an equilibrium where the wage $w_i$ set by firm $i$ is equal to the market wage $w$.

The value of unemployment is given by:

$$r V_u = b + a\left(V_e - V_u\right) \tag{71}$$

Remember that the firm sets the wage so that workers are just indifferent between shirking or not shirking. Hence:

$$V_e^s = V_e^n = V_e$$

Equation (71) has the same interpretation as the other value function equations. In flow terms, the value of being unemployed is equal to the utility received while unemployed (which may include unemployment benefits, value of leisure, home production) plus the option value of changing state (U to E). Solving for equilibrium values of $V_e$ and $V_u$, we get [34]:

$$r V_e = \frac{(r + a)(w - \bar{e}) + s b}{r + s + a}$$

$$r V_u = \frac{a\,(w - \bar{e}) + (r + s)\, b}{r + s + a}$$

Knowing that, in equilibrium, all firms set the same wage ($\tilde{w}_i = w$), this allows us to rewrite the NSC condition as:

$$w = b + \bar{e} + \frac{r + a + s}{f}\, \bar{e} \tag{72}$$

As previously mentioned, the rate out of unemployment $a$ is endogenously determined. In steady state, every period, the flow out of employment (number of workers losing their jobs) is equal to the flow into employment (number of workers finding a job). Calling $L$ the aggregate employment, this steady state condition can be written as:

$$sL = a\,(N - L) \quad\Longrightarrow\quad a = \frac{sL}{N - L}$$

Let $u = \frac{N - L}{N}$ be the unemployment rate. Thus, $L/N = 1 - u$ and $a = \frac{s\,(1 - u)}{u}$. Therefore, $w = b + \bar{e} + (\bar{e}/f)\left(r + \frac{s\,(1 - u)}{u} + s\right)$ and we can further rewrite the NSC condition as:

$$w = b + \bar{e} + \frac{\bar{e}}{f}\left[r + \frac{s}{u}\right] \tag{73}$$

[34] Note that we could use either (68) or (69) to compute $V_e$.
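The algebra leading from (70) and (71) to (73) can be verified numerically. The sketch below, with hypothetical parameter values and function names, computes the wage from (73) for a given unemployment rate, plugs it into the equilibrium expression for $r V_u$, and checks that the individual no-shirking condition (70) then holds with equality.

```python
def job_finding_rate(u, s):
    """Steady-state rate out of unemployment: a = s * (1 - u) / u."""
    return s * (1.0 - u) / u


def nsc_wage(u, b, e_bar, r, s, f):
    """Aggregate no-shirking wage, equation (73)."""
    return b + e_bar + (e_bar / f) * (r + s / u)


def flow_value_unemployed(w, a, b, e_bar, r, s):
    """Equilibrium r * V_u implied by (71) and the employed worker's value function."""
    return (a * (w - e_bar) + (r + s) * b) / (r + s + a)


# Hypothetical parameter values.
b, e_bar, r, s, f = 0.2, 0.1, 0.05, 0.1, 0.5
u = 0.10

a = job_finding_rate(u, s)
w = nsc_wage(u, b, e_bar, r, s, f)
rV_u = flow_value_unemployed(w, a, b, e_bar, r, s)

# Right-hand side of (70), evaluated at the equilibrium value of unemployment.
w_from_70 = rV_u + (r + s + f) / f * e_bar
print(w, w_from_70)
```

The two printed numbers should coincide, and the wage from (73) rises as `u` falls, since a tighter labor market makes being fired less costly for the worker.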

[Figure 4: Equilibrium wage and employment. Vertical axis: wage $w$. Horizontal axis: employment $L$, bounded by $N$. Curves: labor demand and the no-shirking condition.]

At this point, the unemployment rate $u$ has not been determined yet. In fact, the market wage and employment level are given by two conditions. First, the firm, given $L$, sets the wage just equal to the no-shirking wage, as given by the NSC (73). Then, the firm, given $w$, chooses how many workers to hire, up to the point where the marginal productivity of labor equals $w$. Given that all firms are homogeneous and hence have the same employment, this results in:

$$F'\!\left(\frac{L}{M}\right) = w \qquad (74)$$

This gives us two curves in $(L, w)$ space. The equilibrium wage and employment level are given by the intersection of these two curves.
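The intersection of the two curves can be computed numerically. The sketch below is an illustration under assumptions that are not in the notes: a power production function $F(l) = l^{\alpha}$ with $0 < \alpha < 1$, the reading of $M$ in (74) as the divisor that gives employment per firm, and hypothetical parameter values. It solves (73) and (74) jointly by bisection, using the fact that the labor demand wage is decreasing in $L$ while the no-shirking wage is increasing in $L$.

```python
def nsc_wage(L, N, b, e, f, r, s):
    """No-shirking wage (73): w = b + e + (e/f) * (r + s/u), with u = (N-L)/N."""
    u = (N - L) / N
    return b + e + (e / f) * (r + s / u)


def labor_demand_wage(L, M, alpha):
    """Wage on the labor demand curve (74), assuming F(l) = l**alpha."""
    return alpha * (L / M) ** (alpha - 1.0)


def equilibrium_employment(N, M, alpha, b, e, f, r, s, tol=1e-10):
    """Bisection on L in (0, N): demand wage minus NSC wage falls with L."""
    lo, hi = 1e-9 * N, (1.0 - 1e-9) * N
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gap = labor_demand_wage(mid, M, alpha) - nsc_wage(mid, N, b, e, f, r, s)
        if gap > 0:
            lo = mid   # demand wage still above NSC wage: employment is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the no-shirking wage grows without bound as $u$ approaches zero, the solution always has $L < N$: the equilibrium features unemployment.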

One is the NSC. There has to be a certain relationship between the wage set by the firm and the employment level: $w^{NSC}$ is increasing in $L$. This is because the firm uses the wage to entice the worker to produce full effort: the higher the employment level, and therefore the lower the unemployment rate, the higher the firm has to set the wage to make shirking costly. Notice that, of course, the threat of firing the worker is effective only if there is a certain level of (involuntary) unemployment. The unemployed would prefer to work at or even below the equilibrium wage [35], but cannot make a credible promise to produce full effort at that wage.

Also, notice that the no-shirking wage $w$ increases with the income received during search $b$. This is because a higher $b$ implies that the cost of becoming unemployed is lower, and hence that being fired is relatively less costly. Also, the higher the effort level $e$, the more the firm has to pay the worker to guarantee hard work. And finally, the more difficult it is to monitor the worker (low $f$), the higher the wage. If being caught is unlikely, you have to increase the cost of being caught to deter low effort.

The other relationship is the usual labor demand equation. Given the wage, the firm sets employment such that the marginal product of labor is equal to the wage. Because firms choose to raise wages as an enforcement mechanism, the quantity of labor demanded decreases, resulting in unemployment.

4 An introduction to implicit contracts

The implicit contract literature is also interested in looking at why wages seem to be relatively stable (or "rigid"). If one considers that firms and workers are engaged in long-term relationships, then wages may reflect the terms of an implicit contract entered into voluntarily by the worker and the firm, rather than ensure that the labor market clears every period. The basic feature of implicit contract theory is that workers are risk averse and have limited access to capital markets.
This can be justified by considering that workers' only capital is their human capital, which cannot be used as collateral (especially firm-specific human capital). Since they are risk averse, they would like to obtain some kind of insurance against income fluctuations, but cannot get this from private insurance companies. The employers, however, are less risk averse and have access to capital markets. For simplicity, one may assume that employers are risk neutral [36].

Because workers and employers have different attitudes towards risk, there is room for employers to provide some form of insurance to the workers. Hence, employers implicitly provide wages and insurance to their workers as part of an employment package. The implicit contract should be interpreted in the "as if" sense of an explicit one, as a mutual understanding between worker and employer that the "invisible handshake" implies, as in commercial contracts. In other words, by providing insurance to risk averse workers, the risk neutral firm is able to attract workers at a lower expected wage than it would if it were not providing insurance.

[35] Notice that $r\,(r+s+a)(V_e - V_u) = r\,(w - b - e) > 0$. Hence, $V_e - V_u > 0$.

[36] Because the firm's owner is risk neutral or because firms have access to capital markets.

First assume that workers are risk averse and cannot insure themselves, either through direct insurance or through savings accumulation. Consider that the productivity of a worker is determined, in part, by a random variable $s$. The shock $s$ can reflect demand uncertainty and/or a productivity shock. The firm offers an optimal contract to the worker, in the form of wage and work hours, as a function of $s$, $\{W(s), H(s)\}_{s \in S}$.

The workers maximize expected utility:

$$E\left[V(C(s), 1 - H(s))\right] \quad \text{s.t.} \quad C(s) = W(s)\,H(s)$$

The constraint reflects the fact that the workers do not have access to credit markets, so that labor is their only source of income. Firms are risk neutral and maximize expected profits:

$$E\left[sF(H(s)) - W(s)\,H(s)\right]$$

What if the labor market were characterized by the usual competitive market?

Let us first look at what would happen in a spot competitive market. Denote by $F(L)$ the firm's production function:

$$F'(L) > 0, \qquad F''(L) < 0$$

In a competitive spot market, firms and workers, after realization of $s$, take $W(s)$ as given; workers maximize current utility and firms maximize profits. Hence, the firm's FOC is given by:

$$sF'(H(s)) = W(s) \qquad (75)$$

while the worker's FOC is:

$$W(s)\,V_1\left[W(s)H(s), 1 - H(s)\right] - V_2\left[W(s)H(s), 1 - H(s)\right] = 0 \qquad (76)$$

In such a market, both wage and employment respond to a change in $s$.

Now assume the relationship between a worker and a firm is characterized by a long-term contract:

Consider that firms set wages, and in particular offer a contract to workers. In fact, the contract can be determined in terms of $\{C(s), H(s)\}$ instead of $\{W(s), H(s)\}$. The contract specifies a value for consumption and labor hours as a function of $s$ (before the realization of $s$, of course).

Since the firm is risk neutral, it can offer a contract that transfers income across states. The contract maximizes expected profits, given expected utility. The contract must maximize:

$$\max_{\{C(s), H(s)\}} \; E\left[sF(H(s)) - C(s)\right] + \lambda E\left[V(C(s), 1 - H(s))\right] \qquad (77)$$

The constant $\lambda$ can be seen as a measure of the bargaining strength of the worker (the higher $\lambda$, the higher the weight on the worker's expected utility in the maximization problem) [37]. Notice that this results in an optimal contract, by construction. By varying $\lambda$, one describes the set of optimal contracts. Since workers are risk averse, while firms are risk neutral, one can see that a contract derived from this maximization problem will shift some of the risk from workers to firms, hence increasing the objective function.

(Writing the maximization problem in these terms is equivalent to writing it in the form:

$$\max_{\{C(s), H(s)\}} \; E\left[sF(H(s)) - C(s)\right] \quad \text{s.t.} \quad E\left[V(C(s), 1 - H(s))\right] \geq V_0$$

where $V_0$ is some reservation value to the worker. The contract has to make the worker at least as well off as $V_0$, which can be seen as reflecting some outside option to the worker.) [The outside option can be viewed as what the worker would get in a spot market.]

We can maximize with respect to $C$ and $H$ in each state $s$ [38], and get:

$$-1 + \lambda V_1\left[C(s), 1 - H(s)\right] = 0 \qquad (78)$$

$$sF'\left[H(s)\right] - \lambda V_2\left[C(s), 1 - H(s)\right] = 0 \qquad (79)$$

We get, by differentiating (78) and (79) with respect to $s$:

$$V_{11}\,C'(s) - V_{12}\,H'(s) = 0$$

$$-\lambda V_{12}\,C'(s) + \left[\lambda V_{22} + sF''\left[H(s)\right]\right] H'(s) = -F'(H(s))$$

where the derivatives of $V$ are evaluated at $[C(s), 1 - H(s)]$. Hence:

$$C'(s) = \frac{F' V_{12}}{\Delta}, \qquad H'(s) = \frac{F' V_{11}}{\Delta} \qquad (80)$$

where

$$\Delta = \lambda\left(V_{12}^2 - V_{11}V_{22}\right) - sF'' V_{11} \qquad (81)$$

[37] Notice that this has nothing to do with the bargaining power in the Nash bargaining solution.

[38] This is the maximization of the expected value of an objective function, with given weights. We can just maximize the objective function at every $s$.

We can see that $H'(s)$ has the opposite sign of $\Delta$. Since workers are risk averse, $V_{12}^2 - V_{11}V_{22} < 0$ and $V_{11} < 0$. We also know that the production function itself is concave. Therefore $\Delta < 0$ and it results from inspection of (80) that:

$$H'(s) > 0$$

$$C'(s) > 0 \iff V_{12}(C, 1 - H) < 0$$

The implicit contract specifies that workers always work more in good times (high $s$). There is no ambiguity due to opposing income and substitution effects (remember that in a competitive spot market, the income and substitution effects of an increase in productivity work in opposite directions). The positivity of $H'(s)$ is basically a result of the substitution effect only. By providing insurance (income transfers) to the worker, the firm removes the income effect, and workers substitute labor for leisure when productivity is high. A bad productivity draw does not carry any income effect. Hence, the worker works more when the marginal product of labor is larger, and consumption is redistributed across states through insurance.

If indeed the economy is characterized by long-term implicit contracts between workers and firms, the fact that these contracts emphasize substitution effects gives more support to RBC models that rely on a high degree of substitution of labor hours in response to productivity shocks.

Remark that implicit contracts do not insure ex-post utility. In fact, from (78), we see that the effect of insurance is to smooth out the marginal utility of consumption, but not utility. Define ex-post utility as

$$u(s) = V(C(s), 1 - H(s))$$

Then:

$$u'(s) = C' V_1 - H' V_2 = \frac{F'}{\Delta}\left(V_{12}V_1 - V_{11}V_2\right)$$

Hence, $u'(s)$ has the sign of $V_{11}V_2 - V_{12}V_1$. We have that [39]:

$$\text{Utility is fully insured} \iff V_{11}V_2 - V_{12}V_1 = 0$$

This is the case if

$$V(C, 1 - H) = v\left(C + \phi(1 - H)\right)$$

Notice that $C'(s)$ can take any sign. However, we have the following:

Consumption is completely smoothed when the utility function is additively separable in consumption and leisure.

Consumption and labor are positively correlated only when

$$V_{12}(C, 1 - H) < 0$$

[39] Notice that, in that case, employment is the same in the spot and the contract economies.

Comparing the FOCs for the spot competitive market and the contract economy, we can see that:

Spot Market Economy: $\dfrac{V_2}{V_1} = sF'$ and $sF' = W$

Contract Economy: $\dfrac{V_2}{V_1} = sF'$ and $V_1 = \dfrac{1}{\lambda}$

We can see that in the two economies, the same efficiency condition on hours holds. However, in the contract economy, insurance ensures that the marginal utility of consumption is constant across states.
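A small numerical comparison of the two economies may help. The sketch below rests on functional forms that are assumptions for illustration only, not taken from the notes: $V(C, 1-H) = \ln C + \gamma \ln(1-H)$, which is additively separable so that $V_{12} = 0$, and $F(H) = H^{a}$ with $0 < a < 1$. Under these assumptions the contract FOCs (78) and (79) give $C(s) = \lambda$ and $s a H^{a-1}(1-H) = \lambda\gamma$, while the spot FOCs (75) and (76) give $H = 1/(1+\gamma)$ in every state.

```python
def contract_hours(s, lam, gamma, a, tol=1e-12):
    """Solve (79) for H under the assumed forms: s*a*H**(a-1)*(1-H) = lam*gamma."""
    def foc(H):
        return s * a * H ** (a - 1.0) * (1.0 - H) - lam * gamma

    lo, hi = 1e-12, 1.0 - 1e-12   # foc is positive near 0 and negative near 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def contract_consumption(lam):
    """(78) with log utility: lam * (1/C) = 1, so C = lam in every state."""
    return lam


def spot_hours(gamma):
    """(76) with C = W*H and log utility: 1/H = gamma/(1-H)."""
    return 1.0 / (1.0 + gamma)


def spot_consumption(s, gamma, a):
    """C = W*H with W = s*a*H**(a-1) from (75)."""
    H = spot_hours(gamma)
    return s * a * H ** a
```

With these forms, spot hours do not respond to $s$ because income and substitution effects cancel exactly, whereas contract hours rise with $s$ and contract consumption is flat, in line with the discussion above.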

Miscellaneous issues in search and matching

1 First generation of search models: wage posting

The treatment comes from Rogerson, Shimer and Wright (JEL 2005). We start the course with the first generation of search models to have an idea of how the literature evolved. One of the early questions was to try to understand how unemployment can arise and how similar workers can earn different wages. For that, of course, one has to move away from Walrasian models towards models with frictions. To address these questions, we look at the first generation of search models (as opposed to matching models). The first attempt was to study the problem of a worker facing a distribution of wages and deciding whether to accept a wage offer. For now, the distribution of wages itself is exogenous. We will think of endogenizing this distribution later in the course.

1.1 Infinitely lived matches

Discrete time: Consider a worker seeking to maximize his expected lifetime income $E\sum_{t=0}^{+\infty} \beta^t x_t$, where $x_t$ is his income at date $t$. This income can either be a wage $w_t$ if employed or unemployment income $b$. The unemployed worker samples every period from a known distribution of offered wages $F(w)$. The worker can either accept or reject the sampled wage. If accepted, he starts working. If refused, the worker waits a period and gets to sample again from the same distribution. Start by assuming that if the wage is accepted, the worker stays employed indefinitely. We thus have two value functions

$$U = b + \beta\int_0^{+\infty} \max\{U, W(w)\}\,dF(w), \qquad W(w_0) = w_0 + \beta W(w_0).$$

It is immediate that $W(w_0) = w_0/(1-\beta)$ is increasing in $w_0$ and therefore there exists a reservation wage $w_r$ that will make the unemployed worker indifferent between accepting and rejecting a wage offer. By definition, $W(w_r) = U$.

How do we derive that reservation wage $w_r$? It must be that

$$U = W(w_r) = \frac{w_r}{1-\beta} = b + \beta\int_0^{+\infty} \max\{W(w_r), W(w)\}\,dF(w)$$

$$\implies \frac{w_r}{1-\beta} = b + \beta\int_0^{+\infty} \max\left\{\frac{w_r}{1-\beta}, \frac{w}{1-\beta}\right\}dF(w)$$

$$\implies w_r = b(1-\beta) + \beta\int_0^{+\infty} \max\{w_r, w\}\,dF(w).$$

That can be rewritten as

$$w_r(1-\beta) = b(1-\beta) + \beta\int_0^{+\infty} \max\{0, w - w_r\}\,dF(w)$$

$$\implies w_r = b + \frac{\beta}{1-\beta}\int_0^{+\infty} \max\{0, w - w_r\}\,dF(w).$$

This is one of the standard ways to express the reservation wage equation in a wage posting model. The other one is derived as

$$w_r = b + \frac{\beta}{1-\beta}\int_0^{+\infty} \max\{0, w - w_r\}\,dF(w)$$

$$\implies w_r = b + \frac{\beta}{1-\beta}\int_{w_r}^{+\infty} (w - w_r)\,dF(w)$$

$$\implies w_r = b + \frac{\beta}{1-\beta}\int_{w_r}^{+\infty} (1 - F(w))\,dw,$$

where the last expression is obtained by integrating by parts.

Continuous time: We re-derive the results in continuous time. The discount factor is $1/(1 + r\,dt)$ and the probability of getting an offer per period $dt$ is given by $\alpha\,dt$. Then the value functions can be rewritten as

$$U = b\,dt + \frac{1}{1 + r\,dt}\left[\alpha\,dt\int_0^{+\infty} \max\{U, W(w)\}\,dF(w) + (1 - \alpha\,dt)\,U\right], \qquad W(w_0) = w_0\,dt + \frac{1}{1 + r\,dt}\,W(w_0).$$

With the usual algebra, we obtain that

$$rU = b + \alpha\int_0^{+\infty} \max\{0, W(w) - U\}\,dF(w), \qquad rW(w_0) = w_0.$$

We can see the similarity with the discrete time case. The difference is that the value functions just above are expressed in flow terms. By the same argument, we can prove the existence of a reservation wage $w_r$ above which offers are accepted, $W(w_r) = U$. Algebra implies that this reservation wage can be expressed as

$$w_r = b + \frac{\alpha}{r}\int_{w_r}^{+\infty} (w - w_r)\,dF(w) \qquad \text{or} \qquad w_r = b + \frac{\alpha}{r}\int_{w_r}^{+\infty} (1 - F(w))\,dw.$$

What do we get out of that very simple model? It generates unemployment. Because not all offers are accepted, the model predicts some unemployment (of course, remember that a very important determinant of that unemployment is the distribution $F$, which is still exogenous). The model also generates a distribution $G(w)$ of accepted wages (which may be different from the distribution of offered wages). We can also do some comparative statics exercises. Here are the results, summarized:

Unemployment duration: $U.D. = 1/H$, where $H = \alpha\left[1 - F(w_r)\right]$. $H$ is called the hazard rate (why?) [1].

Distribution of accepted (observed) wages: $G(w) = F(w \mid w \geq w_r)$.

Also,

$\dfrac{dw_r}{db} > 0 \implies \dfrac{dH}{db} < 0 \implies$ higher $b$ implies longer unemployment spells;

$\dfrac{dw_r}{d\alpha} > 0$, but possibly ambiguous effect on $H$; $\dfrac{dH}{d\alpha} > 0$ under log-concavity.

Notice that the first effect is the one obtained in standard matching models, but for different reasons (why?).

[1] The probability that a worker has not found a job after a spell of duration $\tau$ is equal to $(1 - H\,dt)^{\tau/dt} \to e^{-H\tau}$. Thus the average unemployment duration is equal to $\int_0^{+\infty} u\,e^{-Hu}H\,du = 1/H$.
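The continuous-time reservation wage equation can be solved numerically once a wage offer distribution is specified. The sketch below assumes, purely for illustration, that $F$ is uniform on $[0, 1]$ and that $b < 1$, so that $\int_{w_r}^{1}(1 - F(w))\,dw = (1 - w_r)^2/2$. The function names and parameter values are hypothetical. It finds $w_r$ by bisection and then computes the hazard rate and the expected unemployment duration.

```python
def reservation_wage(b, alpha, r, tol=1e-12):
    """Solve w_r = b + (alpha/r) * (1 - w_r)**2 / 2 for F uniform on [0, 1].

    The residual below is negative at w = b and positive at w = 1 (when b < 1),
    and it is increasing in w, so bisection on [b, 1] finds the unique root.
    """
    def residual(w):
        return w - b - (alpha / r) * (1.0 - w) ** 2 / 2.0

    lo, hi = b, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def hazard_rate(b, alpha, r):
    """H = alpha * (1 - F(w_r)), with F(w) = w under the uniform assumption."""
    return alpha * (1.0 - reservation_wage(b, alpha, r))


def expected_duration(b, alpha, r):
    """Average unemployment duration 1/H."""
    return 1.0 / hazard_rate(b, alpha, r)
```

Raising $b$ in this sketch raises the reservation wage and lengthens expected unemployment spells, which is the first comparative statics result listed above.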

1.2 Finitely lived matches

So far, we have assumed that once a worker has accepted a job, he works for the firm forever. This is clearly empirically invalid since, as we already know, there are relatively high levels of job destruction in every phase of the cycle. We thus start by assuming that matches end exogenously at a Poisson rate $\lambda$. The value of unemployment is unchanged. However, the value of employment must be rewritten as

$$rW(w) = w + \lambda\left[U - W(w)\right].$$

The nature of the worker's decision is still the same, so that we still have a reservation wage property and $U = W(w_r)$, which leads to

$$w_r = b + \frac{\alpha}{r + \lambda}\int_{w_r}^{+\infty} (1 - F(w))\,dw.$$

It is the same as the reservation wage equation above, except that possible match breakdowns in the future raise the effective discount rate from $r$ to $r + \lambda$. Unemployment duration is still given by $1/H$ and employment duration is equal to $1/\lambda$.

1.3 Finitely lived matches with on-the-job search

The problem with the above formulation is that the separation rate is exogenous, which makes it impractical for a lot of applications. We thus introduce on-the-job search in the model in order to obtain job-to-job transitions. This is empirically a very important feature of labor markets (see work by Nagypal among others documenting that fact). We will see models of on-the-job search several times over the course of the semester.

Suppose new offers arrive at rate $\alpha_0$ while unemployed and at rate $\alpha_1$ while employed. In each case, workers get to sample from the same distribution. The value functions become

$$rU = b + \alpha_0\int_{w_r}^{+\infty} \left[W(w) - U\right]dF(w),$$

$$rW(w_0) = w_0 + \alpha_1\int_0^{+\infty} \max\{0, W(w) - W(w_0)\}\,dF(w) + \lambda\left[U - W(w_0)\right].$$

The second term on the right-hand side of the second equation accounts for the fact that the employed worker can improve on his current situation without going back to unemployment. Again we have a reservation wage property for the unemployed worker. The employed worker's reservation wage property is very simple: accept if $w > w_0$. After some algebra, we get that

$$w_r = b + (\alpha_0 - \alpha_1)\int_{w_r}^{+\infty} \left(W(w) - W(w_r)\right)dF(w).$$

Notice that $w_r \gtrless b \iff \alpha_0 \gtrless \alpha_1$ (why?). Again, this reservation wage property can be rewritten as

$$w_r = b + (\alpha_0 - \alpha_1)\int_{w_r}^{+\infty} \frac{1 - F(w)}{r + \lambda + \alpha_1\left[1 - F(w)\right]}\,dw.$$

We still have that $dw_r/db > 0$, but here that also implies that turnover is reduced by unemployment insurance.

This simple framework delivers a lot of predictions about turnover that are empirically valid [2]:

(a) Time since last unemployed is positively correlated with the wage,
(b) Separations are negatively correlated with wages,
(c) Job tenure and separation rates are negatively correlated.

We will use a variant of this model later in the course (Ljungqvist and Sargent 1998).

[2] One can again compute a number of things. In steady state, $\lambda(1 - u) = u\,\alpha_0(1 - F(w_r))$, thus

$$u = \frac{\lambda}{\lambda + \alpha_0(1 - F(w_r))}.$$

We can also compute the distribution of observed wages $G(w)$. Consider the interval $[w_r, w]$. Flows into that interval are coming from unemployed workers. Flows out of that interval are going either to unemployment or to higher wages (see graph). Thus in steady state, it must be that

$$u\,\alpha_0\left[F(w) - F(w_r)\right] = \lambda(1 - u)\,G(w) + (1 - u)\,G(w)\,\alpha_1(1 - F(w)),$$

which combined with the expression for $u$ above implies that

$$G(w) = \frac{\lambda\left[F(w) - F(w_r)\right]}{\left[1 - F(w_r)\right]\left[\lambda + \alpha_1(1 - F(w))\right]}.$$

Finally, the job-to-job transition rate is given by

$$\alpha_1\int_{w_r}^{+\infty} (1 - F(w))\,dG(w).$$
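The steady-state formulas for $u$ and $G(w)$ are easy to evaluate. The sketch below assumes, for illustration only, a uniform offer distribution on $[0, 1]$ and treats the reservation wage $w_r$ as a given input rather than solving for it. Function names and parameter values are hypothetical.

```python
def offer_cdf(w):
    """Assumed offer distribution: F uniform on [0, 1]."""
    return min(max(w, 0.0), 1.0)


def unemployment_rate(lam, alpha0, w_r):
    """Steady state: u = lam / (lam + alpha0 * (1 - F(w_r)))."""
    return lam / (lam + alpha0 * (1.0 - offer_cdf(w_r)))


def accepted_wage_cdf(w, lam, alpha1, w_r):
    """G(w) = lam*[F(w)-F(w_r)] / ([1-F(w_r)] * [lam + alpha1*(1-F(w))])."""
    if w <= w_r:
        return 0.0
    F_w, F_r = offer_cdf(w), offer_cdf(w_r)
    return lam * (F_w - F_r) / ((1.0 - F_r) * (lam + alpha1 * (1.0 - F_w)))
```

Because employed workers keep climbing the wage ladder, $G(w)$ lies below the truncated offer distribution $[F(w) - F(w_r)]/[1 - F(w_r)]$ whenever $\alpha_1 > 0$: observed wages are higher, in the first-order stochastic dominance sense, than accepted first offers.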

1 Efficiency in matching models: the Hosios condition

Matching models are characterized by search externalities: searching firms make it harder for other firms to match and make it easier for workers to match. The question is: when is efficiency achieved? Hosios (Restud 1990) answers that question. We look at a simplified version of a matching model, but the argument carries through to a more general setup.

Since the decision being studied is how vacancy decisions are taken (and thus how matching probabilities are determined), we look at a simple model where an individual firm has the option to post costly vacancies and faces a matching probability which is determined by firms' behavior. The firm takes market tightness $\theta = v/u$ as given and posts a vacancy if the value of doing so is positive. Thus, free entry ensures that

$$c = q(\theta)\,M_f.$$

Prices are determined once a meeting takes place, by Nash bargaining. Suppose output is $y$ and the outside options of both sides are 0. Then the wage is determined as

$$\arg\max_w \; w^{\beta}(y - w)^{1-\beta} \implies w = \beta y \implies M_f = (1 - \beta)\,y.$$

Thus equilibrium vacancy posting is determined by

$$c = q(\theta)(1 - \beta)\,y. \qquad (1)$$

The social planner's problem is to choose the number of vacancies to maximize aggregate output net of vacancy costs, i.e. to

$$\max_v \; m(u, v)\,y - cv.$$

Thus optimal vacancy posting is determined by

$$c = m_2(u, v)\,y. \qquad (2)$$

The equilibrium and optimal solutions coincide when

$$c = q(\theta)(1 - \beta)\,y = m_2(u, v)\,y \implies 1 - \beta = \frac{v\,m_2(u, v)}{m(u, v)}.$$

Denote by $\eta_v$ the elasticity of the matching function with respect to vacancies. The equilibrium is optimal when

$$1 - \beta = \eta_v, \qquad (3)$$

in other words, when the firm's bargaining power is equal to the elasticity of the matching function with respect to vacancies. Notice that with constant returns to scale in the matching function, this is equivalent to requiring that the worker's bargaining power is equal to the elasticity of the matching function with respect to unemployment. This is known as the Hosios condition [1].

Notice two things. First, optimality holds only under very specific conditions. Generically, the equilibrium is not efficient. Second, the intuition behind the Hosios condition is that to obtain an efficient allocation in equilibrium, the firm must keep a sufficient share of the surplus to compensate appropriately for posting a costly vacancy. The private marginal return to a vacancy is equal to $\frac{m(u, v)}{v}(1 - \beta)\,y$, the individual matching probability times the share of surplus. The social marginal return to a vacancy is equal to $m_2(u, v)\,y$, the product of the marginal increment in the number of matches from posting the vacancy times the output produced in a match. The Hosios condition equates the two marginal returns.

[1] With a Cobb-Douglas matching function ($m(u, v) = Au^{\alpha}v^{1-\alpha}$), optimality of equilibrium requires that $\beta = \alpha$.

2 Efficiency with an investment decision and holdup of investment

We have just established that efficiency basically does not hold in random matching models. We now investigate the question of efficiency of investment decisions (by firms). We will see that even if the Hosios condition holds, investment by firms may not be socially optimal. To help understand the issues at hand, we consider two cases: investment prior to matching and investment after matching.

2.1 Investment costs incurred before matching - holdup

Firms take market tightness as given. Prior to looking for a worker, the individual firm can acquire capital $k$ at a cost $c(k)$ to increase productivity $y(k)$. The value of search conditional on the investment decision $k$ is

$$S_f(k) = -C - c(k) + q(\theta)\left(y(k) - w(k)\right),$$

where $w(k)$ is the solution to

$$\max_w \; w^{\beta}\left(y(k) - w\right)^{1-\beta},$$

thus $w(k) = \beta\,y(k)$. The investment choice is determined by $\max_k S_f(k)$, where $q(\theta)$ is taken parametrically. In equilibrium, free entry must also hold. Thus we have a system $[dS_f(k)/dk = 0;\; S_f(k) = 0]$ to solve, which comes to

$$\frac{c'(k)}{y'(k)} = (1 - \beta)\,q(\theta), \qquad C + c(k) = (1 - \beta)\,q(\theta)\,y(k). \qquad (4)$$

The social planner's problem is to choose the number of vacancies and the investment level to maximize aggregate output net of vacancy and investment costs, i.e. to

$$\max_{k, v} \; m(u, v)\,y(k) - \left(C + c(k)\right)v.$$

The optimal solution is characterized by [2]

$$\frac{c'(k)}{y'(k)} = q(\theta), \qquad C + c(k) = \eta_v\,q(\theta)\,y(k). \qquad (5)$$

The equilibrium and optimal allocations coincide when

$$\beta = 0 \quad \text{and} \quad \eta_v = 1. \qquad (6)$$

In that case, the Hosios condition is necessary, but not sufficient. It must also be that the firm has all the bargaining power. There is a holdup problem, since the firm's investment decision is taken prior to matching with potential workers. When the worker and firm negotiate, the investment is already realized and unless the firm has all the bargaining power, the worker will share in the returns from investing. This holdup problem leads to underinvestment by the firm. We can see from (4) that the firm anticipates the upcoming holdup of its investment and recognizes that its marginal return from investment is only $q(\theta)(1 - \beta)\,y'(k)$ rather than $q(\theta)\,y'(k)$.

We show below that the investment inefficiency really comes from the fact that the investment is carried out before matching. We consider a model where investments are realized after matching and see that the Hosios condition is again necessary and sufficient for optimality.

2.2 Investment costs incurred after matching - no holdup

The value of search is

$$S_f(k) = -C + q(\theta)\left(y(k) - w - c(k)\right),$$

where wage and investment are the solutions to

$$\max_{k, w} \; w^{\beta}\left(y(k) - w - c(k)\right)^{1-\beta}.$$

The negotiated wage and investment are given by

$$\frac{c'(k)}{y'(k)} = 1, \qquad w = \beta\left(y(k) - c(k)\right).$$

Free entry requires that $S_f(k) = 0$. The equilibrium allocation is such that

$$\frac{c'(k)}{y'(k)} = 1, \qquad C = q(\theta)(1 - \beta)\left(y(k) - c(k)\right). \qquad (7)$$

[2] Notice that $m_2(u, v) = q(\theta)\,\eta_v$.

The social planner's problem is to choose the number of vacancies and the investment level to maximize aggregate output net of vacancy and investment costs, i.e. to

$$\max_{k, v} \; m(u, v)\left(y(k) - c(k)\right) - Cv.$$

The optimal solution is characterized by

$$\frac{c'(k)}{y'(k)} = 1, \qquad C = q(\theta)\,\eta_v\left(y(k) - c(k)\right). \qquad (8)$$

The Hosios condition guarantees optimality of equilibrium. Because the firm's investment is not held up by the worker during the negotiations (the private and social marginal returns to investment are both equal to $y'(k)$), we have efficiency of the investment decision and only need the Hosios condition to hold for efficiency.
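The baseline Hosios condition of section 1 (without investment) can be verified numerically. The sketch below assumes the Cobb-Douglas matching function $m(u, v) = Au^{\alpha}v^{1-\alpha}$ mentioned in the footnote, so that $q(\theta) = A\theta^{-\alpha}$ and $m_2(u, v) = A(1 - \alpha)\theta^{-\alpha}$. Function names and parameter values are hypothetical. It compares the tightness implied by free entry with the planner's choice.

```python
def equilibrium_tightness(A, alpha, beta, y, c):
    """Free entry: c = q(theta)*(1-beta)*y with q(theta) = A*theta**(-alpha)."""
    return (A * (1.0 - beta) * y / c) ** (1.0 / alpha)


def planner_tightness(A, alpha, y, c):
    """Planner FOC: c = m_2(u,v)*y with m_2 = A*(1-alpha)*theta**(-alpha)."""
    return (A * (1.0 - alpha) * y / c) ** (1.0 / alpha)


def planner_objective(v, u, A, alpha, y, c):
    """Aggregate output net of vacancy costs: m(u,v)*y - c*v."""
    return A * u ** alpha * v ** (1.0 - alpha) * y - c * v
```

The two tightness levels coincide exactly when $\beta = \alpha$. When workers' bargaining power exceeds $\alpha$, firms keep too small a share of the surplus and post too few vacancies relative to the planner; the opposite holds when $\beta < \alpha$.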

Chapter IV OPTIMAL TAXATION

1 Optimal taxation with commitment

We are interested in looking at the problem a government faces in financing its own expenditures. We will treat it as a dynamic problem, where a government has to raise distortionary taxes to finance a given exogenous stream of government expenditures.

1.1 Competitive equilibrium

Households: The problem is deterministic. There is an infinitely lived representative household. Its preferences are given by a utility function $u(c_t, l_t)$, where $c_t$ is consumption of the single good and $l_t$ is leisure. The utility function has the usual properties. It is increasing and concave in $c_t$ and $l_t$. The representative household discounts the future at rate $\beta$. The household is endowed every period with one unit of time:

$$h_t + l_t = 1$$

so that the household has to divide its time between leisure and labor $h_t$.

Firms: To produce output, firms use a technology characterized by a production function $f(h_t, k_t)$. They rent labor and capital from households, paying the capital rental rate $r_t$ and wage rate $w_t$. The function $f$ is assumed to be increasing in both arguments, concave and to exhibit constant returns to scale.

Aggregate resource constraint: The output produced in period $t$ can be consumed by households, used by the government or used to augment the capital stock, which otherwise depreciates at rate $\delta$:

$$c_t + k_{t+1} - (1 - \delta)\,k_t + g_t = f(h_t, k_t) \qquad (1)$$

Notice that $\{g_t\}_{t=0}^{+\infty}$ is taken as given.

Government: The government can raise flat-rate, time-varying taxes on capital ($\tau_t^k$) and labor income ($\tau_t^h$). It can also trade one-period bonds, which can accomplish any intertemporal trade in a deterministic economy. Denote by $b_t$ the government indebtedness. $b_t$ is denominated in time $t$ goods and matures at the beginning of period $t$. Hence, the government budget constraint is given by:

$$g_t = \tau_t^k r_t k_t + \tau_t^h w_t h_t + \frac{b_{t+1}}{R_t} - b_t \qquad (2)$$

where $R_t$ is the gross rate of return on the one-period bonds held from $t$ to $t+1$ (interest earnings are assumed to be tax-exempt).

Optimization: The firm's problem is standard. Profit maximization implies that factor prices are equal to their marginal products:

$$w_t = f_1(h_t, k_t), \qquad r_t = f_2(h_t, k_t)$$

The representative household maximizes the following objective function:

$$\max \sum_{t=0}^{+\infty} \beta^t u(c_t, 1 - h_t)$$

$$\text{s.t.} \quad c_t + k_{t+1} - (1 - \delta)\,k_t + \frac{b_{t+1}}{R_t} = \left(1 - \tau_t^h\right) w_t h_t + \left(1 - \tau_t^k\right) r_t k_t + b_t$$

We will use dynamic programming to solve the household's problem. The state variables are the capital stock $k_t$ and the initial bond holdings $b_t$. The control variables are investment $k_{t+1}$, labor supply $h_t$, and the bond holdings to bring to the next period $b_{t+1}$. We can thus rewrite the household's problem as:

$$V(k_t, b_t) = \max_{k_{t+1}, h_t, b_{t+1}} \left\{u(c_t, 1 - h_t) + \beta V(k_{t+1}, b_{t+1})\right\}$$

$$\text{s.t.} \quad c_t + k_{t+1} - (1 - \delta)\,k_t + \frac{b_{t+1}}{R_t} = \left(1 - \tau_t^h\right) w_t h_t + \left(1 - \tau_t^k\right) r_t k_t + b_t$$

The first order condition with respect to investment is:

$$u_1(c_t, 1 - h_t) = \beta V_1(k_{t+1}, b_{t+1})$$

which, combined with the envelope condition on $k_t$,

$$V_1(k_t, b_t) = \left[\left(1 - \tau_t^k\right) r_t + 1 - \delta\right] u_1(c_t, 1 - h_t)$$

results in:

$$u_1(c_t, 1 - h_t) = \beta u_1(c_{t+1}, 1 - h_{t+1})\left[\left(1 - \tau_{t+1}^k\right) r_{t+1} + 1 - \delta\right] \qquad (3)$$

The first order condition with respect to labor supply is given by:

$$\left(1 - \tau_t^h\right) w_t\,u_1(c_t, 1 - h_t) = u_2(c_t, 1 - h_t) \qquad (4)$$

Finally, the first order condition with respect to bond holdings is:

$$\frac{1}{R_t}\,u_1(c_t, 1 - h_t) = \beta V_2(k_{t+1}, b_{t+1})$$

which, combined with the envelope condition on $b_t$,

$$V_2(k_t, b_t) = u_1(c_t, 1 - h_t)$$

gives us

$$\frac{1}{R_t}\,u_1(c_t, 1 - h_t) = \beta u_1(c_{t+1}, 1 - h_{t+1}) \qquad (5)$$

Combining (3) and (5), we find that:

$$R_t = \left(1 - \tau_{t+1}^k\right) r_{t+1} + 1 - \delta \qquad (6)$$

This condition, which does not involve any quantities that the household is free to adjust, constitutes an arbitrage condition. Since only one type of financial asset is needed to accomplish all intertemporal trades in a deterministic economy, (6) ensures that the two assets available (capital and bonds) offer the same rate of return [1].

[1] If we write the household budget constraint for two consecutive periods, and eliminate $b_{t+1}$ which appears in both, we get:

$$c_t + \frac{c_{t+1}}{R_t} + \frac{k_{t+2}}{R_t} + \frac{b_{t+2}}{R_t R_{t+1}} = \left(1 - \tau_t^h\right) w_t h_t + \frac{\left(1 - \tau_{t+1}^h\right) w_{t+1} h_{t+1}}{R_t} + \left[\frac{\left(1 - \tau_{t+1}^k\right) r_{t+1} + 1 - \delta}{R_t} - 1\right] k_{t+1} + \left(1 - \tau_t^k\right) r_t k_t + (1 - \delta)\,k_t + b_t$$

If the term multiplying $k_{t+1}$ were not zero, the household could make its budget set unbounded by buying or selling an arbitrarily large $k_{t+1}$ and entering the bond market (depending on the sign of the expression). Hence, to ensure the existence of a competitive equilibrium with bounded budget sets, the arbitrage condition must hold.

Competitive equilibrium (for a given government policy): A competitive equilibrium is an allocation $\left[\{c_t\}, \{k_t\}, \{h_t\}, \{g_t\}\right]_{t=0}^{+\infty}$, a price system $\left[\{w_t\}, \{r_t\}, \{R_t\}\right]_{t=0}^{+\infty}$ and a government policy $\left[\{g_t\}, \{\tau_t^k\}, \{\tau_t^h\}, \{b_t\}\right]_{t=0}^{+\infty}$, such that:

(a) Given prices and government policies, firms and households solve their respective optimization problems,
(b) The aggregate resource constraint (1) is satisfied for all $t$,
(c) Given allocations and prices, government policies satisfy the government budget constraint (2), for all $t$.

Of course, the competitive allocation depends on the (exogenous) government policy. This motivates our interest in the following so-called Ramsey problem. In the Ramsey problem, the government's goal is to maximize the representative household's welfare, subject to raising given revenues through distortionary taxation. The government knows how people react to a given set of taxes and can therefore use that reaction function to design optimal taxes.

Ramsey problem: Given $k_0$ and $b_0$, choose a competitive equilibrium that maximizes the discounted lifetime utility of the representative household.

1.2 The Ramsey problem

The government tax revenues are $\tau_t^k r_t k_t + \tau_t^h w_t h_t$. Denote net after-tax capital and labor rental rates as $\tilde{r}_t = \left(1 - \tau_t^k\right) r_t$ and $\tilde{w}_t = \left(1 - \tau_t^h\right) w_t$, respectively. Hence, government tax revenues can be rewritten as:

$$\text{tax revenues} = (r_t - \tilde{r}_t)\,k_t + (w_t - \tilde{w}_t)\,h_t = f(h_t, k_t) - \tilde{r}_t k_t - \tilde{w}_t h_t$$

where the second equality uses constant returns to scale together with the firm's first order conditions, so that $f(h_t, k_t) = w_t h_t + r_t k_t$. Inserting this expression in the government budget constraint, we get:

$$g_t = f(h_t, k_t) - \tilde{r}_t k_t - \tilde{w}_t h_t + \frac{b_{t+1}}{R_t} - b_t$$

We thus incorporated the firm's first order conditions into the government budget constraint. The government policy choice is also restricted by the aggregate resource constraint and the first order conditions of the household's problem.

We know that the government is interested in designing a tax path $\{\tau_t^h, \tau_t^k\}$ to maximize consumers' welfare. It must first choose (once and for all) these taxes and commit to them; then the competitive equilibrium determines how households react to these taxes, in each period.
Thus, the firms' and the representative household's first order conditions, as well as the usual resource constraints, must be constraints in the government's problem. The difficulty is that the control variables in the household's problem are only implicitly defined through the first order conditions. We must therefore treat the problem as a Lagrangian one, where the government chooses taxes and the household's control variables (from section 1.1), with the restriction that the variables in the household's problem satisfy the first order conditions.

Hence, the government's problem can be written as:

$$\max_{\{\tau_t^h\}, \{\tau_t^k\}, \{k_{t+1}\}, \{h_t\}, \{b_{t+1}\}, \{c_t\}} \; \sum_{t=0}^{+\infty} \beta^t \Big\{ u(c_t, 1 - h_t) + \Phi_t\Big[f(h_t, k_t) - \tilde{r}_t k_t - \tilde{w}_t h_t + \frac{b_{t+1}}{R_t} - b_t - g_t\Big]$$

$$+\; \Theta_t\left[f(h_t, k_t) - c_t - k_{t+1} + (1 - \delta)\,k_t - g_t\right] + \mu_{1t}\left[u_2(c_t, 1 - h_t) - \tilde{w}_t\,u_1(c_t, 1 - h_t)\right]$$

$$+\; \mu_{2t}\left[u_1(c_t, 1 - h_t) - \beta u_1(c_{t+1}, 1 - h_{t+1})\left[\tilde{r}_{t+1} + 1 - \delta\right]\right] \Big\}$$

where $R_t = \tilde{r}_{t+1} + 1 - \delta$.

Since the point we want to make does not require fully solving the government's problem, we only look at one of the optimality conditions, namely, the first-order condition with respect to $k_{t+1}$:

$$\beta\left\{\Phi_{t+1}\left[f_2(h_{t+1}, k_{t+1}) - \tilde{r}_{t+1}\right] + \Theta_{t+1}\left[f_2(h_{t+1}, k_{t+1}) + 1 - \delta\right]\right\} - \Theta_t = 0 \qquad (7)$$

Suppose that government expenditures remain constant after some period ($g_t = g$, $t \geq T$) and that the solution to the Ramsey problem converges to a steady state. Then, from (7), we have that:

$$\beta\left\{\Phi\left[r - \tilde{r}\right] + \Theta\left[r + 1 - \delta\right]\right\} = \Theta$$

From (3), we have, in steady state:

$$1 = \beta\left[\tilde{r} + 1 - \delta\right]$$

which, plugged in the above equation, gives us:

$$(\Phi + \Theta)\left[r - \tilde{r}\right] = 0 \qquad (8)$$

One can easily show that the multipliers on the aggregate and government resource constraints must be positive. Therefore, (8) implies that $r = \tilde{r}$ in steady state, and thus we obtain the celebrated result that:

$$\tau^k = 0$$

Hence, if the equilibrium has a steady state, then the optimal policy is to eventually set the tax rate on capital to zero! Of course, the same conclusion does not hold for the labor income tax.

It is important to notice that this conclusion is robust to several changes. In particular, it holds whether the government can issue debt (as in the present model) or must run a balanced budget (in which case, one could follow the same proof and just set $b_t = b_{t+1} = 0$ in the Lagrangian problem). Also, notice that this strong result is not due to some efficiency considerations, as we did not consider lump-sum taxes. In fact, the only strong assumption of the model is that the government can commit to future tax rates, at time zero.
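The step from the steady-state version of (7) to (8) can be spelled out as follows. This is only a restatement of the algebra already implied by (3) and (7), with time subscripts dropped in steady state.

```latex
\begin{align*}
\beta\left\{\Phi\,[r-\tilde r] + \Theta\,[r+1-\delta]\right\} &= \Theta
  && \text{(steady-state version of (7))} \\
\Theta &= \beta\,\Theta\,[\tilde r+1-\delta]
  && \text{(multiply } 1=\beta[\tilde r+1-\delta] \text{ by } \Theta) \\
\beta\,\Phi\,[r-\tilde r] + \beta\,\Theta\,[r+1-\delta]
  - \beta\,\Theta\,[\tilde r+1-\delta] &= 0
  && \text{(substitute for } \Theta \text{ on the right)} \\
\beta\,(\Phi+\Theta)\,[r-\tilde r] &= 0
  && \text{(the terms in } 1-\delta \text{ cancel)}
\end{align*}
```

Since $\beta > 0$, dividing by $\beta$ gives (8).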
Notice that taxing the (inelastically supplied) capital stock at date 0 amounts to lump-sum taxation and hence disposes of distortionary taxes. Thus, a government without a commitment technology may be tempted in future periods to renege on its

promises and levy a confiscatory tax on capital. But, of course, households would anticipate that possibility and all the reasoning above would unravel. This leads us into the territory of time-inconsistent policies and of reputation. Can an announced fiscal policy be sustained in an equilibrium because the government wants to preserve its reputation (rather than because one assumes that the government commits to the announced policy once and for all)?

Chapter V TIME INCONSISTENCY OF GOVERNMENT POLICY AND OTHER CONSIDERATIONS

1 General considerations on the recursive nature of various problems

1.1 The standard competitive equilibrium has a recursive formulation

Consider the typical problem (footnote 1) of

$$\max E_0 \Big\{ \sum_t \beta^t u(z_t, s_t, d_t) \Big\},$$

where $z_t$ are state variables chosen by nature that follow the law of motion $z_{t+1} = f(z_t, \varepsilon_t)$. The term $\varepsilon_t$ is an innovation vector, so $z_t$ is our usual exogenous stochastic state vector. Think of $s_t$ as (a vector of) state variables under partial control of the decision maker. Each period, he selects (a vector of) decision variables $d_t$. There is a fixed technology describing the motion of $s_t$ given the actions $d_t$ of the agent and the actions $z_t$ of nature,

$$s_{t+1} = g(z_t, s_t, d_t).$$

The agent's decision rule is $d_t = h(z_t, s_t)$.

We know that, under suitable conditions, these types of problems have a recursive formulation. Intuitively, that property is due to two characteristics of the maximization problem: (i) the criterion function [expected lifetime discounted utility] is additively separable, and (ii) decisions at time $t$ only influence the returns dated $t$ and later. This influence occurs directly on $u(z_t, s_t, d_t)$ and indirectly on $u(z_{t'}, s_{t'}, d_{t'})$, $t' > t$. Taken together, these features of the problem impart to it a sequential character which permits it to be solved via a recursive procedure.

In particular, consider a finite time horizon $T$. To solve, one would start by solving the constrained optimization problem at the last period for given $\{z_T, s_T\}$, thus generating a decision rule $h_T(z_T, s_T)$ [since we are not looking at an infinite horizon problem at this point, decision rules must be indexed by time]. Armed with that result, one could solve the agent's problem in period $T-1$ using the decision rule $h_T$, generating a decision rule $h_{T-1}(z_{T-1}, s_{T-1})$. One could thus proceed recursively back to the current period. The agent can solve his problem now, knowing the decision rule he will use in any future period.
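The backward recursion just described can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: the problem is deterministic (there is no $z_t$), the state $s_t$ is an integer stock of a good, the decision $d_t$ is how much of it to consume, the law of motion is $s_{t+1} = s_t - d_t$, and period utility is $\sqrt{d_t}$. The function name `solve_backward` and all parameter values are hypothetical.

```python
import math


def solve_backward(T, max_s, beta):
    """Backward induction for a finite-horizon problem (illustrative assumptions).

    Assumed setup: state s in {0, ..., max_s}, decision d in {0, ..., s},
    law of motion s' = s - d, period utility sqrt(d), discount factor beta.
    Returns the time-indexed decision rules h[t][s] and values V[t][s].
    """
    # V[T + 1] is identically zero: nothing is valued after the last period.
    V = [[0.0] * (max_s + 1) for _ in range(T + 2)]
    h = [[0] * (max_s + 1) for _ in range(T + 1)]
    for t in range(T, -1, -1):          # last period first, then T-1, ..., 0
        for s in range(max_s + 1):
            best_d, best_val = 0, -math.inf
            for d in range(s + 1):
                val = math.sqrt(d) + beta * V[t + 1][s - d]
                if val > best_val:
                    best_d, best_val = d, val
            h[t][s] = best_d            # decision rule h_t(s)
            V[t][s] = best_val
    return h, V


if __name__ == "__main__":
    rules, values = solve_backward(T=3, max_s=6, beta=0.9)
    for t, rule in enumerate(rules):
        print("h_%d:" % t, rule)
```

In this sketch the rule for period $t$ depends only on the number of periods remaining, so solving the shorter problem that starts at a later date reproduces the tail of the original plan.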
Footnote 1: The problem can have either a finite or an infinite horizon.

It should be clear that once period $\tau$, $0 < \tau \le T$, is actually reached, the agent has no reason to use a different decision rule than the one originally prescribed. In fact, the preceding argument shows that if a sequence of policy functions $\{h_t, 0 \le t \le T\}$ is optimal, then the tail end of the plan $\{h_t, s \le t \le T\}$ is optimal for the remainder of the problem at $s > 0$. This property is known as Bellman's principle of optimality and is due to two features of our problem: (i) the criterion function is additively separable, and (ii) the problem is sequential in nature. When the horizon is infinite, we know that we can drop the time indices and that the decision

rule will be independent of time. In that case too, agents have no reason to deviate from the decision rule in future periods.

1.2 Early applications to macroeconomic policy - The Lucas Critique

The insights of optimal control theory were applied in the 1970s to the design of macroeconomic policy. We will see in this section the flaws associated with that approach, essentially an argument similar in spirit to the famous Lucas critique. Recall the problem

$$\max_{\{d_t\}} E_0 \Big\{ \sum_{t=0}^{+\infty} \beta^t u(z_t, s_t, d_t) \Big\}, \quad \text{s.t.} \quad z_{t+1} = f(z_t, \varepsilon_t), \quad s_{t+1} = g(z_t, s_t, d_t),$$

where the optimization is over control laws of the form $d_t = h(z_t, s_t)$. In applications of this setup to determining macroeconomic policy rules, $\{d_t\}$ was interpreted as a vector of policy instruments, such as tax rates, government expenditures, money supply, etc. $\{s_t\}$ was considered to be a set of endogenous variables, such as output, labor, etc. Finally, the function $g$ was an econometric model of the economy. The function $h$ was the optimal law for the macroeconomic policy variables. The idea was to implement this setup in the context of particular concrete econometric models $g$ to make quantitative statements about optimal fiscal and monetary policy rules.

These applications view the policymaking problem as a game against nature. That is, the problem assumes that the functions $f$ and $g$ are both fixed and independent of the policymaker's choice of $h$. But recall that $s_{t+1} = g(z_t, s_t, d_t)$ constitutes an econometric model of private agents' behavior. Included in the policymaker's $g$ are the decision functions of private agents, who themselves face dynamic optimization problems. The assumption that $g$ is independent of the government's choice of its $h$ is inconsistent with the notion that private agents are solving their optimization problems properly. This is the essence of the Lucas critique.
Widely used macroeconometric models should recognize that private agents' decision rules are not invariant to government policies. Thus one cannot use a given assumed macroeconometric model $g$ to analyze the effect of government policy on macroeconomic aggregates.

1.3 Modeling both government's and private agents' decisions and their interactions

These observations suggest that the single-agent decision theory outlined above is inadequate for fully analyzing the mutual interaction of the government's and private agents' decisions. For that, we need to set up a game featuring two agents, agent 1 and agent 2. We will think of agent 1 as the government and agent 2 as

being the public or the private agent (footnote 2).

Footnote 2: In many problems, we would like a single government, agent 1, to face a large collection of private agents who act approximately competitively. For this purpose, we would use an $N$-agent game in which agents $2, \ldots, N$ are the private agents. Equation (1) would be replaced by $s_{t+1} = \tilde g(z_t, s_t, d_{1t}, \ldots, d_{Nt})$ and (3)-(4) would be modified accordingly. Most of the remarks we make about the two-agent game would apply to the $N$-agent setup.

The technology is now defined as

$$s_{t+1} = g(z_t, s_t, d_{1t}, d_{2t}), \tag{1}$$

where $d_{it}$ is the control variable now set by agent $i$. We retain that

$$z_{t+1} = f(z_t, \varepsilon_t). \tag{2}$$

Consider a finite horizon $T$. Agent 1's problem is to

$$\max E_0 \Big\{ \sum_{t=0}^{T} \beta_1^t u_1(z_t, s_t, d_{1t}, d_{2t}) \Big\}, \tag{3}$$

while agent 2's problem is to

$$\max E_0 \Big\{ \sum_{t=0}^{T} \beta_2^t u_2(z_t, s_t, d_{1t}, d_{2t}) \Big\}. \tag{4}$$

We assume that at each $t$, each agent observes $\{s_t, z_t\}$. The maximization problem is over decision rules of the form $d_{it} = h_{it}(z_t, s_t)$, $i = 1, 2$ and $0 \le t \le T$. One can define the game played in one of two ways, as a Nash equilibrium or as a Stackelberg or dominant-player equilibrium.

Nash equilibrium

In the Nash equilibrium, agent $i$ is supposed to maximize his criterion function (3) or (4) subject to (1)-(2) and knowledge of the sequence of decision rules $h_{-i,t}$, $t = 0 \ldots T$, of the other agent. The maximization is carried out taking as given the $h_{-i,t}$ of the other agent, so that agent $i$ assumes that his choice of the sequence of functions $h_{it}$ has no effect on the decision rules $h_{-i,t}$, $t = 0 \ldots T$. A Nash equilibrium is a pair of sequences of functions $\{h_{1t}\}$, $\{h_{2t}\}$, $t = 0 \ldots T$, such that $h_{1t}$ maximizes

$$E_0 \Big\{ \sum_{t=0}^{T} \beta_1^t u_1\big(z_t, s_t, d_{1t}, h_{2t}(z_t, s_t)\big) \Big\}, \tag{5}$$

$$\text{s.t.} \quad s_{t+1} = g\big(z_t, s_t, d_{1t}, h_{2t}(z_t, s_t)\big), \quad d_{1t} = h_{1t}(z_t, s_t), \quad z_{t+1} = f(z_t, \varepsilon_t),$$

while $h_{2t}$ maximizes

$$E_0 \Big\{ \sum_{t=0}^{T} \beta_2^t u_2\big(z_t, s_t, h_{1t}(z_t, s_t), d_{2t}\big) \Big\}, \tag{6}$$

$$\text{s.t.} \quad s_{t+1} = g\big(z_t, s_t, h_{1t}(z_t, s_t), d_{2t}\big), \quad d_{2t} = h_{2t}(z_t, s_t), \quad z_{t+1} = f(z_t, \varepsilon_t).$$

The Nash equilibrium of this game is known to have the property that the principle of optimality applies to the maximization problem of each player. This can be verified by noticing that problem (5) or (6) is a version of the single-agent maximization problem studied in section 1.1. This means in particular that one can use recursive methods to compute the Nash equilibrium decision rules. The fact that, in a Nash equilibrium, each agent's problem satisfies the principle of optimality means that each agent has an incentive to adhere to the sequence of decision rules that he initially chooses. This is true so long as the assumptions about each agent's perception of the independence of the other agent's decisions from his own decisions remain valid.

Stackelberg equilibrium

A second concept is the Stackelberg or dominant-player equilibrium. The leader, player 1, say the government, is assumed to take into account the mapping between its current and future policies at a given time $t$, $\{h_{1s}, s \ge t\}$, and the follower's (private agent 2) response at date $t$, i.e. the government anticipates that $h_{2t} = T_t(h_{1s}, s \ge t)$. Thus, the government chooses $\{h_{1t}, t = 0 \ldots T\}$ to maximize

$$E_0 \Big\{ \sum_{t=0}^{T} \beta_1^t u_1\big(z_t, s_t, h_{1t}(z_t, s_t), T_t(h_{1s}, s \ge t)(z_t, s_t)\big) \Big\}, \tag{7}$$

$$\text{s.t.} \quad s_{t+1} = g\big(z_t, s_t, h_{1t}(z_t, s_t), T_t(h_{1s}, s \ge t)(z_t, s_t)\big), \quad z_{t+1} = f(z_t, \varepsilon_t). \tag{8}$$

This expresses the fact that the government is choosing the sequence $\{h_{1t}\}$ taking into account the effect of this choice on the private agent's sequence of decisions. The private agent is behaving in the same fashion as described for the Nash equilibrium. That is, he is solving his maximization problem taking the sequence of functions $\{h_{1t}\}$ as given.
A dominant-player or Stackelberg equilibrium is a pair of sequences $\{h_{1t}, t = 0 \ldots T\}$ and $\{h_{2t}, t = 0 \ldots T\}$ such that $\{h_{2t}, t = 0 \ldots T\}$ maximizes the follower's criterion function (6) given the sequence $\{h_{1t}\}$, and such that $\{h_{1t}, t = 0 \ldots T\}$ maximizes the leader's criterion (7) subject to (8) and given the mappings $h_{2t} = T_t(h_{1s}, s \ge t)$.

In the dominant-player equilibrium, the follower's problem is readily shown to satisfy the principle of optimality. However, the leader's problem does not satisfy the principle of optimality. The reason is that, via the mappings $T_t$, the functions $h_{1t}$ influence the returns of agent 1 for dates earlier than $t$. This means that the problem ceases to be a sequential one to which the principle of optimality applies. The reason that the principle of optimality fails to hold for the leader's problem is the appearance of future values of his own

policy functions $h_{1s}$ in the current return or utility function at date $t < s$. Future $h_{1s}$'s appear in current utility through the mappings $h_{2t} = T_t(h_{1s}, s \ge t)$, which summarize the influence of the leader's current and future policy functions $h_{1s}$ on the follower's current decision rules. In essence, the principle of optimality fails to hold for the leader's problem because he is taking account of the fact that he is playing a dynamic game against an intelligent agent who is reacting in systematic ways to his own choice of $h_{1s}$. In particular, in choosing the policy functions $h_{1s}$, the leader takes into account the influence of his choice on the follower's choices in earlier periods.

One consequence is the following. Suppose that a particular sequence of functions $\{h_{1t}\}$ is optimal for the problem (7) starting at date $t$. It is not in general true that the tail end of the plan is optimal for the remainder of the problem, starting at $s > t$. This means that in general, at future points in time, the leader has an incentive to depart from a previously planned sequence of policy functions. Some authors, in particular Kydland and Prescott (JPE 1977), which we study in section 2, refer to this situation as the time inconsistency of optimal plans. At future points in time, the leader has an incentive to depart from a previously planned optimal strategy $h_{1t}$ and to employ some different sequence of policy functions for the tail of the problem. However, if a leader actually gives in to the temptation to abandon the initially optimal rules $h_{1t}$ in favor of new rules, this invalidates the assumptions used by the follower in solving his problem. Once the follower catches on to this fact, the follower has an incentive not to behave as originally predicted, leading to a breakdown in the ability either to predict the behavior of the follower or to make an analytic statement about optimal government policy. In other terms, all hell breaks loose!
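A two-period numerical sketch may help fix ideas. Everything in it is an assumption made for illustration rather than part of the argument above: the follower's date-1 action is $x_1 = 1 - p_2$, where $p_2$ is the leader's date-2 policy as anticipated at date 1, and the leader's payoff is $-(x_1 - 1)^2 - (p_2 - 0.5)^2$. The function names and the grid are hypothetical.

```python
def leader_payoff(x1, p2):
    """Assumed leader payoff: dislikes x1 below 1 and p2 away from 0.5."""
    return -(x1 - 1.0) ** 2 - (p2 - 0.5) ** 2


def follower_response(p2_expected):
    """Assumed follower rule: the date-1 action falls with the expected date-2 policy."""
    return 1.0 - p2_expected


# Candidate date-2 policies on a grid between 0 and 1.
GRID = [i / 1000 for i in range(1001)]


def commitment_policy():
    """Leader picks p2 at date 0, internalizing its effect on x1."""
    return max(GRID, key=lambda p2: leader_payoff(follower_response(p2), p2))


def reoptimized_policy(x1):
    """Leader picks p2 at date 2, once x1 is already sunk."""
    return max(GRID, key=lambda p2: leader_payoff(x1, p2))


if __name__ == "__main__":
    p_commit = commitment_policy()
    x1_commit = follower_response(p_commit)
    p_later = reoptimized_policy(x1_commit)
    print("announced policy:", p_commit)
    print("policy preferred once x1 is sunk:", p_later)
    # If the follower anticipates the re-optimization, x1 adjusts accordingly.
    x1_tc = follower_response(p_later)
    print("payoff with commitment:", leader_payoff(x1_commit, p_commit))
    print("payoff without commitment:", leader_payoff(x1_tc, p_later))
```

Under these assumed functional forms, the policy the leader would like to announce differs from the one he prefers once the follower's date-1 action is sunk, and the outcome in which the follower anticipates the re-optimization gives the leader a lower payoff than the committed plan.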
This section was meant as an introduction to issues of time inconsistency. The next section develops models where time inconsistency arises.

2 Time inconsistent policies: some examples from the literature

2.1 Rules rather than discretion: the inconsistency of optimal plans

The model is derived from a celebrated paper by Kydland and Prescott (JPE 1977). I am reporting here just the basic points. The paper itself reports some applications. The point of the paper is to show that a time-consistent plan is typically sub-optimal. What does this mean? We first need to set up some notation, as well as the definitions of optimal (which is standard) and time-consistent (which is new). Let us start by quoting the abstract from Kydland and Prescott:

Even if there is an agreed-upon, fixed social objective function and policymakers know the timing and magnitude of the effects of their actions, discretionary policy, namely, the selection of that decision which is best, given the current situation and a correct evaluation of the end-of-period position, does not result in the social objective function being maximized. The reason for this

apparent paradox is that economic planning is not a game against nature but, rather, a game against rational economic agents. We conclude that there is no way control theory can be made applicable to economic planning when expectations are rational.

We illustrate that point by defining optimal and time-consistent policies and showing on a simple example why the two notions do not coincide. Take a time horizon $T \in \mathbb{N} \cup \{+\infty\}$ (footnote 3). Let $\pi = (\pi_1, \ldots, \pi_T)$ be a sequence of policies for periods 1 to $T$ and $x = (x_1, \ldots, x_T)$ be the corresponding sequence of economic agents' decisions. An agreed-upon social objective function

$$S(x_1, \ldots, x_T, \pi_1, \ldots, \pi_T), \tag{9}$$

is assumed to exist. Agents' decisions in period $t$ depend upon all policy decisions and their past decisions,

$$x_t = X_t(x_1, \ldots, x_{t-1}, \pi_1, \ldots, \pi_T), \quad t = 1, \ldots, T. \tag{10}$$

Definition 1 An optimal policy, if it exists, is that feasible $\pi$ which maximizes (9) subject to constraints (10).

Thus for the optimal policy, the government chooses its policy once and for all.

Definition 2 A policy $\pi$ is time-consistent if, for each time period $t$, $\pi_t$ maximizes (9), taking as given previous decisions, $x_1, \ldots, x_{t-1}$, and that future policy decisions ($\pi_s$ for $s > t$) are similarly selected.

In that case, policies are chosen sequentially. The inconsistency of the optimal plan is easily demonstrated by a two-period example. The second-period policy of the optimal plan is determined by

$$\max_{\pi_2} S(x_1, x_2, \pi_1, \pi_2), \quad \text{s.t.} \quad x_1 = X_1(\pi_1, \pi_2), \quad x_2 = X_2(x_1, \pi_1, \pi_2).$$

Assuming differentiability and an interior solution, this gives us the following first order condition

$$\frac{\partial S}{\partial x_1} \frac{\partial X_1}{\partial \pi_2} + \frac{\partial S}{\partial x_2} \Big[ \frac{\partial X_2}{\partial x_1} \frac{\partial X_1}{\partial \pi_2} + \frac{\partial X_2}{\partial \pi_2} \Big] + \frac{\partial S}{\partial \pi_2} = 0,$$

which rearranged gives us

$$\frac{\partial S}{\partial x_2} \frac{\partial X_2}{\partial \pi_2} + \frac{\partial S}{\partial \pi_2} + \frac{\partial X_1}{\partial \pi_2} \Big[ \frac{\partial S}{\partial x_1} + \frac{\partial S}{\partial x_2} \frac{\partial X_2}{\partial x_1} \Big] = 0.$$

Footnote 3: The definition of time consistency presented will really only apply to the finite horizon case. However, the paper by Kydland and Prescott presents an extension of the concept to the infinite horizon case.
It basically requires that it be best to use now the same policy expected to be used in the future.

The second-period policy of the time-consistent plan is determined by

$$\max_{\pi_2} S(x_1, x_2, \pi_1, \pi_2), \quad \text{given } (x_1, \pi_1), \quad \text{s.t.} \quad x_2 = X_2(x_1, \pi_1, \pi_2).$$

The first order condition is given by

$$\frac{\partial S}{\partial x_2} \frac{\partial X_2}{\partial \pi_2} + \frac{\partial S}{\partial \pi_2} = 0,$$

which typically gives a different solution than the optimal problem unless either $\partial X_1 / \partial \pi_2 = 0$ or $\frac{\partial S}{\partial x_1} + \frac{\partial S}{\partial x_2} \frac{\partial X_2}{\partial x_1} = 0$, i.e. unless $\pi_2$ has no effect on $x_1$ or the combined effect of a change of $x_1$ on $S$ is nil. It should be apparent from the two maximization problems that the time-consistent policy ignores the effect of $\pi_2$ on $x_1$ (footnote 4).

We thus see that the use of discretion (as opposed to rules) in setting policies can lead to suboptimal outcomes. This indicates that access to a commitment technology (following a rule) that binds the government not to choose sequentially has value. As explained by Kydland and Prescott, the reason that such time-consistent policies are suboptimal is not due to myopia. The effect of this decision upon the entire future is taken into consideration. Rather, the suboptimality arises because there is no mechanism to induce future policymakers to take into consideration the effect of their policy, via the expectations mechanism, upon current decisions of agents.

It is also another illustration that standard dynamic programming techniques are not appropriate in the context of optimal policy design. The intuition is that we use standard dynamic programming techniques in situations where current outcomes and the movement of the state variables depend only upon current and past policy decisions and the current states. But they cannot be used when agents' decisions also depend on their expectations of future policies (footnote 5).

2.2 Inflationary bias of monetary policy

The material presented comes from Shouyong Shi's course notes. See his notes attached to the back.

2.3 A tax example

This subsection builds on Barro (1989) (footnote 6). Let us start by presenting a simple model.
We will then sketch a generalization of the results from that simple model.

Footnote 4: The two maximization problems for the first-period policy $\pi_1$ are the same: $\max_{\pi_1} S(x_1, x_2, \pi_1, \pi_2)$, s.t. $x_1 = X_1(\pi_1, \pi_2)$, $x_2 = X_2(x_1, \pi_1, \pi_2)$.

Footnote 5: Notice that the argument does not require that agents perfectly forecast future policies, only that they realize that the government has the option of changing policy in the future.

Footnote 6: Modern Business Cycle Theories, edited by R. Barro. Chapter 7, Time consistency and policy.

Consider an economy with a large number of identical consumers and a government. There is a linear production technology for which the marginal product of capital is a constant $R > 1$ and the marginal product of labor is 1. Consumers make decisions at two distinct points in time, the first stage and the second stage. They make consumption-investment decisions at the first stage and consumption-labor supply decisions at the second stage. At the first stage, consumers are endowed with $\omega$ units of the consumption good, from which they consume $c_1$ and save $k$. At the second stage, they consume $c_2$ and work $l$ units. Second-stage income, net of taxes, is $(1 - \tau_k) R k + (1 - \tau_h) l$, where $\tau_k$ and $\tau_h$ denote the tax rates on capital and labor respectively. For simplicity, we assume that first-stage consumption is a perfect substitute for second-stage consumption. A consumer, confronted with tax rates $\tau_k$ and $\tau_h$, chooses $(c_1, k; c_2, l)$ to solve

$$\max U(c_1 + c_2, l) \tag{11}$$

$$\text{s.t.} \quad c_1 + k \le \omega, \quad c_2 \le (1 - \tau_k) R k + (1 - \tau_h) l.$$

If the tax rate on capital $\tau_k$ is set so that $(1 - \tau_k) R = 1$, the consumer is indifferent about the timing of consumption. In such a case, the consumer saves his entire endowment.

The government sets proportional tax rates on capital and labor income to finance an exogenously given amount of second-stage per capita government spending $G$. The government's budget constraint is thus

$$G \le \tau_k R K + \tau_h L, \tag{12}$$

where, as usual, uppercase symbols represent aggregate values. Assume that $G > R\omega$, so that even if consumers save their entire endowments and the tax on capital is set equal to one, the government still needs to tax labor.

Capital taxation with commitment:

In an economy with commitment, the government sets tax rates before private agents make their decisions. Let $x_1 = (c_1, k)$ and $x_2 = (c_2, l)$ denote an individual consumer's first- and second-stage allocations. Let $\pi = (\tau_k, \tau_h)$ denote government policy.
Definition 3 A competitive equilibrium $(X, \pi)$ is an individual allocation $(x_1, x_2)$, an aggregate allocation $(X_1, X_2)$ and a tax policy $\pi$ that satisfy

1. Given the tax policy $\pi$, the individual allocations solve the consumer problem (11),
2. At the aggregate allocation $(X_1, X_2)$, the policy $\pi$ satisfies the government budget constraint (12),
3. The individual and aggregate allocations coincide, $(x_1, x_2) = (X_1, X_2)$.

Let $E$ denote the set of policies $\pi$ for which an equilibrium exists. Assume that for each $\pi$ in $E$, there is a unique equilibrium allocation $X(\pi)$. Let $S(\pi, X(\pi))$ denote the equilibrium value of utility under the policy $\pi$, so that $S(\pi, X(\pi)) = U(C_1(\pi) + C_2(\pi), L(\pi))$.

A pair $(\pi, X)$ is a Ramsey equilibrium if $\pi$ solves $\max_{\pi \in E} S(\pi, X(\pi))$ and $X = X(\pi)$.

Proposition 4 The Ramsey equilibrium $(\pi, X)$ has first-stage allocation $C_1 = 0$ and $K = \omega$ and a capital tax rate $\tau_k = (R - 1)/R$.

Proof. If the tax on capital is such that $(1 - \tau_k) R \ge 1$, then consumers save their entire endowments, while if $(1 - \tau_k) R < 1$, then they save nothing. The tax on capital acts like a lump-sum tax when it is selected at any level less than or equal to $(R - 1)/R$. Thus it is optimal to raise as much revenue as possible from this tax. Since $G > R\omega$, government spending is greater than the maximum possible revenues from this capital tax, so it is optimal to set $\tau_k = (R - 1)/R$. Consumers thus save their entire endowments. The tax rate on labor $\tau_h$ is then set at a level sufficient to raise the rest of the revenues needed to finance $G$.

Capital taxation without commitment:

The lack of commitment is modeled by assuming that the government does not set policy until after consumers have made their first-stage decisions. The timing is (i) consumers make first-stage decisions, (ii) the government sets tax policy, and (iii) consumers make second-stage decisions. Thus, government tax rates depend on the aggregate first-stage decisions. A government policy is no longer a pair of tax rates, but a specification of tax rates for every possible $X_1$, $\sigma(X_1) = [\tau_k(X_1), \tau_h(X_1)]$. Each consumer's second-stage decisions depend on the first-stage decisions $x_1$ (and $X_1$) and the tax policy selected. These second-stage decisions are described by a pair of functions $f_2(x_1, X_1, \pi) = [c_2(x_1, X_1, \pi), l(x_1, X_1, \pi)]$.

An equilibrium in this environment is defined recursively. First, a second-stage competitive equilibrium is defined, given the history of past decisions by consumers and the government. We consider symmetric histories so that $x_1 = X_1$. The resulting allocations are used to define the problem faced by the government. Next, the first-stage competitive equilibrium is defined.
Combining all of these gives a time-consistent equilibrium.

Second-stage equilibrium:

A competitive equilibrium at the second stage, given the history $(x_1, X_1, \pi)$, is a set of allocation rules $f_2(x_1, X_1, \pi)$ that satisfy

1. Given the history $(x_1, X_1, \pi)$, the individual allocation solves
$$\max_{c_2, l} U(c_1 + c_2, l), \quad \text{s.t.} \quad c_2 \le (1 - \tau_k) R k + (1 - \tau_h) l.$$
2. Equality of individual and aggregate allocations.

Government's problem:

Given the past aggregate decisions $X_1$ and knowing that future decisions are selected according to a rule $f_2$ (or $F_2$) derived from the above problem, the government selects a policy $\pi = \sigma(X_1)$ that maximizes consumer welfare. The government's objective function is $U(C_1 + C_2(X_1, \pi), L(X_1, \pi))$, subject to its own budget constraint.

First-stage equilibrium:

Each consumer chooses an allocation for the first stage $x_1 = (c_1, k)$, together with an allocation rule $f_2(x_1, X_1, \pi)$ for actions in the second stage. Each consumer takes as given $X_1$ and that future policy is set according to the plan $\sigma$. The definition of the first-stage equilibrium is analogous to the definition of the second-stage equilibrium.

We have thus defined a time-consistent equilibrium, with sequential rationality built into the behavior of both the private agents and the government.

Proposition 5 The time-consistent equilibrium has first-stage allocations $C_1 = \omega$ and $K = 0$ and a capital tax plan $\tau_k(X_1) = 1$.

Proof. Consider the policy plan $\sigma$. For any $X_1 = (C_1, K)$, it is optimal for the government to raise as much revenue as possible from taxing the given amount of capital. By assumption $G > R\omega$, so even if all the endowments are saved and the resulting capital is fully taxed, the revenues fall short of total spending. Thus $\tau_k(X_1) = 1$. Faced with such a tax, it is optimal for consumers to save nothing and consume all their endowments. The difference with the previous proof is that, now that $X_1$ is taken as given, the incentive is to tax capital fully, regardless of its level.

Proposition 6 The utility level in the time-consistent equilibrium is strictly lower than in the Ramsey equilibrium.

Proof. Let us compare the second stages of the Ramsey and the time-consistent equilibria. We already established that in the Ramsey equilibrium, $c_1 = 0$, $k = \omega$ and $\tau_k = (R - 1)/R$. Thus the second-stage Ramsey allocations $(c_2, l)$ and $\tau_h$ solve

$$\max U(c_2, l), \quad \text{s.t.} \quad c_2 \le \omega + (1 - \tau_h) l, \quad -\frac{U_l}{U_c} = 1 - \tau_h, \quad G \le (R - 1)\omega + \tau_h l.$$
The government chooses the tax rate to maximize the consumer's welfare, subject to its own budget constraint, with the additional constraints that $(c_2, l)$ must satisfy the consumer's first order conditions.

We also established that the first-stage time-consistent allocation is $c_1 = \omega$ and $\tau_k = 1$. Then the government's problem in the time-consistent case is to set $\tau_h$ and $l$ to solve

$$\max U(\omega + (1 - \tau_h) l, l), \quad \text{s.t.} \quad -\frac{U_l}{U_c} = 1 - \tau_h, \quad G \le \tau_h l.$$

The maximization problems for the time-consistent and the Ramsey second-stage allocations are essentially the same, except that the former implicitly has a higher level of government expenditures to finance with the labor tax, $G$ as opposed to $G - (R - 1)\omega$. So the Ramsey allocation yields higher utility than the time-consistent allocation.

2.4 Can we improve from the time-consistent equilibrium?

Here we will only briefly sketch an argument of how, even without a commitment technology, one can improve on the time-consistent equilibrium seen in section 2.3. This is based on Chari and Kehoe (1990) (footnote 7). Our notion of sustainable plans is that policies be sequentially rational. That is, the policy rules must maximize the social welfare function at each date given that agents behave optimally. Likewise, optimality on the part of private agents requires that they forecast future policies as being sequentially rational for society.

One can also allow for allocations and policies to depend on the entire history of past decisions by governments as well as on past allocations. Thus, policies and allocations are history-contingent functions. When we impose sequential rationality, both government and consumers must be able to predict how current decisions affect future outcomes. Allowing for history dependence solves this forecasting problem.

For finite-horizon models, one can use backward induction. It turns out that there is a unique time-consistent equilibrium, which is the sequence of the single-period time-consistent equilibria. Of course, one cannot use backward induction with infinite horizon models. There is actually a large set of time-consistent equilibria. These equilibria are based on trigger strategies.
Define autarky as playing the single-period time-consistent plan defined in section 2.3, regardless of past history, i.e. reverting forever to it. In fact, autarky is the worst time-consistent equilibrium. The trigger strategy equilibria specify playing a pair of sequences $(\pi, X)$ as long as these have been played in the past, and reverting to autarky if one of the parties has deviated. Of course, a few conditions are attached to the type of sequences that can be so sustained. In particular, $(\pi, X)$ must be a competitive equilibrium. However, it can be shown that for a high enough discount factor, even the Ramsey allocation can be supported. The idea is that the players will stick to an established pattern of behavior, even without a commitment technology, because deviating from that pattern would be too costly, since it means returning to autarky. A credible policy is thus one that is in the government's own interest to maintain.

Footnote 7: Sustainable Plans, JPE
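The role of the discount factor in the trigger-strategy argument can be illustrated with a stylized check. This is a sketch under assumptions that are not taken from Chari and Kehoe: the government receives a constant per-period payoff `u_plan` while the plan is followed, a one-time payoff `u_deviate` in the period it deviates, and `u_autarky` in every period after a deviation, with `u_deviate > u_plan > u_autarky`. The function names and numbers are hypothetical.

```python
def is_sustainable(beta, u_plan, u_deviate, u_autarky):
    """One-shot deviation check under reversion to autarky (stylized).

    Following the plan forever is worth u_plan / (1 - beta). Deviating
    yields u_deviate today and u_autarky in every later period.
    """
    stay = u_plan / (1.0 - beta)
    cheat = u_deviate + beta * u_autarky / (1.0 - beta)
    return stay >= cheat


def critical_discount_factor(u_plan, u_deviate, u_autarky):
    """Smallest beta for which the plan is sustainable.

    Obtained by setting stay == cheat above and solving for beta;
    assumes u_deviate > u_plan > u_autarky.
    """
    return (u_deviate - u_plan) / (u_deviate - u_autarky)


if __name__ == "__main__":
    # Hypothetical payoffs: plan = 2, best deviation = 3, autarky = 1.
    print(critical_discount_factor(2.0, 3.0, 1.0))
    for beta in (0.4, 0.6):
        print(beta, is_sustainable(beta, 2.0, 3.0, 1.0))
```

In this stylized setting a patient enough government (a discount factor at or above the critical value) prefers to keep to the plan, which is the sense in which a high discount factor can support outcomes better than autarky.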


Chapter 8 A POSITIVE THEORY OF INFLATION

We now turn to the issue of inflation. As discussed in Chapter 1, inflation rates differ remarkably across countries and across different periods of the same country. The presence of high inflation in the 1970s pressed many industrialized countries to make inflation control one of the most important policy targets in the past decade. Most countries succeeded, but many believe that the success came with a high price tag. For example, many people in Canada think that loosening the monetary policy to allow moderate inflation in Canada can stimulate output. In this and the next few chapters, we will see how expectations and policies interact to affect inflation and why there is some temptation to allow inflation to rise.

8.1 Active Monetary Policies

The starting point of our analysis is the theory that anticipated changes in the money supply are neutral but unanticipated changes in the money supply may not be neutral. Thus, a government may have the incentive to surprise the private market by active monetary policies. How effective these active policies are in real terms depends on the degree to which the policies are unanticipated. On the other hand, the degree to which the private sector anticipates the government policies depends in turn on the government's actions. Thus, there is a two-way relation between the government's policy and the private sector's expectations.

To rule out wild cases, we will assume that the government is benevolent in the sense that it tries to maximize the welfare of the agents in the economy (so as to get elected?). That is, the government's and the private agents' objective functions are the same. Even in this case, the government may pursue active monetary policies for various reasons.

• Difference in assessment. The government may disagree with the private sector on certain things. For example, the government may believe that the potential

GDP is too low, or the natural rate of unemployment is too high. In this case, the government may want to surprise the private sector by increasing the money supply in order to raise employment and output (at least temporarily). This difference in assessment is realistic. As examined in the last few chapters, the unemployment rate in Canada has been about 9.5% for a number of years. Since there have been no major structural changes or shocks during this period, one may think that this unemployment rate is close to the natural rate. The government clearly thinks that this rate is too high.

• Credibility constraint. Certain policies may be time-inconsistent and thus not credible. A zero inflation target can be such a policy.

Example 1 To understand this concept, consider an example where a parent tries to advise an obese child not to overeat. Here both the parent and the child have the same objective. Let us suppose that the child just had a full meal but he still craves the chocolate ice-cream in the refrigerator. For the well-being of the child, it is best for the parent to use reasonable means to prevent the child from eating that ice-cream. Suppose that the parent in this example uses the following threat: "If you eat that ice-cream, you will not have anything to eat in the next five days." This extreme threat is desirable ex ante (although it may be illegal): if it prevents the child from eating the ice-cream now, it improves the child's well-being. The problem is that it is not time consistent and hence not credible. To find why it is not credible, let us try to answer the following question: if the child cannot resist the craving and somehow manages to get the ice-cream and eat it, is it optimal ex post to carry out the action in the threat? The answer is clearly No. The child already ate the ice-cream and his health might have been harmed by that. If the parent prevented him from having regular meals in the next five days, the child would suffer even more. Thus, the parent loses the incentive to carry out the threat once the undesirable action takes place.
The ex post incentive differs from the ex ante incentive, which makes the threat incredible.

The same time-inconsistency problem exists in policy making, and in particular in monetary policy making. Let us suppose that zero inflation is desirable to society and so the government announces that it intends to achieve such a target. In response to this announcement, wage contracts are made in the private sector. Since the inflation rate is expected to be zero, the real wage rate equals the nominal wage rate. Now, given that the wage contracts have been made with zero expected inflation, does the government want to carry out the zero inflation policy? The answer is likely No. By increasing the inflation rate above zero, the government can reduce the real wage rate and stimulate output (temporarily). The private sector anticipates this ex post incentive of the government and so would not believe the zero inflation target in the first place.

To illustrate such a time-inconsistency problem in a slightly more formal way, let us utilize the Phillips curve, which argues for a negative relationship between unemployment and the inflation rate. That is, the employment level is positively related to the inflation rate. It is well known that the Phillips curve is not stable and shifts when expectations of inflation change. We augment the Phillips curve by incorporating inflation expectations. Let $x$ be the employment level, $\pi$ be the actual inflation rate and $\pi^e$ be the expected inflation rate (all in logarithmic terms). The augmented Phillips curve, which we should term the Lucas supply curve, is

$x = x^* + (\pi - \pi^e)$.    (8.1)

The number $x^*$ is the natural level of employment and so $(1 - x^*)$ can be understood as the natural rate of unemployment. The Lucas supply curve states that the actual employment level equals the natural level if the expected inflation equals the actual inflation. If $\pi > \pi^e$, employment can be increased beyond the natural level.

The government in this economy tries to minimize inflation and achieve a socially optimal employment level, $x_0$. To make the examination interesting, we will assume that this socially optimal level of employment exceeds the natural level of employment $x^*$, for the reason described above. If the actual employment level differs from the socially optimal level, the economy suffers a social loss. (Why would it involve a social loss if $x > x_0$?) If there is inflation, there is also a social loss. The total welfare loss is captured by the following loss function:

$L(\pi, x) = \pi^2 + \lambda (x - x_0)^2$.    (8.2)

The parameter $\lambda > 0$ describes the government's relative attitude towards employment.
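The two building blocks (8.1) and (8.2) can be written down directly. The following is a minimal sketch; the function names and the numerical values (`x_star = 0.90`, `x_opt = 0.95`, `lam = 2.0`) are illustrative assumptions and are not taken from the text.

```python
def employment(pi, pi_e, x_star):
    """Lucas supply curve (8.1): x = x* + (pi - pi_e)."""
    return x_star + (pi - pi_e)


def loss(pi, x, x_opt, lam):
    """Welfare loss (8.2): L = pi^2 + lam * (x - x_opt)^2."""
    return pi ** 2 + lam * (x - x_opt) ** 2


# Assumed, purely illustrative parameter values.
x_star, x_opt, lam = 0.90, 0.95, 2.0

# Fully anticipated inflation leaves employment at the natural level.
x_anticipated = employment(0.02, 0.02, x_star)

# Surprise inflation (pi > pi_e) pushes employment above the natural level.
x_surprise = employment(0.03, 0.01, x_star)

print(x_anticipated, x_surprise)
print(loss(0.0, x_star, x_opt, lam))  # loss at zero inflation and x = x*
```

Under these assumed numbers, only the unanticipated part of inflation moves employment, which is the property the rest of the chapter relies on.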
In the extreme case $\lambda \to 0$, the government does not care about the employment level and only cares about the cost of inflation.

The government tries to set the inflation rate to minimize the social welfare loss $L$. The government's performance depends on the private sector's expectation of inflation. This is because the actual employment level $x$ depends on the expected inflation $\pi^e$ (see (8.1)). The government can influence the private sector's expectations by choosing a particular inflation policy but it cannot directly control the expected inflation rate. Thus, the government tries to minimize the social loss by choosing the policy $\pi$, taking the expected inflation $\pi^e$ as given. Loosely speaking, the policy will be a function of the expected inflation rate, say, $\pi = f(\pi^e)$.

The expectations cannot be arbitrary either. They must be rational. In the current case where no uncertainty exists, the rational expectations hypothesis requires that the expected inflation rate be equal to the actual inflation rate, i.e., $\pi^e = \pi$. The two-way relationship between the government's policy $\pi$ and the private sector's expectations $\pi^e$ forms a standard game and the equilibrium inflation rate is a Nash solution to the game.

Before getting into the details of the game, let us find the socially optimal policies, the first-best policies. Since $\lambda > 0$, the two terms in the loss function are both non-negative. The minimum value of the loss function is thus zero, which can be achieved by the first-best policy $\pi = 0$ and $x = x_0$. Unfortunately, this combination of inflation and employment is infeasible under rational expectations. If $\pi = 0$ and $x = x_0$, (8.1) would imply $\pi^e = x^* - x_0 < 0$. That is, the expected inflation rate would have to be below the actual inflation rate, which violates the rational expectations requirement. Since the first-best cannot be achieved, the best we can hope for is a second-best.

8.2 Second-Best Monetary Policy: Commitment

The second-best policy induces zero inflation but generates an employment level that is lower than the socially optimal level $x_0$. In this section we show that a government can achieve zero inflation if it has the technology to commit to such a policy.

Example 2. To illustrate the importance of commitment, let us first reconsider the ice-cream example and allow the parent to have a commitment "technology". After making the threat explicit, suppose that the parent can program a robot to lock up all the food for five days if the child eats the ice-cream.
The program cannot be changed once it is set, and so the parent is committed to carrying out the threat. In this case, if the child happens to eat the ice-cream, the parent will have the desire not to carry out the threat, as before. However, the matter is not in the parent's hands but in the robot's, and the threat will be carried out anyway. Now, will the child eat the ice-cream? If he regards five days without food as a severe punishment, it is likely that he will not eat the ice-cream. Thus, using the commitment technology makes the commitment successful and improves the welfare of both the parent and the child.

Now consider monetary policy. An example of a commitment technology is law. That is, the government can make the zero inflation target a law and allow an independent branch to enforce it (the enforcement must be independent, otherwise the government would have the incentive to change it ex post). Let us see how zero inflation can be achieved with this new commitment technology. The policy game between the private sector and the government takes three stages:

• Stage 1: The government announces an inflation rate $\pi^a$ AND writes it into law;
• Stage 2: The private sector forms rational expectations $\pi^e$ and writes wage contracts;
• Stage 3: The law is enforced independently.

Since the announced policy will be enforced, the actual inflation rate will be equal to the announced inflation rate: $\pi = \pi^a$. The target is credible and the private sector expects $\pi^e = \pi^a = 0$. With these expectations, the private sector makes wage contracts which, by the Lucas supply curve, yield an employment level $x = x^*$. That is, the employment level equals the natural level of employment and is below the socially optimal level of employment. The welfare loss is

$L(0, x^*) = \lambda (x^* - x_0)^2 > 0 = L(0, x_0)$.

Since the welfare loss under the first-best policy is $L(0, x_0)$, the current policy $(\pi, x) = (0, x^*)$ yields a higher welfare loss than the first-best policy does and hence is a second best. Nevertheless, it does produce a zero inflation rate.

A distinguishing feature of the equilibrium under commitment is that the actual inflation rate always equals the announced inflation rate. This is true because the announced target is made credible by the commitment technology. Without the commitment technology, the government can still promise to commit to the zero inflation target but no one would believe it, because everyone knows that there is an incentive for the government to change the policy ex post. We examine this incentive in the next section.

8.3 Third-Best Policy: Discretion

Suppose that the commitment technology is not available and so the policy enforcement is left to the discretion of the same government that makes the policy. We show that there is an ex post gain for the government to deviate from a zero inflation target, which renders its zero inflation target incredible in the first place and leads to a third-best outcome. This third-best outcome is $\pi > 0$ and $x = x^*$.

Again, consider the three stages of the game played between the private sector and the government:

• Stage 1: The government announces an inflation rate $\pi^a$;
• Stage 2: The private sector forms rational expectations $\pi^e$ and writes wage contracts;
• Stage 3: The government chooses the actual inflation rate $\pi$ and implements it.

The difference between this game and the one with commitment is that in Stage 3 the government does not have to implement the policy announced in Stage 1. That is, the actual inflation rate $\pi$ does not have to be equal to $\pi^a$. Instead, the government will review the economic conditions in Stage 3 and implement the policy that is best in that stage. The only economic condition that has changed between Stages 1 and 3 is that wage contracts are written in Stage 2. The way the expectations are formed can provide ex post incentives for the government to deviate from its own announced policy.

To see the ex post incentives, let us check whether the second-best policy $(\pi, x) = (0, x^*)$ is optimal in Stage 3, if somehow the private sector believes this policy. Start with the announcement $\pi^a = 0$. If the private sector believes such a zero inflation target, then $\pi^e = 0$. Under the Lucas supply curve (which relies on the wage contracts being written with the expectation $\pi^e$), the actual employment level is

$x = x^* + \pi - \pi^e = x^* + \pi$.

The employment level is an increasing function of the inflation rate. This is why the government has an ex post incentive to inflate. By deviating from the announced zero inflation target, the government can increase the employment level and so reduce the welfare loss. To be precise, substitute the employment level into the welfare loss function:

$L = \pi^2 + \lambda (\pi + x^* - x_0)^2$.
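As a numerical check on the claim that a surprise inflation lowers the loss, the Stage 3 loss can be evaluated on a grid of inflation rates with $\pi^e = 0$. This is only a sketch: the parameter values and the grid are assumptions chosen for illustration, and the grid search stands in for the analytical minimization.

```python
def stage3_loss(pi, x_star, x_opt, lam):
    """Loss when the private sector expects zero inflation (pi_e = 0)."""
    return pi ** 2 + lam * (pi + x_star - x_opt) ** 2


# Assumed, purely illustrative parameter values.
x_star, x_opt, lam = 0.90, 0.95, 2.0

# Candidate inflation rates from 0 to 0.10 in steps of 0.0001.
grid = [i / 10000 for i in range(1001)]
best_pi = min(grid, key=lambda p: stage3_loss(p, x_star, x_opt, lam))

print("loss at pi = 0:      ", stage3_loss(0.0, x_star, x_opt, lam))
print("loss-minimizing pi:  ", best_pi)
print("loss at that pi:     ", stage3_loss(best_pi, x_star, x_opt, lam))
```

With these assumed numbers the minimizing inflation rate on the grid is strictly positive and the loss there is below the loss at zero inflation, which is the ex post incentive described in the text.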

In Stage 3, the government chooses the inflation rate $\pi$ to minimize the above welfare loss. The first-order condition is

$\pi + \lambda (\pi + x^* - x_0) = 0$.

Solving for $\pi$, we have

$\pi = \frac{\lambda}{1+\lambda} (x_0 - x^*)$.

Since the socially optimal employment level $x_0$ is higher than the natural employment level, the inflation rate is positive. That is, in Stage 3, the government finds it attractive to deviate from the announced zero inflation rate to a higher inflation rate. By such a deviation, the government increases the employment level from $x^*$ in the case of the second-best policy to a higher level $x^* + \pi$. The welfare loss is reduced from $\lambda (x_0 - x^*)^2$ in the case of the second-best policy to

$\pi^2 + \lambda (\pi + x^* - x_0)^2 = \frac{1}{1+\lambda} \lambda (x_0 - x^*)^2$.

The possible gains in employment and welfare from a positive inflation rate in Stage 3 come from the assumption that the private sector is tricked by the government into believing the announced zero inflation target. Of course, the private sector cannot be so easily tricked. Knowing that the government will have an incentive to deviate to a higher inflation rate ex post, the private sector will never believe the announced zero inflation target in the first place. That is, the zero inflation policy is not credible when there is no commitment technology. In fact, rational agents will believe that the inflation rate is positive and so the equilibrium will be $\pi > 0$ and $x = x^*$, resulting in a welfare loss that is even larger than under the second-best policy.

There is a two-way relationship between the government's policy and private agents' expectations. The actual inflation rate $\pi$ which the government finds optimal in Stage 3 depends on the expected inflation rate $\pi^e$. In turn, to form rational expectations of the inflation rate, $\pi^e$, private agents must look forward and check what inflation rate $\pi$ the government would like to have in Stage 3. To find the equilibrium policy, it is proper to solve the problem backward from Stage 3. The steps are detailed below.

Step 1. For given expectations $\pi^e$ formed in Stage 2, find the best policy for the government in Stage 3.

For given expectations $\pi^e$, the employment level is $x = x^* + \pi - \pi^e$. Substitute this employment level into the welfare loss function:

$L = \pi^2 + \lambda (\pi + x^* - \pi^e - x_0)^2$.

The government chooses an inflation rate to minimize this welfare loss. The first-order condition is

$\pi + \lambda (\pi + x^* - \pi^e - x_0) = 0$.

Solving for $\pi$, we have:

$\pi = \frac{\lambda}{1+\lambda} (x_0 - x^* + \pi^e)$.    (8.3)

This is the reaction function of the government to the private sector's expectations. The chosen inflation rate in Stage 3 is an increasing function of the expected inflation rate. The explanation is as follows. If the private sector expects a high inflation rate, the wage contracts are written to be indexed to this expected inflation rate. The real wage is high and the employment level is low (see the Lucas supply curve). To increase the employment level, the government must indeed pick a high inflation rate.

Step 2. Knowing the government's incentive to inflate in Stage 3, the private sector forms rational expectations of the inflation rate.

Since there is no uncertainty in this model, the expectations must coincide with the actual inflation rate. That is,

$\pi^e = \pi$.    (8.4)

This is the private sector's reaction function to the government's policy. Not surprisingly, the higher the inflation rate, the higher the expected inflation rate. Note also that we do not set the expected inflation rate to any announced rate but to the actual rate, since not all of the announced rates are credible.

Step 3. The equilibrium inflation rate and expected inflation rate are joint solutions to the reaction functions (8.3) and (8.4).

We solve for $(\pi, \pi^e)$ jointly from the two reaction functions. Substitute $\pi^e$ from (8.4) into (8.3):

$\pi = \frac{\lambda}{1+\lambda} (x_0 - x^* + \pi)$.

Multiplying through by $(1+\lambda)$ gives $(1+\lambda)\pi = \lambda (x_0 - x^*) + \lambda \pi$. Solving for $\pi$ and using (8.4), we have

$\pi^e = \pi = \lambda (x_0 - x^*) > 0$.    (8.5)

This is a Nash equilibrium of the game. That is, (i) given the government's strategy $\pi = \lambda (x_0 - x^*)$, the rational (best) response of the private sector is $\pi^e = \lambda (x_0 - x^*)$, and (ii) given the private sector's expectations $\pi^e = \lambda (x_0 - x^*)$, the best response by the government is $\pi = \lambda (x_0 - x^*)$. Part (i) is obvious, because any expectations that do not coincide with the actual inflation rate are not rational in the current case without uncertainty. To verify part (ii), it is sufficient to show that, when $\pi^e = \lambda (x_0 - x^*)$, the choice $\pi = \lambda (x_0 - x^*)$ minimizes the welfare loss. This can be established immediately after substituting $\pi^e = \lambda (x_0 - x^*)$ into (8.3).

A Digression: Let us try a different way to solve for the inflation rate. Since rational expectations require $\pi^e = \pi$, let us substitute this expected inflation rate into the Lucas supply curve to obtain $x = x^* + \pi - \pi^e = x^*$. Substituting this employment level into the welfare loss function, we have $L = \pi^2 + \lambda (x_0 - x^*)^2$. Choosing the inflation rate to minimize the welfare loss, we have $\pi = 0$. Why does this procedure lead to a different solution? Why isn't it a proper solution procedure for the current game? Note also that this different solution method yields the second-best policy examined in the last section.

The inflation rate given by (8.5) is the only one which is credible in the absence of commitment. Therefore, the only credible announcement is $\pi^a = \lambda (x_0 - x^*)$. This equilibrium inflation rate is positive, in contrast to the zero inflation rate under commitment. The employment level under this positive inflation rate is $x = x^*$, the same as under commitment. The higher inflation rate does not generate higher employment because it is fully expected and hence taken into consideration when the wage contracts are made. Since the equilibrium has a higher inflation rate than under commitment but the same employment level, the welfare loss is larger than that with commitment, resulting in a third best. Substituting the inflation rate in (8.5) into the loss function, we have:

$L = [\lambda (x_0 - x^*)]^2 + \lambda (x_0 - x^*)^2 = \lambda (1+\lambda) (x_0 - x^*)^2 > \lambda (x_0 - x^*)^2$,

where $\lambda (x_0 - x^*)^2$ is the welfare loss under commitment.

We have a situation that resembles the prisoners' dilemma. Before the expectations are formed, the pair $\pi = \pi^e = 0$ is clearly more desirable than the pair $\pi = \pi^e = \lambda (x_0 - x^*)$.
That is, both the government and the private sector wish to have zero inflation ex ante. However, if the private sector actually believes a zero inflation policy, the government finds it attractive to deviate from it. This incentive to deviate renders the zero inflation policy incredible, and the economy ends up with a higher inflation rate $\pi = \lambda (x_0 - x^*)$ and a worse outcome than under zero inflation. Thus, the monetary policy under discretion is inferior to the policy under commitment.

The exercise illustrates why it is sometimes difficult to achieve a low inflation target. Countries may fail in their attempt to control inflation, not because low inflation is not desirable ex ante but because it is not credible without sufficient commitment power. What makes it incredible is the government's temptation to inflate. How strong the temptation is depends on how much a government likes to use inflation ex post to raise employment. Evidence suggests that a conservative government is less likely to use inflation to achieve an employment target than a liberal government. For example, Great Britain achieved a low inflation rate under the conservative government and so did Canada.

The above simple model can be used to confirm that a conservative government is more likely to achieve low inflation rates. To illustrate, recall that the welfare loss function incorporates the government's concern about both inflation and low employment. The relative weight given to inflation is $1/\lambda$. The lower the parameter $\lambda$, the more the government worries about inflation. We can then associate a low parameter $\lambda$ with a conservative government. Since the equilibrium inflation rate is $\pi = \lambda (x_0 - x^*)$, a conservative government is able to achieve a lower inflation rate than a liberal government. It should be emphasized that the employment level is the same ($x^*$) under both conservative and liberal monetary policies.
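The comparison between commitment and discretion, and between a low and a high weight on employment, can be illustrated numerically. The sketch below finds the discretionary equilibrium by repeatedly applying the reaction function (8.3) until expectations are confirmed, which is a swapped-in numerical device rather than the joint solution used in the text; the parameter values and the two weights (0.5 for a "conservative" and 2.0 for a "liberal" government) are assumptions chosen for illustration.

```python
def reaction(pi_e, x_star, x_opt, lam):
    """Government's best response (8.3) to expected inflation pi_e."""
    return lam / (1 + lam) * (x_opt - x_star + pi_e)


def discretion_equilibrium(x_star, x_opt, lam, tol=1e-12, max_iter=10_000):
    """Iterate pi_e -> reaction(pi_e) until pi = pi_e, i.e. until (8.4) holds."""
    pi_e = 0.0
    for _ in range(max_iter):
        pi = reaction(pi_e, x_star, x_opt, lam)
        if abs(pi - pi_e) < tol:
            return pi
        pi_e = pi
    raise RuntimeError("reaction-function iteration did not converge")


def loss(pi, x, x_opt, lam):
    """Welfare loss (8.2)."""
    return pi ** 2 + lam * (x - x_opt) ** 2


# Assumed, purely illustrative parameter values.
x_star, x_opt = 0.90, 0.95

for label, lam in (("conservative", 0.5), ("liberal", 2.0)):
    pi_d = discretion_equilibrium(x_star, x_opt, lam)
    # Under rational expectations employment is x* in both regimes.
    loss_commitment = loss(0.0, x_star, x_opt, lam)
    loss_discretion = loss(pi_d, x_star, x_opt, lam)
    print(label, pi_d, loss_commitment, loss_discretion)
```

The iteration converges because the slope of the reaction function, $\lambda/(1+\lambda)$, is below one. Under the assumed numbers the equilibrium inflation rate matches $\lambda (x_0 - x^*)$ from (8.5), it is lower for the smaller weight, and the loss under discretion exceeds the loss under commitment in both cases.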

A Quick Introduction to Simulation Methods

1 General formulation

The typical maximization problem can be written in the following general form:

$v(z, s) = \max \{ u(z, s, d) + \beta E[v(z', s') \mid z] \}$
s.t. $z' = A(z) + \varepsilon'$
     $s' = B(z, s, d)$

where $v(z, s)$ is the optimal value function, $z$ is a vector of exogenous state variables (such as technology shocks, for example), $s$ is a vector of endogenous state variables (capital stock, for example), $d$ is a vector of decision variables and $u(z, s, d)$ is the "return" function for the problem. The two constraints describe the evolution of the state variables.

2 Two approaches

2.1 Operating on the value functions

This approach basically consists of defining an operator which maps the space of continuous bounded functions into itself, so that the value function is the unique fixed point. Given the approximation to the true $v(\cdot)$, the decision rules follow immediately.

For any $g \in X$, the space of continuous bounded functions, define the operator $T$ as follows:

$Tg(z, s) = \max_{s' = B(z, s, d)} [u(z, s, d) + \beta E(g(z', s') \mid z)]$.

The approach consists in searching for a fixed point of the operator $T$. One way is to solve the problem directly on the computer, simplifying the state space by restricting the domain of definition to a grid. Another way is to find a linear quadratic approximation around the steady state equilibrium path.

2.2 Operating on the decision rules

The second approach obtains the decision rules directly from the first-order conditions. To illustrate, assume that the economy is characterized by one simple intertemporal decision, say investment. The equilibrium can be described as a function $I(z, s)$ that satisfies an integral equation of the form

$h[I(z, s), z, s] = \beta \int J[I(z', s'), z', s'] \, dG(z', z)$.

The endogenous state variable $s$ in that case is $k$ and $k' = (1 - \delta) k + I(z, k)$. We can define an operator $T : I(z, k) \mapsto TI(z, k)$, where $TI(z, k)$ satisfies the conditions

$\arg\min \; h[TI(z, k), z, k] + \beta \int J[I(z', k'), z', k'] \, dG(z', z)$,

and $k' = (1 - \delta) k + I(z, k)$. We need to find a fixed point of that operator. As in the previous section, the operator cannot be exactly replicated numerically. One approach is to discretize the state space. Another possibility is a linear approximation to the first-order conditions characterizing equilibrium.

3 An example

In this section, we give the example of a method that operates on the value functions and uses linear quadratic approximations. Such a structure is one with a quadratic objective, linear constraints and exogenous disturbances generated by a first-order, linear, vector-autoregressive process. The particular quadratic objective used is the second-order Taylor series expansion of the "return" function, evaluated around the steady state. This implies that the decision rules are linear, which is not too much of a problem, since there is little evidence of major non-linearities in aggregate data.

Let us look at a case where we can use a social planner approach and describe in general terms how to solve it numerically. Remember that the problem can be formulated as

$v(z, s) = \max \{ u(z, s, d) + \beta E[v(z', s') \mid z] \}$
s.t. $z' = A(z) + \varepsilon'$
     $s' = B(z, s, d)$

where $A(\cdot)$ and $B(\cdot)$ are linear. To ensure that $B(\cdot)$ is linear, it may be necessary to substitute any non-linear constraints into the return function.

The method described applies to quadratic return functions. Hence, we first need to approximate a general return function by a quadratic one. The advantage of solving a linear-quadratic planning problem is that it delivers a linear policy function $d_t = d(z_t, s_t)$, which when substituted yields a linear law of motion $s_{t+1} = s(z_t, s_t)$.
The quadratic approximation of $u$ corresponds to the first three terms of a Taylor series expansion around the steady state values for $(z, s, d)$, denoted $\bar z, \bar s, \bar d$ [$\bar z$ is the solution to $\bar z = A(\bar z)$]. Let $y$ be the stacked vector $(z, s, d)$ and let a superscript $T$ denote the transpose of a vector. The Taylor series expansion of $u(y)$ at the steady state $\bar y$ is

$\tilde u(y) = u(\bar y) + Du(\bar y)^T (y - \bar y) + \frac{1}{2} (y - \bar y)^T D^2 u(\bar y) (y - \bar y)$,

where $Du(\bar y)$ is the $\eta(y) \times 1$ vector of first partial derivatives of $u$ and $D^2 u$ is the $\eta(y) \times \eta(y)$ matrix of second partial derivatives of $u$. In practice, the derivatives are approximated numerically; for example, $D_i u(\bar y) = [u(\bar y + h^i) - u(\bar y - h^i)] / \tilde h$. It can be shown that $\tilde u(y)$ can be written as

$\tilde u(y) = y^T Q y$,

where $Q$ is a symmetric $\eta(y) \times \eta(y)$ matrix. Hence, the problem can be written in matrix form as

$v(z, s) = \max \{ y^T Q y + \beta E[v(z', s') \mid z] \}$
s.t. $z' = A(z) + \varepsilon'$
     $s' = B(z, s, d)$

It can be shown that the optimal value function is identical, save for a constant, for any covariance matrix of $\varepsilon$. As a result, the optimal policy function is independent of this covariance matrix and we can solve the programming problem for the certainty case, in which the covariance matrix has been set to zero. In other words, we solve the problem by dropping the expectations operator and replacing $\varepsilon'$ by its mean of zero.

The general idea is to generate a sequence of approximations to $v(\cdot)$ which will converge to the optimal value function. An initial value $V_0$ is selected (a matrix of dimension $\eta(z, s) \times \eta(z, s)$). The standard Bellman mapping is then used to obtain the sequence of approximations. Given the $n$th element of the sequence, the $(n+1)$th element is obtained as follows:

$v_{n+1}(z, s) = \max \{ y^T Q y + \beta v_n(z', s') \}$
s.t. $[z', s']_i = \sum_{j \le \eta(y)} B_{ij} y_j$ for $i = 1, \dots, \eta(z, s)$.

The notation implies that

$v_n(z, s) = \begin{bmatrix} z^T & s^T \end{bmatrix} V_n \begin{bmatrix} z \\ s \end{bmatrix}$.

What follows describes in very general lines how the algorithm proceeds. For more details, the reader should refer to Cooley's "Frontiers of Business Cycle Research", chapter 2:

- Define by $x$ the stacked vector $(z, s, d, z', s')$.
- Construct the matrix $R^{[\eta(x)]}$ (dimension $\eta(x)$),
  $R^{[\eta(x)]} = \begin{bmatrix} Q & 0 \\ 0 & \beta V_n \end{bmatrix}$.
- The maximization problem can be rewritten as: $\max_d x^T R^{[\eta(x)]} x$.
- Use the constraints to eliminate one by one all the elements of $(z', s')$ from the above formulation. We get a problem formulated as above, $\max_d x^T R^{[\eta(z,s,d)]} x$, except that the last $\eta(z', s')$ elements of $x$ are zeros and $R^{[\eta(z,s,d)]}$ is still of dimension $\eta(x)$, but its last $\eta(z', s')$ rows and columns are zeros.
- Use the first-order conditions to eliminate one by one the next $\eta(d)$ variables. We have $\max_d x^T R^{[j]} x$ and the first-order condition is $R^{[j]} x = 0$. That allows us to eliminate the elements of $d$ from the formulation, one by one.

- In the end, we are left with a formulation $\max_d x^T R^{[\eta(z,s)]} x$. Set that $R$ matrix equal to $V_{n+1}$.
- Compare $V_{n+1}$ with $V_n$ to some required degree of precision. Stop iterating when $V_{n+1}$ is close enough to $V_n$.
- Use the first-order conditions from the last iteration to get the policy functions.
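The iteration above can be sketched in a few lines for the certainty case. This is a minimal sketch under stated assumptions, not the procedure of the reference: it stacks all state variables into one vector `s` with a linear law of motion `s' = A s + B d`, and it eliminates the decision variables in one block through the first-order condition instead of one at a time (the resulting $V_{n+1}$ is the same). The function name, the scalar example and the parameter values are hypothetical.

```python
import numpy as np


def lq_value_iteration(Q, A, B, beta, tol=1e-10, max_iter=10_000):
    """Iterate v_{n+1}(s) = max_d { y'Qy + beta * v_n(s') } with y = (s, d),
    s' = A s + B d and v_n(s) = s' V_n s. Returns (V, F) with policy d = -F s."""
    n_s = A.shape[0]
    Qss, Qsd = Q[:n_s, :n_s], Q[:n_s, n_s:]
    Qds, Qdd = Q[n_s:, :n_s], Q[n_s:, n_s:]
    V = np.zeros((n_s, n_s))  # initial guess V_0
    for _ in range(max_iter):
        # First-order condition in d:
        # (Qdd + beta B'VB) d + (Qds + beta B'VA) s = 0, so d = -F s.
        F = np.linalg.solve(Qdd + beta * B.T @ V @ B,
                            Qds + beta * B.T @ V @ A)
        # Substitute the policy back to obtain the next quadratic form.
        V_next = Qss + beta * A.T @ V @ A - (Qsd + beta * A.T @ V @ B) @ F
        if np.max(np.abs(V_next - V)) < tol:
            return V_next, F
        V = V_next
    raise RuntimeError("value iteration did not converge")


# Hypothetical scalar example: return -(s^2 + d^2), law of motion s' = s + d.
Q = -np.eye(2)
A = np.array([[1.0]])
B = np.array([[1.0]])
beta = 0.95

V, F = lq_value_iteration(Q, A, B, beta)
print("V =", V)
print("policy: d = -F s with F =", F)
```

In this assumed example the limit $V$ is negative, as expected for a maximization problem with a negative definite return, and the linear policy moves the state toward zero each period.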