3116 Biophysical Journal Volume 81 December 2001 3116-3136

Stochasticity in Transcriptional Regulation: Origins, Consequences, and Mathematical Representations

Thomas B. Kepler* and Timothy C. Elston†

*Santa Fe Institute, Santa Fe, New Mexico 87501, and †Biomathematics Graduate Program, Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695 USA

ABSTRACT Transcriptional regulation is an inherently noisy process. The origins of this stochastic behavior can be traced to the random transitions among the discrete chemical states of operators that control the transcription rate and to finite number fluctuations in the biochemical reactions for the synthesis and degradation of transcripts. We develop stochastic models to which these random reactions are intrinsic and a series of simpler models derived explicitly from the first as approximations in different parameter regimes. This innate stochasticity can have both a quantitative and qualitative impact on the behavior of gene-regulatory networks. We introduce a natural generalization of deterministic bifurcations for classification of stochastic systems and show that simple noisy genetic switches have rich bifurcation structures; among them, bifurcations driven solely by changing the rate of operator fluctuations even as the underlying deterministic system remains unchanged. We find stochastic bistability where the deterministic equations predict monostability and vice-versa. We derive and solve equations for the mean waiting times for spontaneous transitions between quasistable states in these switches.

INTRODUCTION

The rate at which proteins are synthesized from individual genes is tightly regulated. In prokaryotes, this regulation is accomplished in part by the binding of regulatory proteins to stretches of DNA upstream (by definition) of the protein-coding region of the gene. Regulatory proteins can either inhibit or facilitate the binding of RNA polymerase to DNA or facilitate the isomerization of the DNA-RNA polymerase complex into a transcriptionally competent state.
RNA polymerase proceeds along the DNA, transcribing the DNA into messenger RNA (mRNA). A ribosome can associate with mRNA and begin translation of the mRNA into an amino acid sequence as soon as a complete ribosome binding site emerges from the RNA polymerase. Translation can occur many times per transcript. The canonical introduction to this subject remains the monograph of Ptashne (1992).

In eukaryotes, regulatory proteins, referred to as transcription factors, are also used as one method of controlling gene expression. Another form of eukaryotic regulation occurs through chromatin-DNA interactions. In general, transcriptional regulation in eukaryotes is more complicated than in prokaryotes. However, we believe the results presented in this manuscript are relevant to both types of cells.

In this manuscript, we adopt the terminology used in studying prokaryotes and use operator to refer to upstream regulatory DNA sites. The term promoter refers to the nucleotide sequence to which RNA polymerase binds to begin transcription. Single genes may have multiple operators that can overlap with the promoter. The operator is said to be in an occupied state if a regulatory protein is bound to it and unoccupied otherwise. Chemical reactions that change the state of the operator are referred to as operator fluctuations. One of the main goals of this manuscript is to understand the role of operator fluctuations in transcriptional regulation. The mathematical methods developed here also directly apply to eukaryotic systems that regulate gene expression through transcription factors or chromatin-DNA interactions.

Received for publication 4 May 2001 and in final form 5 September 2001. Address reprint requests to Thomas B. Kepler, Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501. Tel.: 505-984-8800; Fax: 505-982-0565; E-mail: kepler@santafe.edu. © 2001 by the Biophysical Society 0006-3495/01/12/3116/21 $2.00

Much previous mathematical modeling of transcriptional regulation represents the production of gene product as a deterministic process (for a review, see Hasty et al., 2001b).
In these models, both the gene-product concentration and operator state are treated as continuous. The probability of an occupied operator is given by a function of the regulatory protein concentration that is computed using thermodynamic arguments (Shea and Ackers, 1985).

In fact, however, there is now considerable experimental evidence that indicates the presence of significant stochasticity in transcriptional regulation in both eukaryotes and prokaryotes. In particular, several recent experiments on mammalian cells have supported the idea that gene initiation in response to an inductive signal is a stochastic process (Weintraub, 1988; van Roon et al., 1989; Fiering et al., 1990; Ko et al., 1990; Dingemanse et al., 1994; Walters et al., 1995). Additionally, there is evidence that chromatin-regulated gene expression is stochastic (Ahmad and Henikoff, 2001; Wijgerde et al., 1995) and that the initiation and deactivation of pigment expression during melanocyte differentiation proceeds in a random fashion (Bennett, 1983). Finally, studies on engineered gene circuits, which have been designed to act as toggle switches and oscillators, have revealed large stochastic effects (Elowitz and Leibler, 2000; Gardner et al., 2000; Becskei et al., 2001).

Several recent papers have reported theoretical investigations into the effects of fluctuations in gene regulation.
Thattai and van Oudenaarden (2001) used simple models of transcription and translation to derive expressions for the mean and variance of mRNA and protein number that compare favorably with experimental results. Using Monte Carlo simulations, McAdams and Arkin (1997) studied a more detailed model of these processes that also takes into account ribosome-RNase binding competition. Neither of these two investigations considered fluctuations in the state of the operator. Simple models that take into account fluctuations between active and inactive genetic states have been studied and used to explain induction-level heterogeneity among individual cells expressing steroid-inducible genes (Ko, 1991, 1992), and to support the conjecture that haploinsufficiency diseases arise from stochastic gene expression (Cook et al., 1998). In these models, transcription and translation proceed deterministically while the gene is on. Additionally, these investigations relied on computer simulations and did not fully explore the consequences of fluctuations in regulated systems.

Numerical simulations of complete genetic networks have also been carried out. For instance, Arkin et al. (1998) considered a detailed stochastic model for the initial decision between the two developmental pathways (lysis and lysogeny) of bacteriophage λ. However, in this investigation the chemical kinetics of the operator fluctuations was assumed to be fast. This assumption allowed the operator state to be treated deterministically using a quasi-steady-state approximation. The role of noise has also been considered in engineered gene networks (Hasty et al., 2000, 2001b). In this work, fluctuations were added post hoc to deterministic rate equations, and, therefore, the noise strength was an adjustable parameter.

Our aim here is to consider the origin of intrinsic fluctuations and to develop appropriate representations for them within the context of de novo stochastic models. We further consider a set of simplifying approximations found in various parameter limits, eventually arriving at deterministic models.
Because our method starts from a microscopic description of the process, all the parameters in the approximate models are defined in terms of the underlying chemical rate constants. Our approximate schemes are important for two reasons. First, they provide insight into the dynamics of the system that cannot be gained from Monte Carlo simulations of the full model; and, second, numerical simulations using the approximate schemes can run orders of magnitude faster than Monte Carlo simulations of the full process.

In the first two sections and in the appendices, we present the mathematical techniques for manipulating the models. These methods are then used to examine the consequences of fluctuations in regulatory systems: 1) Noise destabilizes genetic switches. We compute the mean first passage times for a simple single-gene switch and examine their behavior under variation of several parameters, particularly the rate of operator transitions. 2) The destabilization of switches makes it necessary to generalize the notion of bifurcation. We examine stochastic bifurcations (Horsthemke and Lefever, 1984), qualitative changes in the probability density function under changes in parameters, again, with special attention to those bifurcations induced solely by changes in the rate of operator transitions.

We start by considering a simple model without feedback to establish our methods, then move on to a switching system consisting of a single self-promoting gene, and finally to a switching system composed of two mutually repressing genes.

FIGURE 1 A schematic diagram illustrating transcriptional regulation without feedback. The operator has two possible states, occupied and unoccupied, and fluctuates between them. If the operator is empty, protein monomers are produced at a rate $\alpha_0$; if the site is occupied, the production rate is $\alpha_1$. The operator does not affect the degradation of the protein product, which occurs with rate $\delta$. We assume that the activator concentration is constant.
SINGLE GENE, NO FEEDBACK

We begin with a model for a gene that has no feedback, direct or indirect, onto its own transcriptional regulation. Similar models have been used to study steroid-inducible genes (Ko, 1991, 1992) and haploinsufficiency diseases (Cook et al., 1998). The simplicity of this model allows us to present the techniques we use for the analysis of regulated systems in a direct manner, uncomplicated by nonlinearities. The state space of the model consists of the number of gene product monomers (an integer variable) and the state of the gene's operator (binary). We use the term gene product to account for the lumping of mRNA and protein in this treatment. The events that occur in this model can be represented as biochemical reactions (Fig. 1), and the master equation derived directly from that representation.

In this manuscript, we adopt the following notational conventions. Uppercase calligraphic letters denote a molecule of a particular protein species or an operator (e.g., $\mathcal{M}$ for a monomer and $\mathcal{O}$ for an operator). Uppercase letters represent state variables that denote the current number of molecules of a particular protein species or the current chemical state of the operator. Lowercase letters denote
any allowed value of their uppercase counterparts. Because the state variables are pure numbers, all the rate constants have units of inverse time and are reported here as per second. Concentrations are formed by dividing the mean number of a protein species by the relevant volume (e.g., the cell volume). In this case, second-order rate constants would have units of 1/(time × concentration).

The degradation of gene product is written as

$$\mathcal{M} \xrightarrow{\;\delta\;} \emptyset, \tag{1}$$

where $\mathcal{M}$ represents the monomer form of the expressed protein, $\delta$ is the degradation rate, and $\emptyset$ is used to denote a protein sink. We also use $\emptyset$ to denote a protein source. We write the spontaneous transitions between operator states, denoted $\mathcal{O}_0$ (unoccupied) and $\mathcal{O}_1$ (occupied), as

$$\mathcal{O}_0 \underset{K k_1}{\overset{K k_0}{\rightleftharpoons}} \mathcal{O}_1, \tag{2}$$

where the rate $K$ sets the time scale for this reaction. The constants $k_0$ and $k_1$ are dimensionless and constrained to obey $k_0 + k_1 = 1$. The reaction for the production of protein is written as

$$\emptyset \xrightarrow{\;\alpha_s\;} \mathcal{M}, \tag{3}$$

where the rate $\alpha_s$ ($\alpha_0$ or $\alpha_1$) depends on the chemical state of the operator. It is important to recognize that this simple reaction is an effective reaction representing a large number of component reactions together making up transcription, translation of mRNA into polypeptides, and the folding of these polypeptides into proteins.

In this treatment, we distinguish two sources of stochasticity: operator fluctuations, and the combined action of transcription per se and degradation of protein product. In both cases, the variability results from the inherent discreteness of the state space and the randomness in the dwell times between discrete reactions. (Below, we will consider a further source of variability in the reaction in which gene product monomers interact to form dimers, although our discussion will serve primarily to indicate why these dimer fluctuations can be neglected.)

At any given time $t$, the state of the system described by reactions 1-3 is specified by the number of monomer proteins $M(t)$ and the chemical state of the operator $S(t)$. $S(t)$ is equal to 0 if the operator is unoccupied and 1 if it is occupied. Thus, the various states of the system can be written as the ordered pair $(m, s)$, where $m$ is a non-negative integer and $s$ is either 0 or 1. $M(t)$ and $S(t)$ are random variables, because the chemical reactions that change the state of the system occur randomly in time. If we assume that the dwell time in any particular chemical state of the system is exponentially distributed, then the system satisfies the Markov property. Physically, this means that the time evolution of the system is determined solely by its current state and is independent of its past. The Markov property allows us to write down a master equation for the time evolution of the probabilities $p_m^s(t) = \Pr[M(t) = m \text{ and } S(t) = s]$ (van Kampen, 1992). Simply put, the master equation is a rate equation for $p_m^s(t)$. Written out explicitly, the master equation for this process has the form

$$\frac{dp_m^0}{dt} = -(K k_0 + \delta m + \alpha_0)\,p_m^0 + K k_1\, p_m^1 + \delta (m+1)\, p_{m+1}^0 + \alpha_0\, p_{m-1}^0, \tag{4}$$

$$\frac{dp_m^1}{dt} = -(K k_1 + \delta m + \alpha_1)\,p_m^1 + K k_0\, p_m^0 + \delta (m+1)\, p_{m+1}^1 + \alpha_1\, p_{m-1}^1, \tag{5}$$

where we have suppressed the explicit time dependence of $p_m^s(t)$ for readability. The first term on the right-hand side of Eqs. 4 and 5 represents the rate at which probability is flowing out of the state $(m, s)$. It is the product of the average rate, $(K k_s + \delta m + \alpha_s)$, at which transitions occur out of $(m, s)$ and the probability $p_m^s(t)$ of being in $(m, s)$. Correspondingly, the other three terms on the right-hand side of these equations represent the rate at which probability flows into $(m, s)$ from accessible states.

Although the notation used in Eqs. 4 and 5 makes these equations easy to interpret, it quickly becomes unmanageable as the system becomes more complex. Therefore, we will adopt a more compact notation and combine these equations as

$$\frac{dp_m^s}{dt} = \alpha_s\left(p_{m-1}^s - p_m^s\right) + \delta\left[(m+1)\,p_{m+1}^s - m\,p_m^s\right] + K\left[k_{\hat{s}}\, p_m^{\hat{s}} - k_s\, p_m^s\right], \tag{6}$$

where the hatted index indicates "the other one": $\hat{s} = (s + 1) \bmod 2$. The partial moments, defined for integer $j$, are

$$\langle m^j \rangle_s = \sum_m m^j\, p_m^s. \tag{7}$$

Note that the zeroth moments $\langle m^0 \rangle_s$ are the marginal probabilities for the operator to be in state $s$. The moments are then the sums over operator states of the partial moments,

$$\langle m^j \rangle = \langle m^j \rangle_0 + \langle m^j \rangle_1. \tag{8}$$

We can use Eq.
6 to derive ordinary differential equations (ODEs) for the time evolution of the partial moments (van Kampen, 1992). Because the reactions are all zeroth- and first-order, the ODEs for the partial moments factorize into independent pairs of linear equations, which can easily be
solved. Here we are interested only in the steady-state values of the first moment and the variance,

$$m^* \equiv \overline{\langle m \rangle} = \frac{\alpha_0 k_1 + \alpha_1 k_0}{\delta} \tag{9}$$

and

$$\overline{\mathrm{Var}(m)} = m^* + k_0 k_1 \left(\frac{\alpha_0 - \alpha_1}{\delta}\right)^2 \frac{\delta}{\delta + K}, \tag{10}$$

where the overbar indicates the steady-state value. Under the given circumstances, we expect the first factor in the product on the right side of Eq. 10 to be of order one, the second factor to be of order $m^{*2}$, and the order of the third to depend on the relative rates of product decay, $\delta$, and operator transitions, $K$. When product decay is much faster than operator transitions, the last factor is of order one and the variance is dominated by the second term, of order $m^{*2}$. When the operator transitions are much faster, the last factor is very small, and the variance approaches the mean.

An appropriate measure of the relative size of the fluctuations is the coefficient of variation, which is defined here as the ratio of the variance to the mean squared. The steady-state coefficient of variation can be written as

$$\overline{CV} = \frac{\overline{\mathrm{Var}(m)}}{m^{*2}} = \frac{1}{m^*} + \frac{\delta}{\delta + K}\, k_0 k_1 \left(\frac{\alpha_0 - \alpha_1}{\alpha_0 k_1 + \alpha_1 k_0}\right)^2. \tag{11}$$

Note that, as the average number of monomers increases, the first term in the CV decreases. Even with large protein numbers, however, the CV can still be large due to fluctuations in the operator state. As these operator fluctuations become faster, the second term decreases as well. This stabilizing effect of fast operator fluctuations has been observed in computer simulations of stochastic gene expression (Cook et al., 1998). Below, we construct approximations to the dynamics for cases in which the product number is large or the operator fluctuations are fast.

Small-noise and fast-transition approximations to the master equation

For cases in which there is feedback regulation, we generalize Eq. 6 to include state-dependent transition rates and nonlinearities. When this is done, the moment equations no longer factorize. Numerical solutions can be difficult to obtain as well, especially when several genes are considered as part of a regulatory network. Therefore, we develop approximations to Eq.
6 that can be generalized immediately to the nonlinear case of feedback regulation and are valid as one or the other source of variability becomes negligible.

Large steady-state gene-product levels

When the protein abundance given by Eq. 9 is large compared to one, we can use a diffusion approximation for Eq. 6. In the section, Escape Times, we present an example in which this diffusion approximation is accurate with $m^*$ as small as 25. We start by defining an appropriately scaled continuous variable for the monomer number. This variable is given by

$$X = \frac{M}{m^*}. \tag{12}$$

As defined here, $X$ is dimensionless, though we shall often refer to it as a concentration. To convert to an actual concentration, in Eq. 12 replace $m^*$ by the cell volume in appropriate units. We define the probability density function $\rho_s(x, t)$ such that

$$p_m^s(t) = \int_{(m - 1/2)/m^*}^{(m + 1/2)/m^*} dx\, \rho_s(x, t). \tag{13}$$

One way to arrive at the diffusion approximation is to note that, by appropriately rearranging terms in Eq. 6, this equation can be recast in a form that is identical to a second-order finite differencing of the diffusion equation. However, it is possible to quickly arrive at the same result through use of a Taylor series expansion. The Taylor series for a function $g(x + u)$ about $x$ is

$$g(x + u) = \sum_{j=0}^{\infty} \frac{u^j}{j!}\, \partial_x^j g(x) \equiv e^{u \partial_x} g(x), \tag{14}$$

where $\partial_x$ is shorthand for the partial derivative with respect to $x$. The last equality in the above expression defines the shift operator $e^{u \partial_x}$, which translates $g$ from $x$ to $x + u$. Changing variables and using the shift operator in Eq. 6, we get the equation of motion for $\rho_s$,

$$\partial_t \rho_s(x) = \alpha_s\left(e^{-(1/m^*)\partial_x} - 1\right)\rho_s(x) + \delta m^*\left(e^{(1/m^*)\partial_x} - 1\right) x\,\rho_s(x) + K\left[k_{\hat{s}}\,\rho_{\hat{s}}(x) - k_s\,\rho_s(x)\right]. \tag{15}$$

The Taylor series that defines the shift operator can be truncated without incurring large errors when the size of the translation is sufficiently small. In our case, if $1/m^*$ is small enough, we can neglect terms of third order and higher to obtain the diffusion approximation,

$$\partial_t \rho_s(x) = -\partial_x\left[\left(\frac{\alpha_s}{m^*} - \delta x\right)\rho_s(x)\right] + \frac{1}{2m^*}\,\partial_x^2\left[\left(\frac{\alpha_s}{m^*} + \delta x\right)\rho_s(x)\right] + K\left[k_{\hat{s}}\,\rho_{\hat{s}}(x) - k_s\,\rho_s(x)\right]. \tag{16}$$
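The truncation step leading from Eq. 15 to Eq. 16 can be written out explicitly. The following is a sketch, not taken from the original text, and it assumes that Eq. 15 has the shift-operator form obtained by substituting $m = m^* x$ into Eq. 6; the switching terms are unaffected by the expansion and are omitted.

```latex
% Second-order truncation of the two shift operators (step size 1/m^*):
\begin{align*}
e^{\mp(1/m^*)\partial_x} - 1
  &\approx \mp\frac{1}{m^*}\,\partial_x + \frac{1}{2m^{*2}}\,\partial_x^2 .
\end{align*}
% Production term (shift by -1/m^*) and degradation term (shift by +1/m^*):
\begin{align*}
\alpha_s\left(e^{-(1/m^*)\partial_x} - 1\right)\rho_s
  &\approx -\partial_x\!\left[\frac{\alpha_s}{m^*}\,\rho_s\right]
   + \frac{1}{2m^*}\,\partial_x^2\!\left[\frac{\alpha_s}{m^*}\,\rho_s\right], \\
\delta m^*\left(e^{(1/m^*)\partial_x} - 1\right)x\rho_s
  &\approx +\partial_x\!\left[\delta x\,\rho_s\right]
   + \frac{1}{2m^*}\,\partial_x^2\!\left[\delta x\,\rho_s\right].
\end{align*}
% Adding the two lines gives the drift and diffusion terms of Eq. 16:
\begin{align*}
-\partial_x\!\left[\left(\frac{\alpha_s}{m^*} - \delta x\right)\rho_s\right]
+ \frac{1}{2m^*}\,\partial_x^2\!\left[\left(\frac{\alpha_s}{m^*} + \delta x\right)\rho_s\right].
\end{align*}
```

Production and degradation enter the drift with opposite signs but add in the diffusion term, which is why the noise strength scales with the sum of the two rates divided by $m^*$.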
In the limit, as $m^* \to \infty$, the only stochasticity remaining is that due to the operator fluctuations. The master equation in this limit becomes

$$\partial_t \rho_s(x) = -\partial_x\left[\left(\frac{\alpha_s}{m^*} - \delta x\right)\rho_s(x)\right] + K\left[k_{\hat{s}}\,\rho_{\hat{s}}(x) - k_s\,\rho_s(x)\right], \tag{17}$$

where the term $\alpha_s/m^*$ remains of order $\delta$. The above equation can be interpreted in the following way. In each state of the operator, the concentration evolves deterministically according to the equation,

$$\frac{dx}{dt} = \frac{\alpha_s}{m^*} - \delta x. \tag{18}$$

However, the effective rate at which protein is made, $\alpha_s/m^*$, fluctuates randomly in time between high ($s = 1$) and low ($s = 0$) levels. A generalization of this system, allowing for multiple operators and operator states, is discussed in detail in Appendix B.

Fast operator fluctuations

The variance due to operator transitions decreases as the rate of these transitions increases and their characteristic times become much smaller than those of the rest of the system; these fluctuations are effectively averaged out over the longer time scales (Cook et al., 1998). We take advantage of this effect to construct an approximation to the dynamics that is accurate when the fluctuations in the operator state are fast, but finite. To apply these methods, we re-express Eq. 6 in terms of the marginal probability function $p_m = p_m^0 + p_m^1$ and the difference $\Delta_m = k_0 p_m^0 - k_1 p_m^1$,

$$\frac{dp_m}{dt} = (\alpha_0 k_1 + \alpha_1 k_0)\left(p_{m-1} - p_m\right) + \delta\left[(m+1)\,p_{m+1} - m\,p_m\right] + (\alpha_0 - \alpha_1)\left(\Delta_{m-1} - \Delta_m\right), \tag{19}$$

$$\frac{d\Delta_m}{dt} = -K\Delta_m + (k_0\alpha_0 + k_1\alpha_1)\left(\Delta_{m-1} - \Delta_m\right) + \delta\left[(m+1)\,\Delta_{m+1} - m\,\Delta_m\right] + k_0 k_1 (\alpha_0 - \alpha_1)\left(p_{m-1} - p_m\right). \tag{20}$$

When $K$ is very large compared to $\delta$ and $\delta m^*$, $\Delta$ reaches a rapid quasi-equilibrium for any value of $p$. This is realized mathematically by setting the derivative in Eq. 20 equal to zero. The resulting expression for $\Delta$, to leading order in $1/K$, is then substituted back into Eq. 19 with the result being the approximate equation of motion for $p_m$,

$$\frac{dp_m}{dt} = (\alpha_0 k_1 + \alpha_1 k_0)\left(p_{m-1} - p_m\right) + \delta\left[(m+1)\,p_{m+1} - m\,p_m\right] + \frac{k_0 k_1 (\alpha_0 - \alpha_1)^2}{K}\left(p_{m-2} - 2p_{m-1} + p_m\right). \tag{21}$$

The last term is a second-order finite difference centered on $m - 1$ (rather than on $m$). It acts as a diffusion term, producing the same dynamics for the mean and variance as the usual finite difference centered on $m$.
Although the higher moments differ in these two cases, as $m^*$ increases, these differences are multiplied by steadily decreasing factors and vanish as $m^* \to \infty$. As $K \to \infty$, we obtain a simple Poisson process with degradation, with the instantaneous rate of transcription equal to the equilibrium average $\alpha_0 k_1 + \alpha_1 k_0$.

Simultaneous limits

We can apply these approximations simultaneously. The diffusion approximation applied to Eq. 21 gives the same result as the fast-noise approximation applied to Eq. 16. Taking the marginal density $\rho = \rho_0 + \rho_1$ on $x$ as the dynamical object (see Appendix B for more details), we find

$$\partial_t \rho(x) = -\partial_x\left[A(x)\,\rho(x)\right] + \frac{1}{2}\,\partial_x^2\left[B(x)\,\rho(x)\right], \tag{22}$$

where

$$A(x) = \frac{\alpha_0 k_1 + \alpha_1 k_0}{m^*} - \delta x = \delta(1 - x), \tag{23}$$

$$B(x) = \frac{2}{K}\, k_0 k_1 \left(\frac{\alpha_0 - \alpha_1}{m^*}\right)^2 + \frac{\delta}{m^*}\,(1 + x). \tag{24}$$

When the approximations that led to Eq. 22 are appropriate, one advantage of this formulation is that an expression for the steady-state density in terms of a simple quadrature can be found,

$$\bar{\rho}(x) = \frac{\mathcal{N}}{B(x)} \exp\left[2\int_0^x \frac{A(x')}{B(x')}\, dx'\right], \tag{25}$$

where the overbar is again used to indicate steady state, and $\mathcal{N}$ is a normalization constant. Another advantage is that sample paths of the process described by Eq. 22 can be
generated from the stochastic differential equation

$$\frac{dX}{dt} = A(X) + \sqrt{B(X)}\,\xi(t), \tag{26}$$

where $\xi(t)$ is a Gaussian white noise process. (We are using the Ito interpretation of the stochastic integral. There is no ambiguity with this choice, because Eq. 22 defines the stochastic process. We choose the Ito interpretation because it is straightforward to implement numerically.) Generating sample paths from this equation can run orders of magnitude faster than Monte Carlo simulations of the full process.

In the limit, as both $m^*$ and $K$ become infinite, the noise term in Eq. 26 goes to zero, and this equation becomes the deterministic rate equation,

$$\frac{dx}{dt} = A(x) = \delta(1 - x). \tag{27}$$

For the simple example we are considering (and indeed for all linear systems), the above expression is identical to the equation for the first moment derived directly from the underlying master equation. For nonlinear systems, the two equations differ. However, from Eq. 26, we can perform a small-noise expansion to find

$$\frac{d\langle X \rangle}{dt} = \langle A(X) \rangle \approx A(\langle X \rangle) + \frac{1}{2}\,\frac{d^2 A}{dx^2}(\langle X \rangle)\,\mathrm{Var}(X). \tag{28}$$

So the simple deterministic rate equation is appropriate when $A''(\langle X \rangle)\,\mathrm{Var}(X)$ is much smaller than $A(\langle X \rangle)$.

REGULATED SYSTEMS I: SELF-PROMOTER

We now consider a system in which the gene product is itself an activating regulatory protein for its gene. This system is regulated by positive feedback. This type of regulation is thought to play a role in the developmental decision pathway of bacteriophage λ (Ptashne, 1992) and has been used to construct a synthetic eukaryotic gene switch in Saccharomyces cerevisiae (Becskei et al., 2001). Here we study a minimal version of a system with positive feedback, which serves to illustrate the qualitative features of this type of regulation. For a more biologically complete treatment of the λ lysis/lysogeny decision pathway, the reader is referred to the pioneering work of Arkin et al. (1998). The system we consider is similar to that shown in Fig. 1, only now the activator is the gene's own product.
After developing the model for this system and exploring the consequences of noise for bifurcations and spontaneous transitions in this system, we will come back to explore a second bipolar switch involving a pair of mutually repressing proteins.

Regulatory proteins often bind to their operators as dimers or higher-order oligomers (Ptashne, 1992). We assume that the active form of the protein in our system is a dimer. Thus, we explicitly consider the dimerization reaction and the stochasticity associated with it. Letting $\mathcal{D}$ represent a protein dimer, the dimerization reaction is

$$\mathcal{M} + \mathcal{M} \underset{\chi\gamma}{\overset{\chi}{\rightleftharpoons}} \mathcal{D}, \tag{29}$$

where $\gamma$ is the dimensionless equilibrium dissociation constant, and $\chi$ is the forward rate constant. The operator transition reaction, Eq. 2, is now modified to explicitly include the role of the protein dimer

$$\mathcal{O}_0 + \mathcal{D} \underset{K\beta}{\overset{K}{\rightleftharpoons}} \mathcal{O}_1, \tag{30}$$

where $K$ is now the forward transition rate and $\beta$ is the dimensionless dissociation constant for this reaction. The other reactions remain as given above, except that we must specify the rate at which dimers are degraded. For simplicity, we have made the arbitrary choice that dimers are not degraded at all. That is, only the monomer form of the protein is unstable. Inclusion of an arbitrary dimer degradation is easily accommodated. Our choice does not have qualitative implications for this analysis within the range of parameter variation considered.

Let $D(t)$ represent the number of dimers and $N(t)$ the total protein number at time $t$. Then we have $M(t) = N(t) - 2D(t)$. The master equation for this process is

$$\begin{aligned}
\frac{dp_{n,d}^s}{dt} ={}& \delta\left[(n + 1 - 2d)\,p_{n+1,d}^s - (n - 2d)\,p_{n,d}^s\right] + \alpha_s\left[p_{n-1,d}^s - p_{n,d}^s\right] \\
&+ \chi\left[(n - 2d + 2)(n - 2d + 1)\,p_{n,d-1}^s - (n - 2d)(n - 2d - 1)\,p_{n,d}^s\right] \\
&+ \chi\gamma\left[(d + 1)\,p_{n,d+1}^s - d\,p_{n,d}^s\right] + (-1)^s K\left[\beta\,p_{n,d}^1 - d\,p_{n,d}^0\right],
\end{aligned} \tag{31}$$

where $p_{n,d}^s = \Pr[N(t) = n,\ D(t) = d, \text{ and } S(t) = s]$. It is straightforward to generate sample paths of the full process described by the equation given above (see Fig. 6). These Monte Carlo simulations are usually computationally intensive, because the time scales of the various reactions involved can be very different.
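Sample paths of the full process can be generated with a direct (Gillespie-type) simulation. The sketch below is illustrative rather than the authors' code: it assumes the five reaction channels implied by the master equation above (production, monomer degradation, dimerization, dimer dissociation, operator switching), and the names `chi`, `gamma`, and `beta` for the dimerization forward rate, the dimerization dissociation constant, and the operator dissociation constant are placeholders chosen here. As in the master equation, operator binding is assumed not to change the dimer count.

```python
import random


def simulate_self_promoter(alpha0, alpha1, delta, chi, gamma, K, beta,
                           t_end, seed=0):
    """Gillespie sample path of (time, N, D, S) for the self-promoter.

    N is the total protein number, D the dimer number, S the operator
    state; free monomers are M = N - 2D. Parameter names are hypothetical.
    """
    rng = random.Random(seed)
    t, n, d, s = 0.0, 0, 0, 0
    path = [(t, n, d, s)]
    while True:
        m = n - 2 * d                        # free monomers
        rates = [
            alpha1 if s == 1 else alpha0,    # production:    n -> n + 1
            delta * m,                       # degradation:   n -> n - 1
            chi * m * (m - 1),               # dimerization:  d -> d + 1
            chi * gamma * d,                 # dissociation:  d -> d - 1
            K * d if s == 0 else K * beta,   # operator flip: s -> 1 - s
        ]
        total = sum(rates)
        if total == 0.0:
            break                            # no reaction can fire
        t += rng.expovariate(total)          # exponential dwell time
        if t > t_end:
            break
        # Pick a channel with probability proportional to its rate.
        pick = rng.random() * total
        for index, rate in enumerate(rates):
            pick -= rate
            if pick < 0.0:
                break
        if index == 0:
            n += 1
        elif index == 1:
            n -= 1
        elif index == 2:
            d += 1
        elif index == 3:
            d -= 1
        else:
            s = 1 - s
        path.append((t, n, d, s))
    return path
```

With a large `chi`, most events are dimerization and dissociation reactions, so a run spends nearly all its time on the fastest channel. This is one way to see why separating the time scales analytically pays off.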
Therefore, we make use of the limiting cases discussed above to construct approximations to the master equation that are less computationally intensive and provide insight into the dynamics of the system.
The reaction given by Eq. 29 is generally assumed to be fast compared to all other reactions (McAdams and Arkin, 1997; Hasty et al., 2000, 2001a). Physically, this means that the monomer and dimer concentrations come to quasi-equilibrium before the total amount of protein changes appreciably. In Appendix A, we show how this assumption, along with the assumption that the coefficient of variation of $D$ conditional on $N$ is small, allows us to eliminate the dimer concentration from the problem. In this limit, we get the master equation, written in terms of the marginal density on $M$ (in an abuse of notation, we let the marginal on $M$ be represented as $p_m^s$ and rely on the subscript itself to distinguish this probability distribution from that of $N$),

$$\frac{dp_m^s}{dt} = \delta\left[(m + 1)\,p_{m+1}^s - m\,p_m^s\right] + \alpha_s\left[p_{m-1}^s - p_m^s\right] + (-1)^s K\left[\beta\,p_m^1 - \frac{m(m - 1)}{\gamma}\,p_m^0\right], \tag{32}$$

with $\gamma$ the dissociation constant for dimerization and $\beta$ that for operator binding. To arrive at this simple set of equations, it is necessary to assume that the dissociation constant for dimerization is large. Although this is not necessarily the regime of biological interest, it serves to simplify the presentation considerably and is the regime commonly used, usually implicitly, by other researchers. As discussed in detail in Appendix A, all the analyses presented below can be carried through when this assumption is relaxed; the qualitative nature of the results does not change. A direct comparison with experimental data, however, requires the more general treatment presented in Appendix A.

To move to the diffusion limit, we change to the dimensionless variables $u = \delta t$ and $X = M/m_o$, where $m_o = \alpha_1/\delta$ is the steady-state value of the mean monomer number for the system locked in the occupied state. In terms of these variables, the diffusion limit for large protein number produces

$$\partial_u \rho_s(x) = -\partial_x\left[(a_s - x)\,\rho_s(x)\right] + \frac{1}{2m_o}\,\partial_x^2\left[(a_s + x)\,\rho_s(x)\right] + (-1)^s \kappa\left[b\,\rho_1(x) - x^2 \rho_0(x)\right], \tag{33}$$

where the rescaled parameters are $a_1 = 1$, $a_0 = \alpha_0/\alpha_1$, $\kappa = K\alpha_1^2/(\gamma\delta^3)$, and $b = \beta\gamma\delta^2/\alpha_1^2$. As $m_o \to \infty$ the fluctuations in the monomer concentration become negligible; Eq. 33 loses its diffusive term and can be written as

$$\partial_u \rho_s(x) = -\partial_x\left[(a_s - x)\,\rho_s(x)\right] + (-1)^s \kappa\left[b\,\rho_1(x) - x^2 \rho_0(x)\right]. \tag{34}$$

All of the stochasticity in this system now derives from the operator fluctuations.
The key difference between the above equation and Eq. 17 is the appearance of the $x^2$ factor multiplying $\rho_0(x)$ in Eq. 34. As will be seen, this factor is responsible for the bistable behavior observed in the macroscopic limit of this system. One useful feature of Eq. 34 is that an explicit expression for the steady-state marginal density $\bar{\rho} = \bar{\rho}_0 + \bar{\rho}_1$ can be found,

$$\bar{\rho}(x) = \mathcal{N} \exp\left[\kappa\left(a_0 x + \frac{x^2}{2}\right)\right] (x - a_0)^{\kappa a_0^2 - 1}\,(1 - x)^{\kappa b - 1}, \tag{35}$$

where $\mathcal{N}$ is a normalization constant. Below, we discuss how this distribution undergoes bifurcations as a result of the fluctuations in the operator state.

Next, we consider the small-noise limit of the fluctuations in the monomer concentration and the fast-noise limit of fluctuations in the operator state. That is, both $m_o$ and $\kappa$ are taken to be large, but finite. In Appendix B, we present a general algorithm for deriving an effective diffusion equation for the marginal density, and find

$$\partial_u \rho(x) = -\partial_x\left[A(x)\,\rho(x)\right] + \frac{1}{2}\,\partial_x^2\left[B(x)\,\rho(x)\right], \tag{36}$$

where

$$A(x) = \frac{b a_0 + x^2}{b + x^2} - x + \frac{2 x b\,(1 - a_0)\left[(a_0 - 2 + x)\,x^2 + b\,(x - a_0)\right]}{\kappa\,(b + x^2)^4}, \tag{37}$$

$$B(x) = \frac{1}{m_o}\left[\frac{b a_0 + x^2}{b + x^2} + x\right] + \frac{2 b x^2 (1 - a_0)^2}{\kappa\,(b + x^2)^3}. \tag{38}$$

The steady-state solution to Eq. 36 is given by Eq. 25, using the above expressions for $A$ and $B$.

Finally, in the deterministic limit, $m_o, \kappa \to \infty$, we are left with the ODE

$$\frac{dx}{du} = \frac{b a_0 + x^2}{b + x^2} - x \equiv -\partial_x \Phi(x), \tag{39}$$

where we have introduced the potential

$$\Phi(x) = \frac{x^2}{2} - x + (1 - a_0)\sqrt{b}\,\arctan\left(\frac{x}{\sqrt{b}}\right). \tag{40}$$

Loosely speaking, $\Phi(x)$ can be thought of as an effective free energy for the system. Its local minima represent stable steady states of the concentration. The local maxima are energetic barriers that must be surmounted by thermal activation. In general, such a free energy function does not exist for nonequilibrium systems, as is the case for the mutual repressor system considered below.
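The deterministic limit can be explored numerically by locating the zeros of the drift and classifying them. The sketch below assumes the drift has the form $(b a_0 + x^2)/(b + x^2) - x$ and that the potential is $x^2/2 - x + (1 - a_0)\sqrt{b}\arctan(x/\sqrt{b})$; setting the drift to zero then gives the cubic $x^3 - x^2 + b x - b a_0 = 0$. The function names and the parameter values in the usage note are illustrative choices, not values from the paper.

```python
import numpy as np


def drift(x, a0, b):
    """Deterministic drift dx/du for the self-promoter (assumed form)."""
    return (b * a0 + x**2) / (b + x**2) - x


def potential(x, a0, b):
    """Potential whose negative gradient equals the drift above."""
    return x**2 / 2 - x + (1 - a0) * np.sqrt(b) * np.arctan(x / np.sqrt(b))


def classify_fixed_points(a0, b):
    """Return (stable, unstable) fixed points as sorted lists of floats.

    The drift vanishes where P(x) = x^3 - x^2 + b x - b a0 = 0. Because
    drift = -P(x) / (b + x^2), a root is stable when P'(x) > 0.
    """
    roots = np.roots([1.0, -1.0, b, -b * a0])
    real_roots = sorted(float(r.real) for r in roots
                        if abs(r.imag) < 1e-8 and r.real >= 0.0)
    stable, unstable = [], []
    for x in real_roots:
        slope = 3 * x**2 - 2 * x + b         # P'(x)
        (stable if slope > 0 else unstable).append(x)
    return stable, unstable
```

For example, `classify_fixed_points(0.05, 0.2)` finds two stable points separated by an unstable one (a switch), whereas `classify_fixed_points(0.05, 0.1)` finds a single stable point. The stable points coincide with the local minima of `potential`.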
FIGURE 2 The deterministic bifurcation diagram for the self-promoter. The dimensionless parameters are defined as $b = \beta\gamma\delta^2/\alpha_1^2$ (with $\gamma$ and $\beta$ the dissociation constants for dimerization and operator binding) and $a_0 = \alpha_0/\alpha_1$. The points 1, 3, and 4 indicate the parameter values used to produce the potentials shown in the figure. Points 1 and 2 specify the systems investigated in the Monte Carlo simulations shown in Fig. 4, both of which are monostable.

BIFURCATIONS

The deterministic system given by Eq. 39 acts as a switch in the appropriate parameter regime: the system has two distinct stable fixed points (for a discussion of the necessary conditions required to make a biological switch, see Cherry and Adler, 2000). Figure 2 shows the bifurcation diagram for this system as a function of the two parameters $b$ and $a_0$. The corresponding potentials $\Phi(x)$ are also drawn on the diagram to illustrate the number of stable fixed points in each region. In the region where there are two stable fixed points, the system acts as a genetic switch.

For stochastic systems, the notion of stable fixed points in the state space is not well defined. To generalize the idea of bifurcation for applicability to our stochastic models in such a way as to be consistent with the usual meaning as the fluctuations become vanishingly small, we focus our attention on the gain and loss of critical points in the probability density function (Horsthemke and Lefever, 1984). So, for example, a bifurcation in which a single stable fixed point becomes a pair of stable fixed points and a single unstable point corresponds to the transformation of a unimodal probability density function to one that is bimodal.

We now illustrate how fluctuations in the operator state change the dynamics of the system. In particular, these fluctuations can either induce bistability in regions that are deterministically (i.e., in the zero-noise limit) monostable or wash out regions of bistability. Figure 3 A is a bifurcation diagram for the stationary distribution given by Eq. 35 as a function of $b$ and $a_0$. For this figure $\kappa = 50$. The dashed curve corresponds to the deterministic bifurcation diagram shown in Fig. 2.
The various regions of this graph are labeled with the numbers 1-6. In Fig. 3 B, qualitative features of the steady-state distribution for each region of Fig. 3 A are shown. As the vertical line shown in Fig. 3 A is crossed from left to right, an integrable singularity occurs at the lower boundary of the distribution. Likewise, as the horizontal line is crossed from top to bottom, an integrable singularity occurs at the upper boundary of the distribution. The points labeled 1 and 2 in Fig. 2 are monostable in the deterministic limit. In Fig. 3 B, however, we see that finite operator fluctuations induce a type of bistability. The stationary distribution has a local maximum at high concentrations and the distribution is singular at $x = a_0$. Note that there is a small region near the cusp of the deterministic bifurcation curve where the deterministic system predicts bistability, but the fluctuations cause this behavior to be lost. Not surprisingly, the Monte Carlo simulations presented below reveal that fluctuations in the monomer concentration can also wash out bistable behavior. In region 3, the distribution is bimodal. This region grows and shifts as $\kappa$ is increased until it coincides with the deterministic limit (dashed curve). This effect can be seen in Fig. 3 C, which is a bifurcation diagram for the steady-state distribution as a function of $b$ and $\kappa$ with $a_0 = 0.05$. This value of $a_0$ is used in the Monte Carlo simulations discussed next.

Figure 4 shows the results of Monte Carlo simulations consistent with Eq. 32. The parameter values in the upper pair of panels correspond to the point labeled 1 in Fig. 2. In the deterministic limit, the system is monostable. From Fig. 3 A, however, we see that the finite operator fluctuations have induced bistability. This can be seen from the dashed curve in the top right panel of Fig. 4, representing the steady-state distribution when monomer fluctuations are ignored. In the left panel, a typical time series for the process is shown. There is no indication of bistability.
The solid curve shown in the top right panel is the steady-state distribution when the small-and-fast noise approximations are used (i.e., Eqs. 36 and 25). By comparing the dashed and solid curves, we see that, even with this large value of $m_o$, monomer fluctuations can still be detected. Indeed, these fluctuations are responsible for washing out the bistable nature of the system. We find excellent agreement between this distribution and the histogram from the Monte Carlo simulations.

Comparison of the histograms reveals that bistability can be induced by varying either the degradation rate or $K$. Experimentally, the degradation rate could be altered, for example, by the addition of a protease. Comparing the middle panels to Fig. 3 C, we expect that this system will be bistable, and, indeed, the time series shown in the middle left panel reveals that this is the case, although low monomer levels are very unlikely. The lower state can be further stabilized by increasing the degradation rate. In the bottom panels, $K$ has been reduced while the degradation rate is unchanged from the top panel. The bistability of the system is evident. Note that, although both approximations reproduce the steady-state distribution well, there is some discrepancy at low concentrations. This is exactly where we expect the approximations to break down.

FIGURE 3 (A) The bifurcation diagram for Eq. 35 with $\kappa = 50$. The regions numbered 1 to 6 correspond to qualitatively different steady-state distributions. The dashed curve is the bifurcation diagram in the deterministic limit shown in Fig. 2. (B) The steady-state distributions associated with the regions numbered in (A) and (C). (C) The bifurcation diagram for $b$ versus $\kappa$ with $a_0 = 0.05$.

Figure 5 is the same as Fig. 4 except that, instead of performing Monte Carlo simulations of the discrete process, we have generated sample paths based on the stochastic differential equation associated with Eq. 36. This method runs roughly an order of magnitude faster than the discrete Monte Carlo simulations. As can be seen, the agreement between the diffusion approximation and the full simulation is good, and the diffusion approximation appears to capture the dynamics faithfully as well as the steady-state distribution. We expand on this point in our discussion of the mean first-passage time below.
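Sample paths of a one-dimensional diffusion approximation can be generated with the Euler-Maruyama scheme, $X_{t+\Delta t} = X_t + A(X_t)\,\Delta t + \sqrt{B(X_t)\,\Delta t}\;\xi$ with $\xi$ a standard normal variate. The sketch below is an illustration under stated assumptions and not the authors' implementation. The drift is taken to be the leading-order deterministic term only (any correction of order $1/\kappa$ is omitted). The diffusion coefficient is taken as the operator-induced term of Eq. B19 plus a number-fluctuation term scaled by $1/m_o$, and that scaling is an assumption. Boundaries at $a_0$ and 1 are handled by simple clipping.

```python
import numpy as np

def drift(x, a0, b):
    """Leading-order drift: assumed deterministic self-promoter equation."""
    return (a0 * b + x**2) / (b + x**2) - x

def diffusion(x, a0, b, kappa, m_o):
    """Assumed diffusion coefficient B(x): operator noise plus number noise."""
    operator_part = 2.0 * b * x**2 * (1.0 - a0)**2 / (kappa * (b + x**2)**3)
    number_part = ((a0 * b + x**2) / (b + x**2) + x) / m_o
    return operator_part + number_part

def euler_maruyama(x0, a0, b, kappa, m_o, dt=1e-3, n_steps=20000, seed=0):
    """Integrate dX = A dt + sqrt(B) dW, clipping X to [a0, 1] (crude boundaries)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        noise = rng.normal(0.0, np.sqrt(dt))        # Wiener increment
        step = (drift(x[i], a0, b) * dt
                + np.sqrt(diffusion(x[i], a0, b, kappa, m_o)) * noise)
        x[i + 1] = np.clip(x[i] + step, a0, 1.0)
    return x
```

Because each step advances time by a fixed `dt` regardless of how many reactions would have fired, the cost does not grow with $m_o$, which is consistent with the speed advantage over the discrete simulation noted above.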
FIGURE 4 Sample paths and distributions for the self-promoter. The dimer concentration has been eliminated using the quasi-equilibrium approximation (Eq. 32). In all the panels, $a_0 = 0.05$. Each histogram corresponds to the time series on its left. Together, they illustrate the bifurcations that occur as the degradation rate or $K$ is varied. The bifurcation that occurs as $K$ is varied is due solely to fluctuations in the operator state and does not occur in the macroscopic limit. The distributions shown as solid lines are the results of the small-and-fast noise approximation. The dashed curves are the steady-state distributions given by Eq. 35.

Finally, we illustrate the validity of ignoring the inherent dimer concentration fluctuations. Figure 6 A shows sample paths for $M(t)$ and $D(t)$ generated by Monte Carlo simulations of the process described by Eq. 31. In the deterministic limit, the system is bistable and this is indeed evident in the time series. The intrinsic fluctuations, however, allow for transitions between the high and low protein levels. Figure 6 B shows a comparison between the corresponding histogram and the steady-state distribution found using simultaneous application of the small-and-fast noise and quasi-equilibrium approximations. Note the excellent agreement between the numerical and analytical results.

FIGURE 5 Sample paths of the stochastic differential equation associated with the small-and-fast noise approximation and corresponding steady-state distributions. The solid curves are the same as in Fig. 4.

FIGURE 6 (A) Sample paths from Monte Carlo simulations that include independently both dimer and monomer numbers (i.e., the processes described by Eq. 31), with dimensionless parameters $m_o = 500$, $b = 0.28$, and $a_0 = 0.08$. In the macroscopic limit, the system is bistable, as is evident in the time series. (B) The steady-state monomer distribution. The histogram was generated from the time series shown in (A). The solid curve is the steady-state distribution using the small-and-fast noise approximation together with the quasi-equilibrium approximation for the dimer number.

FIGURE 7 The MFPT as a function of $\log(\kappa)$. The solid lines are the results for the discrete process Eq. 32 and the dashed lines are the small-and-fast noise approximation. The dot-dashed line is the limit in which $m_o$ goes to infinity. To produce this curve, the process described by Eq. 34 was used. The rate parameters were adjusted to vary $m_o$ with $b = 0.28$ and $a_0 = 0.08$ fixed. The system is deterministically bistable.

ESCAPE TIMES

When the system is stochastically bistable, a quantity of interest is the time between switches from low to high concentrations and vice versa. This time is a random variable and is often referred to as the first-passage time. Here we present results for the mean first-passage time (MFPT). In Appendix D, we present the mathematical details needed to compute the MFPT. For illustration, we consider cases in which the corresponding deterministic system is bistable. We let the initial dimensionless concentration $X_0$ equal its lower bound, $a_0$, and compute the average time for the concentration to reach 0.65, which is close to the high-concentration steady state of Eq. 39.
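For a one-dimensional diffusion with drift $A(x)$ and diffusion coefficient $B(x)$, a reflecting lower boundary and an absorbing upper boundary $a$, the MFPT can be written as the double quadrature $T(x) = 2\int_x^{a} \frac{dy}{\psi(y)} \int_{\text{lo}}^{y} \frac{\psi(z)}{B(z)}\,dz$ with $\psi(y) = \exp\!\left(\int_{\text{lo}}^{y} 2A/B\,dz\right)$ (Gardiner, 1990; Eqs. D17 and D18 of Appendix D). The sketch below evaluates this formula with cumulative trapezoidal sums. It is an illustration and not the method used for the figures, which according to Appendix D were obtained by solving the differential equation for $T$ with a shooting method. The function names and grid size are arbitrary, and $\psi$ can overflow when the noise is very small.

```python
import numpy as np

def cumtrapz(y, x):
    """Cumulative trapezoidal integral of y over x, starting at zero."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

def mfpt(A, B, lo, hi, n=4001):
    """MFPT to reach `hi` from each grid point, reflecting boundary at `lo`.

    A and B are vectorized callables for the drift and diffusion coefficient.
    Returns the grid x and the mean first-passage times T(x).
    """
    x = np.linspace(lo, hi, n)
    a, bb = A(x), B(x)
    psi = np.exp(cumtrapz(2.0 * a / bb, x))    # integrating factor psi(y)
    inner = cumtrapz(psi / bb, x)              # int_lo^y psi(z)/B(z) dz
    outer = cumtrapz(2.0 * inner / psi, x)     # running outer integral from lo
    return x, outer[-1] - outer                # T(x) = integral from x to hi
```

A quick check is pure diffusion, $A = 0$ and $B = 2$ on $[0, 1]$, for which the formula reduces to $T(x) = (1 - x^2)/2$; the call `mfpt(lambda x: np.zeros_like(x), lambda x: np.full_like(x, 2.0), 0.0, 1.0)` reproduces this.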
It is assumed that, initially, the probabilities for the operator states take their equilibrium values. Figure 7 is a plot of the MFPT versus $\log \kappa$ for various values of $m_o$. The solid curves shown in this figure are the MFPT as calculated from the discrete process Eq. 32. The dot-dashed curve is the limiting case in which $m_o \to \infty$. In this limit, the MFPT goes to infinity as $\kappa$ is increased, because there are no fluctuations to induce switching. This approximation is valid for situations in which the operator fluctuations are slow and the total concentration is large. In this limit, the first-passage time again becomes large due to the long time spent in the unoccupied state. The dashed curves are the approximation in which both operator and concentration fluctuations are treated using the small noise approximation. This approximation is poor when $\kappa$ is small. Surprisingly, with sufficiently large $\kappa$, the approximation is good when $m_o$ is even as small as 25. The fact that the approximations accurately reproduce the MFPT indicates further that they are accurately depicting the dynamics and the steady-state distribution of the system.

Figure 8 shows the time series and the steady-state distributions for the three values of $m_o$ shown in Fig. 7 with $\kappa = 500$. The time series represent Monte Carlo simulations of the discrete process ignoring intrinsic dimer fluctuations. When $m_o = 25$, the bistable nature of the system is washed out by the fluctuations in monomer concentration. Again, we see that the small noise approximation accurately captures the steady-state distribution (solid curve). As $m_o$ is increased, bistability becomes apparent, and, for very large $m_o$ (data not shown), the small noise distribution becomes indistinguishable from the distribution that ignores intrinsic concentration fluctuations but explicitly includes operator-induced fluctuations (dashed curve). We can also see the emergence of bistability if we plot the MFPT versus the initial concentration $X_0$ (Fig. 9). The emergence of a step in MFPT at large values of $m_o$ indicates a potential barrier at around $X_0 = 0.3$ that fluctuations must surmount.

FIGURE 8 Time series and steady-state distributions for the various cases illustrated in Fig. 7 with $\kappa = 500$. The time series and histograms show the results of Monte Carlo simulations of the discrete process described by Eq. 32. The solid lines are the steady-state distributions obtained using the small-and-fast noise approximation. The dashed lines show the steady-state distribution in the limit $m_o \to \infty$ (Eq. 34). The transition to bistability for increasing $m_o$ is evident.

FIGURE 9 The MFPT as a function of the initial concentration $X_0$. The parameters used to produce this plot are the same as in Fig. 8. As before, the solid lines show the results for the discrete process and the dashed lines represent the small-and-fast noise approximation. With this value of $\kappa$, the approximation is good even when $m_o = 25$.

REGULATED SYSTEMS II: MUTUAL REPRESSORS

Another type of switch is formed when two proteins, labeled 1 and 2, act as mutual repressors; each binds to the operator of the other and represses its transcription. Recently, such a regulatory network has been engineered and shown to act as
a toggle switch (Gardner et al., 2000). To illustrate the mathematical techniques and highlight the main features of this system, we make several simplifying assumptions. Even for this oversimplified system, the effective diffusion equation is quite complicated and the dynamics nontrivial.

Our first simplifying assumption is that the two genes that code for proteins 1 and 2 share the same operator. Although our motivation is simplification, similar arrangements do occur in nature (Ptashne, 1992). This assumption reduces the number of operator states from 4 to 3: 0 (empty), 1 (occupied by protein 1), and 2 (occupied by protein 2). Again we assume that the proteins bind to the operator as dimers. For simplicity, we assume that the proteins differ only in the gene they repress; all other biophysical parameters are identical. Furthermore, we consider only the limiting case in which the dimerization reaction is fast compared to all other processes. As before, we assume that the dimeric form of the protein is stable and the dissociation constant is large. We assume that, if protein 1 is bound to the operator, then the production rate of protein 2 is zero, and vice versa. If no repressor is bound, both proteins are produced at their unrepressed rate. All these considerations lead to the set of biochemical reactions given in Eqs. 41 to 44.

It is straightforward to write down the master equation for this reaction scheme and use the quasi-equilibrium approximation to eliminate the dimer numbers. The resulting master equation for the monomer abundances is not enlightening and will not be presented here. As is shown in Appendix C, the diffusion equation in the small noise limit has the form

$$\partial_t \rho(x_1, x_2, t) = -\partial_{x_1}\!\left[A_1(x_1, x_2)\,\rho\right] - \partial_{x_2}\!\left[A_2(x_1, x_2)\,\rho\right] + \tfrac{1}{2}\,\partial_{x_1}^2\!\left[B_1(x_1, x_2)\,\rho\right] + \tfrac{1}{2}\,\partial_{x_2}^2\!\left[B_2(x_1, x_2)\,\rho\right] - \partial_{x_1}\partial_{x_2}\!\left[B_{12}(x_1, x_2)\,\rho\right]. \qquad (45)$$

The explicit forms of $A$ and $B$ are given in Appendix C. The occurrence of a cross term with a negative coefficient in the above equation indicates that the fluctuations in the two concentrations are anticorrelated, as expected, given the inhibitory interaction between the two protein species. In the deterministic limit, the ODEs for the dimensionless concentrations are

$$\frac{dx_1}{dt} = \frac{1}{1 + x_2^2/(b + x_1^2)} - x_1, \qquad (46)$$

$$\frac{dx_2}{dt} = \frac{1}{1 + x_1^2/(b + x_2^2)} - x_2, \qquad (47)$$

where $b$ is defined as before.

FIGURE 10 The bifurcation diagram for the mutual repressor. The solid curves indicate the stable fixed points of the system. The system undergoes a bifurcation at $b = 4/9$. The point marked 1 indicates the value of $b$ used in the top and bottom panels of Figs. 11 and 12, and the point marked 2 indicates the value of $b$ used in the middle panels of these figures.

Results for the mutual repressor

The bifurcation diagram for Eqs. 46 and 47 (Fig. 10) shows that, when $b < 4/9$, there are two stable fixed points. In the absence of a simple form for the steady-state distribution, we examine the effectiveness of the approximations by direct comparison of realizations of the discrete process to the continuous diffusion approximation (i.e., realizations from the stochastic differential equations corresponding to Eq. 45). Figure 11 shows time series and histograms for Monte Carlo simulations of the discrete process. The top three panels correspond to the point marked 1 in Fig. 10. In the deterministic limit, the system is monostable. The values of $m_o$ and $\kappa$ are 1000 and 50, respectively; we expect the small noise approximations to be valid. In the middle three panels, the parameters have been changed so that the system corresponds to the point marked 2 in Fig. 10; we expect the system to be bistable. This behavior is clearly seen in the time series and histograms. Finally, in the bottom three panels, the same parameter values are used as in the top panels except that $K$ has been reduced one-hundred-fold. The deterministic description predicts that the system is monostable. However, as can be seen from the figure, with finite fluctuations the system is bistable. Due to the small values of $\kappa$ and $\kappa b$ used in the bottom panels, we may not be justified in using the diffusion approximation for this case.

FIGURE 11 Monte Carlo simulations for the mutual repressor. The dimensionless parameter $b$ is 0.5 in the upper panels, 0.28 in the middle panels, and 0.5 in the lower panels. Note that, in the lower panels, $\kappa$ and $\kappa b$ are not large, so we expect fluctuations in the operator state to have a significant effect. The dimer concentration has been approximated using the quasi-steady-state approximation.

Figure 12 illustrates results obtained using the small-and-fast noise approximations. That is, the time series shown in the figure were generated using the stochastic differential equations associated with Eq. 45. Very good agreement is seen between the top and middle panels of this figure and Fig. 11, verifying the validity of these approximations. Some discrepancies are noticeable, however, in the bottom panels of the two figures, where $\kappa b$ is only 5. To accurately capture the dynamics of the system with this value, the fluctuations in the operator state must be explicitly included in the model. That is, Eqs. C1 to C3 should be used. When this is done, there is good agreement for all three cases (data not shown).

DISCUSSION

Genetic regulation is a topic of central importance in biology. With the advent of new techniques for the simultaneous determination of expression levels of tens of thousands of genes, many of its key issues are likely to be dealt with comprehensively in the next several years. Given the extraordinary quantities of data that will be necessary to accomplish these goals and the inherent complexity of the systems involved, it is inevitable that these gains will require significant use of novel mathematical and statistical tools.
Furthermore, the nature of transcription (small transcript numbers and discrete operator states) dictates that stochasticity be explicitly treated and understood in the basic models. This contention is buttressed by the existence of several macroscopic gene-regulatory phenomena in which stochastic effects play a major role (Weintraub, 1988; van Roon et al., 1989; Fiering et al., 1990; Dingemanse et al., 1994; Walters et al., 1995; Wijgerde et al., 1995; Ahmad and Henikoff, 2001).

We are particularly interested in examining the gene-product concentration variability due to internal fluctuations in the discrete states of the operator, because, to our knowledge, a theoretical treatment of these fluctuations does not exist. However, computer simulations of simple models of inducible gene expression have been studied (Ko, 1991; Cook et al., 1998). We are not aware of any estimates of reaction rates for operator fluctuations and therefore cannot say in advance just how large the associated effects will be. We have derived expressions for the gene-product variance attributable to these operator fluctuations that could help to estimate these effective rates, or conversely, given these effective rates, estimate the size of the associated variance.

FIGURE 12 Sample paths of the stochastic differential equations associated with the small-and-fast noise approximation of the mutual repressor. Good agreement between this figure and Fig. 11 is evident in the upper and middle panels, though differences are clearly visible in the lower panels. These disparities arise because the fluctuations in the operator state are not fast enough to warrant the fast noise approximation.

Even the more complicated of the models discussed here leave out potentially important features. The most egregious such omission is that of the distinction between transcription and translation. In other words, the models, read literally, are models of direct translation from DNA into protein. This simplification clearly has significant impact on phenomena in some systems. The artificial three-gene cycling construct (Elowitz and Leibler, 2000), for example, would not oscillate at all were it not for the delays between transcription and translation. Other researchers have developed models that treat transcription and translation separately. In a model of the tryptophan operon, Santillan and Mackey (2001a,b) used delay differential equations to take into account time delays associated with these two processes, whereas others have constructed stochastic models of transcription and translation that explicitly account for delays in these processes (McAdams and Arkin, 1997; Thattai and van Oudenaarden, 2001). In all these investigations, operator fluctuations were ignored. One aim of this manuscript is to understand the role of these fluctuations in transcriptional regulation. This is the motivation for simplifying transcription and translation into a single kinetic step.
A serious concern with this assumption is that our model allows a finite probability for the instantaneous production of protein. We do not expect including explicit models of transcription and translation to affect the qualitative features of our results. However, fully understanding the combined effects of all the relevant processes requires further investigation.

We have attempted here to strike a balance between those models that are based on discrete-object simulation (Endy and Brent, 2001; McAdams and Arkin, 1998) and those that are derived directly in terms of macroscopic state variables and either neglect randomness or add it by hand (Shea and Ackers, 1985; Hasty et al., 2001a,b; Hasty, 2000; Santillan and Mackey, 2001a,b). Both these approaches are useful and have provided insights into genetic networks. However, the former can be very difficult to analyze, or even understand adequately, and becomes computationally intractable for large networks, whereas the latter may fail to represent the phenomena quantitatively, or indeed, as we show here, qualitatively.
We have presented a set of stochastic models of differing levels of temporal and, effectively, spatial resolution, derived in various parameter limits. We have used these models to explore some of the basic consequences of stochasticity in transcriptional regulation in two simple models exhibiting stable switching behavior in the deterministic limit. Because our approach starts from a microscopic description, all the macroscopic parameters are defined in terms of the underlying chemical processes. Even for situations in which intrinsic fluctuations are negligible, it is important to derive the macroscopic rate equations directly from the underlying master equation, because phenomenological treatments do not reliably capture the dynamics of the system (cf. Eq. A12).

In the presence of noise, the switches are destabilized, and, correspondingly, the bifurcation diagrams that provide insight into the nature of the switches must be generalized. We have used the appearance of critical points in the steady-state probability density function (rather than of singular points of the deterministic dynamics) to characterize the qualitative behavior of these systems, which have rich bifurcation structure, including bifurcations associated with changes in the operator fluctuation rate alone. In other words, the qualitative behavior of these switches changes when the characteristic time for operator fluctuations is assumed zero, the limit in which the deterministic rate equations are derived.

The stochasticity of these switches causes spontaneous transitions to occur. In the context of our models, we can compute the mean first-passage time. We find that these mean waiting times increase exponentially with the operator fluctuation characteristic rate. Thus, we might expect to find that evolution has tuned these rates to just over the value necessary to prevent spontaneous transitions within the lifetime of the cell, except in the cases where spontaneous transitions are an intrinsic component of the functional behavior of the system.
Beyond the results reported here, our primary concern is that the effects of intrinsic noise may actually become increasingly important as more and more components are assembled into the large regulatory networks that clearly comprise the basic apparatus of the cell. Thus far, we have only examined the case of very small networks, and do not now have any sense for how the effects in question scale with network size.

There surely are sources of variability we have not yet considered (fluctuations in cell volume, ionic environment, DNA accessibility, etc.) that may have profound consequences for genetic regulation. There are several DNA chemical modifications, such as methylation (Yeivin and Razin, 1993) and acetylation (Grunstein, 1997), unique to multicellular eukaryotes, that result in longer-term, more stable regulatory changes. Even here, we expect that the initiation of these changes has great variability from cell to cell. We are anxious to learn, among other things, the role of stochasticity coupled to more macroscopic intercellular processes in driving the extraordinary development of a complete organism from a single cell.

APPENDIX A: QUASI-EQUILIBRIUM DIMER FLUCTUATIONS

In this appendix, we show how the quasi-equilibrium assumption can be used to eliminate the dimer number as a state variable from the problem. This procedure depends on two characteristics of the system: first, that the coefficient of variation of the dimer number conditional on the total protein concentration is small; second, that the rate constants for the dimerization reactions are fast compared to other rates in the system. We write the joint probability function as the product of the marginal in $n$ and the conditional of $d$ on $n$,

$$p_{n,d} = p_n\, p_{d|n}. \qquad (A1)$$

Summing Eq. 31 over $d$ gives the differential equation for $p_n$ (Eq. A2). Substitution of Eq. A1 into Eq. 31 yields that for $p_{d|n}$. This latter equation is somewhat complicated and not particularly transparent.
The dominant term in thi equation, however, i the dimer-related probability flux, or dimer flux, which, for both operator tate, i given by j n,d t n 2d 2n 2d p dn dp dn p n. (A3) A tend toward infinity, thi flux mut remain finite for all d: quaiequilibrium correpond to etting the expreion inide the quare bracket to zero. Equation for the conditional moment can then be found by multiplying thi expreion by d q and umming over d. The reult for q i dn n 2dnn 2dn 4Vardn. (A4) When the coefficient of variation of d conditional on n i mall (we will determine hortly when it will be mall), the variance term can be neglected, and the mean i given imply by dn mnmn, (A5) where m(n) n 2dn. Under the quai-teady-tate approximation, Eq. A2 become dp n dt n 2dn p n n 2dnp n p n p n Kp n dnp 0 n, (A6) Biophyical Journal 8(6) 36 336
332 Kepler and Elton fixed N can be found and ued to examine thi aumption explicitly. The equilibrium probability denity i given by log Dn dn 2d 4n8n 2 /2 8d 4n arctan 8n 2 2log4 2 m 0 2 d 4n n 2, (A8) where i a normalization contant. Figure A, A and B, how plot of thi ditribution with 50. In Fig. A, A, N 20. With thi value, the average number of dimer i between and 2. Surpriingly, the continuum limit work relatively well even for thi mall number of dimer. In Fig. A, B, N 20. For thi cae, the mean and the variance are 27.5 and 0.0, repectively. Note that, even with thi mall number of dimer, the ditribution look nearly Gauian and the coefficient of variation (0.0) i mall. For a large gene-product pool, we can ue the diffuion approximation to Eq. A6. To do thi, we convert to the dimenionle variable u t and Y N/m o. Then the marginal denity for X M/m o i found by uing the change of variable Y X 2X 2 /. The reult i u x a x 4xm o 22 a x 4xm o 3 b x 2 0. 2m 0 x 2 a x 4xm o 2 (A9) In the limit m 0 3, the above equation become u x a x 4xm o b x 2 0. (A0) FIGURE A The equilibrium dimer ditribution with 50: (A) N 20 and (B) N 20. The hitogram are the reult of Monte Carlo imulation and the olid line are the diffuion approximation, which treat the number of dimer a a continuou variable. where dn i given by Eq. A4. For the above equation to be ueful, however, we need the functional dependence of dn on n. Thi relation can be found eaily if the coefficient of variation i mall, in which cae Eq. A5 can be ued to olve for dn in term of n. If, additionally, the diociation contant i large, then M N. Making thi change of variable in Eq. A6 produce Eq. 32 of the text. We can approximate the coefficient of variation under the aumption that the third central moment i mall compared to the mean cubed, and find CV dn dn 4dn (A7) giving a condition for the elf-conitency of the approximation. 
If the dimer number i large enough to warrant the diffuion approximation, then an analytic expreion for the equilibrium dimer probability denity for The teady-tate marginal denity 0 for the above equation i log x x2 2 a 0x 2 3 m o x log 4m ox 6b 6a 0 2 3a 0 x 2x 2 b 4m ob log x a 2 0 4a 0 3 m o logx a 0, (A) which reduce to Eq. 35 in the limit m 0. Next, taking the limit 3 Biophyical Journal 8(6) 36 336
in Eq. A10 results in the Liouville equation, equivalent to the ODE,

$$\frac{dx}{dt} = \frac{1}{1 + 4 m_o x/\sigma}\left[\frac{a_0 b + x^2}{b + x^2} - x\right], \qquad (A12)$$

where $\sigma$ denotes the dimer dissociation constant. This reduces to Eq. 39 in the limit in which $\sigma$ is large compared with $m_o$. Note that the prefactor multiplying the right-hand side of Eq. A12 does not affect the fixed points of the system. It does, however, play a significant role in the dynamics.

APPENDIX B: EFFECTIVE DIFFUSION EQUATION FROM OPERATOR FLUCTUATIONS

General case

We often expect that the fluctuations in the operator state occur on a faster time scale than the rates of production and degradation of proteins. In this limit, it is possible to derive an effective diffusion equation for the marginal probability density. We start from the small noise approximation for the protein concentrations and make use of the dimensionless variables discussed in the text. Let the elements of the $q$-dimensional vector $X(t)$ denote the dimensionless concentrations of the $q$ protein species involved in the process at time $t$. The single-time density for the operator state and protein concentrations is denoted by $\rho_i(x, t)$. Therefore, $\rho$ is a $c$-dimensional vector, where $c$ is the number of chemical states of the operator. The single-time densities satisfy the equation

$$\partial_t \rho = L(x)\,\rho + K\rho, \qquad (B1)$$

where $K$ is the $c \times c$ transition matrix that contains the reaction rates for transitions between chemical states of the operator. The diagonal matrix operator $L$ has the form

$$L_{ii} = -\sum_{j=1}^{q} \partial_{x_j}\, g_{ji}(x) + \frac{1}{2}\sum_{j=1}^{q} \partial_{x_j}^2\, h_{ji}(x). \qquad (B2)$$

The matrix $g$ is a $q \times c$ matrix, whose $j$th column contains the net production rates of the $q$ protein species when the operator is in the $j$th chemical state. The matrix $h$ is likewise a $q \times c$ matrix. The columns of $h$ are the diffusion coefficients for each protein species in that particular chemical state of the operator. To make explicit our assumption that the chemical kinetics of the operator are fast, we scale $K$ as $K/\epsilon$ and write

$$\partial_t \rho = L(x)\,\rho + \frac{1}{\epsilon} K\rho. \qquad (B3)$$

Now, because probability is conserved, the matrix $K(x)$ must have one zero eigenvalue for all values of $x$. We assume that $K$ has exactly one zero eigenvalue (all the rest must be negative) at each point. Furthermore, the left eigenvector corresponding to the eigenvalue zero is the row vector $\mathbf{1}^T = (1, 1, \ldots, 1)$; i.e.,

$$\mathbf{1}^T K = 0. \qquad (B4)$$

We designate the corresponding right eigenvector $r_0(x)$ and normalize it to satisfy $\mathbf{1}^T r_0 = 1$. Thus, the elements of $r_0$ are the steady-state probabilities of the chemical states of the operator for fixed $x$. The projection operator,

$$\Pi = I - r_0 \mathbf{1}^T, \qquad (B5)$$

projects out the dynamics of the system that does not lie in the null space of $K$. The marginal density for $X$ is $f = \mathbf{1}^T \rho$. The joint density can be decomposed as

$$\rho = f r_0 + \eta, \qquad (B6)$$

where $\eta = \Pi\rho$. In terms of this decomposition, Eq. B3 becomes

$$\partial_t f = \mathbf{1}^T L\,(f r_0 + \eta), \qquad (B7)$$

$$\partial_t \eta = \Pi L\,(f r_0 + \eta) + \frac{1}{\epsilon} K\eta, \qquad (B8)$$

where the last term in the second equation takes account of the fact that $\Pi K = K$.

We next make the quasi-equilibrium approximation. That is, we assume that the probabilities for the chemical states of the operator reach their steady-state values before $X$ changes appreciably. This amounts to setting the left-hand side of Eq. B8 equal to zero. Remembering that, by construction, $\eta$ does not have a component that lies in the null space of $K$, we have

$$\eta = -\epsilon\, K^{+} \Pi L f r_0 + O(\epsilon^2), \qquad (B9)$$

where $K^{+}$ is a pseudo-inverse of $K$ defined by

$$K^{+} K = K K^{+} = \Pi \quad \text{and} \quad \Pi K^{+} = K^{+}. \qquad (B10)$$

An explicit formula for $K^{+}$ is

$$K^{+} = E \Lambda^{*} E^{-1}, \qquad (B11)$$

where $E$ is the matrix whose columns are the right eigenvectors of $K$, and $\Lambda^{*}$ is the diagonal matrix whose entries on the diagonal are the inverses of the eigenvalues of $K$, except that the entry corresponding to the null eigenvalue is itself zero. One can see that the matrix thus defined satisfies Eq. B10. We then substitute Eq. B9 into Eq. B7 to give the diffusion equation for the marginal density,

$$\partial_t f = \mathbf{1}^T L f r_0 - \epsilon\, \mathbf{1}^T L K^{+} \Pi L f r_0. \qquad (B12)$$

Example: two chemical states (self-promoter)

As an example of the algorithm described above, we treat the case of one protein species and two operator states, a special case of which is the self-promoter discussed in the text. Let $k_0 = -K_{11} = K_{21}$ and $k_1 = -K_{22} = K_{12}$. In this case, the matrices $g$ and $h$ are $1 \times 2$. Let $g_{11} = g_0$ and $g_{12} = g_1$. Likewise, let $h_{11} = h_0$ and $h_{12} = h_1$. Then we have

$$r_0 = \frac{1}{k_0 + k_1}\begin{pmatrix} k_1 \\ k_0 \end{pmatrix}, \qquad (B13)$$

$$K^{+} = -\frac{1}{(k_0 + k_1)^2}\begin{pmatrix} k_0 & -k_1 \\ -k_0 & k_1 \end{pmatrix}, \qquad (B14)$$

$$g r_0 = \frac{g_0 k_1 + g_1 k_0}{k_0 + k_1}, \qquad (B15)$$

$$h r_0 = \frac{h_0 k_1 + h_1 k_0}{k_0 + k_1}. \qquad (B16)$$

So Eq. B12 becomes

$$\partial_t f(x, t) = -\partial_x\!\left[A(x) f(x, t)\right] + \frac{1}{2}\,\partial_x^2\!\left[B(x) f(x, t)\right], \qquad (B17)$$
334 Kepler and Elton where Ax k g 0 k 0 g k 0 k and g 0 g k 0 k k 0 g 0 xk 0 k 0 k 2 k 0 g xk g 0 g k 0 k 2 (B8) B 2 k 0k g 0 g 2 k 0 k 3 h 0k h k 0 k 0 k. (B9) Note that, when 3 0 and when within-operator tate fluctuation are negligible, we recover the determinitic ODE for the ytem, dx dt k g 0 k 0 g k 0 k. (B20) The correpondence with the elf-promoter ytem dicued in the text i etablihed with g 0 (x) x, g (x) a 0 x, h 0 x, h a 0 x, K 0 (x) b, K 0 (x) x 2, and /, leading to the expreion for A(x) and B(x) given by Eq. 37 and 38. APPENDIX C: MUTUAL REPRESSORS Converting to dimenionle variable, the diffuion limit for fluctuation in the monomer concentration i u x, x 2 x x x2 x 2 2 2m x x 2 x2 x 2 o b x 2 0, (C) where A x, x 2 x 2 2 /b x 2 x 2x x 2 x x 2 x b x 2 2b x 2 x x 2 2 3b x 2x x x 2 4 bb x 2 x 2 2 4, A 2 x, x 2 x 2 /b x 2 2 x 2 2x x 2 x x 2 2 b x 2 x 2 2b x 2 2 x 2 x 2 4 3b x 2 2x 2 x 2 x, bb x 2 2 x 2 4 B x, x 2 m o x 2 2 /b x 2 x B 2 x, x 2 m o x 2 2 b 2 2bx 2 x 2 x 2 2 x 4 bb x 2 x 2 2 3 x 2 /b x 2 2 2 x x 2 b 2 2bx 2 2 x 2 x 2 2 x 4 2 bb x 2 x 2 2 3 B 2 x, x 2 x 2 x 2 2 2b x 2 x 2 2 bb x 2 x 2 2 3.,, u 2 x, x 2 x2 x 2 2 x x 2 2 2m x x 2 x2 x 2 2 o b 2 x 2 0, u 0 x, x 2 x2 x 2 0 x x 0 (C2) APPENDIX D: THE MEAN FIRST PASSAGE TIME Here we derive equation that govern the mean time to witch from one quai-table tate to another. For implicity, we will only conider the cae in which there i one protein pecie. For more complicated ytem, one i, in general, forced to ue numerical technique to compute the mean firt-paage time. 2 2m x x 2 x2 x 2 0 O x 2 x 2 2 0 b b 2, (C3) where X M /m o and X 2 M 2 /m o. In the limit of fat operator fluctuation, the diffuion approximation for the marginal denity 0 2 ha the form x, x 2, x A x, x 2 x2 A 2 x, x 2 2 x 2 B x, x 2 2 x2 B 2 x, x 2 x x2 B 2 x, x 2, (C4) Continuou and dicrete procee We begin by conidering the general ituation in which the tochatic proce ha both a continuou and dicrete component. 
Thi correpond to a cae in which the monomer concentration i conidered to be continuou, but the tate of the operator are dicrete. Below, we pecialize the treatment to conider purely dicrete or continuou ytem. The tarting point for thee conideration i the backward equation (correponding to the forward equation Eq. B) for the conditional or tranition denitie ij (y, tx, 0) (Gardiner, 990) t L x T Kx, (D) Biophyical Journal 8(6) 36 336
where

$$L_{ii} = \sum_{j=1}^{n} g_{ji}(x)\,\frac{\partial}{\partial x_j} + \frac{1}{2}\sum_{j=1}^{n} h_{ji}(x)\,\frac{\partial^{2}}{\partial x_j^{2}}. \tag{D2}$$

If we consider the closed interval [a, b] with an absorbing barrier at one or both ends, the mean first-passage time, $T_i(x)$, for the concentration to leave this interval, given that it started at x with the operator in state i at t = 0, is

$$T_i(x) = \int_0^{\infty}\!\int_a^b \sum_{j=1}^{m} \rho_{ji}(y, t \mid x, 0)\,dy\,dt = \sum_{j=1}^{m} \tau_{ji}(x). \tag{D3}$$

Using Eq. D1, it is straightforward to show that the $\tau_{ji}$ satisfy the equation

$$-I = \left(L(x)\,\tau^{T}\right)^{T} + \tau\,K(x). \tag{D4}$$

Let T denote the vector whose elements consist of $T_i(x)$. Summing the above matrix equation over the rows produces the vector equation for T,

$$-\mathbf{1} = \left(L\,T^{T}\right)^{T} + T K, \tag{D5}$$

where $\mathbf{1}$ is an m-dimensional row vector of ones. The above equation represents a set of m nonhomogeneous, coupled, second-order ODEs. The boundary conditions are $T_i = 0$ at an absorbing boundary and $\partial T_i/\partial x = 0$ at a reflecting boundary. The mean first-passage time for the process is found from

$$T(x) = \sum_{i=1}^{m} p_i\,T_i(x), \tag{D6}$$

where $p_i$ denotes the probability of being in state i at t = 0. Even for the simple two-state system discussed above, analytic solutions to Eq. D5 are unavailable. If we ignore fluctuations in the concentration, the equations for the two-state system become

$$-1 = g_0(x)\,\frac{dT_0}{dx} - k_0(x)\,(T_0 - T_1), \tag{D7}$$

$$-1 = g_1(x)\,\frac{dT_1}{dx} - k_1(x)\,(T_1 - T_0). \tag{D8}$$

Notice that the order of the equations has been reduced from second to first. Therefore, we only need two boundary conditions. If the deterministic flow is toward an absorbing barrier, then $T_i$ for that state must vanish at the boundary. If the flow is toward a reflecting boundary or stable fixed point, then $dT_i/dx$ vanishes at that point. There are no boundary conditions at points where the flow is away from absorbing or reflecting boundaries or unstable fixed points. To solve Eq. D7 and D8, we use the change of variables $\Delta = T_0 - T_1$. This produces

$$\frac{d\Delta}{dx} = \frac{k_0 g_1 + k_1 g_0}{g_0 g_1}\,\Delta + \frac{g_0 - g_1}{g_0 g_1}, \tag{D9}$$

whose solution can be written explicitly. The remaining differential equation is

$$\frac{dT_0}{dx} = -\frac{1}{g_0(x)} + \frac{k_0(x)}{g_0(x)}\,\Delta(x), \tag{D10}$$

which, again, has a solution given by quadrature involving the solution for $\Delta$.

Discrete processes

Here we consider the case in which the monomer number is treated as a discrete random variable. We restrict ourselves to the self-promoting system discussed in the text. Let $p_{ba}(j, t \mid m, 0)$ denote the transition density for the operator to be in state b with j monomers present at time t given that, at time 0, the operator was in state a with m monomers present. If there are l possibilities for the number of monomers, the transition densities can be arranged in a 2l × 2l matrix P, where the 2 comes from the fact that the operator has 2 states. The backward equation then has the form

$$\frac{dP}{dt} = P W, \tag{D11}$$

where W is the transition matrix for the entire process. The elements of W come from the underlying master equation for the process, e.g., Eq. 32. Let $T^{a}_{m}$ be the MFPT for a system starting in state (a, m) with an absorbing barrier placed at N > m and a reflecting barrier at n < m. Using Eq. D11 and 32 and following the same reasoning as described above for the mixed case, we have

$$-1 = \alpha_0\left(T^{0}_{m+1} - T^{0}_{m}\right) + m\left(T^{0}_{m-1} - T^{0}_{m}\right) + k_0\,m(m-1)\left(T^{1}_{m} - T^{0}_{m}\right), \tag{D12}$$

$$-1 = \alpha_1\left(T^{1}_{m+1} - T^{1}_{m}\right) + m\left(T^{1}_{m-1} - T^{1}_{m}\right) + k_1\left(T^{0}_{m} - T^{1}_{m}\right). \tag{D13}$$

The boundary conditions are as follows. At the absorbing boundary we have $T^{a}_{N} = 0$. At the reflecting boundary we have

$$-1 = \alpha_0\left(T^{0}_{n+1} - T^{0}_{n}\right) + k_0\,n(n-1)\left(T^{1}_{n} - T^{0}_{n}\right), \tag{D14}$$

$$-1 = \alpha_1\left(T^{1}_{n+1} - T^{1}_{n}\right) + k_1\left(T^{0}_{n} - T^{1}_{n}\right). \tag{D15}$$

Eq. D12–D15 must be solved numerically.

Continuous processes

For completeness, we note that an expression for the mean first-passage time can be constructed in the full diffusion limit (Gardiner, 1990). In this case, the mean first-passage time T(x) satisfies the equation

$$-1 = A(x)\,\frac{dT(x)}{dx} + \frac{B(x)}{2}\,\frac{d^{2}T(x)}{dx^{2}}. \tag{D16}$$

The mean first-passage time for the concentration to leave the interval [0, a] with a reflecting boundary at x = 0 and an absorbing boundary at x = a is

$$T(x) = 2\int_x^a \frac{dy}{\psi(y)} \int_0^y \frac{\psi(x')}{B(x')}\,dx', \tag{D17}$$
where

$$\psi(x) = \exp\left[\int_0^x \frac{2A(y)}{B(y)}\,dy\right]. \tag{D18}$$

Because of the multiple integrals involved in the above expression, it is, in general, not particularly useful. In fact, the results for this case presented in the manuscript were obtained by numerically solving Eq. D16 using a shooting method.

The authors would like to thank Bard Ermentrout for useful discussions during the early work on this project and Jeff Hasty for his critical reading of the manuscript. T.C.E. was supported by the National Science Foundation under research grant DMS-007582. T.B.K. was partially supported by National Science Foundation award MCB 9357637.

REFERENCES

Ahmad, K., and S. Henikoff. 2001. Modulation of a transcription factor counteracts heterochromatic gene silencing in Drosophila. Cell. 104:839–847.
Arkin, A., J. Ross, and H. H. McAdams. 1998. Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics. 149:1633–1648.
Becskei, A., B. Seraphin, and L. Serrano. 2001. Positive feedback in eukaryotic gene networks: cell differentiation by graded to binary response conversion. EMBO J. 20:2528–2535.
Bennett, D. C. 1983. Differentiation in mouse melanoma cells: initial reversibility and an on–off stochastic model. Cell. 34:445–453.
Cherry, J. L., and F. R. Adler. 2000. How to make a biological switch. J. Theor. Biol. 203:117–133.
Cook, D. L., A. N. Gerber, and S. J. Tapscott. 1998. Modeling stochastic gene expression: implications for haploinsufficiency. Proc. Natl. Acad. Sci. U.S.A. 95:15641–15646.
Dingemanse, M. A., P. A. J. de Boer, A. F. M. Moorman, R. Charles, and W. H. Lamers. 1994. The expression of liver-specific genes within rat embryonic hepatocytes is a discontinuous process. Differentiation. 56:153–162.
Elowitz, M. B., and S. Leibler. 2000. A synthetic oscillatory network of transcriptional regulators. Nature. 403:335–338.
Endy, D., and R. Brent. 2001. Modeling cellular behavior. Nature. 409:391–395.
Fiering, S., J. P. Northrop, G. P. Nolan, P. S. Mattila, G. R. Crabtree, and L. A. Herzenberg. 1990.
Single cell assay of a transcription factor reveals a threshold in transcription activated by signals emanating from the T-cell antigen receptor. Genes Dev. 4:1823–1834.
Gardiner, C. 1990. Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences. Springer-Verlag, Berlin. 36–42.
Gardner, T. S., C. R. Cantor, and J. J. Collins. 2000. Construction of a genetic toggle switch in Escherichia coli. Nature. 403:339–342.
Grunstein, M. 1997. Histone acetylation in chromatin structure and transcription. Nature. 389:349–352.
Hasty, J., F. Isaacs, M. Dolnik, D. McMillen, and J. J. Collins. 2001a. Designer gene networks: towards fundamental cellular control. Chaos. 11:207–220.
Hasty, J., D. McMillen, F. Isaacs, and J. J. Collins. 2001b. Computational studies of gene regulatory networks: in numero molecular biology. Nat. Rev. Gen. 2:268–278.
Hasty, J., J. Pradines, M. Dolnik, and J. J. Collins. 2000. Noise-based switches and amplifiers for gene expression. Proc. Natl. Acad. Sci. U.S.A. 97:2075–2080.
Horsthemke, W., and R. Lefever. 1984. Noise Induced Transitions. Theory and Applications in Physics, Chemistry and Biology. Springer-Verlag, Berlin. 258–292.
Ko, M. S., H. Nakauchi, and N. Takahashi. 1990. The dose dependence of glucocorticoid-inducible gene expression results from changes in the number of transcriptionally active templates. EMBO J. 9:2835–2842.
Ko, M. S. H. 1991. A stochastic model for gene induction. J. Theor. Biol. 153:181–194.
Ko, M. S. H. 1992. Induction mechanism of a single gene molecule: stochastic or deterministic? Bioessays. 14:341–346.
McAdams, H. H., and A. Arkin. 1997. Stochastic mechanisms in gene expression. Proc. Natl. Acad. Sci. U.S.A. 94:814–819.
McAdams, H. H., and A. Arkin. 1998. Simulation of prokaryotic genetic circuits. Ann. Rev. Biophys. Biomol. Struct. 27:199–224.
Ptashne, M. 1992. A Genetic Switch: Phage λ and Higher Organisms. 2nd edition. Cell Press and Blackwell Scientific Publications, Cambridge, MA.
Santillan, M. S., and M. C. Mackey. 2001a.
Dynamic regulation of the tryptophan operon: a modeling study and comparison with experimental data. Proc. Natl. Acad. Sci. U.S.A. 98:1364–1369.
Santillan, M. S., and M. C. Mackey. 2001b. Dynamic behavior in mathematical models of the tryptophan operon. Chaos. 11:261–268.
Shea, M. A., and G. K. Ackers. 1985. The OR control system of bacteriophage lambda. A physical-chemical model for gene regulation. J. Mol. Biol. 181:211–230.
Thattai, M., and A. van Oudenaarden. 2001. Intrinsic noise in gene regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 98:8614–8619.
van Kampen, N. G. 1992. Stochastic Processes in Chemistry and Physics. North-Holland, Amsterdam. 39–208.
van Roon, M. A., J. A. Aten, C. H. van Oven, R. Charles, and W. H. Lamers. 1989. The initiation of hepatocyte-specific gene expression within embryonic hepatocytes is a stochastic event. Dev. Biol. 136:508–516.
Walters, M. C., S. Fiering, J. Eidemiller, W. Magis, M. Groudine, and D. I. K. Martin. 1995. Enhancers increase the probability but not the level of gene expression. Proc. Natl. Acad. Sci. U.S.A. 92:7125–7129.
Weintraub, H. 1988. Formation of stable transcription complexes as assayed by analysis of individual templates. Proc. Natl. Acad. Sci. U.S.A. 85:5819–5823.
Wijgerde, M., F. Grosveld, and P. Fraser. 1995. Transcription complex stability and chromatin dynamics in vivo. Nature. 377:209–213.
Yeivin, A., and A. Razin. 1993. DNA methylation: molecular biology and biological significance. In DNA Methylation: Molecular Biology and Biological Significance. J. P. Jost and H. P. Saluz, editors. Birkhäuser Verlag, Basel, Switzerland. 523–568.
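
Supplementary numerical sketch. In the full diffusion limit of Appendix D ("Continuous processes"), the mean first-passage time out of [0, a], with a reflecting boundary at x = 0 and an absorbing boundary at x = a, is the nested integral $T(x) = 2\int_x^a \psi(y)^{-1}\int_0^y \psi(x')/B(x')\,dx'\,dy$ with $\psi(x) = \exp[\int_0^x 2A(y)/B(y)\,dy]$. The Python sketch below evaluates this expression on a grid with the trapezoidal rule. It is not the shooting method used for the results in the manuscript. It assumes NumPy is available, and the helper names `cumtrapz_from_zero` and `mfpt_quadrature`, the grid size, and the test functions A = 0, B = 2 are illustrative choices introduced here. For that test case the integrals can be done by hand and give T(x) = (1 − x²)/2 on [0, 1].

```python
import numpy as np


def cumtrapz_from_zero(f, x):
    """Cumulative trapezoidal integral of samples f on grid x; first entry is 0."""
    out = np.zeros(len(x), dtype=float)
    out[1:] = np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))
    return out


def mfpt_quadrature(A, B, a, n_grid=2001):
    """Mean first-passage time out of [0, a] by direct quadrature.

    A, B : vectorized callables (drift and diffusion functions); B must be positive.
    Reflecting boundary at x = 0, absorbing boundary at x = a.
    Returns the grid x and T(x) sampled on it.
    """
    x = np.linspace(0.0, a, n_grid)
    Ax, Bx = A(x), B(x)
    psi = np.exp(cumtrapz_from_zero(2.0 * Ax / Bx, x))  # psi(x) = exp(int_0^x 2A/B dy)
    inner = cumtrapz_from_zero(psi / Bx, x)              # int_0^y psi(x')/B(x') dx'
    outer = cumtrapz_from_zero(inner / psi, x)           # int_0^x inner(y)/psi(y) dy
    T = 2.0 * (outer[-1] - outer)                        # 2 * int_x^a inner(y)/psi(y) dy
    return x, T


# Illustrative check with assumed functions (pure diffusion): A = 0, B = 2.
x, T = mfpt_quadrature(lambda x: np.zeros_like(x), lambda x: np.full_like(x, 2.0), a=1.0)
print(T[0])  # should be close to 0.5, the hand-computed value (1 - 0**2) / 2
```

For drift and diffusion functions that vary strongly over the interval, ψ can span many orders of magnitude and this direct quadrature may lose accuracy, which is one practical reason to prefer solving the differential equation for T(x) numerically instead.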