Stochastic Gene Expression in Prokaryotes: A Point Process Approach
Emanuele LEONCINI
INRIA Rocquencourt - INRA Jouy-en-Josas
ASMDA, Mataró, June 28th 2013

Introduction: Central role of protein production in prokaryotes

Proteins are the core of biological processes: enzymes, DNA replication machinery, ...
- 50% of the bacterial dry weight
- 3.5 million proteins in each cell
- 2000 types of proteins
- produced at any time, under any growth condition (volume growth)
- protein numbers ranging from a few dozen up to $10^5$

A highly resource-consuming process:
- at each generation, the bacterium has to duplicate all of its proteins
- more than 85% of cell resources

Introduction: Stochasticity and living organisms

Stochasticity in protein production: experimental viewpoint
[Figure: Adk, cytoplasmic protein (1)]
(1) Yuichi Taniguchi et al. Science (2010), pp. 533-538.

Stochasticity in protein production: structural

Stochasticity in bacteria

Sources of stochasticity:
- bacterial cytoplasm: disordered medium
- main cellular motility mechanism: diffusion in a stiff medium
- most cellular processes require the encounter of macromolecules (Poisson process)

Protein production: an inherently stochastic process.

Model: Stochastic model of gene expression in prokaryotes

Model description: 4-step model

$Y(t) \in \{0,1\}$ (gene status) $\to$ $M(t) \in \mathbb{N}$ (messenger) $\to$ $R(t) \in \mathbb{N}$ (ribosome) $\to$ $P(t) \in \mathbb{N}$ (protein)

- Activation: the gene status $Y(t)$ switches with rates $\lambda_1^+$ and $\lambda_1^-$
- Transcription: messengers $M(t)$, rate $\lambda_2$
- Translation initiation: ribosomes $R(t)$, rate $\lambda_3$
- Translation completion: proteins $P(t)$, rate $\mu_3$
- Rates $\mu_2$ and $\mu_4$ are attached to $M(t)$ and $P(t)$ respectively (degradation)

Goal: characterize the mean and variance of the number of proteins $P$ at equilibrium.

Model: Classic assumptions & results

Classic assumptions

Properties:
- Assumption: each step has an exponentially distributed duration
- Markovian description of protein production

Tools:
- Markov processes
- Fokker-Planck equations
- explicit analytic formulas for the mean and variance as functions of the main parameters

Results: models used by biologists

Quantitative characterisation of protein fluctuations:

$$E[P] = \frac{\lambda_2 \lambda_3}{\mu_2 \mu_4}, \qquad \mathrm{var}(P) = E[P]\left[1 + \frac{\lambda_3\,\mu_3\,(\mu_2+\mu_3+\mu_4)}{(\mu_2+\mu_3)(\mu_2+\mu_4)(\mu_3+\mu_4)}\right]$$

if $\delta_+ \stackrel{\text{def}}{=} \frac{\lambda_1^+}{\lambda_1^+ + \lambda_1^-} = 1$ (active gene).
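
The closed-form mean and variance of the Markovian model can be checked against a direct simulation. The following is a minimal sketch, assuming a gene that is always active ($\delta_+ = 1$), ribosomes that complete translation independently of the fate of their messenger, and hypothetical parameter values chosen only so that the run is short. The function names are illustrative and do not come from the talk.

```python
import random

# State changes (dM, dR, dP) for the five reactions of the 4-step model,
# with the gene assumed always active (delta_+ = 1).
STOICH = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 1), (0, 0, -1)]

def simulate_protein_moments(lam2, mu2, lam3, mu3, mu4, t_end, rng, burn_in=50.0):
    """Gillespie simulation; returns the time-averaged mean and variance of P."""
    m = r = p = 0
    t = 0.0
    weight = sum_p = sum_p2 = 0.0
    while t < t_end:
        # Rates: transcription, mRNA decay, initiation, completion, protein decay.
        rates = [lam2, mu2 * m, lam3 * m, mu3 * r, mu4 * p]
        dt = rng.expovariate(sum(rates))
        if t >= burn_in:  # discard the transient before accumulating statistics
            weight += dt
            sum_p += p * dt
            sum_p2 += p * p * dt
        t += dt
        dm, dr, dp = rng.choices(STOICH, weights=rates)[0]
        m, r, p = m + dm, r + dr, p + dp
    mean = sum_p / weight
    return mean, sum_p2 / weight - mean * mean

def closed_form(lam2, mu2, lam3, mu3, mu4):
    """Mean and variance of P from the formula of the slide (active gene)."""
    mean = lam2 * lam3 / (mu2 * mu4)
    extra = lam3 * mu3 * (mu2 + mu3 + mu4) / ((mu2 + mu3) * (mu2 + mu4) * (mu3 + mu4))
    return mean, mean * (1 + extra)

# Hypothetical parameter values, not taken from the talk.
params = dict(lam2=2.0, mu2=1.0, lam3=3.0, mu3=5.0, mu4=1.0)
sim_mean, sim_var = simulate_protein_moments(**params, t_end=4000.0, rng=random.Random(1))
cf_mean, cf_var = closed_form(**params)
print(f"simulated: mean={sim_mean:.2f} var={sim_var:.2f}")
print(f"formula:   mean={cf_mean:.2f} var={cf_var:.2f}")
```

With these values the formula gives a mean of 6 and a variance of 14.75; the simulated estimates should fall close to them, up to Monte Carlo error.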

Classic approach: the exponential assumption

Not every process described has an exponentially distributed duration.

[Diagram: the 4-step model $Y(t) \to M(t) \to R(t) \to P(t)$ with question marks on some of the rates]

The duration of the following processes is not exponential:
- protein elongation
- mRNA elongation
- deterministic protein dilution (vs. classic stochastic proteolysis)

Protein chain elongation

[Figure: protein chain elongation; labels: tRNA (charged), tRNA (uncharged)]

Elongation:
- exponentially distributed elementary steps, therefore the total elongation time is not exponentially distributed
- large number of steps ($N \approx 400$ a.a.): the elongation time is described by a normal random variable with var = (mean elongation time)$^2$ / $N$
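
The concentration of the elongation time around its mean can be illustrated numerically. The sketch below assumes $N$ independent elementary steps with identical exponential rates, so that the total time has mean $m$ and variance $m^2/N$. The mean elongation time used here is a hypothetical value in arbitrary units.

```python
import random
import statistics

def elongation_time(n_steps, mean_total, rng):
    """Sum of n_steps i.i.d. exponential steps whose means add up to mean_total."""
    rate = n_steps / mean_total  # rate of each elementary step
    return sum(rng.expovariate(rate) for _ in range(n_steps))

rng = random.Random(0)
N = 400             # number of amino acids, as on the slide
MEAN_TOTAL = 20.0   # hypothetical mean elongation time (arbitrary units)
samples = [elongation_time(N, MEAN_TOTAL, rng) for _ in range(2000)]

sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)
predicted_var = MEAN_TOTAL ** 2 / N  # variance of a sum of N i.i.d. exponentials

print(f"mean: {sample_mean:.3f} (expected {MEAN_TOTAL})")
print(f"var:  {sample_var:.3f} (expected {predicted_var})")
```

The coefficient of variation is $1/\sqrt{N}$, about 5% for $N = 400$, which is why a nearly deterministic elongation time is a natural limit case.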

MPPP

Marked Poisson Point Process (MPPP): a new description of gene expression

MPPP: Explanatory model

[Diagram: the gene switches between states 0 and 1 with rates $\lambda_+$ and $\lambda_-$; ENCOUNTER (birth) at rate $\lambda_M$ feeds $M(t) \in \mathbb{N}$; PROCESSING (lifetime) has distribution $F_M(dt)$]

Assumptions:
- $Y(t) \in \{0,1\}$: exponentially distributed switches with rates $\lambda_+$, $\lambda_-$
- the births $(s_n)$ follow a Poisson process of parameter $\lambda_M$
- the time $(\sigma_n)$ to process an element of $M(t)$ has a (general) distribution $F_M(dt)$ (mark)

Explanatory model

[Figure: marks plotted in the plane, with birth times $s_n$ on the horizontal axis and lifetimes $\sigma_n$ on the vertical axis; $E_1, E_2, E_3$ are the gaps between successive births; the instants $s_1 + \sigma_1$, $s_2 + \sigma_2$, $s_3 + \sigma_3$ are placed relative to $t$, together with the line labelled $y = s + t$]

$E_i$: exponential random variables of parameter $\lambda_M$.

How many elements of $M$ are alive at time $t$?
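
Counting the elements alive at time $t$ amounts to counting the marks $(s_n, \sigma_n)$ with $s_n \le t < s_n + \sigma_n$. A minimal sketch follows, assuming a gene that is always active (so births form a plain Poisson process) and a hypothetical uniform lifetime distribution; neither choice comes from the talk.

```python
import random
import statistics

def alive_at(t, birth_rate, sample_lifetime, horizon, rng):
    """Count marks (s_n, sigma_n) with s_n <= t < s_n + sigma_n.

    Births are generated on [t - horizon, t]; horizon must exceed the
    longest possible lifetime for the count to be stationary.
    """
    count = 0
    s = t - horizon
    while True:
        s += rng.expovariate(birth_rate)   # E_i: gap to the next birth
        if s > t:
            return count
        if s + sample_lifetime(rng) > t:   # still being processed at time t
            count += 1

def uniform_lifetime(rng):
    """Hypothetical lifetime distribution F_M: uniform on [0, 2], mean 1."""
    return rng.uniform(0.0, 2.0)

rng = random.Random(42)
LAMBDA_M = 5.0  # hypothetical birth rate
counts = [alive_at(0.0, LAMBDA_M, uniform_lifetime, 10.0, rng) for _ in range(4000)]

mean_alive = statistics.fmean(counts)
var_alive = statistics.variance(counts)
print(f"mean alive: {mean_alive:.2f}, variance: {var_alive:.2f}")
```

With an always-active gene the count is Poisson distributed with mean $\lambda_M E[\sigma]$ (5 here), so the empirical mean and variance should both be close to that value, whatever the shape of the lifetime distribution.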

Explanatory model: general results

At equilibrium:

$$M = N_{\lambda_M}\left(\mathbf{1}_{\{u \le 0 \le u+v\}}\right) = \int_{\mathbb{R} \times \mathbb{R}_+} \mathbf{1}_{\{u \le 0 \le u+v\}}\, N_{\lambda_M}(du, dv)$$

The Laplace transform of a marked Poisson point process is

$$\mathcal{L}_{N_{\lambda_M}}(f) = E\left[e^{-N_{\lambda_M}(f)}\right] = \exp\left(-\int \left(1 - e^{-f(x,y)}\right) \lambda_M\, dx\, F_M(dy)\right)$$

where $N_{\lambda_M}(f) = \sum_n f(s_n, \sigma_n)$.

Proposition:

$$E[M] = \delta_+ \lambda_M E[\sigma]$$

$$\mathrm{var}(M) = E[M] + 2 \lambda_M^2\, \delta_+ (1 - \delta_+) \int_0^{+\infty}\!\!\int_0^{+\infty} e^{-(\lambda_+ + \lambda_-) v}\, (1 - F_M(u))\,(1 - F_M(u+v))\, du\, dv$$

where $\delta_+ = \frac{\lambda_+}{\lambda_+ + \lambda_-}$.
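
The variance formula of the proposition can be evaluated numerically for any lifetime distribution given through its survival function $1 - F_M$. The sketch below uses a plain midpoint rule on a truncated domain, and compares the result with the closed form obtained when the lifetime is assumed exponential with rate $\mu$, in which case the double integral equals $1 / (2\mu(\lambda_+ + \lambda_- + \mu))$. All parameter values and function names are hypothetical.

```python
import math

def moments_of_M(lam_plus, lam_minus, lam_M, survival, mean_sigma,
                 upper=20.0, n=400):
    """Mean and variance of M from the proposition.

    survival(u) = 1 - F_M(u). The double integral is truncated to
    [0, upper]^2 and approximated with an n x n midpoint rule.
    """
    delta = lam_plus / (lam_plus + lam_minus)
    mean = delta * lam_M * mean_sigma
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        s_u = survival(u)
        for j in range(n):
            v = (j + 0.5) * h
            total += math.exp(-(lam_plus + lam_minus) * v) * s_u * survival(u + v)
    integral = total * h * h
    return mean, mean + 2.0 * lam_M ** 2 * delta * (1.0 - delta) * integral

# Hypothetical values: exponential lifetime with rate MU.
LAM_PLUS, LAM_MINUS, LAM_M, MU = 0.5, 1.5, 10.0, 1.0
mean_M, var_M = moments_of_M(LAM_PLUS, LAM_MINUS, LAM_M,
                             survival=lambda u: math.exp(-MU * u),
                             mean_sigma=1.0 / MU)

# Closed form for the exponential case, used only as a cross-check.
delta = LAM_PLUS / (LAM_PLUS + LAM_MINUS)
var_closed = mean_M + LAM_M ** 2 * delta * (1 - delta) / (MU * (LAM_PLUS + LAM_MINUS + MU))
print(f"mean={mean_M:.3f} var(numeric)={var_M:.3f} var(closed form)={var_closed:.3f}")
```

Swapping in a different `survival` function (for instance a step function for a deterministic lifetime) reuses the same routine, which is the point of having a formula valid for a general $F_M$.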

MPPP: Applications & Model Extensions

MPPP 4-step model

$Y(t) \in \{0,1\} \to M(t) \in \mathbb{N} \to R(t) \in \mathbb{N} \to P(t) \in \mathbb{N}$

- General setting: rates $\lambda_1^+$, $\lambda_1^-$, $\lambda_2$, $\lambda_3$ and lifetime distributions $F_2(dv)$, $F_3(dy)$, $F_4(dz)$
- Setting considered next: rates $\lambda_1^+$, $\lambda_1^-$, $\lambda_2$, $\lambda_3$, $\mu_2$, $\mu_4$ and a general distribution $F_3(dy)$

Choices for $F_3(dy)$:
- Exponential: explicit closed formula depending on the model parameters
- Normal: analytic formula
- Deterministic: explicit closed formula depending on the model parameters (limit case)

MPPP 4-step model: Deterministic vs Exponential

$$\mathrm{var}_{Det}(P) = E[P]\left[1 + \frac{\lambda_3}{\mu_2 + \mu_4}\right]$$

$$\mathrm{var}_{Exp}(P) = E[P]\left[1 + \frac{\lambda_3}{\mu_2 + \mu_4} \cdot \frac{\mu_3 (\mu_2 + \mu_3 + \mu_4)}{(\mu_2 + \mu_3)(\mu_3 + \mu_4)}\right]$$
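
The two variance formulas can be compared directly. The sketch below evaluates both with hypothetical parameter values; it relies on the identity $(\mu_2+\mu_3)(\mu_3+\mu_4) = \mu_3(\mu_2+\mu_3+\mu_4) + \mu_2\mu_4$, which makes the extra factor in the exponential case strictly smaller than 1 whenever $\mu_2 \mu_4 > 0$.

```python
def var_det(mean_p, lam3, mu2, mu4):
    """Protein variance with deterministic elongation time."""
    return mean_p * (1.0 + lam3 / (mu2 + mu4))

def var_exp(mean_p, lam3, mu2, mu3, mu4):
    """Protein variance with exponentially distributed elongation time."""
    factor = mu3 * (mu2 + mu3 + mu4) / ((mu2 + mu3) * (mu3 + mu4))
    return mean_p * (1.0 + lam3 / (mu2 + mu4) * factor)

# Hypothetical parameter values, not taken from the talk.
lam2, lam3, mu2, mu3, mu4 = 2.0, 3.0, 1.0, 5.0, 1.0
mean_p = lam2 * lam3 / (mu2 * mu4)

v_det = var_det(mean_p, lam3, mu2, mu4)
v_exp = var_exp(mean_p, lam3, mu2, mu3, mu4)
print(f"E[P]={mean_p}, var_Det={v_det}, var_Exp={v_exp}")

# As mu3 grows (elongation becomes instantaneous) the two variances coincide.
v_exp_fast = var_exp(mean_p, lam3, mu2, 1e9, mu4)
```

The gap between the two shrinks as $\mu_3$ grows, consistent with the deterministic case being listed as a limit case.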

Conclusions

- (MPPP) an appropriate mathematical tool to describe stochastic cellular processes (Encounter + Processing)
- analysis and proof of the correct assumption for protein degradation (proteolysis/volume dilution)
- analytic formula for any distribution
- explicit formula depending on the model parameters for specific and interesting distributions
- deterministic protein elongation might be an upper bound for protein variance
- counter-intuitive: $\mathrm{var}_{DET}(P) > \mathrm{var}_{EXP}(P)$

V. Fromion, E. Leoncini, and P. Robert. Stochastic Gene Expression in Cells: A Point Process Approach. SIAM Journal on Applied Mathematics 73.1 (2013), pp. 195-211.

Thanks.