Statistical mechanics for real biological networks

From this document you will learn the answers to the following questions:

Who is the name of William Bialek?

What model for networks of real neurons?

Similar documents
DNA Insertions and Deletions in the Human Genome. Philipp W. Messer

Bayesian probability theory

Topic 3b: Kinetic Theory

Cluster detection algorithm in neural networks

Methods of Data Analysis Working with probability distributions

Heuristics for the Sorting by Length-Weighted Inversions Problem on Signed Permutations

"in recognition of the services he rendered to the advancement of Physics by his discovery of energy quanta". h is the Planck constant he called it

Topic 2: Energy in Biological Systems

Some background on information theory

High flexibility of DNA on short length scales probed by atomic force microscopy

arxiv:cond-mat/ v1 [cond-mat.soft] 31 Aug 2000

Integrating DNA Motif Discovery and Genome-Wide Expression Analysis. Erin M. Conlon

Simple Linear Regression Inference

8. Time Series and Prediction

MAKING AN EVOLUTIONARY TREE

Determining Measurement Uncertainty for Dimensional Measurements

What Do You Think? For You To Do GOALS

Tutorial on Markov Chain Monte Carlo

Hierarchical Bayesian Modeling of the HIV Response to Therapy

COMPUTATIONAL FRAMEWORKS FOR UNDERSTANDING THE FUNCTION AND EVOLUTION OF DEVELOPMENTAL ENHANCERS IN DROSOPHILA

NO CALCULATORS OR CELL PHONES ALLOWED

Exploratory Spatial Data Analysis

Gene Models & Bed format: What they represent.

E190Q Lecture 5 Autonomous Robot Navigation

Graph theoretic approach to analyze amino acid network

Project 5: Scoville Heat Value of Foods HPLC Analysis of Capsaicinoids

Experiment: Static and Kinetic Friction

Activity 7.21 Transcription factors

Normal Distribution. Definition A continuous random variable has a normal distribution if its probability density. f ( y ) = 1.

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Fluids Confined in Carbon Nanotubes

CALCULATIONS & STATISTICS

Table of Content. Enzymes and Their Functions Teacher Version 1

Some Essential Statistics The Lure of Statistics

BIOPHYSICS OF NERVE CELLS & NETWORKS

Masters research projects. 1. Adapting Granger causality for use on EEG data.

II. DISTRIBUTIONS distribution normal distribution. standard scores

Heat Transfer Prof. Dr. Ale Kumar Ghosal Department of Chemical Engineering Indian Institute of Technology, Guwahati

Lecture 19: Eutectoid Transformation in Steels: a typical case of Cellular

Mass Spectrometry Signal Calibration for Protein Quantitation

NOVEL GENOME-SCALE CORRELATION BETWEEN DNA REPLICATION AND RNA TRANSCRIPTION DURING THE CELL CYCLE IN YEAST IS PREDICTED BY DATA-DRIVEN MODELS

Elementary Statistics

How To Understand The Human Body

Conference: From DNA-Inspired Physics to Physics-Inspired Biology. 1-5 June How Proteins Find Their Targets on DNA

Define the notations you are using properly. Present your arguments in details. Good luck!

Detection of changes in variance using binary segmentation and optimal partitioning

Complex dynamics made simple: colloidal dynamics

Enzymes and Metabolic Pathways

Gas Chromatography. Let s begin with an example problem: SPME head space analysis of pesticides in tea and follow-up analysis by high speed GC.

PART I: Neurons and the Nerve Impulse

Entropy and Mutual Information

Measurement and Simulation of Electron Thermal Transport in the MST Reversed-Field Pinch

Thermodynamics - Example Problems Problems and Solutions

MTH 140 Statistics Videos

Neurophysiology. 2.1 Equilibrium Potential

Section 1.4. Difference Equations

Lecture 11 Enzymes: Kinetics

How To Model The Fate Of An Animal

Wiring optimization in the brain

This paper is also taken for the relevant Examination for the Associateship. For Second Year Physics Students Wednesday, 4th June 2008: 14:00 to 16:00

Chem 1A Exam 2 Review Problems

Work and Energy. Work = Force Distance. Work increases the energy of an object. Energy can be converted back to work.

Reproductive System & Development: Practice Questions #1

CHAPTER 12. Gases and the Kinetic-Molecular Theory

Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Two Forms of Energy

Global Seasonal Phase Lag between Solar Heating and Surface Temperature

The first law: transformation of energy into heat and work. Chemical reactions can be used to provide heat and for doing work.

Trading activity as driven Poisson process: comparison with empirical data

Fundamentals of Statistical Physics Leo P. Kadanoff University of Chicago, USA

PECULIARITIES OF THERMODYNAMIC SIMULATION WITH THE METHOD OF BOUND AFFINITY.

STATIC AND KINETIC FRICTION

Lecture 24 - Surface tension, viscous flow, thermodynamics

Behavioral Entropy of a Cellular Phone User

Accuracy of the coherent potential approximation for a onedimensional array with a Gaussian distribution of fluctuations in the on-site potential

AP Biology 2015 Free-Response Questions

M.Sc. in Nano Technology with specialisation in Nano Biotechnology

GA/7 Potentiometric Titration

Models of Switching in Biophysical Contexts

How To Calculate A Non Random Walk Down Wall Street

SIZE OF A MOLECULE FROM A VISCOSITY MEASUREMENT

People have thought about, and defined, probability in different ways. important to note the consequences of the definition:

Page 1. Name: 4) The diagram below represents a beaker containing a solution of various molecules involved in digestion.

Systems Biology through Data Analysis and Simulation

CHAPTER 6 AN INTRODUCTION TO METABOLISM. Section B: Enzymes

Thermodynamics: Lecture 2

Protein Protein Interaction Networks

Wave-particle and wave-wave interactions in the Solar Wind: simulations and observations

Appendix A: Science Practices for AP Physics 1 and 2

ADVANCED COMPUTATIONAL TOOLS FOR EDUCATION IN CHEMICAL AND BIOMEDICAL ENGINEERING ANALYSIS

Pairwise Sequence Alignment

Quality Tools, The Basic Seven

CHEMISTRY STANDARDS BASED RUBRIC ATOMIC STRUCTURE AND BONDING

IDEAL AND NON-IDEAL GASES

Prelab Exercises: Hooke's Law and the Behavior of Springs

Review Jeopardy. Blue vs. Orange. Review Jeopardy

1. Degenerate Pressure

E/M Experiment: Electrons in a Magnetic Field.

Fluid Dynamics Viscosity. Dave Foster Department of Chemical Engineering University of Rochester

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Enzymes: Practice Questions #1

Transcription:

Statistical mechanics for real biological networks William Bialek Joseph Henry Laboratories of Physics, and Lewis-Sigler Institute for Integrative Genomics Princeton University Initiative for the Theoretical Sciences The Graduate Center City University of New York Colloquium, Santa Fe Institute Thursday, 27 October, 2011

Many of the phenomena of life emerge from interactions among large numbers of simpler elements: A network how do we state of one state of the which states get from element in whole are sampled one state to the network network in real life? the next? (from R Segev & a percept, memory, motor plan or thought expression level of one gene fate of a cell during embryonic development choice of amino acid at one site in a protein sequence: structure & function flight direction and speed of one bird in a flock coherent migration coding capacity, multiple attractors,...? ion channels and synapses diffusion, binding, walking,... how many fates? modules vs. emergence? evolution how large is the repertoire? A σ30 LPEGWEMRFTVDGIPYFVDHNRRTTTYIDP ----WETRIDPHGRPYYVDHTTRTTTWERP LPPGWERRVDPRGRVYYVDHNTRTTTWQRP LPPGWEKREQ-NGRVYFVNHNTRTTQWEDP LPLGWEKRVDNRGRFYYVDHNTRTTTWQRP LPNGWEKRQD-NGRVYYVNHNTRTTQWEDP LPPGWEMKYTSEGIRYFVDHNKRATTFKDP LPPGWEQRVDQHGRAYYVDHVEKRTT---LPPGWERRVDNMGRIYYVDHFTRTTTWQRP... B Mutual information (bits) 1 5 0.8 propagation of information, responsiveness,...? sensorimotor response + Newton residue position spike/silence from one neuron 10 0.6 15 0.4 20 0.2 25 30 σn {σn } P ({σn }) {σn }t {σn }t+ t 5 10 15 20 residue position 25 30 0

There is a long history of physicists trying to use ideas from statistical mechanics to approach these systems... approximate dynamics Lyapunov function energy landscape on space of states stable states (stored memories, answers to computations,...) are energy minima If there is a statistical mechanics of network states, there is a phase diagram as networks get large... Where are we in this space? On a critical surface, perhaps? But how do we connect these ideas to real data?

Once again, what did we want to know about the system? How do we get from one state to another? Which states of the whole system are sampled in real life? {σ n } t {σ n } t+ t P ({σ n }) State space is too large to answer these (and other) questions directly by experiment. You can t measure all the states, but you can measure averages, correlations,.... Could we build the minimally structured model that is consistent with these measurements? Minimally structured = maximum entropy, and this connects the real data directly to statistical mechanics ideas (!).

Network information and connected correlations. E Schneidman, S Still, MJ Berry II & W Bialek, Phys Rev Lett 91, 238701 (2003); arxiv:physics/0307072 (2003). Weak pairwise correlations imply strongly correlated network states in a neural population. E Schneidman, MJ Berry II, R Segev & W Bialek, Nature 440, 1007-1012 (2006); arxiv:q-bio.nc/0512013 (2005). Ising models for networks of real neurons. G Tkacik, E Schneidman, MJ Berry II & W Bialek, arxiv:q-bio.nc/0611072 (2006). Rediscovering the power of pairwise interactions. W Bialek & R Ranganathan, arxiv:0712.4397 (2007). Statistical mechanics of letters in words. GJ Stephens & W Bialek, Phys Rev E 81, 066119 (2010); arxiv:0801,0253 [q-bio.nc] (2008). Spin glass models for networks of real neurons. G Tkacik, E Schneidman, MJ Berry II & W Bialek, arxiv:0912.5409 [q-bio.nc] (2009). Maximum entropy models for antibody diversity. T Mora, AM Walczak, W Bialek & CG Callan, Jr, Proc Nat l Acad Sci (USA) 107, 5405-5410 (2010); arxiv:0912.5175 (2009). Are biological systems poised at criticality? T Mora & W Bialek, J Stat Phys 144, 268-302 (2011); arxiv:1012.2242 [q-bio.qm] (2010). When are correlations strong? F Azhar & W Bialek, arxiv:1012.5987 [q-bio.nc] (2010). Statistical mechanics for a natural flock of birds W Bialek, A Cavagna, I Giardina, T Mora, E Silverstri, M Viale & AM Walczak, arxiv:1107.0604 [physics.bio-ph] (2011). and, most importantly, work in progress.

1. Basics of maximum entropy construction 2. The case of neurons in the retina 3. A bit about proteins, birds,... 4. Perspective (a few words about the retinal experiments) D Amodei, O Marre & MJ Berry II, in preparation (2011).

Let s define various function of the state of the system, f 1 ({σ n }),f 2 ({σ n }),,f K ({σ n }), and assume that experiments can tell us the averages of these functions: f 1 ({σ n }), f 2 ({σ n }),, f K ({σ n }). What is the least structured distribution P ({σ n }) that can reproduce these measured averages? P ({σ n })= 1 Z({g µ }) exp [ K µ=1 g µ f µ ({σ n }) Still must adjust the coupling constants {g µ } ] This tells us the form of the distribution. Matching expectation values = maximum likelihood inference of parameters. to match the measured expectation values.

Reminder: Suppose this is a physical system, and there is some energy for each state, E({σ n }) Thermal equilibrium is described by a distribution that is as random as possible (maximum entropy) while reproducing the observed average energy: P ({σ n })= 1 Z exp [ βe({σ n})]] In this view, the temperature T =1/(k B β) is just a parameter we adjust to reproduce E({σ n }).

We can think of the maximum entropy construction as defining an effective energy for every state, E({σ n })= K µ=1 g µ f µ ({σ n }), with k B T =1. This is an exact mapping, not an analogy. Examples for neurons: { f µ } = probability of spike vs. silence E = n h n σ n spike probability + pairwise correlations probability that M cells spike together E = h n σ n + 1 2 n ( E = V n σ n ) nm J nm σ n σ m this case we can do analytically

maximum entropy model consistent with probability of M cells spiking together E = V ( n σ n ) Find this global potential for multiple subgroups of N neurons. Lots of states with the same energy... count them to get the entropy. For large N we expect entropy and energy both proportional to N. Plot of S/N vs. E/N contains all the thermodynamics of the system. entropy per neuron 0.6 0.5 0.4 0.3 0.2 0.1 What we see is (within errors) S = E. This is very weird. N=20 40 80 160 extrapolation S = (0.985 ± 0.008)E 0 0 0.2 0.4 0.6 0.8 1 1.2 energy per neuron

The real problem: maximum entropy model consistent with mean spike probabilities, pairwise correlations, and probabilities of M cells spiking together. E = n h n σ n + 1 2 nm J nm σ n σ m + V There are lots of parameters, but we can find them all from ~1 hour of data. This is a hard inverse statistical mechanics problem. ( n σ n ) one example of 100 neurons Correlations reproduced within error bars. ln[p(data)] No sign of over-fitting. training segments test segments

For small networks, we can test the model by checking the probability of every state. For larger networks, we can check connected higher order (e.g. 3-cell) correlations. (example w/n=10) N=100, matching pairwise correlations N=100, matching pairwise + P(M) matching pairwise correlations matching pairwise + P(M) mean absolute error of predicted 3-cell correlations Seems to be working even better for larger N.

Where are we in parameter space? One direction in parameter space corresponds to changing temperature... let s try this one: specific heat (per neuron) N=100 80 The system is poised very close to a point in parameter space where the specific heat is maximized - a critical point. heat capacity 60 40 20 independent Can we do experiments to show that the system adapts to hold itself at criticality? temperature specific heat = variance of energy = variance of log(probability) Having this be large is exactly the opposite of the usual criteria for efficient coding (!). Instead, does operation near the critical point maximize the dynamic range for representing surprise?

Can use the same strategy to make models for... the distribution of amino acids in a family of proteins, A σ 30 LPEGWEMRFTVDGIPYFVDHNRRTTTYIDP ----WETRIDPHGRPYYVDHTTRTTTWERP LPPGWERRVDPRGRVYYVDHNTRTTTWQRP LPPGWEKREQ-NGRVYFVNHNTRTTQWEDP LPLGWEKRVDNRGRFYYVDHNTRTTTWQRP LPNGWEKRQD-NGRVYYVNHNTRTTQWEDP LPPGWEMKYTSEGIRYFVDHNKRATTFKDP LPPGWEQRVDQHGRAYYVDHVEKRTT---- LPPGWERRVDNMGRIYYVDHFTRTTTWQRP... or the flight velocities of birds in a flock. B Mutual information (bits) 1 residue position 5 10 15 20 0.8 0.6 0.4 25 0.2 30 0 5 10 15 20 25 30 residue position

10 1 10 2 Observed Model Independent Probability 10 3 10 2 10 4 10 4 10 6 10 0 10 2 10 4 10 0 10 1 10 2 10 3 10 4 Rank For protein families (here, antibodies), look at the Zipf plot. This is S vs. E, turned on its side; unit slope implies S = E (again!). For birds, look at the correlations directly in real space (as usual in stat mech). ALL of these, as with the specific heat in our neural network, are signatures that the real system is operating near a critical point in its parameter space.