Analogy Between Gambling and Measurement-Based Work Extraction




Dror A. Vinkler
Dept. of Electrical & Computer Eng., Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
Email: vinklerd@post.bgu.ac.il

Haim H. Permuter
Dept. of Electrical & Computer Eng., Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
Email: haimp@bgu.ac.il

Neri Merhav
Dept. of Electrical Eng., Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel
Email: merhav@ee.technion.ac.il

Abstract

TO BE CONSIDERED FOR AN IEEE JACK KEIL WOLF ISIT STUDENT PAPER AWARD. In information theory, mutual information is a known bound on the gain in the growth rate due to knowledge of side information on a gambling result; the betting strategy that reaches that bound is named the Kelly criterion. In physics, it was recently shown that mutual information is also a bound on the amount of work that can be extracted from a single heat bath using measurement-based control protocols; this extraction is done using Information Engines. However, to the best of our knowledge, no relation between these two fields has been presented before. In this paper, we briefly review the two fields and then show an analogy between gambling, where bits are converted to wealth, and information engines, where bits representing measurement results are converted to energy. This enables us to use well-known methods and results from one field to solve problems in the other. We present three such cases: maximal work extraction when the joint distribution of $X$ and $Y$ is unknown, work extraction when some energy is lost in each cycle, e.g., due to friction, and an analysis of systems with memory. In all three cases, the analogy enables us to use known results to reach new ones.

I. INTRODUCTION

While both work extraction from feedback controlled systems and information theoretic analysis of gambling are old concepts, to the best of our knowledge the relation between them has not been highlighted before.
This relation includes a straightforward mapping of concepts from one field to the other, e.g., measurements are analogous to side information and control protocols are analogous to betting strategies. Fundamental formulas in either field apply to the other after a simple replacement of variables according to the mapping found. This allows us to gain insights on one field from known results in the other one.

The relationship between work extraction and information was first suggested by Maxwell [1] in a thought experiment consisting of an intelligent agent, later named Maxwell's demon. The agent measures the velocity of ideal gas molecules in a box that is divided into two parts by a barrier. Although the box is attached to a heat bath and thus has a constant temperature, $T$, the molecules inside the box have different velocities. The demon opens a small hole in the barrier only when a faster-than-average molecule arrives from the left part of the box, allowing it to pass to the right part, and when a slower-than-average molecule arrives from the right part of the box. By doing this, the demon causes molecules of higher energy to concentrate in the right part of the box and those of lower energy to concentrate in the left part. This causes the right part to heat up and the left part to cool down, thus enabling work extraction when the system returns to equilibrium, in apparent contradiction to the second law of thermodynamics. This experiment shows how information on the speed and location of individual molecules can be transformed to extracted energy, setting the basis for what is now known as Information Engines.

Extensive research and debate has centered around Maxwell's demon since its inception, expanding the concept to more general cases of feedback control based on measurements, where work is extracted at the price of writing bits [2]-[6]. However, it was not until recently that Sagawa et al. reached an upper bound on the amount of work that can be extracted [7], [8], owing to the development of fluctuation theorems.
That upper bound was found to be closely related to Shannon's mutual information, hinting at a possible relation to problems in information theory; a relation that was not yet explored in full.

Another field where bits of information were given concrete value is gambling, through the analysis of optimal gambling strategies using tools from information theory; an analysis that was first done by Kelly [9]. The setting consisted of consecutive bets on some random variable, where all the money won in the previous bet is invested in the current one. Kelly showed that maximizing over the expectation of the gambler's capital would lead to the loss of all capital with high probability after sufficiently many rounds. However, this problem is solved when maximization is done over the expectation of the capital's logarithm. Moreover, the logarithm of the capital is additive in consecutive bets, which means that the law of large numbers applies. Under these assumptions, the optimal betting strategy is to place bets proportional to the probability of each result, a strategy dubbed the Kelly criterion. Kelly also showed that given some side information on the event, the profit that can be made compared to someone with no side information is given by Shannon's mutual information. This serves as another hint at a possible relation between information engines and gambling, as the amount of work that can be extracted using measurements, compared to that which can be extracted without measurements, is also given by mutual information.

In this paper, we present an analogy between the analysis of feedback controlled systems and the analysis of gambling in information theory. We show that finding the optimal control protocol in various systems is analogous to finding the optimal betting strategy using the Kelly criterion. Furthermore, the amount of work extracted after $n$ cycles of an information engine is shown to be analogous to the capital gained after $n$ rounds of gambling. The analogy is then shown on two models: the Szilard engine, where the particle's location is discrete, and a particle in some potential field, where the location is continuous.

This analogy allows us to generalize the models presented here to more elaborate cases, such as gambling on continuous-valued random variables. Moreover, it enables us to develop a simple criterion to determine the best control protocol in cases where an optimal protocol is inapplicable, and an optimal protocol when the probabilities governing the system are not known. Finally, well-known results for gambling with memory and causal knowledge of side information are transferred to the field of physical systems with memory, yielding the bounds on extracted work in such systems. Due to space limitations we omit proofs of Lemmas, which will appear in the full paper [10].

II. THE HORSE RACE GAMBLING

The problem of gambling, as presented in [9] and [11], consists of $n$ experiments whose results are denoted by the random vector $X^n$, e.g., the winning horse in $n$ horse races. We will assume that the gambler has some side information, $Y^n$, about the races, and that the experiments and side information are i.i.d. The following notation is used:

$P_{X \mid y}$ - the probability vector of $X$, the winning horse, given an observation $y$ of the side information.
$b_{X \mid y}$ - a vector describing the amount of money invested in each result given $y$.
$o_X$ - a vector describing the amount of money earned for each dollar invested in each horse, if that horse wins.
$S_n$ - the gambler's capital after $n$ experiments.
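To make this notation concrete, the following is a minimal sketch (not taken from the paper) that evaluates the expected per-round growth of $\log_2$ capital, $E[\log_2(b(X \mid Y)\,o(X))]$, for a binary race with fair odds $o(x) = 1/P(x)$. The joint distribution (a uniform $X$ observed through a binary symmetric channel with error 0.1) and all function names are illustrative assumptions. It compares betting $b(x \mid y) = P(x \mid y)$, the proportional strategy described in the introduction, with betting that ignores the side information.

```python
import math

def fair_odds_growth(joint, bets):
    """Expected per-round growth E[log2(b(X|Y) * o(X))] with fair odds o(x) = 1/P(x).

    joint[y][x] is the joint probability P(x, y); bets[y][x] is b(x|y).
    """
    num_x = len(joint[0])
    p_x = [sum(row[x] for row in joint) for x in range(num_x)]
    total = 0.0
    for y, row in enumerate(joint):
        for x, p_xy in enumerate(row):
            if p_xy > 0:
                total += p_xy * math.log2(bets[y][x] / p_x[x])
    return total

def mutual_information(joint):
    """Mutual information I(X;Y) in bits for joint[y][x] = P(x, y)."""
    num_x = len(joint[0])
    p_x = [sum(row[x] for row in joint) for x in range(num_x)]
    total = 0.0
    for row in joint:
        p_y = sum(row)
        for x, p_xy in enumerate(row):
            if p_xy > 0:
                total += p_xy * math.log2(p_xy / (p_x[x] * p_y))
    return total

# Assumed toy model: uniform X, side information flipped with probability 0.1.
error = 0.1
joint = [[0.5 * (1 - error), 0.5 * error],
         [0.5 * error, 0.5 * (1 - error)]]

kelly_bets = [[p / sum(row) for p in row] for row in joint]  # b(x|y) = P(x|y)
blind_bets = [[0.5, 0.5], [0.5, 0.5]]                        # b(x|y) = P(x)

kelly_growth = fair_odds_growth(joint, kelly_bets)
blind_growth = fair_odds_growth(joint, blind_bets)
info = mutual_information(joint)
print(kelly_growth, blind_growth, info)
```

In this toy model the proportional bettor's growth rate coincides with $I(X;Y)$ (about 0.531 bits per round), while the bettor without side information has zero growth under fair odds.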
$P_{X \mid y}(x \mid y)$ (which we will abbreviate as $P(x \mid y)$) denotes the probability that $X = x$, given $y$. Similarly, $b_{X \mid y}(x \mid y)$ and $o_X(x)$ (abbreviated $b(x \mid y)$ and $o(x)$, respectively) denote the amount of money invested and earned, respectively, when $X = x$. Each round, the gambler invests all of his capital. Without loss of generality, we will set $S_0 = 1$, namely, the gambling starts with 1 dollar. $S_n$ is then given by

$$S_n = \prod_{i=1}^{n} b(X_i \mid y_i)\, o(X_i), \tag{1}$$

and maximization will be done on $\log S_n$. We define the profit in round $i$ as

$$\log S_i - \log S_{i-1} = \log\left[b(X_i \mid y_i)\, o(X_i)\right]. \tag{2}$$

Since the experiments are i.i.d., the same betting strategy will be used in every round. As shown in [11, Chapter 6], the optimal betting strategy is then given by

$$b^*_{X \mid y} = \arg\max_{b_{X \mid y}} E[\log S_n \mid y^n] = P_{X \mid y}. \tag{3}$$

Substituting $b^*_{X \mid y}$ into eq. (1), the following maximum is derived:

$$\max_{b_{X \mid y}} E[\log S_n \mid y^n] = \sum_{i=1}^{n} \sum_{x \in \mathcal{X}} P(x \mid y_i) \log\left[P(x \mid y_i)\, o(x)\right]. \tag{4}$$

The bet is said to be fair if $o(x) = 1/P(x)$, and it can be seen from eq. (4) that without side information no money can be earned in that case. In this paper, we only consider fair bets. For a fair bet, the maximal expected value of $\log S_n$ with respect to $P(x^n, y^n)$ is

$$\max_{b_{X \mid y}} E[\log S_n] = n I(X; Y). \tag{5}$$

In a constrained bet, meaning a fair bet where the betting strategy is limited to some set $\mathcal{B}$ of possible strategies, the maximum gain will be given by

$$\max_{b_{X \mid y} \in \mathcal{B}} E[\log S_n] = n I(X; Y) - n \sum_{y \in \mathcal{Y}} P(y)\, D(P_{X \mid y} \,\|\, b^*_{X \mid y}), \tag{6}$$

where the optimal betting strategy $b^*_{X \mid y} \in \mathcal{B}$ is the one that minimizes $D(P_{X \mid y} \,\|\, b_{X \mid y})$.

III. THE SZILARD ENGINE

We now examine the Szilard engine [12], which involves a single particle of an ideal gas enclosed in a box of volume $V$ and attached to a heat bath of temperature $T$. The engine's cycle consists of the following stages (see Fig. 1):

1) The particle is moving freely in equilibrium with the heat bath.
2) A divider is inserted, dividing the box into two parts of volumes $V_0^L$ and $V_0^R$. The part of the box that contains the particle is denoted by $X$, with the alphabet $\mathcal{X} = \{L, R\}$.
3) A noisy measurement of the particle's location is made; the result is denoted $Y$ with $\mathcal{Y} = \{L, R\}$.
4) The divider is moved quasi-statically (i.e., infinitesimally slowly, keeping the system close to equilibrium), until the volumes of the parts are $V_f^L$ and $V_f^R$.
5) The divider is removed from the box.

Fig. 1. The cycle of Szilard's engine.

Without loss of generality, we will set $V = 1$. Denote $V_0(x)$ as $V_0^L$ for $x = L$ and $V_0^R$ otherwise. Similarly, $V_f(x \mid y)$ is $V_f^L$ for $x = L$ and $V_f^R$ otherwise, and these values depend on the measurement $y$. Since the particle starts each cycle in equilibrium with its environment, different cycles of the engine are independent of each other. Assuming $V_0$ is the same for each cycle, $X^n$ are i.i.d. with $P_X = V_0$. Following the analysis in [13], the work extracted in round $i$ given $Y_i = y_i$ is

$$W_i = k_B T \ln \frac{V_f(X_i \mid y_i)}{P(X_i)}, \tag{7}$$

where $k_B$ is the Boltzmann constant. The optimal $V_f$ is

$$V_f^*(x \mid y) = \arg\max_{V_f} E[W_n \mid y^n] = P(x \mid y), \tag{8}$$

and the maximal amount of work extracted after $n$ cycles is

$$\max_{V_f} E[W_n] = n k_B T\, E\left[\ln \frac{P_{X \mid Y}}{P_X}\right] = n k_B T\, I(X; Y). \tag{9}$$

Note that the initial location of the barrier $V_0(x)$ can also be optimized, leading to the following formula:

$$\max_{V_f, V_0} E[W_n] = n k_B T \max_{P(x)} I(X; Y). \tag{10}$$

An analogy with gambling arises from this analysis, as presented in Table I. The equations defining both problems, eqs. (2) and (7), are the same when renaming $b(x \mid y)$ as $V_f(X \mid y)$ and $o(x)$ as $1/P(X)$. The analogy also holds for the optimal strategy in both problems, presented in eqs. (3) and (8), and for the maximum gain, presented in eqs. (5) and (9), where $\log S_n$ is renamed $W_n / k_B T$. Specifically, the Szilard engine is analogous to a fair bet, since $V_0 = P_X$ and this is analogous to $o(x) = 1/P(x)$. As stated previously, in a fair bet no money can be earned without side information. Equivalently, no work can be extracted from the Szilard engine without measurements, which conforms with the second law of thermodynamics. Moreover, the option to maximize over $P(x)$ prompts us to consider an extension to horse race gambling, where the gambler can choose between several different races and thus maximize eq. (5) over all distributions $P(x)$ in some set of possible distributions.

TABLE I
ANALOGY OF GAMBLING AND SZILARD'S ENGINE

| Gambling | Szilard's engine |
| --- | --- |
| $X_i$ - result of horse race in round $i$. | $X_i$ - location of the particle in cycle $i$. Namely, left or right. |
| Side information. | Measurement results, possibly with noise. |
| $Y_i$ - some side information on round $i$. | $Y_i$ - noisy measurement of the particle's location in cycle $i$. |
| $P_X$ - PMF of the result. | $P_X$ - PMF of the particle's location. |
| $P_{X \mid y}$ - PMF of the result given side information $y$. | $P_{X \mid y}$ - PMF of the particle's location given measurement $y$. |
| $o_X$ - amount of money earned for every dollar gambled. | $1/V_0$ - the reciprocal of the initial volume of the box's parts. |
| Placing bets on different horses. | Moving the dividers to their final positions. |
| Choosing the optimal race to bet on. | Choosing the optimal initial location for the divider. |
| $b_{X \mid y}$ - amount of money gambled on each result, given $y$. | $V_f(X \mid y)$ - the final volume of the box's parts, given $y$. |
| Log of the capital. | Extracted work. |
| $\log S_n$ - log of the acquired money after $n$ rounds of gambling. | $W_n / (k_B T)$ - total work extracted after $n$ cycles of the engine. |
| Transforming bits to wealth. | Transforming bits to energy. |
| Eqs. (2), (3), (5) - profit in round $i$, optimal betting strategy and maximum profit. | Eqs. (7), (8), (9) - work extracted in round $i$, optimal control protocol and maximum work extraction. |

IV. A PARTICLE IN AN EXTERNAL POTENTIAL AND CONTINUOUS-VALUED GAMBLING

We now consider a system of one particle that has the Hamiltonian (energy function):

$$H(X, p) = \frac{p^2}{2M} + E_0(X), \tag{11}$$

where $p$ is the particle's momentum, $M$ its mass, $X$ its location, and $E_0(X)$ is some potential energy. Again, the particle is kept at constant temperature $T$. The optimal control protocol for this system was presented in [14] and [15] to be as follows:

1) Given $y$, change the external potential immediately to be $E_f^*(X, y)$ such that the induced Boltzmann distribution of $X$ will be $P_f^*(x \mid y) = P(x \mid y)$, i.e., equal to the conditional distribution of $X$ given $y$.
2) Change the potential quasi-statically back to $E_0(X)$.

Noting that in eq. (8) $V_f^*$ equals $P_f^*$, one notices that both in this case and in the Szilard engine the control protocol is defined by $P(x \mid y)$. Furthermore, eq. (7) is also valid for this case. If $X$ is a continuous random variable, $P(x)$ and $P(x \mid y)$ will be the particle's PDF and conditional PDF, respectively. The protocol presented above is optimal in the sense that it reaches the upper bound on extracted work, i.e.,

$$E[W_n(P_f^*)] = n k_B T\, I(X; Y), \tag{12}$$

where $W_n$ is the extracted work after $n$ cycles of the engine.

If $E_0$ is under our control, we can maximize over all $P_0$ as well. However, it is important to note that there will always be some constraint over $P_0$, due to the finite volume of the system or to the method of creating the external potential or both.
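The role of the final volumes in eqs. (7)-(9) can be checked numerically. The sketch below is an illustration under assumed numbers, not the paper's own computation: it averages eq. (7) over $P(x, y)$ for a binary engine, in units of $k_B T$, and compares the result with $I(X;Y)$ minus the average divergence between $P_{X \mid y}$ and the chosen final volumes, which is the engine-side counterpart of the constrained-bet formula, eq. (6). The measurement error of 0.2, the suboptimal volumes 0.7/0.3, and the function names are hypothetical.

```python
import math

def average_work(joint, v_final):
    """Average of eq. (7) over P(x, y), in units of k_B*T: E[ln(V_f(X|Y) / P(X))].

    joint[y][x] is P(x, y); v_final[y][x] is the final volume V_f(x|y).
    """
    num_x = len(joint[0])
    p_x = [sum(row[x] for row in joint) for x in range(num_x)]
    total = 0.0
    for y, row in enumerate(joint):
        for x, p_xy in enumerate(row):
            if p_xy > 0:
                total += p_xy * math.log(v_final[y][x] / p_x[x])
    return total

def information_minus_divergence(joint, v_final):
    """I(X;Y) - sum_y P(y) D(P_{X|y} || V_f(.|y)), in nats."""
    num_x = len(joint[0])
    p_x = [sum(row[x] for row in joint) for x in range(num_x)]
    info, divergence = 0.0, 0.0
    for y, row in enumerate(joint):
        p_y = sum(row)
        for x, p_xy in enumerate(row):
            if p_xy > 0:
                posterior = p_xy / p_y
                info += p_xy * math.log(posterior / p_x[x])
                divergence += p_xy * math.log(posterior / v_final[y][x])
    return info - divergence

# Assumed toy engine: divider inserted in the middle, measurement error 0.2.
error = 0.2
joint = [[0.5 * (1 - error), 0.5 * error],   # y = L: P(x=L, y), P(x=R, y)
         [0.5 * error, 0.5 * (1 - error)]]   # y = R

optimal_volumes = [[p / sum(row) for p in row] for row in joint]  # V_f = P(x|y)
cautious_volumes = [[0.7, 0.3], [0.3, 0.7]]                       # hypothetical choice

work_optimal = average_work(joint, optimal_volumes)
work_cautious = average_work(joint, cautious_volumes)
bound_cautious = information_minus_divergence(joint, cautious_volumes)
print(work_optimal, work_cautious, bound_cautious)
```

With the optimal volumes the divergence term vanishes and the average work per cycle equals $I(X;Y)$ in nats; any other choice loses exactly the average divergence.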
Denoting by $\mathcal{P}$ the set of allowed initial distributions $P_0$, the maximal amount of extracted work is therefore given by

$$\max_{P_0 \in \mathcal{P},\, P_f} E[W_n] = n k_B T \max_{P(x) \in \mathcal{P}} I(X; Y). \tag{13}$$

Another point of interest is that setting $P_f^* = P_{X \mid y}$ will not necessarily be possible. This gives rise to the following, more general, formula:

$$\max_{P_0 \in \mathcal{P},\, P_f \in \mathcal{P}_B} E[W_n] = n k_B T \max_{P(x) \in \mathcal{P}} \max_{P_f \in \mathcal{P}_B} \left\{ I(X; Y) - E_Y\left[D(P_{X \mid y} \,\|\, P_f)\right] \right\}, \tag{14}$$

where $\mathcal{P}_B$ is the set of all possible distributions $P_f$, which stems from the set of all possible potentials. Thus, the optimal $P_f$ is the one that minimizes $E_Y[D(P_{X \mid y} \,\|\, P_f)]$. Notice that this analysis holds both for continuous and discrete $X$.

It follows that the analogy presented in Table I can be extended to work extraction from a particle in an external potential. Again, this system is analogous to a fair bet, in conformance with the second law of thermodynamics. This system is also analogous to a constrained bet, as can be seen from eq. (14) and its analogy with eq. (6). If $X$ is continuous, an interesting extension to the gambling problem arises where the bet is on continuous random variables. We will now present this extension in detail.

A. Continuous-Valued Gambling

We consider a bet on some continuous-valued random variable, where the gambler has knowledge of side information. The gambler's wealth is still given by eq. (1), where the betting strategy, $b(x \mid y)$, and the odds, $o(x)$, are functions instead of vectors. In the case of stocks or currency exchange rates, for instance, such a betting strategy and odds can be implemented using options. The constraint that the gambler invests all his capital in each round is translated in this case to the constraint $\int_{\mathcal{X}} b(x \mid y)\, dx = 1$. The optimal betting strategy is then given by $b^*(x \mid y) = f(x \mid y)$, where $f(x \mid y)$ is the conditional probability density function (PDF) of $X$ given $y$, and the bet is said to be fair if $o(x) = 1/f(x)$, where $f(x)$ is the PDF of $X$. For a fair bet, eq. (5) holds and eq. (6) holds with the sum replaced by an integral and each PMF replaced by the appropriate PDF.

We conclude that two often discussed schemes of work extraction are analogous to the well-known problem of horse race gambling or to the extension of that problem to the continuous-valued case, an extension that actually arose from the analogy. We will now discuss some of the possible benefits from this analogy.

V. ANALOGY CONSEQUENCES

The analogy that was shown in this paper enables us to use well-known methods and results from horse race gambling to solve problems regarding measurement-based work extraction, and vice versa.
We present three such cases: maximal work extraction when the joint distribution of $X$ and $Y$ is unknown, work extraction when some energy is lost in each cycle, e.g., due to friction, and an analysis of systems with memory. In all three cases the analogy enables us to use known results to gain new insight.

A. Universal Work Extraction

The control protocols presented so far consisted of a change to the system that changed the distribution of $X$ from the vector $P_0$ to some measurement-dependent vector $P_f(\cdot \mid y)$, and then a return back to $P_0$. However, in order to achieve the upper bound of $E[W_n] = n k_B T\, I(X; Y)$, it was necessary to know $P_{X \mid y}$ in advance. The question then arises whether this bound could be achieved when the conditional probability is not known, e.g., in a system with an unknown measurement error.

For portfolio management, which is a generalization of horse race gambling, the problem of investment with unknown probability distributions was solved by Cover and Ordentlich [16]. They devised the $\mu$-weighted universal portfolio with side information, which was shown to asymptotically achieve the same wealth as the best constant betting strategy for any pair of sequences $x^n$, $y^n$. Namely, it was shown that

$$\lim_{n \to \infty} \max_{x^n, y^n} \frac{1}{n} \log \frac{S_n^*(x^n \mid y^n)}{\hat{S}_n(x^n \mid y^n)} = 0, \tag{15}$$

where $\hat{S}_n$ is the wealth achieved by the universal portfolio and $S_n^*$ is the maximal wealth that can be achieved by a portfolio where $b_i(\cdot \mid y_i) = b^*(\cdot \mid y_i)$ for all $i$. Furthermore, choosing $\mu$ to be the uniform (Dirichlet$(1, \ldots, 1)$) distribution, it was shown that the wealth achieved by the portfolio can be bounded by

$$\log \hat{S}_n(x^n \mid y^n) \geq \log S_n^*(x^n \mid y^n) - k(m - 1) \log(n + 1), \tag{16}$$

where $m$ is the cardinality of $\mathcal{X}$ and $k$ is the cardinality of $\mathcal{Y}$. For this $\mu$, the universal portfolio can be reduced to the following betting strategy for the horse race gamble:

$$\hat{b}_i(y^i, x^{i-1}) = \left( \frac{n_i(1, y_i) + 1}{n_i(y_i) + m}, \ldots, \frac{n_i(m, y_i) + 1}{n_i(y_i) + m} \right), \tag{17}$$

where $n_i(j, y_i)$ is the number of times $X$ was observed to be $j$ and $Y$ was observed to be $y_i$ before the $i$th cycle, i.e., $n_i(j, y) = |\{l : x_l = j,\ y_l = y,\ l < i\}|$, and similarly $n_i(y) = |\{l : y_l = y,\ l < i\}|$.
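The counting rule of eq. (17) is simple to implement. The following is a minimal sketch of it, with hypothetical names and a made-up observation sequence; outcomes are indexed from 0 to $m - 1$ rather than from 1 to $m$, which is an implementation choice and not part of the source.

```python
from collections import defaultdict

def universal_bets(xs, ys, m):
    """Betting vectors of eq. (17), one per round.

    xs[i] in {0, ..., m-1} is the outcome of round i and ys[i] is the side
    information of round i. The bet of round i uses only rounds before i.
    """
    joint_counts = defaultdict(int)  # n_i(j, y): past rounds with X = j and Y = y
    side_counts = defaultdict(int)   # n_i(y): past rounds with Y = y
    bets = []
    for x, y in zip(xs, ys):
        n_y = side_counts[y]
        bet = [(joint_counts[(j, y)] + 1) / (n_y + m) for j in range(m)]
        bets.append(bet)
        # Update the counts only after the bet for this round is placed.
        joint_counts[(x, y)] += 1
        side_counts[y] += 1
    return bets

# Made-up example: two possible outcomes, side information 'a' or 'b'.
example_bets = universal_bets(xs=[0, 0, 1, 0], ys=['a', 'a', 'a', 'b'], m=2)
for round_index, bet in enumerate(example_bets, start=1):
    print(round_index, bet)
```

Each vector sums to one, so all capital is invested in every round, and the first time a value of the side information appears the bet is uniform. By the analogy, the same vectors could serve as the final volumes of a control protocol when $X$ has a finite alphabet.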
Using the analogy presented above, this universal portfolio can be adapted straightforwardly into a universal control protocol in cases where $X$ has a finite alphabet. In this control protocol, $P_{f,i}$ is given by the right-hand side of eq. (17) and the extracted work can be bounded by

$$\hat{W}_n \geq W_n^* - k_B T\, k(m - 1) \ln(n + 1), \tag{18}$$

a bound that follows directly from eq. (16). Namely, the work extracted by this universal control protocol is asymptotically equal to the work extracted by the best constant control protocol, i.e., the best control protocol for which $P_{f,i}(\cdot \mid y_i) = P_f^*(\cdot \mid y_i)$ for all $i$. However, this derivation is applicable only for cases where $X$ and $Y$ have finite alphabets.

B. Imperfect Work Extraction

Another result that arises from the analogy shown above is the analysis of an imperfect system of work extraction. Consider a system where some amount of energy $f(x)$ is lost in each cycle, e.g., due to friction. I.e.,

$$W_i = k_B T \ln \frac{P_f(X_i \mid y_i)}{P_0(X_i)} - f(X_i). \tag{19}$$

This is analogous to an unfair bet with the odds

$$o(x) = \frac{1}{P(x)}\, e^{-f_T(x)}, \tag{20}$$

where $f_T(x) = f(x)/k_B T$ and $T$ is an unfairness parameter. As shown, if the gambler has to invest all the capital in each round, the optimal $b(x \mid y)$ is independent of $o(x)$, i.e., for the odds given in eq. (20) the optimal betting strategy is still given by eq. (3). However, it may be the case that for some values of $y$ the gambler should not gamble at all. In the same manner, the optimal control protocol for imperfect systems of work extraction is still given by $P_f^*(x \mid y) = P(x \mid y)$, but for some measurement results it may be preferable not to perform the cycle at all. Substituting $P_f^*$ into eq. (19) and taking the average w.r.t. $P(x \mid y)$ yields

$$E[W_i \mid y_i] = k_B T\, D(P_{X \mid y_i} \,\|\, P_X) - E[f(X_i) \mid y_i]. \tag{21}$$

Thus, the engine's cycle should be performed only in cases where $E[W_i \mid y_i] > 0$. Equivalently, the cycle should be performed only if $y_i$ satisfies $k_B T\, D(P_{X \mid y_i} \,\|\, P_X) > E[f(X_i) \mid y_i]$.

C. Systems With Memory

Finally, we would like to analyze cases where the different cycles of the engine, or different measurements, are not independent. Again, we use known results from the analysis of gambling on dependent horse races. If the gambler has only causal knowledge of the side information, the maximum growth rate of wealth is [17]

$$\max_{b(x^n \| y^n)} E[\log S_n] = I(Y^n \to X^n), \tag{22}$$

where $I(Y^n \to X^n)$ is the directed information from $Y^n$ to $X^n$, as defined by Massey [18], and $b(x^n \| y^n)$ indicates that the betting strategy in round $i$ depends causally on previous results $x^{i-1}$ and side information $y^i$. The optimal betting strategy in this case is given by $b^*(x^n \| y^n) = P(x^n \| y^n)$, where $P(x^n \| y^n) = \prod_{i=1}^{n} P(x_i \mid y^i, x^{i-1})$ is the causal conditioning of $X^n$ by $Y^n$ as defined by Kramer [19].

In the Szilard engine, dependence arises, for instance, if the initial placement of the barrier in each cycle is done before the system has reached equilibrium. In that case, the location of the particle depends on its location in the previous cycle, i.e., $P(x^n) \neq \prod_{i=1}^{n} P(x_i)$ and the Markov chain $X_i - X_{i-1} - (X^{i-2}, Y^{i-1})$ holds. This leads to the following formula for maximizing the extracted work:

$$\arg\max_{V_{f,i}} E[W_i \mid y^i, x^{i-1}] = \arg\min_{V_{f,i}} D(P_{X_i \mid y^i, x^{i-1}} \,\|\, V_{f,i}) = P_{X_i \mid y^i, x^{i-1}}, \tag{23}$$

which means that maximal work extraction is given by

$$\max_{V_f^n} E[W_n] = k_B T\, I(Y^n \to X^n), \tag{24}$$

for some $P(x^n)$ induced by the initial location of the barrier.
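As an illustration of eq. (23), the sketch below computes the optimal final volumes for a toy engine with memory. It rests on assumptions that are not in the source: the particle's side follows a binary symmetric Markov chain that switches sides with probability `flip`, and each measurement depends only on the current location and is wrong with probability `err`. Under these assumptions the conditional distribution in eq. (23) depends on the past only through $x_{i-1}$, and the per-cycle information term is $I(X_i; Y_i \mid X_{i-1})$. All names and numbers are hypothetical.

```python
import math

def optimal_final_volumes(prev_x, y, flip, err):
    """P(x_i | y_i, x_{i-1}) for x_i in {0, 1}, under the assumed toy model."""
    prior = [1 - flip if x == prev_x else flip for x in (0, 1)]   # P(x_i | x_{i-1})
    likelihood = [1 - err if x == y else err for x in (0, 1)]     # P(y_i | x_i)
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    normalizer = sum(unnormalized)
    return [u / normalizer for u in unnormalized]

def per_cycle_information(flip, err):
    """I(X_i; Y_i | X_{i-1}) in nats; by symmetry it is the same for both x_{i-1}."""
    prev_x = 0
    total = 0.0
    for x in (0, 1):
        p_x = 1 - flip if x == prev_x else flip
        for y in (0, 1):
            p_xy = p_x * (1 - err if x == y else err)
            if p_xy > 0:
                posterior = optimal_final_volumes(prev_x, y, flip, err)[x]
                total += p_xy * math.log(posterior / p_x)
    return total

def binary_entropy(p):
    """Binary entropy in nats."""
    return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

flip, err = 0.3, 0.1
volumes = optimal_final_volumes(prev_x=0, y=1, flip=flip, err=err)
info_per_cycle = per_cycle_information(flip, err)
print(volumes, info_per_cycle)
```

For this symmetric model the information term has the closed form $h(a) - h(\text{err})$ with $a = \text{flip}(1 - \text{err}) + (1 - \text{flip})\,\text{err}$, where $h$ is the binary entropy, which gives a quick consistency check on the computation.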
As was done previously, this initial location can be optimized, yielding the optimal $P(x^n)$ in the set of possible distributions, i.e., the distributions for which the Markov property holds. Denoting this set $\mathcal{P}$, maximal extracted work is given by

$$\max_{V_f^n, V_0^n} E[W_n] = k_B T \max_{P(x^n \| y^{n-1}) \in \mathcal{P}} I(Y^n \to X^n), \tag{25}$$

where $P(y^n \| x^n)$ is a constant depending on the measuring device. Due to the Markov property, eq. (25) can be reduced to

$$\max_{V_f^n, V_0^n} E[W_n] = k_B T \max_{P(x^n \| y^{n-1}) \in \mathcal{P}} \sum_{i=1}^{n} I(X_i; Y_i \mid X^{i-1}, Y^{i-1}). \tag{26}$$

It would be beneficial to have a scheme to find the set of probabilities that achieves the maximum in this case. In order to do that, the following two lemmas are first needed.

Lemma 1: The right-hand side of eq. (26) is concave in $P(x^n \| y^{n-1})$ with $P(y^n \| x^n)$ constant.

Lemma 2: The maximization problem in eq. (26) is a convex optimization problem over the affine set $\mathcal{P}$.

Using these two lemmas, the alternating maximization procedure can be used to maximize over each term $P(x_i \mid x^{i-1}, y^{i-1})$ separately while setting all other terms as constant, beginning with $i = n$ and moving backward to $i = 1$, similarly to [20]. Since each term depends only on previous terms and not on the following ones, this procedure will yield the global maximum as needed.

ACKNOWLEDGMENT

The authors would like to thank Oleg Krichevsky for valuable discussions.

REFERENCES

[1] J. C. Maxwell, Theory of Heat. Appleton, London, 1871.
[2] L. Brillouin, "Maxwell's demon cannot operate: Information and entropy. I," J. Appl. Phys., vol. 22, no. 3, pp. 334-337, 1951.
[3] R. Landauer, "Irreversibility and heat generation in the computing process," IBM J. Res. Dev., vol. 5, no. 3, pp. 183-191, 1961.
[4] C. H. Bennett, "Demons, engines and the second law," Scientific American, vol. 257, no. 5, pp. 108-116, 1987.
[5] D. Mandal and C. Jarzynski, "Work and information processing in a solvable model of Maxwell's demon," Proc. Natl. Acad. Sci. USA, vol. 109, no. 29, pp. 11641-11645, 2012.
[6] D. Mandal, H. Quan, and C. Jarzynski, "Maxwell's refrigerator: An exactly solvable model," Phys. Rev. Lett., vol. 111, no. 3, p. 030602, 2013.
[7] T. Sagawa and M. Ueda, "Second law of thermodynamics with discrete quantum feedback control," Phys. Rev. Lett., vol. 100, no. 8, p. 080403, 2008.
[8] T. Sagawa and M. Ueda, "Generalized Jarzynski equality under nonequilibrium feedback control," Phys. Rev. Lett., vol. 104, no. 9, p. 090602, 2010.
[9] J. L. Kelly Jr, "A new interpretation of information rate," Bell System Technical Journal, vol. 35, pp. 917-926, 1956.
[10] D. A. Vinkler, H. H. Permuter, and N. Merhav, "Analogy between gambling and measurement-based work extraction," in preparation.
[11] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991.
[12] L. Szilard, "Über die entropieverminderung in einem thermodynamischen system bei eingriffen intelligenter wesen," Zeitschrift für Physik, vol. 53, no. 11-12, pp. 840-856, 1929.
[13] T. Sagawa and M. Ueda, "Nonequilibrium thermodynamics of feedback control," Phys. Rev. E, vol. 85, no. 2, p. 021104, 2012.
[14] J. M. Horowitz and J. M. Parrondo, "Designing optimal discrete-feedback thermodynamic engines," New J. Phys., vol. 13, no. 12, p. 123019, 2011.
[15] M. Esposito and C. Van den Broeck, "Second law and Landauer principle far from equilibrium," EPL, vol. 95, no. 4, p. 40004, 2011.
[16] T. M. Cover and E. Ordentlich, "Universal portfolios with side information," IEEE Trans. Inf. Theory, vol. 42, no. 2, pp. 348-363, 1996.
[17] H. H. Permuter, Y.-H. Kim, and T. Weissman, "Interpretations of directed information in portfolio theory, data compression, and hypothesis testing," IEEE Trans. Inf. Theory, vol. 57, no. 6, pp. 3248-3259, 2011.
[18] J. Massey, "Causality, feedback and directed information," in Proc. Int. Symp. Inf. Theory Applic. (ISITA-90), 1990, pp. 303-305.
[19] G. Kramer, "Directed information for channels with feedback," Ph.D. dissertation, University of Manitoba, Canada, 1998.
[20] I. Naiss and H. H. Permuter, "Extension of the Blahut-Arimoto algorithm for maximizing directed information," IEEE Trans. Inf. Theory, vol. 59, pp. 760-781, 2013.