Babylonian Method of Computing the Square Root: Justifications Based on Fuzzy Techniques and on Computational Complexity

Olga Kosheleva
Department of Mathematics Education
University of Texas at El Paso
500 W. University
El Paso, TX 79968, USA
Email: olgak@utep.edu

Abstract

When computing a square root, computers still, in effect, use an iterative algorithm developed by the Babylonians millennia ago. This is a very unusual phenomenon, because for most other computations, better algorithms have been invented: even division is performed, in the computer, by an algorithm which is much more efficient than the division methods that we have all learned in school. What is the explanation for the success of the Babylonians' method? One explanation is that this is, in effect, Newton's method, based on the best ideas from calculus. This explanation works well from the mathematical viewpoint: it explains why this method is so efficient. However, since the Babylonians were very far from calculus, it does not explain why this method was invented in the first place. In this paper, we provide two possible explanations for this method's origin. We show that this method naturally emerges from fuzzy techniques, and we also show that it can be explained as (in some reasonable sense) the computationally simplest technique.

I. HOW TO COMPUTE THE SQUARE ROOT: AN ITERATIVE FORMULA GOING BACK TO THE ANCIENT BABYLONIANS

How can we compute the square root $x = \sqrt{a}$ of a given number $a$? Historically the first method for computing the square root was invented by the ancient Babylonians; see, e.g., [2] and references therein. In this Babylonian method, we start with an arbitrary positive number $x_0$, and then apply the following iterative process:

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right). \tag{1}$$

To be more precise, Babylonians rarely described their mathematical procedures in algorithmic form: they usually described them by presenting several examples. Only in Greek mathematics were these procedures reformulated in a general abstract form.
This is true for the square root procedure as well: the first person to describe this procedure in general abstract terms was Heron of Alexandria, mostly known as the inventor of the steam engine; see, e.g., [3]. Because of this, the above algorithm is also known as Heron's method.

II. PROPERTIES OF THE BABYLONIAN METHOD

If we start with the value $x_0$ which is already equal to the square root of $a$, $x_0 = \sqrt{a}$, then, as one can easily check, the next iteration is the exact same value:

$$x_1 = \frac{1}{2}\left(x_0 + \frac{a}{x_0}\right) = \frac{1}{2}\left(\sqrt{a} + \frac{a}{\sqrt{a}}\right) = \frac{1}{2}\left(\sqrt{a} + \sqrt{a}\right) = \sqrt{a}. \tag{2}$$

It has been proven that this method converges: $x_n \to \sqrt{a}$ for all possible starting values $x_0 > 0$. It is intuitively clear that the closer the original approximation $x_0$ is to the square root, the faster the convergence.

Convergence is not so straightforward to prove, but it is straightforward to prove that if the sequence $x_n$ converges to some value $x$, then this limit value $x$ is indeed the desired square root. Indeed, by tending to the limit $x_n \to x$ and $x_{n+1} \to x$ in the formula (1), we conclude that

$$x = \frac{1}{2}\left(x + \frac{a}{x}\right). \tag{3}$$

Multiplying both sides of this equality by 2, we get

$$2x = x + \frac{a}{x}. \tag{4}$$

By subtracting $x$ from both sides, we conclude that $x = a/x$. Multiplying both sides of this equality by $x$, we get $x^2 = a$; this is exactly the defining equation of the square root.

III. BABYLONIAN METHOD: A NUMERICAL EXAMPLE

The Babylonian method for computing the square root is very efficient. To show how efficient it is, let us illustrate it on the example of computing the square root of 2, i.e., when $a = 2$. Let us start with the simplest possible real number $x_0 = 1$. Then, according to the Babylonian algorithm, we compute the next approximation $x_1$ to $\sqrt{2}$ as

$$x_1 = \frac{1}{2}\left(x_0 + \frac{a}{x_0}\right) = \frac{1}{2}\left(1 + \frac{2}{1}\right) = \frac{1}{2} \cdot 3 = 1.5. \tag{5}$$

The next approximation is

$$x_2 = \frac{1}{2}\left(x_1 + \frac{a}{x_1}\right) = \frac{1}{2}\left(1.5 + \frac{2}{1.5}\right) = \frac{1}{2}\left(1.5 + 1.3333\ldots\right) = \frac{1}{2} \cdot 2.8333\ldots = 1.4166\ldots \tag{6}$$

So, after two simple iterations, we already get 3 correct decimal digits of $\sqrt{2} = 1.41\ldots$

IV. BABYLONIAN METHOD IS VERY EFFICIENT

The above example is typical. In general, the Babylonian method for computing the square root converges very fast. In fact, it is so efficient that most modern computers use it for computing square roots.

The computers also take advantage of the fact that inside the computer, all the numbers are represented in binary code. As a result, division by 2 simply means shifting the binary point one digit to the left: just like in the standard decimal code, division by 10 simply means shifting the decimal point one digit to the left, e.g., from 15.0 to 1.5.

V. THE LONGEVITY OF THE BABYLONIAN METHOD IS VERY UNUSUAL

The fact that the Babylonian method for computing the square root has been preserved intact and is used in modern computers is very unusual. Even for simple arithmetic operations such as division, the traditional numerical procedures that people have used for centuries turned out to be not as efficient as newly designed ones. For example, in most computers, subtraction and operations with negative numbers are not done as we do them, but by using the 2's complement representation; see, e.g., [1]. Similarly, division is not performed the way we do it, but rather by using a special version of Newton's method, etc.

In contrast, the Babylonian method for computing the square root remains widely used. What is the reason for this longevity? How could the Babylonians come up with a method which is so efficient?

VI. NEWTON'S EXPLANATION OF THE EFFICIENCY OF THE BABYLONIAN METHOD

Historically the first natural explanation of the efficiency of the Babylonian method was proposed by Isaac Newton. Newton showed that this method is a particular case of a general method for solving non-linear equations, a method that we now call Newton's method.

Specifically, suppose that we want to solve an equation

$$f(x) = 0, \tag{7}$$

and we know an approximate solution $x_n$. How can we find the next iteration?
We have assumed that the known value $x_n$ is close to the desired solution $x$. So, we can describe this solution as $x = x_n + \Delta x$, where the correction $\Delta x \stackrel{\rm def}{=} x - x_n$ is relatively small. In terms of $\Delta x$, the equation (7) takes the form

$$f(x_n + \Delta x) = 0. \tag{8}$$

Since $\Delta x$ is small, we can use the derivative $f'(x_n)$. Specifically, the derivative $f'(x_n)$ is defined as the limit

$$f'(x_n) = \lim_{h \to 0} \frac{f(x_n + h) - f(x_n)}{h}. \tag{9}$$

The limit means that the smaller $h$, the closer the ratio

$$\frac{f(x_n + h) - f(x_n)}{h} \tag{10}$$

is to the derivative $f'(x_n)$. Since $\Delta x$ is small, the ratio

$$\frac{f(x_n + \Delta x) - f(x_n)}{\Delta x} \tag{11}$$

is close to the derivative $f'(x_n)$:

$$f'(x_n) \approx \frac{f(x_n + \Delta x) - f(x_n)}{\Delta x}. \tag{12}$$

We know from (8) that $f(x_n + \Delta x) = f(x) = 0$; thus, (12) implies that

$$f'(x_n) \approx -\frac{f(x_n)}{\Delta x} \tag{13}$$

and hence,

$$\Delta x \approx -\frac{f(x_n)}{f'(x_n)}. \tag{14}$$

So, as the next approximation to the root, it is reasonable to take the value $x_{n+1} = x_n + \Delta x$, i.e., the value

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{15}$$

Finding the square root $x = \sqrt{a}$ means finding a solution to the equation $x^2 - a = 0$. This equation has the form $f(x) = 0$ for $f(x) = x^2 - a$, for which $f'(x) = 2x$. Substituting this function $f(x)$ into the general formula (15), we get

$$x_{n+1} = x_n - \frac{x_n^2 - a}{2x_n}. \tag{16}$$

Explicitly dividing each term in the numerator of the right-hand side by $2x_n$, we get

$$x_{n+1} = x_n - \left(\frac{x_n}{2} - \frac{a}{2x_n}\right). \tag{17}$$

Opening parentheses, we get

$$x_{n+1} = x_n - \frac{x_n}{2} + \frac{a}{2x_n}. \tag{18}$$

Replacing $x_n - \frac{x_n}{2}$ with $\frac{x_n}{2}$, and moving the common divisor 2 outside the sum, we get the Babylonian formula

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right). \tag{19}$$

VII. THERE SHOULD BE A MORE ELEMENTARY EXPLANATION OF THE BABYLONIAN FORMULA

Newton's explanation explains why the Babylonian method is so efficient: it is a particular case of the efficient Newton's method for solving nonlinear equations.

However, Newton's explanation does not explain how this method was invented in the first place, since the main ideas of Newton's method are heavily based on calculus, while the Babylonians were very far away from discovering calculus ideas.

We therefore need a more elementary explanation of the Babylonian formula. Two such explanations are provided in this paper.
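
The relation between the Babylonian formula and Newton's method can also be checked numerically. The following Python sketch is only an illustration and is not part of the original paper: the function names `babylonian_step`, `newton_step`, and `iterate` are hypothetical, and the choice $a = 2$, $x_0 = 1$ repeats the numerical example of Section III. It applies the Babylonian step and the general Newton step for $f(x) = x^2 - a$ side by side.

```python
def babylonian_step(x, a):
    """One Babylonian step: the average of x and a / x."""
    return 0.5 * (x + a / x)


def newton_step(x, f, f_prime):
    """One step of the general Newton formula x - f(x) / f'(x)."""
    return x - f(x) / f_prime(x)


def iterate(step, x0, n):
    """Return the list [x_0, x_1, ..., x_n] produced by a one-argument step function."""
    xs = [x0]
    for _ in range(n):
        xs.append(step(xs[-1]))
    return xs


# Illustrative assumption: a = 2 and x_0 = 1, as in the numerical example.
a = 2.0
babylonian = iterate(lambda x: babylonian_step(x, a), 1.0, 5)
newton = iterate(
    lambda x: newton_step(x, lambda t: t * t - a, lambda t: 2.0 * t), 1.0, 5
)

# Print both sequences next to each other for comparison.
for n, (b, w) in enumerate(zip(babylonian, newton)):
    print(n, b, w)
```

Under these assumptions, the two printed columns should agree up to rounding, with the first two steps giving 1.5 and approximately 1.4167, the values computed by hand in the numerical example.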

VIII. EXPLANATION BASED ON FUZZY TECHNIQUES: MAIN IDEA

We are looking for a square root $x = \sqrt{a}$, i.e., for the value for which

$$x = \frac{a}{x}. \tag{20}$$

Instead of the exact value $x$, we only know an approximate value $x_n \approx x$. Since $x_n \approx x$, we conclude that

$$\frac{a}{x_n} \approx \frac{a}{x}. \tag{21}$$

Because of (20), we can conclude that

$$\frac{a}{x_n} \approx x. \tag{22}$$

Thus, the desired square root $x$ must satisfy two requirements: the first requirement is that $x \approx x_n$; the second requirement is that $x \approx y_n$, where we denoted

$$y_n \stackrel{\rm def}{=} \frac{a}{x_n}. \tag{23}$$

Thus, we must find $x$ from the following requirement:

$$(x \approx x_n) \;\&\; (x \approx y_n). \tag{24}$$

Fuzzy logic is a method for formalizing statements like this; see, e.g., [4], [6].

In fuzzy logic, to find the value $x$ from the above requirement, we must follow these steps.

First, we select the membership function $\mu_\approx(z)$ for describing the approximate relation $\approx$. After this selection, the degree of confidence in the statement $x \approx x_n$ is equal to $\mu_\approx(x - x_n)$, and the degree of confidence in the statement $x \approx y_n$ is equal to $\mu_\approx(x - y_n)$.

Next, we select a t-norm $f_\&(d, d')$ to describe the effect of the "and" operator on the corresponding degrees. After this selection, for each possible real number $x$, the degree $\mu(x)$ to which this number satisfies the above requirement can be computed as

$$\mu(x) = f_\&\left(\mu_\approx(x - x_n), \mu_\approx(x - y_n)\right). \tag{25}$$

Finally, we need to select a defuzzification procedure that transforms the membership function $\mu(x)$ into a single most appropriate value. For example, as this $x$, it is natural to take the value $x$ for which the degree $\mu(x)$ is the largest possible.

IX. EXPLANATION BASED ON FUZZY TECHNIQUES: ANALYSIS

Let us consider different possible selections of the membership function and of the t-norm.

As an example, let us take a Gaussian membership function to describe "approximate":

$$\mu_\approx(z) = \exp\left(-\frac{z^2}{\sigma^2}\right) \tag{26}$$

for some $\sigma > 0$, and the product as a t-norm:

$$f_\&(d, d') = d \cdot d'. \tag{27}$$

In this case, we have

$$\mu(x) = \mu_\approx(x - x_n) \cdot \mu_\approx(x - y_n) = \exp\left(-\frac{(x - x_n)^2}{\sigma^2}\right) \cdot \exp\left(-\frac{(x - y_n)^2}{\sigma^2}\right) = \exp\left(-\frac{(x - x_n)^2 + (x - y_n)^2}{\sigma^2}\right). \tag{28}$$

Due to monotonicity of the exponential function, this value attains the largest possible value when the expression

$$(x - x_n)^2 + (x - y_n)^2 \tag{29}$$

is the smallest possible.
Differentiating this expression with respect to $x$ and equating the derivative to 0, we conclude that $2(x - x_n) + 2(x - y_n) = 0$, i.e., that

$$x = \frac{x_n + y_n}{2}. \tag{30}$$

If for the same Gaussian membership function for "approximate", we choose $f_\&(d, d') = \min(d, d')$ as the t-norm, we get a different expression for $\mu(x)$:

$$\mu(x) = \min\left(\mu_\approx(x - x_n), \mu_\approx(x - y_n)\right) = \min\left(\exp\left(-\frac{(x - x_n)^2}{\sigma^2}\right), \exp\left(-\frac{(x - y_n)^2}{\sigma^2}\right)\right). \tag{31}$$

Due to monotonicity of the exponential function, this value is equal to

$$\mu(x) = \exp\left(-\frac{\max\left((x - x_n)^2, (x - y_n)^2\right)}{\sigma^2}\right), \tag{32}$$

and this value is the largest when the expression

$$\max\left((x - x_n)^2, (x - y_n)^2\right) \tag{33}$$

is the smallest possible. This expression, in its turn, can be rewritten as $\max\left(|x - x_n|^2, |x - y_n|^2\right)$. Since absolute values are always non-negative and the square function is monotonic on non-negative values, this expression has the form $\left(\max(|x - x_n|, |x - y_n|)\right)^2$, and its minimum is attained when the simpler expression $\max(|x - x_n|, |x - y_n|)$ is the smallest possible.

Let us show that this expression also attains its smallest possible value at the midpoint (30). Indeed, in geometric terms, the minimized expression $\max(|x - x_n|, |x - y_n|)$ is simply the largest of the distances $|x - x_n|$ and $|x - y_n|$ between the desired point $x$ and the given points $x_n$ and $y_n$. Due to the triangle inequality, we have $|x - x_n| + |x - y_n| \ge |x_n - y_n|$. Thus, it is not possible that both distances $|x - x_n|$ and $|x - y_n|$ are smaller than $|x_n - y_n|/2$, because then their sum would be smaller than $|x_n - y_n|$. So, at least one of these distances has to be larger than or equal to $|x_n - y_n|/2$, and therefore, the largest of these distances $\max(|x - x_n|, |x - y_n|)$ is always larger than or equal to $|x_n - y_n|/2$. The only way for this largest distance to be equal to $|x_n - y_n|/2$ is when both distances $|x - x_n|$ and $|x - y_n|$ are equal to $|x_n - y_n|/2$, i.e., when the desired point $x$ is exactly at the midpoint (30) between the two given points $x_n$ and $y_n$.

One can show that we get the exact same answer (30) if we use triangular membership functions, symmetric piece-wise quadratic membership functions, different t-norms, etc.

X. EXPLANATION BASED ON FUZZY TECHNIQUES: RESULT

Our analysis shows that for many specific selections of the membership function for "approximate" and of the t-norm, we get the same answer: $x = \frac{x_n + y_n}{2}$. Since $y_n = a/x_n$, this midpoint is exactly $\frac{1}{2}\left(x_n + \frac{a}{x_n}\right)$. In other words, for many specific selections, as the next approximation $x_{n+1}$ to the square root $x$, we take exactly the value $x_{n+1}$ from the Babylonian procedure. Thus, fuzzy techniques indeed explain the selection of the Babylonian method.

XI. THE IMPORTANT ROLE OF SYMMETRY

The fact that different choices lead to the same result $x$ can be explained by the symmetry of the problem. Indeed, the problem is symmetric with respect to the reflection

$$x \to (x_n + y_n) - x \tag{34}$$

that swaps the values $x_n$ and $y_n$. Thus, if we get a unique solution $x$, this solution must be invariant with respect to this symmetry: otherwise, the symmetric point would be another solution [5]. This invariance means that $x = (x_n + y_n) - x$ and thus, that $x = \frac{x_n + y_n}{2}$.

XII. THIS APPLICATION OF FUZZY TECHNIQUES IS RATHER UNUSUAL

The fact that fuzzy techniques can be useful is well known [4], [6]. However, usually, fuzzy techniques lead to a good approximation: to an ideal control, to an ideal clustering, etc. What is unusual about the Babylonian algorithm is that here, fuzzy techniques lead to exactly the correct algorithm.

XIII. EXPLANATION BASED ON COMPUTATIONAL COMPLEXITY: MAIN CLAIM

In any iterative procedure for computing the square root, once we have the previous approximation $x_n$, we can use this approximation $x_n$, the value $a$, and (if needed) constants to compute the next approximation $x_{n+1} = f(x_n, a)$ for an appropriate expression $f$.
For the iterative method to be successful in computing the square root, the expression $f(x_n, a)$ should satisfy the following natural properties: first, if we start with the value $x_n$ which is already the square root, $x_n = \sqrt{a}$, then this procedure should not change this value, i.e., we should have $f(\sqrt{a}, a) = \sqrt{a}$; second, this iterative method should converge.

In the Babylonian method, the computation of the corresponding expression $f(x_n, a)$ involves three arithmetic operations: a division

$$\frac{a}{x_n}; \tag{35}$$

an addition

$$x_n + \frac{a}{x_n}; \tag{36}$$

and a multiplication

$$0.5 \cdot \left(x_n + \frac{a}{x_n}\right). \tag{37}$$

Our claim is that this is the simplest possible operation. In other words, our claim is that it is not possible to find an expression $f(x_n, a)$ which would be computable in 0, 1, or 2 arithmetic operations. Let us prove this claim.

XIV. IT IS NOT POSSIBLE TO HAVE 0 ARITHMETIC OPERATIONS

If we are not allowing any arithmetic operations at all, then as $x_{n+1} = f(x_n, a)$, we should return either the value $x_n$, or the value $a$, or some constant $c$. In all three cases, we do not get any convergence to the square root.

In the first case, the values $x_n$ remain the same and never converge to $\sqrt{a}$:

$$x_0 = x_1 = x_2 = \ldots = x_n = \ldots; \tag{38}$$

In the second case, we start with some initial value $x_0$ and then repeatedly return the values equal to $a$:

$$x_1 = x_2 = \ldots = x_n = \ldots = a; \tag{39}$$

In the third case, we start with a value $x_0$ and then repeatedly return the values equal to the constant $c$:

$$x_1 = x_2 = \ldots = x_n = \ldots = c. \tag{40}$$

XV. IT IS NOT POSSIBLE TO HAVE 1 ARITHMETIC OPERATION

The arithmetic operation is either addition, or subtraction, or multiplication, or division. Let us consider these four cases one by one. All operations involve the values $x_n$ and $a$ and (possibly) constants $c$ and $c'$.

For addition, depending on what we add, we get $x_n + x_n$, $x_n + a$, $x_n + c$ (with $c \ne 0$), $a + a$, $a + c$, and $c + c'$. In all these cases, for $x_n = \sqrt{a}$, the result is, in general, different from $\sqrt{a}$. So, the expression $f(x_n, a)$ involving only one addition does not satisfy the condition $f(\sqrt{a}, a) = \sqrt{a}$.

For subtraction, depending on what we subtract, we get the expressions $x_n - a$, $a - x_n$, $x_n - c$ with $c \ne 0$, $c - x_n$, $a - c$, $c - a$, and $c - c'$ (we dismiss the trivial expressions of the type $a - a = 0$). In all these cases, for $x_n = \sqrt{a}$, the result is, in general, different from $\sqrt{a}$. So, the expression $f(x_n, a)$ involving only one subtraction does not satisfy the condition $f(\sqrt{a}, a) = \sqrt{a}$.

For multiplication, depending on what we multiply, we get $x_n \cdot x_n$, $x_n \cdot a$, $c \cdot x_n$ (with $c \ne 1$), $a \cdot a$, $c \cdot a$, and $c \cdot c'$. In all these cases, for $x_n = \sqrt{a}$, the result is, in general, different from $\sqrt{a}$. So, the expression $f(x_n, a)$ involving only one multiplication does not satisfy the condition $f(\sqrt{a}, a) = \sqrt{a}$.

For division, depending on what we divide, we get the expressions $\frac{x_n}{a}$, $\frac{a}{x_n}$, $\frac{x_n}{c}$ with $c \ne 1$, $\frac{c}{x_n}$, $\frac{a}{c}$, $\frac{c}{a}$, and $\frac{c}{c'}$ (we dismiss the trivial expressions of the type $\frac{a}{a} = 1$). In all these cases, except for the case $\frac{a}{x_n}$, for $x_n = \sqrt{a}$, the result is, in general, different from $\sqrt{a}$. So, the expression $f(x_n, a)$ corresponding to all these cases does not satisfy the condition $f(\sqrt{a}, a) = \sqrt{a}$. In the remaining case

$$f(x_n, a) = \frac{a}{x_n}, \tag{41}$$

we do have

$$f(\sqrt{a}, a) = \frac{a}{\sqrt{a}} = \sqrt{a}, \tag{42}$$

but we do not satisfy the second condition, that of convergence. Indeed, in this case,

$$x_1 = \frac{a}{x_0}, \tag{43}$$

then

$$x_2 = \frac{a}{x_1} = \frac{a}{a/x_0} = x_0, \tag{44}$$

and then again,

$$x_3 = \frac{a}{x_2} = \frac{a}{x_0} = x_1, \tag{45}$$

etc. So, here, we have

$$x_0 = x_2 = \ldots = x_{2n} = \ldots, \tag{46}$$

$$x_1 = x_3 = \ldots = x_{2n+1} = \ldots \tag{47}$$

and no convergence.

XVI. IT IS NOT POSSIBLE TO HAVE 2 ARITHMETIC OPERATIONS

Similarly, one can prove that it is not possible to have two arithmetic operations. This can be proven by enumerating all possible sequences of arithmetic operations and then checking that for all possible inputs ($x_n$, $a$, or $c$) the resulting expression $f(x_n, a)$ either does not satisfy the requirement $f(\sqrt{a}, a) = \sqrt{a}$, or does not lead to convergence to the square root.

For example, if we first add, and then multiply, then we get the expression $(e + e') \cdot e''$. Replacing each of the possible inputs $e$, $e'$, and $e''$ with one of the possible values $x_n$, $a$, or $c$, we get all possible expressions $f(x_n, a)$ corresponding to this case. For example, if we take $e = x_n$ and $e' = e'' = a$, we get the expression $f(x_n, a) = (x_n + a) \cdot a$. This expression clearly does not satisfy the requirement $f(\sqrt{a}, a) = \sqrt{a}$.
If we take $e = x_n$, $e' = a$, and $e'' = c$, then we get the expression $f(x_n, a) = (x_n + a) \cdot c$, which also does not satisfy the same requirement.

By considering all possible cases, we can thus prove that no expression with 2 arithmetic operations is possible.

XVII. POSSIBLE PEDAGOGICAL USE OF THIS PROOF

In our opinion, this proof provides a good pedagogical example of a simple, easy-to-handle arithmetic problem that is actually not a toy problem at all: it is related to an efficient algorithm for computing the square root.

For each combination of operations and inputs, it is relatively easy to come up with an explanation of why this particular combination will not work. With a sufficiently large number of students in a class, and sufficient time allocated for this exercise, students can actually check all the possibilities and thus get a sense of achievement.

Indeed, we have 4 possible operations coming first and 4 possible operations coming second, so we have $4^2 = 16$ possible sequences of operations. So, if we have 16 students in the class, each student can handle one of these combinations. If we have 32 students, then we can divide students into pairs so that each pair of students handles one combination of operations.

For each of these combinations of operations, we have 3 options for each of the three inputs, so in total, we have a manageable number of $3^3 = 27$ possible combinations.

XVIII. IS BABYLONIAN FORMULA THE ONLY POSSIBLE OPERATION THAT REQUIRES THREE ARITHMETIC OPERATIONS?

We have shown that every iterative procedure for computing the square root requires at least three arithmetic operations on each iteration. Since the Babylonian procedure requires exactly three operations, it is thus indeed the fastest possible.

The next natural question is: are there other iterative procedures that require three arithmetic operations on every iteration? The answer is "No", but this proof requires considering a very large number of possible combinations.

XIX. A MORE MANAGEABLE PROOF: THE BABYLONIAN PROCEDURE IS THE FASTEST

In the above discussions, to estimate how fast each computation is, we simply counted the number of arithmetic operations. This counting makes sense as a good approximation to the actual computation time, but it implicitly assumes that all arithmetic operations require the exact same computation time.

In reality, in the computer, different arithmetic operations actually require different computation times. Specifically, in the computer (just like in human computations), addition and subtraction are the simplest (hence fastest) operations; multiplication, in effect, consists of several additions of the results of multiplying the first number by different digits of the second one, and thus, multiplication takes longer than addition or subtraction; finally, division, in effect, consists of several multiplications and thus requires an even longer time than multiplication.

Each iteration of the Babylonian algorithm consists of one division, one multiplication, and one addition. Since division is the most time-consuming of the arithmetic operations, to prove that the Babylonian algorithm is the fastest, we must first prove that no algorithm without division is possible.

Indeed, if we only use addition, subtraction, or multiplication, then the resulting expression is a polynomial. Once we have a polynomial $f(x_n, a)$, the requirement $f(\sqrt{a}, a) = \sqrt{a}$ can only be satisfied for the linear function $f(x_n, a) = x_n$, for which there is no convergence.

Thus, each expression must have at least one division, i.e., at least as many as the Babylonian expression. It can still be faster than the Babylonian formula if it has exactly 1 division and no multiplications, i.e., only the division, addition, and subtraction. By enumerating all possibilities, one can conclude that such an expression is impossible.

Thus, every expression must have at least one division and at least one multiplication. So, it is not possible to have an expression which is faster than the Babylonian one, but it may potentially be possible to have an expression which is exactly as fast as the Babylonian one, i.e., one that consists of one division, one multiplication, and one addition or subtraction.

Again, one can enumerate all possible combinations of these three operations and see that the Babylonian expression is the only one that leads to convergence to the square root.

ACKNOWLEDGMENTS

The author is thankful to the anonymous referees for valuable suggestions.

REFERENCES

[1] R. E. Bryant and D. R. O'Hallaron, Computer Systems: A Programmer's Perspective, Prentice Hall, Upper Saddle River, New Jersey, 2003.
[2] D. Flannery, The Square Root of Two, Springer Verlag, 2005.
[3] T. Heath, A History of Greek Mathematics, Clarendon Press, Oxford, 1921, Vol. 2.
[4] G. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall, Upper Saddle River, New Jersey, 1995.
[5] H. T. Nguyen and V. Kreinovich, Applications of Continuous Mathematics to Computer Science, Kluwer, Dordrecht, 1997.
[6] H. T. Nguyen and E. A. Walker, A First Course in Fuzzy Logic, CRC Press, Boca Raton, Florida, 2005.