4.3.3 Some Studies in Machine Learning Using the Game of Checkers

Arthur L. Samuel

Abstract: Two machine-learning procedures have been investigated in some detail using the game of checkers. Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program. Furthermore, it can learn to do this in a remarkably short period of time (8 or 10 hours of machine-playing time) when given only the rules of the game, a sense of direction, and a redundant and incomplete list of parameters which are thought to have something to do with the game, but whose correct signs and relative weights are unknown and unspecified. The principles of machine learning verified by these experiments are, of course, applicable to many other situations.

Introduction

The studies reported here have been concerned with the programming of a digital computer to behave in a way which, if done by human beings or animals, would be described as involving the process of learning. While this is not the place to dwell on the importance of machine-learning procedures, or to discourse on the philosophical aspects, there is obviously a very large amount of work, now done by people, which is quite trivial in its demands on the intellect but does, nevertheless, involve some learning. We have at our command computers with adequate data-handling ability and with sufficient computational speed to make use of machine-learning techniques, but our knowledge of the basic principles of these techniques is still rudimentary. Lacking such knowledge, it is necessary to specify methods of problem solution in minute and exact detail, a time-consuming and costly procedure. Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.

General methods of approach

At the outset it might be well to distinguish sharply between two general approaches to the problem of machine learning.
One method, which might be called the Neural-Net Approach, deals with the possibility of inducing learned behavior into a randomly connected switching net (or its simulation on a digital computer) as a result of a reward-and-punishment routine. A second, and much more efficient approach, is to produce the equivalent of a highly organized network which has been designed to learn only certain specific things. The first method should lead to the development of general-purpose learning machines. A comparison between the size of the switching nets that can be reasonably constructed or simulated at the present time and the size of the neural nets used by animals suggests that we have a long way to go before we obtain practical devices. [2] The second procedure requires reprogramming for each new application, but it is capable of realization at the present time. The experiments to be described here were based on this second approach.

Originally published in IBM Journal, Vol. 3, No. 3, July, 1959.

Choice of problem

For some years the writer has devoted his spare time to the subject of machine learning and has concentrated on the development of learning procedures as applied to games. [3] A game provides a convenient vehicle for such study as contrasted with a problem taken from life, since many of the complications of detail are removed. Checkers, rather than chess, [4-7] was chosen because the simplicity of its rules permits greater emphasis to be placed on learning techniques. Regardless of the relative merits of the two games as intellectual pastimes, it is fair to state that checkers contains all of the basic characteristics of an intellectual activity in which heuristic procedures and learning processes can play a major role and in which these processes can be evaluated.

Some of these characteristics might well be enumerated. They are:

(1) The activity must not be deterministic in the practical sense. There exists no known algorithm which will guarantee a win or a draw in checkers, and the complete explorations of every possible path through a checker game would involve perhaps 10^40 choices of moves which, at 3 choices per millimicrosecond, would still take 10^21 centuries to consider.

(2) A definite goal must exist (the winning of the game) and at least one criterion or intermediate goal must exist which has a bearing on the achievement of the final goal and for which the sign should be known. In checkers the goal is to deprive the opponent of the possibility of moving, and the dominant criterion is the number of pieces of each color on the board. The importance of having a known criterion will be discussed later.

(3) The rules of the activity must be definite and they should be known. Games satisfy this requirement. Unfortunately, many problems of economic importance do not. While in principle the determination of the rules can be a part of the learning process, this is a complication which might well be left until later.

(4) There should be a background of knowledge concerning the activity against which the learning progress can be tested.

(5) The activity should be one that is familiar to a substantial body of people so that the behavior of the program can be made understandable to them. The ability to have the program play against human opponents (or antagonists) adds spice to the study and, incidentally, provides a convincing demonstration for those who do not believe that machines can learn.

Having settled on the game of checkers for our learning studies, we must, of course, first program the computer to play legal checkers; that is, we must express the rules of the game in machine language and we must arrange for the mechanics of accepting an opponent's moves and of reporting the computer's moves, together with all pertinent data desired by the experimenter. The general methods for doing this were described by Shannon [8] in 1950 as applied to chess rather than checkers.
The basic program used in these experiments is quite similar to the program described by Strachey [9] in 1952. The availability of a larger and faster machine (the IBM 704), coupled with many detailed changes in the programming procedure, leads to a fairly interesting game being played, even without any learning. The basic forms of the program will now be described.

The basic checker-playing program

The computer plays by looking ahead a few moves and by evaluating the resulting board positions much as a human player might do. Board positions are stored by sets of machine words, four words normally being used to represent any particular board position. Thirty-two bit positions (of the 36 available in an IBM 704 word) are, by convention, assigned to the 32 playing squares on the checkerboard, and pieces appearing on these squares are represented by 1's appearing in the assigned bit positions of the corresponding word.

"Looking ahead" is prepared for by computing all possible next moves, starting with a given board position. The indicated moves are explored in turn by producing new board-position records corresponding to the conditions after the move in question (the old board positions being saved to facilitate a return to the starting point), and the process can be repeated. This look-ahead procedure is carried several moves in advance, as illustrated in Fig. 1. The resulting board positions are then scored in terms of their relative value to the machine.

The standard method of scoring the resulting board positions has been in terms of a linear polynomial. A number of schemes of an abstract sort were tried for evaluating board positions without regard to the usual checker concepts, but none of these was successful. [10] One way of looking at the various terms in the scoring polynomial is that those terms with numerically small coefficients should measure criteria related as intermediate goals to the criteria measured by the larger terms. The achievement of these intermediate goals indicates that the machine is going in the right direction, such that the larger terms will eventually increase.
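The board encoding described above (one bit per playing square, several words per position) can be illustrated with a minimal Python sketch. The text does not say here what each of the four words holds, so the split into black men, black kings, white men and white kings is an assumption made purely for illustration, as are the 0-31 square numbering and all names used.

```python
from dataclasses import dataclass

NUM_SQUARES = 32  # playing squares on the checkerboard
WORD_MASK = (1 << NUM_SQUARES) - 1


def squares_to_word(squares):
    """Set one bit for every occupied square (squares numbered 0..31, an assumed convention)."""
    word = 0
    for sq in squares:
        if not 0 <= sq < NUM_SQUARES:
            raise ValueError(f"square {sq} is outside 0..{NUM_SQUARES - 1}")
        word |= 1 << sq
    return word


def word_to_squares(word):
    """List the squares whose bits are set in a word."""
    return [sq for sq in range(NUM_SQUARES) if (word >> sq) & 1]


@dataclass(frozen=True)
class Board:
    # Hypothetical assignment of the four words; the paper's actual layout may differ.
    black_men: int
    black_kings: int
    white_men: int
    white_kings: int

    def occupied(self):
        """Bit mask of every square holding a piece."""
        return self.black_men | self.black_kings | self.white_men | self.white_kings

    def empty(self):
        """Bit mask of every vacant playing square."""
        return ~self.occupied() & WORD_MASK


# Twelve men per side, placed on the lowest and highest numbered squares.
start = Board(squares_to_word(range(12)), 0, squares_to_word(range(20, 32)), 0)
```

With a layout like this, questions such as "which squares are vacant?" reduce to a few logical operations on whole words, which is one plausible reason for choosing such an encoding.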
If the program could look far enough ahead we need only ask, "Is the machine still in the game?" [11] Since it cannot look this far ahead in the usual situation, we must substitute something else, say the piece ratio, and let the machine continue the look-ahead until one side has gained a piece advantage. But even this is not always possible, so we have the program test to see if the machine has gained a positional advantage, et cetera. Numerical measures of these various properties of the board positions are then added together (each with an appropriate coefficient which defines its relative importance) to form the evaluation polynomial.

More specifically, as defined by the rules for checkers, the dominant scoring parameter is the inability for one side or the other to move. [12] Since this can occur but once in any game, it is tested for separately and is not included in the scoring polynomial as tabulated by the computer during play. The next parameter to be considered is the relative piece advantage. It is always assumed that it is to the machine's advantage to reduce the number of the opponent's pieces as compared to its own. A reversal of the sign of this term will, in fact, cause the program to play "give-away" checkers, and with learning it can only learn to play a better and better give-away game. Were the sign of this term not known by the programmer it could, of course, be determined by tests, but it must be fixed by the experimenter and, in effect, it is one of the instructions to the machine defining its task. The numerical computation of the piece advantage has been arranged in such a way as to account for the well-known property that it is usually to one's advantage to trade pieces when one is ahead and to avoid trades when behind. Furthermore, it is assumed that kings are more valuable than pieces, the relative weights assigned to them being three to two. [13] This ratio means that the program will trade three men for two kings, or two kings for three men, if by so doing it can obtain some positional advantage.
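The two ideas above, a weighted sum of board measures and a piece count in which kings and men are weighted three to two, can be sketched as follows. This is only an illustration: the function and term names are hypothetical, and the sketch deliberately omits the adjustment that favors trades when ahead, since the text does not give its form.

```python
MAN_WEIGHT = 2   # relative weights of three to two, as stated in the text
KING_WEIGHT = 3


def material(men, kings):
    """Weighted piece count for one side."""
    return MAN_WEIGHT * men + KING_WEIGHT * kings


def piece_advantage(own_men, own_kings, opp_men, opp_kings):
    """Positive when the machine is ahead in weighted material."""
    return material(own_men, own_kings) - material(opp_men, opp_kings)


def evaluate(terms, coefficients):
    """Linear scoring polynomial: sum of coefficient times measured term value."""
    return sum(coefficients[name] * value for name, value in terms.items())
```

Under these weights three men and two kings balance exactly, so any positional term, however small its coefficient, decides whether such a trade is made.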

Figure 1. A "tree" of moves which might be investigated during the look-ahead procedure, starting from the initial board position. The actual branchings are much more numerous than those shown, and the "tree" is apt to extend to as many as 20 levels.

The choice for the parameters to follow this first term of the scoring polynomial and their coefficients then becomes a matter of concern. Two courses are open: either the experimenter can decide what these subsequent terms are to be, or he can arrange for the program to make the selection. We will discuss the first case in some detail in connection with the rote-learning studies and leave for a later section the discussion of various program methods of selecting parameters and adjusting their coefficients.

It is not satisfactory to select the initial move which leads to the board position with the highest score, since to reach this position would require the cooperation of the opponent. Instead, an analysis must be made proceeding backward from the evaluated board positions through the "tree" of possible moves, each time with consideration of the intent of the side whose move is being examined, assuming that the opponent would always attempt to minimize the machine's score while the machine acts to maximize its score. At each branch point, then, the corresponding board position is given the score of the board position which would result from the most favorable move. Carrying this "minimax" procedure back to the starting point results in the selection of a "best move." The score of the board position at the end of the most likely chain is also brought back, and for learning purposes this score is now assigned to the present board position. This process is shown in Fig. 2. The best move is executed, reported on the console lights, and tabulated by the printer.

The opponent is then permitted to make his move, which can be communicated to the machine either by means of console switches or by means of punched cards. The computer verifies the legality of the opponent's move, rejecting or accepting it, and the process is repeated. When the program can look ahead and predict a win, this fact is reported on the printer. Similarly, the program concedes when it sees that it is going to lose.
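The backing-up of scores through the move tree can be sketched in a few lines of Python. This is a generic minimax over a hand-built tree, not the paper's program: representing the tree as nested lists whose leaves are already-computed scores is an assumption made for brevity, and the function names are illustrative.

```python
def minimax(node, machine_to_move):
    """Back up a score: leaves are numbers, interior nodes are lists of children."""
    if not isinstance(node, list):
        return node  # an evaluated board position
    scores = [minimax(child, not machine_to_move) for child in node]
    # The machine picks the largest score, the opponent the smallest.
    return max(scores) if machine_to_move else min(scores)


def best_move(tree):
    """Return (index of the best initial move, score backed up to the present position)."""
    backed_up = [minimax(child, machine_to_move=False) for child in tree]
    best = max(range(len(tree)), key=backed_up.__getitem__)
    return best, backed_up[best]


# Three candidate moves, each answered by one of two opponent replies.
example_tree = [[3, 5], [2, 9], [0, 7]]
```

In `example_tree` the highest leaf (9) sits under the second move, but reaching it would need the opponent's cooperation; the backed-up scores are 3, 2 and 0, so the first move is chosen.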
Ply limitations

Playing-time considerations make it necessary to limit the look-ahead distance to some fairly small value. This distance is defined as the ply (a ply of 2 consisting of one proposed move by the machine and the anticipated reply by the opponent). The ply is not fixed but depends upon the dynamics of the situation, and it varies from move to move and from branch to branch during the move analysis. A great many schemes of adjusting the look-ahead distance have been tried at various times, some of them quite complicated. The most effective one, although quite detailed, is simple in concept and is as follows. The program always looks ahead a minimum distance, which for the opening game and without learning is usually set at three moves. At this minimum ply the program will evaluate the board position if none of the following conditions occurs: (1) the next move is a jump, (2) the last move was a jump, or (3) an exchange offer is possible. If any one of these conditions exists, the program continues looking ahead. At a ply of 4 the program will stop and evaluate the resulting board position if conditions (1) and (3) above are not met. At a ply of 5 or greater, the program stops the look-ahead whenever the next ply level does not offer a jump. At a ply of 11 or greater, the program will terminate the look-ahead, even if the next move is to be a jump, should one side at this time be ahead by more than two kings (to prevent the needless exploration of obviously losing or winning sequences). The program stops at a ply of 20 regardless of all conditions (since the memory space for the look-ahead moves is then exhausted) and an adjustment in score is made to allow for the pending jump. Finally, an adjustment is made in the levels of the break points between the different conditions when time is saved through rote learning (see below) and when the total number of pieces on the board falls below an arbitrary number. All break points are determined by single data words which can be changed at any time by manual intervention.
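The stopping rule above can be written as a single predicate. The following is a sketch under stated assumptions rather than a transcription of the program: the break points are ordinary keyword arguments (mirroring the remark that they are adjustable data words), the large-lead cutoff ply defaults to 11 as an assumed value, and the lead is supplied by the caller already expressed in kings. All names are hypothetical.

```python
def stop_lookahead(ply, next_is_jump, last_was_jump, exchange_possible,
                   lead_in_kings=0.0, min_ply=3, lead_cutoff_ply=11, max_ply=20):
    """Return True when the board should be evaluated instead of searched deeper."""
    if ply >= max_ply:
        return True   # memory for look-ahead moves is exhausted
    if ply < min_ply:
        return False  # always look ahead the minimum distance
    if ply >= lead_cutoff_ply and abs(lead_in_kings) > 2:
        return True   # obviously winning or losing line, even if a jump is pending
    if ply == min_ply:
        # Evaluate only if none of conditions (1), (2), (3) holds.
        return not (next_is_jump or last_was_jump or exchange_possible)
    if ply == min_ply + 1:
        # Evaluate if conditions (1) and (3) are not met.
        return not (next_is_jump or exchange_possible)
    # Deeper levels: stop whenever the next ply does not offer a jump.
    return not next_is_jump
```

The effect is that quiet positions are evaluated early, while lines containing jumps or exchange offers are followed until they settle down.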
This tying of the ply with board conditions achieves three desired results. In the first place, it permits board evaluations to be made under conditions of relative stability for so-called dead positions, as defined by Turing. [14] Secondly, it causes greater surveillance of those paths which offer better opportunities for gaining or losing an advantage. Finally, since branching is usually seriously restricted by a jump situation, the total number of board positions and moves to be considered is still held down to a reasonable number and is more equitably distributed between the various possible initial moves.

As a practical matter, machine-playing time usually has been limited to approximately 30 seconds per move. Elaborate table-lookup procedures, fast sorting and searching procedures, and a variety of new programming tricks were developed, and full use was made of all of the resources of the IBM 704 to increase the operating speed as much as possible. One can, of course, set the playing time at any desired value by adjustments of the permitted ply; too small a ply results in a bad game and too large a ply makes the game unduly costly in terms of machine time.

Other modes of play

For study purposes the program was written to accommodate several variations of this basic plan. One of these permits the program to play against itself, that is, to play both sides of the game. This mode of play has been found to be especially good during the early stages of learning.

The program can also follow book games presented to it either on cards or on magnetic tape. When operating in this mode, the program decides at each point in the game on its next move in the usual way and reports this proposed move. Instead of actually making this move, the program refers to the stored record of a book game and makes the book move. The program records its evaluation of the two moves, and it also counts and reports the number of possible moves which the program rates as being better than the book move and the number it rates as being poorer. The sides are then reversed and the process is repeated. At the end of a book game a correlation coefficient is computed, relating the machine's indicated moves to those moves adjudged best by the checker masters. [15]

Figure 2. Simplified diagram showing how the evaluations are backed up through the "tree" of possible moves to arrive at the best next move. (Labels in the diagram, by ply number: machine chooses branch with largest score; opponent expected to choose branch with smallest score; machine chooses branch with most positive score; evaluations made at this level.)

It should be noted that the emphasis throughout all of these studies has been on learning techniques. The temptation to improve the machine's game by giving it standard openings or other man-generated knowledge of playing techniques has been consistently resisted. Even when book games are played, no weight is given to the fact that the moves as listed are presumably the best possible moves under the circumstances.

For demonstration purposes, and also as a means of avoiding lost machine time while an opponent is thinking, it is sometimes convenient to play several simultaneous games against different opponents. With the program in its present form the most convenient number for this purpose has been found to be six, although eight have been played on a number of occasions.

Games may be started with any initial configuration for the board position so that the program may be tested on end games, checker puzzles, et cetera. For nonstandard starting conditions, the program lists the initial piece arrangement. From time to time, and at the end of each game, the program also tabulates various bits of statistical information which assist in the evaluation of playing performance.
Numerous other features have also been added to make the program convenient to operate (for details see Appendix A), but these have no direct bearing on the problem of learning, to which we will now turn our attention.

Rote learning and its variants

Perhaps the most elementary type of learning worth discussing would be a form of rote learning in which the program simply saved all of the board positions encountered during play, together with their computed scores. Reference could then be made to this memory record and a certain amount of computing time might be saved. This can hardly be called a very advanced form of learning; nevertheless, if the program then utilizes the saved time to compute further in depth it will improve with time.

Fortunately, the ability to store board information at a ply of 0 and to look up boards at a larger ply provides the possibility of looking much farther in advance than might otherwise be possible. To understand this, consider a very simple case where the look-ahead is always terminated at a fixed ply, say 3. Assume further that the program saves only the board positions encountered during the actual play with their associated backed-up scores. Now it is this list of previous board positions that is used to look up board positions while at a ply level of 3 in the subsequent games. If a board position is found, its score has, in effect, already been backed up by three levels, and if it becomes effective in determining the move to be made, it is a 6-ply score rather than a simple 3-ply score. This new initial board position with its 6-ply score is, in turn, saved and it may be encountered in a future game and the score backed up by an additional set of three levels, et cetera. This procedure is illustrated in Fig. 3. The incorporation of this variation, together with the simpler rote-learning feature, results in a fairly powerful learning technique which has been studied in some detail.

Several additional features had to be incorporated into the program before it was practical to embark on learning studies using this storage scheme. In the first place, it was necessary to impart a sense of direction to the program in order to force it to press on toward a win. To illustrate this, consider the situation of two kings against one king, which is a winning combination for practically all variations in board positions. In time, the program can be assumed to have stored all of these variations, each associated with a winning score. Now, if such a situation is encountered, the program will look ahead along all possible paths and each path will lead to a winning combination, in spite of the fact that only one of the possible initial moves may be along the direct path toward the win while all of the rest may be wasting time. How is the program to differentiate between these? A good solution is to keep a record of the ply value of the different board positions at all times and to make a further choice between board positions on this basis. If ahead, the program can be arranged to push directly toward the win while, if behind, it can be arranged to adopt delaying tactics.
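The way stored scores deepen the effective look-ahead, as described in the rote-learning discussion above, can be shown with a toy sketch. Nothing here is checkers: the "game" is a hypothetical single chain of positions 0, 1, ..., 10 in which only position 10 has a nonzero static score, chosen so the effect is easy to follow. The function names and the use of a plain dictionary as the memory record are assumptions.

```python
def lookahead(position, ply, memory, successors, static_eval, maximizing=True):
    """Fixed-ply minimax that consults the memory record at the evaluation level."""
    moves = successors(position)
    if ply == 0 or not moves:
        # A remembered board carries a score that was already backed up earlier.
        if position in memory:
            return memory[position]
        return static_eval(position)
    scores = [lookahead(m, ply - 1, memory, successors, static_eval, not maximizing)
              for m in moves]
    return max(scores) if maximizing else min(scores)


def toy_successors(n):
    """Hypothetical game: each position has one successor until position 10."""
    return [n + 1] if n < 10 else []


def toy_eval(n):
    """Only the final position is recognizably good."""
    return 100 if n == 10 else 0


memory = {}
# A 3-ply search from position 7 reaches position 10; the backed-up score is saved.
memory[7] = lookahead(7, 3, memory, toy_successors, toy_eval)
```

A later 3-ply search from position 4 ends at position 7, finds it in `memory`, and so returns a score that is effectively a 6-ply score, whereas the same search with an empty memory sees nothing of interest.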
The most recent method used is to carry the effective ply along with the score by simply decreasing the magnitude of the score a small amount each time it is backed up a ply level during the analyses. If the program is now faced with a choice of board positions whose scores differ only by the ply number, it will automatically make the most advantageous choice, choosing a low-ply alternative if winning and a high-ply alternative if losing. The significance of this concept of a direction sense should not be overlooked. Even without "learning," it is very important. Several of the early attempts at learning failed because the direction sense was not properly taken into account.

Cataloging and culling stored information

Since practical considerations limit the number of board positions which can be saved, and since the time to search through those that are saved can easily become unduly long, one must devise systems (1) to catalog boards that are saved, (2) to delete redundancies, and (3) to discard board positions which are not believed to be of much value. The most effective cataloging system found to date starts by standardizing all board positions, first by reversing the pieces and piece positions if it is a board position in which White is to move, so that all boards are reported as if it were Black's turn to move. This reduces by nearly a factor of two the number of boards which must be saved. Board positions in which all of the pieces are kings can be reflected about the diagonals with a possible fourfold reduction in the number which must be saved. A more compact board representation than the one employed during play is also used so as to minimize the storage requirements. After the board positions are standardized,
they are grouped into records on the basis of (1) the number of pieces on the board, (2) the presence or absence of a piece advantage, (3) the side possessing this advantage, (4) the presence or absence of kings on the board, (5) the side having the so-called "move," or opposition advantage, and finally (6) the first moments of the pieces about normal and diagonal axes through the board. During play, newly acquired board positions are saved in the memory until a reasonable number have been accumulated, and they are then merged with those on the "memory tape" and a new memory tape is produced. Board positions within a record are listed in a serial fashion, being sorted with respect to the words which define them. The records are arranged on the tape in the order that they are most likely to be needed during the course of a game; board positions with 12 pieces to a side coming first, et cetera. This method of cataloging is very important because it cuts tape-searching time to a minimum.

Reference must be made, of course, to the board positions already saved, and this is done by reading the correct record into the memory and searching through it by a dichotomous search procedure. Usually five or more records are held in memory at one time, the exact number at any time depending upon the lengths of the particular records in question. Normally, the program calls three or four new records into memory during each new move, making room for them as needed, by discarding the records which have been held the longest.

Two different procedures have been found to be of value in limiting the number of board positions that are saved; one based on the frequency of use, and the second on the ply. To keep track of the frequency of use, an age term is carried along with the score. Each new board position to be saved is arbitrarily assigned an age. When reference is made to a stored board position, either to update its score or to utilize it in the look-ahead procedure, the age recorded for this board position is divided by two. This is called refreshing.
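The age bookkeeping (halving on each reference, together with the periodic aging and expunging that the text describes next) can be sketched as a small class. The initial age and the maximum age are called arbitrary in the text, so the defaults below are placeholders, and the class and method names are hypothetical.

```python
class BoardMemory:
    """Stored board positions with an age term carried alongside each score."""

    def __init__(self, initial_age=8, max_age=16):
        self.initial_age = initial_age  # arbitrary starting age (placeholder value)
        self.max_age = max_age          # arbitrary forgetting threshold (placeholder value)
        self.entries = {}               # board -> [score, age]

    def save(self, board, score):
        self.entries[board] = [score, self.initial_age]

    def lookup(self, board):
        """Return the stored score, refreshing the entry by halving its age."""
        entry = self.entries.get(board)
        if entry is None:
            return None
        entry[1] /= 2
        return entry[0]

    def merge_tick(self):
        """Called at each memory merge: age every board by one unit and forget old ones."""
        for board in list(self.entries):
            self.entries[board][1] += 1
            if self.entries[board][1] >= self.max_age:
                del self.entries[board]
```

Because a reference halves the age while a merge adds only one unit, a board that is used a few times in succession stays well below the threshold for many merges, which matches the behavior the text attributes to the scheme.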
Offsetting this, each board position is automatically aged by one unit at the memory merge times (normally occurring about once every 20 moves). When the age of any one board position reaches an arbitrary maximum value this board position is expunged from the record. This is a form of forgetting. New board positions which remain unused are soon forgotten, while board positions which are used several times in succession will be refreshed to such an extent that they will be remembered even if not used thereafter for a fairly long period of time. This form of refreshing and forgetting was adopted on the basis of reflections as to the frailty of human memories. It has proven to be very effective.

In addition to the limitations imposed by forgetting, it seemed desirable to place a restriction on the maximum size of any one record. Whenever an arbitrary limit is reached, enough of the lowest-ply board positions are automatically culled from the record to bring the size well below the maximum.

Figure 3. Simplified representation of the rote-learning process, in which information saved from a previous game is used to increase the effective ply of the backed-up score. (Labels in the diagram: ply number; evaluations would normally be made at this level; previous evaluation level.)

Before embarking on a study of the learning capabilities of the system as just described, it was, of course, first necessary to fix the terms and coefficients in the evaluation polynomial. To do this, a number of different sets of values were tested by playing through a series of book games and computing the move correlation coefficients. These values varied from 0.2 for the poorest polynomial tested, to approximately 0.6 for the one finally adopted. The selected polynomial contained four terms (as contrasted with the use of 16 terms in later experiments). In decreasing order of importance these were: (1) piece advantage, (2) denial of occupancy, (3) mobility, and (4) a hybrid term which combined control of the center and piece advancement.

Rote-learning tests

After a scoring polynomial was arbitrarily picked, a series of games was played, both self-play and play against many different individuals (several of these being checker masters). Many book games were also followed, some of these being end games. The program learned to play a very good opening game and to recognize most winning and losing end positions many moves in advance, although its midgame play was not greatly improved. This program now qualifies as a rather better-than-average novice, but definitely not as an expert.

At the present time the memory tape contains something over 53,000 board positions (averaging 3.8 words each) which have been selected from a much larger number of positions by means of the culling techniques described. While this is still far from the number which would tax the listing and searching procedures used in the program, rough estimates, based on the frequency with which the saved boards are utilized during normal play (these figures being tabulated automatically), indicate that a library tape containing at least 20 times the present number of board positions would be needed to improve the midgame play significantly. At the present rate of acquisition of new positions this would require an inordinate amount of play and, consequently, of machine time. [16]

The general conclusions which can be drawn from these tests are that:

(1) An effective rote-learning technique must include a procedure to give the program a sense of direction, and it must contain a refined system for cataloging and storing information.

(2) Rote-learning procedures can be used effectively on machines with the data-handling capacity of the IBM 704 if the information which must be saved and searched does not occupy more than, roughly, one million words, and if not more than one hundred or so references need to be made to this information per minute. These figures are, of course, highly dependent upon the exact efficiency of cataloging which can be achieved.

(3) The game of checkers, when played with a simple scoring scheme and with rote learning only, requires more than this number of words for master caliber of play and, as a consequence, is not completely amenable to this treatment on the IBM 704.

(4) A game, such as checkers,
is a suitable vehicle for use during the development of learning techniques, and it is a very satisfactory device for demonstrating machine-learning procedures to the unbelieving.

Learning procedure involving generalizations

An obvious way to decrease the amount of storage needed to utilize past experience is to generalize on the basis of experience and to save only the generalizations. This should, of course, be a continuous process if it is to be truly effective, and it should involve several levels of abstraction. A start has been made in this direction by having the program select a subset of possible terms for use in the evaluation polynomial and by having the program determine the sign and magnitude of the coefficients which multiply these parameters. At the present time this subset consists of 16 terms chosen from a list of 38 parameters. The piece-advantage term needed to define the task is computed separately and, of course, is not altered by the program.

After a number of relatively unsuccessful attempts to have the program generalize while playing both sides of the game, the program was arranged to act as two different players, for convenience called Alpha and Beta. Alpha generalizes on its experience after each move by adjusting the coefficients in its evaluation polynomial and by replacing terms which appear to be unimportant by new parameters drawn from a reserve list. Beta, on the contrary, uses the same evaluation polynomial for the duration of any one game. Program Alpha is used to play against human opponents, and during self-play Alpha and Beta play each other.

At the end of each self-play game a determination is made of the relative playing ability of Alpha, as compared with Beta, by a neutral portion of the program. If Alpha wins, or is adjudged to be ahead when a game is otherwise terminated, the then current scoring system used by Alpha is given to Beta. If, on the other hand, Beta wins or is ahead, this fact is recorded as a black mark for Alpha.
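The end-of-game bookkeeping between Alpha and Beta can be sketched as below. The copy-on-win and black-mark steps follow the paragraph above; the reset after a set number of black marks (three by default) anticipates the rule the text states next. Several details are assumptions made only to keep the sketch runnable: the "leading term" is taken to be the term with the largest coefficient magnitude, the black-mark count is cleared after a reset, and it is left untouched when Alpha wins. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Player:
    coefficients: dict     # term name -> coefficient of the scoring polynomial
    black_marks: int = 0


def after_self_play_game(alpha, beta, alpha_ahead, max_black_marks=3):
    """Update both players once the neutral judge has decided who is ahead."""
    if alpha_ahead:
        # Alpha's current scoring system is handed to Beta.
        beta.coefficients = dict(alpha.coefficients)
        return "beta_updated"
    alpha.black_marks += 1
    if alpha.black_marks >= max_black_marks:
        # Assumed reading of "leading term": largest coefficient magnitude.
        leading = max(alpha.coefficients, key=lambda k: abs(alpha.coefficients[k]))
        alpha.coefficients[leading] = 0
        alpha.black_marks = 0  # assumption: start counting afresh after the change
        return "alpha_reset"
    return "black_mark"
```

Zeroing the leading term is a deliberately coarse jolt; its purpose is to move Alpha away from a scoring polynomial on which it appears to be stuck.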
Whenever Alpha receives an arbitrary number of black marks (usually set at three) it is assumed to be on the wrong track, and a fairly drastic and arbitrary change is made in its scoring polynomial (by reducing the coefficient of the leading term to zero). This action is necessary on occasion, since the entire learning process is an attempt to find the highest point in multidimensional scoring space in the presence of many secondary maxima on which the program can become trapped. By manual intervention it is possible to return to some previous condition or make some other change if it becomes apparent that the learning process is not functioning properly. In general, however, the program seeks to extricate itself from traps and to improve more or less continuously.

The capability of the program can be tested at any time by having Alpha play one or more book games (with the learning procedure temporarily immobilized) and by correlating its play with the recommendations of the masters or, more interestingly, by pitting it against a human player.

Polynomial modification procedure

If Alpha is to make changes in its scoring polynomial, it must be given some trustworthy criteria for measuring performance. A logical difficulty presents itself, since the only measuring parameter available is this same scoring polynomial that the process is designed to improve. Recourse is had to the peculiar property of the look-ahead procedure, which makes it less important for the scoring polynomial to be particularly good the further ahead the process is continued. This means that one can evaluate the relative change in the positions of two players, when this evaluation is made over a fairly large number of moves, by using a scoring system which is much too gross to be significant on a move-by-move basis.

Perhaps an even better way of looking at the matter is that we are attempting to make the score, calculated for the current board position, look like that calculated for the terminal board position of the chain of moves which most probably will occur during actual play. Of course, if one could develop a perfect system of this sort it would be the equivalent of always looking ahead to the end of the game. The nearer this ideal is approached, the better would be the play.[18]

In order to obtain a sufficiently large span to make use of this characteristic, Alpha keeps a record of the apparent goodness of its board positions as the game progresses. This record is kept by computing the scoring polynomial for each board position encountered in actual play and by saving this polynomial in its entirety. At the same time, Alpha also computes the backed-up score for all board positions, using the look-ahead procedure described earlier. At each play by Alpha the initial board score, as saved from the previous Alpha move, is compared with the backed-up score for the current position. The difference between these scores, defined as delta, is used to check the scoring polynomial. If delta is positive it is reasonable to assume that the initial board evaluation was in error and terms which contributed positively should have been given more weight, while those that contributed negatively should have been given less weight. A converse statement can be made for the case where delta is negative. Presumably, in this case, either the initial board evaluation was incorrect, or a wrong choice of moves was made, and greater weight should have been given to terms making negative contributions, with less weight to positive terms. These changes are not made directly but are brought about in an involved way which will now be described.

A record is kept of the correlation existing between the signs of the individual term contributions in the initial scoring polynomial and the sign of delta.
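The sign comparison just described can be illustrated with a short Python sketch. The function names are hypothetical, and taking delta as the backed-up score minus the saved initial score is an assumption chosen to match the statement that a positive delta means positively contributing terms deserved more weight. Terms whose contribution is zero yield 0 and carry no evidence either way.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_agreements(contributions, initial_score, backed_up_score):
    """Compare each term's sign with the sign of delta.

    contributions   -- per-term contributions to the saved initial score
    initial_score   -- score saved at the previous Alpha move
    backed_up_score -- look-ahead score for the current position
    Returns (delta, agreements), where each agreement is +1 if the term's
    sign matches delta's, -1 if it opposes it, and 0 if either is zero.
    """
    delta = backed_up_score - initial_score  # assumed sign convention
    agreements = [sign(c) * sign(delta) for c in contributions]
    return delta, agreements


delta, agreements = sign_agreements([3, -2, 0], initial_score=1, backed_up_score=5)
print(delta, agreements)
```

A run of +1 values for a term suggests its weight should grow; a run of -1 values suggests its sign or weight is wrong.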
After each play an adjustment is made in the values of the correlation coefficients, due account being taken of the number of times that each particular term has been used and has had a nonzero value. The coefficient for the polynomial term (other than the piece-advantage term) with the then largest correlation coefficient is set at a prescribed maximum value with proportionate values determined for all of the remaining coefficients. Actually, the term coefficients are fixed at integral powers of 2, this power being defined by the ratio of the correlation coefficients. More precisely, if the ratio of two correlation coefficients is equal to or larger than n but less than n+1, where n is an integer, then the ratio of the two term coefficients is set equal to $2^n$. This procedure was adopted in order to increase the range in values of the term coefficients. Whenever a correlation-coefficient calculation leads to a negative sign, a corresponding reversal is made in the sign associated with the term itself.

Instabilities

It should be noted that the span of moves over which delta is computed consists of a remembered part and an anticipated portion. During the remembered play, use had been made of Alpha's current scoring polynomial to determine Alpha's moves but not to determine the opponent's moves, while during the anticipation play the moves for both sides are made using Alpha's scoring polynomial. One is tempted to increase the sensitivity of delta as an indicator of change by increasing the span of the remembered portion. This has been found to be dangerous since the coefficients in the evaluation polynomial and, indeed, the terms themselves, may change between the time of the remembered evaluation and the time at which the anticipation evaluation is made. As a matter of fact, this difficulty is present even for a span of one move-pair.
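The power-of-two rule for term coefficients, described before the discussion of instabilities, can be sketched in Python as below. This is one literal reading of the text rather than the program's actual routine: the function name and the `max_exponent` parameter (standing in for the unspecified "prescribed maximum value") are hypothetical, the "largest" correlation is taken by magnitude, and a term with zero correlation is given a zero coefficient. Under this reading a term whose correlation is within a factor of two of the largest already receives half the maximum coefficient.

```python
import math


def quantize_coefficients(correlations, max_exponent=8):
    """Map correlation coefficients to signed power-of-two term coefficients.

    The term with the largest |correlation| gets 2**max_exponent. For every
    other term, n = floor(largest / |correlation|) and its coefficient is
    smaller than the maximum by a factor of 2**n. A negative correlation
    reverses the sign of the term.
    """
    magnitudes = [abs(c) for c in correlations]
    largest = max(magnitudes)
    if largest == 0:
        return [0] * len(correlations)
    top = magnitudes.index(largest)
    coeffs = []
    for i, c in enumerate(correlations):
        if c == 0:
            coeffs.append(0)  # assumption: no evidence, no weight
            continue
        n = 0 if i == top else math.floor(largest / abs(c))
        exponent = max_exponent - n
        magnitude = 2 ** exponent if exponent >= 0 else 0
        coeffs.append(magnitude if c > 0 else -magnitude)
    return coeffs


print(quantize_coefficients([0.8, 0.4, -0.25, 0.0], max_exponent=4))
```

Because the exponent falls by one for each unit of the correlation ratio, modest differences in correlation become large differences in weight, which is the stated purpose of increasing the range of coefficient values.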
It is necessary to recompute the scoring polynomial for a given initial board position after a move has been determined and after the indicated corrections in the scoring polynomial have been made, and to save this score for future comparisons, rather than to save the score used to determine the move. This may seem a trivial point, but its neglect in the initial stages of these experiments led to oscillations quite analogous to the instability induced in electrical circuits by long delays in a feedback loop.

As a means of stabilizing against minor variations in the delta values, an arbitrary minimum value was set, and when delta fell below this minimum for any particular move no change was made in the polynomial. This same minimum value is used to set limits for the initial board evaluation score to decide whether or not it will be assumed to be zero. This minimum is recomputed each time and, normally, has been fixed at the average value of the coefficients for the terms in the currently existing evaluation polynomial.

Still another type of instability can occur whenever a new term is introduced into the scoring polynomial. Obviously, after only a single move the correlation coefficient of this new term will have a magnitude of 1, even though it might go to 0 after the very next move. To prevent violent fluctuations due to this cause, the correlation coefficients for newly introduced terms are computed as if these terms had already been used several times and had been found to have a zero correlation coefficient. This is done by replacing the times-used number in the calculation by an arbitrary number (usually set at 16) until the usage does, in fact, equal this number. After a term has been in use for some time, quite the opposite action is desired so that the more recent experience can outweigh earlier results. This is achieved, together with a substantial reduction in calculation time, by using powers of 2 in place of the actual times-used and by limiting the maximum power that is used.
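The times-used schedule can be sketched in Python as follows, using the specific values given in the next paragraph (16 until the count reaches 32, then 32 until 64, and so on, never more than 256). The function names are hypothetical, and the update is written as a running average of +1/-1 sign agreements, which is one reading of the formula in the text with the sign of the unit term chosen so that agreement pulls the coefficient toward +1.

```python
INITIAL_N = 16  # "usually set at 16"
MAX_N = 256     # "a value for N larger than 256 is never used"


def effective_n(times_used: int) -> int:
    """Return the divisor used in place of the actual times-used count."""
    if times_used < 2 * INITIAL_N:
        return INITIAL_N
    # Largest power of 2 not exceeding times_used, capped at MAX_N.
    power = 1 << (times_used.bit_length() - 1)
    return min(power, MAX_N)


def update_correlation(c_prev: float, agreement: int, times_used: int) -> float:
    """One correction step: C_N = C_{N-1} - (C_{N-1} - agreement) / N.

    agreement is +1 when the term's sign matched the sign of delta
    and -1 when it opposed it (assumed convention).
    """
    n = effective_n(times_used)
    return c_prev - (c_prev - agreement) / n


c = 0.0
for move in range(1, 5):
    c = update_correlation(c, +1, move)
print(c)
```

Starting a new term with a divisor of 16 means a single agreement moves its coefficient only to 1/16 rather than to 1, and capping the divisor at 256 keeps old experience from outweighing recent play indefinitely. Divisors that are powers of 2 also reduce the division to a shift.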
To be specific, at any stage of play defined as the Nth move, corrections to the values of the correlation coefficients $C_N$ are made using 16 for N until N equals 32, whereupon 32 is used until N equals 64, et cetera, using the formula:

$$C_N = C_{N-1} - \frac{C_{N-1} \pm 1}{N}$$

and a value for N larger than 256 is never used.

After a minimum was set for delta it seemed reasonable to attach greater weight to situations leading to large values of delta. Accordingly, two additional categories are defined. If a contribution to delta is made by the first term, meaning that a change has occurred in the piece ratio, the indicated changes in the correlation coefficients are doubled, while if the value of delta is so large as to indicate that an almost sure win or loss will result, the effect on the correlation coefficients is quadrupled.

Term replacement

Mention has been made several times of the procedure for replacing terms in the scoring polynomial. The program, as it is currently running, contains 38 different terms (in addition to the piece-advantage term), 16 of these being included in the scoring polynomial at any one time and the remaining 22 being kept in reserve. After each move a low-term tally is recorded against that active term which has the lowest correlation coefficient and, at the same time, a test is made to see if this brings its tally count up to some arbitrary limit, usually set at 8. When this limit is reached for any specific term, this term is transferred to the bottom of the reserve list, and it is replaced by a term from the head of the reserve list. This new term enters the polynomial with zero values for its correlation coefficient, times used, and low-tally count. On the average, then, an active term is replaced once each eight moves and the replaced terms are given another chance after 176 moves. As a check on the effectiveness of this procedure, the program reports on the usage which has accrued against each discarded term. Terms which are repeatedly rejected after a minimum amount of usage can be removed and replaced with completely new terms.

It might be argued that this procedure of having the program select terms for the evaluation polynomial from a supplied list is much too simple and that the program should generate the terms for itself. Unfortunately, no satisfactory scheme for doing this has yet been devised.
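The term-replacement rule described above behaves like a queue, and a small Python sketch makes the 176-move figure easy to check: with 22 terms in reserve and one replacement every 8 moves, a discarded term waits 22 x 8 = 176 moves before it returns. The function and field names are hypothetical, and "lowest correlation coefficient" is read here as lowest magnitude, which is an assumption.

```python
from collections import deque

TALLY_LIMIT = 8  # "usually set at 8"


def tally_and_replace(active, reserve, stats):
    """Charge a low-term tally to the weakest active term; swap it at the limit.

    active  -- list of term names currently in the polynomial
    reserve -- deque of term names, head of the reserve list at the left
    stats   -- dict: name -> {"corr": float, "used": int, "tally": int}
    Returns the name of the term brought in, or None if no swap occurred.
    """
    weakest = min(active, key=lambda name: abs(stats[name]["corr"]))
    stats[weakest]["tally"] += 1
    if stats[weakest]["tally"] < TALLY_LIMIT:
        return None
    reserve.append(weakest)        # discarded term goes to the bottom
    newcomer = reserve.popleft()   # replacement comes from the head
    active[active.index(weakest)] = newcomer
    # The new term enters with zero correlation, times used and tally.
    stats[newcomer] = {"corr": 0.0, "used": 0, "tally": 0}
    return newcomer


active = [f"t{i}" for i in range(16)]
reserve = deque(f"t{i}" for i in range(16, 38))
stats = {n: {"corr": 1.0, "used": 0, "tally": 0} for n in active + list(reserve)}
stats["t0"]["corr"] = 0.0  # make t0 the weakest term to start with
```

In this toy setup no learning takes place, so each newcomer (entering with zero correlation) is itself the weakest term and is replaced eight moves later, which reproduces the average rate quoted in the text.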
With a man-generated list one might at least ask that the terms be members of an orthogonal set, assuming that this has some meaning as applied to the evaluation of a checker position. Apparently, no one knows enough about checkers to define such a set. The only practical solution seems to be that of including a relatively large number of possible terms in the hope that all of the contributing parameters get covered somehow, even though in an involved and redundant way. This is not an undesirable state of affairs, however, since it simulates the situation which is likely to exist when an attempt is made to apply similar learning techniques to real-life situations.

Many of the terms in the existing list are related in some vague way to the parameters used by checker experts. Some of the concepts which checker experts appear to use have eluded the writer's attempts at definition, and he has been unable to program them. Some of the terms are quite unrelated to the usual checker lore and have been discovered more or less by accident. The second moment about the diagonal axis through the double corners is an example. Twenty-seven different simple terms are now in use, the rest being combinational terms, as will be described later.

A word might be said about these terms with respect to the exact way in which they are defined and the general procedures used for their evaluation. Each term relates to the relative standings of the two sides, with respect to the parameter in question, and it is numerically equal to the difference between the ratings for the individual sides. A reversal of the sign obviously corresponds to a change of sides. As a further means of insuring symmetry the individual ratings of the respective sides are determined at corresponding times in the play as viewed by the side in question. For example, consider a parameter which relates to the board conditions as left after one side has moved. The rating of Black for such a parameter would be made after Black had moved, and the rating of White would not be made until after White had moved.
During anticipation play, these individual ratings are made after each move and saved for future reference. When an evaluation is desired the program takes the differences between the most recent ratings and those made a move earlier. In general, an attempt has been made to define all parameters so that the individual-side ratings are expressible as small positive integers.

Binary connective terms

In addition to the simple terms of the type just described, a number of combinational terms have been introduced. Without these terms the scoring polynomial would, of course, be linear. A number of different ways of introducing nonlinear terms have been devised but only one of these has been tested in any detail. This scheme provides terms which have some of the properties of binary logical connectives. Four such terms are formed for each pair of simple terms which are to be related. This is done by making an arbitrary division of the range in values for each of the simple terms and assigning the binary values of 0 and 1 to these ranges. Since most of the simple terms are symmetrical about 0, this is easily done on a sign basis. The new terms are then of the form $A \cdot B$, $A \cdot \bar{B}$, $\bar{A} \cdot B$, and $\bar{A} \cdot \bar{B}$, yielding values either of 0 or 1. These terms are introduced into the scoring polynomial with adjustable coefficients and signs, and are thereafter indistinguishable from the other terms.

As it would require some 1404 such combinational terms to interrelate the 27 simple terms originally used (four terms for each of the 351 possible pairs), it was found desirable to limit the actual number of combinational terms used at any one time to a small fraction of these and to introduce new terms only as it became possible to retire older ineffectual terms. The terms actually used are given in Appendix C.

Preliminary learning-by-generalization tests

An idea of the learning ability of this procedure can be gained by analyzing an initial test series of 28 games[19] played with the program just described. At the start an arbitrary selection of 16 terms was chosen and all terms were assigned equal weights. During the first 14 games Alpha was assigned the White side, with Beta constrained as to its first move (two cycles of the seven different initial moves). Thereafter, Alpha was assigned Black and White alternately. During this time a total of 29 different terms was discarded and replaced, the majority of these on two different occasions.

Certain other figures obtained during these 28 games are of interest. At frequent intervals the program lists the 12 leading terms in Alpha's scoring polynomial with their correlation coefficients and a running count of the number of times these coefficients have been altered. Based on these samplings, one observes that at least 20 different terms were assigned the largest coefficient at some time or other, some of these alternating with other terms a number of times, and two even reappearing at the top of the list with their signs reversed. While these variations were more violent at the start of the series of games and decreased as time went on, their presence indicated that the learning procedure was still not completely stable.

During the first seven games there were at least 14 changes in occupancy at the top of the list involving 10 different terms. Alpha won three of these games and lost four. The quality of the play was extremely poor. During the next seven games there were at least eight changes made in the top listing involving five different terms. Alpha lost the first of these games and won the next six. Quality of play improved steadily but the machine still played rather badly. During Games 15 through 21 there were eight changes in the top listing involving five terms, Alpha winning five games and losing two. Some fairly good amateur players who played the machine during this period agreed that it was "tricky but beatable." During Games 22 through 28 there were at least four changes involving three terms. Alpha won two games and lost five.
The program appeared to be approaching a quality of play which caused it to be described as "a better-than-average player." A detailed analysis of these results indicated that the learning procedure did work and that the rate of learning was surprisingly high, but that the learning was quite erratic and none too stable.

Second series of tests

Some of the more obvious reasons for this erratic behavior in the first series of tests have been identified. The program was modified in several respects to improve the situation, and additional tests were made. Four of these modifications are important enough to justify a detailed explanation.

In the first place, the program was frequently fooled by bad play on the part of its opponent. A simple solution was to change the correlation coefficients less drastically when delta was positive than when delta was negative. The procedure finally adopted for the positive delta case was to make corrections to selected terms in the polynomial only. When the scoring polynomial was positive, changes were made to coefficients associated with the negatively contributing terms, and when the polynomial was negative, changes were made to the coefficients associated with positively contributing terms. No changes were made to coefficients associated with terms which happened to be zero. For the negative delta case, changes were made to the coefficients of all contributing terms, just as before.

A second defect seemed to be connected with the too frequent introduction of new terms into the scoring polynomial and the tendency for these new terms to assume dominant positions on the basis of insufficient evidence. This was remedied by the simple expedient of decreasing the rate of introduction of new terms from one every eight moves to one every 32 moves.

The third defect had to do with the complete exclusion from consideration of many of the board positions encountered during play by reason of the minimum limit on delta.
This resulted in the misassignment of credit to those board positions which permitted spectacular moves when the credit rightfully belonged to earlier board positions which had permitted the necessary ground-laying moves. Although no precise way has yet been devised to insure the correct assignment of credit, a very simple expedient was found to be most effective in minimizing the adverse effects of earlier assignments. This expedient was to allow the span of remembered moves, over which delta is computed, to increase until delta exceeded the arbitrary minimum value, and then to apply the corrections to the coefficients as dictated by the terms in the retained polynomial for this earlier board position. In this case, the difficulty which was mentioned in the section on instabilities in connection with an arbitrary increase in span does not occur after each correction, since no changes are made in the coefficients of the scoring polynomial as long as delta is below the minimum value. Of course, whenever delta does exceed the minimum value the program must then recompute the initial scoring polynomial for the then current board position and so restart the procedure with a span of a single remembered move-pair. This over-all procedure rectifies the defect of assigning credit to a board position that lies too far along the move chain, but it introduces the possibility of assigning credit to a board position that is not far enough along.

As a partial expedient to compensate for this newly introduced danger, a change was made in the initial board evaluation. Instead of evaluating the initial board positions directly, as was done before, a standard but rudimentary tree search (terminated after the first nonjump move) was used. Errors due to impending jump situations were eliminated by this procedure, and because of the greater accuracy of the evaluation it was possible to reduce the minimum delta limit by a small amount.
Finally, to avoid the danger of having Beta adopt Alpha's polynomial as a result of a chance win on Alpha's part (or perhaps a situation in which Alpha had allowed its polynomial to degenerate after an early or midgame advantage had been gained), it was decided to require a majority of wins on Alpha's part before Beta would adopt Alpha's scoring polynomial.

With these modifications, a new series of tests was made. In order to reduce the learning time, the initial selection of terms was made on the basis of the results obtained during the earlier tests, but no attention was paid to their previously assigned weights. In contrast with the earlier erratic behavior, the revised program appeared to be extremely stable, perhaps at the expense of a somewhat lower initial learning rate. The way in which the character of the evaluation polynomial altered as learning progressed is shown in Fig. 4.

The most obvious change in behavior was in regard to the relative number of games won by Alpha and the prevalence of draws. During the first 28 games of the earlier series Alpha won 16 and lost 12. The corresponding figures for the first 28 games of the new series were 18 won by Alpha, and four lost, with six draws. In all cases the games were terminated, if not finished, in 70 moves and a judgment made in terms of the final positions. Unfortunately, these figures are not strictly comparable because of the decreased frequency with which Beta adopted Alpha's polynomial during the second series, both by design and because a programming error immobilized the adoption procedure during part of the tests. Nevertheless, the great decrease in the number of losses and the prevalence of draws seemed to indicate that the learning process was much more stable. Some typical games from this second series are given in Appendix B.

As learning proceeds, it should become harder and harder for Alpha to improve its game, and one would expect the number of wins by Alpha to decrease with time. If secondary maxima in scoring space are encountered, one might even find situations in which Alpha wins less than half of the games.
With Beta at such a maximum any minor change in Alpha's polynomial would result in a degradation of its play, and several oscillations about the maximum might occur before Alpha landed at a point which would enable it to beat Beta. Some evidence of this trend is discernible in the play, although many more games will have to be played before it can be observed with certainty.

The tentative conclusions which can be drawn from these tests are:

(1) A simple generalization scheme of the type here used can be an effective learning device for problems amenable to tree-searching procedures.
(2) The memory requirements of such schemes are quite modest and remain fixed with time.
(3) The operating times are also reasonable and remain fixed, independent of the amount of accumulated learning.
(4) Incipient forms of instability in the solution can be expected but, at least for the checker program, these can be dealt with by quite straightforward procedures.
(5) Even with the incomplete and redundant set of parameters which have been used to date, it is possible for the computer to learn to play a better-than-average game of checkers in a relatively short period of time.

As a final precautionary note, it should be stated that these experiments have not encompassed a sufficiently large series of games to demonstrate unambiguously that the learning procedure is completely stable or that it will necessarily lead to the best possible choice of parameters and coefficients.

Rote learning vs. generalization

Some interesting comparisons can be made between the playing style developed by the learning-by-generalization program and that developed by the earlier rote-learning procedure. The program with rote learning soon learned to imitate master play during the opening moves. It was always quite poor during the middle game, but it easily learned how to avoid most of the obvious traps during end-game play and could usually drive on toward a win when left with a piece advantage.
The program with the generalization procedure has never learned to play in a conventional manner and its openings are apt to be weak. On the other hand, it soon learned to play a good middle game, and with a piece advantage it usually polishes off its opponent in short order. Interestingly enough, after 28 games it had still not learned how to win an end game with two kings against one in a double corner.

Apparently, rote learning is of the greatest help, either under conditions when the results of any specific action are long delayed, or in those situations where highly specialized techniques are required. Contrasting with this, the generalization procedure is most helpful in situations in which the available permutations of conditions are large in number and when the consequences of any specific action are not long delayed.

Procedures involving both forms of learning

The next obvious step is to combine the better features of the rote-learning procedure with a generalization scheme. This must be done with some care, since it is not practical to update the previously saved information after every change in the evaluation polynomial. A compromise solution might be to save only a very limited amount of information during the early stages of learning and to increase the amount as warranted by the increasing stability of the evaluation coefficient with learning. For example, the program could be arranged to save only the piece-advantage term at the start. At some stage in the learning process the next term could be added, perhaps when no change had been made in the parameter used for this term during some fairly long period, say for three complete games. If and when the program is able to play an additional period without changes in the next parameter, this could also be added, et cetera.
Whenever a change does occur in a parameter previously assumed to be stable the entire memory tape could be reviewed, all terms involving the changed parameter and those lower on the list could be expunged, and the program could drop back to the earlier condition with respect to its term-saving schedule.


Another solution would be to utilize the generalization scheme alone until it had become fairly stable and to introduce rote learning at this time. It is, of course, perfectly feasible to salvage much of the learning which has been accumulated by both of the programs studied to date. This could be done by appending an abridged form of the present memory tape to the generalization scheme in its present stage of learning and by proceeding from there in accordance with the first solution proposed above.

Future development

While it is believed that these tests have reached the stage of diminishing returns, some effort might well be expended in an attempt to get the program to generate its own parameters for the evaluation polynomial. Lacking a perfectly general procedure, it might still be possible to generate terms based on theories as proposed by students of the game. This procedure would be at variance with the writer's previous philosophy, but it is highly likely that similar compromises will have to be made when one attempts to apply learning procedures to problems of economic importance.

Conclusions

As a result of these experiments one can say with some certainty that it is now possible to devise learning schemes which will greatly outperform an average person and that such learning schemes may eventually be economically feasible as applied to real-life problems.

Acknowledgments

Many different people have contributed to these studies through stimulating discussions of the basic problems. From time to time the writer was assisted by several different programmers, although most of the detailed work was his own. The forbearance of the machine room operators and their willingness to play the machine at all hours of the day and night are also greatly appreciated.

Footnotes and References

1. Some of these are quite profound and have a bearing on the questions raised by Nelson Goodman in Fact, Fiction and Forecast, Harvard University Press.

2. Warren S. McCulloch ("The Brain as a Computing Machine," Elec. Eng. 69, 492, 1949) has compared the digital computer to the nervous system of a flatworm. To extend this comparison to the situation under discussion would be unfair to the worm since its nervous system is actually quite highly organized as compared with the random-net studies by B. G. Farley and W. A. Clark ("Simulation of Self-Organizing Systems by Digital Computers," IRE PGIT 4, 76, Sept. 1954), N. Rochester, J. H. Holland, L. H. Haibt and W. L. Duda ("Tests on a Cell Assembly Theory of the Action of the Brain Using a Large Digital Computer," IRE Transactions on Information Theory IT-2, No. 3, Sept. 1956), and by F. Rosenblatt ("The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain," Psych. Rev., November 1958).

3. The first operating checker program for the IBM 701 was written [...]. This was recoded for the IBM 704 in 1954. The first program with learning was completed in 1955 and demonstrated on television on February [...].

4. C. E. Shannon, "Programming a Computer for Playing Chess," Phil. Mag. 41, 256 (March 1950).

5. A. Bernstein and M. de V. Roberts, "Computer vs. Chess Player," Scientific American 198, 6 (June 1958).

6. J. Kister, P. Stein, S. Ulam, W. Walden, M. Wells, "Experiments in Chess," Journal of the ACM 4, 174 (April [...]).

7. A. Newell, J. C. Shaw and H. A. Simon, "Chess-Playing Programs and the Problem of Complexity," IBM J. of Res. & Dev. (October 1958).

8. Shannon, loc. cit.

9. C. S. Strachey, "Logical or Non-Mathematical Programmes," Proc. of ACM Meeting at Toronto, Ontario, Sept. 8-10, [...].

10. One of the more interesting of these was to express a board position in terms of the first and higher moments of the white and black pieces separately about two orthogonal axes on the board. Two such sets of axes were tried, one set being parallel to the sides of the board and the second set being those through the diagonals.

11. This apt phraseology was suggested by John McCarthy.

12. Not the capture of all of the opponent's pieces, as popularly assumed, although nearly all games end in this fashion.

13. The use of a weight ratio rather than this, conforming more closely to the values assumed by many players, can lead into certain logical complications, as found by Strachey, loc. cit.

14. The only departure from complete generality of the game as programmed is that the program requires the opponent to make a permissible move, including the taking of a capture if one is offered. "Huffing" is not permitted.

15. B. V. Bowden, Faster Than Thought, Chapter 25, Pitman.

16. This coefficient is defined as C = (L - H)/(L + H), where L is the total number of different legal moves which the machine judged to be poorer than the indicated book moves, and H is the total number which it judged to be better than the book moves.

17. This playing-time requirement, while large in terms of cost, would be less than the time which the checker master probably spends to acquire his proficiency.

18. There is a logical fallacy in this argument. The program might save only invariant terms which have nothing to do with goodness of play; for example, it might count the squares on the checkerboard. The forced inclusion of the piece-advantage term prevents this.

19. Each game averaged 68 moves (34 to a side), of which approximately 20 caused changes to be made in the scoring polynomial.

Appendix A: Programming details

Approximate size of program

Basic checker-playing routine                       100 instructions
Input, move verification and output                1400 instructions
Game starting and terminating routines              600 instructions
Loaders, table generators, dumping, et cetera       850 instructions
Statistical and analytical routines                 700 instructions
Rote-learning routines                             1500 instructions
Generalization-learning routines                    650 instructions
Tables and constants for basic play                 700 words
Working space for basic play                       2000 words
Working space for generalization learning           500 words
Working space for rote learning                     balance of memory

Approximate computation times

To find all available moves from given board position                    2.6 milliseconds
To make a single move and find resulting board position                  1.5 milliseconds
To evaluate a board position (4 terms)                                    2.4 milliseconds
To find score for a saved board position (rote learning)                 2.3 milliseconds
To evaluate position (with 16 terms for generalization learning)         7.5 milliseconds

Board representations

The standard checkerboard numbering system (see Appendix B) is used in communicating with the machine. A modified numbering system is used for internal computations, the numbers shown on the squares in Fig. A-1 corresponding to the bit positions in an IBM 704 word. Any given board position is represented by four such words; one word (FA) containing 1's in those bit positions corresponding to squares containing pieces of the color whose turn it is to move and which normally move in a forward direction. To be specific, if it is Black's turn to move (i.e., Black is "active") FA designates the location of all of Black's pieces, both men and kings. Conversely, if White is active, FA designates the location of White's kings only, since White's men can only move in the direction arbitrarily called backward. The other words designate, respectively: BA, backward active pieces; FP, forward passive pieces; and BP, backward passive pieces.
To conserve space when writing on tape, three words are used to record board positions with kings, and only two words are used for board positions without kings. These are saved in a standardized form, as explained in the text.

Possible moves are designated by five words; one word to indicate by its sign (with the word itself containing other information) whether the moves are jumps or not. (If a jump is available, only jump moves are saved.) The other four words designate the location of those pieces which can move in the four different diagonal directions: RF, for right forward; LF, for left forward; LB, for left backward; and RB, for right backward, respectively. By reference to Fig. A-1, it will be observed that a right-forward move results in an increase of 4 in the square designation, while a left-forward move results in an increase of 5. Bit positions 9, 18 and 27 do not appear on the board.

This notation makes it possible to compute available moves for all pieces simultaneously. Having previously computed a word called EMPTY, which contains 1's in locations corresponding to all unoccupied squares, one can compute RF, for the normal move case, in four instructions, as listed below (in IBM 704 symbolic language):

    CLA  EMPTY    (puts word EMPTY into the accumulator);
    ALS  4        (shifts word to left by 4 positions);
    ANA  FA       (forms logical AND between EMPTY and FA);
    STO  RF       (stores word as newly computed RF).

Jump moves are computed by a simple extension of this procedure. Multiple jumps are handled as a sequence of single jumps separated by null-reply moves.
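The four-instruction computation of RF can be mirrored in Python to show why the unused bit positions matter. This is a sketch under stated assumptions, not a transcription of the 704 code: square designation p is stored in bit p of a Python integer (counted from the least significant end), so the 704's left shift, which lines up the contents of position p+4 with position p, becomes a right shift here. The helper names are hypothetical, and the left-forward case is written by analogy with the right-forward one using the increase of 5 given in the text.

```python
# Internal square designations run from 1 to 35, skipping 9, 18 and 27.
OFF_BOARD = {9, 18, 27}
SQUARES = [p for p in range(1, 36) if p not in OFF_BOARD]  # 32 squares


def word(squares):
    """Pack an iterable of square designations into one integer bit mask."""
    mask = 0
    for p in squares:
        mask |= 1 << p
    return mask


def squares_of(mask):
    """List the square designations whose bits are set in mask."""
    return [p for p in range(40) if (mask >> p) & 1]


def right_forward_movers(empty, fa):
    """Pieces in FA whose right-forward square (designation + 4) is empty."""
    return (empty >> 4) & fa


def left_forward_movers(empty, fa):
    """Pieces in FA whose left-forward square (designation + 5) is empty."""
    return (empty >> 5) & fa


fa = word([1, 5, 8])                                   # three forward-moving pieces
empty = word(p for p in SQUARES if p not in (1, 5, 8))  # every other square is free
print(squares_of(right_forward_movers(empty, fa)))
print(squares_of(left_forward_movers(empty, fa)))
```

Because bits 9, 18 and 27 are never set in EMPTY, a piece on square 5 is not reported as able to move right-forward (its target would be 9), so moves off the edge of the board are excluded without any extra test. The same holds for targets above 35.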

Figure A-1 Checkerboard notation for internal computations.

Additional time-saving expedients

Bit counting is done by a table-lookup procedure in a closed subroutine of 16 executed instructions (408 microseconds). This requires a 156-word table which is generated at the start by a 13-word program. Similar table-lookup procedures are used to turn a word end-for-end and to locate the 1's in a word for move reporting.

Multiplications are usually avoided. In several places where multiplication by small integers must be done, it is programmed in terms of shifts and logical operations.

During the look-ahead procedure a complete record is kept of the sequence of board positions currently under investigation. As a result, no computing is needed to retract moves.

Appendix B: Sample games from the second series with generalization learning

Typical openings

The first eight moves of selected games in which Alpha played Black against Beta, showing the way in which different types of play were tried.

[Table of opening moves for games G-4, G-6, G-11, G-17, G-19, G-21, G-31, G-37, G-39 and G-41, among others; the move listings are not recoverable.]

Typical games

Sample games in which Alpha played White against forced Beta openings.

[Table of complete game records; the move listings are not recoverable. Surviving end-of-game labels read "Terminated Manually", "Beta Concedes" and "Move Termination".]

Figure B-1. Square designations used in reporting games.

Appendix C: Evaluation polynomial details for second series

Method of computing terms

The 16 terms called for in the evaluation polynomial are computed, individually, by taking the value of the appropriate parameter, as defined below, for the board position under consideration and subtracting the value of this same parameter computed for the board position just prior to the last move (with the necessary reversal in the definitions of active and passive sides). This difference is then multiplied by the corresponding program-computed coefficient, which can vary between -2^18 and +2^18, and credited to the side which was passive on the board position under consideration.
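The term computation described above can be sketched in Python as follows. This is an illustration under assumptions, not the original routine: each coefficient is represented as a sign and a power of 2 (following the column headings of the coefficient table later in this appendix), the caller is assumed to have already applied the active/passive reversal when computing the earlier parameter value, and the function names and sample numbers are hypothetical.

```python
def term_value(param_now, param_before, sign, power):
    """One polynomial term: the change in a parameter times a signed power of 2."""
    return sign * (param_now - param_before) * (1 << power)


def evaluate(terms):
    """Sum the contributions of all terms.

    `terms` is an iterable of (param_now, param_before, sign, power) tuples,
    one tuple per term in the evaluation polynomial.
    """
    return sum(term_value(*t) for t in terms)


# Hypothetical values for three terms.
sample_terms = [
    (3, 1, +1, 4),   # parameter rose by 2, coefficient +2^4
    (2, 2, -1, 7),   # parameter unchanged, so the term contributes nothing
    (0, 1, -1, 2),   # parameter fell by 1, coefficient -2^2
]
score = evaluate(sample_terms)
```

Using differences rather than raw parameter values means a term contributes only when the last move actually changed the feature it measures.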

Definitions of parameters

ADV (Advancement): The parameter is credited with 1 for each passive man in the 5th and 6th rows (counting in passive's direction) and debited with 1 for each passive man in the 3rd and 4th rows.

APEX (Apex): The parameter is debited with 1 if there are no kings on the board, if either square 7 or 26 is occupied by an active man, and if neither of these squares is occupied by a passive man.

BACK (Back Row Bridge): The parameter is credited with 1 if there are no active kings on the board and if the two bridge squares (1 and 3, or 30 and 32) in the back row are occupied by passive pieces.

CENT (Center Control I): The parameter is credited with 1 for each of the following squares: 11, 12, 15, 16, 20, 21, 24 and 25 which is occupied by a passive man.

CNTR (Center Control II): The parameter is credited with 1 for each of the following squares: 11, 12, 15, 16, 20, 21, 24 and 25 that is either currently occupied by an active piece or to which an active piece can move.

CORN (Double-Corner Credit): The parameter is credited with 1 if the material credit value for the active side is 6 or less, if the passive side is ahead in material credit, and if the active side can move into one of the double-corner squares.

CRAMP (Cramp): The parameter is credited with 2 if the passive side occupies the cramping square (13 for Black, and 20 for White) and at least one other nearby square (9 or 14 for Black, and 19 or 20 for White), while certain squares (17, 21, 22 and 25 for Black, and 8, 11, 12 and 16 for White) are all occupied by the active side.

DENY (Denial of Occupancy): The parameter is credited with 1 for each square defined in MOB if on the next move a piece occupying this square could be captured without an exchange.

DIA (Double Diagonal File): The parameter is credited with 1 for each passive piece located in the diagonal files terminating in the double-corner squares.

DIAV (Diagonal Moment Value): The parameter is credited with 1/2 for each passive piece located on squares 2 removed from the double-corner diagonal files, with 1 for each passive piece located on squares 1 removed from the double-corner files and with 3/2 for each passive piece in the double-corner files.

DYKE (Dyke): The parameter is credited with 1 for each string of passive pieces that occupy three adjacent diagonal squares.

EXCH (Exchange): The parameter is credited with 1 for each square to which the active side may advance a piece and, in so doing, force an exchange.

EXPOS (Exposure): The parameter is credited with 1 for each passive piece that is flanked along one or the other diagonal by two empty squares.

FORK (Threat of Fork): The parameter is credited with 1 for each situation in which passive pieces occupy two adjacent squares in one row and in which there are three empty squares so disposed that the active side could, by occupying one of them, threaten a sure capture of one or the other of the two pieces.

GAP (Gap): The parameter is credited with 1 for each single empty square that separates two passive pieces along a diagonal, or that separates a passive piece from the edge of the board.

GUARD (Back Row Control): The parameter is credited with 1 if there are no active kings and if either the Bridge or the Triangle of Oreo is occupied by passive pieces.

HOLE (Hole): The parameter is credited with 1 for each empty square that is surrounded by three or more passive pieces.

KCENT (King Center Control): The parameter is credited with 1 for each of the following squares: 11, 12, 15, 16, 20, 21, 24 and 25 which is occupied by a passive king.

MOB (Total Mobility): The parameter is credited with 1 for each square to which the active side could move one or more pieces in the normal fashion, disregarding the fact that jump moves may or may not be available.

MOBIL (Undenied Mobility): The parameter is credited with the difference between MOB and DENY.

MOVE (Move): The parameter is credited with 1 if pieces are even with a total piece count (2 for men, and 3 for kings) of less than 24, and if an odd number of pieces are in the move system, defined as those vertical files starting with squares 1, 2, 3 and 4.

NODE (Node): The parameter is credited with 1 for each passive piece that is surrounded by at least three empty squares.

OREO (Triangle of Oreo): The parameter is credited with 1 if there are no passive kings and if the Triangle of Oreo (squares 2, 3 and 7 for Black, and squares 26, 30 and 31 for White) is occupied by passive pieces.

POLE (Pole): The parameter is credited with 1 for each passive man that is completely surrounded by empty squares.

RECAP (Recapture): This parameter is identical with Exchange, as defined above. (It was introduced to test the effects produced by the random times at which parameters are introduced and deleted from the evaluation polynomial.)

THRET (Threat): The parameter is credited with 1 for each square to which an active piece may be moved and in so doing threaten the capture of a passive piece on a subsequent move.

Binary connective terms

The abbreviations used for the terms of this type which have been employed are listed below, in the order of A∘B, Ā∘B, A∘B̄ and Ā∘B̄, where A and B are the two respective parameters heading the sublists of abbreviations.

Denial of Occupancy-Total Mobility: DEMO, DEMMO, DDEMO, DDMM

Undenied Mobility-Denial of Occupancy: MODE 1, MODE 2, MODE 3, MODE 4

Undenied Mobility-Center Control: MOC 1, MOC 3, MOC 2, MOC 4

Evaluation polynomial (first 12 terms only) after 42 games, during which a total of [...] different sets of adjustments were made to the terms and their coefficients.

[Table with columns Term, Correlation Coefficient, Sign of Coefficient, Power of 2 Used as Coefficient, and Times Adjusted. The numeric entries are not recoverable; legible terms include MOC 2, KCENT, MOC 4, MOVE, ADV, MODE 2, BACK, CNTR and THRET.]

[Table with columns Term and Times Adjusted Before Discard. The numeric entries are not recoverable; legible discarded terms include CORN, CRAMP, GUARD, EXPOS, DYKE, MOC 1, EXCH, CENT, MODE 4, FORK, MOBIL, POLE, HOLE, GAP and MOB.]

Note added in proof: An additional 211 games have recently been played. Although some significant changes were noted, the general stabilization of the learning process, as indicated by Figure 4, has been confirmed. During this play, 412 more adjustments were made to the terms and their coefficients and 12 additions were made to the list of discarded terms.

Received March 3.
