contributed articles

Example-Based Learning in Computer-Aided STEM Education

DOI:10.1145/2634273

Example-based reasoning techniques developed for programming languages also help automate repetitive tasks in education.

BY SUMIT GULWANI

Image by Sergey Nivens.

COMMUNICATIONS OF THE ACM | AUGUST 2014 | VOL. 57 | NO. 8

key insights

Computing technologies can automate repetitive tasks in education, including problem generation, solution generation, and feedback generation, for numerous subject domains, including programming, logic, automata theory, arithmetic, algebra, and geometry.

This can make standard and online classrooms more efficient and enable new pedagogies involving personalized workflows, saved teacher time, and improved student learning.

Computer-aided education requires cross-disciplinary computing technologies; highlighted here are contributions from programming languages, human-computer interaction, and artificial intelligence; natural language understanding and machine learning also play a significant role.

HUMAN LEARNING AND communication is often structured around examples, possibly a student trying to understand or master a certain concept through examples or a teacher trying to understand a student's misconceptions or provide feedback through example behaviors. Example-based reasoning is also used in computer-aided programming to analyze programs, including to find bugs through test-input-generation techniques4,34 and prove correctness through inductive reasoning or random examples15 and synthesize programs through input/output examples or demonstrations.10,16,18,22 This article explores how such example-based reasoning techniques developed in the programming-languages community can also help automate certain repetitive and structured tasks in education, including problem generation, solution generation, and feedback generation. These connections are illustrated through recent work (in computer science) applied to a variety of STEM subject domains, including logic,1 automata theory,3 programming,27 arithmetic,5,6 algebra,26 and geometry.17 More significantly, the article identifies some general principles and methodologies that are applicable across multiple subject domains.

Procedural vs. conceptual problems. Procedural problems involve solutions that require following a specific procedure students are expected to memorize and apply; examples include mathematical procedures5 taught in middle school or high school (such as addition, long division, greatest common divisor computation, Gaussian elimination, and basis transformations) and algorithmic procedures taught in undergraduate computer science, where students are expected to demonstrate their understanding of certain classic algorithms on specific inputs (such as breadth-first search, insertion sort, Dijkstra's shortest-path algorithm, regular expression to automaton conversion, and even computing tensor/inner product of qubits).

Conceptual problems include all nonprocedural ones for which there is no decision procedure (that the student is expected to know and apply) but which require creative thinking in the form of pattern matching or educated guesses. Problems include:

Proof problems. Natural deduction proofs,1 proofs of algebraic theorems,26 and proofs of non-regularity of languages; and

Construction problems. Construction of computational artifacts (such as geometric constructions,17 automata,3 algorithmic procedures,27 and bitvector circuits).

Example-based learning. Examples have multifaceted use in educational technologies. This article classifies their use according to interaction with the underlying technology (see Figure 1).

Input. For several educational tasks, examples constitute a natural means to express intent. In the case of solution generation for procedural problems, teachers can demonstrate example traces with the goal of synthesizing procedures for the problems. In the case of problem generation for conceptual problems, teachers can provide an example problem with the goal of generating similar problems. In the case of feedback generation for procedural problems, teachers can provide examples of buggy traces with the goal of learning the algorithmic misconceptions a student might have. In the case of feedback generation for conceptual problems, teachers can provide examples of common local error corrections, aiming to find some appropriate combination of the corrections that corrects a given incorrect attempt. For such cases, this article describes techniques inspired by research in programming by example (PBE).10,16,18,22

Output. For some educational tasks, examples constitute the intended output artifact. In the case of problem generation for procedural problems, teachers want to produce example inputs that exercise various paths in the given procedure to generate a progression of problems.
In the case of feedback generation for conceptual problems, teachers want to produce counterexamples that expose incorrect behavior in the student's solution. For such cases, the article describes techniques inspired by program analysis, in particular by test-input-generation techniques4,34 often used to find bugs.

Inside. Examples can also be used
inside the underlying algorithms to perform inductive reasoning, which happens in both solution generation and problem generation for conceptual problems. It is inspired by how humans often approach problem generation and solving, with the underlying techniques inspired by research in establishing program correctness using random examples15 and program synthesis using examples.16

The article next explores example-based learning technologies through specific instances, highlighting general principles. It is organized around the three key tasks in intelligent tutoring33 (solution generation, problem generation, and feedback generation) through multiple instances of example-based learning technologies for each task. Also described are several evaluations associated with each of these instances. While several of the instances are preliminary, some have been deployed and evaluated more thoroughly.

Figure 1. Three ways examples are used in computer-aided educational technologies: as input (for intent expression); as output (to generate the intended artifact); and inside the underlying algorithm (for inductive reasoning).

                       Procedural    Conceptual
  Solution Generation  Input6        Inside1,17
  Problem Generation   Output5       Input,1,26 Inside1,26
  Feedback Generation  Input6        Output,3 Input27

Figure 2. Solution generation for procedural problems:6 (a) demonstrate greatest common factor (GCF) procedure over inputs 762 and 1270 to produce output 254; and (b) synthesize procedure GCF automatically from the demonstration in (a).

(a) Demonstration (long-division layout): 1270 ÷ 762 = 1, 762 × 1 = 762, remainder 508; 762 ÷ 508 = 1, 508 × 1 = 508, remainder 254; 508 ÷ 254 = 2, 254 × 2 = 508, remainder 0; output 254.

(b) Synthesized procedure:

  GCF(int array array T, int I1, int I2)
    Assume T[0,0], T[1,0] contain I1, I2 respectively.
    for (j := 0; T[2j, j] ≠ 0; j := j + 1):
      T[2j, j+2] := Floor(T[2j, j+1] ÷ T[2j, j]);
      T[2j+1, j+1] := T[2j, j+2] × T[2j, j];
      T[2j+1, j+1] := T[2j, j+1] − T[2j+1, j+1];
      T[2j+2, j+2] := T[2j, j];
    return T[2j, j+1];

Figure 3. Solution generation for geometry constructions.17

(a) English Description: Construct a triangle given its base L (with end-points p1, p2), a base angle a, and sum r of the other two sides.

(b) PreCondition: r > Length(p1, p2). PostCondition: Angle(p, p1, p2) = a ∧ Length(p, p1) + Length(p, p2) = r.

(c) Random Example: L = Line(p1 = ⟨81.62, 99.62⟩, p2 = ⟨99.62, 83.62⟩), r = 88.07, a = 0.81 radians, p = ⟨131.72, 103.59⟩.

(d) Geometry Program:

  ConstructTriangle(p1, p2, L, r, a):
    L1 := ConstructLineGivenAngleLinePoint(L, a, p1);
    C1 := ConstructCircleGivenPointLength(p1, r);
    (p3, p5) := LineCircleIntersection(L1, C1);
    L2 := PerpendicularBisector2Points(p2, p3);
    p := LineLineIntersection(L1, L2);
    return p;

(e) Geometry Construction: diagram showing p1, p2, p3, p, the lines L1 and L2, the circle C1, and the length r.

Solution Generation

Solution generation involves automatically generating solutions, given a problem description in some subject domain, and is important for several reasons: It can be used to generate sample solutions for automatically generated problems; given a student's incomplete solution, it can be used to complete a solution that can be much more illustrative for the student compared to providing a completely different sample solution; and, given a student's incomplete solution, it can also be used to generate hints on the next step or toward an intermediate goal.

Procedural problems. Solution generation for procedural problems can be achieved by writing down and executing the corresponding procedure for a given problem. While these procedures can be written manually, technologies for automatic procedure synthesis (from examples) can enable nonprogrammers to create customized procedures on the fly. The number of such procedures and their stylistic variations in how they are taught can be significant and may not be known in advance to outsource manual creation of the procedures. The procedures can be synthesized through PBE technology10,16,22 traditionally applied to end-user applications.
More recently, PBE has also been used to synthesize programs for spreadsheet tasks, including string transformations and table layout transformations.18 Mathematical procedures can be viewed as spreadsheet procedures involving computation of new values from existing values in spreadsheet cells, as in string transformations that produce a new output string from substrings of input strings, and positioning that value in an appropriate spreadsheet cell, as in table transformations that reposition the content of an input spreadsheet table. Ideas from learning string and
table transformations can be combined to learn mathematical procedures from example traces, where a trace is a sequence of (value, cell) pairs.6 Dynamic programming can be used to compute all subprograms that are consistent with various subtraces (in order of increasing length). The underlying algorithm starts out by computing, for each trace element (v, c), the set of all program statements (over a teacher-specified set of operators) that can produce v from previous values in the trace; see Figure 2 for synthesis of a greatest common divisor procedure from an example trace, where the teacher-specified operators include arithmetic operators and Floor.

Conceptual problems. Solution generation for conceptual problems often requires performing search over the underlying solution space. Following are two complementary principles, each useful across multiple subject domains while also reflecting how humans might search for such solutions.

S1: Perform reasoning over examples as opposed to abstract symbolic reasoning. The idea is to reason about the behavior of a solution on some or even all examples, or concrete inputs, instead of performing symbolic reasoning over an abstract input. Such reasoning reduces search time by large constant factors because executing part of a construction or proof on concrete inputs is much quicker than reasoning symbolically about the construction or proof.

S2: Reduce solution space to solutions with small length. The idea is to extend the solution space with commonly used macro constructs in which each such construct is a composition of several basic constructs/steps. This extension reduces the size of solutions, making search more feasible in practice.

The following illustrates these principles in multiple subject domains:

Geometry constructions. Geometry construction is a method for constructing a desired geometric object from other objects by applying a sequence of ruler and compass constructions (see Figure 3). Such constructions are an important part of high school geometry.
The automated geometric-theorem-proving community (one of the success stories in automated reasoning) has developed tools (such as Geometry Explorer32 and Geometry Expert14) that allow students to create geometry constructions and use interactive provers to prove properties of the constructions. How are these constructions synthesized in the first place?

Geometry constructions can be regarded as straight-line programs that manipulate geometry objects (points, lines, and circles) using ruler/compass operators. Hence, their synthesis can be phrased as a program-synthesis problem17 in which the goal is to synthesize a straight-line program, as in Figure 3d, that realizes the relational specification between inputs and outputs, as in Figure 3b. The semantics of geometry operations is too complicated for symbolic methods for synthesis or even for verification. Ruler/compass operators are analytic functions, implying the validity of a geometry construction can be probabilistically inferred from testing on random examples, an implication that follows from the following extension of the classical result on polynomial identity testing25 to analytic functions:

Property 1 (probabilistic testing of analytic functions). Let $f(X)$ and $g(X)$ be non-identical real-valued analytic functions over $\mathbb{R}^n$. Let $Y \in \mathbb{R}^n$ be selected uniformly at random; then with high probability over the random selection, $f(Y) \neq g(Y)$.

Property 1 follows from the fact that non-zero analytic functions have isolated zeroes; that is, for every zero point of an analytic function, there exists a neighborhood in which the function is non-zero. The number of non-zero points of the non-zero analytic function $f(X) - g(X)$ thus dominates the number of its zero points.
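The random-testing idea behind Property 1 can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it encodes the triangle construction of Figure 3d using complex numbers as points, then checks the postcondition of Figure 3b on randomly drawn inputs. The helper names, the counter-clockwise orientation of the angle, the tolerance, and the deliberately wrong `buggy_construction` are all assumptions made for this sketch.

```python
import cmath
import math
import random


def dot(z, w):
    """Dot product of two 2-D points stored as complex numbers."""
    return (z * w.conjugate()).real


def construct_triangle(p1, p2, r, a):
    """Straight-line construction in the spirit of Figure 3d; returns the apex p."""
    u = (p2 - p1) / abs(p2 - p1)      # unit vector along the base L
    d = u * cmath.rect(1.0, a)        # direction of L1: base rotated by angle a at p1
    p3 = p1 + r * d                   # L1 meets circle C1 (center p1, radius r)
    m, n = (p2 + p3) / 2, p3 - p2     # L2: perpendicular bisector of segment p2-p3
    t = dot(m - p1, n) / dot(d, n)    # parameter of the point where L1 meets L2
    return p1 + t * d


def buggy_construction(p1, p2, r, a):
    """Hypothetical wrong variant: returns the midpoint of p1 and p3."""
    u = (p2 - p1) / abs(p2 - p1)
    return p1 + (r / 2) * u * cmath.rect(1.0, a)


def satisfies_spec(p, p1, p2, r, a, tol=1e-7):
    """Postcondition: Angle(p, p1, p2) = a and Length(p, p1) + Length(p, p2) = r."""
    angle = abs(cmath.phase((p - p1) / (p2 - p1)))
    side_sum = abs(p - p1) + abs(p - p2)
    return (math.isclose(angle, a, abs_tol=tol)
            and math.isclose(side_sum, r, rel_tol=tol))


def probably_valid(construction, trials=20, seed=0):
    """Accept a construction only if it meets the spec on every random example."""
    rng = random.Random(seed)
    for _ in range(trials):
        p1 = complex(rng.uniform(0, 100), rng.uniform(0, 100))
        p2 = complex(rng.uniform(0, 100), rng.uniform(0, 100))
        a = rng.uniform(0.1, 3.0)
        r = abs(p2 - p1) * rng.uniform(1.1, 3.0)  # precondition: r > Length(p1, p2)
        if not satisfies_spec(construction(p1, p2, r, a), p1, p2, r, a):
            return False
    return True


if __name__ == "__main__":
    print(probably_valid(construct_triangle))   # correct construction
    print(probably_valid(buggy_construction))   # wrong construction
```

A floating-point tolerance stands in for exact equality here, so the check is an approximation of Property 1 rather than a proof of validity.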
The problem of synthesizing geometry constructions that satisfy a symbolic relational specification between inputs and outputs can thus be reduced to synthesizing constructions that are consistent with randomly chosen input-output examples (Principle S1). This reduction is the basis of Gulwani et al.'s17 synthesis algorithm for geometry constructions involving two key steps (see also Figure 3) reflecting the two general principles discussed earlier:

Generate random input-output examples, as in Figure 3c, from the logical description, as in Figure 3b, of the given problem using off-the-shelf numerical solvers; the logical description is in turn generated from the natural language description, as in Figure 3a, using natural language translation technology; and

Perform brute-force search over a library of ruler-and-compass operators to find a construction, as in Figure 3d, that transforms the randomly selected input(s) into corresponding output(s). The search is performed over an extended library of ruler and compass operators that includes higher-level primitives, such as perpendicular and angular bisectors (Principle S2). The use of an extended library not only shortens the size of a solution (allowing for efficient search) but also makes a solution more readable for students. On Gulwani et al.'s17 benchmark of 25 problems, the extended library helped reduce the maximum solution size from 45 steps to 13 steps and increased the success rate from 75% to 100%.

Note on Property 1: Unlike the polynomial identity testing theorem,25 which allows performing modular arithmetic over numbers selected randomly from a finite integer set for efficient evaluation, this result provides no constructive guidance on the size of the selection set and requires precise arithmetic. This process is approximated by using finite-precision floating-point arithmetic and a threshold for comparing equality; in practical experiments, it has yielded no unsoundness or incompleteness.

Natural deduction proofs. Natural deduction (taught in introductory logic courses in college) is a method for establishing the validity of arguments in propositional logic, where the conclusion of an argument is derived from the premises through a series of discrete steps. Each one derives a proposition that is either a premise or derived from preceding propositions through application of some inference rule (see Figure 4a) or replacement rule (see Figure 4b), the last of which concludes the argument; see Figure 4d for a proof. Ditmarsch29 surveyed proof assistants for teaching natural deduction (such as Pandora9), some of which also solve problems. This article describes a different, scalable way to solve such problems while also paving the way for generating fresh problems, as described in the next section.

While the SAT (Boolean satisfiability), SMT (satisfiability modulo theories), and theorem-proving communities8 continue to focus on solving large-size proof problems in a reasonable amount of time, one recent approach, by Ahmed et al.,1 to generating natural deduction proofs in real time leverages the observation that classroom-size instances are small. The Ahmed et al. approach reflects use of the two general principles discussed earlier: abstract a proposition using its truth table, which can be represented using a bitvector representation,20 thus avoiding expensive symbolic reasoning and reducing application of inference rules to simple bitvector operations (Principle S1); and break the proof search into multiple smaller (and hence more efficient) proof searches (Principle S2).

First, an abstract proof is discovered that involves only inference-rule applications over the truth-table representation; note replacement rules are identity operations over the truth-table representation. This abstract proof over the truth-table representation is then refined to a complete proof over symbolic propositions by searching for sequences of replacement rules between consecutive inference rules; see Figure 4c for an example of an abstract proof and Figure 4d for its refinement to a complete proof. Note the size of an abstract proof and the number of replacement rules inserted between any two consecutive inference rules are much smaller than the size of the overall proof. The Ahmed et al. approach solved 84% of 279 problems from various textbooks (generating proofs of ≤ 27 steps), while a baseline algorithm (using symbolic representation for propositions and performing breadth-first search for the complete proof) solved 57% of the same problems.1

Figure 4. Solution generation for natural deduction:1 (a) sample inference rules; (b) sample replacement rules; (c) abstract proof of the problem in Figure 7b, with second column listing the 32-bit integer representation of the truth-table over five variables; (d) natural deduction proof of the problem in Figure 7b, with inference rule applications in bold; and (e) natural deduction proof of a problem similar to the one in Figure 7b with the same inference rule steps.

(a)
  Rule Name               Premises        Conclusion
  Modus Ponens (MP)       p → q, p        q
  Hypo. Syllogism (HS)    p → q, q → r    p → r
  Disj. Syllogism (DS)    p ∨ q, ¬p       q
  Simplification (Simp)   p ∧ q           q

(b)
  Rule Name         Proposition     Equivalent Proposition
  Distribution      p ∨ (q ∧ r)     (p ∨ q) ∧ (p ∨ r)
  Double Negation   p               ¬¬p
  Implication       p → q           ¬p ∨ q
  Equivalence       p ≡ q           (p → q) ∧ (q → p)
                    p ≡ q           (p ∧ q) ∨ (¬p ∧ ¬q)

(c)
  Step   Truth-table   Reason
  P1     1048575       Premise
  P2     4294914867    Premise
  P3     3722304989    Premise
  1      16777215      P1, Simp
  2      4294923605    P2, P3, HS
  3      1442797055    1, 2, HS

(d) Reasons by step (proposition column not legible in this copy): P1-P3, Premise; 1: P1, Distr.; 2: 1, Simp.; 3: P2, P3, HS; 4: 2, Comm.; 5: 4, Double Neg.; 6: 5, Implication; 7: 6, 3, HS; 8: 7, Implication; Conc: 8, Double Neg.

(e) Reasons by step (proposition column not legible in this copy): P1-P3, Premise; 1: P1, Equivalence; 2: 1, Simp.; 3: P3, P2, HS; 4: 3, Transposition; 5: 4, Double Neg.; 6: 2, 5, HS; 7: 6, Implication; 8: 7, De Morgan's; Conc: 8, Double Neg.

Problem Generation

Generating fresh problems with specific solution characteristics (such as a certain difficulty level and set of concepts) is tedious for the teacher. Automating the generation of fresh problems has several benefits:

Generating problems similar to a given problem can help avoid copyright issues. It may not be legal to publish problems from textbooks on course websites. A problem-generation tool can give instructors a fresh source of problems for their assignments or lecture notes. It can also help prevent cheating23 in classrooms or MOOCs (with unsynchronized instruction) since each student can be given a different problem with the same difficulty level. And when a student fails to solve a problem and ends up looking at the sample solution, the student may be assigned a similar practice problem by an automated system, not necessarily by a human teacher.

Generating problems with a given difficulty level and exercising a given set of concepts can help create personalized workflows for students. Students who solve a problem correctly may be given a problem more difficult than the last problem or one that involves a richer set of concepts.

On the other hand, fresh problems create new pedagogical challenges since teachers may no longer recognize the problems and students may be unable to discuss them with one another after assignment submission. These challenges can be mitigated through solution-generation and feedback-generation capabilities.

Procedural problems. A procedural problem can be characterized by the trace it generates through the corresponding procedure. Various features of the trace can then be used to identify the difficulty level of a procedural problem and the concepts it exercises; for instance, a trace that executes both sides of a branch (in multiple iterations through a loop) might exercise more concepts than one that executes only one side of that branch, and a trace that executes more iterations of a loop might be more difficult than one that executes fewer iterations.
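To make the notion of a trace concrete, the sketch below instruments a grade-school addition procedure so that it records one letter per control location it passes through, then classifies a problem by matching its trace against patterns. The letters (L for a loop iteration, D for a missing digit in one operand, C for a carry consumed from the previous column, E for an extra output digit), the function names, and the three patterns are choices made for this illustration, not a definitive encoding.

```python
import re
from typing import List, Tuple


def add_with_trace(a: str, b: str) -> Tuple[str, str]:
    """Add two decimal strings column by column; return (trace, sum)."""
    A = [int(ch) for ch in reversed(a)]   # least-significant digit first
    B = [int(ch) for ch in reversed(b)]
    trace: List[str] = []
    digits: List[int] = []
    carry = 0
    for i in range(max(len(A), len(B))):
        trace.append("L")                 # one loop iteration per column
        if i >= len(A):
            t = B[i]
            trace.append("D")             # operands have different lengths
        elif i >= len(B):
            t = A[i]
            trace.append("D")
        else:
            t = A[i] + B[i]
        if carry:
            t += 1
            trace.append("C")             # carry from the previous column
        carry = 1 if t > 9 else 0
        digits.append(t % 10)
    if carry:
        digits.append(1)
        trace.append("E")                 # extra digit in the output
    return "".join(trace), "".join(str(d) for d in reversed(digits))


# Hypothetical trace features, written as regular expressions over the letters.
FEATURES = {
    "single carry": r"L*(LC)L*",
    "double carry": r"L*(LCLC)L*",
    "no carry": r"L+",
}


def features(trace: str) -> List[str]:
    """Names of all features whose pattern matches the whole trace."""
    return [name for name, pat in FEATURES.items() if re.fullmatch(pat, trace)]


if __name__ == "__main__":
    for x, y in [("3", "2"), ("1234", "8765"), ("1234", "8757"), ("1234", "8667")]:
        trace, total = add_with_trace(x, y)
        print(x, "+", y, "=", total, trace, features(trace))
```

With such an encoding, generating a problem that exercises a given concept amounts to finding inputs whose trace matches the corresponding pattern, and trace length gives a rough difficulty signal.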
Trace-based modeling allows for test-input-generation tools4 for generating problems with certain trace features. Andersen et al.5 used this insight to automatically synthesize practice problems for elementary and middle school mathematics;5 Figure 5 outlines such automatic synthesis in the context of an addition procedure. Note various addition concepts can be modeled as trace properties and, in particular, regular expressions over procedure locations.

Moreover, trace-based modeling allows for use of notions of procedure coverage34 to evaluate the comprehensiveness of a certain collection of expert-designed problems and fill any holes. It also allows for defining a partial order over problems by defining a partial order over corresponding traces based on trace features (such as the number of times a loop was executed and whether the exceptional case of a conditional branch was executed) and the set of n-grams present in the trace. Andersen et al.5 used this partial order to synthesize progressions of problems and even to analyze and compare existing progressions across multiple textbooks.

As part of follow-on work, Andersen et al. used their trace-based framework to synthesize a progression of thousands of levels for Refraction, a popular math puzzle game. An A/B test with 2,377 players (on the portal http://www.newgrounds.com) showed an automatically synthesized progression can motivate players to play for similar lengths of time, as in the case of the original expert-designed progression. The median player in the synthesized progression group played 92% as long as the median player in the expert-designed progression group.

Effective progressions are important not just for school-based learning but also for usability and learnability in end-user applications. Many modern user applications have advanced features, and learning them constitutes a major effort by the user. Designers have thus focused on trying to reduce the effort; for example, Dong et al.11 created a series of mini-games to teach users advanced image-manipulation tasks in Adobe Photoshop. The Andersen et al.5 methodology may assist in creating such tutorials and games by automatically generating progressions of tasks from procedural specifications of advanced tasks.

Figure 5. Problem generation for procedural problems:5 (a) addition procedure to add two numbers, instrumented with control locations on the right side; and (b) concepts expressed in terms of trace features and corresponding example inputs that satisfy those features (such example inputs can be generated through test-input-generation techniques).

(a)
  Add(int array A, int array B)
    ℓ := Max(Len(A), Len(B));
    for i = 0 to ℓ − 1:                          Loop over digits (L)
      if (i ≥ Len(A)) t := B[i];                 Different # of digits (D)
      else if (i ≥ Len(B)) t := A[i];            Different # of digits (D)
      else t := A[i] + B[i];
      if (C[i] == 1) t := t + 1;                 Carry from prev. step (C)
      if (t > 9) { R[i] := t − 10; C[i+1] := 1; }
      else R[i] := t;
    if (C[ℓ] == 1) R[ℓ] := 1;                    Extra digit in output (E)

(b)
  Concept                                        Trace characteristic   Example input
  Single-digit addition                          L                      3 + 2
  Multiple-digit addition without carry          LL+                    1234 + 8765
  Single carry                                   L*(LC)L*               1234 + 8757
  Two single carries                             L*(LC)L+(LC)L*         1234 + 8857
  Double carry                                   L*(LCLC)L*             1234 + 8667
  Triple carry                                   L*(LCLCLC)L*           1234 + 8767
  Extra digit in input and new digit in output   L*CLDCE                9234 + 900

Conceptual problems. Problem generation for certain conceptual problems can be likened to discovering new theorems, a search-intensive activity that can be aided by domain-specific strategies. However, two general principles are useful across multiple subject domains:
P1: Example-based template generalization. This involves generalizing a given example problem into a template and searching for all possible instantiations of the template for valid problems. Given the search space might be vast, it is usually applicable when the validity of a given candidate problem can be checked quickly. It does not necessarily require access to solution-generation technology, though such technology can be used to ascertain the difficulty level of the generated problems; and

P2: Problem generation as reverse of solution generation. This applies only to proof problems. The idea is to perform a reverse search in the solution-search space starting with the goal and leading up to the premises. It has the advantage of ensuring the generated problems have specific solution characteristics.

The following sections illustrate how these principles are used in multiple subject domains.

Algebraic proof problems. Problems that require proving algebraic identities (see Figure 6) are common in high school math curricula. Generating such problems is tedious for the teacher since the teacher cannot arbitrarily change constants (unlike in procedural problems) or variables to generate a correct problem.

The Singh et al.26 algebra-problem-generation methodology, as in Figure 6, uses Principle P1 to generate fresh problems similar to a given example problem. First, a given example problem is generalized into a template with a hole for each operator in the original problem to be replaced by another operator of the same type signature. The teacher can guide the template-generalization process by providing more example problems or manually editing the initially generated template. All possible instantiations of the template are automatically enumerated, and the validity of an instantiation is checked by testing on random inputs. The probabilistic soundness of such a check follows from Property 1. The methodology works for identities over analytic functions involving common algebraic operators, including trigonometry, integration, differentiation, logarithm, and exponentiation.
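The enumerate-and-test loop just described can be sketched as follows. This is an illustration under stated assumptions, not the tool of Singh et al.: the template shape, T1(A)/(1 ± T2(A)) + (1 ± T3(A))/T4(A) = 2·T5(A) with each Ti drawn from the six trigonometric functions, is read off the trigonometric example of Figure 6, and the sampling interval, tolerance, number of test points, and all function names are chosen for this sketch.

```python
import itertools
import math
import random

# The six operators that may fill each hole of the template.
TRIG = {
    "sin": math.sin,
    "cos": math.cos,
    "tan": math.tan,
    "cot": lambda a: 1.0 / math.tan(a),
    "sec": lambda a: 1.0 / math.cos(a),
    "csc": lambda a: 1.0 / math.sin(a),
}


def holds(t1, s2, t2, s3, t3, t4, t5, points, rel_tol=1e-9):
    """Test T1/(1 + s2*T2) + (1 + s3*T3)/T4 == 2*T5 on the given sample points."""
    for a in points:
        try:
            lhs = (TRIG[t1](a) / (1 + s2 * TRIG[t2](a))
                   + (1 + s3 * TRIG[t3](a)) / TRIG[t4](a))
        except ZeroDivisionError:
            return False                      # treat an undefined value as a failure
        if not math.isclose(lhs, 2 * TRIG[t5](a), rel_tol=rel_tol):
            return False                      # one mismatch refutes the candidate
    return True


def generate_identities(seed=0, n_points=5):
    """Enumerate every instantiation of the template and keep the ones that pass."""
    rng = random.Random(seed)
    points = [rng.uniform(0.3, 1.2) for _ in range(n_points)]
    found = []
    for t1, t2, t3, t4, t5 in itertools.product(TRIG, repeat=5):
        for s2, s3 in itertools.product((1, -1), repeat=2):
            if holds(t1, s2, t2, s3, t3, t4, t5, points):
                found.append((t1, s2, t2, s3, t3, t4, t5))
    return found


if __name__ == "__main__":
    for t1, s2, t2, s3, t3, t4, t5 in generate_identities():
        sign2, sign3 = "+-"[s2 < 0], "+-"[s3 < 0]
        print(f"{t1} A/(1 {sign2} {t2} A) + (1 {sign3} {t3} A)/{t4} A = 2 {t5} A")
```

Because most candidates fail on the first sample point, the full enumeration of roughly thirty thousand instantiations stays cheap, which is the practical point of testing on concrete values instead of reasoning symbolically.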
Note the methodology would not be feasible if symbolic reasoning were used (instead of random testing) to check the validity of a candidate instantiation since symbolic reasoning is much slower (Principle S1) and the density of valid instantiations is often quite low.

Figure 6. Problem generation for algebraic-proof problems involving identities over analytic functions (such as trigonometry and determinants);26 a given problem is generalized into a template, and valid instantiations are found by testing on random values for free variables.

Trigonometry example. Example problem: sin A/(1 + cos A) + (1 + cos A)/sin A = 2 csc A. Generalized problem template: T1 A/(1 ± T2 A) + (1 ± T3 A)/T4 A = 2 T5 A, where Ti ∈ {cos, sin, tan, cot, sec, csc}. New similar problems: five generated identities of the same shape built from cos/sin, cot/csc, and tan/sec pairs (signs and right-hand sides not reliably legible in this copy).

Determinant example. Example problem: the 3×3 determinant with rows ((x + y)², zx, zy), (zx, (y + z)², xy), (yz, xy, (z + x)²) equals 2xyz(x + y + z)³. Generalized problem template: the determinant with entries F0(x, y, z), ..., F8(x, y, z) equals c F9(x, y, z), where Fi (0 ≤ i ≤ 8) and F9 are homogeneous polynomials of degrees 2 and 6 respectively, ∀(i, j) ∈ {(4,0), (8,4), (5,1), ...}: Fi = Fj[x → y; y → z; z → x], and c ∈ {±1, ±2, ..., ±10}. New similar problems: three generated determinant identities (entries not reliably legible in this copy).

Natural deduction problems. Figure 7 covers three interfaces for generating new natural deduction problems:1

The proposition replacement interface (see Figure 7a) finds replacements for a given premise or the conclusion in a given example problem. It generates those propositions as replacements that ensure the new problem is well defined, that is, one whose conclusion is implied by the premises but not by any strict subset of the premises. This interface, based on Principle P1, involves checking all possible small-size propositions as replacements.
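The well-definedness condition can be checked cheaply once each proposition is abstracted to a truth table stored as an integer bitvector, in the spirit of Principle S1. The sketch below is one possible encoding, not necessarily the one used by Ahmed et al.: bit k of a bitvector holds the proposition's value under the k-th variable assignment, and all names (`var`, `implies`, `entails`, `well_defined`) are hypothetical.

```python
from itertools import combinations


def var(i: int, n: int) -> int:
    """Truth-table bitvector of variable i among n variables."""
    bits = 0
    for k in range(1 << n):          # k enumerates all 2^n assignments
        if (k >> i) & 1:
            bits |= 1 << k
    return bits


def implies(p: int, q: int, mask: int) -> int:
    """Truth table of p -> q, computed with plain bit operations."""
    return (~p | q) & mask


def entails(premises, conclusion: int, mask: int) -> bool:
    """True if every assignment satisfying all premises satisfies the conclusion."""
    conj = mask                      # empty conjunction is true everywhere
    for p in premises:
        conj &= p
    return (conj & ~conclusion & mask) == 0


def well_defined(premises, conclusion: int, mask: int) -> bool:
    """Conclusion follows from all premises but from no strict subset of them."""
    if not entails(premises, conclusion, mask):
        return False
    for size in range(len(premises)):
        for subset in combinations(premises, size):
            if entails(subset, conclusion, mask):
                return False
    return True


if __name__ == "__main__":
    n = 3
    mask = (1 << (1 << n)) - 1       # 2^n truth-table rows, all bits set
    x1, x2, x3 = (var(i, n) for i in range(n))
    premises = [implies(x1, x2, mask), implies(x2, x3, mask)]
    print(well_defined(premises, implies(x1, x3, mask), mask))          # needs both premises
    print(well_defined(premises + [x1], implies(x1, x3, mask), mask))   # third premise is redundant
```

A replacement search in this style would enumerate small candidate propositions, compute each truth table once, and keep the candidates for which `well_defined` holds.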
The validity of each candidate problem is checked by performing bitvector operations over the bitvector-based truth table representation of the propositions20 (Principle S1). A candidate problem is valid if the bitwise-and of the premise bitvectors is bitwise smaller than the conclusion bitvector (that is, every bit set in the conjunction of the premises is also set in the conclusion).

The similar problem-generation interface finds problems with a solution that uses exactly the same sequence of inference rules used by a solution of an example problem. Figure 7b lists automatically generated problems, given an example problem. Figure 4e describes a solution for the first new problem in Figure 7b. Observe this solution uses exactly the same sequence of inference rules (in bold) as the solution for the original example problem in Figure 4d.

The parameterized problem-generation interface finds problems with specific features (such as a given number of premises and variables, maximum size of propositions, and smallest proof involving a given number of steps and a given set of rules). Figure 7c lists automatically generated problems, given some parameters.

Both these interfaces find desired problems by performing reverse search in the solution space (Principle P2) explored by the solution-generation technology for natural deduction described earlier. The similar-problem-generation interface further uses the solution template obtained from a solution of the example problem for search guidance (Principle P1).

Figure 7. Problem-generation interfaces for natural deduction problems:1 (a) proposition replacement, listing some replacements for Premise 3 in the example problem in (b); (b) similar-problem generation, listing the example problem and new similar problems by Premise 1, Premise 2, Premise 3, and Conclusion; and (c) parameterized-problem generation, with parameters # of premises = 3, size of propositions ≤ 4, # of variables = 3, # of inference steps = 2, and inference rules = {DS, HS}. (The propositions in all three panels are not legible in this copy.)

Feedback Generation

Feedback generation may involve identifying whether or not a student's solution is incorrect, why it is incorrect, and where or how it can be fixed. A teacher might even want to generate a hint to enable students to identify and/or fix mistakes on their own. In examination settings, the teacher would even like to award a numerical grade. Automating feedback generation is important for several reasons: First, it is quite difficult and time-consuming for a human teacher to identify what mistake a student has made. As a result, teachers often take several days to return graded assignments to their students. In contrast, if students get immediate feedback (due to automation), it can help them realize and learn from their mistakes faster and better. Furthermore, maintaining grade consistency across students and graders is difficult. The same grader may award different scores to two very similar solutions, while different graders may award different scores to the same solution.

Procedural problems. Generating feedback for procedural problems is relatively easy (compared to conceptual problems) since they all have a unique solution; the student's attempt can simply be syntactically compared with the unique solution. While student errors may include careless mistakes or incorrect fact recall, one common class of mistakes students make in procedural problems is employing a wrong algorithm. Van Lehn30 identified more than 100 bugs students introduce in subtraction alone. Ashlock7 identified a set of buggy computational patterns for a variety of algorithms based on real student data.
Here are two bugs Ashlock described for the addition procedure (see Figure 5a):

Add each column and write the sum below each column, even if it is greater than nine; and

Add each column from left to right; if the sum is greater than nine, write the 10s digit beneath the column and the ones digit above the column to the right.

All such bugs have a clear procedural meaning and can be captured as a procedure. The buggy procedures can be automatically synthesized from examples of incorrect student traces using the same PBE technology discussed earlier in the context of solution generation for procedural problems. In fact, each of the 40 bugs described by Ashlock7 is illustrated with a set of five to eight example traces, and Andersen et al.6 were able to synthesize 28 (out of 40) buggy procedures from their example traces.

Identifying buggy procedures has multiple benefits; for instance, it can inform teachers about a student's misconceptions. It can also be used to automatically generate a progression of problems specifically tailored to highlighting differences between the correct procedure and the buggy procedure.

Aleven et al.2 used PBE technology to generalize demonstrations of correct and incorrect behaviors provided upfront by the teacher. While their generalization is restricted to loop-free procedures, teachers are able to add annotations as feedback to students who get stuck or follow a known incorrect path.

Conceptual problems. Feedback for proof problems can be generated by checking correctness of each individual step (assuming students are using correct proof methodology) and using solution-generation technology to generate proof completions from the onset of any incorrect step.13 Here, this article focuses on feedback generation for construction problems, including two general principles useful across multiple subject domains:

F1: Edit distance. The idea is to find the smallest set of edits to the student's solution that will transform it into a correct solution. Such feedback informs students about where the error is in their solution and how it can be fixed.
An interesting twist is to find the smallest set of edits to the problem description that will transform it into one that corresponds to the student's incorrect solution, thus capturing the common mistake of misunderstanding the problem description. Such feedback can inform students as to why their solution is incorrect. The number and type of edits can be used as a criterion for awarding numerical grades; and

F2: Counterexamples. The idea is to find input examples on which the student's solution does not behave correctly. Such feedback informs the student about why the solution is incorrect. The density of such inputs can be used as a criterion for awarding grades.

The following illustrates how these principles are used in different subject domains:

Introductory programming assignments. The standard approach to grading a programming assignment is to examine the submitted program's behavior on a set of test inputs that can be written manually or generated automatically.⁴ Douce et al.¹² surveyed various systems developed for automated grading of programming assignments. Failing test inputs, or counterexamples, can provide guidance as to why a given solution is incorrect (Principle F2). However, this guidance alone is not ideal, especially for
beginners who find it difficult to map counterexamples to errors in their code. An edit-distance-based technique²⁷ offers guidance on fixing an incorrect solution (Principle F1).

Consider the problem of computing the derivative of a polynomial whose coefficients are represented as a list of integers, teaching conditionals and iteration over lists (see Figure 8a for a reference solution). For this problem, students struggled with low-level Python semantics involving list indexing and iteration bounds. Students also struggled with conceptual aspects of the problem (such as missing the corner case of handling lists consisting of a single element). A teacher could leverage this knowledge of common example errors to define an edit distance model consisting of a set of weighted rewrite rules that capture potential corrections (along with their cost) for mistakes students might make in their solutions. Figure 8b includes sample rewrite rules: The first such rule transforms the index in a list access; the second transforms the right-hand side of a constant initialization; and the third transforms the arguments for the range function.

Figure 8c–e show three student programs, together with respective feedback generated by Singh et al.'s program-grading tool.²⁷ The underlying technique involves exploring the space of all candidate programs, applying teacher-provided rewrite rules to the student's incorrect program, to synthesize a candidate program equivalent to the reference solution while requiring a minimum number of corrections. For this purpose, the underlying technique leverages SKETCH,²⁸ a state-of-the-art program synthesizer that employs a SAT-based algorithm to complete program sketches (programs with holes) so they meet a given specification.

Singh et al. evaluated their tool on thousands of real student attempts (at programming problems) obtained from the 2012 Introduction to Programming course at MIT (6.00) and MITx (6.00x).²⁷ The tool generated feedback (up to four corrections) on over 64% of all submitted solutions that were incorrect in about 10 seconds on average.

Intention-based matching approaches¹⁹ match plans in student programs with those in a preexisting knowledge base to provide feedback. While the Singh et al. tool makes no assumption as to the algorithms or plans students can use, a key limitation is it cannot provide feedback on student attempts with big conceptual errors that cannot be fixed through local rewrite rules. Moreover, the Singh et al. tool is limited to providing feedback on functional equivalence, as opposed to performance or design patterns.

Figure 8. Automated grading of introductory programming problems:²⁷ (a) reference implementation (in Python) for the problem of computing a derivative of a polynomial; (b) rewrite rules that capture common errors; and (c), (d), and (e) denoting three different student submissions, along with respective feedback generated automatically.

(a)
    def computederiv(poly):
        result = []
        for i in range(len(poly)):
            result += [i * poly[i]]
        if len(poly) == 1:
            return result  # return [0]
        else:
            return result[1:]  # remove the leading 0

(b)
    x[a] → x[{a + 1, a − 1, ?a}]
    x = n → x = {n + 1, n − 1, 0}
    range(a0, a1) → range({0, 1, a0 − 1, a0 + 1}, {a1 + 1, a1 − 1})

(c)
    def computederiv(poly):
        deriv, zero = [], 0
        if (len(poly) == 1):
            return deriv
        for e in range(0,len(poly)):
            if (poly[e] == 0):
                zero += 1
            else:
                deriv.append(poly[e]*e)
        return deriv

The program requires 3 changes: In the return statement return deriv in line 4, replace deriv by [0]. In the comparison expression (poly[e] == 0) in line 6, change (poly[e] == 0) to False. In the expression range(0, len(poly)) in line 5, increment 0 by 1.

(d)
    def computederiv(poly):
        idx = 1
        deriv = list([])
        plen = len(poly)
        while idx <= plen:
            coeff = poly.pop(1)
            deriv + [coeff*idx]
            idx = idx + 1
            if len(poly) < 2:
                return deriv

The program requires 1 change: In the function computederiv, add the base case to return [0] for len(poly) = 1.

(e)
    def computederiv(poly):
        length=int(len(poly)-1)
        i = length
        deriv = range(1,length)
        if len(poly) == 1:
            deriv = [0.0]
        else:
            while i >= 0:
                new = poly[i] * i
                i -= 1
                deriv[i] = new
        return deriv

The program requires 2 changes: In the expression range(1, length) in line 4, increment length by 1. In the comparison expression (i >= 0) in line 8, change operator >= to !=.

Automata constructions. A deterministic finite automaton (DFA) is a simple but powerful computational model with diverse applications and hence is a standard part of computer science education. JFLAP²⁴ is a widely used system for teaching automata and formal languages that allows for constructing, testing, and conversion between computational models but does not support grading. The following paragraphs explore a technique for automated grading of automata constructions.³

Consider the problem of constructing a DFA over alphabet {a, b} for the regular language L = {s | s contains the substring ab exactly twice}. Figure 9 includes five attempts submitted by different students and the respective feedback generated by the Alur et al.³ automata grading tool. The underlying technique involves identifying different kinds of feedback, including edit distance over both solution and problem (Principle F1) and counterexamples (Principle F2), with each feedback associated with a numerical grade. The feedback that corresponds to the best numerical grade is then reported to the student. The reported feedback for the third attempt is based on edit distance to a correct solution, and the grade is a function of the number and kind of edits needed to convert the student's incorrect automaton into a correct automaton. In contrast, the rest of the incorrect attempts have a large edit distance and hence are based on other kinds of feedback. The second attempt and the last attempt correspond to a slightly different language description; that is, L′ = {s | s contains the substring ab at least twice}, possibly reflecting the common student mistake of misreading the problem description. The reported feedback here is based on edit distance over problem descriptions, and the associated grade is a function of the number and kind of edits required. The reported feedback for the fourth attempt, which does not involve a small edit distance, is based on counterexamples. The grade here is a function of the density of counterexamples, with more weight given to smaller-size counterexamples since students ought to have checked the correctness of their construction on smaller strings.
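Principle F2 for this DFA problem can be sketched in Python as follows. This is an illustration only: it finds counterexamples by brute-force enumeration of short strings rather than by a symbolic automata construction, and the per-length weight 1/(n + 1) is an assumed stand-in for "more weight given to smaller-size counterexamples". All function names are hypothetical. The STUDENT automaton accepts strings containing ab at least twice, mirroring the misreading of the problem discussed above.

```python
from itertools import product

ALPHABET = "ab"

def build_dfa(accept_counts, cap=3):
    """DFA whose states are (k, last_was_a), where k is the number of
    'ab' occurrences seen so far, saturating at cap."""
    delta = {}
    for k in range(cap + 1):
        for last_a in (False, True):
            delta[((k, last_a), "a")] = (k, True)
            # Reading 'b' right after 'a' completes one more "ab".
            k_next = min(k + 1, cap) if last_a else k
            delta[((k, last_a), "b")] = (k_next, False)
    accepting = {(k, la) for k in accept_counts for la in (False, True)}
    return {"start": (0, False), "delta": delta, "accepting": accepting}

def run_dfa(dfa, s):
    """Return True if the DFA accepts the string s."""
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"][(state, ch)]
    return state in dfa["accepting"]

def counterexamples(student, reference, max_len):
    """Yield strings, shortest first, on which the two DFAs disagree."""
    for n in range(max_len + 1):
        for chars in product(ALPHABET, repeat=n):
            s = "".join(chars)
            if run_dfa(student, s) != run_dfa(reference, s):
                yield s

def weighted_density(student, reference, max_len=8):
    """Fraction of misclassified strings per length, averaged with
    weights 1/(n + 1) so that short strings count more (assumed scheme)."""
    total = weight_sum = 0.0
    for n in range(max_len + 1):
        words = ["".join(c) for c in product(ALPHABET, repeat=n)]
        wrong = sum(run_dfa(student, w) != run_dfa(reference, w) for w in words)
        weight = 1.0 / (n + 1)
        total += weight * wrong / len(words)
        weight_sum += weight
    return total / weight_sum

REFERENCE = build_dfa({2})     # "ab" exactly twice
STUDENT = build_dfa({2, 3})    # "ab" at least twice

print(next(counterexamples(STUDENT, REFERENCE, 8)))
print(weighted_density(STUDENT, REFERENCE))
```

The shortest counterexample gives the student a concrete string to trace through the automaton, while the density could be mapped to a grade, for example by scaling one minus the density to the 0 to 10 range; that mapping is likewise an assumption.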
To automatically generate feedback, Alur et al.³ formalized problem descriptions using a logic called MOSEL, an extension of the classical monadic second-order logic (MSO) with some syntactic sugar that allows defining regular languages in a concise, natural way. In MOSEL, the languages L and L′ can be described by the formulas |indof(ab)| = 2 and |indof(ab)| ≥ 2, respectively, where the indof constructor returns the set of all indices where the argument string occurs. Their automata-grader tool implements synthesis algorithms that translate MOSEL descriptions into automata and vice versa. The MOSEL-to-automaton synthesizer rewrites MOSEL descriptions into MSO, then leverages standard techniques to transform an MSO formula into the corresponding automaton. The automaton-to-MOSEL synthesizer uses brute-force search to enumerate MOSEL formulas in order of increasing size to find one that matches a given automaton. Edit distance is then computed based on notions of automata distance or tree distance (in case of problem descriptions), while counterexamples are computed using automata difference.

Alur et al.³ evaluated their automata-grader tool on 800+ student attempts to solve several problems from an automata course CS373 at the University of Illinois at Urbana-Champaign in Spring 2013. Each submission was graded by two instructors and the tool. For one of these representative problems, instructors were incorrect (having given full marks to an incorrect attempt) or inconsistent (same instructor having given different marks to syntactically equivalent attempts) for 20% of attempts. For another 25% of attempts, there was a discrepancy of at least three points (out of 10) between the tool and one of the instructors; in more than 60% of these cases, the instructor concluded (after re-reviewing) that the tool's grade was more fair. The two instructors thus concluded that the tool is preferable to humans for consistency and scalability.

The automata grading tool³ has been deployed online, providing live feedback and a variety of hints.
In Fall 2013, Alur et al.,³ together with Bjoern Hartmann of the University of California, Berkeley, conducted a user study around the utility of the tool at the University of Pennsylvania and the University of Illinois at Urbana-Champaign, observing such hints were helpful, increased student perseverance, and improved problem-completion time.

Figure 9. Automated grading of automata problems:³ several student attempts to construct an automaton that accepts strings containing the substring ab exactly twice, along with automatically generated feedback and grade.

Attempt 1: Accepts the correct language. Grade: 10/10
Attempt 2: Accepts the strings that contain ab at least twice instead of exactly twice. Grade: 5/10
Attempt 3: Misses the final state 5. Grade: 9/10
Attempt 4: Behaves correctly on most of the strings. Grade: 6/10
Attempt 5: Accepts the strings that contain ab at least twice instead of exactly twice. Grade: 5/10

Conclusion

Providing personalized and interactive education (as in one-on-one tutoring) remains an unsolved problem in standard classrooms. The arrival of MOOCs, despite being an opportunity for sharing quality instruction with a large number of students, exacerbates the problem with an even higher student-to-teacher ratio. Recent advances in computer science can be brought together to rethink intelligent tutoring,³³ with the phenomenal rise of online education making this investment very timely.

This article has summarized recently published work from different areas of computer science, including programming languages,¹⁷,²⁷ artificial intelligence,¹,³,²⁶ and human-computer interaction.⁵ It also reveals a common thread in this interdisciplinary line of work, namely the use of examples as an input to the underlying algorithms (for intent understanding), as an output of these algorithms (for generating the intended artifact), or even inside these algorithms (for inductive reasoning). This may enable other researchers to apply these principles to develop similar techniques for other subject domains. This article should inform educators about new advances to assist various educational activities, allowing them to think more creatively about curriculum and pedagogical reforms; for instance, these advances can enable development of gaming layers that take computational thinking into K–12 classrooms.

This article has applied a rather technical perspective to computer-aided education. While the technologies can affect education in a positive manner, computer-aided education researchers must still devise ways to quantify its benefits on student learning, which may be critical to attract funding.
Furthermore, this article has discussed only logical-reasoning-based techniques, but these techniques can be augmented with complementary techniques that leverage large student populations and data whose availability is facilitated by recent interest in online education platforms like Khan Academy and MOOCs; for instance, large amounts of student data can be used to collect different correct solutions to a (proof) problem, which in turn can be used to generate feedback¹³ or discover effective learning pathways to guide problem selection. Large student populations can be leveraged to crowdsource tasks that are difficult to automate,³¹ as in peer grading.²¹ A synergistic combination of logical reasoning, machine learning, and crowdsourcing methods may lead to self-improving advanced intelligent tutoring systems that can revolutionize all education.

Acknowledgments

I thank Moshe Y. Vardi for encouraging me to write this article. I thank Ben Zorn and the anonymous reviewers for providing valuable feedback on earlier versions of the draft.

References

1. Ahmed, U., Gulwani, S., and Karkare, A. Automatically generating problems and solutions for natural deduction. In Proceedings of the International Joint Conference on Artificial Intelligence (Beijing, Aug. 3–9, 2013).
2. Aleven, V., McLaren, B.M., Sewall, J., and Koedinger, K.R. A new paradigm for intelligent tutoring systems: Example-tracing tutors. Artificial Intelligence in Education 19, 2 (2009), 105–154.
3. Alur, R., D'Antoni, L., Gulwani, S., Kini, D., and Viswanathan, M. Automated grading of DFA constructions. In Proceedings of the International Joint Conference on Artificial Intelligence (Beijing, Aug. 3–9, 2013); tool at http://www.automatatutor.com/
4. Anand, S., Burke, E., Chen, T.Y., Clark, J., Cohen, M.B., Grieskamp, W., Harman, M., Harrold, M.J., and McMinn, P. An orchestrated survey on automated software test case generation. Journal of Systems and Software 86, 8 (2013), 1978–2001.
5. Andersen, E., Gulwani, S., and Popovic, Z. A trace-based framework for analyzing and synthesizing educational progressions.
In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (Paris, Apr. 27–May 2). ACM Press, New York, 2013, 773–782.
6. Andersen, E., Gulwani, S., and Popovic, Z. Programming by Demonstration Framework Applied to Procedural Math Problems. Technical Report MSR-TR-2014-61. Microsoft Research, Redmond, WA, 2014.
7. Ashlock, R. Error Patterns in Computation: A Semi-Programmed Approach. Merrill Publishing Company, Princeton, NC, 1986.
8. Bjørner, N. Taking satisfiability to the next level with Z3. In Proceedings of the Sixth International Joint Conference on Automated Reasoning (Manchester, U.K., June 26–29). Springer, 2012, 1–8.
9. Broda, K., Ma, J., Sinnadurai, G., and Summers, A.J. Pandora: A reasoning toolbox using natural deduction style. Logic Journal of the Interest Group in Pure and Applied Logics 15, 4 (2007), 293–304.
10. Cypher, A., Ed. Watch What I Do: Programming by Demonstration. MIT Press, Cambridge, MA, 1993.
11. Dong, T., Dontcheva, M., Joseph, D., Karahalios, K., Newman, M., and Ackerman, M. Discovery-based games for learning software. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (Austin, TX, May 5–10). ACM Press, New York, 2012, 2083–2086.
12. Douce, C., Livingstone, D., and Orwell, J. Automatic test-based assessment of programming: A review. Journal of Educational Resources in Computing 5, 3 (2005), 511–531.
13. Fast, E., Lee, C., Aiken, A., Bernstein, M.S., Koller, D., and Smith, E. Crowd-scale interactive formal reasoning and analytics. In Proceedings of the ACM Symposium on User Interface Software and Technology (St. Andrews, Scotland, Oct. 8–11). ACM Press, New York, 2013, 363–372.
14. Gao, X.-S. and Lin, Q. MMP/Geometer-a software package for automated geometric reasoning. In Proceedings of the Fourth International Workshop on Automated Deduction in Geometry (Hagenberg Castle, Austria, Sept. 4–6). Springer, 2004, 44–66.
15. Gulwani, S. Program Analysis Using Random Interpretation. Ph.D. thesis.
University of California, Berkeley, 2005; http://research.microsoft.com/en-us/um/people/sumitg/pubs/dissertation.pdf
16. Gulwani, S. Synthesis from examples: Interaction models and algorithms (invited talk paper). In Proceedings of the 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (Timisoara, Romania, Sept. 26–29). IEEE Computer Society, 2012, 8–14.
17. Gulwani, S., Korthikanti, V.A., and Tiwari, A. Synthesizing geometry constructions. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (San Jose, CA, June 4–8). ACM Press, New York, 2011, 50–61.
18. Gulwani, S., Harris, W., and Singh, R. Spreadsheet data manipulation using examples. Commun. ACM 55, 8 (Aug. 2012), 97–105.
19. Johnson, W. Intention-based Diagnosis of Novice Programming Errors. Morgan Kaufmann, Burlington, MA, 1986.
20. Knuth, D.E. The Art of Computer Programming, Volume 4A: Combinatorial Algorithms, Part 1. Addison-Wesley Professional, Boston, 2011.
21. Kulkarni, C., Pang, K., Le, H., Chia, D., Papadopoulos, K., Cheng, J., Koller, D., and Klemmer, S. Peer and self assessment in massive online design classes. ACM Transactions on Computer-Human Interaction 20, 6 (2013).
22. Lieberman, H. Your Wish Is My Command: Programming by Example. Morgan Kaufmann, Burlington, MA, 2001.
23. Mozgovoy, M., Kakkonen, T., and Cosma, G. Automatic student plagiarism detection: Future perspectives. Journal of Educational Computing Research 43, 4 (2010), 511–531.
24. Rodger, S. and Finley, T. JFLAP: An Interactive Formal Languages and Automata Package. Jones and Bartlett Publishers, Inc., Sudbury, MA, 2006.
25. Schwartz, J.T. Fast probabilistic algorithms for verification of polynomial identities. Journal of the ACM 27, 4 (1980), 701–717.
26. Singh, R., Gulwani, S., and Rajamani, S. Automatically generating algebra problems. In Proceedings of the 26th Conference on Artificial Intelligence (Toronto, July 22–26). AAAI Press, 2012.
27. Singh, R., Gulwani, S., and Solar-Lezama, A. Automated feedback generation for introductory programming assignments.
In Proceedings of the 34th annual ACM SIGPLAN Conference on Programming Language Design and Implementation (Seattle, June 16–22). ACM Press, New York, 2013, 15–26.
28. Solar-Lezama, A. Program Synthesis By Sketching. Ph.D. thesis. University of California, Berkeley, 2008; http://www.eecs.berkeley.edu/pubs/techrpts/2008/EECS-2008-177.pdf
29. Van Ditmarsch, H. User interfaces in natural deduction programs. In Proceedings of the User Interfaces for Theorem Provers Workshop (Eindhoven, The Netherlands, July 1998), 87–95.
30. VanLehn, K. Mind Bugs: The Origins of Procedural Misconceptions. MIT Press, Cambridge, MA, 1991.
31. Weld, D.S., Adar, E., Chilton, L., Hoffmann, R., Horvitz, E., Koch, M., Landay, J., Lin, C.H., and Mausam, M. Personalized online education: A crowdsourcing challenge. In Proceedings of the Fourth Human Computation Workshop at the 26th Conference on Artificial Intelligence (Toronto, July 22–26, 2012).
32. Wilson, S. and Fleuriot, J.D. Combining dynamic geometry, automated geometry theorem proving and diagrammatic proofs. In Proceedings of the User Interfaces for Theorem Provers Workshop (Edinburgh, Apr.). Springer, 2005.
33. Woolf, B. Building Intelligent Interactive Tutors. Morgan Kaufmann, Burlington, MA, 2009.
34. Zhu, H., Hall, P.A.V., and May, J.H.R. Software unit test coverage and adequacy. ACM Computing Surveys 29, 4 (Dec. 1997), 366–427.

Sumit Gulwani (sumitg@microsoft.com) is a principal researcher at Microsoft Research, Redmond, WA, an adjunct faculty in the Department of Computer Science and Engineering at the Indian Institute of Technology, Kanpur, India, and an affiliate faculty in the Department of Computer Science & Engineering at the University of Washington, Seattle.

© 2014 ACM 0001-0782/14/08 $15.00