Learning to Teach: Machine Learning to Improve Instruction


Learning to Teach: Machine Learning to Improve Instruction
Beverly Park Woolf
School of Computer Science, University of Massachusetts
bev@cs.umass.edu
NIPS 2015 Workshop on Human Propelled Machine Learning, Dec 13, 2014

Long Term Goal
Millions of schoolchildren will have access to what Alexander the Great enjoyed as a royal prerogative: the personal services of a tutor as well informed as Aristotle.
Students will have instant access to vast stores of knowledge through their computerized tutors.
Pat Suppes, Stanford University, 1966 (died Nov 2014)

Alexander the Great valued learning so highly that he said he was more indebted to Aristotle for giving him knowledge than to his father for giving him life.

We are on track.
Key components: Artificial Intelligence, Machine Learning, Learning Sciences.
We are able to achieve personal services of a tutor for every student and instant access to vast stores of knowledge.

Then: ~400 BC
Now: 2014
Intelligent Tutoring Systems: Model the Student, Model the Domain, Personalize Tutoring, Assess Learning
Learning @ Scale

Research Questions
How to retrieve substance from educational data?
What do teachers and students need to know?
What do researchers in Learning Sciences want to know?
Explore large educational data sets and how they are analyzed: create models and pattern finding.
How are researchers in the field of educational technology using a variety of techniques to use data to improve teaching and learning?

What Kind of ML Techniques?
Visualization and modeling
Decision trees
Bayesian networks
Logistic Regression
Temporal Models
Markov Models
Classification: Naïve Bayes, Neural Networks, Decision trees

Reasoning about the Learner with Machine Learning

Techniques
Model types: Open Learner Models; Models for Teachers, Parents; Models of the Domain; Models of Student Knowledge, Learning; Models of Student Affect/Motivation; Engagement/use and Misuse/On-off task; Pedagogical Moves and Tutorial Actions

Pre-processing: Discretizing Variables, Normalizing and Transforming Variables
Visualizations: Single Variables and Relationships
Models: Correlations/Crosstabulations
Models: Causal Modeling: Bull & Mitrovic; Arroyo, EDM 2010
Models: Linear Regression: Heffernan; Koedinger; Ritter (EDM 2011 best paper); Arroyo, Log Files; Baker; Arroyo, Log Files; Arroyo, Log Files; Beck & Rai; Arroyo, Shanabrook; Baker; Arroyo, EDM 2010; Arroyo -- AnimalWatch
Models: Feature Selection. Splitting Models vs. Accounting for: Martin & Koedinger; Arroyo
Classification: Logistic Regression: Pavlik (PFA); Gong & Beck; Cooper, David; Beck
Classification: Clustering: Desmarais -- non-negative matrix factorization; Yue Gong (UMAP 2012), clustering without features :-)
Classification: Naive Bayes
Classification: Neural Networks: Burns, Handwriting; D'Mello: Predicting affective states; Baker; Stern, MANIC
Classification: Decision Trees: random forest approach was widely used in KDD 2011 Cup; de Vicente & Pain
Models: Association Rule Learning: Romero; Merceron
Temporal Models: Temporal Patterns and Trails over observable variables, and Markov Chains: Romero (Educational Trails); Shanabrook; Shanabrook; Shanabrook
Models: Bayesian Networks: Zapata-Rivera; Heffernan; lots of classic ITS work (HYDRIVE; William Murray); Conati; Arroyo; Rai; Chaz Murray (RTDT)
Temporal Models: Hidden Markov Models (latent variables): Mayo & Mitrovic; Beck; Pardos; Johns & Woolf
Ivon Arroyo, Worcester Polytechnic Institute

Data Sets Used
Data sets come from log files: educational tutoring and assessment software,

Large Data Sets
Event Log Table of a Math Tutoring System. 571,776 rows, just in a year's time.

Agenda: Introduction; Model the Student; Model the Domain; Personalize Tutoring; Assess Learning; Intelligent Tutoring Systems; Learning @ Scale

Student Model

A data-driven approach toward automatic prediction of students' emotional states without sensors and while students are still actively engaged in their learning.
Models from students' ongoing behavior. A cross-validation revealed small gains in accuracy for the more sophisticated state-based models and better predictions of the remaining unpredicted cases, compared to the baseline models.
By modifying the context of the tutoring system, including students' perceived emotion around mathematics, a tutor can now optimize and improve a student's mathematics attitudes.
David H. Shanabrook, David G. Cooper, Beverly Park Woolf, and Ivon Arroyo

Student States
Describing student/tutor interaction

Problem state patterns
IBM's Many Eyes Word Tree algorithm. The total is 1280 ATT (attempted and solved) events. Most frequently ATT was followed by a SOF event (see top tree). The second level of the tree shows that in the sequence ATT ATT the most frequent following event changes to the ATT event, i.e. the shift in behavior occurs after two ATT states (see second tree and top branch). This indicates the ATT state is more often a solitary event, whereas the ATT ATT pattern will continue in the ATT state. Thus, from the analysis the most frequent 3-problem state patterns (e.g., NOTR-NOTR-NOTR) are determined (see third tree and second branch).
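The pattern counting described above can be sketched as a frequency count over runs of consecutive problem states. This is a minimal illustration, not the Word Tree algorithm itself: the function name `top_patterns` and the example log are hypothetical, and only the state codes ATT, SOF and NOTR come from the slide.

```python
from collections import Counter

def top_patterns(events, n=3, k=3):
    """Return the k most frequent runs of n consecutive problem states."""
    # zip over n shifted copies of the list yields every length-n window.
    windows = zip(*(events[i:] for i in range(n)))
    return Counter(windows).most_common(k)

# Hypothetical per-problem state log for one student (invented sequence).
log = ["ATT", "SOF", "NOTR", "NOTR", "NOTR", "NOTR", "ATT", "SOF"]
patterns = top_patterns(log, n=3, k=1)
print(patterns)
```

A word tree shows the same counts grouped by shared prefix, so the most frequent 3-state pattern is the heaviest depth-three branch.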

A Dynamic Mixture Model to Detect Student Motivation and Proficiency
Jeff Johns, Autonomous Learning Laboratory
Beverly Woolf, Center for Knowledge Communication
AAAI 7/20/2006

Problem Statement
Background
Develop a machine learning component for a math tutoring system used by high school students (SAT, MCAS).
Focus on estimating the state of a student, which is then used for selecting an appropriate pedagogical action.
Problem
Using a model to estimate student ability, but students appear unmotivated in ~30% of problems.
Solution
Explicitly model motivation (as a dynamic variable) and student proficiency in a single model.

Detection of Motivation
Unmotivated students do not reap the full rewards of using a computer-based intelligent tutoring system.
Detection of improper behavior is thus an important component of an online student model.
Dynamic mixture model based on Item Response Theory. This model simultaneously estimates a student's proficiency and changing motivation level.
By accounting for student motivation with the dynamic mixture model, researchers can more accurately estimate proficiency and the probability of a correct response.

Created Item Response Theory (IRT) models for modeling the student's knowledge
Data consists of responses (correct/incorrect) for 400 students across 70 problems, where a student performs ~33 problems on average
- implemented an EM algorithm to learn the parameters of the IRT model
- cross-validated results indicate the model can predict with 72% accuracy how the student will perform on each problem
- algorithms can be used online to estimate a student's ability while interacting with the tutor
- currently working on an extension of the IRT model to include information relevant to a student's motivation (time spent on problem, number of hints requested)

Low Student Motivation
Example: Actual data from a student performing 12 problems (green = correct, red = incorrect)
Problems are of roughly equal difficulty
Student appears to perform well in beginning and worse toward the end
Conclusion: The student's proficiency is average
[Figure: responses to problems 1-12]

Low Student Motivation
Conclusion: Poor performance on the last five problems is due to low motivation (not proficiency)
[Figure: time (s) to first response for problems 1-12]
Student is unmotivated. Use observed data to infer motivation!

Low Student Motivation
Opportunity for intelligent tutoring systems to improve student learning by addressing motivation
This issue is being dealt with on a larger scale by the educational assessment community
Wise & DeMars 2005. Low Examinee Effort in Low-Stakes Assessment: Potential Problems and Solutions. Educational Assessment.

Hidden Markov Model (HMM)
An HMM is used to capture a student's changing behavior (level of motivation)
[Diagram: hidden nodes M_1, M_2, ..., M_n with observed nodes H_1, H_2, ..., H_n]

M_i (hidden): H_i (observed)
Unmotivated-Hint: time to first response < t_min AND number of hints before correct response > h_max
Unmotivated-Guess: time to first response < t_min AND number of hints before correct response < h_min
Motivated: if the other two cases don't apply
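The three observation rules can be read as a small decision function. The sketch below assumes the thresholds t_min, h_min and h_max are supplied by the caller; the slide gives no values for them, and the function name and the numbers in the example are hypothetical.

```python
def observed_behavior(time_to_first_response, hints_before_correct,
                      t_min, h_min, h_max):
    """Map one problem attempt to the observed HMM symbol H_i."""
    fast = time_to_first_response < t_min
    if fast and hints_before_correct > h_max:
        return "unmotivated-hint"    # answered quickly after many hints
    if fast and hints_before_correct < h_min:
        return "unmotivated-guess"   # answered quickly with almost no hints
    return "motivated"               # neither of the two cases applies

# Hypothetical thresholds: 5 seconds, fewer than 1 hint, more than 3 hints.
label = observed_behavior(2.0, 4, t_min=5.0, h_min=1, h_max=3)
print(label)
```

A fast response with a hint count between h_min and h_max falls through to "motivated", as in the table's catch-all row.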

New edges (in red) change the conditional probability of a student's response: $P(U_i \mid \theta, M_i)$
[Diagram: the HMM nodes M_1, ..., M_n and H_1, ..., H_n, with response nodes U_1, ..., U_n and ability θ]
Motivation (M_i) affects student response (U_i)

Parameter Estimation
Uses an Expectation-Maximization algorithm to estimate parameters
M-step is iterative, similar to the Iteratively Reweighted Least Squares (IRLS) algorithm
Model consists of discrete and continuous variables
Integral for the continuous variable is approximated using a quadrature technique
Only parameters not estimated:
$P(U_i \mid \theta, M_i = \text{unmotivated-guess}) = 0.2$
$P(U_i \mid \theta, M_i = \text{unmotivated-hint}) = 0.02$
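As a rough illustration of how motivation changes the response probability, the sketch below uses the two fixed values from the slide and falls back to an IRT curve for the motivated state. Two things are assumptions: that the fixed values are probabilities of a correct response, and that the IRT curve is a two-parameter logistic with hypothetical item parameters `a` (discrimination) and `b` (difficulty). The slide does not state which IRT form was used.

```python
import math

def p_correct(theta, motivation, a=1.0, b=0.0):
    """P(U_i = correct | theta, M_i) under the assumptions stated above."""
    if motivation == "unmotivated-guess":
        return 0.2     # fixed on the slide, not estimated
    if motivation == "unmotivated-hint":
        return 0.02    # fixed on the slide, not estimated
    # Motivated: assumed two-parameter logistic IRT response curve.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

print(p_correct(0.0, "motivated"))
print(p_correct(2.0, "unmotivated-guess"))
```

In the unmotivated states the probability ignores theta, which is why a wrong answer in those states does not pull the ability estimate down.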

Modeling Ability and Motivation
Combined model does not decrease the ability estimate when the student is unmotivated!
Combined model separates ability from motivation (IRT model lumps them together)

Experiments
Data: 400 high school students, 70 problems, a student finished 32 problems on average
Train the Model
Estimate parameters
Test the Model
For each student, for each problem:
Estimate θ and $P(M_i)$ via maximum likelihood
Predict $P(M_{i+1})$ given HMM dynamics
Predict $U_{i+1}$. Does it match actual $U_{i+1}$?
Compare combined model vs. just an IRT model
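The step "predict P(M_{i+1}) given HMM dynamics" is a one-step propagation of the current belief through the transition matrix. A minimal sketch follows; the transition probabilities are invented for illustration and are not the values learned in the study.

```python
def predict_next(belief, transition):
    """One-step HMM prediction: P(M_{i+1}=k) = sum_j P(M_i=j) * P(M_{i+1}=k | M_i=j)."""
    n = len(belief)
    return [sum(belief[j] * transition[j][k] for j in range(n)) for k in range(n)]

# States in order: motivated, unmotivated-guess, unmotivated-hint.
# Hypothetical transition matrix; each row sums to 1.
T = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
]
belief = [0.5, 0.5, 0.0]
next_belief = predict_next(belief, T)
print(next_belief)
```

Weighting a per-state response probability by `next_belief` would then give the predicted probability of a correct answer on the next problem.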

Results
Combined model achieved 72.5% cross-validation accuracy versus 72.0% for the IRT model
Gap is not statistically significant
Opportunities for improving the accuracy of the combined model
Longer sequences (per student)
Better model of the dynamics, $P(M_{i+1} \mid M_i)$

Conclusions
Proposed a new, flexible model to jointly estimate student motivation and ability
Not separating ability from motivation conflates the two concepts
Easily adjusted for other tutoring systems
Combined model achieved similar accuracy to IRT model
Online inference in real time
Implemented in Java; ran it in one high school in May '06

Agenda: Introduction; Model Student Emotion; Model the Domain; Personalize Tutoring; Assess Learning

Sensors used in the classroom
Bayesian networks and linear regression models

Linear Models to Predict Emotions
Variables that help predict self-report of emotions. The result suggests that emotion depends on the context in which the emotion occurs (math problem just solved) and also can be predicted from physiological activity captured by the sensors (bottom row).

Agenda: Introduction; Model the Student; Model the Domain; Personalize Tutoring; Assess Learning; Intelligent Tutoring Systems; Learning @ Scale

Domain Model
The Andes Bayesian network before (left) and after (right) the observation A-is-a-body. Kurt VanLehn.

Domain Model
Student actions (left) and the self-explanation model (right). The physics problem asks the student to find the tension force exerted on a person hanging by a rope tied to his waist. Assume the midshipman was named Jake.

Stephens, 2006

Agenda: Introduction; Model the Student; Model the Domain; Personalize Tutoring; Assess Learning

Predicting Student Time To Complete
Two agents were built to predict student time to solve problems (Beck et al., 2000).
1) Population student model (PSM): responsible for modeling how students interacted with the tutor, based on data from the entire population of users. Its inputs were characteristics of the student and information about the problem to be solved; its output was the expected time (in seconds) the student would need to solve that problem.
2) Pedagogical agent (PA): responsible for constructing a teaching policy. It was a reinforcement learning agent that reasoned about a student's knowledge and provided customized examples and hints tailored for each student (Beck and Woolf, 2001; Beck et al., 1999a, 2000).

Overview of the ADVISOR machine learning component in AnimalWatch.
The tutor predicted a current student's reaction to a variety of teaching actions, such as presentation of a specific problem type. (Beck et al., 2000)

ADVISOR predicted student response time using its population student model.
Its predictions accounted for roughly 50% of the variance in the actual time students spent solving a problem. (Beck et al., 2000)

Cycle Network
Cycle network in DT Tutor. The network is rolled out to three time periods representing current, possible, and projected student actions. (From Murray et al., 2004.)

Models being Evaluated
Few issues to solve
Sarah Schultz, WPI
Which model, learned over data, helps predict future performance best?

Problem Selection Within a Topic
Arroyo et al. EDM Journal effort.

Pedagogical Moves: Dynamically adjusted
Empirical-based estimates of effort lead to adjusted problem difficulty and other affective and meta-cognitive feedback

What is normal behavior?
In EACH problem $p_i$, $i = 1, \dots, n$, where $n$ = total problems in system
Looking across the whole population of students who used a problem
[Figure: histograms of incorrect attempts, hints, and time (each bar = 5 seconds), marked with the expected values $E(I_i)$, $E(H_i)$, $E(T_i)$ and the low/high margins $\delta_{IL}$, $\delta_{IH}$, $\delta_{HL}$, $\delta_{HH}$, $\delta_{TL}$, $\delta_{TH}$; the band between the margins is "within expected behavior"]
A new student encounters this problem
Is their behavior within expectation, or atypical?

What is odd behavior?
In any problem $p_i$, $i = 1, \dots, n$, where $n$ = total problems in system
Odd behavior:
Few incorrect attempts: Attempts $< E(I_i) - \delta_{IL}$
Lots of hints: Hints $> E(H_i) + \delta_{HH}$
Little time: Time $< E(T_i) - \delta_{TL}$
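The three comparisons on the "odd behavior" slide can be checked directly once the per-problem expectations and margins are known. The following sketch assumes they are stored in a dictionary with hypothetical keys; the numeric values are invented for illustration.

```python
def behavior_flags(attempts, hints, time_s, norms):
    """Compare one student's behavior on a problem with the population norms."""
    return {
        "few_incorrect_attempts": attempts < norms["E_I"] - norms["d_IL"],
        "lots_of_hints": hints > norms["E_H"] + norms["d_HH"],
        "little_time": time_s < norms["E_T"] - norms["d_TL"],
    }

# Hypothetical norms for one problem p_i (expected values and margins).
norms = {"E_I": 2.0, "d_IL": 1.0, "E_H": 3.0, "d_HH": 2.0, "E_T": 40.0, "d_TL": 15.0}
flags = behavior_flags(attempts=0, hints=6, time_s=10.0, norms=norms)
print(flags)
```

When all three flags are true the student got through the problem with few attempts, many hints and little time, which is the odd pattern the slide describes.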

Increasing Problem Difficulty
At the next time step. Assume we know problem difficulty of items.
H = sorted list of harder math problems, from the easiest (just above the last problem seen) to the hardest of all, with m items.
$$\mathrm{Harder}(H[1..m], \gamma) = H\left[\left\lceil \frac{m}{\gamma} \right\rceil\right]$$
Parameter γ = 3 --> challenge rate

Decreasing Problem Difficulty
At the next time step. Assume we know problem difficulty of items.
E = sorted list of easier math problems, from the easiest of all to the hardest (just below the last problem seen), with n items.
$$\mathrm{Easier}(E[1..n], \gamma) = E\left[\left\lceil n - \frac{n}{\gamma} \right\rceil\right]$$
Parameter γ = 3
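A small sketch of the two selection rules, reading them as H[ceil(m/γ)] and E[ceil(n - n/γ)] with 1-based indexing. The function names and the example lists are hypothetical, and the guard on γ is an assumption added so that the index stays inside the list.

```python
import math

def harder(H, gamma=3):
    """Pick from H (harder problems, sorted easiest to hardest): H[ceil(m / gamma)]."""
    if not H or gamma < 1:
        raise ValueError("need a non-empty list and gamma >= 1")
    return H[math.ceil(len(H) / gamma) - 1]   # -1 converts to 0-based indexing

def easier(E, gamma=3):
    """Pick from E (easier problems, sorted easiest to hardest): E[ceil(n - n / gamma)]."""
    if not E or gamma <= 1:
        raise ValueError("need a non-empty list and gamma > 1")
    n = len(E)
    return E[math.ceil(n - n / gamma) - 1]

# Problems are represented here by their rank within each sorted list.
H = [1, 2, 3, 4, 5, 6, 7, 8, 9]
E = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(harder(H), easier(E))
```

With γ = 3 both rules move one third of the way into the list, measured from the end nearest the last problem seen. A larger γ gives smaller steps.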

Agenda: Introduction; Model the Student; Model the Domain; Personalize Tutoring; Assess Learning; Learning @ Scale

Stanford's computer science course
Machine learning techniques were used to autonomously create a graphical model of how students in an introductory programming course progress through the homework assignment.
Machine learning algorithms found patterns in how students solved the Checkerboard Karel problem. These patterns were more informative at predicting how well students would perform on the class midterm than the grades students received on the assignment. The algorithm captured a meaningful general trend about how students were solving programming problems.
Piech, C., Sahami, M., Koller, D., Cooper, S., & Blikstein, P. (2012, February). Modeling how students learn to program. In Proceedings of the 43rd ACM Technical Symposium on Computer Science Education (pp. 153-160). ACM.

Student Modeling in Computer Programming
Bag of Words Difference: Researchers first built histograms of the different keywords used in a computer program and used the Euclidean distance between two histograms as a naïve measure of the dissimilarity. This is akin to distance measures of text commonly used in information retrieval systems.
Application Program Interface (API) Call Dissimilarity: They ran each program with standard inputs and recorded the resulting sequence of API calls. They used Needleman-Wunsch global DNA alignment to measure the difference between the lists of API calls generated by the two programs.
(Piech et al., 2012)

Hidden Markov Model
The first step in their student modeling process was to learn a high-level representation of how each student progressed through the Checkerboard Karel assignment. To learn this representation they modeled a student's progress as a Hidden Markov Model (HMM).
Learning an HMM. Each state from the HMM becomes a node in the FSM, and the weight of a directed edge from one node to another provides the probability of transitioning from one state to the next.
The program's Hidden Markov Model of state transitions for a given student. The node "code_t" denotes the code snapshot of the student at time t, and the node "state_t" denotes the high-level milestone that the student is in at time t. N is the number of snapshots for the student.
(Piech et al., 2012)
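The bag-of-words difference can be illustrated with a few lines of code. This sketch counts every identifier-like token as a "keyword", which is an assumption; the tokenization used in the study is not given on the slide, and the two Karel-style snippets are invented.

```python
import math
import re
from collections import Counter

def keyword_histogram(source):
    """Count identifier-like tokens in a program (assumed tokenization)."""
    return Counter(re.findall(r"[A-Za-z_]\w*", source))

def bow_distance(program_a, program_b):
    """Euclidean distance between the two keyword histograms."""
    ha, hb = keyword_histogram(program_a), keyword_histogram(program_b)
    # A Counter returns 0 for tokens that are missing from one program.
    return math.sqrt(sum((ha[w] - hb[w]) ** 2 for w in set(ha) | set(hb)))

# Two invented snapshots of student code.
a = "move(); move(); turnLeft();"
b = "move(); turnLeft(); turnLeft();"
d = bow_distance(a, b)
print(d)
```

The measure ignores token order, which is one reason the slide calls it naïve and pairs it with an alignment of API call sequences.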

Dissimilarity Matrix
Dissimilarity matrix for clustering of 2000 snapshots. Each row and column in the matrix represents a snapshot and the entry at row i, column j represents how similar snapshot i and j are (dark means more similar).
Clustering on a sample of 2000 random snapshots from the training set returned a group of well-defined snapshot clusters (see figure 2). The value of K that maximized silhouette score (a measure of how natural the clustering was) was 26 clusters. A visual inspection of these clusters confirmed that snapshots which clustered together were functionally similar pieces of code.
(Piech et al., 2012)

The finite set of high-level states or milestones that a student could be in. A state is defined by a set of snapshots where all the snapshots in the set came from the same milestone.
The transition probability, of being in a state given the state the student was in in the previous unit of time.
The emission probability, of seeing a specific snapshot given that you are in a particular state. To calculate the emission probability we interpreted each of the states as emitting snapshots with normally distributed dissimilarities. In other words, given the dissimilarity between a particular snapshot of student code and a state's "representative" snapshot, we can calculate the probability that the student snapshot came from a given state using a normal distribution based on the dissimilarity.
(Piech et al., 2012)
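One way to read the emission model is as a normal density evaluated at the dissimilarity between a snapshot and each state's representative snapshot. The sketch below assumes a zero-mean normal with a shared, hypothetical standard deviation `sigma`; the slide does not give the distribution's parameters, and the dissimilarity values are invented.

```python
import math

def emission_density(dissimilarity, sigma=1.0):
    """Zero-mean normal density at the given dissimilarity (assumed form)."""
    z = dissimilarity / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def most_likely_state(dissimilarities, sigma=1.0):
    """Index of the state whose representative snapshot is most likely to have emitted the code."""
    return max(range(len(dissimilarities)),
               key=lambda s: emission_density(dissimilarities[s], sigma))

# Invented dissimilarities between one snapshot and three state representatives.
d_to_states = [2.0, 0.5, 3.0]
best = most_likely_state(d_to_states)
print(best, emission_density(0.5))
```

Under this reading a smaller dissimilarity always gives a larger emission probability, so the snapshot is attributed to the state it most resembles unless the transition probabilities say otherwise.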

Stanford's MOOC: teaching machine learning topics
Huang, J., Piech, C., Nguyen, A., & Guibas, L. (2013, June). Syntactic and functional variability of a million code submissions in a machine learning MOOC. In AIED 2013 Workshops Proceedings Volume (p. 25).
The landscape of solutions for "gradient descent for linear regression", representing over 40,000 student code submissions, with edges drawn between syntactically similar submissions and colors corresponding to performance on a battery of unit tests (red submissions passed all unit tests).

Hour of Code Challenge: Modeling How Young Students Learn to Program

Correct Answer
Node: unique partial solution.
Arc: next solution an expert would recommend.
Code.org problem solving graph of learned policy for how to solve a single open-ended programming assignment from over 1M users. Each node is a unique partial solution. The node 0 is the correct answer.
Chris Piech, Stanford Ph.D. student

Improved Retention
Code.org gathered over 137 million partial solutions. Not all students made it through the entire Hour of Code, but retention was quite high relative to other contemporary open access courses.
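The graph on this slide can be thought of as a policy that maps each partial solution to the next one an expert would recommend, with node 0 as the correct answer. The following sketch walks such a policy; the node numbers other than 0 and the mapping itself are invented for illustration.

```python
def path_to_solution(policy, start, goal=0):
    """Follow recommended next-solution arcs from `start` until the correct answer."""
    path, seen = [start], {start}
    while path[-1] != goal:
        nxt = policy[path[-1]]      # KeyError if a node has no recommendation
        if nxt in seen:
            raise ValueError("policy contains a cycle")
        path.append(nxt)
        seen.add(nxt)
    return path

# Hypothetical policy: partial solution id -> recommended next partial solution id.
policy = {3: 2, 2: 1, 1: 0, 7: 2}
hint_path = path_to_solution(policy, start=7)
print(hint_path)
```

A tutor could present the first step of the returned path as a hint for a student who is currently at the starting partial solution.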

63K Peer Grading for 7K students
Blue Blob: Student A
Red Squares: Students who graded Student A
Red Circle: Students who were graded by Student A
A Coursera course to teach HCI. Peer grading network of 63K peer grades for 7K students. A single student is highlighted; red squares graded the student, red circles were graded by the student.
Chris Piech, Stanford Ph.D. student

Squares: Questions
Circles: Concepts
Edges: Strong Question-Concept Relationship
Lan, A. S., Studer, C., Waters, A. E., & Baraniuk, R. G. (2013). Joint topic modeling and factor analysis of textual information and graded response data. arXiv preprint arXiv:1305.1956.

Agenda: Introduction; Model the Student; Model the Domain; Personalize Tutoring; Assess Learning; Intelligent Tutoring Systems; Learning @ Scale

Long term goal
Millions of schoolchildren will have access to what Alexander the Great enjoyed as a royal prerogative: the personal services of a tutor as well informed as Aristotle.
Students will have instant access to vast stores of knowledge through their computerized tutors.
Pat Suppes, Stanford University, 1966 (died Nov 2014)

Learning to Teach: Machine Learning Techniques to Improve Instruction
Thank You! Any Questions?
NIPS 2015 Workshop on Human Propelled Machine Learning, Dec 13, 2014