Crowdsourced Code Review in Programming Classes

Similar documents
The single biggest mistake many people make when starting a business is they'll create a product...

Outline. 1 Denitions. 2 Principles. 4 Implementation and Evaluation. 5 Debugging. 6 References

Draft report of the Survey of Participants in Stage 3 of the Workflow Pilot Non MEs users

Generating Marketing Leads Via Marketing Forums

Waitlist Chat Transcript - 04/04/13 10 AM

The Rules 1. One level of indentation per method 2. Don t use the ELSE keyword 3. Wrap all primitives and Strings

Example s for collecting testimonials

TIME MANAGEMENT FOR COLLEGE STUDENTS

>> My name is Danielle Anguiano and I am a tutor of the Writing Center which is just outside these doors within the Student Learning Center.

Introduction to Open Atrium s workflow

Warmest Regards, Josh Nelson President PlumberSEO Toll Free:

Create an anonymous public survey for SharePoint in Office Ted Green, SharePoint Architect

ERAS Online Q&A. Sections. September 8, 2014

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Transcript - Episode 2: When Corporate Culture Threatens Data Security

User Testing for Pre-Phase 2, Search 360 Janifer Holt and Barbara DeFelice

John Smith. Your feedback report and personal development plan. Your results Pages 2-5. Your personal development plan Pages 6-7

Using our Club website to manage team Practice Schedules

How To Increase Your Odds Of Winning Scratch-Off Lottery Tickets!

Invitation Scripts Setting an Appointment by Text Messaging (Document 8 of 11)

Module 6.3 Client Catcher The Sequence (Already Buying Leads)

Fast Analytics on Big Data with H20

Demystifying Quality Score with Brad Geddes

Computer Programming I

Feature Factory: A Crowd Sourced Approach to Variable Discovery From Linked Data

Course Content Concepts

CS 2112 Spring Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

Simple Document Management Using VFP, Part 1 Russell Campbell russcampbell@interthink.com

Agile Testing: The Agile Test Automation Pyramid

5. At the Windows Component panel, select the Internet Information Services (IIS) checkbox, and then hit Next.

Tips for Taking Online Classes. Student Success Workshop

PHPM 631 Health Information Management Systems

Response Rates in Online Teaching Evaluation Systems

These Two Words Just Made Us 37% In 3 Months. "These Two Words. Just Made Us 37% In 3 Months"

DESCRIBING OUR COMPETENCIES. new thinking at work

QUESTION: How has the class helped you grow/change as a communicator in the world of work?

EMPLOYEE JOB IMPROVEMENT PLANS. This Employee Job Improvement Plan designed by Kielley Management Consultants achieves results because:

AT&T Voice DNA User Guide

A: We really embarrassed ourselves last night at that business function.

CRM for the Independent Software Developer

GET 114 Computer Programming Course Outline. Contact: Office Hours: TBD (or by appointment)

The material is written and facilitated by a training team that has current experience in integral mission.

CHAT SESSION WITH ACCA F8 TUTOR

How to Overcome the Top Ten Objections in Credit Card Processing

Set internet safety parental controls with Windows

Not agree with bug 3, precision actually was. 8,5 not set in the code. Not agree with bug 3, precision actually was

Using the For Each Statement to 'loop' through the elements of an Array

Developer Workshop Marc Dumontier McMaster/OSCAR-EMR

RBS NatWest Clear Rate and Cashback Plus Credit Cards Three Star Fairbanking Mark Research

Coverity White Paper. Effective Management of Static Analysis Vulnerabilities and Defects

So with no further ado, welcome back to the podcast series, Robert. Glad to have you today.

Integrate your tools to help integrate your stakeholders

Cell Phone Hijacking. >> After four months of harassing phone calls, Heather and Courtney Kuykendall were afraid to

IBM Watson Ecosystem. Getting Started Guide

Extreme programming (XP) is an engineering methodology consisting of practices that ensure top-quality, focused code. XP begins with four values:

ECDL. European Computer Driving Licence. Database Software BCS ITQ Level 1. Syllabus Version 1.0

Text of Templates

Liferay Portal Performance. Benchmark Study of Liferay Portal Enterprise Edition

CSE 308. Coding Conventions. Reference

Tracking Referrals. Creating a List of Referral Sources. Using the Sales Rep List

PostgreSQL Concurrency Issues

Spring 2013 Structured Learning Assistance (SLA) Program Evaluation Results

Why do you want to launch a business analyst career? Some possibilities include:

Customer Market Research Primer

Will Dormann: Sure. Fuzz testing is a way of testing an application in a way that you want to actually break the program.

Penetration Testing Walkthrough

TIE-PROJ, Project Management workshop questions from ten groups:

Getting Started with Kanban Paul Klipp

How to Perform a Break-Even Analysis in a Retail Store A Step by Step Guide

CollabNet TeamForge 5.3. User Guide

Android Programming Family Fun Day using AppInventor

My DevOps Journey by Billy Foss, Engineering Services Architect, CA Technologies

Power, Influence, and Negotiation MIT Sloan EMBA Class of 2015

Basic info Course: CS 165 Accelerated Introduction to Computer Science Credits: 8 Instructor: Tim Alcon timothy.alcon@oregonstate.

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

Instructions for applying data validation(s) to data fields in Microsoft Excel

Templates for Customer Service Backoffice

TSG Quick Reference Guide to Agile Development & Testing Enabling Successful Business Outcomes

Sibyl: a system for large scale machine learning

Google Resume Search Engine Optimization (SEO) and. LinkedIn Profile Search Engine Optimization (SEO) Jonathan Duarte

Profiles of Mechanical Engineers

Next Generation Tech-Talk. Cloud Based Business Collaboration with Cisco Spark

MIT TechCASH Virtual Chat with John McDonald, MIT TechCASH Monday, August 19, :30 p.m. 1:30 p.m. (EST)

Oxford Learning Institute University of Oxford

SyncTool for InterSystems Caché and Ensemble.

Visual Basic 6 Error Handling

Transcription:

Crowdsourced Code Review in Programming Classes Rob Miller MIT CSAIL joint work with Mason Tang, Elena Tatarchenko, Max Goldman, Joe Henke 1

Problem: Feedback about Coding Style MIT 6.005 Software Construction foundation-level programming course (replaced 6.001/6.170) 400 students per year, mostly sophomores Students write lots of code roughly 10kloc in problem sets and projects Automatic grading is necessary but not sufficient correct but confusing correct and understandable we need human readers, and we want line-by-line feedback 2

Approach: Crowd-Driven Code Review Chop up student programs into chunks Review the chunks by a mixed crowd: students, staff, alums staff alumni students Anticipated benefits faster, cheaper, more diverse comments give practice with code reviewing (a widespread industry practice) expose to good and bad solutions reduce workload on teaching staff incorporate alumni back into the course Not using for grading... yet 3

Outline Introduction System Caesar code-reviewing system Process How we use Caesar in a programming class Experience Some results from initial deployment last fall & spring Looking ahead Issues to address Opportunities for on-campus and online 4

Caesar: Divide & Conquer programs chopped into chunks each chunk assigned to multiple reviewers 5

Social Reviewing automatic style checker comments reviewers can see whole program (not just chunk) if needed reviewer comments upvotes & downvotes replies & discussion reviewers have a reputation (#upvotes, + 100 if they re alums or staff of the course) 6

Code Chopping Chop each submission into chunks We ve tried two sizes: method (~20 loc) and class (~150 loc) Sort chunks Prioritize classes marked important by staff Diff against staff-provided code and prioritize chunks with more student-authored lines Deprioritize test code Cluster chunks using MOSS fingerprinting Not all chunks get assigned for review Typically 30% So prioritization is important 7

Task Routing Lazy assignment wait until a reviewer logs in before assigning tasks to them, since a given alum/student may or may not show up Multiple reviewers per chunk assign 2 students/alums to a chunk to encourage discussion assign 1 staff to the chunk too, for reviewing the reviews Fairness to code author balance #reviewed chunks per submission Diversity of learning opportunities maximize differences between reviewers on same chunk & same submission role (student, alum, staff) reputation score # times assigned to a chunk together 8

Privacy vs. Visibility Code author is anonymous All reviewers are identified & their reviews are browsable 9

Process Assignment due on Thurs Students review Fri/Sat 10 chunks; must comment or vote on each assignment is still fresh in their minds, curious about other s solutions Staff reviews Sun 30 chunks; mostly voting Alums review whenever (Fri-Wed) 3-5 chunks; must comment or vote Revise & resubmit assignment by Thurs staff alumni students Main motivation is to fix bugs that lost points in autograding, but must address code review comments as well 10

Experience Fall 2011 180 students 54 alums 15 staff Spring 2012 215 students 0 alums 17 staff PS0 PS1 PS2 PS3 PS4 PS5 PS6 PS7 PS0 PS1 PS2 PS3 PS4 13 problem sets, 2.2k submissions 21.5k comments 5% alums 16.2% upvoted 8% staff 0.7% downvoted 87% students 9.6 comments per submission 60m 50m 40m 30m 20m 10m average time students spent reviewing PS0 PS1 PS2 PS3 PS4 PS5 PS6 PS7 11

Kinds of Comments Bug Clarity Performance Simplicity Style Learning Positive ~13% < 1% LGTM Meta Does the message already include I don't really understand what you're "\n"? If so, this is fine. If not, it should trying nice to use test of for comments here - why the 20? be This added is nice here. and concise. (I didn't This Small could comments be implemented would help within tell what know you could iterate through an the next you're You I like for don't doing. how loop necessarily you for a have faster assert need algorithm this array like this in a for loop) You variable. statements should put sprinkled this comment throughout above the line, I your think otherwise test: this Great code it runs way could off have checking the page. been This is interesting. Why do you more where simpler things are without failing. using so many if store all the messages you statements. For example, you could looks send/receive good to me in a log? have Thank divided you for the making case this in which more the Code author: For debugging. first than operand 1 line :) it's is not much a scalar more or not. It Test looks The fine log adds time stamps, is readable. hard to read. which help a lot for debugging Two of my files to review are Main :[ yes, this concurrency is all correct problems. I don't think I should be reviewing this method. I think its a problem with caesar. 12

Interaction Happened 16% of comments were upvoted 22% of comments were in conversations (threads with at least 2 human comments) Corrections Clarifications Author responses student: Prefer equalsignorecase because it says to ignore capitalization student: student: (Not The 100% name sure = name.touppercase on this, but...) if you line tell seems it to roll to three take care days of from capitalization. say Sep. 30, won't it give you Sep. 3? student: Agreed. I think the desired method here is add, not roll. student: what does this method do? code author: Essentially assertequals. As someone else pointed out, this should have been an assert array equals, so that will happen in the future. 13

Alums Have Different Context alum: avoid abbreviating variable names... 'hi' would be especially confusing to someone who isn't familiar with english student: Idk about this one though. I've seen lo and hi at various places. I'd say this is fine. Though, the integer N should be lower case, just to conform with the java naming convention. alum: When you're in industry and working on code with people who aren't necessarily familiar with English... it's a problem to use phonetic abbreviations. There is really no good reason to abbreviate variable names especially since IDEs autocomplete. student: For that, yes, I agree. Sadly, some IDEs don't autocomplete very well. *cough* code author: These were the variables that were given to us 14

Reviewers Repeat Themselves staff: A magic number is a some fixed constant (numbers, Strings, etc.) used in a place where variables would likely add to code clarity or understanding... staff: Magic numbers are simply primitives that are used without having a descriptive variable name... alum: Returning null is usually dangerous, especially if your Javadoc doesn't mention that it might return null in some cases... alum: Don't return null in exception cases, especially if your Javadoc's @return line doesn't mention the possibility of returning null (even though it's mentioned earlier)... student: Should include more in rep invariant. ie a digit is allowed only once per row, column, and box student: Should include more in rep invariant. ie a digit is allowed only once per row, column, and box 15

Some Code Authors Are Self-Conscious 16

Survey Responses Learning I thought it was helpful. There are more things I'll be looking for when I code because I saw them in other people's code and people commented about it. I like how reviewing the code after we've submitted it allows us to retain the information and concepts for a longer time. Self-doubt I think I have more to learn than to offer, at this point though. I was glad that the code I was assigned were on the parts that I had understood in my submission, so I could give better feedback. 17

Survey Responses Skepticism Meh. I don't think this will have a major impact on students' coding. No; I'm fixing code instead of learning. Keen to give and get more Is it alright to review more than what is assigned on Caesar? I really like reading and reviewing other people's code, but don't know if over-reviewing would be frowned on. I'm thrilled to get 51 human comments, but they are ALL, somehow, on tests. Scary Very good UI. I enjoy commenting on code. I feel like the Simon Cowell of code review. 18

Related Work ColorMyGraph [Blank, Sutner, von Ahn] crowd grading of proofs Github social commenting on code commits Code Review StackExchange threaded Q&A about short code chunks 19

Looking Ahead System changes Capturing common advice in a wiki Increasing reviewer leverage: searching for similar code Promoting good comments for all users to read Alumni engagement: recruiting, social networking, reputation management Other possibilities Combining automatic grading (e.g. online tutors) with crowddriven human feedback System is open source http://github.com/uid/caesar-web 20