On comparing web development platforms

Similar documents

What is a programming language?

Attack Frameworks and Tools

Agile Offsharing: Using Pair Work to Overcome Nearshoring Difficulties

Agreement on. Dual Degree Master Program in Computer Science KAIST. Technische Universität Berlin

Introduction to Geventis. Registration for the MIN Graduate School (MINGS)

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

HOW TO SUCCESSFULLY USE SOFTWARE PROJECT SIMULATION FOR EDUCATING SOFTWARE PROJECT MANAGERS

Guideline for stresstest Page 1 of 6. Stress test

Chapter 24 - Quality Management. Lecture 1. Chapter 24 Quality management

Jonathan Worthington Scarborough Linux User Group

White Paper. Java versus Ruby Frameworks in Practice STATE OF THE ART SOFTWARE DEVELOPMENT 1

The Challenge of Productivity Measurement

Introducing DocumentDB

Software Engineering, Not Computer Science. A scientist builds in order to learn; an engineer learns in order to build.

Outline. hardware components programming environments. installing Python executing Python code. decimal and binary notations running Sage

Chapter 13 Computer Programs and Programming Languages. Discovering Computers Your Interactive Guide to the Digital World

1. Overview of the Java Language

Programming Languages

How To Understand And Understand The Security Of A Web Browser (For Web Users)

Instructional Design Framework CSE: Unit 1 Lesson 1

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Programming Languages & Tools

THE CHALLENGE OF ADMINISTERING WEBSITES OR APPLICATIONS THAT REQUIRE 24/7 ACCESSIBILITY

CHANCE ENCOUNTERS. Making Sense of Hypothesis Tests. Howard Fincher. Learning Development Tutor. Upgrade Study Advice Service

About the Return on Investment of Test-Driven Development

STUDY AND ANALYSIS OF AUTOMATION TESTING TECHNIQUES

Viewpoint. Choosing the right automation tool and framework is critical to project success. - Harsh Bajaj, Technical Test Lead ECSIVS, Infosys

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

TATJA: A Test Automation Tool for Java Applets

C. Wohlin, "Is Prior Knowledge of a Programming Language Important for Software Quality?", Proceedings 1st International Symposium on Empirical

PROJECT MANAGEMENT SYSTEM

Slides from INF3331 lectures - web programming in Python

Do Programming Languages Affect Productivity? A Case Study Using Data from Open Source Projects

Functional Modelling in secondary schools using spreadsheets

How To Calculate Max Likelihood For A Software Reliability Model

CSE 307: Principles of Programming Languages

Open-source content management. systems. Open-source content management is a viable option FEBRUARY Open-source strengths

CS514: Intermediate Course in Computer Systems

Data processing goes big

Application Development for Mobile and Ubiquitous Computing

Using Ruby on Rails for Web Development. Introduction Guide to Ruby on Rails: An extensive roundup of 100 Ultimate Resources

CREDENTIALS & CERTIFICATIONS 2015

Protecting Database Centric Web Services against SQL/XPath Injection Attacks

Agent Languages. Overview. Requirements. Java. Tcl/Tk. Telescript. Evaluation. Artificial Intelligence Intelligent Agents

Real Time Market Data and Trade Execution with R

A Comparison of Programming Languages for Graphical User Interface Programming

Moving from ISO9000 to the Higher Levels of the Capability Maturity Model (CMM)

Software Engineering Techniques

Structure of Presentation. Stages in Teaching Formal Methods. Motivation (1) Motivation (2) The Scope of Formal Methods (1)

Database Management System Choices. Introduction To Database Systems CSE 373 Spring 2013

SOFTWARE VALUE ENGINEERING IN DEVELOPMENT PROCESS

Subject Examination and Academic Regulations for the Research on Teaching and Learning Master s Programme at the Technische Universität München

How To Understand The History Of The Web (Web)

Seamless Method- and Model-based Software and Systems Engineering

Software Development for Multiple OEMs Using Tool Configured Middleware for CAN Communication

Comparing Native Apps with HTML5:

Business & Computing Examinations (BCE) LONDON (UK)

Develop PHP mobile apps with Zend Framework

CSC 180 H1F Algorithm Runtime Analysis Lecture Notes Fall 2015

Accelerating Wordpress for Pagerank and Profit

Developing in the MDA Object Management Group Page 1

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

AP Physics 1 and 2 Lab Investigations

22C:22 (CS:2820) Object-Oriented Software Development

The Research Bulletin of Jordan ACM, Volume II(III) P a g e 127

Here s how to choose the right mobile app for you.

CS 40 Computing for the Web

1 Business Modeling. 1.1 Event-driven Process Chain (EPC) Seite 2

Report - Marking Scheme

Intro to scientific programming (with Python) Pietro Berkes, Brandeis University

Integrated version control with Fossil SCM

Benefits of Test Automation for Agile Testing

The link between Research and Higher Education

Transcription:

Seminar "Beiträge zum Software Engineering" On comparing web development platforms Lutz Prechelt Freie Universität Berlin, Institut für Informatik http://www.inf.fu-berlin.de/inst/ag-se/ Lutz Prechelt, prechelt@inf.fu-berlin.de 1/ 33

A class of research questions An eternal research question in software engineering: How do alternative methods, processes, technologies, notations, tools etc. compare? With respect to ease of use, productivity, error-proneness, quality of results, learnability etc. Applicable to (for instance): design methods design notations development process models testing methods programming languages middleware technologies etc. Lutz Prechelt, prechelt@inf.fu-berlin.de 2/ 33

Example: Comparing programming languages For instance, assume we would like to compare a number of programming languages Questions: How easily are they learned? How productive are they? (time for solving a given problem) How fast do the programs run? How much memory do they need? How many defects do they contain? How reliably do they run? How robust are they? How safe are they? How secure? How long are they? How easy are they to understand, modify, test? and many more Lutz Prechelt, prechelt@inf.fu-berlin.de 3/ 33

Research method of choice: Controlled experiment In principle, the perfect research method for addressing these questionsisa controlled experiment: Change one thing (here: The language used) Keep everything else the same (this is what is called control) Observe what happens in each case i.e. write a program and investigate its properties Interpret the observations to answer your research questions Lutz Prechelt, prechelt@inf.fu-berlin.de 4/ 33

Research method problems: Generalizability No matter which task we choose for the program it may be more suitable for language A than for B but for a different program it could have been the other way round This means, we cannot guarantee that our results generalize to everything that can be done with the languages. There is no solution to this problem The best we could do is use several tasks, not just one. And spread them over the spectrum of possible tasks. We should choose a task that can be considered either "typical" or "interesting" Typical: Results describe frequent cases Interesting: Results describe cases with high uncertainty Lutz Prechelt, prechelt@inf.fu-berlin.de 5/ 33

Research method problems: Criteria needed We need operationalizations for the individual attributes targeted by our questions Repeatable procedures for obtaining an answer This is straightforward for some attributes, e.g.: How fast do they run? Choose an input, measure execution time in seconds It is very difficult for others, e.g.: How many defects do they contain? How safe are they? How secure? How easy are they to understand, modify, test? Lutz Prechelt, prechelt@inf.fu-berlin.de 6/ 33

Research method problems: Control The worst methodological problem in our case, however, is control: How to keep "everything else constant"? We would need to keep the programmer constant! That is obviously impossible: We cannot guarantee that somebody has equivalent knowledge of all languages The programmer learns each time when solving the problem. We cannot erase or reset this experience. So we need a replacement for true constancy. Lutz Prechelt, prechelt@inf.fu-berlin.de 7/ 33

Keeping the programmer "constant" Obviously, we need to use a different programmer for each language. How do we make them all the same? Method 1: Grow them Find a large-enough set of monozygotic multiples Make sure they have all had the same youth and education Make sure they had no programming training yet Train each in a different language Make sure each training has the same quality! Perform the experiment with them Make sure they are all in identically good condition on that day! This is practically impossible. Lutz Prechelt, prechelt@inf.fu-berlin.de 8/ 33

Keeping the programmer "constant" (2) Method 2: Use averaging Find a large-enough set of roughly similar programmers for each language Make sure none of these groups is inherently inferior to another, programming-wise Have each of these persons solve the task When evaluating the results, use the average of the results of each language And hope that all differences between the programmers cancel out within each group as well as across groups This is possible. We still need to operationalize "roughly similar" and find such Lutz Prechelt, prechelt@inf.fu-berlin.de 9/ 33

Variation among "roughly similar" programmers Each point represents one group (by task type): The worktime quotient of slowest to fastest quarter of participants Lutz Prechelt, prechelt@inf.fu-berlin.de 10 / 33

Variation among "roughly similar" programmers (2) Lutz Prechelt, prechelt@inf.fu-berlin.de 11 / 33

Problem of averaging Compare these interpersonal differences to the differences you would expect between languages With interpersonal variation this large, will we be able to (clearly) see the language differences at all? Conclusion: It is important to get either fairly similar programmers (not just roughly similar ones) to reduce the interpersonal variation or very many of them The precision of an estimate of the mean is stddev/sqrt(n) Lutz Prechelt, prechelt@inf.fu-berlin.de 12 / 33

Many or similar: When to use which We will now shortly review two different studies: One used many roughly similar programmers to compare programming languages The other (currently in planning) attempts to use fewer, but more similar programmers to compare web development platforms Lutz Prechelt, prechelt@inf.fu-berlin.de 13 / 33

Example 1: jccpprt The study jccpprt compares Java, C, C++, Perl, Python, Rexx, and Tcl. It is based on a precise specification of a simple string processing program 80 implementations were collected from 74 programmers Some in a controlled experiment on a different topic Others via a call in Usenets newsgroups Lutz Prechelt, prechelt@inf.fu-berlin.de 14 / 33

jccpprt task A fixed mapping from letters to digits is given: The program receives as input a "dictionary" (list of words) and a list of "phone numbers" and must produce as output all possible encodings of the phone numbers by words according to the mapping Lutz Prechelt, prechelt@inf.fu-berlin.de 15 / 33

Some jccpprt results: Execution time for 1000 numbers with a 73000-word dictionary Lutz Prechelt, prechelt@inf.fu-berlin.de 16 / 33

Some jccpprt results: Observations The individual variation is huge Note the scale is logarithmic! Observe how wide the stderr dashes around the means (M) are! General trends are hence unclear But (looking at the medians), we find that scripting language programs are not generally much slower than programs in compiled languages Lutz Prechelt, prechelt@inf.fu-berlin.de 17 / 33

Some jccpprt results: Work time required Lutz Prechelt, prechelt@inf.fu-berlin.de 18 / 33

Some jccpprt results: Observations It looks like the compiled languages require 2-3 times as long for solving the problem with them and interpersonal variation is also larger there But beware: The Java, C, C++ times were measured in a controlled experiment they are reliable The scripting language times are self-reported by participants found via Usenet postings Canwetrustthelatter? Lutz Prechelt, prechelt@inf.fu-berlin.de 19 / 33

Some jccpprt results: Length of resulting program Lutz Prechelt, prechelt@inf.fu-berlin.de 20 / 33

Some jccpprt results: Observations The length data are undisputable and confirm the work time data if we assume that lines written per hour is language-independent That leaves one problem: Are the length differences due to language differences or due to better qualifications of the script programmers? Validation idea: Better-qualified programmers would write more lines per hour Lutz Prechelt, prechelt@inf.fu-berlin.de 21 / 33

Validation of comparability of programmer qualifications Lutz Prechelt, prechelt@inf.fu-berlin.de 22 / 33

jccpprt summary Overall, the sheer mass of data points allows to overcome the significant problems of interpersonal variation and even (to a reasonable degree) the validity problems regarding the trustworthyness of some of the data So how about comparing web development platforms in the same manner? Lutz Prechelt, prechelt@inf.fu-berlin.de 23 / 33

Example 2: A contact Late in 2005, Gaylord Aulke, CEO of the web development service company 100days, contacted me roughly like this: "We are using PHP in our projects. I use jccpprt to argue that scripting languages are to be taken seriously. Couldn't we do a similar study for PHP?" Lutz Prechelt, prechelt@inf.fu-berlin.de 24 / 33

Comparing web development platforms: The main problem The main success factor in jccpprt was the large number of implementations per language The large number was possible, because the task was small Well under 10 person-hours for most participants, despite high reliability requirements and modest skills The task could be small, because the domain "language comparison" could be meaningfully addressed with an algorithmic problem. The task was not more or less representative than any other. However web applications cannot reasonably be as small The whole point of web development platforms is giving good support and reuse for many complex development tasks. So those need to be present in the task Lutz Prechelt, prechelt@inf.fu-berlin.de 25 / 33

Which platforms? So we obviously have a scaling problem. Let's see how big that is. First, which platforms would we need to look at? The "Big Money" platforms: Java Enterprise Edition,.NET The ubiquitous one: PHP The grand old lady: Perl Possibly younger contesters: Python, Ruby-on-Rails So we need to obtain 4 to 6 groups of implementations Lutz Prechelt, prechelt@inf.fu-berlin.de 26 / 33

Which aspects in the task? Second, certain aspects should be present in the task to make the comparison meaningful: User registration, authentication, session tracking, forms and validation, dynamic results, reports, persistence, web services, So we need to have a task that is many times larger than that of jccpprt Lutz Prechelt, prechelt@inf.fu-berlin.de 27 / 33

How to make clear-enough differences likely? Again, we have two possible approaches: Obtain a large number of implementations or obtain only a few, but from fairly similar programmers Discussion: Which way to go? Why? How? How to ensure validity of the study? Lutz Prechelt, prechelt@inf.fu-berlin.de 28 / 33

One possible design: Plat_Forms Ensure validity by creating all implementations under supervision same time, same place Make sufficient task size possible by inviting teams of 3 for 30 hours A (hopefully sensible) balance of size and cost Reduce interpersonal variation by soliciting participation from top-class programmers only Make sufficient participation likely by advertising the study as a contest an opportunity for software development service organizations to show off their capabilities at most 3 teams will be admitted per platform Lutz Prechelt, prechelt@inf.fu-berlin.de 29 / 33

Current status The Plat_Forms contest and study has been announced by 'ix in their November 2006 issue and on heise online Newsticker on October 11 The contest is planned for January 25-26, 2007, in Nürnberg We are now looking for sponsors platform representatives who will help select teams from among the applicants teams who want to participate students who would like to participate in the scientific evaluation afterwards Lutz Prechelt, prechelt@inf.fu-berlin.de 30 / 33

Tasks during the evaluation Develop tests and criteria for correctness and usability May require comparing apples and oranges meaningfully Obtain load testing software, design and set up load tests May be difficult if a lot of Javascript is used Develop categories and criteria for describing the structure, size, and modularity of the solutions Must accomodate the very different approaches of the platforms Develop scenarios for characterizing the flexibility and modifiability of the solutions Requires sufficient understanding of each platform Perform all these evaluations and record results during February, March, April Write result report Lutz Prechelt, prechelt@inf.fu-berlin.de 31 / 33

Pointers Lutz Prechelt: "An empirical comparison of C, C++, Java, Perl, Python, Rexx, and Tcl for a search/string-processing program", Technical Report 2000-5, Fakultät für Informatik, Univ. Karlsruhe The language comparison http://page.mi.fuberlin.de/~prechelt/biblio/ Short version in IEEE Computer 33(10):23-29, October 2000 http://www.plat-forms.org/ The web platform comparison Lutz Prechelt: "The 28:1 Grant/Sackman legend is misleading, or: How large is interpersonal variation really?", Technical Report 1999-18, Fakultät für Informatik, Univ. Karlsruhe Programmer variation study Also in "Kontrollierte Experimente in der Softwaretechnik", Springer Verlag, 2001 Course "Empirische Bewertung in Informatik" (Empirical evaluation in informatics) SoSe, 2V+2Ü Lutz Prechelt, prechelt@inf.fu-berlin.de 32 / 33

Thank you! Lutz Prechelt, prechelt@inf.fu-berlin.de 33 / 33