Progress Report to ONR on MURI Project Building Interactive Formal Digital Libraries of Algorithmic Mathematics



Robert L. Constable
Cornell University
February 2003

Project Web Page: http://www.cs.cornell.edu/info/projects/nuprl/html/digital Libraries.html

Accomplishments since the May 2002 Project Review

1. Further Design of the Prototype Formal Digital Library (FDL)

Stuart Allen has written a key technical article on the notion of abstract object identifiers and certificates and a substantial companion design document. The technical article has been submitted for publication, and the design documents are a blueprint for our on-going work. These articles are highly original and conceptual, and the ideas developed in them are essential to a sound FDL. Stuart's treatment of every topic breaks new ground, and we have had to spend time to educate the project and publish ideas to elicit feedback.

We have been exploring the use of reflection based on the term structure of the FDL. We can now demonstrate a result that will hold in all of the theories that we connect. It is a result critical to Tarski's theorem. It represents a significant improvement in a logical treatment of reflection made possible by computer support.

2. Software

Expanding the Formal Digital Library (FDL) Prototype

We have added Library navigation functions and documented them. This entails a great deal of programming and technical writing. The extension to the FDL represents nearly 1,000 person hours of work.

Linking Nuprl 5, Nuprl 4, MetaPRL, and PVS to the FDL

We have spent many programmer hours linking these provers to the Library.

3. Adding content to the FDL

We have added new content in two areas, both essential. First, we are developing new material that is directly relevant to software infrastructure protection, namely protocol verification, system security, and basic formal data structures. Second, we are incorporating PVS material into the Library.

Critical infrastructure protection of software

Our approach to this area is unique and important; together we can point this out to OSD. The basic argument is that verification tools will be vastly more useful if they have access to large knowledge bases. Moreover, the latest DARPA thrust in intelligent networks will require the knowledge base that we will assemble.

We know the importance of digital knowledge bases from research in AI: intelligent systems need access to large amounts of knowledge. We also know it from our work with protocol verification. Once we established a large knowledge base in distributed systems, we could verify protocol designs as fast as the systems people could produce them.

The investigations we are conducting now are modeled after the research program we pursued with ONR funding in the 1990s for the investigation of correct-by-construction functional programming with Nuprl. The area created with NSF and ONR funding is still very active and is having a significant impact, spawning related research programs and systems and eventually leading to a decade-long period of practical work supported by DARPA. The Nuprl book from 1986 remains among the top twenty-five most cited documents on the Web according to Citeseer, and the current Nuprl Web page brings that book up to date.
There are many lines of theoretical work that can be traced back to this earlier project, including the solution of two open problems in theoretical computer science and mathematics (Howe and Murthy) and the creation of closely related proof development systems, such as Alf, Coq, Lego and MetaPRL, and many significant practical results in both industry and government. Dozens of PhD theses have been written on this topic along with hundreds of articles and reports. We believe that the research agenda we are proposing now has the same or greater potential to produce strong practical applications and unexpected discoveries over another twenty-year period.

In addition, we have picked a topic on which there is common interest with the Naval Research Laboratory: protocol verification using IO automata and protocol synthesis. Also, Jason's popular expository writing about O'Caml spreads our formally grounded writing ideas to a wide audience.

We have developed a new capability to extract distributed systems from constructive proofs that specified behaviors are achievable. In some sense this solves a problem that has been under investigation by many researchers since 1990. It will have a significant impact on our ability to create reliable and secure software, and it depends vitally on a large formal digital library of the kind we are building.

Accumulating content from multiple provers

Jason's work on a formal O'Caml compiler developed inside MetaPRL is highly original and complements our formal O'Caml semantics.

We have imported PVS proofs into the FDL. We did this instead of posting the O'Caml semantics to the Web because, as you noted, we are burning researcher cycles as fast as we can and still have a large stack of subgoals awaiting attention. Our ability to replay and import PVS proofs extends our PVS capability considerably. We have written a CADE paper on this. The import required a substantial amount of programming and experimentation, again hundreds of hours. We are working on posting this material to the Web.

4. Presenting FDL content on the Web

The ability to display formal content on the Web has proven to be important to advertising the project. Thus it has taken on a higher priority with me and you than the technology now supports when applied to massive content from multiple systems.

Progress has been slow against hard technical problems. We are writing internal notes about the mechanisms and problems and discussing them in our research meetings and seminars. There are serious problems of scale as we try to deal with tens of thousands of objects.

Presenting PVS proofs on the Web at the same quality level as Nuprl and MetaPRL proofs is an additional technical challenge and requires us to extend Stuart's tools still further.

5. Engaging with a Community

Our discussions with people at the review and with you suggested the importance of identifying a community on which we can have a significant impact. We are committed to at least one such community, Mathematical Knowledge Management (MKM). In June 2002 I gave an invited talk at the first meeting of the North American branch, NA-MKM. This talk is posted at the project home page. Jim Caldwell is engaged with the European MKM, which has ties to the North American branch, and I will meet some of the leaders in July 2003. We will also look to identify a second community that complements the first.

At the North American MKM meeting that we attended en masse, this was one of the liveliest topics. We have remained in contact with the North American and European leaders of this group. I expect that we will host a meeting of the North American group next year if they are willing, and we might ask to use some of our ONR funds to help organize.

A related effort in which we also participate is QPQ (QED Pro Quo), a repository of open source deductive software. I am on the Advisory Board.

As part of community building, I have given technical lectures on the FDL at these places:

North American Mathematical Knowledge Management meeting in Hamilton, Ontario
Automath meeting at Heriot-Watt University in Edinburgh
University of Reading, England
Ben-Gurion University, Israel

I will be lecturing on the topic at the Marktoberdorf Summer School in July 2003 and at an ICALP workshop.

6. Personnel

Cornell: Stuart Allen, Robert Constable, Christoph Kreitz, Richard Eaton, Lori Lorigo, Eli Barzilay
Caltech: Jason Hickey, Aleksey Nogin, Xin Yu
Wyoming: James Caldwell, Vitali Khaikine, John Cowles

7. Publications

We list fourteen publications on the project Web page (including the FDL manual). There will also be two PhD theses, one from August 2002 and one from August 2003. This is a very high publication rate given the amount of software and formal mathematical content we produce as well. Among the most significant papers are these.

(a) Abstract Identifiers and Textual Reference (2002)
(b) Notes on the Design and Purpose of the FDL (2002)
(c) Logic of Events (2003)
(d) Nuprl-PVS Connection: Integrating Libraries of Formal Mathematics (2002)
(e) Reflecting Higher-Order Abstract Syntax in Nuprl (2002)
(f) Sequent Schema for Derived Rules (2002)
(g) Theory and Implementation of an Efficient Tactic-Based Logical Framework (2002)

Paper (a) lays the foundations for a multi-logic library of formalized computational mathematics and computer science. It discusses an issue that many people have not recognized as critical to such a system, namely the need for abstract treatment of the name space.

Paper (b) translates the ideas of (a) into FDL design decisions.

Paper (c) shows the value of this work to the problem of building secure-by-construction distributed system software. This is a core problem in CIP/SW. Cornell has another MURI in the area, on language-based security. Our work is important to that MURI as well. We make a direct connection between the FDL and the creation of secure code.

Paper (d) discusses some of the early issues discovered as we imported PVS into the FDL. The PVS prover creates a great deal of formal mathematics, and some of it is useful in algorithm construction; but the PVS prover is based on an entirely different logic than what is used in Alf, Nuprl, and Coq, the provers for computational mathematics. We are showing how to incorporate two very different logics into the FDL. This will increase its value considerably. It is a very hard logical and software engineering problem.

Paper (e) deals with the problem of reflecting formal logics. This capability is widely recognized as important in practice, but it is extremely complex and has not been fully deployed in any prover. We proposed using insights from computational mathematics to simplify the task. A thesis is being written on this topic. Its results will apply to all logics of the FDL.

Papers (f) and (g) describe the mechanisms of MetaPRL on derived rules which will benefit all implemented formal logics.