Database System For Obstetrics & Gynaecology at St. James's Hospital




Tristan Sehgal
BSc Computing (Cert. Ind)

The candidate confirms that the work submitted is their own and the appropriate credit has been given where reference has been made to the work of others.

I understand that failure to attribute material which is obtained from another source may be considered as plagiarism.

(Signature of student)

Summary

The problem presented and solved in this project report is that of building an interactive database system for a new research project within the Obstetrics and Gynaecology department of St. James's Hospital, Leeds. The head of the research study, Dr. Nick Wood, is the person who requested the system. A system was required to enter, edit and view all the information generated from and associated with the research study. This project has taken the requirements of the end user through to the final deployment of the system, and this report describes the phases that achieved this.

Acknowledgements

I would like to thank Dr. Nick Wood for placing his trust in me to produce the database system for his research project, for his helpful comments throughout and for being a pleasure to work with. I would also like to thank my project supervisor, Dr. Stuart Roberts, for his constant support, guidance and encouragement throughout this project. Thanks also to Naomi Quinton at St. James's Hospital for helping to evaluate the system. Thanks to Chris O'Hara for reviewing the code and thanks to Craig Lambert for lending me various books and computer equipment.

Table of Contents

Summary
Acknowledgements
1 Introduction
  1.1 Overview of problem
  1.2 Motivation for project
  1.3 Project Objectives
  1.4 Previous Attempts at Solving Problem
2 Background Research
  2.1 Project Management
  2.2 Requirements Engineering
    2.2.1 Types of Requirement
    2.2.2 Requirement Gathering
    2.2.3 Requirement Documentation
  2.3 Data Modelling
  2.4 User Interface Design
    2.4.1 Participatory Design
    2.4.2 Presenting Data
    2.4.3 Data Entry
  2.5 Software Testing
  2.6 Security in Microsoft Access 2000
3 Methodologies
  3.1 Software Development Methodology
  3.2 Database Design Methodology
4 Project Planning
  4.1 Milestones for completion
  4.2 Project scheduling
5 Requirements Gathering
  5.1 Initial discussions with Nick Wood
  5.2 Data storage requirements
  5.3 Storyboarding
  5.4 Requirements Specification
6 Hardware and Software decisions
  6.1 Hardware
  6.2 Software
    6.2.1 Choice of DBMS
    6.2.2 Choice of application development software
  6.3 Final System Architecture
7 Database Design and Construction
  7.1 Conceptual Design: High Level Data Modelling
    7.1.1 Entities
    7.1.2 Relationships
    7.1.3 Entity Attributes
    7.1.4 Integrity Constraints
  7.2 Logical Design 1: Defining the database schema
    7.2.1 Identifying Primary Keys
    7.2.2 Foreign Keys and Referential Integrity
    7.2.3 Documenting the Initial Schema
  7.3 Logical Design 2: Normalisation
    7.3.1 Functional Dependencies
    7.3.2 Normalising to 1st Normal Form
    7.3.3 Normalising to 2nd Normal Form
    7.3.4 Normalising to 3rd Normal Form
  7.4 Physical Design and Implementation
  7.5 Further iterations to the methodology
8. Implementation
  8.1 GUI Creation
  8.2 Example: Saving a Patient's First Visit Details
    8.2.1 Selecting information from the database
    8.2.2 Selecting information from the database
    8.2.3 Checking the required information has been entered
    8.2.4 Saving From the GUI to the Database
    8.2.5 Error Handling
  8.3 Example: Editing a Patient's First Visit Details
9. Testing
  9.1 Overview
  9.2 System testing
  9.3 User acceptance testing
10 Deployment
  10.1 Installation
  10.2 Security and Backup
  10.3 Design Documentation
11. Evaluation
  11.1 Defining project success
  11.2 Adjustments to the minimum requirements
  11.3 Minimum requirements
    11.3.1 Were the Minimum Requirements Met?
    11.3.2 How well were the Minimum Requirements Met?
  11.4 Extended requirements implemented
  11.5 User satisfaction
  11.6 Quality, maintainability and extendibility of code
  11.7 Software/Hardware choices
  11.8 Extending the solution further
    11.8.1 Data Analysis Tools
    11.8.2 Mailmerge
    11.8.3 External Data Access
  11.9 Project management
  11.10 Conclusions
Bibliography
Glossary of Terms
Appendix A: Personal Reflection
Appendix B: Summary of initial meeting with Nick Wood
Appendix C: Initial Project Schedule
Appendix D: Revised Project Schedule
Appendix E: Requirements Specification
Appendix F: Database Schema
Appendix G: System Screenshots
Appendix H: Example Test Script
Appendix I: Change Requests
Appendix J: End User Feedback

1 Introduction

1.1 Overview of problem

Within the Obstetrics and Gynaecology department of St. James's Hospital, Leeds, a new research project has started, led by Dr. Nick Wood. The patients being studied in the research project are known as HNPCC (hereditary non-polyposis colorectal cancer) patients, as they have been identified as being at risk of this form of cancer. The research study has two principal aims. The first is to try to determine the medical factors which could be used to predict whether or not an HNPCC patient is likely to suffer from cancer. The second is to assess whether the study is acceptable to the patients on whom it is being carried out.

A significant amount of data will be generated from the research study in the form of patient details, results of examinations, tissue bank data and questionnaire results. The study leader wants this data to be easily accessible to all those involved with the study. In addition, he requires an easy method to view, edit and enter the data generated from the study. To meet these needs, a database will need to be constructed together with a graphical user interface.

1.2 Motivation for project

Without a database to store the data from the study, all the results would have to be recorded on paper and kept in a filing cabinet. A number of problems with this paper-based approach could be solved by introducing an electronic database system. Firstly, all the data can be well structured and strictly organised in a database. For example, database constraints can be used to prevent someone from recording the same information for a patient twice. With a database, the data is guaranteed to be put in the place it belongs. For example, the results from one type of experiment won't get mixed up with those from another.

The motivation for implementing a GUI on top of the database is that it will allow the users at St. James's Hospital to enter and edit data without any knowledge of the DBMS being used or the database schema. Technical details such as runtime SQL queries and referential integrity constraints need not concern the users at St. James's Hospital. Providing a GUI to view the data means that it is available easily and instantaneously to anyone who has permission to view it. Storing the data in the database also means that it can be secured so that patient details remain strictly confidential.
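The claim that database constraints can stop the same information being recorded twice can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than the DBMS chosen for the project, and the table and column names (patient, hospital_number, surname) and the sample values are hypothetical, chosen only for illustration.

```python
import sqlite3


def insert_patient(conn, hospital_number, surname):
    """Try to add a patient; return False if the record already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO patient (hospital_number, surname) VALUES (?, ?)",
                (hospital_number, surname),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate hospital number.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    " patient_id INTEGER PRIMARY KEY,"
    " hospital_number TEXT NOT NULL UNIQUE,"  # hypothetical identifying field
    " surname TEXT NOT NULL)"
)

first_attempt = insert_patient(conn, "H1001", "Smith")   # made-up sample data
second_attempt = insert_patient(conn, "H1001", "Smith")  # same patient again
patient_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
```

The second insert is refused by the database itself, so the rule holds regardless of which form or program tries to write the record.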

1.3 Project Objectives

In order to understand the full extent of the problem it was necessary to discuss the high level objectives of the system, to decide whether what was required would be too much or too little to complete in the time given. In addition, various implicit objectives such as ensuring data consistency needed to be made explicit so as to scope the work effectively. Most of the key objectives were made clear in the initial meeting with Nick Wood (see Appendix B) and are documented below:

- Gather and document a full set of requirements from the end users of the system.
- Produce a relational data model of the data that has been normalised to at least 3rd Normal Form.
- Map the data model to tables in a DBMS.
- Enforce data integrity checking.
- Provide graphical user interfaces for data entry and modification.
- Produce an SQL query wizard so that users can quickly group and filter required data.
- Ensure the database is secure and has backup mechanisms in place.
- Fully test and deploy the system so that it is ready for use by the end users.

1.4 Previous Attempts at Solving Problem

Nick Wood had already created an MS Access 2000 database with one table and one form for entering data into that table. No functionality was present behind the form, and the table contained attributes of everything to do with patients, their visits and results. Other than this, no attempt had been made at producing a database: no requirements had been documented, no data model had been produced and no implementation was present.
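To illustrate why a single table holding everything about patients, visits and results is a weak starting point, the sketch below contrasts it with a split into two tables. It is only an illustration under stated assumptions: it uses Python's sqlite3 module rather than MS Access, and the table names, column names and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single-table layout: patient details repeated on every visit row.
conn.execute(
    "CREATE TABLE everything ("
    " hospital_number TEXT, surname TEXT, visit_date TEXT, result TEXT)"
)
conn.executemany(
    "INSERT INTO everything VALUES (?, ?, ?, ?)",
    [
        ("H1001", "Smith", "visit 1", "normal"),  # made-up sample rows
        ("H1001", "Smith", "visit 2", "normal"),
    ],
)
# Correcting the surname on only one row leaves the table contradicting itself.
conn.execute("UPDATE everything SET surname = 'Smyth' WHERE visit_date = 'visit 1'")
surnames_in_single_table = conn.execute(
    "SELECT COUNT(DISTINCT surname) FROM everything WHERE hospital_number = 'H1001'"
).fetchone()[0]

# Split layout: each patient detail is stored once and visits refer to it.
conn.execute("CREATE TABLE patient (hospital_number TEXT PRIMARY KEY, surname TEXT)")
conn.execute(
    "CREATE TABLE visit ("
    " hospital_number TEXT REFERENCES patient(hospital_number),"
    " visit_date TEXT, result TEXT)"
)
conn.execute("INSERT INTO patient VALUES ('H1001', 'Smith')")
conn.executemany(
    "INSERT INTO visit VALUES ('H1001', ?, 'normal')",
    [("visit 1",), ("visit 2",)],
)
# The same correction now happens in exactly one place.
conn.execute("UPDATE patient SET surname = 'Smyth' WHERE hospital_number = 'H1001'")
surnames_after_split = conn.execute(
    "SELECT COUNT(DISTINCT p.surname) FROM visit v"
    " JOIN patient p ON p.hospital_number = v.hospital_number"
).fetchone()[0]
```

In the single table the same patient ends up with two different surnames, whereas after the split every visit sees the one corrected value.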

2 Background Research

2.1 Project Management

For this project to be successful, deadlines must be met and the end users of the system must be satisfied with what they receive. If deadlines are to be met then sensible time management is crucial, which leads to the need for a project plan. O'Connell (1996) provides an entire chapter on assessing project plans. Emphasis is placed on defining a critical path, i.e. the path that the project must follow to meet its deadline, any deviation from which will lead to delay. The project schedule for this project will be in the form of a Gantt chart because this provides a clear view of the plan. O'Connell (1996) provides a section detailing how these can be assessed, and suggests checking that a project plan shows:

- What the main phases are and how long they should last.
- How the phases relate to each other.
- The amount of work required in each phase.

2.2 Requirements Engineering

No previous attempts have been made at solving the problem, so no formal set of requirements exists. Research was therefore undertaken to review the process for gathering, classifying and documenting system requirements. Requirements documentation is needed to form the basis of an agreement between the developer (Tristan Sehgal) and client (Nick Wood) as to what work should be completed (Kotonya and Sommerville (1998)).

2.2.1 Types of Requirement

All the literature reviewed relating to requirements discusses two broad classifications of requirement: functional and non-functional. Functional requirements define the tasks that the system must perform, i.e. its functionality. Non-functional requirements define the safety, security, usability, reliability and performance requirements (Kotonya and Sommerville (1998)). For this project, usability and security issues are important because several users will need to use the system without difficulty and only certain users should have access to the system.
2.2.2 Requirement Gathering

Because there is no clear set of requirements, techniques will have to be employed to gather them. Kotonya and Sommerville (1998) discuss several elicitation techniques: interviews, scenarios, soft system methods and observational analysis. Interviews are appropriate to this project because there is only one key user who will decide what is required from the system. Scenarios involve discussing with the users how the system would typically need to interact with them. This again would be useful because it is the key end user who will know how the system should behave. Real-life examples and situations in which the system will be used are a good way of eliciting how the system should work. This technique works well because end users can relate to it and developers can clarify the existing requirements (Kotonya and Sommerville (1998)). Observation and social analysis techniques can't be used because the research project at St. James's Hospital is not yet underway.

2.2.3 Requirement Documentation

Requirements documentation should include product functions, user characteristics, general constraints, assumptions and dependencies (Kotonya and Sommerville (1998)). The Rational Requisite Pro tool was considered for documenting requirements but discarded due to its complexity and heavy orientation towards very large projects using the Rational Unified Process.

2.3 Data Modelling

Data modelling is required in order to build robust, extensible databases (Mott and Roberts (1999)), both of which are properties required of the database in this project. Data modelling will be required because, at the start of the project, the data exists only as a list of data fields. There is a significant amount of data involved in this project, so it is important that it is modelled correctly so that the database constructed is valid. According to Watson (2002), there are two criteria for judging the quality of a data model: 1. it should be well formed, and 2. it should have a high-fidelity image.

The key attributes (as identified by Watson (2000)) of a data model with a high-fidelity image are that it correctly describes the world it is supposed to represent and that it is complete, understandable, accurate and syntactically correct.

There are a number of ways in which an entity-relationship model can be documented. Kotonya and Sommerville (1998) discuss the disadvantages of using the standard Codd and Date model. Firstly, it is not possible to assign types to relations and secondly, it is poor at modelling the exact nature of relationships between entities. For this project the first problem is not an issue but the second is. There are many relationships which will require clear understanding as part of the requirements phase, e.g. many blood samples can be taken from one blood test.

As an alternative, Rock-Evans notation allows the cardinality and relationship names to be shown clearly, and so this is the notation that has been chosen for this project. Two CASE (Computer Aided Software Engineering) tools were found for producing an ER diagram: Dia on the Linux operating system and Microsoft Visio on the Windows platform. Microsoft Visio was chosen as it provides full notation support and is easy to use.

2.4 User Interface Design

This section outlines the literature reviewed in relation to designing and building the GUI (Graphical User Interface). This research has been done because it is anticipated that a large number of screens will be required for the final GUI. For example, initial discussions with Nick Wood suggest that the first visit details could have between fifty and sixty fields. Several screens will therefore be required to avoid cramming all the data onto a small number of screens. As the system is to be used by end users to help them with their work, it will need to be easy to use. As it is the GUI they will be interacting with, it must be designed with the end users in mind.

2.4.1 Participatory Design

This project is heavily dependent on the needs of two to three key users, so this section details how users can participate in the design of the GUI. Dix et al. (1998) outline four methods for getting the end users involved: brainstorming, storyboarding, pencil and paper exercises, and workshops. Brainstorming is informal and involves the users and developers sitting round bouncing ideas off each other. Storyboards are a means of describing the user's day-to-day activities (Dix et al. (2002)). Wood (1998) describes storyboarding as producing a series of walkthrough steps in relation to the various windows in the system. Unlike brainstorming sessions, these provide a more focused view on design.

The role of the designer is to question the user about their environment and the role of the users is to question the designer on technical issues and concepts. Wood (1998) describes user involvement rather than user participation. This involves providing the user with prototypes to evaluate, rather than the user directly having an influence on the precise layout of the GUI. The argument for user participation is that the user best knows the environment in which the system will be used. The argument against it is that it can slow the development down due to the shift of power and responsibility (Wood (1998)). In this project there is only one developer and two to three end users, so it would seem sensible to involve the users as much as possible but not to the extent of actually designing the GUI, because they will not have the time or knowledge to do so.

2.4.2 Presenting Data

There will be many occasions when users will simply want to view information directly from the database or from the result of a query. Shneiderman (1998) identifies the following criteria for good data display interfaces: consistency of data displayed, efficient information assimilation by the user, minimal memory load on the user, compatibility of data display with data entry, and flexibility for user control of data display.

2.4.3 Data Entry

A lot of data will need to be entered for each patient, so it is important that data entry is well designed. Shneiderman (1998) identifies a number of criteria that can be used to judge how good a data entry interface is: consistency of data-entry transactions, minimal input actions by the user, minimal memory load on users, compatibility of data entry with data display, and flexibility for user control of data entry. These criteria are highly relevant to this project as the GUIs must provide an easy way to enter data into the system.

2.5 Software Testing

To ensure that the system functions as specified, it will need to be fully tested to remove errors in the software. According to Sommerville (1995), users are most interested in the system meeting its requirements, and testing should be planned so that all requirements are individually tested. It is therefore intended to cross-reference each requirement to a specific test case when the test plans are written. As this system is being built for users other than the developer, it will need to go through a process of user acceptance testing, and this should be done using customer-supplied data (Sommerville (1995)). It is therefore intended that the users at St. James's will test the system with their own patient data.

2.6 Security in Microsoft Access 2000

Nick Wood stated at the start of the project that the database should be made as secure as possible. Because it was clear from the start that MS Access 2000 would be the likely DBMS (described in detail in Chapter 6), research was carried out on securing MS Access databases. Microsoft has a Frequently Asked Questions page on security in MS Access 2000 which details the steps that are needed in order to secure an MS Access database (Microsoft (2000b)). Principally this involves creating a new workgroup information file. This file contains a profile of the users of the system together with the permissions they do and do not have. Once this file has been set up, different users can be given different permissions on the objects within the database such as tables, forms and reports (Microsoft (2000a)). This is of relevance because Nick Wood wants some people to only be able to read data and others to be able to edit it.
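The idea of giving different users different permissions on database objects can be sketched in a few lines. This is not the MS Access security API or the format of a workgroup information file; it is a hypothetical Python model, with made-up user and object names, that only shows the read-only versus edit distinction Nick Wood asked for.

```python
from dataclasses import dataclass, field


@dataclass
class Workgroup:
    """Hypothetical stand-in for a workgroup file: users and their permissions."""

    # Maps user name -> {object name -> set of permitted actions}
    permissions: dict = field(default_factory=dict)

    def grant(self, user, obj, *actions):
        """Record that a user may perform the given actions on an object."""
        self.permissions.setdefault(user, {}).setdefault(obj, set()).update(actions)

    def is_allowed(self, user, obj, action):
        """Anything that has not been explicitly granted is denied."""
        return action in self.permissions.get(user, {}).get(obj, set())


workgroup = Workgroup()
workgroup.grant("study_leader", "Patient table", "read", "edit")  # made-up names
workgroup.grant("research_nurse", "Patient table", "read")
```

Denying by default means that a user who has not been set up at all gets no access, which matches the aim of keeping patient details confidential.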

3 Methodologies

This project will need to deliver a piece of software to St. James's Hospital which is robust, delivered on time, and meets the specified minimum requirements. A software methodology is therefore required, as it provides a structured and well tested approach that can be applied to solving a problem. As this project is based on an underlying database, a methodology will also be required for modelling the data and creating a valid database.

3.1 Software Development Methodology

Having had initial discussions with Nick Wood (see Appendix B), it was clear that he wanted not just a database but a GUI and additional functionality built in as well. Designing and building the system immediately would not have been a sensible approach, as the project in the initial stages had not been scoped, planned or linked together. A software development methodology was therefore chosen in order to help resolve these issues. The classic software development methodology is the waterfall lifecycle model (Kruchten (2000)), which is depicted in Figure 3.1.

[Figure 3.1: Classic Waterfall model. Phases shown in sequence: Requirements, Design, Implementation, Testing, Deployment.]

The link between the lifecycle model and this project is clear. Requirements analysis will involve meetings with Nick Wood. These meetings will aim to find out the precise details of what must be stored in the database and how the data is to be manipulated. The design phase will involve designing the database structure and GUI prototyping. The implementation phase will involve creating the database and complete GUI. Testing will involve unit testing and user acceptance testing. Deployment will involve installing the system at St. James's Hospital and building in security.

A number of problems, however, have been identified with relating this methodology to this project. Firstly, the HNPCC project at St. James's Hospital is itself new, so there is a high probability that requirements will need to be modified, added and removed as the project progresses. The waterfall lifecycle model is not an iterative methodology and therefore does not allow multiple iterations of lifecycle phases. Secondly, the success of this project will depend on the users' satisfaction with what is finally delivered. By demonstrating or even providing different versions of the software at the earliest possible stage, there is a higher probability of problems being addressed earlier. Because the iterative approach generates early feedback, it minimises risk and is superior to the lifecycle model (Larman (1997)). A more iterative approach is therefore required. Kruchten (2000) outlines such an approach, which is depicted in Figure 3.2.

[Figure 3.2: Iterative and incremental development (Kruchten (2000)). Stages shown: Initial Planning, Analysis and Design, Requirements, Implementation, Evaluation, Testing, Deployment.]

Once an initial cycle of development has been carried out, evaluation can take place at the regular meetings with Nick Wood. This will involve discussing the work done, what is wrong with it and what needs to be done next. This process will involve prototyping, that is, rapidly building small versions of the system for the purposes of user feedback (Atzeni et al. (1999)). Although the development will be iterative, there are some constraints on the ordering of what can be designed and implemented first. Because the GUI will be highly dependent on the database (as its purpose is to manipulate the data), an initial database schema will be needed before GUI prototyping can begin. In addition, work on the functionality required to map the GUI to the database can only start once there is a database and GUI to work with.
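The ordering constraints just described form a small dependency graph, and a valid build order can be derived from it mechanically. The sketch below is only an illustration: it uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later), and the task names are paraphrases of the text rather than items from the project plan.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must be finished before it can start.
dependencies = {
    "initial database schema": set(),
    "GUI prototyping": {"initial database schema"},
    "GUI-to-database mapping": {"initial database schema", "GUI prototyping"},
}

# static_order() yields the tasks so that every prerequisite comes first.
build_order = list(TopologicalSorter(dependencies).static_order())
```

Within each iteration the same ordering still applies, so a change to the schema would be followed by a review of the affected screens and then of the mapping code.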

The Rapid Application Development (RAD) methodology for software development was also considered, as it involves building quick prototypes of the GUI and refining them based on user needs. The characteristics of this project mean that RAD could be considered a suitable methodology, because RAD can be used for systems which are interactive and where the functionality is clear at the user interface (Davies (1998)). It is intended that rapid prototypes will be created and reviewed by Nick Wood so that he is happy with what gets delivered. In terms of the overall development, however, RAD was discarded because the iterative lifecycle methodology clearly defines the ordering and phases that this project will need to take, e.g. requirements will have to be gathered before any prototyping can begin.

3.2 Database Design Methodology

The iterative approach is sufficient for the overall stages of the project. However, a large part of the project relies on a well structured and robust database, and the processes and planning required to achieve this are not part of the chosen software methodology. A process for designing the database was therefore chosen so that a structured and proven approach could be taken to meeting the data storage requirements. In terms of the overall development cycle, the database design fits into the system analysis and design phase while the database construction fits into the implementation phase (Atzeni et al. (1999)). The steps required for database construction outlined by Atzeni et al. (1999) are shown in Figure 3.3.

[Figure 3.3: Database development process (Atzeni et al. (1999)). Conceptual Design produces the Conceptual Schema, Logical Design produces the Logical Schema, and Physical Design produces the Physical Schema.]

This methodology is analogous to that referred to by Elmasri (2000). It is also the approach to database design described by Mott & Roberts (1999). The conceptual phase involves producing a very high level description of the data, the primary output being an entity relationship diagram. Many of Nick Wood's requirements relate to the data he wants to be stored, so there will be overlap between the requirements phase of the overall development and building an image of the data. The logical design can be subdivided into the following processes (Atzeni et al. (1999)):

1. Building an ER model.
2. Identifying integrity constraints.
3. Normalisation.

All of these will be necessary to meet the data storage requirements of this project, as is explained in Chapter 7. The physical design is concerned with mapping the logical schema onto the DBMS. As the development will be iterative in nature, the database design will also be iterative, since it fits into the software lifecycle. If, for example, it emerged that an HNPCC patient could only answer one type of questionnaire, everything from the ER diagram down to the physical structure of the database would have to be modified.
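The step from logical schema to physical schema can be sketched for one relationship mentioned in Section 2.3, where many blood samples can be taken from one blood test. This is a minimal illustration under stated assumptions rather than the schema used in the project: it targets SQLite through Python's sqlite3 module, and the table names, column names and sample values are hypothetical.

```python
import sqlite3

# Logical schema (hypothetical): table name -> column definitions.
logical_schema = {
    "blood_test": ["test_id INTEGER PRIMARY KEY", "test_label TEXT NOT NULL"],
    "blood_sample": [
        "sample_id INTEGER PRIMARY KEY",
        # Many samples may refer to one test, so the "many" side holds the key.
        "test_id INTEGER NOT NULL REFERENCES blood_test(test_id)",
    ],
}


def to_ddl(schema):
    """Physical design step: turn the logical schema into CREATE TABLE statements."""
    return [
        f"CREATE TABLE {table} ({', '.join(columns)})"
        for table, columns in schema.items()
    ]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
for statement in to_ddl(logical_schema):
    conn.execute(statement)

conn.execute("INSERT INTO blood_test (test_id, test_label) VALUES (1, 'first test')")
conn.executemany("INSERT INTO blood_sample (test_id) VALUES (?)", [(1,), (1,)])
samples_for_test = conn.execute(
    "SELECT COUNT(*) FROM blood_sample WHERE test_id = 1"
).fetchone()[0]

try:
    conn.execute("INSERT INTO blood_sample (test_id) VALUES (99)")  # no such test
    orphan_rejected = False
except sqlite3.IntegrityError:
    # Referential integrity: a sample cannot point at a test that does not exist.
    orphan_rejected = True
```

Keeping the logical schema as data and generating the table definitions from it means that a change discovered in a later iteration is made in one place and then regenerated.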

4 Project Planning

This chapter describes the processes taken to plan the duration and dates of the phases required to complete the system.

4.1 Milestones for completion

Before any schedule could be devised, any hardware/software decisions made or any thought given to the design, an initial iteration of the requirements phase was needed so that the system could be scoped and an acceptable solution agreed with the user. An initial milestone was therefore to create a formal requirements specification. The goal of the requirements phase, and therefore of this project milestone, is to communicate the needs of the system clearly between the client and the developer (Larmen (1998)). The results of achieving this milestone and the steps taken to achieve it are explained in Chapter 5.

Once an initial set of requirements has been agreed, iterations of the design phase can begin. The next milestone is therefore to build a sound data model (ER diagram and normalised schemas) which can be used as the basis for building the database structure and tables. In addition, the integrity constraints and business rules of the HNPCC project have to be established between myself and Nick Wood before the construction of the database can begin.

Once the design for the database has been completed, the next milestone is to have the design implemented on the chosen DBMS. The completion of this milestone will mean that the necessary tables are present in the database with the necessary attributes, and that the integrity constraints, business rules and referential integrity constraints have been enforced. The next step is to provide a GUI which is able to manipulate the data as set out in the requirements specification. At this point all the implementation should have been completed, so the next milestone has been identified as having the system fully tested and ready for an initial release to the client.

The project can be considered complete when the users are happy with the system given and show that they are prepared to start using it for their HNPCC research project. Although reaching a milestone represents the completion of a certain task, the iterative nature of the development means that what has been achieved to reach a milestone may need to be reviewed and changed at a later stage.

4.2 Project scheduling

The principal aim of creating the project schedule is to estimate the times required to achieve the various milestones. In addition, a project schedule is used for checking that the project is running on time. The initial project schedule can be found in Appendix C.

Creating the database design in this project is tightly coupled to the requirements specification, as much of the requirements specification outlines the data requirements for the system. Once the data properties have been established, it is a relatively straightforward process to devise an ER diagram, normalised schema and constraints based on the system requirements. Time spans of days rather than weeks were therefore allocated to these data modelling tasks.

For this project, designing the GUI is a more complex task, as great care needs to be taken to ensure the GUI is as user friendly as possible and aesthetically satisfactory. In addition, there is a higher risk involved with developing the GUI: the risks of the user requesting modifications and not being content with what is produced are much greater. A longer period of time (four weeks) was therefore reserved for the GUI implementation. For similar reasons, the query builder (which is also part of the GUI) was given a relatively long period of time, as the requirements for this piece of functionality were broad and the design was subject to user approval.

The initial project schedule did not show the intended iterative nature of the development. This was because it was originally based on the critical path that a project plan should take as identified by O'Connell (1996), which was not a suitable approach for this project. This became apparent several weeks into the design phase, when it was clear that the original requirements were changing frequently. In addition, parts of the implementation phase needed carrying out before the overall design had been completed in order to generate user feedback. The original schedule also gave no indication of the various consultations with Nick Wood. It was therefore necessary to revise the schedule; the revised version can be found in Appendix D.

5 Requirements Gathering

This chapter outlines the steps taken in, and the outcome of, the initial meetings with Nick Wood, which were held to discuss the minimum and additional requirements for the system.

5.1 Initial discussions with Nick Wood

The initial meetings with Nick Wood involved high level discussions about what was required. The purpose of these meetings was to gain a better understanding of the work Nick was involved in, what he wanted and whether what he wanted could be achieved in the time given. Various interview style questions were asked during these initial meetings, such as "If you want this, would you therefore also want this?", in order to establish exactly what Nick Wood required from the system.

5.2 Data storage requirements

During the initial meeting, a detailed description of the data required for patients, their medical details and test results was provided. This was simply in the form of a list of fields, so some analysis would have to be performed in the database design phase of the project, as described in Chapter 7. The first stage in gathering the data storage requirements was to identify in general terms what had to be stored (such as tissue samples, maternity history etc.) and then look at the details of each of these entities at later stages. Further meetings were held to discuss in detail the nature of the data provided, such as whether certain data was mandatory or not and the format that the data for each of the fields should take. This is explained further in Chapter 7. Other details were provided at later stages in the project, such as the data required relating to patients not part of the HNPCC study, i.e. NE/EC patients. Some information about the data stored was not provided at all and had to be found out during the meetings by asking questions such as "You say you want the system to store tissue samples; what information do you want recorded relating to these samples?".

5.3 Storyboarding

Once the data storage requirements had been established, it was still unclear how all the data would relate to each other and how a user of the database would typically want to manipulate the data. In order to gain a better understanding of how the system would interact with the users at St. James's, Nick Wood was asked to describe from start to finish the typical actions a user would take when using the system. The outcome is shown in Figure 5.1.

[Figure 5.1: flowchart. Starting from "Enter / Select a patient", the decisions shown are: First details entered on database?; Enter Questionnaire?; First details known?; Subsequent details known?; Questionnaire results known?; Patient results known?; Patient sample details known?. The actions shown are: Enter first visit details; Enter subsequent visit details; Enter questionnaire results; Record patient results; Record patient blood and tissue samples. The end states are "Questionnaire results complete" and "Patient details complete".]

Figure 5.1 Flowchart illustrating basic flow of events of user interaction

A use-case diagram was first considered for documenting these processes because a use case describes a sequence of actions, performed by a system, that yields a result of value to the user (Lefingwell (2000)). However, looking at the text description from the meetings with Nick Wood, this wouldn't be appropriate because he had given a very specific flow of events, indicating that a flow chart would be a better method of documenting user interactions. Although the flow chart depicted in Figure 5.1 does not show all the actions that a user may take, it gives a clear view of the typical actions that would occur when a user begins a session with the system in order to enter a complete patient profile. Not only did this process provide a much easier format in which to visualise the system, it was also used as the basis for designing the basic menu structure of the GUI.

5.4 Requirements Specification

As described in Chapter 2, Section 2.2, a requirements specification is needed so that the requirements of the system are made explicit. It also forms a contract between client and developer. This was produced at an early stage in the project and can be found in Appendix E. The various additions and modifications to this original specification that were requested and implemented during the project can be found in Appendix I.

6 Hardware and Software decisions

6.1 Hardware

The hardware decisions had been made before the project began and weren't open to change. Nick Wood stated clearly that he wanted the software to run on a standalone PC in an office at St. James's Hospital, the specification of which is given in Table 6.1.

Processor               Intel Pentium 900 MHz
RAM                     128MB
Free Hard Drive Space   20GB (approximately)
Monitor                 14 Inch SVGA

Table 6.1 PC Specification for system

The alternative suggested to Nick Wood was to have the database stored on a server machine on the University of Leeds network, to which all the PCs within Obstetrics and Gynaecology at St. James's Hospital are connected. This would allow several users access to the database concurrently. It was made clear, however, that it was unlikely that more than one person would need to use the database at a given point in time. Furthermore, Nick Wood was not happy with storing patient details on a machine other than one in an office at St. James's Hospital. It was pointed out that measures could be taken to ensure the data would be secure, but Nick Wood stated he was more concerned with having a database in place on one machine and saw moving to a networked architecture as unnecessary at present and something to be noted as a future enhancement.

6.2 Software

6.2.1 Choice of DBMS

A DBMS (database management system) is needed to store the physical database and is used to provide access to the data. The roles of a DBMS can be summarised as follows: controlling redundancy, restricting unauthorised access, enforcing integrity constraints and providing mechanisms for backup and recovery (Elmasri & Navathe (2000)). All these properties are required to meet the requirements of the HNPCC database.

The decision as to which DBMS should be used was also restricted by St. James's Hospital. The only DBMS available on any machine Nick Wood was prepared to use was MS Access 2000. The only alternative was to use a freely available DBMS such as MySQL. MS Access 2000 was chosen for a number of reasons. Firstly, it meets the DBMS requirements for this project. MySQL does not support referential integrity implicitly: all rules must be enforced by the developer, whereas they only need be defined in MS Access 2000. MS Access 2000 was also chosen because it is designed specifically to run on a Microsoft platform, which is the only platform available at St. James's Hospital, and provides easy connectivity to Microsoft development tools such as Microsoft Visual Basic.

6.2.2 Choice of application development software

The choice of application software was not as clear. As long as the software could be written to connect to the chosen DBMS, MS Access 2000, any application development tool could be used. In addition, support for building a GUI and adding functionality behind it is desirable to meet the data entry requirements of this project. Three development packages were identified as meeting these requirements and are summarised in Table 6.2.

Microsoft VBA (Visual Basic for Applications)
    Advantages: Provides an easy and efficient mechanism for working with tables in Microsoft Access.
    Disadvantages: Not as extensive in terms of functionality as Visual Basic.

Microsoft Visual Basic 6
    Advantages: Provides the same functionality as VBA and additional features.
    Disadvantages: Unlike VBA, wasn't designed specifically to interact with Microsoft Access. Also more complex than VBA.

Borland Delphi
    Advantages: Has object oriented features which allow code reuse and encapsulation.
    Disadvantages: Too complex for the needs of the HNPCC database system and does not connect as easily to Access 2000 as VBA or Visual Basic does.

Table 6.2 Comparison of development software

Microsoft VBA was chosen because it provides the best interaction with an MS Access 2000 database and provides all the necessary functionality for this project.

6.3 Final System Architecture

Although the system is to be exclusively developed and used in MS Access 2000, three distinct layers exist within the system architecture, as shown in Figure 6.1.

[Figure 6.1: layered diagram. From top to bottom: System Users (IBM PC); Access 2000 Forms; VBA Code; Access 2000 Tables. Annotations in the figure: "Changing the GUI to reflect user actions"; "Processing the data entered by the user"; "Retrieving data that was requested for viewing"; "Formulating and processing the data entered by the user into database commands".]

Figure 6.1 System Architecture

The forms in MS Access 2000 can manipulate the database tables directly, but due to the need for user friendly error checking and data processing, an additional layer is required. As Figure 6.1 shows, this architecture maps onto the PAD (Presentation, Application and Data) model.

7 Database Design and Construction

This chapter describes the processes taken in, and the outcome of, deriving the data model and physical database required to meet the data storage requirements of the HNPCC database. The chosen DBMS, Microsoft Access 2000, is a relational database system, so the database design is based on relational database principles.

7.1 Conceptual Design: High Level Data Modelling

7.1.1 Entities

The conceptual phase involves building a high level overview of the data. A large proportion of this had been completed by documenting the high level data storage requirements, which describe the overall data requirements. However, as the key output of the conceptual phase is a high level entity relationship diagram, the processes used for obtaining this are shown here.

Entities represent physical things in the real world (Elmasri & Navathe (2000)). In relation to this project they correspond to the physical things related to the HNPCC project that need to be stored, such as tissue samples. Certain key entities were identified from the beginning and remained stable throughout the development of the project; examples are patient, visit and blood test. Most of the entities were identified at the start of the initial iteration of the database design phase, but others, such as tissue slides, were identified during later iterations. These entities were identified from the data initially provided by Nick Wood using a bottom up approach.

7.1.2 Relationships

Having identified the various entities that will form the basis of the tables in the database, the next stage is to identify the relationships between these entities and the cardinality of these relationships. The cardinality of a relationship shows the number of relationship instances that an entity can participate in (Elmasri & Navathe (2000)). For example, a patient is only registered with one particular GP but a GP can have more than one patient registered with them, thus a 1:M (one to many) relationship between GP and patient is required. The other property of each relationship that must be identified is the participation (optional or mandatory) of entities in the relationship. The relationships must be identified for this project so that the database is constructed in a way that does not allow inconsistent data to be entered, such as a patient's visit information for a patient that doesn't exist. The result of this process is an Entity Relationship (ER) diagram, which is shown in Figure 7.1.

Figure 7.1 High level entity relationship diagram

To explain this diagram and how it was derived, some examples are given. Taking the HNPCC patient entity, it had to be determined which entities an HNPCC patient is directly related to. Each patient can have several subsequent study centre visits; this is depicted in the ER diagram using Rock Evans notation. The participation of relationships is also shown. For example, a patient stored on the database does not necessarily have to have had a visit, but a visit must have an associated patient. This is depicted by the dashed part of the relationship line originating from the patient and the solid line originating from the visit entity. A visit can only be held at one study centre and a study centre can hold several visits, as shown by the one to many relationship between study centre and visit on the ER diagram.

7.1.3 Entity Attributes

Each entity has one or more attributes. Attributes are the properties of an entity which define it. A number of meetings held with Nick Wood in the earlier stages of the project involved finding not only the full list of attributes for each entity but also the data type of each attribute. A full list of the attributes associated with the key entities and their properties is given in Appendix H.

7.1.4 Integrity Constraints

Once the attributes of each entity have been identified, integrity constraints need to be identified to ensure that invalid and inconsistent data cannot be entered into the database. As a trivial example, the age of a patient (or indeed any person) must not be below zero. This must be made explicit at the conceptual stage so that it is enforced when physically building the database, or at least checked at the GUI level.

With each attribute, the first thing to decide is whether or not the attribute must have data entered for it in the database, i.e. whether or not it is allowed to take a null value. This had to be discussed with Nick Wood as he hadn't previously provided this information. As an example, the additional comments attribute of a HADS questionnaire could take a null value, as not every patient will choose to provide these. However, each question in a HADS questionnaire must be answered and therefore cannot be stored as a blank field on the database. The integrity constraints, together with the acceptance of null values derived from discussions with Nick Wood, are documented in Appendix F. Some complex constraints identified could not be enforced by MS Access's table system, so they had to be checked and enforced at the GUI level.
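As an illustration only (it is not part of the HNPCC system), the two kinds of constraint discussed in Section 7.1.4 can be sketched in Python, using SQLite from the standard library in place of MS Access 2000. The table names, column names and the helper function try_insert are assumptions made for this example; the real attributes are those listed in the appendices.

```python
import sqlite3

def try_insert(conn, sql, params):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite for NOT NULL and CHECK violations
        return False

conn = sqlite3.connect(":memory:")

# Hypothetical patient table: age is mandatory and may not be negative
conn.execute(
    "CREATE TABLE patient ("
    " patient_study_id INTEGER PRIMARY KEY,"
    " age INTEGER NOT NULL CHECK (age >= 0))"
)

# Hypothetical HADS table: the answer is mandatory, the comments are optional
conn.execute(
    "CREATE TABLE hads ("
    " patient_study_id INTEGER NOT NULL,"
    " answer_1 INTEGER NOT NULL,"
    " additional_comments TEXT)"
)

results = {
    "valid_patient": try_insert(conn, "INSERT INTO patient VALUES (?, ?)", (1, 42)),
    "negative_age": try_insert(conn, "INSERT INTO patient VALUES (?, ?)", (2, -5)),
    "comments_null": try_insert(conn, "INSERT INTO hads VALUES (?, ?, ?)", (1, 3, None)),
    "answer_null": try_insert(conn, "INSERT INTO hads VALUES (?, ?, ?)", (1, None, "n/a")),
}
print(results)
```

In this sketch the database itself rejects the negative age and the missing answer, while the missing comments are accepted, which mirrors the null and non-null decisions described above.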

7.2 Logical Design 1: Defining the database schema

Once the entities, their attributes and the relationships amongst the data have been established, the next phase of the database design is to define the logical database schema.

7.2.1 Identifying Primary Keys

Every relation (a table in the physical database) must have a unique identifier for each record so that a record is not duplicated, which would lead to redundancy. Primary keys are also needed for referential integrity, as described in section 7.2.2. A primary key can be made up of one or more attributes and is guaranteed to be unique for each record in the relation. As an example, the Patient Study ID together with the visit number uniquely identifies a subsequent study centre visit.

7.2.2 Foreign Keys and Referential Integrity

While integrity constraints ensure that records within one relation are consistent, referential integrity is needed to ensure that the relationships between records in more than one table are consistent (Elmasri & Navathe (2000)). With regards to the HNPCC project, no serum blood samples can be recorded for a visit unless a blood test has been recorded for that visit. Furthermore, a blood sample can only be recorded if a patient has had at least one visit. To ensure this is reflected on the database, referential integrity is required.

Foreign keys are needed so that relations can be mapped onto each other, meaning referential integrity can be enforced by the DBMS. Foreign keys are copies of the primary key in the related table. Figure 7.2 shows five entities of the HNPCC database; attributes shown in bold are part of the primary key. In the blood test and serum blood sample tables, the primary key is also a foreign key. Reading from left to right, the Patient Study ID of the Patient relation is posted into the Patient Visit entity. With referential integrity enforced, no Patient Study ID can exist in the Patient Visit table unless there is a corresponding patient in the Patient relation. In similar fashion, there can be no blood test recorded unless it relates to a particular visit. The Storage ID in the Serum Blood Sample entity represents a foreign key which must also be present as the primary key of the Storage Location entity.
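The chain of foreign keys described in Section 7.2.2 can be sketched as follows. This is a minimal illustration in Python with SQLite rather than the MS Access 2000 implementation; the table and column names are assumed spellings of the entities in Figure 7.2, and the helper accepted is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys once this pragma is switched on
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patient (
    patient_study_id INTEGER PRIMARY KEY
);
CREATE TABLE patient_visit (
    patient_study_id INTEGER NOT NULL REFERENCES patient(patient_study_id),
    visit_number     INTEGER NOT NULL,
    PRIMARY KEY (patient_study_id, visit_number)
);
CREATE TABLE blood_test (
    patient_study_id INTEGER NOT NULL,
    visit_number     INTEGER NOT NULL,
    PRIMARY KEY (patient_study_id, visit_number),
    -- the primary key is also a foreign key onto the visit
    FOREIGN KEY (patient_study_id, visit_number)
        REFERENCES patient_visit(patient_study_id, visit_number)
);
""")

def accepted(sql, params):
    """Return True if the database accepts the statement, False otherwise."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# A visit for a patient that does not exist yet is rejected
orphan_visit = accepted("INSERT INTO patient_visit VALUES (?, ?)", (1, 1))
conn.execute("INSERT INTO patient VALUES (1)")
visit_ok = accepted("INSERT INTO patient_visit VALUES (?, ?)", (1, 1))

# A blood test for visit 2, which has not been recorded, is rejected
orphan_test = accepted("INSERT INTO blood_test VALUES (?, ?)", (1, 2))
test_ok = accepted("INSERT INTO blood_test VALUES (?, ?)", (1, 1))

print(orphan_visit, visit_ok, orphan_test, test_ok)
```

The point of the sketch is that the checks live in the schema: once the keys are declared, no application code is needed to stop a blood test being stored without its visit.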

Patient: (Patient Study ID)
Patient Visit: (Patient Study ID, Visit Number)
Blood Test: (Patient Study ID, Visit Number)
Storage Location: (Storage ID)
Serum Blood Sample: (Patient Study ID, Visit Number, Storage ID)

Figure 7.2 Example of referential integrity in the HNPCC database

Referential integrity is enforced by the DBMS once the foreign keys and relationships have been defined; this is discussed further in section 7.4.

7.2.3 Documenting the Initial Schema

Once all the foreign keys have been identified, the database schema can be documented. This takes the form of listing the relations together with their attributes and making the primary key for each relation explicit. For example, the Serum Blood Sample relation would be defined as:

Serum Blood Sample: (Patient Study ID, Visit Number, Label, Date Stored, Number of Aliquots, Location ID)

Here Patient Study ID and Visit Number form the primary key of the relation, as they uniquely define a record: no one patient can have more than one serum blood sample taken at any one visit. The combination of Patient Study ID and Visit Number also forms the primary key of the majority of other entities, because most entities relate to one patient visit. Location ID represents a foreign key in that it is the primary key of the Location relation. In order to maintain referential integrity in this instance, the Location ID field in the Serum Blood Sample relation must have an equivalent Location ID present in the Location relation.

7.3 Logical Design 2: Normalisation

Once the initial database schema had been established, the next stage was to normalise it to (or check that it is in) 3rd Normal Form. This process represents the final stage of the logical design in the database design methodology. Normalisation is needed to avoid redundancy and to avoid update and insert anomalies (Roberts (2000)). Because users at St. James's Hospital will need to insert and update their records frequently, normalisation is important in relation to this project.

7.3.1 Functional Dependencies

Normalisation is based on the functional dependencies that are identified in each relation (Elmasri & Navathe (2000)). A functional dependency exists if the value of one or more attributes determines the value of another attribute. In the HNPCC database, for example, a patient's weight and height will determine their BMI (body mass index). Formally, if a functional dependency between X and Y exists, i.e. $X \rightarrow Y$, this implies that for any two records $t_1$ and $t_2$, if $t_1(X) = t_2(X)$ then $t_1(Y) = t_2(Y)$ (Roberts (2000)). Continuing with the Serum Blood Sample example, the following functional dependency can be identified:

Patient Study ID, Visit Number → Label, Date Stored, Number of Aliquots, Location ID

A more complex example is that of the questionnaire schema:

Questionnaire: (Patient Study ID, Months after first visit, Questionnaire type, Question number, Question, Answer number, Answer, Additional comments)

1. Patient Study ID, Months after first visit, Questionnaire type → Additional comments
2. Questionnaire type, Question number → Question
3. Patient Study ID, Months after first visit, Questionnaire type, Question number → Answer number
4. Questionnaire type, Question number, Answer number → Answer

7.3.2 Normalising to 1st Normal Form

A relation is in 1NF if and only if all attributes are single valued (Roberts (2000)). This means that every attribute defined for the HNPCC database had to be checked so that it could not accept more than one value in any one record. Not only is this important in terms of normalisation, but the chosen DBMS, Microsoft Access 2000, does not accept multiple values for a given attribute. The majority of attributes provided by Nick Wood met this required property but some broke the 1NF rule. For example, "Current Use of COCP, how long?" would take a Yes/No value and a time duration. This required decomposition into a boolean attribute showing whether or not the patient had ever used COCP (the pill) and another for specifying the duration (which would be null if the patient hadn't ever used COCP).

7.3.3 Normalising to 2nd Normal Form

To test if a relation is in 2nd Normal Form the functional dependencies must be analysed. The left hand side of any functional dependency must not be a proper subset of a key unless the right hand side is a member of some key (Roberts (2000)). A key is a set of attributes which determines all the others and for which no subset of the key has the same property. As the primary keys have been defined for each table, the key for each relation will be the primary key.

Using the Questionnaire relation as an example, every functional dependency identified breaks the 2NF conditions. For example, the functional dependency:

Questionnaire type, Question number, Answer number → Answer

breaks the 2NF conditions because Questionnaire type, Question number and Answer number are together a proper subset of the primary key and Answer is not part of the primary key. To create a set of relations which satisfy the 2NF conditions, new relations must be created which contain the partial keys and their dependent attributes. As each functional dependency found for the Questionnaire relation broke the 2NF conditions, a new relation had to be set up for each one. The results are as follows:

1. Filled Questionnaire: (Patient Study ID, Months after first visit, Questionnaire type, Additional comments)
2. Question: (Questionnaire type, Question number, Question)
3. Answer: (Patient Study ID, Months after first visit, Questionnaire type, Question number, Answer number)
4. Choice: (Questionnaire type, Question number, Answer number, Answer)
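The definition of a functional dependency given in Section 7.3.1 can be turned into a small check over sample records. The sketch below is illustrative only: the function name holds, the attribute names and the three sample rows are invented for the example and are not taken from the HNPCC data.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    Two rows that agree on every lhs attribute must also agree on every
    rhs attribute, which is the t1/t2 condition from Section 7.3.1.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key
        if seen.setdefault(key, value) != value:
            return False
    return True

# Invented sample rows shaped like the unnormalised Questionnaire relation
rows = [
    {"patient": 1, "type": "HADS", "question_number": 1,
     "question": "Question text A", "answer_number": 2},
    {"patient": 2, "type": "HADS", "question_number": 1,
     "question": "Question text A", "answer_number": 0},
    {"patient": 1, "type": "HADS", "question_number": 2,
     "question": "Question text B", "answer_number": 1},
]

# Questionnaire type and question number determine the question text
question_fd = holds(rows, ["type", "question_number"], ["question"])

# They do not determine the answer: two patients answered question 1 differently
answer_fd = holds(rows, ["type", "question_number"], ["answer_number"])

print(question_fd, answer_fd)
```

A check of this kind can only show that a dependency is violated by the sample data; whether a dependency should hold in general remains a design decision taken with the user.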

Filled Questionnaire represents a questionnaire that a patient completes together with any comments. Question defines the list of questions over all questionnaires. Answer represents all the answers that a patient has given and Choice stores the list of choices a patient can select for each question.

7.3.4 Normalising to 3rd Normal Form

Again the functional dependencies must be analysed to check that a relation is in 3rd Normal Form. A relation is in 3rd Normal Form if, for every functional dependency, the left hand side forms a superkey or the right hand side is a member of some key (Roberts (2000)). This means that the attributes on the left hand side must contain the primary key or themselves be the primary key, or the attributes on the right hand side must be part of the primary key. Continuing with the questionnaire example, each of the new relations meets the 3NF conditions as well as the 2NF conditions. This is because there is only one functional dependency for each of these new relations, which is that of the primary key determining one other attribute.

Initially most of the schemas documented were already normalised to 3NF, so the normalisation process for this project was more a matter of checking that the 3NF conditions were met than of actually normalising the schemas. A complete set of the normalised schemas for the HNPCC database can be found in Appendix F.

7.4 Physical Design and Implementation

Once the normalised schema documented in Appendix F had been completed, it had to be placed physically onto the chosen DBMS, Microsoft Access 2000. Firstly, a table is created for each relation. The attributes for each table then have their properties set, as shown in Figure 7.3 for the Number of live births field recorded from a patient's first visit.

Figure 7.3 Setting attribute properties in MS Access 2000

The number of live births example shows that the attribute's datatype will be stored as an integer, defaults to zero, must be greater than or equal to zero, is null and is not indexed.
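The superkey test used in Section 7.3.4 can be sketched with the standard attribute closure technique, which is a general textbook method and not something the project itself implemented. The function names closure and is_superkey and the shortened attribute names are assumptions made for the example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left hand side is already covered, the right hand side follows
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, relation, fds):
    """A set of attributes is a superkey if its closure covers the whole relation."""
    return set(relation) <= closure(attrs, fds)

# The Choice relation from Section 7.3.3 with its single dependency
choice = ["type", "question_number", "answer_number", "answer"]
choice_fds = [(["type", "question_number", "answer_number"], ["answer"])]
choice_ok = is_superkey(["type", "question_number", "answer_number"],
                        choice, choice_fds)

# In the unnormalised Questionnaire relation, dependency 2 has a left hand
# side that is not a superkey, so the relation fails the condition
questionnaire = ["patient", "months", "type", "question_number",
                 "question", "answer_number", "answer", "comments"]
questionnaire_fds = [
    (["patient", "months", "type"], ["comments"]),
    (["type", "question_number"], ["question"]),
    (["patient", "months", "type", "question_number"], ["answer_number"]),
    (["type", "question_number", "answer_number"], ["answer"]),
]
question_lhs_ok = is_superkey(["type", "question_number"],
                              questionnaire, questionnaire_fds)

print(choice_ok, question_lhs_ok)
```

The sketch covers only the "left hand side forms a superkey" half of the condition; the alternative, that the right hand side is a member of some key, would need the candidate keys to be enumerated as well.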

Referential integrity is enforced by creating a relationships diagram between the tables, the result of which is shown in Figure 7.5. The primary to foreign key relationships are defined in the relationships window of MS Access 2000. Figure 7.4 shows how the relationship between a patient and a GP was set up. GP_ID is the primary key of the GP table and has a related GP_ID in the Patient table. Simply clicking Enforce Referential Integrity gives the DBMS the responsibility of enforcing that a GP_ID value in the Patient table corresponds to a GP_ID value in the GP table. Cascading updates are required so that if a GP_ID changes in the GP table it is also changed in the Patient table. However, cascading deletes are not wanted because if a GP is deleted the patient should not be deleted automatically.

Figure 7.4 Defining relationship properties in MS Access 2000

7.5 Further iterations to the methodology

During the development of the GUI, certain problems with the database were found which required amendments. As an example, the satisfaction questionnaire had one question that could have an open ended answer rather than a predefined choice. This was a problem because the previous schema only allowed an answer to come from the choice table. As a result, an additional table named free_answer was required. Another issue arose with question numbers, in that an order was required for them when using loops to check them in the code. This was dealt with by adding one field to contain the question number as it appeared on the questionnaire, e.g. 9(i), and another to hold its position on the questionnaire.

Although many constraints were also checked, or only checked, at the user interface level, the importance of good database design became apparent during development: in circumstances where the code failed, the database raised the appropriate errors, which was a highly useful method of testing and checking the GUI functionality.
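The relationship settings chosen in Section 7.4 (cascading updates on, cascading deletes off) can be approximated outside MS Access 2000. The sketch below uses Python with SQLite; the ON UPDATE CASCADE and ON DELETE RESTRICT clauses are SQLite's nearest equivalents and are an assumption about how the Access options map across, as are the column names and the sample GP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the rules
conn.executescript("""
CREATE TABLE gp (
    gp_id INTEGER PRIMARY KEY,
    name  TEXT
);
CREATE TABLE patient (
    patient_study_id INTEGER PRIMARY KEY,
    gp_id INTEGER REFERENCES gp(gp_id)
        ON UPDATE CASCADE     -- a changed GP_ID is copied to the patient rows
        ON DELETE RESTRICT    -- a GP with patients cannot simply be removed
);
INSERT INTO gp VALUES (10, 'Example GP');
INSERT INTO patient VALUES (1, 10);
""")

# Changing the GP's identifier is propagated to the patient record
conn.execute("UPDATE gp SET gp_id = 20 WHERE gp_id = 10")
patient_gp = conn.execute(
    "SELECT gp_id FROM patient WHERE patient_study_id = 1"
).fetchone()[0]

# Deleting the GP is refused, so the patient record is left untouched
try:
    conn.execute("DELETE FROM gp WHERE gp_id = 20")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

patients_left = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
print(patient_gp, delete_blocked, patients_left)
```

In this sketch the patient survives because the delete of the GP is refused outright, which is one way of meeting the requirement that a patient should not be deleted automatically along with their GP.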

Figure 7.5 Final Database Layout

8. Implementation

The purpose of this chapter is to detail the internal implementation of the system. The main objectives of this project are concerned with the ability of a user to enter, edit and view information, so an example is given showing what happens internally when a user saves a set of first visit details. An extended set of screenshots for the system can be found in Appendix G.

8.1 GUI Creation

The GUI was created by building forms relating to groups of data that would need to be entered at any one time, e.g. a set of blood and tissue samples. Additional screens were created when too many controls were being used on a form. The forms were shown to Nick Wood throughout the project for feedback and modified as needed.

8.2 Example: Saving a Patient's First Visit Details

Once a user has entered a patient, they must enter the first visit details before other parts of the data can be recorded. They can access the screen to do this from the patient visit menu. The first screen they are presented with is shown in Figure 8.1.

Figure 8.1 Saving a patient visit (annotation in the figure: "User selects a patient from database which is activated as the current patient.")

8.2.1 Selecting information from the database

The list of patients is populated by running a piece of SQL against the database, which is shown in Figure 8.2.

    SELECT P.Patient_Study_ID, P.Second_Name, P.First_Name,
           P.Birth_Date, P.Referal_Date
    FROM Patient P
    WHERE P.Patient_Study_ID NOT IN
          (SELECT Patient_Study_ID FROM First_Study_Centre_Visit)
    ORDER BY P.Second_Name;

Figure 8.2 Creating a patient list for entering first visit details

The result of this query is a list of patients together with the details required by the user to distinguish them if any patients have the same first and second names. Only patients that have no first visit details are retrieved, to avoid a user attempting to enter details already present on the database. The user can select the patient from the drop down menu, an example of which is shown in Figure 8.3.

Figure 8.3 Example of list from which a patient can be selected

In similar fashion the user selects the study centre, which is also read from the database via an SQL query.

8.2.2 Selecting information from the database

Certain fields require the user to enter information depending on what has been entered for other fields. For example, if a patient is currently a smoker, or has smoked in the past, then the total duration that the patient has smoked for is required; otherwise this field should remain empty. The field for entering the total duration a patient has smoked is disabled if the patient has never smoked and enabled otherwise, as depicted in Figure 8.4.

Figure 8.4 Changing interface based on user input
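The behaviour of the query in Figure 8.2 can be tried out on a small scale. The sketch below runs the same SQL in Python against an in-memory SQLite database instead of MS Access 2000; the column types and the three sample patients are invented for the example, and only the query text comes from Figure 8.2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patient (
    Patient_Study_ID INTEGER PRIMARY KEY,
    Second_Name TEXT, First_Name TEXT,
    Birth_Date TEXT, Referal_Date TEXT
);
CREATE TABLE First_Study_Centre_Visit (
    Patient_Study_ID INTEGER PRIMARY KEY
);
-- Invented sample data: patient 3 already has first visit details
INSERT INTO Patient VALUES (1, 'Young', 'A', '1960-01-01', '2002-01-10');
INSERT INTO Patient VALUES (2, 'Adams', 'B', '1955-05-05', '2002-02-20');
INSERT INTO Patient VALUES (3, 'Baker', 'C', '1970-07-07', '2002-03-30');
INSERT INTO First_Study_Centre_Visit VALUES (3);
""")

# The query from Figure 8.2, unchanged apart from layout
query = """
SELECT P.Patient_Study_ID, P.Second_Name, P.First_Name,
       P.Birth_Date, P.Referal_Date
FROM Patient P
WHERE P.Patient_Study_ID NOT IN
      (SELECT Patient_Study_ID FROM First_Study_Centre_Visit)
ORDER BY P.Second_Name;
"""

rows = conn.execute(query).fetchall()
ids = [row[0] for row in rows]
print(ids)
```

Patient 3 is left out because a first visit already exists, and the remaining patients come back in surname order. One general point about NOT IN is that it returns no rows at all if the subquery produces a null, so the approach relies on Patient_Study_ID never being null in the visit table.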

Assuming, as in Figure 8.4, that the user clicks yes for ex-smoker, the simple code in Figure 8.5 is run:

    Private Sub chkexsmokeryes_click()
        ' Keep the yes/no tick boxes mutually exclusive
        chkexsmokerno.value = False
        chkexsmokeryes.value = True
        ' An ex-smoker must supply a smoking duration
        txbsmokingduration.enabled = True
    End Sub

Figure 8.5 Code to change interface

8.2.3 Checking the required information has been entered

As Microsoft Access 2000 is a DBMS, it checks that integrity constraints are not broken before committing data to the database. If a user attempts to enter null values into fields which are not allowed to take null values then an error is raised. This error can be caught and relayed to the user in an appropriate way to explain that more data is required. This approach is flawed, however, in that the error does not identify which of the fields the user has missed out. This is an issue particularly with first and subsequent visit details, as there are over fifty fields to fill in. It places a burden on the user if they have to look over all the data they have entered to spot the mistake they have made.

The decision was therefore made to explicitly inform the user about each mandatory field that had not been filled in. When the user clicks the Save button the savepatientvisit function is called, which checks each field in turn. The code in Figure 8.6 checks that the visit date has been entered: it tests whether the text box on the GUI has been filled in and displays the appropriate message if it hasn't.

    If settboxvalue _
        (fm!txbvisitdate, "You have not entered the date of the visit", dateofvisit) = False Then
        Exit Function
    End If

Figure 8.6 Example of checking for missing data

The settboxvalue function returns true or false depending on whether or not the required data was entered, so the calling code can exit the current function, allowing the user to fill in the missing data. It also sets the dateofvisit variable to the value entered by the user on the form.

    Function settboxvalue(tb As TextBox, message As String, data) As Boolean
        ' The Text property can only be read while the control has the focus
        tb.setfocus
        If tb.text = "" Then
            ' Nothing entered: tell the user which field is missing
            MsgBox message, vbexclamation, "Missing Data"
            settboxvalue = False
            Exit Function
        Else
            ' Pass the entered value back to the caller through data
            settboxvalue = True
            data = tb.value
        End If
    End Function

Figure 8.7 Function to check for an empty text box

The function shown in Figure 8.7 checks text box fields on the GUI; other functions were written to check combo boxes (drop-down menus) and tick boxes. Assuming that the user had entered all the necessary information apart from the visit date, the message box in Figure 8.8 would be displayed.

Figure 8.8 Example of missing data dialogue

8.2.4 Saving From the GUI to the Database

Once all the data has been verified in the code, the data entered by the user has to be committed to the database. In early versions of the system this was done by building an SQL INSERT statement based on what the user had entered. However, with over fifty fields to insert, the code soon became unmanageable, even when only small modifications were required. The alternative was the use of RecordSet objects, which allow modifications to be made to the tables in MS Access 2000 through VBA code. Recordsets are an example of a DAO (Data Access Object); these objects are created in VBA/Visual Basic code in order to work with tables in a database (Holzner, 1998). Figure 8.9 shows how this is achieved for a patient's first visit details.

    ' Declare recordset and database DAO objects
    Dim dbs As DAO.Database
    Dim rst As DAO.Recordset
    Set dbs = CurrentDb

    ' Set the recordset to the data that needs to be updated
    Set rst = dbs.openrecordset("study_centre_visit",, dbappendonly)
    rst.addnew

    ' Update each field in turn based on what the user has entered
    rst.fields("patient_study_id") = patientid
    rst.fields("study_centre_id") = studycentre
    rst.fields("date_of_visit") = dateofvisit
    rst.fields("height") = height
    ...

    ' Commit to the database
    rst.update
    rst.close

Figure 8.9 Code to save from form to database

Before this code is executed the user is asked to confirm that they wish to proceed, as shown in Figure 8.10.

Figure 8.10 Example of confirmation request displayed

If the user selects Yes and the save is completed successfully, the message shown in Figure 8.11 is displayed.

Figure 8.11 Example of confirmation displayed
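The idea behind Section 8.2.4, assigning values field by field instead of hand-building one long INSERT string, can be sketched outside Access. The Python/SQLite example below is an illustration only and is not how the system itself is written: `add_record` and `VISIT_COLUMNS` are hypothetical names, the four columns are those visible in Figure 8.9, and the sample values are invented.

```python
import sqlite3

# Columns taken from Figure 8.9; the real table has over fifty fields.
VISIT_COLUMNS = ("patient_study_id", "study_centre_id", "date_of_visit", "height")


def add_record(connection, table, allowed_columns, values):
    """Insert one row built from a column -> value mapping.

    The table and column names come from program constants, never from the
    user, so they may be formatted into the statement; the values themselves
    are always passed as bound parameters.
    """
    unknown = set(values) - set(allowed_columns)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    columns = list(values)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    with connection:  # commits on success, rolls back if execute() raises
        connection.execute(sql, [values[c] for c in columns])


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE study_centre_visit (
        patient_study_id INTEGER NOT NULL,
        study_centre_id  INTEGER NOT NULL,
        date_of_visit    TEXT    NOT NULL,
        height           REAL
    )
""")

# Invented values standing in for what the user typed on the form.
add_record(conn, "study_centre_visit", VISIT_COLUMNS, {
    "patient_study_id": 1,
    "study_centre_id": 4,
    "date_of_visit": "2003-01-15",
    "height": 1.72,
})
print(conn.execute("SELECT * FROM study_centre_visit").fetchall())
```

Adding a field then means adding one entry to the mapping rather than editing a long statement in several places, which is the maintainability gain the report attributes to the Recordset approach.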

8.2.5 Error Handling

When runtime errors occur in Microsoft Access 2000, a description is displayed and the code is shown on the screen at the point of failure. This situation is unacceptable in the context of this project because the intended users are not familiar with VBA, so a more intuitive method of reporting errors has been implemented. When a runtime error occurs in VBA, a code is returned signalling what error has occurred. This code can be checked and, based on its value, an appropriate message displayed. If an error occurs for which no specific handler exists, a standard message is displayed. Figure 8.12 illustrates how error handling has been implemented throughout the system.

    Err_savePatientVisit:
        If ERR.Number = 2450 Then
            MsgBox "There are 3 pages of details to fill in for a visit, please ensure that you have accessed all these pages have been filled in", vbexclamation, "Missing Data"
            savepatientvisit = False
            Exit Function
        Else
            MsgBox "An unknown error occurred", vbcritical
            savepatientvisit = False
            Exit Function
        End If

Figure 8.12 Example of error checking

8.3 Example: Editing a Patient's First Visit Details

Once any record has been entered into the system, the user has the option to edit it. In the first visit details example the user loads in all the details and can then change them, as shown in Figure 8.13.

Selecting a patient results in the first visit details being loaded on to the screen for editing. This is done by an onclick event handler which calls editpatient().

Figure 8.13 Editing Records

The patient list on the edit screen is only populated with patients who already have first visit details recorded. The same approach is taken for viewing records, but all the controls are locked and there is no save button present.
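The error-reporting scheme of Section 8.2.5 (look up the error number, show a specific message if one is known and a standard message otherwise) can be sketched compactly. This Python version is an analogy under stated assumptions, not the VBA handler of Figure 8.12: `AppError`, `KNOWN_ERRORS` and `run_save` are hypothetical names, the message for error 2450 is a shortened paraphrase of the one in the figure, and error number 9999 is invented for the demonstration.

```python
# Error numbers for which a tailored, user-friendly message exists.
# 2450 is the number used in Figure 8.12; the wording here is a paraphrase.
KNOWN_ERRORS = {
    2450: "There are 3 pages of details to fill in for a visit; "
          "please ensure that all of them have been filled in.",
}
DEFAULT_MESSAGE = "An unknown error occurred"


class AppError(Exception):
    """Hypothetical stand-in for a runtime error carrying a number."""

    def __init__(self, number):
        super().__init__(f"runtime error {number}")
        self.number = number


def user_message(error):
    """Translate an error number into text suitable for the end user."""
    return KNOWN_ERRORS.get(error.number, DEFAULT_MESSAGE)


def run_save(action):
    """Run a save action; return (True, None) or (False, message)."""
    try:
        action()
    except AppError as error:
        return False, user_message(error)
    return True, None


def missing_pages():
    raise AppError(2450)


def something_else():
    raise AppError(9999)  # invented number with no specific handler


print(run_save(lambda: None))
print(run_save(missing_pages))
print(run_save(something_else))
```

Keeping the number-to-message table in one place means a new known error needs one new entry, whereas the VBA version repeats the check inside each function's handler.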

9. Testing

9.1 Overview

In the early stages of the project, testing took place alongside implementation. For example, functionality was added to insert a patient and tested, and then further implementation was carried out to enter a patient's visit details. However, this testing was informal in that there was no structured test plan or testing procedure. Formal testing was only carried out once the implementation had been completed, so that test scripts would not require frequent amendments and only needed to be re-run if bugs were found.

9.2 System testing

In order to document the testing process and ensure that the functionality could be tested as fully as possible, test scripts were produced. This was done so that, for each requirement, a formal test could be performed to verify that the requirement had been satisfied. An example test script is shown in Figure 9.1. A complete example test script run can be found in Appendix H.

    Test_ID | Related Requirement | Test Description | Expected Result | Pass/Fail | Comments
    1 | 2.4 | Click Save | A message is displayed stating that no patient has been selected | Fail | No message displayed

Figure 9.1 Test script format

Each test case relates to a requirement and has a description of what has to be performed, the expected outcome of the action, and whether the system passed the test, together with any comments if the system failed. For highly repetitive tasks one test case was written, with the description stating the number of times it had to be run. A separate test script was written for each section of information that had to be entered, edited and viewed, i.e. patients, patient visit details, patient results etc. This was done not only because it provided a logical separation of the functionality to test but also because it tested the various different modules of code.

9.3 User acceptance testing

Whilst the system testing could show that the system worked correctly in terms of the functional requirements, the non-functional requirements such as usability had to be tested by the end users.

Towards the later stages of development Nick Wood was given a working version of the system. Although the primary purpose of this was to test that what had been produced was acceptable to the user, it also highlighted bugs which hadn't yet been found by the formal testing process. The user acceptance tests were by no means complete tests of the system but consisted of illustrating the core functionality of the system to Nick Wood. In addition, near to the completion date, Nick Wood went through the process of entering a complete patient profile with me present in order to test the functionality and highlight any problems before final delivery.

10. Deployment

10.1 Installation

The final machine on which the system is to be installed was on order and not present at the time of completion, so the system had to be installed on another machine at St. James's Hospital on a temporary basis for about a month. A decision was made to separate the underlying data from the GUI into two physically separate files. This was done so that any failure in the GUI or code could be corrected in one file whilst the data remained safe and stable in the other. In addition, only the data needs to be backed up, so separating the tables means that only the data file needs replicating rather than the entire GUI.

10.2 Security and Backup

Two steps were taken to secure the system: firstly a password was set on the back-end tables, and then a password was set on the GUI to prevent people accessing the data from there. However, more rigid security settings can be applied by creating a workgroup information file. This was carried out on a copy of the system, but Nick Wood said he only wanted this extra level of security on the final machine. As the process is relatively simple, it will be done once the final machine for the system is in place.

10.3 Design Documentation

A stated deliverable was the provision of some design documentation to aid future developers of the system. Much of what would be needed by someone with MS Access 2000 and VBA knowledge is contained within this report. The key issues surrounding the design of the system have therefore been summarised in a separate document and left with Nick Wood should he hand over the system to a future developer.

11. Evaluation

11.1 Defining project success

The primary aim of this project is to deliver a database system to St. James's Hospital which meets the needs of Nick Wood and is deployed and then used by the intended users. The criteria for evaluating this project have therefore been based on whether these requirements have been met and whether or not what has been delivered is satisfactory from the end users' point of view. In addition, the quality of the implementation, the management of the project and the technology chosen have been evaluated.

11.2 Adjustments to the minimum requirements

During the development of the HNPCC database, many of the detailed data storage requirements changed and additional ones were added. Even a small change, such as requiring an additional attribute, required changes to the database schema, the application code and the GUI, and additional testing during and after development. A summary of the extensions to all the requirements that occurred during the project can be found in Appendix I.

As a result, there was not enough time to build the SQL query builder, which was one of the original minimum requirements. However, an initial prototype of the query builder was produced with some working functionality. After this was discussed with Nick Wood, it was decided that the completion of the query builder was not important: he was far more concerned with getting the data storage and entry exactly as he wanted it, and requested that these changes be implemented rather than the SQL builder being completed. In addition, Nick Wood decided that he was not concerned with an interface for importing external data unless all the changes and additions outlined in Appendix I had been met.

11.3 Minimum requirements

11.3.1 Were the Minimum Requirements Met?

Given the detailed list of requirements in the requirements specification (Appendix E), each requirement was examined in turn and tested.

For St. James's Hospital to make any use of the HNPCC database, completion of the data storage and data requirements was mandatory, which is why they were fully tested. Because the requirements were so dynamic during the project, two of the previous minimum requirements (SQL builder and data import) were not fully implemented but were replaced by considerably extending the existing requirements for the database and GUI, as agreed with Nick Wood during development (see Appendix I).

The test scripts were re-run until every test case passed on the latest version of the system. It can therefore be concluded, on the evidence of the testing process, that the modified minimum requirements for this project have been met. The minimum non-functional requirements were also met, as described in Chapter 10.

11.3.2 How well were the Minimum Requirements Met?

Although the testing process shows that the system does what it was specified to do, it does not indicate how well it achieves this and to what extent. The following criteria have therefore been devised, together with justification and analysis against the system.

To what extent does the system exactly do what the user is after?

For each test case the system may have passed, but a pass only indicates that the bare minimum has been achieved. As an example, a system could allow entry of a patient but provide no user-friendly error checking at the interface level and be designed in a way which makes it difficult for the user to enter a patient. A great deal of work was put into checking the data entered at the interface level to provide feedback to the user about missing and invalid data, and into making the interface as easy to use as possible. In terms of data entry and data storage, little more could have been done to meet the end user requirements. In terms of backing up the data, however, more could have been done, such as providing an interface to back the data up.

How good is the overall quality of what has been delivered?

Although the system can be shown to have met the minimum requirements and even gone further to completely satisfy the user needs, this does not mean that a good quality product has been delivered from either a user point of view or a technical point of view. The underlying database has been based on proven and tested principles of entity-relationship modelling and normalisation.

As the chosen DBMS (MS Access 2000) is a relational one and the schema implemented on top of it can be shown to be in 3rd Normal Form, the data model can be taken as valid. In terms of the GUI, some improvements could be made to its quality. In total the system has over ninety separate screens, so the users frequently navigate around the GUI. When these screens are opened and closed the transition is noticeable, i.e. it could be a lot smoother. When viewing certain sets of information, several screens have to be opened to fill in all the fields. On slower machines especially this does not happen particularly smoothly and

given more time could have been looked into and implemented in a way which does not require multiple screens to be opened at any one time.

Is there any missing functionality which would further enhance the system in terms of satisfying the minimum requirements?

Although testing shows that the functionality needed to meet the requirements is delivered, it does not show the extent of the functionality implemented, nor what functionality has been added or should be added to improve the system. One area that could have been improved is an on-line help system, so that a user who becomes stuck could have direct access to help. The GUI was designed to be easy enough not to require help, but as it grew in complexity the perceived value of on-line help increased in relation to menu navigation and explanation of the built-in error messages. Given extra time, an easy-to-use interface could have been created for the database owner (Nick Wood) to set up the security, rather than leaving it as the responsibility of the person deploying the system.

11.4 Extended requirements implemented

Although the query builder wasn't implemented, a tool was built to calculate the ages of the blood and tissue samples that have been entered in the system. This was requested at a later stage in the project but was implemented because the majority of the minimum requirements had been satisfied and the general query builder had not been implemented. Initial testing has shown this to work, but full testing on sample data will be required for complete confidence.

In addition, an export tool was built to export data from the database to Microsoft Excel spreadsheets. Because of the relational nature of the database, simple table outputs weren't acceptable. As an example, a GP_ID in the patient table is meaningless to the end user, who is interested in the name of the patient's GP. As a result, several SQL queries were formulated and used as the basis for the data that could be exported.

Because the system was developed in MS Access 2000, much of the work in linking the GUI to the data could have been left to Access using tools such as interface wizards, eliminating the need for much of the VBA code. However, because the system needed to be as user friendly as possible, the decision was made to implement all the GUI functionality from code to give complete control and ensure a better quality application could be delivered.
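The export problem described in Section 11.4, where a raw `GP_ID` means nothing to the end user, is the kind of thing a join resolves. The sketch below uses Python with SQLite and writes CSV (which Excel can open) in place of the Access queries and Excel export actually built. Only the `GP_ID` column in the patient table comes from the report; the `GP` table, its `GP_Name` column, the function name and the sample rows are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table; the report only mentions Patient.GP_ID.
    CREATE TABLE GP (
        GP_ID   INTEGER PRIMARY KEY,
        GP_Name TEXT NOT NULL
    );
    CREATE TABLE Patient (
        Patient_Study_ID INTEGER PRIMARY KEY,
        Second_Name      TEXT NOT NULL,
        First_Name       TEXT NOT NULL,
        GP_ID            INTEGER REFERENCES GP
    );
    INSERT INTO GP VALUES (10, 'Dr Patel');
    INSERT INTO Patient VALUES (1, 'Smith', 'Ann', 10), (2, 'Jones', 'Bob', NULL);
""")

# LEFT JOIN keeps patients that have no GP recorded; every column is aliased
# so that the header row is predictable.
EXPORT_QUERY = """
    SELECT P.Patient_Study_ID AS Patient_Study_ID,
           P.Second_Name      AS Second_Name,
           P.First_Name       AS First_Name,
           G.GP_Name          AS GP_Name
    FROM Patient P
    LEFT JOIN GP G ON G.GP_ID = P.GP_ID
    ORDER BY P.Second_Name, P.Patient_Study_ID
"""


def export_patients(connection, out):
    """Write the joined patient data to a file-like object as CSV."""
    cursor = connection.execute(EXPORT_QUERY)
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


buffer = io.StringIO()
export_patients(conn, buffer)
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```

The exported rows carry the GP's name instead of the key, and a patient with no GP recorded still appears, with an empty final column.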

As an example, the GUI delivered allows a user to freely navigate a screen and enter data in any order, and only checks the data when they wish to save. Using GUI controls bound to the data in Access means that for certain data checking only MS Access messages are displayed, and a user is trapped in a control until they have entered valid data.

In terms of viewing the data, simple subforms with table displays could have been used. Because some datasets have a large number of attributes, a decision was made to provide a more user-friendly means of displaying data. Forms identical to the data entry forms were created and code written to fill them with the data from the database. From this point of view I feel that the data entry requirements have been exceeded by producing a high quality graphical user interface, as opposed to an average one which is not so user friendly or as visually acceptable.

11.5 User satisfaction

What are the opinions of end users?

This criterion is included because it allows the end users to freely and subjectively provide feedback on the system which may not be gained through more formal methods such as questionnaires. In order to evaluate user satisfaction, a key user of the system (Naomi Quinton) was given a copy of the system and asked to enter a complete HNPCC patient profile corresponding to the main sequence of events identified in Chapter 5. She was also asked to note any problems and comment on what was good and what was bad about the system. An e-mail received from Naomi Quinton can be found in Appendix J. The feedback is positive, but the comment is made that it would be preferable to be able to override warning messages. As the GUI is built on a database which contains many non-null fields, this would require significant modification and discussion with Nick Wood on the acceptability of missing data, but it could definitely be considered as a valid modification for the future.

The evaluation of the system carried out by the key end user, Nick Wood, can also be found in Appendix J. To summarise his evaluation, he is generally pleased and has got what he expected, but has identified a few issues he would like resolved and a series of future enhancements he would like to see implemented.

How easy is the system to use?

A key non-functional requirement of the system is that it is easy to use. In order to quantify the ease with which the system could be used, Naomi Quinton was asked to time how long it took her to complete the task of entering a complete patient profile. This time was then compared to the time it

took me, the developer of the system, to complete entry of a patient profile. The results are shown in Table 11.1.

    Person carrying out task | Time to complete task
    End User | 40 minutes
    Developer | 25 minutes

Table 11.1 Comparison of task completion times

Given that this was Naomi Quinton's first time using the system, and that she received no help and was not given any instructions on how to use it, this can be considered a relatively good completion time, since the developer had frequently tested the system with data and knew in detail how it all fits together.

11.6 Quality, maintainability and extendibility of code

Although the end users of the HNPCC database are happy with what they have been given, they will never see the code that is running the system or how the various parts of the system integrate with one another, i.e. what is good on top is not necessarily good underneath. This is an important issue because poor quality code design leads to more software failures and greater effort in improving and maintaining the software. There are several key enhancements that Nick Wood would find very useful, and the ease with which these could be provided is assessed by the criteria of this section.

Is the code readable to someone with a reasonable knowledge of VBA/Visual Basic?

Code that is not easy to understand can be confusing both to the developer and to those that may have to improve and extend it in the future. In order to judge this, Chris O'Hara, a final year computing student with extensive knowledge of VBA, performed a code review on some of the key VBA modules. The conclusions from this were that the code is generally well structured but comments are missing in certain areas where it is not obvious what the code is doing.

When bugs were found, were they isolated and easy to fix or widespread and lengthy to fix, i.e. is the system maintainable?

Frequent occurrence of widespread bugs indicates poor code design; the purpose of this criterion is therefore to verify the overall code structure. The majority of bugs found during testing were isolated and easy to trace. Few bugs were found which could not be traced to the task that was being performed. However, although the test scripts were written to be as comprehensive as possible, there are a huge number of permutations that can occur when a user interacts with the system. Time was not available to exhaustively test every scenario that could arise.

Is it easy to add new functionality to the existing code?

As there is a strong chance further enhancements will be required after release, this is an important criterion. In the later stages of development it became clear that the software was not easily extendable. As an example, new functionality had to be added which allowed entry of new NE/EC patients. This did not prove to be a quick task, although it was easy to accomplish.

11.7 Software/Hardware choices

The choice of hardware platform and DBMS software was constrained from the start, as described in Chapter 6, so only the development software is evaluated here. As explained in Section 11.6, it was found that the software produced was not particularly extendable. This was due to the constraints of the application development software being used. The code shown in Figure 11.1 shows how a RecordSet object is prepared in order to save a patient visit to the database. Depending on whether an HNPCC patient or an NE/EC patient is being dealt with, If statements have to be used to set up the recordset correctly. This is a fairly messy way of writing code to deal with two things that are similar but not completely the same.

Figure 11.1: Example of poor code

Because VBA is not a fully object-oriented language, inheritance is not supported and polymorphism is available only in a limited form, through interfaces. In a language such as C++ or Java generic functions could have been written

implementing the core functionality of saving a patient visit. Separate functions could have been added for dealing with the specialisations of these patients. A simple call such as p.savevisit() could then be made and the correct function selected at run time through dynamic binding and polymorphism. A better choice of software would have been Microsoft Visual C++ or Borland Delphi, as they allow connectivity to an MS Access 2000 database, offer easy GUI development and provide object-oriented functionality. In addition, these languages provide a cleaner separation between GUI and database, which is better for data independence.

In addition to these flaws with VBA, error handling was also an issue when developing the system. In VBA the error code for each possible error has to be identified and then dealt with from within the module of code that raised the error. C++ and Java support explicit error handling through the use of exceptions, which allow the error handling code to be decoupled from the main code. This would have been useful in this project because one exception handler could have been written for many common errors, rather than handling them at various places throughout the code and so making it more difficult to maintain.

11.8 Extending the solution further

There are numerous ways in which the system could be extended to provide more useful functionality to the end users. Some of these were identified as additional requirements at the start of the project whilst others became apparent during development.

11.8.1 Data Analysis Tools

Initially an SQL query builder was proposed and a prototype was developed. Even had this been fully developed, though, more could be done to search for patterns in the data in order to determine factors that can lead to cancer. Data mining tools build decision trees based on the data present so that decision rules can be derived which allow predictions to be made.

For example, applying data mining tools to the database would allow rules such as "smoking increases the probability of getting small bowel cancer by twenty percent" to be identified from the existing data.

11.8.2 Mailmerge

Nick Wood often has to send letters to the patients that will be stored on the database and, as an additional feature, would have liked an easy interface to allow him to link the addresses stored on the database to Microsoft Word documents.

11.8.3 External Data Access

There may be occasions where users wish to access the data remotely, either from home or from another hospital. This is not possible at present as the system resides on a stand-alone machine. It would be useful to provide a web front end to the system so that users could enter and edit data off site. In addition, people who need the information and are allowed to look at the data, but aren't part of the research study, could have access from a location far away from St. James's Hospital.

11.9 Project management

No monetary costs were associated with this project, so the evaluation has been based on whether or not the milestones were achieved on time and whether or not all the deadlines were adhered to.

The first criticism is that the risk management of the project was poor. Because the project was new from both the users' and the developer's point of view, the likelihood that the requirements would be highly unstable was not taken into account. Had it been, it would have been known from the outset that there would not be time to build a complex query builder to analyse the data.

The early stages of the project went according to plan, but the implementation took longer than planned because the GUI required constant modification and refinement on presentation to the end user. This is something that should have been taken into account in the project plan. The final system was deployed later than scheduled, but with all the core requirements in place and the user satisfied.

11.10 Conclusions

Overall, the system delivered to St. James's Hospital has been accepted with positive feedback from the end users and functions as it was originally intended to. The methodologies chosen were helpful in solving the problem and were generally adhered to, except in circumstances where a different approach was needed. The deadlines were not always adhered to but were not seriously compromised either.

The technology chosen is satisfactory in terms of whether it can meet the user needs, but from a development point of view better choices could have been made as to which application development technologies should have been used.
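To make the object-oriented alternative discussed in Section 11.7 concrete, the following minimal sketch shows a shared save routine with per-patient-type specialisations selected by dynamic binding. It is written in Python rather than the C++, Java or Delphi named in the report, and every class, method and field name in it is hypothetical; the report does not list which fields differ between HNPCC and NE/EC visits, so placeholder names are used.

```python
class PatientVisit:
    """Core behaviour shared by every kind of patient visit."""

    table = "study_centre_visit"

    def __init__(self, patient_id, date_of_visit):
        self.patient_id = patient_id
        self.date_of_visit = date_of_visit

    def extra_fields(self):
        """Hook overridden by subclasses to add their own fields."""
        return {}

    def save_visit(self, store):
        """Build the record once; the store is a list standing in for a table."""
        record = {
            "patient_study_id": self.patient_id,
            "date_of_visit": self.date_of_visit,
        }
        record.update(self.extra_fields())
        store.append((self.table, record))
        return record


class HnpccVisit(PatientVisit):
    def __init__(self, patient_id, date_of_visit, hnpcc_detail):
        super().__init__(patient_id, date_of_visit)
        self.hnpcc_detail = hnpcc_detail  # placeholder for HNPCC-only data

    def extra_fields(self):
        return {"hnpcc_detail": self.hnpcc_detail}


class NeEcVisit(PatientVisit):
    def __init__(self, patient_id, date_of_visit, ne_ec_detail):
        super().__init__(patient_id, date_of_visit)
        self.ne_ec_detail = ne_ec_detail  # placeholder for NE/EC-only data

    def extra_fields(self):
        return {"ne_ec_detail": self.ne_ec_detail}


store = []
visits = [HnpccVisit(1, "2003-01-15", "a"), NeEcVisit(2, "2003-02-01", "b")]
for p in visits:
    p.save_visit(store)  # the same call for both patient types
print(store)
```

The calling code contains no If statement on patient type; supporting a further kind of patient would mean adding one subclass and leaving `save_visit` untouched.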

Bibliography

Atzeni, P, Ceri, S, Paraboschi, S & Torlone, R, (1999), Database Systems: Concepts, Languages and Architectures, McGraw Hill.

Bowman, S, Emerson, L, & Darnovsky, M, (1996), The Practical SQL Handbook: Using Structured Query Language, 3rd Edition, Addison-Wesley Developers Press.

Davies, B, (1998), Rapid Application Development (RAD), URL: http://www.comp.glam.ac.uk/soc_server/research/gisc/radbrf1.htm, [27th April 2003].

Dix, A, Finlay, J, Abowd, G & Beale, R, (1998), Human Computer Interaction, 2nd Edition, Prentice Hall.

Elmasri, R & Navathe, S, (2000), Fundamentals of Database Systems, 3rd Edition, Addison-Wesley.

Holzner, S, (1998), Visual Basic 6 Black Book, 1st Edition, The Coriolis Group.

Larman, C, (1997), Applying UML and Patterns, Prentice Hall.

Leffingwell, G, (2000), Rational Software White Paper, Features, Use Cases, Requirements, Oh My!, Rational.

Kotonya, G & Sommerville, I, (1998), Requirements Engineering: Processes and Techniques, John Wiley.

Kruchten, P, (2000), The Rational Unified Process: An Introduction, 2nd Edition, Addison-Wesley.

Microsoft Corporation, (2002a), Access 2000 Security: Create a New Workgroup Information File, URL: http://office.microsoft.com/assistance/2000/acnewworkgroup.aspx, [28th November 2002].

Microsoft Corporation, (2002b), Access Security FAQ, URL: http://support.microsoft.com/default.aspx?scid=/support/access/content/secfaq.asp, [3rd December 2002].

Mott, P, & Roberts, S, (1999), Introduction To Databases Module Notes, University of Leeds.

O'Connell, F, (1996), How to Run Successful Projects II: The Silver Bullet, Prentice Hall.

Roberts, S, (2000), Database Principles and Practice Module Slides, University of Leeds.

Sommerville, I, (1995), Software Engineering, 6th Edition, Addison Wesley.

Shneiderman, B, (1998), Designing the User Interface, 3rd Edition, Addison Wesley.

Thalheim, B, (2000), Entity-Relationship Modelling: Foundations of Database Technology, Springer.

Watson, R, (2002), Data Management: Databases and Organisations, 3rd Edition, Wiley.

Wood, L, (1998), User Interface Design: Bridging the Gap From User Requirements to Design, CRC Press.

Glossary of Terms

    1NF          1st Normal Form
    2NF          2nd Normal Form
    3NF          3rd Normal Form
    DBMS         Database Management System
    ER Diagram   Entity Relationship Diagram
    GUI          Graphical User Interface
    MS Access    Microsoft Access
    OO           Object Oriented
    PC           Personal Computer
    RAD          Rapid Application Development
    SQL          Structured Query Language
    UML          Unified Modelling Language
    VBA          Visual Basic for Applications

Appendix A: Personal Reflection

In terms of what has been developed for St. James's Hospital, I am happy with what has been delivered because the feedback from the two key people who will use the system has been positive. I am pleased that I have managed to deliver a system which will be useful to someone other than myself. I feel that a lot more could be achieved with the system in the future in terms of analysing the patient data to find interesting patterns and statistics, but that I have laid a solid foundation for doing so. Because no data had been entered into the system at the time of delivery, such analysis tools could not have been fully tested.

I learnt that the most obvious technology to use for solving a problem is not necessarily the best. Knowing now how quickly the system has grown in magnitude, I would have chosen a different architecture and different software to develop the system.

I have also learnt a lot about working with end users of computer systems. Looking back at the meetings with Nick Wood, I feel that on occasions I took the discussion into technical areas that were not of his concern, which is something I would aim to avoid if working with end users in the future. I also felt that during some meetings I could have been more organised in terms of gathering and documenting all the data requirements.

The fact that the requirements changed throughout the project caused problems in sticking to the project schedule. Given the chance to do the project again, I would have stated from the start that changes to requirements would lead to other functionality being sacrificed and would have made such changes explicit in the project plan.

In terms of spreading the workload, I felt there was a large imbalance between what was achieved in the first and second halves of the project. I would estimate that around seventy percent of what was achieved was achieved in the second half of the project.

I therefore feel more effort should have been put into the project in the initial stages to avoid a heavier workload later on. Overall I feel the project has been a success because a system has been delivered on time which meets the requirements of the end users, but I would have managed certain aspects of it differently had I done it again.

Appendix B: Summary of initial meeting with Nick Wood

- Discussed the research project that Nick is involved with.
- Discussed why a database system is required to store and manipulate data from the project. I made general observations about how the existing data structures will need to be changed and he approved this.
- Discussed the need for providing a mechanism to allow non-users of the database system to easily submit their results to users of the database, to be inserted into the database, ideally with the use of simple forms.
- Discussed in detail the nature of the existing data in terms of what the attributes actually mean, what their datatypes were and why, and whether or not they should be allowed to be null.
- Discussed the additional information that will need to be recorded which at present has no existing tables or data forms: records of what is contained in the blood and tissue banks and the results of research experiments on these samples.
- Discussed the need for keeping records of patients who have cancer but are not part of the current research project. Originally Nick wanted this to be part of a totally separate database. However, once I established that this data wasn't already stored anywhere else, I put it to him that it should be part of just one database, to which he agreed.
- At present Nick hasn't got any information about the format that the research results from the blood and tissue will take.
- Discussed the issues of security. Nick only sees the database running on a single machine. However, as it will be used by more than one person, I pointed out that this approach could be limited. As the machine will be part of the university network I suggested it might be better to place the database on a server, but he seemed worried about security implications. I think he will need persuading!
- Will need ways of manipulating data in different ways every time the package is run, but there will be some general reports that will need to be produced at certain times.
- General appearance of the package is up to me, but there are certain graphics on the research website that it would be nice to incorporate into the package.

Appendix C: Initial Project Schedule

Appendix D: Revised Project Schedule

Appendix E: Requirements Specification

A Functional Requirements:

* denotes mandatory data

1. Data Storage Requirements

1.0 General Patient Information:

1.0.0 The system will store the following information for all patients:
- First name*
- Second name*
- Address*
- Post-code*
- Country
- Date of birth*
- Ethnic origin
- Home telephone number
- Work telephone number
- Mobile telephone number

1.1 Patient's Medical History

1.1.0 The system will store the following information relating to the patient's medical details:
- Height*
- Weight*
- BMI*
- Age of menarche*
- Age of menopause*
- Whether or not the patient has ever had irregular periods and if so for how long.
- Whether or not the patient has ever used COCP (the pill) and if so for how long.*
- Whether or not the patient has ever had hormone replacement therapy (HRT) and if so for how long.*
- Whether or not the patient has ever had MIRENA and if so for how long.*
- Whether or not the patient has ever had a depot or implant (progestagen).*
- Whether or not the patient has ever had clinical progestagen and if so for how long.*

1.1.1 The system will store the following information about the patient's maternity history:
- Number of pregnancies
- Number of live births
- Number of miscarriages
- Number of terminations
- Age at birth of first child
- Age at birth of last child
- Total time spent breast feeding

1.2 Patient's Family's Medical History

1.2.1 The system will store the following information about the patient's family cancer history:
- History of CRC cancer
- History of endometrial cancer
- History of small bowel cancer
- History of renal tract cancer
- History of ovarian cancer
- History of any other cancer
- Amsterdam II criteria met
- Any family members with an MMR gene mutation
- Any family members who are MMR gene mutation carriers

1.3 Patient Visit Information:

1.3.0 The system will store the following information relating to the patient's menstrual details taken on a particular visit date:
- Whether or not the patient currently has irregular periods.*
- Number of periods per year (if irregular).
- Whether or not the patient currently experiences inter-menstrual bleeding (IMB).*
- Whether or not the patient currently experiences post-coital bleeding (PCB).*

1.3.1 The system will store the following information relating to the patient's hormone and COCP intake, taken from patients on a particular visit date:
- Whether or not the patient currently uses COCP (the pill) and if so for how long.*
- Whether or not the patient currently has hormone replacement therapy (HRT) and if so for how long.*
- Whether or not the patient currently has MIRENA and if so for how long.*
- Whether or not the patient has a depot or implant (progestagen).*
- Whether or not the patient currently has clinical progestagen and if so for how long.*
- Details of any other hormone treatment.

1.4 Patient's Test Result Information:

1.4.0 The system will store the reports of the TVS that is performed on the patients.
1.4.1 The system will store the endometrial thickness (mm) of patients found from the TVS.
1.4.2 The system will store the reports of hysteroscopy that are performed on the patients.
1.4.3 The system will store whether or not polyps were found during the hysteroscopy.
1.4.4 The system will store the histology of endometrial biopsies.

1.5 Study Centre Information:

1.5.0 The system will store the following information about the study centres which the patients visit:
- Name of study centre*

1.5.1 The system will store the following information about GPs:
- Initials*
- Surname*
- Address of Practice*

1.6 Further Research Information:

Further cancer research will be carried out on the tissue taken from the endometrial biopsy, but at present no information is available on the structure that this data will take.

1.7 Blood Bank Information:

1.7.0 The system will keep an inventory of what blood samples are stored in the research blood bank and which patients they come from.

1.8 Tissue Bank Information:

1.8.0 The system will keep an inventory of what tissue samples are stored in the research tissue bank and which patients they come from.

2 Questionnaire Requirements:

2.1 The system will record the following from each questionnaire that is given to patients:

Attribute | Domain
Age Range | 25-30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65, 66-70, Over 70
Ethnic Origin | White, Indian, Black African, Pakistani, Black Caribbean, Bangladeshi, Chinese
Number of visits to outpatient clinic (at any hospital) | 1, 2, 3, 4, 5, 6, More than 6
Satisfied with visit(s) | Always, Usually, Sometimes, Rarely, Never
Reason for rarely or never being satisfied with visit (only applicable if patient answered Rarely or Never above) | <Free flow text>
Patient Received Information Sheet | Yes, No
Patient Received Information Sheet | Yes, No
Patient waiting time after appointment time | Less than 10 minutes, 10-20 minutes, 20-30 minutes, 30-40 minutes, 40-50 minutes, More than an hour
Patient informed about the reason for long wait (only applicable if patient answered 30-40 minutes, 40-50 minutes or More than an hour above) | Yes, No
Additional Comments | <Free flow text>

The following is the set of statements in the questionnaire to which the patient can answer Strongly Disagree, Disagree, Unsure, Agree or Strongly Agree:
- The instructions on how to find the clinic were clear and easy to follow.
- The patient information pamphlet was clear and easy to understand.
- The reception staff were friendly and helpful.
- The doctor used medical terms that I didn't understand.
- I felt confident that the doctor/nurse would stop the procedure if I requested or was in too much discomfort.
- The findings of the investigations were explained to me in a simple understandable way.
- I found the outpatient hysteroscopy uncomfortable.
- I found the one-stop screening clinic an acceptable experience.
- I am happy to attend this clinic on an annual basis.
- The screening process made me feel awkward.
- I would recommend this clinic to other female family members.
- I am more anxious about my risk of cancer after attending the clinic.

3. Data Entry Requirements

3.0 Standard Data Entry:

3.0.0 The system will provide a GUI (graphical user interface) for entering and editing the general information of a patient as outlined in 1.0.0.
3.0.1 The system will provide a GUI for entering and editing patients' medical information outlined in 1.1.0 and 1.1.1.
3.0.2 The system will provide a GUI for entering and editing patients' family medical information outlined in 1.2.1.
3.0.3 The system will provide a GUI for entering and editing information collected from a patient as outlined in 1.3.0 and 1.3.1.
3.0.4 The system will provide a GUI for entering and editing test result information as outlined in 1.4.0 to 1.4.4.
3.0.5 The system will provide a GUI for entering information collected from a patient as outlined in 1.3.0 and 1.3.1.
3.0.6 The system will provide a GUI for entering the results of research information on endometrial tissue (which will be) outlined in 1.6.
3.0.7 The system will provide a GUI for keeping the inventory of blood samples as outlined in 1.7.0 up to date.
3.0.8 The system will provide a GUI for keeping the inventory of tissue samples as outlined in 1.8.0 up to date.

3.1 Data Validation

3.1.0 The system must prevent users from inserting invalid data and invalid datatypes.
3.2.0 The system must warn the user that they tried to enter invalid information.

4. Data Analysis Requirements

No formal requirements specified by user.

B Non-Functional Requirements:

5. Security Requirements

5.1 The system must prevent unauthorised users from accessing the system and modifying or viewing any of the data through any other means.

6. Usability Requirements

6.1 The system must be easy to use by those at St. James's Hospital who will not have knowledge of any particular database management system or database principles.

C Extended Requirements:

1 It would be desirable for the system to be accessible from multiple machines over a network if security is not violated.
2 It would be desirable to provide a web interface for non-users of the system to easily submit results to be placed in the underlying data by an authorised user.
3 It would be desirable if on-line help is provided for users if they get stuck.

Appendix F: Database Schema

This appendix shows the complete database schema, in that the attributes for each table are listed together with their properties. Bold italic attributes represent the primary key and italic unbold attributes represent a foreign key. Only the tables for HNPCC patients are shown, as there is little difference between the tables corresponding to these patients and NE/EC patients. The questionnaire schema has been illustrated in Chapter 7. Each table is in 3rd Normal Form.

Patient

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Title | No | <Miss, Mrs or Ms> | Text
First Name | No | None | Text
Second Name | No | None | Text
Address 1 | No | None | Text
Address 2 | No | None | Text
Address 3 | Yes | None | Text
Post Code | No | None | Text
Home phone number | Yes | None | Integer
Work phone number | Yes | None | Integer
Mobile phone number | Yes | None | Integer
E-Mail | Yes | None | Text
Date of birth | No | Must be less than the current date. | Date (dd/mm/yy)
Age | No | To be calculated from the date of birth. | Integer
Ethnic Origin | No | <white, black, asian, chinese, other> | Text
GP | No | None | Text
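The constraints on the Patient table can be illustrated with a short sketch. Python is used here purely for illustration and is not the implementation language of the delivered system; the function names, error wording and the whole-year age calculation are assumptions, since the schema only states that Age is "to be calculated from the date of birth".

```python
from datetime import date
from typing import List

# Domains taken from the Patient table constraints.
ALLOWED_TITLES = {"Miss", "Mrs", "Ms"}
ALLOWED_ETHNIC_ORIGINS = {"white", "black", "asian", "chinese", "other"}


def age_on(date_of_birth: date, today: date) -> int:
    """Age in whole years on `today` (assumed interpretation of the Age rule)."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


def patient_errors(title: str, date_of_birth: date, ethnic_origin: str, today: date) -> List[str]:
    """Return a list of constraint violations (hypothetical wording)."""
    errors = []
    if title not in ALLOWED_TITLES:
        errors.append("Title must be one of Miss, Mrs or Ms")
    if date_of_birth >= today:
        errors.append("Date of birth must be less than the current date")
    if ethnic_origin.lower() not in ALLOWED_ETHNIC_ORIGINS:
        errors.append("Ethnic origin is not in the allowed list")
    return errors


if __name__ == "__main__":
    today = date(2003, 4, 24)
    print(age_on(date(1967, 9, 10), today))
    print(patient_errors("Mr", date(1967, 9, 10), "White", today))
```

Passing `today` explicitly rather than reading the system clock keeps the checks repeatable, which is convenient when writing test scripts against fixed dates.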

Study Centre

Attribute | Allow null value (yes/no) | Constraints | Datatype
Study Centre ID | No | None | Autonumber
Name | No | None | Text

GP

Attribute | Allow null value (yes/no) | Constraints | Datatype
GP ID | No | None | Autonumber
Name | No | None | Text
Address 1 | No | None | Text
Address 2 | No | None | Text
Address 3 | Yes | None | Text
Post Code | No | None | Text

Study Centre Visit

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Date of visit | No | Must be less than or equal to the current date | Date (dd/mm/yy)
Study_Center_ID | No | None | Integer
Height (metres) | No | > 0 | Double
Weight (kilograms) | No | > 0 | Double
BMI (Body mass index) | No | = Weight / Height^2 | Text
Current Smoker | No | None | Boolean
Ex Smoker | Yes | Can't be true if Current Smoker = false | Boolean
Duration Smoked | Yes | Must be null if current or ex-smoker = true | Boolean
Age of Menarche | Yes | Must be null if Age of Menopause is not null | Integer
Menopause Reached | No | None | Boolean
Age of Menopause | Yes | > 0 and must be null if menopause reached is false | Integer
PMB | Yes | Must be null if menopause reached is false | Boolean
Periods regular | Yes | Must be null if menopause reached is false | Boolean
Period irregularity | Yes | < < 3 years, 3-5 years, 5-10 years, > 10 years > | Text
Current use of: COCP, HRT, MIRENA IUS, Progestin, Progestogen, Tamoxifen, Aromatase Inhibitor (separate attributes) | No | None | Boolean
Previous use of: COCP, HRT, MIRENA IUS, Progestin, Progestogen, Tamoxifen, Aromatase Inhibitor (separate attributes) | Yes | Can't be null if current use is true | Boolean
Clinical Trial | No | <None, IBIS, ATAC, CAPP2> | Text
IUCD | No | <Never, Currently, Previously> | Text
Sterilised | No | None | Boolean
Number of livebirths | No | >= 0 | Integer
Number of terminations | No | >= 0 | Integer
Number of miscarriages | No | >= 0 | Integer
Number of pregnancies | No | Must be equal to the Number of livebirths + Number of terminations + Number of miscarriages | Integer
Systemic chemotherapy | No | None | Boolean
Radiotherapy of abdomen | No | None | Boolean
Previous history of: Colon/Rectum, Small Bowel, Renal tract or Other Cancer (separate attributes) | No | None | Text
Cancer history details | Yes | Can't be null when other is specified for previous cancer | Text
Family history of: Endometrium, Colon/Rectum, Small Bowel, Renal tract, Ovary or Other Cancer (separate attributes) | No | None | Text
Family cancer history details | Yes | Can't be null when other is specified for family cancer | Text
Amsterdam II Criteria Met | No | None | Boolean
MMR Gene Identified In Family Member | No | None | Boolean

Hysteroscopy

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Cervix | No | <Normal, Abnormal> | Text
Appearance Of Cavity | No | <Normal, Abnormal> | Text
Abnormal Details | Yes | Can't be null if Appearance Of Cavity is abnormal | Text
Report | Yes | None | Text
Polyps | No | None | Boolean

Trans Vaginal Ultrasound

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Probe Used | No | <TV, TA> | Text
Endometrial Thickness | No | > 0 | Integer
Endometrial Appearance | No | <Normal, Polyp, Suspicious, Other> | Text
Details | Yes | Not null if Endometrial Appearance is Other | Text
Report | Yes | None | Text
Left/Right Ovary results | No | <Not Seen, Normal, Simple Cyst (<4, >4 cm), Complex Cyst (<4, >4 cm), Other> | Text
Left/Right Ovary details | Yes | Can't be null if ovary results are Other | Text

Endometrial Biopsy

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Histology Report Number | No | None | Text
Result | No | None | Text
Result Details | Yes | None | Text
Report | Yes | None | Text

Blood Test

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
CA125 | No | None | Text

EDTA Blood Sample

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Label | No | = Patient Study ID + Year of Visit | Text
Date Stored | No | None | Date (dd/mm/yy)
Storage ID | No | None | Text
DNA Extracted | No | None | Boolean
DNA Retrieval Date | Yes | None | Date (dd/mm/yy)
Amount DNA Retrieved | Yes | None | Integer
DNA Storage ID | Yes | None | Text

Serum Blood Sample

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Label | No | = Patient Study ID + Year of Visit | Text
Date Stored | No | None | Date (dd/mm/yy)
Storage ID | No | None | Text
Num Aliquots | No | 1-10 | Integer

Tissue Sample

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Label | No | = Patient Study ID + Year of Visit | Text
Date Stored | No | None | Date (dd/mm/yy)
Storage ID | No | None | Text
RNA Extracted | No | None | Boolean
RNA Date | Yes | None | Date (dd/mm/yy)
RNA Storage ID | Yes | None | Integer
Extra Aliquots | No | None | Integer
Date Extra Aliquots Found | Yes | None | Date (dd/mm/yy)
Number Of Aliquots | Yes | None | Integer
Aliquots Storage ID | Yes | None | Integer

Tissue Slides

Attribute | Allow null value (yes/no) | Constraints | Datatype
Patient Study ID | No | None | Text
Visit Number | No | > 0 | Integer
Label | No | = Patient Study ID + Year of Visit | Text
Date cut | No | None | Date (dd/mm/yy)
Number of slides cut | No | 1-10 | Integer
Storage ID | No | None | Text
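A few of the derived and cross-field constraints in the Study Centre Visit table can be sketched as follows. This is a minimal Python illustration rather than the code of the delivered system. The function and parameter names are hypothetical, and the pregnancy check assumes the intended sum is live births + terminations + miscarriages.

```python
from typing import List, Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """BMI = Weight / Height^2, with both inputs required to be > 0."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must both be greater than 0")
    return weight_kg / height_m ** 2


def visit_errors(
    visit_number: int,
    livebirths: int,
    terminations: int,
    miscarriages: int,
    pregnancies: int,
    menopause_reached: bool,
    age_of_menopause: Optional[int],
) -> List[str]:
    """Check a subset of the visit constraints; messages are hypothetical."""
    errors = []
    if visit_number <= 0:
        errors.append("Visit number must be greater than 0")
    if min(livebirths, terminations, miscarriages) < 0:
        errors.append("Counts must be greater than or equal to 0")
    # Assumed intent of the 'Number of pregnancies' constraint.
    if pregnancies != livebirths + terminations + miscarriages:
        errors.append("Pregnancies must equal livebirths + terminations + miscarriages")
    if not menopause_reached and age_of_menopause is not None:
        errors.append("Age of menopause must be null if menopause is not reached")
    if age_of_menopause is not None and age_of_menopause <= 0:
        errors.append("Age of menopause must be greater than 0")
    return errors


if __name__ == "__main__":
    print(round(bmi(70.0, 1.75), 2))
    print(visit_errors(1, 2, 0, 1, 3, False, None))
```

Returning a list of messages, instead of raising on the first problem, would let a data entry form report every violated constraint at once.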

Appendix G: System Screenshots

Figure G.1 shows a list of all the screens making up the GUI. The key data entry forms from this list are shown in this appendix.

Figure G.1 List of GUI screens
Figure G.2 Opening menu
Figure G.3 Entering a Patient
Figure G.4 Entering first visit (page 1)
Figure G.5 Entering first visit (page 2)
Figure G.6 Entering first visit (page 3)
Figure G.7 Entering Results (page 1)
Figure G.8 Entering Results (page 2)
Figure G.9 Entering Samples (page 1)
Figure G.10 Entering Samples (page 1)
Figure G.11 Entering Satisfaction Questionnaire (page 1)
Figure G.12 Entering Satisfaction Questionnaire (page 2)

Appendix H: Example Test Script

Test_ID | Related Requirement | Test Description | Expected Result | Pass/Fail | Comments
1 | 1.0.0, 3.0.0 | Enter patient ID LDS909 | LDS909 appears in the patient ID box. | |
2 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the hospital ID is missing. | |
3 | 1.0.0, 3.0.0 | Enter a hospital ID AK989102 | AK989102 appears in the hospital ID box. | |
4 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the patient's title is missing. | |
5 | 1.0.0, 3.0.0 | Attempt to enter the title Mr | An error is displayed stating value isn't in list. | |
6 | 1.0.0, 3.0.0 | Select the title Ms | The titles Miss, Ms and Mrs appear in the Title drop down and Ms can be selected. | |
7 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating the first name hasn't been entered. | |
8 | 1.0.0, 3.0.0 | Enter the first name Sandra | Sandra appears in the first name box. | |
9 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the second name hasn't been entered. | |
10 | 1.0.0, 3.0.0 | Enter the second name Jones | Jones appears in the second name box. | |
11 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating the address 1 field hasn't been filled in. | |
12 | 1.0.0, 3.0.0 | Enter '78, Manor View' in the address 1 box | '78, Manor View' appears in the address 1 box. | |
13 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating the address 2 field hasn't been filled in. | |
14 | 1.0.0, 3.0.0 | Enter 'Leeds' in the address 2 box | 'Leeds' appears in the address 2 box. | |
15 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating the address 3 field hasn't been filled in. | |
16 | 1.0.0, 3.0.0 | Enter 'West Yorkshire' in the address 3 box | 'West Yorkshire' appears in the address 3 box. | |
17 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the post code hasn't been entered. | |
18 | 1.0.0, 3.0.0 | Enter the postcode 'LS2 9PY' | 'LS2 9PY' appears in the post code box. | |
19 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Date of Birth hasn't been entered. | |
20 | 1.0.0, 3.0.0 | Enter the home phone number '0113 778954' | The phone number '0113 778954' appears in the home phone number box. | |
21 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Date of Birth hasn't been entered. | |
22 | 1.0.0, 3.0.0 | Enter the work phone number '0161 6546671' | The phone number '0161 6546671' appears in the work phone number box. | |
23 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Date of Birth hasn't been entered. | |
24 | 1.0.0, 3.0.0 | Enter the mobile phone number '07898667456' | The phone number '07898667456' appears in the mobile phone number box. | |
25 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Date of Birth hasn't been entered. | |
26 | 1.0.0, 3.0.0 | Enter the e-mail address 'Sandra@yahoo.com' | The e-mail address 'Sandra@yahoo.com' appears in the e-mail box. | |
27 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Date of Birth hasn't been entered. | |
28 | 1.0.0, 3.0.0 | Enter a date of birth which is greater than the current date | The date of birth entered appears in the date of birth field. | |
29 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Referral Date hasn't been entered. | |
30 | 1.0.0, 3.0.0 | Select the ethnic origin 'White' | White is selected from the list. | |
31 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the Referral Date hasn't been entered. | |
32 | 1.0.0, 3.0.0 | Enter the referral date 04/10/2001 | The referral date 04/10/2001 appears in the referral date box. | |
33 | 1.0.0, 3.0.0 | Click Save | A message is displayed stating that the patient's GP hasn't been selected. | |
34 | 1.0.0, 3.0.0 | Select a GP | The GP is selected. | |
35 | 1.0.0, 3.0.0 | Click Save | An error is displayed stating the date of birth entered is greater than today's date. | |
36 | 1.0.0, 3.0.0 | Change the date of birth to 10/09/1967 | The date of birth is changed. | |
37 | 1.0.0, 3.0.0 | Click Save | A confirmation prompt is given asking if the user wants to save the patient. | |
38 | 1.0.0, 3.0.0 | Click Cancel and check the Patient table | No changes have been made. | |
39 | 1.0.0, 3.0.0 | Click Save and then Yes | A confirmation message is given stating the save completed successfully. | |
40 | 1.0.0, 3.0.0 | Check the Patient table | The patient has been saved with the correct details that were entered on the form. | |
41 | 1.0.0, 3.0.0 | Open up the edit patient form and check the details of the patient entered can be loaded in | The details entered for 'Sandra Jones' appear in all the correct boxes. | |
42 | 1.0.0, 3.0.0 | Change all of the fields of Sandra Jones and click save | All the text boxes clear. | |
43 | 1.0.0, 3.0.0 | Check the Patient table | The changes made in the above test are reflected in the table. | |
44 | 1.0.0, 3.0.0 | Open up the View Patient Form and view the edited patient | The details of the edited patient can be displayed correctly. | |
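The order in which the test script expects Save to report problems can be sketched as a single check that returns the first failure. This is a Python illustration only, not the code of the delivered form; the field keys, the message wording and the function name `first_save_error` are hypothetical. The field order is taken from tests 2 to 35 above.

```python
from datetime import date
from typing import Optional

# (form key, label) pairs in the order the test script expects them to be checked.
REQUIRED_FIELDS = [
    ("hospital_id", "hospital ID"),
    ("title", "title"),
    ("first_name", "first name"),
    ("second_name", "second name"),
    ("address_1", "address 1"),
    ("address_2", "address 2"),
    ("address_3", "address 3"),
    ("post_code", "post code"),
    ("date_of_birth", "date of birth"),
    ("referral_date", "referral date"),
    ("gp", "GP"),
]


def first_save_error(form: dict, today: date) -> Optional[str]:
    """Return the first message Save would show, or None if the form can be saved."""
    for key, label in REQUIRED_FIELDS:
        if not form.get(key):
            return f"The {label} has not been entered."
    # Only checked once every required field is present, as in test 35.
    if form["date_of_birth"] > today:
        return "The date of birth entered is greater than today's date."
    return None


if __name__ == "__main__":
    form = {"patient_id": "LDS909"}
    print(first_save_error(form, date(2003, 4, 24)))
```

Keeping the checks in one ordered list means a scripted test can fill in one field at a time and compare each message against the expected result column.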

Appendix I: Change Requests

This appendix lists the changes and additions to the original requirements that were requested at various stages throughout the project. These changes and additions were implemented and tested and, as explained in Chapter 11, were more important for the end user than the query wizard, which wasn't implemented.

Change / Addition Requested: Blood and tissue sample labels to be generated by system.
Description: Associated with each blood sample bottle, tissue sample and tissue slide is a label made up of the patient's study ID and the year the sample was taken. It was requested that the system generate each label, display it on the GUI and save it to the database.

Change / Addition Requested: Additional information for NE/EC patients.
Description: Initially only a list of NE/EC patients with their blood and tissue samples was requested, but later on it was requested that their medical histories and results of endometrial biopsy also be stored on the system.

Change / Addition Requested: Additional HADS questionnaire required.
Description: Initially only one questionnaire was required to be stored on the system, but it was then requested that an additional HADS questionnaire be incorporated, which relates to a patient's general well-being.

Change / Addition Requested: Additional SF36 questionnaire required.
Description: In addition to a HADS questionnaire, it was also requested that an SF36 questionnaire be stored, which relates to a patient's recent health and physical activities.

Change / Addition Requested: Changes to satisfaction questionnaire.
Description: Once implemented, the questions of the satisfaction questionnaire were changed by Nick Wood. This meant that the question data had to be repopulated and a number of changes made to the code and GUI.

Change / Addition Requested: Basing blood and tissue sample dates on the visit date.
Description: Initially every date for the blood and tissue samples had to be entered by the user, but it was later requested that they be initially set to the date of the patient's corresponding visit and changed in the GUI by the user if needed.

Change / Addition Requested: Additional information required for samples.
Description: Once the GUI and code had been completed for entering, editing and viewing blood and tissue samples, the following additional information to be stored and manipulated was required:
1. DNA extraction for EDTA blood samples together with date, amount extracted and location.
2. RNA extraction for tissue samples together with date, amount extracted and location.
3. Extra Aliquots extraction together with amount and date stored.

Change / Addition Requested: Additional information required for hysteroscopy results.
Description: Initially this was simply the hysteroscopy report and whether or not polyps were present. Throughout the project various additional attributes were requested: cervix findings, appearance of cavity, details of cavity if abnormal.

Change / Addition Requested: Additional information required for endometrial biopsy results.
Description: Initially this was simply the histology. Throughout the project various additional attributes were requested: results, details of result, and biopsy report.

Change / Addition Requested: Additional information required for trans vaginal ultrasound.
Description: Initially this was simply the TVS report and endometrial thickness. Throughout the project various additional attributes were requested: probe used, appearance of endometrial thickness, details of endometrial thickness, results of left ovary, results of right ovary.
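Two of the change requests above, label generation and defaulting sample dates to the visit date, are simple enough to sketch. The Python below is an illustration only and not the delivered implementation. The function names are hypothetical, and the label format (study ID immediately followed by a four-digit year, with no separator) is an assumption, as the report does not specify how the two parts are joined.

```python
from datetime import date
from typing import Optional


def sample_label(patient_study_id: str, visit_date: date) -> str:
    """Label = patient study ID + year of visit (plain concatenation is assumed)."""
    return f"{patient_study_id}{visit_date.year}"


def sample_date(visit_date: date, entered: Optional[date] = None) -> date:
    """Default the sample date to the visit date unless the user entered another."""
    return entered if entered is not None else visit_date


if __name__ == "__main__":
    visit = date(2003, 3, 14)
    print(sample_label("LDS909", visit))
    print(sample_date(visit))
```

Deriving the label from stored values, rather than asking the user to type it, avoids labels that disagree with the patient and visit they belong to.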

Appendix J: End User Feedback

J.1 Naomi Quinton

Date: Thu, 24 Apr 2003 09:52:18 +0100
From: N.D. Quinton <bmsndq@south-02.novell.leeds.ac.uk>
To: T Sehgal <ctxts1@comp.leeds.ac.uk>
Subject: Re: Evaluating Database

Hi there,

I've finished the evaluation...after I got it to work on the computer!!! Overall it's great. I was really impressed with the database, you must have put in a lot of work.

I put in all the data Nick had for one patients (which was quite a lot in the end with 3 questionnaires!!) and it took me around 40 minutes. I think I'll get a lot quicker especially as I know what I'm looking for (not a database problem...i'm not familiar with the hospital paperwork). I found it really easy to use and managed to navigate my way round without any input from Nick. I can see that it will be a really powerful tool when we have all the patients entered.

I was really pleased that the database reminded you if you hadn't entered specific data. However, it wasn't possible to override this if you didn't have the data (e.g. if you didn't have the slide info or if you weren't sure if the subject had ever smoked)...is there any way of letting you do this?

Let me know if you need to know any more input.

Thanks.
Naomi.

J.2 Dr. Nick Wood

General Comments

The final version of the database has addressed all the requirements defined in our initial consultations. It is user friendly, which has been demonstrated by other members of our research team who have not been involved in its development. It appears to be a robust tool for data storage and retrieval. Tristan has dealt with the, sometimes confusing, medical and scientific terminology well and accommodated my frequent changes in requirements. Equally he has always discussed the workings of the database with me in language that I can understand.

Specific positive points

1. Separation of forms and tables
This appears to be a most sensible decision, which allows modification and development of the forms by correspondence and eases backing up of data.

2. Form layout
The forms have been developed in a straightforward manner with sensible warnings to confirm data entry/editing.

3. Export data
This system allows the transfer of specific details to Excel for analysis. This gives another and most important feature, the ability to analyse the data entered.

Specific negative points

1. No time for proper assessment
Time to fully assess the final version has been limited. This has been compounded by my absence during the last month. However, Tristan has agreed to help resolve any subsequent problems.

2. GP form is separate
This can sometimes be annoying if you have forgotten to enter the GP details before trying to enter a patient, i.e. you have to go back and start again. I think that this stemmed from a comment I made that patients can share the same GP. I have now resolved this by entering all the GPs!

3. Data analysis
We have not developed the data analysis/assessment tool as much as I thought. This has mainly been due to us not discussing in detail what I wanted to know. Tristan resolved this in part with the export data tool.

Future developments

I think that there are likely to be a few points to resolve in the short term, including extending the export data tool to the Normal Endometrium/Endometrial Cancer part of the database and a form for the study centres in the HNPCC database. Tristan has agreed to address this after his exams.

There are a few specific areas that could be developed in the future and may constitute a future student project:

1. I would like to be able to generate form letters for inviting patients to clinic and providing results etc. (HNPCC database).
2. Expanding the system such that I could enter the data directly on to desktop/laptop, generate hard copies of patient details screening information for patient notes and send data to the central database electronically/on disc. (HNPCC database).
3. Develop the system so that the above could be done in all the screening centres.