A Decision Guide on the Uses and Applications of EpiData Entry and EpiData Analysis Software

Created as part of a collaborative project [1]: Association of Public Health Epidemiologists in Ontario (APHEO), EpiData Association and the Public Health Agency of Canada.

Authors: APHEO EpiData Expert Panel and J. Lauritsen, EpiData Association.

V1.0 Mar 2011

[1] This project has been made possible through a financial contribution from the Public Health Agency of Canada.
Table of Contents

Purpose and scope
Introduction - EpiData Entry and Analysis Software
Why Choose EpiData?
Software Feature Checklist
Success Stories
Purpose and scope

The purpose of this document is to provide guidance on the uses and applications of EpiData Entry and EpiData Analysis software. This document was developed by the EpiData Expert Panel as part of a collaborative project between the Association of Public Health Epidemiologists in Ontario (APHEO) and the EpiData Association, with funding from the Public Health Agency of Canada. The document is written primarily for epidemiologists working in local public health units in Ontario, but those working in other settings may find the information helpful in assessing the utility of EpiData software.

Introduction - EpiData Entry and Analysis Software

EpiData is a free software suite designed to help epidemiologists, public health investigators and others enter, manage and analyze data in the field. All software is available from http://www.epidata.dk. A number of field guides, software documentation notes, examples and other information are also available. Users are encouraged to: (1) join the EpiData-list discussion group, and (2) sign up for the information newsletters sent periodically each year. For more information, visit the EpiData website at http://www.epidata.dk. The APHEO EpiData Project has produced two EpiData field guides, available for download from its website: http://www.apheo.ca/index.php?pid=47.

Why Choose EpiData?

There are many software programs available for data entry, management and analysis. The Microsoft suite of programs, which includes Excel and Access, is fairly standard within Ontario public health units (PHUs). In addition, every PHU is supported by the Ministry of Health and Long-Term Care (MOHLTC) with at least one SPSS license. Many PHUs purchase other licenses such as SAS, Stata, SurveyMonkey and FluidSurveys. Freeware such as EpiInfo is used by PHUs, and there is increasing use of statistical freeware such as R in many of the public health graduate programs. So why choose EpiData Entry and EpiData Analysis?
The following is a list of considerations when choosing software.

EpiData Entry is a great program to consider if

- There is a need to rapidly design and implement a front-end data entry system. While most of the popular programs used in PHUs offer data entry functionality, creating data entry screens in EpiData Entry is quick and easy.
- Controlled data entry is essential. If you need to provide a questionnaire to other individuals who will be entering data for you, then EpiData Entry is easy to program with checks (i.e., skips, validity checks and other data entry control options). It is also simple for others to use and doesn't require much data entry training.
- There is a need for data collection for a single investigation limited in time. If a system with large amounts of data collected over long periods of time and accessed by multiple users is needed, consider different software and IT support.
- There is limited need for simultaneous data entry by multiple users. EpiData is a single-user system: it cannot handle several users working in the same file at the same time. Data files can be placed on a shared network drive as long as only one operator works with the data at a time. Several people can each enter data into their own copy of a file in EpiData Entry, but the files then need to be merged to view the complete data set. Consider different software and IT support if there will be multiple and/or simultaneous data entry sites.

EpiData Analysis is a great program to use if

- The sampling design does not need to be considered in the analysis. Other programs such as SPSS, SAS, Stata and R offer survey sampling functionality.
- Descriptive analyses, such as frequencies and cross tabulations, are needed. Some advanced statistical analysis options, such as regression and survival analysis, are also available in EpiData Analysis, but not with the full range and functionality that full statistical packages like SPSS, SAS, Stata and R offer.
- Quick and simple charts are needed. EpiData Analysis offers a number of well-designed charts, such as epidemic curves, proportion plots with 95% confidence intervals and histograms, that are quick and easy to create and export.

Other advantages

- The program is free and, because it is downloaded and run locally, it raises fewer web-based security concerns than online tools (e.g. SurveyMonkey).
- It doesn't require a lot of computer memory to install and run.
- The redesign of EpiData Entry into EpiData Manager and EpiData EntryClient will ensure that data entry personnel cannot easily change rules or data structure.
- Online support is available through the EpiData-list discussion group and information newsletters.

Software Feature Checklist

The following table provides a checklist of common data entry and data analysis features that are available with EpiData Entry and Analysis, as well as some that are not.
Features of EpiData Entry and EpiData Analysis

Feature   Yes   No

EpiData Entry
- Data Entry Form
- Skip Patterns
- Required Fields
- Range of Legal Values
- Conditional Formatting
- Duplicate Entry Validation
- Import and Export Data
- Simultaneous Data Entry

EpiData Analysis
- Frequencies & Cross Tabulations
- Descriptive Statistics
- Statistical Tests of Significance
- Summary statistics, confidence intervals
- Linear regression, correlations
- Outbreak focused analysis (e.g., attack rate tables, epidemic curves)
- Comprehensive data management: recode variables, define missing values, label values and label variables
- Survival curves: life table and Kaplan-Meier plots
- Charts
- Complex Survey Design
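The data entry controls named in the checklist (required fields, range of legal values, skip patterns) can be illustrated outside of EpiData. The sketch below is written in Python, not in EpiData's own check syntax, and the field names (`age`, `ill`, `onset_date`) and the rules themselves are hypothetical. It only shows the kind of logic such checks apply to each record as it is entered.

```python
# Illustrative only: hypothetical fields and rules, not EpiData check syntax.

def check_record(record):
    """Return a list of problems found in one data entry record (a dict)."""
    problems = []

    # Required field: a value must be present.
    if record.get("age") is None:
        problems.append("age is required")
    # Range of legal values: age must fall between 0 and 120.
    elif not 0 <= record["age"] <= 120:
        problems.append("age must be between 0 and 120")

    # Legal values: only Y or N are accepted.
    if record.get("ill") not in ("Y", "N"):
        problems.append("ill must be Y or N")
    # Skip pattern: onset date is only asked when the person was ill.
    elif record["ill"] == "N" and record.get("onset_date") is not None:
        problems.append("onset_date should be skipped when ill is N")

    return problems


if __name__ == "__main__":
    print(check_record({"age": 34, "ill": "Y", "onset_date": "2011-03-01"}))
    print(check_record({"age": 150, "ill": "N", "onset_date": "2011-03-01"}))
```

In a data entry program these rules are enforced while the operator types, so an out-of-range value or a field that should have been skipped is caught before the record is saved rather than during later cleaning.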
Success Stories

1. Oral Health Study - Simcoe Muskoka District Health Unit, Barrie Ontario

In 2007, the Oral Health Team, led by the Dental Consultant, conducted a study to determine the risk factors that may be associated with the differences in decay scores between Simcoe Muskoka children living in fluoridated and non-fluoridated communities. The cross-sectional study collected information via the annual provincial dental indices survey and a parental questionnaire. The questionnaire data were entered into EpiData by data management assistants. EpiData was chosen over other software solutions because of the user-friendly interface, the functionality that facilitated data entry (e.g. skip patterns and data validation), as well as the added benefit of no cost.

2. DineSafe Evaluation - Durham Region Health Department, Whitby Ontario

DineSafe Durham, launched in March 2009, is a program designed to increase compliance with the Ontario Food Premises Regulations and to increase the transparency and public accessibility of food safety inspection results and information. A short telephone survey of restaurant owners/operators was conducted in July 2010 to obtain feedback on the potential impact of DineSafe Durham on business and food safety handling practices. The entire survey needed to be completed in three weeks. The data entry screen, which included the telephone interview scripts, was quickly created in EpiData. After interviewer training and pilot testing, the interviewer entered the data during the telephone call, thus avoiding double data entry. Data were analyzed on a weekly basis by the Epidemiologist using a .pgm file (descriptive analysis) and exported to Excel to be sent to the manager. EpiData proved to be quick, easy-to-use software for data entry and analysis.

3. Communicable Disease Outbreak Investigations - Thunder Bay District Health Unit, Thunder Bay Ontario

Thunder Bay District Health Unit used EpiData Entry for a gonorrhea case-control study conducted in response to a recent outbreak. EpiData Entry was used to create a questionnaire and database for the study. Data collection was then distributed among seven public health nurses by creating seven copies of the EpiData database. The nurses entered data directly into EpiData Entry while conducting the follow-up telephone interviews with the cases and controls. Once data collection was complete, the seven databases were appended together using EpiData's append function. The data were then exported to Stata for conditional logistic regression analysis.

The Thunder Bay District Health Unit also used EpiData Entry and EpiData Analysis during the investigation of a community foodborne illness outbreak associated with a banquet. EpiData Entry was used to enter data from paper questionnaires obtained from the banquet attendees. EpiData Analysis was then used to create the initial attack rate and risk ratio tables.
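The attack rate and risk ratio tables mentioned for the banquet investigation rest on simple arithmetic. As a minimal sketch in Python (not EpiData Analysis commands), using made-up counts for a hypothetical food item: the attack rate is the number ill divided by the number in the group, and the risk ratio compares the attack rate among those exposed with the rate among those not exposed. The 95% confidence interval here uses the common log-based (Katz) approximation, which is an assumption of this sketch and not a statement of how EpiData Analysis computes its intervals.

```python
import math


def attack_rate(ill, total):
    """Proportion of a group that became ill."""
    return ill / total


def risk_ratio(ill_exp, total_exp, ill_unexp, total_unexp):
    """Risk ratio with a 95% CI from the log (Katz) approximation.

    Assumes all four counts are greater than zero.
    """
    rr = attack_rate(ill_exp, total_exp) / attack_rate(ill_unexp, total_unexp)
    # Standard error of log(RR) for two independent groups.
    se_log = math.sqrt(1 / ill_exp - 1 / total_exp
                       + 1 / ill_unexp - 1 / total_unexp)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper


if __name__ == "__main__":
    # Made-up counts: 30 of 50 who ate the item were ill, 10 of 50 who did not.
    rr, lower, upper = risk_ratio(30, 50, 10, 50)
    print(f"Attack rate, ate item:    {attack_rate(30, 50):.0%}")
    print(f"Attack rate, did not eat: {attack_rate(10, 50):.0%}")
    print(f"Risk ratio: {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Repeating the calculation for each food item on the questionnaire gives the usual outbreak table, with one row per exposure.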