SART Data-Sharing Manual
USAID Ed Strategy 2011-2015 Goal 1
This manual was developed for USAID's Secondary Analysis for Results Tracking (SART) project by Optimal Solutions Group, LLC (Prime Contractor), under SART Contract AID-OAA-C-12-00069.
February 2015
Contents
BACKGROUND ... 5
HOW TO SHARE DATA WITH SART? ... 6
   A. Are you a USAID Mission sharing data with SART? ... 6
   B. Are you an implementing partner for USAID's Education Strategy 2011-2015? ... 8
   C. How to contact the SART contractor ... 9
SHARING DATA VIA SECURE FTP (SFTP) SERVER ... 10
   A. How to upload data files to SART's SFTP server ... 10
   B. How to create a new folder within your designated folder on SART's SFTP server ... 13
   C. Replacing, renaming and deleting uploaded data files ... 14
GUIDANCE FOR PREPARING GOAL 1 DATA AND DOCUMENTATION ... 15
   A. Data preparation ... 15
   B. Documentation preparation ... 17
APPENDIX A: PII AND GOAL 1 DATA ... 19
APPENDIX B: SUGGESTED DATA CHECKS PRIOR TO SHARING DATA ... 21
BACKGROUND

To support the tracking and analysis of Goal 1 of the USAID Education Strategy,[1] the USAID Bureau for Economic Growth, Education, and the Environment's Office of Education implemented the Secondary Analysis for Results Tracking (SART) contract. The SART contractor will work with USAID/Washington (USAID/W), USAID Missions (Missions), and implementing partners to track secondary data collected from participating countries and analyze the progress toward the goals. As part of this effort, the SART contractor will conduct detailed reviews of the data and supporting documentation collected. To aid this work, USAID/W and the SART contractor would appreciate the help of the Missions and implementing partners in ensuring that all the requested information is included in the data and documentation being shared.

This manual serves two purposes for Missions and implementing partners: 1) it explains how to submit learning assessment data and documents to SART, and 2) it provides guidance on how to properly prepare data and documents.

The submission process differs for Missions and implementing partners. If you are a Mission, see pages 6-7 for guidance on how to share data and documents; if you are an implementing partner, see page 8 for guidance. One of the ways to transmit data and documents is through SART's Secure File Transfer Protocol (SFTP) server. The manual describes this process in detail on pages 10-14.

Data and document preparation are outlined on pages 15-18. That section describes how to prepare data and documents for sending, the types of information they should contain, and a standard naming convention for data files submitted to SART.

[1] Goal 1: By 2015, improve reading skills for 100 million children in primary grades.
HOW TO SHARE DATA WITH SART?

If you are an implementing partner for a USAID Education Strategy program, please go to page 8.

A. Are you a USAID Mission sharing data with SART?

Please use any of the following methods to share data. Please ensure that data to be uploaded do not contain any Personally Identifiable Information (PII).[2]

1. SFTP (FTP over a secure server connection)
   To use the SFTP method, you will need authorized credentials set up by the SART contractor to access the SFTP server. If you do not have authorized credentials to access the SFTP server, please contact the SART Helpdesk at info@sartdatacollection.org to request credentials. When contacting the SART Helpdesk to request credentials, please provide your first name, last name, e-mail address, and organization. Also include the name of the country and project(s) that the user requesting credentials will have access to. Once you have authorization to access the SFTP server, visit https://sartdatacollection.org/login to share data with SART. For detailed instructions on how to use the SFTP method, see page 10 of this document.
2. Data shared from a USAID e-mail address to a USAID e-mail address
   a. Data may also be shared by a Mission from one USAID address to another USAID address. Please send data to bsylla@usaid.gov when using this medium.
3. Data shared via the USAID Google Drive
   a. Data may also be shared by a Mission via the USAID Google Drive. Please send a message to bsylla@usaid.gov if you have any questions regarding this method.
4. Diplomatic pouch
   You may use a diplomatic pouch to send data on a CD to USAID/W. For any questions, contact bsylla@usaid.gov.
5. A CD delivered via FedEx
   a. To send the data on a CD, you must encrypt the data using publicly available encryption software. Before sending the CD, please notify SART, via e-mail or other written communication, that encrypted data are being sent. Please also include the encryption key in your communication and any other pertinent information related to the encryption (e.g., special software requirements).
   b. Please send the CD via FedEx to
      Optimal Solutions Group, LLC
      5825 University Research Court, Suite 2800
      College Park, MD 20740-9998
      Attn: SART Administrator
   c. When the package containing data is received at SART, the team will follow Optimal's internal policies that specifically address how all media are accessed, labeled, stored, and transported. The policy also addresses media sanitation and disposal. (A copy of Optimal's data management and security policies can be provided on request.)
   d. A SART team member will then transfer the data to SART's secure server.

If none of the above-mentioned methods will work for you, please send a message to info@sartdatacollection.org and copy bsylla@usaid.gov.

[2] For more details on PII, see Appendix A: PII and Goal 1 Data (pgs. 19-20).
B. Are you an implementing partner for USAID's Education Strategy 2011-2015?

Please use any of the following methods to share data with the SART contractor. Please ensure that data to be uploaded do not contain any Personally Identifiable Information (PII).[3]

1. SFTP (FTP over a secure server connection)
   To use the SFTP method, you will need authorized credentials set up by the SART contractor to access the SFTP server. If you do not have authorized credentials to access the SFTP server, please contact the SART Helpdesk at info@sartdatacollection.org to request credentials. When contacting the SART Helpdesk to request credentials, please provide your first name, last name, e-mail address, and organization. Also include the name of the country and project(s) that the user requesting credentials will have access to. Once you have authorization to access the SFTP server, visit https://sartdatacollection.org/login to share data with SART. For detailed instructions on how to use the SFTP method, see page 10 of this document.
2. A CD delivered via FedEx
   a. To send the data on a CD, you must encrypt the data using publicly available encryption software. Before sending the CD, please notify SART, via e-mail or other written communication, that encrypted data are being sent. Please also include the encryption key in your communication and any other pertinent information related to the encryption (e.g., special software requirements).
   b. Please mail the CD via FedEx to
      Optimal Solutions Group, LLC
      5825 University Research Court, Suite 2800
      College Park, MD 20740-9998
      Attn: SART Administrator
   c. When the package containing data is received at SART, the team will follow Optimal's internal policies that specifically address how all media are accessed, labeled, stored, and transported. The policy also addresses media sanitation and disposal. (A copy of Optimal's data management and security policies can be provided on request.)
   d. A SART team member will then transfer the data to SART's secure server.

If none of the above-mentioned methods will work for you, please send a message to info@sartdatacollection.org and copy bsylla@usaid.gov.

[3] For more details on PII, see Appendix A: PII and Goal 1 Data (pgs. 19-20).
C. How to contact the SART contractor

SART Helpdesk
+1-301-289-7398 or +1-844-AID-SART (+001-844-243-7278)
Monday-Friday: 9:00 a.m. to 4:00 p.m. (Eastern Standard Time)
info@sartdatacollection.org

DISCLAIMER
The protocols and server should be used solely for the purposes of sharing accurate data and documentation for Goal 1 of the USAID Education Initiative. Please ensure that data to be uploaded do not contain any Personally Identifiable Information (PII).
SHARING DATA VIA SECURE FTP (SFTP) SERVER

The following steps explain how to use SART's SFTP server.

A. How to upload data files to SART's SFTP server

1. Once you have the credentials to access SART's SFTP server, go to https://sartdatacollection.org/login. If you do not have authorized credentials to access the SFTP server, please contact info@sartdatacollection.org to request credentials. When contacting the SART Helpdesk to request credentials, please provide your first name, last name, e-mail address, and organization. Also include the name of the country and project(s) that the user requesting credentials will have access to.
2. You will be directed to the login page for the SFTP server. Enter the credentials provided to you and click Sign in (see Figure 1).
   Figure 1
3. If there is no error in the credentials, you will be taken to the upload page (see Figure 2).
   a. If you receive an error message and believe that you have entered the credentials correctly, please contact the SART contractor at info@sartdatacollection.org or call +1-844-AID-SART (+001-844-243-7278).
4. Click the Add files button.
   Figure 2
5. Browse and select the file(s) you wish to upload to the folder (use Control + Click to select multiple files) and click Open (see Figure 3).
   Figure 3
   Please ensure that data to be uploaded do not contain any PII.[4]

[4] For more details on PII, see Appendix A: PII and Goal 1 Data (pgs. 19-20).
6. Repeat this step for all files you wish to share with SART.
7. To upload all selected files at the same time, click the Start upload button above the selected files (see Figure 4).
   Figure 4
8. Alternatively, to upload each selected file individually (one at a time), click the Start button beside each individual file (see Figure 5).
   Figure 5
9. A confirmation message will pop up listing all the files that have been successfully uploaded (see Figure 6).
   Figure 6
10. When you have finished uploading all files, log out of the SFTP server and close the window.

B. How to create a new folder within your designated folder on SART's SFTP server

You may choose to create a new folder if you are sharing data for more than one project and would like to upload the datasets to different folders named after each project.

1. To create a new folder, click the New Folder button (see Figure 7).
   Figure 7
2. Enter the desired name for the folder and click Create (see Figure 8).
   Figure 8 (callout: "Insert desired name")

C. Replacing, renaming and deleting uploaded data files

To replace a file that is already uploaded, please follow the instructions below. If a file needs to be renamed, please upload the renamed file as a replacement; do not rename the existing uploaded file. Please do not delete files once they have been uploaded to SART's SFTP server.

1. To replace a file, upload the replacement file by following the steps outlined in Section A.
2. Include the word "replacement" in the file name.
3. Send an e-mail to info@sartdatacollection.org indicating which file is being replaced.
GUIDANCE FOR PREPARING GOAL 1 DATA AND DOCUMENTATION

A. Data preparation

Please ensure that data to be uploaded do not contain any PII.

In preparation for sharing data and documentation with USAID, Missions and implementing partners are encouraged to do the following:

1. If you are an implementer, inform your AOR/COR that you are preparing to submit a Goal One learning assessment dataset to USAID via SART. Note: Assessment datasets are requested within 90 days of the completion of testing.
2. Include the standard codebook (preferably in English and in one of the formats listed in Item #4) with each data file. The codebook should at a minimum contain
   a. clear definitions of all variables (including composite variables);
   b. value labels or definitions for all response categories for categorical variables;
   c. the format/type of each variable (e.g., numeric, string, integer, etc.); and
   d. clear explanations of how missing values are defined and coded.
3. Implement file-naming standards. All datasets should be clearly named with descriptive information. Dataset names should include the country, project, year, and phase (baseline, midline, or endline) that the data represent. Also include in the dataset name the last date on which the dataset was edited. This way, if a subsequent change is needed, it is easier to tell which dataset is the most recent.
   a. An example of a properly named file: Ethiopia, IQPEP, 2014, Endline, 2015 02 11.
4. Ensure that data files are in one of the following formats: SPSS, STATA, SAS, R, Excel, CSV.
5. Ensure that data to be uploaded do not contain any PII or any student- or school-identifying information, such as EMIS codes. For more information, see Appendix A: PII and Goal 1 Data (pgs. 19-20).
6. Ensure that the data files adhere to USAID guidelines on sharing data (USAID Education Strategy Update to Reporting Guidance August 2014, pages 24-25).[5]
7. Ensure that Appendix B: Suggested Data Checks Prior to Sharing Data (pg. 21) has been reviewed.

[5] 2011-2015 USAID Education Strategy Update to Reporting Guidance, http://pdf.usaid.gov/pdf_docs/pbaab002.pdf
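The file-naming standard in Item #3 can be applied consistently with a small helper. The following is a minimal sketch, not a SART-mandated format: the function name, the underscore separator, the ISO-style date, and the file extension are all assumptions made for illustration; only the five components (country, project, year, phase, last edit date) come from the guidance above.

```python
import re
from datetime import date

# Phases named in the guidance above.
VALID_PHASES = {"baseline", "midline", "endline"}


def build_dataset_name(country, project, year, phase, last_edited, extension="csv"):
    """Assemble a descriptive dataset file name (hypothetical format).

    Components are joined with underscores, which is an assumption;
    the manual only specifies which pieces of information to include.
    """
    if phase.lower() not in VALID_PHASES:
        raise ValueError(f"phase must be one of {sorted(VALID_PHASES)}")
    parts = [
        country,
        project,
        str(year),
        phase.capitalize(),
        last_edited.isoformat(),  # date of last edit, e.g. 2015-02-11
    ]
    # Drop spaces and punctuation (other than hyphens) so the name is
    # safe on most file systems.
    cleaned = [re.sub(r"[^A-Za-z0-9-]+", "", part) for part in parts]
    return "_".join(cleaned) + "." + extension


print(build_dataset_name("Ethiopia", "IQPEP", 2014, "endline", date(2015, 2, 11)))
# prints: Ethiopia_IQPEP_2014_Endline_2015-02-11.csv
```

Placing the last edit date at the end of the name means that, within one country, project, year, and phase, sorting file names alphabetically also sorts them from oldest to most recent edit.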
To the best of your ability, please ensure that each reading assessment data file can serve as a standalone document that contains complete assessment data for the EGRA administered. The components of a complete assessment data file are listed below, split into necessary information and useful information. It is recommended that the data file contain, at a minimum, the following information:

Necessary Information
1. Variable for name of the country.
2. Variable for name of the program.
3. Variable for start and end date (month/year) for data collection for each phase or collection cycle.
4. Variable for round or collection cycle (e.g., baseline, midline, endline).
5. Variable to denote the type of treatment received by the respondent (i.e., whether the respondent belonged to a treatment, control, or comparison group, etc.).
6. Variables measuring the following key performance indicators (KPIs) for reading (if applicable):
   a. Oral reading fluency (correct words per minute for connected text);
   b. Reading comprehension score;
   c. Words correct per minute for disconnected words;
   d. Correct letters per minute;
   e. Correct letter sounds per minute;
   f. Correct syllable sounds per minute.
7. For all of the KPIs listed in item 6 above, each observation/record's raw and constructed scores for each subsection of the reading assessment.
8. Variables for the timed and untimed scores for any reading subsections.
9. Variables for the raw and equated scores for any round of data collection that contains midline/endline data scores that were statistically equated with scores from the baseline.
10. Variables that capture the following:
   a. Sex of the students assessed;
   b. Language(s) in which the assessment was administered;
   c. Region, geographic, or jurisdiction area of the country in which the assessed school is located;
   d. Type of school location (e.g., urban, rural, etc.).
11. Any notes or guidance on how to use the data file or potential challenges and sensitivities to be encountered while analyzing the data.

Useful Information
1. Variables that measure other student-level characteristics, such as socioeconomic status (e.g., access to electricity).
2. Region, geographic, or jurisdiction area of the country in which the assessed school is located.
3. Variables used for sample weighting, if applicable.
4. Description of how to apply sample weights, if sample weights are included in the data file.
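Before sharing a file, the list of necessary variables can be checked mechanically. The sketch below assumes the data file is a CSV with a header row; the column names in REQUIRED_COLUMNS are hypothetical stand-ins chosen for illustration, since this manual does not prescribe variable names, and should be replaced with the names used in your own codebook.

```python
import csv
import io

# Hypothetical column names standing in for the "Necessary Information"
# variables; substitute the names defined in your own codebook.
REQUIRED_COLUMNS = [
    "country",
    "program",
    "collection_start",
    "collection_end",
    "round",
    "treatment_group",
    "sex",
    "language",
    "region",
    "school_location_type",
]


def missing_required_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = {name.strip().lower() for name in (reader.fieldnames or [])}
    return [column for column in required if column not in present]


sample = "country,program,round,sex,orf_cwpm\nEthiopia,IQPEP,endline,F,42\n"
print(missing_required_columns(sample))
# prints: ['collection_start', 'collection_end', 'treatment_group',
#          'language', 'region', 'school_location_type']
```

To check a file on disk, read it into a string first (for example with `open(path, newline="").read()`) and pass the result to the function.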
B. Documentation preparation

Please provide the following types of documentation on USAID Education Strategy Goal 1 projects:
1. Impact evaluation reports
2. Assessment evaluation reports
3. Annual reports
4. Performance management plans
5. Policy briefs
6. Planning documents

To the best of your ability, please ensure that reports pertaining to the reading assessment contain, at a minimum, the following information:

Implementation and Intervention Information:

Necessary Information
1. Start and end date of the program.
2. Description of program foci, purpose, implementation approach, and pertinent context.
3. Specific population targeted or covered by the program. Examples of this information could include specific grades, age levels, or stage of educational attainment.
4. Language(s) of instruction for the program, and the participants' mother tongue(s).
5. Specific regions of the country in which the interventions took place.

Useful Information
1. Start and end date of specific interventions (if applicable) within the program.
2. Focus and aims of specific interventions (if applicable) within the program.
3. Types of educational institutions targeted (e.g., public vs. private schools, accelerated learning programs, etc.).
4. Progress of the implementation of the program, typically reported through annual reports or performance reports.

Methodology and Results Information:

Necessary Information
1. Description of the methodological design of the evaluation. If it is experimental or quasi-experimental, describe the treatment and control/comparison groups, including how groups were selected.
2. Description of the sampling design. This should include information about how schools/students were selected to participate in the program, as well as how they were selected for participation in the reading assessment.
3. Brief description of the process of collecting multiple rounds of assessment data, if the program has collected multiple rounds of assessment data. Describe the equating of reading assessment scores in the midline/endline data if the midline/endline reading assessment scores were equated with baseline reading assessment scores.
4. Full range of instruments used to collect the survey data. Examples of this would include reading assessments and questionnaires for students, teachers, school administrators, etc.
Useful Information
1. If applicable, provide information about sample weighting, including how the sample weights and strata were determined.
2. A brief description of comparability of sample populations. This could include baseline comparability for treatment/control groups in terms of assessment scores as well as demographic, school, and household-level characteristics. This sample comparability should particularly highlight any aspects in which the samples are not comparable.
APPENDIX A: PII AND GOAL 1 DATA[6]

Personally Identifiable Information (PII) is information that can be used on its own or with other information to identify, contact, or locate a single person or to identify an individual in context. The PII abbreviation has four common variants based on personal/personally and identifiable/identifying.

NIST Special Publication 800-122 defines PII as "any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual's identity, such as name, Social Security number, date and place of birth, mother's maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information."

Examples of data or variables that can unambiguously be classified as PII include the following:
- Full name[7]
- Home address
- E-mail address
- National identification number
- Unique individual identification number[8]
- Telephone number

The following variables are potentially PII if they may be combined with other personal information to identify an individual:
- First or last name
- Date of birth
- Birthplace
- Country, state, or city of residence
- Race
- Name of the school the subject attends or workplace
- Education Management Information System (EMIS) code for schools or other similar unique code[9]
- Grades, salary, or job position

[6] USAID and NIST standards (NIST Special Publication 800-122) are used to define PII.
[7] This includes names of students, teachers, school administrators, and any other respondent. The names of assessment and/or program administrators, supervisors, assessors, or anyone involved in implementing the assessment or program are not considered PII or identifiable.
[8] Unique IDs for students, teachers, and any other respondents are considered to be PII if (A) the IDs are part of a central database, such as EMIS, voter registration, birth certificate, passport services, etc.; and (B) there is a publicly available crosswalk to identify individuals who were assigned those IDs.
[9] This can be PII if there is a publicly available crosswalk to identify schools that were assigned those codes.
Please ensure that data to be uploaded do not contain any Personally Identifiable Information (PII). It is essential that datasets are de-identified to protect the privacy and anonymity of individuals and schools associated with an assessment. At the same time, it is important to ensure that steps taken to preserve anonymity do not unnecessarily destroy useful or essential information. This can be achieved by completing the following:

1. Remove unambiguously identifying information such as names, addresses, or telephone numbers of students, teachers, headmasters, and schools.
2. Identify individual variables, or combinations of variables, within the dataset that might be used to identify people or schools, and take steps to make the variables less exact without destroying them. For example:
   - Replace full birthdates with month and year of birth
   - Replace continuous variables representing precise school or classroom enrollments with categorical variables indicating ranges of enrollment values
   - Reduce the precision of GIS coordinates to indicate the general location without identifying a specific school
3. Mask school EMIS codes, or other identifying codes that may have value to researchers.
   a. EMIS codes should not be included in a dataset because they can be used to identify specific schools, but if EMIS codes are simply deleted, researchers will lose access to valuable information about the grouping of students who attend the same school. EMIS codes should be masked with an alternative identifier so grouping information is not lost. Rather than providing the precise EMIS code of each school, develop a systematic approach for replacing those codes with another set of codes that does not identify the schools. Be certain to retain documentation on this approach so the alternative identifiers can be generated consistently over time.
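The de-identification steps above can be sketched in code. This is one possible approach under stated assumptions, not a method prescribed by USAID or SART: the function names, the "SCH-" prefix, the bin width of 100, and the one-decimal coordinate precision are all hypothetical choices. For EMIS masking, the sketch uses a keyed hash (HMAC-SHA256), so the same EMIS code always yields the same alternative identifier as long as the same private key is kept; a documented lookup table of randomly assigned codes would serve the same purpose.

```python
import hashlib
import hmac
from datetime import date


def mask_school_code(emis_code, secret_key):
    """Replace an EMIS code with a consistent, non-identifying school ID.

    secret_key is a private byte string kept by the data owner (an
    assumption of this sketch); never share it with the dataset.
    """
    digest = hmac.new(secret_key, emis_code.encode("utf-8"), hashlib.sha256)
    return "SCH-" + digest.hexdigest()[:10]


def birth_month_year(birthdate):
    """Keep only the year and month of a full birthdate."""
    return birthdate.strftime("%Y-%m")


def enrollment_range(enrollment, width=100):
    """Convert a precise enrollment count into a range label."""
    lower = (enrollment // width) * width
    return f"{lower}-{lower + width - 1}"


def coarsen_coordinate(value, decimals=1):
    """Reduce the precision of a latitude or longitude value."""
    return round(value, decimals)


key = b"replace-with-a-private-key"  # placeholder, not a real key
record = {
    "emis_code": "ET-004512",  # made-up code for illustration
    "birthdate": date(2006, 3, 17),
    "enrollment": 437,
    "lat": 9.03142,
    "lon": 38.74691,
}
deidentified = {
    "school_id": mask_school_code(record["emis_code"], key),
    "birth_month": birth_month_year(record["birthdate"]),  # "2006-03"
    "enrollment_range": enrollment_range(record["enrollment"]),  # "400-499"
    "lat": coarsen_coordinate(record["lat"]),  # 9.0
    "lon": coarsen_coordinate(record["lon"]),  # 38.7
}
print(deidentified)
```

Because students from the same school receive the same masked school_id, grouping information is preserved. The key and a note describing the procedure are the documentation that should be retained so identifiers can be regenerated consistently for later rounds of data.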
APPENDIX B: SUGGESTED DATA CHECKS PRIOR TO SHARING DATA

Examples of suggested data checks prior to sharing data include the following:[10]

1. Check for date-format errors. For example, if the entries in the date variable contain only the day and the month but not the year, statistical packages sometimes add a default year, such as 1960, to complete the date entry, which then corrupts it. Please ensure that the dates are accurate.
2. Check for outlier data, often found in such variables as dates of assessment and assessment scores, as well as inconsistencies in the number of records per unit of observation. For example, if the basic summary statistics of the data file show that the oral reading fluency score ranges from 0 words/minute to 50,000 words/minute, it is very likely that the upper bound of the range (i.e., 50,000 words/minute) is an outlier. It would be most helpful for Missions and implementing partners to fix such errors prior to sharing the data.
3. Check the codebook and the data file for variables that may be missing definitions and/or labels.
4. Check the codebook and the data file for categorical variables that may be missing value labels for response categories.
5. Check for the occurrence of missing or masked entries for the requested variables.

[10] For best practices in preparing and editing survey data, please refer to Chapter 4 of the NCES Statistical Standards document.
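Several of the checks above lend themselves to simple automation. The sketch below illustrates checks 1, 2, 3, and 5 on small in-memory samples. All function names, the cutoff year of 2011, the upper score bound of 300 words/minute, and the missing-value code of -99 are assumptions made for illustration; appropriate thresholds and missing-value codes depend on the assessment and should come from your own codebook.

```python
from datetime import date


def suspicious_dates(dates, earliest_plausible_year=2011):
    """Check 1: flag dates earlier than the assumed cutoff year, such
    as a default year (e.g., 1960) filled in by a statistical package."""
    return [d for d in dates if d.year < earliest_plausible_year]


def out_of_range(scores, low=0, high=300):
    """Check 2: flag scores outside an assumed plausible range."""
    return [s for s in scores if s is not None and not (low <= s <= high)]


def variables_without_definitions(data_columns, codebook):
    """Check 3: list data-file variables with no codebook definition.

    codebook is assumed to map variable names to definition strings.
    """
    return [c for c in data_columns if not codebook.get(c, "").strip()]


def count_missing(values, missing_codes=(None, "", -99)):
    """Check 5: count entries that are empty or coded as missing."""
    return sum(1 for v in values if v in missing_codes)


assessment_dates = [date(2014, 5, 12), date(1960, 5, 13)]
orf_scores = [0, 35, 50000, None, 61]
codebook = {"orf_cwpm": "Correct words per minute, connected text", "sex": " "}

print(suspicious_dates(assessment_dates))  # [datetime.date(1960, 5, 13)]
print(out_of_range(orf_scores))  # [50000]
print(variables_without_definitions(["orf_cwpm", "sex", "region"], codebook))
# ['sex', 'region']
print(count_missing([0, 35, None, "", -99]))  # 3
```

Flagged values are candidates for review rather than automatic deletion: a flagged date or score should be traced back to the original assessment record and corrected where possible before the data are shared.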