Child Labour Survey Data Processing and Storage of Electronic Files

Statistical Information and Monitoring Programme on Child Labour (SIMPOC)
International Programme on the Elimination of Child Labour (IPEC)

Child Labour Survey Data Processing and Storage of Electronic Files
A Practical Guide

Revised December 2003

International Labour Office, Geneva

Copyright International Labour Organization 2004

Publications of the International Labour Office enjoy copyright under Protocol 2 of the Universal Copyright Convention. Nevertheless, short excerpts from them may be reproduced without authorization, on condition that the source is indicated. For rights of reproduction or translation, application should be made to the ILO Publications Bureau (Rights and Permissions), International Labour Office, CH-1211 Geneva 22, Switzerland. The International Labour Office welcomes such applications.

Libraries, institutions and other users registered in the United Kingdom with the Copyright Licensing Agency, 90 Tottenham Court Road, London W1T 4LP [Fax: (+44) (0) ; cla@cla.co.uk], in the United States with the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA [Fax: (+ 1) (978) ; info@copyright.com] or in other countries with associated Reproduction Rights Organizations, may make photocopies in accordance with the licences issued to them for this purpose.

ISBN
First published 2004

The designations employed in ILO publications, which are in conformity with United Nations practice, and the presentation of material therein do not imply the expression of any opinion whatsoever on the part of the International Labour Office concerning the legal status of any country, area or territory or of its authorities, or concerning the delimitation of its frontiers.

The responsibility for opinions expressed in signed articles, studies and other contributions rests solely with their authors, and publication does not constitute an endorsement by the International Labour Office of the opinions expressed in them.

Reference to names of firms and commercial products and processes does not imply their endorsement by the International Labour Office, and any failure to mention a particular firm, commercial product or process is not a sign of disapproval.

ILO publications can be obtained through major booksellers or ILO local offices in many countries, or direct from ILO Publications, International Labour Office, CH-1211 Geneva 22, Switzerland. Catalogues or lists of new publications are available free of charge from the above address.

Photocomposed in Switzerland BRI
Printed in Switzerland VAU

Foreword and acknowledgements

The production of survey results in a presentable format is often delayed, one of the main reasons being that data processing issues are addressed neither properly nor early enough. Emphasizing the importance of careful and informed data processing, this guide provides detailed guidelines for survey planners, data processors, and computer system administrators with respect to data processing planning, actual data processing activities, and the storage of generated files. The guide also outlines the requirements and procedures for transferring electronic files to the ILO at the completion of child labour surveys, a process that is contributing to a growing global child labour data repository. The main aim is to facilitate the generation of high-quality micro-data derived from child labour surveys.

This guide has been prepared by Muhammad Q. Hasan of SIMPOC/IPEC, ILO. Many people involved with child labour surveys helped in the exercise. We would like to express sincere thanks to all concerned. We particularly wish to thank Mr Sylvester Young, Director of the ILO Bureau of Statistics, and Mr Farhad Mehran of the ILO's Department of Policy Integration for their valuable comments and suggestions.

This guide is planned to be revised and reproduced on a regular basis. To this end, suggestions and comments are always welcome. Users should direct any feedback to simpoc@ilo.org


Contents

1. Introduction
1.1 Background
1.2 Field data collection: A brief overview
1.3 Importance of data processing
2. Planning
2.1 Introduction
2.2 Data processing policy planning
2.3 Defining the relevant aspects of a dataset
Selection of hardware and software
Identification of personnel
Scheduling the data processing
Data preservation strategy and access procedure
3. Data processing
3.1 Introduction
Data entry and preliminary validations
Appending/merging/splitting files
Data validation
Final decisions on errors
Completion of data processing and generation of data file(s)
Preparation of public use datasets
Final documentation
Final tabulation
Conversion of data files to other formats
Storage of all files
4. Data preservation
4.1 Introduction
Organization of files
Transfer of files to a preservation machine
Backups
Transfer of files to the ILO
Bibliography and further resources

Glossary
Annexes
Annex I Comparison of statistical packages
Annex II English country names and code elements
Annex III Zambia end of decade and child labour questionnaire (education module)
Annex IV A sample codebook for ASCII data created in SAS
Annex V Structure of dataset

1. Introduction

1.1 Background

ILO/IPEC's Statistical Information and Monitoring Programme on Child Labour (SIMPOC) supports child labour surveys conducted in a large number of countries. One of the most important aspects of this programme is the collection, archiving, and dissemination of credible, well-documented, and easily accessible micro-data. This requires extensive planning, organization, and execution of planned activities, especially at the country level, where, it is expected, collected data will be archived for an indefinite period. At the ILO, meanwhile, this information will provide the basis of a global child labour data repository for use by a variety of people in a variety of countries in a variety of computing environments. Thus, the data must be clean, with no inconsistencies, well documented, and readily accessible for use at any time in research and policy-making activities. The dataset received by the ILO also needs to be complete (incorporating codebooks, questionnaires, and so on) and ready for straightforward use by any analyst in any computing environment.

Child labour surveys include three phases. First of all, data are collected through interviews with children and other family members. Data collection is followed by data processing, where the collected information is checked for errors, and micro-data and relevant documentation files are created. Finally, data analysis is performed in the light of any additional requirements or policy.

Data processing is a difficult and complex process, but in many cases this is the stage that receives least attention. Data processing activities such as planning for equipment, software, and training of personnel can be conducted concurrently with such activities as survey design and field-data collection.
Since all child labour surveys are carried out under strict time constraints, it is recommended that all planning, training, and testing procedures are completed before the field data collection is undertaken.

The data processing phase includes several distinct stages, each comprising multiple steps where errors can and do occur. Child labour surveys are smaller operations than censuses, but, since most are first-time surveys and collect a greater amount of information than many other general household surveys, they tend to be more complex. While overall data processing activities are in many respects similar to those of other general household-type surveys, child labour surveys, given their larger sample sizes and questionnaires, sometimes make greater demands on time and other resources.

Presentable survey results are commonly delayed because data processing issues are addressed neither appropriately nor early enough. This guide presents a brief overview of the data collection phase before going on, first, to highlight the importance of data processing and, second, to provide detailed guidelines for its conduct, with particular emphasis on issues pertinent to child labour surveys. Chapter 2 addresses planning issues involved in data processing. Chapter 3 looks at the conduct of data processing and, immediately upon completion of a child labour survey, the generation of files, including well-documented public use datasets. One main purpose of this guide is to help data processors at the country level to produce, at the conclusion of surveys, clean and reliable datasets together with all the documentation that secondary analysts need to produce reliable aggregate data. Chapter 4 provides information on how to preserve datasets, allowing continued ease of access over an indefinite period. Survey design issues, data analysis, and data dissemination lie beyond the scope of this guide.

The information presented in the following chapters should be viewed only as guidelines; the procedures outlined here may of course be adapted in the light of available national resources and experience. This guide as a whole is intended for planners and technical experts supervising data processing activities. Chapter 3, however, is designed specifically for those who perform the actual data processing, while Chapter 4 is for computer system administrators responsible for storage of child labour survey data. The guide also provides an overview of data processing activities that can be carried out at the survey design stage.

1.2 Field data collection: A brief overview

In general, data collection can involve a variety of methods, from face-to-face or telephone interviews to aerial photography. Child labour surveys, however, involve only face-to-face interviews, and only two such methods are feasible.

PAPI. With paper-and-pencil interviews, enumerators apply questionnaires on paper, recording the data with pencils. Data entry operators then key data into computers or convert them into machine-readable form through some scanning technique coupled with character recognition technology. No matter which method of data entry is selected, the information needs to be rechecked. Various means are used to ensure that data are entered properly. Much of this will be explained in the following chapters.

CAPI. With computer-aided personal interviews, enumerators are supplied with handheld electronic devices (e.g. palmtop or laptop computers), permitting the direct digital recording of data. This method offers advantages, compared to PAPI, since major errors occur only while keying in the data, which can be rechecked immediately after data collection. Data are then transferred to computers, with almost no time needed for additional data entry, and data cleaning can begin immediately.
This guide primarily addresses PAPI, which is the data collection method applied in most child labour surveys.

1.3 Importance of data processing

Child labour typically remains a hidden issue in many respects, and surveys seek reliable quantitative data about the various related concerns. National surveys require substantial sums of money and enormous organizational efforts involving ministries, national statistical offices, and other agencies. The resultant data are then provided to policy-makers, researchers, global estimators, and campaigners responsible for publicizing the adverse effects of child labour. All of these users need easily accessible, reliable data on the various aspects of child labour.

Both non-sampling and sampling errors appear in survey datasets. Sampling errors are handled during the sample design phase, and are not addressed in this guide. Non-sampling errors may originate with respondents, interviewers, data-entry clerks, or processing programmers. One main objective of data processing is to find these errors and fix them in the shortest possible time. Where irreparable errors are found, they should be flagged with explanations. Unidentified and unflagged errors can corrupt interpretations of data and, ultimately, may result in the adoption of inappropriate policies.

Competent and thorough data processing activities (including error correction, logical checks, and compilation of information as the basis of documentation) are vital to reliable survey information. Otherwise, output from a successful survey (the collected field data) may be limited to only a few tables. Secondary analysts will find it difficult, if not impossible, to use the data, while national and international policy-makers may be misguided by the survey results.

One key to successful data processing is careful planning. The various related activities need to be detailed as early as possible, and should include fallback plans. Data processing is immensely important to the survey output, and cleaning and verifying the data is essential.


2. Planning

2.1 Introduction

The preparation of high-quality datasets requires proper planning, and involves two essential elements:

- Statistical method. One should employ good data collection tools and a well-developed survey methodology.
- Processing and subsequent storage of datasets. A second essential element involves the informed use of established data processing tools, processing methodology, and up-to-date computer hardware and software where applicable.

In most cases, child labour surveys are conducted either as stand-alone operations or as a module attached to some form of national household survey. In stand-alone surveys, children and their parents are usually interviewed. On the basis of initial investigations, this manual assumes that all child labour surveys employ paper-and-pencil interviews (PAPI) for data collection. Survey planning and data cleaning are discussed in light of this assumption.

Upon completion of the interviews, the collected data are entered on a computer. Data entry may occur in the field, under the supervision of field office supervisors, or at survey headquarters, which is normally the national statistics office. If data are entered in the field, there will be a minimum of one file at each field location. Since the same survey questionnaire is used, all files generated in field locations will be similar with respect to the number of variables. No matter how the data are entered, the different files are either appended before data cleaning or cleaned and then appended. These activities are normally conducted at survey headquarters.

Where a child labour survey is conducted as a module attached to some other household-based survey (e.g. a household member health and education module), child labour data may be collected together with other modules (as with stand-alone surveys) or as a complete module without the household information (which is collected as part of another module).
The data may also be collected at different times (e.g. if attached to a quarterly labour-force survey, the total sample will be covered over a period of one year). In such cases, household information needs to be extracted from the other data file(s) and then combined with the child labour data. Such cases entail both appending and merging of data. (Merging and appending are described in greater detail in later sections.) Completion of a data file is followed by data cleaning (partial cleaning may also occur in each modular file). It should be noted that child labour is difficult to define unless all relevant information about children is thoroughly investigated; understanding the causes and consequences of child labour requires analysis of information about the household and other family members.

Another scenario [1] presents itself when the survey is conducted in phases, with a series of questionnaires referring to different entities or differing in their respective coverage. In this situation, data may have to be presented in separate files, with no merging or appending.

All of the situations described above show that careful planning is needed before the collected information is processed and made available for analysis. All planning issues can be addressed while the survey design is in progress. If financial and time constraints are not an issue, all data processing activities should be tested during the pilot survey. (If the CAPI data collection method is used, this is essential.) It should be noted that careful advance planning considerably reduces actual processing time. The following sections discuss planning issues that need consideration before the actual data processing.

[1] One example for this would be the SIMPOC assisted country report Survey of activities of young people in South Africa 1999, report/rep1999.pdf.

2.2 Data processing policy planning

Two areas of planning are important to data processing. On the one hand, we must decide how the actual data processing is to be accomplished, and this is treated in detail in Chapter 3. But first we must ask what resources and definitions are required for effective and efficient data processing. We may term this initial step policy planning. Policy planning comprises the following essential features:

- defining the relevant aspects of a dataset;
- selecting hardware and software;
- identifying personnel;
- scheduling the time needed for data processing;
- formulating a data preservation strategy; and
- designing an access procedure.

2.3 Defining the relevant aspects of a dataset

If analysts are to use a dataset effectively, the micro-data must first be properly processed. This involves a number of stages. Preliminary planning is essential, and includes the identification and definition of such aspects of the dataset as the following.

Record identification variable

To identify a case or record, an identifying variable is usually created and encoded with a unique value. The encoding method and the elements that constitute this variable have to be determined, and the variable (often referred to as the unique record identifier) should be named in accordance with the procedure described later in this chapter.
This identification variable will provide the only linkage between the original dataset containing all the variables and a public use dataset (where many identification variables may have been deleted for reasons of confidentiality). It is also the link to use when data reside in different files but a cross-comparison of information is required. For example, a combination of state or provincial code, enumeration area code, and house number appended one after another may be enough to identify a house uniquely. A line number (the position of a person in a house) can then be used to identify a person in the house uniquely. Other approaches may achieve the same goal, but care should always be taken when appending these numbers, and each household, as well as each person living in a household, should always have a unique identifier.
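The care needed when appending codes can be illustrated with a minimal Python sketch. It assumes fixed field widths (two digits for province, four for enumeration area, three for house number, two for line number); these widths and the function names are hypothetical choices for illustration, not part of this guide. Padding each component to a fixed width keeps two different houses from collapsing into the same identifier.

```python
# Field widths are assumptions for illustration; choose widths that fit
# the largest code actually used in the survey.
PROVINCE_WIDTH = 2
AREA_WIDTH = 4
HOUSE_WIDTH = 3
LINE_WIDTH = 2


def pad(value: int, width: int) -> str:
    """Return value as a zero-padded string, rejecting codes that do not fit."""
    if not 0 <= value < 10 ** width:
        raise ValueError(f"code {value} does not fit in {width} digits")
    return f"{value:0{width}d}"


def household_id(province: int, area: int, house: int) -> str:
    """Append province, enumeration area, and house number one after another."""
    return pad(province, PROVINCE_WIDTH) + pad(area, AREA_WIDTH) + pad(house, HOUSE_WIDTH)


def person_id(province: int, area: int, house: int, line: int) -> str:
    """Extend the household identifier with the person's line number."""
    return household_id(province, area, house) + pad(line, LINE_WIDTH)


# Without padding, these two different houses would both become "1234".
unpadded_a = f"{1}{23}{4}"
unpadded_b = f"{12}{3}{4}"
print(unpadded_a == unpadded_b)   # True: the identifiers collide
print(household_id(1, 23, 4))     # 010023004
print(household_id(12, 3, 4))     # 120003004
print(person_id(1, 23, 4, 2))     # 01002300402
```

Storing the identifier as a string rather than a number preserves the leading zeros when the file is moved between packages.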

File structure

In child labour surveys, the unit of analysis is the child or the person, whereas the medium is the household, because information about the child or person is collected by first identifying a house. Thus, it is worth deciding what the final data files should look like. The structure of data files may vary considerably in format and organization when, upon completion of data entry, they become available to secondary analysts. Is a large data file with one long data record preferable (describing both a child and the house in which he/she is living, for example), or does one want several small data files with short data records (where child and house information, for instance, reside in different files with a linking variable)? This decision will depend on factors such as how the survey is conducted and what statistical software is being used for data entry and processing. The following considerations may serve as guidelines.

A data file may contain one long record or several smaller records. A large number of records slows processing speed. Some statistical packages (e.g. Stata) limit records to a maximum number of variables. On the other hand, one advantage of long records in a single file is that secondary analysts do not have to merge files at a later date. Annex I describes limitations associated with statistical packages such as SPSS, SAS, and Stata.

Data may be organized in a file such that household records are followed by person records (with different record types in an ASCII hierarchical file). Alternatively, there may be two separate files: one for a house and one for persons living in that house, with well-defined linking variables common to both files in a package-specific format. There may also be a single merged file with long records. The values for many variables will be repeated for members of the same house in such files, thus occupying more storage space.
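The trade-off between two linked files and a single merged file can be sketched in a few lines of Python. The example below assumes a hypothetical linking variable named HHID and invented sample values; it uses only the standard csv module rather than any statistical package. Each person record is joined to its house record, so the house values are repeated for every member of the same house.

```python
import csv
import io

# Hypothetical sample files; the variable names and values are invented.
HOUSE_CSV = """HHID,REGION,ROOMS
010023004,1,3
010023005,1,2
"""

PERSON_CSV = """HHID,LINE,AGE
010023004,01,34
010023004,02,9
010023005,01,12
"""


def read_rows(text):
    """Read delimited text into a list of dictionaries keyed by variable name."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_flat(houses, persons, key="HHID"):
    """Attach the house record to every person record sharing the linking variable."""
    house_by_key = {house[key]: house for house in houses}
    flat = []
    for person in persons:
        house = house_by_key.get(person[key])
        if house is None:
            # A person without a matching house signals an identifier error.
            raise KeyError(f"no house record for {key}={person[key]}")
        flat.append({**house, **person})
    return flat


flat_file = merge_flat(read_rows(HOUSE_CSV), read_rows(PERSON_CSV))
for record in flat_file:
    print(record)
```

In the merged output, REGION and ROOMS appear twice for house 010023004, once for each person living there, which is the extra storage cost of the flat layout.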
Each system has its pros and cons, and one planning decision must address the questions of how many data files are to be included in the dataset and what the structure of each should be. Because of the way specific software handles data files, processing large data files within a Windows environment may be a problem. A child labour survey data file may become large when associated with a labour force survey, so the data file may need to be split before analysis. The file structure should be chosen according to available computing resources and the experience of the data processors. Because of its simplicity, however, a single flat data file is recommended where possible for child labour surveys.

Naming files

As soon as a file is created, it must be named, and it is worth deciding beforehand how all files are to be named. This means, at a minimum, adopting a naming convention. It is always recommended, for one thing, that names reflect the file contents. The version number of the file can also be included. (In Chapter 3 we see how different versions may be generated.) For child labour surveys, specifically, it is recommended that the following information is included in file names:

- file content (data, documentation, questionnaire, etc.);
- to whom the file relates (child, parent, both);
- version number;
- relevant country; and
- whether the file is for general or restricted use.
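A minimal Python sketch of how these five pieces of information can be packed into a fixed-width name follows. It uses the eight-character layout that this guide goes on to recommend in the remainder of this section (country, subject, version, use, contents); the function names `build_file_name` and `parse_file_name` and the validation rules are illustrative assumptions rather than part of the guide.

```python
# Codes follow the convention recommended later in this section.
SUBJECT_CODES = {"C_", "P_", "F_", "H_", "FH"}
USE_CODES = {"G", "R"}
CONTENT_CODES = {"C", "D", "I", "L", "Q", "S", "V"}


def build_file_name(country, subject, version, use, contents, extension):
    """Assemble an 8.3 file name from its fixed-width fields."""
    if len(country) != 2 or not country.isalpha():
        raise ValueError("country must be a two-letter code")
    if subject not in SUBJECT_CODES:
        raise ValueError(f"unknown subject code: {subject}")
    if not 1 <= version <= 99:
        raise ValueError("version must be between 1 and 99")
    if use not in USE_CODES:
        raise ValueError(f"unknown use code: {use}")
    if contents not in CONTENT_CODES:
        raise ValueError(f"unknown contents code: {contents}")
    if not 1 <= len(extension) <= 3:
        raise ValueError("extension must have one to three characters")
    return f"{country.upper()}{subject}{version:02d}{use}{contents}.{extension}"


def parse_file_name(name):
    """Split an 8.3 file name back into its fields by character position."""
    stem, _, extension = name.partition(".")
    if len(stem) != 8:
        raise ValueError("file name stem must have exactly eight characters")
    return {
        "country": stem[0:2],
        "subject": stem[2:4],
        "version": int(stem[4:6]),
        "use": stem[6],
        "contents": stem[7],
        "extension": extension,
    }


print(build_file_name("BD", "C_", 1, "R", "D", "SAV"))   # BDC_01RD.SAV
print(parse_file_name("UAFH04GQ.DOC"))
```

Because every field sits at a fixed position, a file name can be decoded by simple slicing, which is the practical benefit of filling unused positions with an underscore.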

Such a standard naming convention greatly assists users in choosing the correct file from the dataset. In general, it facilitates processing the contents, often at a much later date, of computer-based storage systems that may contain thousands of files. Other information, such as survey year and survey round, may also be included in the file name. However, there are generally restrictions on the number of characters used in naming a file, with most computer systems allowing 8.3 structures, i.e. eight characters for the actual file name and three characters for the file extension (e.g. MY_FILE.DOC). The extension is usually allocated by the package that created the file. (MS Word, for example, will use DOC as the extension.) In other words, only eight characters can be manipulated to express as much information as possible about the nature of a file.

In view of these limitations, the following naming convention is recommended. All file names should start with a country code (Annex II lists the two-character codes), followed by the abbreviation C for child, P for parents, F for family (both parents and children), or H for house (dwelling). The version number follows and, since more than nine versions can easily evolve over time, two characters should be used. G indicates the file is available for general use, and R marks it as restricted. Finally, the eighth character (D, Q, or C, representing data, questionnaire, or codebook respectively) indicates the file contents. If any field in the file name is not applicable, it should be replaced with an underscore (_), thereby simplifying manipulations during computer processing.

In summary, when naming files according to an 8.3 structure, use the following convention.
The first eight characters:

- First and second characters: country code
- Third and fourth characters: child/parent (person), house (dwelling), or both
    C_  child only
    F_  child and parent (family)
    H_  house
    P_  parent only
    FH  a single file containing information on child, parent, and house (dwelling) combined
  Note: an underscore (_) is used to fill the blank space of the fourth character.
- Fifth and sixth characters: version number
    01  first or original version
    02  second version, and therefore not the original version
    and so on
- Seventh character: file use
    G  general (public) use
    R  restricted (internal) use (in case of data only)
- Eighth character: file contents
    C  codebook (normally associated with an ASCII data file)
    D  data
    I  summary of classification of industries
    L  internal consistency check rules
    Q  questionnaire
    S  summary of classification of occupations
    V  variable list

The last three characters, following the decimal point, denote the type of file (proprietary or otherwise). The following examples should clarify the convention:

BDC_01RD.DOC/SAV/POR
A file containing data about children in Bangladesh, and which is the original version, might be named BDC_01RD, where BD stands for Bangladesh; C stands for child; _ indicates there is no information regarding the house or dwelling; 01 marks this file as the first version; R shows that the file is restricted; and D stands for data. The corresponding public use data file derived from the above would carry the name BDC_01GD. The associated questionnaires would be called BDC_01GQ. (Since the questionnaires are for general public use, they would always include the G code.) The extension would say whether it is a package-specific data file or documentation. For example, an SPSS data file takes a SAV or POR extension, while documentation in MS Word takes a DOC extension.

UAFH04RD.[xxx]
Similarly, a file containing data about parents, children, and their house in Ukraine, and which is the fourth version, can be named UAFH04RD. The public use version would then be named UAFH04GD. Associated questionnaires are named UAFH04GQ, while a variable description file is named UAFH04GV. A summary classification of occupations file would be named UAFH04GS. All the file names would include appropriate three-character extensions.

PAFH02RD.txt
An ASCII data file that contains data about parents, children, and households in Panama, and is the second version, can be named PAFH02RD.txt, and the public use version would be PAFH02GD.txt. The associated codebook file should be named PAFH02GC with a TXT or DOC extension, depending on file type.

Creation and naming of variables

Once a survey is completed, a set of variables is created from the questionnaire (primary variables).
At a later stage, manipulating the primary variables may produce derived variables. Unless conventions are followed, naming these variables can prove awkward. Here are a few rules of thumb:

- Variable names should convey the meaning of the data content they represent. Any potential analyst should be confident that the same variable names apply to the same data. For example, if two questions are used to determine the work status of a respondent (e.g. enquiring as to both current work and usual work), the variables representing these questions should never be named work1 and work2, since this leaves it unclear which variable refers to which question.
- Ideally, questionnaires should be prepared such that each question comes with a predesignated variable name. For example, "How old are you?" would be annotated with the variable name AGE. This type of questionnaire is often referred to as an annotated questionnaire.
- As with files, naming variables often depends on statistical packages that restrict the number of characters in a name to eight or fewer (SPSS, for example). [2] The prevailing computing environment in any particular country will also influence naming conventions.
- Each answer in a multiple-choice question should also be assigned a variable name. For example, if question number 9 has two multiple-choice answers, the variables may be named Q9A and Q9B.

Several different methods may be applied for naming variables. [3]

- One-up numbers. In this approach, variables are numbered sequentially. Thus if there are 100 variables in a data file, they can be numbered from 1 to 100. However, many statistical software packages (e.g. SPSS) do not allow a digit to be the first character in a variable name, so a letter can be added as a first character. (In SPSS, for example, variable names will be automatically assigned either v1 to v100 or var0001 to var0100.) Variable names can be changed manually afterwards. The problem with this method is that it is often impossible to comprehend the meaning of a variable, or to match some variable names with the respective questions, without additional labels. Errors can easily happen if variables are named in this way.
- Question numbers. A possible alternative to the one-up number method is to name variables with the respective question number; for example, Q1 is the variable that corresponds to question 1. Since multiple-answer questions require more than one variable for a single question, a letter can be appended after the question number (Q4a, Q4b, etc.). Since all child labour questionnaires consist of multiple sections, the first letter can be chosen to represent the section (A1, A2, B4a, B4b, etc., where A and B are two different sections). Again, additional labels can be used to explain the actual meaning of the variables.
- Mnemonic names. In this method, variables are named with words representing the concept of the variable. However, the same word may carry different meanings for different users. Also, the maximum of eight permissible characters in the variable name may severely restrict how much of the actual meaning can be conveyed. It is also hard to assign the same word manually and consistently to different variables conveying the same type of meaning.
- Prefix, root, suffix systems. A possible alternative to the mnemonic method of constructing variable names is to use predefined abbreviated words and join them as prefix, root, and suffix. For example, all variables related to children may use CH as a prefix; WW and WY, to denote last week's work and last year's work respectively, as a root; and GRP, to group cases, as a suffix.
- Derived variables. As mentioned earlier, derived variables are created from primary variables or by combining multiple primary variables. For example, age may be a primary variable, but analysts might need information about children in the 5- to 9-year age group. Information about individual children's ages can then be grouped to form the derived variable age group. It is always recommended that primary and derived variables are distinguishable. For a variety of reasons, it is also advised that public use datasets should not contain large numbers of derived variables: they are costly in terms of data-processing time; if they are to be properly used, they need adequate explanations; and the datasets may become too large and unwieldy. Moreover, data analysts may not have occasion to use these derived variables at a later date, and may prefer to tailor derived variables to their own requirements.

[2] See Annex I for the maximum number of characters allowed in naming a variable in some statistical packages.
[3] This follows the approaches outlined in: Inter-university Consortium for Political and Social Research (ICPSR), Guide to Social Science Data Preparation and Archiving. Retrieved from

Remember that the weight factor included in a dataset is not a variable from the questionnaire, and it should be treated separately. It should be named WEIGHT, using the naming convention applied to a primary variable.

Individual countries are of course free to choose the naming convention appropriate to their variables. With the aim of establishing international consistency with regard to child labour data, however, the following rules are recommended:

- use the question-number method in naming variables, with the character representing the section appearing as the first character in the variable name;
- use the prefix method in naming derived variables;
- use capital letters for primary variables, when possible;
- use lower-case characters for derived variables; and
- name the weight factor according to the rules for primary variables, but at the same time distinguish it from a primary variable.

Variable labels

It is more difficult to understand a dataset if attributes associated with the variables (for example, the literal question asked) are not properly described inside that dataset. People who want to perform secondary analyses of child labour surveys prefer that all information be contained in the dataset. One sign-posting method is to provide an adequate label for each variable. Since nowadays almost all data processing software (e.g. SPSS) provides the option to add labels, this option should be used to describe each variable.
If no suitable labels can be found, the literal question together with the appropriate question number should be used as a label. If the variable is a derived variable, a label can be added to state which variable or variables were used to create it and, if possible, the reason for creating it.

Coding

A statistical software package is used to analyse the information collected in the field. The information therefore needs to be transformed into data that the software can handle. To this end, each answer is coded; coding is the process that determines which symbol represents which item. Coding should be undertaken during the survey design process, and it is important that the data processors themselves are involved. Child labour surveys should be pre-coded before data entry. All possible values, including those such as "not available", "not applicable" and "refused to answer", ought to be included in the questionnaire, and interviewers should receive proper training. These measures will greatly reduce the time that data entry or data processing personnel need to spend on coding. The following guidelines draw on the ICPSR Guide to Social Science Data Preparation and Archiving [4] and the Audience Dialogue Survey analysis [5].

[4] ibid.
[5] Audience Dialogue: Survey analysis. Retrieved from kya5.html

Should the need for additional codes arise (for example, assigning specific codes to answers to open-ended questions), this is to be carried out with proper consideration of the coding scheme defined during the questionnaire design. It is particularly important to ensure that there are no overlaps between code categories and that each code fits into only one category.

For open-ended questions, major categories/classifications should be identified by examining the number of responses and should be used for additional coding. The meaning of each code should be clearly documented.

During the additional coding procedure it is also good practice to preserve as much information as possible in the data as they are collected (i.e. no collapsing or bracketing etc.).

With occupational coding, it is important to follow a standard format defined by an accepted standards institution (e.g. the International Standard Classification of Occupations, ISCO-88) and to use as many digits, and therefore include as many details, as possible.

Specify all possible missing values (such as "no response" or "not applicable"). Assign the same value (99, for example) to a given type of missing value (e.g. "not applicable") throughout the same dataset.

One of the following factors is usually responsible for missing values in child labour survey data, and a different code should be assigned to each case.

Refused to answer. A child or parent did not answer the question.

Don't know. A child or parent was unable to answer the question. The respondent might not have had any concept of time or arithmetic, for example, and replied "Don't know" to the question: "What was your total income last year?" (Respondents should be discouraged from answering "Don't know".)

Not applicable. For some valid reason, the question was not asked. Following the response "Not working", for example, any questions related to income were not asked.
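The three kinds of missing value described above can be kept apart with reserved codes. The sketch below is only an illustration in Python: the codes 97 and 98, the question on days worked last week, and the function names are assumptions (the guide itself only mentions 99 as an example). The codes are chosen to lie outside the valid range of 0 to 7 days so that they can never be confused with a real answer.

```python
# Assumed reserved codes, documented in one place and reused across the dataset.
REFUSED = 97
DONT_KNOW = 98
NOT_APPLICABLE = 99
MISSING_CODES = {
    REFUSED: "Refused to answer",
    DONT_KNOW: "Don't know",
    NOT_APPLICABLE: "Not applicable",
}


def code_days_worked(raw, working):
    """Code the answer to a hypothetical question on days worked last week.

    `raw` is the answer as written on the questionnaire; `working` is False
    when the respondent reported not working, so the question was not asked.
    """
    if not working:
        return NOT_APPLICABLE
    answer = raw.strip().lower()
    if answer == "refused":
        return REFUSED
    if answer == "don't know":
        return DONT_KNOW
    # int() raises ValueError for a blank or non-numeric answer, so a blank
    # is never stored silently as a value or as zero.
    days = int(answer)
    if not 0 <= days <= 7:
        raise ValueError("days worked must be between 0 and 7")
    return days


def valid_values(values):
    """Drop the reserved missing codes before computing any statistic."""
    return [v for v in values if v not in MISSING_CODES]
```

A blank answer for a working respondent raises an error instead of being entered, which forces the case back to the questionnaire rather than leaving an undefined blank or zero in the data.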
It has been observed in many child labour surveys that missing values were left blank or coded with a zero that was not pre-defined. It is of paramount importance, therefore, that all cases are assigned different codes during the coding process, and these should then appear pre-coded in the questionnaire. If for any reason missing values are assigned codes, the documentation should include clear descriptions.

It is often quite difficult to code such items as occupations and industries. Where codes are developed, some classifications (occupation, for example) may be missed, making the jobs of enumerators and data processors even more difficult. Consequently, countries are encouraged to consult the following resources for help:

International Standard Classification of Occupations (ISCO) [6]
International Classification of Status in Employment (ICSE) [7]
International Standard Industrial Classification of all Economic Activities (ISIC) [8]
Classifications of Occupational Injuries [9]

[6], [7], [8], [9] Retrieved from [URLs missing]

This list, which is not exhaustive, can be accessed through the Bureau of Statistics web page. [10] Child labour classifications, the relevant categories varying from country to country, are not yet in a finalized form, and additional coding schemes may need to be developed.

Consistency and logic check rules

It is important to develop as many logic check rules as possible by going through the questionnaire. This requires a detailed understanding of the questionnaire and its flow, and will greatly help computer programmers at later stages. First, consistency check rules have to be generated by studying the routing of each question (e.g. if the answer to question 20 is "yes", questions 21 and 22 are skipped and should be coded accordingly). Sample responses from questionnaires that suggest other consistency checking rules include these:

A child aged younger than six years is reported as having completed secondary school.
A child is reported as not working but as nevertheless bringing cash into the household.
A child did not work, but reported a work-related injury.

Another type of logic check rule needs to be developed where the data contain a legal value that nevertheless does not look right. For example, a parent is reported as having 11 children. This may be true, but it may not look right, and could well represent a typographical error. The correct value may more likely be 1 child. The corresponding rule could read: "Flag cases where parents reported having more than 10 children." These flagged cases then need to be checked manually.

Imputations

Once consistency checks are performed, many missing values can be replaced following imputation rules. Imputations estimate what would otherwise be missing values, where survey respondents failed to provide responses to given items. One rule might indicate, for example, that a person's income can be imputed by generating a formula involving age, type of work, wage rate, and number of days worked in a particular geographical area.
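Since imputation comes only after consistency checking, it may help to see how the check rules listed above could be written down for a programmer. The following Python sketch is an assumption-laden illustration: the field names, the education code, and the flat record layout (child and parent items in one record) are invented for the example. It separates outright inconsistencies from values that are legal but need a manual look.

```python
SECONDARY_COMPLETED = 3  # hypothetical code for "completed secondary school"


def check_record(rec):
    """Return (errors, flags) for one hypothetical questionnaire record.

    `errors` lists contradictions between answers; `flags` lists legal
    values that look implausible and should be checked manually.
    """
    errors = []
    flags = []
    if rec["age"] < 6 and rec["education"] == SECONDARY_COMPLETED:
        errors.append("child under six reported as having completed secondary school")
    if not rec["working"] and rec["brings_cash"]:
        errors.append("not working but brings cash into the household")
    if not rec["working"] and rec["work_injury"]:
        errors.append("did not work but reported a work-related injury")
    if rec["parent_children"] > 10:
        flags.append("parent reported more than 10 children: check manually")
    return errors, flags


clean = {"age": 12, "education": 1, "working": True,
         "brings_cash": True, "work_injury": False, "parent_children": 4}
suspect = {"age": 5, "education": SECONDARY_COMPLETED, "working": False,
           "brings_cash": True, "work_injury": True, "parent_children": 11}

clean_result = check_record(clean)
suspect_errors, suspect_flags = check_record(suspect)
```

Keeping flags separate from errors reflects the distinction drawn above: a flagged value such as 11 children may be true and is only queued for manual checking, whereas an error can never be correct as recorded.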
As many imputation formulae as possible should be developed by going through the questionnaire. It must be decided how imputed variables are to be incorporated in the dataset, and, where needed, relevant computer programs may be developed and tested. For simplicity, either a completely new variable can be created that holds imputed values in place of missing codes, or the missing codes in the original variable can be replaced with imputed values, accompanied by a flag variable with a value of 1 if the value was imputed and 0 if not.

Weights

Since all child labour surveys are sample surveys, weights need to be calculated in order to produce national estimates. In choosing a sampling procedure, we should ask whether standard errors based on simple random sampling are appropriate, or whether more complex methods are required. If weights are required, they should be described. A clear indication of the response rate should be provided in the documentation, indicating what proportion of those sampled actually participated in the survey. The retention rate, if applicable, should also be noted. Weights are usually developed by specialists, and it is essential that a weighting formula with descriptions of all its elements is obtained well before data processing begins.

[10] Details may be obtained from [URL missing]
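The imputation flag and the use of weights described above can be sketched together. This Python example is a simplified assumption, not a recommended imputation model: the rule (wage rate multiplied by days worked), the missing code 999999, the variable names incimp and incflag, and all figures are invented. It shows the "new variable plus flag" approach and a weighted total; it does not compute standard errors, which need methods suited to the sample design.

```python
INCOME_MISSING = 999999  # assumed missing code, outside the plausible income range

# Invented records; WEIGHT is the sampling weight supplied by the specialists.
records = [
    {"income": 1200, "wage_rate": 10, "days_worked": 120, "WEIGHT": 50.0},
    {"income": INCOME_MISSING, "wage_rate": 8, "days_worked": 100, "WEIGHT": 40.0},
    {"income": 600, "wage_rate": 6, "days_worked": 100, "WEIGHT": 60.0},
]


def impute_income(rec):
    """Add `incimp` (income with imputed values) and `incflag` (1 = imputed).

    The original `income` variable is left untouched so that the collected
    data are preserved.
    """
    if rec["income"] == INCOME_MISSING:
        rec["incimp"] = rec["wage_rate"] * rec["days_worked"]
        rec["incflag"] = 1
    else:
        rec["incimp"] = rec["income"]
        rec["incflag"] = 0
    return rec


for rec in records:
    impute_income(rec)

# Weighted estimates: each respondent counts for WEIGHT people in the population.
weighted_total = sum(r["WEIGHT"] * r["incimp"] for r in records)
weighted_mean = weighted_total / sum(r["WEIGHT"] for r in records)

# Response rate to report in the documentation (invented counts).
sampled, responded = 4, 3
response_rate = responded / sampled
```

Analysts who distrust the imputation rule can exclude the records where incflag is 1, which is the practical reason for carrying the flag in the dataset.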

Documentation

Documentation should be as much a part of overall planning as is analysis. It has to be decided who is responsible for keeping a log of what is happening during data processing, including such considerations as problems encountered, major decisions taken, and any imputation method adopted. A more detailed account of this process is presented in Final documentation (Section 3.8).

2.4 Selection of hardware and software

Marshalling resources for a child labour survey strongly depends on what hardware, software, and national statistics office personnel are available. Given those constraints, the following aspects must be considered when selecting hardware and software for data processing: computers and printers; data entry and data cleaning; statistical processing and tabulations; documentation and other tabulations; software utility tools; automation tools (to perform repeated tasks); tools for transferring files among different computers; virus-checking software; and hardware accessories (cables, disks, CDs, UPS, etc.).

Computers and printers

Since data will be entered in batches and probably in parallel, one PC is needed for each data entry operator. Different data entry operators, however, can often share the same computer at different times. Printers capable of printing in landscape format are also necessary. If line printers or dot matrix printers are used, they should have a capacity of 120 characters per line. A Pentium computer with a 1 GB hard disk is more than enough for data processing and temporary storage of child labour survey data. A permanent computer is also needed where the final dataset will be archived. It is highly recommended that the computer used for permanent storage of data is not the same one used for day-to-day work, even where this computer may be a central one, shared by different sections in the national statistics office to store their data on a permanent basis.
Data entry and data cleaning

A great number of staff-hours are sometimes devoted to developing custom software for checking data entry errors. A better solution can be to use dedicated data entry software, most of which has some form of built-in checking facility. Over the years, a variety of organizations have developed data entry software, and many national statistics offices use one or more of the following programs for data entry and initial data validation (this list is not exhaustive):

Blaise. [11] Blaise is a flexible, relatively powerful software system developed by Statistics Netherlands for survey processing on microcomputers, covering computer-assisted interviewing, data entry, and data editing. Blaise also simplifies subsequent processing of the collected data. This software is being used primarily by European Union countries.

IMPS. [12] Developed by the US Census Bureau, the original DOS-based Integrated Microcomputer Processing System has been superseded by a Windows-based version. Many developing countries are using this software for data entry.

ISSA. [13] Integrated Systems for Survey Analysis is produced jointly by SerPro Ltd of Chile and Macro International of the USA. A number of developing countries are using this software for data entry. Evidence suggests that ISSA does not have a wide user base in SIMPOC countries and offers limited support in the form of training courses and documentation.

EpiInfo. [14] This word-processing, database, and statistics program for public health on IBM-compatible microcomputers is produced by the Centers for Disease Control and Prevention in the USA. Many developing countries are using this software for data entry.

CSPro. [15] The Census and Survey Processing System was also developed by the US Census Bureau. Incorporating many features of IMPS, ISSA, and EpiInfo, CSPro is designed eventually to replace both IMPS and ISSA.

Detailed evaluation of the above software lies beyond the scope of this manual. In general, however, availability of financial resources, trained personnel, and microcomputers are all important considerations in choosing any child labour survey software. Where no other data entry software is available and trained national statistics office personnel are lacking, CSPro (see above), public domain software from the US Census Bureau, can be used for entering, tabulating, and mapping survey data.
CSPro, together with its documentation, is available free of charge online, although online registration may be required. The US Census Bureau can arrange training programmes, but charges for them. According to the software documentation, it is possible to handle child labour survey data with this software. Nevertheless, although some national statistics offices reportedly use versions of this software, they have yet to be tried on child labour surveys specifically, and the training may be worth the cost.

An alternative is Blaise (see above), a user-friendly, high-speed data entry and data manipulation package with an interactive editing facility and survey management capabilities. The software is not free, but is offered at a discounted price to developing countries. However, it has a number of characteristics that can make it harder for non-programmers to learn. One such characteristic is the use of advanced programming concepts such as data typing and procedure parameters. Another is the lack of structured forms to aid in defining questionnaire forms and variables. Moreover, Blaise is not widely used outside Europe, so an established user base in developing countries does not yet exist.

[11] Details may be obtained from Statistics Netherlands [URL missing]
[12] Details may be obtained from U.S. Census Bureau index.html
[13] More information is available at SERPRO [URL missing]
[14] Details may be obtained from Centers for Disease Control and Prevention gov/epiinfo/
[15] Details may be obtained from U.S. Census Bureau cspro/index.html

In any case, software should be tested beforehand, and data entry operators should be both trained in the software and familiar with child labour surveys before actual data entry begins.

Processing and tabulations

Evidence suggests that virtually all national statistics offices have access to SAS, SPSS, or both. Where that is not the case, national statistics offices should try to adopt one standard statistical software package (e.g. SPSS, SAS, or Stata). Where that is not possible, data entry software can also be used for child labour survey data processing purposes. (Data analysis can be performed using EpiInfo, for example.) See Annex I for a comparison of the SAS, SPSS, and Stata statistical packages.

Documentation and other tabulation

The Microsoft Office suite, comprising Word, Excel, and Access, is used by many statistical offices and is adequate for creating the appropriate documentation, including the questionnaires. Both MS Excel, a spreadsheet program, and Access, a database program, are user-friendly means of preparing tables. TPL, table generation software from QQQ Software, [16] can also be used. Again, availability of resources and trained personnel should be the main criteria for choosing particular software.

Software utility tools

The following list of software utilities is not exhaustive, and many other utility tools may currently be in use in various countries.

Databases. General users are often unfamiliar with statistical packages, and they might prefer to have a subset of the data (or even the entire dataset) in a database format. Many statistical packages allow data to be saved in a database format, and database programs such as Microsoft Access are sometimes quite helpful.

File compression software (e.g. WinZip, PKZIP, gzip). This software is used for compressing files. It is sometimes possible to reduce file sizes by as much as 80 per cent or more using this kind of software.
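As a small illustration of file compression, the sketch below uses Python's standard gzip module on invented, highly repetitive survey rows. The data layout is an assumption, and the saving achieved on real survey files will differ from what this toy input gives.

```python
import gzip

# Invented fixed-layout rows; real data files will compress differently.
rows = ["%04d,A1,%02d,99,99,0\n" % (i, i % 18) for i in range(5000)]
raw = "".join(rows).encode("ascii")

# Compress in memory and measure the proportion of space saved.
packed = gzip.compress(raw)
saving = 1 - len(packed) / len(raw)

# Always confirm that the archive restores the original bytes exactly
# before deleting or overwriting the uncompressed file.
restored = gzip.decompress(packed)
```

Checking the restored bytes against the original is the important step, since a compressed copy that cannot be read back is of no use for storage or transfer.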
Compression is useful where a hard disk is short of storage space or when floppy disks are used to transfer files between computers.

Compiling software (e.g. Visual Basic, FoxPro, C++). This is programming software other than that incorporated in the statistical package. Compilers can be used to develop a user-friendly front end for data entry, for example, or to produce customized, in-house automation software for performing repetitive tasks.

Conversion software. Utility software such as STAT Transfer and DBMScopy converts files from one specific statistical package to another. SAS PROC CONVERT statements can easily convert SPSS portable files into SAS datasets.

File transfer software. This is software that allows files to be transferred between computers, whether networked or not. These utilities include Direct Cable Connection, which is included in the Windows operating system, or LL3 for non-Windows-based transfers. FTP programs are also helpful in transferring files among networked computers.

[16] More details are available at QQQ Software, Inc.


More information

Competent Data Management - a key component

Competent Data Management - a key component Competent Data Management - a key component Part II Illustrating the data entry application using CS-Pro April 2009 University of Reading Statistical Services Centre Data Management Support to RIU Projects

More information

Preparing data for sharing

Preparing data for sharing Preparing data for sharing Guide to social science data archiving DANS data guide 8 This Data Guide is aimed at those engaged in the cycle of social science research, from applying for a research grant,

More information

Simplify survey research with IBM SPSS Data Collection Data Entry

Simplify survey research with IBM SPSS Data Collection Data Entry IBM SPSS Data Collection Data Entry Simplify survey research with IBM SPSS Data Collection Data Entry Advanced, survey-aware software for creating surveys and capturing responses Highlights Create compelling,

More information

OCCUPATIONS & WAGES REPORT

OCCUPATIONS & WAGES REPORT THE COMMONWEALTH OF THE BAHAMAS OCCUPATIONS & WAGES REPORT 2011 Department of Statistics Ministry of Finance P.O. Box N-3904 Nassau Bahamas Copyright THE DEPARTMENT OF STATISTICS BAHAMAS 2011 Short extracts

More information

SPSS: Getting Started. For Windows

SPSS: Getting Started. For Windows For Windows Updated: August 2012 Table of Contents Section 1: Overview... 3 1.1 Introduction to SPSS Tutorials... 3 1.2 Introduction to SPSS... 3 1.3 Overview of SPSS for Windows... 3 Section 2: Entering

More information

Software: Systems and Application Software

Software: Systems and Application Software Software: Systems and Application Software Computer Software Operating System Popular Operating Systems Language Translators Utility Programs Applications Programs Types of Application Software Personal

More information

Project Data Archiving Lessons from a Case Study

Project Data Archiving Lessons from a Case Study Project Data Archiving Lessons from a Case Study March 1998 The University of Reading Statistical Services Centre Biometrics Advisory and Support Service to DFID Contents 1. Introduction 3 2. Why preserve

More information

XenData Archive Series Software Technical Overview

XenData Archive Series Software Technical Overview XenData White Paper XenData Archive Series Software Technical Overview Advanced and Video Editions, Version 4.0 December 2006 XenData Archive Series software manages digital assets on data tape and magnetic

More information

DATA QUALITY DATA BASE QUALITY INFORMATION SYSTEM QUALITY

DATA QUALITY DATA BASE QUALITY INFORMATION SYSTEM QUALITY DATA QUALITY DATA BASE QUALITY INFORMATION SYSTEM QUALITY The content of those documents are the exclusive property of REVER. The aim of those documents is to provide information and should, in no case,

More information

Exchange Mailbox Protection Whitepaper

Exchange Mailbox Protection Whitepaper Exchange Mailbox Protection Contents 1. Introduction... 2 Documentation... 2 Licensing... 2 Exchange add-on comparison... 2 Advantages and disadvantages of the different PST formats... 3 2. How Exchange

More information

RATIONALISING DATA COLLECTION: AUTOMATED DATA COLLECTION FROM ENTERPRISES

RATIONALISING DATA COLLECTION: AUTOMATED DATA COLLECTION FROM ENTERPRISES Distr. GENERAL 8 October 2012 WP. 13 ENGLISH ONLY UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Seminar on New Frontiers for Statistical Data Collection (Geneva, Switzerland,

More information

ABSTRACT. would end the use of the hefty 1.5-kg ticket racks carried by KSRTC conductors. It would also end the

ABSTRACT. would end the use of the hefty 1.5-kg ticket racks carried by KSRTC conductors. It would also end the E-Ticketing 1 ABSTRACT Electronic Ticket Machine Kerala State Road Transport Corporation is introducing ticket machines on buses. The ticket machines would end the use of the hefty 1.5-kg ticket racks

More information

Classification Appeal Decision Under section 5112 of title 5, United States Code

Classification Appeal Decision Under section 5112 of title 5, United States Code U.S. Office of Personnel Management Office of Merit Systems Oversight and Effectiveness Classification Appeals and FLSA Programs Atlanta Oversight Division 75 Spring Street, SW., Suite 1018 Atlanta, GA

More information

Notes. Business Management. Higher Still. Higher. www.hsn.uk.net. HSN81200 Unit 1 Outcome 2. Contents. Information and Information Technology 1

Notes. Business Management. Higher Still. Higher. www.hsn.uk.net. HSN81200 Unit 1 Outcome 2. Contents. Information and Information Technology 1 Higher Business Management Unit 1 Outcome 2 Contents Information and Information Technology 1 Data and Information 1 Information Sources 1 Information Types 2 The Value of Information 3 The Use of Information

More information

Accounts Receivable User Manual

Accounts Receivable User Manual Accounts Receivable User Manual Confidential Information This document contains proprietary and valuable, confidential trade secret information of APPX Software, Inc., Richmond, Virginia Notice of Authorship

More information

Space Project Management

Space Project Management EUROPEAN COOPERATION FOR SPACE STANDARDIZATION Space Project Management Information/Documentation Management Secretariat ESA ESTEC Requirements & Standards Division Noordwijk, The Netherlands Published

More information

DCI Solutions for Business Surveys 1

DCI Solutions for Business Surveys 1 DCI Solutions for Business Surveys 1 1. Collection by Paper - Scanning and Intelligent Character Recognition (ICR) 1.1. The current Integrated Data Capture (IDC) system for handling paper forms needs upgrading

More information

EMC Documentum Repository Services for Microsoft SharePoint

EMC Documentum Repository Services for Microsoft SharePoint EMC Documentum Repository Services for Microsoft SharePoint Version 6.5 SP2 Installation Guide P/N 300 009 829 A01 EMC Corporation Corporate Headquarters: Hopkinton, MA 01748 9103 1 508 435 1000 www.emc.com

More information

Results-based Management in the ILO. A Guidebook. Version 2

Results-based Management in the ILO. A Guidebook. Version 2 Results-based Management in the ILO A Guidebook Version 2 Applying Results-Based Management in the International Labour Organization A Guidebook Version 2 June 2011 Copyright International Labour Organization

More information

Manual for child labour data analysis and statistical reports. Statistical Information and Monitoring Programme on Child Labour (SIMPOC)

Manual for child labour data analysis and statistical reports. Statistical Information and Monitoring Programme on Child Labour (SIMPOC) ILO Manual for child labour data analysis and statistical reports SIMPOC Manual for child labour data analysis and statistical reports Statistical Information and Monitoring Programme on Child Labour (SIMPOC)

More information

Building an Integrated Clinical Trial Data Management System With SAS Using OLE Automation and ODBC Technology

Building an Integrated Clinical Trial Data Management System With SAS Using OLE Automation and ODBC Technology Building an Integrated Clinical Trial Data Management System With SAS Using OLE Automation and ODBC Technology Alexandre Peregoudov Reproductive Health and Research, World Health Organization Geneva, Switzerland

More information

Data Migration Service An Overview

Data Migration Service An Overview Metalogic Systems Pvt Ltd J 1/1, Block EP & GP, Sector V, Salt Lake Electronic Complex, Calcutta 700091 Phones: +91 33 2357-8991 to 8994 Fax: +91 33 2357-8989 Metalogic Systems: Data Migration Services

More information

Do you know? "7 Practices" for a Reliable Requirements Management. by Software Process Engineering Inc. translated by Sparx Systems Japan Co., Ltd.

Do you know? 7 Practices for a Reliable Requirements Management. by Software Process Engineering Inc. translated by Sparx Systems Japan Co., Ltd. Do you know? "7 Practices" for a Reliable Requirements Management by Software Process Engineering Inc. translated by Sparx Systems Japan Co., Ltd. In this white paper, we focus on the "Requirements Management,"

More information

II. Business case for Population and Housing Census 2011

II. Business case for Population and Housing Census 2011 Distr. GENERAL Working Paper 9 April 2013 ENGLISH ONLY UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE (ECE) CONFERENCE OF EUROPEAN STATISTICIANS ORGANISATION FOR ECONOMIC COOPERATION AND DEVELOPMENT (OECD)

More information

Documentation for data centre migrations

Documentation for data centre migrations Documentation for data centre migrations Data centre migrations are part of the normal life cycle of a typical enterprise. As organisations expand, many reach a point where maintaining multiple, distributed

More information

Guide to Social Science Data Preparation and Archiving. Best Practice Throughout the Data Life Cycle

Guide to Social Science Data Preparation and Archiving. Best Practice Throughout the Data Life Cycle Guide to Social Science Data Preparation and Archiving Best Practice Throughout the Data Life Cycle Copyright 2005, by the Inter-university Consortium for Political and Social Research (ICPSR) ICPSR Institute

More information

NHA. User Guide, Version 1.0. Production Tool

NHA. User Guide, Version 1.0. Production Tool NHA User Guide, Version 1.0 Production Tool Welcome to the National Health Accounts Production Tool National Health Accounts (NHA) is an internationally standardized methodology that tracks public and

More information

Data Management Procedures

Data Management Procedures Data Management Procedures Introduction... 166 Data management at the National Centre... 167 Data cleaning by the international contractor... 170 Final review of the data... 172 Next steps in preparing

More information

B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I

B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I B.Com(Computers) II Year RELATIONAL DATABASE MANAGEMENT SYSTEM Unit- I 1 1. What is Data? A. Data is a collection of raw information. 2. What is Information? A. Information is a collection of processed

More information

BC Geographic Warehouse. A Guide for Data Custodians & Data Managers

BC Geographic Warehouse. A Guide for Data Custodians & Data Managers BC Geographic Warehouse A Guide for Data Custodians & Data Managers Last updated November, 2013 TABLE OF CONTENTS INTRODUCTION... 1 Purpose... 1 Audience... 1 Contents... 1 It's All About Information...

More information

Records Management. Objectives. With the person sitting next to you, Presented by: Rachel Martin. After this workshop, you ll be able to:

Records Management. Objectives. With the person sitting next to you, Presented by: Rachel Martin. After this workshop, you ll be able to: Records Management Presented by: Rachel Martin Objectives After this workshop, you ll be able to: Implement a new records management system Perfect filing techniques Streamline and improve records management

More information

User Guide. Contents. December 2010 1

User Guide. Contents. December 2010 1 User Guide December 2010 1 User Guide Contents Welcome... 1 The course... 1 Technical information for using the tool... 2 How the tool is structured... 3 How to use the tool... 7 Course planner... 8 Using

More information

Litigation Support. Learn How to Talk the Talk. solutions. Document management

Litigation Support. Learn How to Talk the Talk. solutions. Document management Document management solutions Litigation Support glossary of Terms Learn How to Talk the Talk Covering litigation support from A to Z. Designed to help you come up to speed quickly on key terms and concepts,

More information

DATA CAPTURE AND PROCESSING 2006 POPULATION AND HOUSING CENSUS, NIGERIA,

DATA CAPTURE AND PROCESSING 2006 POPULATION AND HOUSING CENSUS, NIGERIA, DATA CAPTURE AND PROCESSING 2006 POPULATION AND HOUSING CENSUS, NIGERIA, 9-13 th June, 2008, Dar es salaam, Tanzania By Ms Adesola Fatilewa, National Population Commission, Nigeria Summary of Paper This

More information

1-04-10 Configuration Management: An Object-Based Method Barbara Dumas

1-04-10 Configuration Management: An Object-Based Method Barbara Dumas 1-04-10 Configuration Management: An Object-Based Method Barbara Dumas Payoff Configuration management (CM) helps an organization maintain an inventory of its software assets. In traditional CM systems,

More information

Clinical Data Management (Process and practical guide) Dr Nguyen Thi My Huong WHO/RHR/RCP/SIS

Clinical Data Management (Process and practical guide) Dr Nguyen Thi My Huong WHO/RHR/RCP/SIS Clinical Data Management (Process and practical guide) Dr Nguyen Thi My Huong WHO/RHR/RCP/SIS Training Course in Sexual and Reproductive Health Research Geneva 2012 OUTLINE Clinical Data Management CDM

More information

How To Use A Court Record Electronically In Idaho

How To Use A Court Record Electronically In Idaho Idaho Judicial Branch Scanning and Imaging Guidelines DRAFT - October 25, 2013 A. Introduction Many of Idaho s courts have considered or implemented the use of digital imaging systems to scan court documents

More information

Data Availability Policies & Author Responsibility Policies Time of Evaluation: May 2014

Data Availability Policies & Author Responsibility Policies Time of Evaluation: May 2014 Data policies found in a sample of 346 journals in economic sciences Data Availability Policies & Author Responsibility Policies Time of Evaluation: May 2014 Table of Contents: Data Availability Policies:...

More information

How To Read Data Files With Spss For Free On Windows 7.5.1.5 (Spss)

How To Read Data Files With Spss For Free On Windows 7.5.1.5 (Spss) 05-Einspruch (SPSS).qxd 11/18/2004 8:26 PM Page 49 CHAPTER 5 Managing Data Files Chapter Purpose This chapter introduces fundamental concepts of working with data files. Chapter Goal To provide readers

More information

How To Manage Assets On A Microsoft Powerbook 2.5.2 (For Microsoft)

How To Manage Assets On A Microsoft Powerbook 2.5.2 (For Microsoft) Sage Fixed Assets Tracking User s Guide Version 12.1 Contents Chapter 1. Introduction Welcome to Sage Fixed Assets............................................................... 1-2 Sage Fixed Assets -

More information

B.Sc (Computer Science) Database Management Systems UNIT-V

B.Sc (Computer Science) Database Management Systems UNIT-V 1 B.Sc (Computer Science) Database Management Systems UNIT-V Business Intelligence? Business intelligence is a term used to describe a comprehensive cohesive and integrated set of tools and process used

More information

How To Backup A Database In Navision

How To Backup A Database In Navision Making Database Backups in Microsoft Business Solutions Navision MAKING DATABASE BACKUPS IN MICROSOFT BUSINESS SOLUTIONS NAVISION DISCLAIMER This material is for informational purposes only. Microsoft

More information

International Certificate in Financial English

International Certificate in Financial English International Certificate in Financial English Past Examination Paper Writing May 2007 University of Cambridge ESOL Examinations 1 Hills Road Cambridge CB1 2EU United Kingdom Tel. +44 1223 553355 Fax.

More information

Guide for Applicants. Call for Proposal:

Guide for Applicants. Call for Proposal: Guide for Applicants Call for Proposal: COSME Work Programme 2014 TABLE OF CONTENTS I. Introduction... 3 II. Preparation of the proposal... 3 II.1. Relevant documents... 3 II.2. Participants... 4 II.2.1.

More information

OMCL Network of the Council of Europe QUALITY ASSURANCE DOCUMENT

OMCL Network of the Council of Europe QUALITY ASSURANCE DOCUMENT OMCL Network of the Council of Europe QUALITY ASSURANCE DOCUMENT PA/PH/OMCL (08) 69 3R Full document title and reference Document type VALIDATION OF COMPUTERISED SYSTEMS Legislative basis - CORE DOCUMENT

More information

Accounts Payable System Administration Manual

Accounts Payable System Administration Manual Accounts Payable System Administration Manual Confidential Information This document contains proprietary and valuable, confidential trade secret information of APPX Software, Inc., Richmond, Virginia

More information

Basic Requirements...2. Software Requirements...2. Mailbox...2. Gatekeeper...3. Plan Your Setup...3. Meet Extreme Processing...3. Script Editor...

Basic Requirements...2. Software Requirements...2. Mailbox...2. Gatekeeper...3. Plan Your Setup...3. Meet Extreme Processing...3. Script Editor... Guide on EDI automation and use of VAN services Copyright 2008-2009 Etasoft Inc. Main website http://www.etasoft.com Extreme Processing website http://www.xtranslator.com Basic Requirements...2 Software

More information

PART I DEPARTMENT OF PERSONNEL SERIVCES 4.140 STATE OF HAWAII 4.142... 4.144 MEDICAL RECORD TECHNICIAN SERIES

PART I DEPARTMENT OF PERSONNEL SERIVCES 4.140 STATE OF HAWAII 4.142... 4.144 MEDICAL RECORD TECHNICIAN SERIES PART I DEPARTMENT OF PERSONNEL SERIVCES 4.140 STATE OF HAWAII 4.142................................................................ 4.144 Class Specifications for the: MEDICAL RECORD TECHNICIAN SERIES

More information

Improvements of the Census Operation of Japan by Using Information Technology

Improvements of the Census Operation of Japan by Using Information Technology Paper to be presented at the 22nd Population Census Conference March 7-9, 2005, Seattle, Washington, USA Improvements of the Census Operation of Japan by Using Information Technology Statistics Bureau

More information

IMPROVING OPERATIONAL QUALITY AND PRODUCTIVITY FOR THE 1990 CENSUS

IMPROVING OPERATIONAL QUALITY AND PRODUCTIVITY FOR THE 1990 CENSUS IMPROVING OPERATIONAL QUALITY AND PRODUCTIVITY FOR THE 1990 CENSUS Robert T. Smith, Jr., John Linebarger, and Glenn White, Jr. United States Bureau of the Census 1/ Robert T. Smith, Jr., Statistical Support

More information

Help to Buy: Equity Loan scheme 19,394 properties 791 million 184,995 36,999 First Time Buyers, account- ing for 16,964 (87.

Help to Buy: Equity Loan scheme 19,394 properties 791 million 184,995 36,999 First Time Buyers, account- ing for 16,964 (87. Help to Buy (Equity Loan scheme) and Help to Buy: NewBuy statistics: Data to 31 March 2014, England In the first twelve months of the Help to Buy: Equity Loan scheme (to 31 March), 19,394 properties were

More information

Quick Guide: Meeting ISO 55001 Requirements for Asset Management

Quick Guide: Meeting ISO 55001 Requirements for Asset Management Supplement to the IIMM 2011 Quick Guide: Meeting ISO 55001 Requirements for Asset Management Using the International Infrastructure Management Manual (IIMM) ISO 55001: What is required IIMM: How to get

More information