Intelligent Use of Metadata in the Questionnaire Design Process

Karen BRANNEN
Centre for Educational Sociology
University of Edinburgh
St John's Land, Holyrood Road
Edinburgh EH8 8AQ
United Kingdom
e-mail: K.Brannen@ed.ac.uk

Abstract: IQML (A Software Suite and Extended Markup Language (XML) Standard for Intelligent Questionnaires) is a project funded by the EU. Modules will be produced for metadata maintenance, questionnaire design, questionnaire presentation, database interrogation and survey administration. This paper discusses the main points of innovation in the questionnaire design module: capturing and storing the data used within the questionnaire design process as metadata within a repository for later re-use; the ability to design any questionnaire at a conceptual level and have this design realised in different media; and the re-use of questions stored in question banks. Additionally, different approaches to questionnaire design and different stages within the process are discussed. Finally, the paper gives an overview of the other modules in the IQML system and demonstrates how they interact.

Keywords: metadata, intelligent questionnaires, questionnaire design, surveys, question re-use, question banks, metadata repository

1. Introduction

Within the questionnaire design process there is a great deal of information that is used to realise the goal of producing a questionnaire and is subsequently discarded. This data includes structural, contextual and semantic information as well as validation and navigation rules. However, much of this data can be used as metadata further on in the statistical process, for example in the production of datasets and documentation. This is one element of intelligence which is inherent in every questionnaire but is very often not used. Another interpretation of intelligence concerns the tools given to the questionnaire designer, one aspect of which is the re-use of questions stored in question banks.

An intelligent tool also links the questions to underlying concepts and to the variables underlying those concepts.

This paper discusses the main points of innovation used in the IQML system, particularly within the questionnaire design module. The first is capturing and storing the data used within the questionnaire design process as metadata within a repository for re-use later within the statistical process. The second is the ability to design any questionnaire at a conceptual level and have this same design realised in different media. The third is the ability to re-use questions which have been stored in a question bank in the metadata repository.

In addition, the paper discusses different ways of approaching questionnaire design based on the needs of different types of users and different types of questionnaires. Furthermore, the paper describes the two clear stages within the questionnaire design process: development and specification. In IQML, we define development as the complete conceptual design of the questionnaire, and specification as the implementation of a questionnaire which has already been designed at the conceptual level. Finally, the paper places the questionnaire design process in a broader framework by giving an overview of the other modules in the IQML system and demonstrating how they interact.

2. The IQML project

IQML [footnote 1] (A Software Suite and Extended Markup Language (XML) Standard for Intelligent Questionnaires) is a project funded by the EU under the Framework 5 IST programme. Its goal is to automate and integrate the data collection process by producing software to support and implement emerging metadata standards using definition tools such as XML. The aims of the project will be achieved by developing five related software modules (metadata maintenance, questionnaire design, questionnaire presentation, database interrogation and survey administration) and by contributing to specification standards. These five modules are described in more detail later in the paper. The project contributes to standards by participating in the development of the Common Warehouse Metamodel (CWM) of the Object Management Group (OMG).

This participation ensures that emerging models in the domain of object analysis can be used to define the structure, behaviour and visualisation of a statistical questionnaire, including its relevant metadata. Furthermore, the resulting DTD and supporting XML software will demonstrate the benefits of both XML and object technology for administrations, enterprises and other organisations in the context of intelligent questionnaires.

The resulting software will be demonstrated to the wider community by using it directly with the databases of six of the largest financial institutions in Ireland, and six SMEs using the Internet solution. By the end of the project, the software will also have been demonstrated at an intensive workshop aimed at members of NSIs and candidate countries.

The Centre for Educational Sociology (CES) at the University of Edinburgh co-ordinates and manages the project. The bulk of the technical work is split between four partners: the University of Edinburgh (questionnaire design), Dimension EDI (metadata maintenance), DESAN Marktonderzoek (survey administration) and Comfact AB (questionnaire presentation and database interrogation). Two National Statistical Institutes (the Central Statistics Office of Ireland and Statistics Norway) contribute to user needs and carry out the prototype testing. The National Statistical University of Athens provides knowledge of statistics, and acts as a channel of communication with users in Greece.

This paper discusses the main points of innovation used within the questionnaire design module, focussing particularly on the theoretical and conceptual aspects of the questionnaire design process.

Footnote 1: http://www.epros.ed.ac.uk/iqml

3. Interpreting intelligence

What do we mean by an intelligent questionnaire? In this paper the focus is on how intelligence already available within the questionnaire design process can, and indeed should, be exploited, and on how intelligence, especially in the form of a tool, can be used to aid the questionnaire designer throughout the whole process.

3.1 History of CES

From 1976 to 1993, the CES conducted the Scottish Young People's Survey, first on behalf of the Social Science Research Council (SSRC) and later for the Scottish Office. During this time considerable expertise in the design, conduct and analysis of educational surveys was gained. The Centre wrote its own software to support these activities. Questmast [1], a questionnaire design tool, was developed in the Centre in 1981. In its original form it was used for CES surveys for 10 years, running on the University mainframe, and was later translated into a PC version with a graphical user interface. Questmast allowed users to define questions and group them into related questionnaires. Camera-ready output was produced for printing, and the program also exported an SPSS set-up file for the resulting data. Later, it also exported a relational database schema.

In the late 1980s, the output from the Questmast program was linked to a survey metadata system which included management of the databases and production of the documentation [2]. The experience gained from writing this software and observing its flaws has been vital for our input to the questionnaire design module of the IQML system.

Since we were involved in the whole process of survey design and implementation (including questionnaire design, survey administration, data collection, database creation, data documentation and analysis), we were able to make some observations about the use of metadata throughout the process. For instance, there were cases where the same metadata had to be retyped for a different part of the process. We believe that, although there are many questionnaire design packages on the market, very few, if any, deliver the functionality of capturing all of the metadata once and storing it for subsequent use. This has been the ideal of statistical information systems for many years but has not yet been realised [3].

3.2 Data used as metadata

Figure 1 shows an example of a relatively simple question which was recently used in a project currently running at CES [4]. All of the text is semantic metadata which can be used in the design of the database and the documentation of the survey. However, during the questionnaire design process, this is simply the data which is required in order to produce a questionnaire. Furthermore, there is a set of data which may be unseen in the published questionnaire but which should be captured as metadata for subsequent processing. Some examples of this are:

- Routing information (e.g. whether this question is skipped by a filter)
- Validation rules (e.g. whether the answer to this question is used to validate any other)
- Sequencing (e.g. identifying which question comes before and after this one).

The ultimate goal of a questionnaire is to collect data for some form of analysis. All of the previously mentioned metadata is vital if the data is to be interpreted in the correct way.

    4  How would you describe the involvement of staff in your school in the
       Higher Still Development programme in any of the capacities listed in Q.3?
       Please tick one box
           No staff involved
           A few staff involved
           Some staff involved
           A lot staff involved

Figure 1: an example question

The ability to capture such metadata at the first moment it is used and store it for subsequent use is one element of intelligence.

3.3 An intelligent tool

3.3.1 Re-use of questions in question banks

Intelligence within the questionnaire design process can also be seen as the intelligence of the tools provided to the designer, one aspect of which would be tools to allow re-use of existing questions. Very often, the questions asked are very similar, or identical, to questions which appear elsewhere in the current questionnaire or are asked in another questionnaire, possibly at another time-point. One of the major innovations within the IQML system is the storage of question banks within a metadata repository [5].

In this way, questions will be available either from a question bank belonging to someone else, for browsing and copying, or from one's own question bank, for browsing, copying and saving.
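As an illustration only, the following Python sketch shows how the metadata listed in section 3.2 (routing, validation and sequencing) might be held alongside a question, and how a question bank might support saving, browsing and copying. The class names, field names and identifiers such as "ius-q4" are hypothetical and are not taken from the IQML software.

```python
from __future__ import annotations

from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Question:
    """One question plus metadata captured at design time (hypothetical fields)."""
    qid: str
    text: str
    responses: list[str]
    skipped_by: str | None = None                        # routing
    validates: list[str] = field(default_factory=list)   # validation
    previous: str | None = None                          # sequencing
    following: str | None = None                         # sequencing


class QuestionBank:
    """A named collection of stored questions (hypothetical interface)."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._questions: dict[str, Question] = {}

    def save(self, question: Question) -> None:
        # Store a private copy so later edits do not alter the bank.
        self._questions[question.qid] = deepcopy(question)

    def browse(self) -> list[str]:
        return sorted(self._questions)

    def copy(self, qid: str, new_qid: str) -> Question:
        # Assumption: wording and responses are re-used, while routing,
        # validation and sequencing belong to the source questionnaire
        # and are therefore left empty in the copy.
        source = self._questions[qid]
        return Question(new_qid, source.text, list(source.responses))


bank = QuestionBank(owner="CES")
bank.save(Question(
    qid="ius-q4",
    text="How would you describe the involvement of staff in your school?",
    responses=["No staff involved", "A few staff involved",
               "Some staff involved", "A lot staff involved"],
    previous="ius-q3",
))
reused = bank.copy("ius-q4", "new-q1")
```

Resetting the routing and sequencing fields on copy is a design choice made for this sketch: those items describe a question's place in one particular questionnaire, whereas the wording and response categories are what a designer would typically want to re-use.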

3.3.2 Linking questions to concepts and variables

Another aspect of an intelligent tool is giving the questionnaire designer the ability to link the questions to an underlying concept. The concept is the main underlying construct or indicator which the questionnaire designer is trying to identify. Generally, the questionnaire designer starts with an idea of the research question or hypothesis which needs to be answered [6]. This main hypothesis is broken down into a series of concepts into which the questions can be collected. The concepts are in turn broken down into a series of variables which, taken together, can be used to describe the concept. A classic example of this is the construction of some kind of social class indicator. There are usually a number of questions (e.g. description of job, whether self-employed, size of business), the answers to which are taken together to form a further variable describing social class. A concept can therefore be seen as a hierarchical structure, each branch of which ends in a single variable, each of which should be linked to a question.

3.3.3 Using different media

A further point of innovation within the IQML system is the ability to design a questionnaire within the system and then to see this same design realised in several different media. How a questionnaire is delivered to a respondent has changed along with technology. In the past, the only method available was paper. This still has a place and is still widely used. However, there are now many other media which can be used, for instance computer-aided telephone interviewing, computer-aided personal interviewing and delivery via the web. Each of these has a place and it is conceivable that several different media be used in a single survey design. For instance, the main survey could be carried out on paper but telephone interviewing may be used in the case of non-response. In the IQML system, the user will be able to design the questionnaire conceptually and then present it using different media.

3.3.4 Question types

In the IQML system, every question in a questionnaire has a type. The use of question types is desirable for a number of reasons:

- it helps keep the styles of the questionnaire consistent
- it gives the user a type of shorthand
- it maintains consistency with the data to be captured.

At first sight a question type may appear to be purely syntactic; however, there are some underlying semantics which can be captured. There are three underlying concepts which define a question type: data type, response type and sub-questions. The data type is that of the resulting answer (e.g. integer, string, date). There is also the type of response expected. This can be classified as, for example:

- Single: a single response to a question (e.g. number of children)
- Choice: the respondent must choose one of a number of possible responses
- Multiple: the respondent can choose any of a number of possible responses.

A single question on a page can be seen as having several sub-questions. For instance, Figure 2 shows a single question which actually asks 3 separate questions with the same response categories.
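The three components of a question type described above can be sketched as a small data structure. This is a minimal illustration under our own assumptions: the names QuestionType, ResponseType and accepts are hypothetical and do not come from the IQML specification.

```python
from dataclasses import dataclass
from enum import Enum


class ResponseType(Enum):
    SINGLE = "single"      # one value entered, e.g. number of children
    CHOICE = "choice"      # exactly one of the listed responses
    MULTIPLE = "multiple"  # any number of the listed responses


@dataclass(frozen=True)
class QuestionType:
    """Hypothetical question type: data type, response type, sub-questions."""
    name: str
    data_type: type
    response_type: ResponseType
    sub_questions: int = 1  # rows sharing the same response categories

    def accepts(self, answer, categories=()):
        """Check one answer against the semantics of this question type."""
        if self.response_type is ResponseType.SINGLE:
            return isinstance(answer, self.data_type)
        if self.response_type is ResponseType.CHOICE:
            return answer in categories
        # MULTIPLE: every selected item must be a listed category.
        return set(answer) <= set(categories)


NUMBER = QuestionType("number", int, ResponseType.SINGLE)
TICK_ONE = QuestionType("tick-one", str, ResponseType.CHOICE)
TICK_ANY = QuestionType("tick-any", str, ResponseType.MULTIPLE)
```

Defining the types once and referring to them by name is what gives the designer the "shorthand" mentioned above, and the same definition can later be used to check captured data.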

    5  Before Higher Still, did your school offer
       Please tick one box for each line

                               Yes    No    If yes, how many
       GSVQs
       School Group Awards
       NC clusters

Figure 2: an example of a more complex question

In some cases the expected data type of the response changes within a question (see Figure 2), in which case the question has several chunks. In the figure, the data type of the third column differs from the other two since it expects a number to be entered rather than a tick or a check. A chunk is therefore a part of a question which shares a uniform response data type. Once question types have been identified, they are available to the user, who can specify which question type a question belongs to [7]. The use of question types also determines the nature of the related variables, thereby using intelligence in the construction of the database for later data capture.

4. Questionnaire development

4.1 Approaches to design

There are different approaches to the development of a questionnaire, based on the needs of different types of user and different types of questionnaire. The expected users of the questionnaire design module of IQML are, in the main, statistical institutes, academic researchers and market researchers [8]. Cutting across the types of user are the types of questionnaire which these users produce. These vary from very short and simple questionnaires (no routing) distributed to a small number of respondents, up to long and complex questionnaires (lots of routing) distributed to a large number of respondents several times over a period of years. Everything in between is also covered (e.g. short questionnaires to many respondents, or long questionnaires distributed just once to a small number of respondents). The combination of type of user and type of questionnaire will influence the way in which questionnaire design is approached.

If a short, simple questionnaire is being designed, then the user will not want to start with the design of concepts; they will simply want to design a few questions and put them in the order they require as quickly as possible. However, if the user is designing a large study with several different types of lengthy questionnaire with lots of routing, then they will require a different level of support. The use of question banks and the design of concepts will be vital for this user.
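For the user who does start from concepts, the hierarchy described in section 3.3.2 (a concept broken down into further concepts and finally into variables, each of which should be linked to a question) can be sketched as a simple tree. The class names, variable names and question identifiers below are hypothetical and serve only to illustrate the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Variable:
    """Leaf of the hierarchy; question_id stays None until a question is linked."""
    name: str
    question_id: str | None = None


@dataclass
class Concept:
    """A concept made up of sub-concepts and/or variables."""
    name: str
    children: list[Concept | Variable] = field(default_factory=list)

    def variables(self):
        """Yield every variable found at the end of a branch."""
        for child in self.children:
            if isinstance(child, Concept):
                yield from child.variables()
            else:
                yield child

    def unlinked(self) -> list[str]:
        """Names of variables that no question has been linked to yet."""
        return [v.name for v in self.variables() if v.question_id is None]


# The social class example from section 3.3.2, with invented identifiers.
social_class = Concept("social class", [
    Variable("job_description", question_id="q10"),
    Concept("employment status", [
        Variable("self_employed", question_id="q11"),
        Variable("business_size"),  # not yet linked to a question
    ]),
])
```

A check such as social_class.unlinked() is one way a design tool could warn the designer that part of a concept is not yet covered by any question.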

4.2 Stages of design

Linked with the above discussion on types of user and types of questionnaire is the concept of stages of questionnaire design. In the IQML system we have identified two clear stages of designing a questionnaire: development and specification. In the development stage, the designer carries out the complete conceptual design of the questionnaire, starting with the design of the concepts and continuing down to the definition of variables. The question types to be used, the overall style, classifications etc. may also be defined at this stage. The questionnaire is then specified by actually constructing the questions using the previously defined question types and applying style and navigation. There are some cases where the questionnaire designer is handed a design of a questionnaire already sketched out, perhaps by hand, and asked to implement it. In this case, only the questionnaire specification stage needs to be done.

5. The IQML system

The Questionnaire Design module, as previously discussed, enables the user to design and manage questionnaires which can be deployed using the other software modules of the suite. The tool allows the user to define questionnaires at a number of levels: conceptual, logical and formal. Attention is paid to the requirements of different types of respondent (business and individual), and to the different types of surveys (e.g. economic or social) that may be addressed. The questionnaire design tool captures all relevant metadata and stores it in the metadata repository.

The Questionnaire Presentation tool renders the questionnaire for use with PCs and in particular with web browsers. XML support for presentation, validation, navigation and calculation will be implemented by the tool. This allows users to fill in the data and validate it as appropriate.

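The idea of holding one definition of a question in XML and deriving both its presentation and its validation from that definition can be illustrated with a short sketch using Python's standard xml.etree.ElementTree module. The element and attribute names ("question", "text", "response", "code") are invented for this example and are not the IQML DTD; the two rendering functions merely stand in for different media.

```python
import xml.etree.ElementTree as ET


def question_to_xml(qid, text, responses):
    """Build an XML element for one question (element names are hypothetical)."""
    question = ET.Element("question", id=qid)
    ET.SubElement(question, "text").text = text
    for code, label in enumerate(responses, start=1):
        ET.SubElement(question, "response", code=str(code)).text = label
    return question


def render_paper(question):
    """Plain-text rendering with tick boxes, standing in for a paper form."""
    lines = [question.findtext("text"), "Please tick one box"]
    lines += [f"[ ] {r.text}" for r in question.findall("response")]
    return "\n".join(lines)


def render_telephone(question):
    """Interviewer script, standing in for telephone interviewing."""
    options = "; ".join(
        f"{r.get('code')} = {r.text}" for r in question.findall("response")
    )
    return f"{question.findtext('text')} (read out: {options})"


def is_valid(question, code):
    """Validation: the answer must be one of the defined response codes."""
    return code in {r.get("code") for r in question.findall("response")}


q4 = question_to_xml(
    "q4",
    "How would you describe the involvement of staff in your school?",
    ["No staff involved", "A few staff involved",
     "Some staff involved", "A lot staff involved"],
)
```

Calling ET.tostring(q4, encoding="unicode") gives the serialised form that could be exchanged between modules, while render_paper(q4) and render_telephone(q4) produce two presentations of the same underlying definition.
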
The first trial of the IQML project tested the Questionnaire Presentation tool in order to prove the concept. It was piloted in the field in two applications: balance of payments data by the CSO, and financial and servicing data from local authorities by SSB (Statistics Norway). The trial was completed successfully and an evaluation report was written and peer-reviewed; it forms Deliverable 5 of the project [9].

The Database Interrogation tool supports the extraction of data from popular databases and maps these data to the XML. It also allows data to be extracted from the XML and loaded into a database. Once configured, this will support the automated loading and extraction of data to and from databases and the electronic questionnaire.

The Survey Administration package allows the questionnaires to be integrated with registers and sample frames. It tracks the despatch and receipt of questionnaires and software to and from individuals and organisations.

Sitting at the heart of all of these modules is the Metadata Repository, which supports the definition of metadata objects that can be used in a questionnaire. APIs are being developed to store and access these metadata objects. The product allows questionnaire design systems and other software to access the metadata without the need to know the underlying structure or source of the metadata, by implementing object interfaces that follow international standards.

6. Conclusions

In conclusion, intelligence should be used throughout the questionnaire design process and into the subsequent stages of survey processing. Intelligence within this process can take many forms. The questionnaire designer should be given a tool with enough intelligence to allow any type of user to design any type of questionnaire at the conceptual level and then realise that same design in many different media. At the same time, the tool should capture as much metadata as possible and store it so that it can be used not only by designers themselves for the design of future questionnaires but also in the eventual interpretation of the resulting data. The use of a question bank is important to allow re-use of questions. The innovation in IQML is in the use of a metadata repository which allows sharing of metadata objects (including question banks) between modules of the IQML system and other software. IQML is contributing towards the realisation of the ideal of capturing all metadata, once, at the moment it is most appropriate, for use in the subsequent processing of the resulting data.

References

[1] Lamb, J.M. QUESTMAST: a package to aid the design and construction of questionnaires, Royal Statistical Society News and Notes, December 1983.
[2] Lamb, J.M. Metadata in survey processing, Proceedings of the EUROSTAT Statistical Meta Information Systems Workshop, Luxembourg, 2-4 February, 1993, ISBN 92-826-0478-0.
[3] Sundgren, B. An Infological Approach to data bases, Statistics Sweden, Stockholm, 1973.
[4] The Introduction of a Unified System of Post-Compulsory Education in Scotland (IUS), ESRC, April 2000 to July 2003.
[5] Nelson, C. The Affect of Standards on Software Component Architecture, in Proceedings of the ASC conference: The Challenge of the Internet, edited by Andrew Westlake, Chesham, UK, 11-12 May, 2001, forthcoming.
[6] Peterson, R. A. Constructing Effective Questionnaires, Sage Publications Inc., California, 2000, ISBN 0-7619-1641-5.
[7] Lamb, J.M. Formatting Questionnaires - the Questionnaire Design Module, presented to IQML Fourth Project Meeting, Den Dolder, Netherlands, 28-30 March, 2001.
[8] Pagrach, K., Rutjes, H., Tjemmes, R. Synthesised description of user needs, Deliverable 4 of the IQML project, 2001.
[9] Folkedal, J., Hoel, T. Evaluation report of Trial 1, Deliverable 5 of the IQML project, 2000.