Some fallacies and remedies in secondary data analysis for survey data


1 Some fallacies and remedies in secondary data analysis for survey data. Giancarlo Manzi, Department of Economics, Management and Quantitative Methods, Università degli Studi di Milano, Italy; Sonia Stefanizzi, Department of Sociology and Social Research, University of Milan-Bicocca, Italy; Pier Alda Ferrari, Department of Economics, Management and Quantitative Methods, Università degli Studi di Milano, Italy. Conference of European Statistics Stakeholders, November 24-25, 2014, Rome, Sapienza

2 Fisher's famous quote revisited "To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of" (Fisher, 1938). We revisit this famous quote as follows: "To call in the statistician after the experiment is done may sometimes be convenient: he or she can revive it!" Our particular focus is on Secondary Data Analysis (SDA) fallacies in surveys that emerge during the statistical analysis. Some suggestions for future surveys and remedies for current European surveys are also presented.

3 Overview of the talk Introduction and motivation. Quality of data and data analysis: coherence and comparability. Issues in conducting SDA: data validity, reflexivity and reliability. SDA statistical remedies: combining survey results so that they borrow strength from each other; use of suitable analysis tools; building improved surveys from existing surveys. An example on European data showing how these fallacies arise when performing statistical analysis, and some points to think about. Conclusion and future steps.

4 Introduction and motivation (1) In a collaborative project between statisticians and sociologists, a series of problems arose while performing SDA on European data. Hence the need to study this topic further. We started with general definitions of data information quality. For example, Kenett & Shmueli (2014) define eight dimensions of information quality: data resolution; data structure; data integration; temporal relevance; generalizability; chronology of data and goal; construct operationalization; and communication.

5 Introduction and motivation (2) Eurostat has established seven dimensions of quality since 2000 (source: Eurostat, 2000). Each quality dimension is followed by its remark:
1. Relevance of statistical concept. A statistical product is relevant if it meets user needs. Thus user needs have to be established at the outset.
2. Accuracy of estimates. Accuracy is the difference between the estimate and the true parameter value. Assessing the accuracy is not always possible, due to financial and methodological constraints.
3. Timeliness and punctuality in disseminating results. In our experience this is perhaps one of the most important user needs, probably because this dimension is so obviously linked to an efficient use of the results.
4. Accessibility and clarity of the information. Results are of high value when they are easily accessible and available in forms suitable to users. The data provider should also assist the users in interpreting the results.
5. Comparability. Reliable comparisons across space and time are often crucial. Recently, new demands for cross-national comparisons have become common. This in turn puts new demands on developing methods for adjusting for cultural differences.
6. Coherence. When originating from a single source, statistics are coherent in that elementary concepts can be combined in more complex ways. When originating from different sources, and in particular from statistical studies of different periodicities, statistics are coherent insofar as they are based on common definitions, classifications, and methodological standards.
7. Completeness. Domains for which statistics are available should reflect the needs and priorities expressed by users as a collective.

6 Introduction and motivation (3) In this talk we focus on points 5 and 6 above. Data quality can be addressed in connection with coherence and comparability as follows: i. summarizing the most common fallacies linked to survey implementation and result analysis; ii. recalling some statistical tools able to reduce or control bias when analyzing, comparing or combining survey data. Special reference is also made to problems arising in SDA, with an example from the social and economic field (the Eurobarometer (EB) survey) in which issues are detected and some remedies are sketched.

7 SDA: target switching from original data SDA: the set of research activities through which data from different surveys, each with its own assumptions and conceptual framework, are used individually for purposes not necessarily coinciding with those that guided the data collection. Examples: Boudon, 1973: through SDA the social researcher can widen the validity of atomic results to the point of being able to modify the original conceptual framework, formulating new interpretative hypotheses which can differ from those of the primary analysis. Ferrari & Salini, 2011: data on European user satisfaction with utilities can be used to reveal the multifaceted importance and quality of different aspects of public services.

8 Issues in SDA: data validity DV: the correspondence between the characteristics to be detected and the indicators chosen to measure them. Objective (i.e. observable) aspects must lead to latent states of societies and individuals. [Diagram labels: construction of variables; appropriate conceptualization; appropriate measurement; meaning of the real relationship between variables revealed.]

9 Issues in SDA: data reflexivity Example: performing immigration surveys. Immigration surveys also express the reflexive character of policies, and may therefore turn out to be limited and constrained. Examples of constraints: immigration policies defined only in terms of migrant categories and quotas; official statistics focused almost exclusively on foreigners' legal matters, such as their nationality, residence, duration and purpose of stay, etc. This sometimes leaves aside other important components of migration, such as the social context of the migrants' origin (urban/rural), their social background, the way their migratory experience is articulated, etc.

10 Issues in SDA: data reliability DR: the degree to which data collection procedures are applied in a consistent and coherent way with respect to previously established criteria. The reliability issue occurs both at the level of data production and at the level of data collection, classification and dissemination.

11 Some examples of remedies for SDA fallacies on EU data (1) Methods for blending results from different surveys to attenuate data flaws. 1. Small area estimation. Lohr & Brick (2012) explore methods for small domain estimation from two surveys when one survey is believed to be biased with respect to the other. The novelty of their work is that they use methods to adjust estimates before a new companion survey is implemented, i.e. at the stage of constructing a newly planned survey. 2. Meta-analytic approaches. Manzi et al. (2011) use a meta-analytic approach with a hierarchical Bayesian model for small domain statistics. Official survey estimates are integrated with estimates from smaller surveys covering smaller areas. Estimates are averaged with weights proportional to the strength of each survey, so that the bigger surveys dominate the others, but information also comes from smaller, more up-to-date surveys.
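The weighting idea in point 2 can be illustrated with a much simpler stand-in than the hierarchical Bayesian model of the cited work: a fixed-effect, inverse-variance weighted average, in which each survey's weight is proportional to its precision. The function name `pool_estimates` and all numbers below are hypothetical and serve only as an illustration.

```python
def pool_estimates(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of survey estimates.

    Simplified stand-in, NOT the hierarchical Bayesian model cited in
    the slide: each survey is weighted by 1 / SE^2, so more precise
    (typically larger) surveys dominate the pooled value.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = (1.0 / total) ** 0.5
    return pooled, pooled_se


# Hypothetical prevalence estimates: a large official survey (SE 0.01)
# and a smaller, more recent local survey (SE 0.02).
pooled, pooled_se = pool_estimates([0.20, 0.30], [0.01, 0.02])
print(round(pooled, 3), round(pooled_se, 4))
```

The pooled value lies closer to the more precise survey, while the smaller survey still contributes. Unlike a hierarchical model, this sketch makes no allowance for bias in either survey.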

12 Some examples of remedies for SDA fallacies on EU data (2) Methods for the detection of latent variables which explain hidden structures in the data. 1. Ferrari et al. (2010) use Nonlinear Principal Component Analysis (NPCA) to detect latent constructs and then average them over countries. 2. In Ferrari & Salini (2011) NPCA is proposed together with the Rasch Model (RM) for the assessment of latent concepts such as satisfaction with public services. With this joint use of NPCA and RM, the level of satisfaction is determined individually via NPCA, and the importance of single satisfaction components (given by the component loadings in NPCA) and the quality of the components (given by the item parameters of RM) are also determined.
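To make the role of the RM item parameters concrete, here is a minimal sketch of the simplest, dichotomous form of the Rasch model, in which the probability of a positive answer depends only on the difference between a person parameter and an item parameter. The cited work deals with satisfaction scales that may well be polytomous, so this is an illustration of the principle under that simplifying assumption; the function name and values are hypothetical.

```python
import math


def rasch_probability(theta, beta):
    """Dichotomous Rasch model: P(X = 1 | theta, beta).

    theta: person parameter (e.g. latent satisfaction)
    beta:  item parameter (how hard the item is to endorse)
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Hypothetical values: one respondent, an "easy" and a "hard" item.
p_easy = rasch_probability(theta=0.5, beta=-1.0)
p_hard = rasch_probability(theta=0.5, beta=2.0)
p_equal = rasch_probability(theta=1.0, beta=1.0)
print(round(p_easy, 3), round(p_hard, 3), p_equal)
```

When the person and item parameters coincide the probability is one half, and items with larger parameters are endorsed less often by the same respondent.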

13 A motivational example: SDA fallacies arising in the EB survey Analysis of European citizens' attachment/expectations/information with regard to the EU. Data: EB survey. Techniques used: NPCA and multilevel (ML) analysis. Evident problems: an excessive number of Don't Know (DK) answers: were the questions erroneously formulated? Sometimes a DK answer makes sense, sometimes not. Should a way to avoid or reduce their presence in the data set be established? They call for a great deal of imputation. Sometimes there is no coherence in the scales of variables of the same type (same questionnaire sections), so that recoding is needed.
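A first screening step for the DK problem can be sketched as follows: compute the share of DK answers per question and flag the questions above a chosen threshold. The coding `"DK"`, the question names and the 25% threshold are assumptions made for this illustration, not values taken from the EB data.

```python
def dk_share(responses, dk_code="DK"):
    """Share of 'Don't Know' answers for each question.

    responses maps a question name to a non-empty list of answers.
    """
    return {
        question: sum(answer == dk_code for answer in answers) / len(answers)
        for question, answers in responses.items()
    }


# Hypothetical answers on a 1-4 scale, with "DK" for Don't Know.
responses = {
    "q_trust": [1, 2, "DK", 4, "DK", 3, 2, "DK"],
    "q_info": [1, 1, 2, 3, 2, "DK", 4, 4],
}
shares = dk_share(responses)
flagged = [q for q, share in shares.items() if share > 0.25]
print(shares, flagged)
```

Flagged questions are candidates for a closer look at their wording before any imputation is attempted.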

14 A motivational example: Post-analysis incoherence detection (1) Question understanding: what does «understand» really mean in this question? Respondents may be puzzled. Consequence: when exploring for latent variables (using NPCA), this question is ambiguously classified.

15 A motivational example: Post-analysis incoherence detection (2) Ambiguous questions/wording: is this a question about trust in the EU? Or rather about how well-informed citizens are about it (most probably, but not certainly)? Consequence: again, problems when clustering variables.

16 A motivational example: Post-analysis incoherence detection (3) Is the choice of question formulation (verb tense, for example) perhaps decisive in assigning a question to one category rather than another? Sometimes questions have intrinsic double meanings: are we sure that this question is really suitable for checking how well-informed citizens are about the EU? Is it also trying to investigate their attachment?

17 A motivational example: SDA in action (1) We wanted to evaluate EU citizens' feelings about the EU. An initial set of 44 candidate variables was identified in a series of meetings among the authors. The categories of some variables were recoded in reverse order for homogeneity with the other variables. After performing an NPCA with three and four components on these variables, some variables were excluded because they were not clearly in line with any of the extracted components. Some variables initially assigned to one dimension were moved to another. Final number of variables left: 37.
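The reverse recoding mentioned above can be sketched as follows, assuming an ordinal scale coded 1 to k and a separate marker for missing answers. The function name and the example values are hypothetical.

```python
def reverse_code(values, n_categories, missing=None):
    """Reverse an ordinal scale coded 1..n_categories.

    Category 1 becomes n_categories, 2 becomes n_categories - 1, and so
    on; entries equal to `missing` are passed through unchanged.
    """
    recoded = []
    for value in values:
        if value == missing:
            recoded.append(missing)
            continue
        if not 1 <= value <= n_categories:
            raise ValueError(f"value {value} outside 1..{n_categories}")
        recoded.append(n_categories + 1 - value)
    return recoded


# Hypothetical 4-point item where 1 originally meant "totally agree".
recoded = reverse_code([1, 2, 4, None, 3], n_categories=4)
print(recoded)
```

Applying the function twice returns the original coding, which is a quick check that no category was lost.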

18 A motivational example: SDA in action (2) 22 variables for the EU attachment/expectation/confidence dimension. 9 variables for the EU evaluation dimension. 6 variables for the level of information about the EU dimension. After some correlation and regression analysis on the sociodemographic variables in the EB data set, 4 individual variables were left. After performing NPCA, individual NPCA scores were obtained separately for each of the three dimensions. Country averages of these scores were then obtained, intended to show the average country EU attachment/evaluation/information.
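The aggregation of individual scores into country averages can be sketched as below. The records are invented, and the sketch uses unweighted means; whether survey weights were applied in the original analysis is not stated, so that is an assumption of this example.

```python
from collections import defaultdict


def country_means(records):
    """Unweighted mean score per country.

    records is an iterable of (country_code, score) pairs, e.g. the
    individual NPCA scores on one dimension.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for country, score in records:
        sums[country] += score
        counts[country] += 1
    return {country: sums[country] / counts[country] for country in sums}


# Hypothetical individual scores on one dimension.
records = [("IT", 0.4), ("IT", -0.2), ("DE", 1.0), ("DE", 0.0), ("DE", 0.5)]
means = country_means(records)
ranking = sorted(means, key=means.get, reverse=True)
print(means, ranking)
```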

19 A motivational example: SDA in action (3) We also wanted to know whether the country ranking on citizens' attachment/evaluation/information was related to contextual country variables, and therefore performed an ML analysis, inserting contextual independent variables to detect determinants of attachment/evaluation/information. Contextual variables were essential economic and social measurements (GDP per capita, public debt, index of deprivation, inactivity rate, etc.). After performing the ML analysis, the ranking was altered with respect to a one-level analysis, and citizens of the so-called PIIGS were not among the least satisfied with the EU.

20 NPCA logic NPCA: belongs to the nonlinear multivariate analysis family; is the nonlinear counterpart of principal component analysis; provides dimensionality reduction by means of a nonlinear transformation of variables, i.e. by assigning quantitative values to qualitative scales; has a solution which is derived by minimizing a least-squares-type loss function, expressed in terms of optimally quantified variables and scores on objects.

21 NPCA: how it works in general (1) The goal of NPCA is the construction of a p-dimensional Euclidean space in which objects (individuals) are represented. Suppose J categorical variables are observed on N objects (survey respondents). In Gifi-style notation: let $X$ be an $N \times p$ matrix of object scores (to be determined); let $Y_j$ be the $k_j \times p$ matrix of "quantifications" of the j-th variable ($Y_j$ has to be determined, $j = 1, \dots, J$; $k_j$ is the number of categories of the j-th variable); let $G_j$ be an $N \times k_j$ indicator matrix with entries $g^{(j)}_{it} = 1$ if object i holds category t and $g^{(j)}_{it} = 0$ otherwise.
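The indicator matrix of a single categorical variable can be built as in the following sketch: one row per object, one column per category, with a single 1 in each row. The function name, the category labels and the answers are hypothetical.

```python
def indicator_matrix(values, categories):
    """N x k indicator (dummy) matrix of one categorical variable.

    Row i has a 1 in the column of the category held by object i and
    0 elsewhere. `categories` fixes the column order.
    """
    column = {category: t for t, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1
        rows.append(row)
    return rows


# Hypothetical answers of four respondents to one question.
G = indicator_matrix(
    ["agree", "disagree", "agree", "neutral"],
    categories=["disagree", "neutral", "agree"],
)
print(G)
```

Every row sums to one, and the column sums give the category frequencies.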

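22 NPCA: how it works in general (2) The solution of NPCA is determined by minimizing the following loss function:
$$\sigma(X; q, \beta) = \frac{1}{J} \sum_{j=1}^{J} \mathrm{SSQ}\left(X - G_j q_j \beta_j'\right)$$
where $X$ is the $N \times p$ matrix of object scores, $G_j$ is the $N \times k_j$ indicator matrix of the j-th variable, SSQ(H) denotes the sum of squares of the elements of matrix H, $q_j$ is a $k_j$-column vector of single category optimal quantifications for the j-th variable and $\beta_j$ is a p-column vector of weights.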
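A loss of this kind can be evaluated directly for given scores, quantifications and weights. The sketch below assumes the standard Gifi-type form, the average over variables of SSQ(X - G_j q_j β_j'), including the 1/J normalization; it only evaluates the loss and does not perform the alternating least squares minimization. Instead of storing G_j, it stores the category index of each object, which is equivalent. All names and numbers are hypothetical.

```python
def npca_loss(X, codes, quantifications, loadings):
    """Evaluate (1/J) * sum_j SSQ(X - G_j q_j beta_j').

    X:               N x p object scores (list of rows)
    codes[j][i]:     category index of object i on variable j
                     (row i of G_j has its 1 in this column)
    quantifications: quantifications[j][t] is element t of q_j
    loadings:        loadings[j] is beta_j, a list of p weights
    """
    n_variables = len(codes)
    total = 0.0
    for j in range(n_variables):
        for i, scores in enumerate(X):
            q = quantifications[j][codes[j][i]]
            # Row i of G_j q_j beta_j' is q * beta_j.
            total += sum((x - q * b) ** 2 for x, b in zip(scores, loadings[j]))
    return total / n_variables


# Hypothetical toy problem: N = 3 objects, p = 1 dimension, J = 2 variables.
X = [[1.0], [0.0], [-1.0]]
codes = [[0, 1, 2], [0, 0, 1]]
quantifications = [[1.0, 0.0, -1.0], [0.5, -1.0]]
loadings = [[1.0], [1.0]]
loss = npca_loss(X, codes, quantifications, loadings)
print(loss)
```

In this toy case the first variable reproduces the scores exactly, so the whole loss comes from the second variable, whose first two objects share a category but have different scores.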

23 ML: how it works in general (1) Consider the simple regression model
$$y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$$
fitted in J different groups ($j = 1, \dots, J$; schools, regions, countries, etc.) with individual variables $x_{ij}$. At the second level, a group variable $w_j$ expressing changes from group to group can be important to explain second-level variability, and therefore:
$$\beta_{0j} = \gamma_{00} + \gamma_{01} w_j + u_{0j}, \qquad \beta_{1j} = \gamma_{10} + \gamma_{11} w_j + u_{1j}$$
Inserting the two equations above in the regression model we get the full two-level multilevel linear model:
$$y_{ij} = (\gamma_{00} + \gamma_{01} w_j + \gamma_{10} x_{ij} + \gamma_{11} w_j x_{ij}) + [u_{0j} + u_{1j} x_{ij} + \varepsilon_{ij}]$$
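The substitution step can be checked numerically. The sketch below computes one response in two ways: first through the group-specific intercept and slope (each a linear function of a group variable w plus a group-level random term), then through the combined form with a fixed part and a bracketed random part. The parameter values are hypothetical and chosen only to show that the two forms agree.

```python
def y_two_step(x, w, gamma, u0, u1, eps):
    """Response via the level-2 equations plugged into level 1."""
    beta0 = gamma["00"] + gamma["01"] * w + u0  # group intercept
    beta1 = gamma["10"] + gamma["11"] * w + u1  # group slope
    return beta0 + beta1 * x + eps


def y_combined(x, w, gamma, u0, u1, eps):
    """Response via the combined two-level equation."""
    fixed = (gamma["00"] + gamma["01"] * w
             + gamma["10"] * x + gamma["11"] * w * x)
    random_part = u0 + u1 * x + eps
    return fixed + random_part


# Hypothetical coefficients, one individual in one group.
gamma = {"00": 1.0, "01": 0.5, "10": 2.0, "11": -0.25}
args = dict(x=3.0, w=2.0, gamma=gamma, u0=0.1, u1=-0.2, eps=0.05)
y_a = y_two_step(**args)
y_b = y_combined(**args)
print(y_a, y_b)
```

The cross-level interaction term (the coefficient on w times x) appears only because the slope is allowed to depend on the group variable.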

24 A motivational example: results (1) ATTACHMENT (table; numerical entries missing)
Columns: Coefficient, SE, z, p-value, CI.
Model 0 (only random intercept). Rows: Intercept; Random effects (RE): first-level variance (variability between citizens), second-level variance (variability between countries); Deviance.
Model 1 (only individual explanatory variables). Rows: Intercept; Individual variables: Age education, Age: years, Age: 55 years or more, Job: medium status, Job: high status, Community: small-medium town, Community: big town; Random effects (RE): first-level variance (variability between citizens).
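The two variance components of a random-intercept model are commonly summarized by the intraclass correlation, the share of total variance that lies between groups (here, between countries). Since the numerical entries of the table are not available, the values in this sketch are hypothetical, and the function name is introduced only for illustration.

```python
def intraclass_correlation(between_variance, within_variance):
    """Share of total variance due to the second level (countries)."""
    if between_variance < 0 or within_variance < 0:
        raise ValueError("variances must be non-negative")
    total = between_variance + within_variance
    if total == 0:
        raise ValueError("total variance is zero")
    return between_variance / total


# Hypothetical variance components from a null (random-intercept) model.
icc = intraclass_correlation(between_variance=0.2, within_variance=0.8)
print(icc)
```

A value near zero would suggest that a one-level analysis loses little, while a sizeable value supports the multilevel specification.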

25 A motivational example: results (2) Model 2 (full model): individual and contextual explanatory variables (table; numerical entries missing)
Columns: Coefficient, SE, z, p-value, CI.
Rows: Intercept; Individual variables: Age education, Age: years, Age: 55 years or more, Job: medium status, Job: high status, Community: small-medium town, Community: big town; Contextual variables: Public deficit (2013), Household deprivation (2013); Random effects (RE): first-level variance (variability between citizens), second-level variance (variability between countries); Deviance.

26 A motivational example: results (3) Two-step analysis on EU attachment. First case: residuals of the null model (no explanatory variables). Second case: residuals of the individual model (only individual explanatory variables). Third case: residuals of the full model (individual and contextual explanatory variables).

27 Some focus points for discussion (1) More integration between disciplines (Statistics and Sociology, for example). For instance, the European Social Survey is pushing towards more integrated work for the improvement of survey results. Statistics is useful for interpreting survey respondents' answers, for example to unveil citizens' attitudes towards the EU. Some classic and consolidated questions in EU questionnaires about citizens' attachment to the EU may prove obsolete: statistical techniques help in detecting flaws to be avoided in future surveys. In our work, the dimensions traditionally used by EU policy makers to analyze the level of Europeanization (evaluation, information and attachment) have shown many problems.

28 Some focus points for discussion (2) When doing SDA, researchers are focused only on their particular problems. Comparability, harmonization and quality: in practice these problems are not sufficiently highlighted, or are treated superficially ("This new wave does not contain this question contained in the previous wave"). Statistical analysis helps in formulating new proposals for an improved survey in the course of its implementation. Statistical techniques should not be used for the benefit of statistics only, but should be contextualized to give an answer to epistemological problems.

29 Some suggestions Questions in questionnaires should be as objective as possible. When planning new surveys, use the results that emerged from statistical analysis in other studies/surveys (meta-analytic approach). Construct new surveys starting from the fallacies that emerged from statistical analysis (Lohr's example). Analyses should be contextualized with reference to different areas. A meta-data codebook with rules also derived from previous statistical analysis should accompany traditional meta-data (example from the ML results: are Greeks really angry with the EU?).

30 References
Boudon, R. (1973) Equality, Opportunity, and Social Inequality. New York: Wiley.
Eurostat (2000) Assessment of the Quality in Statistics. Eurostat/A4/Quality/00/General Standard Report, April 4-5, Luxembourg.
Ferrari, P. A., Annoni, P., Manzi, G. (2010) Evaluation and Comparison of European Countries: Public Opinion on Services, Qual Quant, 44.
Ferrari, P. A., Salini, S. (2011) Complementary Use of Rasch Models and Nonlinear Principal Components Analysis in the Assessment of the Opinion of Europeans about Utilities, J Classif, 28.
Fisher, R. A. (1938) Indian statistical congress. CA: Sankhya.
Kenett, R. S., Shmueli, G. (2014) On information quality, J Roy Stat Soc A Sta, 177.
Lohr, S. L., Brick, J. M. (2012) Blending domain estimates from two victimization surveys with possible bias, Can J Stat, 40(4).
Manzi, G., Spiegelhalter, D. J., Turner, R. M., Flowers, J., Thompson, S. G. (2011) Modelling bias in combining small area prevalence estimates from multiple surveys, J Roy Stat Soc A Sta, 174.


More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

Longitudinal Meta-analysis

Longitudinal Meta-analysis Quality & Quantity 38: 381 389, 2004. 2004 Kluwer Academic Publishers. Printed in the Netherlands. 381 Longitudinal Meta-analysis CORA J. M. MAAS, JOOP J. HOX and GERTY J. L. M. LENSVELT-MULDERS Department

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics

More information

Quality and critical appraisal of clinical practice guidelines a relevant topic for health care?

Quality and critical appraisal of clinical practice guidelines a relevant topic for health care? Quality and critical appraisal of clinical practice guidelines a relevant topic for health care? Françoise Cluzeau, PhD St George s Hospital Medical School, London on behalf of the AGREE Collaboration

More information

A Bayesian hierarchical surrogate outcome model for multiple sclerosis

A Bayesian hierarchical surrogate outcome model for multiple sclerosis A Bayesian hierarchical surrogate outcome model for multiple sclerosis 3 rd Annual ASA New Jersey Chapter / Bayer Statistics Workshop David Ohlssen (Novartis), Luca Pozzi and Heinz Schmidli (Novartis)

More information

Graduate Certificate in Systems Engineering

Graduate Certificate in Systems Engineering Graduate Certificate in Systems Engineering Systems Engineering is a multi-disciplinary field that aims at integrating the engineering and management functions in the development and creation of a product,

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix.

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. Nullspace Let A = (a ij ) be an m n matrix. Definition. The nullspace of the matrix A, denoted N(A), is the set of all n-dimensional column

More information

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

More information

Problem of Missing Data

Problem of Missing Data VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;

More information

ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL

ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL Kardi Teknomo ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL Revoledu.com Table of Contents Analytic Hierarchy Process (AHP) Tutorial... 1 Multi Criteria Decision Making... 1 Cross Tabulation... 2 Evaluation

More information

10. Analysis of Longitudinal Studies Repeat-measures analysis

10. Analysis of Longitudinal Studies Repeat-measures analysis Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

More information

Appendix B Data Quality Dimensions

Appendix B Data Quality Dimensions Appendix B Data Quality Dimensions Purpose Dimensions of data quality are fundamental to understanding how to improve data. This appendix summarizes, in chronological order of publication, three foundational

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

Multilevel Models for Social Network Analysis

Multilevel Models for Social Network Analysis Multilevel Models for Social Network Analysis Paul-Philippe Pare ppare@uwo.ca Department of Sociology Centre for Population, Aging, and Health University of Western Ontario Pamela Wilcox & Matthew Logan

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Multivariate Analysis of Variance (MANOVA): I. Theory

Multivariate Analysis of Variance (MANOVA): I. Theory Gregory Carey, 1998 MANOVA: I - 1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the

More information

Power and sample size in multilevel modeling

Power and sample size in multilevel modeling Snijders, Tom A.B. Power and Sample Size in Multilevel Linear Models. In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science. Volume 3, 1570 1573. Chicester (etc.): Wiley,

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

Differences in Characteristics of the ERP System Selection Process between Small or Medium and Large Organizations

Differences in Characteristics of the ERP System Selection Process between Small or Medium and Large Organizations Proc. of the Sixth Americas Conference on Information Systems (AMCIS 2000), pp. 1022-1028, Long Beach, CA, 2000. Differences in Characteristics of the ERP System Selection Process between Small or Medium

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling

Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling Pre-requisites Modules 1-4 Contents P5.1 Comparing Groups using Multilevel Modelling... 4

More information

Local outlier detection in data forensics: data mining approach to flag unusual schools

Local outlier detection in data forensics: data mining approach to flag unusual schools Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential

More information

MAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A =

MAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A = MAT 200, Midterm Exam Solution. (0 points total) a. (5 points) Compute the determinant of the matrix 2 2 0 A = 0 3 0 3 0 Answer: det A = 3. The most efficient way is to develop the determinant along the

More information

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-

More information

Monica Pratesi, University of Pisa

Monica Pratesi, University of Pisa DEVELOPING ROBUST AND STATISTICALLY BASED METHODS FOR SPATIAL DISAGGREGATION AND FOR INTEGRATION OF VARIOUS KINDS OF GEOGRAPHICAL INFORMATION AND GEO- REFERENCED SURVEY DATA Monica Pratesi, University

More information

How To Understand The Data Collection Of An Electricity Supplier Survey In Ireland

How To Understand The Data Collection Of An Electricity Supplier Survey In Ireland COUNTRY PRACTICE IN ENERGY STATISTICS Topic/Statistics: Electricity Consumption Institution/Organization: Sustainable Energy Authority of Ireland (SEAI) Country: Ireland Date: October 2012 CONTENTS Abstract...

More information

THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA

THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA Abstract THE IMPACT OF MACROECONOMIC FACTORS ON NON-PERFORMING LOANS IN THE REPUBLIC OF MOLDOVA Dorina CLICHICI 44 Tatiana COLESNICOVA 45 The purpose of this research is to estimate the impact of several

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

Early FP Estimation and the Analytic Hierarchy Process

Early FP Estimation and the Analytic Hierarchy Process Early FP Estimation and the Analytic Hierarchy Process Luca Santillo (luca.santillo@gmail.com) Abstract Several methods exist in order to estimate the size of a software project, in a phase when detailed

More information

Basic Concepts in Research and Data Analysis

Basic Concepts in Research and Data Analysis Basic Concepts in Research and Data Analysis Introduction: A Common Language for Researchers...2 Steps to Follow When Conducting Research...3 The Research Question... 3 The Hypothesis... 4 Defining the

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

DGD14-006. ACT Health Data Quality Framework

DGD14-006. ACT Health Data Quality Framework ACT Health Data Quality Framework Version: 1.0 Date : 18 December 2013 Table of Contents Table of Contents... 2 Acknowledgements... 3 Document Control... 3 Document Endorsement... 3 Glossary... 4 1 Introduction...

More information

Should we Really Care about Building Business. Cycle Coincident Indexes!

Should we Really Care about Building Business. Cycle Coincident Indexes! Should we Really Care about Building Business Cycle Coincident Indexes! Alain Hecq University of Maastricht The Netherlands August 2, 2004 Abstract Quite often, the goal of the game when developing new

More information

HMRC Tax Credits Error and Fraud Additional Capacity Trial. Customer Experience Survey Report on Findings. HM Revenue and Customs Research Report 306

HMRC Tax Credits Error and Fraud Additional Capacity Trial. Customer Experience Survey Report on Findings. HM Revenue and Customs Research Report 306 HMRC Tax Credits Error and Fraud Additional Capacity Trial Customer Experience Survey Report on Findings HM Revenue and Customs Research Report 306 TNS BMRB February2014 Crown Copyright 2014 JN119315 Disclaimer

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

[This document contains corrections to a few typos that were found on the version available through the journal s web page]

[This document contains corrections to a few typos that were found on the version available through the journal s web page] Online supplement to Hayes, A. F., & Preacher, K. J. (2014). Statistical mediation analysis with a multicategorical independent variable. British Journal of Mathematical and Statistical Psychology, 67,

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Overview... 2. Accounting for Business (MCD1010)... 3. Introductory Mathematics for Business (MCD1550)... 4. Introductory Economics (MCD1690)...

Overview... 2. Accounting for Business (MCD1010)... 3. Introductory Mathematics for Business (MCD1550)... 4. Introductory Economics (MCD1690)... Unit Guide Diploma of Business Contents Overview... 2 Accounting for Business (MCD1010)... 3 Introductory Mathematics for Business (MCD1550)... 4 Introductory Economics (MCD1690)... 5 Introduction to Management

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

More information

Random Effects Models for Longitudinal Survey Data

Random Effects Models for Longitudinal Survey Data Analysis of Survey Data. Edited by R. L. Chambers and C. J. Skinner Copyright 2003 John Wiley & Sons, Ltd. ISBN: 0-471-89987-9 CHAPTER 14 Random Effects Models for Longitudinal Survey Data C. J. Skinner

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

Competency 1 Describe the role of epidemiology in public health

Competency 1 Describe the role of epidemiology in public health The Northwest Center for Public Health Practice (NWCPHP) has developed competency-based epidemiology training materials for public health professionals in practice. Epidemiology is broadly accepted as

More information

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

More information

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.

CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C. CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In

More information

Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance. Chapter 2: Statistical models of default Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information