A repeated measures concordance correlation coefficient



Similar documents
Econometrics Simple Linear Regression

Section 3 Part 1. Relationships between two numerical variables

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Nonparametric Statistics

Introduction: Overview of Kernel Methods

Analysing Questionnaires using Minitab (for SPSS queries contact -)

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Some probability and statistics

An Introduction to Machine Learning

Lecture 3: Linear methods for classification

MULTIVARIATE PROBABILITY DISTRIBUTIONS

The Basics of Regression Analysis. for TIPPS. Lehana Thabane. What does correlation measure? Correlation is a measure of strength, not causation!

Fitting Subject-specific Curves to Grouped Longitudinal Data

Sections 2.11 and 5.8

Notes on Applied Linear Regression

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Least Squares Estimation

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

Efficiency and the Cramér-Rao Inequality

Module 3: Correlation and Covariance

The Bivariate Normal Distribution

Highlights the connections between different class of widely used models in psychological and biomedical studies. Multiple Regression

Covariance and Correlation

Exploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016

Illustration (and the use of HLM)

東 海 大 學 資 訊 工 程 研 究 所 碩 士 論 文

Understanding and Applying Kalman Filtering

Hypothesis testing - Steps

Statistiek II. John Nerbonne. October 1, Dept of Information Science

Introduction to General and Generalized Linear Models

containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.

Additional sources Compilation of sources:

Sample Size Calculation for Longitudinal Studies

UNIVERSITY OF NAIROBI

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

An introduction to Value-at-Risk Learning Curve September 2003

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Multivariate normal distribution and testing for means (see MKB Ch 3)

Goodness of fit assessment of item response theory models

A Basic Introduction to Missing Data

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Simple Linear Regression Inference

Longitudinal Data Analysis

Statistics for Sports Medicine

Chapter 1. Longitudinal Data Analysis. 1.1 Introduction

Understanding the Impact of Weights Constraints in Portfolio Theory

Correlation in Random Variables

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Confidence Intervals for Spearman s Rank Correlation

Models for Longitudinal and Clustered Data

Regression Analysis. Regression Analysis MIT 18.S096. Dr. Kempthorne. Fall 2013

DAIC - Advanced Experimental Design in Clinical Research

Linear Models for Continuous Data

SPSS ADVANCED ANALYSIS WENDIANN SETHI SPRING 2011

Factor Analysis. Factor Analysis

Introduction to Matrix Algebra

Poisson Models for Count Data

Clustering in the Linear Model

Tail-Dependence an Essential Factor for Correctly Measuring the Benefits of Diversification

II. DISTRIBUTIONS distribution normal distribution. standard scores

Part 2: Analysis of Relationship Between Two Variables

Analysis of Questionnaires and Qualitative Data Non-parametric Tests

Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance

October 3rd, Linear Algebra & Properties of the Covariance Matrix

Analysis of Data. Organizing Data Files in SPSS. Descriptive Statistics

1 Portfolio mean and variance

Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb

Week 5: Multiple Linear Regression

Data Mining: Algorithms and Applications Matrix Math Review

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

Lecture 10. Finite difference and finite element methods. Option pricing Sensitivity analysis Numerical examples

5. Linear Regression

Lecture 14: GLM Estimation and Logistic Regression

Chapter 7 Factor Analysis SPSS

Implementing Propensity Score Matching Estimators with STATA

Topic 3. Chapter 5: Linear Regression in Matrix Form

Logit Models for Binary Data

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

Simple Predictive Analytics Curtis Seare

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets.

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Dependence Concepts. by Marta Monika Wawrzyniak. Master Thesis in Risk and Environmental Modelling Written under the direction of Dr Dorota Kurowicka

UNDERSTANDING THE TWO-WAY ANOVA

U-statistics on network-structured data with kernels of degree larger than one

Extreme Value Modeling for Detection and Attribution of Climate Extremes

A LONGITUDINAL AND SURVIVAL MODEL WITH HEALTH CARE USAGE FOR INSURED ELDERLY. Workshop

STANDARDISATION OF DATA SET UNDER DIFFERENT MEASUREMENT SCALES. 1 The measurement scales of variables

A Mixed Model Approach for Intent-to-Treat Analysis in Longitudinal Clinical Trials with Missing Values

To give it a definition, an implicit function of x and y is simply any relationship that takes the form:

A Log-Robust Optimization Approach to Portfolio Management

Multivariate Analysis of Ecological Data

Spazi vettoriali e misure di similaritá

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March Due:-March 25, 2015.

A MULTIVARIATE TEST FOR SIMILARITY OF TWO DISSOLUTION PROFILES

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

SAS Software to Fit the Generalized Linear Model

Multiple group discriminant analysis: Robustness and error rate

arxiv: v6 [math.pr] 31 Jan 2014

The Basic Two-Level Regression Model

Transcription:

A repeated measures concordance correlation coefficient Presented by Yan Ma July 20,2007 1

The CCC measures agreement between two methods or time points by measuring the variation of their linear relationship from the 45 o line through the origin. (Lin (1989),A CCC to Evaluate Reproducibility, Biometrics). 2

The CCC measures agreement between two methods or time points by measuring the variation of their linear relationship from the 45 o line through the origin. (Lin (1989),A CCC to Evaluate Reproducibility, Biometrics). Blood draw data example 3

Background Example (Y 1, Y 2 ) = {(1, 2.8), (2, 2.9), (3, 3), (4, 3.1), (5, 3.2)} Y2 0 1 2 3 4 5 0 1 2 3 4 5 Y1 4

Background Example (Y 1, Y 2 ) = {(1, 2.8), (2, 2.9), (3, 3), (4, 3.1), (5, 3.2)} t-test: H 0 : Means are equal. (p-value=1); 5

Background Example (Y 1, Y 2 ) = {(1, 2.8), (2, 2.9), (3, 3), (4, 3.1), (5, 3.2)} t-test: H 0 : Means are equal. (p-value=1); Pearson correlation coefficient=1; 6

Background Example (Y 1, Y 2 ) = {(1, 2.8), (2, 2.9), (3, 3), (4, 3.1), (5, 3.2)} t-test: H 0 : Means are equal. (p-value=1); Pearson correlation coefficient=1; Kendall s tau =1; 7

Background Example (Y 1, Y 2 ) = {(1, 2.8), (2, 2.9), (3, 3), (4, 3.1), (5, 3.2)} t-test: H 0 : Means are equal. (p-value=1); Pearson correlation coefficient=1; Kendall s tau =1; Spearman s rho=1. 8

Background Lin (1989), ρ c = 1 E(Y 1 Y 2 ) 2 E indep (Y 1 Y 2 ) 2 = 2σ Y1Y 2 σ Y1Y 1 + σ Y2Y 2 + (µ Y1 µ Y2 ) 2 9

Background Lin (1989), ρ c = 1 E(Y 1 Y 2 ) 2 E indep (Y 1 Y 2 ) 2 = 2σ Y1Y 2 σ Y1Y 1 + σ Y2Y 2 + (µ Y1 µ Y2 ) 2 1 ρ ρ c ρ 1 10

ρ c = ρc b where C b = [(v + 1/v + u 2 )/2] 1, v = σ 1 /σ 2, u = (µ 1 µ 2 )/ σ 1 σ 2. 11

ρ c = ρc b where C b = [(v + 1/v + u 2 )/2] 1, v = σ 1 /σ 2, u = (µ 1 µ 2 )/ σ 1 σ 2. ˆρ c = 0.2 12

Table 1: A comparison of CCC, Pearson CC, Kendall s tau and Spearman s rho Ex1. (Y 1, Y 2 ) = {(1, 2.8), (2, 2.9), (3, 3), (4, 3.1), (5, 3.2)} Ex2. (Y 1, Y 2 ) = {(1, 21), (2, 22), (3, 23), (4, 24), (5, 25)} Ex3. (Y 1, Y 2 ) = {(1, 1), (2, 12), (3, 93), (4, 124), (5, 95)} Example CCC Pearson CC Kendall s tau Spearman s rho 1 0.2 1 1 1 2 0.01 1 1 1 3 0.02 0.86 0.8 0.9 13

Development Introduction Since its introduction, the coefficient has been used sample size calculations for assay validation studies (Lin, 1992); 14

Development Introduction Since its introduction, the coefficient has been used sample size calculations for assay validation studies (Lin, 1992); repeated measures studies resulting in a weighted version of the coefficient (Chinchilli et al., 1996); 15

Development Introduction Since its introduction, the coefficient has been used sample size calculations for assay validation studies (Lin, 1992); repeated measures studies resulting in a weighted version of the coefficient (Chinchilli et al., 1996); assessing goodness of fit in generalized linear and nonlinear mixed-effect models (Vonesh et al., 1996); 16

Development Introduction Since its introduction, the coefficient has been used sample size calculations for assay validation studies (Lin, 1992); repeated measures studies resulting in a weighted version of the coefficient (Chinchilli et al., 1996); assessing goodness of fit in generalized linear and nonlinear mixed-effect models (Vonesh et al., 1996); has been generalized to a class of CCCs whose estimators better handle data with outliers (King and Chinchilli, 2001); 17

Development Introduction Since its introduction, the coefficient has been used sample size calculations for assay validation studies (Lin, 1992); repeated measures studies resulting in a weighted version of the coefficient (Chinchilli et al., 1996); assessing goodness of fit in generalized linear and nonlinear mixed-effect models (Vonesh et al., 1996); has been generalized to a class of CCCs whose estimators better handle data with outliers (King and Chinchilli, 2001); has been expanded to assess the amount of agreement among more than two raters or methods (King and Chinchilli,2001; and Barnhart et al., 2002); 18

Development Introduction Since its introduction, the coefficient has been used sample size calculations for assay validation studies (Lin, 1992); repeated measures studies resulting in a weighted version of the coefficient (Chinchilli et al., 1996); assessing goodness of fit in generalized linear and nonlinear mixed-effect models (Vonesh et al., 1996); has been generalized to a class of CCCs whose estimators better handle data with outliers (King and Chinchilli, 2001); has been expanded to assess the amount of agreement among more than two raters or methods (King and Chinchilli,2001; and Barnhart et al., 2002); has been shown to be equivalent to a particular specification of the intraclass correlation coefficient (ICC) (Carrasco and Jover, 2003). 19

King et al., 2007 Introduction Abstract Statistical Methodology Simulation Study Applications Discussion This paper proposes an approach to assessing agreement between two responses in the presence of repeated measures which is based on obtaining population estimates. We incorporate an unstructured correlation structure of the repeated measurements, and use the population estimates, rather than subject-specific estimates, to construct a repeated measures CCC. 20

Notations and Assumptions Abstract Statistical Methodology Simulation Study Applications Discussion X(Y)(x(y) ij ): ith subject and jth repeated measure of the first (second) method of measurement,i = 1, 2,.., n; j = 1, 2,..., p. 21

Notations and Assumptions Abstract Statistical Methodology Simulation Study Applications Discussion X(Y)(x(y) ij ): ith subject and jth repeated measure of the first (second) method of measurement,i = 1, 2,.., n; j = 1, 2,..., p. Assume [X i, Y i ] are selected from a multivariate normal population with 2p 1 mean vector [µ X, µ Y ], and 2p 2p covariance matrix Σ, which consists of the following four p p matrices: Σ XX, Σ XY, Σ YX and Σ YY. 22

A repeated measures CCC Abstract Statistical Methodology Simulation Study Applications Discussion ρ c,rm = 1 E[(X-Y) D(X-Y)] E indep [(X-Y) D(X-Y)] P p P p j=1 k=1 = d jk(σ Xj Y k + σ Yj X k ) P p j=1 P p k=1 d jk(σ Xj X k + σ Yj Y k ) + P p j=1 P p k=1 d jk(µ Xj µ Yj )(µ Xk µ Yk ) where D is a p p non-negative definite matrix of weight between the different repeated measurements. This parameter is a generalization of that described by Lin (1989), and reduces to Lins CCC if p = 1 for i = 1, 2,..., n 23

Abstract Statistical Methodology Simulation Study Applications Discussion To consider the measurement of agreement in a variety of paired and unpaired data situations, we can specify different definitions of D which incorporate strictly within-visit (X j versus Y j ) or between- and within-visit agreement (X j versus Y k ). Four options we consider for the D matrix are as follows: 24

Abstract Statistical Methodology Simulation Study Applications Discussion An estimator of ρ c ˆρ c,rm = P p P p j=1 k=1 d jk(ˆσ Xj Y k + ˆσ Yj X k ) P p P P p j=1 k=1 d p jk(ˆσ Xj X k + ˆσ Yj P Y k ) + p j=1 k=1 d jk(ˆµ Xj ˆµ Yj )(ˆµ Xk ˆµ Yk ) A basic consideration for statistical inference concerning ρ c,rm is to recognize that the estimator ˆρ c,rm can be expressed as a ratio of functions of U-statistics. 25

Abstract Statistical Methodology Simulation Study Applications Discussion ˆρ c,rm = (n 1)(V U) U + (n 1)V Apply the theory of U-statistics,ˆρ c,rm has a normal distribution asymptotically with mean ρ c,rm and a variance that can be consistently estimated using the delta method with Var(ˆρ c,rm ) = dσd Ẑ = 1 ( ) 1 + 2 ln ˆρc,rm 1 ˆρ c,rm 26

Scenario Introduction Abstract Statistical Methodology Simulation Study Applications Discussion The simulation was performed for three cases evaluating different location and scale shifts, with sample sizes of n = 20, 40 and 80, considering scenarios with three repeated measurements per unit. Case 1: Means µ x = (4, 6, 8) and µ y = (5, 7, 9), within-visit covariance matrix ( 8 0.95 8 10 0.95 8 10 10 and a 3 3 compound symmetric within-subject correlation structure with ρ = 0.4. ) 27

Scenario Introduction Abstract Statistical Methodology Simulation Study Applications Discussion Case 2:Means µ x = (4, 6, 8) and µ y = (6, 8, 12), within-visit covariance matrix ( 8 0.8 8 12 0.8 8 12 12 ) 28

Scenario Introduction Abstract Statistical Methodology Simulation Study Applications Discussion Case 2:Means µ x = (4, 6, 8) and µ y = (6, 8, 12), within-visit covariance matrix ( 8 0.8 8 12 0.8 8 12 12 Case 3:Means µ x = (4, 6, 8) and µ y = (7, 9, 11), within-visit covariance matrix ( 8 0.5 8 15 0.5 8 15 15 ) ) 29

Abstract Statistical Methodology Simulation Study Applications Discussion 30

Abstract Statistical Methodology Simulation Study Applications Discussion 31

Abstract Statistical Methodology Simulation Study Applications Discussion Penn State Young Women s Health Study Body fat was estimated from skinfolds calipers and DEXA on a cohort of 90 adolescent girls whose initial visit occurred at age 12; 32

Abstract Statistical Methodology Simulation Study Applications Discussion Penn State Young Women s Health Study Body fat was estimated from skinfolds calipers and DEXA on a cohort of 90 adolescent girls whose initial visit occurred at age 12; Skinfolds calipers and DEXA measurements were taken for the subsequent visits, which occurred every 6 months. 33

Abstract Statistical Methodology Simulation Study Applications Discussion 34

Abstract Statistical Methodology Simulation Study Applications Discussion This paper proposes a repeated measures CCC that can handle both few or many repeated measurements, has a variance that can be estimated in a straightforward manner by U-statistic methodology, and performs well with small samples. 35

Abstract Statistical Methodology Simulation Study Applications Discussion This paper proposes a repeated measures CCC that can handle both few or many repeated measurements, has a variance that can be estimated in a straightforward manner by U-statistic methodology, and performs well with small samples. Weighted average of the pair-wise estimates of the CCC among the repeated measurements of the two variables X and Y. ρ c,w = p j=1 k=1 p w jk ρ cjk intuitively more appealing, based on a distance function, similar to Lin s original coefficient; The asymptotic variance of ρ c,w would be difficult to derive. 36

Abstract Statistical Methodology Simulation Study Applications Discussion ρ c,rm is an aggregated index. 37

Abstract Statistical Methodology Simulation Study Applications Discussion ρ c,rm is an aggregated index. If there is a pattern in these pair-wise CCCs, one may be interested in modeling agreement over time(barnhart and Williamson, 2001). 38