Psychological Questionnaires and Kernel Extension



Psychological Questionnaires and Kernel Extension

E. Liberty (1), M. Almagor (2), B. James (3), Y. Keller (4), R. R. Coifman (5), S. W. Zucker (6)

(1) Department of Computer Science, Yale University. (2) Department of Psychology, University of Haifa. (3) Math Department, Davis University. (4) Engineering School, Bar Ilan University. (5) Program in Applied Mathematics, Yale University. (6) Program in Applied Mathematics, Yale University.

Psychological Questionnaires

Our (online and actual) lives are constantly affected by questionnaires, surveys, tests, etc.:
- Personality assessment
- Job placement
- Psychological evaluation
- Directed marketing
- Online dating and socializing

Can we make sense of people's answers without being experts in marketing, dating, or psychology? In all of the above, we are interested in an underlying property of the responder and NOT in their actual answers.

Psychological Questionnaires

Answer by YES or NO.

Group A:
- I find it hard to wake up in the morning.
- I'm usually burdened by my tasks for the day.
- I frequently go to wild parties.

And others, that seem less directed:

Group B?
- I like poetry.
- I might enjoy being a dog trainer.
- I read the newspaper every day.

Psychological Questionnaires

These are example questions from the Minnesota Multiphasic Personality Inventory (MMPI-2), which is amongst the most administered psychological evaluation questionnaires in the US.

Group A questions are used for estimating depression:
- I find it hard to wake up in the morning. (yes)
- I'm usually burdened by my tasks for the day. (yes)
- I frequently go to wild parties. (no)

The depression score is the number of answers matching the indicated ones. Group B is designed to test for other conditions and is ignored when depression is evaluated.

Approach, Advantages, and Drawbacks

Our goal: to learn a scoring function f from answers to scores, using only a training set (answers and scores) and no other prior knowledge.

Advantages:
- We are free to learn traits for which we do not know how to manually devise f. (Who is a good employee?)
- We are not imposing any ad hoc structure on the questionnaire. (I might wake up late because I go to wild parties!)
- We can limit ourselves to learning noise-robust functions.

Drawbacks:
- If the property is a complicated (or underdetermined) function of the answers, we are bound to fail.

The MMPI-2 test

About the MMPI-2:
- It contains 567 yes/no questions.
- It evaluates conditions such as Depression, Hysteria, Paranoia, Schizophrenia, Hypomania, etc.
- Each condition is measured by a scale. A scale consists of 10 to 60 questions along with their indicated answers.
- The raw score on a scale is the number of questions answered in the indicated way. (Raw scores are normalized to find deviations.) See the sketch below.
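As a concrete illustration of the raw scoring rule above, here is a minimal Python sketch assuming ±1-coded answers; the item indices and keyed answers are made up for illustration and are not an actual MMPI-2 scale key.

```python
import numpy as np

def raw_scale_score(answers, item_idx, keyed_answers):
    """Number of a scale's items answered in the keyed (indicated) direction.

    answers       : length-567 vector of +1 (yes) / -1 (no) responses
    item_idx      : indices of the items belonging to the scale (hypothetical)
    keyed_answers : the +1/-1 answers that count toward the scale
    """
    return int(np.sum(answers[item_idx] == keyed_answers))

# Hypothetical 3-item scale in the spirit of Group A (keyed yes, yes, no):
item_idx = np.array([4, 17, 42])                 # made-up item positions
keyed_answers = np.array([+1, +1, -1])
answers = np.random.choice([-1, +1], size=567)   # one simulated respondent
print(raw_scale_score(answers, item_idx, keyed_answers))
```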

MMPI-2 and the kernel framework

We set:
- Each person's response to the test is a vector $x \in \mathbb{R}^{567}$ (yes/no answers coded as ±1).
- A scoring function $f_{\text{scale}}(x) : \mathbb{R}^d \to \mathbb{R}$ is the diagnosis for that person on that scale.

We assume that:
- The scoring function f is sufficiently smooth with respect to a meaningful kernel K, i.e. $\langle f, Kf \rangle \gg 0$.
- The training set sufficiently samples the probability density (and consequently f).

Diffusion kernel and eigenfunctions

The diffusion kernel is a properly normalized Gaussian kernel. Given a set of n input vectors $x_i \in \mathbb{R}^d$:

1. $K_0(i, j) = e^{-\|x_i - x_j\|^2 / \sigma^2}$
2. $p(i) = \sum_{j=1}^n K_0(i, j)$ approximates the density at $x_i$
3. $\tilde K(i, j) = K_0(i, j) / \big(p(i)\, p(j)\big)$
4. $d(i) = \sum_{j=1}^n \tilde K(i, j)$
5. $K(i, j) = \tilde K(i, j) / \sqrt{d(i)\, d(j)}$
6. $K = U S U^T \approx \sum_{k=1}^m s_k u_k u_k^T$ (by SVD of K)

Stages 2 and 3 normalize for the density whereas stages 4 and 5 perform the graph Laplacian normalization. Coifman et al. show that in the limit $n \to \infty$ and $\sigma \to 0$, K converges to a conjugate of the diffusion operator. The functions $\varphi_k(x) = u_k(x)/u_1(x)$ converge to the eigenfunctions of the diffusion operator.
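A minimal numpy sketch of stages 1-6, assuming the responses are stacked row-wise in a matrix X of ±1 answers (for which the squared Euclidean distance is just four times the Hamming distance) and that σ is chosen by hand; it mirrors the normalization order on the slide rather than any optimized implementation.

```python
import numpy as np

def diffusion_kernel_eigen(X, sigma, m):
    """Stages 1-6 on the slide: density-normalized Gaussian kernel,
    graph Laplacian normalization, truncated spectral decomposition."""
    # 1. K0(i, j) = exp(-||x_i - x_j||^2 / sigma^2)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K0 = np.exp(-sq_dists / sigma ** 2)

    # 2-3. density estimate p(i) and density normalization
    p = K0.sum(axis=1)
    K_tilde = K0 / np.outer(p, p)

    # 4-5. graph Laplacian (symmetric) normalization
    d = K_tilde.sum(axis=1)
    K = K_tilde / np.sqrt(np.outer(d, d))

    # 6. K = U S U^T, truncated to the top m eigenpairs
    s, U = np.linalg.eigh(K)
    top = np.argsort(s)[::-1][:m]
    s, U = s[top], U[:, top]

    # phi_k = u_k / u_1 (the top eigenvector of K has no sign changes)
    phi = U / U[:, [0]]
    return p, d, s, U, phi
```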

Kernel Extension (Nyström)

Since the $u_k$ are eigenvectors of K we have:

$\lambda_k u_k(x_i) = \sum_{j=1}^n K(x_i, x_j)\, u_k(x_j)$   (1)

Evaluate $K(x, x_j)$ where x is not in the training set:

$u_k(x) = \frac{1}{\lambda_k} \sum_{j=1}^n K(x, x_j)\, u_k(x_j)$   (2)

The functions $\varphi_k(x) = u_k(x)/u_1(x)$ extend the kernel to the test set.
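A sketch of the extension formula (2), assuming the density estimates p, degrees d, eigenvectors U, and eigenvalues s from the training-set kernel above are available; re-applying the two normalizations to the out-of-sample kernel row in this way only approximates the training-set normalization.

```python
import numpy as np

def nystrom_extend(x, X_train, p_train, d_train, U, s, sigma):
    """Nystrom extension: u_k(x) = (1/lambda_k) sum_j K(x, x_j) u_k(x_j)."""
    k0 = np.exp(-np.sum((X_train - x) ** 2, axis=1) / sigma ** 2)

    p_x = k0.sum()                               # density estimate at x
    k_tilde = k0 / (p_x * p_train)               # density normalization
    d_x = k_tilde.sum()
    k_row = k_tilde / np.sqrt(d_x * d_train)     # graph Laplacian normalization

    u_x = (k_row @ U) / s                        # extended eigenvectors u_k(x)
    return u_x / u_x[0]                          # phi_k(x) = u_k(x) / u_1(x)
```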

Approximating a scoring function f

Given a smooth function f over the data points, $f(x_i)$, approximate it with a few $\varphi_k$:

$f(x) = \sum_k a_k \varphi_k(x)$

where

$a_k = \int_M \varphi_k(x) f(x)\, dx \approx \sum_{i=1}^n \varphi_k(x_i)\, f(x_i)\, p^{-1}(x_i)$

f is expressed as a linear combination of the $\varphi_k$. We can evaluate f(x) for any x:

$x \to K(x, x_i) \to u_k(x) \to \varphi_k(x) \to f(x)$   (3)
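A sketch of fitting the coefficients $a_k$ by the quadrature formula above and evaluating f at a new point via pipeline (3); a least-squares fit of the $a_k$ would be an alternative, but the code follows the slide's formula literally.

```python
import numpy as np

def fit_coefficients(phi_train, f_train, p_train, m):
    """a_k ~ sum_i phi_k(x_i) f(x_i) p^{-1}(x_i), for the first m harmonics.

    phi_train : (n, >=m) eigenfunction values on the training set
    f_train   : (n,) known scale scores of the training subjects
    p_train   : (n,) density estimates p(x_i) from the kernel construction
    """
    return phi_train[:, :m].T @ (f_train / p_train)

def predict_score(phi_x, a):
    """Pipeline (3): x -> K(x, x_i) -> u_k(x) -> phi_k(x) -> f(x)."""
    return float(phi_x[:len(a)] @ a)
```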

Experimental setup and Results

Algorithm parameters:
- $\|x_i - x_j\|$ is the Hamming distance
- Training set size: 500 subjects
- Test set size: 1000 subjects
- f was approximated by m = 15 geometric harmonics

Figure: Correlations between real and predicted scores for different numbers of geometric harmonics used. For comparison, on the right-hand side, the same plot using the PCA kernel eigenfunctions.
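For reference, a minimal sketch of the two reported metrics, Pearson correlation and the hit rate used in the appendix table (fraction of subjects predicted within 1/2 standard deviation of their original score), assuming two aligned score vectors:

```python
import numpy as np

def correlation(y_true, y_pred):
    """Pearson correlation between given and predicted scale scores."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def hit_rate(y_true, y_pred):
    """Fraction of subjects predicted within 1/2 standard deviation
    of their original score (the definition used in the appendix table)."""
    return float(np.mean(np.abs(y_pred - y_true) <= 0.5 * np.std(y_true)))
```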

EASY: Scores over the diffusion map

Figure: Scores plotted over the first two geometric harmonics (GH1, GH2). Panels: Depression (R.Q = 0.35), Paranoia (R.Q = 0.34), Psychasthenia (R.Q = 0.31), Hypomania (R.Q = 0.35), Psychopathic Deviation (R.Q = 0.36), Demoralization (R.Q = 0.29).

Missing data

We calculate correlations between the given scores and our predicted scores under three conditions:
1. EASY: no missing answers.
2. HARD: randomly deleted answers from each test taker.
3. HARDEST: delete all answers corresponding to the predicted scale. Note that this case cannot be scored by other known scoring methods.

When answers are missing, the Hamming distance is measured only on the answered questions and scaled up, as in the sketch below.
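A minimal sketch of that rescaled Hamming distance, assuming missing answers are coded as 0 in the ±1 response vectors:

```python
import numpy as np

def scaled_hamming(x, y, n_items=567):
    """Hamming distance over the items both subjects answered,
    scaled up to the full questionnaire length.
    Answers are +1/-1; missing answers are coded as 0 (an assumption)."""
    both = (x != 0) & (y != 0)
    n_both = int(both.sum())
    if n_both == 0:
        return float(n_items)  # no shared items: treat as maximally distant
    return float(np.sum(x[both] != y[both])) * n_items / n_both
```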

EASY AND HARD: Data missing at random

It is possible to score accurately with only half the answers!

Scale \ missing items     no missing   100    200    300
Hypochondriasis              0.95      0.94   0.93   0.92
Depression                   0.94      0.93   0.93   0.92
Hysteria                     0.89      0.88   0.87   0.85
Psychopathic Deviation       0.91      0.90   0.90   0.88
Paranoia                     0.87      0.87   0.86   0.84
Psychasthenia                0.98      0.98   0.97   0.97
Schizophrenia                0.98      0.98   0.97   0.97
Hypomania                    0.86      0.86   0.85   0.84
Social Introversion          0.97      0.96   0.96   0.95

HARDEST: Missing entire scale

Scoring Depression with group B questions: all the items belonging to the predicted scale are missing. For comparison, we also tried to complete the missing responses using a Markov process (MC) and score the corrupted records using the usual scoring procedure.

Scale                     r   Hit rate   r (MC)   Hit rate (MC)
Hypochondriasis          79      59        69          46
Depression               86      67        65           0
Hysteria                 74      51        55           0
Psychopathic Deviation   80      59        48           0
Paranoia                 78      54        55           5
Psychasthenia            94      88        70          26
Schizophrenia            94      85        73          41
Hypomania                80      58        35           2
Social Introversion      87      69        58           7

Table: Correlations and hit rates. Their variance, for different choices of training set, is smaller than 0.02.

Summary points

Figure: Depression on the diffusion map under the three conditions: EASY (no missing answers), HARD (missing at random), HARDEST (missing complete scale).

- The same algorithm was run on another MMPI-2 data set, and on dating service data, with similar results.
- Psychological questionnaires and their scoring can be addressed with kernel extension ideas.
- Filling in missing data might be the wrong thing to do.

Thank you.

Answers missing at random

Scale   q=30         q=50         q=100        q=200        q=300
        r   Hit      r   Hit      r   Hit      r   Hit      r   Hit
HS      95  89       95  89       94  87       94  83       92  80
D       93  83       93  83       93  83       92  80       92  77
HY      89  71       88  70       88  69       87  67       84  62
PD      91  76       91  77       91  76       90  74       89  71
PA      88  67       88  67       87  66       87  64       85  62
PT      98  97       98  97       98  97       98  97       97  96
SC      98  98       98  98       98  98       98  98       97  97
MA      87  67       87  67       86  67       86  64       85  65
SI      96  91       96  92       96  91       95  90       95  88
RCD     98  96       97  96       97  96       97  94       97  94
RC1     95  86       94  86       94  85       93  81       91  75
RC2     93  82       93  82       92  80       92  79       91  77
RC3     89  73       89  73       89  72       89  71       88  70
RC4     92  81       92  78       92  76       90  74       88  69
RC6     92  78       92  78       91  78       91  75       89  72
RC7     96  92       96  93       96  92       96  91       95  91
RC8     93  84       93  84       93  83       93  82       91  78
RC9     93  82       93  81       92  80       92  77       91  75

Table: r, correlation between real and predicted score; q, number of randomly deleted items; Hit, the hit rate, i.e. the percent of subjects classified within 1/2 standard deviation of their original score. The variance of correlations and hit rates, for different choices of training set, is smaller than 0.02.