Sequential Data Analysis: Issues With Sequential Data
Gilbert Ritschard, Alexis Gabadinho, Matthias Studer
Institute for Demographic and Life Course Studies, University of Geneva, and NCCR LIVES: Overcoming vulnerability, life course perspectives
September - November,

Outline
- State codings
- Weights
- Data size
- Conclusion

G. Ritschard (2012), 1/44. Distributed under licence CC BY-NC-ND 3.0

Missing data in sequences: coding the missing states

How shall we handle missing values?

Missing values in the expanded (STS) form of a sequence occur, for example, when:
- sequences do not start on the same date while using a calendar time axis;
- the follow-up time is shorter for some individuals than for others, yielding sequences that do not end at the same position;
- the observation at some positions is missing due to nonresponse, yielding internal gaps in the sequences.

Handling may differ for each of these situations:
- Different start times: maintain the starting missing values to preserve alignment across sequences, or possibly left-align sequences by switching to a process time axis.
- Different end times: the ending missing terms can simply be ignored.
- Information missing due to nonresponse: add an explicit nonresponse state to the alphabet, or maintain the missing values to preserve alignment.
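The three situations listed above (starting, internal and ending missing values) can be told apart mechanically. The following base R sketch is only an illustration: the function name classify_missing and the toy sequence are assumptions made here, and it is not how TraMineR implements the distinction.

```r
# Classify each missing position of one sequence as "left", "gap" or "right".
# 'x' is a sequence in STS form: a vector of states, with NA for missing.
classify_missing <- function(x) {
  valid <- which(!is.na(x))
  kind <- rep(NA_character_, length(x))
  if (length(valid) == 0) {
    # No valid state at all: labelled "right" here (a convention of this sketch)
    kind[] <- "right"
    return(kind)
  }
  first <- min(valid)   # position of the first valid state
  last <- max(valid)    # position of the last valid state
  pos <- seq_along(x)
  miss <- is.na(x)
  kind[miss & pos < first] <- "left"               # before the first valid state
  kind[miss & pos > first & pos < last] <- "gap"   # between two valid states
  kind[miss & pos > last] <- "right"               # after the last valid state
  kind
}

# Toy sequence (hypothetical data)
s <- c(NA, NA, "A", NA, "B", "B", NA)
missing_kinds <- classify_missing(s)
print(missing_kinds)
```

Valid positions are returned as NA, so the result can be tabulated with table(missing_kinds) to count each type of missing value.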
Coding left, gaps and right missing states

To allow such differentiated treatments, TraMineR distinguishes left, in-between (gap) and right missing values.
- Use the left, gaps and right arguments of seqdef() to specify how each type of missing value should be encoded.
- By default, gaps and left-missing states are coded as NA, while all missing values encountered after the last valid (rightmost) state in a sequence are considered void elements (right="DEL"); i.e., the sequence is considered to end after the last valid state.

Incomplete sequences (sequences with missing states) are more the rule than the exception. Unlike event history analysis (survival analysis), which can handle censored data, there is no universal, elegant way of handling censored data in sequences.

Strategies in presence of incomplete sequences

What can we do in presence of incomplete sequences?
- Delete all incomplete sequences.
- Delete sequences with more than an acceptable number of missing states.
- Consider the NA state as an element of the alphabet.
- Impute some missing states. Assumptions that are not too restrictive often make it possible to guess the value of some missing states. For example, we can assume that people living with both parents at age 20 have lived with them since birth.
- Use a mix of the previous solutions.

Reliability of analysis with incomplete sequences

When states are missing at random, the global picture given by the sequences remains satisfactory whatever the handling strategy for the missing states.
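As an illustration of the second strategy (deleting sequences with more than an acceptable number of missing states), here is a minimal base R sketch. The toy matrix and the threshold max_missing are hypothetical; with real data the same logic would be applied to the columns holding the states before building the sequence object.

```r
# Toy sequences in STS form, one row per case (hypothetical data)
seqs <- matrix(
  c("A", "A", "B", "B",
    NA,  "A", NA,  "B",
    "A", NA,  NA,  NA,
    "B", "B", "B", NA),
  nrow = 4, byrow = TRUE
)

max_missing <- 2                     # hypothetical acceptable number of missing states
n_missing <- rowSums(is.na(seqs))    # number of missing states per sequence
keep <- n_missing <= max_missing     # TRUE for sequences that are retained
kept_seqs <- seqs[keep, , drop = FALSE]

print(n_missing)
print(kept_seqs)
```

The logical vector keep should also be used to subset any covariates and weights, so that rows stay aligned with the retained sequences.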
Illustration: randomly turning states into NA in mvad

To illustrate, we randomly insert missing states into the mvad data:
1. Randomly select a proportion p of sequences to be modified.
2. In each selected sequence,
   - insert a random proportion < p_G of gaps,
   - set as missing a random proportion < p_L of states from the left,
   - set as missing a random proportion < p_R of states from the right.

For the next examples, we used p = 0.6, p_G = 0.2, p_L = 0.4, p_R = 0.5.

Missing states were introduced with seqgen.missing(), from TraMineRextras:

R> mvadm.seq <- seqgen.missing(mvad.seq, p.cases = 0.6, p.left = 0.4, p.gaps = 0.2, p.right = 0.5, mt.gaps = "nr", mt.right = "nr")

Rendering with and without missing states
[Figure: I-plot]
[Figure: d-plot, with.missing=TRUE]
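The two-step procedure described above can be sketched in base R as follows. This is not the code of seqgen.missing(); the function insert_missing, its argument names and the toy data are assumptions made for illustration, and the "gap" positions are simply drawn among positions not already blanked from the left or the right.

```r
# Randomly blank out states in a matrix of sequences (one row per sequence).
insert_missing <- function(seqs, p_cases, p_left, p_gaps, p_right) {
  n <- nrow(seqs)
  len <- ncol(seqs)
  # Step 1: select a proportion p_cases of the sequences
  selected <- sample(n, size = round(p_cases * n))
  for (i in selected) {
    # Step 2: draw random proportions below each maximum
    n_left  <- floor(runif(1, 0, p_left)  * len)
    n_right <- floor(runif(1, 0, p_right) * len)
    n_gaps  <- floor(runif(1, 0, p_gaps)  * len)
    left_idx  <- head(seq_len(len), n_left)    # first n_left positions
    right_idx <- tail(seq_len(len), n_right)   # last n_right positions
    inner <- setdiff(seq_len(len), c(left_idx, right_idx))
    k <- min(n_gaps, length(inner))
    gap_idx <- if (k > 0) inner[sample.int(length(inner), k)] else integer(0)
    seqs[i, c(left_idx, right_idx, gap_idx)] <- NA
  }
  seqs
}

set.seed(1)
toy <- matrix("A", nrow = 10, ncol = 20)   # hypothetical complete sequences
toy_m <- insert_missing(toy, p_cases = 0.6, p_left = 0.4,
                        p_gaps = 0.2, p_right = 0.5)
print(rowSums(is.na(toy_m)))   # number of missing states per sequence
```

With these settings at most 6 of the 10 toy sequences are modified, and each keeps at least one valid state because the three proportions sum to less than one.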
[Figure: d-plot, with.missing=FALSE]

A crucial point when analyzing state sequences is to choose a relevant time alignment:
- Calendar date: same start date for each sequence.
- Process time, i.e., time since an event of interest:
  - birth date (position defined by age),
  - date when starting to live with a partner, first childbirth, ...
  - start of first job, first unemployment month, immigration date, ...

Loading the srh data

We illustrate with sequences of self-reported health from the SHP (30% sample data in srh30.rdata).

R> source(paste(scriptdir, "extractseqfromw.r", sep = ""))
R> load(paste(datadir, "srh30.rdata", sep = ""))
R> srh <- srh30
R> srh.shortlab <- c("B2", "B1", "M", "G1", "G2")
R> srh.longlab <- c("not well at all", "not very well", "so, so", "well", "very well")
R> srh.alph <- c("not well at all", "not very well", "so, so (average)", "well", "very well")
R> var <- getcolumnindex(srh, "P$$C01")
R> xtlab <- 1999:( length(var) - 1)
R> mycol5 <- brewer.pal(5, "RdYlGn")
R> srh.seq <- seqdef(srh[, var], right = NA, alphabet = srh.alph, states = srh.shortlab, labels = srh.longlab, cnames = xtlab, cpal = mycol5)
R> x <- apply(is.na(srh[, var]), 1, sum)
R> sel <- (x < seqlength(srh.seq) - 1)
R> srh <- srh[sel, ]
R> srh.seq <- srh.seq[sel, ]

Illustration: Self-reported health, SHP 1999/2010
[Figure: sequences aligned on calendar year]
Changing alignment

Changing alignment with seqstart() from TraMineRextras.

R> startyear <
R> birthyear <- srh$birthy
R> agesrh <- seqstart(srh[, var], data.start = startyear, new.start = birthyear)
R> colnames(agesrh) <- 1:ncol(agesrh)
R> agesrh <- agesrh[, 10:90]
R> agesrh.seq <- seqdef(agesrh, alphabet = srh.alph, states = srh.shortlab, labels = srh.longlab, cpal = mycol5, right = NA, xtstep = 10)

Illustration: Self-reported health, SHP 1999/2010
[Figure: sequences aligned on age]
[Figure: sequences aligned on age, with ignored right missing positions, right="DEL"]
[Figure: focus on people born between 1930 and 1934]
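The idea behind switching from a calendar axis to an age axis can be sketched in base R. This is an illustration of the principle only, not the implementation of seqstart(); the function realign_on_age, its arguments and the toy data are hypothetical.

```r
# Re-align sequences from calendar years to age (process time).
# seqs:       character matrix, one row per case, one column per calendar year
# start_year: calendar year of the first column
# birth_year: vector with the birth year of each case
# max_age:    last age kept on the new axis
realign_on_age <- function(seqs, start_year, birth_year, max_age) {
  n <- nrow(seqs)
  out <- matrix(NA_character_, nrow = n, ncol = max_age + 1,
                dimnames = list(NULL, as.character(0:max_age)))
  for (i in seq_len(n)) {
    # Age of case i at each observed calendar year
    ages <- start_year + seq_len(ncol(seqs)) - 1 - birth_year[i]
    ok <- ages >= 0 & ages <= max_age
    # Column j of 'out' corresponds to age j - 1
    out[i, ages[ok] + 1] <- seqs[i, ok]
  }
  out
}

# Toy data (hypothetical): two cases observed in 1999, 2000 and 2001
cal <- matrix(c("G1", "G1", "M",
                "B1", "M",  "M"), nrow = 2, byrow = TRUE)
aged <- realign_on_age(cal, start_year = 1999,
                       birth_year = c(1980, 1970), max_age = 40)
print(aged[, as.character(18:32)])
```

After realignment most positions are missing, since each case is observed over only a few ages. This is why the treatment of left and right missing values matters so much on an age axis.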
Time granularity

Time granularity: density of state positions within a given time length.
- It is defined by the duration of the unit of time used.
- Examples: year, quarter, month, week, day, hour, ...

We can switch from a fine granularity to a coarser one, but we cannot switch to a finer granularity than is available in the data.

Change granularity with seqgranularity() from TraMineRextras:

R> mvadg.seq <- seqgranularity(mvad.seq, tspan = 12)

Changing time granularity of the mvad data
[Figure: monthly vs yearly states]

State codings

What is the optimal alphabet size? The larger the alphabet, the less clear the results.
- Similarly to time aggregation, we can also merge elements of the alphabet.
- This is useful when different states reflect similar situations. For example, in mvad, the distinction between further education (FE) and school (SC) is not so clear. Merging those categories improves the readability of the outcomes.
- Avoid merging dissimilar states. Do not hide useful distinctions such as full time versus part time.
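To make the aggregation from a fine to a coarser granularity concrete, here is a base R sketch that collapses each block of tspan consecutive positions into one state. It is not the code of seqgranularity(); the function coarsen and its three selection rules are assumptions chosen for illustration.

```r
# Collapse a sequence into blocks of 'tspan' positions, keeping one state per block.
coarsen <- function(x, tspan, method = c("first", "last", "mostfreq")) {
  method <- match.arg(method)
  # Block number of each position: 1,1,...,2,2,...
  block <- ceiling(seq_along(x) / tspan)
  pick <- switch(method,
    first    = function(b) b[1],                         # first state of the block
    last     = function(b) b[length(b)],                 # last state of the block
    mostfreq = function(b) names(which.max(table(b)))    # modal state (ties: alphabetical)
  )
  unname(vapply(split(x, block), pick, character(1)))
}

# Toy monthly sequence over two years (hypothetical data)
monthly <- c(rep("SC", 12), rep("FE", 7), rep("EM", 5))
yearly <- coarsen(monthly, tspan = 12, method = "mostfreq")
print(yearly)
print(coarsen(monthly, tspan = 12, method = "last"))
```

The choice of rule matters: in the second year of the toy sequence the modal state is FE while the last state is EM, so the yearly picture depends on how each block is summarized.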
Merging two states

Merging Further education with School in mvad:

R> mvadr.seq <- seqrecode(mvad.seq, recodes = list(fs = c("FE", "SC")))
R> seqdplot(mvadr.seq, group = mvad$gcse5eq, border = NA)

[Figure: state distribution plots after merging Further education with School in mvad]

Weights

- Weights serve to improve sample representativeness.
- Weights are also useful for reducing the sequence data size by retaining only unique sequences; the weights then reflect the number of cases sharing the same unique sequence.
- In any case, when weights are present, they should be accounted for.
- In TraMineR, weights are set with the weights= argument of seqdef(). When assigned to the state sequence object, weights are automatically accounted for in the produced plots, distributions, statistics, ...
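The use of weights to reduce data size can be sketched in base R: keep one copy of each distinct sequence and record how many cases share it. The toy matrix and variable names are hypothetical; the resulting weights could then be passed to the weights= argument of seqdef().

```r
# Toy sequences, one row per case (hypothetical data)
seqs <- matrix(
  c("EM", "EM", "EM",
    "SC", "FE", "EM",
    "EM", "EM", "EM",
    "SC", "FE", "EM",
    "SC", "SC", "HE"),
  ncol = 3, byrow = TRUE
)

# One character key per sequence, e.g. "EM-EM-EM"
key <- apply(seqs, 1, paste, collapse = "-")
counts <- table(key)                       # number of cases per distinct sequence
first_idx <- !duplicated(key)              # first occurrence of each distinct sequence
unique_seqs <- seqs[first_idx, , drop = FALSE]
weights <- as.vector(counts[key[first_idx]])

print(unique_seqs)
print(weights)
```

If the cases already carry sampling weights, the same keys can be used to sum those weights per distinct sequence (for instance with tapply()) instead of counting cases.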
Results may be quite different

R> layout(matrix(c(1, 2, 3, 3), 2, 2, byrow = TRUE), heights = c(2, 1.3))
R> seqdplot(mvad.seq, border = NA, withlegend = FALSE, weighted = FALSE, title = "Non Weighed")
R> seqdplot(mvad.seq, border = NA, withlegend = FALSE, title = "Weighed")
R> seqlegend(mvad.seq, ncol = 2, position = "top")

Which weights to use with panel data?

Each wave of a panel survey usually includes two weights:
- a transversal weight (representativeness of the current population);
- a longitudinal weight (representativeness of the initial population), which applies to full trajectories.

Which weights should be used for incomplete trajectories? For sequences over a subinterval of time? There is no evident solution. Weights lose their meaning when cases are filtered out!

In SHP there are weights for cases for Sample I (1999) and for Sample I+II (2004). See

Data size, scalability

Size limitations: what can we do?

Three types of size limitations:
- Number of sequences: no problem up to about
  The main problem is the matrix of pairwise dissimilarities!
- Sequence length: no problem up to a few hundred (300). In some functions the default limit is set to 100 and should be increased.
- Size of the alphabet: not too big a problem for computation, but rendering becomes difficult with more than, say, 20 elements. Default colors are available only for |A| ≤ 12.

What can we do?
- For the number of sequences: work on a representative sample of the sequences.
- For the sequence length: change the time granularity, or split the position (time) scale and work on subintervals.
- For the size of the alphabet: merge elements of the alphabet.
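The remark that the matrix of pairwise dissimilarities is the main problem can be checked with simple arithmetic: n sequences give n(n-1)/2 distinct pairs, each stored as an 8-byte double. The helper dissim_size_gb below is a hypothetical name used only for this back-of-the-envelope sketch in base R.

```r
# Approximate memory needed to store pairwise dissimilarities between n sequences.
# full = FALSE: only the n(n-1)/2 distinct pairs; full = TRUE: the full n x n matrix.
dissim_size_gb <- function(n, full = FALSE, bytes = 8) {
  cells <- if (full) n^2 else n * (n - 1) / 2
  cells * bytes / 1024^3   # size in gigabytes
}

n_values <- c(1000, 10000, 100000)
sizes <- sapply(n_values, dissim_size_gb)
print(data.frame(n = n_values, size_gb = round(sizes, 3)))
```

Memory grows quadratically: multiplying the number of sequences by 10 multiplies the storage by about 100, which is why working on a representative sample or on weighted unique sequences helps.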
Conclusion

- Many issues arise in sequence analysis.
- Solutions necessitate trade-offs:
  - losing sequences (cases) vs allowing for missing states;
  - losing sequences (cases) vs restricting time coverage; ...
- Holistic view provided by sequence analysis. Cost: we cannot account for the most recent cohorts with yearly data. For example, studying the life course until age 45 with the SHP biographical survey of 2002 means, if we want only complete trajectories, that the youngest people are born in
- The finer the granularity, the less constrained we are.

Thank you! Questions? See you next week.
