Epidemiology-Biostatistics Exam Exam 2, 2001 PRINT YOUR LEGAL NAME:

Instructions: This exam is 30% of your course grade. The maximum number of points for the course is 1,000; hence this exam is worth 300 points. There are 25 questions on this exam. Each question is worth 12 points to yield the maximum total of 300 points for this exam. For questions 1-12, record the best answer in pencil on the answer sheet provided. For questions 13-25, write your answers in the spaces provided. Submit the exam and your answer sheet as directed after you have completed the exam. Be sure that you have printed your legal name on the top of each page.

1. A parent calls the local Department of Public Health concerned that several children in his small community have been diagnosed with myocarditis, an inflammation of the heart muscle. After recording the information, one of the very first things the public health official should do is: (Select the best answer.)
a. Gather information from sources in the small community to determine if there is a greater than expected number of cases of myocarditis.
b. Conduct active surveillance by capturing ticks from area woods to determine the prevalence of the bacterium that causes Lyme disease.
c. Conduct a case-control study to determine if cases were more likely than controls to have eaten in the nearby fast-food restaurant.
d. Conduct a prospective cohort study to determine if exposure to wooded areas is associated with the development of myocarditis.
e. In the interest of public safety, immediately inform the media that there may be an epidemic of a potentially serious disease.

2. Height is normally distributed in Town A. Researchers randomly select 50 subjects from Town A and calculate their mean height and its 95% confidence interval to make an inference about the true average height in Town A. Select the best statement:
a. Confounding could lead to an erroneous conclusion about Town A's true average height.
b. The sample size of 50 is too small to calculate a 95% confidence interval on the mean.
c. If the researchers had drawn a sample size of 100 rather than 50, the 95% confidence interval around the mean would be narrower.
d. It is not appropriate to use a 95% confidence interval on the sample mean to make an inference about the true mean of Town A.
e. Interviewer bias could lead to an erroneous conclusion about Town A's true average height.
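The relationship between sample size and confidence interval width in question 2 can be sketched numerically. This is a minimal illustration, not part of the exam: the standard deviation of 3 inches is a hypothetical value (the exam gives none), and the z multiplier of 1.96 assumes a normal approximation rather than a t distribution.

```python
import math

def ci_half_width(sd, n, z=1.96):
    """Half-width of an approximate 95% CI for a mean: z * sd / sqrt(n)."""
    return z * sd / math.sqrt(n)

# Hypothetical standard deviation of height in inches; NOT given in the exam.
assumed_sd = 3.0

half_50 = ci_half_width(assumed_sd, 50)    # sample of 50 subjects
half_100 = ci_half_width(assumed_sd, 100)  # sample of 100 subjects

print(f"n = 50:  +/- {half_50:.3f}")
print(f"n = 100: +/- {half_100:.3f}")
```

Whatever standard deviation is assumed, doubling the sample size shrinks the half-width by a factor of the square root of 2, because the standard error is the standard deviation divided by the square root of n.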

3. Multiple linear regression analysis (multivariate analysis) can be used to: (Select the best statement.)
a. adjust for confounding by adding the confounding variable as an independent variable.
b. determine if a study result is clinically important.
c. adjust for loss to follow-up bias.
d. calculate a study's power.
e. correct the p-value for multiple comparisons.

4. The standard error of the mean is: (Select the best statement.)
a. used to calculate the mean of the sample.
b. used whenever the researcher wants to adjust for confounding by the direct or indirect standardization method.
c. the sample size divided by the square root of the sample standard deviation.
d. a type of sampling bias that can lead to an erroneous study conclusion.
e. the standard deviation of the distribution of sample means.

5. Researchers want to make an inference about the average serum (blood) cholesterol levels of students in college B. They will do this by randomly sampling 500 students from college B. The researchers want to be absolutely certain that they sample 250 male students and 250 female students. The researchers should: (Select the best statement.)
a. calculate a 50% confidence interval on the sample mean.
b. use the methods of stratified sampling.
c. conduct a randomized controlled trial to help assure comparability of men and women.
d. randomly select 500 subjects, and if there are 252 men and 248 women, discard the men with the highest two serum cholesterol levels and hand-select two female students.
e. perform a stratified t-test on the average serum cholesterol levels of the males vs. the females.

6. Researchers prospectively follow a group of 100 vegetarians and 200 non-vegetarians. After 30 years, 8 of the vegetarians develop heart disease and 20 of the non-vegetarians develop heart disease. The 95% confidence interval on the relative risk of 0.8 extends from 0.6 to 0.9. Alpha was set at 0.05. Select the best statement:
a. Vegetarians were 80% less likely to develop heart disease over 30 years vs. the non-vegetarians.
b. The relative risk of 0.8 is not statistically significant as the 95% confidence interval contains the value 0.8.
c. Vegetarians were 20% less likely to develop heart disease over 30 years vs. non-vegetarians.
d. The study had 95% power to detect a true difference of 20% or more.
e. The researchers should have calculated an odds ratio rather than a relative risk.

7. The grades on exam C are normally distributed with a population mean of 75% and a population standard deviation of 5%. Select the best statement:
a. The 95% confidence interval on the exam C distribution extends from 65% to 85%.
b. 34% of the grades on exam C are between 75% and 80%.
c. 65% of the grades on exam C are between 65% and 85%.
d. The area under the entire curve of exam C grades could be less than 100%.
e. 95% of the grades on exam C are between 75% and 85%.

8. Researchers want to determine if the proportion of men from town A who smoke cigarettes is different than the proportion of women from town A who smoke cigarettes. The researchers randomly select 1,000 men and 1,200 women from town A and they ask them if they smoke cigarettes. Subjects are only permitted to answer yes or no to the question. Select the best statement:
a. Smoking could be a confounder in this study.
b. Gender could be a confounder in this study.
c. The study likely has poor internal validity as there are more women in the study than men.
d. The researchers should test their hypothesis with a two-sample t-test.
e. The researchers should test their hypothesis with a chi-square test.
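The relative risk of 0.8 quoted in question 6 follows directly from the counts given there. The sketch below reproduces that arithmetic; the function name `relative_risk` is only illustrative and does not come from the exam.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# Counts from question 6: 8 of 100 vegetarians, 20 of 200 non-vegetarians.
rr = relative_risk(8, 100, 20, 200)

print(f"Relative risk: {rr:.2f}")
print(f"Relative reduction in risk: {1 - rr:.0%}")
```

Here the vegetarians are treated as the "exposed" group, so a relative risk below 1 means a lower 30-year risk among vegetarians than among non-vegetarians.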

9. Researchers develop a simple regression model with diastolic blood pressure as the dependent variable and height as the independent variable. They determine that R squared is 68%. Select the best statement:
a. 68% of the variability in diastolic blood pressure is accounted for by height.
b. 68% of the variability in height is accounted for by diastolic blood pressure.
c. One standard deviation below and above R squared contains 68% of the heights.
d. The R squared is at the 68th percentile of the distribution of diastolic blood pressures.
e. As diastolic blood pressure is a continuous variable, the researchers should use a logistic regression model.

10. Select the best statement concerning Pearson's correlation coefficient:
a. It is an approximation of the mean of two variables.
b. It reflects the magnitude of the association for linear and non-linear relationships between two continuous variables.
c. A Pearson's correlation coefficient of zero indicates there is no linear association between the two variables.
d. A Pearson's correlation coefficient of positive one indicates there is no linear association between the two variables.
e. A Pearson's correlation coefficient of negative one indicates there is a non-linear relationship between the two variables.

11. Researchers conduct a randomized controlled trial comparing subjects on Med A vs. Med B in the prevention of strokes. They determine that 200 subjects, 100 subjects in each arm, are required to have a power of 90% to detect a true difference of 5% or more. One percent of the subjects on Med A developed a stroke while 4% of the subjects on Med B developed a stroke. Alpha was set at 0.05 and the resulting P-value was 0.07. Select the best statement:
a. The researchers should reject the null hypothesis.
b. Given the null hypothesis is correct, the probability of obtaining a difference of 5% or more due to chance alone is 7%.
c. Given the alternative hypothesis is correct, the probability of obtaining a difference of 5% or more due to chance alone is 7%.
d. Given the null hypothesis is correct, the probability of obtaining a difference of 3% or more due to chance alone is 90%.
e. Given the null hypothesis is correct, the probability of obtaining a difference of 3% or more due to chance alone is 7%.

12. Referring to question 11 above: (Select the best statement.)
a. The probability that a Type II Error occurred is 7%.
b. The study might have insufficient power to detect a true difference of 3% or more.
c. The probability that a Type II Error occurred is 3%.
d. The researchers should reject the null hypothesis.
e. The probability that a Type I Error occurred is 3%.

FOR QUESTIONS 13-25, WRITE THE ANSWERS IN THE SPACES PROVIDED.

13. When is it appropriate to use a non-parametric test for continuous data?

14. Researchers develop the following regression equation:

Diastolic Blood Pressure in mm Hg = constant + (slope) (age in years)

If the slope is 0.50, how will each additional year of age affect the diastolic blood pressure?

15. Under what circumstances might a researcher decide to transform continuous data?

16. What is the difference between a statistic and a parameter?

17. A researcher records the serum (blood) HDL level from 100 randomly selected subjects from town A. The researcher then places the 100 subjects on med A, an experimental medicine to lower serum HDL, for six months and again records the serum HDL levels of the 100 subjects. Assume the serum HDL levels are normally distributed. What statistical test should the researcher use to determine if med A lowered the HDL levels in the subjects?

18. The prevalence of Disease X is known to be 10% in town A. The sensitivity of a test for Disease X is 90% and the specificity of a test for Disease X is 70%. Calculate the predictive value negative of a test for Disease X in town A. Show your work.

19. A researcher conducts a randomized controlled trial for subjects assigned to med A vs. med B in the prevention of heart disease. Because the researcher intends to perform multiple outcome comparisons, alpha is set at 0.01. The researcher determined that a minimum of 100 subjects would be necessary to have a power of 80% to detect a true difference of 3% or more. The researcher reports that 10% of the subjects on med A developed heart disease while 15% of the subjects on med B developed heart disease with a resulting P-value of 0.03. What should the researcher's decision be regarding the null hypothesis? Explain your answer.
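The predictive value negative asked for in question 18 depends only on prevalence, sensitivity, and specificity. The following sketch shows one way to set up that calculation; the function name is illustrative, and the approach (weighting true and false negatives by prevalence) is the standard definition rather than anything specific to this exam.

```python
def negative_predictive_value(prevalence, sensitivity, specificity):
    """Proportion of negative test results that are true negatives."""
    # Fraction of the whole population that is disease-free AND tests negative.
    true_negatives = specificity * (1 - prevalence)
    # Fraction of the whole population that is diseased BUT tests negative.
    false_negatives = (1 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)

# Values from question 18: prevalence 10%, sensitivity 90%, specificity 70%.
npv = negative_predictive_value(0.10, 0.90, 0.70)
print(f"Predictive value negative: {npv:.1%}")
```

The same result can be reached by filling in a 2x2 table for a notional population of 1,000 people: 630 true negatives and 10 false negatives.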

20. Referring to question #19 above, write a complete sentence interpreting the P-value and containing the appropriate data from the results. It is not acceptable to simply state that the P-value is or is not statistically significant.

21. Two hospitals choose to compare their CABG mortality rates over a one-year period. The hospitals have agreed to look at the mortality data considering patients with or without a diagnosis of hypertension (high blood pressure) at the time of the operation.

The data for Hospital A are shown in the following table:

Hospital A                   CABG Number   Deaths
Hypertension Diagnosis               600       60
No Hypertension Diagnosis            400       40

The data for Hospital B are shown in the following table:

Hospital B                   CABG Number   Deaths
Hypertension Diagnosis               100       10
No Hypertension Diagnosis            400       40

The hospital officials report that the CABG mortality rates are the same between the two hospitals. Using the data from these tables only, was the diagnosis of hypertension a confounder in the comparison between Hospital A and Hospital B such that an adjustment would be required? Explain your answer.
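One way to approach question 21 is to compare the crude mortality rate of each hospital with its stratum-specific rates. The sketch below does only that arithmetic with the counts from the two tables; the dictionary layout and the names `hospitals` and `rates` are illustrative assumptions.

```python
# (CABG number, deaths) per stratum, copied from the tables in question 21.
hospitals = {
    "Hospital A": {"hypertension": (600, 60), "no hypertension": (400, 40)},
    "Hospital B": {"hypertension": (100, 10), "no hypertension": (400, 40)},
}

rates = {}
for name, strata in hospitals.items():
    total_ops = sum(ops for ops, _ in strata.values())
    total_deaths = sum(deaths for _, deaths in strata.values())
    # Crude rate ignores hypertension status; stratum rates condition on it.
    rates[name] = {"crude": total_deaths / total_ops}
    for stratum, (ops, deaths) in strata.items():
        rates[name][stratum] = deaths / ops

for name, hospital_rates in rates.items():
    for label, rate in hospital_rates.items():
        print(f"{name} {label}: {rate:.0%}")
```

The hospitals differ in the share of patients with a hypertension diagnosis, so the point of interest is whether mortality also differs between the hypertension strata.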

22. Researchers want to determine if there is a difference in the average height between Tufts medical students and Harvard medical students. They randomly select 100 students from each school. The heights are normally distributed in each group. What statistical test should the researchers use to test the hypothesis?

23. There are pros and cons to the various types of epidemiology studies. List one pro and one con of a randomized controlled trial. (Do not list more than one pro and one con.)

24. The scores on an exam are normally distributed with a population mean of 80% and a population standard deviation of 5%. A student scores 75% on the exam. What percent of the exam scores were higher than the student's exam score? Show your work.

25. Researchers want to make an inference about the average systolic blood pressure of Tufts medical students. The researchers report that the systolic blood pressures are normally distributed. They randomly select 100 students and report that the sample mean is 120 mm Hg with a 95% confidence interval extending from 115 mm Hg to 125 mm Hg. Write a sentence interpreting this confidence interval.

END OF EXAM
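For checking work on question 24, the z-score and upper-tail area can be computed with Python's standard library. This is a minimal sketch using `statistics.NormalDist` (available from Python 3.8); the variable names are illustrative.

```python
from statistics import NormalDist

# Population values from question 24.
mean, sd, score = 80.0, 5.0, 75.0

z = (score - mean) / sd                # standard score of the student
exam = NormalDist(mu=mean, sigma=sd)   # normal model of the exam scores
pct_higher = 1 - exam.cdf(score)       # area above the student's score

print(f"z = {z:.1f}")
print(f"Share of scores above {score:.0f}%: {pct_higher:.1%}")
```

By hand, the same figure comes from the empirical rule: about 34% of scores lie between one standard deviation below the mean and the mean, and 50% lie above the mean.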