A study on the bi-aspect procedure with location and scale parameters




통계연구 (2012), Vol. 17, No. 1, 19-26

(Short Title: Bi-aspect procedure)

Hyo-Il Park 1), Ju Sung Kim 2)

1) Professor, Department of Statistics, Chongju University, Chongju 360-764, Korea. E-mail: hipark@cju.ac.kr
2) (Corresponding author) Professor, Department of Informational Statistics, Chungbuk National University, Chongju 361-763, Korea. E-mail: kimjs@chungbuk.ac.kr

Abstract

In this research we propose a new nonparametric test based on the bi-aspect procedure. We consider a hypothesis that deals with the location and scale parameters simultaneously and propose a new test statistic by choosing a suitable combining function. We then obtain the $p$-values by applying the permutation principle. We compare the efficiency of the combining functions by obtaining empirical powers through a simulation study, and we discuss some interesting features of our procedure as concluding remarks.

Key words: Combining function, location parameter, permutation principle, scale parameter, two-sample problem

1. Introduction

In recent years, in order to improve the power of tests, some nonparametric testing procedures have used several nonparametric test statistics simultaneously. A well-known procedure is the so-called versatile test (cf. Fleming et al., 1987), which combines several tests of the same hypothesis with a suitable combining function and reports the overall $p$-value. Park (2011) also considered several versatile tests based on a group of quantile tests. On the other hand, one may reduce the scope of the null hypothesis by splitting it into sub-hypotheses, each covering one aspect of the original null hypothesis, and then intersecting them. Pesarin (2001) initiated this approach for nonparametric testing problems and named it the multi-aspect test. To illustrate this test procedure, we consider the following null hypothesis for the two-sample problem

$$H_0 : F_1 = F_2, \tag{1.1}$$

where $F_i$ is the distribution function for the $i$th population, $i = 1, 2$. One may then test (1.1) using a suitable nonparametric test for the two-sample problem, which can be optimal for location alternatives. One may also be interested in testing (1.1) for the scale parameter. The bi-aspect test tests (1.1) for the location and scale parameters simultaneously. For this we express the relation between $F_1$ and $F_2$ as follows: for real numbers $\delta$ and $\sigma > 0$ and for all $x$,

$$F_2(x) = F_1\!\left(\frac{x - \delta}{\sigma}\right), \tag{1.2}$$

where $\delta$ and $\sigma$ are the location translation and scale parameters, respectively. In view of the model (1.2), we can rewrite (1.1) as

$$H_0 : \{\delta = 0\} \cap \{\sigma = 1\}. \tag{1.3}$$

For each individual sub-hypothesis $H_{01}: \delta = 0$ and $H_{02}: \sigma = 1$, one should choose a reasonable nonparametric test procedure. Then, by using a suitable combining function, one may obtain an overall $p$-value, or a critical value for any given significance level. Pesarin (2001) called this test procedure the bi-aspect test. The procedure has been developed and applied in various situations (cf. Marozzi, 2004 and 2007, and Salmaso and Solari, 2005). In particular, Brombin et al. (2011) applied multi-aspect tests to biomedical data.

To obtain the critical value for a given significance level, or the $p$-value, one needs the null distribution. For this purpose one can apply the permutation principle (cf. Good, 2000), which is a re-sampling method. The permutation principle was initiated by Fisher (1925), but because re-sampling is computationally intensive it has come into wide use only recently, with the rapid development of computing facilities and software.

In this research, we introduce a nonparametric test procedure for (1.1) with a bi-aspect approach based on (1.2) or (1.3) for the two-sample case. In the next section, we construct new test statistics by combining the $p$-values of the chosen test statistics with several combining functions.
We use the Wilcoxon rank sum statistic for the location sub-hypothesis $H_{01}: \delta = 0$ and the Mood (1954) statistic for the scale sub-hypothesis $H_{02}: \sigma = 1$. We then obtain the overall $p$-values by applying the permutation principle. We compare the performance of the combining functions by obtaining empirical powers through a simulation study. Finally we discuss some interesting features of the bi-aspect tests as concluding remarks.

2. Formulation of bi-aspect test

Let $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ be two independent random samples from populations with distribution functions $F_1$ and $F_2$. We assume that $F_i$ is unknown but continuous for each $i$, $i = 1, 2$, and that the two satisfy the relation (1.2). Based on these data sets, we are interested in testing (1.1), but through testing (1.3). For this purpose, we apply the Wilcoxon rank sum test for $H_{01}$ and the Mood (1954) test for $H_{02}$. Let $R_j$ be the rank of $Y_j$, $j = 1, \ldots, n$, in the combined sample of size $N = m + n$. Then the Wilcoxon rank sum statistic $W$ and the Mood statistic $M$ can be written as

$$W = \sum_{j=1}^{n} R_j \quad \text{and} \quad M = \sum_{j=1}^{n} \left( R_j - \frac{N+1}{2} \right)^2 .$$

In order to obtain the $p$-values for $W$ and $M$ for each sub-hypothesis, one needs their null distributions. For the small sample case, these can be obtained by applying the permutation principle (cf. Fisher, 1925). For the large sample case, one has to use the limiting distributions given by the large sample approximation theorem. It is well known that, when suitably normalized, both $W$ and $M$ converge in distribution to standard normal random variables. Having obtained the $p$-values for $W$ and $M$ by either of the two approaches, one may complete the test for (1.1) by choosing a suitable combining function to obtain an overall $p$-value. Several useful combining functions are summarized and explained by Pesarin (2001). We now review them. For this, let $\lambda_i$ be the $p$-value for $H_{0i}$, $i = 1, 2$.

(1) The Fisher omnibus combining function is based on the statistic $T_F = -2 \sum_{i=1}^{2} \log \lambda_i$. It is well known that if the two partial test statistics are independent and continuous, then $T_F$ follows a chi-square distribution with 4 degrees of freedom under (1.1).

(2) The Liptak combining function is based on the statistic $T_L = \sum_{i=1}^{2} \Phi^{-1}(1 - \lambda_i)$, where $\Phi^{-1}$ is the inverse of the standard normal distribution function.

(3) The Tippett combining function is given by $T_T = \max_{i = 1, 2} (1 - \lambda_i)$, where $\lambda_i$ is the $p$-value for the $i$th sub-hypothesis.

Having chosen one of the combining functions reviewed above, one needs the null distribution of the combined statistic to determine the final overall $p$-value. This null distribution can be obtained by applying the permutation principle. It is well known that the permutation principle yields an exact test when all the permutational configurations of the given data are considered. However, the computational burden of complete enumeration may be prohibitive. In that case one should take the Monte-Carlo approach for the re-sampling phase, and the test then produces an approximate result. In the next section, we compare the efficiency of the reviewed combining functions through a simulation study.

3. Some simulation results and concluding remarks

We compare the performance of the tests by obtaining empirical powers through a simulation study. For this we consider the normal, Cauchy, exponential, uniform and double exponential distributions for the three kinds of combining functions. In view of the hypothesis (1.3), we consider values of the pair $(\delta, \sigma)$ of location and scale parameters varying from (0, 1) to (1, 2) with an increment of 0.2 for each parameter. We note that (0, 1) is the value of $(\delta, \sigma)$ under the null hypothesis. The sample sizes for this study are (10, 10), (10, 20) and (20, 10), and the nominal significance level is 0.05. The simulation was conducted with SAS/IML on the PC version. All the results in the tables are based on 1,000 Monte-Carlo simulations, and within each simulation we applied the permutation principle with 1,000 Monte-Carlo iterations to estimate the null distribution. The simulation results are summarized in Tables 3.1 through 3.5. For the normal, Cauchy and double exponential cases, the [...] combining function shows high performance, but for the exponential distribution the Tippett one achieves better power. The [...] combining function yields the highest performance for the uniform distribution.

The simultaneous use of several statistics in nonparametric test procedures has been developed mainly to enhance the power of the test. However, increasing the number of test statistics does not guarantee an increase in power (cf. Park, 2011). Therefore, before analyzing the data in depth, a preliminary or exploratory analysis is needed to choose reasonable statistics to include in a multi-aspect test.
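The procedure described above (Wilcoxon and Mood partial statistics, permutation $p$-values, a combining function, and an overall permutation $p$-value) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' SAS/IML program: the function names (`pooled_ranks`, `partial_statistics`, `upper_tail_pvalues`, `bi_aspect_pvalue`) are hypothetical, the partial tests are taken to be two-sided (absolute deviation from the permutation mean), the data are assumed to have no ties, and the partial $p$-values are shifted by one half so that they stay strictly inside (0, 1), which keeps the Liptak function finite.

```python
import math
import random
from bisect import bisect_left
from statistics import NormalDist


def pooled_ranks(x, y):
    """Ranks 1..N of the pooled sample x + y (continuous data, no ties assumed)."""
    pooled = list(x) + list(y)
    order = sorted(range(len(pooled)), key=pooled.__getitem__)
    ranks = [0] * len(pooled)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def partial_statistics(ranks, n):
    """Two-sided Wilcoxon and Mood statistics from the ranks of the last n entries."""
    N = len(ranks)
    r = ranks[N - n:]
    w = sum(r)
    m = sum((ri - (N + 1) / 2) ** 2 for ri in r)
    # Assumption: two-sided tests, so use deviations from the permutation means.
    return abs(w - n * (N + 1) / 2), abs(m - n * (N * N - 1) / 12)


def upper_tail_pvalues(values):
    """Upper-tail p-value of each entry among all entries, strictly inside (0, 1)."""
    ordered = sorted(values)
    total = len(values)
    # (number of entries >= v, counting v itself, minus one half) / total
    return [(total - bisect_left(ordered, v) - 0.5) / total for v in values]


# Combining functions; large values are evidence against the null hypothesis.
COMBINERS = {
    "fisher": lambda p: -2.0 * sum(math.log(pi) for pi in p),
    "liptak": lambda p: sum(NormalDist().inv_cdf(1.0 - pi) for pi in p),
    "tippett": lambda p: max(1.0 - pi for pi in p),
}


def bi_aspect_pvalue(x, y, combine="fisher", n_perm=999, seed=0):
    """Monte-Carlo permutation p-value of the combined location-scale test."""
    rng = random.Random(seed)
    ranks = pooled_ranks(x, y)
    n = len(y)
    # Index 0 holds the observed statistics; the rest come from random relabellings.
    stats = [partial_statistics(ranks, n)]
    for _ in range(n_perm):
        rng.shuffle(ranks)
        stats.append(partial_statistics(ranks, n))
    p_location = upper_tail_pvalues([s[0] for s in stats])
    p_scale = upper_tail_pvalues([s[1] for s in stats])
    psi = COMBINERS[combine]
    combined = [psi(pair) for pair in zip(p_location, p_scale)]
    exceed = sum(1 for c in combined[1:] if c >= combined[0])
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = [0.1 * i for i in range(10)]
    y = [5.0 + 0.3 * i for i in range(10)]
    for name in COMBINERS:
        print(name, bi_aspect_pvalue(x, y, combine=name))
```

The same set of random permutations is reused for both partial statistics and for the combined statistic, so the dependence between the two partial tests is carried through to the overall $p$-value without having to be modelled.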

One can also try to obtain the $p$-values via a theoretical method such as asymptotic normality. For this, we first note that the Wilcoxon statistic $W$ and the Mood statistic $M$ are not correlated; in other words, the covariance between $W$ and $M$ is 0. One can prove this easily by simple permutational arguments. We also note that the limiting distributions of $W^*$ and $M^*$ are standard normal, where $W^*$ and $M^*$ are the standardized forms of $W$ and $M$, respectively. Since the covariance between $W^*$ and $M^*$ is 0, one can conclude that the limiting distributions of $W^*$ and $M^*$ are independent. Also, since $W^*$ and $M^*$ yield the same $p$-values as $W$ and $M$, one can assert that $W$ and $M$ are independent in the asymptotic sense. With this asymptotic approach, one may obtain overall $p$-values for the $W$ and $M$ case by simpler computations than the permutation principle requires.

Finally we note that there is another useful and well-known re-sampling method, the bootstrap (cf. Efron, 1979 and Shao and Tu, 1995). The distinction between the bootstrap and permutation methods is as follows: the bootstrap re-samples from the original sample with replacement, whereas the permutation principle re-samples without replacement. This difference can be significant in some cases (cf. Good, 2000).

<Table 3.1> Power estimates for normal distribution
Columns: (location, scale) = (0, 1), (0.2, 1.2), (0.4, 1.4), (0.6, 1.6), (0.8, 1.8), (1, 2)
0.047  0.115  0.199  0.314  0.428  0.541
0.056  0.148  0.280  0.431  0.581  0.701
0.061  0.123  0.231  0.385  0.503  0.603
0.053  0.138  0.257  0.433  0.565  0.680
0.058  0.192  0.392  0.576  0.720  0.825
0.053  0.145  0.317  0.521  0.683  0.800
0.051  0.134  0.245  0.393  0.541  0.667
0.052  0.177  0.361  0.543  0.700  0.803
0.059  0.145  0.293  0.492  0.648  0.774
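The remark in the concluding discussion that the Wilcoxon and Mood statistics have zero covariance under the null hypothesis can be checked exactly for small samples. The Python sketch below enumerates every possible set of ranks for the second sample and computes the permutation covariance with exact rational arithmetic. The function name `exact_covariance` and the sample sizes are illustrative assumptions, not part of the original study.

```python
from fractions import Fraction
from itertools import combinations


def exact_covariance(m, n):
    """Exact permutation covariance of the Wilcoxon and Mood statistics."""
    N = m + n
    centre = Fraction(N + 1, 2)
    # Under the null hypothesis every n-subset of the ranks 1..N is equally likely.
    subsets = list(combinations(range(1, N + 1), n))
    wilcoxon = [Fraction(sum(r)) for r in subsets]
    mood = [sum((ri - centre) ** 2 for ri in r) for r in subsets]
    count = len(subsets)
    mean_w = sum(wilcoxon) / count
    mean_m = sum(mood) / count
    return sum((w - mean_w) * (s - mean_m) for w, s in zip(wilcoxon, mood)) / count


if __name__ == "__main__":
    for sizes in [(3, 4), (5, 5), (4, 6)]:
        print(sizes, exact_covariance(*sizes))
```

Zero covariance alone does not make the two statistics independent in finite samples; the independence asserted in the text is asymptotic.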

<Table 3.2> Power estimates for Cauchy distribution
Columns: (location, scale) = (0, 1), (0.2, 1.2), (0.4, 1.4), (0.6, 1.6), (0.8, 1.8), (1, 2)
0.048  0.104  0.146  0.188  0.241  0.289
0.051  0.105  0.177  0.253  0.322  0.391
0.058  0.106  0.154  0.208  0.264  0.315
0.049  0.108  0.169  0.224  0.299  0.350
0.065  0.127  0.208  0.289  0.373  0.441
0.062  0.122  0.185  0.261  0.324  0.391
0.047  0.114  0.178  0.235  0.292  0.343
0.059  0.126  0.194  0.283  0.373  0.455
0.056  0.117  0.181  0.247  0.320  0.380

<Table 3.3> Power estimates for exponential distribution
Columns: (location, scale) = (0, 1), (0.2, 1.2), (0.4, 1.4), (0.6, 1.6), (0.8, 1.8), (1, 2)
0.048  0.172  0.444  0.668  0.811  0.914
0.056  0.189  0.511  0.793  0.921  0.983
0.045  0.221  0.516  0.765  0.911  0.966
0.049  0.096  0.217  0.356  0.500  0.659
0.046  0.124  0.308  0.533  0.738  0.874
0.041  0.076  0.166  0.328  0.531  0.686
0.045  0.134  0.324  0.552  0.720  0.848
0.051  0.152  0.411  0.684  0.879  0.948
0.043  0.154  0.407  0.650  0.822  0.919

<Table 3.4> Power estimates for uniform distribution
Columns: (location, scale) = (0, 1), (0.2, 1.2), (0.4, 1.4), (0.6, 1.6), (0.8, 1.8), (1, 2)
0.055  0.239  0.488  0.675  0.845  0.930
0.050  0.307  0.597  0.814  0.915  0.967
0.045  0.274  0.584  0.821  0.945  0.986
0.063  0.288  0.581  0.769  0.891  0.958
0.059  0.428  0.699  0.876  0.950  0.981
0.042  0.289  0.605  0.801  0.912  0.961
0.054  0.282  0.566  0.760  0.899  0.959
0.059  0.397  0.676  0.867  0.948  0.981
0.042  0.288  0.645  0.867  0.969  0.992

<Table 3.5> Power estimates for double exponential distribution
Columns: (location, scale) = (0, 1), (0.2, 1.2), (0.4, 1.4), (0.6, 1.6), (0.8, 1.8), (1, 2)
0.055  0.125  0.248  0.368  0.515  0.588
0.050  0.154  0.311  0.473  0.616  0.733
0.045  0.123  0.243  0.382  0.513  0.607
0.063  0.156  0.298  0.465  0.611  0.711
0.059  0.197  0.385  0.589  0.743  0.837
0.042  0.137  0.285  0.467  0.632  0.762
0.054  0.140  0.292  0.450  0.610  0.705
0.059  0.175  0.380  0.563  0.726  0.820
0.042  0.135  0.278  0.457  0.624  0.749

Acknowledgments

This work was supported by the research grant of the Chungbuk National University in 2011. The authors also wish to express their sincere appreciation to three anonymous referees for their constructive suggestions and advice.

References

Brombin, C., Salmaso, L., Ferronato, G. and Galzignato, P.-F. (2011). Multi-aspect procedures for paired data with application to biometric morphing, Communications in Statistics-Simulation and Computation, 40, 1-12.
Efron, B. (1979). Bootstrap methods: another look at the jackknife, Annals of Statistics, 7, 1-26.
Fleming, T. R., Harrington, D. P. and O'Sullivan, M. (1987). Supremum version of the logrank and generalized Wilcoxon statistics, Journal of the American Statistical Association, 82, 312-320.
Fisher, R. A. (1925). Statistical Methods for Research Workers, Oliver & Boyd, Edinburgh.
Good, P. (2000). Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 2nd Ed., Springer, New York.
Marozzi, M. (2004). A bi-aspect nonparametric test for the two-sample location problem, Computational Statistics and Data Analysis, 46, 81-92.
Marozzi, M. (2007). Multivariate tri-aspect non-parametric testing, Journal of Nonparametric Statistics, 19, 269-282.
Mood, A. M. (1954). On the asymptotic efficiency of certain nonparametric two-sample tests, Annals of Mathematical Statistics, 25, 514-522.
Park, H. I. (2011). A nonparametric test procedure based on a group of quantile tests, Communications in Statistics-Simulation and Computation, 40, 759-783.
Pesarin, F. (2001). Multivariate Permutation Tests, Wiley, New York.
Salmaso, L. and Solari, A. (2005). Multiple aspect testing for case-control designs, Metrika, 62, 331-340.
Shao, J. and Tu, D. (1995). The Jackknife and Bootstrap, Springer, New York.

(Received September 28, 2011, Revised November 23, 2011, Accepted December 6, 2011)