The Analysis of Interdependent Series by Correlation Method

PhD Professor Angelica BĂCESCU-CĂRBUNARU
PhD Lecturer Monica CONDRUZ-BĂCESCU
Bucharest University of Economic Studies
Revista Română de Statistică - Supliment (Romanian Statistical Review - Supplement)

Abstract

The correlation method measures the degree of interdependence between two or more variables. Qualitative analysis, based on knowledge of the type of variables, can explain the existence of a common cause that influences both of them. We also propose to consider briefly the problem of multiple correlation between a dependent variable and two or more independent variables.

Keywords: correlation analysis, functional relationship (mathematical law), the coefficient (intensity) of correlation, correlation, multifactorial (multiple) correlation, total multiple correlation, partial multiple correlation.

***

The essence of correlation method

In statistical research we often encounter distributions in which each unit of the population considered has simultaneously two or more features, of the same kind or of different nature. Examples can be found in most areas: people's height and weight, the amount of rainfall and harvests, technical equipment and labor productivity, etc. Such distributions, called two-dimensional, suggest the existence of relationships between those features.

Correlation analysis measures the degree of interdependence between two or more variables. It cannot prove a causal relationship, that is, a relationship of cause and effect between variables. Interdependence can, however, be functional. By functional relationship we understand a relationship that can be expressed by a formula or by what mathematicians call a mathematical law, such as the formula of a linear relationship. The fact that two variables tend to be related, meaning that increased levels of one tend to be accompanied by an increase of the second and vice versa, does not imply that the first has a direct influence on the second or vice versa. But that association does not prove the opposite either. Qualitative analysis based on a thorough knowledge of the nature of the variables is required for a correct interpretation of the correlation (intensity) coefficient.

The correlation of two variables can be explained by the existence of common causes that affect both. Income growth can cause both an increase in the population's cash holdings and an increase in the population's endowment with refrigerators. We cannot say that the first variable is the cause of the second, only that both are caused by the rising incomes of the population.

Determining the relationship between two variables raises the question of how close, how intense these relationships are and, consequently, how much the estimates or predictions made on the basis of regression analysis can vary. Just as an average cannot be properly interpreted without a measure of the dispersion or variability of the data from which it resulted (the most common measure being the mean square deviation, or standard deviation), so estimates or predictions resulting from regression analysis require a measure of their variability. We will consider, as measures of the variability of estimates based on regression analysis, the standard error of estimation and the correlation coefficient. To this end, we first refer to a hypothetical example consisting of five pairs of associated values (Y denotes the value estimated from the regression line).

   x     y     xy    x^2     Y     y - Y   (y - Y)^2
   1     1      1      1    1.2    -0.2      0.04
   2     2      4      4    2.1    -0.1      0.01
   3     4     12      9    3.0     1.0      1.00
   4     3     12     16    3.9    -0.9      0.81
   5     5     25     25    4.8     0.2      0.04
TOTALS
  15    15     54     55   15.0     0.0      1.90

The values of a and b are obtained with the help of the following formulas:

$$b = \frac{\sum xy - \dfrac{\sum x \sum y}{N}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{N}}$$

and

$$a = \frac{\sum y - b \sum x}{N}$$

With N = 5, $\sum x = 15$, $\sum y = 15$, $\sum xy = 54$ and $\sum x^2 = 55$, substitution gives:

$$b = \frac{54 - 45}{55 - 45} = 0.9, \qquad a = \frac{15 - 0.9 \cdot 15}{5} = 0.3$$

The best-fitting straight line has the form Y = a + bx. The regression equation is thus:

$$Y = 0.3 + 0.9x$$

We note that the algebraic sum of the differences (y - Y) is equal to zero. Recalling that the first algebraic property of the arithmetic mean is that the sum of deviations around the mean is zero, we conclude that the regression line is a line of averages. The regression line must also pass through the point with coordinates $(\bar{x}, \bar{y})$, and this is the case for our equation: for $\bar{x} = 3$ it gives $Y = 0.3 + 0.9 \cdot 3 = 3 = \bar{y}$.

Multiple (multifactorial) correlation

We will only briefly examine the problem of multiple correlation, i.e., the problem of correlation between a dependent variable and two or more independent variables. For example, we may want to know not only the correlation coefficient between labor productivity and the number of workers, but also between the number of workers and their energy endowment. Or we may want to know the correlation coefficient not only between the yield per hectare and the application of nitrogenous fertilizers, but also between work output and the application of nitrogenous fertilizers and certain phosphate fertilizers.

Introducing further independent (explanatory) variables into a regression problem results in a smaller standard error of estimation, and the value of the correlation coefficient r will increase. It is conceivable that, by introducing additional explanatory variables, the correlation coefficient becomes so high that almost all the variation is explained. But the difficulties that then appear in calculating the correlation coefficient will be greater than the benefit obtained.

Multiple correlation may be partial or total. While total multiple correlation measures the combined influence of the independent variables, partial multiple correlation measures the influence of the variation of each independent variable when the other one or the other ones are held constant. The partial correlation coefficient shows the relative importance of each independent variable.

M. Ezekiel, an American statistician who founded methods of calculating multiple and partial correlation, gave the following example to illustrate these methods. He referred to the relationship between the increase in the benefits of a farm and its size, the number of cattle and the number of workers. Considering only the size and the number of cattle, he calculated a correlation coefficient of 0.904. Introducing a third independent variable, the number of workers, increased the correlation coefficient to 0.915. Converting these correlation coefficients into coefficients of determination, the first two independent variables explain 81.8 % of the increase in benefits, and all three independent variables 83.7 %. The introduction of the third independent variable therefore increased the explained part of the variation in farm benefits by the difference between 83.7 % and 81.8 %, that is, by about 2 %. If the significance of this increase is judged by comparing it with the variation left unexplained before the introduction of the third independent variable (the number of workers), we find that 2.0 / 18.3, namely 10.93 percent of the variation that remained unexplained when we considered only the size and the number of cattle, can be associated with the number of workers. Taking the square root of 0.1093, we find the partial correlation coefficient 0.33.

Multifactorial links can be expressed with the help of the multiple regression equation:

$$Y = f(x_1, x_2, \ldots, x_p) + \varepsilon$$

where $x_1, x_2, \ldots, x_p$ represent the independent or factorial characteristics, and $\varepsilon$ is a residual variable with zero mean and constant variance.

As noted above, the factorial variables included in the model should express the key factors influencing the phenomenon investigated. The most widely used model of multifactorial regression is the linear model, expressed as follows:

$$Y = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_p x_p$$

where $a_0$ is a coefficient expressing the influence of the factors not included in the model, considered to act constantly, and $a_i$ ($i = 1, \ldots, p$) are the multiple regression coefficients, which show the extent to which each factorial characteristic $x_i$ influences the resultant characteristic y.

The parameters $a_0, a_1, \ldots, a_p$ are calculated starting from the well-known method of least squares, expressed as:

$$\sum \left( y - a_0 - a_1 x_1 - a_2 x_2 - \ldots - a_p x_p \right)^2 = \text{minimum}$$

Differentiating with respect to each parameter gives a system of normal equations with p factorial variables and p + 1 parameters, as follows:

$$
\begin{aligned}
a_0 n + a_1 \sum x_1 + a_2 \sum x_2 + \ldots + a_p \sum x_p &= \sum y \\
a_0 \sum x_1 + a_1 \sum x_1^2 + a_2 \sum x_1 x_2 + \ldots + a_p \sum x_1 x_p &= \sum x_1 y \\
a_0 \sum x_2 + a_1 \sum x_1 x_2 + a_2 \sum x_2^2 + \ldots + a_p \sum x_2 x_p &= \sum x_2 y \\
&\;\;\vdots \\
a_0 \sum x_p + a_1 \sum x_1 x_p + a_2 \sum x_2 x_p + \ldots + a_p \sum x_p^2 &= \sum x_p y
\end{aligned}
$$

The regression coefficients $a_i$ can have either a positive or a negative sign and show the type of connection (direct or inverse) between the factorial variable $x_i$ and the resultant variable y.

Conclusions

Checking the existence or non-existence and the intensity of these relations is the object of the analysis of interdependent series. It involves the simultaneous analysis of two variables and uses two types of statistical methods: regression and correlation. If one of the two variables is considered explanatory or independent, and the other, called resultant or dependent, changes when the first varies, we use the regression method to analyze the relationship between them.
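
As an illustration of the five-pair example in the section on the essence of the correlation method, the following minimal Python sketch recomputes a, b, the estimated values Y, the deviations y - Y and the correlation coefficient. It assumes that the five pairs are x = 1, 2, 3, 4, 5 and y = 1, 2, 4, 3, 5, the values recovered from the example table; the function name `fit_line` and the variable names are ours and do not come from the article.

```python
import math

# Assumed data: the five (x, y) pairs recovered from the hypothetical example.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 4, 3, 5]


def fit_line(xs, ys):
    """Return (a, b) of the least-squares line Y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = (sum xy - sum x * sum y / N) / (sum x^2 - (sum x)^2 / N)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # a = (sum y - b * sum x) / N
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = fit_line(xs, ys)

# Estimated values Y, deviations y - Y and their sum of squares.
fitted = [a + b * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]
sum_sq_residuals = sum(r * r for r in residuals)

# Simple (Pearson) correlation coefficient from the same sums of deviations.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.2f}")
print("sum of deviations:", round(sum(residuals), 10))
print("sum of squared deviations:", round(sum_sq_residuals, 2))
```

The sum of the deviations comes out as zero up to rounding, which is the property used in the text to describe the regression line as a line of averages.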

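
The normal equations of the linear multifactorial model can also be built and solved numerically. The sketch below is only an illustration under stated assumptions: the data are invented (two factorial variables, generated from y = 2 + 3x1 - x2 so that the expected coefficients are known), the helper names `normal_equations` and `solve` are hypothetical, and Gaussian elimination is our choice of solver, not a method prescribed by the article.

```python
def normal_equations(rows, ys):
    """Build the (p+1) x (p+1) normal-equation system for
    Y = a0 + a1*x1 + ... + ap*xp from observations `rows` and `ys`."""
    # Prepend the constant 1 so that a0 is treated like the other coefficients.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    # Left-hand side: sums of products x_i * x_j (entry [0][0] is n).
    lhs = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    # Right-hand side: sums of products x_i * y.
    rhs = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return lhs, rhs


def solve(lhs, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    m = [row[:] + [value] for row, value in zip(lhs, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: factorial variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        known = sum(m[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (m[i][k] - known) / m[i][i]
    return coeffs


# Invented observations of two factorial variables (x1, x2).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 2)]
# Invented resultant variable, generated exactly from y = 2 + 3*x1 - x2.
ys = [2 + 3 * x1 - x2 for x1, x2 in rows]

lhs, rhs = normal_equations(rows, ys)
coeffs = solve(lhs, rhs)
print([round(c, 6) for c in coeffs])  # a0, a1, a2
```

In this invented example the coefficient of x1 is positive (direct connection) and the coefficient of x2 is negative (inverse connection), which is the reading of the signs described in the text.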