Characteristics of Global Calling in VoIP services: A logistic regression analysis
|
|
|
- Dinah Jacobs
- 10 years ago
- Views:
Transcription
1 17 Characteristics of Global Calling in VoIP services: A logistic regression analysis Thanyalak Mueangkhot 1, Julian Ming-Sung Cheng 2 and Chaknarin Kongcharoen 3 1 Department of Business Administration, National Central University, Jhongli City, Taoyuan Country, 32001, Taiwan (R.O.C.) 2 Department of Business Administration, National Central University, Jhongli City, Taoyuan Country, 32001, Taiwan (R.O.C.) 3 Department of Computer Science & Information Engineering, National Central University, Jhongli City, Taoyuan Country, 32001, Taiwan (R.O.C.) Abstract We investigate the characteristics of Voice over Internet Protocol (VoIP) services in Taiwan that influence Global Calling choice, by comparing Bayesian and Maximum likelihood estimates. The results will assist operators in setting their services. The sample data are 573 sales transactions using prepaid cards issued by Taiwanese e-retailers. The dataset contains individuals who placed recommendations in , which are representative of the actual market. Global Calling is defined as a critical characteristic selected by consumers. Data are analyzed by logistic regression of the two approaches. The study shows that the odds ratio of Global Calling is suitably modeled by the Bayesian approach. Collectively, the results show that the key characteristic of VoIP service is the price per month and Taiwanese landline calling. Taiwanese landline calling exerts the greater positive effect on Global Calling choice. The study then discusses the implications of VoIP operator selection, focusing on Taiwan s local call market. Keywords: Characteristic of VoIP, Logistic regression, Bayesian estimation, Maximum Likelihood estimation, Global Calling. 1. Introduction International calling (hereafter called Global Calling) is offered by Internet Network Operators, including the dominant international player Skype and others such as Phone Power, ITP, Lingo, Tokbox and icall. Consumers can make international calls at zero or very low rates by Voice over Internet Protocal (VoIP). By the end of 2011, more than 2.3 billion people worldwide could access the internet [1]. To satisfy consumers, operators attempt to improve their own infrastructure technology and provide a range of services. The majority of VoIP call services utilize existing long distance calling, implementing voice technologies such as Global Calling. Therefore, Global Calling is a critical characteristic by which operators can develop and increase revenue. Operators are interested in the service characteristics (attributes) that affect Global Calling services, and in which methods are applicable to small data sources. Using VoIP services, operators may strategically design and use instruments that promote Internet communication services. In this paper, we investigate the impact of VoIP service characteristics on consumer decision in Taiwan s local call market. The accuracy of Bayesian predictors is compared that of Maximum likelihood (ML) estimates. The methodology uses the odds ratio of Global Calling. The estimated parameters provide insight into the characteristics of VoIP, and can thereby be used to improve international call performance. The remainder of this study is organized as follows. The next section briefly reviews existing knowledge of VoIP services and the application of logistic regression to the service sector. Section 3 describes the data sources and evaluates the relationship between VoIP service characteristics and Global Calling choice by logistic regression. The results of the analysis are presented in Section 4, while Section 5 provides a summary and conclusions. 2. Literature review The relevant literature can be broadly divided into management of the VoIP characteristics and application of
2 18 the logistic model to the service sector. We discuss each class in turn below. 2.1 VoIP characteristics Largely as a result of globalization, electronic communication technologies have become faster, cheaper, more reliable and more effective in recent decades, and have supplanted traditional means of sending information across the globe, which are slow and often unpredictable. VoIP has become an effective means of international calling at comparatively cheap rates. This innovation in the communication industry has led to new voice technologies, such as prepaid card calling. Prepaid phone cards are compatible with any phone, anywhere at any time, and have been adapted to VoIP services, where they are used to make international calls at reduced rates. These cards have led to rapid sales increases in commercial local markets such as voipindia.co.in, aglow.sg, indosat.com, skype.com, and skype.pchome.com.tw. This study concentrates on the role of characteristic services in telecommunication sectors, which are perceived in different ways by consumers. Previously, the relationships between VoIP choice and telephone service features have been investigated by part-worth function models, which were validated on consumer data [2, 3]. In this study, VoIP characteristic services are defined from the packages or plans of VoIP services, which are presented in promotional messages on operator websites. Examples are: Skype.com: Unlimited calls, Landline calls, Mobile calls, Landline and mobile calls, credit payment, subscription payment. aglow.sg: international calls, traditional analog phone lines, mobile phone line. Skype.pchome: Unlimited world, Taiwan landline calls, Taiwan mobile calls, Unlimited United States calls, China calls. Most of the VoIP operators offer similar service plans with various features. This research uses Skype as a representative VoIP service, on account of its current popularity in voice networking. Skype enables international calling through any landline or mobile phone, including received calls and voic messages. The main function of Skype is to support long distance and local calling by consumers. 2.2 Logistic model Many techniques have been proposed for analyzing the characteristics or attributes of telecommunication products [2,3,4]. However, comparatively few studies in the VoIP service literature have related service characteristics to VoIP consumer choice. Here, we evaluate a VoIP transaction dataset by Bayesian logistic regression, and compare the results with those estimated by ML. ML is a preferred analysis technique because its estimators are consistent, sufficient, and efficient [5]. The parameters estimated by ML and Bayesian regression are compared in terms of their predictive accuracy in telecommunication sectors. 3. Research Methodology In our methodology, the variables are VoIP service characteristics. We model the relationship between these service characteristics and Global Calling service choice. 3.1 Data This data are e-retailer sales transaction data on prepaid cards sold in Taiwan. The dataset includes information on retail choice by consumers who had purchased Skype prepaid cards and had recommended used services in blogs. Many Taiwanese customers purchased from two e- retailers on the Yahoo (yahoo.com.tw) and Ruten (ruten.com) sites. The dataset comprised 573 individual records logged in 28 months between 2011 and 2013 of the service plan. We identified 9 service characteristics among 8 VoIP service plans from messages promoting prepaid cards on e-retailers websites. These are reported in Table 1. In summary, the data consists of 9 service characteristics as independent variables for 8 VoIP packages. The data are categorized as Global Calling or non-global Calling as dependent variables. These are defined as the voice characteristics which influence consumer choice. Data collection is implemented from the service features advertised in the prepaid card messages. We selected 8 service plans which could be purchased at the time of data collection. 3.2 Logistic regression approach Logistic regression is used to predict a discrete outcome. The dependent variable is dichotomous, such as presence/absence or success/failure. When the independent variables are categorical or mixed continuous/categorical, or when the dependent variable is dichotomous, logistic regression is the preferred analysis technique.
3 19 Table 1 Variables and descriptions of VoIP service characteristics used in the analysis Variables Definition Mean S.D. GLOBAL Plan call listing included call to landline and mobile for 10 countries (Canada, China, Guam, Hong Kong SAR, Puerto Rico, Singapore, Thailand, United States) and call to landline for 65 countries (available at TW_JP_KR Plan call listing included call to Taiwan, Japan and Korea US_CA Plan call listing included call to USA and Canada TW_L_UN Plan call through Taiwan landline unlimited included TW_L Plan call through Taiwan landline included TW_M Plan call through Taiwan mobile included ONE_M price prepaid per one month ($TWD) THREE_M price prepaid per three months ($TWD) ONE_YEAR price prepaid per one year ($TWD) OLD_CUS Plan for old consumers Taiwan Dollar ($TWD) = $ US Dollar In logistic regression, the relationship between the predictor variable and dependent variables is non-linear. The logistic regression function is the logarithmic transformation of the probability, and is defined as where are the predictors, is a constant and, is the coefficient of the predictor variables. This transformation yields a number, called logit P(X), for each test case with input variable X. The logistic regression model linearly fits the log odds to the variables [6]. An alternative from of the logistic regression equation is given as (1) This study compares the two approaches of ML estimation and Bayesian estimation. The first approach is a classical statistical method of parameter estimation. ML estimates the unknown parameters in a logistic regression model [7]. ML maximizes the log likelihood, defined as the likelihood (odds) that the observed value of the dependent variable will be predicted from the observed values of the independents. The results of logistic regression are expressed in terms of odds rather than probabilities because the odds provide a pure summary statistic of the partial effect of a given predictor, controlling for the predictors in the logistic regression. The second approach is a Bayesian statistical method of parameter estimation. The Bayesian regression model uses the Markov Chain Monte Carlo (MCMC) approach which is simple and accurate method [8]. MCMC involves three steps: (a) formulating prior probability distributions of targeted parameters; (b) specifying the likelihood function; and (c) MCMC sampling of the posterior probability distributions. In the Bayesian approach, parameters are considered random and a joint probability model for both data and parameters is required. This difficulty is most easily circumvented by proposing an informative prior of small precision, to avoid potential bias introduced by subjective beliefs [9]. Because the joint posterior distribution of the parameters of the proposed models is analytically intractable, the point and interval estimates of the parameters are obtained by simulation-based MCMC methods [10]. After estimating a suitable prior distribution of the model parameters, the model runs simultaneously in WinBUGS. The parameters are defined from the data by a set of vague normal priors. A burn-in of 5,000 iterations is permitted, followed by 10,000 iterations in which the calculated intercept and coefficients are stored. Convergence was achieved after 5,000 iterations and the posterior distributions of model parameters were summarized using descriptive statistics. ML estimator of the model parameters was obtained in R, using the Rcmdr package introduced and developed by John Fox [11]. Comparison of the obtained results was also implemented in this package. (2)
4 20 In this study, the dependent variable is Global Calling Global, which takes value 1 if the consumer has used the Global Calling service and 0 otherwise. The independent variables are TW_JP_KR, US_CA, TW_L_UN, TW_L, TW_M, and OLD_CUS is a dummy indicator variable. The three cost-per-period levels, ONE_M, THREE_M, and ONE_YEAR, are continuous variables. The analysis attempts to highlight the relationship between the VoIP service characteristics and Global Calling choice. A logistic regression model can be written in terms of the odds of an event occurring. These odds are defined as the ratio of the probability that an event will occur and the probability that it will not. The coefficients indicate the effect of individual explanatory variables on the logarithm of the odds. In terms of the variables used in this study, the logistic regression equation is + where Yi is the log odds of Global Calling for the ith observation and is its error term. As shown in Eq. (3), the probability of Global Calling is affected by the difference between Global Calling and its absence. In this study, the parameters are estimated by both Bayesian and ML methods. 4. Results The results are summarized in Tables 2 and 3. The fit models of the two approaches are assessed and the estimates are reported. Our main interest lies in analyzing the relationship between the presence/absence of Global Calling in VoIP service plans and the various characteristics of the service. 4.1 Model fitting The model is based on Bayesian and ML estimation. The eight predictors are the characteristics of the VoIP service extracted from the promotional messages of prepaid cards posted on company websites. We compared the model with respect ot Akaike s Information Criterion [12] and Deviance Information Criterion [13]. The DIC value reported by Bayesian analysis was 204.8, while the AIC reported by ML was The smaller DIC value indicates that the Bayesian method provides a better fit to the data. (3) 4.2 Variable effects The estimates output by the two approaches (Bayesian logistic regression and ML) are summarized in Tables 2 and 3. The results involve eight predictors and the outcome is the probability of GLOBAL. In the Bayesian method, the posterior estimations of beta coefficients are obtained as the means of simulated betas from different numbers of samples after confirming convergence. On the other hand, the ML estimates of beta coefficients are obtained by iterating the Newton Raphson method on the data. In both cases, the results are most conveniently explained in terms of the odds ratio (the exponential of the betas coefficients). The respective posterior credible intervals (or confidence intervals) can be readily obtained in the usual manner. The variables considered are dummies, discrete or continuous variables. The odd ratio of a discrete variable can be interpreted as the change in relative risk of a certain outcome associated with a change in the independent variable. Values below 1 imply a negative association and their magnitude indicates their contribution to the probability of the outcome. The odds ratios for the continuous variables, on the other hand, can be interpreted as elasticities; that is, the odds ratio indicates the extent to which a one per cent change in an independent variable affects the relative risk of the outcome. As for the discrete variables, values below 1 imply negative elasticities. This study presents the odd ratio analysis of the discrete variables in GLOBAL and non GLOBAL. The Bayesian analysis indicates that five variables are significantly correlated with Global Calling. Two variables, ONE_M and TW_L, make significantly positive contributions. On the other hand, ML analysis reveals only two variables that contribute significantly to Global Calling choice, of which one (ONE_M) makes a positive contribution. THREE_M exerts a significantly negative effect on the service. The odds ratio, together with the respective 95% posterior credible intervals and 95% confidence intervals, are also presented in Tables 2 and 3. Similar results are obtained for both Bayesian and ML estimation methods. The ONE_M variable is the characteristic yielding the highest number of sales. THREE_M and ONE_YEAR are variables of higher cost than ONE_M and generate fewer sales than ONE_M. From these results, it is evident that consumers who select higher-cost services are less likely to intensify the service. The results are supported by both Bayesian and ML at the 5% significance level.
5 21 The TW_L variable exerts a positive impact on Global Calling. In the real market, if an operator enables the TW_L characteristic, the volume of sales is likely to Table 2: Bayesian Estimates in Logistic model Parameter mean 2.50% 97.50% Odd Ratio alpha -3.43E E - 15 ONE_M 1.10E E + 00 * THREE_M -2.17E E - 01 * ONE_YEAR 1.02E E + 00 TW_L 2.42E E + 11 * TW_M -1.00E E - 44 * TW_L_UN -6.21E E - 03 * US_CA -1.31E E - 06 TW_JP_KR -4.93E E - 03 OLD_CUS 8.06E E + 00 Signif. Code: * Table 3: Maximum Likelihood Estimates in Logistic model variable Estimate Std. Error Pr(> z ) Odd Ratio Intercept -4.41E E E - 20 ONE_M 1.49E E E E + 00 * THREE_M -2.41E E E E - 01 * ONE_YEAR -1.54E E E - 01 TW_L 4.52E E E + 19 TW_M -7.54E E E - 33 TW_L_UN -1.94E E E - 09 US_CA 1.16E E E + 05 TW_JP_KR -2.71E E E - 12 OLD_CUS 1.03E E E + 00 Signif. Code: * increase. Therefore, the operator is likely to design service plans that support international calling. These conclusions can be drawn only from the Bayesian analysis; ML estimation did not identify a similar impact of the TW_L variable. A primary goal of the present study is to compare the cause-effect estimates from ML and Bayesian analysis. Interestingly, when the two methods were applied to Global Calling data, they yielded different results. The variables ONE_M, THREE_M, TW_L, TW_M and TW_L_UN showed a cause-effect relationship under Bayesian estimation, while ML estimation implicated just two variables, ONE_M and THREE_M, in the cause-effect of VoIP services. The results also demonstrate the applicability of the Bayesian approach to sample observations. 5. Conclusions The purpose of this study was to predict the important variables that influence Global Calling choice probability in the Taiwanese local call market. To this end, we identified the VoIP characteristics that most strongly affect Global and non-global Calling by logistic regression analysis using Bayesian and ML methods. The methods were compared in terms of their AIC and DIC statistics. The results show that ONE_M and TW_L make significantly positive contributions to GLOBAL choice, while TW_M, TW_L_UN and ONE_YEAR exert a negative influence. This study has also identified the Taiwan landline service package as the greatest positively contributing factor. The Bayesian method provided a better fit to the data than the ML method, and identified a larger number of significant factors. Similar statistical studies will assist operators in elucidating VoIP service patterns. Furthermore, researchers can identify interesting transaction data that will assist theoretical development in communication technology. The empirical results of this study have implications for Internet network operators. The main factors predicted to influence customer choice in Global Calling services depend on the adopted analysis approach. We conclude that the Taiwanese local call market should retain the landline feature in its service package. Acknowledgments Special acknowledgment is extended to Charnarin for assistance with data collection and for the many valuable discussions. We also acknowledge Ryan Chen for motivating this study, and who instigated useful discussions on the need to identify the characteristics of VoIP services from commercial Taiwanese websites. References [1] ITU, Internet continues to grow Worldwide to reach 2.3 billion mark late June 2012, available at: accessed 5 May [2] F.m Tseng, Forecasting the Taiwan customer market for Internet telephony, Journal of the Chinese Institute of Industrial Engineers, Vol. 22, No. 2, 2005, pp [3] M. L. Zubey, W. Wagner, and J. R. Otto, A conjoint analysis of voice over IP attributes, Internet Research, Vol. 12, No. 1, 2002, pp [4] G. Cecere, and N. Corrocher, The usage of VoIP services and other communication services: An empirical analysis of
6 22 Italian consumers, Technological Forecasting and Social Change, Vol. 79, No. 3, 2012, pp [5] M.E.B. Akiva, and S. R. Lerman, Discrete choice analysis: theory and application to predict travel demand, MIT press, Vol. 9, [6] B. W. Lindgren, Statistic theory, Macmillan College, [7] S.E. Fienberg, The analysis of cross-classified categorical data: Springer, [8] R. S. Akhter, S. Hussain and U. R. S. Habib, MCMC simulation of GARCH model to forecast network traffic boad, International Journal of Computer Science Issues, Vol. 9, No.2, 2012, pp [9] R. Haylock, and A. O Hagan, On inference for outputs of computationally expensive algorithms with uncertainty on the inputs, Bayesian statistics, Vol. 5, 1996, pp [10] L. Tierney, Markov chains for exploring posterior distributions, The Annuals of Statistics, 1994, pp [11] J. Fox, Getting Started With the R Commander: A Basic- Statistics Graphical User Interface to R, Journal of Statistical Software, Vol. 14, No. 9, 2005, pp [12] H. Akaike, "Information theory and an extension of the maximum likelihood principle, Selected Paper of Hirotugu Akike, 1998, pp [13] D.J. Spiegelhalter, N. G. Best, B. P. Carlin et al., Bayesian measures of model complexity and fit, Journal of the Royal Statistical Society: Series B (Statistical Methodology), Vol. 64, No. 4, 2002, pp Thanyalak Mueangkhot received her M.B.A in Faculty of Liberal Arts and Management of Kasetsart University, SakonNakhon campus, Thailand, in She is now a graduate in Department of Business Administration of National Central University. Her main research interest is service marketing, green marketing, and Internet Technologies. Julian Ming-Sung Cheng is an Associate Professor of Marketing in the Department of Business Administration, National Central University, Taiwan. His current research interests include marketing channels, international branding, global marketing and internet marketing. His papers have appeared in Journal of International Marketing, Industrial Marketing Management, Journal of Business and Industrial Marketing, Journal of Advertising Research, International Journal of Market Research, Journal of Product & Brand Management, Journal of Marketing Channels, Asia Pacific Journal of Marketing and Logistics and so on. Chaknarin Kongcharoen received his M.Eng in Computer Engineering in Faculty of Engineering, Suranaree University of Technology, Thailand, in He is now a graduate in Department of Computer Science & Information Engineering, National Central University. His main research interest in data mining, web mining, software metric and e-learning. His papers have appeared in KMITL Science and Technology Journal.
Handling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
Multivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
Generalized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
More details on the inputs, functionality, and output can be found below.
Overview: The SMEEACT (Software for More Efficient, Ethical, and Affordable Clinical Trials) web interface (http://research.mdacc.tmc.edu/smeeactweb) implements a single analysis of a two-armed trial comparing
LOGISTIC REGRESSION ANALYSIS
LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic
Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach
Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Refik Soyer * Department of Management Science The George Washington University M. Murat Tarimcilar Department of Management Science
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models
Using SAS PROC MCMC to Estimate and Evaluate Item Response Theory Models Clement A Stone Abstract Interest in estimating item response theory (IRT) models using Bayesian methods has grown tremendously
Applications of R Software in Bayesian Data Analysis
Article International Journal of Information Science and System, 2012, 1(1): 7-23 International Journal of Information Science and System Journal homepage: www.modernscientificpress.com/journals/ijinfosci.aspx
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )
Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples
Imputing Missing Data using SAS
ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
Marketing Mix Modelling and Big Data P. M Cain
1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored
Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as
LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values
CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
Behavioral Entropy of a Cellular Phone User
Behavioral Entropy of a Cellular Phone User Santi Phithakkitnukoon 1, Husain Husna, and Ram Dantu 3 1 [email protected], Department of Comp. Sci. & Eng., University of North Texas [email protected], Department
Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
VoIP Services in Asia
Brochure More information from http://www.researchandmarkets.com/reports/310908/ VoIP Services in Asia Description: Since its entrance into the diverse Asia telecommunication market in the late 1990s,
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
A Bayesian hierarchical surrogate outcome model for multiple sclerosis
A Bayesian hierarchical surrogate outcome model for multiple sclerosis 3 rd Annual ASA New Jersey Chapter / Bayer Statistics Workshop David Ohlssen (Novartis), Luca Pozzi and Heinz Schmidli (Novartis)
Statistics in Retail Finance. Chapter 6: Behavioural models
Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural
Some Essential Statistics The Lure of Statistics
Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived
Parallelization Strategies for Multicore Data Analysis
Parallelization Strategies for Multicore Data Analysis Wei-Chen Chen 1 Russell Zaretzki 2 1 University of Tennessee, Dept of EEB 2 University of Tennessee, Dept. Statistics, Operations, and Management
New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
It is important to bear in mind that one of the first three subscripts is redundant since k = i -j +3.
IDENTIFICATION AND ESTIMATION OF AGE, PERIOD AND COHORT EFFECTS IN THE ANALYSIS OF DISCRETE ARCHIVAL DATA Stephen E. Fienberg, University of Minnesota William M. Mason, University of Michigan 1. INTRODUCTION
Analysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
Model-based Synthesis. Tony O Hagan
Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that
MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
Integrated Resource Plan
Integrated Resource Plan March 19, 2004 PREPARED FOR KAUA I ISLAND UTILITY COOPERATIVE LCG Consulting 4962 El Camino Real, Suite 112 Los Altos, CA 94022 650-962-9670 1 IRP 1 ELECTRIC LOAD FORECASTING 1.1
From the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
The HB. How Bayesian methods have changed the face of marketing research. Summer 2004
The HB How Bayesian methods have changed the face of marketing research. 20 Summer 2004 Reprinted with permission from Marketing Research, Summer 2004, published by the American Marketing Association.
A Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University [email protected]
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University [email protected] 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection
Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics
Analysis of Fire Statistics of China: Fire Frequency and Fatalities in Fires
Analysis of Fire Statistics of China: Fire Frequency and Fatalities in Fires FULIANG WANG, SHOUXIANG LU, and CHANGHAI LI State Key Laboratory of Fire Science University of Science and Technology of China
Multinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4
4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression
I. Basic concepts: Buoyancy and Elasticity II. Estimating Tax Elasticity III. From Mechanical Projection to Forecast
Elements of Revenue Forecasting II: the Elasticity Approach and Projections of Revenue Components Fiscal Analysis and Forecasting Workshop Bangkok, Thailand June 16 27, 2014 Joshua Greene Consultant IMF-TAOLAM
CHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios
Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios By: Michael Banasiak & By: Daniel Tantum, Ph.D. What Are Statistical Based Behavior Scoring Models And How Are
Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle
Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases Andreas Züfle Geo Spatial Data Huge flood of geo spatial data Modern technology New user mentality Great research potential
Simple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its
Model Selection and Claim Frequency for Workers Compensation Insurance
Model Selection and Claim Frequency for Workers Compensation Insurance Jisheng Cui, David Pitt and Guoqi Qian Abstract We consider a set of workers compensation insurance claim data where the aggregate
On the Determinants of Household Debt Maturity Choice
On the Determinants of Household Debt Maturity Choice by Wolfgang Breuer*, Thorsten Hens #, Astrid Juliane Salzmann +, and Mei Wang Abstract. This paper jointly analyzes a traditional and a behavioral
Impact of Firm Specific Factors on the Stock Prices: A Case Study on Listed Manufacturing Companies in Colombo Stock Exchange.
Impact of Firm Specific Factors on the Stock Prices: A Case Study on Listed Manufacturing Companies in Colombo Stock Exchange. Abstract: Ms. Sujeewa Kodithuwakku Department of Business Finance, Faculty
ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node
Enterprise Miner - Regression 1 ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node 1. Some background: Linear attempts to predict the value of a continuous
Free Trial - BIRT Analytics - IAAs
Free Trial - BIRT Analytics - IAAs 11. Predict Customer Gender Once we log in to BIRT Analytics Free Trial we would see that we have some predefined advanced analysis ready to be used. Those saved analysis
An Empirical Analysis on the Performance Factors of Software Firm
, pp.121-132 http://dx.doi.org/10.14257/ijseia.2014.8.7,10 An Empirical Analysis on the Performance Factors of Software Firm Moon-Jong Choi, Jae-Won Song, Rock-Hyun Choi and Jae-Sung Choi #3-707, DGIST,
R2MLwiN Using the multilevel modelling software package MLwiN from R
Using the multilevel modelling software package MLwiN from R Richard Parker Zhengzheng Zhang Chris Charlton George Leckie Bill Browne Centre for Multilevel Modelling (CMM) University of Bristol Using the
Models for Product Demand Forecasting with the Use of Judgmental Adjustments to Statistical Forecasts
Page 1 of 20 ISF 2008 Models for Product Demand Forecasting with the Use of Judgmental Adjustments to Statistical Forecasts Andrey Davydenko, Professor Robert Fildes [email protected] Lancaster
11. Time series and dynamic linear models
11. Time series and dynamic linear models Objective To introduce the Bayesian approach to the modeling and forecasting of time series. Recommended reading West, M. and Harrison, J. (1997). models, (2 nd
Master s Thesis. A Study on Active Queue Management Mechanisms for. Internet Routers: Design, Performance Analysis, and.
Master s Thesis Title A Study on Active Queue Management Mechanisms for Internet Routers: Design, Performance Analysis, and Parameter Tuning Supervisor Prof. Masayuki Murata Author Tomoya Eguchi February
Gaussian Processes to Speed up Hamiltonian Monte Carlo
Gaussian Processes to Speed up Hamiltonian Monte Carlo Matthieu Lê Murray, Iain http://videolectures.net/mlss09uk_murray_mcmc/ Rasmussen, Carl Edward. "Gaussian processes to speed up hybrid Monte Carlo
Understanding Characteristics of Caravan Insurance Policy Buyer
Understanding Characteristics of Caravan Insurance Policy Buyer May 10, 2007 Group 5 Chih Hau Huang Masami Mabuchi Muthita Songchitruksa Nopakoon Visitrattakul Executive Summary This report is intended
Comparison of sales forecasting models for an innovative agro-industrial product: Bass model versus logistic function
The Empirical Econometrics and Quantitative Economics Letters ISSN 2286 7147 EEQEL all rights reserved Volume 1, Number 4 (December 2012), pp. 89 106. Comparison of sales forecasting models for an innovative
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
Bayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
Executive Program in Managing Business Decisions: A Quantitative Approach ( EPMBD) Batch 03
Executive Program in Managing Business Decisions: A Quantitative Approach ( EPMBD) Batch 03 Calcutta Ver 1.0 Contents Broad Contours Who Should Attend Unique Features of Program Program Modules Detailed
How To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification
Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement
Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive
Linear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
A Study on the Comparison of Electricity Forecasting Models: Korea and China
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 675 683 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.675 Print ISSN 2287-7843 / Online ISSN 2383-4757 A Study on the Comparison
Java Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
ELASTICITY OF LONG DISTANCE TRAVELLING
Mette Aagaard Knudsen, DTU Transport, [email protected] ELASTICITY OF LONG DISTANCE TRAVELLING ABSTRACT With data from the Danish expenditure survey for 12 years 1996 through 2007, this study analyses
Branding and Search Engine Marketing
Branding and Search Engine Marketing Abstract The paper investigates the role of paid search advertising in delivering optimal conversion rates in brand-related search engine marketing (SEM) strategies.
OPTIMAL DESIGN OF A MULTITIER REWARD SCHEME. Amir Gandomi *, Saeed Zolfaghari **
OPTIMAL DESIGN OF A MULTITIER REWARD SCHEME Amir Gandomi *, Saeed Zolfaghari ** Department of Mechanical and Industrial Engineering, Ryerson University, Toronto, Ontario * Tel.: + 46 979 5000x7702, Email:
Voice Over Internet Protocol (VOIP)
Voice Over Internet Protocol (VOIP) Helping with your telecoms management Voice over Internet Protocol (VoIP) What is VoIP? VoIP is the ability to transmit voice over the Internet VoIP was adopted by the
Agilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox. Application Note
Agilent Creating Multi-tone Signals With the N7509A Waveform Generation Toolbox Application Note Introduction Of all the signal engines in the N7509A, the most complex is the multi-tone engine. This application
TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND
I J A B E R, Vol. 13, No. 4, (2015): 1525-1534 TEMPORAL CAUSAL RELATIONSHIP BETWEEN STOCK MARKET CAPITALIZATION, TRADE OPENNESS AND REAL GDP: EVIDENCE FROM THAILAND Komain Jiranyakul * Abstract: This study
Imperfect Debugging in Software Reliability
Imperfect Debugging in Software Reliability Tevfik Aktekin and Toros Caglar University of New Hampshire Peter T. Paul College of Business and Economics Department of Decision Sciences and United Health
Note on the EM Algorithm in Linear Regression Model
International Mathematical Forum 4 2009 no. 38 1883-1889 Note on the M Algorithm in Linear Regression Model Ji-Xia Wang and Yu Miao College of Mathematics and Information Science Henan Normal University
Studying Achievement
Journal of Business and Economics, ISSN 2155-7950, USA November 2014, Volume 5, No. 11, pp. 2052-2056 DOI: 10.15341/jbe(2155-7950)/11.05.2014/009 Academic Star Publishing Company, 2014 http://www.academicstar.us
Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I
Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting
Chapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
Knowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs [email protected] Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
The SPSS TwoStep Cluster Component
White paper technical report The SPSS TwoStep Cluster Component A scalable component enabling more efficient customer segmentation Introduction The SPSS TwoStep Clustering Component is a scalable cluster
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Penalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
Calculating the Probability of Returning a Loan with Binary Probability Models
Calculating the Probability of Returning a Loan with Binary Probability Models Associate Professor PhD Julian VASILEV (e-mail: [email protected]) Varna University of Economics, Bulgaria ABSTRACT The
Forecast Model for Box-office Revenue of Motion Pictures
Forecast Model for Box-office Revenue of Motion Pictures Jae-Mook Lee, Tae-Hyung Pyo December 7, 2009 1 Introduction The main objective of the paper is to develop an econometric model to forecast box-office
Better decision making under uncertain conditions using Monte Carlo Simulation
IBM Software Business Analytics IBM SPSS Statistics Better decision making under uncertain conditions using Monte Carlo Simulation Monte Carlo simulation and risk analysis techniques in IBM SPSS Statistics
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
STAT3016 Introduction to Bayesian Data Analysis
STAT3016 Introduction to Bayesian Data Analysis Course Description The Bayesian approach to statistics assigns probability distributions to both the data and unknown parameters in the problem. This way,
Merged Structural Equation Model of Online Retailer's Customer Preference and Stickiness
Merged Structural Equation Model of Online Retailer's Customer Preference and Stickiness Sri Hastuti Kurniawan Wayne State University, 226 Knapp Building, 87 E. Ferry, Detroit, MI 48202, USA [email protected]
