A New Evaluation Measure for Information Retrieval Systems


 Millicent French
 2 years ago
 Views:
Transcription
1 A New Evaluation Measure for Information Retrieval Systems Martin Mehlitz Christian Bauckhage Deutsche Telekom Laboratories Jérôme Kunegis Şahin Albayrak Abstract Some of the establishe approaches to evaluating text clustering algorithms for information retrieval show theoretical flaws. In this paper, we analyze these flaws an introuce a new evaluation measure to overcome them. Base on a simple yet rigorous mathematical analysis of the effect of certain parameters in cluster base retrieval, we show that certain conclusions rawn in the recent literature must be taken with a grain of salt. Our new measure, in contrast, accounts for statistical biases that have to be expecte accoring to our analysis. A series of experiments an a comparison with results reporte recently unerlines that this measure is a more suitable performance inicator that allows for more meaningful interpretations. I. INTRODUCTION Information retrieval is the iscipline within computer science that eals with the automate storage an retrieval of ocuments an ata [1], [2], [3], [4]. As the amount of ata available to moern information societies continues to grow rapily, it is not surprising to fin that information retrieval is a very active area of research. In this paper, we focus on the problems of text retrieval an text clustering where the latter has been propose as a technique for improving text retrieval [5]. In general, systems that apply text clustering either retrieve ocuments by automatically ranking clusters an then selecting ocuments from the highest ranking clusters [6] or by showing the cluster structure to the user who has to select a cluster whose ocuments are then isplaye [7]. The output of a system that automatically ranks clusters is a flat list of ocuments similar to that of a regular information retrieval system. Therefore, the evaluation of these systems is similar to the evaluation of regular information retrieval systems. Evaluating a system that shows a cluster structure to a user though requires other measures. A methoology that is often applie for evaluation of these systems is the optimal cluster search which selects the cluster with the optimal value in regar to the employe measure [8]. In this paper, we analyze the pitfalls of this evaluation methoology. Base on an appropriate mathematical moel for what is going on in clusterbase text retrieval, we reconsier results reporte in a recent paper which evaluates several cluster algorithms for queryspecific clustering [9]. Our main contribution will be a new measure for evaluating information retrieval systems, which circumvents the shortcommings which we ientify in our analysis. The remainer of this paper is structure as follows: In section II, we will briefly review concepts applie in information retrieval. Section III will then present a mathematical analysis of how the optimal cluster search methoology affects precision an recall measures. Base on the results we erive for ranom clusterings, section IV will introuce a new evaluation inicator, which, in the experiments in section V, is compare to the classic evaluation measures. A summary an an outlook on future work will conclue this contribution. II. BACKGROUND AND RELATED WORK A. Clustering Documents for Information Retrieval Clustering is the process of grouping items in a way that items in a group are similar to each other an issimilar to the items in other groups [10]. In information retrieval often the problem for retrieving the esire ocuments given a short query lies in the ambiguity of the query. Terms in a query can occur in various ocuments an contexts an usually it is very ifficult to ientify the ocuments where a search term is use in the intene sense. Therefore, many search results in a result list of a classic information retrieval system are irrelevant. Clustering results for information retrieval is a way to circumvent this problem by structuring the (probably) very long list of search results. Of course, this will not avoi any irrelevant ocuments but if the cluster labels are intuitive enough, the user can irectly choose the cluster which contains the most relevant ocuments for his request. Figure 1 illustrates the process of ocument clustering. Starting with a user querying an information retrieval system with the phrase jaguar, a list of relevant ocuments is foun by matching strings in the search phrase with ocuments. While most search engines stop after the secon step (the actual retrieval of ocuments), text clustering procees by analyzing the ocuments in the result list in orer to group
2 Fig. 1. Query: Jaguar Doc1 ( join the Jaguar club car ) Doc2 ( car clubjaguar ) Doc3 (this weekens racejaguarriver) Doc4 (the Jaguar riverclub) Doc5 (saw animalbig catjaguarhabitat) Doc6 (Jaguarcar sale) Doc7 (animalsjaguar in its natural habitat) Doc8 (as I saw this car, the Jaguar ) Doc9 (Jaguaranimaljoin nature club) Doc10 (Jaguar on a Mac OS computer sale) Doc11 (on my computerjaguarsoftware) Car Doc1, Doc2 Doc3, Doc4 Doc6, Doc8 Animal Doc5, Doc7 Doc9, Computer Doc10, Doc 11, Diactic example of clustering ocuments for information retrieval similar ocuments. The user who querie for jaguar can then choose among the clusters an access the corresponing ocuments. B. Evaluation of Information Retrieval Systems The measures typically employe to evaluate an information retrieval system are precision an recall. Precision ( How many of the retrieve results are relevant? ) an recall ( How many of the relevant results are retrieve? ) are efine as: P (g ret,r)= g ret (1) r an R (g ret,g)= g ret g, (2) where r enotes the number of ocuments retrieve, g correspons to the number of relevant ocuments in the collection an g ret enotes the number of relevant ocuments among the returne ocuments. The effectiveness measure E is propose in [8] as a measure for the evaluation of information retrieval systems that are base on text clustering. E is efine as: ( β 2 +1 ) PR E (P, R) =1 β 2 P + R, (3) where β is a factor that etermines the relative importance of precision an recall where precision an recall values are base on the ocuments in the cluster. C. Evaluation of Document Cluster Information Retrieval Systems Base on the hypothesis that closely associate ocuments ten to be relevant to the same request [4] some information retrieval systems employ ocument clustering in orer to achieve improvement in retrieval of relevant ocuments. Especially hierarchical cluster methos are very popular for text clustering, since hierarchies of clusters are suppose to be relatively intuitive to browse. Research on using ocument clustering techniques for information retrieval resulte in algorithms like Suffix Tree Clustering [7] an the DisCover Algorithm [11] as well as in the esign of the Scatter/Gather System [12] an the CREDO System [13]. Usually, novel ocument clustering methos are compare to existing methos or to an inverte file search, base on recall an precision measures or some variants of these. A common evaluation methoology for such a comparison is the Optimal Cluster Search ([7], [8], [9], [12], [14]). This methoology evaluates every single cluster an then selects the cluster with the optimal value as a representative for the cluster hierarchy. Text clustering for information retrieval can be one either statically on the whole collection of ocuments [6], [8] or in a queryspecific manner for a small number of top ranking ocuments returne by an inverte file search [7], [13]. In [9], Tombros, Villa an van Rijsbergen evaluate several cluster algorithms on ifferent ata sets with the goal of ientifying the extent to which the number n of top ranking ocuments that are clustere influences the retrieval of relevant ocuments. When creating ocument clusters, ocuments can either be limite to occur in only one leaf cluster of a hierarchy (which means they still appear in their parent clusters all the way to the root) or they are allowe to occur in multiple clusters. Note that forcing ocuments to appear in only one cluster is consiere to be artificial [7], [12] since ocuments can contain more than one topic an a goo cluster structure shoul reflect this. III. COMPUTING EXPECTED VALUES FOR RANDOM CLUSTERING When evaluating a text clustering algorithm for information retrieval with the optimal cluster evaluation methoology one has to consier the impact of this methoology on the achieve results. If an algorithm prouces a large number of clusters an the best one is selecte, then the critical point is, how likely it is to achieve such a result if the clusters were generate ranomly. In this section we review the mathematical backgroun for computing the expectation of ranom clusterings. This expectation value can inicate whether the outcome of an experiment was likely to happen by chance or not. First, we explain how to compute the value for a ranom pick of pages from a set of ocuments. Then we aress how a growing number of ranom picks influences this value. A. One Ranom Pick The probability istribution for the ranom pick scenario is a hypergeometric istribution: Given a finite set of items (ocuments) where some are goo (relevant) an some are
3 ba (irrelevant), we raw without replacement. Suppose we have a set of items consisting of g goo items an b ba items. If we raw exactly items, the probability for rawing a number i of goo items is given by: ( g b ) i)( P (x = i) = i=0 ( g+b i ( g+b ). (4) The corresponing expecte value for this probability amounts to: ( ( g b )) µ () = i i)( i ) = g. (5) g + b Writing this expectation as a function of, the total number of items rawn from the set of available items, shall emphasize that, if g an b are fixe, it only epens on how many items are rawn. B. Multiple Ranom Picks The next step is to analyze to which extent the methoology of first generating multiple clusters an then selecting the best one changes the expecte value of goo items. Therefore, we nee to express the likelihoo of an outcome of a certain number of occurrences of relevant pages with regar to the number an size of clusters. Suppose we create c clusters where each cluster contains items. Since the optimal cluster methoology selects the best cluster from a set of clusters, we nee to escribe the probability that a number of i relevant pages is the maximum number of relevant pages in the set of c clusters an that i really occurs. It is given by: B (x = i,, c) P (x = i, c, ) =, (6) A (, c) where A is the total number of ifferent outcomes of choosing c times the number of items an where B is the number of ifferent ways of choosing these items with i being the maximum number of goo items in the set of c clusters an with i really occurring. A is given by: ( ) c g + b A (, c) =, (7) while B is given by: B (x = i,, c) = c j=1 ( ) c C (i, ) j D (i, ) c j, (8) j where j enotes the number of clusters that actually contain i goo items. The first factor in the sum enotes the number of ways of istributing these j goo items over the clusters c. C is the number of ways of rawing the i goo items, while D counts the number of ways of rawing less then i goo items. C an D are given by: an C (i, ) = k=0 ( g i )( b i ) i 1 ( )( ) g b D (i, ) = k k (9) (10) Expecte number of goo ocuments in best cluster c (Number of clusters) Fig. 2. Expectation values obtaine from applying optimal search to ranomly generate ocument clusters. respectively. The expectation for multiple ranom raws is given by: µ all (, c) = xp (x, c, ) (11) x=0 which only epens on the number of clusters an the number of pages in each cluster. C. A Ranom Cluster Example Given the result in (11), let us consier a simple example of how the optimal cluster evaluation metho can istort the evaluation results for a information retrieval. Suppose there are a 100 ocuments, 10 of which are relevant. From these ocuments we always raw ten ocuments. Fig. 2 shows the expecte number of relevant ocuments that the best raw (optimal cluster) contains in relation to the number of times that we raw. Note that a small number of clusters alreay yiels quite a large number of relevant ocuments: Selecting the best cluster out of five ranom clusters yiels an expecte number of relevant ocuments that is more than twice as high than that of a single raw. Alreay this simple experiment unerlines how important it is either to report etaile information on the experimental statistics (which unfortunately is not common practice) or to use an evaluation measure which incorporates them. IV. A NEW MEASURE FOR EVALUATION OF INFORMATION RETRIEVAL SYSTEMS Recall an precision have been wiely employe by the information retrieval community in orer to evaluate information retrieval systems. We propose new measures calle absolute precision an absolute recall that incorporate the knowlege of the expecte value of a ranom retrieval with similar settings for a given scenario an experiment. We efine them as: P abs (g ret,g exp,r)= g ret g exp (12) r R abs (g ret,g exp,g)= g ret g exp (13) g
4 With g exp, the number of relevant ocuments that can be expecte to be retrieve by a ranom retrieval. Therefore, if the information retrieval system oes not employ ocument clustering, the value of g exp can be compute by (5). For a system that oes employ ocument clustering methos, (11) can be use to compute this value if the optimal cluster search is use for evaluation. These new values for precision an recall can then be use to compute the absolute effectiveness by slightly ajusting (3): E abs = E (P abs,r abs ) (14) Other measures base on recall an precision can be erive in a similar fashion. Note that P abs an R abs can be negative given that a ranom retrieval performe better than an actual experiment. Therefore, the values of E abs will be larger than 1 for these cases. V. EXPERIMENTS In this section we will employ the theoretical moel for computing expecte values for clustering ocuments an compare them to the results reporte by Tombros et al. [9] by using the new evaluation measure introuce above. A. Experimental Settings There are several ocument collections an cluster algorithms evaluate by Tombros et al., but for simplicity we shall consier only one collection an one algorithm. Since statistics such as average cluster size are not being reporte for every combination of algorithms an ocument collection, we will focus on the WSJ text corpora 1 an the group average cluster algorithm for which these statistics are most complete. The group average cluster algorithm is also the one which performe best in [9]. For computing the expecte values of an optimal cluster evaluation, we nee to know the number of ocuments n that are clustere an the number of relevant ocuments g among these ocuments. Furthermore, we nee to know the average cluster size an the average number of generate clusters c. Unfortunately, Tombros et al. only report the first three of these values while the average number of generate clusters is missing. Only for their experiments with the full LISA collection 2 (6004 Documents) o they report the number of clusters(6003). In orer to be able to compute the expecte values for the experiments we assume a relatively low number of generate clusters for each value of n. Since we nee integer values for g an, we roun both values own to the next smaller integer. Since the expectation in (11) is growing monotonously in these values, this will result in values that are slightly smaller than the real expecte values for the experiments, favoring the results in [9]. The actual values for the settings in each experiment are shown in Table I. 1 The Wall Street Journal (WSJ) text collection is part of the Tipster corpus which can be obtaine from 2 A collection of abstracts of the Library an Information Science Abstracts from 1982 TABLE I EXPERIMENTAL SETTINGS scenario number number generate average number retrieve relevant clusters size B. Experimental Results With the above settings we compute the expecte number of relevant ocuments in the optimal cluster for a ranom experiment. These values are use to compute the absolute effectiveness. For each scenario, we give the effectiveness values (the smaller the better) reporte by Tombros et al. as well as the values for the absolute effectiveness. TableII compares the values reporte in their experiments to the values that can be expecte with ranomly generate clusters. Note that for small values of n even though the effectiveness reporte is relatively low the absolute effectiveness is not. In fact, the absolute effectiveness for the value of n = 100 top ranking ocuments is higher than one, inicating that ranomly generating a similar number of cluster with a similar size yiels better results than the results reporte by Tombros et al. One of the goals of Tombros et al. was to etermine the importance of the parameter n for the retrieval. They conclue that this number has very little impact on the achieve effectiveness. However, accoring to our experiments it appears that one shoul avoi making any statement about the results achieve with small values of n, since the employe cluster algorithms i not manage to prouce a significantly goo cluster structures. There are several issues concerning the question whether this comparison is fair or not. First, a source of error in the computation of ranom clusters is that we take the average cluster size to compute expecte values for relevant ocuments. Therefore, for setting the parameter β 1in the expression use for computing effectiveness (see (3)), the effectiveness reporte in [9] will always show a positive bias, since the size of optimal clusters changes with this parameter (as acknowlege by the authors). Furthermore, we assume a hypergeometric istribution of ocuments in each cluster. This basically applies to cluster algorithms that allow ocuments to be inclue in more than one cluster at the same level of the hierarchy. The algorithms evaluate by Tombros et al. though are agglomerative hierarchical cluster algorithms, therefore a ocument is only inclue in one leaf noe. However, when Tombros et al. evalute cluster hierarchies they o not limit their search to leaf noes but also regar intermeiate noes, therefore ocuments actually o occur in more than one noe in the evaluate hierarchy.
5 TABLE II EXPERIMENTAL RESULTS scenario expecte number reporte absolute number of relevant effectiveness effectiveness ocuments β =0.5 β =1.0 β =2.0 β = VI. CONCLUSIONS AND FUTURE WORKS In this paper, we have shown that the commonly use methoology for evaluating the optimal cluster search approach to Information Retrieval may result in misleaing interpretations. Still, we consier evaluating every single cluster an selecting the one with the optimal score to be a vali tool for examining clustering approaches to retrieval, if, that is, one takes into account that certain observations are basically ue to the evaluation methoology. For instance, in a recent survey on the effectiveness of clusteringbase retrieval algorithms [9], the authors report that they i not fin a statistically significant variation in queryspecific cluster effectiveness for ifferent values of topranke ocuments. From the rigorous theoretical analysis presente in this paper, however, we can conclue that the algorithms they evaluate ha to perform poorly given the experimental setting an parameterization that was consiere. Concerning examples like this, our main contribution in this paper is therefore the efinition of a new measure for evaluating information retrieval systems that accounts for statistically expectable results by normalizing the performance measure with respect to these. In a series of experiments that analyze the impact of the number of clustere ocuments on the cluster effectiveness, we have shown that the use of this new measure yiels meaningful results. Compare to the figures obtaine from conventional precision an recall measures the absolute precision an absolute recall inicate, for instance, that no serious statement can be mae for small numbers of clustere ocuments. Since the influence of the number of top ranking ocuments that are clustere on the effectiveness of the clustering was not sufficiently analyze by [9], we will further investigate this influence base on the more appropriate performance measures which we introuce in this paper. Machine learning techniques that are employe for learning the parameters of a text cluster algorithm epen on an error function that inicates gooness of a certain set of parameters. We will compare the training results achieve by error functions base on conventional performance inicators to those achieve with error functions base on evaluation measures erive from our moel of expectation values for ranom clustering. REFERENCES [1] G. Kowalski, Information Retrieval Systems: Theory an Implementation, Kluwer, Norwell, MA, [2] M.T. Maybury, E., Intelligent Multimeia Information Retrieval, MIT Press, Cambrige, MA, [3] G. Salton, Introuction to Moern Information Retrieval, McGraw Hill, New York, NY, [4] C.J. van Rijsbergen, Information Retrieval, ButterworthHeinemann, Newton, MA, [5] G. Salton, Automatic Information Organization an Retrieval, Mc Graw Hill, New York, US, [6] W.B. Croft an C.J. van Rijsbergen, Document clustering: An evaluation of some experiments with the cranfiel 1400 collection, Information Storage Retrieval, 11, pp , [7] O. Zamir an O. Etzioni, Web ocument clustering: A feasibility emonstration, in SIGIR. 1998, pp , ACM. [8] N. Jarine an C.J. Van Rijsbergen, The use of hierarchic clustering in information retrieval, Information Storage an Retrieval, vol.7, pp , [9] A. Tombros, R. Villa, an C. J. van Rijsbergen, The effectiveness of queryspecific hierarchic clustering in information retrieval, Inf. Process. Manage, vol. 38, no. 4, pp , [10] A.K. Jain an R.C. Dubes, Algorithms for Clustering Data, Prentice Hall, Upper Sale River, NJ, [11] K. Kummamuru, R. Lotlikar, S. Roy, K. Singal, an R. Krishnapuram, A hierarchical monothetic ocument clustering algorithm for summarization an browsing search results, in Proc. Int. Conf. on Worl Wie Web. 2004, pp , ACM. [12] M.A. Hearst an J.O. Peersen, Reexamining the cluster hypothesis: Scatter/gather on retrieval results, in Proc. SIGIR , pp , ACM. [13] C. Carpineto an G. Romano, Exploiting the potential of concept lattices for information retrieval with CREDO, J. UCS, vol. 10, no. 8, pp , [14] H. Schutze an C. Silverstein, Projections for efficient ocument clustering, in Proc. SIGIR97, 1997, Classification Methos, pp
Detecting Possibly Fraudulent or ErrorProne Survey Data Using Benford s Law
Detecting Possibly Frauulent or ErrorProne Survey Data Using Benfor s Law Davi Swanson, Moon Jung Cho, John Eltinge U.S. Bureau of Labor Statistics 2 Massachusetts Ave., NE, Room 3650, Washington, DC
More informationMath 230.01, Fall 2012: HW 1 Solutions
Math 3., Fall : HW Solutions Problem (p.9 #). Suppose a wor is picke at ranom from this sentence. Fin: a) the chance the wor has at least letters; SOLUTION: All wors are equally likely to be chosen. The
More informationOpen World Face Recognition with Credibility and Confidence Measures
Open Worl Face Recognition with Creibility an Confience Measures Fayin Li an Harry Wechsler Department of Computer Science George Mason University Fairfax, VA 22030 {fli, wechsler}@cs.gmu.eu Abstract.
More informationDigital barrier option contract with exponential random time
IMA Journal of Applie Mathematics Avance Access publishe June 9, IMA Journal of Applie Mathematics ) Page of 9 oi:.93/imamat/hxs3 Digital barrier option contract with exponential ranom time Doobae Jun
More informationData Center Power System Reliability Beyond the 9 s: A Practical Approach
Data Center Power System Reliability Beyon the 9 s: A Practical Approach Bill Brown, P.E., Square D Critical Power Competency Center. Abstract Reliability has always been the focus of missioncritical
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS Contents 1. Moment generating functions 2. Sum of a ranom number of ranom variables 3. Transforms
More informationCrossOver Analysis Using TTests
Chapter 35 CrossOver Analysis Using ests Introuction his proceure analyzes ata from a twotreatment, twoperio (x) crossover esign. he response is assume to be a continuous ranom variable that follows
More informationModelling and Resolving Software Dependencies
June 15, 2005 Abstract Many Linux istributions an other moern operating systems feature the explicit eclaration of (often complex) epenency relationships between the pieces of software
More informationUnsteady Flow Visualization by Animating EvenlySpaced Streamlines
EUROGRAPHICS 2000 / M. Gross an F.R.A. Hopgoo Volume 19, (2000), Number 3 (Guest Eitors) Unsteay Flow Visualization by Animating EvenlySpace Bruno Jobar an Wilfri Lefer Université u Littoral Côte Opale,
More informationState of Louisiana Office of Information Technology. Change Management Plan
State of Louisiana Office of Information Technology Change Management Plan Table of Contents Change Management Overview Change Management Plan Key Consierations Organizational Transition Stages Change
More informationView Synthesis by Image Mapping and Interpolation
View Synthesis by Image Mapping an Interpolation Farris J. Halim Jesse S. Jin, School of Computer Science & Engineering, University of New South Wales Syney, NSW 05, Australia Basser epartment of Computer
More informationStochastic Modeling of MEMS Inertial Sensors
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 10, No Sofia 010 Stochastic Moeling of MEMS Inertial Sensors Petko Petkov, Tsonyo Slavov Department of Automatics, Technical
More informationLecture 8: Expanders and Applications
Lecture 8: Expaners an Applications Topics in Complexity Theory an Pseuoranomness (Spring 013) Rutgers University Swastik Kopparty Scribes: Amey Bhangale, Mrinal Kumar 1 Overview In this lecture, we will
More informationOn Adaboost and Optimal Betting Strategies
On Aaboost an Optimal Betting Strategies Pasquale Malacaria 1 an Fabrizio Smerali 1 1 School of Electronic Engineering an Computer Science, Queen Mary University of Lonon, Lonon, UK Abstract We explore
More informationJON HOLTAN. if P&C Insurance Ltd., Oslo, Norway ABSTRACT
OPTIMAL INSURANCE COVERAGE UNDER BONUSMALUS CONTRACTS BY JON HOLTAN if P&C Insurance Lt., Oslo, Norway ABSTRACT The paper analyses the questions: Shoul or shoul not an iniviual buy insurance? An if so,
More informationThe oneyear nonlife insurance risk
The oneyear nonlife insurance risk Ohlsson, Esbjörn & Lauzeningks, Jan Abstract With few exceptions, the literature on nonlife insurance reserve risk has been evote to the ultimo risk, the risk in the
More informationA Comparison of Performance Measures for Online Algorithms
A Comparison of Performance Measures for Online Algorithms Joan Boyar 1, Sany Irani 2, an Kim S. Larsen 1 1 Department of Mathematics an Computer Science, University of Southern Denmark, Campusvej 55,
More informationCHAPTER 5 : CALCULUS
Dr Roger Ni (Queen Mary, University of Lonon)  5. CHAPTER 5 : CALCULUS Differentiation Introuction to Differentiation Calculus is a branch of mathematics which concerns itself with change. Irrespective
More informationFirewall Design: Consistency, Completeness, and Compactness
C IS COS YS TE MS Firewall Design: Consistency, Completeness, an Compactness Mohame G. Goua an XiangYang Alex Liu Department of Computer Sciences The University of Texas at Austin Austin, Texas 787121188,
More informationFactoring Dickson polynomials over finite fields
Factoring Dickson polynomials over finite fiels Manjul Bhargava Department of Mathematics, Princeton University. Princeton NJ 08544 manjul@math.princeton.eu Michael Zieve Department of Mathematics, University
More informationA Case Study of Applying SOM in Market Segmentation of Automobile Insurance Customers
International Journal of Database Theory an Application, pp.2536 http://x.oi.org/10.14257/ijta.2014.7.1.03 A Case Stuy of Applying SOM in Market Segmentation of Automobile Insurance Customers Vahi Golmah
More informationUCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Chapter 9 Paired Data. Paired data. Paired data
UCLA STAT 3 Introuction to Statistical Methos for the Life an Health Sciences Instructor: Ivo Dinov, Asst. Prof. of Statistics an Neurology Chapter 9 Paire Data Teaching Assistants: Jacquelina Dacosta
More informationA BlameBased Approach to Generating Proposals for Handling Inconsistency in Software Requirements
International Journal of nowlege an Systems Science, 3(), 7, JanuaryMarch 0 A lamease Approach to Generating Proposals for Hanling Inconsistency in Software Requirements eian Mu, Peking University,
More informationPurpose of the Experiments. Principles and Error Analysis. ε 0 is the dielectric constant,ε 0. ε r. = 8.854 10 12 F/m is the permittivity of
Experiments with Parallel Plate Capacitors to Evaluate the Capacitance Calculation an Gauss Law in Electricity, an to Measure the Dielectric Constants of a Few Soli an Liqui Samples Table of Contents Purpose
More informationRobust Reading of Ambiguous Writing
In submission; please o not istribute. Abstract A given entity, representing a person, a location or an organization, may be mentione in text in multiple, ambiguous ways. Unerstaning natural language requires
More informationMSc. Econ: MATHEMATICAL STATISTICS, 1995 MAXIMUMLIKELIHOOD ESTIMATION
MAXIMUMLIKELIHOOD ESTIMATION The General Theory of ML Estimation In orer to erive an ML estimator, we are boun to make an assumption about the functional form of the istribution which generates the
More informationOption Pricing for Inventory Management and Control
Option Pricing for Inventory Management an Control Bryant Angelos, McKay Heasley, an Jeffrey Humpherys Abstract We explore the use of option contracts as a means of managing an controlling inventories
More informationOptimal Control Policy of a Production and Inventory System for multiproduct in Segmented Market
RATIO MATHEMATICA 25 (2013), 29 46 ISSN:15927415 Optimal Control Policy of a Prouction an Inventory System for multiprouct in Segmente Market Kuleep Chauhary, Yogener Singh, P. C. Jha Department of Operational
More information10.2 Systems of Linear Equations: Matrices
SECTION 0.2 Systems of Linear Equations: Matrices 7 0.2 Systems of Linear Equations: Matrices OBJECTIVES Write the Augmente Matrix of a System of Linear Equations 2 Write the System from the Augmente Matrix
More informationA Data Placement Strategy in Scientific Cloud Workflows
A Data Placement Strategy in Scientific Clou Workflows Dong Yuan, Yun Yang, Xiao Liu, Jinjun Chen Faculty of Information an Communication Technologies, Swinburne University of Technology Hawthorn, Melbourne,
More informationCh 10. Arithmetic Average Options and Asian Opitons
Ch 10. Arithmetic Average Options an Asian Opitons I. Asian Option an the Analytic Pricing Formula II. Binomial Tree Moel to Price Average Options III. Combination of Arithmetic Average an Reset Options
More informationProcedure to Measure the Data Transmission Speed in Mobile Networks in Accordance with the LTE Standard (Methodical procedure)
Prague, 15 August 2013 Proceure to Measure the Data Transmission Spee in Mobile Networks in Accorance with the LTE Stanar (Methoical proceure Publishe in connection with the Tener for the awar of the rights
More informationDECISION SUPPORT SYSTEM FOR MANAGING EDUCATIONAL CAPACITY UTILIZATION IN UNIVERSITIES
DECISION SUPPORT SYSTEM OR MANAGING EDUCATIONAL CAPACITY UTILIZATION IN UNIVERSITIES Svetlana Vinnik 1, Marc H. Scholl 2 Abstract Decisionmaking in the fiel of acaemic planning involves extensive analysis
More informationSensor Network Localization from Local Connectivity : Performance Analysis for the MDSMAP Algorithm
Sensor Network Localization from Local Connectivity : Performance Analysis for the MDSMAP Algorithm Sewoong Oh an Anrea Montanari Electrical Engineering an Statistics Department Stanfor University, Stanfor,
More informationDETERMINING OPTIMAL STOCK LEVEL IN MULTIECHELON SUPPLY CHAINS
HUNGARIAN JOURNA OF INDUSTRIA CHEMISTRY VESZPRÉM Vol. 39(1) pp. 107112 (2011) DETERMINING OPTIMA STOCK EVE IN MUTIECHEON SUPPY CHAINS A. KIRÁY 1, G. BEVÁRDI 2, J. ABONYI 1 1 University of Pannonia, Department
More informationDow Jones Sustainability Group Index: A Global Benchmark for Corporate Sustainability
www.corporateenvstrategy.com Sustainability Inex Dow Jones Sustainability Group Inex: A Global Benchmark for Corporate Sustainability Ivo Knoepfel Increasingly investors are iversifying their portfolios
More informationA Universal Sensor Control Architecture Considering Robot Dynamics
International Conference on Multisensor Fusion an Integration for Intelligent Systems (MFI2001) BaenBaen, Germany, August 2001 A Universal Sensor Control Architecture Consiering Robot Dynamics Frierich
More informationDEVELOPMENT OF A BRAKING MODEL FOR SPEED SUPERVISION SYSTEMS
DEVELOPMENT OF A BRAKING MODEL FOR SPEED SUPERVISION SYSTEMS Paolo Presciani*, Monica Malvezzi #, Giuseppe Luigi Bonacci +, Monica Balli + * FS Trenitalia Unità Tecnologie Materiale Rotabile Direzione
More informationThe most common model to support workforce management of telephone call centers is
Designing a Call Center with Impatient Customers O. Garnett A. Manelbaum M. Reiman Davison Faculty of Inustrial Engineering an Management, Technion, Haifa 32000, Israel Davison Faculty of Inustrial Engineering
More informationJitter effects on Analog to Digital and Digital to Analog Converters
Jitter effects on Analog to Digital an Digital to Analog Converters Jitter effects copyright 1999, 2000 Troisi Design Limite Jitter One of the significant problems in igital auio is clock jitter an its
More informationUnbalanced Power Flow Analysis in a Micro Grid
International Journal of Emerging Technology an Avance Engineering Unbalance Power Flow Analysis in a Micro Gri Thai Hau Vo 1, Mingyu Liao 2, Tianhui Liu 3, Anushree 4, Jayashri Ravishankar 5, Toan Phung
More informationTowards a Framework for Enterprise Architecture Frameworks Comparison and Selection
Towars a Framework for Enterprise Frameworks Comparison an Selection Saber Aballah Faculty of Computers an Information, Cairo University Saber_aballah@hotmail.com Abstract A number of Enterprise Frameworks
More informationRobust Reading: Identification and Tracing of Ambiguous Names
Robust Reaing: Ientification an Tracing of Ambiguous Names Xin Li Paul Morie Dan Roth Department of Computer Science University of Illinois, Urbana, IL 61801 {xli1,morie,anr}@uiuc.eu Abstract A given entity,
More informationImage compression predicated on recurrent iterated function systems **
1 Image compression preicate on recurrent iterate function systems ** W. Metzler a, *, C.H. Yun b, M. Barski a a Faculty of Mathematics University of Kassel, Kassel, F. R. Germany b Faculty of Mathematics
More informationWeb Document Clustering
Web Document Clustering Oren Zamir and Oren Etzioni Department of Computer Science and Engineering University of Washington Seattle, WA 981952350 U.S.A. {zamir, etzioni}@cs.washington.edu Abstract Users
More informationHardness Evaluation of Polytetrafluoroethylene Products
ECNDT 2006  Poster 111 Harness Evaluation of Polytetrafluoroethylene Proucts T.A.KODINTSEVA, A.M.KASHKAROV, V.A.KALOSHIN NPO Energomash Khimky, Russia A.P.KREN, V.A.RUDNITSKY, Institute of Applie Physics
More informationParameterized Algorithms for dhitting Set: the Weighted Case Henning Fernau. Univ. Trier, FB 4 Abteilung Informatik 54286 Trier, Germany
Parameterize Algorithms for Hitting Set: the Weighte Case Henning Fernau Trierer Forschungsberichte; Trier: Technical Reports Informatik / Mathematik No. 086, July 2008 Univ. Trier, FB 4 Abteilung Informatik
More informationImplementing IP Traceback in the Internet An ISP Perspective
Implementing IP Traceback in the Internet An ISP Perspective Dong Wei, Stuent Member, IEEE, an Nirwan Ansari, Senior Member, IEEE AbstractDenialofService (DoS) attacks consume the resources of remote
More informationWeek 4  Linear Demand and Supply Curves
Week 4  Linear Deman an Supply Curves November 26, 2007 1 Suppose that we have a linear eman curve efine by the expression X D = A bp X an a linear supply curve given by X S = C + P X, where b > 0 an
More informationLecture 17: Implicit differentiation
Lecture 7: Implicit ifferentiation Nathan Pflueger 8 October 203 Introuction Toay we iscuss a technique calle implicit ifferentiation, which provies a quicker an easier way to compute many erivatives we
More informationMathematics. Circles. hsn.uk.net. Higher. Contents. Circles 119 HSN22400
hsn.uk.net Higher Mathematics UNIT OUTCOME 4 Circles Contents Circles 119 1 Representing a Circle 119 Testing a Point 10 3 The General Equation of a Circle 10 4 Intersection of a Line an a Circle 1 5 Tangents
More informationA Generalization of Sauer s Lemma to Classes of LargeMargin Functions
A Generalization of Sauer s Lemma to Classes of LargeMargin Functions Joel Ratsaby University College Lonon Gower Street, Lonon WC1E 6BT, Unite Kingom J.Ratsaby@cs.ucl.ac.uk, WWW home page: http://www.cs.ucl.ac.uk/staff/j.ratsaby/
More informationProfessional Level Options Module, Paper P4(SGP)
Answers Professional Level Options Moule, Paper P4(SGP) Avance Financial Management (Singapore) December 2007 Answers Tutorial note: These moel answers are consierably longer an more etaile than woul be
More informationIdentification and Tracing of Ambiguous Names: Discriminative and Generative Approaches
Ientification an Tracing of Ambiguous Names: Discriminative an Generative Approaches Xin Li Paul Morie Dan Roth Department of Computer Science University of Illinois, Urbana, IL 61801 {xli1,morie,anr}@uiuc.eu
More informationStress Concentration Factors of Various Adjacent Holes Configurations in a Spherical Pressure Vessel
5 th Australasian Congress on Applie Mechanics, ACAM 2007 1012 December 2007, Brisbane, Australia Stress Concentration Factors of Various Ajacent Holes Configurations in a Spherical Pressure Vessel Kh.
More informationLearningBased Summarisation of XML Documents
LearningBase Summarisation of XML Documents Massih R. Amini Anastasios Tombros Nicolas Usunier Mounia Lalmas {name}@poleia.lip6.fr {first name}@cs.qmul.ac.uk University Pierre an Marie Curie Queen Mary,
More informationLagrangian and Hamiltonian Mechanics
Lagrangian an Hamiltonian Mechanics D.G. Simpson, Ph.D. Department of Physical Sciences an Engineering Prince George s Community College December 5, 007 Introuction In this course we have been stuying
More informationAn Information Retrieval using weighted Index Terms in Natural Language document collections
Internet and Information Technology in Modern Organizations: Challenges & Answers 635 An Information Retrieval using weighted Index Terms in Natural Language document collections Ahmed A. A. Radwan, Minia
More informationRisk Adjustment for Poker Players
Risk Ajustment for Poker Players William Chin DePaul University, Chicago, Illinois Marc Ingenoso Conger Asset Management LLC, Chicago, Illinois September, 2006 Introuction In this article we consier risk
More informationImproved division by invariant integers
1 Improve ivision by invariant integers Niels Möller an Torbjörn Granlun Abstract This paper consiers the problem of iviing a twowor integer by a singlewor integer, together with a few extensions an
More information6.3 Microbial growth in a chemostat
6.3 Microbial growth in a chemostat The chemostat is a wielyuse apparatus use in the stuy of microbial physiology an ecology. In such a chemostat also known as continuousflow culture), microbes such
More informationStock Market Value Prediction Using Neural Networks
Stock Market Value Preiction Using Neural Networks Mahi Pakaman Naeini IT & Computer Engineering Department Islamic Aza University Paran Branch email: m.pakaman@ece.ut.ac.ir Hamireza Taremian Engineering
More informationMinimizing Makespan in Flow Shop Scheduling Using a Network Approach
Minimizing Makespan in Flow Shop Scheuling Using a Network Approach Amin Sahraeian Department of Inustrial Engineering, Payame Noor University, Asaluyeh, Iran 1 Introuction Prouction systems can be ivie
More informationM147 Practice Problems for Exam 2
M47 Practice Problems for Exam Exam will cover sections 4., 4.4, 4.5, 4.6, 4.7, 4.8, 5., an 5.. Calculators will not be allowe on the exam. The first ten problems on the exam will be multiple choice. Work
More informationThroughputScheduler: Learning to Schedule on Heterogeneous Hadoop Clusters
ThroughputScheuler: Learning to Scheule on Heterogeneous Haoop Clusters Shehar Gupta, Christian Fritz, Bob Price, Roger Hoover, an Johan e Kleer Palo Alto Research Center, Palo Alto, CA, USA {sgupta, cfritz,
More informationImproving Direct Marketing Profitability with Neural Networks
Volume 9 o.5, September 011 Improving Direct Marketing Profitability with eural etworks Zaiyong Tang Salem State University Salem, MA 01970 ABSTRACT Data mining in irect marketing aims at ientifying the
More informationENSURING POSITIVENESS OF THE SCALED DIFFERENCE CHISQUARE TEST STATISTIC ALBERT SATORRA UNIVERSITAT POMPEU FABRA UNIVERSITY OF CALIFORNIA
PSYCHOMETRIKA VOL. 75, NO. 2, 243 248 JUNE 200 DOI: 0.007/S336009935Y ENSURING POSITIVENESS OF THE SCALED DIFFERENCE CHISQUARE TEST STATISTIC ALBERT SATORRA UNIVERSITAT POMPEU FABRA PETER M. BENTLER
More informationMathematics Review for Economists
Mathematics Review for Economists by John E. Floy University of Toronto May 9, 2013 This ocument presents a review of very basic mathematics for use by stuents who plan to stuy economics in grauate school
More informationSeeing the Unseen: Revealing Mobile Malware Hidden Communications via Energy Consumption and Artificial Intelligence
Seeing the Unseen: Revealing Mobile Malware Hien Communications via Energy Consumption an Artificial Intelligence Luca Caviglione, Mauro Gaggero, JeanFrançois Lalane, Wojciech Mazurczyk, Marcin Urbanski
More information! # % & ( ) +,,),. / 0 1 2 % ( 345 6, & 7 8 4 8 & & &&3 6
! # % & ( ) +,,),. / 0 1 2 % ( 345 6, & 7 8 4 8 & & &&3 6 9 Quality signposting : the role of online information prescription in proviing patient information Liz Brewster & Barbara Sen Information School,
More informationOptimal Energy Commitments with Storage and Intermittent Supply
Submitte to Operations Research manuscript OPRE200909406 Optimal Energy Commitments with Storage an Intermittent Supply Jae Ho Kim Department of Electrical Engineering, Princeton University, Princeton,
More informationAn intertemporal model of the real exchange rate, stock market, and international debt dynamics: policy simulations
This page may be remove to conceal the ientities of the authors An intertemporal moel of the real exchange rate, stock market, an international ebt ynamics: policy simulations Saziye Gazioglu an W. Davi
More informationTraffic Delay Studies at Signalized Intersections with Global Positioning System Devices
Traffic Delay Stuies at Signalize Intersections with Global Positioning System Devices THIS FEATURE PRESENTS METHODS FOR APPLYING GPS DEVICES TO STUDY TRAFFIC DELAYS AT SIGNALIZED INTERSECTIONS. MOST OF
More informationGPRS performance estimation in GSM circuit switched services and GPRS shared resource systems *
GPRS performance estimation in GSM circuit switche serices an GPRS share resource systems * Shaoji i an SenGusta Häggman Helsinki Uniersity of Technology, Institute of Raio ommunications, ommunications
More informationUsing research evidence in mental health: userrating and focus group study of clinicians preferences for a new clinical questionanswering service
DOI: 10.1111/j.14711842.2008.00833.x Using research evience in mental health: userrating an focus group stuy of clinicians preferences for a new clinical questionanswering service Elizabeth A. Barley*,
More informationSensitivity Analysis of Nonlinear Performance with Probability Distortion
Preprints of the 19th Worl Congress The International Feeration of Automatic Control Cape Town, South Africa. August 2429, 214 Sensitivity Analysis of Nonlinear Performance with Probability Distortion
More information1 HighDimensional Space
Contents HighDimensional Space. Properties of HighDimensional Space..................... 4. The HighDimensional Sphere......................... 5.. The Sphere an the Cube in Higher Dimensions...........
More informationMODELLING OF TWO STRATEGIES IN INVENTORY CONTROL SYSTEM WITH RANDOM LEAD TIME AND DEMAND
art I. robobabilystic Moels Computer Moelling an New echnologies 27 Vol. No. 23 ransport an elecommunication Institute omonosova iga V9 atvia MOEING OF WO AEGIE IN INVENOY CONO YEM WIH ANOM EA IME AN
More informationRUNESTONE, an International Student Collaboration Project
RUNESTONE, an International Stuent Collaboration Project Mats Daniels 1, Marian Petre 2, Vicki Almstrum 3, Lars Asplun 1, Christina Björkman 1, Carl Erickson 4, Bruce Klein 4, an Mary Last 4 1 Department
More information. A UML/MARTE DETECTION OF STARVATION AND DEADLOCKS AT THE DESIGN LEVEL IN CONCURRENT SYSTEM
. A UML/MARTE DETECTION OF STARVATION AND DEADLOCKS AT THE DESIGN LEVEL IN CONCURRENT SYSTEM C.Revath Department of computer science an Engineering, Karunya University, comibatore,inia.rcswathiathi@gmail.com
More informationHOST SELECTION METHODOLOGY IN CLOUD COMPUTING ENVIRONMENT
International Journal of Avance Research in Computer Engineering & Technology (IJARCET) HOST SELECTION METHODOLOGY IN CLOUD COMPUTING ENVIRONMENT Pawan Kumar, Pijush Kanti Dutta Pramanik Computer Science
More informationIRMA InfraRed Motion Analyzer Automatic Passenger Counter (APC) System. Technical Documentation of the Fourth Generation
IRMA InfraRe Motion Analyzer Automatic Paenger Counter (APC) Sytem Technical Documentation of the Fourth Generation Part: General (Accuracy) Englih Verion pffile of the online ocumentation ate on the
More informationRESEARCH METHODS IN ECONOMICS
RESEARCH METHODS IN ECONOMICS Economics an Business Strategy Masters Programme Economics Department, University of Piraeus Instructor: C. Economiou Email: economiou@unipi.gr URL: http://web.mac.com/economiou
More informationPerformance And Analysis Of Risk Assessment Methodologies In Information Security
International Journal of Computer Trens an Technology (IJCTT) volume 4 Issue 10 October 2013 Performance An Analysis Of Risk Assessment ologies In Information Security K.V.D.Kiran #1, Saikrishna Mukkamala
More informationPhysics 112 Homework 4 (solutions) (2004 Fall) Solutions to Homework Questions 4 " R = 9.00 V " 72.0 # = 4.92!
Solutions to Homework Questions 4 Chapt18, Problem1: A battery having an emf of 9.00 V elivers 117 ma when connecte to a 72.0! loa. Determine the internal resistance of the battery. From!V = R+ r ),
More informationProduct Differentiation for SoftwareasaService Providers
University of Augsburg Prof. Dr. Hans Ulrich Buhl Research Center Finance & Information Management Department of Information Systems Engineering & Financial Management Discussion Paper WI99 Prouct Differentiation
More informationIntegral Regular Truncated Pyramids with Rectangular Bases
Integral Regular Truncate Pyramis with Rectangular Bases Konstantine Zelator Department of Mathematics 301 Thackeray Hall University of Pittsburgh Pittsburgh, PA 1560, U.S.A. Also: Konstantine Zelator
More informationThese draft test specifications and sample items and other materials are just that drafts. As such, they will systematically evolve over time.
t h e reesigne sat These raft test specifications an sample items an other materials are just that rafts. As such, they will systematically evolve over time. These sample items are meant to illustrate
More informationy or f (x) to determine their nature.
Level C5 of challenge: D C5 Fining stationar points of cubic functions functions Mathematical goals Starting points Materials require Time neee To enable learners to: fin the stationar points of a cubic
More informationYou Can t Not Choose: Embracing the Role of Choice in Ecological Restoration
EDITORIAL OPINION You Can t Not Choose: Embracing the Role of Choice in Ecological Restoration Stuart K. Allison 1,2 Abstract From the moment of its inception, human choice about how to treat the environment
More informationMathematical Models of Therapeutical Actions Related to Tumour and Immune System Competition
Mathematical Moels of Therapeutical Actions Relate to Tumour an Immune System Competition Elena De Angelis (1 an PierreEmmanuel Jabin (2 (1 Dipartimento i Matematica, Politecnico i Torino Corso Duca egli
More informationA New Pricing Model for Competitive Telecommunications Services Using Congestion Discounts
A New Pricing Moel for Competitive Telecommunications Services Using Congestion Discounts N. Keon an G. Ananalingam Department of Systems Engineering University of Pennsylvania Philaelphia, PA 191046315
More informationSupporting Adaptive Workflows in Advanced Application Environments
Supporting aptive Workflows in vance pplication Environments Manfre Reichert, lemens Hensinger, Peter Daam Department Databases an Information Systems University of Ulm, D89069 Ulm, Germany Email: {reichert,
More informationOptimized Capacity Bounds for the HalfDuplex Gaussian MIMO Relay Channel
Optimize Capacity Bouns for the HalfDuplex Gaussian MIMO elay Channel Lennart Geres an Wolfgang Utschick International ITG Workshop on mart Antennas (WA) February 211 c 211 IEEE. Personal use of this
More information19.2. First Order Differential Equations. Introduction. Prerequisites. Learning Outcomes
First Orer Differential Equations 19.2 Introuction Separation of variables is a technique commonly use to solve first orer orinary ifferential equations. It is socalle because we rearrange the equation
More informationComparative study regarding the methods of interpolation
Recent Avances in Geoesy an Geomatics Engineering Comparative stuy regaring the methos of interpolation PAUL DANIEL DUMITRU, MARIN PLOPEANU, DRAGOS BADEA Department of Geoesy an Photogrammetry Faculty
More informationApplications of Global Positioning System in Traffic Studies. Yi Jiang 1
Applications of Global Positioning System in Traffic Stuies Yi Jiang 1 Introuction A Global Positioning System (GPS) evice was use in this stuy to measure traffic characteristics at highway intersections
More informationOn the Multivariate t Distribution
Technical report from Automatic Control at Linköpings universitet On the Multivariate t Distribution Michael Roth Division of Automatic Control Email: roth@isy.liu.se 7th April 03 Report no.: LiTHISYR3059
More informationOn some absolute positiveness bounds
Bull. Math. Soc. Sci. Math. Roumanie Tome 53(101) No. 3, 2010, 269 276 On some absolute positiveness bouns by Doru Ştefănescu Deicate to the memory of Laurenţiu Panaitopol (19402008) on the occasion of
More informationMeanValue Theorem (Several Variables)
MeanValue Theorem (Several Variables) 1 MeanValue Theorem (Several Variables) THEOREM THE MEANVALUE THEOREM (SEVERAL VARIABLES) If f is ifferentiable at each point of the line segment ab, then there
More information