A New Evaluation Measure for Information Retrieval Systems
|
|
- Millicent French
- 8 years ago
- Views:
Transcription
1 A New Evaluation Measure for Information Retrieval Systems Martin Mehlitz Christian Bauckhage Deutsche Telekom Laboratories Jérôme Kunegis Şahin Albayrak Abstract Some of the establishe approaches to evaluating text clustering algorithms for information retrieval show theoretical flaws. In this paper, we analyze these flaws an introuce a new evaluation measure to overcome them. Base on a simple yet rigorous mathematical analysis of the effect of certain parameters in cluster base retrieval, we show that certain conclusions rawn in the recent literature must be taken with a grain of salt. Our new measure, in contrast, accounts for statistical biases that have to be expecte accoring to our analysis. A series of experiments an a comparison with results reporte recently unerlines that this measure is a more suitable performance inicator that allows for more meaningful interpretations. I. INTRODUCTION Information retrieval is the iscipline within computer science that eals with the automate storage an retrieval of ocuments an ata [1], [2], [3], [4]. As the amount of ata available to moern information societies continues to grow rapily, it is not surprising to fin that information retrieval is a very active area of research. In this paper, we focus on the problems of text retrieval an text clustering where the latter has been propose as a technique for improving text retrieval [5]. In general, systems that apply text clustering either retrieve ocuments by automatically ranking clusters an then selecting ocuments from the highest ranking clusters [6] or by showing the cluster structure to the user who has to select a cluster whose ocuments are then isplaye [7]. The output of a system that automatically ranks clusters is a flat list of ocuments similar to that of a regular information retrieval system. Therefore, the evaluation of these systems is similar to the evaluation of regular information retrieval systems. Evaluating a system that shows a cluster structure to a user though requires other measures. A methoology that is often applie for evaluation of these systems is the optimal cluster search which selects the cluster with the optimal value in regar to the employe measure [8]. In this paper, we analyze the pitfalls of this evaluation methoology. Base on an appropriate mathematical moel for what is going on in cluster-base text retrieval, we reconsier results reporte in a recent paper which evaluates several cluster algorithms for query-specific clustering [9]. Our main contribution will be a new measure for evaluating information retrieval systems, which circumvents the shortcommings which we ientify in our analysis. The remainer of this paper is structure as follows: In section II, we will briefly review concepts applie in information retrieval. Section III will then present a mathematical analysis of how the optimal cluster search methoology affects precision an recall measures. Base on the results we erive for ranom clusterings, section IV will introuce a new evaluation inicator, which, in the experiments in section V, is compare to the classic evaluation measures. A summary an an outlook on future work will conclue this contribution. II. BACKGROUND AND RELATED WORK A. Clustering Documents for Information Retrieval Clustering is the process of grouping items in a way that items in a group are similar to each other an issimilar to the items in other groups [10]. In information retrieval often the problem for retrieving the esire ocuments given a short query lies in the ambiguity of the query. Terms in a query can occur in various ocuments an contexts an usually it is very ifficult to ientify the ocuments where a search term is use in the intene sense. Therefore, many search results in a result list of a classic information retrieval system are irrelevant. Clustering results for information retrieval is a way to circumvent this problem by structuring the (probably) very long list of search results. Of course, this will not avoi any irrelevant ocuments but if the cluster labels are intuitive enough, the user can irectly choose the cluster which contains the most relevant ocuments for his request. Figure 1 illustrates the process of ocument clustering. Starting with a user querying an information retrieval system with the phrase jaguar, a list of relevant ocuments is foun by matching strings in the search phrase with ocuments. While most search engines stop after the secon step (the actual retrieval of ocuments), text clustering procees by analyzing the ocuments in the result list in orer to group
2 Fig. 1. Query: Jaguar Doc1 ( join the Jaguar club car ) Doc2 ( car clubjaguar ) Doc3 (this weekens racejaguarriver) Doc4 (the Jaguar riverclub) Doc5 (saw animalbig catjaguarhabitat) Doc6 (Jaguarcar sale) Doc7 (animalsjaguar in its natural habitat) Doc8 (as I saw this car, the Jaguar ) Doc9 (Jaguaranimaljoin nature club) Doc10 (Jaguar on a Mac OS computer sale) Doc11 (on my computerjaguarsoftware) Car Doc1, Doc2 Doc3, Doc4 Doc6, Doc8 Animal Doc5, Doc7 Doc9, Computer Doc10, Doc 11, Diactic example of clustering ocuments for information retrieval similar ocuments. The user who querie for jaguar can then choose among the clusters an access the corresponing ocuments. B. Evaluation of Information Retrieval Systems The measures typically employe to evaluate an information retrieval system are precision an recall. Precision ( How many of the retrieve results are relevant? ) an recall ( How many of the relevant results are retrieve? ) are efine as: P (g ret,r)= g ret (1) r an R (g ret,g)= g ret g, (2) where r enotes the number of ocuments retrieve, g correspons to the number of relevant ocuments in the collection an g ret enotes the number of relevant ocuments among the returne ocuments. The effectiveness measure E is propose in [8] as a measure for the evaluation of information retrieval systems that are base on text clustering. E is efine as: ( β 2 +1 ) PR E (P, R) =1 β 2 P + R, (3) where β is a factor that etermines the relative importance of precision an recall where precision an recall values are base on the ocuments in the cluster. C. Evaluation of Document Cluster Information Retrieval Systems Base on the hypothesis that closely associate ocuments ten to be relevant to the same request [4] some information retrieval systems employ ocument clustering in orer to achieve improvement in retrieval of relevant ocuments. Especially hierarchical cluster methos are very popular for text clustering, since hierarchies of clusters are suppose to be relatively intuitive to browse. Research on using ocument clustering techniques for information retrieval resulte in algorithms like Suffix Tree Clustering [7] an the DisCover Algorithm [11] as well as in the esign of the Scatter/Gather System [12] an the CREDO System [13]. Usually, novel ocument clustering methos are compare to existing methos or to an inverte file search, base on recall an precision measures or some variants of these. A common evaluation methoology for such a comparison is the Optimal Cluster Search ([7], [8], [9], [12], [14]). This methoology evaluates every single cluster an then selects the cluster with the optimal value as a representative for the cluster hierarchy. Text clustering for information retrieval can be one either statically on the whole collection of ocuments [6], [8] or in a query-specific manner for a small number of top ranking ocuments returne by an inverte file search [7], [13]. In [9], Tombros, Villa an van Rijsbergen evaluate several cluster algorithms on ifferent ata sets with the goal of ientifying the extent to which the number n of top ranking ocuments that are clustere influences the retrieval of relevant ocuments. When creating ocument clusters, ocuments can either be limite to occur in only one leaf cluster of a hierarchy (which means they still appear in their parent clusters all the way to the root) or they are allowe to occur in multiple clusters. Note that forcing ocuments to appear in only one cluster is consiere to be artificial [7], [12] since ocuments can contain more than one topic an a goo cluster structure shoul reflect this. III. COMPUTING EXPECTED VALUES FOR RANDOM CLUSTERING When evaluating a text clustering algorithm for information retrieval with the optimal cluster evaluation methoology one has to consier the impact of this methoology on the achieve results. If an algorithm prouces a large number of clusters an the best one is selecte, then the critical point is, how likely it is to achieve such a result if the clusters were generate ranomly. In this section we review the mathematical backgroun for computing the expectation of ranom clusterings. This expectation value can inicate whether the outcome of an experiment was likely to happen by chance or not. First, we explain how to compute the value for a ranom pick of pages from a set of ocuments. Then we aress how a growing number of ranom picks influences this value. A. One Ranom Pick The probability istribution for the ranom pick scenario is a hypergeometric istribution: Given a finite set of items (ocuments) where some are goo (relevant) an some are
3 ba (irrelevant), we raw without replacement. Suppose we have a set of items consisting of g goo items an b ba items. If we raw exactly items, the probability for rawing a number i of goo items is given by: ( g b ) i)( P (x = i) = i=0 ( g+b i ( g+b ). (4) The corresponing expecte value for this probability amounts to: ( ( g b )) µ () = i i)( i ) = g. (5) g + b Writing this expectation as a function of, the total number of items rawn from the set of available items, shall emphasize that, if g an b are fixe, it only epens on how many items are rawn. B. Multiple Ranom Picks The next step is to analyze to which extent the methoology of first generating multiple clusters an then selecting the best one changes the expecte value of goo items. Therefore, we nee to express the likelihoo of an outcome of a certain number of occurrences of relevant pages with regar to the number an size of clusters. Suppose we create c clusters where each cluster contains items. Since the optimal cluster methoology selects the best cluster from a set of clusters, we nee to escribe the probability that a number of i relevant pages is the maximum number of relevant pages in the set of c clusters an that i really occurs. It is given by: B (x = i,, c) P (x = i, c, ) =, (6) A (, c) where A is the total number of ifferent outcomes of choosing c times the number of items an where B is the number of ifferent ways of choosing these items with i being the maximum number of goo items in the set of c clusters an with i really occurring. A is given by: ( ) c g + b A (, c) =, (7) while B is given by: B (x = i,, c) = c j=1 ( ) c C (i, ) j D (i, ) c j, (8) j where j enotes the number of clusters that actually contain i goo items. The first factor in the sum enotes the number of ways of istributing these j goo items over the clusters c. C is the number of ways of rawing the i goo items, while D counts the number of ways of rawing less then i goo items. C an D are given by: an C (i, ) = k=0 ( g i )( b i ) i 1 ( )( ) g b D (i, ) = k k (9) (10) Expecte number of goo ocuments in best cluster c (Number of clusters) Fig. 2. Expectation values obtaine from applying optimal search to ranomly generate ocument clusters. respectively. The expectation for multiple ranom raws is given by: µ all (, c) = xp (x, c, ) (11) x=0 which only epens on the number of clusters an the number of pages in each cluster. C. A Ranom Cluster Example Given the result in (11), let us consier a simple example of how the optimal cluster evaluation metho can istort the evaluation results for a information retrieval. Suppose there are a 100 ocuments, 10 of which are relevant. From these ocuments we always raw ten ocuments. Fig. 2 shows the expecte number of relevant ocuments that the best raw (optimal cluster) contains in relation to the number of times that we raw. Note that a small number of clusters alreay yiels quite a large number of relevant ocuments: Selecting the best cluster out of five ranom clusters yiels an expecte number of relevant ocuments that is more than twice as high than that of a single raw. Alreay this simple experiment unerlines how important it is either to report etaile information on the experimental statistics (which unfortunately is not common practice) or to use an evaluation measure which incorporates them. IV. A NEW MEASURE FOR EVALUATION OF INFORMATION RETRIEVAL SYSTEMS Recall an precision have been wiely employe by the information retrieval community in orer to evaluate information retrieval systems. We propose new measures calle absolute precision an absolute recall that incorporate the knowlege of the expecte value of a ranom retrieval with similar settings for a given scenario an experiment. We efine them as: P abs (g ret,g exp,r)= g ret g exp (12) r R abs (g ret,g exp,g)= g ret g exp (13) g
4 With g exp, the number of relevant ocuments that can be expecte to be retrieve by a ranom retrieval. Therefore, if the information retrieval system oes not employ ocument clustering, the value of g exp can be compute by (5). For a system that oes employ ocument clustering methos, (11) can be use to compute this value if the optimal cluster search is use for evaluation. These new values for precision an recall can then be use to compute the absolute effectiveness by slightly ajusting (3): E abs = E (P abs,r abs ) (14) Other measures base on recall an precision can be erive in a similar fashion. Note that P abs an R abs can be negative given that a ranom retrieval performe better than an actual experiment. Therefore, the values of E abs will be larger than 1 for these cases. V. EXPERIMENTS In this section we will employ the theoretical moel for computing expecte values for clustering ocuments an compare them to the results reporte by Tombros et al. [9] by using the new evaluation measure introuce above. A. Experimental Settings There are several ocument collections an cluster algorithms evaluate by Tombros et al., but for simplicity we shall consier only one collection an one algorithm. Since statistics such as average cluster size are not being reporte for every combination of algorithms an ocument collection, we will focus on the WSJ text corpora 1 an the group average cluster algorithm for which these statistics are most complete. The group average cluster algorithm is also the one which performe best in [9]. For computing the expecte values of an optimal cluster evaluation, we nee to know the number of ocuments n that are clustere an the number of relevant ocuments g among these ocuments. Furthermore, we nee to know the average cluster size an the average number of generate clusters c. Unfortunately, Tombros et al. only report the first three of these values while the average number of generate clusters is missing. Only for their experiments with the full LISA collection 2 (6004 Documents) o they report the number of clusters(6003). In orer to be able to compute the expecte values for the experiments we assume a relatively low number of generate clusters for each value of n. Since we nee integer values for g an, we roun both values own to the next smaller integer. Since the expectation in (11) is growing monotonously in these values, this will result in values that are slightly smaller than the real expecte values for the experiments, favoring the results in [9]. The actual values for the settings in each experiment are shown in Table I. 1 The Wall Street Journal (WSJ) text collection is part of the Tipster corpus which can be obtaine from 2 A collection of abstracts of the Library an Information Science Abstracts from 1982 TABLE I EXPERIMENTAL SETTINGS scenario number number generate average number retrieve relevant clusters size B. Experimental Results With the above settings we compute the expecte number of relevant ocuments in the optimal cluster for a ranom experiment. These values are use to compute the absolute effectiveness. For each scenario, we give the effectiveness values (the smaller the better) reporte by Tombros et al. as well as the values for the absolute effectiveness. TableII compares the values reporte in their experiments to the values that can be expecte with ranomly generate clusters. Note that for small values of n even though the effectiveness reporte is relatively low the absolute effectiveness is not. In fact, the absolute effectiveness for the value of n = 100 top ranking ocuments is higher than one, inicating that ranomly generating a similar number of cluster with a similar size yiels better results than the results reporte by Tombros et al. One of the goals of Tombros et al. was to etermine the importance of the parameter n for the retrieval. They conclue that this number has very little impact on the achieve effectiveness. However, accoring to our experiments it appears that one shoul avoi making any statement about the results achieve with small values of n, since the employe cluster algorithms i not manage to prouce a significantly goo cluster structures. There are several issues concerning the question whether this comparison is fair or not. First, a source of error in the computation of ranom clusters is that we take the average cluster size to compute expecte values for relevant ocuments. Therefore, for setting the parameter β 1in the expression use for computing effectiveness (see (3)), the effectiveness reporte in [9] will always show a positive bias, since the size of optimal clusters changes with this parameter (as acknowlege by the authors). Furthermore, we assume a hypergeometric istribution of ocuments in each cluster. This basically applies to cluster algorithms that allow ocuments to be inclue in more than one cluster at the same level of the hierarchy. The algorithms evaluate by Tombros et al. though are agglomerative hierarchical cluster algorithms, therefore a ocument is only inclue in one leaf noe. However, when Tombros et al. evalute cluster hierarchies they o not limit their search to leaf noes but also regar intermeiate noes, therefore ocuments actually o occur in more than one noe in the evaluate hierarchy.
5 TABLE II EXPERIMENTAL RESULTS scenario expecte number reporte absolute number of relevant effectiveness effectiveness ocuments β =0.5 β =1.0 β =2.0 β = VI. CONCLUSIONS AND FUTURE WORKS In this paper, we have shown that the commonly use methoology for evaluating the optimal cluster search approach to Information Retrieval may result in misleaing interpretations. Still, we consier evaluating every single cluster an selecting the one with the optimal score to be a vali tool for examining clustering approaches to retrieval, if, that is, one takes into account that certain observations are basically ue to the evaluation methoology. For instance, in a recent survey on the effectiveness of clustering-base retrieval algorithms [9], the authors report that they i not fin a statistically significant variation in query-specific cluster effectiveness for ifferent values of top-ranke ocuments. From the rigorous theoretical analysis presente in this paper, however, we can conclue that the algorithms they evaluate ha to perform poorly given the experimental setting an parameterization that was consiere. Concerning examples like this, our main contribution in this paper is therefore the efinition of a new measure for evaluating information retrieval systems that accounts for statistically expectable results by normalizing the performance measure with respect to these. In a series of experiments that analyze the impact of the number of clustere ocuments on the cluster effectiveness, we have shown that the use of this new measure yiels meaningful results. Compare to the figures obtaine from conventional precision an recall measures the absolute precision an absolute recall inicate, for instance, that no serious statement can be mae for small numbers of clustere ocuments. Since the influence of the number of top ranking ocuments that are clustere on the effectiveness of the clustering was not sufficiently analyze by [9], we will further investigate this influence base on the more appropriate performance measures which we introuce in this paper. Machine learning techniques that are employe for learning the parameters of a text cluster algorithm epen on an error function that inicates gooness of a certain set of parameters. We will compare the training results achieve by error functions base on conventional performance inicators to those achieve with error functions base on evaluation measures erive from our moel of expectation values for ranom clustering. REFERENCES [1] G. Kowalski, Information Retrieval Systems: Theory an Implementation, Kluwer, Norwell, MA, [2] M.T. Maybury, E., Intelligent Multimeia Information Retrieval, MIT Press, Cambrige, MA, [3] G. Salton, Introuction to Moern Information Retrieval, McGraw- Hill, New York, NY, [4] C.J. van Rijsbergen, Information Retrieval, Butterworth-Heinemann, Newton, MA, [5] G. Salton, Automatic Information Organization an Retrieval, Mc- Graw Hill, New York, US, [6] W.B. Croft an C.J. van Rijsbergen, Document clustering: An evaluation of some experiments with the cranfiel 1400 collection, Information Storage Retrieval, 11, pp , [7] O. Zamir an O. Etzioni, Web ocument clustering: A feasibility emonstration, in SIGIR. 1998, pp , ACM. [8] N. Jarine an C.J. Van Rijsbergen, The use of hierarchic clustering in information retrieval, Information Storage an Retrieval, vol.7, pp , [9] A. Tombros, R. Villa, an C. J. van Rijsbergen, The effectiveness of query-specific hierarchic clustering in information retrieval, Inf. Process. Manage, vol. 38, no. 4, pp , [10] A.K. Jain an R.C. Dubes, Algorithms for Clustering Data, Prentice- Hall, Upper Sale River, NJ, [11] K. Kummamuru, R. Lotlikar, S. Roy, K. Singal, an R. Krishnapuram, A hierarchical monothetic ocument clustering algorithm for summarization an browsing search results, in Proc. Int. Conf. on Worl Wie Web. 2004, pp , ACM. [12] M.A. Hearst an J.O. Peersen, Reexamining the cluster hypothesis: Scatter/gather on retrieval results, in Proc. SIGIR , pp , ACM. [13] C. Carpineto an G. Romano, Exploiting the potential of concept lattices for information retrieval with CREDO, J. UCS, vol. 10, no. 8, pp , [14] H. Schutze an C. Silverstein, Projections for efficient ocument clustering, in Proc. SIGIR-97, 1997, Classification Methos, pp
Detecting Possibly Fraudulent or Error-Prone Survey Data Using Benford s Law
Detecting Possibly Frauulent or Error-Prone Survey Data Using Benfor s Law Davi Swanson, Moon Jung Cho, John Eltinge U.S. Bureau of Labor Statistics 2 Massachusetts Ave., NE, Room 3650, Washington, DC
More informationMath 230.01, Fall 2012: HW 1 Solutions
Math 3., Fall : HW Solutions Problem (p.9 #). Suppose a wor is picke at ranom from this sentence. Fin: a) the chance the wor has at least letters; SOLUTION: All wors are equally likely to be chosen. The
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 14 10/27/2008 MOMENT GENERATING FUNCTIONS Contents 1. Moment generating functions 2. Sum of a ranom number of ranom variables 3. Transforms
More informationData Center Power System Reliability Beyond the 9 s: A Practical Approach
Data Center Power System Reliability Beyon the 9 s: A Practical Approach Bill Brown, P.E., Square D Critical Power Competency Center. Abstract Reliability has always been the focus of mission-critical
More informationDigital barrier option contract with exponential random time
IMA Journal of Applie Mathematics Avance Access publishe June 9, IMA Journal of Applie Mathematics ) Page of 9 oi:.93/imamat/hxs3 Digital barrier option contract with exponential ranom time Doobae Jun
More informationModelling and Resolving Software Dependencies
June 15, 2005 Abstract Many Linux istributions an other moern operating systems feature the explicit eclaration of (often complex) epenency relationships between the pieces of software
More informationUnsteady Flow Visualization by Animating Evenly-Spaced Streamlines
EUROGRAPHICS 2000 / M. Gross an F.R.A. Hopgoo Volume 19, (2000), Number 3 (Guest Eitors) Unsteay Flow Visualization by Animating Evenly-Space Bruno Jobar an Wilfri Lefer Université u Littoral Côte Opale,
More informationState of Louisiana Office of Information Technology. Change Management Plan
State of Louisiana Office of Information Technology Change Management Plan Table of Contents Change Management Overview Change Management Plan Key Consierations Organizational Transition Stages Change
More informationCross-Over Analysis Using T-Tests
Chapter 35 Cross-Over Analysis Using -ests Introuction his proceure analyzes ata from a two-treatment, two-perio (x) cross-over esign. he response is assume to be a continuous ranom variable that follows
More informationView Synthesis by Image Mapping and Interpolation
View Synthesis by Image Mapping an Interpolation Farris J. Halim Jesse S. Jin, School of Computer Science & Engineering, University of New South Wales Syney, NSW 05, Australia Basser epartment of Computer
More informationFirewall Design: Consistency, Completeness, and Compactness
C IS COS YS TE MS Firewall Design: Consistency, Completeness, an Compactness Mohame G. Goua an Xiang-Yang Alex Liu Department of Computer Sciences The University of Texas at Austin Austin, Texas 78712-1188,
More informationOn Adaboost and Optimal Betting Strategies
On Aaboost an Optimal Betting Strategies Pasquale Malacaria 1 an Fabrizio Smerali 1 1 School of Electronic Engineering an Computer Science, Queen Mary University of Lonon, Lonon, UK Abstract We explore
More informationThe one-year non-life insurance risk
The one-year non-life insurance risk Ohlsson, Esbjörn & Lauzeningks, Jan Abstract With few exceptions, the literature on non-life insurance reserve risk has been evote to the ultimo risk, the risk in the
More informationHow To Segmentate An Insurance Customer In An Insurance Business
International Journal of Database Theory an Application, pp.25-36 http://x.oi.org/10.14257/ijta.2014.7.1.03 A Case Stuy of Applying SOM in Market Segmentation of Automobile Insurance Customers Vahi Golmah
More informationA Blame-Based Approach to Generating Proposals for Handling Inconsistency in Software Requirements
International Journal of nowlege an Systems Science, 3(), -7, January-March 0 A lame-ase Approach to Generating Proposals for Hanling Inconsistency in Software Requirements eian Mu, Peking University,
More informationFactoring Dickson polynomials over finite fields
Factoring Dickson polynomials over finite fiels Manjul Bhargava Department of Mathematics, Princeton University. Princeton NJ 08544 manjul@math.princeton.eu Michael Zieve Department of Mathematics, University
More information10.2 Systems of Linear Equations: Matrices
SECTION 0.2 Systems of Linear Equations: Matrices 7 0.2 Systems of Linear Equations: Matrices OBJECTIVES Write the Augmente Matrix of a System of Linear Equations 2 Write the System from the Augmente Matrix
More informationOption Pricing for Inventory Management and Control
Option Pricing for Inventory Management an Control Bryant Angelos, McKay Heasley, an Jeffrey Humpherys Abstract We explore the use of option contracts as a means of managing an controlling inventories
More informationJON HOLTAN. if P&C Insurance Ltd., Oslo, Norway ABSTRACT
OPTIMAL INSURANCE COVERAGE UNDER BONUS-MALUS CONTRACTS BY JON HOLTAN if P&C Insurance Lt., Oslo, Norway ABSTRACT The paper analyses the questions: Shoul or shoul not an iniviual buy insurance? An if so,
More informationOptimal Control Policy of a Production and Inventory System for multi-product in Segmented Market
RATIO MATHEMATICA 25 (2013), 29 46 ISSN:1592-7415 Optimal Control Policy of a Prouction an Inventory System for multi-prouct in Segmente Market Kuleep Chauhary, Yogener Singh, P. C. Jha Department of Operational
More informationA Comparison of Performance Measures for Online Algorithms
A Comparison of Performance Measures for Online Algorithms Joan Boyar 1, Sany Irani 2, an Kim S. Larsen 1 1 Department of Mathematics an Computer Science, University of Southern Denmark, Campusvej 55,
More informationMSc. Econ: MATHEMATICAL STATISTICS, 1995 MAXIMUM-LIKELIHOOD ESTIMATION
MAXIMUM-LIKELIHOOD ESTIMATION The General Theory of M-L Estimation In orer to erive an M-L estimator, we are boun to make an assumption about the functional form of the istribution which generates the
More informationA Data Placement Strategy in Scientific Cloud Workflows
A Data Placement Strategy in Scientific Clou Workflows Dong Yuan, Yun Yang, Xiao Liu, Jinjun Chen Faculty of Information an Communication Technologies, Swinburne University of Technology Hawthorn, Melbourne,
More informationCh 10. Arithmetic Average Options and Asian Opitons
Ch 10. Arithmetic Average Options an Asian Opitons I. Asian Option an the Analytic Pricing Formula II. Binomial Tree Moel to Price Average Options III. Combination of Arithmetic Average an Reset Options
More informationDECISION SUPPORT SYSTEM FOR MANAGING EDUCATIONAL CAPACITY UTILIZATION IN UNIVERSITIES
DECISION SUPPORT SYSTEM OR MANAGING EDUCATIONAL CAPACITY UTILIZATION IN UNIVERSITIES Svetlana Vinnik 1, Marc H. Scholl 2 Abstract Decision-making in the fiel of acaemic planning involves extensive analysis
More informationThe most common model to support workforce management of telephone call centers is
Designing a Call Center with Impatient Customers O. Garnett A. Manelbaum M. Reiman Davison Faculty of Inustrial Engineering an Management, Technion, Haifa 32000, Israel Davison Faculty of Inustrial Engineering
More informationTowards a Framework for Enterprise Architecture Frameworks Comparison and Selection
Towars a Framework for Enterprise Frameworks Comparison an Selection Saber Aballah Faculty of Computers an Information, Cairo University Saber_aballah@hotmail.com Abstract A number of Enterprise Frameworks
More informationProfessional Level Options Module, Paper P4(SGP)
Answers Professional Level Options Moule, Paper P4(SGP) Avance Financial Management (Singapore) December 2007 Answers Tutorial note: These moel answers are consierably longer an more etaile than woul be
More informationUnbalanced Power Flow Analysis in a Micro Grid
International Journal of Emerging Technology an Avance Engineering Unbalance Power Flow Analysis in a Micro Gri Thai Hau Vo 1, Mingyu Liao 2, Tianhui Liu 3, Anushree 4, Jayashri Ravishankar 5, Toan Phung
More informationThroughputScheduler: Learning to Schedule on Heterogeneous Hadoop Clusters
ThroughputScheuler: Learning to Scheule on Heterogeneous Haoop Clusters Shehar Gupta, Christian Fritz, Bob Price, Roger Hoover, an Johan e Kleer Palo Alto Research Center, Palo Alto, CA, USA {sgupta, cfritz,
More informationA Generalization of Sauer s Lemma to Classes of Large-Margin Functions
A Generalization of Sauer s Lemma to Classes of Large-Margin Functions Joel Ratsaby University College Lonon Gower Street, Lonon WC1E 6BT, Unite Kingom J.Ratsaby@cs.ucl.ac.uk, WWW home page: http://www.cs.ucl.ac.uk/staff/j.ratsaby/
More informationA Universal Sensor Control Architecture Considering Robot Dynamics
International Conference on Multisensor Fusion an Integration for Intelligent Systems (MFI2001) Baen-Baen, Germany, August 2001 A Universal Sensor Control Architecture Consiering Robot Dynamics Frierich
More informationSensor Network Localization from Local Connectivity : Performance Analysis for the MDS-MAP Algorithm
Sensor Network Localization from Local Connectivity : Performance Analysis for the MDS-MAP Algorithm Sewoong Oh an Anrea Montanari Electrical Engineering an Statistics Department Stanfor University, Stanfor,
More informationDow Jones Sustainability Group Index: A Global Benchmark for Corporate Sustainability
www.corporate-env-strategy.com Sustainability Inex Dow Jones Sustainability Group Inex: A Global Benchmark for Corporate Sustainability Ivo Knoepfel Increasingly investors are iversifying their portfolios
More informationAn Information Retrieval using weighted Index Terms in Natural Language document collections
Internet and Information Technology in Modern Organizations: Challenges & Answers 635 An Information Retrieval using weighted Index Terms in Natural Language document collections Ahmed A. A. Radwan, Minia
More informationLagrangian and Hamiltonian Mechanics
Lagrangian an Hamiltonian Mechanics D.G. Simpson, Ph.D. Department of Physical Sciences an Engineering Prince George s Community College December 5, 007 Introuction In this course we have been stuying
More informationStock Market Value Prediction Using Neural Networks
Stock Market Value Preiction Using Neural Networks Mahi Pakaman Naeini IT & Computer Engineering Department Islamic Aza University Paran Branch e-mail: m.pakaman@ece.ut.ac.ir Hamireza Taremian Engineering
More informationNotes on tangents to parabolas
Notes on tangents to parabolas (These are notes for a talk I gave on 2007 March 30.) The point of this talk is not to publicize new results. The most recent material in it is the concept of Bézier curves,
More informationParameterized Algorithms for d-hitting Set: the Weighted Case Henning Fernau. Univ. Trier, FB 4 Abteilung Informatik 54286 Trier, Germany
Parameterize Algorithms for -Hitting Set: the Weighte Case Henning Fernau Trierer Forschungsberichte; Trier: Technical Reports Informatik / Mathematik No. 08-6, July 2008 Univ. Trier, FB 4 Abteilung Informatik
More informationMinimizing Makespan in Flow Shop Scheduling Using a Network Approach
Minimizing Makespan in Flow Shop Scheuling Using a Network Approach Amin Sahraeian Department of Inustrial Engineering, Payame Noor University, Asaluyeh, Iran 1 Introuction Prouction systems can be ivie
More informationRisk Adjustment for Poker Players
Risk Ajustment for Poker Players William Chin DePaul University, Chicago, Illinois Marc Ingenoso Conger Asset Management LLC, Chicago, Illinois September, 2006 Introuction In this article we consier risk
More informationAn intertemporal model of the real exchange rate, stock market, and international debt dynamics: policy simulations
This page may be remove to conceal the ientities of the authors An intertemporal moel of the real exchange rate, stock market, an international ebt ynamics: policy simulations Saziye Gazioglu an W. Davi
More informationMathematics. Circles. hsn.uk.net. Higher. Contents. Circles 119 HSN22400
hsn.uk.net Higher Mathematics UNIT OUTCOME 4 Circles Contents Circles 119 1 Representing a Circle 119 Testing a Point 10 3 The General Equation of a Circle 10 4 Intersection of a Line an a Circle 1 5 Tangents
More informationWeb Document Clustering
Web Document Clustering Oren Zamir and Oren Etzioni Department of Computer Science and Engineering University of Washington Seattle, WA 98195-2350 U.S.A. {zamir, etzioni}@cs.washington.edu Abstract Users
More informationImproving Direct Marketing Profitability with Neural Networks
Volume 9 o.5, September 011 Improving Direct Marketing Profitability with eural etworks Zaiyong Tang Salem State University Salem, MA 01970 ABSTRACT Data mining in irect marketing aims at ientifying the
More informationSeeing the Unseen: Revealing Mobile Malware Hidden Communications via Energy Consumption and Artificial Intelligence
Seeing the Unseen: Revealing Mobile Malware Hien Communications via Energy Consumption an Artificial Intelligence Luca Caviglione, Mauro Gaggero, Jean-François Lalane, Wojciech Mazurczyk, Marcin Urbanski
More information! # % & ( ) +,,),. / 0 1 2 % ( 345 6, & 7 8 4 8 & & &&3 6
! # % & ( ) +,,),. / 0 1 2 % ( 345 6, & 7 8 4 8 & & &&3 6 9 Quality signposting : the role of online information prescription in proviing patient information Liz Brewster & Barbara Sen Information School,
More informationRUNESTONE, an International Student Collaboration Project
RUNESTONE, an International Stuent Collaboration Project Mats Daniels 1, Marian Petre 2, Vicki Almstrum 3, Lars Asplun 1, Christina Björkman 1, Carl Erickson 4, Bruce Klein 4, an Mary Last 4 1 Department
More informationOptimal Energy Commitments with Storage and Intermittent Supply
Submitte to Operations Research manuscript OPRE-2009-09-406 Optimal Energy Commitments with Storage an Intermittent Supply Jae Ho Kim Department of Electrical Engineering, Princeton University, Princeton,
More informationPerformance And Analysis Of Risk Assessment Methodologies In Information Security
International Journal of Computer Trens an Technology (IJCTT) volume 4 Issue 10 October 2013 Performance An Analysis Of Risk Assessment ologies In Information Security K.V.D.Kiran #1, Saikrishna Mukkamala
More informationGPRS performance estimation in GSM circuit switched services and GPRS shared resource systems *
GPRS performance estimation in GSM circuit switche serices an GPRS share resource systems * Shaoji i an Sen-Gusta Häggman Helsinki Uniersity of Technology, Institute of Raio ommunications, ommunications
More informationSensitivity Analysis of Non-linear Performance with Probability Distortion
Preprints of the 19th Worl Congress The International Feeration of Automatic Control Cape Town, South Africa. August 24-29, 214 Sensitivity Analysis of Non-linear Performance with Probability Distortion
More informationHOST SELECTION METHODOLOGY IN CLOUD COMPUTING ENVIRONMENT
International Journal of Avance Research in Computer Engineering & Technology (IJARCET) HOST SELECTION METHODOLOGY IN CLOUD COMPUTING ENVIRONMENT Pawan Kumar, Pijush Kanti Dutta Pramanik Computer Science
More informationThese draft test specifications and sample items and other materials are just that drafts. As such, they will systematically evolve over time.
t h e reesigne sat These raft test specifications an sample items an other materials are just that rafts. As such, they will systematically evolve over time. These sample items are meant to illustrate
More informationOptimizing Multiple Stock Trading Rules using Genetic Algorithms
Optimizing Multiple Stock Traing Rules using Genetic Algorithms Ariano Simões, Rui Neves, Nuno Horta Instituto as Telecomunicações, Instituto Superior Técnico Av. Rovisco Pais, 040-00 Lisboa, Portugal.
More informationMODELLING OF TWO STRATEGIES IN INVENTORY CONTROL SYSTEM WITH RANDOM LEAD TIME AND DEMAND
art I. robobabilystic Moels Computer Moelling an New echnologies 27 Vol. No. 2-3 ransport an elecommunication Institute omonosova iga V-9 atvia MOEING OF WO AEGIE IN INVENOY CONO YEM WIH ANOM EA IME AN
More informationUsing research evidence in mental health: user-rating and focus group study of clinicians preferences for a new clinical question-answering service
DOI: 10.1111/j.1471-1842.2008.00833.x Using research evience in mental health: user-rating an focus group stuy of clinicians preferences for a new clinical question-answering service Elizabeth A. Barley*,
More informationA New Pricing Model for Competitive Telecommunications Services Using Congestion Discounts
A New Pricing Moel for Competitive Telecommunications Services Using Congestion Discounts N. Keon an G. Ananalingam Department of Systems Engineering University of Pennsylvania Philaelphia, PA 19104-6315
More informationA Scheme to Estimate One-way Delay Variations for Diagnosing Network Traffic Conditions
Cyber Journals: Multiisciplinary Journals in Science an Technology Journal of Selecte Areas in Telecommunications (JSAT) August Eition 2011 A Scheme to Estimate One-way Delay Variations for Diagnosing
More informationDifferentiability of Exponential Functions
Differentiability of Exponential Functions Philip M. Anselone an John W. Lee Philip Anselone (panselone@actionnet.net) receive his Ph.D. from Oregon State in 1957. After a few years at Johns Hopkins an
More informationProduct Differentiation for Software-as-a-Service Providers
University of Augsburg Prof. Dr. Hans Ulrich Buhl Research Center Finance & Information Management Department of Information Systems Engineering & Financial Management Discussion Paper WI-99 Prouct Differentiation
More informationSCADA (Supervisory Control and Data Acquisition) systems
Proceeings of the 2013 Feerate Conference on Computer Science an Information Systems pp. 1423 1428 Improving security in SCADA systems through firewall policy analysis Onrej Rysavy Jaroslav Rab Miroslav
More informationMathematical Models of Therapeutical Actions Related to Tumour and Immune System Competition
Mathematical Moels of Therapeutical Actions Relate to Tumour an Immune System Competition Elena De Angelis (1 an Pierre-Emmanuel Jabin (2 (1 Dipartimento i Matematica, Politecnico i Torino Corso Duca egli
More informationYou Can t Not Choose: Embracing the Role of Choice in Ecological Restoration
EDITORIAL OPINION You Can t Not Choose: Embracing the Role of Choice in Ecological Restoration Stuart K. Allison 1,2 Abstract From the moment of its inception, human choice about how to treat the environment
More informationMathematics Review for Economists
Mathematics Review for Economists by John E. Floy University of Toronto May 9, 2013 This ocument presents a review of very basic mathematics for use by stuents who plan to stuy economics in grauate school
More informationSupporting Adaptive Workflows in Advanced Application Environments
Supporting aptive Workflows in vance pplication Environments Manfre Reichert, lemens Hensinger, Peter Daam Department Databases an Information Systems University of Ulm, D-89069 Ulm, Germany Email: {reichert,
More informationy or f (x) to determine their nature.
Level C5 of challenge: D C5 Fining stationar points of cubic functions functions Mathematical goals Starting points Materials require Time neee To enable learners to: fin the stationar points of a cubic
More informationCoalitional Game Theoretic Approach for Cooperative Transmission in Vehicular Networks
Coalitional Game Theoretic Approach for Cooperative Transmission in Vehicular Networks arxiv:.795v [cs.gt] 8 Feb Tian Zhang, Wei Chen, Zhu Han, an Zhigang Cao State Key Laboratory on Microwave an Digital
More informationHow To Find Out How To Calculate Volume Of A Sphere
Contents High-Dimensional Space. Properties of High-Dimensional Space..................... 4. The High-Dimensional Sphere......................... 5.. The Sphere an the Cube in Higher Dimensions...........
More informationSponsored by: N.E.C.A. CHAPTERS Minneapolis, St. Paul, South Central, Twinports Arrowhead I.B.E.W. Locals 292, 110, 343, 242, 294
Sponsore by: N.E.C.A. CHAPTERS Minneapolis, St. Paul, South Central, Twinports Arrowhea I.B.E.W. Locals 292, 110, 343, 242, 294 452 Northco Drive, Suite 140 Friley, MN 55432-3308 Phone: 763-571-5922 Fax:
More informationHow To Understand The Structure Of A Can (Can)
Thi t t ith F M k 4 0 4 BOSCH CAN Specification Version 2.0 1991, Robert Bosch GmbH, Postfach 50, D-7000 Stuttgart 1 The ocument as a whole may be copie an istribute without restrictions. However, the
More informationUNIFIED BIJECTIONS FOR MAPS WITH PRESCRIBED DEGREES AND GIRTH
UNIFIED BIJECTIONS FOR MAPS WITH PRESCRIBED DEGREES AND GIRTH OLIVIER BERNARDI AND ÉRIC FUSY Abstract. This article presents unifie bijective constructions for planar maps, with control on the face egrees
More informationHeat-And-Mass Transfer Relationship to Determine Shear Stress in Tubular Membrane Systems Ratkovich, Nicolas Rios; Nopens, Ingmar
Aalborg Universitet Heat-An-Mass Transfer Relationship to Determine Shear Stress in Tubular Membrane Systems Ratkovich, Nicolas Rios; Nopens, Ingmar Publishe in: International Journal of Heat an Mass Transfer
More informationCalibration of the broad band UV Radiometer
Calibration of the broa ban UV Raiometer Marian Morys an Daniel Berger Solar Light Co., Philaelphia, PA 19126 ABSTRACT Mounting concern about the ozone layer epletion an the potential ultraviolet exposure
More informationLegal Claim Identification: Information Extraction with Hierarchically Labeled Data
Legal Claim Ientification: Information Extraction with Hierarchically Labele Data Mihai Sureanu, Ramesh Nallapati an Christopher Manning Stanfor University {mihais,nmramesh,manning}@cs.stanfor.eu Abstract
More informationChapter 9 AIRPORT SYSTEM PLANNING
Chapter 9 AIRPORT SYSTEM PLANNING. Photo creit Dorn McGrath, Jr Contents Page The Planning Process................................................... 189 Airport Master Planning..............................................
More informationMinimum-Energy Broadcast in All-Wireless Networks: NP-Completeness and Distribution Issues
Minimum-Energy Broacast in All-Wireless Networks: NP-Completeness an Distribution Issues Mario Čagal LCA-EPFL CH-05 Lausanne Switzerlan mario.cagal@epfl.ch Jean-Pierre Hubaux LCA-EPFL CH-05 Lausanne Switzerlan
More informationCost Efficient Datacenter Selection for Cloud Services
Cost Efficient Datacenter Selection for Clou Services Hong u, Baochun Li henryxu, bli@eecg.toronto.eu Department of Electrical an Computer Engineering University of Toronto Abstract Many clou services
More informationHull, Chapter 11 + Sections 17.1 and 17.2 Additional reference: John Cox and Mark Rubinstein, Options Markets, Chapter 5
Binomial Moel Hull, Chapter 11 + ections 17.1 an 17.2 Aitional reference: John Cox an Mark Rubinstein, Options Markets, Chapter 5 1. One-Perio Binomial Moel Creating synthetic options (replicating options)
More informationSustainability Through the Market: Making Markets Work for Everyone q
www.corporate-env-strategy.com Sustainability an the Market Sustainability Through the Market: Making Markets Work for Everyone q Peter White Sustainable evelopment is about ensuring a better quality of
More informationMeasures of distance between samples: Euclidean
4- Chapter 4 Measures of istance between samples: Eucliean We will be talking a lot about istances in this book. The concept of istance between two samples or between two variables is funamental in multivariate
More informationINFLUENCE OF GPS TECHNOLOGY ON COST CONTROL AND MAINTENANCE OF VEHICLES
1 st Logistics International Conference Belgrae, Serbia 28-30 November 2013 INFLUENCE OF GPS TECHNOLOGY ON COST CONTROL AND MAINTENANCE OF VEHICLES Goran N. Raoičić * University of Niš, Faculty of Mechanical
More informationGame Theoretic Modeling of Cooperation among Service Providers in Mobile Cloud Computing Environments
2012 IEEE Wireless Communications an Networking Conference: Services, Applications, an Business Game Theoretic Moeling of Cooperation among Service Proviers in Mobile Clou Computing Environments Dusit
More informationA Monte Carlo Simulation of Multivariate General
A Monte Carlo Simulation of Multivariate General Pareto Distribution an its Application 1 1 1 1 1 1 1 1 0 1 Luo Yao 1, Sui Danan, Wang Dongxiao,*, Zhou Zhenwei, He Weihong 1, Shi Hui 1 South China Sea
More informationCompare Authentication Algorithms for Mobile Systems in Order to Introduce the Successful Characteristics of these Algorithms against Attacks
Compare Authentication Algorithms for Mobile Systems in Orer to Introuce the Successful Characteristics of these Algorithms against Attacks Shahriar Mohammai Assistant Professor of Inustrial Engineering
More informationThe Impact of Forecasting Methods on Bullwhip Effect in Supply Chain Management
The Imact of Forecasting Methos on Bullwhi Effect in Suly Chain Management HX Sun, YT Ren Deartment of Inustrial an Systems Engineering, National University of Singaore, Singaore Schoo of Mechanical an
More informationBellini: Ferrying Application Traffic Flows through Geo-distributed Datacenters in the Cloud
Bellini: Ferrying Application Traffic Flows through Geo-istribute Datacenters in the Clou Zimu Liu, Yuan Feng, an Baochun Li Department of Electrical an Computer Engineering, University of Toronto Department
More informationExploratory Optimal Latin Hypercube Designs for Computer Simulated Experiments
Thailan Statistician July 0; 9() : 7-93 http://statassoc.or.th Contribute paper Exploratory Optimal Latin Hypercube Designs for Computer Simulate Experiments Rachaaporn Timun [a,b] Anamai Na-uom* [a,b]
More informationTrace IP Packets by Flexible Deterministic Packet Marking (FDPM)
Trace P Packets by Flexible Deterministic Packet Marking (F) Yang Xiang an Wanlei Zhou School of nformation Technology Deakin University Melbourne, Australia {yxi, wanlei}@eakin.eu.au Abstract- Currently
More informationThe Quick Calculus Tutorial
The Quick Calculus Tutorial This text is a quick introuction into Calculus ieas an techniques. It is esigne to help you if you take the Calculus base course Physics 211 at the same time with Calculus I,
More informationBOSCH. CAN Specification. Version 2.0. 1991, Robert Bosch GmbH, Postfach 30 02 40, D-70442 Stuttgart
CAN Specification Version 2.0 1991, Robert Bosch GmbH, Postfach 30 02 40, D-70442 Stuttgart CAN Specification 2.0 page 1 Recital The acceptance an introuction of serial communication to more an more applications
More informationAutomatic Long-Term Loudness and Dynamics Matching
Automatic Long-Term Louness an Dynamics Matching Earl ickers Creative Avance Technology Center Scotts alley, CA, USA earlv@atc.creative.com ABSTRACT Traitional auio level control evices, such as automatic
More informationThere are two different ways you can interpret the information given in a demand curve.
Econ 500 Microeconomic Review Deman What these notes hope to o is to o a quick review of supply, eman, an equilibrium, with an emphasis on a more quantifiable approach. Deman Curve (Big icture) The whole
More informationSearch Advertising Based Promotion Strategies for Online Retailers
Search Avertising Base Promotion Strategies for Online Retailers Amit Mehra The Inian School of Business yeraba, Inia Amit Mehra@isb.eu ABSTRACT Web site aresses of small on line retailers are often unknown
More informationAn Introduction to Event-triggered and Self-triggered Control
An Introuction to Event-triggere an Self-triggere Control W.P.M.H. Heemels K.H. Johansson P. Tabuaa Abstract Recent evelopments in computer an communication technologies have le to a new type of large-scale
More informationA Theory of Exchange Rates and the Term Structure of Interest Rates
Review of Development Economics, 17(1), 74 87, 013 DOI:10.1111/roe.1016 A Theory of Exchange Rates an the Term Structure of Interest Rates Hyoung-Seok Lim an Masao Ogaki* Abstract This paper efines the
More informationWeb Appendices to Selling to Overcon dent Consumers
Web Appenices to Selling to Overcon ent Consumers Michael D. Grubb MIT Sloan School of Management Cambrige, MA 02142 mgrubbmit.eu www.mit.eu/~mgrubb May 2, 2008 B Option Pricing Intuition This appenix
More informationS&P Systematic Global Macro Index (S&P SGMI) Methodology
S&P Systematic Global Macro Inex (S&P SGMI) Methoology May 2014 S&P Dow Jones Inices: Inex Methoology Table of Contents Introuction 3 Overview 3 Highlights 4 The S&P SGMI Methoology 4 Inex Family 5 Inex
More informationHybrid Model Predictive Control Applied to Production-Inventory Systems
Preprint of paper to appear in the 18th IFAC Worl Congress, August 28 - Sept. 2, 211, Milan, Italy Hybri Moel Preictive Control Applie to Prouction-Inventory Systems Naresh N. Nanola Daniel E. Rivera Control
More informationEnterprise Resource Planning
Enterprise Resource Planning MPC 6 th Eition Chapter 1a McGraw-Hill/Irwin Copyright 2011 by The McGraw-Hill Companies, Inc. All rights reserve. Enterprise Resource Planning A comprehensive software approach
More information