AN OVERVIEW ON CLUSTERING METHODS
IOSR Journal of Engineering

T. Soni Madhulatha
Associate Professor, Alluri Institute of Management Sciences, Warangal.

ABSTRACT

Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar objects into different groups, or more precisely, the partitioning of a data set into subsets, so that the data in each subset are close to one another according to some defined distance measure. This paper covers clustering algorithms, their benefits and their applications. The paper concludes by discussing some limitations.

Keywords: Clustering, hierarchical algorithm, partitional algorithm, distance measure

I. INTRODUCTION

Clustering can be considered the most important unsupervised learning problem; so, as every other problem of this kind, it deals with finding a structure in a collection of unlabeled data. A cluster is therefore a collection of objects which are similar between them and are dissimilar to the objects belonging to other clusters. Besides the term data clustering, synonyms such as cluster analysis, automatic classification, numerical taxonomy, botryology and typological analysis are in use.

II. TYPES OF CLUSTERING

Data clustering algorithms can be hierarchical or partitional. Hierarchical algorithms find successive clusters using previously established clusters, whereas partitional algorithms determine all clusters at one time. Hierarchical algorithms can be agglomerative (bottom-up) or divisive (top-down). Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters. Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.

HIERARCHICAL CLUSTERING

A key step in hierarchical clustering is to select a distance measure. A simple measure is Manhattan distance, equal to the sum of absolute distances for each variable. The name comes from the fact that in a two-variable case, the variables can be plotted on a grid that can be compared to city streets, and the distance between two points is the number of blocks a person would walk.
A more common measure is Euclidean distance, computed by finding the square of the distance between each variable, summing the squares, and finding the square root of that sum. In the two-variable case, the distance is analogous to finding the length of the hypotenuse in a triangle; that is, it is the distance "as the crow flies." A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the Euclidean distance or the squared Euclidean distance.

The Manhattan distance function computes the distance that would be traveled to get from one data point to the other if a grid-like path is followed. The Manhattan distance between two items is the sum of the absolute differences of their corresponding components. The formula for this distance between a point X = (X1, X2, etc.) and a point Y = (Y1, Y2, etc.) is:

$$d = \sum_{i=1}^{n} |X_i - Y_i|$$

where n is the number of variables, and X_i and Y_i are the values of the ith variable at points X and Y respectively.

The Euclidean distance function measures the as-the-crow-flies distance. The formula for this distance between a point X = (X1, X2, etc.) and a point Y = (Y1, Y2, etc.) is:

$$d = \sqrt{\sum_{j=1}^{n} (X_j - Y_j)^2}$$

Deriving the Euclidean distance between two data points involves computing the square root of the sum of the squares of the differences between corresponding values. The following figure illustrates the difference between Manhattan distance and Euclidean distance:
[Figure: Manhattan distance versus Euclidean distance]

Agglomerative hierarchical clustering

For example, suppose a data set is to be clustered, with pixel Euclidean distance as the distance metric. This method builds the hierarchy from the individual elements by progressively merging clusters. Suppose we have six elements {a}, {b}, {c}, {d}, {e} and {f}. The first step is to determine which elements to merge in a cluster. Usually, we want to take the two closest elements, therefore we must define a distance between elements. One can also construct a distance matrix at this stage.

Usually the distance between two clusters A and B is one of the following:
- The maximum distance between elements of each cluster (also called complete linkage clustering): $\max\{\, d(x, y) : x \in A,\ y \in B \,\}$
- The minimum distance between elements of each cluster (also called single linkage clustering): $\min\{\, d(x, y) : x \in A,\ y \in B \,\}$
- The mean distance between elements of each cluster (also called average linkage clustering): $\frac{1}{\mathrm{card}(A)\,\mathrm{card}(B)} \sum_{x \in A} \sum_{y \in B} d(x, y)$
- The sum of all intra-cluster variance
- The increase in variance for the cluster being merged
- The probability that candidate clusters spawn from the same distribution function

Each agglomeration occurs at a greater distance between clusters than the previous agglomeration, and one can decide to stop clustering either when the clusters are too far apart to be merged or when there is a sufficiently small number of clusters.

Divisive clustering

So far we have only looked at agglomerative clustering, but a cluster hierarchy can also be generated top-down. This variant of hierarchical clustering is called top-down clustering or divisive clustering. We start at the top with all documents in one cluster. The cluster is split using a flat clustering algorithm. This procedure is applied recursively until each document is in its own singleton cluster. Top-down clustering is conceptually more complex than bottom-up clustering since we need a second, flat clustering algorithm as a "subroutine". It has the advantage of being more efficient if we do not generate a complete hierarchy all the way down to individual document leaves.
For a fixed number of top levels, using an efficient flat algorithm like K-means, top-down algorithms are linear in the number of documents and clusters.
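As an illustration of the distance measures and cluster-to-cluster distances used in hierarchical clustering, the following minimal Python sketch computes the Manhattan and Euclidean distances between two points and the complete, single and average linkage between two small clusters. The function names (`manhattan`, `euclidean`, `cluster_distance`) and the sample points are assumptions made for the example; they do not come from the paper.

```python
import math


def manhattan(x, y):
    """Sum of absolute coordinate differences between points x and y."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cluster_distance(A, B, dist=euclidean, method="complete"):
    """Distance between clusters A and B (lists of points) under a linkage rule."""
    pairwise = [dist(x, y) for x in A for y in B]
    if method == "complete":   # maximum distance between elements
        return max(pairwise)
    if method == "single":     # minimum distance between elements
        return min(pairwise)
    if method == "average":    # mean over all card(A) * card(B) pairs
        return sum(pairwise) / (len(A) * len(B))
    raise ValueError("unknown linkage method: " + method)


print(manhattan((0, 0), (3, 4)))   # → 7
print(euclidean((0, 0), (3, 4)))   # → 5.0

# Two illustrative clusters (made-up data).
A = [(0.0, 0.0), (0.0, 1.0)]
B = [(3.0, 4.0), (4.0, 4.0)]
for method in ("complete", "single", "average"):
    print(method, cluster_distance(A, B, method=method))
```

An agglomerative procedure would evaluate such a cluster distance for every pair of current clusters and merge the pair with the smallest value.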
Hierarchical methods suffer from the fact that once a merge or split is done, it can never be undone. This rigidity is useful in that it leads to smaller computation costs by not worrying about a combinatorial number of different choices. However, there are two approaches to improve the quality of hierarchical clustering:
- Perform careful analysis of object linkages at each hierarchical partitioning, as in CURE and Chameleon.
- Integrate hierarchical agglomeration and then refine the result using iterative relocation, as in BIRCH.

PARTITIONAL CLUSTERING

Partitioning algorithms are based on specifying an initial number of groups, and iteratively reallocating objects among groups to convergence. These algorithms typically determine all clusters at once. Most applications adopt one of two popular heuristic methods:
- the k-means algorithm
- the k-medoids algorithm

K-means algorithm

The K-means algorithm assigns each point to the cluster whose center, also called the centroid, is nearest. The center is the average of all the points in the cluster; that is, its coordinates are the arithmetic mean for each dimension separately over all the points in the cluster.

The pseudo code of the k-means algorithm explains how it works:
A. Choose K as the number of clusters.
B. Initialize the codebook vectors of the K clusters (randomly, for instance).
C. For every new sample vector:
   a. Compute the distance between the new vector and every cluster's codebook vector.
   b. Re-compute the closest codebook vector with the new vector, using a learning rate that decreases in time.

The k-means algorithm was chosen for study because of its popularity, which rests on the following properties:
- Its time complexity is O(nkl), where n is the number of patterns, k is the number of clusters, and l is the number of iterations taken by the algorithm to converge.
- Its space complexity is O(k+n). It requires additional space to store the data matrix.
- It is order-independent; for a given initial seed set of cluster centers, it generates the same partition of the data irrespective of the order in which the patterns are presented to the algorithm.
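Steps A to C of the k-means pseudo code above can be sketched in Python as a single sequential pass over the samples. This is only a sketch under stated assumptions: the names `online_kmeans` and `assign`, the optional `init` argument, and the learning rate 1/count (which shrinks as a cluster absorbs more samples) are illustrative choices, not details given in the paper.

```python
import math
import random


def online_kmeans(samples, k, init=None, seed=0):
    """One sequential pass of k-means over `samples` (a list of equal-length tuples)."""
    rng = random.Random(seed)
    # Step B: initialise the codebook, here from k randomly chosen samples
    # unless explicit starting vectors are supplied (an illustrative option).
    start = init if init is not None else rng.sample(samples, k)
    codebook = [list(v) for v in start]
    counts = [1] * k  # each starting vector counts as one observation

    for x in samples:  # Step C
        # C.a: distance from the new vector to every codebook vector.
        nearest = min(range(k), key=lambda j: math.dist(x, codebook[j]))
        # C.b: move the closest codebook vector towards x; the rate 1/count
        # decreases over time (an assumed schedule).
        counts[nearest] += 1
        rate = 1.0 / counts[nearest]
        codebook[nearest] = [c + rate * (xi - c)
                             for c, xi in zip(codebook[nearest], x)]
    return codebook


def assign(samples, codebook):
    """Index of the nearest codebook vector for every sample."""
    return [min(range(len(codebook)), key=lambda j: math.dist(x, codebook[j]))
            for x in samples]


# Made-up data: two well-separated groups of three points each.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook = online_kmeans(samples, k=2, init=[(0, 0), (10, 10)])
print(assign(samples, codebook))  # → [0, 0, 0, 1, 1, 1]
```

Passing explicit starting vectors through `init` makes the run reproducible; with random initialization the result depends on which samples are drawn.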
K-medoids algorithm

The basic strategy of k-medoids algorithms is that each cluster is represented by one of the objects located near the center of the cluster. PAM (Partitioning Around Medoids) was one of the first k-medoids algorithms introduced.

The pseudo code of the k-medoids algorithm explains how it works:
Arbitrarily choose k objects as the initial medoids
Repeat
   Assign each remaining object to the cluster with the nearest medoid
   Randomly select a non-medoid object O_random
   Compute the total cost, S, of swapping O_j with O_random
   If S < 0 then swap O_j with O_random to form the new set of k medoids
Until no change

The k-medoids method is more robust than k-means in the presence of noise and outliers because a medoid is less influenced by outliers or other extreme values than a mean.

DENSITY-BASED CLUSTERING

Density-based clustering algorithms are devised to discover arbitrary-shaped clusters. In this approach, a cluster is regarded as a region in which the density of data objects exceeds a threshold. DBSCAN and SNN are two typical algorithms of this kind.

DBSCAN algorithm

The DBSCAN algorithm was first introduced by Ester et al., and relies on a density-based notion of clusters. Clusters are identified by looking at the density of points. Regions with a high density of points depict the existence of clusters, whereas regions with a low density of points indicate clusters of noise or clusters of outliers. This algorithm is particularly suited to deal with large datasets, with noise, and is able to identify clusters with different sizes and shapes.

The key idea of the DBSCAN algorithm is that, for each point of a cluster, the neighbourhood of a given radius has to contain at least a minimum number of points; that is, the density in the neighbourhood has to exceed some predefined threshold. This algorithm needs three input parameters:
- k, the neighbour list size;
- Eps, the radius that delimitates the neighbourhood area of a point (Eps-neighbourhood);
- MinPts, the minimum number of points that must exist in the Eps-neighbourhood.
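The density test behind the Eps and MinPts parameters can be sketched in a few lines of Python. The sketch assumes the common DBSCAN convention that a point's Eps-neighbourhood includes the point itself, that a point meeting the MinPts threshold is a core point, that a non-core point lying in the neighbourhood of a core point is a border point, and that everything else is noise. The names `eps_neighbourhood` and `classify_points` are hypothetical, and the neighbour list size k is not used here.

```python
import math


def eps_neighbourhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border' or 'noise'."""
    neighbours = [eps_neighbourhood(points, i, eps) for i in range(len(points))]
    # A core point has at least min_pts points in its Eps-neighbourhood.
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")   # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


# Made-up data: a small dense square, one point on its edge, one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]
print(classify_points(points, eps=1.0, min_pts=3))
# → ['core', 'core', 'core', 'core', 'border', 'noise']
```

A full DBSCAN run would then grow clusters outward from the core points; only the classification step is shown here.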
The clustering process is based on the classification of the points in the dataset as core points, border points and noise points, and on the use of density relations between points to form the clusters.

The pseudo code of the DBSCAN algorithm explains how it works: To cluster a dataset, our DBSCAN implementation starts by identifying the k nearest neighbours of each point and identifying the farthest k nearest neighbour. The average of all these distances is then calculated. For each point of the dataset the algorithm identifies the directly density-reachable points using the Eps threshold provided by the user and classifies the points into core or border points. It then loops through all points of the dataset and for the core points it starts to construct a new cluster with the support of the GetDRPoints() procedure that follows the definition of density-reachable points. In this phase the value used as the Eps threshold is the average distance calculated previously. At the end, the composition of the clusters is verified in order to check whether there exist clusters that can be merged together. This can happen if two points of different clusters are at a distance less than Eps.

Note: DBSCAN does not deal very well with clusters of different densities.

SNN ALGORITHM

The SNN algorithm, like DBSCAN, is a density-based clustering algorithm. The main difference between this algorithm and DBSCAN is that it defines the similarity between points by looking at the number of nearest neighbours that two points share. Using this similarity measure in the SNN algorithm, the density is defined as the sum of the similarities of the nearest neighbours of a point. Points with high density become core points, while points with low density represent noise points. All remaining points that are strongly similar to a specific core point will represent a new cluster. The SNN algorithm needs three input parameters:
- K, the neighbours' list size;
- Eps, the threshold density;
- MinPts, the threshold that defines the core points.

The pseudo code of the SNN algorithm explains how it works: Define the input parameters. Find the K nearest neighbours of each point of the dataset.
Then the similarity between pairs of points is calculated in terms of how many nearest neighbours the two points share. Using this similarity measure, the density of each point can be calculated as the number of neighbours with which the number of shared neighbours is equal to or greater than Eps. The points are classified as core points if the density of the point is equal to or greater than MinPts. At this point, the algorithm has all the information needed to start to build the clusters. Those start to be constructed around the core points. However, these clusters do not contain all points. They contain only points that come from regions of relatively uniform density. The points that are not classified into any cluster are classified as noise points.

GRID-BASED CLUSTERING

The grid-based clustering approach uses a multiresolution grid data structure. It quantizes the space into a finite number of cells that form a grid structure on which all the operations for clustering are performed. Grid approaches include STING (STatistical INformation Grid approach) and CLIQUE.

Basic grid-based algorithm:
1. Define a set of grid cells.
2. Assign objects to the appropriate grid cell and compute the density of each cell.
3. Eliminate cells whose density is below a certain threshold t.
4. Form clusters from contiguous (adjacent) groups of dense cells.

The pseudo code of the STING algorithm explains how it works:
- The spatial area is divided into rectangular cells.
- There are several levels of cells corresponding to different levels of resolution.
- Each cell is partitioned into a number of smaller cells in the next level.
- Statistical information about each cell is calculated and stored beforehand and is used to answer queries.
- Parameters of higher-level cells can be easily calculated from parameters of lower-level cells: count, mean, s (standard deviation), min, max, and type of distribution (normal, uniform, etc.).
- Use a top-down approach to answer spatial data queries.
- Start from a pre-selected layer, typically with a small number of cells.
- From the pre-selected layer until you reach the bottom layer, do the following: for each cell in the current level, compute the confidence interval indicating the cell's relevance to a given query;
1. If it is relevant, include the cell in a cluster
2. If it is irrelevant, remove the cell from further consideration
3. Otherwise, look for relevant cells at the next lower layer
- Combine relevant cells into relevant regions (based on grid-neighborhood) and return the clusters so obtained as your answer.

Advantages: query-independent, easy to parallelize, incremental update; O(K), where K is the number of grid cells at the lowest level.
Disadvantages: all the cluster boundaries are either horizontal or vertical, and no diagonal boundary is detected.

MODEL-BASED CLUSTERING

Model-based clustering methods attempt to optimize the fit between the given data and some mathematical model. Such methods are often based on the assumption that the data are generated by a mixture of underlying probability distributions. Model-based clustering methods follow two major approaches: the statistical approach or the neural network approach.

In the neural network approach, as in self-organizing maps (SOMs):
1. Clustering is also performed by having several units competing for the current object
2. The unit whose weight vector is closest to the current object wins
3. The winner and its neighbors learn by having their weights adjusted
4. SOMs are believed to resemble processing that can occur in the brain
5. Useful for visualizing high-dimensional data in 2- or 3-D space

In model-based clustering, the data x are viewed as coming from a mixture density

$$f(x) = \sum_{k=1}^{G} \tau_k f_k(x)$$

where f_k is the probability density function of the observations in group k, and τ_k is the probability that an observation comes from the kth mixture component. Each component is usually modeled by the normal or Gaussian distribution. Component distributions are characterized by the mean μ_k and the covariance matrix Σ_k, and have the probability density function

$$\phi(x_i; \mu_k, \Sigma_k) = \frac{\exp\left\{-\tfrac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)\right\}}{\sqrt{\det(2\pi\Sigma_k)}}$$

For univariate data, the covariance matrix reduces to a scalar variance. The likelihood for data consisting of n observations, assuming a Gaussian mixture model with G multivariate mixture components, is

$$\prod_{i=1}^{n} \sum_{k=1}^{G} \tau_k \, \phi(x_i; \mu_k, \Sigma_k)$$

MCLUST is probably the most well-known model-based clustering software.

This concludes the overview of the various clustering algorithms.

III. HOW TO DETERMINE THE NUMBER OF CLUSTERS

Many clustering algorithms require the specification of the number of clusters to produce in the input data set, prior to execution of the algorithm. Barring knowledge of the proper value beforehand, the appropriate value must be determined, a problem on its own for which a number of techniques have been developed.

- If the number of clusters is known, the termination condition is given.
- In general, set a distance threshold value (termination condition).
- The K-cluster lifetime is the range of threshold values on the dendrogram tree that leads to the identification of K clusters.
- Heuristic rule: cut the dendrogram tree with maximum lifetime.

One simple rule of thumb sets the number of clusters to $k \approx \sqrt{n/2}$, with n as the number of objects.

Elbow criterion

The elbow criterion is a common rule of thumb to determine what number of clusters should be chosen, for example for k-means and agglomerative hierarchical clustering. The elbow criterion says that you should choose a number of clusters so that adding another cluster doesn't add sufficient information. More precisely, if you graph the percentage of variance explained by the clusters against the number of clusters, the first clusters will add much information, but at some point the marginal gain will drop, giving an angle in the graph.
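One way to make the elbow criterion concrete is sketched below in Python. The fraction of variance explained is computed as one minus the ratio of within-cluster to total sum of squares, and the number of clusters is taken as the smallest k for which adding one more cluster gains less than a chosen threshold. The function names, the 0.01 threshold and the hand-made labelings are all assumptions for illustration; in practice the labelings would come from running a clustering algorithm for each k, and the data must not all be identical (otherwise the total sum of squares is zero).

```python
def sum_of_squares(group):
    """Sum of squared distances from each point in `group` to the group mean."""
    dim = len(group[0])
    centre = [sum(p[d] for p in group) / len(group) for d in range(dim)]
    return sum((p[d] - centre[d]) ** 2 for p in group for d in range(dim))


def explained_variance(points, labels):
    """Fraction of total variance explained by the clustering given by `labels`."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    within = sum(sum_of_squares(g) for g in clusters.values())
    return 1.0 - within / sum_of_squares(points)


def elbow(explained, min_gain=0.01):
    """Smallest k whose successor adds less than `min_gain` explained variance.

    `explained` maps consecutive values of k to the explained fraction.
    """
    ks = sorted(explained)
    for k, nxt in zip(ks, ks[1:]):
        if explained[nxt] - explained[k] < min_gain:
            return k
    return ks[-1]


# Made-up 1-D data with three obvious groups, and labelings for k = 1..4.
points = [(1,), (2,), (3,), (11,), (12,), (13,), (21,), (22,), (23,)]
labelings = {
    1: [0, 0, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],
    4: [0, 0, 0, 1, 1, 1, 2, 2, 3],
}
explained = {k: explained_variance(points, labels)
             for k, labels in labelings.items()}
print(elbow(explained))  # → 3
```

A fixed gain threshold is only one reading of "the marginal gain will drop"; inspecting the plotted curve by eye remains the usual practice.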
Another set of methods for determining the number of clusters are information criteria, such as:
- the Akaike information criterion (AIC),
- the Bayesian information criterion (BIC),
- the Deviance information criterion (DIC).

IV. HOW ALGORITHMS ARE COMPARED

The above clustering algorithms are compared according to the following factors:
- the size of the dataset,
- the number of clusters,
- the type of dataset,
- the type of software.

Table 1 explains how the four algorithms are compared and the conclusions are written down.

[Table 1: Partitional, hierarchical, grid-based and model-based algorithms compared by size of dataset, number of clusters, type of dataset and type of software]

V. POSSIBLE APPLICATIONS

Clustering algorithms can be applied in many fields, for instance:
- Marketing: finding groups of customers with similar behavior given a large database of customer data containing their properties and past buying records;
- Financial tasks: forecasting stock markets, currency exchange rates and bank bankruptcies, understanding and managing financial risk, trading futures, credit rating;
- Biology: classification of plants and animals given their features;
- Libraries: book ordering;
- Insurance: identifying groups of motor insurance policy holders with a high average claim cost; identifying frauds;
- City planning: identifying groups of houses according to their house type, value and geographical location;
- Earthquake studies: clustering observed earthquake epicenters to identify dangerous zones;
- WWW: document classification; clustering weblog data to discover groups of similar access patterns.

VI. CONCLUSION

Clustering is a descriptive technique. The solution is not unique and it strongly depends upon the analyst's choices. We described how it is possible to combine different results in order to obtain stable clusters, not depending too much on the criteria selected to analyze data. Clustering always provides groups, even if there is no group structure. When applying a cluster analysis we are hypothesizing that the groups exist. But this assumption may be false or weak. Clustering results should not be generalized.
Cases in the same cluster are similar only with respect to the information the cluster analysis was based on, i.e., the dimensions/variables inducing the dissimilarities.

REFERENCES

1. Han, J. and Kamber, M. Data Mining: Concepts and Techniques, 2001 (Academic Press, San Diego, California, USA).
2. Osama Abu Abbas. Comparisons between clustering algorithms.
3. Pham, D.T. and Afify, A.A. Clustering techniques and their applications in engineering. Submitted to Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science.
4. Jain, A.K. and Dubes, R.C. Algorithms for Clustering Data, 1988 (Prentice Hall, Englewood Cliffs, New Jersey, USA).
5. Bottou, L. and Bengio, Y. Convergence properties of the k-means algorithm. Advances in Neural Information Processing Systems, 1995, 7.
6. Grabmeier, J. and Rudolph, A. Techniques of cluster algorithms in data mining. Data Mining and Knowledge Discovery, 2002, 6.
7. Jain, A.K. (Michigan State University), Murty, M.N. (Indian Institute of Science) and Flynn, P.J. (The Ohio State University). Data Clustering: A Review.
8. Lee, R.C.T. Cluster Analysis and Its Applications. In J.T. Tou, editor, Advances in Information Systems Science. Plenum Press, New York.
9. Fraley, C. and Raftery, A.E. (University of Washington). Model-based Methods of Classification: Using the mclust Software in Chemometrics.
More informationv = x t = x 2 x 1 t 2 t 1 The average speed of the particle is absolute value of the average velocity and is given Distance travelled t
Chapter 2 Motion in One Dimenion 2.1 The Important Stuff 2.1.1 Poition, Time and Diplacement We begin our tudy of motion by conidering object which are very mall in comparion to the ize of their movement
More informationIntroduction to the article Degrees of Freedom.
Introduction to the article Degree of Freedom. The article by Walker, H. W. Degree of Freedom. Journal of Educational Pychology. 3(4) (940) 53-69, wa trancribed from the original by Chri Olen, George Wahington
More informationControl of Wireless Networks with Flow Level Dynamics under Constant Time Scheduling
Control of Wirele Network with Flow Level Dynamic under Contant Time Scheduling Long Le and Ravi R. Mazumdar Department of Electrical and Computer Engineering Univerity of Waterloo,Waterloo, ON, Canada
More informationBi-Objective Optimization for the Clinical Trial Supply Chain Management
Ian David Lockhart Bogle and Michael Fairweather (Editor), Proceeding of the 22nd European Sympoium on Computer Aided Proce Engineering, 17-20 June 2012, London. 2012 Elevier B.V. All right reerved. Bi-Objective
More informationCASE STUDY BRIDGE. www.future-processing.com
CASE STUDY BRIDGE TABLE OF CONTENTS #1 ABOUT THE CLIENT 3 #2 ABOUT THE PROJECT 4 #3 OUR ROLE 5 #4 RESULT OF OUR COLLABORATION 6-7 #5 THE BUSINESS PROBLEM THAT WE SOLVED 8 #6 CHALLENGES 9 #7 VISUAL IDENTIFICATION
More informationTHE IMPACT OF MULTIFACTORIAL GENETIC DISORDERS ON CRITICAL ILLNESS INSURANCE: A SIMULATION STUDY BASED ON UK BIOBANK ABSTRACT KEYWORDS
THE IMPACT OF MULTIFACTORIAL GENETIC DISORDERS ON CRITICAL ILLNESS INSURANCE: A SIMULATION STUDY BASED ON UK BIOBANK BY ANGUS MACDONALD, DELME PRITCHARD AND PRADIP TAPADAR ABSTRACT The UK Biobank project
More informationRisk Management for a Global Supply Chain Planning under Uncertainty: Models and Algorithms
Rik Management for a Global Supply Chain Planning under Uncertainty: Model and Algorithm Fengqi You 1, John M. Waick 2, Ignacio E. Gromann 1* 1 Dept. of Chemical Engineering, Carnegie Mellon Univerity,
More informationExposure Metering Relating Subject Lighting to Film Exposure
Expoure Metering Relating Subject Lighting to Film Expoure By Jeff Conrad A photographic expoure meter meaure ubject lighting and indicate camera etting that nominally reult in the bet expoure of the film.
More informationNETWORK TRAFFIC ENGINEERING WITH VARIED LEVELS OF PROTECTION IN THE NEXT GENERATION INTERNET
Chapter 1 NETWORK TRAFFIC ENGINEERING WITH VARIED LEVELS OF PROTECTION IN THE NEXT GENERATION INTERNET S. Srivatava Univerity of Miouri Kana City, USA hekhar@conrel.ice.umkc.edu S. R. Thirumalaetty now
More informationMECH 2110 - Statics & Dynamics
Chapter D Problem 3 Solution 1/7/8 1:8 PM MECH 11 - Static & Dynamic Chapter D Problem 3 Solution Page 7, Engineering Mechanic - Dynamic, 4th Edition, Meriam and Kraige Given: Particle moving along a traight
More informationQUANTIFYING THE BULLWHIP EFFECT IN THE SUPPLY CHAIN OF SMALL-SIZED COMPANIES
Sixth LACCEI International Latin American and Caribbean Conference for Engineering and Technology (LACCEI 2008) Partnering to Succe: Engineering, Education, Reearch and Development June 4 June 6 2008,
More informationMaximizing Acceptance Probability for Active Friending in Online Social Networks
Maximizing for Active Friending in Online Social Network De-Nian Yang, Hui-Ju Hung, Wang-Chien Lee, Wei Chen Academia Sinica, Taipei, Taiwan The Pennylvania State Univerity, State College, Pennylvania,
More informationA New Optimum Jitter Protection for Conversational VoIP
Proc. Int. Conf. Wirele Commun., Signal Proceing (Nanjing, China), 5 pp., Nov. 2009 A New Optimum Jitter Protection for Converational VoIP Qipeng Gong, Peter Kabal Electrical & Computer Engineering, McGill
More informationBrokerage Commissions and Institutional Trading Patterns
rokerage Commiion and Intitutional Trading Pattern Michael Goldtein abon College Paul Irvine Emory Univerity Eugene Kandel Hebrew Univerity and Zvi Wiener Hebrew Univerity June 00 btract Why do broker
More informationTOWARDS AUTOMATED LIDAR BORESIGHT SELF-CALIBRATION
TOWARDS AUTOMATED LIDAR BORESIGHT SELF-CALIBRATION J. Skaloud a, *, P. Schaer a a TOPO Lab, Ecole Polytechnique Fédérale de Lauanne (EPFL), Station 18, 1015 Lauanne, Switzerland KEY WORDS: airborne laer
More informationMixed Method of Model Reduction for Uncertain Systems
SERBIAN JOURNAL OF ELECTRICAL ENGINEERING Vol 4 No June Mixed Method of Model Reduction for Uncertain Sytem N Selvaganean Abtract: A mixed method for reducing a higher order uncertain ytem to a table reduced
More informationSocially Optimal Pricing of Cloud Computing Resources
Socially Optimal Pricing of Cloud Computing Reource Ihai Menache Microoft Reearch New England Cambridge, MA 02142 t-imena@microoft.com Auman Ozdaglar Laboratory for Information and Deciion Sytem Maachuett
More informationQueueing Models for Multiclass Call Centers with Real-Time Anticipated Delays
Queueing Model for Multicla Call Center with Real-Time Anticipated Delay Oualid Jouini Yve Dallery Zeynep Akşin Ecole Centrale Pari Koç Univerity Laboratoire Génie Indutriel College of Adminitrative Science
More informationOnline story scheduling in web advertising
Online tory cheduling in web advertiing Anirban Dagupta Arpita Ghoh Hamid Nazerzadeh Prabhakar Raghavan Abtract We tudy an online job cheduling problem motivated by toryboarding in web advertiing, where
More informationHUMAN CAPITAL AND THE FUTURE OF TRANSITION ECONOMIES * Michael Spagat Royal Holloway, University of London, CEPR and Davidson Institute.
HUMAN CAPITAL AND THE FUTURE OF TRANSITION ECONOMIES * By Michael Spagat Royal Holloway, Univerity of London, CEPR and Davidon Intitute Abtract Tranition economie have an initial condition of high human
More informationBenchmarking Bottom-Up and Top-Down Strategies for SPARQL-to-SQL Query Translation
Benchmarking Bottom-Up and Top-Down Strategie for SPARQL-to-SQL Query Tranlation Kahlev a, Chebotko b,c, John Abraham b, Pearl Brazier b, and Shiyong Lu a a Department of Computer Science, Wayne State
More informationAssigning Tasks for Efficiency in Hadoop
Aigning Tak for Efficiency in Hadoop [Extended Abtract] Michael J. Ficher Computer Science Yale Univerity P.O. Box 208285 New Haven, CT, USA michael.ficher@yale.edu Xueyuan Su Computer Science Yale Univerity
More informationAvailability of WDM Multi Ring Networks
Paper Availability of WDM Multi Ring Network Ivan Rado and Katarina Rado H d.o.o. Motar, Motar, Bonia and Herzegovina Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, Univerity
More informationFEDERATION OF ARAB SCIENTIFIC RESEARCH COUNCILS
Aignment Report RP/98-983/5/0./03 Etablihment of cientific and technological information ervice for economic and ocial development FOR INTERNAL UE NOT FOR GENERAL DITRIBUTION FEDERATION OF ARAB CIENTIFIC
More informationA Review On Software Testing In SDlC And Testing Tools
www.ijec.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume - 3 Iue -9 September, 2014 Page No. 8188-8197 A Review On Software Teting In SDlC And Teting Tool T.Amruthavalli*,
More informationMANAGING DATA REPLICATION IN MOBILE AD- HOC NETWORK DATABASES (Invited Paper) *
MANAGING DATA REPLICATION IN MOBILE AD- HOC NETWORK DATABASES (Invited Paper) * Praanna Padmanabhan School of Computer Science The Univerity of Oklahoma Norman OK, USA praannap@yahoo-inc.com Dr. Le Gruenwald
More informationThe Arms Race on American Roads: The Effect of SUV s and Pickup Trucks on Traffic Safety
The Arm Race on American Road: The Effect of SUV and Pickup Truck on Traffic Safety Michelle J. White Univerity of California, San Diego, and NBER Abtract Driver have been running an arm race on American
More informationDigital Communication Systems
Digital Communication Sytem The term digital communication cover a broad area of communication technique, including digital tranmiion and digital radio. Digital tranmiion, i the tranmitted of digital pule
More informationMethod of Moments Estimation in Linear Regression with Errors in both Variables J.W. Gillard and T.C. Iles
Method of Moment Etimation in Linear Regreion with Error in both Variable by J.W. Gillard and T.C. Ile Cardiff Univerity School of Mathematic Technical Paper October 005 Cardiff Univerity School of Mathematic,
More informationGrowth and Sustainability of Managed Security Services Networks: An Economic Perspective
Growth and Sutainability of Managed Security Service etwork: An Economic Perpective Alok Gupta Dmitry Zhdanov Department of Information and Deciion Science Univerity of Minneota Minneapoli, M 55455 (agupta,
More informationG*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences
Behavior Reearch Method 007, 39 (), 75-9 G*Power 3: A flexible tatitical power analyi program for the ocial, behavioral, and biomedical cience FRAZ FAUL Chritian-Albrecht-Univerität Kiel, Kiel, Germany
More informationPekka Helkiö, 58490K Antti Seppälä, 63212W Ossi Syd, 63513T
Pekka Helkiö, 58490K Antti Seppälä, 63212W Oi Syd, 63513T Table of Content 1. Abtract...1 2. Introduction...2 2.1 Background... 2 2.2 Objective and Reearch Problem... 2 2.3 Methodology... 2 2.4 Scoping
More informationSector Concentration in Loan Portfolios and Economic Capital. Abstract
Sector Concentration in Loan Portfolio and Economic Capital Klau Düllmann and Nancy Machelein 2 Thi verion: September 2006 Abtract The purpoe of thi paper i to meaure the potential impact of buine-ector
More informationTax Evasion and Self-Employment in a High-Tax Country: Evidence from Sweden
Tax Evaion and Self-Employment in a High-Tax Country: Evidence from Sweden by Per Engtröm * and Bertil Holmlund ** Thi verion: May 17, 2006 Abtract Self-employed individual have arguably greater opportunitie