Tailoring Fuzzy C-Means Clustering Algorithm for Big Data Using Random Sampling and Particle Swarm Optimization


International Journal of Database Theory and Application, pp. 191-202
http://dx.doi.org/10.14257/ijdta.2015.8.3.16

Yang Xianfeng 1 and Liu Pengfei 2
1 School of Information Engineering, Henan Institute of Science and Technology, Henan Xinxiang, China
2 HEBI Colleges of Vocation and Technology, Henan Hebi, China
1 49377535@qq.com, 2 274772453@qq.com

Abstract

As one of the most common data mining techniques, clustering has been widely applied in many fields, among which fuzzy clustering can reflect the real world from a more objective perspective. As one of the most popular fuzzy clustering algorithms, Fuzzy C-Means (FCM) clustering combines fuzzy theory and the K-Means clustering algorithm. However, there are some issues with FCM clustering. For example, FCM is very sensitive to the initialization condition, such as the determination of initial clusters; the speed of convergence is limited, and the global optimal solution is hard to guarantee. Especially in the big data scenario, the overall speed is slow, and it is hard to perform the clustering algorithm on the whole original dataset. To solve the above challenges, in this paper we propose a modified FCM based on Particle Swarm Optimization (PSO). Besides, we also present a multi-round random sampling method to deal with the big data problem, which approximates the clustering on the original big dataset with clustering results on sample datasets. Our experiments show that both the modified FCM using PSO and the multi-round sampling strategy are efficient and effective.

Keywords: Clustering, Fuzzy C-Means (FCM), Particle Swarm Optimization (PSO), Big data

1. Introduction

Clustering [1] is one of the most common data mining techniques. It groups a set of objects in such a way that objects within the same group are similar and objects in different groups are different. Clustering is an unsupervised learning process, and has been widely applied in many fields, such as pattern recognition [2], data analysis [3], image processing [4] and marketing research [5].
Fuzzy clustering (also referred to as soft clustering) can allocate objects to clusters in a fuzzy way, where data objects can belong to more than one cluster and are associated with different membership levels [6]. Therefore, fuzzy clustering can reflect the real world from a more objective perspective. As one of the most popular fuzzy clustering algorithms, Fuzzy C-Means (FCM) clustering [7] combines fuzzy theory and the K-Means clustering algorithm. However, there are some problems with FCM clustering. First, FCM is very sensitive to the initialization condition, such as the determination of initial clusters. Second, the speed of convergence is limited, and the global optimal solution is hard to guarantee, especially in the big data scenario. Third, for big data, the overall speed is slow, and it is hard to perform the clustering algorithm on the whole original dataset.

To solve the above challenges, in this paper we propose a modified FCM algorithm designed especially for big data. First, to deal with the initialization sensitivity, we propose to improve FCM using Particle Swarm Optimization (PSO) [8] by optimizing the initialization procedure of FCM. Second, to solve the challenges of big data, we propose a multi-round random sampling method that approximates the clustering on the original big dataset with clustering results on sample datasets.

The remainder of this paper is organized as follows. Section 2 provides some related work. In Section 3, we present some preliminaries of FCM and PSO. The proposed modified FCM based on PSO and the procedure of FCM clustering on big data are discussed in Section 4. Then, empirical experiments are conducted in Section 5. Finally, the paper is concluded in Section 6.

ISSN: 2005-4270 IJDTA. Copyright (c) 2015 SERSC

2. Related Work

Existing clustering methods can be grouped into partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods. For example, typical partitioning methods include K-Means [9], CLARANS [10], and FREM [11]. Common hierarchical clustering algorithms are BIRCH [12] and CURE [13]. Density-based methods include DBSCAN [14], OPTICS [15], ST-DBSCAN [16], etc. STING [17], CLIQUE [18] and WAVE-CLUSTER [19] are examples of grid-based methods.

Many efforts have been made in fuzzy clustering as well. For example, Ruspini [20] first proposed the concept of fuzzy partitioning by introducing fuzzy theory into clustering analysis. Then, many fuzzy clustering methods were proposed, such as the transitive closure based on fuzzy equivalence relations [21-22], the method based on similarity relations and fuzzy relations [23], the maximum tree based on fuzzy graph theory [24], and methods based on dataset convex decomposition [25] and dynamic programming [26]. Now one of the most common fuzzy clustering algorithms is Fuzzy C-Means (FCM) clustering [27]. However, FCM has some issues, such as sensitivity to the initialization status and premature convergence due to its gradient descent based search strategy. Some researchers introduce the Genetic Algorithm (GA) to solve the local optimum problem [28-29]. However, when the size of the dataset is very large, premature convergence cannot be avoided.
Compared to GA, PSO not only provides global search, but also offers strong local search capability through parameter adjustment, and is easy to implement [30-31]. Therefore, in this paper, we propose to use PSO to improve FCM.

3. Preliminary

3.1. FCM Basics

Let $X = \{x_1, x_2, \dots, x_n\}$ be the set of data points, where $n$ is the number of data objects. The clustering results can be represented by a fuzzy matrix $U = [u_{ij}]_{n \times c}$, where $c$ is the number of clusters, and each element $u_{ij}$ denotes the degree of membership of the $i$-th data point to the $j$-th cluster, $i = 1, 2, \dots, n$; $j = 1, 2, \dots, c$. The memberships satisfy

$$\forall i, j:\ u_{ij} \in [0, 1]; \qquad \forall i:\ \sum_{j=1}^{c} u_{ij} = 1; \qquad \forall j:\ \sum_{i=1}^{n} u_{ij} > 0. \tag{1}$$

FCM is formulated as an optimization problem that, under the above constraints, minimizes the following objective function:

$$J(U, C) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m}\, d^{2}(x_i, c_j), \tag{2}$$

where $C = (c_1, c_2, \dots, c_c)$ and $c_j$ is the centroid of the $j$-th cluster; $d(x_i, c_j) = \lVert x_i - c_j \rVert$ is the Euclidean distance between $x_i$ and $c_j$, written $d_{ij}$ for short; $m$ is the fuzzy weighting exponent, used to control the influence of matrix $U$, and a bigger $m$ means that the clustering results are fuzzier (it is established that $m \in [1.5, 2.5]$ [32]); and $J(U, C)$ is the weighted sum of squares of the distances between data points and the cluster centroids. In this paper, we set $m = 2$.

Applying Lagrange multipliers to solve Equation (2), we have

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}}}, \tag{3}$$

$$c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m}\, x_i}{\sum_{i=1}^{n} u_{ij}^{m}}. \tag{4}$$

The procedure of FCM can be described as follows:

Step 1: given the number of clusters $c$, the size of data samples $n$, the termination threshold $\varepsilon$, the fuzzy exponent $m$, the maximum number of iterations $u$, the randomly generated initial centroid matrix $C^{0}$, and $t = 0$, initialize $U^{0}$ using Equation (3);

Step 2: $t = t + 1$. Update the centroid matrix $C^{t}$ using Equation (4);

Step 3: recalculate the membership matrix $U^{t}$ using Equation (3);

Step 4: if $\lVert U^{t} - U^{t-1} \rVert < \varepsilon$, the algorithm stops; otherwise, go to Step 2.

3.2. PSO Basics

The PSO algorithm simulates the pattern and behavior of a bird flock, and each bird is represented as a particle. Each particle has two important attributes, i.e., location and speed. Suppose there are $N$ particles in a $D$-dimensional space. For particle $i$, its location is $x_i = (x_{i1}, x_{i2}, \dots, x_{iD})$, and its speed is $v_i = (v_{i1}, v_{i2}, \dots, v_{iD})$. The location and speed of particles are updated during the flying process. Suppose the best location of particle $i$ is $p_i = (p_{i1}, p_{i2}, \dots, p_{iD})$, and the overall best location of all particles is $p_g = (p_{g1}, p_{g2}, \dots, p_{gD})$. The speed and location are updated by:

$$v_{id}(t+1) = w\, v_{id}(t) + c_1 r_1 \big(p_{id}(t) - x_{id}(t)\big) + c_2 r_2 \big(p_{gd}(t) - x_{id}(t)\big), \tag{5}$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1), \tag{6}$$

where $i = 1, 2, \dots, N$, $d = 1, 2, \dots, D$, $c_1, c_2$ are learning factors, $r_1, r_2 \in [0, 1]$ are random numbers, and $w$ is an inertia weight (typically $w \in [0.1, 0.9]$). $v_{id}(t)$ is the speed of particle $i$ at iteration $t$, and $x_{id}(t)$ is the location of particle $i$ at iteration $t$.

Equation (5) is the update rule of a particle's speed. Its first component is the current speed of the particle; the second component is the influence of self-recognition, i.e., the distance between the current location and the best location of that particle; and the third component is the influence of the whole population, i.e., the distance between the current location of the particle and the overall best location of all particles. Equation (6) is the update rule of a particle's location, which indicates that the location at iteration $t + 1$ is the location at iteration $t$ plus the flying distance during one iteration. Figure 1 gives the pseudo-code of PSO.

4. Proposed Modified FCM

4.1. Improving FCM with PSO

The gradient descent based FCM algorithm is essentially a local search algorithm, and therefore easily arrives at a local minimum, especially when the size of the data samples is large. Besides, FCM is sensitive to the initialization status, and most FCM algorithms simply choose the initial centroids at random, which in turn affects the convergence speed.
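The basic FCM procedure of Section 3.1 (Steps 1 to 4 with Equations 3 and 4) can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the initial centroids are drawn from the data points, the stopping norm is taken to be the largest absolute change in a membership value, and a point that coincides with a centroid is assigned to it fully. None of these choices is specified in the text.

```python
import random


def sq_dist(x, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def update_memberships(points, centroids, m=2.0):
    """Equation (3): u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))."""
    power = 1.0 / (m - 1.0)  # applied to squared distances, so 2/(m-1) overall
    U = []
    for x in points:
        d2 = [sq_dist(x, c) for c in centroids]
        if any(d == 0.0 for d in d2):
            # Assumption: a point lying on a centroid belongs fully to it.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            U.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** -power for d in d2]
            U.append([v / sum(inv) for v in inv])
    return U


def update_centroids(points, U, m=2.0):
    """Equation (4): c_j = sum_i u_ij^m x_i / sum_i u_ij^m."""
    dim = len(points[0])
    centroids = []
    for j in range(len(U[0])):
        weights = [row[j] ** m for row in U]
        total = sum(weights)
        centroids.append(
            [sum(w * x[d] for w, x in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centroids


def fcm(points, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Steps 1 to 4 of the basic FCM procedure."""
    rng = random.Random(seed)
    # Step 1: random initial centroids (assumption: sampled from the data).
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    U = update_memberships(points, centroids, m)
    for _ in range(max_iter):
        centroids = update_centroids(points, U, m)         # Step 2
        U_new = update_memberships(points, centroids, m)   # Step 3
        change = max(abs(a - b)
                     for new_row, old_row in zip(U_new, U)
                     for a, b in zip(new_row, old_row))
        U = U_new
        if change < eps:                                   # Step 4
            break
    return centroids, U
```

Calling `fcm(points, 2)` on a list of coordinate lists returns the centroids and the membership matrix; each row of the matrix sums to 1, as required by Equation (1).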

Fortunately, the swarm-based PSO algorithm has a strong global search capability, and its convergence speed is fast. Therefore, we propose to combine FCM and PSO to optimize the objective function in Equation (2). Basically, the idea is to represent centroids as particles, and to utilize PSO for global search and FCM for local search.

Figure 1. Algorithm Description of PSO

4.1.1. Particles Encoding: we represent each particle as a feasible solution. That is, the location of each particle consists of $c$ centroids $(c_1, c_2, \dots, c_c)$, where $c_j$ is the centroid of the $j$-th cluster.

4.1.2. Fitness Calculation: recall that our goal is to minimize $J(U, C)$. Therefore, the fitness function can be defined as:

$$f = \frac{1}{J(U, C) + 1}. \tag{7}$$

The larger $f$ is, the smaller $J(U, C)$ is, and the better the solution is.

4.1.3. Timing of Combination: generally, we utilize PSO for global search and FCM for local search. The key is to determine when to apply FCM during the PSO process. In this paper, we propose to perform FCM when the particle swarm converges. Define the degree of convergence by the variance of fitness:

$$\sigma^{2} = \sum_{i=1}^{N} \left( \frac{f_i - f_{avg}}{f} \right)^{2}, \tag{8}$$

where $f_i$ is the fitness of the $i$-th particle, $f_{avg}$ is the average fitness of all particles, and $N$ is the number of particles; the normalizing factor $f$ in the denominator is not defined further in the text. The smaller $\sigma^{2}$ is, the more converged the particles are. We use $\sigma^{2}$ to determine when to apply FCM during the PSO global search. If $\sigma^{2}$ is close to 0, the particles are almost the same, which means PSO has arrived at premature convergence or global convergence. Otherwise, particles with various fitness values are doing random search. If $\sigma^{2}$ is smaller than a threshold, PSO is about to converge, and applying FCM at that time for local search can improve the performance of the global search and avoid premature convergence.

4.1.4. Algorithm Procedure: Figure 2 illustrates the flow chart of the modified FCM with PSO algorithm. Specifically, the procedure can be described as follows:

Step 1: initialize the particle swarm and parameters. Given the number of clusters $c$, the size of data samples $n$, the fuzzy exponent $m$, the maximum number of iterations $u$, and $t = 0$, randomly generate the initial centroid matrix $C^{0}$, and initialize $U^{0}$ using Equation (3). Generate a particle from the centroid vector, and initialize its speed using Equation (5). Run the random centroid generation $N$ times to get $N$ particles.

Step 2: for each particle, calculate its fitness using Equation (7), and get its best location $p_i$.

Step 3: for each particle, if its best location $p_i$ is better than that of any other particle, update the global best location $p_g$ as $p_i$.

Step 4: update the speed and location of all particles using Equations (5) and (6).

Step 5: if the variance of fitness $\sigma^{2}$ of all particles is below the variance threshold, go to Step 6; otherwise, return to Step 2.

Step 6: perform FCM clustering. Calculate the centroids of clusters using Equation (4). Calculate the membership matrix $U$ using Equation (3). Update the corresponding particles based on the centroids.

Step 7: if the termination condition is satisfied, the algorithm stops. Otherwise, return to Step 2.

Figure 2. Flow Chart of Modified FCM Clustering based on PSO

Figure 3. Flow Chart of Multi-round Random Sampling Method
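The building blocks used by Steps 2 to 5 (the fitness of Equation 7, the fitness variance of Equation 8, and the particle update of Equations 5 and 6) can be sketched as follows. The function names and the default values of `w`, `c1` and `c2` are illustrative assumptions. Because the text does not define the normalizing factor in Equation (8), the sketch assumes f = max(1, max |f_i - f_avg|), which is only one plausible choice.

```python
import random


def fitness(J):
    """Equation (7): a smaller objective value J gives a larger fitness."""
    return 1.0 / (J + 1.0)


def fitness_variance(fits):
    """Equation (8) with an assumed normalizer f = max(1, max|f_i - f_avg|)."""
    f_avg = sum(fits) / len(fits)
    norm = max(1.0, max(abs(f - f_avg) for f in fits))  # assumption, see lead-in
    return sum(((f - f_avg) / norm) ** 2 for f in fits)


def pso_step(x, v, p_best, g_best, w=0.7, c1=2.0, c2=2.0, rng=random):
    """Equations (5) and (6) for one particle.

    x, v, p_best and g_best are flat lists of equal length; for the
    centroid encoding of Section 4.1.1 they hold all centroid
    coordinates concatenated.
    """
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        vd_next = w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)  # Eq. (5)
        new_v.append(vd_next)
        new_x.append(xd + vd_next)                                    # Eq. (6)
    return new_x, new_v


def swarm_has_converged(fits, threshold):
    """Step 5: hand over to FCM once the fitness variance is small enough."""
    return fitness_variance(fits) < threshold
```

In a full loop, `swarm_has_converged` would be evaluated after every call to `pso_step`, and a `True` result would trigger the FCM refinement of Step 6 on the centroids encoded by the particles.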

4.2. Procedure for Big Data Clustering

Even though our modified FCM clustering algorithm can solve issues such as initialization sensitivity and low convergence speed, there are still problems in the big data scenario. For example, in Step 1 in Section 4.1, $N$ particles are generated by randomly generating centroids $N$ times on the original dataset. If the size of the dataset $n$ is not very large, the performance is good. However, if $n$ is very big, this step could be extremely time-consuming, and even infeasible to implement. Therefore, in this section, we propose a multi-round random sampling method that approximates the clustering on the original big dataset with clustering results on sample datasets. The basic assumption here is that [33], by performing clustering on a smaller subset, we can approximate the results on the original large dataset.

The multi-round random sampling can be described as follows. First, randomly select a subset from the original dataset $D$, denoted $D_i$, then perform the modified FCM algorithm on $D_i$ as presented in Section 4.1, and get the clustering results $C_i = (c_1^{i}, c_2^{i}, \dots, c_c^{i})$. After that, randomly select another subset of the same size from the remaining dataset, denoted $D_j$, and combine $D_i$ and $D_j$ into $D_{ij}$. Using $C_i = (c_1^{i}, c_2^{i}, \dots, c_c^{i})$ as the initial centroids, perform the modified FCM on $D_{ij}$, and get the clustering results $C_{ij} = (c_1^{ij}, c_2^{ij}, \dots, c_c^{ij})$. Repeatedly select a random subset from the remaining dataset, combine it with the existing samples, and perform the modified FCM on the new dataset, until the difference between the clustering results of two consecutive rounds is small enough. That is, $\lvert C_{12 \dots t} - C_{12 \dots (t-1)} \rvert < \varepsilon$, where $C_{12 \dots t}$ is the clustering result of the $t$-th round, and $\varepsilon$ is the threshold. The flow chart is shown in Figure 3, where the modified FCM algorithm is the one discussed in Section 4.1.

5. Experiment

Our datasets are originally obtained from the UCI database, i.e., iris, glass and wine. To simulate the big data scenario, we enlarge the original datasets by 1000 times in this experiment. The parameter settings are $c = 3$, $m = 2$, $\varepsilon = 10^{-5}$, $N = 30$, $u = 100$, and 3% (presumably the per-round sampling ratio). We define three methods for comparison.
1) FCM: the basic FCM algorithm on a randomly sampled dataset.
2) FCM-PSO: the modified FCM based on PSO on a randomly sampled dataset.
3) FCM-PSO-MRS: the modified FCM based on PSO using the multi-round random sampling method on the original large dataset.

Table 1 gives the comparison of average precision for the three methods on the UCI datasets and the large-scale datasets, respectively. We have the following observations. First, the performance of FCM-PSO is better than that of FCM, which means our modified strategy based on PSO can indeed improve the precision of clustering. Second, FCM-PSO-MRS is better than the other two methods, which means that our multi-round random sampling strategy is better than one-round random sampling. Third, in the big data scenario, FCM-PSO-MRS preserves good performance, while FCM and FCM-PSO suffer some degradation, and thus FCM-PSO-MRS is suitable for big data clustering. Specifically, comparing the smaller UCI datasets with the 1000-times enlarged datasets, the precision of FCM decreases the most when the dataset is large (on Iris, from 80.58% to 67.23%), while the performance of the proposed FCM-PSO-MRS is almost the same regardless of the dataset size (on Iris, 89.12% versus 88.92%).

Table 1. The Average Precision of Clustering

Dataset       FCM       FCM-PSO   FCM-PSO-MRS
Iris          80.58%    84.37%    89.12%
Glass         71.34%    78.92%    81.05%
Wine          63.41%    69.56%    73.87%
Iris*1000     67.23%    70.84%    88.92%
Glass*1000    60.67%    74.43%    80.85%
Wine*1000     58.50%    67.32%    73.28%

Besides, we evaluate the convergence performance of our modified FCM-PSO algorithm on the big datasets in Figures 4, 5 and 6. Note that in these experiments, to be fair, both FCM and FCM-PSO are based on one-round random sampling, in order to compare the convergence speed. For all three datasets we reach the same conclusion: our modified FCM-PSO converges faster than FCM, with a better fitness value.

Figure 4. Performance of FCM-PSO for Iris*1000 Dataset
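The multi-round random sampling loop of Section 4.2, which produces the FCM-PSO-MRS results, can be sketched as follows. This is a hedged illustration, not the authors' code: `cluster` is a hypothetical callable standing in for the modified FCM, the difference between two rounds is measured as the largest distance between matching centroids, and the 3% default for `sample_frac` assumes that the 3% in the parameter settings is the sampling ratio.

```python
import random


def max_centroid_shift(old, new):
    """Largest Euclidean distance between matching centroids of two rounds."""
    return max(
        sum((a - b) ** 2 for a, b in zip(c_old, c_new)) ** 0.5
        for c_old, c_new in zip(old, new)
    )


def multi_round_sampling(data, cluster, sample_frac=0.03, eps=1e-5, seed=0):
    """Grow a random sample round by round until the centroids stop moving.

    cluster(points, init_centroids) is a hypothetical stand-in for the
    modified FCM. It must return a list of centroids and accept
    init_centroids=None in the first round.
    """
    rng = random.Random(seed)
    remaining = list(data)
    rng.shuffle(remaining)  # slicing a shuffled list samples without replacement
    size = max(1, int(len(remaining) * sample_frac))

    sample, remaining = remaining[:size], remaining[size:]
    centroids = cluster(sample, None)               # round 1: no warm start
    rounds = 1
    while remaining:
        sample = sample + remaining[:size]          # merge a fresh subset
        remaining = remaining[size:]
        new_centroids = cluster(sample, centroids)  # previous result as init
        shift = max_centroid_shift(centroids, new_centroids)
        centroids, rounds = new_centroids, rounds + 1
        if shift < eps:                             # two rounds agree: stop
            break
    return centroids, rounds
```

Comparing centroids position by position relies on the warm start keeping the cluster order stable between rounds; a clusterer that may permute its output would need an explicit matching step first.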

6. Conclusion

In this paper, we focus on the modification of FCM clustering, considering its limitations such as sensitivity to the initialization status and slow convergence. Specifically, we propose to combine PSO with FCM. Moreover, we present a multi-round random sampling method to deal with the big data challenge. However, the datasets used in this study were obtained by expanding UCI datasets, and are therefore more like simulated datasets. In future work, we would like to apply the proposed clustering to real-world big data scenarios, such as large-scale social networks, and to explore more clustering applications in big data.

Figure 5. Performance of FCM-PSO for Glass*1000 Dataset

Figure 6. Performance of FCM-PSO for Wine*1000 Dataset

References

[1] A. K. Jain, M. N. Murty and P. J. Flynn, "Data clustering: a review", ACM Computing Surveys (CSUR), vol. 31, no. 3, (1999), pp. 264-323.
[2] A. Baraldi and P. Blonda, "A survey of fuzzy clustering algorithms for pattern recognition. I.", Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions, vol. 29, no. 6, (1999), pp. 778-785.
[3] P. Berkhin, "A survey of clustering data mining techniques", Grouping multidimensional data, Springer Berlin Heidelberg, (2006), pp. 25-71.
[4] B. N. Li, "Integrating spatial fuzzy clustering with level set methods for automated medical image segmentation", Computers in Biology and Medicine, vol. 41, no. 1, (2011), pp. 1-10.
[5] D. Won, B. M. Song and D. McLeod, "An approach to clustering marketing data", Proceedings of the 2nd International Advanced Database Conference, (2006).
[6] D. Graves and W. Pedrycz, "Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study", Fuzzy Sets and Systems, vol. 161, no. 4, (2010), pp. 522-543.
[7] J. C. Bezdek, R. Ehrlich and W. Full, "FCM: The fuzzy c-means clustering algorithm", Computers & Geosciences, vol. 10, no. 2, (1984), pp. 191-203.
[8] J. Kennedy, "Particle swarm optimization", Encyclopedia of Machine Learning, Springer US, (2010), pp. 760-766.
[9] A. K. Jain, "Data clustering: 50 years beyond K-means", Pattern Recognition Letters, vol. 31, no. 8, (2010), pp. 651-666.
[10] R. T. Ng and J. Han, "CLARANS: A method for clustering objects for spatial data mining", Knowledge and Data Engineering, IEEE Transactions, vol. 14, no. 5, (2002), pp. 1003-1016.
[11] C. Ordonez and E. Omiecinski, "FREM: fast and robust EM clustering for large data sets", Proceedings of the eleventh international conference on Information and knowledge management, ACM, (2002).
[12] T. Zhang, R. Ramakrishnan and M. Livny, "BIRCH: an efficient data clustering method for very large databases", ACM SIGMOD Record, vol. 25, no. 2, (1996).
[13] S. Guha, R. Rastogi and K. Shim, "CURE: an efficient clustering algorithm for large databases", ACM SIGMOD Record, vol. 27, no. 2, ACM, (1998).
[14] M. Ester, "A density-based algorithm for discovering clusters in large spatial databases with noise", Kdd, vol. 96, (1996).
[15] M. Ankerst, "Optics: Ordering points to identify the clustering structure", ACM Sigmod Record, vol. 28, no. 2, ACM, (1999).
[16] D. Birant and A. Kut, "ST-DBSCAN: An algorithm for clustering spatial temporal data", Data & Knowledge Engineering, vol. 60, no. 1, (2007), pp. 208-221.
[17] W. Wang, J. Yang and R. Muntz, "STING: A statistical information grid approach to spatial data mining", VLDB, vol. 97, (1997).
[18] R. Agrawal, "Automatic subspace clustering of high dimensional data for data mining applications", ACM, vol. 27, no. 2, (1998).
[19] G. Sheikholeslami, S. Chatterjee and A. Zhang, "Wave-Cluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases", Proc. 24th Int. Conf. on Very Large Data Bases, (1998), New York.
[20] E. H. Ruspini, "A new approach to clustering", Information and Control, vol. 15, no. 1, (1969), pp. 22-32.
[21] G.-S. Liang, T.-Y. Chou and T.-C. Han, "Cluster analysis based on fuzzy equivalence relation", European Journal of Operational Research, vol. 166, no. 1, (2005), pp. 160-171.
[22] D. Boixader, J. Jacas and J. Recasens, "Fuzzy equivalence relations: advanced material", Fundamentals of Fuzzy Sets, Springer US, (2000), pp. 261-290.
[23] M.-S. Yang and H.-M. Shih, "Cluster analysis based on fuzzy relations", Fuzzy Sets and Systems, vol. 120, no. 2, (2001), pp. 197-212.
[24] Z. Wu and R. Leahy, "An optimal graph theoretic approach to data clustering: Theory and its application to image segmentation", Pattern Analysis and Machine Intelligence, IEEE Transactions, (1993).
[25] J. C. Bezdek and J. D. Harris, "Fuzzy partitions and relations; an axiomatic basis for clustering", Fuzzy Sets and Systems, vol. 1, no. 2, (1978), pp. 111-127.
[26] A. O. Esogbue, "Optimal clustering of fuzzy data via fuzzy dynamic programming", Fuzzy Sets and Systems, vol. 18, no. 3, (1986), pp. 283-298.
[27] D. Graves and W. Pedrycz, "Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study", Fuzzy Sets and Systems, vol. 161, no. 4, (2010), pp. 522-543.
[28] K. Krishna and M. N. Murty, "Genetic K-means algorithm", Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions, vol. 29, no. 3, (1999), pp. 433-439.
[29] U. Maulik and S. Bandyopadhyay, "Genetic algorithm-based clustering technique", Pattern Recognition, vol. 33, no. 9, (2000), pp. 1455-1465.
[30] J.-C. Zeng and Z.-H. Cui, "A guaranteed global convergence particle swarm optimizer", Journal of Computer Research and Development, vol. 41, no. 8, (2004), pp. 1333-1338.
[31] L. Xiao-qing and J. Su-n, "PSO spatial clustering with obstacles constraints", Computer Engineering and Design, vol. 28, no. 24, pp. 5924-5926.
[32] N. R. Pal and J. C. Bezdek, "On cluster validity for the fuzzy c-means model", Fuzzy Systems, IEEE Transactions, vol. 3, no. 3, (1995), pp. 370-379.
[33] M.-C. Hung and D.-L. Yang, "An efficient fuzzy c-means clustering algorithm", Data Mining, 2001, ICDM 2001, Proceedings IEEE International Conference, (2001).

Authors

Yang Xianfeng received his bachelor's degree in Computer Application Technology from Henan Normal University, Xinxiang, Henan, China, in 2001, and his master's degree in Computer Application Technology from China University of Petroleum, Dongying, China, in 2007. He is now a lecturer at the School of Information Engineering, Henan Institute of Science and Technology, Xinxiang, China. His current research interests include pattern recognition, image processing, neural networks, and natural language processing.

Liu Pengfei received his bachelor's degree in Computer Science and Technology from Anyang Normal University, Anyang, China, in 2004, and his master's degree in Computer Application from Huazhong University of Science and Technology, Wuhan, China, in 2010. He is now a lecturer at Hebi College of Vocation and Technology. His current research interests include computer networks, web applications, and network operating systems.