CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON MAPREDUCE FRAMEWORK
Sheela Gole 1 and Bharat Tidke 2
1 Department of Computer Engineering, Flora Institute of Technology, Pune, India

ABSTRACT
Nowadays an enormous amount of data is being explored through the Internet of Things (IoT) as technologies advance and people use these technologies in day-to-day activities. This data is termed Big Data, with its own characteristics and challenges. Frequent Itemset Mining algorithms aim to disclose frequent itemsets from transactional databases, but as the dataset size increases, the data can no longer be handled by traditional frequent itemset mining. The MapReduce programming model solves the problem of large datasets, but it has a large communication cost which reduces execution efficiency. This paper proposes a new pre-processing technique based on k-means, applied to the BigFIM algorithm. ClustBigFIM uses a hybrid approach: clustering with the k-means algorithm to generate clusters from huge datasets, and Apriori and Eclat to mine frequent itemsets from the generated clusters using the MapReduce programming model. Results show that the execution efficiency of the ClustBigFIM algorithm is increased by applying the k-means clustering algorithm before the BigFIM algorithm as a pre-processing technique.

KEYWORDS
Association Rule Mining, Big Data, Clustering, Frequent Itemset Mining, MapReduce.

1. INTRODUCTION
Data mining and KDD (Knowledge Discovery in Databases) are essential techniques for discovering hidden information in large datasets with various characteristics. Nowadays Big Data has bloomed in various areas such as social networking, retail, web blogs, forums and online groups [1]. Frequent Itemset Mining (FIM) is one of the important techniques of Association Rule Mining (ARM). The goal of FIM techniques is to reveal frequent itemsets in transactional databases. Agrawal et al. [2] put forward the Apriori algorithm, which generates frequent itemsets having a frequency greater than a given minimum support. It is not efficient on a single computer when the dataset size increases. An enormous amount of work has been put forward to uncover frequent items.
There exist various parallel and distributed algorithms which work on large datasets, but they have memory and I/O cost limitations and cannot handle Big Data [3] [4]. MapReduce, developed by Google [5], together with the Hadoop Distributed File System (HDFS), is exploited to find frequent itemsets in Big Data on large clusters. MapReduce uses a parallel computing approach and HDFS is a fault-tolerant system. MapReduce has Map and Reduce functions; the data flow in MapReduce is shown in Figure 1.
DOI: /ijfct
Figure 1. Map-Reduce Data flow.

In this paper, based on the BigFIM algorithm, a new algorithm optimizing the speed of BigFIM is proposed. Firstly, clusters are generated from Big Datasets using parallel k-means clustering. Then the clusters are mined using the ClustBigFIM algorithm, effectively increasing the execution efficiency. This paper is organized as follows. Section 2 gives an overview of related work on frequent itemset mining. Section 3 gives an overview of the background theory for ClustBigFIM. Section 4 explains the pseudo code of ClustBigFIM. The experimental results with comparative analysis are given in Section 5. Section 6 concludes the paper.

2. RELATED WORK
Various sequential and parallel frequent itemset mining algorithms are available [5] [6] [7] [8] [9] [10], but there is a need for FIM algorithms which can handle Big Data. This section gives an insight into frequent itemset mining which exploits the MapReduce framework. The existing algorithms face challenges when dealing with Big Data.

A parallel implementation of the traditional Apriori algorithm based on the MapReduce framework was put forward by Lin et al. [11], and Li et al. [12] also proposed a parallel implementation of the Apriori algorithm. Hammoud [13] put forward the MRApriori algorithm, which is based on the MapReduce programming model and the classic Apriori algorithm. It does not require repetitive scans of the database because it uses iterative horizontal and vertical switching. A parallel implementation of the FP-Growth algorithm was put forward in [14].

Liu et al. [15] put forward the IOMRA algorithm, a modified FAMR algorithm that optimizes execution efficiency by pre-processing with AprioriTID, which removes all low-frequency 1-item itemsets from the given database. The longest possible candidate itemset size is then determined using the length of each transaction and the minimum support.
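The pre-processing idea mentioned above, removing low-frequency single items before mining, can be illustrated with a small map/reduce-style sketch in plain Python. This is not the IOMRA implementation; the function names (`map_items`, `reduce_counts`, `prune_infrequent`) and the toy transactions are hypothetical, and the strict comparison `> min_support` is an assumption that follows the "frequency greater than minimum support" wording used in this paper.

```python
from collections import Counter
from itertools import chain


def map_items(transaction):
    """Map step: emit an (item, 1) pair for every item of one transaction."""
    return [(item, 1) for item in transaction]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per item."""
    counts = Counter()
    for item, n in pairs:
        counts[item] += n
    return counts


def prune_infrequent(transactions, min_support):
    """Remove every item whose support is not greater than min_support."""
    pairs = chain.from_iterable(map_items(t) for t in transactions)
    counts = reduce_counts(pairs)
    frequent = {item for item, n in counts.items() if n > min_support}
    return [[item for item in t if item in frequent] for t in transactions]


# Hypothetical toy database: item "d" occurs only once and is removed.
transactions = [["a", "b", "c"], ["a", "c"], ["a", "d"], ["b", "c"]]
pruned = prune_infrequent(transactions, min_support=1)
```

In a real Hadoop job the map and reduce steps would run on different nodes; here they are ordinary functions so that the data flow stays visible.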
Moens et al. [16] put forward two algorithms, Dist-Eclat and BigFIM. Dist-Eclat is a distributed version of the Eclat algorithm which mines prefix trees and extracts frequent itemsets faster, but it is not scalable enough. BigFIM applies the Apriori algorithm before Dist-Eclat to handle frequent itemsets up to size k, and the next (k+1)-itemsets are extracted using the Eclat algorithm, but BigFIM has limitations on speed. Both algorithms are based on the MapReduce framework. Moens has also proposed an implementation of the Dist-Eclat and BigFIM algorithms using Mahout.

Approximate frequent itemsets are mined using the PARMA algorithm put forward by Riondato et al. [17]. The k-means clustering algorithm is used for finding clusters, which are called sample lists. Frequent itemsets are extracted very fast, reducing execution time.

Malek and Kadima [18] put forward parallel k-means clustering, which uses the MapReduce programming model to generate clusters in parallel and so improves the performance of the traditional k-means algorithm. It has Map, Combine and Reduce functions which use (key, value) pairs. Distances between sample points and random centres are calculated for all points using the map function. Intermediate output values from the map function are combined using the combiner function. All samples are assigned to the closest cluster using the reduce function.

3. BACKGROUND
3.1. Problem Statement
Let $I = \{i_1, i_2, i_3, \ldots, i_n\}$ be a set of items. A set of items $X = \{i_1, i_2, i_3, \ldots, i_k\} \subseteq I$ is called a k-itemset. The transactions $t_1, t_2, t_3, \ldots, t_m$ are each denoted as $T = (tid, I)$, where $tid$ is the transaction ID and, overloading the notation, $I$ is the set of items of that transaction. $T \in D$, where $D$ is a transactional database. The cover of an itemset $X$ in $D$ is the set of IDs of the transactions that contain $X$:

$$Cover(X, D) = \{tid \mid (tid, I) \in D,\ X \subseteq I\}$$

The support of an itemset $X$ in $D$ is the number of transactions that contain $X$:

$$Support(X, D) = |Cover(X, D)|$$

An itemset is called frequent when its support exceeds the absolute minimum support threshold $\sigma_{abs}$, with $0 \le \sigma_{abs} \le |D|$.

Partitioning of transactions into a set of groups is called clustering.
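The cover and support definitions of the problem statement can be checked on a toy database. The sketch below is only an illustration: the database contents are hypothetical, and each transaction is represented as a `(tid, items)` pair as in the notation $T = (tid, I)$.

```python
def cover(itemset, database):
    """Return the IDs of all transactions whose item set contains `itemset`."""
    wanted = set(itemset)
    return {tid for tid, items in database if wanted <= set(items)}


def support(itemset, database):
    """Support(X, D) = |Cover(X, D)|."""
    return len(cover(itemset, database))


# Hypothetical transactional database D with four transactions.
database = [
    (1, {"a", "b", "c"}),
    (2, {"a", "c"}),
    (3, {"a", "d"}),
    (4, {"b", "c"}),
]

cover_ac = cover({"a", "c"}, database)      # transactions 1 and 2 contain both items
support_ac = support({"a", "c"}, database)
```

With an absolute threshold $\sigma_{abs} = 1$, the itemset {a, c} would count as frequent here, while {d}, with support 1, would not.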
Let $S$ be the number of clusters; then $\{C_1, C_2, C_3, \ldots, C_S\}$ is a set of clusters formed from $\{t_1, t_2, t_3, \ldots, t_m\}$, where $m$ is the number of transactions. Each transaction is assigned to only one cluster, i.e. $C_p \neq \emptyset$ and $C_p \cap C_q = \emptyset$ for $1 \le p, q \le S$, $p \neq q$; each $C_p$ is called a cluster. Let $\mu_z$ be the mean of cluster $C_z$. The squared error between the mean of a cluster and the transactions in the cluster is given by

$$J(C_z) = \sum_{t_i \in C_z} \lVert t_i - \mu_z \rVert^2$$

k-means is used for minimizing the sum of squared errors over all $S$ clusters, which is given by

$$J(C) = \sum_{z=1}^{S} \sum_{t_i \in C_z} \lVert t_i - \mu_z \rVert^2$$

The k-means algorithm starts with one cluster and assigns each transaction to the cluster with minimum squared error.
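A minimal sketch of one k-means round and of the objective $J(C)$ follows. It assumes, purely for illustration, that each transaction has already been encoded as a numeric vector (for example a 0/1 vector over the items); the paper does not specify the encoding. The helper names (`sq_dist`, `nearest`, `kmeans_round`, `sse`) are hypothetical, and the map, combine and reduce roles are indicated only in comments.

```python
def sq_dist(p, q):
    """Squared Euclidean distance ||p - q||^2."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centres):
    """Index z of the centre with the smallest squared distance to `point`."""
    return min(range(len(centres)), key=lambda z: sq_dist(point, centres[z]))


def kmeans_round(points, centres):
    """One iteration: assign each point to its closest centre, then recompute means."""
    sums, counts = {}, {}
    for p in points:
        z = nearest(p, centres)                      # map: distance to every centre
        acc = sums.setdefault(z, [0.0] * len(p))     # combine: partial vector sums
        sums[z] = [s + x for s, x in zip(acc, p)]
        counts[z] = counts.get(z, 0) + 1
    new_centres = [list(c) for c in centres]
    for z in sums:                                   # reduce: new mean per cluster
        new_centres[z] = [s / counts[z] for s in sums[z]]
    return new_centres


def sse(points, centres):
    """J(C): sum over all points of the squared distance to the closest centre."""
    return sum(sq_dist(p, centres[nearest(p, centres)]) for p in points)


# Hypothetical transactions encoded as binary item vectors, with S = 2 centres.
points = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
centres = [[1, 1, 0], [0, 0, 1]]
updated = kmeans_round(points, centres)
```

Repeating `kmeans_round` until the centres stop changing gives the usual convergence loop; on this toy input `sse` does not increase from one round to the next.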
3.2. Apriori Algorithm
Apriori is the first frequent itemset mining algorithm, put forward by Agrawal et al. [19]. A transactional database has transaction identifiers and the set of items present in each transaction. The Apriori algorithm scans the horizontal database and finds frequent items of size 1 using the minimum support condition. From the frequent items discovered in iteration 1, candidate itemsets are formed, and frequent itemsets of size two are extracted using the minimum support condition. This process is repeated until either the list of candidate itemsets or the list of frequent itemsets is empty. It requires repetitive scans of the database. The monotonicity property is used for pruning infrequent itemsets.

3.3. Eclat Algorithm
The Eclat algorithm was proposed by Zaki et al. [20] and works on a vertical database. The TID list of each item is calculated, and the intersection of the TID lists of items is used for extracting frequent itemsets of size k+1. There is no need for iterative scans of the database, but it is expensive to manipulate large TID lists.

3.4. k-means Algorithm
The k-means algorithm [21] is a well-known clustering technique which takes the number of clusters as input. Random points are chosen as centres of gravity, and a distance measure is used to calculate the distance of each point from the centres of gravity. Each point is assigned to only one cluster based on high intra-cluster similarity and low inter-cluster similarity.

4. CLUSTBIGFIM ALGORITHM
This section gives the high level architecture of the ClustBigFIM algorithm and the pseudo code of the phases used in it.

4.1. High Level Architecture
Figure 2. High Level Architecture of ClustBigFIM Algorithm

Clustering is applied to the large dataset as a pre-processing technique, and then frequent itemsets are mined from the clustered data using the frequent itemset mining algorithms Apriori and Eclat.
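The level-wise procedure described for Apriori in Section 3.2 can be illustrated with a short single-machine sketch. It is a hedged illustration rather than the authors' MapReduce implementation: the function name `apriori`, the `max_size` parameter (standing in for the size k up to which itemsets are mined) and the toy data are assumptions, and the strict test `> min_support` follows the paper's wording.

```python
from itertools import combinations


def apriori(transactions, min_support, max_size):
    """Level-wise mining of frequent itemsets up to `max_size` items."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the database.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    size = 1
    while current and size <= max_size:
        # One scan of the horizontal database counts all candidates of this size.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n > min_support}
        frequent.update(level)
        # Candidate generation: join frequent sets of this level and keep only
        # those whose subsets are all frequent (monotonicity property).
        size += 1
        current = set()
        for a, b in combinations(list(level), 2):
            cand = a | b
            if len(cand) == size and all(
                frozenset(s) in level for s in combinations(cand, size - 1)
            ):
                current.add(cand)
    return frequent


# Hypothetical toy database.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
result = apriori(transactions, min_support=1, max_size=3)
```

On this input the candidate {a, b, c} is never counted, because its subset {a, b} is not frequent; that is the pruning the monotonicity property provides.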
4.2. ClustBigFIM on MapReduce
The ClustBigFIM algorithm has the following phases:
a. Find Clusters
b. Finding k-FIs
c. Generate single global TID list
d. Mining of subtrees

4.2.1. Find Clusters
The k-means clustering algorithm is used for finding clusters in the given large dataset. Clusters of transactions are formed based on the formula below, which calculates the minimum squared error and assigns each transaction to a cluster:

$$J(C_z) = \sum_{t_i \in C_z} \lVert t_i - \mu_z \rVert^2$$

The input to this phase is the transaction dataset and the number of clusters; clusters of transactions are generated, such as $C = \{t_1, t_{10}, \ldots\}$.

Input : Cluster Size and Dataset
Output : Clusters with size z
Steps :
1. Find the distance between centres and transaction IDs in the map phase.
2. Use the combiner function to combine the results of the above step.
3. Compute the squared error using the formulas below and assign all points to clusters in the reduce phase,
   $$J(C_z) = \sum_{t_i \in C_z} \lVert t_i - \mu_z \rVert^2$$
   $$J(C) = \sum_{z=1}^{S} \sum_{t_i \in C_z} \lVert t_i - \mu_z \rVert^2$$
4. Repeat steps 1-3 by changing the centres, and stop when the convergence criterion is reached.

4.2.2. Finding k-FIs
Transaction ID lists for large datasets cannot be handled by the Eclat algorithm, so frequent itemsets of size k are mined from the clusters generated in the above phase using the Apriori algorithm based on the minimum support condition, which handles the problem of large datasets. A prefix tree is generated using the frequent itemsets.
Input : Cluster Size, Minimum threshold σ, prefix length (l)
Output : Prefixes with length l and k-FIs
Steps :
5. Find the support of all items in a cluster using the Apriori algorithm.
6. Apply $Support(x_i) > \sigma$ and calculate FIs using the monotonic property.
7. Repeat steps 5-6 until all k-FIs are calculated using mappers and reducers.
8. Repeat steps 5-7 for clusters (1 to S) and find the final k-FIs.
9. Keep the created prefixes in lexicographic order using a lexicographic prefix tree.

4.2.3. Generate single global TID list
The Eclat algorithm uses a vertical database: items, each with the list of transactions in which the item is present. The global TID list is generated by combining local TID lists using mappers and reducers. The generated TID list is used in the next phase.

Input : Prefix Tree, Minimum Support σ
Output : Single TID list of all items
Steps :
10. Calculate TID lists using the prefix tree in the map phase.
11. Create a single TID list from the TID lists generated in the above step. Perform pruning with $support(i_a) \le support(i_b) \Leftrightarrow a < b$.
12. Generate prefix groups, $P_k = (P_k^1, P_k^2, \ldots, P_k^n)$.

4.2.4. Mining of Subtrees
The next (k+1)-FIs are mined using the Eclat algorithm. The prefix tree generated in phase 2 is mined independently by the mappers, and frequent itemsets are generated.

Input : Prefix tree, Minimum Support σ
Output : k-FIs
Steps :
13. Apply the Eclat algorithm and find FIs till size k.
14. Repeat step 13 for each subtree in the map phase.
15. Find all frequent items of size k and store them in compressed trie format.
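The TID-list phases above (steps 10-14) can be pictured with a compact, single-machine Eclat sketch. It is an illustration under stated assumptions, not the ClustBigFIM code: the names `vertical` and `eclat` are hypothetical, the items are ordered by ascending support as suggested by the ordering rule in step 11, and the recursion over suffixes plays the role that independent subtrees play for the mappers.

```python
def vertical(database):
    """Convert (tid, items) pairs into a vertical layout: item -> set of TIDs."""
    tidlists = {}
    for tid, items in database:
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists


def eclat(prefix, candidates, min_support, out):
    """Depth-first extension of `prefix` by intersecting TID lists."""
    for i, (item, tids) in enumerate(candidates):
        if len(tids) > min_support:
            itemset = prefix + (item,)
            out[itemset] = len(tids)
            # The subtree below `itemset`: intersect with every later item.
            suffix = [(other, tids & other_tids)
                      for other, other_tids in candidates[i + 1:]]
            eclat(itemset, suffix, min_support, out)
    return out


# Hypothetical toy database.
database = [
    (1, {"a", "b", "c"}),
    (2, {"a", "c"}),
    (3, {"a", "d"}),
    (4, {"b", "c"}),
]
tidlists = vertical(database)
# Order items by ascending support (ties broken by name).
ordered = sorted(tidlists.items(), key=lambda kv: (len(kv[1]), kv[0]))
found = eclat((), ordered, min_support=1, out={})
```

No database scan is repeated after the vertical layout has been built; every support comes from the size of an intersected TID set, which is also why very long TID lists make this step expensive.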
5. EXPERIMENTS
This section gives an overview of the datasets used and the experimental results with comparative analysis. Two machines are to be used for the experiments. Each machine contains an Intel Core i5-3230M processing unit and 6.00 GB RAM, with Ubuntu and Hadoop. Currently the algorithm runs on a single pseudo-distributed Hadoop cluster. Datasets from the standard UCI repository and the FIMI repository are used in order to compare results with existing systems such as Dist-Eclat and BigFIM.

5.1. Dataset Information
Experiments are performed on the following datasets:
Mushroom - Provided by the FIMI repository [22]; has 119 items and 8,124 transactions.
T10I4D100K - Provided by the UCI repository [23]; has 870 items and 100,000 transactions.
Retail - Provided by the UCI repository [23].
Pumsb - Provided by the FIMI repository [22]; has 49,046 transactions.

5.2. Results Analysis
Experiments are performed on the T10I4D100K, Retail, Mushroom and Pumsb datasets, and the execution time required for generating k-FIs is compared based on the number of mappers and the Minimum Support. Results show that Dist-Eclat is faster than the BigFIM and ClustBigFIM algorithms on T10I4D100K, but the Dist-Eclat algorithm does not work on large datasets such as Pumsb. Dist-Eclat is not scalable enough and faces memory problems as the dataset size increases.

Experiments were performed on the T10I4D100K dataset in order to compare execution time for different Minimum Support values and numbers of mappers on Dist-Eclat, BigFIM and ClustBigFIM. Table 1 shows the Execution Time (Sec) for the T10I4D100K dataset with different values of Minimum Support and 6 mappers. Figure 3 shows the timing comparison for the various methods on the T10I4D100K dataset, which shows that Dist-Eclat has faster performance than the BigFIM and ClustBigFIM algorithms. Execution time decreases as the Minimum Support value increases, which shows the effect of Minimum Support on execution time.

Table 2 shows the Execution Time (Sec) for the T10I4D100K dataset with different numbers of mappers and Minimum Support 100. Figure 4 shows the timing comparison for the various methods on the T10I4D100K dataset, which shows that Dist-Eclat has faster performance than the BigFIM and ClustBigFIM algorithms. Execution time increases as the number of mappers increases, because the communication cost between mappers and reducers increases.

Table 1. Execution Time (Sec) for T10I4D100K with different Support.
[Dataset: T10I4D100K. Columns: Min. Support, Dist-Eclat, BigFIM, ClustBigFIM. Fixed parameter: No. of Mappers.]
Table 2. Execution Time (Sec) for T10I4D100K with different No. of Mappers
[Dataset: T10I4D100K. Columns: Number of Mappers, Dist-Eclat, BigFIM, ClustBigFIM. Fixed parameter: Minimum Support.]

Figure 3. Timing comparison for various methods and Minimum Support on T10I4D100K

Figure 4. Timing comparison for different methods and No. of Mappers on T10I4D100K
Results show that the ClustBigFIM algorithm works on Big Data. Experiments are performed on the Pumsb dataset. The Dist-Eclat algorithm faced memory problems with the Pumsb dataset, so the results of ClustBigFIM are compared with the BigFIM algorithm, which is scalable. Table 3 and Table 4 show the execution time taken by the BigFIM and ClustBigFIM algorithms on the Pumsb dataset with variable Minimum Support and No. of Mappers. The number of mappers is 20 and the Minimum Support is … for the experiments. Figure 3, Figure 5 and Figure 6 show that the ClustBigFIM algorithm has better performance than the BigFIM algorithm due to pre-processing.

Table 3. Execution Time (Sec) for Pumsb with different Support.
[Dataset: Pumsb. Columns: Min. Support, BigFIM, ClustBigFIM. No. of Mappers - 20.]

Table 4. Execution Time (Sec) for Pumsb with different No. of Mappers
[Dataset: Pumsb. Columns: Number of Mappers, BigFIM, ClustBigFIM. Fixed parameter: Minimum Support.]

Figure 5. Timing comparison for different methods and Minimum Support on Pumsb
Figure 6. Timing comparison for different methods and No. of Mappers on Pumsb

6. CONCLUSIONS
In this paper we implemented an FIM algorithm based on the MapReduce programming model. The k-means clustering algorithm is used for pre-processing, frequent itemsets of size k are mined using the Apriori algorithm, and the discovered frequent itemsets are mined further using the Eclat algorithm. ClustBigFIM works on large datasets with increased execution efficiency thanks to pre-processing. Experiments were done on transactional datasets; the results show that ClustBigFIM works on Big Data and runs faster than BigFIM on the Pumsb dataset. We are planning to run the ClustBigFIM algorithm on different datasets for further comparative analysis.

REFERENCES
[1] Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth. The KDD process for extracting useful knowledge from volumes of data. Commun. ACM 39, 11 (November 1996).
[2] Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. Mining association rules between sets of items in large databases. SIGMOD Rec. 22, 2 (June 1993).
[3] M. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. Parallel algorithms for discovery of association rules. Data Min. and Knowl. Disc.
[4] G. A. Andrews. Foundations of Multithreaded, Parallel, and Distributed Programming. Addison-Wesley.
[5] J. Li, Y. Liu, W. k. Liao, and A. Choudhary. Parallel data mining algorithms for association rules and clustering. In Intl. Conf. on Management of Data.
[6] E. Ozkural, B. Ucar, and C. Aykanat. Parallel frequent item set mining with selective item replication. IEEE Trans. Parallel Distrib. Syst.
[7] M. J. Zaki. Parallel and distributed association mining: A survey. IEEE Concurrency, pages 14-25.
[8] L. Zeng, L. Li, L. Duan, K. Lu, Z. Shi, M. Wang, W. Wu, and P. Luo. Distributed data mining: a survey. Information Technology and Management.
[9] J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD Rec., pages 1-12.
[10] L. Liu, E. Li, Y. Zhang, and Z. Tang. Optimization of frequent itemset mining on multiple-core processor. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB 07. VLDB Endowment.
[11] M.-Y. Lin, P.-Y. Lee and S.C. Hsueh. Apriori-based frequent itemset mining algorithms on MapReduce. In Proc. ICUIMC. ACM.
[12] N. Li, L. Zeng, Q. He, and Z. Shi. Parallel implementation of Apriori algorithm based on MapReduce. In Proc. SNPD.
[13] S. Hammoud. MapReduce Network Enabled Algorithms for Classification Based on Association Rules. Thesis.
[14] L. Zhou, Z. Zhong, J. Chang, J. Li, J. Huang, and S. Feng. Balanced parallel FP-Growth with MapReduce. In Proc. YC-ICT.
[15] Sheng-Hui Liu, Shi-Jia Liu, Shi-Xuan Chen, and Kun-Ming Yu. IOMRA - A High Efficiency Frequent Itemset Mining Algorithm Based on the MapReduce Computation Model. In Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on, pp. 1290-1295, Dec. doi: /CSE
[16] S. Moens, E. Aksehirli, and B. Goethals. Frequent Itemset Mining for Big Data. In Big Data, 2013 IEEE International Conference on, pp. 111-118, 6-9 Oct. doi: /BigData
[17] M. Riondato, J. A. DeBrabant, R. Fonseca, and E. Upfal. PARMA: a parallel randomized algorithm for approximate association rules mining in MapReduce. In Proc. CIKM. ACM.
[18] M. Malek and H. Kadima. Searching frequent itemsets by clustering data: towards a parallel approach using MapReduce. In Proc. WISE 2011 and 2012 Workshops. Springer Berlin Heidelberg.
[19] R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In Proc. VLDB.
[20] M. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. Parallel algorithms for discovery of association rules. Data Min. and Knowl. Disc.
[21] A. K. Jain, M. N. Murty, and P. J. Flynn. Data Clustering: A Review. ACM Computing Surveys.
[22] Frequent itemset mining dataset repository.
[23] T. De Bie. An information theoretic framework for data mining. In Proc. ACM SIGKDD.
International Journal of Heat and Mass Transfer
International Journal of Heat and Ma Tranfer 5 (9) 14 144 Content lit available at ScienceDirect International Journal of Heat and Ma Tranfer journal homepage: www.elevier.com/locate/ijhmt Technical Note
Log Mining Based on Hadoop s Map and Reduce Technique
Log Mining Based on Hadoop s Map and Reduce Technique ABSTRACT: Anuja Pandit Department of Computer Science, [email protected] Amruta Deshpande Department of Computer Science, [email protected]
Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis
IOSR Journal of Computer Engineering (IOSRJCE) ISSN: 2278-0661, ISBN: 2278-8727 Volume 6, Issue 5 (Nov. - Dec. 2012), PP 36-41 Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis
The Cash Flow Statement: Problems with the Current Rules
A C C O U N T I N G & A U D I T I N G accounting The Cah Flow Statement: Problem with the Current Rule By Neii S. Wei and Jame G.S. Yang In recent year, the tatement of cah flow ha received increaing attention
Top Top 10 Algorithms in Data Mining
ICDM 06 Panel on Top Top 10 Algorithms in Data Mining 1. The 3-step identification process 2. The 18 identified candidates 3. Algorithm presentations 4. Top 10 algorithms: summary 5. Open discussions ICDM
FEDERATION OF ARAB SCIENTIFIC RESEARCH COUNCILS
Aignment Report RP/98-983/5/0./03 Etablihment of cientific and technological information ervice for economic and ocial development FOR INTERNAL UE NOT FOR GENERAL DITRIBUTION FEDERATION OF ARAB CIENTIFIC
CASE STUDY ALLOCATE SOFTWARE
CASE STUDY ALLOCATE SOFTWARE allocate caetud y TABLE OF CONTENTS #1 ABOUT THE CLIENT #2 OUR ROLE #3 EFFECTS OF OUR COOPERATION #4 BUSINESS PROBLEM THAT WE SOLVED #5 CHALLENGES #6 WORKING IN SCRUM #7 WHAT
A COMPARATIVE STUDY OF THREE-PHASE AND SINGLE-PHASE PLL ALGORITHMS FOR GRID-CONNECTED SYSTEMS
A COMPARATIE STUDY OF THREEPHASE AND SINGLEPHASE PLL ALGORITHMS FOR GRIDCONNECTED SYSTEMS Ruben Marco do Santo Filho Centro Federal de Educação Tecnológica CEFETMG Coord. Eletrônica Av. Amazona 553 Belo
Redesigning Ratings: Assessing the Discriminatory Power of Credit Scores under Censoring
Redeigning Rating: Aeing the Dicriminatory Power of Credit Score under Cenoring Holger Kraft, Gerald Kroiandt, Marlene Müller Fraunhofer Intitut für Techno- und Wirtchaftmathematik (ITWM) Thi verion: June
SELF-MANAGING PERFORMANCE IN APPLICATION SERVERS MODELLING AND DATA ARCHITECTURE
SELF-MANAGING PERFORMANCE IN APPLICATION SERVERS MODELLING AND DATA ARCHITECTURE RAVI KUMAR G 1, C.MUTHUSAMY 2 & A.VINAYA BABU 3 1 HP Bangalore, Reearch Scholar JNTUH, Hyderabad, India, 2 Yahoo, Bangalore,
CLOUDDMSS: CLOUD-BASED DISTRIBUTED MULTIMEDIA STREAMING SERVICE SYSTEM FOR HETEROGENEOUS DEVICES
CLOUDDMSS: CLOUD-BASED DISTRIBUTED MULTIMEDIA STREAMING SERVICE SYSTEM FOR HETEROGENEOUS DEVICES 1 MYOUNGJIN KIM, 2 CUI YUN, 3 SEUNGHO HAN, 4 HANKU LEE 1,2,3,4 Department of Internet & Multimedia Engineering,
A Note on Profit Maximization and Monotonicity for Inbound Call Centers
OPERATIONS RESEARCH Vol. 59, No. 5, September October 2011, pp. 1304 1308 in 0030-364X ein 1526-5463 11 5905 1304 http://dx.doi.org/10.1287/opre.1110.0990 2011 INFORMS TECHNICAL NOTE INFORMS hold copyright
ANALYSING THE FEATURES OF JAVA AND MAP/REDUCE ON HADOOP
ANALYSING THE FEATURES OF JAVA AND MAP/REDUCE ON HADOOP Livjeet Kaur Research Student, Department of Computer Science, Punjabi University, Patiala, India Abstract In the present study, we have compared
CHAPTER 5 BROADBAND CLASS-E AMPLIFIER
CHAPTER 5 BROADBAND CLASS-E AMPLIFIER 5.0 Introduction Cla-E amplifier wa firt preented by Sokal in 1975. The application of cla- E amplifier were limited to the VHF band. At thi range of frequency, cla-e
Research on Clustering Analysis of Big Data Yuan Yuanming 1, 2, a, Wu Chanle 1, 2
Advanced Engineering Forum Vols. 6-7 (2012) pp 82-87 Online: 2012-09-26 (2012) Trans Tech Publications, Switzerland doi:10.4028/www.scientific.net/aef.6-7.82 Research on Clustering Analysis of Big Data
A Parallel Spatial Co-location Mining Algorithm Based on MapReduce
214 IEEE International Congress on Big Data A Parallel Spatial Co-location Mining Algorithm Based on MapReduce Jin Soung Yoo, Douglas Boulware and David Kimmey Department of Computer Science Indiana University-Purdue
KNOWLEDGE DISCOVERY and SAMPLING TECHNIQUES with DATA MINING for IDENTIFYING TRENDS in DATA SETS
KNOWLEDGE DISCOVERY and SAMPLING TECHNIQUES with DATA MINING for IDENTIFYING TRENDS in DATA SETS Prof. Punam V. Khandar, *2 Prof. Sugandha V. Dani Dept. of M.C.A., Priyadarshini College of Engg., Nagpur,
Nimble Storage Exchange 2013 100,000-Mailbox Resiliency Storage Solution
Nimble Stor Exchan 213 1,-Mailbox Reilie Stor Solution Teted with: ESRP Stor Verion. Tet date: May 2, 21 Overview Thi document provide information on Nimble Stor' iscsi tor olution for Microoft Exchan
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology November-2015 Volume 2, Issue-6
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology Email: [email protected] November-2015 Volume 2, Issue-6 www.ijermt.org Modeling Big Data Characteristics for Discovering
A Resolution Approach to a Hierarchical Multiobjective Routing Model for MPLS Networks
A Reolution Approach to a Hierarchical Multiobjective Routing Model for MPLS Networ Joé Craveirinha a,c, Rita Girão-Silva a,c, João Clímaco b,c, Lúcia Martin a,c a b c DEEC-FCTUC FEUC INESC-Coimbra International
Support Vector Machine Based Electricity Price Forecasting For Electricity Markets utilising Projected Assessment of System Adequacy Data.
The Sixth International Power Engineering Conference (IPEC23, 27-29 November 23, Singapore Support Vector Machine Baed Electricity Price Forecating For Electricity Maret utiliing Projected Aement of Sytem
Top 10 Algorithms in Data Mining
Top 10 Algorithms in Data Mining Xindong Wu ( 吴 信 东 ) Department of Computer Science University of Vermont, USA; 合 肥 工 业 大 学 计 算 机 与 信 息 学 院 1 Top 10 Algorithms in Data Mining by the IEEE ICDM Conference
KEYWORD SEARCH IN RELATIONAL DATABASES
KEYWORD SEARCH IN RELATIONAL DATABASES N.Divya Bharathi 1 1 PG Scholar, Department of Computer Science and Engineering, ABSTRACT Adhiyamaan College of Engineering, Hosur, (India). Data mining refers to
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture He Huang, Shanshan Li, Xiaodong Yi, Feng Zhang, Xiangke Liao and Pan Dong School of Computer Science National
Research Article An (s, S) Production Inventory Controlled Self-Service Queuing System
Probability and Statitic Volume 5, Article ID 558, 8 page http://dxdoiorg/55/5/558 Reearch Article An (, S) Production Inventory Controlled Self-Service Queuing Sytem Anoop N Nair and M J Jacob Department
DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE
DEVELOPMENT OF HASH TABLE BASED WEB-READY DATA MINING ENGINE SK MD OBAIDULLAH Department of Computer Science & Engineering, Aliah University, Saltlake, Sector-V, Kol-900091, West Bengal, India [email protected]
Selection of Optimal Discount of Retail Assortments with Data Mining Approach
Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.
A Survey on Association Rule Mining in Market Basket Analysis
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 4 (2014), pp. 409-414 International Research Publications House http://www. irphouse.com /ijict.htm A Survey
CLOUD BASED PEER TO PEER NETWORK FOR ENTERPRISE DATAWAREHOUSE SHARING
CLOUD BASED PEER TO PEER NETWORK FOR ENTERPRISE DATAWAREHOUSE SHARING Basangouda V.K 1,Aruna M.G 2 1 PG Student, Dept of CSE, M.S Engineering College, Bangalore,[email protected] 2 Associate Professor.,
MSc Financial Economics: International Finance. Bubbles in the Foreign Exchange Market. Anne Sibert. Revised Spring 2013. Contents
MSc Financial Economic: International Finance Bubble in the Foreign Exchange Market Anne Sibert Revied Spring 203 Content Introduction................................................. 2 The Mone Market.............................................
Map/Reduce Affinity Propagation Clustering Algorithm
Map/Reduce Affinity Propagation Clustering Algorithm Wei-Chih Hung, Chun-Yen Chu, and Yi-Leh Wu Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology,
Risk Management for a Global Supply Chain Planning under Uncertainty: Models and Algorithms
Rik Management for a Global Supply Chain Planning under Uncertainty: Model and Algorithm Fengqi You 1, John M. Waick 2, Ignacio E. Gromann 1* 1 Dept. of Chemical Engineering, Carnegie Mellon Univerity,
Mobile Network Configuration for Large-scale Multimedia Delivery on a Single WLAN
Mobile Network Configuration for Large-cale Multimedia Delivery on a Single WLAN Huigwang Je, Dongwoo Kwon, Hyeonwoo Kim, and Hongtaek Ju Dept. of Computer Engineering Keimyung Univerity Daegu, Republic
