AN IMPROVED PRIVACY PRESERVING ALGORITHM USING ASSOCIATION RULE MINING (27-32)


Ravindra Kumar Tiwari
Ph.D. Scholar, Computer Sc., AISECT University, Bhopal

Abstract - Recent advances in data mining technology for analyzing vast amounts of data have played an important role in several areas of business processing. Data mining also opens new threats to privacy and information security if it is not done or used properly. The main problem is that, from non-sensitive data, one can infer sensitive information, including personal information, facts, or even patterns generated by a data mining algorithm. Focusing on privacy preserving association rule mining, this paper presents a simple solution to the privacy problem. The approach is to survey the different aspects discussed in several research papers and, after analyzing those papers, to derive a new solution that is best in efficiency and performance. Before the algorithms are analyzed, the data structure of the database and the set of sensitive association rules are analyzed in order to build a more effective model.

Keywords - Data Mining, Association Rule Mining, Privacy Preserving

1. INTRODUCTION

Data mining services alone are not sufficient. Data mining services play an important role in the communication industry. Recent advances in data mining technology for analyzing vast amounts of data have played an important role in several areas of business processing. Data mining also opens new threats to privacy and information security if it is not done or used properly. The main problem is how to hide sensitive information, including personal information and even patterns generated by a data mining algorithm. This paper focuses on privacy preserving association rule mining. In earlier work on pattern mining, the statistical significance of a pattern (called support) was measured as the percentage of data sequences containing the pattern.
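The support measure described above can be illustrated with a minimal sketch. The function names `contains_pattern` and `support` and the sample sequences are hypothetical and do not come from the paper; the sketch assumes a pattern is "contained" in a data sequence when its items appear in the same order, not necessarily adjacent.

```python
from typing import Sequence


def contains_pattern(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` as an ordered subsequence."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so the order of items is respected.
    return all(item in it for item in pattern)


def support(sequences: Sequence[Sequence[str]], pattern: Sequence[str]) -> float:
    """Percentage of data sequences that contain the pattern."""
    if not sequences:
        return 0.0
    hits = sum(contains_pattern(s, pattern) for s in sequences)
    return 100.0 * hits / len(sequences)


# Hypothetical data sequences (illustration only).
sequences = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a"],
    ["a", "d", "c"],
]
pattern_support = support(sequences, ["a", "c"])
print(pattern_support)
```

A pattern is then called frequent when this percentage reaches a user-chosen minimum support threshold.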
The problem was later generalized by adding a taxonomy (is-a hierarchy) on items and time constraints such as the minimum and maximum gap between adjacent elements of a pattern. Discovered patterns (called episodes) could have different types of ordering: full (serial episodes), none (parallel episodes), or partial, and had to appear within a user-defined time window. The episodes were mined over a single event sequence, and their statistical significance was measured as the percentage of windows containing the episode (frequency) or as a number of occurrences. Efficient algorithms were presented for serial and parallel episodes. The model was then extended to handle events described by a set of attributes. Episodes mined in sequences of such events were built from a set of unary and binary predicates on event attributes. To make the discovery of such complex episodes feasible, it was assumed that the user specifies a class of interesting patterns by providing a template. A language capable of specifying episodes of interest based on logical predicates was also presented, and a few further extensions to the model were added.

1.1 Hiding Purposes

Privacy preserving data mining (PPDM) algorithms [4] are classified into two types according to the purpose of hiding: data hiding and rule hiding. Data hiding refers to cases where sensitive data in the original database, such as identity, name, and address, that can be linked directly or indirectly to an individual person are hidden. In contrast, rule hiding refers to cases where the sensitive knowledge (rules) derived from the original database by data mining is hidden. The majority of PPDM algorithms use data hiding techniques, and most of them hide sensitive patterns by modifying data. Currently, PPDM algorithms are mainly applied to the tasks of classification, association rule mining, and clustering. Association analysis involves the discovery of association rules showing attribute-value conditions that occur frequently together in a given set of data.
Classification is the process of finding a set of models that describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown. Clustering analysis concerns the problem of decomposing or partitioning a data set (usually multivariate) into groups so that the points in one group are similar to each other and as different as possible from the points in other groups.

1.2 Goal of Privacy Preservation

The goal of privacy preservation [5] is to mine the raw data without leaking private information. Current technology addresses this mainly from two aspects:

1) Sensitive raw data in the database, such as names, certificate numbers, addresses, and hobbies, can be modified or cut to avoid leaking personal private information. That is, correct results can be obtained by data mining algorithms without accessing the private data.
2) Sensitive rules included in data mining results can be eliminated through rule hiding algorithms. That is, potentially sensitive rules are protected during the mining process so that they cannot be obtained by a party with ill intent who will use them for malicious inference.

Vol. 1(1), January 2014 (ISSN: )

1.3 Privacy Preservation Techniques

Several privacy-preserving techniques [13] for association rule mining have been proposed in the past few years. Some proposals and algorithms were developed for centralized data, while others refer to a distributed data scenario. Distributed data scenarios can be further classified into horizontal data distribution and vertical data distribution. The purpose of privacy preservation [13] is to discover accurate patterns without precise access to the original data.

An association rule mining algorithm mines the association rules that satisfy a given minimal support and minimal confidence. Therefore, the most direct way to hide an association rule is to reduce its support or confidence below the minimal support or minimal confidence. Many implementations [2] of data and knowledge confidentiality are applied in the association rule mining process. According to the privacy protection technology used, current privacy preserving association rule mining algorithms can be divided into three categories:

i) Heuristic-based techniques
ii) Reconstruction-based techniques
iii) Cryptography-based techniques

Heuristic-based techniques are used for centralized data sets, while cryptography-based techniques are designed to protect privacy in a distributed data set by using encryption. Heuristic-based techniques [2] address how to select the appropriate data for modification. Since optimal selective data modification (sanitization) is an NP-hard problem, heuristics can be used to address the complexity issues. The methods of heuristic-based modification include perturbation, which is the alteration of an attribute value to a new value (for example, changing a 1-value to a 0-value, or adding noise), and blocking, which is the replacement of an existing attribute value with a "?".
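The idea of hiding a rule by pushing its support or confidence below the mining thresholds can be sketched as follows. This is an illustrative sketch only, not the algorithm proposed in this paper: the function `hide_rule`, the sample transactions, the thresholds, and the choice to modify the shortest supporting transactions first are all assumptions made for the example. It uses perturbation in the sense above (an item's 1-value becomes a 0-value by removing it from a transaction).

```python
def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t)


def hide_rule(transactions, antecedent, consequent, min_support, min_confidence):
    """Return a sanitized copy in which antecedent -> consequent is no longer
    strong, i.e. its support or its confidence is below the given threshold."""
    sanitized = [set(t) for t in transactions]  # never modify the original data
    rule_items = antecedent | consequent
    victim = sorted(consequent)[0]              # item to remove (assumed choice)
    n = len(sanitized)
    # Assumed heuristic: touch the shortest supporting transactions first.
    supporting = [i for i, t in enumerate(sanitized) if rule_items <= t]
    for i in sorted(supporting, key=lambda idx: len(sanitized[idx])):
        rule_count = support_count(sanitized, rule_items)
        ante_count = support_count(sanitized, antecedent)
        sup = rule_count / n
        conf = rule_count / ante_count if ante_count else 0.0
        if sup < min_support or conf < min_confidence:
            break                               # rule is already hidden
        sanitized[i].discard(victim)            # perturbation: 1-value -> 0-value
    return sanitized


# Hypothetical transactions; the sensitive rule is bread -> milk.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]
sanitized = hide_rule(transactions, {"bread"}, {"milk"},
                      min_support=0.4, min_confidence=0.6)
rule_count = support_count(sanitized, {"bread", "milk"})
confidence = rule_count / support_count(sanitized, {"bread"})
print(rule_count, confidence)
```

A blocking variant would replace the removed value with an unknown marker such as "?" instead of deleting it, so that the receiver cannot tell whether the original value was 0 or 1.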
A basic principle in choosing the transaction or the item of an item set to be modified is that the influence on the original database should be reduced as far as possible.

2. MOTIVATION

Successful applications of data mining techniques have been demonstrated in many areas that benefit commercial, social, and human activities. Along with their success, these techniques pose a threat to privacy. One can easily disclose others' sensitive information or knowledge by using them. So, before a database is released, sensitive information or knowledge must be hidden from unauthorized access. To solve the privacy problem, PPDM has become a hotspot in the field of data mining and database security. Focusing on privacy preserving association rule mining, a simple solution to the privacy problem is presented. To overcome these problems, the Improved Privacy Preserving Algorithm Using Association Rule Mining is proposed; it is based on the random perturbation technique and gives the best result in terms of efficiency and performance. The proposed algorithm is a good way to apply data mining techniques with security, hiding logical instances from others.

Data mining is an interactive and iterative process. A user formulates a data mining task as a KDD query in a high-level language. The query is sent to the Knowledge Discovery Management System, which retrieves the data from the database, chooses the right data mining algorithm, and returns the result to the user in the form of frequent patterns, association rules, and pruning results. The system should provide a mechanism for storing discovered knowledge in a database for further selective analyses. An SQL-like language has been proposed for specifying all tasks concerning the discovery of frequent patterns, association rules, and pruning of the resulting databases. The language is MineSQL, an extension of SQL proposed to handle association rule queries.
This approach seems reasonable because association rules and sequential patterns are very often mined in the same data sets. MineSQL is designed as a query language for advanced users, but it can also serve as an Application Programming Interface (API) for building business applications that deal with knowledge discovery. MineSQL provides mechanisms for storing patterns in relational tables by offering new complex data types. It allows a user to specify various constraints defining the requested class of patterns. Current algorithms either do not handle item constraints at all or require overly detailed information on the structure of patterns. In this dissertation, an algorithm using item constraints in the mining process will be presented. Special emphasis will be laid on the fact that the source data is likely to be stored in relational tables.

3. PRIVACY PROTECTION TECHNIQUE

Various privacy protection techniques [7] apply to centralized data, such as the reconstruction technique, the random response technique, the random perturbation technique, heuristic technology, and isometric transformation technology. Others apply to distributed data, such as the switching encryption technique and secure multiparty computation. Among them, the random perturbation technique converts the raw data randomly according to a set probability, which is a great advantage in privacy preserving data mining.

3.1 Data Distribution

PPDM algorithms [13] can first be divided into two major categories, centralized and distributed, based on the distribution of data. In a centralized database environment, all data are stored in a single database, while in a distributed database environment, data are stored in different databases. Distributed data scenarios can be further classified into horizontal and vertical data distributions. A horizontal distribution refers to cases where different records with the same data attributes reside in different places. In a vertical data distribution, different attributes of the same record reside in different places. Earlier research focused predominantly on privacy preservation in a centralized database. The difficulties of applying PPDM algorithms to a distributed database can be attributed to two factors: first, the data owners have privacy concerns, so they may not be willing to release their own data to others; second, even if they are willing to share data, the communication cost between the sites is too high.

3.2 Randomization method

The randomization method [6] provides an effective yet simple way of preventing the user from learning sensitive data. It can easily be implemented at the data collection phase of privacy preserving data mining, because the noise added to a given record is independent of the behaviour of other data records.
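As a rough illustration of randomization followed by distribution reconstruction, the sketch below uses randomized response on a single binary attribute. This specific scheme is chosen here for illustration and is not claimed to be the exact method of [6]; the names `randomize` and `estimate_true_fraction`, the retention probability `p`, and the synthetic data are assumptions. Each provider reports the true bit with probability p and the flipped bit otherwise. If pi is the true fraction of 1-values, the expected reported fraction is q = p*pi + (1 - p)*(1 - pi), which the receiver inverts to estimate pi.

```python
import random


def randomize(bits, p, rng):
    """Provider side: keep each bit with probability p, flip it otherwise.
    Every record is perturbed independently of all other records."""
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_true_fraction(reported, p):
    """Receiver side: invert q = p*pi + (1 - p)*(1 - pi) to estimate pi."""
    if p == 0.5:
        raise ValueError("p = 0.5 destroys all information; pi cannot be estimated")
    q = sum(reported) / len(reported)
    return (q - (1 - p)) / (2 * p - 1)


# Synthetic example: 30% of the providers hold the sensitive value 1.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reported = randomize(true_bits, p=0.8, rng=rng)
estimate = estimate_true_fraction(reported, p=0.8)
print(round(estimate, 3))
```

The receiver never sees an individual's true value, yet the aggregate estimate approaches the true fraction as the number of records grows.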
When the randomization method is carried out, the data collection process consists of two steps. In the first step, the data providers randomize their data and transmit the randomized data to the data receiver. In the second step, the data receiver estimates the original distribution of the data by employing a distribution reconstruction algorithm. The model of randomization is shown in Figure 3.2.

Figure 3.2: The Model of Randomization

3.3 Random Perturbation Technique

This method [7] can deal with discrete data of character, boolean, and number types. To facilitate the conversion of data sets, the original data set must be preprocessed. Data preprocessing is divided into three parts: data discretization, attribute coding, and encoding of the data set.

length = (A(max) - A(min)) / n

Here A is a continuous attribute, n is the number of discrete intervals, and length is the length of each discrete interval. When the interval length is a decimal, it is rounded to the nearest integer. The first discrete interval begins at A(min), and the last interval ends at A(max). In this paper, numeric attributes are treated as continuous attributes. Taking Table I as an example, the continuous attributes are age, resting blood pressure, and maximum heart rate.

TABLE I. CARDIOLOGY DATA SET

Age   Sex      Blood pressure   ECG      Maximum heart rate   Result
39    Male     128              Hyp      130                  Healthy
60    Male     135              Hyp      170                  Sick
58    Female   137              Hyp      147                  Healthy
45    Female   142              Normal   163                  Sick
62    Male     140              Normal   151                  Sick
70    Male     146              Normal   148                  Healthy

When n is 5, the discrete data set is shown in Table II. Attribute coding finds the distinct values of each attribute domain by querying the discrete data set, and then uses natural numbers to encode these attribute values, generating the attribute coding sheets (as shown in Tables III and IV).

[Table II. DISCRETE DATA SET]

[Table III. ATTRIBUTE DOMAIN CODE]
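The discretization and attribute coding steps can be sketched on the Age and Sex columns of Table I. The helper names `equal_width_bins` and `build_code_table` are hypothetical, and two details are assumptions rather than statements from the paper: codes are assigned in sorted order of the distinct values, and every value beyond the start of the last interval is placed in the last interval, which is stretched to end at A(max).

```python
def equal_width_bins(values, n):
    """Discretize a continuous attribute into n intervals of equal length.

    length = (A(max) - A(min)) / n, rounded to the nearest integer.
    The first interval starts at A(min); the last one ends at A(max).
    """
    lo, hi = min(values), max(values)
    length = max(1, round((hi - lo) / n))
    bins = []
    for v in values:
        idx = min((v - lo) // length, n - 1)   # clamp into the last interval
        start = lo + idx * length
        end = hi if idx == n - 1 else start + length
        bins.append((start, end))
    return length, bins


def build_code_table(column):
    """Map each distinct attribute value to a natural number (assumed order: sorted)."""
    return {value: code for code, value in enumerate(sorted(set(column)), start=1)}


# Age and Sex columns of Table I.
ages = [39, 60, 58, 45, 62, 70]
sex = ["Male", "Male", "Female", "Female", "Male", "Male"]

length, age_bins = equal_width_bins(ages, n=5)   # (70 - 39) / 5 = 6.2, rounded to 6
age_codes = build_code_table(age_bins)
sex_codes = build_code_table(sex)

encoded_age = [age_codes[b] for b in age_bins]
encoded_sex = [sex_codes[s] for s in sex]
print(length, age_bins)
print(encoded_age, encoded_sex)
```

The encoded columns are what the perturbation step then operates on, so the same code tables must be kept to decode the results afterwards.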

[Table IV. ATTRIBUTE DOMAIN CODE]

[Table V. PERTURBATION DATA SET]

Converting the data set into an encoded data set means replacing the attribute values of the discrete data set with the corresponding codes from the attribute tables, which forms the encoded data set (as shown in Table V).

The Apriori algorithm is a two-step process.

Step 1 (Join Step): To find L_k, a set of candidate k-itemsets is generated by joining L_(k-1) with itself. This set of candidates is denoted C_k.

Step 2 (Prune Step): C_k is a superset of L_k; that is, its members may or may not be frequent, but all of the frequent k-itemsets are included in C_k. A scan of the database to determine the count of each candidate in C_k results in the determination of L_k (all candidates having a count no less than the minimum support count are frequent by definition, and therefore belong to L_k).

4. PROPOSED WORK

This paper proposes an algorithm named Improved Privacy Preserving Mining (IPPM). The proposed algorithm is a good way to apply data mining techniques with security, hiding logical instances from others. The entire system architecture consists of five phases:

1) Check for authentication.
2) Reading.
3) Association rule mining.
4) Encoding and decoding the data using the random perturbation technique.
5) Perform pruning.

Data mining techniques [4] are used to discover user behavior patterns using several algorithms. Data mining can find interesting and valuable patterns or relationships describing the data, and can predict or classify behavior with a model based on the available data.
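The association rule mining phase relies on the Apriori join and prune steps described in Section 3.3, which can be sketched as follows. This is a generic Apriori sketch, not the IPPM implementation: the function name `apriori`, the sample transactions, and the minimum support count of 2 are assumptions. In addition to the counting scan described above, the sketch discards candidates that have an infrequent (k-1)-subset before scanning, which is the standard Apriori property.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return a dict mapping every frequent itemset to its support count."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    # L_1: frequent 1-itemsets.
    current = {frozenset([i]) for i in items
               if sum(i in t for t in transactions) >= min_count}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = sum(itemset <= t for t in transactions)
        k += 1
        # Join step: C_k is built by joining L_(k-1) with itself.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune by the Apriori property: every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        # Database scan: keep candidates whose count reaches the minimum support.
        current = {c for c in candidates
                   if sum(c <= t for t in transactions) >= min_count}
    return frequent


# Hypothetical transactions over items I1..I5 (illustration only).
transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
frequent = apriori(transactions, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Raising `min_count` shrinks the result, since fewer itemsets reach the threshold and fewer candidates survive each join.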
Data mining uses automated tools that employ several methodologies and algorithms to discover mainly hidden patterns, associations, and frequent structures in large amounts of data stored in data warehouses or other information repositories, and to filter the necessary information from such big data sets. The telecommunications industry is a typical data-intensive industry, and competition in it is becoming increasingly fierce. Compared with other industries, the telecommunications industry holds more crucial personal user data, which can be analyzed accurately to obtain useful knowledge. To maintain and win the competition, companies should find more interactive business opportunities and provide users with better service in a short time. As a result, data warehousing and data mining have important value in the telecommunications industry. This paper proposes an efficient data mining algorithm named Improved Privacy Preserving Mining (IPPM).

4.1 Proposed Method: IPPM

Some terminology is important for understanding the novel technique.

1) Frequent Pattern: A frequent pattern is an item set that is used frequently by customers. For example, if item I1 is purchased by 10 customers and item I2 is purchased by 5 customers, then item I1 is the more frequently used. The owner should therefore concentrate on item I1, because it is visited by more customers.
2) Minimum Support: For an item to be a frequent member, a minimum support count is chosen, which determines whether or not the item is in the list of frequent patterns. For example, if the minimum support is 2, then an item whose count (number of visiting customers) is greater than or equal to 2 is frequent and will be considered for pruning.
3) Data Pruning: The act of removing those item sets that are not necessary is called data pruning.

4) Encryption/Decryption: Encryption and decryption are provided at four levels: transaction, frequent item, association rule, and pruning result.

Working Procedure

Our module is divided into two parts. One can log in either as a normal user or as the Admin. For a normal user, the Improved Privacy Preserving Mining (IPPM) model is subdivided into five phases:

1) Check for authentication.
2) Reading.
3) Association rule mining.
4) Encoding and decoding the data using the random perturbation technique.
5) Perform pruning.

5. RESULT ANALYSIS

The result analysis is based on the IPPM and SPADE algorithms. The graphs show that the new method takes less time than older methods such as SPADE, so it is more efficient. The SPADE algorithm and the IPPM technique are compared on several aspects, such as memory and computation time. On large inputs, a program may take a very long time before it completes its work and gives a sign of life again, so it sometimes makes sense to be able to estimate the running time before starting a program. Obviously, the running time depends on the number n of the strings to be sorted.

The key features of the SPADE (Sequential Pattern Discovery using Equivalence classes) algorithm for discovering the set of all frequent sequences are as follows:

1. It uses a vertical id-list database format, in which each sequence is associated with a list of the objects in which it occurs, along with the time-stamps. All frequent sequences can be enumerated via simple temporal joins (or intersections) on id-lists.
2. It uses a lattice-theoretic approach to decompose the original search space (lattice) into smaller pieces (sub-lattices), which can be processed independently in main memory.
3. The approach usually requires three database scans, or only a single scan with some pre-processed information, thus minimizing the I/O costs in comparison with Generalized Sequential Pattern.

SPADE not only minimizes I/O costs by reducing database scans, but also minimizes computational costs by using efficient search schemes. The vertical id-list based approach is also insensitive to data skew. An extensive set of experiments shows that SPADE outperforms previous approaches by a factor of two, and by an order of magnitude if some additional off-line information is available. Furthermore, SPADE scales linearly in the database size and in a number of other database parameters. In SPADE, the main steps include the computation of the frequent 1-sequences and 2-sequences, the decomposition into prefix-based parent equivalence classes, and the enumeration of all other frequent sequences via BFS or DFS search within each class. The proposed algorithm computes only the pre subset, so only a one-sided subset is included rather than the whole, and candidate generation is not considered.

Time efficiency estimates depend on what is defined to be a step. For the analysis to correspond usefully to the actual execution time, the time required to perform a step must be guaranteed to be bounded above by a constant. One must be careful here; for instance, some analyses count an addition of two numbers as one step, and this assumption may not be warranted in certain contexts. The graphs show that the proposed method is better in comparison with SPADE.

5.1 Memory Based graph

Figure 5.1: Memory Based graph (x-axis: Min Support; y-axis: Memory (MB))

The figure above shows that the proposed algorithm IPPM takes less memory than the SPADE algorithm. At min support 1, IPPM requires <= 500 MB of memory for storing the frequent item sets, while SPADE requires 1000 MB, because the proposed algorithm works on either a pre or a post basis, while SPADE works on both pre and post.

5.2 Time Based graph

Figure 5.2: Time Based graph (x-axis: Min Support; y-axis: Time (ms))

The figure above shows that the proposed algorithm IPPM takes less computation time than the SPADE algorithm. At min support 1, IPPM requires <= 500 milliseconds of computation time, while SPADE requires 1000 milliseconds, because the proposed algorithm works on either a pre or a post basis, while SPADE works on both pre and post.

6. CONCLUSION

Recent advances in data mining technology for analysing vast amounts of data have played an important role in several areas of business processing. Data mining also opens new threats to privacy and information security if it is not done or used properly. The main problem is that, from non-sensitive data, one can infer sensitive information, including personal information, facts, or even patterns generated by a data mining algorithm. Focusing on privacy preserving association rule mining, a simple solution is presented that is best in terms of efficiency and performance, because the proposed algorithm takes just half the computation time and memory of the SPADE algorithm.

FUTURE WORK

In future, simulation results can also be included to show that the proposed method is better than other traditional methods. This limitation can also be overcome by providing one additional key for security purposes when highly confidential data is accessed.

REFERENCES

[1] R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules, 20th International Conference on Very Large Data Bases.
[2] Vassilios S. Verykios, Elisa Bertino, et al., State-of-the-art in Privacy Preserving Data Mining, SIGMOD Record, Vol. 33, pp. 50-57, March.
[3] Alan F. Karr, Xiaodong Lin, Ashish P. Sanil and Jerome P. Reiter, Privacy-Preserving Analysis of Vertically Partitioned Data Using Secure Matrix Products, Journal of Official Statistics, Vol. 25.
[4] J. Han and M. Kamber, Data Mining: Concepts and Techniques.
[5] Yanguang Shen, Junrui Han and Hui Shao, Research on Privacy-Preserving Technology of Data Mining, IEEE, Second International Conference on Intelligent Computation Technology and Automation, Vol. 2.
[6] Pingshui Wang, Survey on Privacy Preserving Data Mining, International Journal of Digital Content Technology and its Applications, Vol. 4, No. 9, 2010.
[7] Brian C.S. Loh and Patrick H.H. Then, Ontology-Enhanced Interactive Anonymization in Domain-Driven Data Mining Outsourcing, IEEE, Second International Symposium on Data, Privacy, and E-Commerce, 2010.
[8] Chirag N. Modi, Udai Pratap Rao and Dhiren R. Patel, Maintaining privacy and data quality in privacy preserving association rule mining, IEEE, International Conference on Advances in Communication, Network, and Computing.
[9] Wang Yan, Le Jiajin and Huang Dongmei, A Method for Privacy Preserving Mining of Association Rules Based on Web Usage Mining, IEEE, International Conference on Web Information Systems and Mining, Vol. 1.


More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains Dr. Kanak Saxena Professor & Head, Computer Application SATI, Vidisha, kanak.saxena@gmail.com D.S. Rajpoot Registrar,

More information

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING M.Gnanavel 1 & Dr.E.R.Naganathan 2 1. Research Scholar, SCSVMV University, Kanchipuram,Tamil Nadu,India. 2. Professor

More information

Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

More information

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

Enhanced Boosted Trees Technique for Customer Churn Prediction Model IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

More information

A generalized Framework of Privacy Preservation in Distributed Data mining for Unstructured Data Environment

A generalized Framework of Privacy Preservation in Distributed Data mining for Unstructured Data Environment www.ijcsi.org 434 A generalized Framework of Privacy Preservation in Distributed Data mining for Unstructured Data Environment V.THAVAVEL and S.SIVAKUMAR* Department of Computer Applications, Karunya University,

More information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering

More information

A Statistical Text Mining Method for Patent Analysis

A Statistical Text Mining Method for Patent Analysis A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, shjun@cju.ac.kr Abstract Most text data from diverse document databases are unsuitable for analytical

More information

Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing

Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,

More information

PRIVACY-PRESERVING DATA ANALYSIS AND DATA SHARING

PRIVACY-PRESERVING DATA ANALYSIS AND DATA SHARING PRIVACY-PRESERVING DATA ANALYSIS AND DATA SHARING Chih-Hua Tai Dept. of Computer Science and Information Engineering, National Taipei University New Taipei City, Taiwan BENEFIT OF DATA ANALYSIS Many fields

More information
