CREATING MINIMIZED DATA SETS BY USING HORIZONTAL AGGREGATIONS IN SQL FOR DATA MINING ANALYSIS
Subbarao Jasti #1, Dr. D. Vasumathi *2
1 Student, Department of CS, JNTU, AP, India
2 Professor, Department of CSE, JNTU, AP, India
1 [email protected] 2 [email protected]

Abstract: Mining historical data can help in discovering actionable knowledge. However, mining activities cannot be performed directly on regular databases. To perform data mining, datasets that are useful for the mining process must first be prepared. Preparing datasets manually for data mining is a challenging task, as it needs aggregations and complex SQL queries. The problem is that the existing aggregate functions of SQL, such as SUM, MAX, MIN, COUNT and AVG, return a single value per group as output. Such scalar values cannot be used directly for preparing datasets. Therefore a new approach is required to generate the datasets required for data mining automatically. In this paper we present various techniques to generate datasets for data mining through horizontal aggregations. The aggregations proposed in this paper are PIVOT, SPJ and CASE. We also built a prototype application to demonstrate the proof of concept. The experimental results revealed that the proposed aggregations are useful in generating datasets for data mining.

Key words: Data mining, data sets, SQL, horizontal aggregations

INTRODUCTION

The RDBMS is a stable database model that is widely used in the real world. It is used to store data permanently. Such data is accessed via applications known as front ends. The front-end applications use SQL to interact with the RDBMS for storing and retrieving data. Databases are designed based on the requirements of an organization. The data is stored in relations, which are normalized to make them more compact and consistent. SQL queries can be used to aggregate the data present in tables. The common aggregate functions supported by SQL include COUNT, MAX, MIN, AVG and SUM.
All these functions return a scalar value for each group. They are also known as vertical aggregations. The results of these aggregation functions can be used in summary analysis. However, they are not directly useful for data mining operations; that is, on their own they cannot form a dataset for data mining. Datasets can be prepared for data mining by horizontally aggregating the data present in relational tables [1]. Statistical algorithms [2], [3] can make use of such summaries of data. Many existing data mining techniques, such as classification, PCA, regression and clustering, expect data in the form of tables with tuples and domains [2], [4]. Therefore it is important to use horizontal aggregations in order to generate datasets. In this paper we propose three such aggregations: PIVOT, CASE and SPJ. These operations are routines that use SQL commands internally to generate horizontal aggregations, resulting in data sets that can be used directly for data mining. PIVOT uses the built-in pivoting facility of SQL. In the same fashion, SPJ and CASE are built on underlying SQL commands to generate data sets for real-world data mining.

This work is motivated by the fact that generating datasets can help data mining directly. The databases in regular use store data related to OLTP (Online Transaction Processing), which is not suitable for data mining. For this reason, it is better to take data from such databases and generate data sets meant for data mining. The horizontal aggregations that we propose are useful for generating data sets for data mining operations. The constructs SPJ, PIVOT and CASE are provided in order to generate the required data sets. The underlying SQL commands used by these aggregations achieve the goal of preparing datasets for data mining. The horizontal aggregation operations we propose can be used directly by end users.
They in turn generate the required SQL code automatically to achieve the goal. We also built a prototype application that demonstrates the proof of concept. The application is web based and allows end users to generate datasets using the different horizontal aggregations we propose. The experimental results revealed that the proposed application is useful in generating real-world data sets for data mining purposes. The rest of the paper is organized as follows. Section II reviews the literature. Section III provides details about the proposed horizontal aggregations.

PRIOR WORK

SQL is widely used in real-world applications as it supports interaction with various relational databases. SQL has simple and effective commands for DML, DDL, TCL and DCL operations. It also supports subqueries, joins and aggregations. Queries can be optimized through performance tuning activities such as indexing. As part of SQL queries, aggregate functions like MIN, MAX, COUNT, AVG and SUM can be used to get a summary of data [5]. The aggregate functions provided by SQL return a single value per group in a vertical layout. They cannot provide data in the horizontal layout that is essential for data mining applications. One of the data mining techniques is association rule mining, which is widely used in OLAP processing [6]. In [7] an extension to SQL aggregate functions is made for efficient data mining solutions. The results of such aggregate functions can provide data in a horizontal layout, which is useful for data mining. The clustering algorithm in [5] is one of the data mining algorithms that use SQL internally to perform clustering. A horizontal layout is also used for data that is further processed by data mining operations through custom SQL extensions [8]. The SQL extensions provided there have optimizations for joins but not for the resulting groups. For this reason joins can be avoided by using the PIVOT and CASE constructs.
Relational algebra [9] is used to define a new class of aggregations, known as horizontal aggregations. In this paper we also focus on horizontal aggregations, namely CASE, PIVOT and SPJ. Join optimization is presented in [10], but it is not useful for large queries. Query optimizations using tree-based plans are explored in [11]. There has been a lot of research on aggregations and related optimizations, including cross tabulation [12]. Un-pivoting of relational tables is explored in [13], where decision trees are computed using input rows. The results contain multiple attribute-value pairs for horizontal aggregations. Transforming data from one format to another can be done using SQL operations [14]. The UNPIVOT operator is similar to the TRANSPOSE operator in that it produces multiple rows. TRANSPOSE requires fewer operations than PIVOT, and the two operators are inverses of each other. Vertical aggregations and decision trees are used for horizontal aggregations that produce layouts that can be used for data mining operations. SQL Server [15] supports both pivot and unpivot operations. Aggregations are also explored in [16] and [17], but they have some limitations: their results cannot be used directly to produce datasets for real-time data mining operations. Horizontal aggregations differ from the built-in aggregations of the SQL language. In this paper we implement the horizontal aggregations CASE, SPJ and PIVOT as extensions to the built-in operators of SQL. For example, CASE is a construct implemented programmatically that uses SQL internally to perform the horizontal aggregation. The next section provides details about horizontal aggregations with suitable illustrations. The aggregations considered are SPJ, PIVOT and CASE.
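To make the contrast between the vertical layout returned by standard aggregates and the horizontal layout expected by mining algorithms concrete, here is a minimal sketch. It is an illustration only, not the authors' prototype: it uses Python's built-in sqlite3 module in place of SQL Server, and the table `sales` and its rows are hypothetical.

```python
import sqlite3

# Hypothetical fact table: one row per sale (store, quarter, amount).
rows = [
    ("s1", "Q1", 10.0), ("s1", "Q1", 5.0), ("s1", "Q2", 7.0),
    ("s2", "Q1", 3.0), ("s2", "Q2", 4.0), ("s2", "Q2", 6.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Vertical aggregation: one output row per (store, quarter) group.
vertical = conn.execute(
    "SELECT store, quarter, SUM(amount) FROM sales "
    "GROUP BY store, quarter ORDER BY store, quarter"
).fetchall()

# Horizontal layout: one row per store, one column per quarter.
# The reshaping is done in plain Python here only to show the target shape.
quarters = sorted({q for _, q, _ in vertical})
horizontal = {}
for store, quarter, total in vertical:
    horizontal.setdefault(store, dict.fromkeys(quarters))[quarter] = total

for store, cols in sorted(horizontal.items()):
    print(store, [cols[q] for q in quarters])
```

The vertical result has four rows, one per (store, quarter) pair, while the horizontal result has one row per store with the quarters as columns; the script prints `s1 [15.0, 7.0]` and `s2 [3.0, 10.0]`. The horizontal aggregation methods discussed in this paper aim to produce that second shape inside the database rather than in client code.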
III. EXECUTION STRATEGIES IN HORIZONTAL AGGREGATION

This section provides details about horizontal aggregations with suitable illustrations. Horizontal aggregations are a new class of functions that aggregate numeric expressions and transpose the results to produce data sets with a horizontal layout. The operation is needed in a number of data mining tasks, such as unsupervised grouping and data aggregation, as well as the reduction of large heterogeneous data sets into smaller homogeneous subsets that can be easily managed, separately modeled and analyzed. To create datasets for data mining work, efficient aggregation of data is needed. To that end, the
proposed system collects the required attributes from the different fact tables and displays them as columns in order to create data in a horizontal layout. The main goal is to define a template to generate SQL code combining aggregation and pivoting (transposition). A second goal is to extend the SELECT statement with a clause that combines transposition with aggregation. Consider the following GROUP BY query in standard SQL that takes a subset C1, ..., Cm from B1, ..., Bp:

SELECT C1, ..., Cm, sum(S) FROM E1, E2 GROUP BY C1, ..., Cm;

This aggregation query produces a table with m+1 columns (automatically determined), with one group for each distinct combination of values of C1, ..., Cm and one aggregated value per group (i.e., sum(S)). To evaluate this query, the query optimizer takes three input parameters: the input table E1, the list of grouping columns C1, ..., Cm, and the column to aggregate (S).

In a horizontal aggregation there are four input parameters to generate SQL code:
1) The input tables E1, E2, ..., En
2) The list of GROUP BY columns C1, ..., Cj
3) The column to aggregate (S)
4) The list of transposing columns A1, ..., Ak

Example

In Figure 1 there is a common field K in E1 and E2. In E1, B2 takes only two distinct values, P and Q, and is used to transpose the table. The aggregation used is sum(). The values within B1 repeat: the value 1 appears three times, in rows 3, 4 and 6 of Table E1, and B2 takes both values P and Q across those rows. B2P and B2Q are therefore the newly generated columns in EH.

[Table E2: columns K, S. Table EH: columns B1, B2P, B2Q.]

Figure 1. An example of horizontal aggregation

The query evaluation methods commonly used for horizontal aggregation functions [18] are the SPJ, PIVOT and CASE methods.

Steps Used in All Methods

Figure 2 shows the steps of all methods based on the input table.

All aggregation methods proposed in this paper (PIVOT, CASE and SPJ) follow the steps described in Fig. 2.
For each method the first step is to issue a SELECT query. Then the other operators are applied as required. Finally, the code of the underlying operation is executed to perform the horizontal aggregation.

Table E1
K   B1  B2
1   3   P
2   2   Q
3   1   Q
4   1   Q
5   2   P
6   1   P
7   3   P
8   2   P

Fig. 3 shows the steps of all methods based on a table containing the results of vertical aggregations.
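The template idea described above (four input parameters, a first SELECT step, then a generated statement) can be sketched as follows. This is a hedged illustration and not the authors' generator: it runs on SQLite through Python's sqlite3 module instead of SQL Server, it emits CASE expressions because SQLite has no PIVOT operator, the helper name `horizontal_sql` is hypothetical, and the S values in E2 are made up because they are not given in the text. The E1 rows follow Table E1.

```python
import sqlite3

def horizontal_sql(source, group_cols, agg_col, transpose_col, values):
    """Build one SUM(CASE ...) output column per transposing value.

    Identifiers and values are interpolated directly into the text, so
    this sketch is only suitable for trusted, hard-coded inputs.
    """
    groups = ", ".join(group_cols)
    cells = ", ".join(
        f"SUM(CASE WHEN {transpose_col} = '{v}' THEN {agg_col} END) "
        f"AS {transpose_col}{v}"
        for v in values
    )
    return (f"SELECT {groups}, {cells} FROM {source} "
            f"GROUP BY {groups} ORDER BY {groups}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE E1 (K INTEGER, B1 INTEGER, B2 TEXT)")
conn.execute("CREATE TABLE E2 (K INTEGER, S INTEGER)")
# E1 rows as listed in Table E1.
conn.executemany("INSERT INTO E1 VALUES (?, ?, ?)", [
    (1, 3, "P"), (2, 2, "Q"), (3, 1, "Q"), (4, 1, "Q"),
    (5, 2, "P"), (6, 1, "P"), (7, 3, "P"), (8, 2, "P"),
])
# ASSUMPTION: these S values are invented for illustration only.
conn.executemany("INSERT INTO E2 VALUES (?, ?)", [
    (1, 9), (2, 6), (3, 10), (4, 0), (5, 1), (6, None), (7, 8), (8, 7),
])

source = "E1 JOIN E2 ON E1.K = E2.K"
# Step 1: a SELECT discovers the distinct values of the transposing column.
values = [r[0] for r in conn.execute("SELECT DISTINCT B2 FROM E1 ORDER BY B2")]
# Step 2: generate the statement from the four parameters and run it.
sql = horizontal_sql(source, ["B1"], "S", "B2", values)
result = conn.execute(sql).fetchall()
print(sql)
print(result)
```

With the assumed S values, the result has one row per B1 value and the columns B2P and B2Q; a group with no non-NULL input for a column yields NULL (None in Python), which is why such wide tables often contain NULL cells.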
SPJ Method

This method is built on the relational operators. A table exhibiting a vertical aggregation is created for every result column. Afterwards, all of these tables are joined to produce the horizontal aggregation. The implementation of the SPJ method is based on the procedure given in [18].

PIVOT Method

Some RDBMSs, such as SQL Server [15], provide a built-in PIVOT operator. That operator is used to build this construct programmatically. It provides transpositions that can be used to evaluate the resulting aggregations. The PIVOT operator needs to know the number of columns required to represent the transposition, and it can be combined with the GROUP BY operator of SQL.

CASE Method

The existing CASE command of SQL is used to develop the CASE construct for horizontal aggregation. From the relational point of view it is very similar to a projection/aggregation query in which each value is selected by a conjunction of conditions. There are two strategies for computing horizontal aggregations here. The first uses the input table directly, whereas the second computes a vertical aggregation and stores the result in an intermediate relation. That relation is then used to calculate the horizontal aggregations. The implementation is based on the procedure given in [18].

EXPERIMENTAL EVALUATION

The prototype application was developed with Visual Studio 2010 and SQL Server 2008. The experiments were run on a PC with 4 GB RAM and a Core 2 Duo processor running the Windows 7 operating system.
ASP.NET and AJAX are the technologies used to build the front-end application with a web-based interface, while the C# programming language is used to code the functionality. ADO.NET is used to interact with the relational database. The result of the SPJ construct for horizontal aggregation is presented in Fig. 4.

Listing 1 shows the elaboration instructions for the PIVOT construct.

As seen in Listing 1, optimized instructions are used in the PIVOT construct. Only the columns that are required for horizontal aggregations are used for computations.
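As an illustration of the kind of statement a PIVOT construct can issue on SQL Server, the sketch below assembles a T-SQL PIVOT query as text. It is not the paper's Listing 1: the helper name `pivot_sql`, the table name E and the column names are assumptions made for this example, and the statement is only generated, not executed. The derived table projects just the grouping, transposing and aggregated columns, which mirrors the point that only the required columns take part in the computation.

```python
def pivot_sql(table, group_cols, agg_col, transpose_col, values, agg="SUM"):
    """Return a T-SQL PIVOT statement as text (hypothetical helper).

    Names are interpolated directly, so only trusted inputs should be used.
    """
    groups = ", ".join(group_cols)
    # One bracketed identifier per transposing value, e.g. [P], [Q].
    in_list = ", ".join(f"[{v}]" for v in values)
    # Rename each pivoted column to <transposing column><value>, e.g. B2P.
    out_cols = ", ".join(f"[{v}] AS {transpose_col}{v}" for v in values)
    return (
        f"SELECT {groups}, {out_cols}\n"
        f"FROM (SELECT {groups}, {transpose_col}, {agg_col} FROM {table}) AS src\n"
        f"PIVOT ({agg}({agg_col}) FOR {transpose_col} IN ({in_list})) AS pvt;"
    )

sql = pivot_sql("E", ["B1"], "S", "B2", ["P", "Q"])
print(sql)
```

Because PIVOT needs the list of output columns up front, the values P and Q have to be known (for example from a prior SELECT DISTINCT) before the statement is built.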
Fig. 4 Results of SPJ aggregation

As shown in Fig. 4, the results of the SPJ operator are presented in a horizontal layout. The data in the resulting table can be used for data mining operations.

Fig. 5 Result of Pivoting Aggregation

As seen in Fig. 5, the results of PIVOT are presented in a horizontal layout. The resulting data, which is in tabular format, can be used further for data mining.

Fig. 6 Result of CASE Aggregation

As seen in Fig. 6, the results of the CASE operator are presented in a horizontal layout. The resulting data can be used in data mining operations later. The aim of the operator is to generate datasets for data mining.

CONCLUSIONS

In this paper we built three constructs for horizontal aggregation of data, namely PIVOT, CASE and SPJ, which generate data sets for data mining. They are built on the provisions of SQL, with SQL as the underlying query language. When these constructs are used by an end user, the underlying commands are executed and the data sets are generated in a horizontal layout. These operators can be used in OLAP applications where huge amounts of historical data are analyzed. We built the operators for horizontal aggregation because the vertical aggregations supported by SQL return scalar values, which are not suitable for data mining purposes. We also built a web-based prototype that demonstrates the proof of concept. The experimental results revealed that the proposed constructs for horizontal aggregation are effective in producing datasets for data mining.

REFERENCES

[1] C. Ordonez, "Data Set Preprocessing and Transformation in a Database System," Intelligent Data Analysis, vol. 15, no. 4.
[2] C. Ordonez, "Statistical Model Computation with UDFs," IEEE Trans. Knowledge and Data Eng., vol. 22, no. 12, Dec.
[3] C. Ordonez and S. Pitchaimalai, "Bayesian Classifiers Programmed in SQL," IEEE Trans. Knowledge and Data Eng., vol.
22, no. 1, Jan.
[4] J. Han and M. Kamber, Data Mining: Concepts and Techniques, first ed. Morgan Kaufmann.
[5] C. Ordonez, "Integrating K-Means Clustering with a Relational DBMS Using SQL," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 2, Feb.
[6] S. Sarawagi, S. Thomas, and R. Agrawal, "Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications," Proc. ACM SIGMOD Int'l
Conf. Management of Data (SIGMOD '98).
[7] H. Wang, C. Zaniolo, and C.R. Luo, "ATLAS: A Small But Complete SQL Extension for Data Mining and Data Streams," Proc. 29th Int'l Conf. Very Large Data Bases (VLDB '03).
[8] A. Witkowski, S. Bellamkonda, T. Bozkaya, G. Dorman, N. Folkert, A. Gupta, L. Sheng, and S. Subramanian, "Spreadsheets in RDBMS for OLAP," Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '03).
[9] H. Garcia-Molina, J.D. Ullman, and J. Widom, Database Systems: The Complete Book, first ed. Prentice Hall.
[10] C. Galindo-Legaria and A. Rosenthal, "Outer Join Simplification and Reordering for Query Optimization," ACM Trans. Database Systems, vol. 22, no. 1.
[11] G. Bhargava, P. Goel, and B.R. Iyer, "Hypergraph Based Reorderings of Outer Join Queries with Complex Predicates," Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '95).
[12] J. Gray, A. Bosworth, A. Layman, and H. Pirahesh, "Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab and Sub-Total," Proc. Int'l Conf. Data Eng.
[13] G. Graefe, U. Fayyad, and S. Chaudhuri, "On the Efficient Gathering of Sufficient Statistics for Classification from Large SQL Databases," Proc. ACM Conf. Knowledge Discovery and Data Mining (KDD '98).
[14] J. Clear, D. Dunn, B. Harvey, M.L. Heytens, and P. Lohman, "Non-Stop SQL/MX Primitives for Knowledge Discovery," Proc. ACM SIGKDD Fifth Int'l Conf. Knowledge Discovery and Data Mining (KDD '99).
[15] C. Cunningham, G. Graefe, and C.A. Galindo-Legaria, "PIVOT and UNPIVOT: Optimization and Execution Strategies in an RDBMS," Proc. 13th Int'l Conf. Very Large Data Bases (VLDB '04).
[16] C. Ordonez, "Horizontal Aggregations for Building Tabular Data Sets," Proc. Ninth ACM SIGMOD Workshop Data Mining and Knowledge Discovery (DMKD '04).
[17] C. Ordonez, "Vertical and Horizontal Percentage Aggregations," Proc. ACM SIGMOD Int'l Conf.
Management of Data (SIGMOD '04).
[18] C. Ordonez and Z. Chen, "Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis," IEEE Trans. Knowledge and Data Eng., vol. 24, no. 4, April.

AUTHORS

Mr. Subbarao Jasti received his B.Tech in Information Technology from Bapatla Engineering College and is pursuing an M.Tech in Computer Science at Jawaharlal Nehru Technological University, Kukatpally, Hyderabad, AP, India. His areas of interest include Data Mining and Warehousing and Database Management Systems.

Prof. D. Vasumathi received her B.Tech in Computer Science and Engineering and her M.Tech in Computer Science from JNT University Hyderabad, AP. She was awarded a doctoral degree (Ph.D) in Computer Science by JNTU Hyderabad. She currently works as a Professor in the Department of Computer Science and Engineering at JNTU College of Engineering, Kukatpally, Hyderabad, Andhra Pradesh. She has 12 years of teaching and administration experience at JNTU Hyderabad. She has published a number of papers in national and international journals and conferences. She is a member of ISTE, IEEE and CSI, among others. Her areas of interest include Data Mining and Warehousing, Artificial Intelligence, Database Management Systems, Java and Web Technologies.
Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices
Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil
LearnFromGuru Polish your knowledge
SQL SERVER 2008 R2 /2012 (TSQL/SSIS/ SSRS/ SSAS BI Developer TRAINING) Module: I T-SQL Programming and Database Design An Overview of SQL Server 2008 R2 / 2012 Available Features and Tools New Capabilities
PERFORMANCE ANALYSIS OF CLUSTERING ALGORITHMS IN DATA MINING IN WEKA
PERFORMANCE ANALYSIS OF CLUSTERING ALGORITHMS IN DATA MINING IN WEKA Prakash Singh 1, Aarohi Surya 2 1 Department of Finance, IIM Lucknow, Lucknow, India 2 Department of Computer Science, LNMIIT, Jaipur,
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM J. Arokia Renjit Asst. Professor/ CSE Department, Jeppiaar Engineering College, Chennai, TamilNadu,India 600119. Dr.K.L.Shunmuganathan
Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
Data Mining: Concepts and Techniques
Data Mining: Concepts and Techniques Chapter 1 Introduction SURESH BABU M ASST PROF IT DEPT VJIT 1 Chapter 1. Introduction Motivation: Why data mining? What is data mining? Data Mining: On what kind of
Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool.
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 9, Issue 8 (January 2014), PP. 19-24 Comparative Analysis of EM Clustering Algorithm
Secure and Faster NN Queries on Outsourced Metric Data Assets Renuka Bandi #1, Madhu Babu Ch *2
Secure and Faster NN Queries on Outsourced Metric Data Assets Renuka Bandi #1, Madhu Babu Ch *2 #1 M.Tech, CSE, BVRIT, Hyderabad, Andhra Pradesh, India # Professor, Department of CSE, BVRIT, Hyderabad,
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,
Beginning C# 5.0. Databases. Vidya Vrat Agarwal. Second Edition
Beginning C# 5.0 Databases Second Edition Vidya Vrat Agarwal Contents J About the Author About the Technical Reviewer Acknowledgments Introduction xviii xix xx xxi Part I: Understanding Tools and Fundamentals
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Client Perspective Based Documentation Related Over Query Outcomes from Numerous Web Databases
Beyond Limits...Volume: 2 Issue: 2 International Journal Of Advance Innovations, Thoughts & Ideas Client Perspective Based Documentation Related Over Query Outcomes from Numerous Web Databases B. Santhosh
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
Mining Association Rules: A Database Perspective
IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology
Cleveland State University
Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00 5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred.
Bitmap Index an Efficient Approach to Improve Performance of Data Warehouse Queries
Bitmap Index an Efficient Approach to Improve Performance of Data Warehouse Queries Kale Sarika Prakash 1, P. M. Joe Prathap 2 1 Research Scholar, Department of Computer Science and Engineering, St. Peters
APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION ANALYSIS. email [email protected]
Eighth International IBPSA Conference Eindhoven, Netherlands August -4, 2003 APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION Christoph Morbitzer, Paul Strachan 2 and
NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE
www.arpapress.com/volumes/vol13issue3/ijrras_13_3_18.pdf NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE Hebah H. O. Nasereddin Middle East University, P.O. Box: 144378, Code 11814, Amman-Jordan
Regression model approach to predict missing values in the Excel sheet databases
Regression model approach to predict missing values in the Excel sheet databases Filling of your missing data is in your hand Z. Mahesh Kumar School of Computer Science & Engineering VIT University Vellore,
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Data Mining Techniques for Banking Applications
International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 2, Issue 4, April 2015, PP 15-20 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Data
DBTech Pro Workshop. Knowledge Discovery from Databases (KDD) Including Data Warehousing and Data Mining. Georgios Evangelidis
DBTechNet DBTech Pro Workshop Knowledge Discovery from Databases (KDD) Including Data Warehousing and Data Mining Dimitris A. Dervos [email protected] http://aetos.it.teithe.gr/~dad Georgios Evangelidis
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
Graphical Web based Tool for Generating Query from Star Schema
Graphical Web based Tool for Generating Query from Star Schema Mohammed Anbar a, Ku Ruhana Ku-Mahamud b a College of Arts and Sciences Universiti Utara Malaysia, 0600 Sintok, Kedah, Malaysia Tel: 604-2449604
Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results
, pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department
Extending Data Processing Capabilities of Relational Database Management Systems.
Extending Data Processing Capabilities of Relational Database Management Systems. Igor Wojnicki University of Missouri St. Louis Department of Mathematics and Computer Science 8001 Natural Bridge Road
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
