Improving Data Excellence Reliability and Precision of Functional Dependency
Tiwari Pushpa 1, K. Dhanasree 2, Dr. R.V. Krishnaiah 3
1 M.Tech Student, 2 Associate Professor, 3 PG Coordinator
1 Dept of CSE, 2 Dept of CSIT, 3 Dept of CSE
1,2 DRK Institute of Science & Technology, Hyderabad, Andhra Pradesh, India
3 DRK Group of Institutions, Hyderabad, Andhra Pradesh, India

ABSTRACT: Poor-quality data is a growing and expensive problem that affects many enterprises across all aspects of their business, from operational effectiveness to revenue protection. Conditional functional dependencies (CFDs) are a novel type of semantic rule that extends traditional functional dependencies (FDs). In this paper we present an approach for detecting inconsistencies in data that efficiently and robustly discovers CFDs and thereby improves data quality. Finding quality CFDs is an expensive process that involves intensive manual effort, so we develop techniques for discovering CFDs from relations in order to identify data-cleaning rules effectively. The discovery problem is harder for CFDs than for FDs; indeed, mining the patterns in CFDs introduces new challenges. Two algorithms are developed for discovering general CFDs. The first, CTANE, is a level-wise algorithm that extends TANE, a well-known algorithm for mining FDs. The second, FastCFD, is based on the depth-first approach used in FastFD, a method for discovering FDs. In addition, CFDMiner efficiently discovers constant CFDs, as verified by our experimental study. For general CFDs, CTANE works well when the given relation is large, but it does not scale well with the arity of the relation. Together these algorithms provide a set of cleaning-rule discovery tools from which users can choose for different applications.

Keywords: Conditional Functional Dependencies, CFDMiner, CTANE, Functional Dependencies.
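To make the notion of a CFD used in the abstract concrete, the following minimal Python sketch checks a small relation against an ordinary FD and two CFDs. It is an illustration under stated assumptions, not one of the discovery algorithms of this paper: the sample rows, the attribute names (country, zip, street, city), the wildcard symbol "_" and the function names are all hypothetical.

```python
from collections import defaultdict

WILDCARD = "_"  # assumed notation for an unnamed variable in a pattern

# Hypothetical sample relation; every value here is made up for illustration.
ROWS = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "street": "Mayfield", "city": "London"},
    {"country": "US", "zip": "07974", "street": "Mtn Ave", "city": "Murray Hill"},
    {"country": "US", "zip": "07974", "street": "Main St", "city": "Murray Hill"},
    {"country": "UK", "zip": "W1B", "street": "Regent St", "city": "London"},
]


def matches(row, attrs, pattern):
    """True if the row agrees with every constant in the pattern."""
    return all(p == WILDCARD or row[a] == p for a, p in zip(attrs, pattern))


def cfd_violations(rows, lhs, rhs, lhs_pattern, rhs_pattern=WILDCARD):
    """Return the sorted indices of rows involved in a violation of the
    CFD (lhs -> rhs, (lhs_pattern || rhs_pattern)).

    An all-wildcard pattern gives an ordinary FD.
    """
    violating = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        if not matches(row, lhs, lhs_pattern):
            continue  # the CFD only constrains rows matching the LHS pattern
        if rhs_pattern != WILDCARD:
            # Constant right-hand side: each matching row is checked alone.
            if row[rhs] != rhs_pattern:
                violating.add(i)
        else:
            # Variable right-hand side: group rows that agree on the LHS.
            groups[tuple(row[a] for a in lhs)].append(i)
    for indices in groups.values():
        # Rows that agree on the LHS must also agree on the RHS.
        if len({rows[i][rhs] for i in indices}) > 1:
            violating.update(indices)
    return sorted(violating)


if __name__ == "__main__":
    key = ["country", "zip"]
    print(cfd_violations(ROWS, key, "street", ["_", "_"]))   # plain FD
    print(cfd_violations(ROWS, key, "street", ["UK", "_"]))  # CFD for UK only
    print(cfd_violations(ROWS, key, "city", ["UK", "EH4"], "Edinburgh"))
```

On this sample the plain FD [country, zip] -> street is violated by the two US rows, the same dependency restricted to country = UK holds, and the constant CFD flags the one UK row with zip EH4 whose city is not Edinburgh. This is the sense in which a CFD can capture a rule that holds only on part of a relation.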
1. INTRODUCTION: Many organizations suffer from poor-quality data, and the problem is getting worse because data is growing at astonishing rates while few organizations have an effective data governance process [1] [3]. Poor data quality can occur along several dimensions, and consistency is one dimension that many organizations struggle with. Conditional functional dependencies (CFDs) were recently introduced for data cleaning [2] [4] [6]. CFDs extend standard functional dependencies (FDs) by enforcing patterns of semantically related constants. CFDs have proven more effective than FDs in detecting and repairing data inconsistencies, and they are expected to be adopted by data-cleaning tools that currently employ standard FDs [5] [7] [8]. However, for CFD-based cleaning methods to be effective in practice, techniques must be in place that can automatically discover or learn CFDs from sample data, to be used as data-cleaning rules. Indeed, it is often unrealistic to rely solely on human experts to design CFDs through an expensive and long manual process [9] [10].

Fig 1: Functional Dependency between X and Y

Cleaning-rule discovery is critical to commercial data quality tools. To reduce redundancy, each CFD in the canonical cover of discovered CFDs should be minimal, i.e., nontrivial and left-reduced [11] [13]. The discovery problem is, however, highly nontrivial. Moreover, CFD discovery requires mining semantic patterns with constants, a challenge that was not encountered when discovering FDs [12] [14] [15].

The prospective applications of CFDs in data cleaning highlight the need for further investigation of CFD discovery. First, constant CFDs are particularly important for object identification and thus deserve separate treatment: one wants efficient methods to discover constant CFDs without paying the price of discovering all CFDs [16] [19]. Second, level-wise algorithms may not perform well on sample relations of large arity, given their inherent exponential complexity, so more effective methods must be in place to deal with datasets of large arity. Third, a host of techniques have been developed for association rule mining, and it is only natural to capitalize on these for CFD discovery [17] [18] [20]. These techniques can not only be readily used in constant CFD discovery, but can also considerably speed up general CFD discovery.

2. CONDITIONAL DEPENDENCIES RELATED WORK: Prior work on conditional dependencies has mostly focused on the consistency and implication analyses of CFDs, repairing methods to localize and fix errors detected by CFDs, propagation of CFDs from source data to views in data integration, extensions of CFDs by adding disjunction and negation or by adding ranges, confidence of CFDs, and extensions of inclusion dependencies with conditions. To our knowledge, CFD discovery itself has been studied only rarely; most previous work assumes that CFDs are already designed and provided.

There has been a host of work on minimal FD discovery. However, minimal CFDs are more involved than their FD counterparts: they require both the minimality of attributes and the minimality of patterns. Our algorithms CTANE and FastCFD, which extend TANE and FastFD respectively, discover minimal CFDs.

For a fixed traditional FD, earlier work proposed criteria for sensible patterns that, together with the FD, make useful CFDs. It showed that the problem of finding such patterns is NP-complete, and developed efficient heuristic algorithms for discovering patterns from samples. In contrast, this work studies CFD discovery when the embedded traditional FDs are not given. Another line of work develops an algorithm for discovering CFDs that, like this work, aims to find both the traditional FDs and the patterns in CFDs. In contrast, the algorithms of this work are designed to discover minimal k-frequent CFDs. The connection between association rule mining and constant CFD discovery has also been observed.

3. RESULTS: CFDMiner, which mines only constant CFDs, is multiple orders of magnitude faster than the other algorithms, which discover both constant and variable CFDs. As the database size grows, more item sets with large support must be considered when constructing the difference sets, which significantly degrades the performance of NaiveFast. When DBSIZE is below one million tuples, which is still reasonably large, FastCFD outperforms both CTANE and NaiveFast. This confirms the effectiveness of the optimization that leverages closed item sets.

4. CONCLUSION: We have developed and implemented three algorithms for discovering minimal CFDs: first, CFDMiner, for mining minimal constant CFDs, a class of CFDs important for both data cleaning and data integration; second, CTANE, for discovering general minimal CFDs based on the level-wise approach; and third, FastCFD, for discovering general minimal CFDs based on a depth-first search strategy together with a novel optimization technique via closed-item-set mining. These results provide a set of tools from which users can choose for different applications. When only constant CFDs are needed, one can simply use CFDMiner without paying the price of mining general CFDs. When the arity of a sample dataset is large, one should opt for FastCFD. When k-frequent CFDs are needed for a large k, one could use CTANE.

There is naturally much more to be done. While FastCFD employs techniques for mining closed item sets, we expect that other mining techniques may also shed light on improving the performance of discovery algorithms. We also plan to explore the use of CFD inference in discovery, to eliminate CFDs that are entailed by those already found.

REFERENCES:
[1] W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis, Conditional functional dependencies for capturing data inconsistencies, TODS, vol. 33, no. 2, June.
[2] G. Cong, W. Fan, F. Geerts, X. Jia, and S. Ma, Improving data quality: Consistency and accuracy, in VLDB.
[3] M. Arenas, L. E. Bertossi, and J. Chomicki, Consistent query answers in inconsistent databases, TPLP, vol. 3, no. 4-5.
[4] J. Chomicki and J. Marcinkowski, Minimal-change integrity maintenance using tuple deletions, Information and Computation, vol. 197, no. 1-2.
[5] J. Wijsen, Database repairing using updates, TODS, vol. 30, no. 3.
[6] C. Batini and M. Scannapieco, Data Quality: Concepts, Methodologies and Techniques. Springer.
[7] E. Rahm and H. H. Do, Data cleaning: Problems and current approaches, IEEE Data Eng. Bull., vol. 23, no. 4, pp. 3-13.
[8] Gartner, Forecast: Data quality tools, worldwide.
[9] S. Abiteboul, R. Hull, and V. Vianu, Foundations of Databases. Addison-Wesley.
[10] L. Golab, H. Karloff, F. Korn, D. Srivastava, and B. Yu, On generating near-optimal tableaux for conditional functional dependencies, in VLDB.
[11] E.-P. Lim, J. Srivastava, S. Prabhakar, and J. Richardson, Entity identification in database integration, Inf. Sci., vol. 89, no. 1-2, pp. 1-38, 1996.
[12] H. Mannila and K.-J. Räihä, Dependency inference, in VLDB.
[13] Y. Huhtala, J. Kärkkäinen, P. Porkka, and H. Toivonen, TANE: An efficient algorithm for discovering functional and approximate dependencies, Comput. J., vol. 42, no. 2.
[14] C. M. Wyss, C. Giannella, and E. L. Robertson, FastFDs: A heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances - extended abstract, in DaWaK.
[15] P. A. Flach and I. Savnik, Database dependency discovery: A machine learning approach, AI Commun., vol. 12, no. 3.
[16] S. Lopes, J.-M. Petit, and L. Lakhal, Efficient discovery of functional dependencies and Armstrong relations, in EDBT.
[17] T. Calders, R. T. Ng, and J. Wijsen, Searching for dependencies at multiple abstraction levels, TODS, vol. 27, no. 3.
[18] R. S. King and J. J. Legendre, Discovery of functional and approximate functional dependencies in relational databases, JAMDS, vol. 7, no. 1.
[19] I. F. Ilyas, V. Markl, P. J. Haas, P. Brown, and A. Aboulnaga, CORDS: Automatic discovery of correlations and soft functional dependencies, in SIGMOD.
[20] H. Mannila and H. Toivonen, Levelwise search and borders of theories in knowledge discovery, Data Min. Knowl. Discov., vol. 1, no. 3.

BIOGRAPHY:
TIWARI PUSHPA completed her MCA at Maharaja Sayajirao University, Baroda, and is pursuing an M.Tech (C.S.E) at DRK Institute of Science and Technology, JNTUH, Hyderabad, Andhra Pradesh, India. Her main research interests include Data Mining and Databases.

K. Dhanasree, M.Tech, (Ph.D), is an Associate Professor in the Dept. of CSIT, DRK Institute of Science and Technology, JNTU, Hyderabad. Research areas: Data Mining, Computer Networks, and Security Computing. Previously published three international journal papers and one international conference paper.

Dr. R.V. Krishnaiah received an M.Tech (EIE) from NIT Warangal, an M.Tech (CSE) from JNTU, and a Ph.D from JNTU Ananthapur. He holds memberships in the professional bodies MIE, MIETE, and MISTE. He is working as Principal at DRK Institute of Science and Technology, Hyderabad. His main research interests include Image Processing, Security Systems, Sensors, Intelligent Systems, Computer Networks, Data Mining, Software Engineering, and network protection and security control. He has various publications and presentations in national and international journals.
