A Knowledge Discovery Based Big Data for Context aware Monitoring Model for Assisted Healthcare


M. Angel Vinodhini
Student, Department of Information Technology, Periyar Maniammai University, Periyar Nagar, Vallam, Thanjavur, Tamil Nadu, India.

R. Vanitha
Assistant Professor, Department of Information Technology, Periyar Maniammai University, Periyar Nagar, Vallam, Thanjavur, Tamil Nadu, India.

Abstract

Cloud computing is a fast growing technology that eliminates the need to maintain expensive hardware, software and dedicated storage space. It is an on-demand computing model that stores data in remote locations rather than in local storage attached to the computer. Big data utilizes cloud computing to store, process and analyze large amounts of data. The existing models have no prior knowledge about the storage repositories, and the rapid increase in the amount of data increases storage and time complexity. To overcome these drawbacks, this paper proposes a fuzzy based MapReduce Apriori algorithm using context aware monitoring to extract the information required for decision making. It also proposes a Context Aware Healthcare Monitoring (CAHM) system to monitor the users' activities and to classify them based on their behavior. Context awareness provides a personalized service that matches the users' expectations without any explicit request. The proposed algorithm has two functions: Mapper and Reducer. The mapper function gets a key-value pair as input and generates intermediate key-value pairs. The reducer function aggregates all the associated key-value pairs using context aware information. This algorithm takes less execution time and achieves high accuracy. The use of context aware monitoring improves the accuracy of the system. The experimental results evaluate the proposed system in terms of efficiency, accuracy and activity count.

Keywords: Cloud computing, MapReduce, Apriori, Big Data, Context aware monitoring, Fuzzy.
Introduction

Cloud computing provides reliable software, hardware and infrastructure to perform complex, large scale computation tasks. Cloud service models are classified as Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). The benefits of cloud storage include economies of scale, reduced capital cost, improved accessibility and flexibility. Big data is a massive volume of both structured and unstructured data that requires large data handling and storage facilities. Cloud storage provides a platform to address the data storage required for big data. The datasets in big data are large because of the presence of unstructured heterogeneous data. Data heterogeneity is one of the major challenges of cloud storage. Larger organizations seek both faster and better decisions with big data. The huge amount of data, which includes patient histories, contact information, medical records and profiles, led to the development of cloud-based healthcare frameworks.

Traditional systems have the following drawbacks: It is difficult to store and manipulate the massive amount of gathered data on a local server. The medical rules are not personalized based on the physical activities and the user's profile. The monitoring systems may generate false alarms about the conditions of patients. They have no knowledge of past and current events, which makes decision making harder.

To overcome these drawbacks, a fuzzy based MapReduce Apriori algorithm is proposed using context aware monitoring. MapReduce is a programming paradigm used for big data analysis. Map and reduce are two separate and distinct tasks performed on big data. Map converts the data into individual elements (key-value pairs). Reduce combines the results from map into a smaller set of tuples. The map tasks run in parallel. The large storage requirement and the parallel processing are both satisfied by cloud computing.
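The map and reduce tasks described above can be illustrated with a small in-memory sketch. It is not the paper's implementation and does not use Hadoop: the sensor records, the names `mapper`, `shuffle`, `reducer` and `run_job`, and the choice of peak and mean heart rate as the summary are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sensor records: (patient_id, heart_rate). Illustrative values only.
RECORDS = [("p1", 72), ("p2", 88), ("p1", 110), ("p2", 91), ("p1", 75)]


def mapper(record):
    """Map: turn one input record into intermediate (key, value) pairs."""
    patient_id, heart_rate = record
    yield patient_id, heart_rate


def shuffle(pairs):
    """Group intermediate values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce: combine all values of one key into a smaller summary tuple."""
    return key, max(values), sum(values) / len(values)


def run_job(records):
    """Run map, shuffle and reduce sequentially over the records."""
    intermediate = (pair for record in records for pair in mapper(record))
    groups = shuffle(intermediate)
    return [reducer(key, values) for key, values in sorted(groups.items())]


if __name__ == "__main__":
    for patient_id, peak, mean in run_job(RECORDS):
        print(patient_id, peak, round(mean, 1))
```

In a real cluster each call to `mapper` would run on a separate data block in parallel; the sequential loop here only shows the data flow between the two phases.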
Context aware monitoring enhances the effectiveness of service delivery based on the activity and location information of the users. Context awareness correlates the analysis with different physiological data such as heart rate, blood pressure, sugar level and ECG to generate positive alarms. The main objective of this work is to extract the information required for the accurate prediction of any patient situation. It makes the job of medical professionals easier by using the early diagnosis reports of patients stored in the context aware database. The proposed system consists of the following steps: data preprocessing, fuzzy rule generation, feature extraction, rule matching and classification. Data preprocessing reduces the large data sets by selecting the most informative features or instances in the data. MapReduce based feature extraction decomposes the original dataset into blocks of instances and merges the partially obtained results.

The rule matching and classification are done using the context attributes and the minimum support value. The remaining sections of this paper are organized as follows: Section II surveys the traditional approaches to healthcare systems in big data. Section III provides an overview of the fuzzy based MapReduce Apriori algorithm using context aware monitoring. Section IV shows the performance analysis of the proposed system. Section V concludes the paper and presents the future work.

Related Work

This section presents some existing works related to cloud computing usage in big data.

Ji, et al [1] presented the key issues of big data processing, the cloud computing database and data storage. The authors also introduced MapReduce optimization strategies and applications to improve the performance of big data processing.

Assunção, et al [2] discussed the approaches and environments for carrying out analytics on clouds for big data applications. Data management and architectures, model development, scoring, visualization and user interaction, and business models were the important areas of analytics and big data. The challenging issues in big data management were data heterogeneity, volume and velocity, data storage, and data integration. With scalability and on-demand resources, cloud computing helped to alleviate these problems.

Kale, et al [3] analyzed the problems and challenges of big data technology. Solutions based on the Hadoop and MapReduce frameworks were presented to speed up the processing of big data. The MapReduce distributed data processing architecture was described in detail, with emphasis on its fault tolerance and scalability.

Lee, et al [4] surveyed the MapReduce framework in various technical aspects to assist the database and open source communities. Fault tolerance, high scalability, flexibility, and independent storage were the advantages of MapReduce.
High level languages, schema support, flexible data flow, I/O optimization, scheduling, join, and performance tuning were incorporated to improve the MapReduce framework.

Kim, et al [5] proposed a context-aware item recommendation method to establish personalized u-healthcare services. A context information model was developed based on the classification of context information. A context hidden Markov model was developed to recommend items efficiently in the u-healthcare environment.

Fenza, et al [6] presented an integrated environment to provide personalized healthcare services. Fuzzy logic was applied to automatically recognize the context and to find the right set of healthcare services among the available services. Ontologies enabled service characterization in terms of context.

Forkan, et al [7] described a hidden Markov model based approach to detect the abnormal behaviors of patients using their statistical histories. The final decision was made by a fused fuzzy rule-based model. The fuzzy model produced an accurate context-aware alert about the health related changes in a patient.

Fong, et al [8] presented a healthcare system that employs non-contact ECG measurement for the mobile cloud environment. To collect biomedical signals from multiple locations, health data were synchronized into the healthcare cloud computing service. A Global Positioning System (GPS) based Google Maps display was used to locate the monitored user with an abnormal health condition. Medical professionals could easily access the medical data through a web page application.

Jin, et al [9] proposed a data fusion based patient monitoring system in cloud computing to improve the diagnostic accuracy. Complex machine learning approaches and a large knowledge database were used for fusing multi-parameter physiological signals. The fusion of multi-parameter data provided significant advantages over single-parameter data.
Bourouis, et al [10] proposed a prototype of a cloud health monitoring system using Wireless Body Area Sensor Networks (WBASN) to determine the state of patients. The sensory parameters were fed into a neural network engine to fuse information from the WBASN. A hybrid location system was used to determine the location of patients. The medical authority was alerted in a timely manner when an abnormal situation was detected.

Chawla, et al [11] proposed a patient centric framework for providing a personalized prediction for patients using a collaborative filtering methodology. The collaborative filtering generated predictions based on a set of other similar patients. The active participation of patients improved the prediction accuracy compared with the data centric model.

Jain, et al [12] surveyed the implementation of the Apriori algorithm on different healthcare data sets. The Apriori algorithm was compared with the predictive Apriori algorithm and the Tertius algorithm. The Apriori algorithm achieved better results in mining frequent itemsets.

Yuan, et al [13] presented a fuzzy-logic based Context Aware Real-time Assistant (CARA) to support remote patient monitoring and caregiver notification. A rule based approach was adopted to support context-aware decision making. On real-time environmental data, CARA detected emergency situations accurately and with high performance.

Chiang, et al [14] explained fuzzy algorithm based activity recognition to calibrate the sensed data and obtain output movements. The fuzzy parameters were calibrated by adaptive feature sets. The overall accuracy of the u-healthcare system was improved using fuzzy techniques.

Tartarisco, et al [15] integrated an autoregressive model, artificial neural networks, and a fuzzy logic model to analyze the features of electrocardiographic signals and human activities.
The fuzzy rule based classification enhanced the understanding of the dynamic evolution of diseases through continuous monitoring and achieved higher performance.

Proposed System

This section provides an overview of the context-aware monitoring system using fuzzy logic. The aim of this work is to classify the patients based on the statistical information gathered from their day-to-day activities. Fig.1 describes the overall flow of the proposed system.

Figure 1: Flow of the proposed system

Preprocessing

The data from the Aerial and Drug prediction datasets are preprocessed before further processing. The data collected from the different sensors undergo data cleaning, data integration, data reduction and data transformation. Data cleaning removes noisy data and corrects inconsistent data. In data integration, the data from multiple sources are merged into a single data store. Data reduction eliminates redundant data and reduces the size of the dataset by clustering or aggregation. The preprocessing also includes filling in missing values. Data preprocessing improves the mining efficiency and quality. It provides a reduced dataset that produces the same analytical results.

Fuzzy Method

Fuzzy logic models, also known as fuzzy inference systems, consist of a number of conditional if-then rules. Fuzzy logic is used to improve the prediction accuracy of the system. The fuzzy method has advantages such as flexibility, expressiveness, and computational efficiency. It has the following steps: (i) fuzzification, (ii) rule evaluation and (iii) defuzzification. The fuzzification step gets its input from the reduced datasets. The fuzzified inputs are applied to the fuzzy rules using the fuzzy operators, which generates a less complex and easily computable aggregated output. The output of rule evaluation is fed into the defuzzifier to get a crisp result.

Rule Generation

The rules are generated using the information gathered from the preprocessed data. The threshold values are chosen by analyzing the activities of the patient recorded in the dataset. Rule generation also includes the calculation of the minimum support value. The generated rules are used in the rule evaluation step of the fuzzy inference system.

Feature Extraction

To improve the performance of the fuzzy based context aware system, feature extraction is used to reduce the feature space. A minimal set of features is extracted from the preprocessed dataset through functional mapping. The approximately matching patterns are extracted using rule matching. The redundancy values are calculated between the features not yet extracted and the recently extracted features. It is difficult to compute the mutual and conditional information results. Several MapReduce phases are executed to distribute and join the probabilities with their corresponding redundancy values.

Rule Matching

The rule matching is performed using the threshold values of each field in the dataset. Each field is assigned a support value based on how closely it matches the same field in an existing record. A field whose support reaches the minimum support value is considered a match. If there is no match, the process is redirected to the rule generation stage.

Classification using MapReduce Apriori algorithm

MapReduce is a parallel programming model based on a key-value pair data structure. The two key operations of MapReduce are the map function and the reduce function. The map function processes the independent data blocks and outputs the summary information. The reduce function processes the intermediate results produced by the map function. Each map task receives a subset of the original training data and returns a generated set of prototypes. The reducer iteratively aggregates all the resulting sets to produce a final generated set. The reduce phase is the key step of MapReduce data partitioning. For datasets larger than ten thousand instances, windowing can be applied to reduce the storage requirements and the classification time. The data reduction process in MapReduce provides a scalable and flexible way to apply the feature extraction. Both the accuracy and the runtime can be improved using data reduction.

Two classes, labelled known and unknown activities, are formed. The input data from the reduced datasets are analyzed according to the contextual information to accurately predict the patient's behavior. The fuzzy method classifies the activities of patients into known and unknown activities based on the threshold values. If the generated output exceeds the threshold, the activity is recorded as an abnormal (unknown) activity. Once an unknown activity is recorded, the alarm system notifies the medical professional.

Algorithm I - MapReduce Apriori algorithm

Input: A set of context information ID_t^k for all CAHM systems.
Output: Context state C_j^t for each CAHM system j.

Procedure Mapper()
begin
    for each CAHM system j do
        for domain 1 to k do
            generate ID_t^k for time t
            output(key=(j,t), value=ID_t^k)
        if ID_s ≠ ∅ then
            output(key=(j,t), value=ID_s)
        end if
end

Procedure Reducer(key=(j,t), value=set of ID_t^k)
begin
    for each CAHM system j do
        C_j^t ← ∅
        for each ID_t^k at t in CAHM system j do
            C_j^t ← C_j^t ∪ ID_t^k
        if Exists(ID_s) in CAHM system j then
            C_j^t ← C_j^t ∪ ID_s
        end if
        output(key=(j,t), value=C_j^t)
end

Experimental Results

This section analyses the performance of the CAHM system. The performance of the proposed system is analyzed in terms of activity count, accuracy, and efficiency. The proposed system is compared with the Cloud-oriented Context-aware Middleware in Ambient Assisted Living (CoCaMAAL) model. Two datasets, namely the Aerial and Drug prediction datasets, are used for experimentation. The Aerial dataset contains data captured from the sensors and the patient's daily activities. The activities of two users, user A and user B, are compared in the Aerial dataset. The attributes of the Aerial dataset include start time, start date, end time, end date, activity, type and location. The Drug prediction dataset has attributes such as name, age, disease, symptoms, and drugs used. The records of 63 patients from the Drug prediction dataset are used for measuring the performance and accuracy.

Activity Count

Activity count is the number of known and unknown activities performed by the users. Patients are continuously monitored using sensors to detect changes in their behavior. The date, time and location of each activity are captured. The daily activities of any two patients are extracted and loaded into the Aerial dataset. The daily activities of user A are classified into known and unknown activities based on the threshold, and the number of known and unknown activities is counted. Fig.2 shows the activity count of user A. The number of known activities is higher than the number of unknown activities.

Figure 2: Activity count of user A

Similarly, the activity count of user B is calculated and the number of known and unknown activities is plotted on a graph. Fig.3 shows the activity count of user B. The number of known activities of user B is also higher than the number of unknown activities. The overall activity count of the proposed system is compared with the activity count of the existing system. The higher activity count of the proposed system indicates seamless monitoring and reduced false alarm rates. Fig.4 compares the activity counts of the proposed and existing systems.
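The Mapper and Reducer of Algorithm I, followed by the threshold-based counting of known and unknown activities, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the readings, the static context item standing in for ID_s, the linear ramp membership function, its breakpoints in `RAMPS`, and the alarm threshold of 0.5 are all hypothetical. The fuzzy step is reduced to a single rule evaluated with fuzzy OR (max), and the resulting score is compared with the threshold in place of a full defuzzification stage.

```python
from collections import defaultdict

# Hypothetical context items: (system j, time t, domain, value). Illustrative only.
READINGS = [
    ("A", 1, "heart_rate", 72), ("A", 1, "systolic_bp", 118),
    ("A", 2, "heart_rate", 131), ("A", 2, "systolic_bp", 121),
    ("B", 1, "heart_rate", 80), ("B", 1, "systolic_bp", 165),
]
STATIC_CONTEXT = {"A": ("age", 67)}  # stands in for ID_s; system B has none
RAMPS = {"heart_rate": (100, 140), "systolic_bp": (130, 170)}  # assumed breakpoints
ALARM_THRESHOLD = 0.5  # assumed threshold on the fuzzy output


def mapper(readings, static_context):
    """Emit ((j, t), item) per domain item, plus the static item once per key."""
    seen = set()
    for j, t, domain, value in readings:
        yield (j, t), (domain, value)
        if (j, t) not in seen and j in static_context:
            seen.add((j, t))
            yield (j, t), static_context[j]


def reducer(pairs):
    """Union all items sharing a key (j, t) into one context state C_j^t."""
    states = defaultdict(dict)
    for key, (name, value) in pairs:
        states[key][name] = value
    return dict(states)


def high_membership(value, low, high):
    """Fuzzification: degree in [0, 1] to which a value counts as 'high'."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def abnormality(state):
    """Rule evaluation: IF any monitored vital is high THEN abnormal (fuzzy OR)."""
    degrees = [high_membership(state[d], lo, hi)
               for d, (lo, hi) in RAMPS.items() if d in state]
    return max(degrees, default=0.0)


def activity_count(states):
    """Count known and unknown context states per CAHM system."""
    counts = defaultdict(lambda: {"known": 0, "unknown": 0})
    for (j, _t), state in states.items():
        label = "unknown" if abnormality(state) > ALARM_THRESHOLD else "known"
        counts[j][label] += 1
    return {j: dict(c) for j, c in counts.items()}


if __name__ == "__main__":
    states = reducer(mapper(READINGS, STATIC_CONTEXT))
    print(activity_count(states))
```

Keying every item by (j, t) lets the reducer build each context state independently, so states of different systems or time steps could be reduced on different nodes.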

Figure 3: Activity count of user B

Figure 4: Comparison between the activity counts of CAHM system and CoCaMAAL system

Accuracy

Accuracy measures how closely the predictions match the records in the dataset. The Drug prediction dataset is loaded and a selected feature is extracted. All patient records that contain the selected feature are listed. According to the Apriori algorithm, the match scores are computed based on the support and confidence values of each feature. The key-value pairs are generated based on the minimum support value. The generated pairs are passed through the MapReduce phases to attain the best accuracy rate. The accuracy rates of the existing and proposed systems are shown in Fig.5. The accuracy of the results generated by the proposed system is higher than that of the existing system. The CAHM system achieved 90% accuracy, whereas the CoCaMAAL model achieved only 80%. The drugs and the adverse reactions of the drugs are also predicted accurately. The increase in accuracy reduces the false alarm rate, which eases the treatment process for medical professionals.

Figure 5: Accuracy of CAHM system vs. CoCaMAAL model

Efficiency

Efficiency is the performance of the system in classifying records and predicting results from the dataset. The fuzzy logic provides better classification results along with high efficiency. Compared with an association rule mining algorithm, the MapReduce Apriori algorithm takes less computation time. The contextual information from the Aerial dataset provides better feature extraction, which in turn reduces the dataset for classification. The efficiency of the CAHM system and the CoCaMAAL model is shown in Fig.6.

Figure 6: Efficiency of CAHM system vs. CoCaMAAL model

The graph shows that the efficiency of the CAHM system is higher than that of the CoCaMAAL model: the efficiency of the CAHM system is 10% higher. The processing of intermediate results in the reducer phase increases the efficiency of the system.
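The support and confidence values that the accuracy evaluation relies on follow the standard association rule definitions, which a short sketch can make concrete. The patient records below are invented for illustration and are not taken from the Drug prediction dataset; the function names and the minimum support of 0.5 are likewise assumptions.

```python
from itertools import combinations

# Hypothetical patient records as item sets (symptoms and drugs). Illustrative only.
RECORDS = [
    {"fever", "cough", "paracetamol"},
    {"fever", "paracetamol"},
    {"cough", "syrup"},
    {"fever", "cough", "paracetamol"},
]


def support(itemset, records):
    """Fraction of records that contain every item of the itemset."""
    return sum(itemset <= record for record in records) / len(records)


def confidence(antecedent, consequent, records):
    """support(antecedent and consequent together) divided by support(antecedent)."""
    base = support(antecedent, records)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(antecedent | consequent, records) / base


def frequent_pairs(records, min_support):
    """Return all item pairs whose support reaches the minimum support value."""
    items = sorted(set().union(*records))
    result = {}
    for pair in combinations(items, 2):
        value = support(set(pair), records)
        if value >= min_support:
            result[pair] = value
    return result


if __name__ == "__main__":
    print(frequent_pairs(RECORDS, min_support=0.5))
    print(confidence({"fever"}, {"paracetamol"}, RECORDS))
```

Apriori proper extends this pair search level by level, generating candidate itemsets of size k+1 only from frequent itemsets of size k; the sketch stops at pairs to keep the two measures in focus.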

Conclusion

This paper proposed a Context-Aware Healthcare Monitoring system to monitor the users' activities and to classify them based on their behavior. To extract the information required for decision making, a fuzzy based MapReduce Apriori algorithm is implemented in this system. The fuzzy logic helps to increase the classification accuracy. The MapReduce Apriori algorithm reduces the computational complexity. The Map and Reduce functions process the key-value pairs, which are generated using the support and confidence values. The users' activities are classified into known and unknown activities based on the threshold. An alarm notification is generated in case of abnormal activities. The results showed that the CAHM system achieved better activity count, accuracy, and efficiency than the CoCaMAAL model.

References

[1] C. Ji, Y. Li, W. Qiu, U. Awada, and K. Li, "Big data processing in cloud computing environments," in International Symposium on Pervasive Systems, Algorithms and Networks (ISPAN), 2012.
[2] M. D. Assunção, R. N. Calheiros, S. Bianchi, M. A. Netto, and R. Buyya, "Big Data computing and clouds: Trends and future directions," Journal of Parallel and Distributed Computing, vol. 79, pp. 3-15.
[3] M. S. A. Kale and S. S. Dandge, "Understanding the Big Data problems and their solutions using Hadoop and Map-Reduce," Application or Innovation in Engineering and Management (IJAIEM).
[4] K.-H. Lee, Y.-J. Lee, H. Choi, Y. D. Chung, and B. Moon, "Parallel data processing with MapReduce: a survey," ACM SIGMOD Record, vol. 40.
[5] J. Kim, D. Lee, and K.-Y. Chung, "Item recommendation based on context-aware model for personalized u-healthcare service," Multimedia Tools and Applications, vol. 71.
[6] G. Fenza, D. Furno, and V. Loia, "Hybrid approach for context-aware service discovery in healthcare domain," Journal of Computer and System Sciences, vol. 78.
[7] A. R. M. Forkan, I. Khalil, Z. Tari, S. Foufou, and A. Bouras, "A context-aware approach for long-term behavioural change detection and abnormality prediction in ambient assisted living," Pattern Recognition, vol. 48.
[8] E.-M. Fong and W.-Y. Chung, "Mobile cloud-computing-based healthcare service by noncontact ECG monitoring," Sensors, vol. 13.
[9] Z. Jin, X. Wang, Q. Gui, B. Liu, and S. Song, "Improving diagnostic accuracy using multiparameter patient monitoring based on data fusion in the cloud," in Future Information Technology, ed: Springer, 2014.
[10] A. Bourouis, M. Feham, and A. Bouchachia, "A new architecture of a ubiquitous health monitoring system: a prototype of cloud mobile health monitoring system," arXiv preprint.
[11] N. V. Chawla and D. A. Davis, "Bringing big data to personalized healthcare: a patient-centered framework," Journal of General Internal Medicine, vol. 28.
[12] D. Jain and S. Gautam, "Implementation of Apriori Algorithm in Health Care Sector: A Survey," International Journal of Computer Science and Communication Engineering, vol. 2.
[13] B. Yuan and J. Herbert, "Fuzzy CARA - a fuzzy-based context reasoning system for pervasive healthcare," Procedia Computer Science, vol. 10.
[14] S.-Y. Chiang, Y.-C. Kan, Y.-C. Tu, and H.-C. Lin, "Activity recognition by fuzzy logic system in wireless sensor network for physical therapy," in Intelligent Decision Technologies, ed: Springer, 2012.
[15] G. Tartarisco, G. Baldus, D. Corda, R. Raso, A. Arnao, M. Ferro, et al., "Personal Health System architecture for stress monitoring and support to clinical decisions," Computer Communications, vol. 35.


More information

A Survey of Cloud Based Health Care System

A Survey of Cloud Based Health Care System A Survey of Cloud Based Health Care System Chandrani Ray Chowdhury Assistant Professor, Dept. of MCA, SDET-Brainware Group of Institution, Barasat, West Bengal, India ABSTRACT: Cloud communicating is an

More information

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India 3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India Call for Papers Cloud computing has emerged as a de facto computing

More information

The WAMS Power Data Processing based on Hadoop

The WAMS Power Data Processing based on Hadoop Proceedings of 2012 4th International Conference on Machine Learning and Computing IPCSIT vol. 25 (2012) (2012) IACSIT Press, Singapore The WAMS Power Data Processing based on Hadoop Zhaoyang Qu 1, Shilin

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Cloud Computing in Medical Diagnosis for improving Health Care Environment

Cloud Computing in Medical Diagnosis for improving Health Care Environment Cloud Computing in Medical Diagnosis for improving Health Care Environment Dr.V.Jeyabalaraja 1,Dr.M.S.Josephine 2 1 Professor, Velammal Engineering College, 2 Professor, Dr.MGR University, jeyabalaraja@gmail.com,

More information

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India 1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto

More information

A Demonstration of a Robust Context Classification System (CCS) and its Context ToolChain (CTC)

A Demonstration of a Robust Context Classification System (CCS) and its Context ToolChain (CTC) A Demonstration of a Robust Context Classification System () and its Context ToolChain (CTC) Martin Berchtold, Henning Günther and Michael Beigl Institut für Betriebssysteme und Rechnerverbund Abstract.

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Big Data Mining Services and Knowledge Discovery Applications on Clouds Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

Big Data Storage Architecture Design in Cloud Computing

Big Data Storage Architecture Design in Cloud Computing Big Data Storage Architecture Design in Cloud Computing Xuebin Chen 1, Shi Wang 1( ), Yanyan Dong 1, and Xu Wang 2 1 College of Science, North China University of Science and Technology, Tangshan, Hebei,

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy

More information

How To Analyze Log Files In A Web Application On A Hadoop Mapreduce System

How To Analyze Log Files In A Web Application On A Hadoop Mapreduce System Analyzing Web Application Log Files to Find Hit Count Through the Utilization of Hadoop MapReduce in Cloud Computing Environment Sayalee Narkhede Department of Information Technology Maharashtra Institute

More information

Task Scheduling in Hadoop

Task Scheduling in Hadoop Task Scheduling in Hadoop Sagar Mamdapure Munira Ginwala Neha Papat SAE,Kondhwa SAE,Kondhwa SAE,Kondhwa Abstract Hadoop is widely used for storing large datasets and processing them efficiently under distributed

More information

Comparision of k-means and k-medoids Clustering Algorithms for Big Data Using MapReduce Techniques

Comparision of k-means and k-medoids Clustering Algorithms for Big Data Using MapReduce Techniques Comparision of k-means and k-medoids Clustering Algorithms for Big Data Using MapReduce Techniques Subhashree K 1, Prakash P S 2 1 Student, Kongu Engineering College, Perundurai, Erode 2 Assistant Professor,

More information

Web Usage Mining: Identification of Trends Followed by the user through Neural Network

Web Usage Mining: Identification of Trends Followed by the user through Neural Network International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 617-624 International Research Publications House http://www. irphouse.com /ijict.htm Web

More information

Bisecting K-Means for Clustering Web Log data

Bisecting K-Means for Clustering Web Log data Bisecting K-Means for Clustering Web Log data Ruchika R. Patil Department of Computer Technology YCCE Nagpur, India Amreen Khan Department of Computer Technology YCCE Nagpur, India ABSTRACT Web usage mining

More information

Enhancing Quality of Data using Data Mining Method

Enhancing Quality of Data using Data Mining Method JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2, ISSN 25-967 WWW.JOURNALOFCOMPUTING.ORG 9 Enhancing Quality of Data using Data Mining Method Fatemeh Ghorbanpour A., Mir M. Pedram, Kambiz Badie, Mohammad

More information

Boarding to Big data

Boarding to Big data Database Systems Journal vol. VI, no. 4/2015 11 Boarding to Big data Oana Claudia BRATOSIN University of Economic Studies, Bucharest, Romania oc.bratosin@gmail.com Today Big data is an emerging topic,

More information

Homomorphic Encryption Schema for Privacy Preserving Mining of Association Rules

Homomorphic Encryption Schema for Privacy Preserving Mining of Association Rules Homomorphic Encryption Schema for Privacy Preserving Mining of Association Rules M.Sangeetha 1, P. Anishprabu 2, S. Shanmathi 3 Department of Computer Science and Engineering SriGuru Institute of Technology

More information

Advances in Natural and Applied Sciences

Advances in Natural and Applied Sciences AENSI Journals Advances in Natural and Applied Sciences ISSN:1995-0772 EISSN: 1998-1090 Journal home page: www.aensiweb.com/anas Clustering Algorithm Based On Hadoop for Big Data 1 Jayalatchumy D. and

More information

Storage and Retrieval of Data for Smart City using Hadoop

Storage and Retrieval of Data for Smart City using Hadoop Storage and Retrieval of Data for Smart City using Hadoop Ravi Gehlot Department of Computer Science Poornima Institute of Engineering and Technology Jaipur, India Abstract Smart cities are equipped with

More information

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

More information

Survey on Load Rebalancing for Distributed File System in Cloud

Survey on Load Rebalancing for Distributed File System in Cloud Survey on Load Rebalancing for Distributed File System in Cloud Prof. Pranalini S. Ketkar Ankita Bhimrao Patkure IT Department, DCOER, PG Scholar, Computer Department DCOER, Pune University Pune university

More information

Keywords data mining, prediction techniques, decision making.

Keywords data mining, prediction techniques, decision making. Volume 5, Issue 4, April 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analysis of Datamining

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

Understanding Web personalization with Web Usage Mining and its Application: Recommender System Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,

More information

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 James Maltby, Ph.D 1 Outline of Presentation Semantic Graph Analytics Database Architectures In-memory Semantic Database Formulation

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA

SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA SEMANTIC WEB BASED INFERENCE MODEL FOR LARGE SCALE ONTOLOGIES FROM BIG DATA J.RAVI RAJESH PG Scholar Rajalakshmi engineering college Thandalam, Chennai. ravirajesh.j.2013.mecse@rajalakshmi.edu.in Mrs.

More information

Detection of Distributed Denial of Service Attack with Hadoop on Live Network

Detection of Distributed Denial of Service Attack with Hadoop on Live Network Detection of Distributed Denial of Service Attack with Hadoop on Live Network Suchita Korad 1, Shubhada Kadam 2, Prajakta Deore 3, Madhuri Jadhav 4, Prof.Rahul Patil 5 Students, Dept. of Computer, PCCOE,

More information

Mining Large Datasets: Case of Mining Graph Data in the Cloud

Mining Large Datasets: Case of Mining Graph Data in the Cloud Mining Large Datasets: Case of Mining Graph Data in the Cloud Sabeur Aridhi PhD in Computer Science with Laurent d Orazio, Mondher Maddouri and Engelbert Mephu Nguifo 16/05/2014 Sabeur Aridhi Mining Large

More information

Big Data With Hadoop

Big Data With Hadoop With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

More information

A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING

A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING A HYBRID RULE BASED FUZZY-NEURAL EXPERT SYSTEM FOR PASSIVE NETWORK MONITORING AZRUDDIN AHMAD, GOBITHASAN RUDRUSAMY, RAHMAT BUDIARTO, AZMAN SAMSUDIN, SURESRAWAN RAMADASS. Network Research Group School of

More information

Key words: web usage mining, clustering, e-marketing and e-business, business intelligence; hybrid soft computing.

Key words: web usage mining, clustering, e-marketing and e-business, business intelligence; hybrid soft computing. Volume 5, Issue 3, March 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue:

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

ANALYTICS IN BIG DATA ERA

ANALYTICS IN BIG DATA ERA ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut

More information

Designing and Embodiment of Software that Creates Middle Ware for Resource Management in Embedded System

Designing and Embodiment of Software that Creates Middle Ware for Resource Management in Embedded System , pp.97-108 http://dx.doi.org/10.14257/ijseia.2014.8.6.08 Designing and Embodiment of Software that Creates Middle Ware for Resource Management in Embedded System Suk Hwan Moon and Cheol sick Lee Department

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Chapter 5. Warehousing, Data Acquisition, Data. Visualization Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives

More information

Open Access Research of Fast Search Algorithm Based on Hadoop Cloud Platform

Open Access Research of Fast Search Algorithm Based on Hadoop Cloud Platform Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2015, 7, 1153-1159 1153 Open Access Research of Fast Search Algorithm Based on Hadoop Cloud Platform

More information

Review of Computer Engineering Research CURRENT TRENDS IN SOFTWARE ENGINEERING RESEARCH

Review of Computer Engineering Research CURRENT TRENDS IN SOFTWARE ENGINEERING RESEARCH Review of Computer Engineering Research ISSN(e): 2410-9142/ISSN(p): 2412-4281 journal homepage: http://www.pakinsight.com/?ic=journal&journal=76 CURRENT TRENDS IN SOFTWARE ENGINEERING RESEARCH Gayatri

More information

Cloud Computing and Health Care Facing the Future. Jerry Fahrni, Pharm.D. April 14, 2010

Cloud Computing and Health Care Facing the Future. Jerry Fahrni, Pharm.D. April 14, 2010 Cloud Computing and Health Care Facing the Future Jerry Fahrni, Pharm.D. April 14, 2010 Objectives Describe what cloud computing is and what cloud computing is not Separate fact from fiction when talking

More information

DESIGN AND STRUCTURE OF FUZZY LOGIC USING ADAPTIVE ONLINE LEARNING SYSTEMS

DESIGN AND STRUCTURE OF FUZZY LOGIC USING ADAPTIVE ONLINE LEARNING SYSTEMS Abstract: Fuzzy logic has rapidly become one of the most successful of today s technologies for developing sophisticated control systems. The reason for which is very simple. Fuzzy logic addresses such

More information

Mining Interesting Medical Knowledge from Big Data

Mining Interesting Medical Knowledge from Big Data IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 1, Ver. II (Jan Feb. 2016), PP 06-10 www.iosrjournals.org Mining Interesting Medical Knowledge from

More information

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and

More information

Redundant Data Removal Technique for Efficient Big Data Search Processing

Redundant Data Removal Technique for Efficient Big Data Search Processing Redundant Data Removal Technique for Efficient Big Data Search Processing Seungwoo Jeon 1, Bonghee Hong 1, Joonho Kwon 2, Yoon-sik Kwak 3 and Seok-il Song 3 1 Dept. of Computer Engineering, Pusan National

More information

How To Integrate Big Data, Cloud Computing, And Enhanced Data Processing

How To Integrate Big Data, Cloud Computing, And Enhanced Data Processing Integration of Big Data in Cloud computing environments for enhanced data processing capabilities Rohit Chandrashekar [1] Maya Kala [2] Dashrath Mane [3] VES Institute of Technology, Chembur, Mumbai [1]

More information

Distributed Computing and Big Data: Hadoop and MapReduce

Distributed Computing and Big Data: Hadoop and MapReduce Distributed Computing and Big Data: Hadoop and MapReduce Bill Keenan, Director Terry Heinze, Architect Thomson Reuters Research & Development Agenda R&D Overview Hadoop and MapReduce Overview Use Case:

More information

2. IMPLEMENTATION. International Journal of Computer Applications (0975 8887) Volume 70 No.18, May 2013

2. IMPLEMENTATION. International Journal of Computer Applications (0975 8887) Volume 70 No.18, May 2013 Prediction of Market Capital for Trading Firms through Data Mining Techniques Aditya Nawani Department of Computer Science, Bharati Vidyapeeth s College of Engineering, New Delhi, India Himanshu Gupta

More information

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Website Personalization using Data Mining and Active Database Techniques Richard S. Saxe

Website Personalization using Data Mining and Active Database Techniques Richard S. Saxe Website Personalization using Data Mining and Active Database Techniques Richard S. Saxe Abstract Effective website personalization is at the heart of many e-commerce applications. To ensure that customers

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Clinic + - A Clinical Decision Support System Using Association Rule Mining

Clinic + - A Clinical Decision Support System Using Association Rule Mining Clinic + - A Clinical Decision Support System Using Association Rule Mining Sangeetha Santhosh, Mercelin Francis M.Tech Student, Dept. of CSE., Marian Engineering College, Kerala University, Trivandrum,

More information

Hadoop Technology for Flow Analysis of the Internet Traffic

Hadoop Technology for Flow Analysis of the Internet Traffic Hadoop Technology for Flow Analysis of the Internet Traffic Rakshitha Kiran P PG Scholar, Dept. of C.S, Shree Devi Institute of Technology, Mangalore, Karnataka, India ABSTRACT: Flow analysis of the internet

More information

Hadoop and Map-Reduce. Swati Gore

Hadoop and Map-Reduce. Swati Gore Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data

More information

Preprocessing Web Logs for Web Intrusion Detection

Preprocessing Web Logs for Web Intrusion Detection Preprocessing Web Logs for Web Intrusion Detection Priyanka V. Patil. M.E. Scholar Department of computer Engineering R.C.Patil Institute of Technology, Shirpur, India Dharmaraj Patil. Department of Computer

More information

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor

More information

UPS battery remote monitoring system in cloud computing

UPS battery remote monitoring system in cloud computing , pp.11-15 http://dx.doi.org/10.14257/astl.2014.53.03 UPS battery remote monitoring system in cloud computing Shiwei Li, Haiying Wang, Qi Fan School of Automation, Harbin University of Science and Technology

More information

INTEROPERABLE FEATURES CLASSIFICATION TECHNIQUE FOR CLOUD BASED APPLICATION USING FUZZY SYSTEMS

INTEROPERABLE FEATURES CLASSIFICATION TECHNIQUE FOR CLOUD BASED APPLICATION USING FUZZY SYSTEMS INTEROPERABLE FEATURES CLASSIFICATION TECHNIQUE FOR CLOUD BASED APPLICATION USING FUZZY SYSTEMS * C. Saravanakumar 1 and C. Arun 2 1 Department of Computer Science and Engineering, Sathyabama University,

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014 Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions

More information

Big data platform for IoT Cloud Analytics. Chen Admati, Advanced Analytics, Intel

Big data platform for IoT Cloud Analytics. Chen Admati, Advanced Analytics, Intel Big data platform for IoT Cloud Analytics Chen Admati, Advanced Analytics, Intel Agenda IoT @ Intel End-to-End offering Analytics vision Big data platform for IoT Cloud Analytics Platform Capabilities

More information

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

A Novel Cloud Based Elastic Framework for Big Data Preprocessing School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview

More information

An Experimental Approach Towards Big Data for Analyzing Memory Utilization on a Hadoop cluster using HDFS and MapReduce.

An Experimental Approach Towards Big Data for Analyzing Memory Utilization on a Hadoop cluster using HDFS and MapReduce. An Experimental Approach Towards Big Data for Analyzing Memory Utilization on a Hadoop cluster using HDFS and MapReduce. Amrit Pal Stdt, Dept of Computer Engineering and Application, National Institute

More information

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING M.Gnanavel 1 & Dr.E.R.Naganathan 2 1. Research Scholar, SCSVMV University, Kanchipuram,Tamil Nadu,India. 2. Professor

More information

Application of Data Mining Techniques in Intrusion Detection

Application of Data Mining Techniques in Intrusion Detection Application of Data Mining Techniques in Intrusion Detection LI Min An Yang Institute of Technology leiminxuan@sohu.com Abstract: The article introduced the importance of intrusion detection, as well as

More information

Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies

Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,

More information

Industry 4.0 and Big Data

Industry 4.0 and Big Data Industry 4.0 and Big Data Marek Obitko, mobitko@ra.rockwell.com Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and

More information

Neural Networks in Data Mining

Neural Networks in Data Mining IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON BIG DATA MANAGEMENT AND ITS SECURITY PRUTHVIKA S. KADU 1, DR. H. R.

More information