International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 4, Jul-Aug 2015

Size: px
Start display at page:

Download "International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 4, Jul-Aug 2015"

Transcription

1 RESEARCH ARTICLE OPEN ACCESS Data Mining Approach To Big Data Jyothiprasanna Jaladi [1], B.V.Kiranmayee [2], S.Nagini [3] Student of M.Tech(SE) [1], Associate Professor Département Computer Science and Engineering VNRVJIET, Hyderabad Telangana - India. [2] & [3] ABSTRACT The main motivation for extracting information from massive-data is to improve the efficiency of single-source mining methods. On the basis of gradual improvement of computer hardware features, researchers continue to disc over ways to make stronger, the efficiency of potential discovery algorithms to make them higher for huge knowledge. Because massive data are most of the time amassed from one of a kind data sources, the capabilities discovery of the giant knowledge must be carried out using a multisource mining mechanism. As real-world knowledge often come as an information stream or a characteristic waft, a good-situated mechanism is needed to become aware of competencies and grasp the evolution of knowledge in the dynamic data source. As a result, the large, heterogeneous and real-time traits of multisource data provide most important variations between single-source knowledge discovery and multisource data mining. Keywords:- Single-source Mining, Multi-source Mining Huge knowledge. I. INTRODUCTION The potential discovery of the large data must have got to be performed using a multisource mining mechanism. As real-world information mainly come as an information stream or a characteristic glide, a good-based mechanism is needed to notice capabilities and grasp the evolution of knowledge in the dynamic, potential-supply. Consequently, the big, heterogeneous and genuine-time characteristics of multi supply knowledge provide essential difference between single-supply competencies discovery and multisource expertise mining. The targets of large information mining methods go beyond fetching the requested knowledge or even uncovering some hidden relationships and patterns between numeral parameters. Inspecting rapid movement data could result in new valuable insights and theoretical concepts. Comparing with the results derived from mining the natural datasets, Unveiling the massive number of interconnected heterogeneous vast advantage has the knowledge to maximize our abilities and insights within the purpose area. On the other hand, this brings a series of new challenges to the study group. II. BIG DATA Big data [1] is a labelled term for large data units which might be large that cannot be dealt with through a common information processing purposes which can be inadequate. Challenges include analysis, capture, information-curation, search, sharing, storage, transfer, visualization, and knowledge privacy. The time period more often than not refers conveniently to the use of predictive analytics or other specified developed ways to extract worth from information, and infrequently to a specific dimension of data set. Accuracy in enormous information may result in extra confident choice making, and better selections can imply higher operational affectivity, cost reductions and reduced risk. III. CHARACTERISTICS OF BIG DATA Big Data is characterised by using five V s particularly Volume, velocity,sort,veracity and price[2]. Volume: Volume of big data refers back to the amount of data that's being generated in a amount of time. It stages from petabytes to petabytes. Velocity: Velocity refers back to the speed at which the information is being generated every second. Variety: Type refers back to the types that is being generated from the sources. ISSN: Page 258

2 Veracity: Veracity depends on the reliability of the information. Value: It stands for the worth of the information. IV. BIG DATA TYPES Enormous knowledge consists of every knowledge that represents from greenback transactions to tweets to pictures to audio. Accordingly, taking advantage of giant data requires that each one this information to be built-in for analysis and data management. That is extra difficult than it appears. There are two main varieties of information concerned right here: structured and unstructured. Structured data is like a information warehouse, where information is tagged and sortable, even as unstructured knowledge is random and difficult to analyze[4]. HACE THEOREM Hace Theorem [5] is used to model the characteristics of the big data. Big data -information includes of big, heterogeneous, autonomous, and decentralized manipulate desires to discover to the complex and dynamic relationship between data. These traits make it an severe challenge for locating useful advantage from the large knowledge. In a native sense, it will probably imagined that a blind man is making an attempt to measurement up a huge elephant for you to be the colossal knowledge in this context. The term tremendous information actually considerations about data volumes, HACE theorem suggests that the key characteristics of the tremendous data are: A.Huge with various and miscellaneous data sources:- The major characteristics of the big data is the large volume of knowledge represented by quite a lot of sources. This gigantic quantity of data comes from quite a lot of websites like Twitter, Myspace, Orkut and LinkedIn and so on. This is due to the fact one of a kind knowledge collectors prefer their possess representation or approach for data recording, and the character of specific applications additionally outcome in more than a few knowledge representations. Autonomous Sources with circulated &decentralised Control:- Independent Sources with circulated & decentralised manipulate are a essential characteristic of enormous data applications. Complex and evolving associations:-in an early stage of knowledge centralized information systems, the focus is on discovering first-class feature values to represent every remark. This type of pattern feature representation inherently treats each man or woman as an impartial entity without given that their social connections, which is likely one of the important reasons of the human society. V. BIG DATA MINING CHALLENGES It's major for a procedure [6] to manage the enormous quantities of data. The foremost element is to control the big amounts of knowledge and it should provide the imperative solutions which can be recounted within the HACE theorem. The above figure suggests a conceptual view of the giant data processing framework, which includes three tiers from inside of out with considerations on data accessing and computing (Tier I), data isolation and domain skills (Tier II), and massive data mining algorithms (Tier III). Tier: 1Big Data Accessing & Computing: In older days enormous quantities of data have to be extracted from dependant on the current set of data For this reason, there is a need ofone-of-a-kind set of data to extract the reward skills inside the significant knowledge. ISSN: Page 259

3 There are probabilities that may take it extra delicate and abilities can lack some of its attribute which is capable to as a consequence be changed right into a bigger set of information that may clearly me modified and utilized without difficulty to the given regular set of data. Such big information procedures on the way to therefore helps in each hardware and application performance. Tier:2 Data Isolation and Semantic Knowledge: In massive data, it is quintessential to comprehend the semantics of the info, ideas,policies involving to a distinctive software. With admire to the rules and policies of the information, a technical barrier known as privacy of the data will have to be viewed. It is a predominant to shield the privateers of the information, regarding sensitive applications like banking transactions, business transactions etc., simple information interactions do not get to the bottom of privateers issues.to be able to shield the privateness: limit the acess to the data and sensitive knowledge information fields may also be removed. Semantic talents can help to determine proper features for modeling the foremost information. The domain and software competencies may help design viable trade targets by using making use of tremendous data analytical procedures. Tier:3 Big Data Mining Algorithms: Gigantic knowledge functions are featured with more than one sources and with decentralized manage. Managing these forms of information to a single centralized site for mining is exhaustive because of the privacy concerns. Below these occasions, a large information mining process has to permit an knowledge exchange to make sure that everyone distributed web sites can work collectively to acquire a world optimization intention. The global mining can also be featured with a two-step approach, at knowledge, mannequin, and at expertise levels. At the information level, every local website can calculate the information records founded on the nearby information sources and alternate the information between sites to obtain a worldwide information distribution view. On the mannequin or sample level, each website online can carry out regional mining routine, with appreciate to the localized data, to notice nearby patterns. By exchanging patterns between more than one sources, new world patterns can also be synthetized by way of aggregating patterns across all web sites. On the capabilities degree, mannequin correlation evaluation finds out the value between items generated from exclusive knowledge sources to determine how important the information sources are correlated with each and every other, and methods to type correct decisions headquartered on items constructed from self-reliant sources. VI. BIG DATA MINING The intention of the enormous data goes is to fetch the desired understanding or to seek out the undiscovered patterns among the data. On account that big knowledge are customarily accrued from distinctive data sources, the knowledge discovery of the significant knowledge mustbe carried out making use of a multisource mining mechanism. As real-world data most commonly come as a data movement or a attribute-waft, good-founded mechanism is needed to observe competencies and master the evolution of abilities within the dynamic data source. Consequently, the huge, heterogeneous and real-time traits of multisource data provide essential differences between single-supply skills discovery and multisource information mining proposed and established the speculation of neighbourhood sample analysis, which has laid a foundation for international talents discovery in multisource knowledge mining. Local sample analysis of knowledge processing can avoid hanging one of a kind information-sources together to hold out centralized computing. Information streams are broadly utilized in fiscal evaluation, online trading, scientific trying out, and many others. Extracting speedy and massive movement information may just results in new useful standards. Tremendous knowledge mining have to deal with heterogeneity, severe scale, pace, privacy, accuracy, trust, and instructiveness that current mining procedures and algorithms are incapable of. The necessity for designing and enforcing very-significant-scale parallel computing device finding out and data mining algorithms (ML-DM) has remarkably increased, which accompanies the emergence of powerful parallel and very-significant-scale data processing systems, e.g., Hadoop MapReduce. NIMBLE is a transportable infrastructure that has been exceptionally designed to enable rapid implementation of parallel MLDM algorithms, jogging on high of Hadoop. VII. PROPOSED SYSTEM Considering that big data knowledge is a collection of difficult and enormous information units that are elaborate to system and mine for patterns and expertise utilising common database management instruments or data processing and mining techniques ISSN: Page 260

4 The primary motivation leads for discovering competencies from tremendous data is making improvements to the affectivity of single-supply mining approaches. On the groundwork of gradual improvement of computer hardware features, the methods to support the affectivity of potential discovery algorithms to make them better for giant knowledge. Since giant data are quite often collected from unique data sources, the advantage discovery of the huge data must be performed utilising a multisource mining mechanism. As real-world knowledge quite often come as a data movement or characteristic glide, a good-cantered mechanism is required to detect capabilities and grasp the evolution of capabilities within the dynamic knowledge supply. Consequently, the giant, heterogeneous and realtime traits of multi supply knowledge provide main differences between single-source knowledge discovery and multisource knowledge mining. Dataset Selection: a data set is a group of data. The term information set can also be used more loosely, to refer to the info in a set of closely associated tables, corresponding to a distinct test or occasion. Pre-Processing: Pre-processing entails putting off contaminated knowledge, removing null values and unformatted values. Clustering: Clustering is a key concept in Dataset mining. A clustering process pursuits to analyse the similarities between knowledge objects and build businesses of them. The grouped objects can then be used to navigate quite simply via a very large record of information units. Dataset clustering ambitions to routinely divide Datasets in to corporations founded on similarities of their contents. Each and every corporations encompass Datasets which can be identical between themselves, they have high intra-cluster similarity and varied to Datasets of other corporations that have low inter-cluster similarity. K-means is a simple however good recognized algorithm for group-ing objects, clustering. Again all objects must be represented as a collection of numerical elements. Furthermore the user has to specify the number of corporations (referred to as k) he wants to establish. Every object can be concept of as being represented by using some feature vector in an n dimensional house, n being the number of all features used to explain the objects to cluster. The algorithm then randomly chooses okay facets in that vector area, these points serve as the preliminary centres of the clusters. Afterwards all objects are each and every assigned to the centre they're closest to. In general the distance measure is chosen by way of the person and determined by means of the learning project. After that mission is computed, for each and every cluster a new core is computed through averaging the feature vectors of all objects assigned to it. The procedure of assigning objects and re computing facilities is repeated unless the approach converges. The algorithm may also be tested to converge after a finite quantity of iterations. A number of tweaks regarding distance measure, initial middle alternative and computation of recent traditional facilities were explored, as well as the estimation of the number of clusters k. Experimental Results: This experimental results shows that it can act like big data mining platform to extract the useful data from the data source. ISSN: Page 261

5 VIII. CONCLUSION The main motivation for discovering knowledge from massive data is improving the efficiency of single-source mining method is achieved by developing a big data mining platform, where is no proper layout to extract the massive data from the data source, because massive data are typically collected from different data sources, the knowledge discovery of the massive data must be performed using a multi-source mining mechanism. As real-world data often come as a data stream or a characteristic flow, a well-established mechanism is needed to discover knowledge and master the evolution of knowledge in the dynamic data source. [4] Big data and Five V s Characteristics, HIBA JASIM HADI, AMMAR HAMEED SHNAIN, SARAH HADISHAHEED,AZIZAHBTHAJIAHMAD,Ministry of Education, Islamic University College. [5] Data Mining with Big Data Xindong Wu, Fellow, IEEE, Xingquan Zhu, Senior Member, IEEE,Gong- Qing Wu, and Wei Ding, Senior Member, IEEE, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,VOL. 26, NO. 1, JANUARY [6] Big data analysis using HaceTheorem,Deepak S. Tamhane, Sultana N. Sayyad, International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) Volume 4 Issue 1, January [7] Issues and Challenges in the Era of Big DataMining, B R Prakash, Dr. M. Hanumanthappa Assistant Professor,Department of MCA,Sri Siddhartha Institute of Technology,Tumkur Professor, Department of Computer Science & Applications, Bangalore University, Bangalore, International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 3, Issue 4 July-August [8] Galen Gruman, PRICEWATERHOUSECOOPERS, Technology Forecast, (2010), Issue 3, A quarterly journal, Making Sense of Big Data. [9] Phil Shelly, Doug Cutting, Infosys White paper, Big Data Spectrum. REFERENCES [1] [2] From Big Data to Big Data Mining: Challenges, Issues, and Opportunities Dunren Che, Mejdl Safran, and Zhiyong Peng, Department of Computer Science, Southern Illinois University Carbondale, Illinois 62901, USA. [3] Exploring the Big Data Spectrum, Jaya Singh and Ajay Rana, International Journal of Emerging Technology and Advanced Engineering, Volume 3, Issue 4, April ISSN: Page 262

Volume 3, Issue 8, August 2015 International Journal of Advance Research in Computer Science and Management Studies

Volume 3, Issue 8, August 2015 International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 8, August 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com An

More information

International Journal of Innovative Research in Computer and Communication Engineering

International Journal of Innovative Research in Computer and Communication Engineering FP Tree Algorithm and Approaches in Big Data T.Rathika 1, J.Senthil Murugan 2 Assistant Professor, Department of CSE, SRM University, Ramapuram Campus, Chennai, Tamil Nadu,India 1 Assistant Professor,

More information

BIG DATA ANALYSIS USING HACE THEOREM

BIG DATA ANALYSIS USING HACE THEOREM BIG DATA ANALYSIS USING HACE THEOREM Deepak S. Tamhane, Sultana N. Sayyad Abstract- Big Data consists of huge modules, difficult, growing data sets with numerous and, independent sources. With the fast

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer

More information

An Analysis and Review on Various Techniques of Mining Big Data

An Analysis and Review on Various Techniques of Mining Big Data Journal homepage: http://www.journalijar.com INTERNATIONAL JOURNAL OF ADVANCED RESEARCH RESEARCH ARTICLE An Analysis and Review on Various Techniques of Mining Big Data Hitesh Kumar Bhatia Masters of Computer

More information

Keywords- Big data, HACE theorem, Processing Framework, Methodologies, Applications.

Keywords- Big data, HACE theorem, Processing Framework, Methodologies, Applications. Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Big Data Analytics

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 ISSN 2278-7763. BIG DATA: A New Technology

International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 ISSN 2278-7763. BIG DATA: A New Technology International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 BIG DATA: A New Technology Farah DeebaHasan Student, M.Tech.(IT) Anshul Kumar Sharma Student, M.Tech.(IT)

More information

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]

More information

Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics

Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics Dharmendra Agawane 1, Rohit Pawar 2, Pavankumar Purohit 3, Gangadhar Agre 4 Guide: Prof. P B Jawade 2

More information

International Journal of Engineering Research ISSN: 2348-4039 & Management Technology November-2015 Volume 2, Issue-6

International Journal of Engineering Research ISSN: 2348-4039 & Management Technology November-2015 Volume 2, Issue-6 International Journal of Engineering Research ISSN: 2348-4039 & Management Technology Email: [email protected] November-2015 Volume 2, Issue-6 www.ijermt.org Modeling Big Data Characteristics for Discovering

More information

Knowledge Engineering with Big Data

Knowledge Engineering with Big Data Knowledge Engineering with Big Data (joint work with Nanning Zheng, Huanhuan Chen, Qinghua Zheng, Aoying Zhou, Xingquan Zhu, Gong-Qing Wu, Wei Ding, Kui Yu et al.) Xindong Wu ( 吴 信 东 ) Department of Computer

More information

Big Data and Healthcare Payers WHITE PAPER

Big Data and Healthcare Payers WHITE PAPER Knowledgent White Paper Series Big Data and Healthcare Payers WHITE PAPER Summary With the implementation of the Affordable Care Act, the transition to a more member-centric relationship model, and other

More information

How To Learn To Use Big Data

How To Learn To Use Big Data Information Technologies Programs Big Data Specialized Studies Accelerate Your Career extension.uci.edu/bigdata Offered in partnership with University of California, Irvine Extension s professional certificate

More information

Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

More information

A Study on Effective Business Logic Approach for Big Data Mining

A Study on Effective Business Logic Approach for Big Data Mining A Study on Effective Business Logic Approach for Big Data Mining T. Sathis Kumar Assistant Professor, Dept of C.S.E, Saranathan College of Engineering, Trichy, Tamil Nadu, India. ABSTRACT: Big data is

More information

A REVIEW REPORT ON DATA MINING

A REVIEW REPORT ON DATA MINING Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method Improving Data Processing Speed in Big Data Analytics Using HDFS Method M.R.Sundarakumar Assistant Professor, Department Of Computer Science and Engineering, R.V College of Engineering, Bangalore, India

More information

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS

BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS Megha Joshi Assistant Professor, ASM s Institute of Computer Studies, Pune, India Abstract: Industry is struggling to handle voluminous, complex, unstructured

More information

Big Data Analytic and Mining with Machine Learning Algorithm

Big Data Analytic and Mining with Machine Learning Algorithm International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 1 (2014), pp. 33-40 International Research Publications House http://www. irphouse.com /ijict.htm Big Data

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin

More information

SAP Solution Brief SAP HANA. Transform Your Future with Better Business Insight Using Predictive Analytics

SAP Solution Brief SAP HANA. Transform Your Future with Better Business Insight Using Predictive Analytics SAP Brief SAP HANA Objectives Transform Your Future with Better Business Insight Using Predictive Analytics Dealing with the new reality Dealing with the new reality Organizations like yours can identify

More information

Big Data Efficiencies That Will Transform Media Company Businesses

Big Data Efficiencies That Will Transform Media Company Businesses Big Data Efficiencies That Will Transform Media Company Businesses TV, digital and print media companies are getting ever-smarter about how to serve the diverse needs of viewers who consume content across

More information

BIG DATA CHALLENGES AND PERSPECTIVES

BIG DATA CHALLENGES AND PERSPECTIVES BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,

More information

Big Data: Study in Structured and Unstructured Data

Big Data: Study in Structured and Unstructured Data Big Data: Study in Structured and Unstructured Data Motashim Rasool 1, Wasim Khan 2 [email protected], [email protected] Abstract With the overlay of digital world, Information is available

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

High-Performance Analytics

High-Performance Analytics High-Performance Analytics David Pope January 2012 Principal Solutions Architect High Performance Analytics Practice Saturday, April 21, 2012 Agenda Who Is SAS / SAS Technology Evolution Current Trends

More information

Sentiment Analysis on Big Data

Sentiment Analysis on Big Data SPAN White Paper!? Sentiment Analysis on Big Data Machine Learning Approach Several sources on the web provide deep insight about people s opinions on the products and services of various companies. Social

More information

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Data Mining in the Swamp

Data Mining in the Swamp WHITE PAPER Page 1 of 8 Data Mining in the Swamp Taming Unruly Data with Cloud Computing By John Brothers Business Intelligence is all about making better decisions from the data you have. However, all

More information

Mining With Big Data Using HACE

Mining With Big Data Using HACE Mining With Big Data Using HACE 1 R. M. Shete, 2 Snehal N. Kathale 1 CSE Department, DMIETR, Sawangi (M), Wardha, 2 CSE/IT Department, GHRIETW, Nagpur 2 1,2 RTMNU, Nagpur Email: 1 [email protected],

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI [email protected] What

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])

More information

A Survey of Classification Techniques in the Area of Big Data.

A Survey of Classification Techniques in the Area of Big Data. A Survey of Classification Techniques in the Area of Big Data. 1PrafulKoturwar, 2 SheetalGirase, 3 Debajyoti Mukhopadhyay 1Reseach Scholar, Department of Information Technology 2Assistance Professor,Department

More information

Big Data Mining: Challenges and Opportunities to Forecast Future Scenario

Big Data Mining: Challenges and Opportunities to Forecast Future Scenario Big Data Mining: Challenges and Opportunities to Forecast Future Scenario Poonam G. Sawant, Dr. B.L.Desai Assist. Professor, Dept. of MCA, SIMCA, Savitribai Phule Pune University, Pune, Maharashtra, India

More information

Anuradha Bhatia, Faculty, Computer Technology Department, Mumbai, India

Anuradha Bhatia, Faculty, Computer Technology Department, Mumbai, India Volume 3, Issue 9, September 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Real Time

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

IJRCS - International Journal of Research in Computer Science ISSN: 2349-3828

IJRCS - International Journal of Research in Computer Science ISSN: 2349-3828 ISSN: 2349-3828 Implementing Big Data for Intelligent Business Decisions Dr. V. B. Aggarwal Deepshikha Aggarwal 1(Jagan Institute of Management Studies, Delhi, India, [email protected]) 2(Jagan

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Big Data. Fast Forward. Putting data to productive use

Big Data. Fast Forward. Putting data to productive use Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize

More information

Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD

Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD Big Analytics for Space Exploration, Entrepreneurship and Policy Opportunities Tiffani Crawford, PhD Big Analytics Characteristics Large quantities of many data types Structured Unstructured Human Machine

More information

Data Mining with Big Data

Data Mining with Big Data Data Mining with Big Data Xindong Wu 1,2, Xingquan Zhu 3, Gong-Qing Wu 2, Wei Ding 4 1 School of Computer Science and Information Engineering, Hefei University of Technology, China 2 Department of Computer

More information

Web Data Mining: A Case Study. Abstract. Introduction

Web Data Mining: A Case Study. Abstract. Introduction Web Data Mining: A Case Study Samia Jones Galveston College, Galveston, TX 77550 Omprakash K. Gupta Prairie View A&M, Prairie View, TX 77446 [email protected] Abstract With an enormous amount of data stored

More information

DATAOPT SOLUTIONS. What Is Big Data?

DATAOPT SOLUTIONS. What Is Big Data? DATAOPT SOLUTIONS What Is Big Data? WHAT IS BIG DATA? It s more than just large amounts of data, though that s definitely one component. The more interesting dimension is about the types of data. So Big

More information

Demystifying Big Data Government Agencies & The Big Data Phenomenon

Demystifying Big Data Government Agencies & The Big Data Phenomenon Demystifying Big Data Government Agencies & The Big Data Phenomenon Today s Discussion If you only remember four things 1 Intensifying business challenges coupled with an explosion in data have pushed

More information

CONNECTING DATA WITH BUSINESS

CONNECTING DATA WITH BUSINESS CONNECTING DATA WITH BUSINESS Big Data and Data Science consulting Business Value through Data Knowledge Synergic Partners is a specialized Big Data, Data Science and Data Engineering consultancy firm

More information

ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING

ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING ISSUES, CHALLENGES, AND SOLUTIONS: BIG DATA MINING Jaseena K.U. 1 and Julie M. David 2 1,2 Department of Computer Applications, M.E.S College, Marampally, Aluva, Cochin, India 1 [email protected],

More information

Literature Survey in Data Mining with Big Data

Literature Survey in Data Mining with Big Data Literature Survey in Data Mining with Big Data 1 Mr.Mohammad Raziuddin & 2 Prof. T.Venkata Ramana Department of CSE SLC's Institute of Engineering and Technology, Hyderabad, India. 1 [email protected],

More information

ANALYTICS BUILT FOR INTERNET OF THINGS

ANALYTICS BUILT FOR INTERNET OF THINGS ANALYTICS BUILT FOR INTERNET OF THINGS Big Data Reporting is Out, Actionable Insights are In In recent years, it has become clear that data in itself has little relevance, it is the analysis of it that

More information

Big Data with Rough Set Using Map- Reduce

Big Data with Rough Set Using Map- Reduce Big Data with Rough Set Using Map- Reduce Mr.G.Lenin 1, Mr. A. Raj Ganesh 2, Mr. S. Vanarasan 3 Assistant Professor, Department of CSE, Podhigai College of Engineering & Technology, Tirupattur, Tamilnadu,

More information

Fight fire with fire when protecting sensitive data

Fight fire with fire when protecting sensitive data Fight fire with fire when protecting sensitive data White paper by Yaniv Avidan published: January 2016 In an era when both routine and non-routine tasks are automated such as having a diagnostic capsule

More information

A New Era Of Analytic

A New Era Of Analytic Penang egovernment Seminar 2014 A New Era Of Analytic Megat Anuar Idris Head, Project Delivery, Business Analytics & Big Data Agenda Overview of Big Data Case Studies on Big Data Big Data Technology Readiness

More information

ISSN:2321-1156 International Journal of Innovative Research in Technology & Science(IJIRTS)

ISSN:2321-1156 International Journal of Innovative Research in Technology & Science(IJIRTS) Nguyễn Thị Thúy Hoài, College of technology _ Danang University Abstract The threading development of IT has been bringing more challenges for administrators to collect, store and analyze massive amounts

More information

[callout: no organization can afford to deny itself the power of business intelligence ]

[callout: no organization can afford to deny itself the power of business intelligence ] Publication: Telephony Author: Douglas Hackney Headline: Applied Business Intelligence [callout: no organization can afford to deny itself the power of business intelligence ] [begin copy] 1 Business Intelligence

More information

Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved

Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved Perspective Big Data Framework for Healthcare using Hadoop

More information

Business Intelligence Using Data Mining Techniques on Very Large Datasets

Business Intelligence Using Data Mining Techniques on Very Large Datasets International Journal of Science and Research (IJSR) Business Intelligence Using Data Mining Techniques on Very Large Datasets Arti J. Ugale 1, P. S. Mohod 2 1 Department of Computer Science and Engineering,

More information

Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects

Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects Enterprise Resource Planning Analysis of Business Intelligence & Emergence of Mining Objects Abstract: Build a model to investigate system and discovering relations that connect variables in a database

More information

ISSN: 2348 9510. A Review: Image Retrieval Using Web Multimedia Mining

ISSN: 2348 9510. A Review: Image Retrieval Using Web Multimedia Mining A Review: Image Retrieval Using Web Multimedia Satish Bansal*, K K Yadav** *, **Assistant Professor Prestige Institute Of Management, Gwalior (MP), India Abstract Multimedia object include audio, video,

More information

Big Data Introduction, Importance and Current Perspective of Challenges

Big Data Introduction, Importance and Current Perspective of Challenges International Journal of Advances in Engineering Science and Technology 221 Available online at www.ijaestonline.com ISSN: 2319-1120 Big Data Introduction, Importance and Current Perspective of Challenges

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Exploring Big Data in Social Networks

Exploring Big Data in Social Networks Exploring Big Data in Social Networks [email protected] ([email protected]) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about

More information

Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool.

Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool. International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 9, Issue 8 (January 2014), PP. 19-24 Comparative Analysis of EM Clustering Algorithm

More information

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India 1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto

More information

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,

More information

The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management. Is Big Data The New Normal? The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

More information

IoT and Big Data- The Current and Future Technologies: A Review

IoT and Big Data- The Current and Future Technologies: A Review Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 5, Issue. 1, January 2016,

More information

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

More information

Healthcare Measurement Analysis Using Data mining Techniques

Healthcare Measurement Analysis Using Data mining Techniques www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

More information

Data Mining with Big Data e-health Service Using Map Reduce

Data Mining with Big Data e-health Service Using Map Reduce Data Mining with Big Data e-health Service Using Map Reduce Abinaya.K PG Student, Department Of Computer Science and Engineering, Parisutham Institute of Technology and Science, Thanjavur, Tamilnadu, India

More information

ENHANCING INTELLIGENCE SUCCESS: DATA CHARACTERIZATION Francine Forney, Senior Management Consultant, Fuel Consulting, LLC May 2013

ENHANCING INTELLIGENCE SUCCESS: DATA CHARACTERIZATION Francine Forney, Senior Management Consultant, Fuel Consulting, LLC May 2013 ENHANCING INTELLIGENCE SUCCESS: DATA CHARACTERIZATION, Fuel Consulting, LLC May 2013 DATA AND ANALYSIS INTERACTION Understanding the content, accuracy, source, and completeness of data is critical to the

More information

AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP

AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP Asst.Prof Mr. M.I Peter Shiyam,M.E * Department of Computer Science and Engineering, DMI Engineering college, Aralvaimozhi.

More information

Survey of Big Data Architecture and Framework from the Industry

Survey of Big Data Architecture and Framework from the Industry Survey of Big Data Architecture and Framework from the Industry NIST Big Data Public Working Group Sanjay Mishra May13, 2014 3/19/2014 NIST Big Data Public Working Group 1 NIST BD PWG Survey of Big Data

More information

Data Mining System, Functionalities and Applications: A Radical Review

Data Mining System, Functionalities and Applications: A Radical Review Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially

More information

CHAPTER SIX DATA. Business Intelligence. 2011 The McGraw-Hill Companies, All Rights Reserved

CHAPTER SIX DATA. Business Intelligence. 2011 The McGraw-Hill Companies, All Rights Reserved CHAPTER SIX DATA Business Intelligence 2011 The McGraw-Hill Companies, All Rights Reserved 2 CHAPTER OVERVIEW SECTION 6.1 Data, Information, Databases The Business Benefits of High-Quality Information

More information

How To Turn Big Data Into An Insight

How To Turn Big Data Into An Insight mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

Various Data-Mining Techniques for Big Data

Various Data-Mining Techniques for Big Data Various Data-Mining Techniques for Big Data Manisha R. Thakare Mtech (CSE) B.D. College of Engineering, Sevagram, Wardha, India S. W. Mohod Assistant Professor, Department of C.E. B.D. College of EngineeringWardha,

More information

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg The Big Picture on Big Data Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg Objective of Talk 1. Deliver a Primer on Big Data. 2. How does this emerging topic apply to Quality? 3.

More information

A Strategic Approach to Unlock the Opportunities from Big Data

A Strategic Approach to Unlock the Opportunities from Big Data A Strategic Approach to Unlock the Opportunities from Big Data Yue Pan, Chief Scientist for Information Management and Healthcare IBM Research - China [contacts: [email protected] ] Big Data or Big Illusion?

More information

Cleaned Data. Recommendations

Cleaned Data. Recommendations Call Center Data Analysis Megaputer Case Study in Text Mining Merete Hvalshagen www.megaputer.com Megaputer Intelligence, Inc. 120 West Seventh Street, Suite 10 Bloomington, IN 47404, USA +1 812-0-0110

More information

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC

Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep. Neil Raden Hired Brains Research, LLC Data Catalogs for Hadoop Achieving Shared Knowledge and Re-usable Data Prep Neil Raden Hired Brains Research, LLC Traditionally, the job of gathering and integrating data for analytics fell on data warehouses.

More information