Opportunity Analysis for Enterprise Collaboration between Network of SMEs


1 Opportunity Analysis for Enterprise Collaboration between Network of SMEs
Presenter: M. Naeem
Supervisors: Abdelaziz Bouras, Yacine Ouzrout, Néjib Moalla
Laboratoire Décision et Information pour les Systèmes de Production (DISP), Université Lumière Lyon 2, France
27-May

2 Agenda
Background: Context of Research; Challenge & Opportunities; Objective; Research Problem; Expected Results
Related Work
Proposed Framework
Results: Pig/Hive
Results: Enterprise Collaboration Functional Flow
Results: Enterprise Collaboration Big Data Capability
Results: Ontological Modeling
Results: Asset as Service (SWRL)

3 Background: Context of Research
Network of SMEs
Diversified data
Emergence of Big Data technologies
Open data modeling

4 Background: Challenge and Opportunity
The diversity of data sources and the ontology modeling perspective.
The analysis of data repositories to create enterprise assets (services) for collaboration.
The composition of collaborative business processes from identified services.
[Figure: SME (Plastic Manufacturer) and SME (Metal Manufacturer), each with DP, ERP, BA]
Source: Martin Hilbert, Priscila Lopez, "The world's technological capacity to store, communicate, and compute information", Science 332 (6025) (2011).

5 Background: Challenge and Opportunities
Big Data opportunities: more than 50% of 560 enterprises think Big Data will help them increase operational efficiency, among other benefits.
Source: Philip Chen, C. L., & Zhang, C. Y. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 275.

6 Background: Objectives (Bottom-Up Approach)
Integrate systems to capitalize on and reuse enterprise capabilities and experience when making decisions.
Support a consortium of concurrent/collaborative partners in defining added-value collaboration schemas.
Provide ontologies.
Federate enterprises' data repositories to create new collaboration capabilities.
[Figure, bottom-up: Enterprises' legacy systems; Acquisition; Digital preservation system (DP + DMS); New Data; Asset Enabler; Service Orchestrator; Saving configuration; Output (Collaborative Added Value)]

7 Background: Research Problem
How can a high degree of data integration across corporate data sources be associated with the perceived benefits of added value during inter-enterprise collaboration?
How should data and information assets be defined in an enterprise?
Which unique characteristics are associated with this data?
How can the creation of new business collaborations be accelerated?

8 Background: Expected Results
Repository of assets published as services.
Assessment model for new collaboration opportunities.
Service matchmaker for collaborative business process composition.

9 Literature Review: Enterprise Collaboration and Ontology Engineering

Framework/Architecture   Year   Methodology       Comments
SnoBase                  2006   Ontology
KAON                     2004   Ontology
SymOntoX                 2003   Ontology          Large organizations are producing complex data; the focus was on acquisition, and other aspects of the valorization of collaborations were missing
powl                     2005   Ontology
KACP                     2008   Ontology          Limited to enterprise security access only
Yuh-Jen et al.           2009   Ontology          Covers PLM but ignores numerous complexities related to unstructured data
Daniel et al.            2010   Ontology
ARIS                     1998   Static ontology   Generalization not possible
CRE                      2012   Fuzzy logic       Limited to risk analysis
NEGOSIS                  2014   Ontology          Limited to the analysis phase only

10 Literature Review: Enterprise Collaboration and Big Data
Chelmis et al. (2013) studied the use of big data technologies for workplace collaboration, focusing on users' communication behavioral patterns, their dynamics and characteristics, statistical properties, and complex correlations between social and topical structures.
However, the study was limited to a single enterprise and did not address the impact of big data on product improvement.

11 Literature Review: Enterprise Collaboration and Big Data
Big data brings new opportunities for:
Business analytics techniques and strategies (Özcan et al., 2014).
Resources, capabilities, and skills needed to maximize business analytics impact (Shvachko et al., 2010).
Challenge: a globalized standard for inter-enterprise collaboration (Lin et al., 2007).

12 Towards Solution: Enterprise Collaboration Framework
Consortium of SMEs: make best use of collaboration capabilities in order to answer new business requirements:
Co-production of a new product
Find the best supplier of a specific raw material
Find a sub-contractor
Joint capacity building
...
[Figure: framework components, from Front-End to Back-End]
Input (New Opportunities); Output (Added Value)
Collaborative Model; Added Value Assessment Model
Repository of Assets: Asset as Service (AaS), repeated per asset
Service Orchestrator; Ontological Modeling
SME-1 and SME-2, each with Business Process, Inf. Tech., Dig. Res.
Big Data Technologies: Acquisition, Organize, Analyze, Decide
Data Anonymizer (three instances)
Sources: SCM, SRM, ERP, PLM, CRM; Document Management System (unstructured docs)
Digital Preservation Platform

13 Phase of Enterprise Collaboration: Big Data Perspective
Proposed Architecture for Enterprise Collaboration
Consortium of SMEs: make best use of collaboration capabilities in order to answer new business requirements:
Co-production of a new product
Find the best supplier of a specific raw material
Find a sub-contractor
Joint capacity building
Analysis in the phase of Acquisition (Case Studies)
[Figure: architecture components, from Front-End to Back-End]
Input (New Opportunities); Output (Added Value)
Collaborative Model; Added Value Assessment Model
Repository of Assets: Asset as Service (AaS), repeated per asset
Service Orchestrator; Ontological Modeling
SME-1 and SME-2, each with Business Process, Inf. Tech., Dig. Res.
Big Data Technologies: Acquisition, Organization, Analysis, Decide
Data Anonymizer (three instances)
Sources: ERP, PLM, SCM, CPM, SRN; Document Management System (unstructured docs)
Digital Preservation Platform

14 Results
Pig / Hive Results
Data Mining Results (MapReduce)
Big Data (Deep Learning)

15 Results: Hive / Pig Results
Query-1. There are three types of clients. How can they be reviewed, given three parameters?
Query-2. There are three types of clients. How can they be reviewed, given four parameters?
Query-3. Which specific business deals pay us more?

16 Results: Hive / Pig Results
Query-4. Which customers have abandoned orders greater than a specific threshold?
Query-5. Churn analysis (customers who left).
Query-6. Identification of valuable customers who left (those who paid n% more than customers who stayed).
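Query-4 and Query-6 can be prototyped in plain Python before being written as Hive or Pig scripts. The following is a minimal sketch under stated assumptions: the record layout, the sample data in ORDERS and LEFT, and both function names are hypothetical, and Query-6 is read here as "customers who left and whose total payments exceed the average of the staying customers by more than n%".

```python
from collections import defaultdict

# Hypothetical order records: (customer_id, status, amount).
# status is either "completed" or "abandoned".
ORDERS = [
    ("c1", "completed", 120.0),
    ("c1", "abandoned", 0.0),
    ("c1", "abandoned", 0.0),
    ("c2", "completed", 300.0),
    ("c3", "abandoned", 0.0),
    ("c3", "completed", 80.0),
]

# Hypothetical churn flags: True means the customer has left.
LEFT = {"c1": False, "c2": True, "c3": False}


def abandoned_over_threshold(orders, threshold):
    """Query-4 sketch: customers with more abandoned orders than threshold."""
    counts = defaultdict(int)
    for customer, status, _amount in orders:
        if status == "abandoned":
            counts[customer] += 1
    return sorted(c for c, n in counts.items() if n > threshold)


def valuable_churned(orders, left, n_percent):
    """Query-6 sketch: churned customers who paid n% more than the stayers' mean."""
    paid = defaultdict(float)
    for customer, status, amount in orders:
        if status == "completed":
            paid[customer] += amount
    stayed = [paid[c] for c, gone in left.items() if not gone]
    if not stayed:
        return []
    cutoff = (sum(stayed) / len(stayed)) * (1 + n_percent / 100.0)
    return sorted(c for c, gone in left.items() if gone and paid[c] > cutoff)
```

With the sample data, a threshold of 1 abandoned order selects only c1, and c2 is the only churned customer who paid at least 50% more than the customers who stayed.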

17 Functional Layer: MapReduce
Grouping Business Assets; Business Ontology = small data.
[Figure: processing pipeline]
Data Sources: Document Management System (SCM), PLM, CRM, SRM, ERP
Big Data Storage
Big Data Processing: Filter / Transform, Aggregation, Summarize
Visualization; Results
[Figure: MapReduce data flow]
Map emits intermediate key-value pairs (k, v).
Sort and Shuffle, then Group by Key, produce key-value groups (k, [v, v, ...]).
Reduce emits the output key-value pairs (k, v).
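The map, sort/shuffle, group-by-key, and reduce stages named on this slide can be illustrated with a small in-memory Python sketch. It is not Hadoop code: the helper names (map_phase, shuffle_sort, reduce_phase, run_job) and the sample mapper and reducer are assumptions introduced only to show how key-value pairs move between the stages.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records, mapper):
    """Apply the mapper to each record; it yields intermediate (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_sort(pairs):
    """Sort the pairs by key and group the values that share a key."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _key, value in group]


def reduce_phase(groups, reducer):
    """Apply the reducer to each (key, values) group."""
    return {key: reducer(key, values) for key, values in groups}


def run_job(records, mapper, reducer):
    """Chain map, shuffle/sort, and reduce in memory."""
    return reduce_phase(shuffle_sort(map_phase(records, mapper)), reducer)


def quantity_by_family(record):
    """Hypothetical mapper: emit (family, quantity) for one order line."""
    family, quantity = record
    yield family, quantity


def total(_key, values):
    """Hypothetical reducer: sum the quantities of one family."""
    return sum(values)
```

For example, run_job([("Tube", 5), ("Plaque", 2), ("Tube", 3)], quantity_by_family, total) groups the two Tube lines under one key before summing them.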

18 Functional Layer: Contribution
Classification for continuous variables.
Simple Naive Bayes is parallel in nature and does not require the problem to be memory resident.
Trade-off: poor performance because of underfitting.
A better solution is a graphical Bayesian network.
Scoring criteria:
AIC: Akaike Information Criterion
BIC: Bayesian Information Criterion
MDL: Minimum Description Length
Steps:
1.1 For each feature in the dataset: run Map without Reduce; run Sort/Shuffle; output the mean and SD to an individual file.
1.21 Run Map without Reduce; calculate MDL; run Sort.
1.22 Run Map without Reduce; calculate BIC; run Sort.
1.23 Run Map without Reduce; calculate AIC; run Sort.
1.41 Run Map on MDL-BestScore (HDFS); run Reduce.
1.42 Run Map on BIC-BestScore (HDFS); run Reduce.
1.43 Run Map on AIC-BestScore (HDFS); run Reduce.
2.1 Run Map and Reduce; output the optimized model.
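The per-feature statistics of step 1.1 and two of the scores named on this slide can be sketched in Python. This is an illustration under assumptions, not the slide's implementation: it assumes a Gaussian model for each continuous feature with a non-zero standard deviation, uses the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, and leaves MDL out because the slide does not say which formulation is used. All function names are hypothetical.

```python
import math


def mean_sd(values):
    """Mean and maximum-likelihood (population) standard deviation of one feature."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def gaussian_log_likelihood(values, mean, sd):
    """Log-likelihood of the values under a normal distribution; assumes sd > 0."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd * sd) - (v - mean) ** 2 / (2 * sd * sd)
        for v in values
    )


def aic(log_likelihood, k):
    """Akaike Information Criterion for a model with k free parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion for k free parameters and n samples."""
    return k * math.log(n) - 2 * log_likelihood
```

Lower AIC or BIC values indicate a better trade-off between fit and model size, and BIC penalizes extra parameters more heavily than AIC once n exceeds about 8 samples.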

19 Enterprise Collaboration: Functional Flow
Collaborative Recommendation Model
[Figure: recommendation loop]
Customers (u) generate rating events (c, p, r) at time t.
Train Model: produces a model with a validity time interval, f(c, p) -> r.
Get unrated items; Predict rating; Select top k; recommend.
Customers return feedback (c, p, r) at time t.
[Formula: item similarity sim(x_i, x_j), computed from the ratings r of the customers c]
Why Big Data: no cold start.
[Figure: business asset model]
Nodes: Dimensions of Order, Date of Order, Company, Customer Grading, Product Detail, Company Detail, Product Identifier, Price, Product Name, Customer Value, Massive Detail, Quantity Ordered, Category of Company, Revenue in off-peak, Coefficient for Price Calculation, Price Versioning Detail, Revenue Detail, Product History, Order Detail, Supplier Detail, Client Quota Detail.
Relation: Identified by.
Symbol legend: Business Assets, Business Object, Information Asset, Data Element, Business Rule, Capability.
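The recommendation loop on this slide (get unrated items, predict rating, select top k) can be sketched as item-based collaborative filtering in Python. The similarity formula on the slide did not survive transcription, so cosine similarity over co-rating customers is used here purely as an assumption; the function names and the nested-dictionary rating layout are also hypothetical.

```python
import math


def cosine_sim(ratings, item_a, item_b):
    """Cosine similarity of two items over the customers who rated both."""
    common = [c for c, r in ratings.items() if item_a in r and item_b in r]
    if not common:
        return 0.0
    dot = sum(ratings[c][item_a] * ratings[c][item_b] for c in common)
    norm_a = math.sqrt(sum(ratings[c][item_a] ** 2 for c in common))
    norm_b = math.sqrt(sum(ratings[c][item_b] ** 2 for c in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, customer, item):
    """Similarity-weighted average of the customer's existing ratings."""
    rated = ratings[customer]
    weights = {other: cosine_sim(ratings, item, other) for other in rated}
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(weights[o] * rated[o] for o in rated) / total


def recommend(ratings, customer, k):
    """Predict ratings for the customer's unrated items and keep the top k."""
    all_items = {item for r in ratings.values() for item in r}
    unrated = all_items - set(ratings[customer])
    scored = [(item, predict(ratings, customer, item)) for item in unrated]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical rating events, stored as customer -> {product: rating}.
RATINGS = {
    "c1": {"p1": 5, "p2": 4},
    "c2": {"p1": 4, "p2": 5, "p3": 2},
    "c3": {"p1": 1, "p3": 5},
}
```

Calling recommend(RATINGS, "c1", 1) returns the single unrated product p3 together with its predicted rating; feeding the customer's feedback back into RATINGS closes the loop shown on the slide.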

20 Results Data Mining Results Famille Format Mode Charge ABS Coulé Chargé CHAUDRO Extrudé Bronze Granule CONSO Coulé Polyester COULEE PU Pressé Chargé Pressé Bronze last 2 Couleur quantity production Blanc Bleu Rouge 5789 Jun Famille Format Mode Charge Tube GRAPHITAGE Pressé Anti UV GRAVAGE Stabilisé INJECT APR Rectifié INJECT CLI Régénéré INJECTION Grainé Anti Rayure JONC Régénéré Grainé Couleur Rouge Incolore Beige Fumé Bronze quantity last production Nov

21 Results Data Mining Results Famille Format Mode Charge Couleur quantity last production Plaque CONSO GRANULE COULEE PU DECOUPE PETG FABRIQUES PETG NEGOCE Grainé Médical Moulé Poreux OIL Antistatique Diffusant HI Prismatique Lubrifiant NEGOCE OIL GRANULE Moulé Antistatique COULEE PU Expansé Diffusant DECOUPE Moulé HI Beige Fumé Bronze Gris Gris Bleu Ivoire Jaune Incolore 9476 May

22 Polyamide Results Data Mining Results Famille Format Mode Charge INJECT CLI INJECTION JONC Chargé Bronze MAINT LOC Chargé Carbone MATIERE Chargé Calcium MONTAGE Extrudé Lubrifiant NEGOCE Pressé Antistatique PA Rectifié Diffusant Granule PC Grainé Additif Jonc PE Moulé Anti UV PEEK Lisse AXPET PETG Plaxe Confetti PETIT EQUI FROST PF Polyester PLAQUE Prismatique PONCTUELS TUBE USINAGE Famille Format Mode Charge PONCTUELS Expansé Confetti TUBE Lisse Poreux USINAGE Plaxe Prismatique TUBE USINAGE Lisse Poreux Prismatique Couleur quantity last production Blanc Bleu Transparent Rouge Incolore Fumé Bronze Gris Bleu Ivoire Jaune Orange Vert 3 Couleur quantity Blanc Bleu Naturel Transparent Noir Rouge NON DEFINI Beige Fumé Bronze Gris Bleu Ivoire Jaune Vert Aluminium Sep Last production 2967 Nov

23 Polyoxym Results Data Mining Results Famille Format Mode Charge Couleur quantity Last production DIVERS Grainé Diffusant Blanc FABRIQUES Médical HI BUR & INFO Moulé Additif Noir 1758 Dec JONC Expansé Moulé HI 23

24 Results: Data Mining Results
Columns: Quantity Ordered (less than / more than), Base Price, Type of Customer, Abandoned Cart Price, Discount Recommendation.

Type of Customer   Abandoned Cart Price   Discount Recommendation
A                  <10%                   5%-6%
B                  <10%                   5%-6%
C                  <7%                    1%-3%
A                  <7%                    6%-8%
B                  <7%                    6%-8%
C                  <5%                    3%-5%
A                  <5%                    9%-12%
B                  <6%                    8%-12%
C                  <4%                    6%-9%

25 Results: Data Mining Results
Columns: Quantity Ordered (less than / more than), Base Price, Nomenclature (bill of materials), Gamme Interne (internal routing), Outillage (tooling), Transport, Devis lié (linked quote), Gamme sous-traitance technique (technical subcontracting routing), Globale (overall), Discount Recommendation.
Rows (the last value in each row is the Discount Recommendation):
to 3; >12%; <7%; >15%; >70%; >3; 5%-6%
3 to 7; >10%; <7%; >14%; >65%; >3; 5%-6%
8 to 10; <4; >9%; <5%; >11%; >55%; >40; >2; 1%-3%
2 to 3; >12%; <7%; >16%; >65%; 6%-8%
4 to 8; >10%; <7%; >14%; >60%; 6%-8%
<8 and 9 to 13; >3; >9%; <6%; >12%; >58%; >50; >5; 3%-5%
1 to 3; >12%; <7%; >18%; >67%; >2.4; 9%-12%
4 to 9; >10%; <7%; >15%; >65%; >2; 8%-12%
<8 and 10 to 14; >3; >9%; <5%; >12%; >60%; >70; >1.5; 6%-9%

26 Results: Data Mining Results
Production hours per Famille:

Famille      Minimum     Maximum     Average
Granule      57 hours    78 hours    70 hours
Tube         19 hours    23 hours    20 hours
Plaque       87 hours    101 hours   90 hours
Granule      123 hours   189 hours   169 hours
Jonc         68 hours    79 hours    73 hours
Polyamide    65 hours    74 hours    70 hours

27 Enterprise Collaboration: Big Data Capability Results

Companies and their items, with the Mode score (x/10):
NEC75: BUR & INFO (2), COULEE PU (6), MARCHES (9), METALISA (9), PONCTUELS (9)
LABEL74: DIVERS (9), FABRIQUES (7), INJECT APR (10), MAINT LOC (6), OUTILLAGE (3), PONCTUELS (1)
CEZUS44: CHAUDRO (1), GRAVAGE (7), PETIT EQUI (1), PONCTUELS (6)
HEULIE79: FABRIQUES (1), MAINT LOC (3), PONCTUELS (6)
GLYNWE34: DIVERS (2), INJECT APR (5), MAINT LOC (2), MARCHES (6)
AER69: FABRIQUES (7), GRAVAGE (4), METALISA (3), OUTILLAGE (4)
RHODIA93: CHAUDRO (4), MARCHES (3), PONCTUELS (8)
DINEL76: FABRIQUES (7), OUTILLAGE (3)

Company-to-company matrix:
           NEC75  LABEL74  CEZUS44  HEULIE79  GLYNWE34  AER69  RHODIA93  DINEL76
NEC75      1,0    11,8     11,4     9,3       11,4      10,0   22,0      0,0
LABEL74    11,8   1,0      4,5      12,6      21,9      14,8   5,8       10,5
CEZUS44    11,4   4,5      1,0      9,8       0,0       12,0   19,0      0,0
HEULIE79   9,3    12,6     9,8      1,0       6,1       10,7   17,1      8,0
GLYNWE34   11,4   21,9     0,0      6,1       1,0       0,0    9,0       0,0
AER69      10,0   14,8     12,0     10,7      0,0       1,0    0,0       15,7
RHODIA93   22,0   5,8      19,0     17,1      9,0       0,0    1,0       0,0
DINEL76    0,0    10,5     0,0      8,0       0,0       15,7   0,0       1,0

Items listed per company:
NEC75: CHAUDRO
LABEL74: MARCHES
CEZUS44: MARCHES
HEULIE79: CHAUDRO, MARCHES
GLYNWE34: FABRIQUES, OUTILLAGE, PONCTUELS
AER69:
RHODIA93: METALISA, BUR & INFO, COULEE PU
DINEL76: GRAVAGE, METALISA
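One way to read the company-to-company matrix on this slide is to look up, for each company, the partner with the highest score. The Python sketch below does that with the values from the slide (decimal commas written as decimal points). The slide does not state what the score measures, so treating a higher value as a stronger collaboration candidate is an assumption, as are the function names.

```python
COMPANIES = [
    "NEC75", "LABEL74", "CEZUS44", "HEULIE79",
    "GLYNWE34", "AER69", "RHODIA93", "DINEL76",
]

# Matrix values from the slide; row and column order follows COMPANIES.
SCORES = [
    [1.0, 11.8, 11.4, 9.3, 11.4, 10.0, 22.0, 0.0],
    [11.8, 1.0, 4.5, 12.6, 21.9, 14.8, 5.8, 10.5],
    [11.4, 4.5, 1.0, 9.8, 0.0, 12.0, 19.0, 0.0],
    [9.3, 12.6, 9.8, 1.0, 6.1, 10.7, 17.1, 8.0],
    [11.4, 21.9, 0.0, 6.1, 1.0, 0.0, 9.0, 0.0],
    [10.0, 14.8, 12.0, 10.7, 0.0, 1.0, 0.0, 15.7],
    [22.0, 5.8, 19.0, 17.1, 9.0, 0.0, 1.0, 0.0],
    [0.0, 10.5, 0.0, 8.0, 0.0, 15.7, 0.0, 1.0],
]


def best_partner(company):
    """Return (partner, score) with the highest off-diagonal score for a company."""
    i = COMPANIES.index(company)
    candidates = [
        (SCORES[i][j], COMPANIES[j]) for j in range(len(COMPANIES)) if j != i
    ]
    score, partner = max(candidates)
    return partner, score


def ranked_pairs():
    """List every unordered company pair, highest score first."""
    n = len(COMPANIES)
    pairs = [
        (COMPANIES[i], COMPANIES[j], SCORES[i][j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

Under that reading, the strongest pair in the matrix is NEC75 with RHODIA93 (22.0), followed by LABEL74 with GLYNWE34 (21.9).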

28 Ontological Modelling: Relationship among Information Assets, Data Elements, and Business Objects
[Figure: ontology graph rooted at Thing]
Nodes: Order, Customer, Product, Recommendation Detail, Quotation Detail, N.R.P, R.P, Base Price, Product History, Color, Creation Hours, Famille, Order Date, Category, Format, Mode, Abandoned Cart Amount, Charge, Coefficient of Price, Discount Recommended, Last-prod, rating event, Train Model, Predict rating.
Relations: contains, determined by, demanded by, demands, are, uses.
Legend: Business Object, Information Asset, Data Element, Data Properties.

29 Asset as Service (SWRL)

Production Capability
APR(?x) ∧ produce.product(?x, ?y) ∧ (mode(?y, ?m) ∧ selection.range(divers, fabriques, bur info)) ∧ (charge(?y, ?c) ∧ range(diffusant, HI, Additif)) ∧ dimension(?y, ?d) ∧ d1(?d, ?d1) ∧ range(?d1, ?r) ∧ (?r, 70 164) ∧ qty(?y, ?q) ∧ (?q, 1800) → production.capability($x, $y) ∧ conditions(($y, $m), ($y, $c), ($y, $d), ($y, $q))

Timing Capability
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 57) ∧ max(?z, 78) ∧ average(?z, 70) → granule($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 19) ∧ max(?z, 23) ∧ average(?z, 20) → tube($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 87) ∧ max(?z, 101) ∧ average(?z, 90) → plaque($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 68) ∧ max(?z, 79) ∧ average(?z, 73) → jonc($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 65) ∧ max(?z, 74) ∧ average(?z, 70) → polyamide($y)

Discount Recommendation (previous purchase history)
APR(?x) ∧ produce.product(?x, ?y) ∧ (base.price(?y, ?z) ∧ range(300, 400)) ∧ (quantity.ordered(?y, ?q) ∧ range(300, 400)) ∧ abandoned.cartprice(?y, ?a) ∧ min(?a, 10) ∧ customer.type(?c, a) → discount($c, range(5, 6))
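The timing capability rules on this slide can be mirrored as plain data to show how a published asset might be queried by a partner. The Python sketch below is an illustration only: the min, max, and average hours come from the rules, while the TimingCapability class, the fits_window function, and the three-way answer ("guaranteed", "possible", "infeasible") are assumptions and not part of the SWRL model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingCapability:
    """Production-hour bounds for one product family, as stated in the rules."""
    min_hours: int
    max_hours: int
    average_hours: int


# Values taken from the timing capability rules for APR.
APR_TIMING = {
    "granule": TimingCapability(57, 78, 70),
    "tube": TimingCapability(19, 23, 20),
    "plaque": TimingCapability(87, 101, 90),
    "jonc": TimingCapability(68, 79, 73),
    "polyamide": TimingCapability(65, 74, 70),
}


def fits_window(product, hours_available):
    """Classify a requested production window against the published bounds."""
    capability = APR_TIMING[product]
    if hours_available >= capability.max_hours:
        return "guaranteed"  # even the slowest observed run fits
    if hours_available >= capability.min_hours:
        return "possible"  # only the faster runs fit
    return "infeasible"  # below the fastest observed run
```

A partner asking for tubes within 20 hours would get "possible", since 20 hours lies between the minimum of 19 and the maximum of 23.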


31 References
Chelmis C., "Complex modeling and analysis of workplace collaboration data", Collaboration Technologies and Systems (CTS), 2013 International Conference on, IEEE, 2013.
Chen T.-Y., "Knowledge sharing in virtual enterprises via an ontology-based access control approach", Computers in Industry, vol. 59, no. 5, 2008.
Denicolai S., Zucchella A., Strange R., "Knowledge assets and firm international performance", International Business Review, vol. 23, no. 1, 2014.
Ding Y., Foo S., "Ontology research and development, Part 2 - A review of ontology mapping and evolving", Journal of Information Science, vol. 28, no. 5, 2002.
Gene Ontology Consortium, "Gene Ontology Consortium: going forward", Nucleic Acids Research, vol. 43, no. D1, 2015, p. D1049-D1056.
Geerts G. L., McCarthy W. E., "An ontological analysis of the economic primitives of the extended-REA enterprise information architecture", International Journal of Accounting Information Systems, vol. 3, no. 1, 2002.
Lee J., Chae H., Kim C.-H., Kim K., "Design of product ontology architecture for collaborative enterprises", Expert Systems with Applications, vol. 36, no. 2, 2009.
Lee J., Goodwin R., "Ontology management for large-scale enterprise systems", Electronic Commerce Research and Applications, vol. 5, no. 1, 2006.

32 References
Lin H. K., Harding J. A., "A manufacturing system engineering ontology model on the semantic web for inter-enterprise collaboration", Computers in Industry, vol. 58, no. 5, 2007.
Naeem M., Moalla N., Ouzrout Y., Bouras A., "An ontology based digital preservation system for enterprise collaboration", Computer Systems and Applications (AICCSA), 2014 IEEE/ACS 11th International Conference on, November 2014.
O'Leary D. E., "Enterprise ontologies: Review and an activity theory approach", International Journal of Accounting Information Systems, vol. 11, no. 4, 2010.
Özcan F., Tatbul N., Abadi D. J., Kornacker M., Mohan C., Ramasamy K., Wiener J., "Are we experiencing a big data bubble?", Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, June 2014.
Shvachko K., Kuang H., Radia S., Chansler R., "The Hadoop distributed file system", Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, May 2010, p. 1-10.
Scheer A.-W., Nüttgens M., "ARIS architecture and reference models for business process management", Springer, 2000.
Wulan M., Petrovic D., "A fuzzy logic based system for risk analysis and evaluation within enterprise collaborations", Computers in Industry, vol. 63, no. 8, 2012.


More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop)

CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop) CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2016 MapReduce MapReduce is a programming model

More information

Systems Engineering II. Pramod Bhatotia TU Dresden pramod.bhatotia@tu- dresden.de

Systems Engineering II. Pramod Bhatotia TU Dresden pramod.bhatotia@tu- dresden.de Systems Engineering II Pramod Bhatotia TU Dresden pramod.bhatotia@tu- dresden.de About me! Since May 2015 2015 2012 Research Group Leader cfaed, TU Dresden PhD Student MPI- SWS Research Intern Microsoft

More information

International Journal of Advance Research in Computer Science and Management Studies

International Journal of Advance Research in Computer Science and Management Studies Volume 2, Issue 8, August 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Introduction to SAP. SAP University Alliances Author Stefan Weidner Babett Koch Chris Bernhardt. Product SAP ERP 6.0 EhP7.

Introduction to SAP. SAP University Alliances Author Stefan Weidner Babett Koch Chris Bernhardt. Product SAP ERP 6.0 EhP7. SAP University Alliances Author Stefan Weidner Babett Koch Chris Bernhardt Introduction to SAP Product SAP ERP 6.0 EhP7 Level Beginner Focus Cross-functional integration SD, MM, PP, FI, CO, HCM, WM, PS,

More information

Survey on Scheduling Algorithm in MapReduce Framework

Survey on Scheduling Algorithm in MapReduce Framework Survey on Scheduling Algorithm in MapReduce Framework Pravin P. Nimbalkar 1, Devendra P.Gadekar 2 1,2 Department of Computer Engineering, JSPM s Imperial College of Engineering and Research, Pune, India

More information

Big Data and Apache Hadoop s MapReduce

Big Data and Apache Hadoop s MapReduce Big Data and Apache Hadoop s MapReduce Michael Hahsler Computer Science and Engineering Southern Methodist University January 23, 2012 Michael Hahsler (SMU/CSE) Hadoop/MapReduce January 23, 2012 1 / 23

More information

FOUNDATIONS OF A CROSS- DISCIPLINARY PEDAGOGY FOR BIG DATA

FOUNDATIONS OF A CROSS- DISCIPLINARY PEDAGOGY FOR BIG DATA FOUNDATIONS OF A CROSSDISCIPLINARY PEDAGOGY FOR BIG DATA Joshua Eckroth Stetson University DeLand, Florida 3867402519 jeckroth@stetson.edu ABSTRACT The increasing awareness of big data is transforming

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Big Data Processing with MapReduce for E-Book

Big Data Processing with MapReduce for E-Book Big Data Processing with MapReduce for E-Book Tae Ho Hong 2, Chang Ho Yun 1,2, Jong Won Park 1,2, Hak Geon Lee 2, Hae Sun Jung 1 and Yong Woo Lee 1,2 1 The Ubiquitous (Smart) City Consortium 2 The University

More information

L1: Introduction to Hadoop

L1: Introduction to Hadoop L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General

More information

Model of Cloud-Based Services for Data Mining Analysis

Model of Cloud-Based Services for Data Mining Analysis Computer and Information Science; Vol. 8, No. 4; 2015 ISSN 1913-8989 E-ISSN 1913-8997 Published by Canadian Center of Science and Education Model of Cloud-Based Services for Data Mining Analysis Aleksandar

More information

In-Memory Analytics for Big Data

In-Memory Analytics for Big Data In-Memory Analytics for Big Data Game-changing technology for faster, better insights WHITE PAPER SAS White Paper Table of Contents Introduction: A New Breed of Analytics... 1 SAS In-Memory Overview...

More information

Turning Big Data into Big Insights

Turning Big Data into Big Insights mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Journal of Computational Information Systems 7: 16 (2011) 5956-5963 Available at http://www.jofcis.com Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Ping ZHOU, Jingsheng LEI, Wenjun YE

More information

A Study on Data Analysis Process Management System in MapReduce using BPM

A Study on Data Analysis Process Management System in MapReduce using BPM A Study on Data Analysis Process Management System in MapReduce using BPM Yoon-Sik Yoo 1, Jaehak Yu 1, Hyo-Chan Bang 1, Cheong Hee Park 1 Electronics and Telecommunications Research Institute, 138 Gajeongno,

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

fédération de données et de ConnaissancEs Distribuées en Imagerie BiomédicaLE Data fusion, semantic alignment, distributed queries

fédération de données et de ConnaissancEs Distribuées en Imagerie BiomédicaLE Data fusion, semantic alignment, distributed queries fédération de données et de ConnaissancEs Distribuées en Imagerie BiomédicaLE Data fusion, semantic alignment, distributed queries Johan Montagnat CNRS, I3S lab, Modalis team on behalf of the CrEDIBLE

More information

SAP Business Suite powered by SAP HANA

SAP Business Suite powered by SAP HANA SAP Business Suite powered by SAP HANA CeBIT 2013, March 5 th Bernd Leukert, Corporate Officer and Executive Vice President Application Innovation, SAP AG Magnitude of Change: Omission of Restrictions

More information

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University

More information

Big Data Architect Certification Self-Study Kit Bundle

Big Data Architect Certification Self-Study Kit Bundle Big Data Architect Certification Bundle This certification bundle provides you with the self-study materials you need to prepare for the exams required to complete the Big Data Architect Certification.

More information

Teradata s Big Data Technology Strategy & Roadmap

Teradata s Big Data Technology Strategy & Roadmap Teradata s Big Data Technology Strategy & Roadmap Artur Borycki, Director International Solutions Marketing 18 March 2014 Agenda > Introduction and level-set > Enabling the Logical Data Warehouse > Any

More information

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

More information

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

Energy-Saving Cloud Computing Platform Based On Micro-Embedded System

Energy-Saving Cloud Computing Platform Based On Micro-Embedded System Energy-Saving Cloud Computing Platform Based On Micro-Embedded System Wen-Hsu HSIEH *, San-Peng KAO **, Kuang-Hung TAN **, Jiann-Liang CHEN ** * Department of Computer and Communication, De Lin Institute

More information

Introduction to Hadoop and MapReduce

Introduction to Hadoop and MapReduce Introduction to Hadoop and MapReduce THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Large-scale Computation Traditional solutions for computing large quantities of data

More information

ITG Software Engineering

ITG Software Engineering Introduction to Cloudera Course ID: Page 1 Last Updated 12/15/2014 Introduction to Cloudera Course : This 5 day course introduces the student to the Hadoop architecture, file system, and the Hadoop Ecosystem.

More information

TRAINING PROGRAM ON BIGDATA/HADOOP

TRAINING PROGRAM ON BIGDATA/HADOOP Course: Training on Bigdata/Hadoop with Hands-on Course Duration / Dates / Time: 4 Days / 24th - 27th June 2015 / 9:30-17:30 Hrs Venue: Eagle Photonics Pvt Ltd First Floor, Plot No 31, Sector 19C, Vashi,

More information

Cloud Computing Now and the Future Development of the IaaS

Cloud Computing Now and the Future Development of the IaaS 2010 Cloud Computing Now and the Future Development of the IaaS Quanta Computer Division: CCASD Title: Project Manager Name: Chad Lin Agenda: What is Cloud Computing? Public, Private and Hybrid Cloud.

More information

An Efficient and Scalable Management of Ontology

An Efficient and Scalable Management of Ontology An Efficient and Scalable Management of Ontology Myung-Jae Park 1, Jihyun Lee 1, Chun-Hee Lee 1, Jiexi Lin 1, Olivier Serres 2, and Chin-Wan Chung 1 1 Korea Advanced Institute of Science and Technology,

More information

Information Systems in the Enterprise

Information Systems in the Enterprise Chapter 2 Information Systems in the Enterprise 2.1 Prentice Hall Objectives 1. What are the major types of systems in a business? What role do they play? 2. How do information systems support the major

More information

APPROACHABLE ANALYTICS MAKING SENSE OF DATA

APPROACHABLE ANALYTICS MAKING SENSE OF DATA APPROACHABLE ANALYTICS MAKING SENSE OF DATA AGENDA SAS DELIVERS PROVEN SOLUTIONS THAT DRIVE INNOVATION AND IMPROVE PERFORMANCE. About SAS SAS Business Analytics Framework Approachable Analytics SAS for

More information

MDM for the Enterprise: Complementing and extending your Active Data Warehousing strategy. Satish Krishnaswamy VP MDM Solutions - Teradata

MDM for the Enterprise: Complementing and extending your Active Data Warehousing strategy. Satish Krishnaswamy VP MDM Solutions - Teradata MDM for the Enterprise: Complementing and extending your Active Data Warehousing strategy Satish Krishnaswamy VP MDM Solutions - Teradata 2 Agenda MDM and its importance Linking to the Active Data Warehousing

More information

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013 Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

More information

On a Hadoop-based Analytics Service System

On a Hadoop-based Analytics Service System Int. J. Advance Soft Compu. Appl, Vol. 7, No. 1, March 2015 ISSN 2074-8523 On a Hadoop-based Analytics Service System Mikyoung Lee, Hanmin Jung, and Minhee Cho Korea Institute of Science and Technology

More information

Big Fast Data Hadoop acceleration with Flash. June 2013

Big Fast Data Hadoop acceleration with Flash. June 2013 Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional

More information

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress

More information

The basic data mining algorithms introduced may be enhanced in a number of ways.

The basic data mining algorithms introduced may be enhanced in a number of ways. DATA MINING TECHNOLOGIES AND IMPLEMENTATIONS The basic data mining algorithms introduced may be enhanced in a number of ways. Data mining algorithms have traditionally assumed data is memory resident,

More information

Extending The Value of SAP with the SAP BusinessObjects Business Intelligence Platform Product Integration Roadmap

Extending The Value of SAP with the SAP BusinessObjects Business Intelligence Platform Product Integration Roadmap Extending The Value of SAP with the SAP BusinessObjects Business Intelligence Platform Product Integration Roadmap Naomi Tomioka Phipps Principal Solution Advisor Business User South East Asia 22 nd April,

More information

Business Process Modeling. Introduction to ARIS Methodolgy

Business Process Modeling. Introduction to ARIS Methodolgy Business Process Modeling Introduction to ARIS Methodolgy Agenda What s in modeling? Situation today Objectives of Process Management ARIS Framework and methods ARIS suite of products Live demo Page 2

More information

Big Data: What You Should Know. Mark Child Research Manager - Software IDC CEMA

Big Data: What You Should Know. Mark Child Research Manager - Software IDC CEMA Big Data: What You Should Know Mark Child Research Manager - Software IDC CEMA Agenda Market Dynamics Defining Big Data Technology Trends Information and Intelligence Market Realities Future Applications

More information

Radoop: Analyzing Big Data with RapidMiner and Hadoop

Radoop: Analyzing Big Data with RapidMiner and Hadoop Radoop: Analyzing Big Data with RapidMiner and Hadoop Zoltán Prekopcsák, Gábor Makrai, Tamás Henk, Csaba Gáspár-Papanek Budapest University of Technology and Economics, Hungary Abstract Working with large

More information

Big Data Introduction

Big Data Introduction Big Data Introduction Ralf Lange Global ISV & OEM Sales 1 Copyright 2012, Oracle and/or its affiliates. All rights Conventional infrastructure 2 Copyright 2012, Oracle and/or its affiliates. All rights

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

Big Data Too Big To Ignore

Big Data Too Big To Ignore Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction

More information

The Business Analyst s Guide to Hadoop

The Business Analyst s Guide to Hadoop White Paper The Business Analyst s Guide to Hadoop Get Ready, Get Set, and Go: A Three-Step Guide to Implementing Hadoop-based Analytics By Alteryx and Hortonworks (T)here is considerable evidence that

More information

RECOMMENDATION SYSTEM USING BLOOM FILTER IN MAPREDUCE

RECOMMENDATION SYSTEM USING BLOOM FILTER IN MAPREDUCE RECOMMENDATION SYSTEM USING BLOOM FILTER IN MAPREDUCE Reena Pagare and Anita Shinde Department of Computer Engineering, Pune University M. I. T. College Of Engineering Pune India ABSTRACT Many clients

More information

Hadoop & SAS Data Loader for Hadoop

Hadoop & SAS Data Loader for Hadoop Turning Data into Value Hadoop & SAS Data Loader for Hadoop Sebastiaan Schaap Frederik Vandenberghe Agenda What s Hadoop SAS Data management: Traditional In-Database In-Memory The Hadoop analytics lifecycle

More information

Prof. Dr. Lutz Heuser SAP Research

Prof. Dr. Lutz Heuser SAP Research Enterprise Services Architecture & Semantic Web Services Prof. Dr. Lutz Heuser SAP Research Enterprise Services Architecture Architecture for Change Semantic Web Services Time for Change: IT is Entering

More information

The BigData Top100 List Initiative. Chaitan Baru San Diego Supercomputer Center

The BigData Top100 List Initiative. Chaitan Baru San Diego Supercomputer Center The BigData Top100 List Initiative Chaitan Baru San Diego Supercomputer Center 2 Background Workshop series on Big Data Benchmarking (WBDB) First workshop, May 2012, San Jose. Hosted by Brocade. Second

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper Offload Enterprise Data Warehouse (EDW) to Big Data Lake Oracle Exadata, Teradata, Netezza and SQL Server Ample White Paper EDW (Enterprise Data Warehouse) Offloads The EDW (Enterprise Data Warehouse)

More information