Opportunity Analysis for Enterprise Collaboration between Network of SMEs


1 Opportunity Analysis for Enterprise Collaboration between Network of SMEs
Presenter: M. Naeem
Supervisors: Abdelaziz Bouras, Yacine Ouzrout, Néjib Moalla
Laboratoire Décision et Information pour les Systèmes de Production (DISP), Université Lumière Lyon 2, France
27-May

2 Agenda
- Background: Context of Research; Challenge & Opportunities; Objective; Research Problem; Expected Results
- Related Work
- Proposed Framework
- Results:
  - Pig/Hive Results
  - Enterprise Collaboration Functional Flow
  - Enterprise Collaboration Big Data Capability Results
  - Ontological Modeling Results
  - Asset as Service (SWRL)

3 Background: Context of Research
- Network of SMEs
- Diversified data
- Emergence of Big Data technologies
- Open data modeling

4 Background: Challenge / Opportunity
- The diversity of data sources and the ontology modeling perspective
- The analysis of data repositories to create enterprise assets (services) for collaboration
- The composition of collaborative business processes from identified services
Diagram labels: SME (Plastic Manufacturer): DP, ERP, BA; SME (Metal Manufacturer): DP, ERP, BA
Source: Martin Hilbert, Priscila Lopez, "The world's technological capacity to store, communicate, and compute information", Science 332 (6025) (2011)

5 Background: Challenge and Opportunities
Big Data opportunities: more than 50% of 560 surveyed enterprises think Big Data will help them increase operational efficiency, among other benefits.
Source: Philip Chen, C. L., & Zhang, C. Y. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 275.

6 Background: Objectives (Bottom-Up Approach)
- Integrate systems to capitalize on and reuse enterprise capabilities and experiences when making decisions.
- Support concurrent/collaborative partner consortia in defining added-value collaboration schemas.
- Provide ontologies.
- Federate enterprises' data repositories to create new collaboration capabilities.
Diagram labels: Output (Collaborative Added Value); Service Orchestrator; Asset Enabler; New Data; Saving configuration; DP + DMS; Digital preservation system; Acquisition; Enterprises' legacy systems

7 Background: Research Problem
- How can a high degree of data integration across corporate data sources be associated with the perceived benefits of added value during inter-enterprise collaboration?
- How to define data and information assets in an enterprise?
- How to find the unique characteristics associated with this data?
- How to accelerate the creation of new business collaborations?

8 Background: Expected Results
- Repository of assets published as services.
- Assessment model for new collaboration opportunities.
- Service matchmaker for collaborative business process composition.

9 Literature Review: Enterprise Collaboration, Ontology Engineering
Columns: Framework/Architecture (Year); Methodology; Comments
- SnoBase (2006), KAON (2004), SymOntoX (2003), powl (2005); Ontology; Large organizations are producing complex data; the focus was on acquisition, and other aspects of the valorization of collaborations were missing
- KACP (2008); Ontology; Limited to enterprise security access only
- Yuh-Jen et al. (2009), Daniel et al. (2010); Ontology; Covers PLM but ignores numerous complexities related to unstructured data
- ARIS (1998); Rstatic ontology; Generalization not possible
- CRE (2012); Fuzzy logic; Limited to risk analysis
- NEGOSIS (2014); Ontology; Limited to the analysis phase only

10 Literature Review: Enterprise Collaboration, Big Data
Chelmis et al. (2013) studied the exploitation of big data technologies for workplace collaboration, focusing on several questions: the dynamics and characteristics of users' communication behavioral patterns, their statistical properties, and the complex correlations between social and topical structures. However, the study was limited to a single enterprise and did not address the impact of big data on product improvement.

11 Literature Review: Enterprise Collaboration, Big Data
Big data brings new opportunities for:
- Business analytics techniques and strategies (Özcan et al., 2014)
- Resources, capabilities, and skills needed to maximize the impact of business analytics (Shvachko et al., 2010)
- The challenge of a globalized standard for inter-enterprise collaboration (Lin et al., 2007)

12 Towards Solution: Enterprise Collaboration Framework
Consortium of SMEs: make best use of collaboration capabilities in order to answer new business requirements:
- Co-production of a new product
- Find the best supplier of a specific raw material
- Find a sub-contractor
- Joint capacity building
Diagram labels (Front-End / Back-End): Input (New Opportunities); Collaborative Model; Added Value Assessment Model; Service Orchestrator; Output (Added Value); Repository of Assets; Asset as Service (AaS); Ontological Modeling; SME-1 and SME-2 (Business Process, Inf. Tech., Dig. Res.); Big Data Technologies (Acquisition, Organize, Analyze, Decide); Data Anonymizer; SCM, SRM, ERP, PLM, CRM; Document Management System (unstructured docs); Digital Preservation Platform

13 Phase of Enterprise Collaboration: Big Data Perspective
Proposed Architecture for Enterprise Collaboration
Consortium of SMEs: make best use of collaboration capabilities in order to answer new business requirements:
- Co-production of a new product
- Find the best supplier of a specific raw material
- Find a sub-contractor
- Joint capacity building
Analysis in the phase of Acquisition (Case Studies)
Diagram labels (Front-End / Back-End): Input (New Opportunities); Collaborative Model; Added Value Assessment Model; Service Orchestrator; Output (Added Value); Repository of Assets; Asset as Service (AaS); Ontological Modeling; SME-1 and SME-2 (Business Process, Inf. Tech., Dig. Res.); Data Anonymizer; ERP, PLM, SCM, CPM, SRN; Document Management System (unstructured docs); Digital Preservation Platform
Big Data Technologies: Acquisition, Organization, Analysis, Decide

14 Results
- Pig / Hive results
- Data mining results (MapReduce)
- Big Data (deep learning)

15 Results: Hive / Pig
- Query-1. Three types of clients: how to review them given three parameters?
- Query-2. Three types of clients: how to review them given four parameters?
- Query-3. Which specific business deals pay us more?

16 Results: Hive / Pig
- Query-4. Which customers have more abandoned orders than a specific threshold?
- Query-5. Churn analysis (customers who left).
- Query-6. Identification of valuable customers who left (those who paid n% more than customers who stayed).
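A minimal sketch of the logic behind Query-4, written in Python rather than Hive or Pig. The record layout, the status labels, and the function name `customers_over_threshold` are illustrative assumptions and do not come from the deck; in HiveQL the same idea would typically be a GROUP BY on the customer with a HAVING condition on the count.

```python
from collections import Counter

# Hypothetical order records: (customer_id, order_status).
# Neither the field names nor the values come from the slides.
orders = [
    ("C1", "abandoned"), ("C1", "abandoned"), ("C1", "completed"),
    ("C2", "completed"), ("C2", "abandoned"),
    ("C3", "abandoned"), ("C3", "abandoned"), ("C3", "abandoned"),
]


def customers_over_threshold(orders, threshold):
    """Return ids of customers with more than `threshold` abandoned orders."""
    abandoned = Counter(cid for cid, status in orders if status == "abandoned")
    return sorted(cid for cid, count in abandoned.items() if count > threshold)


result = customers_over_threshold(orders, threshold=1)
print(result)
```

The threshold is kept as a parameter because the slide only speaks of a "specific threshold" without giving a value.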

17 MapReduce Functional Layer
Layer labels: Data Sources (Document Management System, SCM, PLM, CRM, SRM, ERP); Big Data Storage; Big Data Processing (Aggregation, Summarize, Filter / Transform); Visualization; Results
MapReduce flow: intermediate key-value pairs; group by key (sort, shuffle); key-value groups; reduce; output key-value pairs
Other labels: Grouping Business Assets; Business Ontology = small data
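The key-value flow named on this slide (intermediate pairs, group by key, reduce, output pairs) can be sketched in plain Python as follows. This is a single-process illustration under stated assumptions, not Hadoop code: the sample records and the helper names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect intermediate (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group intermediate values by key and sort the keys (the sort/shuffle step)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups, reducer):
    """Apply the reducer to each key-value group to produce output pairs."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical order lines: (product family, quantity ordered).
records = [("Tube", 120), ("Plaque", 40), ("Tube", 30), ("Jonc", 75), ("Plaque", 10)]


def mapper(record):
    family, quantity = record
    return [(family, quantity)]


def reducer(key, values):
    return sum(values)


totals = reduce_phase(shuffle(map_phase(records, mapper)), reducer)
print(totals)
```

On a cluster the three phases run on different nodes; the sketch only shows how the data changes shape between them.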

18 Functional Layer: Contribution
Classification for continuous variables
- Simple Naive Bayes is parallel in nature; there is no need for the problem to be memory resident.
- Trade-off: poor performance because of underfitting.
- A better solution is a graphical Bayesian network.
Scores:
- AIC: Akaike Information Criterion
- BIC: Bayesian Information Criterion
- MDL: Minimum Description Length
Steps:
1.1 For each feature in the dataset: run Map without Reduce; run Sort/Shuffle; output mean and SD in an individual file
1.21 Run Map without Reduce; calculate MDL; run Sort
1.22 Run Map without Reduce; calculate BIC; run Sort
1.23 Run Map without Reduce; calculate AIC; run Sort
1.41 Run Map on MDL-BestScore (HDFS); run Reduce
1.42 Run Map on BIC-BestScore (HDFS); run Reduce
1.43 Run Map on AIC-BestScore (HDFS); run Reduce
2.1 Run Map and Reduce; output the optimized model
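As a rough illustration of step 1.1 and of the three scores, the sketch below computes the mean and SD of one continuous feature and then evaluates AIC, BIC, and a two-part MDL approximation on the fitted Gaussian. It is an assumption-laden simplification: the deck scores Bayesian network structures, whereas this example scores a single Gaussian fit; the sample values are made up; and taking MDL as half of BIC is one common approximation, not necessarily the variant used by the author.

```python
import math


def mean_sd(values):
    """Maximum-likelihood mean and standard deviation of one feature."""
    n = len(values)
    mu = sum(values) / n
    variance = sum((v - mu) ** 2 for v in values) / n
    return mu, math.sqrt(variance)


def gaussian_log_likelihood(values, mu, sd):
    """Log-likelihood of the values under a Gaussian with the given parameters."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (v - mu) ** 2 / (2 * sd ** 2)
        for v in values
    )


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def mdl(log_lik, k, n):
    """Two-part MDL approximation (assumed here): (k / 2) ln n - ln L."""
    return 0.5 * k * math.log(n) - log_lik


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical feature values
mu, sd = mean_sd(sample)
log_lik = gaussian_log_likelihood(sample, mu, sd)
k = 2  # free parameters: mean and standard deviation
scores = {
    "AIC": aic(log_lik, k),
    "BIC": bic(log_lik, k, len(sample)),
    "MDL": mdl(log_lik, k, len(sample)),
}
print(mu, sd, scores)
```

Lower values are better for all three scores as defined here, which is why a "best score" can be selected per criterion before the final model is chosen.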

19 Enterprise Collaboration Functional Flow
Collaborative Recommendation Model: Customers (u); rating event (c, p, r) at time t; Train Model; model with validity time interval, f(c, p) → r; Get unrated items; Predict rating; Select top k; recommend; feedback (c, p, r) at time t; Customers
Formulas shown: model f(c, p) → r; item similarity sim(x_i, x_j) over customer ratings r
Why Big Data: no cold start
Dimension labels: Date of Order; Company; Customer Grading; Product Detail; Company Detail; Product Identifier; Price; Product Name; Customer Value; Massive Detail; Quantity Ordered; Category of Company; Revenue in off-peak; Coefficient for Price Calculation; Price Versioning Detail; Revenue Detail; Product History
Relation label: Identified by
Business Assets: Order Detail; Order Detail; Supplier Detail; Client Quota Detail
Symbol legend: Business Object; Information Asset; Data Element; Business Rule; Capability
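The flow on this slide (rating events, similarity, predict rating for unrated items, select top k) can be sketched as below. The similarity formula is not recoverable from the transcript, so cosine similarity between customers is swapped in here as an assumption; the ratings and the names `cosine`, `predict`, and `recommend` are hypothetical.

```python
import math

# Hypothetical rating events, stored as customer -> {product: rating}.
ratings = {
    "c1": {"Tube": 5, "Plaque": 3, "Jonc": 4},
    "c2": {"Tube": 4, "Plaque": 2, "Granule": 5},
    "c3": {"Plaque": 5, "Jonc": 1},
}


def cosine(a, b):
    """Cosine similarity between two rating dictionaries (assumed measure)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[p] * b[p] for p in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(customer, product):
    """Similarity-weighted average of other customers' ratings for the product."""
    numerator = denominator = 0.0
    for other, theirs in ratings.items():
        if other == customer or product not in theirs:
            continue
        sim = cosine(ratings[customer], theirs)
        numerator += sim * theirs[product]
        denominator += abs(sim)
    return numerator / denominator if denominator else None


def recommend(customer, k=2):
    """Predict ratings for the customer's unrated products and keep the top k."""
    rated = set(ratings[customer])
    unrated = {p for r in ratings.values() for p in r} - rated
    scored = [(p, predict(customer, p)) for p in unrated]
    scored = [(p, s) for p, s in scored if s is not None]
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:k]


print(recommend("c3"))
```

Feedback from customers would simply be appended to `ratings` before the next training round, which mirrors the feedback loop in the diagram.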

20 Results: Data Mining Results
Columns: Famille, Format, Mode, Charge, Couleur, quantity, last production
Famille Granule
- Format, Mode, Charge values (as extracted): ABS Coulé Chargé CHAUDRO Extrudé Bronze CONSO Coulé Polyester COULEE PU Pressé Chargé Pressé Bronze
- Couleur (as extracted): Blanc Bleu Rouge
- quantity: 5789; last production: Jun
Famille Tube
- Format, Mode, Charge values (as extracted): GRAPHITAGE Pressé Anti UV GRAVAGE Stabilisé INJECT APR Rectifié INJECT CLI Régénéré INJECTION Grainé Anti Rayure JONC Régénéré Grainé
- Couleur (as extracted): Rouge Incolore Beige Fumé Bronze
- last production: Nov

21 Results: Data Mining Results
Columns: Famille, Format, Mode, Charge, Couleur, quantity, last production
Famille Plaque
- Format, Mode, Charge values (as extracted): CONSO GRANULE COULEE PU DECOUPE PETG FABRIQUES PETG NEGOCE Grainé Médical Moulé Poreux OIL Antistatique Diffusant HI Prismatique Lubrifiant NEGOCE OIL GRANULE Moulé Antistatique COULEE PU Expansé Diffusant DECOUPE Moulé HI
- Couleur (as extracted): Beige Fumé Bronze Gris Gris Bleu Ivoire Jaune Incolore
- quantity: 9476; last production: May

22 Results: Data Mining Results (Polyamide)
Columns: Famille, Format, Mode, Charge, Couleur, quantity, last production
First table
- Format, Mode, Charge values (as extracted): INJECT CLI INJECTION JONC Chargé Bronze MAINT LOC Chargé Carbone MATIERE Chargé Calcium MONTAGE Extrudé Lubrifiant NEGOCE Pressé Antistatique PA Rectifié Diffusant Granule PC Grainé Additif Jonc PE Moulé Anti UV PEEK Lisse AXPET PETG Plaxe Confetti PETIT EQUI FROST PF Polyester PLAQUE Prismatique PONCTUELS TUBE USINAGE
- Couleur (as extracted): Blanc Bleu Transparent Rouge Incolore Fumé Bronze Gris Bleu Ivoire Jaune Orange Vert
Second table
- Format, Mode, Charge values (as extracted): PONCTUELS Expansé Confetti TUBE Lisse Poreux USINAGE Plaxe Prismatique TUBE USINAGE Lisse Poreux Prismatique
- Couleur (as extracted): Blanc Bleu Naturel Transparent Noir Rouge NON DEFINI Beige Fumé Bronze Gris Bleu Ivoire Jaune Vert Aluminium
quantity and last production (as extracted): Sep; 2967 Nov

23 Results: Data Mining Results (Polyoxym)
Columns: Famille, Format, Mode, Charge, Couleur, quantity, last production
- Format, Mode, Charge, Couleur values (as extracted): DIVERS Grainé Diffusant Blanc FABRIQUES Médical HI BUR & INFO Moulé Additif Noir JONC Expansé Moulé HI
- quantity: 1758; last production: Dec

24 Results: Data Mining Results
Columns: Quantity-Ordered (less than / more than), Base Price, Type of Customer, Abandoned Cart Price, Discount Recommendation
Rows (Type of Customer, Abandoned Cart Price, Discount Recommendation):
A    <10%    5%-6%
B    <10%    5%-6%
C    <7%     1%-3%
A    <7%     6%-8%
B    <7%     6%-8%
C    <5%     3%-5%
A    <5%     9%-12%
B    <6%     8%-12%
C    <4%     6%-9%

25 Results: Data Mining Results
Columns: Quantity-Ordered (less than / more than), Base Price, Nomenclature, Gamme Interne, Outillage, Transport, Devis lie, Gamme sous-traitance technique, Globale, Discount Recommendation
Rows (cell values as extracted):
to 3                 >12%   <7%   >15%   > 70%   >3          5%-6%
3 to 7               >10%   <7%   >14%   > 65%   >3          5%-6%
8 to 10         <4   >9%    <5%   >11%   > 55%   >40  >2     1%-3%
2 to 3               >12%   <7%   >16%   > 65%               6%-8%
4 to 8               >10%   <7%   >14%   > 60%               6%-8%
<8 and 9 to 13  >3   >9%    <6%   >12%   > 58%   >50  >5     3%-5%
1 to 3               >12%   <7%   >18%   > 67%   >2.4        9%-12%
4 to 9               >10%   <7%   >15%   > 65%   >2          8%-12%
<8 and 10 to 14 >3   >9%    <5%   >12%   > 60%   >70  >1.5   6%-9%

26 Results: Data Mining Results
Production Hours
Famille      Minimum      Maximum      Average
Granule      57 hours     78 hours     70 hours
Tube         19 hours     23 hours     20 hours
Plaque       87 hours     101 hours    90 hours
Granule      123 hours    189 hours    169 hours
Jonc         68 hours     79 hours     73 hours
Polyamide    65 hours     74 hours     70 hours

27 Enterprise Collaboration: Big Data Capability Results
Companies and Items (Mode), scored x/10:
NEC75: BUR & INFO (2), COULEE PU (6), MARCHES (9), METALISA (9), PONCTUELS (9)
LABEL74: DIVERS (9), FABRIQUES (7), INJECT APR (10), MAINT LOC (6), OUTILLAGE (3), PONCTUELS (1)
CEZUS44: CHAUDRO (1), GRAVAGE (7), PETIT EQUI (1), PONCTUELS (6)
HEULIE79: FABRIQUES (1), MAINT LOC (3), PONCTUELS (6)
GLYNWE34: DIVERS (2), INJECT APR (5), MAINT LOC (2), MARCHES (6)
AER69: FABRIQUES (7), GRAVAGE (4), METALISA (3), OUTILLAGE (4)
RHODIA93: CHAUDRO (4), MARCHES (3), PONCTUELS (8)
DINEL76: FABRIQUES (7), OUTILLAGE (3)

Company-to-company matrix:
            NEC75   LABEL74   CEZUS44   HEULIE79   GLYNWE34   AER69   RHODIA93   DINEL76
NEC75       1,0     11,8      11,4      9,3        11,4       10,0    22,0       0,0
LABEL74     11,8    1,0       4,5       12,6       21,9       14,8    5,8        10,5
CEZUS44     11,4    4,5       1,0       9,8        0,0        12,0    19,0       0,0
HEULIE79    9,3     12,6      9,8       1,0        6,1        10,7    17,1       8,0
GLYNWE34    11,4    21,9      0,0       6,1        1,0        0,0     9,0        0,0
AER69       10,0    14,8      12,0      10,7       0,0        1,0     0,0        15,7
RHODIA93    22,0    5,8       19,0      17,1       9,0        0,0     1,0        0,0
DINEL76     0,0     10,5      0,0       8,0        0,0        15,7    0,0        1,0

Companies and items (as extracted):
NEC75: CHAUDRO
LABEL74: MARCHES
CEZUS44: MARCHES
HEULIE79: CHAUDRO, MARCHES
GLYNWE34: FABRIQUES, OUTILLAGE, PONCTUELS
AER69 / RHODIA93: METALISA, BUR & INFO, COULEE PU
DINEL76: GRAVAGE, METALISA
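One way to read the company-to-company matrix on this slide is as a partner-ranking input. The sketch below copies the matrix values from the slide (with decimal commas written as points) and picks, for each company, the partner with the highest off-diagonal score. That a higher score means a better collaboration candidate is an assumption, since the deck does not define the measure; the function name `best_partners` is hypothetical.

```python
companies = ["NEC75", "LABEL74", "CEZUS44", "HEULIE79",
             "GLYNWE34", "AER69", "RHODIA93", "DINEL76"]

# Values transcribed from the slide; rows and columns follow `companies`.
matrix = [
    [1.0, 11.8, 11.4, 9.3, 11.4, 10.0, 22.0, 0.0],
    [11.8, 1.0, 4.5, 12.6, 21.9, 14.8, 5.8, 10.5],
    [11.4, 4.5, 1.0, 9.8, 0.0, 12.0, 19.0, 0.0],
    [9.3, 12.6, 9.8, 1.0, 6.1, 10.7, 17.1, 8.0],
    [11.4, 21.9, 0.0, 6.1, 1.0, 0.0, 9.0, 0.0],
    [10.0, 14.8, 12.0, 10.7, 0.0, 1.0, 0.0, 15.7],
    [22.0, 5.8, 19.0, 17.1, 9.0, 0.0, 1.0, 0.0],
    [0.0, 10.5, 0.0, 8.0, 0.0, 15.7, 0.0, 1.0],
]


def best_partners(companies, matrix):
    """For each company, return (partner, score) with the highest off-diagonal score."""
    best = {}
    for i, name in enumerate(companies):
        candidates = [
            (matrix[i][j], other)
            for j, other in enumerate(companies)
            if j != i  # skip the diagonal (the company itself)
        ]
        score, partner = max(candidates, key=lambda item: item[0])
        best[name] = (partner, score)
    return best


best = best_partners(companies, matrix)
for name, (partner, score) in best.items():
    print(f"{name} -> {partner} ({score})")
```

A score of 0,0 in the matrix would then mark pairs with no measured overlap, which is why DINEL76 has few candidates in this reading.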

28 Ontological Modelling: Relationship among Information Assets, Data Elements, and Business Objects
Root: Thing
Relation labels: contains; determined by; demanded by; demands; are; uses
Node labels: Order; Customer; Product; Recommendation Detail; Quotation Detail; N.R.P; R.P; Base Price; Product History; Color; Creation Hours; Famille; rating event; Train Model; Predict rating; Order Date; Category; Format; Mode; Abandoned Cart Amount; Charge; Coefficient of Price; Discount Recommended; sion; Last-prod
Legend: Business Object; Information Asset; Data Element; Data Properties

29 Asset as Service (SWRL)
Production Capability
APR(?x) ∧ produce.product(?x, ?y)
∧ (mode(?y, ?m) ∧ selection.range(divers, fabriques, bur info))
∧ (charge(?y, ?c) ∧ range(diffusant, HI, Additif))
∧ (dimension(?y, ?d) ∧ d1(?d, ?d1) ∧ range(?d1, ?r) ∧ (?r, 70 164))
∧ (qty(?y, ?q) ∧ (?q, 1800))
→ production.capability($x, $y) ∧ conditions(($y, $m) ($y, $c) ($y, $d) ($y, $q))
Timing Capability
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 57) ∧ max(?z, 78) ∧ average(?z, 70) → granule($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 19) ∧ max(?z, 23) ∧ average(?z, 20) → tube($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 87) ∧ max(?z, 101) ∧ average(?z, 90) → plaque($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 68) ∧ max(?z, 79) ∧ average(?z, 73) → jonc($y)
APR(?x) ∧ produce.product(?x, ?y) ∧ production.hours(?y, ?z) ∧ min(?z, 65) ∧ max(?z, 74) ∧ average(?z, 70) → polyamide($y)
Discount Recommendation (previous purchase history)
APR(?x) ∧ produce.product(?x, ?y)
∧ (base.price(?y, ?z) ∧ range(300, 400))
∧ (quantity.ordered(?y, ?q) ∧ range(300, 400))
∧ abandoned.cartprice(?y, ?a) ∧ min(?a, 10)
∧ customer.type(?c, a)
→ discount($c, range(5, 6))
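To make the Timing Capability rules concrete, the sketch below stores the min, max, and average production hours that the rules state for each product family and checks whether a family can be produced within a given number of hours. The decision rule (compare the maximum against the available hours) and the names `Hours`, `can_deliver`, and `feasible_families` are assumptions for illustration; the deck itself expresses this knowledge in SWRL, not in Python.

```python
from typing import NamedTuple


class Hours(NamedTuple):
    minimum: int
    maximum: int
    average: int


# Production hours per family, as stated in the Timing Capability rules.
TIMING = {
    "granule": Hours(57, 78, 70),
    "tube": Hours(19, 23, 20),
    "plaque": Hours(87, 101, 90),
    "jonc": Hours(68, 79, 73),
    "polyamide": Hours(65, 74, 70),
}


def can_deliver(family, available_hours):
    """Assumed rule: feasible when the worst-case (maximum) hours fit the budget."""
    hours = TIMING.get(family.lower())
    return hours is not None and hours.maximum <= available_hours


def feasible_families(available_hours):
    """List the families whose maximum production hours fit the budget."""
    return sorted(f for f in TIMING if can_deliver(f, available_hours))


print(feasible_families(78))
```

Using the average instead of the maximum would give a more optimistic matchmaker; the choice depends on how much schedule risk a collaboration can tolerate.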


31 References
- Chelmis C., "Complex modeling and analysis of workplace collaboration data", Collaboration Technologies and Systems (CTS), 2013 International Conference on, IEEE, 2013.
- Chen T.-Y., "Knowledge sharing in virtual enterprises via an ontology-based access control approach", Computers in Industry, vol. 59, no. 5, 2008.
- Denicolai S., Zucchella A., Strange R., "Knowledge assets and firm international performance", International Business Review, vol. 23, no. 1, 2014.
- Ding Y., Foo S., "Ontology research and development, Part 2 - A review of ontology mapping and evolving", Journal of Information Science, vol. 28, no. 5, 2002.
- Gene Ontology Consortium, "Gene Ontology Consortium: going forward", Nucleic Acids Research, vol. 43, no. D1, 2015, p. D1049-D1056.
- Geerts G. L., McCarthy W. E., "An ontological analysis of the economic primitives of the extended-REA enterprise information architecture", International Journal of Accounting Information Systems, vol. 3, no. 1, 2002.
- Lee J., Chae H., Kim C.-H., Kim K., "Design of product ontology architecture for collaborative enterprises", Expert Systems with Applications, vol. 36, no. 2, 2009.
- Lee J., Goodwin R., "Ontology management for large-scale enterprise systems", Electronic Commerce Research and Applications, vol. 5, no. 1, 2006.

32 References
- Lin H. K., Harding J. A., "A manufacturing system engineering ontology model on the semantic web for inter-enterprise collaboration", Computers in Industry, vol. 58, no. 5, 2007.
- Naeem M., Moalla N., Ouzrout Y., Bouras A., "An ontology based digital preservation system for enterprise collaboration", Computer Systems and Applications (AICCSA), 2014 IEEE/ACS 11th International Conference on, November 2014.
- O'Leary D. E., "Enterprise ontologies: Review and an activity theory approach", International Journal of Accounting Information Systems, vol. 11, no. 4, 2010.
- Özcan F., Tatbul N., Abadi D. J., Kornacker M., Mohan C., Ramasamy K., Wiener J., "Are we experiencing a big data bubble?", Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, June 2014.
- Shvachko K., Kuang H., Radia S., Chansler R., "The Hadoop distributed file system", Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, May 2010, p. 1-10.
- Scheer A.-W., Nüttgens M., "ARIS architecture and reference models for business process management", Springer, 2000.
- Wulan M., Petrovic D., "A fuzzy logic based system for risk analysis and evaluation within enterprise collaborations", Computers in Industry, vol. 63, no. 8, 2012.


More information

Hadoop Job Oriented Training Agenda

Hadoop Job Oriented Training Agenda 1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON HIGH PERFORMANCE DATA STORAGE ARCHITECTURE OF BIGDATA USING HDFS MS.

More information

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

On a Hadoop-based Analytics Service System

On a Hadoop-based Analytics Service System Int. J. Advance Soft Compu. Appl, Vol. 7, No. 1, March 2015 ISSN 2074-8523 On a Hadoop-based Analytics Service System Mikyoung Lee, Hanmin Jung, and Minhee Cho Korea Institute of Science and Technology

More information

Visualization methods for patent data

Visualization methods for patent data Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes

More information

Big Data and Apache Hadoop s MapReduce

Big Data and Apache Hadoop s MapReduce Big Data and Apache Hadoop s MapReduce Michael Hahsler Computer Science and Engineering Southern Methodist University January 23, 2012 Michael Hahsler (SMU/CSE) Hadoop/MapReduce January 23, 2012 1 / 23

More information

JackHare: a framework for SQL to NoSQL translation using MapReduce

JackHare: a framework for SQL to NoSQL translation using MapReduce DOI 10.1007/s10515-013-0135-x JackHare: a framework for SQL to NoSQL translation using MapReduce Wu-Chun Chung Hung-Pin Lin Shih-Chang Chen Mon-Fong Jiang Yeh-Ching Chung Received: 15 December 2012 / Accepted:

More information

APACHE HADOOP JERRIN JOSEPH CSU ID#2578741

APACHE HADOOP JERRIN JOSEPH CSU ID#2578741 APACHE HADOOP JERRIN JOSEPH CSU ID#2578741 CONTENTS Hadoop Hadoop Distributed File System (HDFS) Hadoop MapReduce Introduction Architecture Operations Conclusion References ABSTRACT Hadoop is an efficient

More information

In-Memory Analytics for Big Data

In-Memory Analytics for Big Data In-Memory Analytics for Big Data Game-changing technology for faster, better insights WHITE PAPER SAS White Paper Table of Contents Introduction: A New Breed of Analytics... 1 SAS In-Memory Overview...

More information

Big Data Introduction

Big Data Introduction Big Data Introduction Ralf Lange Global ISV & OEM Sales 1 Copyright 2012, Oracle and/or its affiliates. All rights Conventional infrastructure 2 Copyright 2012, Oracle and/or its affiliates. All rights

More information

L1: Introduction to Hadoop

L1: Introduction to Hadoop L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General

More information

Teradata s Big Data Technology Strategy & Roadmap

Teradata s Big Data Technology Strategy & Roadmap Teradata s Big Data Technology Strategy & Roadmap Artur Borycki, Director International Solutions Marketing 18 March 2014 Agenda > Introduction and level-set > Enabling the Logical Data Warehouse > Any

More information

Keywords: Big Data, HDFS, Map Reduce, Hadoop

Keywords: Big Data, HDFS, Map Reduce, Hadoop Volume 5, Issue 7, July 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Configuration Tuning

More information

Open source framework for data-flow visual analytic tools for large databases

Open source framework for data-flow visual analytic tools for large databases Open source framework for data-flow visual analytic tools for large databases D5.6 v1.0 WP5 Visual Analytics: D5.6 Open source framework for data flow visual analytic tools for large databases Dissemination

More information

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop

Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Journal of Computational Information Systems 7: 16 (2011) 5956-5963 Available at http://www.jofcis.com Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Ping ZHOU, Jingsheng LEI, Wenjun YE

More information

Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications

Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications Optimization of Analytic Data Flows for Next Generation Business Intelligence Applications Umeshwar Dayal, Kevin Wilkinson, Alkis Simitsis, Malu Castellanos, Lupita Paz HP Labs Palo Alto, CA, USA umeshwar.dayal@hp.com

More information

Formal Verification Problems in a Bigdata World: Towards a Mighty Synergy

Formal Verification Problems in a Bigdata World: Towards a Mighty Synergy Dept. of Computer Science Formal Verification Problems in a Bigdata World: Towards a Mighty Synergy Matteo Camilli matteo.camilli@unimi.it http://camilli.di.unimi.it ICSE 2014 Hyderabad, India June 3,

More information

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Energy-Saving Cloud Computing Platform Based On Micro-Embedded System

Energy-Saving Cloud Computing Platform Based On Micro-Embedded System Energy-Saving Cloud Computing Platform Based On Micro-Embedded System Wen-Hsu HSIEH *, San-Peng KAO **, Kuang-Hung TAN **, Jiann-Liang CHEN ** * Department of Computer and Communication, De Lin Institute

More information

SAP Business Suite powered by SAP HANA

SAP Business Suite powered by SAP HANA SAP Business Suite powered by SAP HANA CeBIT 2013, March 5 th Bernd Leukert, Corporate Officer and Executive Vice President Application Innovation, SAP AG Magnitude of Change: Omission of Restrictions

More information

HadoopSPARQL : A Hadoop-based Engine for Multiple SPARQL Query Answering

HadoopSPARQL : A Hadoop-based Engine for Multiple SPARQL Query Answering HadoopSPARQL : A Hadoop-based Engine for Multiple SPARQL Query Answering Chang Liu 1 Jun Qu 1 Guilin Qi 2 Haofen Wang 1 Yong Yu 1 1 Shanghai Jiaotong University, China {liuchang,qujun51319, whfcarter,yyu}@apex.sjtu.edu.cn

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

International Journal of Advance Research in Computer Science and Management Studies

International Journal of Advance Research in Computer Science and Management Studies Volume 2, Issue 8, August 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

An Efficient and Scalable Management of Ontology

An Efficient and Scalable Management of Ontology An Efficient and Scalable Management of Ontology Myung-Jae Park 1, Jihyun Lee 1, Chun-Hee Lee 1, Jiexi Lin 1, Olivier Serres 2, and Chin-Wan Chung 1 1 Korea Advanced Institute of Science and Technology,

More information

Big Data Management. Big Data Management. (BDM) Autumn 2013. Povl Koch September 16, 2013 15-09-2013 1

Big Data Management. Big Data Management. (BDM) Autumn 2013. Povl Koch September 16, 2013 15-09-2013 1 Big Data Management Big Data Management (BDM) Autumn 2013 Povl Koch September 16, 2013 15-09-2013 1 Overview Today s program 1. Little more practical details about this course 2. Chapter 7 in NoSQL Distilled

More information

CS 378 Big Data Programming

CS 378 Big Data Programming CS 378 Big Data Programming Lecture 2 Map- Reduce CS 378 - Fall 2015 Big Data Programming 1 MapReduce Large data sets are not new What characterizes a problem suitable for MR? Most or all of the data is

More information

Massive Cloud Auditing using Data Mining on Hadoop

Massive Cloud Auditing using Data Mining on Hadoop Massive Cloud Auditing using Data Mining on Hadoop Prof. Sachin Shetty CyberBAT Team, AFRL/RIGD AFRL VFRP Tennessee State University Outline Massive Cloud Auditing Traffic Characterization Distributed

More information

The New Face of Business Intelligence for SAP Customers

The New Face of Business Intelligence for SAP Customers Business Objects, an SAP company The New Face of Business Intelligence for SAP Customers Place holder Dan Kearnan, SAP BI Marketing, Business Objects Ken Hartman, Hughes Network Systems Agenda Why SAP

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

Business Process Modeling. Introduction to ARIS Methodolgy

Business Process Modeling. Introduction to ARIS Methodolgy Business Process Modeling Introduction to ARIS Methodolgy Agenda What s in modeling? Situation today Objectives of Process Management ARIS Framework and methods ARIS suite of products Live demo Page 2

More information

USC Viterbi School of Engineering

USC Viterbi School of Engineering USC Viterbi School of Engineering INF 551: Foundations of Data Management Units: 3 Term Day Time: Spring 2016 MW 8:30 9:50am (section 32411D) Location: GFS 116 Instructor: Wensheng Wu Office: GER 204 Office

More information

NoSQL for SQL Professionals William McKnight

NoSQL for SQL Professionals William McKnight NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to

More information

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India 1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto

More information

HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE

HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE Sayalee Narkhede 1 and Tripti Baraskar 2 Department of Information Technology, MIT-Pune,University of Pune, Pune sayleenarkhede@gmail.com

More information

Radoop: Analyzing Big Data with RapidMiner and Hadoop

Radoop: Analyzing Big Data with RapidMiner and Hadoop Radoop: Analyzing Big Data with RapidMiner and Hadoop Zoltán Prekopcsák, Gábor Makrai, Tamás Henk, Csaba Gáspár-Papanek Budapest University of Technology and Economics, Hungary Abstract Working with large

More information

Best Practices for Hadoop Data Analysis with Tableau

Best Practices for Hadoop Data Analysis with Tableau Best Practices for Hadoop Data Analysis with Tableau September 2013 2013 Hortonworks Inc. http:// Tableau 6.1.4 introduced the ability to visualize large, complex data stored in Apache Hadoop with Hortonworks

More information

Big Fast Data Hadoop acceleration with Flash. June 2013

Big Fast Data Hadoop acceleration with Flash. June 2013 Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional

More information

IBM Information Management

IBM Information Management IBM Information Management January 2008 IBM Information Management software Enterprise Information Management, Enterprise Content Management, Master Data Management How Do They Fit Together An IBM Whitepaper

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

The BigData Top100 List Initiative. Chaitan Baru San Diego Supercomputer Center

The BigData Top100 List Initiative. Chaitan Baru San Diego Supercomputer Center The BigData Top100 List Initiative Chaitan Baru San Diego Supercomputer Center 2 Background Workshop series on Big Data Benchmarking (WBDB) First workshop, May 2012, San Jose. Hosted by Brocade. Second

More information

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress

More information

Introduction to DISC and Hadoop

Introduction to DISC and Hadoop Introduction to DISC and Hadoop Alice E. Fischer April 24, 2009 Alice E. Fischer DISC... 1/20 1 2 History Hadoop provides a three-layer paradigm Alice E. Fischer DISC... 2/20 Parallel Computing Past and

More information

Data Modeling for Big Data

Data Modeling for Big Data Data Modeling for Big Data by Jinbao Zhu, Principal Software Engineer, and Allen Wang, Manager, Software Engineering, CA Technologies In the Internet era, the volume of data we deal with has grown to terabytes

More information

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS

ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS A.Divya *1, A.M.Saravanan *2, I. Anette Regina *3 MPhil, Research Scholar, Muthurangam Govt. Arts College, Vellore, Tamilnadu, India Assistant

More information

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging

More information

A Risk Management System Framework for New Product Development (NPD)

A Risk Management System Framework for New Product Development (NPD) 2011 International Conference on Economics and Finance Research IPEDR vol.4 (2011) (2011) IACSIT Press, Singapore A Risk Management System Framework for New Product Development (NPD) Seonmuk Park, Jongseong

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

Using a Failure Modes, Effects and Diagnostic Analysis (FMEDA) to Measure Diagnostic Coverage in Programmable Electronic Systems.

Using a Failure Modes, Effects and Diagnostic Analysis (FMEDA) to Measure Diagnostic Coverage in Programmable Electronic Systems. Using a Failure Modes, Effects and Diagnostic Analysis (FMEDA) to Measure Diagnostic Coverage in Programmable Electronic Systems. Dr. William M. Goble exida.com, 42 Short Rd., Perkasie, PA 18944 Eindhoven

More information

Supply Chain Enterprise and the Need for Integrated Information

Supply Chain Enterprise and the Need for Integrated Information Design and Delivery of Information System using ERP Database Management Software Track: Enterprise Resource Planning The importance of global trade has aroused interest in Enterprise Systems as catalysts

More information

Framework and key technologies for big data based on manufacturing Shan Ren 1, a, Xin Zhao 2, b

Framework and key technologies for big data based on manufacturing Shan Ren 1, a, Xin Zhao 2, b International Conference on Materials Engineering and Information Technology Applications (MEITA 2015) Framework and key technologies for big data based on manufacturing Shan Ren 1, a, Xin Zhao 2, b 1

More information