BDA Technologies & Selected Case Studies
1 BDA Technologies & Selected Case Studies
Ettikan Kandasamy Karuppiah (Ph.D), Principal Researcher & Director of Accelerative Technologies Lab, MIMOS Berhad
SEMINAR INTERNET COMPUTING TECHNOLOGY. Theme: Delivering Values From Hyperconnectivities
Serbaguna 1, MAMPU, 19th January 2015
2 Big Data Analytics at a Glance
Big data is defined by the high volume, velocity, variety, veracity and value of data generated every second, minute, hour and day by devices, humans, etc.
Growing data VOLUME: 90% of the world's data was generated over the last 2 years.
Broadening data VARIETY: 80% of the world's data is unstructured (text, geospatial, audio, video).
Increasing data VELOCITY: 175,000 tweets per second.
Establishing the VERACITY of big data sources: big data technology allows us to establish quality and accuracy, especially in unstructured data.
Turning big data into VALUE: economic benefits, government benefits, societal benefits.
3 Big Data Computing in ICT Sector
The Malaysian ICT services sub-sector has huge growth potential, with a projected share of 35% in the nation's Digital Economy in [year missing]. Requires Transformative Platform.
Software Solutions and Support is the Key GDP Contributor.
Chart axis: Business Value.
Source: MDEC, as taken from APeJ Big Data MaturityScape Assessment 2013 by IDC.
4 MIMOS Big Data Technologies R&D (roadmap diagram)
RM10: Foundation & Early Adaptation for Heterogenetic Computing
RM11: Maturation & Progressive Deployment of Scalable Heterogenetic Computing
Stage labels: Acquire, Train, R&D, R&D Collaboration
Training and collaborations: conducted workshops and Hadoop programming training for the Malaysian research community; MultiCore Java Compiler; Intel Malaysia/US MoU; established work on General Purpose Graphics Processing Unit (GPGPU) for text manipulation; Hadoop trainings; NVIDIA COE for GPGPU established; AMD Malaysia/US/Europe MoU; ESRI Inc/US MoU established
Projects: Cleansing Engine for PERKESO & Warehouse for PERKESO; GE13 Electoral Roll Analysis with Hadoop & GPU; GPU Accelerated Libraries for Cleansing & Financial Risk Modeling; Sentiment Analysis Model; Modeling & Warehouse for PIK MOH & GPGPU Video Analytics Library; Modeling & Visualization for PDRM Workforce Planning & GPGPU Security Library; Accelerated Libraries for base Accelerator Library (Galactica); High Risk Profiling, Illicit, Taxable & Drugs Detection (PoC); Encryption/Decryption for National Protection
Libraries: MiAccLib Cleansing, MiAccLib Finance, MiAccLib Video, MiAccLib Algo/Map, MiAccLib Image, MiAccLib Big, MiAccLib Crypto
2014 MIMOS Berhad. All Rights Reserved.
5 Assisting Both Government & Private Sector Needs
Private Sector to Go Global; National Public Sector
DECISIONS REQUESTED. FCC is requested to:
1. Take note of data science upskilling for civil servants
2. Take note of MAMPU developing the Government Open Data framework by [year missing]
3. Endorse the DG Lab on BDA to identify use cases and pilot projects that address societal wellbeing
4. Take note of MIMOS defining and developing the Big Data technology platform for Government by [year missing]
5. Mandate opening up of all relevant data (Open/Non-Open) to the DG Lab on BDA for the pilot projects
Source: MDeC
Data classification levels: Rahsia Besar, Rahsia, Sulit, Terhad, Terbuka
Opening Up Non-Sensitive Government Data: policy for all government agencies to open up data categorised under Terbuka, e.g. non-sensitive data like meteorology, transport timetables and pricing of essential goods, based on Open Data criteria.
6 Developing BDA Open Innovation Platform
An open-innovation platform between Government, businesses and Rakyat to improve e-participation and user satisfaction.
Prioritization through the development of high-impact, low-cost, demand-driven life-event solutions.
Secure environment (sandbox) for Government data.
BDA DG (Digital Government) Lab inputs: data (Community, Government); project sponsor; sector-specific use cases/life-events (e.g. Welfare, Education, Healthcare, Transportation); BDA Technology Platform; expertise.
Outcomes: POCs, pilots & apps; Open.gov.my
7 BDA Technology Platform Strategy
Research & Development on KEY Data Extraction, Processing & Analytics Components
Data sources: Community, Government
Processing components: Extraction, Staging, Cleansing, Harmonisation, Anonymisation, Model & Analytics, Visualization, Traceability
Supporting layers: Secured Cloud Services, Security, DB Store, Infrastructure Management, Accelerated Computing
Machine Learning: Malaysian context (BM, English, Chinese, Tamil)
Visualization: Malaysian perspective
Key Values: i. National Sovereignty; ii. Trusted; iii. Secured
Localized Entity (i.e. MIMOS, Cybersecurity)
8 BDA Technology Platform Strategy (component-to-product diagram)
Component labels: Source (Structured + Open Linked, Unstructured), Extraction, Staging, Cleansing, Harmonisation, Anonymisation, Customization, Model & Analytics, Visualization, Applications, Security, Traceability, DB Store, Management, Infrastructure Management, 3rd Party Systems & Hardware
Products shown: Mi-CLIP, Mi-Harvester, Mi-Morphe, Mi-Harmony, Mi-Helio, Mi-BIS, Mi-UAP, Mi-ARMC, Mi-Doc, Mi-Scrambler, Mi-Portal, Mi-Trust, Mi-DW, Mi-AccLib, Mi-DSS, Mi-SP (Video Analytics), Mi-Market, Galactica, Mi-Trace, Mi-ROSS, Mi-AccLytics, Mi-STP, Mi-HPDW, Mi-Target, Mi-Cloud, Mi-Mobile, Mi-MOCHA
9 Extracting Value from Unstructured Data
Pipeline stages: Harvesting, Cleansing, Harmonisation, Anonymisation, Sharing
Collector: Mi-Clip
Knowledge Harvester (LOD): Mi-Harvester
Staging: unstructured sources, structured sources
Authentication & Authorization: Mi-UAP, Mi-ARMC
Cleansing (detect, correction, exception): Mi-Morphe + Mi-AccLib
Harmonisation (terminologies): Mi-Harmony + Mi-Semantics
Warehouse Platform: Mi-Galactica, Mi-AccConnect, Mi-HPDW
Anonymisation: Mi-Scramble + Mi-Crypto + MiAccLib
Stores: Granular Primary Database; Published Data Marts; Scrambled database & marts
Modeling and Visualization: Mi-HELIO; Mi-BIS
Analytics: Mi-HPDW; Mi-Portal; Mi-Target
Statistics: Mi-AccStat
Visualization and Social Network Analytics: Mi-Visualitic
Sentiment Analytics: Mi-Intelligence; Mi-NLP
Virtualized Platform & Integrity Manager: Mi-CLOUD + Mi-Mocha
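The cleansing, harmonisation and anonymisation stages on this slide can be illustrated with a minimal Python sketch. It does not use or represent any MIMOS product API; the record fields (`id`, `name`, `city`), the `TERM_MAP` table and all function names are hypothetical, and the salted hash is simple pseudonymisation chosen for illustration, not a complete anonymisation design.

```python
import hashlib
import re

# Hypothetical terminology table; a real deployment would load agreed terms.
TERM_MAP = {"kl": "Kuala Lumpur", "k. lumpur": "Kuala Lumpur", "pj": "Petaling Jaya"}


def cleanse(record):
    """Collapse repeated whitespace, trim values, and reject records with no name."""
    cleaned = {k: re.sub(r"\s+", " ", v).strip() for k, v in record.items()}
    if not cleaned.get("name"):
        raise ValueError("exception: record has no name")
    return cleaned


def harmonise(record):
    """Map free-text city spellings onto one agreed term."""
    out = dict(record)
    out["city"] = TERM_MAP.get(record["city"].lower(), record["city"])
    return out


def anonymise(record, salt):
    """Replace the identifier with a salted SHA-256 digest and drop the name."""
    out = dict(record)
    digest = hashlib.sha256((salt + record["id"]).encode("utf-8")).hexdigest()
    out["id"] = digest[:16]
    del out["name"]
    return out


def run_pipeline(records, salt):
    """Return (published, exceptions) after cleanse -> harmonise -> anonymise."""
    published, exceptions = [], []
    for rec in records:
        try:
            published.append(anonymise(harmonise(cleanse(rec)), salt))
        except ValueError as err:
            # Records that fail cleansing are set aside rather than published.
            exceptions.append((rec, str(err)))
    return published, exceptions


if __name__ == "__main__":
    raw = [
        {"id": "A001", "name": "  Siti   Aminah ", "city": "KL"},
        {"id": "A002", "name": "   ", "city": "PJ"},
    ]
    published, exceptions = run_pipeline(raw, salt="demo-salt")
    print(published)
    print(exceptions)
```

Keeping rejected records in a separate exceptions list mirrors the detect, correction and exception steps named under Cleansing, so nothing is silently dropped before the published marts are built.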
10 Technology Challenges Ahead (11th Malaysia Plan)
Drivers: Technology Push, Technology Pull
NEWER Sources of Data (e.g. high-speed streams)
NEWER Channels of Consumption (e.g. omni-channel data market)
NEWER Methods of Visualization (e.g. multi-dimensional view)
NEWER Paradigms on Computing (e.g. Dockers)
New Platforms & Revisions: Mi-CLIP, Mi-Morphe, Mi-Helio, Mi-UAP, Mi-Harvester, Mi-Harmony, Mi-BIS, Mi-ARMC, Mi-Doc, Mi-Scrambler, Mi-Portal, Mi-Trust, Mi-DW, Mi-AccLib, Mi-DSS, Mi-SP (Video Analytics), Mi-Market, Mi-Trace, Mi-AccLytics, Mi-STP, Galactica, Mi-ROSS, Mi-HPDW, Mi-Target, Mi-Cloud, Mi-Mobile, Mi-MOCHA
11 Big Data Moving Forward
IoA: Internet of Anything
II: Industrial Internet
IoE: Internet of Everything
IoT: Internet of Things
12 Big Data Moving Forward
IoA: Internet of Anything
II: Industrial Internet
IoE: Internet of Everything
IoT: Internet of Things
Related areas: Big Data Processing, Software Defined Network, Mobile Systems, Wearables, Cloud Computing, Cyberphysical systems, Cyberbiological systems, Internet of Humans
13 Open Platform & BDA Middleware Architecture (architecture diagram)
Source: structured, semi-structured & unstructured sources; web & social media; RDBMS
Extraction: Mi-Clip, Mi-Harvester, Mi-Morphe, Flume, Sqoop, Kafka
Staging and Cleansing: Mi-Morphe, Mi-AccLib
Harmonization: Mi-Harmony
Anonymisation: Mi-Scramble
Model: Mi-BIS, Mi-HPDW, Mi-AccConnect
Visualisation: Mi-Helio, Mi-Portal
Analytics Tools (Machine Learning): Mi-Intelligence, R, Mahout, MLlib (Spark), Mi-NLP, Cloudera Search & Solr, Galactica Connector, Mi-Visualitics, Mi-AccStat, GIS, Mi-Target
Query and processing: Apache Drill, Spark/Shark, Hue, Pig, Hive, Impala
Security: Mi-UAP, Mi-Trust, Mi-Crypto, Sentry
Open Linked: Mi-HPDW
Management: YARN, Cloudera Manager/Falcon, ZooKeeper, Oozie
Storage: Galactica FS, HDFS, NoSQL, RDBMS, Galactica, Hadoop warehouse/mart, RDF Graph DB, files
Infrastructure: Mi-Cloud, Mi-Mocha
Legend: MIMOS Solution; 3rd Party Solution
14 MIMOS Big Data Stack With Reference to Hadoop Stack (stack diagram)
Security and Authentication: Sentry, Mi-UAP, Mi-ARMC, Mi-Trust
Visualization: Mi-Helio, Mi-Portal, Mi-BIS (Mi-AccConnect), 3rd Party Apps
Batch Query: MapReduce v2, Pig, Hive
Processing: Mi-Morphe, Morphlines, Mi-AccLib, MapReduce v2 (Accelerated ETL), HPDW Model Plugin (for Mi-Morphe v3/Pentaho)
Machine Learning: Mi-BIS (Weka), AccStats (R and Cloudera C++), MLlib (Spark), Revolution R, Weka
Real Time Query: Mi-BIS with Impala through Mi-AccConnect, Hue, Galactica, Apache Drill, Spark/Shark, HPDW-Big DB
Analytics: Simulator, Planning Tool, Predictive, Prescriptive, Prediction Algorithm, Mi-BIS (Mi-AccStats), Mi-BIS (Data Mining), Revolution R, 3rd Party GIS
Management: YARN (resource management), Big Data Orchestration Engine/Layer, ZooKeeper (configuration and synchronization), Oozie (workflow scheduler), Cloudera Manager, Management for Lustre, Sqoop, Flume
Storage Application Program Interface: Thrift, REST, Java API, Avro
Stream: Spark, Kafka, Spring XD & Storm
Storage: HDFS, HPDW-Storage, Galactica FS, NoSQL (HBase), Distributed Database (Cassandra), RDBMS (Postgres, MySQL)
Hardware: Multi & Many Cores Processors (CPU + GPU)
Search: Cloudera Search & Solr
Source types: Streaming (Twitter, logs, etc.), RDBMS, NoSQL
Legend: Complete 3rd Party; 3rd Party & MIMOS Offering; MIMOS Technologies; 3rd Party Technologies
15 Proof of Concepts: Selected Use Cases
16 Proof of Concepts: Mixed Scenario (Technology Capabilities)
17 Challenges to be Addressed During Initial Roll-Outs
18 Data Challenges (Stage 1)
Data is stored in partial & distributed locations.
Data exists in both digital and non-digital formats; some is still paper based.
Incomplete data sets (Q issues).
Cleanliness of the data: missing values (random, non-random), CR, noise.
Cleaning while maintaining integrity & value.
Extracting features in plural languages (at least English & Malay).
Structured data has longer historical value to be acquired: storage media & format for extraction and usage.
How to authenticate the key values? Where is the reference point?
As for unstructured data (e.g. social media), current technology is adequate to support pre-processing and analytics, with some local challenges.
Who is the data owner?
How to ensure the security level of the data for sharing?
PDP compliance confusion.
More to be shared by visiting MIMOS Lab.
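The incompleteness and cleanliness concerns listed above are usually measured before any cleaning starts. The following is a minimal Python sketch of such a profile, assuming records arrive as dictionaries; the `MISSING_MARKERS` set (including the Malay word "tiada") and the function names are illustrative assumptions, not part of the presentation.

```python
from collections import Counter

# Assumed placeholder strings that count as "missing"; each dataset needs its own list.
MISSING_MARKERS = {"", "na", "n/a", "null", "-", "tiada"}


def is_missing(value):
    """Treat None and common placeholder strings as missing."""
    if value is None:
        return True
    return str(value).strip().lower() in MISSING_MARKERS


def missing_profile(rows):
    """Return {field: fraction of rows where the field is missing or absent}."""
    if not rows:
        return {}
    fields = sorted({key for row in rows for key in row})
    counts = Counter()
    for row in rows:
        for field in fields:
            # row.get returns None for absent fields, which is_missing counts.
            if is_missing(row.get(field)):
                counts[field] += 1
    return {field: counts[field] / len(rows) for field in fields}


if __name__ == "__main__":
    sample = [
        {"name": "Ali", "age": "34"},
        {"name": "N/A", "age": None},
        {"name": "Mei"},
    ]
    for field, fraction in missing_profile(sample).items():
        print(f"{field}: {fraction:.0%} missing")
```

A per-field fraction like this only says how much is missing; deciding whether the gaps are random or non-random still needs a domain expert to look at which records are affected.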
19 Analytics Challenges (Stage 2)
Tools are available, but the right approach is still critical for evaluation.
Which are the best/right algorithms to use?
Can you identify the right domain expert within the organization?
Who are the local domain experts to consult on method/algorithm selection?
You may not have a data scientist in a specific government organization, so how do you form an analytics team (external + internal)?
What exactly are the data owners' business needs? Why do they need to do this? It is a headache for them; best to leave the data to rest in peace!!
Which data should be included and which excluded? What should be anonymized? There is a concern about meaning/trend extraction.
Plurality of languages & interpretation accuracy.
Semantification of language-specific analytics.
Bottlenecks must be identified, and an accelerated approach is required for the specific processing.
Agile is the best way.
20 Results Challenges (Stage 3)
Visualization of the results in a simple, actionable and communicable form.
How to handle continuously changing analytics (and results) due to:
New data inclusion
New domain expert inclusion
New additional factors to be considered
Who validates the results?
How to translate results to value for the (government) organization?
How to translate the value to actions?
How to follow up on the 2nd cycle of activities?
21 Benefiting Humanity Through Technology Thank You
The Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
More informationThe Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
More informationHadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
More informationA Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
More informationInformation Builders Mission & Value Proposition
Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns
More information#TalendSandbox for Big Data
Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND
More informationMobile Application and Public Safety System for Crime Tip-Off and crime prevention by communities
Mobile Application and Public Safety System for Crime Tip-Off and crime prevention by communities By Dr. Keeratpal Singh, Principal Engineer, MIMOS Berhad Senior Assistant Commissioner Dato' Aishah binti
More informationGAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.
More informationWhy Spark on Hadoop Matters
Why Spark on Hadoop Matters MC Srivas, CTO and Founder, MapR Technologies Apache Spark Summit - July 1, 2014 1 MapR Overview Top Ranked Exponential Growth 500+ Customers Cloud Leaders 3X bookings Q1 13
More informationBig Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08
More informationTalend Big Data. Delivering instant value from all your data. Talend 2014 1
Talend Big Data Delivering instant value from all your data Talend 2014 1 I may say that this is the greatest factor: the way in which the expedition is equipped. Roald Amundsen race to the south pole,
More informationHDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
More informationDominik Wagenknecht Accenture
Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna
More informationBig Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014
Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools
More informationHadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
More informationSOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera
SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce
More informationCollaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
More informationComprehensive Analytics on the Hortonworks Data Platform
Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page
More informationBig Data and Industrial Internet
Big Data and Industrial Internet Keijo Heljanko Department of Computer Science and Helsinki Institute for Information Technology HIIT School of Science, Aalto University keijo.heljanko@aalto.fi 16.6-2015
More informationIntel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013
Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache
More informationANALYTICS CENTER LEARNING PROGRAM
Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals
More informationGetting Started with Hadoop. Raanan Dagan Paul Tibaldi
Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop
More informationBuilding Scalable Big Data Pipelines
Building Scalable Big Data Pipelines NOSQL SEARCH ROADSHOW ZURICH Christian Gügi, Solution Architect 19.09.2013 AGENDA Opportunities & Challenges Integrating Hadoop Lambda Architecture Lambda in Practice
More informationHDP Enabling the Modern Data Architecture
HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,
More informationChukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
More informationHadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
More informationHortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015
Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 We Do Hadoop Fall 2014 Page 1 HDP delivers a comprehensive data management platform GOVERNANCE Hortonworks Data Platform
More informationUpcoming Announcements
Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within
More informationRoadmap Talend : découvrez les futures fonctionnalités de Talend
Roadmap Talend : découvrez les futures fonctionnalités de Talend Cédric Carbone Talend Connect 9 octobre 2014 Talend 2014 1 Connecting the Data-Driven Enterprise Talend 2014 2 Agenda Agenda Why a Unified
More informationInfomatics. Big-Data and Hadoop Developer Training with Oracle WDP
Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools
More informationOracle Big Data Fundamentals Ed 1 NEW
Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big
More informationNative Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
More informationBringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
More informationWWW.WIPRO.COM HADOOP VENDOR DISTRIBUTIONS THE WHY, THE WHO AND THE HOW? Guruprasad K.N. Enterprise Architect Wipro BOTWORKS
WWW.WIPRO.COM HADOOP VENDOR DISTRIBUTIONS THE WHY, THE WHO AND THE HOW? Guruprasad K.N. Enterprise Architect Wipro BOTWORKS Table of contents 01 Abstract 01 02 03 04 The Why - Need for The Who - Prominent
More informationWorkshop on Hadoop with Big Data
Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly
More informationSelf-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Data Is Doubling Every Two Years Unstructured data will account for more than 80% of the data
More informationSelf-service BI for big data applications using Apache Drill
Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Management - MCS MapR Data Platform for Hadoop and NoSQL APACHE HADOOP AND OSS ECOSYSTEM Batch
More informationDatenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
More informationReal Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationBig Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:
More informationBig Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.
Big Data Analytics 1 Priority Discussion Topics What are the most compelling business drivers behind big data analytics? Do you have or expect to have data scientists on your staff, and what will be their
More informationBig Data Course Highlights
Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like
More informationIntegrating a Big Data Platform into Government:
Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government
More informationMaking Sense of Big Data in Insurance
Making Sense of Big Data in Insurance Amir Halfon, CTO, Financial Services, MarkLogic Corporation BIG DATA?.. SLIDE: 2 The Evolution of Data Management For your application data! Application- and hardware-specific
More informationApache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com
Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture
More informationTRAINING PROGRAM ON BIGDATA/HADOOP
Course: Training on Bigdata/Hadoop with Hands-on Course Duration / Dates / Time: 4 Days / 24th - 27th June 2015 / 9:30-17:30 Hrs Venue: Eagle Photonics Pvt Ltd First Floor, Plot No 31, Sector 19C, Vashi,
More informationDell In-Memory Appliance for Cloudera Enterprise
Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert Armando_Acosta@Dell.com/
More informationCloudera Enterprise Data Hub in Telecom:
Cloudera Enterprise Data Hub in Telecom: Three Customer Case Studies Version: 103 Table of Contents Introduction 3 Cloudera Enterprise Data Hub for Telcos 4 Cloudera Enterprise Data Hub in Telecom: Customer
More informationBig Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based
More informationCommunicating with the Elephant in the Data Center
Communicating with the Elephant in the Data Center Who am I? Instructor Consultant Opensource Advocate http://www.laubersoltions.com sml@laubersolutions.com Twitter: @laubersm Freenode: laubersm Outline
More informationBig Data & Security. Aljosa Pasic 12/02/2015
Big Data & Security Aljosa Pasic 12/02/2015 Welcome to Madrid!!! Big Data AND security: what is there on our minds? Big Data tools and technologies Big Data T&T chain and security/privacy concern mappings
More informationOracle Big Data Spatial & Graph Social Network Analysis - Case Study
Oracle Big Data Spatial & Graph Social Network Analysis - Case Study Mark Rittman, CTO, Rittman Mead OTN EMEA Tour, May 2016 info@rittmanmead.com www.rittmanmead.com @rittmanmead About the Speaker Mark
More informationESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
More informationBig Data and Data Science. The globally recognised training program
Big Data and Data Science The globally recognised training program Certificate in Big Data Analytics Duration 5 days Big Data and Data Science enables value creation from data, through the use of calculative
More informationThe Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson
The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities
More informationApache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah
Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated
More informationThe Digital Enterprise Demands a Modern Integration Approach. Nada daveiga, Sr. Dir. of Technical Sales Tony LaVasseur, Territory Leader
The Digital Enterprise Demands a Modern Integration Approach Nada daveiga, Sr. Dir. of Technical Sales Tony LaVasseur, Territory Leader Yesterday s approach to data and application integration is a barrier
More informationNative Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
More informationBuilding Your Big Data Team
Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.
More informationData Analyst Program- 0 to 100
Development Data Analyst Program- 0 to 100 Master the Data Analysis tools like Pig and hive Data Science Build a recommendation engine 1 Data Analyst Program- 0 to 100 HADOOP SCHOOL OF TRAINING Basics
More informationData Lake In Action: Real-time, Closed Looped Analytics On Hadoop
1 Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop 2 Pivotal s Full Approach It s More Than Just Hadoop Pivotal Data Labs 3 Why Pivotal Exists First Movers Solve the Big Data Utility Gap
More informationBig Data Open Source Stack vs. Traditional Stack for BI and Analytics
Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at spoozhikala@stratapps.com.
More informationBIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
More informationITG Software Engineering
Introduction to Cloudera Course ID: Page 1 Last Updated 12/15/2014 Introduction to Cloudera Course : This 5 day course introduces the student to the Hadoop architecture, file system, and the Hadoop Ecosystem.
More informationDeploying Hadoop with Manager
Deploying Hadoop with Manager SUSE Big Data Made Easier Peter Linnell / Sales Engineer plinnell@suse.com Alejandro Bonilla / Sales Engineer abonilla@suse.com 2 Hadoop Core Components 3 Typical Hadoop Distribution
More informationPeers Techno log ies Pv t. L td. HADOOP
Page 1 Peers Techno log ies Pv t. L td. Course Brochure Overview Hadoop is a Open Source from Apache, which provides reliable storage and faster process by using the Hadoop distibution file system and
More informationHow to Hadoop Without the Worry: Protecting Big Data at Scale
How to Hadoop Without the Worry: Protecting Big Data at Scale SESSION ID: CDS-W06 Davi Ottenheimer Senior Director of Trust EMC Corporation @daviottenheimer Big Data Trust. Redefined Transparency Relevance
More informationInteractive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
More informationData Security in Hadoop
Data Security in Hadoop Eric Mizell Director, Solution Engineering Page 1 What is Data Security? Data Security for Hadoop allows you to administer a singular policy for authentication of users, authorize
More informationCloudera Enterprise Reference Architecture for Google Cloud Platform Deployments
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and
More informationIBM Big Data Platform
IBM Big Data Platform Turning big data into smarter decisions Stefan Söderlund. IBM kundarkitekt, Försvarsmakten Sesam vår-seminarie Big Data, Bigga byte kräver Pigga Hertz! May 16, 2013 By 2015, 80% of
More informationRed Hat Enterprise Linux is open, scalable, and flexible
CHOOSING AN ENTERPRISE PLATFORM FOR BIG DATA Red Hat Enterprise Linux is open, scalable, and flexible TECHNOLOGY OVERVIEW 10 things your operating system should deliver for big data 1) Open source project
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.
EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics
Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
Business Intelligence for Big Data
Business Intelligence for Big Data Will Gorman, Vice President, Engineering May, 2011 2010, Pentaho. All Rights Reserved. www.pentaho.com. What is BI? Business Intelligence = reports, dashboards, analysis,
Qsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
Are You Big Data Ready?
ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain
Addressing Open Source Big Data, Hadoop, and MapReduce limitations
Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?
Data Services Advisory
Data Services Advisory Modern Datastores An Introduction Created by: Strategy and Transformation Services Modified Date: 8/27/2014 Classification: DRAFT SAFE HARBOR STATEMENT This presentation contains
We are building the next generation of Big Data and Analytics solutions!
We are building the next generation of Big Data and Analytics solutions! Background 26 years Experience IT Industry 12 Years Solutions Architect - International Profile Passionate about Technology Genuine
Big Data Realities Hadoop in the Enterprise Architecture
Big Data Realities Hadoop in the Enterprise Architecture Paul Phillips Director, EMEA, Hortonworks pphillips@hortonworks.com +44 (0)777 444 3857 Hortonworks Inc. 2012 Page 1 Agenda The Growth of Enterprise
Pilot-Streaming: Design Considerations for a Stream Processing Framework for High-Performance Computing
Pilot-Streaming: Design Considerations for a Stream Processing Framework for High-Performance Computing Andre Luckow, Peter M. Kasson, Shantenu Jha STREAMING 2016, 03/23/2016 RADICAL, Rutgers, http://radical.rutgers.edu
How Cisco IT Built Big Data Platform to Transform Data Management
Cisco IT Case Study August 2013 Big Data Analytics How Cisco IT Built Big Data Platform to Transform Data Management EXECUTIVE SUMMARY CHALLENGE Unlock the business value of large data sets, including
WHITE PAPER. Four Key Pillars To A Big Data Management Solution
WHITE PAPER Four Key Pillars To A Big Data Management Solution EXECUTIVE SUMMARY... 4 1. Big Data: a Big Term... 4 EVOLVING BIG DATA USE CASES... 7 Recommendation Engines... 7 Marketing Campaign Analysis...
Big data for the Masses The Unique Challenge of Big Data Integration
Big data for the Masses The Unique Challenge of Big Data Integration White Paper Table of contents Executive Summary... 4 1. Big Data: a Big Term... 4 1.1. The Big Data... 4 1.2. The Big Technology...
Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012
Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation
Constructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
Certified Big Data and Apache Hadoop Developer VS-1221
Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification
BIG DATA IS MESSY PARTNER WITH SCALABLE
BIG DATA IS MESSY PARTNER WITH SCALABLE SCALABLE SYSTEMS HADOOP SOLUTION WHAT IS BIG DATA? Each day human beings create 2.5 quintillion bytes of data. In the last two years alone over 90% of the data on
TDWI: BUSINESS INTELLIGENCE & DATA WAREHOUSING EDUCATION EUROPE
TDWI: BUSINESS INTELLIGENCE & DATA WAREHOUSING EDUCATION EUROPE TDWI In-Depth Courses 1st Half 2016 In-Depth course: Data Visualization In-Depth course: Big Data In-Depth course: Hadoop CBIP Preparation
INDUS / AXIOMINE. Adopting Hadoop In the Enterprise Typical Enterprise Use Cases
INDUS / AXIOMINE Adopting Hadoop In the Enterprise Typical Enterprise Use Cases. Contents Executive Overview... 2 Introduction... 2 Traditional Data Processing Pipeline... 3 ETL is prevalent Large Scale
Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013
Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
The 4 Pillars of Technosoft's Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily
BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview
BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM An Overview Contents Contents... 1 BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM... 1 Program Overview... 4 Curriculum... 5 Module 1: Big Data: Hadoop
How To Create A Data Visualization With Apache Spark And Zeppelin 2.5.3.5
Big Data Visualization using Apache Spark and Zeppelin Prajod Vettiyattil, Software Architect, Wipro Agenda Big Data and Ecosystem tools Apache Spark Apache Zeppelin Data Visualization Combining Spark
Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop
Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social