Gerrit and Jenkins for Big Data Continuous Delivery. Santa Clara, CA, September 2-3
- Ronald Gallagher
- 8 years ago
Transcription
1 Gerrit and Jenkins for Big Data Continuous Delivery. Santa Clara, CA, September 2-3
2 About GerritForge
- Founded in 2009 in London
- Committed to open source
3 The Team
- Luca Milanesio
  - Co-founder and Director of GerritForge
  - Over 20 years in Agile Development and ALM
  - Open source contributor to many projects (BigData, Continuous Integration, Git/Gerrit)
- Antonios Chalkiopoulos
  - Author of Programming MapReduce with Scalding
  - Open source contributor to many BigData projects
  - Working on the "land-of-hadoop" (landoop.com)
4 The Team (2)
- Tiago Palma
  - Data Warehouse & Big Data Development
  - Senior Data Modeler
  - Big Data infrastructure specialist
- Stefano Galarraga
  - 20 years of Agile Development
  - Middleware, Big Data, Reactive Distributed Systems
  - Open source contributor to many BigData projects
5 Agenda
- Why continuous deployment on BigData?
- Our Development Lifecycle ingredients
  - Gerrit, Jenkins, Mesos, Marathon, CDH / Spark
- Topics to address in BigData development
  - Type of tests (Unit vs. Integration)
  - Testing the "real thing" (aka the Cluster)
- Our BigData virtualised infrastructure
  - Marathon, Mesos and Dockers all around
- Live (minimised) Demo
6 WHY?
- Early BigData had no process at all, so it could fail at any time
- Mature BigData is a mission-critical decision maker
- Need for more stable software-engineering methodologies:
  - Test-Driven Development (Stefano's ScaldingUnit)
  - Continuous Integration with Jenkins
  - Integration & Performance testing
  - Code review and validation
7 Code-Review BigData Lifecycle (1)
- Git used by distributed teams (UK, Israel, India)
- Topics and Code Review
- Jenkins build on every patch-set
- Commits reviewed / approved via Gerrit Submit
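The "Jenkins build on every patch-set" step can be sketched as a filter over Gerrit's event stream. This is a minimal illustration, not the Gerrit Trigger plugin's actual code: it assumes events arrive as one JSON object per line (as `gerrit stream-events` emits them), and the project, topic and ref values in the sample are made up.

```python
import json


def patchset_to_build(event_line):
    """Return build parameters for a patchset-created event, else None."""
    event = json.loads(event_line)
    if event.get("type") != "patchset-created":
        return None  # comments, merges, etc. do not start a patch-set build
    change = event["change"]
    patch_set = event["patchSet"]
    return {
        "project": change["project"],
        "branch": change["branch"],
        "topic": change.get("topic"),  # topic is optional on a change
        "ref": patch_set["ref"],       # the ref a CI job would fetch
    }


# Hypothetical sample event, trimmed to the fields used above.
sample = json.dumps({
    "type": "patchset-created",
    "change": {"project": "etl", "branch": "master", "topic": "oracle-import"},
    "patchSet": {"number": 2, "ref": "refs/changes/34/1234/2"},
})
print(patchset_to_build(sample))
```

Keeping the decision in a pure function means it can be tested without a running Gerrit server.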
8 Code-Review BigData Lifecycle (2)
9 Code-Review BigData Lifecycle (3)
- Submitting a Topic automatically does:
  - all patch-sets merged (semi-atomically)
  - trigger a longer chain of CI steps
  - automatically promote a RC if everything passes
- Jenkins automation via Gerrit Trigger Plugin
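To act on a whole topic, a CI step first needs the list of changes that belong to it. The sketch below shows one way to do that against Gerrit's REST API, under stated assumptions: the server URL and topic name are placeholders, and the response body in the demo is hand-written. It relies on two documented details: changes are queried with `/changes/?q=...`, and JSON responses start with the `)]}'` guard line that must be stripped before parsing.

```python
import json
from urllib.parse import quote

GERRIT_MAGIC_PREFIX = ")]}'"  # guard line Gerrit prepends to JSON responses


def topic_query_url(base_url, topic):
    """Build the URL listing open changes that share a topic."""
    query = quote(f"topic:{topic} status:open")
    return f"{base_url.rstrip('/')}/changes/?q={query}"


def parse_gerrit_json(body):
    """Strip the guard prefix and decode the JSON payload."""
    if body.startswith(GERRIT_MAGIC_PREFIX):
        body = body[len(GERRIT_MAGIC_PREFIX):]
    return json.loads(body)


def change_numbers(body):
    """Return the sorted change numbers found in a /changes/ response."""
    return sorted(change["_number"] for change in parse_gerrit_json(body))


# Hand-written response body standing in for a real server reply.
fake_body = ")]}'\n" + json.dumps([{"_number": 1235}, {"_number": 1234}])
print(topic_query_url("https://gerrit.example.com", "oracle-import"))
print(change_numbers(fake_body))
```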
10 Ingredients: Gerrit
- Git-based Code Review system
- Pre-commit review
- Allows multiple validation steps (pipeline)
- Validation + Integration flags
11 Ingredients: Jenkins
- Plugins:
  - Gerrit trigger
  - Docker build step
  - Post-build script plugin
12 Fitting CDH Into this Picture
- Integration Test
  - Running integration tests in a CDH-enabled Docker container
  - Hadoop/local and Spark/standalone are not enough
  - Need to test class serialisation
  - Validate packaged fat-jars (library conflicts with CDH)
  - Performance on a real cluster
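One cheap pre-check for the fat-jar conflict problem is to look for classes inside the jar that the cluster already provides. The following is only a sketch of that idea, not the presenters' validation step: the package prefixes are an assumed list, and the jar in the demo is built in memory with made-up class names.

```python
import io
import zipfile

# Assumed list of packages the CDH cluster supplies at runtime.
PROVIDED_PREFIXES = ("org/apache/hadoop/", "org/apache/spark/")


def bundled_provided_classes(jar_file, prefixes=PROVIDED_PREFIXES):
    """List .class entries in a jar that fall under cluster-provided packages."""
    with zipfile.ZipFile(jar_file) as jar:
        return sorted(
            name for name in jar.namelist()
            if name.endswith(".class") and name.startswith(prefixes)
        )


# Build a tiny in-memory jar (a jar is a zip archive) for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("com/example/etl/Job.class", b"")
    jar.writestr("org/apache/hadoop/fs/Path.class", b"")
buffer.seek(0)
print(bundled_provided_classes(buffer))
```

A non-empty result suggests a dependency that should have been excluded from the fat-jar; it does not replace running the job inside the CDH container.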
13 Fitting CDH Into this Picture
- Acceptance / performance test with short-lived CDHs
- Solution: Mesos, Marathon and Docker:
  - Ephemeral clusters with defined capacity
  - Automatic cluster-config
  - All controlled via Docker/Mesos
14 Mesos + Marathon
- Apache Mesos
  - Abstracts CPU, memory, storage, other compute resources away from machines
- Marathon Framework
  - Runs on top of Mesos
  - Guarantees that long-running applications never stop
  - REST API for managing and scaling services
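As an illustration of the Marathon REST API, the sketch below builds the JSON body for creating a Docker-based app (sent with `POST /v2/apps`) and the request for scaling it (`PUT /v2/apps/{appId}`). The app id, image name and resource sizes are placeholders, and no HTTP call is made; only the payloads are constructed.

```python
import json


def marathon_app(app_id, image, instances, cpus=1.0, mem_mb=2048):
    """Build an app definition for POST /v2/apps (values are illustrative)."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem_mb,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "BRIDGE"},
        },
    }


def scale_request(app_id, instances):
    """Return (method, path, body) for changing the instance count of an app."""
    return "PUT", f"/v2/apps/{app_id.strip('/')}", {"instances": instances}


# Hypothetical app id and private-registry image name.
app = marathon_app("/cdh/agent", "registry.example.com/cdh-agent:latest", 3)
print(json.dumps(app, indent=2))
print(scale_request("/cdh/agent", 5))
```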
15 CDH Components
- CDH distribution
- Apache Spark
- Hadoop HDFS
- YARN
16 Integration Test Flow on CDH Cluster
- Components: Jenkins Master, Marathon, Mesos Master, Mesos Slave (on the Slave Host), Docker, Private Docker Registry
- Flow:
  - POST to Marathon REST API to start 1 Docker container with Cloudera Manager and N Docker containers with Cloudera agents
  - Marathon Framework receives resource offers from Mesos Master and submits the tasks
  - The task is sent to the Mesos Slave
  - Docker image is fetched from Docker registry if not present in Slave host
  - Mesos slave starts the Docker container
  - Waiting for Dockers, until Dockers UP
  - Install Cloudera packages via Cloudera Manager API using Python
  - Deploy the ETL, run the ETL and the Integration Tests
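The "waiting for Dockers" phase of this flow can be sketched as a polling loop. This is an assumption-laden outline rather than the demo's script: `fetch_app` is a hypothetical callable standing in for `GET /v2/apps/{appId}`, and the loop assumes the response carries `instances` and `tasksRunning` under an `app` key. The app ids are invented.

```python
import time


def wait_until_running(fetch_app, app_ids, timeout_s=600, poll_s=5,
                       sleep=time.sleep):
    """Poll until every app has all requested tasks running, or time out."""
    deadline = time.monotonic() + timeout_s
    pending = set(app_ids)
    while pending:
        for app_id in sorted(pending):
            app = fetch_app(app_id)["app"]
            if app["tasksRunning"] >= app["instances"]:
                pending.discard(app_id)  # this container group is up
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
    return True


# Fake Marathon: the agents only report as running from the third poll on.
polls = {"count": 0}


def fake_fetch(app_id):
    polls["count"] += 1
    running = 3 if polls["count"] >= 3 else 0
    return {"app": {"id": app_id, "instances": 3, "tasksRunning": running}}


print(wait_until_running(fake_fetch, ["/cdh/agent"], poll_s=0,
                         sleep=lambda seconds: None))
```

Injecting `fetch_app` and `sleep` keeps the loop testable without a cluster; a Jenkins step would pass a function that performs the real HTTP request.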
17 Unit and Integration Tests sample
- Test project:
  - Test Spark project
  - ETL from Oracle to HDFS
- Unit-test directly on Spark logic
- Integration tests for every patch-set:
  - VERY small dataset just for this demo
  - CDH and Oracle Docker Images
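Unit-testing the Spark logic directly is easiest when each transformation is a plain function that a job would later hand to a map step. The sketch below shows that pattern with an invented row layout (id, name, amount); it does not reproduce the demo project's schema or code, and it runs without a Spark installation.

```python
def to_hdfs_record(row):
    """Turn one source row (id, name, amount) into a tab-separated line.

    The three-column layout is hypothetical and used only for illustration.
    """
    customer_id, name, amount = row
    return f"{int(customer_id)}\t{name.strip().upper()}\t{float(amount):.2f}"


def transform(rows):
    """Apply the row logic to any iterable; a Spark job would map it instead."""
    return [to_hdfs_record(row) for row in rows]


# A plain list stands in for the tiny demo dataset.
print(transform([("7", " alice ", "12.5"), ("8", "bob", "3")]))
```

Because `to_hdfs_record` has no dependency on a cluster, the unit test is fast enough to run on every patch-set, leaving serialisation and packaging problems to the integration stage.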
18 Unit and Integration Tests
- Diagram: Jenkins build job init; Hadoop pseudo-distributed mode; Spark standalone; init/read HDFS; submit job
19 DEMO
- Small-scale BigData Delivery Pipeline
20 References
- Replay, share or extend the continuous delivery pipeline demo:
- Blog: @GerritForge
- Learn Gerrit Code Review:
- Get in touch with GerritForge: mailto:
21 Thanks to our Sponsors!