Write Once, Run Anywhere Pat McDonough
- Imogen Walters
- 8 years ago
Transcription
1 Write Once, Run Anywhere Pat McDonough
3 Write Once, Run Anywhere You Might Have Heard This Before!
5 Java, According to Wikipedia Java is a computer programming language specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere" (WORA)
7 Java & WORA in the First Decade: Java Client Applications. Apps with GUIs (AWT or Swing) could be deployed to any OS with a JVM. Neat! But not all that useful: people don't want non-native GUIs.
11 Java & WORA in the First Decade: Applets. A way to deliver rich GUIs to many different platforms through the browser. [Insert Ugly Applet Here] Neat! But it basically ended at producing many gimmicky website animations.
12 Java & WORA in the First Decade: Back-end Applications! Windows desktop for an IDE, Unix server for production. Neat! And actually useful too.
13 Java & WORA in the First Decade: Back-end. Java starts to formalize around standards > J2EE: core libraries, deployment formats, etc. Vendors offer J2EE app servers. Ironically, this immediately led to no more WORA: specific app servers required a specific SDK or even a specific IDE.
14 Java & WORA in the First Decade: Fixing WORA on the back-end. Fall back to the least common denominator > Servlets (usually via Tomcat). Spring comes about to dominate as the SDK of choice for Java back-end applications, specifically designed to have as few implementation dependencies as possible.
16 So yes, you've heard this before. Which examples apply to the state of the Big Data ecosystem?
17 Important Changes Since Then: Vendor Standards Open Source. Data has overwhelmed us. Distributed systems are the new standard (specifically, data-parallel systems).
18 Big Data Platforms Are Everywhere Now. But Where Are the Big Data Applications? Big Data applications don't exist very far beyond connecting ODBC/JDBC or simple ETL integrations. Why?! Too many disparate systems to piece together. Complicated matrix of compile-time and runtime dependencies across distributions, i.e. each distribution effectively has its own SDK.
20 The Big Data Ecosystem Needs a Common SDK: Apache Spark is the answer.
23 Spark: An SDK for Big Data Applications. [Diagram: SQL, MLlib, Streaming, GraphX on top of Core] Unified system with libraries to build a complete solution! Full-featured programming environment. Single, consistent interface for developers to write against! Runtimes available on several platforms.
25 Develop Big Data Applications. [Diagram: Your Application (Python/Scala/Java, Dependencies) on top of the Spark APIs (SQL, MLlib, Streaming, GraphX, Core)] Develop applications using your preferred language, using existing libraries, using Spark's public APIs (SparkContext, RDDs).
27 Work With Data. [Diagram: Your Application (Python/Scala/Java, Dependencies); Spark APIs (SQL, MLlib, Streaming, GraphX, Core); Data Access & Scheduling; Data (HDFS*, Local, S3, JDBC, Cassandra)] Spark internals care for scheduling data operations.
29 Run Your Applications. [Diagram: Your Application (Python/Scala/Java, Dependencies); SQL, MLlib, Streaming, GraphX, Core; Clusters (YARN, Mesos, Spark Standalone)] Submit your application and the Spark runtime to a cluster manager.
31 The Complete Picture. [Diagram: Your Application (Python/Scala/Java, Dependencies); SQL, MLlib, Streaming, GraphX, Core; Clusters (YARN, Mesos, Spark Standalone); Data (HDFS*, Local, S3, JDBC, Cassandra)] Spark abstracts runtime dependencies from developers.
33 How Spark Handles Hadoop Dependencies. The Spark library is compiled with compatibility to a specific Hadoop version. At runtime, Spark uses reflection to find any Hadoop classes it needs. [Diagram: SQL, MLlib, Streaming, GraphX on top of Core and the Hadoop Client] Examples:
    # Apache Hadoop 2.2.X
    mvn -Pyarn -Phadoop-2.2 \
      -Dhadoop.version=2.2.0 \
      -DskipTests clean package
    # CDH with MapReduce v1
    mvn -Dhadoop.version= mr1-cdh -DskipTests \
      clean package
34 Spark Support Included on Big Data Platforms. While this build process is very easy, it's even easier to have the runtime pre-built. Platform support also indicates stronger integration testing, support, and integrated management tools. [Diagram: SQL, MLlib, Streaming, GraphX on top of Core and the Hadoop Client]
35 Spark 1.0
36 Spark-Submit. Spark-submit provides a consistent way to launch jobs regardless of platform. Includes an important clean-up to make configurations more consistent.
    # Run on a Spark standalone cluster
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark:// :7077 \
      --executor-memory 20G \
      --total-executor-cores 100 \
      /path/to/examples.jar \
      1000
    # Run on a YARN cluster
    export HADOOP_CONF_DIR=XXX
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master yarn-cluster \
      --executor-memory 20G \
      --num-executors 50 \
      /path/to/examples.jar \
      1000
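As a rough sketch of what "consistent regardless of platform" can look like in practice, the shell function below assembles a spark-submit command line in which only the --master value and the platform-specific sizing flag change between cluster managers. The helper name build_submit_cmd and the placeholder master address spark://HOST:7077 are assumptions made for illustration and are not part of Spark or of the talk; the flags and jar path are those shown on the slide, and SparkPi is the example class that ships with Spark. The function only prints the command, so it can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): prints a spark-submit command line
# for the given platform. Only the master and the sizing flag differ.
build_submit_cmd() {
  platform="$1"
  case "$platform" in
    standalone)
      # spark://HOST:7077 is a placeholder; substitute your master's address.
      master="spark://HOST:7077"
      sizing="--total-executor-cores 100"
      ;;
    yarn)
      # The slide exports HADOOP_CONF_DIR before submitting to YARN.
      master="yarn-cluster"
      sizing="--num-executors 50"
      ;;
    *)
      echo "unknown platform: $platform" >&2
      return 1
      ;;
  esac
  # Everything except $master and $sizing is identical across platforms.
  echo "./bin/spark-submit --class org.apache.spark.examples.SparkPi --master $master --executor-memory 20G $sizing /path/to/examples.jar 1000"
}

build_submit_cmd standalone
build_submit_cmd yarn
```

The application class, jar, memory setting, and arguments stay the same in both printed commands, which is the point the slide makes: the application does not change, only the target it is submitted to.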
37 Spark SQL. We actually wrestled with the name a bit because it's not only about SQL. SQL is not the only developer interface: there is also a DSL. Spark SQL introduces SchemaRDDs and an optimizer (Catalyst). This provides a deeper integration for any structured data.
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext._
    val people: RDD[Person] = ... // An RDD of case class objects
    // The following is the same as
    // 'SELECT name FROM people WHERE age >= 10 AND age <= 19'
    val teenagers = people.where('age >= 10).where('age <= 19).select('name)
38 Databricks Is Committed to Growing Apache Spark's Developer Ecosystem: Developer Training, Online Materials, Free Resources; Strong Commitment to Open Source; Certification Programs.
39 We're Hiring! Evangelists, Trainers, Solutions Architects, Software Engineers.