Use SAS to Process and Analyze Your Data in Hadoop



Make connections, share ideas, be inspired

Mikael Turvall

Architecture

[Architecture diagram. Labels, in slide order: SAS Visual Analytics and SAS Visual Statistics; SAS In-Memory Statistics for Hadoop; blade environment; MPP datastore; web-based client (SAS VA/VS, SAS Studio); metadata server; (optional); mid-tier; workspace server; in-memory store (SAS LASR Analytic Server); Hadoop (Cloudera, Hortonworks); SAS Embedded Process. Data sources: Hadoop, Teradata, Pivotal, Oracle, RDBMS, nonrelational, click stream, PC files, other.]

Why?
- Hadoop as a platform for data
- Hadoop as the core of the next-generation analytics platform

[Diagram: analytics lifecycle. Identify / formulate problem; data preparation; data exploration; transform & select; build model; validate model; deploy model; evaluate / monitor results.]

From data to decisions

Manage data: SAS/ACCESS, SAS Data Management, SAS Federation Server, SAS Data Loader for Hadoop
Explore data: SAS Visual Analytics, SAS In-Memory Statistics for Hadoop
Develop models: SAS HPA products, SAS Visual Statistics, SAS In-Memory Statistics for Hadoop, SAS Enterprise Miner
Deploy & monitor: SAS Scoring Accelerator for Hadoop, SAS Code Accelerator for Hadoop

Get started quickly

Possibilities
- Transparent access to Hadoop tables through ordinary SAS libraries
- You can program in SAS SQL and the SAS DATA step as usual
- You can manage Hadoop from SAS: native HDFS commands, MapReduce, Pig, and HiveQL

Benefits
- You do not need to be an expert in Hadoop-specific syntax
- Switching to Hadoop is as easy as changing a libname
- Existing SAS programs, reports, etc. can be reused
- Many different ways of accessing data give IT different options for using the capacity

YOU CAN START TODAY
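The point about managing Hadoop from SAS with native HDFS commands can be sketched with PROC HADOOP. This is a minimal sketch, not taken from the slides: the configuration file path is borrowed from the FILENAME example later in the deck, while the HDFS directory, the local file, and the credentials are hypothetical placeholders.

```sas
/* Fileref pointing at the Hadoop cluster configuration file.            */
/* The path is borrowed from the FILENAME slide; adjust to your site.    */
filename cfg "c:\users\hadoop_config.xml";

/* Hypothetical credentials, directory, and local file for illustration. */
proc hadoop options=cfg username="hadoop" password="hadoop" verbose;
    /* Create a directory in HDFS */
    hdfs mkdir="/user/hadoop/demo";
    /* Copy a local file into that directory */
    hdfs copyfromlocal="c:\data\input.txt"
         out="/user/hadoop/demo/input.txt";
run;
```

The same procedure also has MAPREDUCE and PIG statements for submitting jobs; their options depend on the jar files and scripts available at your site, so they are left out here.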

Where do I get Hadoop?

SAS/ACCESS to Hadoop

[Diagram: the SAS server talks to Hadoop through HiveQL.]

Move parts of the job into Hadoop.
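One way to move part of a job into Hadoop is explicit SQL pass-through, where the inner query is sent to Hive as HiveQL and only the result comes back to SAS. The following is a sketch under assumptions: the connection values mirror the LIBNAME example in this deck, and the table sales with its column region is hypothetical.

```sas
proc sql;
    /* Connection values mirror the deck's LIBNAME example. */
    connect to hadoop (server="sascldserv02" port=10000
                       user=hadoop password="hadoop");

    /* The inner query runs in Hive; the table and column are hypothetical. */
    create table work.region_counts as
    select * from connection to hadoop
        ( select region, count(*) as n
          from sales
          group by region );

    disconnect from hadoop;
quit;
```

Only the aggregated rows are transferred to the SAS server, which is the point of pushing the work down.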

Getting started with Hadoop

libname elefant hadoop PORT=10000 SERVER=sascldserv02 USER=hadoop PASSWORD="hadoop";
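Once the libref is assigned, Hive tables can be used like any other SAS data set. A minimal sketch, assuming the libref elefant from the LIBNAME statement on this slide; the table sales and its columns region and amount are hypothetical.

```sas
/* Assumes libref ELEFANT is already assigned as on the slide.      */
/* ELEFANT.SALES and its columns are hypothetical placeholders.     */

/* PROC SQL against a Hive table; SAS/ACCESS can pass eligible      */
/* parts of the query to Hive.                                      */
proc sql;
    create table work.region_totals as
    select region, sum(amount) as total_amount
    from elefant.sales
    group by region;
quit;

/* DATA step reading the same table with a WHERE= filter */
data work.large_orders;
    set elefant.sales(where=(amount > 1000));
run;
```

Pointing the same program at another data source should then only require changing the LIBNAME statement.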

Hadoop FILENAME statement

Define a fileref:

FILENAME hdpfile1 hadoop "/user/hadoop/gutenberg/pg20417.txt"
    cfg="c:\users\hadoop_config.xml" user='hadoop';

Use it as usual:

DATA my_analysis_data;
    INFILE hdpfile1;
    INPUT;
RUN;

Note: do not move ALL the data over into a SAS table.

Hadoop File Reader

SAS 9.4 can read non-Hive files as tables.

File formats:
- Delimited
- CSV
- XML
- JSON (experimental)
- Binary files

Multiple files in one directory are supported.

Hadoop File Reader

Define a libname:

libname HDP hadoop user=hadoop pw=hadoop
    config='/home/sasinst/hadoop_config.xml'
    hdfs_tempdir='/user/hadoop/tmp'
    hdfs_metadir='/user/hadoop/metadata'
    hdfs_permdir='/user/hadoop/dataload';

Specify the file format:

/* The separator character was lost in the source slide; '|' is an assumption based on the file name pipedata_dept.txt. */
proc hdmd name=hdp.pipedata_dept
    format=delimited sep='|'
    DATA_FILE='pipedata_dept.txt';
    COLUMN col1 int;
    COLUMN col2 char(15);
run;

Use it as usual:

proc print data=hdp.pipedata_dept;
run;

DI Studio
- Access data in Hadoop
- Transform data inside Hadoop using HiveQL
- Create new data in Hadoop

SPDE: traditional file system

libname spdlib spde '/path';
proc print data=spdlib.mytab;
run;

[Diagram: SPDE opens, reads, and closes each file of the table on disk: mytab.mdf, mytab.dpf1, mytab.dpf2.]

SPDE: Hadoop HDFS

libname spdlib spde '/path' hdfshost=default;
proc print data=spdlib.mytab;
run;

[Diagram: SPDE opens, reads, and closes mytab.mdf, mytab.dpf1, and mytab.dpf2 through the HDFS client. The client gets the data block locations from the NameNode and then gets the data from the DataNodes that hold the blocks.]

Next step: SAS jobs in Hadoop

[Diagram: SAS DATA step and DS2 code moves from the SAS server into Hadoop.]

- SAS Data Loader for Hadoop
- SAS Code Accelerator for Hadoop
- SAS Scoring Accelerator for Hadoop
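Running SAS code inside Hadoop with the Code Accelerator is typically done with a DS2 thread program. The sketch below is illustrative only: it assumes that SAS Code Accelerator for Hadoop and the SAS Embedded Process are installed on the cluster, the connection values mirror the earlier LIBNAME example, and the tables and columns (sales, amount, sales_with_tax) and the 1.25 factor are hypothetical.

```sas
/* Connection values mirror the deck's LIBNAME example. */
libname hdp hadoop server="sascldserv02" port=10000
        user=hadoop password="hadoop";

/* DS2ACCEL=YES asks SAS to run the thread program in Hadoop. */
proc ds2 ds2accel=yes;

    /* Thread program: the row-level logic that runs in parallel. */
    thread compute_tax / overwrite=yes;
        dcl double amount_with_tax;
        method run();
            set hdp.sales;                    /* hypothetical input table */
            amount_with_tax = amount * 1.25;  /* hypothetical calculation */
        end;
    endthread;
    run;

    /* Data program: collects the thread output into a Hadoop table. */
    data hdp.sales_with_tax (overwrite=yes);
        dcl thread compute_tax t;
        method run();
            set from t;
        end;
    enddata;
    run;
quit;
```

Because both the input and the output table live in Hadoop, the rows do not need to pass through the SAS server.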

SAS Data Loader for Hadoop

[Screenshot: the directive start page, "What directive do you want to perform?"]

Directives shown:
- Saved Directives: Open a previously created directive to run, view, or edit.
- Schedule a Directive to Run: Schedule a directive to run at specified dates and times.
- Chain Directives Together: Run a number of directives in a specific order.
- Copy Data for Visualization: Copy data from Hadoop and load it into LASR for visualization. Existing data in the target table will be replaced.
- Copy Data to Hadoop: Copy data from a source and load it into Hadoop. Existing data in the target file will be replaced.
- Join Tables in Hadoop: Create a table in Hadoop from multiple tables.
- Pivot a Table in Hadoop: Transpose the columns of a table in Hadoop.
- Transform Data in Hadoop: Transform the data in a Hadoop data file.
- Verify Mailing Address: Check the validity of the mailing address data in a table.
- Profile Data: Create a report profiling the data in a table.
- Generate Business Rules: Analyze data in a table and generate business rules.
- Send Data for Remediation: Select data to send to the remediation queue for further action.

mikael.turvall@sas.com