Universal PMML Plug-in for EMC Greenplum Database
Zementis, Inc.
USA: 6125 Cornerstone Court East, Suite #250, San Diego, CA T +1(619)
Asia: 19/F, Unit A, Ho Lee Commercial Bldg, D'Aguilar Street, Central, Hong Kong

Delivering Massively Parallel Predictions

As advanced analytics becomes pervasive across the enterprise to drive better business decisions, efficient execution of predictive models becomes paramount. Zementis and Greenplum have joined forces to help companies bring predictive models into their database and score huge amounts of data in place and in parallel. This joint product combines the Zementis Universal PMML Plug-in for execution of predictive models with the power and scale of the EMC Greenplum Database. The result is an end-to-end solution that enhances Greenplum's large-scale analytics processing capabilities with scoring of standards-based predictive models on a massively parallel architecture. By embedding predictive analytics directly in the database, this solution minimizes the movement of data and enables the efficient in-place processing of very large data sets.

In this whitepaper, we demonstrate how to deploy and execute predictive models from several statistical tools, including IBM SPSS and the open source R program.

Predictive Model Markup Language (PMML)

As the de facto standard for data mining models, PMML provides tremendous benefits for business, IT, and the data mining industry in general. Developed by the Data Mining Group (DMG), an independent, vendor-led consortium, PMML increases business agility by eliminating the need for proprietary solutions or custom code development. Today, it is supported by all the major data mining tools, commercial and open source. As an open standard, it enables project stakeholders to standardize on one common representation for data mining models. It practically eliminates the barriers and gaps between development and production deployment of predictive analytics. In effect, it minimizes the complexity, cost, and time needed to turn predictive models into operable IT and business assets.

Because PMML is the lingua franca for predictive analytics, data mining models can be easily exchanged between PMML-compliant applications. In this way, a model may be built in one statistical tool and easily moved to another for production deployment or visualization. PMML also serves as a bridge between all the teams involved in the data mining process inside a company, since it can be used to disseminate knowledge and best practices. In a world in which sensors and data gathering are becoming more and more pervasive, predictive analytics and standards such as PMML make it possible for organizations to benefit from smart solutions that will truly revolutionize their business.

Zementis Universal PMML Plug-in

The Universal PMML Plug-in (Figure 1) builds on the heritage of Zementis's flagship product, the ADAPA Decision Engine, a web services-based framework for the execution of predictive analytics and rules, available onsite or as a cloud computing platform. The Universal PMML Plug-in is a highly optimized, in-database scoring engine for predictive models that fully supports the PMML standard. With PMML, the Plug-in delivers a wide range of predictive analytics for high-performance scoring. It shortens time to market for predictive models and empowers users through instant deployment of predictive models.

Figure 1: The Universal PMML Plug-in. Data in, predictions out.

In the context of in-database scoring, it allows us to execute predictive models from all major commercial and open source data mining tools within the database, minimizing data movement and maximizing processing efficiency. Very large datasets can be easily scored against a variety of predictive models, including neural network models, regression models, support vector machines, and decision trees (as well as a host of other advanced analytic techniques).

Besides the models themselves, the Universal PMML Plug-in also supports data pre- and post-processing. That is because the latest version of the PMML standard is loaded with built-in functions which allow for arithmetic calculations, string manipulations, and logic operations. An entire predictive solution, one that operates from raw data all the way to predictions, can be represented in PMML and directly used in the Universal PMML Plug-in for data scoring.

The Universal PMML Plug-in supports not only the latest version of PMML but also older versions. In fact, it is version agnostic, since it incorporates a converter which automatically converts older versions of PMML to the newest one.

EMC Greenplum Database Architecture

The EMC Greenplum Database utilizes a shared-nothing MPP (massively parallel processing) architecture that has been designed from the ground up for BI and analytical processing using commodity hardware. In this architecture, data is automatically partitioned across multiple 'segment' servers, and each segment owns and manages a distinct portion of the overall data. All communication is via a network interconnect; there is no disk-level sharing or contention to be concerned with (i.e., it is a 'shared-nothing' architecture).

Most of today's general-purpose relational database management systems (e.g., Oracle, Microsoft SQL Server) were originally designed for Online Transaction Processing (OLTP) applications. These databases utilize 'shared-disk' or 'shared-everything' architectures that are optimized for high transaction rates at the expense of individual query performance and parallelism.

Greenplum's shared-nothing MPP architecture (Figure 2) provides every segment with a dedicated, independent high-bandwidth channel to its disk. The segment servers are able to process every query in a fully parallel manner, use all disk connections simultaneously, and efficiently flow data between segments as query plans dictate. The degree of parallelism and overall scalability that this allows far exceeds that of general-purpose database systems.

Figure 2: Greenplum's shared-nothing MPP architecture.

Universal PMML Plug-in for the EMC Greenplum Database

The Universal PMML Plug-in for the EMC Greenplum Database enables execution of standards-based predictive analytics directly within the Greenplum Database. It seamlessly embeds the Universal PMML Plug-in into Greenplum's shared-nothing, massively parallel processing (MPP) architecture. The Universal Plug-in's own shared-nothing design philosophy and replication flexibility fit like a glove into multi-server environments. With Greenplum, each individual server (with a dedicated, independent, high-bandwidth channel connection to local disks) houses a separate Universal PMML Plug-in instance that can take full advantage of these local resources (Figure 3). The net result is the ability to leverage the power of standards-based predictive analytics on a massive scale, right where the data resides.

The EMC Greenplum PMML Plug-in delivers high-performance model execution, and it does so in an easy and seamless manner. With a couple of simple steps, PMML models are distributed to all segments of the Greenplum installation and are made available for execution. Each model is presented as a separate SQL function that can be used in any query. The name, input parameters, and outputs of each function match the name, input fields, and output fields of the model as defined in the corresponding PMML file. This way, scoring a data set with one or more models becomes as simple as writing a SQL statement on that data set. Predictions (scores, probabilities, categories, clusters, etc.) can be just as easily written back to the database, become part of a report, or be passed on to an application.

Figure 3: Each individual server houses a separate Universal PMML Plug-in instance.

In addition, the Universal PMML Plug-in includes the popular Zementis PMML Converter. This means that it accepts PMML models of all versions (2.0, 2.1, 3.0, 3.1, 3.2, and 4.0) generated by any of the major commercial and open source mining tools.

Example: Use IBM SPSS and R Models in Greenplum

The Universal PMML Plug-in for the EMC Greenplum Database ships with several sample PMML models. A number of these predictive models were created with the well-known Elnino data set. This data set contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The data is expected to aid in the understanding and prediction of El Niño/Southern Oscillation (ENSO) cycles. Here, we discuss two of these models:

- A neural network model built in IBM SPSS Statistics;
- A linear regression model built in R.
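
Before turning to the sample models, the per-segment execution described above (each segment owns part of the data and scores it with its own plug-in instance) can be pictured with a small simulation. This is only a sketch under stated assumptions: `NUM_SEGMENTS`, `segment_for`, and `toy_model` are hypothetical names invented for illustration, the CRC-based placement merely stands in for whatever distribution policy the database applies, and Python threads are used to show the structure of the computation rather than real MPP performance.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_SEGMENTS = 4  # assumed number of segments, for illustration only


def segment_for(key):
    """Pick the segment that owns a row (a stand-in for hash distribution)."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_SEGMENTS


def toy_model(latitude, humidity):
    """Hypothetical stand-in for a deployed model; not a real ElNino model."""
    return 25.0 + 0.1 * latitude - 0.05 * humidity


def score_segment(rows):
    """Score only the rows stored on one segment, using that segment's model copy."""
    return [(key, toy_model(lat, hum)) for key, lat, hum in rows]


def score_in_parallel(rows):
    """Partition rows across segments, score each partition independently, merge."""
    partitions = [[] for _ in range(NUM_SEGMENTS)]
    for row in rows:
        partitions[segment_for(row[0])].append(row)
    with ThreadPoolExecutor(max_workers=NUM_SEGMENTS) as pool:
        scored = list(pool.map(score_segment, partitions))
    # No segment needs another segment's rows, so merging is a plain union.
    return dict(pair for part in scored for pair in part)


# Rows are (key, latitude, humidity) tuples with made-up values.
rows = [(i, float(i % 10), 70.0 + i) for i in range(20)]
predictions = score_in_parallel(rows)
```

The point of the sketch is that scoring a row depends only on that row and on a local copy of the model, which is why the work divides cleanly across segments without any data exchange between them.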

After being built, all models were directly exported into PMML, since IBM SPSS and R provide comprehensive support for the PMML standard. The steps to install and use these models in Greenplum using the PMML Plug-in are:

1. Prepare and copy PMML files into the Greenplum segments
2. Run the automatically generated script to define the corresponding SQL functions
3. Run queries using the new SQL functions

Each step is described in detail below.

Prepare and Copy PMML Files

In the first step, a script needs to be run to validate the provided PMML files, copy them into the Greenplum segments, and generate a SQL script containing the function definitions for all the provided models. Below we present an excerpt from the SQL script generated for the two sample models.

    CREATE FUNCTION SPSS_Neural_Network_ElNino(float8,float8,float8,float8,float8,float8) RETURNS float8 AS
    CREATE FUNCTION R_LinearRegression_ElNino(float8,float8,float8,float8,float8,float8) RETURNS float8 AS

To put these definitions in context, below we present the code for the PMML data dictionary and mining schema for the IBM SPSS Neural Network model as listed in the corresponding PMML file.

    <DataDictionary numberOfFields="7">
      <DataField name="humidity" optype="continuous" dataType="double"/>
      <DataField name="latitude" optype="continuous" dataType="double"/>
      <DataField name="longitude" optype="continuous" dataType="double"/>
      <DataField name="mer_winds" optype="continuous" dataType="double"/>
      <DataField name="s_s_temp" optype="continuous" dataType="double"/>
      <DataField name="zon_winds" optype="continuous" dataType="double"/>
      <DataField name="airtemp" optype="continuous" dataType="double"/>
    </DataDictionary>
    <NeuralNetwork functionName="regression" activationFunction="logistic" modelName="spss Neural Network - ElNino">
      <MiningSchema>
        <MiningField name="humidity" usageType="active" optype="continuous"/>
        <MiningField name="latitude" usageType="active" optype="continuous"/>
        <MiningField name="longitude" usageType="active" optype="continuous"/>
        <MiningField name="mer_winds" usageType="active" optype="continuous"/>
        <MiningField name="s_s_temp" usageType="active" optype="continuous"/>
        <MiningField name="zon_winds" usageType="active" optype="continuous"/>
        <MiningField name="airtemp" usageType="predicted" optype="continuous"/>
      </MiningSchema>

In the SQL script, each model is presented as a function with six numeric parameters; both models work on the same input data and return one numeric value. The name of the SQL function is created from the name of the model (SPSS_Neural_Network_ElNino). The six numeric parameters correspond to the six input (active mining) fields of type double defined in the PMML file (humidity, latitude, longitude, mer_winds, s_s_temp, and zon_winds). Finally, the numeric return value of the SQL function reflects the predicted output field of type double (airtemp).

Run SQL Script to Create SQL Functions

The second step is to run the generated SQL script to create the new functions. After the new functions are created, the predictive models are ready to be used in SQL queries like any other built-in or custom function.

Execute Queries to Score Data

With the installation steps completed, the predictive models can be easily used in SQL queries. Below is an example of such a query:

    SELECT buoy_day_id,
           SPSS_Neural_Network_ElNino(latitude, longitude, zon_winds, mer_winds, humidity, s_s_temp) AS airtemp
    FROM elnino_input

Getting predictions from the two models at the same time would be just as easy:

    SELECT buoy_day_id,
           R_Linear_Regression_ElNino(latitude, longitude, zon_winds, mer_winds, humidity, s_s_temp) AS airtemp_r,
           SPSS_Neural_Network_ElNino(latitude, longitude, zon_winds, mer_winds, humidity, s_s_temp) AS airtemp_nn
    FROM elnino_input

Advantages of the Universal PMML Plug-in for the EMC Greenplum Database

Zementis and EMC Greenplum bring together two essential technologies, offering the best combination of open standards and scalability for the in-database application of predictive analytics. The Universal PMML Plug-in delivers instant and scalable scoring for big data while retaining compatibility with most major data mining tools through the PMML standard.
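
To make the earlier mapping between a PMML mining schema and its SQL function more concrete, the following sketch derives a function signature from a trimmed copy of the excerpt shown above. It is an illustration, not the Plug-in's actual generator: the helper name `describe_function`, the rule of replacing non-alphanumeric runs in the model name with underscores, and the use of mining-schema order for the parameters are all assumptions, and the text does not state which parameter order the generated functions really expect. The excerpt is also closed and stripped of namespaces here so that it parses on its own.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the mining schema excerpt; the closing tag is added so the
# fragment is well-formed. A real PMML file would also carry an XML namespace.
PMML_EXCERPT = """
<NeuralNetwork functionName="regression" modelName="spss Neural Network - ElNino">
  <MiningSchema>
    <MiningField name="humidity" usageType="active"/>
    <MiningField name="latitude" usageType="active"/>
    <MiningField name="longitude" usageType="active"/>
    <MiningField name="mer_winds" usageType="active"/>
    <MiningField name="s_s_temp" usageType="active"/>
    <MiningField name="zon_winds" usageType="active"/>
    <MiningField name="airtemp" usageType="predicted"/>
  </MiningSchema>
</NeuralNetwork>
"""


def describe_function(pmml_fragment):
    """Return (signature, input field names, predicted field names) for a model."""
    model = ET.fromstring(pmml_fragment)
    # Assumed naming rule: collapse every non-alphanumeric run to one underscore.
    sql_name = re.sub(r"[^0-9A-Za-z]+", "_", model.get("modelName")).strip("_")
    fields = model.findall("./MiningSchema/MiningField")
    # In PMML, a MiningField without usageType is treated as active.
    inputs = [f.get("name") for f in fields if f.get("usageType", "active") == "active"]
    outputs = [f.get("name") for f in fields if f.get("usageType") == "predicted"]
    # Fields of PMML type double are shown as float8 in the generated script.
    params = ",".join(["float8"] * len(inputs))
    return "{}({}) RETURNS float8".format(sql_name, params), inputs, outputs


signature, inputs, outputs = describe_function(PMML_EXCERPT)
print(signature)
```

Under these assumptions, the six active fields yield six float8 parameters and the single predicted field, airtemp, accounts for the float8 return value, which is the same shape as the function definitions in the generated script.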

In summary, the Universal PMML Plug-in for the EMC Greenplum Database:

- Integrates advanced analytical algorithms directly into the database engine for high-performance scoring in a massively parallel environment;
- Supports the PMML standard to avoid time-consuming and expensive one-off predictive analytics projects;
- Executes predictive models from all major commercial and open source data mining tools;
- Minimizes data movement to enable efficient processing of very large data sets; and
- Reduces total cost of ownership (TCO) for the analytical environment by means of streamlined and platform-independent data mining processes.

About Greenplum and the EMC Data Computing Products Division

EMC's new Data Computing Products Division is driving the future of data warehousing and analytics with breakthrough products including Greenplum Database 4.1, Greenplum Data Computing Appliance (DCA), Greenplum Database Single-Node Edition, Greenplum Community Edition, and Greenplum Chorus. The division's products embody the power of open systems, cloud computing, virtualization, and social collaboration, enabling global organizations to gain greater insight and value from their data than ever before possible. For more information, please visit

About Zementis

Zementis, Inc. is a leading software company focused on the operational deployment and integration of predictive analytics and data mining solutions. Its ADAPA decision engine successfully bridges the gap between science and engineering. ADAPA and the Universal PMML Plug-in are designed from the ground up to benefit from open standards and to significantly shorten the time-to-market for predictive analytics in any industry. For more information, please visit
Easy Execution of Data Mining Models through PMML
Easy Execution of Data Mining Models through PMML Zementis, Inc. UseR! 2009 Zementis Development, Deployment, and Execution of Predictive Models Development R allows for reliable data manipulation and
More informationModel Deployment. Dr. Saed Sayad. University of Toronto 2010 saed.sayad@utoronto.ca. http://chem-eng.utoronto.ca/~datamining/
Model Deployment Dr. Saed Sayad University of Toronto 2010 saed.sayad@utoronto.ca http://chem-eng.utoronto.ca/~datamining/ 1 Model Deployment Creation of the model is generally not the end of the project.
More informationI/O Considerations in Big Data Analytics
Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very
More informationUsing Attunity Replicate with Greenplum Database Using Attunity Replicate for data migration and Change Data Capture to the Greenplum Database
White Paper Using Attunity Replicate with Greenplum Database Using Attunity Replicate for data migration and Change Data Capture to the Greenplum Database Abstract This white paper explores the technology
More informationJuly 2015. Zementis for IBM z Systems
July 2015 Zementis for IBM z Systems Page 1 Zementis for IBM z Systems An integrated predictive analytics deployment and scoring capability for organizations managing data and transactions with IBM z Systems
More informationIn-Database Analytics
Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing
More informationHow to Optimize Your Data Mining Environment
WHITEPAPER How to Optimize Your Data Mining Environment For Better Business Intelligence Data mining is the process of applying business intelligence software tools to business data in order to create
More informationEMC Greenplum Driving the Future of Data Warehousing and Analytics. Tools and Technologies for Big Data
EMC Greenplum Driving the Future of Data Warehousing and Analytics Tools and Technologies for Big Data Steven Hillion V.P. Analytics EMC Data Computing Division 1 Big Data Size: The Volume Of Data Continues
More informationHadoop s Advantages for! Machine! Learning and. Predictive! Analytics. Webinar will begin shortly. Presented by Hortonworks & Zementis
Webinar will begin shortly Hadoop s Advantages for Machine Learning and Predictive Analytics Presented by Hortonworks & Zementis September 10, 2014 Copyright 2014 Zementis, Inc. All rights reserved. 2
More informationHarnessing the power of advanced analytics with IBM Netezza
IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced
More informationThe R pmmltransformations Package
The R pmmltransformations Package Tridivesh Jena Alex Guazzelli Wen-Ching Lin Michael Zeller Zementis, Inc.* Zementis, Inc. Zementis, Inc. Zementis, Inc. Tridivesh.Jena@ Alex.Guazzelli@ Wenching.Lin@ Michael.Zeller@
More informationAdvanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
More informationHigh-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances
High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances Highlights IBM Netezza and SAS together provide appliances and analytic software solutions that help organizations improve
More informationGreenplum Database. Getting Started with Big Data Analytics. Ofir Manor Pre Sales Technical Architect, EMC Greenplum
Greenplum Database Getting Started with Big Data Analytics Ofir Manor Pre Sales Technical Architect, EMC Greenplum 1 Agenda Introduction to Greenplum Greenplum Database Architecture Flexible Database Configuration
More informationSQL Server 2012 Parallel Data Warehouse. Solution Brief
SQL Server 2012 Parallel Data Warehouse Solution Brief Published February 22, 2013 Contents Introduction... 1 Microsoft Platform: Windows Server and SQL Server... 2 SQL Server 2012 Parallel Data Warehouse...
More informationEMC/Greenplum Driving the Future of Data Warehousing and Analytics
EMC/Greenplum Driving the Future of Data Warehousing and Analytics EMC 2010 Forum Series 1 Greenplum Becomes the Foundation of EMC s Data Computing Division E M C A CQ U I R E S G R E E N P L U M Greenplum,
More informationBig Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum
Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All
More informationMake Better Decisions Through Predictive Intelligence
IBM SPSS Modeler Professional Make Better Decisions Through Predictive Intelligence Highlights Easily access, prepare and model structured data with this intuitive, visual data mining workbench Rapidly
More informationEMC GREENPLUM DATABASE
EMC GREENPLUM DATABASE Driving the future of data warehousing and analytics Essentials A shared-nothing, massively parallel processing (MPP) architecture supports extreme performance on commodity infrastructure
More informationAchieve Better Insight and Prediction with Data Mining
Clementine 11.1 Specifications Achieve Better Insight and Prediction with Data Mining Data mining provides organizations with a clearer view of current conditions and deeper insight into future events.
More informationCustomer Insight Appliance. Enabling retailers to understand and serve their customer
Customer Insight Appliance Enabling retailers to understand and serve their customer Customer Insight Appliance Enabling retailers to understand and serve their customer. Technology has empowered today
More informationBIG DATA-AS-A-SERVICE
White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers
More informationSEIZE THE DATA. 2015 SEIZE THE DATA. 2015
1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. BIG DATA CONFERENCE 2015 Boston August 10-13 Predicting and reducing deforestation
More informationAchieve Better Insight and Prediction with Data Mining
Clementine 12.0 Specifications Achieve Better Insight and Prediction with Data Mining Data mining provides organizations with a clearer view of current conditions and deeper insight into future events.
More informationMike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.
Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,
More informationThe Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS?
Conclusions Paper The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS? Insights from a presentation at the 2014 Hadoop Summit Featuring Brian Garrett, Principal Solutions Architect
More informationOperationalise Predictive Analytics
Operationalise Predictive Analytics Publish SPSS, Excel and R reports online Predict online using SPSS and R models Access models and reports via Android app Organise people and content into projects Monitor
More informationSAP Predictive Analytics: An Overview and Roadmap. Charles Gadalla, SAP @cgadalla SESSION CODE: 603
SAP Predictive Analytics: An Overview and Roadmap Charles Gadalla, SAP @cgadalla SESSION CODE: 603 Advanced Analytics SAP Vision Embed Smart Agile Analytics into Decision Processes to Deliver Business
More informationOptimizing Storage for Better TCO in Oracle Environments. Part 1: Management INFOSTOR. Executive Brief
Optimizing Storage for Better TCO in Oracle Environments INFOSTOR Executive Brief a QuinStreet Excutive Brief. 2012 To the casual observer, and even to business decision makers who don t work in information
More informationCopyright 2012 EMC Corporation. All rights reserved.
1 Greenplum UAP Enabling Big Data Analytics Brendon Moran Data Scientist 2 Agenda Background On Greenplum And Big Data Analytics Greenplum UAP Greenplum: Not Just Infrastructure Pivotal Labs Customers
More informationMASSIVEDATANEWS. Load and Go: Fast Data Loading with the Greenplum Data Computing Appliance (DCA)
Greenplum Data Computing Appliance (DCA) Introduction: Why Fast and Flexible Data Loading Matters Data loading is the beginning of the entire analytics process. Everything starts by getting data into the
More informationNext Generation Data Mining. Data Mining Automation & Realtime-Scoring "on-the-cloud.
Next Generation Data Mining. Data Mining Automation & Realtime-Scoring "on-the-cloud. Outline DYMATRIX & Zementis Overview Consulting & Product Expertise DynaMine & ADAPA Solution Framework Case Study:
More informationData Virtualization Overview
Data Virtualization Overview Take Big Advantage of Your Data "Using a data virtualization technique is: number one, much quicker time to market; number two, much more cost effective; and three, gives us
More informationDevelop Predictive Models Using Your Business Expertise
Clementine 8.5 Specifications Develop Predictive Models Using Your Business Expertise Clementine is an integrated data mining workbench, popular worldwide with data miners and business analysts alike.
More informationData Warehouse Appliances: The Next Wave of IT Delivery. Private Cloud (Revocable Access and Support) Applications Appliance. (License/Maintenance)
Appliances are rapidly becoming a preferred purchase option for large and small businesses seeking to meet expanding workloads and deliver ROI in the face of tightening budgets. TBR is reporting the results
More informationIntegrated Grid Solutions. and Greenplum
EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving
More informationETPL Extract, Transform, Predict and Load
ETPL Extract, Transform, Predict and Load An Oracle White Paper March 2006 ETPL Extract, Transform, Predict and Load. Executive summary... 2 Why Extract, transform, predict and load?... 4 Basic requirements
More informationCollaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
More informationHow To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
More informationIBM SPSS Modeler Professional
IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model
More informationHigh Performance Analytics with In-Database Processing
High Performance Analytics with In-Database Processing Stephen Brobst, Chief Technology Officer, Teradata Corporation, San Diego, CA Keith Collins, Senior Vice President & Chief Technology Officer, SAS
More informationIBM SPSS Modeler Professional
IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model
More informationBig Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
More informationLowering the Total Cost of Ownership (TCO) of Data Warehousing
Ownership (TCO) of Data If Gordon Moore s law of performance improvement and cost reduction applies to processing power, why hasn t it worked for data warehousing? Kognitio provides solutions to business
More informationBig Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
More informationGigaSpaces Real-Time Analytics for Big Data
GigaSpaces Real-Time Analytics for Big Data GigaSpaces makes it easy to build and deploy large-scale real-time analytics systems Rapidly increasing use of large-scale and location-aware social media and
More informationCitusDB Architecture for Real-Time Big Data
CitusDB Architecture for Real-Time Big Data CitusDB Highlights Empowers real-time Big Data using PostgreSQL Scales out PostgreSQL to support up to hundreds of terabytes of data Fast parallel processing
More information2015 Ironside Group, Inc. 2
2015 Ironside Group, Inc. 2 Introduction to Ironside What is Cloud, Really? Why Cloud for Data Warehousing? Intro to IBM PureData for Analytics (IPDA) IBM PureData for Analytics on Cloud Intro to IBM dashdb
More informationImprove Results with High- Performance Data Mining
Clementine 10.0 Specifications Improve Results with High- Performance Data Mining Data mining provides organizations with a clearer view of current conditions and deeper insight into future events. With
More informationDriving Peak Performance. 2013 IBM Corporation
Driving Peak Performance 1 Session 2: Driving Peak Performance Abstract We know you want the fastest performance possible for your deployments, and yet that relies on many choices across data storage,
More informationBIG DATA APPLIANCES. July 23, TDWI. R Sathyanarayana. Enterprise Information Management & Analytics Practice EMC Consulting
BIG DATA APPLIANCES July 23, TDWI R Sathyanarayana Enterprise Information Management & Analytics Practice EMC Consulting 1 Big data are datasets that grow so large that they become awkward to work with
More informationHortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved