HIGH PERFORMANCE COMPUTING AT SCALE FOR PHARMACEUTICAL R&D


EMC Isilon & RCH Solutions

ABSTRACT

This case study describes how Sanofi R&D (Sanofi) addressed its scientific computing challenges by building out a next-generation High Performance Computing (HPC) hub for its R&D activities. The case study also highlights how RCH Solutions implemented EMC Isilon NAS to address the next-generation HPC hub requirements.

March, 2014

Copyright 2014 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided "as is." EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.

Part Number H

TABLE OF CONTENTS

THE CHALLENGE
    Blocked By Dispersed Data And Computing
UNBLOCKING INNOVATION
    Looking Beyond Today
    Tackling Data Location
    Servicing A Spectrum Of HPC Workloads
RESULTS
MORE INFORMATION
    About RCH Solutions
    About EMC Corporation

THE CHALLENGE

New drugs introduced today in a therapeutic category such as cardiovasculars maintain their competitive advantage for less than two years, compared to more than 10 years during the 1970s [1]. In addition to the competitive landscape, the pharmaceutical industry is under increased fiscal pressure from a range of issues, including decreased revenues from patent losses, changes in healthcare policy, and increased regulatory requirements. To compete successfully in this challenging business climate, pharmaceutical and biotechnology companies are executing strategies that balance research innovation with containing Research & Development (R&D) costs. To spur innovation and disrupt the competitive advantage status quo, IT organizations are re-inventing high performance computing in the pharmaceutical R&D environment.

BLOCKED BY DISPERSED DATA AND COMPUTING

Many pharmaceutical companies today are made up of many independent R&D departments. In each of these departments, research teams no longer work at the lab bench but in silico, generating massive amounts of data from a variety of laboratory instruments designed for next generation sequencing, liquid chromatography mass spectrometry (LC-MS), x-ray crystallography, electron microscopy and high-content screening workflows. The data generated from these workflows is measured in terabytes to petabytes. It is also not uncommon for each of these research departments to host and maintain its own high performance computing (HPC) and storage environment, designed to support the primary workflows of that department. Although operating independently was once viewed as a competitive advantage, the silos of data and compute are now a disadvantage.

To derive the maximum value from research data, today's data mining and analysis requires sharing it between departments and external collaborators, combining it with data sourced from public resources like The Cancer Genome Atlas (TCGA), and moving selected data sets to the available computing resources best matched to the desired analysis. Unfortunately, dispersed data and compute pose operational challenges:

- Networking between sites may be insufficient, or very expensive, for moving terabyte-sized data sets in a reasonable amount of time.
- Because each research department has unique needs, additional staff are required to maintain and support the heterogeneous computing hardware and software environments.
- Extra resources are also needed to manage technology vendors and product support contracts.
- It is not uncommon for research groups to compete for priority access to computing resources. This can lead to underutilized processors or oversubscribed services.
- Competing for access also leads teams to set up computing and storage resources independent of IT (aka "Shadow IT"), which may lead to governance, risk and compliance challenges for the organization at large.

UNBLOCKING INNOVATION

Sanofi R&D (Sanofi) is one company that successfully re-invented its scientific computing capabilities to foster research innovation. Sanofi addressed its R&D data and compute challenges by creating a central, shared computing and storage platform that could scale capacity and performance with predictable cost. The design and implementation of this next-generation computing resource at Sanofi included the services of an experienced scientific computing partner, RCH Solutions (RCH).

Before building this shared computing resource, Sanofi and RCH initiated an internal program to examine the existing High Performance Computing (HPC) capabilities at Sanofi. The examination used a holistic approach that evaluated requirements for Symmetric Multi-Processing (SMP), high-performance/graphical/3D workstations, data storage, and networking.

[1] Biopharmaceuticals in Perspective, Accessed March 4,

LOOKING BEYOND TODAY

RCH has been providing managed services supporting research computing at Sanofi since 2006 and was able to provide detailed information, inventories, and historical trends for HPC deployments in the Americas and Europe. But the scope of the HPC assessment at Sanofi was far broader than a typical technology refresh. The assessment factored in objectives outlined in longer-term R&D plans, interviews with the majority of in silico researchers and their management, a review of applications in use, and evaluations and assessments of Cloud-based technologies. The scope of the examination also included a focus on emerging and disruptive technologies that were likely to become part of a next-generation, in silico Research Computing platform. The results of this comprehensive examination were funneled directly into an initiative to design and implement a central, next-generation HPC platform for Research Computing at Sanofi.

TACKLING DATA LOCATION

The key factor in the design, along with Sanofi's desire to achieve scale via a shared platform, was data management and movement. Even with a commitment to build out a high-performing Metro-Area Network (MAN), moving terabytes of data between repositories would consume precious time in the fast-moving R&D world and result in duplication of large datasets. To address this challenge, the assessment proposed using scale-out Network Attached Storage (NAS) technology from EMC Isilon.

RCH first introduced Isilon NAS technology at a Sanofi research facility in Cambridge, MA in The initial storage cluster was only several terabytes, but it eventually grew to over 5 petabytes. The Isilon storage cluster not only scaled out in capacity, but also scaled out the performance required by HPC jobs. The Isilon cluster also provided a single name space managed by 0.25 Full Time Equivalent (FTE).
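The data-movement concern above can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: the function name `transfer_hours`, the data set sizes, the link speeds, and the 70% link-efficiency figure are assumptions chosen for the example, not measurements from the Sanofi environment.

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move `size_tb` terabytes over a network link.

    size_tb    -- data set size in decimal terabytes (1 TB = 1e12 bytes)
    link_gbps  -- nominal link speed in gigabits per second
    efficiency -- assumed fraction of nominal bandwidth actually achieved
                  (0.7 is an illustrative guess, not a measured value)
    """
    if size_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    size_bits = size_tb * 1e12 * 8                # terabytes -> bits
    effective_bps = link_gbps * 1e9 * efficiency  # usable bits per second
    return size_bits / effective_bps / 3600       # seconds -> hours


if __name__ == "__main__":
    # Hypothetical data set sizes and link speeds, for illustration only.
    for size_tb in (1, 10, 100):
        for link_gbps in (1, 10):
            hours = transfer_hours(size_tb, link_gbps)
            print(f"{size_tb:>4} TB over {link_gbps:>2} Gb/s: about {hours:.1f} hours")
```

Under these assumptions, transfer time grows linearly with data set size, which is why keeping data in one place next to the compute can matter more than a faster link between sites.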
With the success of Isilon technology supporting Research Computing at Sanofi (Cambridge), the design decision was made to again use Isilon technology for what would become the Research Computing platform for Sanofi in the US: the Boston Hub. To support the centralization of EMC Isilon NAS storage and to achieve the desired levels of performance, the Boston Hub required the build-out of a fault-tolerant 10-gigabit Ethernet network.

SERVICING A SPECTRUM OF HPC WORKLOADS

It was also very clear that the Boston Hub would need to provide computing that could address a spectrum of HPC workloads, ranging from embarrassingly parallel to fine-grained jobs. The design and implementation included server technology from Hewlett-Packard (HP), an enterprise-wide compute standard for a number of RCH customers. HP compute technology was to be configured as a Beowulf cluster (core count in excess of 1,000). There were additional servers adjacent to the cluster for application-specific needs, all with 10-gigabit Ethernet connectivity to the EMC Isilon NAS. Servers in the Beowulf cluster were configured for grids managed by Open Grid Scheduler (OGS) as the Distributed Resource Manager (DRM). Platform Computing's Load Sharing Facility (LSF) is also available where niche requirements mandate that LSF act as the DRM.

R&D is a dynamic environment and teams are constantly iterating on new approaches to answer questions in silico. These approaches depend on the ability to access, retrieve, manage and move data using a variety of protocols like SMB and NFS. During the assessment, Hadoop was starting to emerge as a new option for accessing and computing on data. Although there was no immediate use for Hadoop, support for it was a functional requirement for the Hub. Because the Isilon NAS provides native support for multiple protocols, including the Hadoop Distributed File System (HDFS), the use of Isilon in the Hub's design ensured that it could support the rapid introduction of Hadoop at a later date without re-architecting the platform.

Finally, a separate dedicated Internet connection was specified and provisioned for Research Computing in the Boston Hub to facilitate data exchange with outside data sources and collaborators. Included with this was the evaluation of network accelerator technologies (Riverbed, Aspera, etc.).

RESULTS

The centralized shared platform to support Sanofi's US Research Computing was brought on-line for early adopters in December Research teams and projects incrementally moved to the new shared platform during the first half of By its first birthday, the Boston Hub is now a multi-tenant compute and storage cluster that has run in excess of five million jobs. Its scale provides vast improvements in throughput compared to the dispersed pockets of legacy HPC deployments, and it is enabling the creation of compute workflows not previously possible.

For Sanofi, the Boston Hub is now the center of gravity for Research Computing, and Isilon NAS with HPC compute is its foundation. The multi-petabyte NAS employs a combination of Isilon X and NL nodes to support data performance and density requirements. With the success of this new platform and the Managed Services provided by RCH, Sanofi is provisioning more Isilon nodes and compute to support the organic growth of the Hub.

RCH, leveraging EMC Isilon scale-out NAS technology, helped Sanofi achieve its goal of establishing a centralized Research Computing center. The new platform meets today's research requirements and, at the same time, is ready to scale and adapt to tomorrow's systems, storage, network, and application needs using only a handful of administrators. Data management concerns are eased by managing data in a large single name space, where data is automatically tiered between higher-performing components and denser components. This process is based on policies, and data is readily available to compute resources via high-speed networks. This new platform will not be a limiting factor for Sanofi in achieving research innovation.
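As a rough illustration of what policy-based tiering means in practice, the sketch below assigns a file to a tier based on how long it has been idle. It is a toy model and not the Isilon OneFS policy engine or its API: the `TierPolicy` class, the tier names "performance" and "capacity", and the 30-day threshold are all hypothetical, chosen only to mirror the idea of higher-performing versus denser storage components.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class TierPolicy:
    """Hypothetical rule: files idle for at most `max_idle_days` belong on `name`."""
    name: str
    max_idle_days: float


# Illustrative policy set, evaluated in order; the final rule is a catch-all.
POLICIES = (
    TierPolicy("performance", 30.0),       # recently used data stays on faster nodes
    TierPolicy("capacity", float("inf")),  # everything else moves to denser nodes
)


def choose_tier(last_access_ts: float, now_ts: float, policies=POLICIES) -> str:
    """Return the name of the first tier whose idle-time rule matches the file."""
    idle_days = max(0.0, now_ts - last_access_ts) / SECONDS_PER_DAY
    for policy in policies:
        if idle_days <= policy.max_idle_days:
            return policy.name
    raise ValueError("no tiering policy matched; add a catch-all rule")


if __name__ == "__main__":
    now = 100 * SECONDS_PER_DAY
    # A file touched 5 days ago versus one untouched for 90 days.
    print(choose_tier(now - 5 * SECONDS_PER_DAY, now))
    print(choose_tier(now - 90 * SECONDS_PER_DAY, now))
```

Because both tiers sit in one name space, a move between them in a model like this changes where the data is stored but not the path researchers use to reach it.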

MORE INFORMATION

ABOUT RCH SOLUTIONS

RCH Solutions is a Managed Services Provider and a Systems Integrator with a specific focus and expertise in Research Computing. Since 1991, RCH has provided solutions that include hardware, software, and services for Life Sciences and Healthcare companies. RCH has helped our customers achieve measurable performance results and cost savings by acting as experienced liaisons between the research teams and the information technology groups that support them. Our specific focus and experience in Research Computing has allowed RCH to become the trusted partner and advisor of organizations that see value in Managed Services and System Integration expertise. As subject matter experts, we understand the entire research computing stack while ensuring comprehensive ownership, accountability and completeness. We believe the most effective and efficient model to support Research Computing is nimble, transparent, responsive and adaptive to the unique needs of the scientists. RCH has a thorough understanding of, and deep domain expertise in, the many challenges of Life Sciences, which allows us to help solve scientific and business problems with respect to research computing. Additional information about RCH Solutions can be found at

ABOUT EMC CORPORATION

EMC Corporation is a global leader in enabling businesses and service providers to transform their operations and deliver IT as a service. Fundamental to this transformation is cloud computing. Through innovative products and services, EMC accelerates the journey to cloud computing, helping IT departments to store, manage, protect and analyze their most valuable asset, information, in a more agile, trusted and cost-efficient way. Additional information about EMC can be found at

EMC Isilon is fully committed to advances in application development, including supporting the trend to incorporate Hadoop into evolving Life Sciences applications. EMC Isilon is the only scale-out NAS platform natively integrated with the Hadoop Distributed File System (HDFS). Using HDFS as an over-the-wire protocol, you can deploy a powerful, efficient, and flexible Big Data storage and analytics ecosystem. Isilon storage and analytics solutions support multiple instances of Apache Hadoop distributions from different vendors simultaneously, including Pivotal HD, Cloudera CDH, and Hortonworks Data Platform. Our solutions also support both HDFS 1.0 and HDFS 2.0. This allows you to leverage the specific tools you need for each of your unstructured data analytics projects.

Isilon's in-place analytics approach eliminates the need to invest in a standalone Hadoop infrastructure. Our solution also allows you to eliminate the time and resources required to replicate your data into a separate infrastructure. This means that you can initiate data analytics projects faster and get results in a matter of minutes. And when your data changes, simply rerun the job with no re-ingest requirement.

Protecting Big Data Data Protection Solutions for the Business Data Lake

Protecting Big Data Data Protection Solutions for the Business Data Lake White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

BIG DATA-AS-A-SERVICE

BIG DATA-AS-A-SERVICE White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers

More information

EMC ISILON AND ELEMENTAL SERVER

EMC ISILON AND ELEMENTAL SERVER Configuration Guide EMC ISILON AND ELEMENTAL SERVER Configuration Guide for EMC Isilon Scale-Out NAS and Elemental Server v1.9 EMC Solutions Group Abstract EMC Isilon and Elemental provide best-in-class,

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Intel Platform and Big Data: Making big data work for you.

Intel Platform and Big Data: Making big data work for you. Intel Platform and Big Data: Making big data work for you. 1 From data comes insight New technologies are enabling enterprises to transform opportunity into reality by turning big data into actionable

More information

BUILDING A SCALABLE BIG DATA INFRASTRUCTURE FOR DYNAMIC WORKFLOWS

BUILDING A SCALABLE BIG DATA INFRASTRUCTURE FOR DYNAMIC WORKFLOWS BUILDING A SCALABLE BIG DATA INFRASTRUCTURE FOR DYNAMIC WORKFLOWS ESSENTIALS Executive Summary Big Data is placing new demands on IT infrastructures. The challenge is how to meet growing performance demands

More information

CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL

CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL Vision In today s volatile economy, an organization s ability to exploit IT to speed time-to-results, control cost and risk, and drive differentiation

More information

Simple. Extensible. Open.

Simple. Extensible. Open. White Paper Simple. Extensible. Open. Unleash the Value of Data with EMC ViPR Global Data Services Abstract The following paper opens with the evolution of enterprise storage infrastructure in the era

More information

EMC s Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst

EMC s Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst White Paper EMC s Enterprise Hadoop Solution Isilon Scale-out NAS and Greenplum HD By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst February 2012 This ESG White Paper was commissioned

More information

EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise

EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise ESSENTIALS Easy-to-use, single volume, single file system architecture Highly scalable with

More information

EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS

EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS EXECUTIVE SUMMARY It s no secret that organizations continue to produce overwhelming amounts

More information

Make the Most of Big Data to Drive Innovation Through Reseach

Make the Most of Big Data to Drive Innovation Through Reseach White Paper Make the Most of Big Data to Drive Innovation Through Reseach Bob Burwell, NetApp November 2012 WP-7172 Abstract Monumental data growth is a fact of life in research universities. The ability

More information

TRANSFORM YOUR BUSINESS: BIG DATA AND ANALYTICS WITH VCE AND EMC

TRANSFORM YOUR BUSINESS: BIG DATA AND ANALYTICS WITH VCE AND EMC TRANSFORM YOUR BUSINESS: BIG DATA AND ANALYTICS WITH VCE AND EMC Vision Big data and analytic initiatives within enterprises have been rapidly maturing from experimental efforts to production-ready deployments.

More information

Isilon OneFS. Version 7.2. OneFS Migration Tools Guide

Isilon OneFS. Version 7.2. OneFS Migration Tools Guide Isilon OneFS Version 7.2 OneFS Migration Tools Guide Copyright 2014 EMC Corporation. All rights reserved. Published in USA. Published November, 2014 EMC believes the information in this publication is

More information

How To Protect Data On Network Attached Storage (Nas) From Disaster

How To Protect Data On Network Attached Storage (Nas) From Disaster White Paper EMC FOR NETWORK ATTACHED STORAGE (NAS) BACKUP AND RECOVERY Abstract This white paper provides an overview of EMC s industry leading backup and recovery solutions for NAS systems. It also explains

More information

Integrated Grid Solutions. and Greenplum

Integrated Grid Solutions. and Greenplum EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving

More information

Big Data on the Open Cloud

Big Data on the Open Cloud Big Data on the Open Cloud Rackspace Private Cloud, Powered by OpenStack, Helps Reduce Costs and Improve Operational Efficiency Written by Niki Acosta, Cloud Evangelist, Rackspace Big Data on the Open

More information

I D C T E C H N O L O G Y S P O T L I G H T. T i m e t o S c ale Out, Not Scale Up

I D C T E C H N O L O G Y S P O T L I G H T. T i m e t o S c ale Out, Not Scale Up I D C T E C H N O L O G Y S P O T L I G H T M a naging the Explosion of Enterprise Data: T i m e t o S c ale Out, Not Scale Up July 2014 Adapted from Scale-Out Meets Virtualization by Ashish Nadkarni,

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS

EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EMC Isilon solutions for oil and gas EMC PERSPECTIVE TABLE OF CONTENTS INTRODUCTION: THE HUNT FOR MORE RESOURCES... 3 KEEPING PACE WITH

More information

Isilon OneFS. Version 7.2.1. OneFS Migration Tools Guide

Isilon OneFS. Version 7.2.1. OneFS Migration Tools Guide Isilon OneFS Version 7.2.1 OneFS Migration Tools Guide Copyright 2015 EMC Corporation. All rights reserved. Published in USA. Published July, 2015 EMC believes the information in this publication is accurate

More information

White Paper. Version 1.2 May 2015 RAID Incorporated

White Paper. Version 1.2 May 2015 RAID Incorporated White Paper Version 1.2 May 2015 RAID Incorporated Introduction The abundance of Big Data, structured, partially-structured and unstructured massive datasets, which are too large to be processed effectively

More information

High Performance Computing and Big Data: The coming wave.

High Performance Computing and Big Data: The coming wave. High Performance Computing and Big Data: The coming wave. 1 In science and engineering, in order to compete, you must compute Today, the toughest challenges, and greatest opportunities, require computation

More information

Scale-out NAS Unifies the Technical Enterprise

Scale-out NAS Unifies the Technical Enterprise Scale-out NAS Unifies the Technical Enterprise Panasas Inc. White Paper July 2010 Executive Summary Tremendous effort has been made by IT organizations, and their providers, to make enterprise storage

More information

IBM Global Technology Services September 2007. NAS systems scale out to meet growing storage demand.

IBM Global Technology Services September 2007. NAS systems scale out to meet growing storage demand. IBM Global Technology Services September 2007 NAS systems scale out to meet Page 2 Contents 2 Introduction 2 Understanding the traditional NAS role 3 Gaining NAS benefits 4 NAS shortcomings in enterprise

More information

Actian SQL in Hadoop Buyer s Guide

Actian SQL in Hadoop Buyer s Guide Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop

More information

Real-Time Big Data Analytics SAP HANA with the Intel Distribution for Apache Hadoop software

Real-Time Big Data Analytics SAP HANA with the Intel Distribution for Apache Hadoop software Real-Time Big Data Analytics with the Intel Distribution for Apache Hadoop software Executive Summary is already helping businesses extract value out of Big Data by enabling real-time analysis of diverse

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

EMC Virtual Infrastructure for Microsoft SQL Server

EMC Virtual Infrastructure for Microsoft SQL Server Microsoft SQL Server Enabled by EMC Celerra and Microsoft Hyper-V Copyright 2010 EMC Corporation. All rights reserved. Published February, 2010 EMC believes the information in this publication is accurate

More information

EMC Isilon: Data Lake 2.0

EMC Isilon: Data Lake 2.0 ` ESG Solution Showcase EMC Isilon: Data Lake 2.0 Date: November 2015 Author: Scott Sinclair, Analyst Abstract: With the rise of new workloads such as big data analytics and the Internet of Things, data

More information

Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V. Reference Architecture

Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V. Reference Architecture Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V Copyright 2011 EMC Corporation. All rights reserved. Published February, 2011 EMC believes the information

More information

Future Proofing Data Archives with Storage Migration From Legacy to Cloud

Future Proofing Data Archives with Storage Migration From Legacy to Cloud Future Proofing Data Archives with Storage Migration From Legacy to Cloud ABSTRACT This white paper explains how EMC Elastic Cloud Storage (ECS ) Appliance and Seven10 s Storfirst software enable organizations

More information

Microsoft Analytics Platform System. Solution Brief

Microsoft Analytics Platform System. Solution Brief Microsoft Analytics Platform System Solution Brief Contents 4 Introduction 4 Microsoft Analytics Platform System 5 Enterprise-ready Big Data 7 Next-generation performance at scale 10 Engineered for optimal

More information

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with

More information

Cloud Service Provider Builds Cost-Effective Storage Solution to Support Business Growth

Cloud Service Provider Builds Cost-Effective Storage Solution to Support Business Growth Cloud Service Provider Builds Cost-Effective Storage Solution to Support Business Growth Overview Country or Region: United States Industry: Hosting Customer Profile Headquartered in Overland Park, Kansas,

More information

NetApp Big Content Solutions: Agile Infrastructure for Big Data

NetApp Big Content Solutions: Agile Infrastructure for Big Data White Paper NetApp Big Content Solutions: Agile Infrastructure for Big Data Ingo Fuchs, NetApp April 2012 WP-7161 Executive Summary Enterprises are entering a new era of scale, in which the amount of data

More information

High Performance Computing Cloud Offerings from IBM Technical Computing IBM Redbooks Solution Guide

High Performance Computing Cloud Offerings from IBM Technical Computing IBM Redbooks Solution Guide High Performance Computing Cloud Offerings from IBM Technical Computing IBM Redbooks Solution Guide The extraordinary demands that engineering, scientific, and research organizations place upon big data

More information

Table of Contents. Technical paper Open source comes of age for ERP customers

Table of Contents. Technical paper Open source comes of age for ERP customers Technical paper Open source comes of age for ERP customers It s no secret that open source software costs less to buy the software is free, in fact. But until recently, many enterprise datacenter managers

More information

EMC SOLUTION FOR SPLUNK

EMC SOLUTION FOR SPLUNK EMC SOLUTION FOR SPLUNK Splunk validation using all-flash EMC XtremIO and EMC Isilon scale-out NAS ABSTRACT This white paper provides details on the validation of functionality and performance of Splunk

More information

Big Data and Apache Hadoop Adoption:

Big Data and Apache Hadoop Adoption: Expert Reference Series of White Papers Big Data and Apache Hadoop Adoption: Key Challenges and Rewards 1-800-COURSES www.globalknowledge.com Big Data and Apache Hadoop Adoption: Key Challenges and Rewards

More information

Scaling up to Production

Scaling up to Production 1 Scaling up to Production Overview Productionize then Scale Building Production Systems Scaling Production Systems Use Case: Scaling a Production Galaxy Instance Infrastructure Advice 2 PRODUCTIONIZE

More information

Copyright 2012 EMC Corporation. All rights reserved.

Copyright 2012 EMC Corporation. All rights reserved. 1 Greenplum UAP Enabling Big Data Analytics Brendon Moran Data Scientist 2 Agenda Background On Greenplum And Big Data Analytics Greenplum UAP Greenplum: Not Just Infrastructure Pivotal Labs Customers

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

EMC ISILON ONEFS OPERATING SYSTEM

EMC ISILON ONEFS OPERATING SYSTEM EMC ISILON ONEFS OPERATING SYSTEM Powering scale-out storage for the Big Data and Object workloads of today and tomorrow ESSENTIALS Easy-to-use, single volume, single file system architecture Highly scalable

More information
