Linux as a Data Integration Platform




Evan Bauer, VP Technology Architecture, DealerTrack, Inc.

Background
- DealerTrack (NASDAQ: TRAK) is the leading software-as-a-service and data provider to the automotive industry
- Origins as the spinout of Chase's auto finance portal
- In our 10-year history, DealerTrack has made 17 acquisitions and developed 4 major new products internally
- Integration of our products, and the ability to create new resources from the databases they build, is the company's #1 strategic project
- Stabilizing the runtime environment for availability, and increasing scalability to handle rapid growth in transaction volumes, is part of the price of success in the market

What are we integrating?
- A combination of interactive web, web services, and 5250 telnet-delivered SaaS products for the retail automotive supply chain
- Customers include:
  - Dealers
  - Lenders
  - Manufacturers
  - Insurers
  - Parts and Service Suppliers
  - State Governments
- Implemented with:
  - IIS / .NET / MSMQ
  - Java / WebLogic / J2EE
  - Apache / Perl
  - RPG2 / LookSoftware
  - Oracle
  - MS SQL Server
  - DB2
  - MySQL

Constraints and Approach
Constraints:
- Business is sensitive to both availability and cost
- Wholesale application re-hosting or conversion is much too costly and would provide marginal user benefit
- Subject matter expert developers are our scarcest resources and are often deeply wedded to specific development environments
- Integration of data is a prerequisite to integrating applications
- Thousands of external data integration points with 3rd parties
Approach:
- Definition of common data entities that are shared across platforms
- Minimum modification of applications to create and consume shared data
- Data replication between databases in a hub-and-spoke topology

Integration Project Scope
- Create a common data platform for all current and future DealerTrack products and services
- Design and implement a multi-platform software/database framework upon which DT 2.0 will be built
- Create and maintain database tables that will store common data and make it easily available to all products regardless of platform
- Implement replication software so that common data will be replicated to all the platforms and will be available locally to all products
- Common Entities to be released in 2011 include: Dealer Groups, Dealers, Users, Customers, Vehicles, and Partners
8/23/11

Fusion Replication Topology

MDM and Replication
- Change Data Capture (CDC) selected as the replication mechanism
- Data replication becomes a service provided by the DBA team, rather than the programmatic responsibility of the application development teams
- Replications are defined as relationships between tables: copies of the Common Master are made available in application DBs (in full or in subset), and data is also replicated with transformation to modified legacy tables
- Approach simplifies and accelerates delivery of data integration
- The Common Master (the hub) becomes a hotspot: it receives the sum of all updates and needs to be available as the projection of all application SLAs
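The hub-and-spoke flow described above can be sketched in a few lines of Python. This is only an in-memory illustration, not the CDC product used at DealerTrack: real Change Data Capture reads database logs, whereas here changes are passed in by hand. The names `Change`, `Spoke`, `Hub`, `copy_row`, `to_legacy`, and the `DLR_NAME` column are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set

Row = Dict[str, object]


def copy_row(row: Row) -> Row:
    """Default transform: replicate the row unchanged."""
    return dict(row)


@dataclass
class Change:
    table: str            # common entity table, e.g. "dealers"
    key: int              # primary key of the changed row
    row: Optional[Row]    # new row image; None means the row was deleted


@dataclass
class Spoke:
    name: str
    tables: Set[str]                              # subset of common tables this DB receives
    transform: Callable[[Row], Row] = copy_row    # optional reshaping for legacy tables
    data: Dict[str, Dict[int, Row]] = field(default_factory=dict)

    def apply(self, change: Change) -> None:
        if change.table not in self.tables:
            return  # this application DB does not subscribe to the table
        rows = self.data.setdefault(change.table, {})
        if change.row is None:
            rows.pop(change.key, None)
        else:
            rows[change.key] = self.transform(change.row)


@dataclass
class Hub:
    master: Dict[str, Dict[int, Row]] = field(default_factory=dict)
    spokes: List[Spoke] = field(default_factory=list)

    def capture(self, change: Change) -> None:
        """Apply a change to the common master, then fan it out to every spoke."""
        rows = self.master.setdefault(change.table, {})
        if change.row is None:
            rows.pop(change.key, None)
        else:
            rows[change.key] = dict(change.row)
        for spoke in self.spokes:
            spoke.apply(change)


def to_legacy(row: Row) -> Row:
    """Hypothetical transformation into a legacy table layout."""
    return {"DLR_NAME": str(row["name"]).upper()}


hub = Hub()
full_copy = Spoke("web_app", {"dealers", "users"})
legacy = Spoke("legacy_app", {"dealers"}, transform=to_legacy)
hub.spokes += [full_copy, legacy]

hub.capture(Change("dealers", 1, {"name": "Main Street Motors"}))
hub.capture(Change("users", 7, {"name": "pat"}))
hub.capture(Change("users", 7, None))  # delete propagates to subscribers

print(full_copy.data)
print(legacy.data)
```

Every change passes through `Hub.capture`, which is why the hub carries the combined update load of all spokes in this sketch.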

The SaaS SLA Challenge
SLA and the Clock

Availability = MTBF / (MTBF + MTTR)

Repair time    Calculation                 Availability
8 hours        1000 / (1000 + 8)           99.2%
½ hour         1000 / (1000 + 0.5)         99.95%
20 seconds     1000 / (1000 + 20/3600)     99.999%

Assumptions:
1. Mean Time to Repair is controllable with tools and architecture
2. Mean Time Between Failure is shown at 1,000 hours (once every 40 days)

- Availability is sensitive to MTTR: even 3-nines is hard to reach without having high availability during maintenance windows
- Rolling upgrades and >=3-way live-live clusters are essential for stateful subsystems
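The arithmetic on this slide can be reproduced with a short Python sketch. The 1,000-hour MTBF and the three repair times come from the slide; the function name `availability` and the dictionary layout are illustrative choices only.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as a fraction: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


MTBF_HOURS = 1000.0  # slide assumption: roughly one failure every 40 days

# Repair times from the slide, converted to hours.
REPAIR_TIMES_HOURS = {
    "8 hours": 8.0,
    "1/2 hour": 0.5,
    "20 seconds": 20 / 3600,
}

for label, mttr in REPAIR_TIMES_HOURS.items():
    pct = availability(MTBF_HOURS, mttr) * 100
    print(f"MTTR {label:>10}: {pct:.4f}% available")
```

Holding MTBF fixed and varying only MTTR shows how strongly the result depends on repair time.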

Linux Clusters - Experience
- Original architecture for production OLTP systems had stateless web farms in front of Solaris SPARC Oracle database servers on Veritas clusters
- Lower-cost Linux clusters on commodity x86 servers, initially implemented for less critical or lower-volume systems, have had substantially better availability and performance
- The fault rate associated with system software root causes has been lower on Linux-based clusters than on either Solaris or Windows
- Measured MTTR is lower on our Linux clusters than on any other platform in production
- Device driver support, a critical cause of outages and a substantial engineering and testing burden on our Windows IIS and MSMQ servers, has never been an issue on Linux database clusters running on comparable hardware from the same manufacturer

Realities and Requirements
- As desirable as a fully open source solution would be, support for both Windows and iSeries (OS/400) legacy requires proprietary CDC and RDBMS technology
- Without at least 3 active cluster nodes, you don't have HA during maintenance and upgrades
- You can't use DR as HA when 99.99% uptime is your target
- For all vendors of alternatives for the necessary software, Linux has become the common, tier-1, supported platform
- Key issues become file system and support across SAN alternatives, and the interaction between CDC, backup, and DR storage replication technologies
- Testing and stabilization become as critical as design and development
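One way to read the 3-node requirement is as a simple counting argument, sketched below in Python. The assumptions are mine, not the slide's: a rolling upgrade takes one node out of service at a time, and "HA" means the cluster can still absorb one unplanned node failure while that upgrade is in progress. The function names are hypothetical.

```python
def nodes_still_serving(active_nodes: int,
                        in_maintenance: int = 1,
                        unplanned_failures: int = 1) -> int:
    """Nodes left serving after maintenance plus an unplanned failure (assumed model)."""
    return max(active_nodes - in_maintenance - unplanned_failures, 0)


def ha_during_maintenance(active_nodes: int) -> bool:
    """True if at least one node keeps serving in the assumed worst case."""
    return nodes_still_serving(active_nodes) >= 1


for n in (2, 3, 4):
    print(f"{n} active nodes -> HA during maintenance: {ha_during_maintenance(n)}")
```

Under these assumptions a 2-node cluster has no redundancy left while one node is being upgraded, and 3 nodes is the smallest configuration that does.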

Additional Opportunities
Once the platform decision was made, additional cost-effective Linux-based solutions presented themselves for related systems:
- Replication of transactional data to operational data stores to support ad hoc reporting. Linux versions of both reporting and database software provide a 4x price/performance advantage over the same proprietary software running on iSeries
- Cost-effective AV-scanned, remote-replicated, encrypted storage of sensitive documents
- Host platform for utility services (e.g. network management, log concentration and analysis) and supporting tools for the diverse set of primarily proprietary platforms

Key Future Considerations
Linux as a database and infrastructure service delivery platform has been adopted within DealerTrack because of the strengths that come from an open platform, despite fears about open. In turn, Linux is helping to open DealerTrack's platform and options going forward.
- RAS: there is nothing worse than being offline
- RDBMS: cluster support facilities for live-live clusters (RAC or PureScale vs HA or HA-DR)
- Cookbook: filesystem selection and implementation for database clusters
- LSB: remains critical as database and proprietary ISV products maintain distro affinities
- Virtualization: KVM becoming increasingly attractive; proprietary offering feature-rich but problematic in production. Management tools welcome