Implementing an Enterprise Order Database with DB2 pureXML at Verizon Business




Andrew Washburn, Verizon Business
Matthias Nicola, IBM Silicon Valley Lab
Session TLU-2075
October 25-29, 2009, Mandalay Bay, Las Vegas, Nevada

Agenda
- Verizon Business*
- Brief Recap: DB2 pureXML
- Enterprise Order Management at Verizon
- Requirements & Challenges
- Orders as Business Object Documents (BODs)
- Database Design Considerations
- Implementation
- Performance Tests
- Benefits, Lessons Learned

* All Verizon information is provided for informational purposes only. Verizon makes no representation or warranty of the accuracy of this information, and no license in such information is granted. All rights are reserved.

Verizon Business
- Delivers IP, data, voice, wireless, networks, security, mobility and other communication services
- $21.1B revenues in FY2008
- 321 offices in 75 countries
- Customers include 98% of Fortune 1000
- Verizon Business connects customers in 2,700 cities in 150 countries
- Network includes 485,000 miles of terrestrial and undersea cable, spanning 6 continents

Overview of DB2 pureXML (1 of 2)
- XML stored in a parsed hierarchical format
- No fixed XML schema per XML column required
- XML schema validation is optional, per document
- XML indexes for specific elements/attributes
- XQuery and SQL/XML integration

    create table customer (cid integer, ..., info XML)

[Diagram: DB2 storage, with relational storage pages next to pureXML storage]

Overview of DB2 pureXML (2 of 2)

1. create table customer (cid integer, info XML)

2. insert into customer (cid, info) values (?,?)

3. select cid, info from customer

4. select xmlquery('$info/customer/name')
   from customer
   where xmlexists('$info/customer/addr[zip = 95123]')

5. create index idx1 on customer(info)
   generate keys using xmlpattern '/customer/addr/zip' as sql varchar(12)

6. Plus: updates, XML Schema support, utilities, etc.
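To make the query in statement 4 concrete outside DB2, here is a minimal Python sketch of the same idea: keep only the rows whose XML document has a matching addr/zip, and return the name from each. The sample rows and the function name are assumptions for illustration, and the comparison is on text, whereas the DB2 predicate above compares against a number.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows of the customer table: (cid, info), with info as XML text.
ROWS = [
    (1, "<customer><name>Ann</name><addr><zip>95123</zip></addr></customer>"),
    (2, "<customer><name>Bob</name><addr><zip>10001</zip></addr></customer>"),
]

def names_in_zip(rows, zip_code):
    """Return the names of customers whose addr/zip text equals zip_code."""
    result = []
    for _cid, info in rows:
        customer = ET.fromstring(info)  # root element is <customer>
        # Rough counterpart of xmlexists('$info/customer/addr[zip = 95123]')
        if customer.find(f"addr[zip='{zip_code}']") is not None:
            # Rough counterpart of xmlquery('$info/customer/name'),
            # except that only the text of <name> is returned here.
            result.append(customer.findtext("name"))
    return result
```

In DB2 the XML index from statement 5 can serve the zip predicate; this sketch simply scans every row.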

Enterprise Order Management Overview

Challenge
- Orders come in through different channels and are stored in channel-specific repositories
- A coherent view of all orders is required
- Reporting and tracking across all orders is critical
- Orders are XML documents (OAGIS standard)
- Medium and complex queries and updates
- Need a consistent way to figure out what changed on an order

Solution
- Single order database, hybrid storage in DB2 pureXML
- SQL, XQuery, and XQuery Update in stored procedures
- Utilize XQuery and XML to determine what changed from one order revision to the next
- Many OE systems tried to solve this problem; it is now coded in one place

Enterprise Order Management (EOM): The Big Picture

[Diagram: Customer Interface; Sales & Contracts; Order Entry & Management (DOMS, OrderPro, iorder, ioe); Enterprise Service Bus; Service Profile; Provisioning Workflow; Activation / Inventory; EOM; Billing; EODB; Work Dispatch. Callout: "Let's zoom in".]

A typical EOM Architecture

[Diagram: heterogeneous order entry systems (System1 to System6) exchange XML with the Enterprise Order Management (EOM) application. Each system converts XML to a different data format and converts non-XML data back to XML.]

Characteristics of a Heterogeneous Architecture
- Each order entry system stores an order by transforming standard OAGIS XML BODs to a relational schema
- Converting XML to relational format and back can be expensive
  - Manual work to design the mapping between the data formats
  - Resource consumption (CPU cycles)
- Heterogeneous order repositories
  - Optimized for order entry rather than for global order tracking and reporting
  - Typically use product-specific database schemas
  - Provide widely varying levels of data access

New EOM Architecture, with DB2 9.5
- Single order database, common for all OE systems, product-agnostic
- No shredding and construction of XML
- High performance, granular data access

[Diagram: order entry systems (System1 to System6) send XML through the ESB to the EODB (DB2 pureXML), which the Enterprise Order Management (EOM) application accesses as XML. EODB labels: pureXML storage; SQL & XQuery; no conversions. Platform: DB2 9.5, Linux, Intel-based server, HADR.]

What is OAGIS, what is a BOD?
- OAGIS (Open Applications Group Integration Specification) provides a canonical business language for information integration, in XML
- XML message format for Business Object Documents (BODs)
- BODs use OO design patterns: they define a noun (object) that has verbs to indicate the action to be performed
- The name of the BOD is the verb-noun combination in the data area

[Diagram: a BOD with its Verb and Noun parts labeled]

XML Schema for Business Object Documents (BODs)

[Diagram: element hierarchy of the ProcessServiceOrder BOD, with the labels shown on the slide in parentheses]

    ProcessServiceOrder
      DataArea
        ServiceOrder (ORDER)
          ServiceOrderLine (LINE)
            Product (PRODUCT)
              Services
                Service (SERVICE)

- Levels: ORDER, LINE, PRODUCT, SERVICE, SUBSERVICE
- Most BODs are 25 KB to 500 KB; some are in the MB range
- Most BODs have 3-5 services; some have dozens or more

Workload & Requirements

High frequency/priority
- Update individual service
- Summary queries

Medium frequency/priority
- Get individual service
- Insert new order
- Get order
- Get order summary

[Diagram label: the workload drives the database design]

Application characteristics
- OK to favor update and query performance over insert performance
- Frequent granularity of access:
  1. Service
  2. Order

Database Schema Design Options

Option 1: store each order as one intact XML document.

Table: order (oid, order XML)

BOD skeleton:

    <ProcessServiceOrder>
      <ApplicationArea></ApplicationArea>
      <DataArea>
        <ServiceOrder>
          <ServiceOrderHeader>
          <ServiceOrderLine>
            (...)
            <Product>
              (...)
              <Services>
                <DataService> </DataService>
                <DataService> </DataService>
              </Services>
              (...)
            </Product>
          </ServiceOrderLine>
        </ServiceOrder>
      </DataArea>
    </ProcessServiceOrder>

Database Schema Design Options

Option 2: extract important XML fields into relational tables. Orders are stored fully intact as XML in the order table. No other table contains XML.

[Diagram: the ProcessServiceOrder BOD skeleton, with each level mapped to a table]
- order (oid, order XML)
- line (lid, oid)
- product (pid, oid)
- service (sid, oid)
- subservice (ssid, sid)

Database Schema Design Options

Option 3: extract important XML fields into relational tables, and also extract the Service XML fragments. Orders are stored as XML, but with all Services removed.

[Diagram: the ProcessServiceOrder BOD skeleton, with each level mapped to a table]
- order (oid, order XML)
- line (lid, oid)
- product (pid, oid)
- service (sid, oid, service XML)
- subservice (ssid, sid)

Best design for the given workload and requirements.
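The split described for this design (each Service stored as its own XML fragment, the order stored without its Services) can be illustrated outside the database. The following is a minimal Python sketch, assuming a simplified, namespace-free BOD with made-up values. The function name and row keys are hypothetical and do not come from the Verizon schema.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free BOD; all values are made up for illustration.
BOD = """
<ProcessServiceOrder>
  <DataArea>
    <ServiceOrder>
      <ServiceOrderLine>
        <Product>
          <Services>
            <DataService><ID>S1</ID><Name>Access</Name></DataService>
            <DataService><ID>S2</ID><Name>Port</Name></DataService>
          </Services>
        </Product>
      </ServiceOrderLine>
    </ServiceOrder>
  </DataArea>
</ProcessServiceOrder>
"""

SERVICES_PATH = "DataArea/ServiceOrder/ServiceOrderLine/Product/Services"

def split_bod(bod_text):
    """Return (order XML without services, list of service rows)."""
    order = ET.fromstring(bod_text)
    service_rows = []
    for services in order.findall(SERVICES_PATH):
        for service in list(services):  # copy the list: we remove while looping
            service_rows.append({
                "sequence_number": len(service_rows) + 1,
                "identifier": service.findtext("ID"),
                "xml_service": ET.tostring(service, encoding="unicode").strip(),
            })
            services.remove(service)  # the order keeps an empty <Services>
    return ET.tostring(order, encoding="unicode"), service_rows
```

With this layout an update to one service touches only a small fragment, which fits the stated workload where service-level updates dominate.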

Insert Stored Procedure

    CREATE PROCEDURE INSERT_BOD (IN BOD XML)
    LANGUAGE SQL
    P1: BEGIN
      DECLARE orderid BIGINT;
      SET orderid = NEXTVAL FOR order_id_seq;

      -- Extract the header fields and store the order without its services
      INSERT INTO order (id, tin, tin_version, CONTRACT_ID, CONTRACT_NAME,
                         XML_DOC, insert_datetime, update_datetime)
      SELECT orderid, T.tin, T.tin_version, T.contract_id, T.contract_name,
             T.XML_DOC,
             CURRENT_TIMESTAMP AS insert_datetime,
             CURRENT_TIMESTAMP AS update_datetime
      ...
      FROM XMLTABLE('$XML_DOC' PASSING BOD AS "XML_DOC"
        COLUMNS
          tin           VARCHAR(128) PATH '.../*:ServiceOrderHeader/*:DocumentID/*:ID',
          tin_version   BIGINT       PATH '.../*:ServiceOrderHeader/*:DocumentID/*:RevisionID',
          contract_id   VARCHAR(32)  PATH '.../*:ServiceOrderHeader/*:Contract/*:ID',
          contract_name VARCHAR(128) PATH '.../*:ServiceOrderHeader/*:Contract/@schemeName',
          -- XQuery transform: copy the BOD and delete all services from the copy
          XML_DOC       XML          PATH 'copy $new := .
                                           modify do delete $new/.../*:ServiceOrder/*:ServiceOrderLine/*:Product/*:Services/*
                                           return $new'
      ) AS T;

Note: the dots ('...') mark paths and clauses that are abbreviated to fit on the slide.

Insert Stored Procedure (cont'd)

      ...
      SET productid = NEXTVAL FOR product_id_seq;

      -- One row per Product element in the BOD
      INSERT INTO product (id, line_id, order_id, identifier, code, name,
                           line_number, insert_datetime, update_datetime)
      SELECT productid, lineid, orderid,
             T.IDENTIFIER, T.CODE, T.NAME, T.LINE_NUMBER,
             CURRENT_TIMESTAMP AS insert_datetime,
             CURRENT_TIMESTAMP AS update_datetime
      FROM XMLTABLE('$XML_DOC/.../*:DataArea/*:ServiceOrder/*:ServiceOrderLine/*:Product'
                    PASSING BOD AS "XML_DOC"
        COLUMNS
          identifier  VARCHAR(32) PATH '*:ID',
          code        VARCHAR(32) PATH '*:Code',
          name        VARCHAR(32) PATH '*:Name',
          line_number BIGINT      PATH '*:LineNumber'
      ) AS T;
      ...
    END

Insert Stored Procedure (cont'd)

      ...
      -- One row per service; each row keeps its own XML fragment
      INSERT INTO service (ID, ORDER_ID, PRODUCT_ID, SEQUENCE_NUMBER, TYPE,
                           IDENTIFIER, CATEGORY_CODE, PRODUCT_OFFERING_ID, NAME,
                           PRODUCT_IDENTIFIER, NODE_ID, ACTIONCODE, XML_SERVICE,
                           insert_datetime, update_datetime)
      SELECT NEXTVAL FOR EODB.SERVICE_SEQ, orderid, productid,
             T.SEQUENCE_NUMBER, T.TYPE, T.IDENTIFIER, T.CATEGORY_CODE,
             T.PRODUCT_OFFERING_ID, T.NAME, T.PRODUCT_IDENTIFIER, T.NODE_ID,
             T.ACTIONCODE, T.XML_SERVICE,
             CURRENT_TIMESTAMP AS insert_datetime,
             CURRENT_TIMESTAMP AS update_datetime
      FROM XMLTABLE('$XML_DOC/.../*:DataArea/*:ServiceOrder/*:ServiceOrderLine/*:Product/*:Services/*'
                    PASSING BOD AS "XML_DOC"
        COLUMNS
          SEQUENCE_NUMBER     FOR ORDINALITY,
          TYPE                VARCHAR(32)  PATH '@type',
          IDENTIFIER          VARCHAR(32)  PATH '*:ID[1]',
          CATEGORY_CODE       VARCHAR(32)  PATH '*:CategoryCode',
          PRODUCT_OFFERING_ID VARCHAR(32)  PATH '*:ProductOfferingID',
          NAME                VARCHAR(64)  PATH '*:Name',
          PRODUCT_IDENTIFIER  VARCHAR(32)  PATH '*:ProductId',
          NODE_ID             VARCHAR(32)  PATH '*:NodeID',
          ACTIONCODE          VARCHAR(128) PATH '*:ActionCode[1]',
          XML_SERVICE         XML          PATH 'document{.}'
      ) AS T;
      ...

Update an Existing Service in an Order

[Diagram: an Update BOD arrives at DB2 9.5 and updates an existing service in one of the stored Order BODs]

- Update BODs represent messages to update existing BODs
- Typically smaller than the existing BOD
- If an element exists in the Update BOD but not in the existing BOD, it needs to be inserted; otherwise it is updated
- Requires a merge operation of two XML documents
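The merge rule above (insert what is missing, replace what already exists) can be sketched in a few lines of Python. This is an illustration only, not the stored procedure's logic: it matches children by tag name, assumes each tag occurs at most once per service, and appends new elements at the end rather than at a chosen position. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def merge_service(target, update):
    """Merge the children of `update` into `target`, in place.

    A child present in both is replaced at its original position.
    A child present only in the update is appended.
    """
    for new_child in update:
        old_child = target.find(new_child.tag)
        if old_child is None:
            target.append(new_child)         # insert case
        else:
            index = list(target).index(old_child)
            target.remove(old_child)
            target.insert(index, new_child)  # replace case
    return target
```

For example, merging an update that carries a new ServiceContact and a ProvisioningStatus into a service that has Status and ServiceContact leaves Status untouched, replaces ServiceContact, and adds ProvisioningStatus.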

DB2 Stored Procedure Implements Update

[Diagram: the Update BOD is passed to the DB2 9.5 stored procedure UPDATE_SERVICE(), which issues an SQL/XML update statement against the stored Order BODs]

The stored procedure:
- Reads the "Update BOD" and the "Target BOD"
- Generates appropriate XQuery update operations
- Applies the update statement to the "Target BOD"

All within DB2!

SP Generates and Executes Update Stmt

SQL/XML update statement generated by the stored procedure UPDATE_SERVICE():

    UPDATE service
    SET xml_service = XMLQUERY('
          copy $ORIGINAL := $XML_SERVICE
          modify (
            do insert $UPDATE/*:ProcessServiceOrderRequest/*/*:DataArea/*:ServiceOrder/
                        *:ServiceOrderLine/*:Product/*:Services/*/*:ProvisioningStatus
               after $ORIGINAL/*/*:Status,
            do replace $ORIGINAL/*/*:ServiceContact/*:ID
               with $UPDATE/*:ProcessServiceOrderRequest/*/*:DataArea/*:ServiceOrder/
                      *:ServiceOrderLine/*:Product/*:Services/*/*:ServiceContact/*:ID
          )
          return $ORIGINAL
        ' PASSING XMLCAST(? AS XML) AS "UPDATE")
    WHERE node_id = ?
      AND type = ?
      AND product_offering_id = ?
      AND order_id = ?

"GET" Stored Procedure
- EOM struggled with memory issues due to the size of the XML mapped into Java memory
  - Needed a way to strip down the BOD
- Solution: use XQuery and SQL/XML to return an order summary BOD where all non-essential data is left in the DB
  - Reduces memory footprint in EOM by 75%
- Also needed more granular component data access
- The GET stored procedure can run the gamut:
  - Whole order
  - Order summary
  - Single service
  - Single service with only some subcomponents
  - Could return a single field if necessary
- Not limited at all by the XML structure

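As a rough illustration of the order summary idea from the "GET" slide, the sketch below copies an order and keeps only a few fields per service. Which fields count as essential is not stated in the slides, so the ESSENTIAL set, the path constant, and the function name are assumptions. The real solution runs as XQuery inside DB2, so the full document never reaches the application.

```python
import copy
import xml.etree.ElementTree as ET

# Assumption: the fields a summary keeps per service (not given in the slides).
ESSENTIAL = {"ID", "Name"}
SERVICES_PATH = "DataArea/ServiceOrder/ServiceOrderLine/Product/Services"

def summarize(order):
    """Return a copy of the order with non-essential service fields removed."""
    summary = copy.deepcopy(order)  # leave the stored order untouched
    for services in summary.findall(SERVICES_PATH):
        for service in services:
            for child in list(service):  # copy the list: we remove while looping
                if child.tag not in ESSENTIAL:
                    service.remove(child)
    return summary
```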
Concurrency and Performance Tests
- Using the TPoX Workload Driver (http://tpox.sourceforge.net/)
- Workload:
  - 10% insert order
  - 50% update service
  - 15% get service
  - 10% get order by tin (main identifier)
  - 5% get order by alternate ID
  - 10% get order summary
- Each concurrent user keeps calling all of these stored procedures randomly, but based on their relative weights
- In each stored procedure call, a different BOD is inserted or retrieved (i.e., no trivial repetition of operations)
- Test system with 300,000 order BODs
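The weighted random selection described above can be sketched as follows. This is not the TPoX driver's code. The operation names and helper functions are placeholders, and only the percentages come from the slide.

```python
import random
from collections import Counter

# Workload mix from the slide, in percent; the operation names are placeholders.
WORKLOAD = {
    "insert_order": 10,
    "update_service": 50,
    "get_service": 15,
    "get_order_by_tin": 10,
    "get_order_by_alternate_id": 5,
    "get_order_summary": 10,
}

def pick_operations(n, seed=None):
    """Draw n operations at random, in proportion to their weights."""
    rng = random.Random(seed)
    names = list(WORKLOAD)
    weights = list(WORKLOAD.values())
    return rng.choices(names, weights=weights, k=n)

def observed_mix(operations):
    """Return the share of each operation in percent."""
    counts = Counter(operations)
    return {name: 100.0 * counts[name] / len(operations) for name in WORKLOAD}
```

Over many draws the observed mix approaches the configured percentages, while the sequence seen by any single simulated user stays unpredictable.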

Concurrent Performance Test Results
- Test system: old pSeries, POWER4, 4 single-core CPUs, 1500 MHz per CPU
- Higher performance and throughput on modern production hardware!

[Chart: EODB transactions per second (y-axis, 0 to 180) versus number of concurrent read/write users (x-axis, 1 to 10)]

Performance Tests - Results

Scalability of 100 Inserts with 1, 2, 3 Services

[Chart: average elapsed time per insert versus number of services in the BOD]

    Services in the BOD    Avg elapsed time per insert
    1 service              39.7 ms
    2 services             80.4 ms
    3 services             128.3 ms

The time to insert is roughly linear in the number of services.

Insert Performance for Large BODs

[Chart: insert throughput in KB/ms (y-axis, 0 to 4 KB/ms) versus size of the BOD (x-axis, 1 MB to 15 MB)]

Even for very large BODs, the time to insert is linear in the size of the BOD.
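A quick arithmetic check of the "roughly linear" claim, using only the three measured values quoted above: dividing each elapsed time by the number of services gives a nearly constant cost per service. The function name is illustrative.

```python
# Average elapsed time per insert in ms, keyed by number of services (from the slide).
ELAPSED_MS = {1: 39.7, 2: 80.4, 3: 128.3}

def per_service_ms(elapsed):
    """Return the elapsed time divided by the service count, rounded to 0.1 ms."""
    return {n: round(ms / n, 1) for n, ms in elapsed.items()}

print(per_service_ms(ELAPSED_MS))  # {1: 39.7, 2: 40.2, 3: 42.8}
```

The per-service cost rises only slightly, from 39.7 ms to 42.8 ms, which is consistent with near-linear scaling over this small range.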

Performance Tests - Results

All critical operations perform with < 50 ms response times.

[Chart: elapsed time per stored procedure call (y-axis, 0 to 50 ms). Bar labels: BOD_BY_ALTERNATEID, BOD_BY_ORDERID, BOD_BY_TIN, BOD_SUMMARY_BY_TIN, SERVICE_INCONTEXT BY_PARAMS, SERVICE_INCONTEXT BY_SERVICEID, UPDATE SERVICEID. Groups on the x-axis: get BOD, get BOD summary, get Service, update ServiceID.]

Summary & Benefits of the Design (1/2)
- Hybrid XML/relational storage and indexing, designed to support specific workload and requirements
- XML manipulation expressed in declarative terms (i.e., SQL/XML), not in a procedural language
- Insert, query, and update logic in SQL/XML stored procedures instead of Java code
  - Data manipulation happens close to the data
  - Number of JDBC calls greatly reduced
  - Better performance
- Thin Java application layer calls stored procedures
  - Less code, lower complexity
- Leveraging the power of XQuery and SQL/XML for complex document manipulation in DB2
  - More efficient storage, so no parsing
  - Leave the data in the DB; don't pull it into Java memory, where the footprint can be 10x the raw XML footprint or more

Summary & Benefits of the Design (2/2)
- Database schema and design is product-agnostic, hence easier to maintain
  - Less maintenance to add new products
  - Shorter time to market
- Supports advanced operations such as get_service and get_order_summary
  - Applications can get data at a more granular level
  - Less resource consumption, better performance
- Efficient updates of individual services

Benefits of the DB2 pureXML Solution
- Less application code
- Lower complexity
- Better performance
- Single consistent view of all orders, easier and more accurate reporting
- Data access at more granular levels
- Easier (less expensive) maintenance, e.g., to introduce new products
- Reduced time to market

Lessons Learned
- Managing XML in the database efficiently is no longer wishful thinking
- Initial fears about DB2 pureXML turned out to be unfounded:
  - pureXML performance
  - Resource consumption
  - Database and index maintenance
  - Performance with large order documents
  - Schema changes
- Rapid application development with pureXML
  - Core concepts prototyped within days (POC)

Questions?