Deutsche Bank Finance IT Migration Oracle Exadata March 2012 Dr. Marcus Praetzas
Agenda
1 Business Drivers / Background
2 Motivation
3 PoC (Phase 1)
4 Migration (Phase 2)
5 Observations
Deutsche Bank Finance IT dbartos/fdw
Team
- Volker Bettag, Architect
- Dr. Michael Dreier, Infrastructure Manager
- Randolf Geist, Oracle Specialist
- Erwin Heute, Oracle Specialist
- Jens Koch, MicroStrategy Infrastructure/Project Manager Deutsche Bank Data Centre
Contact
Dr. Marcus Prätzas, Program Manager
Deutsche Bank AG, Wilhelm-Fay-Str. 31-37, D-65936 Frankfurt
marcus.praetzas@db.com
Business Background Drivers
Topic Area: Output / Activity
- Disclosure: 20-F Item 11 (Risk Section, 37 pages), Footnotes, Annual Report; Analyst presentations, Interim Report, Financial Data Supplement
- RWA (Basel I / II): RWA calculations, Monthly driver analysis, Quarterly COREP reporting; Monthly Basel II reporting, EPE, MR-RWA
- German Regulatory: KWG Capital, KWG 13 / 14, Financial Conglomerate disclosure; Influence rule making and interpretation
- EC / EL / GVA: EC / EL / Average Active Equity calculation and reporting; GVA for not impaired corporate credit exposure
- Daily Derivatives: Daily derivatives counterparty risk; Provide EPE calculations
- Others: Group Derivative Bookings, Global Securities Netting, Banking book collateral, Country risk,...
Motivation: Daily Processing Performance Demand 2010
Core Process* Run Times by Quarter
- Q1 2010: SAS deployed on AMD CPUs with internal PCIe SSD storage
- Q2 2010: InfiniBand private interconnect, enhanced parallel processing
- Q3 2010: Data warehouse infrastructure PoC using Oracle Exadata and SSD-based storage servers; Oracle infrastructure setup
- Q4 2010: Exadata Oracle infrastructure go-live
- Q1-Q3 2011: Exadata migration of the full environment
Target was ~10h, i.e. a 50% reduction in non-calculation steps was required.
* MR-RWA functionality added
Technical Process
[Diagram: daily and monthly sources, the credit risk engines (KWG Netting, B/S Netting, Expected Loss, GVA, RWA Basel II, EC, EL / EC, Country Risk, KWG 13 / 14, Principle I/II, SAS external calculation, distributed engines), daily QA, regional QA, process control, the data warehouse, and disclosure.]
Data Warehouse and more
PoC Testpoints
- Five key production processes were chosen.
- A full set of production data was used for testing.
- The tests were executed in the DB datacenter.
- The requirement was a 50% performance increase compared with the monthly production setup at the time.
[Diagram: Data Delivery, Input Area, Master Area, Reporting Area, Calculation, and View / Extract, with test points 1 to 5 marked.]
PoC Testpoint Characteristics
- Testpoint 1, Integration Function (TP1): Data transformation between two Oracle schemas. CPU-intensive work (e.g. currency conversion) as well as large sequential IO operations. The IO is done in parallel and includes substantial DML.
- Testpoint 2/3, SAS Engine Interface (TP2/3): Performs a data download from and upload to the SAS Engine environment. As this is not core database functionality but rather a regression test of the InfiniBand connection, it is not covered further here.
- Testpoint 4, Starbuilder (TP4): Large single-threaded operation, where CPU and IO performance are equally essential. Compared to TP1, these are far less complex operations.
- Testpoint 5, MicroStrategy Reporting (TP5): Random IO and massively parallel execution. Representative sets of 110 and 470 reports from production.
DDL: Data Definition Language, e.g. create table, partition. DML: Data Manipulation Language, e.g. insert record.
Testpoint 1 Results: Integration Function
- 54% performance gain on Exadata (V2).
- About 25 test runs with different Oracle / system configuration settings were executed for each environment.
- Only minor application changes were needed.
- Maximum parallelism causes internal Oracle contention issues; 5 compute nodes show the best performance.
[Chart: TP1, different parameter settings per system]
Oracle Exadata Scalability
- Scalability was tested using a variable number of compute nodes, with the exception of some parts that are executed across all available nodes.
- The optimum is reached with 5 nodes. Beyond that, no improvement was observed.
Oracle Exadata Scalability: Data Volume
- Doubling the data volume increases the runtime by 7%.
- Tripling the data volume increases the runtime by 19%.
- Quadrupling the data volume increases the runtime by 34%.
- Base runtime is 45 min.
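The percentages on this slide can be turned into absolute runtimes with a little arithmetic. The following is a minimal sketch, assuming the 45 min base runtime refers to the unscaled (1x) data volume; the function and constant names are illustrative and not taken from the presentation.

```python
# Figures quoted on the slide. Assumption: the 45 min base runtime
# belongs to the unscaled (1x) data volume.
BASE_RUNTIME_MIN = 45.0
REPORTED_INCREASE = {1: 0.00, 2: 0.07, 3: 0.19, 4: 0.34}


def runtime_minutes(volume_factor: int) -> float:
    """Absolute runtime in minutes for a reported data-volume factor."""
    return BASE_RUNTIME_MIN * (1.0 + REPORTED_INCREASE[volume_factor])


def runtime_per_unit_volume(volume_factor: int) -> float:
    """Runtime in minutes divided by the volume factor."""
    return runtime_minutes(volume_factor) / volume_factor


if __name__ == "__main__":
    for factor in sorted(REPORTED_INCREASE):
        print(
            f"{factor}x volume: {runtime_minutes(factor):.2f} min total, "
            f"{runtime_per_unit_volume(factor):.2f} min per unit of volume"
        )
```

Under that assumption, four times the data takes roughly 60 min instead of 45 min, so the runtime per unit of volume falls as the volume grows.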
Disaster Recovery: Active Data Guard
1. Disaster recovery solution using Oracle Data Guard for replication.
2. Use the standby database for reporting and backup purposes (MicroStrategy).
PoC Results
Decision / Result
- Oracle Exadata achieves the better overall performance improvement, at ~55%.
- In particular, the better reporting performance of Oracle Exadata adds significantly more value.
- Hybrid columnar compression (available on Exadata only) reduces historical data to ~25% of its original size.
- Lower cost than the previous solution (traditional SAN based).
Observation
- The PoC showed contention issues affecting the achievable performance and scalability of Oracle RAC on the Exadata V2. This occurs in particular with heavy use of DDL, such as truncating partitions and rebuilding indexes on other partitions in parallel.
Conclusion
- Migration of the full environment using V2-8.
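To make the compression figure concrete, the sketch below estimates the size of a database after compressing only its historical part to ~25%. The share of data that counts as historical is not given in the presentation, so `historical_share` is a purely hypothetical parameter, as are the function name and the example sizes.

```python
# "Down to ~25%" is the figure from the slide; everything else is assumed.
HCC_REMAINING_FRACTION = 0.25


def size_after_compression(total_tb: float, historical_share: float) -> float:
    """Estimated size in TB when only the historical share is compressed.

    historical_share is a hypothetical input in [0, 1]; the presentation
    does not state how much of the data is historical.
    """
    if not 0.0 <= historical_share <= 1.0:
        raise ValueError("historical_share must be between 0 and 1")
    historical_tb = total_tb * historical_share
    current_tb = total_tb - historical_tb
    return current_tb + historical_tb * HCC_REMAINING_FRACTION


if __name__ == "__main__":
    # Illustrative inputs only: a 10 TB database at several assumed shares.
    for share in (0.5, 0.8, 1.0):
        print(f"historical share {share:.0%}: "
              f"{size_after_compression(10.0, share):.2f} TB")
```

With an assumed historical share of 80%, a 10 TB database would shrink to 4 TB; only if all data were historical would the full reduction to 2.5 TB apply.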
Architecture Solution (2011)
Racks
- Full Rack #1: Oracle Exadata V2-8, Datacentre #1 (45 TB available for data + FRA)
- Full Rack #2: Oracle Exadata V2-8, Datacentre #2 (45 TB available for data + FRA), DR (Data Guard Copy)
- Full Rack #3: Oracle Exadata V2-8, Datacentre #2 (45 TB available for data + FRA), Clone (Snapshot Copy)
- Full Rack #4: Oracle Exadata V2, Datacentre #1 (45 TB available for data + FRA), existing system
On each rack
- Flash Recovery Area (all databases): 22 TB
- Cluster Filesystem (Buffer, etc.): 4 TB
Databases
- Monthly Prod. (10 TB), Monthly Production (Data Guard Copy) (10 TB), Monthly UAT (10 TB)
- Daily Prod. (6 TB), Daily Prod. (Data Guard Copy) (6 TB), Daily UAT (6 TB)
- Regional QA (2 TB), Regional QA (Data Guard Copy) (2 TB)
- INT (10 TB), DEV (3 TB)
- QA UAT (2 TB), QA INT (1 TB), QA DEV (0.5 TB)
- Contingency (7 TB)
Migration started Q1 2011
- Core functionality was proven in the PoC, and further performance gains were identified (index usage on the ODM).
- PoC complete and (daily) system live.
- Oracle Support assisted with go-live, environment review and tuning tips.
- All 12 findings from the PoC were resolved in under 3 weeks and addressed by patch bundle sets.
- Two additional findings since go-live have been resolved.
Migration Log Book Q2 / Q3 2011
May
- 3 ODMs delivered and handed over from Oracle to the data centre
- HW & SW install in ~10 days (Oracle)
- ODMs handed over from the data centre to the project 2 weeks later
June / July
- First full environment (incl. SAS, NFS, etc.) established
- Migration rehearsal & testing cycles
- Integration testing in July
August
- Dress rehearsal
September
- Data Guard lines established
- Improved performance with the 10G line, to be compared with the Q1 PoC on the 1G line
October / November
- Last cell patches applied on all 3 ODMs
- Final test cycles
- Go-live
Summary
- More than one year of experience with the software stack on Oracle Exadata, processing daily, weekly and monthly data.
- Performance, cost and storage objectives have been met.
- No hardware failures detected so far; important patches applied.
- The Exadata V2-8 configuration is to be rated above commodity level (using SAS disks only).
- Two powerful database nodes prove to deliver higher performance & stability than a larger number of smaller nodes.
- Since September, no critical service request has been open with Oracle.
More efficient credit risk reporting thanks to an optimized data warehouse infrastructure
THE COMPANY
Deutsche Bank is a leading global investment bank with a substantial private clients business and mutually reinforcing businesses.
Industry: Financial services
Employees: > 100,000
THE CHALLENGE
The analysis of credit risks and timely reporting are becoming ever more important. Increased data volumes and extensive calculations are a challenge for timely reporting. Meeting this challenge requires building a future-oriented, higher-performance infrastructure. More than 500 users actively access a wide variety of aspects of the DWH. Thousands of recipients worldwide are supplied with information in various formats.
ORACLE PRODUCTS & SERVICES
- Oracle Exadata Database Machine
- Oracle Linux
- Oracle Customer Support
CONCLUSION
Using the Oracle Exadata Database Machine for the credit risk reporting data warehouse means 50% shorter run times and 75% lower data volume, at roughly 20% lower cost.
THE SOLUTION
- Quarterly, monthly, weekly and daily delivery of reports with massively improved performance
- Run time for generating the daily reports cut by 50%
- Data volume reduced by 75% thanks to storage compression
- Cost savings of about 20%, reduced space requirements and lower power consumption
March 2012