Addressing Risk Data Aggregation and Risk Reporting
Ben Sharma, CEO, Zaloni
Big Data Everywhere Conference, NYC, November 2015
Agenda
1. Challenges with Risk Data Aggregation and Risk Reporting (RDARR)
2. How a managed Hadoop Data Lake can serve as an ideal data acquisition hub for RDARR analytics and reporting
3. Architectural considerations and required capabilities for addressing BCBS 239 Risk Data Aggregation and Risk Reporting
4. Potential solution leveraging Zaloni's Bedrock, a Hadoop data management platform
Overview of BCBS 239 and RDARR
- Basel Committee on Banking Supervision (BCBS) 239
- Requires that the information used to drive decision making captures all risks with appropriate accuracy and timeliness
- Overarching principles of effective risk management reporting and governance:
  - Data Governance
  - Adaptability
  - Data and IT architecture
  - Frequency
  - Accuracy and Integrity
  - Distribution
  - Completeness
  - Review
Challenges with BCBS 239 compliance
1. Performance requirements:
   - Computationally intensive models
   - Systems must scale and retain security and resiliency
2. Large volumes of data:
   - Demands to manage and record every transaction in real time
   - Long-term data retention requirements
3. Fragmented systems:
   - Risk data is often scattered across the organization in silos
   - Current relational stores have different schemas, limiting cross-enterprise visibility
4. Cost pressures:
   - Increasing cost of compliance while profits and budgets decline
5. Data Governance:
   - One of the key aspects of RDARR is proper Data Governance
Modernizing your data architecture as the path to success
A Hadoop Data Lake is the optimal underlying architecture:
- Provides the most scalable solution
- Dramatically more cost-effective than traditional data storage solutions
- Enables you to deal with the volume, variety and velocity of incoming data
- Breaks down the silos built up through the traditional database architecture
Potential challenges with a Hadoop Data Lake solution:
- Data Management: metadata, lineage, data quality, automation for data acquisition
- Does it have enterprise-grade data integrity and security? (e.g. access control, data masking)
- Will it integrate into my existing data environment? (e.g. can data flow with the frequency required for SLAs?)
Data Lake reference architecture for RDARR
[Diagram; labels grouped as they appear to be laid out on the slide]
- Source Systems: File Data, DB Data, ETL Extracts, Streaming, Logs (or other unstructured data), Cloud Services
- Hadoop Data Lake:
  - Transient Loading Zone
  - Raw Data: Original unaltered data attributes, Tokenized Data
  - Trusted Data: Reference Data, Master Data
  - Refined Data: Integrate to common format, Data Validation, Data Cleansing, Aggregations
  - Discovery Sandbox: Data Wrangling, Data Discovery, Exploratory Analytics
- Consumption Zone: OLTP or ODS, Enterprise Data Warehouse, APIs
- Consumers: Business Analysts, Researchers, Data Scientists
- Across all zones: Metadata, Data Quality, Data Catalog, Security
Data Acquisition Framework for RDARR
- Automated Data Acquisition Framework providing timeliness of data
- Capture metadata in all phases: ingestion, transformation
- Integration with Enterprise Metadata Management
- Integrated Data Quality Analysis
[Diagram labels]
- Source Systems: RDBMS, Mainframes, Flat files, Binary files
- Data Acquisition Automation: Data Ingestion, Data Quality and Validation, Layout Standardization, Operational Metadata Generation
- Metadata: Extract/read metadata; Register/update metadata in metadata repositories or a Metadata Management solution
- Data at Rest
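The ingestion and data quality steps on this slide can be sketched in a few lines. This is only an illustration, not Bedrock's API: the `RULES` table, the `ingest` function, and the field names `trade_id` and `notional` are all hypothetical. It reads a delimited extract, separates rows that pass or fail simple quality rules, and generates operational metadata alongside the data.

```python
import csv
import io
from datetime import datetime, timezone


def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical quality rules: field name -> predicate on the raw string value.
RULES = {
    "trade_id": lambda v: v.strip() != "",
    "notional": _is_number,
}


def ingest(source_name, text):
    """Split rows into accepted/rejected and emit operational metadata."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        # A row is rejected if any rule fails; missing values count as empty.
        failed = [f for f, rule in RULES.items() if not rule(row.get(f) or "")]
        (rejected if failed else accepted).append(row)
    metadata = {
        "source": source_name,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "record_count": len(accepted) + len(rejected),
        "rejected_count": len(rejected),
    }
    return accepted, rejected, metadata
```

Calling `ingest("positions.csv", text)` returns the accepted rows, the rejected rows, and a metadata dictionary that could then be registered with a metadata management solution.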
Metadata Registration
Considerations:
- Business metadata: business names, descriptions, tags, quality and masking rules
- Operational metadata: source and target locations of data, size, number of records, lineage
- Technical metadata: type of data (text, JSON, Avro), structure of the data (the fields and their types)
- Integration with Enterprise Metadata Management Solutions
[Diagram: edge node to cluster flow; labels as shown]
- START
- Metadata file
- API: retrieve metadata (origin info, timestamp, etc.)
- Enterprise Metadata Repositories
- Hadoop Cluster: metadata check-in (copy to repository, add tags)
- Operational metadata file
- END
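One way to picture the three kinds of metadata is as a single catalog entry per dataset. The sketch below is an assumption made for illustration: the class names, fields, and the in-memory `MetadataCatalog` are hypothetical stand-ins, not the schema of any enterprise metadata repository or of Bedrock.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessMetadata:
    name: str
    description: str
    tags: list = field(default_factory=list)
    masking_rules: dict = field(default_factory=dict)  # field -> rule name


@dataclass
class OperationalMetadata:
    source: str
    target: str
    size_bytes: int
    record_count: int
    lineage: list = field(default_factory=list)  # upstream dataset names


@dataclass
class TechnicalMetadata:
    data_format: str  # e.g. "text", "JSON", "Avro"
    schema: dict      # field name -> type name


@dataclass
class DatasetEntry:
    business: BusinessMetadata
    operational: OperationalMetadata
    technical: TechnicalMetadata


class MetadataCatalog:
    """In-memory stand-in for a metadata repository (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        """Register a new entry, or update it if the name already exists."""
        key = entry.business.name
        status = "updated" if key in self._entries else "registered"
        self._entries[key] = entry
        return status

    def find_by_tag(self, tag):
        """Return the sorted names of datasets carrying the given tag."""
        return sorted(
            name for name, e in self._entries.items() if tag in e.business.tags
        )
```

Keeping the three groups separate mirrors the slide: business users maintain names, tags and rules, while operational and technical metadata can be generated automatically at ingestion.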
Data Transformation and Aggregation
Considerations:
- Layout Standardization: create rationalized data models for RDARR
- Data Completeness: capture and aggregate all material risk data across the banking group
- Transformation: raw data from multiple sources is aggregated to create risk reports
- Timeliness and Frequency: generate aggregate and up-to-date risk data in a timely manner
- Security and Controls:
  - Masking for PII
  - Access controls for Risk Reports
Risk Reports:
- Credit Risk
- Market Risk
- Liquidity Risk
- Capital Risk
- Stress Testing
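The considerations above (layout standardization, PII masking, aggregation) can be illustrated with a small sketch. Everything here is assumed for the example: the source system names, the field mappings, and the use of a salted SHA-256 hash as the masking technique. A production deployment would use the bank's own tokenization or masking service rather than this simple hash.

```python
import hashlib
from collections import defaultdict

# Hypothetical source layouts mapped onto one rationalized model.
FIELD_MAPS = {
    "loans_system": {"cust": "customer_id", "amt": "exposure", "type": "risk_type"},
    "trading_system": {"counterparty": "customer_id", "mtm": "exposure", "category": "risk_type"},
}


def standardize(source, record):
    """Rename source-specific fields to the common layout."""
    mapping = FIELD_MAPS[source]
    row = {target: record[src] for src, target in mapping.items()}
    row["exposure"] = float(row["exposure"])
    row["source"] = source  # keep the origin for lineage
    return row


def mask_pii(row, salt="demo-salt"):
    """Replace the customer identifier with a salted hash (illustrative only)."""
    digest = hashlib.sha256((salt + row["customer_id"]).encode("utf-8")).hexdigest()
    return {**row, "customer_id": digest[:12]}


def aggregate_by_risk_type(rows):
    """Sum exposures per risk type across all sources."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["risk_type"]] += row["exposure"]
    return dict(totals)
```

Because masking is deterministic for a given salt, the same customer arriving from two systems maps to the same token, so exposures can still be aggregated per customer without exposing the raw identifier.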
Non-functional considerations
- Ability to handle a variety of input sources and output destinations
- Handle fluctuations in input data
- High-throughput and low-latency handling
- Validation and tagging on the fly
- Preserve order
- Fault tolerance
- Non-stop modifications
- Simple to build and operate
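As a rough illustration of "validation and tagging on the fly" while preserving order, the sketch below processes records one at a time with a generator. The `validate_and_tag` function and the `amount` field are hypothetical. Invalid records are tagged rather than dropped, so a bad record does not stop the stream.

```python
def validate_and_tag(records):
    """Yield each record in arrival order, tagged with a validation result."""
    for seq, record in enumerate(records):
        try:
            float(record["amount"])
            tags = ["valid"]
        except (KeyError, TypeError, ValueError):
            # Tag the failure and keep going instead of failing the stream.
            tags = ["invalid_amount"]
        # The sequence number lets downstream consumers verify ordering.
        yield {**record, "seq": seq, "tags": tags}
```

Since the generator pulls one record at a time, it works the same way on a finite list or on an unbounded stream.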
Managed Data Lake enabled for RDARR
[Diagram; labels grouped as they appear to be laid out on the slide]
- Data Sources: Portfolios, Positions, Market Data, Social, Enterprise Data
- Data Types: Relational, Change Data, Streaming, File
- Edge Node: Stream Adapters, File Collectors
- Data Lake: Hadoop Cluster
- Export to Consumers: Enterprise Data Warehouse, Data Analytical Applications, Apps/Analytics Tools
- Bedrock Application Manager:
  - Configure Ingestion
  - Operations and Metadata Store
  - Transformations
  - Administer Metadata
  - Data Quality & Rules Engine
  - Query Builder
  - Manage, Monitor, Schedule
  - Workflow Executor
- Risk reporting: Credit Risk, Market Risk, Liquidity Risk, Capital Risk, Stress Testing, Scorecards, Enterprise Reports
Benefits
Addressing some of the key principles of BCBS 239:
- Data architecture and IT infrastructure that supports normal and high-stress scenarios
- Accuracy and integrity of data for effective risk management and decision-making
- Completeness of data to ensure informed decision-making
Additional benefits beyond demonstrating compliance:
- Banks can make timely, defensible, informed decisions related to risk exposure
- Reduced probability and severity of loss from fraud, etc.
- Improved strategic planning and ability to launch new products and services
- Reduced cost compared to traditional EDW solutions (60-70% OPEX reduction over 3 years)
Join us for our upcoming webinar
Governance of the Big Data Lake with a focus on RDARR for Financial Services
- Identifying Critical Data Elements for RDARR
- Establishing Data Standards for Critical Data Elements
- Supporting data lineage for Big Data to support regulatory compliance
- Managing Data Quality
- Hardening Information Security
Register at www.zaloni.com
Speakers:
- Sunil Soares, Principal, Information Asset
- Ben Sharma, CEO, Zaloni
Visit zaloni.com or contact us at info@zaloni.com