UNCLASSIFIED UNCLASSIFIED


1 UNCLASSIFIED Navy Cyber Defense Operations Command UNCLASSIFIED Cyber Warriors Ever Vigilant

2 How Fannie Mae Leverages Data Quality to Improve the Business
April 23, 2015
Speaker: James Barrett, Federal National Mortgage Association, Washington DC, USA
2015 Fannie Mae

3 About Fannie Mae
As the leading source of residential mortgage credit in the U.S. secondary market, Fannie Mae is supporting today's economic recovery and helping to build a sustainable housing finance system. We exist to provide reliable, large-scale access to affordable mortgage credit in all communities across the country at all times so people can buy, refinance, or rent homes. We are working to establish and implement industry standards, develop better tools to price and manage credit risk, build new infrastructure to ensure a liquid and efficient market, and facilitate the collection and reporting of data for accurate financial reporting and improved risk management. We are committed to being our customers' most valued business partner and delivering the products, services, and tools our customers need to serve the entire market confidently, efficiently, and profitably.

4 SPEAKER
James Barrett is the Data Quality Manager in Enterprise Data, Operations & Technology at Federal National Mortgage Association (Fannie Mae). His background includes architecture (enterprise and solutions), database administration, project management, and custom software development specializing in enterprise data stores and, of course, data quality.
Note: The views expressed in this presentation are the speaker's and do not necessarily represent those of Fannie Mae.

5 How Fannie Mae Leverages Data Quality to Improve the Business
1. Overview
2. Data Quality
   - Who cares?
   - Why care?
   - What is it?
   - When to deploy?
   - Where to deploy?
3. Expectations & Experiences
   - Centralized vs. federated vs. self service models for DQ build out
   - Effective self service DQ
   - DQ integration with enterprise architecture
   - Cost reduction
   - DQ ownership
4. Data Quality Next Steps

6 Who cares about Data Quality?
- Regulators: Enterprise vs. business silos
- Data Governance & Chief Data Officer: Responsible for DQ to Senior Management
- Data Owners: Need to be aware of DQ and fix it if necessary
- Data Managers: Governance and Owners look to EDM for solutions for DQ
- Users of Data: Provide data used by decision makers, tactical and strategic
- People affected by decisions made by users of data: Customers, policy makers, planners
The Enterprise Data Quality Manager has many viewpoints and opinions to consider!

7 Why care about Data Quality?
- Because regulators care
- Data quality affects quality of work and life
  - Did you use DQ today?
  - Do your teams use DQ in their jobs?
- How can governance, data owners, and data management ever meet the enterprise DQ need?
  - Data keeps growing
  - Roles and responsibilities need definition, and change over time
Some sort of balancing act must be achieved

8 What is Data Quality?
- Fit for use
  - Avoid overkill; use DQ to meet the purpose for which data is used; not all data is critical for all purposes; global vs. local
  - As many criteria as there are uses for data
  - My fatal error may be your trivial warning
- Measurement, Monitoring, Remediation
- DQ Business architecture
  - Data correction is tough
  - Attribute validation vs. source-target reconciliation
  - Use cases enable fit-for-use analysis
Start with a DQ Business Architecture:
- DQ Rule Metadata Template
- DQ Functions
- DQ Use Cases
- DQ Relationships with Metadata Management
- DQ Relationships with Enterprise Logical Data Modeling
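The "fit for use" idea on this slide, where one consumer's fatal error is another's trivial warning, can be sketched as a rule record whose severity depends on the use case. This is only an illustration: the `DQRule` fields, the attribute names, and the use-case labels are hypothetical and are not taken from the DQ Rule Metadata Template mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class DQRule:
    """Hypothetical rule metadata: one check, many use-specific severities."""
    rule_id: str
    attribute: str
    description: str
    check: Callable[[object], bool]   # returns True when the value passes
    severity_by_use: Dict[str, str]   # use case -> "fatal" or "warning"


def evaluate(record: dict, rules: List[DQRule], use_case: str) -> List[Tuple[str, str]]:
    """Return (rule_id, severity) for every rule that fails for this use case."""
    findings = []
    for rule in rules:
        severity = rule.severity_by_use.get(use_case)
        if severity is None:
            continue  # the rule is not relevant for this use of the data
        if not rule.check(record.get(rule.attribute)):
            findings.append((rule.rule_id, severity))
    return findings


# Illustrative rules and use cases (assumptions, not from the presentation).
rules = [
    DQRule(
        rule_id="R001",
        attribute="loan_amount",
        description="Loan amount must be a positive number",
        check=lambda v: isinstance(v, (int, float)) and v > 0,
        severity_by_use={"financial_reporting": "fatal", "marketing": "warning"},
    ),
    DQRule(
        rule_id="R002",
        attribute="property_zip",
        description="ZIP code must be five digits",
        check=lambda v: isinstance(v, str) and len(v) == 5 and v.isdigit(),
        severity_by_use={"marketing": "fatal"},
    ),
]

record = {"loan_amount": 0, "property_zip": "2000"}
reporting_findings = evaluate(record, rules, "financial_reporting")
marketing_findings = evaluate(record, rules, "marketing")
```

The same record yields different findings per use case, which is one way to avoid applying every rule at full severity to every consumer.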

9 When to deploy Data Quality?
- Reactive
  - After somebody notices
  - After somebody asks (with or without $)
- Proactive
  - Before anybody notices
  - Before it spreads downstream
  - Use a predefined list of data attributes and standard rules
  - Exceptions: accept, replace, or reject
Being proactive can be expensive; being reactive is risky
Consider your consumers when defining exception rules
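The three exception outcomes named on this slide (accept, replace, or reject) could be expressed as a small per-attribute policy table. The sketch below is an assumption about how such a policy might look; the attribute names, the default values, and the choice to reject unknown attributes are all hypothetical.

```python
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class Disposition(Enum):
    ACCEPT = "accept"    # load the value as-is and record the exception
    REPLACE = "replace"  # load a predefined default instead
    REJECT = "reject"    # do not load the record


# Hypothetical policy: attribute -> (disposition, replacement value).
EXCEPTION_POLICY: Dict[str, Tuple[Disposition, Optional[Any]]] = {
    "interest_rate": (Disposition.REJECT, None),
    "occupancy_type": (Disposition.REPLACE, "UNKNOWN"),
    "borrower_phone": (Disposition.ACCEPT, None),
}


def handle_exception(attribute: str, value: Any) -> Tuple[bool, Any]:
    """Return (keep_record, value_to_load) for a value that failed a DQ rule.

    Attributes without an explicit policy are rejected, a conservative
    default chosen for this sketch.
    """
    disposition, default = EXCEPTION_POLICY.get(attribute, (Disposition.REJECT, None))
    if disposition is Disposition.ACCEPT:
        return True, value
    if disposition is Disposition.REPLACE:
        return True, default
    return False, None
```

Which disposition suits an attribute depends on who consumes it downstream, which is the slide's point about considering consumers when defining exception rules.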

10 Where to deploy Data Quality?
- Application Build Out: Centralized vs. Federated
  - While loading ("in flight") OR after loading data ("at rest")
- Self Service
  - Can be fast and cheap
  - Can't handle all DQ rules and requirements
- Areas of risk
  - How to identify?
  - How to quantify?
- At the source if possible
  - Need 20/20 hindsight OR green field projects
Hybrid strategies seem the most robust
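The difference between checking data "in flight" and "at rest" can be illustrated by applying one rule in two places: inside the load path, and as a scan over already stored records. This is a minimal sketch; the `state` attribute, the rule, and the in-memory lists standing in for a source feed and a data store are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]


def is_valid_state(record: Record) -> bool:
    """Hypothetical rule: state must be a two-letter upper-case code."""
    state = record.get("state")
    return isinstance(state, str) and len(state) == 2 and state.isupper()


def load_in_flight(source: Iterable[Record], rule: Rule,
                   exceptions: List[Record]) -> Iterator[Record]:
    """In flight: check each record while loading; failures never reach the target."""
    for record in source:
        if rule(record):
            yield record
        else:
            exceptions.append(record)


def scan_at_rest(stored: Iterable[Record], rule: Rule) -> List[Record]:
    """At rest: the data is already loaded; report failures after the fact."""
    return [record for record in stored if not rule(record)]


incoming = [
    {"id": 1, "state": "VA"},
    {"id": 2, "state": "va"},
    {"id": 3, "state": None},
]

in_flight_exceptions: List[Record] = []
loaded = list(load_in_flight(incoming, is_valid_state, in_flight_exceptions))
at_rest_exceptions = scan_at_rest(incoming, is_valid_state)
```

Both paths find the same failing records here; the difference is that the in-flight path keeps them out of the target, while the at-rest scan reports on data that consumers may already be using.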

11 DQ application build out
- Centralized
  - Rules built/run by 1 application for other applications' data at rest
  - Single application implies single owner -> enterprise data governance
  - Initial DQ build out: DQ standards, design patterns, CoE resources
- Federated
  - Rules built/run by application for its own data in flight or at rest
  - Rules and stored metrics/exceptions owned by each application
- Self Service
  - Rules built/run by data analyst/team at rest
  - Rules and stored metrics/exceptions owned by team/analyst
  - Tool based (e.g., IDQ Analyst) rather than custom development
Federated scales better than the centralized model; Self Service has the lowest cost per rule, but setup and support require a DQ CoE

12 Effective Self Service DQ
- Training and documentation
- Usage tracking after training
- Customer feedback
- Desk-side support after training
- Team-based access and change control
- Individual and shared rule folders
Confucius: "Feed a man a fish, you feed him for a day. Teach him to fish, you feed him for a lifetime."

13 DQ <-> Enterprise Architecture
- Centralized DQ rule repository
- Data quality rule lineage
- Technical vs. business DQ rules
- Patterns for DQ rules in data flow from/to:
  - Transaction Data Store
  - Operational Data Store
  - Master Data Store
  - Data Warehouse
  - Data Mart
  - BPM and BAM
- Data exceptions and corrections imply:
  - Alerts
  - Replay corrections for downstream
  - Recalculation of derived attributes
Architecting DQ <-> EA: don't let the perfect be the enemy of the good.

14 Cost reduction
- Reusable components
  - DQ Rules: Logical and Physical Design Patterns
  - DQ Rules Results: Data Structures for Summary and Detail Metrics
  - DQ Reports: Metadata driven
  - DQ Warranty
  - DQ Metrics Common Message XSD
  - Self Service DQ framework
"Killing two birds with one stone" is a proverb made for DQ cost reduction
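One of the reusable components listed above is a shared data structure for summary and detail rule results. The sketch below shows one possible shape, with summary metrics derived from detail rows so that every application can report in the same way. The class and field names are hypothetical and are not the structures or the XSD referred to in the talk.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RuleResultDetail:
    """One row per record checked by a rule (hypothetical detail structure)."""
    rule_id: str
    record_key: str
    passed: bool


@dataclass(frozen=True)
class RuleResultSummary:
    """One row per rule, aggregated from the detail rows."""
    rule_id: str
    records_checked: int
    records_failed: int

    @property
    def pass_rate(self) -> float:
        # A rule that checked nothing is treated as fully passing in this sketch.
        if self.records_checked == 0:
            return 1.0
        return 1 - self.records_failed / self.records_checked


def summarize(details: List[RuleResultDetail]) -> Dict[str, RuleResultSummary]:
    """Roll detail rows up into one summary per rule."""
    checked = Counter(d.rule_id for d in details)
    failed = Counter(d.rule_id for d in details if not d.passed)
    return {
        rule_id: RuleResultSummary(rule_id, checked[rule_id], failed[rule_id])
        for rule_id in checked
    }
```

Keeping the detail rows lets application owners remediate individual exceptions, while the summary rows feed metadata-driven reports without rescanning the data.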

15 DQ ownership
- Data Owners: defining DQ rules, by ELDM entity/attribute
- Application Owners: remediation of DQ exceptions
- Data Governance: DQ policies and standards
- Data Management: best practices for implementation of DQ standards
- Data Users: identifying and raising DQ issues to all of the above
DQ management requires good negotiation and persuasion skills to build teams

16 Data quality next steps
- Define KPIs to manage 3 DQ build out models
- Integrate Self Service and Federated DQ
- Quantify DQ risk at rule level, and apply to DQ warranty value chain
- Integrate BAM and BPM with data corrections
"It ain't what you don't know that hurts you, it's what you know for certain that ain't so." - Mark Twain
