Data Quality and the PPDM Business Rules
Steve Cooper: President
7061 S. University Blvd, Centennial, CO 80122
303-790-0919
www.energyiq.info
Background
- The PPDM Business Rules initiative provides a platform for sharing data quality rules
Background
- The rules by themselves are only part of the solution
- We also need to establish a consistent process for applying the rules:
  - Identify the most valuable data based on an analysis of workflows and decisions
  - Apply the dimensions of data quality
  - Develop rules that provide a quantitative assessment of the quality of the data we care about
  - Manage and run the rules effectively
  - Present the results so that critical trends and problems can be easily identified
Background
- Most companies do not follow an established process
- Establishing a consistent process for applying data quality rules is the focus of this presentation
Data Value
- Data quality initiatives are expensive and can be overwhelming
- We need to focus resources on delivering the most value to the organization
- This can be achieved by assigning a value to data based on an assessment of business needs:
  - Workflows
  - Processes
  - Decisions
Data Value
- Assign a value to data based on a scale:
  - Level 1: Critical
  - Level 2: Important
  - Level 3: Useful
  - Level 4: Supportive
Data Quality Dimensions
- We need to be clear on what we mean by data quality
- Typically, data quality is defined and measured along a number of different dimensions:
  - Accuracy
  - Timeliness
  - Completeness
  - Currency
  - Consistency
  - Standards
- We can establish quality requirements for the most valuable data along these dimensions
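One way to record quality requirements against the value scale and the dimensions is a small lookup keyed by data object. The sketch below is illustrative only: the object names, the value level given to each object, the target figures, and the helper `objects_to_assess` are all hypothetical and are not taken from PPDM or from this presentation.

```python
from enum import IntEnum


class DataValue(IntEnum):
    """The four-level data value scale (1 is the most valuable)."""
    CRITICAL = 1
    IMPORTANT = 2
    USEFUL = 3
    SUPPORTIVE = 4


# The six quality dimensions listed on the slide.
DIMENSIONS = (
    "accuracy", "timeliness", "completeness",
    "currency", "consistency", "standards",
)

# Hypothetical requirements: a value level per data object, plus the minimum
# acceptable quality (0.0 to 1.0) for the dimensions that matter for it.
REQUIREMENTS = {
    "well_location": {
        "value": DataValue.CRITICAL,
        "targets": {"accuracy": 0.99, "completeness": 0.98},
    },
    "directional_survey": {
        "value": DataValue.IMPORTANT,
        "targets": {"completeness": 0.95},
    },
    "well_test": {
        "value": DataValue.USEFUL,
        "targets": {"currency": 0.80},
    },
}


def objects_to_assess(requirements, max_level):
    """Return object names valued at max_level or better, most valuable first."""
    selected = [
        (spec["value"], name)
        for name, spec in requirements.items()
        if spec["value"] <= max_level
    ]
    return [name for _, name in sorted(selected)]
```

Calling `objects_to_assess(REQUIREMENTS, DataValue.IMPORTANT)` limits the assessment to Level 1 and Level 2 objects, which is how limited resources can be focused on the data that matters most.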
[Figure: Data Quality Dimensions]
Data Quality Rules
- Once the data value and quality matrix has been established, it provides the framework for building the rules library
- The PPDM Business Rules initiative will provide a comprehensive list of quality rules, each in the form of a definition and supporting information
- The rules need to be translated into a format that can be executed to return a quantitative assessment of data quality:
  - Combine rules
  - Target different databases or subsets of a database
  - Process automatically or manually
Data Quality Rules
- Individual rules can be created in standard SQL and stored in the PPDM data model
- Rules should return a Quality % and a list of exceptions:
  - Quality % = 1 - (Exception Count / Population Count), expressed as a percentage
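A minimal sketch of a single rule, assuming an in-memory SQLite database in place of a real PPDM instance. The `well` table here is a simplified stand-in with hypothetical columns and sample rows, not the actual PPDM schema, and `run_rule` is an illustrative helper name. The rule returns both the quality score and the list of exceptions, using the formula Quality % = 1 - (Exception Count / Population Count).

```python
import sqlite3

# Simplified, hypothetical stand-in for a well header table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE well (uwi TEXT PRIMARY KEY, surface_latitude REAL)")
conn.executemany(
    "INSERT INTO well VALUES (?, ?)",
    [("W1", 39.6), ("W2", None), ("W3", 95.0), ("W4", 40.1)],
)

# Rule: surface latitude must be present and within -90..90 degrees.
# The query returns the rows that FAIL the rule (the exceptions).
EXCEPTION_SQL = """
    SELECT uwi FROM well
    WHERE surface_latitude IS NULL
       OR surface_latitude NOT BETWEEN -90 AND 90
    ORDER BY uwi
"""
POPULATION_SQL = "SELECT COUNT(*) FROM well"


def run_rule(conn, exception_sql, population_sql):
    """Run one rule and return (quality as a fraction, list of exception keys)."""
    exceptions = [row[0] for row in conn.execute(exception_sql)]
    population = conn.execute(population_sql).fetchone()[0]
    # Quality = 1 - exceptions / population; undefined for an empty population.
    quality = 1 - len(exceptions) / population if population else None
    return quality, exceptions


quality, exceptions = run_rule(conn, EXCEPTION_SQL, POPULATION_SQL)
print(f"Quality: {quality:.0%}, exceptions: {exceptions}")
# prints: Quality: 50%, exceptions: ['W2', 'W3']
```

Writing the rule as a query for exceptions, rather than for passing rows, gives the exception list directly and keeps the score to two counts.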
Data Quality Rules
- Rule Sets combine individual rules:
  - Well Header
  - Well Test
  - Dates and Elevations
  - ...
- They can be run against a target subset of the database:
  - State or County
  - Formation
  - Rig
  - Operator
  - ...
- This combination of Rule, Rule Set, and Target enables sophisticated data quality analysis to be performed:
  - Results can be stored in the PPDM database
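The Rule, Rule Set, and Target combination can be sketched as follows, again assuming SQLite in place of a PPDM database. The table layout, rule names, rule set name, sample rows, and the `run_rule_set` helper are all hypothetical. Each rule is a SQL condition that is true for an exception, a rule set is a list of rule names, and the target is a parameterized filter applied to both the population and the exceptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE well (uwi TEXT PRIMARY KEY, state TEXT, spud_date TEXT, "
    "kb_elevation REAL, ground_elevation REAL)"
)
conn.executemany(
    "INSERT INTO well VALUES (?, ?, ?, ?, ?)",
    [
        ("W1", "CO", "2009-03-01", 5210.0, 5195.0),
        ("W2", "CO", None, 5100.0, 5120.0),          # no spud date, KB below ground
        ("W3", "TX", None, 850.0, 835.0),            # outside the CO target
        ("W4", "CO", "2010-06-15", 4970.0, 4980.0),  # KB below ground
    ],
)

# Each rule is a SQL condition that is TRUE for an exception.
# Rule and target text is trusted configuration, not user input.
RULES = {
    "spud_date_present": "spud_date IS NULL",
    "kb_above_ground": "kb_elevation < ground_elevation",
}
RULE_SETS = {"dates_and_elevations": ["spud_date_present", "kb_above_ground"]}


def run_rule_set(conn, rule_set, target_sql, target_params=()):
    """Run every rule in a rule set against the rows selected by the target."""
    population = conn.execute(
        f"SELECT COUNT(*) FROM well WHERE ({target_sql})", target_params
    ).fetchone()[0]
    results = {}
    for name in RULE_SETS[rule_set]:
        rows = conn.execute(
            f"SELECT uwi FROM well WHERE ({target_sql}) AND ({RULES[name]}) "
            "ORDER BY uwi",
            target_params,
        ).fetchall()
        exceptions = [row[0] for row in rows]
        quality = 1 - len(exceptions) / population if population else None
        results[name] = (quality, exceptions)
    return results


# Target: wells in one state. Only the three CO wells form the population.
results = run_rule_set(conn, "dates_and_elevations", "state = ?", ("CO",))
for name, (quality, exceptions) in results.items():
    print(f"{name}: {quality:.1%} {exceptions}")
```

Because the target is applied to the population as well as to the exceptions, scores for different states, formations, or operators stay comparable. The `results` dictionary is the kind of record that could then be written back to the database.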
[Figure: Data Quality Rules Management]
[Figure: Data Quality Results]
Data Quality Results
- Establish acceptable thresholds and ranges
- Set meaningful targets for data vendors
- Assign a value to data in an acquisition
- Begin to treat data as an asset
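Thresholds and ranges could be applied to rule results along these lines. The threshold figures, the warning margin, and the `assess` function are hypothetical placeholders chosen for illustration. Real values would come from the business analysis and would likely differ by data object and dimension.

```python
# Hypothetical minimum acceptable quality for each data value level
# (1 = Critical ... 4 = Supportive).
THRESHOLDS = {1: 0.98, 2: 0.95, 3: 0.90, 4: 0.80}


def assess(quality, value_level, warn_margin=0.02):
    """Classify a quality score (0.0 to 1.0) against the threshold for its value level."""
    minimum = THRESHOLDS[value_level]
    if quality >= minimum:
        return "acceptable"
    # Scores just below the minimum fall in a warning range.
    if quality >= minimum - warn_margin:
        return "warning"
    return "unacceptable"


print(assess(0.99, 1))  # prints: acceptable
print(assess(0.97, 1))  # prints: warning
print(assess(0.85, 3))  # prints: unacceptable
```

The same function can express a vendor target: a delivery whose scores all classify as "acceptable" meets the agreed level.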
Summary
[Diagram: Business Workflows, Decision Points, Data Requirements, Business Analysis, Data Value, Data Quality, Quality Rules, Metrics, Data Analysis (PPDM), Fix/Audit]
Summary
- The PPDM Business Rules initiative provides a great foundation for any data quality initiative
- To be successful, however, a consistent and robust process must be adopted for developing and executing data quality rules
- The process must include an analysis of the needs of the business and the corresponding value of the data
- We must be able to effectively manage large numbers of rules and how they are executed
- Data quality visualization is important
- The PPDM data model is a great place to store the data quality rules, exceptions, and results
Questions?
steve.cooper@energyiq.info
Steve Cooper Ph.D., Principal, EnergyIQ
7061 S. University Blvd, Centennial, CO 80122
303-790-0919
www.energyiq.info
Data Value: Data Objects
- It is difficult to think about data attributes in isolation
- It makes more sense to think in terms of data objects that group related information:
  - Well location
  - Depths and elevations
  - Directional surveys
  - Tests
- This ties in to the concept of the Common Object Model
  - See PPDM Houston conference