Modernizing Your Data Strategy: Understanding SAS Solutions for Data Integration, Data Quality, Data Governance and Master Data Management
Gregory S. Nelson, ThotWave Technologies, LLC
Lisa Dodson, SAS
Outline
- Introduction
- Data Integration
- Data Quality
- Master Data Management
- Data Governance
- SAS Technology Landscape
- Reference Architectures
- Typical Architectures
- Modernization Strategies
- Summary
Your Presenters
Lisa Dodson, Manager, Data Management Americas Technology Practice (SAS)
Lisa has been with SAS for 14 years and is a recognized expert in the information management, data governance and data management space within the organization. She holds a Master's Degree in Information Quality and has affiliations with many data management/governance organizations, including as a former board member and President for the International Association for Information and Data Quality and as an organizing committee member for MITIQ's Industry Symposium. Through job roles including account executive, systems engineer, product manager, technical trainer and solutions architect, she has developed a deep understanding of the SAS software architecture. In her current role she leads the Americas Data Management Practice.
Greg Nelson, CPHIMS, MMCi, President and CEO, ThotWave Technologies
Founder and CEO of ThotWave. Recovering social psychologist, clinical informaticist (Duke University), prolific writer/presenter (150+ publications/presentations). 25+ year SAS expert, certified in SAS (Grid Computing, Data Mining), Healthcare, Six Sigma and Balanced Scorecard. Passionate about healthcare and how analytics can improve both populations and patients.
1. Introduction
Business context for these technologies
Rationale for this paper
- Provide a clear understanding of the modernization strategies available to our customers
- Create clarity around how SAS bundles these products
Data Integration
Extract, Transform, Load
Data Integration Defined
- Data Integration: bringing data from two or more sources together into a single view for analysis and reporting; also the set of tools that support the extraction, transformation and loading processes.
- Data Management: the global set of practices that govern how data strategies are designed, executed and governed within an organization; the guiding principles, architectures, policies, practices and procedures for managing data within the enterprise.
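The "single view" idea above can be sketched in Base SAS. This is a minimal illustration under stated assumptions, not a SAS Data Integration job: the table names (crm_customers, billing_orders, customer_view), the column names and the data values are all hypothetical.

```sas
/* Hypothetical source 1: customer records from a CRM extract */
data work.crm_customers;
    length name $ 20;
    input cust_id name $;
    datalines;
1 Alice
2 Bob
3 Carol
;
run;

/* Hypothetical source 2: order records from a billing extract */
data work.billing_orders;
    input cust_id amount;
    datalines;
1 120.50
1 35.00
3 99.99
;
run;

/* Combine both sources into one row per customer.            */
/* The left join keeps customers that have no orders at all.  */
proc sql;
    create table work.customer_view as
    select c.cust_id,
           c.name,
           count(o.amount)            as order_count,
           coalesce(sum(o.amount), 0) as total_amount
    from work.crm_customers as c
         left join work.billing_orders as o
         on c.cust_id = o.cust_id
    group by c.cust_id, c.name
    order by c.cust_id;
quit;

proc print data=work.customer_view noobs;
run;
```

A left join is used here so that a customer without orders still appears in the integrated view; an inner join would silently drop that customer.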
Value of Integrated Data
Where we see Data Integration
- Data acquisition for data warehousing, business intelligence and analytics
- Integration of master data in support of master data management (MDM)
- Data migration or conversion (common when integrating systems or companies, or when retiring legacy systems)
- Data sharing, where information is exchanged beyond the corporate firewall with partners/suppliers, customers or regulatory agencies
- Delivery of data throughout an organization (enterprise applications or a service-oriented architecture (SOA))
Data Quality
Cleanse, Enrich, Augment
Data Quality Principles
"Quality information is not the result of doing work," as Larry English states in his book Information Quality Applied, "but rather [comes] as a result of designing quality (error-proofing) into processes that create, maintain, and present information." [6]
Data Quality functions
- Data profiling
- Data quality measurement
- Parsing and standardization
- General "cleansing" routines
- Matching
- Monitoring
- Data enrichment
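A few of the functions listed above (profiling, standardization, matching) can be roughly sketched with Base SAS alone. This is an illustration under stated assumptions, not the SAS Data Quality product: the table and column names are hypothetical, the sample data is invented, and the match key is deliberately crude.

```sas
/* Hypothetical contact records with inconsistent formatting */
data work.contacts;
    infile datalines dsd truncover;
    length name $ 30 state $ 20;
    input name $ state $;
    datalines;
john  smith,nc
John Smith,NC
JANE DOE,n.c.
Jane Doe,
;
run;

/* Profiling: frequency of each state value, including missing values */
proc freq data=work.contacts;
    tables state / missing;
run;

/* Standardization: collapse repeated blanks, normalize case,  */
/* remove periods, and build a simple blank-free match key.    */
data work.contacts_std;
    set work.contacts;
    length name_std $ 30 state_std $ 20 match_key $ 30;
    name_std  = propcase(compbl(strip(name)));
    state_std = upcase(compress(state, '.'));
    match_key = upcase(compress(name_std));
run;

/* Matching: keep one record per match key */
proc sort data=work.contacts_std out=work.contacts_dedup nodupkey;
    by match_key;
run;

proc print data=work.contacts_dedup noobs;
run;
```

Exact matching on a normalized key is the simplest possible approach; it will not catch misspellings or nicknames, which is where dedicated matching routines come in.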
Master Data Management
One version of the truth
MDM Defined
Gartner defines MDM as a discipline in which an organization comes together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of an enterprise's official, shared master data assets. [10]
Approaches to MDM
Goal: a rationalized, integrated view of what the single truth is for a given domain
Approaches:
- A master index or registry
- A warehoused or consolidated view of the data
- A centralized application and associated workflows that govern the master
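The first approach, a master index or registry, can be pictured as a cross-reference table that maps each source-system key to one master identifier. The sketch below is only an illustration in Base SAS; the table names, column names, system names and key values are all hypothetical, and it does not represent SAS MDM itself.

```sas
/* Hypothetical registry: each source-system key maps to one master ID */
data work.master_registry;
    length source_system $ 10 source_key $ 10;
    input master_id source_system $ source_key $;
    datalines;
1001 CRM C-17
1001 BILLING B-902
1002 CRM C-23
;
run;

/* Hypothetical records from the billing system */
data work.billing_accounts;
    length source_key $ 10;
    input source_key $ balance;
    datalines;
B-902 250.00
B-777 10.00
;
run;

/* Resolve each billing record to its master ID through the registry. */
/* Records with no registry entry keep a missing master_id.           */
proc sql;
    create table work.billing_with_master as
    select b.source_key,
           b.balance,
           r.master_id
    from work.billing_accounts as b
         left join work.master_registry as r
         on r.source_system = 'BILLING'
            and r.source_key = b.source_key;
quit;

proc print data=work.billing_with_master noobs;
run;
```

In a registry design the source systems keep their own data; only the mapping is centralized, which is why unmatched records surface as missing master IDs rather than being rejected.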
Data Governance
It's my data! No, it's mine!
Organizational Mandate
- People, processes and technology for how data is managed: its use, interpretation, value and quality
- Specific focus by the organization on data quality, data management, data policies, business process management and risk management
- An organizational mandate to ensure confidence (assurance) around data
Data Governance Life-Cycle
1. Identify the data quality issues to address
2. Prioritize the portfolio of issues to isolate/tackle the most important
3. Perform root cause analysis to determine the true source of the data issue
4. Design the corrective action
5. Formalize the correction through consideration and approval by the data governance organization
6. Implement the fix
7. Monitor the results
* Carol Newcomb Blog
2. SAS Technology Landscape
Products, Suites, Packages, Bundles... holy moly!
Data Quality: Data Profiling, Data Cleansing, Data Standardization, Data Augmentation
Integration: Extraction, Transformation, Loading, Metadata
Historical View / Alternate View
Capabilities included in SAS
Diagram labels: Data Access, Data Integration, Data Quality, Governance/MDM
- Data access refers to your ability to get to and retrieve information wherever it is stored.
- Data quality is the practice of making sure data is accurate and usable for its intended purpose.
- Data integration defines the steps for combining different types of data (ETL).
- Data governance is an ongoing set of rules and decisions for managing your organization's data to ensure that your data strategy is aligned with your business strategy.
- Data federation is a special kind of virtual data integration that allows you to look at combined data from multiple sources without the need to move and store the combined view in a new location.
- Master data management (MDM) defines, unifies and manages all of the data that is common and essential to all areas of an organization.
- Data streaming involves analyzing data as it moves by applying logic to the data, recognizing patterns in the data and filtering it for multiple uses as it flows into your organization.
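The federation idea, looking at combined data without storing a combined copy, can be loosely illustrated in Base SAS with a PROC SQL view. This is only an analogy under stated assumptions, not a SAS federation product: the table names, column names and values are hypothetical, and both "sources" are ordinary WORK tables.

```sas
/* Hypothetical regional source tables */
data work.east_customers;
    length name $ 20;
    input cust_id name $;
    datalines;
1 Alice
2 Bob
;
run;

data work.west_customers;
    length name $ 20;
    input cust_id name $;
    datalines;
3 Carol
;
run;

/* The view stores only the query, not the combined rows. */
/* The sources are read each time the view is used.       */
proc sql;
    create view work.all_customers as
    select 'EAST' as region, cust_id, name from work.east_customers
    union all
    select 'WEST' as region, cust_id, name from work.west_customers;
quit;

proc print data=work.all_customers noobs;
run;
```

Because the view is resolved at read time, a change in either source table shows up the next time the view is queried, with no reload step.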
Maturation
Data Quality, Data Access, Data Integration, Data Management, Governance or MDM
Data Quality
- SAS Data Quality Desktop
- SAS Data Quality Standard/Advanced
- SAS Data Quality Accelerator for Teradata
Data Integration/Management
- SAS Data Integration Server
- SAS Data Management Standard/Advanced
Data Governance
- SAS Data Governance
Master Data Management
- Master Data Management Standard/Advanced
- Data Quality Standard: Data Quality Standard
- Data Quality Advanced: Data Quality Standard, Data Governance
- Master Data Management Standard: Data Quality Standard + MDM Components + DQ Rules & Jobs + Entity Data Models
- Data Management Standard: Data Integration Server, Data Quality Standard
- Data Management Advanced: Data Integration Server, Data Quality Standard, Data Governance
- Master Data Management Advanced: Data Quality Standard + MDM Components + DQ Rules & Jobs + Entity Data Models, Data Governance
3. Reference Architectures
Is there anything typical?
Reference Architecture
Data Management on Steroids
Data Integration
4. Modernization Strategies
Holy Moly! Where do I start?
Modernization Paths
- Moving from PC SAS or SAS Enterprise Guide
- Data Integration Plus Data Quality
- Adding Data Governance
Moving up from SAS Foundation
libname ditest 'c:\disdata';

/* Read the fast-food nutrition records. Columns 1-18 and 19-34 hold  */
/* the restaurant and item names; the remaining fields are delimited  */
/* by blanks and read with list input.                                */
data ditest.burgers;
    input where $ 1-18 food $ 19-34 calories fat $ sodium $ id $;
    cards;
Burger King       cheeseburger    380 19g 780mg 1
Hardees           cheeseburger    390 20g 990mg 10
Jack In The Box   cheeseburger    320 15g 670mg 0
McDonalds         cheeseburger    320 14g 750mg 35
Wendys            cheeseburger    320 13g 770mg 20
;
run;

data ditest.lesscalories;
Metadata
Data Integration with Quality
Data Governance
5. Summary
Interactive Questions and Answers
Summary
- Remember, IT is a means to an end
- Data integration is about combining data in ways useful for the organization
- Data quality is about getting the data right
- MDM ensures that we have one version of the truth
- Data governance is creating alignment between people, processes and technologies to support the above
Questions and Answers
- What is Data Management Console?
- Is DI Studio going away?
- What is this data director for Hadoop thing?
- What is the data quality accelerator?
Contact
@healthcare_bi
linkedin.com/in/thotwave
greg@thotwave.com
919.931.4736