Data Quality and Record Linkage Techniques
Thomas N. Herzog, Fritz J. Scheuren, William E. Winkler
Thomas N. Herzog
Office of Evaluation
Federal Housing Administration
U.S. Department of Housing and Urban Development
th Street, SW
Washington, DC

Fritz J. Scheuren
National Opinion Research Center
University of Chicago
1402 Ruffner Road
Alexandria, VA

William E. Winkler
Statistical Research Division
U.S. Census Bureau
4700 Silver Hill Road
Washington, DC

Library of Congress Control Number: ISBN-13: e-ISBN-13:

Printed on acid-free paper

Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

There may be no basis for a claim to copyright with respect to a contribution prepared by an officer or employee of the United States Government as part of that person's official duties.

Printed in the United States of America

springer.com
Preface

Readers will find this book a mixture of practical advice, mathematical rigor, management insight, and philosophy. Our intended audience is the working analyst. Our approach is to work by real-life examples. Most illustrations come out of our successful practice. A few are contrived to make a point. Sometimes they come out of failed experience, ours and others'.

We have written this book to help the reader gain a deeper understanding, at an applied level, of the issues involved in improving data quality through editing, imputation, and record linkage. We hope that the bulk of the material is easily accessible to most readers, although some of it does require a background in statistics equivalent to a 1-year course in mathematical statistics. Readers who are less comfortable with statistical methods might want to omit Section 8.5, Chapter 9, and Section 18.6 on first reading. In addition, Chapter 7 may be primarily of interest to those whose professional focus is on sample surveys. We provide a long list of references at the end of the book so that those wishing to delve more deeply into the subjects discussed here can do so.

Basic editing techniques are discussed in Chapter 5, with more advanced editing and imputation techniques being the topic of Chapter 7. Chapter 14 illustrates some of the basic techniques. Chapter 8 is the essence of our material on record linkage. In Chapter 9, we describe computational techniques for implementing the models of Chapter 8. Chapters 9–13 contain techniques that may enhance the record linkage process. In Chapters 15–17, we describe a wide variety of applications of record linkage. Chapter 18 is our chapter on data confidentiality, while Chapter 19 is concerned with record linkage software. Chapter 20 is our summary chapter.

Three recent books on data quality (Redman [1996], English [1999], and Loshin [2001]) are particularly useful in effectively dealing with many management issues associated with the use of data and provide an instructive overview of the costs of some of the errors that occur in representative databases. Using as their starting point the work of quality pioneers such as Deming, Ishikawa, and Juran, whose original focus was on manufacturing processes, the recent books cover two important topics not discussed by those seminal authors: (1) errors that affect data quality even when the underlying processes are operating properly and (2) processes that are controlled by others (e.g., other organizational units within one's company or other companies).

Dasu and Johnson [2003] provide an overview of some statistical summaries and other conditions that must exist for a database to be useable for specific statistical purposes. They also summarize some methods from the database literature that can be used to preserve the integrity and quality of a database. Two other interesting books on data quality (Huang, Wang and Lee [1999] and Wang, Ziad, and Lee [2001]) supplement our discussion. Readers will find further useful references in The International Monetary Fund's (IMF) Data Quality Reference Site on the Internet.

We realize that organizations attempting to improve the quality of the data within their key databases do best when the top management of the organization is leading the way and is totally committed to such efforts. This is discussed in many books on management. See, for example, Deming [1986], Juran and Godfrey [1999], or Redman [1996]. Nevertheless, even in organizations not committed to making major advances, analysts can still use the tools described here to make substantial quality improvement.

A working title of this book, Playing with Matches, was meant to warn readers of the danger of data handling techniques such as editing, imputation, and record linkage unless they are tightly controlled, measurable, and as transparent as possible. Over-editing typically occurs unless there is a way to measure the costs and benefits of additional editing; imputation always adds uncertainty; and errors resulting from the record linkage process, however small, need to be taken into account during future uses of the data.

We would like to thank the following people for their support and encouragement in writing this text: Martha Aliaga, Patrick Ball, Max Brandstetter, Linda Del Bene, William Dollarhide, Mary Goulet, Barry I. Graubard, Nancy J. Kirkendall, Susan Lehmann, Sam Phillips, Stephanie A. Smith, Steven Sullivan, and Gerald I. Webber. We would especially like to thank the following people for their support and encouragement as well as for writing various parts of the text: Patrick Baier, Charles D. Day, William J. Eilerman, Bertram M. Kestenbaum, Michael D. Larsen, Kevin J. Pledge, Scott Schumacher, and Felicity Skidmore.
Contents

Preface
About the Authors

1. Introduction
   Audience and Objective
   Scope
   Structure

PART 1 DATA QUALITY: WHAT IT IS, WHY IT IS IMPORTANT, AND HOW TO ACHIEVE IT

2. What Is Data Quality and Why Should We Care?
   When Are Data of High Quality?
   Why Care About Data Quality?
   How Do You Obtain High-Quality Data?
   Practical Tips
   Where Are We Now?

3. Examples of Entities Using Data to their Advantage/Disadvantage
   Data Quality as a Competitive Advantage
   Data Quality Problems and their Consequences
   How Many People Really Live to 100 and Beyond? Views from the United States, Canada, and the United Kingdom
   Disabled Airplane Pilots: A Successful Application of Record Linkage
   Completeness and Accuracy of a Billing Database: Why It Is Important to the Bottom Line
   Where Are We Now?

4. Properties of Data Quality and Metrics for Measuring It
   Desirable Properties of Databases/Lists
   Examples of Merging Two or More Lists and the Issues that May Arise
   Metrics Used when Merging Lists
   Where Are We Now?

5. Basic Data Quality Tools
   Data Elements
   Requirements Document
   A Dictionary of Tests
   Deterministic Tests
   Probabilistic Tests
   Exploratory Data Analysis Techniques
   Minimizing Processing Errors
   Practical Tips
   Where Are We Now?

PART 2 SPECIALIZED TOOLS FOR DATABASE IMPROVEMENT

6. Mathematical Preliminaries for Specialized Data Quality Techniques
   Conditional Independence
   Statistical Paradigms
   Capture-Recapture Procedures and Applications

7. Automatic Editing and Imputation of Sample Survey Data
   Introduction
   Early Editing Efforts
   Fellegi-Holt Model for Editing
   Practical Tips
   Imputation
   Constructing a Unified Edit/Imputation Model
   Implicit Edits: A Key Construct of Editing Software
   Editing Software
   Is Automatic Editing Taking Up Too Much Time and Money?
   Selective Editing
   Tips on Automatic Editing and Imputation
   Where Are We Now?

8. Record Linkage Methodology
   Introduction
   Why Did Analysts Begin Linking Records?
   Deterministic Record Linkage
   Probabilistic Record Linkage: A Frequentist Perspective
   Probabilistic Record Linkage: A Bayesian Perspective
   Where Are We Now?

9. Estimating the Parameters of the Fellegi-Sunter Record Linkage Model
   Basic Estimation of Parameters Under Simple Agreement/Disagreement Patterns
   Parameter Estimates Obtained via Frequency-Based Matching
   Parameter Estimates Obtained Using Data from Current Files
   Parameter Estimates Obtained via the EM Algorithm
   Advantages and Disadvantages of Using the EM Algorithm to Estimate m- and u-probabilities
   General Parameter Estimation Using the EM Algorithm
   Where Are We Now?

10. Standardization and Parsing
   Obtaining and Understanding Computer Files
   Standardization of Terms
   Parsing of Fields
   Where Are We Now?

11. Phonetic Coding Systems for Names
   Soundex System of Names
   NYSIIS Phonetic Decoder
   Where Are We Now?

12. Blocking
   Independence of Blocking Strategies
   Blocking Variables
   Using Blocking Strategies to Identify Duplicate List Entries
   Using Blocking Strategies to Match Records Between Two Sample Surveys
   Estimating the Number of Matches Missed
   Where Are We Now?

13. String Comparator Metrics for Typographical Error
   Jaro String Comparator Metric for Typographical Error
   Adjusting the Matching Weight for the Jaro String Comparator
   Winkler String Comparator Metric for Typographical Error
   Adjusting the Weights for the Winkler Comparator Metric
   Where Are We Now?

PART 3 RECORD LINKAGE CASE STUDIES

14. Duplicate FHA Single-Family Mortgage Records: A Case Study of Data Problems, Consequences, and Corrective Steps
   Introduction
   FHA Case Numbers on Single-Family Mortgages
   Duplicate Mortgage Records
   Mortgage Records with an Incorrect Termination Status
   Estimating the Number of Duplicate Mortgage Records

15. Record Linkage Case Studies in the Medical, Biomedical, and Highway Safety Areas
   Biomedical and Genetic Research Studies
   Who Goes to a Chiropractor?
   National Master Patient Index
   Provider Access to Immunization Register Securely (PAiRS) System
   Studies Required by the Intermodal Surface Transportation Efficiency Act of
   Crash Outcome Data Evaluation System

16. Constructing List Frames and Administrative Lists
   National Address Register of Residences in Canada
   USDA List Frame of Farms in the United States
   List Frame Development for the US Census of Agriculture
   Post-enumeration Studies of US Decennial Census

17. Social Security and Related Topics
   Hidden Multiple Issuance of Social Security Numbers
   How Social Security Stops Benefit Payments after Death
   CPS-IRS-SSA Exact Match File
   Record Linkage and Terrorism

PART 4 OTHER TOPICS

18. Confidentiality: Maximizing Access to Micro-data while Protecting Privacy
   Importance of High Quality of Data in the Original File
   Documenting Public-use Files
   Checking Re-identifiability
   Elementary Masking Methods and Statistical Agencies
   Protecting Confidentiality of Medical Data
   More-advanced Masking Methods: Synthetic Datasets
   Where Are We Now?

19. Review of Record Linkage Software
   Government
   Commercial
   Checklist for Evaluating Record Linkage Software

20. Summary Chapter

Bibliography
Index
About the Authors

Thomas N. Herzog, Ph.D., ASA, is the Chief Actuary at the US Department of Housing and Urban Development. He holds a Ph.D. in mathematics from the University of Maryland and is also an Associate of the Society of Actuaries. He is the author or co-author of books on Credibility Theory, Monte Carlo Methods, and Risk Models. He has devoted a major effort to improving the quality of the databases of the Federal Housing Administration.

Fritz J. Scheuren, Ph.D., is a general manager with the National Opinion Research Center. He has a Ph.D. in statistics from the George Washington University. He is much published, with over 300 papers and monographs. He is the 100th President of the American Statistical Association and a Fellow of both the American Statistical Association and the American Association for the Advancement of Science. He has a wide range of experience in all aspects of survey sampling, including data editing and handling missing data. Much of his professional life has been spent employing large operational databases, whose incoming quality was only marginally under the control of the data analysts under his direction. His extensive work in recent years on human rights data collection and analysis, often under very adverse circumstances, has given him a clear sense of how to balance speed and analytic power within a framework of what is feasible.

William E. Winkler, Ph.D., is Principal Researcher at the US Census Bureau. He holds a Ph.D. in probability theory from Ohio State University and is a fellow of the American Statistical Association. He has more than 110 papers in areas such as automated record linkage and data quality. He is the author or co-author of eight generalized software systems, some of which are used for production in the largest survey and administrative-list situations.
