Predictive Coding, TAR, CAR NOT Just for Litigation




Predictive Coding, TAR, CAR NOT Just for Litigation
February 26, 2015
Olivia Gerroll, VP Professional Services, D4

Agenda
- Drivers
- The Evolution of Discovery Technology
- Definitions & Benefits
- How Predictive Coding Works
- The Ripple Effect
- Data and Information Governance
- Implications
- Use Cases & Considerations
- Selection & Technologies
- How Do Lawyers and Courts View PC?
- Resources
- Q&A

Drivers
Big Data - Growing FAST
[Figure: How much data is a petabyte?]

Drivers - The Technology is Not Unique
As you select, the application zeroes in on what you like.
[Figure: song-selection example (Song 1, Song 2, Song 3)]

Evolution of Discovery Technology
[Timeline figure, 1990 to 2015, listing the following in order:]
- Stand Alone Review Apps
- Document Imaging
- OCR
- Client/Server Review Tools
- Computer Forensics
- Automated Litigation Support
- Tape Restoration
- Web Based Review
- ASP or SaaS
- Auto Coding
- Email Thread & Near Dupe Detection
- Conceptual Search
- Visualization Systems
- Clustering & Categorization
- Legal Hold Management
- All in One Litigation Support Platforms
- Social Media and Cloud Collection
- Managed Services
- Predictive Coding & Assisted Review
- Natural Language Applications
- Artificial Intelligence?

TAR and CAR
- Technology-Assisted Review (TAR), or Computer-Assisted Review (CAR), is the use of advanced information retrieval technology to make the identification and review process more efficient.
- TAR/CAR uses components of existing technologies to organize and sort documents by priority or relevance.
- What differentiates TAR/CAR from other technologies is its use of concept-based search engines and quantitative analysis.

Predictive Coding Defined
- Predictive Coding is one type of TAR/CAR.
- Combines the efficiencies of concept search and statistics with the knowledge of human beings.
- Uses a machine learning approach, typically active learning and sometimes a support vector machine classifier, to distinguish relevant from non-relevant documents based on decisions made by a subject matter expert.
- Uses established statistical principles to measure status and accuracy.
- The technology can be used to apply Information Governance (IG) within a firm to both structured and unstructured data.
- A key difference from search analytics is the training methodology: the technology is trained to classify records automatically, and its accuracy improves as it learns.

Benefits
- In the normal course of business, documents are not organized by relevance.
- With a predictive coding approach, a Subject Matter Expert trains the software by coding individual documents as responsive or not responsive as the system samples the population.
- The software calculates a relevance score for each document based on the expert's coding decisions.

How It Works
- A subject matter expert is assigned to train the engine.
- The software initially selects a random sample of documents.
- The expert identifies relevant documents in the sample.
- The software analyzes the expert's input and creates a profile for relevant and irrelevant documents.
- The software generates new samples, each time learning more from the expert's input.
- The process repeats until the software determines it has sufficient information to score all of the documents.
- The scores are then used to make informed decisions about data management.
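The training loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any vendor's engine works: the tiny naive Bayes scorer, the "least certain documents first" sampling rule, the fixed number of rounds used as a stopping rule, and the names `TinyNaiveBayes`, `predictive_coding`, `DOCS` and `expert_is_relevant` are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def tokens(text):
    return text.lower().split()


class TinyNaiveBayes:
    """Minimal two-class naive Bayes over word counts (illustrative only)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, relevant):
        self.doc_counts[relevant] += 1
        self.word_counts[relevant].update(tokens(text))

    def score(self, text):
        """Return a relevance score between 0 and 1."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.word_counts[label].values()) + len(vocab) + 1
            for word in tokens(text):
                # Add-one smoothing so unseen words never produce log(0).
                log_odds += sign * math.log((self.word_counts[label][word] + 1) / total)
        log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in a safe range
        return 1 / (1 + math.exp(-log_odds))


def predictive_coding(documents, expert_is_relevant, batch_size=4, max_rounds=2, seed=0):
    """Score every document after a few rounds of expert training."""
    rng = random.Random(seed)
    model = TinyNaiveBayes()
    unreviewed = list(range(len(documents)))
    rng.shuffle(unreviewed)  # round 0 therefore reviews a random sample
    for round_number in range(max_rounds):
        if not unreviewed:
            break
        if round_number > 0:
            # Assumed sampling rule: review the documents the model is least sure about.
            unreviewed.sort(key=lambda i: abs(model.score(documents[i]) - 0.5))
        batch, unreviewed = unreviewed[:batch_size], unreviewed[batch_size:]
        for i in batch:
            # The expert codes each sampled document; the model learns from it.
            model.train(documents[i], expert_is_relevant(documents[i]))
    return [model.score(text) for text in documents]


# Hypothetical sample data; the "expert" is simulated by a keyword rule.
DOCS = [
    "contract pricing merger terms", "merger contract draft",
    "pricing terms for contract", "contract amendment merger",
    "draft merger pricing", "terms of contract amendment",
    "lunch menu friday", "office party friday", "parking garage notice",
    "lunch order office", "holiday party menu", "garage parking closed",
]


def expert_is_relevant(text):
    return "contract" in text or "merger" in text


if __name__ == "__main__":
    ranked = sorted(zip(predictive_coding(DOCS, expert_is_relevant), DOCS), reverse=True)
    for score, text in ranked:
        print(f"{score:.2f}  {text}")
```

In this sketch the expert reviews only eight of the twelve documents, yet every document receives a score that can then drive review priority or a retention decision.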

Predictive Coding Workflow - Discovery

Data Environment

The Ripple Effect
- "Early use of predictive coding can be used to confidently impact settlement before heads-down legal review." (Kroll, www.ediscovery.com)
- Predictive coding is a natural way to assess and detect risk patterns, and stop them from developing further.
- Predictive coding can be used to create and enforce record retention policies.

Data and Information Governance
Key problems for organizations:
- Find the information they need, when needed, cheaply and efficiently
  - Have to have the information
  - Must keep it till needed
- Find valuable information
- Destroy worthless or unessential information
- What is valuable? What is worthless?

Implications?
Chucking Daisies: Ten Rules For Taking Control of Your Organization's Digital Debris, Kahn & Datskovsky, ARMA International (2013)
- Ch 1: Stop Keeping Everything Forever
- Ch 2: Clean Up the Past to Gain Business Efficiency
But how? Since people are storing ever more, predictive coding can help separate the debris from what is required to be kept.
- Backup tape reduction
- Early Case Assessment
- Big Data mining
- Compliance investigations

Considerations
- All document-related information governance and RIM initiatives depend on consistent, comprehensive document classification.
- Without consistent, comprehensive classification, an organization can't determine what to keep, how long to keep it, who should have access to it, and where to store it.
- Replace manual classification decision-making processes with technology.
- Use predictive technology to create classification schemas for identifying and categorizing data currently in unstructured systems.
- Predictive technology can identify areas of conflict in existing classifications and ensure consistency and uniformity going forward.

Considerations
- Use your skilled experts to create the appropriate data sets (or seed sets).
- The data sets should represent content from all information repositories.
- The product must be able to meet your end-user, IT and legal compliance requirements.
- Establish oversight and a comprehensive remediation plan, agreed upon by all stakeholders.
- Deployment should include a process to audit the application's decisions.
- Ideally, leverage internal ediscovery resources to help guide the deployment. Litigation technology experts have been working with this technology for years and can provide valuable insight into its usability and functionality.
- The hybrid approach: you do not have to choose between upstream and downstream data movement.
- Predictive coding is not a panacea, so any project needs to start with the establishment of an IG framework. See Slide 18, Item 3 for resource content location.

Information Management
IG Data Governance:
- Email
- Shared drives
- Local hard drives
- SharePoint
- DM Systems
- Extranets, intranets
RIM Data Control & Management:
- Email official record
- Retention and disposition
- Onboarding: client file intake
- Off-boarding: client file transfer
- Identify vital and/or historical records
- Legal Hold/preservation
- Security, conflict and risk remediation

How can Predictive Coding be Applied?
- Seed Set (Identify Existing Information): Data that has been classified in accordance with the organization's RIM policies.
- Context (Leverage Context): Use existing resources, such as the DM or financial system, to provide context to the process.
- Human Interaction (Manual Verification): Records staff interact with the technology to validate findings and ensure the validity of predictive coding assessments.
- Validate and Automate (Validate & Fully Automate): After the manual verification has been validated and any corrections made, the system can be let loose.
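One way to picture the hand-off from manual verification to full automation is a confidence gate: predictions the system is sure about are applied automatically, and the rest go to records staff. The sketch below is only an illustration under stated assumptions. The `Prediction` record, the `route` and `agreement_rate` helpers, the category names and the 0.9 threshold are hypothetical and do not come from any product named in this deck.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    doc_id: str
    category: str      # predicted records category (hypothetical labels)
    confidence: float  # model confidence between 0 and 1


def route(predictions, auto_threshold=0.9):
    """Split predictions into auto-applied and manual-review queues."""
    auto_applied, needs_review = [], []
    for p in predictions:
        if p.confidence >= auto_threshold:
            auto_applied.append(p)
        else:
            needs_review.append(p)
    return auto_applied, needs_review


def agreement_rate(predicted, verified):
    """Share of manually verified documents where staff agreed with the model.

    Both arguments map doc_id -> category; only documents present in both
    are compared. Returns 0.0 when nothing has been verified yet.
    """
    checked = [doc_id for doc_id in verified if doc_id in predicted]
    if not checked:
        return 0.0
    agreed = sum(1 for doc_id in checked if predicted[doc_id] == verified[doc_id])
    return agreed / len(checked)


if __name__ == "__main__":
    batch = [
        Prediction("doc-001", "Client file", 0.97),
        Prediction("doc-002", "HR record", 0.55),
        Prediction("doc-003", "Financial record", 0.91),
    ]
    auto, review = route(batch)
    print("auto-applied:", [p.doc_id for p in auto])
    print("manual review:", [p.doc_id for p in review])
```

Tracking the agreement rate over successive verification rounds gives stakeholders a simple, auditable signal for deciding when the system is ready to run unattended.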

Application - Information Governance and Records Retention: How to Start
Three Key Steps
- Executive sponsorship that supports Information Governance
- Form a steering committee of key stakeholders across multiple departments: IT, Legal, Records Management, Compliance, Security & Privacy, etc.
- Define global policies: the committee must focus on the business processes, laws & regulations, and departmental requirements needed to define the global policies that govern information within the organization.

Selection Factors
- Ensure that your environment is ready to implement the technology.
- Factor in the learning curve necessary to fully understand and effectively use the technology.
- Skilled resources: the tools are best used by people skilled in big data information analysis, who understand the analysis and patterns and how to interpret the results.
- Ensure that the technology and environment are correctly secured, especially when dealing with the cloud and internet access.
- Understand the technology: dig under the hood. How good are the algorithms inside the software at finding the information we tell it to find?

Some Technologies
Information Management: Equivio, Recommind, Autonomy, Symantec, IBM, EMC, CommVault
Discovery: Nuix, Relativity, IPRO, Autonomy, Recommind, FTI, Catalyst

How Does the Judiciary View PC?
- Da Silva Moore v. Publicis Groupe: Court okayed the parties' agreement to use PC (3.3M emails).
- Kleen Products v. Packaging Corp. of America: Plaintiffs abandoned arguments in favor of PC and went Boolean.
- Global Aerospace Inc. v. Landow Aviation, L.P.: Court approved defendant's use of PC over objections (2M emails).
- Actos (Pioglitazone) Products Liability Litigation: Court affirmatively approved using PC for review and production.
- EORHB, Inc., et al. v. HOA Holdings, LLC: Court ordered parties to use PC and share an ediscovery vendor.

Defensible Predictive Coding
Using Da Silva as a Map:
- Senior attorneys must be involved
- Cooperate in devising approach
- Have a written protocol
- Share the Seed Set (maybe!)
- Refine repeatedly for accuracy
- Be transparent
Bottom Line for Defensibility: Sampling, transparency, documentation
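The "sampling" part of that bottom line usually means drawing a random sample and reporting an estimate with a stated confidence level. As a hedged sketch, the Python below uses two standard statistical formulas: the sample size for estimating a proportion (with a finite population correction) and the Wilson score interval. The function names, the 95% default (z = 1.96), and the example of checking a set of documents the software marked not relevant are assumptions for illustration, not a protocol endorsed by any court.

```python
import math


def sample_size(population, margin=0.02, z=1.96, p=0.5):
    """Documents to sample so the estimate is within +/- margin.

    Uses the standard proportion formula with a finite population
    correction; p=0.5 is the most conservative assumption.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


def wilson_interval(hits, n, z=1.96):
    """Wilson score confidence interval for a proportion hits/n."""
    if n <= 0:
        raise ValueError("sample must contain at least one document")
    p_hat = hits / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical example: 1,000,000 documents were scored as not relevant.
    n = sample_size(1_000_000, margin=0.02)
    print("documents to sample:", n)
    # Suppose reviewers find 10 relevant documents in a sample of 1,000.
    low, high = wilson_interval(hits=10, n=1000)
    print(f"share of relevant documents missed: between {low:.3%} and {high:.3%}")
```

Recording the sample size, the confidence level and the resulting interval in the written protocol also supports the transparency and documentation points above.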

Resources
1. D4 Knowledge Center
2. The Grossman-Cormack Glossary of Technology-Assisted Review, http://www.fclr.org/fclr/articles/html/2010/grossman.pdf
3. Chucking Daisies: Ten Rules For Taking Control of Your Organization's Digital Debris, ARMA Publication
4. Predictive Coding for Information Governance, http://www.ironmountain.com/knowledge-center/reference-library
5. The Electronic Discovery Reference Model, www.edrm.net
6. The Sedona Conference, Thesedonaconference.com

Questions?
On behalf of everyone at D4, thank you ARMA Iowa for this opportunity to present.
Olivia Gerroll, VP, Professional Services Group
OGerroll@d4discovery
o 402.682.3771
m 402.547.0742