Taming Big Data eDiscovery: Ten Tips to Avoid Being Byten by Big Data in Your Case




Taming Big Data eDiscovery
Ten Tips to Avoid Being Byten by Big Data in Your Case

Presented at the University of Texas 27th Annual Technology Law Conference, May 22, 2014, Austin, Texas
Gene Albert, Principal, Lexbe LC

Gene Albert bio
- Principal of Lexbe LC, a provider of cloud-based litigation review and document management software and eDiscovery services.
- Prior business experience in software, medical services and internet-based businesses.
- Prior legal experience as in-house counsel and in private practice.
- Frequent speaker and author on eDiscovery and legal technology issues.

Education
- MBA, University of Texas (2005)
- JD, Southern Methodist University (1983)
- BA, University of Texas (1979)

Contact
Gene Albert, 512-686-3460, gene@lexbe.com

Taming Big Data eDiscovery: Agenda
- eDiscovery: the Original Big Data Problem
- How Should We Look at eDiscovery Costs?
- How Much Is eDiscovery Data Increasing?
- Why Are eDiscovery Costs Continuing to Rise?
- 10 Ways to Tame Big Data eDiscovery

eDiscovery: the Original Big Data
- What is Big Data? Flavor of the month/news cycle?
- Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization. (Gartner)
- Litigation document management is an important and early Big Data application.
- eDiscovery is characterized by sometimes extreme time pressures that are often not present in Big Data applications in other industries.

How to Control eDiscovery Costs
- Easy answer: drop your case. No eDiscovery costs then...
- A nonsensical answer, because eDiscovery is part of the strategic pursuit of case goals and should be approached, evaluated and managed in that context.

What Is eDiscovery Like? Are eDiscovery Costs Like Expert Fees?
Yes:
- Both are a large, expected part of commercial litigation.
- Quality matters; neither is a commodity.
- You need to understand the process undertaken for both.
No:
- eDiscovery is involuntary (you don't have to hire an expert, but you do need to respond to discovery).
- Experts are always important to litigation if retained; eDiscovery usually becomes important only if mistakes occur.
- eDiscovery expenses increase with the quantity of data, but experts bill hourly, in relation to case importance.

Why Do eDiscovery Costs Rise? The eDiscovery Market Is Big & Growing
eDiscovery software & services:
- $5.5 billion today
- Growing 15.5% annually
- Projected $9.8 billion (2017)
- Services (72%), software (28%)
Source: Complex Discovery (ComplexDiscovery.com), based on a combination of public market sizing estimates.
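The market figures above can be cross-checked with simple compounding. The sketch below is only an illustration: the function name is hypothetical, and the number of compounding years is an assumption, because the slide does not state the base year of the $5.5 billion figure.

```python
def project_market(base_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting market size forward at a fixed annual growth rate."""
    return base_billion * (1.0 + annual_growth) ** years


# Slide figures: $5.5 billion base, 15.5% annual growth.
# Assumption: the base year is unstated, so one to four years are tried.
for years in range(1, 5):
    print(f"{years} year(s): ${project_market(5.5, 0.155, years):.2f} billion")
```

Four years of compounding gives roughly $9.8 billion, which would match the 2017 projection if the $5.5 billion estimate dates from 2013. That base year is an inference from the arithmetic, not something stated on the slide.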

Is eDiscovery Data Increasing? Data Types and Volume Keep Growing
[Chart: Digital Information Created, Captured, Replicated Worldwide, in zettabytes*, 2005 to 2015. Data sources labeled on the chart: VoIP, email, iPhones, peer-to-peer, online storage, digital cameras, Facebook, LinkedIn, Dropbox, backup devices, elastic storage, SaaS, Google Streets, personal blogs, Skype, world satellite images, personal scanners, customer service recordings, public webcams, Google Goggles, netbooks, cloud instance servers, PaaS.]
- 2.8 zettabytes of information were created and replicated during 2012, a 56% increase from 2011 (IDC).
Source: IDC Digital Universe Study (2012)
* 1 zettabyte = 1 trillion gigabytes

Is eDiscovery Data Increasing? But It's Not Being Retained Today
[Chart: Types of Information that Organizations Retain and Do Not Retain]
- These data types will be used in future litigation because they contain relevant evidence.
Source: Osterman Research 2014

Why Are eDiscovery Costs Rising? It's Not Costs: They're Falling
[Chart: Cost per GB to Process ESI in Volume, 2005 to 2015]
- $1,800/GB (2006)
- $500/GB (2011)
- $150/GB (2014)
- ESI costs have fallen 90% in the last 10 years.
Source: Forrester Research

Why Are eDiscovery Costs Rising? The Volume of ESI Is Rising Faster
[Chart: GBs of ESI in a Typical Commercial Case, low to high, 1995 to 2015]
Enron criminal trial (2005):
- Source ESI: 100M pages (~4 TB)
- Brought to trial: 1M pages (~40 GB)
- Extraordinary at the time. Not now.
Microsoft (2011):
- Collects 45 custodians per matter on average
- Almost 1 TB per matter on average
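As a rough check on the per-GB figures, the sketch below computes the price decline and what one terabyte would cost to process at each quoted price. The helper names are hypothetical, and the decimal terabyte (1 TB = 1,000 GB) is an assumption.

```python
# Volume processing prices per GB in USD, as quoted on the slide (Forrester Research).
COST_PER_GB = {2006: 1800.0, 2011: 500.0, 2014: 150.0}

GB_PER_TB = 1000  # assumption: decimal terabytes


def percent_decline(old: float, new: float) -> float:
    """Percentage drop from an old price to a new price."""
    return (old - new) / old * 100.0


def processing_cost(volume_gb: float, year: int) -> float:
    """Cost in USD to process a volume at the per-GB price quoted for that year."""
    return volume_gb * COST_PER_GB[year]


print(f"2006 to 2014 decline: {percent_decline(COST_PER_GB[2006], COST_PER_GB[2014]):.1f}%")
for year in sorted(COST_PER_GB):
    print(f"{year}: ${processing_cost(GB_PER_TB, year):,.0f} per TB")
```

The decline from $1,800 to $150 per GB is a little over 90%, consistent with the slide. Unit price is only half of the picture: the total bill is price times volume, so it can still rise when volume grows faster than price falls.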

Why Are eDiscovery Costs Rising? Increase in Size per Custodian
[Chart: Microsoft Custodian Size Increases, GBs per custodian, 2005 to 2015]
- 2008: 7 GB per custodian (0.5 million pages)
- 2011: 17.5 GB per custodian (0.9 million pages)
Source: Legal Technology Leadership Summit (2011)
Drivers of costs:
- More ESI data sources
- More ESI stored
- Increases in custodian ESI size outpace drops in per-GB costs

Why Do eDiscovery Costs Rise? Review Costs Dominate Total Costs
By case stage:
  Collection     8%
  Processing    19%
  Review        73%
  Total        100%
By source:
  Internal           4%
  eDisc providers   26%
  Outside counsel   70%
  Total            100%
- The best opportunities for further cost savings will be technologies and process improvements that increase attorney review efficiency.
Source: N. Pace and L. Zakaras, Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery (RAND Institute for Civil Justice 2012)
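The point about review efficiency follows directly from the stage shares. The sketch below weights a hypothetical cost cut in one stage by that stage's share of total spend. The 8% and 19% shares are labeled collection and processing here, following the RAND study. The 10% cut is an illustrative assumption, not a figure from the presentation.

```python
# Share of total production cost by case stage (RAND 2012, as shown on the slide).
STAGE_SHARE = {"collection": 0.08, "processing": 0.19, "review": 0.73}


def total_savings(stage: str, stage_cost_cut: float) -> float:
    """Fraction of total cost saved when one stage becomes cheaper by stage_cost_cut."""
    return STAGE_SHARE[stage] * stage_cost_cut


# Assumption for illustration: the same 10% improvement applied to each stage in turn.
for stage in STAGE_SHARE:
    print(f"10% cheaper {stage}: {total_savings(stage, 0.10):.1%} off the total")
```

Under these shares, a given percentage improvement in review moves the total almost four times as much as the same improvement in processing.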

1. Understand the eDiscovery Process
Is eDiscovery a black box?
- Seemingly, in go files, data, money and time.
- And out come documents and data.
- What happens inside? Who knows.
How do you manage a black box?
- Yell, plead, whine, beg, pray.
eDiscovery is not a black box.

1. Understand the eDiscovery Process (continued)
- There is no avoiding getting one's hands dirty and understanding what is going on inside the eDiscovery "black box" in order to manage it effectively.
- At each stage ask: What is being done? What are the drivers of quality and cost? Would process and technology improvement yield a better result?
- eDiscovery processes should be matched to case needs, overall case strategy and risk management.

2. Use a Systems Approach
- Start with the end in mind: where do you want to end up? How will eDiscovery choices help or hinder?
- Determine the internal and external expertise available to help. Are internal IT personnel supportive, and do they have the expertise?
- Consider using a project manager and a project management methodology to coordinate larger, time-sensitive or complicated matters.
- Use the Rule 26 conference to negotiate and document the discovery plan.
- Consider negotiating a formal ESI protocol to address issues of collection, production, data availability, etc.

3. Manage the eDiscovery Funnel
[Funnel diagram, running top to bottom over time: High Volume / Low Relevance at the top; stages include ESI ID, EDA & Culling/Filtering, and Use (Depos, Trial); Low Volume / High Relevance at the bottom. Callout: early efforts at the top of the funnel result in improved quality and reduced costs at the bottom.]

4. Know Your Custodians and ESI Data
- Mapping/tracking custodians & ESI sources
- Methodology based on litigation goals & requirements
- Proportionality considerations
- Rule 26 (meet & confer); cooperation
- Issues with asynchronous discovery

4. Know Your Custodians and ESI Data (continued)
- IT network maps alone are not sufficient to identify and manage ESI.

4. Know Your Custodians and ESI Data (continued)
- Content mapping is better for quantitative & qualitative data analysis.
[Table columns: Custodians; Amount of ESI; Accessible; Backup Policy]

5. Avoid Under- & Over-Collection
Dangers of under-collection:
- Missed, lost or destroyed documents and data
- Sanctions, adverse inferences
- Greater costs to collect later or to reconstruct
Address by:
- Early ID of custodians
- Mapping of ESI data sources
- Early case analysis to find data holes
- Documenting the process; using Rule 26 conferences
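The content map columns and the "find data holes" step can be sketched together as a small data structure. Everything named here is hypothetical: the `EsiSource` fields mirror the slide's table columns plus an assumed date range, and `find_data_holes` is one simple way to flag custodians whose accessible ESI does not reach the edges of the relevant period.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EsiSource:
    custodian: str
    description: str    # e.g. "mailbox", "laptop", "backup tape"
    size_gb: float      # amount of ESI
    accessible: bool    # reasonably accessible today?
    backup_policy: str  # free-text note on backups and retention
    earliest: date      # oldest item believed to be in the source (assumed field)
    latest: date        # newest item believed to be in the source (assumed field)


def find_data_holes(sources, custodians, period_start, period_end):
    """Return (custodian, reason) pairs where accessible ESI misses the relevant period.

    Only the outer edges of the period are checked, not gaps in the middle.
    """
    holes = []
    for name in custodians:
        mine = [s for s in sources if s.custodian == name and s.accessible]
        if not mine:
            holes.append((name, "no accessible ESI source mapped"))
            continue
        if min(s.earliest for s in mine) > period_start:
            holes.append((name, "no accessible ESI at start of relevant period"))
        if max(s.latest for s in mine) < period_end:
            holes.append((name, "no accessible ESI at end of relevant period"))
    return holes


# Made-up example entries, for illustration only.
CONTENT_MAP = [
    EsiSource("Custodian A", "mailbox", 12.0, True, "nightly, 90-day retention",
              date(2012, 1, 1), date(2014, 5, 1)),
    EsiSource("Custodian B", "laptop", 40.0, True, "none",
              date(2013, 6, 1), date(2014, 5, 1)),
    EsiSource("Custodian C", "backup tape", 200.0, False, "annual tape, offsite",
              date(2010, 1, 1), date(2012, 12, 31)),
]

holes = find_data_holes(CONTENT_MAP, ["Custodian A", "Custodian B", "Custodian C"],
                        date(2013, 1, 1), date(2014, 1, 1))
for custodian, reason in holes:
    print(custodian, "-", reason)
```

A spreadsheet can carry the same columns. The value is in recording per-custodian content and coverage, which a network diagram alone does not show.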

5. Avoid Under- & Over-Collection (continued)
Dangers of over-collection:
- Unnecessary expense in collection, review and production
- Slowing of case progression; misallocation of time and other resources
Address by:
- Analyzing and testing ESI data sources for likely responsive content
- Prioritizing data by importance and proportionality
- Separating the hold, collection, and processing steps and moving through them progressively by data source

5. Avoid Over- & Under-Collection: Who Is Collecting?
- Custodian self-collection: increasingly difficult to justify. There are issues with custodian collection competency, even if you assume good custodian intentions.
- Internal IT: make sure staff have training in eDiscovery and sufficient time to meet deadlines.
- Outside vendors: usually the most expertise and the most expense. This is a rent (external vendor) vs. buy (internal staff) issue.

5. Avoid Over- & Under-Collection: Plan & Document How Collection Is to Be Done
- Files vs. disk images; active vs. forensic collection
- Local vs. remote vs. network collections
- Search and index quality if data has not been processed
- Preserving metadata
- Devices (laptops, phones, flash drives, etc.); cloud accounts; social media (e.g., Google, Dropbox, Facebook, LinkedIn)

6. Reduce Reviewable Data with Culling
Purpose:
- Defensibly remove files from the process that are unlikely to lead to responsive documents.
Culling processes:
- Expansion, repair, DeNIST, OCR
- Filter by file type & date
- Deduplication (within or between custodians)
- Indexing and keyword filtering
- Linear vs. dynamic culling
Issues:
- Keyword selection & testing, concept searching, process documentation, repeatability, culled file retention
Reduction:
- ESI volume may be reduced by 95% at this stage relative to raw data size.
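Several of the culling steps listed above are mechanical enough to sketch. The example below is a minimal illustration, not a description of any particular product: `EsiFile`, `cull`, and `known_hashes` are hypothetical names, and `known_hashes` stands in for a DeNIST-style list of known system-file hashes. It filters by file type and date, drops known files, and deduplicates across custodians by content hash, recording a reason for every culled file so the process stays documented and repeatable.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EsiFile:
    custodian: str
    name: str
    modified: date
    content: bytes


def cull(files, allowed_ext, start, end, known_hashes=frozenset()):
    """Return (kept, culled); culled is a list of (file, reason) pairs."""
    kept, culled, seen = [], [], set()
    for f in files:
        ext = f.name.rsplit(".", 1)[-1].lower() if "." in f.name else ""
        if ext not in allowed_ext:
            culled.append((f, "file type"))
            continue
        if not (start <= f.modified <= end):
            culled.append((f, "date range"))
            continue
        digest = hashlib.sha256(f.content).hexdigest()
        if digest in known_hashes:
            culled.append((f, "known system file"))
            continue
        if digest in seen:  # deduplication across all custodians
            culled.append((f, "duplicate"))
            continue
        seen.add(digest)
        kept.append(f)
    return kept, culled


# Made-up example files, for illustration only.
FILES = [
    EsiFile("A", "memo.docx", date(2013, 3, 1), b"quarterly memo"),
    EsiFile("B", "memo-copy.docx", date(2013, 3, 2), b"quarterly memo"),
    EsiFile("A", "setup.exe", date(2013, 3, 1), b"installer"),
    EsiFile("A", "old.docx", date(2009, 1, 1), b"old memo"),
    EsiFile("B", "notes.pdf", date(2013, 7, 1), b"meeting notes"),
]

kept, culled = cull(FILES, {"docx", "pdf"}, date(2012, 1, 1), date(2014, 1, 1))
print("kept:", [f.name for f in kept])
for f, reason in culled:
    print("culled:", f.name, "-", reason)
```

Deduplicating within each custodian instead of across all custodians would mean keeping one `seen` set per custodian. Production tools typically hash normalized email fields, not raw bytes, which this sketch does not attempt.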

7. Prepare ESI for Review Platform
- ESI and metadata must be processed for review & production in all but the smallest cases.
- Review in native, near-native, HTML, PDF or TIFF; the choice is driven by review platform capabilities; metadata goes in load files.
- How are exceptions handled (e.g., corrupt files, unusual file types, password-protected files)? Use of placeholder files.
- Use high-volume vendors when time is tight. Processing for TIFF review tools requires the most throughput capacity.

8. Right-Size Your Review Methodology
- Match methodology and case: case size, type and budget will drive which review method is preferable in any given case.
- Linear review: read, review, and code all documents, one at a time. Comforting, but not cost-effective or even possible in larger cases.
- Keyword search: using search keywords to identify responsive and privileged documents. Accurate and cost-effective if done correctly.
- TAR: technology-assisted review/predictive coding. Manually review a seed set of documents to train a computer algorithm that will automatically code the remaining documents.
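Whether keyword search is "done correctly" can be tested. One common way is to measure precision and recall against a sample that a reviewer has already coded. The sketch below assumes such a hand-coded sample exists; the function names, search terms, and documents are all made up for illustration.

```python
import re


def keyword_hit(text, terms):
    """True if any search term appears in the text as a whole word, ignoring case."""
    return any(re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE) for t in terms)


def precision_recall(sample, terms):
    """sample: list of (text, is_responsive) pairs coded by a human reviewer."""
    tp = fp = fn = 0
    for text, responsive in sample:
        hit = keyword_hit(text, terms)
        if hit and responsive:
            tp += 1   # found and responsive
        elif hit:
            fp += 1   # found but not responsive
        elif responsive:
            fn += 1   # responsive but missed by the terms
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


TERMS = ["merger", "acquisition"]
SAMPLE = [
    ("Please review the merger agreement draft", True),
    ("Lunch on Friday?", False),
    ("The deal closes next week", True),
    ("Merger of two sports leagues in the news", False),
    ("Acquisition price is still under discussion", True),
]

precision, recall = precision_recall(SAMPLE, TERMS)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low recall means responsive documents are being missed (here, the "deal" email), which argues for more terms. Low precision means reviewers are paying to read non-responsive hits. A real validation would use a random sample large enough to support the conclusion.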

8. Right-Size Your Review Methodology: Watch Out for Inadvertent Privilege Release
- Larger cases have put a strain on accurate privilege review.
- Finding 24 versions of a privileged document doesn't help if you release version 25.
- Nothing is more costly than compromising or losing a case because of privilege disclosure.
- Claw-back agreements are a good idea, but no panacea. You can't unring a bell.

8. Right-Size Your Review Methodology: Minimizing Risk of Privilege Release
- Understand in detail the privilege review process undertaken.
- Build a dictionary of privileged sources and issues early in document review.
- Check for: untrained or sloppy review; unsearchable documents; incomplete search indices; poor redaction procedures; search not done in both metadata and full text; privileged text retained in natives, text files, load files or text-based PDFs.
- Use specialized computerized privilege checks for container (email family) consistency and for exact-duplicate and near-duplicate identification.
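The "version 25" problem is a consistency check at heart. The sketch below is a minimal illustration under stated assumptions: `ReviewedDoc` and `privilege_conflicts` are hypothetical names, exact duplicates are detected by hashing extracted text, and near-duplicate detection is left out. It flags any document cleared for production that is an exact duplicate of, or in the same email family as, a document coded privileged.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ReviewedDoc:
    doc_id: str
    family_id: str   # email family (parent message plus attachments)
    text: str        # extracted text
    privileged: bool


def privilege_conflicts(docs):
    """Map doc_id -> reason for non-privileged docs that need a second look."""
    by_hash, by_family = defaultdict(list), defaultdict(list)
    for d in docs:
        by_hash[hashlib.sha256(d.text.encode("utf-8")).hexdigest()].append(d)
        by_family[d.family_id].append(d)

    flagged = {}
    checks = (
        (by_hash, "exact duplicate of a privileged document"),
        (by_family, "same email family as a privileged document"),
    )
    for groups, reason in checks:
        for members in groups.values():
            if any(m.privileged for m in members):
                for m in members:
                    if not m.privileged:
                        flagged.setdefault(m.doc_id, reason)
    return flagged


# Made-up example coding decisions, for illustration only.
DOCS = [
    ReviewedDoc("A1", "F1", "legal advice re contract", True),
    ReviewedDoc("A2", "F2", "legal advice re contract", False),  # same text, coded differently
    ReviewedDoc("B1", "F1", "attached spreadsheet", False),      # same family as A1
    ReviewedDoc("C1", "F3", "hello", False),
]

for doc_id, reason in privilege_conflicts(DOCS).items():
    print(doc_id, "-", reason)
```

A flag is a prompt for attorney review, not a privilege determination: an attachment in a privileged family is not necessarily privileged itself.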

9. Understand Your Case Facts & Issues
- How to identify key documents for depos, dispositive motions, case evaluations and settlement discussions.
- Particularly important if review has been mechanical/algorithmic (keyword dependent, groupings, predictive coding).
- Depo prep may be the first time attorneys are really looking at the evidence.
- Increasing need for "early case analysis," timelining, and ID of key docs.

10. Evaluate and Adopt New Technologies
Needed:
- Technology created Big Data and will be needed for evolving solutions.
- Lawyers are generally not the fastest at analyzing and adopting new technologies and modifying workflows.
Approach:
- Strategically adopt new eDiscovery technologies.
- Plan time to evaluate and test new software and approaches in a non-emergency environment.
- Maintain personnel with expertise, within or outside the organization, for recommendations and assistance.

Taming Big Data eDiscovery
UT 27th Annual Tech Law Conference, May 22, 2014, Austin, TX

Thank you. Questions & comments are welcome.
Gene Albert, Lexbe, 512-686-3460, gene@lexbe.com