E-Discovery Tip Sheet

LegalTech 2015: Some Panels and Briefings

Last month I took you on a select tour of the vendor exhibits and products from LegalTech 2015. This month I want to provide a short brief that might offer a little more incentive to brave the cold and crowds next time around. Below I have digested one plenary session and two vendor briefings from kCura on developments in their industry-leading review platform, Relativity 9.

A. LegalTech Panel Session: Taking TAR to the Next Level: Recent Research and the Promise of Continuous Active Learning

This panel comprised Professor Gordon Cormack of the University of Waterloo and Maura R. Grossman of Wachtell Lipton, co-authors of a cornerstone study of technology assisted review; Magistrate Judge Andrew Peck, a leading voice from the Federal bench on ediscovery issues; Susan Nielsen Hammond, General Counsel of Regions Financial Corporation; and moderator John Tredennick of big-data review vendor Catalyst Systems. To boil down a deep and interesting discussion: the evolution and efficacy of several classes of computer-assisted review were compared to the "false gold standard" (per Judge Peck) of linear manual review, and to each other. Ms. Grossman, followed by Professor Cormack, used slides to illustrate the differences in process and efficacy of three different types of computer learning:

> Simple Passive Learning (SPL):

1. Critical initial factors are (a) seed set selection, random vs. judgmental; and (b) the number of documents in the seed set.
2. Review and code the seed set (by an "Expert," i.e., a senior attorney on the case).
3. Feed the expertly-coded seed set to the algorithm; evaluate machine "vote" effectiveness and the training result.
4. Repeat as required to stabilize results ("till the popcorn stops popping," or until the stability does not materially change).
5. When done, run results against the entire document set.
6. Review documents auto-coded Responsive or above the confidence-ranking percentile cut-off.
7. The team chooses the next set to review.
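
To make the SPL loop concrete, here is a minimal sketch using TF-IDF features and logistic regression. This is my own illustration, not the panelists' or any vendor's implementation; the documents, labels, and the 0.5 cut-off are hypothetical stand-ins.

```python
# A minimal SPL sketch: train once on the expert-coded seed set, then
# score the entire corpus. Documents, labels, and cut-off are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Step 2: the Expert codes a seed set (1 = Responsive, 0 = Not Responsive).
seed_docs = ["draft merger agreement attached", "lunch on friday?",
             "board deck on the proposed deal", "fantasy football picks"]
seed_labels = [1, 0, 1, 0]

# Step 3: feed the coded seed set to the learning algorithm.
vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(seed_docs), seed_labels)

# Steps 5-6: run the model against the whole set and queue for review
# everything scored at or above the chosen confidence cut-off.
corpus = ["revised merger timeline", "happy hour on thursday?", "deal structure memo"]
scores = model.predict_proba(vectorizer.transform(corpus))[:, 1]
CUTOFF = 0.5  # stand-in for the percentile cut-off in step 6
print([doc for doc, s in zip(corpus, scores) if s >= CUTOFF])
```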

> Simple Active Learning (SAL):

1. Create a Control Set; think of it as a "Responsive" answer key for benchmarking.
2. Critical factors in seed set selection are random vs. judgmental, and the number of documents in the set, as above.
3. Review and code the seed set (by the "Expert").
4. Use the machine learning algorithm to select the documents from which it will learn the most (ambiguous content).
5. Still an iterative process until stable (all the popcorn is popped).
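
The step that distinguishes SAL is item 4: the machine, not the team, picks the next training documents, typically those it is least certain about. A hedged sketch of that uncertainty-sampling step, reusing the same hypothetical TF-IDF/logistic-regression setup as above:

```python
import numpy as np

# SAL's step 4 (uncertainty sampling): rank unreviewed documents by how
# close their responsiveness score is to 0.5 and surface the most
# ambiguous ones for the Expert to code next. Batch size is hypothetical.
def select_most_ambiguous(model, vectorizer, unreviewed_docs, batch_size=10):
    probs = model.predict_proba(vectorizer.transform(unreviewed_docs))[:, 1]
    order = np.argsort(np.abs(probs - 0.5))  # nearest to 0.5 first
    return [unreviewed_docs[i] for i in order[:batch_size]]
```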

> Continuous Active Learning (CAL):

1. Seed set (initial training set) selection is judgmental, and also depends on the number of documents in the set. Inferentially, some initial document counts have been calculated that seem to create a stable set under multiple circumstances (between about 5,000 and 14,000). One example given was to put one or more parties' Requests for Production into the set.
2. The machine learning algorithm trains continuously based upon the review itself: (a) review and code newly suggested documents and add them to the training set; (b) repeat until substantially all documents have been reviewed.
3. Iterative, constant review and feedback.
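
CAL folds steps 2(a) and 2(b) into one continuous loop: every coded batch goes straight back into training, and the model re-ranks what remains. A minimal sketch, again with hypothetical names; review_fn stands in for the human reviewers, and the seed set is assumed to contain both Responsive and Not Responsive examples:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def cal_review(all_docs, seed_docs, seed_labels, review_fn, batch_size=100):
    """Continuous Active Learning sketch: retrain after every coded batch
    and always push the highest-scoring unreviewed documents to reviewers.
    Assumes seed_labels contains both Responsive (1) and Not Responsive (0)."""
    vectorizer = TfidfVectorizer().fit(all_docs)
    coded_docs, coded_labels = list(seed_docs), list(seed_labels)
    remaining = [d for d in all_docs if d not in coded_docs]
    while remaining:  # step 2(b): until substantially all docs are reviewed
        model = LogisticRegression().fit(
            vectorizer.transform(coded_docs), coded_labels)
        scores = model.predict_proba(vectorizer.transform(remaining))[:, 1]
        batch = [remaining[i] for i in np.argsort(-scores)[:batch_size]]
        coded_docs += batch                            # step 2(a): code the newly
        coded_labels += [review_fn(d) for d in batch]  # suggested documents and
        remaining = [d for d in remaining if d not in batch]  # grow the training set
    return coded_docs, coded_labels
```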

Professor Cormack noted, in reviewing the higher recall of CAL, that search-term-based seed sets contain a built-in stop, limited by the keyword hits, even within TAR. Analyses of recall versus effort in a first-level document review were offered: for example, 56,000 documents were required in SPL to reach the same level of recall as 5,000 documents in SAL.

Ms. Hammond added a practical perspective on the theoretical and judicial discussions: in regulatory practice, precision is vital. Having used most of the types of tools under discussion, she noted that testing is needed to determine good seed sets, and continues to be required as new terms arise during review. She recommended a blended approach, continuing to engage human intelligence.

A toolkit and resources for the Cormack & Grossman SIGIR '14 report, including four Text Retrieval Conference (TREC) '09 Enron databases which were among those used for the cited controlled comparison of SPL, SAL, and CAL, are available for free under the GPL at trec.nist.gov, among other sources.
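
Comparisons like "56,000 SPL documents versus 5,000 SAL documents for the same recall" presuppose that recall can be measured at all, which is what a coded control set provides. The arithmetic is simple; the counts below are invented for illustration:

```python
# Estimating recall against a coded control set: of the documents the
# Expert marked Responsive in the control sample, what fraction did the
# model also flag? The counts below are invented for illustration.
control_responsive = 200   # Responsive documents in the control set
found_by_model = 150       # of those, how many scored above the cut-off
recall = found_by_model / control_responsive
print(f"recall = {recall:.0%}")  # prints: recall = 75%
```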

B. kCura Relativity Briefings

1. The Mobile Attorney: Working with Key Documents Using Relativity Binders

Relativity v8 and later can export and synchronize Binder data with an iPad in this mobile and web application that helps consolidate critical case documents. Binders are locked behind the Apple encryption keychain for security. Relativity field settings control metadata, docket, or coding output, with contents based upon a Saved Search. Binder users must already be licensed Relativity users. Among the limited palette of features available to mobile Binder users are:

- Annotations (highlight, note, draw, control colors and thickness; users see only their own);
- Organization (create Sections, drag and drop);
- Search of metadata or text (builds an index on the iPad, with highlights on hits; Boolean AND, OR, NOT must be in UPPERCASE);
- Offline Access (sync with Relativity as backup, visible only to the individual Binders user, via HTTPS/SSL);
- AirPlay (iPad Binder info can be wirelessly projected to Apple TV); and
- Binders on Web (Binder viewer, track changes, sync across multiple devices).

One can do incremental Binder builds, with updates and additions; a build won't remove anything, though. Apple iOS will warn on space, and auto-expire can be set to clear. With Relativity 9, users will be able to publish to Binders, even push a single document to a pre-made Binder. There will also be mobile device management and security configuration, as well as added Notifications, Favorites, and Preview before download (but no filesize parameter); the beta is due in March/April 2015. A Native Imaging Server (the processing add-on module, which requires additional servers) is required to use Binders. This is NOT a collaborative tool at this point.

2. Relativity Analytics Overview

The presenter discussed analytics in case workflow as ideal where there is a short timeline, such as in a Federal second request on a prospective merger, and a lot of data to get through. She cited that the average case here was about 1M documents, and the top 1,000 cases were about 3.8M documents. Relativity Analytics is thus intended to (a) investigate an unknown data set for document types and languages, and find related documents; (b) evaluate large sets of data and prioritize; or (c) structure documents by batching out clusters. The presenter broke it out as follows:

> Email Threading (based on Content Analyst): identify a group within a conversation; display groupings; show the master inclusive email (indicated by a solid dot).

> Near Duplication: organization of highly similar text into relational groups with a percentage of similarity; used for review batching or conflict checks, or to find subtle differences in language between documents (see the similarity sketch after this list).

> Language Identification: determine the primary and up to two secondary languages per document; report the percentage of text in each language found; handles 172 languages and dialects. Used to assign documents to language review teams, create grand-total charts and reports, and for further classification. One cannot exclude text, at least in Relativity 8.

The above fall into the category of Document Organization and Structure.
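
Near-duplicate grouping of the kind described, relational groups keyed to a percentage of textual similarity, can be approximated with pairwise cosine similarity over TF-IDF vectors. This generic sketch is not Relativity's actual engine, and the 80% threshold is purely illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Group documents whose pairwise textual similarity clears a threshold.
docs = ["The contract is attached for your review.",
        "The contract is attached for your review and signature.",
        "Lunch at noon on Friday?"]
sim = cosine_similarity(TfidfVectorizer().fit_transform(docs))

THRESHOLD = 0.80  # illustrative "percentage of similarity"
for i in range(len(docs)):
    for j in range(i + 1, len(docs)):
        if sim[i, j] >= THRESHOLD:
            print(f"documents {i} and {j} are near-duplicates ({sim[i, j]:.0%})")
```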

Next are the Conceptual Analytics:

> Latent Semantic Analysis: a mathematical assessment of language learned from the documents in the current case, based upon concepts, not words; "aboutness" (about a plan, an RFP on a subject, a precis of blog post content), versus the more common "is-ness" (metadata, keyword, proximity, document type, author).

> Search: use an example sentence, paragraph, or entire document to return documents related in concept, based on ideas and thus conceptual relevancy, to get around false keyword hits, misspellings, and code words (see the sketch following this list).

> Keyword Expansion: submit a term to list conceptually-related items, in order to:
- develop a search term list (synonyms);
- learn the language of a case (jargon, new terms, idiom); and
- reveal code words and variations.
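
Concept search and keyword expansion both rest on the same mechanism: project documents and terms into a low-dimensional "concept" space and measure nearness there rather than by shared keywords. A generic latent-semantic-analysis sketch using scikit-learn's TruncatedSVD, not Content Analyst's actual engine; the corpus, query, and component count are hypothetical:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the merger agreement needs board approval",
        "board approved the acquisition deal",
        "softball league standings attached"]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)

# Learn a small latent "concept" space from the case's own language.
svd = TruncatedSVD(n_components=2, random_state=0)
doc_concepts = svd.fit_transform(tfidf)

# Concept search: rank documents by conceptual nearness to an example
# passage rather than by shared keywords.
query = svd.transform(vectorizer.transform(["acquisition of the company"]))
print(cosine_similarity(query, doc_concepts))

# Keyword expansion: terms whose concept-space vectors sit near a
# submitted term are candidate synonyms, jargon, or code words.
terms = vectorizer.get_feature_names_out()
term_concepts = svd.components_.T                 # one row per vocabulary term
target = term_concepts[list(terms).index("merger")]
ranked = np.argsort(-cosine_similarity([target], term_concepts)[0])
print([terms[i] for i in ranked[:5]])
```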

Last are the Review and QC analytics:

> Clustering: group documents by concept and visual hierarchy (a title of four words found together is provided for each cluster; see the sketch after this list). One can then batch out by cluster (number of documents, score, e.g., 0.65). The process runs an index of all documents in the workspace, by custodian, or by the set submitted for Analytics clustering. This facilitates mass actions, e.g., Mass Tag a certain cluster Not Relevant. One can batch out either using or overriding the Family Field Group identifier.

> Categorization: based upon expert user-defined examples or categories, using example documents from Relativity Assisted Review. Use for prioritization, sorting large volumes quickly, or creating a pivot table to visualize clusters against categories. Under the Indexing & Analytics tab, one can set the example source (e.g., Tag), maximum categories per document, minimum coherence score (default = 70%), and issue designation.
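
Conceptual clustering with four-word cluster titles can be approximated by k-means over document vectors, labeling each cluster with the terms nearest its centroid. A hedged sketch with an invented corpus and cluster count:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = ["merger agreement draft", "board approval of the merger terms",
        "office holiday party invite", "party planning committee notes"]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# Cluster the workspace (or one custodian's slice) into concept groups.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Title each cluster with the 4 terms nearest its centroid, echoing the
# four-word cluster titles described above.
terms = vectorizer.get_feature_names_out()
for c, centroid in enumerate(km.cluster_centers_):
    top_terms = centroid.argsort()[::-1][:4]
    print(f"cluster {c}:", " ".join(terms[i] for i in top_terms))
```

Batching a cluster out for a mass action then reduces to selecting the documents whose cluster label matches.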

The above notes represent a tiny fraction of what was on offer at LegalTech. The show truly is one place and time where legal technology people, knowledge, and commerce converge. Hope to see you there next year!

-- Andy Kass
akass@uslegalsupport.com
917-512-7503

The views expressed in this E-Discovery Tip Sheet are solely the views of the author, and do not necessarily represent the opinion of U.S. Legal Support, Inc.

U.S. LEGAL SUPPORT, INC. - ESI & Litigation Services
PROVIDING EXPERT SOLUTIONS FROM DISCOVERY TO VERDICT
e-discovery: Document Collection & Review, Litigation Management, Litigation Software Training, Meet & Confer Advice
Court Reporting Services
At Trial: Electronic Evidence Presentation, Trial Consulting, Demonstrative Graphics, Courtroom & War Room Equipment
Deposition & Case Management Services, Record Retrieval
www.uslegalsupport.com

Copyright 2015 U.S. Legal Support, Inc., 425 Park Avenue, New York NY 10022. (800) 824-9055. All rights reserved.