ediscovery Institute Survey on Predictive Coding


Released October 1, 2010

The ediscovery Institute is a 501(c)(3) nonprofit research organization dedicated to identifying and promoting cost-effective methods of processing discovery. More information on the work of the Institute is available at

ediscovery Institute 2010, all rights reserved.

Foreword: Why A Survey on Predictive Coding?

The largest cost element in the ever-escalating cost of electronic discovery is typically the cost of having teams of lawyers review and select records for production or privilege. Those costs can be especially staggering if the lawyers are reviewing every record that is produced. Predictive coding is a process in which review decisions from examining sample records are propagated or extended, by the use of various technologies, to records which have not been individually examined. The producing party may use the suggested evaluations to avoid examining all records, or it can lower costs by triaging the documents, assigning the lower-ranking documents to the lowest-cost personnel and letting the more expensive resources focus on the records that are most likely to be relevant. Either way, predictive coding can significantly reduce the largest single element of cost in e-discovery.

The survey was undertaken to collect information on the technologies and processes being used to accomplish predictive coding and to quantify the savings they were achieving.

There is growing recognition that the old brute-force linear review process, in which each record is examined, is not economically feasible. For example:

Principle 6 of the Sedona Principles (Second Edition), June 2007: Responding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.

Principle 11 of the Sedona Principles (Second Edition), June 2007: A responding party may satisfy its good faith obligation to preserve and produce relevant electronically stored information by using electronic tools and processes, such as data sampling, searching, or the use of selection criteria, to identify data reasonably likely to contain relevant information.

Practice Point 1 from The Sedona Conference Best Practices Commentary on the use of Search and Information Retrieval Methods in E-Discovery: In many settings involving electronically stored information, reliance solely on a manual search process for the purpose of finding responsive documents may be infeasible or unwarranted. In such cases, the use of automated search methods should be viewed as reasonable, valuable, and even necessary. (Emphasis added.)

Not only is predictive coding less expensive, there is also a growing belief that it is actually superior to linear review in several ways:

Consistency. Human review is not necessarily the gold standard it is sometimes assumed to be. In a study by the ediscovery Institute¹ and earlier studies by the Text Retrieval Conference (TREC),² two reviewers or teams of reviewers have examined the same records. In these studies, the second review has identified as responsive just 48.8 to 62.0% of the records identified as responsive by the first review. In other words, linear human review is itself quite fallible.

Transparency. When humans review records there is seldom any documentation on why particular records were deemed responsive or not. By contrast, most predictive coding methodologies build an audit trail of what decisions were made and what rules were applied.

Retroactive Evaluation. Linear review is so expensive that it is rarely feasible to re-examine records that had been reviewed earlier, even though the review team may have gained substantial new insight into the issues of the case in the meantime. Not so with some automated review technologies and processes.

Time. Predictive reviews can greatly reduce the time required to produce records, thereby shortening the time required to resolve disputes.

Confidentiality. Individually reviewing each record requires large review teams; this necessarily exposes confidential information to more risk of unwanted disclosure than would predictive reviews, which can process the same volumes with far fewer reviewers.

We hope that the results will inform discussions on what types of pre-production review are legally defensible.

The ediscovery Institute
Anne Kershaw, President & Co-founder
Joe Howie, Director, Metrics Development and Communications

¹ Document Categorization in Legal Electronic Discovery: Computer Classification vs. Manual Review, Herbert L. Roitblat, Anne Kershaw and Patrick Oot, Journal of the American Society for Information Science and Technology, 61(1):70-80. Two teams of reviewers examined 5,000 documents that had earlier been examined as part of a response to a US Department of Justice request. Team A identified 48.8% of the records identified as responsive by the original reviewers from the sample and Team B identified 53.9%. Two computer-assisted review systems were also used to review the entire original population. System C identified 45.8% of the documents originally identified as responsive; System D identified 52.7%. See also, Automated Document Review Proves Its Reliability, Anne Kershaw, Digital Discovery & e-evidence Newsletter, Pike & Fischer, November. It describes a study comparing the performance of a human review team to that of an automated document assessment system in evaluating a sample of 43% of a collection of 48,000 documents. Relevant documents were deemed to be those identified by both the system and the humans, plus those identified by either the system or the humans where subsequent arbitration decided that they were relevant. The system identified more than 95% of relevant records whereas the people identified 51%.

² Overview of the TREC 2009 Legal Track, by Bruce Hedin, Stephen Tomlinson, Jason R. Baron, and Douglas W. Oard, downloaded on September 29, 2010, provides corrected results for a study done in 2008 in which a subset of records that had previously been reviewed in 2006 and 2007 were reviewed again. According to the Overview, just 62% of documents previously judged relevant were judged relevant again. The sample consisted of 104 documents that had previously been judged relevant and 120 documents that had previously been deemed not relevant. See discussion at section 4 (Correction to 2008 Assessor Consistency Study) in the Legal09 Overview. See also, interassessor consistency data on TREC 06 Legal track ad hoc topics, by Dave Lewis, downloaded on Sept. 29, 2010. In 2006 the TREC Legal Track took a sample of documents on each of 40 topics. The sample for each topic consisted of 25 documents that had been deemed relevant by an assessor (or all of them if there weren't 25 relevant documents) and enough nonrelevant documents to bring the total to 50. Due to a glitch, one topic only had 49 records. This sample set was then reviewed by another assessor. The first assessor identified 877 of 1999 documents as relevant. The second assessor identified 470 of those 877 documents, or 58.0%, as being relevant.

Contents

Foreword: Why A Survey on Predictive Coding?
I. Special Thanks
II. Background
III. Overview of Results
IV. Respondents
V. Terminology
VI. Offering & Overall Process
VII. Identifying Like Records
VIII. Threading
IX. Paper-Based Records
X. Savings
XI. Pricing/Cost
XII. Incremental Cost of Predictive Coding
XIII. Sample Sizes
XIV. Set Up Efforts
XV. Transparency
XVI. Privilege
XVII. Repeatable Results
XVIII. Elevator Pitch
XIX. Acceptance/Adoption
XX. Type Matters, Size Threshold
XXI. Obstacles to Broader Adoption
XXII. Languages
XXIII. Review Platforms
XXIV. Judicial Review
XXV. Should Have Asked
XXVI. Comments

I. Special Thanks

We believe that the best way to identify and adopt cost-effective ways to process electronic discovery is to have an informed debate on the various options, and we want to thank the companies that provided the responses shown in this report. Many of them provided great insight into how they accomplish predictive coding and what the benefits of this approach are. Kudos to the following companies for stepping up and sharing this valuable information:

Capital Legal Solutions
Catalyst Repository Systems, Inc.
Equivio
FTI Technology
Gallivan Gallivan & O'Melia
Hot Neuron
InterLegis
Kroll Ontrack
Recommind
Valora Technologies, Inc.
Xerox Litigation Services

II. Background

The predictive coding survey is the third in a series of surveys by the ediscovery Institute on technologies or processes that can be used to speed the processing of electronic data while improving the quality of the review. The first showed that proper consolidation of duplicate electronic files could, on average, reduce the volume of records to be reviewed by 38%.³ The second survey showed that grouping emails in threads or conversations could reduce the effort required to review by an additional 36% on average.⁴

This survey dealt with predictive coding, which we defined as a combination of technologies and processes in which decisions pertaining to the responsiveness of records gathered or preserved for potential production purposes are made by having reviewers examine a subset of the collection and having the decisions on those documents propagated to the rest of the collection without reviewers examining each record.

In May 2010, invitations to participate in the survey were sent to a number of companies known to be active in the electronic discovery market. Additionally, postings inviting participation were made on a number of forums, including the EDDUpdate.com blog, the Lit Support list serve on Yahoo, the ediscovery group on LinkedIn.com and LegalORamp.com. Responses were all received by July 1.

³ Report on Kershaw-Howie Survey of E-Discovery Providers Pertaining to Deduping Strategies. This study showed that consolidating duplicates across custodians reduced the volume to be reviewed by 38% on average, with many individual respondents reporting project-level reductions in excess of 70%. Across-custodian deduping reduced the volume of electronic discovery almost twice as much as consolidating only within the records of individual custodians, yet it was performed in only half the cases. This raises serious ethical considerations, which were explored in Ethics and Ediscovery Review, published in the ACC Docket, Jan/Feb. Reprints are available at EthicsOfEdiscover.pdf

⁴ Report on Kershaw-Howie Survey of E-Discovery Providers Pertaining to Threading.
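As a rough illustration of how the two earlier survey averages might combine, the sketch below assumes that the 36% threading reduction applies to whatever remains after the 38% deduplication reduction. That compounding assumption, and the 100,000-record starting volume, are ours for illustration and are not stated in the surveys.

```python
def combined_reduction(*reductions):
    """Combine successive fractional reductions, each applied to what remains."""
    remaining = 1.0
    for reduction in reductions:
        remaining *= 1.0 - reduction
    return 1.0 - remaining


dedupe_reduction = 0.38      # average from the deduping survey
threading_reduction = 0.36   # average "additional" reduction from the threading survey

total = combined_reduction(dedupe_reduction, threading_reduction)
print(f"Combined reduction: {total:.1%}")

# Hypothetical starting volume, chosen only to make the arithmetic concrete.
starting_records = 100_000
print(f"Records left to review: {round(starting_records * (1.0 - total)):,}")
```

Under that assumption the two steps together remove about 60% of the review burden (1 - 0.62 x 0.64 = 0.6032), rather than the 74% that simply adding the two percentages would suggest.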

III. Overview of Results

This is a summary of the results. Complete responses to the questions are provided later in this report. Some highlights:

Savings. Respondents reported average savings of 45%, with average maximum observed savings of 71% and average minimum observed savings of 23%. Individual respondents reported savings as high as 80 to 95%, and even 100%, and minimum savings as low as 0% on individual projects.

Obstacles to Implementation. The respondents felt that the largest obstacle to more widespread use of predictive coding was uncertainty over judicial acceptance of that approach. The next closest obstacle was lack of awareness of options on the part of in-house counsel, followed closely by insensitivity to the cost of inefficiencies by law firms.

General Approach. The respondents varied in their approach to predictive coding. Most respondents used some form of queries combined with document clustering.

Non-Binary Process. In their responses, several of the respondents noted that predictive coding is non-binary in nature, i.e., documents are ranked according to how closely they match previously examined records. In other words, there is a continuum and the review team has to select the cutoff point.

Terminology. Almost all of the respondents thought there was a better generic term than predictive coding. Suggestions included:

Automated Document Classification
Automatic Categorization
Predictive Categorization
Predictive Ranking
Prognostic Document Profiling
Propagated Coding
Relevance Assessment
Replicated Coding
Suggested Coding

Pricing Models. The respondents offered a variety of pricing models, including per GB pre-culling, per GB post-culling, hourly fees, and flat per-case fees.

Sampling. Most respondents used some form of statistical sampling.

Transparency. Most of the respondents provide an audit trail of what decisions were made.

Replicability. Most of the respondents indicated that a second analysis using the audit trail from the first analysis would produce the same results.

Adoption Rate. There were not enough responses in this area to provide metrics on the rate of adoption.

Maturity of Offerings. Predictive coding as an offering is far more recent than deduping or threading. Many of the respondents have added predictive coding in the last two years.

Threading. Most respondents were able to treat emails either individually or grouped in threads.

Paper Records. All the respondents included scanned and OCR'd paper records with electronic records for predictive coding purposes.

Languages. All the respondents could handle English, French, German and Spanish, but a few could not handle Chinese, Japanese, Korean or Arabic.
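The Non-Binary Process highlight can be made concrete with a small sketch. It assumes a hypothetical reviewed sample in which every document has a relevance score between 0 and 1 and a reviewer's relevant/not-relevant call; the scores, the sample, and the function name are illustrative and do not come from any respondent's system.

```python
def review_tradeoff(scored_sample, cutoff):
    """Return (fraction of documents at or above the cutoff, recall at that cutoff).

    scored_sample is a list of (score, is_relevant) pairs from a reviewed sample.
    """
    total = len(scored_sample)
    total_relevant = sum(1 for _, relevant in scored_sample if relevant)
    selected = [(score, relevant) for score, relevant in scored_sample if score >= cutoff]
    found = sum(1 for _, relevant in selected if relevant)
    review_fraction = len(selected) / total if total else 0.0
    recall = found / total_relevant if total_relevant else 0.0
    return review_fraction, recall


# Hypothetical sample: (score assigned by the system, reviewer's call).
sample = [
    (0.95, True), (0.90, True), (0.70, False), (0.60, True),
    (0.40, False), (0.30, True), (0.10, False), (0.05, False),
]

for cutoff in (0.8, 0.5, 0.2):
    fraction, recall = review_tradeoff(sample, cutoff)
    print(f"cutoff {cutoff:.1f}: review {fraction:.0%} of documents, recall {recall:.0%}")
```

Lowering the cutoff captures more of the relevant documents but sends more of the collection to review, which is the trade-off the review team weighs when it picks the cutoff point.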

IV. Respondents

The following companies provided responses to the survey.

Company: Capital Legal Solutions
Contact Person: Gregory Brooks, VP Information Technology, gbrooks@capitallegals.com
Involvement w/ Predictive Coding: Developed own predictive coding software; provide software & hosted review

Company: Catalyst Repository Systems, Inc.
Contact Person: John Tredennick, CEO, jtredennick@catalystsecure.com
Involvement w/ Predictive Coding: Developed own predictive coding software; developed methodology within

Company: Equivio
Contact Person: Warwick Sharp, VP Marketing and Business Development, warwick.sharp@equivio.com
Involvement w/ Predictive Coding: Developed own predictive coding software

Company: FTI Technology
Contact Person: Kate Holmes, Director, Corporate Communications, kate.holmes@fticonsulting.com
Involvement w/ Predictive Coding: Developed own predictive coding software; we provide both software and hosted review.

Company: Gallivan Gallivan & O'Melia
Contact Person: Daniel Gallivan
Involvement w/ Predictive Coding: Integrated others' predictive coding

Company: Hot Neuron, LLC
Contact Person: Bill Dimm, CEO, clustify@hotneuron.com
Involvement w/ Predictive Coding: Developed own predictive coding software

Company: InterLegis
Contact Person: Kevin Carr, President, kcarr@interlegis.com x205
Involvement w/ Predictive Coding: Developed own predictive coding software

Company: Kroll Ontrack
Contact Person: Jamie Ritter, Document Review Manager, jritter@krollontrack.com
Involvement w/ Predictive Coding: Developed own predictive coding software

Company: Recommind
Contact Person: Chris Hutcheson, Marketing Director
Involvement w/ Predictive Coding: Developed own predictive coding software; provide hosting and software

Company: Valora Technologies, Inc.
Contact Person: Sandra Serkes, President & CEO, sserkes@valoratech.com
Involvement w/ Predictive Coding: Combo software provider & services provider.

Company: Xerox Litigation Services
Contact Person: Karen Miller, Director of Marketing, karen.miller@xls.xerox.com
Involvement w/ Predictive Coding: Developed own predictive coding software

V. Terminology

The survey asked, "If you think there is a better generic term than predictive coding, what would it be?" and "Why?" These were the responses, listed column by column in table order:

Company:
Capital Legal Solutions
Catalyst Repository Systems
Equivio
FTI Technology
Gallivan Gallivan & O'Melia
Hot Neuron, LLC
InterLegis
Kroll Ontrack
Recommind
Valora Technologies, Inc.
Xerox Litigation Services

Better Term:
Prognostic Document Profiling
Predictive Ranking
Relevance Assessment
Suggested coding
Predictive Categorization
Automatic Categorization
No
Propagated Coding or Replicated Coding
Automated Document Classification

Why:

The prognostic and iterative content categorization can play a broader part than simply review "call" score evaluation; for example in the document management system's context.

More descriptive of the process and result. All systems deal with a rank or likelihood of responsive or not responsive. It is up to the trial team to determine the acceptable risk.

The term "coding" suggests that the output is binary (responsive or not). However, one of the important use scenarios is prioritized review, which can only be facilitated by graduated relevance scores. In addition, graduated relevance scores are important in allowing the user to select which documents to review (above a certain cut-off score), based on the mix of risk (recall) and cost (precision) appropriate for the given case and business scenario.

We take the approach that this review technology does not completely eliminate human review. "Suggested coding" correctly indicates that human review decisions are preserved and help guide the computer through concept-clustering of documents and the integration of reference documents into the review. Review decisions become more consistent and faster, without relinquishing control over the substantive decisions for each document.

"Coding" implies a decision, the machine is suggesting. The coding happens when a person confirms (or refutes) the suggested category.

I don't know if it is "better," but it better aligns with terminology outside of the legal field.

May we suggest "Propagated Coding," rather than Predictive Coding, as "predictive" tends to mean ahead of the current time (like a forecast), whereas "propagated" would indicate taking existing results and carrying them forth across the remainder of the population (at any time, including the present).

We believe that a generic term for a new offering in this market should be as transparent and descriptive as possible. Automated Document Classification is our preferred name for this particular technology, because we believe that it more clearly conveys the intended output for the technology, namely, a definitively classified set of documents. In our view, the term Predictive Coding is opaque and imprecise. It does not differentiate Automated Document Classification from less robust similarity-detection technologies, like clustering, near de-duplication, and threading. These other techniques could be used to make predictions regarding relevance for certain groups of documents within a corpus. Unlike Automated Document Classification, though, they would not comprehensively classify a document population such that clear definitive lines could confidently be drawn segregating relevant documents from non-relevant documents. In sum, the term Predictive Coding seems to us to suggest a technology whose end results are imprecise, immeasurable, and unreliable. This is not, in our view, an appropriate designation for the emerging body of Automated Document Classification systems.

VI. Offering & Overall Process

The survey asked:

Name of PC Offering. What do you call your predictive coding offering?

Time Offered. What year did you first provide predictive coding software or services?

Overall Process. Please describe the overall process involved in your offering: (Example: After records have been collected and placed in a repository, the duplicate records are consolidated. Reviewers perform full text searches and otherwise browse the records of custodians with the most known involvement in the issues. The reviewers identify records known to be responsive and then our system identifies other records that are most like those records based on ______. We sample non-selected records based on ______ and examine samples of about XX records to determine if there are still sets of relevant records that had not already been selected for production. We repeat iterations until ______.)

The responses were as follows:

Predictive Coding Offering & Overall Process

Company: Capital Legal Solutions
Offering: Dynamic Content Profiling
Overall Process: Dynamic Content Profiling will work on any corpus of documents, across any language set, that is imported into our ezreview repository, pre or post culling for de-duplication, date filtering or key word searching. Dynamic profiler works on any folder or navigational view in the system. As such, the client can execute across searches, tags, production data, random sample sets or customized queries. In any event, CLS review architects can work with the client to create a powerful strategy whereby they can preview deliberate batches based on any folder technique mentioned prior or through our automated randomizer engine. In our random sampling module, the user can decide the size of the sample set and the pass or fail threshold levels. Sampled or deliberate batches then receive review decisions by expert or top-level reviewers. Our profiling engine will then scan across the corpus of documents in the entire database and find similar documents based on content and concepts, using CLS's customized algorithms. All such documents are pulled and foldered for mass categorization. A random sampling can then be performed against that data set for quality assurance purposes. This process can be repeated until all documents in the corpus are reviewed.
Underlying Technology: Capital Legal Solutions' own intellectual property, developed internally.

Company: Catalyst Repository Systems
Offering: Predictive Ranking
Offered Since: nd Quarter
Overall Process: Catalyst offers Predictive Ranking and statistical analysis based upon initial coding decisions made by counsel during initial document review/sampling. These coding decisions are coupled with weighted key concepts and search terms, and then are applied against the non-reviewed documents, leading to an assigned predictive weighting for responsiveness. The ranks are typically used two ways:
1. Documents with a very low rank, tested and shown to be extremely unlikely to be responsive, are not reviewed and not produced.
2. The remaining documents are typically prioritized and reviewed in priority order, beginning with those most likely to be responsive. This allows for a prioritized review, making the review more efficient and supporting rolling productions.
The steps we follow are as follows:
Begin with a list of search terms that counsel believes are likely to find responsive documents, and run those searches.
Review a random sample of both the hits and non-hits, tagging for responsiveness, and looking for additional words and phrases that are found in responsive documents and false hit terms that are often found in non-responsive documents.
Adjust the search terms based on what was learned during the sampling.
If there were phrases found that are common false hits, run a Catalyst unique True Hit Finder/False Hit Remover process to tag the true hits and not the false hits.
Assign the search terms scores representing likelihood of responsiveness, and run the searches in Power Search, based on subject matter expertise and sampling results.
Assign each document a responsiveness rank based on a combination of the search terms that hit and the scores of each search term.
Sample additional documents to verify the scoring.
Determine the cut-off, and move the docs that are ranked as nonresponsive to a subcollection where they can be sampled and archived.
Review the docs ranked as likely responsive, beginning with the highest ranked documents.
The benefits can be magnified when combined with Catalyst's additional features to accelerate the review, such as Equivio Thread/Near Dupe analysis, sophisticated handling of multiple languages, clustering, and managed review workflow.
Underlying Technology: Catalyst Repository Systems (Catalyst CR)

Company: Equivio
Offering: Equivio>Relevance
Overall Process: Equivio>Relevance enables organization of a document collection by relevance. Based on initial input from an attorney knowledgeable of the case, Equivio>Relevance uses statistical and self-learning techniques to calculate graduated relevance scores for each document in the data collection. As an expert-guided system, Equivio>Relevance works as follows: An expert reviews a sample of documents, ranking them as relevant or not. Based on the results, Equivio learns how to score documents for relevance. In an iterative, self-correcting process, Equivio feeds additional samples to the expert. These statistically generated samples allow Equivio>Relevance to progressively improve the accuracy of its relevance scoring. Once the sampling process has optimized, Equivio scores the entire collection, calculating a graduated relevance score for each document. The product includes a statistical model which monitors the software training process, ensuring validation and optimization of the sampling and training effort.
Underlying Technology: We are not at liberty to disclose this information.
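The Catalyst steps of assigning scores to search terms and ranking each document by the terms that hit can be sketched roughly as follows. This is a minimal illustration under our own assumptions, not Catalyst's implementation: the term scores, the sample documents, the use of negative scores for false-hit terms, and the simple additive scoring rule are all hypothetical.

```python
import re


def score_document(text, term_scores):
    """Add up the scores of every search term that appears in the document."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return sum((score for term, score in term_scores.items() if term in words), 0.0)


def rank_documents(documents, term_scores):
    """Return (doc_id, score) pairs ordered from most to least likely responsive."""
    scored = [(doc_id, score_document(text, term_scores)) for doc_id, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical term scores: positive for likely-responsive terms,
# negative for a term that sampling showed to be a common false hit.
term_scores = {"merger": 3.0, "acquisition": 2.0, "newsletter": -2.0}

documents = {
    "d1": "Draft merger and acquisition timeline",
    "d2": "Weekly newsletter mentions the merger",
    "d3": "Lunch menu for Friday",
}

ranked = rank_documents(documents, term_scores)
cutoff = 1.0  # hypothetical cut-off chosen after sampling
to_review = [doc_id for doc_id, score in ranked if score >= cutoff]
set_aside = [doc_id for doc_id, score in ranked if score < cutoff]
print(ranked)
print("review in this order:", to_review)
print("set aside for sampling:", set_aside)
```

Documents below the cut-off are kept in a separate list rather than discarded, mirroring the step in which low-ranked documents are moved to a subcollection where they can still be sampled.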

Company: FTI Technology
Offering: Acuity is the name of our all-in-one legal review offering that utilizes "predictive coding" (our preference is "suggested coding").
Offered Since: Acuity launched in January
Overall Process: The Acuity process is to review a subset of the documents, which we call the reference set, and have the review team code them as appropriate. This serves two functions: these documents will suggest coding on uncoded documents, and will continually guide and instruct reviewers. From there, the reference set is uploaded to an enhanced Attenex Document Mapper tool where the coded documents are clustered with other documents of similar content and themes. Based upon the coding of the reference set, the software provides suggestions to the reviewers on how to code the similar documents. Coding decisions are implemented by the reviewers rather than automatically by the software, and the process can be accurately described as machine-assisted document review.
Underlying Technology: The underlying software is Attenex Patterns. The Acuity all-in-one offering utilizes an enhanced version of well-known software that includes the suggested coding features.

Company: Gallivan Gallivan & O'Melia
Offering: Depends on client -- we are not consistent: have used clustering, auto tagging, grouping
Offered Since: Rudiments in 2003 (Attenex style); fully (if client requested) since 2008
Overall Process: Collect and process records; extract content and place it in a repository; store references in a database. Consolidate duplicates. Extract text or OCR, compare text content to create a similarity vector, store results. Reviewers perform full text searches and otherwise browse the records of custodians, filtering based on metadata as needed. Similar documents are grouped together. The reviewers identify groups known to be responsive and then we associate other records that are most like those records based on the similarity vector. Reviewer decisions define the actual mark of the documents vs. the mark suggested by our system. As new waves of data arrive, they are placed in groups based on similarity vectors generated for that data.

Company: Hot Neuron, LLC
Offering: Clustify (PC is a subset of its functionality)
Overall Process: Clustify only does the automatic categorization step of the process, so the details of other steps (de-dupe, searches, etc.) are really up to the user. The user supplies two sets of documents to Clustify: documents that have already been categorized (perhaps as responsive/not-responsive, or whatever categories the user wants to use), and documents that haven't been categorized. Clustify compares the uncategorized documents to the ones that have been categorized, and categorizes them automatically if they are sufficiently similar to any of the categorized documents. The "sufficiently similar" criterion is specified by the user. It could be a minimum conceptual similarity percentage, or a near-dupe percentage. Any uncategorized documents that aren't sufficiently similar to any categorized document for automatic categorization are clustered, labeled with descriptive keywords, and presented to the user for manual categorization. Clustify tells the user how similar an auto-categorized document is to the most similar manually categorized document, so the user can identify the documents most at risk of incorrect categorization (i.e., those with lowest similarity). The process can be iterated in an effort to cover more of the uncategorized documents, but it is only wise to do so if there is a manual review of the documents most at risk for errors. Without such review, it is better to increase coverage by simply setting the similarity requirement lower.
Underlying Technology: It is Hot Neuron's own proprietary technology.

Company: InterLegis
Offering: Discovery360 Predictive Coding
Overall Process: Predictive coding is a technology feature within Discovery360 Reviewer. There are two ways to leverage PC within Discovery360:
1. User-Defined: Case administrators define various attributes that define certain issue codes. They can use any attribute in the database, including keywords, concepts, file types, domains, specific names and more. This process enables users to first "teach" the system, then ask it to find all documents that match their criteria.
2. Automatic: With this feature activated, the system will essentially "watch and learn" what commonalities are found between documents as they are issue coded. As reviewers work, the system will find and recommend likely candidates for each issue code. Users can then either approve the entire recommended list, edit the criteria, or quickly QC the list to confirm selections.
Additionally, case administrators have the ability to ask the system to either code matching documents immediately, or place likely candidates in a holding folder for confirmation. In all cases, documents coded via the PC engine are always designated as such in the database for logging and defensibility purposes.
Underlying Technology: InterLegis' proprietary technology

Company: Kroll Ontrack
Offering: Intelligent Prioritization
Overall Process: After documents have been processed and uploaded into Ontrack Inview, the project administrator builds an initial workflow. An early workflow stage is designated for Intelligent Prioritization. Initially, a statistically relevant sample of the uploaded documents is provided to reviewers for standard linear review. The system then assesses the reviewed documents and defines the characteristics of potentially Responsive documents. The system then prioritizes other likely Responsive documents for review. As the review continues, the system's knowledge of Responsive characteristics improves. When new documents are loaded into Ontrack Inview, another statistically significant sample is identified from this new data and that sample of data is prioritized for Responsive review. In addition to the document prioritization identified above, Kroll Ontrack provides additional project analysis that helps determine when a high percentage of potentially Responsive documents have been identified within the data. By analyzing the Responsiveness patterns in the data and comparing them to the entire population of documents, Ontrack Inview can provide statistical details that can be utilized to indicate the completeness of a review.
Underlying Technology: Intelligent Prioritization is a proprietary Kroll Ontrack technology.

Company: Recommind
Offering: Axcelerate ediscovery
Overall Process: All software, processes and workflow are the proprietary intellectual property of Recommind and cannot, therefore, be disclosed.
Underlying Technology: Recommind
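The Hot Neuron response describes propagating categories from categorized documents to sufficiently similar uncategorized ones. The sketch below illustrates that general idea only. Clustify's similarity measure is proprietary, so plain bag-of-words cosine similarity is swapped in here as a stand-in, and the function names, sample documents, and 0.6 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def propagate(categorized, uncategorized, min_similarity):
    """Copy the category of the most similar categorized document when it is similar enough.

    Returns (auto, leftover): auto maps doc_id -> (category, similarity);
    leftover lists doc_ids that still need manual categorization.
    """
    labeled = [(vectorize(text), category) for text, category in categorized]
    auto, leftover = {}, []
    for doc_id, text in uncategorized.items():
        vec = vectorize(text)
        best_sim, best_category = max(
            ((cosine(vec, other), category) for other, category in labeled),
            key=lambda pair: pair[0],
            default=(0.0, None),
        )
        if best_category is not None and best_sim >= min_similarity:
            auto[doc_id] = (best_category, best_sim)
        else:
            leftover.append(doc_id)
    return auto, leftover


categorized = [
    ("pricing agreement with the distributor", "responsive"),
    ("office holiday party schedule", "not-responsive"),
]
uncategorized = {
    "u1": "revised pricing agreement with distributor",
    "u2": "holiday party schedule for the office",
    "u3": "server maintenance window tonight",
}

auto, leftover = propagate(categorized, uncategorized, min_similarity=0.6)
# Lowest similarity first: these are the auto-categorized documents most at risk of error.
at_risk = sorted(auto, key=lambda doc_id: auto[doc_id][1])
print(auto)
print("needs manual categorization:", leftover)
print("check first:", at_risk)
```

Reporting the similarity alongside each propagated category is what lets a reviewer spot-check the weakest matches, which is the safeguard the response recommends before iterating.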

Predictive Coding Offering & Overall Process

Company: Valora Technologies, Inc.
Offering: We have numerous offerings here: AutoCoding, AutoIssues, AutoPriv, AutoResponsive, AutoND (NearDupe), AutoETG (ThreadGroup) and a roll-up of the above: AutoReview.
Offered Since: Our first predictive/propagated capability was AutoCoding, first offered in
Overall Process: Valora loads the entire collected population into our system, including any review data already available from previous (typically manual) efforts by reviewers. We build a custom computer-representation of the Document Review Ruleset for each matter. We extract/understand these Rules from three possible places:
1) From a Coding or Review Manual, typically written by the client to train human reviewers.
2) From existing/previously coded data from earlier review efforts. In this case, Valora creates a translation from prior actions taken to the underlying rules that guided those decisions (even if not explicitly stated).
3) From direct conversations with the client, particularly when no existing data or review efforts exist (e.g., starting fresh).
Once established, Valora propagates the Document Review Ruleset uniformly across any already-coded documents. The results are reviewed and corrected for precision and recall (accuracy). Once the results meet the desired threshold, the Ruleset is propagated across the entire population.
Underlying Technology: Valora Technologies, Inc.

Company: Xerox Litigation Services
Offering: CategoriX
Overall Process: CategoriX automatically classifies documents by learning from samples that have been reviewed by knowledgeable case attorneys. CategoriX utilizes attorney-supplied document assessments, together with its own statistical analyses, to create a model that will accurately and consistently generalize the attorneys' assessments across the entire review population. The statistical analysis underlying CategoriX technology is called Probabilistic Latent Semantic Analysis (PLSA). CategoriX leverages PLSA to identify correlations between words and attorney-supplied relevance assessments. This knowledge then informs CategoriX classifications for novel documents going forward.
CategoriX performance depends on the quality of the assessments provided by the attorneys in the training samples. For this reason, several iterations of training and intensive quality control are undertaken during the model-building process to ensure the accuracy and consistency of the training input. Precision and recall are monitored throughout the incremental model-building process to ensure that progress is being made toward our client's performance goals. Once CategoriX models are consistently performing at the desired levels, CategoriX is applied to the entire review population. Finally, one last round of attorney-driven QC sample review is undertaken to validate the quality of the final result set.
The iterative CategoriX approach has several distinct stages and entails a strong consultative partnership between CategoriX technical experts at XLS and the client's attorneys. Nevertheless, a CategoriX-based review can typically be completed in a very short timeframe, as many of the analyses are aided by computers working 24 x
Underlying Technology: Xerox's two research centers, Xerox Research Centre Europe (XRCE) and Xerox Palo Alto Research Center (PARC).
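Both responses above describe checking precision and recall against a desired threshold before results are applied to the whole population. As a minimal sketch of that check (not either vendor's actual method), the Python below compares attorney decisions with model decisions on a QC sample. The function names, the sample data, and the 0.70 goals are illustrative assumptions that do not come from the survey.

```python
def precision_recall(attorney_labels, model_labels):
    """Compare attorney decisions with model decisions on the same QC sample.

    Both arguments are equal-length lists of booleans (True = responsive).
    """
    pairs = list(zip(attorney_labels, model_labels))
    true_pos = sum(1 for a, m in pairs if a and m)
    false_pos = sum(1 for a, m in pairs if not a and m)
    false_neg = sum(1 for a, m in pairs if a and not m)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


def meets_goals(precision, recall, min_precision, min_recall):
    """True when both measures reach the (hypothetical) client targets."""
    return precision >= min_precision and recall >= min_recall


# Hypothetical QC sample of eight documents.
attorney = [True, True, True, False, False, True, False, False]
model = [True, True, False, True, False, True, False, False]

p, r = precision_recall(attorney, model)
print(p, r, meets_goals(p, r, min_precision=0.70, min_recall=0.70))
```

If the goals are not met, the responses indicate that another round of training or rule correction would follow before the check is repeated.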

VII. Identifying Like Records

The survey asked: What general approach is used to identify like records? (Select all that apply)
Custom queries
Statistically based clustering, with no terms inferred, e.g., basing a search or clustering on a document that contains Ford and Toyota would not find or associate documents that only contained the words Chevy and Honda
Statistically based clustering with co-occurring words inferred, e.g., basing a search or clustering on a document that contains Ford and Toyota could find or associate documents that only contained the words Chevy and Honda
Taxonomies
Other (please specify):

These were the responses:

Company | Queries | Clustering (no inf.) | Clustering (w/ inf.) | Taxonomies | Other
Capital Legal Solutions: Yes Yes. Like records are identifiable in various ways. In addition to the above two, we also identify based on document content.
Catalyst Repository Systems: Yes Yes
Equivio: Supervised learning
FTI Technology: Linguistic statistical analysis assesses similarity in documents.
Gallivan Gallivan & O'Melia: Yes Yes
Hot Neuron, LLC: Yes
InterLegis: Yes. Machine Learning based on common threads between documents.
Kroll Ontrack: Classification based technology that assesses document text to determine related documents.
Recommind: Yes Yes Yes
Valora Technologies: Yes
Xerox Litigation Services: CategoriX uses Probabilistic Latent Semantic Analysis to identify correlations between words and attorney-supplied category assessments. From these building blocks, CategoriX assembles models capable of assigning relevance probabilities to novel documents that have not been manually reviewed. CategoriX's probability assignments do not depend on the presence of any specific words or phrases in a document. Instead, each document's score is dictated by the probabilities of the specific combination of words comprising it.
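The Ford/Toyota example in the survey question can be made concrete with a toy sketch. The Python below contrasts matching on shared words only (no terms inferred) with matching after the seed document is expanded by words that co-occur with its terms elsewhere in the collection. The four-document corpus and all function names are hypothetical, and real products use statistical models rather than this simple set expansion.

```python
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    """Lower-case a document and split it into a set of words."""
    return set(text.lower().split())


def cooccurrence(docs):
    """Map each word to the set of words that share a document with it."""
    related = defaultdict(set)
    for doc in docs:
        for a, b in combinations(sorted(tokenize(doc)), 2):
            related[a].add(b)
            related[b].add(a)
    return related


def direct_match(seed, candidate):
    """No terms inferred: alike only if the two documents share a word."""
    return bool(tokenize(seed) & tokenize(candidate))


def inferred_match(seed, candidate, related):
    """Co-occurring words inferred: expand the seed before comparing."""
    expanded = set(tokenize(seed))
    for word in tokenize(seed):
        expanded |= related[word]
    return bool(expanded & tokenize(candidate))


# Toy collection: "toyota" and "honda" appear together in one document,
# which links the Ford/Toyota seed to the Chevy/Honda document.
corpus = ["ford toyota", "toyota honda", "chevy honda", "quarterly budget"]
related = cooccurrence(corpus)

seed, candidate = "ford toyota", "chevy honda"
print(direct_match(seed, candidate))             # False
print(inferred_match(seed, candidate, related))  # True
```

In this sketch the unrelated "quarterly budget" document is not associated with the seed under either approach, because none of its words co-occur with the seed's terms.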

VIII. Threading

Section 3 of the survey asked: Threads. Please explain how threads are handled in conjunction with your offering.
Emails are analyzed individually so that different emails from the same thread can be placed in different groups or clusters
Email threads are identified prior to grouping or clustering so that all emails in a thread or branch of a discussion are placed in the same group or cluster
Other: please explain:

The responses were as follows:

Company | Indiv. Emails | All EM in Thread | Other
Capital Legal Solutions: With our system, there is no boxed-in solution for Thread review. We can and will work with the case team to establish a workflow that will be most efficient per their strategy. If review based on searching is required, for instance, then we can search, get those results, pull in the entire conversation and take that into account. Or if review based on similar or associated documents is the desired first pass, we can do it that way and then account for emails in those threads to be automatically categorized. So we allow flexibility here, as different clients work in different ways, but we can find the efficient way per their work methods.
Catalyst Repository Systems: Predictive Ranking is flexible: Searching is done by document, but analysis and ranking can be done by: (a) individual documents, (b) families of emails and related attachments, (c) threads (optional with Equivio thread processing).
Equivio: Both options are supported. This is a user-defined parameter.
FTI Technology: We can do both depending upon client preference.
Gallivan Gallivan & O'Melia: Yes
Hot Neuron, LLC: Clustify allows you to do it either way.
InterLegis: Yes
Kroll Ontrack: Emails are handled in the Intelligent Prioritization technical solution without additional document type handling. In addition to Intelligent Prioritization, Kroll Ontrack provides threading technology that analyzes emails and presents them to reviewers grouped by conversation, and identifies the earliest and latest emails in each thread.
Recommind: Yes
Valora Technologies: We offer both choices as an option to our customers.
Xerox Litigation Services: CategoriX typically operates on individual emails. However, the XLS review platform incorporates threading technology that could be used to ensure that all members of an email thread would be assigned to the same category, should the client prefer this organization.

IX. Paper-Based Records

Section 3 of the survey asked: Paper-based Records. How are paper-based records treated for predictive coding purposes?
Paper records are scanned and OCR'd and the OCR'd text is included with the ESI for predictive coding
Paper records are scanned and OCR'd and treated as a separate population from ESI for predictive coding
Paper records are not treated with predictive coding
Other (please explain)

The responses were:

Company | Paper w/ESI | Paper Separate | Paper not treated for Predictive Coding | Other
Capital Legal Solutions: Yes
Catalyst Repository Systems: Yes
Equivio: Yes
FTI Technology: Yes
Gallivan Gallivan & O'Melia: Yes
Hot Neuron, LLC: It can be any of the above. It's entirely up to the user to decide whether to put OCR'ed text in the same document set as the ESI, or whether to separate them.
InterLegis: Yes
Kroll Ontrack: Yes
Recommind: Yes
Valora Technologies, Inc.: Yes. Any ESI documents without text are processed like paper (OCR, etc.).
Xerox Litigation Services: Yes

X. Savings

The survey asked: Cost Savings. As compared to a linear review of the same content after duplicate consolidation, after culling based on domain name analysis of emails (e.g., excluding emails from CNNSports.com) and after threading, what percentage of time do you estimate is saved by predictive coding when used to select responsive records?
On average: %
Most observed: %
Least observed: %

The responses were:

Company | Average % Savings | Most % Savings Observed | Least % Savings Observed
Capital Legal Solutions
Catalyst Repository Systems
Equivio (note 1)
FTI Technology
Gallivan Gallivan & O'Melia
Hot Neuron, LLC
InterLegis
Kroll Ontrack
Recommind
Valora Technologies, Inc. **
Xerox Litigation Services
Total Average of Responses (divide by 9): 45.4% | 71.5% | 23.1%

Green shading with a gold star indicates that the respondent provided names and contact information for a client who substantiated the information provided regarding savings. Two stars indicate two references. Providing references was optional for the respondents.

** Equivio Note 1: These percentage savings refer to cases in which the software was successfully trained and used. The software includes a statistical model which monitors the "success" of training. Occasionally, due to poorly-defined issues, inconsistent tagging by the expert, or exceptionally low richness (less than 1%), the statistical model detects and notifies the user that training is ineffective, and in these cases, the results are not used.

** Valora Note: Valora builds a computer-representation of the Document Review Ruleset for each matter as part of Valora's services. In some cases clients have completely forgone a linear review and used the results of the Ruleset instead.

XI. Pricing/Cost

The survey asked: How do you calculate the prices you charge for PC? (Select all that apply)
Per GB, pre-culling
Per GB, post culling
Per GB, post culling and deduping
Per File, pre-culling
Per File, post culling
Per File, post culling and deduping
Hourly consulting fees
Flat Fee per case
Other (Please specify below)

The responses were as follows:

Company | Per GB, pre-cull | Per GB, post cull | Per GB, post cull & dedupe | Per File, pre-cull | Per File, post cull | Per File, post cull & dedupe | Hourly Fees | Flat Fees Per Case | Other | Other Text
Capital LS: Yes Yes Yes
Catalyst RS: Yes
Equivio: Yes Yes. Most customers prefer the per-file pricing model.
FTI Tech.: Yes Yes
Gallivan Gallivan & O'Melia: Yes Yes
Hot Neuron: Yes Yes Yes Yes. We also offer perpetual site licenses with no per-GB fee. Note that our per-GB fees are based on the amount of text, not raw data, which we believe is more fair and economically sensible. Whether the user culls/de-dupes first is up to him/her.
InterLegis: Yes. Per GB fee after culling, and includes all software, technologies, and services such as project management and productions.
Kroll Ontrack: Yes. Free introductory offer.
Recommind: Yes Yes Yes Yes. Enterprise license; SaaS (i.e., per month/quarter/year charge for all volume)
Valora Tech.: Yes Yes Yes Yes Yes. Per page or per paper document.
Xerox LS: Yes. Similar to our processing and review platform pricing, our models are very flexible. Depending on client needs and the complexity and size of the matter, our pricing models can vary from matter to matter.

XII. Incremental Cost of Predictive Coding

The survey asked: What is the incremental cost of providing predictive coding technology above the basic costs of ingesting and deduping electronic records? (Express as a percentage over basic ingesting, deduping and threading.)

These were the responses:

Capital Legal Solutions: 20%
Catalyst Repository Systems: Hourly consulting at $250-$350 per hour
Equivio: Equivio is a software vendor. Processing and hosting services, as referred to in the question, are provided by our e-discovery partners. As such, we are not in a position to respond to this question.
FTI Technology: Acuity is an all-in-one offering from processing through to production, including legal review. The predictive coding feature is included in the fees so there is no additional cost; in fact it offers cost savings.
Gallivan Gallivan & O'Melia: Less than 1/10 of 1%. Since we do not charge for processing time, the only "cost" is the extra time required to process the documents. Not all clients want the delay given the perceived small % gain in time.
Hot Neuron, LLC:
InterLegis: Included in full-suite of services.
Kroll Ontrack: This information is proprietary.
Recommind: Question is unclear
Valora Technologies, Inc.: When Valora performs the ingesting, deduping, etc., there is no incremental cost to perform document tagging of any sort. This includes AutoCoding, AutoReview, etc. When Valora does not perform the preliminary steps, the cost of AutoReview usually runs between 25-50% of typical ESI processing/scanning costs. A better cost comparison is the cost of Predictive/Propagated Coding against the cost of linear review.
Xerox Litigation Services: Because our pricing models are based on client needs and the complexity and size of the matter, incremental costs can vary from matter to matter.

XIII. Sample Sizes

The survey asked: Sampling Non-selected Records. If you use sampling of non-selected records as a way of validating your approach, what size samples do you use and how is that sample size determined?

The responses were:

Company | Sampling
Capital Legal Solutions, Catalyst Repository Systems, Equivio, FTI Technology, Gallivan Gallivan & O'Melia, Hot Neuron, LLC, InterLegis, Kroll Ontrack:

Using statistical random sampling techniques.

Inspection batch sizes can be determined A) by desired % of records; B) by a set number of items; or C) to achieve a degree of accuracy % based on pool size and accuracy level formula

Generally a statistically valid sample with 95% confidence level is used.

Sample size required depends on several variables, including collection richness and size, and the required level of statistical confidence.

Size sample is different for each case, depending on what we're looking for (nonresponsive versus privileged, as an example).

We use accepted statistical methodology (acceptance sampling, statistical sampling) which includes expected responsive rate, confidence level and acceptable error rate.

n/a

Intelligent Prioritization does not utilize sampling of non-selected records as an automated way of validating the technical approach. The system is designed to allow clients to utilize the approach of sampling non-selected documents as a companion validation of the solution if they choose to do so.

Recommind: 10,000

Valora Technologies, Inc.: Valora samples records using random selection from across the entire population. Sample size determination is a function of the size of the population and the accuracy desired.

Xerox Litigation Services: XLS relies on statistical methods developed by our in-house statistician to calculate sound precision and recall estimates for CategoriX results. Our techniques focus on establishing extremely accurate estimates of the rates of relevance (or yields) for the client's categories in the review population as a whole. We ensure that our yield estimates are reliable by selecting random samples for review that are large enough to produce yield estimates with very narrow margins of error according to standard sample size tables. Once stable yield estimates have been established, they provide a reference point from which recall estimates can be calculated following a) the final assessment of categories to documents by CategoriX and b) the establishment of a precision estimate based on direct sampling from the set of documents classified as relevant by CategoriX. Direct sampling in the non-selected records is undertaken only in circumstances where that represents the most efficient option for establishing recall for the final result set. In those cases, the sample size for non-selected records would be dictated by the desired width of the margins of error for the resulting recall estimate.
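Several responses tie sample size to confidence level, population size and acceptable error, and the Xerox response describes deriving recall from a yield estimate and a precision estimate. The Python sketch below shows one common way to do both. It assumes the textbook normal-approximation formula with a finite population correction, which may differ from the tables or formulas the respondents actually use, and the recall function is one reading of the Xerox description. All function names and figures are illustrative.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin, population=None, expected_rate=0.5):
    """Sample size for estimating a proportion (normal approximation).

    confidence    e.g. 0.95 for a 95% confidence level
    margin        acceptable error, e.g. 0.05 for plus or minus 5 points
    population    if given, apply the finite population correction
    expected_rate assumed responsive rate; 0.5 is the most conservative
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = (z ** 2) * expected_rate * (1 - expected_rate) / margin ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


def estimated_recall(population, yield_rate, classified_relevant, precision):
    """Recall inferred from yield and precision estimates.

    Estimated relevant documents in the population = population * yield_rate.
    Estimated relevant documents found = classified_relevant * precision.
    """
    return (classified_relevant * precision) / (population * yield_rate)


# Hypothetical matter: 100,000 documents, 10% yield, 12,000 documents
# classified as relevant, 75% of which a precision sample confirmed.
print(sample_size(0.95, 0.05))
print(sample_size(0.95, 0.05, population=100_000))
print(estimated_recall(100_000, 0.10, 12_000, 0.75))
```

Narrower margins of error raise the required sample quickly because the margin enters the formula squared, which is consistent with the observation above that sample size is dictated by the desired width of the margin of error.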

XIV. Set Up Efforts

The survey asked: Set-up Effort
What level of effort, in terms of time and level of people involved, is required to set up or start a PC review using your offering?
To what extent can efforts expended to start up one review in your system be re-used in other reviews?
To what extent can efforts expended to start up one review in your system be re-used as part of an enterprise-wide information management or retrieval system?

The responses were:

Company | Set Up Effort | Re-Use in Other Reviews | Re-Use in Enterprise System
Capital Legal Solutions, Catalyst Repository Systems, Equivio, FTI Technology:

Dynamic Content Profiling engine is an offering built into our application. However, the time to set up is dependent upon the data set received, as we have to run several processes before we can activate the various features. History shows that we have already been able to work with clients per their time line.

No more than at the start of any typical review. Creation of searches, scoring and initial sampling should be done by associates or higher level attorneys familiar with the case.

Installation and set-up of the software takes about 1-2 hours. For each case, the software needs to be trained by an "expert" (an attorney familiar with the case) in order to estimate the relevance of documents in the specific case. This training process typically takes days

Nothing - it's currently part of the Acuity all-in-one service.

Work flows for executing our Content Intelligence process are identifiable and reusable. However, work flows depend on the case team and their needs. We can streamline the path to take depending on the strategy that the team decides to take. Our review consultants are pretty methodical when it comes to devising the most desirable, defensible, cost-effective review workflow.

Most setup for one can be applied to another case as to review forms, views, folders, subcollections, etc.

The training of Equivio>Relevance is specific per case/issue. If there is overlap in data or issues, the efforts and work product can be reused.

Not currently planned to deploy as such but could envision the use of one document corpus' prognostic scores against other matters or cross matters document profiling.

A default site is created and replicated for an unlimited number of matters.

As above, the training of Equivio>Relevance is specific per case/issue.

Because predictive coding comes with Acuity, clients can realize great efficiencies as FTI becomes familiar with


More information

E-Discovery Tip Sheet

E-Discovery Tip Sheet E-Discovery Tip Sheet A TAR Too Far Here s the buzzword feed for the day: Technology-assisted review (TAR) Computer-assisted review (CAR) Predictive coding Latent semantic analysis Precision Recall The

More information

Reducing the Cost of ediscovery in Today s Economic Climate. risk of ediscovery

Reducing the Cost of ediscovery in Today s Economic Climate. risk of ediscovery Reducing the Cost of ediscovery in Today s Economic Climate Strategies for reducing the cost and Strategies for reducing the cost and risk of ediscovery The 2009 Corporate ediscovery Challenge Corporate

More information

System Administration of Windchill 10.2

System Administration of Windchill 10.2 System Administration of Windchill 10.2 Overview Course Code Course Length TRN-4340-T 3 Days In this course, you will gain an understanding of how to perform routine Windchill system administration tasks,

More information

for Insurance Claims Professionals

for Insurance Claims Professionals A Practical Guide to Understanding ediscovery for Insurance Claims Professionals ediscovery Defined and its Relationship to an Insurance Claim Simply put, ediscovery (or Electronic Discovery) refers to

More information

A Radicati Group Webconference

A Radicati Group Webconference The Radicati Group, Inc. www.radicati.com A Radicati Group Webconference 9:30 am, PT November 3, 2011 The Radicati Group, Inc. Copyright November 2011, Reproduction Prohibited The Radicati Group, Inc.

More information

A Modern Approach for Corporations Facing the Demands of Litigation

A Modern Approach for Corporations Facing the Demands of Litigation A Modern Approach for Corporations Facing the Demands of Litigation The first pure Software-as-a-Service (SaaS) e-discovery technology designed to help in-house legal teams face the increased risk and

More information

Symantec ediscovery Platform, powered by Clearwell

Symantec ediscovery Platform, powered by Clearwell Symantec ediscovery Platform, powered by Clearwell Data Sheet: Archiving and ediscovery The brings transparency and control to the electronic discovery process. From collection to production, our workflow

More information

Considering Third Generation ediscovery? Two Approaches for Evaluating ediscovery Offerings

Considering Third Generation ediscovery? Two Approaches for Evaluating ediscovery Offerings Considering Third Generation ediscovery? Two Approaches for Evaluating ediscovery Offerings Developed by Orange Legal Technologies, Providers of the OneO Discovery Platform. Considering Third Generation

More information

Pros And Cons Of Computer-Assisted Review

Pros And Cons Of Computer-Assisted Review Portfolio Media. Inc. 860 Broadway, 6th Floor New York, NY 10003 www.law360.com Phone: +1 646 783 7100 Fax: +1 646 783 7161 customerservice@law360.com Pros And Cons Of Computer-Assisted Review Law360,

More information

Litigation Solutions. insightful interactive culling. distributed ediscovery processing. powering digital review

Litigation Solutions. insightful interactive culling. distributed ediscovery processing. powering digital review Litigation Solutions insightful interactive culling distributed ediscovery processing powering digital review TECHNOLOGY ASSISTED REVIEW Eclipse combines advanced analytic technology with machine learning

More information

Windchill PDMLink 10.2. Curriculum Guide

Windchill PDMLink 10.2. Curriculum Guide Windchill PDMLink 10.2 Curriculum Guide Live Classroom Curriculum Guide Update to Windchill PDMLink 10.2 from Windchill PDMLink 9.0/9.1 for the End User Introduction to Windchill PDMLink 10.2 for Light

More information

ARCHIVING FOR EXCHANGE 2013

ARCHIVING FOR EXCHANGE 2013 White Paper ARCHIVING FOR EXCHANGE 2013 A Comparison with EMC SourceOne Email Management Abstract Exchange 2013 is the latest release of Microsoft s flagship email application and as such promises to deliver

More information

Veritas ediscovery Platform

Veritas ediscovery Platform TM Veritas ediscovery Platform Overview The is the leading enterprise ediscovery solution that enables enterprises, governments, and law firms to manage legal, regulatory, and investigative matters using

More information

Understanding How Service Providers Charge for ediscovery Services

Understanding How Service Providers Charge for ediscovery Services ediscovery SERVICES Understanding How Service Providers Charge for ediscovery Services The objective of this document is to briefly define the prominent phases of the ediscovery lifecycle, the fees associated

More information

ediscovery Solutions

ediscovery Solutions The Radicati Group, Inc. www.radicati.com ediscovery Solutions A Radicati Group, Inc. Webconference The Radicati Group, Inc. Copyright November 2010, Reproduction Prohibited 9:30 am, PT November 4, 2010

More information

community for use in e-discovery. It is an iterative process involving relevance feedback and

community for use in e-discovery. It is an iterative process involving relevance feedback and Survey of the Use of Predictive Coding in E-Discovery Julie King CSC 570 May 4, 2014 ABSTRACT Predictive coding is the latest and most advanced technology to be accepted by the legal community for use

More information

INDEX. OutIndex Services...2. Collection Assistance...2. ESI Processing & Production Services...2. Computer-Based Language Translation...

INDEX. OutIndex Services...2. Collection Assistance...2. ESI Processing & Production Services...2. Computer-Based Language Translation... SERVICES INDEX OutIndex Services...2 Collection Assistance...2 ESI Processing & Production Services...2 Computer-Based Language Translation...3 OutIndex E-Discovery Deployment & Installation Consulting...3

More information

Data Sheet: Archiving Symantec Enterprise Vault Discovery Accelerator Accelerate e-discovery and simplify review

Data Sheet: Archiving Symantec Enterprise Vault Discovery Accelerator Accelerate e-discovery and simplify review Accelerate e-discovery and simplify review Overview provides IT/Legal liaisons, investigators, lawyers, paralegals and HR professionals the ability to search, preserve and review information across the

More information

How To Manage Records At The Exceeda Agency

How To Manage Records At The Exceeda Agency Table of Contents Table of Contents... i I. Introduction and Background... 1 A. Challenges... 1 B. Vision and Goals... 1 II. Approach and Response... 2 A. Set up a Governance Structure... 2 B. Established

More information

E- Discovery in Criminal Law

E- Discovery in Criminal Law E- Discovery in Criminal Law ! An e-discovery Solution for the Criminal Context Criminal lawyers often lack formal procedures to guide them through preservation, collection and analysis of electronically

More information

For Your ediscovery... Software

For Your ediscovery... Software For Your ediscovery... Software is not enough Leading Provider of Investigatory and Litigation Support Services for Corporations, Government Agencies and Am Law Firms Worldwide Our People Make the Difference

More information

Cost-Effective and Defensible Technology Assisted Review

Cost-Effective and Defensible Technology Assisted Review WHITE PAPER: SYMANTEC TRANSPARENT PREDICTIVE CODING Symantec Transparent Predictive Coding Cost-Effective and Defensible Technology Assisted Review Who should read this paper Predictive coding is one of

More information

INTERNAL REGULATIONS OF THE AUDIT AND COMPLIANCE COMMITEE OF BBVA COLOMBIA

INTERNAL REGULATIONS OF THE AUDIT AND COMPLIANCE COMMITEE OF BBVA COLOMBIA ANNEX 3 INTERNAL REGULATIONS OF THE AUDIT AND COMPLIANCE COMMITEE OF BBVA COLOMBIA (Hereafter referred to as the Committee) 1 INDEX CHAPTER I RULES OF PROCEDURE OF THE BOARD OF DIRECTORS 1 NATURE 3 2.

More information

One Decision Document Review Accelerator. Orange Legal Technologies. OrangeLT.com Info@OrangeLT.com

One Decision Document Review Accelerator. Orange Legal Technologies. OrangeLT.com Info@OrangeLT.com One Decision Document Review Accelerator Orange Legal Technologies OrangeLT.com Info@OrangeLT.com By the Numbers: The Need for Technology in Attorney Review Seventy. Integrated near- duplicate detection

More information

In recent years there has been growing concern with the financial and tax compliance risks associated with

In recent years there has been growing concern with the financial and tax compliance risks associated with Development of Financial Products Business Rules Using Business Intelligence Technology David Macias and Jennifer Li, IRS: Large Business and International Division In recent years there has been growing

More information

Artificial Intelligence and Transactional Law: Automated M&A Due Diligence. By Ben Klaber

Artificial Intelligence and Transactional Law: Automated M&A Due Diligence. By Ben Klaber Artificial Intelligence and Transactional Law: Automated M&A Due Diligence By Ben Klaber Introduction Largely due to the pervasiveness of electronically stored information (ESI) and search and retrieval

More information

The Benefits of. in E-Discovery. How Smart Sampling Can Help Attorneys Reduce Document Review Costs. A white paper from

The Benefits of. in E-Discovery. How Smart Sampling Can Help Attorneys Reduce Document Review Costs. A white paper from The Benefits of Sampling in E-Discovery How Smart Sampling Can Help Attorneys Reduce Document Review Costs A white paper from 615.255.5343 dsi.co 414 Union Street, Suite 1210 Nashville, TN 37219-1771 Table

More information

Pr a c t i c a l Litigator s Br i e f Gu i d e t o Eva l u at i n g Ea r ly Ca s e

Pr a c t i c a l Litigator s Br i e f Gu i d e t o Eva l u at i n g Ea r ly Ca s e Ba k e Offs, De m o s & Kicking t h e Ti r e s: A Pr a c t i c a l Litigator s Br i e f Gu i d e t o Eva l u at i n g Ea r ly Ca s e Assessment So f t wa r e & Search & Review Tools Ronni D. Solomon, King

More information

UNIVERSITY OF NEVADA, LAS VEGAS Master Agreement Agreement No.

UNIVERSITY OF NEVADA, LAS VEGAS Master Agreement Agreement No. UNIVERSITY OF NEVADA, LAS VEGAS Master Agreement Agreement No. This agreement is made effective as of Date (Effective Date), by and between the Board of Regents, Nevada System of Higher Education on behalf

More information

LexisNexis Concordance Evolution Amazing speed plus LAW PreDiscovery and LexisNexis Near Dupe integration

LexisNexis Concordance Evolution Amazing speed plus LAW PreDiscovery and LexisNexis Near Dupe integration LexisNexis Concordance Evolution Amazing speed plus LAW PreDiscovery and LexisNexis Near Dupe integration LexisNexis is committed to developing new and better Concordance Evolution capabilities. All based

More information

Introduction to Windchill Projectlink 10.2

Introduction to Windchill Projectlink 10.2 Introduction to Windchill Projectlink 10.2 Overview Course Code Course Length TRN-4270 1 Day In this course, you will learn how to participate in and manage projects using Windchill ProjectLink 10.2. Emphasis

More information

Review Easy Guide for Administrators. Version 1.0

Review Easy Guide for Administrators. Version 1.0 Review Easy Guide for Administrators Version 1.0 Notice to Users Verve software as a service is a software application that has been developed, copyrighted, and licensed by Kroll Ontrack Inc. Use of the

More information

THE PROPERTY TAX PROTEST PROCESS

THE PROPERTY TAX PROTEST PROCESS THE PROPERTY TAX PROTEST PROCESS A summary of the appeal procedures under the Texas Property Tax Code Presented by: Jason C. Marshall THE MARSHALL FIRM PC 302 N. Market Suite 510 Dallas TX 75202 214.742.4800

More information

Only 1% of that data has preservation requirements Only 5% has regulatory requirements Only 34% is active and useful

Only 1% of that data has preservation requirements Only 5% has regulatory requirements Only 34% is active and useful Page 1 LMG GROUP vs. THE BIG DATA TIDAL WAVE Recognizing that corporations, law firms and government entities are faced with tough questions in today s business climate, LMG Group LLC ( LMG Group ) has

More information

ediscovery Technology That Works for You

ediscovery Technology That Works for You ediscovery Technology That Works for You Peace of Mind for Serious ediscovery ediscovery demands options, and having one of the industry s most comprehensive portfolios of proprietary and best-of-breed

More information

Equivio FAQs. No. Equivio groups documents by similarity or email threads. The documents are grouped together for bulk coding but are not culled out.

Equivio FAQs. No. Equivio groups documents by similarity or email threads. The documents are grouped together for bulk coding but are not culled out. Equivio FAQs Here is some basic information about Equivio for use by Catalyst and our Partners. These will help in responding to queries from clients and prospective clients. 1. What is Equivio? Equivio

More information

IBM ediscovery Identification and Collection

IBM ediscovery Identification and Collection IBM ediscovery Identification and Collection Turning unstructured data into relevant data for intelligent ediscovery Highlights Analyze data in-place with detailed data explorers to gain insight into data

More information

An Open Look at Keyword Search vs. Predictive Analytics

An Open Look at Keyword Search vs. Predictive Analytics 877.557.4273 catalystsecure.com ARTICLE An Open Look at Keyword Search vs. Can Keyword Search Be As Effective as TAR? John Tredennick, Esq. Founder and CEO, Catalyst Repository Systems 2015 Catalyst Repository

More information

Challenges in Legal Electronic Discovery CISML 2011

Challenges in Legal Electronic Discovery CISML 2011 Challenges in Legal Electronic Discovery CISML 2011 this presentation is available: http://tinyurl.com/cat-cisml2011 Dr Jeremy Pickens Sr Applied Research Scientist Likes: Information Retrieval, collaborative

More information

Discovery in the Digital Age: e-discovery Technology Overview. Chuck Rothman, P.Eng Wortzman Nickle Professional Corp.

Discovery in the Digital Age: e-discovery Technology Overview. Chuck Rothman, P.Eng Wortzman Nickle Professional Corp. Discovery in the Digital Age: e-discovery Technology Overview Chuck Rothman, P.Eng Wortzman Nickle Professional Corp. The Ontario e-discovery Institute 2013 Contents 1 Technology Overview... 1 1.1 Introduction...

More information

Contents. April 2015 312.201.8400. www.hbrconsulting.com info@hbrconsulting.com. 2015 HBR CONSULTING. All Rights Reserved.

Contents. April 2015 312.201.8400. www.hbrconsulting.com info@hbrconsulting.com. 2015 HBR CONSULTING. All Rights Reserved. Executive Summary Contents E-Discovery Landscape... 2 Survey Highlights... 5 Strategic Importance... 5 Current Services... 6 Future Services... 7 Revenue Expectations... 8 Spending Focus... 9 E-Discovery

More information

Reduce Cost, Time, and Risk ediscovery and Records Management in SharePoint

Reduce Cost, Time, and Risk ediscovery and Records Management in SharePoint Reduce Cost, Time, and Risk ediscovery and Records Management in SharePoint David Tappan SharePoint Consultant C/D/H davidt@cdh.com Twitter @cdhtweetstech Don Miller Vice President of Sales Concept Searching

More information

Christina Wojcik, VP Legal Services, Seal Software Steven Toole, VP Marketing, Content Analyst Company Jason Voss, Senior Product Manager, TCDi

Christina Wojcik, VP Legal Services, Seal Software Steven Toole, VP Marketing, Content Analyst Company Jason Voss, Senior Product Manager, TCDi FEBRUARY 3 5, 2015 / THE HILTON NEW YORK ML1: Machine Learning Powered Rapid Insight into Big Content: Discovery from Contracts to Patents to Litigation Panelists Christina Wojcik, VP Legal Services, Seal

More information

Highly Efficient ediscovery Using Adaptive Search Criteria and Successive Tagging [TREC 2010]

Highly Efficient ediscovery Using Adaptive Search Criteria and Successive Tagging [TREC 2010] 1. Introduction Highly Efficient ediscovery Using Adaptive Search Criteria and Successive Tagging [TREC 2010] by Ron S. Gutfinger 12/3/2010 1.1. Abstract The most costly component in ediscovery is the

More information

Amazing speed and easy to use designed for large-scale, complex litigation cases

Amazing speed and easy to use designed for large-scale, complex litigation cases Amazing speed and easy to use designed for large-scale, complex litigation cases LexisNexis is committed to developing new and better Concordance Evolution capabilities. All based on feedback from customers

More information

Benefits of using the Indian CST s GPMS Cloud

Benefits of using the Indian CST s GPMS Cloud Benefits of using the Indian CST s GPMS Cloud In-built escalation mechanism where superiors can quickly identify nonconformances and initiate interventions leading preventive delays Empowers Project Management

More information

E-DISCOVERY GUIDELINES. Former Reference: Practice Directive #6 issued September 1, 2009

E-DISCOVERY GUIDELINES. Former Reference: Practice Directive #6 issued September 1, 2009 CIVIL PRACTICE DIRECTIVE #1 REFERENCE: CIV-PD #1 E-DISCOVERY GUIDELINES Former Reference: Practice Directive #6 issued September 1, 2009 Effective: July 1, 2013 Introduction 1. While electronic documents

More information

Q2, 2013. Quarterly Critical Trends Report TCG. Advisory Services & Market Research. 211 East 43 rd Group

Q2, 2013. Quarterly Critical Trends Report TCG. Advisory Services & Market Research. 211 East 43 rd Group Q2, 2013 Quarterly Critical Trends Report TCG Advisory Services & Market Research The Cowen 211 East 43 rd Group Street, Suite 1606 New York 10017 www.cowengroup.com +1 (212) 661 0025 Released 15 July

More information

Information Retrieval for E-Discovery Douglas W. Oard

Information Retrieval for E-Discovery Douglas W. Oard Information Retrieval for E-Discovery Douglas W. Oard College of Information Studies and Institute for Advanced Computer Studies University of Maryland, College Park Joint work with: Mossaab Bagdouri,

More information