On-Demand eDiscovery Processing
Fast, Scalable Processing Capacity to Meet Any Project Need
March 26, 2015
Stu Van Dusen, Lexbe LC
eDiscovery Webinar Series
Info & Future
- Takes place monthly
- Covers a variety of relevant eDiscovery topics
- Presentations available for download by registrants
About Lexbe
Lexbe is an Austin, TX based eDiscovery software and services provider.
Lexbe eDiscovery Platform
The Lexbe eDiscovery Platform is a hosted eDiscovery processing and review tool. Users can load a variety of file types, process them for review, OCR them for search, conduct document reviews and productions, prepare for depositions and analyze transcripts, conduct case analytics, prepare for dispositive motions, and provide litigation support during trial. Per-GB hosting charges won't break the bank, and there are no user fees.
Lexbe eDiscovery Services
Lexbe does large-volume document culling, processing from native to PDF or TIFF, load file creation, high-volume OCR of image files, Rule 26 and project management consulting, and related eDiscovery services. Lexbe is recognized as a 'Top 100' eDiscovery Provider by ComplexDiscovery, a leading electronic discovery and information governance firm.
Lexbe Sales: sales@lexbe.com, (800) 401-7809 x22
Questions & Technical Issues
If you have any questions or technical issues, please e-mail them to webinars@lexbe.com.
Stu Van Dusen Bio
eDiscovery Solutions Consultant at Lexbe LC, advising legal professionals on cloud-based litigation data processing, review and document management software, and litigation support services. Prior business experience in financial services, software, and internet-based businesses.
Education
- MS, Technology Commercialization, University of Texas McCombs School of Business (2013)
- BS, Management, Trinity University (2010)
Contact
Stu Van Dusen, 512-843-7672, svandusen@lexbe.com
Agenda
- eDiscovery processing and the EDRM
- Fast, scalable processing: why it's needed
- Traditional processing workflows
- Balancing processing demands with internal resources
- Cost-efficiently increasing capacity and speed
- Features & benefits of eprocessing+
- Integrated ECA, processing, and review to speed throughput and reduce total costs
- Summary
More ESI Means More Processing Needed
[Chart: Digital Information Created, Captured, Replicated Worldwide, in zettabytes (scale 1 to 4), 2005 to 2015. Labeled sources include VoIP, email, iPhones, peer-to-peer, online storage, digital cameras, Facebook, LinkedIn, DropBox, backup devices, SaaS, PaaS, Skype, and cloud instance servers.]
2.8 zettabytes of information were created and replicated during 2012, a 56% increase from 2011 (IDC).
Source: IDC Digital Universe Study (2012)
* 1 Zettabyte = 1 Trillion Gigabytes
EDRM
Key Questions
- What is eDiscovery processing?
- Why is it needed?
- Why are speed and scalability important?
- What are the cost considerations?
What is eDiscovery Processing?
Aim: Perform actions on ESI to allow for metadata preservation, itemization, normalization of format, and data reduction via selection for review.
Goal: Identify ESI items appropriate for review and production as per project requirements.
Source: http://www.edrm.net/resources/guides/edrm-framework-guides/processing
Why is eDiscovery Processing Needed?
- Support ECA and data reduction
- Standardize all documents into a review format
- Allow load to a review platform
- Create high-quality search indexes
- Prepare data for assisted review, text, and visual analytics
Speed, Scalability, and Cost Considerations
Benefits of Fast Processing:
- Meet tight deadlines
- More time for attorney review (front and back)
- More time for QC
Benefits of Scalability:
- Large collections can break non-scalable systems
- Allows large jobs to be completed quickly
- Lets managers match capacity with need
Underlying Cost Consideration:
- Maintaining a large at-rest capacity is very expensive in hardware, software, and human resources
Traditional Approaches
Local Hardware and Software Installs
- Benefits: Internal control; Known capacity
- Drawbacks: Costly to maintain; Limited capacity and throughput; Litigation expense is a fixed capital cost
External Service Bureaus
- Benefits: Outsourced capacity
- Drawbacks: Can be very slow; Often expensive; Minimal control (linear process)
Dedicated Hardware Terminals
- Benefits: Convenience; Mobility
- Drawbacks: Expensive; Not scalable; Fixed capacity; Still requires IT support
Peaks & Troughs of Demand
[Chart: Internal Processing Need plotted on a Low-to-High scale]
Overspending on Internal Infrastructure
Building internal capacity means purchasing hardware and software licenses and taking on increased staffing and management requirements. When internal processing needs fall, these resources sit dormant but remain expensive.
[Chart: Internal Processing Need against Internal Capacity on a Low-to-High scale, with the gap between them marked as Inactive Resources]
Staffing for Best Fit
Developing internal capacity to meet a consistent need reduces inefficient spend on inactive hardware, software, and staffing. Leverage scalable solutions to meet excess demand.
[Chart: Demand against Internal Capacity on a Low-to-High scale]
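The trade-off between sizing internal capacity for the peak and sizing it for the consistent need can be put into rough numbers. The sketch below is illustrative only: the monthly demand figures and the `plan_capacity` helper are hypothetical and do not come from the presentation.

```python
def plan_capacity(monthly_demand_gb, internal_capacity_gb):
    """Split demand into what exceeds internal capacity (overflow)
    and what internal capacity goes unused (idle), summed over all months."""
    overflow = [max(d - internal_capacity_gb, 0) for d in monthly_demand_gb]
    idle = [max(internal_capacity_gb - d, 0) for d in monthly_demand_gb]
    return {"overflow_gb": sum(overflow), "idle_gb": sum(idle)}

# Hypothetical monthly processing demand in GB (assumed numbers).
demand = [100, 400, 150, 900, 200, 120]

# Option 1: size internal capacity for the single largest month.
peak_sized = plan_capacity(demand, internal_capacity_gb=max(demand))

# Option 2: size internal capacity for the consistent need and
# send the excess to an on-demand service.
baseline_sized = plan_capacity(demand, internal_capacity_gb=150)

print(peak_sized)      # {'overflow_gb': 0, 'idle_gb': 3530}
print(baseline_sized)  # {'overflow_gb': 1050, 'idle_gb': 80}
```

With these assumed figures, peak sizing leaves 3,530 GB of capacity idle over six months, while baseline sizing idles only 80 GB and routes 1,050 GB to scalable capacity.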
External Processing Requirements
- Quality -- Processing needs to deliver usable, complete outputs and load files to support the following stages of discovery.
- Speed and Scalability -- Available capacity needs to meet demand. The faster collections can begin processing, the sooner review can begin.
- Price -- eDiscovery processing pricing needs to be predictable and within budget. Don't pay for more than is needed!
- Integration -- Output data should move smoothly into ECA and litigation review platforms to avoid additional time delays and expenses.
The Lexbe Engine
- ESI collections can be broken into smaller pieces and processed simultaneously in parallel server environments.
- Scalable, proprietary architecture allows for instant access to near-unlimited computing power. This means faster processing: hours and days vs. weeks.
- Supported by the AWS cloud, the best, most reliable, and most secure
[Diagram: Incoming ESI -> Reviewable Documents]
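The idea of breaking a collection into smaller pieces and processing them simultaneously can be sketched in a few lines. This is a minimal illustration and not Lexbe's implementation: `split_into_batches`, `process_batch`, and the use of a local thread pool to stand in for parallel servers are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_batches(items, batch_size):
    """Break a collection into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def process_batch(batch):
    """Hypothetical stand-in for real conversion work (e.g. native to PDF)."""
    return [f"{name}.pdf" for name in batch]

def process_collection(items, batch_size=2, workers=4):
    """Process all batches concurrently and return results in input order."""
    batches = split_into_batches(items, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input batches.
        results = list(pool.map(process_batch, batches))
    return [doc for batch in results for doc in batch]

collection = ["a.msg", "b.docx", "c.xls", "d.pst", "e.txt"]
print(process_collection(collection))
```

Because each batch is independent, adding workers shortens the wall-clock time without changing the output.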
TIFFing Speed Example
Traditional In-House: 1 TB to process to TIFF, 1 month to do it.
- Internal capacity: 10 dedicated servers operating at 2 GB/day each
- 10 servers x 2 GB/day x 30 days = 600 GB
- 600 GB falls short of the 1 TB required, and with any problems the job will certainly not be finished
Lexbe Scalable Capacity: Deploy 500 servers x 2 GB/day = 1,000 GB/day and finish processing in 1 day.
Scalable processing allows for much improved throughput at less cost.
[Diagram: Incoming ESI -> Reviewable or Producible Documents]
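The arithmetic in this example can be checked with a short calculation. The function names are chosen for illustration, and 1 TB is treated as 1,000 GB, as the slide's figures imply.

```python
import math

def capacity_gb(servers, gb_per_server_per_day, days):
    """Total volume a fixed pool of servers can process in the given time."""
    return servers * gb_per_server_per_day * days

def days_to_finish(total_gb, servers, gb_per_server_per_day):
    """Whole days needed to process total_gb, rounded up."""
    return math.ceil(total_gb / (servers * gb_per_server_per_day))

JOB_GB = 1000  # 1 TB, assuming 1 TB = 1,000 GB

in_house_month = capacity_gb(servers=10, gb_per_server_per_day=2, days=30)
in_house_days = days_to_finish(JOB_GB, servers=10, gb_per_server_per_day=2)
scaled_days = days_to_finish(JOB_GB, servers=500, gb_per_server_per_day=2)

print(in_house_month)  # 600 GB processed in-house after 30 days
print(in_house_days)   # 50 days for 10 servers to finish 1 TB
print(scaled_days)     # 1 day for 500 servers to finish 1 TB
```

At the same per-server rate, the in-house pool would need 50 days for the full terabyte, which is why it cannot meet a one-month deadline.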
Complementary Processing Options
Feature | eprocessing+ | High Capacity Processing+
Service Type | OnDemand Self-Service | Service bureau model
Availability | Continuous, 24x7 access | Scheduled for large jobs
Data Transfer | Web interface (HTTPS) | RAR & Secure FTP transfer
Throughput | Most case needs | Extraordinarily large cases or tight timelines
Server Scaling | Reactive (queue managed) | Reserved & deployed for job
Common to both:
- Processing: Culling, Native to PDF, Native to TIFF, Native Extracted, OCR, NearDup Groupings; Assisted Review
- Ingestion: Processing support for hundreds of popular file types; Unicode
- Exports: File formats export as PDF, TIFF or Native Expanded; Multiple endorsing options; Load file support for major review systems (DAT, DII, XLSX, etc.)
- QC & Reporting: Integrated pivot tables; Bi-directional Excel support; Standard exception reporting; Custom reports available
eprocessing+ Workflow
Next Generation Technology Architecture
- Scalable -- Systems architecture allows Lexbe to massively increase server instances as needed to apply more resources to an eDiscovery processing task.
- Fully Automated -- Eliminates the need for you to babysit processing jobs.
- Fault Tolerant -- Processing tasks are not batch-centric, and checkout/check-in procedures ensure individual processing steps operate independently.
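A checkout/check-in procedure of the kind mentioned under Fault Tolerant can be sketched as follows. This is a generic illustration of the pattern and not a description of Lexbe's system: the `TaskQueue` class, the `run` loop, and the retry limit are all assumptions. The point is that one failing item is retried or set aside without stopping the others.

```python
from collections import deque

class TaskQueue:
    """Minimal in-memory queue: a task is checked out, then checked back in."""
    def __init__(self, task_ids):
        self.pending = deque(task_ids)
        self.checked_out = set()
        self.done = []

    def checkout(self):
        """Hand out the next pending task, or None when nothing is pending."""
        if not self.pending:
            return None
        task = self.pending.popleft()
        self.checked_out.add(task)
        return task

    def checkin(self, task, ok, retry=True):
        """Record the outcome; a failed task is re-queued only if retry is True."""
        self.checked_out.remove(task)
        if ok:
            self.done.append(task)
        elif retry:
            self.pending.append(task)

def run(queue, worker, max_attempts=3):
    """Process every task; failures are retried up to max_attempts times."""
    attempts, failed = {}, []
    while (task := queue.checkout()) is not None:
        attempts[task] = attempts.get(task, 0) + 1
        try:
            worker(task)
            queue.checkin(task, ok=True)
        except Exception:
            can_retry = attempts[task] < max_attempts
            queue.checkin(task, ok=False, retry=can_retry)
            if not can_retry:
                failed.append(task)
    return queue.done, failed

def sample_worker(task):
    """Hypothetical worker: one document is corrupt and always fails."""
    if task == "corrupt.doc":
        raise ValueError("cannot convert")

done, failed = run(TaskQueue(["a.msg", "corrupt.doc", "b.xls"]), sample_worker)
print(done, failed)
```

Here the corrupt document ends up in the failed list after three attempts, while the other two documents complete normally.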
Security
Secure Processing Environment -- Lexbe On-Demand processing is powered by Amazon S3 servers to facilitate redundancy and high security standards. All data is strongly encrypted (256-bit) in transit and in place. All data centers are US-based, provide SOC 1 and SOC 2 reports published under SSAE 16 and ISAE 3402 professional standards, and are ISO 27001 certified.
Key Features
- Archive/container decompression
- Full-text indexing
- File repair
- Bates stamping
- Metadata extraction & fielding
- PDF & TIFF creation
- MD5 hash code generation
- Placeholder creation
- System file identification & DeNIST
- Email attachment extraction & parent email association
- Native text extraction
- Native extracted, PDF and TIFF load file generation in multiple formats: XLSX (Lexbe), DAT/OPT (Case Logistix, Concordance, iPro Allegro, Ringtail, kCura Relativity) and DII (Summation), and quality control reports
- OCR of image files
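Two of the features above, MD5 hash code generation and DeNIST, fit together naturally: each file is hashed, and files whose hash appears on a list of known system files are dropped. The sketch below assumes an in-memory dictionary of file contents and a hypothetical `known_system_hashes` set in place of a real reference list; it is not Lexbe's implementation.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a hex string."""
    return hashlib.md5(data).hexdigest()

def denist(files, known_system_hashes):
    """Return {name: md5} for files whose hash is NOT on the known list."""
    kept = {}
    for name, data in files.items():
        digest = md5_hex(data)
        if digest not in known_system_hashes:
            kept[name] = digest
    return kept

# Hypothetical collection: file name -> raw bytes.
files = {
    "memo.txt": b"hello",
    "kernel.dll": b"system file bytes",
}

# Hypothetical known-file list; a real one would be loaded from a reference set.
known_system_hashes = {md5_hex(b"system file bytes")}

kept = denist(files, known_system_hashes)
print(kept)
```

The same hash values can also serve to identify exact duplicates, since identical content produces identical digests.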
Quality Control Tools and Features
- Programmatic batching of processing to individual servers (reduces human error)
- Custom QC flag creation and filtering
- Integration with Excel for reporting and analysis
- Pivot table analysis and charting
- Ability to view all documents including parent containers (email and attachments) together
- Ability to verify image quality
- Filtering and reporting by any captured or calculated fields including failed to convert, words in document, placeholders, etc.
- Native files are extracted and provided for linked load and review
- Statistical sampling and reporting
Benefits of eprocessing+
- Access the fastest on-demand processing available, anytime, anywhere, to increase processing throughput
- Increase responsiveness to changing business demands and deadlines while maintaining high quality
- Grow capacity as demand increases without adding fixed hardware costs and support staff
- Eliminate high-overhead monitoring of processing jobs by using a fully automated system that notifies you when your jobs are done
Benefits, Cont'd.
- Pay one low flat fee per GB ingested with no additional license, user, or startup fees
- Improve the quality of search and analytics through comprehensive dual-index methodology
- Integrate processing, ECA, and review tasks into a single platform; eliminate the time, risk, and cost associated with migrating data between different modules
Dual Indexing
With a Dual-Index approach, the search engine indexes both text extracted from native files (email, attachments, spreadsheets, etc.) and imaged file OCR text (TIFF, JPG or PDF). This most comprehensive approach minimizes the potential for lost and unsearchable data, finds more privileged documents, and improves the accuracy and quality of culling.
Benefits of Dual Index Approach
Index Method | Captures Embedded Text | Captures Text Excluded From Print | Captures Hidden Text
Imaged/OCR | Yes | No | No
Native Extraction | No | Yes | Yes
Dual Index | Yes | Yes | Yes
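The dual-index idea can be illustrated with a toy inverted index that merges terms from both text sources for each document. This is a simplified sketch under stated assumptions, not Lexbe's search engine: the document structure, the `tokenize` rule, and the function names are invented for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_dual_index(docs):
    """Index each document under terms from BOTH its native and OCR text.

    docs: {doc_id: {"native": str, "ocr": str}}
    returns: {term: set of doc_ids}
    """
    index = defaultdict(set)
    for doc_id, texts in docs.items():
        terms = tokenize(texts.get("native", "")) | tokenize(texts.get("ocr", ""))
        for term in terms:
            index[term].add(doc_id)
    return index

def search(index, term):
    """Return the sorted doc_ids containing the term in either text source."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical documents: DOC-001 has hidden text only in the native file;
# DOC-002 is a scan whose text exists only as OCR output.
docs = {
    "DOC-001": {"native": "budget forecast hidden note", "ocr": "budget forecast"},
    "DOC-002": {"native": "", "ocr": "scanned invoice privileged stamp"},
}

index = build_dual_index(docs)
print(search(index, "hidden"))      # ['DOC-001']
print(search(index, "privileged"))  # ['DOC-002']
```

An OCR-only index would miss "hidden" and a native-only index would miss "privileged"; indexing both sources finds each document.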
ECA Integration
- Track incoming ESI using pivot charts and visual graphs.
- Conduct ECA on fully extracted documents to avoid culling key docs.
- Export Early Case Analysis data directly to Excel.
Review Integration
Key Features
- Self-administration
- Dual-index search
- Exact & near-dup ID
- Comprehensive doc review & issue tagging functionality
- Blended productions
- Transcript management
- Timelining, depo prep
- Dispositive motion management
- Trial document management
Summary
- Processing is an essential element of discovery and the foundation of high-quality reviews.
- Traditional processing workflows can be expensive and inefficient.
- Complement internal capacity with scalable and granular solutions.
- You can increase capacity on a per-GB basis with eprocessing+ and High Capacity Processing+.
- eprocessing+ integrates with ECA and review systems to save time and money.
Thank You
Contact Info
- Gene Albert: gene@lexbe.com, (512) 686-3382
- Stu Van Dusen: svandusen@lexbe.com, (512) 843-7672
- Lexbe Sales: sales@lexbe.com, (800) 401-7809 x22
- Webinar Questions: webinars@lexbe.com