FEBRUARY 3-5, 2015 / THE HILTON NEW YORK
Using Artificial Intelligence to Manage Big Data for Litigation
Understanding Artificial Intelligence to:
  Make better decisions
  Improve the process
  Allay the fear
  And keep screen-punching to a minimum
Introduction
Stuart Miles
"Maybe the only significant difference between a really smart [AI] simulation and a human being was the noise they made when you punched them."
  Terry Pratchett and Stephen Baxter, The Long Earth
Introduction
  Define your terms: definitions matter
  Confirm that everyone has the same definition, including other parties and fact finders, when making agreements
  If your counsel or expert cannot explain a concept or its application, start to question the counsel or expertise: if you can't teach an idea, how well do you really know it?
Definitions
  Conventional Artificial Intelligence
    Deductive AI
    Formal and statistical analysis of human behavior
  Computational Intelligence
    Inductive AI
    Development or interactive learning using empirical data
Definitions
  "Big Data"
    Massive amounts of an organization's diverse data, ranging from email messages to file servers, specialized databases, social media, and online business transactions (e.g., Amazon)
    Unstructured data (e.g., emails)
    Structured data (e.g., specialized databases, archives)
Important Big Data Considerations
  Anonymization
  Ownership
  Permissions
  Privacy
  Records Retention
  Security
  Tokenization or Redaction
State of eDiscovery Technology
  Know the current state of eDiscovery technologies
  General, non-AI applications of technology that may be applicable singly or in concert with AI technologies:
    De-NISTing
    De-duplication (Hash)
    Parametric Boolean Key Word searches
    Date Ranges
    Custodial Filtering
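The "De-duplication (Hash)" item above can be illustrated with a minimal sketch. It assumes exact duplicates are identified by hashing raw file content with SHA-256; the slide names only "Hash", so the choice of algorithm, the `deduplicate` helper, and the sample document IDs are all hypothetical.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Return a hex digest that is identical for byte-identical content."""
    # Assumption: SHA-256; the slide does not name a hash algorithm.
    return hashlib.sha256(data).hexdigest()

def deduplicate(documents):
    """documents: iterable of (doc_id, raw_bytes) pairs.

    Returns (unique_ids, duplicate_of), where duplicate_of maps each
    suppressed document to the first document seen with the same hash.
    """
    first_seen = {}      # digest -> first doc_id carrying that digest
    unique_ids = []
    duplicate_of = {}
    for doc_id, data in documents:
        digest = content_hash(data)
        if digest in first_seen:
            duplicate_of[doc_id] = first_seen[digest]
        else:
            first_seen[digest] = doc_id
            unique_ids.append(doc_id)
    return unique_ids, duplicate_of

# Hypothetical sample collection
docs = [
    ("DOC-001", b"Quarterly forecast attached."),
    ("DOC-002", b"Lunch on Friday?"),
    ("DOC-003", b"Quarterly forecast attached."),
]
unique_ids, duplicate_of = deduplicate(docs)
print(unique_ids)    # documents kept for review
print(duplicate_of)  # suppressed duplicates and the copy they match
```

Because a hash matches only byte-identical content, a single changed character yields a different digest, so this approach does not catch near-duplicates.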
State of eDiscovery AI
  Predictive Coding, Assisted Review, and New Technologies have changed the game
  Unsupervised Learning Algorithms use what the data can provide without attorney input:
    Clustering
    Near-Duplicate Detection
    Concept Search
  Other types of input:
    Linguistic Analysis
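As one illustration of the unsupervised techniques listed above, near-duplicate detection can be sketched by comparing overlapping word sequences ("shingles") with Jaccard similarity. This is one common approach, not necessarily what any particular review platform uses; the shingle size, the 0.5 threshold, and the sample texts are assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(docs, threshold=0.5):
    """docs: dict of doc_id -> text. Returns (id1, id2, score) pairs at or above the threshold."""
    ids = list(docs)
    sets = {doc_id: shingles(docs[doc_id]) for doc_id in ids}
    pairs = []
    for x in range(len(ids)):
        for y in range(x + 1, len(ids)):
            score = jaccard(sets[ids[x]], sets[ids[y]])
            if score >= threshold:
                pairs.append((ids[x], ids[y], score))
    return pairs

# Hypothetical sample texts
docs = {
    "A": "please review the attached draft agreement before friday",
    "B": "please review the attached draft agreement before monday",
    "C": "the cafeteria menu changes next week",
}
pairs = near_duplicates(docs)
print(pairs)
```

No attorney input is needed: the grouping comes entirely from the text itself. The pairwise loop is quadratic, so large collections would need a faster candidate-selection step.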
State of eDiscovery AI
  Supervised Learning Algorithms use attorney review in combination with the back-end math:
    Active Learning
    Language Modeling
    Logistic Regression
    LSA & Probabilistic LSA
    Naïve Bayesian Classifiers
    Nearest Neighbor
    Relevance Feedback
    Support Vector Machines
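To make "attorney review in combination with the back-end math" concrete, here is a minimal sketch of one listed technique, a Naïve Bayesian classifier. It assumes a multinomial model over whitespace-separated words with add-one smoothing; the class name, labels, and training texts are hypothetical, and production tools use far richer features.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Toy multinomial Naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocab = set()

    def train(self, text, label):
        """Record one attorney-coded document."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical attorney coding decisions used as training data
model = NaiveBayes()
model.train("merger pricing agreement with supplier", "responsive")
model.train("pricing terms for the merger", "responsive")
model.train("office holiday party schedule", "not responsive")
model.train("parking garage closed friday", "not responsive")

print(model.predict("merger pricing update"))
print(model.predict("holiday party parking"))
```

The attorney supplies the labels; the math generalizes from them to unreviewed documents.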
State of eDiscovery AI
  Active Learning: Learning algorithm that reduces human effort by selecting the most informative data for training
  Language Modeling: Seeks out ideas in context in large data collections rather than using keywords
  Latent Semantic Analysis: Extraction, identification, and categorization of a large set of documents, using statistical analysis to identify meaning based on the contexts in which words appear
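The Active Learning definition above, "selecting the most informative data for training", can be sketched with uncertainty sampling, one common selection strategy (others exist). The sketch assumes a model has already scored each unreviewed document with a probability of relevance; the scores and document IDs are hypothetical.

```python
def select_for_review(scores, batch_size=2):
    """Pick the documents whose predicted relevance is closest to 0.5.

    scores: dict of doc_id -> model-estimated probability of relevance.
    Documents the model is least sure about are assumed to be the most
    informative ones for a human reviewer to code next.
    """
    by_uncertainty = sorted(scores, key=lambda doc_id: abs(scores[doc_id] - 0.5))
    return by_uncertainty[:batch_size]

# Hypothetical model scores for unreviewed documents
scores = {
    "DOC-1": 0.97,
    "DOC-2": 0.52,
    "DOC-3": 0.08,
    "DOC-4": 0.41,
    "DOC-5": 0.75,
}
next_batch = select_for_review(scores)
print(next_batch)
```

Documents scored near 0 or 1 are skipped for now because coding them would tell the model little it does not already predict.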
State of eDiscovery AI
  Support Vector Machines: Machine learning technique for classifying images, text, and other data into groups
    Uses human-classified data ("training") to categorize unknown data based on its resemblance to training data
  Relevance Feedback: In an iterative process, a human user identifies search results as "relevant", and the identified "relevant" information is used as the basis for the next search
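The Relevance Feedback loop described above can be sketched with a simplified Rocchio-style query update, a classic information-retrieval technique swapped in here for illustration; the slide does not name a specific method. The weights `alpha` and `beta`, the word-count vectors, and the sample documents are assumptions.

```python
import math
from collections import Counter

def vectorize(text):
    """Represent text as word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-weight vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query_vec, docs, top_k=2):
    """Rank doc ids by similarity to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, vectorize(docs[d])), reverse=True)
    return ranked[:top_k]

def refine(query_vec, relevant_texts, alpha=1.0, beta=0.75):
    """Shift the query toward documents the reviewer marked relevant."""
    new_query = Counter({t: alpha * w for t, w in query_vec.items()})
    for text in relevant_texts:
        for term, weight in vectorize(text).items():
            new_query[term] += beta * weight / len(relevant_texts)
    return new_query

# Hypothetical collection
docs = {
    "d1": "pricing agreement with the supplier",
    "d2": "supplier rebate schedule and discount terms",
    "d3": "rebate and discount approvals for distributors",
    "d4": "office parking schedule",
}
query = vectorize("pricing supplier")
first_round = search(query, docs, top_k=2)

# The reviewer marks d2 as relevant; its terms feed the next search.
refined = refine(query, [docs["d2"]])
second_round = search(refined, docs, top_k=4)
print(first_round, second_round)
```

After one round of feedback, d3 scores above zero even though it shares no words with the original query, because the reviewer's choice introduced terms such as "rebate" and "discount".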
New Data Sources
  Understand how big data represents new sources and associated challenges for litigation preservation, collection, and use:
    Automated collection
    Database complexity
    "Known" and "unknown" unknowns
    Third-party data mining and related services
Alternative Uses of AI
  Don't limit considerations to eDiscovery; consider other types of AI
  Use of non-TAR and Learning Algorithms for non-production activities:
    Client data analysis
    Early case assessment
    Opposing and third-party productions
    Prior party productions and representations
Alternative Uses of AI
  A* ("A-star")
    Pathfinding algorithm for application to underpinning contentions
    Considerations of mapping point-by-point uses of "found" information to most directly support points for a fact finder
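For readers unfamiliar with A*, the following is a minimal sketch of the algorithm on a small weighted graph. The node names, edge costs, and heuristic estimates are hypothetical and chosen only to echo the idea of finding the most direct route from a piece of "found" information to a contention; how such a graph and its costs would be built for a real matter is outside this sketch.

```python
import heapq

def a_star(graph, heuristic, start, goal):
    """Return (path, cost) for the cheapest route from start to goal.

    graph: node -> list of (neighbor, step_cost)
    heuristic: node -> estimate of remaining cost to the goal; it must
    never overestimate for the result to be optimal.
    """
    # Heap entries: (cost so far + estimate, cost so far, node, path taken)
    open_heap = [(heuristic[start], 0, start, [start])]
    best_cost = {start: 0}
    while open_heap:
        _, cost, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, cost
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry; a cheaper route to this node was found
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                heapq.heappush(
                    open_heap,
                    (new_cost + heuristic[neighbor], new_cost, neighbor, path + [neighbor]),
                )
    return None, float("inf")

# Hypothetical graph: lower cost means a more direct supporting link
graph = {
    "document": [("email", 1), ("memo", 4)],
    "email": [("testimony", 5), ("memo", 1)],
    "memo": [("contention", 3)],
    "testimony": [("contention", 1)],
    "contention": [],
}
heuristic = {"document": 4, "email": 3, "memo": 2, "testimony": 1, "contention": 0}

path, cost = a_star(graph, heuristic, "document", "contention")
print(path, cost)
```

The heuristic is what distinguishes A* from a plain shortest-path search: it steers exploration toward the goal so fewer nodes need to be examined.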
Alternative Uses of AI
  Neural Networks
    Modeled on the brain; weighs connection strength
    Applies to handwriting analysis and better OCR
    Applications for speech recognition in matters involving financial institutions or caches of like data
    Some even theorize that they might be applied to basic legal analysis
      D. Hunter, Looking for Law in all the Wrong Places: Legal Theory and Legal Neural Networks, in: A. Soeteman (ed.), Legal knowledge based systems JURIX 94: The Foundation for Legal Knowledge Systems, Lelystad: Koninklijke Vermande, 1994, pp. 55-64.
Alternative Uses of AI
  Genetic Algorithms
    A population of candidate solutions to an optimization problem evolves toward better solutions
    Applications for settlement considerations
    Pattern litigation considerations
    Some even theorize that they might be applied to basic legal analysis
      A.S. Pannu, Using Genetic Algorithms to Inductively Reason with Cases in the Legal Domain, Intelligent Systems Program, University of Pittsburgh, in: Proceedings of the Fifth International Conference on Artificial Intelligence and Law (ICAIL-95)
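The idea that "a population of candidate solutions ... evolves toward better solutions" can be sketched with a toy genetic algorithm. The fitness function here simply counts 1-bits in a bit string (a standard teaching problem), standing in for whatever a real settlement or pattern-litigation model would score; all parameter values and names are assumptions.

```python
import random

def fitness(bits):
    """Toy objective: more 1-bits is better. A real application would
    substitute its own scoring function here."""
    return sum(bits)

def evolve(length=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Evolve bit strings; returns (best_candidate, best_fitness_per_generation)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = [max(map(fitness, population))]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [population[0][:]]            # elitism: keep the current best unchanged
        parents = population[:pop_size // 2]     # selection: fitter half may reproduce
        while len(next_gen) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randint(1, length - 1)     # single-point crossover
            child = mom[:cut] + dad[cut:]
            # mutation: flip each bit with a small probability
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
        history.append(max(map(fitness, population)))
    return max(population, key=fitness), history

best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the best candidate is carried forward unchanged each generation, the best fitness never decreases; whether it reaches the maximum depends on the parameters and the random seed.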
Miscellaneous
  AI is not replacing Subject Matter Experts
    Despite reliance on AI and associated technologies in many other areas
    Despite scholarship in this direction
  Future Considerations within AI
    Quantum computing
    Outputs from AI as evidence
    Pattern recognition that indicates a "truth" that doesn't exist
Glossary
  Anonymization
  A* ("A-star")
  Big Data
  Computational Intelligence
  Conventional Artificial Intelligence
  De-NISTing
  Deductive AI
  Genetic Algorithms
  Hash
  Inductive AI
  Language Modeling
  Latent Semantic Analysis ("LSA")
  Logistic Regression
  Naïve Bayesian Classifiers
  Neural Networks
  Parametric Boolean
  Permissions
  Relevance Feedback
  Structured data
  Support Vector Machines
  Tokenization
  Unstructured data
  Unsupervised Learning Algorithms