Nuix Forensic Focus 2014 Webinar
Accelerating investigations using advanced ediscovery techniques
6th March 2014
All rights reserved 2014. Nuix Software
ABOUT THE PRESENTERS
Paul Slater - Director of Forensic Solutions (EMEA)
- Over 20 years of investigation experience as an adviser within the law enforcement, government, financial and commercial sectors
- Originally a detective within the Greater Manchester Police
- Spent seven years as a computer forensic investigator
- Since 2003, a digital forensics adviser to legal, corporate and government clients
- Led UK forensic technology teams at PwC and Deloitte
- Spent two years at the UK Serious Fraud Office as Interim Head of the Digital Forensics Unit and forensic technology consultant
- Member of the Review Board for the Association of Chief Police Officers Good Practice Guide for Computer-based Electronic Evidence
ABOUT THE PRESENTERS
Ady Cassidy - Director of Investigation Consultancy (Global)
- Forensic investigator and ediscovery consultant with more than 10 years' experience as a computer forensic analyst
- Former police officer with Suffolk Constabulary High Tech Crime Unit
- Previously Managing Consultant with 7Safe London, responsible for managing the London-based ediscovery team deploying end-to-end forensic and ediscovery services
TODAY'S AGENDA
- Improving the efficiency of digital investigations
- Where is the key evidence found in most cases?
- What lessons can we learn from other disciplines?
- Advanced workflows
  - Near duplicates
  - Named entities
  - Visual analytics
ACCELERATING INVESTIGATIONS
- We need to improve the efficiency of our digital investigations
- The key challenge: finding the truth in ever larger, more varied and increasingly complex stores of electronic evidence
- How can we zero in on critical data and only use time-consuming data forensics analysis on this data?
- ediscovery methodologies and techniques can help
DIGITAL FORENSICS VS EDISCOVERY
Traditionally digital forensics and ediscovery have been considered two distinct professions dealing with digital evidence in different ways
"Digital forensics encompasses the entire universe of data stored on a hard disk drive, whereas ediscovery usually only focuses on a smaller grouping of data stored on the drive."
Phillip Rodokanakis, Certified Fraud Examiner (2011)
DIGITAL FORENSICS VS EDISCOVERY
"Digital forensics investigates everything, including deleted files or remnants from former files that have been partially overwritten. A forensic examiner must pay particular attention to certain operating system and log files, temporary files and the file remnants found in unallocated clusters."
Phillip Rodokanakis, Certified Fraud Examiner
DIGITAL FORENSICS VS EDISCOVERY
"Whereas ediscovery filters out program, temporary and system files, and processes only active user accessible files. This usually involves Microsoft or other office suite files and emails. These types of files are then processed in an ediscovery engine, where they are indexed and catalogued, and then usually loaded into a Litigation Support Platform."
Phillip Rodokanakis, Certified Fraud Examiner
HIDDEN IN PLAIN SIGHT
However, in many investigations, the key evidence is more often found hidden in plain sight
- In communications such as emails, SMS messages or chat logs
- In images and videos
- In documents and files
...rather than as a result of performing deep forensic analysis
ACCELERATING INVESTIGATIONS
Nowadays:
- Digital forensics and ediscovery share many processes, tools and workflows
- As the volume of data and the variety and complexity of storage devices increase, investigators must be able to quickly identify potentially relevant material for analysis
- By applying ediscovery-like workflows such as content-based forensic triage, investigators can use their digital forensics skills to dig deep into the likeliest data sources
- In this way, they avoid spending countless hours forensically analysing irrelevant material
FINDING A NEEDLE IN A DIGITAL HAYSTACK
Consider the typical digital forensic investigation
- The growing volume of data has stretched traditional forensic tools to capacity; it has become more difficult to examine all data sources
- Investigators may take arbitrary decisions as to which evidence sources they analyse first, or whether they examine them at all
LESSONS FROM EDISCOVERY
What lessons can we learn from ediscovery, which typically has even larger volumes of digital evidence than forensic investigations?
Three workflows that can aid an investigator in processing, analysis, reporting and the decision process throughout an investigation:
- Near-duplicates
- Named entities
- Visual analytics
Near-duplicates
WHAT IS A NEAR-DUPLICATE?
- The most common method of identifying duplicates is to perform a cryptographic hash on the contents of each file
- This only works for exact duplicates
- What happens when documents are visually identical?
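The limitation of hash-based duplicate detection can be shown in a few lines. This is a minimal sketch, not taken from the webinar: the sample byte strings and the helper name `md5_hex` are illustrative assumptions, and MD5 stands in for any cryptographic hash.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string (hypothetical helper)."""
    return hashlib.md5(data).hexdigest()


# Illustrative content only; real cases hash whole files.
original = b"The quarterly report is attached."
exact_copy = b"The quarterly report is attached."
near_copy = b"The quarterly report is attached!"  # one character differs

# Exact duplicates produce the same digest.
exact_match = md5_hex(original) == md5_hex(exact_copy)

# Any difference in the bytes, however small, produces a different digest,
# so the hash says nothing about how similar the two items are.
near_match = md5_hex(original) == md5_hex(near_copy)
```

Here `exact_match` is true and `near_match` is false, which is why the same text saved in two file formats will not be linked by hash alone.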
IDENTICAL DOCUMENTS?
[Figure: a document shown in Microsoft Word and in Adobe PDF]
IDENTICAL DOCUMENTS?
[Figure: a document shown in Microsoft Word and in Adobe PDF]
MD5 match? X (no match)
SHINGLES
- Near-duplicate technology extracts and hashes multiple overlapping phrases of around four or five words each
- This technology is called w-shingling or shingles
- Identifies and extracts the text in each file
- Removes superfluous characters, leaving letters and digits
- Converts to lower case
- Splits text into tokens (overlapping groups of words) to build shingles
SHINGLES
- We can then compare the sets of shingles to establish if documents contain the same text
- Uses the Jaccard similarity algorithm, a statistical method for comparing the similarity and diversity of sample sets
- Sometimes we can compare apples and oranges
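The shingling and comparison steps can be sketched as follows. This is an illustration under stated assumptions, not the Nuix implementation: the function names `shingles` and `jaccard` and the sample sentences are hypothetical, tokens are restricted to ASCII letters and digits, and the shingles are compared as plain strings rather than hashed. The Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|.

```python
import re


def shingles(text, w=4):
    """Return the set of overlapping w-word shingles for text."""
    # Lower-case the text and keep only runs of letters and digits as words.
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < w:
        # Text shorter than one shingle: treat it as a single shingle.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


doc_a = "The quick brown fox jumps over the lazy dog"
doc_b = "The quick brown fox leaps over the lazy dog"

# With w = 2 each document yields 8 shingles; 6 are shared and 10 are distinct overall.
similarity = jaccard(shingles(doc_a, w=2), shingles(doc_b, w=2))

# Punctuation, case and spacing differences do not change the shingle set.
same_text = shingles("Hello, World!  Hello", w=2) == shingles("hello world hello", w=2)
```

Because only the normalised text is compared, two files in different formats that extract to the same text score 1.0, while edited versions score somewhere between 0 and 1.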
SHINGLES
[Figure: shingling example where w = 2]
IDENTICAL DOCUMENTS
SIMILAR DOCUMENTS
- Using 'shingles' we can also identify items that contain similar text and calculate just how similar they are
- It can show us how a document has evolved over time, such as previous versions of a Word document that are stored in Volume Shadow Copies
- It can also help link fragments of documents recovered from unallocated space with documents present within the live data set
SIMILAR DOCUMENTS
SIMILAR DOCUMENTS
- Finding and grouping similar documents is a very powerful way to increase the efficiency of an investigation
- Allows investigators to focus on the key items or evidence sources within a case
- Once we have identified items that are relevant (or definitely irrelevant), near-duplicate analysis can quickly find similar items for investigation
INCREASING SEARCH EFFICIENCY
- After we have extracted a list of shingles, we can search within the shingles for particular keywords and review each keyword hit in the context of the surrounding text
- If we are searching for 'mouse', this helps us to avoid files containing non-relevant phrases such as 'how the mouse buttons work'
...this isn't the mouse I was looking for
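Reviewing keyword hits in context can be sketched as a filter over the shingle list. This is a hedged illustration only: the function name `keyword_contexts`, the sample text and the choice of w = 3 are assumptions made for the example, not details from the webinar.

```python
import re


def keyword_contexts(text, keyword, w=5):
    """Return the distinct w-word shingles of text that contain keyword, in document order."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    keyword = keyword.lower()
    hits = []
    for i in range(len(words) - w + 1):
        window = words[i:i + w]
        if keyword in window:
            phrase = " ".join(window)
            if phrase not in hits:
                hits.append(phrase)
    return hits


# Illustrative text containing one irrelevant and one potentially relevant use of the keyword.
text = "Click to see how the mouse buttons work. A mouse ran across the floor."
contexts = keyword_contexts(text, "mouse", w=3)
```

A reviewer scanning `contexts` sees phrases such as "the mouse buttons" next to "a mouse ran" and can discard the irrelevant sense of the keyword without opening the file.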
INCREASING SEARCH EFFICIENCY
USING NEAR-DUPLICATES TO LINK ARTEFACTS
Named entities
NAMED ENTITIES
Because we have already indexed the file content and its metadata as part of our workflow, we can automatically and intelligently search for certain types of named entities such as:
- Companies
- Credit cards
- Email and IP addresses
- Monetary values
- Passport/ID information
...using REGular EXpressions
NAMED ENTITIES
- Named entities follow standard regex syntax, so investigators can quickly build their own and re-use these on other investigations
- Nuix can automatically filter named entities to quickly identify responsive material
- Cross-referencing this intelligence across all available evidence rapidly reveals relationships between people and entities
- Named entities can be added to workflows, for example identifying all emails containing credit card numbers sent to a country on a hot register for fraud or corruption
NAMED ENTITIES
- Nuix extracts intelligence from the content of all items while the data is being processed and indexed
- Investigators can quickly assess content at the start of an investigation
- Investigators can easily add their own regex files to the library

Sample regex file matching monetary values:
# File containing regular expressions for money.
# Matches US and European formats with a leading dollar, pound or euro sign.
^[A-Z]{0,3}[$\u00A3\u20ac](?:0|[1-9]\d{0,2}(?:[.,]?\d{3}){0,10})[,.]?\d{0,2}(\b|_)

Entity      Description
Company     Displays results related to company names
Email       Displays results related to email addresses
Money       Displays results related to currencies
Country     Displays results related to country
IP Address  Displays results related to IP Addresses
URL         Displays results related to URLs
Custom      User defined regular expressions
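The idea of regex-driven entity extraction can be sketched in Python. The patterns below are deliberately simple and purely illustrative: they are assumptions written for this example, not the patterns shipped with Nuix, and the names `ENTITY_PATTERNS` and `extract_entities` are hypothetical.

```python
import re

# Illustrative patterns only; production patterns need far more care.
ENTITY_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ip_address": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    # Dollar, pound or euro sign followed by an amount with comma thousands separators.
    "money": re.compile(r"[$\u00A3\u20AC]\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
}


def extract_entities(text):
    """Return a dict mapping each entity type to its sorted unique matches in text."""
    return {
        name: sorted(set(pattern.findall(text)))
        for name, pattern in ENTITY_PATTERNS.items()
    }


sample = (
    "Pay \u00A31,250.00 to bob@example.com from 192.168.0.10, "
    "then $40 to bob@example.com."
)
entities = extract_entities(sample)
```

Keeping the patterns in a dictionary keyed by entity type mirrors the idea of a library of regex files: adding a custom entity is a matter of adding one more entry.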
NAMED ENTITIES
- Extracted named entities are available in the investigation workbench as soon as indexing is complete
- Investigators can immediately assess the content of the dataset for associated intelligence items
- Entities are also available from the filtered items menu, allowing the analyst to isolate the values to search against
Visual analytics
VISUAL ANALYTICS
Q. What's [missing] in your data?
VISUAL ANALYTICS
- ediscovery methodologies such as early case assessment give us a powerful, visual insight into our data
- Having the ability to represent data visually can open up our understanding of the data
- To enable less technical people to quickly gain insight, we can use tools such as:
  - Email gap analysis
  - Timelines of deleted file activity
  - Links between people, devices and places
EMAIL GAP ANALYSIS
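The slide presents email gap analysis visually. As a rough sketch of one way such gaps could be computed, assuming a gap means a run of calendar days with no email at all, the function below scans the sent dates of a mailbox. The name `find_gaps`, the `min_gap_days` threshold and the sample dates are all hypothetical.

```python
from datetime import date


def find_gaps(sent_dates, min_gap_days=3):
    """Return (last_seen, next_seen, silent_days) for each run of at least min_gap_days silent days."""
    days = sorted(set(sent_dates))
    gaps = []
    for earlier, later in zip(days, days[1:]):
        # Number of whole days strictly between two days that have email.
        silent_days = (later - earlier).days - 1
        if silent_days >= min_gap_days:
            gaps.append((earlier, later, silent_days))
    return gaps


# Illustrative data: daily email, then nothing between 3 March and 10 March.
dates = [
    date(2014, 3, 1),
    date(2014, 3, 2),
    date(2014, 3, 3),
    date(2014, 3, 10),
    date(2014, 3, 11),
]
gaps = find_gaps(dates)
```

A gap found this way is only a prompt for further questions, such as whether mail was deleted or a source was never collected; weekends and holidays would need to be allowed for in practice.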
USER ACTIVITY
TIMELINES FOR ANY PURPOSE
VISUALISING GEO-TAGGED ITEMS
EMAIL REVIEW
EMAIL REVIEW
EVENT MAP
In summary
SUMMARY
- Using ediscovery techniques in digital investigations allows us to increase the focus and efficiency of the investigative process
- This allows the digital forensic investigator to see the bigger picture, not just the individual parts, and to focus their time on looking for the zipped, attached, filed, emailed, deleted needle in the digital haystack
Forensic Focus Forum Discussion
www.forensicfocus.com/forums/viewtopic/t=11555/
Presenters
Paul.Slater@nuix.com
Ady.Cassidy@nuix.com
nuix.com/investigations