S20 - Applying Information Governance to Computer Storage Information (CSI)
Tue, May 20, 2014 2:00 PM

Brian Tuemmler
IG Program Architect, Nuix
Brian.Tuemmler@nuix.com

All rights reserved 2012. Nuix Software

AGENDA
- What is IG
- What unstructured content is likely on shared drives
- What your content tells you
- Content and context for classification
- Taking action

INFORMATION GOVERNANCE
Records Management Compliance Information Technologies Operations ECRM Legal Retention Categories Regulation Stakeholders Information Policies Needs and Issues Enterprise Metadata Preservation Retained Records Risk Mitigation Requirements Information Management Improved Productivity Minimized Migration Effort Reduced Litigation Costs Benefits

2014 Managing Electronic Records Conference 20.1
WHAT WE'RE HEARING
- eDiscovery: Not another discovery request! Searching our email archives is brutal. It takes forever, lots of vital staff resources, and I never know if it's going to work or not.
- Fraud: Fraud is occurring every single day. I need a better way to analyze all my data to find patterns.
- Information Management: Knowledge workers just don't classify documents. They use email, file shares and their home drives to store official documents. If only I could easily find those records.
- Privacy: We have all kinds of health records, credit card data and PII. There are just so many places to look: email, SharePoint, file shares...
- Storage Optimization: Between legal holds and the explosive growth of email and files, we are fighting a losing battle. Our budget is slashed and I have got to find a way to mitigate risks while cutting costs on eDiscovery and storage.
General Counsel, Compliance Officer, Records Manager, Chief Privacy Officer, CIO

INFORMATION GOVERNANCE
Analyzing, cleaning, classifying, and organizing content on network shared drives to improve information management:
- Findability: searching, sorting
- Defensibility
- Storage & disaster recovery
- Retention
- Categorization
What this involves:
- Understand what information is managed by the organization
- Identify ways to improve that management through policies and communication, applying best practices, and adding or changing existing technologies
- Clean and restructure information to meet business requirements

APPROACH
To govern unstructured data to reduce cost and risk, and add value, using a repeatable four-stage process:
- Identify and inventory content to make sense of murky pools of dark unstructured data
- Understand the age, ownership, format and content of each item
- Classify content based on the facts in context to determine whether information is an asset or a liability
- Act and execute on governance decisions: optimize storage systems; classify, migrate and protect data; or make it more readily available to the business
ENTERPRISE KNOWLEDGE
Data SAP Case Mgt GIS Wikis & Blogs Tech Sup Dev ECRM SharePoint Other Shared Drives M:\ Hard Copy Central Off site Desks

IDENTIFY
Finance, Marketing, HR, IT
NAS02006, NAS01772, NAS11266

RANGE OF ACTIONS TO TAKE
- Stuff I want to migrate and control in ECRM
- Stuff I want to schedule the retention of
- Stuff I want to keep
- Stuff I can't inventory, but can't get rid of
- Stuff I need to ask someone's permission
- Stuff somebody else owns that is abandoned
- Stuff I need to create a policy before I delete
- Stuff I can delete now
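
The Identify stage starts from an inventory of what is actually on the shares. A minimal sketch of such an inventory, assuming a share that is reachable as an ordinary directory; the function names `inventory` and `summarize_by_extension` are hypothetical and not part of any Nuix product:

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def inventory(root):
    """Walk a directory tree and return one record per file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                # Items that cannot be read are still listed for follow-up.
                records.append({"path": str(path), "error": True})
                continue
            records.append({
                "path": str(path),
                "extension": path.suffix.lower(),
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
                "error": False,
            })
    return records


def summarize_by_extension(records):
    """Return {extension: {"count": n, "bytes": total}} for readable files."""
    summary = {}
    for rec in records:
        if rec["error"]:
            continue
        entry = summary.setdefault(rec["extension"], {"count": 0, "bytes": 0})
        entry["count"] += 1
        entry["bytes"] += rec["size_bytes"]
    return summary
```

Keeping unreadable items in the list rather than skipping them mirrors the "can't inventory, but can't get rid of" bucket above.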
UNDERSTAND

CONTENT ASSESSMENT
- Are we in compliance?
  - Acceptable use
  - Risk
  - Regulatory
  - Records Management
  - Best practices
- Are we managing effectively?
- Are we retaining efficiently?
- How long will the XYZ program take?
- How much benefit will we see?
- What do we need to plan for?

CLASSIFY
- Content value - Good, bad, and ugly
- Retention categories
- Information Security
- Projects, cases, clients
- Content types and metadata
CLASSIFY AGAINST WHAT?
Content
- Text (capitals, punctuation, stop words, stemming)
- Summary
- Number patterns
- Similarity
Context
- Filename, path, extension
- Dates
- File attributes
- Derived metadata

CONTENT CATEGORIES FINDINGS WHERE CAN WE TAKE ACTION?
Temporary Backups Zero content Remove Identified garbage - "To be deleted" Photos, identified low value Policy deletes Duplicates Review Database Application Webcontent unitized record Relocate Recent Voluminous Important Migratable Retain

PRIORITIZE CLASSIFICATION EFFORTS
10 categories = 80% of expired records

Row Labels  Event  Retention  Expired  2012  2011  2010  2009  2008  2007
UNV2020 Retain drafts and working files no longer than completion of the final ve 2 17,702 8,343 9,483 7,362 4,771 2,180 714
UHN3040 7 Years 7 9,792 12,789 13,507 9,393 8,949 4,499 1,511 3,118 1,587 80% 72
PRC4000 No Longer than 3 Years 3 7,986 1,462 2,406 1,862 951
AFP9900 No Longer than 5 years; recommended retention of up to 3 years 5 6,101 114 99 67 34 8
LCR2010 Contract Execution + 10 Years 5 10 3,836 9,970 5,317 882 7,317 2,626 3,786 12,515 4,418 2,281 compliance 491 616 1,248
SMK1000 Last Authorized Use + 6 Years 3 6 3,325 7,922 1,588 664
AFP8000 No Longer than 5 Years 5 2,288 315 574 461
LCR7000 Expiration + 6 Years 5 6 2,140 255 142 34 254 295 65
LCR3200 Resolution + 10 Years 2 10 1,906 3,947 4,317 9,496 970 1,346 330
OPS2000 10 Years 10 1,598 6,904 8,746 8,938 7,429 9,795 923
ITS1300 Until system is no longer relevant to existing data 3 1,248 773 587 197 389 421 217
OPS1300 10 Years 10 1,111 592 727 1,061 863 408 290
HPI5300 Date of Use + 6 years 1 6 992 1,186 1,444 1,165 2,722 3,045 2,626
AFP1000 10 Years 10 988 2,657 3,439 1,650 3,417 21,644 931
ITS3210 90 days to 3 Years 3 957 1,411 914 899 548 301 75
AFP7000 10 Years 10 714 453 1,208 1,197 137 405 51
OPS1200 6 Years 6 701 3,018 2,092 1,004 1,156 1,430 2,420
HMC1040 U tilu d t d 695 983 404 22 285 130 18 3
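
The "10 categories = 80% of expired records" observation is a cumulative-share ranking: sort retention categories by expired volume and count how many are needed to reach a target share. A minimal sketch, assuming the expired counts are already available per category; the function name and the category codes in the usage example are made up for illustration and are not taken from the table:

```python
def categories_for_share(expired_by_category, target_share=0.80):
    """Return the smallest set of top categories covering `target_share`.

    expired_by_category maps a retention category code to its number of
    expired records. Categories are taken largest first.
    """
    total = sum(expired_by_category.values())
    if total == 0:
        return []
    ranked = sorted(expired_by_category.items(),
                    key=lambda item: item[1], reverse=True)
    selected = []
    running = 0
    for code, count in ranked:
        selected.append(code)
        running += count
        if running / total >= target_share:
            break
    return selected


# Hypothetical counts, for illustration only.
example = {"CAT-A": 500, "CAT-B": 350, "CAT-C": 100, "CAT-D": 50}
print(categories_for_share(example))        # ['CAT-A', 'CAT-B']
print(categories_for_share(example, 0.90))  # ['CAT-A', 'CAT-B', 'CAT-C']
```

The categories returned are the ones where classification effort pays off first.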
ACT
- Clean
- Migrate documents to ECRM
- Register documents in ECRM
- Register documents in federated RM (MIP)
- Purge in place
- Move
- Hide
- Strategize and design IG programs

ACTION - STRATEGY
Assessment to prepare for critical business-driven transformation
- Evaluate the value and risks that may result from a merger, divestiture, or re-organization
- Help groom the organization for a transformation by identifying opportunities to improve information management and value
- Develop a plan to minimize negative operational or cultural change requirements and to restructure content accordingly
Your content can tell you
- Active file ownership to identify thought leaders
- Volume of modified or accessed content to identify level of decision documentation reuse
- Level of intellectual property control and organizations

EXPERTISE
Perception: Alan and Julia have the most experience writing contracts
Contracts and Proposals: Bob, Julia, Frank, Alice, Debra, Alan, Ibrahim
Fact: Frank appears to have contributed the most content
ACTION - LITIGATION CLEANUP
Assessment to reduce future litigation costs
- Recommendations for establishing governance, identifying standards, change management and technology requirements necessary to ensure a comprehensive approach
- Identifying types and value of culling eTrash, including duplicates, to reduce the volume of content to be delivered in an eDiscovery review
- Develop information structures and practices to make finding and producing content easier
- Expected costs and benefits from recommendations
Your content can tell you
- Term/subject specific content ownership
- Types and volume of non-content (applications, install files, other client driven definitions)
- Key product (or other word list) file volume and location
- Key term consistency adherence (comparison of the use of key terms, such as product names, supply chain participants, or other key litigation entities, with possible misspelling, acronyms, and nicknames)
- ESI Data Map (content function to share cross tab)

FORMATS OVER TIME
Perception: Our content consists of standard office content
[Chart: y-axis 0 to 3500; x-axis years 1990 to 2007]
Fact: Large collections of non-content can appear in single events.

ACTION - MIGRATE
Assessment for Enterprise Records and Content Management (ECRM) planning
- Strategy development to identify library structures, required technologies, storage planning, security needs and other design elements for ECRM
- Identification of the general categories of information that can be managed or migrated directly, should be reviewed prior to migration, or should NOT be migrated and where it should go, and what should be deleted
- Identifying and adding content types and metadata for sets of content worthy of capture and organization
What your content can tell you
- Cross departmental location of duplicates
- Size, volume and growth of files
- Migratable and blocked content
- Content type and metadata analysis (metadata standards for status, date, version, classification or workgroup level details)
- Template identifiers
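
"Cross departmental location of duplicates" can be approximated by hashing file content and grouping identical digests. A minimal sketch, assuming each top-level folder under the scanned root stands for one department (an assumption about the share layout, not something the slides specify); all function names are hypothetical:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's content, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return {digest: [paths]} for content that occurs more than once."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}


def cross_department_duplicates(root):
    """Keep only duplicate sets that span more than one top-level folder.

    Assumption: the first path component below `root` is the department.
    """
    root = Path(root)
    result = {}
    for digest, paths in find_duplicates(root).items():
        departments = {p.relative_to(root).parts[0] for p in paths}
        if len(departments) > 1:
            result[digest] = sorted(departments)
    return result
```

Exact hashing only finds byte-identical copies; renditions and near-duplicates would need a similarity measure instead.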
METADATA VALUES
Perception: We have 4 kinds of contracts to put in the drop down list
Contract Type Word List
Purchase Agreement       53
Employment               44
Letter of Intent         43
NDA                      42
Subcontract Agreement    22
License Agreement        12
Referral Agreement        4
Lease                     4
Partnership Agreement     3
Fact: There are 9 that appear in the content

ACTION - STORAGE OPTIMIZATION
Assessment to plan for storage and IT services optimization
- Enable tiered storage or improved disaster recovery by identifying ways to minimize the impact of low value content on high value storage
- High level technology environmental issues impacting cleanup, migration, and IT practices including volumes, network architecture, speed, policies and standards
- Evaluate compliance with acceptable use and security policies and plan for increasing compliance
What your content can tell you
- Volume of duplicates
- Size and volume of files by category
- Security assessment and organization
- Non shared drive content analysis (virtual disks, PC backups, applications, databases, web content, installation files)
- Obsolete application analysis (client applications older than current system registries, or matching typical downloaded application descriptions)
- Custom extension analysis (to identify internally created file formats)

RECORDS MANAGEMENT
Perception: We can base our tiered storage on the last access dates of files
[Chart: Modified Date and Accessed Date by month, Jan to Dec; y-axis 0 to 700]
Fact: Last Accessed dates are often reset by virus or backup programs (or they were)
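
The contract type word list above is the kind of count that can be produced by matching candidate terms against document text. A minimal sketch, assuming the text of each document has already been extracted to a string; the function name `count_documents_with_terms` is hypothetical, and the terms are the ones from the Metadata Values slide:

```python
import re
from collections import Counter

CONTRACT_TYPES = [
    "Purchase Agreement", "Employment", "Letter of Intent", "NDA",
    "Subcontract Agreement", "License Agreement", "Referral Agreement",
    "Lease", "Partnership Agreement",
]


def count_documents_with_terms(documents, terms):
    """Count, per term, how many documents mention it (case-insensitive)."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for term in terms:
            # Word boundaries stop "Lease" from matching inside "release".
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[term] += 1
    return counts
```

Terms with a non-zero count are candidates for the drop down list; terms that never appear in the content can be questioned before they are added.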
ACTION - RISK MITIGATION
Assessment of Risk Impact and Mitigation
- Opportunities and level of effort required to apply audit requirements, privacy and risk mitigation, and other file tagging and identification to help the organization achieve greater compliance
- Evaluate the level of and types of risk that the organization faces
- Changing risky processes and behaviors to minimize future risk
What your content can tell you
- Risk element type analysis (high, medium, low)
- Risk elements by year to identify trends
- Risk elements by other metadata value (geography, source format, source)
- Risk elements by user volume
- Compliance content organization and location
- Risk type overlap with other metadata such as department or geography (Venn diagram)

RISK MITIGATION
Perception: I need to address both CCNs and SSNs immediately
[Chart: Credit Card and SSN counts by month, Jan to Dec; y-axis 0 to 40]
Fact: A singular event may not need fixing

ACTION - RECORDS MANAGEMENT
Assessment of an eRecords Management Program Initiative
- Recommendations for a program setup to identify required staff, SMEs, level of user interaction
- Map data sources to content and custodians
- Developing definitions of records, non-records and eTrash files to help build an RM communication and support program
What your content can tell you
- RM Program communication analysis and level of effort
- Volume of non-record deletable files
- Volume of duplicates, renditions, and other courtesy copies
- Volume of expired records
- Naming convention - root level function, acronyms, date format
- Custodian and liaison analysis
- Modification and file access analysis
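
Counting credit card numbers (CCNs) and SSNs by month, as in the Risk Mitigation chart, can be sketched with pattern matching plus a Luhn checksum to discard digit runs that cannot be card numbers. This is a rough illustration under stated assumptions: the regular expressions are deliberately naive (US-style `ddd-dd-dddd` SSNs, 13 to 19 digit cards), and every name here is hypothetical.

```python
import re
from collections import Counter

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_risk_elements(text):
    """Return counts of SSN-like strings and Luhn-valid card-like numbers."""
    cards = 0
    for match in CARD_RE.findall(text):
        if luhn_valid(re.sub(r"\D", "", match)):
            cards += 1
    return {"ssn": len(SSN_RE.findall(text)), "card": cards}


def hits_by_month(items):
    """items: iterable of (month_label, text). Returns a Counter per type."""
    monthly = {"ssn": Counter(), "card": Counter()}
    for month, text in items:
        for kind, n in find_risk_elements(text).items():
            if n:
                monthly[kind][month] += n
    return monthly


def peak_share(counter):
    """Return (month, share of all hits) for the busiest month, or None."""
    total = sum(counter.values())
    if total == 0:
        return None
    month, count = counter.most_common(1)[0]
    return month, count / total
```

A peak share close to 1.0 suggests a single event, such as one bulk export, rather than an ongoing process, which is the distinction the Fact line draws.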
CONTENT AGING
Perception: The desired business retention of XXX1020 is 10 years
[Chart: Age Modified and Age; x-axis 1 to 15; y-axis 0 to 120]
Fact: 90% of reports are never accessed after 4 days; 100% after 10 days

FOLDER STRUCTURES
LDM BJL MMW HGS HSH TKJ
Bills files Altos3.2 EPA Payments JGAGDG Final
Accounting Finance HR Marketing Legal
Duplicates: 25%, 15%, 5%
Accuracy: 65%, 75%, 95%

AN IG ENABLEMENT PLATFORM
Data Source Data Source Data Source Data Source Information Transparency IG Tool Catalyst Point Analysis Layer Clean Organize Migrate Protect Discover Storage Optimization Classify/ Move Email/ Archives Risk Assessment/ Audit eDiscovery Investigation Audit
Enable active governance through Information Transparency
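
The Content Aging finding ("90% of reports are never accessed after 4 days") is a share computed from the gap between each file's modified and last accessed dates. A minimal sketch, assuming those two timestamps are available per file and that the access dates are trustworthy, which the deck itself cautions is not always the case; the function name is hypothetical:

```python
from datetime import datetime, timedelta


def share_idle_after(files, days):
    """Fraction of files whose last access is within `days` of modification.

    files: list of (modified, accessed) datetime pairs.
    """
    if not files:
        return 0.0
    limit = timedelta(days=days)
    idle = sum(1 for modified, accessed in files if accessed - modified <= limit)
    return idle / len(files)


# Hypothetical data: nine reports last touched within 2 days, one after 9 days.
base = datetime(2014, 1, 1)
reports = [(base, base + timedelta(days=2))] * 9 + [(base, base + timedelta(days=9))]
print(share_idle_after(reports, 4))   # 0.9
print(share_idle_after(reports, 10))  # 1.0
```

Comparing this share against the stated retention period shows how far actual use falls short of the desired business retention.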
QUESTIONS
BRIAN.TUEMMLER@NUIX.COM