Auto-Classification for Document Archiving and Records Declaration

Transcription:

Auto-Classification for Document Archiving and Records Declaration
Josemina Magdalen, Architect, IBM
November 15, 2013

Agenda
- IBM / ECM / Content Classification for Document Archiving and Records Management
- IBM environment: Research and SWG working together to produce the best solution for our customers
- How does Content Classification bring value to ECM?
- Content Classification concepts, components and architecture
- How can Content Classification help with Document Archiving and Records Management?
- Classification and Compliance
- Managing content at the entry point with Content Navigator and Classification
- Optimizing your business workflow with Case Manager and Classification

Copyright International Business Machines Corporation 2013. All Rights Reserved. US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Internal Use Only

Beating the competition and leading the industry
- IBM ahead in both growth and market share
- IBM is the undisputed leader in the 2013 Gartner ECM Magic Quadrant (Sep 2013)
Source: Gartner, ECM Market Share, April 30, 2013
Source: IDC, ECM Market Share, April 30, 2013

It's no longer about one thing

Volume
- 12 terabytes of Tweets created daily: analyze product sentiment
- 15 petabytes of new information daily: determine relevance
Velocity
- 5 million trade events per second: identify potential fraud
- 500 million call detail records per day: prevent customer churn
Variety
- 4 terabytes/site/day average surveillance video: monitor events of interest
- 80% of info growth is unstructured content: improve customer satisfaction

During this presentation, 458.81 terabytes of information will have been created.

Unleash the value of content in motion
Capture it. Activate it. Socialize it. Analyze it. Govern it.
Content at Rest equals Cost. Content in Motion equals Value.

IBM Research: Open and Collaborative
The Eras of IBM Research: The World Is Now Our Lab
- 50s to 90s: Hardware. Isolated Research
- 90s to 00s: + Software & Services. Joint Projects (IBM Divisions, Clients, Universities)
- 10s: + Smarter Planet. Radical Collaboration (In-world Research, Smarter Planet Research)

Let's talk about Watson
- What is IBM Watson?
- Why is it important?
- How is IBM putting Watson to work?
- What can we expect in the future?

IBM Watson combines transformational technologies
1. Understands natural language and human communication
2. Generates and evaluates evidence-based hypotheses
3. Adapts and learns from user selections and responses
Built on a massively parallel architecture optimized for IBM POWER7

Content is Exploding. Content is Evolving. Content is Transforming.
The marketplace is driving greater volume, variety and velocity

Organizations will need to redefine their content strategy in order to gain control, optimize business outcomes, improve collaboration, achieve new insight, and govern for reduced cost and risk: content in motion

IBM helps companies realize the full value of content for better insight and outcomes
- Capture: harness and exploit
- Activate: optimize outcomes
- Socialize: share and collaborate
- Analyze: achieve new insights
- Govern: reduce costs and risks

Classification brings value to IBM ECM products, helping organizations with Accessibility, Usability, Compliance and Analytics
- Can you find relevant content, quickly? "Search, Refine, Repeat" is no longer acceptable. (Image Capture, Content Collection, Enterprise Search)
- Is the right content available at the right time? Business processes require timely access to content. (Business Process Management, Case Management)
- Are you complying with Legal and Business mandates? Content has a compliance lifecycle that must be enforced. (Content Collection, Enterprise Records, eDiscovery)
- Are you uncovering business insight from your content? Organized content produces better insight. (Content Analytics)

Content Classification: Analyze content to unlock critical insight
Derive new business insight rapidly by accessing, interpreting and analyzing unstructured content
- Analyze content to derive 360-degree visibility and insight into unstructured information
- Search, assess and analyze large volumes of text in order to understand and determine relevant insight quickly
- Classify content through contextual understanding
Only IBM brings together the technologies that define the next generation of Smarter Analytics solutions that can reason and learn:
1. Natural language
2. Hypothesis testing
3. Evidence-based learning
IBM Content Classification: moving your organization from search to discovery, from possibilities to probabilities, and from simple outputs to intelligent options

What does IBM Content Classification do?
Content Classification:
- discovers the intent of a document by analyzing its content
- automatically learns from examples
- allows you to auto-classify huge volumes of documents into pretrained categories, consistently and efficiently

What is IBM Content Classification used for?
Content Classification is most valuable when:
- A large number of documents need to be categorized
- Documents need to be categorized based on their content
- An action needs to be taken as a result of the classification
- You need to order the chaos and bring structure into unstructured data

What is IBM Content Classification used for? (cont.)
Automatic classification advantages over manual classification:
- Reduces training cost
- Reduces laborious activities
- Consistent decisions, reduces errors
- Coherent and legally defensible
- Extremely fast

Why organizations need Content Classification
Through automated, advanced classification, knowledge workers:
- have quick access to relevant content
- can use the information they need to complete tasks
- are not burdened with enforcing compliance and retention policies
- can analyze content relevant to specific subject matter
Automated classification allows workers to focus on key business tasks, rather than spend time with manual categorization of content.
In short, Content Classification improves productivity.

Content Classification: What If?
- Manual classification might yield "Correspondence" or "Complaint"
- Rules-based classification needs rules for "One Million Mile", "delay", "weather" and what else?
- Context-based classification enables you to:
  - Classify as Presidential or High Value Client
  - Route to worker assigned to High Value Clients
  - Assign to High Value Client Record Class & Retention Rules
  - Analyze content for High Value Client feedback

Content Classification: What If?
- Manual classification might yield "Correspondence" or "Complaint"
- Rules-based classification needs rules for "awful", "delay", "never" and what else?
- Context-based classification enables you to:
  - Classify as Customer Complaint
  - Route to worker assigned to Complaints
  - Assign to Complaint Record Class & Retention Rules
  - Analyze content for Customer Satisfaction or Dissatisfaction

Content Classification: What If?
- Manual classification might yield "Correspondence"
- Rules-based classification needs rules for "wonderful", "delay", "pleasure" and what else?
- Context-based classification enables you to:
  - Classify as Customer Compliment
  - Route to worker assigned to Cross Sell or Up Sell
  - Assign to Compliment Record Class & Retention Rules
  - Analyze content for Customer Satisfaction or Dissatisfaction

What-If Summary
- The sample emails were not specifically about delays; they were about how customers were treated during a delay
- Manual classification would be slow and could have resulted in inaccurate categorization
- Rules-based, keyword-only classification would require rules for specific keywords and may have miscategorized based on words like "delay"
- Context-based classification allowed the system to understand the context of each email and classify, route, govern, and analyze with better accuracy
The Bottom Line: Content Classification tells you what your content is about

Classification Process
1. Train (using the Quick Start Tool)
2. Deploy
3. Auto Classify
Diagram components: Decision Plan, Classification Server, Classification Application
[Figure: sample document reading "The core market for this new product has been defined as such by IBM"]

Classification by Contextual Understanding
Text Analysis, Statistics, and Learning by Example

Input (email):
  Team, We need to determine how to handle the results of the most recent earnings report and how it will impact the reaction on Wall Street. We need to get out in front of this before the press does! Jack, get the status from Engineering ahead of time. Regards, John
IBM Content Classification (Knowledge Base)
Output: PR (92%), FINANCE (82%), ENGINEERING (32%)
Feedback: Intent = PR Email
Consumers: custom & partner applications; IBM pre-built integrations (ECM, ...)
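The idea of learning by example and returning a score per category can be sketched in a few lines. This is only a toy illustration, assuming a bag-of-words profile per category and cosine similarity as the score; it is not the algorithm used by IBM Content Classification, and the class name `ExampleBasedClassifier`, the training texts and the resulting scores are all hypothetical (they will not reproduce the percentages on the slide).

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors (Counters).
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ExampleBasedClassifier:
    """Hypothetical stand-in for a knowledge base trained from examples."""

    def __init__(self):
        self.profiles = {}  # category name -> Counter of term counts

    def train(self, category, text):
        # Learning by example: fold the example's terms into the category profile.
        self.profiles.setdefault(category, Counter()).update(tokenize(text))

    def score(self, text):
        # Return (category, score) pairs, best match first.
        doc = Counter(tokenize(text))
        scores = {cat: cosine(doc, prof) for cat, prof in self.profiles.items()}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


clf = ExampleBasedClassifier()
clf.train("PR", "press media statement")
clf.train("FINANCE", "earnings revenue quarter")
ranked = clf.score("quarter earnings beat revenue forecast")
```

Each category receives its own independent score, so several categories can score high for the same document, as in the PR / FINANCE / ENGINEERING output on the slide.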

Control the level of Classification automation
- Advanced classification can be executed as an assistance to authors in user interfaces
- Semi-automated advanced classification via monitoring
- Assisted classification in user interfaces like SharePoint or, in the future, in IBM's Office integration
Levels of automation, from 100% to 0%:
- Complete Automation
- Automation with Auditing
- Automation of Medium Confidence and Above
- Automation of High Confidence and Above
- Assisted Manual Classification
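One way to picture the automation levels is as a confidence threshold per policy: scores at or above the threshold are applied automatically, everything else is offered to a person as a suggestion. The sketch below assumes that reading; the names `AutomationPolicy` and `route` and the threshold values 0.5 and 0.8 are illustrative assumptions, not product settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutomationPolicy:
    name: str
    auto_threshold: float  # scores at or above this are applied without a human
    audit: bool = False    # whether automatic decisions are flagged for auditing


# Threshold values are illustrative assumptions, not product defaults.
COMPLETE = AutomationPolicy("Complete Automation", 0.0)
WITH_AUDITING = AutomationPolicy("Automation with Auditing", 0.0, audit=True)
MEDIUM_AND_ABOVE = AutomationPolicy("Automation of Medium Confidence and Above", 0.5)
HIGH_AND_ABOVE = AutomationPolicy("Automation of High Confidence and Above", 0.8)
ASSISTED_MANUAL = AutomationPolicy("Assisted Manual Classification", float("inf"))


def route(category, score, policy):
    # Apply the category automatically when the score clears the policy
    # threshold; otherwise hand it to a user as a suggestion.
    if score >= policy.auto_threshold:
        return {"category": category, "action": "auto_apply", "audit": policy.audit}
    return {"category": category, "action": "suggest_to_user", "audit": False}
```

For example, `route("PR", 0.92, HIGH_AND_ABOVE)` is applied automatically, while the same result under `ASSISTED_MANUAL` is only suggested to the user.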

Data in motion: Periodic human oversight facilitates automatic adjustment of policies
Content Classification learns from user feedback to improve and adapt policies
Feedback loop (diagram): Classification Server, Category Recommendation, User Interactions, User Feedback
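The feedback loop can be illustrated with a minimal sketch in which every user confirmation or correction becomes a new training example. It assumes a simple term-count model; the class `FeedbackLearner`, its methods and the sample texts are hypothetical and do not describe the Classification Server's actual learning method.

```python
import re
from collections import Counter, defaultdict


class FeedbackLearner:
    """Hypothetical recommender that adapts as users confirm or correct it."""

    def __init__(self):
        self.term_counts = defaultdict(Counter)  # category -> term counts

    @staticmethod
    def _tokens(text):
        return re.findall(r"[a-z]+", text.lower())

    def recommend(self, text):
        # Score each category by how often its known terms occur in the text.
        tokens = self._tokens(text)
        scores = {
            cat: sum(counts[t] for t in tokens)
            for cat, counts in self.term_counts.items()
        }
        best = max(scores, key=scores.get, default=None)
        # Return None when nothing has been learned or nothing matches.
        return best if best is not None and scores[best] > 0 else None

    def feedback(self, text, category):
        # User feedback: treat the document as a new example of the category.
        self.term_counts[category].update(self._tokens(text))


learner = FeedbackLearner()
before = learner.recommend("terrible delay, never again")   # nothing learned yet
learner.feedback("terrible delay, never again", "COMPLAINT")  # user supplies category
after = learner.recommend("never flying again after this delay")
```

Before any feedback the learner has no recommendation; after a single corrected example it recommends the corrected category for a similar document.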

Content Classification Rules: Decision Plan
A decision plan is a sequence of rules and calls to statistical analysis
Rule capabilities:
- String search
- Word distance
- Regular expressions
- Pattern extraction
- Boolean expressions
Decision plan capabilities:
- Identify category (in more than one taxonomy)
- Set document metadata
- Invoke statistical analysis
- Language identification
- Recommend actions
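A decision plan as "a sequence of rules" can be sketched as an ordered list of (trigger, action) pairs. The example below covers string search, word distance, regular expressions, boolean combination and pattern extraction; every function name here (`contains`, `within`, `matches`, `all_of`, `set_category`, `extract`, `run_decision_plan`) is a hypothetical helper for illustration, not the product's rule syntax or API.

```python
import re


def contains(phrase):
    # String search trigger (case-insensitive).
    return lambda text: phrase.lower() in text.lower()


def within(word_a, word_b, distance):
    # Word-distance trigger: both lowercase words occur at most
    # `distance` word positions apart.
    def trigger(text):
        words = re.findall(r"\w+", text.lower())
        pos_a = [i for i, w in enumerate(words) if w == word_a]
        pos_b = [i for i, w in enumerate(words) if w == word_b]
        return any(abs(i - j) <= distance for i in pos_a for j in pos_b)
    return trigger


def matches(pattern):
    # Regular-expression trigger.
    return lambda text: re.search(pattern, text) is not None


def all_of(*triggers):
    # Boolean AND over several triggers.
    return lambda text: all(t(text) for t in triggers)


def set_category(name):
    def action(result, text):
        result["categories"].append(name)
    return action


def extract(field, pattern):
    # Pattern extraction: store the first match as document metadata.
    def action(result, text):
        found = re.search(pattern, text)
        if found:
            result["metadata"][field] = found.group(0)
    return action


def run_decision_plan(rules, text):
    # Evaluate the rules in order; each firing rule updates the result.
    result = {"categories": [], "metadata": {}}
    for trigger, action in rules:
        if trigger(text):
            action(result, text)
    return result


plan = [
    (all_of(contains("invoice"), within("payment", "due", 2)),
     set_category("Finance/Invoices")),
    (matches(r"INV-\d+"), extract("invoice_id", r"INV-\d+")),
]
outcome = run_decision_plan(plan, "Invoice INV-2041: payment is due within 30 days.")
```

A call to statistical analysis could be added as one more action in the same list, which is why the plan is kept as plain ordered data.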

Content Classification Rules: Fine-tuning with Rules
- Use rules to select a category based on score
- Use rules to extract data from textual content
- Test rules to analyze their behavior with variable content items

Content Classification Rules: More Decision Plan rules
Triggers:
- Substring search, word search, words within a distance
- Search based on (large) word/phrase lists
- Date/Time search
Actions:
- Set Date/Time actions
- Set Expiration Date (Retention Date)
Entity identification and extraction: standard regular expression syntax supported
The decision plan pipeline has a published API. Customers can create their own custom classification methods or call out to other systems to enhance classification:
- You can invoke your preferred ontology
- You can use UIMA annotators
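A Date/Time trigger combined with a Set Expiration Date action might look like the following sketch. It assumes ISO-formatted dates (YYYY-MM-DD) in the text and a per-category retention period in years; the `RETENTION_YEARS` table, its values and the function names are hypothetical and are not retention guidance or product behavior.

```python
import re
from datetime import date

# Hypothetical retention schedule: category -> years to retain.
RETENTION_YEARS = {"Invoice": 7, "Complaint": 3}


def add_years(d, years):
    # Add whole years; February 29 falls back to February 28 in non-leap years.
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def find_date(text):
    # Date trigger: first ISO-style date in the text, or None.
    # Assumes the matched digits form a valid calendar date.
    m = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", text)
    if not m:
        return None
    return date(int(m.group(1)), int(m.group(2)), int(m.group(3)))


def set_expiration(text, category, default_start=None):
    # Action: compute the retention (expiration) date for a classified document.
    start = find_date(text) or default_start
    years = RETENTION_YEARS.get(category)
    if start is None or years is None:
        return None
    return add_years(start, years)


expires = set_expiration("Invoice dated 2013-11-15 for services rendered", "Invoice")
```

Here `expires` is seven years after the date found in the text; a document with no date and no `default_start`, or with a category missing from the table, yields `None` so that a person can decide.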

Content Classification Methods: Contextual and Rules
Content Classification combines multiple categorization technologies to deliver automatic classification:
- Uses contextual analysis based on machine learning techniques
- Uses natural language processing and semantic analysis
- Uses rules based on metadata or confidence score
- The methods can be used in tandem or separately, depending on requirements