New Developments in the Automatic Classification of Email Records. Inge Alberts, André Vellino, Craig Eby, Yves Marleau

1 New Developments in the Automatic Classification of Email Records
Inge Alberts, André Vellino, Craig Eby, Yves Marleau
ARMA Canada 2014

2 INTRODUCTION

3 OUTLINE
1. Research team
2. Research context / Problem statement
3. Overview of records auto-classification
4. Project objectives
5. Research methodology
6. Qualitative analysis
7. Automatic classification
8. Future work

4 RESEARCH TEAM
- CISRI: Center of excellence and a catalyst for collaborative, interdisciplinary research in information science
- ÉSIS, University of Ottawa: Information Studies program grounded in theory, supported by practical work experience, and integrally connected to the trends of the leading knowledge centres in the National Capital Region and beyond

5 RESEARCH TEAM
- Inge Alberts: ÉSIS / CISRI
- André Vellino: ÉSIS / Institute for Science, Society and Policy (ISSP)
- Craig Eby: CISRI / Cogniva
- Yves Marleau: CISRI / Cogniva

6 BUSINESS PROBLEMS
- Email management has always been a problem
- Terabytes of data in mailboxes and PST files
- Integration of email and ECM systems
- What is the role of the user?

7 GOC EXAMPLE
Three new initiatives make the problem even more complex:
- Email Transformation
- Directive on Recordkeeping
- Open Government

8 IMPORTANCE OF BUSINESS CONTEXT
- Need to identify information of business value
- Need to better define the concept of business value
- Need to situate information within its context of use

9 DEFINING BUSINESS CONTEXT

10 RESEARCH PROJECT
[Diagram: File Plan hierarchy with a Business Function at the top, Functions below it, and Sub-Functions below each Function]

11 ISIS Methodology produces models of an organization's business context
- ISIS Enterprise Software Solution helps organizations implement business-context-centric classification
- Classification automation
- Automated business rules
- Centralized Taxonomy and Rules Management
- Goal is to further reduce the requirements on users

12 OVERVIEW OF RECORDS AUTO-CLASSIFICATION
Current state:
- Reaching similar classification quality as human users
- Mix of statistical and rule-based implementations
Challenges:
- Challenging to implement
- Confidence in and acceptance of results
- Lacking a structured approach to systematically ensure quality of system and interpret results

13 RELATED RESEARCH
Analysis of Email
- Ph.D. on automatic classification (Inge)
- Extrusion protection at Entrust (André)
- Business value pilot (André & Inge)
Business Modeling
- ISIS Methodology (Cogniva)
- Collaborations with LAC (Cogniva)
- Research on faceted classification (CISRI & University of Montreal)
Process Discovery & Auto Classification
- Auto-classification Research (Cogniva)
- IRAP Research (Cogniva & CISRI)
- ISIS Software (Cogniva)

14 RESEARCH OBJECTIVES
1. Understand how the concept of business value applies to the management of records
2. Develop a model of information experts' strategies while appraising value in a work context
3. Propose a set of requirements to automatically classify organizational records
4. Test these requirements on a corpus of emails

15 RESEARCH METHODOLOGY
- Phase 1: User Study. Qualitative analysis of 8 information experts' appraisal strategies
- Phase 2: Automatic Classification. Quantitative analysis of ~900 messages

16 Research Focus Phase 1
- Criteria to identify the business value of email
- Decision process when appraising the value of email
- Lexical and nonlexical features used when appraising the business value of email
- Requirements needed to automatically classify organizational email

17 Methodology Phase 1
- Semi-structured Interviews: 8 experts, ~1h, 14 questions on business value & email
- Cognitive Inquiries: 7 experts, ~30 minutes, email classification exercise, email sample (n=174)
- Model Development: BV factors, email feature analysis, classification model of appraisal strategies
- Manual Classification: 2 inboxes (n=1975), 1 corpus classified (n=~800)

18 PROFILE OF PARTICIPANTS (N=8)

19 RESULTS FROM PHASE 1
1. Business Value Factors
2. Features Analysis
3. Classification Model of Appraisal Strategies

20 BUSINESS VALUE
Information resources of business value:
- Are published and unpublished materials, regardless of medium or form
- Are created or acquired because they enable and document decision-making in support of programs, services and ongoing operations
- Support departmental reporting, performance and accountability requirements
(Directive on Recordkeeping)

21 BUSINESS VALUE
[Diagram labels: Process, Context, Time]
- Operational = Performance: support actions & decisions, enhance performance, mitigate risks
- Evidential = Accountability: evidence of transaction, report on results, ATIP, litigation

22 BUSINESS VALUE FACTORS
- Origin
- Action
- Chronology
- Meaning

23 ORIGIN
- Email origin is internal (team members, supervisors) or external (clients, professional network)
- Origin is the main factor affecting the appraisal of business value
- Appraisal decisions related to origin are based on:
  - Name of the sender
  - Position and organization of the sender
  - Hierarchical relation between the sender & the recipient
  - Active project involving the sender & the recipient

24 ACTION
- Email action is passive (no engagement from the recipient) or performative (engagement & accountability from the recipient)
- Action is an important factor affecting the appraisal of business value, both operational & evidential
- Appraisal decisions related to action are based on:
  - Type of action
  - Level of engagement & accountability of the recipient (high risk or low risk)

25 CHRONOLOGY
- Chronology is operational (during project) or postmortem (after project)
- For IM consultants, keeping track of action history is a determining factor in appraising business value, especially for active projects
- Challenging factor for defining business value
- Appraisal decisions related to chronology are based on:
  - Project status: active or closed

26 MEANING
- Email meaning is explicit (rich vocabulary) or latent (based on context)
- Many solutions available on the market classify based on explicit meaning, but appraisal decisions are often based on latent meaning
- Appraisal decisions related to meaning are based on:
  - EXPLICIT: Keywords, Attachment, Thread, Type of Action
  - IMPLICIT: Origin, Chronology, Level of Engagement

27 ANALYSIS OF FEATURES
Lexical Features
- Name & organization of the sender
- Action verbs: approval, confirmation, request, reminder, negotiation
- Action objects: SOW, meeting, status report, deadlines, deliverables, decision, reference material
- Presence of RE or FW in the title
- Name of attachment
Nonlexical Features
- Message sent or received
- Hierarchical relation between the sender & the recipient
- Position of the recipient (TO, CC)
- Number of recipients (TO, CC)
- Project status: active or closed
- Presence of attachment
- Presence of a thread
- Presence of high priority symbol
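The lexical cues on this slide could be turned into simple features for a classifier. The following is a minimal Python sketch under stated assumptions: it looks only at the subject line, the keyword lists are copied from the slide, and the function name `lexical_features` and the matching rules are hypothetical rather than the authors' implementation.

```python
import re

# Action verbs and action objects as listed on the slide (lower-cased).
ACTION_VERBS = ["approval", "confirmation", "request", "reminder", "negotiation"]
ACTION_OBJECTS = ["sow", "meeting", "status report", "deadlines",
                  "deliverables", "decision", "reference material"]


def lexical_features(subject: str) -> dict:
    """Return simple lexical cues found in an email subject line."""
    text = subject.lower()
    # Keep only alphabetic words so short terms such as "sow" match whole words.
    normalized = " ".join(re.findall(r"[a-z]+", text))

    def present(term: str) -> bool:
        return re.search(r"\b" + re.escape(term) + r"\b", normalized) is not None

    return {
        # Assumption: a reply or forward is marked by a leading "RE:" or "FW:".
        "has_re_or_fw": re.match(r"\s*(re|fw)\s*:", text) is not None,
        "action_verbs": [v for v in ACTION_VERBS if present(v)],
        "action_objects": [o for o in ACTION_OBJECTS if present(o)],
    }
```

For a subject such as "RE: Request for SOW approval", this sketch flags the reply prefix and reports the verbs "approval" and "request" and the object "sow". A real feature extractor would also cover the message body, sender names and attachment names listed above.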

28 HUMAN CLASSIFICATION MODEL (1/2)

29 HUMAN CLASSIFICATION MODEL (2/2)

30 MANUAL CLASSIFICATION CHALLENGES (1/2)
- More BV=No than BV=Yes
- Bilingual messages
- Two recipients in the TO field
- Threads: a message of business value is quickly superseded by a more recent one
- Sender is accountable for internal messages sent, but for both sent and received messages when external

31 MANUAL CLASSIFICATION CHALLENGES (2/2)
- Emails of business value and attachments of business value have to be differentiated
- Perception of value is different between the individuals and the organization
- Evaluating the importance of some decisions or main revisions of drafts can be challenging
- Some emails of business value during active projects are ephemeral (operational versus evidential)

32 Objectives Phase 2
- Attempt the automation of binary classification of email:
  - Business Value
  - No Business Value
- Compare the human labeling process with the machine learning

33 Methodology Phase 2
1. Corpus Creation
2. Manual Classification
3. Feature Extraction
4. Machine Training
5. Cross-Validation
6. Model Testing
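The cross-validation step in this pipeline can be illustrated with a small Python sketch. The slides do not state how many folds were used or how the toolkit splits the data, so the function name `k_fold_splits`, the shuffling and the choice of `k` are all assumptions made for illustration.

```python
import random


def k_fold_splits(n_items: int, k: int, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each labeled message lands in exactly one test fold, so every message is used for testing once and for training in the remaining rounds.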

34 A Machine Learning toolkit for non-experts
- For experimenting with text mining technology
- Based on Weka Data Mining Open Source software
- Developed by smart Ph.D. students at Carnegie Mellon University
- Offers:
  - Feature extraction
  - Model building
  - Automated analysis and labeling
  - Prediction

35 CORPUS FOR TRAINING
2 individual collections (inbox + outbox):
- 250 emails Business Value, [...] emails No Business Value
- 172 emails Business Value, [...] emails No Business Value
Features extracted:
- Originator and Recipients (To / From / Cc)
- Content of Subject / Body / Attachments
- Number of recipients in To and Cc fields
- Number of attachments
- Importance flag
- Forwarded indications
- Part of Thread indications
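Several of the extracted features on this slide are header-level counts and flags. As a hedged illustration, the sketch below derives a few of them from a raw message with Python's standard `email` package. The study's own tooling is not shown in the slides; the function name `header_features`, the use of the `Importance` and `In-Reply-To` headers, and the subject-prefix rules are assumptions.

```python
from email import message_from_string
from email.utils import getaddresses


def header_features(raw_message: str) -> dict:
    """Extract a few nonlexical features from a raw RFC 5322 email string."""
    msg = message_from_string(raw_message)
    to_addrs = getaddresses(msg.get_all("To", []))
    cc_addrs = getaddresses(msg.get_all("Cc", []))
    subject = (msg.get("Subject") or "").strip().lower()
    # Count MIME parts explicitly marked as attachments.
    n_attachments = sum(
        1 for part in msg.walk() if part.get_content_disposition() == "attachment"
    )
    return {
        "n_to": len(to_addrs),
        "n_cc": len(cc_addrs),
        "n_attachments": n_attachments,
        # Assumption: the importance flag is carried in the "Importance" header.
        "high_importance": (msg.get("Importance") or "").strip().lower() == "high",
        # Assumption: forwards and replies are detected from the subject prefix
        # or from the presence of an In-Reply-To header.
        "is_forward": subject.startswith(("fw:", "fwd:")),
        "in_thread": subject.startswith("re:") or msg.get("In-Reply-To") is not None,
    }
```

Mail exported from a specific client may encode priority or threading differently, so the header names would need to be checked against the actual corpus.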

36 FROM AND TO FIELDS
- "From": sender, supervisor, colleague, client
- "To": "solerecipient", "soleorganizationalrecipient", "supervisorisrecipient", "oneamongmanyrecipient", "clientisrecipient" in every other case

37 GENERAL RESULTS
- Business Value SVM: Accuracy 0.91, Kappa 0.83
- Spam SVM (for comparison): Accuracy 0.96, Kappa 0.93
- Support Vector Machines (SVMs) are highly accurate predictors of Business Value / Not Business Value
- SVM models are very specific to sender / recipient
- One model does not appear to suffice for organizational automatic classification of business value
- Training SVMs for greater accuracy makes it more difficult to explain the behaviour of the model

38 CONTRIBUTION OF ATTACHMENT CONTENT
[Figure panels: Without Attachment Content Analysis; With Attachment Content Analysis]

39 No business value: Features
[Figure labels: Company Acronym, Company, Colleague, Colleague, Colleague]

40 Business value: Features
[Figure labels: owner, Client Acronym, Client, Client Acronym]

41 FUTURE WORK
- Experiment with alternative classifiers besides SVMs
- Grow the corpus of BV and NBV emails from a wider variety of senders / recipients
- Add / subtract features
- Vary text analysis parameters:
  - Unigram / Bigram / Trigram
  - PoS tagging
  - Stop words
  - Stemming
  - Punctuation
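Some of the text analysis parameters named on this slide (n-gram size, stop words, punctuation) can be sketched in a few lines of Python. This is only an illustration: the tiny stop-word list and the function names `tokenize` and `ngrams` are hypothetical, and the toolkit used in the study exposes these options through its own interface.

```python
import re

# Tiny illustrative stop-word list; a real experiment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "for", "and", "is", "in"}


def tokenize(text: str, drop_stop_words: bool = False) -> list:
    """Lower-case the text, strip punctuation, and optionally drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def ngrams(tokens: list, n: int) -> list:
    """Return all contiguous n-grams (as tuples) from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

Calling `ngrams(tokens, 1)`, `ngrams(tokens, 2)` and `ngrams(tokens, 3)` on the same token list gives the unigram, bigram and trigram feature sets that could then be compared.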

42 ACKNOWLEDGMENTS
Special thanks to:
- The participants for their time and their enthusiasm during this study
- The organization which granted us permission to analyze data
- The research assistants for their active contribution
This project is supported by a research grant from the University of Ottawa

43 THANK YOU

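44 ACCURACY
$$\text{Accuracy} = \frac{\text{True-Pos.} + \text{True-Neg.}}{(\text{True} + \text{False})\ \text{Pos.} + (\text{True} + \text{False})\ \text{Neg.}} = \frac{229 + 228}{500} = \frac{457}{500} = 0.914 \approx 0.91$$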
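The accuracy arithmetic on this slide can be checked with a short Python sketch. The counts (229 true positives, 228 true negatives, 500 messages) come from the slide; the function name `accuracy` is hypothetical.

```python
def accuracy(true_pos: int, true_neg: int, total: int) -> float:
    """Fraction of messages whose predicted label matches the human label."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (true_pos + true_neg) / total


acc = accuracy(229, 228, 500)
print(round(acc, 3))  # 0.914
```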

45 Cohen's Kappa
The degree to which the machine classifier and the human classifiers agree.
$$\kappa = \frac{\Pr(M) - \Pr(R)}{1 - \Pr(R)}$$
Pr(M) = (229 + 228) / 500 = 0.914 ≈ 0.91
Pr(R) = probability of random agreement = 0.5
$$\kappa = \frac{0.914 - 0.5}{1 - 0.5} = 0.828 \approx 0.83$$
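The kappa value can be reproduced the same way. In this Python sketch the observed agreement and the 0.5 chance agreement are taken from the slide, while the function name `cohens_kappa` is hypothetical. Note that the unrounded agreement of 0.914 is needed to arrive at 0.83; starting from the rounded 0.91 gives 0.82.

```python
def cohens_kappa(observed_agreement: float, chance_agreement: float) -> float:
    """Cohen's kappa from observed agreement Pr(M) and chance agreement Pr(R)."""
    if chance_agreement >= 1.0:
        raise ValueError("chance agreement must be below 1")
    return (observed_agreement - chance_agreement) / (1.0 - chance_agreement)


pr_m = (229 + 228) / 500          # observed machine/human agreement
kappa = cohens_kappa(pr_m, 0.5)   # 0.5 = probability of random agreement
print(round(kappa, 2))  # 0.83
```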

Managing e-records without an EDRMS. Linda Daniels-Lewis Senior IM Consultant Systemscope

Managing e-records without an EDRMS. Linda Daniels-Lewis Senior IM Consultant Systemscope Managing e-records without an EDRMS Linda Daniels-Lewis Senior IM Consultant Systemscope Outline The e-record What s involved in managing e-records? Where do we start? How do we classify? How do we proceed?

More information

Facilitating Business Process Discovery using Email Analysis

Facilitating Business Process Discovery using Email Analysis Facilitating Business Process Discovery using Email Analysis Matin Mavaddat Matin.Mavaddat@live.uwe.ac.uk Stewart Green Stewart.Green Ian Beeson Ian.Beeson Jin Sa Jin.Sa Abstract Extracting business process

More information

Taxonomies in Practice Welcome to the second decade of online taxonomy construction

Taxonomies in Practice Welcome to the second decade of online taxonomy construction Building a Taxonomy for Auto-classification by Wendi Pohs EDITOR S SUMMARY Taxonomies have expanded from browsing aids to the foundation for automatic classification. Early auto-classification methods

More information

Deleting Electronic Records Setting Yourself Up for Success. Pilar C. McAdam, CRM, ERMm Partner, Information Governance

Deleting Electronic Records Setting Yourself Up for Success. Pilar C. McAdam, CRM, ERMm Partner, Information Governance Deleting Electronic Records Setting Yourself Up for Success 1 1 Pilar C. McAdam, CRM, ERMm Partner, Information Governance About Pilar C. McAdam, CRM, ERMm Partner, Information Governance, Kaizen InfoSource

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

Feature Subset Selection in E-mail Spam Detection

Feature Subset Selection in E-mail Spam Detection Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature

More information

Automated Content Analysis of Discussion Transcripts

Automated Content Analysis of Discussion Transcripts Automated Content Analysis of Discussion Transcripts Vitomir Kovanović v.kovanovic@ed.ac.uk Dragan Gašević dgasevic@acm.org School of Informatics, University of Edinburgh Edinburgh, United Kingdom v.kovanovic@ed.ac.uk

More information

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter

VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,

More information

Sentiment analysis on tweets in a financial domain

Sentiment analysis on tweets in a financial domain Sentiment analysis on tweets in a financial domain Jasmina Smailović 1,2, Miha Grčar 1, Martin Žnidaršič 1 1 Dept of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 2 Jožef Stefan International

More information

Management of Email Records

Management of Email Records Department of Culture and the Arts Government of Western Australia State Records Office of Western Australia SRO Guideline Management of Email Records A Recordkeeping Guideline for State Organizations

More information

CONCEPTCLASSIFIER FOR SHAREPOINT

CONCEPTCLASSIFIER FOR SHAREPOINT CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running

More information

The Enron Corpus: A New Dataset for Email Classification Research

The Enron Corpus: A New Dataset for Email Classification Research The Enron Corpus: A New Dataset for Email Classification Research Bryan Klimt and Yiming Yang Language Technologies Institute Carnegie Mellon University Pittsburgh, PA 15213-8213, USA {bklimt,yiming}@cs.cmu.edu

More information

UTILIZING COMPOUND TERM PROCESSING TO ADDRESS RECORDS MANAGEMENT CHALLENGES

UTILIZING COMPOUND TERM PROCESSING TO ADDRESS RECORDS MANAGEMENT CHALLENGES UTILIZING COMPOUND TERM PROCESSING TO ADDRESS RECORDS MANAGEMENT CHALLENGES CONCEPT SEARCHING This document discusses some of the inherent challenges in implementing and maintaining a sound records management

More information

High Productivity Data Processing Analytics Methods with Applications

High Productivity Data Processing Analytics Methods with Applications High Productivity Data Processing Analytics Methods with Applications Dr. Ing. Morris Riedel et al. Adjunct Associate Professor School of Engineering and Natural Sciences, University of Iceland Research

More information

A Method for Automatic De-identification of Medical Records

A Method for Automatic De-identification of Medical Records A Method for Automatic De-identification of Medical Records Arya Tafvizi MIT CSAIL Cambridge, MA 0239, USA tafvizi@csail.mit.edu Maciej Pacula MIT CSAIL Cambridge, MA 0239, USA mpacula@csail.mit.edu Abstract

More information

State of Montana E-Mail Guidelines

State of Montana E-Mail Guidelines State of Montana E-Mail Guidelines A Management Guide for the Retention of E-Mail Records for Montana State Government Published by the: Montana State Records Committee Helena, Montana September 2006 Based,

More information

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!

More information

Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model

Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model AI TERM PROJECT GROUP 14 1 Anti-Spam Filter Based on,, and model Yun-Nung Chen, Che-An Lu, Chao-Yu Huang Abstract spam email filters are a well-known and powerful type of filters. We construct different

More information

Feature Selection for Electronic Negotiation Texts

Feature Selection for Electronic Negotiation Texts Feature Selection for Electronic Negotiation Texts Marina Sokolova, Vivi Nastase, Mohak Shah and Stan Szpakowicz School of Information Technology and Engineering, University of Ottawa, Ottawa ON, K1N 6N5,

More information

Delivering Smart Answers!

Delivering Smart Answers! Companion for SharePoint Topic Analyst Companion for SharePoint All Your Information Enterprise-ready Enrich SharePoint, your central place for document and workflow management, not only with an improved

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

How to gather and evaluate information

How to gather and evaluate information 09 May 2016 How to gather and evaluate information Chartered Institute of Internal Auditors Information is central to the role of an internal auditor. Gathering and evaluating information is the basic

More information

A Comparative Study on Sentiment Classification and Ranking on Product Reviews

A Comparative Study on Sentiment Classification and Ranking on Product Reviews A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan

More information

Twitter sentiment vs. Stock price!

Twitter sentiment vs. Stock price! Twitter sentiment vs. Stock price! Background! On April 24 th 2013, the Twitter account belonging to Associated Press was hacked. Fake posts about the Whitehouse being bombed and the President being injured

More information

TOWN OF COTTESLOE POLICY EMAIL MANAGEMENT

TOWN OF COTTESLOE POLICY EMAIL MANAGEMENT EMAIL MANAGEMENT POLICY STATEMENT Town of Cottesloe email accounts are intended for business transactions in support of the Town s strategic goals and objectives. Accordingly any email transmission residing

More information

Electronic Business Communication and University Records (E-mail, Chat and Text) UW- Madison Employee Guidance to. UW-Madison Record Management

Electronic Business Communication and University Records (E-mail, Chat and Text) UW- Madison Employee Guidance to. UW-Madison Record Management UW-Madison Record Management UW- Madison Archives & Records Management 2014 UW- Madison Employee Guidance to Electronic Business Communication and University Records (E-mail, Chat and Text) Questions:

More information

ILM et Archivage Les solutions IBM

ILM et Archivage Les solutions IBM Information Management ILM et Archivage Les solutions IBM Dr. Christian ARNOUX Consultant Information Management IBM Suisse, Software Group 2007 IBM Corporation IBM Strategy for Enterprise Content Compliance

More information

Big Data Text Mining and Visualization. Anton Heijs

Big Data Text Mining and Visualization. Anton Heijs Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark

More information

Cloud Storage-based Intelligent Document Archiving for the Management of Big Data

Cloud Storage-based Intelligent Document Archiving for the Management of Big Data Cloud Storage-based Intelligent Document Archiving for the Management of Big Data Keedong Yoo Dept. of Management Information Systems Dankook University Cheonan, Republic of Korea Abstract : The cloud

More information

Forecasting stock markets with Twitter

Forecasting stock markets with Twitter Forecasting stock markets with Twitter Argimiro Arratia argimiro@lsi.upc.edu Joint work with Marta Arias and Ramón Xuriguera To appear in: ACM Transactions on Intelligent Systems and Technology, 2013,

More information

Data Mining in Personal Email Management

Data Mining in Personal Email Management Data Mining in Personal Email Management Gunjan Soni E-mail is still a popular mode of Internet communication and contains a large percentage of every-day information. Hence, email overload has grown over

More information

How To Write A Summary Of A Review

How To Write A Summary Of A Review PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,

More information

Spam Filtering Based On The Analysis Of Text Information Embedded Into Images

Spam Filtering Based On The Analysis Of Text Information Embedded Into Images Journal of Machine Learning Research 7 (2006) 2699-2720 Submitted 3/06; Revised 9/06; Published 12/06 Spam Filtering Based On The Analysis Of Text Information Embedded Into Images Giorgio Fumera Ignazio

More information

WHITE PAPER: DATA SYSTEM AND PROTECTION. Symantec Enterprise Vault Intelligent Archiving and Email Classification, Retention, Filtering, and Search

WHITE PAPER: DATA SYSTEM AND PROTECTION. Symantec Enterprise Vault Intelligent Archiving and Email Classification, Retention, Filtering, and Search WHITE PAPER: DATA SYSTEM AND PROTECTION Symantec Enterprise Vault Intelligent Archiving and Email Classification, Retention, Filtering, and Search Nick Mehta Vice President, Symantec White Paper: Data

More information

INFORMATION GOVERNANCE A Holistic Approach to Information Governance. David Peterson June 6, 2014

INFORMATION GOVERNANCE A Holistic Approach to Information Governance. David Peterson June 6, 2014 INFORMATION GOVERNANCE A Holistic Approach to Information Governance David Peterson June 6, 2014 Presentation Overview WHAT IS INFORMATION GOVERNANCE? CHALLENGES OUTCOMES ESSENTIAL ELEMENTS STANDARD DEFINITIONS

More information

The Introduction of a New Performance Management System. for Administrative & Professional, and Exempt Employees at Brock University

The Introduction of a New Performance Management System. for Administrative & Professional, and Exempt Employees at Brock University The Introduction of a New Performance Management System for Administrative & Professional, and Exempt Employees at Brock University Your Role Today In your day-to-day activities you may wear many different

More information

Spam detection with data mining method:

Spam detection with data mining method: Spam detection with data mining method: Ensemble learning with multiple SVM based classifiers to optimize generalization ability of email spam classification Keywords: ensemble learning, SVM classifier,

More information

A Proposed Algorithm for Spam Filtering Emails by Hash Table Approach

A Proposed Algorithm for Spam Filtering Emails by Hash Table Approach International Research Journal of Applied and Basic Sciences 2013 Available online at www.irjabs.com ISSN 2251-838X / Vol, 4 (9): 2436-2441 Science Explorer Publications A Proposed Algorithm for Spam Filtering

More information

Information Access Platforms: The Evolution of Search Technologies

Information Access Platforms: The Evolution of Search Technologies Information Access Platforms: The Evolution of Search Technologies Managing Information in the Public Sphere: Shaping the New Information Space April 26, 2010 Purpose To provide an overview of current

More information

Social Media Analytics

Social Media Analytics Social Media Analytics Raghu Krishnapuram and Jitendra Ajmera IBM Research - India 2011 IBM Corporation Convergence of Social and Analytic Technologies Transform the Way the World Operates Socially Synergistic

More information

MICROSOFT OUTLOOK 2010

MICROSOFT OUTLOOK 2010 MICROSOFT OUTLOOK 2010 George W. Rumsey Computer Resource Center 1525 East 53rd, Suite 906 Chicago, IL 60615 (773) 955-4455 www.computer-resource.com gwrumsey@att.net What Is Outlook?... 1 Folders... 2

More information

Achieve. Performance objectives

Achieve. Performance objectives Achieve Performance objectives Performance objectives are benchmarks of effective performance that describe the types of work activities students and affiliates will be involved in as trainee accountants.

More information

Governance in Digital Asset Management

Governance in Digital Asset Management Governance in Digital Asset Management When was the last time you spent longer than it should have taken trying to find a specific file? Did you have to ask someone to help you? Or, has someone asked you

More information

conceptsearching Prepared by: Concept Searching 8300 Greensboro Drive, Suite 800 McLean, VA 22102 USA +1 703 531 8567

conceptsearching Prepared by: Concept Searching 8300 Greensboro Drive, Suite 800 McLean, VA 22102 USA +1 703 531 8567 conceptsearching Empowering Knowledge in Professional Services White Paper Prepared by: Concept Searching 8300 Greensboro Drive, Suite 800 McLean, VA 22102 USA +1 703 531 8567 9 Shephall Lane Stevenage

More information

OUTLOOK 2013 - GETTING STARTED

OUTLOOK 2013 - GETTING STARTED OUTLOOK 2013 - GETTING STARTED Information Technology September 1, 2014 1 GETTING STARTED IN OUTLOOK 2013 Backstage View Ribbon Navigation Pane View Pane Navigation Bar Reading Pane 2 Backstage View contains

More information

Certified Information Professional 2016 Update Outline

Certified Information Professional 2016 Update Outline Certified Information Professional 2016 Update Outline Introduction The 2016 revision to the Certified Information Professional certification helps IT and information professionals demonstrate their ability

More information

How to Manage Email. Guidance for staff

How to Manage Email. Guidance for staff How to Manage Email Guidance for staff 1 Executive Summary Aimed at Note Purpose Benefits staff Necessary skills to All staff who use email This guidance does NOT cover basic IT literacy skills. Staff

More information

Office of the Auditor General of Canada. Internal Audit of Document Management Through PROxI Implementation. July 2014

Office of the Auditor General of Canada. Internal Audit of Document Management Through PROxI Implementation. July 2014 Office of the Auditor General of Canada Internal Audit of Document Management Through PROxI Implementation July 2014 Practice Review and Internal Audit Her Majesty the Queen in Right of Canada, represented

More information

Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015

Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015 Sentiment Analysis D. Skrepetos 1 1 Department of Computer Science University of Waterloo NLP Presenation, 06/17/2015 D. Skrepetos (University of Waterloo) Sentiment Analysis NLP Presenation, 06/17/2015

More information

IMF Tune Opens Exchange to Any Anti-Spam Filter

IMF Tune Opens Exchange to Any Anti-Spam Filter Page 1 of 8 IMF Tune Opens Exchange to Any Anti-Spam Filter September 23, 2005 10 th July 2007 Update Include updates for configuration steps in IMF Tune v3.0. IMF Tune enables any anti-spam filter to

More information

Using SMART objectives Online Performance Review and Development Program (PRDP)

Using SMART objectives Online Performance Review and Development Program (PRDP) 1. Introduction The purpose of this PRDP Resource is to provide guidance on developing objectives using the SMART methodology. It also provides guidance on how to complete a Performance Plan and Professional

More information

Record Retention and Digital Asset Management Tim Shinkle Perpetual Logic, LLC

Record Retention and Digital Asset Management Tim Shinkle Perpetual Logic, LLC Record Retention and Digital Asset Management Tim Shinkle Perpetual Logic, LLC 1 Agenda Definitions Electronic Records Management EDMS and ERM ECM Objectives Benefits Legal and Regulatory Requirements

More information

Managing Your E-mails Presentation Given by Tom Forsyth, CRM

Managing Your E-mails Presentation Given by Tom Forsyth, CRM Managing Your E-mails Presentation Given by Tom Forsyth, CRM Presentation to the Austin ARMA Chapter October 9, 2012 Presentation Agenda E-mail Usage and Challenges E-mail as a Business Record E-mail Strategies

More information

Fraud Detection in Online Reviews using Machine Learning Techniques

Fraud Detection in Online Reviews using Machine Learning Techniques ISSN (e): 2250 3005 Volume, 05 Issue, 05 May 2015 International Journal of Computational Engineering Research (IJCER) Fraud Detection in Online Reviews using Machine Learning Techniques Kolli Shivagangadhar,

More information

A Knowledge-Poor Approach to BioCreative V DNER and CID Tasks

A Knowledge-Poor Approach to BioCreative V DNER and CID Tasks A Knowledge-Poor Approach to BioCreative V DNER and CID Tasks Firoj Alam 1, Anna Corazza 2, Alberto Lavelli 3, and Roberto Zanoli 3 1 Dept. of Information Eng. and Computer Science, University of Trento,

More information

Tightening the Net: A Review of Current and Next Generation Spam Filtering Tools
