Personal Digital Document Management
The process of acquiring, storing, managing, retrieving and using digital documents

Sarah Henderson
Department of Information Systems and Operations Management
University of Auckland
s.henderson@auckland.ac.nz

Introduction
Files overloaded Mars Probe
Monday 2 th January, 200
http://news.bbc.co.uk/1/hi/sci/tech/117.stm

Introduction
Documents are now primarily created, stored and communicated electronically
Anecdotal evidence suggests personal document management software (i.e. Windows Explorer) is not very usable
Improvements are challenging:
  Embedded nature of document management activities
  Hard to imagine alternative systems
  Individual differences

and Space
Lifestreams (Freeman & Gelernter)
Data Mountain (Robertson)
TimeScape (Rekimoto)
Windows XP Task Bar (Microsoft)
Presto/Vista (Dourish, Edwards, LaMarca & Salisbury)
Gmail labels & Flickr tags

Metadata in the hierarchical folder system
Setting up a hierarchy of folders is essentially equivalent to pre-defining a set of attributes or keywords that can be applied to a document.
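
The equivalence between folders and keywords can be illustrated with a minimal sketch. It assumes that each enclosing folder name counts as one keyword; the function name `path_to_attributes` and the file name `week1.doc` are hypothetical, and the folder names are borrowed from the code examples later in the talk.

```python
from pathlib import PurePosixPath

def path_to_attributes(path: str):
    """Split a file path into (set of folder-name keywords, file name)."""
    p = PurePosixPath(path)
    # Each enclosing folder name is treated as one pre-defined keyword.
    keywords = set(p.parent.parts)
    return keywords, p.name

# Two hypothetical locations for the same file, with the folders nested
# in a different order.
k1, name1 = path_to_attributes("INFOSYS 222/Assignment 5/Lecture Notes/week1.doc")
k2, name2 = path_to_attributes("Lecture Notes/INFOSYS 222/Assignment 5/week1.doc")

print(sorted(k1))   # ['Assignment 5', 'INFOSYS 222', 'Lecture Notes']
print(k1 == k2)     # True: same keywords, only the nesting order differs
```

Seen this way, the folder hierarchy adds one thing that a keyword set does not have: a fixed order in which the keywords must be chosen.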

File System Snapshot
Software has been custom-written to gather information about the file system:
  Folder names and structures
  File names and types
  Date created, last accessed and modified
Supplements the subjective accounts
Analysis gives:
  Number of files and folders
  File type profile
  Depth and breadth of hierarchy
  File age profile
  Hotspot analysis

File System Statistics
Years Experience, Files, Folders, Empty Folders, Files per Folder, Average Depth, File Duplication %, Folder Duplication %, Shortcuts on Desktop
A,790 228 1 1. 2 11.7% 12.7% 5
B 20,27 2,7 59 7.5 7.% 81.8%
C,79 85 50. 2.8% 2.5% 9
D 1,55 211 7 7. 18.% 7.9% 5
E,021 15 9 7..0%.5%
F,1 09 72 5.9 28.5% 51.0% 12

Desktops (colour coded by number of files): participants A, B, C, D
My Documents (colour coded by number of files): participants A, B, C, D
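
The snapshot software itself is not shown in the talk. The following is only a rough sketch of how a few of the listed statistics could be gathered with Python's standard library. The function name `snapshot` and the metric definitions are assumptions: the root counts as a folder, depth is measured in folder levels below the root, and average depth is the mean over files.

```python
import os
from collections import Counter

def snapshot(root: str) -> dict:
    """Walk `root` and collect simple file system statistics."""
    root = os.path.abspath(root)
    files = folders = empty_folders = depth_total = 0
    type_profile = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        folders += 1  # assumption: the root itself counts as a folder
        if not dirnames and not filenames:
            empty_folders += 1
        # Depth of this folder below the root (the root is depth 0).
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        for name in filenames:
            files += 1
            depth_total += depth
            # File type is approximated by the lower-cased extension.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            type_profile[ext] += 1
    return {
        "files": files,
        "folders": folders,
        "empty_folders": empty_folders,
        "files_per_folder": files / folders if folders else 0.0,
        "average_depth": depth_total / files if files else 0.0,
        "type_profile": dict(type_profile),
    }
```

A call such as `snapshot(os.path.expanduser("~/Documents"))` (a hypothetical location) returns a dictionary of counts. Dates created, accessed and modified could be added per file with `os.stat`, and duplication analysis would need content hashing; neither is attempted here.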

Proportion of folders with each naming code
[Chart: proportion of folders (%) for each naming code: Genre, Task, Topic, Time, Course, Person, FileType, Temporary, Source, Security, Unknown, Multiple]

Code Descriptions
Genre: Indicates that the contents of the folder are a particular class or type of document, with a commonly recognized form and structure. Examples: Lecture Notes, Presentations, Timesheets, Budgets, Letters.
Task: Indicates that the contents of the folder are related to a task, project, event or some other type of activity. Examples: Assignment 5, Lec01, PhD, recruitment, For DSS Presentation.
Course: Indicates that the contents of the folder are related to a specific course. (This is a special case of Task above.) Examples: Database Systems, 222, INFOSYS 222.
Topic: Indicates that the contents of the folder are all about a particular subject matter. Examples: Web development, Database Architectures, JavaScript.

Code occurrence by participant (top four codes)
[Chart]

Folder hierarchy
B, C & D are Course Managers
E & F are Lecturers
Participant   Scheme                                                              Confidence
A             Time > [various]                                                    Low
B             Time > Course > Task                                                Medium
C             Genre > Time                                                        Medium
D             Task > Course > Time > Genre                                        High
E             Task > Time > Course > Genre or Task > Course > Time > Genre        High
F             Genre/Task > [various]                                              Low
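
The schemes in the Folder hierarchy table can be read as different orderings of the same codes. As a small sketch, assuming a document carries one value per code, the folder path it receives under each scheme can be derived mechanically. The function `folder_path`, the sample document and the year value "2005" are hypothetical illustrations, not data from the study.

```python
def folder_path(values: dict, scheme: str) -> str:
    """Build the folder path a document gets under a 'A > B > C' scheme."""
    codes = [code.strip() for code in scheme.split(">")]
    return "/".join(values[code] for code in codes)

# Hypothetical document described by one value per code.
doc = {
    "Task": "Lec01",
    "Course": "INFOSYS 222",
    "Time": "2005",          # assumed value, for illustration only
    "Genre": "Lecture Notes",
}

# Schemes taken from the table for participants B and D.
print(folder_path(doc, "Time > Course > Task"))
# 2005/INFOSYS 222/Lec01
print(folder_path(doc, "Task > Course > Time > Genre"))
# Lec01/INFOSYS 222/2005/Lecture Notes
```

The same description yields a different location under each participant's scheme, which is one way to see why no single ordering suits everyone.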

Facets

Desktop Screenshots
[Screenshots: participants A, B, C, D, E, F]

File Management vs Document Management
Distinction between files and documents
Example: a status report that went through five drafts, was edited once by the boss and sent to a client. This is actually six separate files in the file system plus two in the email system, with no relationship between any of them except perhaps a similar file name, but that is up to the user.
An interface that recognizes and manages documents (rather than files) could help overcome the version management problems reported by these participants.

Useful Distinctions between users
Browse oriented vs Search oriented
Post-structure vs Pre-structure
Deleter vs Hoarder
Conceal oriented vs Display oriented
Tree averse vs Tree oriented
Subject/Project oriented
Specify names oriented vs Use default names
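
The status report example can be sketched as a data structure in which one document owns all of its files, wherever they are stored. This is a minimal illustration under assumptions, not a design from the talk: the class names `StoredFile` and `Document`, all file names, and the choice of which two copies sit in email are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class StoredFile:
    location: str  # "file system" or "email"
    name: str      # hypothetical file name

@dataclass
class Document:
    title: str
    files: list = field(default_factory=list)

    def add(self, location: str, name: str) -> None:
        """Attach one stored file to this document."""
        self.files.append(StoredFile(location, name))

    def count_by_location(self) -> dict:
        """Count this document's files per storage location."""
        return dict(Counter(f.location for f in self.files))

report = Document("Status report")
for n in range(1, 6):                       # five drafts
    report.add("file system", f"status_report_draft{n}.doc")
report.add("file system", "status_report_boss_edit.doc")
report.add("email", "status_report_boss_edit.doc")   # assumed email copy
report.add("email", "status_report_final.doc")       # assumed email copy

print(report.count_by_location())  # {'file system': 6, 'email': 2}
```

With an explicit link like this, the relationship between the eight files no longer depends on the user choosing similar file names.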

Conclusion
There is no best way to combine these dimensions into a hierarchy.
Forcing them into a hierarchy results in duplication.
A facet-based document management system would be a promising approach.
Some opportunities for automatic software support of document management (Person, Source, Topic, Time, File Type), but more research needed.
More research needed on genre in personal digital document management.

Questions?

References
Dourish, P. et al. (1999) Presto: An Experimental Architecture for Fluid Interactive Document Spaces. ACM Transactions on Computer-Human Interaction (2), 1-11.
Freeman, E. and Gelernter, D. (199) Lifestreams: A Storage Model for Personal Data. SIGMOD Bulletin 25 (1), 80-8.
Rekimoto, J. (1999) Time Machine Computing: A Time-centric Approach for the Information Environment. In UIST'99 Symposium on User Interface Software and Technology, Asheville, North Carolina, USA, 5-5.
Robertson, G.G. et al. (1998) Data Mountain: Using Spatial Memory for Document Management. In UIST'98 Symposium on User Interface Software and Technology, pp. 15-12, ACM Press.