Free and Open Source Document Management Systems



Anas Tawileh
School of Computer Science, Cardiff University
5 The Parade, Cardiff CF24 3AA, UK
anas@tawileh.net

Abstract

Document Management Systems have long held the interest of academics and decision makers. Aiming mainly to facilitate the creation, distribution and collaborative editing of large numbers of documents, many applications were developed to address the increasingly critical need to manage information efficiently and securely. However, most of these applications were developed in commercial, proprietary software development organisations. The Free and Open Source Software (FOSS) development paradigm has been criticised for the scarcity of FOSS applications in the Document Management domain. This situation has changed significantly over the past few years. Currently, several FOSS Document Management Systems provide functionality that is equivalent to that of their proprietary counterparts. In this tutorial we will provide an overview of some of the available FOSS Document Management Systems, and compare their functionality and maturity. We will also describe the main concepts of Document Management and discuss the benefits of implementing a Document Management System. We will introduce the practical implementation of these applications by demonstrating the setup and operation of the KnowledgeTree Document Management System. We aim to cover content creation and collaboration, document auditing, search facilities, security and access control features. Finally, we will discuss the future trends and directions in the development of FOSS Document Management Systems.

Keywords: Free Software, Open Source, Document Management, KnowledgeTree.

1. Introduction

Appreciation of the importance and value of information and its management has been growing at substantial rates with the rise of the so-called knowledge economy.
Companies and organisations are increasingly realising that their most valuable asset is the intangible wisdom of the organisation. Less emphasis is being placed upon physical resources as the information content of products and services is escalating, giving the most knowledgeable firms the competitive advantage they need to survive in today's fierce marketplace. Organisations' knowledge is preserved in the form of documents, such as product designs, management reports, memos, contracts, training materials, etc. Managing these documents effectively and efficiently has become a necessity.

Most of the technological development in computing has focused on the management of structured electronic information, such as databases and e-mails. However, 80 to 90 per cent of organisational information is in documents rather than structured databases [1]. In a study of the value of different forms of information for managers, participants rated computer-generated reports as the least valuable, and rated other forms of information such as meetings, news and memos as much more valuable sources [2].

In order to facilitate the management of the vast numbers of documents used on a daily basis in organisations, computerised Document Management Systems emerged to apply technology to the production, transmission, storage and retrieval of document based information. The development of these applications was made possible by technological advances in areas such as digital image processing, larger and more reliable data storage and higher bandwidth communication channels.

While many companies have developed and marketed different flavours of Document Management Systems, the Free and Open Source Software community has been criticised for its failure to build dependable, feature rich Document Management applications.
Free and Open Source Software applications, in contrast to commercial, proprietary software applications, are developed in an open environment by a large number of contributors distributed all over the world. The software is released with its source code to enable users to investigate, modify and redistribute it. However, the situation has changed significantly in the past few years, and different FOSS Document Management Systems are now available with features that match those available in their proprietary counterparts.

In this tutorial we aim to provide an overview of the concepts of Document Management and the value it provides for organisations, and then we will present some of the available FOSS applications in the Document Management domain. The practical aspects and issues of implementing these systems will be introduced by demonstrating the setup and operation of the KnowledgeTree Document Management System [3]. We will cover content creation and collaboration, document auditing, search facilities, security and access control. We conclude with a forecast of the future trends and directions in the development of FOSS Document Management Systems.

2. What is Document Management?

A document can be defined as a piece of "recorded information structured for human consumption" [4]. This definition is wide enough to embrace many forms of documents used in organisations, such as books, magazines, news articles, design drawings, video recordings, etc. For our purposes, we identify a document as any item that can be contained in an electronic file, such as e-mails, video files, audio recordings, scanned images, etc. Some Document Management Systems enable the management of physical items that are not contained in electronic files, such as the actual hard-copy books in a library, machines and equipment. However, this functionality falls outside the scope of this tutorial.

Documents can be found in almost every aspect of organisations' daily activities. They store, transmit and communicate the knowledge of the organisation, which distinguishes its operation and processes from those of its rivals. The premium an organisation can charge for its products or services is becoming more and more dependent on the information content embedded in the product or service. Therefore, managing this information in the most effective way is of utmost importance to the long term survivability of any organisation.

Document Management can be described as "a systematic method for storing, locating and keeping track of information that is valuable to a business" [5]. In order to provide such functionality, Document Management Systems use meta-data: pieces of information that describe the content of the document, such as the author's name, date of creation, summary and subject. Meta-data is used to classify documents into useful groups to facilitate their retrieval at later stages.

When applied and implemented properly, Document Management Systems can transform the workplace into a paperless office by storing all the required documents in electronic, easy to manage formats. This will eliminate the need for physical documents unless they are kept to meet regulatory compliance requirements. In any case, managing the documents electronically will be much easier and more cost effective.
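
The role of meta-data described in this section can be illustrated with a minimal Python sketch. It is not taken from any particular Document Management System: the `DocumentMetadata` record, its field names and the `group_by_subject` helper are hypothetical, chosen only to mirror the attributes mentioned above (author, date of creation, summary and subject) and the idea of classifying documents into groups for later retrieval.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentMetadata:
    """Hypothetical meta-data record describing one stored file."""
    file_name: str
    author: str
    created: date
    summary: str
    subject: str


def group_by_subject(records):
    """Classify documents into groups keyed by their subject meta-data."""
    groups = defaultdict(list)
    for record in records:
        groups[record.subject].append(record.file_name)
    return dict(groups)


# Illustrative data only; none of these files come from the tutorial.
records = [
    DocumentMetadata("q1-report.odt", "A. Author", date(2006, 3, 1),
                     "Quarterly management report", "management"),
    DocumentMetadata("pump-design.dwg", "B. Builder", date(2006, 2, 14),
                     "Design drawing for pump housing", "engineering"),
    DocumentMetadata("q2-plan.odt", "A. Author", date(2006, 3, 20),
                     "Planning memo for next quarter", "management"),
]

for subject, files in group_by_subject(records).items():
    print(subject, files)
```

A real system would typically store such records in a database and index them for searching; the grouping step here only shows why consistent meta-data makes retrieval easier.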
The basic functionality required from Document Management Systems includes document creation, version control, security, document sharing, collaborative editing, workflow support and flexible search facilities. Different Document Management Systems satisfy each of these requirements to a greater or lesser degree. The selection of the appropriate software depends on the specific needs of the organisation and the information management environment it employs.

3. FOSS Document Management Systems

The concept of what constitutes a Document Management System has been the subject of much debate. Some argue that Content Management Systems (CMS) used to build and manage information content (such as Drupal [6] and PostNuke [7]) can be considered Document Management Systems because they facilitate the management and collaborative creation of information. The second point of view rejects this claim on the grounds that Content Management Systems usually deal with structured data stored in databases. This limitation prevents the accommodation of other forms of documents, such as video recordings and design drawings. Proponents of this view also argue that for software to be considered a Document Management System, it should provide powerful meta-data processing capabilities, and only a few applications have this essential feature.

Certainly, some Content Management Systems are continuously evolving, mainly because of the open, collaborative nature of FOSS development. These applications may implement enhanced meta-data processing features, which will make the distinction even more difficult. On the other hand, arguments have been made that Content Management Systems are much better at managing information because they deal effectively with individual pieces of information, while Document Management Systems deal with the document in its entirety (a document may be composed of different pieces of information).
Another claim in favour of Document Management Systems criticises the emphasis on the presentation layer in Content Management Systems, while what is really needed in a Document Management System is support for the scalable management of large documents [8]. For the purposes of this tutorial, we will identify Document Management Systems according to the definition of a document presented in the previous section, in its wider form that embraces all documents that can be saved electronically.

Critics of the FOSS community have argued that while it produced software in different application domains that was able to compete against commercially developed software, it failed to provide appropriate alternatives to proprietary Document Management Systems. However, this situation has changed over the past few years, and currently several Document Management Systems are released under a Free or Open Source license. We present a few examples of FOSS Document Management Systems.

CPS Project [9]: Collaborative Portal Server is designed to handle documents and web based presentation formats such as pages and news items. It has a robust role based framework to enhance document security. CPS supports basic meta-data for most content types, and additional attributes can be added through its easy to use web interface. It includes a powerful versioning system to track document changes, and has a complete workflow framework (CPSWorkflow). However, CPS does not support the WebDAV standard [10], which allows users to collaboratively edit and manage files on remote web servers.

OWL [11]: the development of OWL is driven entirely by the community. Its main feature is simplicity, though at the price of an unattractive user interface. The system has basic versioning capabilities and a simplistic permission system. Documentation of the project is flimsy and standards support is rather weak. However, the project's community is highly active and major contributions are built around the project.

Open steam [12]: this project, with its roots in academia, has a fairly rich feature set. It is developed around the idea of virtual rooms, where uploaded documents are organised and stored. This approach allows for great flexibility in connecting rooms and facilitates collaboration. Collaboration is further enhanced through a shared whiteboard to aid visual communication. The system interface is simple and user friendly. On the other hand, the search capabilities of the system are not very powerful.

Plone [13]: originally a Content Management System, but with good document management capabilities. The main distinguishing feature of Plone is the large size of its active community, which is accelerating the evolution of the software. The interface is very intuitive and user friendly, and the underlying access control mechanism is quite flexible. Plone has strong support for meta-data and its workflow features are robust. It also has powerful search features. On the down side, Plone does not have an integrated versioning system, which may prevent it from being implemented in scenarios where versioning support is very important. Add-on solutions for versioning are still immature and cannot be relied upon.

KnowledgeTree [3]: probably the most 'Document focused' FOSS application. It enjoys a relatively large user base, and a good deal of simplicity and ease of use. Because of its strong focus on Document Management, the workflow model and versioning capabilities implemented in KnowledgeTree are very useful in facilitating the management of large numbers of documents. Security is provided through a robust role based model, and the software has an advanced search feature which enables the user to construct complex queries. The main disadvantage of KnowledgeTree is its poor standards support, which can be overcome through the proprietary product Baobab developed by Jam Warehouse [14].

4. Implementing FOSS Document Management Systems

The best starting point for the implementation of a Document Management System is a precise and clearly defined set of requirements. These requirements should be based on a careful analysis of the organisation's needs and work processes. The importance of eliciting and defining requirements cannot be overemphasised. Different Document Management Systems have different characteristics, as presented in the previous section. Without a clear vision of what the organisation really needs and what it wants to achieve with the implementation of the Document Management System, selecting the most appropriate software will be a difficult task. The ultimate benefits obtained by the organisation depend on the best possible fit between the organisation's requirements and the implemented Document Management System.

After the requirements are identified and formulated clearly and unambiguously, and the appropriate matching FOSS Document Management System is selected, it is advisable to commence the implementation in small increments. Ideally, the implementation should start with a small scale pilot project to test the applicability of the design decisions and to identify any possible problems and areas for improvement. Upon the successful test and evaluation of the pilot project, the system can be rolled out to the whole organisation.

We will demonstrate the setup and operation of Document Management Systems by using KnowledgeTree as an illustrative example. Installation of KnowledgeTree is fairly easy and straightforward. It requires the MySQL Database Server [15], the Apache Web Server [16] and the PHP Scripting Language [17]. Each of these packages can be installed independently and configured so that they function properly together. However, a much easier way is to use the XAMPP [18] package, which will automatically install and configure all these applications with minimal user intervention. Once the required applications are installed and configured, KnowledgeTree can be downloaded freely and installed using the integrated installer. The installation process will ask the user for a few configuration parameters. When the system is installed, the user can log in through its main web based login interface using the default password provided with the installation (which should be changed as soon as the user starts using the system). KnowledgeTree's Dashboard is shown in Figure 1.

Figure 1: KnowledgeTree Dashboard

4.1. Document Creation, Auditing and Collaboration:

Documents are usually added to the system through the Add Document web based user interface. KnowledgeTree has the ability to upload multiple documents simultaneously in a zipped format. After the document is added to the system's repository, the system asks for details of the document to be stored as meta-data. When the document is added to the system for the first time, it is assigned the version number 0.1. This number is increased in increments of 0.1 each time the document is checked in after editing.

The Document Details page is the main interface to interact and work with the document. It includes all the actions that can be performed on the document, such as download, check out, check in, edit meta-data, delete, move, version comparison, archive, etc. The transaction history of any document provides a complete account of all the actions performed on the document since its submission to the repository. Version history allows different versions of any document's meta-data to be compared, which supports auditing and change control.
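
The version numbering and transaction history described above can be sketched in a few lines of Python. This is only an illustration of the check-out and check-in cycle, not KnowledgeTree's actual implementation: the `VersionedDocument` class and its method names are hypothetical, and `Decimal` is used so that repeated 0.1 increments stay exact.

```python
from decimal import Decimal

INITIAL_VERSION = Decimal("0.1")  # version assigned when a document is first added
VERSION_STEP = Decimal("0.1")     # increment applied on every check-in


class VersionedDocument:
    """Hypothetical model of a document with check-out/check-in versioning."""

    def __init__(self, name):
        self.name = name
        self.version = INITIAL_VERSION
        self.checked_out = False
        # Transaction history: every action performed since submission.
        self.history = [("add", self.version)]

    def check_out(self):
        if self.checked_out:
            raise RuntimeError(f"{self.name} is already checked out")
        self.checked_out = True
        self.history.append(("check_out", self.version))

    def check_in(self):
        if not self.checked_out:
            raise RuntimeError(f"{self.name} is not checked out")
        self.checked_out = False
        self.version += VERSION_STEP
        self.history.append(("check_in", self.version))


doc = VersionedDocument("contract.odt")
doc.check_out()
doc.check_in()
print(doc.version)   # 0.2
print(doc.history)
```

Keeping the history as an append-only list mirrors the idea that the transaction history gives a complete account of the actions performed on a document.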

One of the most useful features of any Document Management System is workflow management. This feature enables the user to determine the processing steps of any document in the organisation during its complete lifecycle. Workflows are usually designed by administrators, with users having the ability to assign specific workflows to documents they have control over. The Document Details interface provides information about the current status of the document in the workflow. The Document Details interface is shown in Figure 2.

Figure 2: Document Details Interface

4.2. Security and Access Control Features:

KnowledgeTree has a robust permission management and security model based on roles and groups. The system administrator can use the Document Management System (DMS) interface to create or manage groups, users and roles. User groups are allocated to roles on a per-directory basis and are inherited from the root folder of the DMS. Roles can be assigned to users or to groups. Permissions assigned to roles or groups include Read, Write, Delete, Add Folder and Manage Security, in addition to other custom values.

The system also has the capability of assigning dynamic permissions to users and groups based on certain rules. These rules may be applied to the document's meta-data, contents, or transactional information. Access control and user permissions can be applied according to the state of the document in the workflow. For instance, a specific group of users may not be allowed to view certain documents until they are approved by a manager.

4.3. Workflow Management:

Administrators can configure the workflows that should be followed by specific documents during their lifecycle. These workflows should reflect the current business processes implemented in the organisation. Workflows consist of states, which describe where in the lifecycle the document is, and transitions, which indicate the next steps within the lifecycle of the document.
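
The notions of states and transitions, together with the state-dependent access control mentioned in Section 4.2, can be sketched as a small state machine. The example below is a simplified illustration in Python and does not reflect how KnowledgeTree stores or enforces workflows: the state names, transition names, group names and the `WorkflowDocument` class are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical invoice workflow: (current state, transition name) -> next state.
TRANSITIONS = {
    ("draft", "submit"): "submitted",
    ("submitted", "approve"): "approved",
    ("submitted", "reject"): "draft",
}

# Hypothetical rule: the workflow states in which each group may view the document.
VIEWABLE_STATES = {
    "finance_managers": {"draft", "submitted", "approved"},
    "all_staff": {"approved"},
}


@dataclass
class WorkflowDocument:
    name: str
    state: str = "draft"
    log: list = field(default_factory=list)

    def apply(self, transition):
        """Move the document to the next state if the transition is allowed."""
        key = (self.state, transition)
        if key not in TRANSITIONS:
            raise ValueError(
                f"'{transition}' is not allowed from state '{self.state}'"
            )
        next_state = TRANSITIONS[key]
        self.log.append((self.state, transition, next_state))
        self.state = next_state

    def can_view(self, group):
        """State-dependent access: a group sees the document only in permitted states."""
        return self.state in VIEWABLE_STATES.get(group, set())


invoice = WorkflowDocument("invoice-001.pdf")
print(invoice.can_view("all_staff"))   # False: not yet approved
invoice.apply("submit")
invoice.apply("approve")
print(invoice.can_view("all_staff"))   # True: approved documents are visible
```

Representing the workflow as a lookup table keeps the allowed lifecycle in one place, so an administrator-style change to the process means editing data rather than code.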
The system allows the administrator to determine the groups or users to be notified when the document reaches a specific state in the workflow. For instance, everyone assigned the role of finance manager may be informed when an invoice has reached the state "submitted" and requires their review or approval. Transitions describe how documents are moved from one state to another. Usually, transitions can only be performed by people who hold a specific role or belong to a specific group.

6. Conclusions

Information has become the most valuable asset for almost any organisation or individual. Dependence on valid, easily accessible information is increasing at substantial rates, and enormous amounts of information are being produced by almost every human activity. However, technological developments over the past few decades have focused only on a specific form of information: structured data organised in databases or other special configurations. Unfortunately, the vast majority of useful information is stored in different formats, such as word processed documents, video recordings, project schedules, design drawings, etc. Therefore, an effective and efficient way to manage this information became a necessity.

Document Management Systems are software applications that were designed to address these needs by managing documents that fall outside the structured data domain. Many Document Management Systems are available from different software vendors. However, most of these applications are proprietary, and developed within commercial settings. Some people argue that the Free and Open Source Software community did not pay much attention to the Document Management domain, and that it failed to develop applications with decent functionality to compete against the commercial offerings.
We showed that this situation has changed dramatically in recent years, and that many Document Management applications released under a free or open source license are currently available. We reviewed different examples and explored the strengths and weaknesses of each. We then selected an application with a particularly strong focus on Document Management, KnowledgeTree, for use as an illustrative example of the implementation and operation of FOSS Document Management Systems.

Each application presented has unique features. Selection of the most appropriate Document Management System should be based on comprehensive requirements elicitation and analysis. The ultimate success of any Document Management System implementation relies on the fit of the system to the organisation's needs and its current environment and processes.

The FOSS development process has a completely different set of mechanisms governing the evolution of its resulting software than those encountered in commercial software development organisations. Availability of source code and the freedom to review, modify and redistribute it will stimulate the interest of many talented developers to contribute to and enhance these applications. Therefore, the feature set and functionality of FOSS Document Management Systems are expected to grow richer with the increasing appreciation of the value of information and the need to manage data stored in unstructured formats. Another interesting trend to observe is the convergence of Document Management with other information management applications, such as Content Management Systems. This trend is already manifesting in the incorporation of richer Document Management functionality in Content Management Systems such as Plone.

References

[1] Sprague, R.H., "Electronic Document Management: Challenges and Opportunities for Information Systems Managers", MIS Quarterly 19, 1, 1995, pp. 29-50.
[2] McLeod, R., Jr., and Jones, J.W., "A framework for office automation", MIS Quarterly 11, 1987, pp. 86-104.
[3] KnowledgeTree Document Management System, http://www.ktdms.com
[4] Levien, R.E., "The Civilizing Currency: Document and Their Revolutionary Technologies", Xerox Corporation, Rochester, NY, 1989.
[5] Bannan, J., "Intranet Document Management: A Guide for Webmasters and Content Providers", Addison-Wesley, Boston, USA, 1997, ISBN: 0201873796.
[6] Drupal Content Management System, http://www.drupal.org
[7] PostNuke Document Management System, http://www.postnuke.com
[8] Perez, C.E., "Open Source Document Management Solutions Written in Java", http://www.manageability.org/blog/stuff/open-sourcedocument-repository/view
[9] CPS Project, http://www.cps-project.org, accessed 28
[10] Web-based Distributed Authoring and Versioning (WebDAV), http://www.webdav.org, accessed 28 March 2006.
[11] OWL, http://awl.sourceforge.net, accessed 28 March 2006.
[12] Open steam, http://www.open-steam.org/, accessed 28
[13] Plone Content Management System, http://www.plone.org
[14] Jam Warehouse, http://www.jamwarehouse.com/
[15] MySQL Database Server, http://www.mysql.com/
[16] Apache Web Server, http://www.apache.org/, accessed 28
[17] PHP Scripting Language, http://www.php.net/, accessed 28
[18] XAMPP, http://www.apachefriends.org/en/xampp.html