Flattening Enterprise Knowledge

Do You Control Your Content or Does Your Content Control You?

1 Executive Summary

Enterprise Content Management (ECM) is a common buzz term, and every IT manager knows it's an organizational need. What is often forgotten, though, is that ECM is more than just putting content on a server. Content drives the daily operations of an enterprise. If it didn't, there wouldn't be a need for content, and information could be regularly discarded. Unfortunately, that's not realistic: discarding this information creates liability, drives up costs, and increases the likelihood of duplicated information and processes. Enterprises create content through daily operations and associated tasks, so ECM offerings must not only meet basic storage needs but also function within the framework of various standard operating procedures (SOPs). This makes selecting the right ECM far more critical than just picking the right server to store your information on.

2 Issues

Enterprises operate as well as their content allows

Do you control your content or does your content control you? Enterprises can consist of one office in a small town or hundreds of offices across the globe. Offices have hundreds, if not thousands, of employees and customers. Each employee and customer has associated data (HR records, client relationship information, invoice information, payroll information). This data isn't all in the same language, file cabinet, shared drive, office, or repository. That's billions of words and images that a successful enterprise needs to be able to locate, track, and analyze in the blink of an eye.

Space Costs Money

Daily enterprise operations create content, and content requires space. Whether the content takes the form of common forms, emails, corporate documents, specific hard copy materials (such as architectural drawings), media, or web pages, it requires space.
Space could be a dark closet in the basement of your office or the file server space on your enterprise's network.

Various SOPs

Within each enterprise and each of its offices there are multiple departments, from business development to IT to HR. Each of these departments uses its own SOPs, which create information and content. Not only do these SOPs create massive amounts of information and content, but they also require very specific processes for corporate, legal, and trade compliance. Any ECM must therefore be flexible enough to implement multiple SOPs and to allow them to be edited on the fly.

Paper Based SOPs

By nature, enterprises create content for records management and daily operations in response to workflows and SOPs. This content is crucial to their processes and operations. SOPs vary across enterprises, from business development and HR requirements to IT documentation. Each division or branch of an enterprise requires its own flexible SOP for content and daily operations, creating an ever-growing collection of paper-based workflows.

Discovery

Enterprise operations require leveraging information in a timely and efficient manner. But the more content is created, the harder it becomes to locate and examine this information, and the more cumbersome enterprise content becomes. The longer it takes to convert data into knowledge, the less efficient processes become and the lower the return on information assets.

There's only one solution capable of spanning all of an enterprise's data and information resources, finding drops of information in the sea of data, and discovering the knowledge embedded within: Knowvation™.

3 Knowvation

Knowvation is a complete ECM platform for information-intensive enterprises. Built on robust, field-proven technology, Knowvation is designed to reduce the time and cost knowledge workers spend finding information in the sea of data. Reducing the time needed to turn data into knowledge and action improves business process efficiency and increases return on information assets.

While all portals, applications, and ECMs have some basic search and classification functionality, they are typically limited to keyword search and manual classification. Furthermore, access to information in these repositories is often stove-piped or out of context. With a growing number of enterprise portals, applications, and content management systems (CMS) on different platforms and databases, unified, context-relevant access to information across all information resources is key to increasing the productivity of knowledge workers.

Knowvation can access information across all types of portals, applications, repositories, and file systems. It can apply a variety of indexing, concept and entity extraction, and content filtering methods regardless of content type: unstructured, semi-structured, or structured. Knowvation's Information Discovery Services, such as Boolean, Concept, and Pattern Searching, provide context-relevant, precise, and unified information retrieval. Flatten your enterprise knowledge and take control of your content!

Semantic Indexing

The unification of content begins in Knowvation's initial semantic indexing process, which is based on terms, expressions, and concepts. Rather than simply ranking the frequency of words for indexing, Knowvation performs a semantic analysis to discover synonyms and related concepts embedded throughout the entire collection of content.
This provides the ability to execute searches on keyword relations and phrases in a timely and effective manner.

Categorization and Classification with Taxonomies

Knowvation's Categorizer automatically extracts concepts from documents using taxonomies and creates a semantic signature (metadata) for each document. Taxonomies contain thousands of concepts organized in consistent hierarchies with generic-to-specific relationships. This component provides the knowledge foundation going forward: as new data is discovered, modified, or added, it automatically refreshes the overall data repository so that the user base always sees the latest version.

Knowvation uses scalable and consistent taxonomies for categorization, and flexible and pragmatic classifications for information access. This approach provides users with fast and secure access to relevant information in a portal-like user interface, bringing context to content. Categorization and Dynamic Classification represent a behavioral and technological leap for users and enterprises alike. Rather than being forced to fit searches within the constraints of inflexible categories, users can dynamically create their own information categories based on the context of their search. Further, those categories can inter-relate and display information from widely disparate sources and locations, permitting users to discover knowledge that might otherwise have remained hidden.

Optimized Search Precision & Recall

Through the combination of keyword indexing, advanced linguistics processing, and a variety of basic and advanced search methods, Knowvation achieves optimized precision and recall across vast and diverse information sources. Boolean, Pattern, and Concept modes can be used independently or interactively.
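
For readers less familiar with the two measures named above, the following is a minimal sketch of how precision and recall are conventionally computed for a single query. It illustrates the standard definitions only; the function name and the sample document identifiers are hypothetical and say nothing about how Knowvation measures or tunes its own results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents came back, three were actually relevant.
retrieved_docs = ["doc1", "doc2", "doc3", "doc4"]
relevant_docs = ["doc2", "doc4", "doc7"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A search method that returns more documents tends to raise recall at the cost of precision, which is one reason for combining several search modes.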

Pattern Search

Pattern searches tolerate spelling errors in either the body of the text or the search keywords. The search automatically performs pattern expansion on all keywords, based on the number of words set by the user, and then ranks the retrieved documents. Pattern searching helps overcome deficiencies in OCR quality.

Concept Search

This type of search expands your search term to include semantically related terms. It uses a network of word associations to expand search terms with variations, synonyms, antonyms, and other relationships, and searches the entire document text. This allows the most relevant documents to be delivered to the top of the result list.

Intelligent Browse Structure

This functionality allows administrators to create a rich hierarchical structure to organize archived documents. This web-based navigation tool allows users to walk through a folder structure to retrieve content. Browse can be invoked from a search result screen to allow users to quickly find documents that are related to their searches but may not contain any of the desired search terms.

Advanced Language Processing: Adding Value to Multi-lingual Content

Processing content consistently, even when documents are in different languages and encodings, requires advanced language processing. Knowvation offers Advanced Natural Language Processing features that include Language Identification, Tokenization, Morphology Analysis, Idiom Processing, and Part-of-Speech Tagging. Furthermore, Knowvation uses Special Data Preparation features such as Character Normalization, Stop-word Removal, and Exact Phrase Identification.

Cross Lingual Searching

Knowvation's advanced language processing features enable content to be indexed and categorized with linguistic markers such as language tags, conceptual identifiers, and grammatical categories (i.e., whether the term is a noun or a verb). This allows for more accurate retrieval of results.
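
To make two of the ideas in this section more concrete, here is a minimal sketch of data preparation (character normalization and stop-word removal) followed by a spelling-tolerant match in the spirit of Pattern Search. It is an illustration under stated assumptions, not Knowvation's algorithm: the stop-word list, function names, and similarity cutoff are hypothetical, and Python's standard `difflib` module stands in for whatever pattern expansion the product actually uses.

```python
import difflib
import re
import unicodedata

# Illustrative stop-word list (assumption, not Knowvation's actual list).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to"}


def prepare(text):
    """Normalize characters, tokenize, and drop stop-words."""
    # NFKC folds compatibility forms (e.g. full-width letters) to a common form.
    normalized = unicodedata.normalize("NFKC", text).casefold()
    tokens = re.findall(r"\w+", normalized)
    return [t for t in tokens if t not in STOP_WORDS]


def pattern_match(term, tokens, cutoff=0.8):
    """Return tokens that closely resemble the term despite spelling or OCR errors."""
    return difflib.get_close_matches(term.casefold(), tokens, n=5, cutoff=cutoff)


# Hypothetical OCR output in which the letter "i" was misread as the digit "1".
ocr_text = "The arch1tectural drawings of the Annex"
tokens = prepare(ocr_text)
matches = pattern_match("architectural", tokens)
print(tokens)
print(matches)
```

The cutoff controls the trade-off: a lower value tolerates more damage in the text but admits more false matches.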
Web Editor

The web editor performs necessary maintenance, editing, and cataloging, and manages the metadata associated with digital documents, whether they originate from hard copy source materials or electronically published materials. It allows privileged users to create and edit descriptive metadata right from the search results without loading a client application or installing an ActiveX control.

Scalable Architecture

Knowvation's scalable multi-tier J2EE architecture is built for consistent and fast response times over local and wide area networks. The design supports a distributed system configuration that can be tuned to maintain performance regardless of size. The default web-based client is implemented with JSP and Servlet technology, allowing for simple customization and web integration.

Technical Design

The core of the Knowvation design is the underlying database structure that defines the fundamental relationship of the digital document to its descriptive metadata. The Dublin Core standard is used to describe the metadata of the document, which is represented as XML and stored in the database schema. A best-of-breed full-text search engine is pre-configured to index and retrieve documents from the digital archive based on full-text and metadata queries against the complete data repository. For added flexibility, the Dublin Core metadata framework can be expanded to accommodate an additional 50 customer-defined fields.

Business services implemented as Enterprise Java Beans (EJB) provide extensible APIs for content management, access, and administration. These data objects provide full lifecycle management of digital documents in the archive. Role-based security wraps the entire system, providing granular levels of access to the business objects and the documents themselves that can be tightly controlled by system administrators. The extensible security model can be configured to use LDAP for authentication and directory services, allowing existing enterprise policies to be migrated to the digital archive.

[Figure: Knowvation Architecture]

Application Programming Interface

A complete API exposes all system functionality to facilitate customization and integration of Knowvation with third-party and legacy enterprise applications. Communication with the backend Enterprise Java Beans is done via RMI and JNDI over HTTP/SSL, giving any Java application, local or remote, access to system functionality.

Knowvation can be installed as a complete turnkey application. It can also be customized and integrated with other third-party technologies, such as document capture, forms processing, workflow, and document management systems, using Knowvation's API services.

[Figure: Knowvation API]
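
As a rough illustration of the metadata model described under Technical Design, the sketch below builds a Dublin Core record as XML and attaches customer-defined fields, enforcing the limit of 50 mentioned there. The Dublin Core element names and namespace are those of the public standard; everything else (the `record` wrapper, the `customField` element, the function name, and the sample values) is a hypothetical stand-in, since the actual Knowvation schema is not described in this paper.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

MAX_CUSTOM_FIELDS = 50  # limit stated in the text above


def build_record(dc_fields, custom_fields=None):
    """Build an XML metadata record from Dublin Core and custom fields."""
    custom_fields = custom_fields or {}
    if len(custom_fields) > MAX_CUSTOM_FIELDS:
        raise ValueError("at most 50 customer-defined fields are supported")

    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in dc_fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    for name, value in custom_fields.items():
        # Hypothetical representation of a customer-defined field.
        ET.SubElement(record, "customField", {"name": name}).text = value
    return record


# Illustrative sample values only.
record = build_record(
    {
        "title": "Annex Floor Plan",
        "creator": "Facilities Office",
        "date": "2009-03-14",
        "format": "application/pdf",
        "language": "en",
    },
    custom_fields={"projectCode": "FP-0042"},
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the standard elements in their own namespace lets customer-defined fields be added without colliding with Dublin Core names.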

Document Version Control and Security

Knowvation incorporates full Document Version Control and Security. This feature can be customized to grant specific rights and privileges to certain users and groups for the modification, preservation, and identification of specified documents. The level of custody, possession, and control of documents can be set by the original author or other authorized personnel.

[Figure: Knowvation Record History]

As new authorized versions and provisions of a document are created, Knowvation automatically preserves the original. Reports can be generated automatically that detail each version change, including chain of custody identification, location, date of change, and reasons. Additional reports can include the specifics of an individual user's history of searches, topics, specified sites, and current document custody status, as illustrated in Knowvation Document History. Users can access the original file by clicking the Restore option under the Version column.

MyWare

Knowvation gives individual users the ability to create their own preferences through the MyWare functionality. Users can save specific searches and create notifications for new additions that meet relevant search criteria. With MyWare, users can also customize and edit their layout and setup preferences and create RSS feeds.

[Figure: Knowvation MyWare Functionality]

Speech to Text

Manual transcription of audio can be very expensive. To provide a more cost-effective solution, Knowvation provides its customers with speech to text searching. Text is indexed for search when the speech to text conversion occurs. This allows rapid discovery across large volumes of audio, including oral histories, congressional proceedings, official audio recordings, and signals intelligence. Commonly supported audio file formats include .wav and .mp3. As shown in the accompanying figure, users can search the full text of converted audio using Knowvation's full search feature set.
Selecting a highlighted hit plays the audio from that point.

[Figure: Knowvation Speech to Text Functionality]
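
One way the link between a search hit and an audio position could work is sketched below: each transcribed word is stored with the time at which it was spoken, so a hit can be mapped back to a playback offset. This is an assumption-laden illustration, not a description of Knowvation's internals; the transcript format (word, start time in seconds), the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_audio_index(transcript):
    """Map each transcribed word to the offsets (in seconds) where it is spoken.

    `transcript` is assumed to be a list of (word, start_seconds) pairs,
    as a speech-to-text engine might produce.
    """
    index = defaultdict(list)
    for word, start in transcript:
        index[word.casefold()].append(start)
    return index


def find_playback_offsets(index, term):
    """Return the offsets at which playback could start for a search term."""
    return index.get(term.casefold(), [])


# Hypothetical transcript fragment.
transcript = [
    ("the", 0.0), ("committee", 0.4), ("will", 1.1),
    ("come", 1.3), ("to", 1.5), ("order", 1.7), ("Order", 9.2),
]

index = build_audio_index(transcript)
offsets = find_playback_offsets(index, "order")
print(offsets)
```

A player could then seek to any of the returned offsets when the user selects the corresponding highlighted hit.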

Flexible Displays

Knowvation's Graphical User Interface (GUI) lets users quickly change screen displays to suit their preferences. This flexibility allows users to toggle back and forth between different displays while keeping their search results or the file currently being viewed. Below are a few examples. The first two screenshots show the column and row views for search results. The third screenshot displays the quick toggle to the metadata portion of a file.

[Figure: Knowvation Search Result Column View]
[Figure: Knowvation Search Result Row View]
[Figure: Knowvation Metadata Editor Interface]

About PTFS

With more than 500 partnerships and installations for clients internationally, PTFS offers customized and proven content management solutions. Our core products include Knowvation™, Bibliovation™, and Droneware™. To help organizations focus on their core missions, we also offer highly technical teams that streamline the process of implementing and maintaining custom solutions that best meet their needs.