SEMANTIC-BASED AUTHORING OF TECHNICAL DOCUMENTATION

R Setchi, Cardiff University, UK, Setchi@cf.ac.uk
N Lagos, Cardiff University, UK, LagosN@cf.ac.uk

ABSTRACT

Authoring of technical documentation is a knowledge-intensive activity that requires knowledge of various domains and information from several product life-cycle phases. This paper presents two complementary semantic-based methods for creating product support documentation and delivering it to the user. The first approach is based on using Concurrent Engineering and Product Data Management (PDM) technology. The second approach uses ontologies to capture the semantic complexity of the product support domain. A promising direction for further research is the integration of these two approaches.

KEYWORDS: Semantic Modeling, Product Lifecycle Management, Knowledge Engineering, Hypermedia Documentation.

1. INTRODUCTION

Informative, up-to-date and well-prepared product documentation can provide time and cost savings to users, product manufacturers and suppliers. However, authoring of product documentation is not easy. It is described in [1] as a composite skilled task that requires special knowledge and experience. It is a knowledge-intensive activity that requires knowledge of various domains and information from several product life-cycle phases.

Product support is an area currently undersupported by tools. The problem of cost-effectively producing technical documentation becomes especially apparent when product support is compared to other phases of the product lifecycle, which are supported by tools for computer-aided design (CAD), product data management (PDM), computer-aided manufacture (CAM) and computer-aided process planning (CAPP) [2]. Without similar tools, technical authors experience difficulties in collecting and integrating product data, and in structuring and maintaining technical documentation.
The aim of this work is to demonstrate how these problems can be addressed by employing semantic modeling techniques. In particular, the paper presents two complementary approaches to semantic-based authoring of hypermedia product documentation. The first approach is based on using Concurrent Engineering (CE) and PDM technology, while the second approach uses ontologies to capture the semantic complexity of the product support domain.

The remainder of the paper is organized as follows. Section 2 briefly reviews recent developments in modeling hypermedia documentation for product support. Section 3 presents two complementary approaches to semantic-based authoring of technical documentation. Section 4 concludes the paper.

2. BACKGROUND AND RELATED WORK

CE is an accepted systematic methodology for integrated product development. CE focuses on considering all product life cycle issues as early as possible in product development. It also emphasizes the importance of cross-disciplinary collaboration within the design stages of the product life cycle. The adoption of CE principles in documentation development directly influences the authoring of technical documentation. For example, product development and support knowledge could be distributed across a network of experts collaborating in parallel. Experts across the product lifecycle phases responsible for a given engineering activity would
directly contribute to the authoring process. Technical authors would have access to all available information sources and would utilize them in the best way.

Furthermore, the importance of integrating knowledge engineering practices into the development of product support systems has been acknowledged by several researchers. Earlier studies focused on introducing reasoning in product support systems, mainly as diagnostic tools. Gradually, the focus in this area of research shifted from the reasoning mechanism to the knowledge base. Recent studies focus on user classifications, product and/or task structures, and their integration. For example, the adaptive product manual developed by Pham and Setchi [3] is based on using product, user and task models. These models are integrated within a knowledge-based system which uses cases that represent previously solved situations. A similar approach is adopted by Brusilovsky and Cooper [4], who utilize integrated domain, task, and user models supporting the maintenance of equipment.

The latest research in product support indicates a trend towards semantic data modeling. For instance, Pham et al. [5] employ a semantic data model to generate virtual documentation. The model is based on data usage analysis, which abstracts the intended purpose of the product and task data elements, and their functional characteristics.

This paper presents two complementary semantic-based approaches to modeling product support knowledge. The first approach is based on using CE and PDM technology, while the second approach uses ontologies.

3. TWO APPROACHES TO SEMANTIC MODELLING OF HYPERMEDIA DOCUMENTATION

3.1 Semantic Hypermedia Authoring Using CE Principles

This approach focuses on the development of an integrated virtual environment for product and documentation development that is shared by both product developers and technical authors.
The core of this virtual collaborative space is the integrated database storing data accumulated throughout all stages of the product and documentation life cycles. This data is structured to facilitate updating and management of product documentation through associative links with product data sources.

This approach advocates the building of product documentation from logically linked, semantically distinguishable and classifiable information objects (IOs) [2]. An IO is defined as a data structure that represents an identifiable and meaningful instance of information in a specific presentation form [6]. These IOs are structured in the virtual collaborative space using product data and product documentation models. These are information models that describe the underlying structures and relationships of product data and documentation.

The product data model is a structured representation of information about a product. It integrates the life-cycle information necessary to completely define the product and each of its components for the purposes of design, engineering analysis, manufacture, test and product support. The structure of the product data model follows the topology of the product (product, systems, subsystems, assemblies, etc.). Similarly, the product documentation model is a structured representation of the basic document components (chapters, sections, subsections, etc.). The product documentation model contains objects that are created during documentation development. It uses the product data model as a primary source of information.

To help the process of direct association between these two models, they are constructed identically as sets of manageable information elements (IEs). IEs are annotated using attributes that enable product data to be easily organized, addressed, classified and maintained. As Figure 1 shows, each product or documentation element contains a definition and data.
The element definition consists of attributes used to identify the element (id number, name, quantity, type, and version). Element data comprise one or more information objects (IOs).
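
The structure of an information element can be illustrated with a minimal Python sketch. It is not the Pro/INTRALINK data model; the class and field names are hypothetical, chosen only to mirror the attributes listed above (id number, name, quantity, type, version) and the rule that element data comprise one or more IOs. The quantity and version values in the usage note are likewise assumed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ElementDefinition:
    """Attributes that identify an information element (IE)."""
    id_number: str
    name: str
    quantity: int
    element_type: str
    version: str


@dataclass(frozen=True)
class InformationObject:
    """One item of element data, e.g. a CAD model or a video clip."""
    name: str


@dataclass
class InformationElement:
    """An IE pairs one definition with one or more information objects."""
    definition: ElementDefinition
    data: List[InformationObject] = field(default_factory=list)

    def add(self, io: InformationObject) -> None:
        # Attach a further IO to the element data.
        self.data.append(io)

    def is_complete(self) -> bool:
        # Element data must comprise at least one IO.
        return len(self.data) >= 1
```

For example, `InformationElement(ElementDefinition("DV-1-1-1-2-0-0", "outside", 1, "Subassembly", "A"))` is incomplete until an IO such as `InformationObject("outside.asm")` is added to it.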
Figure 1. Information elements.

All IOs associated with a particular IE inherit its definition. The IOs are described using attributes identified through an analysis of the main documentation components and the principal sources of product data. The analysis focused primarily on the function, content and presentation form of the IOs. As Figure 2 shows, the IOs are classified using four groups of attributes which capture the semantics of these objects. Two of the groups, definition and processing, contain attributes used to address, process and maintain data. The other two groups, type and assessment, describe information objects in terms of their contextual meaning and presentation form.

Figure 2. Attributes of information objects.

Table 1 shows an example of existing relationships between IOs, and the attributes used to describe them. The hypermedia document lubric_points_10.htm in Table 1, which illustrates lubricating procedure 10, contains a video file, movie_pt10.avi, created using the CAD models of the braking system and its components. The identification number of this IE (DOC-8-1-2-4-0-0-0-0) indicates that it belongs to the product documentation model, and is included in chapter 8 (Lubrication Chart), section 1-2 (Lubrication points), and subsection 4 (point 10) of the technical documentation. The ID of the information element outside.asm (DV-1-1-1-2-0-0) is
composed in a similar way. It shows that this IE belongs to the product data model; its coding follows the topology of the product. In addition, Table 1 includes IOs generated using the CAD model of the subassembly "outside", such as the VR animation outside.iv and the video clip outside_a_1.avi. This table comprises data extracted from the integrated virtual environment for collaborative working developed using PTC Pro/INTRALINK. Currently, the integrated database contains 1558 files including 336 hypermedia documents, 656 images, 213 VR models, 166 CAD models, 63 3D animations, 75 video files, 27 audio files, etc.

IO Name               IE ID                IE Type      IO PLC Phase    IO PLC Activity  IO Form   IO Function
lubric_points_10.htm  DOC-8-1-2-4-0-0-0-0  Subsection   Documentation   Modeling         HM        Procedure
movie_pt10.avi        DV-1-1-1-1-0-0       Assembly     Documentation   Maintenance      Movie     Procedure
outside.asm           DV-1-1-1-2-0-0       Subassembly  Product Design  Modeling         CAD       Structure
020700_1.prt          DV-1-1-1-2-0-1       Part         Product Design  Modeling         CAD       Structure
outside1.unv          DV-1-1-1-2-0-0       Subassembly  Product Design  Modeling         Text      Definition
outside.iv            DV-1-1-1-2-0-0       Subassembly  Documentation   Modeling         VR        Structure
outside_a_1.avi       DV-1-1-1-2-0-0       Subassembly  Documentation   Training         Movie     Description
out_exp.tif           DV-1-1-1-2-0-0       Subassembly  Documentation   Prototyping      Graphics  Structure

Abbreviations: IO - Information Object, IE - Information Element, ID - Identification Number, PLC - Product Life Cycle, VR - Virtual Reality, HM - Hypermedia, CAD - CAD Model.

Table 1. Relationships report generated in Pro/INTRALINK.

3.2 Semantic Hypermedia Authoring Using Ontological Principles

This approach aims at organizing data in a way that ensures the homogeneity, validity and effectiveness of the product support information generated. It is based on using ontological principles and takes advantage of the formality and richness of ontological knowledge modeling.
This approach employs some widely accepted notions in knowledge engineering, such as concept, instance, and relation. In addition, several specialized structures, such as domain, connector, knowledge-specifier and arc, are employed to enrich the representation ability of the system.

Domain is a hierarchy of concepts (i.e. concepts linked with the is-a relation), which describes a part of the real world and is a concept itself. For example, a domain Product (D) (see Figure 3) includes the hierarchy car is-a vehicle is-a Product. Domains identify where each abstraction hierarchy is contained, thus clarifying the mapping to different real world components.

Associating concepts with the is-a relation and developing abstraction hierarchies (i.e. domains) is not enough in the case of a product support system. A relation that connects different domains is also needed. Take the example of the Product (D) and Assembly (E) domains, which are linked with the Product has Assembly relation (F). Relations that associate different domains are called connectors. Connectors express part-of, is-composed-of, is-realized-with, and aggregation relations. Connectors can also link instances that belong to different domains, if the concepts to which these instances belong are included within these domains and the domains are also linked with connectors. Connectors cannot link information structures describing different knowledge areas (i.e. product, task, and user).

Knowledge-specifier is a property that is considered significant within the application domain (i.e. product support) and therefore is represented as a concept. For example, it may be useful to know whether a product is viewed as complex or not, since that determines the way in which the support provided is adapted. Therefore, a knowledge-specifier called ProductType (G), which defines a product's complexity, is introduced. The level of the product support will be determined using mechanisms that include ProductType.
A simple example is the following predicate, which states that if the product is complex, the user is a novice and the task is complex, then the product support information generated should be presented in greater detail.
Figure 3. A knowledge base developed using ontological principles (fragment).

$$(\mathit{ProductType} = \mathit{complex}) \land (\mathit{User} = \mathit{novice}) \land (\mathit{Task} = \mathit{complex}) \Rightarrow \mathit{ProductSupport} = \mathit{detailed} \qquad (1)$$

Arc is a relation that links the knowledge-specifiers with other concepts and/or domains. For example, ComplexProduct (H) can refer to the concepts car and clutch (I). Furthermore, knowledge-specifiers and arcs are used to relate different knowledge areas. In Figure 3, the task knowledge set (part of which is the task domain (J)) is related to the product knowledge set (composed of the product, assembly, subassembly, and part domains) via ProductType. The arcs in this case establish a constraint that is described by the following predicate.

$$\mathit{Arc}(\mathit{ProductType}, \mathit{Task}) = \mathit{false} \Rightarrow \mathit{ProductSupport} = \mathit{null} \qquad (2)$$

Predicate 2 states that if there is no arc between the knowledge-specifier and the task, then the product support system should not deliver a solution. This corresponds to the case when both the product and the task are too complex for the user and (s)he would not be able to perform the task even with the help of the system (e.g. assembling a car). Such a situation is considered a safety hazard and should not be allowed. More complex predicates can be developed that take into consideration the status of the user and relate it to the above arc, which determines the amount of support provided (see Predicate 3).

$$(\mathit{Arc}(\mathit{ProductType}, \mathit{Task}) = \mathit{true}) \land (\mathit{ProductType} = \mathit{complex}) \land (\mathit{User} = \mathit{novice}) \land (\mathit{Task} = \mathit{complex}) \Rightarrow \mathit{ProductSupport} = \mathit{detailed} \qquad (3)$$

Connectors can be characterized as active or inactive depending on whether the real world components they describe exist or not. For example, if a product is very simple, then it may not have any assemblies and subassemblies, but only parts. The possibility of having such cases is represented by the connector-dependence (CD) measure.
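
As an aside, Predicates 2 and 3 can be read together as a small decision rule. The Python sketch below is illustrative only and is not the system's rule engine. The function name, the string encoding of the values, and the `"standard"` fallback for cases that the predicates leave unspecified are all assumptions made for the example.

```python
from typing import Optional


def product_support(product_type: str, user: str, task: str,
                    arc_exists: bool) -> Optional[str]:
    """Return the level of support, or None when no solution is delivered.

    arc_exists stands for Arc(ProductType, Task) in Predicates 2 and 3.
    """
    # Predicate 2: without an arc between the knowledge-specifier and the
    # task, the system must not deliver a solution.
    if not arc_exists:
        return None
    # Predicate 3: complex product, novice user and complex task lead to
    # detailed support.
    if product_type == "complex" and user == "novice" and task == "complex":
        return "detailed"
    # Assumption: the paper does not state the outcome for other
    # combinations, so a hypothetical default level is returned here.
    return "standard"
```

Under these assumptions, `product_support("complex", "novice", "complex", True)` yields `"detailed"`, while any call with `arc_exists=False` yields `None`, which matches the safety constraint expressed by Predicate 2.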
CD can take two values and be either true or false depending on whether the connector is always active or not. Take the example of the connectors shown in Figure 3. The CD of the connectors between Product and Assembly, and Assembly and Subassembly (K) is false because the existence of assemblies and
subassemblies depends on the product's complexity (i.e. a simple product is one that has only part(s)), which is expressed by ProductType. Therefore, only the connector that is directed to Part (L) has a CD equal to true, as it is always active. The connector-dependence measure establishes existential constraints on the architectural model (e.g. each Product should have at least one Part), which is a way to ensure the validity of the acquired knowledge.

The knowledge base of the system is developed using the knowledge model described above, populated with instances and their associations. Each attribute of each concept corresponds to a small fragment of a product support document, while the value of that attribute defines the way in which the content of that fragment changes. Each concept is mapped to a part of the document containing the fragments that match the attributes characterizing that concept. The connectors and generalization relations are used to search for parts of documents that relate to the required concept, while the knowledge-specifiers and arcs are utilized to establish the presentation format of the provided document.

The knowledge base is created using the Protégé ontology editor, and part of it is depicted in Figure 3 using the Jambalaya plug-in. The squares in the figure represent concepts. An is-a relation is represented by placing a square within a bigger one. The connectors between different domains with CD being false are represented by bold straight lines, and those with CD being true are shown as bold curved lines. The straight lines that link the black squares (instances) are connectors, with CD always false, linking different instances. The knowledge-specifier (ProductType) is shown as a rounded square and the arcs that link it to the other concepts are represented by bold dotted lines. THING (the domain within which everything is contained) is the root concept of the ontology.
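
The existential constraint imposed by the connector-dependence measure can be sketched as a simple validity check. The Python code below is a simplified illustration and not the Protégé implementation. It assumes that each instance is mapped directly to its domain (ignoring the is-a hierarchy inside a domain) and that the always-active connector runs from Product to Part. All function names and instance names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Connector:
    source: str  # domain the connector starts from, e.g. "Product"
    target: str  # domain the connector is directed to, e.g. "Part"
    cd: bool     # connector-dependence: True if the connector is always active


def missing_links(connectors: List[Connector],
                  instance_domain: Dict[str, str],
                  links: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """List (instance, target domain) pairs that violate a CD = true connector."""
    problems = []
    for connector in connectors:
        if not connector.cd:
            continue  # connector may be inactive, so nothing is required
        for instance, domain in instance_domain.items():
            if domain != connector.source:
                continue
            # Linked instances that belong to the connector's target domain.
            linked = [other for other in links.get(instance, [])
                      if instance_domain.get(other) == connector.target]
            if not linked:
                problems.append((instance, connector.target))
    return problems


# Hypothetical knowledge base fragment.
CONNECTORS = [
    Connector("Product", "Assembly", cd=False),
    Connector("Assembly", "Subassembly", cd=False),
    Connector("Product", "Part", cd=True),
]
INSTANCE_DOMAIN = {"car": "Product", "kettle": "Product",
                   "clutch": "Assembly", "bolt": "Part"}
LINKS = {"car": ["clutch", "bolt"], "kettle": []}
```

With this fragment, `missing_links(CONNECTORS, INSTANCE_DOMAIN, LINKS)` reports only the product instance that has no part, since the connectors with CD equal to false impose no requirement.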
Currently, the knowledge base contains 685 frames, including 162 concepts, 144 slots, 22 facets, and 357 instances.

4. CONCLUSIONS

The proposed semantic-based approaches to authoring of product documentation facilitate simultaneous product design and documentation development, collaborative authoring of product documentation, and sharing and reuse of engineering data and knowledge. They both support structuring of product knowledge, integration of data and generation of up-to-date dynamic documents. A promising direction for further research is the integration of these two approaches.

ACKNOWLEDGEMENTS

The research described in this paper was conducted within the ISAR project sponsored by FP6 of the European Commission.

REFERENCES

[1] BS 4884, Technical manuals, Part 2: Guide to content, British Standards Institution, London, 1993.
[2] D.T. Pham and R.M. Setchi, Authoring environment for documentation development, IMechE Proceedings, vol. 215, part B, 2001, pp. 877-882.
[3] D.T. Pham and R.M. Setchi, Adaptive product manuals, IMechE Proceedings, C-214, 2000, pp. 1013-1018.
[4] P. Brusilovsky and D.W. Cooper, Domain, task, and user models for an adaptive hypermedia performance support system, Proceedings of the IUI '02, ACM, San Francisco, California, USA, 2002, pp. 23-30.
[5] D.T. Pham, S.S. Dimov and A.M. Huneiti, Semantic data model for product support systems, IEEE International Conference on Industrial Informatics (INDIN), 2003, pp. 279-285.
[6] R. Setchi, Enhanced product support through intelligent product manuals, PhD thesis, University of Wales Cardiff, UK, 2000.