A Semantic Wiki approach for integrated data access for different workflow meta-models
DERI DIGITAL ENTERPRISE RESEARCH INSTITUTE

A Semantic Wiki approach for integrated data access for different workflow meta-models

Eyal Oren

DERI Technical Report, February 2006

DERI Galway, University Road, Galway, IRELAND
DERI Innsbruck, Technikerstrasse 13, A-6020 Innsbruck, AUSTRIA
A Semantic Wiki approach for integrated data access for different workflow meta-models

Eyal Oren 1

Abstract. Organisations commonly employ multiple workflow management systems, for various reasons. A multitude of employed workflow management systems leads to issues in data consistency and data retrieval, since organisational data is related but maintained in multiple systems and disparately available to the employees. This paper outlines our research to address workflow data integration. Our integration of workflow data is based on our multi-meta-model process ontology m3po. As a unifying interface to the integrated data (for editing, adding, annotating, browsing, and retrieving workflow data) we introduce the paradigm of semantic wikis: easy-to-use collaborative editing systems.

1 Digital Enterprise Research Institute, National University of Ireland, Galway.

Acknowledgements: This material is based upon works supported by the Science Foundation Ireland under Grant No. 02/CE1/I131.

Copyright © 2006 by the authors
Contents

1 Introduction
2 Workflow management
3 Related work
  3.1 Enterprise-wide workflow management
  3.2 Data integration
4 Approach
  4.1 Architecture
  4.2 Implementation
  4.3 User Interface
5 Wikis and Semantic Wikis
  5.1 Limitations of Wikis
  5.2 Semantic Wikis
  Architecture Overview
  Annotation language
  Information access
Semantic Wiki: Implementation
  System overview
  Writing
  Navigating
  Searching
  Storage
  Component interaction
  Addressing Wiki limitations
Summary
1 Introduction

Workflow management systems are widely used in organisations. They provide automated support for managing business processes; they operate on a schema definition of the processes in the organisation. Organisations commonly employ multiple workflow management systems, for historical, functional, or technical reasons [6]. Each workflow management system manages a part of the organisational processes and maintains a part of the organisational data. A multitude of employed workflow management systems leads to issues in data consistency and data retrieval, since organisational data is maintained in multiple systems. Data consistency issues arise since the data in the workflow management systems are related: they need to be (kept) consistent. Data retrieval issues arise since employees in the organisation need complete information, compiled from all workflow management systems. This paper outlines our research to address the issues of data retrieval. We proceed as follows: first, in section 2, we give a detailed description of data management in workflow management; section 3 then outlines existing work that addresses these issues and explains where we position our work; section 4 describes our approach on an abstract level, and section 5 describes our concrete solution, based on Semantic Wikis.

2 Workflow management

Workflow management deals with supporting business processes in organisations; it involves managing the flow of work through an organisation. A workflow is a collection of coordinated tasks designed to carry out a well-defined process [20]. A workflow management system is a generic information system that supports modelling, execution, management and monitoring of workflows. Such a system operates on a workflow specification, a description of the business processes in the organisation that should be supported.
A workflow management system can be compared to a database management system: it is a generic system that operates on a schema definition of the (processes in the) organisation. Workflow modelling is the task of creating workflow specifications, which are used as input to a workflow management system. Different workflow management systems have been developed, focusing on different application domains and providing different functionality. Workflow management lacks a standardised theory that provides a theoretical background; despite standardisation efforts no consensus exists on the representation or conceptual model of workflows [22]. Jablonski and Bussler [14] give a comprehensive overview of issues in workflow modelling, divided into five key aspects: functional, behavioural, informational, organisational, and operational. An organisation can have several different workflow management systems deployed for managing its business processes [6]. The reasons for such diversity can be historical (e.g. the result of business acquisitions or mergers) or functional (e.g. the result of different requirements in organisational units). The information captured in these separate workflows is interrelated: the workflows belong to the same organisation and model related business processes. However, the interrelation between these different workflows cannot be captured in current workflow management systems, for two reasons. First, since each workflow management system
has a distinct meta-model and specification language, the workflows in one system are generally not understandable by the other systems: the representations of workflows differ across systems. Second, each workflow specification considers a closed world; it is not possible to refer to external entities (defined in other workflow management systems): the workflows are disconnected. In the current state of the art an organisation thus cannot access its information completely: the relation between different sets of workflow data is lost. Organisations cannot get a complete picture of their workflow data, and cannot see and edit their information in the correct and complete context.

Example 1 An academic organisation has procedures in place for arranging the travels of its employees. Before each travel, employees must request travel permission from their superior; after travel, employees can apply for expense refunds only if permission was granted prior to the travel. An automated workflow, enacted with Microsoft BizTalk, is used to manage these travel requests and refunds. Travel reservations are arranged centrally for each granted travel request. The employees do not arrange their own travels; instead, a specific travel coordinator does so for all travels in the organisation. The travel coordinator uses travel agencies to find and make travel arrangements. The travel coordinator has a workflow to manage these reservations with the travel agencies, which is enacted using IBM WorkflowMQ. The problem is that, although the data in these two workflow management systems are related (both deal with employees and their superiors, status of travel requests, university travel regulations, etc.), one currently cannot easily retrieve this integrated information.
Where this information is needed, it has to be collected and maintained manually. In order to plan the ongoing work, the operations manager needs to know the current load and availability of all employees. He needs to go into the separate workflow management systems and collect the information from these systems manually. The financial officer needs to run queries that span data in both systems: she needs information about travel requests, the projects to which they are charged, the travel agency that dealt with them, and the costs that they incurred. Again, she needs to go into the separate workflow management systems and collect this information manually. The quality manager wants to document all processes in the organisation. All activities that are performed internally are described in standard operating procedures and stored on the internal network. Each document should be linked to the relevant activities, so that employees can quickly find documentation on the processes that they are involved in. But the workflow management systems do not allow the quality manager to annotate activities with documentation or to link them to external resources. And even if they did, there would be no easy way for the quality manager to see and manage all that documentation. Again, he would need to go into all separate workflow management systems and collect the information manually.

Problem Business processes that are defined and enacted in different workflow management systems are disconnected. Organisations that deploy multiple workflow management systems find their workflow information dispersed. Users have to manually integrate related workflow data.
3 Related work

3.1 Enterprise-wide workflow management

Bussler [6] discusses the management and execution of enterprise-wide workflows based on a case study performed at Boeing. Enterprise-wide workflow execution is difficult to achieve because of the heterogeneity between workflow systems. Even within one enterprise, workflow management systems are not homogeneous: the enterprise employs workflows in different functional domains (in Boeing workflows exist for airplane design, for stock management, for internal travel management, for human resource management, etc.), and it is common to use a different workflow management system in each domain (because the workflow requirements are domain-specific and because workflow vendors offer domain-specific solutions). To enable enterprise-wide workflow management, the various workflow management systems should not run in isolation but exchange workflow information and share workflow execution. The case study identifies the need for (i) workflow data integration, and (ii) distributed workflow execution.

Workflow data integration: workflow data integration provides a unified view of workflow data from the various workflow management systems; it is a special case of data integration, which we discuss in section 3.2. We can distinguish three dimensions of workflow data integration: (i) integration of workflow models (type integration), (ii) integration of completed workflow instances (history integration), and (iii) integration of possible workflow executions (projective integration). Type integration allows users, for example, to see all processes related to part XYZ. History integration allows users, for example, to see the execution time of all processes related to part XYZ that completed in the last three weeks. Projective integration allows users, for example, to project the time it will take to complete all unfinished workflows related to part XYZ.
Projective integration is only possible if the semantics of the workflow meta-models are captured in the integration.

Distributed workflow execution: distributed workflow execution is necessary as soon as cross-domain workflows are needed, as functional domains inside the enterprise develop their own workflow definitions and execute them in isolation. Distributed workflow execution is constituted by either instance migration, instance distribution, or instance replication [7]. Instance migration means that a workflow instance can migrate (move) from one workflow management system to another; this is only possible if the two workflow management systems implement the same execution semantics. Instance distribution means that a part of the workflow resides in another workflow management system. Instance replication means that an instance is replicated (continuously reflecting all changes) in two workflow management systems. Distributed execution can be characterised on several dimensions [6]:

direct vs. indirect distribution: if the workflows know each other they can execute each other directly. Otherwise, external functionality is necessary for their cooperation, for instance coordination via shared databases or communication queues.
objects of distribution: any object in a workflow management system (e.g. subworkflows, resources, instance data, applications) can be distributed, i.e. reside on a different installation than the rest of the workflow.

distribution transparency: workflow distribution is transparent if the location of objects is not visible in the workflow specification, i.e. if workflow designers do not see (and need not design) the distribution of the workflow objects.

Integration architectures Bussler [6] describes four possible architectures for workflow integration:

user interface collocation: the workflow management systems are not integrated, but the user interfaces of the different workflow management systems reside together at the user's desktop.

user interface integration: one user interface is offered that accesses the different underlying workflow engines. We can further classify this layer by whether the user interface integrates the workflow data from the source systems or strictly separates workflow objects from the various source systems.

workflow logic integration: the workflow management systems know each other and can share executions.

workflow database integration: the workflow management systems share one database. We can again differentiate whether the workflow management systems keep their objects separate, or share each other's objects.

In each of these layers, true workflow integration is only possible if the workflow management systems are designed and built with integration in mind; if the workflow management systems are independent and do not recognise objects from the other systems, native interoperation is not possible. The case study demonstrates the heterogeneity in enterprise-wide workflows and the need for integrated user interfaces, workflow logic and workflow data. According to Bussler [6], current workflow management systems do not support the necessary integration and distribution functionality; an encapsulating infrastructure is necessary.
3.2 Data integration

Data integration comprises problems in storing and manipulating heterogeneous data sources in a uniform way [19]. Data integration deals with combining data residing at different sources, and providing the user with an integrated view of these data [16]. Data integration systems provide a uniform query interface to a collection of autonomous data sources [9]. Data integration systems typically deal with one global (mediated) schema and a set of local (source) schemas, and offer transparent access to the local schemas through the global schema: queries and result sets are formulated in terms of the global schema. A basic problem in integrating different data sources is heterogeneity, ranging from the hardware and software the database systems are running on, to the data schemas and data models that structure the data, to the kinds of data that are being stored [12]. Semantic data integration focuses on the heterogeneity between the data schemas of the sources.
Standardisation policy Trivially, standardising the source databases (hardware, software, and data schemas) solves the integration problem by preventing heterogeneity from appearing. The source schemas are mandated to be the same as the global schema, and no semantic integration is necessary. Although such solutions are applied in practice (for example when using an enterprise-wide system such as SAP), we focus on autonomous data sources, where mandated standardisation is not possible.

Architectures We can distinguish three basic architectures for data integration [9, 12, 16]: federation, mediation, and data warehousing. We first consider read-only access to integrated data; we will address read-write access later.

federation: [HM85] a federated database system consists of source databases that agree to share partial data with other members of the federation. Each source database offers an interface for communication with the other source databases. Typically, members extend their own schema to incorporate subsets of the schemas of other members. A federation has neither a global query interface nor a global schema. To answer a query, a source can turn to others in the federation; it is the responsibility of the first source to integrate the results into a coherent answer.

mediation: [Wie92] a mediated database system offers a single query interface to sources. Each source database is encapsulated by a wrapper that hides low-level protocols. A global mediator offers a common interface; it decomposes queries into subqueries that are sent to the individual data sources, and integrates (mediates) their results into a coherent set. The mediated architecture offers a virtual view on the source data: the data is not collected in a database, but integrated on the fly for each query.
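The wrapper/mediator pattern just described can be sketched as follows. This is a minimal illustration, not a real integration system: the two source wrappers, their record format, and the predicate-based query interface are hypothetical simplifications, and a real mediator would additionally rewrite queries using source descriptions.

```python
# Minimal sketch of the mediated architecture: each autonomous source
# sits behind a wrapper; a mediator decomposes a global query into
# subqueries and integrates the results on the fly.

class SourceWrapper:
    """Encapsulates one autonomous source behind a uniform interface."""
    def __init__(self, records):
        self._records = records

    def query(self, predicate):
        # A real wrapper would translate the query into the source's
        # low-level protocol; here we just filter in memory.
        return [r for r in self._records if predicate(r)]

class Mediator:
    """Offers a single query interface over all wrapped sources."""
    def __init__(self, wrappers):
        self._wrappers = wrappers

    def query(self, predicate):
        results = []
        for w in self._wrappers:
            results.extend(w.query(predicate))  # integrate on the fly
        return results

# Two disconnected sources sharing a (hypothetical) global schema
biztalk = SourceWrapper([{"employee": "alice", "status": "granted"}])
mq = SourceWrapper([{"employee": "alice", "status": "booked"}])

mediator = Mediator([biztalk, mq])
print(mediator.query(lambda r: r["employee"] == "alice"))
```

Note that the mediator materialises nothing: every query is re-evaluated against the live sources, which is exactly the virtual-view property that distinguishes mediation from data warehousing.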
A critical element of this architecture is the description of the sources and their relation to the global schemas; several approaches exist for managing these source descriptions, which we investigate in the next section.

data warehousing: the data warehousing approach offers a materialised view on the source databases. It regularly collects and integrates data from the sources, and maintains these data in its own separate database; users can query the integrated database directly. Data warehouses are typically faster than on-the-fly integration since the views are materialised in the warehouse. They also allow historical data analysis, since they can maintain temporal snapshots of data. Issues in data warehouses include the frequency of updates (changes in the source data are only reflected in the warehouse after the next update step). Similarly to the mediated architecture, the source descriptions are critical in data warehouses.

Source descriptions Each source in the data integration system needs to be related to the common global schema used for integration. A source description, also called a mapping, relates the local data schema of a source to the common global data schema. Two basic approaches have been proposed for these mappings [16]: global-as-view (GAV) and local-as-view (LAV). Global-as-view considers the global schema as a view over the local schemas; it requires that the global schema is expressed in terms of the local schemas. Local-as-view requires
the global schema to be defined independently, and then defines the source databases as views on the global schema. Query processing is easier in GAV, since in GAV query processing can be done by unfolding the query using the mappings. The mappings describe directly how to rewrite the query in terms of the sources. In LAV, however, the mappings need to be reversed before they can be applied to the query; it is not directly clear how and which sources can be used to answer a query. Data modelling is easier in LAV. In GAV the global schema is expressed as a view on all sources. If a source is changed or added, it can affect all the mappings in the system. In LAV each mapping is a relation between the global schema and one source; a change or addition of one source does not affect any other source mapping.

4 Approach

As discussed in section 2, organisations commonly find their business processes and their business data dispersed over different workflow management systems. Since these systems are disconnected, it is not possible to offer an integrated view over these data. Our approach consists of two parts: our unifying ontology m3po [10], which represents all common workflow management meta-models from existing workflow management systems, and our semantic wiki SemperWiki [21], which allows ordinary users to easily browse, view, edit, and manage semantic information.

4.1 Architecture

On the abstract level, we develop an integration architecture that offers transparent access to data from different workflow management systems. It allows integrated information management of business processes. Users can build connections between related data items and view, edit, and query workflow information. The architecture is shown in figure 1. It consists of a uniform workflow representation, a methodology for importing workflow management systems into this common format, and operations that users perform on the workflow data.
A uniform representation of workflow data addresses the syntactical and semantical differences between workflow management systems. All workflow models are represented in a common format and the semantics of all workflow constructs is specified. We use RDF as the common format and we use m3po to define the RDF terminology used and its semantics. RDF [18] is a data model for describing resources on the Semantic Web. RDF statements are triples; each states a property of a resource. All resources have a unique identifier, and statements can be made about any resource. A set of RDF statements corresponds to a directed graph of resources and literals, connected by predicates. An RDF graph does not necessarily conform to a schema (therefore it is called semi-structured); it can be queried using graph patterns. RDF is suitable for information integration [8]: (i) its flexibility allows representing arbitrary data without a predefined schema, (ii) unique identifiers for each resource allow making statements about arbitrary objects, and (iii) the graph-based model allows merging data from different sources without problems. Our multi-meta-model process ontology m3po [10] is an ontology that unifies existing meta-models in the workflow domain. It is based on various reference models and standards: the workflow
exchange format XPDL [24], the formal process ontology PSL [3], the workflow language YAWL [1], the web orchestration language WS-BPEL [23] and the choreography language WS-CDL [15].

Figure 1: Integration Platform

Methodology The methodology to construct the m3po ontology is described in [10]. For our purposes it is important that the ontology is complete in the sense of the workflow management systems it can represent, and available in RDF.

A set of user operations on the common representation describes the tasks users need to perform on the workflow data. This set of tasks will be compiled from the requirements of the workflow domain.

4.2 Implementation

On the concrete level we implement this architecture. We develop a set of tools that (i) store and retrieve m3po; (ii) transform between certain workflow management systems and m3po; and (iii) provide the defined user operations in an easy-to-use manner.

Storage: for storing and retrieving m3po we can use standard RDF stores such as Redland [2], YARS [11], Jena2 [25], or Sesame [4]. These differ mostly in storage space, query functionality, support for higher-level semantics, performance and scalability, and programming interfaces. We will choose a store based on our technical requirements.

Transformation: we will develop transformation tools between Microsoft BizTalk and m3po, and between IBM WorkflowMQ and m3po.
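The shape of such a transformation step can be sketched as follows: a record exported from a source system is mapped to RDF-style triples. The namespaces, property names, and record format used here are illustrative placeholders, not the actual m3po ontology terms or a real BizTalk export.

```python
# Hedged sketch of a source-to-m3po transformation: one (hypothetical)
# exported travel-request record is mapped to triples, represented as
# plain (subject, predicate, object) tuples.

M3PO = "http://example.org/m3po#"    # placeholder ontology namespace
WF = "http://example.org/workflow/"  # placeholder instance namespace

def to_triples(record):
    """Map one exported workflow record to RDF-style triples."""
    subject = WF + record["id"]
    yield (subject, M3PO + "type", M3PO + "WorkflowInstance")
    yield (subject, M3PO + "performedBy", WF + record["employee"])
    yield (subject, M3PO + "hasStatus", record["status"])

record = {"id": "travel-42", "employee": "alice", "status": "granted"}
triples = list(to_triples(record))
for t in triples:
    print(t)
```

Because the output is a plain triple set, records from different source systems merge into one graph simply by concatenation, which is the integration property of RDF noted in section 4.1.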
User operations To provide the identified user operations we use the SemperWiki platform and develop plugins that support specific workflow functionality. We thus utilise the generic data management functionality and provide extensions that add specific support for the workflow domain.

4.3 User Interface

We base our user interface implementation on the paradigm of semantic personal Wikis [21], and specifically on our tool SemperWiki, as explained in the next section.

5 Wikis and Semantic Wikis

Wiki Wiki Webs are collaborative hypertext environments, focused on open access, ease-of-use, and modification [17]. Wikis are interlinked web sites that can be collaboratively edited by anyone. Pages are written in a simple syntax so that even novice users can easily edit them. The syntax consists of simple tags for creating links to other Wiki pages and textual markup such as lists and headings. Wikis can be regarded as editing environments for the web. They are manifestations of the writable web [13], enabling users to write web content with the same skills and tools used to read it. Anecdotal evidence suggests that the popularity of Wiki systems, compared to other collaborative hypertext systems, is due to their simplicity and their open and easy access [5]. The user interface of most web-based Wikis consists of two modes: in reading mode, the user is presented with normal web pages that can contain pictures, links, textual markup, etc. In editing mode, the user is presented with an editing box displaying the Wiki syntax of the page (containing the text and including the markup tags). During editing, the user can request a preview of the page, which is then rendered by the server and returned to the user. Some so-called desktop Wikis, such as Tomboy or WikidPad, have only one mode: pages are directly editable.
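A minimal sketch of the Wiki syntax processing described above. The exact markup rules differ between Wiki engines; the heading prefix and the CamelCase linking convention used here are illustrative assumptions.

```python
import re

# Sketch of simple Wiki rendering: '=' prefixes become headings,
# CamelCase words become internal links to the page of that name.

CAMEL = re.compile(r"\b([A-Z][a-z]+(?:[A-Z][a-z]+)+)\b")

def render_line(line):
    if line.startswith("= "):
        return "<h1>%s</h1>" % line[2:]
    # replace each CamelCase word with a link to the named Wiki page
    return CAMEL.sub(r'<a href="/wiki/\1">\1</a>', line)

print(render_line("= Travel procedures"))
print(render_line("See TravelRequests for details."))
```

The point of such a syntax is the low entry barrier: a novice writes plain text, and linking requires nothing beyond joining capitalised words.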
Wikis are used for various collaborative tasks, including collecting general encyclopedic knowledge, event organisation, and writing research proposals and papers. Many sites run a Wiki as a community venue, enabling users to discuss and write on topics such as product support or project documentation. The burden of editing and maintenance is thus shared over the whole community. Popular Wikis such as Wikipedia can grow in size very quickly, since interested visitors can edit and create pages at will; this poses new requirements on information access and retrieval.

5.1 Limitations of Wikis

A shortcoming of large Wikis is the lack of support for finding and maintaining information. The main reason is a lack of semantic structure in the Wiki content: almost all information is written in natural language, and has little machine-understandable semantics. For example, a page about John Grisham could contain a link to the page about The Pelican Brief. The English text would say that John Grisham wrote The Pelican Brief, but that information is not machine-understandable. This leads to the following consequences:
Structured access to a Wiki is not possible; this becomes apparent when browsing or searching for information. One cannot currently query Wiki systems, because the information is not structured but rather textual. For example, users looking for "How old is John Grisham?", "Who wrote The Pelican Brief?", or "Which European authors have won the Nobel prize for literature?" cannot ask these questions directly. Instead, they can navigate to the page that contains this information and read it themselves. They first have to locate the page, and then mentally process the information on that page. More complicated queries that require some background knowledge, such as "Find all the authors who are poets who are located in countries that are part of the European continent", are not possible at all.

Another problem is in navigating the pages: Wikis allow users to easily make links from one page to other pages, and these links can then be used to navigate to related pages. But these explicit links are actually the only means of navigation (except for back-references, appearing on each page, showing other pages that reference it). If no explicit connection is made between two related pages, e.g. between two authors that have the same publishing company, then no navigation will be possible between those pages.

Reuse of Wiki content is not possible. Reusing information is useful when it becomes necessary to provide translations or to create views (as known from databases) of content. In current Wikis it is either assumed that people will speak a common language (usually English) or that translations to other languages will be provided. But manually translating pages is a maintenance burden, since the Wiki system does not recognise the structured information inside the page text. For example, a page about John Grisham contains structured information such as his birth date, the books he authored, and his publisher. Updates to this information have to be migrated manually to the translated versions of this page.
Considering that, for example, Wikipedia has translations of pages in up to 189 languages, synchronisation of page versions can become quite a burden. For reusing information the creation of views is often useful. As an example consider that, in general, books are written by an author and published by the author's publisher. The books authored by John Grisham (on his page) should therefore also automatically appear as books published by Random House (on their page). But creating such a view is not possible, and therefore the information has to be copied and maintained manually.

5.2 Semantic Wikis

Generally speaking, a Semantic Wiki allows users to make formal descriptions of resources ("things") by annotating the pages that represent those resources. Where a regular Wiki enables users to describe resources in natural language, a Semantic Wiki enables users to additionally describe resources in a formal language. Using the formal descriptions, or annotations, of resources, Semantic Wikis offer additional features over regular Wikis. Users can query the annotations directly ("show me all authors") or create views from such queries. Also, users can navigate the Wiki using the annotated relations ("go
to other books by John Grisham"), and users can introduce background knowledge to the system ("all poets are authors; show me all authors"). In our vision, as Wikis are editing environments for the Web, Semantic Wikis will become editing environments for the Semantic Web. Finding the right balance between authoring effort and benefit is, however, crucial. Different forms of knowledge authoring can be positioned on a continuum of invested effort and returned benefit. For example, knowledge written as free text requires little effort but also provides little benefit: the information is unstructured and cannot be retrieved and reused efficiently. Tagging texts with keywords requires slightly more effort and provides slightly improved retrieval. Formal ontology languages require significant authoring effort (authors are restricted in their possibilities and have to follow specific rules) but also provide significant benefits: automated support for knowledge retrieval, reuse, and reasoning.

Figure 2: Effort and benefit in knowledge authoring

On this continuum, Wikis have a flexible position: they allow users different levels of authoring (from free text to structure and layout markup). In our opinion that flexibility is key to their success: they do not force users into one single approach but can be used with various degrees of annotation. Each degree introduces an increased authoring effort but also an increased benefit: this gradual increase of effort and benefit allows users to adjust the authoring platform to their needs. For Semantic Wikis to be adopted, this authoring continuum should be taken into account: we envision an evolutionary approach that offers different degrees of additional annotations with increasing effort and benefit. We therefore allow users to annotate their data, but we do not force them to do so. The benefits of adding annotations to Wiki pages are better navigation and better information retrieval.
The authoring effort is relatively low: the semantic annotations are very similar to the layout or structural directives that are already in widespread use in ordinary Wikis. In designing a Semantic Wiki system, several architectural decisions need to be taken. In this section, we explain the basic architecture and outline the design choices and their consequences.
15 DERI TR Architecture Overview A Semantic Wiki consists (at least) of the following components: a user interface, a parser, a data analyser, and a data store, as shown in figure 3. First we introduce each component, then we discuss the information access, the annotation language, and the ontological representation of the Wiki. Figure 3: Architecture of a Semantic Wiki Users can browse, edit, and query pages via the user interface. When users edit a page, the user interface notifies the parser. The parser analyses the text, and extracts annotations and links. All data (text, annotations, etc.) are stored in the semantic storage. From the data in the storage, the analyser computes sets of pages that are related to the current page, which are displayed by the user interface. Queries are posed to the storage, and the results are displayed by the user interface. All these operations should happen unobtrusively in the background, as to provide the user a responsive application. The user interface component is responsible for all user interaction. If the Wiki is web-based (the classical model), then the user interface is a server-based component that generates web pages to be viewed in a browser. The user interface can also be a desktop application; in that case the Wiki can be used for personal note-taking 6, or collaboration functionality is offered through shared storage. The user interface shows pages, their annotations, and navigation possibilities to related pages. It allows users to type text and annotations in a freely intermixed fashion. The user interface also shows available terms from shared ontologies, enabling users to browse for an appropriate term to use in their annotations 7. The parser component converts the text written by the user into objects: it parses the text for semantic annotations, layout directives, and links. The data analyser is responsible for computing a set of related resources from a given page. 
In a regular Wiki, this simply means finding all back-references, i.e. pages that link to the current one. In a semantic environment, the relations between resources are much richer. The data analyser uses the annotations about the current page, found by the parser, and searches the data store for relevant relations (such as "other books by the current author", or "other persons that have the same parents as the current one").

6. As has recently become quite popular, e.g. with Tomboy or WikidPad.
7. Descriptions can be shared and understood if users write them in a common terminology; browsing ontologies helps users find the appropriate common term for their annotations.
The semantic storage is responsible for storing and retrieving the semantic annotations. The location of the datastore determines the collaboration features:

1. The datastore is hosted locally, on the same machine as the Semantic Wiki:
(a) if the user interface is server-based and supports multiple users, collaboration is possible;
(b) if the user interface is desktop-based, the Semantic Wiki is limited to single-person usage.
2. The datastore is hosted on a server; the user interface can still be desktop-based, but users can collaborate through the same shared datastore.
3. The datastore is hosted locally, with peer-to-peer connections to stores on other machines (distributed).

Annotation language

The most visible change in a Semantic Wiki, for the user, compared to conventional Wikis is the modified annotation language. In a Semantic Wiki the annotation language is responsible not only for changing text style and creating links, but also for the semantic annotation of Wiki pages and for writing queries embedded in a page.

Annotation primitives. As in conventional Wikis, internal links are written in CamelCase or by enclosing them in brackets; external links are written as full absolute URIs, or are abbreviated using namespace abbreviations. Internal Wiki links are expanded to absolute URIs using the (usually configurable) Wiki base namespace; namespace abbreviations are expanded using configurable namespace definitions. Semantic annotations are written on a separate line and, following RDF conventions, consist of a predicate followed by an object. Predicates can only be resources; objects can be either resources or literals. Annotations are expanded to triples using the resource that the page describes as the subject of the triple.
To annotate the current Wiki page itself, instead of the resource that the page describes, annotations have to be prepended with an exclamation mark (since annotations of resources are more common than annotations of pages). An example page with annotations is displayed in figure 4. It describes John Grisham, an author published by Random House. The information about John Grisham mixes natural English text with formal annotations.

Page: JohnGrisham
John Grisham is an author and retired lawyer.
rdf:type foaf:Person
dc:publisher RandomHouse

Figure 4: Example page
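The expansion of such annotation lines into triples can be sketched as follows. This is an illustration only: the namespace table, the base namespace, and the "#resource" page-to-resource naming scheme are assumptions for this example, not SemperWiki's actual configuration.

```ruby
# Illustrative sketch (not SemperWiki's actual code): expand the annotation
# lines of a page into RDF triples. The namespace table, base namespace and
# the "#resource" page-to-resource scheme are assumptions.
NAMESPACES = {
  "rdf"  => "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  "foaf" => "http://xmlns.com/foaf/0.1/",
  "dc"   => "http://purl.org/dc/elements/1.1/"
}.freeze
BASE = "http://wiki.example.org/"  # the configurable Wiki base namespace

# Expand an abbreviated term to a full URI; literals and absolute URIs pass through.
def expand(term)
  return term if term.start_with?('"', "urn:") || term =~ %r{\Ahttps?://}
  prefix, local = term.split(":", 2)
  return NAMESPACES[prefix] + local if local && NAMESPACES[prefix]
  BASE + term  # internal Wiki link, expanded with the base namespace
end

# A line is treated as an annotation if it splits into a predicate and an
# object and the predicate contains a colon (a crude heuristic for this
# sketch). A leading "!" makes the page itself, rather than the described
# resource, the subject of the triple.
def parse_annotations(page_name, text)
  page_uri     = BASE + page_name
  resource_uri = page_uri + "#resource"
  text.each_line.filter_map do |line|
    pred, obj = line.strip.split(/\s+/, 2)
    next unless obj && pred.sub(/\A!/, "").include?(":")
    subject = pred.start_with?("!") ? page_uri : resource_uri
    [subject, expand(pred.sub(/\A!/, "")), expand(obj)]
  end
end
```

Fed the page of figure 4, such a routine would yield two triples about the person resource and, for a "!"-prefixed line, one about the page itself.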
Wiki pages as resources. Wiki pages often refer to real-world resources, and annotations may refer both to the page resource and to the resource that it describes. For example, a triple stating that JohnGrisham was created on a certain date can refer either to the creation date (or birth date) of the person John Grisham, or to the creation date of the Wiki page about that person. We discuss three possibilities to resolve this ambiguity:

1. We may use the same URI to denote pages and resources, but duplicate the problematic predicates in different namespaces, as shown in figure 5. The predicate semperwiki:date refers to the creation of the page; the birthdate predicate refers to the creation (birth) of the person.

Figure 5: Duplicating predicates

2. We may use the same URI and the same predicate for these statements, but represent the scope of statements in RDF contexts, as shown in figure 6. Users may create arbitrary scopes when they make a statement, for example the scope of JohnGrisham as a page, of JohnGrisham as a person, and of JohnGrisham as a user of the Wiki system. When consuming the information, the scope explains how to interpret the statements, i.e. whether we are talking about JohnGrisham as a page or as a person.

3. We may separate the resource into the page resource and the real-world resource, as shown in figure 7. We have one resource that identifies the page about John Grisham, and another resource that identifies the person. The page has an about predicate, linking it to the resource it describes.

Approaches 1 and 2 have the disadvantage that the same URI is intentionally used for two different resources, and mechanisms outside the shared understanding of the RDF graph are necessary to deduce the true meaning of the URI; the URI then does not act as an identifier of exactly one resource. We therefore follow approach 3 and introduce different URIs to denote the Wiki page and the resource that the Wiki page describes.
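Approach 3 requires a systematic correspondence between the two URIs. One conceivable scheme (purely illustrative; the "#resource" fragment convention and the about-predicate URI are assumptions, not the paper's prescribed scheme) derives the resource URI from the page URI and records the link between the two:

```ruby
# Illustrative sketch: derive a distinct URI for the described resource from
# the page URI, and link the two with an "about" triple. The "#resource"
# fragment and the semper:about URI are assumptions for this example.
SEMPER_ABOUT = "http://semperwiki.example.org/ns#about"

def described_resource(page_uri)
  page_uri + "#resource"
end

def about_triple(page_uri)
  [page_uri, SEMPER_ABOUT, described_resource(page_uri)]
end
```

With such a convention, both URIs can be computed from one another without any lookup, while each still identifies exactly one resource.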
Figure 6: Scoping statements

Figure 7: Separating page and resource

Since we expect that annotation of the resource that the page describes will occur more frequently, it is justifiable to make annotating this resource syntactically easier than annotating the Wiki page itself. If the author of a Wiki page has naming authority over the described resource, it is often convenient to create the URI of that resource automatically, for example by a simple syntactic manipulation of the page URL.

Advanced annotations. If the author of a Wiki page does not have naming authority over the resource, it is necessary to specify the URI of the described resource explicitly. For example, figure 8 shows a page that describes the research institute DERI. The page uses the semper:about predicate to relate the page to the resource (DERI, identified by urn://deri.ie) that it describes. The annotations state, using the Semantic Web Research Community (SWRC) ontology, that DERI is a research institute, founded in June 2003 and located in Galway. The last annotation, prepended with an exclamation mark, refers to the page instead of the resource; it states that Eyal Oren is the creator of that page. Figure 9 shows the RDF graph that is generated from the page in figure 8.
Page: DERI Galway
DERI Galway is one location of the Digital Enterprise Research Institute, researching Semantic Web technology; our main page is at
semper:about urn://deri.ie
rdf:type swrc:Organization
swrc:location "Galway"
swrc:created " "
!dc:creator EyalOren

Figure 8: Annotating real-world resources

The page DERI Galway has a text property, it has a creator and creation date that are populated by the Wiki system, and it contains a navigational hyperlink to The page DERI Galway contains information about the resource urn://deri.ie. This resource is the actual subject of the annotations: it is a research organisation, it has a creation date and a location, and it has a logical relation to the resource.

Figure 9: Corresponding RDF graph about DERI Galway

The annotation mechanism can be used to annotate arbitrary resources, including editing existing ontologies or creating new ones. Figure 10, for example, shows how to describe a new class Cluster (an organisational structure) in the DERI ontology.

Arbitrary annotations. The proposed annotation syntax is simple and direct, but users can only annotate the current page, which is a severe limitation. If one describes an address book, for example, then each contact in the address book needs to be described on its own Wiki page. But if one just wants to make a list of contacts and their addresses, making a separate page for each contact is time-consuming. Also, it is not possible to make statements about unnamed resources ("blank nodes" in RDF), since each resource that is described has to have a named page.
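A compact RDF syntax can express such unnamed resources directly as blank nodes. As an illustration (the address properties below are invented, since the contents of figure 11 are not reproduced here), a contact with an unnamed address could be written as:

```turtle
@prefix ex:   <http://wiki.example.org/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://wiki.example.org/JohnGrisham#resource>
    a foaf:Person ;
    ex:address [                      # a blank node: the address has no URI
        ex:street  "Main Street 1" ;
        ex:city    "Oxford" ;
        ex:country "USA"
    ] .
```

The bracket syntax introduces a resource without a name, which the page-per-resource annotation style cannot express.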
Page: DeriCluster
semper:about deri:Cluster
rdf:type rdfs:Class
rdfs:subClassOf swrc:Organization
We have now created a new class Cluster in our DERI ontology. This class appears in the ontology browser, and can be used like any other class. We can add new properties to the ontology in the same way.

Figure 10: Editing ontologies

But unnamed resources are common in reality; see for example figure 11.

Figure 11: RDF graph of an address

A Semantic Wiki needs to offer a syntax for arbitrary RDF statements that provides a shorthand for multiple values for a key, and that allows describing resources on arbitrary pages. Similar issues have been worked on in the Semantic Web community, leading to several compact syntaxes for RDF; we propose to use Turtle for arbitrary annotations.

Representing Wiki pages in RDF. We have developed a simple representation format for storing Wiki content and annotations in a triple store. Figure 9 shows the example of figure 8 together with the additional metadata about the Wiki page that is stored in the datastore. This additional metadata captures the creator, the creation time, and the links to other pages, in order to provide the conventional hyperlink functionality.

Embedded queries. Embedded queries (generating views) are written using triple patterns: sequences of subject, predicate, object that can contain variables (names that start with a question mark). A triple pattern is interpreted as a query: triples matching the pattern are returned. Patterns can be combined to form joins.
Figure 12 shows the earlier example page about John Grisham, including an embedded query at the bottom of the page. The query returns all books written by JohnGrisham; it creates a view on the data that is displayed below the page text.

Page: JohnGrisham
John Grisham is an author and retired lawyer.
rdf:type foaf:Person
dc:publisher RandomHouse
this query shows all his books:
?book dc:creator JohnGrisham
TheFirm dc:creator JohnGrisham
TheJury dc:creator JohnGrisham
ThePelicanBrief dc:creator JohnGrisham

Figure 12: Page showing an embedded query

Information access

Information access is offered through structured navigation and various querying facilities.

Navigation. Navigation in ordinary Wikis is limited to explicit links entered by users; it is not possible to navigate the information based on structural relations. As explained, this is a severe limitation. A Semantic Wiki provides the metadata necessary to navigate the information in a structured way. For example, knowing that John Grisham is an author, we can automatically show all other authors in the system and offer navigation to them.

One approach to structural navigation is faceted metadata browsing, or category-based browsing [26]. In faceted browsing, the information space is partitioned using orthogonal conceptual dimensions (facets) of the data, which can be used to constrain the relevant elements in the information space. For example, a collection of art works can consist of facets such as type of work, time period, artist name, and geographical location. Users can select a certain facet value (e.g. "20th century") to constrain the visible collection to only the art works from that facet. Multiple constraints can be applied conjunctively: by selecting the facets "book" art form, "21st century" period, "located in USA", and artist name "John Grisham", the selection is restricted to the books written by John Grisham in the USA in the 21st century.
Using the structured metadata enables faceted browsing of the Wiki information space. We use the metadata of the pages to partition the information space into facets. A partition (a set of resources) is created for each (predicate, object) pair in the knowledge base, as follows:

R_{p,o} = { page | (page, p, o) ∈ KB },

where (s, p, o) denotes the triple (subject, predicate, object) and KB is the knowledge base consisting of all RDF statements in the datastore. Each set R_{p,o} is a partition of the space on
a particular predicate-object pair, and contains the resources that have that particular predicate-object pair. Together, the sets R_{p,o} form the base partition of our information space. For example, the set R_{dc:author,JohnGrisham} contains The Pelican Brief, The Firm, The Jury, etc.; and the set R_{rdf:type,foaf:Person} contains John Grisham, John le Carré, Joseph Roth, etc.

When the user is viewing some selection (e.g. all books by John Grisham) or a single item (e.g. The Pelican Brief), we display related items based on the partitioning: navigational links are shown for all partitions that have a non-empty intersection with the current selection. For example, if the user is looking at all books by John Grisham, then we would show links to (i) all books (since the current selection contains books), (ii) all books published by Random House (since the current selection contains books published by Random House), (iii) all books from 1995 (since the current selection contains books from 1995), etc. If the user is viewing, for example, the page about John Grisham, we would show (i) all people (since John Grisham is a person), (ii) all authors, (iii) all authors published by Random House, etc.

Backlinks are computed in a similar fashion: each set B_{sel,pred} contains the back-references to a certain selection of pages sel, using a certain predicate pred. These back-references are computed as the union, over all pages p in the current selection (p ∈ sel), of the pages that point to p using the predicate pred:

B_{sel,pred} = ∪_{p ∈ sel} { page | (page, pred, p) ∈ KB }.

This means that each set B_{sel,pred} contains the pages that point to a page in the selection. For example, the backlinks for the selection containing John Grisham and Madeleine Albright would include the books written by John Grisham (pages with "dc:creator JohnGrisham"), the friends of Madeleine Albright (pages with "foaf:knows MadeleineAlbright"), etc.
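The two definitions above can be sketched directly in code. For illustration we assume the knowledge base is held as a plain in-memory array of subject-predicate-object triples rather than an RDF store:

```ruby
# Compute the base partitions R_{p,o}: for each (predicate, object) pair,
# the set of subjects (pages/resources) having that pair.
def partitions(kb)
  kb.group_by { |_s, p, o| [p, o] }
    .transform_values { |triples| triples.map(&:first) }
end

# Compute the backlink sets B_{sel,pred}: pages that point, via some
# predicate, to a page in the current selection.
def backlinks(kb, selection)
  kb.select { |_s, _p, o| selection.include?(o) }
    .group_by { |_s, p, _o| p }
    .transform_values { |triples| triples.map(&:first) }
end
```

A navigation sidebar then only needs to show links for the partitions that intersect the current selection.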
Querying. We distinguish three kinds of querying functionality: keyword search, structured queries, and views.

1. Keyword-based full-text search is useful for simple information retrieval, and is supported by all conventional Wiki systems.
2. Structured queries use the annotations to allow more advanced information retrieval. The user can query the Wiki for pages (or resources) that satisfy certain properties. We suggest using triple patterns that include free variables as the basic query statements. To retrieve, for example, all authors, one can query for "?x type author". Triple patterns can be combined to form database-like joins: "?x type author" and "?x has-publisher ?y" retrieves all authors and their publishing companies.
3. By embedding queries in a Wiki page, users can create persistent searches or views on the Wiki. A query included on a page is executed each time the page is visited, and thus continuously shows up-to-date query results. Embedded queries are further discussed in section
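A minimal matcher for such triple patterns can be sketched as follows (an illustrative sketch, not SemperWiki's query engine, which is backed by an RDF store); a join succeeds when variables shared between patterns unify to the same value:

```ruby
# Match a list of triple patterns against an in-memory triple list.
# Variables are symbols whose name starts with "?", e.g. :"?x".
# Returns one variable binding per solution.
def match(kb, patterns, binding = {})
  return [binding] if patterns.empty?
  pattern, *rest = patterns
  kb.flat_map do |triple|
    b = binding.dup
    consistent = pattern.zip(triple).all? do |term, value|
      if term.is_a?(Symbol) && term.to_s.start_with?("?")
        b.key?(term) ? b[term] == value : (b[term] = value)  # bind or check
      else
        term == value                                        # constant term
      end
    end
    consistent ? match(kb, rest, b) : []
  end
end
```

The join from item 2 is then simply `match(kb, [[:"?x", "type", "author"], [:"?x", "has-publisher", :"?y"]])`.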
Figure 13: SemperWiki user interface

5.3 Semantic Wiki: Implementation

This section presents SemperWiki, our prototype implementation of a Semantic Wiki [21]. It is a desktop application that follows the previously discussed architecture; the architecture itself is equally applicable to Web-based systems. SemperWiki is implemented in Ruby, using the GTK windowing toolkit for the graphical user interface. It is open source, consists of around 1500 lines of code, and can be downloaded at

System overview

SemperWiki is a semantic personal Wiki that can be used for personal knowledge management. Its main advantages compared to a normal Wiki are intelligent navigation, semantic search, and embedded queries. All information in SemperWiki can be annotated semantically, and all information can be exported and shared on the Semantic Web.

The user interface of SemperWiki is shown in figure 13. On the left-hand side the user can edit pages; on the right-hand side the user can navigate. At the top are the menu bar, which gives access to all functions, and the location bar, showing the current page. Navigating through SemperWiki works similarly to a Web browser: the user navigates to a page by typing its address or by clicking a link to that page; one can also go back and forth through the browsing history. SemperWiki offers a sidebar with intelligent navigation links, which can be used to navigate to related pages. These links are based on the faceted browsing explained in section 5.2.3: they show related information categorised per facet. Ordinary Wikis only show pages that contain links to the current one; SemperWiki shows much richer related information.
In the top right corner, the semantic search functionality is visible: one can find pages by listing one or more properties, such as the author or the publisher of a book. Ordinary Wikis only offer full-text search. The main text shows an example of an embedded query: a query can be embedded in any page, and its results are displayed each time the page is visited. Each page can contain an arbitrary number of such queries. Ordinary Wikis do not offer such functionality.

Writing

A page in SemperWiki can consist of arbitrary text, links to other pages or websites, and annotations. Links can be internal, to other pages in the Wiki, or external, to arbitrary Web pages. Clicking an internal link navigates to that page; clicking an external link opens it in its default application.

SemperWiki aims for high usability: it should be easy and fast to use, both for novice and experienced users. We adhere to the Gnome human interface guidelines, a set of guidelines for consistent and user-friendly interfaces. All functions are accessible through keyboard shortcuts. The interface provides instant response to all user actions: links are tagged instantly, correct annotations are coloured green, and changing annotations changes the sidebar in real time. After writing some pages, the user can just hit Escape to exit; all changes are instantly saved to the data storage, and SemperWiki remembers the cursor position on each page: the next time SemperWiki is started, everything is exactly how the user left it.

To help users reuse existing terminology instead of inventing their own, and thus to enable better sharing of knowledge, SemperWiki offers a simple ontology browser. Users can browse terms from common ontologies, and import others by entering the URL of the ontology. We show the terms (classes and properties) in each ontology; the user can browse those terms and drag-and-drop them onto the current Wiki page.
When the user drags a class, we complete it to a full annotation by prepending "rdf:type"; otherwise we just insert the selected property, and the user only needs to supply an object value for the property.

Navigating

The navigation bar, on the right in figure 14, shows intelligent navigation options based on the annotations of the current page and the information in the knowledge base. The related pages are categorised per facet, as computed by the data analyser. The data analyser returns several sets of related pages, representing the different facets in which other pages relate to the current one. We display those sets in order from most specific to most general, ending with the set of all pages.

Searching

Figure 14 also shows a query, embedded in the ordinary text. The query results are shown at the bottom of the screen. Each result can be clicked, which navigates to the page where the statement was made; the user could then edit the statement. Any changes to the embedded query or to the underlying data are reflected instantly in the result set: the results are continuously up-to-date.
Figure 14: Navigating and information reuse

Storage

The storage component persistently stores and retrieves the pages and their annotations. We use RDF as the data model, and a standard RDF database for storing our data. All our data (page text, page annotations, user preferences) are stored in the RDF store. We use Redland [2] for storing and retrieving RDF. Redland is a mature RDF store: it has a small footprint, can easily be embedded in an application, and has bindings for many different languages, including Ruby. We do not yet use contexts of statements, since support for triple contexts in Redland is incomplete: contexts are not first-class citizens and cannot be used in the same manner as subjects, predicates, and objects; in particular, it is not possible to directly find the context of a statement other than by traversing all existing contexts.

Component interaction

To keep the system responsive, the components cooperate via event notification: each component is subscribed to relevant events of other components; upon notification, components do their job in the background, and notify other components when they are done. The parser is subscribed to the user interface and gets notified on each text change (when the user adds or removes some text). It then parses the page in the background, updates the set of queries and annotations belonging to that page, and notifies the components subscribed to that change: the user interface, the storage, and the analyser. The user interface looks for changes in the embedded queries and, if found, asks the storage for query results. The data analyser looks for changes in the annotations and computes the related pages (which the user interface displays); the storage is subscribed to changes in the annotations as well, and stores them persistently.
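The event wiring described above can be sketched as a small publish/subscribe scheme. The class and method names here are illustrative, not SemperWiki's actual API:

```ruby
# Minimal publish/subscribe sketch of the component wiring (illustrative,
# not SemperWiki's actual classes). Each component publishes events;
# subscribed components receive them via a notify method.
module Publisher
  def subscribe(component)
    (@subscribers ||= []) << component
  end

  def publish(*event)
    (@subscribers || []).each { |c| c.notify(*event) }
  end
end

class UserInterface
  include Publisher
  def edit(page, text)
    publish(page, text)  # the parser is subscribed to text changes
  end
end

class Parser
  include Publisher
  def notify(page, text)
    # extract annotation lines: two tokens, predicate contains a colon
    annotations = text.lines.map(&:split)
                      .select { |t| t.size == 2 && t[0].include?(":") }
    publish(page, annotations)  # storage and analyser listen here
  end
end

class Storage
  attr_reader :pages
  def initialize; @pages = {}; end
  def notify(page, annotations); @pages[page] = annotations; end
end
```

Because each component reacts only to notifications, parsing and storage work can run off the interface thread, which is what keeps the application responsive.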
More informationSemantic Stored Procedures Programming Environment and performance analysis
Semantic Stored Procedures Programming Environment and performance analysis Marjan Efremov 1, Vladimir Zdraveski 2, Petar Ristoski 2, Dimitar Trajanov 2 1 Open Mind Solutions Skopje, bul. Kliment Ohridski
More informationConcepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches
Concepts of Database Management Seventh Edition Chapter 9 Database Management Approaches Objectives Describe distributed database management systems (DDBMSs) Discuss client/server systems Examine the ways
More informationScalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens
Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens 1 Optique: Improving the competitiveness of European industry For many
More informationData Quality in Information Integration and Business Intelligence
Data Quality in Information Integration and Business Intelligence Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies
More informationData Integration. Maurizio Lenzerini. Universitá di Roma La Sapienza
Data Integration Maurizio Lenzerini Universitá di Roma La Sapienza DASI 06: Phd School on Data and Service Integration Bertinoro, December 11 15, 2006 M. Lenzerini Data Integration DASI 06 1 / 213 Structure
More informationTaming Big Data Variety with Semantic Graph Databases. Evren Sirin CTO Complexible
Taming Big Data Variety with Semantic Graph Databases Evren Sirin CTO Complexible About Complexible Semantic Tech leader since 2006 (née Clark & Parsia) software, consulting W3C leadership Offices in DC
More informationA Business Process Services Portal
A Business Process Services Portal IBM Research Report RZ 3782 Cédric Favre 1, Zohar Feldman 3, Beat Gfeller 1, Thomas Gschwind 1, Jana Koehler 1, Jochen M. Küster 1, Oleksandr Maistrenko 1, Alexandru
More informationFIPA agent based network distributed control system
FIPA agent based network distributed control system V.Gyurjyan, D. Abbott, G. Heyes, E. Jastrzembski, C. Timmer, E. Wolin TJNAF, Newport News, VA 23606, USA A control system with the capabilities to combine
More informationFunctional Requirements for Digital Asset Management Project version 3.0 11/30/2006
/30/2006 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 9 20 2 22 23 24 25 26 27 28 29 30 3 32 33 34 35 36 37 38 39 = required; 2 = optional; 3 = not required functional requirements Discovery tools available to end-users:
More information2. Basic Relational Data Model
2. Basic Relational Data Model 2.1 Introduction Basic concepts of information models, their realisation in databases comprising data objects and object relationships, and their management by DBMS s that
More informationWeb-Based Genomic Information Integration with Gene Ontology
Web-Based Genomic Information Integration with Gene Ontology Kai Xu 1 IMAGEN group, National ICT Australia, Sydney, Australia, kai.xu@nicta.com.au Abstract. Despite the dramatic growth of online genomic
More informationLinked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint
Linked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint Christian Fillies 1 and Frauke Weichhardt 1 1 Semtation GmbH, Geschw.-Scholl-Str. 38, 14771 Potsdam, Germany {cfillies,
More informationStructure of Presentation. The Role of Programming in Informatics Curricula. Concepts of Informatics 2. Concepts of Informatics 1
The Role of Programming in Informatics Curricula A. J. Cowling Department of Computer Science University of Sheffield Structure of Presentation Introduction The problem, and the key concepts. Dimensions
More informationRDF Resource Description Framework
RDF Resource Description Framework Fulvio Corno, Laura Farinetti Politecnico di Torino Dipartimento di Automatica e Informatica e-lite Research Group http://elite.polito.it Outline RDF Design objectives
More informationAn Ontology-based e-learning System for Network Security
An Ontology-based e-learning System for Network Security Yoshihito Takahashi, Tomomi Abiko, Eriko Negishi Sendai National College of Technology a0432@ccedu.sendai-ct.ac.jp Goichi Itabashi Graduate School
More informationLINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model
LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model 22 October 2014 Tony Hammond Michele Pasin Background About Macmillan
More informationSemantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo
DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo Expected Outcomes You will learn: Basic concepts related to ontologies Semantic model Semantic web Basic features of RDF and RDF
More informationService Oriented Architecture
Service Oriented Architecture Charlie Abela Department of Artificial Intelligence charlie.abela@um.edu.mt Last Lecture Web Ontology Language Problems? CSA 3210 Service Oriented Architecture 2 Lecture Outline
More informationGrids, Logs, and the Resource Description Framework
Grids, Logs, and the Resource Description Framework Mark A. Holliday Department of Mathematics and Computer Science Western Carolina University Cullowhee, NC 28723, USA holliday@cs.wcu.edu Mark A. Baker,
More informationtechnische universiteit eindhoven WIS & Engineering Geert-Jan Houben
WIS & Engineering Geert-Jan Houben Contents Web Information System (WIS) Evolution in Web data WIS Engineering Languages for Web data XML (context only!) RDF XML Querying: XQuery (context only!) RDFS SPARQL
More informationEnterprise Application Designs In Relation to ERP and SOA
Enterprise Application Designs In Relation to ERP and SOA DESIGNING ENTERPRICE APPLICATIONS HASITH D. YAGGAHAVITA 20 th MAY 2009 Table of Content 1 Introduction... 3 2 Patterns for Service Integration...
More informationXML DATA INTEGRATION SYSTEM
XML DATA INTEGRATION SYSTEM Abdelsalam Almarimi The Higher Institute of Electronics Engineering Baniwalid, Libya Belgasem_2000@Yahoo.com ABSRACT This paper describes a proposal for a system for XML data
More informationAutomatic Timeline Construction For Computer Forensics Purposes
Automatic Timeline Construction For Computer Forensics Purposes Yoan Chabot, Aurélie Bertaux, Christophe Nicolle and Tahar Kechadi CheckSem Team, Laboratoire Le2i, UMR CNRS 6306 Faculté des sciences Mirande,
More informationzen Platform technical white paper
zen Platform technical white paper The zen Platform as Strategic Business Platform The increasing use of application servers as standard paradigm for the development of business critical applications meant
More informationA Semantic web approach for e-learning platforms
A Semantic web approach for e-learning platforms Miguel B. Alves 1 1 Laboratório de Sistemas de Informação, ESTG-IPVC 4900-348 Viana do Castelo. mba@estg.ipvc.pt Abstract. When lecturers publish contents
More informationRDF graph Model and Data Retrival
Distributed RDF Graph Keyword Search 15 2 Linked Data, Non-relational Databases and Cloud Computing 2.1.Linked Data The World Wide Web has allowed an unprecedented amount of information to be published
More informationFirewall Builder Architecture Overview
Firewall Builder Architecture Overview Vadim Zaliva Vadim Kurland Abstract This document gives brief, high level overview of existing Firewall Builder architecture.
More informationOracle Endeca Server. Cluster Guide. Version 7.5.1.1 May 2013
Oracle Endeca Server Cluster Guide Version 7.5.1.1 May 2013 Copyright and disclaimer Copyright 2003, 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of
More informationCombining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery
Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery Dimitrios Kourtesis, Iraklis Paraskakis SEERC South East European Research Centre, Greece Research centre of the University
More informationIntegrating and Exchanging XML Data using Ontologies
Integrating and Exchanging XML Data using Ontologies Huiyong Xiao and Isabel F. Cruz Department of Computer Science University of Illinois at Chicago {hxiao ifc}@cs.uic.edu Abstract. While providing a
More informationCourse 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
More informationUnderstanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
More informationInformation Access Platforms: The Evolution of Search Technologies
Information Access Platforms: The Evolution of Search Technologies Managing Information in the Public Sphere: Shaping the New Information Space April 26, 2010 Purpose To provide an overview of current
More informationDatabase Resources. Subject: Information Technology for Managers. Level: Formation 2. Author: Seamus Rispin, current examiner
Database Resources Subject: Information Technology for Managers Level: Formation 2 Author: Seamus Rispin, current examiner The Institute of Certified Public Accountants in Ireland This report examines
More informationCOGNOS Query Studio Ad Hoc Reporting
COGNOS Query Studio Ad Hoc Reporting Copyright 2008, the California Institute of Technology. All rights reserved. This documentation contains proprietary information of the California Institute of Technology
More informationCONCEPTCLASSIFIER FOR SHAREPOINT
CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running
More informationSCADE System 17.0. Technical Data Sheet. System Requirements Analysis. Technical Data Sheet SCADE System 17.0 1
SCADE System 17.0 SCADE System is the product line of the ANSYS Embedded software family of products and solutions that empowers users with a systems design environment for use on systems with high dependability
More informationService-Oriented Architectures
Architectures Computing & 2009-11-06 Architectures Computing & SERVICE-ORIENTED COMPUTING (SOC) A new computing paradigm revolving around the concept of software as a service Assumes that entire systems
More informationContents. Introduction... 1
Managed SQL Server 2005 Deployments with CA ERwin Data Modeler and Microsoft Visual Studio Team Edition for Database Professionals Helping to Develop, Model, and Maintain Complex Database Architectures
More informationHow To Create An Enterprise Class Model Driven Integration
Creating an Enterprise Class Scalable Model Driven Infrastructure The use case for using IBM, OSIsoft, and SISCO technologies Version: 1.1 Date: May 28, 2009 Systems Integration Specialist Company, Inc.
More informationA Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System
A Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System Mohammad Ghulam Ali Academic Post Graduate Studies and Research Indian Institute of Technology, Kharagpur Kharagpur,
More informationMarkLogic Enterprise Data Layer
MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer September 2011 September 2011 September 2011 Table of Contents Executive Summary... 3 An Enterprise Data
More informationSAS BI Dashboard 4.3. User's Guide. SAS Documentation
SAS BI Dashboard 4.3 User's Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2010. SAS BI Dashboard 4.3: User s Guide. Cary, NC: SAS Institute
More informationCONTEMPORARY SEMANTIC WEB SERVICE FRAMEWORKS: AN OVERVIEW AND COMPARISONS
CONTEMPORARY SEMANTIC WEB SERVICE FRAMEWORKS: AN OVERVIEW AND COMPARISONS Keyvan Mohebbi 1, Suhaimi Ibrahim 2, Norbik Bashah Idris 3 1 Faculty of Computer Science and Information Systems, Universiti Teknologi
More informationCitationBase: A social tagging management portal for references
CitationBase: A social tagging management portal for references Martin Hofmann Department of Computer Science, University of Innsbruck, Austria m_ho@aon.at Ying Ding School of Library and Information Science,
More informationFederated, Generic Configuration Management for Engineering Data
Federated, Generic Configuration Management for Engineering Data Dr. Rainer Romatka Boeing GPDIS_2013.ppt 1 Presentation Outline I Summary Introduction Configuration Management Overview CM System Requirements
More informationIncrease Agility and Reduce Costs with a Logical Data Warehouse. February 2014
Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4
More informationOracle Warehouse Builder 10g
Oracle Warehouse Builder 10g Architectural White paper February 2004 Table of contents INTRODUCTION... 3 OVERVIEW... 4 THE DESIGN COMPONENT... 4 THE RUNTIME COMPONENT... 5 THE DESIGN ARCHITECTURE... 6
More informationCommunity Edition. Master Data Management 3.X. Administrator Guide
Community Edition Talend Master Data Management 3.X Administrator Guide Version 3.2_a Adapted for Talend MDM Studio v3.2. Administrator Guide release. Copyright This documentation is provided under the
More informationPOLAR IT SERVICES. Business Intelligence Project Methodology
POLAR IT SERVICES Business Intelligence Project Methodology Table of Contents 1. Overview... 2 2. Visualize... 3 3. Planning and Architecture... 4 3.1 Define Requirements... 4 3.1.1 Define Attributes...
More informationUsing LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.
White Paper Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. Using LSI for Implementing Document Management Systems By Mike Harrison, Director,
More informationInformation Systems Analysis and Design CSC340. 2004 John Mylopoulos. Software Architectures -- 1. Information Systems Analysis and Design CSC340
XIX. Software Architectures Software Architectures UML Packages Client- vs Peer-to-Peer Horizontal Layers and Vertical Partitions 3-Tier and 4-Tier Architectures The Model-View-Controller Architecture
More informationDatabases in Organizations
The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron
More informationChapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases
More information