A Semantic Wiki approach for integrated data access for different workflow meta-models


DERI - Digital Enterprise Research Institute
Eyal Oren
DERI Technical Report, February 2006
DERI Galway, University Road, Galway, Ireland
DERI Innsbruck, Technikerstrasse 13, A-6020 Innsbruck, Austria


Abstract. Organisations commonly employ multiple workflow management systems, for various reasons. A multitude of employed workflow management systems leads to issues in data consistency and data retrieval, since organisational data is related but maintained in multiple systems and is only disparately available to employees. This paper outlines our research to address workflow data integration. Our integration of workflow data is based on our multi-meta-model process ontology m3po. As a unifying interface to the integrated data (for editing, adding, annotating, browsing, and retrieving workflow data) we introduce the paradigm of semantic wikis: easy-to-use collaborative editing systems.

Author affiliation: Digital Enterprise Research Institute, National University of Ireland, Galway.

Acknowledgements: This material is based upon works supported by the Science Foundation Ireland under Grant No. 02/CE1/I131.

Copyright © 2006 by the authors

Contents

1 Introduction
2 Workflow management
3 Related work
    Enterprise-wide workflow management
    Data integration
Approach
    Architecture
    Implementation
    User Interface
Wikis and Semantic Wikis
    Limitations of Wikis
    Semantic Wikis
    Architecture Overview
    Annotation language
    Information access
Semantic Wiki: Implementation
    System overview
    Writing
    Navigating
    Searching
    Storage
    Component interaction
    Addressing Wiki limitations
Summary

1 Introduction

Workflow management systems are widely used in organisations. They provide automated support for managing business processes; they operate on a schema definition of the processes in the organisation. Organisations commonly employ multiple workflow management systems, for historical, functional, or technical reasons [6]. Each workflow management system manages a part of the organisational processes and maintains a part of the organisational data.

A multitude of employed workflow management systems leads to issues in data consistency and data retrieval, since organisational data is maintained in multiple systems. Data consistency issues arise because the data in the workflow management systems are related and need to be kept consistent. Data retrieval issues arise because employees in the organisation need complete information, compiled from all workflow management systems.

This paper outlines our research to address the issues of data retrieval. We proceed as follows: first, in section 2, we give a detailed description of data management in workflow management; section 3 then outlines existing work that addresses these issues and explains where we position our work; section 4 describes our approach on an abstract level and section 5 describes our concrete solution, based on Semantic Wikis.

2 Workflow management

Workflow management deals with supporting business processes in organisations; it involves managing the flow of work through an organisation. A workflow is a collection of coordinated tasks designed to carry out a well-defined process [20]. A workflow management system is a generic information system that supports modelling, execution, management and monitoring of workflows. Such a system operates on a workflow specification, a description of the business processes in the organisation that should be supported.
A workflow management system can be compared to a database management system: it is a generic system that operates on a schema definition of the (processes in the) organisation. Workflow modelling is the task of creating workflow specifications that are used as input to a workflow management system.

Different workflow management systems have been developed, focusing on different application domains and providing different functionality. Workflow management lacks a standardised theoretical foundation; despite standardisation efforts, no consensus exists on the representation or conceptual model of workflows [22]. Jablonski and Busser [14] give a comprehensive overview of issues in workflow modelling, divided into five key aspects: functional, behavioural, informational, organisational, and operational.

An organisation can have several different workflow management systems deployed for managing its business processes [6]. The reasons for such diversity can be historical (e.g. the result of business acquisitions or mergers) or functional (e.g. the result of different requirements in organisational units). The information captured in these separate workflows is interrelated: the workflows belong to the same organisation and model related business processes. However, the interrelation between these different workflows cannot be captured in current workflow management systems, for two reasons. First, since each workflow management system

has a distinct meta-model and specification language, the workflows in one system are generally not understandable by the other systems: the representations of workflows differ across systems. Second, each workflow specification considers a closed world: it is not possible to refer to external entities (defined in other workflow management systems), so the workflows are disconnected.

In the current state of the art an organisation thus cannot access its information completely: the relation between different sets of workflow data is lost. Organisations cannot get a complete picture of their workflow data, and cannot see and edit their information in the correct and complete context.

Example 1. An academic organisation has procedures in place for arranging the travels of its employees. Before each trip, employees must request travel permission from their superior; after the trip, employees can apply for expense refunds only if permission was granted beforehand. An automated workflow, enacted with Microsoft BizTalk, is used to manage these travel requests and refunds.

Travel reservations are arranged centrally for each granted travel request. The employees do not arrange their own travels; instead, a dedicated travel coordinator does so for all travels in the organisation. The travel coordinator uses travel agencies to find and make travel arrangements, and has a workflow to manage these reservations with the travel agencies, which is enacted using IBM WorkflowMQ.

The problem is that, although the data in these two workflow management systems are related (both deal with employees and their superiors, status of travel requests, university travel regulations, etc.), one currently cannot easily retrieve this integrated information.
Where this information is needed, it has to be collected and maintained manually:

- In order to plan the ongoing work, the operations manager needs to know the current load and availability of all employees. He needs to go into the separate workflow management systems and collect the information from these systems manually.
- The financial officer needs to run queries that span data in both systems: she needs information about travel requests, the projects that they are charged to, the travel agency that dealt with them, and the costs that they incurred. Again, she needs to go into the separate workflow management systems and collect this information manually.
- The quality manager wants to document all processes in the organisation. All activities that are performed internally are described in standard operating procedures and stored on the internal network. Each document should be linked to the relevant activities, so that employees can quickly find documentation on the process that they are involved in. But the workflow management systems do not allow the quality manager to annotate activities with documentation or to link them to external resources. Even if they did, there would be no easy way for the quality manager to see and manage all that documentation. Again, he would need to go into all separate workflow management systems and collect the information manually.

Problem. Business processes that are defined and enacted in different workflow management systems are disconnected. Organisations that deploy multiple workflow management systems find their workflow information dispersed. Users have to manually integrate related workflow data.

3 Related work

3.1 Enterprise-wide workflow management

Bussler [6] discusses the management and execution of enterprise-wide workflows based on a case study performed at Boeing. Enterprise-wide workflow execution is difficult to achieve because of the heterogeneity between workflow systems. Even within one enterprise, workflow management systems are not homogeneous: the enterprise employs workflows in different functional domains (at Boeing, workflows exist for airplane design, for stock management, for internal travel management, for human resource management, etc.), and it is common to use a different workflow management system in each domain (because the workflow requirements are domain-specific and because workflow vendors offer domain-specific solutions).

To enable enterprise-wide workflow management, the various workflow management systems should not run in isolation but exchange workflow information and share workflow execution. The case study identifies the need for (i) workflow data integration, and (ii) distributed workflow execution.

Workflow data integration: workflow data integration provides a unified view of workflow data from the various workflow management systems; it is a special case of data integration, which we discuss in section 3.2. We can distinguish three dimensions of workflow data integration: (i) integration of workflow models (type integration), (ii) integration of completed workflow instances (history integration), and (iii) integration of possible workflow executions (projective integration). Type integration allows users, for example, to see all processes related to part XYZ. History integration allows users, for example, to see the execution time of all processes related to part XYZ that completed in the last three weeks. Projective integration allows users, for example, to project the time it will take to complete all unfinished workflows related to part XYZ.
Projective integration is only possible if the semantics of the workflow meta-models is captured in the integration.

Distributed workflow execution: distributed workflow execution is necessary as soon as cross-domain workflows are needed, since functional domains inside the enterprise develop their own workflow definitions and execute them in isolation. Distributed workflow execution is constituted by either instance migration, instance distribution, or instance replication [7]. Instance migration means that a workflow instance can migrate (move) from one workflow management system to another; this is only possible if the two workflow management systems implement the same execution semantics. Instance distribution means that a part of the workflow resides in another workflow management system. Instance replication means that an instance is replicated (continuously reflecting all changes) in two workflow management systems.

Distributed execution can be characterised on several dimensions [6]:

- direct vs. indirect distribution: if the workflows know each other they can execute each other directly. Otherwise external functionality is necessary for their cooperation, for instance coordination via shared databases or communication queues.

- objects of distribution: any object in a workflow management system (e.g. subworkflows, resources, instance data, applications) can be distributed, i.e. reside on a different installation than the rest of the workflow.
- distribution transparency: workflow distribution is transparent if the location of objects is not visible in the workflow specification, i.e. if workflow designers do not see (and do not need to design) the distribution of the workflow objects.

Integration architectures. Bussler [6] describes four possible architectures for workflow integration:

- user interface collocation: the workflow management systems are not integrated, but the user interfaces of different workflow management systems reside together on the user's desktop.
- user interface integration: one user interface is offered that accesses different underlying workflow engines. We can further classify this layer by whether the user interface integrates the workflow data from the source systems or strictly separates workflow objects from the various source systems.
- workflow logic integration: the workflow management systems know each other and can share executions.
- workflow database integration: the workflow management systems share one database. We can again differentiate whether the workflow management systems keep their objects separate, or share each other's objects.

In each of these layers, true workflow integration is only possible if the workflow management systems are designed and built with integration in mind; if the workflow management systems are independent and do not recognise objects from the other systems, native interoperation is not possible.

The case study demonstrates the heterogeneity in enterprise-wide workflows and the need for integrated user interfaces, workflow logic and workflow data. According to Bussler [6], current workflow management systems do not support the necessary integration and distribution functionality; an encapsulating infrastructure is necessary.
3.2 Data integration Data integration comprises problems in storing and manipulating heterogeneous data sources in a uniform way [19]. Data integration deals with combining data residing at different sources, and providing the user with an integrated view of these data [16]. Data integration systems provide a uniform query interface to a collection of autonomous data sources [9]. Data integration systems typically deal with one global (mediated) schema and a set of local (source) schemas, and offer transparent access to the local schemas through the global schema: queries and result sets are formulated in terms of the global schema. A basic problem in integrating different data sources is heterogeneity, ranging from the hardware and software the database systems are running on, to the data schemas and data models that structure the data, to the kinds of data that are being stored [12]. Semantic data integration focuses on the heterogeneity between the data schemas of the sources.

Standardisation policy. Trivially, standardising the source databases (hardware, software, and data schemas) solves the integration problem by preventing heterogeneity from arising. The source schemas are mandated to be the same as the global schema, and no semantic integration is necessary. Although such solutions are applied in practice (for example when using an enterprise-wide system such as SAP), we focus on autonomous data sources, where mandated standardisation is not possible.

Architectures. We can distinguish three basic architectures for data integration [9, 12, 16]: federation, mediation, and data warehousing. We first consider read-only access to integrated data; we will address read-write access later.

- federation [HM85]: a federated database system consists of source databases that agree to share partial data with other members of the federation. Each source database offers an interface for communication with the other source databases. Typically, members extend their own schema to incorporate subsets of the schemas of other members. A federation has neither a global query interface nor a global schema. To answer a query, a source can turn to others in the federation; it is the responsibility of the first source to integrate the results into a coherent answer.
- mediation [Wie92]: a mediated database system offers a single query interface to sources. Each source database is encapsulated by a wrapper that hides low-level protocols. A global mediator offers a common interface; it decomposes queries into subqueries that are sent to individual data sources, and integrates (mediates) their results into a coherent set. The mediated architecture offers a virtual view on the source data: the data is not collected in a database, but integrated on the fly for each query.
A critical element of this architecture is the description of the sources and their relation to the global schema; several approaches exist for managing these source descriptions, which we discuss below.
- data warehousing: the data warehousing approach offers a materialised view on the source databases. It regularly collects and integrates data from sources, and maintains these data in its separate database; users can query the integrated database directly. Data warehouses are typically faster than on-the-fly integration since the views are materialised in the warehouse. They also allow historical data analysis, since they can maintain temporal snapshots of data. Issues in data warehouses include the frequency of updates (changes in the source data are only represented in the warehouse after the next update step). As in the mediated architecture, the source descriptions are critical in data warehouses.

Source descriptions. Each source in the data integration system needs to be related to the common global schema used for integration. A source description, also called a mapping, relates the local data schema of a source to the common global data schema. Two basic approaches have been proposed for these mappings [16]: global-as-view (GAV) and local-as-view (LAV). Global-as-view considers the global schema as a view of local schemas; it requires that the global schema is expressed in terms of the local schemas. Local-as-view requires

the global schema to be defined independently, and then defines the source databases as views on the global schema.

Query processing is easier in GAV, since it can be done by unfolding the query using the mappings: the mappings describe directly how to rewrite the query in terms of the sources. In LAV, however, the mappings need to be reversed before they can be applied to the query; it is not directly clear how and which sources can be used to answer a query.

Data modelling is easier in LAV. In GAV the global schema is expressed as a view on all sources. If a source is changed or added, it can affect all the mappings in the system. In LAV each mapping is a relation between the global schema and one source; a change or addition of one source does not affect any other source mapping.

4 Approach

As discussed in section 2, organisations commonly find their business processes and their business data dispersed over different workflow management systems. Since these systems are disconnected, it is not possible to offer an integrated view over these data. Our approach consists of two parts: our unifying ontology m3po [10], which represents all common workflow management meta-models from existing workflow management systems, and our semantic wiki SemperWiki [21], which allows ordinary users to easily browse, view, edit, and manage semantic information.

4.1 Architecture

On the abstract level, we develop an integration architecture that offers transparent access to data from different workflow management systems. It allows integrated information management of business processes. Users can build connections between related data items and view, edit, and query workflow information. The architecture is shown in figure 1. It consists of a uniform workflow representation, a methodology for importing data from workflow management systems into this common format, and operations that users perform on the workflow data.
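The three parts of this architecture (common representation, import, user operations) can be illustrated with a small sketch based on Example 1. It is not the report's implementation: the record layouts, the identifier scheme (`request:1`, `reservation:7`) and the predicate names (`requestedBy`, `forRequest`, `cost`) are all hypothetical and are not taken from m3po. The sketch only assumes that both systems can export simple records, which are converted into subject-predicate-object statements and merged into one set.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def import_travel_requests(rows):
    """Convert rows from a hypothetical travel-request export into triples."""
    triples = set()
    for row in rows:
        req = f"request:{row['id']}"
        triples.add(Triple(req, "requestedBy", f"employee:{row['employee']}"))
        triples.add(Triple(req, "status", row["status"]))
    return triples


def import_reservations(rows):
    """Convert rows from a hypothetical reservation export into triples."""
    triples = set()
    for row in rows:
        res = f"reservation:{row['id']}"
        triples.add(Triple(res, "forRequest", f"request:{row['request_id']}"))
        triples.add(Triple(res, "agency", row["agency"]))
        triples.add(Triple(res, "cost", str(row["cost"])))
    return triples


def subjects(triples, predicate, obj):
    """All subjects that have the given predicate and object."""
    return {t.subject for t in triples if t.predicate == predicate and t.obj == obj}


def objects(triples, subject, predicate):
    """All objects stated for the given subject and predicate."""
    return {t.obj for t in triples if t.subject == subject and t.predicate == predicate}


def costs_for_employee(triples, employee):
    """Query spanning both sources: employee -> requests -> reservations -> cost."""
    result = {}
    for req in subjects(triples, "requestedBy", employee):
        for res in subjects(triples, "forRequest", req):
            for cost in objects(triples, res, "cost"):
                result[res] = cost
    return result


# Made-up sample data for the two systems of Example 1.
requests = [{"id": 1, "employee": "alice", "status": "granted"}]
reservations = [{"id": 7, "request_id": 1, "agency": "ExampleTravel", "cost": 420}]

# Merging is a plain set union because both imports use the same identifiers.
store = import_travel_requests(requests) | import_reservations(reservations)
print(costs_for_employee(store, "employee:alice"))
```

The merge step is trivial only because both importers agree on how a travel request is identified; in practice, agreeing on shared identifiers and terminology is the part that a common ontology has to provide.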
A uniform representation of workflow data addresses the syntactic and semantic differences between workflow management systems. All workflow models are represented in a common format and the semantics of all workflow constructs is specified. We use RDF as the common format and we use m3po to define the RDF terminology used and its semantics.

RDF [18] is a data model for describing resources on the Semantic Web. RDF statements are triples: each states a property of a resource. All resources have a unique identifier, and statements can be made about any resource. A set of RDF statements corresponds to a directed graph of resources and literals, connected by predicates. An RDF graph does not necessarily conform to a schema (therefore it is called semi-structured); it can be queried using graph patterns. RDF is suitable for information integration [8]: (i) its flexibility allows representing arbitrary data without a predefined schema, (ii) unique identifiers for each resource allow making statements about arbitrary objects, and (iii) the graph-based model allows merging data from different sources without problems.

Our multi-meta-model process ontology m3po [10] is an ontology that unifies existing meta-models in the workflow domain. It is based on various reference models and standards: the workflow

exchange format XPDL [24], the formal process ontology PSL [3], the workflow language YAWL [1], the web orchestration language WS-BPEL [23] and the choreography language WS-CDL [15].

Figure 1: Integration Platform

Methodology: The methodology to construct the m3po ontology is described in [10]. For our purposes it is important that the ontology is complete with respect to the workflow management systems it can represent, and that it is available in RDF.

A set of user operations on the common representation describes the tasks users need to perform on the workflow data. This set of tasks will be compiled from the requirements of the workflow domain.

4.2 Implementation

On the concrete level we implement this architecture. We develop a set of tools that (i) store and retrieve m3po; (ii) transform between certain workflow management systems and m3po; and (iii) provide the defined user operations in an easy-to-use manner.

Storage: for storing and retrieving m3po we can use standard RDF stores such as Redland [2], YARS [11], Jena2 [25], or Sesame [4]. These differ mostly in storage space, query functionality, support for higher-level semantics, performance and scalability, and programming interfaces. We will choose a store based on our technical requirements.

Transformation: we will develop transformation tools between Microsoft BizTalk and m3po, and between IBM WorkflowMQ and m3po.

User operations: To provide the identified user operations we use the SemperWiki platform and develop plugins that support specific workflow functionality. We thus utilise the generic data management functionality and provide extensions that add specific support for the workflow domain.

4.3 User Interface

We base our user interface implementation on the paradigm of semantic personal Wikis [21], and specifically on our tool SemperWiki, as explained in the next section.

5 Wikis and Semantic Wikis

Wiki Wiki Webs are collaborative hypertext environments, focused on open access, ease of use, and modification [17]. Wikis are interlinked web sites that can be collaboratively edited by anyone. Pages are written in a simple syntax so that even novice users can easily edit pages. The syntax consists of simple tags for creating links to other Wiki pages and textual markup such as lists and headings.

Wikis can be regarded as editing environments for the web. They are manifestations of the writable web [13], enabling users to write web content with the same skills and tools used to read it. Anecdotal evidence¹ suggests that the popularity of Wiki systems, compared to other collaborative hypertext systems, is due to their simplicity and their open and easy access [5].

The user interface of most web-based Wikis consists of two modes: in reading mode, the user is presented with normal web pages that can contain pictures, links, textual markup, etc. In editing mode, the user is presented with an editing box displaying the Wiki syntax of the page (containing the text and including the markup tags). During editing, the user can request a preview of the page, which is then rendered by the server and returned to the user. Some so-called desktop Wikis such as Tomboy² or WikidPad³ have only one mode: pages are directly editable.
Wikis are used for various collaborative tasks, including collecting general encyclopedic knowledge, event organisation, and writing research proposals and papers. Many sites run a Wiki as a community venue, enabling users to discuss and write on topics, such as product support or project documentation. The burden of editing and maintenance is thus shared over the whole community. Popular Wikis such as Wikipedia can grow in size very quickly, since interested visitors can edit and create pages at will; this poses new requirements on information access and retrieval. 5.1 Limitations of Wikis A shortcoming of large Wikis is lack of support for finding and maintaining information. The main reason is a lack of semantic structure in the Wiki content: almost all information is written in natural language, and has little machine-understandable semantics. For example, a page about John Grisham could contain a link to the page about The Pelican Brief. The English text would say that John Grisham wrote the Pelican Brief, but that information is not machine-understandable. This leads to the following consequences: 1 See e.g

- Structured access to a Wiki is not possible. This becomes apparent when browsing or searching for information. One cannot currently query Wiki systems, because the information is textual rather than structured. For example, users looking for "How old is John Grisham?", "Who wrote the Pelican Brief?", or "Which European authors have won the Nobel prize for literature?" cannot ask these questions directly. Instead, they have to navigate to the page that contains this information and read it themselves: they first have to locate the page, and then mentally process the information on that page. More complicated queries that require some background knowledge, such as "Find all the authors who are poets who are located in countries that are part of the European continent", are not possible at all.

  Another problem is in navigating the pages: Wikis allow users to easily make links from one page to other pages, and these links can then be used to navigate to related pages. But these explicit links are actually the only means of navigation⁴. If no explicit connection is made between two related pages, e.g. between two authors that have the same publishing company, then no navigation will be possible between those pages.

- Information reuse of Wiki content is not possible. Reusing information is useful when it becomes necessary to provide translations or to create views (as known from databases) of content. In current Wikis it is either assumed that people will speak a common language (usually English) or that translations to other languages will be provided. But manually translating pages is a maintenance burden, since the Wiki system does not recognise the structured information inside the page text. For example, a page about John Grisham contains structured information such as his birth date, the books he authored, and his publisher. Updates to this information have to be migrated manually to the translated versions of this page.
Considering that, for example, Wikipedia has translations of pages in up to 189 languages⁵, synchronisation of page versions can become quite a burden.

For reusing information the creation of views is often useful. As an example consider that, in general, books are written by an author and published by the author's publisher. The books authored by John Grisham (on his page) should therefore also automatically appear as books published by Random House (on their page). But creating such a view is not possible, and therefore the information has to be copied and maintained manually.

⁴ Except for back-references, appearing on each page, showing other pages that reference it.

5.2 Semantic Wikis

Generally speaking, a Semantic Wiki allows users to make formal descriptions of resources ("things") by annotating the pages that represent those resources. Where a regular Wiki enables users to describe resources in natural language, a Semantic Wiki enables users to additionally describe resources in a formal language.

Using the formal descriptions, or annotations, of resources, Semantic Wikis offer additional features over regular Wikis. Users can query the annotations directly ("show me all authors") or create views from such queries. Users can also navigate the Wiki using the annotated relations ("go

to other books by John Grisham"), and users can introduce background knowledge to the system ("all poets are authors; show me all authors").

In our vision, as Wikis are editing environments for the Web, Semantic Wikis will become editing environments for the Semantic Web. Finding the right balance between authoring effort and benefit is however crucial: different forms of knowledge authoring can be positioned on a continuum of invested effort and returned benefit. For example, knowledge written as free text requires little effort but also provides little benefit, since the information is unstructured and cannot be retrieved and reused efficiently; tagging texts with keywords requires slightly more effort and provides slightly improved retrieval; and formal ontology languages require significant authoring effort (authors are restricted in their possibilities and have to follow specific rules) but also provide significant benefits: automated support for knowledge retrieval, reuse, and reasoning.

Figure 2: Effort and benefit in knowledge authoring

On this continuum, Wikis have a flexible position: they allow users different levels of authoring (from free text to structure and layout markup). In our opinion that flexibility is key to their success: they do not force users into one single approach but can be used with various degrees of annotation. Each degree introduces an increased authoring effort but also an increased benefit: this gradual increase of effort and benefit allows users to adjust the authoring platform to their needs.

For Semantic Wikis to be adopted, this authoring continuum should be taken into account: we envision an evolutionary approach that offers different degrees of additional annotations with increasing effort and benefit. We therefore allow users to annotate their data, but we do not force them to do so. The benefits of adding annotations to Wiki pages are better navigation and better information retrieval.
The authoring effort is relatively low: the semantic annotations are very similar to the layout or structural directives that are already in widespread use in ordinary Wikis. In designing a Semantic Wiki system several architectural decisions need to be taken. In this section, we explain the basic architecture and outline the design choices and their consequences.

Architecture Overview

A Semantic Wiki consists (at least) of the following components: a user interface, a parser, a data analyser, and a data store, as shown in figure 3. First we introduce each component, then we discuss the information access, the annotation language, and the ontological representation of the Wiki.

Figure 3: Architecture of a Semantic Wiki

Users can browse, edit, and query pages via the user interface. When users edit a page, the user interface notifies the parser. The parser analyses the text, and extracts annotations and links. All data (text, annotations, etc.) are stored in the semantic storage. From the data in the storage, the analyser computes sets of pages that are related to the current page, which are displayed by the user interface. Queries are posed to the storage, and the results are displayed by the user interface. All these operations should happen unobtrusively in the background, so as to provide the user with a responsive application.

The user interface component is responsible for all user interaction. If the Wiki is web-based (the classical model), then the user interface is a server-based component that generates web pages to be viewed in a browser. The user interface can also be a desktop application; in that case the Wiki can be used for personal note-taking⁶, or collaboration functionality is offered through shared storage. The user interface shows pages, their annotations, and navigation possibilities to related pages. It allows users to type text and annotations in a freely intermixed fashion. The user interface also shows available terms from shared ontologies, enabling users to browse for an appropriate term to use in their annotations⁷.

The parser component converts the text written by the user into objects: it parses the text for semantic annotations, layout directives, and links.

The data analyser is responsible for computing a set of related resources from a given page.
In a regular Wiki, this means just to find all back-references, i.e. pages that link to the current one. In a semantic environment, the relations between resources are much richer. The data analyser uses the annotations about the current page, found by the parser, and searches for relevant relations in the data store (such as other books by current author, or other persons that have the same parents as the current one ). 6 as has recently become quite popular, e.g. Tomboy or WikidPad. 7 descriptions can be shared and understood if users write them in a common terminology; browsing ontologies helps users to find the appropriate common term for their annotations.

The semantic storage is responsible for storing and retrieving the semantic annotations. The location of the datastore determines collaboration features:
1. The datastore is hosted locally, on the same machine as the Semantic Wiki:
   (a) if the user interface is server-based and supports multiple users, collaboration is possible;
   (b) if the user interface is desktop-based, then the Semantic Wiki is limited to single-person usage.
2. The datastore is hosted on a server; the user interface can still be desktop-based, but users can collaborate using the same shared datastore.
3. The datastore is hosted locally, with peer-to-peer connections to other stores on other machines (distributed).

5.2.2 Annotation language

For the user of a Semantic Wiki, the most visible change compared to conventional Wikis is the modified annotation language. In a Semantic Wiki the annotation language is responsible not only for changing the text style and creating links, but also for the semantic annotation of Wiki pages and for writing embedded queries in a page.

Annotation primitives

As in conventional Wikis, internal links are written in CamelCase or by enclosing them in brackets; external links are written as full absolute URIs, or are abbreviated using namespace abbreviations. Internal Wiki links are expanded to absolute URIs using the (usually configurable) Wiki base namespace; namespace abbreviations are expanded using configurable namespace definitions. Semantic annotations are written on a separate line and, following RDF conventions, consist of a predicate followed by an object. Predicates can only be resources; objects can be either resources or literals. Annotations are expanded to triples using the resource that the page is describing as the subject of the triple.
To annotate the current Wiki page itself, instead of the resource that the page is describing, annotations have to be prefixed with an exclamation mark (since annotations of resources are more common than annotations of pages). An example page with annotations is displayed in figure 4. It describes John Grisham, an author published by Random House. The information about John Grisham is a mix of natural-language text in English and formal annotations.

    Page: JohnGrisham
    John Grisham is an author and retired lawyer.
    rdf:type foaf:Person
    dc:publisher RandomHouse

Figure 4: Example page
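The expansion of annotation lines into triples described above can be sketched as follows. This is a minimal illustration, not the SemperWiki parser: the namespace table, the base namespace `http://wiki.example.org/`, and the function names are assumptions made for the example, and the rule used to tell annotations apart from ordinary text (a known prefix followed by a single object) is a simplification.

```python
# Hypothetical namespace table and Wiki base namespace (illustration only).
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
WIKI_BASE = "http://wiki.example.org/"


def is_literal(term):
    """A literal is written between double quotes."""
    return len(term) >= 2 and term.startswith('"') and term.endswith('"')


def expand(term):
    """Expand an abbreviated name or an internal Wiki link to an absolute URI."""
    if "://" in term:
        return term  # already an absolute URI
    prefix, sep, local = term.partition(":")
    if sep and prefix in NAMESPACES:
        return NAMESPACES[prefix] + local
    return WIKI_BASE + term  # internal Wiki link


def parse_annotations(text, resource_uri, page_uri):
    """Turn annotation lines into (subject, predicate, object) triples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        subject = resource_uri
        if line.startswith("!"):  # annotation about the page itself
            subject = page_uri
            line = line[1:].lstrip()
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        predicate, obj = parts
        prefix, sep, _ = predicate.partition(":")
        if not sep or prefix not in NAMESPACES:
            continue  # ordinary text, not an annotation
        if is_literal(obj):
            triples.append((subject, expand(predicate), obj))
        elif " " not in obj:
            triples.append((subject, expand(predicate), expand(obj)))
    return triples
```

Calling `parse_annotations` on the text of figure 4 yields one triple per annotation line, with the described resource as subject; a line starting with an exclamation mark would instead use the page URI as subject.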

Wiki pages as resources

Wiki pages often refer to real-world resources, and annotations may refer both to the page resource and to the resource that is described. For example, a triple stating that JohnGrisham was "created on" some date can refer to the creation date (or birth date) of the person John Grisham, or to the creation date of the Wiki page about that person. We discuss three possibilities to resolve this issue:
1. We may use the same URI to denote pages and resources, but duplicate the problematic predicates in different namespaces, as shown in figure 5. The predicate semperwiki:date refers to the creation of the page; the birthdate predicate refers to the creation of the person.
2. We may use the same URI and the same predicate for these statements, but represent the scope of statements in RDF contexts, as shown in figure 6. Users may create arbitrary scopes when they make a statement, for example the scope of JohnGrisham as a page, of JohnGrisham as a person, and of JohnGrisham as a user of the Wiki system. When consuming the information, the scope explains how to interpret the statements, i.e. whether we are talking about JohnGrisham as a page or as a person.
3. We may separate the resource into the page resource and the real-world resource, as shown in figure 7. We have a resource that identifies the page about John Grisham, and we have another resource that identifies the person. The page has an about predicate, linking it to the resource it describes.

Figure 5: Duplicating predicates

Approaches 1 and 2 have the disadvantage that the same URI is intentionally used for two different resources, so that mechanisms outside the shared understanding of the RDF graph are necessary to deduce the true meaning of the URI. The URI then no longer acts as an identifier for exactly one resource. We therefore follow approach 3 and introduce different URIs to denote the Wiki page and the resource that the Wiki page describes.
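A small sketch may help to see why approach 3 removes the ambiguity. It assumes a hypothetical URI scheme (`/page/` and `/resource/` under `http://wiki.example.org/`) and hypothetical predicates for "about" and "created on"; the dates are placeholders, not data from the report.

```python
WIKI_BASE = "http://wiki.example.org/"          # hypothetical base namespace
ABOUT = "http://wiki.example.org/ns#about"      # stands in for an "about" predicate
CREATED = "http://wiki.example.org/ns#created"  # hypothetical "created on" predicate


def page_uri(name):
    """URI of the Wiki page (assumed naming scheme)."""
    return WIKI_BASE + "page/" + name


def resource_uri(name):
    """URI of the described resource, derived syntactically from the name."""
    return WIKI_BASE + "resource/" + name


def describe(name, page_created, resource_created):
    """Build the triples for a page and the resource it describes."""
    page, resource = page_uri(name), resource_uri(name)
    return [
        (page, ABOUT, resource),                # the page is about the resource
        (page, CREATED, page_created),          # creation date of the page
        (resource, CREATED, resource_created),  # creation (birth) date of the person
    ]


def created_on(triples, subject):
    """All 'created on' values recorded for one subject."""
    return [o for (s, p, o) in triples if s == subject and p == CREATED]


def described_resource(triples, page):
    """Follow the 'about' link from a page to its resource, if any."""
    for s, p, o in triples:
        if s == page and p == ABOUT:
            return o
    return None


# Placeholder dates, for illustration only.
graph = describe("JohnGrisham", "2006-02-01", "1955-01-01")
```

Because the page and the person have different URIs, the same predicate can be used for both without any scoping mechanism: `created_on(graph, page_uri("JohnGrisham"))` and `created_on(graph, resource_uri("JohnGrisham"))` return different values.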
Since we expect that annotation of the resource that the page describes will occur more frequently, it is justifiable to make it syntactically easier to annotate this resource instead of the Wiki page itself. In case the author of a Wiki page has naming authority, it is often convenient to assume an automatic means to create the URI of the described resource, for example by a simple syntactic manipulation of the page URL.

Figure 6: Scoping statements

Figure 7: Separating page and resource

Advanced annotations

In case the author of a Wiki page does not have naming authority over the resource, it is necessary to explicitly specify the URI of the described resource. For example, figure 8 shows a page that describes the research institute DERI. The page uses the semper:about predicate to relate the page to the resource (DERI, identified by urn://deri.ie) that it is describing. The annotations state, using the Semantic Web Research Community ontology, that DERI is a research institute, founded in June 2003, and located in Galway. The last annotation, prefixed with an exclamation mark, refers to the page instead of the resource; it states that Eyal Oren is the creator of that page.

    Page: DERI Galway
    DERI Galway is one location of the Digital Enterprise Research Institute, researching Semantic Web technology; our main page is at
    semper:about urn://deri.ie
    rdf:type swrc:organization
    swrc:location "Galway"
    swrc:created " "
    !dc:creator EyalOren

Figure 8: Annotating real-world resources

Figure 9 shows the RDF graph that is generated from the page in figure 8. The page DERI Galway has a property text, it has a creator and creation date that are populated by the Wiki system, and it contains a navigational hyperlink to the institute's main page. The page DERI Galway contains information about the resource urn://deri.ie. This resource is the actual subject of the annotations: it is a research organisation, it has a creation date and location, and it has a logical relation to the resource.

Figure 9: Corresponding RDF graph about DERI Galway

The annotation mechanism can be used to annotate arbitrary resources, which includes creating or editing ontologies. Figure 10, for example, shows how to describe a new class Cluster (an organisational structure) in the DERI ontology.

Arbitrary annotations

The proposed annotation syntax is simple and direct, but users can only annotate the current page, which is a severe limitation. If one describes an address book, for example, then each contact in the address book needs to be described on its own Wiki page. But if one just wants to make a list of contacts and their addresses, then making a separate page for each contact is time-consuming. It is also not possible to make statements about unnamed resources ("blank nodes" in RDF), since each resource that is described has to have a named page.

    Page: DeriCluster
    semper:about deri:cluster
    rdf:type rdfs:Class
    rdfs:subClassOf swrc:organization
    We have now created a new class Cluster in our DERI ontology. This class appears in the ontology browser, and can be used as any other class. We can add new properties to the ontology in the same way.

Figure 10: Editing ontologies

But unnamed resources are common in reality; see for example figure 11.

Figure 11: RDF graph of an address

A Semantic Wiki needs to offer a syntax for arbitrary RDF statements that provides a shorthand for multiple values for a key and allows resources to be described on arbitrary pages. Similar issues have been worked on in the Semantic Web community, leading to several compact syntaxes for RDF; we propose to use Turtle for the arbitrary annotations.

Representing Wiki pages in RDF

We have developed a simple representation format for storing Wiki content and annotations in a triple store. Figure 9 shows the example of figure 8 together with the additional metadata about the Wiki page that is stored in the datastore. This additional metadata captures the creator, the creation time, and the links to other pages, in order to provide the conventional hyperlink functionality.

Embedded queries

Embedded queries (generating views) are written using triple patterns: sequences of subject, predicate, and object that can contain variables (names that start with a question mark). A triple pattern is interpreted as a query: triples matching the pattern are returned. Patterns can be combined to form joins.
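The evaluation of a single embedded triple pattern can be sketched as below. The helper names are hypothetical, and the rule for recognising a query line (exactly three terms, at least one of them a variable) is an assumption made for this example; a variable repeated within one pattern is not checked for consistency here.

```python
def is_variable(term):
    """Variables are names that start with a question mark."""
    return term.startswith("?")


def matches(pattern, triple):
    """A triple matches if every non-variable term of the pattern is equal."""
    return all(is_variable(p) or p == t for p, t in zip(pattern, triple))


def extract_queries(page_text):
    """Treat lines of three terms with at least one variable as queries."""
    queries = []
    for line in page_text.splitlines():
        terms = line.split()
        if len(terms) == 3 and any(is_variable(t) for t in terms):
            queries.append(tuple(terms))
    return queries


def render_view(page_text, kb):
    """Evaluate every embedded query of a page against the knowledge base."""
    return {
        query: [triple for triple in kb if matches(query, triple)]
        for query in extract_queries(page_text)
    }


# Toy knowledge base using the abbreviated names from the examples in the text.
KB = [
    ("JohnGrisham", "rdf:type", "foaf:Person"),
    ("TheFirm", "dc:creator", "JohnGrisham"),
    ("TheJury", "dc:creator", "JohnGrisham"),
    ("ThePelicanBrief", "dc:creator", "JohnGrisham"),
]

PAGE = """John Grisham is an author and retired lawyer.
rdf:type foaf:Person
dc:publisher RandomHouse
this query shows all his books:
?book dc:creator JohnGrisham"""
```

`render_view(PAGE, KB)` maps the pattern `?book dc:creator JohnGrisham` to the three matching triples; re-running it after the knowledge base changes gives an updated view.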

Figure 12 shows the earlier example page about John Grisham, including an embedded query at the bottom of the page. The query returns all books written by JohnGrisham; it creates a view on the data that is displayed below the page text.

    Page: JohnGrisham
    John Grisham is an author and retired lawyer.
    rdf:type foaf:Person
    dc:publisher RandomHouse
    this query shows all his books:
    ?book dc:creator JohnGrisham

    TheFirm dc:creator JohnGrisham
    TheJury dc:creator JohnGrisham
    ThePelicanBrief dc:creator JohnGrisham

Figure 12: Page showing embedded query

5.2.3 Information access

Information access is offered through structured navigation and various querying facilities.

Navigation

Navigation in ordinary Wikis is limited to explicit links entered by users. It is not possible to navigate the information based on structural relations. As explained, that is a severe limitation. A Semantic Wiki provides the metadata necessary to navigate the information in a structured way. For example, knowing that John Grisham is an author, we can automatically show all other authors in the system, and offer navigation to them.

One approach for structural navigation is faceted meta-data browsing, or category-based browsing [26]. In faceted browsing, the information space is partitioned using orthogonal conceptual dimensions (facets) of the data, which can be used to constrain the relevant elements in the information space. For example, a collection of art works can consist of facets such as type of work, time period, artist name, geographical location, etc. Users can select a certain facet value (e.g. 20th century) to constrain the visible collection to only the art works with that value. Multiple constraints can be applied conjunctively: by selecting the facet values "book" (art form), "21st century" (period), "located in USA", and "John Grisham" (artist name), the selection is restricted to the books written by John Grisham in the USA in the 21st century.
Using the structured metadata enables faceted browsing of the Wiki information space. We use the metadata of the pages to partition the information space into facets. Partitions (sets of resources) are created for each (predicate, object) pair in the knowledge base, as follows:

$R_{p,o} = \{\, \mathit{page} \mid (\mathit{page}, p, o) \in KB \,\}$,

where $(s, p, o)$ denotes the triple (subject, predicate, object) and $KB$ is the knowledge base consisting of all RDF statements in the datastore. Each set $R_{p,o}$ is a partition of the space on a particular predicate-object pair, and contains the resources that have that particular predicate-object pair. Together the sets $R_{p,o}$ form the base partition of our information space. For example, the set $R_{\text{dc:creator},\,\text{JohnGrisham}}$ contains The Pelican Brief, The Firm, The Jury, etc., and the set $R_{\text{rdf:type},\,\text{foaf:Person}}$ contains John Grisham, John le Carré, Joseph Roth, etc.

When the user is viewing some selection (e.g. all books by John Grisham) or a single item (e.g. The Pelican Brief), we display the related items based on the partitioning: navigational links are shown for all partitions that have a non-empty intersection with the current selection. For example, if the user is looking at all books by John Grisham, then we would show links to (i) all books (since the current selection contains books), (ii) all books published by Random House (since the current selection contains books published by Random House), (iii) all books from 1995 (since the current selection contains books from 1995), etc. If the user is viewing, for example, the page about John Grisham, we would show: (i) all people (since John Grisham is a person), (ii) all authors, (iii) all authors published by Random House, etc.

The backlinks are computed in a similar fashion: each set $B_{sel,pred}$ contains the back-references to a certain selection of pages $sel$, using a certain predicate $pred$. These back-references are computed as the union of all pages that point to some page in the current selection ($p \in sel$) using the predicate $pred$:

$B_{sel,pred} = \bigcup_{p \in sel} \{\, \mathit{page} \mid (\mathit{page}, pred, p) \in KB \,\}$.

This means that each set $B_{sel,pred}$ contains the pages that point to a page in the selection. For example, the backlinks for the selection containing John Grisham and Madeleine Albright would include books written by John Grisham (pages with dc:creator John Grisham), friends of Madeleine Albright (pages with foaf:knows Madeleine Albright), etc.
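The partitioning and the backlink computation above can be sketched directly from their set definitions. The function names and the toy knowledge base (including the `ex:Book` class and the page `SomeFriend`) are illustrative assumptions; the knowledge base is a plain list of triples rather than an RDF store.

```python
from collections import defaultdict


def base_partition(kb):
    """Map each (predicate, object) pair to the set R_{p,o} of its subjects."""
    partition = defaultdict(set)
    for subject, predicate, obj in kb:
        partition[(predicate, obj)].add(subject)
    return dict(partition)


def related_facets(selection, kb):
    """Partitions that have a non-empty intersection with the selection."""
    return {
        key: members
        for key, members in base_partition(kb).items()
        if members & selection
    }


def backlinks(selection, pred, kb):
    """B_{sel,pred}: pages pointing to some page in the selection via pred."""
    return {s for (s, p, o) in kb if p == pred and o in selection}


# Toy knowledge base; 'ex:Book' and 'SomeFriend' are made up for the example.
KB = [
    ("TheFirm", "rdf:type", "ex:Book"),
    ("TheFirm", "dc:creator", "JohnGrisham"),
    ("TheFirm", "dc:publisher", "RandomHouse"),
    ("ThePelicanBrief", "rdf:type", "ex:Book"),
    ("ThePelicanBrief", "dc:creator", "JohnGrisham"),
    ("JohnGrisham", "rdf:type", "foaf:Person"),
    ("SomeFriend", "foaf:knows", "MadeleineAlbright"),
]
```

For the selection of both books, `related_facets` returns the facets "all books", "all works by John Grisham", and "all works published by Random House"; `backlinks({"JohnGrisham", "MadeleineAlbright"}, "dc:creator", KB)` returns the two books. Recomputing the whole partition on every call is a simplification; a real implementation would presumably query the store instead.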
Querying

We distinguish three kinds of querying functionality: keyword search, queries, and views:
1. A keyword-based full-text search is useful for simple information retrieval, and is supported by all conventional Wiki systems.
2. Structured queries use the annotations to allow more advanced information retrieval. The user can query the Wiki for pages (or resources) that satisfy certain properties. We suggest using triple patterns that include free variables as basic query statements. To retrieve all authors, for example, one can query for "?x type author". Triple patterns can be combined to form database-like joins: "?x type author and ?x has-publisher ?y" retrieves all authors and their publishing companies.
3. By embedding queries into the Wiki page, users can create persistent searches or views on the Wiki. A query included on a page is executed each time the page is visited, and continuously shows up-to-date query results. Embedded queries are discussed further elsewhere in this report.
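The join of several triple patterns mentioned in item 2 can be sketched as a small backtracking search over variable bindings. This is a naive illustration, assuming an in-memory list of triples; the function names and the entries `JaneDoe` and `ExamplePress` are made up for the example, and a real store would use indexes rather than scanning all triples for every pattern.

```python
def unify(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    result = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def solve(patterns, kb, bindings=None):
    """Return all variable bindings that satisfy every pattern (a join)."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        return [bindings]
    solutions = []
    for triple in kb:
        extended = unify(patterns[0], triple, bindings)
        if extended is not None:
            solutions.extend(solve(patterns[1:], kb, extended))
    return solutions


# Toy data; 'JaneDoe' and 'ExamplePress' are hypothetical entries.
KB = [
    ("JohnGrisham", "type", "author"),
    ("JohnGrisham", "has-publisher", "RandomHouse"),
    ("JaneDoe", "type", "author"),
    ("JaneDoe", "has-publisher", "ExamplePress"),
    ("RandomHouse", "type", "publisher"),
]

QUERY = [("?x", "type", "author"), ("?x", "has-publisher", "?y")]
```

`solve(QUERY, KB)` returns one binding per author, pairing each `?x` with its publisher `?y`; the shared variable `?x` is what turns the two patterns into a join.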

5.3 Semantic Wiki: Implementation

This section presents SemperWiki, our prototype implementation of a Semantic Wiki [21]. It is a desktop application that follows the previously discussed architecture; the architecture itself is equally applicable to Web systems. SemperWiki is implemented in Ruby, using the GTK windowing toolkit for the graphical user interface. It is open source, consists of around 1500 lines of code, and is available for download.

5.3.1 System overview

SemperWiki is a Semantic Personal Wiki that can be used for personal knowledge management. The main advantages compared to a normal Wiki are intelligent navigation, semantic search, and embedded queries. All information in SemperWiki can be annotated semantically, and all information can be exported and shared on the Semantic Web.

Figure 13: SemperWiki user interface

The user interface of SemperWiki is shown in figure 13. On the left-hand side the user can edit pages, and on the right-hand side the user can navigate. At the top, there is the menu bar, which gives access to all functions, and the location bar, showing the current page. Navigating through SemperWiki works similarly to a Web browser: the user navigates to a page by typing its address or by clicking a link to that page; one can also go back and forth through the browsing history.

SemperWiki offers a sidebar with intelligent navigation links, which can be used to navigate to related pages. These links are based on the faceted browsing explained in section 5.2.3: the links show related information categorised per facet. Ordinary Wikis only show pages that contain links to the current one; SemperWiki shows much richer related information.

In the top right corner, the semantic search functionality is visible: one can find pages by listing one or more properties, such as the author or the publisher of a book. Ordinary Wikis only offer full-text search.

The main text shows an example of embedded queries: a query can be embedded in any page; its results are displayed each time the page is visited. Each page can contain an arbitrary number of such queries. Ordinary Wikis do not offer such functionality.

5.3.2 Writing

A page in SemperWiki can consist of arbitrary text, links to other pages or websites, and annotations. Links can be internal, to other pages in the Wiki, or external, to arbitrary Web pages. Clicking internal links navigates to them; clicking external links opens them in their default application.

SemperWiki aims for high usability: it should be easy and fast to use for both novice and experienced users. We adhere to the Gnome human interface guidelines, a set of guidelines for consistent and user-friendly user interfaces. All functions are accessible through keyboard shortcuts. The interface provides instant response to all user actions: links are tagged instantly, correct annotations are coloured green, and changing annotations changes the sidebar in real time. After writing some pages, the user can just hit Escape to exit; all user changes are instantly saved to the data storage, and SemperWiki remembers the cursor position on each page: the next time SemperWiki is started, everything is exactly how the user left it.

To help users reuse existing terminology instead of inventing their own, and thus to enable better sharing of knowledge, SemperWiki offers a simple ontology browser. Users can browse terms from common ontologies, and import others by entering the URL of the ontology. We show the terms (classes and properties) in each ontology; the user can browse those terms and drag-and-drop them to the current Wiki page.
When the user drags a class, we complete it to a full annotation by prefixing it with "rdf:type"; otherwise we just insert the selected property, and the user only needs to give an object value for the property.

5.3.3 Navigating

The navigation bar, on the right in figure 14, shows intelligent navigation options based on the annotations of the current page and the information in the knowledge base. The related pages are categorised per facet, computed by the data analyser. The data analyser returns several sets of related pages, representing different facets in which other pages relate to the current one. We display those sets in order from most specific to most general, ending with the set of all pages.

5.3.4 Searching

Figure 14 also shows a query, embedded in the ordinary text. The query results are shown at the bottom of the screen. Each result can be clicked on, which navigates to the page where the statement was made; the user could then edit the statement. Any changes to the embedded query or the underlying data are reflected instantly in the result set: the results are continuously up to date.

Figure 14: Navigating and Information reuse

5.3.5 Storage

The storage component persistently stores and retrieves the pages and their annotations. We use RDF as the data model, and a standard RDF database for storing our data. All our data (page text, page annotations, user preferences) are stored in the RDF store. We use Redland [2] for storing and retrieving RDF. Redland is a mature RDF store. It has a small footprint and can easily be embedded in an application. It has bindings for many different languages, including Ruby. We do not yet use contexts of statements, since support for triple contexts in Redland is not complete: contexts are not first-class citizens and cannot be used in the same manner as subjects, predicates, and objects; in particular, it is not possible to directly find the context of a statement, other than by traversing all existing contexts.

5.3.6 Component interaction

To keep the system responsive, components cooperate via event notification: each component is subscribed to relevant events of other components; upon notification, components do their job in the background, and notify other components when they are done.

The parser is subscribed to the user interface; it gets notified on each text change (when the user adds or removes some text). It then parses the page in the background, updates the set of queries and annotations belonging to that page, and notifies the components subscribed to that change: the user interface, the storage, and the analyser. The user interface looks for changes in the embedded queries and, if any are found, asks the storage for query results. The data analyser looks for changes in the annotations and computes the related pages (which the user interface displays); the storage is subscribed to changes in the annotations as well, and stores them persistently.
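The subscription chain described above can be sketched with a minimal publish/subscribe helper. This is not SemperWiki's code (which is written in Ruby): the `EventSource` class, the handler names, and the crude line classification are assumptions for illustration, and handlers run synchronously here, whereas the text describes them working in the background.

```python
class EventSource:
    """A tiny publish/subscribe helper: handlers are called on each event."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def publish(self, payload):
        for handler in list(self._handlers):
            handler(payload)


text_changed = EventSource()  # published by the user interface on each edit
page_parsed = EventSource()   # published by the parser when it is done

log = []     # records the order in which components ran
stored = []  # stands in for the persistent RDF store


def parser(text):
    """Split the page into embedded queries and annotations, then notify."""
    log.append("parser")
    lines = [line.strip() for line in text.splitlines()]
    parsed = {
        "queries": [line for line in lines if line.startswith("?")],
        # Crude rule: two terms, the first of which is a prefixed name.
        "annotations": [
            line for line in lines
            if len(line.split()) == 2 and ":" in line.split()[0]
        ],
    }
    page_parsed.publish(parsed)


def user_interface(parsed):
    """Would ask the storage for results of changed embedded queries."""
    log.append("user interface")


def storage(parsed):
    """Persist the annotations of the page."""
    log.append("storage")
    stored.extend(parsed["annotations"])


def analyser(parsed):
    """Would recompute the related pages for the sidebar."""
    log.append("analyser")


text_changed.subscribe(parser)
for component in (user_interface, storage, analyser):
    page_parsed.subscribe(component)
```

Publishing a text change, for example `text_changed.publish("rdf:type foaf:Person")`, triggers the parser, which in turn notifies the user interface, the storage, and the analyser. Decoupling the components this way means a new subscriber can be added without changing the parser.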


More information

12 The Semantic Web and RDF

12 The Semantic Web and RDF MSc in Communication Sciences 2011-12 Program in Technologies for Human Communication Davide Eynard nternet Technology 12 The Semantic Web and RDF 2 n the previous episodes... A (video) summary: Michael

More information

Semantic Search in Portals using Ontologies

Semantic Search in Portals using Ontologies Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br

More information

Software Life-Cycle Management

Software Life-Cycle Management Ingo Arnold Department Computer Science University of Basel Theory Software Life-Cycle Management Architecture Styles Overview An Architecture Style expresses a fundamental structural organization schema

More information

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets!! Large data collections appear in many scientific domains like climate studies.!! Users and

More information

TopBraid Insight for Life Sciences

TopBraid Insight for Life Sciences TopBraid Insight for Life Sciences In the Life Sciences industries, making critical business decisions depends on having relevant information. However, queries often have to span multiple sources of information.

More information

Information Services for Smart Grids

Information Services for Smart Grids Smart Grid and Renewable Energy, 2009, 8 12 Published Online September 2009 (http://www.scirp.org/journal/sgre/). ABSTRACT Interconnected and integrated electrical power systems, by their very dynamic

More information

Towards a Semantic Wiki Wiki Web

Towards a Semantic Wiki Wiki Web Towards a Semantic Wiki Wiki Web Roberto Tazzoli, Paolo Castagna, and Stefano Emilio Campanini Abstract. This article describes PlatypusWiki, an enhanced Wiki Wiki Web using technologies from the Semantic

More information

Security Issues for the Semantic Web

Security Issues for the Semantic Web Security Issues for the Semantic Web Dr. Bhavani Thuraisingham Program Director Data and Applications Security The National Science Foundation Arlington, VA On leave from The MITRE Corporation Bedford,

More information

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World COSC 304 Introduction to Systems Introduction Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca What is a database? A database is a collection of logically related data for

More information

Data Modeling for Big Data

Data Modeling for Big Data Data Modeling for Big Data by Jinbao Zhu, Principal Software Engineer, and Allen Wang, Manager, Software Engineering, CA Technologies In the Internet era, the volume of data we deal with has grown to terabytes

More information

Semantic Stored Procedures Programming Environment and performance analysis

Semantic Stored Procedures Programming Environment and performance analysis Semantic Stored Procedures Programming Environment and performance analysis Marjan Efremov 1, Vladimir Zdraveski 2, Petar Ristoski 2, Dimitar Trajanov 2 1 Open Mind Solutions Skopje, bul. Kliment Ohridski

More information

Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches

Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches Concepts of Database Management Seventh Edition Chapter 9 Database Management Approaches Objectives Describe distributed database management systems (DDBMSs) Discuss client/server systems Examine the ways

More information

Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens

Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens Scalable End-User Access to Big Data http://www.optique-project.eu/ HELLENIC REPUBLIC National and Kapodistrian University of Athens 1 Optique: Improving the competitiveness of European industry For many

More information

Data Quality in Information Integration and Business Intelligence

Data Quality in Information Integration and Business Intelligence Data Quality in Information Integration and Business Intelligence Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies

More information

Data Integration. Maurizio Lenzerini. Universitá di Roma La Sapienza

Data Integration. Maurizio Lenzerini. Universitá di Roma La Sapienza Data Integration Maurizio Lenzerini Universitá di Roma La Sapienza DASI 06: Phd School on Data and Service Integration Bertinoro, December 11 15, 2006 M. Lenzerini Data Integration DASI 06 1 / 213 Structure

More information

Taming Big Data Variety with Semantic Graph Databases. Evren Sirin CTO Complexible

Taming Big Data Variety with Semantic Graph Databases. Evren Sirin CTO Complexible Taming Big Data Variety with Semantic Graph Databases Evren Sirin CTO Complexible About Complexible Semantic Tech leader since 2006 (née Clark & Parsia) software, consulting W3C leadership Offices in DC

More information

A Business Process Services Portal

A Business Process Services Portal A Business Process Services Portal IBM Research Report RZ 3782 Cédric Favre 1, Zohar Feldman 3, Beat Gfeller 1, Thomas Gschwind 1, Jana Koehler 1, Jochen M. Küster 1, Oleksandr Maistrenko 1, Alexandru

More information

FIPA agent based network distributed control system

FIPA agent based network distributed control system FIPA agent based network distributed control system V.Gyurjyan, D. Abbott, G. Heyes, E. Jastrzembski, C. Timmer, E. Wolin TJNAF, Newport News, VA 23606, USA A control system with the capabilities to combine

More information

Functional Requirements for Digital Asset Management Project version 3.0 11/30/2006

Functional Requirements for Digital Asset Management Project version 3.0 11/30/2006 /30/2006 2 3 4 5 6 7 8 9 0 2 3 4 5 6 7 8 9 20 2 22 23 24 25 26 27 28 29 30 3 32 33 34 35 36 37 38 39 = required; 2 = optional; 3 = not required functional requirements Discovery tools available to end-users:

More information

2. Basic Relational Data Model

2. Basic Relational Data Model 2. Basic Relational Data Model 2.1 Introduction Basic concepts of information models, their realisation in databases comprising data objects and object relationships, and their management by DBMS s that

More information

Web-Based Genomic Information Integration with Gene Ontology

Web-Based Genomic Information Integration with Gene Ontology Web-Based Genomic Information Integration with Gene Ontology Kai Xu 1 IMAGEN group, National ICT Australia, Sydney, Australia, kai.xu@nicta.com.au Abstract. Despite the dramatic growth of online genomic

More information

Linked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint

Linked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint Linked Data Interface, Semantics and a T-Box Triple Store for Microsoft SharePoint Christian Fillies 1 and Frauke Weichhardt 1 1 Semtation GmbH, Geschw.-Scholl-Str. 38, 14771 Potsdam, Germany {cfillies,

More information

Structure of Presentation. The Role of Programming in Informatics Curricula. Concepts of Informatics 2. Concepts of Informatics 1

Structure of Presentation. The Role of Programming in Informatics Curricula. Concepts of Informatics 2. Concepts of Informatics 1 The Role of Programming in Informatics Curricula A. J. Cowling Department of Computer Science University of Sheffield Structure of Presentation Introduction The problem, and the key concepts. Dimensions

More information

RDF Resource Description Framework

RDF Resource Description Framework RDF Resource Description Framework Fulvio Corno, Laura Farinetti Politecnico di Torino Dipartimento di Automatica e Informatica e-lite Research Group http://elite.polito.it Outline RDF Design objectives

More information

An Ontology-based e-learning System for Network Security

An Ontology-based e-learning System for Network Security An Ontology-based e-learning System for Network Security Yoshihito Takahashi, Tomomi Abiko, Eriko Negishi Sendai National College of Technology a0432@ccedu.sendai-ct.ac.jp Goichi Itabashi Graduate School

More information

LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model

LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model 22 October 2014 Tony Hammond Michele Pasin Background About Macmillan

More information

Semantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo

Semantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo Expected Outcomes You will learn: Basic concepts related to ontologies Semantic model Semantic web Basic features of RDF and RDF

More information

Service Oriented Architecture

Service Oriented Architecture Service Oriented Architecture Charlie Abela Department of Artificial Intelligence charlie.abela@um.edu.mt Last Lecture Web Ontology Language Problems? CSA 3210 Service Oriented Architecture 2 Lecture Outline

More information

Grids, Logs, and the Resource Description Framework

Grids, Logs, and the Resource Description Framework Grids, Logs, and the Resource Description Framework Mark A. Holliday Department of Mathematics and Computer Science Western Carolina University Cullowhee, NC 28723, USA holliday@cs.wcu.edu Mark A. Baker,

More information

technische universiteit eindhoven WIS & Engineering Geert-Jan Houben

technische universiteit eindhoven WIS & Engineering Geert-Jan Houben WIS & Engineering Geert-Jan Houben Contents Web Information System (WIS) Evolution in Web data WIS Engineering Languages for Web data XML (context only!) RDF XML Querying: XQuery (context only!) RDFS SPARQL

More information

Enterprise Application Designs In Relation to ERP and SOA

Enterprise Application Designs In Relation to ERP and SOA Enterprise Application Designs In Relation to ERP and SOA DESIGNING ENTERPRICE APPLICATIONS HASITH D. YAGGAHAVITA 20 th MAY 2009 Table of Content 1 Introduction... 3 2 Patterns for Service Integration...

More information

XML DATA INTEGRATION SYSTEM

XML DATA INTEGRATION SYSTEM XML DATA INTEGRATION SYSTEM Abdelsalam Almarimi The Higher Institute of Electronics Engineering Baniwalid, Libya Belgasem_2000@Yahoo.com ABSRACT This paper describes a proposal for a system for XML data

More information

Automatic Timeline Construction For Computer Forensics Purposes

Automatic Timeline Construction For Computer Forensics Purposes Automatic Timeline Construction For Computer Forensics Purposes Yoan Chabot, Aurélie Bertaux, Christophe Nicolle and Tahar Kechadi CheckSem Team, Laboratoire Le2i, UMR CNRS 6306 Faculté des sciences Mirande,

More information

zen Platform technical white paper

zen Platform technical white paper zen Platform technical white paper The zen Platform as Strategic Business Platform The increasing use of application servers as standard paradigm for the development of business critical applications meant

More information

A Semantic web approach for e-learning platforms

A Semantic web approach for e-learning platforms A Semantic web approach for e-learning platforms Miguel B. Alves 1 1 Laboratório de Sistemas de Informação, ESTG-IPVC 4900-348 Viana do Castelo. mba@estg.ipvc.pt Abstract. When lecturers publish contents

More information

RDF graph Model and Data Retrival

RDF graph Model and Data Retrival Distributed RDF Graph Keyword Search 15 2 Linked Data, Non-relational Databases and Cloud Computing 2.1.Linked Data The World Wide Web has allowed an unprecedented amount of information to be published

More information

Firewall Builder Architecture Overview

Firewall Builder Architecture Overview Firewall Builder Architecture Overview Vadim Zaliva Vadim Kurland Abstract This document gives brief, high level overview of existing Firewall Builder architecture.

More information

Oracle Endeca Server. Cluster Guide. Version 7.5.1.1 May 2013

Oracle Endeca Server. Cluster Guide. Version 7.5.1.1 May 2013 Oracle Endeca Server Cluster Guide Version 7.5.1.1 May 2013 Copyright and disclaimer Copyright 2003, 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of

More information

Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery

Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery Dimitrios Kourtesis, Iraklis Paraskakis SEERC South East European Research Centre, Greece Research centre of the University

More information

Integrating and Exchanging XML Data using Ontologies

Integrating and Exchanging XML Data using Ontologies Integrating and Exchanging XML Data using Ontologies Huiyong Xiao and Isabel F. Cruz Department of Computer Science University of Illinois at Chicago {hxiao ifc}@cs.uic.edu Abstract. While providing a

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

Understanding Web personalization with Web Usage Mining and its Application: Recommender System Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,

More information

Information Access Platforms: The Evolution of Search Technologies

Information Access Platforms: The Evolution of Search Technologies Information Access Platforms: The Evolution of Search Technologies Managing Information in the Public Sphere: Shaping the New Information Space April 26, 2010 Purpose To provide an overview of current

More information

Database Resources. Subject: Information Technology for Managers. Level: Formation 2. Author: Seamus Rispin, current examiner

Database Resources. Subject: Information Technology for Managers. Level: Formation 2. Author: Seamus Rispin, current examiner Database Resources Subject: Information Technology for Managers Level: Formation 2 Author: Seamus Rispin, current examiner The Institute of Certified Public Accountants in Ireland This report examines

More information

COGNOS Query Studio Ad Hoc Reporting

COGNOS Query Studio Ad Hoc Reporting COGNOS Query Studio Ad Hoc Reporting Copyright 2008, the California Institute of Technology. All rights reserved. This documentation contains proprietary information of the California Institute of Technology

More information

CONCEPTCLASSIFIER FOR SHAREPOINT

CONCEPTCLASSIFIER FOR SHAREPOINT CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running

More information

SCADE System 17.0. Technical Data Sheet. System Requirements Analysis. Technical Data Sheet SCADE System 17.0 1

SCADE System 17.0. Technical Data Sheet. System Requirements Analysis. Technical Data Sheet SCADE System 17.0 1 SCADE System 17.0 SCADE System is the product line of the ANSYS Embedded software family of products and solutions that empowers users with a systems design environment for use on systems with high dependability

More information

Service-Oriented Architectures

Service-Oriented Architectures Architectures Computing & 2009-11-06 Architectures Computing & SERVICE-ORIENTED COMPUTING (SOC) A new computing paradigm revolving around the concept of software as a service Assumes that entire systems

More information

Contents. Introduction... 1

Contents. Introduction... 1 Managed SQL Server 2005 Deployments with CA ERwin Data Modeler and Microsoft Visual Studio Team Edition for Database Professionals Helping to Develop, Model, and Maintain Complex Database Architectures

More information

How To Create An Enterprise Class Model Driven Integration

How To Create An Enterprise Class Model Driven Integration Creating an Enterprise Class Scalable Model Driven Infrastructure The use case for using IBM, OSIsoft, and SISCO technologies Version: 1.1 Date: May 28, 2009 Systems Integration Specialist Company, Inc.

More information

A Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System

A Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System A Multidatabase System as 4-Tiered Client-Server Distributed Heterogeneous Database System Mohammad Ghulam Ali Academic Post Graduate Studies and Research Indian Institute of Technology, Kharagpur Kharagpur,

More information

MarkLogic Enterprise Data Layer

MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer MarkLogic Enterprise Data Layer September 2011 September 2011 September 2011 Table of Contents Executive Summary... 3 An Enterprise Data

More information

SAS BI Dashboard 4.3. User's Guide. SAS Documentation

SAS BI Dashboard 4.3. User's Guide. SAS Documentation SAS BI Dashboard 4.3 User's Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2010. SAS BI Dashboard 4.3: User s Guide. Cary, NC: SAS Institute

More information

CONTEMPORARY SEMANTIC WEB SERVICE FRAMEWORKS: AN OVERVIEW AND COMPARISONS

CONTEMPORARY SEMANTIC WEB SERVICE FRAMEWORKS: AN OVERVIEW AND COMPARISONS CONTEMPORARY SEMANTIC WEB SERVICE FRAMEWORKS: AN OVERVIEW AND COMPARISONS Keyvan Mohebbi 1, Suhaimi Ibrahim 2, Norbik Bashah Idris 3 1 Faculty of Computer Science and Information Systems, Universiti Teknologi

More information

CitationBase: A social tagging management portal for references

CitationBase: A social tagging management portal for references CitationBase: A social tagging management portal for references Martin Hofmann Department of Computer Science, University of Innsbruck, Austria m_ho@aon.at Ying Ding School of Library and Information Science,

More information

Federated, Generic Configuration Management for Engineering Data

Federated, Generic Configuration Management for Engineering Data Federated, Generic Configuration Management for Engineering Data Dr. Rainer Romatka Boeing GPDIS_2013.ppt 1 Presentation Outline I Summary Introduction Configuration Management Overview CM System Requirements

More information

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014 Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4

More information

Oracle Warehouse Builder 10g

Oracle Warehouse Builder 10g Oracle Warehouse Builder 10g Architectural White paper February 2004 Table of contents INTRODUCTION... 3 OVERVIEW... 4 THE DESIGN COMPONENT... 4 THE RUNTIME COMPONENT... 5 THE DESIGN ARCHITECTURE... 6

More information

Community Edition. Master Data Management 3.X. Administrator Guide

Community Edition. Master Data Management 3.X. Administrator Guide Community Edition Talend Master Data Management 3.X Administrator Guide Version 3.2_a Adapted for Talend MDM Studio v3.2. Administrator Guide release. Copyright This documentation is provided under the

More information

POLAR IT SERVICES. Business Intelligence Project Methodology

POLAR IT SERVICES. Business Intelligence Project Methodology POLAR IT SERVICES Business Intelligence Project Methodology Table of Contents 1. Overview... 2 2. Visualize... 3 3. Planning and Architecture... 4 3.1 Define Requirements... 4 3.1.1 Define Attributes...

More information

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. White Paper Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. Using LSI for Implementing Document Management Systems By Mike Harrison, Director,

More information

Information Systems Analysis and Design CSC340. 2004 John Mylopoulos. Software Architectures -- 1. Information Systems Analysis and Design CSC340

Information Systems Analysis and Design CSC340. 2004 John Mylopoulos. Software Architectures -- 1. Information Systems Analysis and Design CSC340 XIX. Software Architectures Software Architectures UML Packages Client- vs Peer-to-Peer Horizontal Layers and Vertical Partitions 3-Tier and 4-Tier Architectures The Model-View-Controller Architecture

More information

Databases in Organizations

Databases in Organizations The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db book.com for conditions on re use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases

More information