Large Scale Integration of Senses for the Semantic Web

Jorge Gracia
IIS Department, University of Zaragoza
Zaragoza, Spain

Mathieu d'Aquin
Knowledge Media Institute, The Open University
United Kingdom
m.daquin@open.ac.uk

Eduardo Mena
IIS Department, University of Zaragoza
Zaragoza, Spain
emena@unizar.es

ABSTRACT

Nowadays, the increasing amount of semantic data available on the Web leads to a new stage in the potential of Semantic Web applications. However, it also introduces new issues due to the heterogeneity of the available semantic resources. One of the most remarkable is redundancy, that is, the excess of different semantic descriptions, coming from different sources, that describe the same intended meaning. In this paper we propose a technique to perform a large scale integration of senses (expressed as ontology terms), in order to cluster the most similar ones when indexing large amounts of online semantic information. It can dramatically reduce the redundancy problem on the current Semantic Web. To make this objective feasible, we have studied the adaptability and scalability of our previous work on sense integration, translating it to the much larger scenario of the Semantic Web. Our evaluation shows a good behaviour of these techniques when used in large scale experiments, thus making the proposed approach feasible.

Categories and Subject Descriptors
H.3 [Information Systems]: Information Storage and Retrieval; H.4.m [Information Systems]: Miscellaneous

General Terms
Algorithms, Performance

Keywords
Semantic Web, scalable sense integration, ontologies

Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2009, April 20-24, 2009, Madrid, Spain.

1. INTRODUCTION

An increasing amount of online ontologies and semantic data is available on the Web, enabling a new generation of intelligent applications that exploit this semantic information as a source of knowledge [5]. However, the growing amount of semantic content also leads to an increasing semantic heterogeneity. Heterogeneity is inherent to the current (and future) Semantic Web. Far from assuming an ideal Semantic Web (under the authority of a few high-quality large ontologies, with maximal coverage and expressivity levels), we advocate learning how to deal with the real one instead, where semantics is constructed by different authors, for very different domains, and with very different levels of expressivity and richness.

The redundancy problem. As a direct consequence of heterogeneity in the Semantic Web, the problem of redundancy arises. It is characterized by the excess of different semantic descriptions, coming from different sources, that describe the same intended meaning. For example, if one searches for ontological terms to describe the keyword "apple" in the Swoogle semantic web search engine, 262 different results are obtained (checked on 23 October 2008, using the exact match mode, localname:apple, for the query). Each of them represents a possible semantic description for a particular meaning of "apple". Obviously, this number is well above the real polysemy of the word "apple". For example, the WordNet corpus identifies only two possible meanings for this word (the fruit and the tree), while Wikipedia identifies around 20. However, this is still far from the number of possible senses given by Swoogle.
If we take a look at the first page of results in Swoogle, the redundancy problem becomes evident (e.g., the meaning of apple as a kind of fruit appears repeated in five out of ten ontological terms). In order to solve the problem of redundancy, we propose a method to cluster the ontology terms that one can find on the Semantic Web, according to the meaning that they intend to represent. Achieving such a clustering is not only essential for human users to better apprehend the results of semantic web search engines, and generally the content of the Semantic Web, but is also crucial for applications that rely on knowledge from the Semantic Web (see e.g. [5]). Indeed, eliminating redundancy is a way to improve performance for such applications, but, more importantly, knowing which ontological entities refer to the same sense, and which refer to different meanings, makes possible the automatic selection of relevant knowledge in a more accurate and precise way.

Proposed solution. Such an ambitious goal cannot be tackled without relying on an index of the ontological elements accessible through the Semantic Web. For this, our work is supported by the Watson system [4]. Watson is a gateway to the Semantic Web that crawls the Web to find and index available semantic resources, providing a single access point to online semantic information.

It also provides efficient services to support application developers in exploiting the Semantic Web's voluminous, distributed and heterogeneous data.

To tackle the problem of redundancy reduction on the Semantic Web, we define a clustering technique that we apply to the base of ontological terms collected by Watson, and that creates groups of ontological terms having similar meanings. The result is a set of concise senses for each keyword one can find in Watson. For example, a search for "apple" will return all the ontology terms that refer to the meaning "fruit", grouped together as a single integrated sense.

Different integration levels are possible, to adapt the granularity of sense discrimination to the requirements of client applications, and to the different points of view of users. Let us imagine that a user considers that "Apple" as an electronics label and "Apple" as a computer company should be treated differently, while for many other users they intend the same meaning. Therefore, it is important to make the integration flexible, keeping the original source of information accessible in case the user is not satisfied with the proposed integration.

To achieve this goal of sense integration we do not start from scratch. As a matter of fact, in [21] we presented a technique that integrates various keyword senses when they are similar enough. It was developed for a system that analyzes a keyword-based user query in order to automatically extract and make explicit, without ambiguities, its semantics. However, this technique was initially targeted at a small scale context, the one given by the few keywords involved in a user query. Therefore, an additional goal that motivates this paper is to adapt these small scale integration techniques to a much more ambitious context: the Semantic Web reachable by the Watson gateway. The idea of embedding these integration techniques in the process of indexing the Semantic Web leads to another, more specific task, namely a scalability study, in order to check the feasibility of these techniques when applied to a much larger body of knowledge.

In this work we have tackled both the redundancy and scalability problems. First, we propose a large scale integration method that applies our semantic techniques in order to reduce the redundancy problem on the current Semantic Web, and second, we contribute a scalability study to evaluate the feasibility of our proposal.

The rest of the paper is organized as follows. In Section 2 we present some previous work which serves as background for the present study. In Section 3, our proposal for large scale redundancy reduction is presented. Section 4 focuses on how to optimize the integration level and, in Section 5, we detail the scalability study that we have performed in order to check the suitability of our techniques. Some illustrating examples and further discussion can be found in Section 6. Section 7 reviews some related work and, finally, conclusions and future work appear in Section 8.

2. BACKGROUND

In this section we briefly present some background work that serves as the basis for this study.

2.1 Discovering Senses of User Keywords

In previous work [21] we developed a system that analyzes a keyword-based user query in order to automatically extract and make explicit its semantics.
Firstly, it discovers candidate senses for these keywords among pools of web ontologies (accessed by Watson [4] or Swoogle [8]). Other local ontologies and lexical resources, such as WordNet [15], can also be accessed. Secondly, it extracts the ontological context that characterizes each candidate sense, and an alignment and integration step is carried out in order to reduce redundancy. Finally, a disambiguation process is run to pick the most probable sense for each keyword, according to the context. The result can eventually be used in the construction of a well-defined semantic query (expressed in a formal language) to make explicit the intended meaning of the user [3, 21].

It was not the first technique to match and merge ontological information from different ontologies [7]. Nevertheless, it was quite novel in combining the use of any available pool of online ontologies (without being restricted to a set of predefined ones) as a source of knowledge, as well as in the extraction and use of only the relevant portions of semantic descriptions, instead of whole ontologies (following the recommendation in [1]). The absence of pre-processing tasks is also a remarkable characteristic of this method.

Among all the different techniques developed in [21], we will focus, in this work, on the alignment and integration services. They give us the tools we need to recognize similarities among ontological terms, and to integrate them when necessary. The latest version of the alignment service is called CIDER (Context and Inference based aligner), and it was successfully tested in the Ontology Alignment Evaluation Initiative (OAEI) 2008 [10].

2.2 Watson: A gateway to the Semantic Web

As already mentioned, Watson [4] is a gateway to the Semantic Web. It makes use of a set of dedicated crawlers to collect online semantic content. A range of validation and analysis steps is then performed to extract information from the collected semantic documents, including information about the ontological terms they describe and their relations. This information provides a valuable base of metadata, employed to create indexes of the semantic content currently available on the Web.

Watson provides both a Web interface for human users and a set of Web services and APIs for building applications that exploit the semantic information available on the Web. These services and this interface provide various functionalities for searching and exploring online semantic documents, ontologies, and ontological terms, including keyword search, metadata retrieval, the exploration of ontological term descriptions, and formal query facilities. This combination of mechanisms for searching, retrieving and querying online semantic content provides all the necessary elements enabling applications to select and exploit online semantic resources, without having to download the corresponding ontologies or to identify them at design time. Many applications have already been developed that dynamically exploit online semantic content in an open domain thanks to Watson [5].

3. LARGE SCALE INTEGRATION SYSTEM

In this section we summarize our proposed large scale redundancy reduction technique. The idea is to carry out an off-line process that analyzes the whole set of ontology terms indexed in Watson (classes, properties and individuals), in order to identify clusters that group the terms according to their intended meaning, by exploring how similar they are. These equivalent ontology terms are also integrated, to show a unified semantic description of the meaning they represent. A second processing step is carried out at run-time, when the system needs to be adapted, in terms of the granularity of the provided clustering, to accommodate the user's view or the application requirements. An overview of the approach is shown in Figure 1, and it will be explained in the following subsections.

Figure 1: Scheme of the approach (ontology terms from Watson are grouped into keyword maps, expanded by synonyms into synonym maps, and then extracted, compared and integrated into senses; the integration degree can later be modified by raising or lowering the threshold).

3.1 Preliminary definitions

Let us call C, P, I the sets of all classes, properties, and individuals, respectively, indexed by Watson. In the rest of the paper, we will call ontology term (and denote ot) any element of the union set: ot ∈ C ∪ P ∪ I. We will denote OE the set of all ontological elements that can be accessed by Watson, considering as elements not only ontology terms, but also any other semantic information that characterizes an ontology: hierarchies, axioms, lexical entries, etc. (we adopt here the characterization of ontology given in [6]). A keyword k will represent, throughout this paper, an element of the set of strings S (sequences of letters of any length over an alphabet), without any special significance. We will represent the set of all possible keywords as K.

Definition 1. Watson search. Let us denote P(C), P(P) and P(I) the σ-algebras formed by the power sets (sets of all subsets) of C, P, I respectively. We define the function Watson search as

wsearch(k) : K → P(C) ∪ P(P) ∪ P(I)

from a keyword k to the set of ontology terms retrieved from Watson for this keyword. In particular, we employ a dedicated function of the Watson API corresponding to the search of any ontological term that matches the keyword in its label or identifier. The matching that is used is said to be exact, as it discards any ontological term for which the keyword is only a part of the identifier or label. (However, this matching is not case sensitive and relies on simple tokenization techniques to properly match compound terms independently of the employed separators; e.g., "cup-of-tea" matches "CupOfTea".)

Definition 2. Extraction. Given the power set (set of all subsets) of OE, which we denote P(OE), we define extraction as a function

ext : C ∪ P ∪ I → P(OE)

that, for each ontology term, retrieves its neighbouring ontological elements characterizing its meaning. We are deliberately not being more specific in this definition, leaving open the particular criteria of what is considered as part of the neighbourhood of an ontology term. For our purposes, we use the extraction described in [21, 10].

Definition 3. Ontological context. Given an extraction function ext, we define the ontological context of an ontology term ot, denoted OC_ot, as the element of P(OE) such that

OC_ot = ext(ot)

Finally, in the following subsections we will introduce the use of a similarity measure. Recalling the definition given in [7], similarity is a function from a pair of objects to a real number expressing how similar they are.
For our purposes, the compared objects will be subsets of OE, even when, for simplicity, we say similarity between ontology terms instead of between their ontological contexts.

3.2 Keyword maps

Due to the huge amount of semantic information available on the current Semantic Web, it is not computationally feasible to establish comparisons among all accessible ontology terms in order to cluster them. We have conceived, instead, a pre-processing step that reduces the space of comparisons by establishing an initial grouping of all ontology terms with identical labels. In the following, we call these groups keyword maps.

Definition 4. Keyword map. Given a keyword k ∈ K, and a set of ontology terms OT ∈ P(C) ∪ P(P) ∪ P(I), we define the keyword map of k as the tuple ⟨k, OT⟩, such that OT = wsearch(k).

Here, and in the rest of the paper, we will use the term map to mean any abstract structure that contains, among other information, an associative array (a collection of unique keys and their associated values) of ontology terms, where their URIs act as unique keys.
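To make the grouping step concrete, the following is a minimal Python sketch of how keyword maps could be built over a local stand-in for the Watson index. All names (OntologyTerm, normalize, build_keyword_maps) and the simplistic label normalization are illustrative assumptions, not part of the actual Watson API.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set

@dataclass(frozen=True)
class OntologyTerm:
    uri: str          # URIs act as the unique keys of a map
    label: str        # label or local name of the class, property or individual
    term_type: str    # "class", "property" or "individual"

def normalize(label: str) -> str:
    """Case-insensitive, separator-independent key (so 'cup-of-tea' and 'CupOfTea' collide)."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)   # split CamelCase
    return "".join(ch.lower() for ch in spaced if ch.isalnum())

def build_keyword_maps(terms: Iterable[OntologyTerm]) -> Dict[str, Set[OntologyTerm]]:
    """Group all indexed ontology terms by normalized label: one keyword map per keyword."""
    keyword_maps: Dict[str, Set[OntologyTerm]] = defaultdict(set)
    for term in terms:
        keyword_maps[normalize(term.label)].add(term)
    return dict(keyword_maps)
```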

3.3 Synonym maps

We subsequently enrich these keyword maps by including all the possible synonym labels, obtained through the Watson synonymy service. We call the resulting sets of ontology terms direct synonym maps (or simply synonym maps). More formally:

Definition 5. Direct synonymy. We define direct synonymy between two different keywords k1, k2 ∈ K, denoted k1 ∼ k2, as a relationship that satisfies

wsearch(k1) ∩ wsearch(k2) ≠ ∅

That is, different keywords are considered synonyms if they return the same ontology term in a Watson search.

Definition 6. Direct synonym map. We define a direct synonym map as the tuple ⟨SYN, OT⟩, where SYN ⊆ K and OT ∈ P(C) ∪ P(P) ∪ P(I), such that

∀ ki ∈ SYN, ∃ kj ∈ SYN : ki ∼ kj
∀ ot ∈ OT, ∃ k ∈ SYN : ot ∈ wsearch(k)

Up to this point, we have grouped all ontology terms corresponding to an equivalent syntactical expression. This is symbolized in Figure 2, where a synonym map is represented, grouping not only the ontological terms that correspond to "apple" but also those of any of its direct synonyms. Note that if multilingual ontologies have been accessed, terms in different languages could be provided as synonyms, as for example "manzana" (the translation of "apple" in Spanish):

wsearch("manzana") ∩ wsearch("apple") ≠ ∅

Figure 2: Synonym map: expanding the keyword map with synonyms.
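As one possible realization of Definitions 5 and 6, the sketch below merges keyword maps that share at least one term URI into synonym maps, using a small union-find over keywords and URIs. It assumes keyword maps are given as keyword-to-URI-set dictionaries; this greedy construction is illustrative, and the actual Watson synonymy service may group labels differently.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

def build_synonym_maps(keyword_maps: Dict[str, Set[str]]) -> List[Tuple[Set[str], Set[str]]]:
    """Merge keyword maps into synonym maps <SYN, OT>: keywords whose searches share
    at least one ontology term URI end up in the same map."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    # connect every keyword to every URI it retrieves
    for kw, uris in keyword_maps.items():
        for uri in uris:
            union("k:" + kw, "u:" + uri)

    groups: Dict[str, Tuple[Set[str], Set[str]]] = defaultdict(lambda: (set(), set()))
    for kw, uris in keyword_maps.items():
        syn, ot = groups[find("k:" + kw)]
        syn.add(kw)
        ot.update(uris)
    return list(groups.values())

# e.g. {"apple": {"u1", "u2"}, "manzana": {"u2"}, "plant": {"u3"}}
# -> [({"apple", "manzana"}, {"u1", "u2"}), ({"plant"}, {"u3"})]
```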
3.4 Sense clustering

Each execution of this process receives as input a synonym map, containing all the ontology terms that are candidates to describe the meaning of a certain keyword. An iterative algorithm takes each ontology term, extracts the ontological context that describes the term in the ontology, and computes its similarity degree with respect to each of the other terms (and corresponding ontological contexts) in the synonym map, by using the measure described in [10, 21]. The extracted ontological context depends on the type of term, and it includes synonyms, comments, hypernyms, hyponyms, domains, ranges, etc. A reasoner enhances this extraction step by adding some inferred facts that are not present in the asserted ontology. Although different inference levels can be applied, we use simple inferences based on the transitivity of the subclass relation, because it has shown a good balance between quality of results and processing time in our experiments. The access methods in the Watson API also provide ways to access classes and subclasses inferred through transitivity (thus saving us much time in local processing and reasoning).

The similarity computation explores the ontological context of the compared terms, providing a measure of how similar they are. When the obtained value is under a given threshold, we consider them as different senses, and the algorithm continues comparing other terms. If, on the contrary, the similarity is high enough, the terms are considered equivalent and they are integrated into a single sense. The comparison process is then reinitiated between the new sense and the rest of the terms in the synonym map. If we find a new equivalence between the new sense and another ontology term, a further integration is done. We can consider this an agglomerative clustering technique [14]. In the following definitions we clarify what we mean by integration of ontology terms and equivalence between two terms.

Definition 7. Integration. Let us call T either C, P or I. Given a certain extraction function, we define the integration of n ∈ ℕ ontological terms, denoted int(ot1, ..., otn), as the function int : T^n → P(OE) such that

int(ot1, ..., otn) = OC_ot1 ∪ ... ∪ OC_otn

Definition 8. Equivalence. Let us call T either C, P or I. Given a similarity measure sim between two sets of ontological elements, and a certain threshold, we define equivalence between two different ontology terms ot1, ot2 ∈ T with respect to sim, as a relationship, denoted ot1 ≡ ot2, such that there exists a tuple t ∈ T^k (k ∈ ℕ) that satisfies one of the following:

sim(OC_ot1, int(ot2, t)) ≥ threshold
sim(int(ot1, t), OC_ot2) ≥ threshold

Note that a consequence of the previous definition is that

sim(OC_ot1, OC_ot2) ≥ threshold ⇒ ot1 ≡ ot2

(in the case t = ∅). That is, two ontology terms are equivalent if their contexts are similar enough, or if one of them belongs to an integration which is similar enough to the other. In general, we use sim(ot1, ot2) as an equivalent notation for sim(OC_ot1, OC_ot2).
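The following is a minimal Python sketch of the agglomerative clustering loop just described, under simplifying assumptions: contexts are plain sets of strings, the CIDER similarity is replaced by a Jaccard overlap, and each new term is compared against existing integrations in a single greedy pass rather than reinitiating the full comparison as the paper's algorithm does.

```python
from typing import Callable, List, Set

def cluster_senses(contexts: List[Set[str]],
                   sim: Callable[[Set[str], Set[str]], float],
                   threshold: float) -> List[List[int]]:
    """Greedily cluster the terms of one synonym map into senses.
    contexts[i] is the ontological context OC of term i; returns groups of term indices."""
    clusters: List[List[int]] = []      # each cluster is a sense map (indices of its terms)
    integrations: List[Set[str]] = []   # int(ot_1, ..., ot_n): union of member contexts (Def. 7)
    for i, ctx in enumerate(contexts):
        for members, integ in zip(clusters, integrations):
            if sim(ctx, integ) >= threshold:   # equivalent w.r.t. the current integration (Def. 8)
                members.append(i)
                integ |= ctx                   # integrate the new term into the sense
                break
        else:
            clusters.append([i])
            integrations.append(set(ctx))
    return clusters

# Toy run with a Jaccard overlap standing in for the CIDER similarity measure
jaccard = lambda a, b: len(a & b) / len(a | b) if a | b else 0.0
print(cluster_senses([{"fruit", "food"}, {"fruit", "tree"}, {"company"}], jaccard, 0.25))
# -> [[0, 1], [2]]
```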

The output of this process is a set of integrated senses. Each sense groups the ontology terms that correspond to the same intended meaning, as exemplified in Figure 3. We will store the obtained integrated senses in the output structures that we define in the following:

Figure 3: Sense maps obtained for "apple".

Definition 9. Sense map. We define a sense map as the tuple ⟨SYN, OT⟩, where SYN ⊆ K, and OT ∈ P(C) ∪ P(P) ∪ P(I), such that

∀ oti, otj ∈ OT : oti ≡ otj
∀ k ∈ SYN, ∃ ot ∈ OT : ot ∈ wsearch(k)

Note that, by definition, a sense map only contains ontology terms of the same type. Therefore classes, properties and individuals are treated separately, never trying to integrate ontology terms of different nature.

Sense maps group together ontological terms that, according to their similarity, are expected to represent the same meaning. As we have to deal with their integrated ontological information, they are part of the more complete definition of sense. The following definition accommodates (and slightly extends) the idea of sense presented in [21]:

Definition 10. Sense. We define a sense as the tuple ⟨SYN, OT, int(OT), descr, pop, syndgr⟩, where ⟨SYN, OT⟩ is a sense map, int(OT) is the integration of the involved ontology terms (including also the graph that describes the integration topologically), descr is a textual description, pop (popularity) is the number of integrated ontology terms, pop = |OT|, and syndgr is a value corresponding to the resultant similarity degree of the integration.

Our process never destroys semantic information because, even when the final sense includes an integration containing only parts of the involved ontologies, it also preserves all original ontological terms in a sense map. This makes it possible, when needed, for a particular application to traverse the sense map and reach further information from the original ontologies. This clustering process is repeated with the rest of the synonym maps, eventually creating a pool of integrated senses which covers all ontology terms in the Watson indexes.
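A compact Python rendering of the sense structure of Definition 10 might look as follows; the field names are illustrative and the integration is reduced to a flat set of context elements rather than the graph mentioned in the definition.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set

@dataclass
class Sense:
    """Sense tuple <SYN, OT, int(OT), descr, pop, syndgr> from Definition 10 (simplified)."""
    syn: FrozenSet[str]                                   # SYN: keywords covered by this sense
    terms: FrozenSet[str]                                 # OT: URIs of the grouped ontology terms
    integration: Set[str] = field(default_factory=set)    # int(OT): union of member contexts
    descr: str = ""                                       # textual description of the sense
    syndgr: float = 0.0                                   # similarity degree of the integration

    @property
    def pop(self) -> int:
        """Popularity: number of integrated ontology terms, |OT|."""
        return len(self.terms)
```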

3.5 Modifying integration degree

As commented before, a threshold is given as input to our system to decide whether or not to integrate two senses. Later, in Section 4, we explore possible ways to decide an optimal threshold. However, a fixed threshold leads to a particular integration scheme. That is, the obtained clusters represent what our application interprets as the optimal integration, but this may differ from other opinions or necessities. Therefore, we consider that different integration levels should be possible, to adapt the granularity of sense discrimination to the requirements of client applications, or to the different points of view of users. We propose a subsequent run-time process that further integrates or disintegrates the initial clusters, as represented in Figure 4. Without entering into details, the idea is that, given a new threshold and the set of senses associated to a certain keyword, the system has to explore how to modify the integration according to the new threshold.

Figure 4: Integration/disintegration example.

A rising threshold (newthreshold > threshold) means a lower integration level. Each considered sense s is disintegrated, and the sense clustering algorithm is applied again to its ontology terms (the set OT of s). More than one new sense can be created during the process, covering the meaning that the initial s represented.

A falling threshold (newthreshold < threshold) means a higher integration level. In this case all senses are candidates for further integration, so the clustering algorithm is applied again on all the senses (not necessarily on their elemental ontology terms, as in the disintegration case). A number of senses equal to or lower than the initial set is created.
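A minimal Python sketch of this run-time adjustment, under the same simplifying assumptions as the earlier clustering sketch (contexts as plain sets, a generic similarity function); a sense is represented here just as the list of its members' contexts, and the greedy regrouping is illustrative rather than the paper's exact procedure.

```python
from typing import Callable, List, Set

Context = Set[str]
SenseCtx = List[Context]   # a sense, reduced to the contexts of its member terms

def modify_integration(senses: List[SenseCtx],
                       sim: Callable[[Context, Context], float],
                       old_threshold: float,
                       new_threshold: float) -> List[SenseCtx]:
    """Re-cluster existing senses for a new threshold (Section 3.5)."""
    def agglomerate(units: List[SenseCtx]) -> List[SenseCtx]:
        """Greedily merge units (each placed as a whole) whose contexts are similar enough."""
        groups: List[SenseCtx] = []
        for unit in units:
            unit_ctx = set().union(*unit)
            for group in groups:
                if sim(unit_ctx, set().union(*group)) >= new_threshold:
                    group.extend(unit)
                    break
            else:
                groups.append(list(unit))
        return groups

    if new_threshold > old_threshold:
        # rising threshold: disintegrate each sense and re-cluster its elemental terms
        result: List[SenseCtx] = []
        for sense in senses:
            result.extend(agglomerate([[ctx] for ctx in sense]))
        return result
    # falling threshold: try to merge whole senses, not their elemental terms
    return agglomerate(senses)
```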
4. OPTIMIZATION STUDY OF THE SIMILARITY THRESHOLD

As mentioned before, a threshold is used to decide whether or not two terms are similar enough. However, one question arises: what is the best threshold value to identify a correct synonymy between ontology terms? We have different ways to decide it:

1. Experimenting with ontology matching benchmarks. That is, running our similarity measure on known alignment test cases, and comparing the results with the provided reference alignments.

2. Contrasting with human opinion, in experiments where, directly or indirectly, human evaluators assess the correctness of our measure.

3. Optimizing response time. As different thresholds lead to different integration degrees, we expect different execution times as well. We can try to discover a threshold that minimizes response time.

In the rest of the section, we explore all these possibilities and provide a method to decide the optimal threshold, which relies on the latter point (that is, it prioritizes response time).

4.1 Finding threshold from OAEI benchmarks

Regarding a comparison to benchmarks, we submitted the CIDER system (our alignment service) to the OAEI'08 campaign, in order to evaluate the capabilities of our semantic similarity measure for ontology matching tasks [10]. We participated in the benchmark and directory tracks, obtaining good results in both of them. As the benchmark track was open (organizers provided the reference alignment), we were able to deduce the threshold for the similarity measure (around 0.13 in the interval [0,1]) that led to the best results. Nevertheless, we do not trust this value very much, as the nature of the comparisons in the benchmark data sets is quite restricted (most comparisons are between an ontology and itself, with some modification rules applied, and always in the bibliographic domain), not representing the great variety of real semantic descriptions one can find on the Semantic Web. In general, we expect higher thresholds when integrating senses because, with the benchmark test cases, it is highly probable that minimal similar information, as for example same labels, assures that two terms are equivalent (as the ontologies are very similar). However, equal labels are not enough, in general, to assess equivalent meanings when more heterogeneous semantic data are involved. Therefore, we accept the obtained threshold, but only as a lower bound for an optimal one to be used in a more general scenario.

4.2 Finding threshold from human assessment

Another possibility, as mentioned above, is to devise an experiment where human evaluation is used to assess the quality of the matching that our measure provides. We already did such an experiment in [9], where a method to improve a background-based ontology matching technique was proposed. Without entering into further details (see [9]), our measure was used to add a confidence level to 354 mappings, initially provided for the background-based technique, between two ontologies of the agricultural domain. We applied different thresholds to this confidence level to decide the validity of the mappings, and we compared this to an assessment made by human observers. A particular optimum threshold did not arise from this experiment, the particular choice depending on the balance between precision and recall one needs. However, a range of reasonably good values was found, starting at 0.2.

4.3 Finding threshold from performance optimization

We have measured the variation of the response time of the system, during alignment and integration, when the threshold is modified. Our goal is to discover in which ranges our integration works better in terms of performance. In fact, we expect different response times depending on the threshold, because the number (and size) of clusters differs with the integration level, therefore consuming different time when comparing and merging them during our iterative integration algorithm.

For this purpose we set up a small experiment with 505 randomly selected keywords, with a number of associated ontology terms ranging from 2 to 511, which were used as input to our alignment and integration system. We experimented with different levels of integration (that is, corresponding to different thresholds). We evaluated the integration level as

integration level = 1 - (#final integrated senses / #initial ontology terms)    (1)
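As a quick check of Equation (1), a one-line Python helper; the example numbers are taken from the turkey case discussed in Section 6 (58 initial terms clustered into 9 senses).

```python
def integration_level(initial_terms: int, final_senses: int) -> float:
    """Equation (1): 1 - (#final integrated senses / #initial ontology terms)."""
    return 1.0 - final_senses / initial_terms

# e.g. the 'turkey' example of Section 6: 58 ontology terms grouped into 9 senses
print(round(integration_level(58, 9), 2))   # -> 0.84
```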

The computer used for this test had the following characteristics: 4 x Intel Core 2 Quad CPU Q6600 at 2.40 GHz, 8 GB RAM (with Ubuntu 8.04 Linux).

Figure 5 shows the averaged integration levels obtained. As expected, the higher the threshold, the lower the integration level, which is consistent with the discussion in Section 3.5. In Figure 6, the average time that our similarity computation and integration processes took in the experiment, for different thresholds, is shown.

Figure 5: Integration level with different thresholds.
Figure 6: Response time with respect to threshold.

Analyzing the results, we can see two local maximum values in the graph, corresponding to the highest and the lowest integration levels respectively. In fact, each iteration of our integration algorithm needs to compare a new ontology term to the others already examined. If there are no (or few) clusters, this number of comparisons can be high, thus consuming more time. If, on the contrary, there is a high integration level, the time is spent merging ontological data, as well as comparing each new integration with the rest of the ontology terms again. The results show that the best performance corresponds to an integration level around 0.5. In particular, we found the best performance for a threshold value of 0.22 in this experiment.

4.4 Proposed optimization method

As a conclusion of the experience described in the previous paragraphs, we advocate an optimal threshold established to maximise performance. The main reasons are:

1. It will reduce the response time of the overall system. Figure 6 shows how an optimal threshold selection can easily double or triple the speed with respect to other thresholds.

2. The optimal threshold we can find this way (Section 4.3) is compatible with the range of good ones found by comparing the results with human evaluations (Section 4.2).

3. It is not always feasible to have a large enough number of humans, or reference alignments, to assess the behaviour of a similarity measure. On the contrary, the performance-based method is completely automatic, needing no human supervision.

4. Finally, as explained in Section 3.5, the ideal integration level depends on the user's view and on the requirements of particular applications. Therefore, it is reasonable to choose an initial integration that can be established quickly, letting the user, or the client application, later choose how it should be modified to fit their particular needs.

Therefore, we propose to automatically train the system with a large number of test cases, analyzing the response time with respect to the threshold. Then, we deduce the threshold that minimizes the time, which will be proposed as the optimal threshold of our system. Finally, the information of subsequent manual tuning of the threshold can be stored and utilized to improve the initial threshold by using machine learning schemes.

5. SCALABILITY STUDY

In this section we explore the scalability of the techniques we presented in [21], when used to align and integrate ontological information extracted from the Watson indexes. This way, we can check whether the scenario described in Section 3 is feasible or not.

5.1 Empirical analysis of costs

The method developed in [21] to integrate the senses of user keywords was conceived initially for a context of use where only a reduced set of keywords (corresponding to a user query) was involved. For this case, the set of tests we ran showed a good behaviour of the method in terms of performance. However, the new goal we describe in this paper is much more ambitious, as the context of application grows enormously in scale. Therefore, a scalability study is needed in order to figure out whether our integration algorithm is applicable to a large scale context (the whole set of online ontologies indexed by Watson) or not.

Generally speaking, we say that a system is scalable when the cost per unit of output remains relatively constant with proportional changes in the number of units (or size) of the inputs. There are two ways of analyzing the time cost of an algorithm: theoretically or empirically. We have chosen the second way, because we already have an advanced implementation of the system. To empirically study the response time of the system, we need to run it with a statistically significant sample of input data. Then, we will infer the performance of the system from the obtained results.
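Tying the performance-based selection of Section 4.4 to the kind of timing runs described here, the following Python sketch times a clustering pass over a training workload for each candidate threshold and keeps the fastest; run_clustering is a hypothetical callable standing in for the full alignment and integration pass.

```python
import time
from typing import Callable, Sequence

def pick_threshold_by_performance(run_clustering: Callable[[float], None],
                                  candidates: Sequence[float]) -> float:
    """Return the candidate threshold whose full clustering pass runs fastest."""
    timings = []
    for threshold in candidates:
        start = time.perf_counter()
        run_clustering(threshold)               # e.g. cluster all training synonym maps
        timings.append((time.perf_counter() - start, threshold))
    return min(timings)[1]

# Example sweep over thresholds 0.10, 0.15, ..., 0.90 (values are illustrative)
# best = pick_threshold_by_performance(my_clustering_pass, [i / 20 for i in range(2, 19)])
```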
Experimental setup. For these experiments, we used the set of synonym maps that the Watson indexes hosted in June 2008, from which we randomly selected 10%, taking a representative keyword from each map to be used as input for our experiment. After removing the keywords with only one associated ontology term (it makes no sense to apply our integration process to them), we counted 9156 different keywords, whose associated ontology terms were to be clustered in this experiment. A small number of keywords that could not be processed, due to different technical reasons, was also removed (e.g., keywords not available in the Watson indexes at the time the test was run). We used a threshold of 0.27 during our sense clustering step.
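A minimal sketch of this sampling and filtering step, assuming the synonym maps are available locally as a keyword-to-URI-set dictionary (function and parameter names are illustrative).

```python
import random
from typing import Dict, List, Set

def sample_experiment_keywords(synonym_maps: Dict[str, Set[str]],
                               fraction: float = 0.10,
                               seed: int = 0) -> List[str]:
    """Pick a random fraction of the synonym maps, one representative keyword per map,
    and drop keywords whose maps hold a single ontology term (nothing to integrate)."""
    rng = random.Random(seed)
    keywords = list(synonym_maps)
    sampled = rng.sample(keywords, max(1, int(len(keywords) * fraction)))
    return [k for k in sampled if len(synonym_maps[k]) > 1]
```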

The computer used for these tests had the following characteristics: 2 x Intel Xeon CPU X5355 at 2.66 GHz, 6 GB RAM (with Red Hat Enterprise Linux 5).

5.2 Studying the size of keyword maps

The target of this experiment is to study the response time of our system with respect to the number of ontology terms per keyword that have to be processed during the clustering step. We expect that the greater the number of ontology terms associated to a keyword, the longer it takes to be integrated. Our objective is to figure out whether this time is always limited and controllable, even for a high number of terms, or not. This is hard to predict in advance, due to the great variability of the input data, in terms of the quantity and quality of semantic information.

In Figure 7 we see how the final results are distributed. It shows the total response time per number of ontology terms in each keyword map. Each point corresponds to a test case (a total of 9156 cases were run, one per different input keyword). We have fitted the given results to different curves, finding that the best fit is a linear regression with equation y = 367x. Considering the distribution of time responses, the mean value over all cases is 2.5 seconds, and the median value is 0.5 seconds.

Figure 7: Time vs. number of initial ontology terms per keyword.

From the given results we observe:

1. 99% of cases correspond to keywords with fewer than 100 associated ontology terms. In all these cases the system's response takes no more than a few seconds, thus confirming its capacity to be used for real-time purposes (as it was first conceived in [21]).

2. The expected result time is always controlled and can be approximately predicted for any value of the input, as the ratio processing time / number of ontological terms remains constant.

As our alignment and integration algorithms show a controllable behaviour, we can focus on technical optimization issues to improve performance (parallelization, cache schemes, etc.) when the real application scenario requires it (e.g., it is expected that the Watson indexes will continue growing with time).
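The kind of fit reported for Figure 7 (a line through the origin, y = 367x) can be reproduced with a few lines of Python; the function below computes the least-squares slope through the origin, and the toy data are illustrative, not the experiment's actual measurements.

```python
from typing import Sequence

def fit_slope_through_origin(n_terms: Sequence[int], times_ms: Sequence[float]) -> float:
    """Least-squares fit of time = slope * n_terms (a regression line through the origin)."""
    sxy = sum(x * y for x, y in zip(n_terms, times_ms))
    sxx = sum(x * x for x in n_terms)
    return sxy / sxx

# Toy check: perfectly linear data recovers the slope
print(fit_slope_through_origin([2, 10, 100], [2 * 367.0, 10 * 367.0, 100 * 367.0]))  # -> 367.0
```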
5.3 Studying the size of ontologies

As an additional result of our scalability study, we were interested in confirming that, due to the use of the Watson inverted indexes, the size of the accessed ontologies has no effect on the response time of our techniques. Our results confirmed this, as shown by the cloud-shaped dispersion in Figure 8 (zoomed for convenience), where the time results are plotted against the size of the involved ontologies. After statistically analyzing the results, we have not found any dependence between ontology size and the response time of our system. This independence, even if expected, shows a remarkable advantage of accessing the ontological information through Watson, instead of other sources that require a preliminary loading step before using the ontologies. In those cases, the time taken for downloading (and querying) the ontologies depends on their size.

Figure 8: Time vs. size of accessed ontologies.

6. ILLUSTRATING EXAMPLES

The main target of our test series has been to study the scalability and performance of our integration techniques. In this section we also want to give an idea of the quality of the results, by inspecting some selected clustering examples. A larger human-based evaluation is pending as future work. Meanwhile, we have inspected the results obtained for these polysemous words: "turkey", "plant", and our motivating example "apple". Due to space limitations, we only detail the first example, summarizing the results of the other two.

Case 1: Turkey. After synonymy expansion, the synonym map ⟨SYN, OT⟩ contained the keywords SYN = {Turkey, Türkei, Türkiye} and a set of 58 ontology terms (|OT| = 58). Our sense clustering step grouped the initial 58 ontology terms into 9 different senses (integration level = 0.84, according to Equation 1). Table 1 summarizes some aspects of the integrated senses.

# | type       | pop | syndgr | comments
1 | class      | 1   | 0      | It appears as a country in the Country.daml ontology.
2 | class      | 1   | 0      | In the tap.rdf ontology (however, its ontological context does not clarify its meaning).
3 | class      |     |        | Integration of turkey as a livestock, from the Economy.daml and Economy.owl ontologies.
4 | class      |     |        | Integration of turkey as a bird, from the ontosem.owl and mesh.owl ontologies.
5 | individual |     |        | It integrates many instances of turkey as a light meal.
6 | individual |     |        | It integrates many instances of turkey from annotated conversations on social websites, most of them referring to meanings similar to place, country, area.
7 | individual |     |        | Integration of turkey as an instance of a place, from different versions of the epitaph.rdf ontology.
8 | individual |     |        | It integrates turkey as an instance of document in annotated blogs.
9 | individual | 1   | 0      | It comes from a test ontology, being an instance of Foo.

Table 1: Summary of obtained senses for turkey.

In order to experiment with different integration levels, we decreased the threshold to 0.17, resulting in a higher integration level. In particular, senses 6 and 7 were integrated into a single one (with pop = 37, syndgr = 0.38), as well as senses 3 and 4 (pop = 4, syndgr = 0.38). The others remained the same. On the contrary, if we increase the threshold to 0.37, the number of obtained senses rises to 12 (the integration level decreases to 0.79). In particular, sense 4 is split into two, and sense 5 is split into three new senses.

Case 2: Plant. The obtained synonymy expansion was {plant, plants}. Watson retrieved 52 ontology terms for these keywords. After running our clustering step, with threshold = 0.17, 16 final senses were obtained: 14 classes and two individuals (integration level = 0.73). The most remarkable one was a class that integrated 32 ontology terms, all with the meaning of plant as a kind of living organism.

Case 3: Apple. Watson initially retrieved 38 terms for this keyword. The synonym expansion did not add significant synonyms. Our sense clustering step, with threshold = 0.27, grouped the initial ontology terms into nine different senses (integration level = 0.76): five individuals and four classes. Among the possible senses, we find apple as a fruit, as an ingredient, as an annotation tag in Flickr, and as an instance of document in many annotated blogs (most of them referring to the meaning of the electronics company).

Discussion. Our examples show the ability of our system to reduce the number of repeated senses one can discover on the Semantic Web to represent the meaning of a keyword. It is especially useful to cluster terms from different versions of the same ontology. References in social tags, where semantics is quite poor, are also easily grouped. We have also found that very different meanings are not easily mixed up by mistake (e.g., turkey as a country with turkey as food). On the other hand, it is hard to obtain a total integration of all terms of the same type referring to the same meaning. This is due to the very different semantic descriptions sometimes used to model the same concepts. We have also found that some expected meanings do not appear among our results (e.g., apple as a tree). This is a consequence of the still limited coverage of the current Semantic Web.

7. RELATED WORK

Our proposal represents the first large-scale effort, so far, to provide a pool of clusters of cross-ontology terms free of redundancy. Therefore, a direct comparison with other alternatives is quite difficult.
Clusty, for example, is a metasearch engine that groups search results into clusters of pages about a similar topic. The general idea could be considered equivalent, but our search domain is the Semantic Web instead of the Web, and we are interested in clustering senses instead of web pages. Regarding other systems that index online semantic content [17, 8, 11], none of them has incorporated redundancy reduction techniques yet.

Technologically speaking, even though we reuse ontology matching and integration techniques, our work is neither solely about ontology matching [7, 6], nor about ontology integration or merging [18, 13, 12] in the sense of obtaining a new ontology from matched ontologies [7]. It is closer, however, to the idea of ontology construction from online ontologies given in [1]. In that work, Alani proposes a general strategy to dynamically segment, map and merge online ontologies in order to support ontology construction. Our target is not the same, but we share some commonalities, and we think that our proposal could indeed support and benefit such kinds of systems.

Regarding our particular techniques, notice that our definitions of extraction, ontological context and equivalence, given in Section 3, are general enough to accommodate other similarity measures, as long as they come with a clear notion of neighbourhood, as for example in [19] (where the idea of virtual documents, containing neighbouring information, is proposed to describe the intended meaning of ontology terms). Also, other segmentation techniques [20, 16, 2] could substitute for the extraction we use. However, the main target of these techniques is usually to reduce the size of large ontologies, to make them treatable and scalable, rather than to characterize a single sense. Furthermore, the good results given by the CIDER alignment service, when compared to others in the OAEI contest [10], confirm the validity of its extraction and similarity computation when applied to ontology matching tasks.

8. CONCLUSIONS AND FUTURE WORK

We have proposed a method to perform a large scale integration of ontology terms, clustering the most similar ones into integrated senses when indexing huge amounts of online semantic information. The method is flexible enough to dynamically modify the integration level according to the user's point of view, by tuning the similarity threshold. We have also explored possible methods to find an optimal integration level, finally proposing one that is based on performance optimization.

The work presented here is quite innovative in some aspects. For example, it is the first large-scale effort, so far, to provide a pool of clusters of cross-ontology terms free of redundancy (to a certain extent). A scalability study has been performed in order to investigate the feasibility of our proposal. The results show that our techniques lead to processing times that are controllable and predictable, thus making scalability possible.

As future work, we plan to complete the implementation of the system, which is only partially covered by our current prototype. It will eventually be provided as a service in Watson. In this way, the many applications that exploit Watson to dynamically access online semantic content will be able to take advantage of our method. We also plan to perform a large scale test with human opinion, to judge the quality of the obtained senses.

Acknowledgments. We thank Jordi Bernad for his valuable comments on our mathematical formalization. This work is supported by the CICYT project TIN C02-02, and the Government of Aragon-CAI grant IT17/08. Mathieu d'Aquin is partly funded by the NeOn project, sponsored by the European Commission as part of the Information Society Technologies (IST) programme.

9. REFERENCES

[1] H. Alani. Position paper: Ontology construction from online ontologies. In Proc. of the 15th International World Wide Web Conference (WWW 2006), Edinburgh, UK, May 2006.
[2] M. Bhatt, C. Wouters, A. Flahive, W. Rahayu, and D. Taniar. Semantic completeness in sub-ontology extraction using distributed methods. In Proc. of the Int. Conf. on Computational Science and its Applications (ICCSA), Assisi, Italy, May.
[3] C. Bobed, R. Trillo, E. Mena, and J. Bernad. Semantic discovery of the user intended query in a selectable target query language. In Proc. of the 7th International Conference on Web Intelligence (WI 2008), Sydney, Australia, December 2008.
[4] M. d'Aquin, C. Baldassarre, L. Gridinoc, S. Angeletou, M. Sabou, and E. Motta. Characterizing knowledge on the semantic web with Watson. In Proc. of the 5th International EON Workshop, at ISWC'07, Busan, Korea, 2007.
[5] M. d'Aquin, E. Motta, M. Sabou, S. Angeletou, L. Gridinoc, V. Lopez, and D. Guidi. Toward a new generation of semantic web applications. IEEE Intelligent Systems, 23(3):20-28, 2008.
[6] M. Ehrig. Ontology Alignment: Bridging the Semantic Gap (Semantic Web and Beyond). Springer-Verlag New York, Inc.
[7] J. Euzenat and P. Shvaiko. Ontology Matching. Springer-Verlag, 2007.
[8] T. Finin, L. Ding, R. Pan, A. Joshi, P. Kolari, A. Java, and Y. Peng. Swoogle: Searching for knowledge on the Semantic Web. In Proc. of AAAI'05 (intelligent systems demo), July 2005.
[9] J. Gracia, V. Lopez, M. d'Aquin, M. Sabou, E. Motta, and E. Mena. Solving semantic ambiguity to improve semantic web based ontology matching. In Proc. of the 2nd Ontology Matching Workshop (OM'07), at ISWC'07, Busan, Korea, November 2007.
[10] J. Gracia and E. Mena. Ontology matching with CIDER: Evaluation report for the OAEI 2008. In Proc. of the 3rd Ontology Matching Workshop (OM'08), at ISWC'08, Karlsruhe, Germany, October 2008.
[11] A. Harth, J. Umbrich, and S. Decker. Multicrawler: A pipelined architecture for crawling and indexing semantic web data. In Proc. of the 5th International Semantic Web Conference (ISWC'06), Athens, GA, USA. Springer, 2006.
[12] P. Hitzler, M. Krötzsch, M. Ehrig, and Y. Sure. What is ontology merging? A category-theoretic perspective using pushouts. In Proc. of the First International Workshop on Contexts and Ontologies: Theory, Practice and Applications, at AAAI'05, Pittsburgh, Pennsylvania, USA, July 2005.
[13] C. M. Keet. Aspects of ontology integration. Technical Report, Napier University, Edinburgh, Scotland.
[14] B. Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data. Springer-Verlag.
[15] G. Miller. WordNet: A Lexical Database for English. Communications of the ACM, 38(11):39-41, 1995.
[16] N. F. Noy and M. A. Musen. Specifying ontology views by traversal. In Proc. of the International Semantic Web Conference (ISWC'04), Hiroshima, Japan, November 2004.
[17] E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H. Stenzhorn, and G. Tummarello. Sindice.com: A document-oriented lookup index for open linked data. International Journal of Metadata, Semantics and Ontologies, 3(1):37-52.
[18] H. S. Pinto, A. Gómez-Pérez, and J. P. Martins. Some issues on ontology integration. In Proc. of the IJCAI Workshop on Ontologies and Problem-Solving Methods, Stockholm, Sweden, August.
[19] Y. Qu, W. Hu, and G. Cheng. Constructing virtual documents for ontology matching. In Proc. of the 15th International World Wide Web Conference (WWW'06), Edinburgh, UK, May 2006.
[20] J. Seidenberg and A. Rector. Web ontology segmentation: analysis, classification and use. In Proc. of the 15th International World Wide Web Conference (WWW'06), Edinburgh, UK, May 2006.
[21] R. Trillo, J. Gracia, M. Espinoza, and E. Mena. Discovering the semantics of user keywords. Journal of Universal Computer Science (JUCS), Special Issue: Ontologies and their Applications, 13(12), December.


Distributed forests for MapReduce-based machine learning Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Search and Information Retrieval

Search and Information Retrieval Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search

More information

Natural Language to Relational Query by Using Parsing Compiler

Natural Language to Relational Query by Using Parsing Compiler Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

Search Engine Based Intelligent Help Desk System: iassist

Search Engine Based Intelligent Help Desk System: iassist Search Engine Based Intelligent Help Desk System: iassist Sahil K. Shah, Prof. Sheetal A. Takale Information Technology Department VPCOE, Baramati, Maharashtra, India sahilshahwnr@gmail.com, sheetaltakale@gmail.com

More information

How To Find Influence Between Two Concepts In A Network

How To Find Influence Between Two Concepts In A Network 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation Influence Discovery in Semantic Networks: An Initial Approach Marcello Trovati and Ovidiu Bagdasar School of Computing

More information

Privacy-preserving Digital Identity Management for Cloud Computing

Privacy-preserving Digital Identity Management for Cloud Computing Privacy-preserving Digital Identity Management for Cloud Computing Elisa Bertino bertino@cs.purdue.edu Federica Paci paci@cs.purdue.edu Ning Shang nshang@cs.purdue.edu Rodolfo Ferrini rferrini@purdue.edu

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Big Data Mining Services and Knowledge Discovery Applications on Clouds Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades

More information

Interactive Dynamic Information Extraction

Interactive Dynamic Information Extraction Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken

More information

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of

More information

Distributed Database for Environmental Data Integration

Distributed Database for Environmental Data Integration Distributed Database for Environmental Data Integration A. Amato', V. Di Lecce2, and V. Piuri 3 II Engineering Faculty of Politecnico di Bari - Italy 2 DIASS, Politecnico di Bari, Italy 3Dept Information

More information

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval Overview of iclef 2008: search log analysis for Multilingual Image Retrieval Julio Gonzalo Paul Clough Jussi Karlgren UNED U. Sheffield SICS Spain United Kingdom Sweden julio@lsi.uned.es p.d.clough@sheffield.ac.uk

More information

Linked Open Data Infrastructure for Public Sector Information: Example from Serbia

Linked Open Data Infrastructure for Public Sector Information: Example from Serbia Proceedings of the I-SEMANTICS 2012 Posters & Demonstrations Track, pp. 26-30, 2012. Copyright 2012 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes.

More information

EXPLOITING FOLKSONOMIES AND ONTOLOGIES IN AN E-BUSINESS APPLICATION

EXPLOITING FOLKSONOMIES AND ONTOLOGIES IN AN E-BUSINESS APPLICATION EXPLOITING FOLKSONOMIES AND ONTOLOGIES IN AN E-BUSINESS APPLICATION Anna Goy and Diego Magro Dipartimento di Informatica, Università di Torino C. Svizzera, 185, I-10149 Italy ABSTRACT This paper proposes

More information

A Tool for Searching the Semantic Web for Supplies Matching Demands

A Tool for Searching the Semantic Web for Supplies Matching Demands A Tool for Searching the Semantic Web for Supplies Matching Demands Zuzana Halanová, Pavol Návrat, Viera Rozinajová Abstract: We propose a model of searching semantic web that allows incorporating data

More information

Effect of Using Neural Networks in GA-Based School Timetabling

Effect of Using Neural Networks in GA-Based School Timetabling Effect of Using Neural Networks in GA-Based School Timetabling JANIS ZUTERS Department of Computer Science University of Latvia Raina bulv. 19, Riga, LV-1050 LATVIA janis.zuters@lu.lv Abstract: - The school

More information

Cross-Lingual Concern Analysis from Multilingual Weblog Articles

Cross-Lingual Concern Analysis from Multilingual Weblog Articles Cross-Lingual Concern Analysis from Multilingual Weblog Articles Tomohiro Fukuhara RACE (Research into Artifacts), The University of Tokyo 5-1-5 Kashiwanoha, Kashiwa, Chiba JAPAN http://www.race.u-tokyo.ac.jp/~fukuhara/

More information

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction

Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Fernando Benites and Elena Sapozhnikova University of Konstanz, 78464 Konstanz, Germany. Abstract. The objective of this

More information

A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment

A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment A Collaborative and Semantic Data Management Framework for Ubiquitous Computing Environment Weisong Chen, Cho-Li Wang, and Francis C.M. Lau Department of Computer Science, The University of Hong Kong {wschen,

More information

A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

More information

QASM: a Q&A Social Media System Based on Social Semantics

QASM: a Q&A Social Media System Based on Social Semantics QASM: a Q&A Social Media System Based on Social Semantics Zide Meng, Fabien Gandon, Catherine Faron-Zucker To cite this version: Zide Meng, Fabien Gandon, Catherine Faron-Zucker. QASM: a Q&A Social Media

More information

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them

An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them An Open Platform for Collecting Domain Specific Web Pages and Extracting Information from Them Vangelis Karkaletsis and Constantine D. Spyropoulos NCSR Demokritos, Institute of Informatics & Telecommunications,

More information

IBM Software Information Management. Scaling strategies for mission-critical discovery and navigation applications

IBM Software Information Management. Scaling strategies for mission-critical discovery and navigation applications IBM Software Information Management Scaling strategies for mission-critical discovery and navigation applications Scaling strategies for mission-critical discovery and navigation applications Contents

More information

Invited Applications Paper

Invited Applications Paper Invited Applications Paper - - Thore Graepel Joaquin Quiñonero Candela Thomas Borchert Ralf Herbrich Microsoft Research Ltd., 7 J J Thomson Avenue, Cambridge CB3 0FB, UK THOREG@MICROSOFT.COM JOAQUINC@MICROSOFT.COM

More information

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)

So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02) Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we

More information

Semantic tagging for crowd computing

Semantic tagging for crowd computing Semantic tagging for crowd computing Roberto Mirizzi 1, Azzurra Ragone 1,2, Tommaso Di Noia 1, and Eugenio Di Sciascio 1 1 Politecnico di Bari Via Orabona, 4, 70125 Bari, Italy mirizzi@deemail.poliba.it,

More information

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information

A Comparative Study on Sentiment Classification and Ranking on Product Reviews

A Comparative Study on Sentiment Classification and Ranking on Product Reviews A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network , pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and

More information

Towards a Big Data Curated Benchmark of Inter-Project Code Clones

Towards a Big Data Curated Benchmark of Inter-Project Code Clones Towards a Big Data Curated Benchmark of Inter-Project Code Clones Jeffrey Svajlenko, Judith F. Islam, Iman Keivanloo, Chanchal K. Roy, Mohammad Mamun Mia Department of Computer Science, University of Saskatchewan,

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

A Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data

A Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data A Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data T. W. Liao, G. Wang, and E. Triantaphyllou Department of Industrial and Manufacturing Systems

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful

More information

Flattening Enterprise Knowledge

Flattening Enterprise Knowledge Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it

More information

Bridging CAQDAS with text mining: Text analyst s toolbox for Big Data: Science in the Media Project

Bridging CAQDAS with text mining: Text analyst s toolbox for Big Data: Science in the Media Project Bridging CAQDAS with text mining: Text analyst s toolbox for Big Data: Science in the Media Project Ahmet Suerdem Istanbul Bilgi University; LSE Methodology Dept. Science in the media project is funded

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY QÜESTIIÓ, vol. 25, 3, p. 509-520, 2001 PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY GEORGES HÉBRAIL We present in this paper the main applications of data mining techniques at Electricité de France,

More information

Personalization of Web Search With Protected Privacy

Personalization of Web Search With Protected Privacy Personalization of Web Search With Protected Privacy S.S DIVYA, R.RUBINI,P.EZHIL Final year, Information Technology,KarpagaVinayaga College Engineering and Technology, Kanchipuram [D.t] Final year, Information

More information

CiteSeer x in the Cloud

CiteSeer x in the Cloud Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar

More information

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia

More information

How To Write A Summary Of A Review

How To Write A Summary Of A Review PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,

More information

KNOWLEDGE DISCOVERY FOR SUPPLY CHAIN MANAGEMENT SYSTEMS: A SCHEMA COMPOSITION APPROACH

KNOWLEDGE DISCOVERY FOR SUPPLY CHAIN MANAGEMENT SYSTEMS: A SCHEMA COMPOSITION APPROACH KNOWLEDGE DISCOVERY FOR SUPPLY CHAIN MANAGEMENT SYSTEMS: A SCHEMA COMPOSITION APPROACH Shi-Ming Huang and Tsuei-Chun Hu* Department of Accounting and Information Technology *Department of Information Management

More information

Binary search tree with SIMD bandwidth optimization using SSE

Binary search tree with SIMD bandwidth optimization using SSE Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures ~ Spring~r Table of Contents 1. Introduction.. 1 1.1. What is the World Wide Web? 1 1.2. ABrief History of the Web

More information

LinkZoo: A linked data platform for collaborative management of heterogeneous resources

LinkZoo: A linked data platform for collaborative management of heterogeneous resources LinkZoo: A linked data platform for collaborative management of heterogeneous resources Marios Meimaris, George Alexiou, George Papastefanatos Institute for the Management of Information Systems, Research

More information

Using Ontologies for Software Development Knowledge Reuse

Using Ontologies for Software Development Knowledge Reuse Using Ontologies for Software Development Knowledge Reuse Bruno Antunes, Nuno Seco and Paulo Gomes Centro de Informatica e Sistemas da Universidade de Coimbra Departamento de Engenharia Informatica, Universidade

More information

Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool.

Comparative Analysis of EM Clustering Algorithm and Density Based Clustering Algorithm Using WEKA tool. International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 9, Issue 8 (January 2014), PP. 19-24 Comparative Analysis of EM Clustering Algorithm

More information

Concept and Project Objectives

Concept and Project Objectives 3.1 Publishable summary Concept and Project Objectives Proactive and dynamic QoS management, network intrusion detection and early detection of network congestion problems among other applications in the

More information

Chapter ML:XI (continued)

Chapter ML:XI (continued) Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained

More information

A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1

A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1 A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1 Yannis Stavrakas Vassilis Plachouras IMIS / RC ATHENA Athens, Greece {yannis, vplachouras}@imis.athena-innovation.gr Abstract.

More information

The Performance Characteristics of MapReduce Applications on Scalable Clusters

The Performance Characteristics of MapReduce Applications on Scalable Clusters The Performance Characteristics of MapReduce Applications on Scalable Clusters Kenneth Wottrich Denison University Granville, OH 43023 wottri_k1@denison.edu ABSTRACT Many cluster owners and operators have

More information

Utility-Based Fraud Detection

Utility-Based Fraud Detection Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Utility-Based Fraud Detection Luis Torgo and Elsa Lopes Fac. of Sciences / LIAAD-INESC Porto LA University of

More information

DBpedia German: Extensions and Applications

DBpedia German: Extensions and Applications DBpedia German: Extensions and Applications Alexandru-Aurelian Todor FU-Berlin, Innovationsforum Semantic Media Web, 7. Oktober 2014 Overview Why DBpedia? New Developments in DBpedia German Problems in

More information

Executing SPARQL Queries over the Web of Linked Data

Executing SPARQL Queries over the Web of Linked Data Executing SPARQL Queries over the Web of Linked Data Olaf Hartig 1, Christian Bizer 2, and Johann-Christoph Freytag 1 1 Humboldt-Universität zu Berlin lastname@informatik.hu-berlin.de 2 Freie Universität

More information

Combining Ontological Knowledge and Wrapper Induction techniques into an e-retail System 1

Combining Ontological Knowledge and Wrapper Induction techniques into an e-retail System 1 Combining Ontological Knowledge and Wrapper Induction techniques into an e-retail System 1 Maria Teresa Pazienza, Armando Stellato and Michele Vindigni Department of Computer Science, Systems and Management,

More information

How To Create A Text Classification System For Spam Filtering

How To Create A Text Classification System For Spam Filtering Term Discrimination Based Robust Text Classification with Application to Email Spam Filtering PhD Thesis Khurum Nazir Junejo 2004-03-0018 Advisor: Dr. Asim Karim Department of Computer Science Syed Babar

More information

ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS

ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS ONTOLOGY FOR MOBILE PHONE OPERATING SYSTEMS Hasni Neji and Ridha Bouallegue Innov COM Lab, Higher School of Communications of Tunis, Sup Com University of Carthage, Tunis, Tunisia. Email: hasni.neji63@laposte.net;

More information

Lightweight Data Integration using the WebComposition Data Grid Service

Lightweight Data Integration using the WebComposition Data Grid Service Lightweight Data Integration using the WebComposition Data Grid Service Ralph Sommermeier 1, Andreas Heil 2, Martin Gaedke 1 1 Chemnitz University of Technology, Faculty of Computer Science, Distributed

More information

Unsupervised Data Mining (Clustering)

Unsupervised Data Mining (Clustering) Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in

More information

A Supervised Forum Crawler

A Supervised Forum Crawler International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Special Issue-2, April 2016 E-ISSN: 2347-2693 A Supervised Forum Crawler Sreeja S R 1, Sangita Chaudhari

More information

Optimizing Description Logic Subsumption

Optimizing Description Logic Subsumption Topics in Knowledge Representation and Reasoning Optimizing Description Logic Subsumption Maryam Fazel-Zarandi Company Department of Computer Science University of Toronto Outline Introduction Optimization

More information

Security-Aware Beacon Based Network Monitoring

Security-Aware Beacon Based Network Monitoring Security-Aware Beacon Based Network Monitoring Masahiro Sasaki, Liang Zhao, Hiroshi Nagamochi Graduate School of Informatics, Kyoto University, Kyoto, Japan Email: {sasaki, liang, nag}@amp.i.kyoto-u.ac.jp

More information

An Evaluation of Machine Learning Method for Intrusion Detection System Using LOF on Jubatus

An Evaluation of Machine Learning Method for Intrusion Detection System Using LOF on Jubatus An Evaluation of Machine Learning Method for Intrusion Detection System Using LOF on Jubatus Tadashi Ogino* Okinawa National College of Technology, Okinawa, Japan. * Corresponding author. Email: ogino@okinawa-ct.ac.jp

More information

Customer Intentions Analysis of Twitter Based on Semantic Patterns

Customer Intentions Analysis of Twitter Based on Semantic Patterns Customer Intentions Analysis of Twitter Based on Semantic Patterns Mohamed Hamroun mohamed.hamrounn@gmail.com Mohamed Salah Gouider ms.gouider@yahoo.fr Lamjed Ben Said lamjed.bensaid@isg.rnu.tn ABSTRACT

More information