Improving the Performance of Web Service Recommenders Using Semantic Similarity


Juan Manuel Adán-Coello, Carlos Miguel Tobar, Yang Yuming
Faculdade de Engenharia de Computação, Pontifícia Universidade Católica de Campinas (PUC-Campinas), Campinas, SP, Brazil

Abstract: This paper addresses issues related to recommending Semantic Web Services (SWS) using collaborative filtering (CF). The focus is on reducing the problems arising from data sparsity, one of the main difficulties for CF algorithms. Two CF algorithms are presented and discussed: a memory-based algorithm, using the kNN method, and a model-based algorithm, using the k-means method. In both algorithms, similarity between users is computed using the Pearson Correlation Coefficient (PCC). One of the limitations of using the PCC in this context is that in those instances where users have not rated items in common it is not possible to compute their similarity. In addition, when the number of common items that were rated is low, the reliability of the computed similarity degree may also be low. To overcome these limitations, the presented algorithms compute the similarity between two users taking into account services that both users accessed and also semantically similar services. Likewise, to predict the rating for a not yet accessed target service, the algorithms consider the ratings that neighbor users assigned to the target service, as is normally the case, while also considering the ratings assigned to services that are semantically similar to the target service. The experiments described in the paper show that this approach has a significantly positive impact on prediction accuracy, particularly when the user-item matrix is sparse.

Keywords: Collaborative filtering, Recommender systems, Semantic similarity, Semantic Web Services, Sparse data.

1. INTRODUCTION

Service-Oriented Computing (SOC) is a new computing paradigm that uses services as building blocks to accelerate the development of distributed applications in heterogeneous computer environments.
SOC promises a world of cooperating services where application components are combined with little effort into a network of loosely coupled services for creating flexible and dynamic business processes that can spread over many organizations and computing platforms [1].

Among the key challenges for the effective use of Web services is the discovery of services that meet the functional and non-functional requirements of their users and that take into account their preferences [2]. In Web service discovery systems, three entities can typically be distinguished: the service requester (a user or a program), the service provider and the service registry. Entities seeking services make service requests to the registry. In the registry, the description of the requested service is compared with the descriptions of services advertised by service providers, using a matching algorithm, to identify whether there are services that meet the request. If the matching is successful, the registry provides the descriptions of the identified service instances to the requester, including the necessary details for their invocation.

Architectures for service discovery, usually based on the WSDL specification [3], have serious limitations arising from the service description technology and matching algorithms used. These limitations are due, in part, to the use of informal descriptions of service functionality and capabilities, written in natural language, usually lacking a common vocabulary for the service requester and provider. Semantic Web Services (SWS) and Linked Services are recent approaches that try to overcome these limitations by combining Web service technology with elements of the Semantic Web [4][5]. In SWS discovery architectures, advertised services are described using service annotation ontologies in addition to WSDL parameters and operation names. These ontologies define a semantic model for the description of a Web service from several perspectives, including functionality, execution flow and invocation details.
They define a set of attributes for service capability description, the most common being the so-called IOPE (Inputs, Outputs, Preconditions and Effects). Service annotations, in accordance with a service annotation ontology, use concepts contained in domain ontologies instead of non-standardized words, which are more commonly used in conventional non-semantic approaches. Domain ontologies describe the terminology and the relationships between terms of a specific domain using an ontology language such as OWL or RDFS [6][7]. Each ontology language has its own unique expressive power, but all can model, at the minimum, hierarchies of concepts and roles of concepts, such as properties, attributes and relationships.

When performing a search, the characteristics of the desired service, such as inputs and outputs, are specified by terms that represent ontology concepts. Matchmaking algorithms based on logical inference can then seek matches for the request parameters, taking into account the parameters of the available services. For each match found, a value that characterizes the matching degree (similarity) is computed. Finally, the identified services are returned to the requester in descending order of matching degree.

Search algorithms for semantic Web services present good results when the user is able to adequately describe the desired service. However, this is not always the case, and a request for a service may not correspond fully to the intention of the requester. For example, there may be a published service that only partially matches the request and yet accomplishes the intention of the requester, or the opposite scenario could also conceivably occur [8]. As the number of available services on the Web increases, this problem worsens. Currently, as pointed out in [9], one of the most challenging issues in Web service provision is not the matchmaking process but the selection of good services for a target user.
In addition, as the number of available Web services grows, there may be many interesting services that users are not aware of and that they therefore will not take the initiative to request. Additionally, in the context of mobile and ubiquitous computing, it is unreasonable to assume that a user is constantly searching for interesting services available at the user's locale. In this context, it is desirable to have a recommender system capable of identifying and proactively recommending potentially interesting services to the user in the right situation. A Web service recommender can also be very valuable for proactively dealing with failures, for recovering service workflows that have partially failed, and in dynamic composition scenarios, provided the services and the recommender can deal with semantic markup [10][11].

The recommendation problem can be reduced to the problem of estimating ratings for items that have not been used before by a user; items with higher estimated ratings are then recommended to the user. Recommenders are usually implemented using filtering algorithms classified into three main categories, depending on how the recommendations are performed: (1) content-based (CB) algorithms filter and recommend items that are similar to others the user has accessed in the past; (2) collaborative filtering (CF) algorithms filter and recommend items based on the preferences of other users with similar tastes and preferences; (3) knowledge-based (KB) recommenders use knowledge about users and items to generate a recommendation. It is also common to find hybrid systems that combine methods taken from two or more of the previous categories of recommenders [12][13].

Content-based recommenders have their roots in the information retrieval field and were successfully implemented in domains where the items to be recommended are described through textual information. These systems are, however, limited by the features that are explicitly associated with the items.
They are also limited to recommending items that are similar to those already rated by the user (overspecialization). A particularly difficult task for this type of algorithm is to deal with new users, because new users have to rate a sufficient number of items before the system can understand their preferences and start making useful recommendations.

CF algorithms do not have some of the above-mentioned shortcomings of content-based algorithms. Since they employ the users' ratings, they can deal with any kind of content and recommend any type of item, even items that are dissimilar to those accessed in the past. However, CF systems have their own challenges, including coping with sparse data and scaling with increasing numbers of users and items. Several structural difficulties related to sparse data may be encountered, including the cold start problem, the reduced coverage problem and the neighbor transitivity problem. The cold start problem occurs when new users or items are inserted into the system. New items cannot be recommended until they are rated by some users, and, in turn, new users are unlikely to receive good recommendations because they lack a rating history. The reduced coverage problem occurs when the number of user ratings is very small compared with the number of items in the rating database. In this situation, the system may be unable to generate recommendations for such users. The neighbor transitivity problem occurs when users with similar tastes have not rated items in common and thus cannot be identified as similar.

Knowledge-based recommender systems avoid some of the drawbacks of content-based and CF systems, since their recommendations do not depend on a base of user ratings. Their main drawback is the well-known knowledge acquisition bottleneck.

Algorithms for CF, the primary focus of this paper, can be further classified into two main categories: memory-based and model-based. Memory-based algorithms construct a neighborhood of users who have similar ratings to the target user, directly using the available data.
In this circumstance, the ratings of neighbors are used to predict how a target user will rate an item he has not yet accessed. Model-based techniques employ available rating data to learn a model to make predictions, usually using a data mining or machine learning algorithm. The model is then used to make predictions for target items, instead of using raw rating data, as is done with memory-based algorithms. When comparing memory-based and model-based CF algorithms, it is usually accepted that memory-based algorithms are easy to implement and have higher prediction accuracy, particularly for dense datasets. Model-based algorithms are, in turn, more scalable and less vulnerable to profile injection attacks [12].

In the recent past, recommender systems have been built for recommending different types of items in diverse domains, including CDs, Web pages, books, news, movies and courses. However, research on Web service recommendation is in its preliminary stages and usually focuses on predicting service QoS (Quality of Service) parameters [14], which is a very limited way of capturing user interests [15].

In this paper, we present algorithms for constructing Web service recommender systems aimed at reducing the problems arising from sparse data. The proposed approach combines CF algorithms with logical inference to determine the semantic similarity between services, and between users. The rationale behind this approach is that if two users have not rated a common set of services but have rated similar services, these ratings can still be an indication of user similarity and therefore contribute to reducing the effects of data sparseness.

The remainder of the paper is organized as follows: Section 2 presents memory-based and model-based CF algorithms for Semantic Web service recommendation; Section 3 discusses the experimental setup used to evaluate the algorithms and the results that were obtained; Section 4 presents related work; and, finally, Section 5 concludes the paper by pointing out our main results and directions for future work.

2. CF ALGORITHMS FOR SEMANTIC WEB SERVICE RECOMMENDATION

In this section, variations of two recommender algorithms that exploit semantic similarities among Web services are presented. Their performance is compared in Section 3.

Instances of user feedback (see footnote 1) are stored in a user-item matrix, represented as a set $T \subseteq U \times S \times F$, where $U = \{u_1, u_2, \ldots, u_m\}$ is the set of all users, $S = \{s_1, s_2, \ldots, s_n\}$ is the set of all rated services, and $F = \{f_1, f_2, \ldots, f_m\}$ is the set of instances of feedback related to services in $S$ and collected from the users in $U$. Each $f_u \in F$ is an $n$-dimensional vector over the space of all instances of user feedback, i.e., $f_u = (f_{u,1}, f_{u,2}, \ldots, f_{u,n})$, where $f_{u,j} \in [0..1]$ is the feedback given by user $u$ to service $s_j$. If a service was not rated, its feedback is represented as $\varphi$ (null).

Footnote 1: In this paper we use the terms feedback, score and rating as synonyms.

The collaborative filtering algorithms described in this section are independent of the notation used to describe service semantics, as long as that notation allows the level of semantic similarity between two services to be measured. To validate the algorithms, a prototype was implemented for services described using OWL-S. OWL-S is an upper ontology that specifies that a service can be described by at most one service model, and that a grounding must be associated with exactly one service [16]. OWL-S is a W3C member submission based on the W3C standard OWL, an ontology language for the Semantic Web with formally defined meaning [6].

Computing Service Similarity

In our prototype implementation, the degree of similarity between OWL-S services is computed using a hybrid semantic service matching algorithm described in [17] that takes advantage of both logic-based reasoning and IR techniques. If R represents a request for a service and S a service registered in the service database, the semantic matching algorithm computes the following matching degrees:

Exact match (S exactly matches R): The I/O (Input/Output) signature of S perfectly matches request R with respect to the logic-based equivalence of their formal semantics.

Plug-in match (S plugs into R): All input parameter concepts of S match more specific ones in R. In addition, S is expected to return more specific output data.

Subsumes match (R subsumes S): This matching degree is weaker than plug-in matching.
The output of S is more specific than requested by R, as before, but the constraint of immediate output concept subsumption is relaxed to arbitrary output concept subsumption.

Subsumed-by match (R is subsumed by S): The output of S is slightly more general than requested (direct parent output concepts).

Nearest-neighbor match (S is the nearest neighbor of R): It is checked whether the degree of text similarity, SynSim(S,R), between the input and output concepts of S and R is greater than or equal to a defined syntactic similarity threshold α. This degree is computed as the averaged syntactic similarity of the serialized input and output concepts of S and R, according to a given similarity metric. A set of concepts is serialized by means of their expansion through the ontology implemented and by the conjunctive concatenation of the results into one unstructured text document, including only logical operators and primitive components of the basic vocabulary that is present in the ontological terminology. In the case of vector-space-based text similarity measurement, these documents are represented as weighted keyword vectors based on a term-weighting scheme.

Fail (S does not match R): None of the above matching degrees was obtained.

Memory-based Feedback Prediction with kNN

This recommendation algorithm is based on the construction of neighborhoods of similar users. The neighbors' ratings can then be used to make predictions for unrated items. A neighborhood is constructed by comparing the similarity of each pair of existing users using the Pearson Correlation Coefficient (PCC). Two variants of the algorithm were implemented. In the first, named PCC, the similarity between two users $u$ and $v$, $\mathrm{sim}(u,v)$, is computed as shown in Eq. (1), where $S_{uv} = \{s \mid f_{u,s} \neq \varphi \text{ and } f_{v,s} \neq \varphi\}$ is the set of services that both users, $u$ and $v$, have rated, $f_{u,s} \in [0..1]$ is the feedback given by user $u$ to service $s$, and $\bar{f}_u$ and $\bar{f}_v$ are the averages of the instances of feedback given by users $u$ and $v$, respectively.

$$\mathrm{sim}(u,v) = \frac{\sum_{s \in S_{uv}} (f_{u,s} - \bar{f}_u)(f_{v,s} - \bar{f}_v)}{\sqrt{\sum_{s \in S_{uv}} (f_{u,s} - \bar{f}_u)^2}\,\sqrt{\sum_{s \in S_{uv}} (f_{v,s} - \bar{f}_v)^2}} \tag{1}$$

In Eq. (1), if users $u$ and $v$ have not rated items in common, it is not possible to compute their similarity. Also, if the number of common items that were rated is very low, the computed similarity may be unreliable.

In the second variant of the algorithm, named PCC-SS (PCC with similar services), users $u$ and $v$ are not required to have rated the same services for their similarity to be computed, because the ratings of similar services are taken into consideration. The similarity between services is computed using the semantic matching algorithm presented in the previous subsection. PCC-SS computes the similarity between two users, $u$ and $v$, using Eq. (2). In that equation, $t$ is the service rated by $v$ that is most similar to $s$ (rated by $u$), respecting a minimum threshold of similarity δ. When both users have rated the same service, $s$ and $t$ represent the same service (the similarity between $s$ and $t$ is 1).

$$\mathrm{sim}(u,v) = \frac{\sum_{s \in S_u,\, t \in S_v} (f_{u,s} - \bar{f}_u)(f_{v,t} - \bar{f}_v)}{\sqrt{\sum_{s \in S_u,\, t \in S_v} (f_{u,s} - \bar{f}_u)^2}\,\sqrt{\sum_{t \in S_v} (f_{v,t} - \bar{f}_v)^2}} \tag{2}$$

The similarity between two users, $\mathrm{sim}(u,v)$, computed using Eq. (1) or Eq. (2), ranges from -1 to 1. A value of 1 implies a line that describes the relationship between feedback $f_{u,s}$ and $f_{v,s}$, given by users $u$ and $v$, respectively, for service $s$ (or a similar service), with all data points (instances of feedback) lying on the line where $f_{v,s}$ increases as $f_{u,s}$ increases. A value of -1 implies that all data points lie on the line where $f_{v,s}$ decreases as $f_{u,s}$ increases. A value of 0 implies that there is no linear correlation between the various instances of feedback. In our implementation, only $\mathrm{sim}(u,v)$ values higher than 0 were considered relevant.

The feedback a user $u$ would give to a service $s$ that he has not yet rated can be estimated using the ratings that neighbor users assigned to that service. Given a neighborhood $V$, the feedback user $u$ would give to service $s$, $f_{u,s}$, can be predicted using two variants of the weighted average of all neighbor ratings, as shown in Eq. (3) and Eq. (4).
$$f_{u,s} = \bar{f}_u + \frac{\sum_{v \in V} \mathrm{sim}(u,v)\,(f_{v,s} - \bar{f}_v)}{\sum_{v \in V} \mathrm{sim}(u,v)} \tag{3}$$

For the result $f_{u,s}$ in Eq. (3), hereafter named WAAR (Weighted Average of All Ratings), the neighborhood $V$ is formed by the $k$ users most similar to $u$ that rated service $s$.

$$f_{u,s} = \bar{f}_u + \frac{\sum_{v \in V} \mathrm{sim}(u,v)\,(f_{v,t} - \bar{f}_v)}{\sum_{v \in V} \mathrm{sim}(u,v)} \tag{4}$$

For the result $f_{u,s}$ in Eq. (4), hereafter named WAAR-SS (Weighted Average of All Ratings with Service Similarity), the neighborhood $V$ is formed by the $k$ users most similar to $u$ that rated service $s$ or a service $t$ that is semantically similar to $s$. If the neighborhood $V$ is empty in Eq. (3) or Eq. (4), $f_{u,s}$ is made equal to $\bar{f}_u$.

Model-based Feedback Prediction with k-means

Memory-based filtering algorithms tend to be more accurate than model-based algorithms, but the latter are more scalable and less vulnerable to profile injection attacks [18]. Considering that the number of available services on the Web is continuously increasing, and that in the context of Web-based open collaborative recommenders the likelihood of attacks is not negligible, model-based recommender algorithms can be good alternatives to memory-based algorithms, provided that their accuracy is acceptable.

We describe in this section a model-based CF algorithm for semantic Web services that uses the k-means clustering method and the concept of semantic service similarity. The k-means method is used to partition a set of points or observations into clusters. If we consider that $f_u \in F$ defines the profile of user $u$, where $f_u$ is the vector of instances of feedback given by user $u$ for the available services, the k-means algorithm can be used to cluster users with similar profiles. Once the clusters are defined, their centroids can be interpreted as aggregated profiles of the users in the clusters, as done in [19].

The clustering algorithm works as follows. Initially, $k$ points ($f$ vectors) are randomly chosen as the initial cluster centroids, after which an assignment step and an update step are repeated until the algorithm converges. In the assignment step, each point is assigned to the cluster with the closest centroid.
In the update step, cluster centroids are updated to the mean of the points assigned to the cluster. The algorithm converges when the centroids no longer change. In the assignment step, the distance between a point and a cluster centroid is computed using the PCC or the PCC-SS (Eq. (1) and Eq. (2), respectively). Following the assignment step, the update step computes a new centroid $f_c = (f_{c,1}, f_{c,2}, \ldots, f_{c,n})$ for each cluster $c$. The new centroid vector is the mean of the user profiles assigned to cluster $c$. That is, $f_{c,i}$, for $i = 1$ to $n$, is computed by Eq. (5).

$$f_{c,i} = \frac{1}{|c|} \sum_{u \in c} f_{u,i} \tag{5}$$

When applying Eq. (5), if some $f_{u,i}$ is equal to $\varphi$ (meaning that user $u$ has not rated service $s_i$), the average score of the items rated by $u$ is used instead. When the algorithm converges, each cluster centroid is seen as an aggregation of the user profiles in its cluster. User instances of feedback for unrated services are then estimated using Eq. (3) or Eq. (4), taking into consideration the neighborhood formed by the $k$ clusters (represented by their centroids) most similar to the target user profile (represented by his feedback vector).

3. EXPERIMENTAL EVALUATION

The purpose of this section is to compare the performance of the algorithms presented in Section 2. The lack of public rating datasets is a major difficulty when validating recommender systems for Web services. To circumvent this difficulty, researchers usually adapt popular datasets constructed to recommend other types of items. For example, [20] uses the MovieLens dataset and considers that a movie in the dataset represents a Web service. The evaluation of the algorithms we propose adds an additional level of difficulty because we need a dataset of user ratings for semantic Web services. In this context, an alternative is to synthesize a dataset that matches the properties of the target domain and task [21]. Following this approach, we created a synthetic user-item matrix that can be used to provide some insight into the behavior of the implemented algorithms and serve as a proof of concept.
We used services from the OWL-S Service Retrieval Test Collection (OWLS-TC), version 2.2, a collection of 1004 Web services from several domains, specified according to the OWL-S ontology. In the experiments, two groups with 50 users each were defined. Each user rated 56 services from the following four categories: car, camera, hotel and surf. Service ratings were set according to a base feedback defined for each pair (user_group, service_category). To each base feedback, a value varying from -1 to 1 according to the normal distribution was added.

The main objective of the experiments was to analyze the behavior of the proposed algorithms considering dense and sparse data scenarios. These scenarios were simulated by progressively hiding a number of service ratings from the algorithms: the 56 service ratings for each user were progressively reduced in steps of 10 until only 6 ratings were available for each user. After each removal step, the values of the removed scores were estimated using the algorithms previously discussed, with and without taking into consideration similar services, following which the average error of the predictions was computed. The experiments for each removal scenario were repeated 10 times and the results averaged. The time needed to compute the similarities between services was not taken into consideration because the computations were performed before running the experiments.

The prediction performance of the algorithms was measured using the Mean Absolute Error (MAE) and the Normalized Mean Absolute Error (NMAE), defined by Eq. (6) and Eq. (7), respectively.
$$\mathrm{MAE} = \frac{\sum_{u,s} \left| p_{u,s} - f_{u,s} \right|}{N} \tag{6}$$

$$\mathrm{NMAE} = \frac{\mathrm{MAE}}{\sum_{u,s} f_{u,s} / N} \tag{7}$$

In Eq. (6), $p_{u,s}$ denotes the predicted feedback that user $u$ will give to service $s$, $f_{u,s}$ denotes the actual (hidden) feedback that user $u$ gave to service $s$, and $N$ is the number of predicted instances of feedback. Lower values for MAE and NMAE indicate better prediction quality. An MAE or NMAE equal to zero corresponds to an ideal scenario, where all predictions are equal to the actual instances of feedback.

Evaluating the kNN Memory-based Feedback Prediction Algorithm

In the experiments described in this section, two services are considered similar if their matching degree is Exact, Plug-in, Subsumes, Subsumed-by or Nearest-neighbor with a threshold α of 0.8. Two simple estimation schemes, the item-mean and the user-mean algorithms, were also implemented to be used as baselines. The item-mean (IMEAN) algorithm estimates the score for an item (a service) as the mean of the scores the target item received from all users that rated it. The user-mean (UMEAN) algorithm estimates the score for an item as the mean of the scores the target user gave to the items he rated.

When applying Eq. (3) or Eq. (4) (WAAR and WAAR-SS), the neighborhood used to estimate a score is defined by users with a degree of similarity to the target user that is greater than or equal to 0.8, as computed by Eq. (1) or Eq. (2) (PCC and PCC-SS). When setting this similarity threshold, we have to consider that if it is too low, users with low similarity can be considered neighbors, negatively affecting the accuracy of the algorithm. On the other hand, if the threshold is very high, it is possible that no neighbors will be found, making it impossible to predict feedback for the target user-service pair.

As can be observed in Figure 1, the prediction error when using the PCC and WAAR (without using service similarity) is significantly lower than when the IMEAN and UMEAN algorithms are used. In other words, considering a neighborhood of similar users to predict user feedback is better than using raw user or item averages.

[Figure 1. Prediction accuracy of IMEAN, UMEAN and kNN without service similarity. Vertical axis: NMAE; horizontal axis: number of ratings removed.]

[Figure 2. Impact of service similarity on the accuracy of the kNN-based prediction algorithm. Series: without service similarity; PCC with service similarity; WAAR with service similarity; PCC and WAAR with service similarity. Vertical axis: NMAE; horizontal axis: number of ratings removed.]

Figure 2 shows that considering service similarity increases the prediction performance to an even greater extent. This happens when service similarity is used only to compute the PCC-SS (Eq. (2)) for the purpose of finding a neighborhood, or only to estimate scores with WAAR-SS (Eq. (4)). Using service similarity to compute both the PCC-SS and the WAAR-SS produces even more accurate predictions.

These results can be explained as follows. When the PCC is computed without taking into consideration service similarity, several similar users are not identified because the PCC equation correlates only users that rated a common set of services. When service similarity is taken into account, users who rated similar services are also taken into consideration, increasing the neighborhood and, as a consequence, the accuracy of the algorithm. In addition, using service similarity to predict a rating (WAAR-SS) contributes to increasing the accuracy because it allows more scores to be considered when calculating the prediction. This happens because, instead of only considering scores for services that both the target user and the similar users rated, scores for similar services are also included.

Figure 2 also shows that the effects of considering service similarity are not significant when a small number of scores is removed, but are more dramatic when the number of removed scores increases, that is, when the user-item matrix becomes sparser. As shown in Figure 2, when 50 out of 56 scores are removed, the NMAE is equal to 0.23 if service similarity is considered in both the PCC and WAAR, while when it is not considered in either method it rises to 0.41, an increase of 78%.
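The kNN variants compared in this subsection can be illustrated with a minimal Python sketch of PCC-SS (Eq. (2)) and WAAR-SS (Eq. (4)). It is not the authors' implementation: the function names `best_match`, `pcc_ss` and `waar_ss`, the dictionary layout of the ratings, and the `same_category` callback (a toy stand-in for the semantic matcher of [17]) are all assumptions made for illustration. Passing a similarity callback that always returns 0 reduces the sketch to plain PCC and WAAR.

```python
from math import sqrt

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def best_match(s, rated, svc_sim, delta):
    """Return the service in `rated` most similar to s (s itself if present),
    or None when no candidate reaches the similarity threshold delta."""
    if s in rated:
        return s  # same service: similarity is 1 by definition
    best = max(rated, key=lambda t: svc_sim(s, t), default=None)
    if best is not None and svc_sim(s, best) >= delta:
        return best
    return None

def pcc_ss(fu, fv, svc_sim, delta):
    """Eq. (2)-style similarity between two users' {service: feedback} dicts.
    Returns None when no pair of matching services exists."""
    if not fu or not fv:
        return None
    mu, mv = mean(fu.values()), mean(fv.values())
    pairs = []
    for s, rating in fu.items():
        t = best_match(s, fv, svc_sim, delta)
        if t is not None:
            pairs.append((rating - mu, fv[t] - mv))
    num = sum(a * b for a, b in pairs)
    den = sqrt(sum(a * a for a, _ in pairs)) * sqrt(sum(b * b for _, b in pairs))
    return num / den if den > 0 else None

def waar_ss(u, s, ratings, svc_sim, delta, k, min_sim=0.0):
    """Eq. (4)-style prediction of the feedback user u would give to service s,
    using the k most similar users that rated s or a similar service."""
    fu = ratings[u]
    mu = mean(fu.values())
    candidates = []
    for v, fv in ratings.items():
        if v == u:
            continue
        t = best_match(s, fv, svc_sim, delta)
        if t is None:
            continue
        sim = pcc_ss(fu, fv, svc_sim, delta)
        if sim is not None and sim > min_sim:  # only positive similarities count
            candidates.append((sim, fv[t] - mean(fv.values())))
    top = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    if not top:
        return mu  # empty neighborhood: fall back to the user's average
    return mu + sum(sim * dev for sim, dev in top) / sum(sim for sim, _ in top)

# Hypothetical toy data: services in the same category count as similar.
def same_category(s, t):
    return 0.9 if s.rstrip("0123456789") == t.rstrip("0123456789") else 0.0

ratings = {
    "u": {"car1": 0.9, "hotel1": 0.3},
    "v": {"car2": 0.8, "hotel2": 0.2, "camera1": 0.7},
}
print(waar_ss("u", "camera1", ratings, same_category, delta=0.8, k=5))
```

In this toy data the two users share no rated service, so plain PCC is undefined and the prediction falls back to the average of user u; with service similarity enabled, user v becomes a neighbor and contributes to the prediction, which is the effect the subsection attributes to PCC-SS and WAAR-SS.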
Evaluating the k-means Model-based Feedback Prediction Algorithm

Using the same scenarios from the previous section, experiments were conducted to evaluate the performance of the prediction approach based on k-means. One of the important parameters for this algorithm is the number of clusters, k. If k is too small, user profiles with little similarity are clustered together, reducing the accuracy of the algorithm; on the other hand, if k is too high, the scalability of the algorithm (one of its main expected advantages over the kNN-based algorithm) can be negatively affected. In the experiments presented in this section, k was set to 8, a value chosen after some preliminary tests demonstrated that it is a good choice for the dataset used. The neighborhood used to predict the feedback of a target user is formed by the cluster centroids that have a degree of similarity to that user (computed using the PCC and PCC-SS) greater than or equal to 0.8.

As can be observed in Figure 3, the k-means prediction algorithm without service similarity has a prediction error significantly lower than that obtained when applying the IMEAN and UMEAN algorithms, except when the number of available scores is very low (when 50 out of 56 are removed). Under such circumstances, the small number of available user profiles prevents the construction of representative user groups, severely affecting the prediction accuracy of the algorithm. Under such sparse data conditions, the use of service similarity accounts for an appreciable increase in accuracy. As already verified for the kNN algorithm, the best results are observed when service similarity information is used for computing both the PCC and the WAAR.

These results can be explained in the same manner as for the kNN algorithm: when running the algorithm without service similarity information, several similar users are not identified as such and are not clustered together, because only users that rated the same set of services can be considered similar; when service similarity is taken into account, it is also possible to identify similar users among those users that rated similar services. In addition, when computing the WAAR, the use of service similarity information contributes to increasing the accuracy because it allows more scores to be considered when calculating a prediction.

Figure 3 shows that when 50 out of 56 scores are removed, characterizing a situation of scarcity of evaluations, using service similarity for computing the PCC and WAAR accounts for an NMAE of 0.32, while when this information is not used the NMAE rises to 0.89, an increase of 178%.
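As a quick check of the figures quoted for the sparse scenario, the sketch below computes MAE and NMAE and the relative increase between two NMAE values. It assumes that NMAE divides MAE by the mean of the actual feedback values, which is how Eq. (7) is read here; the helper names `mae`, `nmae` and `relative_increase` are hypothetical.

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual feedback values."""
    return sum(abs(p - f) for p, f in zip(predicted, actual)) / len(actual)

def nmae(predicted, actual):
    """MAE normalized by the mean of the actual feedback values (assumed reading of Eq. (7))."""
    return mae(predicted, actual) / (sum(actual) / len(actual))

def relative_increase(baseline, value):
    """Relative growth of `value` with respect to `baseline`."""
    return (value - baseline) / baseline

# NMAE values reported in the text when 50 out of 56 scores are removed.
print(round(100 * relative_increase(0.32, 0.89)))  # k-means: with vs. without service similarity
print(round(100 * relative_increase(0.23, 0.41)))  # kNN: with vs. without service similarity
```

The two printed percentages correspond to the 178% and 78% increases reported for the k-means and kNN algorithms, respectively.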
Comparing the kNN and the k-means Prediction Algorithms

The literature says that memory-based prediction algorithms, like those based on kNN, often have greater accuracy than model-based algorithms, such as those based on k-means, but model-based algorithms are more scalable because they require less memory and are faster. Figure 4 confirms the first clause of the previous sentence. However, it is worth noting that the k-means algorithm with service similarity is more accurate than the kNN one without service similarity.

[Figure 4. Comparing the prediction performance of the kNN and k-means algorithms. Series: UMEAN; kNN with service similarity; k-means with service similarity; kNN without service similarity; k-means without service similarity. Vertical axis: NMAE; horizontal axis: number of ratings removed.]

The lower accuracy of the k-means algorithm with respect to kNN can be explained by the fact that the k-means method uses cluster centroids, and not the profiles of similar users, to predict the scores. Profiles are grouped into clusters based on the similarity of each profile to a cluster centroid; thus a poorly chosen centroid directly influences the quality of the cluster. In the implementation described, the initial eight centroids were chosen randomly among the available profiles. The particularly bad results for the k-means algorithm when many scores are removed and similar services are not considered can be explained by the difficulty in finding similar users to group together when data is sparse.

Figure 5 shows the time required by the algorithms to predict the removed scores when using a notebook with an Intel Core Duo 1.66 GHz processor and 2 GB of RAM. Regarding the k-means algorithm, the time shown is the time required for score prediction with already created clusters. Under these conditions, the run time is lower for the k-means algorithm, particularly when the user-item matrix is dense. This result was expected because a high number of profiles are considered in the computation of the PCC and the WAAR when using the kNN method, while only a small number of cluster centroids are used when applying the k-means method.
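The clustering procedure discussed above (initial centroids drawn from the available profiles, assignment by PCC, and the mean update of Eq. (5) with unrated entries replaced by the user's average) can be sketched as follows. This is an illustrative sketch rather than the authors' code: profiles are assumed to be lists with `None` for unrated services, "closest centroid" is taken to mean "highest PCC", and the names `kmeans_profiles`, `fill_missing` and the optional `init` argument are hypothetical.

```python
import random
from math import sqrt

def mean_rated(profile):
    rated = [x for x in profile if x is not None]
    return sum(rated) / len(rated)

def fill_missing(profile):
    """Replace unrated entries by the user's average score, as described for Eq. (5)."""
    avg = mean_rated(profile)
    return [avg if x is None else x for x in profile]

def pcc(a, b):
    """PCC over positions rated in both vectors; 0.0 when it is undefined."""
    ma, mb = mean_rated(a), mean_rated(b)
    pairs = [(x - ma, y - mb) for x, y in zip(a, b)
             if x is not None and y is not None]
    num = sum(x * y for x, y in pairs)
    den = sqrt(sum(x * x for x, _ in pairs)) * sqrt(sum(y * y for _, y in pairs))
    return num / den if den > 0 else 0.0

def kmeans_profiles(profiles, k, init=None, seed=0, max_iter=100):
    """Cluster user profiles; returns (centroids, assignment)."""
    filled = [fill_missing(p) for p in profiles]
    if init is None:  # random initial centroids among the available profiles
        init = random.Random(seed).sample(range(len(profiles)), k)
    centroids = [list(filled[i]) for i in init]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each profile joins the most similar centroid.
        new_assignment = [max(range(k), key=lambda c: pcc(p, centroids[c]))
                          for p in profiles]
        if new_assignment == assignment:
            break  # converged: clusters, and hence centroids, no longer change
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members' profiles.
        for c in range(k):
            members = [filled[i] for i, a in enumerate(assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

# Hypothetical toy profiles: two groups with opposite tastes.
profiles = [
    [0.9, 0.8, 0.2, 0.1], [0.8, 0.9, 0.1, None], [0.9, None, 0.2, 0.2],
    [0.1, 0.2, 0.9, 0.8], [0.2, 0.1, None, 0.9], [None, 0.2, 0.8, 0.9],
]
centroids, assignment = kmeans_profiles(profiles, k=2, init=[0, 3])
print(assignment)
```

Because prediction then compares the target profile with only k centroids instead of every user profile, a sketch like this also makes the runtime advantage discussed above easy to see; the sensitivity to the initial centroids can be explored by omitting `init` and varying `seed`.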
[Figure 3. Impact of service similarity on the prediction performance of the k-means algorithm. Series: IMEAN; UMEAN; without service similarity; PCC with service similarity; WAAR with service similarity; PCC and WAAR with service similarity. Vertical axis: NMAE; horizontal axis: number of ratings removed.]

[Figure 5. Runtime of the kNN and k-means algorithms for score prediction. Series: kNN without service similarity; k-means without service similarity; kNN with service similarity; k-means with service similarity. Vertical axis: time (ms); horizontal axis: number of ratings removed.]
More informationX. Xxxxxx. By Nunzio Quacquarelli MA Cambridge, MBA Wharton
X. Xxxxxx 1 QS Global 200 Buine School Report 2012 By Nunzio Quacquarelli MA Cambridge, MBA Wharton 2 Content 1. Summary: fat fact...5 2. Introduction... 7 3. Methodology...9 4. Methodology: ample... 11
More informationDoSAM DomainSpecific Software Architecture Comparison Model *
DoSAM DomainSpecific Software Architecture Comparion Moel * Klau Bergner 1, Anrea Rauch 2, Marc Sihling 1, Thoma Ternité 2 1 4Soft GmbH Mitterertraße 3 D80336 Munich, Germany {bergner ihling}@4oft.e
More informationIntroduction to the article Degrees of Freedom.
Introduction to the article Degree of Freedom. The article by Walker, H. W. Degree of Freedom. Journal of Educational Pychology. 3(4) (940) 5369, wa trancribed from the original by Chri Olen, George Wahington
More informationMC39i Siemens Cellular Engine. Version: 01.02 DocID: MC39i_HD_V01.02
MC39i Siemen Cellular Engine Verion: 01.02 DocID: MC39i_HD_V01.02 Document Name: MC39i Hardware Interface Decription Verion: 01.02 Date: November 12, 2003 DocId: Statu: MC39i_HD_V01.02 General Note Product
More informationHealth Insurance and Social Welfare. Run Liang. China Center for Economic Research, Peking University, Beijing 100871, China,
Health Inurance and Social Welfare Run Liang China Center for Economic Reearch, Peking Univerity, Beijing 100871, China, Email: rliang@ccer.edu.cn and Hao Wang China Center for Economic Reearch, Peking
More informationAsset Pricing: A Tale of Two Days
Aet Pricing: A Tale of Two Day Pavel Savor y Mungo Wilon z Thi verion: June 2013 Abtract We how that aet price behave very di erently on day when important macroeconomic new i cheduled for announcement
More informationSome Recent Advances on Spectral Methods for Unbounded Domains
COMMUICATIOS I COMPUTATIOAL PHYSICS Vol. 5, o. 24, pp. 195241 Commun. Comput. Phy. February 29 REVIEW ARTICLE Some Recent Advance on Spectral Method for Unbounded Domain Jie Shen 1, and LiLian Wang
More informationTwo Trees. John H. Cochrane University of Chicago. Francis A. Longstaff The UCLA Anderson School and NBER
Two Tree John H. Cochrane Univerity of Chicago Franci A. Longtaff The UCLA Anderon School and NBER Pedro SantaClara The UCLA Anderon School and NBER We olve a model with two i.i.d. Luca tree. Although
More informationHumidity Fixed Points of Binary Saturated Aqueous Solutions
JOURNAL OF RESEARCH of the National Bureau of StandardA. Phyic and Chemitry Vol. 81 A, No. 1, JanuaryFebruary 1977 Humidity Fixed Point of Binary Saturated Aqueou Solution Lewi Greenpan Intitute for
More informationREVISTA INVESTIGACIÓN OPERACIONAL V ol. 29, No. 2,,95105 2007
REVISTA INVESTIGACIÓN OPERACIONAL V ol. 29, No. 2,,95105 2007 GENETIC OPERATORS FOR THE MULTIOBJECTIVE FLOWSHOW PROBLEM Magdalena Bandala*, María A. OorioLama** School of Computer Science, Univeridad
More informationIntroduction to Recommender Systems Handbook
Chapter 1 Introduction to Recommender Systems Handbook Francesco Ricci, Lior Rokach and Bracha Shapira Abstract Recommender Systems (RSs) are software tools and techniques providing suggestions for items
More informationThe ImportExport Paradigm for HighQuality College Courses
Public Policy Editor: Stephen Ruth ruth@gmu.edu The ImportExport Paradigm for HighQuality College Coure An Anwer to Tuition Throughthe Roof Cot Spiral? Stephen Ruth George Maon Univerity Three new
More informationAddressing Cold Start in Recommender Systems: A Semisupervised Cotraining Algorithm
Addressing Cold Start in Recommender Systems: A Semisupervised Cotraining Algorithm Mi Zhang,2 Jie Tang 3 Xuchen Zhang,2 Xiangyang Xue,2 School of Computer Science, Fudan University 2 Shanghai Key Laboratory
More informationWarp Field Mechanics 101
Warp Field Mechanic 101 Dr. Harold Sonny White NASA Johnon Space Center 2101 NASA Parkway, MC EP4 Houton, TX 77058 email: harold.white1@naa.gov Abtract: Thi paper will begin with a hort review of the
More informationIOWA WESTERN COMMUNITY COLLEGE General Catalog 20142015
IOWA WESTERN COMMUNITY COLLEGE General Catalog 20142015 Council Bluff Campu 2700 College Road Council Bluff, Iowa 51503 (712) 3253200 18004325852 Clarinda Center 923 E. Wahington Street Clarinda,
More informationScalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights
Seventh IEEE International Conference on Data Mining Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights Robert M. Bell and Yehuda Koren AT&T Labs Research 180 Park
More informationSRA SOLOMON : MUC4 TEST RESULTS AND ANALYSI S
SRA SOLOMON : MUC4 TEST RESULTS AND ANALYSI S Chinatu Aone, Doug McKee, Sandy Shinn, Hatte Bleje r Sytem Reearch and Application (SRA ) 2000 15th Street North Arlington, VA 2220 1 aonec@ra.com INTRODUCTION
More informationVARIABILITY is commonly understood as the ability of a
282 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 40, NO. 3, MARCH 2014 Variability in Software Systems A Systematic Literature Review Matthias Galster, Danny Weyns, Dan Tofan, Bartosz Michalik, and
More informationContentBoosted Collaborative Filtering for Improved Recommendations
Proceedings of the Eighteenth National Conference on Artificial Intelligence(AAAI2002), pp. 187192, Edmonton, Canada, July 2002 ContentBoosted Collaborative Filtering for Improved Recommendations Prem
More informationCloudGenius: Decision Support for Web Server Cloud Migration
CloudGenius: Decision Support for Web Server Cloud Migration ABSTRACT Michael Menzel Research Center for Information Technology Karlsruhe Institute of Technology Karlsruhe, Germany menzel@fzi.de Cloud
More informationIntrusion Detection Techniques for Mobile Wireless Networks
Mobile Networks and Applications? (2003) 1 16 1 Intrusion Detection Techniques for Mobile Wireless Networks Yongguang Zhang HRL Laboratories LLC, Malibu, California Email: ygz@hrl.com Wenke Lee College
More informationSupporting Keyword Search in Product Database: A Probabilistic Approach
Supporting Keyword Search in Product Database: A Probabilistic Approach Huizhong Duan 1, ChengXiang Zhai 2, Jinxing Cheng 3, Abhishek Gattani 4 University of Illinois at UrbanaChampaign 1,2 Walmart Labs
More information