Entropy-Based Link Analysis for Mining Web Informative Structures
Hung-Yu Kao, Shian-Hua Lin*, Jan-Ming Ho*, Ming-Syan Chen
Electrical Engineering Department, National Taiwan University, Taipei, Taiwan, ROC
*Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC
E-mail: {shlin, hoho}@iis.sinica.edu.tw

ABSTRACT
In this paper, we study the problem of mining the informative structure of a news Web site, which consists of thousands of hyperlinked documents. We define the informative structure of a news Web site as a set of index pages (also referred to as TOC, i.e., table of contents, pages) and a set of article pages linked by TOC pages through informative links. The Hyperlink Induced Topics Search (HITS) algorithm has been employed to analyze the authorities and hubs of pages. However, most content sites tend to contain extra hyperlinks, such as navigation panels, advertisements and banners, so as to increase the add-on value of their Web pages. Therefore, due to the structure induced by these extra hyperlinks, HITS is found to be insufficient to provide good precision in solving the problem. To remedy this, we develop an algorithm that utilizes entropy-based link analysis to mine Web informative structures. This algorithm is referred to as LAMIS, standing for entropy-based Link Analysis on Mining web Informative Structures. The key idea of LAMIS is to utilize information entropy to represent the knowledge that corresponds to the amount of information in a link or a page in the link analysis. Experiments on several real news Web sites show that the precision and recall of LAMIS are much superior to those obtained by heuristic methods, and also that the link analysis techniques derived are very powerful for mining the informative structures of news Web sites. On average, the augmented LAMIS leads to prominent performance improvement and increases the precision by a factor ranging from 33% to 232% when the desired recall falls between 0. and .

Keywords
Informative structure, link analysis, hubs and authorities, anchor text, entropy, information extraction.

1. Introduction
Recently, there has been explosive progress in the development of the World Wide Web.
This progress creates numerous and various information contents published as HTML pages on the Internet. Furthermore, for the purpose of maintenance and scalability of Web sites, most of the Web contents are migrating from static pages and general text files to dynamic forms which are generated from predefined templates and diverse content retrieved from back-end databases. In general, pages in most commercial Web sites, e.g., search engines, e-commerce stores, news, etc., are so dynamically generated. Such Web sites are called systematic Web sites in this paper. A news Web site that generates pages with daily hot news and archives historic news is a typical example of a systematic Web site.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CIKM'02, November 4-9, 2002, McLean, Virginia, USA. Copyright 2002 ACM.

Due to the evolution of automatic generation of Web pages, the number of Web pages grows explosively [6] [2]. However, there is a lot of redundant and irrelevant information in Web sites [], especially in automatically generated pages of systematic Web sites. Examples of redundant and irrelevant information include advertisement banners, browsing menus, catalogs of services, announcements of copyright and privacy policy, and those contents tagged with hyperlinks for easy access to related information. In systematic Web sites, e.g. news Web sites, with the use of redundant and irrelevant links, it is convenient and easy for users to browse and extract informative parts using fewer clicks or shortcuts in any page. However, these redundant links increase the difficulty for search engines or text miners to extract useful information exactly, since they are usually designed to index or process everything including redundant and irrelevant information.
In general, this redundant and irrelevant information is usually not closely related to the theme of the corresponding pages, thereby making it difficult to retrieve and classify the topics of a content page correctly. Consider the example in Figure 1. We divide the root page of WashingtonPost (a popular English news Web site) into several parts with different styles and contents, i.e., (1) a banner with links ''news'', ''OnPolitics'', ''Entertainment'', ''Live Online'', etc. on the top, (2) a menu with 22 links of news categories on the left, (3) a banner with advertisement links, (4) general announcements about washingtonpost, (5) a block with promoted hot news and advertisements, (6) a TOC block, and (7) a list with headline news. In this case, parts (1) and (2) are distributed among most pages in WashingtonPost and are redundant information for users. However, they are still indexed by search engines. Such indexing increases the index size, while being useless for users and harmful to the quality of search results. Parts (3), (4) and (5) are irrelevant to the context of the page and are called irrelevant information. These parts will make the topic of the page drift when terms in these parts are indexed. The last two parts, (6) and (7), in fact draw more attention from users and are primary contents. Users can get news articles with one click from anchors in these
two parts. The following examples describe their impacts in more detail:

Example 1: After searching ''game hardware tech jobs'' in Google (one of the most popular search engines), we found 20 pages of CNET (a news and product review Web site for computing and technology) in the top-100 results. However, none of these pages came from the Web pages categorized in CNET Job Seeker, which contains the desired information. There are matched query terms in redundant parts of pages among all these 20 pages, and three of them do not contain any matched terms in the informative parts of pages. The matched terms in redundant parts of a page will increase the rank of the page, even though they are usually ignored by users. Note that six of the 20 pages are ranked as top-6 and the page with the highest ranking does not contain any desirable information in the informative parts.

Figure 1: A sample page of news Web sites

In news Web sites, news article pages are updated daily, and news information agents only need to crawl TOC pages first and find news pages through links in TOC pages with the help of the informative structure. They, therefore, do not need to crawl and index redundant and irrelevant information daily. In addition to increasing precision and decreasing the cost of search engines, crawlers and information agents, informative structures of Web sites are useful for wrapper generation in information extraction, e.g., WIEN [17], IEPAD [11], etc. This is because repeated patterns would be more evident in clustered article pages. From our observations, in systematic Web sites, the document structures of pages with near equal authority and hub values are analogous to one another. The precision of wrapper generation will be improved after the preprocessing of mining informative structures on pages. Consequently, with the continuous growth in the number of pages with redundant and tenuous contexts, finding desired information on the Web has become an important and difficult task to solve. Explicitly, extracting informative structures of Web sites to recognize useful pages and links is a crucial issue to increase precision and decrease the cost of search engines and information agents.
In this paper, we focus on the mining of news Web sites to demonstrate the problem and the solution proposed in detail. From observing TOC pages and article pages in a news Web site, one may note that TOC pages tend to hold more outlinks than article pages, and also that the lengths of non-anchor context of article pages are longer than those of TOC pages. However, these characteristics are diverse in systematic Web sites and are dependent on presentation styles, policies, and types of contents of Web sites. Moreover, the characteristics will be blurred in a large context and become less discriminating when redundant/irrelevant information and links are induced. The works [14][15] provide good learning mechanisms to recognize advertisements and redundant and irrelevant links of Web pages. However, these methods need to build the training data first, and related domain knowledge must be included to extract features for generation of classification rules. Therefore, it is difficult to extract the informative structures of present systematic Web sites by an automatic and heuristic technique.

Footnotes to Example 1: (1) The result was queried in February. (2) CNET Job Seeker.

Specifically, when information characteristics are considered in link graphs, TOC pages in the informative structure of a Web site present characteristics of good hubs, and article pages are good authorities linked by TOC pages. Therefore, link analysis algorithms, e.g., HITS (Hyperlink Induced Topic Search [16]), provide a reasonable solution to mining the informative structures of news Web sites. Explicitly, based on link analysis, HITS [16] and the PageRank algorithm [4] applied in Google have provided a new ranking technology for Web search engines. The HITS algorithm, based on the mutual reinforcement relationship, provides an innovative methodology for Web searching and topic distillation. According to the definition in [16], a Web page is an authority on a topic if it provides good information, and is a hub if it provides links to good authorities. In recent research work on link analysis of hyperlinked documents, HITS is applied to the research area of topic distillation, and several kinds of link weights are involved to indicate the significance of links in hyperlinked documents.
In the Clever system [10], weights tuned empirically are added to distinguish same-site links and others. In [3], the metrics of similarity of whole contents in linked documents are applied on link weights, and the use of text surrounding the links as keyword-based evidence to determine a weight for each link is proposed in [9]. Considering the distribution of terms in documents, Chakrabarti et al. [7] combine the TFIDF-weighted model and micro-hub to represent the significance of anchors in regions with information needed.

Note that topic distillation is different from informative structure mining in several aspects: (1) The former distills hubs and authorities for a given query and the latter mines the informative structure consisting of TOC and article pages on a given Web site; (2) The base set of topic distillation consists of the root set, which is generated from the subset of query results, and the neighboring nodes of the root set. Informative structure mining targets all pages of a Web site; (3) Algorithms for topic distillation usually omit intra-links and nepotistic links to perform the mutual reinforcement between sites. Such links are important and should be considered as link candidates in the informative structure of a Web site; (4) Most adaptive topic distillation algorithms based on HITS take the relationship between queries and documents into consideration. However, these algorithms do not work well on mining informative structures in the absence of a target query. Furthermore, as described in [8][18], link analysis algorithms, e.g., HITS, are vulnerable to the effects of nepotistic clique attacks and the Tightly-Knit Community (TKC). The effects will be more significant for mining informative structures of Web sites
due to the huge amount of nepotistic links and cliques in a Web site.

Consequently, we propose in this paper an approach called Entropy-based Link Analysis on Mining Informative Structure (LAMIS). LAMIS is an automatic informative structure extracting system based on the weighted link analysis of Web pages. When a root URL of a site is given, the system ranks its Web pages according to distinct degrees of hub and authority. However, because a page, especially a systematic page in commercial Web sites, usually carries multiple characteristics of hub, authority and noise, it is in general difficult to discriminate pages according to their hub or authority values when a general link analysis algorithm is applied. One may assign weights to each link of a page to reduce the effects of noise links. In recent research on topic distillation and link analysis, some weighting mechanisms have been proposed [3][7][10]. However, due to the lack of query terms, these weighting mechanisms do not work well when we want to mine the information structure of a Web site. Therefore, a new weighting mechanism, which considers the significance of links and the contents of information they carry in a Web site, is needed. The key idea of LAMIS is to utilize the information entropy and weigh links with the knowledge that corresponds to the amount of information in a link or a page in the link analysis. Results of experiments on several real news Web sites have shown that LAMIS significantly outperforms the companion heuristic algorithms and avoids the drawback of general link analysis algorithms for mining informative structures. On average, the augmented LAMIS leads to prominent performance improvement and increases the precision by a factor ranging from 33% to 232% when the desired recall falls between 0. and . Benefiting from the improvement of TOC recognition, LAMIS is shown to be able to mine the informative structure more efficiently and precisely.

The remainder of this paper is organized as follows. In Section 2, we describe the basic idea of LAMIS. The entropy of anchor text and the enhanced link analysis are presented in this section. The system design and implementation are described in Section 3. In Section 4, we evaluate the performance of LAMIS by several experiments on Chinese and English news Web sites.
We discuss one of the effects of noises and the augmented feature of LAMIS in this section. The paper concludes with Section 5.

2. Model of Link Analysis
In this section, we will describe the detail of our approach with an illustrative example. The introduction of entropy and its application to anchor text are described in Section 2.1. The proposed entropy-based link analysis is presented in Section 2.2.

2.1. Entropy of Anchor Text
While users are browsing the Web, the anchor text is an important clue for users to track and search their desired information. Hence, we extract terms from anchor texts to denote the significance of anchors. In this paper, a term corresponds to a meaningful keyword. The motivation of our approach is that terms distributed in more pages in a Web site usually carry less information to users. In contrast, those appearing in fewer pages carry more information of interest. Hence, we extract terms from anchor texts and use the entropy, which is determined from the probability distribution of terms around the whole document sets, to represent the information strength (rich or poor) of terms. Shannon's information entropy [22] is applied on the term-document matrix, which is generated from the term extraction module, to calculate the entropy. By definition, the entropy E can be expressed as $-\sum_{i=1}^{n} p_i \log p_i$, where $p_i$ is the probability of event $i$ and $n$ is the number of events. By normalizing the weight of a term to be in [0, 1], the entropy of term $T_i$ is:

$$EN(T_i) = -\sum_{j=1}^{n} w_{ij} \log_2 w_{ij}, \tag{1}$$

in which $w_{ij}$ is the value of normalized term frequency. $w_{ij}$ is an entry in the term-document matrix representing the weight of a term in a page, i.e., $w_{ij} = tf_{ij} / \sum_{k=1}^{n} tf_{ik}$, where $tf_{ij}$ is the term frequency of term $i$ in page $j$. To normalize the entropy value of a term to the range [0, 1], the base of the logarithm is chosen to be the number of pages. Equation (1) thus becomes:

$$EN(T_i) = -\sum_{j=1}^{n} w_{ij} \log_n w_{ij}, \quad \text{where } n = |D| \text{ and } D \text{ is the set of pages}. \tag{2}$$

We then define the entropy of anchor $AN$ as the average entropy of all terms in $AN$:

$$EN(AN) = \frac{\sum_{j=1}^{k} EN(T_j)}{k}, \quad \text{where } T_1, T_2, \ldots, T_k \text{ are the terms in anchor } AN. \tag{3}$$

It is noted that if an anchor contains no terms, the entropy of the anchor is assigned to one.
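Equations (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `term_entropies` and `anchor_entropy`, the input layout (a mapping from page id to a list of anchors, each a list of terms) and the three-page toy site are all assumptions made for the example.

```python
import math
from collections import defaultdict


def term_entropies(pages):
    """Entropy of each anchor-text term, normalized to [0, 1] as in Equation (2).

    pages: dict mapping a page id to a list of anchors; each anchor is a
    list of terms (hypothetical input layout chosen for this sketch).
    """
    n = len(pages)  # n = |D|, the number of pages
    tf = defaultdict(lambda: defaultdict(int))  # tf[term][page] = term frequency
    for page, anchors in pages.items():
        for terms in anchors:
            for term in terms:
                tf[term][page] += 1

    entropies = {}
    for term, counts in tf.items():
        total = sum(counts.values())
        entropy = 0.0
        for count in counts.values():
            w = count / total  # w_ij: normalized term frequency
            if n > 1:  # log base n is undefined for a one-page site
                entropy -= w * math.log(w, n)
        entropies[term] = entropy
    return entropies


def anchor_entropy(terms, entropies):
    """Average entropy of the terms in an anchor, Equation (3).

    An anchor without terms is assigned entropy one, as stated in the text.
    """
    if not terms:
        return 1.0
    return sum(entropies[t] for t in terms) / len(terms)


# Toy site (an assumption): "home" appears on every page, the others on one page.
toy_site = {
    "p0": [["home"], ["news"]],
    "p1": [["home"], ["sports"]],
    "p2": [["home"]],
}
toy_entropies = term_entropies(toy_site)
```

In this toy site the term `home` is spread uniformly over all three pages, so its entropy is one, while `news` and `sports` each occur on a single page and get entropy zero.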
Figure 2: A simple Web site, |D| = 5 (panels: link graph of pages P0 to P4, adjacency matrix, and term-document (TD) matrix over the terms T0 (hot), T1 (news), T2 (sales), T3 (sports), T4 (business), T5 (home)). Anchor texts shown in the figure:
P0: AN00 "hot news", AN01 "sales"
P1: AN10 "sports", AN11 "business", AN12 "sales", AN13 "home"
P2: AN20 "home"
P3: AN30 "hot news", AN31 "sales", AN32 "home", AN33 "business news"
P4: AN40 "hot news", AN41 "sales", AN42 "home", AN43 "sports news"

Consider for example a simple news Web site in Figure 2. Page P0 is the homepage of the Web site. Page P1 is the TOC page with two news anchors linking to P3 and P4, which are news article pages in the Web site. Page P2 is an advertisement page linked by the other four pages. Most pages contain anchors, i.e., home, hot news, and sales, linking to Pages P0, P1, and P2 respectively. Pages P3 and P4 have cross-reference links as the related news ones to each other. The entropy of terms in anchor texts of the Web site can be calculated as below:
$$EN(T_0) = -3 \cdot \tfrac{1}{3} \log_5 \tfrac{1}{3} = 0.682,$$
$$EN(T_1) = -\tfrac{1}{5} \log_5 \tfrac{1}{5} - 2 \cdot \tfrac{2}{5} \log_5 \tfrac{2}{5} = 0.655,$$
$$EN(T_2) = EN(T_5) = -4 \cdot \tfrac{1}{4} \log_5 \tfrac{1}{4} = 0.861, \text{ and}$$
$$EN(T_3) = EN(T_4) = -2 \cdot \tfrac{1}{2} \log_5 \tfrac{1}{2} = 0.431.$$

From the example in Figure 2, we note that terms uniformly distributed around documents have their entropy values close to one, meaning that these terms provide very little information for users. In general, these terms come from links of navigation panels and advertisement banners, like $T_1$, $T_2$, and $T_5$. Hence, anchors containing high entropy terms are deemed less informative. For example, the entropy of anchor $AN_{00}$ is $(EN(T_0) + EN(T_1))/2 = 0.669$, and anchor $AN_{00}$ is thus considered a less informative link. In contrast, when terms can only be found in one anchor, they own the lowest entropy 0, meaning that users can only find information relevant to these terms through the anchors in which they are located. Users are usually more interested in such information of smaller entropy. This in turn means that an anchor with a smaller entropy term should be assigned a larger weight than one with a larger entropy term. The entropy values of anchors in the sample site are listed in Table 1. The most informative links are $AN_{10}$ and $AN_{11}$, which link to article pages P3 and P4 from TOC page P1.

Table 1: Entropy values of anchors in Figure 2 (one entry per anchor AN00 to AN43, grouped by page P0 to P4)

2.2. Enhanced Link Analysis with Anchor Texts
In the link graph of a Web site G(V, E), the HITS algorithm computes two scores for each node v in V, i.e., the hub score H(v) and the authority score A(v). Initially, the two scores of each node are assigned to an equal positive number. Then, it iteratively updates the scores as follows until H and A converge:

$$A(v) = \sum_{(u,v) \in E} H(u) \quad \text{and} \quad H(v) = \sum_{(v,u) \in E} A(u). \tag{4}$$

Note that the scores are normalized in each iteration. It is proved in [16] that H and A will eventually converge, i.e., termination of HITS is guaranteed. In our approach, we incorporate entropy values of anchors as link weights to present the significance of links in a page. Therefore, Equation (4) is modified as follows:

$$A(v) = \sum_{(u,v) \in E} H(u)\,\alpha_{uv} \quad \text{and} \quad H(v) = \sum_{(v,u) \in E} A(u)\,\alpha_{vu}, \tag{5}$$

where $\alpha_{uv}$ is the weight of anchor $AN_{uv}$. According to the definition of entropy, $\alpha_{uv}$ is defined as follows:

$$\alpha_{uv} = 1 - EN(AN_{uv}). \tag{6}$$

Equation (6) means that the more information a link carries, the larger the weight of the link.
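The weighted update of Equations (5) and (6) can be tried on the sample site of Figure 2 with the sketch below. It is an illustration under stated assumptions, not the authors' code: the link targets in `LINKS` are read off the textual description of Figure 2, the names `anchor_weights` and `hits` are hypothetical, scores are normalized to unit Euclidean length in each iteration, and the iteration count of 50 is an arbitrary choice.

```python
import math
from collections import defaultdict

# Links of the sample site as (source, target, anchor terms); targets are
# inferred from the description of Figure 2 (an assumption of this sketch).
LINKS = [
    ("P0", "P1", ["hot", "news"]), ("P0", "P2", ["sales"]),
    ("P1", "P3", ["sports"]), ("P1", "P4", ["business"]),
    ("P1", "P2", ["sales"]), ("P1", "P0", ["home"]),
    ("P2", "P0", ["home"]),
    ("P3", "P1", ["hot", "news"]), ("P3", "P2", ["sales"]),
    ("P3", "P0", ["home"]), ("P3", "P4", ["business", "news"]),
    ("P4", "P1", ["hot", "news"]), ("P4", "P2", ["sales"]),
    ("P4", "P0", ["home"]), ("P4", "P3", ["sports", "news"]),
]


def anchor_weights(links):
    """Return (source, target, alpha) with alpha = 1 - EN(AN), Equation (6)."""
    n = len({p for src, dst, _ in links for p in (src, dst)})  # |D|
    tf = defaultdict(lambda: defaultdict(int))  # tf[term][page]
    for src, _, terms in links:
        for term in terms:
            tf[term][src] += 1
    entropy = {}
    for term, counts in tf.items():
        total = sum(counts.values())
        entropy[term] = -sum(
            (c / total) * math.log(c / total, n) for c in counts.values()
        )
    weighted = []
    for src, dst, terms in links:
        en = sum(entropy[t] for t in terms) / len(terms) if terms else 1.0
        weighted.append((src, dst, 1.0 - en))
    return weighted


def normalize(scores):
    """Scale scores to unit Euclidean length (all zeros stay zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {k: (v / norm if norm else 0.0) for k, v in scores.items()}


def hits(weighted_links, iterations=50):
    """Weighted HITS, Equation (5); with all weights 1.0 it is Equation (4)."""
    nodes = sorted({p for s, d, _ in weighted_links for p in (s, d)})
    hub = dict.fromkeys(nodes, 1.0)
    auth = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        new_auth = dict.fromkeys(nodes, 0.0)
        for s, d, w in weighted_links:
            new_auth[d] += hub[s] * w  # A(v) = sum of H(u) * alpha_uv
        auth = normalize(new_auth)
        new_hub = dict.fromkeys(nodes, 0.0)
        for s, d, w in weighted_links:
            new_hub[s] += auth[d] * w  # H(v) = sum of A(u) * alpha_vu
        hub = normalize(new_hub)
    return hub, auth


entropy_hub, entropy_auth = hits(anchor_weights(LINKS))
plain_hub, plain_auth = hits([(s, d, 1.0) for s, d, _ in LINKS])
```

Under these assumptions the entropy-weighted run ranks the TOC page P1 as the top hub and the article pages P3 and P4 as the top authorities, while the unweighted run gives the advertisement page P2 the highest authority.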
While using a mutual reinforcement approach, such as HITS, to perform the link analysis on pages of a Web site, the TKC effect [18] is more obvious than on the base set of a specific query. This is because the number of cycle links and nepotistic links dominates the set of links in a Web site. As defined in [18], a tightly-knit community is a small but highly interconnected set of sites. Sites in the TKC will score high in link analysis algorithms, even though they are not authoritative on the given topic. On mining the informative structure of a Web site, numerous intra-domain links will form several complex TKCs. They will unavoidably affect the result of the HITS algorithm significantly. The SALSA proposed in [18] is designed to resist this effect and will be empirically evaluated by our experimental studies later. The entropy-based SALSA can be described as below:

$$A(v) = \sum_{(u,v) \in E} \frac{H(u)}{\text{out-degree}(u)}\,\alpha_{uv} \quad \text{and} \quad H(v) = \sum_{(v,u) \in E} \frac{A(u)}{\text{in-degree}(u)}\,\alpha_{vu}. \tag{7}$$

The notion out-degree(u) means the number of out-links of node u, and the notion in-degree(u) is the number of links pointing to node u.

We apply Equation (6) on entropy values of anchors in Figure 2 to obtain the corresponding weights. Then we use Equations (4) and (5) to calculate values of hub and authority of all pages to show the effect of entropy-based weighting. In Table 2, hub and authority values are obtained after 0 iterations. Page P1 is ranked as the top-1 hub page by entropy-based HITS, and pages P3 and P4 are ranked with the highest authority as well. The result agrees with our expectation. It can also be seen that HITS ranks the advertisement page as the best authoritative page and news article pages as good hub ones.

Table 2: Results of link analysis of HITS and entropy-based HITS (columns: authority and hub under each method; rows: P0 to P4)

3. The Design of LAMIS
In this section, the LAMIS system is designed to explore hyperlink information to extract and identify the hub and authority pages. The only parameter for the extraction system is the starting URL of a Web site, and no manual intervention or prior knowledge about the Web site is required by LAMIS. We will introduce the system architecture in Section 3.1, and the process of our adaptive link analysis will be described in Section 3.2.
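Returning to Equation (7) of Section 2.2, the entropy-based SALSA update differs from the weighted HITS update only in the degree normalization. The following sketch shows that difference. It is an assumption-laden illustration: the function name `entropy_salsa`, the input format of `(source, target, alpha)` triples, the unit-length normalization and the two-link toy graph are not taken from the paper.

```python
import math
from collections import Counter


def normalize(scores):
    """Scale scores to unit Euclidean length (all zeros stay zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {k: (v / norm if norm else 0.0) for k, v in scores.items()}


def entropy_salsa(weighted_links, iterations=50):
    """Degree-normalized, entropy-weighted updates in the spirit of Equation (7).

    weighted_links: list of (source, target, alpha), alpha = 1 - EN(AN).
    """
    nodes = sorted({p for s, d, _ in weighted_links for p in (s, d)})
    out_degree = Counter(s for s, _, _ in weighted_links)
    in_degree = Counter(d for _, d, _ in weighted_links)
    hub = dict.fromkeys(nodes, 1.0)
    auth = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        new_auth = dict.fromkeys(nodes, 0.0)
        for s, d, alpha in weighted_links:
            # Each hub spreads its score over its out-links.
            new_auth[d] += hub[s] / out_degree[s] * alpha
        auth = normalize(new_auth)
        new_hub = dict.fromkeys(nodes, 0.0)
        for s, d, alpha in weighted_links:
            # Each authority spreads its score over its in-links.
            new_hub[s] += auth[d] / in_degree[d] * alpha
        hub = normalize(new_hub)
    return hub, auth


# Toy graph (an assumption): page "a" links to "b" with weight 1.0
# and to "c" with weight 0.5.
toy_hub, toy_auth = entropy_salsa([("a", "b", 1.0), ("a", "c", 0.5)])
```

In the toy graph the only hub is `a`, and the authority of `b` is twice that of `c`, mirroring the ratio of the two link weights.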
3.1. The System Architecture
Figure 3 shows the three main components in LAMIS, including (1) a Web crawler to crawl pages and build the link graph of a Web site, (2) a feature extractor module which extracts features of pages, including the in-degree and out-degree of pages, the length of context and terms in links, and (3) an entropy-based link analysis module which recognizes TOC pages and builds up the informative structure of a Web site. In the crawler module, a starting URL, i.e., the root node in the link graph of a Web site, is given and pages are crawled according to the link structure of the Web site. In LAMIS, we can assign different crawl depths to get a different view of the Web site. The deeper the crawl depth, the more precise the analysis will be, at the cost of higher computing complexity. From Equation (5), it can be verified that the complexity of link analysis algorithms is O(|E|). In our observation on several systematic Web sites, the informative structures of Web sites are mostly located from depth-1 to depth-2 of link graphs (the root node is depth-0). Therefore, without loss of generality the crawl depth is assigned to be three in our experiments.

Once a page has been crawled, related features in the page are extracted by the feature extractor module. Then, the anchor texts in all links are parsed to extract meaningful terms, and associated term frequencies are also counted. As we know, extracting English terms is relatively simple. By applying stemming algorithms and removing stop words based on a stop-list, English keywords (terms) can be extracted [2]. Extracting terms used in oriental languages seems to be more difficult because of the lack of separators in these languages. In LAMIS, we use an algorithm to extract keywords from Chinese sentences based on a Chinese term base which is generated via collecting hot queries, excluding stop words, from our search engine.
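The English term extraction step described above can be approximated as follows. This is only a sketch: the tiny `STOP_WORDS` set, the regular-expression tokenizer and the function names are assumptions, stemming is left out for brevity, and the Chinese term-base lookup used in LAMIS is not covered.

```python
import re
from collections import Counter

# A tiny illustrative stop-list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def extract_terms(anchor_text):
    """Lowercase the anchor text, split it into tokens and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", anchor_text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_document_counts(anchors_by_page):
    """Build term frequencies per page: result[term][page] = tf.

    anchors_by_page: dict mapping a page id to a list of anchor text strings
    (hypothetical input layout).
    """
    matrix = {}
    for page, anchor_texts in anchors_by_page.items():
        for text in anchor_texts:
            for term in extract_terms(text):
                matrix.setdefault(term, Counter())[page] += 1
    return matrix


sample_counts = term_document_counts({
    "P0": ["Hot news", "Sales"],
    "P3": ["Hot news", "Business news"],
})
```

The resulting nested counts play the role of the term-document matrix from which the term entropies are computed.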
Figure 3: Flowchart of LAMIS system (components: Crawler Module taking the starting URL of a Web site, Page Archive of HTML files, Content Block & Feature Extraction Module, Feature Database recording link relations, anchor text and features, and Information Structure Distillation Module with a Structure Learner that outputs the information structure of the Web site)

For each term, we maintain a term-document matrix (abbreviated as T-D matrix) to represent the corresponding term frequency. While associated features are being extracted and the T-D matrix is constructed, the entropy-based link analysis module ranks pages according to the values of hub and authority. TOC pages are selected from the set of high-ranking pages, and then we find corresponding article pages through informative links in TOC pages. The extracted sets of nodes and links form an information structure of the Web site. This in turn means that if we can rank and recognize TOC pages more precisely, we can get a more accurate informative structure directly.

Footnote 3: The searching service is a project sponsored by Yam. It served the Web users from November, 1998 to December.

3.2. The process of link analysis
In the entropy-based link analysis module, we calculate the values of hub and authority of each page according to Equations (5) and (6) after the link weights are assigned. In HITS, hub and authority will converge to the principal eigenvector of the link matrix [16]. The weighted HITS algorithm is also proved to converge if the weighting factors are positive [3]. In LAMIS, the weighting factor α is bounded in [0,1], so that the convergence of LAMIS follows. In our experiments, 9.7% of the hub values are zero after 0 iterations on average. Moreover, in our inspection, TOC pages tend to hold high hub values, and we hence use the ranked list to retrieve TOC pages among the whole page sets.

4. Experiments and Evaluation
In this section, we describe several experiments conducted on some real news Web sites to evaluate the performance improvement attained by LAMIS. The detail of datasets is described in Section 4.1. The improvement of the entropy-based link analysis is presented in Section 4.2.
We discuss one of the noise factors and devise a technique to remedy the effect in Section 4.3. Section 4.4 shows an overall comparison of all experiments.

4.1. Datasets
In our experiments, the datasets contain eight Chinese and five English news Web sites as described in Table 3. All of these news sites provide real-time news and historical news browsing services. News in these Web sites covers several domains, including politics, finance, sports, life, international issues, entertainment, health, cultures, etc., and is updated from time to time. In our experiments, the crawl depth is set to 3, and after pages have been crawled, the domain experts inspect the content of each page in Chinese news Web sites and mark the classes of pages to build the answer set of datasets according to previous experience and domain knowledge in the maintenance of the news search engine (NSE) of Yam. As shown in Table 3, the numbers of TOC pages vary among sites, even though the numbers of total pages of some sites are similar. As will be seen later, the diversity of information structures in the datasets in fact indicates the general applicability of LAMIS.

After extracting features from crawled pages, we compute the entropy values of extracted terms and anchors by Equations (2) and (3). As illustrated in Figure 4, on average, 7.6% of links are not informative, i.e. their entropy values are larger than 0.8. As we expect, they are mainly links in navigation panels, advertisements and copyright banners and are indeed pretty uniformly distributed among all pages.

Footnote 4: Pages of Web sites in datasets are crawled at 200/2/27 and 2002/4/. The datasets can be retrieved in our research site.
Footnote 5: The domain experts are the site managers of Yam News Search Engine (NSE).
Table 3: Datasets and related information (columns: Site Abbr., URL of news Web site, Total pages, TOC pages, Links, Content blocks)
Chinese sites: CDN, CTIMES (news.chinatimes.com), CNA, CNET (taiwan.cnet.com), CTS, TVBS, TTV, UDN (udnnews.com)
English sites: CNN, WP, LATIMES, CSMONITOR, DISPATCH (TOC pages: N/A*)
*: We only consider top-20 precision in experiments on English Web sites. Hence, we do not find out all TOC pages in English Web sites.

Figure 4: Accumulating distribution of link entropy (x-axis: link entropy; y-axis: accumulating distribution (%); series: CDN, CNA, CTS, TTV, CTIMES, CNET, TVBS, UDN)

4.2. The Improvement of Entropy-based Link Analysis
In order to show the improvement achieved by LAMIS, we construct several experiments under different criteria as listed in Table 4. We evaluate the performance of experiments by measuring the precision and recall of the hub-ranking list. For each Web site, we examine the precisions at standard recall levels, i.e., 0%, 10%, 20%, ..., 100%. Then, same as [2], we average the precision at each recall level as follows:

$$\bar{P}(r) = \sum_{i=1}^{N} \frac{P_i(r)}{N},$$

where $\bar{P}(r)$ is the average precision at the recall level $r$, $N$ is the number of datasets used, and $P_i(r)$ is the precision at recall level $r$ for the $i$-th dataset. For performance comparison among individual datasets, we use R-Precision and Precision Histograms for single value summaries [2]. R-Precision $RP_A(i)$ for algorithm $A$ over dataset $i$ is defined as the precision rate at the $R$-th position in the ranking, where $R$ is the total number of answers, and the precision histogram is defined as:

$$RP_{A/B}(i) = RP_A(i) - RP_B(i).$$

We scale up the precision histogram by the factor calculated on the number of answers in each dataset to highlight the improved number of retrieved answers.

At first, we compare the performances of HITS, SALSA, and LAMIS. In Table 4, LAMIS can be described as the combination PA-AEN-HITS, meaning entropy-based HITS in the page mode. The notation PA means that the algorithm is operated in the page mode, i.e. each page is treated as a node. Note that we use the notation LN (i.e., link normalization) to indicate the difference between HITS and SALSA, i.e., SALSA is equal to HITS-LN. In Figure 5, we can see that HITS does not work well since the average R-precision is only 0.27 (shown in Table 5).
Moreover, R-precisions of five datasets are smaller than 0.0, showing that HITS cannot be directly applied on mining the informative structures. However, though the improvements of the precision rates for both entropy-based HITS and SALSA over the original ones range from 0. to 0.2, all of these four algorithms perform poorly when the desired recall is larger than 0.7. It is observed that TOC pages have more links than article pages in general. This observation suggests ranking pages by the number of outlinks on a page, i.e. the out-degree of nodes in the link graph. It is interesting to see from Figure 5 that the simple heuristic OL performs much better than general link analysis algorithms when the desired recall is under 0.7, where the precision decreases suddenly when the recall is larger than 0.7. It is noted that there are some noise effects influencing the entropy-based link analysis and about 0% of TOC pages are hard to discriminate when these four algorithms are applied. In the following section, we will describe these noise effects and our solutions to these effects.

Table 4: The list of components of experiments (experiment abbreviation: description)
OL: rank by number of outlinks in a page
PA: the page mode
CB: the content block mode
HITS: Kleinberg's HITS
SALSA: the stochastic approach for link-structure analysis
LN: link normalization
AEN: weighted by anchor text entropy

Figure 5: The effect of weights on link analysis (axes: precision vs. recall; series: OL, HITS, HITS-LN (SALSA), LAMIS, LAMIS-LN)

4.3. Augmented Features of LAMIS
On inspection of the experimental results in Figure 5, we find several noise influences on link analysis algorithms during mining informative structures. The most influential effect will be examined in Section 4.3.1, and we shall devise one technique to remedy its effect.

4.3.1. Hybridization and infection of hubs and authorities
When the contexts of pages are more complex, pages may contain more hybrid characteristics of hub and authority. Such a page is called a hybridized page. For example, the leading story page in a news Web site contains informative links to other hot-news pages
and is considered as a TOC page to these hot-news pages. However, this page is in fact a highly authoritative page as well because the hottest news is also pointed to by other pages. Due to the influence of the mutual reinforcement relationship, the hybridization of hubs and authorities affects not only the characteristics of hybridized pages but also that of their neighboring pages. The infection will blur the values of hub and authority. To address this effect, we propose an adaptive approach, namely the content block mode (CB).

Content Block Mode
In Figure 6, we can see the difference of mutual reinforcement propagation between the page mode and the content block mode. In the page mode, the authority of P2, i.e., A_p2, can affect the value A_p3 through the hub of P1, i.e., H_p1. If Page p2 is authoritative, A_p3 will also be promoted, even though p3 is not an authoritative page. In the content block mode, we divide Page p1 into two parts: one contains a link to Page p2, and the other contains a link to Page p3. They are treated as separate nodes. Hence, the propagation of the high authority of Page p2 will now be terminated at CB1, and Page p3 will not be incorrectly promoted. The block level hub values also help us to extract the informative part of a page. The work [8] also proposes a fine-grained model based on the Document Object Model (DOM [23]) to perform micro-hub computation. In our approach, blocks in the content block mode are delimited by pre-defined HTML tags, e.g., <table>, rather than by the DOM tree used in [8]. It is shown by our experimental results that the delimitation mechanism performs well on discovering informative content blocks of Web documents.

The effect of the content block mode is shown in Figure 7. We note that LAMIS-LN-CB outperforms other schemes when the desired recall is larger than 0.6. We can find that the average precision of LAMIS-LN-CB is smaller than the one of LAMIS-LN when the recall rate is smaller than 0.. This is because LAMIS-LN-CB ranks two article pages of the TVBS and TTV news sites in the top-0 hub ranking.
Because the sizes of TOC answer sets of these two sites are small, i.e., 3 and 22 respectively, the corresponding precision rates decrease suddenly, which in turn reduces the average precision rate when the desired recall is smaller than 0.. The performance of the other two experiments in the content block mode is similar to that in the page mode, because the noise effects of TKC and redundant/irrelevant links conceal the improvement of the content block mode.

4.4. Overall Performance Comparison among All Schemes
We summarize the R-precision values of all experiments in Table 5. The augmented LAMIS (LAMIS-LN-CB), which integrates the content block mode with entropy-based link analysis, attains the best performance metrics. The average R-precision is 0.7. The augmented LAMIS is in fact ranked first for R-precision in four of eight Chinese datasets and ranked second in the other three, showing the improvement on R-precision over general HITS by a factor of .78. We can also see the improvement of augmented LAMIS over all Chinese news Web sites in Figure 8. It is shown that LAMIS based schemes in general outperform others and that the technique devised in Section 4.3.1 is powerful in dealing with such effect.

Figure 6: Propagations of mutual reinforcement on different modes (panels: Page Mode, where P1 links to P2 and P3; Content Block Mode, where blocks CB1 and CB2 of P1 link to P2 and P3)

Figure 7: Effects of the content block mode and the page mode (axes: precision vs. recall; series: HITS-CB, LAMIS, LAMIS-CB, LAMIS-LN, LAMIS-LN-CB)

Table 5: R-Precision of all experiments (columns: CDN, CTIMES, CNA, CNET, CTS, TVBS, TTV, UDN, AVG.; rows: outlinks, HITS, HITS-LN (SALSA), LAMIS, LAMIS-LN, the content block variants of HITS, LAMIS-CB, LAMIS-LN-CB)
OL: rank by number of outlinks in a page, PA: page mode, CB: content block mode, HITS: Kleinberg's HITS, LN: link normalization, HITS-LN = SALSA: the stochastic approach for link-structure analysis, AEN: weighted by anchor text entropy

Figure 8: R-Precision improvement of augmented LAMIS (x-axis: datasets CDN, CTIMES, CNA, CNET, CTS, TVBS, TTV, UDN, AVG.; y-axis: R-Precision; series: outlinks, HITS, HITS-LN (SALSA), LAMIS-LN-CB)

Figure 9: Top-20 precision of English news Web sites (y-axis: precision (%); series: LAMIS-LN-CB, outlinks, HITS, SALSA)
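The single-value summaries reported in Table 5 and Figure 9 (R-precision, the precision histogram, and top-20 precision) can be computed from a hub ranking and an answer set as sketched below. The function names and the five-page toy ranking are assumptions made for illustration, and the scaling of the histogram by the number of answers mentioned in Section 4.2 is not included.

```python
def r_precision(ranking, answers):
    """Precision at the R-th position, where R is the number of answers."""
    r = len(answers)
    if r == 0:
        return 0.0
    return sum(1 for page in ranking[:r] if page in answers) / r


def precision_at_k(ranking, answers, k=20):
    """Fraction of the top-k ranked pages that are answers (e.g. TOC pages)."""
    if k <= 0:
        return 0.0
    return sum(1 for page in ranking[:k] if page in answers) / k


def precision_histogram(ranking_a, ranking_b, answers):
    """RP_{A/B}(i) = RP_A(i) - RP_B(i) for one dataset."""
    return r_precision(ranking_a, answers) - r_precision(ranking_b, answers)


# Toy data (an assumption): three TOC pages "t*" and two article pages "a*".
toc_pages = {"t1", "t2", "t3"}
ranking_b = ["t1", "a1", "t2", "a2", "t3"]
ranking_a = ["t1", "t2", "t3", "a1", "a2"]
```

With these toy rankings, `ranking_b` places two of the three TOC pages in its first three positions, and a positive histogram value means that algorithm A retrieves more answers than algorithm B within the first R positions.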
We also conduct several experiments on English news Web sites and compare the top-20 precision rates in Figure 9. Some results are not as good as those on the Chinese news Web sites, though augmented LAMIS still emerges as the winner. The reason is that the noise effects described in Section 4.3 are prominent in English news Web sites and thus affect the top-20 ranking further. For example, four of the top-10 hub-ranking pages in DISPATCH and CSMONITOR are article pages with local related-news menus, and they appear to receive high hub values due to highly weighted links in the irrelevant block.

5. Conclusions

In this paper, we addressed the problem of mining the informative structure of Web sites and its corresponding issues. We devised an entropy-based link analysis mechanism to mine the informative structures with high precision. Our approach, i.e., LAMIS, incorporates the entropy value of a link, which is in essence the average of the entropy values of the terms in the anchor text, into the coefficients of the authority-hub equations so as to enhance the entropy-based link analysis for Web structure mining. LAMIS is further augmented to improve the precision and recall of Web structure mining. LAMIS gives a precise TOC recognition methodology and builds an informative structure from these TOC pages and the article pages that are linked by informative links in TOC pages. Experimental results on several news Web sites have shown that the augmented LAMIS is capable of mining the informative structures of news Web sites with very good precision. Moreover, based on the experimental studies, we enhance LAMIS and make it practically useful for mining real news Web sites.

6. References

[1] B. Amento, L. Terveen, and W. Hill. Does Authority Mean Quality? Predicting Expert Quality Ratings of Web Documents. Proc. of 23rd ACM SIGIR Conf. on Research and Development in Information Retrieval, 2000.
[2] R. Baeza-Yates, B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, 1999.
[3] K. Bharat, M. R. Henzinger. Improved Algorithms for Topic Distillation in a Hyperlinked Environment. Proc. of 21st ACM SIGIR Conf. on Research and Development in Information Retrieval, 1998.
[4] S. Brin, L. Page. The anatomy of a large-scale hypertextual Web search engine. Proc. of 7th World Wide Web Conference, 1998.
[5] A. Broder, S. Glassman, M. Manasse, G. Zweig. Syntactic Clustering of the Web. Proc. of 6th World Wide Web Conference, 1997.
[6] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, J. Wiener. Graph structure in the Web. Proc. of 9th World Wide Web Conference, 2000.
[7] S. Chakrabarti, M. Joshi, V. Tawde. Enhanced Topic Distillation using Text, Markup Tags, and Hyperlinks. Proc. of 24th ACM SIGIR Conf. on Research and Development in Information Retrieval, 2001.
[8] S. Chakrabarti. Integrating the Document Object Model with Hyperlinks for Enhanced Topic Distillation and Information Extraction. Proc. of 10th World Wide Web Conference, 2001.
[9] S. Chakrabarti, B. Dom, P. Raghavan, S. Rajagopalan, D. Gibson, J. M. Kleinberg. Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text. Proc. of 7th World Wide Web Conference, 1998.
[10] S. Chakrabarti, B. Dom, S. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, and J. M. Kleinberg. Mining the Web's link structure. IEEE Computer, 32(8), pages 60-67, August 1999.
[11] C. H. Chang, S. C. Lui. IEPAD: Information Extraction Based on Pattern Discovery. Proc. of 10th World Wide Web Conference, 2001.
[12] M.-S. Chen, J.-S. Park, and P. S. Yu. Efficient Data Mining for Path Traversal Patterns. IEEE Transactions on Knowledge and Data Engineering, 10(2), April 1998.
[13] V. Crescenzi, G. Mecca, and P. Merialdo. RoadRunner: towards automatic data extraction from large Web sites. Proc. of 27th International Conference on Very Large Data Bases, 2001.
[14] B. D. Davison. Recognizing Nepotistic Links on the Web. Proc. of AAAI, 2000.
[15] N. Kushmerick. Learning to remove Internet advertisements. Proc. of 3rd International Conf. on Autonomous Agents, 1999.
[16] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. ACM-SIAM Symposium on Discrete Algorithms, 1998.
[17] N. Kushmerick, D. Weld, and R. Doorenbos. Wrapper Induction for Information Extraction. In Proc. of the 15th International Joint Conference on Artificial Intelligence (IJCAI), 1997.
[18] R. Lempel, S. Moran. The Stochastic Approach for Link-Structure Analysis (SALSA) and the TKC effect. In 9th International World Wide Web Conference, Amsterdam, Netherlands, May 2000.
[19] W. S. Li, N. F. Ayan, O. Kolak, Q. Vu. Constructing Multi-Granular and Topic-Focused Web Site Maps. Proc. of 10th World Wide Web Conference, 2001.
[20] P. Pirolli, J. Pitkow, R. Rao. Silk from a sow's ear: Extracting usable structures from the Web. Proc. of ACM SIGCHI Conference on Human Factors in Computing, 1996.
[21] G. Salton. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison Wesley, 1989.
[22] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27, 1948.
[23] W3C DOM. Document Object Model (DOM).
