A PREDICTIVE MODEL FOR CUSTOMER PURCHASE BEHAVIOR IN E-COMMERCE CONTEXT


Jiangtao Qiu, School of Economic Information Engineering, Southwestern University of Finance and Economics, Chengdu, Sichuan, China

Abstract

Predicting customer purchase behavior is an interesting and challenging task. In the e-commerce context, tackling this challenge raises many new problems that differ from those in traditional business. This study investigates three factors that affect the purchasing decisions of customers in online shopping: the needs of customers, the popularity of products and the preferences of customers. Furthermore, exploiting purchase data and product ratings from an e-commerce website, we propose methods to quantify the strength of these factors: (1) using associations between products to predict the needs of customers; (2) combining collaborative filtering and a hierarchical Bayesian discrete choice model to learn the preferences of customers; (3) building a support vector regression based model, called the Heat model, to calculate the popularity of products; (4) developing an experimental platform based on a crowdsourcing approach to generate the training set for learning the Heat model. Combining these factors, a model called COREL is proposed to predict customer purchase behavior. Given a product purchased by a customer, the model returns the top n products the customer is most likely to purchase in the future. Experiments show that these factors play key roles in the predictive model and that COREL greatly outperforms the baseline methods.

Keywords: predictive model, purchase behavior, e-commerce

1 INTRODUCTION

If a firm is able to predict customers' purchase behavior, it benefits in several ways, such as a higher success rate in acquiring customers, increased sales and stronger competitiveness. Researchers in marketing and customer relationship management (CRM) have contributed to purchase behavior prediction. Market basket analysis [Raymond et al. 2005, Shu et al. 2011] examines the purchase lists of a supermarket or shop to identify purchase patterns. With these patterns, the needs of a customer can be predicted. Scholars in CRM employ data mining techniques to evaluate customer value [E. H. Suh et al. 1999, J. R. Bult et al. 1995, J. A. McCarty et al. 2007], helping firms to acquire customers or improve customer retention.

Unlike traditional businesses, e-commerce enterprises have no way to acquire information about customer demography, geography and family background. Instead, it is convenient to obtain reviews, product ratings and browsing tracks. Therefore, the methods and algorithms for customer purchase behavior prediction in traditional business are not suitable for the e-commerce context.

Is it possible to predict customer purchase behavior in the e-commerce context? Anand [Anand 2008] suggests that purchases in online shopping fall into two distinct types: firm-initiated purchases, which are a consequence of the firm making recommendations, and all other purchases. In our opinion, the second type may be further divided into self-initiated purchases and association-initiated purchases. A self-initiated purchase is not related to past purchase behavior, while an association-initiated purchase is associated with previous purchases. Let us envisage a scenario: a beginner photographer buys a camera; as his photography skills improve, a tripod may appear on his shopping list, and a remote shutter will probably follow.

As a tool for firm-initiated purchases, the recommender system is a technology widely adopted by e-commerce websites.
A recommender system [Linyuan et al. 2012, D. Jannach et al. 2011] predicts which items a customer will most probably like or be interested in, either by exploiting ratings made by customers with similar taste (collaborative filtering) or by using the customer's past ratings of other products (content-based recommendation). However, a recommender system generally predicts a rating for a candidate product, which only represents the impression the customer will have of the product. Such a rating is far from a prediction of purchase behavior. According to our experiments, using collaborative filtering as a tool for predicting customer purchase behavior leads to very poor performance.

Predicting self-initiated purchases is an extremely difficult task, since many factors can trigger a customer's desire to purchase and many of them cannot be observed in the online context. For example, a man may buy a TV set because the old one is out of order. Such a desire is almost impossible to predict in the e-commerce context. For association-initiated purchases, however, the products a customer purchased in the past may reveal his or her needs. Hence, it is viable to predict customer needs by exploring associations among products.

From our point of view, customers' purchase behaviors are generally motivated by their needs. Additionally, the popularity of products has an impact on the purchasing decisions of a customer. For example, a product in which no one has shown any interest is unlikely to arouse a customer's desire. Customers' preferences for products also play a key role in their purchasing decisions.

This study focuses on association-initiated purchase prediction in the e-commerce context. We propose a predictive model incorporating customer needs, the popularity of products and customer preference, which are usually ignored in previous work. When a product purchased by a customer is submitted to this model, it returns the top n products the customer is most likely to purchase in the future.

We explore associations between products and exploit them to predict customer needs. The popularity of a product is subject to many factors, such as the on-shelf date, the most recent review date, the ratings and the number of reviews. We develop a Support Vector Regression (SVR) based model, called the Heat model, to calculate the popularity of products. However, we failed to find a training dataset for the Heat model on e-commerce websites. In our experience, customers are generally able to make a correct judgment on the popularity of a product while shopping online. Relying on a crowdsourcing approach, we develop an experimental platform in which participants compare the popularity of a pair of products. The system converts the choices of participants into training instances for learning the parameters of the Heat model. Discrete choice analysis is commonly used to build behavioral models of customers' decision-making. We also combine collaborative filtering and a hierarchical Bayesian discrete choice model to learn customers' preferences for products.

2 RELATED WORK

Researchers in the marketing and retailing fields have devoted effort to predicting customers' purchase behavior, which may help firms to implement cross-selling and up-selling campaigns. Market basket analysis, or association rule mining, is a main technique for this prediction. It finds customer purchasing patterns by extracting associations or co-occurrences from stores' transactional databases [Raymond et al. 2005, Shu et al. 2011].

The CRM system of a firm generally needs to maintain a large amount of customer data, such as age, gender, income and purchase lists. Data mining techniques are often employed in analytical CRM to transform this large amount of data into valuable knowledge that supports marketing decision making. Based on such data mining techniques, customers may be segmented into clusters with internally homogeneous and mutually heterogeneous characteristics [Chihli et al. 2008]. Besides segmentation, customers can also be ranked by their probability of behaving in a certain way (e.g. 
buying a specific product or responding to a certain marketing campaign). With the help of these segmentation schemes and rankings, a firm is able to approach carefully selected customers, resulting in a higher success rate for its marketing campaigns [E. H. Suh et al. 1999]. Consequently, researchers often try to improve CRM by enhancing the data mining techniques themselves. Research in CRM has evolved from RFM models (i.e., recency, frequency and monetary value of customer purchases) to classification techniques such as chi-square automatic interaction detection (CHAID) and regression models [J. R. Bult et al. 1995, J. A. McCarty et al. 2007]. Recently, researchers have tried to outperform these basic techniques by introducing more advanced machine learning algorithms, such as support vector machines, neural networks and random forests [H. Shin et al. 2006, J. Zahavi et al. 1997].

In general, it is difficult to acquire customer demographic and private information such as income, age and gender in the e-commerce context. Instead, it is easier to acquire web access data such as product reviews and ratings, which provide richer information than traditional CRM data. Therefore, the prediction techniques in CRM that rely on customer data do not transfer well to the e-commerce domain.

Few studies have focused on predicting purchasing behavior in the e-commerce or online shopping context. [Poel et al. 2005] employ logit modeling to predict whether or not a purchase will be made during the next visit to the website. The model uses clickstream behavior, customer demography and historic purchase behavior as variables. Their study shows that clickstream behavior is important when determining the tendency to buy.

E-commerce firms generally use recommender systems for predicting customer purchase behavior. However, as discussed in Section 1, a recommender system that calculates a rating for a candidate product only predicts the impression the customer will have of the product, which is far from predicting the customer's purchase behavior.
There are many excellent surveys of recommender systems [Linyuan et al. 2012, D. Jannach et al. 2011]. We do not discuss them in detail in this paper.

3 CUSTOMER PURCHASE BEHAVIOR PREDICTIVE MODEL

This section proposes a customer purchase behavior prediction model, COREL (CustOmer purchase prediction model). Let $c_k$ be a customer and let $d_i$ and $d_j$ be products. When $c_k$ purchases $d_i$ at time $t$, COREL returns the top n products that $c_k$ is most likely to purchase after time $t$. Given that $c_k$ bought $d_i$ at time $t$, the probability that $c_k$ will also purchase $d_j$ after time $t$ may be written as

\[ p(d_j \mid c_k, d_i) = \frac{p(d_i, c_k \mid d_j)\, p(d_j)}{p(d_i, c_k)} \]

Suppose $c_k$ and $d_i$ are independent of each other, i.e. $c_k$ may purchase any product at time $t$. The probability then becomes

\[ p(d_j \mid c_k, d_i) = \frac{p(d_j \mid d_i)\, p(d_j \mid c_k)}{p(d_j)} \]

where $p(d_j \mid c_k)$ is the probability that $c_k$ will purchase $d_j$, $p(d_j \mid d_i)$ is the probability that a customer who bought $d_i$ will also purchase $d_j$, and $p(d_j)$ is the prior probability of $d_j$.

Let $\omega = \{d_1, \ldots, d_{i-1}, d_{i+1}, \ldots, d_m\}$ be a collection of candidate products. We calculate $p(d_j \mid c_k, d_i)$ for each product $d_j \in \omega$ and then rank the products. When the prior probability $p(d_j)$ is assumed to be uniform across all products, it can be ignored. Therefore COREL may be written as

\[ p(d_j \mid c_k, d_i) \propto p(d_j \mid c_k)\, p(d_j \mid d_i) \]

COREL can be understood as a two-stage approach: $p(d_j \mid d_i)$ is used to build a product collection $\omega$ whose products are associated with $d_i$, and $p(d_j \mid c_k)$ is used to pick the candidates in $\omega$ that are most likely to be purchased. Therefore, the most important task in building COREL is estimating the two parameters $p(d_j \mid d_i)$ and $p(d_j \mid c_k)$.

3.1 Estimating $p(d_j \mid d_i)$

Parameter $p(d_j \mid d_i)$ represents the probability that $d_j$ will also be purchased by the same customer after $d_i$ is purchased. The parameter can be estimated by exploring the association between $d_i$ and $d_j$, which may be calculated via market basket analysis. When both products occur in the same market basket, it is generally thought that an association exists between them. Using maximum likelihood estimation,

\[ p(d_j \mid d_i) = \frac{|d_i \cap d_j|}{|d_i|} \qquad (1) \]

where $|d_i|$ denotes the number of times product $d_i$ is purchased and $|d_i \cap d_j|$ is the number of times products $d_i$ and $d_j$ co-occur in one market basket.
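As a rough illustration of formula (1) and of the two-stage ranking, the following minimal sketch counts co-occurrences in a list of baskets and ranks candidates by $p(d_j \mid c_k)\, p(d_j \mid d_i)$. The function names, the toy baskets and the `p_customer` values are hypothetical and are not taken from the paper; each basket is assumed to be a list of product identifiers.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_probabilities(baskets):
    """Estimate p(d_j | d_i) = (#baskets with both) / (#baskets with d_i)."""
    single = Counter()
    pair = Counter()
    for basket in baskets:
        items = set(basket)                  # ignore duplicates inside a basket
        single.update(items)
        pair.update(permutations(items, 2))  # ordered pairs (d_i, d_j), i != j
    return {(di, dj): n / single[di] for (di, dj), n in pair.items()}


def rank_candidates(d_i, p_assoc, p_customer, n):
    """Rank candidates d_j by p(d_j | c_k) * p(d_j | d_i) and keep the top n."""
    scores = {
        dj: p * p_customer.get(dj, 0.0)
        for (di, dj), p in p_assoc.items()
        if di == d_i
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical toy data: four baskets and made-up values of p(d_j | c_k).
baskets = [
    ["camera", "tripod"],
    ["camera", "tripod", "remote_shutter"],
    ["camera", "memory_card"],
    ["tripod", "remote_shutter"],
]
p_assoc = cooccurrence_probabilities(baskets)
p_customer = {"tripod": 0.5, "remote_shutter": 0.9, "memory_card": 0.2}
top2 = rank_candidates("camera", p_assoc, p_customer, n=2)
```

In this toy example the camera appears in three baskets and co-occurs with the tripod in two of them, so the estimate of p(tripod | camera) is 2/3.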
However, experiments show that the candidate collection built using formula (1) is so small that COREL fails to achieve good performance. Therefore, we suggest building associations between categories and then picking candidates from the categories associated with a product. Generally, e-commerce websites assign multilevel categories to their products. For instance, Jingdong has three levels of categories for its products. For the item EPSON LQ-630k Printer, the categories from the first level to the third level are Computer or Office Equipment -> Printing-related Office Equipment -> Printer. We generate category associations at the third level of categories. Let $Thr(d_i)$ denote the third-level category of product $d_i$. Then

\[ p(d_j \mid d_i) = \frac{|Thr(d_i) \cap Thr(d_j)|}{|Thr(d_i)|} \qquad (2) \]
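The category-level estimate of formula (2) can be sketched in the same way as formula (1), with each product replaced by its third-level category before counting. This is only an illustration under assumptions: the counts are taken as the number of baskets containing a category, and the mapping `third_level`, the function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def category_associations(baskets, third_level):
    """Estimate p(cat_j | cat_i) from category co-occurrence within baskets."""
    single = Counter()
    pair = Counter()
    for basket in baskets:
        cats = {third_level[d] for d in basket}
        single.update(cats)
        pair.update(permutations(cats, 2))
    return {(ci, cj): n / single[ci] for (ci, cj), n in pair.items()}


def candidate_products(d_i, assoc, third_level, top_categories):
    """Collect all products in the categories most associated with Thr(d_i)."""
    ci = third_level[d_i]
    ranked = sorted(
        ((p, cj) for (c, cj), p in assoc.items() if c == ci), reverse=True
    )
    keep = {cj for _, cj in ranked[:top_categories]}
    return sorted(d for d, c in third_level.items() if c in keep and d != d_i)


# Hypothetical catalogue and baskets.
third_level = {
    "cam_a": "Camera", "cam_b": "Camera",
    "tripod_a": "Tripod", "tripod_b": "Tripod",
    "card_a": "Memory Card",
}
baskets = [["cam_a", "tripod_a"], ["cam_b", "tripod_b"], ["cam_a", "card_a"]]
assoc = category_associations(baskets, third_level)
candidates = candidate_products("cam_a", assoc, third_level, top_categories=1)
```

Here `tripod_b` never shares a basket with `cam_a`, so a product-level count would miss it, while the category-level association still places it in the candidate set. This is the enlargement of the candidate collection that the text describes.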

Experiments reported in Section 4 demonstrate that category associations extend the candidate collection. COREL achieves better performance by picking the top n associated categories than by using formula (1).

3.2 Estimating $p(d_j \mid c_k)$

Parameter $p(d_j \mid c_k)$ indicates the probability that customer $c_k$ will purchase product $d_j$. However, it is almost impossible to estimate this parameter accurately. From our experience of online shopping, the parameter is subject to two factors: the popularity of $d_j$ and the preference of $c_k$. Supposing both factors are independent of each other, $p(d_j \mid c_k)$ is approximately computed with formula (3):

\[ p(d_j \mid c_k) \approx Hot(d_j) \cdot Preference(c_k, d_j) \qquad (3) \]

It is believed that a product that is purchased more frequently and has higher ratings than others will be more popular in customers' purchase decisions. Hence we develop a model, called the Heat model $Hot(d_j)$, to calculate the popularity of products. A customer's preference is also an important factor in his or her purchase decision. We propose an approach to learn customers' preferences for products, $Preference(c_k, d_j)$, which is presented in Section 3.2.2.

3.2.1 Heat Model

Besides the needs of a customer, the number of reviews and the ratings of a product also play important roles in the customer's purchase decision. If a product has many low ratings or only sparse reviews, a customer will hesitate to purchase it even if he has a strong need for it. We exploit the reviews and sale information of a product to calculate its popularity. They include (1) Qr, the number of reviews; (2) Qs, the average rating; (3) Qa, the number of days since the product went on shelf; (4) Qu, the number of days since the most recent review. We represent the product $d_j$ as a vector of four elements (Qr, Qs, Qa, Qu). Previous research has shown that SVR (Support Vector Regression) is an excellent tool for predictive tasks. We develop an SVR-based model, called the Heat model $Hot(d_j)$, to calculate the popularity of products.
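The four-element product vector (Qr, Qs, Qa, Qu) described above might be assembled from raw review records roughly as follows. The `Review` record, the function name and the fallback used for a product without reviews (Qs = 0 and Qu = Qa) are assumptions made for this sketch; the paper does not say how such products are handled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    rating: float      # rating given in the review
    written_on: date   # date the review was written


def heat_features(reviews, on_shelf, today):
    """Return (Qr, Qs, Qa, Qu) for one product."""
    qr = len(reviews)                                  # number of reviews
    qs = sum(r.rating for r in reviews) / qr if qr else 0.0  # average rating
    qa = (today - on_shelf).days                       # days since on-shelf
    # Days since the most recent review; assumed equal to Qa if never reviewed.
    qu = (today - max(r.written_on for r in reviews)).days if qr else qa
    return (qr, qs, qa, qu)


# Hypothetical product with two reviews.
features = heat_features(
    [Review(5, date(2012, 6, 1)), Review(3, date(2012, 7, 1))],
    on_shelf=date(2012, 1, 1),
    today=date(2012, 7, 31),
)
```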
Given a product with its four attributes Qr, Qs, Qa and Qu, the Heat model calculates a score for its popularity. A training set is necessary for learning the Heat model. As far as we know, no e-commerce website provides labeled data about the popularity of products. However, we observe that a visitor can usually decide which of two products is more popular when shopping online. Based on this observation, we use the following four steps to generate the Heat model.

Step 1: Relying on a crowdsourcing approach, we develop a platform in which participants pick the more popular product from a pair of products displayed on a web page. The interface of the system is shown in Figure 1. The system works as follows: it picks any two products A and B from the product database; it displays the factors Qr, Qs, Qa and Qu of both products on the web page; a participant chooses the more popular one; if Hot(A) > Hot(B), the system generates two instances of the training set:

Err_Qr          Err_Qs          Err_Qa          Err_Qu          label
Qr(A) - Qr(B)   Qs(A) - Qs(B)   Qa(A) - Qa(B)   Qu(A) - Qu(B)    1
Qr(B) - Qr(A)   Qs(B) - Qs(A)   Qa(B) - Qa(A)   Qu(B) - Qu(A)   -1

In the resulting training set, each instance includes five fields: Err_Qr, Err_Qs, Err_Qa, Err_Qu and label. Qr(A) denotes the element Qr of vector A.
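The conversion of one participant choice into two training instances can be sketched as below. The function name and the example vectors are hypothetical; each product is assumed to be given as a tuple (Qr, Qs, Qa, Qu).

```python
def pairwise_instances(a, b, a_is_more_popular):
    """Turn one comparison of product vectors a and b into two instances.

    Each instance is (Err_Qr, Err_Qs, Err_Qa, Err_Qu, label): the winner-minus-
    loser differences get label 1, the mirrored differences get label -1.
    """
    winner, loser = (a, b) if a_is_more_popular else (b, a)
    diff = tuple(w - l for w, l in zip(winner, loser))
    return [diff + (1,), tuple(-x for x in diff) + (-1,)]


# Hypothetical pair: the participant judged A to be the more popular product.
A = (120, 4.5, 300, 2)   # (Qr, Qs, Qa, Qu)
B = (15, 3.0, 400, 40)
instances = pairwise_instances(A, B, a_is_more_popular=True)
```

Emitting the mirrored instance keeps the training set symmetric, so a model fitted on it is not biased by the order in which the two products were displayed.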

Figure 1. A crowdsourcing-based experimental platform

Step 2: We build a logistic regression model $f(\varphi)$ that compares the popularity of two products. $\varphi$ is a vector whose elements, denoted Err_Qr, Err_Qs, Err_Qa and Err_Qu, represent the differences between the corresponding elements of the two compared product vectors:

\[ f(\varphi) = \frac{\exp(\theta(\varphi))}{1 + \exp(\theta(\varphi))} \]

where

\[ \theta(\varphi) = \beta_0 + \beta_1\, Err\_Qr + \beta_2\, Err\_Qs + \beta_3\, Err\_Qa + \beta_4\, Err\_Qu \]

We employ the training set obtained in Step 1 to train the logistic regression model.

Step 3: An algorithm that uses the logistic regression model $f(\varphi)$ to calculate the popularity of products is described as follows.

Algorithm 1. Calculate Popularity of Products
Input: a collection of products $\omega$, logistic regression model $f(\varphi)$
Output: the popularity of products in $\omega$
Steps:
1. P = []
2. For each pair <a, b>, a, b in $\omega$, a != b
3.     $\varphi$ = V(a) - V(b)
4.     score = f($\varphi$)
5.     P[a] = P[a] + score - 0.5; P[b] = P[b] + 0.5 - score
6. End
7. Normalize P to the range [0, 1]
8. Return P

In Algorithm 1, the array P stores the calculated popularity of all products in $\omega$, in the range [0, 1].

Step 4: Using Algorithm 1, we calculate the popularity of each product in the set $\omega$ and then generate a training set for the SVR model. Each instance in the training set has the following fields, where score refers to the popularity of a product and Ln(Qr) is the natural logarithm of the Qr attribute:

Ln(Qr)   Qs   Ln(Qa)   Ln(Qu)   score

In this study, we explore ε-SVR and ν-SVR, each combined with the polynomial kernel and the radial basis function kernel. Since there is little general guidance for determining the parameters of SVR, this study varies the parameters to select optimal values for the

best prediction performance. Experimental results show that ε-SVR with the radial basis function reaches the best performance in our study. This study uses the LIBSVM software system [Chih et al. 2011] to perform the experiments. Given a set of data points $\{(X_1, z_1), \ldots, (X_l, z_l)\}$ such that $X_i \in R^n$ is an input and $z_i \in R^1$ is a target, the standard form of ε-SVR is

\[ \min_{w, b, \xi, \xi^*} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i + C \sum_{i=1}^{l} \xi_i^* \]

subject to

\[ w^T \phi(X_i) + b - z_i \le \epsilon + \xi_i, \]
\[ z_i - w^T \phi(X_i) - b \le \epsilon + \xi_i^*, \]
\[ \xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, l. \]

When the Heat model reaches its best performance, the parameters of ε-SVR are C = 1 and ε = [...].

3.2.2 Learning Customer Preference

Economic models of choice typically assume that an individual's latent utility is a function of brand and attribute preferences [Sha et al. 2003]. Collaborative filtering (CF) may be used to estimate a customer's rating for a product in e-commerce by exploiting the product ratings made by customers with similar taste. However, CF does not consider a customer's preference for the price and brand of products, which play an important role in the customer's purchase decision. We predict $c_k$'s rating for $d_j$ using collaborative filtering, $CF(c_k, d_j)$, and then propose a hierarchical Bayesian discrete choice model to learn the preference of customers for price and brand, $DC(c_k, d_j)$. The product $CF(c_k, d_j) \cdot DC(c_k, d_j)$ represents the preference of customer $c_k$ for product $d_j$. $CF(c_k, d_j)$ is calculated with formula (4):

\[ CF(c_k, d_j) = \frac{\sum_{s \in S} Sim(c_k, s)\, rating(s, d_j)}{|S|} \qquad (4) \]

where $S$ denotes the set of the 10 customers most similar to $c_k$, and $rating(s, d_j)$ is the rating that customer $s$ gives to product $d_j$. The possible rating values are defined on a numerical scale from 0 (strongly dislike) to 5 (strongly like). $Sim(c_k, s)$ indicates the similarity between customers $c_k$ and $s$, which is calculated using the cosine measure. A customer feature vector is defined as the set of the customer's product ratings. For example, the feature vector $V(c_k) = (0, 4, 1, 0, 5)$ indicates that $c_k$ did not purchase product $d_1$ (or gave it a rating of 0), gave $d_2$ a rating of 4, etc.
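Formula (4), with the cosine measure as the similarity, can be illustrated with the short sketch below. It assumes each customer is represented by a rating vector of equal length, indexed by product position; the function names and the two neighbour vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two rating vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cf_score(target, others, product, top_k=10):
    """Average of Sim * rating over the top_k customers most similar to target."""
    neighbours = sorted(others, key=lambda v: cosine(target, v), reverse=True)[:top_k]
    if not neighbours:
        return 0.0
    total = sum(cosine(target, v) * v[product] for v in neighbours)
    return total / len(neighbours)


# Hypothetical data: the target vector is the example from the text.
target = (0, 4, 1, 0, 5)
others = [(3, 4, 0, 0, 5), (5, 0, 0, 2, 0)]
score_d1 = cf_score(target, others, product=0)  # predicted score for d_1
```

With these values the first neighbour has a similarity of about 0.89 to the target and the second has a similarity of 0, so the score for the first product is roughly 1.34.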
\[ Sim(c_k, c_l) = \frac{V(c_k) \cdot V(c_l)}{\|V(c_k)\| \, \|V(c_l)\|} \qquad (5) \]

Experiments reported in Section 4.2 analyze how CF affects the performance of COREL. The results show that a model combining $p(d_j \mid d_i)$ with CF outperforms the basic models that use only $p(d_j \mid d_i)$ or only CF for predicting customer purchasing behavior.

We propose a hierarchical Bayesian discrete choice model to learn customer $c_k$'s preference for price and brand. With this model, we calculate the extent to which customer $c_k$ prefers a product $d_j$, $DC(c_k, d_j)$. We divide the price and the brand of every product into three levels each: high, medium and low price; large, moderate and small brand. In this way, the feature vector $x$ of a product $d_j$ has six binary features x = (p_h, p_me, p_lo, b_la, b_mo, b_sm) corresponding to the three price levels and three brand

levels, respectively. Exactly one of the three price levels in the feature vector has the value 1 while the others are 0. For example, (p_h=1, p_me=0, p_lo=0) indicates that the price of a product is at the high level. Brand features follow the same rule. For example, (b_la=0, b_mo=0, b_sm=1) means that a product belongs to a small brand.

\[ DC(c_k, d_j) = P(y_j = 1) = \frac{1}{1 + \exp(-V(d_j, c_k))} \qquad (6) \]

with the utility function

\[ U(d_j, c_k) = V(d_j, c_k) + e_k \]
\[ V(d_j, c_k) = \beta_1\, p\_h + \beta_2\, p\_me + \beta_3\, p\_lo + \beta_4\, b\_la + \beta_5\, b\_mo + \beta_6\, b\_sm \]

$P(y_j = 1)$ denotes the probability of selecting product $d_j$. The six coefficients $\beta_1 \sim \beta_6$ in the utility function are determined by the features of the customer, so each customer has a utility function with different coefficients. We use the following features to construct a customer feature vector.

R (Recency): the number of months that have passed since the customer's last purchase.
F (Frequency): the number of purchases in the last 12 months.
M (Monetary): the amount spent by the customer in the last 12 months.
Sd: the standard deviation of the prices of all products purchased by the customer. This value reveals the customer's online shopping habits. A smaller Sd means that the customer tends to purchase a fixed variety of products, while a larger Sd indicates that the customer does not mind the price of products when shopping online.
Age: the time interval in years from the date of the customer's first purchase on the website to the current date.

Every customer may have his or her own preference for the price and brand of products. For example, some customers prefer large-brand products, while others do not care about the brand as long as the product is cheap. In the hierarchical Bayesian model, the coefficients in the utility function are determined by customers' features. Let $B$ denote $\beta_1 \sim \beta_6$:

\[ B = Z\Delta + U; \quad u_n \sim N(0, V_\beta) \]

The matrix $Z$ contains the features of customers. The coefficient matrix $\Delta$ has a normal distribution with mean $vec(\bar{\Delta})$ and a covariance matrix given by the Kronecker product of $A^{-1}$ and $V_\beta$.
\[ \beta_n \sim N(\Delta' z_n, V_\beta) \]
\[ vec(\Delta) \sim N(vec(\bar{\Delta}), A^{-1} \otimes V_\beta) \]
\[ V_\beta \sim IW(\nu, V) \]

The vec operator creates a column vector from a matrix by stacking the column vectors of the matrix [Jan et al. 2007]. The hyperparameter $V_\beta$ has an inverted Wishart prior. We set the noninformative priors $\nu$, $V$, $\bar{\Delta}$ and $A$ to $\nu = m + 3$, $V = \nu I$, $\bar{\Delta} = 0$ and $A = 0.01$, where $m$ is the number of coefficients in the utility function. The parameters of the hierarchical Bayesian model can be described using a DAG (Directed Acyclic Graph), shown in Figure 2.
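To make the link between customer features and choice probability concrete, the sketch below computes the mean coefficient vector $\Delta' z_n$ for one customer (the random term $u_n$ is left out) and plugs it into formula (6). All numbers are invented for illustration: `delta` is a hypothetical 2 x 6 matrix and `z` a hypothetical two-feature customer vector, whereas the paper uses the five features R, F, M, Sd and Age.

```python
import math


def mean_coefficients(delta, z):
    """Return Delta' z: one coefficient per product feature (column of delta)."""
    n_coef = len(delta[0])
    return [sum(delta[r][c] * z[r] for r in range(len(z))) for c in range(n_coef)]


def choice_probability(beta, x):
    """Formula (6): P(y = 1) = 1 / (1 + exp(-V)), with V = beta . x."""
    utility = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-utility))


# Hypothetical values; feature order is (p_h, p_me, p_lo, b_la, b_mo, b_sm).
delta = [
    [-1.0, 0.5, 0.2, 0.3, 0.0, -0.3],
    [-0.5, 0.25, 0.1, 0.0, 0.1, 0.0],
]
z = [1.0, 2.0]
beta = mean_coefficients(delta, z)

p_high = choice_probability(beta, (1, 0, 0, 1, 0, 0))    # high price, large brand
p_medium = choice_probability(beta, (0, 1, 0, 1, 0, 0))  # medium price, large brand
```

For this hypothetical customer the coefficient on p_h is negative, so the high-priced product gets a choice probability below 0.5 and the medium-priced product one above 0.5.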

Figure 2. DAG of parameters in the hierarchical Bayesian model

We employ the MCMC Metropolis-Hastings algorithm to estimate the parameters of the hierarchical Bayesian model, using a normal distribution as the proposal distribution. The log-likelihood function is

\[ L(X, Y, B) = \sum_j \log\big( p(x_j)\, y_j + (1 - p(x_j))(1 - y_j) \big) \]
\[ p(x_j) = \frac{\exp(x_j B)}{1 + \exp(x_j B)} \]

The steps for estimating the parameters of the model are as follows.

Algorithm 2. Use the MCMC Metropolis-Hastings algorithm to estimate parameters
Steps:
1. Initialize $\beta_{old}$
2. Draw $V_\beta$ from $V_\beta \mid \nu, V \sim IW(\nu + n, V + S)$
3. Draw $\Delta$ from $vec(\Delta) \mid \bar{\Delta}, A, V_\beta \sim N(vec(\bar{\Delta}), A^{-1} \otimes V_\beta)$
4. Draw $\beta_{new} \sim N(\beta_{old}, V_\beta)$
5. Compute $\alpha(\beta_{old}, \beta_{new}) = \min\big(1,\; p(\beta_{new})\, q(\beta_{new}, \beta_{old}) \,/\, p(\beta_{old})\, q(\beta_{old}, \beta_{new})\big)$, where
   $p(\beta_{new}) / p(\beta_{old}) = \exp\big(L(X, Y, \beta_{new}) - L(X, Y, \beta_{old})\big)$,
   $q(\beta_{new}, \beta_{old}) / q(\beta_{old}, \beta_{new}) = \exp\{(\beta_{new} - \Delta' Z)\, V_\beta\, (\beta_{new} - \Delta' Z)' - (\beta_{old} - \Delta' Z)\, V_\beta\, (\beta_{old} - \Delta' Z)'\}$
6. If $\alpha$ < 1 then
7.     $\beta_{old} = \beta_{new}$ with probability $\alpha$
8. else
9.     $\beta_{old} = \beta_{new}$
10. End
11. Go to step (2) until the loop ends

Using the saved draws, we can plot the posterior distributions of the coefficients. Figure 3 illustrates the posterior distributions of the three coefficients p_h, p_me and p_lo for one customer. The means of the three distributions are about -2.7, 0.8 and 0.5, respectively. From these point estimates, we can conclude that the customer generally rejects high-priced products and tends to prefer medium-priced products to low-priced ones. The trained model thus reveals more information about customers.

Figure 3. Posterior distribution of coefficients of a customer

To learn a hierarchical Bayesian discrete choice model, it is necessary to know the choices of customers within a finite alternative set. In the e-commerce context, however, we only know what customers purchased, not what they gave up in their choice. When training a hierarchical

Bayesian discrete choice model, both positive and negative samples are necessary. Regarding the purchased products as positive data, we develop a technique to generate one negative instance from each positive one. One instance in the training dataset is the feature vector of a purchased product combined with a label. The six features p_h, p_me, p_lo, b_la, b_mo and b_sm in an instance represent the price level and brand level of a product, respectively:

p_h   p_me   p_lo   b_la   b_mo   b_sm   label

When each feature in the positive instance is inverted, we derive a negative instance:

p_h   p_me   p_lo   b_la   b_mo   b_sm   label

4 EXPERIMENTS

4.1 Dataset

Jingdong is a well-known B2C e-commerce website in P.R. China. We collected customer information and product reviews from the website. The collected data contain 727,878 product items, 342,451 customers and 14,634,059 reviews from 2004 to 31 January [...]. The products in Jingdong are assigned three levels of categories. There are 19 first-level categories, 124 second-level categories and 1078 third-level categories. It is difficult to collect purchase data of customers directly from an e-commerce website, as such data are generally regarded as private. Our study is based on an assumption: if a customer frequently writes reviews on an e-commerce website, his reviews cover almost all of his purchased products (on the Jingdong website, only customers who purchased a product are authorized to write a review for it). Therefore, we pick customers with a high reviewing frequency and generate purchase data from their reviews. In addition, 55 participants used our crowdsourcing platform to generate training data comprising 1351*2 instances.

The collected data are processed as follows.
(1) Dividing the dataset: Divide the dataset into three sections by date. Section A: before 30 June 2012; Section B: from 30 June 2012 to 31 July 2012; Section C: after 31 July 2012.
(2) Picking customers: We pick customers who wrote reviews in all three sections and whose number of reviews in Section A is more than 30.
A total of 2770 customers meet the requirement.
(3) Test set: the products reviewed in Section C by the picked customers.
(4) Training set: the purchase data of the picked customers in Section A.
(5) Target set: Section C.
(6) Setting the price level of every product: Let d be a product and thr its third-level category. If the price of d is at or above the 75th percentile of the prices of all products in thr, we set the features of d to p_h=1, p_me=0 and p_lo=0. If the price of d is below the 25th percentile, the features are p_h=0, p_me=0 and p_lo=1. Otherwise, p_h=0, p_me=1 and p_lo=0.
(7) Setting the brand level of every product: We examine the distribution of the number of product items over all brands. If the number of product items of a brand is above the 75th percentile of the distribution, we set the features of all products belonging to the brand to b_la=1, b_mo=0 and b_sm=0. If it is below the 25th percentile of the distribution, the features are b_la=0, b_mo=0 and b_sm=1. Otherwise, b_la=0, b_mo=1 and b_sm=0.

We call the processed data the JD dataset.
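The price-level rule of step (6) and the generation of a negative instance by inverting a positive one can be sketched as follows. The quartile method (`method="inclusive"`), the label values 1 and 0, the function names and the example prices are assumptions of this sketch; the paper does not specify them.

```python
import statistics


def price_level(price, category_prices):
    """Return (p_h, p_me, p_lo) from the 25th/75th percentiles of the category."""
    q1, _, q3 = statistics.quantiles(category_prices, n=4, method="inclusive")
    if price >= q3:
        return (1, 0, 0)   # high price
    if price < q1:
        return (0, 0, 1)   # low price
    return (0, 1, 0)       # medium price


def negative_instance(positive):
    """Invert every binary feature of a positive instance; label assumed 0."""
    *features, _label = positive
    return tuple(1 - f for f in features) + (0,)


# Hypothetical prices of all products in one third-level category.
category_prices = [100, 200, 300, 400, 500]
levels = [price_level(p, category_prices) for p in (450, 300, 150)]

# Hypothetical positive instance: high price, moderate brand, label 1.
positive = (1, 0, 0, 0, 1, 0, 1)
negative = negative_instance(positive)
```

Note that the inverted vector switches on two price levels and two brand levels at once, so it describes the complement of the purchased product rather than any single real product.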

4.2 Experimental Results

This section investigates the performance of COREL in predicting customer purchase behavior. We compare COREL to several baseline methods. The models are shown in Table 1.

Table 1. Prediction models
name   model                    description
M1     p(d_j|d_i)               Using formula (1) to pick associated products
M2     p(d_j|d_i)*CF            Using formula (2) to build the candidate set and employing collaborative filtering to pick products from the candidates
M3     p(d_j|d_i)*CF*Hot        A model combining M2 with the Heat model
M4     COREL                    Incorporating customers' price and brand preference

The four models in Table 1 make predictions on the JD dataset. The models use the latest product $d_i$ purchased in Section B by customer $c_k$ to predict the products purchased by $c_k$ in Section C. The candidates are generated by picking the ten categories most associated with $d_i$. Each model calculates a prediction score for all candidates and picks the top n candidates to build a product subset $\omega$. If any product in $\omega$ occurs in the test set $\Phi$, we say that customer $c_k$ is successfully predicted. Using precision as the measure, we present the experimental results in Table 2.

Table 2. Prediction performance of the four models
n     M1      M2      M3      M4
1     [...]   9.9%    11.4%   15.8%
3     6.8%    13.1%   15.7%   17.5%
5     8.9%    19.9%   23.5%   23.5%
10    [...]   27.7%   32.9%   32.2%

Table 2 shows that market basket analysis (M1) has the poorest performance in predicting the purchasing behavior of customers. Combining collaborative filtering with market basket analysis (M2) dramatically improves performance. Combining M2 with the Heat model (M3) increases predictive performance further. Model M4 incorporates the price and brand preferences of customers. Brand and price preference do not bring a significant improvement when n=10, but M4 outperforms the other three models when n=3 and n=1. This indicates that the proposed model COREL is feasible and effective in predicting customers' purchase behavior.

5 CONCLUSION

Researchers from the marketing and CRM fields have made many significant contributions to customer purchase behavior prediction for traditional business.
However, in the e-commerce context, new methods and techniques need to be developed to deal with this predictive problem. In this study, we investigate several key factors that have an impact on the purchasing decisions of customers in the e-commerce context, including the needs of customers, the popularity of products and the preferences of customers. Furthermore, exploiting purchase data and product ratings, we propose methods to quantify the strength of these factors.

We believe that association-initiated purchases exist in online shopping and can be exploited to predict the needs of customers. The experiments in this study support this view. It is reasonable to divide online shopping into three types: firm-initiated purchases, self-initiated purchases and association-initiated purchases. The experiments also show that associations between product categories significantly improve predictive performance.

We develop an SVR-based model, called the Heat model, to calculate the popularity of products. However, no e-commerce website provides labeled data for training the model. Relying on a crowdsourcing approach, we develop a system that generates a training set by collecting participants' judgments on the popularity of products. The experiments indicate that our approach is feasible and that the popularity of products is also a key factor affecting online shopping. We also combine CF and a hierarchical Bayesian discrete choice model to learn the preferences of customers. The experiments demonstrate that customers' preferences play an important role in their purchase decisions.

The model COREL, which combines product associations, product popularity and customer preference, may be applied to predict the products a customer is most likely to purchase. The experiments show that COREL clearly outperforms the baseline methods.

Acknowledgements

This work was supported by the Major Program of the National Natural Science Foundation of China (No. [...]) and the Fundamental Research Funds for the Central University of China (No. JBK120505).

References

Anand V. Bodapati (2008). Recommendation Systems with Purchases Data. Journal of Marketing Research, 45(1).
Chihli Hung and Chih-Fong Tsai (2008). Market segmentation based on hierarchical self-organizing map for markets of multimedia on demand. Expert Systems with Applications, 34(1).
Chih-Chung Chang and Chih-Jen Lin (2011). LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 27(2).
D. Jannach, M. Zanker, A. Felfernig, G. Friedrich (2011). Recommender Systems: An Introduction. Cambridge University Press, New York.
E. H. Suh, K. C. Noh, and C. K. Suh (1999). Customer list segmentation using the combined response model. Expert Systems with Applications, 17(1).
H. Shin and S. Cho (2006). Response modeling with support vector machines. Expert Systems with Applications, 30(4).
Jan R. Magnus, Heinz N. (2007). Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley & Sons, New York.
J. A. McCarty and M. Hastak (2007). Segmentation approaches in data-mining: A comparison of RFM, CHAID, and logistic regression. Journal of Business Research, 60(3).
J. R. Bult and T. Wansbeek (1995). Optimal selection for direct mail. Mark. Sci., 14.
J. Zahavi and N. Levin (1997). Applying neural computing to target marketing. Journal of Direct Marketing, 11(2).
Linyuan Lü, Matúš Medo, Chi Ho Yeung, Yi-Cheng Zhang, Zi-Ke Zhang, Tao Zhou (2012). Recommender systems. Physics Reports, 519(1).
Poel, D. V. D., & Buckinx, W. (2005). Predicting online-purchasing behavior. European Journal of Operational Research, 166.
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang (2005). Data Mining for Inventory Item Selection with Cross-Selling Considerations. Data Mining and Knowledge Discovery, 11(1).
Sha Yang, Greg M. Allenby (2003). Modeling Interdependent Consumer Preferences. Journal of Marketing Research, 40(3).
Shu-hsien Liao, Yin-ju Chen, Hsin-hua Hsieh (2011). Mining customer knowledge for direct selling and marketing. Expert Systems with Applications, 38(5).


The Probability of Informed Trading and the Performance of Stock in an Order-Driven Market Asa-Pacfc Journal of Fnancal Studes (2007) v36 n6 pp871-896 The Probablty of Informed Tradng and the Performance of Stock n an Order-Drven Market Ta Ma * Natonal Sun Yat-Sen Unversty, Tawan Mng-hua Hseh

More information

1.1 The University may award Higher Doctorate degrees as specified from time-to-time in UPR AS11 1.

1.1 The University may award Higher Doctorate degrees as specified from time-to-time in UPR AS11 1. HIGHER DOCTORATE DEGREES SUMMARY OF PRINCIPAL CHANGES General changes None Secton 3.2 Refer to text (Amendments to verson 03.0, UPR AS02 are shown n talcs.) 1 INTRODUCTION 1.1 The Unversty may award Hgher

More information

Stochastic Protocol Modeling for Anomaly Based Network Intrusion Detection

Stochastic Protocol Modeling for Anomaly Based Network Intrusion Detection Stochastc Protocol Modelng for Anomaly Based Network Intruson Detecton Juan M. Estevez-Tapador, Pedro Garca-Teodoro, and Jesus E. Daz-Verdejo Department of Electroncs and Computer Technology Unversty of

More information

Beating the Odds: Arbitrage and Wining Strategies in the Football Betting Market

Beating the Odds: Arbitrage and Wining Strategies in the Football Betting Market Beatng the Odds: Arbtrage and Wnng Strateges n the Football Bettng Market NIKOLAOS VLASTAKIS, GEORGE DOTSIS and RAPHAEL N. MARKELLOS* ABSTRACT We examne the potental for generatng postve returns from wagerng

More information

Data Mining Analysis and Modeling for Marketing Based on Attributes of Customer Relationship

Data Mining Analysis and Modeling for Marketing Based on Attributes of Customer Relationship School of athematcs and Systems Engneerng Reports from SI - Rapporter från SI Data nng Analyss and odelng for arketng Based on Attrbutes of Customer Relatonshp Xaoshan Du Sep 2006 SI Report 06129 Väö Unversty

More information

Detecting Credit Card Fraud using Periodic Features

Detecting Credit Card Fraud using Periodic Features Detectng Credt Card Fraud usng Perodc Features Alejandro Correa Bahnsen, Djamla Aouada, Aleksandar Stojanovc and Björn Ottersten Interdscplnary Centre for Securty, Relablty and Trust Unversty of Luxembourg,

More information

Web Spam Detection Using Machine Learning in Specific Domain Features

Web Spam Detection Using Machine Learning in Specific Domain Features Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty

More information

ERP Software Selection Using The Rough Set And TPOSIS Methods

ERP Software Selection Using The Rough Set And TPOSIS Methods ERP Software Selecton Usng The Rough Set And TPOSIS Methods Under Fuzzy Envronment Informaton Management Department, Hunan Unversty of Fnance and Economcs, No. 139, Fengln 2nd Road, Changsha, 410205, Chna

More information

Searching for Interacting Features for Spam Filtering

Searching for Interacting Features for Spam Filtering Searchng for Interactng Features for Spam Flterng Chuanlang Chen 1, Yun-Chao Gong 2, Rongfang Be 1,, and X. Z. Gao 3 1 Department of Computer Scence, Bejng Normal Unversty, Bejng 100875, Chna 2 Software

More information