Web Services Wind Tunnel: On Performance Testing Large-scale Stateful Web Services

Marcelo De Barros, Jing Shiau, Chen Shang, Kenton Gidewall, Hui Shi, Joe Forsmann
Microsoft Corporation
{marcelod,jshiau,cshang,kentong,huishi,josephfo}@microsoft.com

Abstract

New versions of existing large-scale web services such as Passport.com have to go through rigorous performance evaluations in order to ensure a high degree of availability. Performance testing (such as benchmarking, scalability, and capacity tests) of large-scale stateful systems in managed test environments has many different challenges, mainly related to the reproducibility of production conditions in live data centers. One of these challenges is creating a dataset in a test environment that mimics the actual dataset in production. Other challenges involve the characterization of load patterns in production based on log analysis and proper load simulation via reutilization of data from the existing dataset. The intent of this paper is to describe practical approaches to address some of the aforementioned challenges through the use of various novel techniques. For example, this paper discusses data sanitization, which is the alteration of large datasets in a controlled manner to obfuscate sensitive information, preserving data integrity, relationships, and data equivalence classes. This paper also provides techniques for load pattern characterization via the application of Markov Chains to custom and generic logs, as well as general guidelines for the development of cache-based load simulation tools tailored for the performance evaluation of stateful systems.

1. Introduction

Large-scale online web services are subject to very different loads and conditions when they are released on the internet. Services such as Passport.com can receive up to 3, user-driven transactions per second, and may contain a dataset of over 5,, users. For this reason, new versions of such services have to undergo rigorous performance testing before going public. Irrespective of the tests being executed (benchmarking, load, scalability, capacity, etc.), the high-level process consists of: environment (clusters, data, tools) preparation, execution, and analysis.
This paper describes practical approaches for creating accurate environments for the execution of performance tests for stateful web services. Web services are considered stateful if they contain hard-state instead of soft-state data [3]. Hard-state data is data that cannot be lost due to the unfeasibility of reconstructing it. An example is a user profile and user transactions for a bank account. Soft-state data can be reconstructed from hard-state data. An example would be aggregated financial reports. Many web services available today are stateful.

In performance test environments, pre-population of test data is a crucial step towards replicating production conditions. Traditional approaches for test data generation consist of synthesizing the data based on the application code, random and probabilistic techniques, or custom applications [4]. Other approaches make use of limited data sanitization processes based on predetermined heuristics [5]. However, it is impractical to synthesize the same hard-states observed in production environments due to the unpredictability of the many ways in which the data may have been transformed based on users' activities. Performance tests are particularly sensitive to the dataset, since slight differences in the test data may result in significant discrepancies in the test results.

The ideal dataset would be constructed from the same set of production data for the performance tests, but many of the items in production have restricted access. Therefore there is a need for a sustainable process of obfuscating restricted data items so that the dataset can be safely used in test laboratories. This is accomplished by the use of the Data Sanitization process [6], which aims to retrieve a set of databases and deterministically obfuscate restricted data, while preserving data integrity, relationships, and data equivalence classes. This process is described in section 2.

Once the right data is in place, there is still a need to simulate production behavior by duplicating the right mix of various types of transactions processed by the system and simulating the dependency between related sequences of transactions as observed in production. Many live environments contain a set of logs which can be mined to provide up-to-date statistics describing
the transaction mix. By using data mining techniques, one can determine not only how APIs (Application Programming Interfaces) are being invoked, but also the relationship between invocations of the different APIs. We discuss one of these techniques based on an application of Markov Chains to generic/custom log data. This process is fully described in section 3.

Section 4 discusses approaches for writing cache-based performance test tools which can leverage the previously retrieved and sanitized production data, imported to test environments. Finally, we discuss practical results from the use of these techniques, as well as future improvements, in section 5.

2. Data Sanitization

Using production data for testing new features which will operate on existing data (e.g., new search functionality that allows more fine-grained search criteria) is crucial not only for functional validation of correctness, but also for performance evaluation, since different amounts of data, dependencies, and data characteristics may have significant effects on the overall system's performance. Production data, however, contains large amounts of information regarding individuals which should be kept confidential, even if only used in restricted test laboratories. The general term for such confidential data is Personally Identifiable Information (PII) [1]. The data sanitization process consists of a set of tools and methodologies to take production data and obfuscate all the pre-determined PII, preserving three key characteristics:

i. Data Integrity: Constraints applied to relational database tables, such as Primary Keys and Uniqueness, are carried over after sanitization.
ii. Data Relationships: Relationships between tables in a relational database persist after the sanitization process.
iii. Data Equivalence Classes: Subsets of the domain input data are preserved, such that all elements in the subsets are assumed to be the same from the specification of the subsets [2].

The current process is tailored for obfuscation of data stored in relational databases. The process consists of the following sequence of steps:

2.1 PII Identification: The first step consists of identifying the data that needs to be obfuscated. Databases for large-scale systems may contain thousands of different tables and columns. A set of tools is provided along with the data sanitizer framework to assist the user with the identification of PII, although the identification process requires manual intervention and cannot be fully automated due to the subjectivity of the data. The database schemas are modeled as XML files [7] containing all of the metadata pertinent to the tables in all of the databases.

2.2 Sanitization Method Assignment: The method used to sanitize a certain PII field can now be chosen in such a way as to preserve (i), (ii) and (iii). Figure 1 below illustrates the PII identification and the sanitization method assignment:

Figure 1. PII identification and method assignment

A sanitization method may have the following generic signature:

    object SanitizationMethod(object OriginalValue, object[] Metadata)

The metadata may consist of the details of the particular field in question, such as data type and length. The method should perform a one-way transformation in order to avoid reverse engineering of the sanitized data. The data sanitizer framework comes with several sanitization methods, including the following ones (table 1).

Standard Sanitization Method   Description
Erase                          Erases any non-binary field
EraseBinary                    Erases binary fields
FillWithChar                   Replaces the entire field with a random string of the same length
FillWithDigit                  Replaces the entire field with a random number of the same length
HashString                     Applies a one-way, salt-based (password) SHA1 hash function
HashDigits                     Applies a one-way, salt-based (password) SHA1 hash function, but the result is numeric
NewGUID                        Replaces a GUID with a different (new) GUID

Table 1. Standard sanitization methods

New methods can be added to the framework when deemed necessary. The use of one-way SHA1 hash functions as a sanitization method is essential to ensure
data integrity and relationship preservation post-sanitization: correlated data across tables/databases can be sanitized by using a one-way, salt-based SHA1 hash, ensuring the same output, and thus consistency. A crucial step is to review PII sets as well as sanitization methods and assignments with security experts and legal personnel to validate the correctness of the procedures.

2.3 Test and Sanitization Execution: After the proper identification of PII and sanitization methods, the overall sanitization process is tested in restricted laboratories on non-production data. Upon verification, the process is carried out in production environments. Since the sanitization is an in-place procedure, a copy of production data is made, all within secure production environments. The main sanitizer tool is multithreaded for optimal speed, thus multiple databases and tables are processed concurrently. Because of this, database constraints must be removed prior to the sanitization execution, since during the execution phase data relationships might be temporarily violated. All these constraints are saved prior to the sanitization, and are recreated upon completion of the sanitization process (figure 2).

Figure 2. Sanitization execution

Table 2 below shows some results obtained running the data sanitizer on HP DL385 G1, 4xAMD 2.4GHz processors, 4GB RAM servers, against large data sets. The average percentage of PII fields identified in these data sets was ~13%, with one-way hash functions accounting for ~1% of the sanitization methods:

Databases                      Data            Size    Time (h)
Subscription Service           33MM users      85GB    22
Partner Reporting Service      3MM users       6GB     16
Financial Reporting Service    32MM users      8GB     21
Customer Assistance Service    1.4MM tickets   5GB     3.5
Authentication Service         4MM users       3.5TB   1

Table 2. Sanitization runs experiments

3. Markov Chain Stress Model

The Markov Chain Stress Model includes two major components: the knowledge retriever component and the knowledge exerciser component. Both components are based on the concept of the Markov Chain dynamic stochastic process, which describes the state of systems at successive times [8]. The knowledge retriever component applies data mining on production activity logs and discovers the parameter load patterns of each API.
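As a rough illustration of the kind of statistic the knowledge retriever can mine from activity logs, the sketch below counts how often one API call follows another within a session and normalizes the counts into probabilities. It is a minimal sketch under stated assumptions: the session data, the API names, and the function name `estimate_transition_matrix` are hypothetical and are not taken from the tooling described in this paper.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(sessions):
    """Estimate the probability that one API call directly follows another.

    `sessions` is a list of API-name sequences, one per session, already
    ordered by timestamp (a hypothetical, simplified log format).
    """
    counts = defaultdict(Counter)
    for calls in sessions:
        # Count each (previous call, current call) pair within a session.
        for prev, curr in zip(calls, calls[1:]):
            counts[prev][curr] += 1

    matrix = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        # Normalize each row so that its probabilities sum to 1.
        matrix[prev] = {curr: n / total for curr, n in followers.items()}
    return matrix


# Hypothetical sessions; the API names are made up for illustration.
sessions = [
    ["Login", "GetProfile", "UpdateProfile", "Logout"],
    ["Login", "GetProfile", "Logout"],
    ["Login", "Logout"],
]
matrix = estimate_transition_matrix(sessions)
```

In this toy data, `Login` is followed by `GetProfile` in two of three sessions, so the corresponding estimate is 2/3. APIs that only ever end a session (here `Logout`) get no outgoing row, which a real tool would need to handle explicitly.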
The knowledge exerciser component uses Markov Chain Monte Carlo methods [9], which are a class of algorithms for sampling from probability distributions, based on constructing a Markov chain with the desired distribution as its stationary distribution, to manifest the former statistical knowledge to the stress test environment, and generates dynamic scenarios consistent with production load patterns.

During the knowledge retriever step, we assume that:

- A distributed application has a set with a countable number of APIs, represented as: $A = \{a_1, a_2, \ldots, a_n\}$
- Any API $a_i$ is logically connected to API $a_j$ with a probability weight $p_{ij}$, where $i = j$ is possible.
- Each API has a known set of domain data as input parameters: $a_i = f(x_1, x_2, \ldots, x_m)$, where $x_k$ belongs to a domain set $S_k$, $k = 1, \ldots, m$, and $S_k$ is the set of all possible values of this API for parameter $x_k$.
- A client application makes API calls against the web service according to the Markov process. To simplify our model, we use the first-order Markov process [9] to implement the program, i.e., the current API call at time $t$ depends only on the previous API call at time $t-1$. Writing $X_t \in A$ for the API called at time $t$:

  $$P(X_t \mid X_{1:t-1}) = P(X_t \mid X_{t-1})$$

- The client is homogeneous over time, meaning that its behavior is consistent, so the transition matrix can be rebuilt at any time.

With the above assumptions, we use real production trace data as samples to estimate the Markov Transition Matrix:
$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{pmatrix}$$

where $p_{ij}$ represents the transition probability from API $a_i$ to API $a_j$, and $n$ is the number of APIs. And we have, for each $i = 1, \ldots, n$:

$$\sum_{j=1}^{n} p_{ij} = 1$$

During the knowledge exerciser step our objective is to produce load patterns that are as close as possible to those in production, using the Markov Transition Matrix we created during the knowledge retriever step. An important aspect of this methodology is that the transition matrix extracted only reflects the average behavior over a certain period of time. Therefore, in order to reproduce the exact same pattern observed in production (in terms of different variables, such as CPU utilization, memory utilization, disk utilization, etc.), the matrix has to be periodically updated. The following diagram (figure 3) describes the entire workflow:

Figure 3. Markov Chain workflow

In the above workflow, we first aggregate the production activity logs by session, and order them by timestamps. Through data mining over the aggregated activity logs, we can get: (1) the Markov Transition Matrix over the APIs, and (2) the interval between each of the APIs transacting (some authors refer to it as Thinking Time [10]). Through data mining over each API's parameters with aggregated statistics, we can get the parameter calling patterns of each API. At this point, we are done with the knowledge retriever step.

Our next step is to exercise the knowledge collected in previous steps in the stress testing environment. In practice, we found that it is more challenging to recover and mimic the parameter patterns of each API call than to produce the Markov transition matrix. This is because the majority of input parameters are user-specific information with a high degree of randomness (e.g., a user's first name). For user-specific information, we created tools or methods that generate valid parameter values by creating them or retrieving them from our data store, and then deposit them in parameter pools.

We then use a thread to represent a client. This thread calls APIs one by one based on the Markov Transition Matrix. We also introduce a sleep interval between two consecutive API calls to make the simulation more realistic. The model fetches the needed parameters from the parameter pool for each API call. Our model is scalable such that generation of stress load is possible. The parameter generator and threads can be distributed across machines via configurable parameters.

Figure 4 is an example of our stress load simulation during a 3 hour run. In this example, we generated load on the primary database servers in a test environment using knowledge learned from SQL profiler traces on database servers with the same role in production environments. The load profile in our test clusters was very similar to that observed in production.

[Figure 4 plot: CPU utilization, production and test environment simulation. Y-axis: % CPU utilization; x-axis: time series; series: Production, Test Env.]

Figure 4. Load profiles in production and test

The Markov chain model not only simulates production-like stress load on test environments, but also provides important insights about system behavior, especially the correlation among APIs and among the parameters of each API. For example, highly correlated APIs might get a performance boost by improving locality, while unexpected correlations
might be an indication of a potential area in need of further inspection.

4. Cache-based Load Simulation Tools

Methods described in previous sections provide a large dataset resembling production as well as a practical way to analyze and simulate real-user behaviors (i.e., scenarios). The performance tuning of a stateful system, however, often requires a particular API to be executed numerous times in isolation in order to determine the bottlenecks of the system.

Web services are generally modeled after finite state machines, in that the majority of APIs expect the invoking entity to be in a certain state prior to an operation [11]. The conventional test approach is to generate an entity and induce it to the required state before the actual API invocation (sometimes the state inducement makes use of other APIs). This on-the-fly data generation and entity state inducement can hide actual system bottlenecks, since the preparation work may be the bottleneck itself. In addition, the conventional approach does not allow for taking advantage of sanitized production data.

Our solution is cache-based load simulation. This process requires mapping each individual API call, in advance, to the state(s) in which the entity it operates against must be. Each defined state can be represented as a bucket. The bucket definitions are collections of Boolean conditions. When a particular entity matches the pre-defined set of conditions, the entity is said to belong to the corresponding bucket. The states are not mutually exclusive, so a given entity can potentially satisfy more than one state condition and therefore be present in more than one bucket in the cache.

Once the API state mapping and the evaluation criteria for each state have been established, the process becomes trivial. Given any API, we execute the following:

1. Determine the matching bucket for the API
2. Extract an entity from the bucket
3. Invoke the API with the selected entity
4. Re-evaluate state post API invocation
5. Re-insert the entity in the new bucket(s). If the entity is no longer usable, it is discarded

Our implementation of the entity cache uses a SQL database to prevent data loss in the event of a crash, as well as allowing easy data sharing across multiple instances of the tool.
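The five steps above can be sketched as follows. This is an illustrative, single-threaded sketch only: it keeps the buckets in memory rather than in a SQL database, and the entity shape, bucket names, conditions, and the `withdraw_all` API are all hypothetical. The method names `add_entity` and `get_entity_from_bucket` are Python-style stand-ins chosen for readability, not the actual tool's interface.

```python
import random


class EntityCache:
    """In-memory stand-in for the entity cache (assumption: no SQL backing)."""

    def __init__(self, bucket_conditions):
        # bucket name -> Boolean condition over an entity (hypothetical)
        self.bucket_conditions = bucket_conditions
        self.buckets = {name: [] for name in bucket_conditions}

    def add_entity(self, entity):
        """Evaluate the entity's state and place it in every matching bucket."""
        matched = [name for name, cond in self.bucket_conditions.items()
                   if cond(entity)]
        for name in matched:
            self.buckets[name].append(entity)
        return matched  # an empty list means the entity is discarded

    def get_entity_from_bucket(self, bucket):
        """Pick a random entity and remove it from all buckets before use."""
        if not self.buckets[bucket]:
            return None
        entity = random.choice(self.buckets[bucket])
        for name in self.buckets:
            self.buckets[name] = [e for e in self.buckets[name]
                                  if e is not entity]
        return entity


def invoke_with_cache(cache, api, api_buckets):
    bucket = api_buckets[api]                      # step 1: bucket for the API
    entity = cache.get_entity_from_bucket(bucket)  # step 2: extract an entity
    if entity is None:
        return None                                # bucket is empty
    api(entity)                                    # step 3: invoke the API
    return cache.add_entity(entity)                # steps 4-5: re-evaluate, re-insert


# Hypothetical bucket conditions and API, for illustration only.
conditions = {
    "active": lambda e: e["active"],
    "funded": lambda e: e["active"] and e["balance"] > 0,
}


def withdraw_all(account):
    account["balance"] = 0


api_buckets = {withdraw_all: "funded"}

cache = EntityCache(conditions)
cache.add_entity({"id": 1, "active": True, "balance": 10})
new_buckets = invoke_with_cache(cache, withdraw_all, api_buckets)
```

After the call, the account no longer satisfies the `funded` condition, so it is re-inserted only into the `active` bucket. A multi-threaded tool would additionally need to make the extract step atomic so that two threads never obtain the same entity.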
To simplify access to the database, a layer of abstraction was introduced to wrap the database calls. This layer exposes methods to modify the entities as well as the buckets. It also exposes other methods in addition to the normal add, get, and remove calls (table 3):

Method Type      Methods
Initialization   InitializeCache, LoadBuckets, Clear
Bucket Access    AddBucket, GetBucketList, GetCountPerBucket
Entity Access    AddEntity, GetEntityFromBucket, RemoveEntity, GetPreExistingEntity

Table 3. DB access methods for cache-based simulation

Since the operations performed on the entities will most likely change the entity state, each entity is similar to a critical section: only one thread may act on an entity at a time (however, multiple threads can still operate on different entities). To avoid corruption of entity state, we remove the entity from its matching buckets just prior to the API invocation, and after the API execution, we re-evaluate the entity state and place it back in the appropriate bucket(s).

The aforementioned process depends on a well-populated cache to work. During the cache-population stage, we utilize the database access method GetPreExistingEntity mentioned above to pick random entities from the existing dataset and use the state evaluation step to place the entities in the appropriate buckets in the cache. Population of specific buckets may only be accomplished by executing a set of pre-defined steps, which may or may not leverage existing data. Therefore, the resulting cached data would then be a combination of sanitized data and synthetic data.

In practice, we observed that the entity state can usually be determined by parsing the results from state-retrieving API calls (e.g., Get calls). In cases where the state-retrieving API calls do not provide sufficient information, we construct custom data access methods to query the data store.

This cache-based approach can be implemented in any of the existing commercial applications for load generation, since the model is focused on interactions with the underlying system and not the load volume.

5. Results and Future Work

We have described in this paper techniques for building proper environments and performance test tools which can be used to accurately simulate the same conditions observed in production (live)
environments, targeting large-scale stateful web services. Data Sanitization, the Markov Chain Stress Model, and Cache-based Load Simulation Tools have been successfully used for benchmark, capacity planning, and scalability tests of three major distributed web services: Subscription and Commerce Web Services, Identity Services Web Services, and Customer Assistance Web Services, all part of the Microsoft Member Platform Group. The accuracy of performance numbers collected in test laboratories has increased to a deviation of less than 5% from performance numbers observed in production environments (compared to ~9% with synthesized data). The number of real performance and functional issues found during the quality assurance process has increased by 15% with the introduction of the techniques described in this paper as part of the testing methodology.

Future work involves the enhancement and generalization of the techniques described in this paper, including:

- Extending the application of the data sanitization process to other data sources in addition to relational databases
- Real-time data sanitization
- Generalization of the application of the Markov Chain Stress Model to different log sources
- Generalization of the Cache-based load simulation tools to automatically identify potential matching buckets based on the Finite State Machine for the system being tested.

6. References

[1] Privacy in e-commerce: examining user scenarios and privacy preferences, Proceedings of the 1st ACM conference on Electronic commerce, ACM, 1999.
[2] W.E. Howden, Reliability of the path analysis testing strategy, IEEE Trans. Software Engineering, vol. SE-2, Sep 1976.
[3] Y. Saito, B.N. Bershad, H.M. Levy, "Manageability, availability, and performance in Porcupine: a highly scalable, cluster-based mail service", ACM Transactions on Computer Systems, 2000.
[4] J. Edvardsson, "A survey on automatic test data generation", Proceedings of the Second Conference on Computer Science and Engineering, ECSEL, October 1999.
[5] S.R.M. Oliveira, O.R. Zaïane, Protecting Sensitive Knowledge By Data Sanitization, Third IEEE International Conference on Data Mining, ICDM 2003.
[6] O.C. McDonald, X. Wang, M. De Barros, R.K. Bolla, Q. Ke, Strategies for Sanitizing Data Items, US Patent Application (patent pending), 2004.
[7] S. Abiteboul, P. Buneman, D. Suciu, "Data on the Web: From Relational to Semistructured Data and XML", SIGMOD Record, Vol. 32, No. 4, December 2003.
[8] Stuart J. Russell, Peter Norvig, Artificial Intelligence: A Modern Approach (2nd Edition), Prentice Hall, Upper Saddle River, NJ, Dec 2003.
[9] W.R. Gilks, S. Richardson, D.J. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman & Hall/CRC, Dec 1995.
[10] J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman, Improving .NET Application Performance and Scalability, Microsoft Corp., Redmond, WA, April 2004.
[11] B. Benatallah, F. Casati and F. Toumani, Web service conversation modeling: A cornerstone for e-business automation, IEEE Internet Computing, 2004.