Association for Information Systems
AIS Electronic Library (AISeL)

ICIS 2005 Proceedings
International Conference on Information Systems (ICIS)
12-31-2005
2005 Twenty-Sixth International Conference on Information Systems, Breakthrough Ideas in Information Technology

Profit Maximization with Data Management Systems

Adir Even, Boston University School of Management
Ganesan Shankaranarayanan, Boston University School of Management
Paul Berger, Boston University School of Management

Follow this and additional works at: http://aisel.aisnet.org/icis2005

Recommended Citation
Even, Adir; Shankaranarayanan, Ganesan; and Berger, Paul, "Profit Maximization with Data Management Systems" (2005). ICIS 2005 Proceedings. Paper 4. http://aisel.aisnet.org/icis2005/4

This material is brought to you by the International Conference on Information Systems (ICIS) at AIS Electronic Library (AISeL). It has been accepted for inclusion in ICIS 2005 Proceedings by an authorized administrator of AIS Electronic Library (AISeL). For more information, please contact elibrary@aisnet.org.
PROFIT MAXIMIZATION WITH DATA MANAGEMENT SYSTEMS

Adir Even, G. Shankaranarayanan, and Paul D. Berger
Boston University School of Management
Boston, MA U.S.A.
adir@bu.edu gshankar@bu.edu pdberger@bu.edu

Abstract

Data, a core component of information systems, has long been recognized as a critical resource to firms. Data is the backbone of business processes; it enables efficient operations, supports managerial decision-making, and generates revenues as a commodity. This study identifies a significant gap between the technical and the business perspectives of data management. While functionality and technical efficiency are well addressed, the consideration of economic perspectives, such as value-contribution and profitability, is not evident. This study suggests that introducing economic perspectives can better inform the design and the administration of data management systems by accounting for the interplay between business benefits and implementation costs. To address the identified gap, the paper proposes a quantitative microeconomic framework for data management that links value and cost to the impartial/technological characteristics of data and the related information system. Such a mapping allows cost/benefit assessment and determination of the optimal configuration of system and data characteristics to maximize value and profits. The framework is demonstrated through development of a model for tabular datasets, and the optimal design of dataset characteristics (such as time span, desired quality level, and the set of attributes to be included). The application of the model is illustrated using numerical examples.

Keywords: Data management, information value, information economy, database, design, data quality, microeconomic modeling, utility, information product design

Introduction

Data has been recognized as a critical resource, essential at all business levels from day-to-day operations to strategic decision-making (Redman 1996). The amount of data managed by organizations is increasing; data warehouses in the magnitude of petabytes (10^15 bytes) have been recently reported.
Data management efforts are directed toward the design and management of data activities (such as acquisition, processing, storage, and delivery), and corporate investments in data management technologies and services are steadily growing (Wixom and Watson 2001). Current data management activities and design methodologies are geared toward functionality and technical efficiency but, as the investments in these grow, concerns about the economic benefits of data assets also increase. For information-intensive firms (e.g., Reuters, AC-Nielsen, and S&P), the value gained from data products is the critical success factor, and data in such firms is perceived as a strategic asset (Glazer 1993). There is no significant research evidence for understanding the corporate gains from investments in data repositories and the related management activities. There is an emerging need to examine how such assets ought to be managed from the business-benefit viewpoint.

This study brings together the economic and technical perspectives in information systems/information technology (IS/IT) management in general. Specifically, it examines data management, an IS/IT field in which the gap between the two perspectives is apparent. This study contributes to a better understanding of the business value and the profitability of data, the impact of economic factors on data management activities, and their implications for design and administration.

The remainder of this paper is organized as follows. The next section provides the relevant background. The subsequent section proposes a microeconomic framework
for modeling the effect of data and system characteristics on both value and cost. This framework allows assessment of profitability, which can be maximized through an optimal design configuration. The framework in its general form targets a broad range of system components and configuration factors. In this study, however, it is applied to the development of a quantitative model for the tabular dataset, the most common data-storage structure in data management systems. We then demonstrate the use of the model for profit maximization by configuring key tabular dataset characteristics: the time span, the data quality level, and the field structure, essentially addressing the design of tabular datasets in data repositories. Finally, concluding remarks are offered and directions for future research proposed.

Relevant Background

This section reviews relevant data management aspects and introduces research concepts that can encourage more rigorous economic thinking in this field. Data management has been the subject of substantial academic research and practitioner discussions. It is considered a complex task: data is integrated from multiple sources, processed and stored in repositories, and delivered by front-end tools or via exchange protocols. The collection of processes and systems forms a complex multistage architecture that can be viewed as a data manufacturing process (DMP). The output of the DMP is an information product (IP), a commodity that can be used internally, sold to other firms, or embedded within product offerings. The DMP/IP management is assisted by metadata, an abstraction of design and administration choices that represents different aspects of functionality and characteristics, such as infrastructure, model, process, contents, representation, and administration (Shankaranarayanan and Even 2004). The DMP/IP view is predominant in today's data quality management (DQM) field, underlying the adaptation of total quality management principles (Wang 1998), and the development of quality improvement methodologies (Ballou et al. 1998; Redman 1996; Shankaranarayanan et al. 2003).
The value gained by DQM is often discussed qualitatively (Redman 1996; Wang 1998), with the exception of the quantitative DMP optimization discussed by Ballou et al. (1998). Their model maps quality attributes to value and cost, an approach that underlies optimization models for quality trade-offs (Ballou and Pazer 1995, 2003), and influences the framework developed in this paper.

Another key data management principle that influences this study is the separation of data and program. Datasets are held in independent database management systems (DBMS) that can be accessed by multiple applications. Leading data management technologies (e.g., RDBMS, flat files, spreadsheets, and statistical packages) use a tabular-dataset model, based on two-dimensional structures that abstract business entities and/or the relationships among them. This research focuses on key tabular-dataset characteristics that are further discussed in the next section.

While the technical aspects of data management are well addressed, discussion of its economic aspects is surprisingly rare, although information value (IV) and economics have attracted significant research attention within the broader context of IS/IT (Banker and Kauffman 2004), and have been an important focal point for field practitioners (Wixom and Watson 2001). A few key concepts have emerged from this stream. While firms are viewed as profit-maximizing entities, the profit contribution of IS/IT is often not apparent and difficult to prove empirically (Davern and Kauffman 2000). Organizations gain economic benefits from IS/IT through strategic alignment with business goals (Henderson and Venkatraman 1993). IS/IT value is shown to materialize through contextual use (Devaraj and Kohli 2003) and successful integration into business processes, together with complementary resources (Davern and Kauffman 2000). IV is often viewed as the payoff margin between perfect information and imperfections that result in inferior outcome and lower willingness-to-pay (Banker and Kauffman 2004). IV is argued to be asymmetric: shared information may have different value to different actors (Lee et al. 2000; Raju and Roy 2000).
Among the IV determinants (IS characteristics, managerial actions, payoffs, and uncertainties), only the IS characteristics (particularly quality) are shown to have monotonic relationships with value (Hilton 1981). Conversely, volume does not; increasing it might have negative effects when interactions among utilities (Arya et al. 1997) or uncertainties (Sulganik and Zilcha 1996) exist.

Concepts of information value and economics can inform better data management from the business viewpoint. First, while data management focuses on technical and functional aspects, there is a need to measure performance within business context and choose the dependent variable (DV) accordingly (e.g., profitability). Profit maximization implies both value increase and cost reduction. While data management costs are reasonably well understood, the value gained is largely unknown and needs further study. Second, data is attributed with value only within contextual use, hence the need to link valuation of data with business usage. Third, although driven by use, value is influenced by impartial IS characteristics. IV models often treat information abstractly, rather than address specific technological characteristics. The utility function approach aims to address this challenge. Such functions map IS attributes into tangible value within specific usage, and attempt to maximize value through optimal design (Ahituv 1981). Utility functions are not commonly used in data management, although they are shown to be significant for DQM and DMP optimization (Ballou and Pazer 1995, 2003; Ballou et al. 1998). Microeconomic frameworks for value-driven data mining have embedded utility functions to help direct data search that yields patterns with the potential to generate concrete and
profitable action (Kleinberg et al. 1998). These concepts (profitability as the DV, contextual value attribution, and the mapping of impartial/technological IS/IT characteristics to value and cost) guide the development of the microeconomic framework described next.

[Figure 1. Profit Maximization Model for Data Management. Impartial characteristics (infrastructure, model, process, contents, presentation, management) drive both value (+) and cost (-), which combine into profit. Contextual usage (business process specifics, strategic versus operational, external versus internal use, short-term versus long-term, level of uncertainty) moderates the effect on value; implementation factors (technology purchase, software programming, management overhead) moderate the effect on cost.]

A Profit-Maximization Framework for the DMP/IP

The quantitative microeconomic framework, which stems from the DMP/IP view, assumes maximizing profit (the difference between the value gained by using IPs, the DMP output, and the cost of implementing the DMP) as the key goal of data management. It is assumed that both value and cost are affected by impartial characteristics of the DMP and/or the IP that are captured in the metadata abstraction. The effect of these characteristics on value is assumed to be moderated by contextual usages and their effect on cost by implementation factors (Figure 1). The model highlights three paths to increase profit: (1) increasing IP usage, (2) reducing DMP implementation costs, and (3) optimizing DMP/IP configuration. This microeconomic model coincides with the view of design as a process of searching for optimality among feasible solutions (Churchman 1971). This study posits that models, such as the one described here, promote value-driven and profit-maximizing design and, hence, contribute to better data management.

Value and profit optimization models are common in marketing and operations management research, particularly in the field of product design. Such models underlie the optimal design and configuration of products (e.g., Kohli and Sukumar 1990) and services (e.g., Easton and Pullman 2001; Eriksen and Berger 1987). They are also used for cost-benefit optimization of production lines (e.g., Cooper and Slagmulder 2004; Yigit et al. 2002).
To develop the model, this paper first describes the general formulation of the key constructs by taking a deterministic approach. This is then extended for tabular datasets.¹

¹ Alternative data models (e.g., object-oriented, object-relational, and XML) are gaining popularity in IS implementation, and future extensions to this study should look into extending the framework to address those.

Metadata Vector (X): The metadata vector X represents the set of input characteristics that are subject to optimal configuration. Metadata characteristics can be broadly classified as design or maintenance, a categorization that reflects different optimization options at different stages of the system implementation cycle. Design characteristics reflect the long-term decisions that are typical in early stages (e.g., infrastructural and architectural choices), while maintenance characteristics reflect the short-term, ongoing decisions that are made when the system is operational (e.g., performance monitoring and troubleshooting). Data management systems introduce a large set of configuration decisions. To minimize complexity, it is important to limit the input
set to include only those characteristics that significantly affect value and/or cost and, hence, profitability. In the specific case of the tabular dataset, the following key design characteristics significantly affect the value and the cost.

Field Structure (\(\{Y_m\}_{m=1..M}\)): In the tabular dataset model, entity attributes are represented by fields or columns. Attributes are not necessarily of equal importance from the business perspective. An optimal field-structure design has to consider trade-offs: a small set forms a parsimonious abstraction of the entity, simplifies data acquisition and processing, decreases size, and results in lower storage and administration costs. An over-simplified set, on the other hand, might fail to capture important descriptors, and thus prevent the integration of the dataset within business processes and reduce the potential for gaining value. While some fields may be mandatory for all possible uses, or needed for maintenance purposes (e.g., time indicators or index fields), others may be optional and their inclusion or exclusion is subject to design decisions. We assume M optional fields, each represented by a variable \(Y_m\), a binary integer: \(Y_m = 1\) implies a decision to include the field [m] in the dataset, while \(Y_m = 0\) implies exclusion.

Time Span (T): Entity instances in a tabular dataset are represented as records, or rows, with identical field structure. The number of records (N) introduces profitability trade-offs: a larger N often implies higher costs (e.g., upgrades to storage space, hardware, and applications). On the other hand, a larger N offers a broader and more granular business perspective and allows more elaborate analysis. To address this trade-off, databases are often segmented by record age: the more recent data is made available to end-users via an active dataset, optimized for fast performance, while older data is discarded or archived. The time span covered by the active dataset, determined by an age cut-off, dictates the expected N and hence becomes an important design factor. The time span is represented as a nonnegative, continuous variable, \(T \ge 0\). N is assumed to be linear with T: \(N(T) = RT\), where R is the record density per time period.
Quality Level (Q): Data quality (DQ) is commonly evaluated along a set of attributes (e.g., accuracy and completeness), and measured as a ratio within the range of 0 (bad) and 1 (good) (Pipino et al. 2002). DQ can be measured impartially, based on the dataset structure and/or contents (e.g., reflecting undamaged data), or contextually, within a specific business usage (e.g., reflecting task-relevant data). DQM literature has discussed a plethora of DQ solutions, from error detection and correction to comprehensive process design methodologies (Redman 1996; Wang 1998). In the proposed model, the quality level is viewed as a design target (conversely, in other decision scenarios, it can be viewed as reflecting the actual status). Designing the system that guarantees the high quality of the generated dataset(s) increases the potential to gain value, but might imply investments in costly DQ solutions. The model proposed here optimizes the targeted level of one DQ attribute, measured impartially. A dataset record is assumed to be either of good quality with likelihood \(Q_r\) or of poor quality with likelihood \((1 - Q_r)\). The overall dataset quality Q is a number between 0 and 1, defined as the proportion of good-quality records. It can be shown that \(Q = Q_r\), assuming that record quality levels are independent and identically distributed variables.

Value: Data (or IP) value is attributed within contextual business use and reflects the consumer's willingness-to-pay, hence is measured monetarily. Impartial DMP/IP characteristics can affect the value, some directly (e.g., dataset richness, promptness of delivery, or accuracy), and others indirectly (e.g., hardware and process configurations). The characteristic-to-value mapping is represented as a set of utility functions, one per usage scenario. The contextual moderation effect is reflected in the specific functional form. The overall value is sum-additive:

\[ U(X) = \sum_{i=1}^{I} U_i(X) \tag{1} \]

where
X = The metadata vector of impartial characteristics
\(U_i\) = Utility within contextual usages, indexed by [i]
I = The total number of contextual usages
U = The overall value

A tabular dataset can serve multiple purposes.
Considering time (T) and quality (Q) first, the value within a use scenario is assumed to be capped and maximized with the longest possible time range (\(T \to \infty\)) and optimal quality (Q = 1). The value degrades with lower quality level and is assumed to have an exponentially diminishing return with age.²

² Such assumptions reflect a primarily internally (within an organization) focused usage. With external use, increasing volumes and improving quality above certain levels can create new usage opportunities and the resulting demand curves are more likely to follow an S-shape. These are to be explored in future extensions.

It is hence represented as
\[ U_i(T, Q) = k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i} \tag{2} \]

where
\(U_i(T, Q)\) = The value of usage scenario [i]
\(k_i\) = The value cap of usage scenario [i], at Q = 1 and \(T \to \infty\)
\(\alpha_i\) = A positive exponential slope factor of usage scenario [i]. The greater the value of \(\alpha_i\), the less dependent the usage scenario is on older data.
\(\beta_i\) = A positive quality sensitivity factor of usage scenario [i]. The greater the value of \(\beta_i\), the more sensitive the usage scenario is to loss of quality.

The overall value for all use scenarios is therefore given by

\[ U(T, Q) = \sum_{i=1}^{I} k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i} \tag{3} \]

Adding structure characteristics (\(X = [T, Q, \{Y_m\}]\)), each use scenario may require a different field subset. Some fields are mandatory for a certain scenario, some are not mandatory but may reduce value if excluded, and yet others do not affect the scenario at all. We define \(\eta_{i,m}\) as the sensitivity factor of usage [i] to optional field [m], \(0 \le \eta_{i,m} \le 1\). The higher the \(\eta_{i,m}\), the more necessary the field. \(\eta_{i,m} = 1\) implies a mandatory field for usage [i]. \(\eta_{i,m} = 0\) implies that usage scenario [i] is independent of the inclusion of field [m].

\(s_{i,m} = 1 - \eta_{i,m}(1 - Y_m)\) is the effect of field [m] on usage scenario [i]. With sensitivity factor \(\eta_{i,m} = 0\), the effect is always 1. With sensitivity factor \(\eta_{i,m} = 1\), the effect is 1 if the field is included and 0 if not. With \(0 < \eta_{i,m} < 1\), the effect is 1 if the field is included, and \(1 - \eta_{i,m}\) if not.

\(S_i = \prod_{m=1}^{M} s_{i,m} = \prod_{m=1}^{M} \left[1 - \eta_{i,m}(1 - Y_m)\right]\) is the structure effect on usage [i]. Excluding a mandatory field implies \(S_i = 0\). Excluding a partially important field (\(0 < \eta_{i,m} < 1\)) reduces \(S_i\) but not to 0. Fields of which the scenario is independent do not affect the overall value (since \(s_{i,m} = 1\)).

Adding field structure considerations to (3), the overall value is given by

\[ U(T, Q, \{Y_m\}) = \sum_{i=1}^{I} k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i} S_i = \sum_{i=1}^{I} k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i} \prod_{m=1}^{M} \left[1 - \eta_{i,m}(1 - Y_m)\right] \tag{4} \]

\(k_i\) is now interpreted as the maximum potential value of scenario [i], given full time span covered (\(T \to \infty\)), optimal quality level (Q = 1), and all the value-contributing fields included.

Cost: The DMP implementation comes at a cost, which is driven by technical and managerial decisions such as infrastructural choices, programming efforts, investment in quality improvement solutions, and administrative overhead. Similar to utility, DMP characteristics affect the cost directly or indirectly.
Cost factors can be represented as a parameterized function that translates the effect of impartial characteristics to monetary output.

\[ C(X) = \sum_{j=1}^{J} C_j(X) \tag{5} \]

where
X = The metadata vector of impartial characteristics
\(C_j\) = Cost factor, indexed by [j]
J = The total number of cost factors
C = The overall cost

For tabular datasets, considering T and Q first, cost is assumed to have three components.
\[ C(T, Q) = C^f + C^l(T) + C^q(T, Q) \tag{6} \]

where
\(C^f\) = A fixed component, which can be related, for example, to hardware or networking.
\(C^l(T)\) = A linear component with a per-record cost \(c^l\), which can be related to data acquisition or to investment in disk-storage space. The linear cost is, hence, given by

\[ C^l(T) = c^l N(T) = c^l R T \tag{7} \]

\(C^q(T, Q)\) = Variable quality cost. Often, the older the data is, the more expensive it is to maintain its quality. The quality cost per record is therefore assumed to be

\[ C^q_R(t, Q_r) = c^q Q_r^{\delta} (1 + \theta t) = c^q Q^{\delta} (1 + \theta t) \tag{8} \]

where
t = The age of the record.
\(Q_r\) (= Q) = Quality level of a single record.
\(c^q\) = Cost of maintaining a record of age 0 at a maximum quality level.
\(\delta\) = Cost sensitivity to the quality, \(\delta > 1\). The greater \(\delta\) is, the greater is the increase in cost as perfect quality is approached.
\(\theta\) = Cost sensitivity to age, assuming linear increase. \(\theta \ge 0\), where equality to zero reflects no age effect on quality cost per record.

The record density is R. Therefore, the overall quality cost and the overall cost are

\[ C^q(T, Q) = \int_{0}^{T} R\, C^q_R(\tau, Q)\, d\tau = c^q R Q^{\delta} \int_{0}^{T} (1 + \theta \tau)\, d\tau = c^q R Q^{\delta} \left(T + 0.5\,\theta T^2\right) \tag{9} \]

\[ C(T, Q) = C^f + C^l(T) + C^q(T, Q) = c^f + c^l R T + c^q R Q^{\delta} \left(T + 0.5\,\theta T^2\right) \tag{10} \]

Adding field structure characteristics may affect each cost component.

Fixed Cost (\(C^f\)): The fixed cost is assumed to have a fixed component \(c^f_0\). Each optional field, if included, adds an incremental cost \(c^f_m\) (e.g., design and programming efforts related to adding the field). The fixed cost is hence given by

\[ C^f(\{Y_m\}) = c^f_0 + \sum_{m=1}^{M} c^f_m Y_m \tag{11} \]

Linear Cost (\(C^l\)): The per-record cost \(c^l\) is assumed to have a fixed component \(c^l_0\) (potentially attributed to mandatory fields). Each optional field, if included, adds an incremental cost \(c^l_m\). The linear cost, as derived from (7), is hence given by

\[ C^l(T, \{Y_m\}) = \left(c^l_0 + \sum_{m=1}^{M} c^l_m Y_m\right) R T \tag{12} \]

Quality Cost (\(C^q\)): Similarly, the per-record quality cost \(c^q\) is assumed to have a fixed component \(c^q_0\), and each optional field, if included, adds an incremental quality cost per record \(c^q_m\). The quality cost, as derived from (10), is, hence, given by

\[ C^q(T, Q, \{Y_m\}) = \left(c^q_0 + \sum_{m=1}^{M} c^q_m Y_m\right) R Q^{\delta} \left(T + 0.5\,\theta T^2\right) \tag{13} \]

The overall cost sums the three cost factors
\[ C(T, Q, \{Y_m\}) = \left(c^f_0 + \sum_{m=1}^{M} c^f_m Y_m\right) + \left(c^l_0 + \sum_{m=1}^{M} c^l_m Y_m\right) R T + \left(c^q_0 + \sum_{m=1}^{M} c^q_m Y_m\right) R Q^{\delta} \left(T + 0.5\,\theta T^2\right) \tag{14} \]

Profit: The profit is defined as the difference between overall value and overall cost.

\[ P(X) = U(X) - C(X) = \sum_{i=1}^{I} U_i(X) - \sum_{j=1}^{J} C_j(X) \tag{15} \]

where
X = A vector of metadata characteristics.
\(\{U_i(X)\}\) = Value attributed to I contextual usages, indexed by [i].
\(\{C_j(X)\}\) = Cost attributed to J cost factors, indexed by [j].
P(X) = Contribution of data to profit.

P(X) is the objective function for the optimization problem: configure the characteristics of the DMP/IP (the vector X) such that the overall profit is maximized. Optimal configuration is subject to constraints such as target business goals, legal and contractual obligations, capped implementation budget and time, scarcity of required resources, or interdependency among metadata components. With tabular datasets, considering T and Q first, profit is given by

\[ P(T, Q) = \sum_{i=1}^{I} k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i} - \left(c^f + c^l R T + c^q R Q^{\delta} \left(T + 0.5\,\theta T^2\right)\right) \quad \text{s.t. } T \ge 0,\; Q \ge 0,\; Q \le 1 \tag{16} \]

Adding field structure characteristics yields

\[ P(T, Q, \{Y_m\}) = \sum_{i=1}^{I} k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i} \prod_{m=1}^{M} \left[1 - \eta_{i,m}(1 - Y_m)\right] - \left[\left(c^f_0 + \sum_{m=1}^{M} c^f_m Y_m\right) + \left(c^l_0 + \sum_{m=1}^{M} c^l_m Y_m\right) R T + \left(c^q_0 + \sum_{m=1}^{M} c^q_m Y_m\right) R Q^{\delta} \left(T + 0.5\,\theta T^2\right)\right] \tag{17} \]

s.t. \(T \ge 0\), \(Q \ge 0\), \(Q \le 1\), \(Y_m \ge 0\), \(Y_m \le 1\) and \(Y_m\) integer, for each m = 1..M

While the profitability has apparent trade-offs with T, Q, and \(\{Y_m\}\), the effect of other parameters can be inferred from the profit model. We expect profitability to
• Increase with I and \(\{k_i\}\): more usages and higher per-use value increase profitability.
• Increase with \(\alpha_i\) and decrease with \(\beta_i\): higher time sensitivity implies near optimality with less time coverage, higher quality sensitivity implies higher decline as quality degrades.
• Decrease with \(\eta_{i,m}\): profitability is likely to reduce with higher sensitivity to field exclusion.
• Decrease with \(c^f\), \(c^l\), and \(c^q\): higher cost factors decrease profitability.
• Decrease with \(\delta\) and \(\theta\): higher quality cost sensitivity to Q or to T decreases profitability.

Future Extensions: Data management systems are complex and in reality profitability can be affected by a large set of other characteristics (e.g., infrastructure, process, and delivery).
The usability of such a framework depends on identifying a limited subset of influential characteristics and modeling their effect on value and cost. The general formulation assumes determinism for simplicity, although in reality data management systems are far from being deterministic. Alternative formulations can consider stochastic behavior and transform the optimization problem (15) into maximization of expectation over time (denoted \(E_t[\cdot]\)).

\[ P(X) = E_t[U(X)] - E_t[C(X)] = \sum_{i=1}^{I} E_t[U_i(X)] - \sum_{j=1}^{J} E_t[C_j(X)] \tag{18} \]

Another underlying assumption to be reconsidered is the sum-additive modeling of utility and cost factors. This assumption, while simplifying the model analytically, suggests that such factors are independent of each other, which often does not reflect real-life
practices. Value enhancement or neutralizing relationships may exist among utilities and costs, in which case the whole is not necessarily an additive sum of the parts. Such scenarios of interdependency ought to be further explored, and the framework should be enhanced to model them properly. Other model enhancements to consider are, for example
• Different constraints on T and Q (\(T_{min} < T < T_{max}\), \(Q_{min} < Q < Q_{max}\)), dictated by business needs.
• Uneven distribution of records along time, which implies a different N(T) formulation.
• Quality levels per record that are not i.i.d., which implies a different Q formulation.
• Different functional forms (e.g., step or S-shaped) for mapping T, Q, and \(\{Y_m\}\) to cost and value: information overload, for example, might imply value degradation as volume and field structure complexity increase. Larger T and Q, and/or richer field structure may introduce new usage opportunities, but require significant hardware and software upgrade.

Optimal Configuration of Tabular-Dataset Characteristics

This section demonstrates the use of the profit-maximization framework for optimal design of dataset characteristics. The optimization model is nonlinear and mixes continuous and integer input variables. Still, within certain assumptions, a closed-form solution can be obtained. More often, however, optimization requires numerical approximation, using dedicated software.³

Optimizing the Time Span (T) and the Quality Level (Q)

With certain relaxations, closed-form solutions can be obtained for optimizing T and Q. Three cases are demonstrated: (1) optimizing T alone, (2) optimizing Q alone, and (3) optimizing T and Q simultaneously.

Case 1: Optimizing the time span alone, given Q. A first derivative of (16) yields

\[ \frac{\partial P(T, Q)}{\partial T} = \sum_{i=1}^{I} \alpha_i k_i e^{-\alpha_i T} Q^{\beta_i} - c^l R - c^q R Q^{\delta} - c^q R Q^{\delta} \theta T \tag{19} \]

The optimal time span can be obtained from \(\partial P(T, Q)/\partial T = 0\), or

\[ \sum_{i=1}^{I} \alpha_i k_i e^{-\alpha_i T} Q^{\beta_i} = c^l R + c^q R Q^{\delta} + c^q R Q^{\delta} \theta T \tag{20} \]

Since the left-hand side is monotonically decreasing with T and the right-hand side is monotonically increasing with T, there is a single \(T^{OPT}\) solution, provided that the marginal value exceeds the marginal cost at T = 0. Taking a second derivative

\[ \frac{\partial^2 P}{\partial T^2} = -\sum_{i=1}^{I} \alpha_i^2 k_i e^{-\alpha_i T} Q^{\beta_i} - c^q R Q^{\delta} \theta \tag{21} \]

The second derivative is negative. Hence, \(T^{OPT}\) is a point of maximal profitability. A closed-form solution can be obtained from (20) for a single utility (I = 1) and assuming \(\theta = 0\).
\[ \alpha k e^{-\alpha T} Q^{\beta} = c^l R + c^q R Q^{\delta} \tag{22} \]

This optimum represents the time point above which the marginal cost (\(c^l R + c^q R Q^{\delta}\)) exceeds the marginal value (\(\alpha k e^{-\alpha T} Q^{\beta}\)). The optimal time can now be obtained from (22).

³ Microsoft Excel/Solver was used for the numeric illustrations in this study.
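\[ T^{OPT} = \frac{1}{\alpha} \ln\left[\frac{\alpha k Q^{\beta}}{R\left(c^l + c^q Q^{\delta}\right)}\right] \tag{23} \]

The maximum profitability is given by

\[ P(T^{OPT}, Q) = k Q^{\beta} - \frac{R}{\alpha}\left(c^l + c^q Q^{\delta}\right) \ln\left[e \cdot \frac{\alpha k Q^{\beta}}{R\left(c^l + c^q Q^{\delta}\right)}\right] - c^f \tag{24} \]

Case 2: Optimizing Q alone, given T. A first derivative of (16) yields

\[ \frac{\partial P(T, Q)}{\partial Q} = \sum_{i=1}^{I} \beta_i k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i - 1} - \delta c^q R Q^{\delta - 1} \left(T + 0.5\,\theta T^2\right) \tag{25} \]

The optimal quality level can be obtained from \(\partial P(T, Q)/\partial Q = 0\), or

\[ \sum_{i=1}^{I} \beta_i k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i - 1} = \delta c^q R Q^{\delta - 1} \left(T + 0.5\,\theta T^2\right) \tag{26} \]

Such an equation may have more than one solution, depending on the actual parameter values. A solution below 0 implies that the system is infeasible; when a solution is above 1, the constraint \(Q \le 1\) applies, and Q = 1 is the candidate choice. The second derivative yields

\[ \frac{\partial^2 P(T, Q)}{\partial Q^2} = \sum_{i=1}^{I} \beta_i (\beta_i - 1) k_i \left(1 - e^{-\alpha_i T}\right) Q^{\beta_i - 2} - \delta (\delta - 1) c^q R Q^{\delta - 2} \left(T + 0.5\,\theta T^2\right) \tag{27} \]

With \(\delta > 1\), the second derivative is negative if \(0 < \beta_i < 1\) and the optimal solution, if feasible (0 < Q < 1), indicates maximal profitability. If \(\beta_i > 1\), the second derivative is not guaranteed to be negative. Hence, there is a need to obtain its value with the actual parameters. A closed-form solution can be obtained from (26) for a single utility (I = 1):

\[ \beta k \left(1 - e^{-\alpha T}\right) Q^{\beta - 1} = \delta c^q R Q^{\delta - 1} \left(T + 0.5\,\theta T^2\right) \tag{28} \]

This optimum represents the quality level above which the marginal cost \(\delta c^q R Q^{\delta - 1}(T + 0.5\,\theta T^2)\) exceeds the marginal value \(\beta k (1 - e^{-\alpha T}) Q^{\beta - 1}\). The optimum can now be obtained from (28):

\[ Q^{OPT} = \left[\frac{\beta k \left(1 - e^{-\alpha T}\right)}{\delta c^q R \left(T + 0.5\,\theta T^2\right)}\right]^{\frac{1}{\delta - \beta}} \tag{29} \]

If \(0 < Q^{OPT} < 1\), the maximal profitability is given by

\[ P(T, Q^{OPT}) = k \left(1 - e^{-\alpha T}\right) \left(Q^{OPT}\right)^{\beta} - c^f - c^l R T - c^q R \left(Q^{OPT}\right)^{\delta} \left(T + 0.5\,\theta T^2\right) \tag{30} \]

Otherwise, \(Q^{OPT} < 0\) implies infeasibility, and if \(Q^{OPT} > 1\), Q = 1 is optimal and profitability is

\[ P(T, 1) = k \left(1 - e^{-\alpha T}\right) - c^f - \left(c^l + c^q\right) R T - 0.5\,\theta c^q R T^2 \tag{31} \]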
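The closed forms (23) and (29) can be checked numerically. The following is a minimal Python sketch, assuming a single usage (I = 1) and \(\delta > \beta\); the parameter values in `P_HYP` and the helper names (`profit`, `t_opt`, `q_opt`, `slope`) are hypothetical and chosen only for illustration, not taken from the study. It evaluates each closed form and confirms with a central finite difference that the profit slope is approximately zero at the returned point.

```python
import math

# Hypothetical parameter values, for illustration only (not from the study).
P_HYP = {"k": 500_000.0, "alpha": 0.3, "beta": 0.5, "R": 5_000.0,
         "c_f": 0.0, "c_l": 1.0, "c_q": 4.0, "delta": 3.0}


def profit(T, Q, p, theta=0.0):
    """Single-usage profit (16): value minus fixed, linear, and quality cost."""
    value = p["k"] * (1.0 - math.exp(-p["alpha"] * T)) * Q ** p["beta"]
    cost = (p["c_f"] + p["c_l"] * p["R"] * T
            + p["c_q"] * p["R"] * Q ** p["delta"] * (T + 0.5 * theta * T ** 2))
    return value - cost


def t_opt(Q, p):
    """Closed form (23) for the time span; valid for I = 1 and theta = 0."""
    ratio = (p["alpha"] * p["k"] * Q ** p["beta"]
             / (p["R"] * (p["c_l"] + p["c_q"] * Q ** p["delta"])))
    if ratio <= 1.0:
        return 0.0  # marginal cost already exceeds marginal value at T = 0
    return math.log(ratio) / p["alpha"]


def q_opt(T, p, theta=0.0):
    """Closed form (29) for the quality level; I = 1, capped at Q = 1."""
    age_term = T + 0.5 * theta * T ** 2
    base = (p["beta"] * p["k"] * (1.0 - math.exp(-p["alpha"] * T))
            / (p["delta"] * p["c_q"] * p["R"] * age_term))
    return min(base ** (1.0 / (p["delta"] - p["beta"])), 1.0)


def slope(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


T_STAR = t_opt(0.9, P_HYP)               # optimize T for a given Q = 0.9
Q_STAR = q_opt(6.0, P_HYP, theta=0.1)    # optimize Q for a given T = 6

print(T_STAR, slope(lambda T: profit(T, 0.9, P_HYP), T_STAR))
print(Q_STAR, slope(lambda Q: profit(6.0, Q, P_HYP, theta=0.1), Q_STAR))
```

Because the hypothetical \(\beta\) is below 1, the second derivative (27) is negative and the stationary point in Q is a maximum; with \(\beta > 1\) the same check would still locate a stationary point, but its nature would have to be verified separately.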
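Case 3: First derivatives by T and by Q are given by (19) and (25) respectively. Solving \(\partial P/\partial T = 0\) and \(\partial P/\partial Q = 0\) simultaneously yields candidate solutions, which can be checked for optimality. With the simplifying assumptions of I = 1, \(c^l = 0\) and \(\theta = 0\),

\[ \frac{\partial P(T, Q)}{\partial T} = \alpha k e^{-\alpha T} Q^{\beta} - c^q R Q^{\delta} \quad \text{and} \quad \frac{\partial P(T, Q)}{\partial Q} = \beta k \left(1 - e^{-\alpha T}\right) Q^{\beta - 1} - \delta c^q R Q^{\delta - 1} T \tag{32} \]

Solving \(\partial P/\partial T = 0\) and \(\partial P/\partial Q = 0\):

\[ \alpha k e^{-\alpha T} Q^{\beta} = c^q R Q^{\delta} \quad \text{and} \quad \beta k \left(1 - e^{-\alpha T}\right) Q^{\beta} = \delta c^q R Q^{\delta} T \tag{33} \]

Dividing the second equation by the first yields

\[ e^{\alpha T^{OPT}} - 1 = \frac{\alpha \delta}{\beta}\, T^{OPT} \tag{34} \]

For \(\delta > \beta\), the equation has a positive \(T^{OPT}\) solution (in addition to T = 0). Substituting \(T^{OPT}\) into (33) yields a candidate \(Q^{OPT}\), which should be checked for feasibility (0 < Q < 1).

\[ Q^{OPT} = \left[\frac{\alpha k\, e^{-\alpha T^{OPT}}}{c^q R}\right]^{\frac{1}{\delta - \beta}} \tag{35} \]

For checking optimality, the Hessian matrix and its determinant should be examined.

\[ H[P(T, Q)] = \begin{bmatrix} \partial^2 P/\partial T^2 & \partial^2 P/\partial T \partial Q \\ \partial^2 P/\partial Q \partial T & \partial^2 P/\partial Q^2 \end{bmatrix} = \begin{bmatrix} -\alpha^2 k e^{-\alpha T} Q^{\beta} & \alpha \beta k e^{-\alpha T} Q^{\beta - 1} - \delta c^q R Q^{\delta - 1} \\ \alpha \beta k e^{-\alpha T} Q^{\beta - 1} - \delta c^q R Q^{\delta - 1} & \beta (\beta - 1) k \left(1 - e^{-\alpha T}\right) Q^{\beta - 2} - \delta (\delta - 1) c^q R Q^{\delta - 2} T \end{bmatrix} \tag{36} \]

\[ D = \frac{\partial^2 P}{\partial T^2} \cdot \frac{\partial^2 P}{\partial Q^2} - \left(\frac{\partial^2 P}{\partial T \partial Q}\right)^2 \tag{37} \]

The following conditions can be evaluated to determine optimality:
• If D > 0 and second derivatives are positive at (\(T^{OPT}\), \(Q^{OPT}\)), the point is a relative minimum.
• If D > 0 and second derivatives are negative at (\(T^{OPT}\), \(Q^{OPT}\)), the point is a relative maximum.
• If D < 0, the point is a saddle point, and if D = 0, higher order tests must be used.

Illustrative Example 1: A firm wishes to promote a product to listed customers. Targeting the entire list is expected to yield $1 million. The list covers 25 years with an average of 10,000 customers added per year. The more recent a customer, the higher is the acceptance chance, with a marginal exponential decline rate of 0.2. Often, customer information is inaccurate and damages promotion efforts with a sensitivity rate of 2. Raising quality level to 100 percent accuracy is viable, but expensive ($6 per record, with negligible increase for older data). However, a cheaper mix of automated and manual procedures can lower cost, with a quality cost sensitivity factor of 5. Since the list is already in place, fixed costs and per-record marginal cost are negligible. To optimize the expected profit, two questions must be addressed: (1) how many years of customer data should be included, and (2) what data quality level should be targeted?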
Translation of the given problem to the model parameters yields
• I = 1, since a single promotion is analyzed, and k = $1 million, the maximal value possible.
• \(\alpha\) = 0.2, the value sensitivity to time, and \(\beta\) = 2, the value sensitivity to quality level.
• R = 10,000, the average number of records per year.
• \(c^q\) = $6, perfect quality cost per record.
• \(\delta\) = 5, cost sensitivity to quality level.
• \(c^f = c^l = \theta = 0\), negligible fixed and per-record costs and quality cost sensitivity to time.

The solutions for three model cases are demonstrated: (1) optimizing the time T for a perfect quality level (Q = 1), (2) optimizing the quality level Q for the entire database (T = 25), and (3) optimizing T and Q simultaneously.

Case 1: Applying the values yields \(C(T) = 60{,}000\,T\), \(U(T) = 1{,}000{,}000\,(1 - e^{-0.2T})\), and \(P(T) = 1{,}000{,}000\,(1 - e^{-0.2T}) - 60{,}000\,T\) (Figure 2). The optimal time span \(T^{OPT}\) can be obtained from (23): \(T^{OPT}\) = 6.02. The maximal profitability is estimated at $338,808.

[Figure 2. Cost, Value, and Profit by Time Span (Q = 1). Panel title: Time Span Optimization; x-axis: Time Span (T); y-axis: $ Amount; series: Value, Cost, Profit.]

Case 2: Applying T = 25 yields \(C(Q) = 1{,}500{,}000\,Q^5\), \(U(Q) = 1{,}000{,}000\,(1 - e^{-5})\,Q^2\), and \(P(Q) = 1{,}000{,}000\,(1 - e^{-5})\,Q^2 - 1{,}500{,}000\,Q^5\) (Figure 3). The optimal quality \(Q^{OPT}\) can be obtained from (29): \(Q^{OPT}\) = 0.64, with maximal profitability estimated at $245,793.

[Figure 3. Cost, Value, and Profit by Quality Level (T = 25). Panel title: Quality Level Optimization; x-axis: Quality Level (Q); y-axis: $ Amount; series: Value, Cost, Profit.]

Case 3: Here \(C(T, Q) = 60{,}000\,T Q^5\), \(U(T, Q) = 1{,}000{,}000\,(1 - e^{-0.2T})\,Q^2\), and \(P(T, Q) = 1{,}000{,}000\,(1 - e^{-0.2T})\,Q^2 - 60{,}000\,T Q^5\). Solving (34) and (35) numerically yields an optimal time span \(T^{OPT}\) = 8.09 (equivalent to the most recent 80,900 dataset records), \(Q^{OPT}\) = 0.87 and expected profit of $364,879. As expected, this profit is higher than the results obtained from optimizing the time span or the quality level alone.

For sensitivity analysis, first Q is fixed at \(Q^{OPT}\) = 0.87 and the effect of T variation is illustrated in Table 1. Within a range of ~1 year from the \(T^{OPT}\), profitability decline stays within ~1 percent. Within a range of ~2 years, the decline is less than 5 percent, and within a range of ~3 years, the decline is less than 10 percent. Second, T is fixed at \(T^{OPT}\) = 8.09 and the effect of Q variation is illustrated in Table 2. Within a Q range of [0.85, 0.9], the profitability decline is less than 1 percent.
Within a range of [0.8, 0.95], the decline is less than 5 percent, but within a range of [0.75, 1] the decline can exceed 10 percent. The optimal solution appears to be fairly robust. Keeping T and Q within a reasonable range around the optimum yields a relatively small deviation from optimal profitability.
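The three cases of Illustrative Example 1 can be reproduced with a few lines of code. The sketch below is a minimal Python illustration, not the Excel/Solver procedure used in the study: it plugs the example's parameter values into (23) and (29), and solves (34) by simple bisection (an assumed substitute for the solver) before applying (35). The function names are hypothetical.

```python
import math

# Parameter values of Illustrative Example 1 (c_f = c_l = theta = 0).
K, ALPHA, BETA = 1_000_000.0, 0.2, 2.0
R, C_Q, DELTA = 10_000.0, 6.0, 5.0


def profit(T, Q):
    """Profit (16) for a single usage with no fixed or linear cost."""
    return K * (1.0 - math.exp(-ALPHA * T)) * Q ** BETA - C_Q * R * Q ** DELTA * T


def case1():
    """Optimize T alone at Q = 1, using the closed form (23)."""
    T = math.log(ALPHA * K / (R * C_Q)) / ALPHA
    return T, profit(T, 1.0)


def case2(T=25.0):
    """Optimize Q alone for the full 25-year list, using the closed form (29)."""
    base = BETA * K * (1.0 - math.exp(-ALPHA * T)) / (DELTA * C_Q * R * T)
    Q = base ** (1.0 / (DELTA - BETA))
    return Q, profit(T, Q)


def case3(lo=1.0, hi=50.0, iters=200):
    """Optimize T and Q together: bisection on (34), then Q from (35)."""
    def g(T):
        return math.exp(ALPHA * T) - 1.0 - (ALPHA * DELTA / BETA) * T
    # g is negative just above T = 0 and positive for large T, so the
    # bracket [lo, hi] isolates the positive root and skips the root at T = 0.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    T = 0.5 * (lo + hi)
    Q = (ALPHA * K * math.exp(-ALPHA * T) / (C_Q * R)) ** (1.0 / (DELTA - BETA))
    return T, Q, profit(T, Q)


T1, P1 = case1()
Q2, P2 = case2()
T3, Q3, P3 = case3()
print(f"Case 1: T = {T1:.2f}, profit = {P1:,.0f}")
print(f"Case 2: Q = {Q2:.2f}, profit = {P2:,.0f}")
print(f"Case 3: T = {T3:.2f}, Q = {Q3:.2f}, profit = {P3:,.0f}")

# Sensitivity of profit to T with Q held at its Case 3 optimum.
for T in (5, 6, 7, 8, 9, 10, 11):
    print(T, round(profit(T, Q3)))
```

Holding Q at the unrounded Case 3 value, rather than at 0.87, is what keeps the sensitivity loop close to the reported profit figures; rounding Q first shifts each profit by a few hundred dollars.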
Table 1. Sensitivity to Time-Span Variation (Q = 0.87)

Time Span (T)     Expected Profit ($)   Margin from Optimum ($)   Decline (%)
5                 329,132               -35,747                   -9.80%
6                 349,652               -15,227                   -4.17%
7                 361,005               -3,874                    -1.06%
8                 364,852               -27                       -0.01%
8.09 (Optimum)    364,879               0                         0%
9                 362,554               -2,325                    -0.64%
10                355,255               -9,624                    -2.64%
11                343,776               -21,103                   -5.78%

Table 2. Sensitivity to Quality Level Variation (T = 8.09)

Quality Level (Q)   Expected Profit ($)   Margin from Optimum ($)   Decline (%)
0.7                 311,291               53,588                    -14.69%
0.75                335,803               29,076                    -7.97%
0.8                 354,057               10,822                    -2.97%
0.85                363,865               1,014                     -0.28%
0.87 (Optimum)      364,879               0                         0%
0.9                 362,744               2,135                     -0.59%
0.95                347,903               16,976                    -4.65%
1                   316,224               48,655                    -13.33%

Optimizing the Field Structure

Potential closed-form solutions for optimal field structure were not explored in this study. However, optimization that considers field structure together with T and Q, as in (17), can be numerically obtained with the appropriate software.

Illustrative Example 2: The same firm considers enhancing its customer list with external information that can support additional decision tasks. The candidate enhancements are
1. Marital status
2. Number of children
3. Years of education
4. Neighborhood ranking
5. Credit status
6. Value of houses owned
7. Value of cars owned
8. Value of appliances owned

Four decision tasks are evaluated, each with a different potential value contribution and a different set of additional fields required.
1. Task 1 has a value potential of $1,000,000, and requires fields (1), (2), (3) and (5).
2. Task 2 has a value potential of $1,000,000, and requires fields (1), (2), (3) and (6).
3. Task 3 has a value potential of $[…], and requires fields (1), (2), (4) and (7).
4. Task 4 has a value potential of $[…], and requires fields (1), (2), (4) and (8).

Value has a time sensitivity factor of 0.25 and a quality sensitivity factor of 2. The fixed cost of enhancement is $100,000, plus $10,000 per field added. The per-record linear cost has a fixed component of $0.2, plus $0.1 per field added. The per-record quality cost has a fixed component of $10, plus $2 per field added. To optimize profit, the following questions need addressing: (1) Which fields, among the candidates, should be added to the dataset? (2) How many years of data should be included? (3) What data quality level should be targeted?
Translation of the given problem to the model parameters yields:
From the illustrative example 1: R = 1,, * = 5, and =.
I = 4, since four tasks are considered.
k1 = k2 = $1,,, and k3 = k4 = $,.
{" = .5}, 1..4, the value sensitivity to time, and {$ = }, 1..4, the sensitivity to quality level.
Fixed cost components: c = $1,, c = $1,.
Linear cost components: c = $., c = $.1.
Quality cost components: c = $1, c = $.
The sensitivity factors of each task to each field are presented in the following matrix:

Task / Field   1   2   3   4   5   6   7   8
1              1   1   1   0   1   0   0   0
2              1   1   1   0   0   1   0   0
3              1   1   0   1   0   0   1   0
4              1   1   0   1   0   0   0   1

The optimal solution can be obtained by considering (17) as the objective function.

1. Keeping T OPT = 8.09 and Q OPT = 0.87, as previously obtained, the optimization suggests including fields (1, 2, 3, 5, 6) and excluding (4, 7, 8), which implies that only tasks (1) and (2) will be supported, since (3) and (4) depend on excluded fields. The optimal profitability is estimated to be $3,41.

2. Reoptimizing T and Q as well. The optimization here suggests the same mix of fields. However, the optimal T OPT = 5.85 and Q OPT = .8 are now different. The estimated optimal profitability in this case is $49,361, significantly higher than the first result.

The results make intuitive sense: the two more profitable tasks were supported with the required fields, while the two less profitable tasks were omitted because their additional cost was too high.

Conclusions and Directions for Future Research

Data management is important to business firms. While today it is driven by technical and functional efficiency, a stronger inclusion of the economic perspective is encouraged. This study suggests that data management ought to align with profit-maximization goals, and contributes to this suggestion with the proposed profit-maximization framework. The framework maps the partial/technical characteristics to the implementation costs and to the value created within contextual usage. Bringing those aspects together allows profit maximization through optimal configuration of the characteristics. From a business perspective, the framework provides a powerful tool by consolidating value and cost; it allows trade-off assessments toward profit maximization.
From the technical perspective, it informs the design of data management systems by attributing value, cost, and profitability to partial characteristics. The framework is demonstrated through optimization of an IP with a tabular data structure. The model illustrates cost/benefit trade-offs with key tabular dataset design characteristics: the time span, the quality level, and the field structure. As demonstrated, more is not necessarily better: increasing the number of records, adding more fields, and approaching perfect quality may have functional and technical merits, but may not necessarily be optimal for profitability.

This study offers a range of opportunities for future research. Some are specific to the microeconomic framework proposed, while others take a broader theoretical perspective. The microeconomic model allows many possible extensions, as discussed earlier. Improving its contribution and usability requires identifying influential design characteristics, modeling their cost/benefit effect, and assessing profit optimization accordingly. Such modeling will be challenging given the issues with nonlinearity, complex constraints, stochastic behavior, considerations of current versus future goals, and dynamic behavior. For this reason, obtaining a linear-programming formulation or closed-form solutions is not likely. Alternatively, more advanced methods such as nonlinear optimization, mixed-integer programming, dynamic programming, conjoint analysis, or the real-option approach can be examined.

Microeconomic modeling is applied within academic disciplines other than IS, and the profitability framework could benefit from better synergy with these. Two such bodies of research are referenced here. The first is product design, which uses microeconomic models to optimize products, services, and production lines. Profit optimization of the DMP/IP coincides with the kind of problems addressed by product design research; hence, the models and analytical methods that have been proposed may be applicable to the IS/IT setting. The second body of research is data mining, in which microeconomic models are applied to value-directed information search in large datasets. Such models can contribute to a better explanation of the utility obtained from data consumption, an important part of the profitability framework that has not been sufficiently explored by IS research.

Empirical studies can help assess the value and cost associated with integrating data within business processes. Quantifying those factors can be challenging; processes are complex and involve complementary resources (Davern and Kauffman 2000). This may require exploring techniques for business process mapping and value attribution. Empirical studies can also help identify the more influential characteristics in complex systems, so that modeling can focus on a limited, but important, input set of characteristics. Another aspect to be empirically confirmed is the assumed functional forms for value and cost mapping and the calibration of their parameters.

Data management ought to be linked more robustly to the economic view of IS/IT. Data management technologies are broadly researched, and so are information value and information economics. However, value and profitability are rarely addressed in the context of data management, and technological characteristics of data management are not commonly discussed in information economics or information value research. This gap can be redressed by understanding the value contribution of data and how gains are affected by technological characteristics. A possible approach is to develop a data valuation taxonomy, identifying aspects of value contribution, detecting factors that appear to have strong explanatory power, and modeling the value contribution along those factors. Factors that appear to influence IS/IT value in a broader context often align along the distinction between the operational and the strategic: internal use versus external managerial scope (West and Courtney 1993), current versus future goals (Davern and Kauffman 2000), and uncertainty level (Sulganik and Zilcha 1996).

Finally, design for value and profitability is important not only to data management, but to system design in a broader sense.
Although in recent years the economic aspects of IS (cost, value, and profitability) have drawn significant attention, it is hard to detect the influence of these concepts in IS design. This may contribute to the often-heard arguments of misalignment and disconnect between business goals and IT. The suggested framework brings the economic considerations to the forefront as goals that ought to direct optimal technical design. Such an approach, if proven to be useful, can better inform system design and architectural choices.

References

Ahituv, N. "A Systematic Approach Towards Assessing the Value of Information System," MIS Quarterly (4:4), 1980, pp. 61-75.
Arya, A., Glover, J. C., and Sivaramakrishnan, K. "The Interaction Between Decision and Control Problems and the Value of Information," The Accounting Review (72:4), 1997, pp. 561-574.
Ballou, D. P., and Pazer, H. L. "Designing Information Systems to Optimize the Accuracy-Timeliness Tradeoff," Information Systems Research (6:1), 1995, pp. 51-72.
Ballou, D. P., and Pazer, H. L. "Modeling Completeness versus Consistency Tradeoffs in Information Decision Systems," IEEE Transactions in Knowledge Management and Data Engineering (15:1), 2003, pp. 240-243.
Ballou, D. P., Wang, R., Pazer, H., and Tayi, G. K. "Modeling Information Manufacturing Systems to Determine Information Quality," Management Science (44:4), 1998, pp. 462-484.
Banker, R. D., and Kauffman, R. J. "The Evolution of Research on Information Systems: A Fiftieth-Year Survey of the Literature in Management Science," Management Science (50:3), 2004, pp. 281-298.
Churchman, C. W. The Design of Inquiring Systems: Basic Concepts of Systems and Organizations, Basic Books, New York, 1971.
Cooper, R., and Slagmulder, R. "Achieving Full-Cycle Cost Management," MIT Sloan Management Review (46:1), 2004, pp. 45-52.
Davern, M. J., and Kauffman, R. J. "Discovering Potential and Realizing Value from Information Technology Investments," Journal of MIS (16:4), 2000, pp. 121-143.
Devaraj, S., and Kohli, R. "Performance Impacts of Information Technology: Is Actual Usage the Missing Link," Management Science (49:3), 2003, pp. 273-289.
Easton, F. F., and Pullman, M. E. "Optimizing Service Attributes: The Seller's Utility Problem," Decision Sciences (32:2), 2001, pp. 251-275.
Eriksen, S. E., and Berger, P. D. "A Quadratic Programming Model for Product Configuration Optimization," Zeitschrift für Operations Research (31:2), 1987, pp. 143-159.
Glazer, R. "Measuring the Value of Information: The Information-Intensive Organization," IBM Systems Journal (32:1), 1993, pp. 99-110.
Henderson, J. C., and Venkatraman, N. "Strategic Alignment: Leveraging Information Technology for Transforming Organizations," IBM Systems Journal (32:1), 1993, pp. 4-16.
Hilton, R. W. "The Determinants of Information Value: Synthesizing Some General Results," Management Science (27:1), 1981, pp. 57-64.
Kleinberg, J., Papadimitriou, C., and Raghavan, P. "A Micro-Economic View of Data Mining," Data Mining and Knowledge Discovery (2:4), 1998, pp. 311-324.
Kohli, R., and Sukumar, R. "Heuristics for Product-Line Design Using Conjoint Analysis," Management Science (36:13), 1990, pp. 1464-1478.
Lee, H. L., So, K. C., and Tang, C. S. "The Value of Information Sharing in a Two-Level Supply Chain," Management Science (46:5), 2000, pp. 626-643.
Pipino, L. L., Lee, Y. W., and Wang, R. Y. "Data Quality Assessment," Communications of the ACM (45:4), 2002, pp. 211-218.
Raju, J. S., and Roy, A. "Market Information and Firm Performance," Management Science (46:8), 2000, pp. 1075-1084.
Redman, T. C. Data Quality for the Information Age, Artech House, Boston, MA, 1996.
Shankaranarayanan, G., and Even, A. "Managing Metadata in Data Warehouses: Pitfalls and Possibilities," Communications of the AIS (2004:14), 2004, pp. 247-274.
Shankaranarayanan, G., Ziad, M., and Wang, R. Y. "Managing Data Quality in Dynamic Decision Making Environments: An Information Product Approach," Journal of Database Management (14:4), 2003, pp. 14-32.
Sulganik, E., and Zilcha, Y. "The Value of Information in the Presence of Futures Markets," Journal of Futures Markets (16:2), 1996, pp. 227-240.
Wang, R. Y. "A Product Perspective on Total Data Quality Management," Communications of the ACM (41:2), 1998, pp. 58-65.
West, L. A. Jr., and Courtney, J. F. "The Information Problems in Organizations: A Research Model for the Value of Information and Information Systems," Decision Sciences (24:2), 1993, pp. 229-251.
Wixom, B. H., and Watson, H. J. "An Empirical Investigation of the Factors Affecting Data Warehousing Success," MIS Quarterly (25:1), 2001, pp. 17-41.
Yigit, A. S., Ulsoy, A. G., and Allahverdi, A. "Optimizing Modular Product Design for Reconfigurable Manufacturing," Journal of Intelligent Manufacturing (13:4), 2002, pp. 309-316.