The modelling of business rules for dashboard reporting using mutual information

18th World IMACS / MODSIM Congress, Cairns, Australia, 13-17 July 2009
http://mssanz.org.au/modsim09

Gregory Calbert
Command, Control, Communications and Intelligence Division, Defence Science and Technology Organisation, Edinburgh, South Australia, 5111.
Email: greg.calbert@dsto.defence.gov.au

Abstract: The role of a business or military dashboard is to form a succinct picture of the key performance indicators that govern an organisation's dynamics or effectiveness. These indicators are based on the judgments of different experts. Variables from one or several databases, along with the subjective opinion of the expert, are fused or amalgamated to form the so-called key performance indicators. A principal example of such a dashboard is the reporting that occurs in the defence preparedness system. Judgments, based on data or opinions within lower level units, are formed and then fused through the defence hierarchy. The aim of this system is to form a concise picture of the strategically important preparedness issues pertinent to senior leadership. The issue that we begin to address in this paper is the analysis of just how informative such business dashboards can be. There are three principal reasons why the information conveyed by the dashboard to senior boards or defence leadership may be uninformative. The first is that the selection of the key performance variables may not capture the true dynamics of the organisation's state over time. The second is that the models or algorithms used to map fundamental database variables onto key performance indicators may be wrong. Finally, even if the models are correct, the key performance indicator may not convey sufficient information as to the state of the organisation. In this paper we discuss the last issue: the information conveyed through the application of business rules to form the key performance indicator. A number of examples will suffice to illustrate the loss of information.
Important information might be lost when variables are fused or amalgamated due to the requirement for brevity. Take for example overall profit. While the profit for an organisation or division may be good, the successes of some branches may hide a critical loss in others. A simpler example is the requirement for hierarchical reporting. Reports are amalgamated at the branch level and then at the division level, with inevitable loss of information. For this paper we apply the information-theoretic concept of mutual information between the performance indicators (the state) at one level of the organisational hierarchy and the corresponding state, formed by the business rule for fusion, at a higher level. The analysis is done for a number of business rules. The business rules analysed are the commonly applied report by exception rule and the report by majority rule. By calculation of the mutual information, one is able to quantify just how much information is lost through the application of such rules as reports propagate upward in the hierarchy. We draw some conclusions as to which business rule is more appropriate in organisations that are relatively static, with few key performance indicator changes, versus organisations which are highly dynamic, where such indicators may change often. A generalised business rule is then defined. This work forms the basis for the measurement of the effectiveness of the reporting system itself. Measures of system bias are formed that suggest the degree of effectiveness of the overall system in communicating the defence system's overall state to the apex of the hierarchy, that is, to senior leadership.

Keywords: Organisational modelling, information theory, hierarchy, business rule, fusion, threshold.

1. INTRODUCTION

The defence preparedness system informs senior leadership as to the overall state of the organisation in relation to the readiness of units and the ability to conduct specific operations. Reports regarding the state of a number of independent key performance indicators (KPI) are formed at lower levels of the hierarchy; using the Army as an example, this is typically at the company level. These reports may be in regard to the state of the facilities or stocks at different barracks. Though there may be some minimal dependence between the states of KPI at the different levels of the hierarchy, for pragmatic reasons they are assumed to be independent. They are then fused at the battalion, then the brigade and finally the output group level to form a picture, termed in business a dashboard, to inform senior leadership. Brevity and clarity are paramount and therefore the key performance indicators are drawn from a discrete set. An example of such a set is the traffic light indicator form of green for adequate performance, and yellow and red for moderate and inadequate performance respectively. For simplicity of analysis we will only consider the binary valued green and red KPI states of adequate and inadequate performance.

As illustrated in Figure 1, independent reports at the lower level of the hierarchy are amalgamated through the application of a business rule to form an overall judgement at the next level of the hierarchy. This process is repeated to the apex of the hierarchy or to some depth at a higher level. The aim of this paper is to quantify the performance of the business rules used to form the judgments. We use the information-theoretic concept of mutual information (Shannon, 1948) to quantify such performance. A discussion as to the use of this measure against other measures will be made. It should be emphasised that the application of mutual information in control and to understand networks is not new.
Indeed, the communications theory subject of rate distortion is devoted to the study of information loss under various compression algorithms. There is an extensive literature on information theory and data fusion (Tay, 2008). Recently, mutual information has been applied to quantifying information propagation in general complex networks (Luque, 1997), neurobiological circuits and networks of gene dependency in biology (Butte, 2000). Our application here is in the specific context of defence preparedness, in order to draw conclusions regarding the different business rules applied.

2. MEASURES OF EFFECTIVENESS

Before proceeding, it is worth defining the notation that will be used to model the information propagation in the defence preparedness system. We define the state of a single node in the preparedness hierarchy (that is, a unit such as a company lower in the hierarchy, or a brigade or output group near the apex) to be $X \in \{R, G\}$, where $R$ is the state red and $G$ is the state green, for inadequate and adequate performance respectively. The levels of the reporting hierarchy are labelled from lowest to highest as $n = 0, 1, \ldots, N$. A combined state at level $n$ of the hierarchy (a series of independent red and green reports) $X_n = (X_{n1}, X_{n2}, \ldots, X_{nm})$ is associated with the coordinator or fuser at the next level $n + 1$. Here $m$ is the number of performance indicators linked to the fusion state in the next level of the hierarchy, the branching factor. Without loss of generality we label the combined state and the fusion state as $X$ and $Y$ respectively, as shown in Figure 1. The state $X$ is termed the subordinate state to $Y$. Business rules are defined to be Boolean functions that map the subordinate level combined state onto the fusion state, $B : X \to Y$. The space of all business rules is termed $\mathcal{B}$. Thus the business rule is defined between the subordinate states to form the fused state of the coordinator.

Figure 1. Schematic representation of the defence preparedness system, in which the independent lower level states labelled $X$ are fused through a business rule $B$ to the state $Y$, the states being good (green) or bad (red).
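
The notation above can be made concrete with a small sketch. The Python fragment below is an illustration under assumptions that are not in the paper: states are encoded as the strings "R" and "G", the hierarchy is a complete tree with branching factor m, and the names `fuse_level`, `fuse_hierarchy` and `any_red` are hypothetical. A business rule is treated as any function from a block of m subordinate states to one fused state.

```python
from typing import Callable, List, Sequence

# A business rule B maps m subordinate states onto one fused state.
Rule = Callable[[Sequence[str]], str]


def fuse_level(states: Sequence[str], m: int, rule: Rule) -> List[str]:
    """Apply `rule` to consecutive blocks of m subordinate states."""
    if m < 2 or len(states) % m != 0:
        raise ValueError("need m >= 2 and a whole number of blocks")
    return [rule(states[i:i + m]) for i in range(0, len(states), m)]


def fuse_hierarchy(base: Sequence[str], m: int, rule: Rule) -> str:
    """Fuse level-0 states upward until a single apex state remains."""
    level = list(base)
    if not level:
        raise ValueError("at least one base node is required")
    while len(level) > 1:
        level = fuse_level(level, m, rule)
    return level[0]


def any_red(block: Sequence[str]) -> str:
    """Illustrative rule (an assumption): red if any subordinate is red."""
    return "R" if "R" in block else "G"


# Nine base nodes, branching factor 3, a single red report at level 0.
base = ["G"] * 8 + ["R"]
first_level = fuse_level(base, 3, any_red)
apex = fuse_hierarchy(base, 3, any_red)
```

Passing the rule as a function keeps the tree traversal independent of the particular business rule, so different rules can be compared on the same base states.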

Now we turn our attention to a discussion of the measures of performance for the preparedness system. One could argue that there are two broad measures. One is to propagate as many reports of inadequate performance through to the next level of the hierarchy as possible. Framed as an optimisation problem, we look for a business rule $B$ that maximises the probability of reporting a red state through to the next level of the hierarchy,

$$\max_{B \in \mathcal{B}} \Pr(\text{red reported in level } Y \mid \text{red in } X).$$

It is clear that the business rule termed report by exception maximises effectiveness according to this criterion. Under the report by exception rule, the coordinator reports red at a node if any one of its subordinate nodes is red. This is the business rule commonly applied in defence preparedness reporting.

Another approach is to find a business rule that reflects, through changes in the fused state $Y$, the changing states at the subordinate level $X$. Intuitively, consider two organisations. In the first, the system is stable and performance is generally good. Here, red reports are rare. Reports to the apex of a multi-level hierarchy will contain a mix of a few inadequately performing divisions amongst the sound performers. Now consider an organisation in a state of change or urgency. Here, there are many units, branches or divisions with inadequate performance. Applying the report by exception rule will mean that, over the levels, the confluence of reports at the apex of the hierarchy will only show inadequate performance across all the divisions, without revealing the changing dynamics of the nodes below. In the limit, there need only be one report of red for the whole system to record inadequate performance (if all nodes including the apex apply the exception rule), which will not be reflective of the performance of all the nodes below. To emphasise this point it is worth forming a novel analogy.
If all traffic lights within a metropolitan traffic system were red all the time, their effectiveness in the control of traffic would cease, as they would ultimately be ignored. To prevent this problem of reporting in dynamic and changing organisations, we propose using the mutual information between levels of the hierarchy as the measure of effectiveness. Suppose that the evolution of the subordinate state over time is drawn from a probability distribution $\Pr(X)$. The measure of the information (in bits) of the subordinate state is the information entropy

$$H(X) = -\sum_{X} p(X) \log_2 p(X).$$

The entropy is a measure of how much information the random variable $X$ conveys over time (Shannon, 1948). Between the subordinate level $X$ and the fusion state $Y$ the mutual information is defined as

$$I(X, Y) = H(X) - H(X \mid Y).$$

This function is symmetric and measures how the dynamic changes of $X$ over time are reflected in $Y$. We can observe this by noting that if $X$ and $Y$ are independent then $H(X \mid Y) = H(X)$, making the mutual information zero. If $Y$ reproduces $X$ exactly then $H(X \mid Y) = H(X \mid X) = 0$, so the mutual information attains its maximum of $H(X)$. We thus seek a business rule or series of business rules that maximise the mutual information between levels of the hierarchy, as in the commonly formed capacity equation (Shannon, 1948)

$$\max_{B \in \mathcal{B}} I(X, Y).$$

3. THE MUTUAL INFORMATION OF TWO BUSINESS RULES

We will derive the mutual information of two business rules in this section. The first is the aforementioned report by exception rule. We then consider the report by majority rule. For this rule, the reporter in the fusion node above observes all subordinate node states. If there are more green states than red then a report of green is made, otherwise red is reported. Both rules are juxtaposed in Figure 2. Of course there will be many other rules that may be applicable. For example, there may be special subordinate nodes that have more weight in the reporting than others. We do not consider this, as all nodes are considered equally important.
More important nodes may be expanded to multiple nodes with more detailed representation. Generally, we consider threshold business rules (Irving, 1994), where, once some specified fraction of red reports is exceeded, the lowest common ancestor reports red.
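
The two rules and their common threshold form can be sketched as plain Boolean functions. This is a minimal illustration, not the paper's implementation: the string encoding of states and the function names are assumptions, and the majority threshold is written as the ceiling of m/2 so that green is reported only when green reports strictly outnumber red ones.

```python
from math import ceil
from typing import Sequence

RED, GREEN = "R", "G"


def threshold_rule(states: Sequence[str], a: int) -> str:
    """Report red when at least `a` subordinate reports are red."""
    reds = sum(1 for s in states if s == RED)
    return RED if reds >= a else GREEN


def exception_rule(states: Sequence[str]) -> str:
    """Report by exception: a single red subordinate is enough."""
    return threshold_rule(states, 1)


def majority_rule(states: Sequence[str]) -> str:
    """Report by majority: green only if greens outnumber reds."""
    return threshold_rule(states, ceil(len(states) / 2))
```

For the subordinate reports `["G", "G", "R"]` the exception rule reports red while the majority rule reports green, which is the contrast the two rules are meant to capture.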

Our aim is to examine how these two business rules function in conveying information to the next level of the hierarchy under different levels of organisational uncertainty. Drawing from the examples of Section 2, organisations that face certain environments have low entropy, whilst high uncertainty implies rapid changes over time in the value of the key performance indicators, that is, high entropy. To calculate the mutual information for the two rules, a model of the key performance indicator values (the value of the nodes) must be formed at level 0 of the hierarchy (the base nodes), the interface between the organisation and the environment. Our approach is to define a set of distributions parameterised by entropy value in the following way. Assume there are $m$ subordinate nodes that report to the coordinator or fuser. Suppose $X_0 = (X_{01}, X_{02}, \ldots, X_{0m})$. In order to tune the distribution over $X_0$ to have entropy $h$, the first $m - h$ nodes are set or frozen to the green state, $X_{01} = X_{02} = \cdots = X_{0(m-h)} = G$, with probability 1. The remaining $h$ nodes are set to have either the green or red state with probability $1/2$. Thus

$$\Pr(X_0) = \Pr(X_{m-h+1}) \Pr(X_{m-h+2}) \cdots \Pr(X_m), \qquad \Pr(X_i = G) = 1/2, \quad i = m-h+1, \ldots, m.$$

It is easy to show that the total entropy of the subordinate nodes defined by this distribution is $h$ bits. The base nodes that can switch between the red and green states with probability $1/2$ supply the uncertainty into the hierarchy. Thus the entropy $h$ is both the measure of the uncertainty of the organisation and the number of subordinate nodes that can switch between the red and green states. With this explicit model distribution, we are able to calculate the mutual information precisely for the two business rules. The effectiveness of each rule can then be examined by varying the entropy $h$, that is, by changing the number of nodes that can switch between the red and green states.
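
The claim that this distribution carries exactly h bits can be checked numerically. The sketch below enumerates the model distribution and evaluates its entropy directly; the function names and the tuple-of-strings encoding are assumptions made for illustration only.

```python
from itertools import product
from math import log2
from typing import Dict, Tuple


def base_distribution(m: int, h: int) -> Dict[Tuple[str, ...], float]:
    """Model distribution over m base nodes: the first m - h nodes are
    frozen green, the last h nodes are red or green with probability 1/2."""
    if not 0 <= h <= m:
        raise ValueError("h must lie between 0 and m")
    frozen = ("G",) * (m - h)
    p = 0.5 ** h  # every reachable combined state is equally likely
    return {frozen + tail: p for tail in product("RG", repeat=h)}


def entropy_bits(dist: Dict[Tuple[str, ...], float]) -> float:
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


dist = base_distribution(11, 4)
total_entropy = entropy_bits(dist)
```

With eleven nodes and four of them switching, the distribution has 16 equally likely states, so the computed entropy is 4 bits, equal to the number of switching nodes.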
To calculate the mutual information, the conditional entropy between the fusion node states and the subordinate node states,

$$H(X \mid Y) = -\sum_{Y = R, G} \Pr(Y) \sum_{X} \Pr(X \mid Y) \log_2 \Pr(X \mid Y),$$

is evaluated. We omit the details of the derivation, except to say that the conditional distributions are evaluated through the application of Bayes' Theorem, $\Pr(X \mid Y) = \Pr(Y \mid X) \Pr(X) / \Pr(Y)$. Upon viewing the conditional probabilities, the two business rules can now be expressed in general form. For both rules the conditional probability is

$$\Pr(Y = R \mid X) = \begin{cases} 1, & X \in S_R, \\ 0, & \text{otherwise,} \end{cases}$$

where for the report by exception rule

$$S_{R,\text{exception}} = \{X : \text{the number of red subordinate reports} \geq 1\}$$

and for the report by majority rule, with $m$ subordinates reporting to the coordinator,

$$S_{R,\text{majority}} = \{X : \text{the number of red subordinate reports} \geq m/2\}.$$

Figure 2. Illustration of the exception and majority reporting rules. On the left hand side a red state is seen, so under the exception rule a red is reported for the fusion. On the right hand side there are more green reports than red, so under the majority rule a green is reported.

Put in this form, the reporting rule can be generalised to any threshold rule, as we will discuss in Section 4. The mutual information between the subordinate nodes and the coordinator node for the exception rule, given information entropy level $h$ of the subordinate nodes and the distribution specified above, can be shown to be

$$I_{\text{exception}}(X, Y) = h - \frac{2^h - 1}{2^h} \log_2\left(2^h - 1\right), \quad h > 0.$$

The report by majority rule can be shown to have mutual information

$$I_{\text{majority}}(X, Y) = \begin{cases} h - \dfrac{K}{2^h} \log_2 K - \dfrac{2^h - K}{2^h} \log_2\left(2^h - K\right), & h > m/2, \\ 0, & \text{otherwise,} \end{cases}$$

where $K$ is the number of combined subordinate states with total entropy $h$ in which the majority report is red (that is, in $S_{R,\text{majority}}$),

$$K = \binom{h}{\lceil m/2 \rceil} + \binom{h}{\lceil m/2 \rceil + 1} + \cdots + \binom{h}{h}.$$

The following figure plots the mutual information for the exception and the majority rules as a function of the information entropy, assuming that there are eleven such subordinate nodes.

Figure 3: Mutual information between subordinate and fusion nodes of the exception and majority reporting rules as a function of the subordinate nodes' total information entropy, which is the number of nodes that can switch between red and green states, given eleven subordinate nodes. (Axes: total subordinate entropy, the number of nodes that can switch states, against mutual information; series: exception, majority.)

It is clear from inspection of Figure 3 that the exception reporting rule is most effective when the information entropy is low. The interpretation for the preparedness reporting system is that the exception rule applies when most subordinate reports are consistently good (green) over time, reporting few inadequate key performance indicators. However, as the uncertainty increases the majority reporting rule becomes more informative as to the state of the subordinate nodes below. Analogously, this reflects an organisation where the base key performance indicators are in flux and change over time. Referring to Figure 3, in the left limit as the information entropy approaches one, any changes in the subordinate states are precisely reflected in the report of the lowest common ancestor, as there is only one subordinate state changing. Hence the mutual information is maximised for the exception rule. Conversely, in the right limit, when all subordinate states can change randomly and equiprobably, the majority rule will reflect these changes, though not precisely, as there is information loss.
The mutual information approaches the maximum of one bit (this is the maximum for the coordinator node, as it can only be in two states, red and green). The mutual information reaches this maximum value only because of our assumption that all subordinate nodes can change in an equiprobable way. If nodes can change between the red and green states with different probabilities then the mutual information for the majority rule will not reach one. We discuss this issue in the following section.
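
The behaviour described in this section can be reproduced with a short calculation. The sketch below is an illustration rather than the paper's derivation: it relies on the standard identity that, when Y is a deterministic function of X, the mutual information equals the entropy of Y, and it cross-checks that against H(X) - H(X|Y) with H(X) = h under the equiprobable model. All function names are hypothetical, and the majority threshold is taken as the ceiling of m/2.

```python
from math import ceil, comb, log2


def red_state_count(h: int, a: int) -> int:
    """Number of the 2**h equiprobable states with at least `a` reds."""
    return sum(comb(h, r) for r in range(max(a, 0), h + 1))


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-valued variable with Pr(red) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def mutual_information(h: int, a: int) -> float:
    """I(X, Y) in bits for the rule 'report red on at least a reds'."""
    return binary_entropy(red_state_count(h, a) / 2 ** h)


def mutual_information_via_conditional(h: int, a: int) -> float:
    """The same quantity computed as H(X) - H(X|Y), with H(X) = h."""
    total = 2 ** h
    k = red_state_count(h, a)
    # Given Y, X is uniform over the states consistent with that report.
    cond = sum((c / total) * log2(c) for c in (k, total - k) if c > 0)
    return h - cond


m = 11  # eleven subordinate nodes, as in the example of this section
table = [
    (h, mutual_information(h, 1), mutual_information(h, ceil(m / 2)))
    for h in range(1, m + 1)
]
```

Each row of `table` holds the entropy h followed by the exception and majority values. At h = 1 the exception rule conveys one full bit and the majority rule none; at h = 11 the majority rule reaches one bit while the exception rule conveys almost nothing.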

4. A GENERALISED THRESHOLD BUSINESS RULE

Through the preceding section we have illustrated that the exception business rule may not necessarily be effective in propagating the changing states of the subordinate nodes below. We modelled the majority business rule and found that, under circumstances of dynamic state change of subordinate nodes, this rule better reflects such a change in states. Both heuristics are versions of the general threshold business rule, determined by the conditional probability distribution $\Pr(Y = R \mid X)$, which is one (that is, the business rule reports the red state) under our distribution $\Pr(X_0)$ for the set

$$S_{R,a} = \{X : \text{the number of red subordinate reports} \geq a\}.$$

For the exception rule $a = 1$ and for the majority rule $a = m/2$, where $m$ is the number of subordinate nodes. It is natural to ask if there is an optimal generalised threshold business rule. Couched in communications theory terms, we seek a rule that maximises the capacity between the subordinate nodes and the state of the fusion node. Under our distribution $\Pr(X_0)$ defined earlier, we can calculate the mutual information for a general value of $a$. The resulting equation is identical to that of the mutual information for the majority rule except that the value of $K$ is replaced by a term dependent on $a$, this being the number of elements in $S_{R,a}$ with entropy $h$ bits,

$$K(a) = \binom{h}{a} + \binom{h}{a+1} + \cdots + \binom{h}{h}.$$

One need only cycle over the values $0 \leq a \leq h$ to find the optimum value of $a$. These values are shown in the following Figure 4.

Figure 4: Graph of the optimal number of red states $a$, given eleven subordinate nodes, before the business rule reports an overall state of red, as a function of the subordinate entropy. (Axes: total subordinate entropy, the number of nodes that can switch states, against $a$, the number of subordinate red states before reporting red.)

Referring to Figure 4, as per the observation in Figure 3, as the uncertainty in the organisation increases, the threshold number of red subordinate reports at which the business rule reports red also increases.
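
The search over thresholds can be sketched in a few lines. This is a hedged illustration under the equiprobable model of Section 3, not the paper's code: the function names are hypothetical, and the mutual information is evaluated as the entropy of the fused report, which is valid because the report is a deterministic function of the subordinate states.

```python
from math import comb, log2


def threshold_information(h: int, a: int) -> float:
    """Mutual information in bits when red is reported on at least
    `a` reds among h equiprobable switching nodes."""
    k = sum(comb(h, r) for r in range(a, h + 1))  # K(a)
    p = k / 2 ** h  # probability that the fused report is red
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def optimal_threshold(h: int) -> int:
    """Cycle over 0 <= a <= h and return an information-maximising a.

    For even h two neighbouring thresholds can tie; the smaller
    is returned when the computed values are equal.
    """
    return max(range(h + 1), key=lambda a: threshold_information(h, a))


# Optimal thresholds for one to eleven switching nodes.
optimal = {h: optimal_threshold(h) for h in range(1, 12)}
```

For odd h the optimum splits the states exactly in half, so the fused report carries a full bit; for even h the best threshold sits on either side of h/2.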
As an approximation, the threshold number of red reports is $a = [h/2]$.

We are able to calculate a generalised business rule under the assumption that all the nodes that can switch between red and green states do so with equal probability $1/2$ over time. If this assumption fails then, inevitably, the difficulty of calculating exact expressions for the mutual information will increase substantially. To overcome this problem, one need only employ Monte Carlo simulation for the given set of state switching probabilities to determine what the threshold will be. For that set of state switching probabilities $\Pr(X_i = G, R)$, $i = 1, \ldots, m$, the information entropy $h$ may be exactly determined. This gives us an upper bound on $a$ as $0 \leq a \leq h$. It is only the conditional probabilities that need to be estimated from simulation.

5. ESTIMATES OF BIAS IN THE REPORTING SYSTEM

From the discussion above, we are able to establish criteria by which reporting within the preparedness system is biased. This measure of bias will determine the overall system effectiveness. Longitudinal data on the state changes of each node in the current defence system are recorded, and therefore the entropy of all subordinate nodes can be estimated. There are two approaches to estimating the internal preparedness system bias, given longitudinal data. One is in reference to the judgments of other reports in the hierarchy, and the other is in reference to the deviation from the generalised business rule or the mandated business rule.

To measure the internal bias in reporting, one can estimate, for a particular level of the hierarchy, the probability of a fused state $\Pr(Y \mid X)$. Differences in the fused state probabilities can be calculated and summed to form the bias estimate over the whole hierarchy. Assuming that the subordinate states are the same, that is $X_{Y_i} = X_{Y_j}$ for two fused states $Y_i$ and $Y_j$, a measure of the preparedness system bias is

$$b = \sum_{\text{Hierarchy levels}} \sum_{i, j} \left| \Pr(Y_i \mid X_{Y_i}) - \Pr(Y_j \mid X_{Y_j}) \right|.$$

An alternative is to look at the deviation from the result of the general business rule. For a fusion node, an indicator function $\mathbf{1}_{Y^*}(Y)$ may be defined that is one when the fused state $Y$ matches the state $Y^*$ found from the business rule, and zero otherwise. Then one measure of the preparedness system bias, given the set of fusion nodes which we label $A$ (that is, all nodes except the base nodes of the hierarchy), is

$$b = |A| - \sum_{i \in A} \mathbf{1}_{Y_i^*}(Y_i).$$

6. DISCUSSION AND CONCLUSION

We have highlighted that the use of the report by exception rule, whilst appropriate for a preparedness system that has few problems, may not be the appropriate rule when the values of the key performance indicators show dynamic changes over time.
The majority business rule or the general threshold rule as discussed in Section 4 is more appropriate. In Defence, decision makers know the problems with the exception rule. At the moment there is considerable subjectivity and variability in the fusion decisions made, with limited guidance as to what rules should be applied across different circumstances. Decision makers have the option to fuse decisions based on exception, on averaged results from the subordinates, or subjectively. Data regarding the longitudinal change in states over time is recorded. As such, the necessary infrastructure is in place at present to apply the best threshold business rules as derived in this paper. This work should improve the guidance on what rules to apply to best relay preparedness information to senior leadership.

REFERENCES

Butte, A. J. and Kohane, I. S. (2000), Mutual Information Relevance Networks: Functional Genomic Clustering Using Pairwise Entropy Estimates. Pacific Symposium on Biocomputing, 415-426.

Irving, W. W. and Tsitsiklis, J. N. (1994), Some Properties of Optimal Thresholds in Decentralized Detection. IEEE Transactions on Automatic Control 39(4).

Luque, B. and Ferrera, A. (1997), Measuring Mutual Information in Random Boolean Networks. Complex Systems, 11.

Shannon, C. E. (1948), A Mathematical Theory of Communication. The Bell System Technical Journal 27, 379-423 and 623-656.

Tay, W. P., Tsitsiklis, J. N., and Win, M. Z. (2008), Data Fusion Trees for Detection: Does Architecture Matter? IEEE Transactions on Information Theory 54(9), 4155-4168.