8 Algorithm for Binary Searching in Trees




In this section we present our algorithm for binary searching in trees. A crucial observation employed by the algorithm is that this problem can be efficiently solved when the input is a path-like tree. This is true because it can be easily reduced to the well-solved problem of searching for a hidden marked element of a totally ordered set $U$ in a sorted list $L \subseteq U$ of elements, where each element of $U$ has a given probability of being the marked one [PS93] (see appendix). Due to this correspondence, an approximate strategy for searching in lists gives an approximation (with the same guarantee) for searching in path-like trees.

Motivated by this observation, the algorithm decomposes the input tree into special paths, finds decision trees for each of these paths (with modified weight functions) and combines them into a decision tree for the original tree. In our analysis, we obtain a lower bound on the optimal solution and an upper bound on the returned solution in terms of the costs of the decision trees for the paths. Thus, the approximation guarantee of the algorithm is basically a constant times the guarantee of the approximation used to compute the decision trees for the paths.

Throughout the text, we present the execution and analysis of the algorithm over an instance $(T, w)$, where $T$ is rooted at node $r$. Recall that for every node $u \in T$, the cumulative weight of $u$ is the sum of the weights of its descendants, namely $w(T_u)$. Moreover, recall that a heavy path $Q$ of $T$ is defined recursively as follows: $r$ belongs to $Q$; for every node $u$ in $Q$, the non-leaf child of $u$ with greatest cumulative weight also belongs to $Q$.

Let $Q = (q_1 \to \dots \to q_{|Q|})$ be a heavy path of $T$. We define $\bar T_{q_i} = T_{q_i} - T_{q_{i+1}}$ for $i < |Q|$, and $\bar T_{q_{|Q|}} = T_{q_{|Q|}}$. In addition, we define $T^j_{q_i}$ as the $j$th heaviest maximal subtree rooted at a child of $q_i$ not in $Q$ (Figure 8.1). Note that these definitions are slightly different from the ones presented in the previous part of this work. Finally, let $n_i$ denote the number of children of $q_i$ which do not belong to $Q$, and define $e^j_i$ as the arc of $T$ connecting the node $q_i$ to the subtree $T^j_{q_i}$.

Figure 8.1: Example of structures $\bar T_{q_i}$ and $T^j_{q_i}$.

Now we explain the high level structure of the solution returned by the algorithm. The main observation is that we can break the task of finding the marked node into three stages: first finding the node $q_i$ of $Q$ such that $\bar T_{q_i}$ contains the marked node, then querying the arcs $\{e^j_i\}_j$ to discover which tree $T^j_{q_i}$ contains the marked node or that the marked node is $q_i$, and finally (if needed) locating the marked node in $T^j_{q_i}$. The algorithm follows this reasoning: it computes a decision tree for the heavy path $Q$, then a decision tree for querying the arcs $\{e^j_i\}$, and recursively computes a decision tree for each tree $T^j_{q_i}$.

Now we present the algorithm itself, which consists of the following five steps:

(i) Find a heavy path $Q$ of $T$ and then for each $q_i \in Q$ define $w'(q_i) = w(\bar T_{q_i}) / w(T)$.

(ii) Calculate a decision tree $D_Q$ for the instance $(Q, w')$ using the approximation algorithm presented in [PS93].

(iii) Calculate recursively a decision tree $D^j_i$ for each instance $(T^j_{q_i}, w)$.

(iv) Build a decision tree $D_i$ for each $\bar T_{q_i}$ as follows. The leftmost path of $D_i$ consists of nodes corresponding to the arcs $e^1_i, \dots, e^{n_i}_i$, with a node $u_{q_i}$ appended at the end. In addition, for every $j$, $D^j_i$ is the right child of the node corresponding to $e^j_i$ in $D_i$ (Figure 8.2).

(v) Construct the decision tree $D$ for $T$ by replacing the leaf $u_{q_i}$ of $D_Q$ by $D_i$, for each $q_i \in Q$ (Figure 8.3).

It is not difficult to check that the decision tree $D$ computed by the algorithm is a valid decision tree for $T$.
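The decomposition used in Step (i) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation from the text: the dictionary-based tree representation and the names `cumulative_weights`, `heavy_path` and `decompose` are assumptions made here for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each node maps to the list of its children.
Tree = Dict[str, List[str]]


def cumulative_weights(tree: Tree, w: Dict[str, float], root: str) -> Dict[str, float]:
    """Return w(T_u) for every node u (weight of u plus all its descendants)."""
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(tree.get(u, []))
    cum = dict(w)
    for u in reversed(order):  # descendants are handled before their ancestors
        for c in tree.get(u, []):
            cum[u] += cum[c]
    return cum


def heavy_path(tree: Tree, cum: Dict[str, float], root: str) -> List[str]:
    """From the root, repeatedly follow the non-leaf child of greatest cumulative weight."""
    path = [root]
    while True:
        inner = [c for c in tree.get(path[-1], []) if tree.get(c)]
        if not inner:
            return path
        path.append(max(inner, key=lambda c: cum[c]))


def decompose(tree: Tree, w: Dict[str, float], root: str
              ) -> Tuple[List[str], Dict[str, List[str]], Dict[str, float]]:
    """Return (Q, hanging, w_prime).

    hanging[q] lists the roots of the subtrees hanging from q, heaviest first;
    w_prime[q] is the weight of q plus its hanging subtrees, divided by w(T).
    """
    cum = cumulative_weights(tree, w, root)
    Q = heavy_path(tree, cum, root)
    on_path = set(Q)
    hanging, w_prime = {}, {}
    for q in Q:
        roots = [c for c in tree.get(q, []) if c not in on_path]
        hanging[q] = sorted(roots, key=lambda c: -cum[c])
        w_prime[q] = (w[q] + sum(cum[c] for c in roots)) / cum[root]
    return Q, hanging, w_prime
```

The exact sort inside `decompose` is used only for clarity; it is the one step that is not linear-time. The normalized weights returned in `w_prime` sum to one, which is what allows the path instance of Step (ii) to be treated as a probability distribution.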

Figure 8.2: Illustration of Step (iv) of the algorithm. (a) Tree $\bar T_{q_i}$ with heavy path in bold. (b) Decision tree $D_i$ for $\bar T_{q_i}$.

Figure 8.3: Illustration of Step (v) of the algorithm. (a) Decision tree $D_Q$ built at Step (ii). (b) Decision tree $D$ constructed by replacing the leaves $\{u_{q_i}\}$ by the decision trees $\{D_i\}$.

8.1 Upper Bound

As the trees $\{T^j_{q_i}\}$ and $Q$ form a partition of the nodes of $T$, we analyze the distance of the root of $D$ to the nodes in each of these structures separately in order to upper bound the cost of $D$.

First, consider a tree $T^j_{q_i}$ and let $x$ be a node in $T^j_{q_i}$. Noticing that $u_x$ is a leaf of the tree $D^j_i$, the path from $r(D)$ to $u_x$ in $D$ must contain the node $r(D^j_i)$. Then, by the construction made in Step (iv), the path from $r(D)$ to $u_x$ in $D$ has the form $(r(D) \rightsquigarrow u_{e^1_i} \to u_{e^2_i} \to \dots \to u_{e^j_i} \to r(D^j_i) \rightsquigarrow u_x)$. Notice that the path $(r(D) \rightsquigarrow u_{e^1_i})$ in $D$ is the same as the path $(r(D) \rightsquigarrow u_{q_i})$ in $D_Q$. In addition, the path from $r(D^j_i)$ to $u_x$ is the same in $D$ and in $D^j_i$. Employing the previous observations, we have that the length of the path $(r(D) \rightsquigarrow u_{e^1_i} \to u_{e^2_i} \to \dots \to u_{e^j_i} \to r(D^j_i) \rightsquigarrow u_x)$ is:

$$d(r(D), u_x, D) = d(r(D), u_{q_i}, D_Q) + j + d(r(D^j_i), u_x, D^j_i)$$

Now we consider a node $q_i \in Q$. Again due to the construction made in Step (iv) of the algorithm, it is not difficult to see that the path from $r(D)$ to $u_{q_i}$ in $D$ traverses the leftmost path of $D_i$, that is, this path has the form $(r(D) \rightsquigarrow u_{e^1_i} \to u_{e^2_i} \to \dots \to u_{e^{n_i}_i} \to u_{q_i})$. Because the path $(r(D) \rightsquigarrow u_{e^1_i})$ in $D$ is the same as the path $(r(D) \rightsquigarrow u_{q_i})$ in $D_Q$, it follows that the length of the path from the root of $D$ to $u_{q_i}$ is $d(r(D), u_{q_i}, D_Q) + n_i$.

Weighting the distance to reach nodes in $\{T^j_{q_i}\}$ and in $Q$, we find the cost of $D$:

$$
\begin{aligned}
\mathrm{cost}(D, w) ={}& \sum_{q_i \in Q} \sum_j d(r(D), u_{q_i}, D_Q)\, w(T^j_{q_i}) + \sum_{q_i \in Q} \sum_j j\, w(T^j_{q_i}) \\
&+ \sum_{q_i \in Q} \sum_j \sum_{x \in T^j_{q_i}} d(r(D^j_i), u_x, D^j_i)\, w(x) \\
&+ \sum_{q_i \in Q} d(r(D), u_{q_i}, D_Q)\, w(q_i) + \sum_{q_i \in Q} n_i\, w(q_i) \\
={}& \sum_{q_i \in Q} d(r(D), u_{q_i}, D_Q)\, w(\bar T_{q_i}) + \sum_{q_i \in Q} \sum_j j\, w(T^j_{q_i}) \\
&+ \sum_{q_i \in Q} \sum_j \mathrm{cost}(D^j_i, w) + \sum_{q_i \in Q} n_i\, w(q_i) \\
={}& w(T) \sum_{q_i \in Q} d(r(D), u_{q_i}, D_Q)\, w'(q_i) + \sum_{q_i \in Q} \sum_j j\, w(T^j_{q_i}) \\
&+ \sum_{q_i \in Q} \sum_j \mathrm{cost}(D^j_i, w) + \sum_{q_i \in Q} n_i\, w(q_i) \\
={}& w(T)\, \mathrm{cost}(D_Q, w') + \sum_{q_i \in Q} \sum_j j\, w(T^j_{q_i}) \\
&+ \sum_{q_i \in Q} \sum_j \mathrm{cost}(D^j_i, w) + \sum_{q_i \in Q} n_i\, w(q_i) \qquad (1)
\end{aligned}
$$

Now all we need is to upper bound the first term of the right-hand side of the previous equality. Notice that $\mathrm{cost}(D_Q, w')$ is exactly the cost of the approximation computed at Step (ii) of the algorithm. As mentioned previously, we can use the algorithm of [PS93] in this step to find a decision tree $D_Q$ with cost at most $H(\{w'(q_i)\}) + 2$. Substituting this bound in equality (1) and observing that $w(T) \cdot H(\{w'(q_i)\}) = H(\{w(\bar T_{q_i})\})$, we conclude the upper bound:

$$
\mathrm{cost}(D, w) \le H(\{w(\bar T_{q_i})\}) + 2w(T) + \sum_{q_i \in Q} \sum_j j\, w(T^j_{q_i}) + \sum_{q_i \in Q} \sum_j \mathrm{cost}(D^j_i, w) + \sum_{q_i \in Q} n_i\, w(q_i) \qquad (2)
$$
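The two path lengths used in this section (an extra $j$ edges to reach the root of the decision tree for the $j$th hanging subtree, and an extra $n_i$ edges to reach the leaf for $q_i$) can be checked on a small sketch of the tree built in Step (iv). The class `DNode` and the helpers `build_spine` and `depth_of` are hypothetical names introduced only for this illustration, and the hanging decision trees are stood in for by single placeholder nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DNode:
    """Node of a binary decision tree; `label` names the queried arc or the leaf."""
    label: str
    left: Optional["DNode"] = None
    right: Optional["DNode"] = None


def build_spine(arc_labels: List[str], subtrees: List[DNode], leaf_label: str) -> DNode:
    """Left spine of arc queries ending in a leaf; the j-th spine node
    receives the j-th subtree decision tree as its right child."""
    node = DNode(leaf_label)
    for label, sub in zip(reversed(arc_labels), reversed(subtrees)):
        node = DNode(label, left=node, right=sub)
    return node


def depth_of(root: DNode, label: str) -> Optional[int]:
    """Number of edges from `root` to the node carrying `label` (None if absent)."""
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        if node.label == label:
            return depth
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return None


# Three hanging subtrees, each represented by a placeholder root node.
spine = build_spine(["e1", "e2", "e3"],
                    [DNode("D1"), DNode("D2"), DNode("D3")],
                    "u_q")
```

Inside this tree the placeholder for the $j$th subtree sits at depth $j$ and the leaf `u_q` at depth $n_i = 3$. Adding the depth of $u_{q_i}$ in $D_Q$ gives the distances from $r(D)$ that enter equality (1).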

8.2 Entropy Lower Bound

In this section we present a lower bound on the cost of an optimal decision tree for $(T, w)$. Hence, let $D^*$ be a minimum cost decision tree for $(T, w)$, and let $r^*$ be the root of $D^*$.

Consider a tree $T^j_{q_i}$ and let $x$ be a node in $T^j_{q_i}$. By definition, the representative of $T^j_{q_i}$ in $D^*$ (the node $u(T^j_{q_i})$) is an ancestor of the node $u_x$ in $D^*$. Notice that the representative of $T^j_{q_i}$ is a node in $D^*$ that corresponds to some arc $(i, j)$ of $T^j_{q_i}$ (that is, $u(T^j_{q_i}) = u_{(i,j)}$) and that $(i, j)$ is also an arc of $\bar T_{q_i}$. Therefore, the definition of representative again implies that $u(T^j_{q_i})$ is a descendant of $u(\bar T_{q_i})$. Combining the previous observations, we have that the path in $D^*$ from $r^*$ to $u_x$ has the form $(r^* \rightsquigarrow u(\bar T_{q_i}) \rightsquigarrow u(T^j_{q_i}) \rightsquigarrow u_x)$.

Now consider a node $q_i \in Q$; again by the definition of representative, the path from $r^*$ to $u_{q_i}$ can be written as $(r^* \rightsquigarrow u(\bar T_{q_i}) \rightsquigarrow u_{q_i})$. Adding the weighted paths for nodes in $\{T^j_{q_i}\}$ and in $Q$, we can write the cost of $D^*$ as:

$$
\begin{aligned}
OPT(T, w) ={}& \sum_{q_i} d(r^*, u(\bar T_{q_i}))\, w(\bar T_{q_i}) + \sum_{q_i} \sum_j d(u(\bar T_{q_i}), u(T^j_{q_i}))\, w(T^j_{q_i}) \\
&+ \sum_{q_i} \sum_j \sum_{x \in T^j_{q_i}} d(u(T^j_{q_i}), u_x)\, w(x) + \sum_{q_i} d(u(\bar T_{q_i}), u_{q_i})\, w(q_i) \qquad (3)
\end{aligned}
$$

Now we lower bound each term of the last equation. The idea to analyze the first term is the following: it can be seen as the cost of the decision tree $D^*$ under a cost function where the representative of $\bar T_{q_i}$, for every $i$, has weight $w(\bar T_{q_i})$ and all other nodes have weight zero. Since $D^*$ is a binary tree, we can use Shannon's Coding Theorem to guarantee that the first term of (3) is lower bounded by $H(\{w(\bar T_{q_i})\})/c$ for some constant $c$, up to an additive term. Then we have the following lemma, which is proved more formally in the appendix:

Lemma 14. $\displaystyle \sum_{q_i \in Q} d(r^*, u(\bar T_{q_i}), D^*)\, w(\bar T_{q_i}) \ge \frac{H(\{w(\bar T_{q_i})\})}{\log 3} - w(T)$

Now we bound the second term of (3). Fix a node $q_i \in Q$; consider two different trees $T^j_{q_i}$ and $T^{j'}_{q_i}$ such that $d(r^*, u(T^j_{q_i})) = d(r^*, u(T^{j'}_{q_i}))$. We claim that $u(T^j_{q_i})$ and $u(T^{j'}_{q_i})$ are siblings in $D^*$. Because their distances to $r^*$ are the same, $u(T^j_{q_i})$ can be neither a descendant nor an ancestor of $u(T^{j'}_{q_i})$. Therefore, they have a common ancestor, say $u_{(v,z)}$, such that one of these nodes is in the right subtree of $u_{(v,z)}$ and the other in the left subtree of $u_{(v,z)}$. It is not difficult to see that $u_{(v,z)}$ can only correspond to either the arc joining $q_i$ to $T^j_{q_i}$ or the arc joining $q_i$ to $T^{j'}_{q_i}$. Without loss of generality suppose the latter; then the right child of $u_{(v,z)}$ corresponds to an arc/node of $T^{j'}_{q_i}$, and therefore $u(T^{j'}_{q_i})$ must be this child. Due to their distance to $r^*$, it follows that $u(T^j_{q_i})$ must be the other child of $u_{(v,z)}$.

As a consequence, we have that for any level $\ell$ of $D^*$, there are at most two representatives of the trees $T^j_{q_i}$ located at $\ell$. This together with the fact that $T^j_{q_i}$ is the $j$th heaviest tree rooted at a child of $q_i$ guarantees that

$$
\sum_{j=1}^{n_i} d(u(\bar T_{q_i}), u(T^j_{q_i}))\, w(T^j_{q_i}) \ge \sum_{j=1}^{n_i} \left\lfloor \frac{j-1}{2} \right\rfloor w(T^j_{q_i}) \ge \sum_{j=1}^{n_i} \left( \frac{j-3}{2} \right) w(T^j_{q_i}) \qquad (4)
$$

and the last inequality gives a bound for the second term of (3).

Directly employing Lemma 13, we can lower bound the third term of (3) by $\sum_{q_i, j} OPT(T^j_{q_i}, w)$. Finally, for the fourth term we fix $q_i$ and note that the path in $D^*$ connecting $u(\bar T_{q_i})$ to $u_{q_i}$ must contain the nodes corresponding to arcs $e^1_i, \dots, e^{n_i}_i$ (otherwise, when traversing $D^*$ and reaching $u_{q_i}$ we would not have enough knowledge to infer that $q_i$ is the marked node, contradicting the validity of $D^*$). Applying this reasoning for each $q_i$, we conclude that the last term of (3) is lower bounded by $\sum_{q_i} n_i\, w(q_i)$.

Therefore, applying the previous discussion to lower bound the terms of (3), and using that the trees $\{T^j_{q_i}\}$ have total weight at most $w(T)$, we obtain (with $c = \log 3$ as in Lemma 14):

$$
OPT(T, w) \ge \frac{H(\{w(\bar T_{q_i})\})}{c} - \frac{5w(T)}{2} + \sum_{q_i, j} \frac{j}{2}\, w(T^j_{q_i}) + \sum_{q_i, j} OPT(T^j_{q_i}, w) + \sum_{q_i} n_i\, w(q_i) \qquad (5)
$$

8.3 Alternative Lower Bounds

As in the case of the k-Hotlink Assignment Problem presented in previous sections, when the value of the entropy $H(\{w(\bar T_{q_i})\})$ is large enough, it dominates the term $5w(T)/2$ in inequality (5) and leads to a sufficiently strong lower bound. However, when this entropy assumes a small value we need to adopt a different strategy to devise an effective bound.

First alternative lower bound. In order to reduce the additive factor that appears in (5), we use almost the same derivation that leads from (3) to (5); however, we simply lower bound the first summation of (3) by zero. This gives:

$$
OPT(T, w) \ge -\frac{3w(T)}{2} + \sum_{q_i, j} \frac{j}{2}\, w(T^j_{q_i}) + \sum_{q_i, j} OPT(T^j_{q_i}, w) + \sum_{q_i \in Q} n_i\, w(q_i) \qquad (6)
$$

Second alternative lower bound. Now we devise a lower bound without an additive factor; however, it also does not contain the important term $\sum_{q_i \in Q} n_i\, w(q_i)$. Consider a tree $T^j_{q_i}$ and a node $v \in T^j_{q_i}$. By the definition of representative, $u(T^j_{q_i})$ is an ancestor of $u_v$ in $D^*$, thus $d(r^*, u_v, D^*) = d(r^*, u(T^j_{q_i}), D^*) + d(u(T^j_{q_i}), u_v, D^*)$. Because the trees $\{T^j_{q_i}\}$ and $Q$ form a partition of the nodes of $T$, the cost $OPT(T, w)$ can be written as:

$$
OPT(T, w) = \sum_{q_i, j} d(r^*, u(T^j_{q_i}))\, w(T^j_{q_i}) + \sum_{q_i \in Q} d(r^*, u_{q_i})\, w(q_i) + \sum_{q_i, j} \sum_{v \in T^j_{q_i}} d(u(T^j_{q_i}), u_v)\, w(v)
$$

First, as the trees $\{T^j_{q_{|Q|}}\}$ are single nodes and hence do not contain any arcs, $\{u(T^j_{q_{|Q|}})\}$ and $\{u_{q_i}\}$ cannot be the root of $D^*$. Therefore, at most one distance $d(r^*, u(T^j_{q_i}))$ (and with $q_i \ne q_{|Q|}$) of the first two summations of the previous equality can be equal to zero, and all others must have value at least one. But the construction of the heavy path guarantees that for $q_i \ne q_{|Q|}$ the weight of each of the trees $\{T^j_{q_i}\}$ is at most $w(T)/2$. As a consequence, the first two summations of the equality can be lower bounded by $w(T)/2$. Combining this fact with a lower bound for the last summation provided by Lemma 13 (with $T = T^j_{q_i}$) we have:

$$
OPT(T, w) \ge \frac{w(T)}{2} + \sum_{q_i, j} OPT(T^j_{q_i}, w) \qquad (7)
$$

8.4 Approximation Guarantee

We proceed by induction over the number of nodes of $T$, where the base case is the trivial one when $T$ has only one node. Notice that because each tree $T^j_{q_i}$ is properly contained in $T$, the inductive hypothesis asserts that $D^j_i$ is an approximate decision tree for $T^j_{q_i}$, namely $\mathrm{cost}(D^j_i, w) \le \alpha\, OPT(T^j_{q_i}, w)$ for some constant $\alpha$. The analysis needs to be carried out in two different cases depending on the value of the entropy (henceforth we use $H$ as a shorthand for $H(\{w(\bar T_{q_i})\})$).

Case 1: $H/\log 3 \ge 3w(T)$. Applying this bound to inequality (5) we have:

$$
OPT(T, w) \ge \frac{H}{6 \log 3} + \sum_{q_i, j} \frac{j}{2}\, w(T^j_{q_i}) + \sum_{q_i, j} OPT(T^j_{q_i}, w) + \sum_{q_i} n_i\, w(q_i) \qquad (8)
$$

Employing the entropy hypothesis in the first term of inequality (2) and using the inductive hypothesis we have:

$$
\mathrm{cost}(D, w) \le \frac{H (3 \log 3 + 2)}{3 \log 3} + \sum_{q_i, j} 2 \left( \frac{j}{2} \right) w(T^j_{q_i}) + \sum_{q_i, j} \alpha\, OPT(T^j_{q_i}, w) + \sum_{q_i} n_i\, w(q_i)
$$

Setting $\alpha \ge 2(3 \log 3 + 2)$, it follows from the previous inequality and from inequality (8) that $\mathrm{cost}(D, w) \le \alpha\, OPT(T, w)$.

Case 2: $H/\log 3 \le 3w(T)$. Applying the entropy hypothesis and the inductive hypothesis to inequality (2) we have:

$$
\mathrm{cost}(D, w) \le (3 \log 3 + 2)\, w(T) + \sum_{q_i, j} 2 \left( \frac{j}{2} \right) w(T^j_{q_i}) + \sum_{q_i, j} \alpha\, OPT(T^j_{q_i}, w) + \sum_{q_i} n_i\, w(q_i)
$$

Adding $(\alpha - 1)$ times inequality (7) to inequality (6) we have:

$$
\alpha\, OPT(T, w) \ge \frac{(\alpha - 4)\, w(T)}{2} + \sum_{q_i, j} \frac{j}{2}\, w(T^j_{q_i}) + \sum_{q_i, j} \alpha\, OPT(T^j_{q_i}, w) + \sum_{q_i} n_i\, w(q_i)
$$

Setting $\alpha \ge 2(3 \log 3 + 2) + 4$ we have that $\mathrm{cost}(D, w) \le \alpha\, OPT(T, w)$. Therefore, the inductive step holds for both cases when $\alpha \ge 2(3 \log 3 + 2) + 4$.

8.5 Efficient Implementation

Notice that during the presentation and analysis of the algorithm, we have assumed that the trees $\{T^j_{q_i}\}$ are ordered by their weights. However, in order to actually implement Step (iv) of the algorithm, one needs to sort these trees. As a heavy path decomposition can be computed in linear time [DL06], it is easy to see that all steps of the algorithm, besides this sorting at Step (iv), can be implemented in linear time. Hence, we resort to an approximate sorting to provide a linear time implementation for the algorithm.

Fix a node $q_i$ in $Q$. Without loss of generality, we assume there are more than three trees $\{T^j_{q_i}\}$ (that is, $n_i > 3$), otherwise one could sort them in constant time. In order to simplify the analysis, we use $w_j = w(T^j_{q_i})$ and $W = \max_j \{w_j\}$. Then we have the sequence $W_Q = (w_1, w_2, \dots, w_{n_i})$, which is non-increasing by the definition of the trees $\{T^j_{q_i}\}$.

Now we partition $W_Q$ into blocks with similar weights: for $1 \le j < n_i$, the $j$-block contains all elements of $W_Q$ with weights in the interval $(W/2^j, W/2^{j-1}]$, and the $n_i$-block contains all elements of $W_Q$ with weights in $[0, W/2^{n_i - 1}]$. Due to its order, $W_Q$ is the concatenation of these blocks. Defining $s_j$ as the index in $W_Q$ of the first element of the $j$-block¹, we have that for $j < n_i$ the $j$-block is the sequence $(w_{s_j}, w_{s_j + 1}, \dots, w_{s_{j+1} - 1})$ and the $n_i$-block is the sequence $(w_{s_{n_i}}, w_{s_{n_i} + 1}, \dots, w_{n_i})$.

¹ In order to avoid a heavier notation, we assume that each $j$-block is nonempty.

Now we say that $\widetilde W_Q$ is an approximate sorting of the weights $\{w_j\}$ if $\widetilde W_Q$ contains first all elements of the 1-block of $W_Q$, then all elements of the 2-block of $W_Q$, and so on. The sequence $\widetilde W_Q$ can be thought of as a permutation of $W_Q$ where only elements in the same $j$-block can be permuted. Defining $\widetilde W_Q = (w_{\sigma(1)}, w_{\sigma(2)}, \dots, w_{\sigma(n_i)})$, it follows that the subsequence $(w_{\sigma(s_j)}, w_{\sigma(s_j + 1)}, \dots, w_{\sigma(s_{j+1} - 1)})$ contains the same elements as the $j$-block of $W_Q$. As there are only $n_i$ blocks, we can find an approximate sorting in linear time using a bucketing strategy.

We claim that $\sum_{j=1}^{n_i} j\, w_{\sigma(j)} \le 3 \sum_{j=1}^{n_i} j\, w_j$. For every $1 \le j < n_i$, it follows from the definition of $j$-block and the relationship between the elements of $\widetilde W_Q$ and the blocks of $W_Q$ that:

$$
\sum_{k = s_j}^{s_{j+1} - 1} k\, w_k \ge \frac{W}{2^j} \sum_{k = s_j}^{s_{j+1} - 1} k = \frac{1}{2} \cdot \frac{W}{2^{j-1}} \sum_{k = s_j}^{s_{j+1} - 1} k \ge \frac{1}{2} \sum_{k = s_j}^{s_{j+1} - 1} k\, w_{\sigma(k)} \qquad (9)
$$

In addition, because for every $k \ge s_{n_i}$ we have $w_{\sigma(k)} \le W/2^{n_i}$, the sum $\sum_{k = s_{n_i}}^{n_i} k\, w_{\sigma(k)}$ can be upper bounded by $n_i^2 (W/2^{n_i})$. Therefore, for $n_i > 3$ this term can be upper bounded by $W$. Combining with the fact that $\sum_{k=1}^{n_i} k\, w_k \ge W$ we have that $\sum_{k = s_{n_i}}^{n_i} k\, w_{\sigma(k)} \le \sum_{k=1}^{n_i} k\, w_k$. Using the previous inequality together with inequality (9), we have:

$$
\sum_{k=1}^{n_i} k\, w_{\sigma(k)} = \sum_{j=1}^{n_i - 1} \sum_{k = s_j}^{s_{j+1} - 1} k\, w_{\sigma(k)} + \sum_{k = s_{n_i}}^{n_i} k\, w_{\sigma(k)} \le 2 \sum_{j=1}^{n_i - 1} \sum_{k = s_j}^{s_{j+1} - 1} k\, w_k + \sum_{k=1}^{n_i} k\, w_k \le 3 \sum_{k=1}^{n_i} k\, w_k
$$

Now suppose that instead of using an exact sorting our algorithm uses an approximate sorting of the trees $\{T^j_{q_i}\}$, by means of permutations $\sigma_{q_i}$. It is easy to see that the only impact this modification has on the upper bound of the algorithm occurs in the second term of the right-hand side of inequality (2), with $\sum_j j\, w(T^j_{q_i})$ being replaced by $\sum_j j\, w(T^{\sigma_{q_i}(j)}_{q_i})$. Then the previous argument implies that this approximate sorting introduces a multiplicative factor of three in the second term of (2), and it is straightforward to check that the algorithm still achieves a constant factor approximation.

Theorem 6. There is a linear time algorithm which provides a constant factor approximation for the problem of binary searching in trees.
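The bucketing strategy of Section 8.5 can be illustrated with the following Python sketch. It is a minimal sketch under stated assumptions, not the implementation behind the theorem: the function names `approximate_sort` and `positional_cost` are introduced here, weights are assumed to be non-negative floats, and the block index is read off the binary exponent of the ratio between the maximum weight and each weight via `math.frexp`.

```python
import math
from typing import List


def approximate_sort(weights: List[float]) -> List[float]:
    """Output the weights block by block, heaviest block first.

    With W the maximum weight and n the number of weights, block j (1 <= j < n)
    holds the weights in (W / 2**j, W / 2**(j - 1)], and block n holds the rest.
    The order inside a block is left as found, so the whole pass is linear.
    """
    n = len(weights)
    if n == 0:
        return []
    W = max(weights)
    if W <= 0:
        return list(weights)
    buckets: List[List[float]] = [[] for _ in range(n)]
    for x in weights:
        if x <= 0:
            j = n
        else:
            ratio = W / x
            # frexp(ratio) = (m, e) with ratio = m * 2**e and 0.5 <= m < 1,
            # so e - 1 = floor(log2(ratio)) and the block index is e.
            j = n if math.isinf(ratio) else min(n, math.frexp(ratio)[1])
        buckets[j - 1].append(x)
    return [x for bucket in buckets for x in bucket]


def positional_cost(sequence: List[float]) -> float:
    """Sum of k * w_k over positions k = 1, 2, ... (second term of the upper bound)."""
    return sum(k * x for k, x in enumerate(sequence, start=1))


weights = [8, 1, 5, 3, 7, 2, 0.5, 6]
approx = approximate_sort(weights)
exact = sorted(weights, reverse=True)
```

On this input the heaviest block keeps its original internal order, so `approx` differs from `exact`, yet `positional_cost(approx)` stays within three times `positional_cost(exact)`, in line with the claim proved above.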