Application-Aware Data Collection in Wireless Sensor Networks

Xiaolin Fang*, Hong Gao*, Jianzhong Li*, and Yingshu Li+*
* School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
+ Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
{xforu, honggao, ijzh}@hit.edu.cn, yi@cs.gsu.edu

Abstract

Data sharing for data collection among multiple applications is an efficient way to reduce the communication cost of Wireless Sensor Networks (WSNs). This paper is the first work to introduce the interval data sharing problem, which investigates how to transmit as little data as possible over the network while the transmitted data still satisfies the requirements of all the applications. Different from current studies, where each application requires a single data sample during each task, we study the problem where each application requires a continuous interval of data sampling in each task. The proposed problem is a nonlinear nonconvex optimization problem. To avoid the high complexity of solving a nonlinear nonconvex optimization problem on resource-restricted sensor nodes, a 2-factor approximation algorithm with time complexity $O(n^2)$ and memory complexity $O(n)$ is provided. A special instance of this problem is also analyzed. This special instance can be solved with a dynamic programming algorithm in polynomial time, which gives an optimal result with $O(n^2)$ time complexity and $O(n)$ memory complexity. We evaluate the proposed algorithms with TOSSIM, a widely used simulation tool in WSNs. Theoretical analysis and simulation results both demonstrate the effectiveness of the proposed algorithms.

I. INTRODUCTION

WSN deployment is difficult and time-consuming work that requires much manpower or mechanical power. Once a network is deployed, it is expected to run for a long time without any human intervention. Therefore it is inefficient to carry out only one application in a network.
Sharing a network among multiple applications can significantly improve network utilization efficiency [1], [2], [3], [4], [5], [6], [7], [8]. Currently, it is popular for a set of applications to share one network for collecting data. Each node in the network samples at a particular frequency and the sampled data is transmitted to the base station over multiple hops. All the applications prefer to receive all the sampled data. However, if all the sampled data is transmitted to the base station, the communication cost will be high and the network lifetime will be reduced. Fortunately, some applications may monitor the same physical attributes. In this case, a certain amount of data does not need to be repeatedly transmitted back to the base station.

Under the above-mentioned scenario, carefully designed data sharing algorithms are desired. Tavakoli et al. [9] propose a data sampling algorithm for each node, such that the sampled data can be shared by as many applications as possible. Meanwhile, the amount of sampled data at each node is reduced as much as possible, reducing the overall communication cost. In [9], each application consists of a set of tasks. In each task, each node samples data once. As shown in Fig.1, there are two applications running on this node. Task $T_1$ is for the first application, and Task $T_2$ is for the second one. $T_1$ and $T_2$ may overlap on the time axis, and both of them need to sample data once. One method is to sample data independently, e.g. $s_1$ is sampled by $T_1$ and $s_2$ is sampled by $T_2$ as shown in Fig.1a, resulting in two data samples $s_1$ and $s_2$. In [9], the authors designed an algorithm such that only one data sample can serve both applications, as shown in Fig.1b.

[Fig. 1: Data sampling for a time point. Panels: (a) independent sampling; (b) sampling.]

In many applications, data needs to be sampled for a continuous interval as shown in Fig.2, instead of being sampled at a particular time point.
For example, railway monitoring systems which collect acoustic information [10], [11] need to sample data for a continuous interval. Volcanic and earthquake monitoring systems [12], [13], [14] also have such a requirement in order to measure vibrations. Habitat monitoring systems for microclimate, plant physiology and animal behavior [15], [16], [17] need to record wind speed and take video of animal behaviors, which again requires sampling data for a continuous interval.

[Fig. 2: Data sampling for a continuous interval. Panels: (a) independent sampling; (b) sampling.]

This paper studies the interval data sharing problem of how to reduce the overall length of data sampling intervals which

could be shared by multiple applications. We assume there are multiple applications running on the same node, and each application consists of tasks. Each task requires sampling data for a continuous interval. In Fig.2, $T_1$ is for the first application, and $T_2$ is for the second one. Both tasks need to continuously sample data for an interval $s$. If the two tasks sample data independently, two intervals of data with length $s$ need to be sampled, as shown in Fig.2a. However, one interval of data with length $s$ is enough if the starting points of data sampling of these two applications can be intelligently arranged. The data sampling interval lengths of different applications may be different, and within the same application, tasks may have different data sampling interval lengths. The problem investigated in this paper is to minimize the overall data sampling interval length at each node while satisfying the needs of all the applications.

We formulate the aforementioned problem as a nonlinear nonconvex optimization problem. Since sensor nodes are resource constrained, the cost of solving such a problem at each node is very high. Therefore, we propose a 2-factor approximation algorithm with time complexity $O(n^2)$ and memory complexity $O(n)$. We also consider a special instance where the data sampling interval lengths of all the tasks are the same. The special instance can be solved with a dynamic programming algorithm in polynomial time, whose time complexity is $O(n^2)$ and memory complexity is $O(n)$.

The contributions of this paper are as follows. This is the first work to study the interval data sharing problem, where each node samples data for a continuous interval instead of sampling a discrete data point. This problem is formulated as a nonlinear nonconvex programming problem. An approximation algorithm is proposed to solve the problem so as to reduce the cost of solving the nonlinear nonconvex optimization problem on resource-restricted sensor nodes. The proposed algorithm is proved to be a 2-factor approximation algorithm.
The time complexity of this algorithm is $O(n^2)$, and the memory complexity is $O(n)$. We also analyze a special instance of the interval data sharing problem. We give a dynamic programming algorithm which gives an optimal result in polynomial time. The time complexity is $O(n^2)$ and the memory complexity is $O(n)$. Extensive simulations were conducted to validate the correctness and effectiveness of our algorithms.

The rest of this paper is organized as follows. Section 2 reviews the related works. Section 3 formally defines the interval data sharing problem. Section 4 gives an algorithm to solve the problem and analyzes its approximation ratio. A special instance is investigated in Section 5, where a dynamic programming algorithm is also presented to address it. Performance evaluations are shown in Section 6, and Section 7 concludes this paper.

II. RELATED WORKS

Multi-query optimization in database systems studies how to efficiently process queries with common sub-expressions [18], [19]. It aims at exploiting the common sub-expressions of SQL queries to reduce query cost, which is different from our problem.

S. Krishnamurthy et al. [20] consider the problem of data sharing in data streaming systems for aggregate queries. They study min, max, sum and count-like aggregation queries. The stream is scanned at least once and is chopped into slices. Only the slices that overlap among multiple queries can be shared. Their studied problems are different from ours. We expect to reduce the number of sensor samplings at each node, resulting in less communication cost. Our problem differs in that we want to provide each application with enough sampled data while minimizing the total number of samplings.

Query optimization in WSNs [2], [21] usually tries to find in-network schemes or distributed algorithms to reduce communication cost for aggregation queries, while our work focuses on reducing the amount of transmitted data for each node.

The work most related to this paper is [9].
It studies the problem of data sharing among multiple applications. That work assumes each application only needs discrete data point samplings, while in our problem the applications may require a continuous interval of data. The solution proposed in [9] cannot be applied to our problem. However, our solution can solve their problem.

III. PROBLEM DEFINITION

In order to make our problem clear, we first introduce an example as shown in Fig.3. We have two applications, and each application consists of many tasks. Application $A_1$ requires an interval of data of length $\ell_1$ during each task duration, and $A_2$ requires an interval of data of length $\ell_2$ during each task duration. The task duration lengths of $A_1$ and $A_2$ are different, as shown in Fig.3. Application $A_1$ consists of tasks $T_{11}, T_{12}, \ldots, T_{1i}$, and so on. Application $A_2$ includes tasks $T_{21}, T_{22}, \ldots, T_{2j}$, and so on. Take tasks $T_{11}, T_{12}, T_{13}, T_{21}$ and $T_{22}$ as examples. The optimal solution is shown in the bottom part of Fig.3. Tasks $T_{11}$, $T_{12}$ and $T_{13}$ pick the intervals $I_{11}$, $I_{12}$ and $I_{13}$ respectively, which are all of length $\ell_1$. Tasks $T_{21}$ and $T_{22}$ pick the intervals $I_{21}$ and $I_{22}$ respectively, which are both of length $\ell_2$. The optimal solution gives a result of length $s_1 + s_2$ in this example, as shown in the bottom part of Fig.3, where the tasks are sorted in ascending order of their ending times. Sensor data within the overlap of multiple tasks can be shared by these tasks. We aim at minimizing the overall length of the data intervals.

Before describing our problem, we give some preliminary definitions which will be used later.

Definition 1. Define $I \cup I'$ as the union of two intervals or interval sets $I$ and $I'$. For example, $[1, 5] \cup [3, 7] = [1, 7]$, and $\{[1, 3], [5, 7]\} \cup [3, 5] = [1, 7]$.
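The union in Definition 1, and the overall length that the problem minimizes, can be illustrated with a short sketch. This is not part of the original paper: the function names `merge_intervals` and `union_length` are hypothetical, and intervals are assumed to be closed and given as `(start, end)` tuples.

```python
def merge_intervals(intervals):
    """Union of closed intervals, returned as sorted, disjoint (start, end) tuples."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Touching or overlapping: extend the last merged interval.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def union_length(intervals):
    """Total length covered by the union of the given intervals."""
    return sum(end - start for start, end in merge_intervals(intervals))
```

With these helpers, the two examples of Definition 1 both merge into the single interval `(1, 7)`.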

[Fig. 3: Interval data sampling for multi-applications]

Definition 2. Define $I \cap I'$ as the overlap of two intervals $I$ and $I'$. For example, $[1, 5] \cap [3, 7] = [3, 5]$.

Definition 3. Define $|I|$ as the length of the interval $I$, or the length of the union of the intervals in set $I$. For example, $|[1, 5]| = 4$, $|[1, 5] \cup [3, 7]| = 6$, and $|[1, 3] \cup [5, 7]| = 4$.

Definition 4. $I \subseteq I'$ means interval $I$ is a sub-interval of $I'$. For example, $[2, 3] \subseteq [1, 5]$.

We are given a set of $n$ tasks $T = \{T_i\}$, $i = 1, 2, \ldots, n$. Each task $T_i$ is a three-tuple $T_i = \langle b_i, e_i, \ell_i \rangle$, where $b_i$ denotes the beginning time, $e_i$ represents the end time, and $\ell_i$ means that $T_i$ needs an interval of data with length $\ell_i$. It is assumed that $\ell_i \le e_i - b_i$. The problem is to find a continuous sub-interval $I_i$ in interval $[b_i, e_i]$, i.e. $I_i \subseteq [b_i, e_i]$, for every task, satisfying $|I_i| = \ell_i$, so that the length of the union of all the sub-intervals on the time axis is minimized, i.e. $\left|\bigcup_{i=1}^{n} I_i\right|$ is minimum. Note that the sub-interval $I_i$ is continuous.

The bottom part of Fig.3 illustrates an example. Since sensor nodes have limited communication and computational capabilities, we want to find a set of sub-intervals $I_{11}$, $I_{21}$, $I_{12}$, $I_{22}$ and $I_{13}$ for tasks $T_{11}$, $T_{21}$, $T_{12}$, $T_{22}$ and $T_{13}$ respectively, such that $|I_{11} \cup I_{21} \cup I_{12} \cup I_{22} \cup I_{13}|$ is minimum. In the example shown in Fig.3, the optimal solution is $s_1 + s_2$; all the tasks can derive the data they need from the two data intervals $s_1$ and $s_2$.

We now formally define the interval data sharing problem.

Definition 5. Given a set of $n$ tasks $T$, where each task $T_i$ is a three-tuple $T_i = \langle b_i, e_i, \ell_i \rangle$, that is, each task $T_i$ has a beginning time $b_i$, an end time $e_i$, and a data sampling interval length $\ell_i$, the problem is to find a continuous sub-interval $I_i$ for each task so as to

$$\min \left| \bigcup_{i=1}^{n} I_i \right| \qquad (1)$$
$$\text{s.t.} \quad I_i \subseteq [b_i, e_i], \quad i = 1, 2, \ldots, n \qquad (2)$$
$$|I_i| = \ell_i, \quad i = 1, 2, \ldots, n \qquad (3)$$

The objective function of this problem is non-linear.
So if $b_i$, $e_i$ and $\ell_i$ are real numbers, the problem is a nonlinear programming problem, which has no efficient universal solution [22]. It is easy to find that the objective function is nonconvex [23]. Several methods are available for solving nonconvex optimization problems. For example, one approach is to use special formulations of linear programming problems. Another method involves the use of branch and bound techniques, where the program is divided into subclasses to be solved with convex or linear approximations that form a lower bound on the overall cost within the subdivision. However, all these methods have high computational complexity, which makes them impractical to implement on sensor nodes.

Since digital signals are discrete, the data intervals can be regarded as integer sequences. Therefore, $b_i$, $e_i$ and $\ell_i$ can be regarded as integers. The integer variables make the problem a nonlinear integer programming problem [24], [25], which is hard to solve.

IV. A 2-FACTOR APPROXIMATION ALGORITHM

One method is to initiate a continuous data sampling interval at the beginning time of each task independently. However, this method results in a large amount of data. In this section, we present an algorithm which is a 2-factor approximation algorithm for our interval data sharing problem. Before we present the approximation algorithm, we propose a solution for the special case where every task overlaps with every other task.

A. Tasks Overlapped with Each Other

For ease of understanding, we first define "satisfied" as follows.

Definition 6. We say that an interval $I$ is satisfied for a task $T_i$ if $|I \cap [b_i, e_i]| \ge \ell_i$. An interval set $S$ is satisfied for a task $T_i$ if there exists an interval $I$ in $S$ such that $|I \cap [b_i, e_i]| \ge \ell_i$.

If all the tasks overlap with each other, then the interval data sharing problem can be solved in polynomial time. An algorithm is presented as follows.

Step 1: Sort the tasks in ascending order by their end times.
Step 2: Pick the sub-interval of length $\ell_1$ at the end of the first task $T_1$, i.e. pick the sub-interval $[e_1 - \ell_1, e_1]$.
Step 3: Pick a sub-interval for each task from the second to the last. Take $T_i$ as an example: if the union of the picked sub-intervals is satisfied for $T_i$, do nothing and continue to pick a sub-interval for the next task $T_{i+1}$. If it is not satisfied for $T_i$, extend forward from the tail of the picked sub-intervals. If it is still not satisfied for $T_i$, extend backward from the head of the picked sub-intervals.

The pseudo code for tasks overlapped with each other is described in Algorithm 1. Take Fig.4 as an example. Tasks $T_1$, $T_2$ and $T_3$ overlap with each other. $T_1$ needs a data interval

of length $\ell_1 = 4$, $T_2$ needs an interval of length $\ell_2 = 3$, and $T_3$ needs an interval of length $\ell_3 = 9$. First, the tasks are sorted in ascending order by their end times. Second, the sub-interval with length 4 at the end of $T_1$ is picked. The picked interval for $T_1$ is $I = [7, 11]$. Third, $I$ is satisfied for task $T_2$, so nothing is done for $T_2$. Fourth, $I$ is not satisfied for task $T_3$; thus, $I$ is extended forward until the end time of $T_3$, giving $I = [7, 14]$. But $I$ is still not satisfied for $T_3$, so $I$ is then extended backward from the head of the picked interval to get $I = [5, 14]$, which is satisfied for all three tasks. The time complexity is $O(n \log n)$ due to the sorting step. If the tasks are pre-sorted, the time complexity is $O(n)$.

[Fig. 4: Example of tasks overlapped with each other]

Algorithm 1: SOLVE-OVERLAP($T$)
Input: $T = \{T_1, T_2, \ldots, T_n\}$, $T_i = \langle b_i, e_i, \ell_i \rangle$ for $i = 1, 2, \ldots, n$; $[b_i, e_i] \cap [b_j, e_j] \neq \emptyset$ for all $i, j = 1, 2, \ldots, n$
Output: a minimum interval $I$ that is satisfied for all tasks.
  Sort tasks in ascending order by end time. Assume that the sorted task set is $T = \{T_{k_1}, T_{k_2}, \ldots, T_{k_n}\}$
  $s = e_{k_1} - \ell_{k_1}$;
  $e = e_{k_1}$;
  for $i$ from 2 to $n$ do
    if $[s, e]$ is satisfied for $T_{k_i}$ then
      continue;
    else
      let $e = \min\{s + \ell_{k_i}, e_{k_i}\}$;
      if $[s, e]$ is satisfied for $T_{k_i}$ then
        continue;
      else
        let $s = e - \ell_{k_i}$;
  return $I = [s, e]$;

Algorithm 2: SOLVE-OVERLAP-B($T$)
Input: $T = \{T_1, T_2, \ldots, T_n\}$, $T_i = \langle b_i, e_i, \ell_i \rangle$ for $i = 1, 2, \ldots, n$; $[b_i, e_i] \cap [b_j, e_j] \neq \emptyset$ for all $i, j = 1, 2, \ldots, n$
Output: a minimum interval $I$ that is satisfied for all tasks.
  $s = +\infty$;
  $e_{min} = +\infty$;
  for $i$ from 1 to $n$ do
    if $s > e_i - \ell_i$ then
      $s = e_i - \ell_i$;
    if $e_{min} > e_i$ then
      $e_{min} = e_i$;
  $e = e_{min}$;
  for $i$ from 1 to $n$ do
    if $e < b_i + \ell_i$ then
      $e = b_i + \ell_i$;
    if $e < s + \ell_i$ then
      $e = s + \ell_i$;
  return $I = [s, e]$;

One can find that the optimal interval $I = [s, e]$ for tasks overlapped with each other can also be obtained by another method. The optimal interval $I = [s, e]$ is obtained from the following equations.
$$s = \min_{i=1}^{n} \{e_i - \ell_i\} \qquad (4)$$
$$e = \max\left\{ \max_{i=1}^{n} \{b_i + \ell_i\},\ \max_{i=1}^{n} \{s + \ell_i\},\ \min_{i=1}^{n} \{e_i\} \right\} \qquad (5)$$

The second method is described in Algorithm 2, which gets the same result as Algorithm 1. This algorithm consists of two phases. Let us take Fig.4 as an example again. In the first phase, it finds the beginning time $s$. In this example, $s$ is the minimum $e_i - \ell_i$, and it is easy to find that $s = 5$. In the second phase, we find that $e = 14$, which is the maximum $s + \ell_i$ in this example. Thus, the optimal interval is obtained as $[5, 14]$. As we can see, the case where tasks overlap with each other can be solved in time complexity $O(2n) = O(n)$ with Algorithm 2. This algorithm does not require a sorting step. However, if the tasks are pre-sorted, Algorithm 1 is no worse than Algorithm 2. As shown in the later section, our approximation algorithm pre-sorts the tasks, so either algorithm can be used as a sub-process in the following approximation algorithm.

B. 2-factor Approximation Algorithm

We now present our approximation algorithm. First, sort all tasks by end time in ascending order. Second, identify the subset of tasks that overlap with $T_1$; these tasks also overlap with each other. Find the minimum interval that can be shared by the tasks identified in this step by using Algorithm 1 described earlier. Third, remove the previously identified tasks, including $T_1$. Repeat the second and the third steps for the remaining tasks until all tasks are removed. One can refer to Algorithm 3 for the detailed process.

Fig.5 illustrates the process of the approximation algorithm. The five tasks are sorted in ascending order by end time. In the first step, tasks $T_1$, $T_2$ and $T_5$ are identified as a subset of tasks that overlap with each other. One can find that, if the tasks are sorted by end time, all the tasks which overlap with $T_1$ also overlap with each other. Now, Algorithm 1 can be used to compute the interval that is satisfied for these three tasks. After that, the three tasks $T_1$, $T_2$ and $T_5$ are removed.
In the second step, $T_3$ and $T_4$ are identified as a subset of tasks that overlap with each other. Now, Algorithm 1 is employed again to compute the interval that is satisfied for these two tasks. The union of the two found intervals is the final result of this example returned by Algorithm 3.

[Fig. 5: Illustration of the approximation algorithm]
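The greedy procedure walked through above can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: it uses the closed form of Equations (4) and (5) as the sub-process, represents a task as a `(b, e, l)` tuple, and treats two tasks as overlapping when the later one begins strictly before the earlier one ends. The function names and the sample task values are assumptions.

```python
def solve_overlap(tasks):
    """Interval [s, e] from Equations (4)-(5) for tasks (b, e, l) that pairwise overlap."""
    s = min(e - l for b, e, l in tasks)
    end = max(
        max(b + l for b, e, l in tasks),
        max(s + l for b, e, l in tasks),
        min(e for b, e, l in tasks),
    )
    return (s, end)


def greedy_approx(tasks):
    """Repeatedly group the earliest-ending task with every task overlapping it."""
    remaining = sorted(tasks, key=lambda t: t[1])  # ascending end time
    result = []
    while remaining:
        first_end = remaining[0][1]
        # Tasks that begin before the first task ends overlap it (and each other).
        group = [t for t in remaining if t[0] < first_end]
        result.append(solve_overlap(group))
        remaining = [t for t in remaining if t[0] >= first_end]
    return result
```

For the lengths used in the Fig.4 example (4, 3 and 9), with begin times and the end time of the second task chosen here purely for illustration, `solve_overlap([(3, 11, 4), (6, 12, 3), (2, 14, 9)])` yields `(5, 14)`, the interval reported in the text.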

Algorithm 3: GREEDY-APPROX($T$)
Input: $T = \{T_1, T_2, \ldots, T_n\}$, $T_i = \langle b_i, e_i, \ell_i \rangle$ for $i = 1, 2, \ldots, n$
Output: a set of intervals $I$ that is satisfied for all tasks.
  Sort tasks in ascending order by end time. Assume that the sorted task set is $T = \{T_{k_1}, T_{k_2}, \ldots, T_{k_n}\}$;
  $I = \emptyset$;
  while $T \neq \emptyset$ do
    $T' = \emptyset$;
    let the first task in $T$ be $T_{k_f}$;
    let the set of tasks in $T$ which overlap with $T_{k_f}$ be $T_o$;
    add $T_{k_f}$ and $T_o$ into $T'$; /* note that tasks in $T'$ overlap with each other */
    $I'$ = SOLVE-OVERLAP($T'$);
    $I = I \cup I'$;
    remove $T'$ from $T$; /* note that $T$ is still sorted after the removing step */
  return $I$;

Theorem 1. Algorithm 3 is a 2-factor approximation algorithm.

Proof: Assume that there are $m$ tasks left, i.e. $T = \{T_{i_1}, T_{i_2}, \ldots, T_{i_m}\}$, and they are sorted in ascending order by end time. $T_{i_1}$ needs an interval $I_{i_1}$ of length at least $\ell_{i_1}$. Assume that a task $T_{i_j}$ overlaps with $T_{i_1}$, and it needs an interval $I_{i_j}$ of length $\ell_{i_j}$. Algorithm 3 will derive a result interval which contains $I_{i_1}$ and $I_{i_j}$. In the worst situation, Algorithm 3 may derive another interval $I'$ whose length is no shorter than $\ell_{i_j}$ in the later steps, i.e. $|I'| \ge \ell_{i_j}$. In this case $I'$ is satisfied for $T_{i_j}$ and meanwhile it is satisfied for some later tasks. So in the worst situation, Algorithm 3 will give a result of length $|I_{i_1} \cup I_{i_j} \cup I'|$ for tasks $T_{i_1}$ and $T_{i_j}$. However, there may exist an optimal solution where $I_{i_1}$ is satisfied for $T_{i_1}$, and $I'$ is satisfied for $T_{i_j}$ and some later tasks. Thus, an optimal solution may derive a result of length $|I_{i_1} \cup I'|$ for tasks $T_{i_1}$ and $T_{i_j}$. Therefore, we have

$$\frac{|I_{i_1} \cup I_{i_j} \cup I'|}{|I_{i_1} \cup I'|} \le \frac{|I_{i_1} \cup I_{i_j}| + |I'|}{|I_{i_1}| + |I'|} \qquad (6)$$
$$\le \frac{|I_{i_1}| + |I_{i_j}| + |I'|}{|I_{i_1}| + |I'|} \qquad (7)$$
$$\le \frac{|I_{i_1}| + |I_{i_j}| + |I_{i_j}|}{|I_{i_1}| + |I_{i_j}|} \qquad (8)$$
$$= \frac{\ell_{i_1} + 2\ell_{i_j}}{\ell_{i_1} + \ell_{i_j}} \qquad (9)$$
$$\le 2 \qquad (10)$$

A tight example is shown in Fig.6. Algorithm 3 returns a result of length $2\ell$ as shown in Fig.6a, while the optimal result is of length $\varepsilon + \ell$ as shown in Fig.6b. Thus, $\lim_{\varepsilon \to 0} \frac{2\ell}{\varepsilon + \ell} = 2$.

The time complexity of Algorithm 3 is $O(n^2)$ due to the step of identifying tasks which overlap with the first remaining task in each iteration.

Property 1.
Let $T_m$ be the task with the minimum end time, i.e. $e_m = \min_{i=1}^{n} e_i$. Then picking the sub-interval $[e_m - \ell_m, e_m]$ does not lead to a worse result.

[Fig. 6: A tight example of Algorithm 3. Panels: (a) result; (b) optimal result.]

Proof: One can find that, if the sub-interval $I_m = [e_m - \ell_m, e_m]$ is picked, there are two possible cases: no task overlaps with $T_m$, or some other task overlaps with $T_m$. In the first case, it is apparent that $[e_m - \ell_m, e_m]$ is a best solution. In the second case, as shown in Fig.7, three sub-cases exist. Let $I_{opt}$ be an optimal solution. Let $I'$ be the union of the picked sub-intervals excluding $I_m$ in an optimal solution, i.e. $I_{opt} = I_m \cup I'$. In the first sub-case, $I'$ covers $I_m$, so $|I_{opt}| = |I'|$. In this sub-case, $I_m$ does not contribute to the optimal result; thus, any sub-interval of length $\ell_m$ in $[b_m, e_m] \cap I'$ is a good choice, and $[e_m - \ell_m, e_m]$ is one of the choices. In the second sub-case, $I'$ overlaps with $I_m$, so $|I_{opt}| = |I_m \cup I'|$. $[e_m - \ell_m, e_m]$ is the best choice, which reduces the length of the union of the picked intervals in this sub-case. In the third sub-case, $I'$ does not overlap with $I_m$, so $|I_{opt}| = |I_m| + |I'|$. Any sub-interval of length $\ell_m$ within $[b_m, e_m]$ is a choice, and $[e_m - \ell_m, e_m]$ is one of the choices. Thus the property is proved.

[Fig. 7: Task with minimum end time. Panels: (a) covering $I_m$; (b) overlapping with $I_m$; (c) not overlapping with $I_m$.]

V. MULTIPLE TASKS WITH SAME DATA SAMPLING INTERVAL LENGTH

In this section, we study a special instance of the interval data sharing problem where the length of the data sampling interval of all tasks is the same. Different from the general problem, this special instance can be solved with a dynamic programming algorithm.

We are given a set of tasks $T = \{T_1, T_2, \ldots, T_n\}$ and a positive integer $\ell$. Each task $T_i$ is denoted as $T_i = \langle b_i, e_i, \ell \rangle$, where $b_i$ is the beginning time and $e_i$ is the end time.
The problem is to pick a continuous sub-interval of length $\ell$ for each task $T_i$ in $[b_i, e_i]$, so that the length of the union of all the picked sub-intervals on the time axis is minimized.

Definition 7. In the same data sampling interval length situation, a task $T_i$ covers $T_j$ if $[b_j, e_j]$ is a sub-interval of $[b_i, e_i]$, that is, $[b_j, e_j] \subseteq [b_i, e_i]$.

One can find that in the same data sampling interval length situation, tasks which cover some other task can be removed. This is because any interval that is satisfied for the covered shorter task must be satisfied for the longer task. In Fig.8a, task $T_2$ covers $T_1$. If they have the same data sampling interval length, then any interval $I$ that is satisfied for $T_1$ is satisfied for task $T_2$. Therefore, we do not have to consider task $T_2$, and $T_2$ can be removed in our algorithm. As shown in Fig.8b, we will get the same result after removing $T_2$.

[Fig. 8: Example of covering. Panels: (a) before removing; (b) after removing.]

Property 2. Let the data sampling interval length of all tasks be the same. If $T_i$ covers $T_j$, i.e. $[b_j, e_j] \subseteq [b_i, e_i]$ for any $i, j = 1, 2, \ldots, n$, then any interval that is satisfied for $T_j$ is satisfied for $T_i$.

After removing the tasks which cover other tasks, the problem can be solved with a dynamic programming algorithm. Let $T' = \{T'_1, T'_2, \ldots, T'_m\}$ be the set of tasks none of which covers another task. Assume that $T'_1, T'_2, \ldots, T'_m$ are sorted in ascending order by end time. Let $I(i, j)$ be the interval that is satisfied for both $T'_i$ and $T'_j$, where $T'_i$ overlaps with $T'_j$, i.e. $T'_i \cap T'_j \neq \emptyset$, $i \le j$. We have

$$I(i, j) = \begin{cases} [e'_i - \ell,\ e'_i] & \text{if } b'_j \le e'_i - \ell, \\ [e'_i - \ell,\ b'_j + \ell] & \text{if } e'_i - \ell < b'_j < e'_i. \end{cases} \qquad (11)$$

It is easy to get $I(i, j)$ in Equation (11) from Fig.9. There are only two cases when $T'_i$ overlaps with $T'_j$. In the first case, $T'_j$ covers interval $[e'_i - \ell, e'_i]$ as shown in Fig.9a, and then $I(i, j) = [e'_i - \ell, e'_i]$. In the second case, $T'_j$ overlaps with interval $[e'_i - \ell, e'_i]$ as shown in Fig.9b, and then $I(i, j) = [e'_i - \ell, b'_j + \ell]$.

[Fig. 9: Illustration of computing $I(i, j)$. Panels: (a) case 1; (b) case 2.]

It is obvious that $I(i, j)$ is satisfied for all tasks $T'_i, T'_{i+1}, \ldots, T'_j$. This is because, if $T'_i$ overlaps with $T'_j$, $T'_i$ also overlaps with the tasks from $T'_{i+1}$ to $T'_{j-1}$, since the tasks are sorted in ascending order by end time.
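The removal step justified by Property 2 can be sketched as a single pass over the tasks sorted by end time. This is a hedged illustration, not the paper's code: tasks are assumed to be `(b, e)` pairs with a common sampling length, the function name `remove_covering_tasks` is hypothetical, and the tie-break on equal end times is a design choice made here so that the wider of two tasks sharing an end time is the one dropped.

```python
def remove_covering_tasks(tasks):
    """Drop every task (b, e) whose window contains another task's window."""
    kept = []
    b_max = float("-inf")
    # Ascending end time; on ties, visit the later-starting (narrower) task first.
    for b, e in sorted(tasks, key=lambda t: (t[1], -t[0])):
        if b <= b_max:
            # An earlier-ending task starts no earlier than b, so (b, e) covers it.
            continue
        kept.append((b, e))
        b_max = b
    return kept
```

The surviving tasks come out sorted by end time with strictly increasing begin times, which is the ordering the dynamic program relies on.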
Let $f(i)$ be the result with the minimum length of the union for tasks $T'_i, T'_{i+1}, \ldots, T'_m$, where $[e'_i - \ell, e'_i]$ is picked. Let $g(i)$ be the index $x$ which results in the minimum length of the union for tasks $T'_i, T'_{i+1}, \ldots, T'_m$. Then $g(i)$ and $f(i)$ can be represented as follows.

$$g(i) = \begin{cases} \arg\min_{x:\ T'_i \text{ overlaps } T'_x,\ i \le x \le m} \left\{ |I(i, x) \cup f(x + 1)| \right\} & 1 \le i < m \\ m & i = m \end{cases} \qquad (12)$$
$$f(i) = \begin{cases} I(i, g(i)) \cup f(g(i) + 1) & 1 \le i < m \\ [e'_m - \ell,\ e'_m] & i = m \\ \emptyset & i > m \end{cases} \qquad (13)$$

Algorithm 4: SOLVE-COMMON-WEIGHT($T$)
Input: $T = \{T_1, T_2, \ldots, T_n\}$, $T_i = \langle b_i, e_i, \ell \rangle$ for $i = 1, 2, \ldots, n$;
Output: a set of intervals $I$ that is satisfied for all tasks.
 1: Sort tasks in ascending order by end time. Assume that the sorted task set is $T = \{T_{k_1}, T_{k_2}, \ldots, T_{k_n}\}$;
 2: $b_{max} = -\infty$; /* the following loop removes the tasks which cover some other tasks */
 3: for $i = 1$ to $n$ do
 4:   if $b_{k_i} \le b_{max}$ then
 5:     remove $T_{k_i}$ from $T$;
 6:   else
 7:     $b_{max} = b_{k_i}$;
 8: Let the task set after the removing step be $T' = \{T'_1, T'_2, \ldots, T'_m\}$, and assume that it is sorted in ascending order by end time;
 9: compute $I(i, j)$ for $1 \le i \le j \le m$;
10: $f(m + 1) = \emptyset$;
11: for $i = m$ to 1 do
12:   $min = +\infty$;
13:   $g(i) = 1$;
14:   for $x = i$ to $m$ do
15:     if $b'_x \ge e'_i$ then
16:       break; /* do nothing for $T'_x$ that does not overlap with $T'_i$ */
17:     if $|I(i, x) \cup f(x + 1)| < min$ then
18:       $g(i) = x$;
19:       $min = |I(i, x) \cup f(x + 1)|$;
20:   $f(i) = I(i, g(i)) \cup f(g(i) + 1)$;
21: $I = f(1)$;
22: return $I$;

An example is shown in Fig.10, and the process of this example is presented in Table I. By Equation (11), we derive $I(i, j)$ in Table Ia, and $f(i)$ is obtained in Table Ib from Equations (12) and (13). As represented in Equation (13), we get $f(5) = \emptyset$ first. By recalling the definition of $f(i)$, $f(4) = I(4, 4) = [e'_4 - \ell, e'_4]$. Then $f(3)$ is the one with the smaller union length between $I(3, 3) \cup f(4)$ and $I(3, 4) \cup f(5)$; thus we get $f(3) = I(3, 4)$. After that, $f(2)$ is the one with the smaller union length between $I(2, 2) \cup f(3)$ and $I(2, 3) \cup f(4)$, and we obtain $f(2) = I(2, 2) \cup f(3)$.
Finally, $f(1)$ is the one with the smallest union length among $I(1, 1) \cup f(2)$, $I(1, 2) \cup f(3)$ and $I(1, 3) \cup f(4)$, and we get $f(1) = I(1, 2) \cup f(3)$.

The dynamic programming algorithm is described in Algorithm 4. In Algorithm 4, the tasks are sorted in ascending order by end time in line 1. Lines 2-7 remove the tasks which cover other tasks. $f(i)$ is computed in lines 10-20. Line 15 checks whether $T'_x$ overlaps with $T'_i$. If $T'_x$ does not overlap with $T'_i$, nothing is done and the loop is exited, because none of the later tasks will overlap with $T'_i$. If $T'_x$ overlaps with $T'_i$, the algorithm records the best index $g(i)$ and the minimum result. The final result is $f(1)$.
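The recurrence in Equations (11) to (13) can be sketched directly. The sketch below is an illustration under stated assumptions, not the authors' implementation: it stores every $f(i)$ explicitly as a list of intervals (so it does not achieve the $O(n)$ memory bound), indices are 0-based, tasks are `(b, e)` pairs already sorted by end time with covering tasks removed, and all function names are hypothetical.

```python
def merged_length(intervals):
    """Length of the union of (start, end) intervals."""
    total, cur_start, cur_end = 0, None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def pair_interval(tasks, i, j, length):
    """I(i, j) from Equation (11) for overlapping tasks i <= j."""
    e_i, b_j = tasks[i][1], tasks[j][0]
    if b_j <= e_i - length:
        return (e_i - length, e_i)
    return (e_i - length, b_j + length)


def solve_common_length(tasks, length):
    """Evaluate f(1) of Equation (13); returns the chosen list of intervals."""
    m = len(tasks)
    f = [[] for _ in range(m + 1)]  # f[m] plays the role of f(m + 1), the empty set
    for i in range(m - 1, -1, -1):
        best = None
        for x in range(i, m):
            if tasks[x][0] >= tasks[i][1]:
                break  # later tasks cannot overlap task i either
            candidate = [pair_interval(tasks, i, x, length)] + f[x + 1]
            if best is None or merged_length(candidate) < merged_length(best):
                best = candidate
        f[i] = best
    return f[0]
```

As a usage check, the end times 7, 9, 16, 18 and the begin times 4 and 6 of the second and third tasks can be read off Table I with length 4; the begin times 1 and 10 of the first and fourth tasks are assumed here. With `tasks = [(1, 7), (4, 9), (6, 16), (10, 18)]`, `solve_common_length(tasks, 4)` returns `[(3, 8), (12, 16)]`, matching $f(1) = I(1, 2) \cup f(3)$.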

[Fig. 10: An example for Algorithm 4]

In lines 10-20 of Algorithm 4, it seems that $O(n)$ memory is needed to record one $f(i)$, and $O(n^2)$ memory to record all $f(i)$. As memory is a restricted resource at each sensor node, a way to reduce the memory usage is needed. Actually, we only want to compute $f(1)$, and it is not necessary to record every $f(i)$. It is enough to record $g(i)$ and $|f(i)|$, and we can recover $f(1)$ from $g(i)$ at the end of the algorithm, when every $g(i)$ has been found. Therefore, lines 10-22 in Algorithm 4 can be replaced with Algorithm 5. In Algorithm 5, lines 7-11 compute the overlap length of $I(i, x)$ and $f(x+1)$, so as to get the length of $I(i, x) \cup f(x+1)$ in line 12. The memory complexity of Algorithm 5 is $O(n)$.

In summary, the special instance where tasks have the same data sampling interval length can be addressed with time complexity $O(n^2)$ and memory complexity $O(n)$.

TABLE I: Computing $I(i, j)$ and $f(i)$ for the example in Fig.10 ($\ell = 4$)

(a) Computing $I(i, j)$
  I(i, j)    result
  I(1, 1)    [3, 7]
  I(1, 2)    [3, 8]
  I(1, 3)    [3, 10]
  I(2, 2)    [5, 9]
  I(2, 3)    [5, 10]
  I(3, 3)    [12, 16]
  I(3, 4)    [12, 16]
  I(4, 4)    [14, 18]

(b) Computing $f(i)$
  f(i)    result
  f(4)    I(4, 4)
  f(3)    I(3, 4)
  f(2)    I(2, 2) ∪ f(3)
  f(1)    I(1, 2) ∪ f(3)

time respectively. The task durations are 13, 17, 19 and 23 unit time respectively in the second case. The task durations of the third case are 17, 19, 23 and 29 unit time, and 19, 23, 29 and 31 for the fourth case. We assume that sensor nodes can sample once and obtain one unit of data in each unit time. The sensor nodes run Algorithm 4 every maxtime unit time, where maxtime is a parameter chosen according to the computational ability of the sensor nodes. Higher computational ability allows a larger maxtime. Our algorithm is compared with the baseline method introduced in Section 4, which initiates a continuous data sampling interval at the beginning of each task independently.
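The baseline used for comparison can be sketched as follows. This is a minimal sketch under two assumptions that the text does not spell out: the amount of sampled data equals the number of distinct unit-time slots covered (one unit of data per unit time, so simultaneous samples are counted once), and tasks are `(b, e, l)` tuples with integer times. The function names are hypothetical.

```python
def baseline_intervals(tasks):
    """Each task samples its l units right at its own beginning time."""
    return [(b, b + l) for b, e, l in tasks]


def sampled_amount(intervals):
    """Number of distinct unit-time slots covered by integer intervals."""
    slots = set()
    for start, end in intervals:
        slots.update(range(start, end))  # slot t stands for [t, t + 1)
    return len(slots)
```

For instance, the three tasks `(0, 5, 2)`, `(3, 8, 2)` and `(10, 15, 3)` give a baseline amount of 7 units, whereas the two shared intervals `(3, 5)` and `(12, 15)` satisfy all three tasks with 5 units.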
Algorithm 5: COMPUTE-f(1)
Input: $T' = \{T'_1, T'_2, \ldots, T'_m\}$, $T'_i = \langle b'_i, e'_i, \ell \rangle$ for $i = 1, 2, \ldots, m$;
Output: a set of intervals $I$ that is satisfied for all tasks.
 1: $f(m + 1) = 0$;
 2: for $i = m$ to 1 do
 3:   $min = +\infty$;
 4:   $g(i) = 1$;
 5:   for $x = i$ to $m$ do
 6:     if $b'_x < e'_i$ then
 7:       assume $I(i, x) = [s_{ix}, e_{ix}]$;
 8:       if $e'_{x+1} - \ell < e_{ix}$ then
 9:         $overlap = e_{ix} - (e'_{x+1} - \ell)$;
10:       else
11:         $overlap = 0$;
12:       if $|I(i, x)| + f(x + 1) - overlap < min$ then
13:         $g(i) = x$;
14:         $min = |I(i, x)| + f(x + 1) - overlap$;
15:   $f(i) = min$;
16: $I = \emptyset$;
17: $k = 1$;
18: while $k \le m$ do
19:   $I = I \cup I(k, g(k))$;
20:   $k = g(k) + 1$;
21: return $I$;

VI. PERFORMANCE EVALUATION

In this section, we evaluate the effectiveness of the proposed algorithms through simulations. The simulations are implemented with TOSSIM [26], which is a widely used simulation tool in wireless sensor networks. Four cases are tested in this section. In each case of our experiments, four applications, each with different task durations and different data sampling interval lengths, are tested. In the first case, the task durations of the four applications are 11, 13, 17 and 19 unit

[Fig. 11: Data amount for short interval lengths. Vertical axis: the amount of sampled data; horizontal axis: different cases (case 1 to case 4).]

In the first set of simulations, we evaluate the performance of the proposed algorithms in terms of the amount of sampled data. The data sampling interval lengths for every case are 2, 3, 5, 7 unit time. Fig.11 shows that the baseline method samples much more data than the optimal solution, and the gap cannot be bounded. In this simulation, maxtime is set to 150 unit time. Our algorithm samples more data than the optimal solution, but always no more than two times the optimal result. Compared with the baseline method, our algorithm samples almost 200% less data when the data sampling interval length is short. One can also find that, when the task duration increases, the amount of data sampled by both the baseline method and the approximation algorithm decreases.
In the second group of simulations, we test the case where the data sampling interval lengths are longer. In such a case, the baseline method may sample data in every unit time. In this group of simulations, the data sampling interval lengths for

[Fig. 12: Data amount for longer interval lengths. Vertical axis: the amount of sampled data; horizontal axis: different cases (case 1 to case 4).]

the node number in the network is 160, the baseline method loses almost half of the sampled data that is sent. The approximation algorithm samples much less data, so the traffic carried on the network is not very heavy, and the data loss rate is much lower.

every case are 7, 11, 13, 17 unit time, and maxtime is still 150. The amount of data sampled by the optimal solution and the approximation algorithm is much larger when the data sampling interval lengths are longer, but it is still less than that of the baseline method, as shown in Fig.12.

The next group of simulations evaluates how maxtime affects the amount of sampled data. The result is shown in Fig.13. The amount of sampled data changes slightly for different maxtime settings. As maxtime increases, the amount of sampled data increases; however, the average amount of data does not vary a lot. This observation means that it is not necessary for the sensor nodes to handle a long maxtime. A small maxtime is already enough to derive a good result.

[Fig. 14: The amount of sent and received data. Vertical axis: data amount; horizontal axis: number of nodes (10, 40, 90, 160).]

Fig.15 shows how the amount of received data is affected by network scale. As the network scale increases, the amount of sent data increases. But the amount of received data may decrease when the network scale is too large. As shown in Fig.15, the baseline method samples much more data than the approximation and the optimal algorithms, which results in severe data loss. The approximation algorithm transmits no more than two times the amount of data of the optimal solution, so these two algorithms are not very different.

[Fig. 13: Data amount for different maxtime settings. Vertical axis: data amount per unit time; horizontal axis: maxtime (100 to 180).]

Next we evaluate the impact of the node density in a network on the amount of sampled data.
Fig.14 iustrates the amount of data sent by source nodes and received by the base station. In our simuations, every sensor node in the network sampes data independenty, and the data is transmitted over the network through a routing tree. As the node density increases, the amount of sent data increases, but the amount of data received by the base station may decrease when the node density is very arge. This is because data oss rate increases sharpy due to unreiabe wireess ink and communication congestion in node-dense networks. In this simuation, when The amount of received data 1200 1000 800 600 400 200 optima 0 50 100 150 200 250 Area width Fig. 15: The amount of received data Fig.16 shows how the data oss rate is affected by network scae. The method and the agorithm have a simiar oss rate in sma scae networks. But when the network scae is very arge, the data oss rate of method is amost 70%. This is because the method sampes a arge amount of data which resut in numerous coisions in arge scae network. The optima soution and the agorithm which sampe ess amount of data show a better resut. VII. CONCLUSION Data sharing for mutipe appications is an efficient way to reduce the communication cost in WSNs. Many appications need an continuous interva of data samping periodicay. This

Data oss rate 0.7 0.6 0.5 0.4 0.3 0.2 0.1 optima 0.0 50 100 150 200 250 Area width Fig. 16: Packet oss paper is the first work to introduce the interva data sharing probem among mutipe appications, which is a noninear nonconvex optimization probem. Since no efficient universa soution has been found for such probem, we provide a approximation agorithm to ower the high computationa compexity of the avaiabe soutions. We prove that the provided agorithm is a 2-factor approximation agorithm. The time compexity of this approximation agorithm is O(n 2 ) and the memory compexity is O(n). In a specia instance where a tasks have the same data samping interva ength, the probem can be addressed in poynomia time, and a dynamic programming agorithm is provided for this specia instance. The time compexity of the dynamic programming agorithm is O(n 2 ) and the memory compexity is O(n). ACKNOWLEDGMENT This work was supported in part by the Major Program of Nationa Natura Science Foundation of China under grant No. 61190115, the Nationa Basic Research Program of China (973 Program) under grant No. 2012CB316200, and the Nationa Natura Science Foundation of China (NSFC) under grants No. 61033015, No. 60933001 and No. 61100030. REFERENCES [1] W.I. Grosky, A. Kansa, S. Nath, Jie Liu, and Feng Zhao. Senseweb: An infrastructure for shared sensing. Mutimedia, IEEE, 14(4):8 13, oct.-dec. 2007. [2] Niki Trigoni, Yong Yao, Aan Demers, and Johannes Gehrke. Mutiquery optimization for sensor networks. In In DCOSS, pages 307 321, 2005. [3] Ming Li, Tingxin Yan, Deepak Ganesan, Eric Lyons, Prashant Shenoy, Arun Venkataramani, and Michae Zink. Muti-user data sharing in radar sensor networks. In Proceedings of the 5th internationa conference on Embedded networked sensor systems, SenSys 07, pages 247 260, New York, NY, USA, 2007. ACM. [4] You Xu, Abusayeed Saifuah, Yixin Chen, Chenyang Lu, and Sangeeta Bhattacharya. Near optima muti-appication aocation in shared sensor networks. 
In Proceedings of the eleventh ACM international symposium on Mobile ad hoc networking and computing, MobiHoc '10, pages 181-190, New York, NY, USA, 2010. ACM.
[5] S. Ji and Z. Cai. Distributed data collection and its capacity in asynchronous wireless sensor networks. In Proceedings of the 31st Annual IEEE International Conference on Computer Communications, IEEE INFOCOM, 2012.
[6] Z. Cai, S. Ji, and J. Li. Data caching based query processing in multi-sink wireless networks. International Journal of Sensor Networks, 11(2):109-125, 2012.
[7] Z. Cai, S. Ji, J. He, and A. G. Bourgeois. Optimal distributed data collection for asynchronous cognitive radio networks. In Proceedings of the 32nd International Conference on Distributed Computing Systems 2012, ICDCS, 2012.
[8] S. Cheng, J. Li, and Z. Cai. O(ε)-approximation to physical world by sensor networks. In Proceedings of the 32nd IEEE International Conference on Computer Communications, IEEE INFOCOM 2013.
[9] Arsalan Tavakoli, Aman Kansal, and Suman Nath. On-line sensing task optimization for shared sensors. In Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks, IPSN '10, pages 47-57, New York, NY, USA, 2010. ACM.
[10] S. Ganesan and R.D. Finch. Monitoring of rail forces by using acoustic signature inspection. Journal of Sound and Vibration, 114(2):165-171, 1987.
[11] M. Cerullo, G. Fazio, M. Fabbri, F. Muzi, and G. Sacerdoti. Acoustic signal processing to diagnose transiting electric trains. Intelligent Transportation Systems, IEEE Transactions on, 6(2):238-243, June 2005.
[12] Liang Cheng and S.N. Pakzad. Agility of wireless sensor networks for earthquake monitoring of bridges. In Networked Sensing Systems (INSS), 2009 Sixth International Conference on, pages 1-4, June 2009.
[13] Makoto Suzuki, Shunsuke Saruwatari, Narito Kurata, and Hiroyuki Morikawa. A high-density earthquake monitoring system using wireless sensor networks. In SenSys, pages 373-374, 2007.
[14] Rui Tan, Guoliang Xing, Jinzhu Chen, Wen-Zhan Song, and Renjie Huang. Quality-driven volcanic earthquake detection using wireless sensor networks. In Real-Time Systems Symposium (RTSS), 2010 IEEE 31st, pages 271-280, Nov. 30-Dec. 3, 2010.
[15] Alan Mainwaring, David Culler, Joseph Polastre, Robert Szewczyk, and John Anderson. Wireless sensor networks for habitat monitoring. In Proceedings of the 1st ACM international workshop on Wireless sensor networks and applications, WSNA '02, pages 88-97, New York, NY, USA, 2002. ACM.
[16] Robert Szewczyk, Alan Mainwaring, Joseph Polastre, John Anderson, and David Culler. An analysis of a large scale habitat monitoring application. In Proceedings of the 2nd international conference on Embedded networked sensor systems, SenSys '04, pages 214-226, New York, NY, USA, 2004. ACM.
[17] Robert Szewczyk, Eric Osterweil, Joseph Polastre, Michael Hamilton, Alan Mainwaring, and Deborah Estrin. Habitat monitoring with sensor networks. Commun. ACM, 47:34-40, June 2004.
[18] Timos K. Sellis. Multiple-query optimization. ACM Trans. Database Syst., 13:23-52, March 1988.
[19] Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobe. Efficient and extensible algorithms for multi query optimization. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data, SIGMOD '00, pages 249-260, New York, NY, USA, 2000. ACM.
[20] Sailesh Krishnamurthy, Chung Wu, and Michael Franklin. On-the-fly sharing for streamed aggregation. In Proceedings of the 2006 ACM SIGMOD international conference on Management of data, SIGMOD '06, pages 623-634, New York, NY, USA, 2006. ACM.
[21] Shili Xiang, Hock Beng Lim, Kian-Lee Tan, and Yongluan Zhou. Two-tier multiple query optimization for sensor networks. In Proceedings of the 27th International Conference on Distributed Computing Systems, ICDCS '07, page 39, Washington, DC, USA, 2007. IEEE Computer Society.
[22] Dimitri P. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.
[23] D. Henrion and J.-B. Lasserre. Solving nonconvex optimization problems. Control Systems, IEEE, 24(3):72-83, June 2004.
[24] Jon Lee and Sven Leyffer. Mixed Integer Nonlinear Programming. The IMA Volumes in Mathematics and its Applications. Springer, 2011.
[25] D. Li and X. Sun. Nonlinear integer programming. International series in operations research & management science. Springer, 2006.
[26] Philip Levis, Nelson Lee, Matt Welsh, and David Culler. TOSSIM: accurate and scalable simulation of entire TinyOS applications. In Proceedings of the 1st international conference on Embedded networked sensor systems, SenSys '03, pages 126-137, New York, NY, USA, 2003. ACM.