Secure Data Aggregation in Wireless Sensor Networks


by Hani Alzaid
Bachelor of Computer Engineering (King Saud University) 2000
Master of Computer Science and Engineering (University of New South Wales) 2005

Thesis submitted in accordance with the regulations for the Degree of Doctor of Philosophy

Information Security Institute
Faculty of Science and Technology
Queensland University of Technology

March 1, 2011


Keywords

Secure data aggregation, wireless sensor networks, performance analysis, security analysis, reputation systems, trust systems, node compromise, attacks, cryptographic-based solutions, reputation-based solutions, forward & backward secure key management, On-Off attacks.


Abstract

A Wireless Sensor Network (WSN) is a set of sensors that are integrated with a physical environment. These sensors are small in size, and capable of sensing physical phenomena and processing them. They communicate in a multihop manner, due to a short radio range, to form an Ad Hoc network capable of reporting network activities to a data collection sink. Recent advances in WSNs have led to several new promising applications, including habitat monitoring, military target tracking, natural disaster relief, and health monitoring. A current sensor node such as MICA2 uses an 8-bit Atmel ATmega128L microcontroller with only 4 KB RAM, 128 KB program space, and 512 KB external flash memory to store measurement data, and is powered by two AA batteries. Due to these unique specifications and a lack of tamper-resistant hardware, devising security protocols for WSNs is complex.

Previous studies show that data transmission consumes much more energy than computation. Data aggregation can greatly help to reduce this consumption by eliminating redundant data. However, aggregators are under the threat of various types of attacks. Among them, node compromise is usually considered one of the most challenging for the security of WSNs. In a node compromise attack, an adversary physically tampers with a node in order to extract the cryptographic secrets. This attack can be very harmful depending on the security architecture of the network. For example, when an aggregator node is compromised, it is easy for the adversary to change the aggregation result and inject false data into the WSN.

The contributions of this thesis to the area of secure data aggregation are manifold. Firstly, we define the security for data aggregation in WSNs. In contrast with existing secure data aggregation definitions, the proposed definition covers the unique characteristics that WSNs have. Secondly, we analyze the relationship between security services and adversarial models considered in existing secure data aggregation schemes in order to provide a general framework of required security services. Thirdly, we analyze existing cryptographic-based and reputation-based secure data aggregation schemes. This analysis covers security services provided by these schemes and their robustness against attacks. Fourthly, we propose a robust reputation-based secure data aggregation scheme for WSNs. This scheme minimizes the use of heavy cryptographic mechanisms. The security advantages provided by this scheme are realized by integrating aggregation functionalities with: (i) a reputation system, (ii) estimation theory, and (iii) a change detection mechanism. We have shown that this addition helps defend against most of the security attacks discussed in this thesis, including the On-Off attack. Finally, we propose a secure key management scheme in order to distribute essential pairwise and group keys among the sensor nodes. The design combines Lamport's reverse hash chain with the usual hash chain to provide both past and future key secrecy. The proposal avoids the delivery of the whole value of a new group key for group key update; instead, only half of the value is transmitted from the network manager to the sensor nodes. This way, the compromise of a pairwise key alone does not lead to the compromise of the group key. The new pairwise key in our scheme is determined by Diffie-Hellman based key agreement.

Contents

Front Matter
    Keywords
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Declaration
    Previously Published Material
    Acknowledgements

1 Introduction
    Background
    Challenges in Wireless Sensor Networks
    Challenges in the End Device
    Challenges in the Network
    Data Aggregation and Security Challenges
    Research Objectives
    Outline

2 Secure Data Aggregation in Wireless Sensor Networks
    Secure Data Aggregation in Wireless Sensor Networks
    Security Requirements for Data Aggregation Security
    The Expected Adversarial Model and Security Concerns
    Security Attacks
    Sybil Attack (SY)
    Selective Forwarding Attack (SF)
    Replay Attack (RE)
    Spoofed Data Attack (SD)
    Adversary Classification
    Current Secure Data Aggregation Schemes
    Single Aggregator Model
    Du et al.'s Scheme
    Przydatek et al.'s Scheme
    Mahimkar & Rappaport's Scheme
    Sanli et al.'s Scheme
    Multiple Aggregator Model
    Hu & Evans's Scheme
    Jadia & Mathuria's Scheme
    Westhoff et al.'s Scheme
    Yang et al.'s Scheme
    Security Analysis
    Security Services
    Attack Vulnerability
    Framework for Evaluating New Schemes
    Performance Analysis
    First Scenario: No Aggregation & No Security
    Second Scenario: Aggregation but No Security
    Third Scenario: Hu & Evans's Scheme
    Fourth Scenario: Jadia & Mathuria's Scheme
    Fifth Scenario: Przydatek et al.'s Scheme
    Sixth Scenario: Du et al.'s Scheme
    Example
    Summary

3 Reputation-based Trust Systems in Wireless Sensor Networks
    Analysis Framework for Reputation Systems
    Information Gathering and Sharing Phase
    Information Modeling Phase
    Decision Making Phase
    Dissemination Phase
    Security Attacks against Reputation-based Trust Systems
    Bad Mouthing Attack (BM)
    Ballot Stuffing Attack (BS)
    On-Off Attack (OO)
    Newcomer Attack (NE)
    The State of the Art of Reputation-based Trust Systems in WSNs
    Boukerche & Ren's Scheme
    Shaikh et al.'s Scheme
    Michiardi & Molva's Scheme
    Srinivasan et al.'s Scheme
    Özdemir's Scheme
    Comparison of Current Reputation-based Systems in WSNs
    Classification Model
    Reputation Components
    Attack Vulnerability
    Summary

4 Reputation-based Secure Data Aggregation
    Network Assumptions
    Data Model
    Adversarial Model
    Security Requirements
    The Proposal Reputation-based Secure Data Aggregation Scheme
    Experimental Evaluation
    Scenario 1: No Attacks
    Scenario 2: Abrupt Change
    Scenario 3: 1-per-2 Strategy On-Off Attack
    Security Analysis
    Reputation Components
    Security Services
    Attacks Resilience
    Summary

5 Mitigating On-Off Attacks in Reputation-based Secure Data Aggregation
    Related Work
    Estimation Theory
    Change Point Detection
    The Proposed Enhanced Reputation-based Secure Data Aggregation Scheme
    Experiment Evaluation
    Scenario 1: No Attacks
    Scenario 2: Abrupt or Incipient Change
    Scenario 3: 1-per-2 Strategy On-Off Attack
    Scenario 4: 1-per-3 Strategy On-Off Attack
    Summary

6 A Forward & Backward Secure Key Management in Wireless Sensor Networks
    Adversary Model and Security Concerns
    Related Work
    The Proposed Forward & Backward Secure Key Management Scheme - FBSKM
    Group Key Update Protocol
    Pairwise Key Update Protocol
    Delivery Failure Management
    The Enhanced FBSKM (E-FBSKM)
    Security Analysis
    Robustness Against Adversaries
    Achievement of Past & Future Secrecy
    Resilience Against Impersonation Attacks
    Performance Analysis
    Memory Overhead
    Communication Overhead
    Computation Cost
    Summary

7 Conclusion and Future Work
    Research Summary
    Future Work

Bibliography

List of Figures

Main components of a sensor node
An aggregation scenario using the SUM aggregation function
Sybil Attack
Selective Forwarding Attack
Replay Attack
Spoofed Data Attack
Classification of adversaries
A sketch of single and multiple aggregator models
Classification of current secure data aggregation schemes
A Merkle hash tree
The proposed framework for secure data aggregation schemes
The aggregation tree model used in the performance analysis section
The reputation system phases
Bad Mouthing Attack
Ballot Stuffing Attack
On-Off Attack
Newcomer Attack
A community as suggested in TOMS [12]
Classification of current reputation-based trust systems in WSNs
A simplified deployment area for Özdemir's scheme
The radio coverage in RSDA
A simplified deployment area for RSDA
The first scenario of RSDA evaluation in which dataset-1 is used
The second scenario of RSDA evaluation in which dataset-2 is used
The third scenario of RSDA evaluation in which dataset-3 is used
Reputation values of C rep k during the third scenario of RSDA evaluation
The third scenario of RSDA evaluation in which dataset-4 is used
A simplified estimation model for data aggregation in WSNs
A simplified deployment area for E-RSDA
A simplified E-RSDA model
The first scenario of E-RSDA evaluation in which dataset-1 is used
The second scenario of E-RSDA evaluation in which dataset-2 is used
The second scenario of E-RSDA evaluation in which dataset-3 is used
The third scenario of E-RSDA evaluation in which dataset-4 is used
Reputation values of C rep k during the third scenario of E-RSDA evaluation
The third scenario of E-RSDA evaluation in which dataset-5 is used
The fourth scenario of E-RSDA evaluation in which dataset-6 is used
Reputation values of C rep k during the fourth scenario of E-RSDA evaluation
The fourth scenario of E-RSDA evaluation in which dataset-7 is used
Classification of adversaries
Key evolution in the proposed protocol
State diagram of key disclosure
Relations between keying materials and the significance of node compromise

List of Tables

Hardware's specifications for three types of sensor nodes
Security services provided in current secure data aggregation schemes
Attacks vulnerabilities in current secure data aggregation schemes
Description of notations used in the performance analysis section
Number of bytes transmitted across the network to accomplish a single aggregation transaction
Reputation components in current reputation-based trust systems
Attacks vulnerabilities in current reputation-based trust systems
Description of notations used in Chapter 4
Reputation table format as suggested in RSDA
Datasets used in the experimental evaluation section
Reputation components in current reputation-based trust systems
Security services provided in current secure data aggregation protocols
Attacks vulnerabilities in current reputation-based trust systems
Description of notations used in Chapter 5
Data sets used in the experiment evaluation
Description of notations used in Chapter 6
Memory overhead comparison
Number of bits transmitted/received by a sensor
Computation cost comparison


Declaration

The work contained in this thesis has not been previously submitted for a degree or diploma at any higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signed:
Date:


Previously Published Material

The following papers have been published or presented, and contain material based on the content of this thesis.

• Book Chapters:

  Hani Alzaid, Ernest Foo, Juan Manuel González Nieto, and DongGook Park. Secure Data Aggregation in Wireless Sensor Networks. In Anna Foerster and Alexander Foerster, editors, Emerging Communications for Wireless Sensor Networks, chapter 10, InTech, Croatia.

  Hani Alzaid. Reputation-based Trust Systems in Wireless Sensor Networks. In Al-Sakib Khan Pathan, editor, Security of Self-Organizing Networks: MANET, WSN, WMN, VANET, chapter 20, Auerbach Publications, CRC Press, Taylor & Francis Group, USA.

  Hani Alzaid, DongGook Park, Juan Manuel González Nieto, Colin Boyd, and Ernest Foo. A Forward & Backward Secure Key Management in Wireless Sensor Networks for PCS/SCADA. In Raúl Aquino Santos, Arthur Edwards, and Victor Rangel Licea, editors, Emerging Technologies in Wireless Ad Hoc Networks: Applications and Future Development, chapter 3, pages 41-60, IGI Global, USA.

• Journal Articles:

  Hani Alzaid, Ernest Foo, Juan Manuel González Nieto, and DongGook Park. Secure Data Aggregation in Wireless Sensor Networks: A Comprehensive Review. International Journal of Communication Networks and Distributed Systems (IJCNDS), Invited Article, In press, InderScience Publishers.

  Hani Alzaid, Ernest Foo, Juan Manuel González Nieto, and Ejaz Ahmed. Mitigating the On-Off Attacks in Reputation-based Secure Data Aggregation for Wireless Sensor Networks. Security and Communication Networks, In press.

• Conference Papers:

  Hani Alzaid, DongGook Park, Juan Manuel González Nieto, and Ernest Foo. Mitigating Sandwich Attacks against a Secure Key Management Scheme in Wireless Sensor Networks for PCS/SCADA. In Proceedings of the 24th IEEE International Conference on Advanced Information Networking and Applications, AINA '10, Perth, Australia, April 2010, IEEE Computer Society.

  Hani Alzaid, DongGook Park, Juan Manuel González Nieto, Colin Boyd, and Ernest Foo. A Forward & Backward Secure Key Management in Wireless Sensor Networks for PCS/SCADA. In Proceedings of the 1st International ICST Conference on Sensor Systems and Software, S-CUBE '09, 7-9 September 2009, Grand Hotel Duomo of Pisa, Pisa.

  Hani Alzaid, Ernest Foo, and Juan Manuel González Nieto. RSDA: Reputation-based Secure Data Aggregation in Wireless Sensor Networks. In Proceedings of the 9th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT '08, Dunedin, New Zealand, 1-4 December 2008, IEEE Computer Society.

  Hani Alzaid, Ernest Foo, and Juan Manuel González Nieto. Secure Data Aggregation in Wireless Sensor Networks: A Survey. In Proceedings of the 6th Australasian Information Security Conference: Conferences in Research and Practice in Information Technology, AISC '08, Wollongong, NSW, Australia, January 2008, Australian Computer Society Inc.

Acknowledgements


Chapter 1. Introduction

A Wireless Sensor Network (WSN) is a highly distributed network of small wireless nodes deployed in large numbers to monitor the environment or other systems by the measurement of physical parameters such as temperature, pressure, or relative humidity [85, page 647]. Advancements in micro-electro-mechanical systems, digital electronics, and wireless communications have enabled the development of a new generation of sensor nodes. These sensors are small in size, communicate in a multihop manner due to a short radio range, and are powered by a limited energy source. These sensor nodes collaborate to form an Ad Hoc Network capable of reporting network activities to a data collection sink. Recently, WSNs have been used in many promising applications, including habitat monitoring [76], military target tracking [55, 116], natural disaster relief [19], and health monitoring [82].

1.1 Background

WSN applications are classified into four classes [61]: (i) event detection, (ii) periodic reporting, (iii) base station querying, and (iv) tracking. These classes are briefly explained as follows:

• Event Detection: The objective of sensor networks in this application class is to detect rare events, such as forest fires or intrusions, and to promptly communicate a report of such an event to the sink.

• Periodic Reporting: The objective of the sensor networks in this type of application is to send periodic updates to the sink. Thus, there is regularity in terms of data gathering phases, and there is a steady flow of data from the sensor nodes to the sink. In-network data aggregation is useful in such applications because measurements of neighboring nodes are likely to be correlated, and this correlation can be used to reduce the amount of data that needs to be communicated to the sink. This in turn reduces the communication energy expenditure of the nodes, and prolongs the lifetime of the network.

• Base Station Querying: In several application classes, the sink is not interested in data updates from all the nodes in the network. The sink may want updates from different regions at different times. Thus, requiring all the nodes to send their data to the sink at all times increases the energy consumption on communication as well as on computation. In such cases, the sink selectively queries a set of sensor nodes located in the region of interest. This results in a more energy-efficient use of resources.

• Tracking: Tracking WSN applications are interested in detecting, localizing and tracking targets, and conveying the relevant information to the sink in a timely fashion. They combine some of the characteristics of the three application classes discussed earlier.

Figure 1.1: Main components of a sensor node

The end device in WSNs, the sensor node, is composed of four basic units [123]: (i) sensing unit, (ii) processing unit, (iii) power unit, and (iv) transceiver unit, as depicted in Figure 1.1. These four units are briefly explained as follows:

• Sensing Unit: It consists of an array of sensors that can measure the physical characteristics of its environment, such as temperature, light, and vibration. Each sensor senses environmental characteristics via the sensing unit, and an Analog to Digital Converter (ADC) then converts the sensed analog data into digital form.

• Processing Unit: It is, in most cases, composed of an internal memory to store data and application programs, and a microcontroller to process the data. The microcontroller can be considered as a highly constrained computer that contains the memory and interfaces required to create simple applications. This unit should be able to work with a limited energy resource and efficiently process the digital data delivered by the sensing unit.
• Power Unit: It provides the energy required by all the sensor components. This energy may come either from a battery or from renewable sources.

• Transceiver Unit: It is able to send and receive messages through a wireless channel. In other words, it gives the sensor the ability to talk to other sensor nodes and form an Ad Hoc Network.

Note that the sensor node may have an external memory unit that works as a secondary memory in order to keep a data log.

Solutions for WSNs cannot be obtained simply by adapting solutions designed for wired networks, or even for the more closely related Ad Hoc Networks. This is due to the limitations and challenges that WSNs have, which will be discussed in Section 1.2. A wireless Ad Hoc Network is a collection of wireless devices that can dynamically self-organize into an arbitrary and temporary topology to form a network without necessarily using any pre-existing infrastructure. In fact, wireless sensor networks could be considered as a specific subset of Ad Hoc Networks where end devices are able to sense physical phenomena. However, there are great differences between Ad Hoc Networks and WSNs, as listed below [16, 18]:

• Energy Source: Most WSNs are deployed in remote or hostile environments, whereas Ad Hoc Networks usually are not. Consequently, replacing the batteries of WSN nodes is more of a problem than it is for Ad Hoc Networks. As a result, the energy consumption of any solution designed for WSNs should be carefully considered at design time.

• Data Centric: Routing in WSNs is more likely to query attributes of the phenomenon (attribute-based naming) than individual node addresses (IPs). For example, "What is the area where the temperature is over 70 °C?" is a more common query in WSNs than a request for the temperature read by a certain sensor node.

• Node Density: The number of nodes in a WSN can be higher than the number of nodes in an Ad Hoc Network. WSNs are by nature deployed in large-scale environments, and each sensor has a limited transmission range. Therefore, dense deployment is necessary to achieve stable connectivity and to overcome the limited transmission coverage.

• End Device: In Ad Hoc Networks, the end device is less constrained than a sensor node. For example, an end device in an Ad Hoc Network, such as a laptop, has a larger memory and battery, and a more powerful processor.
• Network Structure: Whereas Ad Hoc Networks are usually completely distributed networks, WSNs have a central control system, which is the base station. Therefore, most traffic in WSNs is sent from the sensor nodes to the base station, and vice versa. Only in a few cases will one node send information directly to another sensor node. In contrast, it is normal for end devices of an Ad Hoc Network to communicate with other devices in the network as part of their normal functionality.

The rest of this chapter is organized as follows: Section 1.2 discusses limitations and challenges in Wireless Sensor Networks. These limitations and challenges affect the performance of any application intended to run on WSNs, especially data aggregation applications. Section 1.3 provides the motivation for this thesis and highlights the importance of secure data aggregation. Then, the research objectives and contributions are stated in Section 1.4. Finally, the thesis structure is detailed in Section 1.5.

1.2 Challenges in Wireless Sensor Networks

As discussed above, WSNs have unique specifications and constraints as compared with Ad Hoc Networks, which makes the simple adaptation of existing solutions designed for traditional networks impractical. Thus, the unique specifications of WSNs must be understood so that any new idea can be adapted to them and made feasible in real-world WSNs [59]. These unique specifications and constraints are named challenges in the rest of this section, and are classified into: (i) challenges in the end device (the sensor node), and (ii) challenges in the wireless sensor network, as follows [69, 107, 124, 129]:

1.2.1 Challenges in the End Device

All security approaches require a certain amount of resources for their implementation, including data memory, code space, and energy to power the sensor while the approach runs. However, these resources are currently very limited in a tiny wireless sensor node. Table 1.1 lists the hardware specifications for three types of sensor node, namely MICA2 [30], FLECK [32], and MICAZ [31], and highlights the resource constraints in the end device of WSNs. We refer interested readers to the mini hardware survey done by Tatiana Bokareva for more information about the hardware specifications of more types of sensor node [10].

Table 1.1: Hardware's specifications for three types of sensor nodes

Specifications             MICA2 [30]     FLECK [32]     MICAZ [31]
Processor                  Atmega 128L    Atmega 128L    Atmega 128L
Memory        RAM          4 KB           4 KB           4 KB
              ROM          128 KB         512 KB         128 KB
              EPROM        512 KB         1 MB           512 KB
Power Supply               2AA            3AA & ISB      2AA
Radio         Data Rate    38.4 kbps      72 kbps        250 kbps
              RR           152 m          500 m          75 m
              RF           868/916 MHz    913 MHz        2.4 GHz
Current Draw  Transmit*    27 mA          5 mA           N/A
              Receive      10 mA          N/A            19.7 mA
              Sleep        < 1 µA         30 µA          1 µA

* Transmit with maximum power. RR: Radio Range. ISB: Integrated Solar Board. RF: Radio Frequency. N/A: Not Available.

The challenges in the sensor's hardware are discussed as follows:

• Limited Memory: A sensor node is a tiny device with only a small amount of memory and storage space for the code. In order to build an effective security mechanism, it is necessary to limit the code size of the security algorithm.
For example, one common sensor type (MICA2) has 4 KB RAM, 128 KB program memory, and 512 KB flash storage [30]. The total code space of TinyOS, the de-facto standard operating system for wireless sensors, is approximately 4 KB [57], and the core scheduler occupies only 178 bytes. With such a limitation, the code size for the proposed solution must be small.

• Limited Energy Resource: The energy resource is the biggest challenge in WSNs. It is assumed that once sensor nodes are deployed in a WSN, their batteries cannot be easily replaced because of the high cost of servicing nodes deployed in remote areas. This will be discussed in Section 1.2.2. Some current versions of sensor nodes such as MICA2 are powered by two AA batteries, as shown in Table 1.1. Therefore, the battery charge taken with them to the field must be conserved to prolong the life of the individual sensor node and the entire sensor network. For example, when implementing a cryptographic function or protocol in a sensor node, the energy impact of the proposed solution should be considered.

• Limited CPU Performance: The CPU used in MICA2 sensors, for example, is the 8-bit Atmel ATmega128L microcontroller [30], as listed in Table 1.1. Embedded processors are generally not as powerful as those in nodes of a wired network. As such, complex cryptographic algorithms should be avoided in WSNs.

• Tamper-Resistant Hardware: The most obvious tamper-resistance strategies are hardware-based ones, which involve extra cost to implement special complex hardware circuits in the electronic device. Running these circuits also requires extra energy. Due to the targeted low cost and the limited power resource of sensor nodes, hardware-based tamper protection solutions are very limited [126].

1.2.2 Challenges in the Network

Sensor nodes are usually scattered randomly in the field to perform certain tasks. There is usually no infrastructure support for sensor networks. Sensor nodes self-organize to form a network. However, some network challenges exist. These challenges are discussed as follows:

• Hostile & Remote Environment: Depending on the function of a particular sensor network, the sensor nodes may be left unattended for long periods of time.
Most WSNs are deployed in remote or hostile environments such as battlefields. Therefore, sensor nodes without tamper-resistant hardware cannot be protected from physical attacks, since the deployment area is accessible to anyone. An adversary could capture a sensor node or even introduce his own malicious nodes inside the network.

• Random Topology: A WSN is often deployed in a random distribution since it is mostly used in remote or hostile environments. Consequently, there is no chance to know its topology beforehand. Also, the topology keeps changing after deployment because some sensors disappear due to drained resources, damage, or faults.

• Latency: The communication range of most sensor nodes is limited in order to conserve energy. According to Table 1.1, the MICA2, FLECK, and MICAZ sensor nodes have a radio range of up to 152 m, 500 m, and 75 m, respectively. To move a packet from one end of the network to another, a multi-hop routing approach is needed. In a congested wireless sensor network, multi-hop routing and node processing can lead to great latency in the network, which makes synchronization among sensor nodes difficult. Synchronization issues can be critical to sensor security where the security mechanism relies on critical event reports and cryptographic key distribution.

• Unreliable Communication: This challenge is inherited from Ad Hoc Networks, since end devices in both WSNs and Ad Hoc Networks communicate with each other wirelessly. Packets may get damaged due to channel errors or lack of radio coverage, or may be dropped at highly congested nodes.

1.3 Data Aggregation and Security Challenges

In many WSN applications, a physical phenomenon is sensed by sensor nodes and then reported to the base station. To reduce the communication energy expenditure of sensor nodes, these applications should minimize the number of packets traveling across the network by eliminating redundant data. Thus, these applications may employ in-network aggregation before the raw data reaches the base station. Typically, there are three types of nodes in WSN applications where in-network aggregation is implemented: (i) normal sensor nodes, (ii) aggregators, and (iii) a querier (or queriers). The aggregators are intermediate nodes that collect raw data from downstream sensor nodes, process the data, and apply a suitable aggregation function. Then they transmit the processed data to an upper aggregator or to the querier who generated the query. The querier processes the received sensor data and derives meaningful information reflecting the events in the target field. It can be the base station or sometimes an external user who has permission to interact with the network, depending on the network architecture.

Let us consider the example depicted in Figure 1.2. The network topology contains 16 sensor nodes and uses the sum (SUM) as the aggregation function.
Nodes N1, N2, ..., N8 are normal sensor nodes that sense specific physical phenomena and report them back to upper nodes. Nodes N9, N10, ..., N16 are aggregators that perform both sensing and aggregation activities. To answer a single aggregation query sent by the base station, every normal sensor node (nodes N1-N8) individually reports its sensed reading to the aggregators (nodes N9-N13). These aggregators add their own sensed readings to the received raw data by applying the SUM aggregation function. Subsequently, they send the processed information to the upper aggregators (nodes N14-N15), which do the same. At node N16, only one packet is sent to the base station as an answer to its query. Thus, the total number of packets transmitted across the network is only 16.

If in-network aggregation is not implemented in the example given in Figure 1.2, every node responds to the received query and reports its sensed information individually. Thus, a total of 50 packets would travel across the network in order to deliver 16 packets to the base station. These 16 packets are the nodes' responses to the base station's query.
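The SUM example can be made concrete with a minimal Python sketch. The readings are the r_i values shown in Figure 1.2, and the links N7, N8 to N9, N9 and N10 to N15, and N14 and N15 to N16 follow the aggregation formulas printed in that figure. The remaining parent-child links are assumptions made only for illustration, because the figure's edges are not recoverable from the text; the names `readings`, `children`, `aggregate`, and `packets_with_aggregation` are likewise hypothetical.

```python
# Readings r_i as shown in Figure 1.2 (node index -> sensed value).
readings = {1: 1, 2: 4, 3: 7, 4: 6, 5: 4, 6: 1, 7: 0, 8: 2,
            9: 7, 10: 7, 11: 2, 12: 3, 13: 0, 14: 9, 15: 3, 16: 2}

# children[i] lists the nodes that report to aggregator i.
children = {
    16: [14, 15],      # from the figure: A@16 = r16 + A@14 + A@15
    15: [9, 10],       # from the figure: A@15 = r15 + A@9 + A@10
    9: [7, 8],         # from the figure: A@9 = r7 + r8 + r9
    14: [11, 12, 13],  # assumed
    10: [5, 6],        # assumed
    11: [4],           # assumed
    12: [3],           # assumed
    13: [1, 2],        # assumed
}


def aggregate(node):
    """Return A@node: the node's own reading plus its children's aggregates."""
    return readings[node] + sum(aggregate(c) for c in children.get(node, []))


def packets_with_aggregation():
    """With in-network aggregation every node sends exactly one packet upward."""
    return len(readings)


if __name__ == "__main__":
    print("A@9  =", aggregate(9))
    print("A@15 =", aggregate(15))
    print("A@16 =", aggregate(16))
    print("packets with aggregation =", packets_with_aggregation())
```

Under these assumptions the sketch reproduces the three aggregation results printed in Figure 1.2, and the value at N16 equals the sum of all 16 readings regardless of how the lower nodes are attached. The packet count without aggregation is deliberately not computed here, since it depends on the exact tree.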

Figure 1.2: An aggregation scenario using the SUM aggregation function. Legend: Ni represents node i, ri represents the reading from node i, and A@i represents the aggregation result at node i. Nodes are drawn as aggregators, normal sensors, or the base station. Readings: r1 = 1, r2 = 4, r3 = 7, r4 = 6, r5 = 4, r6 = 1, r7 = 0, r8 = 2, r9 = 7, r10 = 7, r11 = 2, r12 = 3, r13 = 0, r14 = 9, r15 = 3, r16 = 2. Aggregation results shown: A@9 = r7 + r8 + r9 = 9; A@15 = r15 + A@9 + A@10 = 24; A@16 = r16 + A@14 + A@15 = 58.

Previous studies [72, 96, 128] show that data transmission consumes much more energy than computation. As illustrated in the two examples given above, data aggregation can greatly help to reduce this consumption by eliminating redundant data. This in turn helps prolong the network lifetime. Most existing schemes for data aggregation are under the threat of various types of attacks [128]. Among them, node compromise is usually considered one of the most challenging issues in the security of WSNs [8, 54, 69, 95, 107, 135]. In a node compromise attack, an adversary tries to physically tamper with a node in order to extract the cryptographic secrets. This attack can be very harmful depending on the security architecture of the network. For example, when an aggregator node is compromised, it is easy for the adversary to change the aggregation result and inject false data into the WSN. This is why secure data aggregation is needed.

1.4 Research Objectives

According to the discussion in Section 1.3, the node compromise attack is among the most challenging security threats. Simple adaptation of security solutions designed for wired and Ad Hoc networks is impractical due to the unique characteristics of WSNs, as discussed in Section 1.2. Two main directions exist to circumvent this important threat [36].
The first one involves improving the tamper-resistance of the nodes in order to increase the effort required of the attacker. However, tamper-resistant mechanisms are costly for small sensor nodes and are therefore usually not present on these devices. The second alternative adopts a reputation-based approach, which monitors the network activities and tries to detect events related to node compromise. It assumes that a node capture will provoke some noticeable events, such as inconsistent sensing or aggregation results, a displacement or removal of a node, and malicious routing

activities [71].

The objective of this thesis is to address the security issues of data aggregation in wireless sensor networks, and to study the strengths and weaknesses of both the cryptographic-based and reputation-based secure data aggregation schemes found in the literature. Our goal is to design a robust secure data aggregation scheme that minimizes the use of heavy cryptographic mechanisms, defends against most security attacks, and securely computes the aggregation. Our research contributions in this thesis are summarized as follows:

• Define the security for data aggregation in wireless sensor networks. The thesis takes a step further and stipulates the main components of a robust secure data aggregation scheme as follows:
  - Ability to provide fair approximations of the sensor readings even though a limited number of nodes are compromised.
  - Dynamic response to attack activities by rejecting incorrect aggregation results as soon as possible, possibly by nodes in the neighborhood, not at the base station level.
  These properties should work together to provide accurate aggregation results securely without exhausting the network's limited resources. In contrast with existing secure data aggregation definitions, the proposed definition covers the unique characteristics that wireless sensor networks have.
• Analyze the relationship between security services and the adversarial model considered in existing secure data aggregation schemes, in order to provide a general framework of required security services. This framework helps identify the minimum security services that a secure data aggregation design should provide to defend against specific types of adversaries.
• Analyze both cryptographic-based and reputation-based secure data aggregation schemes. This analysis covers security services provided by these schemes and their robustness against security attacks.
It is believed that this analysis can help to identify the security level in these schemes. Surprisingly, most of the examined data aggregation schemes are vulnerable to selective forwarding attacks.
• Propose an efficient reputation-based secure data aggregation scheme that overcomes the weaknesses in other schemes found in the literature. The security advantages provided by this proposal are realized by integrating aggregation functionalities with: (i) a reputation system, (ii) an estimation theory, and (iii) a change point detection mechanism. The significance of the proposal is two-fold: (i) it mitigates the effect of On-Off attacks on aggregation results, and (ii) it distinguishes between an abrupt change and a temporary departure in heterogeneous environments. The proposal is tested in different scenarios to validate its performance. The experimental results showed that the proposal is able to detect On-Off attacks as long as the attack frequency is

smaller than the buffer window size. The results showed that the proposal follows the reputation-based estimate behavior during the On-Off attack, but reacts better once the attack is over. The proposal re-initializes the estimator as soon as the end of the On-Off attack has been recognized. This ensures a quick convergence afterwards with the reputation-based aggregation results. To the best of our knowledge, this proposal is the only secure data aggregation scheme in the literature that is able to mitigate the On-Off attack.
• Propose a secure key management protocol in order to distribute essential pairwise and group keys among the sensor nodes. The protocol also helps to revoke misbehaved nodes and isolate them from the network. Importantly, the proposal provides backward and forward secrecy, which are not provided by similar schemes such as Nilsson et al.'s scheme [88]. The design idea of the proposed scheme is the combination of Lamport's reverse hash chain and the usual hash chain to provide both past and future key secrecy. The proposal avoids delivering the whole value of a new group key for a group key update; instead, only half of the value is transmitted from the base station to the sensor nodes. The performance analysis result shows that a sensor node in the proposal consumes approximately µJ and µJ in order to update the pairwise key and the group key, respectively. This energy consumption includes the communication cost and the computation cost. The proposal's energy consumption for the pairwise key update protocol is µJ more than Nilsson et al.'s scheme. This difference is due to the security enhancements that are required to overcome the weaknesses in Nilsson et al.'s scheme, as will be discussed in Section 6.2. To update the group key, the proposal consumes µJ more energy than Nilsson et al.'s scheme.
These additional costs result from defeating the Sandwich attack and overcoming the weaknesses of Nilsson et al.'s scheme.

1.5 Outline

The organization of the thesis is as follows:

Chapter 2: This chapter is about cryptographic-based secure data aggregation. We first give introductory information about secure data aggregation in WSNs, which defines the data aggregation security considering the unique characteristics of WSNs. Then, we highlight the security requirements for data aggregation in WSNs, since the thesis is centered on providing security to data aggregation applications. We also discuss the security attacks against cryptographic-based secure data aggregation schemes. Then, we survey, in detail, some of the current secure data aggregation schemes and classify them into two models: (i) the single aggregator model, and (ii) the multiple aggregator model. We also undertake security and performance analyses of current cryptographic-based secure data aggregation schemes. The security analysis covers the security services the current schemes provide and their robustness against the security attacks discussed in this thesis. The performance analysis covers the number of bits transmitted in order to accomplish the aggregation phase in some selected schemes.

The contents of this chapter have appeared in the following publications:
• Hani Alzaid, Ernest Foo, and Juan Manuel González Nieto. Secure Data Aggregation in Wireless Sensor Networks: A Survey. In Proceedings of the 6th Australasian Information Security Conference: Conferences in Research and Practice in Information Technology, AISC 08, Wollongong, NSW, Australia, January 2008, pages , Australian Computer Society Inc.,
• Hani Alzaid, Ernest Foo, Juan Manuel González Nieto, and DongGook Park. Secure Data Aggregation in Wireless Sensor Networks: A Comprehensive Review. International Journal of Communication Networks and Distributed Systems (IJCNDS), Invited Article, In press, InderScience Publishers.
• Hani Alzaid, Ernest Foo, Juan Manuel González Nieto, and DongGook Park. Secure Data Aggregation in Wireless Sensor Networks. In Anna Foerster and Alexander Foerster, editors, Emerging Communications for Wireless Sensor Networks, chapter 10, pages , InTech, Croatia

Chapter 3: This chapter investigates the use of reputation-based systems to provide trust among sensors in WSNs. We first discuss security attacks against reputation-based trust systems. Then, we present a comprehensive survey of the state-of-the-art in reputation-based trust systems for WSNs and classify these systems into five categories: (i) generic, (ii) localization, (iii) mobility, (iv) routing, and (v) aggregation. Finally, we compare these reputation-based trust systems in detail. The comparison includes: (i) investigating the visibility of the main components of the reputation systems, and (ii) studying the appearance of attacks, which are related either to WSNs or to reputation systems, in existing reputation-based systems.

The contents of this chapter have appeared in the following publication:
• Hani Alzaid. Reputation-based Trust Systems in Wireless Sensor Networks.
In Al-Sakib Khan Pathan, editor, Security of Self-Organizing Networks: MANET, WSN, WMN, VANET, chapter 20, pages , Auerbach Publications, CRC Press, Taylor & Francis Group, USA

Chapter 4: In this chapter, we propose a Reputation-based Secure Data Aggregation (RSDA) for wireless sensor networks. RSDA minimizes the use of heavy cryptographic mechanisms, and integrates the aggregation functionalities with the advantages that are provided by a reputation system in order to enhance the network lifetime and the accuracy of the aggregated data. The chapter also discusses performance and security analyses of RSDA. In the performance analysis, RSDA is tested in three scenarios, depending on the adversary's capability to affect the aggregation results, as follows: (i) no attack on the data, (ii) abrupt change, and (iii) 1-per-2 strategy-based On-Off attacks. The security analysis of RSDA follows the same methodology used in Chapters 2 and 3.

The contents of this chapter have appeared in the following publication:
• Hani Alzaid, Ernest Foo, and Juan Manuel González Nieto. RSDA: Reputation-based Secure Data Aggregation in Wireless Sensor Networks. In Proceedings of the 9th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 08, Dunedin, New Zealand, 1-4 December 2008, pages , IEEE Computer Society,

Chapter 5: This chapter focuses on investigating the ability to mitigate the On-Off attack, where the adversary aims to disrupt the system's overall performance without being detected or excluded from the network. The proposal in this chapter extends RSDA, the contribution of Chapter 4, by adding an estimation theory and a change point detection mechanism. Extensive simulations show that this addition helps defend against On-Off attacks and enhances the data accuracy in the aggregation results. We first provide a brief overview of some techniques used in the proposal, namely the estimation theory and the change detection mechanism. Then, we explain the damage caused by the On-Off attack on RSDA. Finally, we discuss the proposed solution in detail. The solution is tested in four scenarios, depending on the adversary's capability to affect the aggregation results, as follows: (i) no attack on the data, (ii) abrupt and incipient change, (iii) 1-per-2 strategy-based On-Off attacks, and (iv) 1-per-3 strategy-based On-Off attacks.

The contents of this chapter have appeared in the following publication:
• Hani Alzaid, Ernest Foo, Juan Manuel González Nieto, and Ejaz Ahmed. Mitigating the On-Off Attacks in Reputation-based Secure Data Aggregation for Wireless Sensor Networks. Security and Communication Networks, In press.

Chapter 6: This chapter proposes a secure key management scheme which helps distribute and renew pairwise and group (cell) keys to sensor nodes.
It also helps to revoke misbehaved nodes and isolate them from the network. The design idea of the proposed scheme is the combination of Lamport's reverse hash chain and the usual hash chain to provide both past and future key secrecy. We first define the term future & past secrecy and then use it instead of the similar terminology forward & backward secrecy, which has always been quite confusing. Then, we discuss the motivation behind the proposal by analyzing the security strengths and weaknesses of current key management schemes. We then present two variants of the proposed key management scheme. Finally, a performance analysis of these two variants is discussed. This analysis covers: (i) memory overhead, (ii) communication cost, and (iii) computation cost.

The contents of this chapter have appeared in the following publications:
• Hani Alzaid, DongGook Park, Juan Manuel González Nieto, Colin Boyd, and Ernest Foo. A Forward & Backward Secure Key Management in Wireless Sensor Networks for PCS/SCADA. In Proceedings of the 1st International ICST Conference on Sensor Systems and Software, S-CUBE 09, Grand Hotel Duomo of Pisa, Pisa, 7-9 September 2009, pages 66-82, Springer, 2010.

• Hani Alzaid, DongGook Park, Juan Manuel González Nieto, and Ernest Foo. Mitigating Sandwich Attacks against a Secure Key Management Scheme in Wireless Sensor Networks for PCS/SCADA. In Proceedings of the 24th IEEE International Conference on Advanced Information Networking and Applications, AINA 10, Perth, Australia, April 2010, pages , IEEE Computer Society,
• Hani Alzaid, DongGook Park, Juan Manuel González Nieto, Colin Boyd, and Ernest Foo. A Forward & Backward Secure Key Management in Wireless Sensor Networks for PCS/SCADA. In Raúl Aquino Santos, Arthur Edwards, and Victor Rangel Licea, editors, Emerging Technologies in Wireless Ad Hoc Networks: Applications and Future Development, chapter 3, pages 41-60, IGI Global, USA

Chapter 7: Finally, the thesis contributions are summarized in this chapter. Several open problems and possible research directions are also discussed.
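The design idea summarized for Chapter 6, combining a usual (forward) hash chain with a Lamport-style reverse hash chain, can be illustrated with the sketch below. It is not the scheme proposed in the thesis: the way the two chain values are mixed into an epoch key, the use of SHA-256, and all function names (`forward_chain`, `reverse_chain`, `epoch_key`) are assumptions made only to show why the two directions together give both past and future secrecy.

```python
import hashlib


def h(data: bytes) -> bytes:
    """One-way hash function (SHA-256 chosen here only for convenience)."""
    return hashlib.sha256(data).digest()


def forward_chain(seed: bytes, n: int) -> list:
    """Usual hash chain: f[i+1] = h(f[i]). Knowing f[i] gives all later values."""
    chain = [seed]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))
    return chain


def reverse_chain(seed: bytes, n: int) -> list:
    """Lamport-style reverse chain: r[i] = h(r[i+1]).

    The chain is generated from its last element and used front to back,
    so knowing r[i] gives all earlier values but not later ones.
    """
    chain = [seed]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))
    chain.reverse()
    return chain


def epoch_key(f_i: bytes, r_i: bytes) -> bytes:
    """ASSUMED combination: the key for epoch i depends on both chain values."""
    return h(f_i + r_i)


n = 5
fwd = forward_chain(b"forward-seed", n)
rev = reverse_chain(b"reverse-seed", n)
keys = [epoch_key(fwd[i], rev[i]) for i in range(n)]

# A node that joined at epoch 2 holds fwd[2]; it can roll forward to fwd[3]
# and, once rev[3] is released, derive the epoch-3 key on its own.
derived = epoch_key(h(fwd[2]), rev[3])
```

In this sketch, a node that joined at epoch 2 cannot compute `fwd[1]` from `fwd[2]`, so earlier keys stay secret from it, and a node revoked after epoch 3 cannot compute `rev[4]` from `rev[3]`, so later keys stay secret as well.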

Chapter 2

Secure Data Aggregation in Wireless Sensor Networks

Studies by Wagner [128] and Krishnamachari et al. [72] showed that data transmission consumes much more energy than computation. Data transmission accounts for 70% of the energy cost of computation and communication for the SNEP protocol [96]. Data aggregation can significantly reduce this consumption by eliminating redundant data. However, aggregators are vulnerable to attacks such as node compromise attacks, especially if they are not equipped with tamper-resistant hardware. When an aggregator node is compromised, it is easy for the adversary to change the aggregation result and inject false data into the WSN. Due to the unique characteristics of WSNs discussed in Chapter 1, devising security protocols for WSNs is complicated and may not be successfully accomplished by the simple adaptation of security solutions designed for wired networks. The security mechanisms used in other network environments are not appropriate for WSN domains, since they are typically based on public key cryptography, which is too expensive for sensor nodes.

There are two approaches to circumvent the node compromise threat. The first one, which is the focus of this chapter, involves increasing the effort the adversary needs to succeed in launching the attack. This can be done by employing cryptographic-based techniques. For example, the Merkle hash tree is used in Przydatek et al.'s scheme in order to facilitate the verification process at the querier and ensure the correctness of the aggregation results (more details are given in Section 2.3). The second alternative mitigates node compromise attacks by adopting a reputation-based scheme to monitor the network activities and detect events related to node compromise. A detailed discussion of the second approach is presented in Chapter 3.

Our contributions in this chapter are four-fold:
• Define the security for data aggregation in WSNs.
In contrast with existing secure data aggregation definitions, the proposed definition covers the unique characteristics that

WSNs have.
• Present a survey of the state-of-the-art in secure data aggregation schemes. These schemes are then classified into two groups according to the number of aggregator nodes, and whether the verification phase of the aggregation result is considered or not.
• Explore the relation between the security services and the adversarial model considered in existing secure data aggregation schemes, in order to derive a possible general framework. This framework helps identify the minimum security services that a secure data aggregation design should provide to defend against a specific type of adversary.
• Evaluate current cryptographic-based secure data aggregation schemes. The evaluation is composed of: (i) security analysis, and (ii) performance analysis. The security analysis covers the robustness against security attacks discussed in this chapter, and the security services provided. The performance analysis focuses on calculating the number of bits transmitted within the network, in order to show which secure data aggregation scheme is more energy hungry and sends more information to accomplish the scheme objectives.

The rest of the chapter is organized as follows: Section 2.1 gives introductory information about secure data aggregation in WSNs. Section 2.2 lists security concerns in data aggregation, and highlights different capabilities that an adversary may have against a secure data aggregation scheme. Section 2.3 surveys, in detail, some of the current cryptographic-based secure data aggregation schemes and classifies them into two models: (i) the single aggregator model, and (ii) the multiple aggregator model. Then, a security analysis of these schemes is discussed in Section 2.4. The analysis covers the security services these schemes provide, and their robustness against the security attacks mentioned above. Section 2.5 discusses the performance analysis of some of these schemes.
Finally, the last section concludes the chapter.

2.1 Secure Data Aggregation in Wireless Sensor Networks

The motivation behind secure data aggregation in WSNs is explained in Section 1.3. Unfortunately, the design principles for secure data aggregation schemes are poorly understood. There is no clear definition of what secure data aggregation should mean, what security requirements a scheme should have, and what adversary capability a scheme should defend against. Existing schemes might have one or more of the security requirements, depending on how secure data aggregation has been addressed, and the strength of the expected adversary. For example, secure data aggregation has been addressed in Przydatek et al.'s scheme from the point of view of detecting forged data aggregation values [99]. This does not cover security issues such as how to elect aggregators, rotate aggregation functionality between nodes, or how to set up trust between aggregators and sensor nodes. Also, some schemes provide more security requirements than others, as discussed in Section 2.4, or send more bits than others, as discussed in Section 2.5. Generally speaking, there is no common ground that allows for a complete comparison between different aggregation schemes.

Secure data aggregation is defined as the efficient delivery of the summary of sensor readings that are reported to an off-site user in such a way that ensures these reported readings have not been altered [21, 99]. This definition considers WSN applications where the querier is located outside the deployment area and a base station acts as an aggregator. Shi and Perrig [115] highlight error sources that affect the aggregated data, and define secure data aggregation as the process of obtaining a relative estimate of the sensor readings with the ability to detect and reject reported data that is significantly distorted by corrupted nodes or injected by malicious nodes. However, rejecting reported data injected by malicious nodes consumes the network resources, specifically the nodes' batteries. The malicious packet will be processed by intermediate nodes until it reaches the verifier, which is normally the base station. The damage caused by malicious or compromised nodes should be reduced by adding a self-healing property to the network. This property helps the network to learn how to handle new threats through extensive monitoring of network activities. Therefore, we take a step further and stipulate the main components of a robust secure data aggregation scheme as follows:
• Ability to provide fair approximations of the sensor readings even though a limited number of nodes are compromised.
• Dynamic response to attack activities by rejecting incorrect aggregation results as soon as possible, possibly by nodes in the neighborhood, not at the base station level.
These properties should work together to provide accurate aggregation results securely without exhausting the network's limited resources.

Security Requirements for Data Aggregation Security

Since WSNs share some properties with traditional wireless networks, data security requirements in WSNs are similar to those in traditional networks [96, 115]. This section discusses security requirements for strengthening attack-resistant data aggregation schemes for WSNs. These security requirements are as follows:
• Data Confidentiality: ensures that information content is never revealed to unauthorized parties. In WSN applications where in-network aggregation is required, data confidentiality can be implemented in two ways: (i) on a hop-by-hop basis and (ii) on an end-to-end basis. On the hop-by-hop basis, any aggregator node needs to decrypt the received encrypted data, apply an aggregation function, encrypt the aggregated data, and send it to an upper aggregator point. This kind of confidentiality implementation requires extra computation, which leads to more delays in the network and increases the energy consumption. It also facilitates the adversary's mission. For example, the secrecy of sensed data is disclosed once any intermediate node is compromised. On the end-to-end basis, an aggregator does not need to decrypt and re-encrypt received data; it instead applies aggregation functions directly on encrypted data by using techniques such as homomorphic encryption [131]. End-to-end confidentiality greatly reduces energy consumption since there is no need for decryption and encryption of the received encrypted data at intermediate nodes.

• Data Integrity: ensures that a message has not been altered, either maliciously or accidentally, in transit. Even if the network provides data confidentiality, there is still a possibility that data integrity can be affected. In certain applications, data confidentiality is not as important as data integrity. It is sometimes acceptable for an adversary to eavesdrop and learn about aggregation results, but not to change them. Suppose a secure data aggregation scheme provides only data confidentiality in order to defend against an adversary that is capable of compromising an aggregator node. The adversary could then alter the aggregation result and mislead the base station. Moreover, even without the existence of an adversary, data might be damaged or corrupted due to the nature of the wireless environment.
• Data Freshness: ensures that the data are recent and no old messages have been replayed, thereby protecting data aggregation schemes against replay attacks. In this kind of attack, it is not enough that these schemes provide only data confidentiality and data integrity, because an adversary able to intercept even encrypted messages could later replay them to disrupt the data aggregation results. This requirement is important in real time applications or key management schemes. For example, an adversary could replay an old distributed shared key and mislead a sensor concerning the current cryptographic key used to secure sensing information or aggregation results.
• Data Availability: ensures that the network is alive and data are accessible. In the presence of malicious nodes, it is highly recommended that the network react to these bad (compromised) nodes and eliminate them. Once an adversary gets into the network by compromising some legitimate nodes, the adversary can affect network services, especially in those parts of the network where the attack was launched.
It is preferable that a secure data aggregation scheme contains the following mechanism to ensure a reasonable level of data availability in the network:
  - Self-healing: which can diagnose and react to an adversary's activities, especially when some legitimate nodes are compromised, and then start corrective actions based on defined policies to recover the network or isolate the compromised nodes.
  The reason for adding cryptographic mechanisms is to protect WSNs from adversaries whose goals may include decreasing WSN lifetime. However, adding these cryptographic mechanisms comes at a cost. Thus, these mechanisms should be carefully implemented to fit the characteristics of WSNs.
• Authentication: allows a receiver to verify whether a message is sent by the claimed sender or not. An adversary would not be able to participate and inject data into the network without valid authentication keys. If entity authentication is not implemented, an adversary could impersonate other nodes and get access to sensitive data. In the aggregation context, without entity authentication, an adversary could masquerade as an aggregator and claim to a querier that an aggregation result is $x'$ instead of the correct value $x$.
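The integrity, freshness, and authentication requirements listed above are often met together by attaching a message authentication code (MAC) that also covers a per-sender counter. The following sketch shows the idea under stated assumptions: each sender shares a secret key with the receiver (key distribution is out of scope), HMAC-SHA256 is used only because it is available in the Python standard library (real sensor nodes typically use lighter primitives), and the packet layout and the names `make_packet` and `Receiver` are hypothetical.

```python
import hashlib
import hmac
import struct

HEADER = struct.Struct(">HI")   # sender id (2 bytes), counter (4 bytes)
TAG_LEN = 32                    # length of an HMAC-SHA256 tag


def make_packet(key: bytes, sender_id: int, counter: int, payload: bytes) -> bytes:
    """Build header || payload || MAC, where the MAC covers header and payload."""
    header = HEADER.pack(sender_id, counter)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


class Receiver:
    def __init__(self, keys: dict):
        self.keys = keys            # sender id -> shared secret key
        self.last_counter = {}      # sender id -> highest counter accepted

    def accept(self, packet: bytes):
        """Return the payload if the packet is authentic and fresh, else None."""
        if len(packet) < HEADER.size + TAG_LEN:
            return None
        header = packet[:HEADER.size]
        payload = packet[HEADER.size:-TAG_LEN]
        tag = packet[-TAG_LEN:]
        sender_id, counter = HEADER.unpack(header)
        key = self.keys.get(sender_id)
        if key is None:
            return None             # unknown sender: authentication fails
        expected = hmac.new(key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None             # altered or forged: integrity fails
        if counter <= self.last_counter.get(sender_id, -1):
            return None             # old counter: replayed message
        self.last_counter[sender_id] = counter
        return payload


receiver = Receiver({9: b"key-shared-with-node-9"})
packet = make_packet(b"key-shared-with-node-9", 9, 0, b"reading=7")
first = receiver.accept(packet)     # accepted
replayed = receiver.accept(packet)  # rejected as a replay
```

Note that a MAC alone does not help once a node's key has been extracted, which is exactly the node compromise case the thesis focuses on.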

2.2 The Expected Adversarial Model and Security Concerns

WSNs are vulnerable to different types of attack. The damage caused by these attacks varies from one scheme to another according to the adversarial model. One of the potential vulnerabilities in WSNs results from compromising its sensor nodes, given the lack of tamper-resistant packaging [54, 135]. An adversary could gain control of one or more sensor nodes and readily access sensitive information. It is usually assumed that node capture is easy in WSNs due to a lack of physical restrictions that help control access to the deployment area in outdoor environments [8]. This attack is referred to as the supervision attack and sometimes the physical attack. Considering the data aggregation scenario, once a node has been taken over, all the secret information stored on it can be extracted and the adversary can then participate in aggregation activities. Even worse, the adversary may also inject their own commodity nodes into the network by fooling nodes into believing that these commodity nodes are legitimate members of the network, especially if there is no proper authentication scheme in place. A simulation study showed that network operation and maintenance can be easily jeopardized and network performance will severely degrade once a single node starts misbehaving [80]. The purpose of this section is to highlight different capabilities that an adversary may have against a secure data aggregation scheme. Before we classify expected adversaries, possible security attacks related to WSNs are discussed in the following section.

Security Attacks

This subsection studies how attacks related to WSNs (WSN attacks) can affect any proposal to secure data aggregation in WSNs.
WSN attacks are discussed as follows:

Sybil Attack (SY)

The Sybil attack is a type of attack where the adversary is able to present more than one identity (node) within the network to deceive other nodes [39]. It has also been defined as a malicious device illegitimately taking on multiple identities [87]. A node that wishes to conduct the SY attack can affect an aggregation scheme in different ways: it can (i) create multiple identities to generate additional votes in the aggregator election phase to make a malicious node an aggregator instead of legitimate nodes, (ii) generate multiple entries to an aggregation function with different incorrect readings, or (iii) create multiple identities to affect reputation values of legitimate nodes in reputation-based applications by falsely degrading legitimate node reputation values. Let us consider the example given in Figure 2.1, where an adversary creates fake IDs in order to affect the overall performance of the network. Figure 2.1-A shows a sketch of the normal scenario without any adversary. The real path starts from node A(D) and ends at

node D(A). Nodes B and C are adjacent neighbors. A simple form of the SY attack occurs when an adversary has the ability to compromise some sensor nodes. Suppose that an adversary succeeded in compromising node B and then manipulating the route discovery messages within the routing activities. Thus, the adversary can add another node to the network, which is node B′ in Figure 2.1-B. Now, the adversary can communicate with node A using one of these two identities and with node C using the other. It can perform malicious activities in the network, blame one of the two identities (node B or node B′) for those activities, and leave the reputation value of the other untouched.

[Figure 2.1: Sybil Attack. Panels: A. Normal Scenario; B. Modified Scenario. Legend: adversary, compromised sensor, genuine sensor.]

Selective Forwarding Attack (SF)

It is sometimes assumed that each node will accurately forward received messages. However, a compromised node may refuse to do so. It is up to the adversary controlling the compromised node whether to forward received messages or not [67]. In other words, the adversary decides which messages stop propagating at the compromised node. Once the adversary has succeeded in launching an SF attack, it can affect the propagation of reputation information, such as direct observations, across the network. Note that SF attacks are most effective when the attacking nodes are included in the path of the data flow. Figure 2.2 depicts a simplified scenario of an SF attack. The scenario follows the single aggregator model [6], where node A acts as an aggregator. In Figure 2.2-A, an adversary succeeded in compromising node B but behaved well and forwarded the request message sent by node A. Later on, node B, which is still under the adversary's control, drops the response from D as in Figure 2.2-B.
Since the aggregator has not received any reply for its recent request, node A updates its reputation table and reduces the reputation value of node D

as in Figure 2.2-B. Note that the reputation table does not usually contain any reputation information for the node that maintains the table. For example, the reputation table which is maintained by node A in Figure 2.2 does not have reputation information for the node itself (node A).

[Figure 2.2: Selective Forwarding Attack. Panels: A. Request Path; B. Reply Path. Legend: adversary, compromised sensor, genuine sensor.]

Replay Attack (RE)

Some WSN applications are vulnerable to replay attacks, where an adversary is able to eavesdrop on the traffic and replay old messages. Replay attacks are the easiest to launch, because the adversary does not need to physically capture a sensor node and get access to its internal memory, or analyze intercepted encrypted data. In the context of reputation-based applications, an adversary can record some reputation information, which has been exchanged wirelessly between sensor nodes, without even understanding its content, and then replay it (with no changes) to mislead other nodes and make their reputation tables out-dated. Figure 2.3 describes a simplified scenario of an RE attack in which the adversary has captured the reputation update message at a certain time $t_1$ (see Figure 2.3-A), and then re-injected it at time $t_2$, where $t_2 > t_1$ (see Figure 2.3-B). With no proper verification, nodes B, C, and D will accept this re-injection and end up being out-dated and thus potentially with incorrect reputation values.

Spoofed Data Attack (SD)

In this type of attack, an adversary alters intercepted data in order to inject false data into the network and affect the reputation values. This attack cannot be launched alone; the adversary needs to combine either an RE attack or a node compromise attack with an SD attack. In

[Figure 2.3: Replay Attack. Panels: A. Reputation Update at $t_1$; B. Reputation Update at $t_2$. Legend: adversary, compromised sensor, genuine sensor.]

[Figure 2.4: Spoofed Data Attack. Panels: A. Normal Scenario; B. Modified Scenario. Legend: adversary, compromised sensor, genuine sensor.]

the former, the adversary first eavesdrops on the traffic, captures some reputation information in an understandable format, modifies the captured information, and then reinjects it into the network. In the latter, the adversary first needs to take over a sensor node, and can then affect the reputation calculation by falsely claiming that its direct observation for node $N_i$ is $R'_i$ (instead of the correct $R_i$). $R'_i$ is then propagated to neighboring nodes, which are misled by the received indirect observation $R'_i$, and thus their calculations of the reputation value of $N_i$ are affected.

Figure 2.4 presents a simplified scenario of an SD attack once the adversary has succeeded in compromising node B. During the reputation update phase, the adversary in Figure 2.4-B claims that the reputation value for node A is $R'_A$, not $R_A$, and then sends it to neighboring nodes C and D. Therefore, nodes C and D will use $R'_A$ as an indirect observation for node A when they calculate the reputation value for node A.

Adversary Classification

Current cryptographic-based secure data aggregation schemes are threatened by adversaries with different capabilities. The following criteria are used to classify adversaries:

• The adversary can take over a sensor node. The adversary can then read and modify all the software code and configurations, including secret keys, installed in the sensor node. For example, once the adversary has succeeded in compromising a sensor node, the adversary can then alter any software installed in this node. In other words, adversaries can be passive or active. Passive adversaries take advantage of the broadcast nature of wireless communication and eavesdrop on the traffic to obtain any important information about the sensed data.
Active adversaries interact with WSNs by injecting packets, destroying or compromising nodes, extracting sensitive data, and stopping or delaying packets from being delivered to a querier. They can launch any type of attack listed in Section.

• The adversary has access to the whole network. As discussed in Section 1.3, there are three components in WSNs: sensor nodes, aggregators, and a base station, with different functionalities and capabilities. The adversary's ability to interact with these components is determined by its network access. Passive adversaries with total network access can listen to all communications between sensor nodes in the network, and active adversaries with total network access can interact maliciously with all components in WSNs (nodes, aggregators, base stations) by launching any attack listed in Section. However, this type of access is not common in most WSN applications. Moving from total network access to partial network access, passive adversaries can listen to communications between only a subset of nodes in the network, and active adversaries can interact with only a subset of nodes in the WSN.

According to the above two criteria, adversaries are divided into four distinct types, as shown in Figure 2.5. Type I is the weakest adversary: capable of eavesdropping on communications

in the parts of the network to which it has access, but not capable of interacting with the network. To the best of our knowledge, this type of adversary has never been considered in any secure data aggregation scheme. Type IV is the strongest. It refers to an active adversary that has total access to the network. This type of adversary is interested in affecting the data aggregation results by launching any attack listed in Section against any network component (nodes, aggregators, base stations).

[Figure 2.5: Classification of adversaries.]

We believe that this adversary classification can help in evaluating new schemes and in deciding which scheme is more suitable for specific conditions, as discussed in Section. In the following section, current cryptographic-based secure data aggregation schemes are discussed.

[Figure 2.6: A sketch of single and multiple aggregator models. Panels: A. Single Aggregator; B. Multiple Aggregator. Legend: aggregator, sensor, base station.]

2.3 Current Secure Data Aggregation Schemes

To the best of our knowledge, there have been four surveys in which current secure data aggregation schemes are compared. Setia et al. [112] discussed the security vulnerabilities of data

aggregation schemes and surveyed secure data aggregation schemes that are resilient to false data injection attacks. However, this survey covered only a few schemes. Sang et al. [109] classified secure data aggregation schemes into hop-by-hop encrypted data aggregation and end-to-end encrypted data aggregation. However, this classification details neither the security analysis nor the performance analysis of these schemes. In early 2008, we classified these schemes based on how many times the data is aggregated during its travel to the base station, and on whether these schemes have a verification phase or not [6]. This taxonomy also discussed performance and security analyses of these schemes. A year later, Ozdemir and Xiao [93] surveyed current work in the area of secure data aggregation and provided some details on the security services provided by each scheme. Their security analysis is similar to our published taxonomy.

This section follows the same methodology used in our previous taxonomy [6] and extends it by analyzing more secure data aggregation schemes. The security analysis covers the robustness against the security attacks discussed in this chapter, and the security services provided. The performance analysis focuses on calculating the number of bits transmitted within the network, in order to show which secure data aggregation scheme consumes more energy and sends more information to accomplish its objectives.

Current secure data aggregation schemes fall under either a single aggregator model or a multiple aggregator model. These models are discussed in the following subsections, and a sketch of the two can be found in Figure 2.6. Under each model, a secure data aggregation scheme either has a verification phase or does not, depending on the security primitives used to defend against the expected adversary capability.
In other words, the verification phase is used to validate the aggregation results (or the aggregator behavior) by using methods such as interactive protocols between the base station (or the querier) and normal sensor nodes. Figure 2.7 classifies secure aggregation schemes depending on the aggregation model they follow and whether they have a verification phase or not.

Single Aggregator Model

In this model, the aggregation process takes place once between the sensor nodes and the base station or the querier. All individually collected physical phenomena (PP), therefore, travel to only one aggregator point in the network before reaching the querier. This aggregator node should be powerful enough to handle the expected high computation and communication load. The main goal of data aggregation might not be fully achieved, since redundant data still travel in the network for a while until they reach the aggregator node, as shown in Figure 2.6-A. This model is useful when the network is small. Large networks, however, are unsuitable for this model, especially when data redundancy at lower levels is high. Examples of secure data aggregation schemes that follow the single aggregator model are: Du et al.'s scheme [40], Przydatek et al.'s scheme [99], Mahimkar & Rappaport's scheme [75], and Sanli et al.'s scheme [110], which are discussed in the following sections.

[Figure 2.7: Classification of current secure data aggregation schemes, by aggregation model (single or multiple aggregator) and by whether a verification phase is present. Schemes shown: Sanli et al., Du et al., Przydatek et al., Mahimkar & Rappaport, Westhoff et al., Castelluccia et al., Yang et al., Chan et al., Jadia & Mathuria, Hu & Evans, Frikken & Dougherty, Haghani et al.]

Du et al.'s Scheme

Du et al. [40] proposed a witness-based scheme, which enhances the assurance of aggregation results reported to the base station. Du et al. argued that selecting some nodes around the aggregator as witnesses to monitor the data aggregation results helps to assure the validity of those results. The leaf nodes report their sensing information to aggregator nodes. The aggregator then needs to perform the aggregation function and forward the aggregation results to the base station. In order to prove the validity of the aggregation results, the aggregator node has to provide proofs from several witnesses. A witness is a node around the aggregator that performs data aggregation like the aggregator node, but without forwarding its aggregation result to the base station. Instead, each witness computes the message authentication code (MAC) of the aggregation result and then sends it to the aggregator node. The aggregator must subsequently forward the proofs with its aggregation calculation to the base station.

Verification Phase

This scheme does not have a verification phase, since the base station can verify the correctness of the aggregation results without the need to interact with the network. Instead, the scheme designers rely on the proofs that are computed by the witnesses and coupled with the aggregation results.
Upon receiving the aggregation result with its proofs, the base station uses the n out of m + 1 voting strategy to determine the correctness of the aggregation results. In the n out of m + 1 strategy, m denotes the number of witness nodes for each aggregator node, and n denotes the minimum number of witnesses that should agree with the aggregation result provided by the aggregator. If fewer than n proofs agreed with

the aggregation result, the base station discards the result. Otherwise, the base station accepts the aggregation result.

Adversarial Model and Attack Resistance

Du et al. considered an adversary that can compromise the aggregator and some witnesses as well. Du et al., however, limited the adversary capability to compromising fewer than n witnesses for a single aggregator node. This type of adversary falls into the type III adversary, according to the discussion in Section. Once the adversary has succeeded in compromising an aggregator node, it can then decide whether to forward the aggregation result and the proofs or not. This is an example of the Selective Forwarding attack. Once it compromises an aggregator node, the adversary is also able to replay an old aggregation result with its valid proofs, instead of the current result, to mislead the base station. This is an example of the Replay attack. Moreover, the adversary can take over some leaf nodes and then present multiple identities to affect the aggregation results, which is one form of the Sybil attack. The scheme is vulnerable to Sybil attacks because the sensed PPs are not authenticated by the aggregator.

Security Services

The data aggregation security is provided by coupling the aggregation result with proofs from the witnesses around the aggregator node. These proofs, as discussed above, are MACs computed on the aggregation result to ensure its integrity and to authenticate the witnesses to the base station. Other security services, such as data confidentiality, data freshness, and data authentication for leaf nodes, were not considered by Du et al.

Discussion

The security primitive used in this scheme to defend against a type III adversary is the n out of m + 1 voting strategy. This strategy authenticates witnesses and aggregators to the base station but not leaf nodes.
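The base station's side of the n out of m + 1 voting strategy can be sketched as follows. This is a minimal illustration rather than Du et al.'s implementation: it assumes HMAC-SHA256 as the MAC and one pairwise key per witness shared with the base station, and the names `mac`, `base_station_accepts`, and `witness_keys` are hypothetical.

```python
import hashlib
import hmac


def mac(key, result):
    """MAC a witness computes over its own aggregation result (assumed HMAC-SHA256)."""
    return hmac.new(key, result, hashlib.sha256).digest()


def base_station_accepts(result, proofs, witness_keys, n):
    """Accept `result` only if at least n witness proofs agree with it.

    proofs:       witness id -> MAC forwarded by the aggregator
    witness_keys: witness id -> key shared with the base station (assumption)
    """
    agreeing = 0
    for witness_id, proof in proofs.items():
        key = witness_keys.get(witness_id)
        if key is None:
            continue  # proof from an unknown witness is ignored
        # A proof agrees when it equals the MAC of the reported result.
        if hmac.compare_digest(proof, mac(key, result)):
            agreeing += 1
    return agreeing >= n
```

With three witnesses of which two computed the same result as the aggregator, the check passes for n = 2 and fails for n = 3. Note that nothing in this check covers the readings of the leaf nodes.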
The leaf nodes, therefore, are appropriate targets for the adversary to launch the Node Compromise attack and then report invalid readings to aggregators. Moreover, resource utilization in this scheme is poor for three reasons:

• The aggregator needs to receive m additional proofs from the witnesses and then forward these extra proofs with its aggregation result to the base station.

• The number of times the aggregation takes place in the network is increased m times, because the aggregation function is repeated m times by the witnesses for each query.

• Finally, the aggregation result and the proofs travel unchecked all the way to the base station, because the verification process is done at the base station.

Przydatek et al.'s Scheme

Przydatek et al. [21, 99] proposed a secure information aggregation scheme which provides efficient sub-schemes for securely computing the median and the average of the measurements, estimating the network size, and finding the minimum and the maximum sensor readings. It

consists of three types of network components: (i) an off-site home server (or user), (ii) a base station (or aggregator), and (iii) a large number of sensors. The scheme designers claimed that their scheme is robust against stealthy attacks, where the attacker's goal is to make the user accept false aggregation results without revealing its presence. It is believed that a stealthy attack can be accomplished by using any type of attack discussed in Section. To achieve its goal, the scheme employs an aggregate-commit-prove approach, where the aggregator performs aggregation activities and then proves to the home server that it has computed the aggregation function correctly. In this approach, the aggregator helps with computing the aggregation results and then forwards them to the home server together with a commitment to the collected data. The home server and the aggregator then engage in interactive proofs, through which the home server is able to verify the correctness of the results.

From the proposed sub-schemes, we limit the discussion in this chapter to the minimum aggregation sub-scheme (MIN). Przydatek et al. proposed a secure MIN discovery sub-scheme that enables the home server to find the minimum of the reported values. They, however, restricted the adversary capability to not reporting values smaller than the real values. The sub-scheme works by first constructing a spanning tree such that the root of the tree holds the minimum element, as illustrated in Algorithm 2.1. The tree construction proceeds in iterations. Throughout the scheme, each sensor node $S_i$ maintains a tuple of state variables $(p_i, v_i, id_i)$, where $p_i$ denotes the ID of the current parent of $S_i$ in the tree being constructed, $v_i$ denotes the smallest value seen so far, and $id_i$ denotes the ID of the node whose value is equal to $v_i$. Each $S_i$ initializes its state variables with its own information, as in steps 1, 2, and 3 of Algorithm 2.1.
In each iteration, $S_i$ broadcasts $(v_i, id_i)$ to its neighbors. Let $(v'_i, id'_i)$ denote a message, sent by $S'$, with a smaller value picked by $S_i$. Then, $S_i$ updates its state by setting $p_i = S'$, $v_i = v'_i$, $id_i = id'_i$. The tree construction terminates after $d$ iterations, where $d$ is an upper bound on the diameter of the network. Upon constructing the tree, each node $S_i$ authenticates its final state $(p_i, v_i, id_i)$ using the key shared with the home server and then forwards it to the aggregator. The aggregator checks the consistency of the constructed tree with the values committed. If the check is successful, the aggregator commits to the list of all nodes and their states, finds the root of the constructed tree, and reports the root node to the home server. Otherwise, the aggregator reports the inconsistency. The commitment to the collected data is done using the Merkle hash tree [79] to ensure that the aggregator used the data provided by the sensors. For example, the aggregator constructs the Merkle hash tree over the sensor measurements $m_0, m_1, m_2, \ldots, m_7$ as in Figure 2.8, and then sends the root of the tree (called a commitment) to the home server.

Verification Phase

Upon receiving the aggregation results and the commitment to the collected data from the aggregator, the home server needs to verify the correctness of the reported data. The home server checks whether or not the committed data is a good representative of the true values in the sensor network. In other words, the home server uses an interactive proof with the aggregator to check whether the aggregator is trying to provide an invalid aggregation result. It randomly picks a node in the committed list, say $m_5$ in

Figure 2.8, and then traverses the path from the picked node to the root using the information provided by the aggregator. During the traversal, the home server checks the consistency of the constructed tree. If the checks are successful, then the home server accepts the aggregation result; otherwise, it rejects it. In other words, the aggregator sends the values of $v_{1,0}$, $v_{3,4}$, $v_{2,2}$ to the base station, and then the base station checks whether the following equality holds:

$$v_{0,0} = h\big(v_{1,0} \,\|\, h\big(h(v_{3,4} \,\|\, h(m_5)) \,\|\, v_{2,2}\big)\big)$$

where $h$ is a cryptographic hash function and $\|$ denotes concatenation.

Algorithm 2.1: Finding the minimum value from nodes' sensed data

    /* code for sensor node i */
    /* Initialization phase */
     1  p_i = S_i;                 // current parent.
     2  v_i = v_i;                 // current sensed physical phenomenon.
     3  id_i = S_i;                // owner of the current minimum value.
     4  for i = 1 .. d do
     5      send (v_i, id_i) to all neighbors.
     6      receive (v_j, id_j) from neighbors.
     7      if (v_j < v_i) for sensor j then
     8          p_i = S_j;
     9          v_i = v_j;
    10          id_i = id_j;
    11      end if;
    12  end loop;
    13  return <p_i, v_i, id_i>;

Adversarial Model and Attack Resistance

Przydatek et al. considered an adversary which can corrupt, at most, a small fraction of all the sensor nodes and then misbehave in any arbitrary way. However, more restrictions apply in their sub-schemes. For example, they assumed that the adversary, in the secure MIN sub-scheme, cannot lie about its value or is uninterested in reporting a smaller value. This adversary is classified as type III according to our discussion in Section. According to Przydatek et al., this type III adversary can launch the Node Compromise attack but it is still unable to affect the secure MIN aggregation sub-scheme, because the adversary is not allowed to report values smaller than the real values.
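The Merkle path check described in the verification phase above can be sketched as follows. This is an illustrative sketch, not the scheme's own code: it assumes SHA-256 for $h$, plain byte concatenation, and a number of leaves that is a power of two; `build_levels`, `auth_path`, and `verify` are hypothetical names.

```python
import hashlib


def h(data):
    """Cryptographic hash function h (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def build_levels(measurements):
    """Return all tree levels: levels[0] holds h(m_j), levels[-1] holds the root."""
    level = [h(m) for m in measurements]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two concatenated children.
        level = [h(level[k] + level[k + 1]) for k in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes the aggregator sends so leaf `index` can be checked."""
    path = []
    for level in levels[:-1]:
        sibling_is_left = index % 2 == 1
        sibling = level[index - 1] if sibling_is_left else level[index + 1]
        path.append((sibling, sibling_is_left))
        index //= 2
    return path


def verify(measurement, path, root):
    """Recompute the root from one measurement and its path; compare to the commitment."""
    node = h(measurement)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

For eight measurements, the path for the leaf at position 5 contains three sibling hashes, so the verifier needs only the picked measurement, three hashes, and the committed root.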
It is argued that this restriction should be relaxed, because an adversary with the ability to launch the Node Compromise attack can report whatever data it likes or selectively drop messages. Thus, this scheme is vulnerable to the Selective Forwarding attack. Moreover, the scheme is robust against the Replay attack due to the single usage of each temporary key shared with the base station. The scheme is also robust against the Sybil attack, because the adversary cannot mislead the base station into accepting new hash chains for the newly created fake identities. Thus, these fake identities cannot predict the next component of the

hash chain and thus they cannot participate in the network.

[Figure 2.8: A Merkle hash tree.]

Security Services

Przydatek et al. employed the Merkle hash tree together with µTESLA [96] and MACs to defend against a type III adversary. The usage of µTESLA and MACs provides authentication and data freshness to the network, and the Merkle hash tree provides data integrity. Authentication is offered because only legitimate sensor nodes, whose hash chains are synchronized with the base station, are able to participate in and contribute to the aggregation function. Data freshness is offered because of the single usage of the temporary key provided by µTESLA. Unfortunately, data availability was not considered by Przydatek et al., due to the number of bits that travel across the network in order to accomplish the aggregation task for a single query, as will be discussed in Section.

Discussion

As discussed above, the scheme is able to check the validity of the aggregation result, but takes no further action to remove or isolate the node which caused the inconsistency in the aggregation results. Przydatek et al. also restricted the adversary capability to compromising a node with no ability to report a value smaller than the real value when calculating the MIN aggregation function. It is believed that this assumption should be relaxed, because an adversary with the ability to compromise nodes is also able to perform whatever activities it likes. Once the assumption is relaxed, the secure MIN sub-scheme should be revisited.

Mahimkar & Rappaport's Scheme

Mahimkar & Rappaport's scheme is similar to Przydatek et al.'s scheme except that it provides one more security service: data confidentiality. It is composed of two phases: (i) the key establishment and (ii) the secure data aggregation and verification.
The key establishment phase

generates a secret key for each cluster, and each node belonging to the cluster has a share of the secret key. The node uses this share to generate a partial signature on its reading. The second phase ensures that the base station does not accept invalid aggregation results from the cluster head (or the aggregator). Each node senses the required physical phenomena (PP), encrypts it using its share of the cluster's private key, and computes the MAC on its PP using the key shared between itself and the base station. Then, it sends the encryption result and the MAC to the cluster head, which aggregates the nodes' PPs and computes the average of the sensed physical phenomena. The cluster head then broadcasts the average to all cluster members so that they can compare their PPs with the average. If the difference is less than a threshold, the node creates a partial signature on the average using its share of the cluster's private key, and then sends it to the cluster head. The cluster head combines these signatures into a full signature and sends it along with the average value to the base station.

Mahimkar & Rappaport used the Merkle hash tree together with encryption and digital signatures to achieve their goals. They used elliptic curve cryptography to encrypt the PPs reported to the cluster head, digital signatures to sign aggregation results, and the Merkle hash tree to verify the integrity of the reported aggregation results once the signature verification has failed.

Verification Phase

Upon receiving the average value and the full signature, the base station verifies the validity of the signature using the cluster's public key. A valid signature is generated by a coalition of t or more nodes within the cluster. The base station accepts the aggregation result, which is the average value, once the signature validity is accepted.
Otherwise, the base station rejects the aggregation result and uses the Merkle hash tree to ensure the integrity of the PPs. This is done as suggested in Przydatek et al.'s scheme.

Adversarial Model and Attack Resistance

Mahimkar & Rappaport aimed to defeat an adversary that is able to compromise up to $t - 1$ nodes in each cluster, where $t$ should be less than half of the total number of sensors in the cluster. This adversary falls into type III according to the discussion in Section. A type III adversary is able to launch the Node Compromise attack, as assumed by the designers of the scheme. Once the adversary has succeeded in compromising a sensor node, it can forward messages selectively to upper nodes or drop them. This is an example of the Selective Forwarding attack. Also, the adversary is able to replay an old message with its own valid signature, instead of the current message, which misleads the base station and affects the aggregation results. Finally, the scheme is robust against the Sybil attack, since each node should have a legitimate share of the cluster's private key that cannot be generated by the adversary.

Security Services

Through the key establishment phase, the scheme provides an authentication service, because only the cluster members with legitimate shares are able to participate in the aggregation processing. Data confidentiality and integrity are offered through the

aggregation and verification phase. Elliptic curve encryption provides data confidentiality, and digital signatures and the Merkle hash tree enhance data integrity of the aggregation results. Data freshness, however, is not considered.

Discussion

If the adversary compromises any of the cluster members, except the aggregator, it is able to affect the aggregation result by reporting invalid PPs. Wagner proved that the average function, which is implemented in this scheme as the aggregation function, is insecure in the presence of only one compromised sensor node [128]. Even worse, when the adversary succeeds in compromising the cluster head (or the aggregator), the adversary can then replay old but validly signed aggregation results to mislead the base station. In this case the base station would not be able to detect it. Moreover, Mahimkar & Rappaport considered only the average function, and replacing this function with another function is impossible given the same scheme run. In the current scenario, each sensor node is able to check the aggregation result by dividing its PP by the number of sensor nodes in its cluster, and then comparing the result with the average value broadcast by the cluster head. The sum function, for example, cannot be implemented because each sensor node encrypts its PP using a different share of the cluster's private key, and this share is inaccessible to other cluster members.

Sanli et al.'s Scheme

Sanli et al. [110] proposed a secure reference-based data aggregation scheme that encrypts the aggregation results and applies variable security strength at different levels of the hierarchy of cluster heads (or aggregators). The differential data, which is the difference between the reference value and the sensed data, is reported to aggregator points instead of the sensed data itself in order to reduce the number of transmitted bits. Sanli et al.
argued that intercepting messages transmitted at higher levels of the clustering hierarchy provides a summary of a large number of transmissions at lower levels. They, therefore, believed that the security level of the network should be gradually increased as messages are transmitted through higher levels. Based on this observation, they chose a cryptographic algorithm whose parameters and number of encryption rounds can be adjusted to change its security strength as required. Instead of sending the raw data to the aggregator, a sensor node compares its sensed data with the reference data and then sends the encryption of the difference. The reference data is taken as the average value of a number of previous sensor readings, N, where N > 1. Upon receiving these differential data, the aggregator performs the following activities:

• Decrypts the data and then determines the distance to the base station in number of hops (hop).

• Encrypts the aggregation result using RC6 with the number of rounds calculated as:

$$\text{number of rounds} = \frac{1}{hop} \times 100 \qquad (2.1)$$

They adjust the number of rounds that RC6 performs to accomplish an encryption operation, depending on how far the aggregator point is from the base station. The closer the aggregator is, the larger the number of rounds that should be used.

• Forwards the encrypted aggregated data to the base station.

Verification Phase

This scheme does not contain a verification phase to check the validity of the aggregation results. Sanli et al., instead, rely on the security primitive, RC6, to enhance the security of the aggregation results. Once the base station has received the encrypted aggregation results, it decrypts them with the corresponding keys.

Adversarial Model and Attack Resistance

Sanli et al. did not discuss the adversary capability that was considered in their scheme. It is believed, however, from the discussion in their paper, that the adversary is a type II adversary for the following reasons:

• They rely only on encryption to provide accurate data aggregation.

• A single node compromise can breach the security of the scheme. For example, once the adversary has succeeded in compromising an aggregator node, the privacy and accuracy of the aggregation results can be manipulated, which then affects the overall aggregation activities of the system.

Security Services

The data aggregation security is achieved by encrypting the data in transit using the block cipher RC6. This provides a data confidentiality service to the network. Data freshness is also provided due to the key update component attached to the aggregation component. Other security services are not considered because of the type of adversary considered by Sanli et al.

Discussion

The security primitives used to defeat the type II adversary are impractical for use in constrained devices such as sensor nodes. Law et al.
[74] constructed an evaluation framework in which suitable block cipher candidates for WSNs can be identified. They concluded, based on evaluation results, that RC6 is lacking in energy efficiency (i.e., it is a large RAM consumer) and performs poorly on 8/16-bit architectures. They further concluded that RC6 with 20 rounds is secure against a list of attacks such as the chosen ciphertext attack. However, the number of rounds for RC6 encryption in Sanli et al.'s scheme can be as low as 10 rounds when the aggregator node is 10 hops away from the base station, according to Equation 2.1.

Multiple Aggregator Model

In this model, collected data are aggregated more than once before reaching the final destination (or the querier); see Figure 2.6-B. As discussed in Section 1.3, this model achieves greater

reduction in the number of bits transmitted across the network, especially in large WSNs. The importance of this model grows as the network size gets bigger. Examples of secure data aggregation schemes that fall under this model are: Hu & Evans's scheme [58], Jadia & Mathuria's scheme [62], Westhoff et al.'s scheme [131], and Sanli et al.'s scheme [110], which are discussed in the following sections.

Hu & Evans's Scheme

Hu & Evans [58] proposed a secure aggregation scheme that achieves resilience against node compromise by delaying the aggregation and authentication at the upper levels. The required physical phenomena (PP) are, therefore, forwarded unchanged and then aggregated at the second hop instead of at the immediate next hop. Thus, the parents need to buffer the data in order to authenticate it once the shared key is revealed by the base station. This represents the first attempt towards studying the problem of data aggregation security once a node is compromised.

Each sensor node shares a temporary symmetric key with the base station, which lasts for a single aggregation calculation. The base station periodically broadcasts these authentication keys as soon as it receives the aggregation result. As a part of the aggregation phase, each leaf node transmits its PP to its parent. This transmission includes the node ID, the sensed PP, and the message authentication code $MAC_{K_{ID}}(ID, PP)$. The node uses the temporary key shared with the base station, but not yet known to the other nodes, to calculate the MAC. The parent (or any intermediate node) applies the aggregation function on messages received from its children, calculates the MAC of the aggregation result, and then transmits the messages and MACs received from its direct children along with the MAC computed on the aggregation result.
The parent, which has grandchildren nodes, is permitted to remove its grandchildren's raw data (or PPs) and confirm the aggregation result computed by its children nodes (the parents of its grandchildren). It is important that each parent stores the raw data received from its children (and its grandchildren, if available) and the MAC computed on the reported data from its children (and its grandchildren, if available). The parent will use this information at the end of the aggregation process when the base station reveals the temporary keys, as discussed in the subsequent paragraph.

Verification Phase

This scheme has a verification phase where the base station interacts with sensor nodes and aggregators in order to verify the aggregation results. Hu & Evans used the µTESLA protocol to update the shared keys between sensor nodes and the base station. The µTESLA protocol delays the disclosure of symmetric keys to achieve asymmetry [96]. The base station generates a one-way key chain of length $n$. It chooses the last key $K_n$ and generates the remaining values by applying a one-way function $F$ as follows:

$$K_j = F(K_{j+1})$$

Because $F$ is a one-way function, anybody can compute backward, such as compute $K_0, K_1, \ldots, K_j$ given $K_{j+1}$, but nobody can compute forward, such as compute $K_{j+1}$ given $K_0, K_1,$

$\ldots, K_j$. In the time interval $t$, the sender is given the key of the current interval $K_t$ by the base station through a secure channel, and then the sender uses the key to calculate $\mathrm{MAC}_{K_t}$ on its PP in that interval. The base station then discloses $K_t$ after a delay, which helps other nodes to verify the received $\mathrm{MAC}_{K_t}$. When aggregation results arrive at the base station, the base station reveals the temporary symmetric keys shared with every node. Every parent is now able to verify whether the information (raw data and the MAC) stored for its children matches. If the parent detects an inconsistent MAC from a child or a grandchild, it sends out an alarm message to the base station along with a MAC computed using the node's temporary key.

Adversarial Model and Attack Resistance

The most serious threat considered by Hu & Evans is that of an adversary that can compromise the network to provide false readings without being detected by the operator. Each compromised intermediate node (parent) can thus modify, forge, or discard messages, or transmit false aggregation values. Hu & Evans, however, limited the adversary's capability: it cannot launch the Node Compromise attack on two consecutive nodes in the hierarchy. This type of adversary falls into type III according to the earlier discussion of adversary types. Once an intermediate node is compromised, the adversary is then able to launch the Selective Forwarding attack. The scheme, however, is robust against the Replay attack due to the single usage of each temporary key shared with the base station. Also, the scheme is robust against the Sybil attack, because the adversary cannot mislead the base station into accepting new hash chains for the newly created fake identities.

Security Services

Hu & Evans regarded data confidentiality of messages to be unnecessary for their scheme.
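The one-way key chain used in the verification phase of this scheme can be illustrated with a short sketch. It assumes SHA-256 as the one-way function $F$ and assumes that each receiver holds an authentic copy of an earlier chain value (a commitment such as $K_0$); the names `make_key_chain` and `is_authentic` are hypothetical.

```python
import hashlib
import hmac


def F(key: bytes) -> bytes:
    # Assumed one-way function; the source only requires F to be one-way.
    return hashlib.sha256(key).digest()


def make_key_chain(last_key: bytes, n: int) -> list:
    """Return [K_0, ..., K_n] with K_j = F(K_{j+1}), starting from K_n."""
    chain = [b""] * (n + 1)
    chain[n] = last_key
    for j in range(n - 1, -1, -1):
        chain[j] = F(chain[j + 1])
    return chain


def is_authentic(disclosed: bytes, j: int, commitment: bytes, i: int = 0) -> bool:
    """Check that `disclosed` is K_j by hashing it back to the known K_i (i <= j)."""
    key = disclosed
    for _ in range(j - i):
        key = F(key)
    return hmac.compare_digest(key, commitment)
```

A node that trusts $K_0$ can authenticate any later disclosed key by repeated application of $F$, while an adversary who has seen $K_0, \ldots, K_j$ cannot derive $K_{j+1}$ without inverting $F$.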
They focused only on the integrity of aggregation results by using the µTESLA protocol, which also provides authentication and data freshness security services. Authentication is offered because only legitimate sensor nodes, whose hash chains are synchronized with the base station, are able to participate and contribute to the aggregation function. Data freshness is offered because of the single usage of the temporary key. Unfortunately, data availability was not considered by Hu & Evans, because each parent has to store and verify the information received from its children and grandchildren. This verification requires each parent to listen to every shared key revealed by the base station until it hears the keys of its children and grandchildren. Even worse for data availability, the data keeps traveling towards the base station even when it has been corrupted, because the keys are revealed only when the aggregation results reach the base station. Another factor that affects data availability is that once a compromised node is detected, no practical action is taken to reduce the damage caused by this compromise, and the compromised node can still participate in the aggregation activities.

Discussion

Hu & Evans considered data integrity and used µTESLA to defeat a type III adversary. The scheme is able to detect a single node compromise, but takes no further action to remove or isolate the compromised node. Much worse, once a grandparent node detects a node compromise, it cannot decide whether the cheating node is its child or its grandchild. The scheme, moreover, fails to provide data integrity once the adversary has successfully compromised two consecutive nodes in the hierarchy, such as the parent and the grandparent. The

scheme also suffers from extra memory overhead because of the delayed authentication and the need to buffer the data received by parents to be authenticated later. Finally, parents waste some energy listening to revealed keys that are not intended for them.

Jadia & Mathuria's Scheme

Data confidentiality was not considered in Hu & Evans's scheme. Jadia and Mathuria, however, argued that messages relayed in a data aggregation hierarchy may need confidentiality. Thus, they extended Hu & Evans's scheme to enhance the security services by adding data confidentiality [62]. This scheme uses encryption for confidentiality but without requiring decryption at intermediate nodes. The designers of the scheme adopted an encryption method where the data is added to a sufficiently long random encryption key. Let $K_A$ denote the master key shared between node A and the base station. The encryption of the sensed PP reported by a sensor node A can be calculated as follows:

$$C_{K_A} = PP_A + K_A \qquad (2.2)$$

After encrypting the required PP, node A computes two MACs on the encrypted PP. One MAC is calculated using the one-hop pairwise key shared with the node's parent, and the second MAC is calculated using the two-hop key shared with the node's grandparent. The aggregation phase is accomplished in the same way as in Hu & Evans's scheme, except for the two differences listed below:

• Leaf nodes encrypt their PPs before sending them.
• Leaf nodes compute two MACs on the encrypted data.

Each leaf node then forwards its ID, encrypted data, and two MACs to its parent. The parent node (say node C) receives the message and verifies the origin of the data using the one-hop pairwise key. It performs the aggregation over the encrypted data received from its children (node A and node B) as follows:

$$EAR = C_{K_A} + C_{K_B} + C_{K_C} \qquad (2.3)$$

where EAR denotes the Encrypted Aggregation Result.
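A small worked example may help to see why intermediate nodes never need to decrypt. The sketch below follows Equations 2.2 and 2.3 with plain integer addition and shows the base station recovering the sum by subtracting the keys it shares with the nodes. The key values and the helper names are illustrative assumptions; a deployment would use fresh, sufficiently long random keys.

```python
def encrypt(pp: int, key: int) -> int:
    """Equation 2.2: mask a reading by adding the node's key."""
    return pp + key


def aggregate(ciphertexts) -> int:
    """Equation 2.3: an aggregator simply adds the ciphertexts."""
    return sum(ciphertexts)


def decrypt_sum(ear: int, keys) -> int:
    """Base station: remove every contributing key from the encrypted sum."""
    return ear - sum(keys)


# Hypothetical keys and readings for nodes A, B, and C.
keys = {"A": 918273645, "B": 564738291, "C": 102938475}
readings = {"A": 20, "B": 22, "C": 19}

ear = aggregate(encrypt(readings[n], keys[n]) for n in readings)
total = decrypt_sum(ear, keys.values())
```

Node C only ever handles masked values, yet `total` equals the sum of the three plaintext readings, because addition of ciphertexts commutes with removal of the keys.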
Node C then calculates the MAC of EAR using the two-hop pairwise key shared with its grandparent node, and transmits it along with the encrypted PPs and MACs received from its children (without the MAC intended for itself).

Verification Phase

This scheme does not have a verification phase. Jadia & Mathuria argued that the two MACs, which are discussed in the previous paragraph, help provide the integrity of the data while minimizing the communication required between the base station and sensor nodes. In other words, the verification phase in Hu & Evans's scheme, where the base station reveals temporary shared keys with nodes, is replaced with MACs in order to improve data availability in the network. Jadia & Mathuria, however, did not discuss how these

pairwise keys are distributed, nor how much bandwidth and energy consumption were required. If the base station does not receive alarm messages from parents regarding inconsistency between encrypted data and the MACs computed on them, the base station decrypts the aggregation result (EAR) from Equation 2.3 as follows:

$$\text{Aggregation result} = EAR - (K_A + K_B + K_C) \qquad (2.4)$$

Adversarial Model and Attack Resistance

Since this scheme is an extension to Hu & Evans's scheme, the scheme designers considered the same adversary type, which is type III. Unfortunately, the scheme is vulnerable to the Selective Forwarding attack due to the capability of a type III adversary, for the same reasons given in the discussion of Hu & Evans's scheme. However, the scheme is robust against the Sybil and Replay attacks due to the design assumption that the authentication and encryption keys are changed with every message. No details on how these keys are changed were given.

Security Services

This scheme provides data confidentiality, data integrity, data freshness, and authentication services. The usage of two MACs, which are calculated with one-hop and two-hop pairwise keys, provides data integrity and authentication for the aggregation results. Data confidentiality is provided by the adopted end-to-end encryption that is summarized by Equations 2.2, 2.3, and 2.4. Finally, the data freshness service is ensured in the network due to the authors' assumption that the authentication and encryption keys are changed with every message.

Discussion

As discussed above, Jadia & Mathuria added data confidentiality to the security services provided by Hu & Evans's scheme, but their scheme has the same weaknesses. However, the memory overhead weakness does not arise in this scheme because it uses pairwise keys and does not need to keep copies of MAC information until the base station reveals temporary keys.

Westhoff et al.'s Scheme

Westhoff et al.
[131] solved the problem of aggregating encrypted data in WSNs, and proposed a secure data aggregation scheme that allows aggregator nodes to perform aggregation functions directly on ciphertexts. This work is an extension to their initial work in [131]. It uses an additive and multiplicative Privacy Homomorphic (PH) encryption scheme [38] in order to provide end-to-end encryption. The aggregator nodes do not need to decrypt encrypted messages when they aggregate them. If a conventional encryption algorithm, such as RC5, were used instead of PH to provide data confidentiality, hop-by-hop encryption would have to be used instead of end-to-end encryption. This is because conventional encryption algorithms do not let aggregator nodes apply aggregation functions directly on ciphertexts. Hop-by-hop encryption means that every intermediate node has to decrypt the received encrypted messages, aggregate them according to the corresponding aggregation function, encrypt the aggregation results, and finally forward the aggregation results to upper nodes.

During the last few years, PH encryption schemes have been studied extensively since they have proved to be useful in many cryptographic applications such as electronic elections [49], sensor networks [17, 131] and so on. A homomorphic cryptosystem is a cryptosystem that allows direct computation on encrypted data by using an efficient scheme. It is an important tool that can be used in a secure aggregation scheme to provide end-to-end privacy if needed. The RSA scheme is a good example of a deterministic, multiplicative homomorphic cryptosystem on $M = \mathbb{Z}/N\mathbb{Z}$, where $N$ is the product of two large primes [105]. Let $k_e$, $k_d$, $E$, $D$, $m$, $c$ denote the public key, private key, encryption function, decryption function, message in plaintext, and ciphertext, respectively. Thus, $C = \mathbb{Z}/N\mathbb{Z}$ is the ciphertext space and the key space is:

$$K = \{(k_e, k_d) = ((N, e), d) \mid N = pq,\ ed \equiv 1 \bmod \varphi(N)\}$$

The encryption of any message $m \in M$ is defined as:

$$E_{k_e}(m) = m^e \bmod N$$

while the decryption of any ciphertext $c \in C$ is defined as:

$$D_{k_e,k_d}(c) = c^d \bmod N = m \bmod N$$

Obviously, the encryption of the product of two messages $m_1, m_2 \in M$ can be computed by multiplying the corresponding ciphertexts:

$$E_{k_e}(m_1 m_2) = (m_1 m_2)^e \bmod N = (m_1^e \bmod N)(m_2^e \bmod N) = E_{k_e}(m_1) \cdot E_{k_e}(m_2)$$

Westhoff et al.'s scheme employs Domingo-Ferrer's encryption function, which chooses the ciphertext corresponding to a given plaintext (or message) from a set of possible ciphertexts. The public parameters for the encryption function are a positive integer $d \geq 2$ and a large integer $g$ that has many small divisors. There should be, at the same time, many integers less than $g$ that can be inverted modulo $g$. Then, the secret key is computed as:

$$k = (r, g')$$

Here $g'$ is a divisor of $g$, and $r \in \mathbb{Z}_g$ is chosen such that $r^{-1} \bmod g$ exists, where $\log_{g'} g$ indicates the security level provided by the function. The set of plaintexts is $\mathbb{Z}_{g'}$ and the set of ciphertexts is $(\mathbb{Z}_g)^d$.
The encryption process is executed at leaf nodes as follows:

• Randomly split the plaintext $a \in \mathbb{Z}_{g'}$ into secrets $a_1, a_2, \ldots, a_d$ such that $\sum_{j=1}^{d} a_j \bmod g' = a$

• Compute $E_k(a) = (a_1 r^1 \bmod g,\ a_2 r^2 \bmod g,\ \ldots,\ a_d r^d \bmod g)$

Leaf nodes then forward the encrypted data to aggregator nodes, where PH is used to apply the aggregation function on these encrypted data with no need to decrypt them. The decryption process is performed at the base station (or the querier), which is discussed in the subsequent paragraph.

Verification Phase

This scheme does not have a verification phase. Westhoff et al., instead, relied on the additive and multiplicative Privacy Homomorphic (PH) encryption scheme to defend against the considered type of adversary. The scheme is designed to encrypt the required physical phenomenon in a way that aggregators are able to apply aggregation functions directly on ciphertexts. The aggregators then forward the aggregation results to upper nodes. When these aggregation results reach the querier, the querier decrypts them as follows:

• Multiply the $j$-th coordinate by $r^{-j} \bmod g$ to retrieve $a_j \bmod g$.
• In order to compute $a$, the querier computes $D_k(E_k(a)) = \sum_{j=1}^{d} a_j \bmod g'$

Adversarial Model and Attack Resistance

Westhoff et al. aimed to defeat passive adversaries that eavesdrop on communication between sensor nodes, aggregators, and the base station. However, Westhoff et al. extended the capability of the adversary to be able to take over aggregator nodes but not other network components. Thus, we classify this adversary as type III due to its capability to launch the Node Compromise attack. Since the adversary is able to compromise aggregator nodes, it can then launch the Replay attack by replaying old but valid encrypted messages as long as the encryption keys of leaf nodes have not been updated or renewed. Once an aggregator is compromised, the adversary is easily able to launch the Selective Forwarding attack.

Security Services

The data aggregation security is provided by encrypting the reported data, and thus only data confidentiality is provided.
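The encryption, in-network addition, and decryption steps described above for the Domingo-Ferrer privacy homomorphism can be sketched as follows. This is a toy illustration under stated assumptions: the parameter values ($g = 720720$, $g' = 1001$, $d = 3$) are far too small to be secure, the secret $g'$ is assumed to be a divisor of $g$, and the function names are hypothetical.

```python
import secrets
from math import gcd


def keygen(g: int) -> int:
    """Pick a secret r in Z_g that is invertible modulo g."""
    while True:
        r = secrets.randbelow(g - 2) + 2
        if gcd(r, g) == 1:
            return r


def encrypt(a: int, r: int, g: int, g_prime: int, d: int) -> list:
    """Split a into d shares summing to a (mod g'), then scale share j by r^j (mod g)."""
    shares = [secrets.randbelow(g) for _ in range(d - 1)]
    shares.append((a - sum(shares)) % g_prime)
    return [(s * pow(r, j, g)) % g for j, s in enumerate(shares, start=1)]


def add(c1: list, c2: list, g: int) -> list:
    """Aggregator: add two ciphertexts coordinate by coordinate, without any key."""
    return [(x + y) % g for x, y in zip(c1, c2)]


def decrypt(c: list, r: int, g: int, g_prime: int) -> int:
    """Querier: undo the r^j scaling, then sum the shares modulo g'."""
    r_inv = pow(r, -1, g)  # modular inverse (Python 3.8+)
    return sum((x * pow(r_inv, j, g)) % g for j, x in enumerate(c, start=1)) % g_prime


G, G_PRIME, D = 720720, 1001, 3   # toy parameters: 720720 = 720 * 1001
r = keygen(G)
c_sum = add(encrypt(20, r, G, G_PRIME, D), encrypt(22, r, G, G_PRIME, D), G)
```

Decrypting `c_sum` yields the sum of the two plaintexts modulo $g'$, so the aggregator can add readings it cannot read. Because the split is random, encrypting the same reading twice gives different ciphertexts.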
Other security services, discussed in Section 2.1.1, are not provided due to the focus of Westhoff et al.'s paper.

Discussion

The security primitive used to defeat the type III adversary is PH. This primitive is impractical for use in constrained devices, such as sensor nodes, due to its high computational cost [131]. Westhoff et al. argued that their scheme addresses this disadvantage by rotating the aggregation duties between aggregators to balance the energy consumption. Moreover, it has been proved that PH is insecure against chosen plaintext attacks [127]. However, Westhoff et al. argued that for data aggregation scenarios in WSNs, the security level

is still adequate and they used this encryption transformation as a reference PH. Unfortunately, this scheme can support only the average and movement detection aggregation functions. Applying PH in the context of WSNs in order to support other aggregation functions is an open area of research.

Yang et al.'s Scheme

Yang et al. [136] proposed a secure data aggregation scheme that can tolerate more than one node compromise. The scheme is composed of two phases: (i) divide-and-conquer and (ii) commit-and-attest. In the former phase, the scheme uses a probabilistic grouping technique that partitions nodes in a tree topology into several logical groups. In the latter phase, a commitment-based hop-by-hop aggregation is performed in each group to generate a group aggregate. The base station then identifies the suspicious groups based on a set of group aggregates. Each group under suspicion participates in an attestation process to prove the validity of its group aggregation result. A leaf node encrypts its ID, physical phenomena (PP), count value (C), and the query sequence number (SQ) using a pairwise key shared with its parent. The count value represents the number of the node's children, and therefore C for any leaf node is always zero. The leaf node then forwards to its parent the encryption result, a MAC computed on the inputs to the encryption function, and a one-bit aggregation flag. This flag tells the node's parent, upon receiving the transmission, whether there is a need for further aggregation or not. When an intermediate node receives a message from its child, it first checks the flag and then follows one of the following scenarios:

• 1st scenario (flag=1): the intermediate node forwards the packet untouched to the base station via its parent.
• 2nd scenario (flag=0): the intermediate node decrypts the received message and then checks whether or not the received data is a legitimate response to the current query.
Once this check is passed, the intermediate node adds its own PP and other aggregation results received from other children nodes (with flag=0) to the received data. The C is subsequently updated by adding up the count values of all other participants. To decide whether to set the aggregation flag to one, which indicates that no more aggregation should be done on this result, the node performs the following check:

$$H(SQ \,\|\, ID) < F_g(C) \qquad (2.5)$$

where $H$ is a secure pseudo-random function that uniformly maps the input values into the range $[0, 1]$ and $F_g$ is a grouping function that outputs a real number in $[0, 1]$. This check helps the intermediate node to decide whether it is a leader node or not. Using the pairwise key shared with its parent, a non-leader node encrypts its ID, new C, aggregation result, and SQ. It then sets the flag to zero and forwards these data along with a MAC, which is

computed on the inputs to the encryption function, and an XOR result for all MACs received from its children and included in this aggregation. The leader node, on the other hand, performs the same operation as the non-leader node, except that it encrypts the new aggregation using the key shared with the base station and sets the flag to one.

Algorithm 2.2: Grubbs test algorithm
Input: a set $T$ of $n$ tuples $(x, c_x, Agg_x)$, where $x$ is the group leader ID, $c_x$ is the group count value, $Agg_x$ is the group aggregation result, and $n$ is the total number of groups;
Output: a set $L$ of leader IDs of groups with invalid aggregation results.
Procedure:
 1 loop
 2   compute $\mu_c$ and $s_c$ for all counts in set $T$;
 3   compute $\mu_v$ and $s_v$ for all values in set $T$;
 4   find the maximum count value $c_x$ in set $T$;
 5   compute statistic $Z_c$ for count $c_x$ as $\frac{c_x - \mu_c}{s_c}$;
 6   compute p-value $P_c$ based on the statistic $Z_c$;
 7   compute statistic $Z_v$ for the corresponding value $Agg_x$ as $\frac{Agg_x - \mu_v}{s_v}$;
 8   compute p-value $P_v$ based on the statistic $Z_v$;
 9   if $(P_c \cdot P_v) < \alpha$ then
10     $T = T \setminus \{(x, c_x, Agg_x)\}$;
11     $L = L \cup \{x\}$;
12   else
13     break;
14   end if;
15 end loop;
16 return $L$;

Verification Phase

The base station, upon receiving the aggregation result from a leader node, needs to verify whether the received aggregation result is accurate and came from a genuine leader node. It decrypts this aggregation result and then applies Equation 2.5 to check the legitimacy of the node as a leader node. Once the test is passed, the base station needs to check the validity of the received aggregation result. First, the base station uses an adaptive Grubbs test [50] to check for abnormality in the aggregation result before accepting or rejecting it. The adaptive Grubbs test, as shown in Algorithm 2.2, first computes the sample statistic for each datum $X$ in the set as $\frac{|X - \mu|}{s}$, where $\mu$ and $s$ are the mean and the standard deviation of the data, respectively.
The result represents the datum's absolute deviation from the mean in units of the standard deviation. To decide whether the null hypothesis $H_0$ should be accepted or not, the test compares the p-value computed from the sample statistic with the predefined significance level $\alpha$ (α = 0 typically), where the p-value is set as the product of the p-values of the data aggregation and the count (the number of participants in the aggregation). When the p-value is smaller than $\alpha$, $H_0$ is rejected, the datum under consideration is treated as an outlier, and the attestation mechanism is called. The attestation process is similar to the Merkle hash tree discussed in Przydatek et al.'s scheme. The base station interacts with the group under

suspicion to prove the correctness of its group aggregation result.

Adversarial Model and Attack Resistance

The scheme designers considered an adversary that can compromise a small fraction of sensor nodes to obtain their keys and reprogram these sensor nodes with attacking code. This type of adversary falls within type III according to the earlier discussion of adversary types. Although Yang et al. mentioned that they did not consider any type of behavior-based attack such as the Selective Forwarding attack, their scheme is examined against this attack for the sake of a complete survey. It is argued that if the adversary is able to launch the Node Compromise attack in order to mislead the base station about the aggregation results, the adversary can also perform some of the Selective Forwarding attack activities for the same purpose. The scheme, however, is robust against the Replay and Sybil attacks due to the query sequence number embedded in the reported PP and due to the use of µTESLA, respectively.

Security Services

The data aggregation security is achieved by encrypting the PP destined for the base station and then by checking the validity of the aggregation results. This ensures data confidentiality, authentication, and data integrity within the network. Due to the query sequence number, which is embedded in every response, data freshness is also offered. Data availability, however, is not ensured because of the high number of transmissions required to accomplish the aggregation activities, as will be discussed in Section 2.5.

Discussion

As discussed above, Yang et al. used an adaptive test to check the validity of aggregation results. This adaptive test is subject to attack when some nodes are compromised. The test uses the reported aggregation results to compute $\mu$ and $s$ (see Algorithm 2.2).
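The dependence of the test on reported values is easier to see in code. The sketch below follows the loop structure of Algorithm 2.2, with two loudly stated assumptions: the p-values are approximated by the one-sided upper tail of a standard normal distribution (the source does not say how they are computed, and a textbook Grubbs test uses the t-distribution), and the significance level passed in the usage note is an illustrative value. The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def upper_tail_p(x: float, data: list) -> float:
    """Approximate p-value of x being this far above the sample mean (assumption: normal tail)."""
    s = stdev(data)
    if s == 0:
        return 1.0  # all values identical: nothing can be an outlier
    z = (x - mean(data)) / s
    return 1.0 - NormalDist().cdf(z)


def suspicious_leaders(groups: list, alpha: float) -> list:
    """groups: list of (leader_id, count, aggregate) tuples. Returns flagged leader IDs."""
    remaining = list(groups)
    flagged = []
    while len(remaining) >= 3:
        counts = [c for _, c, _ in remaining]
        values = [v for _, _, v in remaining]
        # Steps 4-8: test the group with the largest count and its reported value.
        leader, c_x, agg_x = max(remaining, key=lambda t: t[1])
        p_c = upper_tail_p(c_x, counts)
        p_v = upper_tail_p(agg_x, values)
        if p_c * p_v < alpha:          # step 9: product of the two p-values
            remaining.remove((leader, c_x, agg_x))
            flagged.append(leader)
        else:
            break
    return flagged
```

With nine groups reporting counts of 9 to 11 and values near 100, plus one group reporting a count of 30 and a value of 900, `suspicious_leaders(groups, 0.05)` flags only the inflated group. Since the mean and standard deviation are computed from the reports themselves, several colluding groups can shift both statistics.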
Compromised nodes can collude and report invalid aggregation results to mislead the calculation of the mean of the data ($\mu$) and then affect steps 3-16 in Algorithm 2.2. This will affect the base station's decision and may force it to start the attestation process with honest groups instead of malicious groups. Moreover, invalid aggregation results are attested (or verified) through centralized verification that incurs a high communication cost.

2.4 Security Analysis

This section provides a security analysis of several secure data aggregation schemes. Not surprisingly, this analysis can be difficult for the following reasons:

• The data aggregation security problem was solved using different approaches. For example, some authors solved the problem by considering either a single aggregator model or a multiple aggregator model. Each model has its own challenges that need to be considered carefully. End-to-end encryption, for example, is easier to implement in the single aggregator model than in the multiple aggregator model. However, the energy consumption in the single aggregator model is high, because of the large number of transmissions required to accomplish a single aggregation query, as will be covered in Section 2.5.

• There is no standard adversarial model under which current cryptographic-based secure data aggregation schemes compete to provide a higher level of security, or resilience to the attacks discussed earlier. For example, schemes that defend against a type I adversary are secure in the face of Sybil, Selective Forwarding, and Replay attacks. However, the resilience against these attacks is not provided by the scheme itself, but is due to the limited capability of a type I adversary, as discussed earlier.

Current cryptographic-based secure data aggregation schemes are consequently compared with respect to the security services they provide and the attacks they are secure against.

Table 2.1: Security services provided in current secure data aggregation schemes
Columns: Scheme, CO, IN, FR, AV, AU (each marked Provided or Missing), AT
Scheme | AT
Sanli et al. [110] | II
Castelluccia et al. [17] | II
Westhoff et al. [131] | III
Hu & Evans [58] | III
Przydatek et al. [99] | III
Chan et al. [22] | III
Du et al. [40] | III
Mahimkar & Rappaport [75] | III
Yang et al. [136] | III
Jadia & Mathuria [62] | III
Frikken & Dougherty [43] | III
Haghani et al. [53] | III
CO: Confidentiality, IN: Integrity, FR: Freshness, AV: Availability, AU: Authentication, AT: Adversary Type

Security Services

Since the adversarial model considered in current cryptographic-based secure data aggregation schemes varies from one scheme to another, as discussed in Section 2.3, each scheme provides different security services to defeat the expected type of adversary. This section investigates which security services, discussed in Section 2.1.1, are provided in each of the cryptographic-based secure data aggregation schemes discussed in this chapter. It is obvious from Table 2.1 that schemes designed with a type II adversary in mind, such as Castelluccia et al.'s scheme [17] and Sanli et al.'s scheme [110], do not provide the entity authentication service, which is a must in most schemes that aim to defeat active adversaries (type III or type IV) as in [22,40,43,53,58,75,99,131,136]. This is because active adversaries can launch, for example, Sybil attacks where the adversary is able to present more than one identity and then interact with the network. Adversaries can successfully inject fake identities to affect aggregation results and

mislead the base station. The security services discussed in this section are as follows:

Data Confidentiality

Data confidentiality is provided in cryptographic-based secure data aggregation schemes whenever the privacy of the data is required. Some of the schemes in which a type II adversary is expected, such as Castelluccia et al.'s scheme [17] and Sanli et al.'s scheme [110], aimed to secure raw data and aggregation results from revelation by a passive adversary. Thus, they focused on providing data confidentiality only. This level of security is acceptable, because a type II adversary has no interest in destroying the overall performance; it is only interested in knowing the content of the reported information. Other schemes, which consider type III or type IV adversaries, may or may not provide data confidentiality. This depends on whether the privacy of aggregation results is important for WSN applications. For example, Jadia & Mathuria's [62], Mahimkar & Rappaport's [75], Przydatek et al.'s [99], Yang et al.'s [136], and Westhoff et al.'s [131] schemes provide data confidentiality along with other security services.

Data Integrity

Data integrity is provided in some cryptographic-based secure data aggregation schemes in which active adversaries (type III or type IV) are expected in the deployment area. These two types of adversary, as discussed in Section 2.2.2, can launch node compromise attacks and are then able to alter the content of data received from downstream nodes before it is forwarded to upstream nodes. If the data integrity service is not offered by a scheme, upstream nodes would have no knowledge of this alteration. Table 2.1 shows that most cryptographic-based secure data aggregation schemes that have at least a type III adversary in mind [22, 40, 43, 53, 58, 62, 75, 99, 136] provide the data integrity service. However, Westhoff et al.'s scheme [131] does not offer data integrity although it is built with a type III adversary in mind. This is because the authors of this scheme limited their discussion to offering data confidentiality only.

Data Freshness

Active adversaries (type III or IV) can launch different types of attack such as Replay attacks. They can affect the aggregation result by simply replaying old messages into networks that do not provide data freshness. Not surprisingly, most schemes in which active adversaries are expected ensure data freshness. However, data freshness is not provided in schemes such as Du et al.'s [40], Mahimkar & Rappaport's [75], and Westhoff et al.'s [131]. Witnesses in Du et al.'s scheme help the base station (or the querier) to validate the aggregation results, but the freshness of the aggregation is left unconsidered. Therefore, the aggregator, if compromised, can mislead the base station by replaying old messages with valid (but old) proofs from the witnesses. Westhoff et al.'s scheme also does not offer data freshness, although it was built with a type III adversary in mind. This is because the authors of this scheme limited their discussion to offering data confidentiality only. Table 2.1 shows that data freshness is ensured in Chan et al.'s scheme [22], Hu & Evans's scheme [58], Jadia & Mathuria's scheme [62], Przydatek et al.'s scheme [99], and Yang et al.'s scheme [136].

Table 2.2: Attack vulnerabilities in current secure data aggregation schemes
Columns: Scheme, NC, SY, SF, RE (each marked Robust or Vulnerable), AT
Scheme | AT
Castelluccia et al. [17] | II
Sanli et al. [110] | II
Westhoff et al. [131] | III
Hu & Evans [58] | III
Przydatek et al. [99] | III
Chan et al. [22] | III
Du et al. [40] | III
Mahimkar & Rappaport [75] | III
Yang et al. [136] | III
Jadia & Mathuria [62] | III
Frikken & Dougherty [43] | III
Haghani et al. [53] | III
SF: Selective Forwarding, RE: Replay, SY: Sybil, NC: Node Compromise, AT: Adversary Type

Data Availability

Recently, data availability has gained some attention in cryptographic-based secure data aggregation schemes. Detecting an inconsistency in aggregation results with no further action to determine the node that caused it is not enough. An adversary could keep manipulating aggregation results in order to bring the network down by consuming the energy resources of intermediate sensor nodes. Table 2.1 shows that Haghani et al.'s scheme is the only scheme that provides data availability [53]. This scheme allows the identification of the nodes that caused the inconsistency in the aggregation result (or the aggregation disruption) and then allows the removal of malicious nodes. These nodes can be detected through successive polling of the layers of a commitment tree. However, the energy cost of successive polling is potentially high.

Entity Authentication

As discussed in Section 2.1.1, entity authentication ensures the reliability of a message by verifying its origin. Table 2.1 shows that cryptographic-based secure data aggregation schemes that provide data integrity also provide entity authentication. This is because the message authentication code (MAC) is used to verify both data authenticity and data integrity. Note that entity authentication is only partially provided in Du et al.'s scheme, because only communications between an aggregator and a querier are authenticated.
Communications between leaf nodes and the aggregator are not authenticated.

Attack Vulnerability

This section extends the attack vulnerability analysis that is discussed in Section 2.3. Cryptographic-based secure data aggregation schemes are investigated to determine whether or not they are vulnerable to the security attacks listed earlier.

Node Compromise Attack

The node compromise analysis examines whether or not the adversary is able to reach any deployed sensor node and extract all credentials stored in its memory. It is usually assumed that node capture is easy in WSNs due to the lack of tamper-resistant packaging [8,69,107]. Thus, all cryptographic-based secure data aggregation schemes that consider active adversaries (type III or type IV) are vulnerable to the node compromise attack. Other schemes that consider passive adversaries (type I or type II), such as Sanli et al.'s [110] and Castelluccia et al.'s [17] schemes, are robust against the node compromise attack due to assumptions about adversary capability. However, these two schemes are vulnerable to the Node Compromise attack once the adversary assumption is relaxed.

Sybil Attack

As the capability of the adversary varies from type I to type IV, the damage caused by these attacks also varies. Passive adversaries (type I or type II), as discussed in Section 2.2.2, have insufficient capability to launch the Sybil attack. Therefore, Castelluccia et al.'s scheme [17] and Sanli et al.'s scheme [110] are robust against the Sybil attack because of the considered adversary capability, not because of the security primitives employed in these schemes. Du et al.'s scheme [40] is vulnerable to the Sybil attack, because leaf nodes are not authenticated to the aggregator. An adversary, upon compromising a leaf node, can present more than one identity and then mislead an aggregator with respect to aggregation results, as discussed in Section 2.3.
Selective Forwarding Attack

Once the adversary has succeeded in compromising a node, the adversary has full control of that node and can then selectively drop messages. This is an example of the selective forwarding attack. All secure data aggregation schemes that consider active adversaries (type III or type IV) are vulnerable to this type of attack, except Haghani et al.'s scheme [53]. This scheme has an adversary localizer component that marks nodes that disrupted an acknowledgment collection, and can then detect any selective forwarding activity. Once again, Castelluccia et al.'s scheme [17] and Sanli et al.'s scheme [110] are robust against the selective forwarding attack because of the considered adversary capability, not because of the security primitives employed in these schemes.

Replay Attack

Replay attacks occur when the adversary has the ability to re-inject (or replay) old messages without even understanding their content. Most cryptographic-based secure data aggregation schemes are robust against this attack, except Castelluccia et al.'s [17], Du et al.'s [40], Mahimkar & Rappaport's [75], Sanli et al.'s [110], and Westhoff et al.'s [131]. Surprisingly, Du et al.'s, Mahimkar & Rappaport's, and Westhoff et al.'s [131] schemes are vulnerable to the replay attack although they are designed to defeat active adversaries. For example, once an adversary has compromised an aggregator node in Du et al.'s scheme, it is able to replay an old aggregation result with its valid proofs, instead of a current result, to mislead the base station. In Mahimkar & Rappaport's scheme an adversary, upon compromising an aggregator, can replay old valid signed aggregation results to mislead the base station. In Westhoff et al.'s scheme, an adversary can replay old encrypted messages once the compromise of an aggregator node has succeeded, which affects the aggregation results.

The security analysis discussed above raises the point that relying on cryptographic countermeasures alone is insufficient to protect data aggregation schemes against node compromise attacks. Table 2.2 shows that most cryptographic-based secure data aggregation schemes are vulnerable to different types of attacks.

Framework for Evaluating New Schemes

Based on the discussion provided in Sections 2.1, 2.2, and 2.4, a conceptual framework for secure data aggregation schemes is proposed in this section. The framework helps to identify the minimum security services that a secure data aggregation design should provide to defend against a specific type of adversary. In other words, we believe that these minimum security services provide resilience against security attacks that can be launched by the expected adversary.

Figure 2.9: The proposed framework for secure data aggregation schemes (CO: Confidentiality, FR: Freshness, IN: Integrity, AU: Authentication)
Figure 2.9 depicts the relation between the security services, discussed in Section 2.1.1, and the adversarial model discussed earlier in this chapter. Since a type IV adversary is so much more powerful, it is unlikely that any practical cryptographic-based secure data aggregation scheme against this adversary can be devised. The framework, therefore, suggests the use of tamper-proof technology to deny physical access to this type of adversary. Since a type III adversary is able to launch the security attacks discussed in Section 2.2.1, the framework suggests that any secure data aggregation scheme should provide at least data integrity, data freshness, and authentication. Data integrity helps to detect spoofed data, data freshness is important to detect replayed messages, and authentication helps to defend against Sybil attacks. The framework treats data confidentiality as an optional requirement: if data privacy is valuable for an application, then data confidentiality is necessary.

A type I adversary is capable of eavesdropping on communications in the parts of the network that it has access to, and a type II adversary can eavesdrop on all communications in the network. However, neither type can interact with any component in the network. To defend against these adversaries, the framework suggests that any scheme should provide at least data integrity. Data integrity is important to minimize the effect of unreliable data delivery due to the transmission media or drained batteries. Again, data confidentiality is suggested as an optional requirement. If a WSN application in which in-network aggregation is implemented has concerns about data privacy, then data confidentiality should be provided. To the best of our knowledge, this framework is the first work that enables comparisons between different secure data aggregation schemes.

Figure 2.10: The aggregation tree model used in the performance analysis section

2.5 Performance Analysis

This section provides a performance analysis of some cryptographic-based secure data aggregation schemes discussed in this chapter.
This analysis focuses on calculating the number of bits transmitted within the network, in order to determine which secure data aggregation scheme is the most energy hungry, that is, which scheme sends the most information to accomplish its objectives. Notations used in this section are listed in Table 2.3.

Table 2.3: Description of notations used in the performance analysis section

Notation   Description
b          The number of children nodes that an intermediate node has.
d          The depth of the aggregation tree.
x          The length of the reported information (raw or aggregation result) excluding the header.
y          The length of the sensor ID in bits.
z          The length of the MAC in bits.
qn         The length of the query nonce in bits.
h          The length of the packet's header in bits.
w          The number of witnesses per aggregator.
N          The total number of nodes in the aggregation tree.
n          The length of N in bits.

For concreteness, we consider an aggregation tree whose depth is $d$ and in which each node (except leaf nodes) has $b$ children, as shown in Figure 2.10. This means that the distance between the base station and the leaf nodes is $d + 1$, where $d$ starts with zero at the first level. The total number of nodes ($N$), excluding the base station, in the tree is $n$ bits long and can be calculated as:

\[ N = \frac{b^{d+1} - 1}{b - 1} \tag{2.6} \]

This kind of tree, therefore, has $b^d$ leaf nodes. If a scenario belongs to the single aggregator model, we consider the root of the tree to be the aggregator. Otherwise, any parent node acts as an aggregator (see Figure 2.10). In both models, each sensor node in the tree has to participate in the aggregation activity by sensing the environment and then reporting its reading to its parent. Moreover, a TinyOS packet is pre-configured with a maximum size of 35 bytes (29 bytes of payload and 6 bytes of header); we denote the length of the packet header by $h$.

We discuss six scenarios in which both the single and the multiple aggregator models are covered. These scenarios are: no aggregation, aggregation but no security, two representatives for the multiple aggregator model (Hu & Evans's scheme [58], Jadia & Mathuria's scheme [62]), and two representatives for the single aggregator model (Przydatek et al.'s scheme [99], Du et al.'s scheme [40]).
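As a quick sanity check of Equation 2.6, the node count can be computed both from the closed form and by summing the nodes level by level. This is a minimal sketch; the function names are hypothetical and are not part of the original analysis.

```python
def total_nodes(b: int, d: int) -> int:
    """Closed form of Equation 2.6: nodes in the tree, excluding the base station."""
    if b == 1:
        # Degenerate chain; the closed form divides by zero when b = 1.
        return d + 1
    return (b ** (d + 1) - 1) // (b - 1)


def nodes_per_level(b: int, d: int) -> list[int]:
    """Level k (0 = root, d = leaves) holds b**k nodes."""
    return [b ** level for level in range(d + 1)]


def leaf_nodes(b: int, d: int) -> int:
    """The deepest level holds b**d leaf nodes."""
    return b ** d


for b, d in [(2, 3), (3, 4), (4, 3)]:
    # The level-by-level sum must agree with the closed form.
    assert sum(nodes_per_level(b, d)) == total_nodes(b, d)
    print(f"b={b}, d={d}: N={total_nodes(b, d)}, leaves={leaf_nodes(b, d)}")
```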
Since not all of these scenarios have a verification phase, we limit our analysis to the aggregation phase only.

First Scenario: No Aggregation & No Security

We analyze the number of transmitted bits by considering the situation where no aggregation and no security are used within our example summarized in Figure 2.10. Leaf nodes sense some physical phenomena and report them to upper nodes (their parents). The parents subsequently forward this information to upper nodes until the information is delivered to and collected by the base station (or the querier). Each reported set of information contains the sensor node ID and the sensed physical phenomena, which requires each sensor node at level $d$ to send a message of $x + y + h$ bits to its parent. Each parent, or intermediate node, needs to forward $x + y + h$ bits for each child it has and $x + y + h$ bits to report its own reading. Thus, the total number of bits forwarded by each parent at level $d - i$ (where $i = d - 1$) is:

\[ (b + 1)(x + y + h) \tag{2.7} \]

From Equation 2.7, the total number of bits traveling across the network to perform a single aggregation function can be estimated as follows:

\[ \left( \frac{b^{d+1} - 1}{b - 1} \right)(x + y + h) + \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + h) \tag{2.8} \]

Second Scenario: Aggregation but No Security

The aggregation functionality in this scenario is implemented but security is not considered. This scenario is similar to the example discussed in Section 1.3, where each parent combines the $b$ messages reported by its children with its own reading. Then, it forwards only one message to represent these $b + 1$ messages. The number of bits forwarded by each parent at any level is estimated as $x + y + h$, and the total number of bits traveling across the network in order to accomplish the aggregation phase is calculated as:

\[ \left( \frac{b^{d+1} - 1}{b - 1} \right)(x + y + h) \tag{2.9} \]

Third Scenario: Hu & Evans's Scheme

This scenario analyzes Hu & Evans's scheme [58]. This scheme, as discussed in Section 2.3, follows the multiple aggregator model with a verification phase. Each leaf node (at level $d - i$ where $i = 0$) needs to send its ID, data, and one message authentication code toward its parent. The length of this message in bits can be calculated as $x + y + z + h$.
Then, the total number of bits sent by all leaf nodes at level $d - i$ (where $i = 0$) can be estimated as:

\[ b^{d} (x + y + z + h) \tag{2.10} \]

Each parent (at levels $d - i$ where $0 < i \le d$) needs to forward the received data unchanged and adds one more MAC. Thus, the length of this message in bits can be calculated as $b(x + y + z) + z + h$. This means that the total number of bits sent by all parents in the tree is:

\[ \sum_{i=1}^{d} b^{(d-i)} \left[ b(x + y + z) + z + h \right] \tag{2.11} \]
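The totals for the first three scenarios can be evaluated numerically. The sketch below implements Equations 2.8 and 2.9 and the sum of Equations 2.10 and 2.11 as written above. The function names are hypothetical, and the lengths used in the usage lines (x = 7, y = 2, z = 6, h = 6, all in bytes rather than bits) are illustrative assumptions.

```python
def node_count(b: int, d: int) -> int:
    """Number of nodes in the tree, summed level by level (Equation 2.6)."""
    return sum(b ** level for level in range(d + 1))


def forwarding_weight(b: int, d: int) -> int:
    """The summation in Equation 2.8: sum over i of (d - i) * b**(d - i)."""
    return sum((d - i) * b ** (d - i) for i in range(d + 1))


def no_aggregation(b: int, d: int, x: int, y: int, h: int) -> int:
    """Equation 2.8: every reading is reported and then forwarded hop by hop."""
    return (node_count(b, d) + forwarding_weight(b, d)) * (x + y + h)


def aggregation_only(b: int, d: int, x: int, y: int, h: int) -> int:
    """Equation 2.9: every node sends exactly one message."""
    return node_count(b, d) * (x + y + h)


def hu_evans(b: int, d: int, x: int, y: int, z: int, h: int) -> int:
    """Equation 2.10 (leaf nodes) plus Equation 2.11 (parents)."""
    leaves = b ** d * (x + y + z + h)
    parents = sum(b ** (d - i) for i in range(1, d + 1)) * (b * (x + y + z) + z + h)
    return leaves + parents


# Illustrative lengths in bytes (assumed values, used consistently for all terms).
x, y, z, h = 7, 2, 6, 6
for b, d in [(2, 3), (2, 4), (3, 3)]:
    print(b, d,
          no_aggregation(b, d, x, y, h),
          aggregation_only(b, d, x, y, h),
          hu_evans(b, d, x, y, z, h))
```

Keeping the per-level sums explicit, instead of collapsing them into closed forms, makes each function easy to compare term by term with the corresponding equation.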

Thus, the approximate number of bits transmitted across the network to perform a single aggregation transaction, in Hu & Evans's scheme, can be calculated by adding Equation 2.10 and Equation 2.11 together as follows:

\[ \sum_{i=1}^{d} b^{(d-i)} \left[ b(x + y + z) + z + h \right] + b^{d}(x + y + z + h) = \left( \frac{b^{d+1} - 1}{b - 1} - b^{d} \right) \left[ b(x + y + z) + z + h \right] + b^{d}(x + y + z + h) \tag{2.12} \]

Fourth Scenario: Jadia & Mathuria's Scheme

As discussed in Section 2.3, Jadia & Mathuria's scheme [62] enhanced the security services provided in Hu & Evans's scheme [58] by adding data confidentiality. This requires each node to add one more message authentication code to each message. So, each sensor node at level $d - i$ (where $i = 0$) sends $x + y + 2z + h$ bits instead of the $x + y + z + h$ bits sent in Hu & Evans's scheme. Then, the total number of bits sent by all leaf nodes can be estimated as:

\[ b^{d} (x + y + 2z + h) \tag{2.13} \]

By substituting Equation 2.13 for the second term on the right-hand side of Equation 2.12, the total number of bits sent by the scheme to accomplish a single aggregation function is approximately:

\[ \left( \frac{b^{d+1} - 1}{b - 1} - b^{d} \right) \left[ b(x + y + z) + z + h \right] + b^{d}(x + y + 2z + h) \tag{2.14} \]

Fifth Scenario: Przydatek et al.'s Scheme

In this scenario, Przydatek et al.'s scheme [99] is analyzed. The scheme follows the single aggregator model and uses the aggregate-commit-prove approach discussed in Section 2.3. In the aggregate phase, each sensor sends its ID, data, query nonce, and two message authentication codes keyed with two shared keys: the first key is shared with the aggregator and the other key is shared with the base station. The length of this message in bits is $x + y + qn + 2z + h$ and it travels all the way toward the aggregator. Therefore, the total number of bits traveling across the network until the sensed data reaches the aggregator can be estimated as:

\[ \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + qn + 2z + h) \tag{2.15} \]

In the commit phase, the aggregator constructs a Merkle hash tree of the received messages. The aggregator sends the root of this tree (as a commitment value), the number of leaves in the hash tree, and an aggregation result. Let us assume for simplicity that the commitment value is $x + y + qn + 2z + h$ bits long, and that the length of the aggregation result is the same as that of the reported data, $x$. Thus, the total number of bits sent to the home server (or remote user) by the aggregator is:

\[ n + 2x + y + qn + 2z + h \tag{2.16} \]

Adding the number of bits in Equations 2.15 and 2.16 gives the total number of bits sent by the scheme to perform the aggregation phase for a single aggregation query as follows:

\[ n + 2x + y + qn + 2z + h + \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + qn + 2z + h) \tag{2.17} \]

Sixth Scenario: Du et al.'s Scheme

According to the discussion in Section 2.3, Du et al.'s scheme follows the single aggregator model. It is assumed that leaf nodes are honest and that the sensed data reaches the aggregator and the witnesses correctly. Let us assume that each sensor needs to send at least its ID and its sensed data. The length of this message in bits is $x + y + h$. Therefore, the number of bits sent by leaf nodes to the aggregator in order to accomplish the aggregation phase for a single aggregation activity can be estimated as:

\[ \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + h) \tag{2.18} \]

According to the scheme design, the same number of bits goes to each witness, and consequently the total number of bits sent to the witnesses can be estimated as:

\[ w \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + h) \tag{2.19} \]

where $w$ is the number of witnesses. Then, each witness computes the aggregation result and sends it to the aggregator with a message authentication code (MAC) that contains its ID and the aggregation result.
The length in bits for this transmission can be calculated as:

\[ w(x + y + z + h) \tag{2.20} \]
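The summation that appears in Equations 2.15, 2.18, and 2.19 has a simple closed form, which may be convenient when evaluating these expressions by hand. The identity below is a standard result for arithmetico-geometric sums and is not part of the original analysis; it assumes $b \neq 1$ and uses the substitution $k = d - i$.

```latex
\[
\sum_{i=0}^{d} (d - i)\, b^{(d-i)}
  \;=\; \sum_{k=0}^{d} k\, b^{k}
  \;=\; \frac{b \left( 1 - (d+1)\, b^{d} + d\, b^{d+1} \right)}{(b - 1)^{2}},
  \qquad b \neq 1 .
\]
% Worked check for b = 2, d = 3:
%   direct sum:   0 + 2 + 8 + 24 = 34
%   closed form:  2 (1 - 4 \cdot 8 + 3 \cdot 16) / 1 = 2 \cdot 17 = 34
```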

Finally, the aggregator forwards its ID, the aggregation result that it has computed itself, and all MACs received from its witnesses as follows:

\[ x + y + wz + h \tag{2.21} \]

Therefore, the total number of bits traveling across the network can be calculated by adding Equations 2.18, 2.19, 2.20, and 2.21 as follows:

\[ \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + h) + w \sum_{i=0}^{d} (d - i)\, b^{(d-i)} (x + y + h) + w(x + y + z + h) + (x + y + wz + h) \tag{2.22} \]

Example

To better understand the transmission overhead caused by the scenarios above, a numerical example is given. Let us select the length of the reported information without the header ($x$), the length of the sensor ID ($y$), the MAC's length ($z$), the number of witnesses ($w$), the length of the query nonce ($qn$), and the length of the total number of sensor nodes ($n$) to be 7 bytes, 2 bytes, 6 bytes, 5 witnesses, 3 bytes, and 4 bytes respectively. We compare the scenarios discussed in this section by computing the number of bytes that each scenario transmits to accomplish the aggregation phase. This can be done by substituting the values given above into Equations 2.8, 2.9, 2.12, 2.14, 2.17, and 2.22. Table 2.4 investigates our scenarios by varying the depth of the aggregation tree and the number of children each parent has.

Table 2.4: Number of bytes transmitted across the network to accomplish a single aggregation transaction (rows: the six scenarios above; columns: b = 2, 3, 4, each with d = 3 and d = 4)

In contrast with the first scenario, the second scenario shows that in-network aggregation greatly helps reduce the number of bits required to accomplish the aggregation phase. This reduction increases as the depth of the aggregation tree or the number of children per parent increases. Table 2.4 also shows that cryptographic-based secure data aggregation schemes that follow the single aggregator model send many more bits than schemes that follow the multiple aggregator model. In fact, they send at least double the number of bits sent by multiple aggregator model schemes.

2.6 Summary

This chapter is about cryptographic-based secure data aggregation. It first gives introductory information about secure data aggregation in WSNs, which leads to a new definition of data aggregation security with respect to the challenges that WSNs have. Then, it highlights the security requirements for data aggregation in WSNs, since this thesis is centered on providing security to data aggregation applications. It also discusses security attacks against cryptographic-based secure data aggregation schemes. Then, it surveys in detail some of the current secure data aggregation schemes and classifies them into two models: (i) the single aggregator model, and (ii) the multiple aggregator model. It also discusses the security and performance analysis of current cryptographic-based secure data aggregation schemes. The security analysis covers the security services the current schemes provide and their robustness against the security attacks discussed in this thesis. Based on the security analysis, a conceptual framework is proposed. This framework helps to identify the minimum security services that a secure data aggregation design should provide to defend against a specific type of adversary. The security analysis also shows that relying on cryptographic countermeasures alone is insufficient to protect data aggregation schemes against node compromise attacks.
Table 2.2 shows that most cryptographic-based secure data aggregation schemes are vulnerable to different types of attacks. The performance analysis covers the number of bits transmitted in order to accomplish the aggregation phase in some selected schemes. Schemes that follow the multiple aggregator model are more efficient than schemes that follow the single aggregator model. In the next chapter, an alternative direction to circumvent node compromise attacks is discussed. The reputation-based approach, in this direction, monitors network activities and tries to detect events related to node compromise.

Chapter 3

Reputation-based Trust Systems in Wireless Sensor Networks

Chapter 2 has reviewed cryptographic-based secure data aggregation schemes. It was found that cryptographic mechanisms alone are insufficient to defend against node compromise attacks. The wireless security community has consequently developed a suite of mechanisms to complement cryptographic techniques, such as reputation-based trust systems. These systems can be defined as systems that collect, process, and disseminate feedback about the history of the sensors' behaviors. To the best of our knowledge, there is only one survey in which current reputation-based trust systems for WSNs have been studied. Roman et al. gave the state of the art in trust management systems for WSNs and also tried to identify the main components of these systems' architectures [106]. The two main components, according to Roman et al.'s study, are information gathering and information modeling. This chapter extends the work in [106] by considering more components in the architecture of reputation-based trust systems, and by analyzing more trust systems. It also provides insights into the reputation components of each system and its vulnerability to the security attacks discussed earlier and in Section 3.2.

Trust has become an important topic of research in many fields including sociology, psychology, philosophy, economics, business, law and information technology. The most cited definition of trust has been presented by Dasgupta as "the expectation of one person about the actions of others that affects the first person's choice, when an action must be taken before the actions of others are known" [33]. This definition captures both the purpose of trust and its nature in a form that can be reasoned about. Another definition of trust, by Gambetta [45], is also often quoted in the literature: "trust (or, symmetrically, distrust) is a particular level of the subjective probability with which an agent assesses that another agent or group of agents will perform a particular action, both before he can monitor such action (or independently of his capacity ever to be able to monitor it) and in a context in which it affects his own action." Though many definitions are available in the literature, a complete, formal, and unambiguous definition of trust is rare because trust is a complex term with multiple dimensions.

A concept that is often mentioned together with trust is reputation. In order to avoid confusion, a definition of reputation as well as the relation between reputation and trust are highlighted in this paragraph. Mui et al. [84] define reputation as "a perception that an agent creates through past actions about its intentions and norms." A similar definition given by Abdul-Rahman et al. [1] is "a reputation is an expectation about an agent's behavior based on information about or observations of its past behavior." Another definition of reputation is given by Jøsang et al. [65]: "reputation is what is generally said or believed about a person's or thing's character or standing." Although this definition only introduces an abstract notion of reputation, it allows one to easily differentiate between trust and reputation. Trust describes a subjective relation between an entity and another entity (or group of entities), while reputation is what is generally said about an entity. Thus, the reputation of an entity is based on the opinions provided by all entities. Trust may be used to determine the reputation of an entity. Conversely, reputation may also be used to determine the trustworthiness of an entity [65]. The Feedback Forum on eBay is the most prominent example of online reputation systems [68], in which the basic idea is to let parties rate each other.
After the completion of a transaction, each party is allowed to leave feedback about their experience of the other party. Then, the aggregated ratings about a given party are used to derive a reputation score, which can assist other parties in deciding whether or not to deal with that party in the future. In general, trust and reputation models provide means for assessing the trustworthiness of an entity within a specific context or scope. However, traditional trust management schemes used for wired and wireless Ad Hoc networks are not suitable for WSNs due to higher computational costs, and large memory and communication overheads [113, 114].

Our contributions in this chapter include the following:

• Proposal of an analysis framework for reputation-based trust systems. This framework helps to understand the limitations of each system.

• Discussion of the security concerns in reputation-based trust systems designed for WSNs. This includes discussion of how the integration between wireless sensor networks and reputation systems can open doors for an adversary to threaten reputation-based trust systems, and thus affect their entire performance.

• Presentation of a comprehensive survey of the state of the art in reputation-based trust systems for WSNs, and then classification of these systems according to the context they were designed for.

• Finally, a detailed comparison of these reputation-based trust systems. This comparison includes: (i) investigating the feasibility of the main components of existing reputation systems, and (ii) analyzing the vulnerability of these systems to security attacks related either to WSNs or to reputation systems. It is believed that this comparison will help in assessing the strengths and weaknesses of existing reputation-based trust systems.

The rest of the chapter is organized as follows: Section 3.1 proposes a framework to analyze current reputation-based trust systems. The framework is composed of four phases: (i) information gathering and sharing, (ii) information modeling, (iii) decision making, and (iv) dissemination. Section 3.2 discusses possible security attacks against reputation systems. Section 3.3 surveys, in detail, some of the current reputation-based trust systems intended to work in WSNs and then classifies them into five categories: (i) generic, (ii) localization, (iii) mobility, (iv) routing, and (v) aggregation. Then, a comparison between current reputation-based trust systems is given in Section 3.4. Finally, a concluding section closes the chapter.

3.1 Analysis Framework for Reputation Systems

Reputation systems often share similar structural patterns due to the common purposes they are used for, such as enhancing the system's overall performance by monitoring network activities. They consist of four main phases: information gathering and sharing, information modeling (or reputation calculation), decision making, and dissemination (see Figure 3.1). These four phases are discussed in the following subsections.

Figure 3.1: The reputation system phases

Information Gathering and Sharing Phase

This phase comprises the communication and collection of reputation ratings.
A reputation system design must specify the type of information to be collected about other neighboring nodes, and how it should be collected. The metrics for collected ratings can, for example, accept only positive ratings, only negative ratings, both types, or any rating on a continuous scale. It is believed that this phase is the core component of any reputation system, because it evaluates current activities, gathers the available information about the system, and then hands it to the next phase, the information modeling phase. The information gathering and sharing phase has four components: information source, information type, information gathering approach, and gathering scope. These components are discussed as follows:

Information Source: The information source in any reputation system can be either manual or automatic. The manual information source is obtained in the form of user ratings for other entities as a result of being involved in a single transaction, such as in the eBay rating system [68]. This type of source is not available in WSNs due to the lack of user interaction with the network. The only user interaction with WSNs usually occurs at the base station, whereas the reputation system gathers information from every device within the WSN. The automatic information source does not involve user interaction and can be either direct or indirect observation. Direct observations, sometimes called first-hand information, are computed based on the node's own observations of and experience with neighboring nodes, such as the success and failure of forwarding aggregated data within an error rate. In some reputation systems, direct observations need to be propagated to other nodes in the neighborhood. This propagated information is then called indirect observation, or second-hand information, at the receiving nodes. In other words, an indirect observation for one node is a propagated direct observation of another node.
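The distinction between first-hand and second-hand information can be sketched with a small data structure. This is an illustrative sketch only, not the design of any surveyed system: the `ObservationTable` class, its method names, and the use of simple success/failure counts are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ObservationTable:
    """Per-node store of direct (first-hand) and indirect (second-hand) observations."""
    owner: str
    # neighbor -> [successes, failures] seen by the owner itself
    direct: dict = field(default_factory=lambda: defaultdict(lambda: [0, 0]))
    # neighbor -> list of (reporter, (successes, failures)) received from others
    indirect: dict = field(default_factory=lambda: defaultdict(list))

    def record_direct(self, neighbor: str, success: bool) -> None:
        """Update the first-hand counters after observing a neighbor's behavior."""
        self.direct[neighbor][0 if success else 1] += 1

    def share(self) -> dict:
        """Snapshot of the first-hand counters, to be propagated to neighbors."""
        return {node: tuple(counts) for node, counts in self.direct.items()}

    def receive(self, reporter: str, report: dict) -> None:
        """Store another node's direct observations as indirect observations."""
        for node, counts in report.items():
            if node != self.owner:  # opinions about the owner itself are ignored
                self.indirect[node].append((reporter, counts))


node_a = ObservationTable("A")
node_b = ObservationTable("B")
for outcome in (True, False, True):
    node_a.record_direct("C", outcome)

# A's direct observation of C becomes an indirect observation at B.
node_b.receive("A", node_a.share())
print(dict(node_b.indirect))
```

Here node B learns about node C without ever interacting with it. Because the report is taken at face value, this is also the point at which false reports could enter the system.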
Indirect observation helps to build up the reputation system more quickly than using only direct observation, since nodes are able to learn about other nodes' behaviors even though no direct communications (observations) have occurred. However, propagating reputation information between nodes makes the system vulnerable to different attacks such as Bad Mouthing (BM), Ballot Stuffing (BS), and On-Off (OO) attacks, as discussed in Section 3.2.

Information Type: The type of the reputation information shared between sensor nodes can be unary (either only negative [14] or only positive [81]); binary (positive or negative [13, 117, 118]); discrete (for example positive, neutral, or negative as in eBay, or a natural number on a scale from 1 (untrusted) to 10 (trusted) [48]); or continuous (for example real values in the range [0, 1] [66]). The choice of the information type is up to the system designer, but designers should be aware of the consequences of any choice. On the one hand, considering only positive feedback prevents the BM attack, because malicious nodes are not able to affect the trust level of trustworthy nodes by propagating negative reputation ratings. However, malicious nodes can collude and falsely praise misbehaving nodes to launch a BS attack. Propagating positive feedback also exhausts the network's limited resources, since the number of nodes that behave correctly is in general supposed to be larger than the number of those that do not. Thus, the number of transmissions required to update reputation values is high, which depletes the limited energy source. On the other hand, considering only negative feedback helps prevent malicious nodes from colluding and praising misbehaving nodes (BS attack), because they cannot propagate positive feedback. It also helps to minimize the number of transmissions required to update the reputation values. However, malicious nodes can assign negative reputation ratings to trustworthy nodes in order to affect their trust level (BM attack).

Information Gathering Approach: As discussed earlier, the main task of this phase is to collect information about other sensor nodes in the neighborhood. This information is gathered by a sensor node based on its observations of and experience with other nodes. Most current reputation-based trust systems in WSNs use monitoring mechanisms such as the Watchdog mechanism (WDM) [81] to collect these direct observations. When a node forwards a packet, the node's WDM verifies that the next node in the path also forwards the packet. The WDM is implemented by maintaining a buffer of recently sent packets. The WDM compares each overheard packet with the packets in the buffer in order to see whether they match. Once there is a match, the packet is removed from the buffer. If a packet has remained in the buffer for longer than a certain timeout, the WDM increments a failure tally for the node that is responsible for forwarding it.

Reputation System Scope: In the current literature, most reputation-based trust systems intended for WSNs focus on specific functions. For example, CORE [81] and CONFIDANT [14] focus on detecting misbehaviors related to routing functionalities, while DRBTS [118] focuses on enforcing cooperation between beacon nodes by motivating them to provide correct location information. Comparison between reputation-based trust systems with different scopes is difficult. This is because a scope-specific reputation system requires the WDM to be tailored in order to monitor activities related to the chosen scope.
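Since the scope discussion hinges on what the WDM monitors, it may help to see the buffer-and-timeout logic described above in code. The following is a minimal sketch under simplifying assumptions: packets are matched by a hypothetical packet identifier instead of by content, time is passed in explicitly, and the `Watchdog` class and its method names are invented for illustration.

```python
class Watchdog:
    """Sketch of the Watchdog mechanism: buffer sent packets, tally unforwarded ones."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.pending = {}    # packet_id -> (next_hop, time_sent)
        self.failures = {}   # next_hop -> failure tally

    def packet_sent(self, packet_id: str, next_hop: str, now: float) -> None:
        """Buffer a packet that the next hop is expected to forward."""
        self.pending[packet_id] = (next_hop, now)

    def packet_overheard(self, packet_id: str) -> None:
        """An overheard transmission matches a buffered packet: remove it."""
        self.pending.pop(packet_id, None)

    def expire(self, now: float) -> None:
        """Charge a failure to the next hop for every packet older than the timeout."""
        for packet_id, (next_hop, sent_at) in list(self.pending.items()):
            if now - sent_at > self.timeout:
                self.failures[next_hop] = self.failures.get(next_hop, 0) + 1
                del self.pending[packet_id]


wd = Watchdog(timeout=2.0)
wd.packet_sent("p1", next_hop="B", now=0.0)
wd.packet_sent("p2", next_hop="B", now=0.5)
wd.packet_overheard("p1")   # B was heard forwarding p1
wd.expire(now=3.0)          # p2 was never overheard within the timeout
print(wd.failures)
```

Tailoring the mechanism to a different scope would mainly change what counts as a match in `packet_overheard`, for example checking an aggregation result instead of a forwarded packet.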
For example, the aggregation scope requires the WDM to monitor routing, forwarding, sensing, and aggregation activities, where each activity may use a different reputation information type, while the localization scope requires the WDM to focus only on the provided location information. Thus, applying a reputation system intended for the aggregation scope directly to the localization scope is impractical; the system has to be modified. However, there might be cases where a trust model that has been developed for a specific scope can also be applied to another scope with only minor changes, especially in scenarios where the input parameters for the trust model come from the same domain.

Information Modeling Phase

The main task of this phase is to calculate the reputation values for a node from the available information (direct and indirect observations), which is provided by the previous phase, the information gathering and sharing phase. This phase has two components: the information modeling structure, and the information modeling approach. These components are discussed as follows:

Information Modeling Structure: Reputation systems can be designed to calculate reputation values via a centralized entity, distributed entities, or a hybrid approach. In the centralized structure, observations about a node's performance are propagated to a central authority that collects these observations, derives reputation values for each node, and subsequently updates nodes with the new reputation values. This structure relies on the assumptions that nodes completely trust the central authority, and that the authority is correct and always available. If the centralized structure is not carefully designed, it can become a single point of failure for the whole reputation-based trust system. Centralized systems also suffer from a lack of scalability, especially if the information is obtained from high-latency sources. In the domain of WSNs, most recent applications were designed with a robust central authority, the base station, in place. However, propagating observations across the network to this central point is impractical because of the scalability issue and the large energy consumption involved. Minimizing energy consumption is important in environments where end nodes are powered by two AA batteries, such as MICA2 sensor nodes [30]. One way to minimize energy consumption is to adopt the distributed structure for information modeling. In the distributed structure, each node propagates its observations to neighboring nodes, and these nodes then calculate the reputation values individually. In other words, each node is responsible for collecting direct and indirect observations, and for calculating reputation values of other nodes in the neighborhood. Although the distributed structure is inherently more complex, it scales well, avoids single points of failure, and balances load across multiple nodes.
Finally, reputation values in the hybrid structure are calculated by more than one entity. For example, Shaikh et al.'s scheme [113, 114] follows the distributed approach for calculating reputation values for nodes within a cluster, but follows a centralized approach when the base station calculates reputation values for cluster-heads.

Information Modeling Approach: The information modeling approach can be either deterministic or probabilistic. In the former, the output is uniquely determined by the input, with no randomness involved, whereas in the latter the output can be predicted only within certain error bounds, because some source of randomness is added to the input. The Bayesian model [64, 104, 121], for example, uses a probabilistic approach, namely Bayes' formula, to model the reputation information [24, page 256]. On the other hand, the majority vote used in Srinivasan et al.'s system [118] is an example of the deterministic information modeling approach. In this voting approach, a sensor node calculates the reputation value of a specific beacon node, which is equipped with a GPS unit and provides location information, by summing the positive and negative votes reported by neighboring beacon nodes.

Decision Making Phase

The main task of this phase is to decide, based on the available reputation information resulting from the information modeling phase, whether or not the trustworthiness of a specific node

is enough for a certain interaction or task. In this phase, the decision metric component is discussed as follows:

Decision Metric: The decision metric can be binary, discrete, or continuous. In the binary decision metric, the decisions (cooperate and do not cooperate) are represented by the two symbols 1 and 0, respectively. This is usually based on a threshold policy, which is common in most reputation-based trust systems for WSNs: if the reputation value of a sensor node is above a predefined threshold, then cooperation with this node is preferable. If a trust model provides more information about the trustworthiness of an entity, e.g. the trustworthiness is drawn from a set of discrete values such as distrusted, uncertain, trusted, and very trusted, then the final decision of whether or not to interact with an entity can be made in a more sophisticated way. For example, if the trust value can be interpreted as the probability of a successful interaction, and if it is possible to assign utilities and costs to successful and unsuccessful interactions, respectively, then one might apply utility-based decision making to decide whether it is rational to interact or not [9, 83].

Dissemination Phase

The main task of this phase is to ensure that reputation values resulting from the previous phase, the decision making phase, are available at each legitimate neighboring sensor node. This phase has two components: the dissemination structure and the dissemination approach. These components are discussed as follows:

Dissemination Structure: Calculated reputation values are distributed within trust systems according to the dissemination structure, which can be either distributed or centralized. In the former, each sensor node calculates reputation values of other nodes in the neighborhood, stores them locally, and then shares them with its neighbors.
This type of structure helps sensor nodes stay up to date about other nodes by quickly filling their reputation tables. However, the reported reputation information is redundant, which drains the limited energy source of sensor nodes. The distributed structure also opens the door for an adversary to affect reputation values by launching BS, BM, or OO attacks. In the latter, the centralized structure, calculated reputation values are stored and distributed by a single entity, which can be a cluster-head or a base station. To manage the dissemination activities, this single entity must have greater resources: enough memory to store reputation information for other nodes, and enough energy and processing capability to remain available. It is worth mentioning that there is an overlap between the information modeling structure component and the dissemination structure component, as will be discussed later.

Dissemination Approach: The dissemination approach can be either proactive or reactive. In the former, reputation values are broadcast periodically, even if they have not changed since the last update. In the latter, reputation values are broadcast only

when there are sufficient changes to these reputation values, such as the occurrence of a specific event, or when a request for a reputation value is received. Periodic dissemination, on the one hand, is suitable for resource-constrained devices in busy networks, because reputation values are updated regularly for more than one activity. This helps reduce the number of transmissions required to update reputation values. On the other hand, the reactive dissemination approach, where reputation information is disseminated only on request, is suitable in networks with light traffic. This helps minimize the number of transmissions in cases where there are not sufficient changes in reputation values. It also covers designs where reputation values are piggy-backed on reply messages, such as in CORE [81].

3.2 Security Attacks against Reputation-based Trust Systems

This thesis integrates reputation system capabilities with in-network aggregation functionalities for WSNs. This integration helps strengthen the performance and security levels of WSNs by providing continuous monitoring, evaluating the quality of different activities, and warning neighboring nodes about malicious behaviors. Although the use of trust and reputation concepts does not prevent an adversary from taking over legitimate nodes or adding malicious nodes, these concepts help detect malicious behaviors and then exclude the responsible nodes from the network. As we propose to increase the robustness of WSNs through reputation systems, two types of attack may threaten the proposal's robustness: (i) WSNs-related attacks (WSNs attacks), and (ii) reputation-related attacks (reputation attacks). WSNs attacks, and examples of how they can affect reputation functions, were discussed earlier. The reputation system itself is threatened by several types of attacks [60, 63].
Understanding these attacks is crucial in order to ensure that the integration between reputation systems and WSNs does not open the door to more threats. Attacks that are only applicable to reputation systems are discussed in this section as follows:

Bad Mouthing Attack (BM)

This attack involves providing unfair negative ratings for trustworthy nodes. It is also known as the False Accusation attack. Once an adversary has compromised a sensor node, it can affect the reputation system by issuing falsely negative feedback as the compromised node's observation of well-behaved neighboring nodes. When these incorrect direct observations are propagated to other neighboring nodes, those nodes will take them into account in the reputation calculation phase if no proper verification is in place (see Section 3.1). This results in incorrect reputation values for the well-behaved victim nodes. In other words, the BM attack happens when the adversary has the ability to assign negative feedback to trustworthy nodes in order to reduce the trust placed in those nodes. This attack is possible in scenarios where indirect observations are taken into consideration and parties

are allowed to share their negative feedback with nodes in the neighborhood. Figure 3.2 depicts a simplified scenario where the BM attack can take place. Figure 3.2-A shows a sketch of the normal reputation update, where nodes A and D have the same reputation value $R_C$ for node C. Note that a reputation table does not usually contain any reputation information for the node that maintains it. For example, the reputation table maintained by node A in Figure 3.2 does not have reputation information for node A itself. In Figure 3.2-B, the adversary has succeeded in compromising node B. Later on, it assigned a negative reputation value $R'_C$ to the well-behaved node C in order to mislead node A in its calculation of the reputation value of node C. As a result, nodes A and D hold different reputation values for node C, $R'_C$ and $R_C$, respectively.

Figure 3.2: Bad Mouthing Attack (A: normal reputation update; B: altered reputation update)

Ballot Stuffing Attack (BS)

The ballot stuffing attack is similar to the BM attack, but the adversary tries to achieve the opposite effect by providing unfair positive ratings (false praise). In this attack, the trustworthiness of well-behaved nodes is not affected as it is in the BM attack; instead, the trustworthiness of badly behaved nodes is inflated by assigning falsely positive feedback to malicious nodes. This attack is feasible in scenarios where indirect observations are taken into consideration and parties are allowed to share their positive feedback with their neighboring nodes. Figure 3.3 depicts a simplified scenario where the BS attack can take place. Nodes B and C, in Figure 3.3-A, are compromised, and their reputation values (or perhaps one of their reputation values) are low due to their previous malicious behaviors.
These compromised nodes colluded with each other and assigned higher reputation values to each other as in Figure 3.3-B,

which will affect the reputation calculation for nodes B and C at nodes A and D. Generally speaking, the adversary can substitute low reputation values with high reputation values for any neighboring node in order to affect the overall performance of the system.

Figure 3.3: Ballot Stuffing Attack (A: before launching the BS attack; B: after launching the BS attack)

On-Off Attack (OO)

In this type of attack, an adversary aims to disrupt the system's overall performance in the hope that it will not be detected or excluded from the network. The adversary alternates between abnormal and normal behavior in order to extend the time required to recognize its misbehaviors. This attack can be launched against either the reputation activities or the general activities in WSNs. For example, the alternation between abnormal and normal behavior can occur in the context of reputation activities, such as forwarding and calculating reputation information, or in the context of normal sensor network activities, such as aggregation, routing, and sensing physical phenomena. A simple scenario where an adversary is able to perform some OO attack activities is shown in Figure 3.4. Figure 3.4-A shows a subset of genuine sensor nodes where a sensor node B broadcasts its reputation table or its experience to neighboring nodes. Let us assume that node B has been compromised at $t_2$, where $t_2 > t_1$. Later on, node B behaves maliciously, but only intermittently, when it deals with nodes C and D, by claiming that the reputation value for node A is $R'_A$ instead of $R_A$. However, it behaves normally when it deals with node A and disseminates the real reputation values for nodes C and D (see Figure 3.4-B).
Another form of the OO attack happens when a sensor node misbehaves once every l well-behaved transactions, which makes nodes A, C and D uncertain about the behavior of node B. In other words, they are not sure whether the misbehavior of node B was intended or whether it was due to some other factors such as the wireless medium.
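The second form of the OO attack can be illustrated with a small simulation. The sketch below is not taken from any of the surveyed systems: it assumes a beta-style reputation estimate, (good + 1) / (good + bad + 2), and a cooperation threshold of 0.5, and the function names are hypothetical.

```python
def on_off_pattern(l, rounds):
    """Behaviour list where the node misbehaves once after every l good transactions."""
    return [(i % (l + 1)) != l for i in range(rounds)]


def reputation_trace(behaviours):
    """Reputation after each transaction, using an assumed beta-style estimate."""
    good = bad = 0
    trace = []
    for well_behaved in behaviours:
        if well_behaved:
            good += 1
        else:
            bad += 1
        trace.append((good + 1) / (good + bad + 2))
    return trace


THRESHOLD = 0.5  # assumed trust threshold

pattern = on_off_pattern(l=4, rounds=50)     # 10 misbehaviours in 50 transactions
trace = reputation_trace(pattern)
always_bad = reputation_trace([False] * 50)

print(min(trace) > THRESHOLD)       # the on-off node never drops below the threshold
print(always_bad[-1] < THRESHOLD)   # a consistently malicious node is detected
```

Under these assumptions the on-off node misbehaves in one transaction out of five yet is never classified as untrustworthy, which is why observers remain uncertain about its intent.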

Figure 3.4: On-Off Attack (A: normal reputation update at $t_1$; B: altered reputation update at $t_2$)

Figure 3.5: Newcomer Attack (A: reputation update at $t_1$; B: reputation update at $t_2$)

Newcomer Attack (NE)

As soon as the adversary's reputation value drops below a predetermined threshold value, which moves a node from a trusted mode into a distrusted mode, the adversary will consider other ways to increase its reputation value. One way to do so is to rejoin the network with a new ID and wipe out all its bad history. This attack is referred to as the newcomer attack¹. If the adversary has the ability to launch this attack, then being detected is not a concern from the adversary's perspective, because its entire history can be wiped out at any stage. A sketch of a simplified scenario for an NE attack is shown in Figure 3.5. The reputation value of node C in Figure 3.5-A fell below the predefined threshold value as a result of its previous misbehaviors. Therefore, the adversary decided to rejoin the network with another identity C' and a neutral reputation value, as in Figure 3.5-B.

3.3 The State of the Art of Reputation-based Trust Systems in WSNs

In this section, only five reputation-based trust systems are discussed. These systems are selected as representatives of the five scopes that have attracted system designers: generic, routing, access, localization, and aggregation. The five representatives are discussed as follows:

Boukerche & Ren's Scheme

The trust computation and management system (TOMS), proposed by Boukerche and Ren, involves developing a trust model, assigning credentials to nodes, updating private keys, managing the trust values of each node, and making appropriate decisions about nodes' access rights [12, 103]. Boukerche and Ren claimed that TOMS helps to control the access of mobile nodes to the resources of other nodes by applying effective access management mechanisms. TOMS introduces the concept of a community, which is defined as a central node and all of its one-hop neighboring nodes.
Let us consider the community model example given in Figure 3.6. In the community of the central node C, C has six neighbors, but neighbors B and E are malicious and thus they are excluded from C's community. The community of the central node C consists of nodes A, B, E and F, as well as the central node C. TOMS is composed of two phases: trust model and trust management. These two phases cover the information modeling, decision making, and dissemination phases according to the discussion in Section 3.1. The trust metric in TOMS is defined as a function that depends on the time a node has stayed in the community and on the past trust that this node has gained, as follows [15]:

N = rt,

¹ It is sometimes referred to as the identity attack or whitewashing attack [41].

where rt represents the recent trust. The recent trust rt of the node $n_i$ reflects the past behavior of $n_i$. This will yield a value very close to 1 for nodes with a moderate trust (rt = 0.5), a value below 1 for nodes that have lower trust (rt < 0.5), and a value above 1 for nodes that have a higher trust (rt > 0.5). Then, the authors defined the time factor (W) as follows:

W = K time + ra,

where K is a discount factor between 0 and 1 and ra is the node's recent activities, which can include a successful forwarding or a deliberate exaggeration. Finally, the trust metric is evaluated as follows:

$$T = \gamma \, \frac{1 - N^{(1+W)}}{1 - N}$$

where $\gamma$ is a scaling factor to keep the trust T at a value between 0 and 1.

Figure 3.6: A community as suggested in TOMS [12]

TOMS has a trust assistant policy (TAP) which helps the central node to better evaluate the trust of its neighboring nodes. When the central node wants to evaluate the trust of a neighboring node x, it queries its trust assistants about x. These trust assistants then provide the central node with the trust of x in their individual communities. Subsequently, the node's final trust can be calculated by the central node as follows:

$$T^{final}_{x} = \frac{T_{(C,x)} + \left[ T_{(A_1,x)} + \cdots + T_{(A_i,x)} + \cdots + T_{(A_n,x)} \right]}{n + 1} \tag{3.1}$$

where $T_{(C,x)}$ is the trust value of the central node C for a certain node x, $T_{(A_i,x)}$ is the trust value of trust assistant i for the same node x, and n is the number of trust assistants in the community. According to Equation 3.1, TOMS uses a centralized structure for trust calculation because the trust values of nodes in a community are calculated by the community's central node only. TOMS uses both direct and indirect information sources in order to calculate the trust value of a specific node, say node x.
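As a rough illustration of Equation 3.1, the sketch below reads it as the arithmetic mean of the central node's own trust in x and the trust values reported by its n trust assistants. That reading, the function name `final_trust`, and the numeric values are assumptions made for the example only.

```python
def final_trust(central_trust, assistant_trusts):
    """Mean of the central node's trust in x and the n assistants' trust in x."""
    n = len(assistant_trusts)
    return (central_trust + sum(assistant_trusts)) / (n + 1)


# Hypothetical trust values in [0, 1] reported for the same node x.
honest = final_trust(0.8, [0.6, 0.7, 0.9])   # about 0.75
skewed = final_trust(0.8, [0.6, 0.7, 0.0])   # one assistant reports unfairly low trust
```

With these values a single unfairly negative report lowers the final trust from about 0.75 to about 0.525, which shows how strongly an unverified assistant report can weigh in such an average.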
The direct information represents the experience of a central node with node x, while the indirect information represents the experience of trust assistants with the same node, node x. The trust assistants are allowed to provide the central node with either positive or negative feedback when they answer the trust request sent by the

central node. To distribute the trust values, TOMS employs both reactive and proactive dissemination approaches. In the former, any node in the community needs to send a trust query to the central node in order to obtain trust information for another node in the community. In the latter, the central node piggybacks the trust values of nodes in its community on the periodic HELLO message, which means that TOMS employs a centralized dissemination structure. The central node uses a binary decision metric whenever it evaluates other nodes in its community. The decision is based on a threshold policy in which the central node independently sets a trust threshold for its community, and neighboring nodes that cannot meet the trust requirement are taken out of its community. The central node keeps a blacklist to record all malicious nodes that have been excluded from its community due to their malicious behaviors in recent periods.

Shaikh et al.'s Scheme

Shaikh et al. proposed a Group-based Trust Management Scheme, GTMS, for clustered WSNs [114]. GTMS has a hybrid trust management architecture in which the trust value is calculated at three levels: at each sensor node, at each cluster head (or group leader), and at the base station. At the first level, a trust value is calculated by using direct and indirect observations. Direct observations represent the number of successful and unsuccessful interactions, while indirect observations represent peer recommendations about a specific node. Whenever a sensor node x wants to communicate with another node y, it checks whether it has any past experience of communication with y during a specific time interval. If so, node x calculates the reputation value based on its past interaction experience; otherwise, node x seeks recommendations from neighboring nodes.
The time-based past interaction reputation value $R_{x,y}$ of node y at node x, which lies between 0 and 100, is defined as:

$$R_{x,y} = \left[ \frac{100 \, (S_{x,y})^2}{(S_{x,y} + U_{x,y})(S_{x,y} + 1)} \right], \tag{3.2}$$

where $[\cdot]$ is the nearest integer function, $S_{x,y}$ is the total number of successful interactions of node x with y, and $U_{x,y}$ is the total number of unsuccessful interactions of node x with y. Whenever a node requires peer recommendations, it sends a request to all cluster members except for the distrusted ones. Suppose that k nodes in a cluster are trusted or uncertain. Then, node x calculates the trust value of node y as follows:

$$R_{x,y} = \left[ \frac{\sum_{i \in D_x \cup C_x} R_{x,i} \, R_{i,y}}{100 \, k} \right]; \quad k = |D_x \cup C_x| \le n - 2 \tag{3.3}$$

where $D_x$ and $C_x$ represent the sets of trusted and uncertain nodes, respectively, $R_{x,i}$ is the reputation value of the recommender, $R_{i,y}$ is the reputation value of node y sent by node i, and $n$

is the total number of sensor nodes. After calculating the reputation value, a node quantizes trust into three states as follows:

$$T(R_{x,y}) = \begin{cases} \text{trusted}, & 100 - f \le R_{x,y} \le 100 \\ \text{uncertain}, & 50 - g \le R_{x,y} < 100 - f \\ \text{distrusted}, & 0 \le R_{x,y} < 50 - g \end{cases}$$

where $f$ represents half of the average reputation value of all trusted nodes, and $g$ represents one third of the average reputation value of all distrusted nodes. Both $f$ and $g$ are calculated as follows:

$$f_{j+1} = \begin{cases} \left[ \dfrac{1}{2} \left( \dfrac{\sum_{i \in D_x} R_{x,i}}{|D_x|} \right) \right], & 0 < |D_x| \le n - 1 \\ f_j, & |D_x| = 0 \end{cases}$$

$$g_{j+1} = \begin{cases} \left[ \dfrac{1}{3} \left( \dfrac{\sum_{i \in M_x} R_{x,i}}{|M_x|} \right) \right], & 0 < |M_x| \le n - 1 \\ g_j, & |M_x| = 0 \end{cases}$$

where $M_x$ represents the set of distrusted nodes.

At the second level, each cluster head (CH) periodically asks the nodes for their trust states of other members in the cluster. In response, all members forward the trust states of other member nodes to the CH. The CH also maintains a record of past interactions with other clusters in the same manner as individual nodes keep records of other nodes (see Equation 3.2). Reputation values of a group are calculated on the basis of either past interactions or information passed on by the BS. Suppose $ch_i$ wants to calculate $R_{ch_i,j}$ of another cluster j. It can do so either by using the time-based past interaction evaluation, if it has enough experience with cluster j, or by getting a recommendation from the BS.

At the third level, the BS also maintains a record of past interactions with CHs in the same manner as individual nodes do in Equation 3.2. Suppose there are $|G|$ groups in the network. The BS periodically multicasts request packets to the CHs. On request, the CHs forward their reputation vectors, which contain their recommendations of other clusters based upon past interactions, to the BS as follows:

$$R_{ch} = (R_{ch,1}, R_{ch,2}, \ldots, R_{ch,|G|-1})$$

On reception of the reputation vectors from all CHs, the BS calculates the reputation value of each cluster as shown below:

$$R_{BS,ch_1} = \left[ \frac{\sum_{i=1}^{|G|-1} R_{BS,ch_i} \, R_{ch_i,ch_1}}{|G| - 1} \right], \; \ldots, \; R_{BS,ch_{|G|}} = \left[ \frac{\sum_{i=1}^{|G|-1} R_{BS,ch_i} \, R_{ch_i,ch_{|G|}}}{|G| - 1} \right]$$

where $R_{BS,ch_i}$ is the reputation value of $ch_i$ at the BS, $R_{ch_i,ch_1}$ is the reputation value of cluster head $ch_1$ at cluster head $ch_i$, and $|G|$ represents the total number of groups in the network.

Michiardi & Molva's Scheme

Michiardi and Molva's system, CORE, enforces node cooperation in mobile Ad Hoc networks to prevent selfish behavior in routing activities [81]. Each node uses the WDM in order to monitor the behavior of its neighboring nodes. According to the system specifications, three reputation information sources are available: subjective, indirect, and functional reputation. The subjective reputation can be directly observed by using the WDM, and corresponds to direct observation in our discussion in Section 3.1. The designers give more weight to previous observations in order to reduce the influence of any misbehavior in recent observations. The general formula to calculate a subjective reputation is:

$$r_{s_i}(s_j \mid f) = \sum_{k} \rho(t, t_k) \, \sigma_k, \tag{3.4}$$

where $r_{s_i}$ stands for the subjective reputation value calculated at time t by subject $s_i$ on subject $s_j$ with respect to the function f, $\rho(t, t_k)$ is a time-dependent function that gives higher relevance to past values of $\sigma_k$, and $\sigma_k$ represents the rating factor given to the k-th observation. The indirect reputation is the subjective reputation of one node that has been propagated to and received by other nodes, and corresponds to indirect observation in our discussion in Section 3.1. CORE uses a reactive dissemination approach to propagate only positive subjective reputation information. The functional reputation is the combination of indirect and subjective reputation with respect to a specific function. In other words, the functional reputation is the global reputation value associated with every node.
It is possible to assign more weight to a specific function using the following formula:

$$R_{s_i}(s_j) = \sum_{k=1}^{n} W_k \left\{ R_{s_i}(s_j \mid f_k) + IR_{s_i}(s_j \mid f_k) \right\}, \tag{3.5}$$

where $R_{s_i}(s_j)$ represents the global reputation value, $W_k$ represents the weight associated with a specific function $f_k$, $R_{s_i}(s_j \mid f_k)$ represents the subjective reputation value calculated by $s_i$ on $s_j$ as in Equation 3.4, and $IR_{s_i}(s_j \mid f_k)$ represents the indirect reputation of $s_j$ collected by $s_i$ for the function $f_k$. Equation 3.5 is computed at each node, which ensures a distributed reputation calculation structure. Unfortunately, giving greater weight to past observations enables a malicious node to misbehave temporarily if it has accumulated a high reputation value. Moreover, combining the reputation values for various functions into a single global value is another problem, since this helps a malicious node to hide its misbehavior with respect to certain functions by behaving cooperatively with respect to the remaining functions.
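The masking problem can be made concrete with a minimal sketch of the weighted combination in Equation 3.5. The function name `global_reputation`, the three monitored functions, the weights, and the reputation values are all hypothetical, and indirect reputation is set to zero to keep the example small.

```python
def global_reputation(weights, subjective, indirect):
    """Weighted sum over functions f_k, following the form of Equation 3.5."""
    return sum(w * (r + ir) for w, r, ir in zip(weights, subjective, indirect))


# Hypothetical functions: packet forwarding, route discovery, aggregation.
weights = [0.5, 0.25, 0.25]
subjective = [1.0, 1.0, -1.0]   # the node misbehaves only in the third function
indirect = [0.0, 0.0, 0.0]      # no indirect reports, for simplicity

score = global_reputation(weights, subjective, indirect)
print(score)  # 0.5
```

Although the node consistently misbehaves in one function, its combined value stays clearly positive, so a decision based only on the global value would not single it out.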

Srinivasan et al.'s Scheme

The distributed reputation-based beacon trust system for WSNs, DRBTS, excludes malicious beacon nodes that provide false location information [118]. Beacon nodes are special sensor nodes that know their location through a GPS receiver, manual configuration, etc. DRBTS helps sensor nodes to validate whether given location information is correct or not. It differs from the previous reputation-based trust systems in that it calculates trust values for beacon nodes, not for normal sensor nodes. The network topology consists of three types of devices: sensor nodes, beacon nodes, and a base station. The information gathering phase is carried out at two points: the sensor node and the beacon node. At the beacon node level, each beacon node runs an adapted version of the WDM in order to monitor other beacon nodes within its one-hop neighborhood. When a sensor node broadcasts a query asking about its location, each beacon that is able to hear this broadcast should respond with the sensor's location information. Another beacon node $B_i$, which hears this query and replies, compares its own calculation of the sensor's location information with the overheard calculations. If the difference between an overheard calculation and the location information calculated by $B_i$ is within a certain range, the reputation value of the beacon that produced the overheard location information is increased. Otherwise, it is decreased. This means that DRBTS is built on a distributed dissemination structure where each beacon node shares its reputation calculation with other nodes. The reputation value for each beacon node is updated after obtaining direct and indirect observations. If a beacon node $B_i$ overhears location information transmitted by another beacon node $B_j$, it first compares $B_j$'s location information with its own estimation.
If $B_j$'s location information is acceptable, then $\tau = 1$; otherwise, $\tau = 0$. The reputation value of $B_j$ is calculated by $B_i$ as follows:

$$R^{n}_{i,j} = \mu_1 R^{c}_{i,j} + (1 - \mu_1) \, \tau$$

where $R^{n}_{i,j}$ and $R^{c}_{i,j}$ represent the new and current reputation values calculated by $B_i$ for $B_j$, respectively, and $\mu_1$ denotes a factor that is used to weight previous experience against current information. Also, $B_i$ considers the overheard neighbor reputation table of $B_j$, $NRT_j$, when updating its own table $NRT_i$. Suppose $NRT_j$ has a reputation value for another beacon node $B_k$ which also exists in $NRT_i$. Beacon node $B_i$ performs a deviation test on these two reputation values as follows:

$$\left| R^{c}_{i,k} - R^{c}_{j,k} \right| \le d \tag{3.6}$$

If the result of the deviation test is positive, then the information published by $B_j$ is considered to be compatible with $B_i$'s direct observation. Then, $B_i$ accepts this published information and updates $R_{i,k}$ in its $NRT_i$ as follows:

$$R^{n}_{i,k} = \mu_2 R^{c}_{i,k} + (1 - \mu_2) \, R^{c}_{j,k}$$

However, if the result in Equation 3.6 is negative, then the information published by $B_j$ is

considered to deviate too much from $B_i$'s direct observation, and is disregarded as incompatible information. Moreover, the beacon node $B_j$ has to be punished by reducing its reputation value as follows:

$$R^{n}_{i,j} = \mu_3 R^{c}_{i,j}$$

At the sensor node level, observations about the behavior of beacon nodes are also of interest, but sensor nodes do not have access to a correct estimate of their own location. Consequently, they rely on the location information provided by trusted beacon nodes. The system designers employed a reactive dissemination approach to propagate the reputation values to other nodes, and thus chose to couple the answer to a location query with the dissemination of the beacon's neighbor reputation table (NRT). This table contains positive and negative feedback about neighboring beacon nodes. The sensor node that produced the location request receives NRTs and location information from neighboring beacon nodes. It counts the number of positive and negative votes, and then stores them in the trusted beacon neighbor table (TBN). A positive vote for a beacon node $B_j$ is given when $B_i$ reports a reputation value for $B_j$ that is greater than a trust threshold predefined in the sensor node. DRBTS uses majority votes to decide the final reputation value of the beacon node $B_j$. Generally speaking, DRBTS follows a distributed reputation calculation structure in which reputation values for beacon nodes are calculated at each node.

Özdemir's Scheme

A functional reputation-based data aggregation system for wireless sensor networks, RDAT, was proposed by Özdemir [91, 92]. RDAT considers the trustworthiness of sensor nodes to improve the reliability of aggregated data. It computes, for each sensor node, three functional reputation values, namely aggregation, routing, and sensing.
Functional reputation for aggregation ($R_{agg}$) is used by sensor nodes in order to evaluate the trustworthiness of aggregator nodes. Functional reputations for sensing ($R_{sen}$) and routing ($R_{rou}$) are used by aggregator nodes to enhance the reliability of aggregated data, as will be discussed later. Each sensor node monitors both negative and positive behaviors of its neighbors in order to obtain and record its direct observations in a reputation table. Later on, this reputation table is exchanged among sensor nodes to be used as indirect observations. To reduce the data transmission overhead, RDAT piggy-backs reputation tables on other control and data packets. RDAT uses a beta reputation system to calculate reputation values. When a sensor node x wants to communicate with another sensor node y, x evaluates the trustworthiness of y by using both direct and indirect observations. Let us assume that x wants to evaluate the routing behavior of y. Also, let us assume that x has received indirect observations about y from a set of neighbors N. Then, x computes the trustworthiness of y as follows:

$$T^{rou}_{xy} = \frac{\alpha^{rou}_{xy}}{\alpha^{rou}_{xy} + \beta^{rou}_{xy} + 2} \tag{3.7}$$

where $\alpha^{rou}_{xy}$ and $\beta^{rou}_{xy}$ represent the new amounts of positive and negative observations, respectively. They can be calculated as:

$$\alpha^{rou}_{xy} = v\,\alpha_{xy} + r^{rou}_{xy} + \sum_{k \in N} I^{rou}_{obs}(r_{ky}) \qquad (3.8)$$

$$\beta^{rou}_{xy} = v\,\beta_{xy} + s^{rou}_{xy} + \sum_{k \in N} I^{rou}_{obs}(s_{ky}) \qquad (3.9)$$

where $v < 1$ represents an aging factor that allows reputation to fade with time, and $\alpha_{xy}$ and $\beta_{xy}$ represent the old amounts of positive and negative observations, respectively. $r^{rou}_{xy}$ and $s^{rou}_{xy}$ denote, respectively, the good and bad routing activities since the last reputation calculation. $I^{rou}_{obs}(r_{ky})$ denotes the indirect observation provided to node x by node $k \in N$ about node y for good routing actions, which can be evaluated as follows:

$$I^{rou}_{obs}(r_{ky}) = \frac{2\,\alpha_{xk}\, r_{ky}}{(\beta_{xk} + 2)(r_{ky} + s_{ky} + 2) + (2\,\alpha_{xk})} \qquad (3.10)$$

$I^{rou}_{obs}(s_{ky})$ denotes the indirect observation for bad routing actions, which can be evaluated as follows:

$$I^{rou}_{obs}(s_{ky}) = \frac{2\,\alpha_{xk}\, s_{ky}}{(\beta_{xk} + 2)(r_{ky} + s_{ky} + 2) + (2\,\alpha_{xk})} \qquad (3.11)$$

In RDAT, reliable data aggregation is achieved in two phases. In the first phase, each sensor node x calculates the reputation value for its aggregator node z. If the reputation value is below a predefined threshold, x encrypts its aggregation data using a pairwise key shared with the base station, and then sends it to the base station along with a complaint about the aggregator's low reputation value. Based on the number of complaints about z, the base station removes z from the network. In the second phase, an aggregator z considers the $R_{sen}$ of sensor nodes when it calculates aggregated data. In other words, z weights the data reported by sensor nodes by their functional reputation values $R_{sen}$.

3.4 Comparison of Current Reputation-based Systems in WSNs

This section provides a comparison between existing reputation-based trust systems in WSNs.
This comparison is not straightforward for the following reasons:

• The trustworthiness problem in WSNs has been approached from different angles and with different scopes. For example, some designers considered only routing misbehaviors, as in [14, 26, 81]. Each scope, such as routing or data aggregation, has its own challenges that need to be considered carefully, especially during the information gathering phase.

• Reputation components, discussed in Section 3.1, were not covered in most current

reputation-based trust systems.

• Security attacks related to WSNs or reputation systems, discussed in Sections and 3.2, were not considered in most reputation-based trust systems.

Thus, existing reputation-based trust systems are compared in a number of different ways: the scope they consider, the reputation components they are composed of, and the security attacks they are secure against.

3.4.1 Classification Model

Current reputation-based trust systems in WSNs are designed to enhance the trustworthiness among sensor nodes. After investigating these systems, it was found that they fall under one of five categories: (i) generic, (ii) localization, (iii) mobility, (iv) routing, and (v) aggregation. Figure 3.7 classifies current reputation-based trust systems, depending on which activity most attracted the system designers.

[Figure 3.7: Classification of current reputation-based trust systems in WSNs (categories: Aggregation, Routing, Mobility, Localization, Generic)]

Ganeriwal & Srivastava [46, 47], Chen [25], Yao et al. [137, 138], Xiao et al. [134], and Boukerche et al. [13] designed generic reputation-based trust systems, which do not consider a specific activity. They argued that their systems can be tailored to any sort of activity. Boukerche & Ren introduced the concept of community and then proposed a reputation-based system that controls the nodes' access to the community [12, 103]. This was also addressed by Srinivasan et al. [117]. Furthermore, Srinivasan et al. designed a reputation-based system that enforces cooperation between beacon nodes by motivating them to provide correct location information [118]. Moreover, Michiardi & Molva [81], Buchegger & Boudec [14], and Chen et al. [26] considered only routing misbehaviors when a node evaluates another one.
Finally, Özdemir [91,92] integrated aggregation

functionalities with advantages provided by a reputation component in order to enhance the accuracy of aggregated values.

3.4.2 Reputation Components

According to the discussion in Section 3.1, reputation-based trust systems often share a similar structural pattern. They consist of four main phases: information gathering and sharing, information modeling (or reputation calculation), decision making, and dissemination (see Figure 3.1). This section investigates the existence of these phases (and the internal components of each phase) in existing reputation-based trust systems. Table 3.1 summarizes the discussion in Sections 3.1 and 3.3: it analyzes reputation-based trust systems designed for WSNs and depicts the information related to each phase (and its components) covered by each system. We believe this helps in understanding the differences between reputation-based trust systems in the current literature. Surprisingly, the decision making phase was not considered in Michiardi & Molva's [81], Srinivasan et al.'s [117], Boukerche et al.'s [13], Chen's [25], and Srinivasan et al.'s [118] schemes. The dissemination phase is also not considered in Chen's scheme [25]. Note that Chen's scheme discusses neither the decision making phase nor the dissemination phase. Importantly, Table 3.1 shows that Özdemir's scheme [91, 92] is the only aggregation-specific candidate in the current literature.

3.4.3 Attack Vulnerability

This section investigates whether or not existing reputation-based trust systems are vulnerable to the security attacks discussed in Sections and 3.2.
Damage caused by these attacks varies from no damage in one system to maximum damage in another, depending on the security assumptions used and whether or not these attacks were considered at design time. Importantly, attacks are less feasible in Boukerche et al.'s system [13], because of the assumption of a secure deployment of mobile agents. Boukerche et al. assumed that these agents are generated and launched by a trusted authority and are not subject to node compromise attacks, which is an unrealistic assumption. We agree with Shaikh et al. [113] that Boukerche et al.'s system [13] is not well suited for realistic WSNs. It is believed that more attacks will threaten their system if the assumption about mobile agents is relaxed.

Selective Forwarding Attack

The Selective Forwarding (SF) attack occurs when an adversary controlling a compromised node selectively forwards received messages. Unfortunately, all systems in Table 3.2 are vulnerable to the SF attack, because launching node compromise attacks against current sensor nodes such as MICA2 is trivial. The damage caused to reputation-based trust systems by SF attacks varies from partial to maximum, as shown in Table 3.2. SF attacks cause partial damage in systems [12, 14, 25, 26, 47, 81, 91, 92, 113, 117, 137, 138] although these systems monitor the forwarding activity. This is because most of them use a binary decision method when they evaluate the trust level of a specific node. This method is based on a threshold policy: once the node's reputation is above the threshold value, the node is considered trusted. Therefore, an adversary can launch SF attacks as long as its reputation value stays above the predefined threshold, which keeps its trust state as trusted. The damage is considered partial because adjusting the threshold value or applying mechanisms such as an aging factor and weighting can help to defeat this attack. Unfortunately, Shaikh et al.'s [114], Srinivasan et al.'s [118], and Xiao et al.'s [134] systems did not consider forwarding misbehavior, and therefore the damage caused by the SF attack is maximum. Finally, the damage caused by SF attacks on Srinivasan et al.'s [117], Boukerche et al.'s [13], and Michiardi & Molva's [81] systems cannot be predicted due to a lack of information. There is no information about the decision making metric used, or about whether or not forwarding activities are monitored.

Table 3.1: Reputation components in current reputation-based trust systems
(P1 = Gathering & Sharing, P2 = Calculation, P3 = Decision, P4 = Dissemination)

| Scheme | P1 Source | P1 WDM | P1 Type | P1 Scope | P2 Structure | P2 Approach | P3 Metric | P4 Structure | P4 Approach |
|---|---|---|---|---|---|---|---|---|---|
| Michiardi & Molva [81] | D/I | Y | + | R | Di | ? | ? | Di | Re |
| Buchegger & Boudec [14] | D/I | Y | - | R | Di | ? | B | Di | Re |
| Ganeriwal & Srivastava [46, 47] | D/I | Y | + | G | Di | Pr | B | Di | P |
| Srinivasan et al. [117] | D | Y | +,- | M | C | Pr | ? | C | Re |
| Boukerche et al. [13] | D | N | +,- | G | P2P | ? | ? | P2P | Re |
| Yao et al. [137, 138] | D/I | Y | +,- | G | Di | De | Disc | Di | Re |
| Shaikh et al. [113, 114] | D/I | ? | +,- | G | H | De | Disc | H | P, Re |
| Özdemir [91, 92] | D/I | Y | +,- | A | Di | Pr | B | Di | P |
| Boukerche & Ren [12, 103] | D/I | ? | +,- | M | C | De | B | C | P, Re |
| Chen et al. [26] | D | Y | +,- | R | Di | Pr | B | Di | P |
| Chen [25] | D | Y | ? | G | Di | Pr | ? | ? | ? |
| Xiao et al. [134] | D/I | ? | +,- | G | Di | Pr | B | Di | ? |
| Srinivasan et al. [118] | D/I | Y | +,- | L | Di | De | ? | Di | Re |

Legend: D = Automatic Direct; I = Automatic Indirect; WDM = Watchdog Mechanism; Y = Yes; N = No; + = Positive Feedback; - = Negative Feedback; G = Generic Misbehavior; R = Routing Misbehavior; L = Localization Misbehavior; A = Aggregation Misbehavior; M = Mobility; C = Centralized; Di = Distributed; H = Hybrid; P2P = Peer to Peer; Pr = Probabilistic; De = Deterministic; B = Binary; Disc = Discrete; Re = Reactive; P = Proactive; ? = Not available.

Table 3.2: Attack vulnerabilities in current reputation-based trust systems
Columns: WSN attacks (SF, SY, SD, RE) and reputation attacks (BM, BS, OO, NE).
Rows: Michiardi & Molva [81]; Buchegger & Boudec [14]; Ganeriwal & Srivastava [46, 47]; Srinivasan et al. [117]; Boukerche et al. [13]; Yao et al. [137, 138]; Shaikh et al. [114]; Shaikh et al. [113]; Özdemir [91]; Özdemir [92]; Boukerche & Ren [12, 103]; Chen et al. [25, 26]; Xiao et al. [134]; Srinivasan et al. [118].
Legend: SF = Selective Forwarding; SY = Sybil; SD = Spoofed Data; RE = Replay; BM = Bad Mouthing; BS = Ballot Stuffing; OO = On-Off; NE = Newcomer; - = Not available; the remaining cell symbols denote Robust, Partial damage, or Maximum damage.
[The per-cell symbols of Table 3.2 are not recoverable from this transcription.]

Sybil and Newcomer Attack

Table 3.2 shows that there is a link between the adversary's capability of launching Sybil (SY) and Newcomer (NE) attacks. According to the discussion in Section and 3.2, an adversary can launch the SY attack by presenting more than one identity.
This means that the adversary is

able to launch the NE attack once it has succeeded in presenting another identity beside its original one. Interestingly, reputation-based trust systems such as [14, 25, 26, 114, 117, 134, 137, 138] are vulnerable to SY and NE attacks. This is due to the lack of authentication between sensor nodes in these systems.

Replay and Spoofed Data Attack

The Replay (RE) attack occurs when an adversary has the ability to replay old messages into the network. Surprisingly, this attack is possible in reputation-based trust systems such as [12, 14, 25, 26, 47, 114, 117, 118, 134]. This can harm these systems, especially if the adversary is able to replay old and invalid reputation information. Other systems [13, 92, 113, 137, 138] are considered robust against RE attacks, because they use mechanisms such as nonces and timestamps. It is argued that systems vulnerable to the RE attack are also vulnerable to the Spoofed Data (SD) attack, because the adversary can first capture some reputation information in an understandable format, and then replay it into the network after changing it in order to affect the performance of the reputation component, which is one form of the SD attack.

Bad Mouthing and Ballot Stuffing Attack

Bad Mouthing (BM) and Ballot Stuffing (BS) attacks are possible in systems that use indirect observations in the reputation calculation phase. Consequently, the systems in [13, 25, 26, 117] are robust against BM and BS attacks, because sharing direct observations with neighboring nodes is prohibited (see Table 3.1). The BM attack is feasible in reputation-based trust systems that allow sensor nodes to exchange their negative feedback, such as in [12, 14, 91, 92, 113, 114, 118, 134, 137, 138]. On the other hand, the BS attack is feasible in systems that allow sensor nodes to propagate their positive feedback, such as in [12, 47, 81, 91, 92, 113, 114, 118, 134, 137, 138].
The damage caused by BM and BS is partial in [113], because indirect observations are considered in the reputation calculation only if past communication experience does not exist or is not enough to determine the trustworthiness of a specific node.

On-Off Attack

The On-Off (OO) attack occurs when an adversary launches a security attack (or a mixture of the attacks discussed in Section 3.2) on an irregular basis, in order to keep its reputation value within an acceptable trust range. Importantly, Table 3.2 shows that all reputation-based trust systems are vulnerable to this attack. The damage caused by this attack varies, depending on how many other attacks the system is vulnerable to.

3.5 Summary

This chapter discussed an alternative way to mitigate node compromise attacks, namely reputation-based security solutions. The main goal of these solutions is to enhance the trustworthiness among sensor nodes by monitoring network activities and detecting events related to node compromise attacks. The chapter provided a detailed review of existing reputation-based trust systems in wireless sensor networks. It then proposed a framework to analyze current reputation-based trust systems and understand their strengths and limitations. The chapter also analyzed how the integration between wireless sensor networks and reputation systems can open doors for an adversary to threaten any reputation-based security solution and affect its performance. Then, the chapter surveyed the state of the art in reputation-based trust systems and classified them into five categories: generic, localization, mobility, routing, and aggregation. The difference between these categories is the scope (or the task) that a monitoring mechanism is tailored for. A scope-specific reputation system requires the watchdog mechanism to monitor activities related to the chosen scope. Finally, the chapter compared these reputation-based trust systems in three ways: (i) the scope they consider, (ii) the reputation components they are composed of, and (iii) their resilience against security attacks.

The chapter concluded that a lack of understanding of the main phases of a reputation-based trust system, discussed in Section 3.1, makes new designs subject to different attacks. For example, sharing only positive feedback between sensor nodes allows malicious nodes to collude and falsely praise misbehaving nodes, which amounts to a BS attack. Propagating positive feedback also exhausts the network's limited resources, since the number of nodes that behave correctly is generally expected to be larger than the number that do not. It was also concluded that a scope-specific reputation system requires the watchdog mechanism to be tailored in order to monitor activities related to the chosen scope.
For example, the aggregation scope requires the watchdog mechanism to monitor routing, forwarding, sensing, and aggregation activities, where each activity may use a different reputation information type, whereas the localization scope requires the watchdog mechanism to focus only on the provided location information. Thus, applying a reputation system designed for the localization scope directly to the aggregation scope is impractical; the system has to be modified. Finally, the chapter concluded that the only aggregation-specific reputation-based system, proposed by Özdemir [91, 92], is subject to attacks. According to the discussion in Section 3.4.3, Table 3.2 shows that Özdemir's scheme is vulnerable to SF, SD, RE, BM, BS, and OO attacks. Because of these limitations, a robust reputation-based secure data aggregation scheme is proposed in the following chapter.


Chapter 4

Reputation-based Secure Data Aggregation for Wireless Sensor Networks

Chapter 2 showed that securing network communications in WSNs has traditionally been achieved through cryptographic mechanisms. However, cryptographic mechanisms are insufficient to protect wireless sensor networks (WSNs), as discussed in Chapter 3. For example, sensor nodes are deployed for long periods in hostile environments, which makes it possible for an adversary to physically take over a sensor node and obtain access to cryptographic keys. The wireless security community has therefore developed a suite of mechanisms to complement cryptographic techniques, such as a reputation system, which can be defined as a system that collects, processes, and disseminates feedback about the history of sensors' behavior. Reputation-based approaches help circumvent node compromise attacks. These approaches monitor network activities in order to detect events related to these attacks. They assume that a node capture will provoke some noticeable events, such as inconsistent sensing or aggregation results, a displacement or removal of a node, and malicious routing activities [71].

Chapter 3 concluded that a scope-specific reputation system requires the watchdog mechanism to be tailored in order to monitor activities related to the chosen scope. For example, the aggregation scope requires the watchdog mechanism to monitor routing, forwarding, sensing, and aggregation activities, where each activity may use a different reputation information type, whereas the localization scope requires the watchdog mechanism to focus only on the provided location information. As discussed in Section 3.4.1, current research in reputation-based trust systems intended to work in WSNs falls under one of five scopes: generic, localization, mobility, routing, and aggregation. Note that existing reputation schemes designed for the first four scopes are inappropriate for use in the data aggregation context. For example, studies such as [37] examined how good nodes are at performing routing functionalities. They are not aware of the content of the sensed data. The disadvantage of this is that some sensors may still get good reputation values despite providing invalid readings, because no check is made on the sensed data.

Importantly, only one reputation-based system is built specifically to provide data aggregation security in WSNs, namely Özdemir's scheme [91, 92]. According to the discussion in Section 3.4.3, Özdemir's scheme is vulnerable to attacks such as Spoofed Data (SD), Replay (RE), Bad Mouthing (BM), and Ballot Stuffing (BS). Moreover, Özdemir's scheme [91] is limited to the average (AVE) aggregation function. Assume that sensor nodes are grouped in a cluster as in Figure 4.1. The radio range of a sensor node ($x_1$) limits it to overhearing data transmitted by its neighbors $x_2$, $x_3$, and $x_4$. After performing the AVE aggregation function on the readings received from cluster members, an aggregator A sends an aggregation result $AR$ to a base station. Each cluster member (e.g. node $x_1$) recomputes the AVE aggregation function on the data reported by the neighbors ($x_2$, $x_3$, and $x_4$) within its communication range. $x_1$ then compares its own aggregation result $AR'$ with the overheard $AR$. Özdemir's scheme claims that these two aggregation results should be correlated, since data sensed from the local area is often correlated. We believe that the claimed data reliability does not exist for other aggregation functions, such as summation (SUM), minimum (MIN), or maximum (MAX). For example, an aggregation result calculated by $x_1$ for the SUM aggregation function (instead of AVE) is definitely unequal to the overheard $AR$.

[Figure 4.1: A simplified deployment area for Özdemir's scheme, showing sensor nodes $x_1$ to $x_6$, the aggregator A, and a communication range]
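The limitation described above can be illustrated numerically. The following is a minimal sketch and not part of Özdemir's scheme: the readings, the node names, and the helper `aggregate` are hypothetical, chosen only to mimic a cluster in which $x_1$ overhears $x_2$, $x_3$, and $x_4$ while the aggregator receives readings from all six members.

```python
from statistics import mean

# Hypothetical, spatially correlated readings for six cluster members
# (values invented for illustration only).
readings = {"x1": 20.1, "x2": 20.4, "x3": 19.8, "x4": 20.2, "x5": 20.0, "x6": 19.9}

# x1 can only overhear the neighbors inside its radio range.
overheard_by_x1 = ["x2", "x3", "x4"]


def aggregate(func, node_ids):
    """Apply an aggregation function to the readings of the given nodes."""
    return func([readings[n] for n in node_ids])


# The aggregator works on all cluster members; x1 works on what it overheard.
ave_gap = abs(aggregate(mean, readings) - aggregate(mean, overheard_by_x1))
sum_gap = abs(aggregate(sum, readings) - aggregate(sum, overheard_by_x1))

print(f"AVE gap: {ave_gap:.3f}")  # small: local and cluster-wide averages stay close
print(f"SUM gap: {sum_gap:.3f}")  # large: the partial sum misses three readings
```

Under these assumed readings the two averages differ by well under one unit, whereas the two sums differ by roughly the total of the readings that $x_1$ cannot overhear, so a correlation check designed for AVE gives no useful signal for SUM.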
In this chapter, a robust Reputation-based Secure Data Aggregation (RSDA) scheme for wireless sensor networks is proposed. The security advantages provided by this scheme are realized by integrating aggregation functionalities with a reputation system. RSDA does not trim abnormal (but correct) readings as suggested by Wagner [128]. It is believed that eliminating abnormal readings with no further investigation is impractical, especially in applications designed for heterogeneous environments, such as the monitoring of bush-fires or of temperatures within oil refineries. In these heterogeneous environments, normal and abnormal readings are equally important for the network administrator.

RSDA is similar to Özdemir's schemes [91, 92] in the sense that it minimizes the use of heavy cryptographic mechanisms and integrates aggregation functionalities with a reputation system in order to secure data aggregation. However, the differences between RSDA and Özdemir's scheme are four-fold: (i) RSDA considers the main phases in the analysis framework for reputation systems discussed in Section 3.1, (ii) RSDA considers both WSN-related and reputation-related security attacks, (iii) RSDA is not limited to a single aggregation function, and (iv) RSDA responds dynamically to attack activities: rather than rejecting incorrect aggregation results only at the base station level, it rejects them as early as possible, possibly by nodes in the neighborhood. We believe that these differences ensure that the main components of our definition for robust secure data aggregation discussed in Section 2.1 are satisfied.

The notation used in this chapter is listed in Table 4.1. The target terrain, where RSDA is implemented, is divided into smaller cells of equal size. Each cell has T nodes, only one of which is selected, based on its reputation value, to be the cell representative. Each node has a monitoring mechanism similar to the Watchdog mechanism, which was proposed by Marti et al. [77], in order to compare its result with the results reported by its neighbors. Each node in a cell performs redundant operations to monitor the cell representative's operations. RSDA follows a request-response paradigm where the base station initiates the aggregation process by flooding a query message into the network.
The transformation from this paradigm to a periodic paradigm, however, is straightforward: the representatives periodically report their data without waiting for the base station's query.

The rest of the chapter is organized as follows. Section 4.1 lists the network assumptions that help achieve the desired aims. Section 4.2 provides an overview of the data model that RSDA follows. Section 4.3 discusses the type of adversary that RSDA resists. Section 4.4 lists the security requirements for RSDA. Section 4.5 provides details of the proposed scheme RSDA. Section 4.6 describes the data used in evaluating RSDA and discusses the experimental results. RSDA is tested in three scenarios, depending on the adversary's capability to affect the aggregation results: (i) no attack on the data, (ii) abrupt change, and (iii) 1-per-2 strategy-based On-Off attacks. Section 4.7 extends the results of Chapters 2 and 3 by analyzing the security level of RSDA. Finally, the chapter closes with a conclusion.

4.1 Network Assumptions

It is assumed that sensor nodes lack the tamper-resistant property, have unique IDs, and are preloaded with two network-wide shared keys $K_1$ and $K_2$. These two keys are used to authenticate intra-cell and inter-cell communications, respectively. The keys also help break the connection between intra-cell and inter-cell keys. Thus, the compromise of an intra-cell key does not lead to the compromise of inter-cell keys, as will be discussed in Section 4.5. The sensor nodes are also assumed to have a large deployment area, the dimensions of which are known in advance, and nodes are uniformly distributed over this area. A grid structure is used to divide the target terrain into smaller non-overlapping cells of equal area. The dimension of each cell is small enough to allow the radio range of each sensor to cover its surrounding cells, as shown in Figure 4.2. The physical size of the deployment area can thus be expressed as $A \times B$ cells. As the number of cells grows, the delivery time of a query (or its response) may be affected, since the query needs to travel farther depending on the base station placement. However, the placement of the base station (B) is outside the scope of this thesis.

[Figure 4.2: The radio coverage in RSDA, showing the base station, cell members, cell representatives, and the radio coverage of node x over a grid of $A \times B$ cells]

It is assumed that there is a short period of time during which the network is not vulnerable to any attacks. During this time, each sensor node discovers its neighboring nodes, finds out which cell it belongs to, and computes two kinds of keys: intra-cell and inter-cell. An intra-cell key is a key shared between the members of a cell in order to authenticate group communication. An inter-cell key, on the other hand, is a key shared between the members of two adjacent cells. RSDA is composed of two types of entities: a base station B and normal sensor nodes. B is entrusted with the task of initiating queries to the network, processing the received answers to these queries, and deriving meaningful information that reflects the events in the target field. The normal sensors are grouped into cells.
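As an illustration of how a node might find out which cell it belongs to, the following minimal sketch maps a position to a grid index. It is an assumption-laden example rather than the procedure used by RSDA: the function name `cell_of`, the use of planar coordinates, and the example dimensions are all hypothetical, and the source does not state how nodes learn their positions.

```python
def cell_of(x, y, width, height, cols, rows):
    """Map a position (x, y) in a width x height area to a (col, row) cell index.

    The area is divided into cols x rows non-overlapping cells of equal size.
    """
    if not (0 <= x <= width and 0 <= y <= height):
        raise ValueError("position lies outside the deployment area")
    cell_w, cell_h = width / cols, height / rows
    # Positions on the far edge are assigned to the last cell.
    col = min(int(x // cell_w), cols - 1)
    row = min(int(y // cell_h), rows - 1)
    return col, row


# Hypothetical 100 m x 60 m area divided into 5 x 3 cells of 20 m x 20 m.
print(cell_of(0, 0, 100, 60, 5, 3))        # (0, 0)
print(cell_of(45.0, 25.0, 100, 60, 5, 3))  # (2, 1)
print(cell_of(100, 60, 100, 60, 5, 3))     # (4, 2)
```

Because the area dimensions are assumed to be known in advance, such a mapping needs no communication; every node that knows its own position can compute its cell locally during the attack-free bootstrap period.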
In each cell, the sensor with the highest reputation value is selected to be the cell representative $C_{rep}$, according to an algorithm discussed later. Cells can be either intermediate cells or non-intermediate cells (leaf cells). The intermediate cells receive data from downstream cells and perform sensing, aggregation, and forwarding operations, whereas the non-intermediate cells do not receive data from downstream cells and do not perform any aggregation activity. The data model that RSDA follows

is discussed in the following section.

4.2 Data Model

The physical phenomenon to be reported or detected by a sensor in a WSN is application dependent, and can be classified as a point source or a field source [4, 125]. The former represents applications in which an event is generated from a single point in the field, such as target tracking applications. The latter represents applications where the physical phenomenon is collected from the whole deployment area, such as environment monitoring applications. In both types, the gathered data has some correlation characteristics that can be summarized as follows [4, 125]:

• Temporal Correlation: Applications such as environment monitoring require sensor nodes to periodically sense physical phenomena and report them back to the querier. This type of application assumes that these physical phenomena stay within a threshold over time.

• Spatial Correlation: Typical WSN applications require spatially dense deployment in order to achieve satisfactory coverage and reliable decision-making in the face of node failure and node compromise.

In RSDA, the spatial correlation is represented by having multiple sensor nodes in each cell perform the same functions as the cell representative, which helps evaluate the cell representative's behavior. The temporal correlation is represented by considering applications in which the collected physical phenomena vary within an acceptable error range. A good combination of these two types of correlation helps improve the accuracy of the aggregated data and defend against an adversary whose capabilities are discussed in the subsequent section.

4.3 Adversarial Model

Let the number of sensor nodes in each cell be T. It is assumed that an adversary (ADV) is capable of compromising W sensor nodes, where $W \gg T$, but with no more than $t - 1$ compromised nodes in any single cell.
When the ADV compromises a sensor node x, it is able to read all of x's internal memory, and can then manipulate x to alter the content of a received packet, drop it, or launch any attack listed in Sections and 3.2. However, the ADV cannot take over the base station B, which is secured and under the supervision of the network administrator. This type of adversary can be classified as type III, according to the proposed adversarial model discussed in Section 2.2. Before we describe how RSDA works, the security requirements are discussed in the following section.

Table 4.1: Description of notations used in Chapter 4

| Notation | Description |
|---|---|
| $K_1, K_2$ | Two network-wide shared keys. |
| $C_i$ | The i-th cell. |
| $K_{C_i}$ | Intra-cell key for the i-th cell. |
| $K_{C_{ij}}$ | Inter-cell key shared between the i-th and j-th cells. |
| $H(.)$ | Hash function. |
| $MAC_{K_{C_i}}$ | Message authentication code computed by using $K_{C_i}$. |
| ADV | An adversary around the WSN. |
| T | The number of nodes in each cell. |
| W | The total number of compromised nodes in the whole deployment area. |
| t | The minimum number of cell members that are required to revoke a misbehaving $C_{rep}$ or to confirm a new $C_{read}$. |
| x, y | Sensor nodes x and y, respectively. |
| $p_x, p_y$ | The physical phenomena reported by sensor nodes x and y, respectively. |
| B | The base station. |
| $C_i^{read}$ | The reported (sensed) physical phenomenon from $C_i$. |
| F | An aggregation function. |
| $AR^{Q_n}_{C_i}$ | An aggregation result for query number $Q_n$, obtained by applying F at $C_i$. |
| $Q_n$ | A query number. |
| $R^{x}_{S/A/F}$ | Reputation value of sensor node x for the Sensing, Aggregation, or Forwarding functionality. |
| $\alpha^{x}_{S/A/F}$ | The number of correct behaviors of sensor node x for the Sensing, Aggregation, or Forwarding functionality. |
| $\beta^{x}_{S/A/F}$ | The number of incorrect behaviors of sensor node x for the Sensing, Aggregation, or Forwarding functionality. |
| $Thr_{A/S/R}$ | The pre-defined threshold for the Aggregation/Sensing/Reputation. |
| $C_i^{\#}$ | The number of inputs to the aggregation function. |

4.4 Security Requirements

According to the proposed framework, and since RSDA considers a type III adversary, RSDA focuses on providing two main properties: data accuracy and data availability. To achieve these two properties and defend against the attacks discussed in Sections and 3.2, the following requirements are important.

• Data Integrity ensures that the content of a message has not been altered, either maliciously or accidentally, during transmission.
This helps RSDA to filter out incorrect data early and save the energy that would be spent if this data traveled all the way to the base station (B).

• Data Freshness ensures that the data is recent and has not been replayed. Injecting old data into the network forces nodes to process this unnecessary data, which leads to more energy consumption. This old data also does not represent the current (correct) cell reading, which affects the accuracy of the aggregated data.

• Entity Authentication allows the receiver to verify whether or not the message was sent by the claimed sender. Therefore, an adversary would not be able to participate and inject data into the network, and thus affect data accuracy, unless it had valid keys.

After describing the network assumptions, the data model that RSDA follows, the adversary it defends against, and the security requirements it provides, RSDA itself is described in the following section.

4.5 The Proposed Reputation-based Secure Data Aggregation Scheme

RSDA focuses on aggregating physical phenomena in heterogeneous environments and follows the multiple aggregator model, in which the aggregation is performed at each non-leaf cell. Taking advantage of the temporal and spatial correlation in WSNs, each node monitors the behavior of the other sensor nodes within the same cell and then calculates their reputation values. These calculations are based on how these sensor nodes participate in cell operations such as sensing, forwarding, and aggregation.

Table 4.2: Reputation table format as suggested in RSDA

| Node ID | $R_S$: r | $R_S$: s | $R_F$: r | $R_F$: s | $R_A$: r | $R_A$: s |
|---|---|---|---|---|---|---|
| $x_i$ | | | | | | |

(one row per monitored cell member)

In each cell, a sensor node is selected to be the cell representative $C_{rep}$. Initially, $C_{rep}$ is chosen randomly, since all nodes start with the same reputation value. Later on, the selection of a new $C_{rep}$ is based on the highest reputation score among the cell members. The $C_{rep}$ is responsible for confirming its cell reading $C_{read}$ (reported by other cell members), aggregating it with other readings (if the cell is an intermediate cell), and forwarding the aggregation result to an upstream cell. Each node in the cell has a monitoring mechanism similar to the watchdog mechanism (WDM), discussed in Section 3.1.1, in order to monitor the behavior of neighboring nodes within the same cell.
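The selection rule for the cell representative described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the selection algorithm of RSDA, which is presented later in the thesis: the function name `select_representative`, the use of a single score per node, the random tie-breaking, and all numeric values are hypothetical.

```python
import random


def select_representative(reputation, rng=random):
    """Return the cell member with the highest reputation score.

    `reputation` maps node IDs to scores. Ties are broken at random, which
    also covers the initial round where every node holds the same score.
    """
    best = max(reputation.values())
    candidates = sorted(node for node, score in reputation.items() if score == best)
    return rng.choice(candidates)


initial = {"x1": 0.5, "x2": 0.5, "x3": 0.5}   # hypothetical equal starting scores
later = {"x1": 0.62, "x2": 0.91, "x3": 0.48}  # hypothetical scores after monitoring

print(select_representative(initial))  # any of x1, x2, x3
print(select_representative(later))    # x2
```

Treating the initial random choice as a tie-break keeps a single code path for both the first round and later re-selections.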
RSDA belongs to the class of Bayesian trust and reputation models, chosen for their flexibility and strong statistical foundation [64, 121], their simplicity in meeting the resource constraints of sensor nodes, and their success in detecting misbehaving sensor nodes [47, 91, 92]. The reputation value is defined as the expectation value of the beta probability density function (PDF) with parameters $(\alpha, \beta)$ [64, 121]. A node's behavior in the Bayesian trust and reputation model can be represented in the form $(\alpha, \beta)$, where $\alpha$ and $\beta$ represent the amount of positive and negative ratings, respectively. These ratings are calculated by a cell member for the other members of its cell and then stored in its reputation table. The beta PDF, denoted by $\mathrm{beta}(p \mid \alpha, \beta)$, can be expressed using the gamma function as follows:

$$\mathrm{beta}(p \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\; p^{\alpha - 1} (1 - p)^{\beta - 1}$$

where $0 \le p \le 1$ and $\alpha, \beta > 0$, with the restriction that $p \ne 0$ if $\alpha < 1$, and $p \ne 1$ if $\beta < 1$. The probability expectation value of the beta distribution is given by:

$$E(p) = \frac{\alpha}{\alpha + \beta}$$

When nothing is known, the a priori distribution is the uniform beta PDF with $\alpha = 1$ and $\beta = 1$. After observing r positive and s negative outcomes, the a posteriori distribution is the beta PDF with $\alpha = r + 1$ and $\beta = s + 1$. This approach provides a sound mathematical foundation for the calculation of the reputation values.

Algorithm 4.1: Bootstrap Phase
   /* code for sensor node x in cell i. */
   /* cells j, k, l are adjacent cells for cell i. */
   /* x is preloaded with two network-wide shared keys $K_1$, $K_2$ */
   1  x computes its intra-cell key as in Equation 4.2;
   2  x computes its inter-cell keys as in Equation 4.3;
   3  x deletes $K_1$ and $K_2$;
   4  return $K_{C_i}$, $K_{C_{ij}}$, $K_{C_{ik}}$, and $K_{C_{il}}$;

The nodes' behaviors are examined for three functions: data sensing, data forwarding, and data aggregation (if x is the $C^{rep}$ of an intermediate cell). Each node therefore maintains a reputation table for its cell members and records r and s separately for each of these functions, as in Table 4.2. If a packet is forwarded to its intended destination, then the forwarding behavior of the overheard node is considered correct. On the other hand, the forwarding behavior is considered incorrect if the packet is dropped or forwarded along an incorrect path. The aggregation behavior is considered normal if cell members find that the difference between their own calculation of the aggregation result and the $C^{rep}$'s calculation is bounded by a predefined threshold.
Finally, if the reported sensor reading is within the accepted range of readings covered by the temporal correlation feature, then the sensing behavior of the overheard node is considered correct. Thus, the reputation value for each of sensing, aggregation, and forwarding, $R_{S/A/F}$, can be expressed as follows:

$$R_{S/A/F} = \frac{\alpha_{S/A/F}}{\alpha_{S/A/F} + \beta_{S/A/F}} \tag{4.1}$$

The essential operations that precede the running of RSDA are performed in a short period of time during which the network is assumed to be uncompromised. This period is called the bootstrap phase.
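The beta-based calculation above can be illustrated with a minimal Python sketch. The class name `BetaReputation`, the dictionary layout of the reputation table, and the sample observations are assumptions made for illustration; only the rule $\alpha = r + 1$, $\beta = s + 1$ and Equation 4.1 come from the text.

```python
from dataclasses import dataclass


@dataclass
class BetaReputation:
    """Positive (r) and negative (s) observation counts for one function."""
    r: int = 0
    s: int = 0

    def record(self, positive: bool) -> None:
        """Add one observed outcome."""
        if positive:
            self.r += 1
        else:
            self.s += 1

    @property
    def value(self) -> float:
        """Expectation of the beta PDF with alpha = r + 1, beta = s + 1."""
        alpha = self.r + 1
        beta = self.s + 1
        return alpha / (alpha + beta)


# Hypothetical reputation table kept by one node: one entry per monitored
# cell member, with separate counters for sensing (S), forwarding (F),
# and aggregation (A), as in Table 4.2.
table = {"x": {"S": BetaReputation(), "F": BetaReputation(), "A": BetaReputation()}}

prior = table["x"]["S"].value          # uniform prior: alpha = beta = 1
for outcome in (True, True, True, False):
    table["x"]["S"].record(outcome)
posterior = table["x"]["S"].value      # alpha = 4, beta = 2
```

Storing only the two counters per function keeps the per-neighbor state small, which is in line with the resource constraints mentioned above.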

Figure 4.3: A simplified deployment area for RSDA (legend: intermediate cell, leaf cell, cell member, cell representative, single cell reading, aggregated readings; cells shown: $C_z$, $C_j$, $C_k$, $C_m$, $C_i$, $C_b$)

Bootstrap Phase

This phase consists of a short period of time immediately following the network deployment. It is short enough to assume that no attacks are possible during this phase. The required operations in this phase are summarized in Algorithm 4.1. The node x computes the intra-cell key ($K_{C_i}$), which is used to authenticate any communication between itself and any node in the same cell, in a similar way to Ren et al. [101], as follows:

$$K_{C_i} = H(K_1 \,\|\, C_i) \tag{4.2}$$

where $\|$ represents bit string concatenation. $K_{C_i}$ is used to prevent non-cell members from participating in the cell operations and affecting the accuracy of $C^{read}_i$. After that, each sensor node computes inter-cell keys with adjacent cells, such as $C_j$, as follows:

$$K_{C_{ij}} = H(K_2 \,\|\, C_i \,\|\, C_j) \tag{4.3}$$

At the end of this phase, each sensor node deletes $K_1$ and $K_2$. This helps to prevent an adversary from getting access to these keys and then participating in network activities. If only one network-wide shared key is used, then an intra-cell key compromise leads to the compromise of all inter-cell keys. In this case, where $K_1 = K_2$, an adversary can calculate the inter-cell key between cells i and j as follows:

$$K_{C_{ij}} = H(H(K_1 \,\|\, C_i) \,\|\, C_j)$$

The advantage of using two network-wide shared keys is that the compromise of an intra-cell key does not lead to the compromise of an inter-cell key.
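The key derivation of the bootstrap phase can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for the unspecified hash function H, cell identifiers are hypothetical fixed-length byte strings (so that concatenation is unambiguous), and the function name `bootstrap` is not from the thesis. The sketch keeps the argument order $(C_i, C_j)$ exactly as written in Equation 4.3; how both cells agree on that order is not stated in the text.

```python
import hashlib


def H(*parts: bytes) -> bytes:
    """Hash the concatenation of the parts (SHA-256 is an assumption)."""
    return hashlib.sha256(b"".join(parts)).digest()


def bootstrap(k1: bytes, k2: bytes, cell: bytes, adjacent: list[bytes]):
    """Return (intra-cell key, {adjacent cell id: inter-cell key})."""
    intra = H(k1, cell)                                  # Equation 4.2
    inter = {adj: H(k2, cell, adj) for adj in adjacent}  # Equation 4.3
    # K1 and K2 are not part of the returned state, mirroring their
    # deletion at the end of the bootstrap phase.
    return intra, inter


# Hypothetical fixed-length identifiers and keys, for illustration only.
C_I, C_J, C_K = b"\x00\x01", b"\x00\x02", b"\x00\x03"
K1, K2 = b"network-key-one", b"network-key-two"
intra, inter = bootstrap(K1, K2, C_I, [C_J, C_K])

# Single-key case described in the text: the inter-cell key is chained
# from the intra-cell key, so a leaked intra-cell key rebuilds it.
leaked_intra = H(K1, C_I)
rebuilt_inter = H(leaked_intra, C_J)   # H(H(K1 || C_i) || C_j)
```

With two independent keys, `inter[C_J]` cannot be recomputed from `intra` alone, which is the property the paragraph above argues for.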

Data Aggregation

Before describing how the aggregation procedure works, the packet format used within the network is introduced. Each packet has the following format:

$$\{ C^{rep}_i,\ C^{rep}_j,\ Q_n,\ Payload \}$$

where $C^{rep}_i$ is the sending cell representative, $C^{rep}_j$ is the receiving cell representative, $Q_n$ is a query number, and $Payload$ is the packet content. An aggregation process begins when B propagates a query to all cells as follows:

$$\{ B,\ \text{all cells},\ Q_n,\ Payload \}$$

The query and its response are relayed to their destinations via intermediate cells. The data flow further relies on the routing algorithm, which is not the focus of this thesis. The actions performed at each cell to answer this query vary depending on whether the cell is an intermediate cell or a leaf cell.

At Leaf Cells

Algorithm 4.2 summarizes the important activities that are performed at leaf cells. When a leaf cell $C_i$ receives the query $Q_n$, $C^{rep}_i$ randomly selects a sensor node x from its cell to send back the required sensing information, $p_x$, as follows:

$$\{ C^{rep}_i,\ x,\ Q_n,\ Payload \}$$

As a response, x senses the requested physical phenomenon and then sends the reading back to $C^{rep}_i$ as follows:

$$\{ x,\ C^{rep}_i,\ Q_n,\ Payload \}, \quad \text{where } Payload = p_x \,\|\, MAC_{K_{C_i}}(x \,\|\, Q_n \,\|\, p_x) \tag{4.4}$$

Since the other nodes in $C_i$ are within radio coverage and share the same intra-cell key with $C^{rep}_i$, they overhear the on-going traffic between the elected node and $C^{rep}_i$. These nodes then compare their local readings with $p_x$. If the cell members agree on $p_x$ and the response is sent to $C^{rep}_i$, they update $\alpha^x_S$ and $\alpha^x_F$ of node x and consider $p_x$ as the $C^{read}_i$. They also update $\alpha_S$ for all other nodes because of their implicit agreement on the $C^{read}_i$, which is expressed by remaining silent and not sending complaints about $p_x$. A cell node y does not agree on the reading $p_x$ if $|p_y - p_x| > Thr_S$.
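The response of Equation 4.4 and the agreement test can be sketched in Python as follows. Several details are assumptions made for illustration and are not specified in the text: HMAC-SHA-256 stands in for the MAC, node and query identifiers are packed as fixed-width integers, the reading is packed as an 8-byte float, and the function names are hypothetical.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # length of an HMAC-SHA-256 tag


def build_payload(k_ci: bytes, node_id: int, query_no: int, reading: float) -> bytes:
    """Payload = p_x || MAC_K(x || Q_n || p_x), as in Equation 4.4."""
    p_x = struct.pack(">d", reading)
    header = struct.pack(">HI", node_id, query_no)
    tag = hmac.new(k_ci, header + p_x, hashlib.sha256).digest()
    return p_x + tag


def verify_payload(k_ci: bytes, node_id: int, query_no: int, payload: bytes):
    """Return the reading if the MAC verifies, otherwise None."""
    p_x, tag = payload[:-TAG_LEN], payload[-TAG_LEN:]
    header = struct.pack(">HI", node_id, query_no)
    expected = hmac.new(k_ci, header + p_x, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return struct.unpack(">d", p_x)[0]


def disagrees(p_y: float, p_x: float, thr_s: float) -> bool:
    """Node y disagrees with p_x when |p_y - p_x| > Thr_S."""
    return abs(p_y - p_x) > thr_s


# Hypothetical intra-cell key, node id, query number, and reading.
key = b"k" * 16
payload = build_payload(key, node_id=7, query_no=3, reading=21.5)
accepted = verify_payload(key, 7, 3, payload)
tampered = bytes([payload[0] ^ 1]) + payload[1:]
```

Binding the query number into the MAC means that a response recorded for one query does not verify under another, which relates to the data freshness requirement.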
If the reported $p_x$ is not correlated closely enough with the local sensing information of other nodes in the cell, the disagreeing nodes perform the following actions:

• Update $\beta^x_S$ if the reading was unacceptable, and update $\beta^x_F$ if the destination was not the cell representative or no reply was sent.

• Provide $C^{rep}_i$ with the correct $C^{read}_i$.

Each disagreeing node, say node y, sends its reading to $C^{rep}_i$ and, as a consequence, other nodes in the cell are able to verify this disagreement and then update $\alpha^y_S$ or $\beta^y_S$.

Algorithm 4.2: At Leaf Cells
   /* code for sensor nodes x and y in cell i */
   /* cell representative $C^{rep}_i$ has received a recent query $Q_n$ from B */
   1  $C^{rep}_i$ randomly selects a cell member x to answer $Q_n$;
   2  x sends $p_x$ back to $C^{rep}_i$ as in Equation 4.4;
   3  other cell member (i.e. y) compares $p_x$ with $p_y$;
   4  if $|p_y - p_x| > Thr_S$ then
   5     y updates $\beta^x_S$;
   6     y sends correct information $p_y$ to $C^{rep}_i$ (as a complaint);
   7  else
   8     $C^{read}_i = p_x$;
   9  end if;
   10 if $p_x$ is routed in an incorrect path then
   11    y updates $\beta^x_F$;
   12 else
   13    y updates $\alpha^x_F$;
   14 end if;
   15 if No-of-Complaints $\ge t$ then
   16    $C^{rep}_i$ calculates $C^{read}_i$ as in Equation 4.5;
   17 end if;
   18 $C^{rep}_i$ forwards $C^{read}_i$ to an upper $C^{rep}$ as in Equation 4.6;
   19 other cell member (i.e. y) recomputes step 16;
   20 if $|C^{read}_y - C^{read}_{rep}| > Thr_S$ then
   21    y updates $\beta^{C^{rep}_i}_A$;
   22 else
   23    y updates $\alpha^{C^{rep}_i}_A$;
   24 end if;
   25 if $C^{read}_i$ is routed in an incorrect path then
   26    y updates $\beta^{C^{rep}_i}_F$;
   27 else
   28    y updates $\alpha^{C^{rep}_i}_F$;
   29 end if;

$C^{rep}_i$ computes the cell reading by using the Exogenous Discounting of Unfair Ratings proposed by Whitby et al. [132], after receiving n complaints (where $n \ge t$) regarding the reported reading $p_x$. These complaints should be received from nodes that are located in the same cell where the disagreement occurred and that have $R > Thr_R$. The approach is based on the assumption that sensors with low reputation are likely to give unfair information and vice versa. The reputation values of these n nodes are used to determine the weight given to their readings as follows, where k indexes the n complaining nodes:

$$C^{read}_i = \frac{\sum_{k=1}^{n} \left( p_k \, R^k_S \, R^k_F \right)}{\sum_{k=1}^{n} \left( R^k_S \, R^k_F \right)} \tag{4.5}$$

Then, $C^{rep}_i$ forwards this reading to the next cell $C_j$ in the upstream path as follows:

$$\{ C^{rep}_i,\ C^{rep}_j,\ Q_n,\ Payload \}, \quad \text{where } Payload = C^{read}_i \,\|\, C^{\#}_i \,\|\, MAC_{K_{C_{ij}}}(C^{rep}_i \,\|\, Q_n \,\|\, C^{read}_i \,\|\, C^{\#}_i) \tag{4.6}$$

where $C^{\#}$ is the number of inputs to the aggregation function; it is set to 1 here because the cell is a leaf cell. $C^{\#}$ helps an intermediate cell representative to calculate the average aggregation function (AVE) by tracking the number of participants in the aggregation function. Other nodes in $C_i$ monitor this transmission in order to evaluate the behavior of $C^{rep}_i$, since they also know the inter-cell keys shared between $C_i$ and its adjacent cells. If the cell reading gets altered by more than $Thr_S$, then $\beta^{C^{rep}_i}_A$ is updated. Otherwise, $\alpha^{C^{rep}_i}_A$ is updated. Whenever the cell reading gets routed along an incorrect path or does not get routed at all, $\beta^{C^{rep}_i}_F$ is updated; otherwise, $\alpha^{C^{rep}_i}_F$ is updated. Generally speaking, each cell member calculates the overall reputation value R for its cell members, except the cell representative, by considering the sensing and forwarding behaviors as follows:

$$R = \mu_1 R_S + (1 - \mu_1) R_F \quad \text{where } 0 < \mu_1 < 1 \tag{4.7}$$

As soon as a cell member has become the cell representative, $R_A$ is set to $R_S$ and the overall reputation value of the cell representative can be calculated for onward transactions as follows:

$$R^{C^{rep}} = \mu_2 R^{C^{rep}}_A + (1 - \mu_2) R^{C^{rep}}_F \quad \text{where } 0 < \mu_2 < 1 \tag{4.8}$$

Algorithm 4.3: At Intermediate Cells
   /* code for sensor nodes f in cell j */
   /* cell representative $C^{rep}_j$ has received an answer for a recent $Q_n$ from a downstream representative $C^{rep}_i$ */
   /* cell representative $C^{rep}_j$ checks the legitimacy of the received message */
   1  if the message has been altered then
   2     $C^{rep}_j$ updates $\beta^{C^{rep}_i}_F$;
   3  else
   4     $C^{rep}_j$ updates $\alpha^{C^{rep}_i}_F$;
   5     $C^{rep}_j$ collects readings from its cell and children cells;
   6     $C^{rep}_j$ performs aggregation as in Equation 4.10;
   7     $C^{rep}_j$ forwards $AR_{C_j}$ to $C^{rep}_k$ as in Equation 4.12;
   8     other cell members (i.e. f) recompute step 6;
   9     if $|AR_f - AR_{C_j}| > Thr_A$ then
   10       f updates $\beta^{C^{rep}_j}_A$;
   11    else
   12       f updates $\alpha^{C^{rep}_j}_A$;
   13    end if;
   14    if $C^{read}_j$ is routed in an incorrect path then
   15       f updates $\beta^{C^{rep}_j}_F$;
   16    else
   17       f updates $\alpha^{C^{rep}_j}_F$;
   18    end if;
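The reputation-weighted cell reading of Equation 4.5 and the combined reputation values of Equations 4.7 and 4.8 can be sketched in Python as follows. The function names, the default weights $\mu_1 = \mu_2 = 0.5$, and the sample readings are hypothetical; the thesis text here does not give concrete values for $\mu_1$ or $\mu_2$.

```python
def overall_reputation(r_s: float, r_f: float, mu1: float = 0.5) -> float:
    """Equation 4.7: R = mu1 * R_S + (1 - mu1) * R_F for a cell member."""
    return mu1 * r_s + (1 - mu1) * r_f


def representative_reputation(r_a: float, r_f: float, mu2: float = 0.5) -> float:
    """Equation 4.8: R = mu2 * R_A + (1 - mu2) * R_F for a representative."""
    return mu2 * r_a + (1 - mu2) * r_f


def cell_reading(reports, thr_r: float, mu1: float = 0.5) -> float:
    """Equation 4.5 over reports of the form (reading, R_S, R_F).

    Only reports from nodes whose overall reputation exceeds Thr_R are
    used; each remaining reading is weighted by R_S * R_F.
    """
    numerator = 0.0
    denominator = 0.0
    for reading, r_s, r_f in reports:
        if overall_reputation(r_s, r_f, mu1) > thr_r:
            weight = r_s * r_f
            numerator += reading * weight
            denominator += weight
    if denominator == 0.0:
        raise ValueError("no report from a sufficiently reputable node")
    return numerator / denominator


# Hypothetical complaints: two reputable nodes and one low-reputation node
# that reports an outlying value.
reports = [(20.0, 0.9, 0.9), (22.0, 0.9, 0.9), (80.0, 0.3, 0.9)]
reading = cell_reading(reports, thr_r=0.8)
```

In this example the outlying value comes from a node whose overall reputation is below the threshold, so it does not influence the computed cell reading.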

At Intermediate Cells

In order to ensure that the message is received from the claimed entity (data-origin authentication), $C^{rep}_j$ recomputes the MAC for the data received from the downstream cell, and then compares it with the attached one. If they do not match, then the message received from $C^{rep}_i$ is ignored and $\beta^{C_i}_F$ is updated by increasing $s^{C_i}_F$ by one. Otherwise, $C^{rep}_j$ removes the attached MAC, considers the reported data as an input to the aggregation function, and updates $\alpha^{C_i}_F$ by increasing $r^{C_i}_F$ by one. Since $C^{rep}_j$ has no access to the inter-cell keys shared between $C^{rep}_i$ and $C_i$'s adjacent cell representatives, $C^{rep}_j$ cannot evaluate the aggregation and sensing behavior of $C^{rep}_i$. Thus, $C^{rep}_j$ calculates the reputation value of $C^{rep}_i$ by using the available information about the forwarding activities as follows:

$$R^{C_i} = \frac{\alpha_F}{\alpha_F + \beta_F} \tag{4.9}$$

The aggregation behavior of $C^{rep}_i$ is monitored only by nodes in cell i. To perform in-network processing, $C^{rep}_j$ waits until it has received readings from its own cell and from its children cells. The reading of $C_j$ is obtained in the same way as in a leaf cell. Then, $C^{rep}_j$ applies, for example, the average aggregation function on the readings in order to answer $Q_n$ as follows:

$$AR^{Q_n}_{C_j} = F(C^{read}_1, C^{read}_2, \ldots, C^{read}_i, \ldots, C^{read}_j) \tag{4.10}$$

$$= \frac{R^{C^{rep}_1} C^{read}_1 + R^{C^{rep}_2} C^{read}_2 + \cdots + R^{C^{rep}_i} C^{read}_i + \cdots + R^{C^{rep}_j} C^{read}_j}{C^{\#}_1 + C^{\#}_2 + \cdots + C^{\#}_i + \cdots + C^{\#}_j} \tag{4.11}$$

After that, $C^{rep}_j$ sets $C^{\#}_j$ to be the sum of the received counters $C^{\#}_1, C^{\#}_2, \ldots, C^{\#}_i, \ldots, C^{\#}_j$ and then forwards $AR^{Q_n}_{C_j}$ to an upper cell representative $C^{rep}_k$ (see Figure 4.3) with the following packet format:

$$\{ C^{rep}_j,\ C^{rep}_k,\ Q_n,\ Payload \}, \quad \text{where } Payload = AR^{Q_n}_{C_j} \,\|\, C^{\#}_j \,\|\, MAC_{K_{C_{jk}}}(C^{rep}_j \,\|\, Q_n \,\|\, AR^{Q_n}_{C_j} \,\|\, C^{\#}_j) \tag{4.12}$$

Other nodes in cell $C_j$ are able to keep an eye on the aggregation and forwarding behavior of $C^{rep}_j$.
They recalculate the aggregation result, denoted here by $AR'_{C_j}$, and compare it with the reported $AR_{C_j}$. If the difference is bounded by a small value, that is $|AR_{C_j} - AR'_{C_j}| < Thr_A$, then $r^{C^{rep}_j}_A$ is increased by one. Otherwise, $s^{C^{rep}_j}_A$ is increased by one. Moreover, $\alpha^{C^{rep}_j}_F$ is increased by one if $C^{rep}_j$ forwards the packet to the right $C^{rep}$, that is, one that is not in the blacklist and is one cell closer to the base station; otherwise, $\beta^{C^{rep}_j}_F$ is updated. Once $R^{C^{rep}_j}$ falls below $Thr_R$, the current $C^{rep}$ should be blacklisted and a new $C^{rep}$ should be elected. This is done through the cell representative revocation mechanism, which is discussed in the subsequent paragraph. Algorithm 4.3 summarizes the discussion above and highlights the important activities that are performed at intermediate cells.
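The aggregation step at an intermediate cell and the check performed by its members can be sketched as follows. The sketch follows Equation 4.11 as it can be read from the text (reputation-weighted readings divided by the sum of the counters), which is an interpretation; the function names and sample inputs are hypothetical.

```python
def aggregate(inputs):
    """Return (AR, C#_j) for inputs of the form (reputation, reading, count).

    AR is the sum of reputation * reading divided by the sum of the
    counters, and C#_j is the sum of the received counters.
    """
    total_count = sum(count for _, _, count in inputs)
    if total_count == 0:
        raise ValueError("no inputs to aggregate")
    weighted = sum(rep * reading for rep, reading, _ in inputs)
    return weighted / total_count, total_count


def positive_aggregation_feedback(ar_reported: float, ar_recomputed: float,
                                  thr_a: float) -> bool:
    """True when |AR - AR'| < Thr_A, i.e. r_A is increased; else s_A."""
    return abs(ar_reported - ar_recomputed) < thr_a


# Hypothetical inputs from two leaf cells with fully trusted representatives.
ar, count = aggregate([(1.0, 20.0, 1), (1.0, 22.0, 1)])
```

A cell member would call `aggregate` on the inputs it overheard and pass its own result, together with the value reported by the representative, to `positive_aggregation_feedback`.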

Table 4.3: Datasets used in the experimental evaluation section

Scenario | Dataset | Description | Duration | Frequency | # Attacks
Scenario 1 | Dataset-1 | No Attacks | | |
Scenario 2 | Dataset-2 | Abrupt Change | | |
Scenario 3 | Dataset-3 | 1-per-2 OO - F. Block | | |
 | Dataset-4 | 1-per-2 OO - L. Block | | |

Cell Representative Replacement Mechanism

The main aim of this mechanism is to inform the representatives of adjacent cells about the detection of a low reputation value of the current cell representative $C^{rep}$, blacklist $C^{rep}$, and then select a new cell representative that has the highest reputation value among the rest of the cell members. The revocation process starts when n nodes ($n \ge t$) in a cell $C_i$ send revoke messages to the representatives of adjacent cells in order to inform them about the low reputation value that $C^{rep}_i$ has recently reached. Each cell member, say x, selects as a candidate for the new $C^{rep}_i$ the node (say y) that has the highest $R^y$ among the rest of the cell members and has never been on the blacklist. This revoke message is sent as follows:

$$\{ x,\ C^{rep}_j,\ Q_n,\ Payload \}, \quad \text{where } Payload = C^{rep}_i \,\|\, R^{C^{rep}_i} \,\|\, y \,\|\, MAC_{K_{C_{ij}}}(x \,\|\, Q_n \,\|\, C^{rep}_i \,\|\, R^{C^{rep}_i} \,\|\, y) \tag{4.13}$$

Each adjacent cell representative, say $C^{rep}_j$, should receive at least n valid requests before it participates in the replacement process. A valid request is a request received from a cell member that is located in the same cell as the revoked $C^{rep}$, has an acceptable reputation value, and is not in the blacklist. Once n requests have been sent, $\beta_F$ is updated for those nodes in cell i that did not participate in reporting the revocation message. After receiving these n messages, the new $C^{rep}_i$ is selected by applying a simple majority vote on them. The replacement process requires exchanging a number of messages, which can affect the network lifetime. This process, however, never starts unless misbehavior of the representative is detected.
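The vote performed by an adjacent cell representative can be sketched as follows. This is a sketch under stated assumptions rather than the thesis implementation: "acceptable reputation value" is taken to mean at least $Thr_R$, "simple majority" is taken to mean more than half of the valid ballots, each sender is counted once, and all names are hypothetical.

```python
from collections import Counter


def elect_new_representative(requests, members, blacklist, reputation,
                             thr_r: float, t: int):
    """Return the elected candidate, or None if no election can be made.

    requests: iterable of (sender, proposed_candidate) pairs taken from
    revoke messages whose MAC has already been verified.
    """
    ballots = {}
    for sender, candidate in requests:
        valid_sender = (
            sender in members
            and sender not in blacklist
            and reputation.get(sender, 0.0) >= thr_r
        )
        valid_candidate = candidate in members and candidate not in blacklist
        if valid_sender and valid_candidate:
            ballots[sender] = candidate      # one ballot per sender
    if not ballots or len(ballots) < t:
        return None
    candidate, votes = Counter(ballots.values()).most_common(1)[0]
    return candidate if votes > len(ballots) / 2 else None


# Hypothetical cell: "rep" is the revoked representative, "z" is an outsider.
members = {"a", "b", "c", "d"}
blacklist = {"rep"}
reputation = {"a": 0.9, "b": 0.9, "c": 0.95, "d": 0.5}
requests = [("a", "c"), ("b", "c"), ("d", "a"), ("z", "a")]
winner = elect_new_representative(requests, members, blacklist, reputation,
                                  thr_r=0.8, t=2)
```

Requests from the low-reputation node and from the non-member are discarded before the count, so they can neither reach the threshold t nor sway the vote.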
As a result, each cell member needs to store two types of reputation-related information:

• Reputation table, which contains a list of the cell members and their reputation values, as in Table 4.2.

• Blacklist, which contains a list of nodes that misbehaved while acting as a $C^{rep}$. Once a node x has been blacklisted due to its low reputation value, this is considered evidence that x is compromised; x should then be isolated from the network and removed from the reputation table mentioned above.

4.6 Experimental Evaluation

This section evaluates the effectiveness of RSDA by evaluating the behavior of a representative in an intermediate cell. This evaluation is based on the four datasets listed in Table 4.3. The first dataset (dataset-1) is a real-life dataset, which was captured in the Intel Berkeley Research Laboratory (IBRL) during the period from February 28, 2004 to April 5, 2004. The other datasets are modified versions of dataset-1, as will be explained in the subsequent paragraphs.

To the best of our knowledge, there is no publicly available test-bed in which reputation-based schemes designed for WSNs can be compared. Therefore, custom test-beds were built for current schemes in the literature. Unfortunately, some custom test-beds have not appeared in print or are incompletely described. Also, the simulation environments vary from one test-bed to another, which makes any comparison difficult. Consequently, we built our own customized test-bed. Our simulations were written on top of the QUALNET network simulator [111]. In particular, we added the promiscuous mode to the MAC protocol provided by QUALNET. Then, we built RSDA on top of the application layer (footnote 2), in which the datasets mentioned in Table 4.3 are preloaded into each sensor node within the simulation environments. To be able to view the contents of RSDA packets as they travel up and down the protocol stack, the QUALNET packet tracer tool was also customized. Customizing the tracer consists of adding tracing support for protocols that run in its simulation environments. To do so, the simulator code was updated to produce trace output for RSDA, and then a description of that trace output was made available to the QUALNET packet tracer. We chose the QUALNET network simulator for the following reasons:

• It provides a dedicated library for wireless sensor networks. This library is composed of the ZigBee physical and ZigBee MAC layers.

• Scalable Network Technologies, who developed QUALNET, provides support and help to academic researchers through its community forums. In return, some researchers contribute their proposals and schemes to be included in newer versions of QUALNET.
This will help to facilitate a comparison between similar protocols.

• It provides a large set of implemented Application Programming Interfaces (APIs) that facilitate the programming and design tasks; hence the coding time can be reduced.

To evaluate the performance of RSDA, the abstract network model in Figure 4.3 is implemented in QUALNET. We assigned a unique ID to each sensor. Sensor nodes in RSDA are set to run the ZigBee (IEEE 802.15.4) MAC protocol provided by the WSN library in QUALNET. However, we added the promiscuous mode to it, which allows each sensor node to perform passive listening. AODV is chosen as the routing protocol run by sensor nodes, because it adapts quickly to dynamic link conditions and link faults, and has low processing and memory overhead. Also, AODV is the only routing protocol that is tested with the ZigBee MAC protocol in QUALNET. Finally, the radio transmission power, radio receiver power, radio idle power, data rate, packet size, frequency, and modulation are respectively set to 27 mA, 10 mA, < 1 µA, 40 kbps, 36, 916 MHz, and QPK, in order to imitate MICA2 and TinyOS characteristics.

Footnote 2: The source code of RSDA can be downloaded from Alzaid.htm

The evaluation section studies the aggregation behavior at the representative node of cell j. $C^{rep}_j$ receives inputs $AR^{Q_n}_{C_i}$, $AR^{Q_n}_{C_m}$, and $AR^{Q_n}_{C_k}$ to the aggregation function from its children cells $C_i$, $C_m$, and $C_k$, respectively. RSDA is tested in three distinct scenarios as follows:

• First Scenario: The dataset used in this scenario, dataset-1, is genuine as captured in IBRL, with no attacks on or alteration of the aggregated data. For simplicity, only temperature data is extracted from the IBRL dataset. This scenario helps in determining the value of $Thr_A$, which is then used by the cell members to evaluate the aggregation behavior of their cell representative.

• Second Scenario: The dataset used in this scenario is artificial, because the original IBRL dataset does not contain anomalous data. Therefore, we modified k consecutive query responses of a specific cell representative, $C^{rep}_k$, where k is the attack duration, by multiplying the true value of $AR^{Q_n}_{C_k}$ by 2 in order to account for abrupt changes in dataset-2. In other words, anomalous aggregation results have a mean amplitude that is twice (200% of) that of normal aggregation results. In fact, RSDA is able to detect a change in an aggregation result if it differs by at least $Thr_A$. For example, scenario one in our experiment suggests 2.52 as an optimal value for $Thr_A$. This means that RSDA is able to detect any anomalous aggregation result with a mean amplitude around 12.6% more than the average of the aggregation results in the scenario, as discussed under Scenario 1: No Attacks. Members of cell k need to investigate the behavior of $C^{rep}_k$. Once the reputation value of $C^{rep}_k$ falls under a predefined threshold $Thr_R$, the cell members should send revocation messages to the adjacent cell representatives, in order to replace the current misbehaving representative with the node that has the highest reputation value among the rest of the cell members.
• Third Scenario: The dataset in this scenario is a modified version of dataset-1 that mimics the On-Off (OO) attack behavior. Depending on the attack frequency l, the adversary's attacking methodology is to misbehave once every l query responses. The attack frequency in this scenario is 2 and the attack duration is 1. This means that an attack is launched once every two query responses - the 1-per-2 strategy. The effectiveness of RSDA is evaluated when the 1-per-2 OO attack is launched in the first half of the data, as in dataset-3, and in the second half of the data, as in dataset-4 (see Table 4.3).

For all these scenarios, the SUM aggregation results of RSDA, denoted as Reputation SUM (R-SUM) in the rest of the chapter, are compared with the SUM aggregation results that are calculated from the observations without considering the reputation values, denoted as Plain SUM (P-SUM). Note that other aggregation functions such as AVE, MIN, and MAX can be employed with very few modifications. However, the discussion in this section is limited to the SUM aggregation function.
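The way the attack datasets are derived from the genuine one can be sketched in Python as follows. The function names, the 1-based query numbering, the choice to attack the first query of the window in the on-off case, and the constant sample readings are all assumptions made for illustration; the actual IBRL data is not reproduced here.

```python
def abrupt_change(readings, start: int, end: int, factor: float = 2.0):
    """Scale every reading for queries start..end (inclusive, 1-based)."""
    return [
        value * factor if start <= query <= end else value
        for query, value in enumerate(readings, start=1)
    ]


def on_off(readings, start: int, end: int, period: int = 2, factor: float = 2.0):
    """Scale one reading out of every `period` queries within start..end."""
    return [
        value * factor
        if start <= query <= end and (query - start) % period == 0
        else value
        for query, value in enumerate(readings, start=1)
    ]


# Hypothetical block of 70 identical readings standing in for real data.
genuine = [10.0] * 70
doubled = abrupt_change(genuine, start=19, end=46)   # consecutive doubling
alternating = on_off(genuine, start=8, end=20)       # 1-per-2 strategy
```

Keeping the genuine list untouched and returning modified copies makes it easy to compare a plain sum over the attacked data with the same sum over the original data.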

The horizontal axis, in all plots in the subsequent sections, represents the number of the query that is answered by the cell representatives, and the vertical axis represents the temperature captured or aggregated by the cell representatives. Also, node-1 represents $C^{rep}_k$ in the abstract network model in Figure 4.3, and node-2 and node-3 respectively represent $C^{rep}_i$ and $C^{rep}_m$.

Figure 4.4: The first scenario of RSDA evaluation in which dataset-1 is used

Scenario 1: No Attacks

As discussed above, the dataset used in this scenario is as captured in IBRL and contains no malicious data. The motivation of this scenario is to find the optimal value of $Thr_A$, that is, the bound that the difference between the aggregation result of a cell representative and the results recomputed by the rest of the cell members should not exceed. The value of $Thr_A$ helps cell members to monitor the behavior of their cell representative. Experiments were performed with varying values of $Thr_A$. During these experiments, different blocks of dataset-1 were used, each of which is 70 queries long. For $Thr_A$, values between 0 and 4 were considered, with an increment of 0.02. Note that increasing the value of $Thr_A$ reduced the number of revocation messages that were sent by members of adjacent cells. However, this also limited the detection capability of RSDA. For example, if RSDA allows a large difference ($Thr_A$) between two aggregation results (an aggregation result reported by a cell representative and another one recalculated by a member of the same cell), then the data accuracy provided by RSDA is weakened. This gives an adversary, once it has succeeded in compromising a cell representative, the opportunity to alter an aggregation result by up to $Thr_A$ without being detected by its cell members. As a consequence, its reputation value would be increased instead of being decreased. It is observed that the optimal value of the absolute deviation is 2.52, which suggests that $Thr_A$ should be set to 2.52.

Figure 4.4-a depicts the behavior of the data collected by $C^{rep}_j$ and Figure 4.4-b shows the SUM aggregation results of the collected data. As expected, the R-SUM curve converges with the P-SUM curve, especially as the reputation values of the cell members increase over time due to their normal behavior.
We also found that no revocation messages were received from members of adjacent cells, because the selected $Thr_A$ value ensures that the reputation values of adjacent representatives never fall below the predetermined reputation threshold, $Thr_R$, which was set to 0.8.

Scenario 2: Abrupt Change

The motivation behind this scenario is to investigate how RSDA handles an abrupt change. RSDA takes advantage of the processed reputation information, namely the revocation requests. When the reputation value of the representative of cell $C_k$ falls below the predetermined reputation value $Thr_R$ due to its malicious behavior, members of the same cell, $C_k$, send revocation messages to adjacent cell representatives, such as $C^{rep}_j$, $C^{rep}_i$, and $C^{rep}_l$ in the abstract network model in Figure 4.3. Once at least t revocation messages have been received at an adjacent cell representative, the revocation process is initiated and a replacement for the misbehaving representative, $C^{rep}_k$, is required.

To simulate an abrupt change in the dataset, $C^{rep}_k$ was considered to be a compromised sensor node that had gained a high reputation value ($R^{C^{rep}_k} = 0.979$) due to its normal behavior in the previous query responses up to $Q_i$, where $i > 0$. From query number $Q_i$ onward up to $Q_j$, where $j > i$, $C^{rep}_k$ started behaving maliciously by reporting a $C^{read}_k$ twice as large as the true data. In other words, the original dataset, dataset-1, was modified by multiplying the true $C^{read}_k$ by 2 for all query responses between $Q_i$ and $Q_j$ inclusive (where $i = 19$ and $j = 46$); the result is named dataset-2. Note that the attack duration and frequency in this scenario are respectively 28 and 1; see Table 4.3.

Figure 4.5: The second scenario of RSDA evaluation in which dataset-2 is used

Figure 4.5-a depicts the data collected by $C^{rep}_j$, in which a change in the data reported by $C^{rep}_k$ is obvious. However, the change ends, in Figure 4.5, at $Q_p$ where $p = 23$ (not 46). The change ends at $Q_p$ because t revocation messages were received from members of $C_k$, complaining that the reputation value of $C^{rep}_k$ fell below $Thr_R$, which is set to 0.8 in our experiment. The $R^{C^{rep}_k}$ at $Q_i$ was 0.979, and it dropped to [...] at $Q_j$. As discussed in Section 4.5, each time the cell members disagree with the aggregation result calculated by their representative $C^{rep}_k$, they update $\beta^{C^{rep}_k}_A$ and then recalculate $R^{C^{rep}_k}$ as in Equation 4.8. The consecutive malicious behavior between $Q_i$ and $Q_j$ increases the amount of negative feedback of $C^{rep}_k$ by $Q_j - Q_i + 1$, which makes $R^{C^{rep}_k} < Thr_R$. Thus, the current $C^{rep}_k$ needs to be revoked and a new representative should be elected.

Figure 4.5-b shows the SUM aggregation results of the collected data. Unfortunately, the R-SUM aggregation results calculated by RSDA are affected by this abrupt change until the revocation requests are received at $Q_p$. Importantly, this effect is temporary, and RSDA reacts to the change as soon as the reputation value of the misbehaving representative falls below $Thr_R$. On the other hand, the P-SUM aggregation results are seriously affected by the misbehavior of the cell representative, with no means of detection.

Figure 4.6: The third scenario of RSDA evaluation in which dataset-3 is used
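The effect of consecutive negative feedback on a beta-based reputation value can be illustrated with the following Python sketch. It is deliberately simplified: it tracks a single beta component rather than the weighted combination of Equation 4.8, and the starting counts are hypothetical values chosen only so that the initial reputation is close to the one quoted above. It is therefore not expected to reproduce the query numbers reported for the experiment.

```python
def reputation(r: int, s: int) -> float:
    """Beta expectation with alpha = r + 1 and beta = s + 1."""
    return (r + 1) / (r + s + 2)


def first_response_below(r: int, s: int, behaviour, thr_r: float):
    """Return the 1-based index of the first response after which the
    reputation is below thr_r, or None if that never happens.

    behaviour: iterable of booleans, True for a correct response.
    """
    for index, correct in enumerate(behaviour, start=1):
        if correct:
            r += 1
        else:
            s += 1
        if reputation(r, s) < thr_r:
            return index
    return None


# Hypothetical history: 46 positive and no negative observations.
start_value = reputation(46, 0)
consecutive = first_response_below(46, 0, [False] * 28, thr_r=0.8)

# Hypothetical alternating (on-off) behaviour from a slightly lower start.
alternating = first_response_below(25, 0, [False, True] * 6, thr_r=0.8)
```

Under these assumptions a run of consecutive misbehavior eventually pushes the value below the threshold, whereas the short alternating pattern does not, because each correct response partly offsets the preceding negative one.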

Figure 4.7: Reputation values of $C^{rep}_k$ during the third scenario of RSDA evaluation

Scenario 3: 1-per-2 Strategy On-Off Attack

Dataset-3 and dataset-4 are used in this scenario to investigate the effectiveness of RSDA in detecting the OO attack. The difference between these datasets is that the attack happens in the first half of the data in dataset-3, whereas it happens in the second half of the data in dataset-4.

To simulate the OO attack, the cell representative, $C^{rep}_k$, is considered to be a compromised sensor node that had gained a high reputation value due to its normal behavior in the previous query responses up to $Q_i$, where $i > 0$. Then, it tries to change the aggregation results by reporting a value twice as large as the true value ($C^{read}_k$). In this scenario, the attack occurs while answering queries $[Q_i, Q_j]$, where $i < j$. However, the representative $C^{rep}_k$ wanted to ensure that its reputation value stayed above $Thr_R$, which helps extend the time required to recognize its malicious behavior. Thus, $C^{rep}_k$ chose the 1-per-2 strategy, in which $C^{read}_k$ is altered once every two responses.

Figure 4.8: The third scenario of RSDA evaluation in which dataset-4 is used

Figure 4.6-a depicts the data collected by $C^{rep}_j$, where it is obvious that $C^{rep}_k$ has launched the OO attack in some query responses between $Q_i$ and $Q_j$. The $C^{rep}_k$ started attacking at $Q_i$ (where $i = 8$) and ended at $Q_j$ (where $j = 20$). Unfortunately, both the R-SUM and P-SUM aggregation results are badly affected by the OO attack. RSDA is not able to detect the attack because $C^{rep}_k$, after gaining a good reputation value (= 0.963) at $Q_{i-1}$, misbehaved in the aggregation activities only once every two query responses until $Q_j$. By applying Equation 4.8 to $C^{rep}_k$'s positive and negative feedback, its reputation value fluctuated but never became smaller than $Thr_R$, as shown in Figure 4.7-a. Due to the binary decision making approach employed in RSDA, $C^{rep}_k$ is still considered trusted as long as its reputation value is above $Thr_R$. Unfortunately, R-SUM and P-SUM are also badly affected when OO attacks are launched on the second half of the data (see Figure 4.8). In this case, the adversary is able to maintain an even higher reputation value than what could be obtained if the first half of the data was the target (see Figure 4.7-b).

4.7 Security Analysis

This section applies the same methodology used in Chapter 2 and Chapter 3 in order to analyze the security of RSDA. It first studies the existence of the reputation components discussed in Section 3.1. After that, it investigates the security services that RSDA provides. Finally, it studies the resilience of RSDA against the attacks discussed in Sections [...] and [...].

Reputation Components

According to the analysis framework discussed in Section 3.1, RSDA addresses each of the following four phases: information gathering and sharing, information modeling, decision making, and dissemination. The reputation information is gathered by a cell member based on its direct observations of and experience with other cell members. RSDA uses a monitoring mechanism similar to the watchdog mechanism (WDM) at each cell member as an approach to collect these direct observations.
Each cell member uses its direct observations to calculate reputation values for other neighboring nodes. The Bayesian trust and reputation model, which is a probabilistic approach according to the discussion in Section 3.1, is used at each cell member to model these direct observations and convert them into reputation values. This means that a distributed information modeling structure is implemented in RSDA. RSDA employs a binary decision metric when it evaluates the reputation value of the cell representative. Thus, the cell representative is considered trusted if its reputation value is equal to or greater than a threshold value ($Thr_R$); otherwise, it is considered distrusted. If the reputation value indicates that the representative is trustworthy, it continues acting as a cell representative. Once the cell representative's reputation value falls below $Thr_R$, each cell member that detected the drop starts the process of replacing the cell representative. When the revocation mechanism is initiated, the processed reputation information propagated between sensor nodes is only negative feedback, namely the low reputation value of the misbehaving cell representative. This type of propagation, according to the discussion in Section 3.1, is considered a reactive form of dissemination, since reputation values are propagated after the occurrence of an event, being

Table 4.4: Reputation components in current reputation-based trust systems

Phase 1 (Gathering & Sharing): Source, WDM, Type, Scope. Phase 2 (Calculation): Structure, Approach. Phase 3 (Decision): Metric. Phase 4 (Dissemination): Structure, Approach.

| Schemes | Source | WDM | Type | Scope | Calculation Structure | Calculation Approach | Decision Metric | Dissemination Structure | Dissemination Approach |
|---|---|---|---|---|---|---|---|---|---|
| Michiardi & Molva [81] | D/I | Y | + | R | Di | ? | ? | Di | Re |
| Buchegger & Boudec [14] | D/I | Y | - | R | Di | ? | B | Di | Re |
| Ganeriwal & Srivastava [46, 47] | D/I | Y | + | G | Di | Pr | B | Di | P |
| Srinivasan et al. [117] | D | Y | +,- | M | C | Pr | ? | C | Re |
| Boukerche et al. [13] | D | N | +,- | G | P2P | ? | ? | P2P | Re |
| Yao et al. [137, 138] | D/I | Y | +,- | G | Di | De | Disc | Di | Re |
| Shaikh et al. [113, 114] | D/I | ? | +,- | G | H | De | Disc | H | P, Re |
| Özdemir [91, 92] | D/I | Y | +,- | A | Di | Pr | B | Di | P |
| Boukerche & Ren [12, 103] | D/I | ? | +,- | M | C | De | B | C | P, Re |
| Chen et al. [26] | D | Y | +,- | R | Di | Pr | B | Di | P |
| Chen [25] | D | Y | ? | G | Di | Pr | ? | ? | ? |
| Xiao et al. [134] | D/I | ? | +,- | G | Di | Pr | B | Di | ? |
| Srinivasan et al. [118] | D/I | Y | +,- | L | Di | De | ? | Di | Re |
| Alzaid et al. (RSDA) [5] | D | Y | - | A | Di | Pr | B | Di | Re |

Legend: D = Automatic Direct; I = Automatic Indirect; + = Positive Feedback; - = Negative Feedback; WDM = Watchdog Mechanism; Y = Yes; N = No; R = Routing Misbehavior; G = Generic Misbehavior; M = Mobility; L = Localization Misbehavior; A = Aggregation Misbehavior; C = Centralized; Di = Distributed; H = Hybrid; P2P = Peer to Peer; Pr = Probabilistic; De = Deterministic; B = Binary; Disc = Discrete; Re = Reactive; P = Proactive; ? = Not available.

that the reputation value fell below $Thr_R$. As a result, Table 3.1 is updated by considering the discussion above and adding RSDA's information, as in Table 4.4.

Security Services

This section extends the earlier discussion of security services by considering those provided by RSDA. RSDA is one of the few schemes that consider data availability for secure data aggregation. It detects inconsistency in aggregation results (data integrity) by monitoring the aggregation behavior of a cell representative using the WDM. In contrast with most of the secure aggregation schemes discussed in Chapter 2, RSDA takes further action once an inconsistency in the aggregation results has been detected. It punishes the cell representative that caused the inconsistency by reducing its reputation value. Once the cell representative's reputation value falls below $Thr_R$, the revocation mechanism is initiated to prevent this representative from participating in the network by blacklisting it. Then, a new trustworthy sensor node is selected as the next candidate to represent the cell. Blacklisting misbehaving cell representatives helps prolong the network lifetime (data availability). It stops the forwarding of packets from a malicious cell representative, which saves the energy that upstream cells would otherwise spend on receiving, processing, and then sending these packets.

Table 4.5: Security services provided in current secure data aggregation protocols

| Protocol | AT |
|---|---|
| Sanli et al. [110] | II |
| Castelluccia et al. [17] | II |
| Westhoff et al. [131] | III |
| Hu & Evans [58] | III |
| Przydatek et al. [99] | III |
| Chan et al. [22] | III |
| Du et al. [40] | III |
| Mahimkar & Rappaport [75] | III |
| Yang et al. [136] | III |
| Jadia & Mathuria [62] | III |
| Frikken & Dougherty [43] | III |
| Haghani et al. [53] | III |
| Alzaid et al. (RSDA) [5] | III |

Legend: CO = Confidentiality; IN = Integrity; FR = Freshness; AV = Availability; AU = Authentication; AT = Adversary Type.
RSDA also provides data freshness, because each aggregation query sent by the base station carries a unique ascending query number. This query number is included in all subsequent forwarding activities until a reply to the aggregation request is received at the base station.

Data-origin authentication is ensured by attaching a hashed copy of the required information to the packet payload, as in Equation 4.4. Finally, data confidentiality can also be offered. A cell member shares an intra-cell key with its cell members and inter-cell keys with the members of adjacent cells. These keys can be used to encrypt data traveling across the network. However, data confidentiality is not considered in this thesis, because we focus only on data accuracy and data availability to defeat a type III adversary, as discussed in Section 4.3. Consequently, Table 4.5 represents an updated version of Table 2.1 after considering the security services provided by RSDA.

Attacks Resilience

The adversarial model discussed in Section 4.3 is classified as a type III adversary, as discussed in Section 2.2.2, because the adversary has limited computational power and can compromise up to $W$ nodes in the deployment area with no more than $t - 1$ compromised nodes in a cell. This section examines whether RSDA is vulnerable to the attacks discussed earlier, including those in Section 3.2. It discusses first the WSN-related attacks (WSN-attacks) and then the reputation-related attacks (Reputation-attacks).

WSN-Attacks

Each cell member in RSDA is equipped with the WDM to monitor the behavior of neighboring members, which helps RSDA resist these attacks. When a compromised cell representative $C_i^{rep}$, for example, selectively stops forwarding some packets to an upstream cell representative, its cell members evaluate $C_i^{rep}$'s behavior and subsequently update its reputation value by increasing the negative feedback parameter $\beta_F^{C_i^{rep}}$. Unfortunately, Selective Forwarding attacks still cause partial damage to RSDA, although cell members are able to detect the attack through the WDM.
This is because RSDA uses a binary decision approach, as mentioned in Section 4.7.1, when it evaluates the trustworthiness of a specific cell representative. If $C_i^{rep}$'s updated reputation value is still above $Thr_R$, $C_i^{rep}$ continues acting as the cell representative for $C_i$. Thus, the adversary can keep launching the Selective Forwarding attack as long as its reputation value is above $Thr_R$, which keeps its trust state as trusted. The damage is considered partial because adjusting the threshold value or applying mechanisms such as an aging factor and weighting can help to defeat this attack.

RSDA is also partially damaged by Spoofed Data attacks once a cell member is compromised. Suppose that $C_i^{rep}$ is compromised: an adversary could then inject an invalid aggregation result into the network. However, other cell members in $C_i$ are able to detect this deviation in the aggregation via the WDM. Due to the binary decision approach employed in RSDA, the Spoofed Data attack can still partially affect the system despite this detection. The reasoning given above for why the damage caused by the Selective Forwarding attack is partial also applies to the Spoofed Data attack.

Importantly, RSDA is robust against Sybil attacks, because it is assumed that each cell member has a unique ID and has pre-deployment knowledge about neighboring members in

Table 4.6: Attacks vulnerabilities in current reputation-based trust systems

Columns: WSN attacks (SF, SY, SD, RE) and reputation attacks (BM, BS, OO, NE).

Schemes: Westhoff et al. [131]; Hu & Evans [58]; Przydatek et al. [99]; Chan et al. [22]; Du et al. [40]; Mahimkar & Rappaport [75]; Sanli et al. [110]; Yang et al. [136]; Jadia & Mathuria [62]; Castelluccia et al. [17]; Frikken & Dougherty [43]; Haghani et al. [53]; Michiardi & Molva [81]; Buchegger & Boudec [14]; Ganeriwal & Srivastava [46, 47]; Srinivasan et al. [117]; Boukerche et al. [13]; Yao et al. [137, 138]; Shaikh et al. [114]; Shaikh et al. [113]; Özdemir [91]; Özdemir [92]; Boukerche & Ren [12, 103]; Chen et al. [25, 26]; Xiao et al. [134]; Srinivasan et al. [118]; Alzaid et al. (RSDA) [5].

Legend: SF = Selective Forwarding; SY = Sybil; SD = Spoofed Data; RE = Replay; BM = Bad Mouthing; BS = Ballot Stuffing; OO = On-Off; NE = Newcomer; - = Not Applicable; ? = Not Available. Ratings: Robust, Partial damage, Maximum damage.

the same cell. Any addition to the cell members should be done through the base station. Moreover, Replay attacks are mitigated by introducing query sequence numbers into each packet. If this query number is smaller than that of the last processed packet, which is stored at each cell member, the packet should be dropped.
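The replay check described above can be sketched as follows. This is a minimal illustration rather than RSDA's implementation: the class name `ReplayFilter`, the method `accept`, and the choice to start the stored number at -1 are assumptions made for the example.

```python
class ReplayFilter:
    """Per-node replay check based on ascending query numbers (illustrative)."""

    def __init__(self):
        # Query number of the last processed packet; -1 means "none yet" (assumption).
        self.last_processed = -1

    def accept(self, query_number: int) -> bool:
        """Return True if the packet is processed, False if it is dropped."""
        # Rule from the text: drop the packet when its query number is
        # smaller than that of the last processed packet.
        if query_number < self.last_processed:
            return False
        self.last_processed = query_number
        return True


node = ReplayFilter()
decisions = [node.accept(q) for q in (1, 2, 3, 2, 4)]
print(decisions)
```

In this run the fourth packet, which reuses query number 2 after query 3 has been processed, is the only one dropped. A stricter variant could also reject a query number equal to the stored one; the text only specifies the "smaller than" case.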

Reputation-Attacks

RSDA is robust against the Bad Mouthing and Ballot Stuffing attacks, because RSDA does not consider indirect observations in calculating the cell members' reputation values. As shown earlier, only direct observations, which are gathered by the WDM, are considered. Thus, the adversary has no chance of affecting the reputation calculation even after it has succeeded in compromising some members. Specifically, the adversary can neither provide unfair negative feedback for trustworthy cell members nor provide unfair positive feedback for distrusted cell members. RSDA is also robust against Newcomer attacks due to the network assumption that each cell member has pre-deployment knowledge about neighboring members in the cell. Consequently, the adversary cannot rejoin the network with a new ID, because new IDs can be added only by the base station. Unfortunately, On-Off attacks can cause maximum damage to RSDA, as discussed earlier. This is due to the binary decision making approach used in evaluating the reputation values. However, a solution to mitigate this attack is proposed in the next chapter. Table 4.6 combines Tables 2.2 and 3.2 and then adds the analysis of attack vulnerabilities in RSDA.

4.8 Summary

In this chapter, a reputation-based secure data aggregation scheme (RSDA) designed for WSNs was proposed. The significance of the proposal is four-fold: (i) it minimizes the use of heavy cryptographic mechanisms while designing a competitive secure data aggregation scheme, (ii) it outperforms other schemes by considering the WSN-related and Reputation-related attacks at design time, (iii) it is not limited to a single aggregation function, and (iv) it provides a dynamic response to attack activities by rejecting incorrect aggregation results as early as possible, possibly at nodes in the neighborhood, rather than at the base station level.
We believe that these differences ensure the main components of our definition of robust secure data aggregation discussed in Section 2.1. The security advantages provided by RSDA are realized by integrating aggregation functionalities with a reputation system. The chapter discussed the performance and security analysis of RSDA. In the performance analysis, RSDA was tested in three scenarios, depending on the adversary's capability to affect the aggregation results, as follows: (i) no attack on the data, (ii) abrupt change, and (iii) 1-per-2 strategy-based On-Off attacks. The first scenario helped determine the value of $Thr_A$, which was then used by cell members to evaluate the aggregation behavior of their cell representative. The second scenario investigated how RSDA handles abrupt changes. The experiment results showed that RSDA was affected by these changes until revocation requests from other cell members were received. Importantly, this effect was temporary, and RSDA reacted better to the change as soon as the reputation value of the misbehaving representative fell below $Thr_R$. The third scenario examined the effectiveness of RSDA in detecting On-Off attacks. Unfortunately, the experiment results showed that RSDA is badly affected by the On-Off attack. The compromised cell representative $C_k^{rep}$, after gaining a good reputation value

(0.963) at $Q_{i-1}$, behaved maliciously in the aggregation activities once every two query responses until $Q_j$. By applying Equation 4.8 to $C_k^{rep}$'s positive and negative feedback experiences, its reputation value fluctuated but never became smaller than $Thr_R$, as shown in Figure 4.7-a. Due to the binary decision making approach employed in RSDA, $C_k^{rep}$ is still considered trusted as long as its reputation value is above $Thr_R$. This problem will be addressed in the following chapter. Finally, the chapter applied the same methodology used in Chapter 2 and Chapter 3 to analyze the security of RSDA. It first studied the existence of the reputation components discussed in Section 3.1. After that, it investigated the security services that RSDA provides. Finally, it studied the resilience of RSDA to the attacks discussed earlier, including those in Section 3.2. In contrast with Özdemir's scheme, RSDA is robust against BM and BS attacks.


Chapter 5

Mitigating On-Off Attacks in Reputation-based Secure Data Aggregation for Wireless Sensor Networks

Several secure data aggregation schemes were designed to mitigate the effect of node compromise attacks and ensure data integrity. Most schemes can detect manipulation of aggregation results and then reject them at the base station. This gives a single compromised node the opportunity to drain the limited resources in the network. Reputation-based secure data aggregation schemes such as RSDA, as discussed in Chapter 4, go a step further by helping to identify compromised nodes as early as possible. This helps extend the network lifetime. However, adding the reputation component to the data aggregation protocol opens the door to more attacks, such as Bad Mouthing and On-Off attacks. Unfortunately, the experiment results in Chapter 4 showed that RSDA was badly affected by On-Off attacks.

The focus of this chapter is the ability to mitigate On-Off attacks, in which an adversary aims to disrupt the system's overall performance without being detected or excluded from the network. According to our discussion in Section 3.2, an adversary in an On-Off attack behaves normally until it gets a high reputation score. It then behaves maliciously at intervals in order to affect the aggregation results, and it extends the time required to detect its misbehavior by keeping its reputation score above a predefined threshold. The proposal in this chapter extends RSDA by adding an estimation theory and a change point detection mechanism. Through extensive simulations, we have shown that this addition helps defend against On-Off attacks and enhances data accuracy in the aggregation results. It does this without trimming the abnormal but correctly reported data, as suggested by Wagner [128].

Eliminating abnormal data with no further investigation is impractical, especially in applications developed for heterogeneous environments in which the distribution of normal physical phenomena in the deployment area may vary from one subset of sensor nodes to another. The significance of the proposal is two-fold: (i) it extends RSDA and mitigates the effect of the On-Off attack on aggregation results, and (ii) it considers non-homogeneous environments, which requires the ability to distinguish between abrupt and incipient changes. A comparative analysis of this chapter's proposal with RSDA, plain estimate, and reputation-based estimate shows its superior performance in mitigating the effect of the attack.

The rest of the chapter is organized as follows. Section 5.1 provides an overview of related work, including a brief overview of the techniques used in the proposal, namely estimation theory and change point detection. Section 5.2 explains the damage caused by the On-Off attack on RSDA and then provides details of the proposed solution to mitigate this attack. Section 5.3 describes the data used in evaluating the proposal and discusses the experiment's results. The solution is tested in four scenarios, depending on the adversary's capability to affect the aggregation results, as follows: (i) no attack on the data, (ii) abrupt and incipient change, (iii) 1-per-2 strategy-based On-Off attacks, and (iv) 1-per-3 strategy-based On-Off attacks. Finally, a summary is given at the end of the chapter.

5.1 Related Work

This section introduces the estimation theory and change point detection techniques, which are used in this chapter to mitigate the damage caused by the On-Off attack.

Estimation Theory

Estimation theory in statistics infers the value of a quantity of interest (or parameter) from indirect, inaccurate, and uncertain observations (or measurements) [28].
An estimator attempts to approximate the unknown parameter using the received observations. A concise block diagram that shows how the estimation theory can be applied in the data aggregation context is given in Figure 5.1. The measurement source box represents the sensing board in the sensor node, which senses the required physical phenomena. However, these phenomena are subject to alteration due to several factors such as faulty sensing boards, drained batteries, unreliable wireless links, and the existence of an adversary. These factors are represented as measurement error sources in Figure 5.1. The aggregator block is where the estimator has access to the collected data, e.g. $d_{x_1}, d_{x_2}, \ldots, d_{x_n}$. This collected data contains the sensors' true measurements and also the error sources. The aggregator, with help from its estimator function, has to predict the aggregation result $\hat{AR}$. In other words, the estimator takes the sensors' observations ($d_{x_i}$) as inputs, applies an estimation function to them, and then tries to obtain the best estimate of the aggregated result by filtering out the error sources that may be associated with these observations.
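As a rough illustration of the estimator described above, the sketch below uses the sample mean as the estimation function and computes the signed difference between the estimate and a reference value. The function names and the sample readings are hypothetical, and the sample mean is only one possible choice of estimator.

```python
from statistics import fmean


def estimate_aggregate(observations):
    """Estimate the aggregation result as the sample mean of the observations."""
    if not observations:
        raise ValueError("at least one observation is required")
    return fmean(observations)


def estimation_error(estimate, true_value):
    """Signed difference between the estimate and the true value."""
    return estimate - true_value


# Hypothetical readings; the last one carries a measurement error.
readings = [20.0, 21.0, 19.0, 20.0, 30.0]
ar_hat = estimate_aggregate(readings)
error = estimation_error(ar_hat, true_value=20.0)
print(ar_hat, error)
```

A single corrupted reading shifts the mean noticeably here, which is why the resulting error is later compared against a threshold rather than trusted as is.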

Figure 5.1: A simplified estimation model for data aggregation in WSNs

Let us assume that 2, 3, 4, 5, and 6 are the observations of five sensor nodes $x_1, x_2, \ldots, x_5$, respectively, which are collected by an aggregator. The aggregator then wants to estimate the aggregated result once the AVE aggregation function is applied to these observations. The estimator function for these five inputs ($n = 5$) can be written as follows [133]:

$$\hat{AR} = \frac{1}{n}\sum_{i=1}^{n} d_{x_i} = \frac{d_{x_1} + d_{x_2} + d_{x_3} + d_{x_4} + d_{x_5}}{n} \qquad (5.1)$$

The estimate, on the other hand, is the particular value the estimator takes for a given set of data, which is $\hat{AR}_n$ in this case. Following the same example, the estimate can be calculated by substituting $d_{x_1}, d_{x_2}, \ldots, d_{x_n}$ with their values as follows:

$$\hat{AR}_5 = \frac{2 + 3 + 4 + 5 + 6}{5} = 4$$

This estimate is treated as the expected value for the next sensor observation $d_{x_6}$. In other words, $\hat{d}_{x_6} = \hat{AR}$. The estimation error $\varepsilon$ quantifies the amount by which the estimated value ($\hat{AR}$) differs from the true value $\theta$ ($\varepsilon = \hat{AR} - \theta$) [108]. The estimation error is then tested against a predefined threshold in order to check whether a change in the mean of the aggregation result is detected.

Change Point Detection

Let $(AR_i)_{1<i<k}$ be a sequence of independent aggregation results confirmed by an aggregator. Before the change occurs, the mean of the aggregation results, according to Equation 5.1, is equal to $\hat{AR}$, while it is no longer equal to $\hat{AR}$ after the change. This means that the statistical properties of the aggregation results have changed. The problem can be expressed as the ability to detect the change in the aggregation result and

then investigate whether or not it is necessary to adapt the estimator function in order to incorporate the change. In statistics, this problem is known as sequential analysis. Change detection techniques for sequential analysis are classified into (i) offline and (ii) online techniques [2]. In the offline type, the process of collecting aggregation results must be finished before the technique is run. This means that the aggregator node has to maintain a copy of the whole observation sequence before making a decision about a change in the statistical features of the aggregation results. Unfortunately, offline change detection does not suit the unique characteristics of WSNs. As discussed in Section 1.2.1, a sensor node is a tiny device with only a small amount of memory and storage space for the code. For example, one common sensor type (MICA2) has 4K RAM, 128K program memory, and 512K flash storage [30]. This makes offline change point detection techniques unsuitable for WSNs. In the online type, the decision about a change point is made soon after an aggregation function is performed, in order to identify the change in the statistical features of the aggregation results as quickly as possible. We believe that online change point detection meets the unique requirements of WSNs, because it does not require large memory spaces to store old aggregation results [133, page 594]. Since we consider heterogeneous environments in which physical phenomena in some parts of the network depart from previously reported observations, the new estimate of the aggregation result will be affected. To overcome the lack of a complete model of aggregation results, a non-parametric CUSUM approach [7] is used.
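To illustrate why online detection suits memory-constrained nodes, the sketch below keeps only a running score and a reference mean instead of the full history of aggregation results. It follows the cumulative-sum form given in Equation 5.2; the class name, the threshold value, the use of the score's magnitude for the test, and the sample data are assumptions for illustration, not the thesis's parameters.

```python
class OnlineCusum:
    """Constant-memory CUSUM detector (illustrative sketch)."""

    def __init__(self, reference_mean: float, threshold: float):
        self.reference_mean = reference_mean  # estimate of the pre-change mean
        self.threshold = threshold            # hypothetical detection threshold
        self.score = 0.0                      # S_0 = 0

    def update(self, aggregation_result: float) -> bool:
        """Fold in one aggregation result; return True if a change is flagged."""
        # S_k = S_{k-1} + (AR_k - reference mean)
        self.score += aggregation_result - self.reference_mean
        # Assumption: flag a change when the score magnitude exceeds the threshold.
        return abs(self.score) > self.threshold


detector = OnlineCusum(reference_mean=20.0, threshold=5.0)
results = [20.0, 21.0, 19.0, 22.0, 23.0, 24.0]
flags = [detector.update(ar) for ar in results]
print(flags, detector.score)
```

Small deviations that would individually pass a per-query check accumulate in the score, so a sustained drift is eventually flagged while the node stores only two numbers.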
The CUSUM score, $S_k$, can be calculated as:

$$S_k = S_{k-1} + AR_k - \hat{AR}, \quad \text{where } S_0 = 0 \qquad (5.2)$$

The CUSUM score given in Equation 5.2 is tested against a predetermined threshold $Thr_A$ in order to investigate whether the statistical features of the aggregation results have changed. Once a change has been detected, a further investigation using stopping rules is required to verify whether it is an abrupt or incipient change, as will be discussed later.

5.2 The Proposed Enhanced Reputation-based Secure Data Aggregation Scheme

RSDA, which was presented in the previous chapter, integrates aggregation functionalities with the advantages provided by a reputation system in order to extend the network lifetime and enhance the accuracy of the aggregated data. However, RSDA is prone to the On-Off attack. Let us recall the adversary's behavior while launching the attack, which is discussed in Section 3.2. The adversary in RSDA, once it has succeeded in compromising any sensor $x$ in cell $C_k$, behaves normally until it gets a high reputation score; hence, it becomes eligible as the next cell representative $C_k^{rep}$. Once $x$ has been elected as $C_k^{rep}$ due to its good reputation value, it behaves maliciously intermittently in order to affect the aggregation results of $C_k$. Switching

Table 5.1: Description of notations used in Chapter 5

| Notation | Description |
|---|---|
| $K_1, K_2$ | Two network-wide shared keys. |
| $C_i$ | The $i$-th cell. |
| $K_{C_i}$ | Intra-cell key for the $i$-th cell. |
| $K_{C_{ij}}$ | Inter-cell key shared between the $i$-th and $j$-th cells. |
| $H(.)$ | Hash function. |
| $MAC_{K_{C_i}}$ | Message authentication code computed by using $K_{C_i}$. |
| $ADV$ | An adversary around the WSN. |
| $T$ | The number of nodes in each cell. |
| $W$ | The total number of compromised nodes in the whole deployment area. |
| $t$ | The minimum number of cell members that are required to revoke a misbehaving $C^{rep}$ or to confirm a new $C^{rep}$. |
| $x, y$ | Sensor nodes $x$ and $y$, respectively. |
| $p_x, p_y$ | The physical phenomena reported by sensor nodes $x$ and $y$, respectively. |
| $B$ | The base station. |
| $C_i^{read}$ | The reported (sensed) physical phenomenon from $C_i$. |
| $F$ | An aggregation function. |
| $AR_{C_i}^{Q_n}$ | An aggregation result for query number $Q_n$, which is obtained by applying $F$ at $C_i$. |
| $\hat{AR}_{C_i}^{Q_n}$ | A previous estimate of the aggregation result for query number $Q_n$, which is predicted at $C_i$. |
| $Q_n$ | A query number. |
| $R_{S/A/F}^{x}$ | Reputation value of sensor node $x$ for a Sensing/Aggregation/Forwarding functionality. |
| $\alpha_{S/A/F}^{x}$ | The number of correct behaviors of sensor node $x$ for a Sensing/Aggregation/Forwarding functionality. |
| $\beta_{S/A/F}^{x}$ | The number of incorrect behaviors of sensor node $x$ for a Sensing/Aggregation/Forwarding functionality. |
| $Thr_{A/S/R}$ | The pre-defined threshold for the Aggregation/Sensing/Reputation. |
| $C_i^{\#}$ | The number of inputs to the aggregation function. |
| $a_{Q_n}$ | The absolute deviation score at $Q_n$. |
| $g_{Q_n}$ | The CUSUM score at $Q_n$. |
| $\wedge, \vee$ | The AND and OR operators, respectively. |

between normal and anomalous behavior is important to ensure that the compromised node's reputation value is at least equal to the predefined reputation threshold $Thr_R$.
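The effect of this switching on a threshold-based decision can be illustrated with a small simulation. It is a sketch under stated assumptions: the reputation is taken to be the mean of a Beta distribution, (α + 1)/(α + β + 2), which may differ from Equation 4.8, and the starting counts and the threshold of 0.7 are hypothetical values chosen for illustration.

```python
def beta_reputation(alpha: int, beta: int) -> float:
    """Mean of Beta(alpha + 1, beta + 1); assumed reputation model."""
    return (alpha + 1) / (alpha + beta + 2)


def simulate_on_off(alpha: int, beta: int, responses: int, period: int = 2):
    """Misbehave once every `period` responses; return the reputation trace."""
    trace = []
    for k in range(responses):
        if k % period == 0:
            beta += 1   # altered response -> negative feedback
        else:
            alpha += 1  # honest response -> positive feedback
        trace.append(beta_reputation(alpha, beta))
    return trace


THR_R = 0.7  # hypothetical reputation threshold
trace = simulate_on_off(alpha=25, beta=0, responses=12)
still_trusted = all(r >= THR_R for r in trace)
print(round(min(trace), 3), still_trusted)
```

With these values the reputation never drops below the threshold, so a binary decision keeps treating the node as trusted even though half of its responses were altered.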
For example, $C_k^{rep}$ can alter the aggregation result for consecutive aggregation queries and stop just before its reputation value falls below $Thr_R$, since falling below the threshold would let other cell members in $C_i$ initiate the revocation mechanism in order to replace and blacklist this misbehaving cell representative. By doing this, the adversary affects the reported aggregation results and extends the time required to detect its malicious behavior. Sun et al. [119] discovered that using a fixed forgetting factor technique can facilitate an adversary's mission in launching On-Off attacks against a reputation-based trust system. The main idea behind the fixed forgetting factor technique is that performing $k$ good actions at time $t_1$ is equivalent to performing $k\beta^{t_2 - t_1}$ good actions at $t_2$, where $0 < \beta \le 1$. Thus, Sun et

al. proposed a scheme inspired by a social phenomenon: it takes long-term interaction and consistently good behavior to build up a good reputation value, whereas only a few bad actions can ruin it. They mimic this phenomenon by introducing an adaptive forgetting factor to defend against OO attacks. In Sun et al.'s solution, the successful ($r$) and failed ($s$) interactions between two nodes at $t_2$ are updated as follows:

$$r_{t_2} = r_{t_1}\hat{\beta} + r_{t_2 - t_1} \quad \text{and} \quad s_{t_2} = s_{t_1}\hat{\beta} + s_{t_2 - t_1}, \quad \text{where } t_2 > t_1 \text{ and } \hat{\beta} = 1 - \frac{r_{t_1} + 1}{r_{t_1} + s_{t_1} + 2}$$

However, we found that Sun et al.'s solution is insufficient, because a single misbehavior of a trustworthy sensor node can bring its reputation value into the distrust category. This single misbehavior can be an undelivered message that occurs not because the sensor node intends to misbehave but because of unreliable wireless communication, which is common in WSNs. Let us assume that the reputation value of a sensor node $x$ results from 34 successful and 1 failed interactions at $t_1$, and that the behavior of sensor $x$ for the current activity (at $t_2$) is considered a misbehavior. If the predetermined threshold value for reputation is set to 0.7, then this single failure moves sensor node $x$ from the trust category to the distrust category. In other words, a single failure changes the security state of sensor node $x$ from trusted to distrusted.

In this section, a solution against such an attack is proposed by using a different approach from Sun et al.'s solution. To mitigate On-Off (OO) attacks in RSDA, a combination of the estimation theory and the online change point detection mechanism is suggested. This detection is based on measuring the deviation between the reputation-based aggregation and the estimate of the aggregation result.
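The single-failure example given for the adaptive forgetting factor can be worked through numerically. The sketch assumes that the reputation value is computed as (r + 1)/(r + s + 2) and that the forgetting factor is one minus that value, which is our reading of the update rule; the function names are hypothetical.

```python
def reputation(r: float, s: float) -> float:
    """Assumed reputation model: (r + 1) / (r + s + 2)."""
    return (r + 1) / (r + s + 2)


def adaptive_update(r: float, s: float, new_r: float, new_s: float):
    """Discount old counts by the adaptive forgetting factor, then add new ones."""
    forgetting = 1 - reputation(r, s)  # high reputation -> small factor
    return r * forgetting + new_r, s * forgetting + new_s


r1, s1 = 34, 1                      # history at t1 (from the example)
before = reputation(r1, s1)
r2, s2 = adaptive_update(r1, s1, new_r=0, new_s=1)  # one failure at t2
after = reputation(r2, s2)
print(round(before, 3), round(after, 3), after < 0.7)
```

Under these assumptions the value falls from 35/37 to 105/181, which is below the 0.7 threshold used in the example, so one lost message is enough to flip the node's state.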
The estimation theory helps to obtain the estimated value of the aggregation result by finding the mean of the aggregation results based on good historic data. The deviation from the mean helps an intermediate cell evaluate the behavior of its children cells, as will be discussed later. The evaluation result is incorporated into the information gathering and sharing phase of the reputation system as a direct observation of the intermediate cell (see Section 3.1). Consequently, cell representatives at intermediate cells are able to evaluate the aggregation behavior of the representatives of downstream (children) cells, as well as their forwarding behavior, as will be discussed below.

Since the proposal extends RSDA, it applies the same network assumptions, data model, and adversarial model. Consequently, the notations used in describing RSDA are also used in describing E-RSDA, with a few additions in the last three lines of Table 5.1. The forwarding and sensing behaviors are evaluated in the same way as in RSDA. However, the aggregation behavior is evaluated differently, depending on whether the evaluation is performed on the aggregation results of the same cell representative or on the aggregation results of other cell representatives, specifically those of downstream cells. In the former case, a cell member considers the overheard aggregation result, which is calculated by its $C^{rep}$, as normal if the difference between its own aggregation calculation and its $C^{rep}$'s calculation is bounded by a predefined threshold; otherwise, it is considered abnormal. In the latter case, an intermediate $C^{rep}$ compares the aggregation result, which it calculates itself based on the aggregated data reported by downstream cell representatives, with its prediction of the aggregation result, which is calculated based on the estimation theory. The aggregation behavior is considered normal if the difference is bounded by a predefined threshold; otherwise, it is considered anomalous. It is important to note that only the intermediate cell's duties are enhanced in this chapter to mitigate the OO attack; no modification has been made at leaf cells. The simplified deployment area represented in Figure 5.2 is used in the subsequent paragraphs to illustrate the modification made to the intermediate cell $C_j$.

Figure 5.2: A simplified deployment area for E-RSDA

At Intermediate Cells

The cell representative $C_j^{rep}$ is challenged to evaluate the aggregation behavior of its children cells, as it is not able to overhear all inputs to the aggregation functions they apply, which can be due to poor radio coverage or a limited authentication capability. For example, $C_j^{rep}$ in Figure 5.2 has no access to the shared key between cells $C_b$ and $C_k$ due to its geographic location. This limitation is addressed from the anomaly detection perspective. Most existing anomaly detection approaches follow a centralized architecture in which all the observed data are collected by a central entity. This architecture prevents in-network aggregation within the deployment area, which quickly depletes the limited energy resources at sensor nodes.
Thus, a distributed architecture for anomaly detection is preferable for WSNs due to its flexibility in applying in-network processing, which helps reduce communication energy consumption at intermediate cells. The estimation theory, online change point detection, and stopping rules are proposed, respectively, to predict the future aggregation result, detect the deviation from the mean of the previous aggregation results, and verify the nature of the detected change at intermediate cells. The estimator helps the representative of an intermediate cell to predict the aggregation result for the next query number with consideration

to previously accepted aggregation results. Then, the reputation-based aggregation result is compared with the estimated value to detect any major change in the aggregation behavior of the children cell representatives, while the cumulative sum (CUSUM) score is evaluated to detect small deviations. Once a deviation has been detected, further investigation is needed to identify the nature of the change, which could be due to OO attacks, abrupt changes, or temporary changes in the environment. In this regard, each intermediate cell performs the following tasks:

• The aggregation function: Let $C_j^{rep}$, for example, apply the average aggregation function (AVE) on the received readings in order to answer $Q_n$ as follows:

$$AR_{C_j}^{Q_n} = AVE\left(C_1^{read}, C_2^{read}, \dots, C_i^{read}, \dots, C_j^{read}\right) = \frac{R_{C_1^{rep}} C_1^{read} + R_{C_2^{rep}} C_2^{read} + \dots + R_{C_i^{rep}} C_i^{read} + \dots + R_{C_j^{rep}} C_j^{read}}{C_1^{\#} + C_2^{\#} + \dots + C_i^{\#} + \dots + C_j^{\#}} \tag{5.3}$$

• The recursive estimation function: The aggregation result at the intermediate cell can be estimated recursively. In other words, the estimate of the aggregated result of cell $C_j$ ($\widehat{AR}_{C_j}^{Q_n}$) depends on its previous estimate value and the current aggregation result. There is no need to keep all old aggregation results in order to detect changes in the aggregation results. The recursive form of the estimation is more practical for real-time applications in WSNs, because it does not require large memory space to store old aggregation results [133, page 594].
The new estimate of the aggregated data, which answers $Q_n$, is calculated as follows:

$$\widehat{AR}_{C_j}^{Q_n} = \frac{Q_n - 1}{Q_n}\,\widehat{AR}_{C_j}^{Q_{n-1}} + \frac{1}{Q_n}\,AR_{C_j}^{Q_n} \tag{5.4}$$

which can be further rewritten as:

$$\widehat{AR}_{C_j}^{Q_n} = \widehat{AR}_{C_j}^{Q_{n-1}} - \frac{1}{Q_n}\,\widehat{AR}_{C_j}^{Q_{n-1}} + \frac{1}{Q_n}\,AR_{C_j}^{Q_n} \tag{5.5}$$

By combining the last two terms of the right-hand side of Equation 5.5, we get

$$\widehat{AR}_{C_j}^{Q_n} = \widehat{AR}_{C_j}^{Q_{n-1}} + \frac{1}{Q_n}\left(AR_{C_j}^{Q_n} - \widehat{AR}_{C_j}^{Q_{n-1}}\right) \tag{5.6}$$

The difference between the reputation-based aggregation result and the estimate of the aggregation result is called the residual. The basic idea here is to compare the current reputation-based aggregation result with the estimated aggregation result $\widehat{AR}_{C_j}^{Q_n}$ in order to measure the scatter or spread of the aggregation results in a series of aggregation queries. We use the absolute deviation, which is the absolute difference between the current reputation-based aggregation result and the estimate of the aggregation result, to measure the magnitude of varying aggregation results as follows:

$$a_{Q_n} = \left| AR^{Q_n} - \widehat{AR}^{Q_n} \right| \tag{5.7}$$

If the absolute deviation score ($a_{Q_n}$) is greater than a threshold $Thr_A$, then a major change in the mean of the aggregation results is detected. This change can be either an abrupt or

incipient change in the aggregation results, which has to be investigated. Thus, the decision rule at this stage can be expressed as:

$$d^{1}_{Q_n}(a_{Q_n}) = \begin{cases} \text{normal} & a_{Q_n} \le Thr_A \\ \text{alarm} & a_{Q_n} > Thr_A \end{cases} \tag{5.8}$$

Unfortunately, an adversary with a reasonable reputation value can slightly affect the aggregation result with small deviations (less than $Thr_A$) in order to manipulate the estimate calculation in Equation 5.6. This makes the change pass the absolute deviation test and be classified as normal in Equation 5.8. Thus, the CUSUM is used to compute the cumulative sum of the differences between reputation-based aggregation results and estimate values. According to Equation 5.2, the CUSUM score ($g_{Q_n}$) can be represented as:

$$g_{Q_n} = g_{Q_{n-1}} + \left(AR^{Q_n} - \widehat{AR}^{Q_n}\right), \quad \text{where } g_{Q_0} = 0$$

This CUSUM score is then compared with the predefined threshold $Thr_A$ to identify whether the small deviations have accumulated in a way that affects the aggregation results. Due to heterogeneous environments that lack a complete model of the physical phenomena, it is difficult to compute $g_{Q_n}$ since no prior information about the underlying process distribution is available. One way to solve this problem is to use a non-parametric approach, which does not make any assumptions about the underlying process probability distribution. In the case of a non-parametric CUSUM algorithm [7], the corresponding decision rule can be expressed as:

$$d^{2}_{Q_n}(g_{Q_n}) = \begin{cases} \text{normal} & -Thr_A \le g_{Q_n} \le Thr_A \\ \text{alarm} & \text{otherwise} \end{cases} \tag{5.9}$$

If the CUSUM score falls in the range $[-Thr_A, Thr_A]$, then the aggregation behavior is considered normal. However, if the CUSUM score is outside the range $[-Thr_A, Thr_A]$, then an alarm is raised, indicating that small deviations have accumulated which may or may not affect the estimator function and, in turn, the aggregation result.
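The estimator and the two threshold tests described above can be sketched together in a few lines of Python. This is only an illustrative sketch, not the thesis's simulation code: the class name `ChangeDetector` and its fields are hypothetical, the first result is used to initialise the estimate, and the residual is taken against the estimate held before the current result is folded in (an assumption about the indexing in Equation 5.7).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ChangeDetector:
    """Recursive mean estimate with absolute-deviation and CUSUM tests."""
    thr_a: float            # threshold Thr_A shared by both tests
    estimate: float = 0.0   # running estimate of the aggregation result
    cusum: float = 0.0      # CUSUM score, g_{Q_0} = 0
    n: int = 0              # number of queries processed so far

    def step(self, ar: float) -> Tuple[float, float, bool]:
        """Process one aggregation result and return (a, g, alarm)."""
        self.n += 1
        if self.n == 1:
            # Assumption: the first result simply initialises the estimate.
            self.estimate = ar
            return 0.0, 0.0, False
        # Residual against the estimate held before this result (assumption).
        residual = ar - self.estimate
        a = abs(residual)                         # absolute deviation score
        self.cusum += residual                    # cumulative sum of residuals
        alarm = a > self.thr_a or abs(self.cusum) > self.thr_a
        self.estimate += residual / self.n        # recursive update, Eq. 5.6
        return a, self.cusum, alarm
```

Feeding the detector a run of stable values followed by two results that are each only slightly too high shows the division of labour between the tests: neither result trips the absolute deviation test on its own, but the CUSUM score crosses the threshold on the second one.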
A stopping rule is used as part of the change point detection algorithm, because no statistical assumptions on the input to the aggregation function are given. Furthermore, whereas any change in the mean of the aggregation is considered abrupt in the change detection method, it can be either abrupt or incipient in the stopping rule method [52, page 17]. The latter is what can be expected in heterogeneous environments. Figure 5.3 summarizes the process that should be performed by any intermediate cell. The intermediate cell representative, $C_j^{rep}$, receives aggregation results from its children cells. It performs reputation-based aggregation as described in Equation 5.3. Subsequently, it calculates the absolute deviation score and the CUSUM score, which are subject to a threshold test with $Thr_A$. These two scores ($a_{Q_n}$, $g_{Q_n}$) are considered error indicators, and based on them, the change in the mean of the aggregation results is detected. If the error indicators are less than the threshold, then no change in the mean of the aggregation results is detected, because the reputation-based aggregation result is correlated closely enough with $C_j$'s prediction for the

aggregation result. After that, $C_j^{rep}$ computes its prediction for the new aggregation result. In other words, the cell representative, $C_j^{rep}$, accepts the reported aggregated data from its children cells if $a_{Q_n} \le Thr_A \wedge g_{Q_n} \in [-Thr_A, Thr_A]$. Then, $C_j^{rep}$ updates the reputation values of its children cell representatives by increasing their $\alpha_A$ values, and computes its prediction for the next aggregation results. In contrast with Equation 4.9, $C_j^{rep}$ calculates the reputation value of its children cell representatives ($C_m$, $C_i$, and $C_k$ in Figure 5.2) by using the available reputation information about the forwarding and aggregation activities as follows:

$$R_{C^{rep}} = \mu_2\,\frac{\alpha_A^{C^{rep}}}{\alpha_A^{C^{rep}} + \beta_A^{C^{rep}}} + (1 - \mu_2)\,\frac{\alpha_F^{C^{rep}}}{\alpha_F^{C^{rep}} + \beta_F^{C^{rep}}}, \quad \text{where } 0 < \mu_2 \le 1 \tag{5.10}$$

Figure 5.3: A simplified E-RSDA model

Once $C_j^{rep}$ has detected a change, it starts a fixed window (buffer) of size $S$, keeps a copy of the current estimate of the aggregation result before considering this detected change (temp_estimate), and computes the new estimate value considering this new change. During the window's lifetime, temp_estimate is always considered as $\widehat{AR}_{C_j}^{Q_n}$, since it is the last estimate value for the aggregation result before the change was detected. Then, $C_j^{rep}$ classifies the detected change into one of the following categories:

• OO Attack. An unpermitted deviation of a reputation-based aggregation result from the estimate of the aggregation result is detected if $(a_{Q_n} > Thr_A \wedge g_{Q_n} < Thr_A) \vee (a_{Q_n} < Thr_A \wedge g_{Q_n} > Thr_A)$ occurs $l$ times during the window length, where $l$ is the attack frequency in which the adversary misbehaves once per $l$ query responses. Once this unpermitted deviation is detected, it is classified as an OO attack, and then $C_j^{rep}$ updates $\beta_A$ for the node that caused this fault and resets the current estimate.
• Perturbation. A temporary departure of the aggregation result from the current estimate is detected if $a_{Q_n} > Thr_A \wedge g_{Q_n} \notin [-Thr_A, Thr_A]$ for $S$ consecutive responses. The difference between this type of change and the OO attack is that the detected change continues for the whole length of the window. This temporary departure can be either an abrupt or incipient change. Unfortunately, the absolute deviation and the CUSUM scores do not help $C_j^{rep}$ resolve the uncertainty in this scenario. Consequently, the detected change is combined with the revocation mechanism in RSDA as proposed in Chapter 4. Thus, the detected change is considered a non-fault change (or perturbation) if fewer than $t$ revocation notifications/requests have been received for the child cell representative that caused the change. Then, $C_j^{rep}$ updates $\alpha_A$ for the cell representative that caused the change in the mean of the aggregation result and treats the temporary departure as a change in the physical phenomena by resetting the estimator function.

• Failure. This is similar to the perturbation type except that the detected change is associated with revocation requests for the child cell representative that caused the change. Revocation requests can be received at any time during the window's lifetime. Once the detected change has been classified as a failure, the revocation mechanism should be completed.

Table 5.2: Data sets used in the experiment evaluation

Scenario    Dataset    Description            Duration  Frequency  # Attacks
Scenario 1  Dataset-1  No Attacks
Scenario 2  Dataset-2  Abrupt Change
            Dataset-3  Incipient Change
Scenario 3  Dataset-4  1-per-2 OO - F. Block
            Dataset-5  1-per-2 OO - L. Block
Scenario 4  Dataset-6  1-per-3 OO - F. Block
            Dataset-7  1-per-3 OO - L. Block
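The three categories of detected change (OO attack, perturbation, failure) can be illustrated with a deliberately simplified classifier. The sketch below is an assumption-laden reading of the rules rather than the thesis's algorithm: it collapses the two error indicators into one alarm flag per response, treats alarms that persist over the whole window as a sustained change, and treats intermittent alarms as an OO attack. The function name `classify_window` and its parameters are hypothetical.

```python
from typing import Sequence


def classify_window(alarms: Sequence[bool], revocation_requests: int, t: int) -> str:
    """Classify a detected change from the alarms raised inside one window.

    alarms              -- one flag per query response in the window of size S
    revocation_requests -- requests received for the suspected child representative
    t                   -- number of requests needed to start the revocation
    """
    if not any(alarms):
        return "normal"
    if all(alarms):
        # Sustained change: revocation requests separate a failure from a
        # genuine change in the environment.
        return "failure" if revocation_requests >= t else "perturbation"
    # Alarms that come and go within the window (simplifying assumption).
    return "oo_attack"
```

For example, `classify_window([True, False, True, False], 0, 3)` yields `"oo_attack"`, while a window of uninterrupted alarms yields `"perturbation"` or `"failure"` depending on whether `t` revocation requests arrived.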
After that, $C_j^{rep}$ sets $C_j^{\#}$ to be the sum of the received counters $C_1^{\#}, C_2^{\#}, \dots, C_i^{\#}, \dots, C_j^{\#}$ and then forwards $AR_{C_j}^{Q_n}$ to the upper cell representative $C_z^{rep}$, in the abstract network model in Figure 5.2, with the following packet format: $\{C_j^{rep}, C_z^{rep}, Q_n, Payload\}$, where

$$Payload = AR_{C_j}^{Q_n} \parallel C_j^{\#} \parallel MAC_{K_{C_{jz}}}\left(C_j^{rep} \parallel Q_n \parallel AR_{C_j}^{Q_n} \parallel C_j^{\#}\right)$$

Other nodes in cell $C_j$ are still able to monitor the aggregation and forwarding behavior of $C_j^{rep}$ in the same way as discussed in Chapter 4.

5.3 Experiment Evaluation

This section evaluates the effectiveness of the proposed solution in distinguishing between abrupt and incipient changes in aggregation results, and in defeating On-Off (OO) attacks. This evaluation is based on the seven datasets listed in Table 5.2. The first dataset (dataset-1) is a real-life dataset, and the other datasets are modified versions of dataset-1. The real-life dataset, dataset-1, was captured from 54 Mica2Dot sensor nodes which were deployed at the Intel Berkeley Research Laboratory (IBRL) during the period from February 28, 2004 to April 5, 2004. These sensors collect five measurements: light in lux, temperature in degrees Celsius, humidity ranging from 0% to 100%, voltage in volts, and network topology information, once every 30 seconds. For the purpose of this chapter, only the temperature data has been extracted.

To evaluate the performance of E-RSDA, the abstract network model in Figure 5.2 was considered. Therefore, the customized test-bed, which was built in Chapter 4 in order to evaluate RSDA, is extended in this section. The preceding equations and the E-RSDA model, which is depicted in Figure 5.3, have been translated into the simulation code that is intended to be run at intermediate cell representatives. This allows representatives of intermediate cells to calculate the absolute deviation and CUSUM scores. The evaluation studies the aggregation behavior at the representative node of the intermediate cell $C_j$. $C_j^{rep}$ receives inputs $AR_{C_i}^{Q_n}$, $AR_{C_m}^{Q_n}$, and $AR_{C_k}^{Q_n}$ to the aggregation function from its children cells $C_i$, $C_m$, and $C_k$, respectively. E-RSDA is tested in four separate scenarios as follows:

• First Scenario: The dataset used in this scenario is genuine, as captured in IBRL, with no attacks on or alteration to the aggregated data. For simplicity, only temperature data is extracted from the IBRL dataset. This scenario helps in determining the value of $Thr_A$, which will then be used by the cell members to evaluate the aggregation behavior of their cell representative.

• Second Scenario: The dataset used in this scenario is artificial, because the original IBRL dataset does not contain anomalous data. Therefore, we modified 28 consecutive query responses of a specific cell representative, $C_k$, by multiplying the true value of $AR_{C_k}^{Q_n}$ by 2 in order to cause the abrupt and incipient changes in dataset-2 and dataset-3.
E-RSDA needs to investigate this continuous injection of suspicious data and distinguish between an abrupt change caused by an adversary and an incipient change caused by changes in the physical phenomena in heterogeneous environments. In the former, the reputation value of the cell representative $C_k^{rep}$, which caused the abnormality in the aggregation result, will fall under a predefined threshold $Thr_R$. Once this fall has been detected by the cell members in $C_k$, they should send revocation messages to their adjacent cell representatives, in order to replace their representative with another node that has a better reputation value. In the latter, the revocation request should not be sent, since the abnormality in the aggregation results has occurred due to a change in the physical environment, which affects all the cell members in $C_k$, not only $C_k^{rep}$.

• Third Scenario: The dataset in this scenario is a modified version of dataset-1, in order to mimic OO attack behavior. Depending on the attack frequency $l$, the adversary's attacking methodology is to misbehave for $k$ queries every $l$ query responses. The attack frequency ($l$) in this scenario is 2 and the attack duration ($k$) is 1, which means that an attack is launched for one query and is repeated every two queries (the 1-per-2 strategy). The effectiveness of E-RSDA is evaluated when the 1-per-2 OO attack is launched in either the first half or the second half of the data, as in dataset-4 and dataset-5, respectively (see Table 5.2).

• Fourth Scenario: This scenario repeats the previous scenario after changing the attack frequency and following the 1-per-3 strategy instead of the 1-per-2 strategy. Dataset-6 and dataset-7 represent modified versions of dataset-1 where the 1-per-3 OO attack is launched in the first and the second half of the dataset, respectively. The OO attack is launched five times in these two datasets.

For all these scenarios, the sum aggregation function (SUM) is applied, and the SUM aggregation results of E-RSDA are compared with the SUM aggregation results of: (i) RSDA, (ii) the estimate of the raw SUM aggregation (Plain Estimate), and (iii) the estimate of the reputation-based SUM aggregation results (Reputation-based Estimate). The Reputation-based Estimate (R.E) refers to the expected aggregation value that is calculated based on the reputation-based observations, whereas the Plain Estimate (P.E) refers to the expected aggregation value that is calculated based on the observations without considering the reputation values. Note that other aggregation functions such as average (AVE), minimum (MIN), and maximum (MAX) can be employed with very few modifications. However, the discussion in this section is limited to the SUM aggregation function only.

The horizontal axis in all the subsequent plots represents the query number that is answered by the cell representatives, and the vertical axis represents the temperature captured/aggregated by the cell representatives. Also, node-1 represents $C_k^{rep}$ in the abstract network model in Figure 5.2, while node-2 and node-3 represent $C_i^{rep}$ and $C_m^{rep}$, respectively.

Scenario 1: No Attacks
As discussed above, the dataset used in this scenario is as captured in IBRL and contains no malicious data. The purpose of this scenario is to find a suitable value of $Thr_A$, such that the variation in attack-free aggregation results does not exceed it. The value of $Thr_A$ helps the cell representative of an intermediate cell to detect any change in the aggregation results (see Equations 5.8 and 5.9). It is observed that the maximum value of the absolute deviation score ($a_{Q_n}$) is 2.52, while the maximum value of the CUSUM score ($g_{Q_n}$) is 3.58, which suggests setting $Thr_A$ to 3.58. According to Equations 5.8 and 5.9, a change is detected if $a_{Q_n} > 3.58$ or $g_{Q_n} \notin [-3.58, 3.58]$. Figure 5.4-a depicts the behavior of the data collected by $C_j^{rep}$ and Figure 5.4-b shows the SUM aggregation results of the collected data. As expected, E-RSDA behaves the same as the reputation-based aggregation in RSDA when there is no malicious activity. However, E-RSDA can detect malicious activity, such as OO attacks, that affects the aggregation result and that could not be detected by RSDA, as will be discussed in the following scenarios.
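The calibration step in this scenario, choosing $Thr_A$ from attack-free data, can be sketched as follows. This is a hedged illustration only: the function name `calibrate_threshold` is hypothetical, the estimate is the running mean of Equation 5.6 initialised with the first result, and taking the larger of the two observed maxima mirrors the choice of 3.58 reported above but is otherwise an assumption.

```python
from typing import Sequence


def calibrate_threshold(results: Sequence[float]) -> float:
    """Return the largest absolute deviation or CUSUM magnitude observed
    on attack-free aggregation results (a candidate value for Thr_A)."""
    estimate = results[0]   # assumption: first result initialises the estimate
    cusum = 0.0
    max_a = 0.0
    max_g = 0.0
    for n, ar in enumerate(results[1:], start=2):
        residual = ar - estimate
        max_a = max(max_a, abs(residual))   # worst absolute deviation so far
        cusum += residual
        max_g = max(max_g, abs(cusum))      # worst CUSUM magnitude so far
        estimate += residual / n            # running mean update
    return max(max_a, max_g)
```

A threshold obtained this way guarantees that the calibration data itself raises no alarm; whether it generalises to other deployments depends on how representative that data is.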

Figure 5.4: The first scenario of E-RSDA evaluation in which dataset-1 is used

Scenario 2: Abrupt or Incipient Change

The motivation behind this scenario is to investigate how E-RSDA distinguishes between an abrupt and an incipient change. E-RSDA takes advantage of the processed reputation information, namely the revocation requests. When the reputation value of the representative of cell $C_k$ falls below a predetermined reputation value due to its malicious behavior, members of the same cell, $C_k$, send revocation messages to adjacent cell representatives, such as $C_j^{rep}$, $C_i^{rep}$, and $C_b^{rep}$ in the abstract network model in Figure 5.2. Once at least $t$ revocation messages have been received at an adjacent cell representative, the revocation process is initiated and a

replacement for the misbehaving representative, $C_k^{rep}$, is required. To simulate an abrupt change in the dataset, $C_k^{rep}$ was considered a compromised node that had gained a high reputation value (= 0.95) due to its normal behavior in previous query responses up to $Q_i$, where $i > 0$. From query number $Q_i$ onward up to $Q_j$, where $j > i$, $C_k^{rep}$ behaved maliciously by reporting $C_k^{read}$ as twice as large as the true data. In other words, the original dataset, dataset-1, was modified by multiplying the true $C_k^{read}$ by 2 for all query responses between $Q_i$ and $Q_j$ inclusive (where $i = 19$ and $j = 46$); the result was named dataset-2.

Figure 5.5: The second scenario of E-RSDA evaluation in which dataset-2 is used

Figure 5.6: The second scenario of E-RSDA evaluation in which dataset-3 is used

Figure 5.5-a depicts the data collected by $C_j^{rep}$, in which a change in the data reported by $C_k^{rep}$ is obvious. However, in Figure 5.5-b the change ended at $Q_p$, where $p = 23$, rather than at $Q_{46}$. The change ended at $Q_p$ because $t$ revocation messages were received from members of $C_k$, claiming that the reputation value of $C_k^{rep}$ had fallen below $Thr_R$, which is set to 0.8. The value of $R_{C_k^{rep}}^{A}$ at $Q_i$ was 0.979, and it had dropped below $Thr_R$ by $Q_p$. As discussed in Section 4.5, each time the cell members disagree with the aggregation result calculated by their representative $C_k^{rep}$, they update $\beta_A^{C_k^{rep}}$ and then calculate $R_{C_k^{rep}}$ as in Equation 4.8. The consecutive malicious behavior between $Q_i$ and $Q_j$ increases the negative feedback amount of $C_k^{rep}$ by $Q_j - Q_i + 1$, which makes $R_{C_k^{rep}} < Thr_R$. Thus, the current $C_k^{rep}$ needs to be revoked

and a new representative should be elected. Note that $C_k^{read}$ at $Q_q$, where $q > p$, is slightly smaller than $C_k^{read}$ at $Q_f$, where $f < i$. This is because the newly elected $C_k^{rep}$ started with a reputation value lower than the one the old $C_k^{rep}$ had gained at $Q_f$.

Figure 5.5-b shows the SUM aggregation results of the collected data. Unfortunately, the aggregation results calculated by RSDA are affected by this abrupt change until the revocation requests are received at $Q_p$. Importantly, E-RSDA reacts better to this change. In contrast with RSDA, E-RSDA delays the effect of the detected change caused by $AR_{C_k}^{Q_n}$ and relies on the reputation-based estimate values during the window's lifetime. However, E-RSDA responds to the detected change faster than the plain estimate and the reputation-based estimate. Upon completing the revocation mechanism and removing the malicious $C_k^{rep}$, E-RSDA responds to the detected change by reinitializing the estimator, which explains the drop in the aggregation results at $Q_{30}$. Note that Figure 5.5 illustrates the E-RSDA behavior once a positive change has been detected; the absolute deviation score in Equation 5.7 ensures that a negative change is detected as well.

We now move from the abrupt change example to another example in which the physical phenomena in some parts of the network depart from previously reported observations, as in dataset-3. This departure causes a change in the aggregation results and needs to be investigated to ensure that this change is not an abrupt change. Figure 5.6-a shows a temporary departure in $C_k^{read}$ which lasts for 28 consecutive query responses. It is clear in Figure 5.6-b that the plain and reputation-based estimates of the aggregation result do not reflect the change in the environment and that their reactions to the detected change are slow.
RSDA performs well in this example because it applies the detected change to the aggregation results immediately. However, this fast response comes at the cost of being exposed to any abrupt change, as discussed for Figure 5.5. E-RSDA clearly behaves better than the plain and reputation-based estimates. However, compared with RSDA, it delays the effect of the detected change for the duration of the window. More specifically, $C_j^{rep}$ detected a change at $Q_i$ and then performed the same actions discussed above in the abrupt example. Since the departure in the environment affects almost all the cell members, no revocation requests are expected to be received during the temporary window. Thus, $C_j^{rep}$ reinitialized the estimator at $Q_p$, which explains why the aggregation results calculated by E-RSDA followed the reputation-based estimate values for query responses between $Q_i$ and $Q_j$, and afterwards followed RSDA (see Figure 5.6). The same behavior is repeated at the end of this departure.

Scenario 3: 1-per-2 Strategy On-Off Attack

Dataset-4 and dataset-5 are used in this scenario to investigate the effectiveness of E-RSDA in detecting OO attacks. The difference between these datasets is that the attack happens in the first half of the data in dataset-4, whereas it happens in the second half of the data in dataset-5.

Figure 5.7: The third scenario of E-RSDA evaluation in which dataset-4 is used

To simulate the OO attack, the cell representative, $C_k^{rep}$, is considered a compromised node that has gained a high reputation value due to its normal behavior in the previous query responses up to $Q_i$, where $i > 0$. Then, it tries to change the aggregation results by reporting $AR_{C_k}^{Q_n}$ as twice as large as its true value. In this scenario, the attack occurred while answering queries $[Q_i, Q_j]$, where $i < j$. However, the cell representative $C_k^{rep}$ wanted to ensure that its reputation value stayed above $Thr_R$, which helps extend the detection time required to recognize its malicious behavior. Thus, $C_k^{rep}$ chose the 1-per-2 strategy, in which $AR_{C_k}^{Q_n}$ is altered once every two query responses.
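The way such an OO attack trace can be produced from a clean series is easy to sketch. The helper below is illustrative only: the name `inject_on_off` and its parameters are hypothetical, positions are 0-based list indices rather than the thesis's query numbers, and altering the first response of each period (rather than some other one) is an assumption.

```python
from typing import List, Sequence


def inject_on_off(readings: Sequence[float], start: int, end: int,
                  period: int, factor: float = 2.0) -> List[float]:
    """Return a copy of `readings` in which one value per `period`
    responses inside [start, end] is multiplied by `factor`."""
    attacked = list(readings)               # leave the clean series untouched
    for q in range(start, end + 1, period):
        attacked[q] = attacked[q] * factor  # the "on" response of this period
    return attacked
```

With `period=2` this mimics the 1-per-2 strategy and with `period=3` the 1-per-3 strategy; for instance, `inject_on_off([1.0] * 10, 2, 7, 2)` doubles the values at positions 2, 4, and 6 only.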

Figure 5.8: Reputation values of $C_k^{rep}$ during the third scenario of E-RSDA evaluation

Figure 5.7-a depicts the data collected by $C_j^{rep}$, where it is obvious that $C_k^{rep}$ has launched the OO attack in some query responses between $Q_i$ and $Q_j$. $C_k^{rep}$ started attacking at $Q_i$ (where $i = 8$) and ended at $Q_j$ (where $j = 20$). Figure 5.7-b shows the SUM aggregation results of the collected observations. Unfortunately, the aggregation results calculated by RSDA are badly affected by the OO attack. This is because $C_k^{rep}$, after gaining a good reputation value (= 0.963) at $Q_{i-1}$, behaved maliciously every two query responses until $Q_j$. By applying Equation 4.8 to $C_k^{rep}$'s positive and negative feedback experiences, its reputation value fluctuated but never became smaller than $Thr_R$, as shown in Figure 5.8. Due to the binary decision-making approach employed in RSDA, $C_k^{rep}$ is still considered trusted

as long as its reputation value is above $Thr_R$. The plain and reputation-based estimates are less affected by the OO attack, because the new $AR_{C_k}^{Q_n}$ is given the least weight among the previous $AR_{C_k}^{Q_n}$ values (see Equation 5.4). This makes the plain and reputation-based estimate curves slow to converge with the RSDA curve. However, the effect of the OO attack persists even when the attack is over. For example, in Figure 5.7-b the plain and reputation-based estimate values for query responses after $Q_j$ are still affected by the OO attack.

Figure 5.9: The third scenario of E-RSDA evaluation in which dataset-5 is used
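The slow convergence of the estimates noted above can be read directly off the recursive form of the estimator. As a short worked step, assuming the estimator is the running mean of Equation 5.6 and writing $\delta$ for the deviation of the new result from the previous estimate (a symbol introduced here only for illustration):

```latex
\widehat{AR}_{C_j}^{Q_n} - \widehat{AR}_{C_j}^{Q_{n-1}}
  = \frac{1}{Q_n}\left(AR_{C_j}^{Q_n} - \widehat{AR}_{C_j}^{Q_{n-1}}\right)
  = \frac{\delta}{Q_n}
```

At $Q_n = 20$, for example, a single altered result moves the estimate by only $\delta / 20$. The same factor limits how quickly the estimate returns once the attack stops, which is consistent with the lingering effect seen after $Q_j$.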

Figure 5.10: The fourth scenario of E-RSDA evaluation in which dataset-6 is used

E-RSDA follows the reputation-based estimate behavior during the OO attack, but it reacts better once the attack is over. E-RSDA reinitializes the estimator as soon as the end of the OO attack is recognized (at $Q_j$). This ensures a quick convergence afterwards with the reputation-based aggregation results. Note that the absolute deviation score in Equation 5.7 ensures that negative changes caused by the OO attack are also detected.

The discussion in the previous paragraphs was dedicated to dataset-4; the following paragraphs move to dataset-5. The scenario in dataset-5 is similar to the scenario in dataset-4, except that the OO attacks happened in the second half of the data instead of the

first half (see Figure 5.9-a). However, the estimators in the plain estimate, reputation-based estimate, and E-RSDA solutions have a better understanding of the environment and the mean of the aggregation results. In other words, the estimators have enough experience with the aggregation results due to the reasonably large number of queries that were answered before $Q_i$, where the OO attack was detected (see Figure 5.9-b). Not surprisingly, the aggregation results calculated by RSDA are still badly affected by the OO attack, for the same reasons given in the discussion of the effect of the OO attack in dataset-4.

Figure 5.11: Reputation values of $C_k^{rep}$ during the fourth scenario of E-RSDA evaluation

Figure 5.12: The fourth scenario of E-RSDA evaluation in which dataset-7 is used

Scenario 4: 1-per-3 Strategy On-Off Attack

As discussed, the difference between this scenario and scenario 3 is the attack strategy. In this section, $C_k^{rep}$ follows the 1-per-3 strategy, in which $C_k^{rep}$ alters $AR_{C_k}^{Q_n}$ once every three query responses. Two datasets (dataset-6 and dataset-7) are used to evaluate the effectiveness of E-RSDA in detecting OO attacks. The attack happens in the first half of dataset-6 and in the second half of dataset-7.

The data collected by $C_j^{rep}$ are depicted in Figure 5.10-a, where it is obvious that $C_k^{rep}$ launched the OO attack while answering queries $[Q_i, Q_j]$, where $i < j$. $C_k^{rep}$ started attacking at $Q_i$ (where $i = 8$) and ended at $Q_j$ (where $j = 20$). As in scenario 3, the aggregation results calculated by RSDA are badly affected by the OO attack. After gaining a good reputation value (= 0.985) at $Q_{i-1}$, $C_k^{rep}$ behaved maliciously every three query responses until $Q_j$. Thus, $C_k^{rep}$'s positive and negative feedback are updated according to Equation 4.8, which causes fluctuations in its reputation value but still leaves it above $Thr_R$, as shown in Figure 5.11. In contrast with scenario 3, $C_k^{rep}$'s reputation value in RSDA is less affected by the OO attack, because the attack frequency $l$ is larger (the node misbehaves less often), which reduces the effect on the reputation calculation in Equation 4.8. The larger the attack frequency, the less the reputation value is affected, which makes detecting the OO attack harder.

In Figure 5.10-b, E-RSDA follows the reputation-based estimate behavior during the OO attack, but because it reinitializes the estimator when the attack is over, E-RSDA reacts better and ensures a quick convergence afterwards with the reputation-based aggregation results. The same discussion concerning E-RSDA applies to dataset-7, as shown in Figure 5.12.

5.4 Summary

This chapter focused on investigating the ability to mitigate On-Off attacks, where the adversary aims to disrupt the system's overall performance without being detected or excluded from the network. The significance of the proposal is two-fold: (i) it mitigates the effect of On-Off attacks on aggregation results, and (ii) it distinguishes between an abrupt change and a temporary departure in heterogeneous environments. The security advantages provided by this scheme are realized by integrating aggregation functionalities with: (i) a reputation system, (ii) estimation theory, and (iii) a change point detection mechanism.
The superior performance of the proposal (E-RSDA) in mitigating the effect of the On-Off attack has been demonstrated through a comparative analysis of the proposal of this chapter with RSDA, the plain estimate, and the reputation-based estimate. The effectiveness of the proposal in distinguishing between abrupt changes and incipient changes has also been shown. The experiment results showed that E-RSDA is able to detect On-Off attacks as long as the attack frequency is smaller than the buffer window size. The results showed that E-RSDA followed the reputation-based estimate behavior during the On-Off attack, but reacted better once the attack was over. E-RSDA reinitialized the estimator as soon as the end of the On-Off attack had been recognized. This ensured a quick convergence afterwards with the reputation-based aggregation results. To the best of our knowledge, E-RSDA is the only secure data aggregation scheme in the literature that is able to mitigate the On-Off attack. On the other hand, the plain and reputation-based estimates are less affected by the OO attack than the reputation-based aggregation results. This is because the new $AR_{C_k}^{Q_n}$ is given the least weight among the previous $AR_{C_k}^{Q_n}$ values, as in Equation 5.4. This makes the plain and reputation-based estimate curves slow to converge with the reputation-based aggregation results curve.

Unfortunately, E-RSDA is limited to detecting an On-Off attack launched from only one child cell. It would be interesting to extend the scheme to more complicated scenarios in which the On-Off attack is launched from more than one cell at the same time. The feasibility of turning the improved scheme into a lightweight distributed intrusion detection system for WSNs is another direction for future work. After the OO attack is detected, or the reputation value of the cell representative falls below a predefined threshold value, the cell representative needs to be replaced and prevented from interacting with the network. This can be done by updating the cell group key at all cell members except the misbehaving representative. Therefore, a secure key management scheme which helps distribute and renew both pairwise and cell keys to sensor nodes is discussed in the subsequent chapter.


Chapter 6
A Forward & Backward Secure Key Management in Wireless Sensor Networks

One of the most challenging security issues in WSNs is the physical compromise of sensor nodes, given the lack of tamper-resistant packaging [54]. By gaining physical access, an adversary can gain control of one or more sensor nodes and readily access sensitive information such as keys or passwords. The adversary can therefore easily obtain the plaintext of encrypted messages that are routed through the controlled nodes, which compromises data confidentiality. The adversary may also inject their own commodity nodes into the network by fooling legitimate nodes into believing that these commodity nodes are legitimate members of the network. Another adversary activity is launching a selective forwarding attack, in which the node under the control of the adversary selectively drops legitimate packets in order to affect the overall performance of the system [67].

According to the RSDA architecture shown in Figure 4.3, the base station can communicate with sensor nodes in two different ways:
• It can broadcast information/commands to a group of sensor nodes in a cell, especially when there is no indication of a node compromise in the group.
• It can unicast information/commands to a specific sensor node, which helps remove compromised sensor nodes from a particular group or cell. In other words, this option helps the base station revoke the group membership of compromised nodes.

Thus, a secure key management framework is needed to establish and update the cryptographic keys (group and pairwise keys) which are used to secure the two ways of communication discussed above. In the rest of this chapter, the terms group key and cell key are used interchangeably to describe a key that is shared between a base station and a group of sensors in a cell.

Figure 6.1: Classification of adversaries

This chapter proposes a secure key management scheme which helps distribute and renew pairwise and group (cell) keys to sensor nodes. This key management scheme replaces the intra-cell and inter-cell key setup discussed in the bootstrap phase in Chapter 4. The design idea of the proposed scheme is to combine Lamport's reverse hash chain with the usual (forward) hash chain to provide both backward and forward key secrecy. The pairwise key update protocol has only one version, whereas the group key update protocol comes in two variants. The first variant, FBSKM, has better performance results than the second variant, the enhanced FBSKM (E-FBSKM). However, FBSKM is subject to the Sandwich attack, in which the damage caused is limited to revealing old keys but not future keys. The second variant, E-FBSKM, resists this attack at the cost of a little extra energy consumption for communication and computation.

The rest of this chapter is organized as follows. Section 6.1 introduces a new model of adversary with which the key management can be evaluated. Section 6.2 discusses some of the related work. Section 6.3 explains the proposed key management protocol (FBSKM). Section 6.4 explains a new kind of attack, called the Sandwich attack, to which FBSKM is vulnerable. The section then provides details on the enhancements that should be made to FBSKM in order to defend against this attack. Section 6.5 analyzes the security of FBSKM and E-FBSKM. The security analysis covers how a compromised sensor node can recover its secure state, how past and future key secrecy are achieved in our proposals, and how much damage impersonation attacks can cause to our proposals. Then, the performances of FBSKM and E-FBSKM are analyzed and compared with those of Nilsson et al.'s scheme.
The performance analysis covers memory overhead, communication cost, and computation cost. Finally, a summary concludes the chapter.

6.1 Adversary Model and Security Concerns

When designing a key management protocol for WSNs, the most challenging security threat is node capture. The limited resources in sensor nodes make defending against this type of threat very difficult. Node capture translates into compromise of all the credentials stored in the sensor node. Furthermore, the adversary can compromise all software installed within the sensor node. However, the capabilities of the adversary fall short of compromising the base station, which has reasonable physical security. The purpose of this chapter is to design a key management scheme which is resilient to

node capture: i.e., a scheme that enables sensor nodes to recover their secure status even after they have been captured and then released back. Consequently, we are interested in what the adversary can do both when a node is captured, and after it is released back. Key disclosure is technically simple after the node has been captured [54]. The question is, what else must the adversary do to keep control of the node after the node has been returned to the field? The adversary will try to ensure that the node uses values of his choice for all cryptographic keys or keying materials. For this purpose, the adversary may try to modify software components (especially the random number generation component), and monitor all or part of the subsequent key update messages. In this regard, the following criteria are used to classify adversaries.
• Software compromise: The adversary can read and modify all the software code and configurations, including secret keys, installed in the sensor node. For example, once the adversary has succeeded in compromising a sensor node, the adversary can then alter any software installed in this node, especially the random number generator.
• Seamless monitoring: The adversary can carry out seamless monitoring of all the subsequent key update protocol exchanges. After compromising a sensor node, the adversary can keep monitoring every subsequent key update message within the network.

According to the above two criteria, adversaries are divided into four distinct types, as shown in Figure 6.1. Type I is the weakest adversary: capable of neither seamless monitoring nor software compromise. Type IV is the strongest: capable of both seamless monitoring and software compromise. Type IV is so powerful that it is unlikely that any practical cryptographic countermeasure for WSNs against this adversary can be devised.
The use of tamper-proof technology to deny physical access will be needed to cope with this type of adversary, but this is outside the scope of this thesis. The chapter's goal is to design a new key management scheme which uses only cryptographic countermeasures in order to defend against the other three types of attackers.

Having identified different types of adversaries, we have the following concerns with regard to node capture and the consequent disclosure of all the internal data of the captured node:
• Past key secrecy: The past keys should not be compromised.
• Future key secrecy: The future keys should not be compromised.

The requirement of resilience to node capture rules out the use of any long-term keys; the keys must change or evolve continuously over time, with old keys securely deleted. In other words, a key evolution scheme is required in order to achieve past/future key secrecy against the threat of node capture.

Terminology. To the best of our knowledge, the terms past/future key secrecy have never been used in previous literature. Similar terminology, including (perfect) forward secrecy and backward secrecy, has always been quite confusing. The term (perfect) forward secrecy goes back to Günther [51]. The original term assumes a long-term key and session keys established using that key, and means that the current session key is not compromised by future (thus, the expression "forward") exposure of the long-term key. This terminology has

a slightly different usage in the context of group key communication; it concerns the contamination of a group key at a particular time by the compromise of an older/newer group key. This inherent ambiguity led to the term backward secrecy. Some authors choose the term backward secrecy to mean the forward secrecy of other authors, and vice versa. To avoid all this confusion, we will use a new, more concrete expression: past/future key secrecy. The notation to be used in the rest of the chapter can be found in Table 6.1.

Table 6.1: Description of notations used in Chapter 6

Name                 Description
B                    Base station.
M                    Network manager.
N                    Sensor node.
$K_{BN}$             Shared pairwise key between B and N.
$s_0, t_0$           Pre-installed global secret data in every N.
$K_G^i$              The i-th group key ($i \ge 0$).
$r_X$                Random nonce chosen by entity X.
$(K_B^{-1}, K_B)$    Asymmetric key pair of base station.
$\{m\}_K$            Encryption of message m under the key K.
$h(\cdot)$           A cryptographic hash function.
$MAC_K(m)$           A message authentication code function on m using the key K.

6.2 Related Work

With regard to past key secrecy, we note two proposed schemes in the WSN context: Klonowski et al. [70] and Mauw et al. [78]. Both schemes use hash functions in order to achieve key evolution. Both schemes, however, are intended to be used not for group key update but for updating pairwise keys for node-to-node [70, 102] or node-to-base station communication [78]. With regard to future key secrecy, on the other hand, Mauw et al.'s protocol does not provide this property. The protocol is based on a hash chain scheme originally proposed for RFID security [89]. Protecting secret tag information from tampering in the future is a big concern in RFID environments, but this does not seem to be such a prime concern in WSNs. This is because authentication and integrity are more important than privacy in WSNs. Hence, future key secrecy is more valued than past key secrecy.
On the other hand, the protocol proposed by Klonowski et al. provides future key secrecy in a weak sense; namely, it will be computationally hard for the adversary to compute a future key from the current compromised key if he fails to record a number of (say, ten) subsequent evolution steps [102]. The work most closely related to our purpose is that of Nilsson et al. [88], who proposed a key management scheme for wireless control environments and SCADA systems. There are several papers dealing with key management designs for SCADA systems, such as [34, 98]. However, these designs either use heavy cryptographic mechanisms unsuited to resource constrained devices, or do not consider the integration of WSNs within SCADA. To the best of our knowledge, Nilsson et al.'s scheme is the only existing key management scheme that considers the integration between SCADA systems and WSNs. This type of application shares the same

communication pattern used in RSDA, which is mentioned at the beginning of this chapter. It also shares some network assumptions, such as that M (which is equivalent to B) is secured and under the supervision of the network administrator. Nilsson et al. designed two key update protocols: the first one updates the pairwise symmetric key between the network manager M and a sensor node N (as described in Protocol 6.2), and the other updates the global or group key among M and the whole group G of sensor nodes (as described in Protocol 6.1). The authors claimed that these protocols provide both forward and backward secrecy (or, in our newly defined terminology, both past and future key secrecy). However, this is unfortunately not the case.

Protocol 6.1: Group key update protocol from [88]
M: generates a new group key $K_G$ and a random number $r_M$
1. M → N: $\{K_G, r_M\}_{K_{MN}}$
2. M ← N: $MAC_{K_G}(N, r_M)$

To initiate the group key update protocol, M generates a new group key, $K_G$, randomly. It then encrypts it together with another random number, $r_M$, and sends it over the network to the target group. No node in the group has any way of telling whether the received key is fresh or not. In other words, the freshness property, from the viewpoint of N, does not hold, since the two values (the new group key $K_G$ and the random number $r_M$) are random values chosen by M. It is both impractical and insecure for each sensor node to maintain a list of keys that have been used. Thus, an external adversary could record a rekeying message and then re-inject it into the network, which leads to the group key being updated with an old key. Consequently, the group enters a key mismatch phase where the key version that the group of sensors uses is different from that used by M. One good security practice is to minimize the damage caused by a compromised node.
However, the authors did not consider common attacks that an adversary is capable of launching in WSNs, such as selective forwarding [67] or node compromise [54]. If a single sensor node has the ability to affect the operation of a good number of sensor nodes, then the adversary will try to compromise that node. For example, if an adversary compromised a sensor node (say, node $N_b$) in a multi-hop path, then it would be able to force all other nodes downstream to enter the key mismatch phase. The adversary simply drops the rekeying message from M for the group key, and then uses the new group key to calculate MACs on their identities and the received nonce, which results in a successful impersonation attack. The problem can easily be fixed by replacing the MAC data with another one: e.g., $MAC_{K_{MN}}(K_G, r_M)$.

Moreover, to initiate the pairwise key update protocol, N generates a random number, $r_N$, and encrypts it with $K_M$. It subsequently computes the MAC on the encryption result and sends this MAC and the encryption result over the network to M. The new pairwise key can be calculated, at the sender N and at the receiver M, by hashing $r_N$ with the previous pairwise key. This means that the new pairwise key is always determined by N. The adversary consequently is able to know all the future keys once he has compromised N.

Protocol 6.2: Pairwise key update protocol from [88]
N: generates a random number $r_N$
1. M ← N: $\{r_N\}_{K_M}$, $MAC_{K_{MN}}(\{r_N\}_{K_M})$
M, N: compute the new pairwise key $K_{MN} = h(K_{MN}, r_N)$

A closer look at Protocol 6.1 and Protocol 6.2 reveals more serious defects.
• Defect I. The whole value of the new group key is directly carried by the protocol messages, encrypted under the pairwise key $K_{MN}$. The consequence of this is that compromise of the pairwise key for just one node leads to compromise of the group key for the whole group. This is a more serious problem than it might appear, because the pairwise key compromise does not necessarily require node capture.
• Defect II. The value of the new pairwise key $K_{MN}$ is determined only by the sensor node. When an adversary of Type II or IV (capable of compromising the key generation code stored in the node) captures the node, all the future pairwise keys for the node can be pre-determined by the adversary. Namely, physical compromise of the node immediately leads to compromise of all the future pairwise keys if the adversary can modify the code installed in the node. This, in turn, leads to compromise of all future group keys because, as mentioned in Defect I, the group key is delivered encrypted under the pairwise key. Hence, contrary to Nilsson et al.'s claim, the scheme does not provide future key secrecy, against node compromise, for either the pairwise key or the group key.
• Defect III. Although not explicitly shown in the protocol descriptions above, the key input $r_N$ for the new pairwise key $K_{MN}$ is not really random in Nilsson et al.'s scheme; it is in fact a function of a pre-installed secret key and a counter value stored in the node. This means that when the node is captured, and all the installed data including keys are exposed to the adversary, all the past pairwise keys as well as the future keys can immediately be computed, even without recording a single key update message! This failure is due not only to Defect III, but also to Defect II.
Note that, due to the combination of Defect III and Defect II, the adversary does not have to modify the node's software at all in order to extract all the past and future pairwise keys. Hence Nilsson et al.'s scheme offers no minimum level of past or future key secrecy against node compromise. Moreover, the adversary can extract any group key in the past or future if he has a record of the corresponding group key update message. Note also that seamless monitoring is not needed by the adversary. This means that the scheme is neither forward nor backward secure for either key type against node compromise by any type of adversary (I, II, III and IV; see Figure 6.1).

6.3 The Proposed Forward & Backward Secure Key Management Scheme - FBSKM

Devising a key management scheme for WSNs is not trivial, and in particular may not be successfully accomplished by simple adaptation of security solutions designed for wired networks. This is because of the limited resources of a sensor node, such as limited energy, slow computation, small memory, and limited communication capabilities, as discussed in Chapter 1.

In this section, we describe a key management scheme which secures communication between sensor nodes and the base station by considering vulnerabilities that are associated with WSNs. In other words, this section focuses on updating two types of keys, the group key and the pairwise key, in wireless sensor network environments.

6.3.1 Group Key Update Protocol

The proposed solution for group key rekeying also exploits the idea of key evolution using a hash chain in order to achieve past key secrecy. The protocol uses a hash chain, $h^i(s_0)$, where $s_0$ is a key component pre-installed in the pre-deployment phase and $i \ge 0$ denotes the index for key update phases. As for future key secrecy, the reverse hash chain technique, which was first introduced by Lamport [73], is used. The network administrator prepares in advance a hash chain of length n, starting from a random seed $t_{n-1}$ and ending with the final value $t_0$:

$t_{n-1},\; t_{n-2} = h(t_{n-1}),\; t_{n-3} = h(t_{n-2}),\; \ldots,\; t_1 = h(t_2),\; t_0 = h(t_1)$.

For reasons of convenience which will become clearer shortly, $h^{-i}(t_0)$ is used instead of $t_i$, although h is not an invertible function and $h^{-1}(x)$ can only mean the set of all preimages of x in a strict sense. Roughly speaking, $h^{-i}(t_0)$ is the i-th preimage of $t_0$ in the reverse hash chain. The secret data, $t_0$, will be pre-installed into sensor nodes together with another key component $s_0$.

Protocol 6.3: The proposed protocol for group key update
1. B → N: $i, \{h^{-i}(t_0)\}_{K_{BN}}$ # unicast message
2. B ← N: $h_{K_{BN}}(K_G^i)$
B, N: increment the group key index from $i-1$ to $i$, and update the value of the group key (i.e., $K_G^i = h^i(s_0) \oplus h^{-i}(t_0)$).

Now, with two secret key components $s_0$ and $t_0$ pre-installed within all sensor nodes, using Protocol 6.3, the group key $K_G^i$ evolves as follows:

$K_G^i = h^i(s_0) \oplus h^{-i}(t_0), \quad i \ge 0$,

where we define $h^0(s_0) = s_0$ and $h^{0}(t_0) = t_0$ (see Figure 6.2).
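The key evolution above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the thesis implementation: SHA-256 stands in for h, the two chain values are combined with XOR, and the names `build_reverse_chain`, `SensorNode` and `update` are hypothetical. Encryption of the released preimage under $K_{BN}$ and the node's confirmation message are omitted. The `update` method also accepts an index gap larger than one, so a node that missed some updates can catch up by hashing the received value as many times as the index difference.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 stands in for the hash function h (an assumption).
    return hashlib.sha256(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    # Combine the forward and reverse chain values (XOR is an assumption).
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def hash_times(value: bytes, count: int) -> bytes:
    # Apply h to value `count` times.
    for _ in range(count):
        value = h(value)
    return value


def build_reverse_chain(seed: bytes, n: int) -> list:
    # chain[k] plays the role of t_k, i.e. h^{-k}(t_0), so h(chain[k]) == chain[k-1].
    # t_{n-1} is taken as h(seed) so that every chain value is 32 bytes long.
    chain = [h(seed)]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))
    chain.reverse()
    return chain


class SensorNode:
    """Hypothetical node state: index i, h^i(s_0), h^{-i}(t_0) and K_G^i."""

    def __init__(self, s0: bytes, t0: bytes):
        self.index = 0
        self.forward = s0
        self.reverse = t0
        self.group_key = xor(s0, t0)

    def update(self, i: int, preimage: bytes) -> bool:
        gap = i - self.index
        if gap <= 0:
            return False  # stale or replayed index
        # Freshness check: hashing the released value `gap` times must give
        # the reverse-chain value the node already stores.
        if hash_times(preimage, gap) != self.reverse:
            return False
        self.forward = hash_times(self.forward, gap)
        self.reverse = preimage
        self.index = i
        self.group_key = xor(self.forward, self.reverse)
        return True


# Usage: the base station holds the whole reverse chain, the node only t_0.
chain = build_reverse_chain(b"demo reverse seed", 5)
s0 = h(b"demo forward seed")
node = SensorNode(s0, chain[0])
```

In this sketch only the holder of the reverse chain can produce a value that passes the check in `update`, which mirrors the role the released preimage plays in Message 1 of the protocol.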
Any sensor node can easily compute the i-th hash image $h^i(s_0)$ from $h^{i-1}(s_0)$, whereas only the base station knows the value of the i-th preimage $h^{-i}(t_0)$. Thus, it is only the base station that can release the preimage into the sensor field. As a consequence, the first message in the protocol provides the sensor node with a weak form of signature from the base station: the message could have been generated only by the base station, not by any sensor nodes including the node itself. The check of the preimage (i.e., $h(h^{-i}(t_0)) = h^{-(i-1)}(t_0)$) also ensures that the key update message is fresh. After the i-th key update, the sensor node stores the index i and the secret data: $h^i(s_0)$, $h^{-i}(t_0)$ and $K_G^i$. Considering the highly lossy communication environment of sensor networks, the sensor node may sometimes fall behind the group key update schedule. The sensor node,

however, will soon be able to catch up at the next rekeying: it can compute the correct value of the new group key simply by checking the difference between the two index values (the received and the stored) and applying the corresponding number of hash operations.

Figure 6.2: Key evolution in the proposed protocol

FBSKM, however, has one limitation: it is vulnerable to a kind of collusion attack. Assume that a sensor node was captured at a key update phase i, and another node was subsequently captured at a later phase $j > i$. Then, the adversary can extract all the group keys for the phases i to j. Of course, this compromise is limited to the past keys, not the future keys. We call this attack the Sandwich attack; it is considered in the subsequent section.

6.3.2 Pairwise Key Update Protocol

Protocol 6.4: The proposed protocol for pairwise key update
1. B → N: $i, \{h^{-i}(t_0), g^{r_B}\}_{K_G^{i-1}}$ # broadcast message
2. B ← N: $\{g^{r_N}\}_{K_{BN}}, h_{K_{BN}}(g^{r_B}, g^{r_N})$
N: keeps the hashed value of the current pairwise key: $K_{BN}^1 = h(K_{BN})$.
B, N: increment the group key index from $i-1$ to $i$, and update the values of the pairwise key (i.e., $K_{BN} = g^{r_B r_N}$) and the group key (i.e., to $K_G^i = h^i(s_0) \oplus h^{-i}(t_0)$).

Protocol 6.4 shows the rekeying protocol for the pairwise key shared between the base station and the sensor node. This protocol is based on the Diffie-Hellman protocol, which has recently become not only feasible on resource constrained nodes, but attractive for WSNs [120]. The base station B first generates a secret random number $r_B$, and computes the Diffie-Hellman component $g^{r_B}$. It then broadcasts Message 1, which includes the index i of the next group key, and the ciphertext of the next group key component $h^{-i}(t_0)$ and the Diffie-Hellman component $g^{r_B}$, encrypted under the current group key, $K_G^{i-1}$.
The inclusion of the group key index i in the first message enables each sensor node to check whether it has the current value of the group key; if not, the node can request that the base station send the latest key component $h^{-i}(t_0)$. Thus, the group key rekeying protocol exchange as described in Protocol 6.3 can be inserted between Messages 1 and 2 of the protocol in the case of a group key index mismatch.
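The Diffie-Hellman step of the pairwise key update can be illustrated with the following minimal sketch. Everything concrete in it is an assumption made for illustration: the toy modulus `P` (the Mersenne prime $2^{127}-1$, far too small for real use), the base `G`, the use of HMAC-SHA-256 for the keyed hash $h_{K_{BN}}(\cdot)$, and the final hashing of $g^{r_B r_N}$ into a 32-byte key. The encryption of the components under the group and pairwise keys is left out.

```python
import hashlib
import hmac
import secrets

# Toy parameters for illustration only (assumptions, not from the thesis).
P = 2**127 - 1  # a Mersenne prime; far too small for real deployments
G = 3


def dh_keypair():
    # Pick a secret exponent r in [2, P-2] and return (r, g^r mod P).
    r = secrets.randbelow(P - 3) + 2
    return r, pow(G, r, P)


def derive_pairwise_key(own_secret: int, peer_component: int) -> bytes:
    # Both sides compute g^(r_B * r_N) mod P; hashing it into a fixed-size
    # key is an extra step assumed here.
    shared = pow(peer_component, own_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def keyed_hash(key: bytes, g_rb: int, g_rn: int) -> bytes:
    # Keyed hash over both components, modelled with HMAC-SHA-256 (assumption).
    message = g_rb.to_bytes(16, "big") + g_rn.to_bytes(16, "big")
    return hmac.new(key, message, hashlib.sha256).digest()


# Usage: B contributes g^{r_B}, N answers with g^{r_N} and a keyed hash
# computed under the pairwise key that is still current.
r_b, g_rb = dh_keypair()
r_n, g_rn = dh_keypair()
old_pairwise_key = hashlib.sha256(b"demo current pairwise key").digest()
tag = keyed_hash(old_pairwise_key, g_rb, g_rn)
new_key_at_b = derive_pairwise_key(r_b, g_rn)
new_key_at_n = derive_pairwise_key(r_n, g_rb)
```

Because both parties contribute a secret exponent, neither side alone determines the new key in this sketch, in contrast to a key derived from a value chosen by the node only.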

After retrieving the plaintext of Message 1 using the group key, the node checks whether $h(h^{-i}(t_0)) = h^{-(i-1)}(t_0)$. This check provides the node with evidence that B has really started the pairwise key update session. As Message 1 is a broadcast message encrypted using the group key, it would be impossible to provide this evidence without using the preimage as used here. The situation would of course be different if digital signatures were used. Now the node constructs the second message of the protocol: it generates its own Diffie-Hellman component $g^{r_N}$, encrypts it, and generates the keyed hash of both Diffie-Hellman components under the current pairwise key $K_{BN}$. After sending the message to B, the node computes the new group key, $K_G^i = h^i(s_0) \oplus h^{-i}(t_0)$, increments the group key index from $i-1$ to $i$, and computes the Diffie-Hellman key $g^{r_B r_N}$ to be used as the new pairwise key, while keeping the hash $h(K_{BN})$ of the old pairwise key and safely deleting the old key. On receiving Message 2, B decrypts $g^{r_N}$, and verifies the keyed hash from N. The inclusion of $g^{r_B}$ and $g^{r_N}$ in the hash provides B with confidence about the freshness and authenticity, respectively, of the message.

6.3.3 Delivery Failure Management

Protocol 6.5: The protocol to handle delivery failure
1. B → N: $i, j, \{h^{-i}(t_0), g^{r_B}\}_{K_{BN}^j}$ # unicast message
2. B ← N: $\{g^{r_N}\}_{K_{BN}^j}, h_{K_{BN}^j}(g^{r_B}, g^{r_N})$
B, N: update the values of the pairwise key (i.e., $K_{BN} = g^{r_B r_N}$)
N: increments the indices i and j, and updates the values of the pairwise key (i.e., $K_{BN} = g^{r_B r_N}$) and the group key (i.e., to $K_G^i$), and then keeps the hashed value of the old key: $K_{BN}^{j+1} = h(K_{BN}^j)$

Delivery failure in WSNs will lead to key mismatches of group keys and/or pairwise keys. With no long-term key available in the proposed key update protocols, key mismatch is a big concern and should be handled carefully.
Simple retransmission of the protocol messages is not a solution, as it may open the door to replay attacks. Moreover, it may require the sensor node to revert to the old key even after it has successfully updated the pairwise key. Consequently, the node would have to keep two keys at the same time: the old key and the new updated key. Key evolution is used once again in order to provide a solution for the delivery failure problem. With no response from the node N, the base station B initiates Protocol 6.5 over the unicast channel to N. Importantly, this protocol can be used in the two proposals introduced in this chapter (FBSKM and E-FBSKM), since they have the same pairwise key update protocol. A delivery failure in Protocol 6.3 is resolved by running Protocol 6.4, while a delivery failure in Protocol 6.4 is resolved by running Protocol 6.5. In Protocol 6.5, $K_{BN}^j = h^j(K_{BN})$ is a hashed copy of the current key from B's viewpoint. For the first protocol run, the index j is set to 1; it will be incremented by one whenever the protocol is retried. On receipt of Message 1 over the unicast channel, the sensor node N compares the received indices $i, j$ with the stored indices $i', j'$, and executes the required action as follows:
• Case 1: $i = i'$ and $j \ge j'$. For simplicity, consider the case $j = j' = 1$. The pairwise key update protocol (Protocol 6.4) has just been run, but the reply message of the protocol failed to arrive at B. The node N has been keeping the hashed copy $K_{BN}^1 = h(K_{BN})$ of the old pairwise key, which is applied to the ciphertext of Message 1 of Protocol 6.5. The retrieved value of $h^{-i}(t_0)$ ensures the authenticity of the message; the entity other than N, in possession of $h^{-i}(t_0)$ and $K_{BN}^1$, should be B. The node decrypts the encrypted part of Message 1 using $K_{BN}^1$. Then, N follows exactly the same steps as in Protocol 6.4, except that it uses the hash of the old pairwise key instead of the current pairwise key. At the end of the protocol run, N will end up with a new pairwise key and the hash of $K_{BN}^1$, i.e., $K_{BN}^2$; now $j' = 2$. The current pairwise key is simply deleted. One or more further failures will be followed by reinitialization of the protocol by B with j incremented. It could also happen that Message 1 itself fails to arrive at N, and subsequently B retries the protocol. This will lead to the case $j > j'$.
• Case 2: $i = i'$ and $j < j'$. This cannot happen in a legitimate run, so the message should be treated as a bogus message from another sensor node. N should ignore Message 1.
• Case 3: $i > i'$. This happens when the node N has never been involved in the pairwise key update protocol due to delivery failure of Message 1 of Protocol 6.4. In this case, N applies the hash to the current pairwise key j times, and uses the resulting value as the decryption key for Message 1.
• Case 4: $i < i'$. This is another case of a replay attack. N should ignore Message 1.

Now, the old key does not need to be kept in order to handle the key mismatch; instead, a hashed copy of the key is kept.
Thus, Protocol 6.5 is as secure as Protocol 6.4, because it inherits all the strong features from Protocol 6.4.

6.4 The Enhanced FBSKM (E-FBSKM)

Unfortunately, FBSKM has one limitation: it suffers from a new kind of collusion attack called the Sandwich attack. Assume that two nodes are captured at times $t_i$ and $t_j$ where $t_i < t_j$. If these two compromised nodes collude with each other, they can reveal all the group keys used between times $t_i$ and $t_j$. Here i and j are discrete time indices, which are intended to mean the group key indices as used in Protocols 6.3 and 6.4. The attacker captures a sensor node at time $t_i$, which leads to the compromise of $h^i(s_0)$ and $h^{-i}(t_0)$. Thus, he can compute all the subsequent hash images of the forward hash chain: $h^{i+1}(s_0), \ldots, h^{j-1}(s_0), h^j(s_0)$. When he captures another node at time $t_j$, he obtains $h^{-j}(t_0)$ and, by hashing it, can compute all the preimages of the reverse hash chain: $h^{-j}(t_0), h^{-(j-1)}(t_0), \ldots, h^{-(i+1)}(t_0)$. Now the attacker can compute all the group keys from $t_i$ to $t_j$ by the computation $K_G^k = h^k(s_0) \oplus h^{-k}(t_0)$, where $t_i \le t_k \le t_j$. This weakness comes from the design feature of the scheme: the combination of a forward hash chain and a backward hash chain. The solution to this problem is simple: break the reverse hash chain into shorter ones, while not leaving any security gap at their connection. The following protocol is a modified version of Protocol 6.3 to accommodate this idea.

Protocol 6.6: The modified group key update protocol
1. B → N: $i, \{h^{-i}(t_0), t'_0\}_{K_{BN}}$
2. B ← N: $h_{K_{BN}}(K_G^i)$
B, N: increment the group key index from $i-1$ to $i$, reset the value of $h^{-i}(t_0)$ to $t'_0$, and update the value of the group key ($K_G^i = h^i(s_0) \oplus h^{-i}(t_0)$).

The protocol messages of Protocol 6.6 are exactly the same as those of Protocol 6.3 except for the addition of a new data item $t'_0$. This addition enables the base station B to restart a new reverse hash chain by choosing a new starting value t, and then computing successive hash images of t. The final value of the hash chain is assigned to $t'_0$. In other words, B reestablishes the reverse hash chain with t as a starting point. It should be noted that, after the execution of Protocol 6.6, $h^{-i}(t_0)$ is no longer related to $t_0$, and thus not to $h^{-(i-1)}(t_0)$ either; in fact, it has been reset to the value of $t'_0$, i.e., $h^{-i}(t_0) = t'_0$. It is just for notational convenience that we keep using the name $h^{-i}(t_0)$. Inclusion of $t'_0$ together with $h^{-i}(t_0)$ in the first message of Protocol 6.6 convinces the sensor node that $t'_0$ has originated from the base station. Note that $t'_0$ is delivered to the sensor node encrypted under the pairwise key $K_{BN}$, not under the group key. Next, the new group key, which is computed by using the new reverse hash chain, is hashed and then returned to the base station. Thus, the base station can be certain that $t'_0$ has been successfully installed into the sensor node.

Interestingly, the modified protocol equipped with the countermeasure comes with a nice feature: reestablishing the reverse hash chain. With this feature, the sensor nodes do not have to be recollected to refill the reverse hash chain. Now, the base station can initiate Protocol 6.6 at any time to restart the reverse hash chain, hence arbitrarily limiting the time span within which Sandwich attacks may succeed.
In fact, B can follow one of two strategies in order to accomplish the reinitialization of the reverse hash chain. On one hand, B can replace Protocol 6.3 completely with Protocol 6.6. The only drawback of this strategy is that the self-synchronization feature, as mentioned in the description of Protocol 6.3, can no longer be maintained. Therefore, B must rerun Protocol 6.6 until it receives the second message of the protocol from N, to ensure that the reverse hash chain has been reestablished. In return, however, we get a key management scheme entirely free from Sandwich attacks. On the other hand, B can switch between Protocol 6.3 and Protocol 6.6 whenever needed. For example, the base station can use only Protocol 6.3 to renew the group key several times based on the same reverse hash chain. When there is a suspicion that the Sandwich attack may occur, B can switch and run Protocol 6.6 in order to limit the usefulness of the disclosed components of the previous hash chain. After that, B can switch back to Protocol 6.3. The choice between these two strategies depends on how concerned the network designer is about the Sandwich attack.
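The effect of the Sandwich attack, and of restarting the reverse hash chain, can be illustrated with a small simulation. It is a sketch under assumptions rather than the thesis implementation: SHA-256 stands in for h, the two chain values are combined with XOR, and all names (`reverse_components`, `sandwich`, the demo seeds and phase numbers) are hypothetical. The attacker is given the forward value captured at phase i and the reverse value captured at phase j, nothing else.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 stands in for the hash function h (an assumption).
    return hashlib.sha256(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    # Combining operator for the two chain values (XOR is an assumption).
    return bytes(x ^ y for x, y in zip(a, b))


def hash_times(value: bytes, count: int) -> bytes:
    for _ in range(count):
        value = h(value)
    return value


def reverse_chain(seed: bytes, n: int) -> list:
    # chain[k] plays the role of h^{-k}(t_0): h(chain[k]) == chain[k-1].
    chain = [h(seed)]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))
    chain.reverse()
    return chain


def reverse_components(phases: int, reset_at=None) -> list:
    # Reverse-chain value in use at each phase. With reset_at set, a fresh
    # chain starts at that phase, as in the modified group key update.
    old = reverse_chain(b"demo seed A", phases)
    if reset_at is None:
        return old
    new = reverse_chain(b"demo seed B", phases - reset_at)
    return old[:reset_at] + new


PHASES = 10
s0 = h(b"demo forward seed")


def true_keys(components: list) -> list:
    return [xor(hash_times(s0, k), components[k]) for k in range(PHASES)]


def sandwich(forward_i: bytes, i: int, reverse_j: bytes, j: int) -> dict:
    # Hash the forward value up from phase i and the reverse value down from
    # phase j, then combine them for every phase in between.
    return {k: xor(hash_times(forward_i, k - i), hash_times(reverse_j, j - k))
            for k in range(i, j + 1)}


i, j = 2, 7
plain = reverse_components(PHASES)
reset = reverse_components(PHASES, reset_at=5)
true_plain, true_reset = true_keys(plain), true_keys(reset)
recovered_plain = sandwich(hash_times(s0, i), i, plain[j], j)
recovered_reset = sandwich(hash_times(s0, i), i, reset[j], j)
```

In this run, `recovered_plain` matches the true keys for all phases 2 to 7, whereas with a restart at phase 5 the attacker's values match only from phase 5 onwards. This is consistent with the observation that restarting the reverse chain limits the time span within which the attack succeeds.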

[Figure 6.3: State diagram of key disclosure. The states combine "group key disclosed / not disclosed" with "pairwise key disclosed / not disclosed"; transitions are labelled with node capture and with runs of the key update protocols, and are distinguished for three cases: not monitored, monitored & SW not compromised, and monitored & SW compromised.]

6.5 Security Analysis

In this section, the security of the two key management schemes proposed in this chapter (FBSKM, E-FBSKM) is analyzed. The security analysis covers their robustness against the adversary types discussed in Section 6.1, the achievement of both past and future secrecy features, and their resilience against impersonation attacks in the pairwise key update protocol.

Robustness Against Adversaries

The pairwise key is used for secure delivery of the group key update information in Protocols 7.3 and 7.6; the group key, in turn, encrypts the Diffie-Hellman components to establish a new pairwise key in Protocol 7.4. This combination helps the sensor network to recover its security quickly after the capture of some sensor nodes and the compromise of their keys. Carefully designed with node capture in mind, FBSKM and E-FBSKM do not surrender all the key components required to retrieve the past/future group/pairwise keys. The secure state recovery for a sensor node varies depending on the adversary capability. Figure 6.3 shows how a sensor node recovers its secure state with the help of the proposed key update protocols, after it has been captured and all the keys in it are compromised by different adversary types. According to the adversary classification discussed in Section 6.1, adversary types I and II do not have the ability to perform seamless monitoring.
A compromised sensor node is therefore able to recover its secure state for the group key if the adversary has missed a single group key update message. This is because the adversary will miss the next preimage of $t_0$ in the reverse hash chain (or the seed $t'_0$ of the new reverse hash chain) if Protocol 7.3 (or Protocol 7.6) is used. As a consequence, the new group key would not be disclosed to the adversary. However, the pairwise key will still be disclosed. Importantly, neither the pairwise key nor the group key would be disclosed to the adversary if the adversary misses a single pairwise key update message. This is because the adversary will not have access to the next preimage of the reverse hash chain (see Protocol 7.4). Consequently, the compromised sensor node recovers its secure state for both pairwise and group keys.

[Figure 6.4: Relations between keying materials and the significance of node compromise]

Even adversary type III, which is capable of seamless monitoring but not of software compromise, cannot alter the software installed in the sensor node during Protocol 7.4; hence it cannot alter the Diffie-Hellman key. Both B and N contribute their Diffie-Hellman inputs to the computation of the new pairwise key, and thus the adversary cannot predict the future value of the pairwise key. As a consequence, the pairwise key would not be disclosed to the adversary. After that, if the group key update protocol is run, the compromised sensor node will recover its secure state for both pairwise and group keys. This is because the first message of Protocol 7.3 (or Protocol 7.6) is encrypted under the pairwise key, which is no longer disclosed to the adversary. Only an adversary equipped with both seamless monitoring and software compromise (i.e., the type IV adversary) can keep control of a sensor node once it is captured. In other words, there is no path available back to the original secure state if the adversary is capable of both seamless monitoring and software compromise. It is argued that a non-cryptographic countermeasure such as tamper-proof technology is additionally required to fight against an adversary of type IV.

Achievement of Past & Future Secrecy

Since Protocol 7.6 in E-FBSKM differs from Protocol 7.3 in FBSKM only in the reverse hash chain reinitialization, and not in the combination of the reverse and forward hash chains, these two protocols are considered to be the same in this section. It is up to the network administrator to choose either Protocol 7.3 or 7.6, according to the required security level. Figure 6.4 illustrates how all the keys and keying data are related to each other as they evolve over time.
Note that no keys are delivered over the air; only some of their keying materials, such as $h^{-i}(t_0)$, are exchanged, while others are never sent over the air at all (e.g., $h^{i}(s_0)$). Thus, unlike the scheme of Nilsson et al. (see Defect I in Section 6.2), pairwise key compromise alone does not lead to group key compromise, and vice versa. By using the reverse hash chain as well as the usual hash chain, both past and future group key secrecy are simultaneously achieved in both the pairwise and group key update protocols. Furthermore, the group key update message provides inherent message authenticity.

Group Key Update Protocol

Let us assume that the adversary has somehow extracted the current value of the group key, $K_G^i$. However, he cannot extract from this the previous key $K_G^{i-1}$ because he cannot compute the value of $h^{i-1}(s_0)$. Note that this holds even when the adversary has recorded all the previous key update messages, and compromised all the previous base station-to-node pairwise keys. In fact, even capturing the node and extracting all the stored secret data does not surrender the past group key to the adversary. This is because the previous values of $h^{i}(s_0)$ were never exchanged over the air, and were deleted after group key computation. Hence it can be said that the protocol provides past key secrecy for any kind of compromise: group key compromise, pairwise key compromise, and the compromise of the node itself. The protocol also provides future key secrecy in the sense that the adversary cannot predict the next group key $K_G^{i+1}$ just with knowledge of the current group key $K_G^i$. The computation of $K_G^{i+1}$ requires knowledge of $h^{-(i+1)}(t_0)$, which has not yet been exchanged. In the next step of the key update, the adversary, without knowledge of the pairwise key $K_{BN}$, will not be able to obtain the value of $h^{-(i+1)}(t_0)$ from the protocol message. In fact, compromise of the pairwise key alone does not lead to future group key compromise; that will only happen when the adversary captures a sensor node, thereby extracting the hidden component $h^{i}(s_0)$. Hence, the protocol satisfies future key secrecy in the face of group key and/or pairwise key compromise; simple delivery of the encrypted value of the new group key, as in [88], cannot provide this kind of resilience. Protocol 7.3 will fail to provide future key secrecy only when the node is physically captured.
Even in the case of capture, the adversary still has to listen to the key update message to extract the future group key. Furthermore, when the pairwise key is updated, any adversary of type I, II, or III will not be able to gain any knowledge of the new pairwise key. This, in turn, leads to the adversary's failure to gain any knowledge of the new group key established using the new pairwise key. Hence, we achieve future group key secrecy even after node capture, as long as the adversary has no ability to modify the software code stored in the node. Protocol 7.3 uses the pairwise key $K_{BN}$ to encrypt the i-th preimage $h^{-i}(t_0)$ in the first message, and also to provide key confirmation by computing a keyed hash of the new group key. This is in order to exclude any compromised or suspicious sensor nodes from the group key update.

Pairwise Key Update Protocol

Use of Diffie-Hellman key agreement for the pairwise key update provides both past and future pairwise key secrecy; the key inputs are temporary random values, and thus bear no relation to either the previous or the next key inputs. Even after node compromise, if the attacker is not able to modify the software code in the node (i.e., the adversary of type I or III), or if the adversary fails to record the key update messages (i.e., the adversary of type I or II), the node will escape from the control of the adversary and recover its secure status. Thus, our scheme satisfies past pairwise key secrecy for all the adversary types, and future pairwise key secrecy for any adversary type except type IV, even against node capture and its compromise.
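The role of the temporary random inputs can be illustrated with a toy Diffie-Hellman exchange. This is a minimal sketch under stated assumptions rather than Protocol 7.4 itself: the modulus `P` (a small Mersenne prime, far too small for real use), the generator `G`, the derivation of the pairwise key by hashing the shared secret, and the function names are all hypothetical, and the encryption of the components under the group key is left out. The `confirm` helper mirrors the keyed hash over both components that the chapter writes as h_KBN(g^rB, g^rN).

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # ASSUMPTION: toy prime modulus, for illustration only
G = 3           # ASSUMPTION: toy generator


def dh_component():
    """Draw a fresh secret exponent r and return (r, g^r mod P)."""
    r = secrets.randbelow(P - 3) + 2
    return r, pow(G, r, P)


def new_pairwise_key(own_secret: int, peer_component: int) -> bytes:
    """Derive a 128-bit key from the shared secret (derivation is an assumption)."""
    shared = pow(peer_component, own_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()[:16]


def confirm(k_bn: bytes, g_rb: int, g_rn: int) -> bytes:
    """Keyed hash over both components, computed with the current pairwise key."""
    data = g_rb.to_bytes(16, "big") + g_rn.to_bytes(16, "big")
    return hmac.new(k_bn, data, hashlib.sha256).digest()


k_bn = b"k" * 16                 # current pairwise key
r_b, g_rb = dh_component()       # base station input
r_n, g_rn = dh_component()       # sensor node input
key_at_b = new_pairwise_key(r_b, g_rn)
key_at_n = new_pairwise_key(r_n, g_rb)
```

Both sides arrive at the same new key, and because `r_b` and `r_n` are drawn afresh on every run and then discarded, knowing one run's key gives no handle on the previous or the next one. A component altered in transit changes the keyed hash, so the base station notices the mismatch.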

[Table 6.2: Memory overhead comparison between Nilsson et al. [88], FBSKM, and E-FBSKM. For each scheme the table lists the quantity and size (bits) of the information stored per sensor: pairwise key shared with B ($K_{BN}$), key used for random number generation, B's public key, group key ($K_G$), secret data, indexes, and hashed value of the old pairwise key.]

Resilience Against Impersonation Attacks

If the adversary is in full control of a compromised node, and has installed malicious attacking software, then the adversary's node can still impersonate B to some other victim node. The impersonating node may succeed in causing the victim to receive a fake Diffie-Hellman component, say $g^x$. But this is the limit of the attack. The attacking node has only two options when receiving Message 2 from the victim node: (1) forward the message verbatim to B, or (2) drop the message. In the first case, B will not get the expected hash $h_{K_{BN}}(g^{r_B}, g^{r_N})$ but rather the hash $h_{K_{BN}}(g^x, g^{r_N})$. In the second case, B will see no response from N. In both cases, B will issue Message 1 again through the unicast channel to N, which will finally lead to key agreement between B and N.

6.6 Performance Analysis

In this section, the performance of the two proposals in this chapter (FBSKM, E-FBSKM) is analyzed and then compared with the similar scheme designed by Nilsson et al. [88]. The performance analysis covers memory overhead, communication cost, and computation cost for these schemes.

Memory Overhead

In this section, the amount of memory required by the two proposals is discussed. Prior to the deployment phase, each sensor node in these two proposals stores four pieces of information: the secret data (a forward hash chain component ($h^{i}(s_0)$) and a reverse hash chain component ($h^{-i}(t_0)$)), and two indexes: one for the group key update phase (i) and another one (j) to handle the delivery failure problems.
The sensor node then needs to keep a copy of the most recent pairwise key shared with B (which is $K_{BN}$), the group key (which is $K_G$), and a hashed copy of the old pairwise key (which is $h^{j}(K_{BN})$). The reason for keeping a hashed copy of the old pairwise key is to use it when B runs the delivery failure protocol as described in the delivery

failure management subsection in Section 6.3. In other words, a sensor node needs to store two symmetric keys ($K_{BN}$, $K_G$), two secret values ($h^{i}(s_0)$, $h^{-i}(t_0)$), two indexes (i, j), and a hashed copy of the previous pairwise key. Interestingly, E-FBSKM does not need more memory than that required by FBSKM. This is because each sensor node replaces $t_0$ with $t'_0$ when the base station reestablishes the reverse hash chain as discussed in Section 6.4. There is no need to keep copies of both $t_0$ and $t'_0$ at the same time; only one of them is needed. Consequently, each sensor node, in both proposals in this chapter, needs to store approximately 100 bytes in order to achieve 128 bit security. This memory overhead occupies approximately 0.078% of the total program flash memory of the most popular sensor end device, MICA2 [30]. The 100 bytes include two keys (256 bits and 128 bits long, respectively), two 128 bit secret data items, two 16 bit indexes, and one 128 bit hashed value of the previous pairwise key (see Table 6.2). On the other hand, Nilsson et al.'s scheme occupies approximately 128 bytes, which is equivalent to 0.1% of the total program memory, in order to achieve the same level of security. This memory overhead includes two 256 bit pairwise keys between B and N (one is the current pairwise key and the other is a copy of the previous key to handle the key delivery failure), one pre-installed 128 bit secret key that is used to generate the random number, one 256 bit public key for B, and one 128 bit group key (see Table 6.2).

Communication Overhead

The communication between sensor nodes is considered the biggest factor that drains the sensor's battery since it consumes most of the available power. It consumes much more than sensing and computation activities. Hill et al. concluded that each bit transmitted in WSNs consumes about as much power as executing instructions [57].
The MICA2 data sheet indicates that the energy consumption of communication, which is the focus of this section, is unequal for sending and receiving [30]. The energy consumption of transmitting with maximum power is more than double the energy consumption of receiving activities. The energy consumption for transmitting m bits over a distance r, according to [3, 56], can be calculated as follows:

$$E_{tx}(m, r) = m E_c + m e r^{s}, \qquad (6.1)$$

where

$$e = \begin{cases} e_1, \; s = 2, & r < r_{cr} \\ e_2, \; s = 4, & r > r_{cr} \end{cases}$$

Here $E_c$ represents the minimum energy required to operate the radio circuit, $e$ denotes the unit energy required for the transmitter amplifier, and $r_{cr}$ is the crossover distance. The typical values for $E_c$, $e_1$, and $r_{cr}$ are 50 nJ/bit for a 1 Mbps transceiver, 10 pJ/bit/m$^2$, and 86.2 m, respectively. On the other hand, the energy consumption that results from receiving activities can be calculated as follows:

$$E_{rx}(m, r) = m E_c \qquad (6.2)$$
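Equations (6.1) and (6.2) can be evaluated directly with the typical values quoted above. The sketch below is illustrative only: the function and constant names are hypothetical, the text gives no value for $e_2$, so it must be supplied by the caller for long-range links, and treating $r = r_{cr}$ as long range is an assumption because the model leaves that case open.

```python
E_C = 50e-9    # J/bit, minimum energy to operate the radio circuit
E_1 = 10e-12   # J/bit/m^2, amplifier energy below the crossover distance
R_CR = 86.2    # m, crossover distance


def e_tx(m_bits: int, r: float, e_2: float = None) -> float:
    """Energy in joules to transmit m_bits over distance r (Equation 6.1)."""
    if r < R_CR:
        return m_bits * E_C + m_bits * E_1 * r ** 2
    # ASSUMPTION: r == R_CR is handled as the long-range case.
    if e_2 is None:
        raise ValueError("e_2 is required for r >= r_cr; the text gives no value")
    return m_bits * E_C + m_bits * e_2 * r ** 4


def e_rx(m_bits: int) -> float:
    """Energy in joules to receive m_bits (Equation 6.2); independent of r."""
    return m_bits * E_C


per_bit_tx = e_tx(1, 50.0)   # 50 nJ circuit energy + 25 nJ amplifier energy
per_bit_rx = e_rx(1)
```

At r = 50 m this gives 75 nJ per transmitted bit and 50 nJ per received bit, so receiving an extra 128 bits costs 128 × 50 nJ = 6.4 µJ.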

[Table 6.3: Number of bits transmitted/received by a sensor. For the pairwise key and group key update protocols, the table lists, per protocol message and in total, the number of bits and the consumed energy (µJ) for Nilsson et al. [88], FBSKM, and E-FBSKM.]

Assuming that $r = 50$ m $< r_{cr}$, Table 6.3 lists the number of bits that must be transmitted in order to accomplish the renewal of the pairwise and group keys. Starting with FBSKM, the base station initiates the pairwise key rekeying mechanism (Protocol 7.4), whereas the sensor node itself initiates the mechanism in Nilsson et al.'s scheme (Protocol 7.1). Because the initiation in Nilsson et al.'s scheme (Protocol 7.1) is done by N, all future keys are compromised as soon as N has been physically compromised. FBSKM instead requires B and N to swap the Diffie-Hellman components $g^{r_B}$ and $g^{r_N}$. This increases the length of the information received by a sensor node by 34 bytes in comparison with Nilsson et al.'s scheme. Although this increase affects the energy consumption, it is necessary in order to solve the security weaknesses in Nilsson et al.'s scheme. Readers interested in these weaknesses are referred to Section 6.2. Importantly, this increase in the number of transmitted bits affects the energy consumption of receiving activities ($E_{rx}$), but not the energy consumption of transmitting activities ($E_{tx}$). Notably, the pairwise key rekeying mechanism (Protocol 7.4) is able to update the pairwise and group keys at the same time, especially if there is no indication of any node compromise attack or there is no need to eliminate some group members from a specific group. Thus, the communication energy consumption ($E_{rx} + E_{tx}$) that results from updating these two keys is 32.8 µJ in our proposal, whereas it is 41.6 µJ in Nilsson et al.'s scheme (see Table 6.3).
Although the base station can update the group and the pairwise keys at the same time by running Protocol 7.4, B may sometimes need to remove specific nodes from a particular group, especially when they behave maliciously. In this case, B can run Protocol 7.3. In comparison with the group key update protocol in Nilsson et al.'s scheme (Protocol 7.2), the new group key, in FBSKM, is not exchanged between the base station and sensor nodes. Instead, only half of the group key, which is the reverse hash component ($h^{-i}(t_0)$), is transmitted. Knowledge of this component alone is not enough to construct the group key, since the group key is composed of two components: the reverse hash chain component ($h^{-i}(t_0)$) and the forward hash chain component ($h^{i}(s_0)$). In the first message of the group key update protocol, FBSKM in Protocol 7.3 requires N to receive 14 bytes less than Protocol 7.2 in Nilsson et al.'s scheme. This reduction in the number of bits received by N leads to less energy consumption. However, FBSKM in the second message sends the same number of bits as Nilsson et al.'s scheme. Table 6.3 shows that

FBSKM consumes 5.6 µJ less than Nilsson et al.'s scheme in order to update the group key. This is calculated per group key update at each sensor node. In moving from FBSKM to E-FBSKM, the base station may need to reset the reverse hash chain in order to defeat the Sandwich attack, as in Protocol 7.6. How often this process is repeated depends on the time span, defined by B, within which the adversary is allowed to succeed in launching the Sandwich attack. B may reset the reverse hash chain every time it updates the group key, or it may reset it after l group key renewals. It depends entirely on the protocol specification. Importantly, the reestablishment of the reverse hash chain affects the group key update protocol but not the pairwise key update protocol (see Table 6.3). The reestablishment of the reverse hash chain (Protocol 7.6) requires the sensor node N to receive 128 more bits in the first message of the protocol. The inclusion of the extra 128 bits, which is $t'_0$, with $h^{-i}(t_0)$ in the first message is necessary to convince the sensor node that the reestablishment of the reverse hash chain is originated by B, as discussed in Section 6.4. After a successful run of Protocol 7.6, B can run Protocol 7.4 in order to update the group and pairwise keys at the same time. Table 6.3 shows that Protocol 7.6 in E-FBSKM requires sensor nodes to receive 2 bytes and 16 bytes more than Protocol 7.2 in Nilsson et al.'s scheme and Protocol 7.3 in FBSKM, respectively, in order to run the first message of the group key update protocol. The transmission of these extra bits leads to more energy consumption. Table 6.3 shows that E-FBSKM consumes 6.4 µJ more energy than FBSKM in order to update the group key. However, B, in E-FBSKM, may run Protocol 7.3 if there is no need to reestablish the reverse hash chain.
This means that the increase in energy consumption is not continuous; it exists only when there is a need for the reestablishment. If there is no need to reset the reverse hash chain, the transmission energy consumption for our proposal is the same as that of the proposal in Section 6.3. Since the reestablishment of the reverse hash chain does not affect the pairwise key update protocol, the transmission energy consumption that results from updating the pairwise key is the same for FBSKM and E-FBSKM (see Table 6.3).

Computation Cost

We assess, in this section, the energy consumption that results from applying cryptographic operations in FBSKM and E-FBSKM, and then compare this consumption with that of Nilsson et al.'s scheme as in Table 6.4. For concreteness, we assume that RC5 is used for symmetric encryption/decryption activities, SHA-1 is used for hash operations, and ECDSA is used for public key encryption. The cost of the cryptographic operations is estimated based on the results from analysis studies presented in [23, 35, 122, 130]. To update the pairwise key, FBSKM consumes the same as E-FBSKM since the pairwise key update protocol in both proposals is the same. However, the two proposals consume 274 µJ more energy in comparison with Nilsson et al.'s scheme. This is because B and N need to exchange Diffie-Hellman components ($g^{r_B}$ and $g^{r_N}$). In the first message of Protocol 7.4, N needs to decrypt a longer encrypted message because of the addition of $g^{r_B}$. In the second

[Table 6.4: Computation cost comparison. For the pairwise key and group key update protocols, the table lists the consumed energy (µJ) per protocol message, for computing the new key, and in total, for Nilsson et al. [88], FBSKM, and E-FBSKM.]

message of the protocol, N needs to encrypt its Diffie-Hellman component ($g^{r_N}$) and hash it with the Diffie-Hellman component of B, which is $g^{r_B}$. Interestingly, this protocol can update the pairwise and group keys at the same time, especially if there is no need to eliminate some group members from the group. Table 6.4 shows that the estimated computation energy consumption to run the pairwise key update protocol in FBSKM and E-FBSKM is µJ, which is able to update the pairwise and the group keys at the same time. To do so in Nilsson et al.'s scheme, both protocols (the pairwise key update and the group key update) should be executed in order to update pairwise and group keys, with a total computation cost of µJ. In situations where eliminating some group members from a specific group is needed, the base station in FBSKM can run Protocol 7.3. Table 6.4 shows that the group key update protocol for FBSKM in Section 6.3 consumes 282 µJ more than Nilsson et al.'s scheme. This extra energy consumption comes as a result of performing three hash operations: one to verify the reverse hash component ($h^{-i}(t_0)$), another one to calculate the forward hash chain component ($h^{i}(s_0)$), and the last one to hash the new group key ($h_{K_{BN}}$) before sending it to B. On the other hand, Protocol 7.2 in Nilsson et al.'s scheme requires N to perform a decryption operation followed by a hash operation, as discussed in Section 6.2. It is worth mentioning that this extra consumption comes as a result of mitigating some weaknesses that exist in Nilsson et al.'s scheme as discussed in Sections 6.2 and 6.3. However, FBSKM is subject to the Sandwich attack as discussed in Section 6.4.
E-FBSKM enhances FBSKM by adding the capability of defending against this attack, but at extra computation cost. It consumes 26 µJ more energy to update the group key in comparison with FBSKM because B, in Protocol 7.6, encrypts the new seed of the reverse hash chain with the next preimage of the current reverse hash chain. This encrypted message is longer by 128 bits than the first message of Protocol 7.3. This means that N in E-FBSKM, upon receiving this message, needs to decrypt it at a cost of 26 µJ more energy than in FBSKM. In other words, this extra energy consumption in Protocol 7.6 comes as a result of decrypting longer messages. It is worth mentioning that the 26 µJ increase in the computation energy consumption is not continuous; it exists only when there is a need to reset the reverse hash chain. How often this process is repeated depends on the time span, defined by B, within which the adversary

is allowed to succeed in launching the Sandwich attack. B may reset the reverse hash chain every time it updates the group key, or it may reset it after l group key renewals. It depends on the protocol specification.

6.7 Summary

In order to measure the resilience of key management protocols, four different types of adversaries varying in their capability with regard to seamless monitoring and software manipulation have been derived in this chapter. As shown in Section 6.1, Nilsson et al.'s scheme, contrary to their claims, turned out to provide neither past key secrecy nor future key secrecy against node compromise by any type of adversary. The design idea of the proposed scheme is to combine Lamport's reverse hash chain with the usual hash chain to provide both past and future key secrecy. The proposal avoids the delivery of the whole value of a new group key for group key update; instead, only half of the value is transmitted from the base station to the sensor nodes. This way, the compromise of a pairwise key alone does not lead to the compromise of the group key, which was not the case in the scheme by Nilsson et al. The new pairwise key in our scheme is determined by Diffie-Hellman based key agreement. However, Nilsson et al.'s scheme uses key transport, not key agreement, where the new pairwise key is determined by the sensor node and then delivered to the trusted party using public key encryption. This is a critical flaw in their scheme. In short, the proposed scheme provides very strong resilience: both past and future key secrecy against node capture by all adversary types except Type IV. A sensor node attacked by an adversary of Type IV, in theory, cannot be quarantined by a cryptographic method alone and requires a non-cryptographic countermeasure such as tamper-proof protection.
The group key update protocol in the proposal comes in two variants. The first variant, FBSKM, has better performance results than the second variant (E-FBSKM), as discussed in Section 6.6. However, FBSKM is threatened by the Sandwich attack, in which the damage caused by the attack is limited to old keys but not future keys. The second variant, E-FBSKM, is able to defend against this attack with little extra communication and computation energy consumption. The performance analysis result in Section 6.6 showed that a sensor node in E-FBSKM consumes approximately µJ and µJ in order to update the pairwise key and the group key, respectively. This energy consumption includes the communication cost and the computation cost as listed in Tables 6.3 and 6.4. E-FBSKM's energy consumption for the pairwise key update protocol is the same as FBSKM's because the proposed scheme has only one version of the pairwise key update protocol. However, E-FBSKM's (and FBSKM's) energy consumption for the pairwise key update protocol is µJ more than Nilsson et al.'s scheme. This difference is due to the security enhancements that are required to overcome the weaknesses in Nilsson et al.'s scheme, as discussed in Section 6.2. To update the group key, E-FBSKM consumes 32.4 µJ and µJ more energy than FBSKM and Nilsson et al.'s scheme, respectively. These additional costs result from defeating the Sandwich attack and

overcoming the weaknesses of Nilsson et al.'s scheme.


Chapter 7 Conclusion and Future Work

In this chapter, we conclude this thesis by summarizing its contributions, and then suggest several open problems and possible research directions.

7.1 Research Summary

We have concentrated, in this thesis, on designing a robust secure data aggregation scheme that considers both the unique characteristics that WSNs have, and possible security attacks that could threaten and affect the aggregation results. Our contributions in this thesis are as follows:

• In Chapter 2, a detailed review of cryptographic-based secure data aggregation schemes in wireless sensor networks was given. The chapter first explained the motivation behind secure data aggregation and discussed the security requirements of secure data aggregation in wireless sensor networks. It then described the adversarial model that can threaten any secure aggregation scheme. The different capabilities an adversary may have against secure data aggregation schemes were discussed. After that, the state-of-the-art in cryptographic-based secure data aggregation schemes was surveyed and classified into two categories: (i) the single aggregator model and (ii) the multiple aggregator model. This classification is based on the number of aggregator nodes and the existence of the verification phase. To provide the security and performance analysis, current cryptographic-based secure data aggregation schemes were compared according to: the security services they provide, the attacks they secure against, and the number of bits required to be sent by all nodes in order to accomplish the aggregation phase.

• In Chapter 3, reputation-based trust systems in wireless sensor networks were reviewed in detail. The chapter first explained the motivation behind adding reputation system capabilities into wireless sensor networks. Reputation systems help to enhance the trustworthiness among sensor nodes.
It then discussed how the integration between wireless sensor networks and reputation systems can open doors for an adversary to threaten

reputation-based trust systems destined for wireless sensor networks, and affect the entire performance. After that, the state-of-the-art in reputation-based trust systems was surveyed and classified into five categories: generic, localization, mobility, routing, and aggregation. Finally, current reputation-based trust systems in wireless sensor networks were compared with respect to: the reputation components they are composed of, and the attacks they secure against.

• In Chapter 4, a Reputation-based Secure Data Aggregation (RSDA) for WSNs was proposed. RSDA minimizes the use of heavy cryptographic mechanisms, and integrates aggregation functionalities with the advantages that are provided by a reputation system in order to enhance the network lifetime and the accuracy of the aggregated data. The chapter also discussed the performance and security analysis of RSDA. In the performance analysis, RSDA was tested in three scenarios, depending on the adversary's capability to affect the aggregation results, as follows: (i) no attack on the data, (ii) abrupt change, and (iii) 1-per-2 strategy-based On-Off attacks. In the first scenario, the optimal value of the threshold for the aggregation ($Thr_A$) was calculated. The value of $Thr_A$ helped the cell members to monitor the behavior of their cell representative. A cell member considers the behavior of its cell representative normal if the variance in the aggregation results between the cell representative and the cell member is less than or equal to $Thr_A$. The second scenario investigated how RSDA handles an abrupt change. The results showed that aggregation results calculated by RSDA had been affected by this abrupt change until the revocation requests were received. Importantly, this effect is temporary and RSDA had a better reaction to this change as soon as the reputation value of the misbehaving representative fell below $Thr_R$.
The third scenario highlighted the limitation of RSDA, which is its ineffectiveness in defeating the On-Off attack. The results showed that RSDA was affected badly by the On-Off attack due to the binary decision making approach. This limitation was later covered in Chapter 5. RSDA is one of the few schemes that consider data availability for secure data aggregation. It takes further action once inconsistency in the aggregated results has been detected. It punishes the cell representative by reducing its reputation value, and once the cell representative's reputation value falls below $Thr_R$, the revocation mechanism is initiated. This helps to prevent this representative from participating in the network, and to select a new trustworthy sensor node to be the next candidate to represent the cell. The security analysis showed that RSDA outperforms other schemes by providing more robustness to security attacks, especially reputation-related attacks. For example, the scheme proposed by Özdemir [91, 92] is vulnerable to Bad Mouthing and Ballot Stuffing attacks, whereas RSDA is not.

• In Chapter 5, a solution to defend against the On-Off attack in reputation-based secure aggregation for WSNs (E-RSDA) was proposed. The significance of this solution is twofold: (i) it mitigates the effect of the On-Off attack on aggregation results, and (ii) it distinguishes between an abrupt change and a temporary departure in heterogeneous

environments. In this chapter, the use of a combination of the estimation theory and the change detection point mechanism was suggested as an extension to the contribution of Chapter 4 (RSDA). The superior performance of this extension was demonstrated through a comparative analysis of E-RSDA with RSDA, the plain estimate, and the reputation-based estimate. The results showed that E-RSDA followed the behavior of the reputation-based estimate during the On-Off attack, but reacted better once the attack was over. E-RSDA re-initialized the estimator as soon as the end of the On-Off attack had been recognized, which ensured quick convergence afterwards with the reputation-based aggregation results. To the best of our knowledge, E-RSDA is the only secure data aggregation scheme in the literature that is able to mitigate the On-Off attack.

The effectiveness of the proposal in distinguishing between abrupt changes and incipient changes was also shown. For the abrupt change, the results showed that E-RSDA reacted better. In contrast with RSDA, E-RSDA delayed the effect of the detected change caused by a compromised cell representative and relied on the reputation-based estimate values during the window's lifetime. However, E-RSDA responded to the detected change faster than the plain estimate and the reputation-based estimate. Upon completing the revocation mechanism and removing the compromised cell representative, E-RSDA responded to the detected change by re-initializing the estimator. For the incipient change, the results showed that the plain and reputation-based estimates of the aggregation result did not reflect the change in the environment and that their reactions to the detected change were slow. RSDA performed well by applying the detected change to the aggregation results immediately. However, this fast reaction came at the cost of being vulnerable to any abrupt change.
E-RSDA behaved better than the plain and reputation-based estimates. However, compared with RSDA, it delayed the effect of the detected change by the window size.

• In Chapter 6, a secure future & past key management scheme, which helps distribute and renew pairwise and group (cell) keys to sensor nodes, was proposed. We applied Lamport's reverse hash chain as well as the usual hash chain to provide both past and future key secrecy. Our scheme avoids delivering the whole value of the new group key in a group key update; instead, only half of the value is transmitted from the network manager to the sensor nodes. This way, the compromise of a pairwise key alone does not lead to the compromise of the group key, which was not the case in the scheme proposed by Nilsson et al. The new pairwise key in our scheme is determined by Diffie-Hellman-based key agreement. The scheme of Nilsson et al., by contrast, uses key transport rather than key agreement: the new pairwise key is determined by the sensor node and then delivered to the network manager using public key encryption. This is a critical flaw in their scheme. The proposed scheme provides very strong resilience, namely both past and future key secrecy against node capture, for all adversary types except Type IV. A sensor node attacked by a Type IV adversary cannot, in theory, be quarantined by a cryptographic method alone and requires a non-cryptographic countermeasure such as tamper-proof protection.
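As a rough illustration of the hash-chain idea summarized for Chapter 6, the following Python sketch contrasts a usual (forward) hash chain with a Lamport-style reverse hash chain. It is not the FBSKM protocol itself: the choice of SHA-256, the seed values, the chain length, and the way the two halves are combined into an epoch key (group_key) are all assumptions made purely for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    """One-way function used for both chains (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()

def forward_chain(seed: bytes, n: int) -> list[bytes]:
    """Usual hash chain: f[0] = seed and f[i+1] = h(f[i])."""
    chain = [seed]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))
    return chain

def reverse_chain(seed: bytes, n: int) -> list[bytes]:
    """Lamport-style chain, used in the reverse order of its generation,
    so that r[i] = h(r[i+1])."""
    return forward_chain(seed, n)[::-1]

def group_key(forward_half: bytes, reverse_half: bytes) -> bytes:
    """Hypothetical combination of the two halves into one epoch key."""
    return h(forward_half + reverse_half)

N = 5
f = forward_chain(b"forward-seed (hypothetical)", N)
r = reverse_chain(b"reverse-seed (hypothetical)", N)
keys = [group_key(f[i], r[i]) for i in range(N)]

# Holding f[2] gives the later forward half f[3] by hashing,
# and holding r[2] gives the earlier reverse half r[1] by hashing.
assert h(f[2]) == f[3]
assert h(r[2]) == r[1]
# A past key needs f[1] and a future key needs r[3]; neither follows from
# f[2] and r[2] without inverting the hash function.
```

In this sketch, only the reverse-chain half would have to be delivered for each update, since a node can derive the next forward-chain half locally. This mirrors, in a simplified way, the statement above that only half of the new group key value is transmitted.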

The group key update protocol in the proposal comes in two variants. The first variant, FBSKM, performs better than the second variant, E-FBSKM, as discussed in Section 6.6. However, FBSKM is threatened by the Sandwich attack, although the damage caused by the attack is limited to old keys and does not extend to future keys. The second variant, E-FBSKM, is able to defend against this attack with little extra communication and computation energy consumption.

7.2 Future Work

The work discussed in this thesis highlights several open problems and areas of future research, which are discussed in this section as follows:

• Improve the robustness of the proposed reputation-based secure data aggregation scheme. In Chapter 5, a reputation-based secure data aggregation scheme was proposed. The scheme is able to defend against one of the complex security attacks, namely the On-Off attack. However, the scheme is limited to detecting an On-Off attack launched from only one child cell. It would be interesting to extend the scheme to more complicated scenarios in which the On-Off attack is launched from more than one cell at the same time. The feasibility of turning the improved scheme into a lightweight distributed intrusion detection system for WSNs could then be another direction for future work.

• Improve the data availability and increase the lifetime of WSNs. Battery consumption poses one of the design challenges in any scheme designed for WSNs, because it determines the network's lifetime. The network's lifetime can be defined as the time elapsed until the first node (or the last node) in the network depletes its energy [139]. One solution that prolongs the lifetime of the network is to reduce the data transmission between sensors and the base station by performing aggregation functions at aggregator nodes (cluster-heads). These nodes, however, execute more functions than non-aggregator nodes.
They collect data received from downstream nodes, apply aggregation functions to it, and then send the aggregated results to the upper aggregator points or to the base station. These functions drain their batteries more quickly than those of other nodes. Once an aggregator node dies or is destroyed for any reason, all nodes downstream of it (in the same cluster) are disconnected from the network, since they no longer have a path toward the base station. Consequently, it would be interesting to develop an aggregator selection and rotation mechanism that enables load balancing by rotating aggregation functionality between trusted sensors. Aggregator selection and rotation is similar to cluster-head selection and rotation, since WSNs are often divided into clusters in which each cluster has a cluster-head that collects data from the sensors within its cluster and performs data aggregation on it. However, most schemes that rotate the cluster-head duties among legitimate nodes are vulnerable to active adversary activities such as replaying old messages, since these schemes send their messages in the clear [29]. These schemes select the cluster-head with respect to one or more metrics,

such as the residual battery energy as in the HEED protocol [139], the number of neighbors within the node's range as in the ACE protocol [20], or a combination of parameters. Unfortunately, existing metrics do not reflect the previous behavior of sensor nodes. Thus, adding node reputation values to the current metrics when rotating the aggregation functionality between legitimate nodes could be a direction for future work.

• Develop new methods for providing data confidentiality in data aggregation schemes. In the contributions of this thesis, data confidentiality is achieved by using hop-by-hop encryption, which requires extra computation. With end-to-end encryption, however, the aggregator does not need to decrypt and re-encrypt data. Instead, it applies the aggregation functions directly to the encrypted data [131]. In other words, this approach promises the combination of end-to-end encryption and in-network aggregation. This can be done effectively by using homomorphic encryption, in which the sum of two encrypted values is equal to the encrypted version of the sum of these two values [42]. This leads to significant benefits compared to the hop-by-hop method: for example, it reduces network traffic, requires a smaller computational effort, and provides improved security [97]. Current schemes provide end-to-end encryption by employing a symmetric-based privacy homomorphism [17, 38, 131]. The disadvantage of these schemes is that they require the same key to be known by each sensor, so compromising any node reveals a large amount of data. The applicability of asymmetric-based homomorphic encryption was investigated by Mykletun et al. [86]. They showed that asymmetric encryption is a feasible solution to the problem of end-to-end encryption for aggregated data. However, they only considered certificate-based asymmetric cryptographic schemes.
Identity-Based asymmetric Encryption (IBE) [11, 44] has already been proposed for WSNs by Oliveira et al. [90], who argued that IBE is ideal for WSNs. However, Oliveira et al.'s design only resists passive adversaries, and consequently does not meet the requirements of an optimal homomorphic candidate, as discussed by Mykletun et al. [86]. It would be interesting to further investigate whether IBE can satisfy the requirements of Mykletun et al. for an optimal homomorphic candidate [86].

• Investigate the use of trusted computing principles in WSNs. The contributions in Chapters 4, 5 and 6 provide software-based security solutions to mitigate the effect of the node compromise attack on secure aggregation functionalities. The use of security services offered by trusted computing could lead to better results in the domain of data aggregation in WSNs. For example, selectively adding a TPM chip to a certain number of sensors as a hardware-based solution can help prevent the extraction of sensitive information from compromised sensors, especially those performing critical tasks, such as aggregators. The main two objectives of

trusted computing are to improve the trustworthiness and security of the computing platform, and to provide reliable hardware-based protection for secrets and sensitive data [100]. These two objectives are ensured by features offered by trusted computing, such as integrity measurement, protected storage, and remote attestation. As discussed in Chapter 2, some existing secure aggregation schemes, such as SDAP [136] and SHDA [22], have a verification phase to ensure the accuracy of the aggregated data. In other words, the base station checks whether a certain aggregator behaves well or not. This process consumes significant resources, since the base station needs to check the readings of each sensor, which consequently floods the network. It would be interesting to investigate whether replacing the verification phase that exists in some secure aggregation schemes with TPM hardware (which allows remote attestation between the base station and the aggregator) would help to enhance data aggregation security in WSNs.
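To make the data-confidentiality direction in Section 7.2 more concrete, the following Python sketch shows a toy additively homomorphic masking: each reading is masked by adding a key modulo M, the aggregator adds ciphertexts without decrypting them, and the sink removes the sum of the keys. This is a deliberate simplification for illustration only. It is not one of the cited schemes and not part of the thesis contributions, and the modulus, the key values, and the function names are assumptions.

```python
M = 2**32  # modulus, assumed large enough to hold the sum of all readings

def encrypt(reading: int, key: int) -> int:
    """Mask a reading by modular addition of a per-sensor key."""
    return (reading + key) % M

def aggregate(ciphertexts: list[int]) -> int:
    """The aggregator adds ciphertexts; it never sees a plaintext reading."""
    return sum(ciphertexts) % M

def decrypt_sum(aggregate_ct: int, keys: list[int]) -> int:
    """The sink removes the sum of all keys to recover the sum of readings."""
    return (aggregate_ct - sum(keys)) % M

readings = [21, 23, 22, 25]
# Hypothetical keys; a real deployment would need a fresh key per message.
keys = [0x1A2B3C4D, 0x0BADF00D, 0x12345678, 0x0F0F0F0F]

cts = [encrypt(m, k) for m, k in zip(readings, keys)]
total = decrypt_sum(aggregate(cts), keys)
assert total == sum(readings)
```

The sketch only supports the SUM aggregate, and the sink must know which sensors contributed so that it can subtract the right keys. It gives per-sensor keys to avoid the single-shared-key weakness mentioned above, at the cost of this extra bookkeeping.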

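The monitoring rule summarized for RSDA in Section 7.1 can be sketched in Python as follows. The sketch is a minimal illustration under stated assumptions rather than the RSDA implementation: the threshold values, the use of an absolute difference as the deviation measure, and the beta-style reputation formula (good + 1) / (good + bad + 2) are all hypothetical choices.

```python
from dataclasses import dataclass

THR_A = 1.0  # hypothetical aggregation threshold
THR_R = 0.4  # hypothetical reputation threshold

@dataclass
class RepresentativeRecord:
    good: int = 0  # rounds in which the reported result looked normal
    bad: int = 0   # rounds in which the reported result was inconsistent

    @property
    def reputation(self) -> float:
        # Expected value of a Beta(good + 1, bad + 1) distribution (assumed model).
        return (self.good + 1) / (self.good + self.bad + 2)

def observe(record: RepresentativeRecord, own: float, reported: float) -> bool:
    """Binary decision for one round; returns True when the cell member
    should request revocation of the cell representative."""
    if abs(own - reported) <= THR_A:
        record.good += 1
    else:
        record.bad += 1
    return record.reputation < THR_R

record = RepresentativeRecord()
own_result = 20.0
reported_results = [20.2, 19.9, 27.0, 27.5, 28.0, 26.0]
decisions = [observe(record, own_result, rep) for rep in reported_results]
# With these numbers, revocation is requested only in the sixth round.
```

Because each round is judged as simply good or bad, a representative that alternates between honest and dishonest rounds can keep its reputation above the threshold, which is consistent with the On-Off limitation noted for RSDA.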
Bibliography

[1] Alfarez Abdul-Rahman and Stephen Hailes. Supporting trust in virtual communities. In Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, HICSS '00, Maui, Island of Hawaii, USA, volume 6, January 04-07.
[2] Ejaz Ahmad. Monitoring and analysis of internet traffic targeting unused address spaces. PhD in Computer Science, Queensland University of Technology, Brisbane, Australia. [Online]. Available: [Accessed: July 10, 2010].
[3] A.A. Ahmed, H. Shi, and Y. Shang. A survey on network protocols for wireless sensor networks. In Proceedings of the IEEE International Conference on Information Technology: Research and Education, ITRE '03, Newark, New Jersey, USA, August 11-13.
[4] Ian F. Akyildiz, Weilian Su, Yogesh Sankarasubramaniam, and Erdal Cayirci. Wireless sensor networks: a survey. Computer Networks, 38(4).
[5] Hani Alzaid, Ernest Foo, and Juan González Nieto. RSDA: Reputation-based secure data aggregation in wireless sensor networks. In Proceedings of the 9th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT '08, Dunedin, New Zealand, December 1-4.
[6] Hani Alzaid, Ernest Foo, and Juan Manuel González Nieto. Secure data aggregation in wireless sensor network: A survey. In Proceedings of the 6th Australasian Conference on Information Security, AISC '08, Wollongong, NSW, Australia, January 1.
[7] Michèle Basseville and Igor V. Nikiforov. Change detection algorithms. In Detection of Abrupt Changes: Theory and Application. Prentice-Hall.
[8] Alexander Becher, Zinaida Benenson, and Maximillian Dornseif. Tampering with motes: Real-world physical attacks on wireless sensor networks. In Clark et al. [27].
[9] Daniel Bernoulli. Exposition of a new theory on the measurement of risk. Econometrica, 22(1):23-36.

[10] Tatiana Bokareva. A mini hardware survey. WWW page. [Online]. Available: [Accessed: February 10, 2010].
[11] Dan Boneh, Eu-Jin Goh, and Kobbi Nissim. Evaluating 2-DNF formulas on ciphertexts. In Joe Kilian, editor, TCC, volume 3378 of Lecture Notes in Computer Science. Springer.
[12] Azzedine Boukerche and Yonglin Ren. A trust-based security system for ubiquitous and pervasive computing environments. Computer Communications, 31(18).
[13] Azzedine Boukerche, Li Xu, and Khalil El-Khatib. Trust-based security for wireless ad hoc and sensor networks. Computer Communications, 30(11-12).
[14] Sonja Buchegger and Jean-Yves Le Boudec. Performance analysis of the Confidant protocol. In Proceedings of the 3rd ACM International Symposium on Mobile Ad Hoc Networking and Computing, MobiHoc '02, Lausanne, Switzerland, June 9-11.
[15] Scott Buffett, Nathan Scott, Bruce Spencer, Michael Richter, and Michael W. Fleming. Determining internet users' values for private information. In Proceedings of the 2nd Annual Conference on Privacy, Security and Trust (PST), Wu Centre, University of New Brunswick, Fredericton, New Brunswick, Canada, pages 79-88, October 13-15, 2004.
[16] David Carman, Peter Kruus, and Brian Matt. Constraints and approaches for distributed sensor network security. Technical Report, 3060 Washington Road, Glenwood, MD, September. NAI Labs, The Security Research Division. [Online]. Available: Spring04/papers/nailabs_report_00-010_final.pdf [Accessed: February 10, 2010].
[17] Claude Castelluccia, Einar Mykletun, and Gene Tsudik. Efficient aggregation of encrypted data in wireless sensor networks. In Proceedings of the 2nd Annual International Conference on Mobile and Ubiquitous Systems, MobiQuitous '05, San Diego, CA, USA, July 17-21.
[18] Rodrigo Román Castro. Application-Driven Security in Wireless Sensor Networks. PhD in Computer Science, University of Malaga, Malaga, Spain. [Online]. Available: [Accessed: February 10, 2010].
[19] Erdal Cayirci and Tolga Coplu. SENDROM: Sensor networks for disaster relief operations management. Wireless Networks, 13(3).
[20] Haowen Chan and Adrian Perrig. ACE: An emergent algorithm for highly uniform cluster formation. In Holger Karl, Andreas Willig, and Adam Wolisz, editors, Proceedings of the 1st European Workshop on Wireless Sensor Networks, EWSN '04, Berlin, Germany, volume 2920 of Lecture Notes in Computer Science. Springer, January 19-21, 2004.

[21] Haowen Chan, Adrian Perrig, Bartosz Przydatek, and Dawn Xiaodong Song. SIA: Secure information aggregation in sensor networks. Journal of Computer Security, 15(1):69-102.
[22] Haowen Chan, Adrian Perrig, and Dawn Song. Secure hierarchical in-network aggregation in sensor networks. In Ari Juels, Rebecca N. Wright, and Sabrina De Capitani di Vimercati, editors, Proceedings of the 13th ACM Conference on Computer and Communications Security, CCS '06, Alexandria, Virginia, USA. ACM, October 30 - November 3.
[23] Chih-Chun Chang, Sead Muftic, and David J. Nagel. Measurement of energy costs of security in wireless sensor nodes. In Proceedings of the 16th IEEE International Conference on Computer Communications and Networks, ICCCN '07, Turtle Bay Resort, Honolulu, Hawaii, USA, August 13-16.
[24] Elizabeth Chang, Tharam Dillon, and Farookh K. Hussain. Trust and Reputation for Service-Oriented Environments. John Wiley & Sons Ltd., West Sussex PO19 8SQ, England.
[25] Haiguang Chen. Task-based trust management for wireless sensor networks. International Journal of Security and its Applications, 3(2):21-26, April.
[26] Haiguang Chen, Huafeng Wu, Jinchu Hu, and Chuanshan Gao. Agent-based trust management model for wireless sensor networks. In Proceedings of the International Conference on Multimedia and Ubiquitous Engineering, MUE '08, Busan, Korea, April 24-26.
[27] John A. Clark, Richard F. Paige, Fiona Polack, and Phillip J. Brooke, editors. Proceedings of the 3rd International Conference on Security in Pervasive Computing, SPC '06, York, UK, volume 3934 of Lecture Notes in Computer Science. Springer, April 18-21.
[28] Geoffrey M. Clarke and Dennis Cooke. A Basic Course in Statistics. Hodder Arnold, 338 Euston Road, London NW1 3BH, UK.
[29] Garth V. Crosby and Niki Pissinou. Cluster-based reputation and trust for wireless sensor networks. In Proceedings of the 4th IEEE Consumer Communications and Networking Conference, CCNC '07, Las Vegas, Nevada, United States, January 11-13.
[30] Crossbow Technology Inc. Mica2 datasheet. [Online]. Available: xbow.com/products/productdetails.aspx?sid=174 [Accessed: October 10, 2009].
[31] Crossbow Technology Inc. Micaz datasheet. [Online]. Available: xbow.com/products/product_pdf_files/wireless_pdf/micaz_datasheet.pdf [Accessed: October 10, 2009].
[32] CSIRO Australia. Fleck datasheet. [Online]. Available: csiro.au/fleck1.htm [Accessed: October 10, 2009].

[33] Partha Dasgupta. Trust as a commodity. In Diego Gambetta, editor, Trust: Making and Breaking Cooperative Relations. Department of Sociology, University of Oxford. [Online]. Available: dasgupta49-72.pdf [Accessed: October 10, 2009].
[34] Robert Dawson, Colin Boyd, Ed Dawson, and Juan Manuel González Nieto. SKMA: a key management architecture for SCADA systems. In Rajkumar Buyya, Tianchi Ma, Reihaneh Safavi-Naini, Chris Steketee, and Willy Susilo, editors, Proceedings of the 4th Australasian Symposium on Grid Computing and e-Research (AusGrid 2006) and the 4th Australasian Information Security Workshop (Network Security) (AISW 2006), ACSW Frontiers '06, Hobart, Tasmania, Australia, volume 54 of CRPIT, January 16-19.
[35] Giacomo de Meulenaer, François Gosset, François-Xavier Standaert, and Olivier Pereira. On the energy cost of communication and cryptography in wireless sensor networks. In Proceedings of the 4th IEEE International Conference on Wireless & Mobile Computing, Networking & Communication, WIMOB '08, Avignon, France, October 12-14.
[36] Giacomo de Meulenaer and Francois-Xavier Standaert. Stealthy compromise of wireless sensor nodes with power analysis attacks. In Proceedings of the 2nd International Conference on Mobile Lightweight Wireless Systems, MOBILIGHT '10, Barcelona, Spain, in press, May 10-12. [Online]. Available: 81da e92d3.6d6f62696c e pdf [Accessed: June 10, 2010].
[37] Prashant Dewan and Partha Dasgupta. Trusting routers and relays in ad hoc networks. In Proceedings of the 32nd International Conference on Parallel Processing, ICPP '03, Kaohsiung, Taiwan, October 6-9.
[38] Josep Domingo-Ferrer. A provably secure additive and multiplicative privacy homomorphism. In Agnes Hui Chan and Virgil D. Gligor, editors, Proceedings of the 5th International Conference on Information Security, ISC '02, Sao Paulo, Brazil, volume 2433 of Lecture Notes in Computer Science. Springer, September 30 - October 2.
[39] John R. Douceur. The sybil attack. In Peter Druschel, M. Frans Kaashoek, and Antony I. T. Rowstron, editors, Proceedings of the 1st International Workshop on Peer-to-Peer Systems, IPTPS '02, Cambridge, MA, USA, March 7-8, 2002, Revised Papers, volume 2429 of Lecture Notes in Computer Science. Springer.
[40] Wenliang Du, Jing Deng, Yunghsiang S. Han, and Pramod Varshney. A witness-based approach for data fusion assurance in wireless sensor networks. In Proceedings of the IEEE Global Communications Conference, GLOBECOM '03, San Francisco, USA, volume 3, December 1-5.
[41] Michal Feldman and John Chuang. Overcoming free-riding behavior in peer-to-peer systems. SIGecom Exchanges, 5(4):41-50, 2005.

[42] Caroline Fontaine and Fabien Galand. A survey of homomorphic encryption for nonspecialists. [Online]. Available: / pdf [Accessed: February 10, 2010].
[43] Keith B. Frikken and Joseph A. Dougherty IV. An efficient integrity-preserving scheme for hierarchical sensor aggregation. In Proceedings of the 1st ACM Conference on Wireless Network Security, WISEC '08, Alexandria, VA, USA. ACM, March 31 - April 02.
[44] S. Galbraith, K. Paterson, and N. Smart. Pairings for cryptographers. Technical Report 165, International Association for Cryptologic Research. [Online]. Available: http://eprint.iacr.org/2006/165 [Accessed: February 10, 2010].
[45] Diego Gambetta. Can we trust trust? In Trust: Making and Breaking Cooperative Relations. Basil Blackwell.
[46] Saurabh Ganeriwal, Laura K. Balzano, and Mani B. Srivastava. Reputation-based framework for high integrity sensor networks. ACM Transactions on Sensor Networks, 4(3):1-37.
[47] Saurabh Ganeriwal and Mani B. Srivastava. Reputation-based framework for high integrity sensor networks. In Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks, SASN '04, Washington, DC, USA, pages 66-77, October 25.
[48] Jennifer Ann Golbeck. Computing and applying trust in web-based social networks. PhD thesis, College Park, MD, USA. Chair: James Hendler.
[49] Dima Grigoriev and Ilia V. Ponomarenko. Homomorphic public key cryptosystems over groups and rings. CoRR, cs.cr/.
[50] F. E. Grubbs. Procedures for detecting outlying observations in samples. Technometrics, 11(1):1-21.
[51] Christoph G. Günther. An identity-based key exchange protocol. In Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques, EUROCRYPT '89, Houthalen, Belgium, Lecture Notes in Computer Science. Springer, April 10-13.
[52] Fredrik Gustafsson. Adaptive filtering and change detection. John Wiley & Sons, Ltd. [Online]. Available: [Accessed: February 10, 2010].
[53] Parisa Haghani, Panagiotis Papadimitratos, Marcin Poturalski, Karl Aberer, and Jean-Pierre Hubaux. Efficient and robust secure aggregation for sensor networks. CoRR, abs/, 2008.

[54] Carl Hartung, James Balasalle, and Richard Han. Node compromise in sensor networks: The need for secure systems. Technical report, University of Colorado at Boulder, January. [Online]. Available:
[55] Tian He, Pascal Vicaire, Ting Yan, Liqian Luo, Lin Gu, Gang Zhou, Radu Stoleru, Qing Cao, John A. Stankovic, and Tarek F. Abdelzaher. Achieving real-time target tracking using wireless sensor networks. In Proceedings of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS '06, San Jose, California, USA, pages 37-48, April 4-7.
[56] W.B. Heinzelman, A.P. Chandrakasan, and H. Balakrishnan. An application-specific protocol architecture for wireless microsensor networks. IEEE Transactions on Wireless Communications, 1(4), October.
[57] Jason L. Hill, Robert Szewczyk, Alec Woo, Seth Hollar, David E. Culler, and Kristofer S. J. Pister. System architecture directions for networked sensors. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '00, Cambridge, MA, USA, November 12-15.
[58] Lingxuan Hu and David Evans. Secure aggregation for wireless network. In Proceedings of the 2003 Symposium on Applications and the Internet Workshops, SAINT '03, Orlando, FL, USA, January 27-31.
[59] Mohammad Ilyas and Imad Mahgoub. Handbook of Sensor Networks: Compact Wireless and Wired Sensing Systems. CRC Press, Boca Raton, Florida 33431, USA.
[60] Roslan Ismail. Security of Reputation Systems. PhD in Computer Science, Queensland University of Technology, Brisbane, Australia. [Online]. Available: http://eprints.qut.edu.au/15964/ [Accessed: February 10, 2010].
[61] Aravind Iyer, Sunil S. Kulkarni, Vivek Mhatre, and Catherine P. Rosenberg. A taxonomy-based approach to design of large-scale sensor networks. In Yingshu Li, My T. Thai, and Weili Wu, editors, Wireless Sensor Networks and Applications, Signals and Communication Technology, chapter 1. Springer Science & Business Media, LLC, New York, USA.
[62] Pawan Jadia and Anish Mathuria. Efficient secure aggregation in sensor networks. In Proceedings of the 11th Conference on High Performance Computing, HiPC '04, Bangalore, India, volume 3296 of Lecture Notes in Computer Science. Springer, December 19-22.
[63] Audun Jøsang and Jennifer Golbeck. Challenges for robust trust and reputation systems. In Proceedings of the 5th International Workshop on Security and Trust Management, STM '09, Saint Malo, France, pages 1-6, September 24-25, 2009.

[64] Audun Jøsang and Roslan Ismail. The beta reputation system. In Proceedings of the 15th Bled Conference on Electronic Commerce, e-Reality: Constructing the e-Economy, Bled, Slovenia, June 17-19.
[65] Audun Jøsang, Roslan Ismail, and Colin Boyd. A survey of trust and reputation systems for online service provision. Decision Support Systems, 43(2).
[66] Audun Jøsang, Xixi Luo, and Xiaowu Chen. Continuous ratings in discrete bayesian reputation systems. In Proceedings of the 2nd Joint iTrust and PST Conference on Privacy, Trust Management and Security, IFIPTM '08, Trondheim, Norway, June 18-20.
[67] Chris Karlof and David Wagner. Secure routing in wireless sensor networks: attacks and countermeasures. Ad Hoc Networks, 1(2-3).
[68] Claudia Keser. Experimental games for the design of reputation management systems. IBM Systems Journal, 42(3).
[69] Kashif Kifayat, Madjid Merabti, Qi Shi, and David Llewellyn-Jones. Security in wireless sensor networks. In Mark Stamp and Peter Stavroulakis, editors, Handbook of Information and Communication Security, chapter 26. Springer Berlin Heidelberg.
[70] Marek Klonowski, Miroslaw Kutylowski, Michal Ren, and Katarzyna Rybarczyk. Forward-secure key evolution in wireless sensor networks. In Feng Bao, San Ling, Tatsuaki Okamoto, Huaxiong Wang, and Chaoping Xing, editors, Proceedings of the 6th International Conference on Cryptology and Network Security, CANS '07, Singapore, volume 4856 of Lecture Notes in Computer Science. Springer, December 8-10.
[71] Christoph Krauß, Markus Schneider, and Claudia Eckert. On handling insider attacks in wireless sensor networks. Information Security Technical Report, 13. Elsevier.
[72] Bhaskar Krishnamachari, Deborah Estrin, and Stephen B. Wicker. The impact of data aggregation in wireless sensor networks. In Proceedings of the 22nd International Conference on Distributed Computing Systems, ICDCSW '02, Vienna, Austria, July 2-5.
[73] Leslie Lamport. Password authentication with insecure communication. Communications of the ACM, 24(11).
[74] Yee Wei Law, Jeroen Doumen, and Pieter H. Hartel. Survey and benchmark of block ciphers for wireless sensor networks. ACM Transactions on Sensor Networks (TOSN), 2(1):65-93.
[75] Ajay Mahimkar and Theodore S. Rappaport. SecureDAV: A secure data aggregation and verification protocol for sensor networks. In Proceedings of the Global Telecommunications Conference, GLOBECOM '04, Dallas, United States, volume 4, November 29 - December 3.
[76] Alan M. Mainwaring, David E. Culler, Joseph Polastre, Robert Szewczyk, and John Anderson. Wireless sensor networks for habitat monitoring. In Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, WSNA '02, Atlanta, Georgia, USA, pages 88-97, September 28.
[77] Sergio Marti, Thomas J. Giuli, Kevin Lai, and Mary Baker. Mitigating routing misbehavior in mobile ad hoc networks. In Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, MOBICOM '00, Boston, Massachusetts, United States, August 6-11.
[78] Sjouke Mauw, Ivo van Vessem, and Bert Bos. Forward secure communication in wireless sensor networks. In Clark et al. [27].
[79] Ralph C. Merkle. Protocols for public key cryptosystems. In IEEE Symposium on Security and Privacy, Oakland, California, United States, April 14-16.
[80] Pietro Michiardi and Refik Molva. Simulation-based analysis of security exposures in mobile ad hoc networks. In Proceedings of the European Wireless Conference, EW '02, Florence, Italy, February 25-28.
[81] Pietro Michiardi and Refik Molva. CORE: a collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks. In Borka Jerman-Blazic and Tomaz Klobucar, editors, Proceedings of the IFIP Conference on Communications and Multimedia Security, Portoroz, Slovenia, volume 228, September 26-27.
[82] Aleksandar Milenkovic, Chris Otto, and Emil Jovanov. Wireless sensor networks for personal health monitoring: Issues and an implementation. Computer Communications, 29(13-14).
[83] Oskar Morgenstern and John Von Neumann. Theory of Games and Economic Behavior. Princeton University Press, New York, third edition.
[84] Lik Mui, Mojdeh Mohtashemi, and Ari Halberstadt. A computational model of trust and reputation for e-businesses. In Proceedings of the 35th Annual Hawaii International Conference on System Sciences, HICSS '02, Hilton Waikoloa Village, Island of Hawaii, USA, volume 7, page 188, January 7-10.
[85] C. Siva Ram Murthy and B.S. Manoj. Ad Hoc Wireless Sensor Networks Architectures and Protocols. Prentice Hall PTR, Upper Saddle River, NJ, USA.
[86] Einar Mykletun, Joao Girão, and Dirk Westhoff. Public key based cryptoschemes for data concealment in wireless sensor networks. In Proceedings of the IEEE International Conference on Communications, ICC '06, Istanbul, Turkey, volume 5, June 11-15, 2006.

[87] James Newsome, Elaine Shi, Dawn Xiaodong Song, and Adrian Perrig. The sybil attack in sensor networks: analysis & defenses. In Kannan Ramchandran, Janos Sztipanovits, Jennifer C. Hou, and Thrasyvoulos N. Pappas, editors, Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks, IPSN '04, Berkeley, California, USA, April 26-27.
[88] Dennis K. Nilsson, Tanya Roosta, Ulf Lindqvist, and Alfonso Valdes. Key management and secure software updates in wireless process control environments. In Virgil D. Gligor, Jean-Pierre Hubaux, and Radha Poovendran, editors, Proceedings of the 1st ACM Conference on Wireless Network Security, WISEC '08, Alexandria, VA, USA, March 31 - April 02.
[89] Miyako Ohkubo, Koutarou Suzuki, and Shingo Kinoshita. Cryptographic approach to privacy-friendly tags. In RFID Privacy Workshop, Cambridge, MA, USA, November 15.
[90] Leonardo B. Oliveira, Ricardo Dahab, Julio López, Felipe Daguano, and Antonio A. F. Loureiro. Identity-based encryption for sensor networks. In Proceedings of the 5th IEEE International Conference on Pervasive Computing and Communications Workshops, PERCOMW '07, White Plains, NY, USA, March 19-23.
[91] Suat Özdemir. Functional reputation based data aggregation for wireless sensor networks. In Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, WiMob '08, Avignon, France, October.
[92] Suat Özdemir. Functional reputation based reliable data aggregation and transmission for wireless sensor networks. Computer Communications, 31(17).
[93] Suat Özdemir and Yang Xiao. Secure data aggregation in wireless sensor networks: A comprehensive overview. Computer Networks, 53(12).
[94] Sergio Palazzo, Marco Conti, and Raghupathy Sivakumar, editors. Proceedings of the 7th ACM International Symposium on Mobile Ad Hoc Networking and Computing, MobiHoc 2006, Florence, Italy, May 22-25, 2006.
[95] Adrian Perrig, John A. Stankovic, and David Wagner. Security in wireless sensor networks. Communications of the ACM, 47(6):53-57.
[96] Adrian Perrig, Robert Szewczyk, J. D. Tygar, Victor Wen, and David E. Culler. SPINS: security protocols for sensor networks. Wireless Networks, 8(5).
[97] Steffen Peter, Krzysztof Piotrowski, and Peter Langendoerfer. On concealed data aggregation for wireless sensor networks. In Proceedings of the 4th IEEE Consumer Communications and Networking Conference, CCNC '07, Las Vegas, Nevada, United States, January 11-13, 2007.

[98] Ludovic Pietre-Cambacedes and Pascal Sitbon. Cryptographic key management for SCADA systems-issues and perspectives. International Journal of Security and its Applications, 2(3):31-40, July
[99] Bartosz Przydatek, Dawn Xiaodong Song, and Adrian Perrig. SIA: Secure information aggregation in sensor networks. In Ian F. Akyildiz, Deborah Estrin, David E. Culler, and Mani B. Srivastava, editors, Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, SenSys 03, Los Angeles, California, USA, pages , November 5-7,
[100] Jason Reid, Juan Manuel González Nieto, Ed Dawson, and Eiji Okamoto. Privacy and trusted computing. In Proceedings of the 14th International Workshop on Database and Expert Systems Applications, DEXA 03, Prague, Czech Republic, pages , September 1-5,
[101] Kui Ren, Wenjing Lou, and Yanchao Zhang. LEDS: Providing location-aware end-to-end data security in wireless sensor networks. IEEE Transactions on Mobile Computing, 7(5): ,
[102] Michal Ren, Tanmoy Kanti Das, and Jianying Zhou. Diverging keys in wireless sensor networks. In Sokratis K. Katsikas, Javier Lopez, Michael Backes, Stefanos Gritzalis, and Bart Preneel, editors, Proceedings of the 9th International Conference on Information Security, ISC 06, Samos Island, Greece, volume 4176 of Lecture Notes in Computer Science, pages Springer, August 30 - September 2,
[103] Yonglin Ren and Azzedine Boukerche. Modeling and managing the trust for wireless and mobile ad hoc networks. In Proceedings of IEEE International Conference on Communications, ICC 08, Beijing, China, pages , May 19-23,
[104] Sebastian Ries. Extending bayesian trust models regarding context-dependence and user friendly representation. In Proceedings of the 24th ACM Symposium on Applied Computing, SAC 09, Honolulu, Hawaii, United States, pages , March 9-12,
[105] Ronald L. Rivest. The MD5 message-digest algorithm. Published as Request for Comments 1321 (RFC 1321) by the Internet Engineering Task Force, [Online]. Available [Accessed: 18th of February 2010].
[106] Rodrigo Román, M. Carmen Fernandez-Gago, Javier Lopez, and Hsiao-Hwa Chen. Trust and reputation systems for wireless sensor networks. In Stefanos Gritzalis, Tom Karygiannis, and Charalabos Skianis, editors, Security and Privacy in Mobile and Wireless Networking, pages Troubador Publishing Ltd.,
[107] Tanya Roosta, Shiuhpyng Shieh, and Shankar Sastry. Taxonomy of security attacks in sensor networks. In Proceedings of the 1st IEEE International Conference on System Integration and Reliability Improvements, SIRI 06, Hanoi, Vietnam, December 13-15, 2006.

[108] Andrew P. Sage and James L. Melsa. Basic estimation theory. In A. V. Balakrishnan, George Dantzig, and Lotfi Zadeh, editors, Estimation Theory with Applications to Communication and Control, pages McGraw-Hill Book Company,
[109] Yingpeng Sang, Hong Shen, Yasushi Inoguchi, Yasuo Tan, and Naixue Xiong. Secure data aggregation in wireless sensor networks: A survey. In Proceedings of the 7th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 06, Taipei, Taiwan, pages , December 4-7,
[110] H. Ozgur Sanli, Suat Özdemir, and Hassan Cam. SRDA: secure reference-based data aggregation protocol for wireless sensor networks. In Proceedings of the 60th IEEE Vehicular Technology Conference, VTC 04, Los Angeles, USA, volume 7, pages , September 26-29,
[111] Scalable Networks, Inc. Qualnet datasheet, The datasheet of the QUALNET network simulator was retrieved on the 10th of September 2008, from com/products/.
[112] Sanjeev Setia, Sankardas Roy, and Sushil Jajodia. Secure data aggregation in wireless sensor networks. In Javier Lopez and Jianying Zhou, editors, Wireless Sensor Network Security, chapter 8, pages IOS Press,
[113] Riaz Ahmed Shaikh, Hassan Jameel, Brian J. d'Auriol, Heejo Lee, Sungyoung Lee, and Young Jae Song. Group-based trust management scheme for clustered wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems, 20(11): ,
[114] Riaz Ahmed Shaikh, Hassan Jameel, Sungyoung Lee, Saeed Rajput, and Young Jae Song. Trust management problem in distributed wireless sensor networks. In Proceedings of the 12th IEEE Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA 06, Sydney, Australia, pages , August 16-18,
[115] Elaine Shi and Adrian Perrig. Designing secure sensor networks. IEEE Personal Communications, 11(6):38-43,
[116] Gyula Simon, Miklós Maróti, Ákos Lédeczi, György Balogh, Branislav Kusy, András Nádas, Gábor Pap, János Sallai, and Ken Frampton. Sensor network-based countersniper system. In Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, SenSys 04, Baltimore, MD, USA, pages 1-12, November 3-5,
[117] Avinash Srinivasan, Feng Li, and Jie Wu. A novel CDS-based reputation monitoring system for wireless sensor networks. In Proceedings of the 28th IEEE International Conference on Distributed Computing Systems Workshops, ICDCS 08, Beijing, China, pages , June 17-20,
[118] Avinash Srinivasan, Joshua Teitelbaum, and Jie Wu. DRBTS: Distributed reputation-based beacon trust system. In Proceedings of the 2nd International Symposium on Dependable Autonomic and Secure Computing, DASC 06, Indianapolis, Indiana, USA, pages , September 29 - October 1,
[119] Yan Lindsay Sun, Zhu Han, Wei Yu, and K. J. Ray Liu. A trust evaluation framework in distributed networks: Vulnerability analysis and defense against attacks. In Proceedings of the 25th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, Barcelona, Catalunya, Spain, pages 1-13, April 23-29,
[120] Piotr Szczechowiak, Leonardo B. Oliveira, Michael Scott, Martin Collier, and Ricardo Dahab. NanoECC: Testing the limits of elliptic curve cryptography in sensor networks. In Roberto Verdone, editor, Proceedings of the 5th European Conference on Wireless Sensor Networks, EWSN 08, Bologna, Italy, volume 4913 of Lecture Notes in Computer Science, pages Springer, January 30 - February 1,
[121] W. T. Luke Teacy, Jigar Patel, Nicholas R. Jennings, and Michael Luck. TRAVOS: Trust and reputation in the context of inaccurate information sources. Autonomous Agents and Multi-Agent Systems, 12(2): ,
[122] Ramnath Venugopalan, Prasanth Ganesan, Pushkin Peddabachagari, Alexander Dean, Frank Mueller, and Mihail Sichitiu. Encryption overhead in embedded systems and sensor network nodes: Modeling and analysis. In Proceedings of the International Conference on Compilers, Architecture and Synthesis for Embedded Systems, CASES 03, San Jose, California, USA, pages , October 30 - November,
[123] Marcos Augusto M. Vieira, Adriano B. da Cunha, and Diógenes Cecilio da Silva Junior. Designing wireless sensor nodes. In Stamatis Vassiliadis, Stephan Wong, and Timo Hämäläinen, editors, Proceedings of the 6th International Workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 06, Samos, Greece, volume 4017 of Lecture Notes in Computer Science, pages Springer, July 17-20,
[124] Marcos Augusto M. Vieira, Claudionor N. Coelho Jr., Diógenes Cecilio da Silva Junior, and José M. da Mata. Survey on wireless sensor network devices. In Proceedings of the 9th IEEE Conference on Emerging Technologies and Factory Automation, ETFA 03, Lisbon, Portugal, volume 1, pages , September 16-19,
[125] Mehmet C. Vuran, Özgür B. Akan, and Ian F. Akyildiz. Spatio-temporal correlation: Theory and applications for wireless sensor networks. Computer Networks, 45(3): ,
[126] Ashraf Wadda, Kennie Jones, Stephen Olariu, and Mohamed Elthweissy. A scalable solution for securing wireless sensor networks. In Jie Wu, editor, Handbook on Theoretical and Algorithmic Aspects of Sensor, Ad Hoc Wireless, and Peer-to-Peer Networks, chapter 33, pages Auerbach Publications, CRC Press, Taylor & Francis Group, New York, USA, 2006.

[127] David Wagner. Cryptanalysis of an algebraic privacy homomorphism. In Colin Boyd and Wenbo Mao, editors, Proceedings of the 6th International Conference on Information Security, ISC 03, Bristol, UK, volume 2851 of Lecture Notes in Computer Science, pages Springer, October 1-3,
[128] David Wagner. Resilient aggregation in sensor networks. In Sanjeev Setia and Vipin Swarup, editors, Proceedings of the 2nd ACM Workshop on Security of ad hoc and Sensor Networks, SASN 04, Washington, DC, USA, pages 78-87, October 25,
[129] John Paul Walters, Zhengqiang Liang, Weisong Shi, and Vipin Chaudhary. Wireless sensor network security: A survey. In Yang Xiao, editor, Security in Distributed, Grid, and Pervasive Computing, chapter 17, pages Auerbach Publications, CRC Press, Taylor & Francis Group,
[130] Arvinderpal Wander, Nils Gura, Hans Eberle, Vipul Gupta, and Sheueling Chang Shantz. Energy analysis of public key cryptography for wireless sensor networks. In Proceedings of the 3rd IEEE International Conference on Pervasive Computing and Communications, PerCom 05, Kauai, Hawaii, pages , March 8-12,
[131] Dirk Westhoff, Joao Girão, and Mithun Acharya. Concealed data aggregation for reverse multicast traffic in sensor networks: Encryption, key distribution, and routing adaptation. IEEE Transactions on Mobile Computing, 5(10): ,
[132] Andrew Whitby, Audun Jøsang, and Jadwiga Indulska. Filtering out unfair ratings in bayesian reputation systems. In the Workshop on Trust in Agent Societies, at the 3rd International Joint Conference on Autonomous Agents & Multi Agent Systems, AAMAS 04, New York, United States.
[133] Michael M. Woolfson and Malcolm S. Woolfson. Mathematics for Physics. Oxford University Press, New York, USA,
[134] Deqin Xiao, Jianzhao Feng, and Huanguo Zhang. A formal reputation system for trusting wireless sensor network. Wuhan University Journal of Natural Sciences, 13(2): , April,
[135] Zheng Yan, Peng Zhang, and Teemupekka Virtanen. Trust evaluation based security solution in ad hoc networks. Technical report, December [Online]. Available: security_solution_ad_hoc_networks [Accessed: February 10, 2010].
[136] Yi Yang, Xinran Wang, Sencun Zhu, and Guohong Cao. SDAP: a secure hop-by-hop data aggregation protocol for sensor networks. In Palazzo et al. [94], pages
[137] Zhiying Yao, Daeyoung Kim, and Yoonmee Doh. PLUS: parameterised localised trust management-based security framework for sensor networks. IJSNET, 3(4): , 2008.

[138] Zhiying Yao, Daeyoung Kim, Insun Lee, Kiyoung Kim, and Jongsoo Jang. A security framework with trust management for sensor networks. In Proceedings of the 1st International Conference on Security and Privacy for Emerging Areas in Communication Networks, SecureComm 05, Athens, Greece, pages , September 5-9,
[139] Ossama Younis and Sonia Fahmy. HEED: A hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks. IEEE Transactions on Mobile Computing, 3(4): , 2004.