SAFE: A Social Based Updatable Filtering Protocol with Privacy-preserving in Mobile Social Networks

IEEE ICC 2013 - Wireless Networking Symposium

Kuan Zhang, Xiaohui Liang, Rongxing Lu, and Xuemin (Sherman) Shen
Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1
Email: {k2zhang, x27liang, rxlu, xshen}@bbcr.uwaterloo.ca

Abstract

Mobile Social Networks (MSNs), as an emerging social networking platform, facilitate social interaction and information sharing among users in proximity. Spam filtering protocols are extremely important for reducing communication and storage overhead when many spam packets without specific destinations are diffused in MSNs. In this paper, we propose an effective social based updatable filtering protocol (SAFE) with privacy preservation in MSNs. Specifically, we first construct a filter Hash tree based on the properties of the Merkle tree. Then, we exploit social relationships and select those users who share more than a specific number of common attributes with the filter creator. The selected users store filters in order to block spam or relay regular packets. Furthermore, we develop a cryptographic filtering scheme that does not disclose the creator's private information or interests. In addition, we propose a filter update mechanism that allows users to update their distributed filters in time. The security analysis demonstrates that the SAFE can protect a user's private information from disclosure through its filters and can resist the filter forgery attack. Through extensive trace-driven simulations, we show that the SAFE filters spam packets effectively and efficiently in terms of delivery ratio, average delay, and communication overhead.

I. INTRODUCTION

Mobile Social Networks (MSNs) [1] are emerging social networking platforms that enable information sharing and social interaction among surrounding neighbors via the Bluetooth or WiFi modules of their smartphones.
MSN applications are part of our daily life: they help people discover friends, exchange traffic, shopping, or health information, and even share image or video files with each other. At the same time, many stores and service providers broadcast their service information, such as advertisements or flyers, most of which users may regard as spam. For example, Alice is looking for cosmetics on sale in a business street. She only needs the service information of cosmetic stores, not that of restaurants or groceries. Intuitively, Alice can send filters that encode her requirements to other people, who then select the useful service information and block the useless advertisements. But the question is: who can help Alice? Moreover, at lunch time she requires information about nearby restaurants instead. Since Alice's requirements vary with time, a second question arises: how can Alice update her previous filters in a timely fashion? Therefore, an effective spam packet filtering protocol with an update mechanism is essential to block spam and reduce communication costs.

In the last decade, researchers have developed many spam filtering protocols. Some filtering protocols use a blacklist [2], a whitelist, or a graph [3] to identify spammers or admit legitimate senders. Other works focus on content filtering, either by using a keyword list [4] to match spam packets or through machine learning, such as Bayesian approaches [5], to detect spam probabilistically. Recently, social characteristics have been introduced to enhance filtering effectiveness. Li et al. [6] introduce a social network based filter framework (SOAP) that detects spam emails by employing social closeness, social (dis)interest, and trust management. They collect and utilize social information to distinguish regular emails from spam emails. Sirivianos et al.
[7] also exploit social trust to achieve collaborative spam filtering. The trustworthiness of spam reporters can be used to resist the Sybil attack and to gather correct spam reports. However, these protocols [6], [7] rely on an online trusted third party or on Bayesian approaches [5], which require a large amount of history logs. As a result, most of the existing filtering protocols are inefficient and impractical when applied in MSNs. Recently, Lu et al. [4] proposed PReFilter, a decentralized keyword based filtering protocol for Delay Tolerant Networks (DTNs). PReFilter lets relay users store the filters and detect spam packets before they are transmitted to the destinations. At the same time, the encrypted keyword filters preserve user privacy. Nevertheless, PReFilter neither efficiently distributes filters nor dynamically updates the distributed filters.

Security concerns also raise many challenges when designing protocols for MSNs. For example, if the filters distributed by Alice include sensitive or private information, such as preferences or health conditions, other people could easily obtain them and violate Alice's privacy, and Alice might suffer a serious loss. In addition, some malicious users might try to forge and redistribute Alice's filters in order to block regular packets and let spam packets diffuse, which produces extra communication cost and degrades the network performance. The existing filtering protocols, including PReFilter, have not addressed filter forgery attacks. These challenging issues motivate us to further improve filtering effectiveness and enhance the security level.

In this paper, we propose an effective social based updatable filtering protocol (SAFE) with privacy preservation in MSNs. The SAFE is characterized by social based filter distribution, private keyword filtering, and efficient filter update. Specifically, the major contributions of this paper are twofold. Firstly, we exploit the Merkle tree and develop a filter authentication scheme to resist the filter forgery attack. Furthermore, based on social relationships, such as common attributes, filters are purposefully distributed, which reduces the filter transmission overhead. We then propose a Merkle tree based filter update mechanism in the SAFE to cope with the variation of a user's filtering requirements. Secondly, we evaluate the performance of the proposed SAFE protocol through extensive trace-driven simulations. Our simulation results validate that the SAFE reduces the communication and storage overheads and guarantees regular packet delivery with a reasonable delay. In addition, the security analysis demonstrates that the SAFE can preserve a user's privacy against inside curious attackers and resist the filter forgery attack.

The remainder of this paper is organized as follows. The network model and design goals are presented in Section II. In Section III, we present the details of the SAFE, followed by the security analysis and the performance evaluation in Sections IV and V, respectively. Finally, Section VI closes the paper with concluding remarks.

II. PROBLEM DEFINITION

In this section, we formulate the network model and identify our design goals to improve filtering efficiency and satisfy the security requirements.

A. Network model

We consider a homogeneous MSN consisting of a trust authority (TA) and N mobile users. The details of these components are presented as follows.

Trust Authority (TA) is a trustable, powerful, and storage-rich entity that bootstraps the entire network in the initialization phase. Afterwards, TA is not involved in communication and filtering. When bootstrapping, TA generates secret master keys, which each legitimate user later uses to produce session keys. In addition, TA issues legal certificates to legitimate users after their registration.
During operation, TA receives attack reports from mobile users when they detect malicious users.

Mobile users are denoted by $U = \{u_1, u_2, \ldots, u_N\}$. Each mobile user is equipped with a portable communication and storage device. We assume that these devices communicate bi-directionally with each other within the same communication range. Due to practical constraints, the power and storage capacity of such devices are limited. An individual mobile user should first register with TA to obtain its profile and key information. For example, each legitimate user obtains a unique identity, certificates, and key materials, which should be securely kept and used in each session to generate session keys. In the packet transmission and filtering phases, mobile users should be able not only to produce their own identity and filter signatures but also to verify other users' identities.

B. Threat model

Malicious users exist in the whole network and get involved in both the packet delivery and spam filtering phases. We address two types of attacks: the inside curious attack (ICA) and the outside forgery attack (OFA). Specifically, some filter holders are curious about other users' personal interests or profiles. Their goal is to illegally obtain other users' private or sensitive information. We focus on the privacy issues during the filter storage, packet delivery, and filtering phases. On the other hand, some outside adversaries are not able to obtain other users' profiles and filters, but they are likely to forge filters to block regular packets or to let spam packets diffuse in MSNs, which incurs a large communication overhead.

C. Design goals

Our design goal is to develop an effective and updatable filtering protocol in MSNs. User privacy should be preserved at the same time.
1) Efficiency goals: Due to the intermittent end-to-end connections and constrained resources, we aim to design an effective and efficient filtering scheme to block spam packets in MSNs. The proposed protocol should block such spam packets without incurring much extra communication, storage, and computing overhead. Furthermore, useful packets should be neither filtered nor delayed during transmission. In addition, the distributed filters should be as fresh as possible and efficiently updated if the filter creator changes the former filters.

2) Security goals: Our security goal is to protect against ICA and OFA. On one hand, the proposed filtering protocol should prevent the filter creator's private information from being disclosed. The filters cannot appear in plaintext when distributed to other users, and during filtering the keyword should be securely kept in ciphertext. On the other hand, a filter authentication mechanism should be established that enables users to verify every filter's validity, so that any forged filter can be detected by mobile users.

III. PROPOSED SAFE PROTOCOL

In this section, we present the details of our proposed SAFE protocol. Firstly, we propose a privacy-preserving filter authentication scheme, which protects filters from modification. Furthermore, we explore social relationships to efficiently distribute filters. We then present an effective filter update mechanism.

A. Filter Authentication Scheme

To authenticate a user's filter, we take advantage of the Merkle tree [8], a chain of cryptographic Hash values. In this section, we elaborate the procedure of constructing a filter Hash tree and propose a Hash tree based filter authentication scheme.

A Merkle tree is a binary tree with $2^N$ leaf nodes, where $N$ is the depth of the tree. Any parent node $h_{i-j} = H(h_i \| h_j)$ is the one-way Hash value of its two children nodes. For example, in Fig. 1, given the leaf nodes $h_1$ and $h_2$, their parent node is $h_{1-2} = H(h_1 \| h_2)$. Then, $h_{1-2}$ and $h_{3-4}$ are concatenated to obtain $h_{1-4}$. Similarly, the root node is $h_{1-8} = H(h_{1-4} \| h_{5-8})$. Finally, the path from the leaf node $h_1$ to the root $h_{1-8}$ is stored as $PH_1 = \{h_2, h_{3-4}, h_{5-8}\}$.

Fig. 1. Hash tree based filter authentication.

An individual user $u_i$ can establish its own keyword list $W_{u_i} = \{W_{u_i,1}, \ldots, W_{u_i,K}\}$, where $W_{u_i,k}$ ($1 \le k \le K$) is a keyword selected by $u_i$. These keywords are set as the leaf nodes of the filter tree $FR_{u_i}$. During the authentication, the path information $PH_k$ is used as the certificate for each independent keyword. Other users check whether the concatenated hash value computed along $PH_k$ is equal to the root $R_{u_i}$. If so, the keyword is valid; otherwise, it is forged. Finally, the value $H(R_{u_i} \| ID_{u_i})$ is set as the creator $u_i$'s certificate. As a result, the identifier (root value) and the path from the root to every leaf node are verifiable. For a Hash tree consisting of $2^N$ leaf nodes, $N$ Hash operations are required for each independent verification. In addition, the signature size is $N \times L$, where $L$ is the length of a Hash value.

B. Social Based Private Filtering

As discussed above, not only do spam packets consume communication and storage resources, but the filters themselves also incur a large transmission and storage overhead. In this section, we answer the question: how can the filters be distributed efficiently while preserving user privacy?

Initialization: When bootstrapping, TA assigns key materials to each user. Let $G_1$ and $G_2$ be two cyclic groups with the same order $q$, and let $P$ be a generator of $G_1$. Suppose there exists a bilinear pairing [9], [10] between $G_1$ and $G_2$ that can be efficiently calculated, $e: G_1 \times G_1 \to G_2$, such that $e(aP, bP) = e(P, P)^{ab}$ for random numbers $a, b \in Z_q$ and $P \in G_1$.
A bilinear parameter generator $\mathcal{G}$ is a probabilistic algorithm [11] that takes a security parameter $K$ as input and outputs a tuple $(q, G_1, G_2, e, P, H)$, where $q$ is a large prime.

Filter generation: The creator $u_i$ first selects a random number $x_i \in Z_q$ and calculates its public key as $PK_i = x_i P$. The private key $SK_i$ is $x_i$. Given a keyword $W_{u_i,k}$, its filter is $F_{u_i,k} = \langle W_i, \lambda \rangle$, where

$$W_i = \frac{H(W_{u_i,k})}{x_i + H(W_{u_i,k})} P, \qquad \lambda = e(PK_i, P).$$

Filter distribution: When $u_i$ meets another user $u_j$, they authenticate each other and privately match their profiles to determine the number of common attributes. If they have more than TH common attributes, where TH is a preset threshold, $u_i$ sends the filter $F_i$ to $u_j$.

Algorithm 1 Social based private filtering
Procedure: Social based private filtering
  $u_s$ sends a packet with keyword $W_x$ to $u_i$
  $u_s$ and $u_j$ encounter each other
  if $u_j$ and $u_i$ have more than TH common attributes then
    $u_j$ checks whether the packet carries a valid keyword
    if $u_j$ keeps $u_i$'s filter then
      $u_s$ sends $\Lambda_s = \lambda' + PK_i$ to $u_j$
      $u_j$ calculates $e(\Lambda_s, W_i)$
      if $e(\Lambda_s, W_i) = \lambda$ then
        $u_s$ duplicates the packet to $u_j$
      else
        $u_j$ blocks this packet
      end if
    else
      $u_s$ duplicates the packet to $u_j$
    end if
  end if
  return $VSS = \{VSS_1, VSS_2, \ldots, VSS_M\}$
end procedure

Algorithm 2 Filter update
Procedure: Filter update
  $u_i$ changes its own keyword $W_{u_i,k}$ and constructs a new filter tree $FR'_{u_i}$ with the root node $R'_{u_i}$
  if the encountered user $u_j$ is keeping $u_i$'s keyword $W_{u_i,k}$ then
    $u_j$ sends $R_{u_i}$ to $u_i$ for authentication
    if $R_{u_i}$ is valid then
      if $R_{u_i} \ne R'_{u_i}$ then
        $u_i$ duplicates $FR'_{u_i}$ to $u_j$
        $u_j$ updates the kept filter of $u_i$ to $FR'_{u_i}$
      end if
    else
      $u_j$ forged $u_i$'s filter
    end if
  end if
end procedure

Filtering: When a sender $u_s$ would like to send a packet with keyword $W_x$ and meets $u_j$, $u_j$ helps $u_i$ to detect whether this packet should be blocked or not.
$u_s$ first sends $\Lambda_s = \lambda' + PK_i$ to $u_j$, where $\lambda'$ is computed from the hashed keyword $H(W_x)$ and the generator $P$. Upon receiving $\Lambda_s$, $u_j$ checks whether $e(\Lambda_s, W_i) = \lambda$. If the equation holds, the keyword $W_x$ passes the filter check and the packet can be forwarded by $u_j$; otherwise, the packet is blocked. The details of the SAFE are illustrated in Algorithm 1.

C. Efficient Filter Update

Users might change their former filters. In that case, the distributed filters should be updated as quickly as possible in order to correctly block or relay the incoming packets. In this section, we answer the question: how can the filters be updated quickly?

According to the properties of the Merkle tree, the root changes if any leaf node varies. As a result, we do not need to check every keyword one by one. The creator $u_i$ checks the root $R_{u_i}$ of its filter tree $FR_{u_i}$ as stored by the filter holder $u_j$. If the root is different, $u_i$ sends the updated filter tree $FR'_{u_i}$ to $u_j$, as illustrated in Algorithm 2. Therefore, a filter update requires a single root comparison rather than a search over all keywords, which improves the searching efficiency in the filter update phase.
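The root comparison behind the filter update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: SHA-256 stands in for the unspecified Hash function $H$, leaves are plain hashed keywords rather than encrypted filters, the number of keywords is assumed to be a power of two, and the names `merkle_root` and `needs_update` are hypothetical. The validity check of the root in Algorithm 2 is omitted.

```python
import hashlib


def h(data: bytes) -> bytes:
    # One-way Hash H; SHA-256 is an assumption, the paper does not name a function.
    return hashlib.sha256(data).digest()


def merkle_root(keywords):
    """Root of a filter Hash tree whose leaves are hashed keywords (2^N leaves assumed)."""
    n = len(keywords)
    if n == 0 or n & (n - 1):
        raise ValueError("expected a power-of-two number of keywords")
    level = [h(w.encode("utf-8")) for w in keywords]
    while len(level) > 1:
        # Each parent is H(left || right).
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def needs_update(holder_root, creator_keywords):
    """Creator-side check: one root comparison instead of a keyword-by-keyword scan."""
    return holder_root != merkle_root(creator_keywords)


old_keywords = ["cosmetics", "perfume", "skincare", "makeup"]
new_keywords = ["cosmetics", "perfume", "skincare", "restaurant"]
stored_root = merkle_root(old_keywords)  # root kept by the filter holder u_j

print(needs_update(stored_root, old_keywords))  # False: the filter is still fresh
print(needs_update(stored_root, new_keywords))  # True: u_i sends the new tree
```

Changing a single leaf changes the root, so the holder's copy is detected as stale without comparing individual keywords.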

IV. SECURITY ANALYSIS

In this section, we discuss the privacy and security properties of our proposed SAFE protocol. We focus on the two types of attacks discussed in Section II.

Inside Curious Attack: To protect against ICA, no filter may be duplicated to others in plaintext. To achieve the privacy-preserving goal, the SAFE encrypts a user's filters based on the bilinear pairing. The relay user $u_j$ can effectively check whether a keyword exists in $u_i$'s filter without learning any of $u_i$'s information. We have

$$\lambda = e(PK_i, P) = e(x_i P, P) = e(P, P)^{x_i},$$

and, expanding the pairing by bilinearity,

$$e(\Lambda_s, W_i) = e(\lambda' + PK_i, W_i) = \begin{cases} e(P, P)^{x_i} = \lambda, & \text{if } W_{u_i,k} = W_x; \\ \text{a value different from } \lambda, & \text{otherwise.} \end{cases}$$

A packet with a valid keyword is forwarded by $u_j$, while others are blocked. Due to the properties of elliptic curve groups and the bilinear pairing, it is infeasible to calculate $H(W_{u_i,k})$ from $W_i = \frac{H(W_{u_i,k})}{x_i + H(W_{u_i,k})} P$. Therefore, the keyword is securely kept, so that the creator $u_i$'s sensitive and private information is preserved.

Outside Forgery Attack: The SAFE resists the filter forgery attack by using the Merkle tree based filter authentication. The root value of each Merkle tree is the concatenated hash value of the nodes on a specific path, and it is bound to the unique certificate $H(R_{u_i} \| ID_{u_i})$ created by the filter creator $u_i$. Any user can verify this root value by using the creator's public key. As a result, each leaf node, which is an independent keyword, is uniquely defined and verifiable with its hash tree path information. Once the creator $u_i$ changes the former filters, the certificate is updated to $H(R'_{u_i} \| ID_{u_i})$. Until the update reaches each of $u_i$'s friends, the former certificate remains valid. The resilience to OFA is based on the security level of the hash function used to construct the Merkle tree.
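The path check that this forgery resistance relies on can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the scheme as implemented by the authors: SHA-256 is assumed for $H$, the leaves are hashes of placeholder keyword strings, the tree has a power-of-two number of leaves, and `build_levels`, `auth_path`, `verify`, and `certificate` are hypothetical helper names. How the certificate is bound to the creator's public key is not modeled.

```python
import hashlib


def h(data: bytes) -> bytes:
    # One-way Hash H; SHA-256 is an assumption.
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """All tree levels, from the leaves (index 0) up to the root (last)."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def auth_path(levels, index):
    """Sibling hashes from leaf `index` up to the root, each with a left/right flag."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the other child of the same parent
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(leaf, path, root):
    """Recompute the root from a leaf and its path: N hashes for 2^N leaves."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


def certificate(root, creator_id):
    """Creator certificate H(R || ID)."""
    return h(root + creator_id.encode("utf-8"))


leaves = [h(f"keyword-{k}".encode("utf-8")) for k in range(1, 9)]  # h_1 .. h_8
levels = build_levels(leaves)
root = levels[-1][0]
path = auth_path(levels, 0)  # path for h_1: h_2, h_{3-4}, h_{5-8}

print(len(path))                                 # 3
print(verify(leaves[0], path, root))             # True
print(verify(h(b"forged keyword"), path, root))  # False
```

A forged leaf fails the check unless the adversary finds a hash collision, which is why the resilience reduces to the security of the hash function.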
From the above security analysis, the SAFE protects user privacy from being disclosed to or eavesdropped on by inside curious attackers or outside attackers. The Merkle tree based authentication scheme resists the outside forgery attack with tolerable communication and computing overheads. Note that we do not consider the privacy issues during profile matching [12]. In addition, TA operates in an off-line manner and generates the essential cryptographic master keys in the initialization phase. Therefore, the SAFE can securely operate in a decentralized manner.

V. PERFORMANCE EVALUATION

To evaluate the effectiveness and efficiency of the SAFE, we simulate it on the Infocom06 trace [13].

A. Simulation Setup

The Infocom06 trace [13] contains 78 mobile users at a conference over four days. Each mobile user is equipped with a dedicated Bluetooth device, which can detect the Bluetooth devices appearing in its proximity. As a result, the mobility and contacts of these mobile users are recorded in the log. We collect 28,979 useful contacts and divide them into two portions: the first one third of the data set serves as a training set that produces users' attributes, and the remaining data serve as the experiment set used for the simulation. Then, we utilize maximal cliques to define attributes. We select attributes each of which contains more users and for which the sum of all the edges is large enough. We simulate the SAFE with these attributes in the later simulation. Each of these selected attributes consists of at least 28 users, while each user participates in 38 communities on average.

B. Simulation results

We evaluate the performance of our proposed SAFE protocol in comparison with the PReFilter and Epidemic protocols. Each mobile user generates 78 packets with different keywords according to their attributes. As shown in Fig. 2(a) and 2(b), the SAFE achieves a higher delivery ratio with a higher delay compared with PReFilter.
The Epidemic protocol, where a user distributes filters to every user it encounters, gains the highest delivery ratio and the lowest delay, but it consumes too much communication and storage overhead to be operated in the real world. With different values of TH, the threshold on the number of common attributes that both users have, the delivery ratio and average delay of the SAFE do not change at all. The reason is that the SAFE forwards packets according to the common attributes of the destination. A change of TH only affects the number of distributed filters rather than the filters themselves. As a result, useful packets are never filtered.

In Fig. 2(c), we can observe that the SAFE with TH = 2 blocks more spam packets, while the SAFE with TH = blocks far fewer spam packets. As shown in Fig. 2(d), the SAFE with TH = dramatically reduces the communication costs. Even though the SAFE with TH = 2 blocks more spam packets as shown in Fig. 2(c), it still leads to a large number of copies. Since the number of distributed filters decreases with TH = 2, more mobile users who do not hold the relevant filter generate more spam packets. Furthermore, from Fig. 2(c) and 2(d), the SAFE with TH = trades off the number of blocked spam packets against the number of copies compared with the other protocols.

In Fig. 2(e), the number of distributed filters decreases with the growth of the threshold TH, since a smaller TH qualifies more users to hold filters. The PReFilter and Epidemic filtering schemes (denoted as PF and Ep in Fig. 2(e), respectively) distribute more filters to mobile users. This is because the SAFE purposely distributes the filters only to the users having some common attributes with the destinations. As shown in Fig. 2(f), a higher TH results in more copies during transmission. Since a higher threshold limits the amount of distributed filters, the fewer filters cannot resist the flooding of spam packets.

Fig. 2. Performance comparison: (a) Delivery ratio; (b) Average delay; (c) Number of the blocked spam packets; (d) Number of copies; (e) Number of filters vs. TH; (f) Number of copies vs. TH; (g) Number of blocked packets vs. TH; (h) Update comparison (number of search operations vs. number of filters, for binary search and the SAFE).

In Fig. 2(g), the growth of TH at first results in more blocked spam packets. When TH is small, many users hold filters, so they do not produce spam packets at all. In that case, spam packets are blocked at the production phase. When TH increases, fewer users keep these filters. As a result, the amount of produced spam packets increases, and the distributed filters can block more spam packets. With the continuous increase of TH, even fewer users hold filters. Not only does the amount of produced spam packets increase, but the filtering capability also degrades. This is why the further growth of TH causes a decrease of the blocked spam packets when TH is greater than 4. Therefore, the SAFE with TH = achieves the trade-off between the amount of distributed filters and the number of copies, and effectively blocks spam packets.

In Fig. 2(h), the SAFE performs fewer search operations when updating filters, since the only operation is to check the root of the Merkle tree. Other algorithms, such as binary search, perform considerably more operations as the number of distributed filters increases.
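The threshold rule of the filter distribution step, whose effect Fig. 2(e) to 2(g) measure, can be sketched as follows. This is a toy illustration with made-up attribute sets, not the trace-driven setup of the paper; the function names `common_attributes` and `select_holders` are hypothetical, and the attributes are compared in the clear, whereas the protocol assumes a private profile matching step.

```python
def common_attributes(a, b):
    """Number of attributes two users share (computed in the clear here)."""
    return len(set(a) & set(b))


def select_holders(creator_attrs, users, th):
    """Names of users who share more than `th` attributes with the creator."""
    return [name for name, attrs in users.items()
            if common_attributes(creator_attrs, attrs) > th]


# Hypothetical attribute sets, for illustration only.
creator = {"music", "hiking", "photography", "cooking"}
users = {
    "u1": {"music", "hiking", "photography", "cooking"},
    "u2": {"music", "hiking", "chess"},
    "u3": {"music", "running"},
    "u4": {"painting"},
}

for th in range(0, 5):
    # A larger threshold qualifies fewer users to hold the creator's filter.
    print(th, select_holders(creator, users, th))
```

With these sets, the holder list shrinks from three users at a threshold of 0 to none at a threshold of 4, which mirrors the qualitative trend that a higher TH limits the amount of distributed filters.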
Therefore, the effectiveness and efficiency of the SAFE are demonstrated by the above simulation results.

VI. CONCLUSION

In this paper, we have proposed a social based updatable filtering protocol (SAFE) with privacy preservation in MSNs. Firstly, by analyzing the social relationships among users, we have introduced common attributes to distribute filters effectively and efficiently. Based on the Merkle tree, the SAFE updates the filters to follow the varying requirements of users. Furthermore, the security analysis demonstrates that a user's private information embedded in the filters can be protected, while the extensive simulation results show that the SAFE can significantly reduce communication and storage overhead with high efficiency and low delay. To the best of our knowledge, this paper is the first work addressing filter update in MSNs. In our future work, we intend to explore adaptive filter update to further improve the update efficiency.

REFERENCES

[1] X. Liang, X. Li, R. Lu, X. Lin, and S. Shen, "SEER: A secure and efficient service review system for service-oriented mobile social networks," in Proc. of IEEE ICDCS, 2012, pp. 647-656.
[2] F. Soldo, A. Le, and A. Markopoulou, "Blacklisting recommendation system: Using spatio-temporal patterns to predict future attacks," IEEE JSAC, vol. 29, no. 7, pp. 1423-1437, 2011.
[3] A. Ramachandran and N. Feamster, "Understanding the network-level behavior of spammers," in Proc. of ACM SIGCOMM, 2006, pp. 291-302.
[4] R. Lu, X. Lin, T. H. Luan, X. Liang, X. Li, L. Chen, and X. Shen, "PReFilter: An efficient privacy-preserving relay filtering scheme for delay tolerant networks," in Proc. of IEEE INFOCOM, 2012, pp. 1395-1403.
[5] B. Agrawal, N. Kumar, and M. Molle, "Controlling spam emails at the routers," in Proc. of IEEE ICC, 2005, pp. 1588-1592.
[6] Z. Li and H. Shen, "SOAP: A social network aided personalized and effective spam filter to clean your e-mail box," in Proc. of IEEE INFOCOM, 2011, pp. 1835-1843.
[7] M. Sirivianos, K. Kim, and X. Yang, "SocialFilter: Introducing social trust to collaborative spam mitigation," in Proc. of IEEE INFOCOM, 2011, pp. 2300-2308.
[8] R. Merkle, "Protocols for public key cryptosystems," in Proc. of IEEE Symposium on Security and Privacy, Apr. 1980, pp. 122-134.
[9] D. Boneh and M. Franklin, "Identity-based encryption from the Weil pairing," SICOMP: SIAM Journal on Computing, vol. 32, 2003.
[10] R. Lu, X. Lin, and X. Shen, "SPRING: A social-based privacy-preserving packet forwarding protocol for vehicular delay tolerant networks," in Proc. of IEEE INFOCOM, 2010, pp. 632-640.
[11] F. Zhang, R. Safavi-Naini, and W. Susilo, "An efficient signature scheme from bilinear pairings and its applications," in Proc. of Public Key Cryptography, 2004, pp. 277-290.
[12] X. Liang, X. Li, R. Lu, X. Lin, and X. Shen, "Fine-grained identification with real-time fairness in mobile social networks," in Proc. of IEEE ICC, 2011, pp. 1-5.
[13] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau, "CRAWDAD trace cambridge/haggle/imote/infocom (v. 2006-01-31)."