Secure Group Oriented Data Access Model with Keyword Search Property in Cloud Computing Environment

Chih Hung Wang
Computer Science and Information Engineering, National Chiayi University, Chiayi City 60004, Taiwan (R.O.C.)
wangch@mail.ncyu.edu.tw

Chia-Chun Hsu
Computer Science and Information Engineering, National Chiayi University, Chiayi City 60004, Taiwan (R.O.C.)
s1000422@mail.ncyu.edu.tw

Abstract: Cloud computing is the product of combining web technology, grid computing and virtualization. It has become one of the most important issues in information technology in recent years. Many applications have been proposed and discussed for the cloud computing environment because it brings benefits such as reducing the cost of maintaining enterprise data centers, lowering data management cost, and allowing data to be retrieved at any time. As more sensitive and personal data are shared and stored on cloud servers, the question of trust in cloud computing has been widely discussed. One way to alleviate these security concerns is to store data in encrypted form; the drawback is that encryption limits the usability of the data. In this paper, we construct a group oriented data access model with keyword search that treats multiple users as one sharing group, which reduces duplication of the shared file. To retrieve a file, a user must belong to the authorized sharing group and provide the correct keywords. The proposed scheme is space-efficient in key storage and computationally efficient in data retrieval.

Keywords: Cloud computing, keyword search encryption, hidden vector encryption, broadcast encryption, data access control.

I. INTRODUCTION

Cloud computing is the product of combining web technology, grid computing and virtualization. It has become one of the most important issues in information technology in recent years.
The cloud computing server provides computing infrastructure and outsourced storage space as services over the Internet. As much research on Service-Oriented Architecture (SOA) techniques has been proposed, cloud computing services have become more powerful and popular. The growth of public clouds and third-party databases has been dramatic in recent years, which indicates a trend toward outsourcing user data and sharing it through the Internet. Many Information Technology (IT) enterprises use cloud computing services to reduce the cost of maintaining their own data centers and to take advantage of low data management
cost. Thus, more and more data are shared and stored by third parties on the Internet. Such data usually include sensitive and personal information; however, third parties cannot be fully trusted to protect it. In other words, a dishonest third-party server may maliciously collect user-specific data to obtain individual records. Storing sensitive and personal data in encrypted form keeps the data confidential and prevents misuse by the server. Even if the data are stolen by third parties or attackers, the sensitive secrets are not revealed. Only an authorized user can read the data in plaintext form, so the amount of information loss is limited.

Encryption, however, limits the usability of the stored data. The method that addresses this problem is called keyword search encryption, and it has become an especially challenging issue for database systems with encrypted data. In a keyword search encryption system, the data owner encrypts the data and labels the encrypted file with sets of descriptive attributes; only an authorized user can query the specified ciphertext with particular keywords and recover the plaintext. Previous work on keyword search encryption focused on an individual who makes a query and obtains the decrypted file. In practice, however, commercial applications usually share encrypted files in the cloud with a group of users. Therefore, how to make keyword search encryption more efficient under multi-user access is the major issue of this paper.

We construct a group oriented data access model with the keyword search property. Data are shared with multiple users without enlarging the key size or the public information, which reduces the cost of key distribution and management and allows data to be retrieved efficiently. For example, in an electronic healthcare record system, a patient's medical record is stored in the cloud and is encrypted under several attributes for security and privacy.
The patient's record can then be shared with other parties (e.g., when the patient is transferred to a different hospital for treatment). Treating the sharing parties as a group reduces duplication of the ciphertext of the shared record and provides group oriented access control. In these cases, the patient's record can be retrieved only by an authorized group member who supplies the correct keywords. The proposed scheme builds ciphertext-policy keyword search encryption for group oriented data access: a file is shared with multiple users as a group, and it can be retrieved only by users who belong to the authorized sharing group and provide the correct keywords.

II. RELATED WORK

A. Broadcast encryption

Broadcast encryption allows a user to distribute a message securely to a set or group of users in an insecure environment. In 2000, Naor and Pinkas [9] used the threshold secret sharing method to build the first Public Key Broadcast Encryption (PKBE) scheme. The general PKBE method employs the users' personal key pairs to generate the group key for encrypting the shared data. One problem in broadcast encryption is how to determine the receiver group when the receivers are stateless. Naor et al. [8] presented an efficient method for solving this problem. Another problem in broadcast encryption is how to reduce the number of ciphertexts to make the system more scalable. This problem was solved by the scheme of Boneh et al. [5], which achieves constant-size ciphertext. After that, many papers discussed how to make the broadcast of encrypted messages more efficient.
Boneh et al. [4] presented a public key broadcast encryption scheme that requires only constant-size broadcast ciphertext and constant-size user storage for stateless receivers. We apply [11] in our proposed scheme. The method of [11] is the latest approach to broadcast encryption and has the advantage of minimizing the ciphertext size. Moreover, it is more efficient than previous schemes in terms of both transmission and data storage cost.

B. Keyword search encryption

In 2004, Boneh et al. [2] proposed the Public Key Encryption with Keyword Search (PEKS) scheme. In their scheme, the data owner uses the receiver's public key to encrypt the data and the related attributes of the data. Only the data receiver can then use his own private key to create the correct query token (also called a "trapdoor") to query and decrypt the specific data. PEKS is the most primitive and simplest form of keyword search encryption for retrieving encrypted data, and it suffices only for a one-to-one association between a particular user and a file sharer. After that, Hwang and Lee [7] proposed a scheme that allows multiple users to query the same file, but in their scheme the ciphertext size grows with the number of sharing users. Moreover, Baek et al. [1] provided a method for revisiting attributes that have already been encrypted with the data.

In 2005, Sahai and Waters [12] proposed a novel fuzzy identity-based encryption (also called "attribute-based encryption") in which the user identity is regarded as a set of descriptive attributes. It improves on keyword search encryption by defining a group of attributes that sets the decryption condition, so that a user who matches the attributes can retrieve the file. Goyal et al. [6] proposed key-policy attribute-based encryption, which provides an access control mechanism on encrypted data.

C. Hidden vector encryption

Hidden vector encryption is a more sophisticated form of predicate encryption that is suitable for search and fine-grained access control on encrypted data. In 2007, Boneh and Waters [3] proposed the first hidden vector encryption scheme, based on anonymous identity-based encryption. A hidden vector encryption system provides an equality predicate: the attributes of the encrypted file and the query token are converted into an attribute vector and a predicate vector respectively, which protects the privacy of the query. Earlier hidden vector encryption schemes such as [3] used composite-order groups, which results in poor computational performance. Our proposed scheme is based on Park's paper [10], which uses prime-order groups to achieve efficient hidden vector encryption. The token size and the pairing cost of creating a query token both scale with the number of keywords the user specifies in the query phase.

III. THE PROPOSED SCHEME FOR GROUP-ORIENTED DATA ACCESS MODEL WITH KEYWORD SEARCH

The proposed scheme integrates the hidden vector encryption of [10] with the broadcast encryption of [11]. Readers can refer to these papers for the detailed mathematical structures. The system is assumed to be composed of the following parties: a data owner, a group of users who can be the data receivers, the key distribution center and the cloud server. Each user in our system needs to register at the key distribution center to get his own key pair. The data owner uses public information to create temporary keys to
encrypt the shared file, and uploads the ciphertext, the header and the identities of the receiving group to the cloud server. The temporary keys are derived by applying hash functions to a common share key, which effectively reduces the storage space needed for user private keys. The data receiver uses his own key pair and the header to create the query token and then transmits it to the cloud server for decryption. The details of the construction of our scheme are listed below (also see Fig. 1).

Figure 1. The concept of the proposed scheme
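
As a rough illustration of the key-derivation step described above, the following Python sketch derives two temporary keys from a common share key K. The paper does not name the hash functions, so the use of HMAC-SHA256, the labels "message" and "attributes", and the function names are all assumptions made for illustration. In the actual scheme K is recovered from the group header and the user's key pair through the broadcast encryption of [11], which is not modeled here.

```python
import hashlib
import hmac


def derive_temp_key(common_share_key: bytes, label: bytes) -> bytes:
    """Derive one temporary key from K (hypothetical helper).

    HMAC-SHA256 keyed with K stands in for the unspecified hash function;
    the label separates keys used for different purposes.
    """
    return hmac.new(common_share_key, label, hashlib.sha256).digest()


def derive_temp_keys(common_share_key: bytes,
                     labels=(b"message", b"attributes")) -> dict:
    """Derive one temporary key per label from the same K.

    The labels are assumptions: one key for encrypting the message and
    one for encrypting its attributes.
    """
    return {label.decode(): derive_temp_key(common_share_key, label)
            for label in labels}


if __name__ == "__main__":
    # Placeholder value; a real K would come from the group header.
    K = bytes(range(32))
    for name, key in derive_temp_keys(K).items():
        print(name, key.hex())
```

Because every temporary key is recomputed from K on demand, a user in this sketch stores nothing beyond what is needed to recover K, which matches the storage argument made in the text.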
IV. ANALYSIS AND DISCUSSION

Our scheme combines the methods of [10] and [11] to obtain a solution for group access to keyword search encryption. Both [10] and [11] rely on bilinear-map complexity assumptions and extended forms of those assumptions. Naturally, our scheme can achieve the level of security that [10] and [11] provide. Readers can refer to those papers for details of the security proofs. Although our proposed scheme can achieve strong security in data protection, the privacy of the common share key K and the security of the hash functions are critical to the enciphering mechanism for both messages and attributes.

In our scheme, the user only needs to store keys whose size is two elements of G. This is smaller than in previous keyword search schemes. We then use the public parameters of the key distribution center to generate the temporary encryption keys from the common share key K. A user can derive K if he obtains the group header CGH and belongs to the authorized user group. As a result, in
our system, we only need four elements of G (a user key pair in G^2 and CGH in G^2) to carry out the encryption and query processes. Both the query token size and the pairing computation cost of creating a token are constant in our scheme. Compared with the scheme in [7], our scheme is more efficient, since in [7] the size of the ciphertext grows with the number of receiving users. Although we sacrifice some security, in that the common share key K becomes a critical secret for the whole system, we achieve space efficiency in key management and preserve the advantages of [10]. Table I gives the performance cost of our scheme.

TABLE I. PERFORMANCE COST

V. CONCLUSIONS

We presented an efficient group oriented data access model in keyword search encryption. The scheme is particularly suitable for the cloud computing environment since it provides fast queries on encrypted files. The proposed scheme applies the concept of broadcast encryption to extend keyword search encryption so that it is efficient for access by multiple receivers. The size of the user private key storage does not grow, since the temporary keys are derived by applying hash functions. Designing a concrete application for the model is our future work, and we plan to combine user access structures and file attributes to develop a complete solution for securely retrieving private data from the cloud server.

ACKNOWLEDGMENT

This work was supported in part by the National Science Council under Grant NSC 100-2219-E-415-001.

REFERENCES

1. J. Baek, R. Safavi-Naini, and W. Susilo, "Public key encryption with keyword search revisited," in Computational Science and Its Applications - ICCSA 2008, LNCS, vol. 5072, pp. 1249-1259, 2008.
2. D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, "Public key encryption with keyword search," in Eurocrypt 2004, LNCS, vol. 3027, pp. 506-522, 2004.
3. D. Boneh and B. Waters, "Conjunctive, subset, and range queries on encrypted data," in Proceedings of TCC '07, LNCS, vol. 4392, pp. 535-554, 2007.
4. D. Boneh, C. Gentry, and B. Waters, "Collusion resistant broadcast encryption with short ciphertexts and private keys," in Crypto '05, LNCS, vol. 3621, pp. 258-275, 2005.
5. D. Boneh, X. Boyen, and E. J. Goh, "Hierarchical identity based encryption with constant size ciphertext," in Advances in Cryptology - Eurocrypt '05, LNCS, vol. 3494, pp. 440-456, 2005.
6. V. Goyal, O. Pandey, A. Sahai, and B. Waters, "Attribute-based encryption for fine-grained access control of encrypted data," in CCS '06: Proceedings of the 13th ACM Conference on Computer and Communications Security, pp. 89-98, 2006.
7. Y. H. Hwang and P. J. Lee, "Public key encryption with conjunctive keyword search and its extension to a multi-user system," in Pairing-Based Cryptography - PAIRING 2007, LNCS, vol. 4575, pp. 2-22, 2007.
8. D. Naor, M. Naor, and J. Lotspiech, "Revocation and tracing schemes for stateless receivers," in Advances in Cryptology - Crypto '01, LNCS, vol. 2139, pp. 41-62, 2001.
9. M. Naor and B. Pinkas, "Efficient trace and revoke schemes," in Proceedings of the 4th International Conference on Financial Cryptography (FC '00), LNCS, vol. 1962, pp. 1-20, 2000.
10. J. H. Park, "Efficient hidden vector encryption for conjunctive queries on encrypted data," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 10, pp. 1483-1497, 2011.
11. J. H. Park, H. J. Kim, M. H. Sung, and D. H. Lee, "Public key broadcast encryption schemes with shorter transmissions," IEEE Transactions on Broadcasting, vol. 54, no. 3, pp. 401-411, 2008.
12. A. Sahai and B. Waters, "Fuzzy identity-based encryption," in Advances in Cryptology - Eurocrypt 2005, LNCS, vol. 3494, pp. 457-473, 2005.