Secret Sharing based on XOR for Efficient Data Recovery in Cloud Computing Environment

Su-Hyun Kim (First Author), Division of Computer Software Engineering, Soonchunhyang University, kimsh@sch.ac.kr
Im-Yeong Lee (Corresponding Author), Division of Computer Software Engineering, Soonchunhyang University, imylee@sch.ac.kr

Abstract

Cloud computing has been receiving increasing attention recently. Despite this attention, security is the main problem that still needs to be addressed for cloud computing. In general, a cloud computing environment protects data by storing it on distributed servers. When the amount of data is very large, however, the pieces of a secret key (if one is used) may be divided among hundreds of distributed servers. Managing these distributed servers then becomes very difficult simply in terms of the authentication, encryption, and decryption processes involved, which incur vast overheads. To address this problem, we propose a system in which distributed data are stored on distributed servers that organize, and communicate with, a user via an XOR-based threshold construction.

Keywords: Cloud Computing, XOR, Distributed Storage

1. Introduction

Interest in cloud computing is increasing around the world, and a significant amount of research is being conducted in this area. Many companies are interested in cloud computing because it allows extension into various areas and the efficient use of computing power as IT technology grows. However, one of the most critical issues preventing companies from introducing cloud computing environments is security. Cloud computing service providers have improved security by using various measures to protect user data; nevertheless, users cannot be completely confident about the security of their stored data and its management by the providers. In addition, the data stored on distributed servers may be an easier target for malicious users than data exposed on communication networks. Therefore, user data are stored in an encrypted form in most cloud systems.

In typical cloud computing environments, data are stored on several distributed servers to protect the user data. However, data leaks may occur if the data are stored on distributed servers without encryption, for example if the distributed file locations recorded on the master server are traced. Encrypting the distributed data with secret keys is necessary to prevent this problem. However, the data may be divided into tens or hundreds of pieces when large amounts of data are used. If a different key is applied to each individual distributed server that stores the data, management becomes difficult and huge overheads are incurred by the countless certification, encryption, and decryption processes on the distributed servers. This study presents a technique for collecting data pieces more effectively to address these problems: we propose a system in which distributed data are stored on distributed servers that organize, and communicate with, a user via an XOR-based threshold construction.

The remainder of this paper is organized as follows. Section 2 introduces related techniques to help readers understand the technique presented in this study. Section 3 explains the basic requirements for the security of cloud computing environments. Section 4 presents the proposed method in greater detail. Section 5 analyzes the safety of the proposed method.
Finally, Section 6 provides concluding remarks about the proposed method and directions for future study.

This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program (NIPA--H--) supervised by the NIPA (National IT Industry Promotion Agency). This work was supported by the Soonchunhyang University Research Fund.
2. Related Work

This section introduces related techniques and existing methods to help readers understand the technique proposed in this study.

2.1. Google File System (GFS)

The Google File System (GFS) was developed to store large amounts of data, and it is optimized for core data storage and search engines. GFS stores large amounts of data distributed across many low-cost storage servers. The GFS architecture consists of a master server and multiple chunk servers, and the data are separated into 64 MB units and stored on several chunk servers. The master server assigns a unique 64-bit handle to each chunk when it is created and maintains the connection through logical mapping. However, there are limits to the measures available for handling the failures that occur frequently because of the large number of chunk servers.

2.2. Apache Hadoop Distributed File System (HDFS)

HDFS is a file system that can be implemented on existing hardware systems and has many similarities to existing distributed file systems. However, it also has many differences, including its strong failure-recovery features and a design that is applicable to low-cost hardware. HDFS is widely used for distributed file systems and forms the basis of cloud computing platforms for global IT companies such as Amazon, IBM, and Yahoo. A review of the items highlighted in the design and implementation of HDFS shows that most are identical to GFS; the major difference is that Java is used for the implementation to simplify portability between platforms. The high portability of Java is an advantage when implementing HDFS on servers that support Java [1].

2.3. Writing data in an HDFS cluster

1. The HDFS client requests data storage from the NameNode.
2. The NameNode searches for a storage location and sends the client the address of the DataNode where the data should be saved.
3. The client sends the data blocks to the first DataNode. As the data is transmitted, the DataNodes form a pipeline and replicate the blocks at the same time.
4. When saving is complete, a completion message is sent to the NameNode, and the NameNode records the stored information.

Figure 1. Writing data in an HDFS cluster
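The write flow above can be made concrete with a small, self-contained simulation. This is only an illustrative Python sketch of the request/pipeline/acknowledgement sequence; the class and method names (NameNode, DataNode, HdfsClient, write_file) are hypothetical and do not correspond to the real HDFS API.

# Illustrative toy model of the HDFS write flow described above.
# All names are hypothetical; this is not the real HDFS API, only a sketch
# of the request / pipeline / record-keeping sequence.

class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}                      # block_id -> bytes

    def store(self, block_id, data, pipeline):
        self.blocks[block_id] = data          # save locally
        if pipeline:                          # forward to the next node (replication)
            pipeline[0].store(block_id, data, pipeline[1:])

class NameNode:
    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.metadata = {}                    # filename -> [(block_id, [datanode names])]

    def allocate(self, filename, block_id):
        # Choose the DataNodes that will hold the replicas of this block and
        # record the mapping. (In HDFS the DataNodes acknowledge back through
        # the pipeline before the write is considered complete; omitted here.)
        targets = self.datanodes[: self.replication]
        self.metadata.setdefault(filename, []).append(
            (block_id, [d.name for d in targets]))
        return targets

class HdfsClient:
    def __init__(self, namenode, block_size=4):
        self.namenode = namenode
        self.block_size = block_size

    def write_file(self, filename, data):
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            block_id = f"{filename}#{i // self.block_size}"
            targets = self.namenode.allocate(filename, block_id)   # ask the NameNode
            # Send the block to the first DataNode; it forwards it along the pipeline.
            targets[0].store(block_id, block, targets[1:])

if __name__ == "__main__":
    nodes = [DataNode(f"dn{i}") for i in range(4)]
    nn = NameNode(nodes)
    HdfsClient(nn).write_file("report.txt", b"hello distributed world")
    print(nn.metadata["report.txt"])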
3. Security Requirements

One of the core mechanisms used by cloud computing is data encryption via key management. A failure in the key storage server can make data access impossible; hence, research into key management methods is essential. The following security requirements are needed to prevent various failures.

- Safe Key Storage: as with other types of sensitive data, the key must be protected during storage, transmission, and back-up. Inadequate key storage could destroy all of the encrypted data.
- Key Storage Access: access to key storage should be limited to those entities that require private keys. The key storage management policy should also maintain separate access control; the entity that provides keys should be different from the entity that stores them.
- Key Back-up and Recovery: a lost key inevitably leads to a loss of data, which is an effective way of destroying data, and the accidental loss of a key can lead to major losses for a business. Therefore, safe back-up and recovery solutions must be implemented.
- Confidentiality: data communicated between the storage server and a client terminal should be readable only by legitimate entities.
- Authentication: the legitimacy of a storage server on which data is distributed and stored should be verifiable only by a legitimate client.
- Availability: certification and secrecy should be provided quickly enough to ensure availability when large amounts of data are transmitted.
- Computational Efficiency: only the minimum number of computations should be performed to reduce the frequent overheads on client terminals and cloud servers.

4. Proposed Scheme

4.1. Data storage and error recovery

The overall cloud computing environment is designed based on Apache HDFS. In this environment, the data to be stored is divided into 8 MB blocks and saved across distributed storage servers. If the data is stored on the distributed storage servers as unencrypted plaintext, part of the data is exposed whenever a distributed storage server is compromised by an attacker.

Figure 2. Distributed data storage phase
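The distributed data storage phase in Figure 2 can be sketched as follows: the data is split into fixed-size blocks and the blocks are placed on the storage servers under control of the master server. This is a minimal Python sketch under assumed names (split_into_blocks, distribute) and the 8 MB block size stated above; it is not the paper's implementation, and encryption and parity generation (described in the rest of Section 4) are omitted.

# Minimal sketch of the distributed data storage phase (Figure 2).
# Names and the block size are illustrative assumptions, not the paper's code.

BLOCK_SIZE = 8 * 1024 * 1024     # 8 MB blocks, as assumed in Section 4.1

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def distribute(blocks, servers):
    """Assign block i to server i mod len(servers) (round-robin placement)."""
    placement = {}
    for i, block in enumerate(blocks):
        placement[i] = servers[i % len(servers)]
        # In a real system the master server records this mapping and the
        # block is sent over the network, possibly after encryption.
    return placement

if __name__ == "__main__":
    data = b"x" * (20 * 1024 * 1024)             # 20 MB of sample data
    blocks = split_into_blocks(data)
    print(len(blocks), "blocks")                 # -> 3 blocks (8 + 8 + 4 MB)
    print(distribute(blocks, ["server-A", "server-B"]))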
HDFS recommends that users encrypt their data to protect it against attacks, but the system itself does not provide encryption. The method proposed in this paper therefore uses chaining in the manner of CBC (Cipher Block Chaining) mode, one of the block cipher modes of operation: each original data block is converted into a random-looking data block. A parity block is then generated as shown in equation (1).

C_P = C_1 ⊕ C_2 ⊕ ... ⊕ C_n    (1)

If an error occurs in one of the data blocks, the generated parity block can be used to recover the original data block. The steps for generating the parity block and recovering a data block are as follows (a minimal code sketch is given at the end of Section 4).

- Process of generating the parity block: C_P = C_1 ⊕ C_2 ⊕ ... ⊕ C_n
- Process of recovering a lost block C_i: C_i = C_P ⊕ C_1 ⊕ ... ⊕ C_{i-1} ⊕ C_{i+1} ⊕ ... ⊕ C_n

4.2. Data Piece Collection

If a request is received from a user, a collection process is required for the distributed data. The distributed servers are physically and logically separated from the master server; hence, each individual distributed server must be authenticated, and the data pieces must be kept secret on the communication network while they are transmitted. However, an existing public key encryption system is very inefficient for this purpose in terms of its cost and computation. Therefore, this study proposes a method for more efficient communication by applying group signcryption [6]. In the following, the subscript SS denotes the storage server (the sender) and IS the receiving party; p and q are primes with q dividing p - 1, g is a generator of order q, G and H are hash functions, and E/D denote symmetric encryption and decryption.

Step 1: The storage server encrypts and signs the message using signcryption.
- choose a random x ∈ Z_q
- w = y_IS^x mod p
- k = G(w)
- r = H(m, server_info, k)
- s = x / (r + x_SS) mod q
- c = E_k(m), where m = E_K(data)

Step 2: The user receives the signcrypted message from the storage server and verifies the storage server and the encrypted message by unsigncryption. The user receives c, r, and s, recovers the message m, and compares the result with the received r value to verify that it is correct.
- w = (y_SS · g^r)^(s · x_IS) mod p
- k = G(w)
- m = D_k(c)
- accept if r = H(m, server_info, k)
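The two steps above can be illustrated with a toy Python sketch. It follows the general structure of Zheng-style signcryption rather than the exact group signcryption of [6]; the tiny parameters, the hash-based stream cipher, and all function names are illustrative assumptions and are in no way secure or part of the paper.

# Toy illustration of the signcryption (Step 1) and unsigncryption (Step 2) above.
# Parameters and primitives are deliberately small and insecure; demo only.
import hashlib
import secrets

p, q, g = 23, 11, 2          # toy group: q divides p-1 and g has order q mod p

def H(*parts) -> int:
    """Hash to an integer mod q (stands in for the paper's H)."""
    h = hashlib.sha256(b"|".join(str(x).encode() for x in parts)).digest()
    return int.from_bytes(h, "big") % q

def G(w: int) -> bytes:
    """Key-derivation hash (stands in for the paper's G)."""
    return hashlib.sha256(str(w).encode()).digest()

def E(k: bytes, m: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a hash-derived keystream (short messages only)."""
    ks = b"".join(hashlib.sha256(k + bytes([i])).digest() for i in range(256))
    return bytes(a ^ b for a, b in zip(m, ks[: len(m)]))

D = E  # XOR keystream: decryption is the same operation

def keygen():
    x = secrets.randbelow(q - 1) + 1
    return x, pow(g, x, p)                    # (private, public)

def signcrypt(m: bytes, x_ss: int, y_is: int, server_info: str):
    # In the paper, m would itself be an encrypted data piece, m = E_K(data).
    while True:
        x = secrets.randbelow(q - 1) + 1
        w = pow(y_is, x, p)
        k = G(w)
        r = H(m, server_info, k)
        if (r + x_ss) % q != 0:               # skip the non-invertible edge case
            break
    s = (x * pow((r + x_ss) % q, -1, q)) % q  # s = x / (r + x_SS) mod q
    return E(k, m), r, s

def unsigncrypt(c: bytes, r: int, s: int, x_is: int, y_ss: int, server_info: str):
    w = pow((y_ss * pow(g, r, p)) % p, (s * x_is) % q, p)
    k = G(w)
    m = D(k, c)
    assert r == H(m, server_info, k), "verification failed"
    return m

if __name__ == "__main__":
    x_ss, y_ss = keygen()      # storage server (sender)
    x_is, y_is = keygen()      # receiving party
    c, r, s = signcrypt(b"data piece #7", x_ss, y_is, "server-42")
    print(unsigncrypt(c, r, s, x_is, y_ss, "server-42"))   # b'data piece #7'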
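Returning to the error-recovery mechanism of Section 4.1, the following minimal sketch shows parity generation by XOR (equation (1)) and the recovery of a single lost block. The function names and the zero-padding of shorter blocks are illustrative assumptions; the CBC-style block conversion and the signcrypted transport are omitted.

# Minimal sketch of XOR parity generation and single-block recovery
# (equation (1) in Section 4.1). Names and padding are illustrative choices.
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(blocks):
    """C_P = C_1 xor C_2 xor ... xor C_n (blocks padded to equal length)."""
    size = max(len(b) for b in blocks)
    padded = [b.ljust(size, b"\x00") for b in blocks]
    return reduce(xor_blocks, padded)

def recover_block(parity, remaining_blocks):
    """Recover the single missing block: C_i = C_P xor (xor of all other blocks)."""
    size = len(parity)
    padded = [b.ljust(size, b"\x00") for b in remaining_blocks]
    return reduce(xor_blocks, padded, parity)

if __name__ == "__main__":
    blocks = [b"block-one", b"block-two", b"block-3!!"]
    parity = make_parity(blocks)

    # Suppose the storage server holding blocks[1] fails.
    recovered = recover_block(parity, [blocks[0], blocks[2]])
    assert recovered == blocks[1].ljust(len(parity), b"\x00")
    print(recovered)        # b'block-two'

Because both generation and recovery are single passes of XOR operations, a lost block can be rebuilt from the surviving blocks and the parity block without any public-key computation.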
5. Efficiency Analysis

The technique proposed in this study ensures that the system is protected from attackers, because an attacker cannot acquire the original data from the pieces held by any single storage server.

5.1. Calculation Efficiency

The signcryption technique applied in this study reduces the overhead by at least 7% compared with existing approaches that encrypt after generating a signature using public and symmetric keys [5]. The group signcryption application also facilitates more efficient key management for users.

5.2. Authentication and Confidentiality

The user data are divided into many pieces and stored on storage servers by the master server. If a user request is received for the data, a collection process is required for the distributed data. The storage servers are physically and logically separated from the master server, so each individual distributed server must be certified and the data pieces must be kept secret on the communication networks used during their transmission. An existing public key encryption system is very inefficient for this purpose in terms of its costs and computations; by applying signcryption, each storage server is authenticated and the transmitted data pieces remain confidential in a single operation.

6. Conclusion

The management of a distributed server may be very difficult simply in terms of its authentication, encryption, and decryption processes, which incur vast overheads. To address this problem, we propose a system in which distributed data are stored on distributed servers that organize, and communicate with, a user via an XOR-based threshold construction. A more detailed comparative analysis of existing methods and the method proposed in this paper should be carried out using a simulation that considers various environmental factors.

7. References

[1] K. Shvachko, H. Huang, S. Radia and R. Chansler, "The Hadoop Distributed File System", 26th IEEE Symposium on Mass Storage Systems and Technologies (MSST), May 2010.
[2] A. Shamir, "How to Share a Secret", Communications of the ACM, Vol. 22, No. 11, pp. 612-613, 1979.
[3] C. Cachin, "On-line Secret Sharing", Cryptography and Coding, LNCS Vol. 1025, pp. 190-198, 1995.
[4] Raghu Ramakrishnan, "Sherpa: Cloud Computing of the Third Kind", Data-Intensive Computing Symposium, 2008.
[5] Y. Zheng, "Digital Signcryption or How to Achieve Cost(Signature & Encryption) << Cost(Signature) + Cost(Encryption)", Advances in Cryptology - CRYPTO '97, LNCS Vol. 1294, pp. 165-179, Springer-Verlag, 1997.
[6] D. Kwak and S. Moon, "Efficient Distributed Signcryption Scheme as Group Signcryption", Applied Cryptography and Network Security (ACNS 2003), LNCS Vol. 2846, pp. 403-417, Springer-Verlag, 2003.