
SURVEY ON DISTRIBUTED DEDUPLICATION SYSTEM WITH AUDITING AND IMPROVED RELIABILITY IN CLOUD

Rekha R 1, ChandanRaj BR 2
1 MTech, 4th Sem, Dept. of Computer Science and Engineering, EWIT, Bengaluru-91
2 Assistant Professor, Dept. of Computer Science and Engineering, EWIT, Bengaluru-91

ABSTRACT: Data deduplication is a strategy for removing duplicate copies of data, and it has been widely utilized in cloud storage to decrease storage space and transfer bandwidth. On the other hand, only a single copy of each file is stored in the cloud, and that copy is shared by a large number of users. Thus, a deduplication system improves storage utilization while reducing reliability. In addition, privacy concerns for user-sensitive data also arise when the data is outsourced to the cloud. Aiming to address the above security challenges, this paper makes the first effort to formalize the notion of a distributed reliable deduplication system. It presents a new distributed deduplication system with improved reliability in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous deduplication systems. An auditing mechanism is implemented in order to track user activities in the cloud.

Keywords: Deduplication, secret sharing, distributed storage system, reliability, data integrity, auditing

[1] INTRODUCTION

With the rapid growth of digital information, deduplication techniques are extensively employed to back up data and to decrease network and storage overhead by identifying and eliminating redundancy among data. Instead of keeping multiple data copies with the same content, deduplication removes redundant data by storing only a single copy and referring all other redundant copies to that copy. The mechanism of deduplication is shown in [Figure-1].

Figure 1: Deduplication mechanism
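To make this mechanism concrete, the following minimal sketch (an illustration written for this survey, not taken from any of the cited systems) shows file-level deduplication on the server side: each incoming file is fingerprinted with a cryptographic hash, only the first copy is physically stored, and later uploads of identical content are recorded as references to that copy.

import hashlib


class DedupStore:
    """Toy file-level deduplication store: one physical copy per unique content."""

    def __init__(self):
        self.blobs = {}       # fingerprint -> stored file bytes (single physical copy)
        self.refcount = {}    # fingerprint -> number of logical owners
        self.catalog = {}     # (user, filename) -> fingerprint

    def upload(self, user, filename, data):
        fp = hashlib.sha256(data).hexdigest()    # content fingerprint (the "tag")
        if fp not in self.blobs:                 # first copy: actually store the bytes
            self.blobs[fp] = data
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        self.catalog[(user, filename)] = fp      # later copies only add a reference
        return fp

    def download(self, user, filename):
        return self.blobs[self.catalog[(user, filename)]]


store = DedupStore()
store.upload("alice", "report.pdf", b"identical content")
store.upload("bob", "copy.pdf", b"identical content")   # deduplicated: no second blob stored
assert len(store.blobs) == 1 and store.refcount[next(iter(store.blobs))] == 2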

Deduplication has attracted much attention from both academia and industry because it can greatly improve storage utilization and save storage space, particularly for applications with a high deduplication ratio such as archival storage systems. A number of deduplication systems have been proposed based on different deduplication strategies, such as client-side or server-side deduplication and file-level or block-level deduplication. Especially with the emergence of cloud storage, data deduplication has become more essential for managing the ever-increasing volume of data in cloud storage services, which motivates users to outsource data storage to third-party cloud providers. As a few examples, today's cloud storage services such as Google Drive and Dropbox apply client-side deduplication to save network bandwidth and storage cost.

Even though deduplication saves storage space, it decreases the reliability of the system. Data reliability is critical here because only one copy of each file is kept on the server and shared by all of its owners. If the value of a chunk is measured by the amount of file data that would be lost if that chunk went missing, then the amount of user data lost when a chunk in the storage system is corrupted grows with the number of files that share the chunk. Thus, how to guarantee high data reliability in a deduplication system is a crucial problem.

[2] LITERATURE SURVEY

Conventional encryption mechanisms, including public key encryption and symmetric key encryption, require different users to encrypt their data with their own keys. As a result, identical data copies belonging to different users produce distinct ciphertexts, which makes deduplication impossible; hence commercial storage service providers are reluctant to apply encryption to deduplicated data. To address the conflict between confidentiality and deduplication, the notion of convergent encryption has been proposed and widely adopted to enforce data confidentiality while still enabling deduplication. However, such systems achieve confidentiality of outsourced data at the cost of decreased error resilience. Therefore, supporting both confidentiality and reliability while achieving deduplication in a cloud storage system remains a challenge. Section 2.1 explains convergent encryption, Section 2.2 discusses secure auditing and deduplication in the cloud, and Section 2.3 covers distributed deduplication with improved reliability.

[2.1] CONVERGENT ENCRYPTION

Convergent encryption [2] guarantees data privacy in deduplication. Bellare et al. [6] formalized this primitive as message-locked encryption and investigated its application in space-efficient secure outsourced storage.

Figure 2: Architecture of convergent encryption based authorized deduplication

Data security in deduplication is provided by convergent encryption. From each unique data copy, a user derives a convergent key and encrypts that data copy with the key. The user also derives a tag for each unique data copy, which is used to detect duplicates; the architecture of the convergent encryption scheme is shown in [Figure-2]. To discover duplicate copies, the user first sends the tag to the server to check whether an identical copy has already been stored. The convergent key and the tag are derived independently, and the tag cannot be used to deduce the convergent key and break data confidentiality. The server stores the encrypted data copy and the corresponding tag. A convergent encryption scheme can be defined by four basic functions:

- KeyGen_CE(M) -> K: key generation algorithm, which maps a data copy M to a convergent key K.
- Enc_CE(K, M) -> C: symmetric encryption algorithm, which takes the data copy M and the convergent key K as input and outputs a ciphertext C.
- Dec_CE(K, C) -> M: decryption algorithm, which takes the convergent key K and the ciphertext C as input and outputs the original data copy M.
- TagGen(M) -> T(M): tag generation algorithm, which maps the original data copy M to a tag T(M).
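As a concrete illustration, the sketch below instantiates these four functions in Python. The particular choices here are assumptions made for the example, not the construction mandated by the surveyed papers: the convergent key is the SHA-256 hash of the data, the cipher is AES-256-CTR with a fixed all-zero nonce so that encryption is deterministic, and the tag is the SHA-256 hash of the ciphertext. The third-party cryptography package is required.

import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes  # pip install cryptography

ZERO_NONCE = b"\x00" * 16  # deterministic nonce: identical plaintexts yield identical ciphertexts

def keygen_ce(m: bytes) -> bytes:
    """KeyGen_CE(M) -> K: convergent key derived from the content itself."""
    return hashlib.sha256(m).digest()

def enc_ce(k: bytes, m: bytes) -> bytes:
    """Enc_CE(K, M) -> C: deterministic symmetric encryption (AES-256-CTR in this sketch)."""
    enc = Cipher(algorithms.AES(k), modes.CTR(ZERO_NONCE)).encryptor()
    return enc.update(m) + enc.finalize()

def dec_ce(k: bytes, c: bytes) -> bytes:
    """Dec_CE(K, C) -> M: decryption with the same convergent key."""
    dec = Cipher(algorithms.AES(k), modes.CTR(ZERO_NONCE)).decryptor()
    return dec.update(c) + dec.finalize()

def taggen(m: bytes) -> str:
    """TagGen(M) -> T(M): tag sent to the server for duplicate detection (hash of ciphertext here)."""
    return hashlib.sha256(enc_ce(keygen_ce(m), m)).hexdigest()

# Two users holding the same file derive the same key, ciphertext, and tag,
# so the server can deduplicate without ever seeing the plaintext.
m = b"the same file content held by two different users"
assert enc_ce(keygen_ce(m), m) == enc_ce(keygen_ce(m), m)
assert dec_ce(keygen_ce(m), enc_ce(keygen_ce(m), m)) == m
assert taggen(m) == taggen(m)

Reusing a fixed nonce is tolerable only because each convergent key encrypts exactly one message (the key is bound to the content); reusing such a key for different messages would break confidentiality.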

[2.2] SECURE AUDITING AND DEDUPLICATING DATA IN CLOUD

Although distributed storage frameworks have been widely adopted, they fail to accommodate some important emerging needs, such as the ability of cloud clients to audit the integrity of their cloud files and the ability of cloud servers to recognize duplicated files. The first problem is integrity auditing. The cloud server relieves clients of the heavy burden of storage management and maintenance, but the data is no longer under the clients' direct control, which inevitably raises serious concerns about its integrity. The second problem is secure deduplication, since the rapid adoption of cloud services is accompanied by increasing volumes of data stored at remote cloud servers.

DISTRIBUTED DEDUPLICATION SYSTEM WITH AUDITING AND IMPROVED RELIABILITY

Figure 3: SecCloud architecture

Aiming at accomplishing both data integrity and deduplication in the cloud, two secure systems, namely SecCloud and SecCloud+, are proposed [7]. As depicted in the architecture diagram in [Figure-3], SecCloud introduces an auditing entity with the support of a MapReduce cloud, which helps clients generate data tags before uploading and audits the integrity of the data already stored in the cloud. This design fixes the problem of previous work in which the computational load for tag generation at the user or auditor is too large. For completeness of fine-grained auditing, the auditing functionality designed in SecCloud is supported at both the block level and the sector level. Furthermore, SecCloud also enables secure deduplication. Note that the security considered in SecCloud includes the prevention of leakage of side-channel information. To prevent such leakage, a proof-of-ownership protocol is designed between clients and cloud servers, which allows clients to prove to the cloud servers that they indeed own the target data. Motivated by the fact that customers usually want to encrypt their data before uploading, for reasons ranging from personal privacy to corporate policy, a key server is introduced into SecCloud, yielding the SecCloud+ scheme. Besides supporting integrity auditing and secure deduplication, SecCloud+ enables the guarantee of file confidentiality.
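The proof-of-ownership idea can be illustrated with a toy challenge-response exchange: the server, which already holds the file, challenges the client to return keyed hashes over randomly chosen blocks, something a client that only knows the short file tag (for example, learned through a side channel) cannot answer. This is a simplified sketch under assumed block and hash choices, not the actual SecCloud protocol.

import hashlib, hmac, secrets

BLOCK = 4096  # assumed block size for this sketch

def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, max(len(data), 1), BLOCK)]

def server_challenge(num_blocks: int, count: int = 3):
    """Server picks a fresh nonce and random block indices for this run."""
    return secrets.token_bytes(16), [secrets.randbelow(num_blocks) for _ in range(count)]

def prove_ownership(data: bytes, nonce: bytes, indices):
    """Client answers with keyed hashes of the challenged blocks."""
    bs = blocks(data)
    return [hmac.new(nonce, bs[i], hashlib.sha256).digest() for i in indices]

def verify_ownership(data: bytes, nonce: bytes, indices, proof):
    """Server recomputes the same responses over its own stored copy."""
    return hmac.compare_digest(b"".join(prove_ownership(data, nonce, indices)), b"".join(proof))

stored = b"A" * 3 * BLOCK                                   # copy already held by the cloud server
nonce, idx = server_challenge(len(blocks(stored)))
assert verify_ownership(stored, nonce, idx, prove_ownership(stored, nonce, idx))                 # real owner passes
assert not verify_ownership(stored, nonce, idx, prove_ownership(b"B" * 3 * BLOCK, nonce, idx))   # non-owner fails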

[2.3] DISTRIBUTED DEDUPLICATION SYSTEM WITH IMPROVED RELIABILITY

The design of secure deduplication systems with higher reliability in cloud computing is presented in [8]. Distributed cloud storage servers are introduced into the deduplication system to provide better fault tolerance. To further protect data confidentiality, the secret sharing technique is utilized, which is also compatible with distributed storage systems. In more detail, a file is first split and encoded into fragments using secret sharing instead of an encryption mechanism, and these shares are distributed across multiple independent storage servers. Furthermore, to support deduplication, a short cryptographic hash of the content is also computed and sent to each storage server as the fingerprint of the fragment stored at that server. Only the data owner who uploads the data first is required to compute and distribute the secret shares; all subsequent users who own the same data copy do not need to compute and store these shares again. To recover a data copy, a user must authenticate to a minimum number of storage servers and obtain the secret shares to reconstruct the data. In other words, the secret shares of the data are accessible only to the authorized users who own the corresponding data copy.

Another notable feature of this proposal is that data integrity, including tag consistency, can be achieved. Conventional deduplication methods cannot be directly extended to distributed, multi-server systems: otherwise, any of the servers could acquire the shares of data stored at the other servers by presenting the same short value as a proof of ownership. Moreover, tag consistency, which prevents the duplicate/ciphertext replacement attack, is considered in this protocol; in more detail, it prevents a user from uploading a maliciously generated ciphertext whose tag is the same as that of another, honestly generated ciphertext. To accomplish this, a deterministic secret sharing method is formalized and utilized. To the authors' knowledge, no prior work on secure deduplication properly addresses the reliability and tag consistency issues in distributed storage systems. The system architecture is shown in [Figure-4].

Figure 4: Architecture of the distributed deduplication system
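As a client-side illustration of this flow, the sketch below uploads a file to n storage servers (S-CSPs) and later recovers it. All names (StorageServer, split_shares, taggen, and so on) are hypothetical stand-ins invented for this survey, and the n-out-of-n XOR splitting is a deliberately simplified, randomized stand-in for the deterministic ramp secret sharing scheme the surveyed system actually uses.

import hashlib, secrets

class StorageServer:
    """Toy S-CSP: stores fragments indexed by the file fingerprint, enabling per-server dedup."""
    def __init__(self):
        self.fragments = {}
    def has(self, fingerprint):              # duplicate check against this server
        return fingerprint in self.fragments
    def put(self, fingerprint, share):
        self.fragments[fingerprint] = share
    def get(self, fingerprint):
        return self.fragments[fingerprint]

def taggen(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def split_shares(data: bytes, n: int):
    """Simplified n-out-of-n XOR splitting; a stand-in for the Share algorithm."""
    pads = [secrets.token_bytes(len(data)) for _ in range(n - 1)]
    last = data
    for p in pads:
        last = bytes(a ^ b for a, b in zip(last, p))
    return pads + [last]

def recover(shares):
    """Stand-in for the Recover algorithm: XOR of all n shares restores the data."""
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

def upload(servers, data: bytes):
    tag = taggen(data)
    if all(s.has(tag) for s in servers):      # duplicate found: only a pointer is kept
        return tag
    for server, share in zip(servers, split_shares(data, len(servers))):
        server.put(tag, share)                # first uploader distributes the shares
    return tag

def download(servers, tag: str) -> bytes:
    return recover([s.get(tag) for s in servers])

servers = [StorageServer() for _ in range(4)]
tag = upload(servers, b"outsourced file contents")
upload(servers, b"outsourced file contents")  # second owner: deduplicated, nothing re-stored
assert download(servers, tag) == b"outsourced file contents"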

File-level and Block-level Distributed Deduplication System: To keep the duplicate check efficient, tags are computed for each file/block and sent to the S-CSPs.

File Upload: To perform deduplication, the user interacts with the S-CSPs to upload a file F. The user first computes and sends the file tag φ_F = TagGen(F) to the S-CSPs for the file-level duplicate check. If a duplicate is found, the user is given a pointer to the shares already stored at the servers. If no duplicate is found, the user runs the secret sharing algorithm SS and distributes the resulting shares. TagGen is the tag generation algorithm that takes the original data copy C and outputs a tag T(C); this tag is produced by the user and used for the duplicate check with the server. An alternative tag generation algorithm, TagGen', takes as input a file C and an index j and outputs a tag; this tag, also generated by the user, is used for the proof of ownership.

File Download: To download a file F, the user retrieves the secret shares {cj} of the file from k out of n storage servers. Precisely, the user sends the pointer of F to k out of n S-CSPs. After collecting enough shares, the user reconstructs the file F by using the algorithm Recover({cj}). This technique provides fault tolerance and keeps the data accessible even if a limited subset of the storage servers fails. The same applies at the block level. Lagrange's formula is used to interpolate the sharing polynomial

f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_{k-1} x^{k-1}

Ramp Secret Sharing Scheme: The two algorithms in a secret sharing scheme are Share and Recover. The secret is divided and distributed using Share; with enough shares, the secret can be extracted and regenerated with Recover. Here, the ramp secret sharing scheme (RSSS) is assumed to be used to secretly split a secret into shards.
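Below is a minimal sketch of the Share/Recover interface, written as plain Shamir-style (k, n) threshold sharing over the prime field GF(257) and recovering with the Lagrange formula shown above. The real system uses a deterministic ramp scheme (RSSS) with better storage efficiency; this simplified, randomized variant is only meant to show how any k of the n shares reconstruct the secret.

import secrets

P = 257  # small prime field, large enough to hold one byte per coefficient

def share(secret: bytes, k: int, n: int):
    """Share: split `secret` into n shares such that any k of them suffice to recover it."""
    shares = [(x, []) for x in range(1, n + 1)]
    for byte in secret:
        coeffs = [byte] + [secrets.randbelow(P) for _ in range(k - 1)]   # f(0) = byte
        for x, values in shares:
            # evaluate f(x) = a_0 + a_1*x + ... + a_{k-1}*x^{k-1}  (mod P)
            values.append(sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
    return shares

def recover(subset):
    """Recover: Lagrange-interpolate f(0) from any k shares (x_j, f(x_j))."""
    xs = [x for x, _ in subset]
    out = bytearray()
    for position in range(len(subset[0][1])):
        total = 0
        for j, (xj, values) in enumerate(subset):
            num, den = 1, 1
            for m, xm in enumerate(xs):
                if m != j:
                    num = num * (-xm) % P           # product of (0 - x_m)
                    den = den * (xj - xm) % P       # product of (x_j - x_m)
            total += values[position] * num * pow(den, P - 2, P)   # modular inverse of den
        out.append(total % P)
    return bytes(out)

secret = b"chunk of an outsourced file"
all_shares = share(secret, k=3, n=5)
assert recover(all_shares[:3]) == secret      # any 3 of the 5 shares are enough
assert recover(all_shares[2:]) == secret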

Auditing module: This component is used for tracking user activities at the Cloud Service Provider. Any additions, modifications, or deletions of the data are recorded along with the user details and the time of the operation. The data owner can later view the auditing report.

[3] CONCLUSION

The deduplication systems discussed here increase the reliability of data. The distributed deduplication system with the ramp secret sharing scheme is the best option for improving the reliability of data while achieving the confidentiality of the users' outsourced data without an encryption mechanism. The security goals of tag consistency and integrity are also attained.

REFERENCES

[1] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer, "Reclaiming space from duplicate files in a serverless distributed file system," in ICDCS, 2002, pp. 617-624.
[2] M. Bellare, S. Keelveedhi, and T. Ristenpart, "DupLESS: Server-aided encryption for deduplicated storage," in USENIX Security Symposium, 2013.
[3] [4] G. R. Blakley and C. Meadows, "Security of ramp schemes," in Advances in Cryptology: Proceedings of CRYPTO 84, ser. Lecture Notes in Computer Science, vol. 196, G. R. Blakley and D. Chaum, Eds. Springer-Verlag, Berlin/Heidelberg, 1985, pp. 242-268.
[5] A. De Santis and B. Masucci, "Multiple ramp schemes," IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1720-1728, Jul. 1999.
[6] J. Li, X. Chen, M. Li, J. Li, P. Lee, and W. Lou, "Secure deduplication with efficient and reliable convergent key management," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 6, pp. 1615-1625, 2014.
[7] M. Bellare, S. Keelveedhi, and T. Ristenpart, "Message-locked encryption and secure deduplication," in Advances in Cryptology - EUROCRYPT 2013, ser. Lecture Notes in Computer Science, vol. 7881, T. Johansson and P. Nguyen, Eds. Springer, Berlin/Heidelberg, 2013, pp. 296-312.
[8] Jingwei Li, Jin Li, Dongqing Xie, and Zhang Cai, "Secure auditing and deduplicating data in cloud," IEEE Transactions on Computers.
[9] Jin Li, Xiaofeng Chen, Xinyi Huang, Shaohua Tang, Yang Xiang, Mohammad Mehedi Hassan, and Abdulhameed Alelaiwi, "Secure distributed deduplication systems with improved reliability," IEEE Transactions on Computers, vol. PP, 2015.

Authors' Brief Introduction

Rekha R is currently pursuing MTech (4th semester) in Computer Science and Engineering at East West Institute of Technology, Bengaluru-91.

ChandanRaj BR is working as Assistant Professor, Department of Computer Science, East West Institute of Technology, Bengaluru-91. His areas of specialization are mobile computing, 4G network management, sensor networks, and network security.

Corresponding Address

Rekha R
#33, Nisarga, 2nd Cross, Maruthinagar, Madeshwaranagr 2nd Stage, Bengaluru-560091
Mobile: 9900860466

ChandanRaj BR
Assistant Professor, Dept. of Computer Science and Engineering
East West Institute of Technology, Off Magadi Road, Vishwaneedam Post, Bengaluru-560091
Mobile: 9342954123