Authorized Data Deduplication Check in Hybrid Cloud with Cluster as a Service

X. ALPHONSE INBARAJ
PG Scholar, Department of Computer Science and Engineering, Anna University, Coimbatore.

Abstract
Data deduplication is a compression technique that is widely used to eliminate repeated copies of the same data. To support differential-authorization duplicate checks and to avoid repudiation between the user and the cloud service provider, the data is encrypted before it is outsourced. To better protect data security, this paper makes a first attempt to formally address the problem of integrity without any key generation: only tag generation and the authorized user's identity are used, which makes the data more secure than in previous traditional deduplication systems. In addition, one further cloud storage service is included for security, namely Cluster as a Service (CaaS). This paper explains how CaaS is used in the secure system compared with the previous normal operation.

Keywords: Deduplication, CaaS, Encryption, PoW, Tag.

1 INTRODUCTION
Cloud computing is a computing paradigm that offers a huge amount of compute and storage resources to the masses. Individuals and enterprises can access these resources by paying a small amount of money, just for what is really needed. Cloud computing provides seemingly unlimited virtualized resources to users as services across the Internet while hiding platform and implementation details. Depending on which services and resources are offered, a cloud belongs to one of three basic categories: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). IaaS clouds provide basic computational resources (e.g., storage, servers), PaaS clouds offer easy development and deployment environments for scalable applications, and SaaS clouds offer complete end-user applications that are deployed, managed and delivered as a service, usually through the browser. The main purpose of Cluster as a Service (CaaS) is to ease the publication, discovery, selection and use of existing computational clusters. With the goal of providing users more flexible services in a transparent manner, all services are allocated in a cloud that is actually a collection of devices and resources connected through the Internet.
One of the core services provided by cloud computing is data storage. This poses new challenges in creating secure and reliable data storage and access facilities over remote service providers in the cloud, given the increasing volume of data. To make data management scalable in cloud computing, data deduplication [1] is a well-known compression technique that eliminates duplicate copies of repeating data. By applying data deduplication, storage can be used completely and efficiently by reducing the number of bytes stored, and storage can therefore be used at very low cost. Instead of keeping multiple copies of the same data, only one identical copy is kept and the rest refer to it. To make efficient use of data deduplication, convergent encryption [2][5] has been used. It produces identical ciphertext from identical plaintext files, which allows duplicate files to be removed from cloud storage without the provider having access to the encryption keys. This differs from the traditional method: when each user encrypts with their own private key, identical plaintexts produce different ciphertexts, which makes deduplication ineffective.
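To make the idea above concrete, the following is a minimal sketch of convergent encryption in Python. It is only an illustration, not the exact construction of [2][5]: the SHA-256-in-counter-mode keystream and the function names are assumptions made for a self-contained example. The key point it shows is that the key is derived from the file content itself, so identical files always yield identical ciphertexts and tags, and the server can deduplicate them without ever seeing the key.

```python
import hashlib

def _keystream(key: bytes, length: int) -> bytes:
    """Deterministic keystream: SHA-256(key || counter), concatenated."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def convergent_encrypt(data: bytes) -> tuple[bytes, bytes, bytes]:
    """Returns (key, tag, ciphertext); identical data gives identical output."""
    key = hashlib.sha256(data).digest()              # key derived from the content
    ciphertext = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    tag = hashlib.sha256(ciphertext).digest()        # duplicate-check tag
    return key, tag, ciphertext

def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """The XOR stream cipher is symmetric: decryption re-applies the keystream."""
    return bytes(a ^ b for a, b in zip(ciphertext, _keystream(key, len(ciphertext))))

# Two users holding the same file derive the same key, tag and ciphertext,
# so the storage server can keep a single copy and still never sees the key.
k1, t1, c1 = convergent_encrypt(b"same file contents")
k2, t2, c2 = convergent_encrypt(b"same file contents")
assert (k1, t1, c1) == (k2, t2, c2)
```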

To enable deduplication while preventing unauthorized duplicate checks, a tag with corresponding privileges is used [3][4]. The tag detects duplicates and is associated with a set of privileges; it is generated independently of convergent encryption [5][2]. The duplicate check is performed against the user's privileges whenever a file is uploaded to the cloud. If a duplicate file is found, the owner's identity is verified; this verification is carried out through a proof of ownership (PoW) [6][7]. The information the user supplies is matched against the identity already stored in the cloud at registration time.

1.1 CONTRIBUTIONS
In this paper:
1) Convergent encryption with differential privileges is performed, and a tag is provided to detect duplicates without any key generation. As stated in the definition of convergent encryption, the provider does not have access to the encryption key; when a user registers and uploads a file, he or she supplies an identity of ownership and a security proof that governs encryption and decryption of the file.
2) Data deduplication is performed without the provider generating keys, and duplicates are detected through tag generation according to the privilege [3][4]. A hybrid cloud architecture consisting of a private and a public cloud is considered.
3) The first provably secure and practical deduplication schemes that guarantee data security are introduced.
4) One of the schemes is implemented and experimental results are shown.
The rest of this paper proceeds as follows. Section 2 briefly revisits some preliminaries. Section 3 proposes the system model for secure data protection. Section 4 describes the practical implementation of the secure deduplication system. Section 5 discusses related work. Section 6 draws conclusions, and Section 7 outlines future work and further ideas to enhance security.

2 PRELIMINARIES
In this section we consider symmetric encryption and review some secure primitives used in our secure deduplication.

2.1 Cluster as a Service (CaaS) Implementation
In earlier experiments, CaaS was implemented using Windows Communication Foundation (WCF) in .NET, which uses web services [26]. The problem is that, while web services have simplified resource access and management, it is not possible to know whether the resources behind a web service are ready for a request. Clients need to exchange numerous messages with the required web services to learn the current activity of the resources, and thus face significant overhead if most of the web services prove ineffective. The role of the CaaS is to (i) provide easy and intuitive file-transfer tools so that clients can upload jobs and download results, and (ii) offer an easy-to-use interface for clients to monitor their jobs. The CaaS service does this by allowing clients to upload files as they would on any web page, while carrying out the required data transfer to the cluster transparently. Because clients cannot know how the cluster's data storage is managed, the CaaS offers a simple transfer interface while handling the transfer details. Job submission in CaaS completes the file upload, makes a connection to the cluster storage and commences the transfer of all files; a sketch of such a client-side interface is given below. Job monitoring shows the execution progress of a job: even though the cluster is not owned by the client, the job is, so it is the client's right to see how the job is progressing and, if the client decides, to terminate the job and remove it from the cluster.
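As an illustration of the client-facing CaaS roles just described (submit, monitor, terminate, download), the following is a minimal sketch of a hypothetical CaaS client. The class name, endpoint paths and JSON fields are assumptions for the sake of a self-contained example; they are not the actual WCF interface of [26].

```python
import requests

class CaaSClient:
    """Hypothetical client for a CaaS front end that hides the cluster's
    storage layout behind simple upload / monitor / terminate calls."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def submit_job(self, job_name: str, file_paths: list[str]) -> str:
        """Upload the job's input files; the CaaS transfers them to the
        cluster storage transparently and returns a job identifier."""
        files = [("files", open(p, "rb")) for p in file_paths]
        resp = requests.post(f"{self.base_url}/jobs",
                             data={"name": job_name}, files=files)
        resp.raise_for_status()
        return resp.json()["job_id"]          # assumed response field

    def job_status(self, job_id: str) -> dict:
        """Monitoring: the client owns the job, even if not the cluster."""
        resp = requests.get(f"{self.base_url}/jobs/{job_id}")
        resp.raise_for_status()
        return resp.json()

    def terminate_job(self, job_id: str) -> None:
        """The client may terminate the job and remove it from the cluster."""
        requests.delete(f"{self.base_url}/jobs/{job_id}").raise_for_status()

    def download_results(self, job_id: str, dest_path: str) -> None:
        """Retrieve the result files once the job has finished."""
        resp = requests.get(f"{self.base_url}/jobs/{job_id}/results")
        resp.raise_for_status()
        with open(dest_path, "wb") as f:
            f.write(resp.content)
```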

2.2 Authorized User Identity (PoW)
The notion of proof of ownership (PoW) [7][24][6] enables users to prove their ownership of data copies to the storage server; in our system this is referred to as the authorized user identity. Whenever a user finds a duplicate of a file in the hybrid cloud, he presents his ownership to the Auditor Controller (AC). When users upload or download files, they must supply this proof. Following the concept of Designated Verifier Provable Data Possession (DV-PDP) [14], data owners designate a verifier who can verify their data independently of the Cloud Service Provider (CSP). In our system, the user communicates with the storage server indirectly, through the Auditor Controller. For security reasons, deduplication is carried out by issuing the generated tag. A simplified challenge-response sketch of the ownership check is given at the end of this section.

3 SYSTEM MODEL
This section involves three network entities: the user, the Auditor Controller (AC) and the Cloud Storage Server (CSS). Figure 1 depicts the architecture employed in our prototype. The system consists of the following modules.
1) User: an entity, either a consumer or an organization, with massive data to be shifted into the CSS for maintenance and computation through the Auditor Controller.
2) Cloud Storage Server (CSS): an entity with significant storage space and computation resources that maintains clients' and organizations' data.
3) Auditor Controller (AC): an entity with expertise and capabilities that clients do not have, trusted to assess and expose the risk of cloud storage services on behalf of clients.
Small companies normally store their data in a cloud server through a data centre, where the data is maintained. Companies must pay for maintenance and computation; over time, some companies go bankrupt and no longer deal with this data. To address these problems, this paper proposes an authorized data deduplication check in a hybrid cloud with Cluster as a Service.

Figure 1: Architecture of the proposed deduplication system (user, Auditor Controller and Cloud Storage Servers, with identity of ownership, tag generation, upload queues, encryption and deduplication of encrypted blocks/files).
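The following is a highly simplified sketch of such an ownership check: a challenge-response over randomly chosen byte ranges of the file, where the verifier is assumed to hold the file (or precomputed digests of it). This is an illustrative assumption, not the Merkle-tree PoW construction of [6]; the function names and parameters are hypothetical.

```python
import hashlib
import os
import secrets

def pow_challenge(num_ranges: int, file_size: int, range_len: int = 64):
    """Verifier side: pick random byte ranges of the stored file to query."""
    return [(secrets.randbelow(max(file_size - range_len, 1)), range_len)
            for _ in range(num_ranges)]

def pow_response(path: str, challenge, nonce: bytes) -> list[bytes]:
    """Claimant side: prove possession by hashing the challenged ranges."""
    answers = []
    with open(path, "rb") as f:
        for offset, length in challenge:
            f.seek(offset)
            answers.append(hashlib.sha256(nonce + f.read(length)).digest())
    return answers

def pow_verify(path: str, challenge, nonce: bytes, answers) -> bool:
    """Verifier recomputes the same digests from its own copy of the file."""
    return answers == pow_response(path, challenge, nonce)

# Example with a hypothetical file 'report.bin':
# nonce = secrets.token_bytes(16)
# challenge = pow_challenge(num_ranges=5,
#                           file_size=os.path.getsize("report.bin"))
# ok = pow_verify("report.bin", challenge, nonce,
#                 pow_response("report.bin", challenge, nonce))
```

A claimant who does not actually possess the file cannot answer fresh challenges, which is what allows the duplicate-check path to skip a second upload safely.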

Optionally, data on the backup server can be replicated to multiple cloud storages in the background. The roles of the entities in the deduplication process are as follows.
1) User: an entity that wants to outsource data to cloud storage. Files are uploaded through the private cloud to the Auditor Controller (AC). Data deduplication is performed at this level: when users register, their proof and user details are recorded and a tag is generated. A set of privileges is associated with each tag.
2) Auditor Controller (AC): an entity that reduces storage space and bandwidth by eliminating repeated copies of data. When a duplicate is found, the request made by the user is passed to the CSS through the AC. The user accesses files according to the associated set of privileges.
3) Cloud Storage Server (CSS): an entity that provides data storage in the public cloud. To reduce storage cost, the CSS eliminates redundant data via deduplication and keeps only unique data. We assume that the CSS is always online and has sufficient computation power.

3.1 Adversary Model
In this paper, all files are assumed to be sensitive and to require full protection in both the private and public clouds. We consider two kinds of adversaries: 1) internal adversaries, who aim to extract information on files from the private cloud and to make deduplication ineffective by sending copies of data under the same name; and 2) external adversaries, whose aim is to extract information from both the private and the public clouds.

4 SECURE DEDUPLICATION SYSTEM
1) Upload-to-download integrity: if integrity is in question, how can the user and the provider know whether the data retrieved from the cloud is the same data that the user previously uploaded?
2) Repudiation between users and service providers: if data errors occur while downloading or uploading files, even without any transmission error, how can a user and/or the service provider prove their innocence?
Our proposed system provides a solution to the above security problems through the way files are uploaded and downloaded; a small sketch of this exchange follows the two procedures below.

4.1 Uploading a file in our secure deduplication system
1) User: sends the data to the Cloud Storage Server (CSS) through the Auditor Controller (AC), along with a generated tag, the Tag Generated by the User (TGU).
2) Auditor Controller (AC): verifies the data against the TGU and forwards it to the CSS if it is valid; the CSS then sends a tag back to the user through the AC. This is the Tag Generated by the Auditor Controller (TGAC).
3) The TGU is stored at the user side and the TGAC is stored at the CSS. Once the upload has finished, both sides agree on the integrity of the uploaded data.

4.2 Downloading a file in our secure deduplication system
1) User: sends a request to the CSS through the AC together with the authorized identity, i.e. the proof of ownership (PoW).
2) Auditor Controller (AC): verifies the authorized identity and forwards the request to the CSS, which verifies it and sends the files back to the user together with the TGAC.
3) The user can check the TGAC against the integrity of the files he requested before downloading them. When a dispute occurs, the user or the CSS can check the TGAC and TGU to prove its innocence. However, some special cases exist:
a) when the Cloud Storage Server is trustworthy, only the TGU is needed; when the user is trustworthy, only the TGAC is needed;
b) if each side trusts the other, neither the TGU nor the TGAC is needed.
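The following is a minimal sketch of the upload/download exchange described in Sections 4.1 and 4.2. The privilege-scoped tag (an HMAC over the file hash keyed by a privilege key), the server's TGAC construction, and all class and field names are illustrative assumptions, not the exact scheme of this paper; the AC's verification step is abbreviated.

```python
import hashlib
import hmac

def make_tag(privilege_key: bytes, data: bytes) -> bytes:
    """Privilege-scoped duplicate-check tag (TGU): users holding the same
    privilege key derive the same tag for the same file, so duplicates are
    detectable only within that privilege."""
    return hmac.new(privilege_key, hashlib.sha256(data).digest(),
                    hashlib.sha256).digest()

class CloudStorageServer:
    """Keeps one copy per tag (deduplication) and issues a TGAC per upload."""

    def __init__(self, server_key: bytes):
        self.server_key = server_key
        self.store = {}                                   # tag -> stored data

    def upload(self, tag: bytes, ciphertext: bytes) -> bytes:
        if tag not in self.store:                         # duplicate check
            self.store[tag] = ciphertext
        # TGAC: the server's acknowledgement tag, returned to the user via the AC.
        return hmac.new(self.server_key, tag, hashlib.sha256).digest()

    def download(self, tag: bytes, tgac: bytes) -> bytes:
        expected = hmac.new(self.server_key, tag, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tgac):       # stand-in for the PoW check
            raise PermissionError("ownership proof rejected")
        return self.store[tag]

# User side: TGU kept locally, TGAC returned by the CSS through the AC.
priv_key = b"privilege-key-for-group-A"                   # assumed privilege key
data = b"report contents"                                 # convergent ciphertext in the real system
tgu = make_tag(priv_key, data)
css = CloudStorageServer(server_key=b"css-secret")
tgac = css.upload(tgu, data)
assert css.download(tgu, tgac) == data
```

In a dispute, either side can recompute its tag (TGU or TGAC) and compare it against what was exchanged at upload time, which is the basis of the non-repudiation argument above.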

5 RELATED WORK
Current deduplication systems fall into three categories: whole-file, fixed-size chunks and variable-size chunks. All three use single-instance storage (SIS). The first category takes the hash of the whole file as its identifier; if two or more files have the same hash value, they are stored as a single copy, an approach known as content-addressable storage (CAS) [25]. The second category uses fixed-size chunks, i.e. blocks of files; the Venti archival storage system [15] is an example. The third category uses a sliding window: files are divided into variable-length chunks using hash values, with Rabin fingerprints [20] as an example. Variable-length chunks are used in LBFS [10], Shark [16] and Deep Store [11]. Beyond these three categories, other techniques are also available, for example secure storage through secret sharing [12], as in PASIS [13] and POTSHARDS [17]. Similarly, steganographic systems such as the Steganographic File System [18] and Mnemosyne [19] provide plausible deniability over storage contents through the use of random data blocks. Both kinds of techniques store many times the size of the plaintext. Our system coalesces data at the file level, thereby saving storage space. Here a unique tag is generated for the user, without the user having to know it, through which the hash values are accessed; this makes convergent encryption efficient and lets deduplication operate actively. Once the tag is defined, duplicates are detected and privileges are associated with it. Finally, we realize secure deduplication through a hybrid approach with the CaaS model.
Proof of ownership (PoW) [24] schemes are security protocols designed to allow a server to verify whether a file belongs to an authorized user. A PoW scheme should be efficient in terms of CPU, bandwidth and I/O for the server. Ateniese et al. [23] proposed a protocol based on provable data possession (PDP) [8], which allows users to obtain a probabilistic proof from the storage service provider; such a proof is used as evidence that their data has been stored there. One advantage of this protocol is that the proof can be generated by the storage service provider while accessing only a small portion of the whole dataset.

6 CONCLUSION
Our authorized deduplication system protects data from attackers by generating a tag and a proof of ownership (PoW). The proposed system incurs minimal overhead compared with previous deduplication checks in a hybrid approach. Our secure deduplication system secures files by exchanging generated tags, and notably does so without any key generation for encryption and decryption.

7 FUTURE WORK
Our secure system uses three entities: the user, the Auditor Controller (AC) and the Cloud Storage Server (CSS). In future work, this design could be enhanced to operate without the Auditor Controller. Xu et al. [21] addressed this problem and showed a secure encryption scheme, though without considering key management or block-level deduplication. Some commercial cloud storage providers, such as Bitcasa, are known to deploy convergent encryption for deduplication. Secure group file transfer is another direction for future work: within a group, no help from the AC would be needed, but deduplication between groups should still be done with the help of the AC.

REFERENCES
[1] S. Quinlan and S. Dorward. Venti: A new approach to archival storage. In Proc. USENIX FAST, Jan. 2002.
[2] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer. Reclaiming space from duplicate files in a serverless distributed file system. In ICDCS, pages 617-624, 2002.
[3] D. Ferraiolo and R. Kuhn. Role-based access controls. In 15th NIST-NCSC National Computer Security Conference, 1992.
[4] R. S. Sandhu, E. J. Coyne, H. L. Feinstein, and C. E. Youman. Role-based access control models. IEEE Computer, 29:38-47, Feb. 1996.
[5] M. Bellare, S. Keelveedhi, and T. Ristenpart. Message-locked encryption and secure deduplication. In EUROCRYPT, pages 296-312, 2013.
[6] S. Halevi, D. Harnik, B. Pinkas, and A. Shulman-Peleg. Proofs of ownership in remote storage systems. In Y. Chen, G. Danezis, and V. Shmatikov, editors, ACM Conference on Computer and Communications Security, pages 491-500. ACM, 2011.
[7] http://ale.sopit.net/pdf/asiaccs.pdf
[8] http://eprint.iacr.org/2007/202.pdf
[9] http://www.ssrc.ucsc.edu/papers/storer-storagess08.pdf
[10] A. Muthitacharoen, B. Chen, and D. Mazières. A low-bandwidth network file system. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), pages 174-187, Oct. 2001.
[11] L. L. You, K. T. Pollack, and D. D. E. Long. Deep Store: An archival storage system architecture. In Proceedings of the 21st International Conference on Data Engineering (ICDE '05), Tokyo, Japan, Apr. 2005. IEEE.
[12] A. Shamir. How to share a secret. Communications of the ACM, 22(11):612-613, Nov. 1979.
[13] G. R. Goodson, J. J. Wylie, G. R. Ganger, and M. K. Reiter. Efficient Byzantine-tolerant erasure-coded storage. In Proceedings of the 2004 International Conference on Dependable Systems and Networks (DSN 2004), June 2004.
[14] http://www.sersc.org/journals/ijsia/vol7_no6_2013/2.pdf
[15] S. Quinlan and S. Dorward. Venti: A new approach to archival storage. In Proceedings of the 2002 Conference on File and Storage Technologies (FAST), pages 89-101, Monterey, California, USA, 2002. USENIX.
[16] S. Annapureddy, M. J. Freedman, and D. Mazières. Shark: Scaling file servers via cooperative caching. In Proceedings of the 2nd Symposium on Networked Systems Design and Implementation (NSDI), pages 129-142, 2005.
[17] M. W. Storer, K. M. Greenan, E. L. Miller, and K. Voruganti. POTSHARDS: Secure long-term storage without encryption. In Proceedings of the 2007 USENIX Annual Technical Conference, pages 143-156, June 2007.
[18] R. Anderson, R. Needham, and A. Shamir. The steganographic file system. In Proceedings of the International Workshop on Information Hiding (IWIH 1998), pages 73-82, Portland, OR, Apr. 1998.
[19] S. Hand and T. Roscoe. Mnemosyne: Peer-to-peer steganographic storage. Lecture Notes in Computer Science, 2429:130-140, Mar. 2002.
[20] M. O. Rabin. Fingerprinting by random polynomials. Technical Report TR-15-81, Center for Research in Computing Technology, Harvard University, 1981.
[21] J. Xu, E.-C. Chang, and J. Zhou. Weak leakage-resilient client-side deduplication of encrypted data in cloud storage. In ASIACCS, pages 195-206, 2013.
[22] Google mail, http://groups.google.com/group/googleappengine/browsethread/thread/782aea7f85ecbf98/8a9a505e8aaee07a?show_docid=8a9a505e8aaee07a#
[23] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song. Provable data possession at untrusted stores. In CCS '07, pages 598-609, Alexandria, VA, USA, Oct. 2007.
[24] http://www3.ntu.edu.sg/home/ygwen/paper/nwz-sac-12.pdf
[25] http://static.usenix.org/event/wiov08/tech/full_papers/liguori/liguori.pdf
[26] http://vishalnayan.wordpress.com/2010/12/31/wcf-what-why-when/