Chapter 1 Introduction




1.1. Cloud Computing

Cloud computing provides the next generation of internet-based, highly scalable distributed computing systems in which computational resources are offered 'as a service'. The cloud computing paradigm builds on the foundations of distributed computing, grid computing, networking, virtualization, service orientation and market-oriented computing. The most predominantly used definitions of cloud computing are given below [110, 28]:

"Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" - U.S. National Institute of Standards and Technology (NIST) [110]

"A Cloud is a type of parallel and distributed system consisting of a collection of interconnected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers" - Buyya et al. [28]

This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models, shown in Fig. 1.1 [79, 108, 110].

1.1.1. Cloud key characteristics

The five key characteristics of cloud computing are:

On-demand self-service: Cloud computing resources (such as CPU time, network storage, software use, and so forth) can be procured and disposed of by the consumer without human interaction with the cloud service provider. This automated (i.e. convenient, self-serve) process reduces the personnel overhead of the cloud provider, cutting costs and lowering the price at which the services can be offered.

Fig. 1.1 Visual Model of NIST working definition of Cloud Computing (characteristics: on-demand self-service, resource pooling, broad network access, rapid elasticity, measured service; service models: SaaS, PaaS, IaaS; deployment models: public, private, community, hybrid)

Resource pooling: By using a technique called virtualization, the cloud provider pools its computing resources. This resource pool enables the sharing of virtual and physical resources by multiple consumers, with resources dynamically assigned and released according to consumer demand [110]. The consumer has no explicit knowledge of the physical location of the resources (e.g. database, CPU, etc.) being used, except when the consumer requests to limit the physical location of his data to meet legal requirements. For example, consumers are generally not able to know where their data is going to be stored in the Cloud.

Broad network access: Cloud services (resources) are accessible over the network (e.g. the Internet) via standardized interfaces, enabling access to the service not only by complex devices such as personal computers, but also by lightweight devices such as smart phones, tablets, laptops, and PDAs situated at a consumer's site.

Rapid elasticity: The available cloud computing resources are rapidly matched to the actual demand, quickly increasing the cloud capabilities for a service when demand rises, and quickly releasing the capabilities when the need for them drops. This automated process decreases the procurement time for new computing capabilities when the need is there, while preventing an abundance of unused computing power when the need has subsided.

Measured service: Cloud computing enables the measuring of used resources, as is the case in utility computing. The measurements can be used to provide resource-efficiency information to the cloud provider, and to offer the consumer a payment model based on pay-per-use. For example, the consumer may be billed for data transfer volumes, the number of hours a service is running, or the volume of data stored per month.

1.1.2. Cloud Computing Service Models

Cloud computing can be classified by the model of service it offers into one of three service models. It is important to note, as shown in Fig. 1.2, that SaaS is built on PaaS, and the latter on IaaS. Each of these service models is described in the following subsections.

Fig. 1.2: Cloud computing service models

Software as a Service (SaaS)

The SaaS service model offers the services as applications to the consumer, using standardized interfaces. The services run on top of a cloud infrastructure, which is invisible to the consumer. The cloud provider is responsible for the management of the application, the operating systems and the underlying infrastructure. The consumer can control only some of the user-specific application configuration settings. An industry leader in SaaS is Salesforce.com, which has been offering multi-tenant solutions in the field of Customer Relationship Management (CRM) since before the concept of SaaS appeared in the context of Cloud computing. A more recent example of this type of solution is the e-mail service offered by Google, i.e., GMail. In these situations, the Cloud user is only interested in getting the most out of the application provided by the Cloud. At this level the Cloud user is no longer seen as a developer; he is simply a user of solutions offered by Cloud developers.

Platform as a Service (PaaS)

The PaaS service model offers the services as operation and development platforms to the consumer. The consumer can use the platform to develop and run his own applications, supported by a cloud-based infrastructure. The consumer does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, or storage, but has control over the deployed applications and possibly over application-hosting environment configurations. A classic example of PaaS is a virtual machine image containing a set of software services (for example, a Linux distribution, a Web server, and a programming environment such as PHP) in order to offer a web development environment for the Cloud developer. Some commercial examples from relevant companies in the IT field are already available: from Microsoft, the Windows Azure Platform, while Google offers the Google App Engine. The Cloud developer can use these platforms to simplify his implementation process by relying on the set of predefined tools they offer. Although these platforms can provide a considerable amount of flexibility, the limitation at this level is that the developer is constrained by the functionalities offered through them.
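To make the PaaS developer experience concrete, the following is a minimal sketch of the unit of code a Cloud developer hands to such a platform. With a standard interface such as WSGI, the developer supplies only the application logic, while the platform provisions the HTTP server, operating system and scaling. The handler below is purely illustrative and not tied to any particular provider mentioned above.

```python
# A minimal WSGI application: the kind of artifact a developer deploys to
# a PaaS offering. Everything beneath this interface (web server, OS,
# load balancing, scaling) is managed by the platform, not the developer.
def application(environ, start_response):
    body = b"Hello from the cloud!"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# For local testing, the Python standard library can host the same
# application unchanged:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, application).serve_forever()
```

The same callable runs unmodified on any WSGI-compliant host, which is precisely the portability-with-constraints trade-off described above: the developer gains simplicity but is confined to the interfaces the platform supports.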

Infrastructure as a Service (IaaS)

The IaaS service model is the lowest service model in the technology stack, offering computing infrastructure as a service, such as raw data storage, processing power and network capacity. The consumer can use IaaS-based service offerings to deploy his own operating systems and applications, which offers a wider variety of deployment possibilities than the PaaS and SaaS models. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly limited control of select networking components (e.g., host firewalls) [110]. A well-known commercial product at this level is the Amazon Elastic Compute Cloud (Amazon EC2) [4]. This solution provides the customer with full access to and control over the computing resources he has paid for. This does not mean that the cloud user has control over the underlying cloud fabric, but that he has control over a virtual machine, or a set of resources, running on top of the Cloud fabric controlled by the Cloud provider. In this setting the cloud user is free to configure his virtual machines with whichever solutions he sees fit, so this can be seen as the layer where the user's level of freedom is highest. At this layer, however, the Cloud user still has to maintain the software he chooses to install on the resources rented from the Cloud provider. In Table 1.1, we list the cloud services provided by some cloud providers.

Table 1.1: Cloud Services and Cloud Providers

Cloud Service Model                   | Cloud Providers
Software as a Service (SaaS)          | Salesforce.com, Microsoft Office 365, Workday
Platform as a Service (PaaS)          | Google App Engine, Force.com
Infrastructure as a Service (IaaS)    | Amazon EC2, GoGrid, iCloud and Microsoft Azure

1.1.3. Cloud Deployment Models

Clouds can also be classified by the underlying infrastructure deployment model as Public, Private, Community, or Hybrid clouds [79, 108, 110, 162]. The different deployment models are distinguishable by their own characteristics: (i) who owns the infrastructure; (ii) who manages the infrastructure; (iii) where the infrastructure is located; and (iv) who has access to the cloud services.

Private Cloud: The cloud infrastructure is operated solely within a single organization, and managed by the organization or a third party, regardless of whether it is located on-premise or off-premise. The motivation to set up a private cloud within an organization has several aspects. First, to maximize and optimize the utilization of existing in-house resources. Second, security concerns, including data privacy and trust, also make the Private Cloud an option for many firms. Third, the cost of transferring data from local IT infrastructure to a Public Cloud is still considerable. Fourth, organizations always require full control over mission-critical activities that reside behind their firewalls. Last, academics often build private clouds for research and teaching purposes.

Community Cloud: Several organizations jointly construct and share the same cloud infrastructure, as well as policies, requirements, values, and concerns. The cloud community achieves a degree of economic scalability and democratic equilibrium. The cloud infrastructure could be hosted by a third-party vendor or within one of the organizations in the community.

Public Cloud: Public cloud computing is based on massive-scale offerings to the general public. The infrastructure is located on the premises of the provider, who also owns and manages the cloud infrastructure. Public cloud users are considered untrusted, meaning they are not tied to the organization as employees and have no contractual agreements with the provider. Many popular cloud services are public clouds, including Amazon EC2, S3, Google App Engine, and Force.com.

Hybrid Cloud: The cloud infrastructure is a combination of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds). Organizations use the hybrid cloud model to optimize their resources and increase focus on their core competencies, moving peripheral business functions out onto the public cloud while controlling core activities on-premise through a private cloud. In Table 1.2, we list the characteristics of the cloud deployment models [41].

Table 1.2: Characteristics of Cloud Deployment Models

Model             | Infrastructure Owned By              | Managed By                           | Infrastructure Located          | Accessed and Consumed By
Public            | 3rd-party provider                   | 3rd-party provider                   | Off-premise                     | Untrusted users
Private/Community | 3rd-party provider or organization   | 3rd-party provider or organization   | Off-premise or on-premise       | Trusted users
Hybrid            | Both 3rd-party provider and organization | Both 3rd-party provider and organization | Both on-premise and off-premise | Trusted and untrusted users

1.1.4. Key Drivers to Adopting the Cloud

This section articulates the cloud's impact on IT users [79, 108, 162]. Compared with client/server computing, cloud computing offers benefits such as lower IT costs, faster time to go live and reduced complexity. However, with cloud computing it is critical to understand how to integrate the cloud solution into the existing enterprise architecture. The following subsections describe a number of compelling reasons to move operations toward cloud computing.

Small Initial Investment and Low Ongoing Costs

Public cloud computing can avoid capital expenditure because no hardware, software, or network devices need to be purchased. Cloud usage is billed on actual use only, and is therefore treated more as an operating expense. In turn, usage-based billing lowers the barrier to entry because the upfront costs are minimal. Depending on the contract signed, most companies can terminate the contract as preferred; therefore, in times of hardship or escalating costs, cloud computing costs can be managed very efficiently.

Economies of Scale

Most development projects have a sizing phase during which one attempts to calculate the storage, processing power and memory requirements for development, testing, and production. With the flexibility that cloud computing solutions offer, companies can acquire computing and development services as needed and on demand, which means development projects are less at risk of missing deadlines and dealing with the unknown.

Open Standards

Some capabilities in cloud computing are based on open standards, enabling a modular architecture that can grow rapidly and change whenever required. Open source software is computer software governed by a license that places it in the public domain or that meets the definition of open source, allowing users to use, change, and improve the software. Open source software is a foundation of the cloud solution and is critical to its continued growth.

Sustainability

CSPs have invested considerable expense and thought into creating a resilient architecture that can provide a highly stable environment. Traditionally, companies have periodically struggled to maintain IT services due either to single points of failure in the network or to an inability to keep pace with business changes in both the volume and the nature of transactions. Cloud computing allows companies to rely on the CSP to have limited points of failure, better resilience via clustering, and the ability to invest in state-of-the-art resilience solutions. Cloud computing benefits several different types of users [108]:

- Individual consumers
- Individual businesses
- Start-ups
- Small and medium-size businesses (SMBs)
- Enterprise businesses

1.1.5. Barriers to Cloud Computing Adoption in the Enterprise

Although there are many benefits to adopting cloud computing, there are also some significant barriers to its adoption. It is important to at least call out what some of these barriers are [40-42, 108, 163]:

- Security
- Privacy
- Connectivity and Open Access
- Reliability
- Independence from CSPs
- Interoperability
- Economic Value
- IT Governance
- Changes in the IT Organization
- Political Issues Due to Global Boundaries

1.1.6. Security in Cloud Computing

One of the most significant barriers to adoption is security, and we discuss it extensively in this thesis. A survey on cloud computing services conducted by the International Data Corporation (IDC) in August 2008 [60, 158] revealed that security is the biggest concern: most Clients of Cloud Computing worry that their business information and critical IT resources in the Cloud Computing system are vulnerable to attack. Security, performance and availability remain the three most important challenges according to this IDC survey, which polled 244 IT executives/CIOs and their line-of-business (LOB) colleagues about their companies' use of, and views about, IT Cloud Services, as given in Fig. 1.3 [60]. Although Cloud Computing does not introduce any new technologies, its characteristics, service models and deployment models raise new security issues such as data storage security, infrastructure security, virtualization security, and privacy. Security implementations require additional monetary resources. Service Level Agreements (SLAs) with cloud providers are less robust than the requirements expected of a company providing IT services, and governance and security standards in this regard are currently lacking. Thus, efficient and effective methods are required for handling these security issues [40-42, 85, 108, 194].

Fig 1.3: IDC survey on Cloud Services in Aug 2008

Infrastructure Security: The security challenges at the various levels, namely the network, host and application levels, are not specifically caused by cloud computing; rather, they are exacerbated by its use. The issues of infrastructure security in cloud computing can be addressed by clearly defining trust boundaries, i.e., by understanding which party provides which part of the security.

Data Security and Storage: Data security is a significant task, with a lot of complexity. Methods of data protection, such as redaction, truncation, obfuscation, and others, should be viewed with great concern. Not only are there no accepted standards for these alternative methods, there are also no programs to validate whatever implementations could possibly be developed. Homomorphic encryption [43] can be used for data security, but with this approach key management is a problem.

Privacy: Privacy is an important issue for cloud computing, both in terms of legal compliance and user trust, and it needs to be considered at every phase of design. The key challenge for software engineers is to design cloud services in such a way as to decrease privacy risk and to ensure legal compliance. The following tips are recommended for cloud system designers, architects, developers and testers [40, 41, 99, 108, 161, 168]:

1. Minimize personal information sent to and stored in the cloud.
2. Protect personal information in the cloud.
3. Maximize user control.
4. Allow user choice.
5. Specify and limit the purpose of data usage.
6. Provide feedback.

Audit and Compliance: A programmatic approach to monitoring and compliance will help prepare CSPs (Cloud Service Providers) and their users to address emerging requirements and the evolution of cloud business models. To drive efficiency, risk management, and compliance, CSPs need to implement a strong internal control monitoring function coupled with a robust external audit process. To gain comfort over their in-cloud activities, CSP users need to define their control requirements, understand their CSP's internal control monitoring processes, analyze relevant external audit reports, and properly execute their responsibilities as CSP users.

1.2. Cloud Storage and Security

Cloud storage has the potential to provide geographically distributed storage services, since a cloud can integrate servers and clusters that are distributed all over the world and offered by different service providers into one virtualized environment. This can potentially resist disastrous failures and achieve low access latency and greatly reduced network traffic by bringing data close to where it is needed. Cloud data storage belongs to IaaS, which allows Clients to move their data from local computing systems to the remote Cloud, and more and more Clients are choosing to store their data in the Cloud. The main reason for using cloud computing is cost effectiveness, which is particularly true for small and medium-sized businesses.
Another reason is that Clients can rely on the Cloud to provide more reliable services, so that they can access their data from anywhere at any time. Individuals and small companies usually do not have the resources to keep their servers as reliable as the Cloud does.

Amazon's Elastic Compute Cloud (EC2) [4] and iCloud [6] are well-known examples of cloud data storage. Moving data into the cloud offers several benefits to Clients. Benefits vary from vendor to vendor and depend on the service level negotiated, but some of the primary benefits of storing data in the cloud are [28]:

Scalability: Cloud computing allows organizations or individuals to quickly and easily scale capacity, either increasing or decreasing available storage space, to meet current demands. This means unexpected spikes in capacity can be addressed without having to over-invest in hardware that will spend most of its time idle.

Redundancy: Cloud storage providers generally operate multiple sites that are geographically separate, with mirrored copies of all data. Hardware failures, power outages, or natural disasters affecting one site are transparent to customers because the data will still be accessible from the alternate sites.

Hardware Upgrades: Hardware changes so rapidly that a data center investment can be bordering on obsolescence when it is barely implemented. A third-party vendor dedicated to providing hosted online storage will invest in hardware and infrastructure upgrades over time, so organizations get the benefit of newer technology without having to constantly re-invest in new hardware.

Load Balancing: Aside from scalability of storage capacity, cloud storage also provides scalability of bandwidth. Spikes in demand can be met by allocating additional bandwidth, and demand can also be shared between redundant sites to balance the load and ensure minimal lag in accessing data.

Disaster Recovery / Business Continuity: Storing data in the cloud also means that it is stored offsite. In the event of a catastrophe or natural disaster impacting the local office, the data itself will still be protected and available online. Business will be able to continue almost seamlessly from alternate locations, and the data will be immediately available once normal operations resume at the primary office facility.

Cost: Considering the benefits (scalable, redundant storage that also doubles as a disaster recovery and business continuity solution), the cost of cloud storage is typically quite reasonable. Consider as well that, by engaging a third-party host to store data, organizations avoid having to hire personnel to manage data storage in-house, along with the associated salaries and benefits, and that, with the economies of scale offered by a cloud storage provider, adding additional space is a fraction of the investment that would be required for new hardware and for the power and cooling necessary to accomplish the same thing in an internal data center.

1.3. Motivation

Despite its benefits, this new data storage service also brings about many challenging design issues, which have a profound influence on the security and performance of the overall system. The classic information security concerns apply to data stored in the cloud: Confidentiality, Integrity, and Availability [71, 82, 108, 154, 157, 199]. These security issues arise for the following reasons: 1) Data loss incidents can happen in any infrastructure, no matter what degree of reliability measures the cloud service provider takes. 2) Sometimes the Cloud Service Providers (CSPs) may be dishonest: they may discard data that has not been accessed, or is rarely accessed, in order to save storage space. Moreover, a CSP may choose to hide data loss incidents (due to management failure, hardware failures [144], Byzantine failure [62], or corruption by outside or inside attacks, etc.). 3) Clouds use the concept of multi-tenancy, whereby multiple Clients' data is processed on the same physical hardware. Some notable data loss incidents are the Sidekick cloud disaster in 2009 [32] and the breakdown of Amazon's Elastic Compute Cloud (EC2) in 2010 [116], and other incidents appear from time to time [9, 87, 91, 174]. Therefore, although storing data in the cloud is economically attractive given the cost and complexity of long-term large-scale data storage, its lack of a strong assurance of data Integrity, Confidentiality and Availability may impede its wide adoption by both enterprise and individual cloud users.
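The Integrity-verification difficulty at the heart of this thesis can be made concrete with a small sketch. A Client who keeps only digests locally can, naively, verify Integrity only by downloading the entire file and rehashing it; keeping one digest per block instead allows a probabilistic spot-check that retrieves only a few randomly chosen blocks. This is an illustrative toy, not one of the schemes cited in this chapter; the block size and sample count are arbitrary choices for the example.

```python
import hashlib
import random

BLOCK = 4096  # illustrative block size in bytes

def block_digests(data: bytes) -> list:
    """Client computes and keeps these digests before outsourcing the data."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def naive_verify(retrieved: bytes, digests: list) -> bool:
    """Naive check: the ENTIRE file must be downloaded and rehashed."""
    return block_digests(retrieved) == digests

def spot_check(fetch_block, n_blocks: int, digests: list, samples: int = 10) -> bool:
    """Probabilistic check: retrieve only a few randomly chosen blocks.

    `fetch_block(i)` stands in for a request to the cloud server; a server
    that has lost or corrupted a large fraction of the blocks is caught
    with high probability, at a fraction of the communication cost.
    """
    for i in random.sample(range(n_blocks), min(samples, n_blocks)):
        if hashlib.sha256(fetch_block(i)).hexdigest() != digests[i]:
            return False
    return True

# Toy usage with an in-memory "server".
data = bytes(200_000)                      # 200 KB stand-in for outsourced data
digests = block_digests(data)
server = lambda i: data[i * BLOCK:(i + 1) * BLOCK]
assert naive_verify(data, digests)
assert spot_check(server, len(digests), digests)
```

The sketch only shows the cost trade-off; it does not resist a server that stores the digests and answers challenges without the data, which is why the literature develops dedicated proof-of-retrievability and provable-data-possession protocols instead.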
Encrypting the data before storing it in the cloud can handle the Confidentiality issue, and encoding or replicating it can handle the Availability issue. However, verifying the Integrity of outsourced data is a difficult task without having a local copy of the data or retrieving it from the server. For this reason, straightforward cryptographic primitives such as hashing and signature schemes for data Integrity are not directly applicable. It is impractical for Clients to download all the stored data in order to validate its Integrity, as this would incur an expensive I/O cost and communication overhead across the network. Hence, to ensure cloud data storage Integrity and enforce the quality of the cloud storage service, an effective and efficient method is essential for Clients to check the Integrity of their data stored in the cloud with minimum computation, communication and storage overhead.

1.4. Problem Statement

1.4.1. System Model

The cloud data storage model in cloud computing consists of three entities, namely Clients, the Cloud Service Provider (CSP) and the Third Party Auditor (TPA), as illustrated in Fig. 1.4, with the following activities [19, 165, 169, 197].

Fig. 1.4 Cloud Data Storage Architecture (data flows between Clients and the CSP; security messages, requests and responses flow between the TPA and the other two entities)

Clients: The Clients are those who have data to be stored, and access it with the help of the Cloud Service Provider (CSP). They are typically desktop computers, laptops, mobile phones, tablet computers, etc.

Cloud Service Provider (CSP): Cloud Service Providers (CSPs) are those who have major resources and expertise in building and managing distributed cloud storage servers, and who provide applications, infrastructure, hardware and enabling technology to Clients as a service via the internet.

Third Party Auditor (TPA): The Third Party Auditor (TPA) has expertise and capabilities that Clients may not have, and verifies the Integrity of data stored in the cloud on behalf of the Clients. Based on the audit result, the TPA can release an audit report to the Client.

In the cloud computing paradigm, the Clients store their data files in the cloud and access them, with the help of the Cloud Service Provider (CSP), whenever and wherever they need. The cloud consists of a set of cloud servers running in a simultaneous, cooperative and distributed manner. Data redundancy can be employed using erasure-correcting codes to tolerate faults or server crashes affecting users' data, and encrypting the data can prevent data leakage. In addition, the Client can frequently verify the Integrity of the data without keeping a local copy of the data file. If the Clients do not have the time, feasibility or resources to monitor their data, they delegate this task to the Third Party Auditor (TPA), who verifies the Integrity of the data on their behalf. In some cases, the Clients may need to perform block-level operations on their data for practical applications. The most general forms of these operations are block update, delete, insert and append. Many applications can be envisioned to adopt this model of outsourced data storage. For e-health applications [200], a database containing sensitive and large amounts of information about patients' medical histories is to be stored on the cloud servers. We can consider the e-health organization to be the data owner and the physicians to be the authorized users with appropriate access rights to the database.
Financial applications, scientific applications, and educational applications containing sensitive information such as students' transcripts can be envisioned in similar settings.

1.4.2. Problem Definition

In a cloud data storage system, the Clients store their data in the cloud and no longer possess it locally. After the data goes into the cloud, the Client loses control over it. Such data storage may be vulnerable to attacks or Byzantine failures, in which an adversary can modify or

delete the data, inject polluted data into the data storage servers, or access the data. These attacks or failures would bring irretrievable losses to the Clients, since their data is stored in an uncertain storage pool outside their own enterprises, and would prevent the Clients from accessing their original data correctly. Hence, efficient and effective audit protocols are needed to ensure the Confidentiality, Integrity and Availability of Clients' data over its lifetime with minimum computation, communication and storage overhead.

1.4.3. Security Threats

In this thesis, we consider two types of threats to cloud data storage: Internal Threats and External Threats [13,165].

Internal Threats: These are initiated by malicious insiders, who intentionally corrupt the Clients' data inside the cloud by modifying or deleting it. They are also able to obtain all the information and may leak it to outsiders. Examples of malicious insiders are the Cloud Service Provider (CSP) or cloud users.

External Threats: These are initiated by unauthorized parties from outside the cloud. An external attacker who is capable of compromising the cloud servers can access the Clients' data; he may delete or modify the Clients' data and may leak users' private information. Examples of external attackers are criminals, extremists or terrorists.

The above threats can be realized in different ways, as given below:

1) Eavesdropping: leakage of information by monitoring the communication channels.

2) Modification: the adversary modifies or deletes the data in the cloud.

3) Replay: the server generates the Integrity proof from a previous proof or other information, without querying the actual Clients' data.

4) Malicious programs: programs that are specially written to damage others' data.

5) Masquerade: a person or entity pretends to be a different one.
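To illustrate how the replay threat (item 3 above) is commonly mitigated, the following minimal Python sketch binds each Integrity proof to a fresh random nonce chosen by the verifier, so a previously captured proof cannot be reused for a new challenge. The function names and the hash-based proof are illustrative assumptions for exposition, not the protocols proposed in this thesis; for simplicity the verifier here keeps a local copy, which real auditing schemes avoid by using precomputed metadata.

```python
import hashlib
import os

def make_challenge():
    """Verifier: pick a fresh random nonce for every audit round."""
    return os.urandom(16)

def server_response(nonce, stored_data):
    """Server: bind the proof to the fresh nonce, so an old proof
    cannot be replayed against a new challenge."""
    return hashlib.sha256(nonce + stored_data).digest()

def verify(nonce, response, local_copy):
    """Verifier: recompute the expected proof and compare.
    (Illustration only: a real scheme avoids keeping local_copy.)"""
    return response == hashlib.sha256(nonce + local_copy).digest()

data = b"client file contents"
nonce1 = make_challenge()
proof1 = server_response(nonce1, data)
assert verify(nonce1, proof1, data)        # fresh proof is accepted

nonce2 = make_challenge()
assert not verify(nonce2, proof1, data)    # replayed proof is rejected
```

Because the nonce changes on every round, a server that discarded the data cannot answer a new challenge by caching old proofs.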

1.4.4. Design Goals

We have designed efficient and secure storage protocols to ensure the following goals, which fall into two categories: Efficiency Goals and Security Goals.

a) Efficiency

The following efficiency requirements ought to be satisfied for the practical use of cloud storage:

Low computation overhead: This includes the initialization and verification overheads of the verifier and the proof-generation overhead of the server; the proposed scheme should be efficient in terms of computation.

Low communication overhead: The total communication cost required by verification should be as low as possible.

Low storage cost: The additional storage used for auditing should be as small as possible on both the Auditor and the cloud server.

b) Security

In this thesis, we consider three security requirements that need to be satisfied by data stored in the cloud:

Confidentiality: Only authorized parties or systems have the ability to access protected data, and data is not disclosed or revealed to unauthorized parties.

Integrity: Data is protected from unauthorized deletion, modification or fabrication; further, unauthorized modifications or deletions of data are detected and the consistency of the data is maintained.

Availability: Data remains available despite Byzantine failures, malicious data modifications and server-colluding attacks, and can be retrieved correctly upon request.

1.5. Scope

Although cloud computing in general faces security challenges inhibiting its adoption by consumers, in this thesis we focus on data storage security in the cloud. The broad approach to data storage security investigates the security objectives of Authentication, Authorization, Confidentiality, Integrity, Availability and Non-repudiation. In this thesis, we focus on Confidentiality, Integrity, and Availability (CIA), because these are the most important concerns of data storage.

1.6. Research Methodology

The research commenced with an exhaustive survey of remote data storage security. Recently, several researchers have focused on the problem of remote data security and proposed different protocols [13,16,26,52,56,59,66,146,165,169,197] to ensure the security of remote data stored in distributed storage systems, without having a local copy of the data, based on Remote Data Checking (RDC). RDC is a technique for frequently, efficiently and securely verifying that a storage server faithfully stores its Client's (potentially very large) original data without retrieving it. In an RDC protocol, the Client first computes metadata for the file before outsourcing it to the remote server. The Client keeps only the metadata (or sends it to his own agent, a third party auditor), sends the file to the remote server, and deletes the local copy of the file. Later, whenever the verifier (the original user or a trusted third party) wants to check the remotely stored data, he challenges the server for the Integrity of the file through a Challenge-Response (CR) protocol: he sends a challenge request to the server for a proof of data storage. Upon receiving the challenge, the server computes a response as proof of data storage and sends it back to the verifier. The verifier then tests the Integrity of the data by comparing the response with the previously computed metadata.
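As a concrete illustration of the RDC flow just described, the following Python sketch uses per-block hash digests as the Client's metadata and random spot checking as the challenge. This hash-based construction is a simplifying assumption for exposition only (practical RDC schemes use homomorphic tags or similar primitives so the server cannot cheat by precomputing or replaying responses); the block size and function names are likewise illustrative. The closing lines show the standard spot-checking estimate: if a fraction of the n blocks is corrupted and c blocks are challenged, detection succeeds with probability roughly 1 - (1 - x/n)^c, where x is the number of corrupted blocks.

```python
import hashlib
import random

BLOCK = 4096  # illustrative block size, not fixed by the thesis

def split_blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def gen_metadata(data):
    """Client: precompute one digest per block before outsourcing,
    keep the digests (or hand them to the TPA), delete the local copy."""
    return [hashlib.sha256(b).digest() for b in split_blocks(data)]

def challenge(num_blocks, sample):
    """Verifier: randomly select `sample` distinct block indices."""
    return random.sample(range(num_blocks), sample)

def respond(stored_data, indices):
    """Server: compute the proof over the challenged blocks only."""
    blocks = split_blocks(stored_data)
    return [hashlib.sha256(blocks[i]).digest() for i in indices]

def verify(metadata, indices, proof):
    """Verifier: compare the proof against the stored metadata."""
    return all(metadata[i] == p for i, p in zip(indices, proof))

data = bytes(random.getrandbits(8) for _ in range(40 * BLOCK))
meta = gen_metadata(data)                     # kept by Client or TPA
idx = challenge(len(meta), 10)
assert verify(meta, idx, respond(data, idx))  # intact data passes the audit

# Spot-checking detection probability: with 10 of 1000 blocks corrupted
# and 100 blocks challenged, detection succeeds with probability ~0.63.
n, c, corrupted = 1000, 100, 10
p_detect = 1 - (1 - corrupted / n) ** c
```

The estimate at the end explains why spot checking is lightweight yet effective against large-scale corruption, and why detecting very small corruptions (e.g. a single block) requires many more challenged blocks.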
If the comparison succeeds, the verifier is convinced that the data is safe on the server. RDC protocols can be classified into two categories: deterministic verification protocols [52,59,66,146] and probabilistic verification protocols [13,16,26,56,165,169,197]. Deterministic verification schemes give a deterministic guarantee of data Integrity, because they check the Integrity of the full data in a single verification. However, these schemes are infeasible when the file size is large. Probabilistic verification schemes give a probabilistic guarantee of the

data Integrity, and the detection probability will be high if an attacker deletes a fraction of all the blocks, because the challenged blocks are randomly selected. Although probabilistic verification schemes have achieved remote data Integrity assurance in different systems, they fail to provide a strong security assurance to the Clients. This is because their verification process uses a pseudorandom sequence, which does not cover the entire file while generating the Integrity proof. Even when no data corruption is detected, these verification schemes cannot ensure that no data has been lost. In addition, when the server corrupts only a small portion of the file (e.g. one block), a verifier using a probabilistic verification scheme based on a pseudorandom sequence would have to dramatically increase the number of challenged blocks in order to achieve detection with high probability. This would render the whole concept of lightweight audit through spot checking impractical. Therefore, probabilistic verification schemes based on pseudorandom sequences may not give a strong guarantee (satisfactory Integrity assurance) to the Clients about the security of their data stored in the cloud. In this thesis, we propose new probabilistic, efficient and secure protocols for data storage security in cloud computing, addressing the Confidentiality, Integrity and Availability of data. First, the Homomorphic Distributed Verification Protocol (HDVP) ensures the Availability and Integrity of data in the cloud, with partial dynamic data support, through private verifiability. The RSA-based Dynamic Public Audit Protocol (RSA-DPAP) ensures the Availability and Integrity of data stored in the cloud with support for public verifiability and efficient dynamic data operations. The ECC-based Dynamic Public Audit Protocol (ECC-DPAP) uses Elliptic Curve Cryptography (ECC) instead of RSA to address the Confidentiality, Availability and Integrity of data stored in the cloud.
The Publicly Verifiable Dynamic Secret Sharing Protocol (PVDSSP) efficiently ensures all three basic security properties of data stored in the cloud, namely Confidentiality, Integrity and Availability, without the Clients bearing the responsibility of maintaining an encryption key. Finally, an Efficient Distribution Verification Protocol (EDVP) implements the above protocols in a distributed environment to improve the efficiency of the single-verifier protocols. The main characteristics of these protocols are given in Table 1.3.

Table 1.3: Main characteristics of the proposed protocols

Parameters/Protocols                 HDVP     RSA-DPAP  ECC-DPAP  PVDSSP
Integrity                            yes      yes       yes       yes
Availability                         yes      yes       yes       yes
Confidentiality                      no       no        yes       yes
Public Verifiability                 no       yes       yes       yes
Data Dynamics                        partial  yes       yes       yes
Storage Overhead for the Clients     yes      yes       yes       no

These protocols are discussed in detail in Chapters 4, 5, 6 and 7.

1.7. Main Contributions of the Thesis

The main contributions of the thesis are:

1) This thesis provides a comprehensive survey of security techniques that have been proposed in the past for remote data storage applications. The survey covers work in distributed computing, Peer-to-Peer computing and Grid computing, and also adds recent work on Cloud computing. These works are classified into two types according to their proposed techniques, which in turn helps identify their strengths and weaknesses. The classification of techniques helps researchers and scientists working on remote data storage security match verification techniques to their problem environment. It also aids them in choosing past work against which to compare their own techniques.

2) We propose new probabilistic, efficient and secure protocols that provide Integrity assurance to the Clients with strong evidence that the CSP is faithfully storing the data, and that maintain consistency so that data cannot be leaked to malicious parties. The proposed protocols support public verifiability, so a TPA can verify the security of the data in the cloud on behalf of the Clients, and also support dynamic data operations such as modification, insertion and deletion for practical applications.

3) We prove the security (Integrity, Confidentiality and Availability) of the proposed protocols against internal and external attacks. The cloud server can provide a valid response to the verifier's challenges only if it actually holds all the data in an uncorrupted and up-to-date state.

4) We justify the performance of the proposed protocols through concrete analysis, experimental results and comparison with existing schemes.

Fig. 1.5 Thesis chapter organization and contributions

1.8. Organization of the Thesis

The thesis is organized into 9 chapters, excluding references, as shown in Fig. 1.5. The chapters in the thesis are as follows:

Chapter 1: Introduction. This chapter gives a brief introduction to cloud computing, cloud computing issues and cloud storage security, and presents the motivation, research methodology, contributions and organization of the thesis.

Chapter 2: Literature Survey. This chapter provides a literature review of selected books, texts, published papers and web resources related to the research conducted as part of this thesis. It provides an overview of the available relevant subject matter and the topics covered in previous research conducted by security professionals and academic researchers. It also outlines a comparison with previous work conducted in this field of research.

Chapter 3: System Architecture. This chapter gives the design architecture of the proposed protocols, describing their various phases.

Chapter 4: Homomorphic Distributed Verification Protocol. This chapter presents the Homomorphic Distributed Verification Protocol (HDVP) for ensuring the Availability and Integrity of data stored in the cloud. The protocol uses an erasure code for data Availability assurance and utilizes metadata generation based on the Sobol sequence to check the Integrity of data stored in the cloud. It guarantees the Availability and Integrity of data stored in the cloud; that is, whenever data modifications or deletions happen, the HDVP scheme is guaranteed to identify the data corruption during Integrity verification across the cloud servers.

Chapter 5: Dynamic Public Audit Protocols. This chapter presents the dynamic public audit protocols for data storage security, which can be categorized into two types:

5.1. RSA-based Dynamic Public Audit Protocol. The RSA-based protocol ensures the Integrity and Availability of data in the cloud with public verifiability and efficient dynamic data operations. The protocol design is based on RSA, erasure codes, Homomorphic Verifiable Tags (HVTs) and the Sobol sequence. The protocol efficiently detects data corruption, and we prove that RSA-DPAP is secure against internal and external threats.

5.2. ECC-based Dynamic Public Audit Protocol. This section presents an ECC-based protocol to address the Confidentiality, Integrity, and Availability of data stored in the cloud, which is more efficient than the RSA-DPAP scheme because the ECC-DPAP scheme uses smaller key sizes than RSA-based solutions.

Chapter 6: Publicly Verifiable Dynamic Secret Sharing Protocol. This chapter presents the publicly verifiable dynamic secret sharing protocol for dependable and secure data storage, addressing the Availability, Integrity and Confidentiality of data stored in the cloud. It is based on Secret Sharing, an Erasure Code, the Sobol sequence and a Linear Code (LC). The protocol relieves the Clients from maintaining the encryption key.

Chapter 7: An Efficient Distribution Verification Protocol. This chapter presents an efficient distribution verification protocol to detect data corruption efficiently in a distributed manner, with the help of multiple verifiers, using a Secret Sharing protocol and the Sobol sequence.

Chapter 8: Simulation of Protocols. This chapter analyzes the proposed schemes in terms of security and performance, and presents the simulation, experimental and statistical results. Finally, the results are compared with selected existing schemes.

Chapter 9: Conclusion and Future Directions. This chapter gives the final conclusions of the research work and presents possible future enhancements.