Sl.No. CODE TITLE AUTHOR
1. SP-309 PRIVACY-PRESERVING MULTI KEYWORD RANKED SEARCH OVER ENCRYPTED CLOUD DATA - RAVIKUMAR.P, S.MARAGATHAM
2. SP-397 RESEARCH AND DEVELOPMENT TREND OF CLOUD COMPUTING - NIKITHASHREE.N.S, MANJUNATH.M, SAMARA MUBEEN
3. SP-398 ANDROID APPLICATION FRAMEWORK FOR CLOUD COMPUTING ENVIRONMENT USING VPN - VINUTHA.S, C.K.RAJU, DR.M.SIDDAPPA
4. SP-329 CLOUD COMPUTING BASED RESOURCE MANAGEMENT IN ENERGY EFFICIENT MANNER - SHIVAKUMAR.S.SOBANI, DR.A.SREENIVASAN
5. FP-122 CLOUD COMPUTING AND EMERGING IT TRENDS - MRS.J.SRIMATHI, MRS.D.KALAIVANI
6. INCREASING DATA PRIVACY AND COMPUTATION EFFICIENCY THROUGH LINEAR PROGRAMMING OUTSOURCING IN CLOUD COMPUTING - HARSHA N, DR M SIDDAPPA
7. DYNAMIC RESOURCE ALLOCATION IN CLOUD FOR PARALLEL DATA PROCESSING - K.B.MANASA, N.L.UDAYAKUMAR, DR.M.SIDDAPPA
8. SP-440 DATA SECURE AND DEPENDABLE STORAGE SERVICES IN CLOUD COMPUTING - AJAY KUMARA M A, MR. SHARAVANA.K
9. SP-341 INCREASING DATA PRIVACY AND COMPUTATION EFFICIENCY THROUGH LINEAR PROGRAMMING OUTSOURCING IN CLOUD COMPUTING - HARSHA N, DR M SIDDAPPA
10. SP-347 DYNAMIC RESOURCE ALLOCATION IN CLOUD FOR PARALLEL DATA PROCESSING - K.B.MANASA, N.L.UDAYAKUMAR, DR.M.SIDDAPPA
11. SP-348 DYNAMIC LOAD SHARING MULTICAST ALGORITHMS ON CLOUD FOR DATA INTENSIVE APPLICATIONS - SUHASINI, N.L.UDAYAKUMAR, DR. M. SIDDAPPA
12. SP-438 HORNS: A HOMOMORPHIC ENCRYPTION SCHEME FOR CLOUD COMPUTING USING RESIDUE NUMBER SYSTEM - ARUN KUMBI, ANASUYA PRAKASH
13. SP-451 PROVIDING SECURITY IN CLOUD COMPUTING USING PROTECTION RINGS - MAHESH SHEELVANT, DAYANANDA R B
14. SP-506 CLOUD COMPUTING - ASHISH JAGGI, ROHIT KAKANI

Research and Development Trend of Cloud Computing

Nikithashree N S¹, Manjunath M², Samara Mubeen³
¹IV Sem MTech, JNN College of Engineering, Shimoga
²IV Sem MTech, Dept of CSE, SJBIT, Bangalore
³Assistant Professor, JNN College of Engineering, Shimoga
¹nsnikitha25@gmail.com, ²manju.mys88@gmail.com, ³samaramubeen7860@gmail.com

Abstract - With the development of parallel computing, distributed computing and grid computing, a new computing model has appeared. The basic principle of cloud computing is to distribute the computation across a great number of distributed computers rather than the local computer. Cloud computing is a new paradigm in which all required resources are available as a service; with its rich set of features it is becoming more and more popular and is well accepted by many computing communities. Cloud computing gives people a way to share distributed resources and services that belong to different organizations or sites. Since cloud computing shares distributed resources via the network in an open environment, many believe that the cloud will reshape the entire ICT industry as a revolution. The running of an enterprise's data center becomes just like the Internet: the enterprise can use resources in whichever application needs them, and access compute and storage according to its requirements.

1. Introduction

Cloud computing, a new kind of computing model, has come with the rapid development of the Internet: user requirements are realized through the Internet and change with need. Cloud computing is an extension of grid computing, distributed computing and parallel computing. Its aim is to provide secure, quick, convenient data storage and network computing services centered on the Internet. The factors that drive the emergence and development of cloud computing include the development of grid computing, the appearance of high-quality technology for storage and data transportation, and the appearance of Web 2.0, especially the development of virtualization. Virtualization is its main characteristic, and most software and hardware now provide support for it. We can virtualize many elements, such as IT resources, software, hardware, operating systems and network storage, and manage them in the cloud computing platform.

Cloud computing is a novel approach in which every required resource is provided from one end to another. The user community had a huge expectation of getting resources whenever needed; people were expecting computing facilities and resources on demand. The birth of cloud computing finally made it possible to meet such expectations. Since distributed systems and network computing are used widely, security has become an urgent problem and will be even more important in the future. To improve work efficiency, different services are distributed across different servers located in different places. In contrast to the fast development of distributed computing technologies, progress in information security and safety has remained insufficient. Recently, a new trend has attracted people's attention: users from multiple environments hope to use distributed computing more efficiently, just like using electric power. Security is therefore a major element in any cloud computing infrastructure, because it is necessary to ensure that only authorized access is permitted and secure behavior is accepted.
Trust is the major concern of the consumers and providers of services that participate in a cloud computing environment.

2. Cloud computing

A. The background of cloud computing

In the past ten years the Internet has developed very quickly, while the cost of storage and the power consumed by computers and hardware have kept increasing. The storage space in data centers can no longer meet our needs, and the systems and services of the original Internet cannot solve these problems, so new solutions are needed. At the same time, large enterprises have to analyze their data sources thoroughly to support their business, and this collection and analysis must be built on a new platform. Why do we need cloud computing? It lets us utilize the vacant resources of computers, increase economic efficiency by improving utilization rates, and decrease equipment energy consumption.

B. Cloud computing principle

Cloud computing is difficult to define. It is a virtual pool of computing resources that provides the resources in the pool to users through the Internet. Integrated cloud computing is a whole dynamic computing system. It provides a mandatory application program environment, and it can deploy, allocate or reallocate computing resources dynamically and monitor the usage of resources at all times. Generally speaking, cloud computing has a distributed foundation and monitors the distributed system to achieve efficient use of the system.
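To make this concrete, such a pool can be pictured as a small allocate/release interface with usage monitoring. The following Java sketch is purely illustrative; the class and method names (ResourcePool, allocate, release) are assumptions of this example, not taken from the paper.

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a pool that allocates, reallocates and monitors
// computing resources, as described in section B above.
public class ResourcePool {
    private final int totalUnits;                 // total capacity of the pool
    private int freeUnits;                        // currently unallocated capacity
    private final Map<String, Integer> allocations = new HashMap<>();

    public ResourcePool(int totalUnits) {
        this.totalUnits = totalUnits;
        this.freeUnits = totalUnits;
    }

    // Allocate capacity to a user; returns false if the pool cannot satisfy it.
    public synchronized boolean allocate(String user, int units) {
        if (units > freeUnits) return false;
        freeUnits -= units;
        allocations.merge(user, units, Integer::sum);
        return true;
    }

    // Release a user's capacity back to the pool (dynamic reallocation).
    public synchronized void release(String user) {
        Integer units = allocations.remove(user);
        if (units != null) freeUnits += units;
    }

    // Monitor the usage of resources "at all times".
    public synchronized double utilization() {
        return 1.0 - (double) freeUnits / totalUnits;
    }
}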

C. Cloud computing styles

Though people hold different views on cloud computing, they have already reached agreement on its basic styles, which are as follows:

1. SaaS (Software as a Service)
This kind of cloud computing delivers programs to millions of users through the browser. From the user's point of view, this can save some cost on servers and software; from the provider's point of view, only one program needs to be maintained, which also saves cost. SaaS is commonly used in human resource management systems and ERP (Enterprise Resource Planning). Google Apps and Zoho Office also provide this kind of service.

2. Utility computing
Recently, Amazon.com, Sun, IBM and other companies providing storage services and virtual services have been appearing. Cloud computing creates a virtual data center for the IT industry so that it can provide services for the whole network by collecting memory, I/O devices, storage and computing power into a virtual resource pool.

3. Network services
Network services have a close relation with SaaS. By providing APIs (Application Programming Interfaces), service providers help programmers develop applications based on the Internet instead of stand-alone programs.

4. PaaS (Platform as a Service)
Platform as a Service, another variant of SaaS, provides a development environment as a service. You can use the middleman's equipment to develop your own program and deliver it to users through the Internet and the provider's servers.

5. MSP (Management Service Provider)
This is one of the oldest applications of cloud computing. It mostly serves the IT industry instead of end users, and is often used for mail virus scanning and program monitoring.

6. Commercial service platform
The commercial service platform is a mixture of SaaS and MSP; this kind of cloud computing provides a platform for interaction between users and service providers. For instance, a personal expense management system can manage the user's expenses according to the user's settings and coordinate all the services the user has purchased.

7. Internet integration
It can integrate all the companies that provide similar services, so that users can compare and select their service provider.

3. Features of cloud computing

1. Ultra-large scale
The scale of the cloud is large. Google's cloud already has more than one million servers, and even Amazon, IBM, Microsoft and Yahoo each have hundreds of thousands of servers.

2. Virtualization
Cloud computing lets users get services anywhere, through any kind of terminal. The required resources come from the cloud rather than from any visible entity.

3. High reliability
The cloud uses measures such as multiple data replicas for fault tolerance and interchangeable, homogeneous computation nodes to ensure highly reliable service. Using cloud computing is more reliable than using a local computer.

4. Versatility
Cloud computing is not aimed at any particular application. It can support various applications, and one cloud can run different applications at the same time.

5. High extendibility
The scale of the cloud can extend dynamically to meet growing requirements.
6. On-demand service
The cloud is a large resource pool from which you buy according to your need; the cloud is just like running water, electricity or gas, charged by the amount you use (a small metering sketch appears at the end of this section).

4. Need for cloud computing

1. Technology trends: Technology is moving very fast and the environment is changing rapidly. Keeping pace with fast-growing technology in a changing environment is a big challenge for a government; cloud computing takes over this responsibility and frees the government or user from it.

2. Automation of multiple management tasks: Management tasks such as assigning resources to processes have become very tedious because of the volume and complexity of the tasks. Cloud computing is designed to automate multiple such tasks securely.

3. High availability and no downtime: Availability is always a desirable feature of a computing system; the services should always be available to the citizens.
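A minimal sketch of the utility-style metering from feature 6 above: usage is recorded and charged only for the amount consumed, like electricity or water. The class name, the unit, and the $0.10 rate are illustrative assumptions, not values from the paper.

// Hypothetical pay-per-use meter: record consumption, bill for what was used.
public class UsageMeter {
    private double consumedUnits = 0.0;      // e.g., VM-hours or GB transferred

    public void record(double units) { consumedUnits += units; }

    // Charge only for what was actually consumed.
    public double bill(double pricePerUnit) { return consumedUnits * pricePerUnit; }

    public static void main(String[] args) {
        UsageMeter meter = new UsageMeter();
        meter.record(12.5);                       // 12.5 VM-hours this period
        System.out.println(meter.bill(0.10));     // at $0.10/hour -> 1.25
    }
}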

5. System architecture

A system architecture is a collection of independent components connected and interacting through well-defined connectors; one component can interact with another through a set of defined agreements.

Existing system: Enterprise Architecture

Problem of the existing system: In the existing system, even intra-department interaction within the ministry is not reliable and smooth. Among these problems, incompatibility among different data is a very big issue at present, because systems and applications are developed without considering or following standardization. These problems have caused great frustration for citizens; this is one of the reasons why citizens are not happy or satisfied with the performance of the ministry, and they have a lot of complaints against it.

Proposed architecture

We propose the following architecture as a solution, and we try to address the problems with Enterprise Architecture (EA). In EA the core part is middleware, so we start building the architecture from the basic concept of middleware. Service-Oriented Architecture (SOA) is one such software architecture; it is based upon two different actors, i.e. the service provider and the service receiver, and it is very popular and mature in providing the best services to authorized users.

Figure 2

How does it work? We have identified three types of services for the above architecture: full-sharing services, partial-sharing services and no-sharing services. In full sharing, all services are sharable among the ministries; in partial sharing, some services are sharable and some are not; and no-sharing services are not sharable at all. According to these types of services, we propose both a private and a public cloud in our architecture. A ministry's no-sharing services are kept in the private cloud and its full-sharing services are kept in the public cloud. The ministry has entire control over the services in the private cloud, which is why confidential services for internal use are kept there. A manager sits between the private and public clouds; this manager controls and protects the private cloud from external users. A sketch of this placement rule follows below.
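The following Java sketch makes the placement rule explicit. The names (ServicePlacement, Sharing) are hypothetical; the paper describes the rule only in prose.

// Illustrative sketch: no-sharing services go to the ministry's private cloud,
// full-sharing services to the public cloud; the manager mediates access to
// the private cloud.
enum Sharing { FULL, PARTIAL, NONE }

public class ServicePlacement {
    // Decide where a service is deployed, following the architecture above.
    public static String placeService(String name, Sharing sharing) {
        switch (sharing) {
            case NONE: return name + " -> private cloud (behind the manager)";
            case FULL: return name + " -> public cloud";
            default:   return name + " -> split: sharable parts public, rest private";
        }
    }

    public static void main(String[] args) {
        System.out.println(placeService("citizen-records", Sharing.NONE));
        System.out.println(placeService("public-forms", Sharing.FULL));
    }
}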

6. Cloud adoption challenges

1. Security
It is clear that the security issue has played the most important role in hindering cloud computing. Without doubt, putting your data and running your software on someone else's hard disk using someone else's CPU appears daunting to many. Well-known security issues such as data loss, phishing and botnets (running remotely on a collection of machines) pose serious threats to an organization's data and software. Moreover, the multi-tenancy model and the pooled computing resources of cloud computing have introduced new security challenges that require novel techniques to tackle. For example, hackers can use the cloud to organize botnets, since the cloud often provides more reliable infrastructure services at a relatively cheap price for them to start an attack.

2. Costing model
Cloud consumers must consider the tradeoffs among computation, communication and integration. While migrating to the cloud can significantly reduce infrastructure cost, it raises the cost of data communication, i.e. the cost of transferring an organization's data to and from the public and community cloud [6], and the cost per unit (e.g. per VM) of computing resource used is likely to be higher.

3. Charging model
From a cloud provider's perspective, the elastic resource pool (achieved through either virtualization or multi-tenancy) has made cost analysis much more complicated than for regular data centers, which usually calculate their cost based on consumption of static computing resources. Moreover, the instantiated virtual machine, rather than the underlying physical server, has become the unit of cost analysis. A sound charging model needs to incorporate all of the above as well as VM-associated items such as software licenses, virtual network usage, and node and hypervisor management overhead.

4. Service level agreement
Although cloud consumers do not have control over the underlying computing resources, they do need assurance of the quality, availability, reliability and performance of these resources once they have migrated their core business functions onto the entrusted cloud. In other words, it is vital for consumers to obtain guarantees on service delivery from providers. Typically, these are provided through Service Level Agreements (SLAs) negotiated between providers and consumers.

7. Geographic information service

A good GIS platform should not only provide various functions but also be conveniently and transparently accessible at any time, anywhere, by anyone. Some aspects that a GIS platform should take into account are listed below; a sketch of one such API call follows the list.

1. Hierarchical and distributed data storage. Geographical data should be stored at different levels with different detail and accuracy. Each level of data can be stored at different places, nodes or servers in a distributed way.

2. Time-series data. Geographical information changes, quickly or slowly, so we must take the temporal nature of data into account and add a time dimension when storing it.

3. GIS workflow. GIS applications often operate on large-scale datasets, so how to partition them, and into how many chunks, becomes a challenging problem. The GIS platform should be able to organize many specific GIS operations and tasks into a GIS workflow.

4. Rich GIS APIs. The GIS platform should provide rich APIs for developers and users to deal with complex and changing business needs.

Figure 3
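As a concrete illustration of point 4, such APIs typically include the standard OGC web services named in the next section. A minimal Java sketch of building a WMS 1.1.1 GetMap request follows; the host gis.example.org and the layer name are hypothetical.

// Build a standard WMS GetMap URL; fetch the resulting image with any HTTP client.
public class WmsExample {
    public static void main(String[] args) {
        String url = "http://gis.example.org/wms"
                + "?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
                + "&LAYERS=landuse"               // hypothetical layer name
                + "&STYLES="                      // default style
                + "&SRS=EPSG:4326"                // coordinate reference system
                + "&BBOX=102.0,24.0,104.0,26.0"   // lon/lat bounding box
                + "&WIDTH=512&HEIGHT=512&FORMAT=image/png";
        System.out.println(url);
    }
}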

1. Applications layer
The applications layer is at the top of this architecture. In this layer, users access various GIS services. Everything (data, software, platform) in a cloud computing platform is regarded as a service and stored in the clouds.

2. Geographic information services layer
A lot of geographic information services are made available on the Internet by GIS software and service providers. We had also developed GIS data services such as the Web Map Service (WMS), Web Coverage Service (WCS) and Web Feature Service (WFS) according to the Open Geospatial Consortium (OGC) specifications and ISO standards. Some non-standard services, such as an image tile service, data processing service, data transformation service and special professional services, are provided in our GeoGlobe Services Platform (GSP). However, most of these services are based on Service-Oriented Architecture (SOA), Web Services (WS) or Grid Computing (GC) technologies, and some features of cloud computing were not considered in their early development. In the new architecture of the GIS platform, the geographic information and services are stored in the cloud environment, which relies on virtualization, a distributed file system, parallel computing, etc., so the older services should be transformed to suit the needs of cloud computing.

3. Cloud computing layer
The cloud computing Operating System (OS) acts as an entrance for the user: registered users can search for the services or data they currently need. At the same time, the cloud computing OS acts as a repository of services deployed by different developers. It provides a flexible and scalable supercomputing environment for users and applications, while the underlying software and hardware details are hidden. In this layer, all the cloud computing technologies, such as parallel computing, the distributed file system, and computing and storage virtualization, are encapsulated for the developers of services.

Conclusion
The drive toward cloud computing among the many IT giants is not in doubt, and it has brought good news to almost everyone. For enterprises, cloud computing is worth considering as a way to build business systems, since it can undoubtedly bring lower costs, higher profits and more choice; for large-scale industry, the advent of cloud computing is bound to give birth to a number of new jobs. Cloud computing will bring a revolutionary change to the Internet. Since cloud computing is based on virtual machines that virtualize the physical resources, its reliability, availability and other non-functional properties are very good.

References
[1] http://en.wikipedia.org/wiki/Cloud_computing
[2] inition/0,,sid201_gci1287881,00.html
[3] Nicholas Carr, The Big Switch: Rewiring the World, from Edison to Google (Chinese translation by Yan Yu), CITIC Publishing House, October 2008.
[4] Tal Garfinkel, Mendel Rosenblum, and Dan Boneh, "Flexible OS Support and Applications for Trusted Computing," 9th Workshop on Hot Topics in Operating Systems (HotOS IX), USENIX, 2003.
[5] Y. Chen, V. Paxson, and R. Katz, "What's New About Cloud Computing Security?," 2010.
[6] A. Leinwand, "The Hidden Cost of the Cloud: Bandwidth Charges," 09/07/17/thehidden-cost-of-the-cloud-bandwidth.

Android Application Framework for Cloud Computing Environment using VPN

Vinutha.S (PG student, Department of CSE, SSIT, Tumkur), C.K.Raju (Asst. Professor, Department of CSE, SSIT, Tumkur), Dr.M.Siddappa (HOD, Department of CSE, SSIT, Tumkur)
vinuthavinu08@gmail.com, ckrajussit@gmail.com, siddappa.p@gmail.com

Abstract: The numbers of Android smartphone users and mobile applications are growing rapidly. Cloud computing helps to manage data in a distributed environment that supports several platforms, systems and applications. A VPN can provide secure information transport by authenticating users. Possible Android applications include an Electronic Health Record client based on the Android platform [4], a distributed urban sensing platform [6], and an Android smartphone surveillance system.

Keywords: Android, Cloud Computing, VPN, C2DM.

I. Introduction
Android is a software stack for mobile devices that includes an operating system, middleware and key applications. It allows developers to write managed code in the Java language, controlling the device via Google-developed Java libraries. Applications written in C and other languages can be compiled to ARM native code and run, but this development path is not officially supported by Google. Cloud computing is an on-demand network access model that helps us share resources such as networks, servers, storage, applications and services. This cloud model promotes availability and is composed of essential characteristics, deployment models and various service models. A VPN is a private network that uses a public network to connect remote sites or users; by using a VPN, businesses ensure security, since anyone intercepting the encrypted data cannot read it. An Android application will be developed that utilizes Cloud to Device Messaging, with the cloud server inside a virtual private network on the public network.

II. Android Platform Architecture
Android has built-in tools which make application development easy. Android provides an open development platform and offers developers the capability to build rich and innovative applications. Fig 1 shows the Android operating system architecture.

Fig 1. Android System Architecture.

A. The application layer
The Android software platform comes with a set of basic applications such as a browser, e-mail client, SMS program, maps, calendar, contacts and many more, all written in the Java programming language. Applications can run simultaneously: it is possible to listen to music and read an e-mail at the same time. This layer is mostly used by ordinary cell phone users.

B. The application framework
An application framework is a software framework that is used to implement a standard structure of an application for a specific operating system. With the help of managers, content providers and other services, programmers can reassemble functions used by other existing applications.

C. The libraries
The available libraries are all written in C/C++ and are called through a Java interface. They include the Surface Manager, 2D and 3D graphics, media codecs such as MPEG-4 and MP3, the SQL database SQLite and the web browser engine WebKit: libc (the C standard library), SSL (Secure Sockets Layer), SGL (2D image engine), OpenGL ES (3D image engine), the Media Framework (the core of Android multimedia), SQLite (embedded database), WebKit (web browser kernel), FreeType (bitmap and vector fonts) and the Surface Manager (managing the different windows of different applications).

D. The runtime
The Android runtime consists of two components. One is a set of core libraries that provide most of the functionality available in the core libraries of the Java programming language. The second is the Dalvik virtual machine, which operates as a translator between the application and the operating system.

E. The kernel
Linux provides the hardware abstraction layer for Android, allowing Android to be ported to a wide variety of platforms in the future. Internally, Android uses Linux for its memory management, process management, networking and other operating system services.

III. Cloud Computing Platform
Cloud computing is the storage, management, processing and accessing of information and other data stored on a specific server. The "cloud" holds all this necessary information, whether account details of customers, sales documentation, or simply records of any kind of business; this is what makes use of cloud computing technology so attractive.

Services offered by the cloud: The term "services" in cloud computing refers to the concept of using reusable, fine-grained components across a vendor's network, widely known as "as-a-service". As-a-service offerings share traits like the following:
- Low barriers to entry, making them available to small businesses.
- Large scalability.
- Multitenancy, which allows resources to be shared by many users.
- Device independence, which allows users to access the systems on different hardware.

Software as a Service (SaaS) is where application services are delivered over the network on a subscription and on-demand basis; Cisco, Salesforce, Microsoft and Google are a few providers in this layer. Platform as a Service (PaaS) consists of runtime environments and software development frameworks and components delivered over the network on a pay-as-you-go basis; PaaS offerings are typically presented as APIs to consumers. Examples are Google App Engine, Amazon Web Services, force.com and Cisco WebEx Connect. Infrastructure as a Service (IaaS) is where compute, network and storage are delivered over the network on a pay-as-you-go basis. Amazon pioneered this with AWS (Amazon Web Services), and now IBM and HP are entrants here as well.

IV. Virtual Private Network
The VPN protects data while it travels over the public network: if intruders attempt to capture the data, they should be unable to read or use it. The VPN provides the same quality of connection for each user even when it is handling its maximum number of simultaneous connections, and it is able to extend its VPN services to handle growth without replacing the VPN technology altogether. The purpose of the tunneling protocol is to add a layer of security that protects each packet on its journey over the Internet. The packet travels with the same transport protocol it would have used without the tunnel; this protocol defines how each computer sends and receives data over its ISP.
Each inner packet still maintains the passenger protocol, such as the Internet Protocol (IP) or AppleTalk, which defines how it travels on the LANs at each end of the tunnel. The tunneling protocol used for encapsulation adds a layer of security to protect the packet on its journey over the Internet.

V. Cloud to Device Messaging
C2DM is a service which enables us to send data from cloud servers to an Android application on a device. The main components of C2DM are the device which runs the Android application, the third-party application server, and the C2DM servers. The life cycle of C2DM involves three steps (step 1 is sketched below):
1] Enabling C2DM, in which the application registers to receive messages.
2] Sending a message, in which the third-party application server sends a message to the device.
3] Receiving a message, in which the application receives the message from the C2DM server.
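A minimal sketch of step 1, based on the historical C2DM registration API (C2DM has since been superseded by GCM/FCM); the sender account sender@example.com is a placeholder for the developer's registered account.

import android.app.PendingIntent;
import android.content.Context;
import android.content.Intent;

// Sketch of C2DM step 1: register the application to receive messages.
public class C2dmHelper {
    public static void register(Context context) {
        Intent reg = new Intent("com.google.android.c2dm.intent.REGISTER");
        // Boilerplate PendingIntent that identifies this application.
        reg.putExtra("app", PendingIntent.getBroadcast(context, 0, new Intent(), 0));
        reg.putExtra("sender", "sender@example.com"); // account allowed to send
        context.startService(reg);
        // The registration ID then arrives in a broadcast with action
        // "com.google.android.c2dm.intent.REGISTRATION"; the app forwards it to
        // the third-party application server, which uses it to address the
        // messages of step 2; the device receives them as broadcasts in step 3.
    }
}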

VI. Project Goal
The degradation of mobile performance is due to network traffic, the number of devices, data traffic, etc. Establishing an Android mobile cloud computing environment using a VPN connection solves these problems. The overall project goal is represented with the help of Fig 2.

Fig 2. Android Cloud with VPN

VII. Methodology
Today, mobile devices enjoy better speeds and applications than before, yet application performance is degrading because of Internet traffic, the number of devices, connection speeds, etc. Hence the load needs to be balanced, which can be done with the help of the cloud. An Android application [2] will be developed which utilizes C2DM for transmission of messages from the server to applications on Android devices. To establish a virtual private network, the OpenVPN software [5] is used. OpenVPN creates routed or bridged (Ethernet) tunnels by generating an RSA certificate with different keys for each client, and it can connect separate remote networks together into one large, fully routed network. This virtual private network helps to send data securely through the tunnel. To establish a mobile cloud computing environment [3] there are several cloud offerings, some Android-specific and others more general-purpose. There is a tradeoff among the cloud vendors in terms of customizability and automation, as shown in Fig 3.

Fig 3. Customization & Automation Spectrum.

Amazon, for example, is very customizable but not as automated, since you are still dealing with configuring virtual machines and other infrastructure-oriented activities. Google's App Engine is at the other end of the spectrum: not very customizable but highly automated. Any of the best cloud offerings can be chosen to establish the cloud environment. Finally, the cloud environment helps in balancing the data traffic from different sources.

VIII. Conclusion
In this paper, we proposed a mobile cloud execution framework to execute Android applications which utilizes C2DM in a virtualized private cloud environment. Encryption and isolation are used to protect data against eavesdroppers, both other users and the cloud providers, using the virtual private network. Our approach offers end users the opportunity to migrate their Android applications from one mobile device to another quickly and efficiently. Our framework is still a work in progress; we believe that more applications and systems can benefit from our approach. It is our hope that our framework will provide users and developers a versatile environment to carry out their applications on a range of systems, from mobile devices to cloud servers, in a convenient, efficient and secure fashion. At present, the virtual private network has been established using the method described above; an Android application using Cloud to Device Messaging is under development and will finally be deployed on the cloud.

References
[1] Android Developers website. http://developer.android.com/
[2] Shih-Hao Hung, Chi-Sheng Shih, Jeng-Peng Shieh, Chen-Pang Lee, and Yi-Hsiang Huang, "An Online Migration Environment for Executing Mobile Applications on the Cloud," in Proceedings of the 2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing.
[3] Charalampos Doukas, Thomas Pliakas, and Ilias Maglogiannis, "Mobile Healthcare Information Management utilizing Cloud Computing and Android OS," in Proceedings of the 32nd Annual International Conference of the IEEE EMBS, Buenos Aires, Argentina, August 31 - September 4, 2010.
[4] Dimitris Tychalas and Athanasios Kakarountas, "Planning and Development of an Electronic Health Record Client based on the Android Platform," in Proceedings of the 2010 14th Panhellenic Conference on Informatics.
[5] Abdullah Alshalan and Garrett Drown, Cloud VPN.
[6] Jong Hoon Ahnn, Uichin Lee, and Hyun Jin Moon, "GeoServ: A Distributed Urban Sensing Platform," in Proceedings of the 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.

Cloud Computing Based Resource Management in Energy Efficient Manner

Shivakumar.S.Sobani, DSCE, Bangalore. Email: sobani2@gmail.com
Dr.A.Sreenivasan, Professor and Director of PG Studies, DSCE, Bangalore

ABSTRACT
Virtualization technology has been employed increasingly widely in modern data centers in order to improve their energy efficiency. In particular, the capability of virtual machine (VM) migration brings multiple benefits, such as resource (CPU, memory, etc.) distribution and energy-aware consolidation. However, the migration of virtual machines itself brings extra power consumption, so a better understanding of its effect on system power consumption is highly desirable. In this paper, we present an evaluation of the power consumption effects of live migration of VMs. Results show that the power overhead of migration is much smaller when employing a consolidation strategy than under regular deployment without consolidation. Our results are based on a typical physical server whose power is a linear function of CPU utilization percentage.

Keywords - Energy efficiency; cloud computing; virtualization; load balancing; live or offline migration of virtual machines.

1 INTRODUCTION
Virtualization technology has been employed increasingly widely in modern data centers in order to improve their energy efficiency. In particular, the capability of virtual machine (VM) migration brings multiple benefits for resource (CPU, memory, etc.) distribution and energy-aware consolidation. In recent years, more and more data centers have started to employ server virtualization strategies for resource sharing to reduce hardware and operating costs. Virtualization technologies (such as Xen, VMware, and Microsoft Virtual Server) can consolidate applications previously running on multiple physical servers onto a single physical server; in this way, the energy consumption of the data center can be effectively reduced. Consequently, virtualized infrastructures are considered a key solution to data center power management, and VM migration technology enables the consolidation of servers spread across many locations. If QoS performance can be maintained under consolidation, a system can be configured with fewer servers and less power consumption.

Cloud computing has become a very promising paradigm for both consumers and providers in various fields of endeavor, such as science, engineering and business. A cloud typically consists of multiple resources, possibly distributed and heterogeneous. Although the notion of a cloud has existed in one form or another for some time (its roots can be traced back to the mainframe era [1]), recent advances in virtualization technologies in particular have made it much more compelling than when it was first introduced. A number of practices can be applied to achieve energy efficiency, such as improvement of applications and algorithms, energy-efficient hardware, Dynamic Voltage and Frequency Scaling (DVFS) [2], terminal servers and thin clients, and virtualization of computer resources [3]. Cloud computing naturally leads to energy efficiency by providing the following characteristics:
- Economy of scale, due to elimination of redundancies.
- Improved utilization of the resources.
- Location independence: VMs can be moved to a place where energy is cheaper.
- Scaling up and down: resource usage can be adjusted to current requirements.
- Efficient resource management by the cloud provider.

One of the important requirements for a cloud computing environment is providing reliable QoS. It can be defined in terms of Service Level Agreements (SLAs) that describe characteristics such as the minimal throughput and the maximal response time or latency delivered by the deployed system. VMs may not get the required amount of resources when requested, which leads to performance loss in terms of increased response time, timeouts, or failures in the worst case. Therefore, cloud providers have to deal with the energy-performance trade-off: minimizing energy consumption while meeting QoS requirements.

A. Research scope
The focus of this work is on energy-efficient resource management strategies that can be applied in a virtualized data center by a cloud provider (e.g. Amazon EC2). The main instrument that we leverage is live migration of VMs. The ability to migrate VMs between physical hosts with low overhead gives flexibility to a resource provider, as VMs can be dynamically reallocated according to current resource requirements and the allocation policy, and idle physical nodes can be switched off to minimize energy consumption. In this paper we present a decentralized architecture of the resource management system for cloud data centers and propose the development of the following policies for continuous optimization of VM placement:
- Optimization over multiple system resources: at each time frame, VMs are reallocated according to current CPU, RAM and network bandwidth utilization.
- Network optimization: optimization of the virtual network topologies created by intercommunicating VMs. Network communication between VMs should be observed and considered in reallocation decisions in order to reduce data transfer overhead and the load on network devices.
- Thermal optimization: the current temperature of physical nodes is considered in reallocation decisions. The aim is to avoid hot spots by reducing the workload of overheated nodes, and thus decrease error proneness and the load on the cooling system.

B. Research challenges
The key challenges that have to be addressed are:
- How to optimally solve the trade-off between energy savings and delivered performance?
- How to determine when, which, and where VMs should be migrated in order to minimize energy consumption by the system, while minimizing migration overhead and ensuring the SLA?
- How to develop efficient decentralized and scalable algorithms for resource allocation?
- How to develop a comprehensive solution by combining several allocation policies with different objectives?

Energy consumption and resource utilization in clouds are highly coupled. Specifically, resources with a low utilization rate still consume an unacceptable amount of energy compared with their consumption when fully utilized or sufficiently loaded. According to recent studies [4-7], average resource utilization in most data centers can be as low as 20%, and the energy consumption of idle resources can be as much as 60% of peak power. In response to this poor resource utilization, task consolidation is an effective technique to increase resource utilization and in turn reduce energy consumption. This technique is greatly enabled by virtualization technologies, which facilitate running several tasks on a single physical resource concurrently. Recent studies identified that server energy consumption scales linearly with (processor) resource utilization [6, 8]; this encouraging fact further advocates the significant contribution of task consolidation to the reduction in energy consumption.
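A common way to write this linear relationship, consistent with [6, 8] and with the linear CPU-utilization power model assumed in the abstract, is

$$P(u) = P_{idle} + (P_{max} - P_{idle})\, u, \qquad 0 \le u \le 1,$$

where $u$ is CPU utilization. As an illustration (the wattages are assumptions of this example, not measurements from the paper): with $P_{max} = 250$ W and idle power at 60% of peak ($P_{idle} = 150$ W), two hosts each at 30% utilization draw 2 x (150 + 100 x 0.3) = 360 W, while consolidating both workloads onto one host at 60% utilization draws 150 + 100 x 0.6 = 210 W, with the freed host switched off.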
However, task consolidation can also lead to freed-up resources sitting idle yet still drawing power. Kusic et al. [10] have stated the problem of continuous consolidation as a sequential optimization and addressed it using Limited Lookahead Control (LLC). The proposed model requires simulation-based learning for application-specific adjustments, and due to the complexity of the model the optimization controller's execution time reaches 30 minutes even for a small number of nodes (e.g. 15), which is not suitable for large-scale real-world systems. On the contrary, our approach is heuristic-based, allowing the achievement of reasonable performance even at large scale, as shown in our experimental studies.

Srikantaiah et al. [11] have studied the problem of request scheduling for multi-tiered web applications in virtualized heterogeneous systems in order to minimize energy consumption while meeting performance requirements. To handle the optimization over multiple resources, the authors proposed a heuristic for the multidimensional bin-packing problem as an algorithm for workload consolidation. However, the proposed approach is workload-type and application dependent, whereas our algorithms are independent of the workload type and thus suitable for a generic cloud environment. Song et al. [12] have proposed allocating resources to applications according to their priorities in a multi-application virtualized cluster. The approach requires machine learning to obtain utility functions for the applications and predefined application priorities. Unlike our work, it does not apply VM migration to optimize the allocation continuously (the allocation is static). Cardosa et al. [13] have explored the problem of power-efficient allocation of VMs in virtualized heterogeneous computing environments. They leveraged the min, max and shares parameters of the VMM, which represent the minimum, maximum and proportion of CPU allocated to VMs sharing the same resource. The approach suits only enterprise environments or private clouds, as it does not support strict SLAs and requires knowledge of application priorities to define the shares parameter.

We assume that resources are homogeneous in terms of their computing capability and capacity; this can be justified by the use of virtualization technologies. Nowadays, as many-core processors and virtualization tools (e.g. Linux KVM, VMware Workstation and VMware Fusion, Xen, Parallels Desktop for Mac, VirtualBox) are commonplace, the number of concurrent tasks on a single physical resource is loosely bounded. Although a cloud can span multiple geographical locations (i.e. be distributed), the cloud model in our study is assumed to be confined to a particular physical location. Inter-processor communications are assumed to perform at the same speed on all links without substantial contention. It is also assumed that a message can be transmitted from one resource to another while a task is being executed on the recipient resource, which is possible in many systems.

2 MODELS
In this section, we describe the cloud, application and energy models, and define the task consolidation problem targeted in this work. The details of the model presented in this section focus on resource management characteristics and issues from a cloud provider's perspective.

2.1 Cloud model
The target system used in this work consists of a set R of r resources/processors that are fully interconnected, in the sense that a route exists between any two individual resources (Fig. 1).

Fig. (1) Cloud model

2.2 Application model
Services offered by cloud providers can be classified into Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). Note that, when instances of these services are running, they can be regarded as computational tasks, or simply tasks. While IaaS requests are typically tied to predetermined time frames (e.g. pay-per-hour), requests for SaaS and PaaS are often not strongly tied to a fixed amount of time (e.g. pay-per-use). However, it is possible to estimate SaaS and PaaS service requests based on historical data and/or consumer-supplied service information. Service requests in our study arrive in a Poisson process, and the requested processing time follows an exponential distribution. We assume that the processor/CPU usage (utilization) of each service request is identifiable; it is also assumed that disk and memory use correlate with processor utilization [6]. Hereafter, application, task and service are used interchangeably.
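In standard notation, the arrival and service-time assumptions above read

$$P\{N(t) = k\} = \frac{(\lambda t)^k e^{-\lambda t}}{k!}, \qquad f(s) = \mu e^{-\mu s}, \quad s \ge 0,$$

where $N(t)$ is the number of requests arriving in an interval of length $t$, $\lambda$ is the mean arrival rate, and $1/\mu$ is the mean requested processing time. The paper does not fix particular values of $\lambda$ and $\mu$.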

3. SYSTEM ARCHITECTURE
In this work the underlying infrastructure is represented by a large-scale cloud data center comprising n heterogeneous physical nodes. Each node has a CPU, which can be multicore, with performance defined in Millions of Instructions Per Second (MIPS); besides that, a node is characterized by its amount of RAM and network bandwidth. Users submit requests for provisioning of m heterogeneous VMs with resource requirements defined in MIPS, amount of RAM and network bandwidth. An SLA violation occurs when a VM cannot get the requested amount of resources, which may happen due to VM consolidation.

The software system architecture is tiered, comprising a dispatcher, global managers and local managers. The local managers reside on each physical node as part of the Virtual Machine Monitor (VMM). They are responsible for observing the current utilization of the node's resources and its thermal state. The local managers choose VMs that have to be migrated to another node in the following cases:
- The utilization of some resource is close to 100%, creating a risk of SLA violation.
- The utilization of resources is low; therefore, all the VMs should be reallocated to another node and the idle node should be turned off.
- A VM has intensive network communication with another VM allocated to a different physical host.
- The temperature exceeds some limit, and VMs have to be migrated in order to reduce the load on the cooling system and allow the node to cool down naturally.

The local managers send the global managers information about the utilization of resources and the VMs chosen to migrate; besides that, they issue commands for VM resizing, application of DVFS, and turning idle nodes on and off. Each global manager is attached to a set of nodes and processes the data obtained from their local managers. The global managers continuously apply a distributed version of a heuristic for semi-online multidimensional bin-packing, where bins represent physical nodes and items are the VMs to be allocated (see the sketch after the steps below). The decentralization removes a Single Point of Failure (SPF) and improves scalability. Each dimension of an item represents the utilization of a particular resource. After obtaining an allocation decision, the global managers issue commands for live migration of VMs. As shown in Figure 2, the system operation consists of the following steps:
- New requests for VM provisioning: users submit requests for provisioning of VMs.
- Dispatching requests for VM provisioning: the dispatcher distributes requests among global managers.
- Intercommunication between global managers: the global managers exchange information about the utilization of resources and the VMs that have to be allocated.
- Data about utilization of resources and VMs chosen to migrate: the local managers propagate information about resource utilization and the VMs chosen to migrate to the global managers.
- Migration commands: the global managers issue VM migration commands in order to optimize the current allocation.
- Commands for VM resizing and adjusting of power states: the local managers monitor their host nodes and issue commands for VM resizing and changes in the power states of nodes.
- VM resizing, scheduling and migration actions: according to the received commands, the VMM performs the actual resizing and migration of VMs as well as resource scheduling.
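The paper does not list the bin-packing heuristic itself; the following Java sketch is an illustrative Best-Fit-style instance of the general technique it names, with physical nodes as bins and VMs as multidimensional items. All names are assumptions of this example.

import java.util.List;

// Place a VM on the feasible node with the least remaining slack, packing
// nodes tightly so that emptied nodes can be switched off.
public class VmPlacement {
    public static class Node {
        double cpuFree, ramFree, bwFree;   // remaining normalized capacity
        Node(double c, double r, double b) { cpuFree = c; ramFree = r; bwFree = b; }
        boolean fits(double c, double r, double b) {
            return c <= cpuFree && r <= ramFree && b <= bwFree;
        }
        double slack() { return cpuFree + ramFree + bwFree; }
    }

    public static Node place(List<Node> nodes, double cpu, double ram, double bw) {
        Node best = null;
        for (Node n : nodes) {
            if (n.fits(cpu, ram, bw) && (best == null || n.slack() < best.slack())) {
                best = n;
            }
        }
        if (best != null) {   // commit the placement on the chosen bin
            best.cpuFree -= cpu; best.ramFree -= ram; best.bwFree -= bw;
        }
        return best;          // null means no node can host the VM
    }
}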

Fig. (2) System architecture

Fig. (3) Block diagram

4. ALLOCATION POLICIES
We propose three stages of VM placement optimization: reallocation according to the current utilization of multiple system resources, optimization of the virtual network topologies established between VMs, and VM reallocation considering the thermal state of the resources. Each of these stages is planned to be investigated separately and then combined in an overall solution. The developed algorithms have to meet the following requirements:
- Decentralization and parallelism, to eliminate the SPF and provide scalability.
- High performance: the system has to be able to quickly respond to changes in the workload.
- Guaranteed QoS: the algorithms have to provide reliable QoS by meeting SLAs.
- Independence of the workload type: the algorithms have to perform efficiently in mixed application environments.

The VM reallocation problem can be divided in two: selection of the VMs to migrate, and determining the new placement of these VMs on physical hosts. The first part has to be considered separately for each optimization stage; the second part is solved by applying a heuristic for the semi-online multidimensional bin-packing problem.

At the first optimization stage, the utilization of resources is monitored and VMs are reallocated to minimize the number of physical nodes in use and thus minimize energy consumption by the system. However, aggressive consolidation of VMs may lead to violation of performance requirements. We have proposed several heuristics for selecting the VMs to migrate and investigated the trade-off between performance and energy savings. To simplify the problem, for the first stage we considered only CPU utilization. The main idea of the policies is to set upper and lower utilization thresholds and keep the total CPU utilization created by the VMs sharing a node between these thresholds (a sketch of this policy appears at the end of this section). If the utilization exceeds the upper threshold, some VMs have to be migrated from the node to reduce the risk of SLA violation; if the utilization goes below the lower threshold, all VMs have to be migrated and the node switched off to save the energy consumed by an idle node. Another problem is determining particular values for the utilization thresholds. The results of the evaluation of the proposed algorithms are presented in Section 5.

Due to continuous reallocation, some intensively communicating VMs can be placed inefficiently, leading to excessive load on the network facilities. Therefore, it is crucial to consider the network communication behavior of VMs in reallocation decisions. The aim of the second proposed optimization stage is to place communicating VMs in a way that minimizes the overhead of data transfer over the network.

A cooling system of a data center consumes a significant amount of energy, so the third proposed optimization stage is aimed at optimizing cooling system operation. Due to consolidation, some computing nodes experience high load, leading to overheating, and thus require extensive cooling. Monitoring the nodes' thermal state using sensors gives an opportunity to recognize overheating and reallocate workload from an overheated node to allow natural cooling.
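A minimal Java sketch of the double-threshold policy described above; the threshold values 0.3 and 0.8 are illustrative assumptions, since the paper leaves the particular values open.

import java.util.ArrayList;
import java.util.List;

// Keep each node's total CPU utilization between LOWER and UPPER.
public class ThresholdPolicy {
    static final double LOWER = 0.3, UPPER = 0.8;   // assumed example values

    // Returns the VMs (represented by their CPU shares) to migrate off a node.
    public static List<Double> selectVmsToMigrate(List<Double> vmCpuShares) {
        double total = vmCpuShares.stream().mapToDouble(Double::doubleValue).sum();
        List<Double> toMigrate = new ArrayList<>();
        if (total < LOWER) {
            // Underloaded: migrate everything so the node can be switched off.
            toMigrate.addAll(vmCpuShares);
        } else if (total > UPPER) {
            // Overloaded: migrate VMs until utilization drops below the upper
            // threshold, reducing the risk of SLA violation.
            for (Double share : vmCpuShares) {
                if (total <= UPPER) break;
                toMigrate.add(share);
                total -= share;
            }
        }
        return toMigrate;
    }
}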

The network and temperature optimizations are subjects of ongoing research work.

5. EVALUATION
As the proposed system is targeted at a large-scale cloud data center, it is necessary to conduct large-scale experiments to evaluate the algorithms. However, it is difficult to run large-scale experiments on a real-world infrastructure, especially when the experiments have to be repeated for different policies under the same conditions [14]. Therefore, simulation has been chosen as the way to evaluate the proposed heuristics. We have chosen the CloudSim toolkit [14] as the simulation framework, as it is built for simulating cloud computing environments. In comparison to alternative simulation toolkits (e.g. SimGrid, GangSim), CloudSim supports modelling of on-demand, virtualization-enabled resource and application management. We have extended the framework to enable energy-aware simulations, as the core framework does not provide this capability. In addition, we have incorporated the ability to account for SLA violations and to simulate dynamic workloads that correspond to web applications and online services.
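For readers unfamiliar with the toolkit, this is roughly what a minimal CloudSim 3.x scenario looks like, in the style of the examples bundled with the toolkit (one datacenter, one host, one VM, one cloudlet). It is a sketch of the setup pattern only, not the authors' energy-aware extension; all numeric parameters are illustrative.

import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.*;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

public class MinimalSimulation {
    public static void main(String[] args) throws Exception {
        CloudSim.init(1, Calendar.getInstance(), false);   // one cloud user

        // One host with a single 1000-MIPS core.
        List<Pe> peList = new ArrayList<>();
        peList.add(new Pe(0, new PeProvisionerSimple(1000)));
        List<Host> hostList = new ArrayList<>();
        hostList.add(new Host(0, new RamProvisionerSimple(2048),
                new BwProvisionerSimple(10000), 1000000, peList,
                new VmSchedulerTimeShared(peList)));

        DatacenterCharacteristics ch = new DatacenterCharacteristics(
                "x86", "Linux", "Xen", hostList, 10.0, 3.0, 0.05, 0.001, 0.0);
        new Datacenter("Datacenter_0", ch,
                new VmAllocationPolicySimple(hostList), new LinkedList<Storage>(), 0);

        DatacenterBroker broker = new DatacenterBroker("Broker");
        Vm vm = new Vm(0, broker.getId(), 1000, 1, 1024, 1000, 10000,
                "Xen", new CloudletSchedulerTimeShared());
        Cloudlet cl = new Cloudlet(0, 400000, 1, 300, 300,
                new UtilizationModelFull(), new UtilizationModelFull(),
                new UtilizationModelFull());
        cl.setUserId(broker.getId());

        List<Vm> vms = new ArrayList<>();       vms.add(vm);
        List<Cloudlet> cloudlets = new ArrayList<>(); cloudlets.add(cl);
        broker.submitVmList(vms);
        broker.submitCloudletList(cloudlets);

        CloudSim.startSimulation();
        CloudSim.stopSimulation();
        System.out.println("Finish time: "
                + broker.getCloudletReceivedList().get(0).getFinishTime());
    }
}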
[13] M. Cardosa, M. Korupolu, and A. Singh, Shares and utilities based power consolidation in virtualized server environments, in Proceedings of IFIP/IEEE Integrated Network Management (IM), 2009
[14] R. Buyya, R. Ranjan, and R. N. Calheiros, Modeling and simulation of scalable Cloud computing environments and the CloudSim toolkit: Challenges and opportunities, in Proceedings of the 7th High Performance Computing and Simulation Conference (HPCS 09), IEEE Press, NY, USA, 2009

Cloud Computing and Emerging IT Trends

Mrs. J. Srimathi, Assistant Professor / MCA, Vivekananda Business School for Women, Thiruchengode, Tamilnadu, India.
Mrs. D. Kalaivani, Assistant Professor / MCA, Vivekanandha Institute of Information and Management Studies, Thiruchengode, Tamilnadu, India.

Abstract - India, a country with a huge population base, has never had adequate resources to embrace innovation. The situation is even worse for an individual who wants access to the latest technologies for research and development needs. Cloud Computing addresses this challenge to a great extent and provides access to the necessary IT resources in an affordable way. Cloud computing is emerging as a major disruptive force for both IT vendors and users as companies globally attempt to reduce the cost of ownership of IT infrastructure. The different layers of services in cloud computing build on technological concepts such as grid computing and virtual computing. Cloud computing and virtualization have opened up opportunities for organizations to offer a virtual desktop environment to companies and individuals. This paper provides a comprehensive analysis of cloud computing services, market segmentation, technology basics, trends, key players, and challenges for cloud deployment in enterprise IT with virtualization.

Keywords - VMware: a virtualization program owned by EMC Corporation that allows you to create and use virtual operating systems. VIM: Virtual Infrastructure Manager. IaaS: Infrastructure as a Service. CAGR: Compound Annual Growth Rate.

Virtualization and Cloud Computing

Cloud technology makes it possible to manage and increase the available system resources on the cloud automatically; this sets cloud hosting apart from traditional hosting providers, who are confined to the physical limitations of a server. Virtualization is the creation of a virtual (rather than actual) version of something, such as an operating system, a server, a storage device or network resources.

1.1 Virtualization

The ability to run multiple operating systems on a single physical system and share the underlying hardware resources is known as virtualization. Virtualization deals with the heterogeneity of the infrastructure and allows partitioning and isolation of physical resources for application execution. In virtualization, the VIM provides a uniform view of the resource pool and its life-cycle management.

a) Hardware virtualization: the creation of a virtual machine that acts like a real computer with an operating system. Software executed on these virtual machines is separated from the underlying hardware resources.
b) Operating system virtualization: commonly used in virtual hosting environments, where it is useful for securely allocating finite hardware resources amongst a large number of mutually distrusting users.
c) Memory virtualization: allows networked and distributed servers to share a pool of memory to overcome physical memory limitations, a common bottleneck in software performance. With this capability integrated into the network, applications can take advantage of a very large amount of memory to improve overall performance and system utilization, increase memory usage efficiency, and enable new use cases.
d) Application virtualization: software technologies that improve the portability, manageability and compatibility of applications by encapsulating them from the underlying operating system on which they are executed.
e) Desktop virtualization: the concept of isolating a logical operating system (OS) instance from the client that is used to access it.

1.2 Cloud Computing

Cloud Computing is Internet-based computing whereby shared resources, software, and information are provided to computers and devices on demand. Cloud Computing represents a paradigm shift comparable to the move from mainframes to client-server computing in the early 1980s. Services are provisioned in a timely, on-demand manner to allow the scaling up and down of resources. Cloud computing overlaps some of the concepts of distributed, grid and utility computing. Grid Computing: the combination of computer resources from multiple administrative domains applied to a common task. Utility Computing: the packaging of computing resources (computation, storage, etc.) as a metered service similar to a traditional public utility. Cloud Computing IaaS is the combination of those older concepts of utility and grid: Cloud Computing IaaS = Grid Computing + (Utility Computing * N); that is, Cloud Computing IaaS is a grid of compute utilities.

2. Basic Requirements of Cloud Computing

2.1 Transparency
One of the premises of Cloud Computing is that services are delivered transparently regardless of the physical implementation within the "cloud". This fundamental concept is another form of virtualization, where multiple resources appear to the user as a single resource. For example, when a service is provisioned to a user or an organization, it may need only a single server (real or virtual) to handle demand. But as more users access that service, it may require the addition of more servers (real or virtual).

2.2 Scalability
Cloud Computing service providers need to scale up and build out "mega data centers". Making things even more difficult is the need to scale on-demand in real time in order to make the most efficient use of application infrastructure resources. Many claim that this will require a virtualized infrastructure such that resources can be provisioned and de-provisioned quickly, easily and, one hopes, automatically. The "control node" often depicted in high-level diagrams of the "cloud computing mega data center" will need to provide on-demand dynamic application scalability.

2.3 Intelligent Monitoring
In order to achieve the on-demand scalability and transparency required of a mega data center in the cloud, the control node, i.e., the application delivery solution, will need intelligent monitoring capabilities. It needs to know the applications and services being served from the cloud and understand when behavior is outside accepted norms. The application delivery mechanism must not only provide information about when an application or service is in trouble, but also take action based on that information. If the monitoring mechanism detects that an application is responding slowly, the delivery solution should adjust application requests accordingly (a minimal sketch of such a health-check loop follows below). If the number of concurrent users accessing a service is reaching capacity, the application delivery solution should not only detect that through intelligent monitoring but participate in provisioning another instance of the service in order to ensure service to all clients.
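As a concrete illustration of monitoring-driven adjustment, here is a minimal sketch; it is ours, not the paper's, and the backend addresses, latency threshold, and probe interval are illustrative assumptions.

# Illustrative health-check loop for a "control node".
# Backend URLs, the latency threshold, and the probe interval
# are assumptions for illustration only.
import time
import urllib.request

BACKENDS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]  # hypothetical
LATENCY_SLO = 0.5    # seconds; assumed acceptable response time
PROBE_INTERVAL = 10  # seconds between monitoring rounds

healthy = set(BACKENDS)

def probe(url):
    """Measure the response time of a backend; None on failure."""
    start = time.monotonic()
    try:
        urllib.request.urlopen(url + "/health", timeout=2).read()
    except OSError:
        return None
    return time.monotonic() - start

while True:
    for url in BACKENDS:
        latency = probe(url)
        if latency is None or latency > LATENCY_SLO:
            healthy.discard(url)   # stop routing requests here
        else:
            healthy.add(url)       # backend recovered
    if not healthy:
        print("all backends degraded: trigger provisioning of a new instance")
    time.sleep(PROBE_INTERVAL)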
2.4 Security
In cloud computing, the mega data center must be architected with security in mind, and security must be considered a priority for every application, service, and network infrastructure solution that is deployed. The application delivery solution, as the "control node" in the mega data center, is necessarily one of the first entry points into the cloud data center and must itself be secure. Network security, protocol security, transport layer security, and application security should be prime candidates for implementation at the edge of the cloud, in the control node.

3. Deployment Model of Cloud Computing

Figure-1 Deployment Model of Cloud Computing

3.1 Public Cloud: Public cloud refers to Cloud Computing in the traditional mainstream sense, whereby resources are dynamically provisioned on a fine-grained, self-service basis over the Internet. These resources are provisioned via web applications/web services, from an offsite third-party provider who shares resources and bills the customer on a fine-grained utility computing basis.

3.2 Community Cloud: A community cloud is established among several organizations that have similar requirements and seek to share their computing infrastructure in order to realize some of the benefits of the public cloud. With the costs spread over fewer users than a public cloud (but more than a single tenant), this option is more expensive but may offer a higher level of privacy, security and/or policy compliance.

3.3 Hybrid Cloud: A Cloud Computing environment in which an organization manages some computing resources in-house and has others provided externally on the public cloud. One of the primary reasons the hybrid model is popular is that organizations prefer to leverage their existing (often large) investments in computing infrastructure. Furthermore, many organizations prefer to keep sensitive data under their own control to ensure security and/or compliance.

3.4 Private Cloud: A term that is similar to, and derived from, the concept of a Virtual Private Network (VPN), but applied to Cloud Computing. The private cloud delivers the benefits of Cloud Computing with the option to optimize for data security, corporate governance and reliability.

4. Benefits of Cloud Computing

4.1 Reduced Cost: Cloud technology is paid for incrementally, saving organizations money.
4.2 Increased Storage: Organizations can store more data than on private computer systems.
4.3 Highly Automated: IT personnel no longer need to worry about keeping software up to date.
4.4 Flexibility: Cloud computing offers much more flexibility than past computing methods.
4.5 More Mobility: Employees can access information wherever they are, rather than having to remain at their desks.
4.6 Allows IT to Shift Focus: No longer having to worry about constant server updates and other computing issues, organizations are free to concentrate on innovation.

5. Cloud Computing Services

In a cloud web hosting environment, additional resources are added automatically to deal with traffic spikes and high loads; customers' sites will not be suspended for "excessive use", as the cloud computing resources can be upgraded seamlessly at any time.

5.1 Atlantic.Net: Atlantic.Net provides Windows and Linux cloud servers with 256 MB RAM, 10 GB disk space and a 10 Mbps network. All Atlantic.Net cloud hosting plans include a dedicated IP address, nightly backups, console and full root access, and hourly billing (you only pay for what you use).

5.2 Eleven2: Eleven2 provides Linux (CentOS, Debian, Fedora, Ubuntu) and Windows 2003/2008 cloud servers with an optional cPanel control panel, starting from a 30 GB hard drive, 512 MB memory and 250 GB bandwidth. Eleven2 lets customers instantly scale cloud servers and provides flexible pricing, free server monitoring and 24/7 technical support.

6. Cloud Prevailing Technologies

6.1 GoGrid: A cloud infrastructure service hosting Linux and Windows virtual machines managed by a multi-server control panel and a RESTful API. GoGrid is privately held and competes against Rackspace in the dedicated hosting space as well as in the cloud computing hosting space.
GoGrid also offers more cloud-like services for storing information in a shared way, similar to SimpleDB.

6.2 The Rackspace Cloud: A web application hosting / cloud platform provider ("Cloud Sites") that bills on a utility computing basis. It has since branched out into cloud storage, and it was one of the first commercial cloud computing services.

6.3 Skytap Cloud: It uses a browser-based interface for all system management and hosts a library of pre-configured virtual machine images. Using either these images or their own imported VMs, users can create sharable configurations of one or more machines and securely connect to active machines via a proprietary Java applet. This cloud is principally used for developing, testing, and demoing software.

6.4 VMware: VMware software provides a completely virtualized set of hardware to the guest operating system. VMware software virtualizes the hardware for a video adapter, a network adapter, and hard disk adapters. The host provides pass-through drivers for guest USB, serial, and parallel devices.

6.5 Windows Azure: It makes millions of connected servers work together as a cohesive unit and provides an environment with automated service management, immense computing potential, practically unlimited storage and a rich developer experience. It also provides 24/7 availability and the ability to scale up and down with very little overhead. The platform includes Live Services, SQL Azure Services, AppFabric Services, SharePoint Services and Dynamics CRM Services.

6.6 Datacenters: A datacenter is a facility used to house computer systems and associated components such as telecommunications and storage systems. Datacenters have servers grouped inside containers (1800-2500 servers), which helps them withstand a growing number of connections each day with confidence and assurance in the secure transfer of data. A datacenter generally includes redundant or backup power supplies, redundant data communications connections, environmental controls (e.g., air conditioning, fire suppression) and security devices.

7. Infrastructure for Cloud Computing

Cloud services are broadly divided into three categories: Infrastructure-as-a-Service, Platform-as-a-Service and Software-as-a-Service.

Figure-2 Cloud Computing Infrastructure

7.1 Infrastructure-as-a-Service (IaaS)
The consumer provisions processing, storage, networks, and other fundamental computing resources, and is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications, and possibly limited control of select networking components (e.g., host firewalls). IaaS provides virtual server instances with unique IP addresses and blocks of storage on demand. Customers use the provider's application program interface (API) to start, stop, access and configure their virtual servers and storage (a hypothetical example of such API calls is sketched at the end of this section).

7.2 Platform-as-a-Service (PaaS)
PaaS is defined as a set of software and product development tools hosted on the provider's infrastructure. Developers create applications on the provider's platform over the Internet. PaaS providers may use APIs, website portals or gateway software installed on the customer's computer. Force.com and Google Apps are examples of PaaS. Heroku is an online Ruby on Rails cloud PaaS.

7.3 Software-as-a-Service (SaaS)
SaaS is a software application that interacts with the user through a portal. SaaS is a very broad market; services can be anything from Web-based email to inventory control and database processing. Because the service provider hosts both the application and the data, the end user is free to use the service from anywhere (e.g., SalesForce.com, BPOS, Gmail and Google Apps, RightNow CX, NaviNet, Infosys).
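Returning to the IaaS interaction pattern described in 7.1, the following minimal sketch shows the start/stop/configure cycle over a REST API. The endpoint URL, the token header, and the JSON fields are hypothetical and do not correspond to any real provider's API.

# Hypothetical IaaS API client: the URL, credential, and fields are
# invented for illustration and are not a real provider's API.
import json
import urllib.request

API = "https://api.example-iaas.com/v1"   # hypothetical endpoint
TOKEN = "my-secret-token"                 # hypothetical credential

def call(method, path, body=None):
    req = urllib.request.Request(
        API + path,
        data=json.dumps(body).encode() if body else None,
        headers={"Authorization": "Bearer " + TOKEN,
                 "Content-Type": "application/json"},
        method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Provision a server, then stop it: the start/stop/configure cycle
# described in the text.
server = call("POST", "/servers", {"image": "ubuntu-20.04", "ram_mb": 1024})
print("provisioned", server["id"], "at", server["ip"])
call("POST", "/servers/%s/actions" % server["id"], {"type": "stop"})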

8. Cloud computing takes virtualization to the next step

8.1 Goal 1: Cost Control
The premise behind cloud computing is that you only pay for what you use. In other words, if all you need to do is run a single app that takes a few CPU cycles and a few MB of RAM, then that is all you will pay for. Many systems have variable demands: batch processing (e.g., the New York Times), web sites with traffic peaks (e.g., Forbes), and startups with unknown demand. Cloud computing removes the risk of buying hardware before it is needed. Example: accessing Forbes.com, which offers on-line real-time stock market data.

Figure-3 Rate of Server Access@Forbes.com

8.2 Goal 2: Business Agility
Cloud computing promises to support top-line growth in a company by increasing its agility, enabling it to respond quickly to market movements and to reduce expenses through more efficient use of assets. For example, the New York Times recently used Amazon Web Services to digitize back editions of their newspaper in only a few days, a process they claim would otherwise have taken over a year.

8.3 Goal 3: Hybrid Cloud
A hybrid cloud sits between a private cloud (internal virtual infrastructure) and a public cloud.

Figure-4 Hybrid Cloud

8.4 Goal 4: Save Time and Money
The main goal of cloud computing is to save your company time and money. It frees the user from work, while the cloud provider worries about things like server swaps, datacenter maintenance, capital expenditure, backups, high availability, and disaster recovery.

8.5 Goal 5: Software Catalog
With cloud computing, most cloud providers offer a software catalog from which you can not only select the software you want (and pay for it, if applicable) but simply import pre-created virtual machines containing that software into your cloud infrastructure.

9. Virtualization with Cloud

9.1 Resource Pooling
The provider's computing resources, such as storage, processing, memory and network bandwidth, are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge of the exact location of the provided resources, but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter).

Figure-5 Virtualization concept with cloud

9.2 Rapid Elastic Scaling
Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

Figure-6 Rapid Elastic Scaling

9.3 Fast Deployment
Whole applications can be packaged for fast and easy deployment using virtualization in the cloud, which simplifies Dev/Test by cloning test and production environments.

Figure-7 Implementation of Cloud

10. Conclusion and Future of Cloud Computing

Cloud hosting is fast becoming the web hosting solution of choice, especially with e-commerce merchants. The global cloud computing market is expected to grow at a 30% CAGR to $270 billion by 2020, concludes the latest research report covering cloud computing products, technologies and services for the global market. The report provides detailed year-by-year (2015-2020) forecasts for the following cloud computing market segments:

Figure-8 Cloud Computing Market Segments (2015-2020)

In the future, the cloud will be open and interoperable. Today about 1.5 billion users are connected to cloud computing; within the next five years there are expected to be 10 billion client devices plus another 1 billion users. Cloud computing scales computing rapidly, while closed proprietary architectures stifle innovation; the cloud of the future will be client-aware, automated, and federated. Resources will be dynamically allocated and automatically managed for maximum utilization and energy efficiency, with best-of-breed technologies, flexibility, and choice from the data center to the client. The Tamil Nadu government is in the final stage of launching cloud computing to make its websites and online services faster and more efficient. Cloud computing technology in India will dramatically change the way we compute for schools, colleges &

Data Secure and Dependable Storage Services in Cloud Computing

Ajay Kumara M A, M.Tech, Dept. of CSE (PG), M V J College of Engineering, Bangalore, V.T.U Karnataka, ajaykumar.ak99@gmail.com
Mr. Sharavana K, Assistant Professor, Dept. of CSE, M V J College of Engineering, Bangalore, sharatanuj@gmail.com

Abstract - Cloud storage enables users to remotely store their data and enjoy on-demand high-quality cloud applications without the burden of local hardware and software management. However, users no longer have physical possession of their outsourced data, which inevitably poses new security risks towards the correctness of the data in the cloud. In order to address this new problem and further achieve a secure and dependable cloud storage service, we propose in this paper a flexible distributed storage integrity auditing mechanism, utilizing the homomorphic token and distributed erasure-coded data.

The proposed design allows users to audit the cloud storage with very lightweight communication and computation cost. The auditing result not only ensures a strong cloud storage correctness guarantee, but also simultaneously achieves fast data error localization, i.e., the identification of the misbehaving server. As cloud data are dynamic in nature, the proposed design further supports secure and efficient dynamic operations on outsourced data, including block modification, deletion, and append. Analysis shows the proposed scheme is highly efficient and resilient against Byzantine failure, malicious data modification attack, and even server colluding attacks.

Index Terms - Data integrity, dependable distributed storage, error localization, data dynamics, Cloud Computing

I. INTRODUCTION

Several trends are opening up the era of cloud computing, an Internet-based development and use of computer technology. Ever cheaper and more powerful processors, together with the software-as-a-service (SaaS) computing architecture, are transforming data centers into pools of computing service on a huge scale. Increasing network bandwidth and reliable yet flexible network connections make it possible for users to subscribe to high-quality services from data and software that reside solely on remote data centers. Moving data into the cloud offers great convenience to users, since they do not have to care about the complexities of direct hardware management. Among Cloud Computing vendors, Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) [2] are both well-known examples. Such Internet-based online services do provide huge amounts of storage space and customizable computing resources. This computing platform shift, however, is also eliminating the responsibility of local machines for data maintenance. As a result, users are at the mercy of their cloud service providers for the availability and integrity of their data. On the one hand, although cloud infrastructures are much more powerful and reliable than personal computing devices, a broad range of both internal and external threats to data integrity still exists. Examples of outages and data loss incidents of cloud storage services appear from time to time [2]. On the other hand, since users may not retain a local copy of outsourced data, there exist various incentives for cloud service providers (CSP) to behave unfaithfully towards the cloud users regarding the status of their outsourced data. For example, to increase the profit margin by reducing cost, it is possible for a CSP to discard rarely accessed data without being detected in a timely fashion [3]. Similarly, the CSP may even attempt to hide data loss incidents so as to maintain a reputation. Therefore, although outsourcing data into the cloud is economically attractive for the cost and complexity of long-term large-scale data storage, the lack of strong assurance of data integrity and availability may impede its wide adoption by both enterprise and individual cloud users. In order to achieve the assurances of cloud data integrity and availability and enforce the quality of cloud storage service, efficient methods that enable on-demand data correctness verification on behalf of cloud users have to be designed.
However, the fact that users no longer have physical possession of data in the cloud prevents the direct adoption of traditional cryptographic primitives for the purpose of data integrity protection. Hence, the verification of cloud storage correctness must be conducted without explicit knowledge of the whole data files. Meanwhile, cloud storage is not just a third-party data warehouse: the data stored in the cloud may not only be accessed but also frequently updated by the users [4], through insertion, deletion, modification, appending, etc. Moreover, the deployment of Cloud Computing is powered by data centers running in a simultaneous, cooperated and distributed manner. It is advantageous for individual users to store their data redundantly across multiple physical servers so as to reduce the data integrity and availability threats. Thus, distributed protocols for storage correctness assurance will be of the utmost importance in achieving robust and secure cloud storage systems.

II. PROPOSED SYSTEM

In this paper, we propose an effective and flexible distributed storage verification scheme with explicit dynamic data support to ensure the correctness and availability of users' data in the cloud. We rely on erasure-correcting code in the file distribution preparation to provide redundancies and guarantee the data dependability against Byzantine servers [6], where a storage server may fail in arbitrary ways. This construction drastically reduces the communication and storage overhead as compared to traditional replication-based file distribution techniques (a toy illustration of erasure-coded redundancy is sketched at the end of this section). By utilizing the homomorphic token with distributed verification of erasure-coded data, our scheme achieves storage correctness insurance as well as data error localization: whenever data corruption has been detected during the storage correctness verification, our scheme can almost guarantee the simultaneous localization of data errors, i.e., the identification of the misbehaving server(s). In order to strike a good balance between error resilience and data dynamics, we further explore the algebraic property of our token computation and erasure-coded data, and demonstrate how to efficiently support dynamic operations on data blocks while maintaining the same level of storage correctness assurance. In order to save time, computation resources, and the related online burden of users, we also provide an extension of the proposed main scheme to support third-party auditing, where users can safely delegate the integrity checking tasks to third-party auditors and be worry-free in using the cloud storage services. Our work is among the first few in this field to consider distributed data storage security in Cloud Computing.
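As a toy illustration of why erasure coding beats plain replication for redundancy, here is our own sketch (not the paper's construction, which uses a Reed-Solomon-style (m, k) code over larger vectors): a single XOR parity block lets any one lost block be rebuilt.

# Toy (2+1) erasure code: two data blocks plus one XOR parity block.
# Any single lost block can be reconstructed from the other two, at a
# storage overhead of 1.5x instead of the 2x-3x cost of replication.

def encode(d1: bytes, d2: bytes):
    parity = bytes(a ^ b for a, b in zip(d1, d2))
    return [d1, d2, parity]           # stored on three different servers

def recover(blocks):
    """blocks is [d1, d2, parity] with exactly one entry set to None."""
    survivors = [b for b in blocks if b is not None]
    return bytes(a ^ b for a, b in zip(*survivors))  # XOR of survivors

shares = encode(b"hello wo", b"rld data")
shares[1] = None                      # simulate one failed server
assert recover(shares) == b"rld data"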

III. EXISTING SYSTEM

Cloud computing has been envisioned as the next-generation architecture of the IT enterprise, due to its long list of unprecedented advantages: on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid resource elasticity, usage-based pricing, and transference of risk. One fundamental aspect of this new computing model is that data is being centralized or outsourced into the cloud. From the data owners' perspective, including both individuals and IT enterprises, storing data remotely in a cloud in a flexible on-demand manner brings appealing benefits: relief of the burden of storage management, universal data access independent of geographical location, and avoidance of capital expenditure on hardware, software, personnel maintenance, and so on.

Our contribution can be summarized in the following three aspects:
1) Compared to many of its predecessors, which only provide binary results about the storage status across the distributed servers, the proposed scheme achieves the integration of storage correctness insurance and data error localization, i.e., the identification of misbehaving server(s).
2) Unlike most prior works for ensuring remote data integrity, the new scheme further supports secure and efficient dynamic operations on data blocks, including update, delete and append.
3) The experiment results demonstrate that the proposed scheme is highly efficient. Extensive security analysis shows our scheme is resilient against Byzantine failure, malicious data modification attack, and even server colluding attacks.

The rest of the paper is organized as follows. Section II introduces the proposed system. Section III describes the existing system. Section IV states the problem. Section V gives the design goals. Section VI presents the mechanism for ensuring cloud data storage. Section VII surveys related work. Finally, the conclusion completes the paper.

IV. PROBLEM STATEMENT

A. System Model

A representative network architecture for a cloud storage service is illustrated in Figure 1. Three different network entities can be identified as follows:

User: an entity who has data to be stored in the cloud and relies on the cloud for data storage and computation; this can be either an enterprise or an individual customer. The user retrieves cloud services from the CSP, using a message broadcast channel to communicate and to send requests and responses between owner and cloud server.

Cloud Server (CS): an entity which is managed by the cloud service provider (CSP) to provide data storage service and has significant storage space and computation resources (we will not differentiate CS and CSP hereafter).

Third Party Auditor (TPA): an optional TPA, who has expertise and capabilities that users may not have, is trusted to assess and expose risk of cloud storage services on behalf of the users upon request.

Fig. 1: Cloud data storage architecture

In cloud data storage, a user stores his data through a CSP into a set of cloud servers, which run in a simultaneous, cooperated and distributed manner. Data redundancy can be employed with the technique of erasure-correcting code to further tolerate faults or server crashes as the user's data grows in size and importance. Thereafter, for application purposes, the user interacts with the cloud servers via the CSP to access or retrieve his data. In some cases, the user may need to perform block-level operations on his data. The most general forms of these operations we consider are block update, delete, insert and append. As users no longer possess their data locally, it is of critical importance to assure users that their data are being correctly stored and maintained. That is, users should be equipped with security means so that they can obtain continuous correctness assurance (to enforce the cloud storage service-level agreement) of their stored data even without the existence of local copies. In case users do not have the time, feasibility or resources to monitor their data online, they can delegate the data auditing tasks to an optional trusted TPA of their choice. However, to securely introduce such a TPA, any possible leakage of the user's outsourced data towards the TPA through the auditing protocol should be prohibited. In our model, we assume that the point-to-point communication channels between each cloud server and the user are authenticated and reliable, which can be achieved in practice with little overhead. These authentication handshakes are omitted in the following presentation.

B. Adversary Model

Security threats faced by cloud data storage can come from two different sources. On the one hand, a CSP can be self-interested, untrusted and possibly malicious. Not only may it desire to move data that has not been or is rarely accessed to a lower tier of storage than agreed for monetary reasons, but it may also attempt to hide a data loss incident due to management errors, Byzantine failures and so on. On the other hand, there may also exist an economically motivated adversary who has the capability to compromise a number of cloud data storage servers in different time intervals, and who is subsequently able to modify or delete users' data while remaining undetected by the CSP for a certain period.

Specifically, we consider two types of adversary with different levels of capability in this paper:

Weak Adversary: The adversary is interested in corrupting the user's data files stored on individual servers. Once a server is compromised, the adversary can pollute the original data files by modifying or introducing its own fraudulent data to prevent the original data from being retrieved by the user.

Strong Adversary: This is the worst-case scenario, in which we assume that the adversary can compromise all the storage servers, so that he can intentionally modify the data files as long as they are internally consistent. In fact, this is equivalent to the case where all servers collude to hide a data loss or corruption incident.

V. DESIGN GOALS

To ensure the security and dependability of cloud data storage, we aim to design mechanisms for dynamic data verification and operation that achieve the following goals:
Storage correctness: to ensure users that their data are indeed stored appropriately and kept intact all the time in the cloud.
Fast localization of data error: to effectively locate the malfunctioning server when data corruption has been detected.
Dynamic data support: to maintain the same level of storage correctness assurance even if users modify, delete or append their data files in the cloud.
Dependability: to enhance data availability against Byzantine failures, malicious data modification and server colluding attacks, i.e., to minimize the effect brought by data errors or server failures.
Lightweight: to enable users to perform storage correctness checks with minimum overhead.

VI. ENSURING CLOUD DATA STORAGE

In a cloud data storage system, users store their data in the cloud and no longer possess it locally. Thus, the correctness and availability of the data files being stored on the distributed cloud servers must be guaranteed. One of the key issues is to effectively detect any unauthorized data modification and corruption, possibly due to server compromise and/or random Byzantine failures. Besides, in the distributed case, when such inconsistencies are successfully detected, finding which server the data error lies in is also of great significance, since it can be the first step toward fast recovery from storage errors. To address these problems, our main scheme for ensuring cloud data storage is presented in this section. The first part of the section is devoted to a review of the basic tools from coding theory that are needed in our scheme for file distribution across cloud servers. Then the homomorphic token is introduced. The token computation function we consider belongs to a family of universal hash functions [7], chosen to preserve the homomorphic properties, which can be perfectly integrated with the verification of erasure-coded data. Subsequently, it is shown how to derive a challenge-response protocol for verifying storage correctness as well as identifying misbehaving servers. Finally, the procedure for file retrieval and error recovery based on erasure-correcting code is outlined.

A. The Proposed Security Using a Cryptographic Algorithm

In the introduction we motivated secure and dependable data storage in cloud computing. This section presents our public auditing scheme for cloud data storage security and discusses two straightforward schemes and their demerits.
We then present our main result for privacy-preserving public auditing to achieve the aforementioned design goals. We also show how to extend our main scheme to support batch auditing for the TPA upon delegations from multiple users. Finally, we discuss how to adapt our main result to support data dynamics.

B. Definitions and Framework of Public Auditing System

We follow the established definition of remote data integrity checking and adapt its framework for our privacy-preserving public auditing system. A public auditing scheme consists of four algorithms (KeyGen, SigGen, GenProof, and VerifyProof). KeyGen is a key generation algorithm run by the user to set up the scheme. SigGen is used by the user to generate verification MACs (message authentication codes) and signatures, or other related information that will be used for auditing. GenProof is run by the cloud server to generate a proof of data storage correctness, while VerifyProof is run by the TPA to audit the proof from the cloud server. Our public auditing system can be constructed from the above auditing scheme in two phases, Setup and Audit:

Setup: The user initializes the public and secret parameters of the system by executing KeyGen, and pre-processes the data file F by using SigGen to generate the verification metadata. The user then stores the data file F at the cloud server, deletes its local copy, and publishes the verification metadata to the TPA for later audit. As part of pre-processing, the user may alter the data file F by expanding it or by including additional metadata to be stored at the server.

Audit: The TPA issues an audit message or challenge to the cloud server to make sure that the cloud server has retained the data file F properly at the time of the audit. The cloud server derives a response message from a function of the stored data file F by executing GenProof. Using the verification metadata, the TPA verifies the response via VerifyProof. Note that in our design we do not assume any additional property of the data file, and thus regard error-correcting codes as orthogonal to our system. If the user wants more error resiliency, he/she can first redundantly encode the data file and then provide us with the data file that has error-correcting codes integrated. In Section 3.5, we will show how to adapt our main result to support dynamic data update.
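The following minimal sketch is our illustration of how the four algorithms fit together in the Setup and Audit phases; it uses plain HMAC spot-checking rather than the paper's homomorphic tokens, and all names are illustrative.

# Toy KeyGen/SigGen/GenProof/VerifyProof skeleton using HMAC
# spot-checking. Real schemes use homomorphic authenticators so the
# proof stays compact and privacy-preserving; this only illustrates
# the Setup/Audit flow described above.
import hmac, hashlib, os, random

def key_gen():                      # KeyGen: run by the user
    return os.urandom(32)

def sig_gen(key, blocks):           # SigGen: one MAC tag per block
    return [hmac.new(key, b, hashlib.sha256).digest() for b in blocks]

def gen_proof(blocks, indices):     # GenProof: run by the cloud server
    return [blocks[i] for i in indices]

def verify_proof(key, tags, indices, proof):   # VerifyProof: auditor side
    return all(
        hmac.compare_digest(tags[i], hmac.new(key, b, hashlib.sha256).digest())
        for i, b in zip(indices, proof))

# Setup phase: generate key and verification metadata, upload the file.
blocks = [b"block-%d" % i for i in range(100)]
key = key_gen()
tags = sig_gen(key, blocks)         # verification metadata

# Audit phase: challenge a random subset of block indices.
challenge = random.sample(range(len(blocks)), 10)
proof = gen_proof(blocks, challenge)              # server side
assert verify_proof(key, tags, challenge, proof)  # auditor side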

C. Challenge Token Pre-computation

In order to achieve assurance of data storage correctness and data error localization simultaneously, our scheme relies entirely on pre-computed verification tokens. The main idea is as follows: before file distribution, the user pre-computes a certain number of short verification tokens on each individual vector G(j) (j in {1, ..., n}), each token covering a random subset of data blocks. Later, when the user wants to make sure of the storage correctness of the data in the cloud, he challenges the cloud servers with a set of randomly generated block indices. Upon receiving a challenge, each cloud server computes a short signature over the specified blocks and returns it to the user. The values of these signatures should match the corresponding tokens pre-computed by the user. Meanwhile, as all servers operate over the same subset of indices, the requested response values for the integrity check must also form a valid codeword determined by the token pre-computation.

Algorithm 1 Token Pre-computation
1: procedure
2: Choose parameters l, n and functions f, φ;
3: Choose the number t of tokens;
4: Choose the number r of indices per verification;
5: Generate master key K_PRP and challenge key k_chal;
6: for vector G(j), j <- 1, n do
7:   for round i <- 1, t do
8:     Derive α_i = f_{k_chal}(i) and k_prp^(i) from K_PRP
9:     Compute v_i^(j) = Σ_{q=1}^{r} α_i^q · G(j)[φ_{k_prp^(i)}(q)]
10:  end for
11: end for
12: Store all the v_i^(j) locally.
13: end procedure
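A small numerical rendering of Algorithm 1's token formula follows. This is our illustration under simplifying assumptions: the PRF f and the pseudorandom permutation φ are simulated with hashlib and seeded shuffling for readability, and the data vector holds plain integers.

# Illustrative token pre-computation in the shape of Algorithm 1:
# v = sum over q of alpha**q * G[phi(q)], computed in a prime field.
import hashlib, random

P = 2**31 - 1  # a prime modulus for the field arithmetic

def prf(key: bytes, i: int) -> int:
    """Stand-in PRF f_k(i): hash-derived value in [1, P)."""
    digest = hashlib.sha256(key + i.to_bytes(4, "big")).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)

def prp_indices(key: bytes, i: int, length: int, r: int):
    """Stand-in PRP phi: first r positions of a key-seeded shuffle."""
    rng = random.Random(key + i.to_bytes(4, "big"))
    idx = list(range(length))
    rng.shuffle(idx)
    return idx[:r]

def precompute_tokens(G, k_chal, k_prp, t, r):
    tokens = []
    for i in range(1, t + 1):
        alpha = prf(k_chal, i)
        v = sum(pow(alpha, q + 1, P) * G[pos]
                for q, pos in enumerate(prp_indices(k_prp, i, len(G), r))) % P
        tokens.append(v)
    return tokens

# The server later recomputes the same sum over its stored vector;
# a mismatch with the pre-computed token exposes a misbehaving server.
G = list(range(1000))                 # toy data vector G(j)
tokens = precompute_tokens(G, b"chal", b"prp", t=5, r=8)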
D. Correctness Verification and Error Localization

Error localization is a key prerequisite for eliminating errors in storage systems. However, many previous schemes do not explicitly consider the problem of data error localization, and thus only provide binary results for the storage verification. Our scheme outperforms those by integrating correctness verification and error localization in our challenge-response protocol: the response values from the servers for each challenge not only determine the correctness of the distributed storage, but also contain information to locate potential data error(s).

Algorithm 2 Correctness Verification and Error Localization
1: procedure CHALLENGE(i)
2: Recompute α_i = f_{k_chal}(i) and k_prp^(i) from K_PRP;
3: Send {α_i, k_prp^(i)} to all the cloud servers;
4: Receive from the servers: {R_i^(j) = Σ_{q=1}^{r} α_i^q · G(j)[φ_{k_prp^(i)}(q)] | 1 ≤ j ≤ n}
5: for (j <- m+1, n) do
6:   R_i^(j) <- R_i^(j) - Σ_{q=1}^{r} f_{k_j}(s_{I_q,j}) · α_i^q, where I_q = φ_{k_prp^(i)}(q)
7: end for
8: if ((R_i^(1), ..., R_i^(m)) · P == (R_i^(m+1), ..., R_i^(n))) then
9:   Accept and be ready for the next challenge.
10: else
11:   for (j <- 1, n) do
12:     if (R_i^(j) != v_i^(j)) then
13:       return server j is misbehaving.
14:     end if
15:   end for
16: end if
17: end procedure

Fig. 2 MAC algorithm example

In the example of Fig. 2, the sender of a message runs it through a MAC algorithm to produce a MAC data tag. The message and the MAC tag are then sent to the receiver. The receiver in turn runs the message portion of the transmission through the same MAC algorithm using the same key, producing a second MAC data tag. The receiver then compares the first MAC tag received in the transmission to the second generated MAC tag. If they are identical, the receiver can safely assume that the integrity of the message was not compromised and that the message was not altered or tampered with during transmission.

HMAC: In cryptography, HMAC (Hash-based Message Authentication Code) is a specific construction for calculating a message authentication code (MAC) involving a cryptographic hash function in combination with a secret key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-MD5 or HMAC-SHA1 accordingly. The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output length in bits, and the size and quality of the cryptographic key.
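For reference, generating and checking an HMAC tag as described above takes a few lines with Python's standard library; the key and message are illustrative, and SHA-256 is used rather than the older MD5/SHA-1 mentioned above.

# HMAC-SHA256 tag generation and constant-time verification using
# only the standard library; key and message are illustrative.
import hmac, hashlib

key = b"shared-secret-key"
message = b"data block to protect"

tag = hmac.new(key, message, hashlib.sha256).hexdigest()   # sender side

# Receiver recomputes the tag over the received message and compares.
expected = hmac.new(key, message, hashlib.sha256).hexdigest()
assert hmac.compare_digest(tag, expected)   # identical -> not tampered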

E. File Retrieval and Error Recovery

Algorithm 3 Error Recovery
1: procedure
   % Assume the block corruptions have been detected among the specified r rows;
   % Assume s ≤ k servers have been identified as misbehaving
2: Download the r rows of blocks from the servers;
3: Treat the s servers as erasures and recover the blocks;
4: Resend the recovered blocks to the corresponding servers;
5: end procedure

Since our layout of the file matrix is systematic, the user can reconstruct the original file by downloading the data vectors from the first m servers, assuming that they return the correct response values.

F. Towards Third Party Auditing

As discussed in our architecture, in case the user does not have the time, feasibility or resources to perform the storage correctness verification, he can optionally delegate this task to an independent third-party auditor, making the cloud storage publicly verifiable. However, as pointed out by recent work [8], [9], to securely introduce an effective TPA, the auditing process should bring in no new vulnerabilities towards user data privacy. Namely, the TPA should not learn the user's data content through the delegated data auditing. We show that, with only slight modification, our protocol can support privacy-preserving third-party auditing.

G. Providing Dynamic Data Operation

The scheme so far assumes static or archived data. This model may fit some application scenarios, such as libraries and scientific datasets. However, in cloud data storage there are many potential scenarios where the data stored in the cloud is dynamic, like electronic documents, photos, or log files. Therefore, it is crucial to consider the dynamic case, where a user may wish to perform various block-level operations of update, delete and append to modify the data file while maintaining the storage correctness assurance.

Fig 3. Activity diagram of the architecture

VII. RELATED WORK

Juels et al. [3] described a formal proof of retrievability (POR) model for ensuring remote data integrity. Their scheme combines spot-checking and error-correcting code to ensure both possession and retrievability of files on archive service systems. Shacham et al. [4] built on this model and constructed a random linear function based homomorphic authenticator which enables an unlimited number of queries and requires less communication overhead. Bowers et al. [5] proposed an improved framework for POR protocols that generalizes both Juels' and Shacham's work. Later, in subsequent work, Bowers et al. [10] extended the POR model to distributed systems. However, all these schemes focus on static data. The effectiveness of their schemes rests primarily on the preprocessing steps the user conducts before outsourcing the data file F. Any change to the contents of F, even of a few bits, must propagate through the error-correcting code, thus introducing significant computation and communication complexity. Ateniese et al. [6] defined the provable data possession (PDP) model for ensuring possession of files on untrusted storage. Their scheme utilizes public-key-based homomorphic tags for auditing the data file, thus providing public verifiability. However, their scheme requires computation overhead that can be expensive for an entire file. In their subsequent work, Ateniese et al.
[7] described a PDP scheme that uses only symmetric-key cryptography. This method has lower overhead than their previous scheme and allows for block updates, deletions and appends to the stored file, which is also supported in our work. However, their scheme focuses on the single-server scenario and does not address small data corruptions, leaving both the distributed scenario and the data error recovery issue unexplored. Curtmola et al. [15] aimed to ensure data possession of multiple replicas across a distributed storage system. They extended the PDP scheme to cover multiple replicas without encoding each replica separately, providing a guarantee that multiple copies of the data are actually maintained.

VIII. CONCLUSION

In this paper, we investigate the problem of data security in cloud data storage, which is essentially a distributed storage system. To achieve the assurances of cloud data integrity and availability and to enforce the quality of dependable cloud storage service for users, we propose an effective and flexible distributed scheme with explicit dynamic data support, including block update, delete, and append. We rely on erasure-correcting code in the file distribution preparation to provide redundancy parity vectors and guarantee the data dependability. By utilizing the homomorphic token with distributed verification of erasure-coded data, our scheme achieves the integration of storage correctness insurance and data error localization, i.e., whenever data corruption has been detected during the storage correctness verification across the distributed servers, we can almost guarantee the simultaneous identification of the misbehaving server(s). Considering the time, computation resources, and the related online burden of users, we also provide an extension of the proposed main scheme to support third-party auditing, where users can safely delegate the integrity checking tasks to third-party auditors and be worry-free in using the cloud storage services. Through detailed security analysis and extensive experiment results, we show that our scheme is highly efficient and resilient to Byzantine failure, malicious data modification attack and even server colluding attack.

REFERENCES

[1] C. Wang, Q. Wang, K. Ren, and W. Lou, Ensuring data storage security in cloud computing, in Proc. of IWQoS 09, July 2009, pp. 1-9.
[2] Amazon.com, Amazon Web Services (AWS), online at http://aws.amazon.com/, 2009.
[3] M. Arrington, Gmail disaster: Reports of mass email deletions, http://www.techcrunch.com/2006/12/28/gmaildisasterreports-of-mass-email-deletions/, December 2006.
[4] G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, Scalable and efficient provable data possession, in Proc. of SecureComm 08, 2008.
[5] C. Erway, A. Kupcu, C. Papamanthou, and R. Tamassia, Dynamic provable data possession, in Proc. of CCS 09.
[6] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, Auditing to keep online storage services honest, in Proc. of HotOS 07, Berkeley, CA, USA: USENIX Association, 2007, pp. 1-6.
[7] B. Krebs, Payment processor breach may be largest ever, http://voices.washingtonpost.com/securityfix/2009/01/payment_processor_breach_may_b.html, Jan. 2009.
[8] A. Juels and J. Burton S. Kaliski, PORs: Proofs of retrievability for large files, in Proc. of CCS 07, Alexandria, VA, October 2007.
[9] C. Wang, Q. Wang, K. Ren, and W. Lou, Privacy-preserving public auditing for storage security in cloud computing, in Proc. of IEEE INFOCOM 10, San Diego, CA, USA, March 2010.
[10] C. Wang, K. Ren, W. Lou, and J. Li, Towards publicly auditable secure cloud data storage services, IEEE Network Magazine, vol. 24, no. 4, pp. 19-24, 2010.
[11] R. C. Merkle, Protocols for public key cryptosystems, in Proc. of IEEE Symposium on Security and Privacy, Los Alamitos, CA.
[12] Q. Wang, K. Ren, W. Lou, and Y. Zhang, Dependable and secure sensor data storage with dynamic integrity assurance, in Proc. of IEEE INFOCOM 09, Rio de Janeiro, Brazil, April 2009.
[13] J. S. Plank, S. Simmerman, and C. D. Schuman, Jerasure: A library in C/C++ facilitating erasure coding for storage applications - Version 1.2, University of Tennessee, Tech. Rep. CS-08-627, August 2008.
[14] M. Bellare, R. Canetti, and H. Krawczyk, Keying hash functions for message authentication, in Proc. of Crypto 96, volume 1109 of LNCS, Springer-Verlag, 1996, pp. 1-15.
[15] M. Bellare, O. Goldreich, and S. Goldwasser, Incremental cryptography: The case of hashing and signing.

Increasing Data Privacy and Computation Efficiency through Linear Programming Outsourcing in Cloud Computing

Harsha N, Dr. M Siddappa
Department of CSE, Sri Siddhartha Institute of Technology, Sri Siddhartha University, Tumkur, Karnataka, India
Email: harshanarayan1989@gmail.com

Abstract - Cloud Computing is the emerging buzzword in Information Technology. Cloud Computing distributes computation tasks over a resource pool consisting of massive numbers of computers. Application systems can obtain computation strength, storage space and software services according to demand. It enables customers with limited computational resources to outsource their large computation workloads to the cloud and economically enjoy massive computational power, bandwidth, storage, and even appropriate software in a pay-per-use manner. Despite the tremendous benefits, security is the primary obstacle that prevents the wide adoption of this promising computing model, especially for customers whose confidential data are consumed and produced during the computation. To combat unauthorized information leakage, sensitive data have to be encrypted before outsourcing so as to provide end-to-end data confidentiality assurance in the cloud and beyond. However, ordinary data encryption techniques in essence prevent the cloud from performing any meaningful operation on the underlying plaintext data, making computation over encrypted data a very hard problem. On the other hand, the operational details inside the cloud are not transparent enough to customers. As a result, there exist various motivations for the cloud server to behave unfaithfully and return incorrect results, i.e., it may behave beyond the classical semi-honest model. The fully homomorphic encryption (FHE) scheme, a general result for secure computation outsourcing, has been shown viable in theory, where the computation is represented by an encrypted combinational Boolean circuit that can be evaluated with encrypted private inputs. But FHE is difficult to implement because it involves huge circuits. Focusing on engineering computing and optimization tasks, this paper investigates secure outsourcing of widely applicable linear programming (LP) computations. Linear programming is an algorithmic and computational tool. In order to achieve practical efficiency, our mechanism design explicitly decomposes the LP computation outsourcing into public LP solvers running on the cloud and private LP parameters owned by the customer. It provides a secure and practical mechanism design which fulfils input/output privacy. Using LP ensures that the use of the cloud is economically viable.

Keywords - Cloud computing, Linear programming, Mechanism, Design goals, The Complete Mechanism Description

I. INTRODUCTION

Cloud computing is a TCP/IP-based high development and integration of computer technologies such as fast microprocessors, huge memory, high-speed networks and reliable system architecture. Without standard interconnect protocols and mature data center assembly technologies, cloud computing would not have become reality either. In October 2007, IBM and Google announced a collaboration in cloud computing, and the term cloud computing became popular from then on. Besides web email, the Amazon Elastic Compute Cloud (EC2), Google App Engine and Salesforce's CRM largely represent a promising conceptual foundation of cloud services. Cloud Computing provides convenient on-demand network access to a shared pool of configurable computing resources that can be rapidly deployed with great efficiency and minimal management overhead. One fundamental advantage of the cloud paradigm is computation outsourcing, where the computational power of cloud customers is no longer limited by their resource-constrained devices.
By outsourcing workloads into the cloud, customers can enjoy literally unlimited computing resources in a pay-per-use manner, without committing any large capital outlay for the purchase of hardware and software or bearing the associated operational overhead. Despite the tremendous benefits, outsourcing computation to the commercial public cloud also deprives customers of direct control over the systems that consume and produce their data during the computation, which inevitably brings in new security concerns and challenges for this promising computing model. On the one hand, the outsourced computation workloads often contain sensitive information, such as business financial records, proprietary research data, or personally identifiable health information. To combat unauthorized information leakage, sensitive data have to be encrypted before outsourcing so as to provide end-to-end data confidentiality assurance in the cloud and beyond. However, ordinary data encryption techniques in essence prevent the cloud from performing any meaningful operation on the underlying plaintext data, making computation over encrypted data a very hard problem.

On the other hand, the operational details inside the cloud are not transparent enough to customers. As a result, there exist various motivations for the cloud server to behave unfaithfully and to return incorrect results, i.e., it may behave beyond the classical semi-honest model. For example, for computations that require a large amount of computing resources, there are huge financial incentives for the cloud to be lazy if the customers cannot tell the correctness of the output. Besides, possible software bugs, hardware failures, or even outsider attacks might also affect the quality of the computed results. Thus, we argue that the cloud is intrinsically not secure from the viewpoint of customers. Without a mechanism for secure computation outsourcing, i.e., one that protects the sensitive input and output information of the workloads and validates the integrity of the computation result, it would be hard to expect cloud customers to turn over control of their workloads from local machines to the cloud solely on the basis of its economic savings and resource flexibility. For practical consideration, such a design should further ensure that customers perform fewer operations by following the mechanism than by completing the computations themselves directly; otherwise, there is no point for customers to seek help from the cloud.

Recent research in both the cryptography and the theoretical computer science communities has made steady advances in secure outsourcing of expensive computations. Based on Yao's garbled circuits and Gentry's breakthrough work on the fully homomorphic encryption (FHE) scheme, a general result of secure computation outsourcing has been shown viable in theory, where the computation is represented by an encrypted combinational Boolean circuit that can be evaluated with encrypted private inputs. However, applying this general mechanism to everyday computations would be far from practical, due to the extremely high complexity of FHE operations as well as the pessimistic circuit sizes that cannot be handled in practice when constructing the original and encrypted circuits. This overhead of the general solutions motivates us to seek efficient solutions at higher abstraction levels than circuit representations for specific computation outsourcing problems. Although some elegant designs on secure outsourcing of scientific computations, sequence comparisons, and matrix multiplication have been proposed in the literature, it is still hardly possible to apply them directly in a practically efficient manner, especially for large problems. In those approaches, either heavy cloud-side cryptographic computations, multi-round interactive protocol executions, or huge communication complexities are involved. In short, practically efficient mechanisms with immediate practical value for secure computation outsourcing in the cloud are still missing.

Focusing on engineering computing and optimization tasks, in this paper we study practically efficient mechanisms for secure outsourcing of linear programming (LP) computations. Linear programming is an algorithmic and computational tool which captures the first-order effects of the various system parameters to be optimized, and it is essential to engineering optimization. It has been widely used in various engineering disciplines that analyze and optimize real-world systems, such as packet routing, flow control, and power management of data centers. Because LP computations require a substantial amount of computational power and usually involve confidential data, we propose to explicitly decompose the LP computation outsourcing into public LP solvers running on the cloud and private LP parameters owned by the customer. The flexibility of such a decomposition allows us to explore a higher-level abstraction of LP computations than the general circuit representation for practical efficiency. Specifically, we first formulate the private data owned by the customer for the LP problem as a set of matrices and vectors. This higher-level representation allows us to apply a set of efficient privacy-preserving problem transformation techniques, including matrix multiplication and affine mapping, to transform the original LP problem into an arbitrary-looking one while protecting the sensitive input/output information (a minimal numerical sketch of such a transformation is given below). One crucial benefit of this higher-level problem transformation method is that existing algorithms and tools for LP solvers can be directly reused by the cloud server. Although a generic mechanism defined at the circuit level can even allow the customer to hide the fact that the outsourced computation is LP, we believe that imposing this more stringent security measure than necessary would greatly affect the efficiency.
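As a minimal numerical sketch of this idea (ours, simplified from the full mechanism: we only disguise an inequality-form LP with a random invertible matrix and an affine shift, and we omit the extra randomization and the correctness proof Τ), consider:

# Minimal sketch of LP disguise by affine mapping: substitute
# x = Q*y + r with random invertible Q and random shift r. Then
#   minimize c^T x  subject to  A x <= b
# becomes
#   minimize (Q^T c)^T y  subject to  (A Q) y <= b - A r,
# which the cloud can solve with any public LP solver; the customer
# recovers x* = Q*y* + r.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Customer's private problem: minimize c^T x subject to A x <= b.
c = np.array([1.0, 2.0])
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 10.0])

# Secret key: a random invertible Q and a random shift r.
Q = rng.standard_normal((2, 2))
while abs(np.linalg.det(Q)) < 1e-3:          # ensure invertibility
    Q = rng.standard_normal((2, 2))
r = rng.standard_normal(2)

# Transformed (outsourced) problem.
c_t, A_t, b_t = Q.T @ c, A @ Q, b - A @ r

# "Cloud side": solve the disguised LP with a public solver.
res = linprog(c_t, A_ub=A_t, b_ub=b_t, bounds=(None, None))

# Customer side: map the answer back to the original variables.
x_star = Q @ res.x + r
print(x_star, c @ x_star)   # matches the optimum of the original LP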
It has been widely used in engineering disciplines that analyze and optimize real-world systems, such as packet routing, flow control, and power management of data centers. Because LP computations require a substantial amount of computational power and usually involve confidential data, we propose to explicitly decompose LP computation outsourcing into public LP solvers running on the cloud and private LP parameters owned by the customer. The flexibility of such a decomposition allows us to explore a higher-level abstraction of LP computations than the general circuit representation for practical efficiency. Specifically, we first formulate the private data owned by the customer for an LP problem as a set of matrices and vectors. This higher-level representation allows us to apply a set of efficient privacy-preserving problem transformation techniques, including matrix multiplication and affine mapping, to transform the original LP problem into some random one while protecting the sensitive input/output information. One crucial benefit of this higher-level problem transformation method is that existing algorithms and tools for LP solvers can be directly reused by the cloud server. Although the generic mechanism defined at the circuit level can even allow the customer to hide the fact that the outsourced computation is LP, we believe that imposing this more stringent security measure than necessary would greatly affect the efficiency.
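As a simple illustration of this idea, the sketch below hides the constraints of an LP by left-multiplying them with a random invertible matrix. This is only a simplified version of the transformation described here (the full scheme also conceals the objective c and the solution x through an affine mapping); the matrices and the use of SciPy's generic solver in the role of the cloud are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog  # generic public LP solver, standing in for the cloud

# Simplified transformation: for the LP  min c^T x  s.t.  A x = b, x >= 0,
# left-multiplying by a secret invertible Q gives A' = QA, b' = Qb with the
# same feasible set and optimum, while concealing A and b from the solver.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0, 1.0], [2.0, 0.5, 1.0]])
b = np.array([4.0, 3.0])
c = np.array([1.0, 2.0, 3.0])

Q = rng.random((2, 2)) + 2.0 * np.eye(2)   # secret key: invertible with high probability
A_enc, b_enc = Q @ A, Q @ b                # customer-side problem encryption

# The "cloud" solves the transformed problem with an off-the-shelf solver.
res = linprog(c, A_eq=A_enc, b_eq=b_enc, bounds=[(0, None)] * 3)

assert res.success and np.allclose(A @ res.x, b)   # answer also solves the original LP
print("optimal x:", res.x, "objective:", c @ res.x)
```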

To validate the computation result, we utilize the fact that the result comes from the cloud server solving the transformed LP problem. In particular, we explore the fundamental duality theorem, together with a piece-wise construction of an auxiliary LP problem, to derive a set of necessary and sufficient conditions that the correct result must satisfy. Such a method of result validation can be very efficient and incurs close-to-zero additional overhead on both the customer and the cloud server. With a correctly verified result, the customer can use the secret transformation to map back the desired solution to his original LP problem.

II. PROBLEM STATEMENT

A. System and Threat Model
We consider a computation outsourcing architecture involving two different entities, as illustrated in Fig. 1: the cloud customer, who has a large number of computationally expensive LP problems to be outsourced to the cloud, and the cloud server (CS), which has significant computation resources and provides utility computing services, such as hosting public LP solvers in a pay-per-use manner. The customer has a large-scale linear programming problem (to be formally defined later) to be solved. However, due to a lack of computing resources, such as processing power, memory, and storage, he cannot carry out such an expensive computation locally. Thus, the customer resorts to the CS for solving the LP computation and leverages its computation capacity in a pay-per-use manner. Instead of directly sending the original problem Ф, the customer first uses a secret K to map Ф into some encrypted version ФK and outsources ФK to the CS. The CS then uses its public LP solver to obtain the answer to ФK and provides a correctness proof Τ, but it is supposed to learn nothing, or little, of the sensitive information contained in the original problem description Ф. After receiving the solution of the encrypted problem ФK, the customer should be able to first verify the answer via the appended proof Τ. If it is correct, he then uses the secret K to map the output into the desired answer for the original problem Ф.

The security threats faced by the computation model primarily come from the malicious behavior of the CS. We assume that the CS may behave beyond honest-but-curious, i.e., beyond the semi-honest model assumed in much previous research, either because it intends to do so or because it is compromised. The CS may be persistently interested in analyzing the encrypted input sent by the customer and the encrypted output produced by the computation to learn sensitive information, as in the semi-honest model. In addition, the CS can also behave unfaithfully or intentionally sabotage the computation, e.g., lie about the result to save computing resources while hoping not to be caught. Finally, note that we assume the communication channels between each cloud server and the customer are authenticated and reliable, which can be achieved in practice with little overhead. These authentication handshakes are omitted in the following presentation.

B. Design Goals
To enable secure and practical outsourcing of LP under the aforementioned model, our mechanism design should achieve the following security and performance guarantees.
1) Correctness: Any cloud server that faithfully follows the mechanism must produce an output that can be decrypted and verified successfully by the customer.
2) Soundness: No cloud server can generate an incorrect output that can be decrypted and verified successfully by the customer with non-negligible probability.
3) Input/output privacy: No sensitive information from the customer's private data can be derived by the cloud server while performing the LP computation.
4) Efficiency: The local computation done by the customer should be substantially less than solving the original LP on his own. The computation burden on the cloud server should be comparable in time complexity to existing practical algorithms for solving LP problems.

III. PROPOSED SYSTEM

This section presents our LP outsourcing scheme, which provides a complete outsourcing solution not only for the privacy protection of the problem input/output but also for efficient result checking.

A. Mechanism Design Framework
We propose to apply problem transformation for mechanism design. The general framework is adopted from a generic approach, while our instantiation is completely different and novel. In this framework, the process on the cloud server can be represented by the algorithm ProofGen, and the process on the customer can be organized into three algorithms (KeyGen, ProbEnc, ResultDec). These four algorithms are summarized below and will be instantiated later.

KeyGen(1^k) → {K}. This is a randomized key generation algorithm which takes a system security parameter k and returns a secret key K that is later used by the customer to encrypt the target LP problem.

ProbEnc(K, Ф) → {ФK}. This algorithm encrypts the input tuple Ф into ФK with the secret key K. According to the problem transformation, the encrypted input ФK has the same form as Ф, and thus defines the problem to be solved in the cloud.

ProofGen(ФK) → {(y, Τ)}. This algorithm augments a generic solver that solves the problem ФK to produce both the output y and a proof Τ. The output y later decrypts to x, and Τ is used later by the customer to verify the correctness of y or x.

ResultDec(K, Τ, y) → {x, Ƴ}. This algorithm may choose to verify either y or x via the proof Τ. If the validation passes, a correct output x is produced by decrypting y using the secret K. The algorithm outputs the failure symbol Ƴ when the validation fails, indicating that the cloud server was not performing the computation faithfully.

Note that our proposed mechanism provides a one-time-pad type of flexibility; namely, we shall never use the same secret key K for two different problems.

B. Methodology
Secure LP outsourcing in the cloud can be represented by decomposing the LP computation into public LP solvers running on the cloud and private data owned by the customer. Because different decompositions of LP usually lead to different trade-offs between efficiency and security guarantees, choosing the one most suitable for our design goal is of critical importance. To study the differences systematically, we organize the decompositions into a hierarchy, as shown in Fig. 2, which resembles the usual way a computation is specified: a computation at a higher abstraction level is composed of computations at lower abstraction levels. As we move up to higher abstraction levels within the hierarchy, more information about the computation becomes public, so the security guarantees become weaker, but more structure becomes available and the mechanisms become more efficient. As we move down to lower abstraction levels, the structures become generic but less information is available to the cloud, so stronger security guarantees can be achieved at the cost of efficiency. Because our goal is to design practically efficient mechanisms for secure LP outsourcing, in this paper we focus on the top level of the hierarchy in Fig. 2. In other words, we propose to study problem transformation techniques that enable customers to secretly transform the original LP into some random one to achieve the secure LP outsourcing design.
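The following skeleton sketches how the four algorithms fit together, under the same simplified transformation as before. The use of SciPy's HiGHS-based solver, its eqlin.marginals field as the dual-solution proof, and the duality check c^T y = b'^T Τ are illustrative assumptions rather than the exact construction of the scheme.

```python
import numpy as np
from scipy.optimize import linprog

def key_gen(rng, m):
    """KeyGen(1^k) -> K: sample a random invertible m x m matrix as the key."""
    return rng.random((m, m)) + 2.0 * np.eye(m)

def prob_enc(K, A, b, c):
    """ProbEnc(K, Phi) -> Phi_K: same form as Phi, so any public LP solver applies."""
    return K @ A, K @ b, c

def proof_gen(A_enc, b_enc, c):
    """ProofGen(Phi_K) -> (y, T): cloud side; the dual optimal solution is the proof."""
    res = linprog(c, A_eq=A_enc, b_eq=b_enc, method="highs")  # x >= 0 is the default
    return res.x, res.eqlin.marginals  # primal answer y and dual proof T

def result_dec(K, A_enc, b_enc, c, y, T):
    """ResultDec: accept y only if it is feasible and the duality gap is zero."""
    feasible = np.allclose(A_enc @ y, b_enc) and (y >= -1e-9).all()
    zero_gap = np.isclose(c @ y, b_enc @ T)  # strong duality: c^T y = b'^T T
    if not (feasible and zero_gap):
        raise ValueError("validation failed: cloud did not compute faithfully")
    return y  # K is unused here: the simplified Phi_K leaves the solution unchanged
```

A customer would run key_gen and prob_enc locally, ship the transformed tuple to proof_gen at the cloud, and pass the reply through result_dec; only the matrix products inside prob_enc cost more than O(n^2) work on the customer side.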

IV. PERFORMANCE ANALYSIS

Theoretic Analysis:
1) Customer-Side Overhead: According to our mechanism, the customer-side computation overhead consists of key generation, problem encryption, and result verification, which correspond to the three algorithms KeyGen, ProbEnc, and ResultDec, respectively. Because KeyGen and ResultDec only require random matrix generation as well as vector-vector and matrix-vector multiplications, the computation complexity of these two algorithms is upper bounded by O(n^2). Thus, the most time-consuming operations are clearly the matrix-matrix multiplications in the problem encryption algorithm ProbEnc.
2) Server-Side Overhead: For the cloud server, the only computation overhead is to solve the encrypted LP problem ФK and to generate the result proof Τ, both of which correspond to the algorithm ProofGen. If the encrypted LP problem ФK belongs to the normal case, the cloud server simply solves it, with the dual optimal solution serving as the result proof Τ, which is usually readily available in current LP solving algorithms and incurs no additional cost for the cloud.

V. SECURITY ANALYSIS

Theorem 1: Our scheme is a correct verifiable linear programming outsourcing scheme.
Proof: The proof consists of two steps. First, we show that for any problem Ф and its encrypted version ФK, the solution y computed by an honest cloud server will always be verified successfully. This follows directly from the duality theorem of linear programming: all conditions derived from the duality theorem and the auxiliary LP problem constructions for result verification are necessary and sufficient. Next, we show that a correctly verified solution y always corresponds to the optimal solution x of the original problem Ф.

Theorem 2: Our scheme is a sound verifiable linear programming outsourcing scheme.
Proof: Similar to the correctness argument, the soundness of the proposed mechanism follows from the facts that the LP problems Ф and ФK are equivalent to each other through an affine mapping, and that all the conditions used thereafter for result verification are necessary and sufficient.

VI. CONCLUSION

In this paper, for the first time, we formalize the problem of securely outsourcing LP computations in cloud computing and provide a practical mechanism design which fulfills input/output privacy, cheating resilience, and efficiency. By explicitly decomposing LP computation outsourcing into public LP solvers and private data, our mechanism design is able to explore appropriate security/efficiency trade-offs via a higher-level abstraction of LP computation than the general circuit representation. We develop problem transformation techniques that enable customers to secretly transform the original LP into some random one while protecting the sensitive input/output information. We also investigate the duality theorem and derive a set of necessary and sufficient conditions for result verification. Such a cheating-resilience design can be bundled into the overall mechanism with close-to-zero additional overhead. Both the security analysis and the experimental results demonstrate the immediate practicality of the proposed mechanism. We plan to investigate the following future work: 1) devise robust algorithms to achieve numerical stability; 2) explore the sparsity structure of the problem for further efficiency improvement; 3) establish a formal security framework; 4) extend our result to non-linear programming computation outsourcing in the cloud.

REFERENCES
[1] P. Mell and T. Grance, "Draft NIST working definition of cloud computing," referenced on Jan. 23rd, 2010, online at http://csrc.nist.gov/groups/sns/cloud-computing/index.html, 2010.
[2] Cloud Security Alliance, "Security guidance for critical areas of focus in cloud computing," 2009, online at http://www.cloudsecurityalliance.org.
[3] C. Gentry, "Computing arbitrary functions of encrypted data," Commun. ACM, vol. 53, no. 3, pp. 97-105, 2010.
[4] Sun Microsystems, Inc., "Building customer trust in cloud computing with transparent security," 2009, online at https://www.sun.com/offers/details/sun_transparency.xml.
[5] M. J. Atallah, K. N. Pantazopoulos, J. R. Rice, and E. H. Spafford, "Secure outsourcing of scientific computations," Advances in Computers, vol. 54, pp. 216-272, 2001.
[6] S. Hohenberger and A. Lysyanskaya, "How to securely outsource cryptographic computations," in Proc. of TCC, 2005, pp. 264-282.
[7] M. J. Atallah and J. Li, "Secure outsourcing of sequence comparisons," Int. J. Inf. Sec., vol. 4, no. 4, pp. 277-287, 2005.
[8] D. Benjamin and M. J. Atallah, "Private and cheating-free outsourcing of algebraic computations," in Proc. of the 6th Conf. on Privacy, Security, and Trust (PST), 2008, pp. 240-245.

[9] R. Gennaro, C. Gentry, and B. Parno, "Non-interactive verifiable computing: Outsourcing computation to untrusted workers," in Proc. of CRYPTO 2010, Aug. 2010.

Dynamic Resource Allocation in Cloud for Parallel Data Processing

K. B. Manasa (IV sem M.Tech), N. L. UdayaKumar (Lecturer), and Dr. M. Siddappa (HOD), Dept. of CSE, SSIT, Maralur, Tumkur.
Email: mansi.harshi@gmail.com, msgraceuk@gmail.com

Abstract
Cloud computing is playing a relevant role in the evolution of information technology (IT). A considerable number of system developers are using cloud technologies to deploy and make systems available over the Internet. Clouds are an emerging class of computational platform, and several frameworks have been introduced to facilitate parallel data processing on them; such systems allow users to acquire and release resources on demand and provide ready access to data. However, the processing frameworks currently in use come from the field of cluster computing and disregard the particular nature of the cloud. Consequently, the allocated compute resources may be inadequate for large parts of a submitted job and unnecessarily increase processing time and cost. Companies providing cloud services have an increasing need to store and analyze massive data sets. Emerging applications require the ability to exploit geographically distributed resources, and there has been a dramatic increase in the amount of available computing and storage resources. Current cloud systems push much complexity onto the user, requiring the user to manage individual virtual machines and deal with many system-level concerns.

Index Terms: Many-Task Computing, High-Throughput Computing, Loosely Coupled Applications, Cloud Computing, Parallel Data Processing, Frameworks

1 Introduction
Today a growing number of companies have to process huge amounts of data in a cost-efficient manner. Scalable Internet services deliver massive amounts of data on demand to large, distributed user bases that communicate through well-defined software interfaces. Cloud computing has emerged as a new paradigm for providing access to scalable Internet services. The general goal is to provide users with the ability to program resources within a very large-scale resource cloud so that they can take advantage of the potential performance, cost, and reliability benefits. It is not possible for researchers to build, deploy, modify, instrument, or experiment with a cloud infrastructure under their own control. Some systems, such as Amazon's Elastic Compute Cloud (EC2), allow users to allocate entire virtual machines (VMs) on demand; different types of virtual machines are allocated, and the user simply pays for what is used. Classic representatives of these companies are the operators of Internet search engines, like Google, Yahoo!, or Microsoft. The vast amount of data they have to deal with every day has made traditional database solutions prohibitively expensive. Specifically, commercial cloud infrastructures take advantage of the ability to control the local resource configuration and access to large collections of potentially expensive resources. Problems like processing crawled documents, analyzing logs, or regenerating a web index are split into several independent subtasks, distributed among the available nodes, and computed in parallel. In order to simplify the development of distributed applications on top of such architectures, many of these companies have also built customized data processing frameworks in recent years. Examples are Google's MapReduce engine and Yahoo!'s Map-Reduce-Merge.
Such frameworks can be classified by terms like high-throughput computing (HTC) or many-task computing (MTC), depending on the amount of data and the number of tasks involved in the computation. For companies that only occasionally have to process large amounts of data, running their own data center is obviously not an option. Instead, cloud computing has emerged as a promising approach to rent a large IT infrastructure on a short-term, pay-per-usage basis. Operators of so-called Infrastructure-as-a-Service (IaaS) clouds, like Amazon EC2, let their customers allocate, access, and control a set of virtual machines (VMs) which run inside their data centers, and only charge them for the period of time the machines are allocated. The VMs are typically offered in different types, each type with its own characteristics (number of CPU cores, amount of main memory, etc.) and cost. Current data processing frameworks rather expect the cloud to imitate the static nature of the cluster environments they were originally designed for. For example, at the moment the types and number of VMs allocated at the beginning of a compute job cannot be changed in the course of processing. As a result, the rented resources may be inadequate for large parts of the processing job, which may lower the overall processing performance and increase the cost.

2. Data Processing Frameworks
Search engines process and manage a vast amount of data collected from the entire World Wide Web. To do this task efficiently at reasonable cost, instead of relying on generic DBMSs, they are usually built as customized parallel data processing systems and deployed on large clusters. Today's processing frameworks typically assume the resources they manage consist of a static set of homogeneous compute nodes.

Although designed to deal with individual node failures, they consider the number of available machines to be constant, especially when scheduling a processing job's execution. New classes of high-performance applications are being developed that require capabilities not available in a single computer. In order to make it easier for developers to write efficient, parallel, distributed, and resource-intensive applications, the simplest way is to exploit data parallelism for scalable performance. One of an IaaS cloud's key features is the provisioning of compute resources on demand. New VMs can be allocated at any time through a well-defined interface and become available in a matter of seconds. Machines which are no longer used can be terminated instantly, and the cloud customer is then no longer charged for them. Cloud operators like Amazon let their customers rent VMs of different types, i.e., with different computational power and different sizes of main memory and storage. Hence, the resources available in a cloud are highly dynamic and possibly heterogeneous. Supporting such a setting imposes some requirements on the design of a processing framework and the way its jobs are described. First, the scheduler of such a framework must become aware of the cloud environment in which a job will be executed. It must know about the different types of available VMs as well as their cost, and it must be able to allocate or destroy them on behalf of the cloud customer. Second, the paradigm used to describe jobs must be powerful enough to express dependencies between the different tasks a job consists of. The system must be aware of which task's output is required as another task's input; otherwise the scheduler of the processing framework cannot decide at what point in time a particular VM is no longer needed and deallocate it. The MapReduce pattern is a good example of an unsuitable paradigm: although at the end of a job only a few reducer tasks may still be running, it is not possible to shut down the idle VMs, since it is unclear whether they contain intermediate results which are still required. Finally, the scheduler of such a processing framework must be able to determine which task of a job should be executed on which type of VM. This information could be provided either externally or internally. VMs may be migrated for administrative purposes between different locations inside the data center without any notification and without any previous knowledge of the relevant network infrastructure. Cloud computing and Infrastructure as a Service (IaaS) promise a vision of boundless computation which can be tailored to exactly meet a user's need, even as that need grows or shrinks rapidly. The cloud's virtualized nature helps to enable promising new use cases for efficient parallel data processing. The major challenge we see is the cloud's opaqueness with respect to exploiting data locality: in a cluster, the compute nodes are typically interconnected through a physical high-performance network whose topology, i.e., the way the compute nodes are physically wired together, is usually well known. Current data processing frameworks offer to leverage this knowledge about the network hierarchy and attempt to schedule tasks on compute nodes so that data sent from one node to another has to traverse as few network switches as possible.

3. Resource Allocation
Using virtual machines as the provisioning unit reduces the complexity for the cloud manager.
For example, in order for a user to construct an application that can take advantage of more than a single VM, the user application needs to recognize its needs, communicate them to the cloud manager, and manage the fractured communication paradigms. Based on these observations, a new data processing framework for cloud environments is designed. This framework takes up many ideas of previous processing frameworks but refines them to better match the dynamic and opaque nature of a cloud. The new framework follows a classic master-worker pattern, as illustrated in Fig. 1.

Fig. 1. Structural overview of the processing framework

Before submitting a job, a user must start a VM in the cloud which runs the so-called Job Manager (JM). The Job Manager receives the client's jobs, schedules them, and coordinates their execution. The JM is capable of communicating with the interface the cloud operator provides to control the instantiation of VMs; this interface is called the Cloud Controller. By means of the Cloud Controller, the Job Manager can allocate or deallocate VMs according to the current job execution phase. The actual execution of the tasks a job consists of is carried out by a set of instances; the term instance type is used to differentiate between VMs with different hardware characteristics. Each instance runs a Task Manager (TM). A Task Manager receives one or more tasks from the Job Manager at a time, executes them, and then informs the Job Manager about their completion or possible errors. The Job Manager decides, depending on the job's particular tasks, how many and what type of instances the job should be executed on, and when the respective instances must be allocated or deallocated to ensure continuous but cost-efficient processing. Once all the necessary Task Managers have successfully contacted the Job Manager, it triggers the execution of the scheduled job.

3.1 Job Description in the Framework

Jobs in this new framework are expressed as a directed acyclic graph (DAG). Each vertex in the graph represents a task of the overall processing job; the graph's edges define the communication flow between these tasks. Defining a job comprises three steps: first, the user must write the program code for each task of his processing job or select it from an external library; second, the task program must be assigned to a vertex; finally, the vertices must be connected by edges to define the communication paths of the job. Tasks are expected to contain sequential code and to process so-called records, for which programmers can define arbitrary types. From a programmer's perspective, records enter and leave the task program through input or output gates; those gates can be considered the endpoints of the DAG's edges. After having specified the code for the particular tasks of the job, the user must define the DAG that connects these tasks. This DAG is also called the Job Graph. The Job Graph maps each task to a vertex and determines the communication paths between them. The number of a vertex's incoming and outgoing edges must thereby match the number of input and output gates defined inside the tasks. Figure 2 illustrates the simplest possible Job Graph: it consists of only one input, one task, and one output vertex (a sketch of this structure follows below). After having received a valid Job Graph from the user, the Job Manager transforms it into an Execution Graph.

Fig. 2. An example of a Job Graph

An Execution Graph is the primary data structure for scheduling and monitoring the execution of a job. It contains all the concrete information required to schedule and execute the received job on the cloud, explicitly modeling task parallelization and the mapping of tasks to instances. One major design goal of Job Graphs is simplicity: users should only have to describe tasks and their relationships. In order to ensure cost-efficient execution in an IaaS cloud, the processing framework allows instances to be allocated and deallocated in the course of the processing job, when some subtasks have already been completed or are already running. This just-in-time allocation can also cause problems, since there is the risk that the requested instance types are temporarily not available in the cloud. To cope with this problem, the framework separates the Execution Graph into one or more so-called Execution Stages. An Execution Stage must contain at least one Group Vertex, and its processing can only start when all the subtasks included in the preceding stages have been successfully processed. The framework scheduler ensures the following three properties for the entire job execution: first, when the processing of a stage begins, all instances required within the stage are allocated; second, all subtasks included in this stage are set up and ready to receive records; third, before the processing of a new stage begins, all intermediate results of its preceding stages are stored in a persistent manner.
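The sketch below models this Job Graph structure in a self-contained way; the class and method names are hypothetical, not those of an actual framework API.

```python
from collections import defaultdict

class JobGraph:
    """Toy Job Graph: vertices are tasks, edges are communication paths."""

    def __init__(self):
        self.edges = defaultdict(list)   # vertex -> list of downstream vertices
        self.gates = {}                  # vertex -> (input gates, output gates)

    def add_vertex(self, name, n_in, n_out):
        self.gates[name] = (n_in, n_out)

    def connect(self, src, dst):
        self.edges[src].append(dst)

    def validate(self):
        """Each vertex's in/out degree must match its declared gate counts."""
        indeg = defaultdict(int)
        for dsts in self.edges.values():
            for dst in dsts:
                indeg[dst] += 1
        for v, (n_in, n_out) in self.gates.items():
            assert indeg[v] == n_in and len(self.edges[v]) == n_out, v

# The simplest possible Job Graph from Fig. 2: input -> task -> output.
g = JobGraph()
g.add_vertex("input", 0, 1)
g.add_vertex("task", 1, 1)
g.add_vertex("output", 1, 0)
g.connect("input", "task")
g.connect("task", "output")
g.validate()   # passes: every degree matches the declared gates
```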
4. Previous Work
Cloud computing is an emerging form of distributed computing that is still in its infancy, and the term itself is often used today with a range of meanings and interpretations. The input data is usually large, and the computations have to be distributed across hundreds or thousands of machines in order to finish in a reasonable amount of time. The issues of how to parallelize the computation, distribute the data, and handle failures conspire to obscure the original simple computation with large amounts of complex code for dealing with them.

Many frameworks, such as MapReduce, Hadoop, and Map-Reduce-Merge, were designed to run data analysis jobs on large amounts of data; they are used by Google, Yahoo!, Microsoft, and others. Google's MapReduce programming model focuses mainly on supporting search-engine-related data processing and offers a simple programming interface; MapReduce is highlighted by its simplicity. Once a user has fit his program into the required map and reduce pattern, the execution framework takes care of splitting the job into subtasks, distributing them, and executing them. A single MapReduce job always consists of a distinct map and reduce program; however, several systems have been introduced to coordinate the execution of a sequence of MapReduce jobs. MapReduce has clearly been designed for large static clusters. The Map function, written by the user, takes an input key/value pair and produces a set of intermediate key/value pairs. The MapReduce library groups together all intermediate values associated with the same intermediate key and passes them to the Reduce function. The Reduce function, also written by the user, accepts an intermediate key and a set of values for that key, and merges these values together to form a possibly smaller set of values. The available compute resources are essentially considered to be a fixed set of homogeneous machines. The Map-Reduce-Merge model additionally enables processing of multiple heterogeneous datasets: in this model, the map function transforms an input key/value pair into a list of intermediate key/value pairs, the reduce function aggregates the list of values and produces a new list of values, and the merge function combines the two reduced outputs from different lineages into a list of key/value outputs.
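For concreteness, here is a toy single-process rendering of that MapReduce data flow (word counting); the real frameworks distribute the same three steps, mapping, grouping by key, and reducing, across a cluster.

```python
from itertools import groupby
from operator import itemgetter

def map_fn(_key, line):
    """User-written Map: emit an intermediate (word, 1) pair per word."""
    for word in line.split():
        yield word, 1

def reduce_fn(key, values):
    """User-written Reduce: merge all values for one intermediate key."""
    yield key, sum(values)

def map_reduce(records):
    # Map phase over all input records.
    intermediate = [kv for rec in records for kv in map_fn(*rec)]
    # The "library" groups intermediate values by their key.
    intermediate.sort(key=itemgetter(0))
    for key, group in groupby(intermediate, key=itemgetter(0)):
        yield from reduce_fn(key, (v for _, v in group))

print(dict(map_reduce([(0, "a b a"), (1, "b c")])))  # {'a': 2, 'b': 2, 'c': 1}
```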

The Hadoop framework can work directly with any distributed file system that can be mounted by the underlying OS; however, Hadoop needs to know which servers are closest to the data. Hadoop MapReduce is a large-scale, open-source software framework dedicated to scalable, distributed, data-intensive computing. The framework breaks large data up into smaller parallelizable chunks and handles the scheduling. Dryad also runs DAG-based jobs and offers to connect the involved tasks through file, network, or in-memory channels; however, it assumes an execution environment consisting of a fixed set of homogeneous worker nodes. The Dryad scheduler is designed to distribute tasks across the available compute nodes in a way that optimizes the throughput of the overall cluster, and it does not include a notion of processing cost for particular jobs. The processing framework presented here instead allows each task to be executed on its own instance type, so the characteristics of the requested VMs can be adapted to the demands of the current processing phase. Before processing a new Execution Stage, the scheduler collects all Execution Instances from that stage and tries to replace them with matching cloud instances; if all required instances can be allocated, the subtasks are distributed among them and set up for execution.

5 Conclusion
In recent years, several frameworks have been introduced to facilitate massively parallel data processing on shared architectures like compute clouds. While these frameworks generally offer good support in terms of task deployment and fault tolerance, they provide only poor assistance in finding reasonable degrees of parallelization for the tasks to be executed. The framework described here is the first data processing framework to exploit the dynamic resource provisioning offered by today's IaaS clouds. It facilitates new ways of data processing in clouds, represents an important contribution to the growing field of cloud computing, and points out exciting new opportunities in the field of parallel data processing.

ACKNOWLEDGMENT
The satisfaction and euphoria that accompany the successful completion of any task would be incomplete without mention of the people who made it possible and whose support was a constant source of encouragement that crowned my efforts with success. My special gratitude to Dr. M. Siddappa, HOD, Department of CS&E, S.S.I.T, Tumkur, for his guidance, constant encouragement, and wholehearted support. My sincere thanks to my guide N. L. UdayaKumar, Lecturer, Department of CS&E, S.S.I.T, Tumkur, for his guidance, constant encouragement, and wholehearted support. My sincere thanks to my company guide Vidhyavathi, Knowx Pvt. Ltd., Bangalore, for her guidance, encouragement, and wholehearted support.

References
[1] Amazon Web Services LLC. Amazon Elastic Compute Cloud (Amazon EC2). http://aws.amazon.com/ec2/, 2009.
[2] Amazon Web Services LLC. Amazon Elastic MapReduce. http://aws.amazon.com/elasticmapreduce/, 2009.
[3] Amazon Web Services LLC. Amazon Simple Storage Service. http://aws.amazon.com/s3/, 2009.
[4] D. Battré, S. Ewen, F. Hueske, O. Kao, V. Markl, and D. Warneke. Nephele/PACTs: A Programming Model and Execution Framework for Web-Scale Analytical Processing. In SoCC '10: Proceedings of the ACM Symposium on Cloud Computing 2010, pages 119-130, New York, NY, USA, 2010. ACM.
[5] The Apache Software Foundation. Welcome to Hadoop! http://hadoop.apache.org/, 2009.
[6] D. Warneke and O. Kao. Nephele: Efficient Parallel Data Processing in the Cloud.
In MTAGS '09: Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, pages 1-10, New York, NY, USA, 2009. ACM.
[7] H.-c. Yang, A. Dasdan, R.-L. Hsiao, and D. S. Parker. Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters. In SIGMOD '07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 1029-1040, New York, NY, USA, 2007. ACM.
[8] R. Chaiken, B. Jenkins, P.-A. Larson, B. Ramsey, D. Shakib, S. Weaver, and J. Zhou. SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets. PVLDB, 1(2):1265-1276, 2008.
[9] Ralf Lämmel, Data Programmability Team, Microsoft Corp. Google's MapReduce Programming Model. Redmond, WA, USA.
[10] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Google, Inc.
[11] Hung-chih Yang, Ali Dasdan (Yahoo!, Sunnyvale, CA, USA), Ruey-Lung Hsiao, and D. Stott Parker (Computer Science Department, UCLA, Los Angeles, CA, USA). Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters.

Dynamic Load Sharing Multicast Algorithms on Cloud for Data-Intensive Applications

Suhasini (IV sem M.Tech), N. L. Udayakumar (Lecturer), and Dr. M. Siddappa (HOD), Department of CSE, S.S.I.T, Maralur, Tumkur.
Email: suha.suchi16@gmail.com, msgraceuk@gmail.com

Abstract
Data-intensive parallel applications analyse and process very large data sets. This huge amount of data is stored in cloud storage and needs to be distributed to many computation nodes as fast as possible. Distributing data from storage to all nodes is essentially a multicast operation. The simple solution is to let all nodes download the same data directly from the storage service, but that can easily become a performance bottleneck because the download time can exceed the total execution time of an application. Another approach is to construct spanning trees based on network topology and monitoring data. Neither approach delivers optimal performance. This paper presents multicast algorithms, focusing mainly on the Amazon EC2/S3 platform, that efficiently transfer large data sets from storage to all nodes. The algorithms construct an overlay network for exchanging the data after downloading, optimize the throughput dynamically, and increase the download throughput; the data to be downloaded is divided among the clients and then exchanged over the overlay network.

Index Terms: Cloud Computing, Multicasting, Overlay Network, Amazon EC2/S3, Data-Intensive Applications

I. Introduction
Cloud computing is a technology that uses the Internet and central remote servers to maintain data and applications; it relies on third-party services (web services) to meet computing needs, with the "cloud" depicting the Internet. Cloud computing platforms such as Amazon Web Services [1], Windows Azure [2], and others provide great computational power that can be easily accessed via the Internet at any time, without the overhead of managing large computational infrastructures. Traditionally, high-performance computing (HPC) applications have been run on distributed parallel computers such as supercomputers and large cluster computers. Data-intensive HPC applications in particular, which explore, query, analyze, visualize, and, in general, process very large data sets, such as BLAST [3], require very large computational and storage resources. For example, the ATLAS experiment [4], which searches for new discoveries in the head-on collisions of protons of extraordinarily high energy, will generate more than a petabyte of data per year. A single organization therefore faces difficulties in managing and providing the resources to store and process such huge amounts of data. Clouds help overcome these difficulties: their large pools of storage and computational resources, high accessibility, reliability, and simple cost model make them very attractive for such applications. Parallel HPC applications need to store and distribute large amounts of data to all compute nodes before or during a run. In a cloud, these data are typically stored in a separate storage service, and distributing them from this storage service to all compute nodes is essentially a multicast operation. The simple solution is to let all nodes download the same data directly from the storage service, but that can easily become a performance bottleneck when the downloading time exceeds the total execution time of an application.
Many multicast algorithms have been developed, such as collective communication algorithms for cluster and grid environments, but these approaches make different assumptions and settings depending on the target environment. This paper presents two efficient algorithms to distribute large amounts of data within clouds. The proposed algorithms use ideas from the multicast algorithms of parallel distributed systems and P2P systems to achieve high performance and scalability with respect to the number of nodes and the amount of data. Each algorithm first divides the data to be downloaded from the cloud storage service over all nodes, and then exchanges the data through a mesh overlay network.

II. Previous Multicast Methods
This section considers various multicast methods for large amounts of data on parallel distributed systems and P2P networks.

Multicast on parallel distributed systems: For parallel distributed systems, such as clusters and grids, optimization of multicast communication has been researched in message-passing systems like MPI and their collective operation algorithms [5][6].

In these techniques the target applications are HPC workloads, so the main focus is on performing the multicast operation as fast as possible. These systems make several assumptions: network performance is high and stable, the network topology does not change, and the available bandwidth between nodes is symmetric. Based on these assumptions, optimized multicast algorithms generally construct one or more optimized spanning trees by using network topology information and other monitoring data [7]. The data is then forwarded along these spanning trees from the root node to all others; these multicast techniques are therefore sender-driven. For large amounts of data, some optimization algorithms try to construct multiple spanning trees that maximize the available bandwidth of the nodes.

Fig. 1. Constructing spanning trees on parallel distributed systems

Overlay multicast on P2P systems: This target environment makes the following assumptions: 1) network performance is very dynamic; 2) nodes can join and leave at will; 3) the available bandwidth between nodes can be asymmetric. In this technique, the data to be multicast is divided into small pieces that are exchanged with a few neighbour nodes. All nodes tell their neighbours which pieces they have and request the pieces they lack. Multicast communication in P2P networks is therefore receiver-driven.

Fig. 2. Multicast on P2P systems

III. Platform Characteristics
Here the target cloud environment is Amazon EC2/S3. Amazon Elastic Compute Cloud (EC2) and Simple Storage Service (S3) are cloud services provided by Amazon Web Services (AWS) [1]. To use this service, a user first selects a pre-configured template image to get up and running immediately, or creates an Amazon Machine Image (AMI) containing his applications, libraries, data, and associated configuration settings. The user configures security and network access on the Amazon EC2 instance, chooses an instance type and operating system, starts, terminates, and monitors as many instances of the AMI as needed, selects where to run the instances, and pays only for the resources actually consumed, such as instance-hours or data transfer. Amazon S3 is a cloud storage service that can be accessed via the Internet. Files can be uploaded and downloaded via standard GET, PUT, and DELETE commands over HTTP or HTTPS that are sent through a REST or a SOAP API. S3 stores files as objects in a unique namespace, called a bucket. Buckets have to be created before putting objects into S3; they have a location and an arbitrary but globally unique name. The size of an object in a bucket can currently range from 1 byte to 5 gigabytes. The S3 API allows users to access a whole object or a specific byte range of it; for example, one could access bytes 10 to 100 of an object with a size of 200 bytes.
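Such ranged downloads are what let each node fetch only its assigned pieces. A minimal sketch using the AWS SDK for Python, with a hypothetical bucket and key, might look as follows.

```python
import boto3  # AWS SDK for Python; bucket and key names below are hypothetical

s3 = boto3.client("s3")

# Fetch only bytes 10..100 of an object via an HTTP Range header, as in the
# example above; the multicast algorithms below rely on such ranged GETs so
# that each node downloads only its assigned range of pieces.
resp = s3.get_object(Bucket="my-bucket", Key="input.dat", Range="bytes=10-100")
piece = resp["Body"].read()   # 91 bytes, since the byte range is inclusive
```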

IV. Proposed Multicast Algorithms for Clouds
Here we first consider the requirements that need to be fulfilled by the multicast operations and then explain how the proposed algorithms work.

A. Requirements
Maximized utilization of the available aggregate download throughput from cloud storage: the available throughput should scale with the number of nodes, so the multicast algorithm should achieve maximum utilization of the available aggregate download throughput.
Minimization of the multicast completion time of each node: if the nodes finish receiving the data at different times, they cannot all start computing simultaneously, which results in a resource-underutilization problem; to overcome this, all nodes should complete the multicast as nearly simultaneously as possible.
Non-dependence on network throughput monitoring or topology estimation: previous algorithms usually work by monitoring network throughput or estimating the network topology. This approach is undesirable in clouds, because the underlying physical network topology and the activity of other users are generally unknown.

B. Proposed algorithms: load sharing and without sharing
The multicast algorithm proposed by van de Geijn et al. [3] is a well-known algorithm for cluster and multi-cluster environments. It achieves high-performance multicast operations and is often used in efficient MPI collectives implementations [5][6]. The algorithm consists of two phases: (1) the scatter phase and (2) the allgather phase. In the scatter phase, the root node divides the data to be multicast into blocks of equal size, depending on the number of nodes; these blocks are then sent to each corresponding node using a binomial tree. After all the nodes have received their blocks, they start the allgather phase, in which the missing blocks are exchanged and collected using the recursive doubling technique. The proposed algorithms are inspired by this algorithm and likewise consist of a scatter phase and an allgather phase. All nodes cooperate to download and forward data from S3 to each EC2 node; initially, none of the nodes has any part of the data stored in S3, so S3 corresponds to the multicast root node.

Phase 1 (load sharing): the file to distribute is logically divided into P fixed-size pieces, e.g., 32 KB each, numbered from 0 to P-1. When the number of nodes is N, each node i is assigned a contiguous range of roughly P/N pieces (Equation 1), which it downloads from S3. When node i has finished downloading its pieces, it asks the other nodes whether they have any work remaining and reports its own download throughput Bi for the download just completed. Now assume that node j has W remaining pieces and its download throughput is currently Bj. Node j then divides W into Wi and Wj in proportion to the reported throughputs, and hands the Wi pieces over to node i. Nodes i and j can then concurrently download Wi and Wj pieces, respectively; hence, the amount of work each node downloads is proportional to its download bandwidth.

Fig. 3. Scatter phase (load sharing)

Phase 2 (load sharing): after constructing a full overlay network between all the nodes, each node continuously exchanges information with its neighbours in the mesh about which pieces they have already obtained, and fetches missing pieces from them until all pieces are downloaded.

Fig. 4. Allgather phase (load sharing)

The other algorithm works without sharing the load of other nodes: once a node finishes downloading its assigned range of pieces, it waits until all the other nodes have finished too, and only then exchanges information with its neighbours in the mesh.

Phase 1 (without sharing): the file to distribute is logically split into P equally sized pieces, and each node i is assigned a range of pieces as in Equation 1.

Node i then downloads all the pieces in its range from S3. Once a node finishes downloading the assigned range of pieces, it waits until all the other nodes have finished too.

Fig. 5. Scatter phase (without sharing)

Phase 2 (without sharing): similar to the load-sharing variant, each node exchanges pieces within EC2 through the overlay network until all the pieces have been obtained.

Fig. 6. Allgather phase (without sharing)

As an example, consider three nodes (A, B, and C) that download the same 300 MB file from S3. Node B has a fast connection to S3 (10 MB/sec), while A and C have a slow connection to S3 (2 MB/sec). The file will first be logically split into 9600 pieces of 32 KB each (9600 * 32 KB = 300 MB). Initially, each node requests its assigned 100 MB from S3 (i.e., nodes A, B, and C request pieces 0-3199, 3200-6399, and 6400-9599, respectively). After approximately 10 seconds, node B will finish downloading its range. Nodes A and C, on the other hand, achieve slower throughput and will finish after 50 seconds. Since all nodes wait until everybody has finished, the total completion time of phase 1 in the without-sharing variant is 50 seconds.
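The piece-range bookkeeping from this example, and the bandwidth-proportional split used by the load-sharing variant, can be sketched as follows; the helper names are hypothetical, and the proportional formula is inferred from the statement above that the downloaded work is proportional to download bandwidth.

```python
def assign_ranges(P, N):
    """Equation 1 (as reconstructed): split P pieces into N contiguous,
    near-equal ranges, one per node."""
    base, extra = divmod(P, N)
    ranges, start = [], 0
    for i in range(N):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))   # inclusive piece indices
        start += size
    return ranges

def share_work(W, B_i, B_j):
    """Load sharing: split W remaining pieces in proportion to the reported
    download throughputs B_i and B_j."""
    W_i = round(W * B_i / (B_i + B_j))
    return W_i, W - W_i

# The example above: 9600 pieces of 32 KB across three nodes.
print(assign_ranges(9600, 3))     # [(0, 3199), (3200, 6399), (6400, 9599)]
print(share_work(1000, 10, 2))    # (833, 167): the faster node takes ~5/6
```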
V. Conclusion
In this paper we have summarized different multicast algorithms for distributed parallel and P2P systems. We focused mainly on Amazon EC2/S3, one of the most commonly used cloud platforms, and discussed the multicast performance problems that arise there when using the simplest algorithm, in which all EC2 compute nodes download files from S3 directly. Based on these findings, we have presented two high-performance multicast algorithms. These algorithms make it possible to transfer large amounts of data stored in S3 to multiple EC2 nodes efficiently. The proposed algorithms combine optimization ideas from the multicast algorithms of parallel distributed systems and P2P systems, and they have three salient features: (1) they can construct an overlay network on clouds without network topology information; (2) they can optimize the total throughput between compute nodes dynamically; and (3) they can increase the download throughput from cloud storage by letting nodes cooperate with each other.

Acknowledgements
The successful completion of any task would be incomplete without mention of the people who made it possible and whose support was a constant source of encouragement that crowned my efforts with success. My special gratitude to Dr. M. Siddappa, HOD, Department of CS&E, S.S.I.T, for his guidance, constant encouragement, and wholehearted support. My sincere thanks to my guide N. L. Udayakumar, Lecturer, Department of CSE, S.S.I.T, and to my company guide Vidyavathi, Knowx India, Bangalore.

References
[1] Amazon Web Services, http://aws.amazon.com/.
[2] Windows Azure, http://www.microsoft.com/windowsazure/.
[3] M. Barnett, L. Shuler, S. Gupta, D. G. Payne, R. A. van de Geijn, and J. Watts, "Building a high-performance collective communication library," in Supercomputing, 1994, pp. 107-116. Online: http://citeseer.ist.psu.edu/140591.html
[4] A Toroidal LHC ApparatuS Project (ATLAS), http://atlas.web.cern.ch/.
[5] M. Matsuda, T. Kudoh, Y. Kodama, R. Takano, and Y. Ishikawa, "Efficient MPI collective operations for clusters in long-and-fast networks," in IEEE International Conference on Cluster Computing (Cluster 2006), 2006.
[6] R. Thakur, R. Rabenseifner, and W. Gropp, "Optimization of collective communication operations in MPICH," International Journal of High Performance Computer Applications, vol. 19, no. 1, 2005, pp. 49-66.
[7] M. den Burger, T. Kielmann, and H. E. Bal, "Balanced multicasting: High-throughput communication for grid applications," in SC '05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, 2005.
[8] E. Walker, "Benchmarking Amazon EC2 for high-performance scientific computing," ;login:, pp. 18-23, October 2008.
[9] S. L. Garfinkel, "An evaluation of Amazon's grid computing services: EC2, S3 and SQS," Center for Research on Computation and Society, School for Engineering and Applied Sciences, Harvard University, Tech. Rep., 2007.

[10] M. R. Palankar, A. Iamnitchi, M. Ripeanu, and S. Garfinkel, "Amazon S3 for science grids: a viable solution?" in DADC '08: Proceedings of the 2008 International Workshop on Data-Aware Distributed Computing. New York, NY, USA: ACM, 2008, pp. 55-64.

HORNS: A Homomorphic Encryption Scheme for Cloud Computing using the Residue Number System

Arun Kumbi (M.Tech 3rd sem, Software Engineering) and Anasuya Prakash (Assistant Professor), Dept. of CSE, EPCET, Bangalore, Karnataka.
Email: arun_kumbi@rediffmail.com, anuprama@rediffmail.com

Abstract
In this paper, we propose a homomorphic encryption scheme using the Residue Number System (RNS). In this scheme, a secret is split into multiple shares on which computations can be performed independently. Security is enhanced by not allowing the independent clouds to collude. Efficiency is achieved through the use of smaller shares.

I. INTRODUCTION
The Residue Number System (RNS) is a well-known and well-studied number-theoretic system [5]. RNS has been used to achieve performance improvements, as the arithmetic involves smaller numbers and can be done in parallel. An RNS is defined in terms of a set of relatively prime moduli. The primary application of homomorphic encryption is in the field of cloud computing. In this setup, the cloud, which is untrusted, is given the task of computing on a client's confidential data. The client can protect its confidential data from the untrusted cloud if it can encrypt the data using a homomorphic encryption function and use the cloud to compute on the encrypted data. RNS creates multiple shares of a datum, and the operations on these shares are homomorphic; these two properties of RNS can be used to design a homomorphic encryption function for cloud computing. The applications of RNS so far have been in the fields of computer arithmetic and digital signal processing. In this paper we identify the underlying research issues that must be addressed to design a homomorphic encryption function using RNS. The research so far on RNS has focused on improving performance, while we focus on security; hence, the solutions we reach are different from the previous ones.

II. RESEARCH ISSUES IN HORNS
The issues that need to be addressed to apply RNS for homomorphic computations are overflow and sign detection. In order to apply RNS for encryption, the issues of confidentiality, integrity, and cloud collusion need to be addressed. These research issues in HORNS will be illustrated by the following example. Let P = {p1, p2, ..., pn} define an RNS and let MP be its range, such that -MP/2 < a, b < MP/2. In the residue class Z_MP, the range [0, MP/2) represents the positive numbers and the range [MP/2, MP) represents the negative numbers. The client requests the cloud to perform modular additions over pi on the individual shares, cpi = api + bpi mod pi, independently. The client then reconstructs c from the n shares cpi it receives from the cloud. The issues with respect to the above example are:
Overflow and Sign Detection: If a, b >= MP/4, then the result c = a + b >= MP/2, which would imply that the result is a negative number while it is not.
Confidentiality and Cloud Collusion: The cloud is given access to the data shares api, bpi and the modulus pi. The client can employ several mechanisms so that the cloud does not get access to the whole moduli set P. For example, the client can partition the shares and provide different partitions to different clouds, as shown in Figure 1; in such a scenario, all the clouds have to collude to reconstruct the moduli set P. Another approach, as shown in Figure 2, is that the client can itself perform the computations with respect to some partitions. In any case, a cloud should not be given access to all the moduli in P.
Even with such a restriction, confidentiality cannot be fully guaranteed.
Integrity: The cloud can provide a random result c without performing the actual addition a + b. In other words, the client should be able to detect that the result does not correspond to the actual computation requested.

III. POSSIBLE SOLUTIONS
The following are the solutions we propose for the research issues presented in Section II.

A. Overflow and Sign Detection
Redundant RNS can be used for overflow and sign detection. The idea is to do the computation in multiple RN systems and compare the results. For example, let RNSp and RNSq be two different RN systems used for the computation, and let Yp and Yq be the converted results of the RNS computation in RNSp and RNSq, respectively. Then the results are valid only if Yp = Yq or Mp - Yp = Mq - Yq. This solution can detect that an overflow has occurred but cannot correct it. While earlier research has focused on improving performance [3] by sharing the moduli between the redundant systems, we propose to increase the redundancy by not sharing the moduli; this can increase the security, as will be shown later.
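A minimal sketch of this share arithmetic and the redundant check, assuming small illustrative moduli and using Chinese Remainder Theorem reconstruction on the client side (the moduli values and helper names are hypothetical):

```python
from math import prod

def split(a, moduli):
    """Create one share of a per modulus (the client's encoding step)."""
    return [a % p for p in moduli]

def add_shares(sa, sb, moduli):
    """Cloud-side operation: add corresponding shares modulo each p."""
    return [(x + y) % p for x, y, p in zip(sa, sb, moduli)]

def crt(shares, moduli):
    """Chinese Remainder Theorem reconstruction on the client."""
    M = prod(moduli)
    x = 0
    for r, p in zip(shares, moduli):
        Mp = M // p
        x += r * Mp * pow(Mp, -1, p)   # pow(Mp, -1, p): modular inverse of Mp mod p
    return x % M

P = [11, 13, 17, 19]   # pairwise coprime moduli; range M_P = 46189
Q = [23, 29, 31]       # disjoint redundant system, as proposed above
a, b = 1234, 5678

# The clouds add share-wise and independently in both systems.
c_p = crt(add_shares(split(a, P), split(b, P), P), P)
c_q = crt(add_shares(split(a, Q), split(b, Q), Q), Q)
assert c_p == c_q == a + b   # agreement between the systems signals no overflow
```

Had a + b exceeded the range of either system, the two reconstructions would disagree, flagging the overflow without revealing the moduli of one system to the clouds holding the other.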

B. Confidentiality
1) Data Confidentiality: Data confidentiality requires that the shares of the data a, (ap1, ap2, ap3, ..., apn), in the RNS defined by the set of moduli P = {p1, p2, ..., pn} with range MP = p1 p2 ... pn, should reveal no information about the data a. For example, if a < p1, then the adversary can infer a from ap1 = a; in other words, with probability p1/MP the adversary can infer a from ap1 (assuming a uniform distribution of the data). In order to mask this information, confusion needs to be added to the data.
2) Modulus Confidentiality: In order to do computations in RNS, the moduli have to be provided to the cloud. But this can in turn reduce the security of the system, as the cloud can infer the range MP if it can acquire all the moduli of the RNS by some means. To prevent such a possibility, we want to design HORNS in such a way that the cloud is able to operate on the data without having to know the actual modulus. This is similar to the data confidentiality requirement and hence can be achieved in a similar way, by adding confusion to the modulus.

C. Cloud Collusion
In this section, we devise a strategy for allocating a subset of the moduli to each cloud in such a way that the impact on security due to collusion is minimized. Let P = {p1, p2, ..., pn} be the moduli set with pi > SnP, where SnP is the minimum size of a modulus, and let MP be the range of this RNS. Let k be the number of moduli given to a cloud for execution. Thus, if two clouds collude, the number of moduli they can gather is not 2k; instead, it is less than 2k. Of course, if all clouds collude, they will have all the moduli required to break the RNS system.
RNS Coding: The protected computation of a program P is threaded into k threads. All the key variables to be protected are split into k residues using the RNS schema of HORNS. Each thread has identical control flow but different data; in fact, the data-parallel model of the NVIDIA Fermi GPU, CUDA, works well for such a schema. In GPU terminology, each original thread becomes a k-way warp. The threads within a warp are forced to sync either for Montgomery reduction, for HORNS moduli reduction, or for a branch.
Threading: These k threads could be merged into a single thread for a classical computing model, or they could be assigned to k separate cores. If the OS vulnerabilities are per processor rather than per core, then these threads could be scheduled on different nodes for better security.
Threading control: The preceding discussion highlights the fact that in a secure cloud environment, some of the thread-scheduling flexibility must be given to the client. A client should be able to specify scheduling parameters loosely. How this specification should be incorporated into cloud protocols, and what degree of scheduling specification can be entrusted to the client, is a topic for cloud computing research.
Root of trust at the cloud: It would be more efficient to have a trusted node at the cloud. How such a node would be rooted in trust is still an open question.
Validation: There are many possible validation mechanisms. To name a few: the redundancy schema of Section III-A could deploy multiple moduli sets for each thread, and the results from all the k threads could be validated at predetermined validation points by a trusted processor. Along a similar validation schema, a (k + 1)st hidden modulus could be selected.
IV. CONCLUSIONS

In this paper, we have proposed a novel homomorphic encryption scheme using RNS for cloud computing. We have identified various research issues involved in HORNS and proposed solutions for some of these issues. Our future work will expand on these solutions and quantify the security of HORNS.

REFERENCES

[1] Jean-Claude Bajard, Laurent-Stéphane Didier, and Peter Kornerup. An RNS Montgomery modular multiplication algorithm. IEEE Transactions on Computers, 47(7):766-776, 1998.
[2] Craig Gentry. Computing arbitrary functions of encrypted data. Commun. ACM, 53(3):97-105, 2010.
[3] R. T. Gregory and D. W. Matula. Base conversion in residue number systems. In Residue number system arithmetic: modern applications in digital signal processing, pages 22-30, 1986.
[4] Peter L. Montgomery. Modular multiplication without trial division. Mathematics of Computation, 44(170):519-521, April 1985.
[5] Michael A. Soderstrand, W. Kenneth Jenkins, Graham A. Jullien, and Fred J. Taylor, editors. Residue number system arithmetic: modern applications in digital signal processing. IEEE Press, Piscataway, NJ, USA, 1986.
[6] Trusted Computing Group. TPM Main Specification Level 2 Version 1.2, Revision 103.

Data Secure and Dependable Storage Services in Cloud Computing

Ajay Kumara M A, Mr. Sharavana K
M.Tech, Dept. of CSE (PG); Assistant Professor, Dept. of CSE
M V J College of Engineering, Bangalore, V.T.U, Karnataka
Email: ajaykumar.ak99@gmail.com, sharatanuj@gmail.com

Abstract - Cloud storage enables users to remotely store their data and enjoy on-demand, high-quality cloud applications without the burden of local hardware and software management. However, users no longer have physical possession of their outsourced data, which inevitably poses new security risks towards the correctness of the data in the cloud. In order to address this new problem and further achieve a secure and dependable cloud storage service, we propose in this paper a flexible distributed storage integrity auditing mechanism, utilizing the homomorphic token and distributed erasure-coded data. The proposed design allows users to audit the cloud storage with very lightweight communication and computation cost. The auditing result not only ensures a strong cloud storage correctness guarantee, but also simultaneously achieves fast data error localization, i.e., the identification of the misbehaving server. Since cloud data are dynamic in nature, the proposed design further supports secure and efficient dynamic operations on outsourced data, including block modification, deletion, and append. Analysis shows the proposed scheme is highly efficient and resilient against Byzantine failure, malicious data modification attack, and even server colluding attacks.

Index Terms - Data integrity, dependable distributed storage, error localization, data dynamics, Cloud Computing

I. INTRODUCTION

Several trends are opening up the era of cloud computing, an Internet-based development and use of computer technology. The ever cheaper and more powerful processors, together with the software as a service (SaaS) computing architecture, are transforming data centers into pools of computing service on a huge scale. The increasing network bandwidth and reliable yet flexible network connections make it even possible that users can now subscribe to high-quality services from data and software that reside solely on remote data centers. Moving data into the cloud offers great convenience to users, since they do not have to care about the complexities of direct hardware management. Among Cloud Computing vendors, Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) [2] are both well-known examples. Internet-based online services do provide huge amounts of storage space and customizable computing resources. This computing platform shift, however, is eliminating the responsibility of local machines for data maintenance at the same time. As a result, users are at the mercy of their cloud service providers for the availability and integrity of their data. On the one hand, although cloud infrastructures are much more powerful and reliable than personal computing devices, a broad range of both internal and external threats to data integrity still exists. Examples of outages and data loss incidents of cloud storage services appear from time to time [2]. On the other hand, since users may not retain a local copy of outsourced data, there exist various incentives for cloud service providers (CSP) to behave unfaithfully towards the cloud users regarding the status of their outsourced data.
For example, to increase the profit margin by reducing cost, it is possible for a CSP to discard rarely accessed data without being detected in a timely fashion [3]. Similarly, the CSP may even attempt to hide data loss incidents so as to maintain a reputation. Therefore, although outsourcing data into the cloud is economically attractive for the cost and complexity of long-term large-scale data storage, the lack of a strong assurance of data integrity and availability may impede its wide adoption by both enterprise and individual cloud users. In order to achieve these assurances and enforce the quality of cloud storage service, efficient methods that enable on-demand data correctness verification on behalf of cloud users have to be designed. However, the fact that users no longer have physical possession of data in the cloud prevents the direct adoption of traditional cryptographic primitives for the purpose of data integrity protection. Hence, the verification of cloud storage correctness must be conducted without explicit knowledge of the whole data files. Meanwhile, cloud storage is not just a third-party data warehouse: the data stored in the cloud may not only be accessed but also be frequently updated by the users [4], including insertion, deletion, modification, appending, etc.

The deployment of Cloud Computing is powered by data centers running in a simultaneous, cooperated and distributed manner. It is therefore advantageous for individual users to store their data redundantly across multiple physical servers so as to reduce the data integrity and availability threats. Thus, distributed protocols for storage correctness assurance will be of most importance in achieving robust and secure cloud storage systems.

II. PROPOSED SYSTEM

In this paper, we propose an effective and flexible distributed storage verification scheme with explicit dynamic data support to ensure the correctness and availability of users' data in the cloud. We rely on erasure-correcting code in the file distribution preparation to provide redundancies and guarantee the data dependability against Byzantine servers [6], where a storage server may fail in arbitrary ways. This construction drastically reduces the communication and storage overhead as compared to traditional replication-based file distribution techniques. By utilizing the homomorphic token with distributed verification of erasure-coded data, our scheme achieves storage correctness insurance as well as data error localization: whenever data corruption has been detected during the storage correctness verification, our scheme can almost guarantee the simultaneous localization of data errors, i.e., the identification of the misbehaving server(s). In order to strike a good balance between error resilience and data dynamics, we further explore the algebraic property of our token computation and erasure-coded data, and demonstrate how to efficiently support dynamic operations on data blocks while maintaining the same level of storage correctness assurance. In order to save the time, computation resources, and even the related online burden of users, we also provide an extension of the proposed main scheme to support third-party auditing, where users can safely delegate the integrity checking tasks to third-party auditors and be worry-free in using the cloud storage services. Our work is among the first few in this field to consider distributed data storage security in Cloud Computing.

The rest of the paper is organized as follows. Section II introduces the proposed system, Section III the existing system, Section IV the problem statement, Section V the design goals, and Section VI the mechanism for ensuring cloud data storage. Section VII surveys related work; finally, the last section concludes the whole paper.

III. EXISTING SYSTEM

Cloud computing has been envisioned as the next-generation architecture of the IT enterprise due to its long list of unprecedented advantages: on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid resource elasticity, usage-based pricing, and transference of risk. One fundamental aspect of this new computing model is that data is being centralized or outsourced into the cloud. From the data owners' perspective, including both individuals and IT enterprises, storing data remotely in a cloud in a flexible, on-demand manner brings appealing benefits: relief from the burden of storage management, universal data access independent of geographical location, and avoidance of capital expenditure on hardware, software, and personnel maintenance.

Our contribution can be summarized as the following three aspects:
1) Compared to many of its predecessors, which only provide binary results about the storage status across the distributed servers, the proposed scheme achieves the integration of storage correctness insurance and data error localization, i.e., the identification of misbehaving server(s).
2) Unlike most prior works for ensuring remote data integrity, the new scheme further supports secure and efficient dynamic operations on data blocks, including update, delete and append.
3) The experiment results demonstrate that the proposed scheme is highly efficient. Extensive security analysis shows our scheme is resilient against Byzantine failure, malicious data modification attack, and even server colluding attacks.

IV. PROBLEM STATEMENT

A. System Model

Representative network architecture for the cloud storage service is illustrated in Fig. 1. Three different network entities can be identified as follows:

User: an entity who has data to be stored in the cloud and relies on the cloud for data storage and computation; can be either an enterprise or an individual customer. The user retrieves cloud services from the CSP and has a message broadcast channel for communicating requests and responses between the owner and the cloud server.

Cloud Server (CS): an entity which is managed by the cloud service provider (CSP) to provide data storage service and has significant storage space and computation resources (we will not differentiate CS and CSP hereafter).

Third-Party Auditor (TPA): an optional TPA, who has expertise and capabilities that users may not have, is trusted to assess and expose the risk of cloud storage services on behalf of the users upon request.

Fig. 1: Cloud data storage architecture

In cloud data storage, a user stores his data through a CSP into a set of cloud servers, which are running in a simultaneous, cooperated and distributed manner. Data redundancy can be employed with the technique of erasure-correcting code to further tolerate faults or server crashes as the user's data grows in size and importance. Thereafter, for application purposes, the user interacts with the cloud servers via the CSP to access or retrieve his data. In some cases, the user may need to perform block-level operations on his data; the most general forms of these operations we are considering are block update, delete, insert, and append. As users no longer possess their data locally, it is of critical importance to assure users that their data are being correctly stored and maintained. That is, users should be equipped with security means so that they can obtain continuous correctness assurance (to enforce the cloud storage service-level agreement) of their stored data even without the existence of local copies. In case users do not have the time, feasibility or resources to monitor their data online, they can delegate the data auditing tasks to an optional trusted TPA of their choice. However, to securely introduce such a TPA, any possible leakage of a user's outsourced data towards the TPA through the auditing protocol should be prohibited. In our model, we assume that the point-to-point communication channels between each cloud server and the user are authenticated and reliable, which can be achieved in practice with little overhead. These authentication handshakes are omitted in the following presentation.

B. Adversary Model

Security threats faced by cloud data storage can come from two different sources. On the one hand, a CSP can be self-interested, untrusted and possibly malicious. Not only may it desire to move data that has not been or is rarely accessed to a lower tier of storage than agreed for monetary reasons, but it may also attempt to hide data loss incidents due to management errors, Byzantine failures and so on. On the other hand, there may also exist an economically motivated adversary who has the capability to compromise a number of cloud data storage servers in different time intervals and subsequently is able to modify or delete users' data while remaining undetected by the CSP for a certain period. Specifically, we consider two types of adversary with different levels of capability in this paper:

Weak Adversary: The adversary is interested in corrupting the user's data files stored on individual servers. Once a server is compromised, the adversary can pollute the original data files by modifying or introducing its own fraudulent data to prevent the original data from being retrieved by the user.

Strong Adversary: This is the worst-case scenario, in which we assume that the adversary can compromise all the storage servers, so that it can intentionally modify the data files as long as they are internally consistent. In fact, this is equivalent to the case where all servers are colluding together to hide a data loss or corruption incident.

V. DESIGN GOALS

To ensure the security and dependability of cloud data storage, we aim to design mechanisms for dynamic data verification and operation that achieve the following goals:

Storage correctness: to ensure users that their data are indeed stored appropriately and kept intact all the time in the cloud.

Fast localization of data error: to effectively locate the malfunctioning server when data corruption has been detected.

Dynamic data support: to maintain the same level of storage correctness assurance even if users modify, delete or append their data files in the cloud.
Dependability: to enhance data availability against Byzantine failures, malicious data modification and server colluding attacks, i.e., minimizing the effect brought by data errors or server failures.

Lightweight: to enable users to perform storage correctness checks with minimum overhead.

VI. ENSURING CLOUD DATA STORAGE

In a cloud data storage system, users store their data in the cloud and no longer possess the data locally. Thus, the correctness and availability of the data files being stored on the distributed cloud servers must be guaranteed. One of the key issues is to effectively detect any unauthorized data modification and corruption, possibly due to server compromise and/or random Byzantine failures. Besides, in the distributed case, when such inconsistencies are successfully detected, finding which server the data error lies in is also of great significance, since it can be the first step towards fast recovery from storage errors. To address these problems, our main scheme for ensuring cloud data storage is presented in this section. The first part of the section is devoted to a review of basic tools from coding theory that are needed in our scheme for file distribution across cloud servers. Then, the homomorphic token is introduced. The token computation function we are considering belongs to a family of universal hash functions [7], chosen to preserve the homomorphic properties, which can be perfectly integrated with the verification of erasure-coded data. Subsequently, it is also shown how to derive a challenge-response protocol for verifying the storage correctness as well as identifying misbehaving servers. Finally, the procedure for file retrieval and error recovery based on erasure-correcting code is outlined.

A. The Proposed Security Using a Cryptographic Algorithm

In the introduction we motivated data-secure and dependable storage in cloud computing. This section presents our public auditing scheme for cloud data storage security: it defines the public auditing system, discusses two straightforward schemes and their demerits, and then presents our main result for privacy-preserving public auditing to achieve the aforementioned design goals. We also show how to extend our main scheme to support batch auditing for the TPA upon delegations from multiple users. Finally, we discuss how to adapt our main result to support data dynamics.
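Before the details, the file-distribution idea can be illustrated with a toy systematic layout: the data vectors are stored in the clear on the first m servers and redundancy is appended, so reads need only the first m servers while a failed server can be rebuilt. In the sketch below a single XOR parity (tolerating one erasure) stands in for the scheme's actual erasure-correcting code; all names and sizes are illustrative.

```python
# Toy systematic layout: m data servers plus one XOR-parity server.
from functools import reduce

def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

m = 3                                                     # number of data servers
data_vectors = [b"block-A1", b"block-B2", b"block-C3"]    # one vector per server
parity = reduce(xor_bytes, data_vectors)                  # stored on server m + 1

# Systematic layout: the file is readable directly from the first m servers.
servers = data_vectors + [parity]

# Server 1 fails (an erasure); rebuild its vector from the survivors.
lost = 1
survivors = [v for j, v in enumerate(servers) if j != lost]
print(reduce(xor_bytes, survivors) == data_vectors[lost])  # True: block recovered
```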

B. Definitions and Framework of Public Auditing System

We follow this definition of remote data integrity checking and adapt the framework for our privacy-preserving public auditing system. A public auditing scheme consists of four algorithms (KeyGen, SigGen, GenProof, and VerifyProof). KeyGen is a key generation algorithm that is run by the user to set up the scheme. SigGen is used by the user to generate verification metadata, such as MACs (message authentication codes), signatures, or other related information that will be used for auditing. GenProof is run by the cloud server to generate a proof of data storage correctness, while VerifyProof is run by the TPA to audit the proof from the cloud server. Our public auditing system can be constructed from the above auditing scheme in two phases, Setup and Audit:

Setup: The user initializes the public and secret parameters of the system by executing KeyGen, and pre-processes the data file F by using SigGen to generate the verification metadata. The user then stores the data file F at the cloud server, deletes its local copy, and publishes the verification metadata to the TPA for later audit. As part of pre-processing, the user may alter the data file F by expanding it or including additional metadata to be stored at the server.

Audit: The TPA issues an audit message or challenge to the cloud server to make sure that the cloud server has retained the data file F properly at the time of the audit. The cloud server will derive a response message from a function of the stored data file F by executing GenProof. Using the verification metadata, the TPA verifies the response via VerifyProof.

Note that in our design we do not assume any additional property on the data file, and thus regard error-correcting codes as orthogonal to our system. If the user wants more error resiliency, he or she can first redundantly encode the data file and then provide us with the data file that has error-correcting codes integrated. In Section 3.5, we will show how to adapt our main result to support dynamic data update.

Fig. 2: MAC algorithm example

In this example, the sender of a message runs it through a MAC algorithm to produce a MAC data tag. The message and the MAC tag are then sent to the receiver. The receiver, in turn, runs the message portion of the transmission through the same MAC algorithm using the same key, producing a second MAC data tag. The receiver then compares the first MAC tag received in the transmission to the second, generated MAC tag. If they are identical, the receiver can safely assume that the integrity of the message was not compromised and that the message was not altered or tampered with during transmission.

HMAC: In cryptography, HMAC (Hash-based Message Authentication Code) is a specific construction for calculating a message authentication code (MAC) involving a cryptographic hash function in combination with a secret key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-MD5 or HMAC-SHA1 accordingly. The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output in bits, and the size and quality of the cryptographic key.
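For concreteness, the MAC flow just described can be reproduced with Python's standard hmac module; HMAC-SHA1 is chosen here purely as an example, and the key and message are placeholders.

```python
# Sender/receiver MAC flow using the standard library.
import hmac
import hashlib

key = b"shared-secret-key"
message = b"file block to be protected"

# Sender: compute the MAC tag and transmit (message, tag).
tag = hmac.new(key, message, hashlib.sha1).digest()

# Receiver: recompute the tag over the received message and compare.
expected = hmac.new(key, message, hashlib.sha1).digest()
print(hmac.compare_digest(tag, expected))   # True: message accepted as intact

# Any change to the message in transit yields a different tag.
forged = hmac.new(key, b"file block to be protEcted", hashlib.sha1).digest()
print(hmac.compare_digest(tag, forged))     # False: tampering detected
```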
Algorithm 1: Token Pre-computation
1: procedure
2:   Choose parameters l, n and functions f, φ;
3:   Choose the number t of tokens;
4:   Choose the number r of indices per verification;
5:   Generate master key K_PRP and challenge key k_chal;
6:   for vector G(j), j = 1 to n do
7:     for round i = 1 to t do
8:       Derive α_i = f_{k_chal}(i) and k_prp^(i) from K_PRP;
9:       Compute v_i^(j) = Σ_{q=1}^{r} α_i^q · G(j)[φ_{k_prp^(i)}(q)]
10:    end for
11:  end for
12:  Store all the v_i's locally.
13: end procedure
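A minimal Python sketch of Algorithm 1 follows; the PRF f and the index permutation φ are both instantiated with HMAC purely for illustration, and the parameters are toy values rather than the scheme's.

```python
# Toy token pre-computation in the spirit of Algorithm 1.
import hmac, hashlib

p = 2**31 - 1  # toy prime field for the encoded block symbols

def prf(key, msg):
    """Pseudorandom integer derived from key and message (illustrative stand-in)."""
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")

def precompute_tokens(G_j, t, r, K_prp, k_chal):
    """Pre-compute t verification tokens over the encoded vector G_j."""
    tokens = []
    for i in range(1, t + 1):
        alpha = prf(k_chal, i.to_bytes(4, "big")) % p       # alpha_i = f_{k_chal}(i)
        k_i = K_prp + i.to_bytes(4, "big")                  # per-round index key
        v = 0
        for q in range(1, r + 1):
            idx = prf(k_i, q.to_bytes(4, "big")) % len(G_j)  # phi_{k_i}(q)
            v = (v + pow(alpha, q, p) * G_j[idx]) % p        # add alpha^q * G_j[idx]
        tokens.append(v)
    return tokens

G_j = [(17 * s + 3) % p for s in range(64)]   # a toy encoded data vector
print(precompute_tokens(G_j, t=4, r=8, K_prp=b"master", k_chal=b"chal"))
```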

C. Challenge Token Pre-computation

In order to achieve assurance of data storage correctness and data error localization simultaneously, our scheme entirely relies on the pre-computed verification tokens. The main idea is as follows: before file distribution, the user pre-computes a certain number of short verification tokens on each individual vector G(j), j in {1, ..., n}, each token covering a random subset of data blocks. Later, when the user wants to make sure of the storage correctness for the data in the cloud, he challenges the cloud servers with a set of randomly generated block indices. Upon receiving a challenge, each cloud server computes a short signature over the specified blocks and returns it to the user. The values of these signatures should match the corresponding tokens pre-computed by the user. Meanwhile, as all servers operate over the same subset of indices, the requested response values for the integrity check must also be a valid codeword determined by the token pre-computation.

Algorithm 2: Correctness Verification and Error Localization
1: procedure CHALLENGE(i)
2:   Recompute α_i = f_{k_chal}(i) and k_prp^(i) from K_PRP;
3:   Send {α_i, k_prp^(i)} to all the cloud servers;
4:   Receive from servers: {R_i^(j) = Σ_{q=1}^{r} α_i^q · G(j)[φ_{k_prp^(i)}(q)] | 1 <= j <= n}
5:   for (j = m + 1 to n) do
6:     R_i^(j) <- R_i^(j) - Σ_{q=1}^{r} f_{k_j}(s_{I_q,j}) · α_i^q,  I_q = φ_{k_prp^(i)}(q)
7:   end for
8:   if ((R_i^(1), ..., R_i^(m)) · P == (R_i^(m+1), ..., R_i^(n))) then
9:     Accept and ready for the next challenge.
10:  else
11:    for (j = 1 to n) do
12:      if (R_i^(j) != v_i^(j)) then
13:        return server j is misbehaving.
14:      end if
15:    end for
16:  end if
17: end procedure

D. Correctness Verification and Error Localization

Error localization is a key prerequisite for eliminating errors in storage systems. However, many previous schemes do not explicitly consider the problem of data error localization and thus only provide binary results for the storage verification. Our scheme outperforms those by integrating correctness verification and error localization in our challenge-response protocol: the response values from servers for each challenge not only determine the correctness of the distributed storage, but also contain information to locate potential data error(s).

Algorithm 3: Error Recovery
1: procedure
   % Assume the block corruptions have been detected among the specified r rows;
   % Assume s <= k servers have been identified as misbehaving
2:   Download r rows of blocks from the servers;
3:   Treat the s servers as erasures and recover the blocks;
4:   Resend the recovered blocks to the corresponding servers.
5: end procedure

E. File Retrieval and Error Recovery

Since our layout of the file matrix is systematic, the user can reconstruct the original file by downloading the data vectors from the first m servers, assuming that they return the correct response values.

F. Towards Third-Party Auditing

As discussed in our architecture, in case the user does not have the time, feasibility or resources to perform the storage correctness verification, he can optionally delegate this task to an independent third-party auditor, making the cloud storage publicly verifiable. However, as pointed out by recent work [8], [9], to securely introduce an effective TPA, the auditing process should bring in no new vulnerabilities towards user data privacy; namely, the TPA should not learn the user's data content through the delegated auditing. Now we show that with only slight modification, our protocol can support privacy-preserving third-party auditing.
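The following self-contained sketch mirrors the challenge-response check of Algorithm 2 on toy data: each server's response over the challenged indices is compared with the pre-computed token, and a mismatch localizes the misbehaving server. As before, HMAC stands in for the scheme's PRF and permutation, and all parameters are illustrative.

```python
# Toy correctness verification and error localization.
import hmac, hashlib

p = 2**31 - 1

def prf(key, msg):
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")

def response(G_j, i, r, K_prp, k_chal):
    """Token when pre-computing; response when answering challenge round i."""
    alpha = prf(k_chal, i.to_bytes(4, "big")) % p
    k_i = K_prp + i.to_bytes(4, "big")
    R = 0
    for q in range(1, r + 1):
        idx = prf(k_i, q.to_bytes(4, "big")) % len(G_j)
        R = (R + pow(alpha, q, p) * G_j[idx]) % p
    return R

servers = {j: [(17 * s + 3 + j) % p for s in range(64)] for j in range(3)}
tokens = {j: [response(servers[j], i, 8, b"master", b"chal") for i in (1, 2, 3, 4)]
          for j in servers}

# Simulate corruption on server 1 at a position that round 2 will challenge.
hit = prf(b"master" + (2).to_bytes(4, "big"), (1).to_bytes(4, "big")) % 64
servers[1][hit] += 1

for j in servers:                                      # verify challenge round i = 2
    if response(servers[j], 2, 8, b"master", b"chal") != tokens[j][1]:
        print(f"server {j} is misbehaving")            # flags only server 1
```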
G. Providing Dynamic Data Operation

This model may fit some application scenarios, such as libraries and scientific datasets. However, in cloud data storage, there are many potential scenarios where data stored in the cloud is dynamic, like electronic documents, photos, or log files, etc.

Therefore, it is crucial to consider the dynamic case, where a user may wish to perform various block-level operations of update, delete and append to modify the data file while maintaining the storage correctness assurance.

Fig. 3: Activity diagram of the architecture

VII. RELATED WORK

Juels et al. [3] described a formal proof of retrievability (POR) model for ensuring the remote data integrity. Their scheme combines spot-checking and error-correcting code to ensure both possession and retrievability of files on archive service systems. Shacham et al. [4] built on this model and constructed a random-linear-function-based homomorphic authenticator which enables an unlimited number of queries and requires less communication overhead. Bowers et al. [5] proposed an improved framework for POR protocols that generalizes both Juels' and Shacham's work. Later, in subsequent work, Bowers et al. [10] extended the POR model to distributed systems. However, all these schemes focus on static data. The effectiveness of their schemes rests primarily on the preprocessing steps that the user conducts before outsourcing the data file F. Any change to the contents of F, even of a few bits, must propagate through the error-correcting code, thus introducing significant computation and communication complexity. Ateniese et al. [6] defined the provable data possession (PDP) model for ensuring possession of files on untrusted storage.

Proceedings of National Conference on Advanced Computer Applications NCACA 2012 Their scheme utilized public key based homomorphic tags for auditing the data file, thus providing public verifiability. However, their scheme requires sufficient computation overhead that can be expensive for an entire file. In their subsequent work, Ateniese et al. [7] described a PDP scheme that uses only symmetric key cryptography. This method has lower-overhead than their previous scheme and allows for block updates, deletions and appends to the stored file, which has also been supported in our work. However, their scheme focuses on single server scenario and does not address small data corruptions, leaving both the distributed scenario and data error recovery issue unexplored. Curtmola et al. [15] aimed to ensure data possession of multiple replicas across the distributed storage system. They extended the PDP schemeto cover multiple replicas without encoding each replica separately, providing guarantee that multiple copies of data are actually maintained. VIII.CONCLUSION In this paper, we investigate the problem of data security in cloud data storage, which is essentially a distributed storage system. To achieve the assurances of cloud data integrity and availability and enforce the quality of dependable cloud storage service for users, we propose an effective and flexible distributed scheme with explicit dynamic data support, including block update, delete, and append. We rely on erasure-correcting code in the file distribution preparation to provide redundancy par- ity vectors and guarantee the data dependability. By utilizing the homomorphic token with distributed verification of erasure-coded data, our scheme achieves the integration of storage correctness insurance and data error localization, i.e., whenever data corruption has been detected during the storage correctness verification across the distributed servers, we can almost guarantee the simultaneous identification of the misbehaving server(s). Considering the time, computation resources, and even the related online burden of users, we also provide the extension of the proposed main scheme to support third-party auditing, where users can safely delegate the integrity checking tasks to third-party auditors and be worryfree to use the cloud storage services. Through detailed security and extensive experiment results, we show that our scheme is highly efficient and resilient to Byzantine failure, malicious data modification attack and even server colluding attack. REFERENCE [1] C. Wang, Q. Wang, K. Ren, and W. Lou, Ensuring data storage security in cloud computing, in Proc. of IWQoS 09, July 2009, pp. 1 9. [2] Amazon.com, Amazon web services (aws), Online at http://aws.amazon.com/, 2009. [3] M. Arrington, Gmail disaster: Reports of http://www.techcrunch.com/2006/12/28/gmaildisasterreports-of-mass-emaildeletions/,december [4] G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, Scalable and efficient provable data possession, in Proc. Of SecureComm 08,2008, [5] C. Erway, A. Kupcu, C. Papamanthou, and R. Tamassia, Dynamic provable data possession, in Proc. CCS 09, [6] M. A. Shah, M. Baker, J. C. Mogul, and R. Swaminathan, Audit-ing to keep online storage services honest, in Proc. of HotOS 07.Berkeley, CA, USA: USENIX Association, 2007, pp. 1 6. [7] B. Krebs, Payment Processor Breach May Be Largest Ever, Onlhttp://voices.washingtonpost.com/securityfix/2009/01/ payment processor breach may b.html, Jan. 2009. [8] A. Juels and J. Burton S. 
[8] A. Juels and J. Burton S. Kaliski, "PORs: Proofs of retrievability for large files," in Proc. of CCS'07, Alexandria, VA, October 2007.
[9] C. Wang, Q. Wang, K. Ren, and W. Lou, "Privacy-preserving public auditing for storage security in cloud computing," in Proc. of IEEE INFOCOM'10, San Diego, CA, USA, March 2010.
[10] C. Wang, K. Ren, W. Lou, and J. Li, "Towards publicly auditable secure cloud data storage services," IEEE Network Magazine, vol. 24, no. 4, pp. 19-24, 2010.
[11] R. C. Merkle, "Protocols for public key cryptosystems," in Proc. of IEEE Symposium on Security and Privacy, Los Alamitos, CA.
[12] Q. Wang, K. Ren, W. Lou, and Y. Zhang, "Dependable and secure sensor data storage with dynamic integrity assurance," in Proc. of IEEE INFOCOM'09, Rio de Janeiro, Brazil, April 2009.
[13] J. S. Plank, S. Simmerman, and C. D. Schuman, "Jerasure: A library in C/C++ facilitating erasure coding for storage applications - Version 1.2," University of Tennessee, Tech. Rep. CS-08-627, August 2008.
[14] M. Bellare, R. Canetti, and H. Krawczyk, "Keying hash functions for message authentication," in Proc. of Crypto'96, volume 1109 of LNCS. Springer-Verlag, 1996, pp. 1-15.
[15] M. Bellare, O. Goldreich, and S. Goldwasser, "Incremental cryptography: The case of hashing and signing."

Increasing Data Privacy and Computation Efficiency through Linear Programming Outsourcing in Cloud Computing

Harsha N, Dr. M Siddappa
Department of CSE, Sri Siddhartha Institute of Technology, Sri Siddhartha University, Tumkur, Karnataka, India
Email: harshanarayan1989@gmail.com

Abstract - Cloud Computing is the emerging buzzword in Information Technology. Cloud Computing distributes computation tasks over a resource pool consisting of massive numbers of computers, so that application systems can gain computation strength, storage space and software services according to their demand. It enables customers with limited computational resources to outsource their large computation workloads to the cloud, and economically enjoy the massive computational power, bandwidth, storage, and even appropriate software that can be shared in a pay-per-use manner. Despite the tremendous benefits, security is the primary obstacle that prevents the wide adoption of this promising computing model, especially for customers whose confidential data are consumed and produced during the computation. To combat unauthorized information leakage, sensitive data have to be encrypted before outsourcing so as to provide end-to-end data confidentiality assurance in the cloud and beyond. However, ordinary data encryption techniques in essence prevent the cloud from performing any meaningful operation on the underlying plaintext data, making computation over encrypted data a very hard problem. On the other hand, the operational details inside the cloud are not transparent enough to customers. As a result, there do exist various motivations for a cloud server to behave unfaithfully and return incorrect results, i.e., to behave beyond the classical semi-honest model. The fully homomorphic encryption (FHE) scheme, a general result on secure computation outsourcing, has been shown viable in theory, where the computation is represented by an encrypted combinational Boolean circuit that can be evaluated with encrypted private inputs. But FHE is difficult to implement because it requires huge circuits. Focusing on engineering computing and optimization tasks, this paper investigates secure outsourcing of widely applicable linear programming (LP) computations. Linear programming is an algorithmic and computational tool. In order to achieve practical efficiency, our mechanism design explicitly decomposes the LP computation outsourcing into public LP solvers running on the cloud and private LP parameters owned by the customer. It provides a secure and practical mechanism design which fulfils input/output privacy. Using LP ensures that the use of the cloud is economically viable.

Keywords - Cloud computing, Linear programming, Mechanism design, Design goals, Complete mechanism description
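As a toy illustration of the decomposition described in this abstract (a public solver on the cloud, private parameters with the customer), the sketch below hides an LP behind a secret positive diagonal scaling of the variables before outsourcing it. This simple disguise is our own stand-in and is far weaker than the paper's actual mechanism; numpy and scipy are assumed to be available.

```python
# Disguise-then-outsource toy: scale variables secretly, solve remotely, unscale.
import numpy as np
from scipy.optimize import linprog

# Customer's private LP: maximize c^T x  subject to  A x <= b, x >= 0
c = np.array([3.0, 5.0])
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([14.0, 18.0])

M = np.diag([7.0, 2.0])   # secret positive scaling; substitute x = M y

# Disguised problem sent to the cloud solver (linprog minimizes, hence the minus):
res = linprog(-(M @ c), A_ub=A @ M, b_ub=b, bounds=(0, None), method="highs")

x = M @ res.x             # customer recovers the true optimum locally
print(x, c @ x)           # approx. [4.4, 4.8] with objective 37.2
```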

Dynamic Resource Allocation in Cloud for Parallel Data Processing

K. B. Manasa, N. L. UdayaKumar, Dr. M. Siddappa
IV sem M.Tech, Lecturer, HOD; Dept. of CSE, SSIT, Maralur, Tumkur
Email: mansi.harshi@gmail.com, msgraceuk@gmail.com

Abstract - Cloud Computing is playing a relevant role in the evolution of information technology (IT). A considerable number of system developers are using cloud technologies to deploy and make systems available over the Internet. Clouds are an emerging class of computational hardware, and several frameworks have been introduced to facilitate parallel data processing on the cloud; such systems allow users to acquire and release resources on demand and provide ready access to data. The processing frameworks currently in use come from the field of cluster computing and disregard the particular nature of the cloud. Consequently, the allocated compute resources may be inadequate for large parts of the submitted job and unnecessarily increase processing time and cost. Companies providing cloud services have an increasing need to store and analyze massive data sets. Emerging applications require the ability to exploit geographically distributed resources, and there has been a dramatic increase in the amount of available computing and storage resources. Current cloud systems push much complexity onto the user, requiring the user to manage individual virtual machines and deal with many system-level concerns.

Index Terms: Many-Task Computing, High-Throughput Computing, Loosely Coupled Applications, Cloud Computing, Parallel data processing, Frameworks.

Dynamic Load Sharing Multicast Algorithms on Cloud for Data-Intensive Applications

Suhasini, N. L. Udayakumar, Dr. M. Siddappa
IV sem M.Tech, Lecturer, HOD; Department of CSE, S.S.I.T, Maralur, Tumkur
Email: suha.suchi16@gmail.com, msgraceuk@gmail.com

Abstract - Data-intensive parallel applications analyse and process very large data sets. This huge amount of data is stored in cloud storage and needs to be distributed to many computation nodes as fast as possible. Distributing data from storage to all nodes is essentially a multicast operation. The simple solution is to let all nodes download the same data directly from the storage service, but that can easily become a performance bottleneck because the download time exceeds the total execution time of an application. Another approach is to construct spanning trees based on network topology and monitoring data, but neither of these approaches delivers optimal performance. Here we present multicast algorithms, focusing mainly on the Amazon EC2/S3 platform, that efficiently transfer large data from storage to all nodes. The features of these algorithms are that they construct an overlay network to exchange the data after downloading, optimize the throughput dynamically, and increase the download throughput. The algorithm divides the data to be downloaded among the clients and then exchanges the pieces over the overlay network.

Index Terms - Cloud computing, Multicasting, Overlay network, Amazon EC2/S3, Data-Intensive Applications.
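A toy round-based simulation of the divide-and-exchange idea in this abstract is given below; the node count, pairing model and round structure are illustrative assumptions of ours, not the paper's algorithms.

```python
# Each node first downloads one distinct piece from storage, then nodes gossip
# pieces over an overlay in random pairwise exchanges until everyone is complete.
import random

N_NODES, PIECES = 8, 8
have = {n: {n % PIECES} for n in range(N_NODES)}   # phase 1: one piece per node

rounds = 0
while any(len(h) < PIECES for h in have.values()):
    rounds += 1
    nodes = list(have)
    random.shuffle(nodes)
    for a, b in zip(nodes[::2], nodes[1::2]):      # random pairwise exchanges
        for src, dst in ((a, b), (b, a)):          # each peer sends one missing piece
            missing = have[src] - have[dst]
            if missing:
                have[dst].add(next(iter(missing)))

print(f"all {N_NODES} nodes complete after {rounds} exchange rounds, "
      f"versus {PIECES} direct downloads per node from storage")
```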

HORNS: A Homomorphic Encryption Scheme for Cloud Computing using Residue Number System

Arun Kumbi, Anasuya Prakash
M.Tech 3rd sem Software Engineering; Assistant Professor, Dept. of CSE
EPCET, Bangalore, Karnataka
Email: arun_kumbi@rediffmail.com, anuprama@rediffmail.com

Abstract - In this paper, we propose a homomorphic encryption scheme using the Residue Number System (RNS). In this scheme, a secret is split into multiple shares on which computations can be performed independently. Security is enhanced by not allowing the independent clouds to collude. Efficiency is achieved through the use of smaller shares.

Providing Security in Cloud Computing Using Protection Rings

Mahesh Sheelvant, Dayananda R B
M.Tech 3rd sem Software Engineering; Senior Lecturer, Dept. of CSE
EPCET, Bangalore, Karnataka
Email: maheshss07@gmail.com, ddashok17@yahoo.co.in

Abstract: Cloud computing is an emerging field because of its performance, high availability, low cost and many other benefits. Despite this, companies are holding their business back from cloud computing because of the fear of data leakage: the lack of a proper security control policy and weaknesses in safeguards lead to many vulnerabilities in cloud computing. 3-Dimensional security in cloud computing focuses on the problem of data leakage and proposes a framework that works in two phases. The first phase, known as data classification, is performed by the client before storing the data. During this phase the data is categorized on the basis of CIA (Confidentiality, Integrity, and Availability): the client who wants to send the data for storage needs to give the values of C (confidentiality), I (integrity), and A (availability). The value of C is based on the level of secrecy at each junction of data processing and the prevention of unauthorized disclosure, the value of I on how much assurance of accuracy, reliability of information and protection from unauthorized modification is required, and the value of A on how frequently the data must be accessible. With the help of the proposed formula, a priority rating is calculated; data with a higher rating is considered critical and 3D security is recommended for it. After completion of the first phase, the data received by the cloud provider for storage is accessed using the 3-Dimensional technique.

Keywords: Availability, Cloud security, Cost Reduction, Confidentiality, Data Storage, Data protection, Integrity.
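The abstract does not give the proposed priority-rating formula, so the sketch below uses a hypothetical weighted sum over the client-supplied C, I and A values purely to illustrate the classification phase; the weights and threshold are invented, not the paper's.

```python
# Hypothetical CIA priority rating; the formula here is an invented placeholder.
def priority_rating(c, i, a, weights=(0.4, 0.4, 0.2)):
    """Weighted combination of the client-supplied C, I, A scores (0 to 10)."""
    wc, wi, wa = weights
    return wc * c + wi * i + wa * a

CRITICAL_THRESHOLD = 6.0   # invented cut-off for recommending 3D security

for name, (c, i, a) in {"payroll": (9, 9, 5), "public brochure": (1, 3, 8)}.items():
    rating = priority_rating(c, i, a)
    tag = "critical -> 3D security" if rating >= CRITICAL_THRESHOLD else "normal"
    print(f"{name}: rating {rating:.1f} ({tag})")
```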

Cloud Computing

Ashish Jaggi, Rohit Kakani
IMS CD&R, University of Pune, Ahmednagar, Maharashtra
Email: ajr101@rediffmail.com

Abstract: Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. The concept of cloud computing fills a perpetual need of IT: a way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software. Cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends IT's existing capabilities.

Keywords: resources, pay-per-use, real-time, interaction.