Cloud and Mobile Security Seminar Spring 2013 Lecture 2: Intro to Cloud Computing and Its Challenges
Assignment Status
Paper presenter assignment:
- Only five people have volunteered for papers
- The deadline for volunteering is next Monday (Feb 11); after that day, we will assign papers to you
- Note that papers might change, so you are selecting topics, not specific papers
Project selection:
- Only two people have told me verbally about their ideas
- Everyone must send me and the TA a brief description of the project and team by next Monday (Feb 11)
- Less than 1 page, but it needs to explain the problem clearly!
- See instructions here: <http://www.cs.columbia.edu/~roxana/teaching/cloudmobiles13/projects.html>
- Absolutely no deadline extensions!
- Note that you'll have to provide a more detailed description by Feb 18 (the following week), so submit early if you want feedback
Outline
- A history of cloud computing (Roxana)
  - Slides were inspired by Armando Fox's presentation on cloud computing history for UC Berkeley's CS-10 course
- General challenges (Dmitriy)
  - Above the clouds: A Berkeley view of cloud computing. M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R.H. Katz, A. Konwinski, G. Lee, D.A. Patterson, A. Rabkin, I. Stoica, M. Zaharia. Technical Report UCB/EECS-2009-28, 2009.
- Specific challenge: Side-channel leakage (Deepika)
  - Hey, you, get off of my cloud: Exploring information leakage in third-party compute clouds. T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2009.
- Further challenges (Roxana)
What Is Cloud Computing?
Computing technology in which data and/or computation are outsourced to a massive-scale, multi-user infrastructure that is managed by a third party.
Includes both:
- Software as a service (e.g., Gmail, Google Docs, Mozy)
- Infrastructure/platform as a service (e.g., Amazon AWS, Google AppEngine, Microsoft Azure)
Appeared gradually due to two important challenges facing the Web:
- Scaling
- Management
Around 1995: The Scaling Challenge
- The Web and e-commerce were gaining traction
- Their challenge: how to scale?
- 1996 to 1997: eBay grew from 41,000 to 341,000 users!
Pre-1995 Answer: Big, Expensive Computers
Example: eBay used a Sun E-10000 supermini (64 processors @ 250 MHz, 64 GB RAM, 20 TB disk, ~$1M)
The good:
- Easy to manage
- Easy to program
- Simple failure mode
The bad (Q: Any ideas?):
- Expensive
- Single point of failure
- No incremental scalability
1995: Berkeley Network of Workstations (NOW)
- Idea: Leverage many interconnected small, cheap, general-purpose machines for incremental scalability and reliability
- Typical PC: 200 MHz CPU, 32 MB RAM, 4 GB disk
A Case for Networks of Workstations: NOW. IEEE Micro, 1995. Tom Anderson, Dave Culler, Dave Patterson, et al.
NOW-0 1994: NOW had 4 HP-735s
NOW-1 1995: NOW had 32 Sun SPARCstations
NOW-2 1997: NOW had 60 Sun SPARC-2s; the Inktomi app was built on it
Companies Adopt NOW
- Everybody builds their own clusters and grows them to handle more and more load
- Examples: eBay, Amazon, Google, all .com bubble companies
- Similar to the early days of electricity, when everyone built their own generator
Q: What do you think happened next?
Late 1990s: The Manageability Challenge
Hard to manage and program large clusters:
- How to write scalable distributed programs?
- How to debug large-scale programs?
- How to make services reliable?
- How to architect the network infrastructure?
- How to provision a cluster to handle peak load?
- How to administer a huge number of computers?
Each company had to build its own complex software
- Like each of us building an OS from scratch!
Early 2000s: Scalable Cluster Primitives
A handful of technically strong companies create powerful, scalable, and reliable primitives for cluster management and programming
Examples:
- Google's Map/Reduce
- The Google File System (GFS)
- Google's Bigtable
- Amazon's Dynamo
- Distributed debugging and tracing tools
- Datacenter temperature regulators
- Scalable distributed communication mechanisms
Mid 2000s: Three Valuable Commodities
1. Giant-scale clusters with enormous excess capacity
   - Everybody provisioned for peak
   - Q: How big was a typical Google datacenter around 2005? a. 1,000 machines b. 5,000 machines c. 10,000 machines d. 50,000 machines e. 100,000 machines
2. Expertise for managing and operating clusters at low cost
   - Economies of scale
3. Complex software to help program/manage clusters
   - Even full applications (e.g., Gmail, Google Calendar)
Q: What do you think happened next?
2006: Monetization of Cloud Infrastructure
- AWS sells resources, expertise, and access to cloud primitives in a pay-for-what-you-use model
  - Resources: CPU, network bandwidth, persistent storage
  - Cloud primitives: Amazon S3, EC2, SQS, Map/Reduce, etc.
- Google launches Google Apps for Your Domain
  - Customizable Gmail, Google Docs, Google Calendar under a custom domain (e.g., gmail.cs.columbia.edu)
- Google then launches App Engine
  - Web hosting infrastructure (such infrastructures existed before, but never enjoyed popularity)
- Microsoft launches Azure in 2009
Great Advantages of Cloud Computing
- Low barrier of market entry for startups
- Cheaper, low-management email, calendars, CRM solutions
- New mobile applications
- Faster batch processing via parallelization
Dmitriy will discuss
Above the Clouds: A Berkeley View of Cloud Computing
By Armbrust et al.
Dmitriy Gromov
2/4/2013
Overview
Summary
  - What is Cloud Computing?
  - Why Cloud Computing?
  - Business Model
  - Obstacles and Opportunities
Review
Post Paper Thoughts
Report Presentation - Gromov
What is Cloud Computing?
"It's stupidity. It's worse than stupidity: it's a marketing hype campaign. Somebody is saying this is inevitable and whenever you hear somebody saying that, it's very likely to be a set of businesses campaigning to make it true."
Richard Stallman, quoted in The Guardian, September 29, 2008
http://stock-clip.com/video-footage/cloud+computing
What is Cloud Computing? BUT REALLY!
What is Cloud Computing?
Software as a Service (SaaS): Application delivered as a service over the Internet
Cloud: Datacenter hardware and software
  - Public Cloud: Cloud as a pay-as-you-go service
  - Private Cloud: Internal datacenter not made available to the public
Utility Computing: The service being sold via a Public Cloud
  - Amazon Web Services, Google AppEngine
Cloud Computing = Public Cloud + Utility Computing
Paying someone to use their service on their datacenter
What is Cloud Computing?
New Workflow
Why Cloud Computing
User Perspective
  - The Good
    - Pay only for what you need
    - Cheap service
    - Grow and shrink service very quickly
  - The Bad
    - Need to trust your cloud provider
Why Cloud Computing
Provider Perspective
  - The Good
    - Money! Hardware in bulk is cheaper, and there are lots of users
    - Can easily build on what you have
    - Can drive the industry by providing solutions everyone needs
    - Can become a platform everyone uses
  - The Bad
    - Need to pay for and maintain the datacenter
Why Cloud Computing
Why Then (in '08)?
  - Software improvements demanded better hardware
  - Companies were not sure exactly what they needed
  - Hardware was cheap enough to buy in bulk and rent out
Why Cloud Computing
New Applications
  - Mobile Apps
  - Huge Parallel Apps
  - Analytics
  - Desktop Extension
Utility Classes
  - Storage (S3)
  - Dev Platforms (Google AppEngine)
  - Computation (EC2)
Business Model
Trade-Off Equation: If the cost of renting a cloud for some time is cheaper than setting up a datacenter and using it for the same time, then use the cloud.
  - The cost of the datacenter factors in Utilization
  - If Utilization = 1 the two sides look the same, but this never happens
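The trade-off above can be sketched in a few lines of Python. This is a minimal sketch, assuming the comparison is "expected profit in the cloud versus expected profit in an owned datacenter", with the datacenter's hourly cost divided by utilization so that idle capacity makes each useful hour more expensive. The function name, parameter names, and sample numbers are illustrative assumptions, not values from the slides.

```python
def prefer_cloud(user_hours_cloud, user_hours_dc, revenue_per_hour,
                 cloud_cost_per_hour, dc_cost_per_hour, utilization):
    """Return True when renting looks at least as profitable as owning.

    All names and units here are hypothetical; utilization is the
    average fraction of the datacenter that is actually in use (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    profit_cloud = user_hours_cloud * (revenue_per_hour - cloud_cost_per_hour)
    # Idle machines still cost money, so the effective cost of a *useful*
    # datacenter hour grows as utilization drops.
    effective_dc_cost = dc_cost_per_hour / utilization
    profit_dc = user_hours_dc * (revenue_per_hour - effective_dc_cost)
    return profit_cloud >= profit_dc


# Illustrative numbers only: the cloud hour is pricier than the raw
# datacenter hour, but wins once the datacenter sits half idle.
half_idle = prefer_cloud(1000, 1000, 1.0, 0.12, 0.10, 0.5)
fully_used = prefer_cloud(1000, 1000, 1.0, 0.12, 0.10, 1.0)
```

With these sample inputs the cloud is preferred at 50% utilization and the datacenter at 100%, which matches the slide's point that utilization is the term that tips the balance.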
Without cloud
  - Risky business model
  - Having too few resources can cause customer loss (10% of those affected will leave)
  - Having too many resources is expensive
With cloud
  - Less risk
  - Can grow and shrink on demand
  - If the cloud costs a little more, this is outweighed by the smaller risk
Business Model
Cost
Obstacles and Opportunities
Availability of Service
  - If the cloud goes down, users suffer
  + Cloud providers are pretty good about not failing
  + Helps prevent DDoS extortion by scaling up quickly and for less than the attackers demand
Data Lock-In
  - User can't get data out of the cloud
  + Can charge more for better APIs and services
Obstacles and Opportunities
Data Confidentiality and Auditability
  - Users are concerned about their data
  + Government rules are in place to regulate this
  + No fundamental obstacles to making the cloud as secure as an in-house environment
Obstacles and Opportunities
Data Transfer Bottleneck
  - Takes a long time to transfer huge amounts of data to the cloud
  - Could also take a long time inside the cloud's network
  + Drives networks to be faster
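To make the bottleneck concrete, the transfer time is simply the data size in bits divided by the sustained bandwidth. The sketch below works one example; the 10 TB payload and the 20 Mbit/s link are illustrative assumptions chosen for the arithmetic, not figures stated on the slide.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, bandwidth_bits_per_s):
    """Days needed to push data_bytes over a link with the given sustained rate."""
    seconds = data_bytes * 8 / bandwidth_bits_per_s
    return seconds / SECONDS_PER_DAY


# Assumed inputs: 10 TB of data over a 20 Mbit/s wide-area link.
days = transfer_days(10 * 10**12, 20 * 10**6)
print(f"{days:.1f} days")  # prints "46.3 days"
```

At that rate the upload takes about a month and a half, which is why faster networks show up as the matching opportunity.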
Obstacles and Opportunities
Performance Unpredictability
  - CPU is fine but I/O can be slow
  + Better scheduling
  + Flash, High Performance Computing
Scalable Storage
  - On-demand storage
  - Infinite capacity
  + Develop new storage systems
Obstacles and Opportunities
Bugs
  - Hard to find bugs in a huge distributed system
  + VMs allow for greater low-level debugging than physical hardware
Scaling Quickly
  - Pay-as-you-go is difficult with computation
  - More expensive to leave a machine with high load
  + Figure out a predictive model to scale CPUs up and down
Obstacles and Opportunities
Reputation Fate Sharing
  - One bad customer can make the whole cloud look bad
  + Can charge for a special trusted account
Software Licensing
  - Unclear how to sell licenses to proprietary software on an open-source cloud
  + Can charge a bit more for cloud usage
  + Long-term commitment deals
Review
Good things
  - Good explanation of cloud computing
  - Well-defined terms
  - Paper is accessible to both tech and non-tech people
  - Solid examples of real-life cases
  - Good outline of obstacles and opportunities
  - Good data presentation
Review
Bad things
  - Somewhat repetitive
  - What are the requirements of cloud providers?
  - Opportunities are not well defined in some cases
  - Little discussion of the technical aspects of clouds
  - Often talks about a lot but says little
Post Paper Thoughts
State of cloud computing today
  - EC2 Large Instance: 7.5 GiB memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB storage, 64-bit, 500 Mbps I/O
  - Auto Scaling
  - Elastic Load Balancing for performance
  - Intra-Amazon transfer is mostly free: store on S3, process on EC2
http://aws.amazon.com/ec2/
Post Paper Thoughts
How do cloud providers create the illusion of unlimited hardware?
  - In terms of space?
  - CPUs?
Post Paper Thoughts
Have there been new issues and/or risks identified with the increasing popularity of cloud computing?
Post Paper Thoughts
This paper seems a bit biased towards cloud applications. What kind of large-scale development would not be good for the cloud?
Discussion Topics
- What other challenges does cloud computing raise?
  - Global energy consumption: is it better to run in an optimized cloud and transfer data back and forth, or to run locally on less optimized devices?
- Do you agree with this quote: "There are no fundamental obstacles to making a cloud-computing environment as secure as the vast majority of in-house IT environments"?
  - Lack of trust in cloud providers changes the game fundamentally
- What is the cloud economics equation missing?
  - Quality of service: availability, reliability, etc. How to put a value on those?
Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds
By Thomas Ristenpart, Eran Tromer, Hovav Shacham, and Stefan Savage
Presented by: Deepika Tunikoju
Agenda
- About the paper
- Introduction
- Not a goal
- Goal
- Technical overview
- Merits
- Demerits
- Solutions proposed
- Amazon's response
- Questions and discussion points
About the paper
- University of California, San Diego, USA: Thomas Ristenpart, Hovav Shacham, Stefan Savage
- Massachusetts Institute of Technology, Cambridge, USA: Eran Tromer
- Published in CCS '09: Proceedings of the 16th ACM Conference on Computer and Communications Security
- Total citations: 117
- Total downloads: 3838
Introduction
- Third-party computing clouds provide infrastructure for hosting data and software
- They provide several benefits, such as:
  - Economies of scale
  - Dynamic provisioning
  - Low capital expenditure
- Like every coin, it has a flip side:
  - Risk of handing over confidential data and software to cloud providers
  - Sharing physical resources with unknown parties (other tenants of the cloud)
Not a Goal
- Malicious behavior by the cloud provider
- The traditional threat where the attacker attempts to exploit vulnerabilities in the software
Goal
- Explore the practicality of mounting cross-VM attacks in existing third-party compute clouds
- Focus on EC2 and then generalize to all compute clouds, such as Microsoft's Windows Azure, Rackspace's Mosso, etc.
- Use only standard customer capabilities to compromise privacy
- Use the CPU's data caches for cross-VM side-channel information leakage
Hey! You! Get off of my cloud
Don't hang around 'cause two's a crowd
On my cloud
By The Rolling Stones, 1965.
Overview
The attack contains two main steps:
- Placement
- Extraction
Technical Overview
Placement:
- Cloud cartography
- Determining co-residence
  - Matching Domain 0 IP address
  - Small packet round-trip times
  - Numerically close IP addresses
Extraction:
- Cross-VM information leakage
  - Measuring cache usage
  - Load-based co-residence detection
  - Estimating traffic rates
  - Keystroke timing attack
Determining co-residence
- Matching Dom0 IP addresses
- Small packet round-trip times
- Numerically close internal IP addresses
- Load-based co-residence detection
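The first three network-based checks listed above can be sketched as a small helper. This is a minimal sketch under stated assumptions: the dictionary keys, the function name, and both thresholds (`max_rtt_ms`, `max_ip_gap`) are hypothetical placeholders, and the measurements themselves (Dom0 address, round-trip time) are assumed to have been collected separately with ordinary network probes.

```python
import ipaddress


def coresidence_signals(attacker, victim, rtt_ms, max_rtt_ms=0.3, max_ip_gap=7):
    """Evaluate three co-residence hints for a pair of instances.

    attacker / victim: dicts with hypothetical keys "dom0_ip" and "internal_ip".
    rtt_ms: measured small-packet round-trip time between the two instances.
    The thresholds are illustrative assumptions, not measured values.
    """
    gap = abs(int(ipaddress.ip_address(attacker["internal_ip"]))
              - int(ipaddress.ip_address(victim["internal_ip"])))
    return {
        "same_dom0": attacker["dom0_ip"] == victim["dom0_ip"],
        "small_rtt": rtt_ms <= max_rtt_ms,
        "close_internal_ip": gap <= max_ip_gap,
    }


signals = coresidence_signals(
    {"dom0_ip": "10.0.0.1", "internal_ip": "10.0.0.5"},
    {"dom0_ip": "10.0.0.1", "internal_ip": "10.0.0.9"},
    rtt_ms=0.2,
)
```

Returning the individual signals rather than a single verdict leaves the caller free to decide how many hints must agree; load-based detection is a separate, cache-based check and is not covered here.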
Prime, Trigger, Probe
[Figure: victim and attacker sharing a cache]
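The three phases can be illustrated with a toy simulation. This is a sketch only: it models the shared cache as a list recording which party last filled each cache set, so a "miss" is directly visible, whereas a real attacker would have to infer misses from access timing. The class and function names are hypothetical.

```python
class SimCache:
    """Toy direct-mapped cache: each set remembers who filled it last."""

    def __init__(self, num_sets):
        self.owner = [None] * num_sets

    def access(self, who, set_index):
        """Touch a set; return True on a hit, False on a miss (then refill)."""
        hit = self.owner[set_index] == who
        self.owner[set_index] = who
        return hit


def prime_trigger_probe(cache, num_sets, victim_work):
    # Prime: the attacker fills every cache set with its own data.
    for s in range(num_sets):
        cache.access("attacker", s)
    # Trigger: let the victim run; its accesses evict attacker lines.
    victim_work(cache)
    # Probe: the attacker re-reads its data; a miss marks a set the victim used.
    return [s for s in range(num_sets) if not cache.access("attacker", s)]


def victim_work(cache):
    # Hypothetical victim activity touching two cache sets.
    for s in (3, 7):
        cache.access("victim", s)


touched = prime_trigger_probe(SimCache(16), 16, victim_work)
print(touched)  # prints [3, 7]
```

The list of evicted sets is the leaked signal: the more sets come back as misses, the more cache activity the co-resident victim generated during the trigger window.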
Demerits
- The workings of the cache-based covert channel should have been explained in more depth, as it is the heart of the research
- All experiments assumed that no VMs other than the victim VM, the attacker VM, and the Domain0 VM were present on the shared machine
- The keystroke attack was performed on an EC2-like environment, not on EC2 itself
- Scans ports 80 and 443 only, leaving out several other public services on EC2
Merits
- Provides in-depth background before diving into the research idea
- Supports all the assumptions by proving them directly or indirectly through the experiments
- Structured experiment with various phases, one leading to another
- All experiments done on the EC2 environment
- A cost-effective solution is proposed at the end
- Motivated cloud providers to investigate potential exploits and loopholes and to make efforts to bolster security
- Effect of various parameters analyzed, such as zone, account, and time of day
Solutions Proposed
- Obfuscate the internal structure of services and the placement policy
- Side channels must be anticipated and blinded
- Foolproof solution: users must be allocated physical machines populated only by their own VMs or known harmless VMs
Amazon's response
- "The side channel techniques presented are based on testing results from a carefully controlled lab environment with configurations that do not match the actual Amazon EC2 environment."
- "As the researchers point out, there are a number of factors that would make such an attack significantly more difficult in practice." (Such as noise from other VMs, and the victim and attacker being able to coordinate as shown in the Prime-Trigger-Probe method.)
- Officials from Amazon characterized the attack described in the report as "hypothetical," and one that would be "significantly more difficult in practice."
http://www.techworld.com.au/article/324189/amazon_downplays_report_highlighting_vulnerabilities_its_cloud_service
Questions and discussions.
Discussion Topics
- A weakness is only a vulnerability if it's worth exploiting. Does this paper prove the existence or the worth of a vulnerability?
- Why do you think Amazon exhibits these vulnerabilities?
- Would allowing users to specify where to run their instances be better, worse, or the same for security?
- What are the disadvantages of a random placement policy?
Additional Data-related Challenges
Both of today's papers assume trust in the cloud provider
But should the provider always be trusted?
- Weak and dynamic contractual agreements (ToS) result in weak assurances by the cloud
- Provider could respond to subpoenas without the client's knowledge or control
- Provider may have monetary incentives to reveal your data to unintended parties (ad companies, insurance companies)
- Provider may want to lower replication factors for your long-unused data to save on storage costs
- Provider may want to keep your data even after you request its deletion to be able to mine it
Next Class: Crypto Solutions for Cloud Security
- Public key encryption with keyword search. D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano. In Proceedings of the IACR Annual Eurocrypt Conference, 2004.
- CryptDB: Protecting Confidentiality with Encrypted Query Processing. Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, and Hari Balakrishnan. In Proceedings of SOSP, 2011.
Assignments for next time:
1. Submit your project description by next time (Feb 11)
   - PDF or TXT format please; send to both the TA and the instructor
2. Sign up for paper presentations by next time (Feb 11)
   - Send this to the TA
   - If you don't sign up, we will assign a paper ourselves