Challenges in Modern Data Centers Management
Edi Shmueli, Spring 2015
Information provided in these slides is for educational purposes only
Welcome
  Hebrew - Shalom
  Arabic - Ahlan wa sahla
  Bosnian - Dobrodošli
  Chinese (Cantonese) - (fòonying)
  Chinese (Mandarin) - 欢迎 [simplified], 歡迎 [traditional] (huānyíng)
  Czech - Vítáme tě
  Danish - Velkommen
  Dutch - Welkom
  French - Bienvenue
  German - Willkommen
Outline
  Introduction
    Administrative and academic stuff
  Data centers
    History and facts
  Our course
    Lecture by lecture: what will be covered in each?
Introduction
Me (Edi)
  With Intel since 2011
  Formerly with IBM (almost 17 years)
  PhD in Computer Science from the Hebrew University
    Prof. Dror G. Feitelson
  Interested in anything related to systems
    OS, virtualization, storage, etc.
    Distributed systems resource management & job scheduling
    Performance evaluation and modeling
    Etc.
Why the course?
  Data centers are big businesses
    50 years of technological evolution
    Special skills required to operate them (experience, legacy)
  IT team in Intel (IDC-Haifa)
    Responsible for the data center facility, continuous operation, solutions development and deployment, user engagement, etc.
    Huge experience (legacy)
  Goal is to expose some of this experience in a structured way
    1. Challenges we face
    2. Solutions, e.g., technologies and algorithms used to address them
    3. Considerations when choosing (or designing) solutions
Administration
  Edi (me)
    Main instructor and responsible for the course
    edi.shmueli@intel.com
  Jalil (him)
    Our super-talented teaching assistant
    jalilm@cs.technion.ac.il
  Danny (not here)
    Advisor and high-level supervisor
    Danny@cs.technion.ac.il
  Important dates
    Lectures: Wednesdays 14:30-16:30, Taub 6
    Exams: Moed A 3/7/2015, Moed B 20/9/2015
Academic
  Prerequisites
    Basic knowledge of networking, computer systems, and distributed systems (e.g., clusters) should be enough
  Requirements
    1. Must attend 80% of the lectures
    2. Must deliver homework assignments (30% of the grade)
       4-5 assignments
    3. Final exam (70% of the grade)
       2-3 open questions + a few closed ones (multiple-choice)
       In the spirit of the homework assignments
  Our site
    http://webcourse.cs.technion.ac.il/236634/spring2015/en/general_info.html
Schedule (tentative)
Questions? (Did I forget something?)
Data centers
Data center
  Facility used to house computer systems and associated components
    Telecommunications, storage systems, etc.
  Production floor of most modern companies, e.g.,
    Google: information processing, etc.
    Amazon: sales, hosting (AWS), etc.
    Facebook: social networking
  Images courtesy of wikipedia.org, CC BY-SA 2.0 and CC BY-SA 3.0
History
  Started as a facility to house old, complex computing systems
  Courtesy of wikipedia.org, CC BY-SA 3.0
History cont.
  Big boost during the Dot-Com era (1997-2000)
    Companies emerged whose business revolved solely around the Web
    Requiring fast Internet connectivity and 24/7 non-stop operation
  Special facilities built to house such businesses
    Internet data centers (IDC)
    Leading to new technologies and practices
    Eventually migrated to the private data centers
  Grid-computing phenomenon
    Great vision, big complexities
  Courtesy of wikipedia.org
History cont.
  Another big transformation as part of the Cloud era (2007+)
  New design and deployment philosophy
    Redundant (multiple copies), scalable (elasticity), highly available (stateless)
  Technology makes hosting economically attractive
    Even for large-scale enterprises
  Environmental impact receives special attention
    Standards bodies specify requirements
    Huge effort to make data centers appear Green
  Courtesy of wikipedia.org, CC BY-SA 3.0
Some facts
  A large data center can consume as much electricity as a small town
    In 2010, data centers accounted for 1.1%-1.5% of global electricity use
  Electricity spending accounts for 25-30% of a data center's TCO
    Up to 50% of which might go to cooling
  http://dilbert.com/strips/comic/2009-08-09. Used for educational purposes only.
Some facts cont.
  Average life of a data center is 9 years
    Data centers older than 7 years are considered out-of-date (Green computing)
  A minute of data center downtime may cost tens of thousands of dollars
    High availability is a critical component in the design
Facebook Prineville data center, opened April 2011
  Originally posted to Flickr, CC BY-SA 2.0
IBM BlueGene/P
  Argonne National Laboratory (ANL), Lemont, Illinois, USA
  Originally posted to Flickr, CC BY-SA 2.0
Our course
Our course
  Focuses on common management challenges
    Generic enough so they fit most usage models
  Impossible to cover all challenges
    Filtered the ones the team has experience with
    Chose the ones we believe are most important
  The team
    Responsible for the data center facility, continuous operation, solution development & deployment, etc.
    Domain experts: facility, networking, resource management, storage, business intelligence and analytics, security, etc.
  Lectures: Introduction, Facility basics, Networks, RM Part I, RM Part II, RM Part III, Data access, Business Intelligence, Predictive Analytics, DC visit, Security Part I, Security Part II, Summary
Facility basics
  Building a data center is expensive
    Construction of a single rack location can cost up to $80K
    Total spend can reach hundreds of millions of dollars
  Four main elements of the facility
    Power, cooling, space, networks
  Total Cost of Ownership (TCO)
    Initial capital expenditures (CapEx)
    Long-term operational expenditures (OpEx)
  Power Usage Effectiveness (PUE)
    Power-efficiency performance indicator
  Courtesy of http://datacenter10.blogspot.co.il/ for educational purposes only
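PUE is commonly defined as total facility power divided by the power delivered to IT equipment, so a value of 1.0 would mean no overhead for cooling, power distribution, and so on. The sketch below only illustrates that ratio; the function name and the kW figures are hypothetical and do not come from the slides.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical numbers: 1,000 kW reaches the servers, while the whole
# facility (servers + cooling + power distribution losses) draws 1,500 kW.
example_pue = pue(total_facility_kw=1500.0, it_equipment_kw=1000.0)
print(f"PUE = {example_pue:.2f}")
```

The closer the ratio is to 1.0, the smaller the share of power spent on anything other than the IT equipment itself.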
Facility challenges
  Optimizing cooling
    Hot aisle, hot/cold air containment, free cooling
  Optimizing power feeding
    Redundancy dilemma
    AC vs. DC
  Optimizing refresh rate
    4-year optimal lifespan
  Courtesy of psmtyech.com.sg for educational purposes only
Networks
  Veins and arteries of the data center
    Play a key role in its performance and high availability
  Ensuring adequate availability
    Redundancy at layer 2 (data link)
      Spanning Tree Protocol (STP) & RSTP
      Per-VLAN Spanning Tree (PVST)
      Multi-switch link aggregation (M-LAG)
    Redundancy at layer 3 (IP)
      Virtual Router Redundancy Protocol (VRRP)
  Courtesy of wikipedia.org, CC BY 3.0
Resource management I-III
  5% resource waste in a 10K-server data center can cost $3K/day ($1M/year)
    It is critical to utilize resources as efficiently as possible
  Resource management system (RMS) / scheduler
    1. Accepts requests from the users (millions per day)
       VMs (Amazon), map-reduce (Hadoop), chip simulations (Intel), etc.
    2. Queues and prioritizes them (decides which job to execute next)
       Subject to constraints, e.g., ensuring shares, deadlines, etc.
    3. Allocates resources and launches the jobs on the selected resources
       Various heuristics
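A quick check of the waste arithmetic on the slide above. The per-server daily cost is not stated in the slides; it is derived here from the quoted $3K/day figure, so treat it as an implied assumption rather than a measured number.

```python
SERVERS = 10_000          # data center size quoted on the slide
WASTE_PERCENT = 5         # share of resources wasted, from the slide
COST_PER_DAY_USD = 3_000  # daily cost of that waste, from the slide

# 5% of 10,000 servers sit idle or are otherwise wasted.
wasted_servers = SERVERS * WASTE_PERCENT // 100

# Implied (not stated in the slides): cost of one wasted server per day.
implied_cost_per_server_day = COST_PER_DAY_USD / wasted_servers

# Yearly cost, which the slide rounds to roughly $1M.
cost_per_year_usd = COST_PER_DAY_USD * 365

print(wasted_servers, implied_cost_per_server_day, cost_per_year_usd)
```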
Resource management I
  Proportional-share scheduling
    Very common scheduling heuristic used in data centers
    Every entity (VO, project, user) should get its promised share of the resources
  Challenges
    1. How to measure resource consumption?
       1 core x 4 GB vs. 3 cores x 1 GB
    2. How to ensure fast ramp-up?
       Limits, logical and physical buffers
    3. Considering history
       Is this really important?
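A minimal sketch of the idea on the slide above, under stated assumptions. The slides do not say how consumption is measured; here a job is charged by the larger of its core fraction and its memory fraction (a dominant-resource style measure, chosen only for illustration). The `Entity` class, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    share: float       # promised fraction of the cluster, e.g. 0.25
    used: float = 0.0  # accumulated normalized consumption


def job_cost(cores: int, mem_gb: int, total_cores: int, total_mem_gb: int) -> float:
    """Assumed measure: the larger of the core and memory fractions."""
    return max(cores / total_cores, mem_gb / total_mem_gb)


def pick_next(entities: list[Entity]) -> Entity:
    """Serve the entity that is furthest below its promised share."""
    return min(entities, key=lambda e: e.used / e.share)


# Hypothetical cluster: 100 cores and 400 GB of memory.
cost_a = job_cost(cores=1, mem_gb=4, total_cores=100, total_mem_gb=400)
cost_b = job_cost(cores=3, mem_gb=1, total_cores=100, total_mem_gb=400)

entities = [
    Entity("proj_a", share=0.50, used=0.30),
    Entity("proj_b", share=0.25, used=0.05),
    Entity("proj_c", share=0.25, used=0.20),
]
print(cost_a, cost_b, pick_next(entities).name)
```

Under this measure the 3-core job is charged more than the 1-core job even though it uses less memory, which is exactly the kind of trade-off the first challenge asks about. The sketch ignores ramp-up and history entirely.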
Resource management II
  Matching the jobs with available resources
    Best-fit, worst-fit, first-fit, random, mix-fit, dynamic programming, etc.
  Challenges
    1. Optimizing resource matching
       Single vs. multiple dimensions
       One job at a time vs. multiple jobs (look-ahead)
    2. Dealing with jobs that cannot be scheduled
       Reservation (backfilling)
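Three of the matching heuristics named above can be sketched for the simplest case: a single dimension (free cores per server) and one job at a time. The function names and the example numbers are hypothetical; multi-dimensional matching, look-ahead, and backfilling are not covered here.

```python
from typing import Optional


def first_fit(free: list[int], demand: int) -> Optional[int]:
    """Index of the first server with enough free capacity."""
    for i, capacity in enumerate(free):
        if capacity >= demand:
            return i
    return None


def best_fit(free: list[int], demand: int) -> Optional[int]:
    """Index of the fitting server that leaves the least capacity unused."""
    fits = [i for i, capacity in enumerate(free) if capacity >= demand]
    return min(fits, key=lambda i: free[i]) if fits else None


def worst_fit(free: list[int], demand: int) -> Optional[int]:
    """Index of the fitting server that leaves the most capacity unused."""
    fits = [i for i, capacity in enumerate(free) if capacity >= demand]
    return max(fits, key=lambda i: free[i]) if fits else None


# Hypothetical free cores on four servers, and a job asking for 3 cores.
free_cores = [8, 2, 16, 4]
print(first_fit(free_cores, 3), best_fit(free_cores, 3), worst_fit(free_cores, 3))
```

When every heuristic returns `None`, the job cannot be scheduled right now, which is the situation the second challenge (reservation and backfilling) addresses.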
Resource management III
  Going global (meta-scheduling)
    Ensuring QoS, load balancing
  Practical considerations
    Scalability, robustness, usability
Data access
  Jobs (VMs, map-reduce, simulations) use data
    Huge burden on the storage (DoS attacks)
  Challenges
    1. Avoiding DoS within a data center
       Storage-side: scale-out storage, Parallel NFS, etc.
       Client-side: cachefs, CaMA (RO)
    2. Enabling remote data access (going global)
       Synchronous and asynchronous replication
       Site-level caching, etc.
  Continuous Integration (CI) use case
    Know your workload
Our course cont.
  Lectures: Introduction, Facility basics, Networks, RM Part I, RM Part II, RM Part III, Data access, Business Intelligence, Predictive Analytics, DC visit, Security Part I, Security Part II, Summary
Business Intelligence (BI)
  Goal is to provide insights on the data center to help optimize its operation
    E.g., statistics on resource usage to help decide which equipment to buy
  Involves collecting, preparing, storing, analyzing, and accessing the data
    Challenges in each layer, e.g., impact on the source system, responsiveness, etc.
  Focusing on data analysis
    Optimizing data queries (SQL)
    Join-sort-aggregate implementations
    How to assemble them optimally using time and space considerations
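To make the time-versus-space consideration concrete, here is a sketch of two textbook ways to implement an equi-join: a hash join, which needs extra memory for a hash table, and a sort-merge join, which pays for sorting instead. The slides do not specify which implementations the lecture covers; the table contents and function names are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Build a hash table on `left`, then probe it with each row of `right`."""
    table = defaultdict(list)
    for row in left:
        table[row[key]].append(row)
    return [{**l, **r} for r in right for l in table.get(r[key], [])]


def sort_merge_join(left, right, key):
    """Sort both inputs on `key`, then merge them in a single pass."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical data: job records and their measured core usage.
jobs = [{"job_id": 1, "user": "a"}, {"job_id": 2, "user": "b"}, {"job_id": 3, "user": "a"}]
usage = [{"job_id": 1, "cores": 4}, {"job_id": 3, "cores": 8}, {"job_id": 4, "cores": 2}]
print(hash_join(jobs, usage, "job_id"))
print(sort_merge_join(jobs, usage, "job_id"))
```

Both functions return the same rows. Which one is preferable depends on whether memory for the hash table is available and whether the inputs are already sorted.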
Predictive analytics
  One of the important usages for BI in the data center
    Helps systems, e.g., the job scheduler, take data-driven actions in real time
  Deep dive into one such use case
    Predicting jobs' resource usage for optimizing resource allocation
  Data-Stream Mining (DSM)
    Continuous (endless), rapid incoming data
    Machine learning must be applied online
Predictive analytics challenges
  Performance (real-time)
    Impossible to store all data (train): each sample must be processed once
  Adaptability
    Non-stationary data: model must be adaptable (sliding windows)
  Quality, availability
    Perform at least as well as non-stream models
    Prediction must be provided continuously
  Cover well-known algorithms
    Regression trees, decision trees, etc.
    Multiple Sliding Windows (MSW), Mimran & Even, 2013
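The constraints above (process each sample once, adapt to non-stationary data) can be illustrated with a very simple online predictor: the mean of the most recent samples in a sliding window. This is not the MSW algorithm cited on the slide, only a minimal sketch of the sliding-window idea; the class name, window size, and sample values are hypothetical.

```python
from collections import deque


class SlidingWindowMean:
    """Predict the next value as the mean of the last `window_size` samples."""

    def __init__(self, window_size: int):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.window = deque()
        self.total = 0.0

    def update(self, value: float) -> None:
        """Process one sample exactly once, in O(1) time and bounded memory."""
        self.window.append(value)
        self.total += value
        if len(self.window) > self.window_size:
            self.total -= self.window.popleft()

    def predict(self, default: float = 0.0) -> float:
        """A prediction is always available, even before any sample arrives."""
        return self.total / len(self.window) if self.window else default


# Hypothetical stream of per-job memory usage (GB) whose level shifts.
model = SlidingWindowMean(window_size=3)
for sample in [2, 4, 6, 100, 100, 100]:
    model.update(sample)
print(model.predict())
```

Because old samples fall out of the window, the prediction follows the new level of the stream once enough recent samples have arrived.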
Security
  Securing the data center is complex
    A security breach can cause loss of money, reputation, and IP, and lead to legal action
  Security control model helps organize things
    Divide the data center into layers
    For each layer, describe its attack vectors, vulnerabilities, and controls
  Challenges in two layers
    1. Applications
       Web applications
         Code injections, e.g., SQL injection
         Web manipulations, e.g., clickjacking
       Web services
    2. Identity and Access Management (IAM)
       Managing multiple identities (SAML)
       Authentication
         Knowledge factors: passwords, Kerberos
         Possession factors: certificates, OTP
         Inherence factors: biometrics (e.g., fingerprints)
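As an illustration of the SQL injection item above, the sketch below contrasts a query built by string concatenation with a parameterized query, using Python's standard `sqlite3` module and an in-memory database. The table, the function names, and the data are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name: str):
    """Vulnerable: user input is pasted directly into the SQL text."""
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def lookup_safe(name: str):
    """Parameterized: the driver passes `name` as data, never as SQL."""
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


# The attacker closes the quote and appends a condition that is always true.
malicious = "nobody' OR '1'='1"
print(lookup_unsafe(malicious))  # leaks every row in the table
print(lookup_safe(malicious))    # matches no user
```

Parameterized queries are one control for this attack vector. They do not replace the other controls a layered security model would list for the application layer.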
Visit to IDC data center (3/6)
Summary
  Course is unique
    Covers actual challenges encountered in real environments
    Delivered by domain experts with huge experience in designing and deploying solutions
  Interaction is important
    Don't hesitate to ask (tough) questions
  Enjoy