Challenges for Large Distributed IaaS Cloud -- WIDE Cloud --
Yuji Sekiya
The University of Tokyo, Information Technology Centre <sekiya@nc.u-tokyo.ac.jp>
WIDE Project <sekiya@wide.ad.jp>
Cloud?
What is a cloud?
  Scalability
  Elasticity
  Availability
Is an IaaS cloud really operated as a cloud?
  How does it differ from a hosting or VPS (Virtual Private Server) service?
  Resources are connected via the network
  Resources are widely distributed
Test Bed for Distributed IaaS
WIDE Cloud is built and operated by WIDE Project
  Distributed inter-IaaS cloud
  Inter-university / inter-datacenter IaaS
  Shares computing and networking resources
Real users and real operation
  Over 100 users and over 300 VMs are running (as of today)
  In operation since March 2010
  http://www.wide.ad.jp/, DNS servers, and mail servers run in WIDE Cloud
Testbed for new technologies in a real environment
  Operational servers are running on it
  A real testbed for cloud technologies
More information: https://wcc.wide.ad.jp/
Overview of WIDE Cloud
Inter-university cloud: sharing the resources of each private cloud
[Figure: sites of WIDE Cloud: NAIST, JAIST, Univ. of Tokyo, Tokyo DC, Osaka DC, SFO DC, CNU (Korea), Keio Univ. (KMD), Keio Univ. (SFC)]
Characteristics of WIDE Cloud
  1. Widely distributed IaaS
  2. Connected using commodity Internet reachability, not dedicated circuits; fully IPv6 capable
  3. Resource sharing based on the policies of each organization
  4. Redundant architecture
Why Location-Level Redundancy?
Japan was hit by a major earthquake on March 11, 2011.
  It caused a serious power supply problem in the Tokyo and Tohoku areas.
  In March there were scheduled blackouts in the Tokyo area.
  Unfortunately, Keio University had a two-hour blackout twice a day.
  Several important servers are hosted at Keio University.
  They had to be shut down or run on UPS / generator power.
EU-JP Symposium at FIA
Technologies of WIDE Cloud
[Figure: layered architecture of WIDE Cloud]
  Layers: Application Layer, Control Layer, Middleware Layer, Virtual Resource Layer, Facility Layer
  Components shown: SQL, RESTful API, Image, libvirt, Cloud Controller, NoSQL, NEMO, KVM, NFS, Sheepdog, iSCSI, map646, VLAN
  Resource columns: Network, Server, Storage, Application
Challenges in WIDE Cloud
Location-level redundancy
  Network migration
  DHT-based distributed storage replication (hot standby)
  [Figure: distributed storage replicated across Site A, Site B, and Site C]
Resource allocation
  Connecting the private clouds of each organization
  Sharing resources based on policies
Migratable network
  VXLAN + LISP / NEMO
  [Figure: migratable network between Site A and Site B]
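The DHT-based replication item above can be illustrated with a consistent-hash ring. This is only a minimal sketch, not the placement logic of Sheepdog or of WIDE Cloud: the site names, the object identifier, the `HashRing` class, and the number of virtual nodes are all assumptions made for illustration. Each object is hashed onto a ring, and its copies go to the next distinct sites found clockwise, so every object has a replica at more than one location.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring that places replicas on distinct sites."""

    def __init__(self, sites, vnodes=64):
        if vnodes < 1:
            raise ValueError("vnodes must be at least 1")
        # Each site appears at several ring positions (virtual nodes)
        # so that objects spread evenly across sites.
        self._ring = sorted(
            (_hash(f"{site}#{i}"), site) for site in set(sites) for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._site_count = len(set(sites))

    def replicas(self, object_id: str, copies: int = 2):
        """Return the distinct sites that should hold a copy of object_id."""
        copies = min(copies, self._site_count)
        index = bisect.bisect_right(self._points, _hash(object_id))
        chosen = []
        # Walk clockwise from the object's position, skipping repeated sites.
        while len(chosen) < copies:
            site = self._ring[index % len(self._ring)][1]
            if site not in chosen:
                chosen.append(site)
            index += 1
        return chosen


if __name__ == "__main__":
    ring = HashRing(["site-a", "site-b", "site-c"])
    print(ring.replicas("vm-disk-0001", copies=2))
```

With this kind of placement, adding or removing a site only moves the objects whose ring segment changed, which is the usual reason for choosing a DHT for widely distributed storage.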
Network Portability by NEMO
NEMO-based network migration
Stateless IPv6/IPv4 translator for servers
[Figure: Datacenters 1 to 4, each with hypervisors and their VMs, connected over the IPv6 Internet and the IPv4 Internet]
Network Portability by VXLAN + LISP
VXLAN can transport any network to a remote site connected via a Layer-3 network
  [Figure: an ARP packet crossing the backbone network]
  Original ARP packet: SRC = Node A, DST = L2 broadcast
  VXLAN encapsulation at VTEP A: SRC = VTEP A, DST = L3 multicast
  After decapsulation: ARP packet with SRC = Node A, DST = L2 broadcast
Combination with LISP
  LISP can accommodate an AS in the cloud
  No need for a home network
  The nearest LISP edge is used for inbound and outbound traffic
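The encapsulation step in the figure can be sketched as follows. This is a minimal illustration of the 8-byte VXLAN header defined in RFC 7348 (a flags byte with the VNI-valid bit, a 24-bit VNI, and reserved fields); the outer Ethernet, IP, and UDP headers that a real VTEP adds are omitted, and the function names, the VNI value, and the sample ARP frame are assumptions made for the example.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN header to an Ethernet frame (outer IP/UDP not included)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) + reserved (24 bits).
    # Word 1: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(payload: bytes):
    """Return (vni, inner_frame) from a VXLAN UDP payload."""
    if len(payload) < 8:
        raise ValueError("payload shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", payload[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8, payload[8:]


if __name__ == "__main__":
    # Hypothetical ARP frame from Node A to the L2 broadcast address.
    node_a = bytes.fromhex("02000000000a")
    arp_frame = b"\xff" * 6 + node_a + struct.pack("!H", 0x0806) + b"\x00" * 28
    packet = vxlan_encap(5000, arp_frame)
    print(packet[:8].hex(), vxlan_decap(packet)[0])
```

The inner frame is carried unchanged, which is why the ARP packet that leaves the remote VTEP still has Node A as its source and the L2 broadcast address as its destination.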
Advantages
NEMO
  Easy to operate
  Any network can be transported to any remote site connected by a Layer-3 network
VXLAN + LISP
  The combination of VXLAN and LISP can transport any network to a LISP-capable remote site
  An AS (Autonomous System) can be operated in the IaaS cloud
  No need for a home network
  Traffic to and from the Internet is optimized (no tunnel)
Resource Manager
Resource management for a federated cloud
  Each organization can define its sharing policy for
    CPU + memory
    Network
    Storage
  Each organization defines thresholds for its resources
Why is resource management needed?
  Each organization wants to define which resources it occupies itself and which it shares
WIDE Project created its own Cloud Manager to evaluate the effectiveness of inter-clouds.
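A per-organization sharing policy with thresholds could look like the sketch below. It is not the WIDE Project Cloud Manager: the `SharingPolicy` class, the `can_admit` function, the resource names, and all numbers are hypothetical and serve only to show the idea of reserving part of the capacity for the owner and sharing the rest.

```python
from dataclasses import dataclass, field


@dataclass
class SharingPolicy:
    """Hypothetical sharing policy of one organization."""

    organization: str
    capacity: dict = field(default_factory=dict)  # total amount per resource
    reserved: dict = field(default_factory=dict)  # kept for the owner's own use

    def shareable(self, resource: str) -> float:
        """Amount of a resource that other organizations may use (the threshold)."""
        return max(self.capacity.get(resource, 0) - self.reserved.get(resource, 0), 0)


def can_admit(policy: SharingPolicy, used_by_guests: dict, request: dict) -> bool:
    """Accept a guest request only if every resource stays within the threshold."""
    for resource, amount in request.items():
        if used_by_guests.get(resource, 0) + amount > policy.shareable(resource):
            return False
    return True


if __name__ == "__main__":
    policy = SharingPolicy(
        organization="example-univ",
        capacity={"cpu_cores": 64, "memory_gb": 256, "storage_gb": 4000},
        reserved={"cpu_cores": 48, "memory_gb": 192, "storage_gb": 3000},
    )
    guests = {"cpu_cores": 12, "memory_gb": 32, "storage_gb": 500}
    print(can_admit(policy, guests, {"cpu_cores": 4, "memory_gb": 16}))
```

Keeping the policy as plain data per organization means each site can change its own thresholds without coordinating with the others.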
Resource Allocation
Initial allocation
  Depends on the user
Migration
  Where to migrate?
  VM migration
  Storage migration
  Network migration
  I/O must be considered for migrations
Is migration effective for reducing server / network load?
[Figure: servers server0 to Server5 marked as underloaded or overloaded (server2 appears next to the "Overloaded server" label); axes: Load, Location; labels: "Initiate migration when", "Weight of load part", "Weight of location part"]
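The "where to migrate?" question, with one weight for the load part and one for the location part, can be sketched as a weighted score. The slide does not give the actual formula, so everything here is an assumption: the linear score, the overload threshold of 0.8, the normalized distance values, and the function names are chosen only to illustrate how the two weights trade off against each other.

```python
def placement_score(load: float, distance: float, w_load: float, w_location: float) -> float:
    """Lower is better: combine the target's load with its distance from the source."""
    return w_load * load + w_location * distance


def choose_target(source, loads, distances, w_load=1.0, w_location=1.0, overload=0.8):
    """Pick a migration target, or None if the source is not overloaded.

    loads:     server name -> utilization in [0, 1]
    distances: server name -> normalized distance from the source in [0, 1]
    """
    if loads[source] <= overload:
        return None  # migration is only initiated from an overloaded server
    candidates = [name for name in loads if name != source and loads[name] < overload]
    if not candidates:
        return None
    # Ties are broken by server name so the choice is deterministic.
    return min(
        candidates,
        key=lambda name: (placement_score(loads[name], distances[name], w_load, w_location), name),
    )


if __name__ == "__main__":
    loads = {"server0": 0.2, "server1": 0.1, "server2": 0.95, "server3": 0.5}
    distances = {"server0": 0.0, "server1": 1.0, "server3": 0.5}
    print(choose_target("server2", loads, distances))
    print(choose_target("server2", loads, distances, w_location=0.0))
```

Raising the location weight keeps workloads close to where they started, while raising the load weight favors the least busy server even if it is at a remote site; the I/O cost of the migration itself is not modeled in this sketch.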
Working Items and ToDo
High availability / redundancy
  Network migration
  LISP in cloud
  Management by OpenFlow
Distributed filesystems
  Ceph
  Sheepdog
Resource management
  Automatic resource migration of VMs and storage
Security (not yet)
  Data confidentiality