oVirt and Gluster Hyperconvergence
January 2015
Federico Simoncelli, Principal Software Engineer, Red Hat
oVirt and GlusterFS Hyperconvergence, Jan 2015
Agenda
- oVirt Architecture and Software-defined Data Center
- GlusterFS Architecture and Important Concepts
- Hyperconverged oVirt GlusterFS
- Managing GlusterFS
- Gluster Storage Domain and Improving UX
- Underlying Block Devices, Management and Monitoring
- Data Replication and Scaling-out
- QEMU libgfapi
- Conclusions
oVirt and its Architecture
- oVirt is a virtualization platform to manage virtual machines, storage and networks
- Engine (ovirt-engine): manages the oVirt hosts and allows system administrators to create and deploy new VMs
- Host Agent (VDSM): the oVirt Engine communicates with VDSM to manage the VMs, storage and networks
[Diagram: Engine, Clusters of Hosts, Storage Servers]
oVirt Storage Architecture
- Storage Domain
  - Centralized storage system (images, templates, etc.)
  - A standalone storage entity (implemented with NFS, GlusterFS, FCP, iSCSI)
  - Stores the images and associated metadata
  - Only real persistent storage for VDSM
- Storage Pool
  - Aggregates several Storage Domains
  - Supposed to simplify cross-domain operations
[Diagram: Engine, Clusters of VDSM Hosts, Storage Pool containing Storage Domains]
Software-defined Data Center
- "Software-defined data center is a vision for IT infrastructure that extends virtualization concepts such as abstraction, pooling, and automation to all of the data center's resources and services to achieve IT as a service (ITaaS)" [1]
- Compute virtualization (KVM)
- Software-defined networking (SDN): networking functionalities in a software-based solution
- Software-defined storage (SDS): storage capacity, performance and durability in a software-based solution
[Diagram: Engine, Clusters of VDSM Hosts, Software-defined Storage]
[1] http://en.wikipedia.org/wiki/software-defined_data_center
Hyperconverged Architecture
- "Hyperconvergence is a type of infrastructure system with a software-centric architecture that tightly integrates compute, storage, networking and virtualization resources and other technologies from scratch in a commodity hardware box supported by a single vendor" [1]
- Group direct-attached storage into a single namespace
- Reliably consume virtual machine images from the single namespace
[Diagram: Engine, hosts running both VMs and Storage, direct-attached storage (SAS/SATA)]
[1] http://searchvirtualstorage.techtarget.com/definition/hyper-convergence
GlusterFS and its Architecture
- GlusterFS is a general-purpose, scale-out distributed file system supporting thousands of clients
- Aggregates storage exports over a network interconnect to provide a single unified namespace
- File system implemented completely in userspace, runs on commodity hardware
- Layered on disk file systems that support extended attributes
GlusterFS Bricks
- A brick is an export directory located on a specific node (e.g. host-01:/srv/fs1/brick1)
- Each brick inherits the limits of the underlying file system
- No limit on the number of bricks per node (as a best practice, each brick in a cluster should be of the same size)

Brick              File system   Block device
/srv/fs1/brick1    /srv/fs1      /dev/dm-1
/srv/fs1/brick2    /srv/fs1      /dev/dm-1
/srv/fs2/brick3    /srv/fs2      /dev/dm-2
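A brick address of the form shown above splits into a node name and an export directory. The following is a minimal Python sketch of that split; the `Brick` type and the `parse_brick` helper are hypothetical names used only for illustration and are not part of GlusterFS or oVirt.

```python
from typing import NamedTuple


class Brick(NamedTuple):
    host: str  # node that exports the directory
    path: str  # absolute export directory on that node


def parse_brick(spec: str) -> Brick:
    """Split a 'HOST:/ABSOLUTE/PATH' brick address into host and path."""
    host, sep, path = spec.partition(":")
    if not sep or not host or not path.startswith("/"):
        raise ValueError(f"expected HOST:/ABSOLUTE/PATH, got {spec!r}")
    return Brick(host, path)


if __name__ == "__main__":
    print(parse_brick("host-01:/srv/fs1/brick1"))
```

Keeping host and path separate makes later checks (for example, that the bricks of one replica set live on different nodes) straightforward.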
GlusterFS Volumes
- A volume (the mountable entity) is a logical collection of bricks
- Bricks from the same node can be part of different volumes
- Different types of volumes: Distribute, Stripe, Replicate (and their combinations)
- The type of a volume is specified at volume creation time and determines how and where data is placed
[Diagram: Volume1 (Distribute/Stripe/Replicate) built from host01:/srv/fs1/brick1, host02:/srv/fs1/brick1, host03:/srv/fs1/brick1]
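Because the volume type is fixed at creation time, the type keywords and the brick list all go into the single `gluster volume create` invocation. The sketch below only builds the argument list for a replicated volume and executes nothing; `volume_create_argv` is a hypothetical helper, and the volume and brick names are the ones from the slide's diagram.

```python
from typing import List


def volume_create_argv(name: str, bricks: List[str], replica: int = 1) -> List[str]:
    """Build the argv for 'gluster volume create'; nothing is executed here."""
    if replica < 1:
        raise ValueError("replica count must be at least 1")
    if not bricks:
        raise ValueError("a volume needs at least one brick")
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    argv = ["gluster", "volume", "create", name]
    if replica > 1:
        # Omitting the 'replica' keyword yields a plain distributed volume.
        argv += ["replica", str(replica)]
    return argv + list(bricks)


if __name__ == "__main__":
    bricks = [f"host0{i}:/srv/fs1/brick1" for i in (1, 2, 3)]
    print(" ".join(volume_create_argv("Volume1", bricks, replica=3)))
```

Returning a list rather than a shell string keeps the command safe to hand to a process-spawning call without quoting issues.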
History of GlusterFS in oVirt
- oVirt 3.1 (August 2012): initial support for provisioning and managing GlusterFS-based storage clusters
- oVirt 3.2 (February 2013): support for importing and syncing existing GlusterFS clusters in oVirt Engine
- oVirt 3.3 (September 2013): Gluster Storage Domain as a new Storage Domain type that uses GlusterFS as the storage backend
- oVirt 3.4 (March 2014): support for rebalancing volumes and removing bricks
Hyperconverged oVirt GlusterFS
- The Data Center nodes are used both for virtualization and for serving replicated images from the GlusterFS bricks
- The boxes can be standardized (hardware and deployment) for easy addition and replacement
- Support for both scaling up (adding more disks) and scaling out (adding more hosts)
[Diagram: Engine, hosts running both VMs and Storage, one GlusterFS Volume spanning the bricks of every host]
Managing GlusterFS from oVirt
- Gluster Service support is located in the Cluster properties
- Deploy Hosts with GlusterFS Server support
- Enable Bricks and Volume Management from oVirt WebAdmin and the REST API
- The Engine does not yet take GlusterFS into account in the virtualization power-saving policies and in fencing
Managing GlusterFS from oVirt
- Gluster Volumes can be created and managed from WebAdmin and through the REST API
- Volume Profiling
- Volume Capacity Monitoring
Gluster Storage Domain
- An oVirt Storage Domain can be created from a GlusterFS Volume
In Progress: User Experience
- Seamless UX for GlusterFS Volumes and Storage Domains (relationship not limited to UX)
- Provide a list of GlusterFS volumes when the user selects GlusterFS as Storage Domain
- Automatically handle the virtualization tunables
- Warn about non-replica 3 GlusterFS Volumes (easy redirect to Volumes management)
- Add a menu option to create Storage Domains in the Gluster Volumes tab
http://www.ovirt.org/features/glusterfs_storage_domain
In Progress: GlusterFS Networks
- Currently, Gluster nodes use the same network for both VDSM management and Gluster traffic, which can choke the network
- The proposed solution is to separate out the GlusterFS traffic by selecting its network when bricks are added to a Gluster volume
[Diagram: Engine, hosts running both VMs and Storage, direct-attached storage (SAS/SATA)]
http://www.ovirt.org/features/select_network_for_gluster
Bricks Underlying Block Devices
- Intelligently combine fast IOPS devices (SSDs) and cost-effective large-capacity drives (HDDs)
- DM-Cache uses fast storage devices to act as a cache for one or more slower storage devices
- Battery-backed RAID cards provide reliability and performance (writeback)
[Diagram: storage stack, top to bottom: File-System / Bricks; LVM Thin-P + Data LV; Battery-Backed RAID / DM-Cache; /dev/sdb1, /dev/sdc1, /dev/sdd1]
In Progress: Host Device Management
- Identify available disks and storage devices
- Create new bricks by creating new Thin Logical Volumes (or expand an existing brick)
- Format the logical volume with the selected file system and manage the fstab entry
http://www.ovirt.org/features/glusterhostdiskmanagement
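The steps above (thin logical volume, file system, fstab entry) can be sketched as a provisioning plan. This is a minimal Python illustration, not the oVirt implementation: it only builds the command lines and the fstab line without running anything. The volume group, thin pool, logical volume and mount point names are hypothetical, an existing thin pool is assumed, and XFS with 512-byte inodes is an assumed file-system choice.

```python
from typing import List, Tuple


def brick_provisioning_plan(
    vg: str, pool: str, lv: str, size: str, mount_point: str
) -> Tuple[List[List[str]], str]:
    """Return (commands, fstab_line) for a new thin-LV backed brick.

    Assumes the thin pool vg/pool already exists; nothing is executed.
    """
    device = f"/dev/{vg}/{lv}"
    commands = [
        # Thin logical volume carved out of the existing thin pool.
        ["lvcreate", "--name", lv, "--virtualsize", size, "--thinpool", f"{vg}/{pool}"],
        # Assumed file system: XFS with 512-byte inodes (room for extended attributes).
        ["mkfs.xfs", "-i", "size=512", device],
        ["mkdir", "-p", mount_point],
    ]
    fstab_line = f"{device} {mount_point} xfs defaults 0 0"
    return commands, fstab_line


if __name__ == "__main__":
    cmds, fstab = brick_provisioning_plan("vg_bricks", "pool0", "brick1", "500G", "/srv/fs1")
    for cmd in cmds:
        print(" ".join(cmd))
    print(fstab)
```

Separating the plan from its execution makes it possible to show the administrator what will be done before any device is touched.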
GlusterFS Monitoring with Nagios
- Monitoring of critical entities such as hosts, networking, volumes, clusters and services
- Alerts (email, SNMP) when critical infrastructure components fail and recover
- Record of service outages, events and notifications for later review
http://www.ovirt.org/features/nagios_integration
GlusterFS Replicated Volumes
- Synchronous replication of all directory and file updates
- Provides high availability of data when node failures occur
- Transaction-driven to ensure consistency
- Changelogs maintained for reconciliation
- Used for oVirt Storage Data Domains
[Diagram: Volume1 on host01:/srv/fs1/brick1, host02:/srv/fs1/brick1, host03:/srv/fs1/brick1; reads come from one copy, writes are replicated to all]
Distributed Replicated Volume
- Distribute files across replicated bricks
- The number of bricks must be a multiple of the replica count
- The ordering of bricks in the volume definition matters
- Scaling and high availability
- Reads get load balanced
- Model of deployment for oVirt Storage Domains
[Diagram: Volume1 with brick1, brick2, brick3, brick4, brick5, brick6]
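The ordering matters because GlusterFS groups consecutive bricks of the volume definition into replica sets. A minimal Python sketch of that grouping, with a check that no replica set keeps two copies on the same host; the function names are hypothetical and the brick addresses follow the naming used on the earlier slides.

```python
from typing import List


def replica_sets(bricks: List[str], replica: int) -> List[List[str]]:
    """Group bricks, in definition order, into consecutive replica sets."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def has_distinct_hosts(replica_set: List[str]) -> bool:
    """True when every brick of the set lives on a different host."""
    hosts = [brick.split(":", 1)[0] for brick in replica_set]
    return len(set(hosts)) == len(hosts)


if __name__ == "__main__":
    good = [f"host0{h}:/srv/fs1/brick{b}" for b in (1, 2) for h in (1, 2, 3)]
    bad = [f"host0{h}:/srv/fs1/brick{b}" for h in (1, 2, 3) for b in (1, 2)]
    for name, order in (("good", good), ("bad", bad)):
        sets = replica_sets(order, replica=3)
        print(name, all(has_distinct_hosts(s) for s in sets))
```

The same six bricks listed host by host (the `bad` ordering) produce replica sets that hold two copies on one host, so a single host failure can take out two of the three replicas.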
Bricks Scaling Out Algorithms
- Trivial flat scaling out assumes multiples of 3 hosts
[Diagram: Replica Set 1 and Replica Set 2 over brick1 through brick6]
- Scaling by one host at a time requires a more elaborate algorithm, e.g. (most trivial):
[Diagram: Replica Set 1 and Replica Set 2 over brick1 through brick6]
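One way to picture scaling by a single host is a chained layout, in which every host carries as many bricks as the replica count and each replica set spans a window of neighbouring hosts. The Python sketch below is an assumption made for illustration and is not claimed to be the algorithm drawn on the slide; the function name and the brick paths are hypothetical.

```python
from typing import List


def chained_replica_sets(hosts: List[str], replica: int = 3) -> List[List[str]]:
    """Build one replica set per host, each spanning `replica` neighbouring hosts.

    Set k uses hosts k, k+1, ..., k+replica-1 (wrapping around), so any host
    count >= replica works, not only multiples of the replica count.
    """
    n = len(hosts)
    if n < replica:
        raise ValueError("need at least as many hosts as replicas")
    return [
        [f"{hosts[(k + j) % n]}:/srv/fs1/brick{j + 1}" for j in range(replica)]
        for k in range(n)
    ]


if __name__ == "__main__":
    for replica_set in chained_replica_sets(["host01", "host02", "host03", "host04"]):
        print(replica_set)
```

With this layout a fourth host yields four replica sets of three bricks each. The trade-off is that adding a host changes the sets that wrap around the end of the host list, so bricks have to be replaced or data migrated for those sets.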
Data Locality and Scheduling
- Scheduling a VM on a host that holds a replica of its image improves I/O
[Diagram: Volume1 on host01:/srv/fs1/brick1, host02:/srv/fs1/brick1, host03:/srv/fs1/brick1; reads are local, writes still need to be replicated]
- Intelligent creation of VM disks to group images on the same replica set
- Possible implementation with a GlusterFS policy and deterministic placement based on image names
- Possible issue: unbalanced brick space usage
- Possible future pluggable scheduler for disk images and VMs
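Deterministic placement based on image names can be illustrated with a plain hash. The Python sketch below maps a name to a replica set and lists the hosts that hold a local copy, which a scheduler could then prefer for the VM. It is an illustration only: the SHA-1 based mapping is an assumption and is not the hash GlusterFS itself uses to distribute files, and both function names are hypothetical.

```python
import hashlib
from typing import List


def replica_set_for_image(image_name: str, n_sets: int) -> int:
    """Map an image name to a replica set index, stable across runs and hosts."""
    if n_sets < 1:
        raise ValueError("need at least one replica set")
    digest = hashlib.sha1(image_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_sets


def preferred_hosts(image_name: str, sets: List[List[str]]) -> List[str]:
    """Hosts holding a replica of the image, i.e. candidates for local reads."""
    chosen = sets[replica_set_for_image(image_name, len(sets))]
    return [brick.split(":", 1)[0] for brick in chosen]


if __name__ == "__main__":
    sets = [
        ["host01:/srv/fs1/brick1", "host02:/srv/fs1/brick1", "host03:/srv/fs1/brick1"],
        ["host04:/srv/fs1/brick1", "host05:/srv/fs1/brick1", "host06:/srv/fs1/brick1"],
    ]
    print(preferred_hosts("vm1-disk0", sets))
```

Hashing a shared key (for example a VM name instead of the individual disk name) would keep all disks of one VM on the same replica set, at the cost of the unbalanced space usage noted above.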
In Progress: QEMU libgfapi Support
- GlusterFS exposes APIs for accessing Gluster volumes
- Reduces context switches
[Diagram: FUSE access. Client: QEMU and GlusterFS in userspace, VFS and /dev/fuse in the kernel. Server: Gluster Brick in userspace, File-System in the kernel. Client and server are connected over the network.]
In Progress: QEMU libgfapi Support
- GlusterFS exposes APIs for accessing Gluster volumes
- Reduces context switches
[Diagram: libgfapi access. Client: QEMU in userspace, with the kernel used only for networking. Server: Gluster Brick in userspace, File-System in the kernel. Client and server are connected over the network.]
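With libgfapi, QEMU addresses an image by a `gluster://` URI instead of a path on a FUSE mount. The Python sketch below assembles such a URI in the form `gluster[+transport]://server[:port]/volname/image`; the helper name and the image path in the example are hypothetical.

```python
from typing import Optional


def gluster_drive_uri(
    server: str,
    volume: str,
    image_path: str,
    port: Optional[int] = None,
    transport: Optional[str] = None,
) -> str:
    """Build a gluster:// URI for a QEMU drive backed by a Gluster volume."""
    scheme = "gluster" if transport is None else f"gluster+{transport}"
    netloc = server if port is None else f"{server}:{port}"
    # The image path is relative to the root of the volume.
    return f"{scheme}://{netloc}/{volume}/{image_path.lstrip('/')}"


if __name__ == "__main__":
    print(gluster_drive_uri("host01", "Volume1", "images/vm1.img"))
```

The resulting string would typically be passed as the `file=` value of a QEMU `-drive` option, provided QEMU was built with GlusterFS support.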
Hosted-Engine Support
- Hosted-Engine: ability to run the oVirt Engine as an HA VM on the hosts that are managed by the same Engine
- Supports initial deployment from a single host
- Consume a Gluster Storage Domain for the Engine VM
- Prepare GlusterFS Volumes on the host for hyperconverged deployment
- Proposal: support initial single-host deployment (no GlusterFS Volume replica) and then add additional hosts from the Engine
Current Efforts In Progress
- Usability
  - GlusterFS Storage Domain creation
- Deployment and Monitoring
  - Host Device Management and Monitoring
  - Hosted-Engine support
- Optimizations
  - Data locality and scheduling
  - QEMU libgfapi support
- Best-Practice Documentation (disk/host replacement, etc.)
- Other Miscellaneous Issues
  - Limit heavy data operations on rebalance/reconstruct
  - Power-saving policies and fencing awareness
THANK YOU!
devel@ovirt.org
#ovirt irc.oftc.net
http://www.ovirt.org/features/glusterfs-hyperconvergence