Implementing Enterprise Disk Arrays Using Open Source Software Marc Smith Mott Community College - Flint, MI Merit Member Conference 2012
Mott Community College (MCC)
Mott Community College is a mid-sized community college located in Flint, Michigan.
o ~800 faculty and staff
o ~12,000 students
o ~3,500 computers (academic + administrative)
Who Am I?
Marc Smith is the Manager of Computing Services at MCC. He received a B.S. in computer science from The University of Michigan-Flint. Prior to his current management role, he served as a systems engineer. He has extensive experience in Linux system administration, virtualization (server and desktop), and Storage Area Network (SAN) technologies.
History
o November 2010: Learned about a new VMware View feature that allows replicas to live on a different datastore, preferably SSD-backed, to boost VDI performance.
o December 2010: The cost of commercial SSD-backed disk arrays drove us to look at other options. Discovered an open source project called SCST.
o January 2011: Implemented a Dell R710 with (6) SATA SSDs, a PERC H700 controller, (2) Fibre Channel HBAs, and Gentoo Linux + SCST as a pilot disk array for VDI.
o May 2011: The SCST disk array pilot was so successful that we decided to implement two additional (24) slot SCST-based disk arrays to put all VDI VMs on SSD-based storage.
o December 2011: With the VDI implementation growing rapidly, plans to add additional SSD disk arrays are in the works...
New Solution
SCST + Gentoo Linux disk arrays were working great, albeit with a few issues:
o Management - All CLI / configuration file driven, with no UI for provisioning storage. Lack of other personnel @ MCC with the skill set required to maintain the arrays.
o Updates - No good/easy method for controlled updates (`emerge --sync`) and no simple solution for rolling back.
o OS Disk - Wasting (2) precious slots in the disk array chassis for the boot / OS volume (RAID1).
Other options?
o Openfiler - "Unified Storage" open source software solution; limited block-level (via SCST) storage support, much more focused on NAS (e.g., CIFS, NFS, etc.).
ESOS Is Born!
We didn't have the tool for the job, so we decided to develop one ourselves: ESOS - Enterprise Storage OS
o A quasi Linux distribution that includes all of the software you need to set up a "storage server" (e.g., disk array).
o Includes the Linux kernel (3.x), SCST, glibc, BusyBox, QLogic FC HBA firmware, RAID controller configuration utilities (e.g., MegaCLI), and many more utilities/tools that are needed/useful for a block-level storage solution.
ESOS Platform Features
o ESOS is memory resident -- it boots off a USB flash drive, and everything is loaded into RAM. If the USB flash drive fails, ESOS will send an alert email; you can simply build a new ESOS USB flash drive, replace the failed drive, and sync the configuration.
o Kernel crash dump capture support (sketched below). If the ESOS Linux kernel happens to panic, the system will reboot into a crash dump kernel, capture the /proc/vmcore file to the esos_logs filesystem, and finally reboot back into the production ESOS kernel -- all automatically. ESOS sends an email alert on system start-up and checks for any crash dumps.
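To illustrate that crash-capture sequence, here is a minimal kexec-tools-style sketch; the kernel image names, boot parameters, and mount point are assumptions, not the actual ESOS scripts:

```
# Sketch of a kdump-style capture flow; kernel names, flags, and
# paths are hypothetical, and the production kernel is assumed to
# boot with a crashkernel= memory reservation.

# At start-up, pre-load a capture kernel so a panic boots into it:
kexec -p /boot/vmlinuz-crash --initrd=/boot/initramfs-crash.img \
    --append="root=/dev/ram0 irqpoll maxcpus=1 reset_devices"

# In the capture kernel, save the dump and return to production:
if [ -e /proc/vmcore ]; then
    cp /proc/vmcore /mnt/esos_logs/vmcore-$(date +%Y%m%d-%H%M%S)
    reboot
fi
```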
ESOS Platform Features (cont'd)
o Two operating modes: Production (default) & Debug. In "Production" mode, the performance build of SCST (make 2perf) is used. If you hit a problem and aren't getting sufficient diagnostic logs, simply reboot into "Debug" mode (the full SCST debug build, make 2debug) for additional log data. See the sketch below.
o Enterprise RAID controller CLI configuration tools. Popular RAID controller CLI tools are included by default with ESOS (e.g., LSI MegaRAID, Adaptec AACRAID, etc.), which allows configuration (add/delete/modify) of volumes / logical drives from a running ESOS system.
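For context, the two modes correspond to SCST's own build targets; a minimal sketch of producing each flavor from the SCST source tree:

```
# Performance build (what ESOS "Production" mode uses):
cd scst
make 2perf && make && make install

# Full debug build (ESOS "Debug" mode) for verbose diagnostics:
make 2debug && make && make install
```

Likewise, the bundled RAID CLI tools operate on the live system; for example, `MegaCli -LDInfo -Lall -aALL` lists the logical drives on all LSI adapters.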
Storage Provisioning Features
o Per-initiator device visibility management (LUN masking).
o Thin provisioning.
o Implicit asymmetric logical unit access (ALUA).
o SCSI persistent reservations.
o Data deduplication (via lessfs).
o Several different I/O modes for virtual SCSI devices, including the ability to take advantage of the Linux page cache (vdisk_fileio), or to share an ISO image file as a virtual CDROM device (vcdrom) on your SAN. A sample configuration follows this list.
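To make these features concrete, here is a minimal sketch in the scstadmin configuration format (the device names, file paths, and WWNs are invented): one vdisk_fileio device and one vcdrom ISO exported over a QLogic FC target, with LUN masking via an initiator group.

```
# Hypothetical /etc/scst.conf fragment; names, paths, and WWNs are
# illustrative only.

HANDLER vdisk_fileio {
    DEVICE disk01 {
        # File-backed virtual disk; I/O goes through the Linux page cache
        filename /mnt/vol1/disk01.img
    }
}

HANDLER vcdrom {
    DEVICE cdrom01 {
        # ISO image shared as a virtual CDROM device
        filename /mnt/vol1/install.iso
    }
}

TARGET_DRIVER qla2x00t {
    TARGET 50:01:43:80:06:3f:74:a8 {
        enabled 1

        GROUP esx_hosts {
            LUN 0 disk01
            LUN 1 cdrom01

            # LUN masking: only this initiator sees the LUNs above
            INITIATOR 21:00:00:24:ff:31:22:10
        }
    }
}
```

A configuration like this is applied with `scstadmin -config /etc/scst.conf`.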
Supported Hardware
ESOS should be compatible with any popular enterprise RAID controller and Tier-1 server hardware. It currently supports the following front-end target types:
o Fibre Channel -- QLogic FC HBAs only; those are the only FC adapters that SCST has target drivers for.
o iSCSI (see the example after this list)
o SCSI RDMA Protocol (SRP, over InfiniBand)
o Coming soon: Fibre Channel over Ethernet (FCoE) -- already implemented in SCST, but still relatively new.
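For comparison with the Fibre Channel group shown earlier, an iSCSI target is declared in the same scstadmin configuration format; the IQN below is made up:

```
# Hypothetical iSCSI target definition (same scstadmin format):
TARGET_DRIVER iscsi {
    enabled 1

    TARGET iqn.2012-05.edu.mcc:esos.array1 {
        enabled 1
        LUN 0 disk01
    }
}
```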
Cons
Still being developed; while the core of ESOS (SCST) is mature and stable, the whole package of ESOS (Linux kernel, SCST, user-land software, firmware, CLI tools, etc.) is new.
o That being said, we (MCC) have been using Gentoo + SCST for our production VDI datastores for 1.5 years, and have been using ESOS on development VDI datastores for the last several months. We intend to start using ESOS in production this summer.
No high availability / replication.
o We currently use SCST/ESOS only with our VDI environment; we use floating pools spread across several of these "storage servers", so losing an entire storage server only decreases the number of available machines.
Planned Features / Additions
A text-based user interface (TUI) that will provide easy-to-use, convenient storage provisioning functions.
o Nearly all core SCST functionality will be implemented in the TUI, along with local LSI MegaRAID volume configuration and typical system setup tasks (network configuration, mail, accounts, etc.).
o See the following slides for preview screen shots.
High availability / replication:
o Likely done via DRBD (block device mirroring) + ALUA (to control the I/O path between primary/secondary nodes) + the Linux-HA project; a sketch follows.
o The idea still needs to be tested and validated.
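As a rough illustration of the DRBD piece of that idea (untested, per the slide; hostnames, devices, and addresses are invented), a resource mirroring one backing volume between two ESOS nodes might look like:

```
# Hypothetical DRBD resource; SCST would export /dev/drbd0, with
# ALUA + Linux-HA steering initiators to the primary node.
resource r0 {
    protocol C;                # synchronous replication
    on esos1 {
        device    /dev/drbd0;
        disk      /dev/sdb;    # local RAID volume
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on esos2 {
        device    /dev/drbd0;
        disk      /dev/sdb;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
```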
Planned Features / Additions (cont'd)
VMware vStorage APIs for Array Integration (VAAI):
o These SCSI primitives are being implemented in SCST; WRITE SAME is already done, and the rest are coming.
TUI Preview Screen Shots (1)
TUI Preview Screen Shots (2)
TUI Preview Screen Shots (3)
Our Current Configuration
Each storage server (disk array) base configuration:
o SuperMicro 2U 24x2.5in bay chassis w/900W redundant power supplies
o (2) Intel Xeon Processor E5645 (12M cache, 2.40 GHz, 5.86 GT/s Intel QPI) (5520 chipset motherboard)
o 12 GB 1333MHz DDR3 ECC memory (6 GB usable with mirroring mode)
o (1) LSI MegaRAID 9280-24i4e SATA/SAS 6Gb/s PCIe 2.0 w/ 512MB (with FastPath)
o (2) QLogic 8Gb single port Fibre Channel HBAs
We currently own (3) of these chassis (one development, two production). We have requisitioned an additional two that we expect to arrive soon.
~$6,000 per 24 slot server/chassis as described above.
Our Current Configuration (cont'd)
(1) global hot spare in each chassis; (4) RAID 5 volumes with five disks each; (1) RAID 5 volume with three disks.
Performance (direct I/O, `fio` tool, sustained I/O per volume; example commands below):
o 4K Random Read IOPS: ~84K
o 4K Random Write IOPS: ~15K
o 4K Mixed Random RW (75/25): ~27K / ~9K
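For reference, numbers like these can be reproduced with `fio` runs along the following lines; the device path, queue depth, and job count are guesses, not the exact commands behind the figures above:

```
# 4K random read, direct I/O, against one RAID 5 volume
# (device path and job/queue-depth values are hypothetical):
fio --name=randread --filename=/dev/sdb --direct=1 --rw=randread \
    --bs=4k --ioengine=libaio --iodepth=32 --numjobs=4 \
    --runtime=60 --time_based --group_reporting

# 75/25 mixed random read/write:
fio --name=randrw --filename=/dev/sdb --direct=1 --rw=randrw \
    --rwmixread=75 --bs=4k --ioengine=libaio --iodepth=32 \
    --numjobs=4 --runtime=60 --time_based --group_reporting
```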
Our Current Configuration (cont'd)
Disks used in the (2) production arrays:
o Crucial RealSSD C300 256GB CTFDDAC256MAG
o Purchase price ~$450 each.
o $450 * 24 = $10,800 per array/server for ~6 TB raw SSD storage.
Disks chosen for the additional (2) new arrays:
o Samsung 830 Series MZ-7PC256B/WW 2.5" 256GB SATA III MLC
o Purchase price ~$275 each.
o $275 * 24 = $6,600 per array/server for ~6 TB raw SSD storage.
Consumer SSD prices have dropped considerably between last year and now.
Questions / Comments http://code.google.com/p/enterprise-storage-os/