MiServer and MiDatabase Service Description

Service Definition

As part of the NextGen Michigan initiative, Information and Technology Services (ITS) is building a private cloud offering to support the University of Michigan. There are two distinct services:

MiServer...

Units no longer need to buy and manage their own servers. They can purchase virtual servers from ITS and greatly reduce the effort and cost of doing their business. Unit technical staff members no longer need to worry about applying software patches and upgrades, because this service provides these capabilities for them. Units can choose from three distinct virtual server options:

- Windows Server: a server that includes a fully managed Windows operating system.
- Linux Server: a server that includes a fully managed Red Hat Linux operating system.
- Core Server: a server with no operating system at all. In special cases where the Windows and Linux servers won't be workable for a unit, Unit staff members can install and manage their own operating system, provided it is supported by VMware. This takes more effort from the Unit staff members, but provides flexibility to support non-standard configurations.

MiDatabase...

Units no longer need to install and manage their own databases. They can purchase databases directly from ITS and greatly reduce the effort and cost of doing their business. Each of these databases runs on top of ITS MiServer. Unit technical staff members no longer need to worry about applying database or operating system patches and tuning, because this service provides these capabilities for them. Units can choose from two distinct options for MiDatabase (Oracle may be available at a future date):

- MS SQL Server: runs on a Windows virtual server.
- MySQL: runs on a Linux virtual server.

Each service also allows Units to specifically configure the amount of hardware (processing, memory and storage) that is dedicated to their servers.
This allows units to select the correct configuration and avoid overspending on hardware for their systems.
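To make the configuration choice concrete, the following is a minimal sketch of checking a requested server size against the limits listed under Service Details in this document (1 to 8 vCPU, 1 to 32 GB RAM, up to 1 TB disk). The ServerRequest class, the validate_request function, and the conversion of 1 TB to 1024 GB are assumptions made for illustration only; they do not describe any ITS portal or API.

```python
from dataclasses import dataclass

# Limits as stated in the Service Details table of this document.
MIN_VCPU, MAX_VCPU = 1, 8
MIN_RAM_GB, MAX_RAM_GB = 1, 32
MAX_DISK_GB = 1024  # "up to 1TB"; treating 1 TB as 1024 GB is an assumption


@dataclass(frozen=True)
class ServerRequest:
    """Hypothetical request record, not an ITS data structure."""
    vcpu: int
    ram_gb: int
    disk_gb: int


def validate_request(req: ServerRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if not MIN_VCPU <= req.vcpu <= MAX_VCPU:
        problems.append(f"vCPU count {req.vcpu} is outside {MIN_VCPU}-{MAX_VCPU}")
    if not MIN_RAM_GB <= req.ram_gb <= MAX_RAM_GB:
        problems.append(f"RAM {req.ram_gb} GB is outside {MIN_RAM_GB}-{MAX_RAM_GB} GB")
    if not 0 < req.disk_gb <= MAX_DISK_GB:
        problems.append(f"disk {req.disk_gb} GB is outside 1-{MAX_DISK_GB} GB")
    return problems


if __name__ == "__main__":
    for request in (ServerRequest(4, 16, 200), ServerRequest(16, 64, 2048)):
        issues = validate_request(request)
        print(request, "OK" if not issues else issues)
```

A check like this could help a Unit confirm that a planned configuration is within the published range before submitting a request, which supports the goal of selecting the correct configuration without overspending.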
Units can also choose the level of backup and disaster recovery support they need for each server. Again, this allows the Unit to target its investments to protect critical data without overpaying to protect non-critical data.

These Managed MiServer and MiDatabase services all include:

- Operating system and database patch management
- Operating system and database upgrades
- Antivirus (Windows only; monitors any stored files for viruses)
- System monitoring and notification (e.g. storage, performance and availability)
- Backup and recovery
- Capacity management (ensuring that appropriate hardware capacity is available)
- Technology life cycle (upgrading or replacing server hardware, software, network, or other components as appropriate)

The initial Cloud Computing services will be on-premise solutions housed and operated in University of Michigan data centers by University of Michigan personnel. Over time, the service will rely more and more on external cloud providers. Eventually the environment will be a hybrid of both on-premise and external infrastructure, allowing consumers to address cost, compliance, and performance requirements.

Intended Consumers

The service will be available for all faculty and staff members as well as for academic and research projects with appropriate funding (i.e. a short code) and spending authority. All user groups can benefit from this service by reducing the overhead costs and administrative burden of running servers. However, users are expected to install applications and other custom software.
Out of Scope:

- Application support (Units are responsible for the installation, maintenance, monitoring and operation of applications on the server)
- Operating system support that is outside of our Managed OS offerings
- Mac OS X or SPARC hardware requirements
- Any operating system that is not supported by VMware

Value Statement

The service will add value for customers by providing a flexible service that scales from a virtual infrastructure to a fully managed operating system or database. Because the service provides different levels of service, customers can choose features and support levels that support their business needs and can potentially reduce their overall operational costs. All services are built on virtual servers, which allows for the best allocation and utilization of system resources. Customers who are running systems on this service will find value in not having to buy and maintain physical hardware. This allows users to greatly reduce the time it takes to acquire, set up and begin using computing resources.

MiServer will provide value by providing on-demand compute capacity to university users. The Core Server will allow consumers to focus on their core business goals by providing infrastructure management and support.

MiServer

1. Increased agility by empowering users to rapidly deploy servers with a self-service portal.
2. Reduced customer overhead by offloading physical (through the P2V migration process) or virtual servers to a shared service provider.
3. Firewall and network (VLAN) management offloaded to a shared service provider.
4. Fulfillment of virtual server requests will be rapid and highly automated.
5. 24-hour support services when a server is down or experiencing system issues.
6. Managed OS includes monitoring and event detection to ensure availability and quality of service for each virtual server.
7. Data protection will be provided through routine backups and tools that allow users to request restoration of individual files; backups will be regularly monitored.

MiDatabase will provide value by assuming the overhead of managing the relational database management system (RDBMS) and underlying infrastructure, thus enabling customers to focus more on their data and on advancing their research, teaching or administration initiatives, and less on technology management.

MiDatabase

1. Database management and support is provided, which reduces complexity and costs for customers.
2. 24-hour support services when a server is down or experiencing system issues.
3. Database deployment will be simple, rapid and highly automated for users.
4. High quality of service and high scalability to support demanding needs.
5. Database performance and event monitoring is built in and centrally managed by the service provider.
6. Database backups will be provided, monitored and tested regularly.
7. The ability to choose between a shared database and a dedicated instance.

Database owners/administrators can expect to be able to create and manage all database objects within their database(s), including user management, security administration and delegation for their database(s).
Management and Governance

IT Service Role | Individual
Service Owner | Chris Wood
Service Manager(s) | Keila Walton

Governance

The Service Manager/Owner is responsible for the MiServer service as a whole, including:

- Incident/Problem Management
- Performance Management (KPIs)
- Change Control Management
- Demand Management
- Customer Management, working in partnership with the Customer Relationship Management capability of ITS
- Service lifecycle management
- Updates and additions to the service definition

MiDatabase Service Manager/Owner: Acts as the interface between the units and governs all aspects of maintaining the MiDatabase service delivery. The Service Manager/Owner is responsible for the following within the service:

- Incident/Problem Management
- Performance Management (KPIs)
- Change Control Management
- Demand Management
- Customer Management, working in partnership with the Customer Relationship Management capability of ITS
- Database lifecycle management
- Updates and additions to the service definition

Strategic governance for the technology used in the service is maintained by the U-M Technical Architecture Committee (TAC). TAC and the Managed Database Service owner will be responsible for governance of the tools and database management systems offered in the service. Further governance and updates will be provided by the U-M NextGen program office.

Service Details

Feature or Capability | Description
MiServer Service | 1-8 vCPU, 1-32 GB RAM, up to 1 TB disk.
Managed OS Service | Pre-configured servers running Microsoft 2008 R2 or RHEL 6 VM template with OS management
Managed DB Service | Pre-configured Microsoft SQL 2012 or MySQL on Managed OS virtual machines

MiServer:

MiServer service includes the following: virtual machine provisioning, network and firewall management, monitoring, access management, and backups/restores.

Virtual machines currently provided with the service: VM container to support any Intel-based x86 or x64 operating system.

Supported virtual machines with migrations: any Intel-based x86 or x64 operating system; the OS must meet migration criteria.

Tools used for support of MiServer service:

- Virtual Machine Provisioning: VMware vCenter
  o Using a standardized operating system template when a VM is requested with an OS. The VM with OS option is only available for the currently supported Windows version.
  o vCenter Orchestration will help automate provisioning tasks when possible.
- Monitoring
  o Infrastructure will be monitored and supported 24/7
    - ESXi hosts are protected by high availability. VMs will be restarted automatically in case of host failures, support staff will be immediately notified and root cause analysis initiated. Customers will experience a brief outage.
    - All underlying physical hardware is monitored and vendor supported 24/7.
    - vCenter and its components are highly available and monitored.
  o Monitoring virtual machine system resources for event detection
    - CPU Ready times are monitored to alert on possible resource contention. More resources or a rebalancing of resources would be triggered on a sustained trend.
    - Each VM's total disk latency will be monitored for high trends.
  o Service level performance and monitoring available to end users
    - Access to vCenter provides virtual server owners/admins access to all available performance counters and alerts.
- Access Management: Microsoft Active Directory
  o Managed access control for administrators and end users
    - Active Directory group policy for configuration
- Backup and restores
  o System and file level backups
  o Restore at system and file level
  o Restore for site and geographical disaster recovery
- VM Retirement
  o Through a service request, customers will have the ability to request VM retirement.

MiDatabase:

MiDatabase Service provides users with an optimized database environment built on MiServer. Users can select from Managed OS or Non-Managed OS service.
Tools used to support MiDatabase Microsoft SQL Service:

- Microsoft SQL Server management tools
  o Microsoft SQL Management Studio
  o Microsoft SQL Profiler
- Patch Management: vCenter Protect (formerly Shavlik 8.0)
  o Patching for critical and non-critical bugs and security fixes for Microsoft SQL Server
  o Agentless and scheduled patch management for Microsoft SQL Server
  o Centralized patch management for Microsoft SQL Server
- SQL Monitoring: Idera and SCOM
  o Monitoring SQL Server for event detection
  o SQL service performance and monitoring available to users
- Backup and restores: Microsoft SQL Server database backup
  o Database and transactional level backup and restore utilizing Microsoft SQL backup
  o Database site disaster recovery is strategically aligned with Core Service and Managed OS disaster recovery strategies

Tools used to support MiDatabase MySQL Service:

- MySQL database management tools provided to customers
  o MySQL Client Command Line Tool
  o MySQL GUI Tool
  o phpMyAdmin (cosign protected)
- Patch Management
  o Quarterly patching for critical and non-critical bugs and security fixes for MySQL
  o Centralized patch management for MySQL database servers
- SQL Monitoring: Nagios and Splunk
  o Monitoring MySQL server for event detection through Splunk
  o MySQL availability monitoring through Nagios
- Backup and restores
  o Database and transactional level backup and restore utilizing native MySQL backup
  o Database site disaster recovery is strategically aligned with Core Service and Managed OS
- Database retirement
  o Individual managed database retirement is available through user service request

Service Availability

Service Hours

Service Expectations

MiServer and MiDatabase will be available 24x7 with the exception of any unexpected outage or system maintenance.

Planned Maintenance

Planned maintenance will be announced and communicated 30 days prior to any system outage and is anticipated to be used 2 to 3 times per year during the maintenance window described below. ITS will announce all disruptive system maintenance changes in advance through the ITS Service Status page at: http://status.its.umich.edu/.

The unified maintenance window for the NextGen cloud service occurs every Saturday 11:00 p.m. to Sunday 7:00 a.m. All changes that require downtime must be scheduled during the unified maintenance window. All other changes will be scheduled and will follow the change management process.

* Critical security exploit patches and failed patches can be completed at any time with appropriate emergency change management processes and appropriate communication prior to completion.

Emergency Maintenance

Emergency maintenance is any maintenance that needs to be implemented immediately to prevent a service outage.
ITS will use the standard communication plan prior to any emergency maintenance to alert impacted customers. The communication plan will include system notifications and updates to system status Web pages. Customers who are part of miserver.notify@umich.edu and midatabase.notify@umich.edu will receive notification of emergency maintenance.
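As an illustration of the unified maintenance window described under Planned Maintenance (Saturday 11:00 p.m. through Sunday 7:00 a.m.), the sketch below checks whether a proposed change time falls inside that window. The function name is hypothetical, and two details are assumptions not stated in this document: times are taken to be local campus time, and the window is treated as closing exactly at 7:00 a.m. (so 7:00 a.m. itself is outside it).

```python
from datetime import datetime

# datetime.weekday() numbers Monday as 0, so Saturday is 5 and Sunday is 6.
SATURDAY, SUNDAY = 5, 6
WINDOW_START_HOUR = 23  # Saturday 11:00 p.m.
WINDOW_END_HOUR = 7     # Sunday 7:00 a.m. (assumed exclusive)


def in_maintenance_window(local_time: datetime) -> bool:
    """Return True if local_time is within Saturday 23:00 to Sunday 07:00."""
    day, hour = local_time.weekday(), local_time.hour
    if day == SATURDAY:
        return hour >= WINDOW_START_HOUR
    if day == SUNDAY:
        return hour < WINDOW_END_HOUR
    return False


if __name__ == "__main__":
    # 29 June 2013 was a Saturday; 30 June 2013 was a Sunday.
    for moment in (
        datetime(2013, 6, 29, 23, 30),
        datetime(2013, 6, 30, 6, 59),
        datetime(2013, 6, 30, 7, 0),
    ):
        print(moment, in_maintenance_window(moment))
```

A Unit could use a check of this kind when planning application restarts around scheduled maintenance. Emergency maintenance and critical security patches are outside its scope, since the document allows those at any time.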
Service Support

Requesting Support

User requests for support regarding ITS services are processed through the ITS Service Center. To contact the Service Center:

- Submit a Service Request Online (login required)
- Call 734-764-HELP (764-4357)
- E-mail 4HELP@umich.edu

Support Hours

ITS Service Center hours are:

- Monday to Friday: 7:00 a.m. to 6:00 p.m.
- Sunday: 1:00 p.m. to 5:00 p.m. (e-mail only)

Types of Support

- Self-service provisioning of MiServer and MiDatabase via the service portal
- Monitoring and growing the MiServer system disk to allow the server to continue operating
  o Customer will be alerted to this and charged appropriately
- Monitoring and growing the MiDatabase system disk and database to allow the database and server to continue operating
  o Customer will be alerted to this and charged appropriately
- Notifying customers prior to any disruptive maintenance. Notification will be posted to the status page (http://status.its.umich.edu/)
- Monitoring the system environment and reacting to alerts
- Purchasing and replacing hardware/software that is provided by ITS
- Upgrading hardware/software that is provided by ITS
- Testing and troubleshooting hardware/software that is provided by ITS

Self-Service Support

Customers will have access to online help that can generate a service incident to ITS. MiServer and MiDatabase also provide step-by-step getting started instructions as well as FAQs.

Incidents and Outages

Priority: Critical
  Description: Incidents are classified as critical priority when there is a major, immediate risk to the university's ability to conduct its mission, because of disruption to users' ability to perform a function related to that mission.
  Target to Restore Services: 4 hours
  Example: Service is unavailable that supports an entire school/college/unit's core business system(s).

Priority: High
  Description: Incidents are classified as high priority when there is an elevated risk to the university's ability to conduct its mission, because of disruption to users' ability to perform a function related to that mission.
  Target to Restore Services: 1 day
  Example: Service is intermittently available and it supports an entire school/college/unit's core business system(s).

Priority: Medium
  Description: Incidents are classified as medium priority when users' ability to perform a function is impaired, and a risk to the university's ability to conduct its mission is present, but the university can manage around that risk over a short period of time.
  Target to Restore Services: 5 days
  Example: Service is functioning; however, there are issues/errors that are slowing core business system(s).

Priority: Low
  Description: Incidents are classified as low priority when users' ability to perform a function is impaired, but there is minimal risk to the university's ability to perform its mission.
  Target to Restore Services: 10 days
  Example: Service enhancement or issues with workaround

Disaster Recovery and Backup & Restoration

What is a disaster?

A disaster is declared when a catastrophic event occurs that prevents use of a data center and/or a significant portion of its computing equipment. A data center disaster is not declared for issues with individual virtual servers.

Disaster Recovery

Two key metrics are used to measure the response to a disaster:

- Recovery Time Objective (RTO): The amount of time that elapses from the point of the disaster until the service is restored at an agreed upon level. For example, with an RTO of 48-120 hours we would bring all systems back online within 4 days depending on the extent of the disaster.
- Recovery Point Objective (RPO): How much work (measured in time) will be lost in the case of a disaster. For example, with an RPO of 1 hour, no more than 1 hour of data would be lost.

The two options are Async and No Sync. When you select Async, all data on your server is replicated to a secondary data center every hour. With No Sync, your server is not replicated to a secondary data center. Replication is critical when there is a data center loss or in the case of DRBC.

Replication Type | Recovery Time Objective | Recovery Point Objective
Async | 48-120 hours | 1 hour
No Sync | NA | NA

MiDatabase Recovery Options

Units that purchase a Shared MiDatabase will have Async as their recovery option, as they are in a shared instance with other customers. Units that purchase a Dedicated Instance can choose between Async and No Sync. The configuration option selected will determine what replication can be used, and therefore what RPO and RTO are available for the system. The two configuration options are:

Server Type | Replication Type
Shared | Async
Dedicated | Async or No Sync

Backup and Restoration

Data Retention

MiServer/MiDatabase data retention is as follows:

- Daily snapshots at noon and 4:00 p.m., retained for 3 days
- Nightly snapshots at midnight every day, retained for 7 days
- Weekly snapshots every Sunday morning at midnight, retained for 8 weeks
- In addition to snapshots, a disaster recovery backup is created nightly and stored offsite

Customer Responsibilities

Roles and Responsibilities

Customers should not remove any agents included in the service on a server subscribed to the managed feature of MiServer or MiDatabase. Changes to the included agents will impact the ability of ITS to meet our SDE. Doing so may result in a degradation or loss of service for the customer and prevent ITS from properly managing their server/database. Agents included on a server subscribed to the managed feature of MiServer include, but are not limited to:

- Anti-Virus (Windows only)
- Monitoring
- Patching
- Backup

As a customer of MiServer and MiDatabase, there are certain roles and responsibilities you will need to perform. Below are the lists of customer responsibilities for each of the services.

Customers of MiServer and MiDatabase are responsible for the following:

- Determining the level of urgency if a problem is discovered that requires assistance from ITS.
- Paying all charges associated with services rendered and adhering to all funding restrictions for the source of funds used to pay for this service.
- Ensuring that data management, security, and compliance policies are followed appropriately to comply with university policies and with state and federal laws and regulations.
- Requesting a server to be retired when services are no longer needed/used.
- If needed, shutting down/restarting any application software components during scheduled maintenance.
- Communicating and planning for any network, firewall, and front-end application server changes that could affect access to the server (or database).
- Requesting any changes to the server (or database) to handle system growth.
- Subscribing to the miserver/database e-mail group to receive system notifications (miserver.notify@umich.edu or midatabase.notify@umich.edu).

Customers of MiServer are responsible for the following:

- Installing, configuring, patching, and maintaining all application components. Customers will have system access to perform all actions required of them.
- Testing, troubleshooting, and resolving application problems that may result from updates, patches, and configurations to the operating system.
- Restoring and verifying all application components in the event a server has to be restored from backup or rebuilt to restore service for the customer.
- If the customer is not using a managed operating system provided by ITS, the customer is also responsible for performing all traditional system administration activities such as operating system administration, application administration, and monitoring/troubleshooting.
- Managing the authorization groups for server administration.
- Selecting a patch schedule from a predefined list for each of their servers that is subscribed to the Managed service for MiServer.
- Customers who require a reboot to address application issues have an option to select a reboot schedule from a predefined list through the service. This option is only available for the Managed Windows service for MiServer.
- Customers need to put their servers in maintenance mode when doing application maintenance****
Customers of MiDatabase are responsible for the following:

- Testing, troubleshooting and resolving problems resulting from updates, patches and configurations to the database server.
- Managing the authorization and data access of databases. (MiDatabase provides a framework for role-based authorization; however, its implementation within each database is the responsibility of the customer.)

Service Performance

Service Metrics & Reporting

Metric: Service Availability
  Description: Report on availability, which includes availability, recoverability and any metrics that pertain to this agreement around availability as developed by the Service Manager.
  Expectation: Overall Availability: 99%
  How Measured:
    - Number of incidents (service)
    - Number of incidents caused by server failures
    - Number of incidents caused by other areas outside of Cloud Services
    - Number of service requests fulfilled per user/group
    - Response times for each service request
    - Number of successful/failed changes
  How Reported: Monthly report provided to service customers via e-mail. Not yet available

Metric: Service Responsiveness
  Description: [Describe the metric(s) you will use to measure Service Responsiveness and Performance. For example, how long does it take for an application to process, a workstation to boot up, etc.]
  Not yet available

Metric: Request Fulfillment
  Description: Provisioning time for the service takes 1-2 business days

Metric: Incident Resolution
  Description: [Describe the metric(s) you will use to measure Incident Resolution. For example, number of incidents by priority, average time to resolve, etc.]
  Not yet available

Metric: Customer Satisfaction
  Description: [Describe the metric(s) you will use to measure Customer Satisfaction. For example, data from the bi-annual B&F Customer Satisfaction survey or data from other service specific surveys.]
  Not yet available

Responses to Missed Service Expectations

Not yet available

Changes and Enhancements

A weekly change report with infrastructure dependencies is produced when maintenance is performed. Two reports should be published. Before maintenance, a report on all upcoming changes should be published to service users so they can review it for any issues or dependencies that might impact their service or potential maintenance on their service. After all changes are completed, a report should be published to identify all changes that were successfully completed. The reports can be published to a central website or distributed by e-mail.

Document Review & Approval

Milestone | Reviewed by | Date
Initial Draft (Created by Service Owner) | |
QA Review | Lynne Ertel | 6/28/13
SPO Review | Bill Wrobleski | 6/28/13