The Remote Infrastructure Management Platform




What is the Remote Infrastructure Management (RIM) Platform?

As part of our Global Services Operating Architecture (GSOA), the RIM platform is used to support our processes and people in delivering services to your business. The platform provides proactive and predictive monitoring through a combination of leading vendor technologies for discovery and monitoring, event management, reporting and configuration management.

High-level architecture: how the RIM platform helps us deliver services

The RIM platform has been architected in a modular way to support any future toolsets required to manage technologies that are currently unsupported by Dimension Data. It is split into three levels, all on virtual machines for easy deployment and upgrade:

Client domain manager (CDM): the toolset accessing your devices via the demilitarised zone (DMZ) and tagging all information with an identifier. This can be housed on-site or on your network if necessary.

Client aggregation layer (CAL): the domain layer accessed in the Global Service Centre (GSC) for cross-client, cross-service service aggregation.

Client display layer (CDL): this is integrated into the IT Service Management (ITSM) platform for event management and allows access from anywhere (by our other GSCs or directly by you, for example) to the events, reports and configurations of the devices under management.

The figure below shows the three layers of the RIM platform:

The CDL is the service portal and integration layer. It is shared by multiple clients.

The CAL is the aggregation layer for information from managed environments. It is shared by multiple clients.

The CDM is the collection layer, responsible for monitoring the managed environment and collecting all required information. It is also the conduit into your network. There is one CDM per client (Client A, Client B, Client C and Client D in the figure).

The core focus of the RIM platform is to automate as far as possible the following processes:

Detection: extensive use of root cause analysis, threshold analysis and trending helps automate the diagnosis of outages.

Remediation: access to the device from the portal is automated for remediation, and supports pre-emptive remediation through script automation. To further automate remediation and provide a seamless remediation process, we've also deployed EMC Network Configuration Manager (NCM), formerly known as Voyence, and integrated it into our identity and access management, which is accessible through the ITSM platform.

Management: ongoing capacity management, configuration audits, end-of-life and end-of-sale audits, etc. are automated through the toolset, ensuring proactive management for business continuity and capacity management purposes.

Our unlimited enterprise level agreement with key vendors of the solution allows us to monitor previously unmanaged devices, as well as have more flexibility for client-side root cause analysis.

The events collected by our RIM platform are transferred to the ITSM platform through a middleware orchestration engine (BizTalk).
This orchestration engine ensures communication services between ITSM and RIM, as well as a number of scalability functions including:

Event storm management: the purpose of this capability is to ensure that our ITSM platform, and therefore our operators, are protected from an influx of events from a single source, so that another client environment has no effect on the service delivered to you. When an event storm is detected, all associated events are automatically discarded and summarised into an event announcing the beginning and the end of the storm, and the number of events that were generated as part of the storm.

Deployment automation: we've focused on automating the deployment of the devices to be monitored to make the process less time-consuming. We've developed the capability to perform pre-deployment checks, as well as post-deployment checks that ensure that all devices in the list of managed devices are properly configured with the correct IP addresses, community strings, passwords, device names, etc., and are correctly deployed in all of the RIM components in which they need to be deployed (Smarts, NCM and Watch4Net). This automation means devices are deployed at the click of a button.

Scheduled outages: this gives you the ability to define your scheduled outages online for a list of affected configuration items. From there, the information on the scheduled outage is communicated to our toolsets so that events are suppressed during an outage, performance reports take the outage into consideration, and our operators are informed of the outage when it happens. This shortens processing time on our side, but also ensures a better service experience for you, with a reduced risk of overlooking an unrelated issue during an outage.

Event management to service request (incident, change) automation: our GSOA also allows the complete integration of the RIM practices into our ITSM delivery processes. Since all relevant events are transferred to the ITSM platform, and processed from the ITSM platform, the entire ITSM lifecycle is seamless. The full history, relationships and evolutions are accessible to you through the services portal.

Multiple source event system integration: our RIM platform allows us to seamlessly integrate the various components of the EMC Smarts suite. Through the EMC Notif interface, we're also in a position to integrate events into our RIM platform that come from client-specific or technology-specific environments. We normalise those events, enrich them with the required GSOA information (for example, client identification) and pass them on to the ITSM platform for processing alongside the events generated by the core platform capability. This gives us great flexibility and the ability to rapidly integrate with client- or technology-specific toolsets while enabling exactly the same level of service.

The figure below depicts this architecture and some of the levels of automation delivered:

[Figure labels: BizTalk / ITSM; CDL SAM (Smarts Assurance Manager); CAL SAM (Smarts Assurance Manager); IDM; CDMs; global and regional storm prevention logic; scheduled outages and storm prevention logic; topology; authorised deployment; acknowledgements; view symptoms; APG alerting; Voyence application; other third-party NMS.]
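The event storm management behaviour described above (discard the individual events of a storm and replace them with one summary giving its beginning, end and event count) can be illustrated with a minimal Python sketch. It is not the BizTalk or EMC Smarts implementation. The `Event` fields, the `threshold` and `gap` parameters, and the rule that a storm is a run of closely spaced events from one source are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    source: str       # device or client environment that emitted the event
    timestamp: float  # seconds since an arbitrary start point
    message: str


def _flush(burst, threshold, forwarded, summaries):
    """Summarise a burst if it is large enough to be a storm, else forward it."""
    if len(burst) > threshold:
        summaries.append({
            "source": burst[0].source,
            "start": burst[0].timestamp,
            "end": burst[-1].timestamp,
            "count": len(burst),
        })
    else:
        forwarded.extend(burst)


def summarise_storms(events, threshold=5, gap=10.0):
    """Split events into those forwarded as-is and per-storm summaries.

    Assumption: a burst is a run of events from one source in which
    consecutive events are at most `gap` seconds apart; a burst with more
    than `threshold` events is treated as a storm.
    """
    by_source = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.timestamp):
        by_source[ev.source].append(ev)

    forwarded, summaries = [], []
    for evs in by_source.values():
        burst = [evs[0]]
        for ev in evs[1:]:
            if ev.timestamp - burst[-1].timestamp <= gap:
                burst.append(ev)
            else:
                _flush(burst, threshold, forwarded, summaries)
                burst = [ev]
        _flush(burst, threshold, forwarded, summaries)

    forwarded.sort(key=lambda e: e.timestamp)
    return forwarded, summaries
```

With these assumed defaults, ten events one second apart from one source collapse into a single summary record, while two isolated events from another source are forwarded unchanged. A production system would work on a live stream rather than a batch, but the batch form keeps the summarisation rule easy to see.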

The underlying components of the RIM platform:

EMC Smarts

A core data collection, root cause analysis and event management system. EMC Smarts includes the following core elements:

The Notification Manager (present at the CDM and IDM layers) is a rules-based filter engine that processes SNMP traps, syslog messages and other event formats in order to eliminate event storms and to facilitate integration with external client-owned element management platforms.

The IP Availability and Performance Manager (present at the CDM layer) polls configuration items for availability and performance metrics, then performs intelligent root cause analysis on detected problems. The result is a root cause event created in the Service Assurance Manager for further action.

The VoIP Availability Manager (present at the CDM layer) takes network topology feeds from the IP Availability Manager and enriches this topology with voice-over-IP configuration item topology. It also facilitates the processing of voice-over-IP events for cross-domain root cause analysis, in order to create an appropriate root cause event in the Service Assurance Manager.

The EMC Server Manager (present at the CDM layer) polls virtualisation hypervisors, physical servers and enhanced server platforms for hardware, operating system and application process status and performance metrics. It integrates via open APIs with hypervisor managers such as VMware vCenter Server and extracts server configuration item topology. This aids cross-domain root cause analysis of network and server events.

The Service Assurance Manager (present at the IDM, CAL and CDL layers) is the manager-of-managers for all topology and event data and plays a different role at each RIM layer.

[Figure labels: CDL SAM; CAL SAM; CDMs; topology.]

At the CAL layer, it aggregates all topology and root cause events from all CDMs for a given region.
All data that originates on the CDM is tagged with your RIM tag for filtering and segregation.

[Figure labels: CAL SAMs; BizTalk / ITSM; CDL SAM; IDM; topology; acknowledgements; view symptoms.]

At the CDL layer, all events that need to go into ITSM are forwarded from the CAL layer. These events are then filtered for root cause and forwarded to ITSM for event management and incident management.

At the IDM layer, events from client element management systems are processed, normalised and tagged for forwarding to the CDL layer.
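To make the idea of topology-aware root cause analysis more concrete, here is a minimal Python sketch of a neighbour-corroboration rule: an unresponsive device is only confirmed as a root cause when reachable neighbours report their links to it as down. This is a deliberately simplified stand-in, not the EMC Smarts codebook algorithm. The function name, the result labels and the data structures are hypothetical.

```python
def classify_outage(device, neighbours, reachable, link_down):
    """Classify one device using what its known neighbours report.

    neighbours: dict mapping a device to the set of devices known to
                connect to it (the discovered topology)
    reachable:  set of devices that answered the latest poll
    link_down:  set of (neighbour, device) pairs where the neighbour
                reports its interface towards the device as down
    All names and result labels are illustrative assumptions.
    """
    if device in reachable:
        return "up"

    # Neighbours that are themselves reachable can act as witnesses.
    witnesses = {n for n in neighbours.get(device, set()) if n in reachable}
    if not witnesses:
        # Nothing in the known topology can corroborate the outage.
        return "unconfirmed"

    if all((n, device) in link_down for n in witnesses):
        # Every reachable neighbour sees its link to the device down.
        return "root cause"

    # Some neighbour still sees the device, so the fault may be elsewhere.
    return "inconclusive"


if __name__ == "__main__":
    topology = {"router1": {"router2", "router3"}}
    up = {"router2", "router3"}
    down_links = {("router2", "router1"), ("router3", "router1")}
    print(classify_outage("router1", topology, up, down_links))
    print(classify_outage("router1", {}, up, down_links))
```

With an empty topology the same outage can only be reported as unconfirmed, which is why the amount of discovered estate matters for the quality of the correlation.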

EMC Network Configuration Manager (formerly Voyence)

A primary network configuration management engine for network device configuration file collection and configuration item provisioning.

The device server (present at the CDM layer) uses standard network configuration protocols (SSH, Telnet, SNMP, ICMP, TFTP, SCP, etc.) to log into network devices and collect single or multiple configuration files from them for backup purposes. The device server is triggered by a change trap received from the network device in question. It can also be used to push out configuration changes to individual network devices or groups of them.

The database server (present at the CAL layer) is the central data store for all data.

The application server (present at the CAL layer) is accessed via a web services API from the CDL layer to provide clients with pass-through access to network devices using the EMC Network Configuration Manager interface, as opposed to the device's native CLI. This enables audit logging and tracking of user activity when changes are made to network devices. The application server also hosts the configuration policy engine, which can be configured to perform configuration compliance audits on your network devices. However, this has to be defined in the service description for the services product that's offering this capability.

EMC Watch4Net (formerly APG)

The primary component for availability, capacity and performance monitoring and reporting.

Collectors (present at the CDM layer) extract performance data from the EMC Smarts IP Availability and Performance Manager for aggregation into a central performance management database. They also extract event data from the EMC Smarts Service Assurance Manager. Technology-specific collectors can also be deployed for more detailed analysis of performance and capacity metrics, but this has to be defined in the associated service description for the services product that's offering these capabilities.
The database and back-end server (present at the CAL layer) normalises and stores all collected metrics.

The outage manager (present at the CAL layer) allows planned outages for configuration items to be configured in EMC Watch4Net so that they are factored into configuration item availability calculations.

The alerting manager (present at the CAL layer) manages performance and capacity thresholds and generates traps and notifications when threshold conditions are met. These traps are forwarded to the respective EMC Smarts Notification Manager for further processing.

The portal (present at the CAL layer) provides access to all data stored in the EMC Watch4Net database and back-end server. ITSM uses the web services API to access the relevant dashboards and reports on the EMC Watch4Net portal.

Microsoft BizTalk Server

BizTalk is the primary information integration layer between all the components of GSOA. It facilitates all communications between RIM components and between RIM and ITSM in order to form an open architecture for service operations globally.

Some important facts and statistics about our RIM platform:

Locations: 300+
RIM assets: 100,000+
Active users: 20,000+
Tasks per month: 400,000 (events: 300,000; incidents: 60,000; requests: 40,000)
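The outage manager's role of factoring planned outages into availability calculations can be illustrated with a small worked example. The source does not state the formula EMC Watch4Net uses, so the Python sketch below assumes one common convention: planned outage time is removed from the measured period, and downtime that falls inside a planned window is not counted. The function names and the interval representation are hypothetical.

```python
def overlap(a, b):
    """Length of the overlap between two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def availability(period, downtimes, planned):
    """Percentage availability over `period`, excluding planned outages.

    period:    (start, end) of the reporting period
    downtimes: list of (start, end) intervals when the item was down
    planned:   list of (start, end) planned outage windows
    Assumption: intervals within each list do not overlap one another.
    """
    start, end = period
    planned_time = sum(overlap(period, p) for p in planned)
    measured = (end - start) - planned_time
    if measured <= 0:
        # The whole period was a planned outage, so nothing counts against it.
        return 100.0

    unplanned_down = 0.0
    for d in downtimes:
        clipped = (max(d[0], start), min(d[1], end))
        down = overlap(period, d)
        excused = sum(overlap(clipped, p) for p in planned)
        unplanned_down += max(0.0, down - excused)

    return 100.0 * (1.0 - unplanned_down / measured)


if __name__ == "__main__":
    # 100 time units, two 10-unit outages, one of them inside a planned window.
    print(availability((0, 100), [(10, 20), (50, 60)], [(45, 65)]))  # 87.5
    print(availability((0, 100), [(10, 20), (50, 60)], []))          # 80.0
```

In this example the planned window raises reported availability from 80.0% to 87.5%, because the second outage is excused and the 20 planned units are removed from the measured period.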

Frequently asked questions:

How is the platform secured when it's connected to my business environment?

The CDM has an internal and an external interface to overcome overlapping IP space issues between clients, specifically in the private RFC 1918 address spaces.

[Figure labels: GSOA (IAM, ITSM, Direct); internal interface (CDM network); external interface (customer VLAN); client; internal routes (public IP space); default route (public IP space).]

Identity and access management provides role-based user provisioning and access with single sign-on. It uses a reverse proxy for transparent connectivity to other toolsets. Authentication is provided using RSA Authentication Manager. All end-user interaction goes through identity and access management.

How is data sovereignty ensured?

Dimension Data has a registered business in each country we trade in. The contractual terms and conditions of our services contract are bound and governed by the applicable laws of that country. The primary concern with data traversing country boundaries relates to the changing data protection laws across borders, which means that data may be subject to less robust law and regulation, putting data owners' privacy at risk.

Our GSOA architecture includes a combination of onshore and offshore components to help address your needs. The RIM component is an onshore platform for localised data collection, processing and analysis. The ITSM component is an offshore software-as-a-service platform (ServiceNow.com), hosted in a highly secure data centre in London, UK. All your asset, contract management, services aggregation, portal and reporting services are delivered using this platform. All RIM instances communicate securely with ITSM via the Microsoft BizTalk integration platform, and client data is tagged and segregated by the respective RIM platforms prior to updating ITSM. Dimension Data's GSOA architecture doesn't collect or store any business data in ITSM, only the contact, contract and IT asset information needed to deliver our services.
In countries where data sovereignty regulations are very strict, we're able to deploy an onshore GSOA architecture with localised RIM and ITSM instances. However, this is not our preferred option as it doesn't benefit our international clients and makes delivering a unified service experience hard to achieve.

How do you handle redundancy/disaster recovery?

RIM is deployed as a fully virtualised environment running on the latest versions of VMware. For hardware fault tolerance, we use VMware vMotion, and for centralised fault-tolerant storage, we use EMC SANs. Our CDMs are built from a standard virtual machine template, which makes it quick and easy to commission a new client. All servers are built on Red Hat Enterprise Linux for homogeneous support. Communication between RIM back-ends and ITSM, identity and access management and BizTalk uses Dimension Data's global wide area network, which is fully redundant and highly secure. The CDMs connect directly to a client-specific VLAN which connects to an MPLS VPN or, on rare occasions, a leased line.

Why is full estate discovery/monitoring important?

EMC Smarts is a core component of the RIM platform. It is an intelligent root cause analytics system that uses its understanding of the connectivity and dependencies between IT assets to populate a codebook with symptomatic analysis capabilities.

[Figure: the CDM connected to Router 1, Router 2 and Router 3.]

What does this mean?

In the example above, the CDM connects via an access router or VPN concentrator to your network. It uses this link to process traps and poll the IT estate for status, availability and performance metrics.

If the CDM only has knowledge of Router 1, as that is the only asset to manage as per the services contract, then when Router 1 stops responding EMC Smarts will automatically assume Router 1 is down and won't be able to confirm whether this is the real root cause, because it has no knowledge of the other devices connected to Router 1. This may lead to false positives and an incorrect analysis of the problem. We may even receive traps from other devices connected to Router 1 but, without knowing the relationships, finding the root cause becomes a manual exercise which can sometimes be time-consuming.

However, if EMC Smarts were able to discover Routers 1, 2 and 3 and populate the codebook accordingly, its true intelligence would come into play because it would know about the neighbouring devices. Once Router 1 becomes unresponsive, it polls Routers 2 and 3 to confirm they're reachable and that their interfaces to Router 1 are down, which makes it a safe assumption that Router 1 is down. In this case, only the root cause event is sent to ITSM for diagnosis and action. The more we manage, the more effective the correlation can be.

CS / DDMS-1412 / 10/13
Copyright Dimension Data 2013
For further information visit: www.dimensiondata.com