Advanced Diagnostic/Prognostic Solutions for Information Technology (IT) UPS & Power Supply Systems


Application Note AN107

Overview

In today's business networks, continuous operation of network devices is imperative to the operation of the business, and in many cases the network is used to increase business revenues. This does not affect only large companies; even the smallest companies use computers to manage their operations efficiently and to store information. When the network goes down, operations sit idle and additional labor is required, which translates into unnecessary cost.

One of the most critical components of a network is an uninterruptible power supply (UPS), which is responsible for providing clean, uninterrupted power to the connected devices. During grid power loss the UPS is expected to provide sufficient power until grid power returns or until the attached devices can be safely powered down. By definition, a UPS failure comes at a bad time, and it almost always arrives sooner than the specification would suggest. When grid power is lost and a UPS fails, the results can be catastrophic for an IT network.

UPS devices share failure points with other power systems, including batteries, capacitors, MOSFETs, and DC/AC inverter stages. The ability to predict the remaining useful life (RUL) of a UPS will improve the operational availability (OA) and reliability of the network by reducing unplanned network downtime. This Application Note describes an innovative approach to prognostics-enabling UPS devices through the existing UPS software and through hardware additions.

Limitations of Existing UPS and Power Supply Devices

Starting with the UPS hardware, the built-in self-test (BIST) of the device usually does not adequately test the internal battery system, which degrades through daily operation.
The BIST gauges UPS performance and reports it with varying levels of qualitative output. Network operators can view this output and refer to the user guide for repair and replacement schedules. While this indicates a decline in UPS performance, it does not map directly to a percentage of degradation and, more importantly, does not indicate how the degradation affects UPS performance under grid power loss.

The UPS software built into a network management card includes a web service that reports an available battery time value for use when grid power is lost. This number should account for the power being drawn from the UPS output and the discharge rate of the internal battery system. However, the value is likely a best-case figure, or ceiling, for the time remaining. Under real grid power loss it should not be trusted; it gives the network operator only an optimistic indication of whether the network's most critical devices can be safely powered through the UPS.

UPS devices employ design approaches similar to those of switch-mode power supplies (SMPS), especially for AC-to-DC power conversion, and therefore share common fault modes. SMPSs, which are often not network-enabled, account for a significant share of the DC power supplies on the market today because of their high efficiency and relatively low cost. SMPSs have well-known fault modes that degrade the output voltage response of the power supply. This degrades the overall health of the power supply and ultimately reduces its RUL.

Copyright 2012 RIDGETOP GROUP INC. All Rights Reserved. www.ridgetopgroup.com

Other fault modes affect the MOSFET power switches, resulting in an increase in the effective on-resistance, and affect the MOSFET gate drivers, resulting in a change to the switching duty cycle.

Prognostics Health Management Tool

Ridgetop developed the Sentinel Network architecture as a comprehensive prognostics health management (PHM) solution for IT network systems. The distributed architecture is predicated on an extensible software platform that distributes sensor data collection and performs data fusion, reasoning, and presentation tasks, as shown in Figure 1. An array of HealthView sensors publishes device and health data through well-known UDP ports. Companion services that have subscribed to the sensor data store it in a central PHM database while monitoring for anomalous health conditions. A suite of diagnostic and prognostic reasoners processes the multivariate sensor data to isolate the root cause of a fault condition and to estimate the RUL of the devices being monitored. More information about Sentinel Network is available in Ridgetop's Application Note AN106.

Figure 1: Sentinel Network architecture

Improving Health Prediction for Network-Enabled UPS Systems

UPS devices designed, built, and deployed for IT networks are network-enabled through a management card. This management card provides remote access to the device through a built-in web service once it is properly configured with an IP address, gateway, and DNS address. The web service provides event data, configuration data, current device status, system status, device information, notifications, and so on. Some monitoring and alert capabilities are also included. For example, a notification can be sent when battery capacity falls below a configured threshold, or the UPS can be powered on or off at scheduled intervals. The web service provides the most common view of UPS device data on the network.
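The publish, store, and monitor flow described for Sentinel Network can be illustrated with a minimal sketch. It is not Ridgetop's implementation: the JSON message layout, the `readings` table, the field names, and the capacity limit are all assumptions made for illustration, and an ephemeral loopback port stands in for the well-known UDP ports, whose numbers the note does not give.

```python
import json
import socket
import sqlite3


def encode_reading(device, metric, value):
    """Pack one sensor reading into a UDP payload (hypothetical JSON layout)."""
    return json.dumps({"device": device, "metric": metric, "value": value}).encode("utf-8")


def decode_reading(datagram):
    """Unpack a payload produced by encode_reading."""
    return json.loads(datagram.decode("utf-8"))


def open_db():
    """Create an in-memory stand-in for the central PHM database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (device TEXT, metric TEXT, value REAL)")
    return db


def store_reading(db, reading, low_limit):
    """Store a reading and report whether it falls below the alert limit."""
    db.execute(
        "INSERT INTO readings VALUES (?, ?, ?)",
        (reading["device"], reading["metric"], reading["value"]),
    )
    db.commit()
    return reading["value"] < low_limit


def loopback_demo():
    """Publish one reading over UDP on localhost and receive it again."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))  # ephemeral port; a real deployment would use a fixed one
    rx.settimeout(2.0)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        tx.sendto(encode_reading("ups-1", "battery_capacity_pct", 42.0), rx.getsockname())
        datagram, _ = rx.recvfrom(4096)
    finally:
        tx.close()
        rx.close()
    return decode_reading(datagram)


if __name__ == "__main__":
    db = open_db()
    reading = loopback_demo()
    if store_reading(db, reading, low_limit=50.0):
        print("anomalous reading:", reading)
```

Keeping the encode, store, and alert steps as separate functions mirrors the split between sensors and companion services in the text, so each piece can be exercised without a network.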

The design most commonly employed by commercial-grade UPS manufacturers is the online double-conversion UPS system. A functional block diagram of such a system is shown in Figure 2.

Figure 2: Functional block diagram of UPS system

An online double-conversion UPS system continuously double-converts the line input while grid power is present. Any line input dropout causes the rectifier to drop out of the circuit immediately, while the batteries keep the DC power to the output inverter steady and unchanged. When grid power is restored, the rectifier resumes safely recharging the battery system. This system acts as the equivalent of an electrical firewall and battery backup for attached equipment.

A slight variation on the online double-conversion UPS system (also shown in Figure 2) provides a DC output instead of, or in addition to, the AC output. The equipment attached to a UPS often requires DC power, and by providing this DC power through a voltage regulator without further conversion, this UPS system increases the efficiency and run-time of the overall system.

Data are collected from the UPS device through the Simple Network Management Protocol (SNMP) and stored in a database. Data collection includes but is not limited to device link status, network configuration data, system status, and power system parameters such as input AC/DC power conversion parameters, output DC/AC power conversion parameters, and battery system parameters. Sensor readings from the UPS are collected and processed, producing an intermediate value that feeds the RUL algorithm.

The Adaptive Remaining Useful Life Estimator (ARULE) is a data-driven prognostics approach that relies on a graphical model of the fault-to-failure progression (FFP) signature. RUL estimation for a UPS device starts with defining the model space, in particular its floor, its ceiling, and the amplitudes and widths of its data spaces, as shown in Figure 3.
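One way to picture such a model space is as a small data structure. The sketch below is an illustration only, not ARULE's internal representation: the class names, field names, and numeric values are hypothetical. It stacks the data spaces between the floor and the ceiling and sums their widths, the quantity the note equates with the RUL estimate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSpace:
    """One box of the model: an amplitude band and the time expected to cross it."""
    amp_low: float
    amp_high: float
    width: float  # in whatever time unit the model uses, e.g. hours


@dataclass
class ModelSpace:
    """Floor (no fault), ceiling (failure), and the data spaces between them."""
    floor: float
    ceiling: float
    spaces: List[DataSpace]

    def is_consistent(self) -> bool:
        """Check that the boxes stack without gaps from floor to ceiling."""
        edge = self.floor
        for space in self.spaces:
            if space.amp_low != edge or space.amp_high <= space.amp_low or space.width <= 0:
                return False
            edge = space.amp_high
        return edge == self.ceiling

    def total_width(self) -> float:
        """Sum of the box widths."""
        return sum(space.width for space in self.spaces)


# Hypothetical default model for one UPS type; every number is a placeholder.
default_model = ModelSpace(
    floor=0.1,
    ceiling=0.9,
    spaces=[
        DataSpace(0.1, 0.4, 500.0),
        DataSpace(0.4, 0.7, 300.0),
        DataSpace(0.7, 0.9, 100.0),
    ],
)

if __name__ == "__main__":
    print(default_model.is_consistent(), default_model.total_width())
```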
With the graphical model defined, ARULE produces an RUL estimate equal to the sum of the widths of the boxes, recalculated for each new input. Only condition-based data with an amplitude above the floor threshold (no fault) and below the ceiling threshold (failure) will change the widths of the data spaces. For each new data point within this region, the width of a data space either increases or decreases, and the RUL increases or decreases accordingly. With ideal data there is no correction to the model; real data, however, cause the model to change, and with each new data point the changes to the model produce an RUL adjustment.

Within each data space a function defines the shape of the FFP in that region. Each new input to ARULE can potentially modify the width of the data space. The width increases or decreases only when successive differences between the data input and the position expected from the function defined in the data space exceed the tolerance. In other words, the model and the computation of the RUL require momentum in the data in one direction or the other before deviating significantly from the data-space model. A unique default model is defined for each UPS type.

Figure 3: UPS remaining useful life data spaces

Example A: Application of Standard SNMP-enabled UPS System

The U.S. Navy is concerned with network reliability and robust performance while operating in remote environments. An afloat network includes routers, switches, servers, and client nodes of varying type and function. The mission-critical network components are powered through a UPS device. Ships at sea are not likely to carry UPS spares, so it is necessary to provide ample warning when condition-based maintenance (CBM) of the UPS system is needed, in order to avoid potentially dangerous scenarios involving loss of ship and backup power.

The Lockheed Martin Technology Collaboration Center-West (TCC-W) provides a testbed for new and innovative network technology to be transitioned for use in a naval ship. The center provides network equipment replicating that found aboard certain classes of ships, and this equipment can be configured into various network topologies. Ridgetop constructed a representative two-ship network topology that included two routers, three switches, four servers (including Sentinel Network), four UPSs, and six laptops. Clary Corporation supplied the CMN-2400D-PD UPSs for the network racks. These devices are designed and built to support heavy power loads from the attached equipment under extreme tactical conditions; however, they are still susceptible to early failure.

Sentinel Network was used to automatically discover and monitor, in real time, all assets on the network. For UPS devices, additional state of health (SoH) and RUL calculations and presentation of results are also available. Each of the four UPSs had a unique power load applied to its output.
Four different model files were used to estimate RUL from the intermediate SoH calculation. One of the four UPSs had a dynamic, time-varying load applied to its output. Sentinel Network demonstrated both increases and decreases in health, along with correspondingly increased and decreased RUL estimates based on the degradation occurring in the IT system, as shown in Figure 4.
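The way an RUL estimate can move in both directions, and only after the data show momentum, can be sketched for a single data space. This is a simplified illustration, not the ARULE algorithm: the linear FFP shape, the tolerance, the three-sample momentum rule, and the 10% width step are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSpace:
    """One data space whose width adapts to sustained deviations (illustrative)."""
    amp_start: float
    amp_end: float          # assumed greater than amp_start (amplitude grows with damage)
    width: float            # expected time to cross the space
    tolerance: float = 0.02 # deviations within this band are ignored
    momentum: int = 3       # consecutive same-direction deviations required
    step: float = 0.1       # fractional width change per adjustment
    streak: int = 0         # signed count of consecutive out-of-tolerance inputs

    def expected(self, t: float) -> float:
        """Amplitude the linear FFP shape predicts at elapsed time t."""
        frac = min(max(t / self.width, 0.0), 1.0)
        return self.amp_start + frac * (self.amp_end - self.amp_start)

    def update(self, t: float, amplitude: float) -> float:
        """Process one input and return the remaining time in this space."""
        diff = amplitude - self.expected(t)
        if abs(diff) <= self.tolerance:
            self.streak = 0
        else:
            sign = 1 if diff > 0 else -1
            # Extend the streak if the direction matches, otherwise restart it.
            self.streak = self.streak + sign if self.streak * sign > 0 else sign
            if abs(self.streak) >= self.momentum:
                # Ahead of the model means faster degradation, so shrink the width.
                self.width *= (1.0 - self.step) if sign > 0 else (1.0 + self.step)
                self.streak = 0
        return max(self.width - t, 0.0)


if __name__ == "__main__":
    space = AdaptiveSpace(amp_start=0.1, amp_end=0.9, width=100.0)
    for t in range(10, 16):
        # Inputs that run consistently ahead of the model shorten the estimate.
        print(t, round(space.update(float(t), space.expected(t) + 0.1), 2))
```

A single outlier resets nothing permanent: the width changes only after `momentum` consecutive deviations in the same direction, which is the behavior the note describes as requiring momentum in the data.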

Figure 4: Sentinel Network primary monitoring view

Improving Early Detection of Power System Failure by Adding the Sample Mode Response Technique (SMRT) Frequency Probe

The health prediction for a power system is based on an analysis of the fault-to-failure progression (FFP) signature of the device being monitored on the IT network. The frequency and precision of the sampled data greatly affect the accuracy of the health estimate. Adaptations to the original hardware sensors of a power supply system may therefore be needed to increase the reliability of the RUL estimate. Ridgetop can assist with that determination.

Ridgetop has prognostics-enabled a wide variety of commercial off-the-shelf (COTS) power supplies. The RingDown approach was developed to assess a power supply's SoH. This method analyzes the output current or voltage signature in response to a load change. The damped ringing response of the power supply can be characterized through Ridgetop algorithm processing to show the onset of power system degradation. More information about RingDown is available in Ridgetop's Application Note AN101.

Building on the foundations of RingDown, a more advanced prognostic approach was developed that deploys the Sample Mode Response Technique (SMRT) frequency probe (SMRT Probe) as an addition to existing power supply hardware circuitry. The probe collects electrical characteristics of the system and also makes the power system network-enabled.

The SMRT Probe is a minimally invasive method of inducing and capturing a damped ringing frequency response from a device or assembly. The captured response is digitized, and the frequency of the ringing response is calculated and compared against the nominal frequency of a non-degraded (undamaged) device or assembly. The difference between the calculated frequency and the nominal frequency is used to determine the degree of damage (SoH) and the RUL of the device or assembly.
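The comparison of a measured ringing frequency against a nominal one can be sketched as follows. This is not the SMRT Probe's processing chain: the zero-crossing estimator is one simple technique swapped in for illustration, the waveform is synthesized, and the mapping from frequency shift to SoH (including the 20% shift treated as failure) is a hypothetical placeholder.

```python
import math


def damped_ringing(freq_hz, damping, sample_rate_hz, n_samples):
    """Synthesize a damped sinusoid standing in for a digitized ringing response."""
    return [
        math.exp(-damping * i / sample_rate_hz)
        * math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(n_samples)
    ]


def ringing_frequency(samples, sample_rate_hz):
    """Estimate the ringing frequency from interpolated rising zero crossings."""
    crossings = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0.0 <= cur:
            # Linear interpolation locates the crossing between the two samples.
            frac = -prev / (cur - prev)
            crossings.append((i - 1 + frac) / sample_rate_hz)
    if len(crossings) < 2:
        raise ValueError("need at least two rising zero crossings")
    return (len(crossings) - 1) / (crossings[-1] - crossings[0])


def state_of_health(measured_hz, nominal_hz, failure_shift=0.2):
    """Map the relative frequency shift to a 0..1 SoH (hypothetical linear mapping)."""
    shift = abs(measured_hz - nominal_hz) / nominal_hz
    return max(0.0, 1.0 - shift / failure_shift)


if __name__ == "__main__":
    nominal = 1000.0
    captured = damped_ringing(freq_hz=950.0, damping=200.0,
                              sample_rate_hz=100_000.0, n_samples=2000)
    measured = ringing_frequency(captured, 100_000.0)
    print(round(measured, 1), round(state_of_health(measured, nominal), 2))
```

Zero crossings are used here because damping changes the amplitude of the ringing but not the spacing of its crossings, so the estimate stays stable as the response decays.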

Example B: Application of SMRT Probe to Commercial Power Supply

Ridgetop has designed the SMRT Probe to be used with non-SNMP-enabled power supply systems. The SMRT Probe interface bridge provides a simple, easy-to-apply method of supporting real-time monitoring of UPS assets on critical IT networks; that is, it network-enables non-networked power systems. This PHM-enablement method is extensible to non-IT assets such as line replaceable units (LRUs) and industrial automation assets found on the factory floor. With PHM, the ultimate goal is to assess the SoH of power supply systems in support of CBM strategies.

An example prognostics-enabled power supply testbed with the SMRT Probe includes an LRU capable of providing raw sensor data along with calculated prognostic measurements. The power supply provides a regulated 3.3 V DC output that is intercepted by a sensor consisting of a noise filter/DC decoupler and a signal conditioner (digitizer). The LRU is shown in Figure 5.

Figure 5: Power supply testbed with SMRT frequency probe

The front-end software of the SMRT Probe performs a digital signal processing (DSP) analysis of the captured frequency response, producing a prognostic signature. The ARULE algorithm is then applied to the normalized frequency data. It maintains a rolling average of six data inputs, captured at 1-second intervals, and produces an updated real-time RUL estimate for each new data input that is processed.

The graphical display shown in Figure 6 illustrates a comprehensive engineering view of the power supply as it degrades with time. This system-level view incorporates monitoring of the power supply and SMRT Probe signature extraction. Software calculates the RUL with the ARULE algorithm.
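The six-input rolling average mentioned above can be sketched with a fixed-length queue. The class name and the sample values are hypothetical; only the window size of six comes from the text, and how ARULE consumes the smoothed value is not shown.

```python
from collections import deque


class RollingMean:
    """Rolling average over the most recent inputs (six by default)."""

    def __init__(self, size=6):
        # A deque with maxlen silently drops the oldest value when full.
        self._window = deque(maxlen=size)

    def add(self, value):
        """Add one input and return the mean of the current window."""
        self._window.append(value)
        return sum(self._window) / len(self._window)


if __name__ == "__main__":
    smoother = RollingMean()
    # One normalized frequency reading per second (placeholder values).
    for reading in [1.00, 0.99, 0.99, 0.98, 0.97, 0.97, 0.95]:
        print(round(smoother.add(reading), 4))
```

Until six inputs have arrived, the mean is taken over however many are available, which avoids a start-up gap in the estimate; whether the real system behaves this way is an assumption.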

Figure 6: Engineering view for both UPS and prognostics-enabled power supply

Return on Investment (ROI)

Prognostics-enabling critical UPS systems yields a high return on investment (ROI). This section addresses two typical examples. The ROI for implementing PHM on critical UPS systems is up to $135,000 per year.

In a data center, high-capacity UPS devices are deployed for reliability. These units typically have a high initial cost and a high repair cost. To estimate the ROI, the costs have to be analyzed. The ROI for a failed UPS in a small to medium-sized data center considers the following cost estimates:

- Cost of new UPS procurement: ~$10,000
- Cost of installation during network outage: ~$5,000, which includes:
  o Expedited product handling and shipping
  o Network expert consultant(s)
  o Time required to get systems back online
- Lost revenue from unscheduled network downtime (2 hours): ~$40,000

Assumptions for this scenario are as follows:

- Grid power may be lost at any time
- A UPS failure that leaves less time than needed to safely power down attached equipment occurs about three times per year on average
- Only a single UPS will be replaced per incident

To calculate the ROI of an electronic prognostics implementation for this scenario, the following equation is used:

ROI = (Cost of Outage without PHM) - (Cost of Implementing PHM)

The cost of an outage without PHM for the scenario described is approximately $55,000 for a single failure ($10,000 + $5,000 + $40,000). Two examples of prognostic implementations are described next.

Example A: Network-enabled UPS

This example considers the use of a UPS that is network-enabled. Sentinel Network is deployed on the network as another node that discovers, monitors, and collects UPS device information.

The cost of prognostics-enabling a UPS over the network is estimated to be on the order of $30,000 and includes:

- Sentinel Network Prognostics Analysis Platform License
- ARULE Remaining Useful Life Algorithm License
- Full Ridgetop Application Support

The ROI calculation follows:

ROI = (Cost of Outage without PHM) - (Cost of Implementing PHM)
ROI = (3 x $55,000) - $30,000
ROI = $135,000

In terms of percentage, the ROI calculation is as follows:

ROI = (Cost of Outage without PHM - Cost of Implementing PHM) / (Cost of Implementing PHM)
ROI = ($165,000 - $30,000) / $30,000
ROI = 450%

Example B: Non-Network-enabled UPS

This example considers the use of a UPS that is not network-enabled. A SMRT Probe is added to the UPS device. In addition, Sentinel Network is deployed on the network as another node that discovers, monitors, and collects UPS device information.
The cost of prognostics-enabling a non-network UPS over the network is estimated to be on the order of $40,000 and includes:

- Sentinel Network Prognostics Analysis Platform License
- ARULE Remaining Useful Life Algorithm License
- Ridgetop UPS Hardware Interface (SMRT Probe)
- Full Ridgetop Application Support

The ROI calculation follows:

ROI = (Cost of Outage without PHM) - (Cost of Implementing PHM)
ROI = (3 x $55,000) - $40,000
ROI = $125,000

In terms of percentage, the ROI calculation is as follows:

ROI = (Cost of Outage without PHM - Cost of Implementing PHM) / (Cost of Implementing PHM)
ROI = ($165,000 - $40,000) / $40,000
ROI = 312.5%

Conclusion

This application note presents two solutions for real-time monitoring of UPSs and power systems, which are critical components of IT networks. Because they are among the leading causes of IT network outages, these systems can benefit from both the software and the hardware prognostic approaches: the addition of a SMRT Probe to existing power system hardware, and ARULE as a data-driven prognostics algorithm that estimates RUL from FFP signature data. The calculated return on investment ranges from about 312% to 450%.

Contact Ridgetop Group to apply our proven PHM technologies to your application needs.

Ridgetop Group Inc.
3580 West Ina Road
Tucson, Arizona 85741 USA
Telephone: +1 520.742.3300
Email: info@ridgetopgroup.com

Application Note AN107, Rev 100312. Copyright 2012 Ridgetop Group Inc. All rights reserved.
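As a check on the ROI arithmetic, the note's formula can be evaluated with its stated inputs: $10,000 + $5,000 + $40,000 per failure, three failures per year, and implementation costs of $30,000 and $40,000. The function and variable names below are illustrative and not part of any Ridgetop tool.

```python
def roi_dollars(outage_cost, failures_per_year, phm_cost):
    """ROI = (cost of outages without PHM) - (cost of implementing PHM)."""
    return outage_cost * failures_per_year - phm_cost


def roi_percent(outage_cost, failures_per_year, phm_cost):
    """ROI as a percentage of the PHM implementation cost."""
    return 100.0 * roi_dollars(outage_cost, failures_per_year, phm_cost) / phm_cost


# Cost of one failure: procurement + installation + lost revenue.
OUTAGE_COST = 10_000 + 5_000 + 40_000   # 55,000
FAILURES_PER_YEAR = 3

if __name__ == "__main__":
    for label, phm_cost in (("Example A", 30_000), ("Example B", 40_000)):
        dollars = roi_dollars(OUTAGE_COST, FAILURES_PER_YEAR, phm_cost)
        percent = roi_percent(OUTAGE_COST, FAILURES_PER_YEAR, phm_cost)
        print(f"{label}: ${dollars:,} ({percent:.1f}%)")
    # Example A: $135,000 (450.0%)
    # Example B: $125,000 (312.5%)
```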