Better Integration of Systems Management Hardware with Linux




Better Integration of Systems Management Hardware with Linux
LinuxCon North America, Aug 2014
Charles Rose, Engineer, Dell Inc.

Agenda
- Introduction
  - Systems Management Hardware/Software
  - Information Available to the Service Processor
- The Need for Better Integration
  - Integration of the Service Processor with Linux
  - Managing Servers In-band and Out-of-band
- Current State
  - IPMI
  - Exchange of Information between OS and Service Processor
  - System Recovery/Debug
  - SNMP Redirection
  - USB NIC Pass-through
  - Server Health
- Future Features
  - OS Event Logging in Service Processor
  - Aid with Diagnostic/Debugging
  - Automatic Configuration of Console Redirection

Introduction

Systems Management Hardware/Software
Systems management hardware on server systems:
- Helps manage, monitor, update and deploy servers.
- Provides remote management and configuration options.
- Is independent of the presence and status of the operating system.
- Is referred to as the Service Processor or Baseboard Management Controller (BMC).
Interfaces/APIs: IPMI, CIM, WSMAN, SSH, SNMP, Telnet, VNC, Web UI

Information Available in the Service Processor
- Server hardware: CPU, RAM, storage/RAID controller, NIC, Converged Network Adapter/Fibre Channel
- Server firmware: BIOS, Service Processor, NIC, storage controller
- Server software: NIC IP, drivers

The Need for Better Integration

Integration of the Service Processor with Linux
Servers can be managed:
- Out-of-band: over the systems management interface (IPMI, CIM, SNMP).
- In-band: over the OS's network interface (SNMP, CIM, etc.).
Managing in-band or out-of-band should not result in loss of information or functionality:
- OS information should be available in the Service Processor.
- Service Processor information should be available in the OS.
- Eliminate the need for any proprietary agents on the OS.
- Utilize the OS to Service Processor pass-through network: LAN on Motherboard or virtual USB NIC.
- Security considerations.
[Diagram: Operating System, Server Hardware and Service Processor, with the in-band and out-of-band paths marked]

Managing Servers In-band and Out-of-band
[Diagram: a Management Console connected to several Managed Servers, each made up of Operating System, Server Hardware and Service Processor, over in-band and out-of-band paths]

Current Status

IPMI
IPMI kernel module autoload:
- Older systems required OpenIPMI's startup script to load the IPMI kernel modules.
- Kernel 3.10 and later autoload the IPMI modules: ipmi_devintf, ipmi_si, ipmi_msghandler.
- This simplifies IPMI's use in installation/live CD environments.
- ipmi_watchdog does not yet load automatically.
- TODO: autoload ipmi_watchdog.
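One way to see whether autoloading happened on a given machine is to look for the module names in /proc/modules. The following is a minimal sketch, not part of the talk; the helper names are hypothetical, and modules compiled into the kernel will not appear in /proc/modules at all.

```python
# Sketch: report which IPMI modules appear in /proc/modules-style text.
# The function names here are illustrative, not taken from the slides.
IPMI_MODULES = ("ipmi_devintf", "ipmi_si", "ipmi_msghandler", "ipmi_watchdog")


def loaded_ipmi_modules(proc_modules_text):
    """Map each IPMI module name to True if it is listed as loaded."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return {name: name in loaded for name in IPMI_MODULES}


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux host, `loaded_ipmi_modules(read_proc_modules())` would be expected to show ipmi_watchdog as not loaded unless it was loaded by hand, matching the TODO above.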

Exchange Information between OS and Service Processor
- What OS is running on a server? What is the Service Processor's IP/URL?
- OS information is set in the Service Processor: System Host Name, Operating System, Operating System Version.
- The Service Processor's IP/URL is exported to the OS.
- Script: /etc/init.d/exchange-bmc-os-info (ipmitool/contrib)
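The sketch below illustrates the OS-to-Service-Processor direction only. It builds, but does not run, ipmitool command lines. It assumes the `mc setsysinfo` subcommand with the `system_name` and `os_name` arguments as described in the ipmitool man page; that is an assumption to verify against the installed ipmitool version, and it is not necessarily what the exchange-bmc-os-info script does. The function name is hypothetical.

```python
# Sketch: build (not execute) ipmitool commands that would store OS details
# in the BMC. The "mc setsysinfo" arguments are an assumption taken from the
# ipmitool man page; check them against your ipmitool version.
def sysinfo_commands(hostname, os_name):
    """Return ipmitool argument lists for setting host name and OS name."""
    return [
        ["ipmitool", "mc", "setsysinfo", "system_name", hostname],
        ["ipmitool", "mc", "setsysinfo", "os_name", os_name],
    ]
```

A caller could pass `platform.node()` and `platform.system()` from the standard library and hand each list to `subprocess.run` once the commands have been reviewed.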

System Recovery/Debug
- On OS lock-up, capture information that can aid with debugging.
- The Service Processor provides a watchdog timer facility.
- Unlike the chipset watchdog (iTCO), it does more than just reset the system:
  - Records the failure in the System Event Log (SEL).
  - Sends alerts over SNMP/SMS/phone, etc.
  - Captures the VGA output as a JPEG, or captures video.

System Recovery/Debug (continued)
- The IPMI driver has supported detecting and logging kernel panic events for years.
- Linux Watchdog API: ipmi_watchdog.ko provides the /dev/watchdog interface to the Service Processor.
- Watchdog pings are converted to KCS messages to the BMC.
- Traditionally, agents in the OS were required to send KCS messages to the BMC.
- watchdogd or systemd can act as the watchdog daemon in the OS.
- Can co-exist with or supplement kdump/kexec, but requires some guesswork.
- TODO: update ipmi_watchdog.ko to support multi-watchdog.
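To make the watchdog ping mechanism concrete, here is a minimal sketch of what a watchdog daemon does through the Linux watchdog API: any write to the device counts as a keepalive, and writing the character `V` just before closing asks the driver to disarm the timer (the "magic close" convention, which a driver loaded with nowayout set will ignore). The function name and parameters are hypothetical.

```python
import time


def pet_watchdog(device, pings, interval_s, sleep=time.sleep):
    """Write one keepalive per interval, then request a clean disarm.

    `device` is any writable binary file object; in real use it would be
    /dev/watchdog opened unbuffered.
    """
    for _ in range(pings):
        device.write(b"\0")  # any write counts as a keepalive ping
        device.flush()
        sleep(interval_s)
    device.write(b"V")  # magic close: ask the driver to stop the timer
    device.flush()
```

In real use this would look like `with open("/dev/watchdog", "wb", buffering=0) as dev: pet_watchdog(dev, 6, 10)`. Opening the device arms the timer, so if the process dies without the magic close the system will be reset; try it only on a machine where that is acceptable.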

SNMP Redirection
- The Service Processor has exhaustive hardware information.
- The OS contains information for the resources it manages.
- Many management consoles communicate with the OS's SNMP agent.
- Hardware health/inventory information available to the OS is limited and non-exhaustive.
- The Service Processor's OID is grafted into the OS's SNMP MIB.
- Traps from the Service Processor can be configured to reach the network's trap sink.
- Hardware health is now available to the management console.
- Supports SNMP v2 and v3.
[Diagram: the Management Console sends SNMP get/set to the Operating System, which proxies the requests to the Service Processor; traps from the Service Processor are forwarded by the Operating System to the Management Console]

SNMP Redirection Operation
Get/Set:
- Enable SNMP on the Service Processor.
- Proxy get/set SNMP requests to the Service Processor's IP for a subset of OIDs: SNMPv2-SMI::enterprises.674.10892
Trap:
- Enable snmptrapd to accept traps from the Service Processor's IP.
- Forward traps to the sink configured on the host.
- Enable SNMP alerting on the Service Processor.
Script: ipmitool-1.8.15 contrib/bmc-snmp-proxy
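The two halves of the redirection map onto one Net-SNMP directive each: `proxy` in snmpd.conf and `forward` in snmptrapd.conf. The sketch below only renders those lines; it is not the contents of bmc-snmp-proxy. The community string `public`, the SNMP version and the example addresses are assumptions, and the function names are hypothetical.

```python
# SNMPv2-SMI::enterprises is 1.3.6.1.4.1, so enterprises.674.10892 is:
DELL_SUBTREE = ".1.3.6.1.4.1.674.10892"


def snmpd_proxy_line(bmc_ip, community="public", oid=DELL_SUBTREE):
    """Render an snmpd.conf line that proxies one subtree to the BMC.

    The community string and SNMP version are placeholders (assumptions).
    """
    return f"proxy -v 2c -c {community} {bmc_ip} {oid}"


def snmptrapd_forward_line(sink):
    """Render an snmptrapd.conf line that forwards all traps to a sink."""
    return f"forward default {sink}"
```

For example, `snmpd_proxy_line("169.254.0.1")` produces a line that grafts the Service Processor's subtree into the host agent. snmptrapd additionally needs an access-control entry before it will accept and forward traps; that part is left out here.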

USB NIC Pass-Through
- Dedicated channel for communication between the Operating System and the Service Processor.
- Service Processor at 169.254.0.1 (default). Non-routable.
- Automatic configuration with Avahi and nss-mdns or NetworkManager.
- The Service Processor can be reached as idrac.local:
  http://idrac.local
  # ipmitool -I lan -H idrac.local
  # snmpget idrac.local
[Diagram: Operating System connected to the Service Processor through a USB NIC on the Server Hardware]
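The default address sits in the IPv4 link-local range (169.254.0.0/16), which is why it is non-routable. A small sketch using Python's standard `ipaddress` module can confirm this for any configured pass-through address; the function name is hypothetical.

```python
import ipaddress


def is_link_local(address):
    """Return True if the address is in a link-local (non-routable) range."""
    return ipaddress.ip_address(address).is_link_local
```

`is_link_local("169.254.0.1")` is true, so traffic to the Service Processor over the USB NIC stays on the host.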

System Health
- Health of CPU, fan, temperature, voltages, etc. is already available.
- Aggregate the above into a machine-readable System Health value.
- Available in-band and/or out-of-band.
- Can be used by cluster software, virtualization managers and cloud compute managers to make workload migration decisions.
- Available over SNMP or IPMI.
- SNMP redirection can make health available in-band.
[Diagram: Health reported by the Service Processor and made available to the Operating System]

System Health over IPMI and SNMP
IPMI: raw 0x30 0x51
Byte 5: Global and Storage status
- Bit 0 set = Storage status Normal
- Bit 1 set = Storage status Error (non-critical)
- Bit 2 set = Storage status Failed (critical)
- Bit 3 set = Storage status Unknown
- Bit 4 set = Global status Normal
- Bit 5 set = Global status Error (non-critical)
- Bit 6 set = Global status Failed (critical)
- Bit 7 set = Global status Unknown
SNMP: SNMPv2-SMI::enterprises.674.10892.5.2.2.0
- 1: other -- the status is not one of the below.
- 2: unknown -- not known or monitored.
- 3: ok -- the status is ok.
- 4: noncritical -- the status is warning, noncritical.
- 5: critical -- the status is critical (failure).
- 6: nonrecoverable -- the status is nonrecoverable (dead).
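The bit layout and the SNMP enumeration above translate directly into lookup tables. The sketch below decodes an already-extracted byte 5 value; how byte 5 is located in the raw ipmitool output is deliberately left out, since the slide does not say how the response bytes are counted. The names are hypothetical.

```python
# Bit positions within each 4-bit half of byte 5, lowest bit first.
STATES = ("Normal", "Error (non-critical)", "Failed (critical)", "Unknown")

# Values of SNMPv2-SMI::enterprises.674.10892.5.2.2.0 as listed above.
SNMP_STATUS = {
    1: "other",
    2: "unknown",
    3: "ok",
    4: "noncritical",
    5: "critical",
    6: "nonrecoverable",
}


def decode_health_byte(byte5):
    """Decode byte 5: bits 0-3 are storage status, bits 4-7 global status."""
    storage = [STATES[bit] for bit in range(4) if byte5 & (1 << bit)]
    overall = [STATES[bit] for bit in range(4) if byte5 & (1 << (bit + 4))]
    return {"storage": storage, "global": overall}
```

For instance, a value of 0x11 has bits 0 and 4 set, so both storage and global status decode to Normal. Cluster or virtualization software could treat anything other than Normal (or `ok` over SNMP) as a hint to migrate workloads.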

Opportunities

OS Event Logging in Service Processor
Log OS events to the Service Processor to give a better understanding of the host OS:
- OS Started
- OS Stopped
- OS Install Started
- OS Install Stopped
- OS Install Aborted
- OS Install Failed
These are standard IPMI sensor events. Combined with OS name, OS version and power status information, they help administrators and console software determine the server state.
Example: SUSE's YaST2 hooks.

Aid with Debugging
- OS configuration and logs are crucial for debugging.
- Logs might be unavailable if the system has locked up or there was a kernel panic.
- On application/kernel error:
  - Collect relevant configuration and logs.
  - Store them in the Service Processor.
  - They remain accessible out-of-band even with the host OS down.

Automatic Configuration of Console Redirection
- Most headless servers use IPMI Serial Over LAN to access the remote server's console.
- The BIOS contains options to set up redirection to the serial console.
- The administrator has to duplicate the BIOS setup information on the kernel command line: console=ttyS0,115200
- Overhead can be reduced if the kernel can read the BIOS serial port information.
- ACPI already has SPCR (Serial Port Console Redirection).
- Linux support was introduced in 2.4 and removed in 2.5.
- It would be nice to have something similar.
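Until the kernel can pick this up from firmware, a script can at least check what was put on the command line. The following sketch parses `console=` arguments from a kernel command line string such as the contents of /proc/cmdline; the function name is hypothetical.

```python
def console_params(cmdline):
    """Return (device, options) for each console= argument, in order."""
    result = []
    for token in cmdline.split():
        if token.startswith("console="):
            value = token[len("console="):]
            device, _, options = value.partition(",")
            result.append((device, options))
    return result
```

An administrator could compare the result with the serial port and baud rate configured in the BIOS to spot a mismatch before relying on Serial Over LAN.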

References
IPMI on Linux:
- http://openipmi.sourceforge.net/ipmi.pdf
- http://ipmitool.sourceforge.net/
- http://www.gnu.org/software/freeipmi/
Related projects:
- http://www.openlmi.org/
- https://github.com/abrt/abrt/wiki/abrt-project
Scripts:
- Exchange Information: http://sourceforge.net/p/ipmitool/source/ci/master/tree/contrib/exchange-bmc-os-info.init.redhat
- SNMP Redirection: http://sourceforge.net/p/ipmitool/source/ci/master/tree/contrib/bmc-snmp-proxy
- Installer Status Event logging: http://sourceforge.net/p/ipmitool/patches/97/
Fedora Feature Page:
- http://fedoraproject.org/wiki/features/agentfreemanagement
Dell idrac:
- http://en.community.dell.com/techcenter/systems-management/w/wiki/3204.dell-remote-access-controller-drac-idrac.aspx

Thank You!
charles_rose@dell.com
linux-poweredge@dell.com

Backup

Server Block Diagram

Automated System Recovery with Systemd Watchdog Daemon
- Set RuntimeWatchdogSec.
- Set the ipmi_watchdog timeout to the same value.
- Blacklist the chipset watchdog.
- Load ipmi_watchdog.
- Re-execute systemd: systemctl daemon-reexec
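The steps above amount to four small configuration fragments. The sketch below renders them for a chosen timeout without writing anything to disk. The file names under /etc/modprobe.d and /etc/modules-load.d are arbitrary choices, iTCO_wdt is assumed to be the chipset watchdog module in question, and the function name is hypothetical.

```python
def watchdog_config(timeout_s):
    """Return {path: content} for a systemd + ipmi_watchdog setup.

    File names under modprobe.d and modules-load.d are illustrative.
    """
    return {
        # systemd pings /dev/watchdog and programs this timeout
        "/etc/systemd/system.conf":
            f"[Manager]\nRuntimeWatchdogSec={timeout_s}\n",
        # keep the IPMI watchdog timeout equal to the systemd value
        "/etc/modprobe.d/ipmi_watchdog.conf":
            f"options ipmi_watchdog timeout={timeout_s}\n",
        # stop the chipset watchdog (assumed iTCO_wdt) from claiming the device
        "/etc/modprobe.d/blacklist-itco.conf":
            "blacklist iTCO_wdt\n",
        # load ipmi_watchdog at boot, since it is not autoloaded
        "/etc/modules-load.d/ipmi_watchdog.conf":
            "ipmi_watchdog\n",
    }
```

After reviewing and installing the fragments, running `systemctl daemon-reexec` makes systemd pick up the new RuntimeWatchdogSec value.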