GPFS and Remote Shell

Yuri Volobuev, GPFS Development
Ver. 1.1, January 2015

Abstract

The use of a remote shell command (e.g. ssh) by GPFS is one of the most frequently misunderstood aspects of GPFS administration, to the point where it can become a barrier to GPFS adoption. There is much confusion around this topic. Is remote shell access the basic cost of GPFS access? Does GPFS have a hard dependency on SSH? Does GPFS require root-level passwordless SSH access between all nodes? My corporate IT security policy stipulates that PermitRootLogin must be set to "no", so this means I can't run GPFS, right? The short answer to all of these questions is "no"; the long answer requires some explaining.

Background

Being a file system, GPFS needs elevated privileges to operate. A portion of GPFS code runs in kernel space, and thus has the same level of access to the system as the OS kernel (i.e. the highest level of access possible). The userspace part of GPFS code also needs to perform many operations that require elevated privileges: communicating with its kernel counterpart, loading and unloading kernel modules, mounting and unmounting file systems, modifying system configuration files (e.g. /etc/fstab and entries under /dev), accessing raw disk devices, and so on. On a standard AIX or Linux install, this requires root-level access. This is a hard requirement that cannot be easily changed. So GPFS administration commands, generally known as mm commands, require root privileges to run, with a few exceptions.

A critical point that must be appreciated is that the core GPFS design is based on the trusted kernel assumption. GPFS is a cluster file system, not a client-server setup like NFS. Each GPFS node is capable of performing the full range of file system operations independently. This means that the kernel on each GPFS node has to be trusted to do the right thing. A user with root access on any GPFS node has full access to all file system data and metadata, and a malicious root user would be able to wreak havoc on any GPFS file system.

Being a cluster file system, GPFS needs to reach out to other nodes in the cluster. Different layers of GPFS code do this using different communication channels. The main GPFS daemon process, mmfsd, uses an RPC mechanism to communicate with the mmfsd processes running on other nodes. GPFS admin commands use a remote shell, such as RSH or SSH, to execute various commands on other nodes. The rationale for this architecture has roots in early GPFS history, and the details of remote shell use have evolved over time.

History

When GPFS was first released as a product in 1997, it was not a standalone piece of software, but rather a component in the IBM SP software stack. There was common infrastructure in the stack that GPFS code used for its needs. One basic need that GPFS has is bootstrapping: managing basic configuration covering things like cluster membership, defined file systems, disks belonging to GPFS, and so on. This configuration data must be available (and up to date) on all nodes in the cluster, and meet the basic clustering requirements: high availability, transactional semantics, and scalability to larger clusters. On the IBM SP, GPFS used a common infrastructure component known as the System Data Repository (SDR) for bootstrapping purposes. For general cluster administration, and for GPFS administration in particular, the remote shell of choice was RSH (those were simpler times).

Over the course of the following years, GPFS evolved into a standalone software product. An alternative mechanism was implemented for managing bootstrap configuration data. The data was stored in a text file, known as mmsdrfs (in homage to the IBM SP SDR roots), and the master copy of the file was managed by (typically) a pair of nodes known as configuration manager nodes (a primary and a backup). Whenever the configuration changed, a carefully orchestrated multi-phase commit operation would be carried out under the covers by GPFS admin code, using rsh and rcp commands, to provide proper transactional semantics. Once committed, the updated mmsdrfs file would be pushed out to the rest of the nodes in the cluster, again using rsh and rcp. In turn, other nodes in the cluster could pull an up-to-date copy of mmsdrfs using rsh and rcp (which could be needed, for example, if a node was down at the time of the configuration change). This meant that a remote shell connection to and from a configuration manager node might be needed at pretty much any time, for any node. At this point in time, the configuration requirement for GPFS was: any node in the cluster must be able to execute remote commands as root over rsh on all nodes in the cluster. Clearly, this is not what a security-conscious sysadmin would like to see, but again, those were simpler times.

More years passed. Gradually, an understanding set in that the RSH protocol is woefully insecure, and SSH rose to prominence as a more secure alternative. There was nothing in GPFS code that specifically required the use of rsh and rcp as such, and using ssh and scp as drop-in replacements was a simple step. The pathnames of the remote shell and remote copy commands became cluster configuration parameters. The use of SSH with GPFS has become a de facto standard (although some souped-up forms of RSH, e.g. Kerberos-enabled varieties, are still in use). However, the way remote shell and copy commands are called from GPFS has not changed; it remains very general, and not specific to any particular remote shell implementation. There is no hard requirement for ssh and scp as such.
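On current GPFS levels these pathnames are ordinary cluster parameters, so switching a cluster from rsh/rcp to ssh/scp is a small configuration change. A minimal sketch, assuming OpenSSH binaries under /usr/bin; the -r and -R options follow the mmchcluster syntax, and should be verified against the documentation for the GPFS release in use:

    # Point GPFS admin commands at ssh and scp instead of rsh and rcp
    mmchcluster -r /usr/bin/ssh -R /usr/bin/scp

    # mmlscluster reports the remote shell and remote file copy commands in use
    mmlscluster

Any pair of commands with compatible calling conventions can be substituted here, which is what makes the wrapper approach discussed later in this paper possible.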

The use of SSH, combined with the growth of GPFS cluster sizes, created new problems. When a large cluster is brought up, all nodes initiate SSH connections to one of the configuration manager nodes to verify that their copy of mmsdrfs is up to date. It turned out that handling a surge of incoming SSH connections is something sshd has trouble with, particularly on larger clusters. While tuning could ameliorate the problem somewhat, it was clear that a more scalable solution was needed. So mmsdrserv was implemented. At that point in time, mmsdrserv was a small, lightweight daemon that handled a few simple tasks related to mmsdrfs management, using custom RPCs over TCP/IP sockets: for example, fetching the current version number of mmsdrfs, or fetching the body of mmsdrfs to a client.

At that point, the all-to-all remote shell requirement was a source of significant consternation among GPFS users, for obvious reasons: if a single node in the cluster is compromised, the entire cluster is automatically compromised. Some way to tighten up the remote shell access requirements was needed, and the introduction of mmsdrserv offered an opportunity to do just that.

In GPFS V3.3, significant changes were made to the way admin commands operate. A new configuration parameter was introduced: adminMode. The allToAll setting corresponds to the old way of doing things. The central setting allows for a sharp reduction in the scope of remote shell access, as discussed in detail below.
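As a sketch of what adopting the central model looks like, assuming a reasonably recent GPFS level (adminMode is an ordinary configuration attribute, so mmchconfig is the natural tool, but verify against the documentation for the release in use):

    # Restrict remote shell use to the node where admin commands are issued
    mmchconfig adminMode=central

    # Confirm the current value
    mmlsconfig | grep -i adminMode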
Another significant change to the GPFS administration model was multi-clustering: the ability to mount a file system owned by a different cluster. In this model, several clusters can be set up and administered independently, with no need for command execution over a remote shell channel between them. This provided another avenue for reducing the scope of remote shell use.

In GPFS V4.1, a new mechanism for managing bootstrap configuration data was introduced: the Cluster Configuration Repository (CCR). When CCR is in use, once a cluster is created, the management of the master copies of configuration data is done entirely through an RPC mechanism, between the mmsdrserv (or mmfsd) processes running on quorum nodes.

What semantics does GPFS need from remote shell?

When the adminMode=central setting is in use, the exact requirement on remote shell and copy command semantics reads: when a GPFS management command is executed, it must be able to execute commands remotely on all other nodes in the cluster using the configured remote shell command, without being prompted for a password on the command tty. Only the tty used to execute the management command needs to be authorized.

So what is the rationale behind such precise wording? The intent is to allow GPFS commands to perform administration tasks cluster-wide, but with a limited level of authorization. Only one tty on one node needs to be authorized, and only when a GPFS management task needs to be performed; GPFS will not try to run remote shell commands under the covers in this mode. Very importantly, "without being prompted for a password on the command tty" is not equivalent to "passwordless". It only means that authentication needs to occur through a mechanism other than a tty prompt. One possible implementation that fits this model well is the SSH authorized_keys public/private key framework for granting trust, with the private key protected by a passphrase and ssh-agent used for passphrase prompting and caching. It is common to use a special hardened, sysadmin-only node for performing all GPFS management tasks, and to authorize only this node for batch-mode SSH access.

It is important that the remote shell command operates in batch (or promptless) mode: no prompting for input and no extraneous output on the command tty. GPFS code passes the -n switch (redirect stdin from /dev/null) to the remote shell command, so supplying input directly on the command tty is not possible, and the use of this option is essential to proper parsing of remote command output. It is perfectly fine if the remote shell command obtains authorization by prompting for a password (or passphrase) through an external channel, e.g. an X11 window, or reuses an authentication token from a pre-authorization operation.
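A minimal sketch of the passphrase-protected key plus ssh-agent arrangement described above, run as root on a designated admin node; the key file name and the node name nodeB are illustrative:

    # Generate a key pair; choose a non-empty passphrase when prompted
    ssh-keygen -t rsa -f ~/.ssh/id_rsa_gpfsadmin

    # Install the public key into root's authorized_keys on the other nodes
    ssh-copy-id -i ~/.ssh/id_rsa_gpfsadmin.pub root@nodeB

    # Start an agent for this session and cache the decrypted key;
    # the passphrase is asked for once, here, not by every mm command
    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/id_rsa_gpfsadmin

    # GPFS admin commands issued from this tty can now execute remote
    # commands promptlessly; this is roughly what such a call looks like
    ssh -n -o BatchMode=yes root@nodeB /bin/true

With BatchMode enabled, ssh fails outright instead of falling back to a password prompt, which matches the promptless behavior GPFS expects.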

In a multi-homed environment, i.e. a configuration where multiple network interfaces are defined on a node, a question naturally arises: which network interfaces will GPFS use, in particular for running remote shell commands? Only admin interfaces are used for that purpose. By default, the admin interface is the one corresponding to the hostname passed to the mmcrcluster or mmaddnode command when the node is added to the cluster. It is possible to specify a different admin interface using mmchnode. Remote shell connections only need to be authorized for the admin interface, not for any other interfaces that may be defined on a node. Other parts of GPFS, in particular the mmfsd daemon, may use other interfaces if configured to do so, but that does not involve the use of a remote shell.

What else is possible?

So what can one do if the remote shell semantics explained above are not acceptable? For example, what if PermitRootLogin must be disabled per corporate security policy, with no exceptions allowed? Does this rule out using GPFS? Not necessarily. An important point to remember is that GPFS allows using any pair of commands that provide the general semantics of rsh and rcp. While the ssh and scp pair is the most obvious candidate, the playing field is not restricted to those two. One potentially productive approach is to implement a pair of wrapper commands that provide the expected semantics externally, and internally do whatever it takes to get the job done. This may involve a custom-designed communication tunnel, an exotic authentication method, or any combination of things. For the specific problem of PermitRootLogin, one possible approach is to leverage sudo, or a sudo-like framework, for privilege manipulation. It is possible to kick off a GPFS admin command using sudo, have the wrappers use ssh to log in to the remote node under a non-root ID, and then use sudo on the remote side to execute the necessary commands. PermitRootLogin can be set to "no" in this scenario. It is still necessary to allow promptless remote command access for the user ID in question, and sudo must allow promptless execution of a few commands for this ID. A sample of sudo-based wrappers is available on request by contacting gpfs@us.ibm.com.
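Purely as an illustration of the wrapper idea (this is not that sample, and it omits all error handling), a remote shell wrapper could look roughly like the sketch below; the gpfsadmin ID and the wrapper path are assumptions:

    #!/bin/sh
    # Hypothetical /usr/local/bin/sshwrap: remote shell wrapper for GPFS.
    # Invoked roughly as: sshwrap [-n] <host> <command ...>
    # Logs in under the unprivileged gpfsadmin ID, then elevates with sudo.

    NFLAG=""
    if [ "$1" = "-n" ]; then
        NFLAG="-n"
        shift
    fi

    HOST="$1"
    shift

    # BatchMode keeps ssh from prompting on the command tty;
    # sudo -n likewise refuses to prompt rather than hang
    exec /usr/bin/ssh $NFLAG -o BatchMode=yes "gpfsadmin@$HOST" sudo -n "$@"

A matching remote copy wrapper would be needed as well, and the pair would then be configured as the cluster's remote shell and remote copy commands in the same way ssh and scp are.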

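On the remote nodes, the pieces such a scheme relies on are ordinary sshd and sudoers configuration; an illustrative fragment (the gpfsadmin ID is an assumption, and the command list should be narrowed to what the wrappers actually run):

    # /etc/ssh/sshd_config: root logins remain disabled
    PermitRootLogin no

    # /etc/sudoers (edit with visudo): promptless sudo for the wrapper ID.
    # ALL is shown only for brevity; list the specific GPFS commands instead.
    gpfsadmin ALL=(root) NOPASSWD: ALL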
In those situations where no form of promptless remote shell access is possible on a given node, it is still possible to mount a GPFS file system that is exported from a different cluster. The obvious disadvantage is the disjoint system administration model: the unit of GPFS administration is a single cluster, so if multiple clusters are defined, each needs to be administered separately. However, in certain cases this may be a fair tradeoff for not requiring remote shell access.

Summary

GPFS uses a fairly flexible framework for performing administrative tasks. This framework has evolved substantially from its early implementation, and some of the preconceived notions about GPFS requirements for remote shell configuration are no longer true. It is possible to configure GPFS to run in a wide variety of system configurations.