Do it Yourself System Administration




Due to a heavy call volume, we are unable to answer your call at this time. Please remain on the line as calls will be answered in the order they were received. We appreciate your business and apologize for the delay. Your expected wait time is.

Overview

What I hope to cover:
- System shutdown/restart procedures
- Recovery from an outage
  - fsck: how to perform the recovery operation
- Identification of properly operating systems
  - System monitoring and troubleshooting
- General Linux administration
- Any outstanding questions, general or specific, as they apply to the systems at your ranges...
- Wrapping things up and your questions

MAC Shutdown and Restart steps
DAS Shutdown and Restart steps

- External RAID devices must be up first.
  - MAC RAIDs are attached to the INTERACTIVE2 node. This node must be completely up before any other system that uses the RAID (the admin node does not use the RAID).
  - DAS RAIDs are attached to the SERVER node (on 5-node DAS configurations). This node must be completely up before any other DAS systems (all of them use the RAID).
  - 2-node DAS configurations do not use an external RAID device, but <range>-das1 should be brought up first.
- There may be cross-mounting between the nodes as part of normal operations or for testing purposes.
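
The "RAID node first" ordering can be partly checked from another machine before the remaining systems are powered on. The following is only a sketch: the function name wait_for_host, the PROBE override, and the host name in the usage comment are illustrative assumptions, not part of the 4dwx procedures. A host that answers ping is reachable on the network, which is a first indication only and does not prove that the node is completely up.

```shell
#!/bin/sh
# Sketch: wait until a host answers before bringing up the systems that depend on it.
# wait_for_host and PROBE are hypothetical names used for illustration.
# Usage: wait_for_host HOST [TRIES] [DELAY_SECONDS]

wait_for_host() {
    host=$1
    tries=${2:-30}     # number of attempts before giving up
    delay=${3:-10}     # seconds to sleep between attempts
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Default probe is a single ping with a 2 second timeout (Linux ping options).
        # PROBE is left unquoted on purpose so it splits into command and options.
        if ${PROBE:-ping -c 1 -W 2} "$host" >/dev/null 2>&1; then
            echo "$host is answering"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$host did not answer after $tries tries" >&2
    return 1
}

# Example call (the host name is a placeholder, substitute the RAID-attached node):
#   wait_for_host raid-node.example 30 10 && echo "OK to start the remaining systems"
```

The function returns a non-zero status when the host never answers, so it can gate the next step of a startup checklist.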

Recovery from an Outage

In general, recovery from an outage should be handled exactly like recovering from a controlled shutdown. These steps are documented in the attached handouts and at the online Robohelp website:
http://www.4dwx.org/documentation/kbase/!ssl!/webhelp/4dwx_help.htm

- Sometimes the systems will fail to boot cleanly due to errors on the disks or in the filesystems (operating system related) or because no network is available. See next slide for the fix to the most common recovery issue.
- Please check that power, AC, and network connectivity are available before rebooting the systems.
- Check that the switches and external RAIDs are up before proceeding with the recovery.

That should be it, barring hardware failures...

Recovery from an Outage

The boot process hangs due to an fsck failure and waits at:

    Enter root password for maintenance or control-d to continue

This is likely because the system was attempting to write to the hard disks when it went offline. To fix it:

1. Check the screen to find out which file system had the issue. Look for something like /dev/sda3 or /dev/sdb1.
2. Hit the Enter key to clear the buffer and prepare for root password entry.
3. Enter the root password (you will see nothing on the console) and hit the Enter key. You should now be at a command prompt that ends with a #.
4. Run the command to fix the filesystem issue. Enter the following and hit Enter:

       /sbin/fsck -fy /dev/sd{disk}{partition} [second filesystem to check]

   where disk and partition were found in step 1.
   Ex: /sbin/fsck -fy /dev/sda3 /dev/sdb1
5. Once this has completed, type exit at the prompt to force the system to reboot properly.
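
To make the command form above concrete, here is a small dry-run sketch that assembles the fsck command line from the partition names noted on the screen and prints it without running anything. The function name build_fsck_cmd is a hypothetical helper for illustration; at the maintenance prompt the command is simply typed by hand.

```shell
#!/bin/sh
# Dry-run sketch: print the fsck command for the given partitions. Nothing is executed.
# build_fsck_cmd is a hypothetical helper name; only names starting like /dev/sda3 are accepted.

build_fsck_cmd() {
    cmd="/sbin/fsck -fy"
    count=0
    for dev in "$@"; do
        case "$dev" in
            /dev/sd[a-z][0-9]*)
                cmd="$cmd $dev"
                count=$((count + 1))
                ;;
            *)
                echo "ignoring unexpected device name: $dev" >&2
                ;;
        esac
    done
    # Refuse to print a command with no partitions on it.
    if [ "$count" -eq 0 ]; then
        echo "no partitions given" >&2
        return 1
    fi
    echo "$cmd"
}

# Same partitions as the example in the steps above.
build_fsck_cmd /dev/sda3 /dev/sdb1
```

In the command itself, -f forces a check even if the filesystem looks clean and -y answers yes to every repair prompt, which is why the partition names should be double-checked first.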

Are my systems operating properly?

There are 2 methods of watching systems to determine if they are working properly:

- System monitoring (scheduled and automated)
  - The tools for this type of monitoring are primarily OS based and will be used by NCAR to watch your systems remotely.
  - We intend for the ranges to have access to a web page that will display current and historical trends for all of the 4dwx systems.
  - For the time being, alerts and warnings will be sent to NCAR admins only, with possible future configurations notifying local personnel.
- General monitoring tools (immediate and manual or scripted)
  - These are OS level tools that any user can execute to look at the status of the currently running system.
  - They will not provide extensive historical details, but can provide current status and an immediate history of the system.
  - These commands can be put into cron jobs so that the output can be emailed to local/remote personnel in an automated fashion.

Let's take a look at some general tools any user can execute.

Are my systems operating properly?

General system monitoring commands:

- df: displays the disk partitions mounted on the system, the amount of free disk space, and a percent usage
- free: displays the total and used memory and swap available on the system
- uptime: displays the time the system has been up since the last reboot and the 1, 5, and 15 minute load averages for the system
  - Load averages should be viewed as the number of jobs attempting to run on the system. A high number indicates a large demand on the CPUs, fewer resources, and a bogged down, slow system.
- ps: displays a listing of the processes running on the system
- top: displays a fairly concise summary of the previous 4 tools that updates every 5 seconds
- ping: attempts to send a packet to another system on the network

If you intend to set up automated scripts that return information via email, please use only the first 4 commands (df, free, uptime, ps).
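
A minimal sketch of such an automated report, built from the first 4 commands, might look like the following. The function name status_report, the report layout, the script path, the schedule, and the email address in the comments are all assumptions for illustration, and the mail command is assumed to be installed and configured on the system.

```shell
#!/bin/sh
# Sketch of a status report built from df, free, uptime and ps.
# status_report and the report layout are illustrative assumptions.

status_report() {
    echo "Status report for $(uname -n)"
    for cmd in "df -h" "free -m" "uptime" "ps -e"; do
        echo
        echo "== $cmd =="
        # ${cmd%% *} is the command name without its option.
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            # $cmd is left unquoted on purpose so it splits into command and option.
            $cmd 2>&1
        else
            echo "(command not available on this system)"
        fi
    done
}

status_report

# Hypothetical crontab entry that mails the report every morning at 06:00,
# assuming this script is saved as /usr/local/bin/status_report.sh:
#   0 6 * * * /usr/local/bin/status_report.sh | mail -s "status report" admin@example.com
```

top and ping are left out on purpose, in line with the advice above: top is interactive by default, and ping runs until interrupted unless it is given a packet count.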

General Linux Administration

- General commands
  - ls, cp, mv, cd, vi file, date, tar, who
  - ssh, scp, rsync, ftp, ping
  - kill, snuff, killall, &, fg, tail, head
  - uname, clear, grep
- Troubleshooting commands
  - df, du, top, uptime, free
- Advanced commands
  - vmstat, route, mpstat
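
As one example of putting a troubleshooting command to work, the sketch below reads df output and prints only the filesystems at or above a usage threshold. The function name check_usage and the 90 percent threshold are arbitrary choices for illustration. The -P option asks df for the POSIX one-line-per-filesystem format, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: flag filesystems whose usage is at or above a threshold.
# check_usage is a hypothetical name; it reads "df -P" style output on stdin.
# Usage: df -P | check_usage [PERCENT]     (PERCENT defaults to 90)

check_usage() {
    awk -v limit="${1:-90}" 'NR > 1 {
        pct = $5                 # capacity column, e.g. "95%"
        sub(/%/, "", pct)        # drop the percent sign
        if (pct + 0 >= limit + 0) print $6 " is at " $5
    }'
}

df -P | check_usage 90
```

No output means every mounted filesystem is below the threshold.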

Wrapping it up and sealing it shut

Comments, Questions, Complaints, Compliments, Assertions, Queries, Announcements, Suggestions, Ponderings, Posts, Details, Affirmations, Denials, Delusions, Deceptions, Criticisms, Congratulations, Quotes, Quips, Wonderments

The End. That's all folks!