S2A6620 Technical Service Bulletin Cache Protect Battery Replacement
Table of Contents
1 Overview
2 Explanation
  2.1 Background
  2.2 Increased Functionality
  2.3 Applicability
3 Upgrade Procedure
  3.1 Online Replacement
  3.2 Controller Replacement Steps
4 Support
5 Accessing a Controller without a Serial Cable
6 SFA OS app show sub sum Command Equivalents for Earlier Versions

DataDirect Networks S2A6620 Technical Service Bulletin, Cache Protect Battery Replacement
1 Overview

DDN recommends that all S2A6620 battery packs be replaced with a new and improved Li-Ion battery. Because S2A6620 battery packs are not CRUs (Customer Replaceable Units), the entire controller canister must be replaced.

2 Explanation

2.1 Background

The S2A6620 battery pack protects volatile cache on power loss while the cache contents are flushed to internal non-volatile memory. If a battery is incapable of this task, the controller will enter a critical condition upon reboot. This condition is usually recoverable because the SFA OS mirrors the contents of cache across both controllers in an S2A6620 system. If the batteries on both controllers in a system are incapable of performing a successful cache flush, the system runs the risk of losing cached writes during a power-loss event for storage pools that have write-back cache enabled.

DDN has encountered some batteries that are not operating at an acceptable level. After further investigation, DDN has decided to replace all existing S2A6620 canisters that use nickel-metal hydride batteries with a new design that incorporates newer Li-Ion batteries as well as improved battery monitoring and conditioning features.

2.2 Increased Functionality

The new controller canister with the improved battery provides an interface for SFA OS firmware to communicate with an onboard fuel gauge on the battery pack. SFA OS firmware has been enhanced to display the fuel gauge's monitoring information in the CLUI and GUI, and it now captures new events related to the new packs.

2.3 Applicability

An upgrade to the 1.4.1 (or later) release of SFA OS, coupled with controller hardware replacement, is a mandatory upgrade for the entire S2A6620 customer base.
3 Upgrade Procedure

DDN recommends that the controller replacement procedure described in this document be performed only as an online replacement procedure. In this scenario, the S2A6620 system is up and running (with or without host I/O) while the operator replaces each controller canister individually.

3.1 Online Replacement

Two things must be accounted for when performing online controller replacements.

The first is the host-side failover mechanism (commonly referred to as multi-pathing I/O or MPIO). If host I/O will be serviced by the S2A6620 during this operation, it is imperative that host-side multi-pathing software be configured and operating correctly. Because this online upgrade procedure involves a controller failover/failback operation, functioning multi-pathing mechanisms must be in place to ensure that data availability is maintained throughout the online replacement procedure.

NOTE: Multi-pathing software is implemented on the host system, not on the S2A6620, and is only critical if host I/O or mounted file systems are expected during this controller replacement procedure. Any questions or concerns in this area should be directed to DDN Support before proceeding.

The second consideration is the type of power supply unit currently installed in the controller enclosure of the S2A6620 system. Two types of power supplies are currently deployed with S2A6620 systems, and the older generation is not compatible with this online controller replacement. To verify the installed power supply type, issue the following command and examine the Serial Number field:

    show power_supply all

If the main chassis power supply serial numbers begin with THDEL, you may proceed with the online replacement. If the main chassis power supply serial numbers begin with CATEC, you cannot perform the online replacement and must contact DDN Support for further instructions.
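The serial number check above can be sketched as a small script. This is only an illustration, not a DDN tool: it assumes the operator has already copied the Serial Number values out of the show power_supply all output by hand (the output format is not described in this bulletin), and the function name and sample serial numbers are hypothetical. Only the THDEL and CATEC prefixes come from this document.

```python
from typing import Iterable

# Prefixes taken from this bulletin.
ONLINE_OK_PREFIX = "THDEL"       # online replacement may proceed
ONLINE_BLOCKED_PREFIX = "CATEC"  # online replacement not possible; contact DDN Support


def online_replacement_allowed(serial_numbers: Iterable[str]) -> bool:
    """Return True only if every supplied serial number starts with THDEL.

    Returns False as soon as a CATEC serial number is seen. Raises
    ValueError for an empty list or an unrecognized prefix, because the
    bulletin gives no guidance for those cases.
    """
    serials = [s.strip() for s in serial_numbers]
    if not serials:
        raise ValueError("no power supply serial numbers supplied")
    for serial in serials:
        if serial.startswith(ONLINE_BLOCKED_PREFIX):
            return False
        if not serial.startswith(ONLINE_OK_PREFIX):
            raise ValueError(f"unrecognized serial number prefix: {serial!r}")
    return True


if __name__ == "__main__":
    # Hypothetical serial numbers, for illustration only.
    print(online_replacement_allowed(["THDEL0001", "THDEL0002"]))
```

Treating an unknown prefix as an error rather than as "allowed" is a deliberately conservative choice; in that situation the safe course is to contact DDN Support.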
3.2 Controller Replacement Steps

NOTE: An S2A6620 serial cable is required for the following procedure. If you do not have one available, contact DDN Support (see contact details in Section 4) before proceeding. If you need to perform the following steps without a serial cable, see Section 5 for a procedure to set up network access to a controller when a serial cable isn't available, and then return to this procedure.

1. Disable Controller Write-Back Cache (WBC): SSH into one of the S2A6620 controller canister CLUI sessions (with USERNAME: user and PASSWORD: user). Disable write-back cache on all storage pools. To do this, type the following command for each configured storage pool on the system:

    set pool ID write_back_caching=false

2. Capture Controller Logs: From an SSH CLUI session on each S2A6620 controller canister, issue the commands show sub sum, app show sub sum, and ui show network, and save all the output as plain text.

NOTE: app show sub sum may not be supported on earlier SFA OS versions. A list of equivalent commands is included in Section 6.
It's important to capture the current network settings on the existing controllers, because they will be needed to configure the network interfaces on the new controller canisters. The default IP address of a new controller is either 10.0.0.1 or 10.0.1.1 (see Section 5).

3. Shut Down First Controller: SSH into the first controller to be replaced and issue the following command:

    shutdown controller local

This command initiates a shutdown of this controller only. This is a safe operation on a production system because all pools and VDs, as well as host I/O, are automatically transferred to the remaining active controller.

4. Remove and Install First New Controller: Verify that the controller has shut down by confirming there is no LED activity on it. Label all cables connected to the shut-down controller and remove them. Disengage the controller and replace it with the new controller. Make sure the new controller is fully seated and latched into place. Plug in the FC cables at this time, but do not plug in the Ethernet cable.

NOTE: If you are not using a serial cable but are instead going to configure the controller over an Ethernet/SSH connection, and you have verified that there are no IP address conflicts as described in Section 5, plug in the Ethernet cable as well.

5. Verify Installed Controller Powers Up: The new controller should power up automatically. If it does not, power up the new controller by pressing the power button located on the back of the controller.

6. Connect To Newly Installed Controller Using An RS-232/Serial Connection: Allow the controller to power up; this takes approximately two minutes. Once it is fully powered up, log into the newly installed controller CLUI using a serial cable connection.
The newly installed controller should be in a Manual Intervention Required (MIR) Firmware Mismatch state, because its firmware will likely be at a different level than that of the existing controller. If any other MIR state is seen, contact DDN Support for guidance. The controller will remain in the MIR Firmware Mismatch state until the other controller is replaced (later in this procedure).

7. Change the network settings to match those of the removed controller: Using the ui show network output captured previously, run the following command on this controller canister:

    ui set network local 0 ip_address=x ip_mask=y ip_gateway=z
Where x, y, and z are the IP address, netmask, and gateway address of the original controller. If it's not already connected, plug the Ethernet cable into the new controller. Ping the newly installed controller to ensure that network communication is set up correctly.

8. Repeat For Second Controller Replacement: To replace the other controller, repeat steps 2-7 for the second controller. Once the other controller has been shut down (in step 3), the Firmware Mismatch MIR state will be cleared from the first controller and host I/O failover (if applicable) will occur. Note that when you execute step 6 for the second controller, it should not come up in any MIR state, including MIR Firmware Mismatch.

9. Validate Functional Controllers: Both controllers should be fully functional at this point and serving I/O. You can run the command app show channel to look at the S2A6620 host port status. You can also run show vd counter rate several times to see host I/O going through the S2A6620 controller FC ports (assuming host I/O is running).

10. Enable Controller Write-Back Cache (WBC): From an SSH CLUI session on either S2A6620 controller canister, enable write-back cache on all storage pools. To do this, type the following command for each configured storage pool on the system:

    set pool ID write_back_caching=true

4 Support

Please contact DataDirect Networks Support at any time for assistance. Support can be reached by the following methods:

Web: http://www.ddn.com/request-support
Email: support@ddn.com
North America: +1.888.634.2374
International: +1.818.718.8507

5 Accessing a Controller without a Serial Cable

NOTE: This procedure will only work if no other device on the local subnet has an IP address of 10.0.0.1 or 10.0.1.1.

If an S2A6620 serial cable is not available, this procedure can be used to access a controller over the network.

1. Ping the local network to see whether any devices on the network have the IP address 10.0.0.1 or 10.0.1.1. If possible, remove these devices from the network until the controller replacement is complete. If it's not possible to remove the devices with these IP addresses, the remaining steps below cannot be used because doing so would create conflicting IP addresses.

2. DDN uses two IP addresses for replacement controllers. Your replacement controller's address will be either 10.0.0.1 or 10.0.1.1, with a netmask of 255.255.255.0.

3. Once the controller is installed and powered on, plug in the Ethernet cable.

4. SSH into one of the addresses given in Step 1. (One of the two addresses will work.)
5. Now that communication is established, change the IP address, netmask, and gateway so that the controller can participate in your network. Use the following command to make the changes:

    ui set network local 0 ip_address=x ip_mask=y ip_gateway=z

6. Once the network settings have been applied, exit the CLUI and SSH back into the controller using the new IP address.

The steps described in this section can be used in lieu of the RS-232/Serial Connection steps described throughout Section 3.2.

6 SFA OS app show sub sum Command Equivalents for Earlier Versions

If the system is running SFA OS 1.3.x.x or newer, the app show sub sum command can be used. If the system is running a version of SFA OS prior to 1.3.x.x, the following commands must be run instead to capture the required data:

    show controller * all
    show slot *
    app show proc *
    show subsystem all
    show pool *
    app show channel *
    show enclosure *
    show virtual disk *
    app show stack *
    show expander *
    show physical_disk *
    ui show email
    show fan *
    show job * all
    ui show network loc *
    show temperature *
    app show discovered *
    ui show snmp
    show power_supply *
    app show host *
    show controller local log num=max
    show power_supply * all
    app show initiator *
    show ups *
    app show pres *
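The version rule in this section can be expressed as a short helper that returns the commands to run in place of app show sub sum. This is a minimal sketch under stated assumptions: the function names are hypothetical, the version string is assumed to be dot-separated integers such as 1.3.0.2 (this bulletin does not say how the SFA OS version is reported), and the script only builds the list of command strings. It does not connect to a controller.

```python
from typing import List, Tuple

# Commands listed in this section for SFA OS versions prior to 1.3.x.x,
# copied verbatim in the order given.
LEGACY_COMMANDS = [
    "show controller * all", "show slot *", "app show proc *",
    "show subsystem all", "show pool *", "app show channel *",
    "show enclosure *", "show virtual disk *", "app show stack *",
    "show expander *", "show physical_disk *", "ui show email",
    "show fan *", "show job * all", "ui show network loc *",
    "show temperature *", "app show discovered *", "ui show snmp",
    "show power_supply *", "app show host *",
    "show controller local log num=max", "show power_supply * all",
    "app show initiator *", "show ups *", "app show pres *",
]


def parse_version(version: str) -> Tuple[int, ...]:
    """Split an assumed dot-separated numeric version into integers."""
    return tuple(int(part) for part in version.strip().split("."))


def capture_commands(sfa_os_version: str) -> List[str]:
    """Return the commands that stand in for 'app show sub sum'."""
    # Only the major and minor fields matter for the 1.3.x.x threshold.
    if parse_version(sfa_os_version)[:2] >= (1, 3):
        return ["app show sub sum"]
    return list(LEGACY_COMMANDS)


if __name__ == "__main__":
    # Hypothetical version string, for illustration only.
    for command in capture_commands("1.2.0.0"):
        print(command)
```

The printed list can be pasted into a CLUI session one command at a time, with the output saved as plain text as described in step 2 of Section 3.2.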