Cyber Security Management for Utility Operations

Dennis K. Holstein (Opus Publishing) and Jose Diaz (Thales e-Security)

0-7695-2507-5/06/$20.00 (C) 2006 IEEE

Abstract

Strong identity management enforced with digital authentication mechanisms has become the leading requirement to improve cyber security for utility operations. Utility operators don't really care how it works as long as they are confident that it does work. They want a solution that is standards-based, interoperable with commonly installed applications, and extendable to legacy systems to lower the cost of ownership. General recommendations for a cryptographically-based cyber security solution are well defined in the American Gas Association's Report No. 12, Part 1, and commercial products are now available to implement these requirements. This paper presents, from a utility operator's point of view, the requirements to securely manage the keying material used to protect SCADA communications and to access the maintenance ports of field devices. This paper also outlines areas of future investigation needed for a comprehensive solution.

An introduction to the retrofit solution

A retrofit solution is now a reality. It enhances access control and protects information exchanged over Supervisory Control and Data Acquisition (SCADA) asynchronous serial communication channels and over dial-up connections to the maintenance ports of field devices. Recommendations for a cryptographically-based cyber security solution are well defined in American Gas Association (AGA) Report No. 12, Part 1.

Recommended architecture

A cryptographic module (CM) may be configured to protect SCADA communications (SCM) or to protect communications to the maintenance ports of field devices (MCM). Where the term CM is used, it applies to either configuration. Figure 1 shows the recommended architecture to implement this solution. Retrofit requires the use of cryptography embedded in a SCADA Cryptographic Module (SCM) installed inline on the communication channel. SCMs should require minimal modification to existing hardware or software of the SCADA Master, Front End Processor (FEP), field device, or field technician's laptop computer. The field device may be a Remote Terminal Unit (RTU) or another Intelligent Electronic Device (IED) such as a communication processor or substation host. Some legacy devices may not have the capability to accept any modification. For this reason, the retrofit solution should, for the most part, be designed for no modification to the SCADA Master, FEP, RTU, and IED. Even a minimal modification would incur significant cost, because these components must be recertified whenever their software or hardware is changed.

Cryptographic module configurations

It is common to build one cryptographic module that can operate in either of two modes. If a cryptographic module is configured to protect SCADA communications, it is called a SCADA Cryptographic Module (SCM). If a cryptographic module is configured to protect access to the field device maintenance port and to protect the data communicated to and from that port, it is called a Maintenance Cryptographic Module (MCM). All CMs have a local management port for configuration management. This port is used to load initial keying material into the CM and to set default parameters prior to field installation. Authorized personnel may access a local CM management port on site or, if that port is connected to a communication interface, remotely (commonly referred to as out-of-band communications).

SCM configurations

If a modem rack is used to support multiple SCADA communication channels, it is common to install SCMs in a rack configuration rather than stacking individual SCMs. This configuration is shown in Figure 1. SCMs at the field location may be installed on a point-to-point communication network or on a multidrop communication channel. If a multidrop communication channel is used, the SCM must have the capability to operate in a mixed mode because some field units on the multidrop may be protected and others may not. This capability also provides a more graceful cut-over to operations because the field SCMs can be turned on when ready, rather than all at once. Mixed mode operation is one reason that the SCM must be able to interrogate the native communication protocol; in this case, to get the address of the field device for which each message is intended. The other reason the SCM must interrogate the native communication protocol is to detect the end of a message.

MCM configurations

Protecting access to the maintenance ports, and the data communicated over these channels, may use an MCM at both ends of the communication channel, or one MCM at the field end of the communication channel and cryptographic software loaded on the field technician's laptop computer. Figure 1 shows the configuration with software and one MCM on each communication channel. This configuration is preferred because it is less costly and simpler to manage. The field technician's laptop computer must include an available USB port that will accept an Authentication Key to satisfy the requirement for two-factor authentication. Although a SmartCard device may be used to provide two-factor authentication, it is not preferred because of its cost, the extra equipment it requires (a SmartCard reader), and the difficulty field technicians have in using it.
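
The mixed-mode behaviour described under SCM configurations can be illustrated with a short sketch. It assumes a Modbus RTU style frame, in which the first byte is the address of the field device; the set of protected addresses, the function names, and the `demo_protect` placeholder are hypothetical and stand in for whatever cryptographic wrapping an AGA 12 compliant SCM actually applies.

```python
from typing import Callable, FrozenSet

# Hypothetical: addresses of field devices that already have an SCM installed.
PROTECTED_ADDRESSES: FrozenSet[int] = frozenset({3, 7})


def frame_address(frame: bytes) -> int:
    """Return the destination address of a frame.

    Assumption: the first byte of the frame is the device address,
    as in a Modbus RTU frame.
    """
    if not frame:
        raise ValueError("empty frame")
    return frame[0]


def forward(frame: bytes,
            protect: Callable[[bytes], bytes],
            protected: FrozenSet[int] = PROTECTED_ADDRESSES) -> bytes:
    """Mixed-mode decision made by a master-side SCM.

    Frames for protected devices are passed through `protect`;
    frames for unprotected devices are forwarded unchanged.
    """
    if frame_address(frame) in protected:
        return protect(frame)
    return frame


def demo_protect(frame: bytes) -> bytes:
    """Placeholder only: tags the frame instead of encrypting it."""
    return b"PROT" + frame
```

Because the decision is made per frame, field SCMs can be commissioned one at a time by adding their addresses to the protected set.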

Secure cryptographic management system

A Secure Cryptographic Management System (SCMS) is a critical component of the solution set needed for a system to cryptographically protect SCADA communications. Key management schemes must provide the capability to control the distribution, use, and update of cryptographic keys. Figure 1 shows the three SCMS subsystems of the recommended architecture.

- An administrative workstation equipped with a USB port for inserting the authentication key, which provides the authorization for the SCMS operator to manage all SCMS functions.
- A secure key management appliance used to store all keying materials. This appliance may be implemented as part of the administrative workstation, or as a separate unit that includes the function of a proxy server for key management.
- A key distribution system used to create and distribute keying material, and to store all the information about the configuration and status of CMs and authentication keys.

Although the SCMS is shown located in the SCADA control center, it may be located in any secure facility with the appropriate communication capability.

AGA 12, Part 1, Addendum 1 (a work in progress) will specify the recommended practice for key establishment and use, classification and control of keys based on their intended use, requirements for the distribution of public keys, architectures supporting automated key updates in distributed systems, and the roles of trusted third parties. Systems capable of providing cryptographic services require techniques for initialization and key distribution. In addition, a protocol is needed for on-line (or in-band) update of keying material when that is the only means of remote communications, for key backup and recovery, for key revocation (probably the most difficult problem), and for managing certificates in certificate-based systems.

Although AGA 12-1 Addendum 1 addresses key management to protect SCADA communications, the same recommendations apply to management of keying material for other enterprise requirements. The AGA 12 project team has a clearly defined objective: to develop the framework for one key management system, and thereby avoid creating a unique key management system just for SCADA communications.

Scope of this paper

The scope of this paper is limited to the implementation requirements for the SCMS as needed to create and manage the keying material for all configurations of the CMs and authentication keys.

The problem space and end-user options

The purpose of this section is to describe the problem space and end-user options from two points of view. An end-user's operational view of cyber security management runs from the time the cryptographic modules, laptop computer software, authentication keys, and SCMS are delivered, through deployment and commissioning, normal operation, repair and maintenance, and decommissioning. A supplier's view of cyber security management functions and capabilities is to determine what is needed throughout the life cycle of the solution. Note: bold italic text is used to highlight operational considerations that the supplier needs to address.

The magnitude of SCADA operations and remote access to field devices

Although procedures and communication capabilities for SCADA operations and remote access to field devices vary widely, this paper uses one example to illustrate the requirements imposed on a comprehensive SCMS.

Operational entities and organizational fiefdoms

A hypothetical large energy company providing both gas distribution and electric transmission and distribution services is used as an example to illustrate the need for a comprehensive solution. In this example, gas distribution and electric transmission and distribution are part of the utility enterprise but operate separately.
One SCMS, which is an extension of existing Information Technology (IT) policies and procedures throughout the enterprise, is desired for both gas and electric operations. Extending IT policies and procedures for both gas and electric operations throughout the enterprise leads to the following SCMS derived requirements.

1. The SCMS must provide the capability for centralized control but decentralized execution to ensure a homogeneous application of IT security policy extensions with efficient operational implementation and management of keying materials.

2. For a large utility such as the one described in this example, central control should be implemented through policy rather than physical or logical management of keying material. For a small utility, central control may be implemented through both policy and physical or logical management of keying material. The SCMS must provide the capability to adapt to either environment.

3. The SCMS must provide the capability to establish a sub-enterprise level of control: one for gas operation and one for electric operation. It is for this reason that AGA 12-1 recommends ANSI X9.69, which describes the implementation of cyber security for an enterprise, domains within the enterprise, and organizational units within the domain.

For both gas and electric operations, it is usual practice that one or more control centers are active to provide regional control, and that one or more backup control centers are on standby to take over in case of an emergency. Furthermore, it is reasonable to assume that each control center operates independently with its own staff and communication channels to field equipment.

4. The recommended architecture shown in Figure 1 should be replicated in each control center. Although Figure 1 describes a retrofit solution, the same requirements apply to an IP-based network solution and an embedded solution.
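
The layering named in requirement 3 (an enterprise, domains within it, and organizational units within a domain), combined with the centralized control and decentralized execution of requirement 1, can be pictured as a small tree. The sketch below is illustrative only: the class names, the example names, and the single policy limit are assumptions, and it does not model the ANSI X9.69 mechanisms themselves.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OrganizationalUnit:
    name: str
    # Keying material is issued and tracked locally (decentralized execution).
    issued_keys: List[str] = field(default_factory=list)


@dataclass
class Domain:
    name: str
    units: Dict[str, OrganizationalUnit] = field(default_factory=dict)


@dataclass
class Enterprise:
    name: str
    # Hypothetical enterprise-wide policy limit (centralized control).
    max_key_lifetime_days: int
    domains: Dict[str, Domain] = field(default_factory=dict)

    def add_unit(self, domain_name: str, unit_name: str) -> OrganizationalUnit:
        """Create the domain and unit if they do not exist yet."""
        domain = self.domains.setdefault(domain_name, Domain(domain_name))
        return domain.units.setdefault(unit_name, OrganizationalUnit(unit_name))

    def issue_key(self, domain_name: str, unit_name: str,
                  key_id: str, lifetime_days: int) -> None:
        """Record a key for a unit, but only within the enterprise policy."""
        if lifetime_days > self.max_key_lifetime_days:
            raise ValueError("lifetime exceeds enterprise policy")
        self.domains[domain_name].units[unit_name].issued_keys.append(key_id)
```

A small utility could collapse this to a single domain, while a large one would populate separate gas and electric branches under the same policy.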

One approach is to establish domains within each sub-enterprise. Each control center could represent a domain, but at least one other domain will be needed to include the organizational units that support the domain of each control center (e.g., Engineering, Field Maintenance). If the utility establishes one enterprise (no sub-enterprises), then each operation, one for gas and one for electric, could be a domain within the enterprise.

Communication issues: this can get ugly

It is common for one operational entity, such as gas distribution, to use a communication protocol, such as Modbus, that is different from the one used by the other operational entity. It is also common in a single operational entity to find a mix of a legacy protocol and a more modern protocol, such as DNP3. Keep in mind that these communication protocols may be at different stages of deployment and commissioning.

As described in AGA 12, Part 1, leased line, dial-up telephone, and radio communication are the primary targets for the retrofit solution. Although not addressed in AGA 12, Part 1, VSAT-based communication is becoming more popular; for example, some utilities have as many as 200 substations operating SCADA over satellite communications.

Multi-drop communication channels operating at 1200 bps to 19.2 kbps are common. The most common speed is 9600 bps with one poll every 5 seconds. Some channels may have 10 drops, but more commonly they are configured with 5 or 6 drops. In very rare instances, we found a radio channel operating at 1200 bps configured with 100 drops.

Not all field devices need to be protected. In general, the installation of SCMs on SCADA communication channels will be phased in, thus creating the need for the Master Station SCMs (or head-end SCMs in a daisy chain configuration as described in AGA 12, Part 1) to operate in a mixed mode.
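
The channel figures above lend themselves to a quick line-time estimate. The sketch below assumes 10 bits on the wire per byte (one start bit, eight data bits, one stop bit, a common asynchronous serial framing). The message sizes and the per-message overhead added by an SCM are hypothetical values chosen for illustration; they do not come from AGA 12, Part 1. Modem turnaround and inter-frame gaps are ignored.

```python
def transmit_seconds(num_bytes: int, bps: int, bits_per_byte: int = 10) -> float:
    """Time to clock num_bytes onto an asynchronous serial line."""
    return num_bytes * bits_per_byte / bps


def poll_cycle_seconds(drops: int, request_bytes: int, response_bytes: int,
                       bps: int, overhead_bytes: int = 0) -> float:
    """Line time needed to poll every drop once.

    overhead_bytes is a hypothetical number of extra bytes an SCM adds
    to each request and each response.
    """
    per_drop = (transmit_seconds(request_bytes + overhead_bytes, bps)
                + transmit_seconds(response_bytes + overhead_bytes, bps))
    return drops * per_drop


if __name__ == "__main__":
    # Assumed sizes: 8-byte request, 64-byte response, 6 drops at 9600 bps.
    plain = poll_cycle_seconds(6, 8, 64, 9600)
    wrapped = poll_cycle_seconds(6, 8, 64, 9600, overhead_bytes=32)
    print(f"unprotected: {plain:.2f} s, protected: {wrapped:.2f} s")
```

With these assumed sizes, six drops at 9600 bps need about 0.45 s of line time per poll cycle without the overhead and about 0.85 s with it, both well inside a 5 second poll. The same calculation can be repeated with measured message sizes to check slower or more heavily loaded channels.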

SCMs compliant with AGA 12, Part 1 need to be designed to operate over the most common protocols and in multi-drop and mixed mode configurations. In accordance with cyber security policy, the SCMS must provide the capability to manage the keying material needed to support all CM operational modes.

5. Because of the need to support mixed mode and phased deployment, the SCMS must provide the capability to remotely distribute keying material to CMs. The assumption is that CMs are installed but operating in a bypass mode until they are activated. Some consideration needs to be given to providing the SCMS with the capability to securely change a CM from normal operation to operating in a bypass mode. The consequence of such a capability may be an unacceptable security risk.

6. Distribution of keying material to field CMs may be accomplished using one or more of the following approaches:
- In-band communication channels,
- Modems (including wireless), if they are provided to support communication to the local CM management port,
- A site visit to load keying material via the local CM management port,
- Loading keying material prior to installation and commissioning of the CM.

SCMS management within the control center

Figure 1 shows connectivity between the SCMS Key Management Appliance and the multi-channel SCM rack, to the local management port of each SCM in the rack. It is reasonable to assume that this communication is implemented over an IP-based LAN.

7. For IP-based communication between the SCMS Key Management Appliance and the local management port of each SCM in the multi-channel SCM rack, the SCM local management port must have the capability to interface to an IP-based network.

8. For serial-based communication between the SCMS Key Management Appliance and each local management port on the SCMs in the multi-channel SCM rack, each SCM local management port must support one of two options:
- A dedicated communication channel between each SCM management port and the Key Management Appliance,
- Use of a port share or port switch connected to one serial communication channel to the Key Management Appliance.

SCMS management within the field site

Figure 1 shows multiple SCMs and MCMs in the remote field site. Although the local management port is only shown on the SCM, a local management port is also required on each MCM. Again, it is reasonable to assume that this communication is implemented over an IP-based LAN.

9. For IP-based communication between the SCMS Key Management Appliance and the local management port of each CM (SCM and MCM) within the remote field site, the CM local management port must have the capability to interface to an IP-based network.

10. For serial-based communication between the SCMS Key Management Appliance and the local management port on each CM within the remote field site, each CM local management port must support one or more of the following options:
- Provide the capability to use a dedicated communication channel between each CM management port and the Key Management Appliance,
- Provide the capability to use a port share or port switch connected to one serial communication channel to the Key Management Appliance,
- Provide the capability to use a local connection to each CM management port from an authorized computer and user.

SCMS management of authentication keys

As reported by Gellings, Samototyi & Howe in "The Future's Smart Delivery System," IEEE Power & Energy magazine, September/October 2004, p. 43, disgruntled employees are among the most widely perceived intrusion threats. The most widely perceived threats to power controls are information leakage, intercepting and altering control settings, authorization violation, integrity violation, and bypassing controls.
In response to this insider threat, Identity Management (IM) and Role Based Access Control (RBAC) managed by organization units unique to each utility operation are needed. Although the organizational structure for operations may differ widely from utility to utility, the example shown in Figure 2 is useful to identify roles and responsibilities that must be managed by the SCMS.

The basic idea portrayed in this example is the separation of roles and responsibilities between three organization units within the Power Delivery domain. Service Center is responsible for substation operations and maintenance, Operations is responsible for 24/7 power system operations, and Engineering and Planning is responsible for engineering and equipment performance.

It is important to note that the Dispatcher in Operations is responsible for, and has the authority to exercise, system control of the power system. Local control within a substation is a separate organizational function in which the on-site substation operator has responsibility and authority for equipment control as related to the maintenance of the equipment.

Engineering and Planning is a mixed breed. Engineering includes Protection Engineering, which has the responsibility and authority to change settings related to power system protection but no authority to exercise equipment or power system control. A parallel organization within Engineering and Planning is responsible for equipment performance. The field technicians have the authority to perform diagnosis but no authority to change settings or exercise local control; that is, they have a read-only privilege. If diagnosis indicates that repair or maintenance is required, the field technician prepares a report and sends it to the Service Center for appropriate action.
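
The separation of roles described above maps naturally onto a role-to-permission table. The following is a minimal sketch, not a description of any SCMS product: the role names and permission flags are hypothetical labels for the roles in the Figure 2 example, and granting read access to every role is an assumption (the text states it explicitly only for field technicians).

```python
from enum import Flag, auto


class Permission(Flag):
    NONE = 0
    READ = auto()             # diagnosis, read-only access
    CHANGE_SETTINGS = auto()  # e.g. protection settings
    LOCAL_CONTROL = auto()    # equipment control within a substation
    SYSTEM_CONTROL = auto()   # control of the power system


# Hypothetical role table derived from the roles in the Figure 2 example.
ROLE_PERMISSIONS = {
    "dispatcher": Permission.READ | Permission.SYSTEM_CONTROL,
    "substation_operator": Permission.READ | Permission.LOCAL_CONTROL,
    "protection_engineer": Permission.READ | Permission.CHANGE_SETTINGS,
    "field_technician": Permission.READ,
}


def is_allowed(role: str, needed: Permission) -> bool:
    """True only if the role holds every permission in `needed`."""
    granted = ROLE_PERMISSIONS.get(role, Permission.NONE)
    return (granted & needed) == needed
```

Unknown roles fall back to no permissions, so an identity that has not been assigned a role by its organizational unit is denied by default.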

Each of these organizational units may be supported by field engineers representing the vendors that supplied the equipment. These vendors may have similar but restricted authority to support the respective organizations. The SCMS authority given to vendors may vary. One approach is to allow vendors only to identify who will support a specific task; the utility organizational unit authority then issues the needed certificates and credentials to that individual. This minimizes the trust needed in the vendor's internal control processes but adds to the workload of the utility organization supported. Another approach is to empower the vendor as an organizational unit authority, which then allows the vendor to manage identity, authorization, and use privileges in accordance with prespecified conditions. This reduces the workload on the utility organization supported but requires more trust in the vendor's internal control processes.

The above example for vendors is also applicable to business partnerships that have prespecified contract relationships. In either case, a risk assessment is needed to determine the degree of trust to be placed in a third party and the oversight required to ensure that this trust is warranted. The SCMS must support all approaches to be compliant with the security policies and procedures of each utility.

Before discussing effective management of certificates of authorization and privileges, it is useful to understand the special case of dial-up access to the maintenance ports of field devices. Figure 3 describes an example of identity management, access rights, and authorization privileges afforded by blending the recommendations of AGA 12, Part 1, with the existing use of passwords.

The current practice is to load the IED vendor toolkit software on the field technician's laptop computer.
The technician dials the auto-answering modem and, when a connection is established, enters a password, and the IED (an RTU, for example) verifies the session password. When verified, the technician has level 1 (read), level 2 (write: change settings), or level 3 (factory settings) privileges. It is also common practice to permit the technician to change the password of an equal or lower level.

The relationship between the organizational units shown in Figure 2 and the access and authorization shown in Figure 3 is represented by the Group designations in Figure 3. For example, Groups 1 and 6 have access and authorization for the RTU only. They do not have access to other IEDs.

For the AGA 12 retrofit solution, changes to existing field device (IED or RTU) software are to be avoided. For the AGA 12 embedded solution, this may not be a problem. As shown in Figure 3, ANSI X9.69 compliant software is loaded on the laptop computer and some of these software components are loaded on the Authentication Key. As a minimum, the identity certificate should be on the authentication key to enforce the two-factor authentication required by AGA 12, Part 1. More sophisticated authentication keys will accept permission credentials and combiner.

A CM operating in the MCM mode is placed between the auto-answering modem and a port switch that connects to the maintenance ports of each RTU or IED in Figure 1. In this configuration, one MCM protects access to all field device maintenance ports. As a minimum, the MCM will issue a challenge to the X9.69 identity certificate to ensure that the user has access rights to the field device maintenance ports. This keying material must be managed by the SCMS. Credentials that contain the predefined permissions can also be used to enhance the control of user authorization to perform selected actions. However, in addition to MCM software, this may require changes to the field device software, depending on the level of control required.
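
The challenge issued by the MCM can be illustrated with a generic challenge-response exchange. The paper does not describe how the X9.69 identity certificate is challenged, so the sketch below substitutes a shared-secret HMAC-SHA-256 scheme purely for illustration; the function names and the idea of a pre-shared key held on the Authentication Key are assumptions.

```python
import hashlib
import hmac
import secrets


def new_challenge(num_bytes: int = 16) -> bytes:
    """MCM side: generate a fresh random challenge for each session."""
    return secrets.token_bytes(num_bytes)


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Laptop side: prove possession of the key without sending it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """MCM side: recompute the expected response and compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # hypothetical key provisioned by the SCMS
    challenge = new_challenge()
    print("access granted:", verify(key, challenge, respond(key, challenge)))
```

Because a new challenge is drawn for every dial-up session, a response captured on the telephone line cannot simply be replayed later. Only after this check succeeds would the session reach the port switch and the existing IED password prompt.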

Effective authorization certificate and privilege management

Because the SCMS manages the certificates of authorization and privileges across this organizational structure, it must provide the capabilities needed to support the functions described below.

11. Because of the organizational structure and operational philosophy described by Figure 2, it seems reasonable to establish a domain of responsibility for Power System Operation. A parallel or subordinate domain could, if needed, be established for each support vendor. Each domain is empowered to execute and manage its keying material as specified in ANSI X9.69; this is viewed as distributed execution within the centralized control of the Enterprise Authority. As stated before, centralized control ensures that proper extensions of the enterprise security policies and procedures are enforced through all domains and organizations, including support vendors and business partnerships.

12. Within each domain, and in compliance with ANSI X9.69, organizational units are responsible for assigning rights and privileges to all entities (people and devices) for which they are responsible. This requires that each domain and organizational unit replicate the SCMS, or selected components of the SCMS that is described in Figure 2, and include those functions needed for their assigned responsibilities.

The challenge of issuing, changing, or revoking certificates and privileges

Any organization will experience normal turnover of personnel, termination of personnel, and changing roles and responsibilities as personnel move or add assignments that cross organizational units and, many times, domains. For example, it is very common that personnel experienced in power delivery operations leave the company and go to work for a supplier of equipment used by that company. Therefore, the SCMS must provide the capability to issue, change, or revoke certificates of authorization and privileges in a timely response to organizational and domain changes.

13. Identity management, authorization, and privileges need to be assignable for a selected time period and managed by the organizational unit authority or authorities that have direct supervision over the tasks performed by the individual.
This will probably require agreements of cooperation between domain authorities and between organizational unit authorities as described in ANSI X9.69.

14. Individual certificates need to be stored on an Authentication Key that is assigned to an individual in accordance with company policies and procedures.

15. Secure distribution or revocation of keying material over modems, the Wide Area Network (WAN), and the Internet, as shown in Figure 1, needs to be supported by the SCMS.

Although these requirements seem logical, a reliable, timely, and cost-effective implementation is not a simple matter. Revocation lists are commonly used in IT systems, but implementing these lists creates significant communication, CPU, and memory requirements in cryptographic modules. Several alternative mechanisms are under investigation at this time.

- Revocation lists can be managed easily by the Key Management Appliance and Administrative Workstation, but this would require that each CM (SCM and MCM) exchange messages with the SCMS to verify that the sending entity has valid access authorization and privileges.
- Keying material can be issued with short user-settable timeouts so that certificates of authorization and privileges are automatically revoked when their time expires.

Field test and evaluation are needed to determine the best techniques.

SCMS alarm processing

AGA 12, Part 1, requires that all CMs record all events and output an alarm related to a CM anomalous event. How this is implemented is not specified in AGA 12, Part 1. The SCMS needs to provide the capability to receive and process, in a timely manner, all alarms related to a CM anomalous event.

16. CM alarms detected in a field CM need to be recorded by the field device and reported as alarms through normal SCADA communication channels to the SCADA master.
17. CM alarms detected within the control center need to be reported as alarms to the SCADA Operator.

18. CM alarms received by the SCADA master need to be communicated to the Key Management Appliance and Administrative Workstation for review and processing by the SCMS operator.

19. The SCMS operator at the SCMS Administrative Workstation needs to have the capability to process alarms related to an anomalous CM event, and to issue the necessary corrective action in a timely manner in accordance with predefined policies and procedures.

Alarm processing is another subject that needs further research in the following areas:

- Timely and effective methods to deliver alarms to the appropriate administrative workstation for review and processing.
- Intrusion Detection System (IDS) functions need to be integrated into the SCMS in either an embedded or associated subsystem. Furthermore, adaptive-learning algorithms need to be developed to support the IDS requirements.

Trusted 3rd party SCMS provider

All SCMS requirements may be assigned to a trusted 3rd party SCMS provider under the appropriate terms and conditions negotiated between the utility and the provider. This will be the subject of a future paper.

Conclusions and suggested research

The top 6 findings from this study are:

1. A comprehensive solution is needed to avoid building stovepipe solutions, each unique to a specific organizational entity or fiefdom within the end-user's enterprise.

2. Extensive field testing is needed to evaluate the best approach to manage keying materials. A cost-effective method to manage the keying materials needed for SCADA communication security and secure access to the maintenance ports is the greatest challenge at this time.

3. More research is needed to develop effective alarm processing and timely corrective action mechanisms and procedures.

4. Because some legacy devices do not have the capability to accept any modification, the retrofit solution must be designed to operate without changes to the SCADA Master, Front End Processor, Remote Terminal Unit, or Intelligent Electronic Device in the field.

5. The SCADA cryptographic module must have the capability to operate in a mixed mode on a multi-drop communication channel because some field devices may be protected and some may not.

6. Significant improvements to provide secure access to the maintenance ports of field devices can now be achieved at low cost, and can easily be justified with a simple business case that compares the cost of securing remote communications with the cost of shutting down the dial-up and sending crews to the field site to perform the same functions locally.

References

1. AGA Report Number 12, Part 1, Cryptographic Protection of SCADA Communications: General Requirements. The latest version of AGA 12, Part 1 is available from holsteindk@adelphia.net.

2. ANSI X9.69-1994, Framework for Key Management Extensions.

3. Gellings, Samototyi & Howe, "The Future's Smart Delivery System," IEEE Power & Energy magazine, September/October 2004, p. 43.

Figure 1. Recommended retrofit architecture

Figure 2. Example organizational structure for utility operations

Figure 3. Example of access rights and authorization privileges for maintenance