Kent Milligan and Beth Hagemeister ISV Business Strategy and Enablement March 2014 Copyright IBM Corporation, 2007. All Rights Reserved. All trademarks or registered trademarks mentioned herein are the property of their respective holders
Protecting i5/OS data with encryption

Table of contents

Abstract
Introduction
Cryptography overview
Hashing
Symmetric encryption
Asymmetric encryption
Encryption keys
IBM i encryption services
IBM i encryption interfaces
CIPHER MI
CCA APIs
Cryptographic Services APIs
SQL encrypt and decrypt functions
Java-based interfaces
ASP-level encryption
Key management
Cryptographic services
Functional overview
API parameters
A simple Encrypt Data API example
Algorithm and key contexts
Algorithm contexts
i5/OS V5R4 enhancements
Cryptographic services and key management
Key generation
i5/OS V5R3 considerations
Key storage
i5/OS V5R4 advancements in key management
Cryptographic-services key store
Master keys
Key-store files
Master-key encrypted keys
Changing master-key values
Master key variants
Migrating keys to another system
Backing up keys
Key distribution
SQL encryption and decryption
Preparing for encryption
The encryption recipe
Encryption password management and distribution
Decryption: Unlocking the data
Native interfaces and DB2 encryption
Integrated encryption and Instead Of triggers
Query optimization and performance considerations
Encryption implementation considerations
Column encryption considerations
Cryptographic services tips
Remote clients and encryption keys
Summary
Appendix A: Resources
Encryption programming examples with RPG
Further reading
Appendix B: About the authors
Appendix C: Acknowledgements
Trademarks and special notices
Abstract

This white paper focuses on the encryption of data at rest (primarily, the data stored in IBM DB2 for i tables and physical files). It also discusses the recommended practices for encryption-key management.

Introduction

Reports of personal data being stolen or compromised appear on a regular basis, so it is no surprise that consumers are demanding more in terms of data protection and privacy. As a result, there is strong demand for encryption of data on the IBM System i platform, as there is on every other computing platform.

Before pursuing new methods of data protection with encryption, it is important not to overlook the first step in securing your IBM DB2 data: object-level security. Utilization of the robust object-level security services that are provided by the IBM i (formerly known as IBM i5/OS and IBM OS/400) operating system is critical in ensuring that your DB2 data is protected, regardless of which interface is used to access the data. With users requiring more data-access interfaces, System i shops cannot rely on menu-based security to control access to sensitive data. Object-level security prevents unauthorized users from accessing or changing sensitive company financial data (for example, earnings data).

Data encryption is a security method that provides another layer of protection around DB2 columns that contain sensitive data. This extra level of security is necessary because object-level security cannot prevent authorized users (such as help-desk personnel) from viewing sensitive data. Nor can it prevent a hacker from reading sensitive data by using credentials (ID and password) stolen from an authorized user. If sensitive data such as a credit-card number is stored in an encrypted format, then by default a binary string of encrypted data is always returned to all users. To view the actual credit-card number, the user must have authorization to access the DB2 object as well as have access to the encryption key and the decryption algorithm.

Cryptography overview

Encryption and decryption algorithms are collectively called cryptographic systems. Cryptography is the science of keeping data secure. Cryptography itself can be defined as the process of scrambling plain text (or clear text) into cipher text (in other words, encryption), then back into plain text (that is, decryption). Figure 1 contains a graphical representation of a simple cryptographic system.

Figure 1. A simple cryptographic system

Before reviewing the cryptography support available with IBM i, it is best to cover some of the basic operations that are associated with cryptography.
Hashing

Hashing is an important cryptographic operation that uses a mathematical algorithm to produce a hash (or message digest), as depicted in Figure 2.

Figure 2. Hashing uses a mathematical algorithm

A cryptographic hash is called a one-way hash. For all practical purposes, it is not possible to derive the original clear text from the hash value because of the mathematical algorithm that is employed. A second important property of a cryptographic hash is that it is collision-resistant. That is, it is next to impossible to find another set of data that produces the same hash value. These two properties make cryptographic hashes useful for authentication purposes.

Some of the available hashing algorithms include Message Digest 5 (MD5), Secure Hash Algorithm (SHA)-1, SHA-256, SHA-384 and SHA-512. Protocols, such as Secure Sockets Layer (SSL) and IP security (IPSec), widely use a variant of SHA. Weaknesses have been found in the MD5 and SHA-1 algorithms; therefore, you should not use them except for compatibility purposes.

From a data-security perspective, hashing is primarily useful in protecting the integrity of a data value. For instance, you can keep a copy of a digest to compare it with a newly generated digest at a later date. If the digests are identical, the data has not been altered. Hash algorithms are also useful for generating digital signatures.

Cryptographic hash algorithms are often used in another operation called a keyed-hash message authentication code (HMAC). An HMAC operation uses a secret key and a hash algorithm multiple times to produce a value (the HMAC) that later can be used to ensure that the data has not been modified. Typically, an HMAC is appended to the end of a transmitted message. The receiver of the message uses the same HMAC key and algorithm as the sender to reproduce the HMAC. If the receiver's HMAC matches the HMAC that was sent with the message, the data has not been altered.
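The HMAC exchange just described can be sketched in a few lines. This is only an illustration in Python's standard library, not the IBM i interface; the key, the message and the helper names make_hmac and verify_hmac are assumptions made up for this example, and SHA-256 is chosen simply because it is one of the algorithms that is not flagged as weak above.

```python
import hashlib
import hmac


def make_hmac(key: bytes, message: bytes) -> bytes:
    """Return the HMAC-SHA-256 of message under the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_hmac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Recompute the HMAC on the receiving side and compare it with the one sent."""
    expected = make_hmac(key, message)
    # compare_digest takes the same time whether or not the values match,
    # so the comparison does not leak how much of the MAC was correct.
    return hmac.compare_digest(expected, received_mac)


# Hypothetical shared key and message, for illustration only.
shared_key = b"example-shared-secret"
message = b"PAY 100.00 TO ACCOUNT 112233"

# Sender: append the HMAC to the end of the transmitted message.
mac = make_hmac(shared_key, message)
transmitted = message + mac

# Receiver: split off the 32-byte HMAC and verify it with the same key.
received_message, received_mac = transmitted[:-32], transmitted[-32:]
print(verify_hmac(shared_key, received_message, received_mac))
```

If either the message or the key differs on the receiving side, verify_hmac returns False, which is the signal that the data was altered or came from a party that does not hold the key.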
A hashing algorithm is viewed as one-way encryption because the original value cannot be derived from the hash value. In contrast, encryption algorithms are used to make data unreadable and to protect the confidentiality of data. Only those users with access to the encryption algorithm and the associated key can access the original data.
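The digest comparison mentioned in this section (keep a digest, then compare it with a newly generated one) can be sketched as follows. This is a minimal Python illustration using the standard hashlib module, not an IBM i API; the record contents and the helper names sha256_hex and is_unaltered are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, stored_digest: str) -> bool:
    """Regenerate the digest and compare it with the copy kept earlier."""
    return sha256_hex(data) == stored_digest


# Hypothetical record; the digest is stored when the record is written.
record = b"112233|BOB SANDERS"
stored_digest = sha256_hex(record)

# Later: the same bytes reproduce the digest, changed bytes do not.
print(is_unaltered(record, stored_digest))
print(is_unaltered(b"112234|BOB SANDERS", stored_digest))
```

Note that the digest only detects a change. It cannot be used to recover the record, which is the one-way property described above.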
Symmetric encryption

Symmetric cryptography uses the same key for both the encryption and decryption of the plain text (see Figure 3). This is the most popular method for encrypting data.

Figure 3. Most popular encryption data method

The most commonly used symmetric algorithms include: Data Encryption Standard (DES), Triple DES (3DES), Advanced Encryption Standard (AES), Rivest's Cipher (RC)2 and RC4. However, DES is no longer considered secure; therefore, you should only use it for compatibility purposes.

Asymmetric encryption

With asymmetric cryptography, a pair of keys is used to encrypt and decrypt data. Data that is encrypted with the first key can only be decrypted with the second key, as Figure 4 demonstrates.

Figure 4. Asymmetric encryption

The two keys are mathematically related. Asymmetric encryption is more resource-intensive than symmetric encryption; that is why this method is not used as often for data encryption. Asymmetric encryption is frequently used in authentication, such as for the handshake that is performed by virtual private networks (VPNs) and SSL, and is also used for key distribution.

Encryption keys

As you saw in the earlier examples, encryption requires the use of encryption keys. Management and protection of these encryption keys is obviously a very important factor in keeping encrypted data safe. You probably have experienced this first-hand with the scrutiny given to creating nontrivial passwords and regularly changing them to secure access to e-mail and other applications on your office systems. Many of the same policies need to be applied to encryption keys.

In addition, a decision is required on how to store and retrieve the encryption keys securely for those applications that need to encrypt and decrypt data. When the encryption keys are stored on the system, you must also implement a backup plan to allow for recovery from a server failure or for the migration of the applications to another system. Later, this white paper discusses recommended practices for key management.
IBM i encryption services

Both hardware- and software-based encryption capabilities are available on IBM i. As of i5/OS V5R4, no solutions automatically encrypt data at rest and automatically decrypt the data as it is retrieved from DB2 tables. This is primarily because this encryption approach only provides minimal data protection. Yes, the data is protected if someone steals the physical disk drive from the server. However, automatic decryption of encrypted data allows any authorized user to see the sensitive data.

Here is a summary of the software-based encryption services that are available on i5/OS from IBM:
- Cryptographic Services APIs
- Common Cryptographic Architecture (CCA) APIs
- Cryptographic Support for AS/400 (Cryptographic APIs) (IBM will withdraw support after i5/OS V5R4)
- DB2 built-in SQL encryption and decryption functions
- DB2 Field Procedure interface
- CIPHER Machine Instruction (MI)
- IBM Java Cryptography Extension Providers
  o IBMJCE
  o IBMJCEFIPS
  o IBMJCECCAI5OS
- Auxiliary Storage Pool (ASP) Encryption

A couple of cryptographic hardware options are also available. These hardware features can offload the main processors by performing cryptographic functions and can also increase security by providing a secure key store. The following cryptographic hardware is supported by the IBM i platform:
- PCI-X Cryptographic Coprocessor 4764
- Cryptographic Accelerator 2058
- IBM Cryptographic Coprocessor 4758

Note: The 2058 and 4758 devices can no longer be purchased from IBM, but both devices are still supported by IBM.

IBM i encryption interfaces

The following sections contain more information on each of the IBM i encryption interfaces.

CIPHER MI

The IBM i cryptographic capabilities date back to the IBM System/38 platform in the late 1970s. The Cryptographic Support licensed program (LP) (57xx-CR1) included easy-to-use APIs and command language (CL) that provided DES encrypt, decrypt and message-authentication code (MAC) functions.
The CIPHER machine interface (MI) was the encryption interface for providing support for DES when IBM started shipping the IBM AS/400 platform in 1988. The repertoire of encryption interfaces has been expanded with the following cryptographic support:
- Encryption algorithms: DES, 3DES, RC4-compatible, AES
- Hashing algorithms: SHA-1, MD5
- UNIX crypt(3) function
- Pseudo-random number generation (PRNG) algorithms

Users of this MI are responsible for their own key management. Because of its longevity, CIPHER MI is the only encryption interface that is available in the base OS/400 and i5/OS operating-system releases prior to i5/OS V5R4. Using CIPHER MI is not recommended for new applications because this support might not be enhanced with new algorithms or performance improvements in future IBM i releases.

You can find details on the CIPHER instruction in Machine Interface Instructions under APIs by category in the IBM i Information Center (see http://publib.boulder.ibm.com/eserver/ibmi.html).

CCA APIs

Over the years, industry standards have changed dramatically. To meet higher security demands, the IBM i platform added support for hardware encryption, first with the IBM 2620 Cryptographic Coprocessor (which is no longer supported), then with the IBM 4758 Cryptographic Coprocessor, and most recently with the IBM 4764 Cryptographic Coprocessor. All of these offerings support a much richer and more secure API set than 57xx-CR1. This API set is known as the Common Cryptographic Architecture (CCA) APIs.

The CCA API is designed so that a call can be issued from essentially any high-level programming language. The call, or request, is forwarded to the cryptographic-services access layer and receives a synchronous response; that is, your application program loses control until the access layer returns a response after processing your request.

CCA APIs are used together with the IBM 4758 or 4764 Cryptographic Coprocessor. The CCA APIs require the installation of i5/OS 5722-SS1 option 35 (CCA Cryptographic Service Provider [CCA CSP]). CCA CSP provides API support that is equivalent to the CCA Support Program, which is available for the IBM System z, IBM System p and IBM System x platforms.

The CCA APIs provide a variety of cryptographic processes and data-security techniques.
Applications using this API set can perform the following functions:
- Encrypt and decrypt information, typically using the 3DES algorithm in the cipher-block-chaining (CBC) mode to enable data confidentiality
- Hash data to obtain a digest, or process the data to obtain a MAC
- Create and validate digital signatures to demonstrate data integrity and form the basis for nonrepudiation
- Generate, encrypt, translate and verify finance-industry PINs and transaction-validation codes with a comprehensive set of finance-industry-specific services
- Manage the various DES and Rivest-Shamir-Adleman algorithm (RSA) keys that are necessary to perform the operations just listed
- Control the initialization and operation of CCA

The CCA Basic Services Reference and Guide in the IBM i Information Center provides details on this API set.
Cryptographic Services APIs

In OS/400 V5R2, a new infrastructure was implemented within the IBM Licensed Internal Code (LIC) that provided a common interface for LIC components to the various cryptographic-service providers (that is, the various pieces of hardware and software that implement cryptographic algorithms). In i5/OS V5R3, the Cryptographic Services APIs were implemented on top of that infrastructure. These APIs provide applications with access to Cryptographic Services within the LIC or on the 2058 Cryptographic Accelerator. Only some of the Cryptographic Services support the 2058 device.

To use the Cryptographic Services APIs on releases prior to V5R4, you must first install the Cryptographic Access Provider product (57xx-AC3). With the introduction of i5/OS V5R4, the full cryptographic support is enabled by default. Older versions of the Cryptographic Access Provider product (57xx-AC1 and 57xx-AC2) became obsolete and were dropped after government-export restrictions on cryptographic function were reduced.

The IBM i Cryptographic Services APIs help you to extend your application in the following ways:
- Privacy of data
- Integrity of data
- Authentication of communicating parties
- Nonrepudiation of messages

The Cryptographic Services API set was designed to be easier to use than the CCA APIs in high-level language programs. The APIs themselves can be broken into six categories, as follows:
- Encryption and Decryption APIs
- Authentication APIs
- Key Generation APIs
- PRNG APIs
- Cryptographic Context APIs
- Key Management APIs

A detailed discussion and examples of using these APIs are available later in this white paper. In addition, the IBM i Information Center contains both C and RPG programming examples.
The Cryptographic Services APIs provide the following cryptographic capabilities:
- Encryption algorithms: DES, 3DES, RC4-compatible, RC2, RSA and AES
- Hashing algorithms: MD5, SHA-1, SHA-256, SHA-384 and SHA-512
- Key exchange algorithms: Diffie-Hellman
- Pseudo-random number and key-generation algorithms (PRNG)
SQL encrypt and decrypt functions

DB2 for i includes built-in SQL functions to allow for the encryption and decryption of data on SQL requests. The SQL encryption and decryption functions actually use the Cryptographic Services APIs to perform the operations. These SQL functions provide a much simpler interface for encrypting and decrypting data, but they lack the overall capabilities of the Cryptographic Services APIs.

Figure 5 is a simplified example of the SQL-based cryptographic support in action. The encryption key is specified with the SET ENCRYPTION PASSWORD SQL statement:

    SET ENCRYPTION PASSWORD = <character variable>
    INSERT INTO emp VALUES(ENCRYPT_AES('112233'), 'BOB SANDERS')
    SELECT DECRYPT_CHAR(id), name FROM emp

Figure 5. An example of SQL cryptographic support

The ENCRYPT_AES and DECRYPT functions then use this key to perform the respective encryption and decryption operations.

Here is a list of the built-in SQL encryption functions provided by DB2:
- ENCRYPT_RC2: Supports the RC2 block-cipher encryption algorithm; it is the only algorithm with the encryption key derived by hashing the specified password with the MD5 hashing algorithm
- ENCRYPT_TDES: Supports the TDES encryption algorithm (available since the i5/OS V5R4 release)
- ENCRYPT_AES: Supports the AES encryption algorithm (available since the IBM i 6.1 release)

Although these functions can only be run on SQL-based interfaces, some workarounds are available for non-SQL (that is, native I/O) interfaces. These circumventions, along with details on the SQL encryption interfaces, are covered in a later section of this white paper.
DB2 Field Procedure interface for encryption

The DB2 for i 7.1 release enables the implementation of transparent column-level encryption solutions with a field-level exit routine, known as a Field Procedure (FieldProc). The DB2 FieldProc feature allows developers to register an ILE program (see Figure NEW1) at the field level that DB2 automatically calls each time that a row (record) is written or read. This user-supplied program can be written to include logic to encrypt the data for a write operation and automatically decrypt it for read operations. This is exciting news for those IBM i customers with legacy RPG and COBOL applications that are looking to encrypt sensitive data at rest without any application changes. All that is required is registering the FieldProc program object (mylib/ccpgm in this example) for those columns containing sensitive data.

Any type of encoding can be performed by a FieldProc program, but it is expected that encryption algorithms such as AES will be the most common form of encoding. The third-party software providers Linoma Software, Townsend Security Solutions and Enforcive have updated their encryption software to support the DB2 FieldProc interface. This is a good option for those clients that do not want to invest in writing their own FieldProc programs.

    CREATE TABLE ccstore(
        custid  CHAR(5),
        cardnum CHAR(16) FIELDPROC mylib/ccpgm,
        cardexp DATE )

Figure NEW1. An example of DB2 Field Procedure registration

The requirements and specifications for Field Procedure programmers are documented in the DB2 for i SQL Programmer's Guide (http://ibm.com/systems/i/db2/books.html). Additional details on the DB2 FieldProc support are covered in a later section of this white paper.

Java-based interfaces

The IBM Java-based cryptography interfaces, IBM JCE and IBM JCEFIPS, are supported by IBM i platforms along with the other IBM systems. AES, DES and TDES are some of the encryption methods supported by the IBM Java Cryptography Extension Provider.
The IBM JCE also includes support for hashing algorithms and a random-number generator. The IBM JCE is a standard extension to the Java 2 Software Development Kit (J2SDK), Standard Edition. The JCE implementation on IBM i is compatible with the Sun Microsystems implementation. The IBM i Information Center contains documentation regarding the unique aspects of the IBM i implementation.

The IBMJCE and IBMJCEFIPS do not use any of the IBM i cryptography interfaces. All of the cryptography support is completely implemented in Java.

Starting with the IBM i 6.1 release, the Java-based cryptography interfaces were extended with the IBMJCECCAI5OS provider. While the IBMJCECCAI5OS interface is proprietary, this interface enables Java applications to leverage hardware cryptography by using the IBM Common Cryptographic Architecture (CCA) interfaces. The IBMJCECCAI5OS interface enables JCE processing such as RSA algorithms to benefit from the performance and security advantages offered by the 4764 Cryptographic Coprocessor. Consult the IBM i Information Center article (http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzaha/rzahahwcrypt.htm) for details on the IBMJCECCAI5OS interface.
ASP-level encryption

To address the requirements of organizations that need their data encrypted at rest (when stored on the disk) without programming, IBM i 6.1 adds support for ASP encryption. This also is commonly called disk encryption, even though it is a software-based encryption solution. This support provides the ability to automatically encrypt data and store it in basic disk pools and independent disk pools (also commonly referred to as independent auxiliary storage pools or IASPs). Data is automatically decrypted when the data is read from disk by the operating system or an application.

Disk encryption requires the purchase and installation of an IBM i licensed program product (Option 45 - Encrypted ASP Enablement). When installed, this product provides the option to enable encryption when a new disk pool or independent disk pool is created. Data that is stored in these pools is automatically encrypted using the AES encryption algorithm before it is actually written to the disk.

This particular implementation has several advantages, including the following:
- Protects data transmission to and from the disk drive (important in a SAN environment)
- Protects data in the case of theft of the disk drive
- Protects data in the case of return or resale of a disk drive (reduces the need to sanitize the disk drive)
- Protects data transmission in the cross-site mirroring environment (only when the data being mirrored is on an encrypted independent disk pool)

When deciding if disk encryption is the appropriate solution in your environment, consider the following points:
- With the support in the IBM i 6.1 release, disk encryption can only be enabled when creating new disk pools. Existing disk pools and independent disk pools cannot take advantage of this feature because there is no mechanism in place to start encryption after the disk pool is configured. Conversely, after an encrypted disk pool is created, there is no way of disabling this behavior. The IBM i 7.1 release enables encryption to be turned on and off for existing disk pools. In addition, the IBM i 7.1 release allows the encryption key for a disk pool to be changed.
- If the independent disk pool is switchable and is configured with encryption, the master encryption key for each node must be manually set.
- Data that overflows from a basic (traditional) disk pool into the System ASP is not encrypted. Note: Independent disk pools do not overflow, only traditional ones do.

Consult the IBM i Information Center for more details on how to implement and configure ASP encryption.
Key management

The safety of encrypted data depends on the security of your keys. This is no different than the physical security for your home or business. Even though the door locks might be the most sophisticated and expensive that money can buy, if your keys are easily accessible, the locks are useless.

Here is a list of the questions to address when protecting and managing encryption keys:
- Where is the encryption key stored?
- How is the encryption key generated?
- How is the encryption key protected?
- How is the encryption key retrieved and used by applications?
- How often does the encryption key change?
- Who manages encryption keys?
- What is the backup strategy for encryption keys?

Ideally, the keys should be managed and used by different personnel within your company. There is no reason for the application programmer to have direct access to the keys; instead, an interface can be provided to allow only programmatic retrieval of the key value. When identifying the people who should be responsible for key management, it is also a good time to review the number of user profiles that have *ALLOBJ (all object) authority and to restrict these users by putting additional controls in place, such as requiring the use of programs to decrypt sensitive data or keys. These programs can increase security by explicitly excluding *ALLOBJ authority.

How often you change cryptographic keys is another important consideration in your key-management strategy. However, in some cases, decrypting and encrypting the sensitive data with a new key might put the sensitive data more at risk than allowing the data to remain encrypted with an old encryption key. Thus, the frequency of changing the cryptographic keys will probably be different in each environment.

As mentioned, a secure design should only allow programmers to access the encryption key through a programmatic interface. Given that fact, from where does this programming interface retrieve the encryption key?
Some regulations or companies require that the keys be stored on a hardware device. The IBM Cryptographic hardware options (4764 and 4758, which were discussed earlier) support this capability.

In the next couple of sections, you will see that writing code to use IBM i interfaces for encrypting and decrypting data is a simple matter of programming. The real challenge lies in designing a process and application architecture that selectively distribute the encryption keys to data consumers (users and programs) that are authorized to view the unencrypted version of the data. Application developers and security administrators must work together to design the best process for your application and business.

One alternative discussed earlier is having a single application that controls the decryption of the data. You can write this program to centralize the distribution of encryption-key passwords. A single program makes it easier to control which users are given the privilege to view the unencrypted data. Each time that data consumers need access to decrypted data, they would all use the same program. As that program shares the encryption password, it can also keep a log of the user who requested access and the timestamp of that activity for audit purposes. To limit the users who have authority to the centralized decryption program, it is wise to use program-adopted authority for the invoking programs and to exclude *ALLOBJ authority.

If hardware storage of the encryption keys is not required, then what other options exist on IBM i? There are solutions from other software companies, if you do not have the time or staff to implement your own solution. If you write your own solution, then hard-coding the encryption password in the source code of a program or storing it in a data area is not a safe approach. A user space is harder to access, but the encryption password is still being stored in the clear.

One IBM i object that provides more protection is a validation-list object. The list entry for a validation-list object consists of three parts: an ID, a data value that is encrypted, and attributes for that entry. The fact that IBM i encrypts the data in each list entry provides the additional protection. The person responsible for encryption keys uses the Add Validation List Entry or Change Validation List Entry operating-system APIs to populate the validation-list object with encryption keys. Most likely, the entry ID is assigned a label that identifies that encryption key for a particular application or type of data. The application program passes in the appropriate label (entry ID) to retrieve the encryption key for encryption or decryption operations with the Find Validation List Entry API. The IBM i Information Center provides details on the programming APIs that are available for validation-list objects. If you use a validation-list object, ensure that *PUBLIC authority to the list object is revoked.

As shown with validation lists, data security usually improves with encryption of the data key itself. The Cryptographic Services technology enables this design approach with support for key-encrypting keys (KEKs), where one KEK is often used to protect a group of data-encryption keys.
For example, one KEK encrypts several data-encryption keys that are established for protecting credit-card data; a second KEK is used to safeguard data-encryption keys that are associated with sensitive customer attributes (such as a U.S. Social Security number [SSN]). The IBM i Cryptographic Services provide yet another layer of security by encrypting the KEKs with a master key. These layers of protection are portrayed in Figure 6.

Figure 6. Master key layers of protection
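The centralized key-distribution program described earlier in this section (one program that hands out keys by label, restricts who may ask, and logs every request with a timestamp) can be sketched as follows. This is a hedged Python model of the design idea only: the KeyBroker class, its fields and the labels are hypothetical, the keys sit in an in-memory dictionary purely for illustration, and a real IBM i implementation would keep them in a validation list or key store and rely on object authority rather than a Python set.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class KeyBroker:
    """Hypothetical central program that releases encryption keys by label."""
    keys: dict            # label -> key bytes (in the clear here, for illustration only)
    authorized: dict      # label -> set of user IDs allowed to retrieve that key
    audit_log: list = field(default_factory=list)

    def get_key(self, user: str, label: str) -> bytes:
        allowed = user in self.authorized.get(label, set())
        # Record every request, granted or denied, with a UTC timestamp.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, user, label, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not retrieve key {label}")
        return self.keys[label]


# Hypothetical labels and users.
broker = KeyBroker(
    keys={"CREDIT_CARD": b"\x01" * 32},
    authorized={"CREDIT_CARD": {"BILLINGAPP"}},
)

key = broker.get_key("BILLINGAPP", "CREDIT_CARD")   # granted and logged
try:
    broker.get_key("HELPDESK", "CREDIT_CARD")       # denied and logged
except PermissionError as err:
    print(err)

for entry in broker.audit_log:
    print(entry)
```

Because every caller goes through get_key, the audit log is complete by construction, which is the main reason for funneling key access through a single program.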
In i5/OS V5R4, IBM introduced a set of new key-management APIs. These APIs allow you to create up to eight master keys and store them securely in the LIC. When a system is restarted (IPL), the master keys are available and all applications are operable without manual intervention. The next section of this white paper covers this key-management support in detail.

Wherever you decide to store your encryption keys, you need to make sure that there is a backup strategy for the keys. If you have a system failure and your recovery plans only cover your business data and not the encryption keys, then any encrypted data is inaccessible without the keys that were used to encrypt it. Another critical success factor in your recovery plan is ensuring that the encryption keys are stored on different media than the encrypted data. If they are contained on the same media and that media is stolen or lost, then it is going to be much easier for your encrypted data to be compromised.

Cryptographic services

The Cryptographic Services APIs provide IBM i application programmers with a powerful cryptographic tool set that they can use to help protect data privacy, ensure data integrity and authenticate communicating parties. The Cryptographic Services APIs are part of the base operating system and perform cryptographic functions within the IBM i LIC or, optionally, on a 2058 Cryptographic Accelerator.
To help you get started in the use of Cryptographic Services APIs, this section of the white paper:
- Provides a functional overview
- Explains the Encrypt Data API parameters
- Presents a simple encryption-example program
- Discusses the use of algorithm contexts for extending operations over multiple calls
- Reviews the key contexts for improving performance and protecting keys
- Looks at key management APIs to help in the handling of cryptographic keys

Functional overview

As mentioned, the Cryptographic Services APIs include six categories of APIs:
- Encryption and decryption APIs include encrypt, decrypt and translate operations, and are used for data confidentiality.
- Authentication APIs help you to ensure data integrity and authenticate the sender. They include hash, HMAC, MAC and signature operations.
- Key generation APIs generate both symmetric and asymmetric keys. They also include APIs for performing Diffie-Hellman key exchange.
- PRNG APIs generate cryptographically secure, pseudo-random values.
- Cryptographic-context APIs temporarily store key and algorithm information during cryptographic operations.
- Key management APIs help you store and handle cryptographic keys (introduced in i5/OS V5R4).
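To make the role of the PRNG and key generation categories concrete, the following sketch produces a random symmetric key from a cryptographically secure random source. It uses Python's standard secrets module as a stand-in and does not call the IBM i APIs; the generate_key helper and its 256-bit default are assumptions chosen for illustration.

```python
import secrets


def generate_key(bits: int = 256) -> bytes:
    """Return a random key of the requested size from the OS secure random source."""
    if bits <= 0 or bits % 8 != 0:
        raise ValueError("key size must be a positive multiple of 8 bits")
    return secrets.token_bytes(bits // 8)


# Hypothetical usage: a 256-bit key (for example, for AES) and a 128-bit key.
aes_key = generate_key()
short_key = generate_key(128)
print(len(aes_key), len(short_key))
```

The point of using a cryptographically secure generator is that the key cannot be predicted from earlier output, which a general-purpose random-number routine does not guarantee.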
Figure 7 shows a list of the Cryptographic Services APIs by name and associated category.

Figure 7. Cryptographic services APIs
Figure 8 displays the algorithms and key lengths included with the Cryptographic Services APIs. The IBM i operating system contains all of these algorithms, and the 2058 Cryptographic Accelerator includes a subset. (For a list of algorithms available on the 2058 Cryptographic Accelerator, see the article i5/os and 2058 Cryptographic Function Comparison, found at http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=/apis/qc3compare.htm.)

Figure 8. Algorithms and key lengths

API parameters

At first glance, the Cryptographic Services APIs might look complex. They are designed to accommodate both simple operations, such as encrypting a single block of data with a clear key, and more-involved operations, such as encrypting data in multiple blocks during multiple calls with an encrypted key on a specific piece of hardware. However, the APIs are not as complicated as they appear. To see the Encrypt Data API, look at the C function prototype in Figure 9. (Note: The RPG prototype definitions for API formats are available in the QRPGLESRC [QC3CCI] file member in library QSYSINC.)

    void Qc3EncryptData (
        char * volatile,    /* Clear data                   */
        int  * volatile,    /* Length of clear data         */
        char * volatile,    /* Clear data format name       */
        char * volatile,    /* Algorithm description        */
        char * volatile,    /* Algorithm desc format name   */
        char * volatile,    /* Key description              */
        char * volatile,    /* Key desc format name         */
        char * volatile,    /* Crypto Service Provider      */
        char * volatile,    /* Crypto Device Name           */
        char * volatile,    /* Encrypted data               */
        int  * volatile,    /* Len of encrypted data area   */
        int  * volatile,    /* Len of enc data returned     */
        void * volatile );  /* Error Code                   */

Figure 9. Encrypt Data C function prototype

It is possible to divide the parameters in this prototype into six groups:
Parameters that describe the input data: There are three parameters of this type, as follows:
1. Clear data                   Input   Char(*)
2. Length of clear data         Input   Binary(4)
3. Clear data format name       Input   Char(8)

You specify the data to be encrypted in parameter 1 (clear data) in one of two ways, as follows:
- A single block of data is the simplest method of specifying the input data.
- An array of data pointers and lengths is used to encrypt data that is not in contiguous storage.

Parameter 3 (clear data format name) specifies which format is used in the clear data parameter. DATA0100 indicates it is a single block of data, and DATA0200 says it is an array.

Note: See the Cryptographic Services API documentation for the structure of the DATA0200 array, as well as for other structures that are discussed in this white paper. This documentation is found at http://publib.boulder.ibm.com/infocenter/iseries/v5r4/topic/apis/catcrypt.htm.

Parameter 2 (length of clear data) specifies the length of the data in the clear data parameter. For format DATA0100, this is the length of the clear data. For format DATA0200, it is the number of entries in the array.

Parameters that describe the cryptographic algorithm: The next two parameters appear on many of the Cryptographic Services APIs, as follows:
4. Algorithm description        Input   Char(*)
5. Algorithm desc format name   Input   Char(8)

As the names suggest, they describe the cryptographic algorithm. The supported cryptographic algorithms fall into one of four classes, as follows:
- Block-cipher algorithms (DES, TDES, AES and RC2)
- Stream-cipher algorithms (RC4-compatible)
- Public key algorithms (PKAs) (RSA)
- Hash algorithms (MD5, SHA-1, SHA-256, SHA-384 and SHA-512)

Note: Although the APIs support additional algorithms, the algorithm description parameter is not used to describe them.

Each of the algorithm classes just discussed requires its own set of arguments, and each has at least one argument, referred to as the algorithm type (for example, AES).
Some algorithm types, such as block ciphers, have several additional arguments (such as block length, mode and pad option). Each of these algorithm classes has a defined structure for specifying the algorithm arguments. Parameter 4 (algorithm description) points to one of these structures; parameter 5 (algorithm-description format name) identifies which structure is used. The format names are as follows:
- ALGD0200 for the block-cipher arguments
- ALGD0300 for the stream-cipher arguments
- ALGD0400 for the PKA arguments
- ALGD0500 for the hash arguments

Of course, on the Encrypt Data API, the ALGD0500 structure is not valid because you cannot encrypt with a hash algorithm. In addition, ALGD0100 is a special algorithm-description structure that references
an algorithm context. You use an algorithm context to extend cryptographic operations over multiple calls. This white paper discusses algorithm contexts in detail in a later section.

Parameters that describe the cryptographic key: Similar to the algorithm, the key is described with a structure of arguments using two parameters, as follows:
6. Key description              Input   Char(*)
7. Key description format name  Input   Char(8)

Parameter 6 (key description) points to the structure, and parameter 7 (key description format name) identifies which structure is used, as follows:
- The KEYD0200 structure specifies the key type (for example, TDES and RSA), key format (binary or basic encoding rules [BER] encoded), key length and the key value.
- The KEYD0100 structure references a key context, which is used when the key is encrypted or is used on multiple operations (more about key contexts later).

Additional key description formats were added in i5/os V5R4. These formats are described later.

Parameters that identify the CSP: The following two parameters identify the cryptographic service provider (CSP) that is used to perform the encryption and decryption:
8. Cryptographic service provider  Input   Char(*)
9. Cryptographic device name       Input   Char(8)

The IBM i operating system supports several CSPs (the hardware or software that implements a set of cryptographic algorithms), such as the 2058 Cryptographic Accelerator, the 4758 Cryptographic Coprocessor, the i5/os LIC and Java Cryptography Extensions. The Cryptographic Services APIs support two CSPs: the LIC and the 2058 Cryptographic Accelerator. Parameter 8 (cryptographic service provider) specifies the CSP to perform the encryption. A value of 1 runs the encryption algorithm in the LIC, and a value of 2 runs the encryption algorithm on the first available 2058 Cryptographic Accelerator. If you require a specific 2058 device, you can specify the device name in parameter 9 (cryptographic device name).
If the 2058 Cryptographic Accelerator does not support the specified algorithm, an error is returned. The easiest means of selecting a CSP is to specify a value of 0, which instructs the system to choose an appropriate CSP. The 2058 Cryptographic Accelerator is used if one is available and it supports the specified algorithm. Otherwise, the encryption is performed in the LIC.

Note: Some functions are always performed in the LIC. For a list of these functions, see IBM i and 2058 Cryptographic Function Comparison, which is found at http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/apis/qc3compare.htm

Parameters for the output data: Here are the three parameters in this category:
10. Encrypted data                              Output  Char(*)
11. Length of area provided for encrypted data  Input   Binary(4)
12. Length of encrypted data returned           Output  Binary(4)

Parameter 10 (encrypted data) and parameter 12 (length of encrypted data returned) are output parameters: the API returns the encrypted data in the first and its length in the second. Parameter 11 (length of area provided for encrypted data) is an input parameter in which you specify the size of the area supplied for parameter 10. Depending on the algorithm, the
encrypted data can be larger than the clear data. For example, the length of RSA-encrypted data always equals the key size. Alternatively, if you specify padding on a block cipher, the last block of the clear data is padded to a full-block size before encryption. If the length of area provided for the encrypted data is too small, an error is returned without any encrypted data.

Parameter for the error code: Parameter 13, the final parameter, returns error information, as follows:
13. Error Code                  I/O     Char(*)

The error code parameter is a variable-length structure that is common to all system APIs. (You can find documentation on the error code parameter in the IBM i Information Center at http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/apiref/error.htm.)

Many of the Cryptographic Services APIs use identical or similar parameters and structures to the ones just described.

A simple Encrypt Data API example

Figure 10 shows a simple example C program that uses the Encrypt Data API (Qc3EncryptData) to implement an AES Encrypt service program. The C header file for Qc3EncryptData is qc3dtaen.h. It contains the function prototype shown in Figure 9. As you can see, all the parameters are passed as pointers. For example, when the API documents a parameter as Binary(4), what is actually specified on the call is a pointer to an int. Header file qc3dtaen.h pulls in another include, qc3cci.h, which contains the structures and constants required for using the Cryptographic Services APIs.
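As a side note on the sizing rule for the encrypted-data area described above, the following is a minimal, portable C sketch rather than part of the API. The function name cipher_area_size is hypothetical, and the sketch assumes that a padded block cipher may add up to one full extra block even when the clear data is already a multiple of the block length. That assumption yields a conservative size, so the area is never too small.

```c
#include <assert.h>
#include <stddef.h>

/* Conservative size, in bytes, for the encrypted-data area of a block
 * cipher. Hypothetical helper, not an IBM i API.
 * Assumption: with padding requested, a whole extra block may be added
 * even when clear_len is already a multiple of block_len. */
static size_t cipher_area_size(size_t clear_len, size_t block_len, int padded)
{
    assert(block_len > 0);            /* block ciphers only */
    if (!padded)
        return clear_len;             /* caller must supply whole blocks */
    return (clear_len / block_len + 1) * block_len;
}
```

With a 16-byte AES block and padding, 10 bytes of clear data call for a 16-byte area and 16 bytes call for 32. The value would be passed as parameter 11 (length of area provided for encrypted data).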
/*********************************************************************/
/* SRVPGM NAME: AESEncrypt                                           */
/*                                                                   */
/* DESCRIPTION: Perform AES encryption using                         */
/*              - CBC mode                                           */
/*              - 32-byte key                                        */
/*              - 16-byte block                                      */
/*              - Pad with 0x00                                      */
/*                                                                   */
/* LANGUAGE:    ILEC                                                 */
/* PARAMETERS:                                                       */
/* ----------------------------------------------------------------- */
/* PARM  USE     TYPE     DESCRIPTION                                */
/* ----------------------------------------------------------------- */
/* 1     Input   char *   Data to encrypt                            */
/* ----------------------------------------------------------------- */
/* 2     Input   int      Length of data to encrypt                  */
/* ----------------------------------------------------------------- */
/* 3     Input   char *   Key                                        */
/* ----------------------------------------------------------------- */
/* 4     Input   char *   Initialization vector                      */
/* ----------------------------------------------------------------- */
/* 5     Output  char *   Place to put encrypted data                */
/* ----------------------------------------------------------------- */
/* 6     Input   int *    Size allocated for encrypted data          */
/*       Output           Actual length of encrypted data            */
/* ----------------------------------------------------------------- */
/*                                                                   */
/* Use the following commands to compile this program:               */
/* CRTCMOD MODULE(my_lib/AESENCRYPT) SRCFILE(my_lib/my_src)          */
/* CRTSRVPGM SRVPGM(my_lib/AESENCRYPT) MODULE(my_lib/AESENCRYPT) +   */
/*           BNDSRVPGM(QC3DTAEN) EXPORT(*ALL)                        */
/*                                                                   */
/*********************************************************************/
/*-------------------------------------------------------------------*/
#include <string.h>                       /* C string utilities      */
#include <qusec.h>                        /* Error code structure    */
#include <qc3dtaen.h>                     /* qc3dtaen API header     */

int AESEncrypt(char * inputdata, int inputlen,
               char * keyvalue, char * iv,
               char * cipherdata, int * cipherlen)
{
  /* Local variables */
  Qc3_Format_ALGD0200_T blockalgd;        /* Algorithm description   */
  struct {
    Qc3_Format_KEYD0200_T keyd;           /* Key description         */
    char keys[256];                       /* Key string              */
  } key;
  char csp;                               /* Crypto service provider */
  int rtnlen;                             /* Return length           */
  struct Qus_EC errcode;                  /* Error code parameter    */

  /* Start of executable code */
  /* ------------ SECTION A Start ----------------------------------- */
  /* Set up the algorithm description */
  memset(&blockalgd, 0, sizeof(blockalgd)); /* Init to null          */
  blockalgd.block_cipher_alg = Qc3_AES;   /* Algorithm is AES        */
  blockalgd.block_length = 16;            /* Block len is 16 bytes   */
  blockalgd.mode = Qc3_CBC;               /* Mode is CBC             */
  blockalgd.pad_option = Qc3_Pad_Char;    /* Pad with 0x00           */
  memcpy(blockalgd.init_vector, iv, 16);  /* Copy in init vector     */
  /* ------------ SECTION A End ------------------------------------- */

  /* ------------ SECTION B Start ----------------------------------- */
  /* Set up the key description */
  key.keyd.key_type = Qc3_AES;            /* Key type is AES         */
  key.keyd.key_format = Qc3_Bin_String;   /* Key format is binary    */
  key.keyd.key_string_len = 32;           /* Key len is 32 bytes     */
  memcpy(key.keys, keyvalue, 32);         /* Copy in key value       */
  /* ------------ SECTION B End ------------------------------------- */

  /* ------------ SECTION C Start ----------------------------------- */
  /* Set the cryptographic service provider */
  csp = Qc3_Any_CSP;                      /* Use any CSP             */
  /* ------------ SECTION C End ------------------------------------- */

  /* ------------ SECTION D Start ----------------------------------- */
  /* Set up the error code parameter */
  memset(&errcode, 0, sizeof(errcode));
  errcode.Bytes_Provided = 0;             /* Request exceptions      */
  /* ------------ SECTION D End ------------------------------------- */

  /* ------------ SECTION E Start ----------------------------------- */
  /* Encrypt the data */
  Qc3EncryptData(inputdata, &inputlen, Qc3_Data,
                 (char*)&blockalgd, Qc3_Alg_Block_Cipher,
                 (char*)&key, Qc3_Key_Parms,
                 &csp, NULL,
                 cipherdata, cipherlen, &rtnlen,
                 &errcode);
  *cipherlen = rtnlen;                    /* Return encrypted len    */
  /* ------------ SECTION E End ------------------------------------- */

  return 0;
} /* end AESEncrypt */

Figure 10. Simple Encrypt Data API example

The following bullets briefly explain the parameters on the call to Qc3EncryptData (see SECTION E, Figure 10).
- The clear data parameter and length of clear data parameter are passed into the service program.
- The clear data format name is set to constant Qc3_Data (at SECTION E). This constant is defined in include qc3cci.h as DATA0100.
- The algorithm description parameter contains the ALGD0200 structure. At SECTION A, the program sets up this structure for the AES algorithm with CBC mode, a block length of 16 bytes, padding with 0x00, and the initialization vector that was passed into the program.
- The algorithm description format name parameter is set to constant Qc3_Alg_Block_Cipher (at SECTION E). This constant is defined in include qc3cci.h as ALGD0200.
- The key description parameter contains the KEYD0200 structure. As you can see at SECTION B, it is set up for a 32-byte AES key with the key value that was passed into the program.
- The key description format name parameter is set to constant Qc3_Key_Parms (at SECTION E). This constant is defined in include qc3cci.h as KEYD0200.
- The cryptographic service provider parameter is set to constant Qc3_Any_CSP (at SECTION C). This constant is defined in include qc3cci.h with a value of 0.
- A NULL pointer is passed for the cryptographic device name parameter (at SECTION E).
- The encrypted data parameter and length of area provided for it are passed into the service program.
- The length of encrypted data returned parameter is a local variable.
- The Bytes_Provided field of the error code parameter is set to 0 (at SECTION D), which forces errors to surface as exceptions.

Algorithm and key contexts

Algorithm and key contexts are temporary repositories for algorithm and key information. You create them with the Create Algorithm Context and Create Key Context APIs. These APIs each return an eight-byte token, which you can then specify on other cryptographic APIs. You use the Destroy Algorithm Context and Destroy Key Context APIs to destroy the algorithm and key contexts. Otherwise, they are destroyed at the end of the job.

Algorithm contexts: Algorithm contexts store the algorithm type (for example, TDES and AES) and associated arguments (such as block size and pad character). They also store the state of a cryptographic operation. An algorithm context enables you to extend a cryptographic operation over multiple calls. For example, if you want to create an HMAC for an entire file, but you do not want to read the entire file before calling the Calculate HMAC API, you must first create an algorithm context. Figure 11 shows the Create Algorithm Context API C function prototype:
    void Qc3CreateAlgorithmContext (
        char * volatile,    /* Algorithm description        */
        char * volatile,    /* Algorithm desc format name   */
        char * volatile,    /* Algorithm context token      */
        void * volatile );  /* Error code                   */

Figure 11. Create Algorithm Context API C function prototype

The API accepts four parameters, as listed here:
1. Algorithm description        Input   Char(*)
2. Algorithm desc format name   Input   Char(8)
3. Algorithm context token      Output  Char(8)
4. Error code                   I/O     Char(*)

As with the Encrypt Data API, you specify a structure that contains the algorithm arguments in the algorithm description parameter. For an HMAC operation, the structure must contain the ALGD0500 arguments for a hash algorithm. In fact, there is only one argument: the type of hash algorithm (for example, SHA-1).

The parameters for the Calculate HMAC API look similar to those for the Encrypt Data API, as you can see from the C function prototype shown in Figure 12:

    void Qc3CalculateHMAC (
        char * volatile,    /* Input data                   */
        int  * volatile,    /* Length of input data         */
        char * volatile,    /* Input data format name       */
        char * volatile,    /* Algorithm description        */
        char * volatile,    /* Algorithm desc format name   */
        char * volatile,    /* Key description              */
        char * volatile,    /* Key desc format name         */
        char * volatile,    /* Crypto Service Provider      */
        char * volatile,    /* Crypto Device Name           */
        char * volatile,    /* HMAC                         */
        void * volatile );  /* Error Code                   */

Figure 12. Calculate HMAC API C function prototype

In this example, the algorithm description format name parameter must specify ALGD0100, meaning the algorithm description parameter is a structure that references an algorithm context. The following layout shows the ALGD0100 structure:
    Algorithm context token     Char(8)
    Final operation flag        Char(1)

You need to set the algorithm context token to the value returned on the Create Algorithm Context API.
On each invocation of the Calculate HMAC API before the last invocation, set the final operation flag to 0 to keep the operation in a continued state. The algorithm context keeps a running HMAC and buffers any data that is not a full-block size. On the last Calculate HMAC call, set the final operation flag to 1; this value instructs the API to complete the operation and return the HMAC value. You can use an algorithm context on different operations, but only after completing each operation. In this example, after the final operation flag has been set to 1 and the HMAC value has been returned, you can reuse the algorithm context on a new operation, such as the Calculate Hash API or another Calculate HMAC.
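The calling pattern just described (continue with flag 0, finish with flag 1, then reuse the context) can be sketched in portable C. This is an illustration only: toy_context and toy_calculate are hypothetical names, and the running value is a simple FNV-1a checksum standing in for the HMAC state. It is not an HMAC, offers no security, and does not model the block buffering that the real algorithm context performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an algorithm context: it only carries running
 * state between calls. FNV-1a is NOT an HMAC and provides no security. */
typedef struct {
    uint32_t state;     /* running value carried between calls  */
    int      active;    /* nonzero while an operation is open   */
} toy_context;

#define TOY_CONTINUE 0  /* mirrors final operation flag = 0 */
#define TOY_FINAL    1  /* mirrors final operation flag = 1 */

/* Feed one chunk of data. Returns 1 and stores the result in *out when
 * final_flag is TOY_FINAL; returns 0 while the operation continues. */
static int toy_calculate(toy_context *ctx, const unsigned char *data,
                         size_t len, int final_flag, uint32_t *out)
{
    assert(ctx != NULL && out != NULL);
    if (!ctx->active) {               /* first call of a new operation */
        ctx->state = 2166136261u;     /* FNV-1a offset basis */
        ctx->active = 1;
    }
    for (size_t i = 0; i < len; i++) {
        ctx->state ^= data[i];
        ctx->state *= 16777619u;      /* FNV-1a prime */
    }
    if (final_flag == TOY_FINAL) {
        *out = ctx->state;
        ctx->active = 0;              /* context can now be reused */
        return 1;
    }
    return 0;
}
```

Feeding "hello " with TOY_CONTINUE and then "world" with TOY_FINAL produces the same value as a single TOY_FINAL call on "hello world", which is the property that lets a caller process a file one buffer at a time.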
Key contexts: A key context holds key information, such as the key type (for example, MD5-HMAC, AES and RSA), key format (binary or BER-encoded) and key length. Key contexts serve several purposes:
- They improve the performance for some algorithms, such as AES and RC2. Some algorithms require initial key processing before performing the cryptographic operation. When you use a key context, initial key processing needs to be performed only one time.
- They allow the application to erase the clear key value from program storage to protect the key.
- They enable the application to use an encrypted key value with the APIs.

You should reduce exposure of clear key values within the application program as much as possible. Encrypted key values can be used on an API if a key context is created. When you create a key context for an encrypted key value, you must supply the key and algorithm contexts for the key-encrypting key (KEK). Now, the issue becomes protection of the KEK. (For more on key management, see Scenario: Key Management and File Encryption Using the Cryptographic Services APIs in the IBM i Information Center at http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/apis/qc3scenario.htm. The i5/os V5R3 version of this scenario is found at http://publib.boulder.ibm.com/infocenter/iseries/v5r3/ic2924/info/apis/qc3scenario.htm.)

i5/os V5R4 enhancements

i5/os V5R4 Cryptographic Services supports the specification of several new key formats. New structures for these formats have been defined for the Key Description parameter of the Encrypt Data, Decrypt Data, Calculate MAC, Calculate HMAC, Calculate Signature and Verify Signature APIs.

Key store label: With key-description format KEYD0400, you can specify the file name, library and label for a key in a Cryptographic Services key store file on any of these APIs.
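To make the key store label idea concrete, here is a minimal portable C sketch of filling a key reference made of fixed-width character fields. The structure keystore_ref, the helper names and the field widths (10, 10 and 32) are assumptions for illustration, as is the blank padding; the real KEYD0400 layout is defined in qc3cci.h and should be taken from there.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout for illustration only; the field widths and the
 * blank padding are assumptions. The real key-description structures
 * are declared in qc3cci.h. */
typedef struct {
    char file_name[10];   /* key-store file name, blank padded  */
    char library[10];     /* library name, blank padded         */
    char label[32];       /* key-record label, blank padded     */
} keystore_ref;

/* Copy src into a fixed-width field and fill the rest with blanks.
 * Returns 0 on success, -1 if src does not fit. No NUL is stored. */
static int set_field(char *dst, size_t width, const char *src)
{
    size_t n = strlen(src);
    if (n > width)
        return -1;
    memcpy(dst, src, n);
    memset(dst + n, ' ', width - n);
    return 0;
}

/* Fill all three fields; returns -1 if any value is too long. */
static int keystore_ref_init(keystore_ref *ref, const char *file,
                             const char *lib, const char *label)
{
    assert(ref != NULL);
    if (set_field(ref->file_name, sizeof ref->file_name, file) != 0)
        return -1;
    if (set_field(ref->library, sizeof ref->library, lib) != 0)
        return -1;
    return set_field(ref->label, sizeof ref->label, label);
}
```

The point of the sketch is that the application passes a reference (file, library, label) instead of key material, so the clear key value never has to appear in program storage.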
PKCS #5: With key-description format KEYD0500, you can use RSA Security's Public-Key Cryptography Standard (PKCS) #5 to derive a symmetric key from a password, a salt and an iteration count. The password must be kept secret. The salt value, which does not need to be secret, is used to produce a large set of key possibilities for each password. Therefore, the salt should be a good random value. The iteration count increases the amount of computation needed to derive the key, so a large iteration count makes an exhaustive search for the key far more expensive. You can specify the PKCS #5 format on the Encrypt Data, Decrypt Data, Calculate MAC and Calculate HMAC APIs. (For information about PKCS #5, refer to http://rsasecurity.com/rsalabs/node.asp?id=2127.)

PEM formatted certificate: Privacy-enhanced mail (PEM) is a standard for secure electronic mail over the Internet. A PEM-formatted certificate is a public-key certificate that is base64-encoded and has the text -----BEGIN CERTIFICATE----- added at the beginning and -----END CERTIFICATE----- added at the end. (A public-key certificate is basically a digital ID that can be verified. It contains a serial number, the owner's name, the issuer's name, validity dates, the public-key component of the owner's RSA key pair and a digital signature that the issuer creates. Base64 is an encoding scheme in which any arbitrary sequence of bytes is converted into printable ASCII characters.) With key-description format KEYD0600, you can specify a PEM-formatted certificate on the Encrypt Data, Decrypt Data and Verify Signature APIs. (For more information about PEM formatted certificates, refer to RFC 2127 at the following Web site: http://rsasecurity.com/rsalabs/node.asp?id=2127.)
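The PEM wrapping just described can be sketched in a few lines of portable C. This is an illustration under stated assumptions, not IBM i code: base64_encode and pem_wrap are hypothetical helper names, the input is treated as opaque bytes rather than a parsed certificate, and the 64-character line length follows the usual PEM convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode len bytes of in into out and NUL-terminate it.
 * out must hold 4 * ((len + 2) / 3) + 1 bytes. Returns chars written. */
static size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        unsigned long v = (unsigned long)in[i] << 16;
        if (i + 1 < len) v |= (unsigned long)in[i + 1] << 8;
        if (i + 2 < len) v |= in[i + 2];
        out[o++] = B64[(v >> 18) & 0x3F];
        out[o++] = B64[(v >> 12) & 0x3F];
        out[o++] = (i + 1 < len) ? B64[(v >> 6) & 0x3F] : '=';
        out[o++] = (i + 2 < len) ? B64[v & 0x3F] : '=';
    }
    out[o] = '\0';
    return o;
}

/* Wrap binary certificate bytes in PEM text: header line, base64 body in
 * 64-character lines, footer line. Returns the text length, or 0 if out
 * is too small. */
static size_t pem_wrap(const unsigned char *der, size_t der_len,
                       char *out, size_t out_size)
{
    static const char head[] = "-----BEGIN CERTIFICATE-----\n";
    static const char tail[] = "-----END CERTIFICATE-----\n";
    size_t b64_len = 4 * ((der_len + 2) / 3);
    size_t lines = (b64_len + 63) / 64;
    size_t need = strlen(head) + b64_len + lines + strlen(tail) + 1;
    size_t o;

    assert(out != NULL);
    if (out_size < need)
        return 0;
    memcpy(out, head, strlen(head));
    o = strlen(head);
    for (size_t i = 0; i < der_len; i += 48) {   /* 48 bytes -> 64 chars */
        size_t chunk = (der_len - i < 48) ? der_len - i : 48;
        o += base64_encode(der + i, chunk, out + o);
        out[o++] = '\n';
    }
    memcpy(out + o, tail, strlen(tail) + 1);     /* includes the NUL */
    return o + strlen(tail);
}
```

Text produced this way is the kind of input that the KEYD0600 key-description format is described as accepting; the exact structure that carries it should be taken from the API documentation.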
Certificate label: With key-description format KEYD0700, you can specify a label that identifies a public key in a public-key certificate located in IBM i Certificate Store. You can specify a certificate label on the Encrypt Data, Decrypt Data and Verify Signature APIs.

Distinguished name: With key-description format KEYD0800, you can specify a distinguished name (the certificate owner) that identifies a public key in a public-key certificate that is located in IBM i Certificate Store. You can specify a distinguished name on the Encrypt Data, Decrypt Data and Verify Signature APIs.

Application identifier: With key-description format KEYD0900, you can specify an application identifier that identifies the private key associated with a public-key certificate in IBM i Certificate Store. You must use Digital Certificate Manager (DCM) to create an application definition and assign a certificate to it before performing an operation with a private key. You can specify an application identifier on the Encrypt Data, Decrypt Data and Calculate Signature APIs. (For information about DCM and IBM i Certificate Store, refer to the following IBM i Information Center article: http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzahu/rzahurazhudigitalcertmngmnt.)

Cryptographic services and key management

As mentioned, the level of protection that is provided by data encryption depends on how well the encryption keys are protected and managed. This section provides information on how to use Cryptographic Services to secure encryption keys.

Key generation

Having good random-key values is extremely important. You can obtain good randomness by having the computer generate the values for you. A computer function that generates a random value is called a pseudo-random number generator (PRNG). (The term pseudo is used because this function generally does not directly compute its output from a real random source.)
Many PRNGs exist; however, not all are cryptographically strong. Cryptographically strong means that the PRNG uses cryptographic functions to ensure that the output is unpredictably random. This is a much more difficult feat than you might suppose. Many security systems have been broken because the PRNG was imperfect.

The IBM i operating system contains a cryptographically strong PRNG. Many system components use the system PRNG. In addition, two APIs are available for your use: Generate Pseudo-random Numbers and Add Seed for Pseudo-random Number Generator.

Cryptographic Services has three APIs for generating key values: Generate Key Record, Generate Symmetric Key and Generate PKA Key Pair. All three APIs use the system PRNG to generate cryptographically strong, pseudo-random key values.

The Generate Key Record API was introduced with the i5/os V5R4 release. It generates a random key or key pair and stores it in a key-store file. The key-store file is discussed in more detail in the Recent advancements in key management section of this white paper.
i5/os V5R3 considerations

On i5/os V5R3, you can use the Generate Symmetric Key and Generate PKA Key Pair APIs to create random key values for both symmetric and asymmetric algorithms. The Generate Symmetric Key API creates a random-key value for use with DES, TDES, AES, RC2 or RC4-compatible cryptographic algorithms. The Generate PKA Key Pair API creates a random RSA-key pair. You indicate the key type, key size and whether the key must be returned in clear or encrypted format. If you specify an encrypted key, you must supply the key and algorithm contexts for the KEK.

Figure 13 pictorially represents the process of using the Generate Symmetric Key API in combination with other cryptographic APIs to create a set of three encryption key values: master key, key-encryption key and data-encryption key.

Figure 13. Generate Symmetric Key API

First, a master key is created with the Generate Symmetric Key API. The result is a clear-text key. Because this key is in the clear, it must be protected by the application. This key becomes an input parameter for the Create Key Context API, which creates a temporary area for holding a cryptographic key. An algorithm context also needs to be created, with the Create Algorithm Context API, to provide a temporary area for holding the algorithm parameters and the state of the cryptographic operation. These APIs each return an eight-byte token for referencing the created context.
Next, the KEK is generated. The Generate Symmetric Key API is used again, but this time the master key context token and the algorithm context token are used as input parameters. The generated key, which the API provides in an output parameter, is encrypted by the master key.

In the next step, you need to create another key context for the KEK. For the input parameters, you need to provide the encrypted KEK and the master key and algorithm-context tokens (to decrypt the KEK).

Another algorithm context must be created to hold the algorithm information for encrypting the data-encryption key. You also need to provide the KEK key-context token. These two objects become input parameters for the last Generate Symmetric Key operation. This final operation generates the encrypted data-encryption key.

Key storage

If you implement an encryption solution on i5/os V5R3, it is important to remember that the Cryptographic Services APIs provide no key management. Determining how to store and manage cryptographic keys is the application programmer's responsibility.

You must give careful consideration to the authorities that are placed on objects containing keys. Even if the key is encrypted, users who have access to the KEK can decrypt the key and, with it, the data. It is important to encrypt the keys that are stored on removable media and, equally important, you must not store the clear KEK with the encrypted keys.

Typically, KEKs are stored encrypted, as well. You use a master key, which is the only unencrypted key, to encrypt KEKs. Ideally, you should not store the master key on the system. (To study an example design for setting up and using a master key and KEKs, see Scenario: Key Management and File Encryption Using the Cryptographic Services APIs. The i5/os V5R3 version of this scenario is found in the IBM System i Information Center at http://publib.boulder.ibm.com/infocenter/iseries/v5r3/ic2924/info/apis/qc3scenario.htm. The i5/os V5R4 version of this scenario is found in the IBM System i Information Center at http://publib.boulder.ibm.com/infocenter/iseries/v5r4/topic/apis/qc3scenario.htm.)

Note that the 2058 Cryptographic Accelerator handles keys in the clear and does not provide key-storage facilities. If your application has a requirement to store and handle keys securely on tamper-resistant hardware, you might want to use the CCA API set, which performs cryptographic operations on the 4764 and 4758 Cryptographic Coprocessors. (To learn more, see Cryptographic Coprocessors, which is found in the IBM i Information Center at http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzahu/rzahurazhudigitalcertmngmnt.)
Recent advancements in key management

Starting with i5/os V5R4, the key management capabilities of Cryptographic Services were significantly enhanced, including support for a hierarchical-key store. Initially, IBM i APIs provided the only interface for key management. The key management interface was extended further in the IBM i 6.1 release with new CL commands and a graphical key management interface in IBM Systems Director Navigator for i5/os. To learn more about the key management interfaces, consult the IBM i Information Center article at http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzajc/rzajckeymgmt.htm. This section focuses on the programmatic API interfaces available for key management.

Cryptographic-services key store

Figure 14 shows a diagram of a hierarchical-key system, which is a commonly used key-management technique.

Figure 14. The master key is at the top of the hierarchy and is a clear key value

At the top of the hierarchy is a master key (or keys). The master key is the only clear key value and must be stored securely. You use KEKs to encrypt other keys. Typically, you use a KEK to encrypt a key that is either sent to another system or stored outside the key store. KEKs are usually stored encrypted under a master key. Data keys are used directly on user data (for example, to encrypt or sign). You can encrypt a data key under a KEK or under a master key.

You can use IBM i Cryptographic Services to implement such a key hierarchy through a two-tier key store.
Master keys The IBM i environment supports eight master keys that are used to encrypt other keys (that is, KEKs and data keys), but not to encrypt data. In addition to the eight general-purpose master keys, cryptographic services supports two special-purpose master keys. The ASP master key is used for protecting data in the Independent Auxiliary Storage Pool (in the Disk Management GUI is known as an Independent Disk Pool). The save/restore master key is used to encrypt the other master keys when they are saved to media using a Save System (SAVSYS) operation. For more information about Cryptographic Services master keys, refer to the IBM i Information Center article entitled Cryptographic Services Master Keys, at http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/apis/qc3masterkeys.htm. To reference master keys, you simply use a number, 1 through 8. Each master key is composed of three 32-byte values, called versions. The versions are new, current and old (see Figure 15). The new master-key version contains the value of the master key while it is being loaded. The current master-key version contains the active master-key value. This value is used when a master key is specified on a cryptographic operation (unless expressly stated otherwise). The old master-key version contains the previous current master-key version. It is used to prevent the loss of data and keys when the master key is changed. Figure 15. How master keys are stored in the LIC To load and set (or activate) a master key, you use APIs. As input, the Load Master Key API uses a passphrase, which is a sequence of text used to control access to a master key. You can load as many passphrases as desired before setting a master key. In fact, to ensure that no single individual has the ability to reproduce a master key, you need to assign passphrases to several people. The Set Master Key API activates the new master-key value, which consists of the passphrases that were previously loaded. 
The Set Master Key API returns a key verification value (KVV). Retain this value so that, at a later date, you can determine whether the master key has been changed. When you set a master key, any keys that have been encrypted under that master key must be translated (that is, decrypted and then encrypted again).
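The load, set and verify cycle described above can be sketched as a small Python model. This is only an illustration, not the IBM i implementation: the class name, the method names, the way passphrases are combined (chained SHA-256) and the way the KVV is computed (a truncated SHA-256 fingerprint) are all assumptions made for this sketch.

```python
import hashlib


def kvv(key_value: bytes) -> bytes:
    """Hypothetical KVV: a truncated fingerprint of a master-key version."""
    return hashlib.sha256(b"kvv" + key_value).digest()[:20]


class MasterKeyModel:
    """Toy model of one master key with new, current and old versions."""

    def __init__(self):
        self.new = None      # value being loaded
        self.current = None  # active value
        self.old = None      # previous active value

    def load_passphrase(self, passphrase: str) -> None:
        # Assumed combining rule: fold each passphrase into the new version.
        previous = self.new or b""
        self.new = hashlib.sha256(previous + passphrase.encode("utf-8")).digest()

    def set_master_key(self) -> bytes:
        """Activate the new version and return its KVV."""
        if self.new is None:
            raise ValueError("load at least one passphrase first")
        self.old, self.current, self.new = self.current, self.new, None
        return kvv(self.current)

    def check(self, retained_kvv: bytes) -> str:
        """Compare a retained KVV with the current and old versions."""
        if self.current is not None and retained_kvv == kvv(self.current):
            return "current"
        if self.old is not None and retained_kvv == kvv(self.old):
            return "old"
        return "unknown"


mk = MasterKeyModel()
mk.load_passphrase("phrase held by officer A")
mk.load_passphrase("phrase held by officer B")
first_kvv = mk.set_master_key()
print(mk.check(first_kvv))  # current

mk.load_passphrase("replacement phrase")
mk.set_master_key()
print(mk.check(first_kvv))  # old: keys under this value need translating
```

In this model, a retained KVV that matches neither the current nor the old version signals that the master key has been changed more than once since the value was recorded.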
A key hierarchy is a popular key-management technique because it reduces the number of secrets (for example, plaintext keys and passwords) to protect. Be discriminating in regard to the number of master keys you actually need to use and must, therefore, manage.

Key-store files

You can securely store KEKs and data keys in a Cryptographic Services key-store file. A key-store file is a database file that you create with the Create Key Store API. Even though the key-store file is a database file, none of the normal data-access methods, such as SQL or record-level access, are able to access the key-store file. You specify the file name and library, the public authority and the master key that encrypts all the key values stored in the key-store file. Figure 16 shows a graphic representation of a key-store file.

Figure 16. The key store

You can store both KEKs and data-encryption keys in the key-store file. To store a key in a key-store file, use the Write Key Record API. Its parameters allow you to specify the following pieces of information:

- The type and size of the key
- Whether the input-key value is a binary string, a BER-encoded string or a PEM certificate
- Whether the input-key value is encrypted (and if so, the KEK information)
- A label for referencing the key

Alternatively, you can use the Generate Key Record API to create and store a random key value in a key-store file, again specifying key type and size, as well as a label for the key record.
To use a key from key store, you use a new structure that is defined for the Key Description parameter. This parameter is used by many Cryptographic Services APIs (for example, the Encrypt Data API) to specify the key data. In this new structure, you specify the key-store file name, the library and the key-record label. Cryptographic Services retrieves the key value from the key-store file and decrypts it from under the master key before running the operation (for example, encrypting the data).

You must give careful consideration to authorities when creating key-store files. Although key values in a key-store file are encrypted, anyone with access to the key-store file and the appropriate API (for example, the Decrypt Data API) can decrypt the data that those keys protect. (Note: A key-store file that is moved to another system is useless if the master key under which it was encrypted has not been set identically.)

Master-key encrypted keys

As stated earlier, key values in a key-store file are encrypted under a master key. You can also encrypt under a master key outside of a key store. On the Generate Symmetric Key API and the Generate PKA Key Pair API, you can request that the generated key value be returned encrypted under a master key. Alternatively, you can use the Import Key API to encrypt a clear key value or translate (that is, decrypt and then encrypt) an encrypted key value under a master key. Again, you must carefully control access to master-key-encrypted keys.

Figure 17 demonstrates the process of using the master key and key-store file to generate a KEK and data-encryption key.

Figure 17. Generating a KEK and data-encryption key
Changing master-key values

It is recommended that you change master-key values periodically. All keys that are encrypted under the master key must then be translated under the newly activated master-key value. If you neglect to do this, key values, as well as all data that is encrypted under them, might be lost. You use the Translate Key Store API to translate key-store files from an old master-key value to an active master-key value. To translate a master-key-encrypted key that is located outside of key store, first use the Export Key API, and then use the Import Key API.

Master-key-encrypted keys that are not translated after the master key is changed might still be usable if you know the old master key's KVV. When a key is added to a key-store file, the KVV of the master key (under which the new key value is encrypted) is stored in the key record. When the key is used on a Cryptographic Services API, the API retrieves the KVV from the key record and compares it with the active and old master-key KVVs. However, for master-key-encrypted keys outside of key store, you must manage the master-key KVV and supply it to any APIs yourself. When you supply a KVV, an API can detect that a key is encrypted under an old master-key value. The API runs as usual but returns a warning message, which specifies that you need to translate the key. If the key is not translated and the master-key value is changed again, the key value is lost, and you cannot recover it unless you restore the original value of the master key.

Master key variants

You can limit the use of a master-key-encrypted key by using master-key variants. For example, if your application program uses a key from key store to encrypt sensitive data, you can use variants to ensure that the key cannot be used to decrypt the data. A master-key variant is a value that is combined with the master-key value through an exclusive-OR (XOR) operation before a key is encrypted.
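The XOR step just described can be sketched in Python. This is a toy illustration under stated assumptions: the variant byte values are made up, the variant is assumed to be zero-padded to the master-key length, and `toy_wrap` is an insecure stand-in for real key encryption (it XORs the key with a SHA-256-derived keystream). None of these names come from the Cryptographic Services APIs.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def apply_variant(master_key: bytes, variant: bytes) -> bytes:
    """XOR a variant into the master key (zero-padding is an assumption)."""
    return xor_bytes(master_key, variant.ljust(len(master_key), b"\x00"))


def toy_wrap(wrapping_key: bytes, key_value: bytes) -> bytes:
    """Insecure stand-in for key encryption; applying it twice undoes it."""
    stream = hashlib.sha256(wrapping_key).digest()
    if len(key_value) > len(stream):
        raise ValueError("toy cipher handles at most 32 bytes")
    return xor_bytes(key_value, stream[:len(key_value)])


# Hypothetical values for illustration only.
master_key = bytes(range(32))   # 32-byte master-key version
stored_variant = b"\x02"        # made-up code for a disallowed function
altered_variant = b"\x00"       # variant changed to try to force the operation
data_key = b"0123456789abcdef"

wrapped = toy_wrap(apply_variant(master_key, stored_variant), data_key)
recovered = toy_wrap(apply_variant(master_key, stored_variant), wrapped)
garbled = toy_wrap(apply_variant(master_key, altered_variant), wrapped)

print(recovered == data_key)  # True: the original variant unwraps the key
print(garbled == data_key)    # False: a different variant gives a different key
```

The point of the sketch is that the variant is part of the effective wrapping key, so changing the variant changes the key value that comes out of the unwrap step.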
You specify a master-key variant on an API (for example, the Generate Key Record API) by using the Disallowed Function parameter or field. Functions that you can disallow are Encrypt, Decrypt, MAC and HMAC, Sign or any combination of these. When you use a key that is encrypted under a master-key variant on an operation (for example, to decrypt data), you must supply the variant to decrypt the key properly. The master-key variant for a key-store file key is stored in the key record and is picked up automatically when that key is specified on an API. For keys outside of key store, you must manage and supply the master-key variant yourself. If the supplied variant indicates that the operation is disallowed, an error is returned. If the variant is altered to force the operation, the results are unusable. For example, if you use an altered variant on the Decrypt Data API, the decryption proceeds as usual, but the result is not clear-text data.

Migrating keys to another system

To move a master-key-encrypted key (inside or outside of key store) to another system, use the Export Key API. The Export Key API translates the key from encryption under the master key to encryption under a KEK. On the target system, you can then use the Write Key Record API to move the migrated key into the key store, or you can use the Import Key API to translate the key value to encryption under a master key. All these APIs require that you specify the KEK information through algorithm and key contexts.
The Export Key API is shipped with public authority *EXCLUDE. Be very careful about the access that you give to the Export Key API. Anyone with access to master-key-encrypted keys and the Export Key API can obtain the clear-key values.

In general, you should not share master keys with another system. Each system should have unique master keys. However, to move an entire key-store file from one system to another without exposing clear-key values, you need to set up identical master-key values on both systems. To avoid exposing your master-key values, set up a temporary master key on both systems by loading and setting an unused master key with identical passphrases. On the source system, create a duplicate of the key-store file (for instance, by using the IBM i CRTDUPOBJ CL command). Then, translate the duplicated key-store file to the temporary master key. After moving the key-store file to the target system, translate the key-store file to another master key.

Backing up keys

Keeping a current backup of all your keys is essential. If a master key is lost, all keys encrypted under that key, and consequently all data encrypted under those keys, are lost. There are two methods of backing up your master keys:

- Save the individual passphrases
- Save the master keys by performing a SAVSYS operation (available starting with the IBM i 6.1 release)

Details on these two master key backup methods can be found in the IBM i Information Center (http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzaiu/rzajcsavemasterkey.htm).

Any time you add a new key to a key-store file, it is recommended that you also make a backup of the file. In addition, any time the key-store file is translated, you need to make a new backup of the file. Do not forget about keys that are stored outside of key store. These should be backed up, as well. Generally, these should not be encrypted under a master key.
If you store a key outside of key store, it is best to encrypt it under a KEK. If you encrypt it under a master key and that master key is changed, you must remember to translate the key as described previously.

Key distribution

You can use the Cryptographic Services APIs to implement a key negotiation protocol. Typically, you use symmetric-key algorithms to perform data encryption. The symmetric keys are distributed by using asymmetric-key algorithms. The Cryptographic Services APIs support two asymmetric-key algorithms for use in key distribution:

- RSA: An RSA public key encrypts a symmetric key that is then distributed. The corresponding private key is used to decrypt it.
- Diffie-Hellman (D-H): The communicating parties generate and exchange D-H parameters, which are then used to generate key pairs. The public keys are exchanged, and each party is then able to compute the symmetric key independently.
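The Diffie-Hellman exchange in the list above can be illustrated with a few lines of Python. This is a teaching sketch only: the prime P, the base G and the function names are assumptions chosen for readability, the parameters are far too small for real use, and the sketch does not call the Cryptographic Services APIs.

```python
import hashlib
import secrets

# Toy D-H parameters shared by both parties (not secure; illustration only).
P = 2**127 - 1  # a Mersenne prime
G = 3


def generate_key_pair():
    """Return a (private, public) D-H key pair for the shared parameters."""
    private = secrets.randbelow(P - 3) + 2  # value in the range 2 .. P-2
    public = pow(G, private, P)
    return private, public


def derive_symmetric_key(own_private: int, peer_public: int) -> bytes:
    """Compute the shared secret and hash it into a 32-byte symmetric key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Each party generates a key pair and sends only the public key to the other.
a_private, a_public = generate_key_pair()
b_private, b_public = generate_key_pair()

key_at_a = derive_symmetric_key(a_private, b_public)
key_at_b = derive_symmetric_key(b_private, a_public)

print(key_at_a == key_at_b)  # True: both sides computed the same key
```

Only the public values cross the wire; the private values and the derived symmetric key never leave the system that computed them.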
A great deal of literature is available regarding key-exchange protocols, including setting up key negotiation and understanding the many potential pitfalls. (See the references in Appendix A of this white paper to learn more.)

SQL encryption and decryption

An earlier example showed the simplified DB2 for i interface for encryption and decryption with SQL. Although the SQL functions offer an easy interface for encryption and decryption, there are no SQL functions to assist with encryption-key management. This section has more details on the syntax and other requirements for using the SQL built-in encryption and decryption functions.

Look at Figure 18 to see how you can use the DB2 encryption and decryption functions to provide an extra layer of security:

    SET ENCRYPTION PASSWORD = 'SEC01RET'
    INSERT INTO customer VALUES('JOSHUA', ENCRYPT_TDES('1111222233334444'))

    SET ENCRYPTION PASSWORD = 'SEC01RET'
    SELECT name, DECRYPT_CHAR(card_nbr) FROM customer

Figure 18. Example of using DB2 encryption and decryption functions

In the preceding code, the Set Encryption Password statement provides DB2 with the key that is used to encrypt (and decrypt) the data. The next statement shows how the DB2 Encrypt_TDES function is used to encode the sensitive credit-card number before it is written into the DB2 table. The final two statements show the steps necessary to view the original credit-card number value of 1111222233334444. First, the encryption key must be set to the same value that was used to encrypt the number. Then, one of the DB2 decryption functions must be used to convert the binary-encrypted value into the original character value.

Notice in this example that no SQL or data description specifications (DDS) keywords tell DB2 to encrypt and decrypt the data automatically. Application changes are required because automatic encryption and decryption do not provide an extra layer of security.
If DB2 automatically decrypts the credit-card number for all users that read the customer table, then the credit-card number is viewable to just as many users as it is without encryption. You can realize the security benefits of encryption only by changing your applications and interfaces to decrypt the data selectively for a subset of your authorized users. The benefit provided by the DB2 support is that these functions make it easy to encrypt and decrypt data. Your applications can invoke a simple SQL function instead of coding calls to complex cryptography APIs and services.

Preparing for encryption

Now that you understand how to use the SQL-encryption support, it is important to focus on the setup steps required by encryption. For several releases, IBM i provided a set of Cryptographic Services (that is, algorithms) for use by applications and system functions such as SSL and IPSec. In i5/OS V5R3, access to these Cryptographic Services was integrated into DB2 for i with the arrival of a set of SQL encryption and decryption scalar functions. The DB2 encryption and decryption functions use the encryption algorithms that are embedded in the no-charge IBM Cryptographic Access Provider 128-bit product (5722-AC3). As mentioned, this product no longer exists as of i5/OS V5R4 because cryptographic support is now embedded into the base level of the operating system. The IBM Cryptographic Coprocessor and Accelerator are not required and provide no benefit to the encryption algorithms that are used by the DB2 SQL functions. Instead of having
to interface directly with an IBM i Cryptographic Services API, you can invoke a simple DB2 function as demonstrated with the following insert command:

    INSERT INTO employees VALUES(ENCRYPT_RC2('777-88-9999'), 'BOB SANDERS')

After ensuring that your IBM i system has the needed prerequisites installed, the final preparation step is to identify the columns to encrypt. Data encryption requires changing the data type and length of your existing column definitions. The data type must be changed because an encrypted value is a binary string that can only be stored in a column that is defined with one of the following SQL data types:

- BINARY
- VARBINARY
- CHAR FOR BIT DATA
- VARCHAR FOR BIT DATA
- BLOB

If the target DB2 table is defined with DDS, then the target column must be defined as a fixed or varying-length character field with a coded character set identifier (CCSID) value of 65535.

The length of an existing column must increase because some overhead bytes are needed for the encrypted value. The DB2 engine uses these extra bytes to store information on the attributes (for example, CCSID) of the data that is encrypted in the column and the algorithm that encrypted the data. The data that is stored in these extra bytes makes it possible for DB2 for i to share and exchange encrypted data with other DB2 server products. The length must increase further if a password hint is stored with the encrypted value. (Password hints are covered later in this white paper.)

The length of an encrypted string value is the actual length of the string plus N bytes. For the Encrypt_RC2 function, N is defined as follows:

- If a hint is not specified, N is eight bytes (or 16 bytes if the input string is defined as large object (LOB), BINARY or VARBINARY or the input string and password have different CCSID values) plus the number of bytes to the next eight-byte boundary.
- If a hint is specified, N is eight bytes (16 bytes if the input string is defined as LOB, BINARY or VARBINARY or the input string and password have different CCSID values) plus the number of bytes to the next eight-byte boundary plus 32 bytes for the hint length.

For the Encrypt_TDES function, N is defined as follows:

- If a hint is not specified, N is 16 bytes (or 24 bytes if the input string is defined as LOB, BINARY or VARBINARY or the input string and password have different CCSID values) plus the number of bytes to the next eight-byte boundary.
- If a hint is specified, N is 16 bytes (24 bytes if the input string is defined as LOB, BINARY or VARBINARY or the input string and password have different CCSID values) plus the number of bytes to the next eight-byte boundary plus 32 bytes for the hint length.

For the Encrypt_AES function, N is defined as follows:

- If a hint is not specified, N is 24 bytes (or 32 bytes if the input string is defined as LOB, BINARY or VARBINARY or the input string and password have different CCSID values) plus the number of bytes to the next 16-byte boundary.
- If a hint is specified, N is 24 bytes (32 bytes if the input string is defined as LOB, BINARY or VARBINARY or the input string and password have different CCSID values) plus the number of bytes to the next 16-byte boundary plus 32 bytes for the hint length.

Different CCSID values can result because the password string is assigned the CCSID that is used by the job and the encrypted value is assigned the CCSID of the result data type. The DB2 for i SQL Reference provides more details on length requirements.

The following examples are provided to help you understand the length requirements for the various encryption functions:

Encrypt_RC2 function
- A 16-digit credit-card number, which is stored in a VARBINARY column without a hint, requires a column length of 40 bytes. That is 16 bytes for nonencrypted data, plus 16 extra bytes, plus eight extra bytes (to reach the next eight-byte boundary).
- A nine-digit SSN, which is stored in a VARCHAR FOR BIT DATA column with a hint, requires a column length of 56 bytes. That is nine bytes for nonencrypted data, plus eight extra bytes, plus seven extra bytes (to reach the next eight-byte boundary), plus 32 bytes for the hint.

Encrypt_TDES function
- A 16-digit credit-card number, which is stored in a VARBINARY column without a hint, requires a column length of 40 bytes. That is 16 bytes for nonencrypted data, plus 24 extra bytes. No extra bytes are needed because 16 plus 24 equals 40 (which is an eight-byte boundary).
- A nine-digit SSN, which is stored in a VARCHAR FOR BIT DATA column with a hint, requires a column length of 64 bytes. That is nine bytes for nonencrypted data, plus 16 extra bytes, plus seven extra bytes (to reach the next eight-byte boundary), plus 32 bytes for the hint.

Encrypt_AES function
- A 16-digit credit-card number, which is stored in a VARBINARY column without a hint, requires a column length of 48 bytes. That is 16 bytes for nonencrypted data, plus 32 extra bytes.
  No extra bytes are needed because 16 plus 32 equals 48 (which is a 16-byte boundary).
- A nine-digit SSN, which is stored in a VARCHAR FOR BIT DATA column with a hint, requires a column length of 80 bytes. That is nine bytes for nonencrypted data, plus 24 extra bytes, plus 15 extra bytes (to reach the next 16-byte boundary), plus 32 bytes for the hint.

Besides minimizing the number of column-definition changes, performance overhead is another reason to encrypt only a small subset of columns. For instance, if you increase the security of your office building by adding a badge reader to the front door, it takes a little longer to get to your desk each morning. In the same way, adding encryption and decryption to your application slows database access. Table 1 shows the results of a simple comparison test done in the lab with a single-column table to compare the speed of inserting and retrieving a single row.

                        SELECT statement    INSERT statement
    Without encryption  0.1 millisecond     0.05 millisecond
    With encryption     3.0 milliseconds    1.1 milliseconds
Table 1. Comparing the speed of inserting and retrieving a single row
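The column-length rules and worked examples given earlier in this section can be checked with a short Python calculation. This sketch mirrors the six worked examples in this paper rather than any DB2 internals. In particular, it assumes (from the RC2 credit-card example) that RC2 always adds between one and eight boundary bytes based on the data length, while TDES and AES add only the bytes needed to bring the data plus overhead up to the boundary. The function and parameter names are hypothetical; consult the DB2 for i SQL Reference for the authoritative rules.

```python
HINT_BYTES = 32

# algorithm: (base overhead, extended overhead, boundary, always add boundary bytes)
RULES = {
    "RC2": (8, 16, 8, True),
    "TDES": (16, 24, 8, False),
    "AES": (24, 32, 16, False),
}


def encrypted_length(algorithm, data_len, extended=False, hint=False):
    """Estimate the column length needed for an encrypted value.

    extended is True when the input is LOB, BINARY or VARBINARY, or when the
    input string and password have different CCSID values.
    """
    base, ext, boundary, always_pad = RULES[algorithm]
    overhead = ext if extended else base
    if always_pad:
        # Assumed RC2 behavior, taken from the worked example (16 + 16 + 8).
        pad = boundary - (data_len % boundary)
    else:
        pad = -(data_len + overhead) % boundary
    return data_len + overhead + pad + (HINT_BYTES if hint else 0)


# 16-digit card number in a VARBINARY column, no hint
print(encrypted_length("RC2", 16, extended=True))   # 40
print(encrypted_length("TDES", 16, extended=True))  # 40
print(encrypted_length("AES", 16, extended=True))   # 48

# Nine-digit SSN in a VARCHAR FOR BIT DATA column, with a hint
print(encrypted_length("RC2", 9, hint=True))   # 56
print(encrypted_length("TDES", 9, hint=True))  # 64
print(encrypted_length("AES", 9, hint=True))   # 80
```

A helper like this is convenient when sizing several columns at once, but the results should be confirmed against the SQL Reference before altering a table.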
The encryption recipe

Now that the preparations have all been made, it is time to learn how to use the DB2 Encrypt function. The first step for data encryption is providing DB2 with an encryption key. The encryption password can be set at a job or connection level with the Set Encryption Password statement. The statement is a good method to use when the same password must be shared across rows in a table or across different tables. The encryption password is a character string that must have a length that is between six and 127 bytes, and the value is case-sensitive. To help administrators and programmers remember the password value, the Set Encryption Password statement lets you store an optional hint (up to 32 bytes) with the encrypted data value. Figure 19 shows how to supply DB2 with the encryption key and a hint:

    CREATE TABLE emp (SocialSecurity VARCHAR(56) FOR BIT DATA, name VARCHAR(30))
    SET ENCRYPTION PASSWORD = 'at95lantic' WITH HINT = 'ocean'
    INSERT INTO emp VALUES(ENCRYPT_RC2('123456789'), 'JENNA')
    SELECT GETHINT(socialSecurity) FROM emp

Figure 19. An example of supplying DB2 with an encryption key and a hint

The Set Encryption Password statement specifies an encryption key value of at95lantic with a hint value of ocean. The numeric digits, 95, are in the middle of the password string because using a word such as atlantic as a password is not a good idea. A hacker could programmatically try all the words in the dictionary in a short period of time. When the Insert statement encrypts the employee's SSN, DB2 also stores the hint value along with the encrypted value in that column. An administrator who forgets the encryption key can use the DB2 GetHint function to retrieve the hint value of ocean as a reminder that the name of an ocean was used for the encryption key. The encryption key that is specified on the Set Encryption Password statement is active until the job or activation group ends.
You can also vary the encryption password across different rows in a table by running the Set Encryption Password statement multiple times or by specifying the password-string value on the DB2 Encrypt functions. The Encrypt function supports three input parameters, but the only required parameter is the input data string. The optional parameters are the password and hint. Figure 20 is an example of how to perform the same encryption as is used in Figure 19:

    INSERT INTO emp VALUES(ENCRYPT_RC2('123456789', 'at95lantic', 'ocean'), 'jenna')

Figure 20. An example of performing the encryption shown in Figure 19

The capability to specify different passwords for each row might be useful for a Web application where all customers are given the ability to encrypt their own credit-card number. When customers input their credit-card number, they also specify a password that is used on the ENCRYPT_RC2 function. That password is used to encrypt their card number as it is written to the back-end database.

These SQL encryption examples look exactly the same if the ENCRYPT_RC2 function is replaced with either the ENCRYPT_TDES or ENCRYPT_AES function.
Encryption password management and distribution

Obviously, it is not a good idea to hard-code the encryption password and hint in your application source code. The recommended approach is to supply these values through host variables or parameter markers so that the encryption password is less visible to anyone who scans the application source code, as Figure 21 shows:

    SET ENCRYPTION PASSWORD = :hostvar1 WITH HINT = :hostvar2
    INSERT INTO emp VALUES(ENCRYPT_AES('123456789', ?, ?), 'JENNA')

Figure 21. Example of supplying encryption-password and hint values through host variables or parameter markers

If the application manages the encryption and decryption of the data, you can make the password value more secure by storing it in a validation-list object or by using the IBM i APIs for key management.

Decryption: Unlocking the data

The final step in this data-protection process is getting the original data value back with the DB2 Decrypt function. i5/OS V5R3 includes the following set of decryption functions:

- DECRYPT_CHAR
- DECRYPT_DB
- DECRYPT_BIT
- DECRYPT_BINARY

Because they return a character string to the invoker, the DECRYPT_CHAR and DECRYPT_DB functions are probably the most commonly used. As with the Encrypt function, the Decrypt functions accept up to three parameters, but only the first one is required. The first parameter identifies the encrypted value that needs to be decrypted. The second parameter can be used to specify the password, and the final parameter is an integer value that you can use to specify the CCSID value for any character-string values that are returned.

For the decryption functions to return the original value, the encryption password must be set to the same password value that was used to encrypt the data with the Encrypt function. If the decryption password is different, a runtime error is returned.
An example that shows how to decrypt the data that was encrypted can be seen in Figure 22:

    SET ENCRYPTION PASSWORD = 'at95lantic'
    SELECT DECRYPT_CHAR(socialSecurity) FROM emp WHERE name = 'jenna'

Figure 22. Example of decrypting the data that was encrypted in Figure 21

Any application that reads the SSN from the employee table without the decryption function receives the encrypted binary-string value by default. As you can see, invoking the DB2 Decrypt functions is simple.
Native interfaces and DB2 encryption

So far in this section, all the examples have been written in SQL; so, you might wonder if applications that use the native (non-SQL) interface can benefit from this DB2 support. The answer is yes. You do need to use some SQL in the process, but you do not need to rewrite your applications to use SQL. Native users must understand that they do not have to use SQL to create the underlying table that contains the encrypted data. The column that is encrypted just needs to satisfy the previously discussed requirements. With the IBM i 7.1 release, the DB2 Field Procedure interface provides another possible solution for native interfaces.

Database triggers provide an excellent way of making encryption services available without changing all of your applications to use SQL. Triggers are useful whether you use the DB2 encrypt and decrypt functions or call the Cryptographic Services APIs directly. You can define the Before Insert and Update triggers to intercept the Write requests to your database and then encrypt any sensitive column data before it is passed to the DB2 engine. Using this trigger approach means that the only program that must use SQL to invoke one of the SQL built-in encryption functions is the trigger program. You can use both SQL and external triggers. Figure 23 is an example of using SQL triggers to encrypt SSNs that are inserted into the employee table:

    CREATE TRIGGER protect_ssnumber
      BEFORE INSERT ON emp
      REFERENCING NEW ROW AS n
      FOR EACH ROW
    BEGIN
      DECLARE encrypt_key VARCHAR(127);
      SET encrypt_key = get_passwordudf('EMP');
      SET n.socialsecurity = ENCRYPT_AES(n.socialSecurity, encrypt_key);
    END

Figure 23. An example of using SQL triggers to encrypt data

Any native Write (or SQL Insert) into the employee table invokes this BEFORE INSERT trigger. When the trigger is invoked, the protect_ssnumber SQL trigger first calls the user-defined get_passwordudf() function to get the encryption password for the employee table.
That password is then used to encrypt the SSN in the Before record image that DB2 uses to generate a new row in the employee table.

SQL views are probably the most common approach to give native programs the ability to read the decrypted version of sensitive data. Figure 24 provides an example view that you can use in this manner:

    CREATE VIEW my_logins(system, login, passwd) AS
      SELECT system, login, CHAR(DECRYPT_CHAR(passwd), 6)
      FROM regusers
      WHERE userid = USER

Figure 24. An example SQL view that allows native programs to read decrypted data

The native program can then open the SQL my_logins view as a nonkeyed logical file, and the decrypt function (which is contained in the SQL view) allows the unencrypted column data to be returned to the native program. The decryption occurs only if the native program sets the proper encryption key directly (through the Set Encryption Password SQL statement) or indirectly (by calling another program to activate the password). Encrypted columns require a longer column length; so, native programs must decrypt the data or be modified to handle the longer column length when the encrypted data is returned.
Using an SQL view to decrypt data automatically simplifies the decryption process, but this same SQL view also provides a mechanism for allowing system users to retrieve sensitive data in a bulk fashion. You must carefully review data-security concerns before using this view-based approach for decryption.

Integrated encryption and Instead Of triggers

DB2 includes support for Instead Of triggers, which can make it easier to incorporate encryption and decryption into your applications. The biggest difference with Instead Of triggers (compared to the existing DB2 trigger support) is that Instead Of triggers can only be defined over SQL views. IBM created Instead Of triggers to enhance the behavior of Insert, Update and Delete operations against views. Certain views contain transformations that make the view read-only. For example, Insert and Update operations cannot be performed against the my_logins view because the decrypt_char scalar function causes this view to be treated as read-only.

The my_logins view is used by applications so that the password column can be automatically decrypted on Read operations. It makes sense that, if an application performs Write operations against this view, the my_logins view automatically encrypts the data. You can define an Instead Of trigger on the my_logins view to perform this automatic data encryption. Figure 25 is an example of how to use Instead Of triggers to perform the encryption when Insert and Update operations reference the my_logins view:

    CREATE TRIGGER insert_my_logins
      INSTEAD OF INSERT ON my_logins
      REFERENCING NEW AS n
      FOR EACH ROW MODE DB2SQL
      INSERT INTO regusers
        VALUES(USER, n.system, n.login, ENCRYPT_AES(n.passwd))

    CREATE TRIGGER update_my_logins
      INSTEAD OF UPDATE ON my_logins
      REFERENCING OLD AS o NEW AS n
      FOR EACH ROW MODE DB2SQL
      UPDATE regusers
        SET system = n.system, login = n.login, passwd = ENCRYPT_AES(n.passwd)
        WHERE system = o.system AND login = o.login AND userid = USER

Figure 25. Encrypting with Instead Of triggers

When an Insert is performed against the my_logins view, DB2 calls the insert_my_logins trigger instead of signaling a read-only view error. The insert_my_logins trigger grabs the password string (n.passwd) from the Insert against the view and then runs an Insert statement against the underlying table (regusers) so that the password string is encrypted. The update_my_logins trigger operates in a similar fashion.

You cannot use the following clauses on the Create Trigger statement when creating an Instead Of trigger:

- Before and After clauses
- Of column-name clause for an Update trigger
- For Each Statement clause
- When clause

For any given view, you can only define one Instead Of trigger for each operation (Insert, Update and Delete), meaning that a view can have a maximum of three Instead Of triggers. You cannot define Instead Of triggers over a system or catalog view (for example, SYSTABLES in QSYS2).
Query optimization and performance considerations

In some cases, creating indexes on encrypted data is a good idea. Exact matches and joins of encrypted data use the indexes you create. Because encrypted data is essentially binary data, range checking of encrypted data requires table scans and is to be avoided as much as possible. In cases where searches must be performed against an encrypted column, you need to use the Encrypt functions in a way that minimizes the number of values that are decrypted in the search. For example, the optimal way to search for a user by SSN is to use the query shown in Figure 26, because it avoids decrypting every SSN in the table:

    SELECT name FROM emp WHERE socialsecurity = ENCRYPT_TDES('111222333');

Figure 26. Searching for a user by SSN by using a query

Because extra processing takes place for encryption and decryption, these data-security operations slow most SQL statements. In general, you should only use encryption for a small set of sensitive columns, such as patient name and SSN.

Note: If the WHERE clause (in Figure 26) is changed to DECRYPT_CHAR(socialSecurity) = '111222333', then DB2 must decrypt every row in the employee table before performing the comparison.

For queries that join on two encrypted columns, use the encrypted values to perform the join; there is no need to decrypt the column values first.

DB2 Field Procedure interface for encryption

The transparency of an encryption solution utilizing the DB2 FieldProc interface is demonstrated in Figure NEW1. Once the field procedure program has been registered, DB2 automatically calls the user-supplied FieldProc program on every write operation to encrypt the data and on every read operation to decode the data. This enables sensitive data such as credit card numbers to be encrypted at rest without any changes to the application. The FieldProc program is called regardless of the application or interface.
Figure NEW2 - Graphical representation of transparent column-level encryption solution using Field Procedures.

Let's examine the implementation details behind a FieldProc-based encryption solution.

Creating the FieldProc program

The first step in using the DB2 Field Procedure interface for encryption is the creation of the program. The DB2 for i SQL Programmer's Guide contains the detailed requirements for a FieldProc program. The key concepts are covered in this section.

The FieldProc program can be implemented in any ILE language. However, the ILE program cannot run any SQL statements. The program must contain logic to handle the following three types of program calls:
- FieldProc Registration
- Field Encode
- Field Decode

The FieldProc program determines the type of call by interrogating the parameter list that DB2 passes on every call to the FieldProc program. The complete layout of this parameter list can be found in one of these include files:
- SQLFP in QSYSINC/QRPGLESRC
- SQLFP in QSYSINC/H

The Registration call to the FieldProc program takes place a single time: when the SQL CREATE TABLE or ALTER TABLE statement is used to assign a FieldProc program to a column (field). When this event occurs, DB2 invokes the FieldProc program to determine the attributes (size, type, and length) of the encoded values that the FieldProc program will return for the specified column. This gives developers a lot of flexibility because it means that the encoded value can be a different length than the defined column length without any changes to the application or table definition. For example, the encryption algorithm embedded within a FieldProc program can take an input string from a 16-byte credit card number column and return an encrypted value 20 bytes in length. Internally, DB2 allocates 20 bytes to store that encrypted value within the record image.
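The three call types can be pictured with a small Python model. This is only a sketch under stated assumptions: a real FieldProc program is an ILE program that receives the parameter list defined in the SQLFP include, whereas the CallType names, the field_proc function, the 4-byte TAG prefix, and the XOR keystream below are all hypothetical and chosen for illustration. The XOR step is a placeholder for a call to a real encryption API and provides no security.

```python
import hashlib
from enum import Enum


class CallType(Enum):
    """Hypothetical labels; the real function codes come from the SQLFP include."""
    REGISTER = "register"
    ENCODE = "encode"
    DECODE = "decode"


COLUMN_LENGTH = 16                          # application's view: CHAR(16) card number
TAG = b"v001"                               # illustrative 4-byte key-version prefix
ENCODED_LENGTH = COLUMN_LENGTH + len(TAG)   # 20 bytes held in the record image


def _keystream(key: bytes) -> bytes:
    """Placeholder for a real cipher: 16 repeatable bytes derived from the key."""
    return hashlib.sha256(key).digest()[:COLUMN_LENGTH]


def field_proc(call: CallType, key: bytes, value: bytes = b""):
    """Dispatch on the call type, as a FieldProc program does on every call."""
    if call is CallType.REGISTER:
        # Report the attributes of the encoded value, not of the column itself.
        return {"type": "binary", "length": ENCODED_LENGTH}
    if call is CallType.ENCODE:
        if len(value) != COLUMN_LENGTH:
            raise ValueError("clear text must be 16 bytes")
        return TAG + bytes(a ^ b for a, b in zip(value, _keystream(key)))
    if call is CallType.DECODE:
        if len(value) != ENCODED_LENGTH or not value.startswith(TAG):
            raise ValueError("not a value produced by ENCODE")
        return bytes(a ^ b for a, b in zip(value[len(TAG):], _keystream(key)))
    raise ValueError(f"unknown call type: {call!r}")


key = b"demo key, not a real key"
attrs = field_proc(CallType.REGISTER, key)                        # once, at registration
stored = field_proc(CallType.ENCODE, key, b"1111222233334444")    # on each write
restored = field_proc(CallType.DECODE, key, stored)               # on each read
```

In this model, stored is 20 bytes long while restored is again the 16-byte clear text, which mirrors the split between the internal record image and the application's view of the column.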
DB2 always returns a 16-byte credit card number value to the application by invoking the FieldProc program to decode the encrypted value into the original data value. The attributes on the column or field definition define the application's view of the data value, while the FieldProc Registration call defines the attributes that DB2 uses for the internal storage of the encoded value.

The encoding and decoding call types are the simplest to understand. The Field Encode call occurs on every Insert or Update operation to encode the input data value. The encode logic can encrypt the data with whatever encryption algorithm or API it chooses. The encoded data value returned by the FieldProc program is what DB2 stores in the internal record image on disk.

Correspondingly, the Field Decode call to a FieldProc program occurs on every Read operation. This enables the FieldProc program to decrypt the stored value before it is returned to the application. The value returned by the Field Decode call must be the exact inverse of the Field Encode call. Data loss can occur if this requirement is not followed. When the encoding is encryption of some form, the decode operation must return the original clear text value of the data. If the decode operation returns a
masked value of the data (for example, ************4444) instead of the clear text value (for example, 1111222233334444), then data loss may occur.

The most common data loss scenario occurs with applications using the native interface. Assume that you have a table of credit card holders. The table contains the name and address of the card holder along with the credit card number. A FieldProc routine has been defined for the credit card number field that encrypts the card number on disk and returns a masked credit card number value for every user other than the system security officer. Next, assume that a native application reads a row for update. At this point, the application buffer contains the masked value of the credit card number. The native application then changes the address field in this same buffer. When the application performs the update operation using this buffer, the registered FieldProc program is called to encode the credit card number. Unfortunately, the FieldProc program is passed the masked credit card number to encrypt instead of the clear text version of the card number. Thus, data loss occurs because DB2 is now storing the encrypted version of the masked credit card number and no longer has access to the encrypted version of the clear text value.

Registering the FieldProc program

Now that you understand the coding requirements for a FieldProc program, let's examine how to register the program as a FieldProc for a column. The registration has to be performed using SQL with the Create Table or Alter Table statements. This SQL interface requirement does not mean that physical files are excluded from FieldProc support, because the Alter Table statement can operate on a physical file. As you can see in Figure NEW3, the Field Procedure syntax for the Create Table and Alter Table statements is almost identical. The only difference is the SET keyword on the Alter statement.
If the table (file) being altered contains data, the Alter Table FieldProc registration behaves differently than the Create Table registration. When the table contains data, the Alter Table processing must also call the registered FieldProc program to encode the column value in each row of the table. Depending on the number of rows in the table and the activity on the system, this can be a long-running operation. Adding a FieldProc to an existing table should be performed during a maintenance window because DB2 also holds an exclusive lock while the Alter Table statement is running.

CREATE TABLE cardholder( chname CHAR(20), chaddress CHAR(50), chcardnum CHAR(16) FIELDPROC mylib/fppgm1)

ALTER TABLE orders ALTER COLUMN cardnum SET FIELDPROC pgmlib/fppgm

Figure NEW3. An example of DB2 Field Procedure registration

As mentioned, the Alter Table statement can be used to register a FieldProc program with a column from a DDS-created physical file. When Alter Table is used to register a FieldProc program, special precautions must be taken to maintain this FieldProc program registration for that physical file. If the physical file is altered to add or remove a column with the Change Physical File (CHGPF) command, the
FieldProc registration will be lost because the CHGPF command is not capable of defining attributes that require SQL. Furthermore, the CHGPF command returns no feedback that the FieldProc registration was removed. Thus, this might be a good reason to first convert your physical file to an SQL table definition. With the SQL table definition in place, the Alter Table statement can be used to change, add, or remove columns while maintaining the FieldProc registration at the same time. You may not be aware that there is a methodology for converting a physical file to an SQL table without requiring a recompile of your application programs. That methodology is defined in the IBM Redbook titled Modernizing IBM eServer iSeries Application Data Access, which can be found at: http://www.redbooks.ibm.com/abstracts/sg246393.html. The other alternative is to always remember to rerun the Alter Table statement each time that the CHGPF command is run to change the record definition of the physical file.

FieldProc considerations

Be aware that calls to the FieldProc program for encoding and decoding are not limited to application data access interfaces. Any system command, such as Copy File (CPYF), Display Physical File Member (DSPPFM), or Reorganize Physical File (RGZPFM), that writes or reads data results in DB2 invoking the FieldProc program. The FieldProc program is also called during the creation of an index or keyed logical file that references key fields associated with a FieldProc because the index key value uses the encoded column value. As a result, this affects the keyed order of data and keyed positioning operations such as RPG Set Lower Limit (SETLL) and Set Greater Than (SETGT) because the ordering of values is determined by the encoded data value instead of the clear text data value.
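The effect on keyed order can be seen in a small Python model. This is a sketch under stated assumptions, not DB2 behavior reproduced exactly: toy_encode and toy_decode are hypothetical stand-ins for a FieldProc program (a one-byte XOR that offers no security), the sorted list stands in for an index built over encoded key values, and read_from only approximates "position to the first key that is not lower, then read forward".

```python
import bisect

KEY_BYTE = 0x05  # illustrative constant, not a real key


def toy_encode(clear: bytes) -> bytes:
    """Hypothetical stand-in for Field Encode (reversible, NOT secure)."""
    return bytes(b ^ KEY_BYTE for b in clear)


def toy_decode(encoded: bytes) -> bytes:
    """Exact inverse of toy_encode, standing in for Field Decode."""
    return bytes(b ^ KEY_BYTE for b in encoded)


def build_index(values):
    """Keys of an index over the encoded column, in ascending binary order."""
    return sorted(toy_encode(v) for v in values)


def read_from(index, start_clear: bytes):
    """Position to the first encoded key >= encode(start), then read forward."""
    pos = bisect.bisect_left(index, toy_encode(start_clear))
    return [toy_decode(k) for k in index[pos:]]


card_ids = [b"1000", b"2000", b"3000", b"4000", b"5000"]
index = build_index(card_ids)

# Order in which a keyed read would return the decoded values.
keyed_order = [toy_decode(k) for k in index]

# Rows returned after positioning at key 3000 and reading to the end.
from_3000 = read_from(index, b"3000")
```

With this toy encoding, keyed_order is 5000, 4000, 1000, 3000, 2000 rather than ascending clear text order, and from_3000 contains only 3000 and 2000, where a program written against clear text keys would expect 3000, 4000, and 5000. A real cipher scrambles the order in a less predictable way, but the consequence for keyed positioning is the same.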
Encryption implementation considerations

This reference section suggests considerations and offers recommendations regarding your implementation of encryption for DB2 for i data.

Column encryption considerations

Regardless of the interface that you choose, data encryption should be reserved only for those columns that contain personal or sensitive data. A credit-card number is an example of sensitive data that you should encrypt to protect that value. In contrast, a column that contains a customer's name is probably not an encryption candidate because that name is widely available from other sources, even though a last name is considered personal data.

Encrypting the data in all columns is not recommended. Performance and sorting restrictions are the biggest reasons for limiting encryption to a small number of columns. First, additional system processing is required when encrypting and decrypting data. Performance problems can also surface when the application has to search on a specific value. Depending on the encryption implementation, applications may have to first decrypt the value stored in the table or encrypt the search value so that the DB2 search request can compare like values. This upfront conversion requirement for searches can slow performance depending on the number of values that must be converted. Second, the data that is stored in an encrypted column cannot be sorted directly in a meaningful way. If the SSN data in a column is stored in an encrypted fashion, then any program that
displays customer data sorted by SSN needs to be changed because the encrypted versions of the SSN sort differently. You must change the program to decrypt all of the values into some temporary structure that can then make the decrypted values available for the sorting request. This obviously slows the performance of this request. Therefore, you must give careful consideration before encrypting a column that is frequently referenced in sorting or lookup operations by an application.

After you have identified the columns for encryption, one final option to consider is moving the encrypted columns to a separate table. Although this option can require extensive application changes, it does help limit access to the sensitive data that is stored in the encrypted column. For example, if the column that contains credit-card data is encrypted and moved to a separate table, any existing applications that access the customer table no longer have direct access to the credit-card column. Therefore, any retrieval of the credit-card data requires that you make explicit changes to the application to call a program or access the new table that contains the credit-card data. In addition, applications that only access nonsensitive customer data do not have to be enhanced with the ability to decrypt data.

Cryptographic services tips

What is the best cryptographic algorithm? What is a good key size? Cryptographic application programmers often ask these kinds of questions. The answers depend on the application's requirements. But generally, the following advice serves well:

Use AES, which is the new official standard and runs faster than DES and 3DES.

Do not use DES, if possible.

Use a 256-bit key. Because of collision attacks, a key of n bits actually yields n/2-bit security. In other words, it takes a workload of about 2^128 steps to break a 256-bit key.

Use CBC mode to mask patterns in the data.
A 256-bit (32-byte) block size is best because a large block size reduces the number of ciphertext collisions, which leak information. However, the AES standard supports only a 128-bit (16-byte) block size. Therefore, when using a 256-bit key with a 128-bit block size, it is important to limit how much total data is encrypted with a single key. (Otherwise, you might as well use a smaller key size.) With CBC mode, you should limit it to approximately 2^32 blocks.

To avoid identical ciphertext in the first block, you must use a substantially different initialization vector (IV) for each message. One option is to use the Generate Pseudo-random Number API to generate a unique IV value. Alternatively, you can use a unique number (known as a nonce), such as a message counter, but you must encrypt it before using it as an IV value, and you must ensure that the counter never wraps for a given key. The nonce method has advantages for communications in that you can include the information that is needed to reconstruct the nonce with your ciphertext message in typically four to eight bytes (compared to 16 bytes for a randomly generated AES IV value, which is one block long). The recipient can then reject any message numbers that are not greater than the last message received. (This applies only to encryption. For MAC operations, a null IV value is fine.)
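The two IV options in the preceding tip can be sketched in Python. Everything named here is an assumption made for illustration: secrets.token_bytes stands in for the Generate Pseudo-random Number API, NonceSender and NonceReceiver are hypothetical helper classes, and the 8-byte message number is one choice within the four-to-eight-byte range mentioned above. The sketch covers only the bookkeeping; encrypting the nonce with the block cipher before using it as the IV is deliberately left out.

```python
import secrets

BLOCK_SIZE = 16                             # AES block size in bytes; a CBC IV is one block
NONCE_BYTES = 8                             # message number sent alongside the ciphertext
MAX_COUNTER = 2 ** (8 * NONCE_BYTES) - 1    # largest value that fits in NONCE_BYTES


def random_iv() -> bytes:
    """Option 1: a fresh random IV for every message."""
    return secrets.token_bytes(BLOCK_SIZE)


class NonceSender:
    """Option 2 (sender side): a message counter that is never reused for a key."""

    def __init__(self) -> None:
        self._counter = 0

    def next_nonce(self) -> bytes:
        if self._counter >= MAX_COUNTER:
            # The counter must never wrap for a given key, so stop instead.
            raise OverflowError("counter exhausted; switch to a new key")
        self._counter += 1
        return self._counter.to_bytes(NONCE_BYTES, "big")


class NonceReceiver:
    """Option 2 (recipient side): accept only strictly increasing message numbers."""

    def __init__(self) -> None:
        self._last = 0

    def accept(self, nonce: bytes) -> bool:
        number = int.from_bytes(nonce, "big")
        if number <= self._last:
            return False        # replayed or out-of-order message number
        self._last = number
        return True
```

A sender would call next_nonce once per message and transmit the returned bytes with the ciphertext; the recipient would call accept before decrypting and discard the message when it returns False.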
You need to use encryption with a separate authentication function. Even if modified messages decrypt to nonsense, they can still cause damage. (Why? If an encrypted message is modified, the decryption operation does not always fail. The decryption operation completes as usual, but the results are garbage. An application program cannot always detect this. For example, an application might use bytes 7 and 8 of the decrypted message as a numeric value. How does the application program know that the value of 43,382 should be 200?)

Do the authentication first, then the encryption.

Do not use the same key for encryption and authentication.

Simple attacks can be performed on the MAC of a plain-text message. To prevent these attacks, concatenate the length of the input data to the start of the message, then perform a MAC operation on the entire string.

Use a strong Secure Hash Algorithm, such as SHA-256 or SHA-512, if possible. (Do not bother using SHA-384. The underlying function performs all the work of SHA-512 and then throws away part of the results.) The i5/OS V5R4 Calculate HMAC API now supports these algorithms.

Because of weaknesses in the SHA hash functions, you must always perform a double hash (that is, hash the data, then hash the result). This is unnecessary for HMAC.

Use SHA-256 HMAC for the authentication function, and use all 256 bits (32 bytes) as the MAC value, if possible.

Remote clients and encryption keys

If a remote client (for example, a Java applet running in a Web browser) sends an encryption key to encrypt or decrypt data on a System i model, then extra precautions are necessary when passing this key to the system. To protect the encryption password in these cases, consider using a communications-encryption mechanism, such as IPSec (or SSL if connecting between System i models). This protection is important when the remote client specifies the encryption password on the invocation of the SQL encryption and decryption functions.
In these cases, the encryption password is sent in the clear to DB2 for i. The password itself is not encrypted.

Summary

This white paper has demonstrated how the new encryption and decryption functions in IBM DB2 for i provide a simple way to add an extra lock of security around sensitive data. Before implementing column encryption, you must carefully consider how your application and business processes will be changed to address selective decryption of the encrypted columns in your database. Although this task is not trivial, the requirement for encrypting personal and financial data is one that cannot be ignored.
Appendix A: Resources

These Web sites provide useful references to supplement the information contained in this document:

- IBM i Information Center
  http://publib.boulder.ibm.com/eserver/ibmi.html
- Cryptographic Coprocessors for System i
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzahu/rzahucryptocardconcept.htm
- 2058 Cryptographic Accelerator
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/rzajc/rzajcaccel2058.htm
- i5/OS and 2058 Cryptographic Function Comparison
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/apis/qc3compare.htm
- Cryptographic Services APIs
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/apis/catcrypt.htm
- Cryptographic Services Master Keys
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/apis/qc3masterkeys.htm
- Cryptographic Services Key Store
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/apis/qc3keystore.htm
- Digital Certificate Manager
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/rzahu/rzahurazhudigitalcertmngmnt.htm
- Error-code parameter
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=/apiref/error.htm
- IBM Redbooks (ibm.com/redbooks)
  IBM System i Security: Protecting i5/OS Data with Encryption (SG24-7329)
- DB2 for i Online Documentation
  ibm.com/systems/i/db2/books.html
- IBM i on IBM PartnerWorld
  ibm.com/partnerworld/i
- IBM Publications Center
  http://www.elink.ibmlink.ibm.com/public/applications/publications/cgibin/pbi.cgi?cty=us
Encryption programming examples with RPG

- Scenario: Key Management and File Encryption Using the Cryptographic Services APIs
  http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topic/apis/qc3scenario.htm
- Scenario: Key Management and File Encryption Using the Cryptographic Services APIs (i5/OS V5R3 version)
  http://publib.boulder.ibm.com/infocenter/iseries/v5r3/ic2924/info/apis/qc3scenario.htm
- SystemiNetwork.com
  APIs by Example: Cryptographic Services APIs (multipart series)
  Triple-DES Encryption from RPG

Further reading

- Practical Cryptography, Niels Ferguson and Bruce Schneier (0-471-22357-3)
- Applied Cryptography, Bruce Schneier (0-471-11709-9)
- Cryptography Decrypted, H. X. Mel and Doris Baker (0-201-61647-5)
- PKCS #5: Password-Based Cryptography Standard
  http://rsasecurity.com/rsalabs/node.asp?id=2127
- Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management
  http://ietf.org/rfc/rfc1422.txt?number=1422
Appendix B: About the authors

Beth Hagemeister is an IBM System i software engineer in Rochester with 20 years of experience developing cryptographic software for the System i platform.

Kent Milligan is a DB2 technology specialist in IBM ISV Business Strategy and Solutions Enablement for the System i platform. After graduating from the University of Iowa, Kent spent the first eight years of his IBM career as a member of the DB2 development group in Rochester, Minnesota. He speaks and writes regularly on various DB2 for i5/OS relational database topics.

Appendix C: Acknowledgements

Much of this content comes from articles originally published in 2005 and 2006 by SystemiNetwork.com. Thanks to Thomas Barlen of IBM for his contributions.
Trademarks and special notices

Copyright IBM Corporation 2010. All Rights Reserved.

References in this document to IBM products or services do not imply that IBM intends to make them available in every country.

AIX, AS/400, DB2, i5/OS, IBM, the IBM logo, OS/400, PartnerWorld, Redbooks, System/38, System i, System p, System x and System z are trademarks of International Business Machines Corporation in the United States, other countries, or both.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, or service names may be trademarks or service marks of others.

Information is provided "AS IS" without warranty of any kind.

All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Contact your local IBM office or IBM authorized reseller for the full text of the specific Statement of Direction.

Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites.
The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.