A Cyphertite White Paper
February 2013

Cloud-Based Backup Storage Threat Models
Definition of Terms

Secrets Passphrase: The secrets passphrase is the passphrase used to decrypt the two encrypted 256-bit AES keys in the secrets file. The secrets passphrase, the 1024-bit salt and the round count are used with PBKDF2 to decrypt both of the encrypted 256-bit AES keys. If the secrets passphrase is lost, any data backed up to Cyphertite.com under it is irrevocably lost.

Secrets File: The secrets file is composed of several pieces of cryptographic data used to encrypt chunks on a Cyphertite client machine: two encrypted 256-bit AES keys, a 1024-bit salt and a round count for PBKDF2, and a checksum for the rest of the data.

Account Credentials: A user's account credentials are the username and password combination chosen by the user during account creation on Cyphertite.com. This information is usually stored in the user's Cyphertite.conf file on their client machine, but it can also be entered on demand when running Cyphertite from the command line if Cyphertite.conf does not contain it.

Metadata File: A metadata file is a file that is required to restore a given backup. It is a list of filenames and directories, each of which corresponds to a list of chunks, indexed by their SHA1 hash, that can be decrypted to reassemble individual files. It allows a single chunk to be stored once and referenced multiple times, saving computation time and drive space.

Certificate Bundle: The certificate bundle is a collection of small files that contain cryptographic data for network layer encryption: a CA certificate, a client certificate and a 521-bit ECDSA certificate authority keypair. Both certificates are signed using 521-bit ECDSA CA keys, and all ECDSA keys use curve secp521r1, the NIST/SECG curve over a 521-bit prime field.

Introduction

Good disaster recovery (DR) practice requires keeping usable business-critical backups offsite.
Organizations have traditionally implemented this by writing backups to tape and shipping the tapes to be stored offsite. This is costly and operationally complex, requiring hardware, personnel, and sound procedures to ensure that the offsite backups are up to date, secure, and able to be recalled and used in the face of disaster.

Cloud-based backups are an attractive alternative to traditional methods because offsite storage is inexpensive, deduplication minimizes the use of bandwidth and drive space, and there is nothing to set up beyond the client on the local machine. However, when sensitive data is being backed up and the enterprise's responsibility encompasses the safekeeping of that data, e.g. social security or credit card numbers, transmitting that data across the internet to an offsite location requires heightened attention to security and privacy issues.

Current cloud-based backup systems often encrypt data only while it is in transit to the offsite server. Data is handled without encryption by both the local machine and the remote storage system so that it can be deduplicated, usually in a global deduplication pool.

Cyphertite offers a solution to the security problems associated with cloud-based deduplicated backups, making high levels of data security attainable while still realizing the cost savings of realm-wide deduplicated backup through the cloud. By encrypting data prior to transmission with cryptographic keys that are unique to each Cyphertite account and only ever reside with the account owners, Cyphertite puts the keys to securing data into the hands of the IT professionals who manage it.

Cyphertite Transparency

When security is a serious concern, those responsible for data want to be in control of how that data is secured. As an open source project, Cyphertite is fully inspectable by those who are using it to secure their DR data. They can see precisely how the Cyphertite code handles their data, and they can choose precisely how secure they want that data to be. Community testing and inspection help keep Cyphertite's security robust and current with the highest industry standards.

The Threat Models

There are four threat models concerning disaster recovery data:

1. Client Machine Compromise
2. Client Machine Physical Theft
3. Eavesdropping and Interception
4. Offsite Storage Facility Data Disclosure

1. Client Machine Compromise

If an intruder gains access to a client machine's root or administrator user, the intruder has access to the DR data for the account the client software is configured to use.
There is nothing any backup software client can do to distinguish an intruder from the machine owner if the intruder authenticates as the owner. Even if steps are taken to secure the backup data, such as not storing the passphrase or account credentials on the client machine, there are several techniques an intruder could use to capture the passphrase and credentials, such as using a keylogger or other data capture techniques. Suffice it to say, in the case of a client machine compromise, all of the data on the client machine and the DR data have been compromised.
2. Client Machine Physical Theft

For maximum protection in the case of the physical theft of a client machine or its drives, hard drive encryption offers a first layer of protection for all the data on the drives. The hard drive encryption would have to be cracked before the intruder could even access the Cyphertite backup software and then attempt to access the DR data. In all likelihood, physical theft of an encrypted client hard drive would not result in the disclosure of sensitive data.

If the drive is not encrypted, then the intruder has access to all the data on the drive, and the backup software client becomes the only barrier preventing the intruder from accessing the offsite DR data. If the credentials are visible in the backup software client and no barriers to accessing DR data have been introduced, the intruder has access to the DR data.

Cyphertite can be configured to impose one or more barriers and/or layers of encryption between an intruder and DR data in the case of the physical theft of a client machine. To use the Cyphertite client software on the stolen drive to gain access to the offsite DR data, the intruder needs the following Cyphertite configuration elements:

Secrets Passphrase
Secrets File
Account Credentials
Metadata File
Certificate Bundle

In brief, the secrets passphrase unlocks the secrets file, which unlocks the DR data chunks, which can be retrieved from the Cyphertite remote storage facility server using the account credentials, the certificate bundle and the metadata files (see figure below).

FIGURE 1: Cyphertite Process (diagram labels: Secrets Passphrase; Secrets File with 2 Keys, Salt and Round Count; PBKDF2; Decrypted Keys; User Account with Account Credentials + Certificate Bundles; Client with Decrypted Metadata and Decrypted Files; Internet; Server with Account Credentials, Encrypted Data Chunks and Encrypted Metadata)
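The first step in Figure 1, turning the secrets passphrase, salt and round count into key material with PBKDF2, can be sketched as follows. This is a minimal illustration and not Cyphertite's actual code: the choice of HMAC-SHA256 as the PBKDF2 pseudorandom function, the round count of 100,000, the function name `derive_unlock_key` and the sample passphrase are all assumptions made for the example. Only the 1024-bit salt and the 256-bit key size come from the definitions in this paper. How the derived key is then used to decrypt the two AES keys in the secrets file is not shown.

```python
import hashlib
import os

SALT_BYTES = 128  # 1024-bit salt, as described for the secrets file
KEY_BYTES = 32    # 256-bit derived key


def derive_unlock_key(passphrase: str, salt: bytes, rounds: int) -> bytes:
    """Derive a 256-bit key from the secrets passphrase with PBKDF2.

    HMAC-SHA256 is an assumption for this sketch; the paper does not
    name the pseudorandom function used by Cyphertite.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        rounds,
        dklen=KEY_BYTES,
    )


if __name__ == "__main__":
    salt = os.urandom(SALT_BYTES)  # stored in the secrets file
    rounds = 100_000               # hypothetical round count
    key = derive_unlock_key("example passphrase", salt, rounds)
    print(len(key) * 8, "bit key derived")
```

Because the salt and round count live in the secrets file while the passphrase does not have to, an intruder holding only the secrets file still has to guess the passphrase, and each guess costs `rounds` iterations of the underlying hash.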
Where and how these five elements are stored determines how safe the DR data is in the case of the physical theft of an unencrypted client drive. It is possible to store any combination of one or more of these elements on or off the client machine, and to store the secrets file and/or the metadata file on or off the Cyphertite storage facility server (the server always retains a copy of the certificate bundle and account credentials in order to authenticate clients). For a grid of all possible permutations of securing the five elements necessary to retrieve DR data from the Cyphertite server via the Cyphertite client in the case of a stolen, unencrypted hard drive, see Appendix 1.

For maximum security, all five of the elements can be stored off of the client hard drive, without storing a copy of the secrets file or the metadata files on the Cyphertite server. In this case, an intruder would first have to obtain the account credentials, the certificate bundle and the metadata file to retrieve the DR data from the CT server. If they were somehow able to obtain those three elements and retrieve the DR data from the CT server, the intruder would then be faced with breaking 256-bit AES-XTS and then 256-bit HMAC SHA256. There are two routes to accomplish that decryption: brute force, or gaining access to the contents of the secrets file. For the second route, the intruder would first need to acquire the secrets file, and would then need either to acquire the secrets passphrase or to break the PBKDF2-based encryption of the secrets file by brute force.

In short, storing all five elements off of the client and the secrets and metadata files off the server introduces the following barriers to accessing DR data on the CT server:

1. Acquiring the Account Credentials
2. Acquiring the Certificate Bundle
3. Acquiring the Metadata File

Additionally, it introduces the following barriers to decrypting that data if the intruder manages to retrieve it. The intruder must either:

1. Brute force decrypt 256-bit AES-XTS
2. Decrypt the secrets file, either via the secrets passphrase or by brute forcing PBKDF2

The owner of the machine, however, also needs all five elements in order to restore data from the offsite cloud-based storage. Maximum protection therefore also means maximum inconvenience for the user. In order to perform a restore, the user would have to provide all five Cyphertite configuration elements. The user would also assume responsibility for securely storing these elements somewhere other than the client machine or the Cyphertite server.
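The rule that both the owner and an intruder need all five elements can be expressed as a simple set check. The sketch below is only a model of that rule, not part of the Cyphertite client: the constant `REQUIRED_ELEMENTS` and the functions `missing_elements` and `can_restore` are hypothetical names, and the model deliberately ignores brute force and any elements an intruder might fetch from the server.

```python
# The five configuration elements named in this paper.
REQUIRED_ELEMENTS = frozenset({
    "secrets passphrase",
    "secrets file",
    "account credentials",
    "metadata file",
    "certificate bundle",
})


def missing_elements(available):
    """Return the required elements that are not in `available`."""
    return REQUIRED_ELEMENTS - frozenset(available)


def can_restore(available):
    """True only when all five elements are at hand (hypothetical model)."""
    return not missing_elements(available)


if __name__ == "__main__":
    # A drive that holds all five elements versus one that holds none.
    everything_on_drive = REQUIRED_ELEMENTS
    nothing_on_drive = frozenset()
    print(can_restore(everything_on_drive))          # whoever holds the drive can restore
    print(sorted(missing_elements(nothing_on_drive)))  # all five must be supplied
```

The same check applies to the legitimate owner: every element kept off the client is one more item the owner must supply before a restore can proceed.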
The default Cyphertite client setup is geared toward maximum convenience and stores all five elements on the client machine: the Cyphertite account username and the secrets passphrase are stored in the Cyphertite client config file, and the certificate bundle, the metadata file and the encrypted secrets file are all stored on the local drive. In addition, the default Cyphertite client setup stores a copy of the secrets file and an encrypted copy of the metadata file on the Cyphertite server. With the default setup, since the secrets passphrase is visible in the Cyphertite configuration file, a stolen unencrypted hard drive means the intruder has access to all of the DR data from the account the client is configured to use.

SECURING THE FIVE CONFIGURATION ELEMENTS:

Securing the Secrets Passphrase

To increase the security of backup data in the case of a stolen, unencrypted hard drive, the secrets passphrase can be secured by storing it off of the client machine. The secrets passphrase is never transmitted to the CT server. Since it would not be stored in the Cyphertite configuration file and is never transmitted to the CT server, the user would have to secure that passphrase and provide it when performing operations with the Cyphertite client. In this case, if the other four elements are configured to be stored on the client, an intruder could access the encrypted DR data on the Cyphertite server, but would then be faced with decrypting that data. The weakest barrier would be the PBKDF2-based encryption of the secrets file, which, if cracked, would give the intruder access to the DR data.

Securing the Secrets File

To further increase the security of DR data, the Cyphertite client can be configured to store the secrets file off of both the client and the server. Without the secrets file, the weakest barrier to DR data would be the 256-bit AES-XTS encryption used in the metadata files and the data chunks.
Securing the secrets file becomes the responsibility of the user, who must provide it to perform Cyphertite client operations. The secrets file can be stored off of the client and on the server, but unless the account credentials and certificate bundle are secured, an intruder can simply retrieve the secrets file from the server.

Securing the Metadata File

Each time a backup operation is performed, the Cyphertite client produces a metadata file which describes how meaningful files are constructed out of the chunked DR data. When the metadata files are secured, the intruder is forced to manually reconstitute meaningful files from the chunked data. To secure the metadata files, they can be stored off of both the client and the server. It becomes the responsibility of the user to securely store the metadata file and provide it when restoring DR data. Metadata files can be stored off the client and on the server, but unless the account credentials and the certificate bundle are secured, the intruder can simply retrieve the metadata files from the Cyphertite server. They cannot, however, decrypt the metadata files without the secrets passphrase.

Securing the Account Credentials

In the case of a stolen, unencrypted hard drive, storing the account credentials off the client introduces a barrier to an intruder attempting to access the Cyphertite server in order to acquire DR data. Securing the account credentials also affects the security of both the secrets file and the metadata file if these are configured to be stored on the server. If the secrets file or the metadata file is stored off the client and on the server, then the security of the account username and password becomes the primary barrier to acquiring those files. When the account credentials are stored off the client, their security becomes the responsibility of the user, who must provide them in order to perform Cyphertite client operations.

Securing the Certificate Bundle

It is possible, though cumbersome, to keep the certificate bundle off the client. This has a similar effect to storing the account credentials off the client, provided the account credentials are not stored on the client either. If the account credentials are stored on the client, it is possible to retrieve the certificate bundle via the web interface for the account, which would then give access to the encrypted DR data on the server.

3. Eavesdropping and Interception

Cyphertite cloud-based storage has two separate layers of encryption: client data is first encrypted prior to transmission and then encrypted again for transmission over the internet.
The chunks of backup data are encrypted with 256-bit AES-XTS prior to leaving the client machine, and those encrypted chunks are encrypted a second time for transmission using a 256-bit AES-CBC session key, with 521-bit ECDSA keypairs used for session key exchange.

The purpose of using encryption over the network is to prevent eavesdropping on client transfers to and from the Cyphertite servers. Eavesdropping on DR data would require extracting all of the 256-bit AES-CBC session keys, which rotate regularly, e.g. every 60 minutes. Connections can be intercepted and subjected to a Man-In-The-Middle (MITM) attack, but this will be detected if the client is using the correct certificate files supplied when signing up for an account.

If a Cyphertite client's account credentials are somehow intercepted, for example via a plaintext email between two employees (not great practice, but it could happen), the DR data would still be encrypted using session keys that are unknown to the eavesdropper. Even if the account credentials *and* the key and cert bundle are known to an eavesdropper, the chunks and metadata files transmitted are themselves encrypted with 256-bit AES-XTS.

FIGURE 2: Cyphertite Transmission Encryption (diagram labels: Data Chunks; Client; Cyphertite Server; System Encryption, 256-bit AES-XTS; Transmission Encryption, Keys & Certs System, 521-bit ECDSA)

4. Offsite Storage Facility Data Disclosure

In the real world, there is always the chance that the unforeseen can happen, and while unlikely, data on the Cyphertite server could be disclosed under a court-granted subpoena. Data could also be disclosed in the case of physical or password theft at a Cyphertite storage facility. In all cases, DR data on the CT server is protected by at least the PBKDF2-based encryption within the secrets file, which even CT could not decrypt since the secrets passphrase only ever resides with the account owner.

Cyphertite client configuration also allows for varying degrees of security and convenience where the remote storage facility is concerned. It is possible to configure the Cyphertite client to transmit its cryptographic key files and/or its metadata files to the Cyphertite server. This is the least secure but most convenient configuration. With it, in order to perform a restore, the user needs to provide the account credentials, the certificate bundle and the secrets passphrase.
A storage facility security breach in this scenario would mean that DR data is protected by the PBKDF2-based encryption of the secrets file, since the secrets passphrase is never transmitted to the Cyphertite server.

For added security, the client can be configured not to transmit the secrets file to the Cyphertite server. In this scenario an intruder would need to decrypt the 256-bit AES-XTS encryption of the DR data either by brute force or by acquiring the account's secrets file from the account owner.
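To give a sense of scale for the brute force route, the arithmetic can be sketched as below. This is a rough illustration only: it treats the problem as an exhaustive search over a 256-bit keyspace, and the guess rate of one trillion keys per second is a purely hypothetical figure chosen for the example. The function names are likewise invented for this sketch. It does not model guessing the secrets passphrase, whose cost depends on the passphrase strength and the PBKDF2 round count.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def keyspace_size(key_bits: int) -> int:
    """Number of possible keys for a key of the given length."""
    return 2 ** key_bits


def worst_case_years(key_bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key at a constant (hypothetical) guess rate."""
    seconds = keyspace_size(key_bits) / guesses_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical attacker testing 10**12 keys per second.
    years = worst_case_years(256, 1e12)
    print(f"{years:.2e} years to exhaust a 256-bit keyspace")
```

Under these assumptions the result is on the order of 10^57 years, which is why acquiring the secrets file and passphrase is treated in this paper as the realistic route rather than brute force against the chunk encryption.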
For even more security, the Cyphertite client can be configured not to transmit the metadata file created during the backup process to the Cyphertite server. In this scenario, the intruder would be forced to manually reconstitute meaningful files from any chunked data they managed to decrypt.

Summary

Data backup security, including accessibility and privacy, is without question one of the leading issues for IT professionals and the organizations they serve. Control, transparency, and accountability are critical attributes that informed managers value as the stakes in today's networked data environment rise. With its security protocols, its transparently inspectable code and its respected track record in the open source community, Conformal Systems' Cyphertite is uniquely positioned to provide a credible and responsible answer to the threat models outlined in this paper. For more information, visit www.cyphertite.com.
Appendix 1

UNENCRYPTED CLIENT THEFT SCENARIOS

Any combination of the pieces of data below may be stored off the CT client (2^5 = 32 configurations). Each piece of data stored off the CT client carries an increase in security along with the inconvenience of storing it off the client.

CT CLIENT DATA    SECURITY GAIN    INCONVENIENCE    1 3
Secrets Passphrase    1
Secrets File    1
Metadata File    3
Account Credentials    1
Certificate Bundle    2

Due to the inconvenience and complexities associated with storing CT client data off the machine, we recommend the use of full disk encryption on clients whenever feasible.

Appendix 2

SERVER DATA DISCLOSURE SCENARIOS

A number of scenarios may arise in which data is forcibly disclosed from the server side; these are listed in Section 4. The security of DR data on the server depends on which pieces of data are stored on the server. As with the unencrypted client scenarios, there is a level of inconvenience attached to each piece of data stored off the server.

CT SERVER DATA    SECURITY GAIN    INCONVENIENCE    1 3
Secrets File    1
Metadata File    3
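The count of 2^5 = 32 client configurations given in Appendix 1 follows from each of the five elements being independently stored either on or off the client. A short sketch that enumerates them is given below. The names `ELEMENTS` and `off_client_configurations` are introduced for this illustration only, and the sketch makes no attempt to score security gain or inconvenience.

```python
from itertools import product

# The five pieces of CT client data listed in Appendix 1.
ELEMENTS = (
    "Secrets Passphrase",
    "Secrets File",
    "Metadata File",
    "Account Credentials",
    "Certificate Bundle",
)


def off_client_configurations():
    """List every choice of elements to store off the client.

    Each configuration is a tuple naming the elements kept off the
    client; the empty tuple means everything stays on the client.
    """
    configurations = []
    for flags in product((False, True), repeat=len(ELEMENTS)):
        stored_off = tuple(
            name for name, is_off in zip(ELEMENTS, flags) if is_off
        )
        configurations.append(stored_off)
    return configurations


if __name__ == "__main__":
    configs = off_client_configurations()
    print(len(configs), "configurations")
    for stored_off in configs:
        print(", ".join(stored_off) or "(nothing stored off client)")
```

The same enumeration with the two server-side items in Appendix 2 (secrets file and metadata file) would yield 2^2 = 4 server configurations.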