Big Data, Big Security: Best Practices for Enterprise Data Encryption
Introduction
Big Data is a big topic right now, and well it should be. The ebb and flow of commerce and other interactions around the world move through Enterprise systems at close to light speed, with some companies processing hundreds of thousands of transactions every second and storing billions or even trillions of records. Even if a system is relatively secure and properly firewalled off from outside intrusion, administrators still need to be wary of the temptation that data can pose to those inside the firewall: as many as 70% of companies have had instances of internal security breaches in which data was inappropriately accessed from within the company (ASPG, MegaCryption, p. 3).
With all of that data at rest and on the move, it's no wonder that Big Data also represents a big target. That makes proper handling of data critical, and it makes following best practices for the encryption and decryption of data a matter of the highest priority for CIOs, Data Security Officers, system administrators, database administrators, and any other responsible stakeholders up and down the chain of data custody.
What steps, then, should these responsible stakeholders take to help ensure that their data is cryptographically secure? How are encryption and decryption accomplished? Are there accepted standards in the industry, and if so, what are they? A complete discussion of cryptography can (and often does!) fill entire books and volumes of specifications written in conceptually difficult ways, leaving a casual reader who is more interested in a higher-level overview struggling to find the information they need. This whitepaper addresses that gap by providing information aimed at just such a reader, so if that's you, read on!
Administrator, Know Thyself
In any guide to best practices, the first task on the to-do list should be a complete and honest appraisal of where those best practices need to be applied. In the case of data security, one might start by making a very high-level list that breaks data into two functional areas: data at rest and data in transfer. Data at rest encompasses your static files and databases, while data in transfer refers to the moments in time when your data is being moved from one device to another (via FTP, for instance). Be sure to include transportable devices like laptops and backup tapes in your data-at-rest list. Incidents of stolen laptops containing sensitive data have been in the news several times, and backup tapes containing gigabytes or terabytes of data are likewise easily lost, misplaced, or forgotten.
After you have spent the necessary time discovering everywhere in your system that encryption is possible, the next step is to decide whether you should encrypt all of that information in all of those places. Why not just encrypt everything and be done with it? Because encrypting and decrypting data, no matter how efficiently it's done, places some amount of extra load, in both time and processing resources, onto your system. The extra load may be too small to notice for a single record, but it can become very noticeable when the system has to encrypt and decrypt across thousands or millions of requests in a very short span of time.
BEST PRACTICE: Know what data you want to encrypt.
Choices, Choices
Under such real-world system constraints, then, choosing which data to encrypt is a real-world issue and not just an academic exercise. Do names need to be encrypted? What about phone numbers, or email addresses? There are two main drivers to consider when making these important decisions: internal business rules, and external compliance rules stemming from either government or industry bodies (in the United States, think HIPAA or PCI, for example).
Of course, every industry is potentially governed by different standards, and it's beyond the scope of this whitepaper to list all compliance mandates for every industry. But beyond those mandates, which will effectively make some of your choices for you, lie the business rules, and here is where you will need to consider what is important for your company to protect and what might perhaps remain unencrypted in the interests of speed.
What should your company choose to encrypt? Again, no whitepaper can answer that for you in absolute terms, and you should not rely on this whitepaper to do so, either. But as a way to begin thinking about it, ask yourself questions along the lines of "If this particular data is stolen, will it be potentially damaging or embarrassing to either outside parties (like your clients) or to your company?" If you can answer yes to questions like that (and again, ask yourself a lot of those kinds of questions), that particular data probably needs to be encrypted.
Cryptography in a Nutshell
At this point it might be useful to slow down for a minute, take a deep breath, and briefly explain what all this encryption and decryption stuff (i.e., cryptography) is all about, so that we're all working with the same vocabulary. Although some extend the definition, cryptography is the science (and possibly the art, too) of using math to encrypt and decrypt data (Network Associates, p. 11). Fair enough. But how does it all work?
Let's start with our regular, unencrypted data and call that plaintext. How do we end up with encrypted text, a.k.a. ciphertext? (For purposes of this simple explanation, we're going to refer to all data as "text.") By passing a key to the encryption algorithm. Pass an algorithm a given key and given plaintext, and it will generate some ciphertext. Same plaintext but different key? Different ciphertext. To decrypt, the process is reversed: apply a key to the ciphertext, and (assuming the key is correct) voilà, plaintext.
If the same key is used to both encrypt and decrypt a given piece of ciphertext, then that cryptographic algorithm is known as a symmetric algorithm. Likewise, if the keys used for encryption and decryption in a given algorithm are different, then what we're dealing with is an asymmetric algorithm.
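The key, plaintext, and ciphertext relationship described above can be sketched in a few lines of Python. This is a toy illustration only: the `keystream`, `toy_encrypt`, and `toy_decrypt` names are hypothetical, the construction (XOR against bytes derived from SHA-256) is not a vetted cipher, and it should never be used to protect real data. It exists purely to show that the same key reverses the operation (a symmetric scheme) and that a different key yields different ciphertext.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` key-dependent bytes by hashing key + counter (toy only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR each plaintext byte with the key-derived stream to get ciphertext."""
    stream = keystream(key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    return toy_encrypt(key, ciphertext)


if __name__ == "__main__":
    plaintext = b"transfer $100 to account 42"
    ciphertext = toy_encrypt(b"key-one", plaintext)

    # Same key in both directions: this is what makes the scheme symmetric.
    print(toy_decrypt(b"key-one", ciphertext) == plaintext)
    # Same plaintext, different key: different ciphertext.
    print(toy_encrypt(b"key-two", plaintext) != ciphertext)
    # Wrong key on decryption: the result is not the original plaintext.
    print(toy_decrypt(b"key-two", ciphertext) != plaintext)
```

In practice the algorithm itself would come from a reviewed cryptography library implementing a standard cipher; only the shape of the interface (key plus data in, transformed data out) carries over from this sketch.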
BEST PRACTICE: Know what algorithms your software should be using.
Algorithms, Algorithms Everywhere
All of which might lead one, understandably, to wonder which algorithm to choose when deciding on an encryption paradigm. The basic rule of thumb is that symmetric algorithms are quite a bit more efficient in terms of the speed and computing power required. Asymmetric algorithms, on the other hand, are often regarded as more secure in practice because the private key never has to be shared, but they take up more of your system resources. Thus, if speed and performance are issues (you're running a very high-volume, real-time system, for example), symmetric encryption is probably the way to go. If you can tolerate the performance hit and are more concerned with a higher level of security, looking at asymmetric algorithms might be a good choice.
Once you've decided which side of the symmetric/asymmetric fence you want to land on, you'll find that even more choices confront you. Even if your particular industry is not (currently) impacted by government mandates on data security, you would probably do well to at least consider the standards set by the United States government and promulgated through the National Institute of Standards and Technology (NIST).
Algorithms, cont'd
Currently (early 2013) NIST approves three symmetric algorithms for use wherever the United States government is concerned: Advanced Encryption Standard (AES), Triple-DES, and Skipjack. On the asymmetric side, the approved algorithms are DSA, RSA, and ECDSA (NIST, 2013). What these approvals mean to you, in terms of best practice, is that as long as the federal government does not change its standards, you will generally be better off choosing cryptography software that is built on one or more of those algorithms. Why? Because NIST approval for cryptographic algorithms comes only after a long process in which the algorithms are carefully scrutinized by cryptography experts, including the U.S. National Security Agency, so you can be reasonably certain that approved algorithms are secure. You'll also be well-positioned to take advantage of any government contracts that might come your company's way, because if government data is involved these algorithms aren't just approved, they're required.
BEST PRACTICE: Use authentication to preserve data integrity.
Getting the Story Straight
So, you've figured out what data to encrypt and chosen good encryption to get the job done. Great! Eventually, though, you're going to want to transmit or otherwise work with that data, and when you do, be sure to authenticate it. What authentication means, in this context, is the ability to verify that the data has not been tampered with since it was encrypted. Why is that important? Because one way that many hackers go about breaking your encryption is by tampering with your ciphertext before it is decrypted. They do that in the hope that the decryption process on the tampered-with text will give them some small piece of information that they can use to get their foot in the door, so to speak, on cracking your code. They can then repeat that process over and over, hoping for some small piece of information on each attempt.
These kinds of attacks are best warded against by authenticating the data. In simple terms, authentication is a way for the encryptor to let the decryptor know what the ciphertext should look like. If the ciphertext doesn't look like it should, the decryptor knows that the data has been tampered with or otherwise corrupted, and decryption should not be attempted.
The upshot to you? Whatever cryptography software you use, be sure it uses authenticating algorithms so that data integrity is always maintained. Good algorithmic choices for authentication include keyed constructions built on the SHA algorithms, such as HMAC-SHA1 and HMAC-SHA2. Simple checksums such as CRC and ADL can flag accidental corruption, but because anyone can recompute them, they should not be relied on to detect deliberate tampering.
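As a rough sketch of the "check before you decrypt" idea described above, the following Python example attaches an HMAC-SHA256 tag to a piece of ciphertext and refuses to hand the ciphertext on if the tag does not match. It uses only the standard library's `hmac` and `hashlib` modules. The `seal` and `open_sealed` function names, the tag-appended-to-ciphertext layout, and the use of a separate MAC key are assumptions made for illustration, not a description of any particular product.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # HMAC-SHA256 tags are 32 bytes


def seal(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Encryptor side: append an HMAC-SHA256 tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_sealed(mac_key: bytes, sealed: bytes) -> bytes:
    """Decryptor side: verify the tag first; return the ciphertext only if it matches."""
    if len(sealed) < TAG_LEN:
        raise ValueError("message too short to contain a tag")
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # compare_digest runs in constant time, so the comparison itself leaks nothing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: do not decrypt")
    return ciphertext


if __name__ == "__main__":
    mac_key = b"hypothetical-mac-key-for-demo-only"
    sealed = seal(mac_key, b"\x8f\x02\x41\x9c opaque ciphertext bytes")

    # Untouched data passes the check.
    print(open_sealed(mac_key, sealed))

    # Flip one bit of the ciphertext: verification now fails.
    tampered = bytes([sealed[0] ^ 0x01]) + sealed[1:]
    try:
        open_sealed(mac_key, tampered)
    except ValueError as err:
        print(err)
```

Because the tag depends on a secret key, an attacker who alters the ciphertext cannot simply recompute a matching tag, which is the property an unkeyed checksum lacks.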
BEST PRACTICE: Manage Your Keys Carefully.
The Keys to the Kingdom
Of course, the problem with any cryptography that relies on keys is that anyone with access to those keys doesn't have to do any hacking at all: they can simply decrypt into plaintext at will. That means that proper key management is critical and cannot be overlooked. Keys should be stored independently of the data they encrypt (under the doormat, after all, is the first place that any thief will look for your house key), and your key storage itself needs to be secure. You should also maintain controls over who has access to keys, since internal data theft is a persistent problem across the Enterprise. Policies that detail what a key should look like should also be in place: "password" and "123456" are never great choices! Finally, have a plan to change your keys periodically, as well as a strategy to back up the database or server where keys are stored.
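To make the point about what a key should look like concrete, here is a minimal Python sketch that generates key material from the operating system's cryptographically secure random source via the standard library's `secrets` module, rather than deriving it from a human-chosen word. The `generate_key` name and the 32-byte default length are assumptions chosen for illustration; the right length depends on the algorithm your software uses.

```python
import secrets


def generate_key(num_bytes: int = 32) -> bytes:
    """Return `num_bytes` of unpredictable key material from the OS random source."""
    if num_bytes < 16:
        # Hypothetical policy floor for this sketch; set it per your own standards.
        raise ValueError("key too short for this example policy")
    return secrets.token_bytes(num_bytes)


if __name__ == "__main__":
    key = generate_key()
    # Hex is a convenient way to hand a key to a separate, secured key store.
    print(len(key), "bytes:", key.hex())
```

A key produced this way has no structure for an attacker to guess, unlike "password" or "123456". Storing it apart from the data, restricting who can read it, and rotating it on a schedule remain separate tasks that this sketch does not address.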
Conclusion
As mentioned earlier, it's far beyond the scope of this whitepaper to discuss all aspects of encryption in great depth when the topic, in all reality, could fill books. Rather, we hope you found this work educational and useful, and that you will use it, if needed, as a springboard to further research and planning. Keep in mind that no document or book, however detailed, can substitute for proper analysis and planning on your part as you evaluate your company's cryptography practices. If you have any further questions or would like to do more reading, feel free to check out the sources in the bibliography. You can also feel free, of course, to contact us at Advanced Software Products Group. We've been doing cryptography since 1986 and would be happy to answer any questions you have about cryptography software.
About Advanced Software Products Group
ASPG is an industry-leading software development company with IBM and Microsoft certifications. For over 25 years it has been producing award-winning software for data centers and mainframes, specializing in data security, storage administration, and systems productivity, and providing solutions for a majority of the GLOBAL 1000 data centers. For more information about ASPG, please contact our sales team by phone at 800-662-6090 (Toll-Free) or 239-649-1548 (US/International), by fax at 239-649-6391, or by email at aspgsales@aspg.com. You can also visit the ASPG website at www.aspg.com.
Bibliography
Advanced Software Products Group (ASPG). MegaCryption Enterprise Cryptography Toolkit. Retrieved 2/28/2013 from http://aspg.com/pdfs/megacryptionbrochure.pdf.
Henderson, Stu. 2010. How to Secure Mainframe FTP. Retrieved 2/25/2013 from http://stuhenderson.com/ftpsec10.pdf.
National Institute of Standards and Technology. 2013. Page retrieved 2/25/2013 from http://csrc.nist.gov/groups/stm/cavp/index.html.
Network Associates. 1999. An Introduction to Cryptography. Retrieved 2/20/2013 from ftp://ftp.pgpi.org/pub/pgp/6.5/docs/english/introtocrypto.pdf.