IMPLEMENTING CRYPTOGRAPHY IN DATABASE KWAN SHONG HANN A thesis submitted in fufflilment of the requirements for the award of the degree of Bachelor of Computer Science (Computer Systems & Networking) Faculty of Computer Systems & Software Engineering ljniversiti Malaysia Pahang APRIL 2010
V ABSTRACT With the increasing dependency of database for data storage, many sensitive data such as personal information data and credit card information data are being stored inside a database. These data is so valuable that it attracts unauthorized personnel to gain access for obtaining the data for further usage. Even though the network infrastructure nowadays is protected with different kind of security measures, there is none of them is able to block all the threats perfectly. Therefore, the data needs to be able to protect its confidentiality even though all measures have been failed. This is where the importance of cryptography in protecting the data confidentiality even the data is in the hands of the attacker. Cryptography can be implemented in different kind of methods to secure the database. The purpose of this study is to develop a cryptosystem that is able to implement cryptography to the data before storing them into the database. This implementation has been strengthened by introducing two approaches which are the classification of types of data using Key Family and the classification of data encryption key status according to the activation time. Key Family separates types of data such as personal information data and credit card information data. Both families use different data encryption key to encrypt and decrypt to limit the access of the attacker if one of the key is obtained. The state of data encryption key is determined by the activation date which the key which has the latest time will be activated and the old ones will be expired. This will prevent the key remains too long for encryption and decryption which poses risks for threats to break the key. This system is developed using Java programming language with the use of Java Cryptography Extension for the cryptography process. MySQL database is used as the protected database where all the data in the database is encrypted for protection. Finally, it is hope that this system can provide better security for data confidentiality and also become the last line defense of data towards the attacks.
vi ABSTRAK Pengkalan data pada hari mi semakin diperlukan untuk menjadi tempat penyimpanan data utama kepada perisian computer yang dibangunkan. mi menyebabkan lebih banyak data yang sensitif seperti data peribadi individual dan data credit card. Data tersebut adalah amat berharga sehingga ia mendorongkan pencerobohan terhadap pengkalan data. Walaupun sistem rangkaian pada hari mi telah dilengkapi dengan pelbagai jenis cara untuk mencegah pencerobohan terhadap rangkaian, tiada satu cara yang dapat mencegahnya dengan sempuma. Oleh itu, data hendaklah diubahsuai supaya ia dapat mempertahankan inlormasi yang terkandung dalamnya. mi dapat dicapai dengan menggunakan teknik Cryptography terhadap data. Tujuan penyelidikan mi adalah untuk membangunkan satu sistem akan mengapplikasikan cryptography terhadap data sebelum ia disimpan dalam pengkalan data. Applikasi mi telah diperkuatkan dengan penggunakan Key Family yang akan mengumpulkan jenis data yang sama menjadikan keluarga yang sama. mi adalah untuk mengelakkan penceroboh dapat mendapatkan semua data dengan hanya mendapatkan satu kunci tersebut. Kunci yang digunakan untuk encrypt akan diperkenalkan dengan menggunakan masa activasi untuk mengubah pengguna.an kunci dari semasa ke semasa. Kunci yang lama akan diganti dengan kunci yang mempunyai masa yang terbaru. mi adalah untuk mengurangkan masa kunci yang digunakan kerana masa adalah satu element yang penting bagi penceroboh untuk mencapai kunci tersebut. Sistem mi dibangunkan dengan menggunakan bahasa pengaturcara Java dengan applikasi Java Cnyptographty Extension yang digunakan untuk process encrypt dan decrypt. Akhir sekali, dihanap sistem mi dapat memberikan pertahanan yang lebih baik kepada penyimpanan data di dalam pengkalan data dan ia dapat menjadi satu pentahanan yang terakhir kepada data terhadap ancaman-ancaman.
VII TABLE OF CONTENTS CHAPTER TITLE PAGE INTRODUCTION 1 1.1 Introduction 1 1.2 Problem Statement 3 1.3 Objective 4 1.4 Scope 5 2 LITERATURE REVIEW 6 2.1 Introduction to Database 6 2.1.1 Types of Databases 7 2.1.2 Overview of Existing Database Security 10 Technology 2.1.3 Database Security 14 2.2 Introduction to Cryptography 15 2.2.1 Symmetric Cryptography 15 2.2.2 Public-Key Cryptography 15 2.2.3 Hashing Algorithm 16 2.3 Methods of Implementing Cryptography in Database 17 2.3.1 Implementation of Cryptography within 17 Database 2.3.2 Implementation of Cryptography outside 19 Database 2.4 Overview of Current Data Encryption Product 21 2.4.1 Encryptionizer DE 21 2.4.2 RSA BSAFE Encryption Software 22
VIII 2.5 Conclusion 23 METHODOLOGY 3.1 Introduction 24 3.2 Project Method 24 3.2.1 Project Selection and Planning 27 3.2.2 Software Requirements Analysis 28 3.2.3 System Analysis and Design 29 3.2.3.1 Hardware Requirement 34 3.2.3.2 Software Requirement 35 3.2.4 Code Generation 36 3.2.5 Testing 37 3.3 Conclusion 38 84 4 IMPLEMENTATION 39 4.1 Introduction 39 4.2 Key Vault 41 4.2.1 Local Key Class 41 4.2.2 Local Key Store Class 42 4.3 Key Manifest 47 4.4 Key Tool 53 4.5 Engine 56 4.6 Provider 58 5 RESULTS AND DISCUSSION 59 5.1 Introduction 59 5.2 Results & Discussion 61 5.2.1 Generate Key-Encrypting Key Module 61 5.2.2 Generate New Data Encryption Key Module 61 5.2.3 Search Created Data Encryption Key Module 61
ix 5.2.4 Search for Data Encryption Key which has 62 "Live" State Module 5.2.5 Replace Data Encryption Key Module 62 5.2.6 Terminate Data Encryption Key Module 62 5.3 User Manual 62 5.4 Advantages & Disadvantages 63 5.4.1 Advantages 63 5.4.2 Disadvantages 64 5.5 Constraints 65 5.6 Recommendations and Further Research 66 6 CONCLUSION 67 REFERENCES 69 APPENDICES A-C 70
LIST OF TABLES TABLE NO. TITLE PAGE 3.1 Hardware specification for System Development 34 3.2 Software Requirement for System Development 35
Xi LIST OF FIGURES FIGURE NO. TITLE PAGE 2.1 Example of database 7 2.2 Example of Flat File database 8 2.3 Example of relational database 9 2.4 Architecture of Security in Microsoft SQL Server 2005 11 2.5 Architecture of RSA Encryption Standalone Software 22 with Protected Database 3.1 System Development Life Cycle 26 3.2 Flow Chart for Encrypting Data from Application 29 3.3 Flow Chart For Decrypting Data from Database 30 3.4 Class Diagram for System Engine in Encrypting and 31 Decrypting Data 3.5 Context Diagram of System 32 3.6 Data Flow Diagram of the proposed System 33 4.1 Local Key Class 41 4.2 generatenewkey() method 42 4.3 key encrypting key table 42 4.4 generatekey() method 43 4.5 encryptkey() method 44 4.6 savekeyo method 45 4.7 replacekek() 45 4.8 getlocalkey method 46 4.9 Key Alias class with the declaration of attributes 47 4.10 key manifest table 48 4.11 Acessor and Mutators for the Key Alias class 48
XII 4.12 getnewalias() method 49 4.13 populate() method to retrieve Key Alias information 50 4.14 getlivekeyalias() method 4.15 save() method to save KeyAlias 52 4.16 Generate key-encrypting key form 53 4.17 Create new data encryption key 54 4.18 Print Created Key using Alias ID 55 4.19 encrypt() method 56 4.20 decrypto method 57 4.21 Provider class 58
XIII LIST OF APPENDICES APPENDIX TITLE PAGE A Gantt Chart 70 B User Manual 72 C Data Dictionary 79
CHAPTER 1 INTRODUCTION This chapter will provide a brief overview of the entire project including the objective of the project, scope and problem 1.1 Introduction Nowadays, security issues among server and databases becomes popular as the dependencies of human on computer devices has been increased rapidly to store data and company's files. Therefore, its security vulnerability has become a concern to the company or user. To help in securing the content of the database, the introduction of encryption technology to database will be the last line defense for the data after the firewall and network protection has been broken in by unauthorized personnel. It helps to secure the content of the database safely by transforming text or files into meaningless characters which will avoid the third party to read or altar the information in the database.
2 Besides the attack from external user, there is a possibility of internal database attack happens as database administrator has the right to read or write the content in the database. To curb this problem, encryption technology will be introduced individually according to the user in the database. The user manages to select his/her encryption method to secure data in the database. The user has his/her own password to encrypt personal details in the database without the acknowledgement of third party. With this, the data will be secured not only from external attack to the database; it will also protect the privacy of the user of the database from immoral database administrator who violates the rights of the user in database.
3 1.2 Problem Statement Firstly, the existing plain text file in database will become the target of network culprits to attack the database for either personal usage or as a source to earn profit illegally. The content maybe deleted or edited by whoever make it through the firewalls and network protection. Conventional database allows one encryption method for all the data contained inside the database. Once the only encryption method is solved by the hackers, all the data will be translated into meaningful data to be edited or removed. Thirdly, in a shared database which contained different users, database administrator have access to data which stored by the users. There is a possibility the database administrator will explore the data personally without the permission from the files owner in the database.
4 1.3 Objective i. To introduce encryption technology into database to help to transform personal data in database into meaningless character in the database. This will prevent unauthorized personnel from reading or altering the content of the database. ii. The encryption method will be a standalone application which will be a middle tier application to encrypt the data. This will reduce the risk of having the same encryption at all the database. This will help to protect internal database administrator from violating the privacy of the users in the database. iii. To limit database administrator's access to the users' private files. Database administrator can only access to the database management files and the users' personal details given by the users. This will prevent the database administrator from accessing the data without the permission from the user.
5 1.4 Scope operation. Basically, this system contains several scopes which it can carry out its i. This database encryption system will focus on one to one basis which means an application with a database that it connects to. ii. Data type for encryption will be focus on plain text alphabet and numerical data. iii. Type of application using this system is application that saves confidential data such as credit card numbers in the database.
CHAPTER 2 LITERATURE REVIEW This chapter briefly explains about the studies of the background of the project to understand the technology and process involves in the project studied. 2.1 Introduction to Database The introduction of databases in human is to help human to save the work of documentation and filing. A database offers a mass amount of space for us to save our data in a proper and neat method. Its designed is to offer an organized mechanism for storing, managing and retrieving information. Database uses table which consists of columns and rows just like how the Microsoft Excel spreadsheet. The column in the table hold attributes which will determine the value to be inserted into. For the row, it contains a single record which contributes values to the different columns in the table [1].
':4 pr.duri k I Helmets CbrgIdimo 2007-07.23-2 lit' c r4 r0vo 2007.07.03: 3 il-o I01N102 21007-07-61-4 I4tht Shoi Z-On Blok - 2007-07-94 5 J, k, t CIovr Jacki T..kn 2007-07.02 0 2vJckt hv& - 2007-07-13 7 Jackttr C!--Jc5t A7-2007-07-08 8 Lgg,gs C-ri thurt rpi' XLV 650 2007-07-1I S Lvgg, Lrgg Set Gv E4 11-1 2090723 10 Lrg,0er '"geg., 052 Mr,4-2Lt. 2007-07-05 11 Ltgoo0rt Zrr.e.&brrrnrm Side Cre 2007-07-09 12 Lcks Abt SuutitP-r-rr533I40/3j9 2007-07.23 13 Lr-:k Abu3.Bj0L 39. 2007-67.06 14 L<,rks Abut Steèl-0-F1ex990/170 20070715 Lukr flbcixkk.,nite 8-Dbc. 2007-07-22 6 Lckr 2007-07-01 Figure 2.1: Example of database 2.1.1 Types of Databases There are two (2) main types of databases which are the Flat file database and Relational database [2]. A Flat file database contains a record per line which consists of plain text or mixed text and binary file. The fields from the record will be separated by delimiters such as commas or a fixed length of space. There is no structural relationship between two different records in the database. An example will be a sheet of paper which contains a list of names, the owners' addresses and the owners' salary. This is a flat file database which usually can be done by a typewriter or word processor. There is a problem using a flat file database because it is very prone to corruption if it is applied on a semi-active database. This type of database base has no inherent locking mechanism which will detect a file is being used or modified.
8 An example of flat file database: I [name, FName Age, Salary ISmith, John, 35 $28 OJDoe Jane, 28, $325I11ro\, Scott, 41, $265lloward, Shemp, 48, $359ITaylor, Torn, 22. $250 Figure 2.2: Example of Flat File database A relational database has its collection of data item organized as a set of table which has been formally-described. Its data can be accessed from users and reassembled back in different ways without having the database tables to be reorganized. The Structured Query Language (SQL) in a relational database is a standard user and application program interface which is used for interactive queries for information from a relational database and for gathering data for reports A relational database is a set of tables which consists of predefined categories known as column. Each row contains a unique instance/value of data corresponding to the categories defined by the column. Relational database allows connection between two different tables with the condition that both of the tables has at least one common field which it able to relate any two files. Relational database management system (DBMS) is a management system for database with the ability to access the data organized in the tables of the database and relate common field from one record to another. It also has the capability to recombine the record from different tables which makes it a powerful tool for data usage.
A Relational Data Base AUTI-(R lit 17 11l] Figure 2.3: Example of relational database
10 2.1.2 Overview of Existing Database Security Technology There are several relational databases technologies nowadays to support different types of usage and demand in the market. They are built with different architecture and tools. Here are the overviews of different databases. Microsoft SQL Server 2005 MS SQL Server is a database management system (DBMS) developed and marketed by Microsoft. This system is the most important part of Microsoft.NET technology. MS SQL Server runs exclusively under Microsoft operating systems Windows 2000, Windows Server 2003, and Window XP [3]. SQL Server secures data with hierarchical encryption layers and a key management infrastructure. Each layer secures the layer beneath it, using a combination of certificates, asymmetric keys. There are two (2) layers: i. Service Master Key ii. Database Master Key
I WiioLvI rvic Mast-er K' encrypkcd with OPAPI O SQL Server Level 4- - - - Sert',ce (1ster Key 1I Database Level- - Database, Matr Kty Certicaes As mm-tri- Itfl SyMmetric Keys Yrnmiiric Keys, Data Figure 2.4: Architecture of Security in Microsoft SQL Server 2005 The top layer, Service Master Key, is encrypted using the Windows data protection API. Database Master Key depends on the encryption mechanism which provided by Microsoft.
12 PostgreSQL PostgreSQL is an open source, Client/Server, relational database. PostgreSQL offers a unique mix of features that compare well to the major commercial databases such as Sybase, Oracle and D132. The major advantage of PostgreSQL is that it is open source - the source code is readable by anyone. It is not owned by a single company because it is developed, maintained, broken and fixed by a group of volunteer developers around the world [4]. Important aspects in PostgreSQL: i. Postgre SQL offers inheritance in the database ii. Postgre SQL enables user to add new fundamental data types. iii. Postgre SQL includes support for geometric data types such as point, line segment, box, polygon, and circle. Three aspects in PostgreSQL security: i. Securing the PostgreSQL data files ii. Securing client access iii. Granting and denying access to specific tables and specific users
13 Oracle Databases The Oracle Database consists of a relational database management system (RDBMS) produced and marketed by Oracle Corporation. As of 2009, Oracle remains a major presence in database computing. In order to meet the demands of encryption in database, Oracle8i introduced a PL/SQL package to encrypt and decrypt stored data. The package, DBMS _OBFUSCATION TOOLKIT, is provided in both standard edition and Enterprise Edition Oracle9i [5]. The package currently support bulk data encryption using Data Encryption Standard (DES) algorithm, and includes procedures to encrypt (DESEncrypt) and decrypt (DESDecrypt) using DES. Besides that, Oracle has added support for the triple DES (3DES) encryption in Oracle8i. Furthermore, it also added support for cryptographic checksumming using the MD5 algorithm. Cryptography checksums can ensure data intergrity.
14 2.1.3 Database Security There are different types of attack on the database which are classified as external and internal attack. External attack as its name described, its source of attack is from unknown personnel outside the usage of database such as Internet user or hackers from all around the world. The administrator takes several measures to curb this problem by setting up firewalls, intrusion detection systems and authorization before entering the database. This made it harder for the internet culprit to break in with all these network security tools [6]. The second attack will be the database internal attack, which has the advantage of being within the perimeter firewall and having access to information inside the territory. Many organizations have more worries about external attack while neglecting the risk of having attack internally. The internal threat is always underestimated as its manipulator often is the database administrator. This is because database administrator has access to everything in the database. They have the ability to read or edit the content in the database without leavirig a trace behind [6]. Besides that, the basic of attack to the database can also be categorized into one-offs and persistent. One-off attack is an attack when the 'attacker achieves his goals inside the database and vanishes in the air without returning to the database again. However, persistent attack is an attack where the attacker will return to the compromised database frequently and launch additional attack to the database.