A Tokenization and Encryption based Multi-Layer Architecture to Detect and Prevent SQL Injection Attack Mr. Vishal Andodariya PG Student C. U. Shah College Of Engg. And Tech., Wadhwan city, India vishal90.ce@gmail.com Prof. Shaktisinh Parmar Assistant Professor C. U. Shah College Of Engg. And Tech., Wadhwan city, India shaktisinh.parmar@yahoo.co.in Abstract With the increased importance of web applications in the last few years, the negative impact of security flaws in such applications has also grown either. Vulnerabilities that may lead to the compromise of sensitive information are being reported continuously, and the costs of damages resulting from exploited flaws can be enormous. SQL injection attack is one of Attacks to web application. SQL injection is an attack in which malicious code is inserted into strings, which are later, passed to the database server for parsing and execution. This attack can be applied to any page which accepts user input to capture data or query parameters to dynamically render the content on the web page. We tried to implement multi layer architecture to prevent SQL injection attack. At each layer we will assign different roles and responsibilities. Among which first is input validation, used to validate and filter input given by the client. Query tokenization is second layer that creates tokens of SQL query and compares this tokes with predefined tokens stored in database. Third layer is input encryption layer, which will encrypt the input given by the user to make in the form that cannot be SQL injection code. The next and last layer is classification to analyze and classify the SQL query. Index Terms SQL Injection, Types of SQL Injection, Encryption, Tokenization. 67 A M E R I C A N I J
1. Introduction SQL Injection Attack is used to get illegal access to web applications. Basic intent for SQL injection is to get login without passing valid username and password, information disclosure, compromised data integrity, compromised data availability, remote command execution etc[ 10]. Due to lack of proper handling of input validation or bypass of user validation the SQL Injection is successful. 2. Types of SQL injection [8] SQL Injection can be classified into SQL manipulation, code injection, functional call and buffer overflow. SQL manipulation means attacker attempts to modify the original SQL statements used in web application. Code Injection is an attempt to add additional SQL statements or commands to the SQL statements. Functional call is the insertion of database function into user input to execute this function using exec() command. Buffer Overflow will make the loss of database connection. 2.1. SQL manipulation It can be classified as Tautology, inference, union queries and piggybacked: 2.1.1 Tautology. The additional code is in the way that it always evaluated to be true. 2.1.2. Interface. Attacker will try to know that what kind of code is injectable. Also if application will display database error message, attacker could know backend database detail. 2.1.3. Union Queries. Attacker will inject second query using the UNION keyword of SQL. 2.1.3. Piggybacked Queries. Attacker will inject second query using ; (semicolon). They insert a new query which is not related to first query. 2.2. Code injection This attack can be classified as LIKE Queries, Column number mismatch, Additional where clause and insert Subselect query. 2.2.1. LIKE Queries. User input is given in the LIKE parameter of the query, which subverts the query. 2.2.2. Column Number Mismatch. By trying different number of columns in the input query, the attacker will try to that what numbers of columns are present in database table. 2.2.3. Additional Where Clause. Additional conditions are inserted using OR or AND keyword. 2.2.4. Insert Subselect Query. This Attack can use by attacker to access records of the database. 2.3. Functional call injection Function call injection is the insertion of database function into a user input. 2.3.1. Role foundation. The query string of user request is manipulated by attacker to gain information from web application. 2.3.2. System Stored Procedures. Attacker tries to execute Database Stored Procedure. 2.4. Buffer overflow In this type of attack, the attacker will overflow buffer of database server so that the more requests for database are blocked. This can be implemented by using inbuilt functions like DELAY of database server. Which are executed in system and creates delay in execution of server. 3. Related work 3.1. SQL Injection Attacks: Technique and Prevention Mechanism [4] This system uses the tokenization concept. Advantage of this system is that this will prevent the injected request from executing in database. But also this system uses the two level authentication system [5][7]. So, there is also database request overhead. If attacker is strong enough in SQL injection and tries to manipulate the SQL query in the way that the resultant SQL statement contains the same number of tokens as in original SQL query. Then this mechanism fails. 3.2. The Multi-Tier Architecture for Developing Secure Website with Detection and Prevention of SQL Injection Attacks [1]. This paper uses different techniques to prevent SQL injection techniques. Input validation at the initial level used to prevent user from entering special characters. Second technique checks for the data passed by the user. This paper uses generalized techniques to prevent SQL injection attack. No 68 A M E R I C A N I J
special treatment for prevention of SQL injection attack is carried out. 3.3. Random4: An Application Specific Randomized Encryption Algorithm to Prevent SQL Injection [6] This paper at basic suggests to use client side validation such as Java script to validate user input. But it does not prevent attack to the application. So they suggest Encryption technique for the database. Random4 algorithm uses lower case, upper case, numeric and 10 special characters for encryption of the database. The algorithm chooses one of four different values for single character. The encryption character is chosen from the lookup table based on the next input character in input string. The encrypted string and original string are concatenated and stored in database. This encrypted key is used to give table name and column name of database. Also this system encrypts the input given by client. So the encrypted input would be in the format that is not executable code of SQL language. 4. Proposed system We can implement SQL injection detection and prevention by multiple layers of defensive mechanisms. This system involves client side input validation, query tokenization, encryption for name of database, table, column and user inputs. At last query classification is done to check whether the query is legal or suspicious. 4.1. Detection and prevention techniques of TEMA There are multiple techniques used in the system to detect and prevent SQL injection. This system works on different layer to detect and prevent SQL injection. Layer-1: Input Validation This layer contains Client Side Validation. The validation will be implemented by JavaScript. This stage will prevent the user from giving the special characters as input. As the characters are given in the input box, they are checked for valid range of characters. If they are in the specified range of characters, they are legal and allowed to input in the input box. If they are not in the range, they are not allowed. The main advantage of the client side validation is that it takes too small amount of time to check validation. But the disadvantage of the client side input validation is that, the JavaScript validation can be easily disabled. So this stage can be bypassed easily and attackers are allowed to input special character. Layer-2: Query Tokenization Tokenization is used to determine whether the newly arrived query is legal or not by comparing the length of the query. For that the original SQL query used in the web page and the query containing the user input are stored in the array. The SQL query is broken up in the tokens. This is done with the space separated tokens. These tokens are stored in two different arrays. One array contains to the space separated tokens of original SQL query and one array contains the space separated tokens of the SQL query containing the user input. Now the length of both arrays is compared. Comparison is in the way that both array containing same number of tokens. If both arrays contains same number of tokens then the query is legal, else the query is SQL injection. The drawback of the Query Tokenization is that if user gives the malicious input in the way that the SQL query containing the user input is has no. of tokens same as the original query. But the benefit of this stage is that the generation of tokens and comparison of these arrays is faster. Layer-3: Encryption of User Input This layer encrypts the input given by the user. The resultant encrypted string will be in the form that it is not a SQL executable code. We can use any algorithm that does not produce special characters in encrypted string. We have used Message Digest algorithm to encrypt the input given by the user. We used MD5 algorithm to encrypt the input. MD5 algorithm will take the input and produces 128-bit message digest of the input. It accepts a message as input and generates a fixed-length output, which is generally less than the length of the input message. The output is called a hash value. Encryption given by simple MD5 algorithm can be cracked by the lookup tables that are easily available over the internet. So we have used the Salted MD5 algorithm to 69 A M E R I C A N I J
encrypt the input given by the user. Salt acts as additional string to the original input. We have used the combination of user-id and password as salt to the MD5 algorithm. Salt will add some complexity for decryption of the input. MD5 algorithm is faster in encryption. This is because the MD5 algorithm uses the Bitwise Boolean Operation, Modular Addition and Cyclic Shift Operation [11]. Layer-4: Query Classification of tokens as in the original SQL query. But not all SQL injection strings are possible without space. So the tokenization easily prevents this type of SQL injection. If tokenization detects the injection than it decreases the unwanted database request. If tokenization can t injection, this is passed to the encryption. 5. Results and graphs 5.1. Results at each layer of TEMA This layer will classify the newly arrived query. The query containing the user input can be the malicious to the database. So for future reference the query is stored in the database. This query is classified in three different classes. Legal, illegal and suspicious are the three classes [2][3]. If the query is syntactically correct and there is no harming effect to the database, then it is stored as legal query. If the Syntax of query is not correct, the query is stored as illegal. And at last if the query is syntactically correct and has the harming effect, the query is stored as the malicious query. Classification can be manual or automatic such as using ANN [9]. Table 1 shows the result at different layers. Dataset file contains SQL injection string as well as legal string and data that are stored in the database file. Legal request means that layer considers the request as legal input. Table 1: Results at each layer of proposed architecture Prevention Detected As Legal Request Technique Injection Input Validation 363 611 Tokenization 315 714 * Total numbers of requests are 974. Figure 1: Proposed Architecture 4.2. Why to use both Tokenization and Encryption. Tokenization system detects and prevents SQL injection by breaking the SQL query in space separated tokens. These tokens are stored in array. The length of the original SQL query and the SQL query containing user input is compared. If size is same the query is legal according to tokenization. Else not legal. But if attacker generates the malicious input in the way that no. of tokens remains same with changing the meaning of SQL query. For example, the input 1 OR 1 = 1 contains no space. So the resultant SQL query has same no. Table 1 shows the results from input validation layer and tokenization layer. If user disables the client side scripting then the input validation stage will fail. This stage will not be able to even catch single injection. In short the input validation stage will be bypassed. Tokenization stage will pass the injection string if the input given by the attacker is single token. Input encryption will encrypt the input and gives the encrypted string in the alphanumeric format. So even if the attacker inputs the injection code to input box then the injection code will be converted into the legal input string. So this stage will pass the requests only in the case if input supplied to the input box is valid data stored into the database. 5.2. Graph for individual techniques of TEMA system 70 A M E R I C A N I J
Figure 2: Comparative Graph for three techniques of TEMA system 5.3. Comparison of injection detection techniques Two systems are compared with the detection parameter of both systems. And also time required to process requests is compared. Table 2 show the comparison of two level authentication and TEMA technique. Comparison shows that our technique is 3-4 times better in terms of time required to process requests. Figure 2: Analysis of Injection Detection For two Systems Figure-3 Show the comparison of two level authentication system and the TEMA system. No of requests for both systems are 974. This comparison is in terms of time required to detect SQL injections Table 2: Comparison of Injection Detection Techniques Technique Time No. of No. of Required Injection Legal to Process Detected Requests Requests Two-Level Authentication 349 8 25-35 sec TEMA 400 8 7-10 sec Figure 3: Analysis of Time for two Systems 6. Conclusion And Future Work 5.4. Graph for two injection detection techniques Figure 2 Show the comparison of Two level authentication system and the TEMA system. No of requests for both systems are 974. This comparison is in terms of detection of SQL injection. 6.1. Conclusion In web application the response time maters much. If we use the techniques that take too much time, then it will not be a good system. The SQL injection detection and prevention technique used in this paper is faster. Tokenization and Encryption Based Multi-Layer Architecture system detects and prevents the SQL injection effectively. Our detection and prevention system will pass legal requests only, because the input given by the user is encrypted in hexadecimal format. So, resultant encryption will not contain any SQL injection code. Further its effectiveness was compared with the existing two level authentication system. Tokenization and 71 A M E R I C A N I J
Encryption Based Multi-Layer Architecture system detects more SQL injection. In addition to the detection and prevention of SQL injection our system takes less time to process web request. Response time of Tokenization and Encryption Based Multi-Layer Architecture technique is 3-4 times better than the two level authentication system. This because the two level authentication system makes two database requests. 6.2. Future Work If injection string passes the tokenization then the encryption will be done and request is forwarded to the database. However this request will not return anything. But it will create database request overhead. In future we will try to minimize this overhead. Also we will try to implement classification of SQL injection in different categories. 7. References [1] Praveen Kumar, The Multi-Tier Architecture for Developing Secure Website with Detection and Prevention of SQL Injection Attacks," International Journal of Computer Applications (0975 8887) Volume 62 No.9, January 2013. [2] Niraj Kulkarni, D R Anekar, Mayur Ghadge, Rohit Garde, A System to Detect and Block SQL Injection with the help of Multi-Agent System using Artificial Neural Network, International Journal of Computer Applications (0975 8887), Volume 71 No.12, February 2013. [3] Niraj Kulkarni, D R Anekar, Mayur Ghadge, Rohit Garde, Multi-Agent System for Detection and Blocking SQL Injection, International Journal of Computer Applications (0975 8887), Volume 64 No.15, February 2013. [4] Gaurav Shrivastava, Kshitij Pathak, SQL Injection Attacks: Technique and Prevention mechanism, International Journal of Computer Applications (0975 8887), Volume 69 No. 7, May 2013. [5] Asha. N, M. Varun Kumar, Vaidhyanathan G, Preventing SQL Injection Attacks, International Journal of Computer Applications (0975 8887), Volume 52 No.13, August 2012. [6] Srinivas Avireddy, Varalakshmi Perumal, Narayan Gowraj,Ram Srivatsa Kannan,Prashanth Thinakaran, Sundaravadanam Ganapathi, Jashwant Raj Gunasekaran and Sruthi Prabhu, Random4: An Application Specific Randomized Encryption Algorithm to Prevent SQL Injection, IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications,IEEE-2012. [7] Kai-Xiang Zhang, Chia-Jun Lin, Shih-Jen Chen, Yanling Hwang, Hao-Lun Huang, Fu-Hau Hsu, TransSQL: A Translation And Validation based Solution for SQL-injection Attack, First International Conference on Robot, Vision and Signal Processing, IEEE-2011. [8] Khaleel Ahmad, Jayant Shekhar and,k.p. Yadav, Classification of SQL Injection Attacks, Vol. I (4), 235-242, VSRD-TNTJ-2010. [9] ANN,Dec-2013 http://www.learnartificialneuralnetworks. com/introduction-to-neural-networks.html [10] "Top Ten Most Critical Web Application Vulnerabilities,"OWASP Foundation, http://www.owasp.org/documentation/topten.html. 2005.10/12/2013 [11] MD5 Algorithm- April-2014, http://www.engr.uconn.ed u/~fzhang/docs/crypt. doc 72 A M E R I C A N I J