HILL CIPHERS: A LINEAR ALGEBRA PROJECT WITH MATHEMATICA

Similar documents
An Introduction to Hill Ciphers Using Linear Algebra

Hill Ciphers and Modular Linear Algebra

Solutions to Problem Set 1

Hill s Cipher: Linear Algebra in Cryptography

Introduction to Hill cipher

Number Theory. Proof. Suppose otherwise. Then there would be a finite number n of primes, which we may

1. Define: (a) Variable, (b) Constant, (c) Type, (d) Enumerated Type, (e) Identifier.

K80TTQ1EP-??,VO.L,XU0H5BY,_71ZVPKOE678_X,N2Y-8HI4VS,,6Z28DDW5N7ADY013

Cryptography and Network Security Department of Computer Science and Engineering Indian Institute of Technology Kharagpur

Cyber Security Workshop Encryption Reference Manual

Network Security. HIT Shimrit Tzur-David

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

FAREY FRACTION BASED VECTOR PROCESSING FOR SECURE DATA TRANSMISSION

Students will operate in pairs and teams of four to decipher and encipher information.

Primes in Sequences. Lee 1. By: Jae Young Lee. Project for MA 341 (Number Theory) Boston University Summer Term I 2009 Instructor: Kalin Kostadinov

1. The RSA algorithm In this chapter, we ll learn how the RSA algorithm works.

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

RSA Encryption. Tom Davis October 10, 2003

Overview/Questions. What is Cryptography? The Caesar Shift Cipher. CS101 Lecture 21: Overview of Cryptography

Cryptography and Network Security Prof. D. Mukhopadhyay Department of Computer Science and Engineering Indian Institute of Technology, Karagpur

Systems of Linear Equations

Caesar Ciphers: An Introduction to Cryptography

A PPENDIX G S IMPLIFIED DES

A short primer on cryptography

Computer Networks. Network Security 1. Professor Richard Harris School of Engineering and Advanced Technology

= = 3 4, Now assume that P (k) is true for some fixed k 2. This means that

Notes on Determinant

Row Echelon Form and Reduced Row Echelon Form

Dr. Jinyuan (Stella) Sun Dept. of Electrical Engineering and Computer Science University of Tennessee Fall 2010

The application of prime numbers to RSA encryption

CHAPTER 5. Number Theory. 1. Integers and Division. Discussion

LEARNING MATHEMATICS IN SECONDARY SCHOOL: THE CASE OF MATHEMATICAL MODELLING ENABLED BY TECHNOLOGY

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

Number Theory and Cryptography using PARI/GP

Introduction To Security and Privacy Einführung in die IT-Sicherheit I

Continued Fractions and the Euclidean Algorithm

Advanced Cryptography

RSA Attacks. By Abdulaziz Alrasheed and Fatima

Number Theory and the RSA Public Key Cryptosystem

Multiplicative Ciphers. Cryptography of Multiplicative Ciphers

Principles of Public Key Cryptography. Applications of Public Key Cryptography. Security in Public Key Algorithms

CS 758: Cryptography / Network Security

An Introduction to the RSA Encryption Method

ALGEBRAIC APPROACH TO COMPOSITE INTEGER FACTORIZATION

Discrete Mathematics, Chapter 4: Number Theory and Cryptography

Page 1. Session Overview: Cryptography

Linear Codes. Chapter Basics

Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan

MATH2210 Notebook 1 Fall Semester 2016/ MATH2210 Notebook Solving Systems of Linear Equations... 3

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

The Advanced Encryption Standard (AES)

Similarity and Diagonalization. Similar Matrices

Integer Factorization using the Quadratic Sieve

SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison

Design and Implementation of Asymmetric Cryptography Using AES Algorithm

Outline. Computer Science 418. Digital Signatures: Observations. Digital Signatures: Definition. Definition 1 (Digital signature) Digital Signatures

A New Generic Digital Signature Algorithm

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix.

Linearly Independent Sets and Linearly Dependent Sets

Network Security. Abusayeed Saifullah. CS 5600 Computer Networks. These slides are adapted from Kurose and Ross 8-1

3 Some Integer Functions

People have thought about, and defined, probability in different ways. important to note the consequences of the definition:

Network Security Technology Network Management

SECURITY IN NETWORKS

Typical Linear Equation Set and Corresponding Matrices

SECURITY IMPROVMENTS TO THE DIFFIE-HELLMAN SCHEMES

Chap 2. Basic Encryption and Decryption

1 Solving LPs: The Simplex Algorithm of George Dantzig

Security for Computer Networks

Network Security. Security Attacks. Normal flow: Interruption: 孫 宏 民 Phone: 國 立 清 華 大 學 資 訊 工 程 系 資 訊 安 全 實 驗 室

Cryptography: Motivation. Data Structures and Algorithms Cryptography. Secret Writing Methods. Many areas have sensitive information, e.g.

ANALYSIS OF RSA ALGORITHM USING GPU PROGRAMMING

The Mathematics of the RSA Public-Key Cryptosystem

Linear Algebra Notes for Marsden and Tromba Vector Calculus

Here are some examples of combining elements and the operations used:

How To Know If A Message Is From A Person Or A Machine

ECE 842 Report Implementation of Elliptic Curve Cryptography

SECURITY EVALUATION OF ENCRYPTION USING RANDOM NOISE GENERATED BY LCG

Network Security: Cryptography CS/SS G513 S.K. Sahay

Insight Guide. Encryption: A Guide

Index Calculation Attacks on RSA Signature and Encryption

Multiplication. Year 1 multiply with concrete objects, arrays and pictorial representations

Matrix Differentiation

Public Key Cryptography: RSA and Lots of Number Theory

Chapter 2 Homework 2-5, 7, 9-11, 13-18, 24. (9x + 2)(mod 26) y 1 1 (x 2)(mod 26) 3(x 2)(mod 26) U : y 1 = 3(20 2)(mod 26) 54(mod 26) 2(mod 26) c

Data Mining: Algorithms and Applications Matrix Math Review

Chapter 23. Database Security. Security Issues. Database Security

4.5 Linear Dependence and Linear Independence

1 Data Encryption Algorithm

Cryptanalysis of a Partially Blind Signature Scheme or How to make $100 bills with $1 and $2 ones

Introduction to Encryption

Using row reduction to calculate the inverse and the determinant of a square matrix

Content. Chapter 4 Functions Basic concepts on real functions 62. Credits 11

Symmetric Key cryptosystem

THE NUMBER OF REPRESENTATIONS OF n OF THE FORM n = x 2 2 y, x > 0, y 0

Theorem3.1.1 Thedivisionalgorithm;theorem2.2.1insection2.2 If m, n Z and n is a positive

IT Networks & Security CERT Luncheon Series: Cryptography

Split Based Encryption in Secure File Transfer

Primality Testing and Factorization Methods

Transcription:

HILL CIPHERS: A LINEAR ALGEBRA PROJECT WITH MATHEMATICA Murray Eisenberg University of Massachusetts Mathematics and Statistics Department Lederle Graduate Research Tower Amherst, MA 01003-4515 murray@math.umass.edu Abstract A project on Hill ciphers is discussed here that is suitable for a beginning linear algebra course. In the project, students use the computer algebra system Mathematica to implement the Hill encipherment method and to crack a Hill cipher. This paper explains Hill ciphers as an application of linear algebra over n ; describes the project from the student s point of view; and discusses the software implementation needed for the project. Introduction Students being able to learn math on their own is a desirable instructional goal. Software such as Mathematica can facilitate achieving that goal. How this can be done is illustrated by a project on Hill ciphers an engaging application of linear algebra to cryptology in which students express the concepts in the form of Mathematica functions which they then use to crack a Hill cipher, that is, to discover its key. Aside from its intrinsic interest, the topic of Hill ciphers requires students to learn some new linear algebra, namely, the situation in which the scalar field is finite. Hill ciphers hardly have the currency of secret-key block ciphers such as Blowfish, or of public-key systems such as RSA or elliptic-curve methods. Yet they do have interest at least for historical reasons: they constitute the first general method for successfully applying algebra specifically, linear algebra to polygraphic ciphers. 1 This project could be implemented in essentially any computer algebra system or generalpurpose programming language (although to implement it completely as described, the CAS or language must be able to load packages that the student is unable to read). In fact, the author originally implemented the project in the array-based programming language APL, and more recently reimplemented it in Kenneth Iverson s programming language J. Implementing the project in a programming language such as C, Pascal, or Java that is not array-oriented would be more tedious for both instructor and students. About Hill ciphers Letter-by-letter substitution ciphers easily succumb to frequency analysis and so are notoriously unsecure. Polygraphic ciphers, by contrast, in which each list of n consecutive letters of the plaintext an n-graph is replaced by another n-graph according to some key, can be more challenging to break. The first systematic yet simple polygraphic ciphers using more than two letters per group are the Hill 1 See Kahn [6], especially pp. 404 408.

ciphers, first described by Lester Hill [4] in 1929. 2 For a polygraphic substitution, changing just one or two plaintext letters can completely change the corresponding ciphertext! That is one reason that Hill ciphers are so difficult to crack. A Hill n-cipher works as follows. Start with an m-character alphabet and code each character with a unique integer in {0, 1, 2,...,m 1}. The alphabet could consist of just the usual 26 letters in English or, as here, those letters supplemented with the 3punctuation characters. and? and (where denotes a blank space). For simplicity, the coding is done here by numbering the 29 characters in order as shown in Table 1. Next, choose as key matrix an n n matrix A with entries in {0, 1, 2,...,m 1} that is invertible modulo m. a b c d e f g h i j k l m n o p q r s t u v w x y z.? 0 1 2 3 4 5 6 7 8 9 10 11 12 1314 15 16 17 18 19 20 21 22 2324 25 26 27 28 Table 1: Numerical coding of 29-letter alphabet Given a ciphertext string, encode it to the corresponding vector v of integers in the set {0, 1, 2,...,m 1}. If the length of v is not an integral multiple k of the key size n, then padv as needed by repeating its last entry. Partition v into k column vectors v j. Form each of the products Av j ; equivalently, form the matrix product A[v 1 v 2 v k ]. Reduce the result modulo m. Reassemble the consecutive columns of this product into a new vector w. Finally, decode w into the corresponding ciphertext string. The following Hill 3-cipher illustrates the procedure for our 29-character alphabet with its encoding given in Table 1. The key matrix is: 17 5 20 A = 239 3 11 2 12 The plaintext is the following 10-character message (including the space and period): want help. First, encode the plaintext to the corresponding numbers: 22013192874111526 Second, group these into trigraphs, repeating the final number twice to fill out the fourth group: 22 0 1319 28 7 4 11 15 26 26 26 Third, form the 3 4 matrix having these groups of three numbers as its columns. Fourth, apply the key: 22 19 4 26 25 2317 19 A 0 28 11 26 2314 4 11 (mod 29) 13 7 15 26 21 1 14 12 2 Hill [5] subsequently generalized his ciphers to a more complex scheme that used block matrices.

Fifth, decode the numbers 2523212314117414191112 from the unraveled columns of the last matrix to obtain the letters of the ciphertext: zxvxobreotlm A Hill cipher is relatively immune from attack if its key size n is large enough to preclude frequency analysis of n-graphs. But it is easy to crack if you have captured enough plaintext along with the corresponding ciphertext, for then the method of the following theorem applies. (This method was not explained in Hill [4] but is formulated, for example, by Anton and Rorres [1]; a proof is included in the expository article [2].) Cracking Theorem. Suppose the alphabet length m is prime. Let p 1, p 2,...,p n be n plaintext vectors for a Hill n-cipher having (unknown) key matrix A, and let c 1, c 2,...,c n be the corresponding ciphertext vectors. Suppose these plaintext vectors are linearly independent over m. Form the matrices P = [ ] [ ] p 1 p 2... p n and C = c1 c 2... c n having the plaintext vectors and ciphertext vectors, respectively, as their columns. Then the same sequence of elementary row operations that reduces C T to the identity matrix reduces P T to the transpose (A 1 ) T of the inverse key matrix A 1. The situation described by the theorem, which in practice would not be wholly unrealistic, is the one that arises in the project. What the student does To start, the student reads a 19-page article [2] about Hill ciphers that describes the underlying mathematics regarding arithmetic modulo m and linear algebra over m ; the Hill encipherment scheme; and the Cracking Theorem. Next, the student defines in Mathematica a main function encipher that takes as arguments a key matrix and plaintext string and returns as result the corresponding ciphertext string. He prepares auxiliary functions to do the requisite conversions between text to numbers and to arrange lists into matrices and vice versa. Guided by a Mathematica notebook, the student next validates his encipher: he loads an instructor-prepared Mathematica package that automatically generates various key matrix and plaintext pairs, evaluates the student s encipher at those pairs as arguments, and compares the results with what they should be. Now the student is ready for the project problems. He loads another instructor-prepared Mathematica package that creates data key matrices, plaintexts, and ciphertexts and stores them in named variables. As with the test pairs for validation, this data is individualized for the student through randomly generated key matrices and randomly selected entries from a database of texts. The package prints a list of three problems to be solved: 1. Given ciphertext and the inverse of the key matrix, decipher the text. 2. Given ciphertext and the key matrix, decipher the text. 3. Given the key matrix size and captured ciphertext along with corresponding plaintext, if possible crack the cipher determine the inverse key.

Results Of all the computer-related linear algebra projects the author has assigned over the past twenty years, this one unquestionably has generated the most favorable student response. Nearly all students have been able to complete it, even if they were unable to get completely correct the details of padding (which can now be handled automatically with Mathematica Version 4). And with many students one can almost see the light bulb lighting up as they realize that the function for decipherment is the same as that for encipherment, but with the key matrix replaced by its inverse. Over many years the author has observed that students typically are quite persistent in carrying out computer projects to completion (and one can but dream that they would be just as persistent with problems that are not computer-related!). And of all these computer projects, with this one on Hill ciphers students been the most persistent. Perhaps the challenge of cracking a cipher is too enticing to ignore. Implementation The suite of files needed for a student to carry out the project, available at the author s Web site [3], consists of: The expository article [2] about Hill ciphers, in Adobe.pdf format. (Given the primary purpose of the project, little would seem to be gained by making this article available in the interactive form of a Mathematica notebook.) The Mathematica notebook About Hill that guides the student through validating encipher and obtaining test data for the project problems. The Mathematica packages that validate the student s encipher and generate the data for the project problems, respectively. The former includes a short database of plaintext strings used. The latter package does a hidden import of a third encoded package, which produces a database of plaintext strings literary quotations, pithy sayings, etc. These packages are encoded. These files are ready for student use once they are downloaded into a directory to which Mathematica has direct access. For security reasons, only the encoded forms of the packages are available on the Web site. Instructors wishing to alter the packages for example, to change the database of texts should apply directly to the author for copies of the unencoded source Mathematica notebooks; then the packages can be recreated in ASCII form and encoded again by Mathematica. In both packages, each key matrix is generated on the fly by iterating the process of forming a random matrix and checking it for invertibility modulo 29, with a default matrix being used if too many iterations are required. Designing the packages involved two subtleties in Mathematica programming. The first is conditional evaluation within a package: When the package is loaded, a function in it tests the value of a global variable (such as the student s name or ID number) to see if it has been defined and, if so, has the proper format; the package s context is created only if the global variable passes the test. The second involves indirectly creating symbolic names and assigning to them the values used for validation and problem data. For details, see the package source notebooks.

Extensions and complications The project, as currently implemented in Mathematica, is about as simple as Hill ciphers can be. The project could be modified in one or more of the following ways: The student writes his own functions or modifies instructor-provided ones that work over the field to do row reduction and matrix inversion modulo the alphabet length m, instead of using the option Modulus -> m to Mathematica s built-in RowReduce and Inverse functions. 3 Such new functions require finding inverses of integers modulo m. And this could be done with the brute-force method of using a lookup table or, for a mathematically more instructive approach, by programming and applying the euclidean algorithm. Allow, as did Hill himself, nonprime alphabet length m (for example, m = 26). Then include the problem of checking whether a given key matrix for such an alphabet is in fact invertible. Use affine transformations, as Hill did in his original paper [4], instead of just linear transformations. Make the cipher more realistic by using some arbitrary transposition cipher first. The project could also be enhanced by including detection of key matrix size or by using frequency analysis to break the cipher even in the absence of a captured ciphertext-plaintext pair. Of course, such enhancement would take the project beyond the realm of a typical linear algebra course. Acknowledgment The author thanks Allan Hayes and Tom Wickham-Jones for assistance with programming issues that arose in designing the Mathematica packages for validation and problem generation. References [1] Howard Anton and Chris Rorres, Elementary Linear Algebra: Applications Version, 6 ed., Wiley, New York, 1991. See sec. 11.17. [2] Murray Eisenberg, Hill ciphers and modular linear algebra, mimeographed notes, University of Massachusetts, 1998, 19 pages. [3] Murray Eisenberg, Hill ciphers, web site, University of Massachusetts, URL http://www.math.umass.edu/~murray/hill_ciphers/. [4] Lester S. Hill, Cryptography in an algebraic alphabet, Amer. Math. Monthly 36 (1929) 306 312. [5] Lester S. Hill, Concerning certain linear transformation apparatus of cryptography, Amer. Math. Monthly 38 (1931) 135 154. [6] David Kahn, The Codebreakers: The Story of Secret Writing, Weidenfeld and Nicolson, London, 1967. See especially pp. 404 410. 3 Mathematica Version 2 does not allow the option Modulus -> m to RowReduce.