Diffusion and Data compression for data security. A.J. Han Vinck University of Duisburg/Essen April 2013 Vinck@iem.uni-due.de





Content
- Why is diffusion important?
- Why is data compression important?
- Unicity distance
- Time to discover a secret
- Source coding principle
- How data compression works
- Zipf's law

Diffusion: transposition
HOW: rearrange the symbols in the data without changing the symbols themselves, i.e. the frequency of each symbol remains the same.
GOAL: destroy the relations between symbols and make the data more difficult to analyze!
ANALYSIS: index of coincidence, finding periods.

Example of diffusion
A scytale is a tool used to perform a transposition cipher.
http://www.youtube.com/watch?v=veh0knztljy&feature=related
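The transposition idea can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the function names, the `turns` parameter (number of letters per winding) and the `_` padding symbol are assumptions. The text is written row by row and read column by column, and the symbol counts before and after are compared.

```python
from collections import Counter


def scytale_encrypt(plaintext: str, turns: int) -> str:
    """Write the text row by row into `turns` columns, read it column by column."""
    # Pad with "_" (an assumed filler symbol) so the text fills a full rectangle.
    padded = plaintext + "_" * (-len(plaintext) % turns)
    return "".join(padded[col::turns] for col in range(turns))


def scytale_decrypt(ciphertext: str, turns: int) -> str:
    """Invert scytale_encrypt; the result still contains any padding symbols."""
    rows = len(ciphertext) // turns
    return "".join(ciphertext[row::rows] for row in range(rows))


message = "WEAREDISCOVEREDFLEEATONCE"
cipher = scytale_encrypt(message, 5)

print(cipher)
# The symbols are only rearranged, so their frequencies do not change.
print(Counter(cipher) == Counter(message))
print(scytale_decrypt(cipher, 5) == message)
```

Because the letter frequencies survive, an analyst can recognise a pure transposition from single-letter statistics alone.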

Confusion and diffusion in AES
General round structure:
- Substitute bytes (substitution)
- Shift rows, Mix columns (transposition)
- Add round key (substitution)
The same equipment can be used to decipher.
http://www.youtube.com/watch?v=mlzxpkdXP58

Data compression
The goal of data compression is to
- create a compact representation of the data to be encrypted
- create independent symbols
Decompression gives the original data back!

[Figure: data compression]

Source coding in message encryption (1)
Message: Part 1, Part 2, ..., Part n (for example, every part 56 bits); dependency exists between the parts of the message.
Encipher (with key): n cryptograms; dependency exists between the cryptograms.
Decipher (with key): Part 1, Part 2, ..., Part n.
Attacker: n cryptograms to analyze for a particular message of n parts.

Source coding in message encryption (2)
Message: Part 1, Part 2, ..., Part n (for example, every part 56 bits).
Transmitter: n-to-1 source encode, then encipher (with key): 1 cryptogram.
Receiver: decipher, then source decode: Part 1, Part 2, ..., Part n.
Attacker:
- 1 cryptogram to analyze for a particular message of n parts
- assume a data compression factor n-to-1
Hence, less material for the same message!
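The "less material for the same message" point can be illustrated with a general-purpose compressor. The sketch below uses Python's `zlib` purely as a stand-in for the source encoder; the slides do not name a particular compression method, and the sample message is made up for the illustration.

```python
import zlib

# A deliberately redundant message (assumed sample text).
message = ("the quick brown fox jumps over the lazy dog " * 50).encode("ascii")

# Source encode before encryption: the cipher would see only `compressed`.
compressed = zlib.compress(message, 9)
ratio = len(message) / len(compressed)

print(len(message), "bytes before source coding")
print(len(compressed), "bytes to be enciphered")
print(f"compression factor about {ratio:.1f}-to-1")

# Source decoding at the receiver gives the original data back.
print(zlib.decompress(compressed) == message)
```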

The position of crypto in a communication model
- source
- analogue-to-digital conversion
- digital compression/reduction
- security
- error protection
- from bit to signal

Source coding
Two principles:
- data reduction: remove irrelevant data (lossy, gives errors)
- data compression: present data in a compact (short) way (lossless)
Transmitter side: original data -> remove irrelevance -> relevant data -> compact description
Receiver side: unpack -> original data

Illustration lossless/lossy
[Figure: two panels, each labelled "original"]

What do we want (need)?
All data symbols to be enciphered must occur with equal probability and be independent of each other.

Example: suppose we have a dictionary with 30,000 words.
- These can be numbered (encoded) with 15 bits.
- If the average word length is 5, we need on average 3 bits per letter.
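The arithmetic behind this example can be checked directly. The variable names below are chosen for the illustration only.

```python
import math

words = 30_000          # dictionary size from the example
avg_word_length = 5     # letters per word from the example

# Smallest number of bits that can number every word: ceil(log2(30000)).
bits_per_word = math.ceil(math.log2(words))
bits_per_letter = bits_per_word / avg_word_length

print(bits_per_word, "bits per word")
print(bits_per_letter, "bits per letter")
```

For comparison, numbering the 26 letters individually would need 5 bits per letter.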

This can happen
[Figure]

[Figure: letter frequency of the Vigenère cipher]

How to compress? (binary)
Source $x = (x_1, x_2, \ldots, x_N)$, $x_i \in \{0, 1\}$
- Number of 0's $= f_0 N$, number of 1's $= f_1 N$; $F = (f_0, f_1)$ is the composition of x.
- Then the number of different vectors x for a given F is
$$|x_F| = \binom{N}{f_0 N} = \frac{N!}{(f_0 N)!\,(f_1 N)!}$$
and the number of bits/symbol needed to represent x is
$$\frac{1}{N}\log_2|x_F| \approx -\sum_{i=0}^{1} f_i \log_2(f_i N) + \log_2 N = -\sum_{i=0}^{1} f_i \log_2 f_i \quad\text{(entropy!)}$$
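A small numerical check of this counting argument, as a sketch with assumed function names: for a fixed fraction of ones, the exact value of (1/N) log2 of the number of sequences approaches the entropy of the composition as N grows.

```python
import math


def binary_entropy(f1: float) -> float:
    """Entropy in bits of the composition (1 - f1, f1)."""
    if f1 in (0.0, 1.0):
        return 0.0
    return -f1 * math.log2(f1) - (1 - f1) * math.log2(1 - f1)


def bits_per_symbol(n: int, ones: int) -> float:
    """(1/N) * log2 of the number of length-N sequences containing `ones` ones."""
    return math.log2(math.comb(n, ones)) / n


for n in (10, 100, 10_000):
    ones = 4 * n // 10  # composition F = (0.6, 0.4)
    print(n, round(bits_per_symbol(n, ones), 4), round(binary_entropy(ones / n), 4))
```

The exact count stays slightly below the entropy; the gap shrinks roughly like (log2 N)/N.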

En- and decoding
source x (N letters) -> encoder -> F (composition) and lexicographical index for x -> decoder -> x
For large N, $f_i \to p_i$, and thus $-\sum_{i=0}^{1} f_i \log_2 f_i$ is equal to the Shannon entropy!
To transmit the value of F, we need $\frac{1}{N}\log_2(N+1)$ bits per output letter, which goes to 0 for large N.
Lexicographical en- and decoding is a solved problem in computer science.
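One common way to do lexicographical en- and decoding is enumerative coding with binomial coefficients. The sketch below is an illustration under stated assumptions, not the slides' own procedure: the function names are made up, indices are 0-based, and 0 is ordered before 1, so a different convention would shift or reverse the indices.

```python
from math import comb


def lex_index(bits):
    """0-based rank of `bits` among all sequences of the same length and weight."""
    n, ones_left, index = len(bits), sum(bits), 0
    for pos, bit in enumerate(bits):
        if bit == 1:
            # Every sequence with the same prefix and a 0 here is ranked earlier.
            index += comb(n - pos - 1, ones_left)
            ones_left -= 1
    return index


def lex_sequence(index, n, ones):
    """Inverse of lex_index: rebuild the sequence from (index, length, weight)."""
    bits = []
    for pos in range(n):
        with_zero = comb(n - pos - 1, ones)  # sequences that continue with a 0
        if index < with_zero:
            bits.append(0)
        else:
            bits.append(1)
            index -= with_zero
            ones -= 1
    return bits


# Length 4, two ones: 0011, 0101, 0110, 1001, 1010, 1100 have ranks 0..5.
print(lex_index([0, 1, 1, 0]))
print(lex_sequence(3, 4, 2))
```

The encoder sends the composition plus this index; the decoder needs only those two values to rebuild x.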

Exercise
For sequences of length 2 with 4 ones and 6 zeros, give the lexicographical index for the sequence 0 0 0 0 0 0 0 0
What is the sequence that belongs to the index 52?

Binary entropy
$$\lim_{n\to\infty} \frac{1}{n}\log_2\binom{n}{pn} = h(p), \qquad \binom{n}{pn} \approx 2^{n h(p)}$$
Interpretation: let a binary sequence of length n contain pn ones; then we can specify each sequence with $\log_2 2^{n h(p)} = n\,h(p)$ bits.
Homework: prove the approximation using $\ln N! \sim N \ln N$ for N large. Use also $\log_a x = y \Rightarrow \log_b x = y \log_b a$.
The Stirling approximation: $N! \approx \sqrt{2\pi N}\, N^N e^{-N}$

The binary entropy: $h(p) = -p \log_2 p - (1-p)\log_2(1-p)$
Note: $h(p) = h(1-p)$
[Figure: plot of h(p) against p for p between 0 and 1]
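Since the plot itself is not reproduced here, the curve can be tabulated instead. A minimal sketch; the function name `h` follows the slide, the rest is illustrative.

```python
import math


def h(p: float) -> float:
    """Binary entropy in bits; h(0) = h(1) = 0 by the usual convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


# Tabulate the curve at p = 0, 0.1, ..., 1.
for k in range(11):
    p = k / 10
    print(f"p = {p:.1f}   h(p) = {h(p):.3f}")
```

The table is symmetric around p = 0.5, where the entropy reaches its maximum of 1 bit.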

References
- Information theory books
- MPEG, JPEG

Application to text: symbols are words
The distribution of words follows the law of Zipf (1935): let $f_n$ denote the frequency of the n-th most frequent word, then $f_n = A/n$.
English: $A = 0.1$ for $M = 12366$; $-\sum_{i=1}^{M} f_i \log_2 f_i \approx 9.72$ bits per word; the average word length is 4.5 letters.
The number of bits per letter is $9.72/4.5 \approx 2.16$.
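These figures can be reproduced numerically. The sketch assumes the slide's constants read A = 0.1 and M = 12366, which is the vocabulary size at which the frequencies 0.1/n sum to roughly 1; the variable names are illustrative.

```python
import math

A = 0.1       # Zipf constant (assumed reading of the slide)
M = 12366     # number of distinct words (assumed reading of the slide)
AVG_WORD_LENGTH = 4.5

freqs = [A / n for n in range(1, M + 1)]

total = sum(freqs)                                    # should be close to 1
bits_per_word = -sum(f * math.log2(f) for f in freqs)
bits_per_letter = bits_per_word / AVG_WORD_LENGTH

print(f"sum of frequencies: {total:.4f}")
print(f"entropy: {bits_per_word:.2f} bits per word")
print(f"         {bits_per_letter:.2f} bits per letter")
```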

Zipf's law
A web site with many references and applications: http://linkage.rockefeller.edu/wli/zipf/index_ru.html

[Figure: web sites rank ordered by their popularity]

Unicity distance (3)
Idea: for a stream cipher, after some time L the plaintext and keystream can be determined uniquely from the cipher stream.
The smallest value of L where this is possible is called the UNICITY DISTANCE U.
A necessary condition: $|M^L| \times |K| \le |C^L|$, where $|\cdot|$ means cardinality (or number of).
(Otherwise, when $|M^L| \times |K| > |C^L|$, some plaintexts give the same ciphertext.)

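Unicity distance (4)
From $\log|M^L| + \log|K| \le \log|C^L|$ we have
$$\frac{\log|C^L|}{L} \ge \frac{\log|K|}{L} + \frac{\log|M^L|}{L}$$
and, with the source rate $R = \frac{1}{L}\log|M^L|$ and $\log|C^L| = L \log|C|$:
$$L \ge U = \frac{\log|K|}{\log|M| - R}, \quad\text{where } |M| = |C|$$
IMPORTANT to NOTE: $\log|M| - R$ is the maximum redundancy of the source sequence!
For low redundancy, U goes to infinity.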

A probabilistic approach (Hellman)
[Figure: arrows from the messages in $M^L$ to the ciphertexts in $C^L$]
Assume equally probable messages and equally probable keys. Then
$$\sum_{i=1}^{|C^L|} z(c_i) = |M^L| \times |K| \quad\text{and}\quad P(c_i) = \frac{z(c_i)}{|M^L| \times |K|},$$
where $z(c_i)$ is the number of arrows entering $c_i$. We used: the number of outgoing arrows = the number of incoming arrows.
The average number of arrows entering a ciphertext is
$$\bar{z} = \sum_{i=1}^{|C^L|} z(c_i)\,P(c_i) = \frac{\sum_{i=1}^{|C^L|} z^2(c_i)}{|M^L| \times |K|} \ge \frac{|M^L| \times |K|}{|C^L|}$$
$\bar{z} = 1$ gives the same result as before (one unique pair M, C), i.e. $|M^L| \times |K| \le |C^L|$.
Proof: with $n = |C^L|$ and $a = \sum_{i=1}^{n} z(c_i)$, consider
$$\sum_{i=1}^{n}\left[z(c_i) - \frac{a}{n}\right]^2 \ge 0 \;\Rightarrow\; \sum_{i=1}^{n} z^2(c_i) \ge \frac{a^2}{n}$$
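The bound on the average number of incoming arrows can be checked on a toy random cipher. This is a sketch under assumptions, not the construction from the slides: every key is modelled as a random injective map from messages to ciphertexts, and the set sizes and function name are made up.

```python
import random
from collections import Counter


def average_incoming(num_messages, num_keys, num_ciphertexts, seed=0):
    """Return (z_bar, lower_bound) for a randomly drawn toy cipher."""
    rng = random.Random(seed)
    z = Counter()  # z[c] = number of (message, key) arrows entering ciphertext c
    for _ in range(num_keys):
        # One key: an injective map, so requires num_messages <= num_ciphertexts.
        z.update(rng.sample(range(num_ciphertexts), num_messages))
    arrows = num_messages * num_keys
    # z_bar = sum_i z(c_i) * P(c_i) with P(c_i) = z(c_i) / arrows.
    z_bar = sum(count * count for count in z.values()) / arrows
    return z_bar, arrows / num_ciphertexts


z_bar, bound = average_incoming(num_messages=50, num_keys=8, num_ciphertexts=100)
print(f"average incoming arrows {z_bar:.2f} >= bound {bound:.2f}")
```

With 50 x 8 = 400 arrows into 100 ciphertexts, the bound is 4, so a ciphertext cannot point back to a unique message and key on average.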

Examples: unicity distance (5)
Assume that the German language has a rate R of 2 bits per letter.
- Then, for a substitution cipher with 26! keys or a permutation cipher with period 26 (26! keys), we have:
$$U = \frac{\log|K|}{\log|M| - R} = \frac{\log_2 26!}{\log_2 26 - 2} \approx 32$$
- For a Vigenère cipher of length 80, we have:
$$U = \frac{\log|K|}{\log|M| - R} = \frac{80 \log_2 26}{\log_2 26 - 2} \approx 140$$
- Try to find U for the DES.
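The two examples can be recomputed with a short script. It assumes the formula U = log2|K| / (log2|M| - R) with all logarithms to base 2, so that R is in bits per letter; the function name and parameters are illustrative.

```python
import math


def unicity_distance(log2_keys: float, alphabet_size: int, rate: float) -> float:
    """U = log2|K| / (log2|M| - R), with R in bits per letter."""
    return log2_keys / (math.log2(alphabet_size) - rate)


R = 2.0  # assumed rate of the language in bits per letter, as in the example

# Substitution cipher: 26! keys.
u_substitution = unicity_distance(math.log2(math.factorial(26)), 26, R)

# Vigenere cipher of length 80: 26**80 keys.
u_vigenere = unicity_distance(80 * math.log2(26), 26, R)

print(f"substitution: U is about {u_substitution:.1f} letters")
print(f"Vigenere (80): U is about {u_vigenere:.1f} letters")
```

Raising R towards log2(26), for instance by compressing the source first, makes the denominator shrink and U grow.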

Conclusion: unicity distance (6)
It is important to make the value of R as high as possible for a large U.
Hence: source compression before encryption is important for secure communications.
Note added: if the message is given to the analyst, the value of R = 0. Hence, given the ciphertext and plaintext,
$$U = \frac{\log|K|}{\log|M|}$$

Professor James L. Massey
A GREAT SCIENTIST and TEACHER! MOTTO: SIMPLE but SOLID
The Marconi Fellows, 1999: Professor James L. Massey
Award citation: "For theoretical and practical contributions to cryptography and related coding problems; teacher and mentor to a generation of scientists and technologists"
Professor Massey made significant advances in forward-error-correcting codes, multi-user communications, and cryptographic systems. In addition, Professor Massey is known for his contributions to the field of engineering education. He is currently an Adjunct Professor at the University of Lund, Sweden.

Data compression (M-ary)
Source $x = (x_1, x_2, \ldots, x_N)$, $x_i \in \{1, 2, \ldots, M\}$
- Suppose that a source generates N independent M-ary symbols.
- The frequency of a symbol i is $f_i$, and thus $f_i N$ symbols i occur in x.
- We call $F = (f_1, f_2, \ldots, f_M)$ the composition of x.
- Then the number of different vectors x for a given F is
$$|x_F| = \binom{N}{f_1 N}\binom{N - f_1 N}{f_2 N}\cdots\binom{N - f_1 N - \cdots - f_{M-1} N}{f_M N} = \frac{N!}{(f_1 N)!\,(f_2 N)!\cdots(f_M N)!}$$
and the number of bits/symbol needed to represent x is
$$\frac{1}{N}\log_2|x_F| \approx -\sum_{i=1}^{M} f_i \log_2(f_i N) + \log_2 N = -\sum_{i=1}^{M} f_i \log_2 f_i \quad\text{(entropy!)}$$
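The M-ary count can be evaluated exactly for any concrete sequence and compared with the entropy of its composition. A minimal sketch with assumed function names; the test sequences are made up.

```python
import math
from collections import Counter


def composition_bits_per_symbol(x) -> float:
    """(1/N) * log2 of the number of sequences sharing the composition of x."""
    n = len(x)
    arrangements = math.factorial(n)
    for count in Counter(x).values():
        arrangements //= math.factorial(count)  # exact multinomial coefficient
    return math.log2(arrangements) / n


def composition_entropy(x) -> float:
    """-sum_i f_i * log2(f_i) for the symbol frequencies f_i of x."""
    n = len(x)
    return -sum(c / n * math.log2(c / n) for c in Counter(x).values())


short = "abracadabra"
long_seq = [1] * 5000 + [2] * 3000 + [3] * 2000  # composition (0.5, 0.3, 0.2)

for seq in (short, long_seq):
    print(len(seq), round(composition_bits_per_symbol(seq), 4),
          round(composition_entropy(seq), 4))
```

For the short string the exact count is well below the entropy, while for N = 10000 the two agree to about three decimal places, in line with the large-N approximation.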

En- and decoding
source x (N letters) -> encoder -> F (composition) and lexicographical index for x -> decoder -> x
For large N, $f_i \to p_i$, and thus $-\sum_{i=1}^{M} f_i \log_2 f_i$ is equal to the Shannon entropy!
To transmit the value of F, we need $\frac{M-1}{N}\log_2(N+1)$ bits per output letter, which goes to 0 for large N.
Lexicographical en- and decoding is a solved problem in computer science.