HUFFMAN CODING AND HUFFMAN TREE




4.1 CODING

Coding: reducing strings over an arbitrary alphabet Σ_o to strings over a fixed alphabet Σ_c to standardize machine operations (|Σ_c| < |Σ_o|). An example is the binary representation of both operands and operators in machine instructions in computers.

It must be possible to uniquely decode a code-string (string over Σ_c) to a source-string (string over Σ_o). Not all code-strings need to correspond to a source-string. Both the coding and the decoding should be efficient.

Word: a finite non-empty string over an alphabet (Σ_o or Σ_c).

Simple Coding Mechanism: code(a_i) = a non-empty string over Σ_c, for a_i in Σ_o, and code(a_1 a_2 ... a_n) = code(a_1).code(a_2)...code(a_n).

Example. Σ_o = {A, B, C, D, E} and Σ_c = {0, 1}.

    A     B     C     D      E                                      Prefix property?  Easy to decode?
    000   001   010   011    100    code(AAB) = 000.000.001         yes               yes
    0     01    001   0001   00001  code(AB) = 0.01 = code(C)       no                not always possible
                                                                                      to uniquely decode
    1     01    001   0001   00001  prefix-free code                yes               yes
    1     10    100   1000   10000  not a prefix-free code          no                no
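As a minimal sketch (not part of the original notes), the prefix property of a code table can be checked mechanically, and the failure of unique decoding for the second code above can be demonstrated directly; the code tables below are the ones from the example.

```python
# Sketch: check the prefix property of a code table, and illustrate why
# the code {A:0, B:01, C:001, ...} is not uniquely decodable.

def has_prefix_property(code):
    """True if no code-word is a proper prefix of another code-word."""
    words = list(code.values())
    return not any(w1 != w2 and w2.startswith(w1)
                   for w1 in words for w2 in words)

fixed_len = {'A': '000', 'B': '001', 'C': '010', 'D': '011', 'E': '100'}
ambiguous = {'A': '0', 'B': '01', 'C': '001', 'D': '0001', 'E': '00001'}

print(has_prefix_property(fixed_len))   # True: all words distinct, same length
print(has_prefix_property(ambiguous))   # False: '0' is a prefix of '01', etc.

# The ambiguity: code(A).code(B) = '0' + '01' = '001' = code(C)
assert ambiguous['A'] + ambiguous['B'] == ambiguous['C']
```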

4.2 PREFIX-FREE CODE

Definition: No code(a_i) is a prefix of another code(a_j). In particular, code(a_i) ≠ code(a_j) for a_i ≠ a_j.

Binary-tree representation of a prefix-free binary code: 0 = label(left branch) and 1 = label(right branch).

[Figures: two binary trees. The first represents the code A = 000, B = 001, C = 011, D = 10, E = 110; the second the code A = 000, B = 001, C = 01, D = 10, E = 11.]

Advantage of the second tree: C and E have shorter code-words, and each non-terminal node has 2 children.

One can decode the symbols from left to right, i.e., as they are received. A sufficient condition for left-to-right unique decoding is the prefix property.

Question:
?  How can we keep the prefix-free property and assign shorter codes to some of the symbols {A, B, C, D, E}?
?  What should we do if the symbols in Σ_o occur with probabilities p(A) = 0.1 = p(B), p(C) = 0.3, p(D) = p(E) = 0.25?
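Left-to-right decoding with a prefix-free code can be sketched as a walk down the binary-tree representation: 0 means go left, 1 means go right, and reaching a leaf emits a symbol and restarts at the root. This sketch (not from the notes) uses nested dicts for the tree and one of the prefix-free codes discussed here.

```python
# Sketch: decode a bit-string left to right using the binary-tree
# representation of a prefix-free code (0 = left branch, 1 = right branch).

def build_tree(code):
    """Build the binary tree of a prefix-free code as nested dicts;
    a leaf is represented by the symbol (a string) itself."""
    root = {}
    for sym, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = sym          # last bit leads to the leaf
    return root

def decode(bits, code):
    root = build_tree(code)
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):     # reached a leaf: emit symbol, restart
            out.append(node)
            node = root
    return ''.join(out)

# A prefix-free code with every non-terminal node having 2 children:
code = {'A': '000', 'B': '001', 'C': '01', 'D': '10', 'E': '11'}
print(decode('0000011011', code))     # -> ABDE  (000|001|10|11)
```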

4.3 HUFFMAN-TREE

A binary tree with each non-terminal node having 2 children. It gives an optimal (minimum average code-length) prefix-free binary code to each a_i in Σ_o for given probabilities p(a_i) > 0.

Huffman's Algorithm:
1. Create a terminal node for each a_i in Σ_o, with probability p(a_i), and let S = the set of terminal nodes.
2. Select nodes x and y in S with the two smallest probabilities.
3. Replace x and y in S by a node with probability p(x) + p(y). Also, create a node in the tree which is the parent of x and y.
4. Repeat (2)-(3) until |S| = 1.

Example. Σ_o = {A, B, C, D, E} and p(A) = 0.1 = p(B), p(C) = 0.3, p(D) = p(E) = 0.25. The nodes in S are shown shaded.

[Figure: successive stages of the algorithm; the merged nodes get probabilities 0.2 = 0.1 + 0.1, 0.45 = 0.2 + 0.25, 0.55 = 0.25 + 0.3, and finally 1.0.]

After redrawing the tree, the codes are:

    A = 000, B = 001, C = 01, D = 10, E = 11
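Steps (1)-(4) above can be sketched in Python (this implementation is not from the notes; the tie-breaking by an insertion counter is an added assumption to make the result deterministic):

```python
# Sketch of Huffman's algorithm: S is a min-heap of (probability, id, tree),
# where a tree is either a symbol or a (left, right) pair of subtrees.
import heapq
from itertools import count

def huffman_codes(probs):
    tick = count()                     # tie-breaker for equal probabilities
    # Step 1: one terminal node per symbol.
    S = [(p, next(tick), sym) for sym, p in probs.items()]
    heapq.heapify(S)
    while len(S) > 1:                  # step 4: repeat until |S| = 1
        px, _, x = heapq.heappop(S)    # step 2: the two smallest
        py, _, y = heapq.heappop(S)
        # Step 3: replace x and y by their parent node.
        heapq.heappush(S, (px + py, next(tick), (x, y)))
    codes = {}
    def walk(tree, word):              # 0 = left branch, 1 = right branch
        if isinstance(tree, tuple):
            walk(tree[0], word + '0')
            walk(tree[1], word + '1')
        else:
            codes[tree] = word
    walk(S[0][2], '')
    return codes

probs = {'A': 0.1, 'B': 0.1, 'C': 0.3, 'D': 0.25, 'E': 0.25}
codes = huffman_codes(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, avg)   # code lengths 3, 3, 2, 2, 2; average length 2.2
```

For the example probabilities this reproduces the optimal depth profile of the tree above: A and B get 3-bit codes, the other three symbols 2-bit codes, for an average length of 2.2 bits per symbol.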

4.4 NUMBER OF BINARY TREES

0-2 Binary Tree: each non-terminal node has 2 children.

#(binary trees with N nodes) = (1/(2N+1))·C(2N+1, N), the Nth Catalan number.

#(0-2 binary trees with N terminal nodes) = #(binary trees with N-1 nodes) = (1/(2N-1))·C(2N-1, N-1), which grows exponentially in N (it is at least 2^(N-2)).

Example. Binary trees with N-1 = 3 nodes correspond to 0-2 binary trees with N = 4 terminal nodes; there are 5 of each.

Merits of Huffman's Algorithm: it finds the optimal coding in O(N·log N) time, and it does so without having to search through the exponentially many possible 0-2 binary trees.
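The counting formulas can be checked with a short sketch (not from the notes), using the equivalent Catalan-number form C_N = C(2N, N)/(N+1):

```python
# Sketch: count binary trees (Catalan numbers) and 0-2 binary trees.
from math import comb

def binary_trees(n):
    """Number of binary trees with n nodes: the Catalan number C_n."""
    return comb(2 * n, n) // (n + 1)

def zero_two_trees(n_leaves):
    """Number of 0-2 binary trees with n_leaves terminal nodes,
    which equals the number of binary trees with n_leaves - 1 nodes."""
    return binary_trees(n_leaves - 1)

print([binary_trees(n) for n in range(1, 6)])  # [1, 2, 5, 14, 42]
print(zero_two_trees(4))                       # 5, matching the example
```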

4.5 DATA-STRUCTURE FOR IMPLEMENTING HUFFMAN'S ALGORITHM

Main Operations:

Choosing the two nodes with minimum associated probabilities (and creating a parent node, etc.). Use a heap data-structure for this part. This is done N-1 times; the total work for this part is O(N·log N).

Addition of each parent node and connecting it with its children takes constant time per node. A total of N-1 parent nodes are added, so the total time for this part is O(N).

Complexity: O(N·log N).
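A minimal sketch (assumed, not from the notes) of the heap's role: it tracks only the probabilities in S across the N-1 merge steps, printing the successive heap contents for the example of Section 4.3. The tree links created alongside each merge are omitted here, since each takes only constant time.

```python
# Sketch: successive heap contents (the set S) during the N-1 merges,
# for the probabilities p(A)=p(B)=0.1, p(C)=0.3, p(D)=p(E)=0.25.
import heapq

S = [0.1, 0.1, 0.25, 0.25, 0.3]
heapq.heapify(S)                       # O(N)
merged = []
while len(S) > 1:                      # N-1 iterations
    x = heapq.heappop(S)               # smallest:      O(log N)
    y = heapq.heappop(S)               # next smallest: O(log N)
    heapq.heappush(S, x + y)           # parent node:   O(log N)
    merged.append(x + y)
    print([round(p, 2) for p in sorted(S)])

# The parent probabilities appear in the order 0.2, 0.45, 0.55, 1.0,
# matching the merges in the example of Section 4.3.
```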

4.6 EXERCISES

1. Consider the codes shown below.

       A = 000, B = 001, C = 011, D = 10, E = 110

   (a) Arrange the codes in a binary-tree form, with 0 = label(left branch) and 1 = label(right branch).
   (b) Do these codes have the prefix-property? How do you decode the string 10110001000?
   (c) Modify the above code (keeping the prefix property) so that the new code will have a smaller average length no matter what the probabilities of the symbols are. Show the binary tree for the new code.
   (d) What are the two key properties of the new binary tree (hint: compare with your answer for part (a))?
   (e) Give suitable probabilities for the symbols such that prob(A) < prob(B) < prob(C) < prob(D) < prob(E) and the new code in part (c) is optimal (minimum average length) for those probabilities.

2. Show the successive heaps in creating the Huffman-tree for the probabilities p(A) = 0.1 = p(B), p(C) = 0.3, p(D) = 0.14, p(E) = 0.12, and p(F) = 0.24.

3. Give some probabilities for the items in Σ_o = {A, B, ..., F} that give the largest possible value for the optimal average code length.

4. Argue that for an optimal Huffman-tree, any subtree is optimal (w.r.t. the relative probabilities of its terminal nodes), and also that the tree obtained by removing all children and other descendants of a node x gives a tree which is optimal w.r.t. p(x) and the probabilities of its other terminal nodes.