Chapter 3: Digital Audio Processing and Data Compression


Review of number systems: 2's complement and sign-and-magnitude binary
The MSB of a data word is reserved as a sign bit: 0 is positive, 1 is negative. The remaining bits of the data word represent the magnitude.

Range of fixed-point representation of integers:
8-bit 2's complement: from -128 to 127
16-bit 2's complement: from -32768 to 32767

Fixed-point representation of fractional numbers: 2's complement representation with integer and fraction parts.

Fixed-point representation of fractional numbers

For 16-bit arithmetic:
Q15: 1 sign bit, 0 integer bits, 15 fraction bits, representing a number between -1.0 and 1.0
Q14: 1 sign bit, 1 integer bit, 14 fraction bits, representing a number between -2.0 and 2.0
Q13: 1 sign bit, 2 integer bits, 13 fraction bits, representing a number between -4.0 and 4.0

For 32-bit arithmetic:
Q31: 1 sign bit, 0 integer bits, 31 fraction bits, representing a number between -1.0 and 1.0
Q28: 1 sign bit, 3 integer bits, 28 fraction bits, representing a number between -8.0 and 8.0
Q25: 1 sign bit, 6 integer bits, 25 fraction bits, representing a number between -64.0 and 64.0

Examples:
0x0A00 in Q15 of 16-bit arithmetic = 0.078125
0x0A00 in Q14 of 16-bit arithmetic = 0.15625 (note that the same bit pattern read as Q14 is 2x its Q15 value)
0xF7 in Q7 of 8-bit arithmetic = -0.0703125
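As a quick check of the Q-format examples above, the following small Python sketch interprets a raw two's-complement word as a Qn fraction (the helper name q_to_float is ours, not from the notes):

    def q_to_float(raw, total_bits, frac_bits):
        """Interpret a raw unsigned word as a two's-complement Q-format fraction."""
        if raw >= 1 << (total_bits - 1):     # sign bit set -> negative value
            raw -= 1 << total_bits            # apply the two's-complement wrap-around
        return raw / float(1 << frac_bits)    # scale by 2^-frac_bits

    print(q_to_float(0x0A00, 16, 15))   # 0.078125   (Q15)
    print(q_to_float(0x0A00, 16, 14))   # 0.15625    (Q14)
    print(q_to_float(0xF7, 8, 7))       # -0.0703125 (Q7)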

Fixed-point representation of fractional numbers

Addition/Subtraction
The fraction points of both numbers must be aligned prior to the calculation, e.g., Q15+Q15 is allowed but Q15+Q14 is not; you have to convert the Q14 number to Q15 before the addition (remember that shifting a Q14 number 1 bit to the left makes it a Q15 number, but beware that it may overflow).
The sum may need one extra integer bit, e.g., 0.9 + 0.9 = 1.8 cannot be stored in 16-bit Q15 format. To keep a 16-bit result, sacrifice 1 bit of precision by shifting the result one bit to the right.

Multiplication
Multiplying two numbers A and B, the total number of bits of the result equals the number of bits of A plus the number of bits of B. There is no need to keep 2 sign bits in the result, so shift the result 1 bit to the left.
In 16-bit arithmetic, Q15 x Q15 gives a 32-bit result with two sign bits; shifting it 1 bit to the left gives Q31 in 32-bit format.
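A minimal sketch of the Q15 x Q15 multiply rule above (illustrative only; the function names are ours, and Python integers are unbounded, so unlike C there is no word-size overflow here):

    def q15_mul_to_q31(a_q15, b_q15):
        """Multiply two Q15 integers; the raw product is Q30 with two sign bits,
        so shift left by 1 to obtain a Q31 result in a 32-bit word."""
        return (a_q15 * b_q15) << 1

    def q15_mul_to_q15(a_q15, b_q15):
        """Same multiply, reduced back to a 16-bit Q15 result (1 bit left, 16 bits right)."""
        return ((a_q15 * b_q15) << 1) >> 16

    a = int(0.5 * 32768)    # 0.5 in Q15
    b = int(0.25 * 32768)   # 0.25 in Q15
    print(q15_mul_to_q31(a, b) / 2**31)   # 0.125 read back as Q31
    print(q15_mul_to_q15(a, b) / 2**15)   # 0.125 read back as Q15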

Floating Point Representation and Arithmetic
Floating-point arithmetic offers the advantage of eliminating the scaling-factor problem and also expands the range of values over that of fixed-point arithmetic.

Format
A floating-point number consists of two parts: a fraction f and an exponent e, both signed; the value is f x r^e, where r is the radix (base).

Normalized fraction
The magnitude of the normalized fraction lies within the range 1/r <= |f| < 1; for binary numbers (r = 2) this range becomes 0.5 <= |f| < 1. For example, 0.0001101 x 2^0 becomes 0.1101 x 2^-3 after normalization.

Addition/Subtraction
When adding or subtracting two floating-point numbers, the exponents must be compared and made equal by a shifting operation on one of the fractions.
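A toy sketch of the add operation just described, with exponent alignment and renormalization (this is a conceptual model, not IEEE 754; the function and variable names are ours):

    def fp_add(frac_a, exp_a, frac_b, exp_b):
        """Add two numbers of the form frac * 2**exp, with 0.5 <= |frac| < 1 when nonzero."""
        # align exponents: shift the fraction of the smaller-exponent operand to the right
        if exp_a < exp_b:
            frac_a, exp_a = frac_a / 2 ** (exp_b - exp_a), exp_b
        elif exp_b < exp_a:
            frac_b, exp_b = frac_b / 2 ** (exp_a - exp_b), exp_a
        frac, exp = frac_a + frac_b, exp_a
        # renormalize the result so the fraction magnitude is back in [0.5, 1)
        while frac != 0 and abs(frac) >= 1.0:
            frac, exp = frac / 2.0, exp + 1
        while frac != 0 and abs(frac) < 0.5:
            frac, exp = frac * 2.0, exp - 1
        return frac, exp

    print(fp_add(0.5, 3, 0.75, 1))   # 0.5*2^3 + 0.75*2^1 = 5.5 -> (0.6875, 3)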

Common Audio File Formats in Computer Systems

WAVE File Format (.wav) is a file format for storing digital audio (waveform) data. It supports a variety of bit resolutions, sample rates, and channels of audio. This format is very popular on PC platforms and is widely used in professional programs that process digital audio waveforms. It uses Microsoft's version of the Electronic Arts Interchange File Format (IFF) method for storing data in "chunks".

WAVE File Structure
A WAVE file is a collection of a number of different types of chunks. There is a required Format ("fmt ") chunk, which contains important parameters describing the waveform, such as its sample rate. The Data chunk, which contains the actual waveform data, is also required. All other chunks are optional. The Format chunk must precede the Data chunk. All applications that use WAVE must be able to read the two required chunks and can choose to selectively ignore the optional chunks.

WAVE File Structure: Sample Points and Sample Frames

A sample point is a value representing a sample of a sound at a given moment in time.
For waveforms with greater than 8-bit resolution, each sample point is stored as a linear, 2's-complement value which may be from 9 to 32 bits wide. For example, each sample point of a 16-bit waveform would be a 16-bit word where 32767 (0x7FFF) is the highest value and -32768 (0x8000) is the lowest value.
For 8-bit (or less) waveforms, each sample point is a linear, unsigned byte where 255 is the highest value and 0 is the lowest value.
A sample point should be rounded up to a size which is a multiple of 8 bits when stored in a WAVE. This makes the WAVE easier to read into computer memory:
from 1 to 8 bits wide -> stored as an 8-bit byte (i.e., unsigned char)
from 9 to 16 bits wide -> stored as a 16-bit word (i.e., signed short)
from 17 to 24 bits wide -> stored as a three-byte signed integer
from 25 to 32 bits wide -> stored as a 32-bit doubleword (i.e., signed long)
The data bits should be left-justified, with any remaining (i.e., pad) bits zeroed. For example, a 12-bit sample point is stored left-justified in a 16-bit word with its 4 low-order pad bits set to zero.

WAVE File Structure: Sample Points and Sample Frames

For multichannel sounds (for example, a stereo waveform), single sample points from each channel are interleaved. The sample points that are meant to be played simultaneously are collectively called a sample frame. In a stereo waveform, every two sample points make up one sample frame; for a monophonic waveform, a sample frame is merely a single sample point.
(Figure: packing of sample points into sample frames for a stereo, 2-channel example.)

WAVE File Structure: Format chunk

The Format ("fmt ") chunk describes fundamental parameters of the waveform data such as sample rate, bit resolution, and how many channels of digital audio are stored in the WAVE.
The chunk ID is always "fmt ". The chunkSize field is the number of bytes in the chunk.
The wFormatTag field indicates whether compression is used when storing the data. If compression is used, wFormatTag is some value other than 1; if no compression is used, wFormatTag = 1.

WAVE File Structure: Format chunk

The wChannels field contains the number of audio channels for the sound: a value of 1 means monophonic sound, 2 means stereo, and 6 means 5.1 surround.
The dwSamplesPerSec field is the sample rate at which the sound is to be played back, in sample frames per second, e.g., 44100.
The dwAvgBytesPerSec field indicates how many bytes will be played every second. Its value should be equal to the following formula, rounded up to the next whole number: dwSamplesPerSec * wBlockAlign.
The wBlockAlign field should be equal to the following formula, rounded up to the next whole number: wChannels * (wBitsPerSample / 8). Essentially, wBlockAlign is the size of a sample frame in bytes, e.g., a sample frame for a 16-bit mono wave is 2 bytes, and a sample frame for a 16-bit stereo wave is 4 bytes.
The wBitsPerSample field indicates the bit resolution of a sample point, i.e., a 16-bit waveform would have wBitsPerSample = 16.

WAVE File Structure: Data chunk

The Data ("data") chunk contains the actual sample frames, i.e., all channels of waveform data.
The chunkID is always "data". chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by the ID and Size fields.
The waveformData array contains the actual waveform data, arranged into sample frames. The number of sample frames in waveformData is determined by dividing chunkSize by the Format chunk's wBlockAlign.
The Data chunk is required, and one and only one Data chunk may appear in a WAVE.
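A minimal Python sketch of walking the chunk structure just described (the function name read_wave_info is ours; it assumes an uncompressed PCM file with a standard 16-byte fmt chunk appearing before the data chunk, and does no error handling):

    import struct

    def read_wave_info(path):
        """Walk the RIFF chunks of a .wav file and return the fmt fields plus data size."""
        with open(path, "rb") as f:
            riff, riff_size, wave = struct.unpack("<4sI4s", f.read(12))
            assert riff == b"RIFF" and wave == b"WAVE"
            info = {}
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break
                chunk_id, chunk_size = struct.unpack("<4sI", header)
                if chunk_id == b"fmt ":
                    (fmt_tag, channels, samples_per_sec, avg_bytes_per_sec,
                     block_align, bits_per_sample) = struct.unpack("<HHIIHH", f.read(16))
                    f.seek(chunk_size - 16, 1)                 # skip any extra fmt bytes
                    info.update(wFormatTag=fmt_tag, wChannels=channels,
                                dwSamplesPerSec=samples_per_sec,
                                dwAvgBytesPerSec=avg_bytes_per_sec,
                                wBlockAlign=block_align, wBitsPerSample=bits_per_sample)
                elif chunk_id == b"data":
                    info["dataBytes"] = chunk_size
                    info["sampleFrames"] = chunk_size // info["wBlockAlign"]
                    f.seek(chunk_size + (chunk_size & 1), 1)   # chunks are word-aligned
                else:
                    f.seek(chunk_size + (chunk_size & 1), 1)   # skip optional chunks
            return info

    # print(read_wave_info("example.wav"))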

Review of Digital Filters

Applications of digital filters in audio:
Equalization, mixing
Over-sampling A/D conversion
Linear prediction for audio coding
Quadrature Mirror Filtering in MPEG audio codecs
Music and sound synthesis
Digital sound effect generation

Finite Impulse Response (FIR) filter
A non-recursive filter with feed-forward paths only. With input signal x(n), output signal y(n), and filter coefficients b_k, the output is
y(n) = b_0 x(n) + b_1 x(n-1) + ... + b_N x(n-N)
In the z-transform domain, H(z) = b_0 + b_1 z^-1 + ... + b_N z^-N, where z^-1 is called a unit delay element.
An FIR filter can be designed to have linear phase.
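A small sketch of the FIR difference equation above (the coefficients in the example are arbitrary illustrative values, not taken from the notes):

    def fir_filter(x, b):
        """Apply y(n) = sum_k b[k] * x(n-k), assuming x(n) = 0 for n < 0."""
        y = []
        for n in range(len(x)):
            acc = 0.0
            for k, bk in enumerate(b):
                if n - k >= 0:
                    acc += bk * x[n - k]    # feed-forward taps only
            y.append(acc)
        return y

    # 3-tap moving-average filter applied to a short test signal
    print(fir_filter([1.0, 2.0, 3.0, 4.0], [1/3, 1/3, 1/3]))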

Infinite Impulse Response (IIR) filter
A recursive filter with feedback paths. Assuming M = N, the output is
y(n) = b_0 x(n) + b_1 x(n-1) + ... + b_N x(n-N) - a_1 y(n-1) - ... - a_N y(n-N)
Digital filtering requires memory storage, multiplication and addition operations.

Basics of compression and digital representation

Shannon's model of communication
Shannon's model is general and covers many aspects of communication, including error correction, data compression and cryptography.
Data rate: a measure of information rate in terms of the number of bits per second, e.g., a telephone PCM channel rate is 64 kbps, and stereo CD audio is 2 x 44100 x 16 = 1.4112 Mbps.

Basics of compression and digital representation

Entropy: according to Shannon, the entropy of an information source with symbols s_i is defined as
H = sum over i of p_i * log2(1/p_i)
where p_i is the probability that symbol s_i will occur. The term log2(1/p_i) indicates the amount of information contained in s_i, i.e., the number of bits needed to code s_i.
For example, in an audio signal with a uniform distribution of 256 intensity levels, p_i = 1/256, so the number of bits needed to code each level is log2(256) = 8 bits, and the entropy of this signal is 8.
Shannon showed that for a given source and channel, coding techniques exist that code the source with an average code length as close to the entropy of the source as desired. However, finding such a code is a separate problem.
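A quick sketch of the entropy formula, checked against the uniform example above (the second call uses the symbol probabilities of the ALIALIBABA example that appears later in these notes):

    import math

    def entropy(probs):
        """H = sum_i p_i * log2(1 / p_i), ignoring zero-probability symbols."""
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    print(entropy([1 / 256] * 256))          # uniform 256 levels -> 8.0 bits/symbol
    print(entropy([0.4, 0.2, 0.2, 0.2]))     # the ALIALIBABA source -> ~1.92 bits/symbol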

Basics of compression and digital representation

Compression: an algorithm to remove redundancy from a source such that the compressed data can be transmitted or stored more efficiently. Different sources may require different compression algorithms, for example, the Lempel-Ziv algorithm for text (lossless), the linear predictive coding (LPC) algorithm for speech (lossy), and psychoacoustic coding for audio.
Lossless compression: exact reproduction of the input data source after decompression.
Lossy compression: the decompressed data is not the same as the input data, but generally the difference (the loss) may not be noticeable.
Compression ratio is defined as the ratio of the source data rate to the channel data rate and is an important measure of the effectiveness of the compression process; for example, an MP3 audio coder has an average compression ratio of about 12.
Bandwidth measures the rate at which data is transmitted through the network. It is often used as a gauge of speed in the network.
Channel capacity: the amount of information a noisy channel can carry. It is defined as C = B log2(1 + SNR) bits per second, where B is the available bandwidth and SNR is the signal-to-noise ratio. Note that if noise is absent in the channel, the channel capacity is infinite.

Fundamentals of Lossless Compression Algorithms

Compression can be considered as the mapping of source strings to channel strings.
Blocking: forming a mapping between input strings and output strings of various lengths with the aim of matching their probabilities as closely as possible. This mapping technique is called blocking. There are four kinds of blocking: fixed-to-fixed, fixed-to-variable, variable-to-fixed, and variable-to-variable coding. Variable-to-variable coding provides the most flexibility in matching the characteristics of the source with those of the channel.

Shannon-Fano Algorithm

The messages are sorted by probability and then subdivided recursively at as close to power-of-two boundaries as possible. The resultant binary tree, when labeled with 0s and 1s, describes the set of code strings.

Example
Symbol:  A   B   C   D   E
Count:  15   7   6   6   5

Encoding (a top-down approach):
1. Sort symbols according to their frequencies/probabilities, e.g., ABCDE.
2. Recursively divide the symbols into two parts, each with approximately the same number of counts.

Symbol  Count  log2(1/p_i)  Code  Subtotal (# of bits)
A       15     1.38         00    30
B        7     2.48         01    14
C        6     2.70         10    12
D        6     2.70         110   18
E        5     2.96         111   15

Total code length = 89 bits; average code length = 89/39 = 2.282 bits per symbol.
This technique yields an average code length within the range [H, H+1], where H is the entropy of the set of source messages.
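A compact sketch of the top-down split described above; the split rule here (choose the cut that balances the counts of the two halves as evenly as possible) is one reasonable choice, and it reproduces the example table:

    def shannon_fano(symbols):
        """symbols: list of (symbol, count) sorted by decreasing count.
        Returns a dict mapping each symbol to its code string."""
        codes = {s: "" for s, _ in symbols}

        def split(group):
            if len(group) <= 1:
                return
            total = sum(c for _, c in group)
            # choose the cut that makes the two halves' counts as equal as possible
            run, best_cut, best_diff = 0, 1, None
            for i in range(len(group) - 1):
                run += group[i][1]
                diff = abs(2 * run - total)
                if best_diff is None or diff < best_diff:
                    best_cut, best_diff = i + 1, diff
            for s, _ in group[:best_cut]:
                codes[s] += "0"
            for s, _ in group[best_cut:]:
                codes[s] += "1"
            split(group[:best_cut])
            split(group[best_cut:])

        split(symbols)
        return codes

    # reproduces the example table: {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
    print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))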

Huffman Coding Algorithm

The messages are sorted by probability. To form a Huffman code, the two least probable messages are combined into a single pseudo-message whose probability is the sum of the probabilities of its component messages. The pseudo-message replaces the two messages in the list, and the grouping process is repeated iteratively until only one pseudo-message is left. The resultant binary tree describes the set of code strings.

Encoding (a bottom-up approach):
1. Initialization: put all nodes in an OPEN list and keep it sorted at all times (e.g., ABCDE).
2. Repeat until the OPEN list has only one node left:
   a) From OPEN, pick the two nodes having the lowest frequencies/probabilities and create a parent node for them.
   b) Assign the sum of the children's frequencies/probabilities to the parent node and insert it into OPEN.
   c) Assign codes 0 and 1 to the two branches of the tree, and delete the children from OPEN.

Symbol  Count  log2(1/p_i)  Code  Subtotal (# of bits)
A       15     1.38         0     15
B        7     2.48         100   21
C        6     2.70         101   18
D        6     2.70         110   18
E        5     2.96         111   15

Total code length = 87 bits; average code length = 87/39 = 2.23 bits per symbol.
Entropy = (15 x 1.38 + 7 x 2.48 + 6 x 2.70 + 6 x 2.70 + 5 x 2.96) / 39 = 85.26 / 39 = 2.19 bits per symbol.
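A minimal heap-based Huffman encoder sketch that builds the tree bottom-up as described (tie-breaking between equal counts may yield a different, but equally optimal, set of codewords than the table above; the total code length is still 87 bits):

    import heapq

    def huffman_codes(freqs):
        """freqs: dict symbol -> count. Returns dict symbol -> code string."""
        # each heap entry is (count, tie_breaker, tree), where tree is a symbol or (left, right)
        heap = [(c, i, s) for i, (s, c) in enumerate(freqs.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            c1, _, t1 = heapq.heappop(heap)                      # two least probable nodes...
            c2, _, t2 = heapq.heappop(heap)
            heapq.heappush(heap, (c1 + c2, counter, (t1, t2)))   # ...merged into a parent node
            counter += 1
        codes = {}

        def walk(tree, prefix):
            if isinstance(tree, tuple):
                walk(tree[0], prefix + "0")
                walk(tree[1], prefix + "1")
            else:
                codes[tree] = prefix or "0"

        walk(heap[0][2], "")
        return codes

    freqs = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
    codes = huffman_codes(freqs)
    print(codes)
    print(sum(freqs[s] * len(c) for s, c in codes.items()))   # 87 bits total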

Shannon-Fano and Huffman Coding Algorithms: Discussion

Decoding for these two algorithms is trivial as long as the coding table (the statistics) is sent before the data. (There is a small overhead for sending this, negligible if the data file is big.)
Unique prefix property: no code is a prefix of any other code (all symbols are at the leaf nodes) --> great for the decoder, unambiguous.
The algorithms described above use fixed statistics. They make two passes over the message: the first pass to gather statistics and the second pass to code the message (using the statistics). If prior statistics are available and accurate, then Huffman coding is very good.
For compression of, say, live audio and video, these algorithms require statistical knowledge which is often not available. Even when it is available, it can be a heavy overhead, especially when many tables have to be sent. Moreover, the statistics of most data sources vary over time.
By taking into account the impact of the previous symbol on the probability of the current symbol (e.g., "q" and "u" often come together in English), an adaptive algorithm can more accurately reflect the statistics of the source and hence achieve better coding performance. The solution is to use adaptive algorithms.

The Modern Paradigm of Compression

The modern paradigm uses predictions to divide compression into separate modeling and coding units. Each step transmits an instance. At the start of each step, the model constructs a prediction p (of the next instance) and passes it to the coder. The coder uses the prediction to transmit the next instance a using as close to ln(1/p(a)) nats (i.e., log2(1/p(a)) bits) as it can. Meanwhile, the receiver's model has generated an identical prediction, which the decoder uses to identify the instance that was transmitted. The transmitter and receiver both use the new instance to update their models. The cycle repeats until the entire message is transmitted.
This prediction + coding approach is essentially adaptive data compression.

Adaptive Huffman Coding Algorithm

The key is to have both the encoder and the decoder use exactly the same initialization and update_model routines. update_model does two things: (a) increment the count, (b) update the Huffman tree.
During the updates, the Huffman tree maintains its sibling property, i.e., the nodes (internal and leaf) are arranged in order of increasing weight (see figures). When swapping is necessary, the farthest node with weight W is swapped with the node whose weight has just been increased to W+1. Note: if the node with weight W has a subtree beneath it, the subtree goes with it.
The Huffman tree can look very different after node swapping; e.g., in the third tree, node A is again swapped and becomes the #5 node. It is now encoded using only 2 bits.

Adaptive Huffman Coding Algorithm

Note: the code for a particular symbol may change during the adaptive coding process.

Golomb-Rice Coding Algorithm

Golomb coding is a lossless data compression method invented by Solomon W. Golomb in the 1960s. If the source alphabet follows a geometric distribution, a Golomb code is an optimal prefix code; it is particularly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values, for example, coding the residual of an audio signal after linear prediction.
Golomb codes can be considered a special case of Huffman codes for sources with geometrically distributed symbols, P(x) = (1 - p)^x * p for x = 0, 1, 2, ..., where p is the probability of the value 0.
Rice coding (invented by Robert F. Rice) uses a subset of the family of Golomb codes to produce a simpler, suboptimal prefix code. A Golomb code has a tunable parameter that can be any positive integer; Rice codes are those in which the tunable parameter is a power of two. This makes Rice codes convenient for use on a computer, since multiplication and division by powers of 2 can be implemented efficiently with binary shifts.
Rice coding is used as the entropy coding stage in a number of lossless image compression and audio data compression methods, for example, the MPEG-4 ALS audio coder.

Golomb-Rice Coding Algorithm

Golomb coding uses a tunable parameter M to divide an input value into two parts: q, the quotient of division by M, and r, the remainder. The quotient is sent in unary coding, followed by the remainder in truncated binary encoding. When M = 1, Golomb coding is equivalent to unary coding.
The two parts are given by q = int[x/M] and r = x - qM, where x is the number being encoded and int[.] denotes truncation to an integer value.
The final code looks like <quotient code><remainder code>, where the quotient code is q 1-bits followed by one 0-bit as a delimiter, and the remainder code is the binary code for the remainder r. Note that r can be encoded with a varying number of bits: it is exactly b = log2(M) bits for a Rice code, and switches between b-1 and b bits for a general Golomb code (i.e., when M is not a power of 2).

Golomb-Rice Coding Algorithm

The algorithm to perform Rice coding is shown below:
1. Fix the tunable parameter M to a power-of-2 integer value.
2. For x, the number to be encoded:
   Quotient q = int[x/M]
   Remainder r = x modulo M
3. Generate the codeword in the format <Quotient Code><Remainder Code>, where
   Quotient Code (in unary coding): write a q-length string of 1 bits, then write a 0 bit
   Remainder Code (in binary coding): write b = log2(M) bits of binary code for the remainder

Example: with M = 32 (so b = 5) and x = 83, q = int[83/32] = 2 and r = 83 mod 32 = 19, so the final codeword is 110 10011 = 11010011.
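A short sketch of the Rice encoder (and a matching single-codeword decoder) following the steps above; the function names are ours, and the example reproduces the 11010011 codeword:

    def rice_encode(x, m):
        """Rice-encode a non-negative integer x with power-of-two parameter m."""
        b = m.bit_length() - 1                 # b = log2(m)
        q, r = x // m, x % m
        # unary quotient (q ones), a 0 delimiter bit, then the b-bit binary remainder
        return "1" * q + "0" + format(r, "0{}b".format(b))

    def rice_decode(bits, m):
        """Inverse of rice_encode for a single codeword."""
        b = m.bit_length() - 1
        q = bits.index("0")                    # number of leading 1s = quotient
        r = int(bits[q + 1 : q + 1 + b], 2)
        return q * m + r

    print(rice_encode(83, 32))                 # '11010011'
    print(rice_decode("11010011", 32))         # 83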

Arithmetic Coding Algorithm

Arithmetic coding is an entropy coding technique. The compression achieved by arithmetic coding is generally better than Huffman coding. The idea behind arithmetic coding is to group source symbols and code them into a single fractional number in the range [0, 1), viewed as a probability line.
Each symbol is assigned a range on the probability line based on its probability: the higher the probability, the wider the range assigned to it. After the ranges on the probability line are defined, the encoding process can be started. The algorithm to accomplish this for a message of any length is shown below:

Set low to 0.0
Set high to 1.0
While there are still input symbols do
    get an input symbol
    range = high - low
    high = low + range * high_range(symbol)
    low = low + range * low_range(symbol)
End of While
output low
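A direct Python transcription of this pseudocode, using floating-point arithmetic (fine for the short classroom example that follows; a practical coder uses the integer scheme discussed later; the names RANGES and arith_encode are ours):

    # per-symbol [low, high) ranges on the probability line, as in the example below
    RANGES = {"A": (0.0, 0.4), "B": (0.4, 0.6), "L": (0.6, 0.8), "I": (0.8, 1.0)}

    def arith_encode(message, ranges):
        low, high = 0.0, 1.0
        for sym in message:
            rng = high - low
            sym_low, sym_high = ranges[sym]
            high = low + rng * sym_high     # narrow the interval to the symbol's range
            low = low + rng * sym_low
        return low                          # any number in [low, high) identifies the message

    print(arith_encode("ALIALIBABA", RANGES))   # ~0.308974592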

Arithmetic Coding

Example: if we are going to encode the message "ALIALIBABA", we first work out a probability distribution and assign the probabilities to ranges along a probability line, which is nominally 0 to 1, like this:

Symbol  Probability  Range
A       0.4          0.0 - 0.4
B       0.2          0.4 - 0.6
L       0.2          0.6 - 0.8
I       0.2          0.8 - 1.0

Each symbol is assigned the portion of the 0-1 range that corresponds to its probability of appearance.
The most significant portion of an arithmetic-coded message belongs to the first symbol to be encoded. In order for the first symbol, i.e., an "A", to be decoded properly, the final coded message has to be a number greater than or equal to 0.00 and less than 0.40.
After the first symbol is encoded, the range for the output number is bounded by the low number (0.00) and the high number (0.40). Each new symbol to be encoded will further restrict the possible range of the output number.
The next symbol to be encoded, "L", owns the range 0.60 through 0.80, so the new encoded number will have to fall somewhere in the 60th to 80th percentile of the currently established range. Applying this logic further restricts the number to the range 0.24 to 0.32.

Arithmetic Coding

The encoding process through to its natural conclusion with the chosen message "ALIALIBABA" looks like this:

Symbol  Low Value     High Value    Range
A       0.0           0.4           0.4
L       0.24          0.32          0.08
I       0.304         0.32          0.016
A       0.304         0.3104        0.0064
L       0.30784       0.30912       0.00128
I       0.308864      0.30912       0.000256
B       0.3089664     0.3090176     0.0000512
A       0.3089664     0.30898688    0.00002048
B       0.308974592   0.308978688   0.000004096
A       0.308974592   0.3089762304  0.0000016384

The encoded codeword can be any value between the final low and high values, i.e., between 0.308974592 and 0.3089762304.
The total number of bits B required to encode this codeword depends on the final range r (which equals the product of the probabilities of the encoded symbols) via B = log2(1/r). In this example, B = log2(1/0.0000016384), or about 19.22 bits. With 10 symbols in the source sequence, the average bit rate is about 1.92 bits per symbol.
Check the entropy of the source. Is arithmetic coding efficient?

Arithmetic Coding

The decoding process that recreates the exact stream of input symbols operates as follows:
The first symbol in the message owns the code space that the encoded message falls in. Since the number 0.308974592 falls between 0.0 and 0.4, we know that the first character must be "A".
Since the low and high values of the range for "A" are known, their effects can be removed by reversing the process that put them in. First, the low value of "A" (0.0) is subtracted from the number, giving 0.308974592. Then it is divided by the range of "A", which is 0.4. This gives a value of 0.77243648, which in turn determines where it lands in the range of the next letter, "L".
The algorithm for decoding the incoming number looks like this:

get encoded number
Do
    find the symbol whose range straddles the encoded number
    output the symbol
    range = symbol high value - symbol low value
    subtract the symbol low value from the encoded number
    divide the encoded number by range
Until no more symbols

Note that a special EOF symbol, or a length code for the stream, can be used to identify the end of the decoding process.
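The matching decoder sketch, paired with the encoder and RANGES table introduced earlier; it stops after a known message length rather than an EOF symbol:

    def arith_decode(code, length, ranges):
        out = []
        for _ in range(length):
            for sym, (sym_low, sym_high) in ranges.items():
                if sym_low <= code < sym_high:     # the symbol whose range straddles the number
                    out.append(sym)
                    # remove the symbol's effect: subtract its low value, divide by its range
                    code = (code - sym_low) / (sym_high - sym_low)
                    break
        return "".join(out)

    print(arith_decode(0.308974592, 10, RANGES))   # 'ALIALIBABA'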

Arithmetic Coding

The decoding algorithm applied to the "ALIALIBABA" message proceeds like this:

Encoded Number  Output Symbol  Low Value  High Value  Range
0.308974592     A              0.0        0.4         0.4
0.77243648      L              0.6        0.8         0.2
0.8621824       I              0.8        1.0         0.2
0.310912        A              0.0        0.4         0.4
0.77728         L              0.6        0.8         0.2
0.8864          I              0.8        1.0         0.2
0.432           B              0.4        0.6         0.2
0.16            A              0.0        0.4         0.4
0.4             B              0.4        0.6         0.2
0.0             A              0.0        0.4         0.4

In summary, the encoding process is simply one of narrowing the range of possible numbers with every new symbol; the new range is proportional to the predefined probability attached to that symbol. Decoding is the inverse procedure, in which the range is expanded in proportion to the probability of each symbol as it is extracted.

Arithmetic Coding: Practical Matters

Do we need a floating-point processor? The bit length of the output number grows as the number of symbols increases. Do we need to start over again when it reaches the limit?
Arithmetic coding can be implemented using 16-bit or 32-bit integer math, using an incremental transmission scheme in which fixed-size integer state variables receive new bits in at the low end and shift them out at the high end.

Practical implementation
Imagine that a 6-decimal-digit (fixed-length) register is used; the decimal equivalent of the setup would look like this:
HIGH: 999999 (this is 1.0)
LOW:  000000 (this is 0.0)
The range between the low value and the high value, i.e., the difference between the two registers, is treated as 1000000.
In encoding the first symbol, the new high value is computed using the formula from the previous section. In this case the high range was 0.4, which gives a new value for HIGH of 399999. The calculation of the low value follows the same path, with a resulting new value of 000000. So now HIGH and LOW look like this:
HIGH: 399999
LOW:  000000

Arithmetic Coding: Practical Matters

Shift out the most significant digits of the low and high values whenever they match:

                            LOW     HIGH    RANGE    CUMULATIVE OUTPUT
Initial state               000000  999999  1000000
Encode A (0.0-0.4)          000000  399999  400000
Encode L (0.6-0.8)          240000  319999  80000
Encode I (0.8-1.0)          304000  319999  16000
Shift out 3                 040000  199999  160000   0.3
Encode A (0.0-0.4)          040000  103999  64000    0.3
Encode L (0.6-0.8)          078400  091199  12800    0.3
Shift out 0                 784000  911999  128000   0.30
Encode I (0.8-1.0)          886400  911999  25600    0.30
Encode B (0.4-0.6)          896640  901759  5120     0.30
Encode A (0.0-0.4)          896640  898687  2048     0.30
Shift out 8 and 9           664000  868799  204800   0.3089
Encode B (0.4-0.6)          745920  786879  40960    0.3089
Shift out 7                 459200  868799  409600   0.30897
Encode A (0.0-0.4)          459200  623039  163840   0.30897
Shift out 4                                          0.308974
Shift out 5                                          0.3089745
Shift out 9                                          0.30897459
Shift out 2                                          0.308974592
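A sketch of this 6-digit decimal register scheme, reusing the RANGES table from the earlier encoder sketch (decimal rather than binary purely to mirror the table; Fraction is used so the hand-worked register values are reproduced exactly, and the simple same-digit test below does not yet handle the underflow case discussed next):

    from fractions import Fraction

    def incremental_encode(message, ranges, digits=6):
        """Incremental arithmetic encoding with fixed-width decimal registers."""
        scale = 10 ** digits
        low, high, out = 0, scale - 1, []
        for sym in message:
            rng = high - low + 1
            sym_low = Fraction(str(ranges[sym][0]))    # exact decimal fractions
            sym_high = Fraction(str(ranges[sym][1]))
            high = low + int(rng * sym_high) - 1       # narrow the interval
            low = low + int(rng * sym_low)
            # shift out the most significant digit whenever LOW and HIGH agree on it
            while low // (scale // 10) == high // (scale // 10):
                out.append(str(low // (scale // 10)))
                low = (low % (scale // 10)) * 10
                high = (high % (scale // 10)) * 10 + 9
        out.append("{:0{}d}".format(low, digits))      # flush the remaining digits of LOW
        return "0." + "".join(out)

    print(incremental_encode("ALIALIBABA", RANGES))    # 0.308974592...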

Arithmetic Coding: Practical Matters: the underflow problem

This scheme works well for incrementally encoding a message. However, there is a potential loss of precision under certain circumstances. If the encoded word has a string of 0s or 9s in it, the high and low values will slowly converge on a value but may not see their most significant digits match immediately. For example, the high and low values just after encoding the first B in the previous table are:
HIGH: 901759
LOW:  896640
At this point the most significant digits of LOW and HIGH are not the same, which means they cannot be shifted out, but the calculated range has become small (only 4 digits long), which means the output word may not have enough precision to be accurately encoded in subsequent steps. In the worst case, after a few more iterations, HIGH and LOW could look like this:
HIGH: 900000
LOW:  899999
Then the values are permanently stuck: the range between HIGH and LOW has become so small that any calculation will always return the same values. But since the most significant digits of the two words are not equal, the algorithm can't output a digit and shift!

Arithmetic Coding: Practical Matters: the underflow problem

The way to defeat this underflow problem is to prevent things from ever getting this bad. If the two most significant digits don't match but are on adjacent numbers, a second test is applied: check whether the 2nd most significant digit of HIGH is a 0 and the 2nd digit of LOW is a 9. If so, we are on the road to underflow and need to take action.
Instead of shifting the most significant digit out of the word, we delete the 2nd digit from HIGH and LOW, and shift the rest of the digits left to fill the space. The most significant digit stays in place. We then set an underflow counter to remember that we threw away a digit; we aren't yet sure whether it will end up as a 0 or a 9. The operation looks like this:

           Before   After
HIGH       901759   917599
LOW        896640   866400
Underflow  0        1

After every recalculation operation, if the most significant digits don't match up, we can check for underflow digits again. If they are present, we shift them out and increment the counter. When the most significant digits do finally converge to a single value, we first output that value and then output all of the "underflow" digits that were previously discarded. The underflow digits will be all 9s or all 0s, depending on whether HIGH and LOW converged to the higher or the lower value.

Arithmetic Coding: Practical Matters: underflow prevention

Re-examining the example using only 5-digit precision:

                                         LOW    HIGH   RANGE   CUMULATIVE OUTPUT
Initial state                            00000  99999  100000
Encode A (0.0-0.4)                       00000  39999  40000
Encode L (0.6-0.8)                       24000  31999  8000
Encode I (0.8-1.0)                       30400  31999  1600
Shift out 3                              04000  19999  16000   0.3
Encode A (0.0-0.4)                       04000  10399  6400    0.3
Encode L (0.6-0.8)                       07840  09119  1280    0.3
Shift out 0                              78400  91199  12800   0.30
Encode I (0.8-1.0)                       88640  91199  2560    0.30
Encode B (0.4-0.6)                       89664  90175  512     0.30
Discard second digit, count += 1         86640  91759  5120    0.30
Encode A (0.0-0.4)                       86640  88687  2048    0.30
Shift out 8                              66400  86879  20480   0.308
Check count and shift out 9, count -= 1  66400  86879  20480   0.3089
Encode B (0.4-0.6)                       74592  78687  4096    0.3089
Shift out 7                              45920  86879  40960   0.30897
Encode A (0.0-0.4)                       45920  62303  16384   0.30897
Shift out 4                                                    0.308974
Shift out 5                                                    0.3089745
Shift out 9                                                    0.30897459
Shift out 2                                                    0.308974592