# Binary Codes for Nonuniform

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Binary Codes for Nonuniform Sources (dcc-2005) Alistair Moffat and Vo Ngoc Anh Presented by:- Palak Mehta

2 Basic Concept: In many applications of compression, decoding speed is at least as important as compression effectiveness. So here they present two coding methods that make use of fixed binary representations. They have all of the consequent benefits in terms of decoding performance.

3 Brief About: Unary Codes It is an entropy encoding method that represents a natural number n with n ones followed by zero. Eg. 5 is represented as Elias _ Codes They are universal codes encoding positive integers. To code: Separate the integer into the highest power of 2 it contains (2 N ) and the remaining N binary digits of the integer Encode N in Unary; that is as N zeros followed by a one Append the remaining N binary digits to this representation of N. Eg. 2 7 = = 00111

4 Brief About (contd ): Golomb Codes It is a form of entropy encoding that is optimum alphabets following geometric distributions, ie; when small values are vastly more common than large values. The code depends on choice of a parameter b. The first step is to compute the 3 quantities, q = int [n/b] ; r = [n%b] ; c = [log 2 b], where n is the no. being encoded. Eg. Let b = 10 and so c = 4 r q code code Thus, the Golomb Code for 22 will be

5 The Vision: They introduced two coding methods that have following properties: Based on simple binary codes and thus fast to decode Sensitive to localized variations in probability distribution and thus capable of outperforming Huffman Codes Parameterless, and thus still useful even when the message to be represented is small compared to the number of permitted alphabet symbols

6 Recursive Multi-span Selector: In a Binary Code, all codewords are of the same length Golomb and Elias Codes can also be thought of as binary codes, but with the length of the binary part set on a codeword by codeword basis. So the key idea here is that there are other possibilities between the two extremes of binary codes and elias codes Eg. Consider the following list of integers: 13, 8,3,1,0,0,0,0,5,3,7,1,6

7 Recursive Multi-span Selector (Contd ): Transforming each value x to the corresponding bit length [log 2 (x+1)] gives a sequence of selector values: 4,4,2,1,0,0,0,0,3,2,3,1,3 As, contrast to Elias _ code, suppose that the list of selectors is grouped in blocks containing s 1 code, where s 1 is fixed in advance. So our example by assuming s 1 = 2 would be: 4,2,0,0,3,3,3 by taking maximum over pairs

8 Recursive Multi-span Selector (Contd ): The matching codeword string would then be: We can handle the sequence of selectors recursively using the same exact method. Using s 2 =2 as an example parameter at the next level, we get: 3,0,2,2 and with the bitstring as: Now further recursive passes result in the need for a single number to be transmitted as the first item in the message

9 Recursive Multi-span Selector (Contd ): The number of items in the message is also required before decoding can be commenced, in order that the recursive structure can be recovered If s 1 = s 2 = = s for some fixed value s, then the number of recursive levels required to code a message of m integers grows as log s m However, using s i = s is not necessarily ideal because the repeated taking of log on the values means that they quickly fall into compact and consistent range. Hence, instead of using s i =s, we take s i =s i

10 Recursive Multi-span Selector (Contd ): With this arrangement, the no. of recursions is ( log s m)/2 and for s = 4 and m = 1,000,000 there are just 4 levels This new method is called RBUC(s) (Recursive Bottom Up, Complete) Given the small number of plausible values for s, it is reasonable to extend the encoder and have it search over the possibilities and indicate to the decoder which values of s it will employ for this particular message This method is called RBUC-B (Recursive Bottom Up Complete, Best)

11 Top Down Evaluation: Suppose that it is known that the largest of the m message symbols can be coded in b bits Then clearly, the entire message can be represented in m.b bits So if the other m-1 values are also large, coding the message might be quite economical This observation suggests an alternative two-stage process

12 Top Down Evaluation (Contd ): In the first stage, a bottom-up hierarchy of maximum symbol values and coding costs is constructed in a tree structure with s-way branching at each node During this stage, the s children of each node report to their parent the maximum number of bits required by any symbol plus the number of bits required to resolve that parameter for each of the s subtrees

13 Top Down Evaluation (Contd ): In the second top-down phase, information is percolated back down the tree from the root, and the decision is made at each node whether it is sensible to continue the recursive decomposition into s smaller subsequences or whether it is better to terminate An s-bit control variable is added at each decision point Each bit in the control variable corresponds to one of the s subtrees, and stipulates the relationship between codeword length

14 Top Down Evaluation (Contd...): Then the kth bit in the control variable is set to 0, if the kth subtree contains one or more symbols that require b bits, and is set to 1 if every symbol in that subtree can be stored in strictly less than b bits As a single exception, if all s bits in the control variable are on then this node is regarded as a stopping node and all of the integers controlled by the node are coded as b bit binary values So, at each node, decision is made, whether it is more economical to continue the recursive decomposition or to simply use a flat binary tree

15 Top Down Evaluation (Contd ): Here, it is explored, that an exact value of b is maintained at every node in the tree rather than an upper bound To do this, every one-bit in the control variable is augmented by a follow-up binary code of [log 2 b] bits, storing a number between 0 and b -1 inclusive, indicating the exact parameter for the corresponding subtree This method is called RTDP(s) Recursive Top Down, Partial), where s>1 is an integer indicating the uniform branching of the tree

16 Experiments:

17 Experiments(contd ):

18 Experiments(contd ):

19 Experiments(contd ): Figure 2 :Compression obtained by different coding schemes on synthetic geometric data parameterized by a probability p. Each point represents a message of 1,000,000 integers

20 Experiments(contd ): Table 3 : Decoding speed (measured as total elapsed time in seconds to decode the compressed data file and write an output), carried out using a dual 2.8 Ghz Intel Xeon 2 GB of RAM, twelve 146 GB SCSI disks in RAID-5 configuration and running Debian GNU/Linux

21 Performance in a retrieval system : The primary interest in the new methods is a representation for the inverted lists required by text retrieval systems. The fast performance of Google, for e.g., is in part a consequence of the fact that the inverted lists are stored compressed, and yet can still be processed quickly. To measure the usefulness of the new methods in realistic setting, they built a retrieval system for a collection of 18GB of text containing 1.2 million documents taken from the.gov domain as part of TREC project run by NIST

22 Performance in a retrieval system (contd ): Several different compression methods were used to represent the d-gaps. The actual index structure used in this retrieval system differs slightly. This variation makes the average d-gap larger and favors the byte-aligned code. The retrieval system implements a ranking heuristic that scores each document in the collection and returns the identifiers of the top 1000 documents according to that score. To test querying speed, a set of random queries were used, each drawn from a document in the collection

23 Performance in a retrieval system (contd ): Fiqure 3 : Tradeoffs in effectiveness, measured as index size for an 18 GB document collection, and querying throughput, measured as queries per second on a dual 2.8 Ghz Intel Xeon 2 GB of RAM, twelve 146 GB SCSI disks in RAID- 5 configuration and running Debian GNU/Linux. The 10,000 queries were generated randomly from the collection, and averaged three terms each

24 Conclusion: Two new binary based unparameterized codes are introduced that are applicable for decreasing probability distribution This codes give excellent compression and because they are sensitive to localized probability changes, are capable of outperforming entropy coders The also provide fast compression Software implementing RBUC-B method is available at:

25 references: Binary Codes for Non-Uniform Sources

### Chap 3 Huffman Coding

Chap 3 Huffman Coding 3.1 Overview 3.2 The Huffman Coding Algorithm 3.4 Adaptive Huffman Coding 3.5 Golomb Codes 3.6 Rice Codes 3.7 Tunstall Codes 3.8 Applications of Huffman Coding 1 3.2 The Huffman Coding

### Compression for IR. Lecture 5. Lecture 5 Information Retrieval 1

Compression for IR Lecture 5 Lecture 5 Information Retrieval 1 IR System Layout Lexicon (w, *occ) Occurrences (d, f t,d ) Locations (d, *pos) Documents 2 Why Use Compression? More storage inverted file

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 14 Non-Binary Huffman Code and Other Codes In the last class we

### Inverted Indexes Compressed Inverted Indexes. Indexing and Searching, Modern Information Retrieval, Addison Wesley, 2010 p. 40

Inverted Indexes Compressed Inverted Indexes Indexing and Searching, Modern Information Retrieval, Addison Wesley, 2010 p. 40 Compressed Inverted Indexes It is possible to combine index compression and

### Index Compression using Fixed Binary Codewords

Index Compression using Fixed Binary Codewords Vo Ngoc Anh Alistair Moffat Department of Computer Science and Software Engineering The University of Melbourne Victoria 3010, Australia http://www.cs.mu.oz.au/

### Huffman Coding. National Chiao Tung University Chun-Jen Tsai 10/2/2014

Huffman Coding National Chiao Tung University Chun-Jen Tsai 10/2/2014 Huffman Codes Optimum prefix code developed by D. Huffman in a class assignment Construction of Huffman codes is based on two ideas:

### Multimedia Communications. Huffman Coding

Multimedia Communications Huffman Coding Optimal codes Suppose that a i -> w i C + is an encoding scheme for a source alphabet A={a 1,, a N }. Suppose that the source letter a 1,, a N occur with relative

### Almost every lossy compression system contains a lossless compression system

Lossless compression in lossy compression systems Almost every lossy compression system contains a lossless compression system Lossy compression system Transform Quantizer Lossless Encoder Lossless Decoder

### Data Compression Techniques. Lecture 4: Integer Codes II

University of Helsinki Department of Computer Science 582487 Data Compression Techniques Lecture 4: Integer Codes II Simon J. Puglisi (puglisi@cs.helsinki.fi) Books (again) Remember how Google works?

### and the bitplane representing the second most significant bits is and the bitplane representing the least significant bits is

1 7. BITPLANE GENERATION The integer wavelet transform (IWT) is transformed into bitplanes before coding. The sign bitplane is generated based on the polarity of elements of the transform. The absolute

### Inverted Files for Text Search Engines

Table of Contents Inverted Files for Text Search Engines...2 JUSTIN ZOBEL And ALISTAIR MOFFAT...2 by M. Levent KOC...2 Sample Collection for examples:...2 Collection with only case-folding...2 Collection

### 5.4. HUFFMAN CODES 71

5.4. HUFFMAN CODES 7 Corollary 28 Consider a coding from a length n vector of source symbols, x = (x... x n ), to a binary codeword of length l(x). Then the average codeword length per source symbol for

### Information Retrieval

Information Retrieval ETH Zürich, Fall 2012 Thomas Hofmann LECTURE 4 INDEX COMPRESSION 10.10.2012 Information Retrieval, ETHZ 2012 1 Overview 1. Dictionary Compression 2. Zipf s Law 3. Posting List Compression

Adaptive Huffman coding If we want to code a sequence from an unknown source using Huffman coding, we need to know the probabilities of the different symbols. Most straightforward is to make two passes

### Techniques Covered in this Class. Inverted List Compression Techniques. Recap: Search Engine Query Processing. Recap: Search Engine Query Processing

Recap: Search Engine Query Processing Basically, to process a query we need to traverse the inverted lists of the query terms Lists are very long and are stored on disks Challenge: traverse lists as quickly

### CSC 310: Information Theory

CSC 310: Information Theory University of Toronto, Fall 2011 Instructor: Radford M. Neal Week 2 What s Needed for a Theory of (Lossless) Data Compression? A context for the problem. What are we trying

### Postings Lists - Reminder

Introduction to Search Engine Technology Index Compression Ronny Lempel Yahoo! Labs, Haifa (Some of the following slides are courtesy of Aya Soffer and David Carmel, IBM Haifa Research Lab) Postings Lists

### character E T A S R I O D frequency

Data Compression Data compression is any process by which a digital (e.g. electronic) file may be transformed to another ( compressed ) file, such that the original file may be fully recovered from the

### Arithmetic Coding: Introduction

Data Compression Arithmetic coding Arithmetic Coding: Introduction Allows using fractional parts of bits!! Used in PPM, JPEG/MPEG (as option), Bzip More time costly than Huffman, but integer implementation

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

### Information, Entropy, and Coding

Chapter 8 Information, Entropy, and Coding 8. The Need for Data Compression To motivate the material in this chapter, we first consider various data sources and some estimates for the amount of data associated

### Signal Compression Survey of the lectures Hints for exam

Signal Compression Survey of the lectures Hints for exam Chapter 1 Use one statement to define the three basic signal compression problems. Answer: (1) designing a good code for an independent source;

### Information Retrieval

Information Retrieval Indexes All slides unless specifically mentioned are Addison Wesley, 2008; Anton Leuski & Donald Metzler 2012 1 Search Process Information need query text objects representation representation

### Shannon and Huffman-Type Coders

Shannon and Huffman-Type Coders A useful class of coders that satisfy the Kraft's inequality in an efficient manner are called Huffman-type coders. To understand the philosophy of obtaining these codes,

### Image compression. Stefano Ferrari. Università degli Studi di Milano Elaborazione delle immagini (Image processing I)

Image compression Stefano Ferrari Università degli Studi di Milano stefano.ferrari@unimi.it Elaborazione delle immagini (Image processing I) academic year 2011 2012 Data and information The representation

### Name: Shu Xiong ID:

Homework #1 Report Multimedia Data Compression EE669 2013Spring Name: Shu Xiong ID: 3432757160 Email: shuxiong@usc.edu Content Problem1: Writing Questions... 2 Huffman Coding... 2 Lempel-Ziv Coding...

### Unaligned Binary Codes for Index Compression in Schema-Independent Text Retrieval Systems

Unaligned Binary Codes for Index Compression in Schema-Independent Text Retrieval Systems Stefan Büttcher Charles L. A. Clarke School of Computer Science University of Waterloo, Canada {sbuettch,claclark@plg.uwaterloo.ca

### International Journal of Advanced Research in Computer Science and Software Engineering

Volume 3, Issue 7, July 23 ISSN: 2277 28X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Greedy Algorithm:

### 2. Compressing data to reduce the amount of transmitted data (e.g., to save money).

Presentation Layer The presentation layer is concerned with preserving the meaning of information sent across a network. The presentation layer may represent (encode) the data in various ways (e.g., data

### Why is the number of 123- and 132-avoiding permutations equal to the number of binary trees?

Why is the number of - and -avoiding permutations equal to the number of binary trees? What are restricted permutations? We shall deal with permutations avoiding some specific patterns. For us, a permutation

### Lec 03 Entropy and Coding II Hoffman and Golomb Coding

Outline CS/EE 559 / ENG 4 Special Topics (Class Ids: 784, 785, 783) Lecture ReCap Hoffman Coding Golomb Coding and JPEG Lossless Coding Lec 3 Entropy and Coding II Hoffman and Golomb Coding Zhu Li Z. Li

### Image Compression through DCT and Huffman Coding Technique

International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

### Bits, Bytes, and Codes

Bits, Bytes, and Codes Telecommunications 1 Peter Mathys Black and White Image Suppose we want to draw a B&W image on a computer screen. We first subdivide the screen into small rectangles or squares called

### Symbol Tables. IE 496 Lecture 13

Symbol Tables IE 496 Lecture 13 Reading for This Lecture Horowitz and Sahni, Chapter 2 Symbol Tables and Dictionaries A symbol table is a data structure for storing a list of items, each with a key and

### Gambling and Data Compression

Gambling and Data Compression Gambling. Horse Race Definition The wealth relative S(X) = b(x)o(x) is the factor by which the gambler s wealth grows if horse X wins the race, where b(x) is the fraction

### Compressing the Digital Library

Compressing the Digital Library Timothy C. Bell 1, Alistair Moffat 2, and Ian H. Witten 3 1 Department of Computer Science, University of Canterbury, New Zealand, tim@cosc.canterbury.ac.nz 2 Department

### Catch Me If You Can: A Practical Framework to Evade Censorship in Information-Centric Networks

Catch Me If You Can: A Practical Framework to Evade Censorship in Information-Centric Networks Reza Tourani, Satyajayant (Jay) Misra, Joerg Kliewer, Scott Ortegel, Travis Mick Computer Science Department

### PUBMED: an efficient biomedical based hierarchical search engine ABSTRACT:

PUBMED: an efficient biomedical based hierarchical search engine ABSTRACT: Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is

### Image Compression. Topics

Image Compression October 2010 Topics Redundancy Image information Fidelity Huffman coding Arithmetic coding Golomb code LZW coding Run Length Encoding Bit plane coding 1 Why do we need compression? Data

### Fast Sequential Summation Algorithms Using Augmented Data Structures

Fast Sequential Summation Algorithms Using Augmented Data Structures Vadim Stadnik vadim.stadnik@gmail.com Abstract This paper provides an introduction to the design of augmented data structures that offer

### Efficient Retrieval of Partial Documents

Efficient Retrieval of Partial Documents Justin Zobel Alistair Moffat Ross Wilkinson Ron Sacks-Davis June 1994 Abstract Management and retrieval of large volumes of text can be expensive in both space

### 6.1 General-Purpose Data Compression

6 Index Compression An inverted index for a given text collection can be quite large, especially if it contains full positional information about all term occurrences in the collection. In a typical collection

### Compression techniques

Compression techniques David Bařina February 22, 2013 David Bařina Compression techniques February 22, 2013 1 / 37 Contents 1 Terminology 2 Simple techniques 3 Entropy coding 4 Dictionary methods 5 Conclusion

### Self-Indexing Inverted Files for Fast Text Retrieval

Self-Indexing Inverted Files for Fast Text Retrieval Alistair Moffat Justin Zobel February 1994 Abstract Query processing costs on large text databases are dominated by the need to retrieve and scan the

### Problem Set I Lossless Coding

Problem Set I Lossless Coding Christopher Tsai Problem #2 Markov- Source Model The two-dimensional entropy rate function, is, Problem 2[ii] - Entropy Rate Function of a Markov- Source.9 Entropy H(α, β).9.7.3..7.3

### Compressing Integers for Fast File Access

Compressing Integers for Fast File Access Hugh E. Williams and Justin Zobel Department of Computer Science, RMIT University GPO Box 2476V, Melbourne 3001, Australia Email: {hugh,jz}@cs.rmit.edu.au Fast

### Chapter 3: Digital Audio Processing and Data Compression

Chapter 3: Digital Audio Processing and Review of number system 2 s complement sign and magnitude binary The MSB of a data word is reserved as a sign bit, 0 is positive, 1 is negative. The rest of the

### Index and document compression. IN4325 Information Retrieval

Index and document compression IN4325 Information Retrieval 1 Last time High-level view on (Basic, positional) inverted index Biword index Hashes versus search trees for vocabulary lookup Index structures

### Binary search tree with SIMD bandwidth optimization using SSE

Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous

### ITEC310 Computer Networks II

ITEC310 Computer Networks II Chapter 25 Domain Name System Department of Information Technology Eastern Mediterranean University Objectives 2/56 After completing this chapter you should be able to do the

### The relationship of a trees to a graph is very important in solving many problems in Maths and Computer Science

Trees Mathematically speaking trees are a special class of a graph. The relationship of a trees to a graph is very important in solving many problems in Maths and Computer Science However, in computer

### Error Detection and Correction: Parity Check Code; Bounds Based on Hamming Distance

Error Detection and Correction: Parity Check Code; Bounds Based on Hamming Distance Greg Plaxton Theory in Programming Practice, Fall 2005 Department of Computer Science University of Texas at Austin Error

### Saranya G., M.E. Loganya R., M.E. Student ==================================================================

================================================================== ================================================================== Amended Golomb Coding (AGC) Implementation for Reconfigurable Devices

### Exercises with solutions (1)

Exercises with solutions (). Investigate the relationship between independence and correlation. (a) Two random variables X and Y are said to be correlated if and only if their covariance C XY is not equal

### Physical Data Organization

Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

### Probability Interval Partitioning Entropy Codes

SUBMITTED TO IEEE TRANSACTIONS ON INFORMATION THEORY 1 Probability Interval Partitioning Entropy Codes Detlev Marpe, Senior Member, IEEE, Heiko Schwarz, and Thomas Wiegand, Senior Member, IEEE Abstract

### 4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we

### Lecture 17: Huffman Coding

Lecture 17: Huffman Coding CLRS- 16.3 Outline of this Lecture Codes and Compression. Huffman coding. Correctness of the Huffman coding algorithm. 1 Suppose that we have a 100, 000 character data file that

### Class Notes CS 3137. 1 Creating and Using a Huffman Code. Ref: Weiss, page 433

Class Notes CS 3137 1 Creating and Using a Huffman Code. Ref: Weiss, page 433 1. FIXED LENGTH CODES: Codes are used to transmit characters over data links. You are probably aware of the ASCII code, a fixed-length

### Optimizing CPU Scheduling Problem using Genetic Algorithms

Optimizing CPU Scheduling Problem using Genetic Algorithms Anu Taneja Amit Kumar Computer Science Department Hindu College of Engineering, Sonepat (MDU) anutaneja16@gmail.com amitkumar.cs08@pec.edu.in

### , each of which contains a unique key value, say k i , R 2. such that k i equals K (or to determine that no such record exists in the collection).

The Search Problem 1 Suppose we have a collection of records, say R 1, R 2,, R N, each of which contains a unique key value, say k i. Given a particular key value, K, the search problem is to locate the

### Development and Evaluation of Point Cloud Compression for the Point Cloud Library

Development and Evaluation of Point Cloud Compression for the Institute for Media Technology, TUM, Germany May 12, 2011 Motivation Point Cloud Stream Compression Network Point Cloud Stream Decompression

### A Catalogue of the Steiner Triple Systems of Order 19

A Catalogue of the Steiner Triple Systems of Order 19 Petteri Kaski 1, Patric R. J. Östergård 2, Olli Pottonen 2, and Lasse Kiviluoto 3 1 Helsinki Institute for Information Technology HIIT University of

### CS383, Algorithms Notes on Lossless Data Compression and Huffman Coding

Prof. Sergio A. Alvarez http://www.cs.bc.edu/ alvarez/ Fulton Hall 410 B alvarez@cs.bc.edu Computer Science Department voice: (617) 552-4333 Boston College fax: (617) 552-6790 Chestnut Hill, MA 02467 USA

LZ77 The original LZ77 algorithm works as follows: A phrase T j starting at a position i is encoded as a triple of the form distance, length, symbol. A triple d, l, s means that: T j = T [i...i + l] =

### Reading.. IMAGE COMPRESSION- I IMAGE COMPRESSION. Image compression. Data Redundancy. Lossy vs Lossless Compression. Chapter 8.

Reading.. IMAGE COMPRESSION- I Week VIII Feb 25 Chapter 8 Sections 8.1, 8.2 8.3 (selected topics) 8.4 (Huffman, run-length, loss-less predictive) 8.5 (lossy predictive, transform coding basics) 8.6 Image

### Sophomore College Mathematics of the Information Age Entropy

Sophomore College Mathematics of the Information Age Entropy Let s recall the fundamental notions of information and entropy. To repeat from the first class, Shannon s emphasis is on selecting a given

### Full and Complete Binary Trees

Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

### Ignify ecommerce. Item Requirements Notes

wwwignifycom Tel (888) IGNIFY5 sales@ignifycom Fax (408) 516-9006 Ignify ecommerce Server Configuration 1 Hardware Requirement (Minimum configuration) Item Requirements Notes Operating System Processor

### On the Determination of Optimal Parameterized Prefix Codes for Adaptive Entropy Coding

On the Determination of Optimal Parameterized Prefix Codes for Adaptive Entropy Coding Amir Said Media Technologies Laboratory HP Laboratories Palo Alto HPL-2006-74 April 28, 2006* entropy coding, Golomb-Rice

### Principles of Image Compression

Principles of Image Compression Catania 03/04/2008 Arcangelo Bruna Overview Image Compression is the Image Data Elaboration branch dedicated to the image data representation It analyzes the techniques

### Vector storage and access; algorithms in GIS. This is lecture 6

Vector storage and access; algorithms in GIS This is lecture 6 Vector data storage and access Vectors are built from points, line and areas. (x,y) Surface: (x,y,z) Vector data access Access to vector

### Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object

### KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS

ABSTRACT KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS In many real applications, RDF (Resource Description Framework) has been widely used as a W3C standard to describe data in the Semantic Web. In practice,

### Succinct Data Structures for Data Mining

Succinct Data Structures for Data Mining Rajeev Raman University of Leicester ALSIP 2014, Tainan Overview Introduction Compressed Data Structuring Data Structures Applications Libraries End Big Data vs.

### Trees and structural induction

Trees and structural induction Margaret M. Fleck 25 October 2010 These notes cover trees, tree induction, and structural induction. (Sections 10.1, 4.3 of Rosen.) 1 Why trees? Computer scientists are obsessed

### Multi-Way Search Trees (B Trees)

Multi-Way Search Trees (B Trees) Multiway Search Trees An m-way search tree is a tree in which, for some integer m called the order of the tree, each node has at most m children. If n

### Transparent D Flip-Flop

Transparent Flip-Flop The RS flip-flop forms the basis of a number of 1-bit storage devices in digital electronics. ne such device is shown in the figure, where extra combinational logic converts the input

### Recurrence Relations. Niloufar Shafiei

Recurrence Relations Niloufar Shafiei Review A recursive definition of a sequence specifies one or more initial terms a rule for determining subsequent terms from those that precede them. Example: a 0

### Binary Hamming Codes

Coding Theory Massoud Malek Binary Hamming Codes Hamming codes were discovered by R.W. Hamming and M. J. E. Golay. They form an important class of codes they have interesting properties and are easy to

### Storage Management for Files of Dynamic Records

Storage Management for Files of Dynamic Records Justin Zobel Department of Computer Science, RMIT, GPO Box 2476V, Melbourne 3001, Australia. jz@cs.rmit.edu.au Alistair Moffat Department of Computer Science

### Relative Data Redundancy

Image Compression Relative Data Redundancy Let b and b denote the number of bits in two representations of the same information, the relative data redundancy R is R = 1-1/C C is called the compression

### DATABASE SERVER CONFIGURATION

DATABASE SERVER CONFIGURATION RECOMMENDED SERVER HARDWARE CONFIGURATION Processor Speed/RAM 100 users 250 users 500 users > 500 users Intel Pentium 4 Processor 630 at 3.0GHz / Intel Xeon Processor at 2.8GHz

### Creating Synthetic Temporal Document Collections for Web Archive Benchmarking

Creating Synthetic Temporal Document Collections for Web Archive Benchmarking Kjetil Nørvåg and Albert Overskeid Nybø Norwegian University of Science and Technology 7491 Trondheim, Norway Abstract. In

### Today. Data Representation. Break: Binary Numbers. Representing Text. Representing Images. 18:30(ish) 15 Minute Break

Introduction Today Data Representation Binary Numbers Representing Text Representing Images Break: 18:30(ish) 15 Minute Break Why Data Representation? GCSE syllabus has an entire Module on data representation.

### Timing channels: More than just payload in IoT

Timing channels: More than just payload in IoT Guido Carlo Ferrante ] Tony Q. S. Quek Moe Z. Win ] September 14, 2016 ] G. C. Ferrante (SUTD & MIT) Tyrrhenian Workshop, 2016 September 14, 2016 1 / 25 Introduction

### Efficient visual search of local features. Cordelia Schmid

Efficient visual search of local features Cordelia Schmid Visual search change in viewing angle Matches 22 correct matches Image search system for large datasets Large image dataset (one million images

### BINARY SEARCH TREE PERFORMANCE

BINARY SEARCH TREE PERFORMANCE Operation Best Time Average Time Worst Time (on a tree of n nodes) Find Insert Delete O(lg n)?? O(lg n)?? O(n) Fastest Running Time The find, insert and delete algorithms

### Foundation of Video Coding Part I: Overview and Binary Encoding

Foundation of Video Coding Part I: Overview and Binary Encoding Yao Wang Polytechnic University, Brooklyn, NY11201 http://eeweb.poly.edu Based on: Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing

### Lab Session 4. Review. Outline. May 18, Image and video encoding: A big picture

Outline Lab Session 4 May 18, 2009 Review Manual Exercises Comparing coding performance of different codes: Shannon code, Shannon-Fano code, Huffman code (and Tunstall code *) MATLAB Exercises Working

### Ex. 2.1 (Davide Basilio Bartolini)

ECE 54: Elements of Information Theory, Fall 00 Homework Solutions Ex.. (Davide Basilio Bartolini) Text Coin Flips. A fair coin is flipped until the first head occurs. Let X denote the number of flips

### Modified Golomb-Rice Codes for Lossless Compression of Medical Images

Modified Golomb-Rice Codes for Lossless Compression of Medical Images Roman Starosolski (1), Władysław Skarbek (2) (1) Silesian University of Technology (2) Warsaw University of Technology Abstract Lossless

### DOUBLE COMPRESSION OF TEST DATA USING HUFFMAN CODE

DOUBLE COMPRESSION OF TEST DATA USING HUFFMAN CODE 1 R.S.AARTHI, 2 D. MURALIDHARAN, 3 P. SWAMINATHAN 1 School of Computing, SASTRA University, Thanjavur, Tamil Nadu, India 2 Assistant professor, School

### EP A1 (19) (11) EP A1. (12) EUROPEAN PATENT APPLICATION published in accordance with Art. 158(3) EPC

(19) (12) EUROPEAN PATENT APPLICATION published in accordance with Art. 8(3) EPC (11) EP 1 962 14 A1 (43) Date of publication: 27.08.08 Bulletin 08/3 (21) Application number: 06828244.1 (22) Date of filing:

### ECEN 5682 Theory and Practice of Error Control Codes

ECEN 5682 Theory and Practice of Error Control Codes Convolutional Codes University of Colorado Spring 2007 Linear (n, k) block codes take k data symbols at a time and encode them into n code symbols.

### SIDN Server Measurements

SIDN Server Measurements Yuri Schaeffer 1, NLnet Labs NLnet Labs document 2010-003 July 19, 2010 1 Introduction For future capacity planning SIDN would like to have an insight on the required resources

### Terminal Server Software and Hardware Requirements. Terminal Server. Software and Hardware Requirements. Datacolor Match Pigment Datacolor Tools

Terminal Server Software and Hardware Requirements Datacolor Match Pigment Datacolor Tools January 21, 2011 Page 1 of 8 Introduction This document will provide preliminary information about the both the