More Bits and Bytes Huffman Coding. Natural Language and Dialogue Systems Lab

Size: px
Start display at page:

Download "More Bits and Bytes Huffman Coding. Natural Language and Dialogue Systems Lab"

Transcription

1 More Bits and Bytes Huffman Coding Natural Language and Dialogue Systems Lab

2 Announcements Privacy essay deadline extended to 5PM Saturday To be assigned a week from today, a one week creativity pair programming assignment. FIND A PARTNER for your PAIR Winter2//pages/syllabus 3 homeworks this week: 2 processing homeworks, one easy (Thurs), one harder (Tues), One principles homework, due next Thursday. Ready today. Practice kinds of problems that will be on the midterm/quiz Next Tuesday: Social Media guest lecture There will be midterm questions on that lecture

3 Announcements (Repeat) If you have a question on your homework please ask Gabby or Chao if you want a v. timely response Ask the person whose section you signed up for: Gabby T/Th ghalberg@soe.ucsc.edu Chao M/W zhu@soe.ucsc.edu Questions about grading: ask Gabby but CC me.

4 Next Processing Homework: DUE THURS! Write your own code from scratch, not just change existing code. But lots of hints and instructions in the homework PDF

5 Brief Review: How do Computers Compute? Natural Language and Dialogue Systems Lab

6 MP Neuron for LOGICAL AND X + Y + =2 IS *X + *Y 2?? X AND Y

7 Fundamental units of computers: Logic Gates

8 We can read it off the truth Table for XOR P Q P xor Q

9 Logical XOR Neuron exists? How long do we keep looking for a solution? We need to be able to calculate the weights, not just keep looking for the answer by trial and error. Each possible pair of inputs corresponds to an equation (a linear inequality) for the output in terms of the inputs, the weights and the threshhold. E.g. for AND it was IS *X + *Y 2?? These can be used to compute the weights and thresholds. Equations for XOR are incompatible => MP Neuron for XOR can t be built.

10 What would you ever want XOR for anyway? Binary Addition (will do some of this after the MIDTERM)

11 Binary combinations, True/False possibilities One bit Three bits Four bits?????? Two Bits??????????

12 ASCII C A T /25/2 2 Lawrence Snyder, CSE

13 With 8 bits how many different letters? 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 => 2 8 = 256

14 How do Computers represent data? Natural Language and Dialogue Systems Lab

15 Encoding Text: How is it done? ASCII, UTF, Huffman algorithm Natural Language and Dialogue Systems Lab

16 ASCII C A T /26/2 2 Lawrence Snyder, CSE

17 With 8 places how many different letters?

18 UTF-8: All the alphabets in the world Uniform Transformation Format: a variable- width encoding that can represent every character in the Unicode Character set,2,64 of them!!! 8 UTF- 8 is the dominant character encoding for the World- Wide Web, accounting for more than half of all Web pages. The Internet Engineering Task Force (IETF) requires all Internet protocols to identify the encoding used for character data The supported character encodings must include UTF- 8.

19 UTF is a VARIABLE LENGTH ALPHABET CODING Remember ASCII can only represent 256 characters (8 bits) UTF encodes over M Why would you want a variable length coding scheme?

20 Cyrillic vs. English

21 Huffman Coding: Natural Language and Dialogue Systems Lab

22 Coding can be used to do Compression What is CODING? The conversion of one representation into another What is COMPRESSION? Change the representation (digitization) in order to reduce size of data (number of bits needed to represent data) Benefits Reduce storage needed Consider growth of digitized data. Reduce transmission cost / latency / bandwidth When you have a 56K dialup modem, every savings in BITS counts, SPEED Also consider telephone lines, texting

23 What makes it possible to do Compression? IN OTHER WORDS: When is Coding USEFUL? When there is Redundancy Recognize repeating patterns Exploit using Dictionary Variable length encoding When Human perception is less sensitive to some information Can discard less important data

24 How easy is it to do it? Depends on data Random data hard Example:? Organized data easy Example: WHAT DOES THAT MEAN? There is NO universally best compression algorithm It depends on how tuned the coding is to the data you have

25 Can you lose information with Compression? Lossless Compression is not guaranteed Pigeonhole principle Reduce size bit can only store ½ of data Example,,,,,,,,,, CONSIDER THE ALTERNATIVE IF LOSSLESS COMPRESSION WERE GUARANTEED THEN Compress file (reduce size by bit) Recompress output Repeat (until we can store data with bits) OBVIOUS CONTRADICTION => IT IS NOT GUARANTEED.

26 Huffman Code: A Lossless Compression Use Variable Length codes based on frequency (like UTF does) Approach Exploit statistical frequency of symbols What do I MEAN by that? WE COUNT!!! HELPS when the frequency for different symbols varies widely Principle Use fewer bits to represent frequent symbols Use more bits to represent infrequent symbols A A B A A A B A

27 Huffman Code Example dog cat cat bird bird bird bird fish Symbol Dog Cat Bird Fish Frequency /8 /4 /2 /8 Original Encoding Huffman Encoding 2 bits 2 bits 2 bits 2 bits 3 bits 2 bits bit 3 bits Expected size Original /8 2 + /4 2 + /2 2 + /8 2 = 2 bits / symbol Huffman /8 3 + /4 2 + /2 + /8 3 =.75 bits / symbol DOES EVERYONE SEE THIS?? ASK ME A QUESTION.

28 Huffman Code Algorithm: Data Structures Binary (Huffman) tree Represents Huffman code Edge code ( or ) Leaf symbol Path to leaf encoding Example A =, H =, C = Good when??? A, H less frequent than C in messages Want to efficiently build a binary tree Also showed you binary tree for ASCII A H C

29 Huffman Code Algorithm Overview Order the symbols with least frequent first (will explain) Build a tree piece by piece Encoding Calculate frequency of symbols in the message, language JUST COUNT AND DIVIDE BY TOTAL NUMBER OF SYMBOLS Create binary tree representing best encoding Use binary tree to encode compressed file For each symbol, output path from root to leaf Size of encoding = length of path Save binary tree

30 Huffman Code Creating Tree Algorithm (Recipe) Place each symbol in leaf Weight of leaf = symbol frequency Select two trees L and R (initially leafs) Such that L, R have lowest frequencies in tree Which L, R have the lowest number of occurrences in the message? Create new (internal) node Left child L Right child R New frequency frequency( L ) + frequency( R ) Repeat until all nodes merged into one tree

31 Huffman Tree Construction A C E H I

32 Huffman Tree Step 2: can first re-order by frequency A H C E I

33 Huffman Tree Construction 3 A H E I C 5 5

34 Huffman Tree Construction 4 A H E I C 5 5

35 Huffman Tree Construction 5 A 3 5 H 2 C E I 7 E = I = C = A = H = 25

36 Huffman Coding Example Huffman code Input ACE Output ()()() = E = I = C = A = H =

37 Huffman Code Algorithm Overview Decoding Read compressed file & binary tree Use binary tree to decode file Follow path from root to leaf

38 Huffman Decoding A 3 H 2 C E I

39 Huffman Decoding 2 A 3 H 2 C E I

40 Huffman Decoding 3 A 3 H 2 C E I A 5 25

41 Huffman Decoding 4 A 3 H 2 C E I A 5 25

42 Huffman Decoding 5 A 3 H 2 C E I AC 5 25

43 Huffman Decoding 6 A 3 H 2 C E I AC 5 25

44 Huffman Decoding 7 A 3 H 2 C E I ACE 5 25

45 Huffman Code Properties Prefix code No code is a prefix of another code Example Huffman( dog ) Huffman( cat ) // not legal prefix code Can stop as soon as complete code found No need for end-of-code marker Nondeterministic Multiple Huffman coding possible for same input If more than two trees with same minimal weight

46 Huffman Code Properties Greedy algorithm Chooses best local solution at each step Combines 2 trees with lowest frequency Still yields overall best solution Optimal prefix code Based on statistical frequency Better compression possible (depends on data) Using other approaches (e.g., pattern dictionary)

47 Huffman Coding. Another example. Natural Language and Dialogue Systems Lab

48 Huffman Tree Example 2. Step T O B E R TO BE OR NOT TO BE T = 3 O = 4 B = 2 E = 2 R =

49 Huffman Tree: TO BE OR NOT TO BE E R B T O

50 Huffman Tree 3: TO BE OR NOT TO BE E R T O B 3 2 5

51 Huffman Tree 4: TO BE OR NOT TO BE E R T O B 2 7 5

52 Huffman Tree Construction 5 E 2 3 R B T O 4 E = R = B = T = O = 2 2 NUMBER OF LETTERS IN MESSAGE

53 Huffman Tree 5: TO BE E 2 3 R B T O 4 E = R = B = T = O = 2...N..

54 How Much For A Fixed Length Code? E 2 3 R B T O 4 E = R = B = T = O = 2...N.. 28 BITS HERE

55 Huffman Code Algorithm Overview Decoding Read compressed file & binary tree Use binary tree to decode file Follow path from root to leaf FIRST EXAMPLE. ACE

56 Huffman Decoding A 3 H 2 C E I

57 Huffman Decoding 2 A 3 H 2 C E I

58 Huffman Decoding 3 A 3 H 2 C E I A 5 25

59 Huffman Decoding 4 A 3 H 2 C E I A 5 25

60 Huffman Decoding 5 A 3 H 2 C E I AC 5 25

61 Huffman Decoding 6 A 3 H 2 C E I AC 5 25

62 Huffman Decoding 7 A 3 H 2 C E I ACE 5 25

63 DECODING: 2 nd example BEBE E 2 3 R B T O 4 E = R = B = T = O = 2 BEBE = ORIGINAL 6 BITS FOR BEBE

64 Huffman Code Properties Prefix code No code is a prefix of another code Example Huffman( dog ) Huffman( cat ) // not legal prefix code Can stop as soon as complete code found No need for end-of-code marker Nondeterministic Multiple Huffman coding possible for same input If more than two trees with same minimal weight

65 Huffman Code Properties Greedy algorithm Chooses best local solution at each step Combines 2 trees with lowest frequency Still yields overall best solution Optimal prefix code Based on statistical frequency Better compression possible (depends on data) But needs look ahead, not prefix.

66 Encoding Information: There s more! Bits and bytes encode the information, but that s not all Tags encode format and some structure in word processors Tags encode format and some structure in HTML Tags are one form of meta- data Meta- data is information about information We will return to this when we talk about HTML and the WEB in Week 7.

encoding compression encryption

encoding compression encryption encoding compression encryption ASCII utf-8 utf-16 zip mpeg jpeg AES RSA diffie-hellman Expressing characters... ASCII and Unicode, conventions of how characters are expressed in bits. ASCII (7 bits) -

More information

THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM

THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM Iuon Chang Lin Department of Management Information Systems, National Chung Hsing University, Taiwan, Department of Photonics and Communication Engineering,

More information

Class Notes CS 3137. 1 Creating and Using a Huffman Code. Ref: Weiss, page 433

Class Notes CS 3137. 1 Creating and Using a Huffman Code. Ref: Weiss, page 433 Class Notes CS 3137 1 Creating and Using a Huffman Code. Ref: Weiss, page 433 1. FIXED LENGTH CODES: Codes are used to transmit characters over data links. You are probably aware of the ASCII code, a fixed-length

More information

Image Compression through DCT and Huffman Coding Technique

Image Compression through DCT and Huffman Coding Technique International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

More information

Today s topics. Digital Computers. More on binary. Binary Digits (Bits)

Today s topics. Digital Computers. More on binary. Binary Digits (Bits) Today s topics! Binary Numbers! Brookshear.-.! Slides from Prof. Marti Hearst of UC Berkeley SIMS! Upcoming! Networks Interactive Introduction to Graph Theory http://www.utm.edu/cgi-bin/caldwell/tutor/departments/math/graph/intro

More information

Information, Entropy, and Coding

Information, Entropy, and Coding Chapter 8 Information, Entropy, and Coding 8. The Need for Data Compression To motivate the material in this chapter, we first consider various data sources and some estimates for the amount of data associated

More information

Third Southern African Regional ACM Collegiate Programming Competition. Sponsored by IBM. Problem Set

Third Southern African Regional ACM Collegiate Programming Competition. Sponsored by IBM. Problem Set Problem Set Problem 1 Red Balloon Stockbroker Grapevine Stockbrokers are known to overreact to rumours. You have been contracted to develop a method of spreading disinformation amongst the stockbrokers

More information

International Journal of Advanced Research in Computer Science and Software Engineering

International Journal of Advanced Research in Computer Science and Software Engineering Volume 3, Issue 7, July 23 ISSN: 2277 28X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Greedy Algorithm:

More information

Binary Trees and Huffman Encoding Binary Search Trees

Binary Trees and Huffman Encoding Binary Search Trees Binary Trees and Huffman Encoding Binary Search Trees Computer Science E119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Motivation: Maintaining a Sorted Collection of Data A data dictionary

More information

Arithmetic Coding: Introduction

Arithmetic Coding: Introduction Data Compression Arithmetic coding Arithmetic Coding: Introduction Allows using fractional parts of bits!! Used in PPM, JPEG/MPEG (as option), Bzip More time costly than Huffman, but integer implementation

More information

Reading.. IMAGE COMPRESSION- I IMAGE COMPRESSION. Image compression. Data Redundancy. Lossy vs Lossless Compression. Chapter 8.

Reading.. IMAGE COMPRESSION- I IMAGE COMPRESSION. Image compression. Data Redundancy. Lossy vs Lossless Compression. Chapter 8. Reading.. IMAGE COMPRESSION- I Week VIII Feb 25 Chapter 8 Sections 8.1, 8.2 8.3 (selected topics) 8.4 (Huffman, run-length, loss-less predictive) 8.5 (lossy predictive, transform coding basics) 8.6 Image

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

More information

Compression techniques

Compression techniques Compression techniques David Bařina February 22, 2013 David Bařina Compression techniques February 22, 2013 1 / 37 Contents 1 Terminology 2 Simple techniques 3 Entropy coding 4 Dictionary methods 5 Conclusion

More information

Chapter 4: Computer Codes

Chapter 4: Computer Codes Slide 1/30 Learning Objectives In this chapter you will learn about: Computer data Computer codes: representation of data in binary Most commonly used computer codes Collating sequence 36 Slide 2/30 Data

More information

HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2

HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2 HIGH DENSITY DATA STORAGE IN DNA USING AN EFFICIENT MESSAGE ENCODING SCHEME Rahul Vishwakarma 1 and Newsha Amiri 2 1 Tata Consultancy Services, India derahul@ieee.org 2 Bangalore University, India ABSTRACT

More information

Wan Accelerators: Optimizing Network Traffic with Compression. Bartosz Agas, Marvin Germar & Christopher Tran

Wan Accelerators: Optimizing Network Traffic with Compression. Bartosz Agas, Marvin Germar & Christopher Tran Wan Accelerators: Optimizing Network Traffic with Compression Bartosz Agas, Marvin Germar & Christopher Tran Introduction A WAN accelerator is an appliance that can maximize the services of a point-to-point(ptp)

More information

On the Use of Compression Algorithms for Network Traffic Classification

On the Use of Compression Algorithms for Network Traffic Classification On the Use of for Network Traffic Classification Christian CALLEGARI Department of Information Ingeneering University of Pisa 23 September 2008 COST-TMA Meeting Samos, Greece Outline Outline 1 Introduction

More information

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!

Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere! Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel

More information

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding C. SARAVANAN cs@cc.nitdgp.ac.in Assistant Professor, Computer Centre, National Institute of Technology, Durgapur,WestBengal,

More information

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to: Chapter 3 Data Storage Objectives After studying this chapter, students should be able to: List five different data types used in a computer. Describe how integers are stored in a computer. Describe how

More information

Cyber Security Workshop Encryption Reference Manual

Cyber Security Workshop Encryption Reference Manual Cyber Security Workshop Encryption Reference Manual May 2015 Basic Concepts in Encoding and Encryption Binary Encoding Examples Encryption Cipher Examples 1 P a g e Encoding Concepts Binary Encoding Basics

More information

The use of binary codes to represent characters

The use of binary codes to represent characters The use of binary codes to represent characters Teacher s Notes Lesson Plan x Length 60 mins Specification Link 2.1.4/hi Character Learning objective (a) Explain the use of binary codes to represent characters

More information

CHAPTER 2 LITERATURE REVIEW

CHAPTER 2 LITERATURE REVIEW 11 CHAPTER 2 LITERATURE REVIEW 2.1 INTRODUCTION Image compression is mainly used to reduce storage space, transmission time and bandwidth requirements. In the subsequent sections of this chapter, general

More information

CSE 326: Data Structures B-Trees and B+ Trees

CSE 326: Data Structures B-Trees and B+ Trees Announcements (4//08) CSE 26: Data Structures B-Trees and B+ Trees Brian Curless Spring 2008 Midterm on Friday Special office hour: 4:-5: Thursday in Jaech Gallery (6 th floor of CSE building) This is

More information

!"#$"%&' What is Multimedia?

!#$%&' What is Multimedia? What is Multimedia? %' A Big Umbrella Goal of This Course Understand various aspects of a modern multimedia pipeline Content creating, editing Distribution Search & mining Protection Hands-on experience

More information

Encoding Text with a Small Alphabet

Encoding Text with a Small Alphabet Chapter 2 Encoding Text with a Small Alphabet Given the nature of the Internet, we can break the process of understanding how information is transmitted into two components. First, we have to figure out

More information

Probability Interval Partitioning Entropy Codes

Probability Interval Partitioning Entropy Codes SUBMITTED TO IEEE TRANSACTIONS ON INFORMATION THEORY 1 Probability Interval Partitioning Entropy Codes Detlev Marpe, Senior Member, IEEE, Heiko Schwarz, and Thomas Wiegand, Senior Member, IEEE Abstract

More information

How to represent characters?

How to represent characters? Copyright Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See http://software-carpentry.org/license.html for more information. How to represent characters?

More information

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Data Storage 3.1. Foundations of Computer Science Cengage Learning 3 Data Storage 3.1 Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: List five different data types used in a computer. Describe how

More information

Lecture 18: Applications of Dynamic Programming Steven Skiena. Department of Computer Science State University of New York Stony Brook, NY 11794 4400

Lecture 18: Applications of Dynamic Programming Steven Skiena. Department of Computer Science State University of New York Stony Brook, NY 11794 4400 Lecture 18: Applications of Dynamic Programming Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena Problem of the Day

More information

Overview/Questions. What is Cryptography? The Caesar Shift Cipher. CS101 Lecture 21: Overview of Cryptography

Overview/Questions. What is Cryptography? The Caesar Shift Cipher. CS101 Lecture 21: Overview of Cryptography CS101 Lecture 21: Overview of Cryptography Codes and Ciphers Overview/Questions What is cryptography? What are the challenges of data encryption? What factors make an encryption strategy successful? What

More information

Preservation Handbook

Preservation Handbook Preservation Handbook [Binary Text / Word Processor Documents] Author Rowan Wilson and Martin Wynne Version Draft V3 Date 22 / 08 / 05 Change History Revised by MW 22.8.05; 2.12.05; 7.3.06 Page 1 of 7

More information

Analysis of Compression Algorithms for Program Data

Analysis of Compression Algorithms for Program Data Analysis of Compression Algorithms for Program Data Matthew Simpson, Clemson University with Dr. Rajeev Barua and Surupa Biswas, University of Maryland 12 August 3 Abstract Insufficient available memory

More information

Development and Evaluation of Point Cloud Compression for the Point Cloud Library

Development and Evaluation of Point Cloud Compression for the Point Cloud Library Development and Evaluation of Point Cloud Compression for the Institute for Media Technology, TUM, Germany May 12, 2011 Motivation Point Cloud Stream Compression Network Point Cloud Stream Decompression

More information

An Implementation of a High Capacity 2D Barcode

An Implementation of a High Capacity 2D Barcode An Implementation of a High Capacity 2D Barcode Puchong Subpratatsavee 1 and Pramote Kuacharoen 2 Department of Computer Science, Graduate School of Applied Statistics National Institute of Development

More information

Concept of Cache in web proxies

Concept of Cache in web proxies Concept of Cache in web proxies Chan Kit Wai and Somasundaram Meiyappan 1. Introduction Caching is an effective performance enhancing technique that has been used in computer systems for decades. However,

More information

CSE 473s Introduction to Computer Networks

CSE 473s Introduction to Computer Networks CSE 473s Introduction to Computer Networks Raj Jain Washington University in Saint Louis Saint Louis, MO 63130 Jain@wustl.edu Audio/Video recordings of this lecture are available on-line at: http://www.cse.wustl.edu/~jain/cse473-11/

More information

Section 1.4 Place Value Systems of Numeration in Other Bases

Section 1.4 Place Value Systems of Numeration in Other Bases Section.4 Place Value Systems of Numeration in Other Bases Other Bases The Hindu-Arabic system that is used in most of the world today is a positional value system with a base of ten. The simplest reason

More information

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics:

Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics: Voice Transmission --Basic Concepts-- Voice---is analog in character and moves in the form of waves. 3-important wave-characteristics: Amplitude Frequency Phase Voice Digitization in the POTS Traditional

More information

Solutions to Problem Set 1

Solutions to Problem Set 1 YALE UNIVERSITY DEPARTMENT OF COMPUTER SCIENCE CPSC 467b: Cryptography and Computer Security Handout #8 Zheng Ma February 21, 2005 Solutions to Problem Set 1 Problem 1: Cracking the Hill cipher Suppose

More information

Computer and Network Security

Computer and Network Security Computer and Network Security Dr. Arjan Durresi Louisiana State University Baton Rouge, LA 70810 Durresi@Csc.LSU.Edu These slides are available at: http://www.csc.lsu.edu/~durresi/csc4601_04/ Louisiana

More information

CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions CS 2112 Spring 2014 Assignment 3 Data Structures and Web Filtering Due: March 4, 2014 11:59 PM Implementing spam blacklists and web filters requires matching candidate domain names and URLs very rapidly

More information

Data Structures in Java. Session 15 Instructor: Bert Huang http://www1.cs.columbia.edu/~bert/courses/3134

Data Structures in Java. Session 15 Instructor: Bert Huang http://www1.cs.columbia.edu/~bert/courses/3134 Data Structures in Java Session 15 Instructor: Bert Huang http://www1.cs.columbia.edu/~bert/courses/3134 Announcements Homework 4 on website No class on Tuesday Midterm grades almost done Review Indexing

More information

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.

Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29. Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet

More information

Software Engineering and Service Design: courses in ITMO University

Software Engineering and Service Design: courses in ITMO University Software Engineering and Service Design: courses in ITMO University Igor Buzhinsky igor.buzhinsky@gmail.com Computer Technologies Department Department of Computer Science and Information Systems December

More information

CSE 123: Computer Networks Fall Quarter, 2014 MIDTERM EXAM

CSE 123: Computer Networks Fall Quarter, 2014 MIDTERM EXAM CSE 123: Computer Networks Fall Quarter, 2014 MIDTERM EXAM Instructor: Alex C. Snoeren Name SOLUTIONS Student ID Question Score Points 1 15 15 2 35 35 3 25 25 4 15 15 5 10 10 Total 100 100 This exam is

More information

Introduction to image coding

Introduction to image coding Introduction to image coding Image coding aims at reducing amount of data required for image representation, storage or transmission. This is achieved by removing redundant data from an image, i.e. by

More information

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval s two main stages: Indexing process

More information

Multimedia Systems WS 2010/2011

Multimedia Systems WS 2010/2011 Multimedia Systems WS 2010/2011 31.01.2011 M. Rahamatullah Khondoker (Room # 36/410 ) University of Kaiserslautern Department of Computer Science Integrated Communication Systems ICSY http://www.icsy.de

More information

In-Network Coding for Resilient Sensor Data Storage and Efficient Data Mule Collection

In-Network Coding for Resilient Sensor Data Storage and Efficient Data Mule Collection In-Network Coding for Resilient Sensor Data Storage and Efficient Data Mule Collection Michele Albano Jie Gao Instituto de telecomunicacoes, Aveiro, Portugal Stony Brook University, Stony Brook, USA Data

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

Internationalizing the Domain Name System. Šimon Hochla, Anisa Azis, Fara Nabilla

Internationalizing the Domain Name System. Šimon Hochla, Anisa Azis, Fara Nabilla Internationalizing the Domain Name System Šimon Hochla, Anisa Azis, Fara Nabilla Internationalize Internet Master in Innovation and Research in Informatics problematic of using non-ascii characters ease

More information

Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework

Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Sergio De Agostino Computer Science Department Sapienza University of Rome Internet as a Distributed System Modern

More information

Base Conversion written by Cathy Saxton

Base Conversion written by Cathy Saxton Base Conversion written by Cathy Saxton 1. Base 10 In base 10, the digits, from right to left, specify the 1 s, 10 s, 100 s, 1000 s, etc. These are powers of 10 (10 x ): 10 0 = 1, 10 1 = 10, 10 2 = 100,

More information

Lossless Compression of Cloud-Cover Forecasts for Low-Overhead Distribution in Solar-Harvesting Sensor Networks

Lossless Compression of Cloud-Cover Forecasts for Low-Overhead Distribution in Solar-Harvesting Sensor Networks Lossless Compression of Cloud-Cover Forecasts for Low-Overhead Distribution in Solar-Harvesting Sensor Networks Christian Renner and Phu Anh Tuan Nguyen ENSsys 14, Memphis, TN, USA November 6 th, 2014

More information

HOMEWORK # 2 SOLUTIO

HOMEWORK # 2 SOLUTIO HOMEWORK # 2 SOLUTIO Problem 1 (2 points) a. There are 313 characters in the Tamil language. If every character is to be encoded into a unique bit pattern, what is the minimum number of bits required to

More information

Combinational circuits

Combinational circuits Combinational circuits Combinational circuits are stateless The outputs are functions only of the inputs Inputs Combinational circuit Outputs 3 Thursday, September 2, 3 Enabler Circuit (High-level view)

More information

Creating A Simple Dictionary With Definitions

Creating A Simple Dictionary With Definitions Creating A Simple Dictionary With Definitions The KAS Knowledge Acquisition System allows you to create new dictionaries with definitions from scratch or append information to existing dictionaries. The

More information

Computer Networks and the Internet

Computer Networks and the Internet ? Computer the IMT2431 - Data Communication and Network Security January 7, 2008 ? Teachers are Lasse Øverlier and http://www.hig.no/~erikh Lectures and Lab in A126/A115 Course webpage http://www.hig.no/imt/in/emnesider/imt2431

More information

Subject knowledge requirements for entry into computer science teacher training. Expert group s recommendations

Subject knowledge requirements for entry into computer science teacher training. Expert group s recommendations Subject knowledge requirements for entry into computer science teacher training Expert group s recommendations Introduction To start a postgraduate primary specialist or secondary ITE course specialising

More information

A NEW LOSSLESS METHOD OF IMAGE COMPRESSION AND DECOMPRESSION USING HUFFMAN CODING TECHNIQUES

A NEW LOSSLESS METHOD OF IMAGE COMPRESSION AND DECOMPRESSION USING HUFFMAN CODING TECHNIQUES A NEW LOSSLESS METHOD OF IMAGE COMPRESSION AND DECOMPRESSION USING HUFFMAN CODING TECHNIQUES 1 JAGADISH H. PUJAR, 2 LOHIT M. KADLASKAR 1 Faculty, Department of EEE, B V B College of Engg. & Tech., Hubli,

More information

Streaming Lossless Data Compression Algorithm (SLDC)

Streaming Lossless Data Compression Algorithm (SLDC) Standard ECMA-321 June 2001 Standardizing Information and Communication Systems Streaming Lossless Data Compression Algorithm (SLDC) Phone: +41 22 849.60.00 - Fax: +41 22 849.60.01 - URL: http://www.ecma.ch

More information

File Management. Chapter 12

File Management. Chapter 12 Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution

More information

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

EE 261 Introduction to Logic Circuits. Module #2 Number Systems EE 261 Introduction to Logic Circuits Module #2 Number Systems Topics A. Number System Formation B. Base Conversions C. Binary Arithmetic D. Signed Numbers E. Signed Arithmetic F. Binary Codes Textbook

More information

Field Properties Quick Reference

Field Properties Quick Reference Field Properties Quick Reference Data types The following table provides a list of the available data types in Microsoft Office Access 2007, along with usage guidelines and storage capacities for each

More information

CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC

CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC CS 61C: Great Ideas in Computer Architecture Dependability: Parity, RAID, ECC Instructor: Justin Hsia 8/08/2013 Summer 2013 Lecture #27 1 Review of Last Lecture MapReduce Data Level Parallelism Framework

More information

Scaling 10Gb/s Clustering at Wire-Speed

Scaling 10Gb/s Clustering at Wire-Speed Scaling 10Gb/s Clustering at Wire-Speed InfiniBand offers cost-effective wire-speed scaling with deterministic performance Mellanox Technologies Inc. 2900 Stender Way, Santa Clara, CA 95054 Tel: 408-970-3400

More information

You can probably work with decimal. binary numbers needed by the. Working with binary numbers is time- consuming & error-prone.

You can probably work with decimal. binary numbers needed by the. Working with binary numbers is time- consuming & error-prone. IP Addressing & Subnetting Made Easy Working with IP Addresses Introduction You can probably work with decimal numbers much easier than with the binary numbers needed by the computer. Working with binary

More information

A Catalogue of the Steiner Triple Systems of Order 19

A Catalogue of the Steiner Triple Systems of Order 19 A Catalogue of the Steiner Triple Systems of Order 19 Petteri Kaski 1, Patric R. J. Östergård 2, Olli Pottonen 2, and Lasse Kiviluoto 3 1 Helsinki Institute for Information Technology HIIT University of

More information

6.3 Conditional Probability and Independence

6.3 Conditional Probability and Independence 222 CHAPTER 6. PROBABILITY 6.3 Conditional Probability and Independence Conditional Probability Two cubical dice each have a triangle painted on one side, a circle painted on two sides and a square painted

More information

Lecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E)

Lecture 23: Interconnection Networks. Topics: communication latency, centralized and decentralized switches (Appendix E) Lecture 23: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Appendix E) 1 Topologies Internet topologies are not very regular they grew incrementally Supercomputers

More information

Firefox, Opera, Safari for Windows BMP file handling information leak. September 2008. Discovered by: Mateusz j00ru Jurczyk, Hispasec Labs

Firefox, Opera, Safari for Windows BMP file handling information leak. September 2008. Discovered by: Mateusz j00ru Jurczyk, Hispasec Labs Firefox, Opera, Safari for Windows BMP file handling information leak September 2008 Discovered by: Mateusz j00ru Jurczyk, Hispasec Labs 1. Introduction The bitmap format implementations in Mozilla Firefox

More information

Design and Implementation of a Storage Repository Using Commonality Factoring. IEEE/NASA MSST2003 April 7-10, 2003 Eric W. Olsen

Design and Implementation of a Storage Repository Using Commonality Factoring. IEEE/NASA MSST2003 April 7-10, 2003 Eric W. Olsen Design and Implementation of a Storage Repository Using Commonality Factoring IEEE/NASA MSST2003 April 7-10, 2003 Eric W. Olsen Axion Overview Potentially infinite historic versioning for rollback and

More information

IE1204 Digital Design F12: Asynchronous Sequential Circuits (Part 1)

IE1204 Digital Design F12: Asynchronous Sequential Circuits (Part 1) IE1204 Digital Design F12: Asynchronous Sequential Circuits (Part 1) Elena Dubrova KTH / ICT / ES dubrova@kth.se BV pp. 584-640 This lecture IE1204 Digital Design, HT14 2 Asynchronous Sequential Machines

More information

Scalable Prefix Matching for Internet Packet Forwarding

Scalable Prefix Matching for Internet Packet Forwarding Scalable Prefix Matching for Internet Packet Forwarding Marcel Waldvogel Computer Engineering and Networks Laboratory Institut für Technische Informatik und Kommunikationsnetze Background Internet growth

More information

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV)

Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Interconnection Networks Programmierung Paralleler und Verteilter Systeme (PPV) Sommer 2015 Frank Feinbube, M.Sc., Felix Eberhardt, M.Sc., Prof. Dr. Andreas Polze Interconnection Networks 2 SIMD systems

More information

EE3414 Multimedia Communication Systems Part I

EE3414 Multimedia Communication Systems Part I EE3414 Multimedia Communication Systems Part I Spring 2003 Lecture 1 Yao Wang Electrical and Computer Engineering Polytechnic University Course Overview A University Sequence Course in Multimedia Communication

More information

Storage Optimization in Cloud Environment using Compression Algorithm

Storage Optimization in Cloud Environment using Compression Algorithm Storage Optimization in Cloud Environment using Compression Algorithm K.Govinda 1, Yuvaraj Kumar 2 1 School of Computing Science and Engineering, VIT University, Vellore, India kgovinda@vit.ac.in 2 School

More information

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm 1. LZ77:Sliding Window Lempel-Ziv Algorithm [gzip, pkzip] Encode a string by finding the longest match anywhere within a window of past symbols

More information

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral Click here to print this article. Re-Printed From SLCentral RAID: An In-Depth Guide To RAID Technology Author: Tom Solinap Date Posted: January 24th, 2001 URL: http://www.slcentral.com/articles/01/1/raid

More information

Data Storage. 1s and 0s

Data Storage. 1s and 0s Data Storage As mentioned, computer science involves the study of algorithms and getting machines to perform them before we dive into the algorithm part, let s study the machines that we use today to do

More information

Lecture 10: Regression Trees

Lecture 10: Regression Trees Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

More information

ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS

ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS Dasaradha Ramaiah K. 1 and T. Venugopal 2 1 IT Department, BVRIT, Hyderabad, India 2 CSE Department, JNTUH,

More information

Magic Word. Possible Answers: LOOSER WINNER LOTTOS TICKET. What is the magic word?

Magic Word. Possible Answers: LOOSER WINNER LOTTOS TICKET. What is the magic word? Magic Word Question: A magic word is needed to open a box. A secret code assigns each letter of the alphabet to a unique number. The code for the magic word is written on the outside of the box. What is

More information

Binary Adders: Half Adders and Full Adders

Binary Adders: Half Adders and Full Adders Binary Adders: Half Adders and Full Adders In this set of slides, we present the two basic types of adders: 1. Half adders, and 2. Full adders. Each type of adder functions to add two binary bits. In order

More information

Latency on a Switched Ethernet Network

Latency on a Switched Ethernet Network Application Note 8 Latency on a Switched Ethernet Network Introduction: This document serves to explain the sources of latency on a switched Ethernet network and describe how to calculate cumulative latency

More information

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b LZ77 The original LZ77 algorithm works as follows: A phrase T j starting at a position i is encoded as a triple of the form distance, length, symbol. A triple d, l, s means that: T j = T [i...i + l] =

More information

ER E P M A S S I CONSTRUCTING A BINARY TREE EFFICIENTLYFROM ITS TRAVERSALS DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1998-5

ER E P M A S S I CONSTRUCTING A BINARY TREE EFFICIENTLYFROM ITS TRAVERSALS DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1998-5 S I N S UN I ER E P S I T VER M A TA S CONSTRUCTING A BINARY TREE EFFICIENTLYFROM ITS TRAVERSALS DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1998-5 UNIVERSITY OF TAMPERE DEPARTMENT OF

More information

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1. File: chap04, Chapter 04 1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1. 2. True or False? A gate is a device that accepts a single input signal and produces one

More information

Computer and Network Security

Computer and Network Security Computer and Network Security Dr. Arjan Durresi Louisiana State University Baton Rouge, LA 70810 Durresi@csc.LSU.Edu These slides are available at: http://www.csc.lsu.edu/~durresi/csc4601_07/ Louisiana

More information

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems

More information

ELECTRONIC DOCUMENT IMAGING

ELECTRONIC DOCUMENT IMAGING AIIM: Association for Information and Image Management. Trade association and professional society for the micrographics, optical disk and electronic image management markets. Algorithm: Prescribed set

More information

Conceptual Framework Strategies for Image Compression: A Review

Conceptual Framework Strategies for Image Compression: A Review International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Special Issue-1 E-ISSN: 2347-2693 Conceptual Framework Strategies for Image Compression: A Review Sumanta Lal

More information

Data Integration through XML/XSLT. Presenter: Xin Gu

Data Integration through XML/XSLT. Presenter: Xin Gu Data Integration through XML/XSLT Presenter: Xin Gu q7.jar op.xsl goalmodel.q7 goalmodel.xml q7.xsl help, hurt GUI +, -, ++, -- goalmodel.op.xml merge.xsl goalmodel.input.xml profile.xml Goal model configurator

More information

Catch Me If You Can: A Practical Framework to Evade Censorship in Information-Centric Networks

Catch Me If You Can: A Practical Framework to Evade Censorship in Information-Centric Networks Catch Me If You Can: A Practical Framework to Evade Censorship in Information-Centric Networks Reza Tourani, Satyajayant (Jay) Misra, Joerg Kliewer, Scott Ortegel, Travis Mick Computer Science Department

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

First Semester Examinations 2011/12 INTERNET PRINCIPLES

First Semester Examinations 2011/12 INTERNET PRINCIPLES PAPER CODE NO. EXAMINER : Martin Gairing COMP211 DEPARTMENT : Computer Science Tel. No. 0151 795 4264 First Semester Examinations 2011/12 INTERNET PRINCIPLES TIME ALLOWED : Two Hours INSTRUCTIONS TO CANDIDATES

More information

Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis

Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Abdun Mahmood, Christopher Leckie, Parampalli Udaya Department of Computer Science and Software Engineering University of

More information

Raima Database Manager Version 14.0 In-memory Database Engine

Raima Database Manager Version 14.0 In-memory Database Engine + Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized

More information

CSE 459/598: Logic for Computer Scientists (Spring 2012)

CSE 459/598: Logic for Computer Scientists (Spring 2012) CSE 459/598: Logic for Computer Scientists (Spring 2012) Time and Place: T Th 10:30-11:45 a.m., M1-09 Instructor: Joohyung Lee (joolee@asu.edu) Instructor s Office Hours: T Th 4:30-5:30 p.m. and by appointment

More information