Lecture 1: Encoding Language. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han

Size: px
Start display at page:

Download "Lecture 1: Encoding Language. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han"

Transcription

1 Lecture 1: Encoding Language LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han

2 Objectives Understand the fundamentals of how language is encoded on a computer Text encoding systems ASCII ISO-8859 Unicode 1/12/2017 2

3 How is language represented on a computer? Natural ("Human") languages: Spoken form Written form *Also: sign languages The language of computers: 1/12/2017 3

4 The language of computers At the lowest level, computer language is binary: Information on a computer is stored in bits A bit is either: ON (=1, =yes) or OFF (=0, =no) This language essentially contains two alphabetic characters Next level up: byte A byte is made up of a sequence of 8 bits ex Historically, a byte was the number of bits used to encode a single character of text in a computer Byte is a basic addressable unit in most computer architecture 1/12/2017 4

5 Encoding a written language How to represent a text with 0s and 1s? Hello world! Each character is mapped to a code point (=character code), e.g., a unique integer. H 72 dec e 101 dec Each code point is represented as a binary number, using a fixed number of bits. 8 bits == 1 byte in the example above H 72 dec ( = = 72) e 101 dec ( = = 101) One byte can represent 256 (=2 8 ) different characters dec dec 1/12/2017 5

6 ASCII encoding for English How many bits are needed to encode English? 26 lowercase letters: a, b, c, d, e, 26 uppercase letters: A, B, C, D, E, 10 Arabic digits: 0, 1, 2, 3, 4, Punctuation:., : ;?! ' " Symbols: ( ) < > & % * $ + - We are already up to 80 6 bits (2 6 = 64) is not enough; we will need at least 7 (2 7 = 128) ASCII (the American Standard Code for Information Interchange) did just that Uses 7-bit code (= 128 characters) for storing English text Range 0 to 127 1/12/2017 6

7 The ASCII chart CII%20Conversion%20Chart.pdf Decimal Binary (7-bit) Character Decimal Binary (7-bit) Character (NULL) A B # C & a b c (DEL) 1/12/2017 7

8 ASCII (the American Standard Code for Information Interchange) The ASCII encoding scheme First published in 1963 Uses 7-bit code (= 128 characters) for storing English text, ranging from 0 to 127 In an 8-bit (1 byte) representation, the highest bit is always 0 Printable characters Upper and lower case roman alphabet Digits Punctuation marks, symbols, and space Includes 32 non-printing characters Control characters: BELL, ACKNWOLEDGE, BACKSPACE, DELETE, etc. originally for typewriters, many obsolete now WHITESPACE characters: TAB, LINE FEED, CARRIAGE RETURN 1/12/2017 8

9 Practice What is this English text? Note: byte (=8-bit) ASCII representation instead of 7-bit Space provided for your convenience only! Answer: Hi! 1/12/2017 9

10 Extending ASCII: ISO-8859, etc. ASCII (=7 bit, 128 characters) was sufficient for encoding English. But what about characters used in other languages? Solution: Extend ASCII into 8-bit (=256 characters) and use the additional 128 slots for non-english characters ISO-8859: has 16 different implementations! ISO ISO ISO aka Latin-1: French, German, Spanish, etc. Greek alphabet Hebrew alphabet JIS X 0208: Japanese characters Problem: overlapping character code space. 224 dec means à in Latin-1 but א in ISO ! 1/12/

11 The problem with multiple encoding systems Problem: Multiple coding systems map different characters to the same character code Solution 1: Provide meta-information on coding system Ex. MIME (Multipurpose Internet Mail Extensions) Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit But what if your message contains characters from multiple coding systems? Solution 2: Have a single universal code system for all writing systems UNICODE 1/12/

12 Unicode A character encoding standard developed by the Unicode Consortium Provides a single representation for all world's writing systems "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. ( 1/12/

13 How big is Unicode? Version 9.0 (2016) has codes for 128,237 characters Full Unicode standard uses 32 bits (4 bytes) : it can represent 2 32 = 4,294,967,296 characters! In reality, only 21 bits are needed Unicode has three encoding versions UTF-32 (32 bits/4 bytes): direct representation UTF-16 (16 bits/2 bytes): 2 16 =65,536 possibilities UTF-8 (8 bits/1 byte): 2 8 =256 possibilities 1/12/

14 8-bit, 16-bit, 32-bit UTF-32 UTF-16 UTF-8 (32 bits/4 bytes): direct representation (16 bits/2 bytes): 2 16 =65,536 possibilities (8 bits/1 byte): 2 8 =256 possibilities Wait! But how do you represent all of 2 32 (=4 billion) code points with only one byte (UTF-8: 2 8 =256 slots)? You don't. In reality, only 2 21 bits are ever utilized for 128K characters. UTF-8 and UTF-16 use a variable-width encoding. Why UTF-16 and UTF-8? They are more compact (more so for certain languages, i.e., English) 1/12/

15 Variable-width encoding 'H' as 1 byte (8 bits): cf. 'H' as 2 bytes (16 bits): UTF-8 as a variable-width encoding ASCII characters get encoded with just 1 byte ASCII is originally 7-bits, so the highest bit is always 0 in an 8-bit encoding All other characters are encoded with multiple bytes How to tell? The highest bit is used as a flag. Highest bit 0: single character Highest bit 1: part of a multi-byte character Advantage for English: 8-bit ASCII is already a valid UTF-8! É 1/12/

16 A look at Unicode chart How to find your Unicode character: Basic Latin (ASCII) 1/12/

17 Code point for M. But "004D"? 1/12/

18 Another representation: hexadecimal Hexadecimal (hex) = base-16 Utilizes 16 characters: A B C D E F Designed for human readability & easy byte conversion 2 4 =16: 1 hexadecimal digit is equivalent to 4 bits 1 byte (=8 bits) is encoded with just 2 hex chars! Letter Base-10 (decimal) Base-2 (binary) Base-16 (hex) M D Unicode characters are usually referenced by their hexadecimal code Lower-number characters go by their 4-char hex codes (2 bytes), e.g. U+004D ("M", U+ designates Unicode) Higher-number characters go by 5 or 6 hex codes, e.g. U+1D122 ( 1/12/

Chapter 4: Computer Codes

Chapter 4: Computer Codes Slide 1/30 Learning Objectives In this chapter you will learn about: Computer data Computer codes: representation of data in binary Most commonly used computer codes Collating sequence 36 Slide 2/30 Data

More information

ASCII Code. Numerous codes were invented, including Émile Baudot's code (known as Baudot

ASCII Code. Numerous codes were invented, including Émile Baudot's code (known as Baudot ASCII Code Data coding Morse code was the first code used for long-distance communication. Samuel F.B. Morse invented it in 1844. This code is made up of dots and dashes (a sort of binary code). It was

More information

The use of binary codes to represent characters

The use of binary codes to represent characters The use of binary codes to represent characters Teacher s Notes Lesson Plan x Length 60 mins Specification Link 2.1.4/hi Character Learning objective (a) Explain the use of binary codes to represent characters

More information

Number Representation

Number Representation Number Representation CS10001: Programming & Data Structures Pallab Dasgupta Professor, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur Topics to be Discussed How are numeric data

More information

Base Conversion written by Cathy Saxton

Base Conversion written by Cathy Saxton Base Conversion written by Cathy Saxton 1. Base 10 In base 10, the digits, from right to left, specify the 1 s, 10 s, 100 s, 1000 s, etc. These are powers of 10 (10 x ): 10 0 = 1, 10 1 = 10, 10 2 = 100,

More information

Systems I: Computer Organization and Architecture

Systems I: Computer Organization and Architecture Systems I: Computer Organization and Architecture Lecture 2: Number Systems and Arithmetic Number Systems - Base The number system that we use is base : 734 = + 7 + 3 + 4 = x + 7x + 3x + 4x = x 3 + 7x

More information

Memory is implemented as an array of electronic switches

Memory is implemented as an array of electronic switches Memory Structure Memory is implemented as an array of electronic switches Each switch can be in one of two states 0 or 1, on or off, true or false, purple or gold, sitting or standing BInary digits (bits)

More information

Symbols in subject lines. An in-depth look at symbols

Symbols in subject lines. An in-depth look at symbols An in-depth look at symbols What is the advantage of using symbols in subject lines? The age of personal emails has changed significantly due to the social media boom, and instead, people are receving

More information

Cyber Security Workshop Encryption Reference Manual

Cyber Security Workshop Encryption Reference Manual Cyber Security Workshop Encryption Reference Manual May 2015 Basic Concepts in Encoding and Encryption Binary Encoding Examples Encryption Cipher Examples 1 P a g e Encoding Concepts Binary Encoding Basics

More information

Binary Representation

Binary Representation Binary Representation The basis of all digital data is binary representation. Binary - means two 1, 0 True, False Hot, Cold On, Off We must tbe able to handle more than just values for real world problems

More information

How to represent characters?

How to represent characters? Copyright Software Carpentry 2010 This work is licensed under the Creative Commons Attribution License See http://software-carpentry.org/license.html for more information. How to represent characters?

More information

EURESCOM - P923 (Babelweb) PIR.3.1

EURESCOM - P923 (Babelweb) PIR.3.1 Multilingual text processing difficulties Malek Boualem, Jérôme Vinesse CNET, 1. Introduction Users of more and more applications now require multilingual text processing tools, including word processors,

More information

Counting in base 10, 2 and 16

Counting in base 10, 2 and 16 Counting in base 10, 2 and 16 1. Binary Numbers A super-important fact: (Nearly all) Computers store all information in the form of binary numbers. Numbers, characters, images, music files --- all of these

More information

Today s topics. Digital Computers. More on binary. Binary Digits (Bits)

Today s topics. Digital Computers. More on binary. Binary Digits (Bits) Today s topics! Binary Numbers! Brookshear.-.! Slides from Prof. Marti Hearst of UC Berkeley SIMS! Upcoming! Networks Interactive Introduction to Graph Theory http://www.utm.edu/cgi-bin/caldwell/tutor/departments/math/graph/intro

More information

Binary Representation. Number Systems. Base 10, Base 2, Base 16. Positional Notation. Conversion of Any Base to Decimal.

Binary Representation. Number Systems. Base 10, Base 2, Base 16. Positional Notation. Conversion of Any Base to Decimal. Binary Representation The basis of all digital data is binary representation. Binary - means two 1, 0 True, False Hot, Cold On, Off We must be able to handle more than just values for real world problems

More information

Ecma/TC39/2013/NN. 4 th Draft ECMA-XXX. 1 st Edition / July 2013. The JSON Data Interchange Format. Reference number ECMA-123:2009

Ecma/TC39/2013/NN. 4 th Draft ECMA-XXX. 1 st Edition / July 2013. The JSON Data Interchange Format. Reference number ECMA-123:2009 Ecma/TC39/2013/NN 4 th Draft ECMA-XXX 1 st Edition / July 2013 The JSON Data Interchange Format Reference number ECMA-123:2009 Ecma International 2009 COPYRIGHT PROTECTED DOCUMENT Ecma International 2013

More information

Unicode Security. Software Vulnerability Testing Guide. July 2009 Casaba Security, LLC www.casabasecurity.com

Unicode Security. Software Vulnerability Testing Guide. July 2009 Casaba Security, LLC www.casabasecurity.com Unicode Security Software Vulnerability Testing Guide (DRAFT DOCUMENT this document is currently a preview in DRAFT form. Please contact me with corrections or feedback.) Software Globalization provides

More information

Encoding Text with a Small Alphabet

Encoding Text with a Small Alphabet Chapter 2 Encoding Text with a Small Alphabet Given the nature of the Internet, we can break the process of understanding how information is transmitted into two components. First, we have to figure out

More information

Introduction to Unicode. By: Atif Gulzar Center for Research in Urdu Language Processing

Introduction to Unicode. By: Atif Gulzar Center for Research in Urdu Language Processing Introduction to Unicode By: Atif Gulzar Center for Research in Urdu Language Processing Introduction to Unicode Unicode Why Unicode? What is Unicode? Unicode Architecture Why Unicode? Pre-Unicode Standards

More information

Pemrograman Dasar. Basic Elements Of Java

Pemrograman Dasar. Basic Elements Of Java Pemrograman Dasar Basic Elements Of Java Compiling and Running a Java Application 2 Portable Java Application 3 Java Platform Platform: hardware or software environment in which a program runs. Oracle

More information

The Unicode Standard Version 8.0 Core Specification

The Unicode Standard Version 8.0 Core Specification The Unicode Standard Version 8.0 Core Specification To learn about the latest version of the Unicode Standard, see http://www.unicode.org/versions/latest/. Many of the designations used by manufacturers

More information

Variables, Constants, and Data Types

Variables, Constants, and Data Types Variables, Constants, and Data Types Primitive Data Types Variables, Initialization, and Assignment Constants Characters Strings Reading for this class: L&L, 2.1-2.3, App C 1 Primitive Data There are eight

More information

IBM Emulation Mode Printer Commands

IBM Emulation Mode Printer Commands IBM Emulation Mode Printer Commands Section 3 This section provides a detailed description of IBM emulation mode commands you can use with your printer. Control Codes Control codes are one-character printer

More information

ASCII Characters. 146 CHAPTER 3 Information Representation. The sign bit is 1, so the number is negative. Converting to decimal gives

ASCII Characters. 146 CHAPTER 3 Information Representation. The sign bit is 1, so the number is negative. Converting to decimal gives 146 CHAPTER 3 Information Representation The sign bit is 1, so the number is negative. Converting to decimal gives 37A (hex) = 134 (dec) Notice that the hexadecimal number is not written with a negative

More information

Frequently Asked Questions on character sets and languages in MT and MX free format fields

Frequently Asked Questions on character sets and languages in MT and MX free format fields Frequently Asked Questions on character sets and languages in MT and MX free format fields Version Final 17 January 2008 Preface The Frequently Asked Questions (FAQs) on character sets and languages that

More information

Digital codes. Resources and methods for learning about these subjects (list a few here, in preparation for your research):

Digital codes. Resources and methods for learning about these subjects (list a few here, in preparation for your research): Digital codes This worksheet and all related files are licensed under the Creative Commons Attribution License, version 1.0. To view a copy of this license, visit http://creativecommons.org/licenses/by/1.0/,

More information

So far we have considered only numeric processing, i.e. processing of numeric data represented

So far we have considered only numeric processing, i.e. processing of numeric data represented Chapter 4 Processing Character Data So far we have considered only numeric processing, i.e. processing of numeric data represented as integer and oating point types. Humans also use computers to manipulate

More information

Chapter 2: Basics on computers and digital information coding. A.A. 2012-2013 Information Technology and Arts Organizations

Chapter 2: Basics on computers and digital information coding. A.A. 2012-2013 Information Technology and Arts Organizations Chapter 2: Basics on computers and digital information coding Information Technology and Arts Organizations 1 Syllabus (1/3) 1. Introduction on Information Technologies (IT) and Cultural Heritage (CH)

More information

Section 1.4 Place Value Systems of Numeration in Other Bases

Section 1.4 Place Value Systems of Numeration in Other Bases Section.4 Place Value Systems of Numeration in Other Bases Other Bases The Hindu-Arabic system that is used in most of the world today is a positional value system with a base of ten. The simplest reason

More information

Preservation Handbook

Preservation Handbook Preservation Handbook Plain text Author Version 2 Date 17.08.05 Change History Martin Wynne and Stuart Yeates Written by MW 2004. Revised by SY May 2005. Revised by MW August 2005. Page 1 of 7 File: presplaintext_d2.doc

More information

2011, The McGraw-Hill Companies, Inc. Chapter 3

2011, The McGraw-Hill Companies, Inc. Chapter 3 Chapter 3 3.1 Decimal System The radix or base of a number system determines the total number of different symbols or digits used by that system. The decimal system has a base of 10 with the digits 0 through

More information

2 Number Systems. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

2 Number Systems. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to: 2 Number Systems 2.1 Source: Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: Understand the concept of number systems. Distinguish

More information

URL encoding uses hex code prefixed by %. Quoted Printable encoding uses hex code prefixed by =.

URL encoding uses hex code prefixed by %. Quoted Printable encoding uses hex code prefixed by =. ASCII = American National Standard Code for Information Interchange ANSI X3.4 1986 (R1997) (PDF), ANSI INCITS 4 1986 (R1997) (Printed Edition) Coded Character Set 7 Bit American National Standard Code

More information

Introduction to UNIX and SFTP

Introduction to UNIX and SFTP Introduction to UNIX and SFTP Introduction to UNIX 1. What is it? 2. Philosophy and issues 3. Using UNIX 4. Files & folder structure 1. What is UNIX? UNIX is an Operating System (OS) All computers require

More information

LSN 2 Number Systems. ECT 224 Digital Computer Fundamentals. Department of Engineering Technology

LSN 2 Number Systems. ECT 224 Digital Computer Fundamentals. Department of Engineering Technology LSN 2 Number Systems Department of Engineering Technology LSN 2 Decimal Number System Decimal number system has 10 digits (0-9) Base 10 weighting system... 10 5 10 4 10 3 10 2 10 1 10 0. 10-1 10-2 10-3

More information

Oct: 50 8 = 6 (r = 2) 6 8 = 0 (r = 6) Writing the remainders in reverse order we get: (50) 10 = (62) 8

Oct: 50 8 = 6 (r = 2) 6 8 = 0 (r = 6) Writing the remainders in reverse order we get: (50) 10 = (62) 8 ECE Department Summer LECTURE #5: Number Systems EEL : Digital Logic and Computer Systems Based on lecture notes by Dr. Eric M. Schwartz Decimal Number System: -Our standard number system is base, also

More information

The string of digits 101101 in the binary number system represents the quantity

The string of digits 101101 in the binary number system represents the quantity Data Representation Section 3.1 Data Types Registers contain either data or control information Control information is a bit or group of bits used to specify the sequence of command signals needed for

More information

Digital System Design Prof. D Roychoudhry Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Digital System Design Prof. D Roychoudhry Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Digital System Design Prof. D Roychoudhry Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture - 04 Digital Logic II May, I before starting the today s lecture

More information

CSC4510 AUTOMATA 2.1 Finite Automata: Examples and D efinitions Definitions

CSC4510 AUTOMATA 2.1 Finite Automata: Examples and D efinitions Definitions CSC45 AUTOMATA 2. Finite Automata: Examples and Definitions Finite Automata: Examples and Definitions A finite automaton is a simple type of computer. Itsoutputislimitedto yes to or no. It has very primitive

More information

Levent EREN levent.eren@ieu.edu.tr A-306 Office Phone:488-9882 INTRODUCTION TO DIGITAL LOGIC

Levent EREN levent.eren@ieu.edu.tr A-306 Office Phone:488-9882 INTRODUCTION TO DIGITAL LOGIC Levent EREN levent.eren@ieu.edu.tr A-306 Office Phone:488-9882 1 Number Systems Representation Positive radix, positional number systems A number with radix r is represented by a string of digits: A n

More information

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

EE 261 Introduction to Logic Circuits. Module #2 Number Systems EE 261 Introduction to Logic Circuits Module #2 Number Systems Topics A. Number System Formation B. Base Conversions C. Binary Arithmetic D. Signed Numbers E. Signed Arithmetic F. Binary Codes Textbook

More information

Positional Numbering System

Positional Numbering System APPENDIX B Positional Numbering System A positional numbering system uses a set of symbols. The value that each symbol represents, however, depends on its face value and its place value, the value associated

More information

2- Electronic Mail (SMTP), File Transfer (FTP), & Remote Logging (TELNET)

2- Electronic Mail (SMTP), File Transfer (FTP), & Remote Logging (TELNET) 2- Electronic Mail (SMTP), File Transfer (FTP), & Remote Logging (TELNET) There are three popular applications for exchanging information. Electronic mail exchanges information between people and file

More information

The programming language C. sws1 1

The programming language C. sws1 1 The programming language C sws1 1 The programming language C invented by Dennis Ritchie in early 1970s who used it to write the first Hello World program C was used to write UNIX Standardised as K&C (Kernighan

More information

CAPE-OPEN Expanding Process Modelling Capability through Software Interoperability Standards

CAPE-OPEN Expanding Process Modelling Capability through Software Interoperability Standards CAPE-OPEN Expanding Process Modelling Capability through Software Interoperability Standards Errata and Clarifications Identification Common Interface v1.0 www.colan.org CO-LaN 2001-2014 1 ARCHIVAL INFORMATION

More information

Computer Science 281 Binary and Hexadecimal Review

Computer Science 281 Binary and Hexadecimal Review Computer Science 281 Binary and Hexadecimal Review 1 The Binary Number System Computers store everything, both instructions and data, by using many, many transistors, each of which can be in one of two

More information

Solution for Homework 2

Solution for Homework 2 Solution for Homework 2 Problem 1 a. What is the minimum number of bits that are required to uniquely represent the characters of English alphabet? (Consider upper case characters alone) The number of

More information

Multi-lingual Label Printing with Unicode

Multi-lingual Label Printing with Unicode Multi-lingual Label Printing with Unicode White Paper Version 20100716 2009 SATO CORPORATION. All rights reserved. http://www.satoworldwide.com softwaresupport@satogbs.com 2009 SATO Corporation. All rights

More information

Chapter 1: Digital Systems and Binary Numbers

Chapter 1: Digital Systems and Binary Numbers Chapter 1: Digital Systems and Binary Numbers Digital age and information age Digital computers general purposes many scientific, industrial and commercial applications Digital systems telephone switching

More information

CDA 3200 Digital Systems. Instructor: Dr. Janusz Zalewski Developed by: Dr. Dahai Guo Spring 2012

CDA 3200 Digital Systems. Instructor: Dr. Janusz Zalewski Developed by: Dr. Dahai Guo Spring 2012 CDA 3200 Digital Systems Instructor: Dr. Janusz Zalewski Developed by: Dr. Dahai Guo Spring 2012 Outline Data Representation Binary Codes Why 6-3-1-1 and Excess-3? Data Representation (1/2) Each numbering

More information

The future of International SEO. The future of Search Engine Optimization (SEO) for International Business

The future of International SEO. The future of Search Engine Optimization (SEO) for International Business The future of International SEO The future of Search Engine Optimization (SEO) for International Business Whitepaper The World Wide Web is now allowing special characters in URLs which means crawlers now

More information

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer Computers CMPT 125: Lecture 1: Understanding the Computer Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 3, 2009 A computer performs 2 basic functions: 1.

More information

The Hexadecimal Number System and Memory Addressing

The Hexadecimal Number System and Memory Addressing APPENDIX C The Hexadecimal Number System and Memory Addressing U nderstanding the number system and the coding system that computers use to store data and communicate with each other is fundamental to

More information

CS101 Lecture 11: Number Systems and Binary Numbers. Aaron Stevens 14 February 2011

CS101 Lecture 11: Number Systems and Binary Numbers. Aaron Stevens 14 February 2011 CS101 Lecture 11: Number Systems and Binary Numbers Aaron Stevens 14 February 2011 1 2 1 3!!! MATH WARNING!!! TODAY S LECTURE CONTAINS TRACE AMOUNTS OF ARITHMETIC AND ALGEBRA PLEASE BE ADVISED THAT CALCULTORS

More information

Japanese Character Printers EPL2 Programming Manual Addendum

Japanese Character Printers EPL2 Programming Manual Addendum Japanese Character Printers EPL2 Programming Manual Addendum This addendum contains information unique to Zebra Technologies Japanese character bar code printers. The Japanese configuration printers support

More information

Decimal to Binary Conversion

Decimal to Binary Conversion Decimal to Binary Conversion A tool that makes the conversion of decimal values to binary values simple is the following table. The first row is created by counting right to left from one to eight, for

More information

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to: Chapter 3 Data Storage Objectives After studying this chapter, students should be able to: List five different data types used in a computer. Describe how integers are stored in a computer. Describe how

More information

Binary Numbers. Binary Octal Hexadecimal

Binary Numbers. Binary Octal Hexadecimal Binary Numbers Binary Octal Hexadecimal Binary Numbers COUNTING SYSTEMS UNLIMITED... Since you have been using the 10 different digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 all your life, you may wonder how

More information

Internationalization of Domain Names

Internationalization of Domain Names Internationalization of Domain Names Marc Blanchet (Marc.Blanchet@viagenie.qc.ca) Co-chair of the IETF idn working group Viagénie (http://www.viagenie.qc.ca) Do You Like Quoted Printable? If yes, then

More information

Data Integrator. Encoding Reference. Pervasive Software, Inc. 12365-B Riata Trace Parkway Austin, Texas 78727 USA

Data Integrator. Encoding Reference. Pervasive Software, Inc. 12365-B Riata Trace Parkway Austin, Texas 78727 USA Data Integrator Encoding Reference Pervasive Software, Inc. 12365-B Riata Trace Parkway Austin, Texas 78727 USA Telephone: 888.296.5969 or 512.231.6000 Fax: 512.231.6010 Email: info@pervasiveintegration.com

More information

Lecture 2. Binary and Hexadecimal Numbers

Lecture 2. Binary and Hexadecimal Numbers Lecture 2 Binary and Hexadecimal Numbers Purpose: Review binary and hexadecimal number representations Convert directly from one base to another base Review addition and subtraction in binary representations

More information

UNDERSTANDING SMS: Practitioner s Basics

UNDERSTANDING SMS: Practitioner s Basics UNDERSTANDING SMS: Practitioner s Basics Michael Harrington, CFCE, EnCE It is estimated that in 2006, 72% of all mobile phone users world wide were active users of SMS or text messaging. In European countries

More information

HP Business Notebook Password Localization Guidelines V1.0

HP Business Notebook Password Localization Guidelines V1.0 HP Business Notebook Password Localization Guidelines V1.0 November 2009 Table of Contents: 1. Introduction..2 2. Supported Platforms...2 3. Overview of Design...3 4. Supported Keyboard Layouts in Preboot

More information

SMPP protocol analysis using Wireshark (SMS)

SMPP protocol analysis using Wireshark (SMS) SMPP protocol analysis using Wireshark (SMS) Document Purpose Help analyzing SMPP traffic using Wireshark. Give hints about common caveats and oddities of the SMPP protocol and its implementations. Most

More information

Chapter Binary, Octal, Decimal, and Hexadecimal Calculations

Chapter Binary, Octal, Decimal, and Hexadecimal Calculations Chapter 5 Binary, Octal, Decimal, and Hexadecimal Calculations This calculator is capable of performing the following operations involving different number systems. Number system conversion Arithmetic

More information

As previously noted, a byte can contain a numeric value in the range 0-255. Computers don't understand Latin, Cyrillic, Hindi, Arabic character sets!

As previously noted, a byte can contain a numeric value in the range 0-255. Computers don't understand Latin, Cyrillic, Hindi, Arabic character sets! Encoding of alphanumeric and special characters As previously noted, a byte can contain a numeric value in the range 0-255. Computers don't understand Latin, Cyrillic, Hindi, Arabic character sets! Alphanumeric

More information

ASCII Encoding. The char Type. Manipulating Characters. Manipulating Characters

ASCII Encoding. The char Type. Manipulating Characters. Manipulating Characters The char Type ASCII Encoding The C char type stores small integers. It is usually 8 bits. char variables guaranteed to be able to hold integers 0.. +127. char variables mostly used to store characters

More information

Useful Number Systems

Useful Number Systems Useful Number Systems Decimal Base = 10 Digit Set = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} Binary Base = 2 Digit Set = {0, 1} Octal Base = 8 = 2 3 Digit Set = {0, 1, 2, 3, 4, 5, 6, 7} Hexadecimal Base = 16 = 2

More information

Character Code Structure and Extension Techniques

Character Code Structure and Extension Techniques Standard ECMA-35 6th Edition - December 1994 Standardizing Information and Communication Systems Character Code Structure and Extension Techniques Phone: +41 22 849.60.00 - Fax: +41 22 849.60.01 - X.400:

More information

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner. Handout 1 CS603 Object-Oriented Programming Fall 15 Page 1 of 11 Handout 1 Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner. Java

More information

HL7 Conformance Statement

HL7 Conformance Statement HL7 Conformance Statement Release VA20B (2014-03-28) ITH icoserve technology for healthcare GmbH Innrain 98, 6020 Innsbruck, Austria +43 512 89059-0 www.ith-icoserve.com Any printout or copy of this document

More information

Chapter 2. Binary Values and Number Systems

Chapter 2. Binary Values and Number Systems Chapter 2 Binary Values and Number Systems Numbers Natural numbers, a.k.a. positive integers Zero and any number obtained by repeatedly adding one to it. Examples: 100, 0, 45645, 32 Negative numbers A

More information

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T)

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T) Unit- I Introduction to c Language: C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating

More information

DNA Data and Program Representation. Alexandre David 1.2.05 adavid@cs.aau.dk

DNA Data and Program Representation. Alexandre David 1.2.05 adavid@cs.aau.dk DNA Data and Program Representation Alexandre David 1.2.05 adavid@cs.aau.dk Introduction Very important to understand how data is represented. operations limits precision Digital logic built on 2-valued

More information

HP Service Virtualization

HP Service Virtualization HP Service Virtualization Fixed Length Protocol Virtualization SV Training September 2014 Fixed Length Protocol Virtualization Technology Description Use Cases Supported Message Structures SV Service Description

More information

Right-to-Left Language Support in EMu

Right-to-Left Language Support in EMu EMu Documentation Right-to-Left Language Support in EMu Document Version 1.1 EMu Version 4.0 www.kesoftware.com 2010 KE Software. All rights reserved. Contents SECTION 1 Overview 1 SECTION 2 Switching

More information

Core Components Data Type Catalogue Version 3.1 17 October 2011

Core Components Data Type Catalogue Version 3.1 17 October 2011 Core Components Data Type Catalogue Version 3.1 17 October 2011 Core Components Data Type Catalogue Version 3.1 Page 1 of 121 Abstract CCTS 3.0 defines the rules for developing Core Data Types and Business

More information

Email Electronic Mail

Email Electronic Mail Email Electronic Mail Electronic mail paradigm Most heavily used application on any network Electronic version of paper-based office memo Quick, low-overhead written communication Dates back to time-sharing

More information

Unsigned Conversions from Decimal or to Decimal and other Number Systems

Unsigned Conversions from Decimal or to Decimal and other Number Systems Page 1 of 5 Unsigned Conversions from Decimal or to Decimal and other Number Systems In all digital design, analysis, troubleshooting, and repair you will be working with binary numbers (or base 2). It

More information

Japannext PTS ITCH Market Data Specification. Version 1.4 Updated 3 October 2014

Japannext PTS ITCH Market Data Specification. Version 1.4 Updated 3 October 2014 Japannext PTS ITCH Market Data Specification Version 1.4 Updated 3 October 2014 Table of Contents 1. Introduction... 3 2. Overview... 3 3. Data Types... 3 4. Outbound Sequenced Messages... 3 4.1 Seconds...

More information

Internationalizing the Domain Name System. Šimon Hochla, Anisa Azis, Fara Nabilla

Internationalizing the Domain Name System. Šimon Hochla, Anisa Azis, Fara Nabilla Internationalizing the Domain Name System Šimon Hochla, Anisa Azis, Fara Nabilla Internationalize Internet Master in Innovation and Research in Informatics problematic of using non-ascii characters ease

More information

Representação de Caracteres

Representação de Caracteres Representação de Caracteres IFBA Instituto Federal de Educ. Ciencia e Tec Bahia Curso de Analise e Desenvolvimento de Sistemas Introdução à Ciência da Computação Prof. Msc. Antonio Carlos Souza Coletânea

More information

Networking Applications

Networking Applications Networking Dr. Ayman A. Abdel-Hamid College of Computing and Information Technology Arab Academy for Science & Technology and Maritime Transport Electronic Mail 1 Outline Introduction SMTP MIME Mail Access

More information

Number Systems. Introduction / Number Systems

Number Systems. Introduction / Number Systems Number Systems Introduction / Number Systems Data Representation Data representation can be Digital or Analog In Analog representation values are represented over a continuous range In Digital representation

More information

CSI 333 Lecture 1 Number Systems

CSI 333 Lecture 1 Number Systems CSI 333 Lecture 1 Number Systems 1 1 / 23 Basics of Number Systems Ref: Appendix C of Deitel & Deitel. Weighted Positional Notation: 192 = 2 10 0 + 9 10 1 + 1 10 2 General: Digit sequence : d n 1 d n 2...

More information

Count the Dots Binary Numbers

Count the Dots Binary Numbers Activity 1 Count the Dots Binary Numbers Summary Data in computers is stored and transmitted as a series of zeros and ones. How can we represent words and numbers using just these two symbols? Curriculum

More information

SYMETRIX SOLUTIONS: TECH TIP August 2015

SYMETRIX SOLUTIONS: TECH TIP August 2015 String Output Modules The purpose of this document is to provide an understanding of operation and configuration of the two different String Output modules available within SymNet Composer. The two different

More information

Numeral Systems. The number twenty-five can be represented in many ways: Decimal system (base 10): 25 Roman numerals:

Numeral Systems. The number twenty-five can be represented in many ways: Decimal system (base 10): 25 Roman numerals: Numeral Systems Which number is larger? 25 8 We need to distinguish between numbers and the symbols that represent them, called numerals. The number 25 is larger than 8, but the numeral 8 above is larger

More information

One Report, Many Languages: Using SAS Visual Analytics to Localize Your Reports

One Report, Many Languages: Using SAS Visual Analytics to Localize Your Reports Technical Paper One Report, Many Languages: Using SAS Visual Analytics to Localize Your Reports Will Ballard and Elizabeth Bales One Report, Many Languages: Using SAS Visual Analytics to Localize Your

More information

2- Electronic Mail (SMTP), File Transfer (FTP), & Remote Logging (TELNET)

2- Electronic Mail (SMTP), File Transfer (FTP), & Remote Logging (TELNET) 2- Electronic Mail (SMTP), File Transfer (FTP), & Remote Logging (TELNET) There are three popular applications for exchanging information. Electronic mail exchanges information between people and file

More information

Regular Expression Syntax

Regular Expression Syntax 1 of 5 12/22/2014 9:55 AM EmEditor Home - EmEditor Help - How to - Search Regular Expression Syntax EmEditor regular expression syntax is based on Perl regular expression syntax. Literals All characters

More information

Goals. Unary Numbers. Decimal Numbers. 3,148 is. 1000 s 100 s 10 s 1 s. Number Bases 1/12/2009. COMP370 Intro to Computer Architecture 1

Goals. Unary Numbers. Decimal Numbers. 3,148 is. 1000 s 100 s 10 s 1 s. Number Bases 1/12/2009. COMP370 Intro to Computer Architecture 1 Number Bases //9 Goals Numbers Understand binary and hexadecimal numbers Be able to convert between number bases Understand binary fractions COMP37 Introduction to Computer Architecture Unary Numbers Decimal

More information

Third Southern African Regional ACM Collegiate Programming Competition. Sponsored by IBM. Problem Set

Third Southern African Regional ACM Collegiate Programming Competition. Sponsored by IBM. Problem Set Problem Set Problem 1 Red Balloon Stockbroker Grapevine Stockbrokers are known to overreact to rumours. You have been contracted to develop a method of spreading disinformation amongst the stockbrokers

More information

The ASCII Character Set

The ASCII Character Set The ASCII Character Set The American Standard Code for Information Interchange or ASCII assigns values between 0 and 255 for upper and lower case letters, numeric digits, punctuation marks and other symbols.

More information

Chapter 5. Binary, octal and hexadecimal numbers

Chapter 5. Binary, octal and hexadecimal numbers Chapter 5. Binary, octal and hexadecimal numbers A place to look for some of this material is the Wikipedia page http://en.wikipedia.org/wiki/binary_numeral_system#counting_in_binary Another place that

More information

2 ASCII TABLE (DOS) 3 ASCII TABLE (Window)

2 ASCII TABLE (DOS) 3 ASCII TABLE (Window) 1 ASCII TABLE 2 ASCII TABLE (DOS) 3 ASCII TABLE (Window) 4 Keyboard Codes The Diagram below shows the codes that are returned when a key is pressed. For example, pressing a would return 0x61. If it is

More information

USB Card Reader Configuration Utility. User Manual. Draft!

USB Card Reader Configuration Utility. User Manual. Draft! USB Card Reader Configuration Utility User Manual Draft! SB Research 2009 The Configuration Utility for USB card reader family: Concept: To allow for field programming of the USB card readers a configuration

More information

2010/9/19. Binary number system. Binary numbers. Outline. Binary to decimal

2010/9/19. Binary number system. Binary numbers. Outline. Binary to decimal 2/9/9 Binary number system Computer (electronic) systems prefer binary numbers Binary number: represent a number in base-2 Binary numbers 2 3 + 7 + 5 Some terminology Bit: a binary digit ( or ) Hexadecimal

More information

This is great when speed is important and relatively few words are necessary, but Max would be a terrible language for writing a text editor.

This is great when speed is important and relatively few words are necessary, but Max would be a terrible language for writing a text editor. Dealing With ASCII ASCII, of course, is the numeric representation of letters used in most computers. In ASCII, there is a number for each character in a message. Max does not use ACSII very much. In the

More information

Binary, Hexadecimal, Octal, and BCD Numbers

Binary, Hexadecimal, Octal, and BCD Numbers 23CH_PHCalter_TMSETE_949118 23/2/2007 1:37 PM Page 1 Binary, Hexadecimal, Octal, and BCD Numbers OBJECTIVES When you have completed this chapter, you should be able to: Convert between binary and decimal

More information