DNA Data and Program Representation. Alexandre David 1.2.05 adavid@cs.aau.dk

Similar documents

ECE 0142 Computer Organization. Lecture 3 Floating Point Representations

The programming language C. sws1 1

Chapter 7D The Java Virtual Machine

Levent EREN A-306 Office Phone: INTRODUCTION TO DIGITAL LOGIC

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Oct: 50 8 = 6 (r = 2) 6 8 = 0 (r = 6) Writing the remainders in reverse order we get: (50) 10 = (62) 8

This Unit: Floating Point Arithmetic. CIS 371 Computer Organization and Design. Readings. Floating Point (FP) Numbers

Number Representation

Binary Division. Decimal Division. Hardware for Binary Division. Simple 16-bit Divider Circuit

Systems I: Computer Organization and Architecture

The mathematics of RAID-6

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

To convert an arbitrary power of 2 into its English equivalent, remember the rules of exponential arithmetic:

Computer Science 281 Binary and Hexadecimal Review

Measures of Error: for exact x and approximation x Absolute error e = x x. Relative error r = (x x )/x.

How To Write Portable Programs In C

Sources: On the Web: Slides will be available on:

Binary Representation. Number Systems. Base 10, Base 2, Base 16. Positional Notation. Conversion of Any Base to Decimal.

Binary Representation

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Numbering Systems. InThisAppendix...

1. Give the 16 bit signed (twos complement) representation of the following decimal numbers, and convert to hexadecimal:

How To Write A Hexadecimal Program

This 3-digit ASCII string could also be calculated as n = (Data[2]-0x30) +10*((Data[1]-0x30)+10*(Data[0]-0x30));

Numeral Systems. The number twenty-five can be represented in many ways: Decimal system (base 10): 25 Roman numerals:

Data Storage: Each time you create a variable in memory, a certain amount of memory is allocated for that variable based on its data type (or class).

Intel 64 and IA-32 Architectures Software Developer s Manual

2010/9/19. Binary number system. Binary numbers. Outline. Binary to decimal

Solution for Homework 2

Memory Systems. Static Random Access Memory (SRAM) Cell

Pemrograman Dasar. Basic Elements Of Java

Adaptive Stable Additive Methods for Linear Algebraic Calculations

Chapter 5 Instructor's Manual

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T)

Chapter 1: Digital Systems and Binary Numbers

The New IoT Standard: Any App for Any Device Using Any Data Format. Mike Weiner Product Manager, Omega DevCloud KORE Telematics

Phys4051: C Lecture 2 & 3. Comment Statements. C Data Types. Functions (Review) Comment Statements Variables & Operators Branching Instructions

CSI 333 Lecture 1 Number Systems

CHAPTER 5 Round-off errors

The string of digits in the binary number system represents the quantity

Faculty of Engineering Student Number:

Chapter 5. Binary, octal and hexadecimal numbers

Informatica e Sistemi in Tempo Reale

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

Chapter 4: Computer Codes

IA-32 Intel Architecture Software Developer s Manual

Useful Number Systems

Instruction Set Architecture (ISA)

Binary Number System. 16. Binary Numbers. Base 10 digits: Base 2 digits: 0 1

Motorola 8- and 16-bit Embedded Application Binary Interface (M8/16EABI)

1 The Java Virtual Machine

Attention: This material is copyright Chris Hecker. All rights reserved.

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

2011, The McGraw-Hill Companies, Inc. Chapter 3

This section describes how LabVIEW stores data in memory for controls, indicators, wires, and other objects.

LSN 2 Number Systems. ECT 224 Digital Computer Fundamentals. Department of Engineering Technology

Outline. hardware components programming environments. installing Python executing Python code. decimal and binary notations running Sage

Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs

EE361: Digital Computer Organization Course Syllabus

Lecture N -1- PHYS Microcontrollers

Cryptography and Network Security. Prof. D. Mukhopadhyay. Department of Computer Science and Engineering. Indian Institute of Technology, Kharagpur

CPU Organization and Assembly Language

Arithmetic in MIPS. Objectives. Instruction. Integer arithmetic. After completing this lab you will:

Q-Format number representation. Lecture 5 Fixed Point vs Floating Point. How to store Q30 number to 16-bit memory? Q-format notation.

Object Oriented Software Design

Binary search tree with SIMD bandwidth optimization using SSE

HOMEWORK # 2 SOLUTIO

plc numbers Encoded values; BCD and ASCII Error detection; parity, gray code and checksums

Lecture 8: Binary Multiplication & Division

Numerical Matrix Analysis

Object Oriented Software Design

Caml Virtual Machine File & data formats Document version: 1.4

CS101 Lecture 11: Number Systems and Binary Numbers. Aaron Stevens 14 February 2011

Using the RDTSC Instruction for Performance Monitoring

Intel Architecture Software Developer s Manual

Binary Numbering Systems

Intel Hexadecimal Object File Format Specification Revision A, 1/6/88

C++ Programming Language

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.

NUMBER SYSTEMS. William Stallings

SOLVING LINEAR SYSTEMS

Binary Adders: Half Adders and Full Adders

Basic Use of the TI-84 Plus

Lecture 2. Binary and Hexadecimal Numbers

CPEN Digital Logic Design Binary Systems

Fast Arithmetic Coding (FastAC) Implementations

Chapter 5, The Instruction Set Architecture Level

Lecture 11: Number Systems

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

SECTION C [short essay] [Not to exceed 120 words, Answer any SIX questions. Each question carries FOUR marks] 6 x 4=24 marks

C++ Language Tutorial

Chapter 2 Elementary Programming

Monday January 19th 2015 Title: "Transmathematics - a survey of recent results on division by zero" Facilitator: TheNumberNullity / James Anderson, UK

Computer Architecture Lecture 3: ISA Tradeoffs. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013

İSTANBUL AYDIN UNIVERSITY

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

Transcription:

DNA Data and Program Representation Alexandre David 1.2.05 adavid@cs.aau.dk

Introduction Very important to understand how data is represented. operations limits precision Digital logic built on 2-valued logic system high/low 5V/0V true/false we abstract from that from now on bits 0/1 03+10-02-2010 DNA'10 - Aalborg University 2

Basics Natural/Real Numbers Base 10 Infinite Exact Computer Numbers Base 2 Finite Rounding - overflow In this lecture How to represent numbers & characters range, encoding. A little arithmetic. How to use these numbers. 03+10-02-2010 DNA'10 - Aalborg University 3

Questions How to code negative numbers? How to code real numbers? Which kind of precision do we get? Small numbers vs. big numbers. What about characters? 03+10-02-2010 DNA'10 - Aalborg University 4

Example Overflow: main() { printf( %d\n,200*300*400*500); } outputs -88490188. Fix sort of: main() { printf( %lld\n,200ll*300ll*400ll*500ll); } outputs 12000000000. What if I forget LL? 03+10-02-2010 DNA'10 - Aalborg University 5

Example Loss of precision: (3.14+1e20)-1e20==0.0 3.14+(1e20-1e20)==3.14 Test x == 0.0 not very useful when solving equations. In this lecture you will know why. 03+10-02-2010 DNA'10 - Aalborg University 6

Data Storage Basic unit is the byte (= 8 bits). C-declaration char short int int long int char * float double Typical 32-bit 1 2 4 4 4 4 8 Typical 64-bit 1 2 4 8 8 4 8 03+10-02-2010 DNA'10 - Aalborg University 7

Features Limits on addressable memory. Size linked to architecture 32/64. Aligned memory allocation (32/64 bits). Careful on addressing: main() { char a[]= Hello world! ; int *p=&a[1]; printf( %d\n,*p); } Bus error on some CPUs 03+10-02-2010 DNA'10 - Aalborg University 8

Integer Encoding Unsigned integers: UB = w 1 i= 0 x i 2 i Signed integers: Called 2 complement. SB = x w 1 2 w 1 + w 2 i= 0 x i 2 i Highest bit codes the sign. w: size of a word (in bits) x: bits (0 or 1) 03+10-02-2010 DNA'10 - Aalborg University 9

Range & Examples Examples: 1010 = 10, 0110 = 6, 0101 = 5 Range: 2 3 =8 2 2 =4 2 1 =2 2 0 =1 unsigned 2 k numbers from 0 to 2 k -1 signed 2 k numbers from -2 k-1 to 2 k-1-1 one more negative number than positive ones How to convert between types? int char long int sign extension 03+10-02-2010 DNA'10 - Aalborg University 10

Basic Arithmetic Logical operations (bitwise): &,,^,~,<<,>>. Example: a ^= b; b ^= a; a ^= b; Arithmetic operations: + - * /. Careful with shifts on signed integers! arithmetic & logical shifts Do not mess up with boolean operators (&&, ). 03+10-02-2010 DNA'10 - Aalborg University 11

Properties Most operators are the same on signed/unsigned integers from a binary point of view beauty of the encoding. One hardware implementation for + - / * valid for signed and unsigned integers. Example on 4 bits: 1 + 1001 = 1010 unsigned: 1+9 = 10 signed: 1+(-7) = -6 03+10-02-2010 DNA'10 - Aalborg University 12

Properties Operations based on the algebra <Z n,+ n,* n,- n,0,1> (commutativity, associativity, distributivity, ) Operations modulo n and -a=0 or -a=n-a. Similar to boolean algebra <{0,1},,&,~,0,1> with the addition of DeMorgan laws ~(a&b)=~a ~b, ~(a b)=~a&~b 03+10-02-2010 DNA'10 - Aalborg University 13

Practice: Shifts & Masks Read bit n: Use mask (1 << n). Set bit n on int bits[]: ipos = n / 32; imask = 1 << (n % 32); bits[ipos] = imask; Division/multiplications by powers of 2 seen as shifts. 2-complement: -a = ~(a-1) = ~a+1 03+10-02-2010 DNA'10 - Aalborg University 14

Arithmetic Machine code of + - * / same for int/uint. Integer convertion == type casting. Padding for the sign (int). Conversion is modulo the size of the new int. Beware of implicit conversions in C! Optimizations for some operations: 2*a == a+a == a<<1 a/2 == a>>1 a*2î == x << i a/2î == a>>i a%2î == a & ((1<<i)-1) 2î == 1<<i 03+10-02-2010 DNA'10 - Aalborg University 15

Notes Beware of precedence of operators: if (x & mask == value) WRONG if ((x & mask) == value) RIGHT Test odd numbers: if (x & 1) Careful: unsigned int i; for(i = 0; i < n-1; ++i)? 03+10-02-2010 DNA'10 - Aalborg University 16

Overflow carry 1011 +1101 111 11000 Subtraction? 1011 multiplicand *1101 multiplier 1011 0000 partial 1011 products 1011 10001111 product 03+10-02-2010 DNA'10 - Aalborg University 17

Hexadecimal Notation Learn the first powers of 2. Hexadecimal more useful: One digit codes 4 bits. 0 F=0 15=16 numbers. Notation: 0x Why? Examples: 0xa57e=1010 0101 0111 1110 10=8+2,5=4+1,7=4+2+1,14=8+4+2 0xf = 1111, 0x7=0111, 0x3=0011 03+10-02-2010 DNA'10 - Aalborg University 18

Endianness: Beware! 0xa57e is a notation for humans. Corresponds to 1010 0101 0111 1110 in base 2. Little endian: stored as 0111111010100101. Big endian: stored as 1010010101111110. Does not matter in C/Java/C#, except for bitmap manipulation device drivers network transfers 03+10-02-2010 DNA'10 - Aalborg University 19

Testing for Endianness Write a value on 32 bits. Read 8/16 bits and check what was written. Exercise for Sun/Intel. main() { int a = 0xf0000000; char *c = &a; printf( %x\n, *c); } Intel? 0 Sun? 0xfffffff0 What if a = 0x70000000? 03+10-02-2010 DNA'10 - Aalborg University 20

Representation of Reals How to code a real number with bits? Finite precision approximation. Represent very small and very large numbers density of encoding varies. Scientific notation used, e.g. (base 10), 3.141e2 but in base 2. Starter: fractional numbers bad for large or small numbers. Decimal (d): m Binary(b): = d 10 i d = i i= n i= n 03+10-02-2010 DNA'10 - Aalborg University 21 m b 2 i b i

IEEE Floats IEEE 754 floating point standard V=(-1) S M*2 E Number of bits (float/double): s[1], m[23/52], e[8/11]. S E M Normalized and de-normalized values. Bit fields: s, m, e to code respectively S, M, E. Normalized values (e 0, e 111 ) E=e-bias (-126 127/-1022 1023). M=1+m (1 M<2) Trick for more precision: implied leading 1. 03+10-02-2010 DNA'10 - Aalborg University 22

IEEE Floats De-normalized values (e= 0 or 11 ) e=0: E=1-bias, bias=2 k-1-1 Coding compensates for M not having an implied leading 1. M=m For numbers very close to 0. e=11 : m=0, (signed infinite) m!=0, NaN. No more of this. We abstract from this complexity. 03+10-02-2010 DNA'10 - Aalborg University 23

Ranges of Floats Single precision (float) 2-126 2 127 ~ 10-38 10 38. Double precision (double) 2-1022 2 1023 ~ 10-308 10 308. 03+10-02-2010 DNA'10 - Aalborg University 24

Features of IEEE floats +0.0 == 0 (binary representation = 00 ). If interpreted as unsigned int, floats can be sorted (+x ascending, -x descending). All int values representable by doubles. Not all int values representable by floats. round to even (avoid stat. bias) round towards 0 round up round down can t choose in C 03+10-02-2010 DNA'10 - Aalborg University 25

Properties (floats) Operations NOT associative. Not always inverse (infinity). Loss of precision. Ex: x=a+b+c; y=b+c+d; Optimize or not? Monotonicity a b a+x b+x Casts: int2float rounded, double2float rounded/overflow int2double, float2double OK Important for compilers and programmers. float2int, double2int truncated/rounded/overflow. 03+10-02-2010 DNA'10 - Aalborg University 26

IA32: The Good And The Bad Good: Uses internally 80 bits extended registers for more precision. Bad: Stack based. Side effects like changing values when loading or saving numbers in memory whereas register transfers are exact. Extensions: MMX, SSE, (Altivec). SIMD instructions = operations working in parallel on multiple data. 03+10-02-2010 DNA'10 - Aalborg University 27

Numerical Precision Evaluation of precision absolute x ± α relative x*(1 ± α) Be careful with division by very small values: Can amplify numerical errors. Numerical justification for Gauss method to solve linear equations. 03+10-02-2010 DNA'10 - Aalborg University 28

Character Sets By convention (standard), we assign codes to characters. Careful with programming languages C char = 1 byte, Java char = 2 bytes ASCII (American Standard Code for Information Interchange) common (7 bits), extended ASCII (8 bits). Unicode Other IEEE8859-x codings. Other Japanese codings 03+10-02-2010 DNA'10 - Aalborg University 29

More Complex Structures How to represent, e.g., struct { int a, b; char c}? Use continuous bits for the successive fields. Compilers will align data! Try: typedef struct { int a; char c; } foo_t; int main() { printf( %d\n, sizeof(foo_t)); return 0; } 03+10-02-2010 DNA'10 - Aalborg University 30

What About Programs? Instructions coded into bytes. opcode Processors interpret them as their particular instruction sets (standard). 03+10-02-2010 DNA'10 - Aalborg University 31