2 Information from a system point of view

3 Information from a system point of view All information processed by computer systems is only a sequence of zeros and ones. Nothing more. From a system point of view, ALL information is just a stream of zeros and ones.

4 Information from a system point of view The same sequence of zeros and ones can be a photo of a friend at one time, our favourite MP3 at another, and a letter to an uncle at yet another. It is we, the users, who decide how to interpret a given sequence of zeros and ones.

5 Information from a system point of view What we have (read) depends on the way we interpret it. It is not the graphic file that informs us that it is a graphic file; it is we who interpret the file as a graphic file. Of course, most modern file formats include information about the data they carry, but this is only there to help us.
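The point above can be tried out directly. The following is a minimal sketch, assuming four arbitrarily chosen bytes (the values 0x41 to 0x44 are an illustration, not taken from the slides); it reads the same bits as text, as an integer and as a floating-point number.

```python
import struct

# The same four bytes are used for every interpretation below.
# The byte values are an arbitrary choice made for this illustration.
data = bytes([0x41, 0x42, 0x43, 0x44])

as_text = data.decode("ascii")            # interpreted as ASCII text
as_uint = int.from_bytes(data, "big")     # interpreted as one unsigned integer
as_float = struct.unpack(">f", data)[0]   # interpreted as a 32-bit IEEE 754 float

print(as_text)    # ABCD
print(as_uint)    # 1094861636
print(as_float)   # a value slightly above 12
```

Nothing in the bytes themselves says which reading is the right one; the program chooses the interpretation.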

6 Information from a system point of view Linguistic analogy What does "para" mean? If we know that it is a Polish word: a pair (two objects); if we know that it is a Spanish word: for.

7 Information from a system point of view Linguistic analogy What does "MIX" mean? If we know that it is an English word: to put two things together; if we know that it is a string of Roman numerals: 1009.

8 Information from a system point of view Numeral analogy The same number is written as: 8 in the decimal numeral system; VIII in the Roman numeral system; 1000 in the binary numeral system.
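The two analogies can be checked with a short sketch. The helper name `roman_to_int` is hypothetical, introduced here only for illustration; it handles simple Roman numerals with subtractive notation.

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(text):
    """Read a string as a Roman numeral (hypothetical helper for this example)."""
    total = 0
    for i, ch in enumerate(text):
        value = ROMAN[ch]
        # Subtractive notation: a smaller value placed before a larger one is subtracted.
        if i + 1 < len(text) and value < ROMAN[text[i + 1]]:
            total -= value
        else:
            total += value
    return total

print(int("8", 10))          # 8    (decimal reading)
print(roman_to_int("VIII"))  # 8    (Roman reading)
print(int("1000", 2))        # 8    (binary reading)
print(roman_to_int("MIX"))   # 1009 (the "word" MIX read as Roman numerals)
```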

9 What do we code?

10 What do we code? Types of coding: alphanumeric; natural numbers; integer numbers; real numbers; graphic file (BMP); TCP/IP packet.

11 Alphanumeric An alphanumeric character is any letter, digit or symbol (such as "(", ":" or "+"), that is, everything that we can write (input) using a keyboard.

12 Alphanumeric This set includes: digits 0-9; letters a-z and A-Z; special characters (e.g. , ; .); math symbols (e.g. + = - %); all specific characters (e.g. # $).

13 Alphanumeric Coding is the process of changing a character entered from the keyboard or any other input device into its digital representation, that is, storing the character as a sequence of zeros and ones.

14 Alphanumeric There are some popular standards of coding.

15 Alphanumeric / ASCII American Standard Code for Information Interchange.

16 Alphanumeric / ASCII ASCII table.

17 Alphanumeric / ASCII In this type of coding there are codes defined for: the small (97-122) and capital (65-90) letters of the Latin alphabet; digits (48-57); symbols such as "(", ":" and "+" (32-47, 58-64, 91-96, 123-126); nonprintable characters controlling data flow, e.g. ACK (acknowledge) or BEL (bell, a sound signal) (0-31).
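The ranges can be verified with Python's built-in `ord` and `chr`. This is a small sketch; the function name `ascii_class` is hypothetical, and code 127 (DEL) is treated as a control character here, which is a standard ASCII fact rather than something the slide lists.

```python
def ascii_class(code):
    """Classify a 7-bit ASCII code by range (hypothetical helper)."""
    if 0 <= code <= 31 or code == 127:   # 127 is DEL, also nonprintable
        return "control"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "capital letter"
    if 97 <= code <= 122:
        return "small letter"
    if 32 <= code <= 126:                # everything printable that is left
        return "symbol"
    raise ValueError("not a 7-bit ASCII code")

for ch in "aZ7+":
    print(ch, ord(ch), ascii_class(ord(ch)))

print(chr(65), chr(97), chr(48))  # A a 0
```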

18 Alphanumeric / ASCII It's easy to verify that the capital Latin letters A, B, C, ..., which are defined by the numbers 65, 66, 67, ..., correspond to the bit pattern 010bbbbb.
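A quick way to carry out this verification is to print the codes in binary. The sketch below pads each code to 8 bits, so every capital letter shows the prefix 010 followed by five bits that identify the letter; `bits8` is a hypothetical helper name.

```python
import string

def bits8(ch):
    """Return the code of a character as an 8-bit binary string."""
    return format(ord(ch), "08b")

for letter in "ABCZ":
    print(letter, ord(letter), bits8(letter))
# A 65 01000001
# B 66 01000010
# C 67 01000011
# Z 90 01011010

# Every capital Latin letter starts with the same three bits.
print(all(bits8(c).startswith("010") for c in string.ascii_uppercase))  # True
```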

19 Alphanumeric / ASCII In short, in ASCII we have: 95 printable characters; 33 nonprintable characters ("directives").

20 Alphanumeric / ASCII The range of ASCII codes stretches from 0 to 127. That means that we need at least 7 bits to write all these numbers.

21 Alphanumeric / ASCII Because most computers in those days used 8-bit bytes (that is, they divide and process all information in 8-bit portions), there were 256 - 128 = 128 spare codes, numbered from 128 to 255.

22 Alphanumeric / ASCII ASCII characters did not cover the needs of nations that use Latin letters with diacritical marks (Germany, Poland) or completely different alphabets (Greece, Russia). To meet these needs, the 128 spare codes were used to represent such characters.

23 Alphanumeric / code pages The trouble was that lots of people had this idea at the same time, and they had their own ideas of what should go where in the space from 128 to 255.

24 Alphanumeric / code pages This is how code pages came into being. Code pages are sets of 256 characters with a shared first half and a sometimes completely different second half.

25 Alphanumeric / code pages This is why, when we manipulate any text and want to read nonstandard characters correctly, we HAVE TO know the code page used during coding.
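The effect is easy to reproduce with Python's built-in codecs. In this sketch the byte value 0xB1 is chosen only as an illustration; the codec names `iso8859_2`, `cp1250` and `latin_1` are the names Python uses for these code pages.

```python
raw = bytes([0xB1])  # one byte from the upper half (128-255)

# The same byte gives a different character under each code page.
for page in ("iso8859_2", "cp1250", "latin_1"):
    print(page, raw.decode(page))
# iso8859_2 ą
# cp1250 ±
# latin_1 ±

# In the other direction, the same letter gets a different code in each page.
print("ą".encode("iso8859_2"))  # b'\xb1'
print("ą".encode("cp1250"))     # b'\xb9'
```

Reading a file with the wrong code page does not raise an error here; it silently produces a different character, which is why the code page has to be known in advance.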

26 Alphanumeric / code pages ISO The International Organization for Standardization (ISO) is an international standard-setting body composed of representatives from various national standards organizations.

27 Alphanumeric / code pages ISO Founded on 23 February 1947, the organization promotes worldwide proprietary, industrial and commercial standards. It is headquartered in Geneva.

28 Alphanumeric / code pages ISO The standards (code pages) defined by ISO are known as ISO-8859-x, where x is a number from 1 up to 16. Eastern Europe, and thus also Poland, uses number 2 (ISO-8859-2).

29 Alphanumeric / Polish code pages

30 Alphanumeric / Polish code pages ISO-8859-2, also known as latin2, typical for UNIX-like systems; CP 1250, also known as win-1250, typical for the Windows family of systems; Mazovia, an encoding prepared for the Polish computer Mazovia; Unicode.

31 Alphanumeric Problems Obviously, there are many different code pages, even for the same language. Processing multilingual text is difficult. The range of codes (256) was too small for some languages, e.g. Chinese.

32 Alphanumeric / Unicode Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems. It is a standard for representing characters as integers.

33 Alphanumeric / Unicode Unlike ASCII, which uses 7 bits for each character, Unicode was originally designed as a 16-bit code, which means that it could represent more than 65,000 unique characters. The current standard defines code points from 0 to 0x10FFFF, which are stored using encodings such as UTF-8, UTF-16 or UTF-32. This is a bit of overkill for English and Western European languages, but it is necessary for some other languages, such as Greek, Chinese and Japanese. Many analysts believe that as the software industry becomes increasingly global, Unicode will eventually supplant ASCII as the standard character coding format.
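The difference between a character's number (its code point) and the bytes used to store it can be seen in a short sketch. The four sample characters are chosen only for illustration; the last one has a code point above 0xFFFF, so it does not fit in 16 bits.

```python
samples = ["A", "\u0105", "\u4e2d", "\U0001F600"]  # A, ą, 中, and an emoji

for ch in samples:
    code_point = ord(ch)                       # the integer assigned by Unicode
    utf8_len = len(ch.encode("utf-8"))         # bytes needed in UTF-8
    utf16_len = len(ch.encode("utf-16-be"))    # bytes needed in UTF-16
    print(f"U+{code_point:04X}", utf8_len, utf16_len)
# U+0041 1 2
# U+0105 2 2
# U+4E2D 3 2
# U+1F600 4 4
```

The code point identifies the character; how many bytes it occupies depends on the chosen encoding.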

34 Alphanumeric / Unicode Unicode: the most important facts Unambiguity. One code for one character and vice versa. Universality. All well-known (and not so well-known) languages and symbols. Efficiency. Character identification does not depend on control sequences or on the preceding or following characters.

35 Alphanumeric / Unicode Unicode: the most important facts Identification, not representation. What the character is, not how it looks (the style, the size and the language are not important). Semantics/properties. Character properties (e.g. alphabetical order) do not depend on the position in the code table but are defined in a properties table.

36 Alphanumeric / Unicode Unicode: the most important facts Plain text. Logical order. Unification. Identical characters with different meanings were replaced by one.

37 Alphanumeric / FOO code We adopt the following way of coding alphanumeric characters. The space character has the code 42.

39 Alphanumeric / FOO code In addition we define the following escape codes (control sequences): ESC1 (code 43) replaces the small letter written after this escape code with its capital; ESC2 (code 44) writes a diacritic (the main use of diacritical marks in Latin-derived alphabets is to change the sound value of the letter to which they are added).

40 Alphanumeric / FOO code Escape code ESC2: "ESC2 , letter" adds "," to the letter; "ESC2 . letter" adds "." to the letter; "ESC2 - letter" adds a strikethrough to the letter; "ESC2 ' letter" adds a line over the letter.

41 Alphanumeric / FOO code NL (code 45) forces a move to the next line.

42 Alphanumeric / FOO code As we can see, in the FOO code there are fewer than 64 signs but more than 32. This means that we must use at least 6 bits to write its codes (2^5 = 32 is too few, 2^6 = 64 is enough).
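The same reasoning works for any code size. The sketch below uses a hypothetical helper, `bits_needed`, and assumes for illustration that FOO has 46 codes (0 to 45, the highest code mentioned on the slides); the exact size of the FOO table is not given here.

```python
def bits_needed(count):
    """Smallest number of bits that gives at least `count` distinct codes."""
    return max(1, (count - 1).bit_length())

print(bits_needed(46))    # 6  (assumed FOO size: codes 0..45)
print(bits_needed(32))    # 5
print(bits_needed(33))    # 6
print(bits_needed(128))   # 7  (ASCII: codes 0..127)
```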

43 Alphanumeric / FOO code Example. Let's code the following sentence: Miała (kiedyś) Zośka 371 kotów a teraz ma 1 szczura - Mańka.
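A rough sketch of how the escape codes could be applied to such a sentence follows. It produces symbolic tokens rather than numeric codes, because the full FOO table is not reproduced here. The mapping of Polish letters to diacritic marks and the order "ESC1 before ESC2" are assumptions made for this illustration, and the names `foo_tokens` and `DIACRITICS` are hypothetical.

```python
ESC1, ESC2, SPACE = "ESC1", "ESC2", "SP"  # symbolic stand-ins for codes 43, 44, 42

# Assumed mapping: Polish letter -> (mark written after ESC2, base letter).
DIACRITICS = {
    "ą": (",", "a"), "ę": (",", "e"),
    "ż": (".", "z"),
    "ł": ("-", "l"),
    "ć": ("'", "c"), "ń": ("'", "n"), "ó": ("'", "o"),
    "ś": ("'", "s"), "ź": ("'", "z"),
}

def foo_tokens(text):
    """Turn text into a list of FOO-style tokens (illustrative only)."""
    tokens = []
    for ch in text:
        if ch == " ":
            tokens.append(SPACE)
            continue
        lower = ch.lower()
        if ch != lower:               # capital letter: announce it with ESC1
            tokens.append(ESC1)
        if lower in DIACRITICS:       # letter with a diacritic: ESC2, mark, base letter
            mark, base = DIACRITICS[lower]
            tokens.extend([ESC2, mark, base])
        else:
            tokens.append(lower)
    return tokens

print(foo_tokens("Zośka"))
# ['ESC1', 'z', 'o', 'ESC2', "'", 's', 'k', 'a']
```

Each token would then be replaced by its 6-bit code from the FOO table.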

44 Alphanumeric / FOO code

45 Natural numbers coding To code natural numbers we use the method, well known from previous lectures, of converting decimal numbers to binary numbers.
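As a reminder, one common way to do this conversion is repeated division by 2, collecting the remainders. The sketch below assumes this is the method meant; `to_binary` is a hypothetical helper name.

```python
def to_binary(n):
    """Convert a natural number to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, lowest first
        n //= 2
    return "".join(reversed(digits))

print(to_binary(8))     # 1000
print(to_binary(371))   # 101110011
```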

46 Integer numbers coding To code integer numbers we use: sign-and-magnitude (sign and absolute value); two's complement (U2).
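Both representations can be compared side by side. This is a minimal sketch assuming an 8-bit word; the function names are hypothetical.

```python
def sign_magnitude(n, bits=8):
    """Sign bit followed by the absolute value on the remaining bits."""
    if abs(n) >= 2 ** (bits - 1):
        raise ValueError("number does not fit")
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), f"0{bits - 1}b")

def twos_complement(n, bits=8):
    """Two's complement (U2): negative n is stored as 2**bits + n."""
    if not -(2 ** (bits - 1)) <= n < 2 ** (bits - 1):
        raise ValueError("number does not fit")
    return format(n % (2 ** bits), f"0{bits}b")

print(sign_magnitude(5), twos_complement(5))     # 00000101 00000101
print(sign_magnitude(-5), twos_complement(-5))   # 10000101 11111011
```

Non-negative numbers look the same in both systems; only negative numbers differ.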

47 Real numbers Several different representations of real numbers have been proposed, but by far the most widely used is the floating-point representation.

48 Real numbers To code real numbers we use: fixed-point real numbers; floating-point real numbers.

49 Real numbers Floating point example Assumptions: to represent a number we use 8 bits; the first bit from the left (bit 7) is the sign of the number; bits 6-4 are the mantissa; bits 3-0 are the characteristic; the constant K_C is equal to 7.
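The bit layout described above can be unpacked with shifts and masks. This sketch only splits a byte into its three fields; the sample byte is arbitrary, and treating K_C as a bias subtracted from the characteristic is an assumption, since the slide does not state the formula for the value.

```python
K_C = 7  # constant from the slide; assumed here to be a bias

def split_fields(byte):
    """Split an 8-bit value into (sign, mantissa bits, characteristic bits)."""
    sign = (byte >> 7) & 0b1           # bit 7
    mantissa = (byte >> 4) & 0b111     # bits 6-4
    characteristic = byte & 0b1111     # bits 3-0
    return sign, mantissa, characteristic

sign, mantissa, characteristic = split_fields(0b11011001)  # arbitrary sample byte
print(sign, mantissa, characteristic)   # 1 5 9
print(characteristic - K_C)             # 2 (exponent, under the bias assumption)
```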

