1. Basic encoding principles



Similar documents
ISO/IEC JTC1 SC2/WG2 N4399

The Unicode Standard Version 8.0 Core Specification

What is a piper plot?

USB Card Reader Configuration Utility. User Manual. Draft!

Electronic Multi-Measuring Instrument

L2/ Abstract Introduction

Table 1 below is a complete list of MPTH commands with descriptions. Table 1 : MPTH Commands. Command Name Code Setting Value Description

Internationalizing the Domain Name System. Šimon Hochla, Anisa Azis, Fara Nabilla

Caml Virtual Machine File & data formats Document version: 1.4

EURESCOM - P923 (Babelweb) PIR.3.1

Hieroglyphic Questions

DEBT COLLECTION SYSTEM ACCOUNT SUBMISSION FILE

IP Matrix MVC-FIPM. Installation and Operating Manual

CHAPTER 8 BAR CODE CONTROL

Binary Representation

San José, February 16, 2001

Understanding Hieroglyphs: Be the Scribe

Tamper protection with Bankgirot HMAC Technical Specification

Memory Systems. Static Random Access Memory (SRAM) Cell

Volume Serial Numbers and Format Date/Time Verification

(Least Squares Investigation)

Magic Word. Possible Answers: LOOSER WINNER LOTTOS TICKET. What is the magic word?

118 One hundred Eighteen


WEBMAIL User s Manual

Data Acquisition Module with I2C interface «I2C-FLEXEL» User s Guide

Whisky Emoji Submission

Communications Protocol for Akai APC40 Controller

7-Bit coded Character Set

3. USB FLASH DRIVE PREPARATION. Almost all current PC firmware permits booting from a USB drive, allowing the launch

ANCIENT EGYPTIAN HIEROGLYPHS

Right-to-Left Language Support in EMu

ISO/IEC JTC1/SC2/WG2 N 2342

CTNET Field Protocol Specification November 19, 1997 DRAFT

Application Note. Introduction AN2471/D 3/2003. PC Master Software Communication Protocol Specification

Chapter 7D The Java Virtual Machine

Statistics 100A Homework 1 Solutions

US Equities/Options Multicast PITCH Specification. Version

USER MANUAL FOR. autocue.com

Creating Serial Numbers using Design and Print Online. Creating a Barcode in Design and Print Online. Creating a QR Code in Design and Print Online

Ancient Egypt Tic-Tac-Toe Menu

Quectel Cellular Engine

Math 408, Actuarial Statistics I, Spring Solutions to combinatorial problems

Cyber Security Workshop Encryption Reference Manual

WIZnet S2E (Serial-to-Ethernet) Device s Configuration Tool Programming Guide

Romanian Online Dialect Atlas: Crişana

XMP Specification. ADOBE SYSTEMS INCORPORATED Corporate Headquarters 345 Park Avenue San Jose, CA (408)

CNC Transfer. Operating Manual

CX-Supervisor CX-MODBUS TCP

ValuPrompt Computer Prompter Instruction Manual

Font and color choices are all made from the Message or Format Text tab on the ribbon.

Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment

Excel Database Management

Advanced Security Mechanisms for Machine Readable Travel Documents and eidas Token

Interface Protocol v1.2

Intro to Ancient Writing Systems

Character Code Structure and Extension Techniques

HKSCS-2004 Support for Windows Platform

Painting an Egyptian tomb

An overview of FAT12

Factoring - Grouping

ZFS On-Disk Data Walk (Or: Where's My Data) OpenSolaris Developer Conference, June 25-27, 2008 Prague

Smart Card Application Standard Draft

ISO/IEC JTC1/SC2/WG2 N

PRODUCT MANUAL SKX OPEN SKX ADVANCE ZN1RX-SKXOPEN. Edition 2 Version 1.1

INTERNATIONAL CIVIL AVIATION ORGANIZATION ASIA AND PACIFIC OFFICE ASIA/PACIFIC ATN NETWORK SERVICE ACCESS POINT (NSAP) ADDRESSING PLAN

This is a preview - click here to buy the full publication INTERNATIONAL STANDARD

Table of Contents. Search Results.21. Equipment Change Request...10 Equipment Removal Request...11 Disposed..12 Not found 12

Command Param1 Param2 Return1 Return2 Description. 0xE9 0..0x7F (id) speed pos_high pos_low Set servo #id speed & read position

David Cowen Matthew Seyer G-C Partners, LLC

DATA ITEM DESCRIPTION

Chapter 4: Computer Codes

LCAO-MO Correlation Diagrams

Safer data transmission using Steganography

Configurable Events for APC Network Management Card

PDF Forensic Report - Job Reference

Bluetooth HID Profile

Solving printing problems

LTO label are. your labels to. environment. The human. label enables. Prefix. Tuesdays. The

Command a host of forces into battle, in almost any era.

Document review management. Raising Sales Invoices user guide

Brunata Optuna W (171)

2 days without the kids

SocketScan Software Advanced Programming Guide

Memory is implemented as an array of electronic switches

Ancient Egypt: Symbols of the pharaoh

Has difficulty with counting reliably in tens from a multiple of ten

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy

Introduction to Unicode. By: Atif Gulzar Center for Research in Urdu Language Processing

Unicode in Mobile Phones

Erasure Codes Made So Simple, You ll Really Like Them

Hermes.Net IVR Designer Page 2 36

M A S S A C H U S E T T S I N S T I T U T E O F T E C H N O L O G Y DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Rendering/Layout Engine for Complex script. Pema Geyleg

Preservation Handbook

2 MODEL AND APPLICATION INFORMATION

APPENDIX E COMMAND BARCODES

Transcription:

1 of 5 5/2/2006 11:41 AM ISO/IEC JTC1/SC2/WG2 N1636 DATE: 1997-08-25 DOC TYPE: Expert contribution TITLE: Encoding Egyptian Hieroglyphs in ISO/IEC 10646-2 SOURCE: Michael Everson PROJECT: JTC1.02.18.02 STATUS: Discussion paper ACTION ID: FYI DUE DATE: -- DISTRIBUTION: Worldwide MEDIUM: Paper and web NO. OF PAGES: 5 (printed at 85%) 1. Basic encoding principles To encode Egyptian hieroglyphs, we should adopt the existing encoding principles which have been developed since computers were introduced into Egyptology in the late 1970s and the early 1980s, and which are widely if not universally employed by Egyptologists today. The Manuel de codage (Paris: 1988) and the Macintosh and Windows implementations (MacScribe and WinGlyph) which support it are sufficiently well-developed that they can be adopted with a minimum of effort into ISO/IEC 10646 -- optimizing the transfer from existing encoded texts to 10646-coded texts. 2. Character set repertoire 2.1 Repertoire To begin with, two distinct repertoires should be encoded. The first, smaller set, called BASIC EGYPTIAN HIEROGLYPHS, consists of the essential Gardiner Middle Egyptian sign list, and comprises 761 characters. The second, much larger set, called EXTENDED EGYPTIAN HIEROGLYPHS, as presented in Hieroglyphica (Utrecht and Paris: 1993), comprises 4872 characters. Additionally, 37 Alternate Format characters will be needed in order to map currently-encoded texts to 10646. It is expected that in the next decade the final number of hieroglyphs known and analyzed will total some 8000, which means that we can expect, eventually, further extensions (such as EXTENDED EGYPTIAN HIEROGLYPHS-A) to be added to the standard. 2.2. Source of the repertoire To achieve the maximum ease of mapping the characters, the 23 eight-bit fonts (4 Basic and 19 Extended) of the Centre for Computer-Aided Egyptological Research should be taken as the base for encoding, and the characters should be arranged serially in 10646. This means that the 166 characters in Glyph Basic A will occupy positions x000 to x0b5, the 192 characters in Glyph Basic B will occupy positions x0b6 to x175, the 185 characters in Glyph Basic C will occupy positions x176 to x22e, and the 202 characters in Glyph Basic D will occupy positions x22f to

2 of 5 5/2/2006 11:41 AM x2f8. And so on for the 19 fonts of the Glyph ExtLib A - S set. 3. Character names There is a number of ways to designate Egyptian hieroglyphic characters: by conventional sound (w, m, wn) which works for only a very few characters by complex description (baby chick, owl, hare, standing monkey holding severed head, recumbant crocodile with cobra headdress and flagellum, two crouching men holding hands, skirted man holding jackal aloft by the throat, seated Thoth, seated Thoth with flagellum, seated Thoth with ankh, Dhwty or Thoth) by standardized sign number (G43, G17, E34, E58A, I98, A470, A248A, C3, C3A, C3B, G26B) Obviously the only sensible way of naming the characters is to identify them with their standard Egyptological catalogue numbers, since otherwise the character names would be picturesque but overlong. The name EGYPTIAN HIEROGLYPHIC SIGN E34 refers uniquely and unambiguously to the hare character used for the sound wn. 4. Alternate format characters The complexity of the repertoire is largely a problem for the Egyptologists in the first place and for the preparer of the encoding proposal in the second; but the set is finite and well-defined. More subtly difficult, however, is mapping the formatting characters of the Manuel de codage to 10646. The chief question is, how closely must the 10646 encoding reflect them? My view is that it should reflect as much of it as possible. 4.1 Plain text and Egyptian hieroglyphs Positioning is an essential part of Egyptian hieroglyphic text processing, and ISO/IEC 10646 must represent positioning and directionality accurately and completely. It should be assumed that Egyptian has a basic left-to-right directionality; both left-to-right and right-to-left are used. The existing U+200E LEFT-TO-RIGHT MARK and U+200F RIGHT-TO-LEFT MARK can be used to effect whatever directional requirements are needed. However, these marks should be used only for reversing the directionality of the entire text, not for individual signs, which may be mirrored without affecting the directionality of the text. For those, a MIRROR SIGN is proposed here, as are a number of rotation signs -- though it should be noted that other scripts, such as Runic, could benefit from such signs. The END OF LINE MARKER and END OF PAGE MARKER are essential for correct processing of columnar Egyptian and their Manual de codage equivalents are heavily relied upon in current encoded texts. 4.2. Beginnings and endings Signs like BEGIN CARTOUCHE and END CARTOUCHE should be understood to function exactly as do LEFT PARENTHESIS and RIGHT PARENTHESIS (that, is, they take bi-directional context into account). Rendering should draw a continuous line between the two characters. 4.3 Fancy text

3 of 5 5/2/2006 11:41 AM Colour is not normally considered to be a feature of plain text, though red (grey in the hardcopy of this document) is used meaningfully in the Coffin Texts and the New Kingdom funerary papyri: Papyrus of Ani plate IV:26-27 Papyrus of Ani plate VI:6-7 Red colour and black colour are encoded with special locking-shift sequences in the Manuel de codage. It should be noted that the Naxi or Tompa script (in China) also uses colour to achieve semantic distinctions, and it may not be practical to proscribe colour as a quality of plain text. In the present proposal, an encoding is given for red hieroglyphs, because the following editorial practice supports it: I give below examples from Budge 1895 in which the rubric is represented not by colour but in another way: Papyrus of Ani plate IV:26-27 Papyrus of Ani plate VI:6-7 It is proposed that BEGIN RUBRIC SIGN and END RUBRIC SIGN be encoded and that rendering as red or as a black line above be left to implementation. The RUBRIC SIGNs function just as the CARTOUCHE SIGNs do. Issue: Have all the relevant Manuel de codage characters been accounted for in the list below? 4.4 Proposed Alternate Formatting and other characters 0001 x300 EGYPTIAN HIEROGLYPHIC SIGN SEPARATOR 0001 x301 EGYPTIAN HIEROGLYPHIC SIGN JUXTAPOSITIONER 0001 x302 EGYPTIAN HIEROGLYPHIC SIGN SUBORDINATOR 0001 x303 EGYPTIAN HIEROGLYPHIC BEGIN CLUSTER MARK 0001 x304 EGYPTIAN HIEROGLYPHIC END CLUSTER MARK 0001 x305 EGYPTIAN HIEROGLYPHIC END OF LINE MARKER 0001 x306 EGYPTIAN HIEROGLYPHIC END OF PAGE MARKER 0001 x307 EGYPTIAN HIEROGLYPHIC BEGIN CARTOUCHE 0001 x308 EGYPTIAN HIEROGLYPHIC END CARTOUCHE 0001 x309 EGYPTIAN HIEROGLYPHIC BEGIN REVERSED CARTOUCHE 0001 x30a EGYPTIAN HIEROGLYPHIC END REVERSED CARTOUCHE 0001 x30b EGYPTIAN HIEROGLYPHIC BEGIN CAPLESS CARTOUCHE 0001 x30c EGYPTIAN HIEROGLYPHIC END CAPLESS CARTOUCHE 0001 x30d EGYPTIAN HIEROGLYPHIC BEGIN HWT SIGN 0001 x30e EGYPTIAN HIEROGLYPHIC END HWT SIGN 0001 x30f EGYPTIAN HIEROGLYPHIC BEGIN LOW HWT SIGN 0001 x310 EGYPTIAN HIEROGLYPHIC END LOW HWT SIGN 0001 x311 EGYPTIAN HIEROGLYPHIC BEGIN HIGH HWT SIGN 0001 x312 EGYPTIAN HIEROGLYPHIC END HIGH HWT SIGN 0001 x313 EGYPTIAN HIEROGLYPHIC BEGIN RUBRIC SIGN 0001 x314 EGYPTIAN HIEROGLYPHIC END RUBRIC SIGN 0001 x315 EGYPTIAN HIEROGLYPHIC ROTATE TWO HUNDRED SEVENTY DEGREES SIGN 0001 x316 EGYPTIAN HIEROGLYPHIC ROTATE ONE HUNDRED EIGHTY DEGREES SIGN 0001 x317 EGYPTIAN HIEROGLYPHIC ROTATE NINETY DEGREES SIGN

4 of 5 5/2/2006 11:41 AM 0001 x318 EGYPTIAN HIEROGLYPHIC MIRROR SIGN 0001 x319 EGYPTIAN HIEROGLYPHIC MIRROR AND ROTATE NINETY DEGREES SIGN 0001 x31a EGYPTIAN HIEROGLYPHIC MIRROR AND ROTATE ONE HUNDRED EIGHTY DEGREES SIGN 0001 x31b EGYPTIAN HIEROGLYPHIC MIRROR AND ROTATE TWO HUNDRED SEVENTY DEGREES SIGN 0001 x31c EGYPTIAN HIEROGLYPHIC COMBINING QUADRANT SHADING 0001 x31d EGYPTIAN HIEROGLYPHIC COMBINING TOP HALF QUADRANT SHADING 0001 x31e EGYPTIAN HIEROGLYPHIC COMBINING BOTTOM HALF QUADRANT SHADING 0001 x31f EGYPTIAN HIEROGLYPHIC COMBINING LEFT HALF QUADRANT SHADING 0001 x320 EGYPTIAN HIEROGLYPHIC COMBINING RIGHT HALF QUADRANT SHADING 0001 x321 EGYPTIAN HIEROGLYPHIC COMBINING LEFT TOP QUADRANT SHADING 0001 x322 EGYPTIAN HIEROGLYPHIC COMBINING RIGHT TOP QUADRANT SHADING 0001 x323 EGYPTIAN HIEROGLYPHIC COMBINING LEFT BOTTOM QUADRANT SHADING 0001 x324 EGYPTIAN HIEROGLYPHIC COMBINING RIGHT BOTTOM QUADRANT SHADING 0001 x325 (This position shall not be used) 0001 x326 (This position shall not be used) 0001 x327 (This position shall not be used) 0001 x328 (This position shall not be used) 0001 x329 (This position shall not be used) 0001 x32a (This position shall not be used) 0001 x32b (This position shall not be used) 0001 x32c (This position shall not be used) 0001 x32d (This position shall not be used) 0001 x32e (This position shall not be used) 0001 x32f (This position shall not be used) 5. Allocation in ISO/IEC 10646 Egyptian hieroglyphs should be encoded in Plane 1 of ISO/IEC 10646. Three rows are required for Basic Egyptian Hieroglyphs and nineteen rows are required for Extended Egyptian Hieroglyphs. The Alternate Format Characters are a bit difficult to place; since three rows are required for the Basic characters, there are only 7 empty spaces at the end of the row. Assuming, a as shown in the table below, that the Basic characters occupy positions 0001 x000, it would be convenient to begin the Extended characters at position 0001 x400. This leaves space in row 0001 x300 available for future standardization (such as additional Alternate Format characters, Meiroitic alphabetic characters (which were taken directly from the hieroglyphic alphabet), etc.). 0 1 2 3 4 5 6 7 8 9 A B C D E F x0 x1 x2 x3 Alt. Form.???? Meroitic? x4 x5 x6 x7 x8 x9 xa xb xc xd xe xf

5 of 5 5/2/2006 11:41 AM y0 y1 y2 y3 y4 y5 y6 y7 y8 y9 ya yb yc yd ye yf z0 z1 z2 53 Michael Everson, everson@indigo.ie, http://www.indigo.ie/egt, Dublin, 1997-08-25