INTERNATIONALIZED DOMAIN NAMES

Draft Policy Document for INTERNATIONALIZED DOMAIN NAMES
Language: TAMIL

RECORD OF CHANGES
(*A - ADDED, M - MODIFIED, D - DELETED)

VERSION NUMBER  DATE        PAGES AFFECTED   A* M D  TITLE OR BRIEF DESCRIPTION
1.0             19/11/09    Whole Document   A       Language Specific Policy Document for TAMIL
1.1             22/11/2010  Page No 8, 17    M       Restriction rule added, ccTLD added
1.2             05/08/2013  Whole Document   A,M     Restriction rules added and modified.

COMPLIANCE VERSION OF MAIN POLICY DOCUMENT: 1.5, 1.6

Table of Contents

1. AUGMENTED BACKUS-NAUR FORMALISM (ABNF)
1.1 Declaration of Variables
1.2 ABNF Operators
1.3 The Vowel Sequence
1.4 The Consonant Sequence
1.5 Sequence
1.6 ABNF Applied to Tamil IDN
2. RESTRICTION RULES
3. EXAMPLES
4. LANGUAGE TABLE: TAMIL
5. NOMENCLATURAL DESCRIPTION TABLE OF TAMIL LANGUAGE TABLE
6. VARIANT TABLE
7. EXPERTS/BODIES CONSULTED
8. PROPOSED ccTLD FOR TAMIL

1. AUGMENTED BACKUS-NAUR FORMALISM (ABNF)

1.1 Declaration of Variables:

Dash     Hyphen (-)
Digit    Indo-Arabic digits [0-9]
C        Consonant
V        Vowel
M        Matra
X        Visarga/Aytham
H        Halant/Virama

1.2 ABNF Operators:

Sr. No.  Operator  Function
1        |         Alternative
2        [ ]       Optional
3        *         Variable Repetition
4        ( )       Sequence Group

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Tamil are given. To facilitate understanding, equivalents in Devanagari are provided.

1.3 The Vowel Sequence

A vowel sequence consists of a single vowel, optionally followed by a Visarga (X). In Tamil, at most one X may follow a V. The vowel sequence in Tamil is therefore:

V[X]

Examples:
Vowel           V     अ
Vowel+Aytham    VX    अः

1.4 The Consonant Sequence

A consonant sequence admits the following combinations:

1. A single consonant (C)
   Example: C क

2. A consonant optionally followed by a dependent vowel sign/matra [M], a Visarga [X] or a Halant/Virama [H]:
   C[M|X|H]
   Example: CM (क followed by one matra), CX कः, CH क् (Pure Consonant)

3. A sequence of consonants (up to 3) joined by Halant/Virama:
   *2(CH)C
   Example: CHC क्ष, CHCHC क्ष्य

1.5 Sequence

A sequence is either a consonant sequence or a vowel sequence. Thus a sequence is:

consonant-sequence | vowel-sequence

1.6 ABNF Applied to Tamil IDN

Consonant Sequence    *2(CH)C[H|X|M]
Vowel Sequence        V[X]
Sequence              Consonant Sequence | Vowel Sequence
IDN-Label             (Sequence | digit) *([dash] (Sequence | digit))
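The grammar of section 1.6 can be sketched as a regular expression. The following is a minimal illustration, not a registry implementation: the names `is_valid_label`, `CONSONANT_SEQ`, `VOWEL_SEQ` and `UNIT` are hypothetical, the code points are those listed in section 5, and the restriction rules of section 2 are not applied.

```python
import re

# Code points per class, taken from the nomenclatural table in section 5.
VOWELS = "\u0B85\u0B86\u0B87\u0B88\u0B89\u0B8A\u0B8E\u0B8F\u0B90\u0B92\u0B93\u0B94"
CONSONANTS = (
    "\u0B95\u0B99\u0B9A\u0B9C\u0B9E\u0B9F\u0BA3\u0BA4\u0BA8\u0BA9\u0BAA\u0BAE"
    "\u0BAF\u0BB0\u0BB1\u0BB2\u0BB3\u0BB4\u0BB5\u0BB6\u0BB7\u0BB8\u0BB9"
)
MATRAS = "\u0BBE\u0BBF\u0BC0\u0BC1\u0BC2\u0BC6\u0BC7\u0BC8\u0BCA\u0BCB\u0BCC"
X = "\u0B83"  # Visarga/Aytham
H = "\u0BCD"  # Halant/Virama

C, V, M = f"[{CONSONANTS}]", f"[{VOWELS}]", f"[{MATRAS}]"

# Consonant Sequence: *2(CH)C[H|X|M]
CONSONANT_SEQ = f"(?:{C}{H}){{0,2}}{C}(?:{H}|{X}|{M})?"
# Vowel Sequence: V[X]
VOWEL_SEQ = f"{V}{X}?"
# One element of a label: Sequence | digit
UNIT = f"(?:{CONSONANT_SEQ}|{VOWEL_SEQ}|[0-9])"
# IDN-Label: (Sequence | digit) *([dash] (Sequence | digit))
IDN_LABEL = re.compile(f"{UNIT}(?:-?{UNIT})*")


def is_valid_label(label):
    """Return True if the whole label matches the ABNF of section 1.6."""
    return IDN_LABEL.fullmatch(label) is not None
```

For instance, `is_valid_label("\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD")` (the word தமிழ், read as C, CM, CH) is accepted, while a label that starts with a matra, such as `"\u0BBF\u0B95"`, is rejected.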

Additional examples shedding more light on the Tamil ABNF:

1. H, M or X cannot occur at the beginning of a Tamil IDN.
   Example: HC, MC, XC
   When rendered, such combinations automatically produce a "golu" (dotted circle), marking them as invalid formations. This is an intrinsic property of the Indian language syllable and is applied quasi-automatically wherever the OS supports it.

2. H is not permitted after V, X, M, Digit or Dash.
   Example: VH, CXH, CMH, 1H, -H

3. Visarga/Aytham [X] is permitted after a Consonant or a Vowel, but is restricted to one. Thus the following combinations are invalid.
   Example: CXX, VXX

4. Visarga/Aytham [X] is not permitted after a Matra.
   Example: CMX

5. The number of M permitted after a consonant is restricted to one.
   Example: CMM

6. M is not permitted after V.
   Example: VM

2. RESTRICTION RULES

The Augmented Backus-Naur Formalism (ABNF) is generic in nature, and when it is applied to a specific language/script, certain restriction rules apply. In other words, some of the structures the formalism allows do not occur in a given language. Restriction rules are put in place to handle such cases and fine-tune the ABNF. In the case of Tamil the following rules apply:

1. A consonant syllable that is intended to end with Halant/Virama [H] can only be followed by a Hyphen or a Digit.
   Example: க்- (क्-), க்1 (क्1)

2. The number of identical consonants joined by a Halant within a label shall not exceed two. Thus (ka+halant+ka) is permitted but not (ka+halant+ka+halant+ka).

3. Consecutive hyphens will not be permitted in a domain name.

4. A label may contain at most three aksharas that have variants. As an example, consider a, b, c and d as four aksharas in a given label, having a', b', c' and d' as variants; such a label will be disallowed. (Examples of disallowed labels: abcd, acdb, cdaba and so on.)

Additional Note: Wherever a variant is present in a given label, the variants shall be strictly symmetric and non-transitive. This ensures that over-generativity does not take place. However, over-generativity of variants does not arise in the case of Tamil.
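Restriction rules 2 and 3 lend themselves to a mechanical check. The sketch below is illustrative only: `violated_rules` is a hypothetical helper, and rule 2 is read here as "the same consonant may not appear three times in a row joined by halants", which is an interpretation of the wording above. Rules 1 and 4 are left out because they depend on the writer's intent and on variant data respectively.

```python
import re

# Tamil consonants (C) and halant (H), as listed in section 5.
CONSONANTS = (
    "\u0B95\u0B99\u0B9A\u0B9C\u0B9E\u0B9F\u0BA3\u0BA4\u0BA8\u0BA9\u0BAA\u0BAE"
    "\u0BAF\u0BB0\u0BB1\u0BB2\u0BB3\u0BB4\u0BB5\u0BB6\u0BB7\u0BB8\u0BB9"
)
HALANT = "\u0BCD"

# Rule 2: one consonant repeated three times, each pair joined by a halant.
TRIPLE_CONSONANT = re.compile(f"([{CONSONANTS}]){HALANT}\\1{HALANT}\\1")


def violated_rules(label):
    """Return the numbers of the restriction rules (2 and 3 only) that a label breaks."""
    broken = []
    if TRIPLE_CONSONANT.search(label):
        broken.append(2)
    if "--" in label:  # Rule 3: no consecutive hyphens.
        broken.append(3)
    return broken
```

A label such as ka+halant+ka yields an empty list, whereas ka+halant+ka+halant+ka yields `[2]`.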

3. EXAMPLES

Combination    Example    Word with combination
C
CH
CM
CX
CHC
CHCHC
V
VX

4. LANGUAGE TABLE [1]: TAMIL [2]

[1] This language table is based on the Unicode chart for the Tamil script provided by the Unicode Consortium.
[2] Characters marked in yellow are not applicable to the language.

5. NOMENCLATURAL DESCRIPTION TABLE OF TAMIL LANGUAGE TABLE

VISARGA/AYTHAM (X)
0B83    TAMIL SIGN VISARGA

VOWEL LETTERS (V)
0B85    TAMIL LETTER A
0B86    TAMIL LETTER AA
0B87    TAMIL LETTER I
0B88    TAMIL LETTER II
0B89    TAMIL LETTER U
0B8A    TAMIL LETTER UU
0B8E    TAMIL LETTER E
0B8F    TAMIL LETTER EE
0B90    TAMIL LETTER AI
0B92    TAMIL LETTER O
0B93    TAMIL LETTER OO
0B94    TAMIL LETTER AU

CONSONANTS (C)
0B95    TAMIL LETTER KA
0B99    TAMIL LETTER NGA
0B9A    TAMIL LETTER CA
0B9C    TAMIL LETTER JA
0B9E    TAMIL LETTER NYA
0B9F    TAMIL LETTER TTA
0BA3    TAMIL LETTER NNA
0BA4    TAMIL LETTER TA
0BA8    TAMIL LETTER NA
0BA9    TAMIL LETTER NNNA
0BAA    TAMIL LETTER PA
0BAE    TAMIL LETTER MA
0BAF    TAMIL LETTER YA
0BB0    TAMIL LETTER RA
0BB1    TAMIL LETTER RRA
0BB2    TAMIL LETTER LA
0BB3    TAMIL LETTER LLA
0BB4    TAMIL LETTER LLLA
0BB5    TAMIL LETTER VA
0BB6    TAMIL LETTER SHA
0BB7    TAMIL LETTER SSA
0BB8    TAMIL LETTER SA
0BB9    TAMIL LETTER HA

VOWEL SIGNS (MATRAS) (M)
0BBE    TAMIL VOWEL SIGN AA
0BBF    TAMIL VOWEL SIGN I
0BC0    TAMIL VOWEL SIGN II
0BC1    TAMIL VOWEL SIGN U
0BC2    TAMIL VOWEL SIGN UU
0BC6    TAMIL VOWEL SIGN E
0BC7    TAMIL VOWEL SIGN EE
0BC8    TAMIL VOWEL SIGN AI
0BCA    TAMIL VOWEL SIGN O
0BCB    TAMIL VOWEL SIGN OO
0BCC    TAMIL VOWEL SIGN AU

VIRAMA (H)
0BCD    TAMIL SIGN VIRAMA
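The table can be cross-checked against the Unicode character database, and the same data can be used to turn a label into the C/V/M/X/H notation of section 1. The following is a sketch under assumptions: `CLASSES`, `table_rows` and `class_string` are hypothetical names, and the symbols `D` (digit) and `?` (character outside the table) are conventions introduced here, not part of the policy.

```python
import unicodedata

# Class -> code points, copied from the table above.
CLASSES = {
    "X": [0x0B83],
    "V": [0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
          0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94],
    "C": [0x0B95, 0x0B99, 0x0B9A, 0x0B9C, 0x0B9E, 0x0B9F, 0x0BA3, 0x0BA4,
          0x0BA8, 0x0BA9, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0, 0x0BB1, 0x0BB2,
          0x0BB3, 0x0BB4, 0x0BB5, 0x0BB6, 0x0BB7, 0x0BB8, 0x0BB9],
    "M": [0x0BBE, 0x0BBF, 0x0BC0, 0x0BC1, 0x0BC2, 0x0BC6,
          0x0BC7, 0x0BC8, 0x0BCA, 0x0BCB, 0x0BCC],
    "H": [0x0BCD],
}

# Reverse lookup: character -> class letter.
CLASS_OF = {chr(cp): cls for cls, cps in CLASSES.items() for cp in cps}


def table_rows():
    """Return (code point, Unicode name) pairs in table order."""
    return [(f"{cp:04X}", unicodedata.name(chr(cp)))
            for cps in CLASSES.values() for cp in cps]


def class_string(label):
    """Map each character of a label to its class letter."""
    out = []
    for ch in label:
        if ch in CLASS_OF:
            out.append(CLASS_OF[ch])
        elif ch in "0123456789":
            out.append("D")   # digit (symbol chosen for this sketch)
        elif ch == "-":
            out.append("-")   # dash
        else:
            out.append("?")   # not in the Tamil language table
    return "".join(out)
```

For example, `class_string("\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD")` gives `"CCMCH"` for the word தமிழ்.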

6. VARIANT TABLE

VARIANT
0B92+0BB3    0B94
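One way to apply the variant table is to fold both forms to a single representation before comparing labels. This is a sketch only: `canonical` and `are_variants` are hypothetical helpers, and folding the two-character sequence into the single character (rather than the reverse) is an arbitrary choice made here.

```python
# The single variant pair from the table: the sequence 0B92+0BB3 and the character 0B94.
VARIANT_PAIR = ("\u0B92\u0BB3", "\u0B94")


def canonical(label):
    """Fold the two-character sequence into the single character for comparison."""
    sequence, single = VARIANT_PAIR
    return label.replace(sequence, single)


def are_variants(a, b):
    """Return True if two different labels differ only by the variant pair."""
    return a != b and canonical(a) == canonical(b)
```

Because both labels are folded in the same way, the comparison is symmetric: `are_variants(a, b)` and `are_variants(b, a)` always agree.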

7. EXPERTS/BODIES CONSULTED

Expertise provided by C-DAC Thiruvananthapuram.

8. PROPOSED ccTLD FOR TAMIL

India (Bhārat) localized in Tamil -

Note: You can send your feedback to idn-feedback@cdac.in