Title: Chinese Characters and Top Ontology in EuroWordNet




Paper by: Shun Sylvia Wong & Karel Pala
Presentation by: Patrick Baker

Introduction
WordNet, Cyc, HowNet, and EuroWordNet each use a hierarchical structure of language-independent concepts to reflect the important semantic differences between concepts.
EuroWordNet uses a hierarchy called the Top Ontology (TO).
This paper compares EuroWordNet's TO with the natural organization found in the pictograph-based Chinese language.

Top Ontology?
Ontologies are artificial constructs built primarily to serve as the lexical databases for knowledge representation systems.
The Top Ontology distinguishes between three types of entities.
This paper focuses on the third type.

The Three Entity Types of TO
Three types of entities are distinguished at the first level of TO:
1. 1st Order: any concrete entity publicly perceivable by the senses and located at any point in time, in a three-dimensional space (persons, animals, discrete objects).
2. 2nd Order: any Static Situation (property, relation) or Dynamic Situation that cannot be grasped, heard, seen, or felt as an independent thing (events, processes, states of affairs).
3. 3rd Order: unobservable propositions that exist independently of time and space. They can be true or false rather than real (ideas, thoughts, theories, plans, reasons).

The Chinese Language
Chinese script originated from picture-writing.
Only a couple hundred characters in the language are actual pictograms.
According to the etymological dictionary written by Xu Shen around 100 A.D., Chinese characters can be divided into six groups.

Six Groups of Chinese Characters
1. Pictographs (4%): represent real-life objects by drawings.
2. Ideographs (1%): represent positional and numeral concepts by indication.
3. Logical Aggregates (13%): form a new meaning by combining the meanings of two or more characters.
4. Phonetic Complexes (82%): form a character by combining one character that supplies the meaning with another character that is linked through a shared sound.
5. Associative Transformations (a small portion): extend the meaning of a character by adding more parts to the existing one.
6. Borrowings (a small portion): borrow the written form of a character with the same sound.

The Chinese Language
The average educated Chinese person knows only about 6,000 of the 50,000 characters in the Chinese language.
Since many of the characters are combinations of simpler characters, knowing the meaning of one or more of the constituent characters allows deduction of the overall meaning.

The Chinese Language
Because Chinese characters cannot be ordered alphabetically in a dictionary, they are ordered by Section Heads, or Chinese Radicals.
There are 213 Chinese Radicals.
In most cases, a character is grouped under a certain Chinese Radical if its concept relates in some way to the concept represented by the radical.

The Chinese Language and 3rd Order Entities
The concepts in the 3rd Order Entity list are abstract and difficult to grasp; most are expressed through use in a sentence (e.g. "John thought the movie was good").
Wong & Pala (2001) have shown that no direct correspondence can be found between Chinese Radicals and the concepts in the 3rd Order list.
In most cases, the Chinese counterparts of these concepts are represented by more complicated lists of characters.

The Chinese Language and 3rd Order Entities
For each of the basic concepts in the 3rd Order list, the authors located its Chinese counterparts.
Each concept yielded a list of Chinese characters representing synonyms, hyperonyms, and/or meanings that collectively defined the scope of the concept.
The meanings of the component radicals of each character in the list were then examined.

The Chinese Language and 3rd Order Entities
The authors found that certain radicals (with specific meanings) were associated with one or two 3rd Order concepts.
This association is called Sense Transfer.
e.g. the characters (logic/reason/theory), (opinion/theory/discussion), and (theory/to explain/to say) appear more often under "theory".
e.g. the characters (to think/to consider) and (to think/to contemplate) appear more often under "idea/thought".
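The procedure described on the last two slides (list the characters for each concept, look at their component radicals, and see which radicals cluster under one or two concepts) can be sketched roughly as follows. This is only an illustration: the character identifiers, the radical glosses, and the function names are hypothetical placeholders, not data or code from Wong & Pala.

```python
from collections import defaultdict

# Hypothetical toy data (NOT from the paper): each 3rd Order concept maps to
# the characters in its list, and each character maps to its component radicals.
CONCEPT_CHARACTERS = {
    "theory": ["char_a", "char_b", "char_c"],
    "idea/thought": ["char_d", "char_e"],
    "plan": ["char_f"],
}
CHARACTER_RADICALS = {
    "char_a": ["speech"],
    "char_b": ["speech", "mouth"],
    "char_c": ["speech"],
    "char_d": ["heart", "mouth"],
    "char_e": ["heart"],
    "char_f": ["mouth"],
}


def radical_concept_counts(concept_chars, char_radicals):
    """Count how often each radical occurs in the character list of each concept."""
    counts = defaultdict(lambda: defaultdict(int))
    for concept, chars in concept_chars.items():
        for char in chars:
            for radical in char_radicals[char]:
                counts[radical][concept] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {radical: dict(per_concept) for radical, per_concept in counts.items()}


def concentrated_radicals(counts, max_concepts=2):
    """Keep radicals that occur under at most `max_concepts` concepts."""
    return {
        radical: sorted(per_concept)
        for radical, per_concept in counts.items()
        if len(per_concept) <= max_concepts
    }


counts = radical_concept_counts(CONCEPT_CHARACTERS, CHARACTER_RADICALS)
concentrated = concentrated_radicals(counts, max_concepts=2)
print(concentrated)
```

In this toy data the "mouth" radical is spread over all three concepts and is filtered out, while "speech" and "heart" each cluster under a single concept, which is the kind of pattern the slide calls sense transfer.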

Sense Transfer and Other Languages
Sense transfer exists in most languages, though not necessarily to the same extent as in pictograph-based languages.
English examples: care-free, side-light, un-thinkable.
Czech example: uč-i-t-el (a root denoting the concept "teach" + a verb-making affix + an infinitive affix + an agentive suffix = "teacher").
Because existing ontologies cannot express this sense transfer property, there is no way to derive the meaning of a new word even if its components already exist in the ontology.
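To make the last point concrete, here is a minimal sketch of what deriving a meaning from components could look like if an ontology did store them. The gloss table and the `derive_glosses` function are hypothetical; the English glosses in particular are assumptions made for illustration, and only the Czech segmentation of učitel ("teacher") follows the slide.

```python
# Hypothetical component lexicon: glosses are illustrative assumptions,
# not entries from EuroWordNet or any other real ontology.
COMPONENT_GLOSSES = {
    "care": "care",
    "free": "without",
    "uč": "teach (root)",
    "i": "verb-making affix",
    "t": "infinitive affix",
    "el": "agentive suffix",
}


def derive_glosses(segmented_word, glosses=COMPONENT_GLOSSES):
    """Return the glosses of a hyphen-segmented word, one per component.

    Raises KeyError if any component is missing from the lexicon, which
    mirrors the situation where the meaning cannot be derived.
    """
    parts = segmented_word.split("-")
    missing = [part for part in parts if part not in glosses]
    if missing:
        raise KeyError(f"no gloss for component(s): {missing}")
    return [glosses[part] for part in parts]


print(derive_glosses("care-free"))
print(derive_glosses("uč-i-t-el"))
```

The sketch only looks up and lists component glosses; combining them into a single meaning such as "teacher" would need composition rules that the slides do not describe.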

The Chinese Way to Represent Concepts
Wong & Pala (2001) have observed that Chinese seems to organize concepts in a contextual manner, with each Chinese radical serving as the characterizing basic concept in the respective concept.
Through observation, the authors determined that many of the characters subsumed in the radicals can be classified along five main lines.

The Chinese Way to Represent Concepts
The five conceptual lines are:
1. As an object
2. As a property
3. As a typical event (situation, process)
4. As its component
5. As a consequence
e.g. the character (fire) as an object is part of (stove) and (charcoal), and as a typical event is part of (to burn) and (to cremate).

Lexical/Conceptual Organization
The Chinese way of organizing concepts (even abstract ones) from simpler, more concrete concepts/entities provides an alternative to the organization provided by existing ontologies.
Such an organization would form a semantic network, as opposed to the tree structure found in such ontologies.
Such a semantic network is richer, more complete, and more transparent, as each concept is derived not from verbalized concepts but from a semantic context of discrete entities.
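The difference between such a network and a tree can be sketched with a small labeled graph. The class and method names below are hypothetical, and concepts are identified by their English glosses; the "fire" links follow the example given in the slides, while the extra "earth" link is an invented assumption added only to show a concept with more than one parent.

```python
from collections import defaultdict

# The five conceptual lines named in the slides, used as edge labels.
CONCEPTUAL_LINES = ("object", "property", "typical event", "component", "consequence")


class SemanticNetwork:
    """Concepts linked to basic (radical) concepts along a conceptual line."""

    def __init__(self):
        # concept -> list of (basic_concept, conceptual_line) links
        self._links = defaultdict(list)

    def link(self, concept, basic_concept, line):
        if line not in CONCEPTUAL_LINES:
            raise ValueError(f"unknown conceptual line: {line!r}")
        self._links[concept].append((basic_concept, line))

    def derived_from(self, basic_concept, line):
        """Concepts built on `basic_concept` along the given conceptual line."""
        return sorted(
            concept
            for concept, links in self._links.items()
            if (basic_concept, line) in links
        )

    def every_concept_has_single_parent(self):
        """True if no concept is linked to more than one basic concept.

        A tree-shaped hierarchy requires this; a semantic network does not.
        """
        return all(
            len({basic for basic, _ in links}) <= 1
            for links in self._links.values()
        )


net = SemanticNetwork()
# Links taken from the "fire" example in the slides.
net.link("stove", "fire", "object")
net.link("charcoal", "fire", "object")
net.link("to burn", "fire", "typical event")
net.link("to cremate", "fire", "typical event")
single_parent_before = net.every_concept_has_single_parent()

# Hypothetical extra link (an assumption, not from the paper): a second parent.
net.link("stove", "earth", "component")
single_parent_after = net.every_concept_has_single_parent()

print(net.derived_from("fire", "object"))
print(single_parent_before, single_parent_after)
```

Once a concept is allowed a second basic concept, the single-parent property fails and the structure can no longer be stored as a tree, which is the contrast the slide draws.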

Conclusion
Comparing EuroWordNet's TO with the intrinsic structure of the natural language Chinese suggests that:
Humans more naturally think of concepts as being composed of more concrete entities, as opposed to being derived from abstract concepts.
The more natural way to represent such concepts is a semantic graph, not the tree structure found in most existing ontologies.