Marking the Word Frequency: A Comparative Study of English and Chinese Learner's Dictionaries

US-China Foreign Language, ISSN 1539-8080
February 2012, Vol. 10, No. 2, 909-914
David Publishing

QIAN Gui-qin
Ludong University, Yantai, China

"Frequency first" is one of the basic principles of second language teaching and learning. Learner's dictionaries, a pedagogically indispensable tool in SLA (Second Language Acquisition), apply a frequency marking policy in their compilation. The present paper explores the methods, the granularity, and the differences of word frequency rating and marking in English learner's dictionaries. It is argued that frequency computing supported by a large and balanced corpus should be incorporated into frequency marking in contemporary learner's dictionary-making.

Keywords: SLA (Second Language Acquisition), learner's dictionary, word frequency

Introduction

According to the structuralists, language is a hierarchical system, and so is the lexicon. The hierarchy of the lexicon can be demonstrated by the different usage frequencies of the words in one and the same language. Nation (1990) found that the 10 most frequent words in English cover 25% of the texts in the Longman Corpus Network, and the first 1,000 English words, though a small fraction of the English vocabulary, cover 71.4% of the texts in a corpus of 35 million word tokens (Coxhead, 1998). This frequency difference is pedagogically significant in second language teaching and acquisition. According to the principle of economy in language, the most frequent words should be acquired first by second language learners, because frequently used words occur repeatedly in different genres of text and form the core of the lexicon.
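The coverage figures cited above rest on a simple calculation: the share of all running words (tokens) in a corpus that is accounted for by the most frequent word types. The following is a minimal sketch of that calculation, not the procedure used by Nation or Coxhead; the function name `coverage_of_top_n` and the toy sentence are assumptions made for illustration only.

```python
from collections import Counter


def coverage_of_top_n(tokens, n):
    """Return the fraction of tokens covered by the n most frequent word types."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(n) yields the n (word, count) pairs with the highest counts
    covered = sum(count for _, count in counts.most_common(n))
    return covered / total


# Hypothetical toy "corpus" of 13 tokens; a real study would use millions.
sample = "the cat sat on the mat and the dog sat on the rug".split()

print(coverage_of_top_n(sample, 1))  # share covered by the single most frequent word
print(coverage_of_top_n(sample, 3))  # share covered by the three most frequent words
```

In this toy text the single word "the" already covers 4 of the 13 tokens, which mirrors on a small scale the pattern the paper describes for the top 10 and top 1,000 English words.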
User-oriented learner's dictionaries, serving as an indispensable tool in SLA (Second Language Acquisition), should adopt a uniform and systematic way of marking word frequency to help foster learners' awareness of it. The present paper, based on the theory of vocabulary gradation, explores how English and Chinese learner's dictionaries respectively mark the frequency of their headwords and what strategies should be implemented. Richards (1976) claimed that "knowing a word means knowing the degree of probability of encountering that word in speech or print". This claim, echoed by Nation's statement that a word's frequency is a part of word knowledge, places a relatively high demand on the lexical acquisition of nonnative speakers. Obviously, it is unrealistic to expect a second language learner to be familiar with the frequencies of all the words in the target language.

QIAN Gui-qin, lecturer at School of Foreign Languages, Ludong University.

It has also been shown statistically that English words form a structure of frequency strata, each achieving a different coverage rate of the texts in a corpus. Table 1 shows the 2,000 most frequently used English words and their respective text coverage (Nation, 2001).

Table 1
The Top 2,000 English Words and Their Text Coverage in Different Text Genres

Lexicon stratum           Dialogue (%)   Fiction (%)   Newspaper (%)   Academic writing (%)
The first 1,000 words     84.3           82.3          75.6            73.5
The second 1,000 words    6.0            5.1           4.7             4.6
Technical terms           1.9            1.7           3.9             8.5
Others                    7.8            10.9          15.8            13.4

It is self-evident that second language learners should first grasp the top 2,000-3,000 English words. This is necessary to ensure that the effort learners put into vocabulary learning pays off in direct proportion in their language decoding and encoding activities (West, 1953). An exact marking of word frequency in dictionaries can therefore support learners' vocabulary acquisition in the target language. English learner's dictionaries, aiming to meet the need for "first things first" in vocabulary gradation, set a threshold of approximately 3,000 basic English words and mark their frequencies. Cambridge Advanced Learner's Dictionary (hereafter abbreviated to CALD) claims in its preface that marking the frequency of its headwords is one of its distinguishing characteristics. In fact, marking word frequency is commonplace in nearly all the mainstream contemporary English learner's dictionaries. However, there are still some variations among these dictionaries in how they mark the frequency of headwords.

The Methods of Word Frequency Marking in English Learner's Dictionaries

Two methods are adopted in English learner's dictionaries to mark the word frequency of their headwords. The first one is to mark word frequency through typographic changes.
The second one is to use visually distinct symbols, labels, or graphs to show the frequency differences of words. The two methods can be employed simultaneously in one and the same dictionary; Longman Dictionary of Contemporary English (the 5th edition) (LDOCE5) is of this kind. Other dictionaries, such as Collins Cobuild Advanced Learner's Dictionary (the 5th edition) (CCALD5), use only one of them.

First, let us see how typographic variation can show word frequency in English learner's dictionaries. Take Oxford Advanced Learner's Dictionary (the 7th edition) (OALD7) and CCALD5 as examples. The two dictionaries are similar in their typography: all the headwords are printed in blue, in sharp contrast to the explanations to their right, which are in black. In OALD7, the font of the active words is also slightly larger than that of the passive ones. Color is likewise used as a visually striking way of distinguishing the productive words from the receptive ones. In LDOCE5, all the productive words are red and the receptive ones black. In Macmillan English Dictionary for Advanced Learners (the 2nd edition) (MEDAL2), the active words are printed in red and the passive ones in black. In CALD2 (the 2nd edition of CALD), the productive words are blue, in contrast to the black receptive words. The visual contrast between frequent and infrequent words turns out to be an effective way of making the productive words salient in the dictionary word list.

Besides typography, symbols and abbreviations are used in English learner's dictionaries to mark word frequency. A key symbol is used in OALD7 to highlight the top 3,000 English words, and in CCALD5 three diamonds, two diamonds, and one diamond are employed to denote the top 1,000, 2,000, and 3,000 English words respectively. MEDAL2 uses stars to rate its words, with three stars representing the basic words, two stars the frequent words, and one star the less frequent ones. What deserves the metalexicographer's attention is the dual-track approach that MEDAL2 takes, treating passive and active words in radically different ways. The active words are provided with detailed explanations, collocations, grammatical patterns, registers, and even connotations. The passive words receive a much simpler treatment: only brief definitions are given, with no grammatical or pragmatic information and no examples. This dual-track treatment of headwords is theoretically and practically plausible, in that many more headwords can be included in the dictionary's word list while its core part, i.e., the frequently used words, is highlighted.

Abbreviations are also used to grade words according to their frequencies. In LDOCE5, the abbreviations S and W stand for spoken and written words respectively. Abbreviations plus numbers, for example, S(W)1, S(W)2, and S(W)3, are employed to denote the top 1,000, 2,000, and 3,000 English words in spoken (written) English.
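The banded labels described above can be pictured as a mapping from a word's frequency rank to a label. The sketch below is only an illustration of that idea under stated assumptions: it assumes each dictionary has a 1-based rank list (rank 1 being the most frequent word), that each band holds 1,000 words, and that diamonds are rendered as "◆". The function names and the example ranks are hypothetical and are not taken from any of the dictionaries.

```python
def band(rank, band_size=1000, n_bands=3):
    """Return the band (1 = most frequent) for a 1-based rank, or None if unmarked."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    b = (rank - 1) // band_size + 1
    return b if b <= n_bands else None


def ldoce_label(spoken_rank, written_rank):
    """LDOCE5-style label: S1-S3 for the spoken rank, W1-W3 for the written rank."""
    parts = []
    for prefix, rank in (("S", spoken_rank), ("W", written_rank)):
        b = band(rank)
        if b is not None:
            parts.append(f"{prefix}{b}")
    return " ".join(parts)


def ccald_label(rank):
    """CCALD5-style label: three diamonds for band 1, two for band 2, one for band 3."""
    b = band(rank)
    return "" if b is None else "◆" * (4 - b)


def oald_label(rank):
    """OALD7-style label: a single 'key' marker for any word in the top 3,000."""
    return "key" if band(rank) is not None else ""


# Hypothetical ranks for one word: 800th in speech, 2,400th in writing.
print(ldoce_label(800, 2400))
print(ccald_label(800))
print(oald_label(2400))
```

The same rank can thus surface as quite different marks, which is worth keeping in mind when the markings of different dictionaries are compared.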
Other devices used in LDOCE5 to distinguish word frequency are frequency graphs, which show statistically how the frequency of the same headword varies between its written and spoken forms, reveal frequency differences between British and American English, or give the percentage of a given syntactic pattern. Frequency graphs, however, are not employed systematically in LDOCE5. Only 53 headwords are provided with them. Set against the 100,000 reference units in the dictionary, this number is too small to be representative. What is worse, the allotment of frequency graphs is inconsistent. For example, the two lexemes absolutely and actually both have frequency graphs. The graph for absolutely shows that it is used mainly in spoken contexts, which, however, is already indicated clearly by its labels S1 and W3. Besides, a usage note on the difference in its typical occurrences is given just below the headword. The result is a redundancy that wastes the dictionary's space. Actually represents another type: it is labeled S1 and W1, with no marked difference between its occurrence in spoken and written form, so a detailed frequency graph showing the subtle discrepancy is badly needed. There are 244 words marked in LDOCE5 according to their frequencies, of which 103 headwords are not used equally in spoken and written English and 141 words have the same occurrence frequency in both spoken and written contexts. It seems that LDOCE5 has not yet established a criterion to determine which headword should be provided with a frequency mark. A systematic and consistent criterion and treatment is desirable in LDOCE5's allotment of frequency graphs.

To Which Linguistic Unit the Frequency Rating Is Attached: Lexeme or Sememe?

Researchers have not reached a consensus on the two terms lexeme and sememe.
According to Cruse (1986), a lexeme is a set of form-related lexical units, and the definition of each lexical unit is a sememe. A lexeme is therefore a two-fold sign that combines a single form with various definitions, whereas a sememe is connected only with meaning. In dictionaries, a sememe is approximately equal to one definition of the headword. It is clear that the meanings of a polysemous word do not have the same status in terms of how often they occur in human communication. CALD2 is the only English learner's dictionary that attaches the frequency rating to sememes instead of lexemes. That is to say, CALD2 adopts a much more specific way of discriminating the frequency differences among the related sememes of one and the same lexeme. The frequency marking procedure in CALD2 goes like this: first, the dictionary compilers choose the most frequent words from the Cambridge International Corpus and the Cambridge Learner's Corpus; then the examples of the frequent words are numbered, on the basis of which the occurrence frequency of each sememe is computed; lastly, the sememes concerned are graded and marked. Take the lexeme good as an example. In CALD2, good has nine definitions (equal to sememes). Seven of them are marked E, meaning elementary words, one is marked I, standing for improved words, and one is not marked at all because of its rare occurrence in communication. In LDOCE5, MEDAL2, CCALD5, and OALD7, the lexeme good has 17, 18, 21, and 24 definitions respectively, all of which carry the same single symbol or abbreviation. In LDOCE5, good is marked W1 and S1, and in MEDAL2 it is marked with three stars, without any discrimination between the definitions.
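The contrast between marking a lexeme and marking its sememes can be sketched in a few lines. The example below is a hypothetical illustration, not CALD2's actual procedure or data: the sense glosses, the counts, and the corpus size are invented, and the band thresholds are the per-10-million figures that the paper itself reports for CALD2 in its discussion of granularity (over 400 for E, 200-400 for I, roughly 100-200 for A). How values exactly on a boundary are treated is an assumption.

```python
PER = 10_000_000  # frequencies are normalised to occurrences per 10 million words


def per_ten_million(count, corpus_size):
    """Normalise a raw count to occurrences per 10 million corpus words."""
    return count * PER / corpus_size


def cald_band(rate):
    """Map a normalised rate to E, I, A, or None (boundary handling is assumed)."""
    if rate > 400:
        return "E"
    if rate >= 200:
        return "I"
    if rate >= 100:
        return "A"
    return None


def mark_senses(sense_counts, corpus_size):
    """Sememe-level marking: each sense receives its own band."""
    return {
        sense: cald_band(per_ten_million(count, corpus_size))
        for sense, count in sense_counts.items()
    }


def mark_lexeme(sense_counts, corpus_size):
    """Lexeme-level marking: one band for the headword as a whole."""
    return cald_band(per_ten_million(sum(sense_counts.values()), corpus_size))


# Hypothetical sense-tagged counts for "good" in a 50-million-word corpus.
good = {"of high quality": 9000, "suitable": 1500, "fairly large (a good few)": 300}

print(mark_senses(good, 50_000_000))
print(mark_lexeme(good, 50_000_000))
```

With these invented numbers the lexeme as a whole falls in the top band, while its three senses fall in three different categories, which is the kind of distinction a single lexeme-level label cannot express.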
There is no denying that such vague frequency marking blurs the distinction between the frequently used definitions and the less frequent ones, which turns out to be a real handicap for learners trying to locate the most frequent sememe(s) in the lexeme set.

The Granularity of the Word Frequency Marking of the Active Words

In the "Big Five" English learner's dictionaries, the granularity of word frequency marking is treated in two ways. The first is simply to mark the productive words, with no discrimination among their internal frequency differences. OALD7 is a typical example of this type. The second is to subdivide the productive words according to their occurrence frequencies. The other four mainstream English learner's dictionaries all fall into this category. Within the second type, CCALD5, LDOCE5, and MEDAL2 subdivide the active words evenly into three frequency groups, which is termed even frequency marking. LDOCE5 and CCALD5 mark 3,000 frequently used words, with 1,000 words in each group, while MEDAL2 marks 7,500 words with stars, with 2,500 words in each subcategory. The only exception is CALD2, which adopts a non-even frequency marking policy. In CALD2, sememes marked E are those which everyone needs to know to communicate effectively; they usually have over 400 occurrences per 10 million corpus words and total 4,900 in the dictionary. Sememes marked I are also common in native speaker English, typically with between 200 and 400 occurrences per 10 million words, and add up to 3,300. Sememes marked A typically occur around 100-200 times per 10 million corpus words and are needed by advanced learners to make their English more fluent and natural; 3,700 are marked A. Even frequency marking is thought to be a relatively subjective labeling that lacks a theoretical underpinning, while non-even frequency marking, supported by frequency computations based on a large, balanced, and even dynamic corpus, mirrors the real distribution and usage of the linguistic units. Other contemporary English learner's dictionaries should follow the lead of CALD2.

The Difference of the Frequency Marking in English Learner's Dictionaries

Quantitatively speaking, the productive items marked in CALD2 amount to 11,900, while OALD7, LDOCE5, and CCALD5 all mark 3,000 frequent words. Obviously, there is a large gap between CALD2 and the three other English learner's dictionaries. The main reason lies in the fact that CALD2, as discussed in the preceding part, attaches the frequency markers to sememes instead of lexemes. What deserves our attention is that the dictionaries disagree on the frequency marking of the same linguistic unit. For example, the headword convince is marked among the top 3,000 words in both LDOCE5 and OALD7. The word convinced, however, is not marked at all, indicating its rare occurrence in ordinary communication. But in CALD2, both convince and convinced are labeled I, an abbreviation denoting words among the top 2,000 English words. These examples raise the question of how stable word frequency marking is. Does this kind of disagreement often occur in dictionary marking? We take the lexemes beginning with the letter A as a data sample, with the aim of finding out to what extent two dictionaries conflict with each other in their word frequency rating.

Table 2
The Frequency Marking of the Active Words Beginning With Letter A in LDOCE5 and CCALD5

Frequency              LDOCE5 (spoken)   LDOCE5 (written)   CCALD5
The top 1,000 words    S1: 65            W1: 86             50
The top 2,000 words    S2: 84            W2: 63             72
The top 3,000 words    S3: 56            W3: 71             110

It can be seen clearly from Table 2 that the lexemes under the letter A show great differences in word frequency rating.
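The counts in Table 2 can be summarised with a little arithmetic. The sketch below assumes one reading of the table, namely that LDOCE5 reports separate spoken (S1-S3) and written (W1-W3) counts per band and CCALD5 a single count per band; the dictionary keys and function names are labels chosen for this illustration.

```python
# Counts per band (top 1,000 / top 2,000 / top 3,000) as read from Table 2.
table2 = {
    "LDOCE5 spoken": [65, 84, 56],
    "LDOCE5 written": [86, 63, 71],
    "CCALD5": [50, 72, 110],
}


def summarise(counts):
    """Return the total and each band's percentage share of that total."""
    total = sum(counts)
    shares = [round(100 * c / total, 1) for c in counts]
    return total, shares


def strictly_increasing(counts):
    """True if every band holds more words than the band before it."""
    return all(a < b for a, b in zip(counts, counts[1:]))


for name, counts in table2.items():
    total, shares = summarise(counts)
    print(name, total, shares, strictly_increasing(counts))
```

Under this reading, CCALD5 is the only column whose counts rise steadily from the first band to the third, while the two LDOCE5 columns stay within a narrower range.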
CCALD5 has many more headwords belonging to the top 3,000 words. Besides, LDOCE5 maintains a balance among the three frequency groups, while CCALD5 shows a tendency of growth from the most frequent group to the least frequent one. It is argued that the disagreement in word frequency marking arises because different English learner's dictionaries are compiled on the basis of different corpora. As far as dictionary users are concerned, different and sometimes even conflicting frequency markings leave them puzzled or lost.

Conclusions

The frequency marking of headwords or their definitions is a user-friendly treatment for dictionary users, who, as second language learners, have to fulfill both language decoding and encoding tasks.

Therefore, a systematic demarcation should be made between the active and passive words in dictionaries. Frequency rating can help to highlight the frequent words and to treat them in great detail; hence processing the frequent words in both breadth and depth is a necessity. In the present technology-dominated world, frequency computing supported by a large and balanced corpus should be incorporated into the frequency marking in learner's dictionaries.

References

Coxhead, A. (1998). An academic word list (English Language Institute Occasional Publication). Wellington: School of Linguistics and Applied Language Studies, Victoria University of Wellington.
Cruse, D. A. (1986). Lexical semantics. Cambridge: Cambridge University Press.
Lyons, J. (1977). Semantics. Cambridge: Cambridge University Press.
Nation, I. S. P. (1990). Teaching and learning vocabulary. New York: Heinle and Heinle.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
Richards, J. C. (1976). The role of vocabulary teaching. TESOL Quarterly, 10(1).
West, M. (1953). A general service list of English words. London: Longman.