Gender assignment to English loanwords. (Variation in) Gender assignment: Status quo. Variation in gender assignment: What?



A cross-linguistic comparison of variation in gender assignment to English loanwords in German and Polish

Marcus Callies, Philipps Universität Marburg
Eva Ogiermann, Carl von Ossietzky Universität Oldenburg
Konrad Szczesniak, Uniwersytet Śląski Sosnowiec
DGfS 2008 - Bamberg

Gender assignment to English loanwords
Onysko (2007): Interaction of gender rules and a default hierarchy
"Principle and rule approach": All anglicisms that do not take the default gender (i.e. masculine, considered the unmarked gender) receive their gender by specific rules (based on German):
  Primary rules: semantic and morphological rules
  Secondary criteria: phonological rules and lexical-conceptual equivalence
Rules operate on top of an underlying default gender hierarchy:
  If semantic or morphological (s-/m-) rules apply, secondary criteria are unimportant; conflicts are settled in favour of the default
  If no s-/m-rules apply, secondary criteria can influence gender assignment before the default

(Variation in) Gender assignment: Status quo
Many studies do not explicitly address variation in gender assignment (also known as gender vacillation, gender wavering, German "Genusschwankung")
Most are (diachronic) dictionary studies; a few others use comparatively small corpora, often limited to a single newspaper or news magazine (e.g. Onysko 2007)
Only rarely have dictionaries and/or corpora been supplemented by other types of data (Carstensen 1980, Schulte-Beckhausen 2002 and Fischer 2005 worked with native speaker informants)

Variation in gender assignment: What?
'True' instances of variation: Only those where the different genders do not indicate a difference in meaning, i.e. where gender does not separate lexical items (Onysko 2007). A counterexample, where gender does separate lexical items: das Crossover ('mix of styles, genres') vs. der Crossover ('type of car')
"Genusschwankung par excellence": No morphophonological or semantic rule is available, hence inter-speaker variation between masculine and neuter (Talanga 1987)
Carstensen 1980, Talanga 1987, Kilarski 2001, Schulte-Beckhausen 2002, Chan 2005, Fischer 2005, Onysko 2007
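The interaction of rules and default gender summarised for Onysko (2007) can be read as a small decision procedure. The following Python sketch is only one possible reading of the slide, not Onysko's own formalisation. The function name `assign_gender`, the representation of rule outcomes as lists of gender labels, and the treatment of conflicting secondary criteria (falling back to the default) are all assumptions made for illustration.

```python
DEFAULT_GENDER = "masculine"  # the unmarked gender according to the slide


def assign_gender(primary, secondary, default=DEFAULT_GENDER):
    """Pick a gender from rule outcomes (hypothetical helper).

    primary   -- genders predicted by semantic/morphological rules
    secondary -- genders predicted by phonological rules or by
                 lexical-conceptual equivalents
    """
    primary_votes = set(primary)
    if primary_votes:
        # Primary rules apply: secondary criteria are ignored.
        if len(primary_votes) == 1:
            return next(iter(primary_votes))
        # Conflicting primary rules are settled in favour of the default.
        return default
    secondary_votes = set(secondary)
    if len(secondary_votes) == 1:
        # No primary rule applies: a secondary criterion may decide.
        return next(iter(secondary_votes))
    # Assumption: no criterion, or conflicting secondary criteria,
    # leads to the default gender.
    return default


# A morphological rule (e.g. -ing = neuter) overrides a feminine equivalent.
print(assign_gender(["neuter"], ["feminine"]))
# No primary rule: the lexical equivalent decides.
print(assign_gender([], ["feminine"]))
# Nothing applies: default gender.
print(assign_gender([], []))
```

Under this reading, variation would be expected mainly in the last two branches, where no primary rule constrains the choice.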

Variation in gender assignment: How much?
Most studies claim that there is comparatively little variation in gender assignment to loanwords:
  Kilarski (2001): Variation between 3.5% for English loans in Swedish and 5.8% in Danish among all assigned nouns
  Schulte-Beckhausen (2002): Comparing dictionary, corpus and informant data, significant differences between the data types
  Nettmann-Multanowska (2003): Wavering more characteristic of German than Polish; 57 (10%) vs. 9 (3%) instances in her corpus
  Chan (2005): Highest degree of variation between masculine and neuter (141 out of 3105 entries, about 5%), M/F 0.9%, F/N 0.6%, M/N/F 0.1%
  Fischer (2005): Highest degree of variation between masculine and neuter; variation is highest with simple (monosyllabic), unaffixed words without any formal marking, especially if the meaning of the word is unknown
  Onysko (2007): Only a minimal amount of gender variation in his SPIEGEL corpus (which is not quantified)

Variation in gender assignment: How much?
Why there is more variation than has been assumed to date:
  Bias towards dictionary data: Dictionaries have been shown to be inconsistent and often document normative, expert language use
  Using only dictionaries or one type of newspaper / news magazine increases the likelihood of bias towards a specific in-house writing style or policy on anglicisms (see e.g. Yang 1990 and Onysko 2007, who used Der Spiegel)
  Rules for gender assignment are usually explained on the basis of linguists' expert knowledge, but most native speakers are linguistically untrained: linguists' intuitions don't match those of other native speakers

Variation in gender assignment: Factors
Factors that are assumed to influence variation:
  Gender rules based on formal properties (morphophonological, semantic)
  Regional differences
  Recency of borrowing / diachronic factor
  Frequency
  Context of presentation
  Number of available lexical-conceptual equivalents in the recipient language (the more are available, the more variation: Login > die Anmeldung, das Passwort, der Benutzername)
  Bilingual competence / knowledge of word meaning (variation is higher when the meaning is unknown)
Carstensen 1980, Talanga 1987, Schulte-Beckhausen 2002, Fischer 2005, Onysko 2007

Variation in gender assignment: Why?
Variation understood as conflicts between assignment rules, i.e. competition among rival factors (e.g. morphophonological and semantic criteria), and competition among lexical equivalents
Variation also explained in terms of inter-speaker variation and indeterminacy of the closest lexical equivalent; thus relegated to arbitrariness and idiosyncrasies, or to non-linguistic factors ("Sprachgefühl", i.e. a feel for the language)
Carstensen 1980, Kilarski 2001, Schulte-Beckhausen 2002, Fischer 2005, Onysko 2007

Research questions
1. Taking into account corpus and informant data, how much variation is there?
2. What are the factors that cause variation in gender assignment to loanwords, and what are those that make variation less likely?
3. Do these factors differ in the two languages, and if so, how do they play out?

Corpus study (1)
10 of the initial test items later used in the experimental study were subjected to pilot corpus studies
German: Berliner Zeitung newspaper corpus (in DWDS), 1994-2005 (252m words); low-frequency items were also checked in COSMAS II (W-öff, archive of written corpora, all public corpora, 2.2b words)
Polish: web-as-corpus study (www.google.com) using collocational patterns to retrieve instances marked for gender

Corpus study (2)

Corpus study (3)
Clearly, variation can be found in corpora. But...
Some of the more recent borrowings have very low frequency counts; many instances are inconclusive because they are not marked for gender or are ambiguous

Gender assignment in German (1)
Phonological rules
  Monosyllabic words: 24 phonological rules (Köpcke 1982)
  Simplified version: masculine monosyllabic words are unmarked
Morphological rules
  Masc: -er / -ling / -rich
  Fem: -e / -keit / -ung / -schaft
  Neut: -sel / -tum / -nis

Gender assignment in German (2)
Semantic rules
a) semantic field analogy
  Masc: days of the week / alcoholic drinks / spices
  Fem: names of trees / numbers
  Neut: colours / town names / languages
b) hypernymy
  der Wagen ('car'): der Honda, der Twingo
  die Zigarette ('cigarette'): die Marlboro, die Camel
  das Hotel ('hotel'): das Hilton, das Meryan
Biological gender (can be outranked by morphological rules)
  die Frau ('woman') but das Fräulein ('young woman, Miss')

Gender assignment in Polish (1)
Phonological rules (Auslaut, i.e. word-final sound)
  Masc: all consonants: dom 'house'; /i/: dyżurny 'employee on call'; /a/: dentysta 'dentist'
  Fem: /a/: krowa 'cow'; some consonants: noc 'night'
  Neut: /o/: jajko 'egg'; /e/: słońce 'sun'; /ę/: niemowlę 'infant'; /um/: muzeum 'museum'
Morphological rules (suffixes echo phonological rules)
  Masc: -ik / -iciel / -izm
  Fem: -ka / -acja
  Neut: -anie / -cie / -stwo

Gender assignment in Polish (2)
Biological gender (outranks phonological rules)
  Masc: mężczyzna 'man'
  Fem: babsztyl 'woman' (pejorative)
Suggested hierarchy: biological gender > phonological rules > semantic rules
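The Polish rules above lend themselves to a short illustration. The Python sketch below is a deliberately crude approximation: it works on spelling rather than sound, it treats word-final -a as feminine unless biological gender is supplied, and it ignores the "some consonants" feminine class (so noc would be misclassified). The function name `guess_polish_gender` and its `biological` parameter are hypothetical and not part of the study.

```python
from typing import Optional

# Orthographic stand-ins for the neuter Auslaut classes /um/, /o/, /e/, /ę/.
NEUTER_ENDINGS = ("um", "o", "e", "ę")


def guess_polish_gender(word: str, biological: Optional[str] = None) -> str:
    """Guess grammatical gender from the word-final segment (rough sketch)."""
    if biological is not None:
        # Biological gender outranks phonological rules.
        return biological
    w = word.lower()
    if w.endswith(NEUTER_ENDINGS):
        return "neuter"
    if w.endswith("a"):
        # Assumption: final -a defaults to feminine; masculine nouns in -a
        # (dentysta, mężczyzna) need the biological override.
        return "feminine"
    # Consonants and final -i/-y are treated as masculine.
    return "masculine"


for noun in ["dom", "krowa", "jajko", "muzeum", "dyżurny"]:
    print(noun, guess_polish_gender(noun))
print("mężczyzna", guess_polish_gender("mężczyzna", biological="masculine"))
```

Checking the override first mirrors the suggested hierarchy, in which biological gender ranks above the phonological rules.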

Hypotheses
1. Variation is low(er) with words that have a marker that is a strong trigger for a specific gender (morphophonological and semantic rules are so strong that variation is marginal)
  {-er} = masculine, {-ing} = neuter
  bitch = feminine, coach = masculine
  ending in consonant = masculine (Polish)
2. Variation is high(er) with words that have no marker/feature that determines a specific gender
3. Variation increases if there is no single clear lexical equivalent (i.e. a broad range of possible lexical equivalents or none at all)
4. Variation increases if the meaning of a word is unknown

Experimental study (1)
26 loanwords (nouns), selected according to formal and semantic criteria so as to be applicable as test items in both languages
  biological gender/semantic field/cognate: bitch, coach; alcopop, shake, techno, domain
  words with morphological marking: browser, voucher, casting, posting
  deverbal nouns with particle: download, update, take-off, login
  words ending in a special sound: preview, crew; movie, cookie; badge, stage; label, jingle
  simplex, monosyllabic words: gate, sale, slot, gig

Experimental study (2)
Format: Gender assignment by providing the definite article (in German) or inflectional suffix(es) (in Polish) to words in contextualised sentences (translational equivalents)
Further questions on informants' knowledge of the meaning of the word (known, unknown, not sure) and potential lexical equivalents in the native language
Questionnaire administered to 146 German and 100 Polish native speaker informants, all university students of English in their mid-twenties

Results (1)
Variation is measured in terms of a diversity index (Simpson's D), which takes into account the range of gender categories present among the answers (how many) and their relative abundances, i.e. the evenness or equitability with which the answers are distributed among the different gender categories (how often a gender is represented)
D is a figure between 0 and 1:
  If it is 1, the answers are spread equally across the given categories (e.g. 50 masc., 50 fem., 50 neut.)
  If it is 0, all answers fall into one category; there is no variation at all
In short: the higher the D value, the more variation

Results (2)
Words for which there is a high degree of variation:
  the most frequently mentioned gender category does not exceed 90%
  have a broad range of genders mentioned (range between 3 and 7)
  D value is higher than 0.4
  show intra-speaker variation (mostly masculine/neuter)

Results (3)

Results (4)
Variation only in German: voucher, take-off, login, techno
Variation only in Polish: browser, download, update, domain, crew, stage, label, gate, sale, indicating the different weight that gender rules have in the two languages
Variation in both languages: alcopop, preview, movie, cookie, badge, jingle
High variation usually correlates with uncertainty about, or lack of knowledge of, word meaning (few exceptions)
/u:/, /ɪ/ or /i:/, /dʒ/ and schwa + /l/ are sounds that trigger variation in Polish
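To make the D measure concrete, here is a minimal Python sketch. The slides do not give the formula, so this is an assumption: since D is said to reach 1 when answers are spread equally, the sketch uses the Gini-Simpson index 1 - Σp² rescaled by its maximum 1 - 1/k for k categories. The function names, the `n_categories` parameter, the reading of the two numeric criteria as a joint condition, and the answer counts in the example are all invented for illustration; they are not data from the study.

```python
from collections import Counter


def simpson_d(answers, n_categories=None):
    """Normalised Simpson diversity of categorical answers, in [0, 1]."""
    counts = Counter(answers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no answers given")
    # k defaults to the number of categories actually observed.
    k = n_categories if n_categories is not None else len(counts)
    if k < 2:
        return 0.0
    raw = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return raw / (1.0 - 1.0 / k)  # rescale so an even spread gives 1


def is_high_variation(answers, n_categories=None):
    """Assumed joint reading of two criteria: top share <= 90% and D > 0.4."""
    counts = Counter(answers)
    top_share = max(counts.values()) / sum(counts.values())
    return top_share <= 0.9 and simpson_d(answers, n_categories) > 0.4


# Invented answer sets (m = masculine, f = feminine, n = neuter).
even = ["m"] * 50 + ["f"] * 50 + ["n"] * 50
skewed = ["m"] * 80 + ["n"] * 15 + ["f"] * 5
print(round(simpson_d(even, n_categories=3), 3))
print(round(simpson_d(skewed, n_categories=3), 3))
print(is_high_variation(skewed, n_categories=3))
```

If the authors used the unnormalised index instead, the maximum for three categories would be 2/3 rather than 1, so the 0.4 threshold would have to be read against that scale.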

Results (5)

Conclusion
Hierarchy of factors that determine variation in gender assignment?
  gender marker > lexical equivalent > knowledge of word meaning
Variation is highest if
  a) rules that operate on gender markers are unavailable and cannot be applied
  b) there is no single clear lexical equivalent in the respective language (broad range of lexical equivalents mentioned and a high percentage of answers in the category "no lexical equivalent given")
  c) the meaning of the word is unknown or uncertain

Thank you! Danke! Dziękujemy bardzo!

References
Baran, Dominika (2003), "English loanwords in Polish and the question of gender assignment", Penn Working Papers in Linguistics 8:1, 15-28.
Carstensen, Broder (1980), "Das Genus englischer Fremd- und Lehnwörter im Deutschen", in Viereck, Wolfgang (ed.), Studien zum Einfluß der englischen Sprache auf das Deutsche. Tübingen: Narr, 37-76.
Chan, Sze-Mun (2005), Genusintegration: eine systematische Untersuchung zur Genuszuweisung englischer Entlehnungen in der deutschen Sprache. München: Iudicium.
Fischer, Rudolf-Josef (2005), Genuszuordnung. Theorie und Praxis am Beispiel des Deutschen. Frankfurt/Main: Peter Lang.
Gregor, Bernd (1983), Genuszuordnung: Das Genus englischer Lehnwörter im Deutschen. Tübingen: Niemeyer.
Kilarski, Marcin (2001), Gender assignment of English loanwords in Danish, Swedish and Norwegian. Ph.D. dissertation, Adam Mickiewicz University.
Köpcke, Klaus-Michael (1982), Untersuchungen zum Genussystem der deutschen Gegenwartssprache. Tübingen: Niemeyer.
Nettmann-Multanowska, Kinga (2003), English Loanwords in Polish and German after 1945: Orthography and Morphology. Frankfurt/Main: Peter Lang.
Onysko, Alexander (2007), Anglicisms in German. Borrowing, Lexical Productivity, and Written Codeswitching. Berlin: Walter de Gruyter.
Schulte-Beckhausen, Marion (2002), Genusschwankung bei englischen, französischen, italienischen und spanischen Lehnwörtern im Deutschen: Eine Untersuchung auf der Grundlage deutscher Wörterbücher seit 1945. Frankfurt/Main: Peter Lang.
Talanga, Tomislav (1987), Das Phänomen der Genusschwankung in der deutschen Gegenwartssprache untersucht nach Angaben neuerer Wörterbücher der deutschen Standardsprache. Ph.D. dissertation, University of Bonn.