Doe wat je niet laten kan: A usage-based analysis of Dutch causative constructions. Natalia Levshina



RU Quantitative Lexicology and Variational Linguistics, Faculteit Letteren, Subfaculteit Taalkunde, K.U.Leuven

Outline
- The object of the study
- Theoretical and methodological goals
- Data and methods
- Results
- Conclusions

Dutch causative constructions
Haar stem deed het glas barsten.
'her voice made the glass break'
Harry liet het glas barsten.
'Harry made the glass break'
Elements of the construction:
- Causer: haar stem, Harry
- Auxiliary: deed (doen), liet (laten)
- Causee: het glas
- Effected Predicate: barsten

Previous research (1)
Conceptual differences between the constructions (Kemmer & Verhagen 1994; Verhagen & Kemmer 1997; Degand 2001; Stukker 2005)
- doen: direct causation. There is no intervening energy source 'downstream' from the initiator: if the energy is put in, the effect is the inevitable result (V&K 1997)
- laten: indirect causation. Some other force besides the initiator is the most immediate source of energy in the effected event (V&K 1997)

Conceptual difference
Haar stem deed het glas barsten.
'her voice made the glass break'
- directly: sound resonance
Harry liet het glas barsten.
'Harry made the glass break'
- indirectly: the power of his magic wand

Previous research (2)
- Lexical fixation (Speelman & Geeraerts 2009): doen is more collocationally bound than laten
- Lectal variation (Speelman & Geeraerts 2009): doen is more frequently used in Belgian Dutch and in formal communication: an obsolescent form with a tendency towards lexical and semantic specialization
- Historical change (Duinhoven 1994; Verhagen 1994): doen has lost some usage schemata (e.g. interpersonal authoritative causation) over time

Towards a Hybrid Semantics
 | 'Analogue' Semantics (CogLing) | 'Digital' Semantics (CorpLing, CompLing)
Perspective | mostly polysemy | grammatical 'alternations' and lexical relationships
Context | maximally contextualized, no distinction between concepts and use | minimally contextualized, contextual variation kept separate
Methods | introspective, interpretative | formal, computational

why not combine the best of both worlds? (digital semantics with 'warm vinyl sound')

Aims
- PERSPECTIVE: combine the semasiological and onomasiological perspectives on doen and laten, i.e. internal structure and distinctive features (cf. Geeraerts et al. 1994; Stukker 2005; Glynn 2007)
- CONTEXT: explore whether there is geographic and register variation in the semantics of the constructions
- METHOD: develop a quantitative approach that allows for an intuitive representation and qualitative interpretation of meaning

Corpus data
Register \ Country | The Netherlands | Flanders
Spontaneous face-to-face conversations | CGN (a) | CGN (a)
Newspapers (politics, economy, football, music) | TwNC | LeNC
Postings in online discussion groups (politics, economy, football, music) | Usenet.nl | Usenet.be (in Dutch)
Total: 5672 instances of doen and laten

Variables
- Causer, Causee, Affectee: semantic class, person, number, definiteness, POS, syntactic expression
- Effected Predicate: transitivity, prepositional complements, semantic class of the caused event, lemma
- Coreferentiality and possession relations between the participants
- Causee only: intentionality, semantic role
- Negation, adverbial modifiers
- Mood, tense, type of the clause and sentence
Total: 35 categorical variables

Analytical Procedure
- Data frame with observations (rows) and variables (columns)
- Matrix of distances between the observations, based on Gower's distance metric
- Multivariate analyses (MDS, hclust) to explore the semantic structure
- Confirmatory analyses (mixed GLM, random forests, etc.)
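To make the distance step concrete, here is a minimal sketch of Gower's distance for purely categorical variables, where it reduces to the proportion of variables on which two observations differ. The three variables and their values are hypothetical stand-ins for the 35 coded variables, and the helper names (`gower_distance`, `distance_matrix`) are illustrative rather than taken from the study.

```python
from itertools import combinations


def gower_distance(a, b):
    """Gower distance between two observations coded with categorical variables only.

    Each variable contributes 0 if the two values match and 1 if they differ.
    Variables with a missing value (None) in either observation are skipped.
    """
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable variables")
    return mismatches / compared


def distance_matrix(rows):
    """Symmetric matrix of pairwise Gower distances between observations."""
    n = len(rows)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = gower_distance(rows[i], rows[j])
    return matrix


# Hypothetical coding with three variables per observation:
# (semantic class of Causer, volitionality of Causee, semantic class of caused event)
observations = [
    ("inanimate", "non-volitional", "physical"),
    ("human", "non-volitional", "physical"),
    ("human", "volitional", "mental"),
]
distances = distance_matrix(observations)
```

In this toy coding the first two observations differ on one of three variables, so their distance is 1/3, while the first and third differ on all three and are maximally distant (1.0).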

'Digital' Operationalization of 'analogue' semantics
- MDS dimensions = dimensions of semantic variation
- clusters of exemplars = 'senses'/usage patterns
- density of specific senses/usage patterns = entrenchment
- discontinuities = autonomy of specific senses/usage patterns
- etc.
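The idea of reading clusters of exemplars as usage patterns can be illustrated with a naive agglomerative clustering over a distance matrix. This is only a sketch: the slides mention hclust without naming a linkage method, so average linkage is an assumption here, and the toy distances and the function name `average_linkage_clusters` are hypothetical.

```python
def average_linkage_clusters(dist, k):
    """Merge the two closest clusters (average linkage) until k clusters remain.

    dist is a symmetric matrix of pairwise distances between exemplars.
    Returns a sorted list of clusters, each a sorted list of exemplar indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def average_distance(c1, c2):
        total = sum(dist[i][j] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical distances between four exemplars: 0 and 1 are close, as are 2 and 3.
toy_distances = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
usage_patterns = average_linkage_clusters(toy_distances, k=2)
```

With these toy distances the procedure groups exemplars 0 and 1 together and 2 and 3 together. On real data the choice of linkage and of the number of clusters would need to be justified separately.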

Outline
- The object of the study
- Theoretical and methodological goals
- Data and method
- Results
  - semasiology meets onomasiology
  - lectal variation: is it done with doen?
- Conclusions

Semasiology: Semantics of doen

doen: dimensions
[Figure: plot of doen exemplars along two dimensions: volitional vs. non-volitional Causees, and mental vs. non-mental caused events]

Semasiology: summary
- the main dimensions of doen and laten are identical: the semantic domain of the caused event and the volitionality (agentivity, autonomy) of the Causee, i.e. direct vs. indirect causation
- directness/indirectness of causation is a matter of degree
- no pronounced central sense in either construction
- no discrete senses, although relatively autonomous regions correspond to the highly frequent doen denken aan, laten weten/zien, etc. (cf. Bybee 2010)
- the data challenge traditional radial network models used in Analogue Semantics

Onomasiology: doen vs. laten

Onomasiology: Summary
- the main distinction between doen and laten is that of directness and indirectness of causation (vertical dimension)
- there is also some evidence of mental caused events preferring doen (horizontal dimension)
- clear exemplar effects for doen denken aan and other lexically specific collocations (also in a mixed-effects model)
- the most distinctive exemplars are not the most central (prototypical), especially for laten
- the results challenge the all-covering notion of 'The Prototype' used by some linguists
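The contrast between centrality and distinctiveness can be illustrated with a small sketch. The measures below are assumptions made for illustration, not the operationalization used in the study: centrality is taken as a low mean distance to exemplars of the same construction, and distinctiveness as the mean distance to the other construction minus the mean distance to one's own. The one-dimensional positions are invented.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def within_and_between(dist, labels, i):
    """Mean distance from exemplar i to its own construction and to the other one."""
    own = [j for j, lab in enumerate(labels) if lab == labels[i] and j != i]
    other = [j for j, lab in enumerate(labels) if lab != labels[i]]
    return mean(dist[i][j] for j in own), mean(dist[i][j] for j in other)


def distinctiveness(dist, labels, i):
    """Illustrative score: how much farther i is from the other construction than from its own."""
    within, between = within_and_between(dist, labels, i)
    return between - within


# Hypothetical exemplars placed on a single semantic dimension.
positions = [0.0, 0.2, 0.4, 0.5, 0.7, 1.0]
labels = ["doen", "doen", "doen", "laten", "laten", "laten"]
dist = [[abs(p - q) for q in positions] for p in positions]

laten = [i for i, lab in enumerate(labels) if lab == "laten"]
# Most central: smallest mean distance to the other laten exemplars.
most_central = min(laten, key=lambda i: within_and_between(dist, labels, i)[0])
# Most distinctive: largest illustrative distinctiveness score.
most_distinctive = max(laten, key=lambda i: distinctiveness(dist, labels, i))
```

In this toy layout the laten exemplar in the middle of its own group (index 4) is the most central, whereas the one farthest from the doen exemplars (index 5) is the most distinctive, so the two notions pick out different exemplars.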

doen: Quantitative differences
[Figure: frequency of causative doen per million words, by country (Belgium, Netherlands) and register (Newspapers, Usenet, Spoken dialogues)]
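Frequencies per million words make subcorpora of different sizes comparable. The sketch below shows the normalization only; the counts and corpus sizes are invented for illustration and are not the figures behind the chart.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to a frequency per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus size must be positive")
    return count * 1_000_000 / corpus_size


# Hypothetical (raw count of causative doen, subcorpus size in words).
subcorpora = {
    "newspapers": (150, 2_000_000),
    "usenet": (30, 1_500_000),
}
rates = {name: per_million(count, size) for name, (count, size) in subcorpora.items()}
```

Here 150 hits in 2 million words correspond to 75 per million, and 30 hits in 1.5 million words to 20 per million.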

doen: Belgian newspapers

doen: Belgian Usenet

doen: Belgian conversations

doen: Netherlandic newspapers

doen: Netherlandic Usenet

doen: Netherlandic conversations

Lectal variation: Summary
- doen is quantitatively and qualitatively (doen denken aan) more restricted in the Netherlandic and informal lects (cf. S&G 2009)
- laten exhibits less quantitative variation, although the frequency of laten zien and laten weten is surprisingly high in the Netherlandic variety, especially in the newspapers
- for both doen and laten, the Netherlandic variety has higher frequencies of specific lexical collocations with a high degree of autonomy
- these lexicalization processes have especially dramatic consequences for doen, which may be on its way to becoming a bound morpheme

Conclusions
- METHOD: the method allows for the quantitative operationalization of the main descriptive notions of semantics; one can model Gestalt-like concepts and senses with the help of individual observable features
- PERSPECTIVE: the results demonstrate the complementarity of the semasiological and onomasiological perspectives. Centrality (prototypicality) and distinctiveness (cue validity) do not coincide, especially for the semantically broader laten
- CONTEXT: the analysis of lectal variation shows that the most pronounced differences are related to the higher frequency of the lexicalized semi-autonomous constructions in the NL lects, which might suggest an ongoing fragmentation of the categories

Future research
- experimental support of the corpus-based operationalizations of semantic phenomena
- main dimensions of variation: universal or construction-specific?
- a detailed lectally specific protocol of the historical changes in the semantic space of doen and laten

Thank you!