Using German corpora for linguistic purposes. Dr. Kathrin Steyer Institut für Deutsche Sprache, Mannheim


1 Using German corpora for linguistic purposes Dr. Kathrin Steyer Institut für Deutsche Sprache, Mannheim

2 Introduction This talk gives a first impression of the complex field of German corpora and methods of corpus analysis. Before you start working with corpora, be aware of what a method can and cannot accomplish.

3 Introduction I often notice that overly complicated methods are used where simply collecting and counting instances would have been enough. Large collections of data and powerful automatic tools sometimes lead to an overvaluation of quantitative data.

4 Introduction Sometimes, the allure of numbers and frequencies leads to methodological laziness. Even today, the quality of linguistic interpretation is the most important factor regarding the informative value of the analysis. Corpus linguistics has not diminished the importance of the old cultural technique of reading and interpreting texts.

5 Introduction Today, I will highlight some ways in which corpora and tools can help us linguists to obtain a high-quality prestructuring of data. This is particularly useful for examining high-frequency phenomena that are important for language use, and for identifying phenomena that are not obvious to us, e.g. hidden structures and patterns.

6 Introduction The focus is not on corpora or tools that require expert knowledge or have to be downloaded; those are primarily used for automatic natural language processing, e.g. Wortschatz Leipzig, IMS Open Corpus Workbench (Stuttgart) or TIGER (Berlin). Instead: corpora that are available online and free of charge for the "common linguist"

7 German Introductions to Corpus Linguistics Lemnitzer, Lothar/Zinsmeister, Heike (2010): Korpuslinguistik. Eine Einführung. 2., durchgesehene und aktualisierte Aufl. (= Narr Studienbücher). Tübingen Perkuhn, Rainer/Keibel, Holger/Kupietz, Marc (2012): Korpuslinguistik. (=UTB 3433) Paderborn.

8 German Corpus Linguistics Website Noah Bubenhofer ( ): Einführung in die Korpuslinguistik: Praktische Grundlagen und Werkzeuge.

9 A Short History of German Corpora Institute for the German Language (IDS): a pioneer in the German-speaking area since the mid-1960s (!) Compilation of electronic text databases ( -> today: German reference corpus DeReKo) Development of COSMAS I, the first platform for corpus analysis in the German-speaking area (early 1990s to 2003)

10 A Short History of German Corpora Core corpus of the Digital Dictionary of the 20th Century (Digitales Wörterbuch des 20. Jahrhunderts) at the Berlin-Brandenburgische Akademie der Wissenschaften; sponsored by the Deutsche Forschungsgemeinschaft (DFG) Since 2009 merged into the C4 Corpus: DWDS; Schweizer Textkorpus (Switzerland); Austrian Academic Corpus; Korpus Südtirol (South Tyrol); 80 million word tokens

11 Overview 1. German specialized corpora: examples 2. German general reference corpora 2.1 DWDS 2.2 DeReKo 3. Methodological approaches 3.1 Consulting the corpus 3.2 Analysing the corpus: statistical collocation analysis 4. Corpora and lexical resources

12 German Specialized Corpora Spoken language: Database (DGD2) Archive Gesprochenes Deutsch (Spoken German) (IDS) Discourse analysis, Dialectology Dortmunder Chatkorpus

13 German Specialized Corpora Annotation: e.g. morpho-syntactically annotated corpora; example: TIGER-Korpus (IMS Stuttgart) Language learning: learner corpora, error-annotated corpora; example: FALKO (HU Berlin) Literature: Project Gutenberg; about free ebooks (online)

14 Specialized corpora at the IDS Author corpora: Goethe corpus Dialects: Zwirner corpus, including a corpus of vernaculars of the former Eastern territories Genre: parliamentary debates, biographical fiction Historical period: Wendekorpus (1989/90), about 3.3 million word tokens: articles, leaflets, flyers, parliamentary proceedings, speeches, declarations etc. Medium: Wikipedia corpus

15 German General Reference Corpora

16 German General Reference Corpora Not compiled for a specific use or for answering specific research questions As general as possible in order to be useful for various language studies DWDS and DeReKo

17 DWDS corpus: 2.5 billion word tokens in total; 1.8 billion word tokens publicly accessible (online and free) (several corpora) Core corpus: approx. 100 million word tokens Balanced with respect to time and genre (literature, journalistic prose, scientific texts, specialized texts (adverts, manuals etc.), spoken) Spans the 20th century Integrated with the DWDS Portal (dictionaries etc.)

18 The German Reference Corpus (DeReKo) and COSMAS II Institut für Deutsche Sprache, Mannheim (IDS)

19 The German Reference Corpus DeReKo 6.1 billion word tokens (status as of ) Contains written German-language texts of the present and the recent past The largest "primordial sample of contemporary German" worldwide Online and free, registration required (copyright) List of corpora

20 The German Reference Corpus DeReKo Contains only copyrighted material Dynamic corpus (continually updated) Option to create personal subcorpora with COSMAS II which can be tailored towards specific research questions

21 German Reference Corpus at the IDS With more than 5.4 billion words (status as of ), the world's largest linguistically motivated collection of electronic corpora of written German-language texts from the present and the recent past Fiction, scientific and popular-science texts, a large number of newspaper texts and a broad range of other text types -> analysis system COSMAS II

22 COSMAS II Corpus Search, Management and Analysis System Not a web search engine Language independent Free online access since 1993 Ca registered users from over 100 countries

23 Search window KWIC Full text

24 Result presentation sources / corpora chronological alphabetical (successor/predecessor of the search object) randomized sorting text genres topics collocations Export of results
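
The KWIC view and the sorting by successor mentioned above can be illustrated with a few lines of Python. This is only a minimal sketch and not how COSMAS II is implemented; the function name `kwic`, the `width` parameter and the three example sentences are assumptions made up for illustration.

```python
import re

def kwic(sentences, keyword, width=2):
    """Return (left context, hit, right context) for every hit of keyword."""
    lines = []
    for sentence in sentences:
        tokens = re.findall(r"\w+", sentence)
        for i, token in enumerate(tokens):
            if token.lower() == keyword.lower():
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append((left, token, right))
    return lines

# Invented example sentences, not corpus data.
sentences = [
    "Er schüttelt den Kopf und geht.",
    "Sie behält einen kühlen Kopf.",
    "Der Kopf der Gruppe schweigt.",
]

hits = kwic(sentences, "Kopf")

# Sort alphabetically by the right context (the successor of the search object).
by_successor = sorted(hits, key=lambda line: line[2].lower())

for left, hit, right in by_successor:
    print(f"{left:>15} [{hit}] {right}")
```

Sorting by the predecessor instead only requires changing the sort key to the left context.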

25 Analytical approaches

26 Paradigms of corpus analysis Looking for answers to my questions in the corpus -> validation of a priori knowledge ('consulting') Finding new research questions in the corpus and interpreting those -> best case: generating new knowledge ('analysing')

27 Consulting the corpus Do specific language elements (e.g. morphemes, lexemes, multi-word units) occur at all and if they do, how often? Which usage based aspects of meaning can be identified? In which situations are they used? What is the typical base form in the corpus? Which variations can be found?
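
The first question above (does an element occur at all, and how often?) is the "simply collecting and counting" case. A minimal sketch in Python, assuming a toy corpus of three invented sentences; `tokenize` and `count_forms` are hypothetical helper names, not part of any corpus tool.

```python
import re
from collections import Counter

# Toy corpus (invented sentences, not DeReKo data).
corpus = [
    "Übung macht den Meister.",
    "Am Samstag macht er eine Übung.",
    "Die Übungen am Sonnabend sind schwer.",
]

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def count_forms(sentences, forms):
    """Count how often each of the given word forms occurs."""
    wanted = {form.lower() for form in forms}
    counts = Counter()
    for sentence in sentences:
        for token in tokenize(sentence):
            if token in wanted:
                counts[token] += 1
    return counts

counts = count_forms(corpus, ["Übung", "Übungen", "Meister"])
print(counts)
```

The sketch counts word forms only; questions about base forms and variation would additionally need lemmatization, which is left out here.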

28 Consulting the corpus Discourse: Globalisierung bedeutet ('globalization means') (Teubert 2006) Text type (birthday texts; advertisements): geistige Frische Regional differences: Samstag vs. Sonnabend in Germany, Austria, Switzerland

29 Consulting the corpus (corpus-based) Regional peculiarity Grumbeere? auf freiem Fuß anzeigen? Spelling? Blind date or Blind Date or? Discourse? Besserwessi

30

31 Consulting the corpus Samstag is used in all German-speaking areas Sonnabend is used almost exclusively in Germany Chronological (e.g. new lexemes, multi-word units): voll krass
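
A regional comparison such as Samstag vs. Sonnabend comes down to relative frequencies per region. The following Python sketch assumes that each hit carries a country label from the corpus metadata; the hit list is invented for illustration and says nothing about the real distribution.

```python
from collections import Counter, defaultdict

# Invented hits: (word form, country of the source text).
hits = [
    ("Samstag", "DE"),
    ("Sonnabend", "DE"),
    ("Samstag", "AT"),
    ("Samstag", "CH"),
    ("Samstag", "DE"),
]

def share_by_region(hits):
    """Return, per region, the share of each word form among all hits there."""
    per_region = defaultdict(Counter)
    for form, region in hits:
        per_region[region][form] += 1
    shares = {}
    for region, counter in per_region.items():
        total = sum(counter.values())
        shares[region] = {form: n / total for form, n in counter.items()}
    return shares

shares = share_by_region(hits)
print(shares)
```

Shares within a region are used here instead of raw counts because regional subcorpora usually differ in size.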

32 Search strategies - Example Exclusionary searches Excluding hits that are not relevant Verifying stability and variance S: Übung macht ART WITHOUT Meister Query: ((&Übung /+w1 &machen) /+w1 (den ODER die ODER das)) %s0 &Meister S: macht den Meister WITHOUT Übung Query: (&machen /+w2 &Meister) %s0 &Übung
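
The logic of the first exclusionary search can be imitated in Python. This is a rough sketch under simplifying assumptions: it matches exact word forms rather than lemmas, treats each string as one sentence, and the names `ARTICLES`, `tokens` and `matches_without` are hypothetical. It does not reproduce the COSMAS II query engine.

```python
import re

ARTICLES = {"den", "die", "das"}

def tokens(sentence):
    """Split into word tokens; hyphenated compounds stay one token."""
    return re.findall(r"[\w-]+", sentence)

def matches_without(sentence, exclude="Meister"):
    """True if 'Übung macht ART' occurs and `exclude` is absent from the sentence."""
    toks = tokens(sentence)
    if exclude in toks:
        return False
    for i in range(len(toks) - 2):
        if toks[i] == "Übung" and toks[i + 1] == "macht" and toks[i + 2] in ARTICLES:
            return True
    return False

sentences = [
    "Übung macht den Meister.",
    "Übung macht den Kegelmeister.",
    "Übung macht die Meisterin.",
    "Technik macht den Meister.",
]

variants = [s for s in sentences if matches_without(s)]
print(variants)
```

Only the variants survive the filter, which is the point of the strategy: the canonical form is excluded so that the variance becomes visible.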

33 Patterns: Übung macht den X M11 Übung macht den Kegelmeister M99 Übung macht den Handball-Meister M99 Übung macht auch hier den Zaubermeister. RHZ11 A97 A00 A09 F99 Übung macht die Meisterin Übung macht Radioprediger Übung macht den Schützen Übung macht den Feuerwehrmann Übung macht den Gourmet

34 Patterns: X macht den Meister B06 Technik macht den Meister Tipps für Anfänger B07 Energie macht den Meister B07 Vorsicht macht den Meister. BVZ07 Die Praxis macht den Meister zu Schulbeginn E99 Doch erst Playoff macht den Meister. M00 Ob Profi oder Schnuppersportler - Training macht den Meister.
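
Once hits like those on the two pattern slides have been exported, the fillers of the open slot X can be collected with a regular expression. A minimal Python sketch, assuming the hits are available as plain strings (the list below reuses examples from the slides); the variable names are illustrative only.

```python
import re
from collections import Counter

# Hits taken from the two pattern slides, reduced to the relevant phrase.
hits = [
    "Übung macht den Kegelmeister",
    "Übung macht den Handball-Meister",
    "Übung macht den Schützen",
    "Übung macht den Feuerwehrmann",
    "Übung macht den Gourmet",
    "Technik macht den Meister",
    "Energie macht den Meister",
    "Vorsicht macht den Meister",
    "Training macht den Meister",
]

# Pattern 'Übung macht den X': capture the word after the article.
slot_after = re.compile(r"Übung macht den ([\w-]+)")
# Pattern 'X macht den Meister': capture the word before the verb.
slot_before = re.compile(r"([\w-]+) macht den Meister\b")

fillers_after = Counter(m.group(1) for h in hits for m in slot_after.finditer(h))
fillers_before = Counter(m.group(1) for h in hits for m in slot_before.finditer(h))

print(sorted(fillers_after))
print(sorted(fillers_before))
```

With real data the two counters would show which fillers recur, which is one simple way to judge how productive each slot is.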

35 Other phenomena: Word formation Productivity in word formation *mentalität
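
A wildcard search such as *mentalität can be approximated with a regular expression, and the number of distinct compounds (types) against the number of hits (tokens) gives a first, rough impression of productivity. The text below is invented for illustration; treating the type count as a productivity indicator is a common simplification, not a method stated on the slide.

```python
import re
from collections import Counter

# Invented example text, not corpus data.
text = (
    "Die Vollkaskomentalität und die Wegwerfmentalität prägen die Debatte. "
    "Auch die Mentalität der Spieler zählt, nicht nur die Wegwerfmentalität."
)

# At least one word character before 'mentalität', so the simplex is skipped.
pattern = re.compile(r"\b\w+mentalität\b", re.IGNORECASE)

compounds = Counter(m.group(0) for m in pattern.finditer(text))
n_types = len(compounds)
n_tokens = sum(compounds.values())

print(compounds, n_types, n_tokens)
```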

36 Other phenomena: Grammar Search in a morpho-syntactically annotated corpus Relatively small in comparison with the whole corpus archive Adjektive - Kopf (in a subcorpus) All dative nouns followed by a dative relative pronoun within a span of three tokens maximum Query: MORPH(NOU dat) /+w3 MORPH(PRN rel dat)
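
The distance condition in the query above (a second element within at most three tokens after the first) can be sketched in Python on a hand-tagged token list. The tag names mimic those in the query but the tagged sentence is an invented example, and whether punctuation counts as a token in the real system is not stated on the slide; here it does.

```python
# Each token is (word form, set of tags); tags and sentence are illustrative.
tagged = [
    ("dem", {"ART", "dat"}),
    ("Mann", {"NOU", "dat"}),
    (",", {"PUNCT"}),
    ("dem", {"PRN", "rel", "dat"}),
    ("ich", {"PRN", "pers", "nom"}),
    ("helfe", {"VRB"}),
]

def within_span(tokens, first, second, max_dist=3):
    """Return index pairs (i, j) where token i carries all tags in `first`,
    token j carries all tags in `second`, and 1 <= j - i <= max_dist."""
    pairs = []
    for i, (_, tags_i) in enumerate(tokens):
        if not first <= tags_i:
            continue
        last = min(i + max_dist, len(tokens) - 1)
        for j in range(i + 1, last + 1):
            if second <= tokens[j][1]:
                pairs.append((i, j))
    return pairs

pairs = within_span(tagged, {"NOU", "dat"}, {"PRN", "rel", "dat"})
print(pairs)
```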

37 Grammar Phenomena Plea for search in non-annotated corpora, even for grammatical research questions Completely abstract constructions not searchable, lexical anchor necessary BUT: Larger corpus size can lead to surprising results Example: all when without comma

38 Drowning in a flood of mass data? BUT The bigger the data set, the more overwhelming for humans Example Kopf

39 Collocation analysis at the IDS Cyril Belica: Statistische Kollokationsanalyse und Clustering. Korpuslinguistische Analysemethode Institut für Deutsche Sprache, Mannheim. Tutorial 2004: Short introduction to collocation analysis Cp. Perkuhn/Keibel/Kupietz (2012)

40 Part 2 Practical exercises

41 Collocation analysis at the IDS Focuses on lexical co-occurrences Dynamically computed on the latest version of the corpus Flexible adjustment of parameters (e.g. span and position, granularity, function words y/n) Computes not only word-collocate pairs, but also hierarchical clusters and common syntagmatic patterns
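
The basic idea behind span-based collocation analysis can be sketched in a few lines of Python: collect the words within a given span around the search word and compare the observed co-occurrence frequency with what chance would predict. The observed/expected ratio used below is a deliberately simple stand-in; it is not the statistic used by the IDS collocation analysis, and the four tokenized sentences are invented.

```python
from collections import Counter

# Invented, pre-tokenized sentences.
sentences = [
    ["mit", "kühlem", "Kopf", "entscheiden"],
    ["mit", "hochrotem", "Kopf", "dastehen"],
    ["mit", "dem", "Auto", "fahren"],
    ["mit", "dem", "Zug", "fahren"],
]

def collocates(sentences, node, span=2):
    """Count tokens within `span` positions left or right of `node`."""
    counts = Counter()
    for tokens in sentences:
        for i, token in enumerate(tokens):
            if token == node:
                window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
                counts.update(window)
    return counts

def observed_expected(sentences, node, span=2):
    """Ratio of observed to expected co-occurrence frequency per collocate."""
    observed = collocates(sentences, node, span)
    freq = Counter(token for tokens in sentences for token in tokens)
    n_tokens = sum(freq.values())
    n_window = sum(observed.values())
    return {
        word: count / (freq[word] * n_window / n_tokens)
        for word, count in observed.items()
    }

counts = collocates(sentences, "Kopf")
scores = observed_expected(sentences, "Kopf")
ranking = sorted(scores, key=scores.get, reverse=True)
print(counts, ranking)
```

In this toy data "mit" co-occurs most often with "Kopf" but scores lowest, because it is frequent everywhere; this is why raw counts alone are not enough and a statistical measure is needed.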

42 Collocation cluster CA for Kopf

43 Interpreting Clusters Collocation clusters are only indicators for the contexts on which they are based Syntagmatic perspective is most important KWIC cluster Full text cluster

44 Collocation analysis at the IDS Collocations Phrasemes fixed syntagmatic structures fixed context patterns (access to meaning and common usage)

45 You shall know a word by the company it keeps (Firth 1957)

46 Usage clusters: semantical 'injury by external force' Kugel / gegen die Wand stoßen/geschlagen / Platzwunde am Kopf / verletzt / geschossen / an die Bande prallen / abgeschlagenen / Brustverletzung / Beule 'body part' Hals / Nacken / Bauch / Oberkörper / Arme 'symptoms of illness' Gliederschmerzen heiß

47 Usage clusters: phrasemes 'emotional state' mit hängenden Köpfen ('dejected') / mit kühlem Kopf ('level-headed') / mit hochrotem Kopf ('angry', 'embarrassed') / mit gesenktem Kopf ('abashed')

48 Collocations and collocation patterns Mutual lexical fixedness: Hals über Kopf ('rushed') (*X über Kopf; *Hals über X) Semantically restricted usage: mit hochrotem Kopf; CA hochrot -> hochrot only with body parts (prototypical: Kopf) Productive collocation patterns: strategischer Kopf / führende / kreative / beste Köpfe ('leader', 'mastermind')

49 Context patterns Pragmatic Orality / colloquial speech in the corpus voll krass Usage of word classes, formulae, particles, sentence adverbs etc. Example: ernsthaft Discourse: Globalisierung

50 German collocation resources Pro: fast access Contra: no dynamic customization possible DWDS word profiles Collocations in Wortschatz Leipzig IDS- Collocation Database CCDB (Kookkurrenzdatenbank) Pre-analysed profiles of lemmas + KWIC Semantic proximity by comparing CA profiles (e.g. anscheinend vs. scheinbar)
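
Comparing two collocation profiles, as mentioned for anscheinend vs. scheinbar, can be illustrated with cosine similarity over collocate weights. Both the choice of cosine similarity and the profile values are assumptions for this sketch; the slide does not say how the CCDB computes semantic proximity, and the numbers are invented.

```python
import math

def cosine(p, q):
    """Cosine similarity of two collocation profiles (collocate -> weight)."""
    shared = set(p) & set(q)
    dot = sum(p[w] * q[w] for w in shared)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)

# Invented profiles; real profiles would come from a collocation analysis.
anscheinend = {"offenbar": 3.0, "wohl": 4.0}
scheinbar = {"wohl": 4.0, "mühelos": 3.0}

similarity = cosine(anscheinend, scheinbar)
print(round(similarity, 2))
```

A value near 1 would indicate largely overlapping contexts of usage, a value near 0 hardly any shared collocates.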

51 Collocation analysis: clustering typical contexts of usage is an analytical approach that is central to all kinds of linguistic research questions if you are interested in "language in use" (this can also be "syntax in use")

52 IDS linguistic applications Corpus-based grammar (grammis) Lexicon-grammar interface: valency, argument structure and construction grammar DeReKo, IMS Workbench, other Spoken language: Variation des gesprochenen Deutsch: Standardsprache Alltagssprache

53 IDS linguistic applications of CA Corpus-based and corpus-driven lexicology and lexicography OWID (e.g. elexiko; dictionary of modern German proverbs) Multilingual Proverb-Online-Platform Fields of lexical patterns and phraseme constructions -> Qualitative linguistic interpretation of collocation and syntagmatic profiles

54 Outlook

55 Integrative Platforms Authentic corpus data Qualitative Descriptions Lexical resources (e.g. collocation profiles and networks) Web (DWDS; OWID)

56 KorAP KorAP: the next-generation corpus analysis platform of the Institute for the German Language Replaces COSMAS II (but its features will be reproduced) Extends the possibilities of individual corpus design (e.g. by topic, by text type) Several levels of linguistic annotation Basic and extended search functionality; faster

57 Thank you for your attention!


More information

SAP Enterprise Portal 6.0 KM Platform Delta Features

SAP Enterprise Portal 6.0 KM Platform Delta Features SAP Enterprise Portal 6.0 KM Platform Delta Features Please see also the KM Platform feature list in http://service.sap.com/ep Product Management Operations Status: January 20th, 2004 Note: This presentation

More information

LEJ Langenscheidt Berlin München Wien Zürich New York

LEJ Langenscheidt Berlin München Wien Zürich New York Langenscheidt Deutsch in 30 Tagen German in 30 days Von Angelika G. Beck LEJ Langenscheidt Berlin München Wien Zürich New York I Contents Introduction Spelling and pronunciation Lesson 1 Im Flugzeug On

More information

WebLicht: Web-based LRT services for German

WebLicht: Web-based LRT services for German WebLicht: Web-based LRT services for German Erhard Hinrichs, Marie Hinrichs, Thomas Zastrow Seminar für Sprachwissenschaft, University of Tübingen firstname.lastname@uni-tuebingen.de Abstract This software

More information

A Data-Driven Approach to Deep Machine Translation. Michael Jellinghaus Saarland University

A Data-Driven Approach to Deep Machine Translation. Michael Jellinghaus Saarland University A Data-Driven Approach to Deep Machine Translation Michael Jellinghaus Saarland University micha@coli.uni-sb.de Motivation Overview Characterisation of statistical and transfer-based MT Hybrid system idea

More information

Adding Value to CMC Corpora: CLARINification and Part-of-Speech Annotation of the Dortmund Chat Corpus

Adding Value to CMC Corpora: CLARINification and Part-of-Speech Annotation of the Dortmund Chat Corpus Adding Value to CMC Corpora: CLARINification and Part-of-Speech Annotation of the Dortmund Chat Corpus Michael Beißwenger 1, Eric Ehrhardt 2, Andrea Horbach 3, Harald Lüngen 4, Diana Steffen 3, Angelika

More information

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg

Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that

More information

CHAPTER V DISCUSSION. of the English proceeding of national and international conference.

CHAPTER V DISCUSSION. of the English proceeding of national and international conference. CHAPTER V DISCUSSION In this chapter, the writer presents and discusses the data result from the previous chapter. After analyze them, she would like to discuss the result which has been found in the finding

More information

Working Paper Series. RatSWD. Working Paper No. 127. Potential and availability of market research data for empirical social and economic research

Working Paper Series. RatSWD. Working Paper No. 127. Potential and availability of market research data for empirical social and economic research RatSWD Working Paper Series Working Paper No. 127 Potential and availability of market research data for empirical social and economic research Erich Wiegand August 2009 Working Paper Series of the Council

More information
