Stefan Engelberg (IDS Mannheim), Workshop Corpora in Lexical Research, Bucharest, Nov. 2008 [Folie 1]
Content
1. Empirical linguistics
2. Text corpora and corpus linguistics
3. Concordances
4. Application I: The German progressive
5. Part-of-speech tagging
6. Frequency analysis
7. Application II: Compounds
8. Co-occurrence analysis
   8.1 Cluster analysis
   8.2 Co-occurrence
   8.3 CCDB & IDS co-occurrence analysis
   8.4 Searching for collocations
9. Application III: Word senses in lexicography
10. Keyword analysis

[Folie 2]
Word group analysis
8.1 Cluster analysis

Cluster
A cluster is a chain of linguistic entities. In er sprach vor einem großen Publikum ('he spoke in front of a large audience'), spr is a consonant cluster consisting of 3 consonants, and sprach vor einem is a word cluster consisting of 3 words.

n-gram
An n-gram is a sequence of n linguistic elements of the same type (Kunze & Lemnitzer 2007: 190). A 4-gram of words is a sequence of 4 words. An n-gram is the same as an n-cluster. The term n-gram is used in particular if all n-clusters are extracted from a corpus.

Kunze, Claudia and Lothar Lemnitzer (2007). Computerlexikographie. Eine Einführung. Tübingen: Narr [E-Book].
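The n-gram definition above can be illustrated with a minimal Python sketch. The helper name `ngrams` and the whitespace tokenization are assumptions made for this illustration; they are not part of the slides or of any particular corpus tool.

```python
from collections import Counter


def ngrams(items, n):
    """Return every contiguous sequence of n elements from a list or string."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Example sentence from the slide, split on whitespace (a simplifying assumption).
tokens = "er sprach vor einem großen Publikum".split()

# Word clusters of size 3; ('sprach', 'vor', 'einem') is one of them.
word_trigrams = ngrams(tokens, 3)

# Character clusters of size 3; ('s', 'p', 'r') matches the consonant cluster spr.
char_trigrams = ngrams("sprach", 3)

# Extracting all 2-clusters and counting them gives a bigram frequency list.
bigram_counts = Counter(ngrams(tokens, 2))

print(word_trigrams)
print(char_trigrams[0])
```

Because the same function works on a list of words and on a string of characters, it covers both kinds of cluster mentioned on the slide.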
[Folie 3]
Search: clusters out of 2 words ending in off in part of the English corpus of the LCC

Screenshot annotations:
Search term (here: off)
Search term position (here: on right)
Size of cluster (here: clusters out of two words)
Frequency condition (here: at least three tokens)
Sort (here: according to frequency of the cluster)
List of bigrams with rank and frequency

[Folie 4]
8.2 Co-occurrence

Co-occurrence
In a general sense, the term co-occurrence refers to the occurrence of two expressions close to each other. In a more specific sense, the term co-occurrence is used when the two expressions occur together more often than would be expected if all words were distributed by chance.

Co-occurrence analysis: the basic idea
1) Assumption: In a certain corpus, word X occurs 1,000 times, word Y 100 times, and word Z 10 times.
2) Probability: If words were distributed by chance, the combination XY would be ten times as likely as the combination XZ, so XY should occur ten times as often as XZ.
3) Observation: Actually, XZ occurs about as often as XY.
4) Conclusion: There is a close linguistic connection between X and Z (close beyond expectation).

Kunze, Claudia and Lothar Lemnitzer (2007). Computerlexikographie. Eine Einführung. Tübingen: Narr [E-Book], p. 391f.
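The four steps of the basic idea can be traced numerically. The sketch below is only an illustration: the corpus size and the observed pair counts are hypothetical values that do not appear on the slide, and the expected frequency uses the simple chance model f(A) * f(B) / N.

```python
def expected_pair_frequency(freq_a, freq_b, corpus_size):
    """Expected number of A-B pairs if words were distributed by chance."""
    return freq_a * freq_b / corpus_size


CORPUS_SIZE = 1_000_000  # hypothetical, not given on the slide
freq = {"X": 1000, "Y": 100, "Z": 10}  # word frequencies from step 1

# Step 2: under chance, XY is expected ten times as often as XZ.
expected_xy = expected_pair_frequency(freq["X"], freq["Y"], CORPUS_SIZE)
expected_xz = expected_pair_frequency(freq["X"], freq["Z"], CORPUS_SIZE)

# Step 3: hypothetical observation in which XZ occurs about as often as XY.
observed_xy = 5
observed_xz = 5

# Step 4: XZ exceeds its expectation far more than XY does.
surplus_xy = observed_xy / expected_xy
surplus_xz = observed_xz / expected_xz

print("expected XY / expected XZ:", expected_xy / expected_xz)
print("observed/expected for XY:", surplus_xy)
print("observed/expected for XZ:", surplus_xz)
```

The ratio of the two expected values does not depend on the assumed corpus size, which is why the slide can state the factor of ten without naming one.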
[Folie 5]
Search: co-occurrences for just in part of the English corpus of the LCC

Screenshot annotations:
Search term (here: just)
Definition of search context (here: up to 2 words after the search term)
Frequency condition (here: at least 10 tokens)
Sort (here: according to significance of co-occurrence)
List of co-occurrence partner words with rank, frequency, and significance measure

[Folie 6]
8.3 CCDB & IDS co-occurrence analysis

Co-occurrence analysis at the IDS
Access:
via COSMAS II WWW interface
via COSMAS II client
via CCDB (co-occurrence database)

WWW interface and client: Co-occurrences are computed online (takes some time); several options for fine-tuning the analysis are available.
CCDB: Results of co-occurrence analyses are stored (fast access); no fine-tuning of the analysis; automatic comparison of collocation profiles is available.

Source: Belica, Cyril: Kookkurrenzdatenbank CCDB. Eine korpuslinguistische Denk- und Experimentierplattform für die Erforschung und theoretische Begründung von systemisch-strukturellen Eigenschaften von Kohäsionsrelationen zwischen den Konstituenten des Sprachgebrauchs. Institut für Deutsche Sprache, Mannheim.
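The search settings annotated on slide 5 (a search term, a context of up to 2 words after it, and a frequency condition) can be approximated with a short Python sketch. The function name `right_context_partners` and the toy sentence are hypothetical, and the result is sorted by raw frequency rather than by a significance measure, so this is a simplification of what the corpus tools compute.

```python
from collections import Counter


def right_context_partners(tokens, search_term, window=2, min_freq=1):
    """Count words occurring up to `window` positions after `search_term`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == search_term:
            counts.update(tokens[i + 1:i + 1 + window])
    # Keep only partners that meet the frequency condition, most frequent first.
    return [(word, n) for word, n in counts.most_common() if n >= min_freq]


# Toy text, invented for this illustration.
text = "it was just a joke and just a bit of fun just in case"
partners = right_context_partners(text.split(), "just", window=2, min_freq=2)
print(partners)
```

With a real corpus, the frequency condition would be set higher (the slide uses at least 10 tokens), and the partner list would then be re-ranked by a significance measure.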
[Folie 7]
Application example II: Co-occurrences of bestehen
Question: co-occurrences for bestehen (in particular governed prepositions).
Co-occurrence analysis for bestehen as part of the CCDB (setting: do not ignore function words)

[Folie 8]
Application example II: Co-occurrences of bestehen

Screenshot annotations:
Primary co-occurrence partner of bestehen (here: aus)
Strength of the connection (here: 40683)
Secondary co-occurrence partners of bestehen + aus, here: aus Mitgliedern / Teilen / Ortsteilen bestehen, 'consist of members / parts / suburbs'
Typical syntagmatic patterns in which the words co-occur, e.g. besteht aus [ ] [zwei drei] Teilen, 'consists of [ ] [two three] parts'
[Folie 9]
8.3 CCDB & IDS co-occurrence analysis

Results (among others)
aus:
  besteht [ ] aus ('consists of [ ]')
  besteht [ ] aus [ ] Mitgliedern ('consists [ ] of [ ] members')
darin:
  besteht [ ] darin, dass ('is [ ] that')
  die Schwierigkeit [ ] besteht [ ] darin, dass ('the difficulty [ ] is [ ] that')
darauf:
  besteht [ ] darauf, dass ('insists [ ] that')
  er bestand [ ] darauf, dass ('he insisted [ ] that')
worin:
  worin [ ] besteht
  worin [ ] besteht der Unterschied zwischen ('what [ ] is the difference between')

Governed prepositions: auf, aus, in
The prepositions auf and in occur in particular with prepositional complement clauses (darauf, dass; darin, dass).
The preposition in often occurs in interrogative sentences (worin).

[Folie 10]
8.4 Searching for collocations

Exploration of collocations and fixed expressions
Article from a German-Mongolian dictionary (preliminary version): 20 Flaschen à 8 Euro, '20 bottles at 8 Euros each'
Task: Find relevant collocations and fixed expressions containing à.
Procedure:
1) Retrieve concordances from a smaller corpus (AntConc with part of the German corpus from the Leipzig Corpus Collection).
2) Carry out co-occurrence analysis (CCDB, Deutsches Referenzkorpus).
[Folie 11]
8.4 Searching for collocations

Concordances for à in a 1-million-RW selection of the German corpus within the LCC
Fixed expression à la, 'after the fashion of' (5 out of 10 hits)
Fixed expression peu à peu, 'bit by bit' (1 out of 10 hits)

[Folie 12]
Co-occurrence analysis on the basis of the Deutsches Referenzkorpus (based on 2 bn. RW); COSMAS II WWW interface
la as the most significant co-occurrence partner of à (log likelihood ratio: […])
peu as the second most significant co-occurrence partner of à (log likelihood ratio: 15974)
Both collocations, à la and peu à peu, are missing in the dictionary.
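The significance values on slide 12 are log likelihood ratios. As a rough illustration of how such a score can be derived from frequencies, the sketch below uses one common formulation, the G² statistic over a 2x2 contingency table. The slides do not say whether COSMAS II computes exactly this variant, so the function `log_likelihood_ratio`, its parameters, and the example counts are all assumptions made for this illustration.

```python
import math


def log_likelihood_ratio(f_ab, f_a, f_b, n):
    """G2 score for words A and B.

    f_ab: number of contexts containing both A and B
    f_a, f_b: number of contexts containing A, respectively B
    n: total number of contexts
    """
    observed = [
        [f_ab, f_a - f_ab],                 # A present: B present / B absent
        [f_b - f_ab, n - f_a - f_b + f_ab], # A absent:  B present / B absent
    ]
    row_totals = [f_a, n - f_a]
    col_totals = [f_b, n - f_b]
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = row_totals[i] * col_totals[j] / n  # expected count under chance
            if o > 0:  # a cell with zero observations contributes nothing
                total += o * math.log(o / e)
    return 2 * total


# Hypothetical counts: both words occur in 100 of 1,000 contexts.
print(log_likelihood_ratio(10, 100, 100, 1000))  # joint count equals the chance expectation
print(log_likelihood_ratio(20, 100, 100, 1000))  # twice the chance expectation
print(log_likelihood_ratio(50, 100, 100, 1000))  # five times the chance expectation
```

The score is zero when the joint frequency equals the chance expectation and grows as the pair co-occurs more often than expected, which is why partners can be ranked by it.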
[Folie 13]
VICOMTE Kookkurrenzexplorer (co-occurrence explorer)
i) Primary and secondary co-occurrence partners are shown in a diagram

[Folie 14]
ii) Co-occurrence partners can be annotated
[Folie 15]
iii) Co-occurrence partners can be grouped

[Folie 16]
iv) Refinement of description

Perkuhn, Rainer: Systematic Exploration of Collocation Profiles. In: Proceedings of 4th Corpus Linguistics 2007, Birmingham. aper/132_paper.pdf.
