Sketch Engine. Sketch Engine. SRDANOVIĆ ERJAVEC Irena, Web 1 Word Sketch Thesaurus Sketch Difference Sketch Engine
1 Sketch Engine SRDANOVIĆ ERJAVEC Irena, Sketch Engine Sketch Engine Web 1 Word Sketch Thesaurus Sketch Difference Sketch Engine JpWaC 4 Web Sketch Engine Kilgarriff & Rundell ,000 20, Heid et al. 2000, Kilgarriff & Tugwell 2001 Sketch Engine Kilgarriff et al Srdanović et al Sketch Engine Web Word Sketch Thesaurus Sketch Difference 1
2 Sketch Engine 2. Sketch Engine Sketch Engine Kilgarriff et al Erjavec et al Web Web Sketch Engine Sketch Engine 2.1. Sketch Engine Web Sketch Engine ( 4 JpWaC Web 1 Sharoff (2006) Ueyama & Baroni (2005) Web 5 WAC Baroni & Bernardini, eds BootCat Baroni et al HTML boilerplate removal Web ChaSen token lemma tag Erjavec et al jp.com Erjavec et al Srdanović et al Sketch Engine 2 3 URL Web JpWaC
3 1 Sketch Engine 2 Sketch Engine 3 Sketch Engine 2.2. Word Sketches 22 Word Sketch, Thesaurus Sketch Difference Chasen Gahl 1998 corpus query syntax ( ) 4 Word Sketch 3
4 salience 1 modifies_n ( ) 4 2 dual *DUAL =modifier_ana/modifies_n 2:"N.Ana" "Aux" "Pref.*"? 1:[tag="N.*" & tag!="N.Suff.*" & tag!="N.bnd.*"] modifier_ana modifies_n modifies_n 2:"N.Ana" "Aux" "Pref.*"? N.Ana Aux Pref.* 1:[tag="N.*" & tag!="N.Suff.*" & tag!="N.bnd.*"] N.* N.Suff.* N.bnd.*
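To make the dual relation modifier_ana/modifies_n more concrete, the pattern can be read as: a token tagged N.Ana (labelled 2), then an Aux token, then an optional Pref.* token, then a noun (labelled 1) whose tag matches N.* but not N.Suff.* or N.bnd.*. The following Python sketch applies that reading to a list of (lemma, tag) pairs. It is only an illustration, not the Sketch Engine implementation: the function name, the romanized example words, and the tag Pref.N are assumptions made here.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (lemma, tag); the tuple layout is an assumption of this sketch


def _tag_is(tag: str, pattern: str) -> bool:
    """Return True when the whole tag matches the regular expression."""
    return re.fullmatch(pattern, tag) is not None


def find_modifier_ana(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (modifier, noun) lemma pairs matching
    2:"N.Ana" "Aux" "Pref.*"? 1:[tag="N.*" & tag!="N.Suff.*" & tag!="N.bnd.*"]
    """
    pairs = []
    for i in range(len(tokens) - 2):
        if not _tag_is(tokens[i][1], r"N\.Ana"):
            continue
        if not _tag_is(tokens[i + 1][1], r"Aux"):
            continue
        j = i + 2
        # Skip one optional prefix token.
        if _tag_is(tokens[j][1], r"Pref.*"):
            j += 1
        if j >= len(tokens):
            continue
        head_tag = tokens[j][1]
        if (_tag_is(head_tag, r"N.*")
                and not _tag_is(head_tag, r"N\.Suff.*")
                and not _tag_is(head_tag, r"N\.bnd.*")):
            pairs.append((tokens[i][0], tokens[j][0]))
    return pairs


# Hypothetical romanized input: adjectival noun + auxiliary + general noun.
example = [("kirei", "N.Ana"), ("na", "Aux"), ("hana", "N.g")]
print(find_modifier_ana(example))
```

Because the relation is dual, each pair found this way would be recorded twice: once as modifier_ana for the noun and once as modifies_n for the adjectival noun.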
5 * 0 N.* N.g N.Prop 0 1 Sketch Engine Concordance CQL Corpus Query Language [word= word= ] ChaSen [word= ] [word= ] [lemma= ] 3.2 [tag= N.* ]&[ word = ] Word Sketch Sketch Engine ChaSen IPADIC) IPADIC Sketch Engine Web ChaSen 5 ChaSen ChaSen Sketch Engine token kana lemma POS tag ( ) POS tag-eng ( ) - Adv.P - N.Ana Aux - N.g Aux Aux - Sym.p ChaSen ChaSen IPADIC ChaSen ChaSen 5
6 Word Sketch ChaSen Word Sketch Word Sketch Concordance 100 Word Sketch ChaSen Web 2.3. Thesaurus Sketch Difference Thesaurus Sketch Difference shared triples 3 triple Srdanović et al Thesaurus 6 Sketch Difference ,309 6, Web 6
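The idea of shared triples behind the Thesaurus and Sketch Difference can be illustrated with a small Python sketch. Each word is represented by the set of (grammatical relation, collocate) contexts it occurs in. Two words count as similar when they share many contexts, and the difference view separates shared contexts from those found with only one of the words. The unweighted Dice coefficient used here is a stand-in chosen for simplicity, not the scoring used by the tool. All function names, the relation names other than modifies_n, and the placeholder collocates are assumptions.

```python
from typing import Dict, Set, Tuple

Context = Tuple[str, str]  # (grammatical relation, collocate)


def dice_similarity(a: Set[Context], b: Set[Context]) -> float:
    """Unweighted Dice coefficient over the contexts two words share."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def sketch_difference(a: Set[Context], b: Set[Context]) -> Dict[str, Set[Context]]:
    """Split contexts into common patterns and 'only' patterns for each word."""
    return {"common": a & b, "only_a": a - b, "only_b": b - a}


# Placeholder data: two near-synonyms with partly overlapping contexts.
word_a = {("modifies_n", "x"), ("object_of", "y"), ("subject_of", "z")}
word_b = {("modifies_n", "x"), ("object_of", "y"), ("object_of", "w")}

print(dice_similarity(word_a, word_b))
print(sketch_difference(word_a, word_b)["only_a"])
```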
7 Thesaurus 7 Sketch Difference only pattern 8 Sketch Difference only pattern 2.4. Web Web Web 7
8 Web Web Keller & Lapata 2003 Web Web JpWaC Web Web Sharoff 2006 Ueyama & Baroni 2005 Web Web Web Sharoff 2006 Ueyama & Baroni 2005 Web narrative style Web interactive style Web Web Web Ghani et al Web Web Web Web Web Crystal 2006 Web Web Web 8
9 Web 3. Sketch Engine Sketch Engine 3.1. Sketch Engine 80 Cobuild 90 Church & Hanks 1989 (MI) 2000 Word Sketch Sketch Engine BNC British National Corpus Rundell, ed Kilgarriff & Rundell (2002) Word Sketch Word Sketch Word Sketch Sketch Engine Word Sketch Sketch Engine 9
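For reference, the mutual information (MI) measure of Church & Hanks (1989) compares how often two words co-occur with how often they would co-occur by chance: MI(x, y) = log2(P(x, y) / (P(x) P(y))). The Python sketch below computes it from raw corpus frequencies. The function and parameter names are assumptions, and the code shows plain MI only, not the salience score displayed in a Word Sketch.

```python
import math


def mutual_information(f_xy: int, f_x: int, f_y: int, n: int) -> float:
    """MI in bits from the joint frequency f_xy, the word frequencies
    f_x and f_y, and the corpus size n (all counted in tokens)."""
    if min(f_xy, f_x, f_y, n) <= 0:
        raise ValueError("all frequencies must be positive")
    # P(x, y) / (P(x) * P(y)) simplifies to f_xy * n / (f_x * f_y).
    return math.log2(f_xy * n / (f_x * f_y))


# A pair that co-occurs 16 times more often than chance predicts.
print(mutual_information(f_xy=8, f_x=16, f_y=32, n=1024))  # 4.0
```

A known weakness of MI is that it favours rare word pairs, which is one reason collocation tools combine it with frequency information.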
10 Kilgarriff & Rundell 2002 challenge 2004 Sketch Engine Word Sketch 9 Word Sketch 9 modifier_ana modifier_ai verb verb verb verb 9 initiation trial - 10
11 Word Sketch challenge to something/somebody Concordance 10 Concordance CQL [word=" "] []{0,3} [word=" "] {0,3} 0 3 token 11 ( Word Sketch jaslo Erjavec et al
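The CQL query of the form [word="A"] []{0,3} [word="B"] finds the first word followed by the second with zero to three arbitrary tokens in between. The Python sketch below reproduces that behaviour on a plain token list. It is an illustration only: the function name and the English example sentence are assumptions, and it uses exact string comparison where CQL would treat the values as regular expressions.

```python
from typing import List, Tuple


def find_with_gap(words: List[str], first: str, second: str,
                  max_gap: int = 3) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs where words[i] == first, words[j] == second,
    and at most max_gap tokens lie between them."""
    hits = []
    for i, word in enumerate(words):
        if word != first:
            continue
        for gap in range(max_gap + 1):
            j = i + 1 + gap
            if j >= len(words):
                break
            if words[j] == second:
                hits.append((i, j))
    return hits


sentence = "he posed a challenge to the old view".split()
print(find_with_gap(sentence, "challenge", "view"))  # [(3, 7)]
```

Lowering max_gap narrows the search window: with max_gap=2 the same sentence yields no match, because three tokens separate the two words.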
12 Word Sketch 10 Word Sketch 1) 2) 3) 4) 1) 1, Sketch Engine 22 2 Sketch Engine Sketch Engine Sketch Engine 12
13 2) Word Sketch Word Sketch Sketch Engine Web Sketch Engine 3) Word Sketch Word Sketch 12 13
14 12 Word Sketch 4) Word Sketch Sketch Engine Thesaurus Sketch Difference A B A B A Sketch Difference 14
15 Web Web Word Sketch Sketch Engine 3.2. Sketch Engine Sketch Engine Word Sketch Thesaurus Sketch Difference Concordance suffix ( ) prefix suffix_base prefix_base bound_v V_bound suffix bound_v V_bound Sketch Difference / / 15
16 Word Sketch Word Sketch lemma 2) Concordance Concordance Concordance CQL Concordance CQL [word=" "][word=" "][lemma=" "] [word=" "][word=" "][lemma=" "] lemma 432 2,975 Collocation candidates 16
17 Concordance CQL [tag="v.*"][word=" "][word=" "][lemma=" "] Web 1,170 CQL [word=" "][word=" "][lemma=" "] Collocation candidates 10 Concordance [word=" "] [word=" "] [lemma=" "] 10,845 Collocation candidates 4, (lexical sets) 13 17
18 [word=" "][word=" "][word=" "][word=" "] [word=" "] [lemma=" "] Srdanović 2007 Word Sketch Word Sketch 3.3. Sketch Engine Sketch Engine Sketch Engine 1) Sketch Engine a b Sketch Engine Sketch Engine Nishina & Yoshihashi 2007 Smrž 2004 Sketch Engine 18
19 2) Sketch Engine 3) a ( ) b c d Sketch Engine Smrž 2004 Sketch Difference Thesaurus Sketch Engine Smrž 2004 Sketch Engine Sketch Engine 4) a b c Sketch Engine Sketch Engine Smith et al
20 3.4. Sketch Engine 2.3 Web Web Word Sketch Thesaurus Joyce 2005 Sketch Engine ChaSen ChaSen Corpus Builder Sketch Engine WebBootCaT Web Baroni et al Sketch Engine 1) ChaSen 4 Web 2) ChaSen Sketch Engine Word Sketch Thesaurus Sketch Difference Concordance 1) Web 2) 3) ChaSen ChaSen 20
Srdanović Erjavec, Irena , 83-89, 2007 Sketch Engine 18, , 2004
Baroni, Marco, Adam Kilgarriff, Jan Pomikalek & Pavel Rychly (2006) WebBootCaT: a web tool for instant corpora, Proceedings of the EuraLex Conference 2006.
Baroni, Marco & Silvia Bernardini, eds. (2006) Wacky! Working papers on the Web as Corpus, Bologna: GEDIT.
Church, Kenneth Ward & Patrick Hanks (1989) Word association norms, mutual information, and lexicography, Proceedings of the 27th annual meeting on Association for Computational Linguistics.
Crystal, David (2006) Language and the Internet, Cambridge: Cambridge University Press.
Erjavec, Tomaž, Kristina Hmeljak Sangawa & Irena Srdanović Erjavec (2006) jaslo, A Japanese-Slovene Learners' Dictionary: Methods for Dictionary Enhancement, Proceedings of the 12th EURALEX International Congress.
Erjavec, Tomaž, Adam Kilgarriff & Irena Srdanović Erjavec (2007) A large public-access Japanese corpus and its query tool, CoJaS 2007, The Inaugural Workshop on Computational Japanese Studies.
Gahl, Susanne (1998) Automatic Extraction of subcategorization frames for corpus-based dictionary-building, Proc EURALEX 1998.
Ghani, Rayid, Rosie Jones & Dunja Mladenic (2001) Using the Web to Create Minority Language Corpora, Proceedings of the 2001 ACM CIKM: Tenth International Conference on Information and Knowledge Management.
Heid, Ulrich, Stefan Evert, Vincent Docherty, Wolfgang Worsch & Matthias Wermke (2000) Computational tools for semi-automatic corpus-based updating of dictionaries, EURALEX 2000 Proceedings.
Joyce, Terry (2005) Constructing a large-scale database of Japanese word associations, In Katsuo Tamaoka (ed.) Corpus Studies on Japanese Kanji (Glottometrics 10), 82-98, Tokyo: Hituzi Syobo & Lüdenscheid, Germany: RAM-Verlag.
Keller, Frank & Maria Lapata (2003) Using the Web to Obtain Frequencies for Unseen Bigrams, Computational Linguistics 29 (3).
Kilgarriff, Adam & Michael Rundell (2002) Lexical Profiling Software and its Lexicographic Applications - a Case Study, EURALEX 2002 Proceedings.
Kilgarriff, Adam, Pavel Rychly, Pavel Smrž & David Tugwell (2004) The Sketch Engine, Proc. Euralex.
Kilgarriff, Adam & David Tugwell (2001) WORD SKETCH: Extraction and Display of Significant Collocations for Lexicography, Proc. workshop "COLLOCATION: Computational Extraction, Analysis and Exploitation", 39th ACL & 10th EACL.
Nishina, Kikuko & Kenji Yoshihashi (2007) Japanese Composition Support System Displaying Occurrences and Example Sentences, Symposium on Large-scale Knowledge Resources (LKR2007).
Rundell, Michael, ed. (2002) Macmillan English Dictionary for Advanced Learners, London: Macmillan.
Sharoff, Serge (2006) Open-source corpora: using the net to fish for linguistic data, International Journal of Corpus Linguistics 11(4).
Smith, Simon, Alice Chen & Adam Kilgarriff (2007) A corpus query tool for SLA: learning Mandarin with the help of Sketch Engine, Practical Applications in Language and Computers - PALC 2007.
Smrž, Pavel (2004) Integrating Natural Language Processing into E-learning - A Case of Czech, Proceedings of the Workshop on eLearning for Computational Linguistics and Computational Linguistics for eLearning, COLING.
Srdanović Erjavec, Irena, Tomaž Erjavec & Adam Kilgarriff (2008) A web corpus and word-sketches for Japanese.
Ueyama, Motoko & Marco Baroni (2005) Automated construction and evaluation of a Japanese web-based reference corpus, Proceedings of Corpus Linguistics.
Sketch Engine corpus query tool for Japanese and its possible applications
SRDANOVIĆ ERJAVEC Irena, NISHINA Kikuko
Tokyo Institute of Technology
Keywords: Sketch Engine, corpus linguistics, lexicography, second language learning, collocations
Abstract
Although corpus-based language research has developed rapidly in recent years, existing resources remain limited in size, textual variety, and recency, and efficient, user-friendly corpus query tools are still scarce. The same holds for Japanese corpus linguistics, which is one of the main reasons for the recent rise in projects constructing Japanese corpus resources. In this paper, we present a method for extracting linguistic information from corpora using the Sketch Engine corpus query tool, which has recently been extended to the Japanese language. The Japanese version is based on a 400-million-word Japanese Web corpus, linguistically annotated by the morphological analyzer ChaSen, and on a Japanese grammatical relations file. The tool offers efficient and user-friendly ways of extracting concise linguistic data about words: their grammatical and collocational behavior, thesaurus-like information, and differences in usage between similar words. We explain, through examples, how the tool could be used in corpus lexicography, linguistic research, and computer-assisted language learning of Japanese. The investigation part of the article concentrates mainly on how the tool could be applied within the dictionary creation process, and the results illustrate how each of the tool's functions can contribute to that process.
