The MORPH2 new version: A robust morphological analyzer for Arabic texts
Nouha Chaâben Kammoun 1, Lamia Hadrich Belguith 1, Abdelmajid Ben Hamadou 2
MIRACL Laboratory
1 FSEGS B.P Sfax Tunisia
2 ISIMS B.P Sakiet-Ezzit Sfax Tunisia

Abstract

In this paper we describe a new version of MORPH2, a morphological analyzer for Arabic texts. The aim of this work is to address the problems observed when evaluating the previous version of MORPH2. We present the morphological analysis method used to build this analyzer, focusing on the new step (vocalization and validation) added to our method. This step allows our analyzer to provide fully vocalized outputs and to filter out noisy solutions through a validation process. We also present the structure of the lexicon used by our method. It is an XML lexicon that makes morphological analysis more efficient, and it is organized so that it can be used not only for morphological analysis but also for other levels of linguistic analysis. The new version of MORPH2 has been evaluated on an Arabic corpus. The recall and precision obtained are 89.77% and 82.51% respectively. We noticed that the major causes of failure are the non-detection of relation nouns and primitive nouns.

Keywords: Arabic language, morphological analysis, vocalization

1. Introduction

Morphological analysis is one of the crucial parts of Natural Language Processing (NLP). Its aim is to recognize the composition of words and to provide specific morphological information about them. Such information is useful in most generic NLP applications, such as text analysis, error correction, parsing, machine translation and automatic summarization. A robust morphological analyzer is therefore needed. In this paper, we describe a new version of the Arabic morphological analyzer MORPH2 (Belguith and Chaâben, 2006).
The aim of this work is to address the problems observed when evaluating the previous version of MORPH2. We first give a brief description of Arabic morphological problems, followed by an overview of the state of the art in morphological analysis. The next section describes our morphological analysis method. We focus especially on the new step (vocalization and validation) added to our method. This step allows our analyzer to provide fully vocalized outputs and to filter out noisy solutions through a validation process. We then present the structure of the lexicon. We propose an XML lexicon that makes morphological analysis more efficient; it is organized so that it can be used not only for morphological analysis but also for other levels of linguistic analysis. An example of analysis is then presented, with a brief description of the MORPH2 interface. We also present some generic applications in which MORPH2 has been embedded. Finally, we report the evaluation of our morphological analyzer on an Arabic corpus.

JADT 2010: 10th International Conference on Statistical Analysis of Textual Data
2. Arabic morphological problems

Like all Semitic languages, Arabic is characterized by a complex morphology and a rich vocabulary. Arabic is a derivational, inflectional and agglutinative language. An Arabic word is the result of combining a triliteral or quadriliteral root with a specific schema, so many verbal and nominal lemmas can be derived from a single Arabic root. In addition, many inflected forms can be produced from a verbal or nominal lemma, indicating variations in tense (for verbs), in case (for nouns), in gender (for both), etc. Agglutination is another specific phenomenon: in Arabic, articles, prepositions, pronouns, etc. can be affixed to the adjectives, nouns, verbs and particles to which they are related.

The derivational, inflectional and agglutinative aspects of Arabic raise prominent challenges in NLP, and many morphological ambiguities have to be solved when dealing with the Arabic language. In fact, many Arabic words are homographic: they have the same orthographic form, though their pronunciation is different (Attia, 2006). In most cases, these homographs are due to the non-vocalization of words, which means that a full vocalization of words can resolve these ambiguities. We present, in what follows, some of these homographs.

Homographs due to agglutination: the affixation of clitics to a word can produce a form that is homographic with another full-form word. Example: the form فصل (fsl):
فَ + صِلْ fa + Silo (then + link)
فَصَلَ fasala (he dismissed)
فَصْلٌ Fasolun (season)
...

Homographs due to the absence of the character chadda: the presence of a chadda inside the word gives a different sense from the one without chadda. Example: the form يعقد (yeqd):
يَعْقِدُ yaeoqidu (he ...)
يُعَقِّدُ yueaq~idu (he complicates)
...

Homographs due to the absence of short vowels: in Arabic, some conjugated verbs or inflected nouns can have the same orthographic form. Adding short vowels to those words differentiates them (in pronunciation). Example: the form قبل (qbl):
قَبِلَ qabila (he accepted)
قُبِلَ qubila (he was accepted)
قَبْلَ qabola (before)
...
Homographs due to some homographic prefixes. Example: the form تقبل (tqbal):
تَقْبَلُ taqobalu (she accepts) (you accept)

3. State of the art

As presented in the previous section, the morphology of Arabic, like that of all other Semitic languages, is rich and highly complex. The well-known root-and-pattern model used in word formation and the Arabic writing system pose many challenges to computational NLP systems. Arabic computational morphology began in the mid-1980s with Beesley, who used the two-level morphology method. In that period, the lack of resources was one of the most challenging problems in computational morphology. Today, with the appearance of a few resources and tools for Arabic (even if most of them are not free), much research in computational morphology has been carried out.

An overview of the state of the art of Arabic computational morphology is presented in the book edited by Soudi et al. (2007). They consider two main approaches to Arabic computational morphology: the knowledge-based approach and the empirical approach.

Knowledge-based methods rely on a large amount of linguistic information. They involve a good lexicon representation and some computing techniques to perform morphological analysis. Depending on the lexicon representation, a large variety of methods belongs to the knowledge-based approach. Soudi et al. classify those methods as follows:

Syllable-based morphology: this class of methods considers syllables to be the primary concept in morphological description.

Root-and-pattern morphology: as proposed by McCarthy (McCarthy, 1981), stems are formed by a derivational combination of a root morpheme and a vowel melody. The projection of a root on patterns allows stem formation.

Lexeme-based morphology: lexeme-based morphology supports the claim that the stem is the only morphologically relevant form of a lexeme (i.e. inflectional or derivational morphemes such as suffixes, prefixes, infixes and reduplication are not themselves grammatical elements).

Stem-based Arabic lexicon with Grammar and Lexis Specifications: stem-based methods reduce the complexity of Arabic word structure, eliminate large numbers of lexical gaps, and make it possible to associate relevant and specific morphological, syntactic and semantic features with each lexicon entry.

A few morphological analyzers for Arabic have been developed using knowledge-based methods, such as the Buckwalter Arabic morphological analyzer (Buckwalter, 2004), the Xerox two-level morphology system (Beesley, 2001), the Sebawai system for shallow parsing (Darwish, 2002), the morphological analyzer of Abouenour et al. (Abouenour et al., 2008) and the Attia morphological analyzer (Attia, 2006).

Empirical methods employ machine learning techniques to extract linguistic knowledge directly from natural language data. The aim of these methods is to learn how to weigh alternative solutions and how to predict useful information for unknown entities through rigorous statistical analysis of the data.
Supervised, unsupervised and semi-supervised methods have been used for Arabic morphology (Clark, 2007). The emergence of empirical methods in Arabic morphology is due to the availability of Arabic corpora (e.g. GigaWord, Arabic TreeBank) that can be used by learning techniques. A few Arabic morphological analyzers based on those methods have been developed ((Clark, 2003); (El Jihad and Yousfi, 2005); (Boudlal et al., 2008)).

4. Proposed method

MORPH2 is based on a knowledge-based computational method (Belguith and Chaâben, 2006). Our morphological analysis method involves five steps (see Figure 1). In this section, we briefly describe the principle of this method. We focus especially on the new step (vocalization and validation) added to our method.

Figure 1. Proposed morphological analysis method. An Arabic text (e.g. عندما استيقظ الولد الصغير في صباح ذلك اليوم) goes through Step 1 (Tokenization), Step 2 (Preprocessing: particles, clitics, proper nouns), Step 3 (Affixal analysis), Step 4 (Morphological analysis) and Step 5 (Vocalization & validation process) to produce a tagged text. Resources shown: clitic lexicon, root lexicon, correspondence matrix, validation rules.

4.1. Principle of the method

As input, our method accepts an Arabic text, a sentence or a word (each word can be non-vocalized, partially vocalized or fully vocalized). A tokenization process is applied in the first step. In the second step, the method determines the possible clitics of each word. An affixal analysis is then applied to determine all possible affixes and roots. The fourth step consists of extracting the morpho-syntactic features according to the valid affixes and the root. Finally, a vocalization step has been added to our morphological analysis method.

The tokenization process consists of two sub-steps. First, the text is divided into sentences using the STAr system, an Arabic text tokenizer based on contextual exploration of punctuation marks, coordinating conjunctions, etc. (Belguith et al., 2005). The second sub-step detects the individual words within each sentence.

The morphological preprocessing step aims to extract the clitics agglutinated to the word. A filtering process is then applied to check whether the remaining word is a particle, a number, a date or a proper noun.

The Arabic language has specific structural properties. A vocabulary word consists of a triliteral or quadriliteral root to which an affixal combination (consisting of a prefix, one or two infixes and a suffix) is added. Affixal analysis aims to identify the basic elements that make up a word, namely the root and the affixes (i.e. prefix (P), infix (I) and suffix (S)). This process is carried out in the following five stages (Ben Hamadou, 1993): (P, S) couple identification, candidate affixal triad identification, lexical filtering, ((R), (P, I, S)) association control and transformation recognition. From one stage to the next, a filtering mechanism removes noisy decompositions.

The morphological analysis step consists of determining, from the (R, P, I, S) form obtained for each word, all possible morpho-syntactic features (i.e. POS, gender, number, tense, person, etc.). The detection of morpho-syntactic features is carried out in three stages (Belguith, 1999). The first stage identifies the part of speech (POS) of the word (i.e. verb, noun or particle). The second stage extracts a list of morpho-syntactic features for each POS. The third stage filters the feature lists.

4.2. Vocalization and Validation step

The aim of the new version of MORPH2 is to deal with the problems encountered in the old version. It has also been extended to handle fully vocalized texts in addition to non-vocalized Arabic texts.
Thus, we added a new step to our morphological analysis method, called vocalization and validation. A vocalization process is applied using the morpho-syntactic features and the (R, P, I, S) decomposition. Each handled word is fully vocalized according to the morpho-syntactic features determined in the previous step. To vocalize a word, we interdigitate the root extracted at the affixal analysis level with the corresponding fully vocalized schema (see the example below).

Example: Let the word تقابلتم (tqabltm) be analyzed. Figure 2 describes the vocalization process. Through the analysis of this word, the (R, P, I, S) association detected in the affixal analysis step is (قبل, ت, ا, تم). The affixal triad (ت, ا, تم) has only one corresponding vocalization schema, تَفَاعَلْتُمْ (tafaaealotumo). The interdigitation of this schema with the detected root قبل therefore gives the fully vocalized word تَقَابَلْتُمْ (taqaabalotumo / you have been contrasted).

Figure 2. Vocalization process of the word تقابلتم (tqabltm): R: قبل, P: ت, I: ا, S: تم; schema تَفَاعَلْتُمْ; output تَقَابَلْتُمْ (taqaabalotumo).
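The interdigitation described above can be pictured as a letter substitution. The following Python sketch is only an illustration and not the MORPH2 implementation: the function names `interdigitate`, `strip_diacritics` and `is_diacritic` are hypothetical, and the sketch assumes that a schema marks the root slots with the conventional letters ف, ع, ل in reading order and that no affix letter of the schema coincides with a slot letter that is still waiting to be filled.

```python
# Illustrative sketch only; names and conventions are assumptions,
# not taken from the MORPH2 code base.


def is_diacritic(ch: str) -> bool:
    """True for Arabic tanwin, short vowels, shadda and sukun (U+064B..U+0652)."""
    return "\u064b" <= ch <= "\u0652"


def strip_diacritics(word: str) -> str:
    """Return the word with its diacritics removed."""
    return "".join(ch for ch in word if not is_diacritic(ch))


def interdigitate(root: str, schema: str) -> str:
    """Fill the slots of a schema with the letters of a root.

    Assumption: slots are written with the conventional letters
    ف ع ل (triliteral) or ف ع ل ل (quadriliteral) in reading order.
    Every other character (affix letters, diacritics) is copied unchanged.
    """
    slots = "فعل" if len(root) == 3 else "فعلل"
    out = []
    filled = 0
    for ch in schema:
        if filled < len(root) and ch == slots[filled]:
            out.append(root[filled])  # replace the slot by the root letter
            filled += 1
        else:
            out.append(ch)
    if filled != len(root):
        raise ValueError("schema does not contain all root slots")
    return "".join(out)


# The example from the text: root قبل with the schema tafaaealotumo.
SCHEMA = "تَفَاعَلْتُمْ"
vocalized = interdigitate("قبل", SCHEMA)
print(vocalized)
print(strip_diacritics(vocalized))
```

Because diacritics are copied from the schema untouched, the same function works for vocalized and non-vocalized schemas. Orthographic alternations are deliberately not handled in this sketch.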
Moreover, in Arabic, different orthographic alternation operations can apply to the combination of a root and a schema, especially when the hamza or a long vowel belongs to the root. These orthographic alternation operations include transformation, omission and assimilation. That is why a validation process is needed. The validation process consists of handling the transformation, omission and assimilation operations and filtering the resulting outputs.

The transformation phenomenon applies when a root letter has to change while the root is combined with a schema or while a word is conjugated.

Example: Let the word ازدهرت (Azdhrt / she prospered) be analyzed. Through the analysis of this word, the (R, P, I, S) association detected is (زهر, ا, ت, ت).

Figure 3. Vocalization and validation process of the word ازدهرت: R: زهر, P: ا, I: ت, S: ت; schema افتعلت; interdigitated form ازْتَهَرَتْ (Azotaharato); after the validation rule, ازْدَهَرَتْ (Azodaharato).

The omission phenomenon consists of deleting one or two letters while the root is combined with a schema.

Example: Let the word يقف (yqf / he stands up) be analyzed. Through the analysis of this word, the (R, P, I, S) association detected is (وقف, ي, -, -).

Figure 4. Vocalization and validation process of the word يقف: R: وقف, P: ي, I: -, S: -; schema يفعل; interdigitated form يَوْقِفُ (yawoqifu); after the validation rule, يَقِفُ (yaqifu).
The assimilation phenomenon consists of replacing, in some cases, two successive identical consonants by a single consonant bearing the chadda.

Example: Let the word تمدّان (tmd~an / they expand) be analyzed. Through the analysis of this word, the (R, P, I, S) association detected is (مدد, ت, -, ان).

Figure 5. Vocalization and validation process of the word تمدّان: R: مدد, P: ت, I: -, S: ان; schema تفعلان; interdigitated form tamodadaani; after the validation rule, tamud~aani.

Since the new version of our morphological analyzer must deal with partially, fully or non-vocalized words, the filtering process is used to make the output results more precise. The filter compares each solution with the diacritics present in the input form: if, at some position, the diacritic in a solution provided by the analysis differs from the diacritic in the input form, that solution is deleted from the list of solutions.

5. MORPH2 System

In this section, we present the new version of the MORPH2 Arabic morphological analyzer. MORPH2 is based on the morphological analysis method presented in section 4. The method has been implemented in an object-oriented framework, using the Java programming language, and relies on a reduced lexicon and a set of linguistic rules.

5.1. MORPH2 lexicon

At each stage of analysis, MORPH2 uses a set of data required for processing: the lexicon of proclitics, enclitics and particles in the preprocessing step; the lexicon of affixal triads and roots in the affixal analysis stage; and the lexicon of derived and primitive nouns together with some correspondence tables. Since lexicon organization is a basic step in any morphological analysis application, we tried to organize our lexicon so that it improves the performance of morphological analysis. MORPH2 is based on an XML lexicon. It contains 5754 triliteral and quadriliteral roots, with which the corresponding verbal and nominal schemas are associated (see Figure 6).
The combination of roots and verbal schemas provides verbal stems. The combination of roots and nominal schemas provides nominal stems. A list of computing rules is also stored in the lexicon (see Figure 6). Interdigitating the verbal schemas with the corresponding rules in this list allows calculated nouns to be detected.
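To make the association between roots and schemas more concrete, here is a minimal Python sketch of how such an XML lexicon could be read and how verbal stems could be generated from it. The element and attribute names (`lexicon`, `root`, `value`, `verbal_schemas`, `schema`) and the helper names are assumptions made for this illustration; the actual MORPH2 lexicon format is the one shown in the paper's figures and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon fragment: tag and attribute names are assumptions
# made for this sketch, not the actual MORPH2 XML format.
LEXICON_XML = """
<lexicon>
  <root value="عمل">
    <verbal_schemas>
      <schema>فعل</schema>
      <schema>فاعل</schema>
      <schema>أفعل</schema>
      <schema>تفاعل</schema>
      <schema>استفعل</schema>
    </verbal_schemas>
  </root>
</lexicon>
"""


def fill_schema(root: str, schema: str) -> str:
    """Replace the slot letters ف ع ل of a schema, in order, by the root letters."""
    slots = "فعل" if len(root) == 3 else "فعلل"
    out, filled = [], 0
    for ch in schema:
        if filled < len(root) and ch == slots[filled]:
            out.append(root[filled])
            filled += 1
        else:
            out.append(ch)
    return "".join(out)


def verbal_stems(xml_text: str) -> dict:
    """Map each root of the lexicon to the stems produced by its verbal schemas."""
    stems = {}
    for root_node in ET.fromstring(xml_text.strip()).iter("root"):
        root = root_node.get("value")
        schemas = [s.text for s in root_node.findall("./verbal_schemas/schema")]
        stems[root] = [fill_schema(root, s) for s in schemas]
    return stems


stems = verbal_stems(LEXICON_XML)
for root, forms in stems.items():
    print(root, forms)
```

Storing the schemas under each root node, as assumed here, keeps only valid root and schema pairs in the lexicon, so stems never need to be listed explicitly.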
Figure 6. Roots lexicon and association rules

Figure 7. Nominal schema inside a root node

In Figure 6, we present the structure of the triliteral root عمل (Eml). This root can be combined with the verbal schemas فعل, فعّل, فاعل, أفعل, تفاعل and استفعل to produce, respectively, the verbal stems عمل, عمّل, عامل, أعمل, تعامل and استعمل. The combination of nominal schemas (see Figure 7) with the root is made in the same manner.

Calculated nouns result from the combination of a verbal schema with a corresponding computing rule. Consider the verbal schema أفعل. In Arabic, a list of possible nouns can be derived from each verbal stem whose schema is أفعل. So, for each root that can be combined with the verbal schema أفعل, a list of 8 nominal stems can be computed automatically. Since 1704 roots can be combined with the schema أفعل, 1704 × 8 = 13632 nominal stems were calculated automatically.

5.2. MORPH2 Interface

As input, the system accepts an Arabic word, sentence or text. Each word in this input can be non-vocalized, partially vocalized or fully vocalized. As output, an XML file is provided. This file contains all the information extracted by MORPH2. To make the result easy for the user to understand, the system displays it using an XSL stylesheet. The XML file is also useful as an interchange format when the analyzer is embedded inside more generic NLP applications. We present here an example of analysis. As a result of analyzing the word أفسلّمتكها (Ofsl~mtkhA / did I then deliver her to you) with MORPH2, we get the screen presented in Figure 8.
Figure 8. MORPH2's interface

6. MORPH2 embedded in NLP applications

Thanks to its object-oriented implementation and its XML lexicon representation, MORPH2 has been easily incorporated into some larger applications for Arabic NLP. The system has been embedded in the following generic applications:

DECORA (a system for agreement error detection and correction in non-vocalized Arabic texts (Boujelben et al., 2008)): DECORA is able to detect and correct agreement errors in gender, number, person, determination, human/non-human semantic features and tense. Its detection and correction process is based on three phases: morphosyntactic analysis, syntagmatic analysis for error detection, and multi-criteria analysis for error correction.

MASPAR (a multi-agent system for parsing Arabic (Belguith et al., 2008)): MASPAR is a multi-agent system for parsing non-vocalized Arabic texts. It provides as output the syntactic structure of sentences. In case of failure, it is able to extract partial structures. The system is based on a multi-agent architecture. MORPH2 was incorporated as an agent collaborating with the other agents (i.e. agent Tokenization, agent Syntax, agent Ellipse and agent Anaphora). The STAr system (Belguith et al., 2005) is used as agent Tokenization; it decomposes an Arabic text into sentences. As input, the agent Syntax receives from the agent Morphology (i.e. MORPH2) a list of all possible morpho-syntactic features for each word in the sentence. The objective of the agent Syntax is to find the right parse of a sentence using an HPSG grammar (Head-driven Phrase Structure Grammar) (Bahou et al., 2006).

QASAL (a Question Answering System for Arabic Language (Brini et al., 2009)): QASAL uses techniques from IR and NLP to process a collection of Arabic documents. The system is able to process factoid and definition questions. It accepts as input an Arabic question written in MSA (Modern Standard Arabic) and generates as output the most relevant passages, which are likely to contain the candidate answers. It is composed of three main modules: a question analysis module, a passage retrieval module and an answer extraction module.

The anaphora annotating system of Mezghani et al. (2008): the goal of this system is to produce an annotated resource that can be used for automatic anaphora resolution in Arabic. It uses an XML-based annotation scheme. To speed up the identification of anaphoric entities by the human annotator, a module that automatically detects pronouns is integrated. This module is based on the morphological analyzer MORPH2 and a set of syntactic patterns used to identify each pronoun in the text.

7. MORPH2 Evaluation

The old version of MORPH2 was evaluated on a non-vocalized Arabic corpus. The recall and precision obtained were 69.77% and 68.51% respectively (Belguith and Chaâben, 2006). We noticed that the transformation and the omission of either long vowels or the letter hamza were the major causes of failure. The new version of MORPH2 takes all these transformations and omissions into account. It has also been extended to handle not only non-vocalized Arabic texts, but also partially and fully vocalized texts.
It is therefore expected to improve the recall and precision values. To evaluate the new version of MORPH2, we use the same corpus as for the old version, but keep any diacritics it contains. The corpus consists of a collection of various Arabic texts. It contains about […] words. The aim of our morphological analyzer is to provide all possible morpho-syntactic features for each word without taking into account the context in which it occurs. That is why we preprocess our corpus to keep only distinct words. We consider two words distinct when their forms inside the corpus are different (e.g. عل م, عل مه and عل مه).

The evaluation consists of calculating, for each word in the corpus, its recall and precision measures, taking into account all its possible decompositions. The average of those measures is then calculated. We obtain 89.77% for the recall measure and 82.51% for the precision measure. Failure cases are mostly due to the non-detection of relation nouns (اسم النسبة) and primitive (non-derived) nouns. At this stage, MORPH2 is not able to handle relation nouns (e.g. ثقافيّ (vqafy~ / cultural), استقلاليّة (AstqlAly~p / independent), etc.).
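The averaging procedure described above can be sketched as follows. This Python example is a hedged illustration: the paper does not give its exact formulas, so the sketch assumes the usual set-based definitions (recall is the share of reference analyses that the analyzer returns, precision is the share of returned analyses that are in the reference), and the function names, the analysis labels and the toy data are all hypothetical.

```python
from statistics import mean


def word_scores(reference: set, returned: set) -> tuple:
    """Recall and precision for one word (assumed set-based definitions)."""
    correct = len(reference & returned)
    recall = correct / len(reference) if reference else 0.0
    precision = correct / len(returned) if returned else 0.0
    return recall, precision


def corpus_scores(items: list) -> tuple:
    """Average the per-word recall and precision over all distinct words."""
    scores = [word_scores(ref, ret) for ref, ret in items]
    return mean(r for r, _ in scores), mean(p for _, p in scores)


# Toy data: (reference analyses, analyses returned by the analyzer) per word.
# The labels are placeholders, not real MORPH2 output.
toy_corpus = [
    ({"analysis_a", "analysis_b"}, {"analysis_a", "analysis_b", "analysis_c"}),
    ({"analysis_x", "analysis_y"}, {"analysis_x"}),
]

recall, precision = corpus_scores(toy_corpus)
print(f"recall = {recall:.2%}, precision = {precision:.2%}")
```

Averaging per word (a macro-average) gives every distinct word the same weight, which matches the decision to keep only distinct words in the evaluation corpus.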
8. Conclusion and perspectives

In this paper, we have outlined some problems of computational Arabic morphology. We then presented our morphological analysis method, a knowledge-based method that aims to solve most cases of morphological ambiguity. We also presented the new version of our Arabic morphological analyzer MORPH2, which is based on this method. MORPH2 is implemented in an object-oriented framework using the Java programming language. It is based on a reduced XML lexicon, organized so that it can be used not only for morphological analysis but also for other levels of linguistic analysis. MORPH2 has been evaluated on an Arabic corpus of […] words. The results obtained are very encouraging (recall = 89.77%; precision = 82.51%). Failure cases are mostly due to the non-detection of relation nouns (اسم النسبة) and primitive (non-derived) nouns.

As future work, we plan to deal with the problems encountered during the evaluation of MORPH2. We intend to extend our lexicon to cover more primitive nouns and relation nouns. We also intend to add a POS tagging step in order to take into account the context in which a word occurs. This step would allow our system to filter the output lists of morpho-syntactic features.

References

Abbès R., Dichy J. and Hassoun M. (2004). The Architecture of a Standard Arabic lexical database: some figures, ratios and categories from the DIINAR.1 source program. In Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, COLING 2004, University of Geneva, 28th August 2004.

Abouenour L., El Hassani S., Yazidy T., Bouzouba K. and Hamdani A. (2008). Building an Arabic Morphological Analyzer as part of an Open Arabic NLP Platform. In Workshop on HLT and NLP within the Arabic world: Arabic Language and local languages processing Status Updates and Prospects, at the 6th Language Resources and Evaluation Conference (LREC 08), Marrakech, Morocco, May.

Attia M. (2006). An Ambiguity-Controlled Morphological Analyzer for Modern Standard Arabic Modelling Finite State Networks. In The Challenge of Arabic for NLP/MT Conference, the British Computer Society Conference, London, October 2006.

Bahou Y., Belguith Hadrich L., Aloulou L. and Ben Hamadou A. (2006). Adaptation et implémentation des grammaires HPSG pour l'analyse de textes arabes non voyellés. In Actes du 15e congrès francophone AFRIF-AFIA Reconnaissance des Formes et Intelligence Artificielle (RFIA 06), Tours, France, janvier.

Beesley K. (2001). Finite-State Morphological Analysis and Generation of Arabic at Xerox Research: Status and Plans. In Proceedings of the Arabic Language Processing: Status and Prospect, 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France.

Belguith Hadrich L., Aloulou C. and Ben Hamadou A. (2008). MASPAR : De la segmentation à l'analyse syntaxique de textes arabes. In Revue Information Interaction Intelligence I3, vol 7, N 2.

Belguith Hadrich L. and Chaâben N. (2006). Analyse et désambiguïsation morphologiques des textes arabes non voyellés. In Actes de la 13ème édition de la conférence sur le Traitement Automatique des Langues Naturelles (TALN 2006), Belgique.

Belguith Hadrich L., Baccour L. and Mourad G. (2005). Segmentation des textes arabes basée sur l'analyse contextuelle des signes de ponctuations et de certaines particules. In Actes de la 12ème Conférence annuelle sur le Traitement Automatique des Langues Naturelles (TALN 2005).

Belguith Hadrich L. (1999). Traitement des erreurs d'accord de l'Arabe basé sur une analyse syntagmatique étendue pour la vérification et une analyse multicritères pour la correction. Thèse de doctorat en informatique, Faculté des Sciences de Tunis.

Ben Hamadou A. (1993). Vérification et correction automatique par analyse affixale des textes écrits en langage naturel : le cas de l'Arabe non voyellé. Thèse d'Etat en informatique, Faculté des sciences de Tunis.

Boudlal A., Belahbib R., Lakhouaja A., Mazroui A., Meziane A. and Ould Abdallahi Ould Bebah M. (2008). A Markovian Approach for Arabic Root Extraction. In The International Arab Conference on Information Technology (ACIT 2008), Hammamet, December 16-18.

Boujelben M., Aloulou C. and Belguith Hadrich L. (2008). Toward a detection/correction system for the agreement errors in non-voweled Arabic texts. In The International Arab Conference on Information Technology (ACIT 2008), Hammamet, December 16-18.

Brini W., Ellouze M., Mesfar S. and Belguith Hadrich L. (2009). An Arabic Question-Answering system for factoid questions. In IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE 09), Dalian, China, September.

Buckwalter T. (2004). Issues in Arabic Orthography and Morphology Analysis. In The Workshop on Computational Approaches to Arabic Script-based Languages, COLING 2004, Geneva.

Clark A. (2007). Supervised and Unsupervised Learning of Arabic Morphology. In Arabic Computational Morphology, A. Soudi, A. van den Bosch and G. Neumann (eds.), Springer, 2007.

Clark A. (2003). Combining Distributional and Morphological Information for Part of Speech Induction. In Proceedings of the tenth Annual Meeting of the European Association for Computational Linguistics, EACL 2003.

Darwish K. (2002). Building a Shallow Arabic Morphological Analyzer in One Day. In Actes du workshop Computational approaches to Semitic languages, 47-54.

Dichy J. (1997). Pour une lexicomatique de l'arabe : l'unité lexicale simple et l'inventaire fini des spécificateurs du domaine du mot. Meta 42, printemps 1997, Québec, Presses de l'Université de Montréal.

El Jihad A., Yousfi A. (2005). Etiquetage morpho-syntaxique des textes arabes par modèle de Markov caché. Rencontre des Etudiants Chercheurs en Informatique pour le Traitement Automatique des Langues RECITAL 05, Dourdan, Juin 2005.

Hammami Mezghani S., Belguith Hadrich L. and Ben Hamadou A. (2008). Anaphora in Arabic Language: Developing a corpora annotating tool for anaphoric links. In The International Arab Conference on Information Technology (ACIT 2008), Hammamet, December.

McCarthy J. (1981). A prosodic theory of nonconcatenative morphology. Linguistic Inquiry, 12.

Ouersighni R. (2002). L'analyse morpho-syntaxique de l'Arabe voyellé ou non voyellé : Le système AraParse. In Actes de l'assemblée internationale du traitement automatique de la langue arabe.

Soudi A., Neumann G. and Van den Bosch A. (2007). Arabic Computational Morphology: Knowledge-based and Empirical Methods. In Arabic Computational Morphology, A. Soudi, A. van den Bosch and G. Neumann (eds.), Springer.

Zaafrani R. (2004). Un dictionnaire électronique pour apprenant de l'arabe (langue seconde) basé sur corpus. In Actes de la 11ème Conférence annuelle sur le Traitement Automatique des Langues Naturelles (TALN 2004).
