Arabic Language Resources and Tools for Speech and Natural Language
Mansour Alghamdi*, Chafic Mokbel** and Mohamed Mrayati***
*King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia. **University of Balamand, Lebanon. ***UN-DESA.

Abstract

Although 5% of the world population speak Arabic as a native language, research on Arabic is not proportionate to its number of speakers. However, more research has been done on Arabic in the last decade due to many factors, including the spread of communication and information technology applications. Two of the eminent research centres in the Arab World that have contributed markedly to research related to Arabic are King Abdulaziz City for Science and Technology (KACST) and the University of Balamand (UOB). This paper highlights the efforts of these two institutions in enriching Arabic language resources, including those for natural language processing, speech processing and character recognition.

Introduction

Arabic speech and language processing has gained a lot of interest in the past decade. Several projects have been launched at the international level. DARPA has launched the GALE (Global Autonomous Language Exploitation) project (Cohen 2007). It started in 2005 and aims to transcribe and translate audio and written documents from Arabic or Mandarin Chinese into English. The target is to build, by 2010, devices capable of performing such transcription and translation in real time with an accuracy of 90% to 95%. Europe has also funded several projects related to the Arabic language. We cite the ALMA (AbiRached ), NEMLAR (Yaseen, 2006) and MEDAR (Maegaard 2008) projects. The NEMLAR project made it possible to map, especially in the Arab region, the existing resources, tools, technologies and actors in Arabic speech and language processing. This led to the definition of needs and availability of resources and tools for the Arabic language: the BLARK, or Basic Language Resources Kit (Maegaard 2006).
NEMLAR has also made it possible to define, collect and develop two databases: a broadcast news database and a text database (Yaseen 2006). Besides the projects, several international competitions are conducted on a yearly basis in order to assess and improve the technology for processing the Arabic language. The National Institute of Standards and Technology (NIST) organizes several evaluation competitions. In Arabic handwritten recognition, the ICDAR contest assesses and benchmarks the different technologies (Margner 2005). On the resources side, two institutions, ELRA and LDC, distribute useful resources for the development of Arabic speech and language processing technologies and applications. The projects, resources and tools developed for the Arabic language show the high interest in Arabic speech and language processing in the past decade. This paper focuses on the activities, resources and tools developed in two laboratories in the region, i.e. KACST and UOB, in relation to Arabic speech and language processing. It shows the major achievements in these areas.

Natural Language Processing

Natural language processing (NLP), sometimes called computational linguistics, is the area of knowledge where the text of a natural language such as Arabic is digitized and computed. It includes natural language understanding and generation. To do so, some of the linguistic functions of the human brain need to be simulated. These include spell checking, morphological analysis and generation, grammar checking, the lexicon and so on.

Morphological Analysis and Tagging

KACST has two projects at the morphological level. The first is the Arabic language morphological analyzer. This project aims to develop algorithms for morphologically analyzing Arabic vocabulary according to its morphological and grammatical properties. The algorithm depends on the morphological and grammatical characteristics that are extracted from a large linguistic corpus.
The system consists of text tokenizer tools, storage and retrieval of the grammatical and morphological characteristics, an entry system for the characteristics, and an expert learning system. The system can be used in analyzing and coding Arabic language properties, in grammatical and morphological analysis systems, in recognizing the grammatical and morphological properties of Arabic, and in statistical linguistic systems.

KACST has developed an Arabic stemmer. The system analyzes Arabic vocabulary, recognizes and isolates suffixes, prefixes and infixes, and then extracts the corresponding stem. The system depends on regular expressions; it serves users and developers of applications that support Arabic and helps expand research on Arabic sources. It consists of tools for dividing up texts, tools for analyzing regular expressions, and a group of more than 1400 regular expressions.

Arabic lexis morphological rules have been developed by KACST. The aim of this project is to create a database that includes all the morphological rules of Arabic lexis, leading to the production of the necessary morphological, syntactic and phonological algorithms associated with the generation of lexis through fully computerized systems. The database entries include lists
of the Arabic alphabet, word roots and their features. The entries also include morphological measures used in generating Arabic lexis, such as inert nouns and verbs, derivatives and infinitive forms of verbs and nouns. Morphological features of generated lexis, such as the plural and diminutive forms, relation and feminine cases, the tense and aspect of verbs, and the nominal forms generated from infinitive and derivative nouns, are also included in the database. This project is a base for Arabic language processing applications. These applications cover text generation and analysis, text compression and compilation, machine search, data coding, machine translation, machine teaching of Arabic and language understanding. The system includes new algorithms that facilitate the constant and regular generation of new lexis in Arabic and that collect and compile rare and infrequent forms in small groups.

Modern Arabic writing includes only the letters that represent the consonants. This means that Arabic vowels and geminates are not represented in everyday Arabic writing. The absence of the vocalic and geminate symbols prevents full use of other computational systems such as text-to-speech systems, automatic speech recognition systems and search engines. Therefore, KACST has started experiments on automatic Arabic diacritization to develop a system that can be integrated in other related computer systems. The result is the KACST Arabic Diacritizer (KAD). KAD contains algorithms that calculate the highest probability, algorithms that select the diacritic of the highest probability, and 68,378 quad-grams of Arabic letters and diacritics. The system accuracy is 87% at the letter level, including word-final letters. It diacritizes more than 500 words per second. It is also small in size, only 3 MB (Alghamdi and Muzaffar, 2007; Elshafei, et al. 2006).
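The idea of selecting the diacritic of highest probability from letter n-grams can be illustrated with a minimal Python sketch. It is not KAD's algorithm: the function names, the `#` padding symbol, the tiny training list and the choice to condition only on a window of up to four undiacritized letters are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Arabic diacritic code points: tanwin, short vowels, shadda, sukun.
DIACRITICS = set("\u064b\u064c\u064d\u064e\u064f\u0650\u0651\u0652")


def split_pairs(word):
    """Split a diacritized word into (letter, diacritics) pairs."""
    pairs = []
    for ch in word:
        if ch in DIACRITICS and pairs:
            letter, marks = pairs[-1]
            pairs[-1] = (letter, marks + ch)
        else:
            pairs.append((ch, ""))
    return pairs


def contexts(letters, n=4):
    """Return, for each position, the window of n letters ending there."""
    padded = "#" * (n - 1) + letters  # '#' is a hypothetical padding symbol
    return [padded[i:i + n] for i in range(len(letters))]


def train(words, n=4):
    """Count which diacritics follow each letter window in diacritized text."""
    counts = defaultdict(Counter)
    for word in words:
        pairs = split_pairs(word)
        letters = "".join(letter for letter, _ in pairs)
        for ctx, (_, marks) in zip(contexts(letters, n), pairs):
            counts[ctx][marks] += 1
    return counts


def diacritize(letters, counts, n=4):
    """Attach to each letter the most frequent diacritics seen for its window."""
    out = []
    for ctx, letter in zip(contexts(letters, n), letters):
        options = counts.get(ctx)
        marks = options.most_common(1)[0][0] if options else ""
        out.append(letter + marks)
    return "".join(out)


# Toy usage: one diacritized training word ("kataba").
model = train(["\u0643\u064e\u062a\u064e\u0628\u064e"])
restored = diacritize("\u0643\u062a\u0628", model)
```

A window that was never seen in training leaves the letter bare in this sketch; a practical system would back off to shorter windows and would need a large diacritized corpus.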
Transliteration of Arabic names has not been consistent because Arabic orthography is different from that of the Roman alphabet. As a result, the same name is Romanized in different ways. Such inconsistency has negative effects on individuals and institutions. To standardize the method of Romanizing Arabic names and make it available to others, KACST has developed the KACST Arabic Name Romanizer (KANR). KANR contains algorithms that process Arabic letters, algorithms that transliterate the Arabic letters into the Roman alphabet, algorithms that process Roman letters, and more than 50,000 Arabic names written in the diacritized Arabic orthography (Alsalman, et al. 2007).

Although the Arabic alphabet is sufficient for writing and reading Arabic, as is the case for other languages and their writing systems, it does not possess symbols for all Arabic sounds, whether they are part of the standard or dialect inventories. Therefore, Arabic speakers find it impossible to transcribe speech sounds using the Arabic alphabet. They often either describe the sound in words or use the International Phonetic Alphabet. For this reason, KACST has designed a phonetic alphabet based on the Arabic alphabet to represent all speech sounds for all languages. The new phonetic alphabet is called the Arabic International Phonetic Alphabet (AIPA). AIPA was further developed as two fonts that can be installed on a PC and used in any word processor (Alghamdi, 2006b).

Language Modelling

Statistical N-gram models are the state-of-the-art language models. An N-gram model defines the probability distribution of the vocabulary words in the context of the N-1 preceding words. The Arabic language is characterized by a rich morphology. Therefore, morphologically constrained N-grams (Kirchhoff 2002) have been proposed in order to reduce the complexity of the language models (number of parameters), thereby extending their generalization capabilities.
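As a minimal illustration of a word-level N-gram estimate, the Python sketch below computes maximum-likelihood probabilities from counts. The function names, the `<s>` and `</s>` boundary tokens and the transliterated toy sentences are assumptions for illustration; no smoothing is applied, so unseen contexts simply get probability 0.

```python
from collections import Counter


def train_ngram(sentences, n=2):
    """Count n-grams and their (n-1)-word contexts in whitespace-tokenized text."""
    ngram_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            ngram_counts[context + (tokens[i],)] += 1
            context_counts[context] += 1
    return ngram_counts, context_counts


def prob(word, context, ngram_counts, context_counts):
    """Maximum-likelihood estimate of Pr(word | context), 0.0 if context unseen."""
    context = tuple(context)
    total = context_counts[context]
    if total == 0:
        return 0.0
    return ngram_counts[context + (word,)] / total


# Toy corpus of transliterated sentences (hypothetical data).
corpus = [
    "qaraa alwalad alkitab",
    "qaraa alwalad aldars",
    "kataba alwalad aldars",
]
ngrams, ctxs = train_ngram(corpus, n=2)
p = prob("aldars", ["alwalad"], ngrams, ctxs)  # 2 of the 3 continuations
```

Every distinct surface form is a separate vocabulary entry in such a model, which is why a morphologically rich language inflates the number of parameters.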
In (Ghaoui 2005) a novel approach to the morphologically constrained N-gram has been proposed. It consists in considering the word as a couple (stem, rule), the stem being the root of the word and the rule being the grammatical rule used to derive the word from the stem. Writing s_i for the stem and r_i for the rule of word w_i:

\[ \Pr(w_i \mid w_{i-1} \dots w_{i-N+1}) = \Pr\left[(s_i, r_i) \mid (s_{i-1}, r_{i-1}) \dots (s_{i-N+1}, r_{i-N+1})\right] \]

By decomposition and simplification, several models may be derived. For example, one can make independence hypotheses leading to:

\[ \Pr(w_i \mid w_{i-1} \dots w_{i-N+1}) = \Pr\left[s_i \mid s_{i-1} \dots s_{i-N+1}\right] \, \Pr\left[r_i \mid s_i, r_{i-1} \dots r_{i-N+1}\right] \]

This models the probability of a word given its context as the product of the probability of the stem given the preceding stems and the probability of the rule given the previous rules and the current stem. Whatever simplified model is applied, the method requires a good morphological analyzer in order to decompose the words into their corresponding stems and rules. A simple automaton-based morphological analyzer has been built at the University of Balamand and was used in this morphologically constrained N-gram. The resulting model has shown limited degradation of the perplexity with a significant reduction of the number of parameters.

Speech Processing

Speech processing covers a set of technologies including speech coding, text-to-speech, speaker recognition and speech recognition. While speech coding techniques are universal, with the exception of very low rate speech coding, the other technologies depend highly on the language. In this section, several technologies and technology tools developed in the UOB and KACST laboratories are described.

Speech Recognition

Arabic speech recognition has been a major activity of interest, as mentioned earlier. This was also the case at the University of Balamand. The activities cover the following directions:

- Arabic broadcast news transcription
- Arabic telephone commands
- Multilingual speech recognition

Systems have been developed in the three areas.
In order to build those systems, both resources and tools are needed. Speech resources must be annotated. First, for the telephone commands, a Polyphone-like database has been collected. This database contains commands, dates, numbers and small expressions uttered by 25 male speakers and 30 female speakers, all between 19 and 25 years old. The vocabulary is in two languages: Arabic and
French (Bayeh 2004). In the end, around 1100 sentences were validated for each language. This database has been used in telephone command and multilingual speech recognition experiments. It is called BEAF, for Balamand ENST Arabic and French database.

State-of-the-art speech recognition systems make use of Hidden Markov Models (HMM) to represent the speech units. At the University of Balamand an HMM toolkit has been built. This toolkit is called HCM. It includes tools to compile HMM models, initialize and train those models, and use those models in speech recognition experiments. HCM has a generic structure. Every tool has a configuration file. A BNF language with C-like syntax is used to write the configuration parameters. HCM has been used successfully and has shown state-of-the-art results in experiments ranging from limited-vocabulary isolated words to large-vocabulary continuous speech recognition. Experiments have been conducted on the BEAF database using the HCM toolkit. The baseline speech recognition results are at the state of the art (i.e. an error rate below 2%).

For broadcast news transcription, speech resources and tools are also needed. The NEMLAR resources (Yaseen 2006) have been used in the experiments conducted at the University of Balamand. This NEMLAR database is a broadcast news database of about 40 hours. State-of-the-art performance is also achieved for broadcast news transcription (Bayeh 2008), i.e. 86% accuracy. It is worth noting that a Classification And Regression Tree (CART) algorithm has been developed and used to classify the triphone models in the system.

Experiments have been conducted in multilingual speech recognition. Phonetic models trained on the French language have been used to build Arabic speech recognition models. For this purpose, an association between unit models from both languages has to be found. This can be done manually by a human expert or automatically with a data-driven approach.
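A data-driven association can be pictured as choosing, for each target-language unit, the source-language model that gives the highest likelihood to the small amount of target data. The Python sketch below is only an illustration under strong assumptions: each unit model is a one-dimensional Gaussian standing in for an HMM over acoustic feature vectors, and the unit names and sample values are hypothetical. It is not the procedure of (Bayeh 2004).

```python
import math


def log_likelihood(samples, mean, var):
    """Log-likelihood of scalar samples under a Gaussian N(mean, var)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in samples
    )


def associate(target_data, source_models):
    """Map each target unit to the source model that best explains its data."""
    mapping = {}
    for unit, samples in target_data.items():
        mapping[unit] = max(
            source_models,
            key=lambda name: log_likelihood(samples, *source_models[name]),
        )
    return mapping


# Hypothetical source-language (French) unit models: name -> (mean, variance).
french_models = {"fr_a": (0.0, 1.0), "fr_i": (5.0, 1.0)}
# A few hypothetical feature samples per target-language (Arabic) unit.
arabic_data = {"ar_a": [0.2, -0.1, 0.4], "ar_i": [4.8, 5.3]}

mapping = associate(arabic_data, french_models)
```

Because the choice is driven by the likelihood of the target data itself, the mapping may differ from the one a phonetician would set by hand.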
The idea of the data-driven method is to find the optimal association, that is, the one that best represents the few available Arabic acoustic data using French unit models. For both telephone commands and broadcast news transcription, the UOB team has been able to show that better multilingual acoustic models may be obtained when a data-driven association is used between the acoustical units of the different languages than when the association is set by human experts. Although we have been able to build systems using locally developed tools, the resources used have limited size and scope. Larger resources exist but they are not easily accessible. Accessible resources would make it possible to better develop research in this area in the region.

Similar projects on speech have been carried out in KACST laboratories. The KACST team started in the mid-1990s to build a phonetic database on Arabic. By the turn of the century, the KACST Arabic Phonetic Database (KAPD) was available for researchers and research centres. The most sophisticated equipment was used to collect the database in order to provide data that can easily be utilized by researchers and those who are interested in speech and phonetics. KAPD contains more than 46,000 files, including 12,000 wave files for all Arabic sounds in different word positions, aerodynamic data, electropalatographic data, degree of nasalance, and images of the face, the vocal folds, the epiglottis and the velopharyngeal port. Seven native speakers of Arabic participated in the KAPD experiments (Alghamdi, 2003). KAPD is distributed to researchers and research centres without charge. It has been used in different text-to-speech (TTS) and speech-to-text (STT) systems.

KACST has collected another speech database called the Saudi Accented Arabic Voice Bank (SAAVB). It represents Arabic native speakers from all the cities of Saudi Arabia. SAAVB contains speech collected from 1033 speakers. It has 183,518 audio files in addition to their transcriptions.
The database has been licensed to IBM and is now integrated in its speech recognition engine. KACST has developed speech-related systems including TTS (Elshafei, et al. 2002), STT (Elshafei, et al. 2008; Alotaibi, et al. 2008; Alkanhal, et al. 2008) and speaker verification (Alkanhal, et al. 2007; Alghamdi, 2006a).

Speaker Recognition

State-of-the-art text-independent speaker recognition systems use Gaussian Mixture Models (GMM). For this purpose Becars, an open source software package, has been developed in a joint effort by the University of Balamand and the ENST. Becars is a GMM toolkit that makes it possible to compile, initialise, train and test GMM models in different applications. The strength of Becars is in the different adaptation techniques that are included. Currently, MAP, MLLR and unified adaptation (Mokbel, 2001) procedures are fully implemented. The Becars software is described in (Blouet 2004). Using Becars, the University of Balamand has participated in several NIST speaker recognition evaluations. Becars has also been used for the segmentation of the audio signal in broadcast news transcription applications. Such a segmentation system was entered in the ESTER evaluation. Speaker recognition systems may be used in standalone applications or within other applications, such as the segmentation of audio signals into different speakers. The speech resources needed to build speaker recognition systems and to benchmark those systems are lacking in the Arab region. Projects need to be launched in order to build such resources.

Handwritten Document Recognition

Arabic handwritten recognition is also a subject of great interest. The only freely available resource is the IFN/ENIT database, which was built in a joint effort between IFN (Germany) and ENIT (Tunisia). The database lexicon is formed of the names of Tunisian towns. It includes more than city word images written by around 411 writers. A biennial contest is organized by ICDAR (Margner 2005).
A baseline system has been developed based on HCM and has shown state-of-the-art results (El Hajj 2005). While the IFN/ENIT database is limited to isolated Tunisian town names, there is a need for annotated database resources with full text in order to develop handwritten document recognition. KACST has developed a system for cursive Arabic text recognition. The system decomposes the document image
into text line images and extracts a set of simple statistical features. The Hidden Markov Model Toolkit (HTK) is used to process the output. The system has been applied to a data corpus which includes Arabic text of more than 600 A4-size sheets typewritten in multiple computer-generated fonts (Khorsheed, 2002).

Conclusions and Perspectives

In this paper we have presented several resources and tools developed in the KACST and UOB laboratories. KACST has developed an Arabic stemmer and a database of morphological rules of Arabic lexis. In addition, an Arabic diacritizer, KAD, and an Arabic name romanizer, KANR, have been produced at KACST. At the University of Balamand a French-Arabic telephone commands database has been collected and is available upon request. Two speech databases have been collected by KACST: KAPD and SAAVB. A free open source GMM toolkit, Becars, is being distributed. A full HMM toolkit, HCM, has been developed and used in several Arabic speech recognition experiments and Arabic handwriting recognition experiments. The available resources developed in the region and the tools built and tested on those resources are at the state of the art. However, there is a need for larger resources and for the development of research activities on Arabic speech and language processing. This may be done through the organizations that are involved in such activities. Benchmarking, verification and evaluation would certainly help in this direction.

Acknowledgements

The authors would like to express their gratitude to KACST, UOB and UN-DESA for their support that made this paper possible.

Bibliographical References

AbiRached M. & Faby J.A. (2004), Communication and information technologies at the international office of water (IOW), In Proceedings International Conference on Information and Communication Technologies: From Theory to Applications, (pp ), Damascus, Syria.
Alghamdi, Mansour (2006a) "Voice Print": Voice Onset Time as a Model.
Arab Journal for Security Studies and Training :
Alghamdi, Mansour (2006b) Designing Arabic Fonts to Represent International Phonetic Alphabets. Journal of King Abdulaziz University: Engineering Sciences. 16, 2:
Alghamdi, Mansour (2003) KACST Arabic Phonetics Database, The Fifteenth International Congress of Phonetics Science, Barcelona,
Alkanhal, Mohamed, Mansour Alghamdi, Fahad Alotaibi and Ammar Alinazi (2008) Arabic Spoken Name Recognition. International Workshop on Signal Processing and Its Applications, March 2008, University of Sharjah, Sharjah, U.A.E.
Alkanhal, Mohamed, Mansour Alghamdi and Zeeshan Muzaffar (2007) Speaker Verification Based on Saudi Accented Arabic Database. International Symposium on Signal Processing. Sharjah, United Arab Emirates February
Alotaibi, Yousef Ajami, Mansour Alghamdi and Fahad Alotaiby (2008) Speech Recognition System of Arabic Digits based on A Telephony Arabic Corpus. The 2008 International Conference on Image Processing, Computer Vision, and Pattern Recognition. Las Vegas, USA.
Alsalman, Abdulmalik, Mansour Alghamdi, Khalid Alhuqayl and Salih Alsubay (2007) A Computerized System to Romanize Arabic Names. The First International Symposium on Computers and Arabic Language /3/2007.
Bayeh, R., Lin, S.-S. Chollet, G. & Mokbel, C. (2004), Towards multilingual speech recognition using data driven source/target acoustical units association, In Proceedings International Conference on Acoustic, Speech and Signal Processing (ICASSP), (pp ), Montreal, Canada.
Bayeh, R. Mokbel, C. & Chollet, G. (2008) Broadcast News Transcription baseline system using the Nemlar database, In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC), Marrakech, Morocco.
Blouet, R., Mokbel, C., Mokbel, H., Sanchez, E., Chollet, G. & Greige, H. (2004) Becars: A free software for speaker verification. In Proceedings of Odyssey (pp ), Toledo, Spain.
Cohen, J.
(2007), The GALE project: A description and an update, In Proceedings IEEE workshop on Automatic Speech Recognition and Understanding ASRU (pp ), Kyoto, Japan.
Ghaoui, A. Yvon, F. Mokbel, C. & Chollet, G. (2005) On the Use of Morphological Constraints in N-gram Statistical Language Model, In Proceedings of the Interspeech, 9th European Conference on Speech Communication and technology, Lisbon, Portugal.
El-Hajj, R., Likforman-Sulem, L., Mokbel, C. (2005), Arabic handwriting recognition using baseline dependent features and Hidden Markov Modeling, In Proceedings of the 8th International Conference on Document Analysis and Recognition, Seoul, Korea.
Alghamdi (2002) Techniques for High Quality Arabic Speech Synthesis, Information Science, 140 (3-4)
Alghamdi (2008) Speaker-Independent Natural Arabic Speech Recognition System. The International Conference on Intelligent Systems. Bahrain, 1-3 December
Alghamdi (2006). Machine Generation of Arabic Diacritical Marks. The 2006 World Congress in Computer Science Computer Engineering, and Applied Computing. Las Vegas, USA /6/2006.
Khorsheed, M. (2002) Off-line Arabic character recognition - A review. Pattern Anal. Appl. v5 i
Kirchhoff, K. et al. (2002), Novel Speech Recognition Models for Arabic, Johns-Hopkins University Summer Research Workshop, Baltimore, USA.
Maegaard, B. et al. (2006), The BLARK Concept and BLARK for Arabic, In Proceedings of the 5th
International Conference on Language Resources and Evaluation (LREC), (pp ), Genova, Italy.
Maegaard, B. et al., (2008), MEDAR: Collaboration between European and Mediterranean Arabic Partners to Support the Development of Language Technology for Arabic, In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC), Marrakech, Morocco.
Margner, V. Pechwitz, M. & El Abed, H. (2005), ICDAR 2005 Arabic Handwriting Recognition Competition, In Proceedings of the 8th International Conference on Document Analysis and Recognition, Vol. 1 (pp 70-74), Seoul, Korea.
Mokbel, C., Abi Akl, H. & Greige, H. (2002) Automatic speech recognition of Arabic digits over the telephone network, In Proceedings of Research Trends in Science and Technology, Beyrouth, Lebanon.
Mokbel, C. (2001) Online Adaptation of HMMs to real-life conditions: a unified framework, In IEEE Trans. on Speech and Audio Processing, Vol. 9, Issue 4, (pp ).
Yaseen, M. et al. (2006), Building Annotated Written and Spoken Arabic LRs in NEMLAR, In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC), (pp ), Genova, Italy.
Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus
Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus Yousef Ajami Alotaibi 1, Mansour Alghamdi 2, and Fahad Alotaiby 3 1 Computer Engineering Department, King Saud University,
Turkish Radiology Dictation System
Turkish Radiology Dictation System Ebru Arısoy, Levent M. Arslan Boaziçi University, Electrical and Electronic Engineering Department, 34342, Bebek, stanbul, Turkey [email protected], [email protected]
SASSC: A Standard Arabic Single Speaker Corpus
SASSC: A Standard Arabic Single Speaker Corpus Ibrahim Almosallam, Atheer AlKhalifa, Mansour Alghamdi, Mohamed Alkanhal, Ashraf Alkhairy The Computer Research Institute King Abdulaziz City for Science
PHONETIC TOOL FOR THE TUNISIAN ARABIC
PHONETIC TOOL FOR THE TUNISIAN ARABIC Abir Masmoudi 1,2, Yannick Estève 1, Mariem Ellouze Khmekhem 2, Fethi Bougares 1, Lamia Hadrich Belguith 2 (1) LIUM, University of Maine, France (2) ANLP Research
MEDAR Mediterranean Arabic Language and Speech Technology An intermediate report on the MEDAR Survey of actors, projects, products
MEDAR Mediterranean Arabic Language and Speech Technology An intermediate report on the MEDAR Survey of actors, projects, products Khalid Choukri Evaluation and Language resources Distribution Agency;
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Automatic Identification of Arabic Language Varieties and Dialects in Social Media
Automatic Identification of Arabic Language Varieties and Dialects in Social Media Fatiha Sadat University of Quebec in Montreal, 201 President Kennedy, Montreal, QC, Canada [email protected] Farnazeh
An Arabic Text-To-Speech System Based on Artificial Neural Networks
Journal of Computer Science 5 (3): 207-213, 2009 ISSN 1549-3636 2009 Science Publications An Arabic Text-To-Speech System Based on Artificial Neural Networks Ghadeer Al-Said and Moussa Abdallah Department
Word Completion and Prediction in Hebrew
Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology
Automatic slide assignation for language model adaptation
Automatic slide assignation for language model adaptation Applications of Computational Linguistics Adrià Agustí Martínez Villaronga May 23, 2013 1 Introduction Online multimedia repositories are rapidly
Customizing an English-Korean Machine Translation System for Patent Translation *
Customizing an English-Korean Machine Translation System for Patent Translation * Sung-Kwon Choi, Young-Gil Kim Natural Language Processing Team, Electronics and Telecommunications Research Institute,
Develop Software that Speaks and Listens
Develop Software that Speaks and Listens Copyright 2011 Chant Inc. All rights reserved. Chant, SpeechKit, Getting the World Talking with Technology, talking man, and headset are trademarks or registered
Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic
Testing Data-Driven Learning Algorithms for PoS Tagging of Icelandic by Sigrún Helgadóttir Abstract This paper gives the results of an experiment concerned with training three different taggers on tagged
Hebrew. Afro-Asiatic languages. Modern Hebrew is spoken in Israel, the United States, the
Jennifer Wagner Hebrew The Hebrew language belongs to the West-South-Central Semitic branch of the Afro-Asiatic languages. Modern Hebrew is spoken in Israel, the United States, the United Kingdom, Australia,
Master of Arts in Linguistics Syllabus
Master of Arts in Linguistics Syllabus Applicants shall hold a Bachelor s degree with Honours of this University or another qualification of equivalent standard from this University or from another university
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS PIERRE LANCHANTIN, ANDREW C. MORRIS, XAVIER RODET, CHRISTOPHE VEAUX Very high quality text-to-speech synthesis can be achieved by unit selection
Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition
, Lisbon Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition Wolfgang Macherey Lars Haferkamp Ralf Schlüter Hermann Ney Human Language Technology
Points of Interference in Learning English as a Second Language
Points of Interference in Learning English as a Second Language Tone Spanish: In both English and Spanish there are four tone levels, but Spanish speaker use only the three lower pitch tones, except when
The Arabic Online Commentary Dataset: an Annotated Dataset of Informal Arabic with High Dialectal Content
The Arabic Online Commentary Dataset: an Annotated Dataset of Informal Arabic with High Dialectal Content Omar F. Zaidan and Chris Callison-Burch Dept. of Computer Science, Johns Hopkins University Baltimore,
31 Case Studies: Java Natural Language Tools Available on the Web
31 Case Studies: Java Natural Language Tools Available on the Web Chapter Objectives Chapter Contents This chapter provides a number of sources for open source and free atural language understanding software
How the Computer Translates. Svetlana Sokolova President and CEO of PROMT, PhD.
Svetlana Sokolova President and CEO of PROMT, PhD. How the Computer Translates Machine translation is a special field of computer application where almost everyone believes that he/she is a specialist.
Membering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN
PAGE 30 Membering T M : A Conference Call Service with Speaker-Independent Name Dialing on AIN Sung-Joon Park, Kyung-Ae Jang, Jae-In Kim, Myoung-Wan Koo, Chu-Shik Jhon Service Development Laboratory, KT,
The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
CHARTES D'ANGLAIS SOMMAIRE. CHARTE NIVEAU A1 Pages 2-4. CHARTE NIVEAU A2 Pages 5-7. CHARTE NIVEAU B1 Pages 8-10. CHARTE NIVEAU B2 Pages 11-14
CHARTES D'ANGLAIS SOMMAIRE CHARTE NIVEAU A1 Pages 2-4 CHARTE NIVEAU A2 Pages 5-7 CHARTE NIVEAU B1 Pages 8-10 CHARTE NIVEAU B2 Pages 11-14 CHARTE NIVEAU C1 Pages 15-17 MAJ, le 11 juin 2014 A1 Skills-based
COURSE SYLLABUS ESU 561 ASPECTS OF THE ENGLISH LANGUAGE. Fall 2014
COURSE SYLLABUS ESU 561 ASPECTS OF THE ENGLISH LANGUAGE Fall 2014 EDU 561 (85515) Instructor: Bart Weyand Classroom: Online TEL: (207) 985-7140 E-Mail: [email protected] COURSE DESCRIPTION: This is a practical
Professional Profile
Dr. Abdelmalek ZIDOURI Electrical Engineering Department KFUPM, Dhahran, 31261 Saudi Arabia Tel. +966 3 860 3677 Fax: +966 3 860 3535 [email protected] [email protected] Professional Profile Eager to
Grammars and introduction to machine learning. Computers Playing Jeopardy! Course Stony Brook University
Grammars and introduction to machine learning Computers Playing Jeopardy! Course Stony Brook University Last class: grammars and parsing in Prolog Noun -> roller Verb thrills VP Verb NP S NP VP NP S VP
Evaluating grapheme-to-phoneme converters in automatic speech recognition context
Evaluating grapheme-to-phoneme converters in automatic speech recognition context Denis Jouvet, Dominique Fohr, Irina Illina To cite this version: Denis Jouvet, Dominique Fohr, Irina Illina. Evaluating
Automatic Text Analysis Using Drupal
Automatic Text Analysis Using Drupal By Herman Chai Computer Engineering California Polytechnic State University, San Luis Obispo Advised by Dr. Foaad Khosmood June 14, 2013 Abstract Natural language processing
Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet)
Proceedings of the Twenty-Fourth Innovative Applications of Artificial Intelligence Conference Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet) Tatsuya Kawahara
Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction
Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic, Nagpur Institute Sadar, Nagpur 440001 (INDIA)
What Is Linguistics? December 1992 Center for Applied Linguistics
What Is Linguistics? December 1992 Center for Applied Linguistics Linguistics is the study of language. Knowledge of linguistics, however, is different from knowledge of a language. Just as a person is
Selected Topics in Applied Machine Learning: An integrating view on data analysis and learning algorithms
Selected Topics in Applied Machine Learning: An integrating view on data analysis and learning algorithms ESSLLI 2015 Barcelona, Spain http://ufal.mff.cuni.cz/esslli2015 Barbora Hladká [email protected]
MODERN WRITTEN ARABIC. Volume I. Hosted for free on livelingua.com
MODERN WRITTEN ARABIC Volume I Hosted for free on livelingua.com TABLE OF CONTENTS: PREFACE, INTRODUCTION, Lesson 1 onward
NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE. Venu Govindaraju
NAVIGATING SCIENTIFIC LITERATURE A HOLISTIC PERSPECTIVE Venu Govindaraju BIOMETRICS DOCUMENT ANALYSIS PATTERN RECOGNITION 8/24/2015 ICDAR- 2015 2 Towards a Globally Optimal Approach for Learning Deep Unsupervised
Experiments with Signal-Driven Symbolic Prosody for Statistical Parametric Speech Synthesis
Experiments with Signal-Driven Symbolic Prosody for Statistical Parametric Speech Synthesis Fabio Tesser, Giacomo Sommavilla, Giulio Paci, Piero Cosi Institute of Cognitive Sciences and Technologies, National
Utilizing Automatic Speech Recognition to Improve Deaf Accessibility on the Web
Utilizing Automatic Speech Recognition to Improve Deaf Accessibility on the Web Brent Shiver DePaul University [email protected] Abstract Internet technologies have expanded rapidly over the past two
LINGSTAT: AN INTERACTIVE, MACHINE-AIDED TRANSLATION SYSTEM*
LINGSTAT: AN INTERACTIVE, MACHINE-AIDED TRANSLATION SYSTEM* Jonathan Yamron, James Baker, Paul Bamberg, Haakon Chevalier, Taiko Dietzel, John Elder, Frank Kampmann, Mark Mandel, Linda Manganaro, Todd Margolis,
Establishing the Uniqueness of the Human Voice for Security Applications
Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.
TEACHER CERTIFICATION STUDY GUIDE LANGUAGE COMPETENCY AND LANGUAGE ACQUISITION
DOMAIN I LANGUAGE COMPETENCY AND LANGUAGE ACQUISITION COMPETENCY 1 THE ESL TEACHER UNDERSTANDS FUNDAMENTAL CONCEPTS AND KNOWS THE STRUCTURE AND CONVENTIONS OF THE ENGLISH LANGUAGE Skill 1.1 Understand
Interpreting areading Scaled Scores for Instruction
Interpreting areading Scaled Scores for Instruction Individual scaled scores do not have natural meaning associated to them. The descriptions below provide information for how each scaled score range should
A secure face tracking system
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 10 (2014), pp. 959-964 International Research Publications House http://www.irphouse.com A secure face tracking
THE recognition of Arabic writing has many applications
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 7, JULY 2009 1165 Combining Slanted-Frame Classifiers for Improved HMM-Based Arabic Handwriting Recognition Ramy Al-Hajj Mohamad,
Special Topics in Computer Science
Special Topics in Computer Science NLP in a Nutshell CS492B Spring Semester 2009 Jong C. Park Computer Science Department Korea Advanced Institute of Science and Technology INTRODUCTION Jong C. Park, CS
A HAND-HELD SPEECH-TO-SPEECH TRANSLATION SYSTEM. Bowen Zhou, Yuqing Gao, Jeffrey Sorensen, Daniel Déchelotte and Michael Picheny
A HAND-HELD SPEECH-TO-SPEECH TRANSLATION SYSTEM Bowen Zhou, Yuqing Gao, Jeffrey Sorensen, Daniel Déchelotte and Michael Picheny IBM T. J. Watson Research Center Yorktown Heights, New York 10598 zhou, yuqing,
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg
Module Catalogue for the Bachelor Program in Computational Linguistics at the University of Heidelberg March 1, 2007 The catalogue is organized into sections of (1) obligatory modules ( Basismodule ) that
OPTIMIZATION OF NEURAL NETWORK LANGUAGE MODELS FOR KEYWORD SEARCH. Ankur Gandhe, Florian Metze, Alex Waibel, Ian Lane
OPTIMIZATION OF NEURAL NETWORK LANGUAGE MODELS FOR KEYWORD SEARCH Ankur Gandhe, Florian Metze, Alex Waibel, Ian Lane Carnegie Mellon University Language Technology Institute {ankurgan,fmetze,ahw,lane}@cs.cmu.edu
Automated Content Analysis of Discussion Transcripts
Automated Content Analysis of Discussion Transcripts Vitomir Kovanović [email protected] Dragan Gašević [email protected] School of Informatics, University of Edinburgh Edinburgh, United Kingdom [email protected]
Study Plan for Master of Arts in Applied Linguistics
Study Plan for Master of Arts in Applied Linguistics Master of Arts in Applied Linguistics is awarded by the Faculty of Graduate Studies at Jordan University of Science and Technology (JUST) upon the fulfillment
DRA2 Word Analysis. correlated to. Virginia Learning Standards Grade 1
DRA2 Word Analysis correlated to Virginia Learning Standards Grade 1 Quickly identify and generate words that rhyme with given words. Quickly identify and generate words that begin with the same sound.
Robust Methods for Automatic Transcription and Alignment of Speech Signals
Robust Methods for Automatic Transcription and Alignment of Speech Signals Leif Grönqvist ([email protected]) Course in Speech Recognition January 2, 2004 Contents 1 Introduction 2 Background
Comparative Analysis on the Armenian and Korean Languages
Comparative Analysis on the Armenian and Korean Languages Syuzanna Mejlumyan Yerevan State Linguistic University Abstract It has been five years since the Korean language has been taught at Yerevan State
7-2 Speech-to-Speech Translation System Field Experiments in All Over Japan
7-2 Speech-to-Speech Translation System Field Experiments in All Over Japan We explain field experiments conducted during the 2009 fiscal year in five areas of Japan. We also show the experiments of evaluation
Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects
Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects David Graff, Mohamed Maamouri Linguistic Data Consortium University of Pennsylvania E-mail: [email protected], [email protected]
Generating Training Data for Medical Dictations
Generating Training Data for Medical Dictations Sergey Pakhomov University of Minnesota, MN [email protected] Michael Schonwetter Linguistech Consortium, NJ [email protected] Joan Bachenko
OCR-Based Electronic Documentation Management System
OCR-Based Electronic Documentation Management System Khalaf S. Alkhalaf, Abdulelah I. Almishal, Anas O. Almahmoud, and Majed S. Alotaibi Abstract Optical character recognition (OCR) is one of the latest
A CHINESE SPEECH DATA WAREHOUSE
A CHINESE SPEECH DATA WAREHOUSE LUK Wing-Pong, Robert and CHENG Chung-Keng Department of Computing, Hong Kong Polytechnic University Tel: 2766 5143, FAX: 2774 0842, E-mail: {csrluk,cskcheng}@comp.polyu.edu.hk
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
Natural Language Processing: current projects and research at the IXA Group
Natural Language Processing: current projects and research at the IXA Group IXA Research Group on NLP University of the Basque Country Xabier Artola Zubillaga Motivation A language that seeks to survive
Effect of Captioning Lecture Videos For Learning in Foreign Language 外国語(英語)講義映像に対する字幕提示の理解度効果
Effect of Captioning Lecture Videos For Learning in Foreign Language VERI FERDIANSYAH 1 SEIICHI NAKAGAWA 1 Toyohashi University of Technology, Tenpaku-cho, Toyohashi 441-858 Japan E-mail: {veri, nakagawa}@slp.cs.tut.ac.jp
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Pranali Chilekar 1, Swati Ubale 2, Pragati Sonkambale 3, Reema Panarkar 4, Gopal Upadhye 5 1 2 3 4 5
Semantic annotation of requirements for automatic UML class diagram generation
www.ijcsi.org 259 Semantic annotation of requirements for automatic UML class diagram generation Soumaya Amdouni 1, Wahiba Ben Abdessalem Karaa 2 and Sondes Bouabid 3 1 University of Tunis High Institute
A Knowledge-Poor Approach to BioCreative V DNER and CID Tasks
A Knowledge-Poor Approach to BioCreative V DNER and CID Tasks Firoj Alam 1, Anna Corazza 2, Alberto Lavelli 3, and Roberto Zanoli 3 1 Dept. of Information Eng. and Computer Science, University of Trento,
Thirukkural - A Text-to-Speech Synthesis System
Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,
CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content
CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content Martine Garnier-Rizet 1, Gilles Adda 2, Frederik Cailliau
German Speech Recognition: A Solution for the Analysis and Processing of Lecture Recordings
German Speech Recognition: A Solution for the Analysis and Processing of Lecture Recordings Haojin Yang, Christoph Oehlke, Christoph Meinel Hasso Plattner Institut (HPI), University of Potsdam P.O. Box
Why major in linguistics (and what does a linguist do)?
Why major in linguistics (and what does a linguist do)? Written by Monica Macaulay and Kristen Syrett What is linguistics? If you are considering a linguistics major, you probably already know at least
University of Massachusetts Boston Applied Linguistics Graduate Program. APLING 601 Introduction to Linguistics. Syllabus
University of Massachusetts Boston Applied Linguistics Graduate Program APLING 601 Introduction to Linguistics Syllabus Course Description: This course examines the nature and origin of language, the history
Transcription bottleneck of speech corpus exploitation
Transcription bottleneck of speech corpus exploitation Caren Brinckmann Institut für Deutsche Sprache, Mannheim, Germany Lesser Used Languages and Computer Linguistics (LULCL) II Nov 13/14, 2008 Bozen
Voice Activated Information Entry Technical Aspects
Université de Technologie de Compiègne UTC HEUDIAS
The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006
The XMU Phrase-Based Statistical Machine Translation System for IWSLT 2006 Yidong Chen, Xiaodong Shi Institute of Artificial Intelligence Xiamen University P. R. China November 28, 2006 - Kyoto 13:46 1
PTE Academic Preparation Course Outline
PTE Academic Preparation Course Outline August 2011 V2 Pearson Education Ltd 2011. No part of this publication may be reproduced without the prior permission of Pearson Education Ltd. Introduction The
Final Exam Study Guide (24.900 Fall 2012)
revision 12/18/201 Final Exam Study Guide (24.900 Fall 2012) If we do our job well, the final exam should be neither particularly hard nor particularly easy. It will be a comprehensive exam on the points
The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications
Forensic Science International 146S (2004) S95-S99 www.elsevier.com/locate/forsciint The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications A.
Fast Labeling and Transcription with the Speechalyzer Toolkit
Fast Labeling and Transcription with the Speechalyzer Toolkit Felix Burkhardt Deutsche Telekom Laboratories, Berlin, Germany [email protected] Abstract We describe a software tool named Speechalyzer
THE RWTH ENGLISH LECTURE RECOGNITION SYSTEM
THE RWTH ENGLISH LECTURE RECOGNITION SYSTEM Simon Wiesler 1, Kazuki Irie 2,, Zoltán Tüske 1, Ralf Schlüter 1, Hermann Ney 1,2 1 Human Language Technology and Pattern Recognition, Computer Science Department,
The SweDat Project and Swedia Database for Phonetic and Acoustic Research
2009 Fifth IEEE International Conference on e-science The SweDat Project and Swedia Database for Phonetic and Acoustic Research Jonas Lindh and Anders Eriksson Department of Philosophy, Linguistics and
Speech Analytics. Whitepaper
Speech Analytics Whitepaper This document is property of ASC telecom AG. All rights reserved. Distribution or copying of this document is forbidden without permission of ASC. 1 Introduction Hearing the
Hardware Implementation of Probabilistic State Machine for Word Recognition
IJECT Vol. 4, Issue Spl-5, July-Sept 2013 ISSN: 2230-7109 (Online) ISSN: 2230-9543 (Print) Hardware Implementation of Probabilistic State Machine for Word Recognition 1 Soorya Asokan, 2
Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features
pp.273-280 http://dx.doi.org/10.14257/ijdta.2015.8.4.27 Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features Lirong Qiu School of Information Engineering, Minzu University of
Kindergarten Common Core State Standards: English Language Arts
Kindergarten Common Core State Standards: English Language Arts Reading: Foundational Print Concepts RF.K.1. Demonstrate understanding of the organization and basic features of print. o Follow words from
