Using the Amazon Mechanical Turk for Transcription of Spoken Language
USING THE AMAZON MECHANICAL TURK FOR TRANSCRIPTION OF SPOKEN LANGUAGE

Matthew Marge, Satanjeev Banerjee, and Alexander I. Rudnicky
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA
{mrmarge, banerjee, ...}

ABSTRACT

We investigate whether Amazon's Mechanical Turk (MTurk) service can be used as a reliable method for transcription of spoken language data. Utterances with varying speaker demographics (native and non-native English, male and female) were posted on the MTurk marketplace together with standard transcription guidelines. Transcriptions were compared against transcriptions carefully prepared in-house through conventional (manual) means. We found that transcriptions from MTurk workers were generally quite accurate. Further, when transcripts for the same utterance produced by multiple workers were combined using the ROVER voting scheme, the accuracy of the combined transcript rivaled that observed for conventional transcription methods. We also found that accuracy is not particularly sensitive to payment amount, implying that high quality results can be obtained at a fraction of the cost and turnaround time of conventional methods.

Index Terms: crowd sourcing, speech transcription

1. INTRODUCTION

Accurate manual speech transcriptions are crucial for nearly all aspects of spoken language research (speech recognition, speech synthesis, etc.). However, manual speech transcription is a demanding task. Conventional transcription methodology requires training transcribers on transcription guidelines, a process that can last from hours to days. Once trained, a worker is expected to be available to transcribe sizable amounts of spoken data. Recruiting, training, and retaining transcribers represents a sizable endeavor even for university-based efforts that can make use of undergraduate labor. Costs can run upwards of $100 per hour of transcribed speech (assuming the transcriber is paid $10 per work hour and that transcription and checking take 10 times the length of the audio being transcribed).

Amazon's Mechanical Turk (MTurk) can potentially reduce the cost of manual speech transcription. Dubbed "artificial artificial intelligence," MTurk is a marketplace for human workers ("turkers") who perform tasks for pay; these tasks are submitted and paid for by other humans ("requesters"). The requester determines when work is satisfactory. The tasks targeted by the system are ones that are simple for humans to perform but challenging for computers (e.g., determining whether a person in a picture is facing to the right or to the left). Speech transcription fits well into this framework. In this paper, we investigate whether MTurk can be used to perform transcription according to the conventions and quality level expected by the speech research community.

MTurk has been used previously by others to transcribe speech. For example, [1] and [2] report near-expert accuracy when using MTurk to correct the output of an automatic speech recognizer. MTurk has also been used for other natural language annotation tasks. For example, [3] used MTurk to carry out several different annotation tasks (such as word sense disambiguation) and found strong agreement with gold standard annotations. [4] used MTurk to evaluate machine translation output. [5] evaluated MTurk as a venue for conducting user studies. While these results show that MTurk appears useful for language annotation tasks, to our knowledge a systematic study of MTurk as a means for general speech transcription has not yet been reported. This paper presents such a study.
2. THE MTURK RESOURCE

Requesters post Human Intelligence Tasks (HITs) on Mechanical Turk by uploading tasks onto Amazon's web portal. MTurk maintains each turker's performance history, which requesters may use to specify who is eligible to perform the HITs. Eligibility criteria may include the turker's location (country), HIT completion rate (the fraction of tasks completed among those signed up for in the past), and approval rate (the fraction of completed tasks accepted by requesters in the past). The requester must also specify the payment a turker receives if the task is completed and the work is accepted by the requester. Once a requester posts a HIT on MTurk, as in our transcription task, eligible turkers can immediately view the HIT and sign up for it. For the transcription task, we restricted eligibility to turkers with a previous HIT approval rate of 95% or better.
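This HIT setup (a per-task reward, multiple assignments per clip, an approval-rate qualification) can also be expressed programmatically with today's requester API. The study itself used Amazon's web portal, which predates this interface; the sketch below is a minimal illustration using boto3, and the task URL and reward value are placeholder assumptions.

```python
import boto3

# Hypothetical sketch: post one transcription HIT via the modern MTurk API.
mturk = boto3.client("mturk", region_name="us-east-1")

# An ExternalQuestion embeds a requester-hosted task page (URL is a placeholder).
external_question = """\
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.org/transcribe?clip=clip_001</ExternalURL>
  <FrameHeight>400</FrameHeight>
</ExternalQuestion>"""

hit = mturk.create_hit(
    Title="Transcribe a short audio clip",
    Description="Listen to a streamed utterance and type every audible word.",
    Keywords="speech, transcription, audio",
    Reward="0.05",                     # one of the paper's payment levels
    MaxAssignments=5,                  # five transcribers per utterance
    AssignmentDurationInSeconds=600,
    LifetimeInSeconds=86400,
    Question=external_question,
    QualificationRequirements=[{
        # System qualification: percent of past assignments approved >= 95
        "QualificationTypeId": "000000000000000000L0",
        "Comparator": "GreaterThanOrEqualTo",
        "IntegerValues": [95],
    }],
)
print(hit["HIT"]["HITId"])
```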
All audio was posted on a third-party website that provides streaming audio through a widget embedded in our transcription HIT. Streaming makes it difficult for turkers to download or otherwise manipulate the audio in unanticipated ways. Each transcription HIT posted on MTurk was associated with a single unique streaming audio file. In our case, each transcription HIT could be performed by multiple turkers. MTurk removes completed tasks from a turker's list of available HITs, precluding a single turker from transcribing the same audio file twice. Once a turker completes the HIT, the requester can download and review the work (in our case, the speech transcription) and provide payment via Amazon if the work is satisfactory.

3. EXPERIMENTS WITH MECHANICAL TURK

This study had turkers perform a simple transcription task. We examined the influence of payment amount and of data characteristics, including speaker gender and speaker background (native or non-native English speaker), on the accuracy of the transcription.

3.1 Procedure

The same transcription task was offered to different groups of turkers by posting, at any given time, one batch (a preset group of utterances to be transcribed at a specified payment amount). There were four batches, paying $0.005, $0.01, $0.03, or $0.05 per transcription, contingent on approval by a requester. For any batch, five different transcribers were allowed to transcribe each utterance. While MTurk automatically prevented a single turker from transcribing the same utterance twice within a single payment batch, we also ensured that no turker transcribed the same utterance in two or more different payment batches. We did so in order to avoid any learning effects turkers might carry over from transcribing an utterance the first time. We implemented this restriction by checking the identities of the turkers and reposting the utterances that were transcribed more than once by the same turkers, as sketched below. Occasionally, turkers submitted incomplete or incoherent work, and this work was rejected via the MTurk web interface. We rejected 16 assignments out of the total 910 assignments (260 in three payment conditions, 130 in one payment condition). Once all transcriptions were done, we computed word-level accuracy against an in-house gold standard transcription, as described in Section 5. We recruited 94 unique MTurk workers for this study.
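The cross-batch restriction amounts to detecting (turker, utterance) pairs that appear in more than one batch and reposting the affected utterances. A minimal sketch of that check, assuming assignment records of the form (worker_id, utterance_id, batch_id) parsed from the downloaded results (a hypothetical format, not the paper's actual tooling):

```python
def find_cross_batch_repeats(assignments):
    """Return utterances that some worker transcribed in more than one batch.

    `assignments` is an iterable of (worker_id, utterance_id, batch_id)
    tuples; the record format here is a hypothetical simplification.
    """
    first_batch = {}           # (worker, utterance) -> batch first seen in
    repeats = set()
    for worker, utt, batch in assignments:
        key = (worker, utt)
        if key in first_batch and first_batch[key] != batch:
            repeats.add(utt)   # repost this utterance in the later batch
        else:
            first_batch.setdefault(key, batch)
    return repeats
```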
3.2 Transcription Instructions

Turkers were told to listen to utterances using an audio player embedded in the task web page. Turkers were asked to transcribe every audible word, and were told to follow conventions for marking speaker mispronunciations and restarts; they were not required to mark fillers (e.g., "uh" and "eh"). The instructions stressed the importance of accuracy. Turkers were allowed to replay the audio as many times as necessary to produce a satisfactory transcript.

4. AUDIO MATERIALS

The audio clips were taken from data collected in a separate experiment investigating the formulation of route instructions intended for robots (e.g., "The purple goal area is on the right hand side of Aki [a robot]; it is in front of you and diagonally to your left."). Speakers were instructed to inspect a drawing of a scene, consider their instructions, then record them using a computer interface; they had the opportunity to re-record utterances that they considered unsatisfactory. All speech was recorded in a quiet meeting room with a headset microphone. In addition, the gold standard transcriptions contained only one indication of a mispronunciation and no indications of restarts, though the transcribers were instructed to mark any occurrences of them. Thus the speech materials used for the present investigation were of good quality. This should be kept in mind when interpreting the results described below.

The material used came from five male and five female speakers and ranged in duration from 6 to 10 seconds (mean 8.0 sec) and from 5 to 28 words (mean 17.4 words). Among these speakers, three were fluent non-native English speakers; the rest were native English speakers. The speakers were divided into four categories: (1) female native English speakers, (2) a female non-native English speaker, (3) male native English speakers, and (4) male non-native English speakers (see Table 1). From a set of 896 audio clips, 52 clips were selected for this study, 13 from each of the four categories above. Each set of 13 utterances had the lowest possible duration variance within its category; that is, all selected audio clips in a given category were approximately the same length.

Speaker Category     Mean Duration   Mean Word Count   Utterance Count
Female Native        6.0 sec         14.8 words        13 utts
Female Non-native    9.4 sec         22.6 words        13 utts
Male Native          9.6 sec         20.4 words        13 utts
Male Non-native      6.9 sec         11.7 words        13 utts

Table 1. Utterance information across speaker categories (female or male, native or non-native English).
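Selecting, within a category, the 13 clips whose durations have the lowest variance can be done by sorting the clips by duration and scanning every window of 13 consecutive clips, since for a sorted list some minimum-variance subset of a given size is always contiguous. A small sketch under that observation, with a hypothetical (clip_id, duration) input format; the paper does not specify how its selection was implemented:

```python
from statistics import pvariance

def pick_min_variance_clips(clips, k=13):
    """Pick the k clips whose durations have minimal variance.

    `clips` is a list of (clip_id, duration_seconds) pairs (hypothetical
    format). Sorting by duration lets a linear window scan find a
    minimum-variance subset of size k.
    """
    ordered = sorted(clips, key=lambda c: c[1])
    return min(
        (ordered[i:i + k] for i in range(len(ordered) - k + 1)),
        key=lambda window: pvariance([dur for _, dur in window]),
    )
```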
5. MTURK TRANSCRIPTION ANALYSIS

Transcription quality was determined by computing word error rate (WER) using a standard procedure [6], with a gold standard that was transcribed using in-house guidelines [7] as the reference. For purposes of the current comparison, both the reference transcription and those obtained from MTurk were normalized by using a spell checker to correct common spelling errors and by removing annotations (mispronunciations, restarts, fillers, punctuation, etc.) that were not relevant for the current analysis.
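WER is the standard edit-distance measure: the number of word substitutions, insertions, and deletions in the best alignment of a hypothesis to the reference, divided by the reference length. The study used NIST's standard tooling [6]; the sketch below is a simplified equivalent that returns only the rate, not the alignment, and assumes a non-empty reference.

```python
def wer(reference, hypothesis):
    """Word error rate via dynamic-programming edit distance.

    `reference` and `hypothesis` are lists of normalized words.
    Substitutions, insertions, and deletions all cost 1.
    """
    rows, cols = len(reference) + 1, len(hypothesis) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i                  # delete all remaining reference words
    for j in range(cols):
        dist[0][j] = j                  # insert all remaining hypothesis words
    for i in range(1, rows):
        for j in range(1, cols):
            sub = dist[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[-1][-1] / len(reference)

# Example: one substitution plus one deletion against a 4-word reference.
# wer("you will go left".split(), "you'll go left".split()) -> 0.5
```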
Payment (per trans.)   WER
$0.005                 4.79%
$0.01                  6.53%
$0.03                  4.33%
$0.05                  4.19%
Aggregate              4.96%

Speaker Gender   WER
Male             3.74%
Female           6.01%

Speaker Native Language   WER
English                   3.55%
Other                     6.40%

Table 2. WER across payment, speaker gender, and speaker native language.

5.1 Transcription Results

Results for the payment conditions are summarized in Table 2. Individual transcription WER is low, ranging from 4.19% to 6.53%. Note that transcribers had no knowledge of the domain other than that suggested by the speech content. Accuracy approaches the 95% transcriber agreement criterion that is expected when experts transcribe an utterance [8]. As reported by NIST, disagreement among expert transcribers is typically at a WER of 2-4% [6], and our results approach this level as well. The aggregate WER across the 1,040 transcriptions (4 payment conditions x 5 transcriptions per utterance x 52 utterances) was 4.96%. The sentence error rate (SER), the fraction of transcriptions that had at least one disagreement with the gold standard, was 46.0%. Disagreements were often minor (see Section 5.2).

To investigate whether financial incentive might influence transcription quality, we posted the same transcription tasks at four different payment levels, as mentioned in Section 3.1. Each batch of tasks had a set payment amount, and at any given time only one batch was available for turkers to perform. Based on the work batch available, a turker was paid $0.005, $0.01, $0.03, or $0.05 per satisfactory transcription. Transcriptions were rejected if they were empty or contained none of the words in the audio. Batches were posted in decreasing order of cost. We found similar error levels across conditions, except for the $0.01 condition. There was no statistically significant difference in WER across the $0.005, $0.03, and $0.05 payment groups (F(2, 153) = 0.28, p = 0.76). We believe that the outcome of the $0.01 condition is a random fluctuation, but the occurrence of such fluctuations needs to be taken into account in assessing the usefulness of the process.

We found that male speakers were transcribed more accurately than female speakers across all payment conditions (F(1, 206) = 8.40, p < 0.05). The native language of the speaker also influenced transcription quality, not surprisingly, with a statistically significant difference in WER (F(1, 206) = 22.68, p < 0.05).

Payment    Latency   Indiv.   ROVER-3   ROVER-4   ROVER-5
$0.005     ... hrs   4.79%    2.91%     3.37%     2.44%
$0.01      ... hrs   6.53%    4.00%     3.64%     2.27%
$0.03      ... hrs   4.33%    2.62%     2.72%     1.55%
$0.05      ... hrs   4.19%    2.68%     3.37%     2.33%
Aggregate            4.96%    3.05%     3.27%     2.14%

Table 3. Latency (turnaround time) and WER across payment, with and without ROVER.

5.2 Latency & Error Analysis

The latency column in Table 3 shows that cost appears to affect the amount of time it takes to obtain transcription results (i.e., latency), although there is a great deal of variability. Each work batch required 52 utterances to be transcribed 5 times each. As expected, our more lucrative HITs ($0.03/transcription and $0.05/transcription) were completed much faster than the cheaper HITs (82.7% less turnaround time). Note, however, that the $0.05/transcription HIT was posted in the evening, during off-peak work hours, which may have influenced turnaround time.

Out of a total MTurk transcription pool of 20,116 words transcribed (across all the payment conditions), 997 were errors (4.96% WER). Overall, the most common errors were substitutions (682), followed by insertions (210) and deletions (105). An analysis of errors revealed variable use of contractions, e.g., "you will" versus "you'll", "till" versus "until", "you are" versus "you're", etc. Removing these differences further reduced aggregate WER to 3.61%.

6. USING MULTIPLE TRANSCRIPTIONS TO IMPROVE ACCURACY

One of the benefits of a service such as MTurk is that multiple transcriptions of an utterance can be obtained at a reasonable cost. Thus, one way to increase accuracy is to combine transcriptions from multiple turkers. NIST's ROVER algorithm [9] is a well-known procedure for combining alternative recognition hypotheses through word-level voting. We can apply this technique to improve the accuracy of our transcriptions, although we need to establish the minimum number of transcriptions that must be obtained for an improvement to be observed.

Figure 1. WER across payment amount (USD per transcription), with and without ROVER.
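ROVER builds a word transition network by iteratively aligning the candidate transcripts and then takes a majority vote in each slot of the network. Below is a simplified sketch of the voting stage only, assuming the candidates have already been aligned into equal-length sequences with "" marking gaps; the alignment step is the harder part of the full algorithm [9].

```python
import random
from collections import Counter

def vote(aligned_transcripts):
    """Majority vote over pre-aligned word sequences (simplified ROVER).

    `aligned_transcripts` is a list of equal-length word lists in which
    "" marks an alignment gap. Ties are broken at random, as the paper
    notes ROVER does.
    """
    merged = []
    for slot in zip(*aligned_transcripts):
        counts = Counter(slot)
        top_count = max(counts.values())
        winners = [word for word, n in counts.items() if n == top_count]
        merged.append(random.choice(winners))
    return [word for word in merged if word]   # drop gap symbols

# With three candidates, a 2-of-3 majority fixes a lone error:
# vote([["go", "to", "the", "left"],
#       ["go", "to", "the", "left"],
#       ["go", "to", "a",   "left"]]) -> ["go", "to", "the", "left"]
```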
In general, an odd number of transcriptions is desirable because it will likely break ties as candidates are merged (ROVER selects randomly among the candidates if words are tied at a given position). We varied the number of transcriptions per utterance from three ("ROVER-3") to five ("ROVER-5").

6.1 ROVER Results

Figure 1 presents results from combining multiple transcriptions with the ROVER method. For each ROVER run, all combinations of each utterance's five transcriptions were evaluated (e.g., for ROVER-3, the WER of every 3-of-5 combination was calculated). Results are summarized in columns 3-6 of Table 3 by payment amount. As expected, combining all available transcriptions of an utterance yielded the best results. The maximum number of available transcriptions for an utterance was also an odd number (five), which improves results through the ability to break word-level ties. Using ROVER to combine three transcriptions reduced aggregate WER by 39% to 3.05%, and combining five transcriptions reduced WER by 57% to 2.14%. In the latter case, SER is 24.2%, a 47.4% reduction from that of individual transcriptions. When contractions were resolved, as in Section 5.2, aggregate WER after combining five transcriptions fell to 0.84%.

Note that variations between payment conditions are attenuated when candidate utterances are combined. In fact, there was no statistically significant difference in WER across payment groups when combining all five transcriptions of each utterance (F(3, 203) = 0.59, p = 0.62). The native-language speaker characteristic also becomes less impactful (summarized in Figure 2).

Figure 2. WER by native language of the transcribed speaker.

Optimizing for both cost and accuracy, good choices here are to use either three or five candidates in the least costly payment condition. This suggests that collecting multiple transcriptions should be part of the standard procedure when using MTurk.

7. CONCLUSIONS

We have shown that Amazon's Mechanical Turk (MTurk) can be used to accurately transcribe spoken language data. Experiments using MTurk for transcription were presented, and initial disagreement rates (WER) with a gold standard were approximately 5%. ROVER was found to be an effective method for improving accuracy (given sufficient parallel transcriptions). Combining multiple candidate transcriptions improved results to a 2.14% disagreement rate. This process could arguably make the role of a human transcription checker unnecessary. Traditional transcription methods for spoken language data cost upwards of $100 per hour of speech, but our methods range in cost from $2.25 to $22.50 per hour of speech for one transcriber: at the mean utterance length of 8 sec, an hour of speech contains about 450 utterances, which cost $2.25 at $0.005 each and $22.50 at $0.05 each. Furthermore, accuracy was found to be unrelated to payment. Even at low payments, transcriber disagreement (after ROVER) is similar to traditional inter-transcriber disagreement, although lower payments may lead to longer delays in obtaining transcription results.

We are not certain why turkers are willing to do this work, particularly at the very lowest payment rates. It is possible that they perceive these tasks as a diversion rather than as a source of income. We expect that turker behavior will evolve over time, but it is difficult to predict in which direction. At present we find that MTurk can be effectively and cheaply used for transcription of clean speech data.
In the future we will evaluate MTurk for the transcription of more difficult speech, such as conversational speech or speech in noisy backgrounds. Although there is evidence that turkers can perform some linguistic annotation tasks, it would be useful to establish the limits of their ability.

8. REFERENCES

[1] A. Gruenstein, I. McGraw, and A. Sutherland, "A self-transcribing speech corpus: Collecting continuous speech with an online educational game," in SLaTE Workshop.
[2] I. McGraw, A. Gruenstein, and A. Sutherland, "A self-labeling speech corpus: Collecting spoken words with an online educational game," in Interspeech.
[3] R. Snow, B. O'Connor, D. Jurafsky, and A. Y. Ng, "Cheap and fast - but is it good? Evaluating non-expert annotations for natural language tasks," in EMNLP.
[4] C. Callison-Burch, "Fast, cheap, and creative: Evaluating translation quality using Amazon's Mechanical Turk," in EMNLP.
[5] A. Kittur, E. H. Chi, and B. Suh, "Crowdsourcing user studies with Mechanical Turk," in CHI '08, Florence, Italy.
[6] The NIST Rich Transcription Evaluation Project. [Online].
[7] C. Bennett and A. I. Rudnicky, "The Carnegie Mellon Communicator corpus," in ICSLP.
[8] K. Zechner, "What did they actually say? Agreement and disagreement among transcribers of non-native spontaneous speech responses in an English proficiency test," in SLaTE Workshop.
[9] J. G. Fiscus, "A post-processing system to yield word error rates: Recognizer Output Voting Error Reduction (ROVER)," in ASRU Workshop, 1997.
