Estonian Large Vocabulary Speech Recognition System for Radiology
|
|
|
- Kerry Moore
- 10 years ago
- Views:
Transcription
1 Estonian Large Vocabulary Speech Recognition System for Radiology Tanel Alumäe, Einar Meister Institute of Cybernetics Tallinn University of Technology, Estonia October 8, 2010 Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
2 Radiology Radiology Radiology is the branch or specialty of medicine that deals with the study and application of imaging technology (such as X-ray, ultrasound, computer tomography) and radiation to diagnosing and treating disease. Radiologist views and interprets a radiology image and creates a report that describes the findings. Achilles tendon has a uniform structure, tears are not detected. Left, there is a small tissue swelling of the tendon side. Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
3 Motivation Radiologists eyes and hands are busy during the preparation of a radiological report In many hospitals, radiologists dictate the reports which are then converted to text by human speech transcribers Speech recognition systems have the potential to replace human transcribers and enable faster and less expensive report delivery In radiology, a typical active vocabulary is much smaller than in general communication and the sentences usually have a well-defined structure Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
4 Acoustic models We used various wideband Estonian speech databases for training acoustic models: Models: BABEL speech database (phonetically balanced dictated speech from 60 different speakers, 9h) transcriptions of Estonian broadcast news (mostly read news speech from around 10 different speakers, 7.5h) transcriptions of live talk programs from three Estonian radio stations (42 different speakers, 10h) MFCC features 25 phonemes, silence, nine fillers triphone models 2000 tied states, 8 Gaussian per state CMU Sphinx Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
5 Language model For training a language model: 1.6 million distinct reports, with 44 million word tokens Normalization: expanded and/or normalized different abbreviations using hand-made rules used a morphological analyzer for determining the part-of-speech (POS) properties of all words expanded numbers the resulting corpus was used for producing two corpora: one including verbalized punctuations and another without punctuations. vocabulary was composed of the most frequent words Two trigram LMs one with verbalized punctuations and another without punctuations were built. The two LMs were interpolated into one final model. Perplexity 35, OOV 2.6% Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
6 Evaluation Recorded a small test corpus of radiology reports Dictated by 10 radiologists, 26 minutes per speaker on average Actual reports from our test set Recordings were manually trascribed Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
7 Results Used CMU Sphinx 3.7, running in 0.5 x real-time one-pass speaker independant system Speaker WER AL 7.3 AR 8.5 AS 8.5 ER 10.3 JH 13.3 JK 9.2 SU 10.7 VE 8.7 VS 11.9 Average 9.8 Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
8 Error analysis Errors: 17% of errors are word compounding errors a compound word is recognized as a non-compound, or vice versa (i.e., the only error is in the space between compound constituents) 17% due to spelling errors 11% normalization mistmatches (e.g., C kuus C six vs. C6) Thus, only around 55% of the errors were real recognition errors Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
9 Future work We have now more wideband speech data (55 hours in total) Adaptation: Acoustic model: adapt to the voice of a speaker Language model: adapt to the typical report content of a speaker (e.g., one radiologists might be specialized on MRI images) Perform Wizard of Oz style experiments where radiologists produce reports spontaneously for previously unseen images Post-processing: consistent normalization of read numbers, dates, abbreviations, and proper structuring of the generated reports Integrate the system into the radiology information system (RIS) Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
10 Thanks! Alumäe, Meister (TUT, Estonia) Estonian LVCSR System for Radiology October 8, / 10
Turkish Radiology Dictation System
Turkish Radiology Dictation System Ebru Arısoy, Levent M. Arslan Boaziçi University, Electrical and Electronic Engineering Department, 34342, Bebek, stanbul, Turkey [email protected], [email protected]
Slovak Automatic Transcription and Dictation System for the Judicial Domain
Slovak Automatic Transcription and Dictation System for the Judicial Domain Milan Rusko 1, Jozef Juhár 2, Marian Trnka 1, Ján Staš 2, Sakhia Darjaa 1, Daniel Hládek 2, Miloš Cerňak 1, Marek Papco 2, Róbert
Automatic slide assignation for language model adaptation
Automatic slide assignation for language model adaptation Applications of Computational Linguistics Adrià Agustí Martínez Villaronga May 23, 2013 1 Introduction Online multimedia repositories are rapidly
Word Completion and Prediction in Hebrew
Experiments with Language Models for בס"ד Word Completion and Prediction in Hebrew 1 Yaakov HaCohen-Kerner, Asaf Applebaum, Jacob Bitterman Department of Computer Science Jerusalem College of Technology
Slovak Automatic Dictation System for Judicial Domain
Slovak Automatic Dictation System for Judicial Domain Milan Rusko 1(&), Jozef Juhár 2, Marián Trnka 1, Ján Staš 2, Sakhia Darjaa 1, Daniel Hládek 2, Róbert Sabo 1, Matúš Pleva 2, Marián Ritomský 1, and
German Speech Recognition: A Solution for the Analysis and Processing of Lecture Recordings
German Speech Recognition: A Solution for the Analysis and Processing of Lecture Recordings Haojin Yang, Christoph Oehlke, Christoph Meinel Hasso Plattner Institut (HPI), University of Potsdam P.O. Box
Building A Vocabulary Self-Learning Speech Recognition System
INTERSPEECH 2014 Building A Vocabulary Self-Learning Speech Recognition System Long Qin 1, Alexander Rudnicky 2 1 M*Modal, 1710 Murray Ave, Pittsburgh, PA, USA 2 Carnegie Mellon University, 5000 Forbes
THE RWTH ENGLISH LECTURE RECOGNITION SYSTEM
THE RWTH ENGLISH LECTURE RECOGNITION SYSTEM Simon Wiesler 1, Kazuki Irie 2,, Zoltán Tüske 1, Ralf Schlüter 1, Hermann Ney 1,2 1 Human Language Technology and Pattern Recognition, Computer Science Department,
An Arabic Text-To-Speech System Based on Artificial Neural Networks
Journal of Computer Science 5 (3): 207-213, 2009 ISSN 1549-3636 2009 Science Publications An Arabic Text-To-Speech System Based on Artificial Neural Networks Ghadeer Al-Said and Moussa Abdallah Department
TED-LIUM: an Automatic Speech Recognition dedicated corpus
TED-LIUM: an Automatic Speech Recognition dedicated corpus Anthony Rousseau, Paul Deléglise, Yannick Estève Laboratoire Informatique de l Université du Maine (LIUM) University of Le Mans, France [email protected]
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS PIERRE LANCHANTIN, ANDREW C. MORRIS, XAVIER RODET, CHRISTOPHE VEAUX Very high quality text-to-speech synthesis can be achieved by unit selection
Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus
Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus Yousef Ajami Alotaibi 1, Mansour Alghamdi 2, and Fahad Alotaiby 3 1 Computer Engineering Department, King Saud University,
Automated Transcription of Conversational Call Center Speech with Respect to Non-verbal Acoustic Events
Automated Transcription of Conversational Call Center Speech with Respect to Non-verbal Acoustic Events Gellért Sárosi 1, Balázs Tarján 1, Tibor Fegyó 1,2, and Péter Mihajlik 1,3 1 Department of Telecommunication
CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content
CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content Martine Garnier-Rizet 1, Gilles Adda 2, Frederik Cailliau
Presentation Objective: Benefits to New Employee. Goal for Participants. Benefits to the Administration
Imaging Department Quick Step How to Create and Implement a Rapid Integration Orientation New Employee Orientation builds a bridge between the old job and the new job for better quality in health care
Reading Assistant: Technology for Guided Oral Reading
A Scientific Learning Whitepaper 300 Frank H. Ogawa Plaza, Ste. 600 Oakland, CA 94612 888-358-0212 www.scilearn.com Reading Assistant: Technology for Guided Oral Reading Valerie Beattie, Ph.D. Director
ASSIGNMENT 2: THE HEALTHCARE RECORD AND HEALTHCARE DOCUMENTATION TECHNOLOGY
ASSIGNMENT 2: THE HEALTHCARE RECORD AND HEALTHCARE DOCUMENTATION TECHNOLOGY Read Chapters 2 and 3 in your textbook, Healthcare Documentation: Fundamentals and Practice. Then read Assignment 2 in this study
C E D A T 8 5. Innovating services and technologies for speech content management
C E D A T 8 5 Innovating services and technologies for speech content management Company profile 25 years experience in the market of transcription/reporting services; Cedat 85 Group: Cedat 85 srl Subtitle
Improving Data Driven Part-of-Speech Tagging by Morphologic Knowledge Induction
Improving Data Driven Part-of-Speech Tagging by Morphologic Knowledge Induction Uwe D. Reichel Department of Phonetics and Speech Communication University of Munich [email protected] Abstract
AUDIMUS.media: A Broadcast News Speech Recognition System for the European Portuguese Language
AUDIMUS.media: A Broadcast News Speech Recognition System for the European Portuguese Language Hugo Meinedo, Diamantino Caseiro, João Neto, and Isabel Trancoso L 2 F Spoken Language Systems Lab INESC-ID
First, read the Editing Software Overview that follows so that you have a better understanding of the process.
Instructions In this course, you learn to transcribe and edit reports. When transcribing a report, you listen to dictation and create the entire document. When editing a report, the speech recognition
Strategies for Training Large Scale Neural Network Language Models
Strategies for Training Large Scale Neural Network Language Models Tomáš Mikolov #1, Anoop Deoras 2, Daniel Povey 3, Lukáš Burget #4, Jan Honza Černocký #5 # Brno University of Technology, Speech@FIT,
Creation of a Speech to Text System for Kiswahili
Creation of a Speech to Text System for Kiswahili Dr. Katherine Getao and Evans Miriti University of Nairobi Abstract The creation of a speech to text system for any language is an onerous task. This is
Golder Cat-Scan & MRI Center Automates Imaging Center Workflows
CASE STUDY Golder Cat-Scan & MRI Center Automates Imaging Center Workflows any health information system any application, for any specialty, anywhere endless possibilities TEXAS QUICK FACTS GOLDER CAT-SCAN
Evaluation of speech technologies
CLARA Training course on evaluation of Human Language Technologies Evaluations and Language resources Distribution Agency November 27, 2012 Evaluation of speaker identification Speech technologies Outline
Medical Transcription
TM Medical Is your outsourcing partner reliable? Are Indian companies geared to render the required standards for services? Does your vendor commit to ensure minimizing your business risks? Copyright Pradot
LSTM for Punctuation Restoration in Speech Transcripts
LSTM for Punctuation Restoration in Speech Transcripts Ottokar Tilk, Tanel Alumäe Institute of Cybernetics Tallinn University of Technology, Estonia [email protected], [email protected] Abstract
Generating Training Data for Medical Dictations
Generating Training Data for Medical Dictations Sergey Pakhomov University of Minnesota, MN [email protected] Michael Schonwetter Linguistech Consortium, NJ [email protected] Joan Bachenko
Dragon Solutions Using A Digital Voice Recorder
Dragon Solutions Using A Digital Voice Recorder COMPLETE REPORTS ON THE GO USING A DIGITAL VOICE RECORDER Professionals across a wide range of industries spend their days in the field traveling from location
Dragon Solutions Transcription Workflow
Solutions Transcription Workflow summary Improving Transcription and Workflow Efficiency Law firms have traditionally relied on expensive paralegals, legal secretaries, or outside services to transcribe
Using the Amazon Mechanical Turk for Transcription of Spoken Language
Research Showcase @ CMU Computer Science Department School of Computer Science 2010 Using the Amazon Mechanical Turk for Transcription of Spoken Language Matthew R. Marge Satanjeev Banerjee Alexander I.
Great Basin College is offering online training for Medical Transcriptionist
Great Basin College is offering online training for Medical Transcriptionist ENJOY THE FREEDOM OF WORKING FROM HOME Set your own hours Have job flexibility Do interesting work Spend more time with your
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast
Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast Hassan Sawaf Science Applications International Corporation (SAIC) 7990
Delivering Accelerated Results
Delivering Accelerated Results Presents An Integrated Radiology Information System OVERVIEW E*HealthLine s Radiology Information System is a comprehensive HealthCare Information Management System that
Course Outline/Description Court Reporting
ASSOCIATE IN SPECIALIZED BUSINESS (ASB) DEGREE PROGRAM Nine 15-week Terms (DAY) Twelve 15-week Terms (EVENING) Photo taken at Philadelphia City Hall OBJECTIVE Students will develop computer-compatible,
Speech Analytics. Whitepaper
Speech Analytics Whitepaper This document is property of ASC telecom AG. All rights reserved. Distribution or copying of this document is forbidden without permission of ASC. 1 Introduction Hearing the
U.S. Bureau of Labor Statistics. Radiology Tech
From the: U.S. Bureau of Labor Statistics Radiology Tech What They Do Radiologic technologists (RTs) perform diagnostic imaging examinations, such as x rays, on patients. Duties RTs typically do the following:
Industry Guidelines on Captioning Television Programs 1 Introduction
Industry Guidelines on Captioning Television Programs 1 Introduction These guidelines address the quality of closed captions on television programs by setting a benchmark for best practice. The guideline
Efficient diphone database creation for MBROLA, a multilingual speech synthesiser
Efficient diphone database creation for, a multilingual speech synthesiser Institute of Linguistics Adam Mickiewicz University Poznań OWD 2010 Wisła-Kopydło, Poland Why? useful for testing speech models
NEW YORK STATE UNIFIED COURT SYSTEM OFFICE OF COURT ADMINISTRATION. EXAMINATION ORIENTATION GUIDE: 2015 Senior Court Reporter Examination
NEW YORK STATE UNIFIED COURT SYSTEM OFFICE OF COURT ADMINISTRATION EXAMINATION ORIENTATION GUIDE: 2015 Senior Court Reporter Examination Examination Numbers 45/55-787 June 2015 This Orientation Guide for
Court Reporting AAS Degree Program Overview
Court Reporting AAS Degree Program Overview Program Objectives In keeping with the mission and institutional objectives of the University, the following objectives will guide the quality of our court reporting
Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet)
Proceedings of the Twenty-Fourth Innovative Appications of Artificial Intelligence Conference Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet) Tatsuya Kawahara
Speech Recognition: What's Coming and Impact
Speech Recognition: What's Coming and Impact Lincoln L. Berland, M.D., F.A.C.R. University of Alabama at Birmingham Disclosure Consultant to Nuance, Inc. Learning Objectives Appreciate the relatively limited
Dragon Solutions. Using A Digital Voice Recorder
Dragon Solutions Using A Digital Voice Recorder COMPLETE REPORTS ON THE GO USING A DIGITAL VOICE RECORDER Professionals across a wide range of industries spend their days in the field traveling from location
The Royal College of Radiologists
The Royal College of Radiologists Board of the Faculty of Clinical Radiology Electronic remote requesting This guidance is only available electronically from www.rcr.ac.uk This guidance forms part of a
Training in Clinical Radiology
Training in Clinical Radiology What is Clinical Radiology? Clinical radiology relates to the diagnosis or treatment of a patient through the use of medical imaging. Diagnostic imaging uses plain X-ray
Editorial Manager(tm) for Journal of the Brazilian Computer Society Manuscript Draft
Editorial Manager(tm) for Journal of the Brazilian Computer Society Manuscript Draft Manuscript Number: JBCS54R1 Title: Free Tools and Resources for Brazilian Portuguese Speech Recognition Article Type:
Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words
, pp.290-295 http://dx.doi.org/10.14257/astl.2015.111.55 Efficient Techniques for Improved Data Classification and POS Tagging by Monitoring Extraction, Pruning and Updating of Unknown Foreign Words Irfan
Transcription FAQ. Can Dragon be used to transcribe meetings or interviews?
Transcription FAQ Can Dragon be used to transcribe meetings or interviews? No. Given its amazing recognition accuracy, many assume that Dragon speech recognition would be an ideal solution for meeting
Carla Simões, [email protected]. Speech Analysis and Transcription Software
Carla Simões, [email protected] Speech Analysis and Transcription Software 1 Overview Methods for Speech Acoustic Analysis Why Speech Acoustic Analysis? Annotation Segmentation Alignment Speech Analysis
Optimizing Clinical Productivity:
White Paper Optimizing Clinical Productivity: Using Speech Recognition with Medical Features vs. a General Purpose Solution Highly-Accurate Medical Speech Recognition for Busy Clinicians Provides Better
DRAGON NATURALLYSPEAKING 12 FEATURE MATRIX COMPARISON BY PRODUCT EDITION
1 Recognition Accuracy Turns your voice into text with up to 99% accuracy NEW - Up to a 20% improvement to out-of-the-box accuracy compared to Dragon version 11 Recognition Speed Words appear on the screen
Robust Methods for Automatic Transcription and Alignment of Speech Signals
Robust Methods for Automatic Transcription and Alignment of Speech Signals Leif Grönqvist ([email protected]) Course in Speech Recognition January 2. 2004 Contents Contents 1 1 Introduction 2 2 Background
Master of Arts in Linguistics Syllabus
Master of Arts in Linguistics Syllabus Applicants shall hold a Bachelor s degree with Honours of this University or another qualification of equivalent standard from this University or from another university
Health Management Information Systems: Medical Imaging Systems. Slide 1 Welcome to Health Management Information Systems, Medical Imaging Systems.
Health Management Information Systems: Medical Imaging Systems Lecture 8 Audio Transcript Slide 1 Welcome to Health Management Information Systems, Medical Imaging Systems. The component, Health Management
APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA
APPLYING MFCC-BASED AUTOMATIC SPEAKER RECOGNITION TO GSM AND FORENSIC DATA Tuija Niemi-Laitinen*, Juhani Saastamoinen**, Tomi Kinnunen**, Pasi Fränti** *Crime Laboratory, NBI, Finland **Dept. of Computer
http://www.bls.gov/oco/ocos271.htm Medical Transcriptionists
http://www.bls.gov/oco/ocos271.htm Medical Transcriptionists * Nature of the Work * Training, Other Qualifications, and Advancement * Employment * Job Outlook * Projections Data * Earnings * OES Data *
C115 Office Administration - Medical MTCU Code 52308 Program Learning Outcomes
C115 Office Administration - Medical MTCU Code 52308 Program Learning Outcomes Synopsis of the Vocational Learning Outcomes* The graduate has reliably demonstrated the ability to: 1. apply scheduling,
Speech Transcription
TC-STAR Final Review Meeting Luxembourg, 29 May 2007 Speech Transcription Jean-Luc Gauvain LIMSI TC-STAR Final Review Luxembourg, 29-31 May 2007 1 What Is Speech Recognition? Def: Automatic conversion
Scalable Web and Mobile Solution for Healthcare Software Provider
Scalable Web and Mobile Solution for Healthcare Software Provider The Client Overview Our client is a leading healthcare software vendor, providing solutions catering to a niche area of radiology diagnostics,
Automated Speech to Text Transcription Evaluation
Automated Speech to Text Transcription Evaluation Ryan H Email: [email protected] Haikal Saliba Email: [email protected] Patrick C Email: [email protected] Bassem Tossoun Email: [email protected]
Effect of Captioning Lecture Videos For Learning in Foreign Language 外 国 語 ( 英 語 ) 講 義 映 像 に 対 する 字 幕 提 示 の 理 解 度 効 果
Effect of Captioning Lecture Videos For Learning in Foreign Language VERI FERDIANSYAH 1 SEIICHI NAKAGAWA 1 Toyohashi University of Technology, Tenpaku-cho, Toyohashi 441-858 Japan E-mail: {veri, nakagawa}@slp.cs.tut.ac.jp
Evaluating grapheme-to-phoneme converters in automatic speech recognition context
Evaluating grapheme-to-phoneme converters in automatic speech recognition context Denis Jouvet, Dominique Fohr, Irina Illina To cite this version: Denis Jouvet, Dominique Fohr, Irina Illina. Evaluating
OPTIMIZATION OF NEURAL NETWORK LANGUAGE MODELS FOR KEYWORD SEARCH. Ankur Gandhe, Florian Metze, Alex Waibel, Ian Lane
OPTIMIZATION OF NEURAL NETWORK LANGUAGE MODELS FOR KEYWORD SEARCH Ankur Gandhe, Florian Metze, Alex Waibel, Ian Lane Carnegie Mellon University Language Technology Institute {ankurgan,fmetze,ahw,lane}@cs.cmu.edu
