Live subtitling with speech recognition: Causes and consequences of revisions in the production process
  Luuk Van Waes, Mariëlle Leijten & Aline Remael
  Master in de Meertalige Professionele Communicatie
Introduction
  Master Multilingual Professional Communication
Program
  1. Workflow of live subtitling: Mono, Duo, Multi
  2. Organisation of live subtitling at VRT
  3. Experiment: three live subtitling conditions (mono)
  4. Observation: live subtitling process (duo vs. multi)
Workflow of live subtitling: Mono
  Live subtitling model
  [Figure: stages labelled respeaking, broadcasting, correcting; numbered marker 1]
  Figure based on Pieters & Rottier, 2011
Workflow of live subtitling: Duo
  Live subtitling model (without antenna delay)
  [Figure: stages labelled respeaking, broadcasting, correcting; numbered markers 1, 2, 1]
  Figure based on Pieters & Rottier, 2011
Workflow of live subtitling: Duo
  Live subtitling model (with antenna delay)
  [Figure: stages labelled respeaking, broadcasting, correcting; numbered markers 1, 4, 2, 3]
  Figure based on Pieters & Rottier, 2011
Live subtitling process
  Temporal representation
  Source: Luyckx et al. 2010
  Accuracy: 98.5 to 99.0%
  Delay: 6 to 10 seconds
Research focus
  [Figure labels: previously; part 2; part 1]
Experiment: effect of reduction on errors
  Aim
    Effect of reduction strategy on error correction
    Methodological exploration: observation method and multilevel analyses
  Method
    12 live subtitlers, Flemish Public Television (VRT)
    8 men, 4 women
    Various experience levels (1 to 7 years)
    Infotainment talk show Phara
    3 excerpts (15 minutes)
    Number of subtitles: 4351
Experiment: three subtitling conditions
  Verbatim subtitling (9 min)
    Aim at 100% subtitling. Quantity > Quality.
  Summarized subtitling (15 min)
    Aim at 50% subtitling. Quantity = Quality. (usual)
  Heavily reduced subtitling (15 min)
    Aim at 25% subtitling. Quantity < Quality. (no errors)
Observation method
  Inputlog (www.inputlog.net > free for research purposes)
  Example of output
Results
  Reduction: audio, concept, subtitle (DNS Inputlog)
  Example 1
    TV audio (transcription): Ja, maar laten we het nog even bij de politici houden. (Yes, but let's stick to the politicians for a while longer.) 55 char, 11 words
    Dictated concept: We houden het bij het de politici. (We'll stick to the the politicians.) 32 char, 7 words
    Subtitle: We houden het bij de politici. (We'll stick to the politicians.) 29 char, 6 words
    % char: 53%, % words: 55%
  Example 2
    TV audio (transcription): Het is toch zo dat de Franstalige partijen begonnen zijn, he? (It is true that the French-speaking parties have started, isn't it?) 52 char, 11 words
    Dictated concept: De Franstalige partijen zijn begonnen. (The French-speaking parties have started.) 37 char, 5 words
    Subtitle: De Franstalige partijen zijn toch begonnen. (The French-speaking parties have started indeed.) 41 char, 6 words
    % char: 78%, % words: 55%
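The percentages in the reduction examples appear to be the subtitle counts divided by the audio counts (for instance 6 of 11 words, 29 of 55 characters). A minimal sketch of that arithmetic, assuming words are whitespace-separated tokens; the function names are illustrative and are not part of Inputlog. The character counts are taken from the slide as given, because the slide does not state its character-counting rule.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens; punctuation stays attached to words."""
    return len(text.split())


def retained_percentage(source_count: int, subtitle_count: int) -> int:
    """Share of the source (in %) that survives in the subtitle, rounded."""
    if source_count <= 0:
        raise ValueError("source_count must be positive")
    return round(100 * subtitle_count / source_count)


# Example 1 from the slide: audio transcription and final subtitle
audio = "Ja, maar laten we het nog even bij de politici houden."
subtitle = "We houden het bij de politici."

# Word counts are recomputed from the texts (11 and 6 words)
word_pct = retained_percentage(count_words(audio), count_words(subtitle))

# Character counts (55 and 29) are copied from the slide, not recomputed
char_pct = retained_percentage(55, 29)

print(word_pct, char_pct)  # prints: 55 53
```

A lower percentage means stronger reduction: in this example roughly half of the spoken material reaches the screen.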
Errors produced and corrected (%)
  [Bar chart: concept vs. final, for the verbatim, summarized, and extremely reduced conditions; data labels 12%, 56%, 70%]
Multilevel model error correction
Reduction percentage
  Net zero model is significantly better than zero model (p < .001)
                                Zero model         Net zero model
                                Est.     SE        Est.     SE
  Verbatim (intercept)          64,85    7,70      59,98    1,08
  Summarized                                        5,87    1,18    (65,85)
  Extremely reduced                                 8,10    1,23    (68,08)
  Other predictors (Est., SE)
    Mean delay                    0,33    0,11
    # words in spoken comment     0,14    0,02
    Percentage 100% reduction     0,44    0,03
    Number of corrected errors    0,32    0,16    (p > ,01)
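The values 65,85 and 68,08 on this slide match the verbatim intercept plus the estimate for each condition. A minimal sketch of that arithmetic, assuming verbatim is the reference category of the net zero model; the names below are illustrative and do not come from the original analysis. Python uses decimal points where the slide uses decimal commas.

```python
# Fixed-effect estimates copied from the net zero model on the slide
INTERCEPT_VERBATIM = 59.98
CONDITION_EFFECTS = {
    "verbatim": 0.0,            # reference category (assumption)
    "summarized": 5.87,
    "extremely reduced": 8.10,
}


def predicted_value(condition: str) -> float:
    """Intercept plus the condition's estimate, rounded to two decimals."""
    return round(INTERCEPT_VERBATIM + CONDITION_EFFECTS[condition], 2)


for condition in CONDITION_EFFECTS:
    print(condition, predicted_value(condition))
```

The sketch covers only the condition estimates; the other predictors on the slide (mean delay, number of words, corrected errors) are left out because the slide does not show how they were scaled.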
Effect of error correction
  [Plot: axis labels reduction % and number of errors; series extremely reduced, summarized, verbatim]
  Conclusion
    Significant effect of error correction on reduction
    Inputlog observation allows for a fine-grained and powerful statistical analysis in an experimental setting
Subtitling model & error correction
  What is the influence of the type of subtitling model on error correction?
  Villa politica
  Extra Time
Design and materials
                                          Villa politica (antenna delay)   Extra Time (live)
  Time subtitling process (in minutes)    103'28"                          55'10"
  # subtitles                             1882                             1098
  # subtitles per minute                  9,10                             9,87
  # corrected subtitles                   1281 (68%)                       586 (53%)
  # corrected subtitles per minute        6,19                             5,31
  Three categories:
    General error (add/delete/substitute word, etc.)
    Punctuation (add/delete/substitute comma, etc.)
    Combination (GE&P)
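The percentages next to the corrected-subtitle counts follow from dividing corrected subtitles by all subtitles per programme. A minimal sketch that reproduces them from the counts on the slide; the class and attribute names are illustrative. The per-minute rates are not recomputed here, because the slide does not say which time base they use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgrammeCounts:
    """Subtitle counts for one programme, copied from the slide."""
    name: str
    subtitles: int
    corrected: int

    def corrected_share(self) -> int:
        """Percentage of subtitles that were corrected, rounded."""
        return round(100 * self.corrected / self.subtitles)


programmes = [
    ProgrammeCounts("Villa politica", subtitles=1882, corrected=1281),
    ProgrammeCounts("Extra Time", subtitles=1098, corrected=586),
]

for programme in programmes:
    print(f"{programme.name}: {programme.corrected_share()}% corrected")
# prints:
# Villa politica: 68% corrected
# Extra Time: 53% corrected
```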
Results
  Various types of error correction (%)
Processing time
  The interaction between respeaker and corrector.
  Average process time = 4,45 seconds per subtitle
  Both respeaker and corrector correct subtitle = 17,52 s
  [Figure labels: correct subtitle x 4; + subtitle with error]
Processing time (2)
  The interaction between respeaker and corrector.
  Respeaker corrects subtitle = 11 s
  Corrector corrects subtitle = 10,9 s
  [Figure labels: subtitle with error = cor x 2; subtitle with error]
Conclusion
  [Diagram labels: effect (three times); suitable methodology; description; analyses]
Further Research
  Method validation
  Replication studies
  Interlingual live subtitling (eye tracking)
  Qualitative analyses (perception research quality)
  Development of reduction guidelines (classification)
Acknowledgements
  Tijs Delbeke & Bieke Luyckx (project assistants)
  MPC students
    Hans Pieters
    Eline Rotthiers
    Anniek Sniekers
  Subtitlers of VRT
Researchers
  Luuk Van Waes
    University of Antwerp
    www.ua.ac.be/luuk.vanwaes
    Professional business communication; Writing and digital media; Inputlog
  Mariëlle Leijten
    Flanders Research Foundation, University of Antwerp
    www.ua.ac.be/marielle.leijten
    Reading during writing (TPSF); Multiple sources; Inputlog
  Aline Remael
    Artesis College University
    www.alineremael.be
    Audio transcription; Live speech subtitling