Transcribing and annotating audio and video. Jeff Good, MPI EVA and the Rosetta Project, good@eva.mpg.de
Goals of presentation: Discuss basic concepts of audio and video transcription and annotation. Illustrate the process of transcription and annotation using Elan.
What is annotation? In recent years, a new conception of language documentation has been emerging (see, e.g., Himmelmann (1998), Woodbury (2003)). This view takes primary sources of data (e.g., audio and video) to be the foundational materials for language documentation.
What is annotation? Traditional linguistics is then conceptualized as annotations on primary data, including transcription of audio or video and annotation for grammatical analysis (e.g., interlinear glossing).
Text annotation example (the bracketed material is an extra layer of annotation):
  Cicko, [ch aara a goj,] i bu u.
  cat.erg fish & see.cvpan 3s.abs b.eat.prs
  "The cat sees a fish and eats it." (Example from Chechen)
Why does this matter? It's a pretty different way of doing language documentation than before. It forms the conceptual underpinnings of the functionality of annotation tools. It can be a lot more work at first... with (hopefully) a worthwhile payoff.
Good annotation: Under present thinking, good annotations should have the following properties: archival format; time-aligned to primary data; transparent, documented terminology.
Archival format: How do you make sure your annotations are in an archival format? Short answer: Use a tool designed for research purposes (e.g., Elan, Shoebox). What not to do? Use FileMaker, Microsoft Word, etc., without having a conversion strategy.
Archival format: Simple tip: If the annotation file isn't designed to be easily opened in a plain text editor (e.g., Notepad, TextEdit), it's not archival. The biggest mistake people make isn't deliberately choosing a program that uses a bad format; it's not even thinking about formats before using some program.
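The simple tip can be turned into a rough automated check. The sketch below is only a heuristic, and the function name `looks_like_plain_text` is made up for illustration: it treats a file as plain text if the start of the file contains no NUL bytes and decodes as UTF-8. Passing the check doesn't make a format archival, and a text file in another encoding (e.g., UTF-16) will fail it, so treat a failure as a prompt to look closer rather than a verdict.

```python
import codecs
from pathlib import Path


def looks_like_plain_text(path, sample_size=4096):
    """Heuristic: True if the start of the file is NUL-free, valid UTF-8."""
    data = Path(path).read_bytes()
    sample = data[:sample_size]
    if b"\x00" in sample:
        return False
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # If the sample was cut short, tolerate a multi-byte character
        # that was split at the cut; otherwise require a complete decode.
        decoder.decode(sample, final=len(data) <= sample_size)
    except UnicodeDecodeError:
        return False
    return True
```

Running this over a project folder before you commit to a tool is a cheap way to find out which of your files a plain text editor would choke on.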
Time-aligned annotation: When you're annotating audio and video, ideally, you want the annotations to be time-aligned. That is, you want them to be linked to appropriate sections of the audio and video recording. This allows you or another researcher to have access to the primary data on which an annotation is based.
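As a data structure, a time-aligned annotation is just a value plus two offsets into the recording. The sketch below is a minimal illustration, not how any particular tool stores its data; the class name, field names, and sample values are all invented.

```python
from dataclasses import dataclass


@dataclass
class TimedAnnotation:
    start_ms: int  # offset into the recording where the stretch begins
    end_ms: int    # offset into the recording where the stretch ends
    text: str      # the transcription or other annotation value


def annotations_at(annotations, time_ms):
    """Return the annotations whose time span covers the given offset."""
    return [a for a in annotations if a.start_ms <= time_ms < a.end_ms]


# Invented sample data for illustration.
transcript = [
    TimedAnnotation(0, 1800, "first utterance"),
    TimedAnnotation(1800, 4200, "second utterance"),
]
```

With the offsets in place, "what was being said two seconds in?" becomes `annotations_at(transcript, 2000)`, and going the other way (from an annotation back to the audio) only needs `start_ms` and `end_ms`.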
Time-aligned annotation in Elan
Terminology: When doing annotation of linguistic data, there will always be a need for specialized terminology. For example, transcription systems, like IPA, are a type of specialized terminology. Interlinear glossing also uses specialized terminology (e.g., sg for Singular, itself a specialized term).
Terminology: When possible, use existing standard term sets (and document that you've done this). For example: IPA, with notes on any modifications/interpretation; the Leipzig glossing rules for interlinear abbreviations.
Terminology: Document the use of any special conventions you devise for your data. Develop controlled vocabularies and make use of any features of your tools supporting their use. Controlled vocabulary: a standardized list of terms used for annotating data.
Controlled vocabularies in Elan
Terminology: Possible controlled vocabularies: Yes/No; speaker identifiers; Left/Center/Right (for eye gaze); grammatical phenomena of particular interest.
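If your tool doesn't enforce a controlled vocabulary for you, a check along the following lines can catch stray values after the fact. This is a sketch only: the vocabulary, the sample tier, and the function name `invalid_values` are invented for illustration.

```python
# Hypothetical eye-gaze vocabulary, following the Left/Center/Right example.
GAZE = frozenset({"Left", "Center", "Right"})


def invalid_values(values, vocabulary):
    """Return (position, value) pairs for values outside the vocabulary."""
    return [(i, v) for i, v in enumerate(values) if v not in vocabulary]


# Invented annotation values, two of which break the convention.
gaze_tier = ["Left", "left", "Centre", "Right"]
problems = invalid_values(gaze_tier, GAZE)
```

Here `problems` picks out `"left"` and `"Centre"`: exactly the kind of small inconsistency that makes later searching unreliable.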
Elan: Elan is a time-aligned annotation tool available at http://www.mpi.nl/tools. It supports annotation of audio (in WAV format) and video (in MPEG and QuickTime format).
Elan: Noteworthy features of Elan: designed in the context of language documentation; supports Unicode; export/import of Shoebox files; user-defined annotation tiers.
Annotation tiers: In Elan, tiers are where annotations are located. A tier can be thought of as a line in an analyzed text. For example: transcription tier, morpheme-analysis tier, interlinear tier, free translation tier.
Anatomy of an Elan window (figure labels: annotation viewer, wave form, tiers, annotations)
Elan in action: A brief demonstration of Elan, including the tiers I've been using, making a new annotation, and searching across annotations.
Elan's tier types: Time-aligned. The foundational annotation, directly aligned to audio or video. Typical example: sentence transcription.
Elan's tier types: Time subdivision. Must be linked to a basic time-aligned tier. Allows you to make subdivisions of that tier with their own timestamps. Typical example: words in a sentence.
Elan's tier types: Symbolic subdivision. Must be linked to another tier. Allows that tier to be subdivided without times associated with the subdivision. Typical example: morpheme subdivision.
Elan's tier types: Symbolic association. Must be linked to another tier. Cannot be subdivided further. Typical example: free translation.
Schematic example of tier types:
  Sentence-level transcription, time-aligned with wave form: Puer puellam amat.
  Time subdivision of sentence into words: puer | puellam | amat
  Symbolic subdivision of words into morphemes: puer | puell am | am a t
  Symbolic association of morphemes with glosses: boy | girl ACC | love PRS 3s
  Symbolic association of sentence transcription with free translation: The boy loves the girl.
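One way to see how the tier types in the schematic example hang together is to model each annotation with a link to its parent, and to store times only where the tier type carries them. The sketch below is an illustration under that assumption, not Elan's internal model; the class, the function `time_span`, the tier names, and all time values are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TierAnnotation:
    value: str
    tier: str
    parent: Optional["TierAnnotation"] = None
    start_ms: Optional[int] = None  # set only on tiers that carry their own times
    end_ms: Optional[int] = None


def time_span(annotation):
    """Walk up the parent links to the nearest annotation with its own times."""
    node = annotation
    while node is not None:
        if node.start_ms is not None and node.end_ms is not None:
            return (node.start_ms, node.end_ms)
        node = node.parent
    raise ValueError("no time-aligned ancestor")


# Time-aligned tier: linked directly to the recording (times are invented).
sentence = TierAnnotation("Puer puellam amat.", "transcription", start_ms=0, end_ms=1500)
# Time subdivision: a part of the sentence with its own timestamps.
word = TierAnnotation("puellam", "words", parent=sentence, start_ms=400, end_ms=1000)
# Symbolic subdivision: a part of the word, with no times of its own.
morpheme = TierAnnotation("am", "morphemes", parent=word)
# Symbolic association: one value attached to one parent annotation.
gloss = TierAnnotation("ACC", "glosses", parent=morpheme)
translation = TierAnnotation("The boy loves the girl.", "translation", parent=sentence)
```

The gloss has no timestamps, but `time_span(gloss)` still leads back to the stretch of audio for the word it belongs to, which is the practical payoff of linking symbolic tiers to time-aligned ones.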
Tier types: These tier types weren't invented out of the blue for Elan. They correspond to the meanings of different kinds of linguistic parsing. The tool designers allow for flexibility of tier types (a good thing). It's up to the linguist to understand their data well enough to use the right tier type.
Tier templates: It is likely that for a given project you'll have some tier sets you'll use often. Elan provides the ability to save a set of tiers as a template that can be easily re-used for other projects.
Aside: .eaf files. How does Elan store annotations? Inside XML files using an .eaf extension. What does that mean? Your archivist will be happy. Hopefully, you'll never need to know more than that.
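If you ever do need to know more, the practical upshot of XML is that generic tools can read the file without Elan. The sketch below parses a hand-written, heavily simplified fragment in the style of an .eaf file. The element and attribute names reflect my understanding of the format and should be checked against the published schema; the function name and the sample content are invented.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment; a real .eaf file contains much more.
EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>Puer puellam amat.</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_aligned_annotations(xml_text):
    """Return (tier, start_ms, end_ms, value) for each time-aligned annotation."""
    root = ET.fromstring(xml_text)
    # Map time slot IDs to millisecond values, skipping slots without a value.
    times = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            rows.append((
                tier.get("TIER_ID"),
                times.get(ann.get("TIME_SLOT_REF1")),
                times.get(ann.get("TIME_SLOT_REF2")),
                ann.findtext("ANNOTATION_VALUE", default=""),
            ))
    return rows


rows = read_aligned_annotations(EAF_SAMPLE)
```

Nothing here depends on Elan being installed, which is much of what makes the format attractive to an archive.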
Elan conclusion: This is just an introduction. Elan has more features than I have made use of or can describe here. My wish list for features: integration with a phonetic analysis tool (e.g., Praat); built-in support for clip extraction.
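On clip extraction: once an annotation is time-aligned, its start and end times are all that is needed to cut the matching stretch out of a WAV file. The sketch below uses Python's standard `wave` module, which handles uncompressed PCM WAV files only; the function name `extract_clip` and its parameters are invented for illustration.

```python
import wave


def extract_clip(source_path, clip_path, start_ms, end_ms):
    """Copy the stretch between start_ms and end_ms into a new WAV file."""
    with wave.open(source_path, "rb") as src:
        rate = src.getframerate()
        # Convert millisecond offsets into frame positions.
        first = int(start_ms * rate / 1000)
        last = int(end_ms * rate / 1000)
        # Clamp the start so a request past the end yields an empty clip.
        src.setpos(min(first, src.getnframes()))
        frames = src.readframes(max(last - first, 0))
        with wave.open(clip_path, "wb") as clip:
            clip.setnchannels(src.getnchannels())
            clip.setsampwidth(src.getsampwidth())
            clip.setframerate(rate)
            clip.writeframes(frames)
```

A call like `extract_clip("session.wav", "example.wav", 400, 1000)` (file names hypothetical) would write the 0.6 seconds covered by an annotation running from 400 ms to 1000 ms.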
Other tools: There are a number of annotation tools out there. Two that also seem to be popular among linguists: Transcriber (http://trans.sourceforge.net/), apparently good for conversational recordings, and Praat (http://praat.org), primarily known as phonetic analysis software, but also with facilities for time-aligned annotation.
Why annotate? Time-aligned annotation is a lot of work. For me, it's much more time-consuming than just jotting things down in a notebook. So, why bother?
Why annotate? Intangible reasons: It's currently considered good documentary practice. It facilitates wider use of resources by other people. Time-aligned annotations combine linguistic analysis with the immediacy of a primary recording.
Why annotate? Tangible reasons: It allows you (and others) to double-check your analysis more easily. The ability to do searches across structured annotations facilitates analysis. It makes creation of sound clips much easier.
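To make the point about searching concrete: once annotations are structured records rather than notebook jottings, a search can be restricted to one tier and still return the time span needed to go back to the recording. The records, tier names, and function below are invented for illustration.

```python
import re

# Invented records: (tier name, start_ms, end_ms, annotation value).
records = [
    ("glosses", 0, 1500, "boy girl-ACC love-PRS-3s"),
    ("translation", 0, 1500, "The boy loves the girl."),
    ("glosses", 1500, 2900, "girl boy-ACC see-PRS-3s"),
]


def search(records, tier, pattern):
    """Return the records on the given tier whose value matches the pattern."""
    regex = re.compile(pattern)
    return [r for r in records if r[0] == tier and regex.search(r[3])]


# Find every glossed occurrence of the verb 'love', ignoring other tiers.
hits = search(records, "glosses", r"\blove-")
```

Each hit carries its start and end times, so checking the analysis against the audio is one step away.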
Conclusion: If you're going to go through the trouble to make good recordings... it's worth going through the trouble of annotating them well. Unsure of how to proceed? Consult the E-MELD School of Best Practices, or talk to your archivist.
References
E-MELD School of Best Practices. http://emeld.org/school/
Leipzig glossing rules. http://www.eva.mpg.de/lingua/files/morpheme.html
Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36:161-195.
Woodbury, Tony. 2003. Defining documentary linguistics. In P. Austin (Ed.), Language documentation and description, volume 1, 33-51. London: Hans Rausing Endangered Languages Project.