Introduction to transcription using CHAT (with linking of audio files)




Victoria Johansson
Humanities Lab, Lunds universitet
it-pedagog@humlab.lu.se

Contents
1 Introduction
2 CHAT
3 Transcription
3.1 The content of a CHAT-transcription
3.1.1 Headers
3.1.2 Help with creating the header
3.2 Main tiers/transcription line/speaker tier
3.2.1 Comment tiers or dependent tiers
4 Transcription exercise
4.1 Header
4.2 Save the file
4.3 Open a sound file (or movie file) to use for linking (optional)
4.4 Begin the transcription
4.5 Connect short commands to every speaker code
4.6 Using the CHECK-function IMPORTANT!!
4.7 Linking the audio (or video) file to the transcript
4.7.1 First way of linking
4.7.2 Second way of linking
4.8 Other ways to work with Sonic Mode
5 The Dep-file
5.1 What can be changed in the dep file?

Created Nov 2006; updated 12 Feb 2008; updated 29 Jan 2009; first English version February 2011; updated English version Feb 2012; updated April 2013

1 Introduction

This tutorial introduces the transcription system minCHAT and demonstrates how to link audio files to CHAT transcriptions.

2 CHAT

The transcription standard CHAT is an acronym for Codes for the Human Analysis of Transcripts. This manual shows how to transcribe a simple file according to the CHAT standard. Files that follow this standard can be analyzed with the CLAN programs. You can download CLAN and read more about CHAT here: http://childes.psy.cmu.edu. Look under the header Programs and database, and click on the link The CLAN program. This takes you to a page where you can choose which version to download. The program works on Mac, PC and Unix. On PC the default installation puts the program directly under C:\ (This computer); on Mac the default installation puts it in the program folder. You can specify another location if you prefer.

When you start transcribing it is useful to have the manuals at hand. An updated manual for CHAT is available here: http://childes.psy.cmu.edu/manuals/chat.pdf

It is sometimes also useful to consult the CLAN manual (i.e. the manual for the analysis programs) during transcription. It is found here: http://childes.psy.cmu.edu/manuals/clan.pdf

3 Transcription

If you transcribe in CHAT, you have to follow some fundamental rules for organizing your transcription file. Once you have included the minimal required information, you can later add other specifications that are useful for your projects/transcriptions.

A transcription following the CHAT convention is a file with the file extension .cha. These files are often called chat-files. You can use any word processor to transcribe, as long as you follow the CHAT convention. What you have to remember is to save the file as Text only and to give it the extension .cha.

There are some advantages to transcribing in e.g. Word, since you can then use advanced search and replace, or add common words and expressions to Word's Autocorrection function.

On the other hand, transcribing directly in CLAN offers some help with automating speaker codes. You can also check your transcription continuously more easily. In short, you can choose a transcription program depending on your needs, and you can switch between programs when needed.

3.1 The content of a CHAT-transcription

3.1.1 Headers

Every transcription must begin with some lines that make up the so-called header, that is, the introduction of the transcription. The lines in the header always start with the symbol @. Note that at the very end of the transcription there should also be a line called @End.

Some lines in the header are compulsory, and they have to occur in a specific order. Other lines are optional and can be added if necessary or relevant for your transcription. You can also create header lines of your own, but in that case you have to add them to the so-called depfile (see section 5).

These header lines are compulsory:

@UTF8 should be at the top of your file, although it is invisible in your transcript.
@Begin marks the beginning of the file.
@Languages: the principal languages of the transcript.
@Participants: lists the actors in the file.
@ID: code for a larger database transcription.
@End marks the end of the file.

Note: All header lines begin with @. The header lines have to be in a specific order.

Example of a simple header:

@UTF8
@Begin
@Languages:	swe
@Participants:	INF no01 Informant, INV Victoria Investigator
@ID:	swe|project|INF|00;00.00|male|||Informant||
@ID:	swe|project|INV|00;00.00|female|||Investigator||
@Comment:	This is a test transcript.
*INF:	bla bla.

*INF:	this is a test.
*INV:	and this is someone else speaking.
@End

Explaining the header:

@UTF8 At the very top of the file (the header), use @UTF8. This line is invisible in the CHAT file when you open it in the CLAN program. It tells the program that you use Unicode format, and it is necessary if the program is to read e.g. åäö correctly.

@Begin marks the beginning of the transcription file.

@Languages states which language(s) are used in the transcription. There is a set of pre-defined codes for the most common languages [1]. Look at the list Obligatory headers in the CHAT manual (chat.pdf).

@Participants lists the speakers in the transcription. Every participant is assigned a three-letter/digit code (e.g. INF or INV). This is followed by a description of the code (e.g. no01 or Victoria). Last, you indicate the role of the speaker (e.g. Informant or Investigator). There is a list of pre-defined roles that you can use to avoid error messages [2].

@ID gives a specific identity to every participant in the file. This is useful, for instance, if you want to run statistics on the output from a certain participant. The ID is built up of specific fields; most of them are compulsory, but some can be left empty. This is how you have to fill it in (the information is found in the CHAT manual):

language|corpus|code|age|sex|group|SES|role|education|

Language: according to the abbreviations in the list in the CLAN manual, e.g. swe for Swedish, eng for English.
Corpus: a one-word label for the corpus, in lowercase.
Code: the three-letter code for the speaker, in capitals; that is, the same code that you use under @Participants.
Age: the age of the speaker. Use the right standard: year;month.day
Sex: either male or female, in lowercase.
Group: any single-word label.
SES (socio-economic status): any single-word label.
Role: the role as given in the @Participants line.
Education: educational level of the speaker.

@End At the very end of the transcription you should have the line @End. It indicates the end of the transcription, and CLAN won't be able to perform analyses if it is not there. You have to add this line directly after the transcription, without any blank lines.

[1] "Most common" means in this case the languages that have been used for collecting child language data, transcribed in CHAT.
[2] You can create the roles you want, but then you have to add the roles to the dep-file, cf. section 5.

3.1.2 Help with creating the header

There is a shortcut for creating headers, especially the complicated @ID tier. Once you have opened a new document in CLAN (by choosing File > New), look under the menu Tiers and choose ID headers. A window opens where you can fill in the information needed to create an @ID tier for each participant.

3.2 Main tiers/transcription line/speaker tier

The main tiers (lines) contain the information about what is said. They always start with a star (*), followed by a three-letter code (unique for each speaker), a colon and a tab. Every speaker has a personal code consisting of three letters (or digits). Who is who is indicated in the header (see above).

CLAN automatically wraps your text, so return/enter should only be used when you want to indicate a new speaker (or utterance/t-unit, or whatever categories you have decided to use to distinguish between tiers). You should also use return/enter when you want to add a new comment line to the main line (more about this during the coding lecture).

Below is an example of a few transcription lines (from fao_om_3sp.tra.cha).

*INT1:	vart ä du uppvucksen [//] ä du uppvucksen på en gård, berätta hur de var, din barndom å, å?
*OM3:	nja <vi va> [/] ## vi va nie syskån.
*OM3:	å eh ## fiem [/] fiem grabba å fyre töisar.

Note the following:

- How the speaker tiers/main lines are constructed: *INT1:<tab>.
- All lines end with a space and then a full stop (.).
- All main tiers must end with a major delimiter (. ? !). Normally you use a full stop. An exclamation mark is used for exclamations or summoning, and a question mark for questions.
- Normally, it is the speaker's intonation that governs the choice of major delimiter.
- NOTE!!! It is very important that all colons, spaces and tabs are used correctly. Otherwise you will not be able to use the analysis programs later. While transcribing you should use the CHECK function as often as possible.
- A comma is used to denote phrase boundaries (it is not very commonly used).
- The speech of a speaker is often divided into several utterances. This is the choice of the transcriber. You may have decided to divide the text into t-units/macrosyntagms when transcribing, or you may have made other decisions that influence this choice.
- Only lower-case letters are used. Capital letters at the beginning of words are used for proper nouns only, and thus not to indicate the beginning of sentences.
- Several symbols are used to denote pauses, and repetitions are included (within brackets). CHAT has specific standards for these symbols and how they should be used; which ones are necessary is normally decided in a project transcription standard. (The example above follows the transcription standard of the project SweDiaSyn. The full standard can be found in the paper Transkription och direktglossning av dialektinspelningar i SweDiaSyn.)

3.2.1 Comment tiers or dependent tiers

Every transcription line, or main tier, can be followed by one or several comment tiers, or dependent tiers. They always begin with %, followed by a three-letter code. Examples of such codes are %mor, which indicates morphological coding, and %tim, which indicates the time on a corresponding (audio) tape. A useful dependent tier is %com, which can be used for comments of various kinds. If you need to perform morphological or syntactic coding, you can use these tiers. You can connect several tiers to one transcription tier, but it is not necessary to use any dependent tier.

Excerpt from a transcription with dependent tiers: in the example below you will see a morphological tier (%mor) for coding some morphological processes, and a translation tier (%eng) where the transcription is translated into English.

*IN2:	icia tama-nak kainunian tu lumaq.
%mor:	tamanak tama-nak.
%eng:	this is a house built by my father

There is a subset of predefined comment tiers that you will find in the depfile. You can add new ones if necessary.
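
The relation between a main tier and its dependent tiers can be illustrated with a small Python sketch. It is not part of CLAN: the function name group_tiers and the dictionary layout are assumptions made for illustration, and the sketch only looks at the first character of each line and at the first colon.

```python
def group_tiers(lines):
    """Group each main tier (*XXX:) with the dependent tiers (%xxx:) that follow it."""
    utterances = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("*"):
            # Main tier: speaker code before the first colon, speech after it.
            speaker, _, text = line[1:].partition(":")
            utterances.append(
                {"speaker": speaker, "text": text.strip(), "dependent": {}}
            )
        elif line.startswith("%") and utterances:
            # Dependent tier: attach it to the most recent main tier.
            code, _, text = line[1:].partition(":")
            utterances[-1]["dependent"][code] = text.strip()
        # Header lines (@...) and anything else are ignored in this sketch.
    return utterances


sample = [
    "*IN2:\ticia tama-nak kainunian tu lumaq.",
    "%mor:\ttamanak tama-nak.",
    "%eng:\tthis is a house built by my father",
]

for utterance in group_tiers(sample):
    print(utterance["speaker"], utterance["text"])
    for code, content in utterance["dependent"].items():
        print("  %" + code, content)
```

A structure like this makes it easy to see that %mor and %eng belong to the *IN2 line rather than standing on their own.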
4 Transcription exercise

This is a step-by-step guide on how to make a short transcription in CHAT.

4.1 Header

1. Open CLAN by clicking on the Start menu (bottom left corner of the computer screen) and choosing Alla program (All Programs). Then choose the program CLAN.
2. The program opens and you see two windows. The smaller one is called the Commands window, and the bigger one is called the output window. You only need the Commands window when you run analyses, so you can close it now [3].
3. If you don't have an empty output window, go to the File menu and choose New.
4. Then click in the output window to start writing.
5. Fill in the first, obligatory lines of the header. Don't forget @UTF8. Be careful with spaces, tabs, colons etc. If you don't know the speakers yet, you can wait to fill this in until you have listened to the sound file.
6. When you have finished the header, press return a few times and then add the line:
@End
Between the header and @End is where the transcription goes. When the transcription is finished, you must remove the blank lines before you check it.

4.2 Save the file

7. Save your file before you continue. Go to the File menu and choose Save as... Then give the file a name. Use a name without any diacritical letters (i.e. no åäö or accented letters), and without spaces [4]. If you are making several transcriptions in one and the same corpus, it is a good idea to think through how to name the files.
8. Save the file in your place on the server. It is good to create a special folder where you save the transcription.
9. If you work with sound files or video files, it is a good idea to save the transcription in the same folder as the audio/video file.

[3] If you need to open it again, you can press Ctrl+D, or go to the menu Windows and choose Commands.
[4] Spaces in filenames will really complicate things once you start to run analyses!
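
The naming advice in step 7 can be checked mechanically. The following Python sketch is only an illustration and not a CLAN feature: the function name filename_problems and its messages are hypothetical, and it treats every non-ASCII character as a diacritical letter, which is somewhat stricter than the advice itself.

```python
from pathlib import Path


def filename_problems(name):
    """Return a list of reasons why a transcript filename may cause trouble."""
    base = Path(name).name  # look only at the file name, not the folders
    problems = []
    if Path(base).suffix != ".cha":
        problems.append("extension is not .cha")
    if any(ch.isspace() for ch in base):
        problems.append("contains a space")
    if not base.isascii():
        problems.append("contains non-ASCII letters such as åäö or accents")
    return problems


for candidate in ["interview01.cha", "min fil.cha", "språk.txt"]:
    print(candidate, filename_problems(candidate))
```

An empty list means the name follows the advice; anything else lists what to change before saving.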

4.3 Open a sound file (or movie file) to use for linking (optional)

10. When you work with transcribing and linking sound files in CHAT, you use the so-called Sonic Mode.
11. Open the transcription, go to the Mode menu and choose the alternative Sonic mode.
12. Find your sound file. (You should have put it in the same folder as the transcription file.)
13. When you have selected your sound file, it is shown as a waveform at the bottom of the screen. If the waves are difficult to see, click on the +/-V and +/-H buttons to the left and right of the waveform.
14. In the header (at the top of the transcription) you have to add one entry: @Media:. It should be followed by the filename of the audio (or video) file. The filename should include the extension (e.g. wav). For instance:
@Media:	Victoriasoundfile.wav
15. Then you can click and drag with the mouse to mark a part of the waveform, and replay it by holding down the Ctrl key [5] while left-clicking on the selection.
16. Once you have listened to the first part a few times, it is time to type in the speakers.

4.4 Begin the transcription

17. Start every speaker line with a star (*), a three-letter code, a colon and a tab, for instance:
*INT:	va kommer du ifrån?
18. Don't forget the delimiter at the end of the line.
19. Note: You can decide yourself what level of detail you want in the transcription. You can transcribe orthographically, or more phonemically if you want to (or both). Remember to use lower-case letters (except for proper nouns), and end every line with a full stop, exclamation mark or question mark. You can choose to start a new line after every clause or every sentence, or choose some other division. But be consistent!

[5] If you use a Mac, use the apple key instead.
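
The rules in steps 17 to 19 can be turned into a rough pre-check of a single speaker line. This Python sketch is not a substitute for CLAN's own CHECK function; the function name check_main_tier is hypothetical, the speaker code is assumed to be exactly three capital letters or digits, and the line is assumed not to be linked to sound yet (a linking bullet after the delimiter would be reported as a missing delimiter).

```python
import re


def check_main_tier(line):
    """Return a list of formatting problems found in one speaker line."""
    if not line.startswith("*"):
        return ["does not start with *"]
    problems = []
    code, colon, rest = line[1:].partition(":")
    if not colon:
        problems.append("missing colon after speaker code")
    if not re.fullmatch(r"[A-Z0-9]{3}", code):
        problems.append("speaker code is not three capital letters or digits")
    if not rest.startswith("\t"):
        problems.append("no tab after the colon")
    if not rest.rstrip().endswith((".", "?", "!")):
        problems.append("no final delimiter (. ? !)")
    return problems


print(check_main_tier("*INT:\tva kommer du ifrån?"))
print(check_main_tier("*INT: va kommer du ifrån"))
```

The first call reports no problems; the second reports the space used instead of a tab and the missing delimiter.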

4.5 Connect short commands to every speaker code

When you have transcribed a few lines, you can use CLAN to connect short commands to every speaker code.

20. Go to the File menu and save the file.
21. Then go to the Tiers menu and choose the alternative Update.
22. Then go back to the Tiers menu again and see the results. A short command should now have been created for every speaker (that you have transcribed so far). This means that you won't have to type the speaker code every time; instead you can press Ctrl+1, Ctrl+2 etc. You will soon get used to this when you transcribe.

4.6 Using the CHECK-function IMPORTANT!!

1. When you have transcribed a bit, it is important to check that you have followed the CHAT convention. Do this by using the CHECK option.
2. When you are in CLAN, go to the Mode menu and choose Check opened file.
3. The program checks whether everything is fine. If you are lucky, you get the message Success! No errors found. Otherwise the program will probably tell you what is wrong, so that you can fix it. It is important to have an error-free file if you want to run analyses later.

4.7 Linking the audio (or video) file to the transcript

There are several ways of linking sections of the audio tape to the transcription.

1. If you haven't done so, start by putting the sound file in the same folder as your transcription.
2. Then check that you have added the entry @Media: to the header, followed by the name of the sound file, for instance like this:
@Media:	Victoriasoundfile.wav
3. Then go to the Mode menu and choose the alternative Sonic mode.
4. Find your sound file. (You should have put it in the same folder as the transcription file.) If you have put the sound file in the same folder as your transcription, and if you have added the sound file name after @Media in the header, the program will probably find, identify and import the file automatically. Otherwise, a window opens that lets you browse to find the file.

5. When you have selected your sound file, it is shown as a waveform at the bottom of the screen. If the waves are difficult to see, click on the +/-V and +/-H buttons to the left and right of the waveform.

4.7.1 First way of linking

6. Highlight a part of the waveform at the bottom of the transcription by clicking and dragging with the mouse. It should start playing immediately. You can easily adjust the selection by holding the Shift key and expanding the selection to the left or right with the mouse.
7. If you want to replay the selection, hold the Ctrl (control) key and left-click on the selection.
8. Once you have identified the stretch of speech that corresponds to your first utterance, put your cursor at the end of that utterance (after the delimiter), make sure that the selection is still highlighted, and then press the little S at the far left of the waveform. This creates a bullet at the end of the transcription line, indicating that the transcription is linked to the sound file.
9. If you go to the Mode menu and select Expand bullets, the bullets expand and you see the duration and location of the link. To hide this information, go to the Mode menu again and select Hide bullets.
10. You can go through the transcription like this to link the utterances to the right parts of the sound file.

4.7.2 Second way of linking

11. If you want to, you can also listen to the whole sound file and do the linking while you are listening. Do it like this:
12. Put your cursor at the end of the first transcription line (after the delimiter).
13. Go to the Mode menu and choose Transcribe Sound or Movie.
14. The sound file starts playing immediately. When you have heard the first utterance, press the Space bar. A bullet appears at the end of the first utterance, and the cursor automatically moves to the end of the second utterance. When you have heard the second utterance, press the Space bar again, and so on.
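
Steps 1 and 2 of section 4.7 (sound file in the same folder, @Media entry in the header) can be verified before opening Sonic mode. The Python sketch below is an illustration under assumptions: the function name media_file_for is hypothetical, and it expects the @Media line in the form used in this tutorial, that is, the entry followed by a filename with extension.

```python
import tempfile
from pathlib import Path


def media_file_for(transcript_path):
    """Return the path named in @Media, placed next to the transcript, or None."""
    transcript = Path(transcript_path)
    for line in transcript.read_text(encoding="utf-8").splitlines():
        if line.startswith("@Media:"):
            name = line[len("@Media:"):].strip()
            return transcript.parent / name
    return None  # no @Media entry in the header


# Small demonstration in a temporary folder (file names are examples only).
with tempfile.TemporaryDirectory() as folder:
    cha = Path(folder) / "test.cha"
    cha.write_text(
        "@UTF8\n@Begin\n@Media:\tVictoriasoundfile.wav\n@End\n",
        encoding="utf-8",
    )
    media = media_file_for(cha)
    print(media.name, "found" if media.exists() else "missing from folder")
```

If the function returns None, the @Media entry still has to be added; if the returned path does not exist, the sound file is not in the same folder as the transcription.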

4.8 Other ways to work with Sonic Mode

You can also use Sonic Mode to replay the linked sound and to change the linking; this is a good way to work. You can first make a rough linking and then use this method to adjust the previous links [6]!

Replay the sound from the waveform: You can replay the sound linked to an utterance by holding the Ctrl key and clicking on the bullet, or by clicking three times on the utterance.

Change the duration of a selection: You can change the duration of a selection by holding the Shift key and left-clicking on the point to which you want to extend the selection.

Scroll: At the bottom of the screen (under the waveform) there is a scroll bar that helps you move back and forth in the sound file.

Information on the time: Above the waveform there is information about the time in the sound file. This is indicated on a black line. If you click on it you get three pieces of information:
(a) The beginning and end of the current window in seconds. (W = window)
(b) The position of the cursor in hours:min:seconds:milliseconds. (C = cursor)
(c) The beginning and end of the current highlight in seconds.
If you click the black line once more you see information about the sampling of the sound file (e.g. 16 bits).

More information on how to link audio files with CHAT is found here: http://talkbank.org/da/linking.html

[6] Thanks to Jonas Granfeldt for this part of the guide!

5 The Dep-file

When CLAN is installed on a computer, one of the files that are installed is the so-called dep-file. From this file, CLAN gets information on, for example, allowed headers and comment tiers. There are a number of pre-installed settings in the dep-file, and as long as you keep to these there is no reason to change the file. But often you need to change or add things in the file. If you need to change the file, it is found in the CHILDES folder that was installed together with the program. On a PC it is found here (unless otherwise specified during the installation):

1. Look at C:\ (This computer/den här datorn). (CLAN is installed here unless you chose another place during installation.)
2. Open the folder CHILDES.
3. Open the folder CLAN.
4. Open the folder lib (library).
5. This should be the location of the dep file. It is called depfile.cut.
6. Double-click to open the file (in CLAN, or with any text editor).
7. It is a good idea to choose the alternative Save as... and save a copy, e.g. as depfileold.cut. In this way you have a backup if you make changes in the file.
8. In the depfile (i.e. the version called depfile.cut) you can add or change headers and/or comment tiers if you use these in your transcriptions.
9. Then save the file, and make sure that it still has the file extension .cut. It should also be located in CLAN's lib folder, so that the program can find it.

5.1 What can be changed in the dep file?

If you choose to add a line in the header that is not pre-defined, or if you want to create a coding tier of your own (i.e. a dependent tier beginning with %), you need to add this in the depfile. Unless you do this, your transcription won't pass the CHECK program, and it will be difficult to use many of CLAN's analysis programs.
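
The backup recommended in step 7 of section 5 can also be made with a short script before the depfile is edited. This is a minimal Python sketch, not something CLAN provides: the function name backup_depfile is hypothetical, and the folder passed to it (for example the lib folder under the CHILDES installation) is an assumption that depends on where CLAN was installed.

```python
import shutil
from pathlib import Path


def backup_depfile(lib_folder, backup_name="depfileold.cut"):
    """Copy depfile.cut to a backup file in the same folder and return its path."""
    source = Path(lib_folder) / "depfile.cut"
    target = Path(lib_folder) / backup_name
    if target.exists():
        # Refuse to overwrite an earlier backup by accident.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(source, target)  # copy2 also preserves the file's timestamps
    return target


# Example call (the path is an assumption; adjust it to your installation):
# backup_depfile(r"C:\CHILDES\CLAN\lib")
```

Keeping the original name depfile.cut for the edited version matters, since that is the file CLAN looks for in its lib folder.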