Handling multiply-annotated multimodal corpora

Jean
University of Edinburgh
CLARIN/FLARENET multimodal workshop, Nov 2009

Outline 1 Who am I? (AMI and the NITE XML Toolkit) 2 3

Background
- research on spoken dialogue and how small groups interact
- designed and developed the NITE XML Toolkit in response to the needs of distributed multi-disciplinary groups that share data
- consult for or support many types of data collection involving language plus something else (video annotation, eyetracking, dialogue system logs, ...)
- secondary contributor to TEI, ISO, and W3C standards efforts

AMI Corpus (100 hours)
- 4 close-view and 2 wide-view cameras
- 4 head-set and 8 array microphones
- presentation screen capture, whiteboard capture, pen devices
- plus extra site-dependent devices

AMI Annotations
- transcription with word-level timings from forced alignment
- timestamping against signal: head gestures; hand gestures for addressing and interactions with objects; location in room; gaze
- annotation above words: dialogue acts (some with addressing), named entities, topic segments, linked extractive and abstractive summaries, subjectivity
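
The relation between word-level timings and annotations timestamped against the signal can be sketched in a few lines of Python. This is only an illustration: the `Span` class, the function names, the head-nod event, and the timing values are all assumptions made for the example, not the AMI Corpus format or the NITE XML Toolkit API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A labelled stretch of signal time (hypothetical structure)."""
    label: str
    start: float  # seconds from the start of the recording
    end: float


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share any stretch of time."""
    return a.start < b.end and b.start < a.end


def words_during(words, event):
    """Labels of the words whose timing overlaps the given event."""
    return [w.label for w in words if overlaps(w, event)]


# Illustrative word timings, as forced alignment might produce them.
words = [
    Span("the", 47.48, 47.61),
    Span("government", 47.61, 47.96),
    Span("doesn't", 47.96, 48.18),
    Span("have", 48.18, 48.40),
]
# An invented gesture annotation timestamped directly against the signal.
nod = Span("head-nod", 47.90, 48.25)

print(words_during(words, nod))
```

Because both layers are anchored to the same clock, neither has to know about the other; the link between a gesture and the words spoken during it is computed when needed.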

The AMI Method
- Use hand-annotation to create automatic mark-up components
- Guess what features will help with new problems using hand-annotations
- Share automatic annotations so you'll know what will happen if you plug in someone else's technology
- Informed by all this, decide what new applications will work and build them

Personal goal: increase inter-disciplinary collaboration
- technologists often make dumb choices in their data collection and annotation
- computational linguistics is stuck in a rut - surely knowing something about the properties and structure of the data helps
- soft sciences can really benefit from some automation, and they need to save money on data collection even more than systems developers do

NITE XML Toolkit
- Open source toolkit for handling annotations with temporal ordering and full structural relations
- Data storage format designed to support distributed corpus development
- Libraries for data handling, query, and writing graphical user interfaces
- End user browsing and annotation tools for common tasks
- Command line utilities for analysis and feature extraction

Example interface (one of many)

Typical data

[Figure: annotation layers for the utterance "the government doesn't have to deal with it", plotted against time (47.0-49.0 s): syntax (S, NP, VP, PP, EDITED), dialogue act (statement), disfluency (reparandum, repair), movement (source, target), markables, kontrast, words with part-of-speech tags, syllables, phonwords with timings (e.g. "the" 47.48-47.61, "doesn't" 47.96-48.18), phones, and prosodic phrases and accents.]

Community support
- stand-off annotation using multiple files under version control
- dependency structure for keeping track of which annotations rely on which versions of which other annotations
- multiple competing annotations for the same thing (different humans for a reliability assessment, different automatic processes for a competition)
- logical query language - because this is the only way to analyse this kind of data
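
The ideas of stand-off annotation, competing annotations, and querying across layers can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the dictionaries stand in for separate annotation files, and the layer names, ids, the "assessment" label, and the helper functions are all hypothetical. It does not reproduce the NITE XML Toolkit storage format or its query language.

```python
# Base layer: words with ids. In a stand-off setup this would be its own file.
WORDS = {
    "w1": "the", "w2": "government", "w3": "doesn't", "w4": "have",
    "w5": "to", "w6": "deal", "w7": "with", "w8": "it",
}
ALL_WORDS = list(WORDS)  # ids in utterance order

# Stand-off layers point at word ids and never copy the text.
# Two annotators labelled the same span (competing annotations).
DIALOGUE_ACTS = {
    "annotator_a": [{"type": "statement", "words": ALL_WORDS}],
    "annotator_b": [{"type": "assessment", "words": ALL_WORDS}],
}
ENTITIES = [{"type": "organisation", "words": ["w1", "w2"]}]


def text_of(element):
    """Resolve an element's word pointers back to text."""
    return " ".join(WORDS[w] for w in element["words"])


def acts_containing(entity, acts):
    """Query: dialogue acts whose words include every word of the entity."""
    target = set(entity["words"])
    return [act for act in acts if target <= set(act["words"])]


def disagreements(acts_a, acts_b):
    """Spans that both annotators marked but labelled differently."""
    labels_a = {tuple(act["words"]): act["type"] for act in acts_a}
    found = []
    for act in acts_b:
        span = tuple(act["words"])
        if span in labels_a and labels_a[span] != act["type"]:
            found.append((text_of(act), labels_a[span], act["type"]))
    return found


entity = ENTITIES[0]
for act in acts_containing(entity, DIALOGUE_ACTS["annotator_a"]):
    print(f"{act['type']} contains {entity['type']}: {text_of(entity)}")
print(disagreements(DIALOGUE_ACTS["annotator_a"], DIALOGUE_ACTS["annotator_b"]))
```

Keeping each layer separate means an annotator can revise one layer without touching the others, which is what makes version control and per-layer dependency tracking practical.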

Software support is everything
"Build a better mousetrap, and the world will beat a path to your door." (Ralph Waldo Emerson)

It ain't so
- People will use a bad tool they already have, even if it's inappropriate
- The less computational the field, the harder it is to get people to install software

No one can afford to support every feature
- open data formats are essential
- interoperability reduces risk and gives people the ability to try wild and wonderful new things with their data
- for most users, you have to suggest data paths

People are reluctant to share data
- First person to do X gets a decent paper even if X is stupid or trivial; others get papers for beating the results
- Researchers always worry that someone will publish the second paper before they publish the first one
- Current situation is iniquitous - can't beat the results without the data, but only friends get it
- Not enough to find funding and infrastructure for data releases - need to convince people to let go

Individual researchers think short term
- Deadline driven, always rushed for time
- Data quality suffers unless they intend to release from the beginning
- Documentation (if it exists) usually suffers from nerdview

Tag set re-use sounds good, but often is bad
- New research often requires new theoretical advances and new ways of thinking about tags
- People just blindly apply a tag set without thinking about what they need
- Usable tags developed for one data set or use don't actually fit others

What's an annotation standard?
- Hopeless if not informed by past tag sets from all corners of the set of users you hope to attract
- Cleaned-up union of previously used tags is never usable
- Best option: describe the space that a tag set for X can occupy, and encourage documentation relating tags to this space

What would really make research better?
- an open data format that tools can import and export, containing all the representational richness that any data and tool might need
- coordination of enough tools developers to establish good data paths
- carrots for data sharing

Attracting users
- demonstration using a reformulation of some well-loved corpus
- give people something they don't have, as a lure
- Sometimes bad speech recognition is better than no transcription, but people are still doing without - hook up with webasr?
- unskilled forced alignment tool (only takes a bad ASR system and dictionary entries by a computer-literate language speaker)
- Speech spotter to find areas of speech

Encouraging data sharing
- Often easier to get researchers to agree in principle to data release at some specific time in the future than now, no matter how old the data is
- Pressure from funders, except it's too hard for them to get the conditions right
- As in American psychology, pressure from the professional association that controls the journals