The LENA TM Language Environment Analysis System:




The LENA™ Language Environment Analysis System: Audio Specifications of the DLP-0121

Michael Ford, Charles T. Baer, Dongxin Xu, Umit Yapanel, Sharmi Gray
LENA Foundation, Boulder, CO
LTR-03-2, September 2008
Software Version: V3.1.0
Copyright 2009, LENA Foundation, All Rights Reserved

Abstract

The LENA language environment analysis system was designed to estimate adult and key child interactions in natural home environments. Contrary to controlled clinical research environments, the speech used by the participants in this study was real, unrehearsed, and representative of each child's typical daily language environment. In this paper, we describe the Audio Processing System in terms of information flow, feature extraction, and segmentation identification. We also reveal the audio specifications that were either met or exceeded during the development and design of the LENA digital language processor (DLP).

Keywords: Audio specifications, feature extraction, segmentation, transcription, Digital Language Processor

1.0 Introduction

The LENA language environment analysis software V3.1.0 was developed to process and selectively filter audio and interference signals resulting from a natural data collection environment. The primary goals of the audio data processing are to estimate Adult Word Counts (AWC), Child Vocalizations (CV), and Conversational Turns (CT) between the adult and key child. Here, we describe the Audio Processing System in terms of information flow, feature extraction, and segmentation identification and detail the audio specifications that were either met or exceeded during the development and design of the LENA digital language processor (DLP).

2.0 LENA Processing Flow-Chart

The LENA Audio Processing System comprises four distinct components: information flow, information processing, algorithmic processing models, and professional human transcriptions (Figure 1).

[Figure 1. LENA Language Environmental Analysis Audio Processing System. Blocks shown: Child Voice & Environment Sound; LENA DLP; Feature Extraction; Segmentation and Segment ID; Child Speech Process; Adult Speech Process; Conversation Analysis & Turn Estimation; Statistical Models; Transcripts; ITS File; Reports & Display.]

Initially, an audio file containing recording data from a child's natural home language environment is stored in the DLP. The data are first processed in the DLP to minimize disk space and battery power consumption. The audio data on the DLP are transferred through a USB port onto a computer where the data are further processed and acoustic features are extracted. Various acoustic features are extracted for different purposes. Some features are primarily used for distinguishing speech signal from non-speech signal; others are used for child speech processing to distinguish child vocalization from other child sounds such as cries, vegetative sounds and fixed signals.

At the heart of the LENA system is the capability for the algorithmic models to segment and appropriately identify sounds of varying amplitude and intensity. Features extracted from the audio data were segmented through iterative modelling processes into eight categories that identify the source of the audio signal: the key child (wearing the LENA DLP); other child; adult male and adult female; overlapping sounds (at least one human); noise; electronic (e.g. television/radio) sounds; and silence. Based on the statistical fit of each segment to the selected model, the seven categories other than silence are further dichotomized into clear (i.e., high likelihood) and unclear or quiet/distant (i.e., low likelihood) sub-categories.

Professional audio transcriptions were used to train the audio processing models, and the algorithms utilized the models to identify a variety of segments from the audio signals accurately and reliably. For example, it was necessary for the speech processing algorithms to differentiate adult speech from child speech, and to differentiate the speech of the key child from the speech of other children or non-speech sounds (e.g. cries or vegetative sounds). Thus, algorithmic models were built and optimized using the professionally transcribed segmentations as a basis for accuracy.
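The category scheme described above (eight source categories, with every category except silence split into clear and unclear) can be written down as a small data structure. The following is a minimal sketch, not the LENA implementation: the class names, the normalized `fit` score, and the `CLEAR_THRESHOLD` value are all assumptions introduced for illustration, since the report does not state how the likelihood cut-off is chosen.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Source(Enum):
    """The eight source categories named in the text."""
    KEY_CHILD = "key child"
    OTHER_CHILD = "other child"
    ADULT_MALE = "adult male"
    ADULT_FEMALE = "adult female"
    OVERLAP = "overlapping sounds"
    NOISE = "noise"
    ELECTRONIC = "electronic sounds"
    SILENCE = "silence"


# Hypothetical cut-off on a normalized fit score; the report gives no value.
CLEAR_THRESHOLD = 0.5


@dataclass(frozen=True)
class Segment:
    start_s: float
    end_s: float
    source: Source
    fit: float  # hypothetical model-fit score in [0, 1]

    def label(self) -> str:
        # Silence is not split; the other seven categories are.
        if self.source is Source.SILENCE:
            return self.source.value
        quality = "clear" if self.fit >= CLEAR_THRESHOLD else "unclear"
        return f"{self.source.value} ({quality})"


def all_labels() -> List[str]:
    """Enumerate every label the scheme can produce."""
    labels = []
    for source in Source:
        if source is Source.SILENCE:
            labels.append(source.value)
        else:
            labels.extend(f"{source.value} ({q})" for q in ("clear", "unclear"))
    return labels
```

Under these assumptions the scheme yields fifteen distinct labels: two for each of the seven non-silence categories, plus silence.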
The accuracy and reliability of the LENA software V3.1.0 is described in LENA Foundation Technical Report LTR-05-2 and the transcription process in LTR-06-2.

After individual segments are identified, further processing generates key LENA data. Key child sound segments are analyzed through iterative processing to distinguish segments containing key child speech (including words, babbles, and pre-speech communicative sounds such as squeals, growls, or raspberries) from non-speech (including fixed signals and vegetative sounds) and to estimate the number and duration of vocalizations produced by the child. Adult sound segments are processed to estimate the number of adult words a child hears. Non-speech sounds such as coughing, vegetative sounds, etc., are filtered out and statistical models are used to estimate the number of words spoken in each adult segment. Refer to LENA Foundation Technical Reports LTR-04-2 and LTR-05-2 for information on the segmentation process and speech/non-speech classifications.

Statistical modeling is further used to detect Conversational Turns (CT), or back and forth alternation between the key child and an adult. For this purpose a conversation was defined as a contiguous region containing live human speech, separated from the next conversation by a pause region of at least five seconds duration which contains only non-live-human speech audio signals. CTs cannot cross conversation boundaries.

Results from the audio processing described above are written to the Interpreted Time Segments or ITS file, an XML-coded plain text compilation of every facet of data recorded and analyzed by the LENA software. Please see Technical Report LTR-04-2 for further information on the ITS file.

LENA software engineers continue to improve the algorithmic-based feature extraction and segmentation analyses. We intend to release upgraded versions of the software annually.

3.0 LENA System Audio Specification

The LENA System includes a Digital Language Processor (DLP) that was developed by hardware and software engineers at the LENA Foundation. Here, we describe the performance goals associated with the DLP, as well as hardware and operational performance.

3.1 Performance Goals

The LENA DLP is used for full-day recording sessions, for a maximum of 16 consecutive hours. Thus, the unit must be stable and maintain high levels of inter-recorder reliability, and the performance goals center on these two aspects of the design. LENA Foundation hardware engineers observed that signal level directly affected AWC. For example, if the signal variation was +/- 1 dB, a maximum of 4% variance was observed. However, if the signal variation was +/- 2 dB, the maximum variance observed was 18%.

In the example below, showing the signal variation of the current model DLP-0121, six DLP units were chosen at random to determine how well they recorded between two different passes. As revealed in Figure 2, the signal variation between passes was quite marginal for all DLP units tested.

[Figure 2. Signal reliability using six DLP units chosen at random. Recorded signal (dBFS, axis range -30.50 to -31.50) for a first pass and a second pass on each of Digital Language Processors 1 through 6.]

LENA Foundation hardware engineers sought to produce consistent inter-recorder sensitivity to minimize variation of report output from different DLP units. The target sensitivity was set to minimize variation (67 dBC SPL to -30 dBFS in the audio file). An additional performance goal was to achieve inter-recorder variation of no more than +/- 1 dB. Currently, inter-recorder (between unit) variation is less than +/- 0.5 dB and intra-recorder (within a single unit) variation is less than +/- 0.1 dB.

Additional performance goals included flat frequency response (+/- 1 dB, 100-4000 Hz), on/off axis linearity of sensitivity and frequency range, and low signal distortion. Finally, the unit was designed such that the recording was unaffected as the battery discharged to the lower operational limit.

The current DLP model DLP-0121 meets or exceeds US/Canada compliance standards. Standards for compliance that were either met or exceeded are shown in Table 1.
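The sensitivity target and tolerance figures above lend themselves to a short numerical illustration. The sketch below assumes a purely linear signal chain, so that the stated calibration point (67 dBC SPL recorded at -30 dBFS) implies a fixed offset between the two scales. It also assumes the dBFS convention of RMS level relative to full-scale amplitude, which is only one of several conventions in use. All function names are hypothetical, and the example readings in the test data are invented, not read from Figure 2.

```python
import math

# Calibration point stated in the text: 67 dBC SPL maps to -30 dBFS.
REF_SPL_DBC = 67.0
REF_DBFS = -30.0


def dbfs_to_spl(level_dbfs):
    """Convert a recorded level in dBFS to dBC SPL (assumes a fixed offset)."""
    return level_dbfs + (REF_SPL_DBC - REF_DBFS)


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of 16-bit PCM samples relative to full-scale amplitude."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / full_scale ** 2)


def within_tolerance(levels_dbfs, tol_db):
    """True if every level lies within +/- tol_db of the mean level."""
    mean = sum(levels_dbfs) / len(levels_dbfs)
    return all(abs(x - mean) <= tol_db for x in levels_dbfs)


def db_to_amplitude_ratio(db):
    """Amplitude ratio corresponding to a level difference in dB."""
    return 10.0 ** (db / 20.0)
```

For scale, a 1 dB level difference corresponds to an amplitude ratio of about 1.12, and a 0.5 dB difference to about 1.06, so the reported inter-recorder spread amounts to a few percent in signal amplitude.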

Table 1: Compliance standards met or exceeded by the LENA DLP (model DLP-0121).

Standard                     Description
UL 60065                     UL Standard for audio, video, and similar electronic apparatus - Safety requirements
CAN/CSA-C22.2 No. 60065      Canada Standard for audio, video, and similar electronic apparatus - Safety requirements
UL 696                       UL Standard for Safety - Electric Toys
EN 55022                     EU Standard for Information Technology - Radio disturbance characteristics

UL: Underwriters Laboratories Inc; EU: European Union

3.2 Hardware

Audio data were collected using an omnidirectional microphone with a flat 20 Hz-20 kHz frequency response. Extreme frequencies were suppressed, as they were unlikely to contain human speech activity. Low frequency data were suppressed through a 70 Hz high-pass filter. Digital data were recorded using a 10 kHz low-pass filter to suppress high-frequency sounds. Frequencies were recorded using a 16 kHz 16-bit sigma-delta analog-to-digital converter (ADC) with 8x over-sampling digital interpolation. Initially, audio data were written to 512 MB flash memory using a 4:1 Adaptive Differential Pulse Code Modulation compression scheme (DVI-4 ADPCM). The flash memory uses an internal error correcting code (ECC) for data storage and recovery. Complete discharge of the battery will not result in loss of audio data. Data were uploaded to a host computer through a USB 2.0 high-speed port with a sustained audio transfer rate to host of approximately 4 MB/sec (~ 2.5 minutes per 16 hours of audio). Once uploaded, the data were decompressed to the PCM audio format with one 16-bit channel at a 16 kHz sample rate.

The DLP-0121 unit peak operating power is 50 mW. A primary 450 mAh battery provides a minimum of 30 hours of recording when new. The recording is safely discontinued when battery power is depleted. The DLP contains a real-time clock (RTC) for time-stamping recordings, as well as providing a time base for built-in ADC sample rate calibration. The unit comes equipped with dedicated real-time clock battery power for a life of approximately 5 years.
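The storage and transfer figures in this section can be cross-checked with simple arithmetic. The sketch below is a rough estimate only: it assumes decimal megabytes (10^6 bytes), an exact 4:1 compression ratio, and no file-system or protocol overhead, none of which the report specifies. The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 16_000     # from the text
BITS_PER_SAMPLE = 16        # from the text
COMPRESSION_RATIO = 4       # 4:1 DVI-4 ADPCM, from the text
SESSION_HOURS = 16          # maximum recording session, from the text
FLASH_MB = 512              # from the text
TRANSFER_MB_PER_S = 4.0     # approximate sustained rate, from the text
BATTERY_MAH = 450           # from the text
MIN_RECORDING_HOURS = 30    # from the text

MB = 1_000_000  # assumption: decimal megabytes


def pcm_bytes_per_second():
    """Uncompressed data rate for one 16-bit channel at 16 kHz."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE // 8


def session_size_mb(hours=SESSION_HOURS):
    """Compressed size of a recording session, ignoring any overhead."""
    pcm_bytes = pcm_bytes_per_second() * hours * 3600
    return pcm_bytes / COMPRESSION_RATIO / MB


def upload_minutes(size_mb):
    """Idealized upload time at the quoted sustained transfer rate."""
    return size_mb / TRANSFER_MB_PER_S / 60


def max_average_current_ma():
    """Upper bound on average current implied by capacity and run time."""
    return BATTERY_MAH / MIN_RECORDING_HOURS
```

Under these assumptions a 16-hour session compresses to roughly 461 MB, which fits in the 512 MB flash memory. The idealized upload time comes out just under two minutes, the same order as the approximate 2.5 minutes quoted above; the gap may reflect overhead that this sketch ignores. The battery figures imply an average current draw of at most 15 mA.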

3.3 Simple Operation

The LENA DLP was designed for usability. It is equipped with a power switch and a test button (Figure 3). A visual feedback mechanism allows the user to easily identify when the unit is sleeping or recording, as well as the battery status. The unit easily attaches to LENA-designed clothing in a protective pocket that snaps shut. The DLP is compact (3-3/8" x 2-3/16" x 1/2") and of minimal weight (< 2 oz) in relation to children, thus minimizing the distraction associated with the presence of the recorder.

[Figure 3. The LENA digital language processor (DLP-0121), actual size.]

4.0 Conclusion

We have described the four components of the audio processing system: information flow, processing, statistical modeling, and transcriptions used for model training. Features extracted from the audio are segmented through iterative modeling processes into categorical components including male and female adult, key child, other child, overlapping speech, noise, electronic noise, and silence. Key child segments are further segmented into speech/non-speech, and adult segments are processed to generate AWC estimates. Adult-child alternations are processed into CT estimates. The LENA DLP is simple to operate and the inter-unit signal variation is low, as assessed by test-retest reliability. The DLP model DLP-0121 has either met or exceeded US/Canada compliance standards.
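As an illustration of the conversation rule given in Section 2.0 (regions of live human speech separated by pauses of at least five seconds, with turns confined to a single conversation), the following sketch groups timed utterances into conversations and counts alternations. It is not the LENA algorithm, which relies on statistical modeling. The tuple layout, the speaker labels, and the counting rule (one turn per change of speaker between key child and adult, ignoring other speakers) are assumptions made for this example.

```python
from typing import List, Tuple

PAUSE_S = 5.0  # minimum pause between conversations, from the text

# (start_s, end_s, speaker); speaker is "child" (key child), "adult", or "other".
Utterance = Tuple[float, float, str]


def split_conversations(utterances: List[Utterance]) -> List[List[Utterance]]:
    """Group live-speech utterances into conversations separated by pauses."""
    conversations = []
    current = []
    last_end = 0.0
    for start, end, speaker in sorted(utterances):
        # A gap of at least PAUSE_S since the last speech closes the conversation.
        if current and start - last_end >= PAUSE_S:
            conversations.append(current)
            current = []
        last_end = end if not current else max(last_end, end)
        current.append((start, end, speaker))
    if current:
        conversations.append(current)
    return conversations


def count_turns(conversation: List[Utterance]) -> int:
    """Count key-child/adult alternations inside one conversation (assumed rule)."""
    speakers = [s for _, _, s in conversation if s in ("child", "adult")]
    return sum(1 for a, b in zip(speakers, speakers[1:]) if a != b)


# Hypothetical utterance timings, invented for illustration.
example = [
    (0.0, 2.0, "adult"), (3.0, 4.0, "child"), (5.0, 6.0, "adult"),
    (20.0, 21.0, "child"), (22.0, 23.0, "child"), (24.0, 25.0, "adult"),
]
conversations = split_conversations(example)
turns = [count_turns(c) for c in conversations]
```

In this invented example the 14-second gap after the third utterance splits the data into two conversations, so the adult utterance ending at 6 s and the child utterance starting at 20 s do not form a turn.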