A user friendly toolbox for exploratory data analysis of underwater sound




061215-064

Fernando J. Pires¹ and Victor Lobo², Member, IEEE

Abstract: The underwater acoustics research group at the Portuguese Naval Academy has been working for some years on unsupervised classification of underwater sound [1-3]. Some software was built on the techniques developed by this group, but adapting the research-phase software to the operational programs (which must be operated by less skilled personnel) proved difficult. To make the transition smoother, it was decided to use MATLAB both during the research phase and as the development system for the operational program. By imposing some compliance rules during the exploratory phase, the same code can be seamlessly integrated into the operational program. Furthermore, the resulting software constitutes a signal processing and classification toolbox that can be used by researchers or end-users with little effort. The toolbox is publicly available, and a tutorial was written to ease the learning curve. In this paper we present the toolbox, showing its intended uses, available routines, and end-user programs.

Index Terms: Underwater acoustics, exploratory data analysis, Self-Organizing Maps, spectrograms

I. INTRODUCTION

The underwater acoustics research group at the Portuguese Naval Academy has been working for some years on unsupervised classification of underwater sound [1-3] using Kohonen's Self-Organizing Maps (SOM) [4]. Thanks to the strong link to the Navy's submarine squadron, an operator-friendly program incorporating many of our developments was produced in 1998 [5]. This program was written in C and compiled with Borland C++ 3.0, using a number of dedicated libraries. Although it was a very powerful tool at the time, it has proved hard to maintain for the research team, which tests most new ideas in MATLAB.
The MATLAB environment has considerable advantages for researchers but, if care is not taken, it leads to fragmented code that is hard to maintain and that normally requires considerable effort to port to the user version of the program, in C. Over the last year, we have overhauled the code and produced a streamlined MATLAB toolbox with all the routines necessary for our exploratory data analysis. These routines are publicly available, with comments and a small tutorial that we hope will make them accessible to any researcher or graduate student. We also produced a user-friendly program, with a graphical user interface, that uses these routines to implement an operational data-analysis application that can be used aboard naval vessels by sonar operators. Using the same routines for research and for the operational program greatly shortens the time-to-market, reduces costs, and drastically reduces the number of bugs in the final version.

The current version of the toolbox (and the accompanying user-friendly program) includes:
- Real-time visualization of the power spectra, using short-time Fourier transforms.
- Real-time visualization of a SOM mapping of the sound (using SOMTOOLBOX for MATLAB [6]).
- Training of a standard and a modified SOM (using SOMTOOLBOX for MATLAB).
- Feature extraction procedures.
- Supervised classification of sound, using a pruned SOM [7].

Other features are still under development, such as near-real-time visualization of spectra using time-frequency distributions [8]. In this paper, the program is presented with an example of its use on data recorded at an acoustic test tank.

¹ Fernando J. Pires (EE, MScEE, Eng.) is a Commander of the Portuguese Navy currently serving as department chair at the Portuguese Naval Academy.
² Victor Lobo (PhD, MSc, Eng.) is currently a civilian professor at the Portuguese Naval Academy.

II. GENERAL DESCRIPTION

This toolbox is aimed at researchers and end-users with an interest in capturing, analyzing, and classifying sounds with a common PC and a soundcard. If available, signal acquisition hardware may also be used. The sound source may be a sonar system, a simple hobbyist hydrophone such as Dolphin Ear, or even pre-recorded sound files. Although intended for analyzing and classifying underwater sound, the system may be used for any type of sound, and it is used in classrooms as an example of a signal processing and classification system.

Usually it will be necessary to first record different sounds and save them to disk. These sounds may be saved in several formats, namely raw acquisition data (.DAQ files), Windows sound files (.WAV files), or processed feature vectors (in MATLAB's .MAT files). During this recording, the user may monitor the signal with a spectrogram visualization tool.

The second step will usually involve the creation of a Self-Organizing Map [4] that will later be used to automatically classify the source of new sounds [5]. To construct this map, one must use the recorded sound files or the processed feature vectors. In either case, a series of pre-processing techniques is automatically applied to the data.

Finally, the trained SOM will be used to continuously classify the sound that is being received (either from real-time capture via a soundcard or from a pre-recorded file).

This toolbox is composed of three main graphical user interfaces (GUIs) that constitute the front end to the individual MATLAB m-files that perform more specific tasks (see Fig. 1). The first project in this area was named CEHIA, the Portuguese initials for "Classification of hydrophonic signals using artificial intelligence", and thus most of our programs use these initials in their names.

Fig. 1. The main end-user programs and their interaction. [Block diagram: CEHIA (record/read data, visualize spectra, visualize classification), CEHIA_FTOOLS (convert files, build feature vectors) and CEHIA_BUILDMAPS (build SOMs), linked by .wav sound files, .daq log files, .mat feature vectors and .mat classification maps.]

The contents, organization and user interaction are described in the following section.

III. END USER PROGRAMS

The end-user programs can be compiled into stand-alone applications and have a graphical user interface, so that they may be used by people with no programming experience. Most users will only interact with the main CEHIA application, since it is with this program that real-time classification and spectra visualization are performed. The other programs are only needed to build new classification maps. The three main GUIs are:
- CEHIA: front end for operational use.
- CEHIA_FTOOLS: front end for the file conversion tools.
- CEHIA_BUILDMAPS: front end for training SOMs.

A. CEHIA

This is the main interface for the operator, as shown in Fig. 2. The configuration and control of the system are performed from this central location.

Fig. 2. CEHIA main window.

Fig. 3. CEHIA spectrogram visualization window (in top view mode).

The upper part of the panel has a "SOM file" field, where the user can check which map is selected. The selection and configuration of this component are done via the menu "Configure SOM". The next field, named "current class", is continuously updated to show the estimated class of the sound that is being received. This class is obtained with the pruned classifier; for a more comprehensive classification, the SOM visualization window should be used.

The middle left section groups all the capture parameters, which are to be set before starting a recording session (or before playback of pre-recorded sessions). The bottom panel is used to select the source of the sounds (either real-time capture via the PC soundcard or a pre-recorded log) and the name of the file to be used for recording or reading.

Finally, the two large buttons in the right section are used to start and stop the operation, which works either on real-time data or on a log file recorded in a previous session.

When in capture or playback mode, the spectrogram of the received signal is shown in a separate window (see Fig. 3). A visualization of the SOM, where a blinking yellow circle identifies the signal that is being received, is also presented to the user, as shown in Fig. 4.

Fig. 4. CEHIA SOM visualization window. The colored areas correspond to known sounds, and the yellow circle identifies the sound currently being captured as belonging to a certain group (in this case, dark blue).

B. CEHIA_FTOOLS

This GUI (see Fig. 5) is used to convert between the three formats used in this toolbox:
- Data acquisition log files (.DAQ files).
- Windows sound files (.WAV files).
- Feature vector matrix files (.MAT files).

The main CEHIA tool uses log files both for capture logging and for file reading, while the SOM utilities use feature vector files. The .WAV format is convenient for playing the sounds via the PC's sound system or for extracting feature vectors from sounds recorded in this format.

The upper section is used to identify the files used as original source and final destination, and to supply parameters for vector processing. In the middle panel the user selects which conversion is to be performed. The bottom buttons are used to perform the selected conversion and to close the tool.

Fig. 5. CEHIA_FTOOLS interface.

C. CEHIA_BUILDMAPS (SOM_buildmaps)

This GUI is used to build classification maps, which are in fact Self-Organizing Maps [4]. The user starts by selecting files that contain feature vectors of different sounds. The user is then prompted to give a label (i.e. a name) to the source of that data, and to select the color with which it will be represented. The main window shows which data files have been included and how many feature vectors were available in each.

The next step consists of specifying the training parameters for the SOM. Some default values are given, but the user may change them. After training, the system shows the U-Matrix obtained [9] and allows both automatic and manual labeling of the SOM units. Finally, the user may prune the SOM and/or select only certain SOM units to form a fast prototype-based classifier. This prototype-based classifier can then be used by CEHIA for a quick classification of the sound being received.

IV. MATLAB ROUTINES

A number of MATLAB routines are made available to the expert user or researcher. These routines are used by the main tools and can be used to perform specific tasks outside this toolbox, or modified as needed. Each m-file corresponds to a MATLAB function that can be called from an external program, and includes all the internal (private) functions that should not be accessed directly from outside the main function. Some global variables are needed to store information used by functions executed under the timer's control, or to pass handle information that would otherwise be lost.

A. Ceh_IO

This function aggregates all the low-level input/output interfacing with the source of the data used by the main CEHIA tool. The source can be either the sound card of the PC or a *.DAQ file recorded in a previous session.

When capturing from the sound card, the function initializes the device and sets all the appropriate parameters for the session. All captured data is recorded in a log file, at a rate set by the AQ.sr (Sampling Rate) parameter. A timer is set to take regularly spaced sample vectors from this capture stream by means of a call to the function ceh_rv. The interval between vector samples is set by the parameter AQ.ri (Refresh Interval). When reading from a file, the process is essentially the same: a timer is used to set a regular capture that mimics the behavior of the capture process.

B. Ceh_RV

This function reads a new time vector from a specified source (sound card or file) and builds the corresponding feature vector. A global variable, AQ, stores all the parameters for the session and has to be properly initialized beforehand (in normal operation this is done by ceh_io). In regular use this function is called by a timer at regular intervals corresponding to the Refresh Interval parameter. It makes use of the routine ceh_tv2fv, which takes care of converting a time vector into a feature vector.

C. Ceh_tv2fv

This function converts a time vector into a feature vector. A feature vector is a frequency-space representation of the signal that originated the time vector. The transformation is done by first dividing the time vector into a number of windows, as set by AQ.nw. These windows overlap by 50%, which is a commonly used value. For each window an FFT is taken, yielding its complex spectral content. Since we are only interested in the positive-frequency part of the spectrum, the upper half of the vector is not relevant. Additionally, there is the option of retaining only part of the available frequencies (the lowest few), by downsampling the frequency vector as set by the AQ.ds (Downsampling Factor) parameter³.
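
The conversion described for ceh_tv2fv can be illustrated with a minimal sketch. It is written in Python rather than MATLAB so that it runs with the standard library alone, and it is not the toolbox's code. Several points are assumptions made for illustration: the window length is derived from the time vector length and the number of windows, the per-window power spectra are averaged into a single feature vector, the downsampling factor is read as "keep the lowest 1/ds of the positive-frequency bins", and a direct DFT stands in for the FFT (same result, slower). The names `power_spectrum` and `time_to_feature_vector` are hypothetical.

```python
import cmath
import math


def power_spectrum(window, nbins):
    """Power of the first nbins DFT bins (lowest frequencies) of one window."""
    n = len(window)
    powers = []
    for k in range(nbins):
        # Direct DFT of bin k; an FFT would give the same value faster.
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(window))
        powers.append(abs(acc) ** 2)
    return powers


def time_to_feature_vector(tv, nw, ds):
    """Average power spectrum over nw half-overlapping windows.

    Assumption: only the lowest 1/ds of the positive-frequency bins are kept.
    """
    # With 50% overlap, nw windows of length wlen span (nw + 1) * wlen / 2 samples.
    hop = len(tv) // (nw + 1)
    wlen = 2 * hop
    nf = (wlen // 2) // ds          # number of features retained
    fv = [0.0] * nf
    for w in range(nw):
        segment = tv[w * hop: w * hop + wlen]
        for k, p in enumerate(power_spectrum(segment, nf)):
            fv[k] += p / nw         # assumption: windows are averaged
    return fv
```

As a usage example under these assumptions, a 256-sample time vector sampled at 8 kHz with nw = 3 gives 128-sample windows, that is, 62.5 Hz per bin; with ds = 2 the feature vector keeps 32 bins covering frequencies below 2 kHz, and a 1 kHz tone peaks at index 16.
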
The result is a feature vector of size AQ.nf (Number of Features), representing the power content of the first frequency components of the signal.

³ This downsampling operation is needed when we are only interested in retaining frequencies up to a value lower than the Nyquist frequency that results from the sampling process. Since sound cards and capture devices in general are limited to specific sampling frequencies, one has to use this technique to obtain the desired frequencies while retaining the number of features.

D. Ceh_daq2fvm

This function performs the conversion of a time vector into a feature vector. A feature vector is a frequency-space representation of the signal that originated the time vector.

E. Ceh_inspectMap

This function allows the user to select a unit of the SOM, using the mouse, and to inspect and change its characteristics.

F. Ceh_showFV

This function maps a new feature vector onto a SOM, without using the CEHIA environment variables.

G. Ceh_VIS_SOM

This function shows the SOM and updates it by mapping a new feature vector.

H. Ceh_VIS_SPEC

This function shows an updated spectrogram of the sound that is being received.

V. TUTORIAL

A detailed tutorial is available at the project's website, but a short overview is given here. We propose two different problems. One uses pre-recorded sound that was actually obtained with a hydrophone. The other uses sound from a DTMF telephone, so that the user may test it in real time in any ordinary office.

For the first problem, the site has .WAV files containing recordings of the sounds produced by 4 different outboard motors, recorded in an acoustic tank. To analyze this data we need to perform the following steps:
1- Use some recordings of each motor to obtain feature vectors with the CEHIA_FTOOLS program.
2- Use the feature vector files to obtain a SOM with CEHIA_BUILDMAPS.
3- Using the CEHIA program, play the sound files and see whether each sound is correctly classified by the SOM.

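
Step 3 relies on mapping each incoming feature vector onto the trained map. The following is a minimal sketch of that lookup, not the toolbox's implementation (which builds on SOMTOOLBOX for MATLAB). It assumes the labeled SOM is available as a list of prototype (codebook) vectors with one label per unit, and that the best-matching unit is the one at the smallest Euclidean distance, as is usual for SOMs. The function names, labels and numbers are hypothetical.

```python
import math


def best_matching_unit(codebook, fv):
    """Index of the prototype vector closest (Euclidean) to the feature vector."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], fv))


def classify(codebook, labels, fv):
    """Label of the best-matching unit for the given feature vector."""
    return labels[best_matching_unit(codebook, fv)]


# Illustrative 3-feature prototypes; real feature vectors are power spectra.
codebook = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
labels = ["motor A", "motor B", "motor C"]
```

For instance, `classify(codebook, labels, [0.1, 0.8, 0.2])` selects the second prototype and returns "motor B". A pruned or prototype-based classifier of the kind described for CEHIA_BUILDMAPS would use the same lookup over a reduced set of units.
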
For the second problem, the user records training data from a common telephone (one that supports DTMF), and then uses the CEHIA program (and a microphone) to identify which key is being pressed. To do this the user must perform the following steps:
1- Use CEHIA to record the sound produced by different keys.
2- Use those recordings to obtain feature vectors with the CEHIA_FTOOLS program.
3- Use the feature vector files to obtain a SOM with CEHIA_BUILDMAPS.
4- Using the CEHIA program and the SOM trained in the previous step, press different keys on the telephone, and observe how the program identifies which key is being pressed.
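
When no telephone is at hand, test signals for this second problem could be synthesized instead. The sketch below is an illustration and not part of the toolbox: it generates the two-tone signal of a DTMF key from the standard DTMF row and column frequencies. The sampling rate, duration, amplitude and all names are assumptions chosen for the example.

```python
import math

# Standard DTMF frequencies (Hz) for the 12-key telephone keypad.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")


def dtmf_frequencies(key):
    """Return the (low, high) tone pair, in Hz, for a single keypad key."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for row_index, row in enumerate(KEYPAD):
        col_index = row.find(key)
        if col_index >= 0:
            return ROW_HZ[row_index], COL_HZ[col_index]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_tone(key, sr=8000, duration=0.2):
    """Samples of the key's two-tone signal, with amplitude within [-1, 1]."""
    low, high = dtmf_frequencies(key)
    n = round(sr * duration)
    return [
        0.5 * (math.sin(2 * math.pi * low * t / sr)
               + math.sin(2 * math.pi * high * t / sr))
        for t in range(n)
    ]
```

For example, `dtmf_frequencies("5")` returns `(770, 1336)`. Because every key is a distinct pair of tones, the power-spectrum feature vectors of different keys differ in which two bins carry the energy, which is what makes this a convenient test case for the SOM.
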

VI. CONCLUSIONS

A short description was given of the MATLAB toolbox and end-user programs developed at the Portuguese Naval Academy by the underwater acoustics group. By using the MATLAB programming language both for research and for production of the final version, the development cycle was shortened considerably, and it became possible to give end-users the latest improvements of the research team. The toolbox, along with the tutorial, is available at www.isegi.unl.pt/docentes/vlobo/projectos/cehia/cehia.htm.

ACKNOWLEDGMENTS

The authors thank the Portuguese Science and Technology Foundation (FCT), which has sponsored the group with the research grant POCI/MAR/61190/2004, as well as the students who helped develop some of the software presented, and the Naval Academy's command.

REFERENCES

[1] V. Lobo and F. Moura-Pires, "Ship noise classification using Kohonen Networks," presented at EANN 95, Helsinki, Finland, 1995.
[2] P. M. Oliveira, V. J. Lobo, V. Barroso, and F. Moura-Pires, "Detection and Classification of Underwater Transients with Data Driven Methods Based on Time-Frequency Distributions and Non-Parametric Classifiers," presented at MTS/IEEE Oceans'02, Biloxi, Mississippi, USA, 2002.
[3] V. Lobo, N. Bandeira, and F. Moura-Pires, "Distributed Kohonen networks for Passive Sonar Based Classification," presented at FUSION 98, Las Vegas, NV, USA, 1998.
[4] T. Kohonen, Self-Organizing Maps, 3rd ed. Berlin-Heidelberg: Springer, 2001.
[5] V. Lobo, N. Bandeira, and F. Moura-Pires, "Ship Recognition using Distributed Self Organizing Maps," presented at EANN 98, Gibraltar, 1998.
[6] J. Vesanto, J. Himberg, E. Alhoniemi, and J. Parhankangas, "SOM Toolbox for Matlab 5," Helsinki University of Technology, Espoo, A57, April 2000.
[7] V. Lobo, R. Swiniarski, and F. Moura-Pires, "Pruning a classifier based on a Self-Organizing Map using Boolean function formalization," presented at WCCI - World Conference on Computational Intelligence, Anchorage, Alaska, USA, 1998.
[8] P. M. d. Oliveira and V. Barroso, "On the concept of instantaneous frequency," presented at International Conference on Acoustics, Speech and Signal Processing ICASSP98, Washington, 1998.
[9] A. Ultsch and H. P. Siemon, "Exploratory Data Analysis Using Kohonen Networks on Transputers," Department of Computer Science, University of Dortmund, FRG, Bericht Nr. 329, December 1989.