SOFTWARE AND HARDWARE SOUND ANALYSIS TOOLS FOR FIELD WORK. Pavan G. '' ^ Manghi M. \ Fossati C.'' ^ Centra Interisciplinare i Bioacustica e Ricerche Ambientali, Universita i Pavia, Via Taramelli 24, 27100 Pavia, Italy. Email qpavan(5).cibra.unipv.it ^ Dept. of Urban Science, luav, Venice, Italy 1. ABSTRACT The latest version of the real-time Digital Signal Processing Workstation evelope at CIBRA runs in a stanar Winows environment an can use a wie range of soun acquisition evices. It can be base on a notebook to allow on-fiel use. Depening on the acquisition evices, recoring, analysis an isplay can be performe in real-time up to 500 ksamples/sec to provie useful banwith to more than 200 khz. The software was primarily evelope for continuous real-time monitoring in bioacoustical stuies. KEYWORDS: soun analysis, igital signal processing, spectrogram, real-time 2. INTRODUCTION The evelopment of DSP (Digital Signal Processing) techniques an of afforable high-spee computer harware with large storage capacity has mae the computer analysis an recoring of bioacoustical signals an every-ay tool for ethological research an for monitoring the unerwater environment. Since igital techniques appeare an replace arialog systems in the recoring an analysis of animal vocalizations, the avantages have not only been in the change of recoring meia but mostly in the evelopment of completely new methoologies an strategies for analysis, interpretation an istribution of ata. The exponential increase in spee an storage capabilities an the rop of costs were less preictable. The sum of these two factors makes it possible to have an entire bioacoustic laboratory in an afforable notebook, reay to be move everywhere for recoring an "seeing" infrasouns, souns an ultrasouns. Since 1980 the University of Pavia evelops an applies igital techniques to the stuy of animal souns. The latest version of the CIBRA's real-time Digital Signal Processing Workstation (DSPW) meets most of the nees for soun recoring an analysis in the fiel as well as in the lab. Base on Winows, it elivers ease of use an full compatibility with stanar soun evices an soun eiting software. This makes easier the istribution of recorings among scientists an stuents. 2.1 HISTORICAL BACKGROUND The evelopment of a igital soun analysis system suite for bioacoustic research began at the University of Pavia in 1980. In those ays the analysis of 1 secon of soun require more than 40 minutes of computing, but the igital approach anyway prove to be a winning solution for its flexibility. The first real-time portable system for fiel use was evelope in 1990 by integrating a bulky portable PC (16MHz 386 CPU) an a DSP base signal acquisition boar. The software was evelope in DOS an it was mainly base on highly optimize assembler routines. It was use for Proc. I.O.A Vol 24 (This Footer space shoul remain blank - reserve for use by the I.O.A.) PiSTRlBUTIOM STATEMENT A Approve for Public Release Distribution Unlimite 20031117 151
Pavan et al. Soun Analysis Tools the first time in a cruise in 1991 an then presente at the XV IBAC meeting [1] an at the ECS meeting in Sanremo [2]. The continuous increase in CPU spee, in the following years, mae the soun analysis faster an faster. The Intel 486/66 was the first CPU able to process in real-time up to 48000 samples/sec without expensive an har to program DSP processors. Nowaays, a Pentium III 500 requires less than 100 msec to isplay a spectrogram of a 1 secon signal sample at 48000 samples/sec thus making it possible to theoretically work in real-time up to 480 K samples/sec. Nevertheless, the CPU spee is one only of the factors affecting the performance of signal processing systems. Soun acquisition evices, bus spee, CPU spee, har isk spee an other harware an software relate factors affect the sustaine throughput of a system an its suitability for real-time application. But what oes real-time mean? In a general sense, a real-time analysis system is a system able to prouce a result without appreciable elay. A more accurate efinition, applie to acoustic systems, implies that the signal must be processe continuously, without gaps (without loosing ata), an without appreciable elay. In acoustic analysis, a real time system shoul show the features of an acoustic signal continuously without loosing even short transients. Also, the isplay of features shoul happen simultaneously with the signal, or at least with a very short - but constant - elay. 3. SYSTEM DESCRIPTION 3.1 HARDWARE PLATFORM The DSPW has been evelope to provie acoustic researchers a compact, low cost, flexible an expanable personal system. The workstation is base on the Intel x86 platform an conforms to PC an Winows stanars (earlier versions were evelope in DOS [1-6]). It can be base either on esktop or notebook PCs. The software is highly optimize to exploit the processing capabilities of Intel Pentium processors an to perform smoothly in real-time. 3.2 SOFTWARE The Winows software has been evelope in 1997 to match CIBRA's research nees an it is continuously upate to meet research requirements an to take avantage of new harware. In 1999 new functions were evelope to meet the requirements of a research project grante by the Office of Naval Research. The software package inclues analysis, recoring an isplay tools. Real-time spectrogram isplay, real-time cepstrogram isplay, wrap-aroun or scrolling isplay, wie control on all analysis parameters, frequency-time cursor while in real-time moe, frequency zoom capabilities, frequency tracking, recoring an playback of files with real-time isplay, scheule recoring, onevent recoring, file analysis are all implemente. Soun files an spectrograms can be save in stanar formats to allow further processing an managing with other software. As the software has been evelope primarily for analysis an isplay, eiting capabilities are not implemente. These features can be easily foun in a number of shareware an freeware. 3.3 SOUND ACQUISITION To provie recoring quality often better than DAT recorers many high quality soun acquisition boars are on the market. Several low-cost AD/DA boars are available for musical application an most of them can be use for soun acquisition. Nevertheless, not all of this equipment has enough acoustic quality or accuracy to be use in high quality recoring an soun analysis.
Pavan et al. Soun Analysis Tools The boar shoul be accurately selecte to match the nees of the research an the quality level of the whole electroacoustic chain use. Choosing boars with both analog an igital I/O is recommene. Digital I/O is normally available in stanars: electrical SPDIF on RCA connectors, electrical AES/EBU on XLR connectors an optical on TosLink connectors. Professional boars offer all three stanars with the ability to convert one format to another, to work as a igital switching boar, an to change SCMS protection coes. As the software relies on the stanar Winows multimeia interface, the software can work on almost any soun boar on the market incluing igital I/O boars, 96 khz soun boars, the still rare 192 khz boars an the USB auio evices, unfortunately still limite to 48 khz sampling. Almost all quality boars have 24 bit AD converters proviing up to 110-115 B of ynamic range, less than the theoretical limit of 144 B because of the limitations of the analog front ens. Just a few external high-quality expensive converters can provie more than 120 B of SNR. The soun analysis software is currently limite to 16 bit signals, proviing "only" 96 B of ynamic range, well enough to work on a wie range of signals. Future version will overcome this limitation. In the meanwhile, it is possible to rea 32 bit integer raw ata files. Multiple soun evices are supporte to easily choose among the installe I/O options. Multiple program instances are allowe provie that each is using a ifferent soun evice (two programs can't access the same soun evice as pipelining is not allowe in stanar Winows rivers). Normally, high quality boars have both analog an igital I/O which can be use as inepenent evices. As the software allows inipenent selection of ifferent evices for input an output, it is, for instance, possible to recor from igital inputs an to play to analog outputs of the same boar or of a ifferent boar. A great effort was put in testing a number of ifferent evices an their combination to be sure of getting optimal performances on a wie range of platforms. 3.4 ULTRASONIC EXTENSION Sounboars sampling at 96 khz usually offer a 40-44 khz banwith, while 192 khz boars may provie up to 80-90 khz banwith. If connecte to suitable transucers, these boars may offer new chances to reveal an stuy ultrasouns. To cover the whole frequency range of both terrestrial an acquatic organisms, of those performing echolocation in particular, it is however require to exten the recoring range up to 150 khz at least. For this purpose more expensive, an often more noisy, soun acquisition systems are require. The ultrasonic extension of our software is tailore to use National Instruments DAQ cars, proviing sampling frequencies of more than 1 MHz with 12 bit accuracy. This narrows the theoretical ynamic range to 72 B, not a great range, anyway higher than the range of analog instrumentation tape recorers. Ml DAQ cars are available for PCI, PCMCIA (CarBus) an FireWire (IEEE 1394) computer bus. For esktop PCs, PCI cars are relatively cheap an perform well. For portable use on notebooks, PCMCIA CarBus boars are goo because of their compact size but they are slower than the FireWire base acquisition systems, which are external evices larger in size an more expensive. On a Pill 500MHz notebook it is possible to acquire an analyze at 320 ksamples/sec (single channel) with a PCMCIA car an at about 500 ksamples/sec with a FireWire evice. If recoring to isk without isplaying, these rates can be increase by 50%. (These rates cannot be guarantee for every Pill 500 moel). 3.5 RECORDING Soun recoring can be performe in two moes: RAM recoring, which is limite by the available amount of RAM memory, an Har Disk recoring.
Pavan et al. Soun Analysis Tools In RAM recoring moe souns are recore in a buffer whose size is set by the user accoring to the available free memory. The buffer contents can be save to isk when the acquisition is stoppe or when the buffer is completely fille. A further moe allows to set the buffer as an enless circular buffer; when the user stops the acquisition the buffer hols the last n minutes of auio an the user can save them to isk. To support long term monitoring an recoring activities, the software has been improve to allow continuous operation for ays. While isplaying souns, the software can continuously recor to isk. To overcome the 2GB file size limit of Winows operating systems, the program creates a new file every hour (file length is anyway user selectable) an can use available isks in sequence. A buffering system allows recoring continuously without loss of ata; if something goes wrong an a ata block is lost a warning an a log are generate. Each file is store with start ate an time in the filename an a log of all recorings is generate automatically; if a GPS is connecte to the DSPW, location ata is ae to filenames an log file. Recoring length is only limite by the available storage space. Storage space is normally limite by harware constraints: the largest EIDE isks currently available hol 80GB each an stanar esktop PCs allow no more than 4 har isks. If sampling two channels at 48kHz, four 80GB isks can recor continuously for 19-20 ays. By using two har isks minimum, it is possible to backup a isk to high capacity SCSI tapes while the program is writing to the other isk(s). The backup proceure requires the user intervention. If more storage space is require, it is possible to a aitional controllers to increase the number of isks. For this purpose RAID EIDE controllers are inexpensive an very flexible, but limite to esktop PCs. More expensive SCSI controllers allow more than ten isks in line. In the near future even more flexible an powerful recoring systems will be base on external hot-swappable FireWire isks to eliver huge storage capacity to notebooks as well. Other than on the size, the choice of a isk shoul be base on the require throughput; for auio range there are no special requirements, but for ultrasonic recoring, at rates of 1 Mbytes/sec per channel, fast isks are require. For very large amounts of ata, in the orer of hunres of GB, there are no cheap an compact portable solutions an we look fonwar to the new optical technologies promising 10TB (maybe 100TB) of ata on a single tick optical isk. 3.6 ANALYSIS The spectrographic isplay is create calculating the FFT of signal frames extracte at regular intervals from the incoming signal. The FFT size is selectable ranging from 512 samples for the efault isplay to 16384 samples for the zoom isplay. Analysis parameters (sampling rate, FFT size, winow length, winow shape, scanning - or shift - step) can be iniviually moifie an the software checks for parameter incompatibilities (if the user sets, for example, the winow length greater than the FFT size, the FFT size is automatically increase to match the chosen winow). It is also possible to set the winow shorter then the FFT (zero paing) to increase time selectivity at the expense of frequency selectivity while maintaining a same frequency resolution. This makes the software very flexible, but requires a goo unerstaning of soun analysis principles also in relation to the real processing capabilities of a given machine. The isplay structure has been esigne to provie great accuracy an a precise relationship among isplaye pixels an ata resulting from FFT processing. A pixel in the spectrogram always reflects the value of a frequency bin of a given time winow, unless averaging or zooming have been chosen by the user.
Pavan et al. Soun Analysis Tools FFT sizes greater than 512 samples allow to select among packing frequency bins to fit the stanar isplay or zooming into a frequency range (Figure 1) by irectly plotting bins to the isplay. In this way the zooming factor correspons to the chosen FFT size / 512 (a FFT size of 16384 yiels a 32x zoom). When zoom is selecte, it is possible to switch among zoom isplay an stanar isplay while the spectrogram is running as well as to change the baseline of the zoome range. SpecGiam Heal T juie Specogiph C) by G.Pavan 1SB8-20U0 Fil^ T«!k fiii(t«disnuji SrtCuB": PiKtrts tll*l«-.t H* n(*iiin SniMAR Sfll MiR-flMm 6000s/:eo 1 ch Fiame size 18384-2730.667ft«ec Dfepltsi 43 631 sec QsS snapiot HI 93.76. ^-^[. vri v:-tv r>] lll»fci1iw»<»<»iul»j >il<»>wlli4li iiil i<»l»f #l»l'l i»i>«lwi«'l t Hlll>)>lil)l lilil<' Quick Cetlri^s Micioso(l Soun Mappw Bloc*- Sue 16384 Sati^linB Hale )6000 FFTSije J16384 Winow See J4036 Winow Tjipe JGaussiar Scan Step )512 -.1 Uisplaji Height 256 Colours o Buffet lot, mm DD:CD.4D.9BD overlap 87.5;; r I ( I 1 t ) I I 1 I I I I r r f D 5.461 1D.923 1S.3B4 21.845 27.307 3Z768 38.229 43.B91 s Itartllil P""" I R»"'Tiir»et Fleal Time ConlroU - i Colour shift fii jj_y I Buffer 0 1 2650 Offset FaJ 0 OvI 0 ; : DiS«=>Bain fi "jjijj Tracker T f? Zoom 32x^ bin ]Q,jjij- Remove Offset T fa" Figure 1. Real-time spectrogram In 32x zoom moe revealing fin whale souns at frequencies of about 20 Hz. Other than the traitional spectrogram it is possible to isplay the cepstrogram, which is the plot of the cepstrum versus time. This isplay is particularly useful to analyze harmonic souns an to improve the measure of their funamental frequency. In marine mammal stuies, the cepstrum is particularly useful to measure the Inter Pulse Interval (IPI) in sperm whale clicks [5]. Further real-time options are in evelopment to provie filtering, ban shifting an ban compression to make easier the analysis an management of ultrasonic signals. Among these proceures, which are base on FFT manipulation an inverse transformation, ban compression is the most critical because of the processing of phase information. Figures 2-4 show that ban shifting an ownsampling can transform ban limite ultrasonic signals to fit the frequency range of available recoring instrumentation, to reuce ata size an to make them auible while maintaining the original frequency-time features. The process has been mae in separate steps for emonstration purposes, but it can be performe in a single phase in real-time. These proceures are temporarily implemente in separate routines an will be inclue in the main software soon.
Pavan et al. Soun Analysis Tools 112 36 \ ^AAAAA^A^^^^^vwvvw\MA^^^^WAA^ mnm^^tmmmmm 16-0 - Figure 2. Sequence of test signals create in ttie range 0-128 l<l-lz (256 l<samples/sec). kh! 128 -^ 112-38 - SO - U - 48 - S2-16 - 0 - \ ^AAAAAAA^AA^^^^AA^^^AA^AA^^^wvwv^ Figure 3. Spectrogram ofttie test signals after shifting the range 96-128 khz own to 0-32 khz. 1024 1152 1280 HOB 1536 1664 179? TO Figure 4. Spectrogram after own shifting (96 khz) an subsampling (4:1) the test signals.
Pavan et al. Soun Analysis Tools Platform Intel Pentium architecture with Winows 95/98/NT/2000/ME Display wrap-aroun or scrolling high resolution real-time spectrogram an cepstrogram Channels 1 or 2 channels Sampling rates up to 192000 samples/sec epening on the installe soun evice Dynamic range 16 bit resolution, 96 B ynamic range on isplay IVIax Banwith up to 24 l<hz with stanar soun evices up to 96 khz with optional PCI cars in esktop PCs Min Banwith epens on the installe soun evice. Digital zooming allows to lower the banwith 32 times Recoring size up to 2GB per file, automatic file splitting to overcome the 2GB limit Recoring format stanar 16 bit uncompresse.wav files (mono or stereo) Play/Analysis 16 bit uncompresse.wav files an custom files up to 2GB Harware requirements: CPU RAM Display Soun I/O at least a P233 for single channel operation. A Pll 300 or faster recommene 64 MB or more at least 800x600 with 16M colours, 1024x768 or higher recommene any stanar soun evice compatible with Winows, incluing analog & igital ISA an PCI boars, igital & analog USB an FireWIre auio evices, stanar notebooks. Multiple soun evices are supporte High Spee Option: CPU Acquisition evice Channels Sampling rates at least a Pll 300, a Pill 500 or faster recommene National Instruments DAQ (PCMCIA or FireWIre for notebooks, PCI or FireWIre for esktops) 1 channel (up to 8 channels In evelopment) 500-500k samples/sec with 12 bit resolution (on a Pill 500) Other Options: External anti-aliasing filters, raio receiver for sonobuoys, GPS for position logging an time synchronization, loomblt ethernet for networke operations, 11 Mbit wireless ethernet, wireless auio, external storage units Table 1. Summary of features. 4. PRACTICAL APPLICATIONS High resolution real-time capabilities, typically available in much more expensive instruments, are very useful in behavioural experiments to monitor the acoustic activities of the subjects, for example to fin correlations among observe behaviours an emitte/receive signals. This allows to immeiately evaluate the results of an experiment instea of waiting for later analyses on the recorings; also, this makes easier to analyze long recorings an to access large soun atabases. A portable version base on a notebook can be easily move across laboratories or use on-fiel. A typical application for which the software was evelope is the monitoring, recoring an classification of souns receive by towe arrays or other unerwater acoustic sensors while performing marine mammals' acoustic an visual surveys. Lightweight portable equipment is of great value as it allows performing acoustic surverys on small platforms.
Pavan et al. Soun Analysis Tools Since its earlier DOS versions, the DSPW has been extensively use in our activities in the Meiterranean Sea [6] an in particular in fiel research when continuous nnonitoring was require. Typically, we use it to isplay the signals receive by two hyrophones of a towe array while performing passive acoustic surveys for marine mammals on boar of sailing boats. In this use, the spectrogram isplay is also useful to evaluate the quality of the entire electroacoustic chain an to optimize the instrumental setup, array positioning an cruising spee in relation to hyroynamic an propeller's noise. A number of software an harware tools have been also evelope an teste to improve the management of the system an to make easier the on boar activities. Among these, a raio transmitter to broacast receive signals to heaphones on the eck, ata logging utilities, GPS logging, an a wireless network to share ata. The continuous high resolution isplay of receive souns, even short an weak signals, can reveal the presence of istant olphins or sperm whales. By zooming into the low frequency range it is also possible to etect fin whale souns. Often, acoustic etection largely anticipates visual sightings in aylight surveys, while uring night surveys it allows the etection of animals an behaviours otherwise not evient [7]. The comparison between results gaine with passive acoustics an visual sightings is still a matter of iscussion as each metho epens on the equipment an on the observation platform. It is anyway evient that the two methos shoul be improve an integrate to provie accurate results on censusing while stuying marine mammals an their environment. Recently, the system has been teste as a monitoring tool for the application of acoustic mitigation policies aime at reucing the impact on marine mammals of unerwater high power acoustic sources. Connecte to goo passive acoustic receivers it can reveal the presence of vocalizing animals within or close to the operation area an thus warn about a possible threat to marine life. The DSPW was employe uring two marine mammal surveys, name SIRENA '99 an SIRENA '00, organize by NATO Saclant Unersea Research Centre within the "Soun, Oceanography an Living IViarine Resources" (SOLMaR) project [8,9]. This is a NATO joint research project aime at increasing the knowlege about marine mammals an their environment to set an test acoustic mitigation policies. In Sirena '99, other than for continuous monitoring of towe array phones, the DSPW was use to monitor sonobuoys for etecting fin whales souns by means of real-time zooming in the 0-100 Hz range. During Sirena '00 an unattene recoring system was set up to countinously isplay an recor souns for 8 ays, proucing 18GB of ata a ay (2 channels, 48 khz sampling, 16 bit resolution). Other PCs were use for soun analysis an classification by traine observers 24h/24h to obtain, at the en of the cruise, an accurate summary of all souns etecte [8,9]. 5. CONCLUSION The acquisition of large amounts of ata, mae possible by wiely available igital technologies, generates non-trivial problems when storage, classification an retrieval are concerne. Real-time techniques make easier looking through long recorings, nevertheless it is necessary to evelop easy to use an reliable automatic recognition software to automatically browse, analyze an catalogue ata acquire uring long term monitoring operations. Recognition proceures must be flexible enough to allow the efinition of few or many, broa or tight, soun categories an to balance the risk of false etections against the risk of missing events. The DSPW represent a low cost effective approach to soun analysis an processing in ethological stuies as well as in a number of ifferent tasks relate to the acoustic monitoring of marine mammals an of the unerwater environment in general. Also, its use prove to be of great value for eucation an training when use with a large soun atabase.
Pavan et al. Soun Analysis Tools Further evelopment step will necessarily inclue the extension to multi-channel analysis an the implementation of automatic recognition capabilities. 6. ACKNOWLEDGEMENTS We thank Saciant Centre an Angela D'Amico for having mae possible the SOLMaR project an Office of Naval Research (ONR) for having grante our research project "Bioacoustic Characterization of the Meiterranean Sea" (ONR Grant N00014-99-1-0709). We also thank the Italian Navy for its support to our research an for the ospitality on boar of their ships, helicopters an airplanes. 7. REFERENCES [1 PAVAN G., 1992. A portable DSP workstation for real-time analysis of cetacean souns in the fiel. European Research on Cetaceans, Cambrige (UK), 6:165-169. PAVAN G., 1992. A portable PC-base DSP workstation for bioacoustical research. Bioacoustics, 4 (1): 65-66. Abstract. PAVAN G., 1995. Low-cost real-time spectrographic analysis of souns. Bioacoustics, 6 (1): 81. Abstract. PAVAN G., BORSANI J.F., MANGHI M., PRIANO M., 1996. Interactive Digital Soun Library on cetaceans of the Meiterranean Sea. European Research on Cetaceans, 9: 81-84. Pavan G., Priano M., Manghi M., Fossati C, 1997. Software tools for real-time IPI measurements on sperm whale souns. Proc. Unerwater Bio-Sonar an Bioacoustics Symposium. Proc. I.O.A., 19 (part 9), Loughborough, UK: 157-164. PAVAN G., BORSANI J.F., 1997. Bioacoustic research on cetaceans in the Meiterranean Sea. Mar. Fresh. Behav. Physiol., 30: 99-123. Manghi M., Fossati C, Priano M., Pavan G., Borsani J.F., Bergamasco C, 1999. Acoustic an visual methos in the Oontocetes survey: a comparison in the Central Meiterranenan Sea. European Research on Cetaceans, 12: 251-253. AA.W. SIRENA '99. SACLANTCEN CD27 AA.W. SIRENA '00. SACLANTCEN CD41