
Birdsong Analysis: a Look Inside from Information Science

KHAN MD. MAHFUZUS SALAM

A thesis submitted in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

DEPARTMENT OF INFORMATION AND COMMUNICATION ENGINEERING
THE UNIVERSITY OF ELECTRO-COMMUNICATIONS

MARCH 2012

APPROVED BY SUPERVISORY COMMITTEE:
Prof. NISHINO TETSURO, Chairperson
Prof. TAKAHASHI HARUHISA
Prof. KOBAYASHI SATOSHI
Prof. KASHIHARA AKIHIRO
Assoc. Prof. SHONO HAYARU

Copyright © 2012 Khan Md. Mahfuzus Salam. All rights reserved.

To my Parents

Abstract

Songbirds have been actively studied for the complex brain mechanisms of sensorimotor integration involved in song learning. In general, a birdsong, which is a string of sounds, is represented by a sequence of letters called song notes. Our subject bird, the Bengalese finch (Lonchura striata var. domestica), has been widely studied for its unique song features, which resemble human language. Male Bengalese finches learn to sing by imitating external models. For computational analysis, the songs must first be represented as songnote sequences. An automated approach for this purpose is highly desirable, since manual processing makes human annotation cumbersome, and human annotation is heuristic and easily lacks objectivity. In our research, we propose a new approach for automatic detection and recognition of songnote sequences via image processing. The proposed method is based on the process a human uses to visually identify the patterns in a sonogram image. The songnotes of the Bengalese finch are bird-dependent: similar patterns do not appear in two different birds. Under this constraint, our experiments on real song data from different Bengalese finches show high accuracy rates for automatic detection and recognition of the songnotes. These results indicate that the proposed approach is feasible and generalizes to any Bengalese finch song. Furthermore, we focus on information-theoretic analysis of these sequential data to explore the complexity and diversity of birdsong and the learning process throughout song development. We design and develop an analysis tool with many features for analyzing sequential data. For the experiments, we employ thirteen male Bengalese finches, each with a different number of bouts of song data. By applying ethological data mining to these data, we discover that the finches follow two types of song learning mechanism: practice mode and adopt mode. In addition, we find that song features, e.g. traditional transmission, can be visualized by contour surface diagrams of the transition matrix. Furthermore, we can easily identify families from these contour surface diagrams, which is in general a very challenging task. Our results indicate that analysis based on data mining is a versatile technique for exploring new aspects of behavioral science.

Acknowledgments

First and foremost, I would like to express my sincere gratitude to my supervisor, Professor Nishino Tetsuro of U.E.C., for his continuous guidance, valuable advice, kind assistance, and encouragement. It has been a great honor and privilege for me to work with him. I would also like to thank the members of my supervisory committee: Professor Takahashi Haruhisa, Professor Kobayashi Satoshi, Professor Kashihara Akihiro, and Associate Professor Shono Hayaru, for their valuable ideas and suggestions in completing this thesis. I would also like to thank my research collaborators, Dr. Sasahara Kazutoshi, Dr. Takahasi Miki, and Dr. Okanoya Kazuo of RIKEN, for fruitful discussions and valuable advice on my research. I also want to thank Professor Yoshida Toshinobu, Professor Tanaka Shigeru, and Associate Professor Wakatsuki Mitsuo of U.E.C. for their suggestions and valuable advice related to my research. I am grateful to the Hirose International Scholarship Foundation for providing a scholarship that let me continue my higher study smoothly. I would like to express my deepest gratitude to my parents for their continuous support and encouragement. Until today, their love and encouragement have made it possible for me to overcome the hardships. I also want to thank my brothers and sisters for their encouragement to continue my study abroad. Finally, I thank all the members of the Nishino laboratory from 2006 to 2011 for their kind assistance and support. Living in a foreign country is never easy, but their warm and friendly behavior made it easy for me. I also want to thank all my friends at UEC for making my stay in Japan more memorable.

Contents

Abstract
Acknowledgments
1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Organization of this thesis
2 Background Study
  2.1 What is Birdsong
  2.2 Bengalese Finches Song
  2.3 Hierarchic Song Structure
  2.4 Development of Birdsongs
  2.5 Development of Song Syntax
3 Automation in Songnote Detection and Recognition
  3.1 Introduction
  3.2 Preliminaries
    3.2.1 Birdsong representation
    3.2.2 Bengalese finch song
    3.2.3 Detection and recognition
      3.2.3.1 Image feature extraction
      3.2.3.2 Image matching and pattern recognition
  3.3 Methodology
    3.3.1 Songnote detection
      3.3.1.1 Detection method
      3.3.1.2 Detection algorithm
    3.3.2 Songnote recognition
      3.3.2.1 Recognition method
      3.3.2.2 Chi-square goodness of fit test
      3.3.2.3 Recognition algorithm
  3.4 Results
    3.4.1 Description of data
    3.4.2 Songnote detection
    3.4.3 Automatic recognition
  3.5 Summary
4 Information-Theoretic Analysis
  4.1 Introduction
  4.2 Preliminaries
    4.2.1 Language and Birdsong
    4.2.2 Bengalese Finch Song and its Representation
    4.2.3 Information-Theoretic Measures
  4.3 Data Mining and Information Extraction
    4.3.1 Data Mining
    4.3.2 Mining Behavioral Sequential Data
  4.4 Data Mining from Birdsong
    4.4.1 Description of Data
    4.4.2 Common Parent-Progeny Findings
    4.4.3 Analysis on Evolution of Song
    4.4.4 Parent-Progeny Comparison of Songs
    4.4.5 Visualization of Song Features
  4.5 Summary
5 Conclusion
  5.1 Automation in analyzing the song
  5.2 Mining on behavioral sequences
  5.3 Summary
A χ² distribution table
References
List of Publications

Chapter 1
Introduction

1.1 Background

Ethology is the scientific study of animal behavior, exploring the mechanisms underlying diverse forms of behavior, from unlearned stereotyped behaviors to learned flexible ones. Songbirds have been actively studied as a good ethological model because of the complex brain mechanisms of sensorimotor integration involved in song learning. The Bengalese finch (Lonchura striata var. domestica) is a domesticated strain of a Southeast Asian finch, the white-rumped munia (Lonchura striata), and it has been a popular subject for neurobiological and ethological studies of birdsong because of its unique song features. Birdsongs are strings of sounds represented by a sequence of letters known as song notes. The song of the Bengalese finch has a complex structure compared with those of other songbirds such as the zebra finch (Taeniopygia guttata). According to recent studies, the courtship songs of Bengalese finches have unique features and similarities with human language [6]. Recent studies also show that the songs of male Bengalese finches are neither monotonous nor random; they consist of chunks, each of which is a fixed sequence of a few song notes. The song of each individual can be represented by a finite automaton, which is called the song syntax [6].

Thus, the songs of Bengalese finches have double articulation, one of the important faculties of human language (i.e., a sentence consists of words, and a word consists of phonemes). Song syntax is controlled by the song control nuclei in the brain, and the hierarchy of the song control nuclei directly corresponds to the song hierarchy [6].

Figure 1.1: Grayscale spectrogram of a Bengalese finch's song

Because of the structural and functional similarities of vocal learning between songbirds and humans, the former have been actively studied as a good model of human language [11]. In particular, the song syntax of Bengalese finches sheds light on the biological foundations of syntax. In ethological studies, there are three steps for understanding animal behavior: first, a behavioral phenomenon is observed and recorded; second, on the basis of the recordings, the recorded data are converted to a useful format; and third, the data are analyzed to understand the rules or to discover new knowledge. Fig. 1.2 shows the process flow generally used for birdsong analysis.

1.2 Motivation

As shown in Fig. 1.2, acoustic analysis of the birdsong is necessary to find the song elements, both for analyzing the birdsong to understand its syntax and for understanding the learning process of the song. Acoustic analysis for automatic recognition and extraction of the song elements is one of the main objectives of this research. Previous studies that applied a sound processing approach had some drawbacks.

Figure 1.2: Process flow of an ethological study on birdsong

For this reason, a new approach using image processing is applied in this study, based on the recognition process used by a human to visually identify patterns in a sonogram image.

Furthermore, Sasahara et al. reported two types of development of song syntax [21]. Their analysis was based on the number of edges required to represent the song syntax by an automaton. Such interesting phenomena during song development motivated us to study the learning process of the Bengalese finch song. No previous research has been reported that seeks to understand the process by which songbirds learn to sing through analysis from an information-theoretic viewpoint using developmental song data. In the present study, we therefore focus on understanding the learning process. The thesis is structured as follows.

1.3 Organization of this thesis

The theoretical background of birdsong is briefly provided in Chapter 2. Chapter 3 introduces a new approach for automatic detection and recognition of songnote sequences via image processing. The proposed method is based on the recognition process used by a human to visually identify patterns in a sonogram image. From our experiments on real birdsong data, we found the proposed scheme to be a feasible and generalized approach, considering the constraint that songnotes are bird-dependent.

In Chapter 4, we focus on information-theoretic analysis of these sequential data to explore the complexity and diversity of birdsong, and the learning process throughout song development. Finally, the obtained results are summarized in Chapter 5, and future research is briefly discussed.

Chapter 2
Background Study

2.1 What is Birdsong

Bird songs are the bird sounds that are melodious to the human ear. In ornithology, bird songs are often distinguished from shorter sounds, which may be termed calls. Bird vocalization includes both bird calls and bird songs. The distinction between songs and calls is based on inflection, length, and context. Songs are longer and more complex and are associated with courtship and mating, while calls tend to serve such functions as alarms or keeping members of a flock in contact. Bird song is best developed in the order Passeriformes. Most song is emitted by male rather than female birds. Song is usually delivered from prominent perches, although some species may sing when flying. Some groups are nearly voiceless, producing only percussive and rhythmic sounds, such as the storks, which clatter their bills. Other authorities, such as Howell and Webb (1995), make the distinction based on function, so that short vocalizations such as those of pigeons, and even non-vocal sounds such as the drumming of woodpeckers and the winnowing of snipes' wings in display flight, are considered songs. In Europe and the United States the songs of the zebra finch are well studied, while in Japan the songs of the Bengalese finch, a popular fowl famous for its unique song features, are well studied.

The song of the Bengalese finch is used solely in courtship display and functions only in a sexual context, never in an aggressive context. Thus, the phenotype of birdsong in finches should be sensitive to the process of sexual selection. The Bengalese finch has become increasingly popular as an experimental animal for studying the neural basis of song control [6, 11].

2.2 Bengalese Finches Song

The Bengalese finch is a domesticated strain of a Southeast Asian estrildine finch, the white-backed munia (Lonchura striata). Their habitat ranges from wild fields and agricultural areas to human residential areas. They live in flocks of around 100. Several white-backed munias were imported into Japan from China about 250 years ago. About 140 years ago, white plumage mutations occurred in the white-backed munia, which by then was called Jyu-shimatsu (ten sisters) because of its gentleness and tameness; this made the strain even more popular as a cage bird.

Figure 2.1: A Bengalese finch (right) and a white-backed munia. The Bengalese finch has white feathers with dark brown patches; the white-backed munia is in dark gray plumage except for a part of the rump, which is white.

The domestication of the Bengalese finch began some 250 years ago in Japan, and several modifications in coloration and behavior occurred. A recent study shows that the songs of Bengalese finches are much more complex in their temporal organization than the songs of related species such as zebra finches, and that this complexity arose during domestication [2]. This is shown by comparing syntactical and acoustical parameters of songs between the wild and domesticated strains of the white-backed munia. Acoustical morphologies of the song elements were strain-specific: similarities among song elements were higher within individuals of each strain, but the degrees of morphological variation were comparable between the strains. In the time domain, white-backed munias sang a highly stereotyped song: a song element was always followed by one of certain song elements in a deterministic way. Bengalese finches, on the other hand, sang complex songs in which one song note could be followed by several possible song notes. Male songs should evolve largely under two different pressures: female preference and risk of predation. The low degree of complexity found in wild white-backed munias may be the result of balancing these two factors. In Bengalese finches, because of domestication, predation is no longer a selection pressure. Thus, it is likely that Bengalese finch songs underwent changes that were favored by females. The major difference between other songbirds like the zebra finch and the Bengalese finch is the complexity of the songs. The songs of the zebra finch are sung by repeating a fixed short pattern and are comparatively simple. The songs of the Bengalese finch, in contrast, are more complex: they are sung by changing the order of and repeating some song elements. The song syntax is much more complex in Bengalese finches than in white-backed munias. Fig. 2.2 and Fig. 2.3 show two representative transition diagrams from individuals of each species. Bengalese finch songs have more repeating notes and more loops and embedded structures (Fig. 2.2), while white-backed munia songs are generally linear (Fig. 2.3). Bengalese finches have somewhat complex patterns of note-to-note transition, while transition patterns are much simpler in white-backed munias. Experiments show that the song of the Bengalese finch became complex through domestication, which might have freed the birds from predatory pressures. Females of the white-backed munia tend to like complex songs, so males that can sing complex songs can leave more offspring.

Figure 2.2: Transition diagram of a male Bengalese finch's song [6]

Figure 2.3: Transition diagram of a white-backed munia's song [6]

2.3 Hierarchic Song Structure

A sonograph is a frequency analyzer used to visually represent finches' songs. It represents the voice signal as a graph in which the horizontal axis represents time and the vertical axis represents frequency. A graph generated by a sonograph is called a sonogram. Fig. 3.1 shows the sonogram of a Bengalese finch's song; it clearly shows the silent intervals between song elements. Song elements are delimited by silent intervals, and one symbol is assigned to similar elements in order to convert them to text data. We call these elements song notes. Adult Bengalese finches have eight different song notes on average.

Figure 2.4: Grayscale spectrogram of a Bengalese finch's song

Two different individuals have different song notes, so there is no universal correspondence between the song notes and the symbols. The song note a of one individual therefore has no relation to the song note a of another individual. Note, however, that a father and son have similar song notes, so the symbols are assigned based on this relation.

Figure 2.5: Courtship song syntax represented by an automaton

Sound data of Bengalese finch songs can be converted to text data by extracting symbols as in Fig. 3.1, giving for example aaabcdeebfghjkkcdeeblaaabfghjkkcdeeb; the transition diagram of this song is shown in Fig. 2.5. As this text sequence shows, there are some fixed patterns in the song. For example, the song can be delimited as follows: aaab cdeeb fgh jkk cdeeb l aaab fgh jkk cdeeb. The song therefore consists of the patterns aaab, cdeeb, fgh, and jkk. Bengalese finch songs are sung according to a hierarchical structure like this.
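As an illustration of how such a note sequence relates to a transition diagram, the following minimal Java sketch (not from the thesis; the sequence is the example above) counts note-to-note transitions, which are the edges of diagrams like Figs. 2.2, 2.3, and 2.5:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Minimal sketch: count note-to-note transitions in a songnote sequence.
    public class TransitionCounter {
        public static void main(String[] args) {
            String song = "aaabcdeebfghjkkcdeeblaaabfghjkkcdeeb"; // example sequence from the text
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (int i = 0; i + 1 < song.length(); i++) {
                String edge = song.charAt(i) + "->" + song.charAt(i + 1);
                counts.merge(edge, 1, Integer::sum); // increment the count of this edge
            }
            // Each entry corresponds to one edge of the transition diagram.
            counts.forEach((edge, n) -> System.out.println(edge + " : " + n));
        }
    }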

The rest of this section briefly explains some general terms used in birdsong research.

Song note: A song note is a symbol assigned to an independent pattern that appears in a sonogram, as seen in Fig. 3.1. It is also referred to as a song element or behavioral element. By this definition, text data comprising symbols (such as a, b, and c) are called songnote sequences. Song notes are analogous to phonemes in human language.

Chunk: A fixed sequence of song notes is called a chunk. In Fig. 3.1, for example, the chunks are ab, cde, and fg. Chunks are analogous to words in human language.

Song unit: A song unit consists of chunks. Song units are analogous to sentences in human language.

Song bout: A song bout consists of one or more song units; it is a continuous song produced by the bird in one sitting. Song bouts are analogous to paragraphs in human language.

The hierarchic structure is closely related to the brain structure of songbirds. Fig. 2.6 shows a portion of the song system of the Bengalese finch. An experiment examining the function of each part was conducted by Okanoya et al. of RIKEN using birds with damaged NIf, HVC, and RA [6]. According to the results, when NIf was damaged, complex songs changed into simple songs: song notes and chunks were not destroyed, but the transition structure became simple. When HVC was partially damaged, specific chunks were lost: all song notes were maintained, but specific arrangements of song notes were lost. Furthermore, when RA was damaged, specific song notes were lost. Additionally, when the left RA was damaged, song notes with high frequency could not be uttered, while when the right RA was damaged, song notes with low frequency could not be uttered; thus, the right and left RA are each in charge of one frequency range. It turns out that the layered structure of songs (bout, chunk, and song note) corresponds to the anatomical hierarchy (NIf, HVC, and RA).

Figure 2.6: The song system of the Bengalese finch. This pathway is composed of NIf → HVC → RA; it receives auditory input from Field L and generates motor commands that regulate the syrinx muscles for vocalization.

2.4 Development of Birdsongs

The following describes the developmental process of song learning in male Bengalese finches. The indicated ages in days are approximate.

Age: 35 days. Bengalese finches begin to sing around this age. The early-stage songs sound like noise, for example Ja, Ju, etc. The songs of this period are called subsongs.

Age: 70 days. Almost all song notes appear. However, the arrangement of song notes is not fixed and changes every time the bird sings. Therefore, the songs around this period are called plastic songs. Around this time, their songs are not intended for courtship.

Age: 120 days. The arrangement of the song becomes fixed, and the song is called a crystallized song. At this time, the birds sing not only for learning but also for courtship.

2.5 Development of Song Syntax

A previous study demonstrated the development of song syntax in juvenile Bengalese finches [21]. In that study, the song syntaxes of a father Bengalese finch and its offspring were compared. Figure 2.7 shows the adult song syntax of the father bird and the developmental song syntaxes of its offspring. The father has a very simple song syntax with only one transition (representable simply as the pattern abcdef).

Figure 2.7: Development of song syntax

The offspring follows the father's song sample in developing its own song syntax. At 64 days of age it has a redundant song syntax with many transitions and degree of reversibility k = 1, which differ from the father's. At 78 days of age the child bird seems to have correctly finished learning the father's song syntax. However, it again develops redundant transitions at 92 days of age.

After that, it crystallizes the song syntax and finally obtains an original song syntax, which is similar to the father's but has a novel transition involving f. This result indicates that offspring do not always learn the father's song syntax in accurate detail; the vocal learning of songbirds may therefore involve factors other than grammatical inference.

By applying their method to the developmental song data, we generate the development of song syntax as topological changes. To generate the song syntax, two parameters have to be set: the dependency N and the transition threshold TT. N is estimated by the chunk extraction method. From their experiments, they conclude that chunks can be correctly extracted by adopting a rather small value of N that is suitable for stable extraction (e.g., N from 3 to 8). The estimation of TT depends on two important observations. First, as the noise per song unit increases, the TT needed to construct the correct reversible automaton increases. Second, even if the song unit size decreases, TT as a fraction of the song unit does not change much. Based on their experiments, they show that for Bengalese finch songs the appropriate TT is 5% to 15% of the song unit size. Following their estimation method, N is set between 3 and 8, and TT is set at 10% to 15% of the song unit size. The following figures show the developments in syntax.
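Before turning to the figures, a small Java sketch of one plausible reading of TT (our assumption, not spelled out in the source: TT acts as a minimum transition count relative to the song unit size, below which a transition is treated as noise):

    import java.util.HashMap;
    import java.util.Map;

    // Sketch: prune rare note-to-note transitions, assuming TT acts as a
    // minimum-count threshold relative to the song unit size (our reading).
    public class TransitionThreshold {
        public static Map<String, Integer> prune(Map<String, Integer> counts,
                                                 int songUnitSize, double tt) {
            int minCount = (int) Math.ceil(songUnitSize * tt); // e.g., 50 notes * 0.10 = 5
            Map<String, Integer> kept = new HashMap<>();
            counts.forEach((edge, n) -> {
                if (n >= minCount) kept.put(edge, n); // keep only frequent edges
            });
            return kept;
        }
    }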

Figure 2.8: Development of song syntax for Shiro (N = 8, TT = 12%); panels show 64, 78, 92, 100, and 120 days of age.

Figure 2.9: Development of song syntax for LAo (N = 3, TT = 10%); panels show 71, 88, 95, 102, and 116 days of age.

Figure 2.10: Development of song syntax for RAo (N = 7, TT = 14%); panels show 61, 76, 87, 102, and 127 days of age.

Figure 2.11: Development of song syntax for Atama (N = 3, TT = 13%); panels show 62, 69, 89, 99, and 124 days of age.

Figure 2.12: Development of song syntax for RMo (N = 5, TT = 10%); panels show 70, 77, 91, 99, and 125 days of age.

Figure 2.13: Development of song syntax for RKi (N = 5, TT = 15%); panels show 70, 79, 93, 106, and 127 days of age.

Figure 2.14: Development of song syntax for LMizuiro (N = 7, TT = 13%); panels show 74, 81, 94, 108, and 130 days of age.

Chapter 3
Automation in Songnote Detection and Recognition

The Bengalese finch song has been widely studied for its unique features and similarity to human language. For computational analysis, the songs must first be represented as songnote sequences, and an automated approach for this purpose is highly needed: manual processing makes human annotation cumbersome, and human annotation is heuristic and easily lacks objectivity. In this chapter, we introduce a new approach for automatic detection and recognition of songnote sequences via image processing. The proposed method is based on the recognition process used by a human to visually identify patterns in a sonogram image. The songnotes of a birdsong depend on the bird (i.e., similar patterns do not appear in two birds). Under this constraint, our experiments on real birdsong data achieve a high accuracy rate, and our method can deal with any Bengalese finch song. Thus, we consider our method a feasible and generalized approach.

3.1 Introduction

Birdsong has been actively studied via analysis of songnote sequences to understand the language model of birds. The song of the Bengalese finch (Lonchura striata var. domestica), a popular fowl in Japan, is widely used for this purpose. The song of the Bengalese finch has a complex structure compared with those of other songbirds such as the zebra finch (Taeniopygia guttata), and Bengalese finch songs have therefore been studied as a model of human language. According to recent studies, the courtship songs of Bengalese finches have unique features and similarities with human language [6]. In birdsong research, acoustic song analysis is necessary to find the song elements and their sequence in order to understand the song syntax [10] and the learning process of the song. The current research focuses on automatic detection and recognition of the songnotes and their sequence. Previous studies that employed sound processing had drawbacks; this chapter introduces a new generalized approach that employs image processing to overcome them.

3.2 Preliminaries

This section briefly introduces the theoretical foundations of birdsong, its representations, image basics, and the human recognition process, since our method is modeled on the recognition process that is manually carried out by humans.

3.2.1 Birdsong representation

In birdsong analysis, the song data are recorded in a special cage equipped with an automated recording system, in an environment chosen to avoid noise. From the recorded sound data, we obtain the sonogram image of the song. For further computational analysis, the sonogram image is used as the standard representation of the song [12]. A sonogram is an image that shows how the spectral density of a signal varies with time; it is also known as a spectrogram, voiceprint, or voicegram.

Sonograms are used to identify phonetic sounds, to analyze animal cries, and in fields such as speech processing, music, sonar/radar, and seismology. There are many variations in the format of a sonogram. Sometimes the vertical and horizontal axes are switched; sometimes the amplitude is represented as the height of a 3D surface instead of by color or intensity. The frequency and amplitude axes can be either linear or logarithmic, depending on what the graph is used for. For instance, audio is usually represented with a logarithmic amplitude axis, and frequency can be linear, to emphasize harmonic relationships, or logarithmic, to emphasize musical, tonal relationships. The most common format is a graph with two geometric dimensions: the horizontal axis represents time and the vertical axis frequency, while a third dimension, indicating the amplitude of a particular frequency at a particular time, is represented by the intensity or color of each point in the image. This common format is used in birdsong research. Fig. 3.1 shows a sample grayscale sonogram image of a Bengalese finch courtship song.

Figure 3.1: Grayscale sonogram image of a Bengalese finch song
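To make this time-frequency representation concrete, here is a minimal Java sketch (an illustration only, not the tool used in the thesis; the window length is an arbitrary choice, and a naive DFT stands in for the FFT a real analyzer would use) that computes one column of a magnitude sonogram from raw audio samples:

    // Illustration: one spectrogram column via a naive DFT (O(n^2); real
    // tools use an FFT). The window size is an arbitrary example.
    public class SpectrogramColumn {
        // Returns magnitudes of frequency bins for samples[start .. start+win-1].
        public static double[] dftMagnitudes(double[] samples, int start, int win) {
            double[] mags = new double[win / 2]; // keep the non-redundant half
            for (int k = 0; k < mags.length; k++) {
                double re = 0, im = 0;
                for (int n = 0; n < win; n++) {
                    double angle = -2 * Math.PI * k * n / win;
                    re += samples[start + n] * Math.cos(angle);
                    im += samples[start + n] * Math.sin(angle);
                }
                mags[k] = Math.hypot(re, im); // amplitude of frequency bin k
            }
            return mags; // map to gray levels to draw one sonogram column
        }
    }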

3.2.2 Bengalese finch song

Recent studies on Bengalese finches show that the songs of male Bengalese finches are neither monotonous nor random; they consist of chunks, each of which is a fixed sequence of a few song notes. The songs of Bengalese finches have double articulation (a sentence consists of words, and each word consists of phonemes), which is also one of the important faculties of human language. The song syntax is manipulated by the song control nuclei in the brain, and the hierarchy of the song control nuclei directly corresponds to the song hierarchy [6]. Because of the structural and functional similarities of vocal learning between songbirds and humans, the former have been actively studied as a good model of human language [11]. In particular, the song syntax of Bengalese finches sheds light on the biological foundations of syntax.

3.2.3 Detection and recognition

Human vision is one of the most important perceptive mechanisms. It provides the information required for relatively simple tasks (e.g., object recognition) as well as for very complex ones. In birdsong research, songnote recognition is carried out by humans by visually inspecting the patterns represented in a sonogram image.

3.2.3.1 Image feature extraction

Digital image processing denotes analysis carried out on the basis of the pixel properties of an image, irrespective of the image type. A digital image has a finite set of digital values called picture elements or pixels, arranged in a fixed number of rows and columns. Pixels are the smallest individual elements in an image, holding quantized values that represent the brightness of a given color at a specific point. Typically, pixels are stored in computer memory as a raster image or raster map, a two-dimensional array of small integers; these values are often transmitted or stored in compressed form. Each pixel of a raster image is typically associated with a specific position in some 2D region and has a value of one or more quantities related to that position. Digital images can be classified according to the number and nature of these samples into the following categories: binary, grayscale, color, and false-color.

In our research, we use sonogram images that are grayscale images.

Grayscale Image: A grayscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information. Grayscale images are distinct from one-bit black-and-white images, which in the context of computer imaging are images with only two colors, black and white (also called binary images); grayscale images have many shades of gray in between. The reason for distinguishing such images from other color images is that less information needs to be provided for each pixel. In fact, a gray color is one in which the red, green, and blue components all have equal intensity in RGB space, so it is only necessary to specify a single intensity value per pixel, as opposed to the three intensities needed for each pixel of a full-color image. Grayscale images are also called monochromatic images, denoting the absence of chromatic variation.

Pixel Values: Each pixel of an image stored inside a computer has a pixel value that describes how bright the pixel is and/or what color it should be. In the simplest case of binary images, the pixel value is a 1-bit number indicating either foreground or background. For a grayscale image, the pixel value is a single number that represents the brightness of the pixel, expressed within a given range between a minimum and a maximum. In computing, although grayscale can be represented by rational numbers, image pixels are stored in a binary, quantized form. At present, grayscale images are commonly stored with 8 bits per sampled pixel, which allows 256 different intensities (i.e., shades of gray). The precision provided by this format is barely sufficient to avoid visible banding artifacts, but it is very convenient for programming because each pixel occupies exactly one byte. This binary representation assumes that 0 is black and the maximum value, 255, is white. In our research, we use grayscale sonogram images with pixel intensity values from 0 to 255.
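As a concrete illustration (not part of the thesis's pipeline; averaging the three channels is one common convention), the following Java sketch reads the 0-255 gray value of a pixel, converting from packed RGB where necessary:

    import java.awt.image.BufferedImage;

    // Sketch: obtain the 0-255 intensity of a pixel from a sonogram image.
    // For a true gray pixel, R = G = B, so any channel (or their mean) works.
    public class GrayValue {
        public static int intensity(BufferedImage img, int x, int y) {
            int rgb = img.getRGB(x, y);          // packed 0xAARRGGBB
            int r = (rgb >> 16) & 0xFF;
            int g = (rgb >> 8) & 0xFF;
            int b = rgb & 0xFF;
            return (r + g + b) / 3;              // mean of channels; equals r for true gray
        }
    }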

3.2.3.2 Image matching and pattern recognition

Pattern recognition aims to classify data or patterns on the basis of either a priori knowledge or statistical information extracted from the patterns. The patterns to be classified are usually groups of measurements or observations, defining points in an appropriate multidimensional space. This is in contrast to pattern matching, where the pattern is rigidly specified. Pattern recognition is used to test whether things have a desired structure, to find relevant structure, to retrieve the matching parts, and to substitute the matching part with something else.

Figure 3.2: McCulloch and Pitts' simplified model of a neuron and its implementation as a threshold logic unit [3]

In human vision-based recognition of an image, the first thing that catches the attention is something familiar. To be recognized, an object must have some feature that our consciousness can assign. Behind this process, a mental model captures the important characteristics of the object. Unfortunately, in many scientific experiments the task assigned to human vision is not the recognition of familiar objects but the detection and description of unfamiliar ones, which is far more difficult.

According to the McCulloch and Pitts simplified neuron model, when the weighted sum of many inputs exceeds a threshold, the output is turned on; learning consists of adjusting the weights, which can be either positive or negative [3]. The current research applies an image processing methodology based on grayscale image features of the sonogram. The motivation for applying such image processing is to find a simple and generalized way to automate what a human brain does in the recognition process by applying pattern matching.

3.3 Methodology

The proposed automation process is divided into two steps. First, from the song sonogram image, we detect the song elements on the basis of local properties of the sonogram image. Then, on the basis of the detected elements, we apply image matching to assign a label to each extracted element, and thus we obtain the songnote sequence of the song.

Figure 3.3: Songnote sequence detection and recognition process

3.3.1 Songnote detection

From the sonogram image, we first detect the elements. The recognition process is then carried out on the basis of statistical features extracted from the detected elements. For this reason, the detection process is very important.

3.3.1.1 Detection method

The detection process is carried out by analyzing the intensity values of the sonogram image, from which we obtain a graph of the average pixel intensity. A sonogram image may contain noise, e.g. at the beginning, which a human ignores during visual inspection but which the system by itself would not ignore; for this reason, we pre-process the sonogram image. Then, taking the average intensity value along each vertical line, we draw a graph in which the Y-axis represents the average intensity (gray) value and the X-axis represents the pixel index x, the distance from the (0, 0) pixel along the X-axis:

Figure 3.4: Average intensity value graph derived from the sonogram image

The graph in Fig. 3.4 is generated from the sample sonogram image shown in Fig. 3.5. From the graph, clear gaps between the elements are visible. By defining parameters (Fig. 3.5) such as the minimum element width, the minimum gap between elements, and the intensity threshold, we can execute our algorithm to find the song elements. If a region does not satisfy the three above-mentioned parameters, we consider it noise. Note that these parameters can vary from bird to bird. The detected song elements and their features, such as width information, are used for the recognition process.
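A minimal sketch of this step (illustrative only; it assumes the GrayValue pixel reader shown earlier), computing the average intensity of each vertical pixel column of the sonogram image:

    import java.awt.image.BufferedImage;

    // Sketch: average intensity per vertical pixel column of the sonogram.
    public class IntensityProfile {
        public static double[] columnAverages(BufferedImage img) {
            double[] avg = new double[img.getWidth()];
            for (int x = 0; x < img.getWidth(); x++) {
                long sum = 0;
                for (int y = 0; y < img.getHeight(); y++) {
                    sum += GrayValue.intensity(img, x, y); // 0-255 gray value
                }
                avg[x] = (double) sum / img.getHeight();   // one point of the graph
            }
            return avg;
        }
    }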

Figure 3.5: Sample sonogram image and the parameters

3.3.1.2 Detection algorithm

The song element detection algorithm takes the array of average intensity values as input. On the basis of the defined parameter values, the proposed detection algorithm produces an unlabeled list of song elements.

Detection Algorithm
Input: array of intensity values.
Output: a list of elements.
Procedure:
1. Initialize the parameters.
2. If the intensity value exceeds the threshold and what follows is not a gap, set the start-element flag to true and set the start index to the current index.
3. If the start-element flag is true and the next minimum gap is detected, set the start-element flag to false, set the end index to the current index, and add the element to the element list.
4. Continue steps 2 and 3 until the end of the intensity array.
5. Return the element list.
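A runnable Java sketch of this procedure follows. It is our interpretation of the pseudocode above; in particular, whether note columns lie above or below the threshold depends on the polarity of the sonogram (dark-on-white vs. white-on-dark), and here, following the pseudocode literally, columns exceeding the threshold belong to an element:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch implementing the detection pseudocode above (one reading of it).
    public class NoteDetector {
        public static class Element {
            public final int start, end;
            Element(int s, int e) { start = s; end = e; }
        }

        public static List<Element> detect(double[] avg, double threshold,
                                           int minWidth, int minGap) {
            List<Element> elements = new ArrayList<>();
            int start = -1; // -1 means "not inside an element"
            int gap = 0;    // length of the current run of below-threshold columns
            for (int x = 0; x < avg.length; x++) {
                if (avg[x] > threshold) {
                    if (start < 0) start = x; // element begins
                    gap = 0;                  // short dips are bridged, not gaps
                } else if (start >= 0 && ++gap >= minGap) {
                    int end = x - gap;        // last above-threshold column
                    if (end - start + 1 >= minWidth)
                        elements.add(new Element(start, end)); // narrow runs = noise
                    start = -1; gap = 0;
                }
            }
            if (start >= 0 && avg.length - start >= minWidth)
                elements.add(new Element(start, avg.length - 1)); // trailing element
            return elements;
        }
    }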

3.3.2 Songnote recognition

To extract the songnote sequence from the sonogram image, we extract local statistical features and then carry out statistical pattern matching for recognition.

3.3.2.1 Recognition method

As discussed in the previous section, similar patterns are assigned the same label in the recognition process. Our recognition method is based on local properties of the sonogram image. By executing the note detection algorithm, we obtain the element list, and this unlabeled element list provides the start-pixel and end-pixel information for every element.

$R_1 R_2 R_3 = \{R_1 g_0, \ldots, R_1 g_7, R_1 g_c,\; R_2 g_0, \ldots, R_2 g_7, R_2 g_c,\; R_3 g_0, \ldots, R_3 g_7, R_3 g_c\}$

Figure 3.6: The feature extraction procedure for N = 3

As note patterns of the Bengalese finch song differ from bird to bird, we decided not to use any prior knowledge; rather, we use statistical information extracted from the patterns. First, we divide every note into N regions, and every region is divided into nine (3 × 3) cells. We denote the center cell as $g_c$ and the other cells as $g_n$, in clockwise order, where $n = 0, 1, \ldots, 7$. Thus, we obtain a set of values for every single element. Then, we apply a statistical test, the chi-square test, to measure the similarity between elements. Note that the value of N should not be greater than 3, because if the set size exceeds thirty, the chi-square distribution tends toward a normal distribution.
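A sketch of this feature extraction (illustrative; the exact cell traversal order beyond "clockwise around the center" is not specified in the text, so the row-major ordering below is an assumption):

    import java.awt.image.BufferedImage;

    // Sketch: per-element feature vector of N x 9 average cell intensities.
    // An element spans columns [start, end] of the sonogram; the cell
    // ordering inside each 3x3 region is an assumption for illustration.
    public class NoteFeatures {
        public static double[] features(BufferedImage img, int start, int end, int N) {
            double[] f = new double[N * 9];
            int regionW = (end - start + 1) / N;
            int cellW = Math.max(1, regionW / 3);
            int cellH = Math.max(1, img.getHeight() / 3);
            int k = 0;
            for (int r = 0; r < N; r++) {            // N regions along the time axis
                int rx = start + r * regionW;
                for (int cy = 0; cy < 3; cy++)       // 3 x 3 cells per region
                    for (int cx = 0; cx < 3; cx++)
                        f[k++] = cellAverage(img, rx + cx * cellW, cy * cellH, cellW, cellH);
            }
            return f; // 27 values when N = 3
        }

        private static double cellAverage(BufferedImage img, int x0, int y0, int w, int h) {
            long sum = 0; int count = 0;
            for (int x = x0; x < Math.min(x0 + w, img.getWidth()); x++)
                for (int y = y0; y < Math.min(y0 + h, img.getHeight()); y++) {
                    sum += img.getRGB(x, y) & 0xFF;  // blue channel = gray value
                    count++;
                }
            return count == 0 ? 0 : (double) sum / count;
        }
    }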

3.3.2.2 Chi-square goodness of fit test

The chi-square test ($\chi^2$) is a statistical hypothesis test whose results are evaluated by reference to the chi-square distribution. Pearson's chi-square test, whose properties were first investigated by Karl Pearson [9], is the original, best-known, and most widely used of several chi-square tests. When an analyst attempts to fit a statistical model to observed data, he or she may wonder how well the model actually reflects the data: how close are the observed values to those expected under the fitted model? One statistical test that addresses this question is the chi-square goodness of fit test. This test is commonly used to test the association of variables in two-way tables, where the assumed model of independence is evaluated against the observed data. In general, the chi-square test statistic is of the form

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}},$$

that is,

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i},$$

where $\chi^2$ is the test statistic, which asymptotically approaches a $\chi^2$ distribution; $O_i$ is an observed frequency; $E_i$ is an expected frequency, asserted by the null hypothesis; and $n$ is the number of possible outcomes. The chi-square statistic is calculated by taking the difference between each observed and theoretical frequency for each possible outcome, squaring it, dividing by the theoretical frequency, and summing the results. The chi-square statistic can be used to calculate a p-value by comparing the value of the statistic to a chi-square distribution. The number of degrees of freedom is equal to the number of possible outcomes minus 1. If the computed test statistic is larger than the critical value from the chi-square table [9] with $(n - 1)$ degrees of freedom, the observed and expected values are not close, and the model is a poor fit to the data.
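A direct transcription of this statistic in Java (illustrative; it compares two feature vectors by treating one as observed and one as expected, as the recognition algorithm below does):

    // Sketch: Pearson's chi-square statistic over two feature vectors,
    // treating one element's cell averages as observed and the other's
    // as expected frequencies.
    public class ChiSquare {
        public static double statistic(double[] observed, double[] expected) {
            double chi2 = 0;
            for (int i = 0; i < observed.length; i++) {
                double d = observed[i] - expected[i];
                if (expected[i] != 0) chi2 += d * d / expected[i]; // (O_i - E_i)^2 / E_i
            }
            return chi2; // compare against the critical value with n-1 degrees of freedom
        }
    }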

3.3.2.3 Recognition algorithm

The songnote recognition algorithm takes the unlabeled list of song elements, applies the goodness of fit test to measure the similarity between elements, and produces the songnote sequence.

Recognition Algorithm
Input: unlabeled list of elements.
Output: labeled list of elements.
Procedure:
1. Divide each element in the element list into N × 9 cells, where 0 < N < 4.
2. Calculate the average intensity value for every cell.
3. For each element, while any element remains unlabeled: set one element as expected and the others as observed; if the expected element is not labeled, assign it a new label; compute the chi-square statistic.
4. If an observed element passes the test, assign it the same label as the expected element.
5. Return the updated element list.

3.4 Results

In this section, we present the results of our methodology for analyzing the Bengalese finch song. First, we explain the nature of our real song data; we then discuss the results of the automatic detection and recognition of the songnotes.

3.4.1 Description of data

To test the proposed method, we use five different song units (phrases) for each of three mature Bengalese finch songs; the names of the finches are Hikari 52, Hikari 49, and Kuro 0362. The song data were recorded at the Okanoya laboratory of RIKEN. The spectrogram images of mature Bengalese finches have similar properties: the note patterns are clearly visible, and almost every songnote is separated by considerable blank space. Fig. 3.7 shows partial sonogram images for the three birds.

Figure 3.7: Spectrograms for Hikari 49 (top), Hikari 52 (middle), and Kuro 0362 (bottom)

The sample sonogram images contain forty-six to fifty-four notes for Hikari 52, fifty-one to fifty-nine notes for Hikari 49, and fifty-two to sixty-one notes for Kuro 0362. From the sample sonogram images in Fig. 3.7, it is clearly visible that the sonogram image of Hikari 49 is more complex than those of Hikari 52 and Kuro 0362: for Hikari 52 and Kuro 0362, the song notes are almost clearly separated from one another, but for Hikari 49 they are not. Applying our methodology, we implemented an application in Java that takes the sonogram image as input and provides the extracted song elements and their sequence as output. The ImageJ API [7] is used for analyzing image properties.

3.4.2 Songnote detection

In Section 3.3.1, we discussed the songnote extraction methodology and explained the algorithm used for extracting the song notes from a sonogram image. We used parameters such as the minimum note width, the intensity threshold, and the minimum gap between notes, set for every bird to a minimum note width of 10 pixels, an intensity threshold of 250, and a minimum gap between notes of 5 pixels. After executing the algorithm described in Section 3.3.1.2, we obtain the following best-case results:

Table 3.1: Results of the automatic detection of song elements

Bird name | Number of detected elements | Average accuracy rate
Hikari 52 | 46-54 | 98%
Hikari 49 | 51-59 | 90%
Kuro 0362 | 52-61 | 95%

In the case of Hikari 52, inspecting the extracted patterns, we find some noise among them even though the accuracy rate is good. If, to avoid the noise, we apply a cutoff level of 30 to the intensity value graph, we obtain 40 extracted elements; the accuracy rate therefore decreases, and certain elements lose necessary information, which is not desirable. Fig. 3.8 illustrates the noise situation.

Figure 3.8: Description of noise and the effect of applying a cutoff level for Hikari 52

In the case of Hikari 49, when we inspect the extracted patterns, we find that some song notes are not extracted correctly. Initially, we have an accuracy rate of 75% with our default parameter values, as the gaps between the elements are too short to separate. Fig. 3.9 illustrates the errors in the detection process.

Figure 3.9: Description of the detection errors for Hikari 49

Fig. 3.9, except Fig. 3.9(d), shows some incorrectly extracted notes for Hikari 49. On careful inspection, we can observe that Fig. 3.9(a) and Fig. 3.9(b) should each be extracted as two different elements, because the right pattern in Fig. 3.9(a) and Fig. 3.9(b) also appears separately in the sonogram image (see Fig. 3.9(c)), and Fig. 3.9(c) should be extracted as three different elements. However, Fig. 3.9(d) is considered correctly extracted even though it has the same nature as the patterns in Fig. 3.9(a, b, and c), because its two patterns are very close, and the left and right patterns do not appear separately in the song. We adjust the default value of the minimum gap between notes to two pixels and use a cutoff level of nine; thus, we obtain the best-case result with an accuracy rate of 90%.

3.4.3 Automatic recognition

In Section 3.3.2, we discussed the songnote recognition methodology and explained the algorithm. The first step is to divide every extracted element into N parts and then calculate the average intensity value for every region; thus, for every element, we have a set of 27 values when N = 3.

Then, we apply the chi-square test, taking the note width information into account: in the proposed method, we compare two elements only if the note width is greater than three-fourths and smaller than five-fourths of the width of the observed element. After executing the algorithm described in Section 3.3.2.3, we obtain the songnote sequences. For discussion, the songnote sequence of one song unit produced by our system and the corresponding human-annotated sequence for Hikari 52 are shown below:

System (Hikari 52): AABACDDEFGHEFGHIBJKLDEFAABACDDEFGHEFGHIBJKLDEF
Correct (Hikari 52): AABLBDDEFGHEFGHICJKDDEFAABLBDDEFGHEFGHICJKDDEF

We can summarize the recognition results as follows:

Table 3.2: Results of the songnote recognition

Bird name | Accuracy rate
Hikari 52 | 86%
Hikari 49 | 85%
Kuro 0362 | 78%

Note that for Hikari 49, the result is based on the patterns extracted in the previous step; if the incorrectly extracted patterns are included, the accuracy rate becomes around 70%. Inspecting the wrong decisions made by the system for Hikari 52, we find that note B is labeled as C and note L is labeled as D. This is because the incorrectly labeled note contains considerable noise (a white part), which affects the matching process. In the case of note L being incorrectly labeled as note A for Hikari 52, careful observation of each note shows that the intensity density is the same for both notes (Fig. 3.10).

Figure 3.10: Note 1 (A, left), note 4 (L, middle), and the distribution of intensity density values (right) for Hikari 52

From Fig. 3.10 it is clearly visible that the distribution of intensity density is the same for both notes; this causes the recognition error and is a limitation of the proposed image matching algorithm. Notice that Note 1 and Note 4 are both recognized as A, but by the original human annotation, which inspects the image and listens to the song, Note 4 was labeled L. We notice similar recognition errors in the cases of Hikari 49 and Kuro 0362.

to be separated to build the database if an HMM approach is applied) for detecting and recognizing the songnotes of every bird. This is not practical for an automated system and is also very time consuming. In contrast, the proposed methodology is almost fully automated and feasible for songbirds, as it mimics the human inspection method and does not depend on the bird. The default parameter values used for detecting the songnotes work well for almost any bird, and the user can change them with a couple of clicks if necessary. For the element detection process, the accuracy rate is 100% for some birds and satisfactorily high for the others. Thus, our approach saves time and is practical as an automated system. In the recognition process, although we use a simple image pattern matching method, we obtain a high accuracy rate of more than 80%. The use of some other pattern matching method may further improve the recognition accuracy.

Chapter 4

Information-Theoretic Analysis

Songbirds have been actively studied for their complex brain mechanisms of sensorimotor integration during song learning. Male Bengalese finches learn to sing by imitating external models. In general, a birdsong, which is a string of sounds, is represented by a sequence of letters called song notes. In this chapter, we focus on information-theoretic analysis of these sequential data to explore the complexity and diversity of birdsong and the learning process throughout song development. We design and develop an analysis tool with many features for analyzing such sequential data. For the experiment, we employ thirteen male Bengalese finches, each with different bouts of song data. By applying ethological data mining to these data, we discover that the finches follow two types of song learning mechanism: practice mode and adopt mode. In addition, through the analysis we find that it is possible to visualize song features, e.g., traditional transmission, by a contour surface diagram of the transition matrix. Furthermore, we can easily identify the families from these contour surface diagrams, which is a very challenging task in general. Our results indicate that analysis based on data mining is a versatile technique for exploring new aspects of behavioral science.

4.1 Introduction

Ethology is the scientific study of animal behavior, exploring the mechanisms underlying diverse forms of behavior, from unlearned stereotyped ones to learned flexible ones. Songbirds have been actively studied as a good ethological model for their complex brain mechanism of sensorimotor integration in song learning. The Bengalese finch (Lonchura striata var. domestica) is a domesticated strain of a Southeast Asian finch, the white-rumped munia (Lonchura striata), and it has been a popular subject for neurobiological and ethological studies of birdsong because of its unique song features. Birdsongs are strings of sounds represented by sequences of letters known as song notes. The males of this species acquire their own songs by learning from external tutors (the father or other males) during a specific period, between the nestling and fledgling stages. The learned features of the songs change very little during the mature stage of life. Their songs are used for courtship display. Two types of song features are preferred by female birds: performance- and elaboration-related traits. The performance-related traits are associated with the extent of song production: song rate, song duration, song speed, song amplitude, etc. In contrast, the elaboration-related features are associated with song rules and complexity. A previous study on performance-related traits reported that female preference is positively correlated with song duration and the number of note types, and negatively correlated with the peak amplitude frequency of the song [20]. A study on elaboration-related traits reported that female birds prefer "sexy" notes that have a complex song structure [20]. In addition, Chatfield and Lemon noted the importance of information-theoretic measures for analysis in the field of animal behavior [14]. Sasahara et al. reported 2 types of development of song syntax [21]; their analysis was based on the number of edges required to represent the song syntax by an automaton. Such interesting phenomena during song development motivated us to study the learning process of Bengalese finch song. No previous research has analyzed the process by which songbirds learn to sing from an information-theoretic viewpoint using developmental song data. In the present study, we focused on elaboration-related features from

2 aspects: the first is analysis of song development, and the other is comparison between the songs of the parent and progeny to understand the learning. Our aim is to conduct information-theoretic analysis on sequential song data to explore the diversity of birdsong learning throughout song development. For this purpose, we design and develop an analysis tool. Using this tool, our experiment on Bengalese finches shows that, in practice mode, some finches sing complex songs in the early stages of development and gradually crystallize the songs by eliminating extra transitions. On the other hand, some other finches do not apply practice mode; their song production counts toward selecting, constructing, and maintaining behavioral outcomes, and hence it is called adopt mode.

4.2 Preliminaries

This section briefly introduces the theoretical foundations of birdsong and its representations, information-theoretic measures, and song model representation by k-reversible language.

4.2.1 Language and Birdsong

Humans use language to express emotions or to communicate with other humans, and language is considered unique to the human race. However, other living creatures, too, communicate vocally, for example, songbirds and dolphins, which have complex vocal communication homologous to human language. In the 1960s, the prominent linguist C. F. Hockett divided the design features of language into 13 [17]: (1) auditory-vocal channel, (2) broadcast transmission and directional reception of auditory signals, (3) rapid fading of auditory signals, (4) interchangeability in communication, (5) total feedback, (6) specialization, (7) semantics, (8) arbitrariness, (9) discreteness, (10) displacement, (11) productivity (creativity), (12) traditional transmission, and (13) duality. On comparing animal and human communication, Hockett concluded that of the 13 features, only 2, traditional transmission and duality, were not observed in any animal. Here, traditional transmission indicates that linguistic

knowledge is passed on from one generation to the next through learning, and duality indicates that particular sound elements have no intrinsic meaning but combine to form structures (e.g., words and phrases) that have meaning. However, recent ethological studies have revealed that traditional transmission is found in the vocal communication of songbirds and whales, and these organisms can combine a few discrete sound elements and so even exhibit duality in communication [6]. Therefore, these properties are not unique to human communication. The song of the Bengalese finch has a more complex structure than that of other songbirds, such as the zebra finch (Taeniopygia guttata) [2]. According to recent studies, the courtship songs of the Bengalese finch have unique features and are similar to human language [6]. Some research shows that the language model of this bird can be represented by a k-reversible automaton [10]. Thus, because of the structural and functional similarities in vocal learning between songbirds and humans, the former have been actively studied as good linguistic models. In particular, the song syntax of the Bengalese finch sheds light on the biological foundations of syntax in humans [11].

4.2.2 Bengalese Finch Song and its Representation

Recent studies on Bengalese finches have shown that the songs of the male birds are neither monotonous nor random; they consist of chunks, each of which is a fixed sequence of a few song notes. The song of each individual can be represented by a finite automaton, which is called song syntax (Fig. 4.1) [6]. Thus, the songs of Bengalese finches have double articulation, which is one of the important structures of human language (i.e., a sentence consists of words, and a word consists of phonemes). Song syntax is controlled by song control nuclei in the brain, and the hierarchy of the song control nuclei directly corresponds to the song hierarchy [5].

Figure 4.1: Courtship song syntax represented by an automaton

Birdsong analysis requires song data that have been recorded in a suitable environment. From the recorded wave data, spectrograms are obtained, and these are used as the standard representation of the song.
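As a concrete illustration of song syntax, the following minimal sketch represents an automaton as a Python dictionary and generates song strings by a random walk over it. The transition structure shown is loosely modeled on the bout examples given later in section 4.3.2 and is purely illustrative; it is not the measured syntax of any bird in this study.

```python
# Song syntax as a finite automaton: states are song notes, edges are
# the allowed note-to-note transitions. Illustrative transitions only.
import random

song_syntax = {
    "start": ["a"],
    "a": ["b"],
    "b": ["c", "e"],         # a branching point makes the song non-deterministic
    "c": ["d"],
    "d": ["b"],
    "e": ["f"],
    "f": ["g"],
    "g": ["g", "a", "end"],  # self-loop: a note such as 'g' may repeat
}

def generate_song(max_notes=50):
    """Random walk over the syntax graph, producing one song string."""
    state, notes = "start", []
    while len(notes) < max_notes:
        state = random.choice(song_syntax[state])
        if state == "end":
            break
        notes.append(state)
    return "".join(notes)

print(generate_song())  # e.g. 'abcdbefggabcdbefgg...'
```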

4.2.3 Information-Theoretic Measures

In this section, we briefly discuss the information-theoretic measures that describe the features of behavioral strings and help in understanding their diversity.

Transition matrix: In birdsong research, a transition matrix is widely used to understand syntactical complexity. A transition matrix shows note-to-note transition information. A transition probability matrix can be obtained by dividing each note-to-note count by the total number of transitions. This is the most common and important way to represent transition information, and other properties can be analyzed using this matrix.

Linearity: Scharff and Nottebohm introduced the measure of linearity for estimating the ordering complexity of notes [16]. The linearity index score is calculated from the number of note types and transition types as

$$S_{\mathrm{Linearity}} = \frac{N}{T}$$

where N is the number of different notes per song and T is the number of transition types per song. The linearity index score provides an estimate of how well the next note in a song can be predicted when the previous note is known. In a completely linear song sequence, each note has only 1 transition type; thus, a completely linear song has a linearity index score of 1. If some notes have more transition types, the score becomes less than 1. Therefore, a lower linearity index score reflects a more syntactically complex song.
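These two measures are straightforward to compute from a song string. The sketch below builds the transition probability matrix by global normalization (each pair count divided by the total number of transitions, as described above) and computes the linearity index; the function names and the example string are ours, not part of the original tool.

```python
from collections import Counter

def transition_counts(song):
    """Count note-to-note transitions in a song string, e.g. 'abcdbefgg'."""
    return Counter(zip(song, song[1:]))

def transition_matrix(song):
    """Transition probability matrix as a nested dict:
    P[x][y] = probability of the pair (x followed by y) among all transitions."""
    counts = transition_counts(song)
    total = sum(counts.values())
    matrix = {}
    for (x, y), c in counts.items():
        matrix.setdefault(x, {})[y] = c / total
    return matrix

def linearity(song):
    """Scharff-Nottebohm linearity index: note types / transition types."""
    n_note_types = len(set(song))
    n_transition_types = len(transition_counts(song))
    return n_note_types / n_transition_types

song = "abcdbefggabcdbefgg"
print(linearity(song))  # 1.0 would mean a completely linear song
```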

Entropy: Shannon introduced the concept of information entropy and discussed the details of entropy in printed English [15]. Information entropy is a statistical measure of the uncertainty associated with a random variable. It quantifies, in a certain sense, how much information is produced on average per letter of a text. Information entropy is defined as

$$H(X) = -\sum_{i=1}^{n} P(x_i)\log P(x_i)$$

where n is the number of symbol types and P(x_i) is the probability of the symbol x_i appearing in the sequence. Chatfield and Lemon devised a method to calculate higher-order entropies based on a Markov model for sequential data [13, 14], defined by

$$H_n = H(n) - H_{n-1}, \qquad n > 1$$

$$H(n) = -\sum_{\mathrm{pre},\,\mathrm{cur}} P(x_{\mathrm{pre}}, x_{\mathrm{cur}})\log P(x_{\mathrm{pre}}, x_{\mathrm{cur}})$$

where n is the order and P(x_pre, x_cur) is the probability of the symbol x_cur appearing after the symbol x_pre. However, estimating improperly high orders from limited data may fail to consider all the transitions that the true source produces, or may detect them with incorrect frequencies; this yields a deceptively low and inaccurate entropy estimate.

n-gram statistics: An n-gram is a sub-sequence of n items from a given sequence. The items can be phonemes, syllables, letters, or words, depending on the application. The size of the ordered list of elements is denoted by n. An n-gram of size n = 1 is referred to as a unigram; n = 2, as a bigram; and n = 3, as a trigram. One with n = 4 or more is simply called an n-gram. (n-1)-order Markov models are language models built from n-grams. In ethology, n-gram statistics are used to understand the frequency distribution and hierarchy of behavioral patterns.
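A compact way to compute these entropies from a song string is sketched below, assuming base-2 logarithms (the base is not stated in the text and only rescales the values). H(n) is estimated from the empirical n-gram distribution, and the higher-order entropy is taken as the difference of successive block entropies, following the definition above.

```python
import math
from collections import Counter

def ngrams(song, n):
    """All length-n sub-sequences of a song string."""
    return [song[i:i + n] for i in range(len(song) - n + 1)]

def block_entropy(song, n):
    """H(n): Shannon entropy (bits) of the empirical n-gram distribution."""
    counts = Counter(ngrams(song, n))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def higher_order_entropy(song, n):
    """H_n = H(n) - H(n-1): average uncertainty of the next note given
    the preceding (n-1) notes, in the Chatfield-Lemon style."""
    if n == 1:
        return block_entropy(song, 1)
    return block_entropy(song, n) - block_entropy(song, n - 1)

song = "abcdbefggabcdbefgg"
print(block_entropy(song, 1))         # first-order (unigram) entropy
print(higher_order_entropy(song, 2))  # uncertainty given the previous note
```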

4.3 Data Mining and Information Extraction

This section describes the application of data mining techniques to extract information from behavioral sequences.

4.3.1 Data Mining

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Various approaches can be used; one is an analysis tool that allows users to examine data along many different dimensions, categorize it, and summarize the relationships identified. In general, data mining is the process of finding correlations or patterns among relational datasets.

4.3.2 Mining Behavioral Sequential Data

In general, animal behavior is recorded as sequential data of behavioral events. The same symbol is assigned to each identical behavioral event type, and the behavioral data are thereby converted into text data, which are then used for data mining. Such sequential data of animal behavior can be analyzed with different statistical and information-theoretic measures, such as the transition matrix, the first-order Markov chain, and entropy [10]. Special tools are required for dealing with complex behavioral data, and these should be used together with conventional tools; data mining is one way to handle such data. The current data mining process has many well-established techniques for pattern extraction, clustering, modeling, etc. [19]. It can help find units in behavioral sequences and extract the rules that govern them. To obtain significant information from animal behavior, we have to carefully select elemental data mining techniques suitable for behavioral sequences, which may differ from both word sequences in natural languages and biological sequences like DNA, and then use them with proper modification in the context of ethology. In ethological studies, there are 3 steps to understanding animal behavior: first, a behavioral phenomenon is observed and recorded; second, on the basis of the recordings, a hypothesis is formulated to explain the behavior; and third, experiments are designed, performed, and evaluated to test the hypothesis [19]. If these procedures are followed correctly, better predictions can be made concerning animal behavior, which in turn provide insights into human behavior. However, developing a hypothesis on the basis of recordings is not always easy if the data are vast

or complex. Therefore, we apply data mining technology in ethology; in particular, we studied the application of ethological data mining to sequential animal behavior.

In this study, we employ the song of the Bengalese finch and analyze song data recorded during song development. To collect the song data, all birds were raised in the same environment inside a cage in the laboratory. Each bird was individually placed in a soundproof room, and its vocal output was recorded with a directional microphone and a DAT recorder. The recorded songs were analyzed with sound analysis software to generate a spectrogram, which was used to convert the sounds (WAV format) to text. For the song of the Bengalese finch, we obtained a spectrogram in which the different song notes were separated by a considerable gap. Identical patterns were labeled with the same symbol, such as a, b, etc.; thus, the song was represented in text format. This analysis was performed manually on the basis of the phonological properties of the song notes. The text data of the song were organized by bouts, the unit of time for which a songbird sings at a stretch. The following are examples of simple bout data:

(Bout 1) abcdbefggabcdbefggabcdbefggabcdbefgghijklibkmggabcdbefggabcdbefg...
(Bout 2) abcdbefggabcdbefgghijklibkmgggabcdbefggabcdbefgghijklibkmgggabcdb...

If we look carefully at the song data, we can easily find patterns in the sequence. When the bouts are delimited into segments at a, they are found to consist of 2 types of song units, gabcdbefgg and gabcdbefgghijklibkmggg, which appear repeatedly in other bouts. Thus, by some pre-processing, we converted the song data into text for further analysis; a sketch of such a segmentation is shown below.
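The following Python sketch illustrates this pre-processing step. The lookahead-based split and the choice to delimit immediately before each occurrence of a are our assumptions about the exact convention (the unit strings quoted above begin with g, so the authors' delimiting rule may differ in detail).

```python
import re
from collections import Counter

bouts = [
    "abcdbefggabcdbefggabcdbefggabcdbefgghijklibkmggabcdbefgg",
    "abcdbefggabcdbefgghijklibkmgggabcdbefggabcdbefgghijklibkmggg",
]

def song_units(bout, delimiter="a"):
    """Split a bout into candidate song units at each occurrence of the
    delimiter note, keeping the delimiter with the following unit."""
    # '(?=a)' is a zero-width lookahead, so the 'a' itself is not consumed.
    return [u for u in re.split(f"(?={delimiter})", bout) if u]

unit_counts = Counter(u for b in bouts for u in song_units(b))
for unit, count in unit_counts.most_common():
    print(f"{count:3d}  {unit}")
```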

For data mining from behavioral sequences, we developed a tool called EUREKA, which stands for ethoinformatical utilities for rule extraction and knowledge acquisition [23]. EUREKA is a utility suite used for the following analyses:

- Information-theoretic analysis
- Extraction of probabilistic behavioral rules (n-gram model)
- Extraction of deterministic behavioral rules (deterministic finite automaton)

We designed the features of the StringStat module of the EUREKA tool and developed the functionality of the different information-theoretic measures to meet the requirements of detailed analysis. The StringStat module deals with the analysis of information-theoretic measures and enables the following translations in the context of ethology:

- Linearity: analysis of behavioral diversity.
- Entropy: analysis of behavioral uncertainty.
- Transition Matrix: representation of the transition probabilities of behavioral event types.
- n-gram Statistics: analysis of different n values in terms of the frequency distribution and hierarchy of behavioral patterns.

4.4 Data Mining from Birdsong

In this section, we present our results from 2 aspects: the first is the analysis of song development, and the other is a comparison of song properties between parent and progeny. We use some of the measures provided by StringStat in our analysis.

4.4.1 Description of Data

In the current study, we examined 9 juvenile birds to study song evolution and 4 parent-progeny pairs to compare the songs of parent and progeny. It has been reported that young male birds learn songs from their fathers in the first 120-130 days after hatching [12]. The principal learning period is considered to be the age of 60-130 days; the sounds made during the first 60 days after hatching are distorted and are not considered songs. Table 4.1 shows the days on which the data were recorded for each individual used for analysis in the present study. Song data were recorded at intervals that were not fixed across birds; for the convenience of analysis, we categorized the recordings into 5 different stages. Table 4.2 shows the parent-progeny relationships for the birds:

Table 4.1: Young birds and their age (in days) at each recording stage

  Bird name    1st    2nd    3rd    4th    5th
  Lao           61     88     95    102    116
  RAo           61     76     87    102    127
  LDai          73     80     94    108    129
  Shiro         64     78     92    100    120
  RMo           70     77     91     99    125
  RKi           70     79     93    106    127
  LShiro        73     85     93    100    121
  Atama         62     69     89     99    124
  LMizuiro      74     81     94    108    130

Table 4.2: Relationship between parent and progeny

  Parent             Progeny
  Sankakukoshitya    Lao
  Bakatono           RAo
  Katsuo             LDai
  Kuroshiro          Shiro

4.4.2 Common Parent-Progeny Findings

We present our results regarding features that were common to the songs of the parents and progeny. For this analysis, we considered only the transition matrix and transition types. We also present a new technique for visual representation of the song on the basis of the transition matrix. The birds have 7-14 different note types. Our findings show that every bird has 1 or 2 dominant notes, while the other notes are used to generate variations in the song.

Fig. 4.2 shows the graph of rank vs. frequency for the different pattern types in the song sung by the bird Shiro at age 120 days. The X-axis corresponds to the rank of the n-gram pattern types, where n runs from 1 to 10, and the Y-axis corresponds to frequency. The complete song consists of 10 bouts comprising 816 song notes. The song note types are labeled a, b, c, d, e, f, g, and h. Of these song notes, d appeared 317 times and f appeared 228 times; a appeared 123 times, and the other notes appeared fewer than 100 times. Increasing the value of n, we found that dd appeared in the song sequence 264 times; ff, 159 times; ddd, 211 times; fff, 102 times; and dddd, 158 times. From this simple analysis, we conclude that the notes d and f are the dominant song notes for Shiro.

Figure 4.2: Rank vs. frequency of n-gram types
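The rank-frequency analysis behind Fig. 4.2 can be reproduced with a few lines of Python; the song string below is a toy stand-in, not Shiro's actual data.

```python
from collections import Counter

def ngram_rank_frequency(song, max_n=10):
    """Pool all n-gram types for n = 1..max_n and rank them by frequency,
    as in the rank-vs-frequency plot of Fig. 4.2."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(song[i:i + n] for i in range(len(song) - n + 1))
    return counts.most_common()

song = "ddfddfaddd" * 10  # toy stand-in for the 816-note song
for rank, (pattern, freq) in enumerate(ngram_rank_frequency(song)[:5], 1):
    print(rank, pattern, freq)
```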

4.4.3 Analysis on Evolution of Song

Here, we present the results of the evolution analysis, for which we considered only the young birds' song data. Although the StringStat module provides several information-theoretic measures, we focused on linearity, entropy, and n-gram statistics, in particular transition types. In the early stage, most birds have a relatively large number of transition types in their songs; eventually, the noise transitions are reduced, and the birds produce relatively small patterns. However, some birds produce songs with a relatively small pattern from an early age, and the pattern does not change later during song development. We found the song development of Shiro and LShiro to show such a trend (Fig. 4.3).

Figure 4.3: Change in number of transition types during song development

The linearity index score indicates the complexity of the song. Our investigation of this measure showed that all the birds in this study sang complex songs in the beginning, although after the development period the crystallized songs were syntactically simpler. Again, the experimental results showed that Shiro and LShiro sang less complex songs from the beginning, and the linearity index scores of these birds changed only slightly during song development (Fig. 4.4). By definition, the linearity index score and the entropy value are closely linked, because both measures reflect the syntactical complexity of note-to-note transitions. Fig. 4.5 shows the correlation between the linearity and entropy values for all 9 progeny over their song development period (the days listed in Table 4.1); in our experiment, we find a linear correlation between these 2 variables during song learning. Based on the above three analyses, our investigation indicates that we can divide the birds into two groups. One group, comprising the majority of the birds, has a high transition rate and a low linearity index score in the early stages of development while producing the song. Those complex songs are gradually crystallized by the elimination of extra transitions; we call this learning process practice mode.

Figure 4.4: Change in the linearity index score during song development

In the other group, comprising the minority of the birds, the rates of change of the number of transition types and of the linearity index score are very low; during song production, all activity counts toward selecting, constructing, and maintaining behavioral outcomes. This implies that fundamental, self-generated activity plays an important role in the development of behavioral functions, both perception- and cognition-related. We call this learning process adopt mode.

4.4.4 Parent-Progeny Comparison of Songs

This section presents the results of a comparison between the songs of the parents and progeny; for this analysis, the song data of both were considered. We focused on different information-theoretic measures such as linearity, entropy, and n-gram statistics, in particular transition types. Fig. 4.6 shows the comparison of information-theoretic measures between parent and progeny. The song data of the 4 parents and the matured-period song data of the 4 progeny (i.e., stage 5 in Table 4.1) were employed here.

Figure 4.5: Relationship between linearity and entropy during song development

The comparison indicates that during practice-mode learning, the different measures converge to the values of the parent's song. Except for the pair Kuroshiro-Shiro, all pairs showed almost similar values of all measures between parent and progeny. The learning process of Shiro was found to be adopt mode: we can see from Fig. 4.6 that the transition types and linearity index score of this bird were very different from those of his father; in fact, Shiro's song was more complex than his father's.

4.4.5 Visualization of Song Features

In this section, we present a new technique for visual representation of the song on the basis of the transition matrix. To illustrate the usefulness of this visualization technique, we also show contour surface diagrams of different bird families (Fig. 4.8). A contour surface diagram can be used for visual representation of song properties. In general, surface charts are useful 3D chart types; they have 3 true data dimensions and can illustrate data reasonably well. Such charts show how a variable (Z) changes according to 2 other variables (X and Y). A contour graph is a kind of surface chart containing regions colored according to the Z value; essentially, it is a 2D top view of a 3D surface chart.

Figure 4.6: Comparison of different measures between parent and progeny

In this paper, to visualize the bigram properties based on the transition matrix, we use the contour graph in an unconventional fashion: the symbols (such as a, b, and c) are arranged on the X and Y axes in alphabetical order to generate the graph. We found the contour graph to be a good technique for visual representation of the bigrams (a sketch of this construction is given after this paragraph). Fig. 4.7 shows a contour graph for the birdsong produced by the bird RAo at age 127 days. Fig. 4.8 shows the contour diagrams of the transition matrices for the songs of the parents and progeny; for the young birds, contour diagrams are shown for songs at both the early and matured ages. From Fig. 4.8, we can easily visualize 2 properties of the songs of the Bengalese finch: (1) contour diagrams can visually display the unique song features of a particular bird family, so we can easily distinguish different bird families from the diagrams; and (2) although there are differences in song properties between the early and matured stages, the major features are already present at the early stage, and as the noise transitions are reduced, the songs of the young birds eventually converge to their fathers' songs. That is clear evidence of the traditional transmission feature described by Hockett.
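A minimal sketch of this construction in Python with Matplotlib is given below; the song string and the styling choices (colormap, number of contour levels) are illustrative assumptions, not those used for the figures in this chapter.

```python
# Contour-style visualization of a transition matrix: note symbols in
# alphabetical order on both axes, color = transition probability.
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter

song = "abcdbefggabcdbefgghijklibkmggg" * 5  # illustrative song string
notes = sorted(set(song))
index = {s: i for i, s in enumerate(notes)}

counts = Counter(zip(song, song[1:]))
total = sum(counts.values())
Z = np.zeros((len(notes), len(notes)))
for (x, y), c in counts.items():
    Z[index[x], index[y]] = c / total  # row = preceding note, col = following

plt.contourf(Z, levels=10, cmap="viridis")  # filled contour: 2D top view
plt.xticks(range(len(notes)), notes)
plt.yticks(range(len(notes)), notes)
plt.xlabel("following note")
plt.ylabel("preceding note")
plt.colorbar(label="transition probability")
plt.show()
```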

Figure 4.7: Contour surface diagram of bird RAo

There are 3 main purposes in proposing this visual representation: (1) since the transition matrix shows only numbers, it is difficult to understand the song patterns from the matrix itself, but if we represent the song as a contour surface diagram, we can easily visualize its transition properties; (2) the corresponding symbols of father and progeny are directly related, because similar patterns of father and progeny are assigned the same symbol, so even if one pattern is relabeled with a different symbol, the resulting contour diagram may differ but will remain similar within the particular family; and (3) identifying families based on the properties of their songs would be a very difficult task for an algorithm based on some clustering technique, but from the contour diagrams we can easily identify the families, which makes the task simple.

4.5 Summary

This chapter reported on information-theoretic analysis of the sequential song data of the Bengalese finch to explore the learning process throughout song development. We showed the effective use of data mining for birdsong research and, in general, for research on animal behavior. By applying ethological data mining to the birdsong data, we discovered that the finches follow two types of song learning mechanism, practice mode and adopt mode, which is a new finding related to the learning mechanisms of birdsong. In practice mode, some finches sing complex songs in the early stages of development and gradually crystallize the songs by eliminating extra transitions.

Figure 4.8: Contour diagrams of different families

On the other hand, some other finches do not apply practice mode; their song production counts toward selecting, constructing, and maintaining behavioral outcomes, and hence it is called adopt mode. Thus, such analysis provides scope for closer examination of a large amount of data, from which useful information can be extracted. In addition, we presented a new technique to visualize the features of behavioral sequences: through the analysis, we found that it is possible to visualize song features, e.g., traditional transmission, by a contour surface diagram of the transition matrix. Our results indicate that analysis based on data mining is a versatile technique for exploring new aspects of behavioral science. By applying the findings of the present study, we will be able to analyze animal behavior more precisely. Besides this, we could obtain a better

understanding of the features of dominant song notes by comparing their sound properties and their effect on female preference. Thus, the findings of the current study shed light on new aspects of future research related to behavioral science.

Chapter 5

Conclusion

Birdsong analysis is a popular topic in ethological studies for understanding animal behavior. This research takes an information-science view of the problem: it automates the analysis of large amounts of song data and applies data mining to discover new knowledge.

5.1 Automation in analyzing the song

This research mainly focused on the song sequence extraction process, which is usually done manually by humans. There were some previous studies on this topic, but those approaches were based on sound processing. The present study proposes a new approach to the automatic recognition of song elements and their sequences by applying image processing, and the approach yields good results. There is still room to improve the accuracy rate of both the extraction and the recognition methods. From the obtained results, we find that the element extraction process is very important and has a significant effect on the recognition process. The major advantage of the proposed approach is its simplicity and feasibility: it follows a generalized process (independent of the bird), just as humans do. Further, its accuracy rate is better than that of other methods such as sound

processing, which was previously carried out at our laboratory. However, sound processing requires considerable human effort in fixing the parameter values for detecting and recognizing the songnotes of every bird; this is not practical for an automated system and is also very time consuming. In contrast, the proposed methodology is almost fully automated and feasible for songbirds, as it mimics the human inspection method and does not depend on the bird. The default parameter values used for detecting the songnotes work well for almost any bird and can be changed with a couple of clicks by the user if necessary. There remains room to improve the accuracy rate of both songnote extraction and recognition. For the element detection process, the accuracy rate is 100% for some birds and satisfactorily high for the others. Thus, our approach saves time and is practical as an automated system. In the recognition process, although we use a simple image pattern matching method, we obtain a high accuracy rate of more than 80%. The use of some other pattern matching method may further improve the recognition accuracy.

5.2 Mining on behavioral sequences

This research also reported on information-theoretic analysis of the sequential song data of the Bengalese finch to explore the learning process throughout song development, and it showed the effective use of data mining for birdsong research and, in general, for research on animal behavior. By applying ethological data mining to the birdsong data, we discovered that the finches follow two types of song learning mechanism, practice mode and adopt mode, which is a new finding related to the learning mechanisms of birdsong. In practice mode, some finches sing complex songs in the early stages of development and gradually crystallize the songs by eliminating extra transitions. On the other hand, some other finches do not apply practice mode; their song production counts toward selecting, constructing, and maintaining behavioral outcomes, and hence it is called adopt mode. Thus, such analysis provides scope for closer examination of a large amount of data, from which useful information can be extracted. In addition, we presented a new technique to visualize the features of behavioral sequences.

Through the analysis, we found that it is possible to visualize song features, e.g., traditional transmission, by a contour surface diagram of the transition matrix.

5.3 Summary

We showed that applying data mining and information-theoretic analysis to behavioral sequences can identify new aspects related to behavioral science, which offers good scope for further research. The findings of this research reveal interesting phenomena during the learning process; understanding their causes is likewise a good subject for further research. Our results indicate that analysis based on data mining is a versatile technique for exploring new aspects of behavioral science. By applying the findings of the present study, we will be able to analyze animal behavior more precisely. For example, our analysis shows that Kuroshiro, the father of Shiro, sings the least complex songs among the birds studied. We may hypothesize that if the father's song is simple, the son develops a more complex song, and that this may be the reason for the adopt-mode learning process; further study is necessary to establish this hypothesis. Besides this, we could obtain a better understanding of the features of dominant song notes by comparing their sound properties and their effect on female preference. Thus, the findings of the current study shed light on new aspects of future research related to behavioral science.

Appendix A

χ² distribution table

Note: Significant values χ²(α) of the Chi-square distribution for a given probability α. For degrees of freedom (ν) greater than 30, the quantity $\sqrt{2\chi^2} - \sqrt{2\nu - 1}$ may be treated as a normal variate with unit variance.
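As a quick check of this large-ν approximation, the following sketch compares the critical value it implies with the exact Chi-square quantile; SciPy is used only for the reference quantiles, and the chosen ν and α are arbitrary examples.

```python
# For nu > 30, sqrt(2*chi2) - sqrt(2*nu - 1) ~ N(0, 1), so an upper
# critical value can be recovered as chi2 ~ (z_alpha + sqrt(2*nu - 1))**2 / 2.
import math
from scipy.stats import norm, chi2

nu, alpha = 40, 0.05
z = norm.ppf(1 - alpha)                        # upper-tail standard normal quantile
approx = (z + math.sqrt(2 * nu - 1)) ** 2 / 2  # normal approximation
exact = chi2.ppf(1 - alpha, df=nu)             # exact quantile for comparison
print(approx, exact)  # ~55.5 vs ~55.8: close for nu > 30
```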

References

[1] C. K. Catchpole and P. J. B. Slater: Bird Song: Biological Themes and Variations, Cambridge University Press, 2nd edition (2003).

[2] E. Honda and K. Okanoya: Acoustical and Syntactical Comparisons between Songs of the White-backed Munia (Lonchura striata) and Its Domesticated Strain, the Bengalese Finch (Lonchura striata var. domestica), Zoological Science, Vol.16, pp. 319-326 (1999).

[3] J. C. Russ: The Image Processing Handbook, CRC Press, 5th edition (2006).

[4] A. J. Doupe and P. K. Kuhl: Birdsong and Human Speech: Common Themes and Mechanisms, Annual Review of Neuroscience, Vol.22, pp. 567-631 (1999).

[5] J. Nishikawa and K. Okanoya: Dynamical Neural Representation of Song Syntax in Bengalese Finch: a Model Study, Ornithological Science, pp. 95-103 (2006).

[6] K. Okanoya: Song Syntax in Bengalese Finches: Proximate and Ultimate Analyses, Advances in the Study of Behavior, Vol.34, pp. 297-346 (2004).

[7] National Institutes of Health, USA: ImageJ 1.41, URL: http://rsbweb.nih.gov/ij/; last accessed: July 31, 2009.

[8] T. Ojala, M. Pietikainen and D. Harwood: A Comparative Study of Texture Measures with Classification Based on Feature Distributions, Pattern Recognition, Vol.29, pp. 51-59 (1996).

References [9] Sheldon M. Ross: Introduction to Probability and Statistics for Engineers and Scientists, Elsevier Academic Press, 3rd edition (2004). [10] Y. Kakishita, K. Sasahara, T. Nishino, M. Takahasi and K. Okanoya: Ethological Data Mining: an Automata-Based Approach to Extract Behavioral Units and Rules, Data Mining and Knowledge Discovery, Vol.18, No.3, pp. 446 471 (2009). [11] J. Doupe and P. K. Kuhl. Birdsong and Human Speech: Common Themes and Mechanisms. Annual Reviews Neuroscience 22:567 631, 1999. [12] C. K. Catchpole and P. J. B. Slater. Bird Song: Biological Themes and Variations. Cambridge University Press; 2nd edition, 2003. [13] R. Suzuki, J. R. Buck, and P. L. Tyack. Information Entropy of Humpback Whale Songs. Journal of the Acoustical Society of America 119:1849 1866, 2006. [14] C. Chatfield and R. E. Lemon. Analyzing Sequences of Behavioral Events. Journal of Theoretical Biology 29:427 445, 1970. [15] C. E. Shannon. Prediction and Entropy of Printed English. Bell System Technical Journal 3:50 64, 1950. [16] C. Scharff and F. Nottebohm. A Comparative Study of the Behavioral Deficits following Lesions of Various Parts of the Zebra Finch Song System: Implications for Vocal Learning. Journal of Neuroscience 11(9):2898 2913, September 1991. [17] C. F. Hocket. The Origin of Speech. Scientific American 203:88, 1960. [18] E. Thelen and L. B. Smith. A Dynamic Systems Approach to the Development of Cognition and Action. The MIT Press, 1994. [19] J. Han, H. Cheng, D. Xin, and X. Yan. Frequent Pattern Mining: Current Status and Future Directions. Data Mining and Knowledge Discovery 15(1):55 86, 2007. 63

References [20] M. Soma, M. Takahasi, T. Hasegawa and K. Okanoya. Trade-offs and correlations among multiple song features in Bengalese Finch. Ornithological Science 5(1):77 84, 2006. [21] K. Sasahara, Y. Kakishita, T. Nishino, M. Takahasi and K. Okanoya. Constructing Song Syntax by Automata Induction. ICGI 2006, September 2006. [22] E. Thelen and L. B. Smith. A Dynamic Systems Approach to the Development of Cognition and Action. The MIT Press, 1994. [23] TNLAB, UEC & OKANOYA LAB, RIKEN Japan. EUREKA, URL: http://sites.google.com/site/eurekawiki; last accessed: November 1st, 2010. 64

List of Publications

Refereed journal papers

1. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: A Feasible Approach for Automatic Detection and Recognition of the Bengalese Finch Songnotes and Their Sequences, Journal of Intelligent Learning Systems and Applications, Vol.2, pp. 221-228, 2010.

2. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: Information Theoretic Analysis for Understanding the Behavior of Song Learning by the Bengalese Finch, The Information Processing Society of Japan (IPSJ) Transactions on Mathematical Modeling and its Applications (TOM), Vol.4, pp. 183-192, July 2011.

3. M. M. S. Khan, S. T. Suzuki and H. Oku: Effects of Autonomy in an English Language Learning Class, The University of Electro-Communications Bulletin, Vol.22, No.1, pp. 33-40, February 2010.

Conference papers

1. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: Information Theoretic Analysis for Understanding the Behavior of Song Learning by the Bengalese Finch, Workshop on Mathematical Modeling and Problem Solving, Vol.2010-MPS-81, No.8, Kyushu University, December 2010.

2. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: Information-Theoretic Analysis on Evolution in Learning Birdsong by Bengalese Finch, Neuroscience 2010, poster 606.9, San Diego, USA, November 2010.

3. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: Automatic Detection and Recognition of the Behavioral Sequences of Bengalese Finch Song by Using Image Processing, Workshop on Mathematical Modeling and Problem Solving, Vol.2009-MPS-75, No.8, Hokkaido University, September 2009.

4. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: Automation in Extracting the Songnote and Its Sequence of the Bengalese Finch Song by Using Image Processing, Triangle Symposium on Advanced ICT 2009 (TriSAI 2009), Tokyo, Japan, October 2009.

5. M. M. S. Khan, T. Nishino, K. Sasahara, M. Takahasi and K. Okanoya: Information Theoretic Analysis on Behavioral Sequences - A Case Study between Birdsong and Human's Song, ICT Triangle Forum 2008 (AICT 2008), Daejeon, Korea, October 2008.

6. S. Okubo, T. Honda, H. Manabe, T. Iizuka, M. M. S. Khan, H. Tokida, T. Gima, T. Suzuki, A. Tanaka, K. Matsuno, M. Wakatsuki and T. Nishino: A Technical Report on the Third UEC Computer DAIHINMIN Tournament (UECda-2008), Workshop on Game Informatics, 2009-GI-21(3), Osaka University of Commerce, March 2009. (in Japanese)

7. S. Okubo, T. Honda, H. Manabe, T. Aoki, Y. Kakishita, N. Komatsubara, T. Iizuka, H. Tokida, M. M. S. Khan and T. Nishino: A Technical Report on the Second UEC Computer DAIHINMIN Tournament (UECda-2007), Workshop on Game Informatics, 2008-GI-19(4), Tokyo University of Technology, March 2008. (in Japanese)