Representing and Learning Opaque Maps with Strictly Local Functions



Similar documents
ACKNOWLEDGEMENTS. At the completion of this study there are many people that I need to thank. Foremost of

The puzzle-puddle-pickle problem and the Duke-of-York gambit in acquisition 1

Lecture 1: OT An Introduction

Regular Languages and Finite Automata

Stricture and Nasal Place Assimilation. Jaye Padgett

Web Data Extraction: 1 o Semestre 2007/2008

Business Process Modeling

University of Ostrava. Reasoning in Description Logic with Semantic Tableau Binary Trees

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski

Informatique Fondamentale IMA S8

Honors Class (Foundations of) Informatics. Tom Verhoeff. Department of Mathematics & Computer Science Software Engineering & Technology

Testing LTL Formula Translation into Büchi Automata

On line construction of suffix trees 1

CS 1133, LAB 2: FUNCTIONS AND TESTING

Software Modeling and Verification

Today s Agenda. Automata and Logic. Quiz 4 Temporal Logic. Introduction Buchi Automata Linear Time Logic Summary

Lecture 2: Universality

A first step towards modeling semistructured data in hybrid multimodal logic

CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing

Graph Mining and Social Network Analysis

Fabio Patrizi DIS Sapienza - University of Rome

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm

A PPM-like, tag-based branch predictor

There is no degree invariant half-jump

Scheduling Shop Scheduling. Tim Nieberg

Remarks on Markedness

Author's Name: Stuart Davis Article Contract Number: 17106A/0180 Article Serial Number: Article Title: Loanwords, Phonological Treatment of

JMulTi/JStatCom - A Data Analysis Toolkit for End-users and Developers

Bindings, mobility of bindings, and the -quantifier

Cassandra. References:

D1.1 Service Discovery system: Load balancing mechanisms

Moving Least Squares Approximation

Regular Expressions and Automata using Haskell

CHAPTER 7 GENERAL PROOF SYSTEMS

Using Patterns and Composite Propositions to Automate the Generation of Complex LTL

ON FUNCTIONAL SYMBOL-FREE LOGIC PROGRAMS

The Halting Problem is Undecidable

Chapter II. Controlling Cars on a Bridge

LZ77. Example 2.10: Let T = badadadabaab and assume d max and l max are large. phrase b a d adadab aa b

Compiler Construction

D6 INFORMATION SYSTEMS DEVELOPMENT. SOLUTIONS & MARKING SCHEME. June 2013

HTML Web Page That Shows Its Own Source Code

Learning Automata and Grammars

Calculation of Minimum Distances. Minimum Distance to Means. Σi i = 1

Analysis of Compression Algorithms for Program Data

Study of algorithms for factoring integers and computing discrete logarithms

Automata on Infinite Words and Trees

Temporal Logics. Computation Tree Logic

Logic in general. Inference rules and theorem proving

Computer Networks and Internets, 5e Chapter 6 Information Sources and Signals. Introduction

How To Compare A Markov Algorithm To A Turing Machine

Data quality in Accounting Information Systems

1 Approximating Set Cover

Simulation-Based Security with Inexhaustible Interactive Turing Machines

. Learn the number of classes and the structure of each class using similarity between unlabeled training patterns

2. Basic Relational Data Model

Performance Level Descriptors Grade 6 Mathematics

Relational model. Relational model - practice. Relational Database Definitions 9/27/11. Relational model. Relational Database: Terminology

Linear Programming Notes V Problem Transformations

6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, Class 4 Nancy Lynch

Seminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina

Linear Coding of non-linear Hierarchies. Revitalization of an Ancient Classification Method

Fundamentele Informatica II

The Best Binary Split Algorithm: A deterministic method for dividing vowel inventories into contrastive distinctive features

Lecture 1. Basic Concepts of Set Theory, Functions and Relations

Architectural Analysis of Microsoft Dynamics NAV

Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

EFFICIENT KNOWLEDGE BASE MANAGEMENT IN DCSP

Compiler I: Syntax Analysis Human Thought

Software Verification: Infinite-State Model Checking and Static Program

CS510 Software Engineering

On the Modeling and Verification of Security-Aware and Process-Aware Information Systems

Introduction to Support Vector Machines. Colin Campbell, Bristol University

Data Warehousing. Jens Teubner, TU Dortmund Winter 2014/15. Jens Teubner Data Warehousing Winter 2014/15 1

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY

10CS35: Data Structures Using C

Modeling Guidelines Manual

MATHEMATICS: CONCEPTS, AND FOUNDATIONS Vol. III - Logic and Computer Science - Phokion G. Kolaitis

Automata and Computability. Solutions to Exercises

How To Use Neural Networks In Data Mining

Gerry Hobbs, Department of Statistics, West Virginia University

Bell & LaPadula Model Security Policy Bell & LaPadula Model Types of Access Permission Matrix

24 Uses of Turing Machines

Reading 13 : Finite State Automata and Regular Expressions

Nonlinear Optimization: Algorithms 3: Interior-point methods

Cell Phone based Activity Detection using Markov Logic Network

In this section, we will consider techniques for solving problems of this type.

Transcription:

Representing and Learning Opaque Maps with Strictly Local Functions Jane Chandlee Jeffrey Heinz Adam Jardine CNRS, Paris, France April 18, 2015 GLOW 38: Workshop on Computational Phonology 1

Representing and Learning Opaque Maps 1. Chandlee 2014 defines and studies input strictly local functions which are a class of string-to-string maps (see also Chandlee and Heinz, in revision). 2. Here we show that many opaque phonological patterns can be represented and modeled with these functions without any additional modifications. 3. This matters because input strictly local functions are efficiently learnable from positive evidence (Chandlee 2014, Chandlee et al. 2014, and Jardine et al. 2014). 4. Other reasons why it matters and the implications for phonological theory more generally will also be discussed. 2

Part I Studying the nature of phonological maps 3

Transformations in Generative Phonology Theories of generative phonology concern transformations, or maps from abstract underlying representations to surface phonetic representations. A truism about maps: Different grammars may generate the same map. Such grammars are extensionally equivalent. Grammars are finite, intensional descriptions and maps are their (possibly infinite) extensions Maps may have properties largely independent of their grammars... output-driven maps (Tesar 2014) regular maps (Elgot and Mezei 1956, Scott and Rabin 1959) subsequential maps (Oncina et al. 1993, Mohri 1997, Heinz and Lai 2013)... 4

Input Strict Locality: Main Idea These maps are Markovian in nature. x 0 x 1... x n where u 0 u 1... u n 1. Each x i is a single symbol (x i Σ 1 ) 2. Each u i is a string (u i Σ 2) 3. There exists a k N such that for all input symbols x i its output string u i depends only on x i and the k 1 elements immediately preceding x i. (so u i is a function of x i k+1 x i k+2...x i ) 5

Input Strict Locality: Main Idea in a Picture x a... a b a b a b a b a... b a... a b a b a b a b a... b u Figure 1: For every Input Strictly 2-Local function, the output string u of each input element x depends only on x and the input element previous to x. In other words, the contents of the lightly shaded cell only depends on the contents of the darkly shaded cells. 6

Example: Nasal Place Assimilation is ISL with k = 2 /inpäfekt / [impäfekt] input: I n p Ä f E k t output: I λ mp Ä f E k t 7

Example: Nasal Place Assimilation is ISL with k = 2 /inpäfekt / [impäfekt] input: I n p Ä f E k t output: I λ mp Ä f E k t 8

Example: Nasal Place Assimilation is ISL with k = 2 /inpäfekt / [impäfekt] input: I n p Ä f E k t output: I λ mp Ä f E k t 9

Example: Nasal Place Assimilation is ISL with k = 2 /inpäfekt / [impäfekt] input: I n p Ä f E k t output: I λ mp Ä f E k t 10

What can be modeled with ISL functions? 1. A significant range of individual phonological processes such as local substitution, deletion, epenthesis, and synchronic metathesis 2. Approximately 95% of the individual processes in P-Base (v.1.95, Mielke (2008)) 3. Opaque maps! (This talk, in a moment) (Chandlee 2014, Chandlee and Heinz, in revision) 11

What cannot be modeled with ISL functions 1. progressive and regressive spreading 2. long-distance (unbounded) consonant and vowel harmony (Chandlee 2014, Chandlee and Heinz, in revision) 12

What about spreading and long-distance phonology? ISL functions naturally generalize Strictly Local (SL) stringsets in Formal Language Theory. SL stringsets model local phonotactics and ISL functions model phonological maps with local triggers. Output SL functions are another generalization of SL stringsets which precisely models spreading (Chandlee et al. under review). Stringly-Piecewise (SP) and Tier-based Strictly Local (TSL) stringsets model long-distance phonotactics (Heinz 2010, Heinz et al. 2011). We expect functional characterizations of SP and TSL stringsets will model long-distance maps (work-in-progress). 13

Automata characterization of k-isl functions Theorem Every k-isl function can be modeled by a k-isl transducer and every k-isl transducer represents a k-isl function. The state space and transitions of these transducers are organized such that two input strings with the same k 1 suffix always lead to the same state. (Chandlee 2014, Chandlee et. al 2014) 14

Example: Fragment of k-isl transducer for NPA /inpa/ [impa] g:g g:λ V:V g:g n:λ g:ng V:V n:λ λ V:λ n:n V:V p:p p:mp n:λ p:p p:λ Not all transitions and states are shown, and vowel states and transitions are collapsed. The nodes are labeled name:output string. 15

Part II Studying Opacity in Phonology 16

Defining Opaque maps Opaque maps are defined as the extensions of particular rule-based grammars (Kiparsky 1971, McCarthy 2007). Baković (2007) provides a typology of opaque maps. Counterbleeding Counterfeeding on environment Counterfeeding on focus Self-destructive feeding Non-gratuitous feeding Cross-derivational feeding Subsequent examples are drawn from this paper. 17

Counterbleeding in Yokuts might fan /Pili:+l/ [+long] [-high] Pile:l V [-long] / C # Pilel [Pilel] 18

Fragment of a 3-ISL transducer for Yokuts l ī l l λ λ el 19

Counterfeeding-on-environment in Bedouin Arabic Bedouin Arabic This is 4-ISL. a i / σ Bedouin /badw/ G V / C # badu [badu] 20

The other examples Counterfeeding on focus in Bedouin Arabic (3-ISL) Self-destructive feeding in Turkish (5-ISL) Non-gratuitous feeding in Classical Arabic (5-ISL) Cross-derivational feeding in Lithuanian (4-ISL) 21

What is k? If a function described by a rewrite rule A B / C ISL then k will be the length of the longest string in the structural description CAD. D is For opaque maps describable as the composition of two rewrite rules, it is approximately the length of the longest string in the structural description of either rule (it may be a little longer). A k-value of 5 appears sufficient for the examples in Baković s paper. 22

Part III Learning ISL functions 23

ISLFLA: Input Strictly Local Function Learning Algorithm The input to the algorithm is k and a finite set of (u,s) pairs. ISLFLA builds a input prefix tree transducer and merges states that share the same k 1 prefix. Provided the sample data is of sufficient quality, ISLFLA provably learns any function k-isl function in quadratic time. Sufficient data samples are quadratic in the size of the target function. (Chandlee et al. 2014, TACL) 24

SOSFIA: Structured Onward Subsequential Function Inference Algorithm SOSFIA takes advantage of the fact that every k-isl function can be represented by an onward transducer with the same structure (states and transitions). Thus the input to the algorithm is k-isl transducer with empty output transitions, and a finite set of (u,s) pairs. SOSFIA calculates the outputs of each transition by examining the longest common prefixes of the outputs of prefixes of the input strings in the sample (onwardness). Provided the sample data is of sufficient quality, SOSFIA provably learns any function k-isl function in linear time. Sufficient data samples are linear in the size of the target function. 25 (Jardine et al. 2014, ICGI)

Part IV Implications for a theory of phonology 26

Some reasons why Classic OT has been influential Offers a theory of typology. Comes with learnability results. Solves the duplication/conspiracy problems. 27

What have we shown? 1. Many attested phonological maps, including many opaque ones, are k-isl for a small k. 2. k-isl functions make strong typological predictions. (a) No non-regular map is k-isl. (b) Many regular maps are not k-isl. So they are subregular. 3. k-isl functions are efficiently learnable. 28

How does this relate to traditional phonological grammatical concepts? 1. Like OT, k-isl functions do not make use of intermediate representations. 2. Like OT, k-isl functions separate marked structures from their repairs (Chandlee et al. to appear, AMP 2014). k-isl functions are sensitive to all and only those markedness constraints which could be expressed as *x 1 x 2...x k, (x i Σ). In this way, k-isl functions model the homogeneity of target, heterogeneity of process (McCarthy 2002) 29

Where is the Optimality? Paul Smolensky asked this question to Jane Chandlee at the 2013 AMP at UMass where ISL functions were first introduced. The answer is Over there. Perhaps the question ought to be How does Optimality Theory account for the typological generalization that so many phonological maps are ISL? 30

Undergeneration in Classic OT It is well-known that classic OT cannot generate opaque maps (Idsardi 1998, 2000, McCarthy 2007, Buccola 2013) (though Baković 2007, 2011 argues for a more nuanced view). Many, many adjustments to classic OT have been proposed. constraint conjunction (Smolensky), sympathy theory (McCarthy), turbidity theory (Goldrick), output-to-output representations (Benua), stratal OT (Kiparsky, Bermudez-Otero), candidate chains (McCarthy), harmonic serialism (McCarthy), targeted constraints (Wilson), contrast preservation ( Lubowicz) comparative markedness (McCarthy) serial markedness reduction (Jarosz),... See McCarthy 2007, Hidden Generalizations for review, meta-analysis, and more references to these earlier attempts. 31

Adjustments to Classic OT - constraint conjunction (Smolensky), sympathy theory (McCarthy), turbidity theory (Goldrick), output-to-output representations (Benua), stratal OT (Kiparsky, Bermudez-Otero), candidate chains (McCarthy), harmonic serialism (McCarthy), targeted constraints (Wilson), contrast preservation ( Lubowicz) comparative markedness (McCarthy) serial markedness reduction (Jarosz),... These approaches invoke different representational schemes, constraint types and/or architectural changes to classic OT. The typological and learnability ramifications of these changes is not yet well-understood in many cases. On the other hand, no special modifications are needed to establish the ISL nature of the opaque maps we have studied. 32

Overgeneration in Classic OT It is not controversial that classic OT generates non-regular maps with simple constraints (Frank and Satta 1998, Riggle 2004, Gerdemann and Hulden 2012, Heinz and Lai 2013) 33

OT s greatest strength is its greatest weakness. The signature success of a successful OT analysis is when complex phenomena are understood as the interaction of simple constraints. But the overgeneration problem is precisely this problem: complex but weird phenomena resulting from the interaction of simple constraints (e.g. Hansson 2007 on ABC). As for the undergeneration problem, opaque candidates are not optimal in classic OT. 34

In a picture Logically Possible Maps Regular Maps ( rule-based theories) Phonology OT 35

Conclusion Despite some limitations, k-isl functions characterize well the nature of phonological maps. Many phonological maps, including opaque ones, can be expressed with them. k-isl functions provide both a more expressive and restrictive theory of typology than classic OT, which we argue better matches the attested typlogy. Like classic OT, there are no intermediate representations, and k-isl functions can express the homogeneity of target, heterogeneity of process which helps address the conspiracy and duplication problems. k-isl functions are learnable. Unlike OT, subregular computational properties like ISL and not optimization form the core computational nature of phonology. 36

Conclusion in a picture Logically Possible Maps Regular Maps ( rule-based theories) Phonology OT ISL maps are in green 37

Questions 1. What is k? Can k be learned? 2. What about phonetics? 3. How do you learn underlying representations? Acknowledgments Thank You. *Special thanks to Rémi Eyraud, Bill Idsardi, and Jim Rogers for valuable discussion. We also thank Iman Albadr, Hyun Jin Hwangbo, Taylor Miller, and Curt Sebastian for useful comments and feedback. 38

Appendix Some extra slides as needed 39

Formal Language Theory ISL functions naturally extend SL stringsets in Formal Language Theory. For SL stringsets, well-formedness is determined by examining windows of size k. x a... a b a b a b a b a... b Figure 2: A stringset is Strictly 2-Local if the well-formedness of each word can be determined by checking whether each symbol x in the word is licensed by the preceding symbol. (McNaughton and Papert 1971, Rogers and Pullum 2011) 40

Subregular Hierarchies for Stringsets Successor Regular Precedence Monadic Second Order Locally Threshold Testable Non-Counting First Order Locally Testable Piecewise Testable Propositional Strictly Local Strictly Piecewise Conjunctions of Negative Finite Literals Figure 3: Subregular hierarchies of stringsets. 41

Subregular Hierarchies for Maps MSO-definable Regular Left Subsequential Right Subsequential Left Output Strictly Local Input Strictly Local Right Output Strictly Local Finite Figure 4: Subregular hierarchies of maps (so far). We are currently generalizing other subregular classes of stringsets to maps. 42

Formal Definition of ISL Definition (Residual Function) Given a function f and input string x, the residual function of f w.r.t. x is r f (x) def = {(y,v) ( u) [ f(xy) = uv u = lcp(f(xσ )) ]} where lcp(s) is the longest common prefix of all strings in S. Definition (k-isl Function) A function (map) f is Input Strictly k-local if for all input strings w,v: if Suff k 1 (w) = Suff k 1 (v) then r f (w) = r f (v). Note this definition is agrammatical! 43

Counterfeeding-on-focus Bedouin Arabic he wrote /katab/ i / a i / σ katab σ kitab [kitab] This is 3-ISL. 44

Self-destructive feeding Turkish your baby /bebek+n/ i / C C # bebekin k /V +V bebein [bebein] This is 5-ISL. 45

Non-gratuitous feeding Classical Arabic write.ms /ktub/ V i / # CCV i uktub P / # V Puktub [Puktub] This is 5-ISL. 46

Cross-derivational feeding Lithuanian to strew all over /ap-berti/ i / C C apiberti where Cs are homorganic stops [ voice] [+voice] / D apiberti [apiberti] This is 4-ISL. 47

Simple constraints in OT generate non-regular maps Ident, Dep >> *ab >> Max a n b m a n, if m < n a n b m b m, if n < m (Gerdemann and Hulden 2012) 48