Learning Subregular Classes of Languages with Factored Deterministic Automata




Jeffrey Heinz
Dept. of Linguistics and Cognitive Science, University of Delaware
heinz@udel.edu

James Rogers
Dept. of Computer Science, Earlham College
jrogers@cs.earlham.edu

Abstract

This paper shows how factored finite-state representations of subregular language classes are identifiable in the limit from positive data by learners which are polytime iterative and optimal. These representations are motivated in two ways. First, the size of this representation for a given regular language can be exponentially smaller than the size of the minimal deterministic acceptor recognizing the language. Second, these representations (including the exponentially smaller ones) describe actual formal languages which successfully model natural language phenomena, notably in the subfield of phonology.

1 Introduction

In this paper we show how to define certain subregular classes of languages which are identifiable in the limit from positive data (ILPD) by efficient, well-behaved learners with a lattice-structured hypothesis space (Heinz et al., 2012). It is shown that every finite set of DFAs defines such an ILPD class. In this case, each DFA can be viewed as one factor in the description of every language in the class. This factoring of language classes into multiple DFA can provide a compact, canonical representation of the grammars for every language in the class. Additionally, many subregular classes of languages can be learned by the above methods including the Locally k-Testable, Strictly k-Local, Piecewise k-Testable, and Strictly k-Piecewise languages (McNaughton and Papert, 1971; Rogers and Pullum, 2011; Rogers et al., 2010). From a linguistic (and cognitive) perspective, these subregular classes are interesting because they appear to be sufficient for modeling phonotactic patterns in human language (Heinz, 2010; Heinz et al., 2011; Rogers et al., to appear).

2 Preliminaries

For any function $f$ and element $a$ in the domain of $f$, we write $f(a)\downarrow$ if $f(a)$ is defined, $f(a)\downarrow = x$ if it is defined for $a$ and its value is $x$, and $f(a)\uparrow$ otherwise. The range of $f$, the set of values $f$ takes at elements for which it is defined, is denoted $\mathrm{range}(f)$.
$\Sigma^*$ and $\Sigma^k$ denote all sequences of any finite length, and of length $k$, over a finite alphabet $\Sigma$. The empty string is denoted $\lambda$. A language $L$ is a subset of $\Sigma^*$.

For all $x, y$ belonging to a partially-ordered set $(S, \leq)$, if $x \leq z$ and $y \leq z$ then $z$ is an upper bound of $x$ and $y$. For all $x, y \in S$, the least upper bound (lub) $x \sqcup y = z$ iff $x \leq z$, $y \leq z$, and for all $z'$ which upper bound $x$ and $y$, it is the case that $z \leq z'$. An upper semi-lattice is a partially ordered set $(S, \leq)$ such that every subset of $S$ has a lub. If $S$ is finite, this is equivalent to the existence of $x \sqcup y$ for all $x, y \in S$.

A deterministic finite-state automaton (DFA) is a tuple $(Q, \Sigma, Q_0, F, \delta)$. The states of the DFA are $Q$; the input alphabet is $\Sigma$; the set of initial states is $Q_0$; the final states are $F$; and $\delta : Q \times \Sigma \to Q$ is the transition function. We admit a set of initial states solely to accommodate the empty DFA, which has none. Deterministic automata never have more than one initial state. We will assume that, if the automaton is non-empty, then $Q_0 = \{q_0\}$. The transition function's domain is extended to $Q \times \Sigma^*$ in the usual way.

The language of a DFA $A$ is $L(A) \stackrel{\text{def}}{=} \{w \in \Sigma^* \mid \delta(q_0, w)\downarrow \in F\}$.

A DFA is trim iff it has no useless states: $(\forall q \in Q)[\exists w, v \in \Sigma^* \mid \delta(q_0, w)\downarrow = q \text{ and } \delta(q, v)\downarrow \in F]$.
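The definitions above can be made concrete with a small sketch. The class name `DFA`, the helper `is_trim`, and the dictionary encoding of the partial function $\delta$ are assumptions made for illustration and do not come from the paper; the automaton `A_a` follows the description of the factor $A_a$ given later in Section 4.

```python
from dataclasses import dataclass, field


@dataclass
class DFA:
    states: set
    alphabet: set
    initial: set          # empty for the empty DFA, otherwise {q0}
    finals: set
    delta: dict = field(default_factory=dict)  # partial map (state, symbol) -> state

    def run(self, word):
        """Extended transition function from q0; None where it is undefined."""
        if not self.initial:
            return None
        (q,) = self.initial
        for symbol in word:
            q = self.delta.get((q, symbol))
            if q is None:
                return None
        return q

    def accepts(self, word):
        q = self.run(word)
        return q is not None and q in self.finals


def _closure(seeds, edges):
    """All states reachable from `seeds` along `edges` (pairs source, target)."""
    seen, frontier = set(seeds), list(seeds)
    while frontier:
        q = frontier.pop()
        for src, dst in edges:
            if src == q and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen


def is_trim(dfa):
    """True iff every state is reachable from q0 and can reach a final state."""
    forward = {(p, r) for (p, _), r in dfa.delta.items()}
    backward = {(r, p) for p, r in forward}
    reachable = _closure(dfa.initial, forward)
    productive = _closure(dfa.finals, backward)
    return dfa.states.issubset(reachable & productive)


sigma = {"a", "b", "c"}
# Two-state DFA accepting every string; state 1 records that "a" has occurred.
A_a = DFA({0, 1}, sigma, {0}, {0, 1},
          {(0, "a"): 1, (0, "b"): 0, (0, "c"): 0,
           (1, "a"): 1, (1, "b"): 1, (1, "c"): 1})
# Removing the b-transition from state 1 forbids "b" anywhere after an "a".
no_ab = DFA({0, 1}, sigma, {0}, {0, 1},
            {k: v for k, v in A_a.delta.items() if k != (1, "b")})
# State 1 is reachable but cannot reach a final state, so this DFA is not trim.
dead_end = DFA({0, 1}, sigma, {0}, {0}, {(0, "a"): 1})
empty = DFA(set(), sigma, set(), set(), {})
```

Representing $\delta$ as a dictionary keeps it partial: a missing key plays the role of $\delta(q, \sigma)\uparrow$.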

Every DFA can be trimmed by eliminating useless states from $Q$ and restricting the remaining components accordingly.

The empty DFA is $A_\emptyset = (\emptyset, \Sigma, \emptyset, \emptyset, \emptyset)$. This is the minimal trim DFA such that $L(A_\emptyset) = \emptyset$.

The DFA product of $A_1 = (Q_1, \Sigma, Q_{01}, F_1, \delta_1)$ and $A_2 = (Q_2, \Sigma, Q_{02}, F_2, \delta_2)$ is $\bigotimes(A_1, A_2) = (Q, \Sigma, Q_0, F, \delta)$ where $Q = Q_1 \times Q_2$, $Q_0 = Q_{01} \times Q_{02}$, $F = F_1 \times F_2$ and

$(\forall (q_1, q_2) \in Q)(\forall \sigma \in \Sigma)\,[\delta((q_1, q_2), \sigma) \stackrel{\text{def}}{=} (\delta_1(q_1, \sigma), \delta_2(q_2, \sigma))]$

The DFA product of two DFA is also a DFA. It is not necessarily trim, but we will generally assume that in taking the product the result has been trimmed, as well.

The product operation is associative and commutative (up to isomorphism), and so it can be applied to a finite set $S$ of DFA, in which case we write $\bigotimes S = \bigotimes_{A \in S} A$ (letting $\bigotimes\{A\} = A$). In this paper, grammars are finite sequences of DFAs $\vec{A} = \langle A_1 \cdots A_n \rangle$ and we also use the $\bigotimes$ notation for the product of a finite sequence of DFAs: $\bigotimes \vec{A} \stackrel{\text{def}}{=} \bigotimes_{A \in \vec{A}} A$ and $L(\vec{A}) \stackrel{\text{def}}{=} L(\bigotimes \vec{A})$. Sequences are used instead of sets in order to match factors in two grammars. Let DFA denote the collection of finite sequences of DFAs.

Theorem 1 is well-known.

Theorem 1 Consider a finite set $S$ of DFA. Then $L(\bigotimes_{A \in S} A) = \bigcap_{A \in S} L(A)$.

An important consequence of Theorem 1 is that some languages are exponentially more compactly represented by their factors. The grammar $\vec{A} = \langle A_1 \cdots A_n \rangle$ has $\sum_{1 \leq i \leq n} \mathrm{card}(Q_i)$ states, whereas the trimmed $\bigotimes \vec{A}$ can have as many as $\prod_{1 \leq i \leq n} \mathrm{card}(Q_i) \in \Theta(\max_{1 \leq i \leq n}(\mathrm{card}(Q_i))^n)$ states. An example of such a language is given in Section 4, Figures 1 and 2.

2.1 Identification in the limit

A positive text $T$ for a language $L$ is a total function $T : \mathbb{N} \to L \cup \{\#\}$ ($\#$ is a 'pause') such that $\mathrm{range}(T) = L$ (i.e., for every $w \in L$ there is at least one $n \in \mathbb{N}$ for which $w = T(n)$). Let $T[i]$ denote the initial finite sequence $T(0), T(1) \ldots T(i-1)$. Let SEQ denote the set of all finite initial portions of all positive texts for all possible languages. The content of an element $T[i]$ of SEQ is

$\mathrm{content}(T[i]) \stackrel{\text{def}}{=} \{w \in \Sigma^* \mid (\exists j \leq i-1)[T(j) = w]\}$.

In this paper, learning algorithms are programs: $\phi : \mathrm{SEQ} \to \mathrm{DFA}$.
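A learner $\phi$ identifies in the limit from positive texts a collection of languages $\mathcal{L}$ if and only if for all $L \in \mathcal{L}$, for all positive texts $T$ for $L$, there exists an $n \in \mathbb{N}$ such that $(\forall m \geq n)[\phi(T[m]) = \phi(T[n])]$ and $L(\phi(T[n])) = L$ (see Gold (1967) and Jain et al. (1999)). A class of languages is ILPD iff it is identifiable in the limit by such a learner.

3 Classes of factorable-DFA languages

In this section, classes of factorable-DFA languages are introduced. The notion of sub-DFA is central to this concept. Pictorially, a sub-DFA is obtained from a DFA by removing zero or more states, transitions, and/or revoking the final status of zero or more final states.

Definition 1 For any DFA $A = (Q, \Sigma, Q_0, F, \delta)$, a DFA $A' = (Q', \Sigma', Q_0', F', \delta')$ is a sub-DFA of $A$, written $A' \sqsubseteq A$, if and only if $Q' \subseteq Q$, $\Sigma' \subseteq \Sigma$, $Q_0' \subseteq Q_0$, $F' \subseteq F$, $\delta' \subseteq \delta$.

The sub-DFA relation is extended to grammars (sequences of DFA). Let $\vec{A} = \langle A_1 \cdots A_n \rangle$ and $\vec{A}' = \langle A_1' \cdots A_n' \rangle$. Then $\vec{A}' \sqsubseteq \vec{A} \Leftrightarrow (\forall\, 1 \leq i \leq n)[A_i' \sqsubseteq A_i]$.

Clearly, if $A' \sqsubseteq A$ then $L(A') \subseteq L(A)$.

Every grammar $\vec{A}$ determines a class of languages: those recognized by a sub-grammar of $\vec{A}$. Our interest is not in $L(\vec{A})$, itself. Indeed, this will generally be $\Sigma^*$. Rather, our interest is in identifying languages relative to the class of languages recognizable by sub-grammars of $\vec{A}$.

Definition 2 Let $\mathcal{G}(\vec{A}) \stackrel{\text{def}}{=} \{\vec{B} \mid \vec{B} \sqsubseteq \vec{A}\}$, the class of grammars that are sub-grammars of $\vec{A}$. Let $\mathcal{L}(\vec{A}) \stackrel{\text{def}}{=} \{L(\vec{B}) \mid \vec{B} \sqsubseteq \vec{A}\}$, the class of languages recognized by sub-grammars of $\vec{A}$. A class of languages is a factorable-DFA class iff it is $\mathcal{L}(\vec{A})$ for some $\vec{A}$.

The set $\mathcal{G}(\vec{A})$ is necessarily finite, since $\vec{A}$ is, so every class $\mathcal{L}(\vec{A})$ is trivially ILPD by a learning algorithm that systematically rules out grammars that are incompatible with the text, but this naïve algorithm is prohibitively inefficient. Our goal is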
to establish that the efficient general learning algorithm given by Heinz et al. (2012) can be applied to every class of factorable-DFA languages, and that this class includes many of the well-known sub-regular language classes as well as classes that are, in a particular sense, mixtures of these.

4 A motivating example

This section describes the Strictly 2-Piecewise languages, which motivate the factorization that is at the heart of this analysis. Strictly Piecewise (SP) languages are characterized in Rogers et al. (2010) and are a special subclass of the Piecewise Testable languages (Simon, 1975).

Every SP language is the intersection of a finite set of complements of principal shuffle ideals:

$L \in \mathrm{SP} \stackrel{\text{def}}{\Longleftrightarrow} L = \bigcap_{w \in S} [\overline{\mathrm{SI}(w)}]$, $S$ finite,

where

$\mathrm{SI}(w) \stackrel{\text{def}}{=} \{v \in \Sigma^* \mid w = \sigma_1 \cdots \sigma_k \text{ and } (\exists v_0, \ldots, v_k \in \Sigma^*)[v = v_0 \sigma_1 v_1 \cdots \sigma_k v_k]\}$

So $v \in \mathrm{SI}(w)$ iff $w$ occurs as a subsequence of $v$ and $L \in \mathrm{SP}$ iff there is a finite set of strings for which $L$ includes all and only those strings that do not include those strings as subsequences. We say that $L$ is generated by $S$. It turns out that SP is exactly the class of languages that are closed under subsequence.

A language is $\mathrm{SP}_k$ iff it is generated by a set of strings each of which is of length less than or equal to $k$. Clearly, every SP language is $\mathrm{SP}_k$ for some $k$ and $\mathrm{SP} = \bigcup_{1 \leq k \in \mathbb{N}} [\mathrm{SP}_k]$.

If $w \in \Sigma^*$ and $|w| = k$, then $\overline{\mathrm{SI}(w)} = L(A_w)$ for a DFA $A_w$ with no more than $k$ states. For example, if $k = 2$ and $\Sigma = \{a, b, c\}$ and, hence, $w \in \{a, b, c\}^2$, then the minimal trim DFA recognizing $\overline{\mathrm{SI}(w)}$ will be a sub-DFA (in which one of the transitions from the $\sigma_1$ state has been removed) of one of the three DFA of Figure 1.

Figure 1 shows $\vec{A} = \langle A_a, A_b, A_c \rangle$, where $\Sigma = \{a, b, c\}$ and each $A_\sigma$ is a DFA accepting $\Sigma^*$ whose states distinguish whether $\sigma$ has yet occurred. Figure 2 shows $\bigotimes \vec{A}$.

Note that every $\mathrm{SP}_2$ language over $\{a, b, c\}$ is $L(\vec{B})$ for some $\vec{B} \sqsubseteq \vec{A}$. The class of grammars of $\mathcal{G}(\vec{A})$ recognize a slight extension of $\mathrm{SP}_2$ over $\{a, b, c\}$ (which includes 1-Reverse Definite languages as well).

Observe that 6 states are required to describe $\vec{A}$ but 8 states are required to describe $\bigotimes \vec{A}$. Let $\vec{A}_\Sigma$ be the sequence of DFA with one DFA for each letter in $\Sigma$, as in Figure 1.
As $\mathrm{card}(\Sigma)$ increases the number of states of $\vec{A}_\Sigma$ is $2 \cdot \mathrm{card}(\Sigma)$ but the number of states in $\bigotimes \vec{A}_\Sigma$ is $2^{\mathrm{card}(\Sigma)}$. The number of states in the product, in this case, is exponential in the number of its factors.

The Strictly 2-Piecewise languages are currently the strongest computational characterization¹ of long-distance phonotactic patterns in human languages (Heinz, 2010). The size of the phonemic inventories² in the world's languages ranges from 11 to 140 (Maddieson, 1984). English has about 40, depending on the dialect. With an alphabet of that size $\vec{A}_\Sigma$ would have 80 states, while $\bigotimes \vec{A}_\Sigma$ would have $2^{40} \approx 1 \times 10^{12}$ states. The fact that there are about $10^{11}$ neurons in human brains (Williams and Herrup, 1988) helps motivate interest in the more compact, parallel representation given by $\vec{A}_\Sigma$ as opposed to the singular representation of the DFA $\bigotimes \vec{A}_\Sigma$.

5 Learning factorable classes of languages

In this section, classes of factorable-DFA languages are shown to be analyzable as finite lattice spaces. By Theorem 6 of Heinz et al. (2012), every such class of languages can be identified in the limit from positive texts.

Definition 3 (Joins) Let $A = (Q, \Sigma, Q_0, F, \delta)$, $A_1 = (Q_1, \Sigma, Q_{01}, F_1, \delta_1) \sqsubseteq A$ and $A_2 = (Q_2, \Sigma, Q_{02}, F_2, \delta_2) \sqsubseteq A$. The join of $A_1$ and $A_2$ is $A_1 \sqcup A_2 \stackrel{\text{def}}{=} (Q_1 \cup Q_2, \Sigma, Q_{01} \cup Q_{02}, F_1 \cup F_2, \delta_1 \cup \delta_2)$.

Similarly, for all $\vec{A} = \langle A_1 \cdots A_n \rangle$ and $\vec{B} = \langle B_1 \cdots B_n \rangle \sqsubseteq \vec{A}$, $\vec{C} = \langle C_1 \cdots C_n \rangle \sqsubseteq \vec{A}$, the join of $\vec{B}$ and $\vec{C}$ is $\vec{B} \sqcup \vec{C} \stackrel{\text{def}}{=} \langle B_1 \sqcup C_1 \cdots B_n \sqcup C_n \rangle$.

Note that the join of two sub-DFA of $A$ is also a sub-DFA of $A$. Since $\mathcal{G}(\vec{A})$ is finite, binary join suffices to define join of any set of sub-DFA of a given DFA (as iterated binary joins). Let $\bigsqcup[S]$ be the join of $S$, a set of sub-DFAs of some $A$ (or $\vec{A}$).

¹ See Heinz et al. (2011) for competing characterizations.
² The mental representations of speech sounds are called phonemes, and the phonemic inventory is the set of these representations (Hayes, 2009).
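As an informal illustration of Definition 3, the join can be sketched as a component-wise union. The names `SubDFA`, `join`, `is_sub` and `join_grammars` are hypothetical, the shared alphabet is left implicit, and transitions are stored as (source, symbol, target) triples; none of this encoding comes from the paper.

```python
from typing import NamedTuple


class SubDFA(NamedTuple):
    states: frozenset
    initial: frozenset
    finals: frozenset
    delta: frozenset  # transitions as (source, symbol, target) triples


EMPTY = SubDFA(frozenset(), frozenset(), frozenset(), frozenset())


def join(a, b):
    """Component-wise union, as in Definition 3."""
    return SubDFA(*(x | y for x, y in zip(a, b)))


def is_sub(a, b):
    """The sub-DFA relation: every component of a is contained in that of b."""
    return all(x.issubset(y) for x, y in zip(a, b))


def join_grammars(g, h):
    """Factor-by-factor join of two grammars of the same length."""
    return tuple(join(a, b) for a, b in zip(g, h))


# Two sub-DFAs of the two-state factor that tracks whether "a" has occurred.
B = SubDFA(frozenset({0}), frozenset({0}), frozenset({0}),
           frozenset({(0, "b", 0)}))
C = SubDFA(frozenset({0, 1}), frozenset({0}), frozenset({1}),
           frozenset({(0, "a", 1)}))
BC = join(B, C)
```

Because each component is a set ordered by inclusion, the union is the least upper bound component by component, which is the observation behind Lemma 1.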

[Figure 1: The sequence of DFA $\vec{A} = \langle A_a, A_b, A_c \rangle$, where $\Sigma = \{a, b, c\}$ and each $A_\sigma$ accepts $\Sigma^*$ and whose states distinguish whether $\sigma$ has yet occurred.]

[Figure 2: The product $\bigotimes \langle A_a, A_b, A_c \rangle$.]
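The state counts for these two figures can be checked with a short sketch that builds one two-state factor per letter and explores the reachable part of their product. The function names `factor` and `reachable_product_states` are assumptions for illustration; since every state of each factor is final, the reachable product states are exactly the states of the trimmed product.

```python
def factor(sigma, alphabet):
    """Transition table of A_sigma: state 1 records that sigma has occurred."""
    delta = {}
    for symbol in alphabet:
        delta[(0, symbol)] = 1 if symbol == sigma else 0
        delta[(1, symbol)] = 1
    return delta


def reachable_product_states(factors, alphabet):
    """Explore the product automaton from the tuple of initial states."""
    start = tuple(0 for _ in factors)
    seen, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        for symbol in alphabet:
            nxt = tuple(d[(qi, symbol)] for d, qi in zip(factors, q))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def state_counts(alphabet):
    """(states in the factored grammar, states in the trimmed product)."""
    factors = [factor(s, alphabet) for s in alphabet]
    return 2 * len(factors), len(reachable_product_states(factors, alphabet))


factored, product = state_counts("abc")
```

For the three-letter alphabet this gives 6 states for the factored grammar and 8 for the product, matching the counts stated in Section 4; each product state corresponds to the set of letters seen so far.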

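Lemma 1 The set of sub-DFA of a DFA $A$, ordered by $\sqsubseteq$, $(\{B \mid B \sqsubseteq A\}, \sqsubseteq)$, is an upper semi-lattice with the least upper bound of a set $S$ of sub-DFA of $A$ being their join. Similarly the set of sub-grammars of a grammar $\vec{A}$, ordered again by $\sqsubseteq$, $(\{\vec{B} \mid \vec{B} \sqsubseteq \vec{A}\}, \sqsubseteq)$, is an upper semi-lattice with the least upper bound of a set of sub-grammars of $\vec{A}$ being their join.³

This follows from the fact that $Q_1 \cup Q_2$ (similarly $F_1 \cup F_2$ and $\delta_1 \cup \delta_2$) is the lub of $Q_1$ and $Q_2$ (etc.) in the lattice of sets ordered by subset.

³ These are actually complete finite lattices, but we are interested primarily in the joins.

5.1 Paths and Chisels

Definition 4 Let $A = (Q, \Sigma, \{q_0\}, F, \delta)$ be a non-empty DFA and $w = \sigma_0 \sigma_1 \cdots \sigma_n \in \Sigma^*$. If $\delta(q_0, w)\downarrow$, the path of $w$ in $A$ is the sequence

$\pi(A, w) \stackrel{\text{def}}{=} \langle (q_0, \sigma_0), \ldots, (q_n, \sigma_n), (q_{n+1}, \lambda) \rangle$

where $(\forall\, 0 \leq i \leq n)[q_{i+1} = \delta(q_i, \sigma_i)]$. If $\delta(q_0, w)\uparrow$ then $\pi(A, w)\uparrow$.

If $\pi(A, w)\downarrow$, let $Q_{\pi(A,w)}$ denote the set of states it traverses, $\delta_{\pi(A,w)}$ denote the transitions it traverses, and let $F_{\pi(A,w)} = \{q_{n+1}\}$.

Next, for any DFA $A$, and any $w \in L(A)$, we define the chisel of $w$ given $A$ to be the sub-DFA of $A$ that exactly encompasses the path etched out in $A$ by $w$.

Definition 5 For any non-empty DFA $A = (Q, \Sigma, \{q_0\}, F, \delta)$ and all $w \in \Sigma^*$, if $w \in L(A)$, then the chisel of $w$ given $A$ is the sub-DFA $C_A(w) = (Q_{\pi(A,w)}, \Sigma, \{q_0\}, F_{\pi(A,w)}, \delta_{\pi(A,w)})$. If $w \notin L(A)$, then $C_A(w) = A_\emptyset$.

Consider any $\vec{A} = \langle A_1 \cdots A_n \rangle$ and any word $w \in \Sigma^*$. The chisel of $w$ given $\vec{A}$ is $C_{\vec{A}}(w) = \langle C_{A_1}(w) \cdots C_{A_n}(w) \rangle$.

Observe that $C_A(w) \sqsubseteq A$ for all words $w$ and all $A$, and that $C_A(w)$ is trim.

Using the join, the domain of the chisel is extended to sets of words: $C_{\vec{A}}(S) = \bigsqcup_{w \in S} C_{\vec{A}}(w)$. Note that $\{C_{\vec{A}}(w) \mid w \in \Sigma^*\}$ is finite, since $\{\vec{B} \mid \vec{B} \sqsubseteq \vec{A}\}$ is.

Theorem 2 For any grammar $\vec{A}$, let $C(\vec{A}) = \{C_{\vec{A}}(S) \mid S \subseteq \Sigma^*\}$. Then $(C(\vec{A}), \sqsubseteq)$ is an upper semi-lattice with the lub of two elements given by the join $\sqcup$.

Proof This follows immediately from the finiteness of $\{C_{\vec{A}}(w) \mid w \in \Sigma^*\}$ and Lemma 1.

Lemma 2 For all $A = (Q, \Sigma, Q_0, F, \delta)$, there is a finite set $S \subset \Sigma^*$ such that $\bigsqcup_{w \in S} C_A(w) = A$. Similarly, for all $\vec{A} = \langle A_1 \cdots A_n \rangle$, there is a finite set $S \subset \Sigma^*$ such that $C_{\vec{A}}(S) = \vec{A}$.

Proof If $A$ is empty, then clearly $S = \emptyset$ suffices. Henceforth consider only nonempty $A$.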
For the first statement, let $S$ be the set of $u\sigma v$ where, for each $q \in Q$ and for each $\sigma \in \Sigma$, $\delta(q_0, u)\downarrow = q$ and $\delta(\delta(q, \sigma), v)\downarrow \in F$ such that $u\sigma v$ has minimal length. By construction, $S$ is finite. Furthermore, for every state and every transition in $A$, there is a word in $S$ whose path touches that state and transition. By definition of $\sqcup$ it follows that $C_A(S) = A$.

For proof of the second statement, for each $A_i$ in $\vec{A}$, construct $S_i$ as stated and take their union.

Heinz et al. (2012) define lattice spaces. For an upper semi-lattice $V$ and a function $f : \Sigma^* \to V$ such that $f$ and $\sqcup$ are (total) computable, $(V, f)$ is called a Lattice Space (LS) iff, for each $v \in V$, there exists a finite $D \subseteq \mathrm{range}(f)$ with $\bigsqcup D = v$.

Theorem 3 For all grammars $\vec{A} = \langle A_1 \cdots A_n \rangle$, $(C(\vec{A}), C_{\vec{A}})$ is a lattice space.

Proof For all $\vec{A}' \in C(\vec{A})$, by Lemma 2, there is a finite $S \subseteq \Sigma^*$ such that $\bigsqcup_{w \in S} C_{\vec{A}}(w) = \vec{A}'$.

For Heinz et al. (2012), elements of the lattice are grammars. Likewise, here, each grammar $\vec{A} = \langle A_1 \cdots A_n \rangle$ defines a lattice whose elements are its sub-grammars. Heinz et al. (2012) associate the languages of a grammar $v$ in a lattice space $(V, f)$ with $\{w \in \Sigma^* \mid f(w) \sqsubseteq v\}$. This definition coincides with ours: for any element $\vec{A}'$ of $C(\vec{A})$ (note $\vec{A}' \sqsubseteq \vec{A}$), a word $w$ belongs to $L(\vec{A}')$ if and only if $C_{\vec{A}}(w)$ is a sub-grammar of $\vec{A}'$. The class of languages of a LS is the collection of languages obtained by every element in the lattice.

For every LS $(C(\vec{A}), C_{\vec{A}})$, we now define a learner $\phi$ according to the construction in Heinz et al. (2012):

$\forall T \in \mathrm{SEQ},\ \phi(T) = \bigsqcup_{w \in \mathrm{content}(T)} C_{\vec{A}}(w)$.

Let $\mathcal{L}_{(C(\vec{A}), C_{\vec{A}})}$ denote the class of languages associated with the LS in Theorem 3. According to Heinz et al. (2012, Theorem 6), the learner $\phi$ identifies $\mathcal{L}_{(C(\vec{A}), C_{\vec{A}})}$ in the limit from positive data. Furthermore, $\phi$ is polytime iterative,
i.e., can compute the next hypothesis in polytime from the previous hypothesis alone, and optimal in the sense that no other learner converges more quickly on languages in $\mathcal{L}_{(C(\vec{A}), C_{\vec{A}})}$. In addition, this learner is globally-consistent (every hypothesis covers the data seen so far), locally-conservative (the hypothesis never changes unless the current datum is not consistent with the current hypothesis), strongly-monotone (the current hypothesis is a superset of all prior hypotheses), and prudent (it never hypothesizes a language that is not in the target class). Formal definitions of these terms are given in Heinz et al. (2012) and can also be found elsewhere, e.g. Jain et al. (1999).

6 Complexity considerations

The space of sub-grammars of a given sequence of DFAs is necessarily finite and, thus, identifiable in the limit from positive data by a naïve learner that simply enumerates the space of grammars. The lattice learning algorithm has better efficiency because it works bottom-up, extending the grammar minimally, at each step, with the chisel of the current string of the text. The lattice learner never explores any part of the space of grammars that is not a sub-grammar of the correct one and, as it never moves down in the lattice, it will skip much of the space of grammars that are sub-grammars of the correct one. The space it explores will be minimal, given the text it is running on. Generalization is a result of the fact that in extending the grammar for a string the learner adds its entire Nerode equivalence class to the language.

The time complexity of either learning or recognition with the factored automata may actually be somewhat worse than the complexity of doing so with its product. Computing the chisel of a string $w$ in the product machine of Figure 2 is $\Theta(|w|)$, while in the factored machine of Figure 1 one must compute the chisel in each factor and its complexity is, thus, $\Theta(|w| \, \mathrm{card}(\Sigma)^{k-1})$. But $\Sigma$ and $k$ are fixed for a given factorization, so this works out to be a constant factor.

Where the factorization makes a substantial difference is in the number of features that must be learned.
In the factored grammar of the example, the total number of states plus edges is $\Theta(k \, \mathrm{card}(\Sigma)^{k-1})$, while in its product it is $\Theta(2^{(\mathrm{card}(\Sigma)^{k-1})})$. This represents an exponential improvement in the space complexity of the factored grammar.

Every DFA can be factored in many ways, but the factorizations do not necessarily provide an asymptotically significant improvement in space complexity. The canonical contrast is between sequences of automata $\langle A_1, \ldots, A_n \rangle$ that count modulo some sequence of $m_i \in \mathbb{N}$. If the $m_i$ are pairwise prime, the product will require $\prod_{1 \leq i \leq n}[m_i] = \Theta((\max_i[m_i])^n)$ states. If on the other hand, they are all multiples of each other it will require just $\Theta(\max_i[m_i])$.

7 Examples

The fact that the class of $\mathrm{SP}_2$ languages is efficiently identifiable in the limit from positive data is neither surprising nor new. The obvious approach to learning these languages simply accumulates the set of pairs of symbols that occur as subsequences of the strings in the text and builds a machine that accepts all and only those strings in which no other such pairs occur. This, in fact, is essentially what the lattice learner is doing. What is significant is that the lattice learner provides a general approach to learning any language class that can be captured by a factored grammar and, more importantly, any class of languages that are intersections of languages that are in classes that can be captured this way.

Factored grammars in which each factor recognizes $\Sigma^*$, as in the case of Figure 1, are of particular interest. Every sub-Star-Free class of languages in which the parameters of the class ($k$, for example) are fixed can be factored in this way.⁴ If the parameters are not fixed and the class of languages is not finite, none of these classes can be identified in the limit from positive data at all.⁵ So this approach is potentially useful at least for all sub-Star-Free classes. The learners for non-strict classes are practical, however, only for small values of the parameters. So that leaves the Strictly Local $\mathrm{SL}_k$ and Strictly Piecewise $\mathrm{SP}_k$ languages as the obvious targets.
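To make the lattice learner of Section 5 concrete for the factored grammar of Figure 1, here is a minimal sketch. It assumes one two-state factor per letter, encodes sub-DFAs as tuples of frozensets, and treats the string "#" as the pause; the names `SubDFA`, `step`, `chisel`, `learn` and `accepts` are hypothetical and not taken from the paper or from Heinz et al. (2012).

```python
from typing import NamedTuple


class SubDFA(NamedTuple):
    states: frozenset
    initial: frozenset
    finals: frozenset
    delta: frozenset  # transitions as (source, symbol, target) triples


EMPTY = SubDFA(frozenset(), frozenset(), frozenset(), frozenset())


def step(sigma, state, symbol):
    """Transition of the factor A_sigma: state 1 means sigma has occurred."""
    return 1 if (state == 1 or symbol == sigma) else 0


def chisel(sigma, word):
    """Sub-DFA of A_sigma traced out by the path of `word`."""
    q, states, delta = 0, {0}, set()
    for symbol in word:
        r = step(sigma, q, symbol)
        delta.add((q, symbol, r))
        states.add(r)
        q = r
    return SubDFA(frozenset(states), frozenset({0}),
                  frozenset({q}), frozenset(delta))


def join(a, b):
    return SubDFA(*(x | y for x, y in zip(a, b)))


def is_sub(a, b):
    return all(x.issubset(y) for x, y in zip(a, b))


def learn(text, alphabet):
    """Join the chisels of the words in the text, factor by factor."""
    grammar = tuple(EMPTY for _ in alphabet)
    for word in text:
        if word == "#":  # pause: no information
            continue
        grammar = tuple(join(g, chisel(s, word))
                        for g, s in zip(grammar, alphabet))
    return grammar


def accepts(grammar, alphabet, word):
    """w is in the language iff its chisel is a sub-grammar of the hypothesis."""
    return all(is_sub(chisel(s, word), g) for g, s in zip(grammar, alphabet))


alphabet = "abc"
text = ["ab", "#", "ac", "b"]
hypothesis = learn(text, alphabet)
```

On this text the hypothesis accepts the unseen word "a" but rejects "abc", because no observed word has a "c" after a "b". It also rejects "c": no observed word contains "c" before any "a", so the corresponding transition in the factor for "a" was never traversed, which illustrates the slight extension of $\mathrm{SP}_2$ mentioned in Section 4. Each update uses only the previous hypothesis and the current word, in line with the iterative behaviour described in Section 5.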
The $\mathrm{SL}_k$ languages are those that are determined by the substrings of length no greater than $k$ that occur within the string (including endmarkers). These can be factored on the basis of those substrings, just as the $\mathrm{SP}_k$ languages can, although the construction is somewhat more complex. (See the Knuth-Morris-Pratt algorithm (Knuth et al., 1977) for a way of doing this.) But $\mathrm{SL}_k$ is a case in which there is no complexity advantage in factoring the DFA. This is because every $\mathrm{SL}_k$ language is recognized by a DFA that is a Myhill graph: with a state for each string of $\Sigma^{<k}$ (i.e., of length less than $k$). Such a graph has $\Theta(\mathrm{card}(\Sigma)^{k-1})$ states, asymptotically the same as the number of states in the factored grammar, which is actually marginally worse.

Therefore, factored $\mathrm{SL}_k$ grammars are not, in themselves, interesting. But they are interesting as factors of other grammars. Let $(\mathrm{SL}+\mathrm{SP})_{k,l}$ (resp. $(\mathrm{LT}+\mathrm{SP})_{k,l}$, $(\mathrm{SL}+\mathrm{PT})_{k,l}$) be the class of languages that are intersections of $\mathrm{SL}_k$ and $\mathrm{SP}_l$ (resp. $\mathrm{LT}_k$ and $\mathrm{SP}_l$, $\mathrm{SL}_k$ and $\mathrm{PT}_l$) languages. Here LT (PT) languages are determined by the set of substrings (subsequences) that occur in the string (see Rogers and Pullum (2011) and Rogers et al. (2010)).

These classes capture co-occurrence of local constraints (based on adjacency) and long-distance constraints (based on precedence). These are of particular interest in phonotactics, as they are linguistically well-motivated approaches to modeling phonotactics and they are sufficiently powerful to model most phonotactic patterns. The results of Heinz (2007) and Heinz (2010) strongly suggest that nearly all segmental patterns are $(\mathrm{SL}+\mathrm{SP})_{k,l}$ for small $k$ and $l$. Moreover, roughly 72% of the stress patterns that are included in Heinz's database (Heinz, 2009; Phonology Lab, 2012) of patterns that have been attested in natural language can be modeled with $\mathrm{SL}_k$ grammars with $k \leq 6$. Of the rest, all but four are $\mathrm{LT}_1 + \mathrm{SP}_4$ and all but two are $\mathrm{LT}_2 + \mathrm{SP}_4$. Both of these last two are properly regular (Wibel et al., in prep).

⁴ We conjecture that there is a parameterized class of languages that is equivalent to the Star-Free languages, which would make that class learnable in this way as well.
⁵ For most of these classes, including the Definite, Reverse-Definite and Strictly Local classes and their super classes, this is immediate from the fact that they are superfinite. SP, on the other hand, is not superfinite (since it does not include all finite languages) but nevertheless, it is not ILPD.

8 Conclusion

We have shown how subregular classes of languages can be learned over factored representations, which can be exponentially more compact than representations with a single DFA. Essentially, words in the data presentation are passed through each factor, activating the parts touched.
This approach immediately allows one to naturally mix well-characterized learnable subregular classes in such a way that the resulting language class is also learnable. While this mixing is partly motivated by the different kinds of phonotactic patterns in natural language, it also suggests a very interesting theoretical possibility. Specifically, we anticipate that the right parameterization of these well-studied subregular classes will cover the class of star-free languages. Future work could also include extending the current analysis to factoring stochastic languages, perhaps in a way that connects with earlier research on factored HMMs (Ghahramani and Jordan, 1997).

Acknowledgments

This paper has benefited from the insightful comments of three anonymous reviewers, for which the authors are grateful. The authors also thank Jie Fu and Herbert G. Tanner for useful discussion. This research was supported by NSF grant 1035577 to the first author, and the work was completed while the second author was on sabbatical at the Department of Linguistics and Cognitive Science at the University of Delaware.

References

Zoubin Ghahramani and Michael I. Jordan. 1997. Factorial hidden Markov models. Machine Learning, 29(2):245-273.
E.M. Gold. 1967. Language identification in the limit. Information and Control, 10:447-474.
Bruce Hayes. 2009. Introductory Phonology. Wiley-Blackwell.
Jeffrey Heinz, Chetan Rawal, and Herbert G. Tanner. 2011. Tier-based strictly local constraints for phonology. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 58-64, Portland, Oregon, USA, June. Association for Computational Linguistics.
Jeffrey Heinz, Anna Kasprzik, and Timo Kötzing. 2012. Learning with lattice-structured hypothesis spaces. Theoretical Computer Science, 457:111-127, October.
Jeffrey Heinz. 2007. The Inductive Learning of Phonotactic Patterns. Ph.D. thesis, University of California, Los Angeles.
Jeffrey Heinz. 2009. On the role of locality in learning stress patterns. Phonology, 26(2):303-351.
Jeffrey Heinz. 2010. Learning long-distance phonotactics. Linguistic Inquiry, 41(4):623-661.

Sanjay Jain, Daniel Osherson, James S. Royer, and Arun Sharma. 1999. Systems That Learn: An Introduction to Learning Theory (Learning, Development and Conceptual Change). The MIT Press, 2nd edition.
Donald Knuth, James H Morris, and Vaughn Pratt. 1977. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323-350.
Ian Maddieson. 1984. Patterns of Sounds. Cambridge University Press, Cambridge, UK.
Robert McNaughton and Seymour Papert. 1971. Counter-Free Automata. MIT Press.
UD Phonology Lab. 2012. UD phonology lab stress pattern database. http://phonology.cogsci.udel.edu/dbs/stress. Accessed December 2012.
James Rogers and Geoffrey Pullum. 2011. Aural pattern recognition experiments and the subregular hierarchy. Journal of Logic, Language and Information, 20:329-342.
James Rogers, Jeffrey Heinz, Gil Bailey, Matt Edlefsen, Molly Visscher, David Wellcome, and Sean Wibel. 2010. On languages piecewise testable in the strict sense. In Christian Ebert, Gerhard Jäger, and Jens Michaelis, editors, The Mathematics of Language, volume 6149 of Lecture Notes in Artificial Intelligence, pages 255-265. Springer.
James Rogers, Jeffrey Heinz, Margaret Fero, Jeremy Hurst, Dakotah Lambert, and Sean Wibel. To appear. Cognitive and sub-regular complexity. In Proceedings of the 17th Conference on Formal Grammar.
Imre Simon. 1975. Piecewise testable events. In Automata Theory and Formal Languages: 2nd Grammatical Inference conference, pages 214-222, Berlin. Springer-Verlag.
Sean Wibel, James Rogers, and Jeffrey Heinz. Factoring of stress patterns. In preparation.
R.W. Williams and K. Herrup. 1988. The control of neuron number. Annual Review of Neuroscience, 11:423-453.