Kevin Tang and Andrew Nevins

Abstract
Keywords:
1 Introduction

Verb Vocabulary Size
Productivity of ar-er-ir
2 Data Sources
2.1 English
CLMET3.0, Old Bailey

2.1.1 CLMET3.0.
2.2 Portuguese
Corpus do Português, Colonia, Tycho Brahe
2.2.1 Corpus do Português.
2.2.2 Colonia.

2.3 Italian
Google Italian Ngram, DiaCoris
2.3.1 Google Ngram: Italian.
2.3.2 DiaCoris.
2.4 Spanish
Google Spanish Ngram, IMPACT-es
2.4.1 Google Ngram: Spanish.
2.4.2 IMPACT-es.

3 Methods: Verb Vocabulary Size
3.1 Simulations by Random Sampling
3.2 Epoching
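As a rough illustration of how random sampling and epoching can be combined, the sketch below groups dated verb tokens into fixed-width epochs, repeatedly draws the same number of tokens from each epoch, and averages the number of distinct lemmas per draw, so that an epoch with more text does not look lexically richer simply because it is larger. The function names, the (year, lemma) input format, the 50-year epoch width, and the number of runs are illustrative assumptions and are not taken from the paper.

```python
import random
from collections import defaultdict
from statistics import mean


def epoch_of(year, width=50):
    """Return the starting year of the fixed-width epoch containing `year`."""
    return (year // width) * width


def sampled_vocab_size(tokens, sample_size, n_runs=100, width=50, seed=0):
    """Mean number of distinct lemmas in equal-sized random samples, per epoch.

    tokens: iterable of (year, lemma) pairs, one per verb token
            (a hypothetical input format chosen for this sketch).
    Epochs with fewer than `sample_size` tokens are skipped.
    """
    rng = random.Random(seed)
    by_epoch = defaultdict(list)
    for year, lemma in tokens:
        by_epoch[epoch_of(year, width)].append(lemma)

    result = {}
    for epoch, lemmas in sorted(by_epoch.items()):
        if len(lemmas) < sample_size:
            continue
        # Each run samples tokens without replacement and counts lemma types.
        sizes = [len(set(rng.sample(lemmas, sample_size))) for _ in range(n_runs)]
        result[epoch] = mean(sizes)
    return result


# Toy data: one epoch repeats a single verb, the other uses ten different verbs.
toy_tokens = [(1710, "go")] * 10 + [(1760, v) for v in "abcdefghij"]
toy_sizes = sampled_vocab_size(toy_tokens, sample_size=5, n_runs=10)
```

Holding the sample size constant across epochs is the design choice that matters here: raw type counts grow with corpus size, so unequal epochs are not directly comparable without it.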

3.3 Lemma estimation
4 Analyses: Verb Vocabulary Size
4.1 Simulation results: English, CLMET3.0
4.2 Simulation results: Portuguese, Colonia
4.3 Simulation results: Italian, Google Ngram

4.4 Simulation results: Spanish, Google Ngram


4.5 Interim Summary
5 Methods: Productivity of -ar
5.1 Simulations by Random Sampling
5.2 Productivity Estimation
5.2.1 -ar/(-er + -ir).
5.2.2 Yang's Productivity Estimate.
Quantities involved: M, N, N/ln(N).
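The two estimates named in Section 5.2 can be sketched as follows. The first is the type ratio -ar/(-er + -ir). The second uses the threshold N/ln(N) from Yang's Tolerance Principle, under which a rule applying to N items is taken to be productive when its number of exceptions M does not exceed N/ln(N). How M and N map onto the conjugation classes is an assumption made here for illustration (N as all sampled verb types, M as the -er/-ir types treated as exceptions to a default -ar rule), as are the function names and the counts in the usage lines.

```python
import math


def class_ratio(n_ar, n_er, n_ir):
    """Type ratio -ar/(-er + -ir) for one sample."""
    if n_er + n_ir == 0:
        raise ValueError("ratio undefined when there are no -er/-ir types")
    return n_ar / (n_er + n_ir)


def tolerance_threshold(n):
    """Threshold N/ln(N); defined here only for n > 1."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return n / math.log(n)


def is_productive(n_total, n_exceptions):
    """True when the exceptions M do not exceed N/ln(N)."""
    return n_exceptions <= tolerance_threshold(n_total)


# Hypothetical counts, for illustration only.
ratio = class_ratio(n_ar=900, n_er=60, n_ir=40)
threshold = tolerance_threshold(1000)
productive = is_productive(n_total=1000, n_exceptions=100)
```

Because N/ln(N) grows more slowly than N, the share of exceptions that a rule can tolerate shrinks as the vocabulary grows, which is what links this estimate to verb vocabulary size.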

6 Analyses: Productivity of -ar
6.1 Simulation results: Portuguese, Corpus do Português
6.2 Simulation results: Portuguese, Colonia
6.3 Simulation results: Italian, Google Ngram
6.4 Simulation results: Italian, DiaCoris
6.5 Simulation results: Spanish, Google Ngram

6.6 Simulation results: Spanish, IMPACT-es

7 Relationship between Verb Vocabulary Size and Productivity
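The relationship in this section is reported with r and p values. As a minimal sketch of how such a pair can be obtained for two per-epoch series, the code below computes a Pearson correlation coefficient and a two-sided permutation p-value. Both choices are assumptions: the surviving text does not say which correlation coefficient or significance test was used, and the function names and toy series are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def permutation_p(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy series standing in for per-epoch vocabulary size and productivity.
vocab_size = [1, 2, 3, 4, 5, 6, 7, 8]
productivity = [2 * v + 1 for v in vocab_size]
r = pearson_r(vocab_size, productivity)
p = permutation_p(vocab_size, productivity)
```

A permutation test is used here only because it needs no distributional assumptions and no third-party library; a parametric t-test on r would be an equally reasonable choice.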

8 Statistical evaluation of the changepoint of verb vocabulary growth

9 Artefact considerations

9.1 Corpus representativeness
9.2 Tagging accuracy and consistency


10 Conclusion
