Dynamic Finite-State Transducer Composition with Look-Ahead for Very-Large-Scale Speech Recognition
1 Dynamic Finite-State Transducer Composition with Look-Ahead for Very-Large-Scale Speech Recognition
Cyril Allauzen, Ciprian Chelba, Boulos Harb, Michael Riley, Johan Schalkwyk
Aug 19, 2010
2 Weighted Finite-State Transducers in Speech Recognition - I
WFSTs are a general and efficient representation for many speech and NLP problems; see Mohri, et al., "Speech recognition with weighted finite-state transducers," in Handbook of Speech Processing, Springer.
In ASR, they have been used to:
- Represent models:
  - G: n-gram language model (automaton over words)
  - L: pronunciation lexicon (transducer from CI phones to words)
  - C: context dependency (transducer from CD phones to CI phones)
- Combine and optimize models:
  - Composition: computes the relational composition of two transducers.
  - Epsilon removal: finds an equivalent WFST with no ε-transitions.
  - Determinization: finds an equivalent WFST that has no identically labeled transitions leaving a state.
  - Minimization: finds an equivalent deterministic WFST with the fewest states and arcs.
3 Weighted Finite-State Transducers in Speech Recognition - II
Advantages:
- Uniform data representation
- General, efficient, mathematically well-defined and reusable combination and optimization operations
- Variant systems realized in data, not code
OpenFst, an open-source finite-state transducer library, was used for this work. It is released under the Apache license and used in many speech and NLP applications.
4 Weighted Acceptors
Finite automata with labels and weights.
Example: word pronunciation acceptor (arcs listed as source → destination: label/weight):
  0 → 1: d/1
  1 → 2: ey/0.5
  1 → 2: ae/0.5
  2 → 3: t/0.3
  2 → 3: dx/0.7
  3 → 4: ax/1
5 Weighted Transducers
Finite automata with input labels, output labels, and weights.
Example: word pronunciation transducer (arcs listed as source → destination: input:output/weight; states 4 and 6 are final with weight 0):
  0 → 1: d:data/1
  1 → 2: ey:ε/0.5
  1 → 2: ae:ε/0.5
  2 → 3: t:ε/0.3
  2 → 3: dx:ε/0.7
  3 → 4: ax:ε/1
  0 → 5: d:dew/1
  5 → 6: uw:ε/1
L: closed union of $|V|$ word pronunciation transducers.
G: an n-gram model is a WFSA with (at most) $|V|^{n-1}$ states.
6 Context-Dependent Triphone Transducer C
[Figure: triphone context-dependency transducer over the two-phone alphabet {x, y}. States are phone pairs: (ε,*), (x,x), (x,y), (y,x), (y,y), (x,ε), (y,ε). Arcs are labeled phone:phone/left_right, for example x:x/y_x for phone x with left context y and right context x.]
7 Recognition Transducer Construction
The models C, L and G can be combined and optimized with weighted finite-state composition and determinization as:
  $C \circ \det(L \circ G)$   (1)
An alternative construction, producing an equivalent transducer, is:
  $(C \circ \det(L)) \circ G$   (2)
If G is deterministic, Eq. 2 can be as efficient as Eq. 1 while avoiding the determinization of $L \circ G$, which greatly saves time and memory and allows fast dynamic combination (useful in applications). However, standard composition presents three problems with Eq. 2:
1. Determinization of L moves the word labels back, which delays matching and creates (possibly very many) useless composition paths.
2. The delayed word labels in L produce a much larger composed machine when G is an n-gram LM.
3. The delayed word labels push the grammar weights back along paths in the composed machine, to the detriment of ASR pruning.
8 Composition Example
[Figure: top row shows L, det(L) and G; bottom row shows $L \circ G$ and $\det(L) \circ G$. L is a lexicon for the words red, read, reed, road and rode, with the word label on the first arc (r:red, r:read, ...). In det(L) the first arc is r:ε and the word labels are delayed to the final d arcs (d:red, d:read, d:reed, d:road, d:rode). G accepts red/0.6 and read/0.4. In $\det(L) \circ G$ the ε-paths through iy and ao are explored even though only some of them lead to a word accepted by G.]
9 Definitions and Notation: Paths
Path $\pi$:
- Origin or previous state: $p[\pi]$.
- Destination or next state: $n[\pi]$.
- Input label: $i[\pi]$.
- Output label: $o[\pi]$.
- Pictured as $p[\pi] \xrightarrow{\,i[\pi]:o[\pi]\,} n[\pi]$.
Sets of paths:
- $P(R_1, R_2)$: set of all paths from $R_1 \subseteq Q$ to $R_2 \subseteq Q$.
- $P(R_1, x, R_2)$: paths in $P(R_1, R_2)$ with input label $x$.
- $P(R_1, x, y, R_2)$: paths in $P(R_1, x, R_2)$ with output label $y$.
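10 Definitions and Notation: Transducers
- Alphabets: input $A$, output $B$.
- States: $Q$, initial states $I$, final states $F$.
- Transitions: $E \subseteq Q \times (A \cup \{\epsilon\}) \times (B \cup \{\epsilon\}) \times K \times Q$.
- Weight functions: initial weight function $\lambda : I \to K$, final weight function $\rho : F \to K$.
Transducer $T = (A, B, Q, I, F, E, \lambda, \rho)$ with, for all $x \in A^*$, $y \in B^*$:
  $[[T]](x, y) = \bigoplus_{\pi \in P(I, x, y, F)} \lambda(p[\pi]) \otimes w[\pi] \otimes \rho(n[\pi])$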
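11 Semirings
A semiring $(K, \oplus, \otimes, \bar{0}, \bar{1})$ is a ring that may lack negation.
- Sum ($\oplus$): computes the weight of a sequence (sum of the weights of the paths labeled with that sequence).
- Product ($\otimes$): computes the weight of a path (product of the weights of its constituent transitions).

  Semiring      Set                                  ⊕        ⊗    0̄     1̄
  Boolean       {0, 1}                               ∨        ∧    0     1
  Probability   $\mathbb{R}_+$                       +        ×    0     1
  Log           $\mathbb{R} \cup \{-\infty, +\infty\}$   ⊕_log    +    +∞    0
  Tropical      $\mathbb{R} \cup \{-\infty, +\infty\}$   min      +    +∞    0
  String        $B^* \cup \{\infty\}$                lcp      ·    ∞     ε

with $\oplus_{\log}$ defined by: $x \oplus_{\log} y = -\log(e^{-x} + e^{-y})$.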
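The tropical and log rows of the table can be sketched in code as follows. This is a minimal illustration, not OpenFst's weight classes: the struct names `Tropical` and `Log` and their static methods are assumptions made for this example, and weights are taken to be negative log probabilities stored as `double`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Weights are interpreted as negative log probabilities (an assumption of this sketch).
const double kInf = std::numeric_limits<double>::infinity();

// Tropical semiring: plus = min, times = +, zero = +inf, one = 0.
struct Tropical {
  static double Zero() { return kInf; }
  static double One() { return 0.0; }
  static double Plus(double x, double y) { return std::min(x, y); }
  static double Times(double x, double y) { return x + y; }
};

// Log semiring: plus is x (+)_log y = -log(e^-x + e^-y), times = +.
struct Log {
  static double Zero() { return kInf; }
  static double One() { return 0.0; }
  static double Plus(double x, double y) {
    if (x == kInf) return y;  // Zero is the identity for plus
    if (y == kInf) return x;
    const double lo = std::min(x, y);
    const double hi = std::max(x, y);
    // Numerically stable rewrite: -log(e^-lo + e^-hi) = lo - log(1 + e^(lo - hi)).
    return lo - std::log1p(std::exp(lo - hi));
  }
  static double Times(double x, double y) { return x + y; }
};
```

The two semirings share zero, one and times; they differ only in plus, which is why the tropical semiring keeps the best path while the log semiring sums over all paths.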
12 (ε-free) Composition Algorithm
- States: $(q_1, q_2)$ with $q_1$ in $T_1$ and $q_2$ in $T_2$.
- Transitions: for each transition $e_1$ leaving $q_1$ and $e_2$ leaving $q_2$ such that $o[e_1] = i[e_2]$:
  $((q_1, q_2),\ i[e_1],\ o[e_2],\ w[e_1] \otimes w[e_2],\ (n[e_1], n[e_2]))$
13 Generalized Composition Algorithm
Composition filter: $\Phi = (T_1, T_2, Q_3, i_3, \bot, \varphi)$
- $Q_3$: set of filter states, with $i_3$ the initial and $\bot$ the final (blocking) filter state.
- $\varphi : (e_1, e_2, q_3) \mapsto (e_1', e_2', q_3')$: transition filter.
Algorithm:
- States: $(q_1, q_2, q_3)$ with $q_1$ in $T_1$, $q_2$ in $T_2$ and $q_3$ a filter state.
- Transitions: for each transition $e_1$ leaving $q_1$ and $e_2$ leaving $q_2$ such that $\varphi(e_1, e_2, q_3) = (e_1', e_2', q_3')$ with $q_3' \neq \bot$:
  $((q_1, q_2, q_3),\ i[e_1'],\ o[e_2'],\ w[e_1'] \otimes w[e_2'],\ (n[e_1'], n[e_2'], q_3'))$
Trivial filter $\Phi_{trivial}$: allows all matching paths.
- $Q_3 = \{0, \bot\}$, $i_3 = 0$ and
  $\varphi(e_1, e_2, 0) = (e_1, e_2, 0)$ if $o[e_1] = i[e_2]$, and $(e_1, e_2, \bot)$ otherwise.
- This gives the basic ε-free composition algorithm.
14 Pseudo-code
Weighted-Composition($T_1$, $T_2$)
  $Q \leftarrow I \leftarrow S \leftarrow I_1 \times I_2 \times \{i_3\}$
  while $S \neq \emptyset$ do
    $(q_1, q_2, q_3) \leftarrow$ Head($S$)
    Dequeue($S$)
    if $(q_1, q_2, q_3) \in F_1 \times F_2 \times Q_3$ then
      $F \leftarrow F \cup \{(q_1, q_2, q_3)\}$
      $\rho(q_1, q_2, q_3) \leftarrow \rho_1(q_1) \otimes \rho_2(q_2) \otimes \rho_3(q_3)$
    $M \leftarrow \{(e_1, e_2) \in E^L[q_1] \times E^L[q_2]$ such that $\varphi(e_1, e_2, q_3) = (e_1', e_2', q_3')$ with $q_3' \neq \bot\}$
    for each $(e_1, e_2) \in M$ do
      $(e_1', e_2', q_3') \leftarrow \varphi(e_1, e_2, q_3)$
      if $(n[e_1'], n[e_2'], q_3') \notin Q$ then
        $Q \leftarrow Q \cup \{(n[e_1'], n[e_2'], q_3')\}$
        Enqueue($S$, $(n[e_1'], n[e_2'], q_3')$)
      $E \leftarrow E \cup \{((q_1, q_2, q_3),\ i[e_1'],\ o[e_2'],\ w[e_1'] \otimes w[e_2'],\ (n[e_1'], n[e_2'], q_3'))\}$
  return $T$
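The queue-driven loop of the pseudo-code can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the OpenFst implementation: the types `Arc` and `Fst`, the function `Compose` and the constant `kBlock` are hypothetical names; weights are in the tropical semiring (times is `+`); the filter returns only the next filter state and leaves the transitions unchanged; no ε self-loops are added, so the inputs are assumed ε-free; and the final filter weight is taken to be one.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <tuple>
#include <vector>

struct Arc { int ilabel; int olabel; double weight; int next; };

struct Fst {
  int start = 0;
  std::vector<std::vector<Arc>> arcs;  // arcs[q]: transitions leaving state q
  std::map<int, double> final_weight;  // present only for final states
};

const int kBlock = -1;  // stands for the blocking filter state

// Transition filter: returns the next filter state, or kBlock to reject the pair.
using Filter = std::function<int(const Arc&, const Arc&, int)>;

// Trivial filter: a single filter state 0; accept exactly when labels match.
int TrivialFilter(const Arc& e1, const Arc& e2, int /*q3*/) {
  return e1.olabel == e2.ilabel ? 0 : kBlock;
}

Fst Compose(const Fst& t1, const Fst& t2, const Filter& phi, int i3 = 0) {
  using Triple = std::tuple<int, int, int>;
  std::map<Triple, int> id;   // discovered (q1, q2, q3) -> output state id
  std::deque<Triple> queue;   // states still to expand
  Fst out;
  auto state_id = [&](const Triple& t) {
    auto it = id.find(t);
    if (it != id.end()) return it->second;
    const int s = static_cast<int>(out.arcs.size());
    id.emplace(t, s);
    out.arcs.emplace_back();
    queue.push_back(t);       // first visit: schedule for expansion
    return s;
  };
  out.start = state_id(Triple{t1.start, t2.start, i3});
  while (!queue.empty()) {
    const Triple t = queue.front();
    queue.pop_front();
    int q1, q2, q3;
    std::tie(q1, q2, q3) = t;
    const int src = id[t];
    auto f1 = t1.final_weight.find(q1);
    auto f2 = t2.final_weight.find(q2);
    if (f1 != t1.final_weight.end() && f2 != t2.final_weight.end())
      out.final_weight[src] = f1->second + f2->second;  // tropical times
    for (const Arc& e1 : t1.arcs[q1]) {
      for (const Arc& e2 : t2.arcs[q2]) {
        const int q3_next = phi(e1, e2, q3);
        if (q3_next == kBlock) continue;  // pair rejected by the filter
        const int dst = state_id(Triple{e1.next, e2.next, q3_next});
        out.arcs[src].push_back(
            {e1.ilabel, e2.olabel, e1.weight + e2.weight, dst});
      }
    }
  }
  return out;
}
```

Because only reachable triples are ever enqueued, the output is built lazily from the start state, which is the property that dynamic composition relies on. Swapping `TrivialFilter` for another function of the same shape changes which paths survive without touching the loop.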
15 Epsilon-Matching Filter
An ε-transition in $T_1$ (resp. $T_2$) can be matched in $T_2$ (resp. $T_1$) by an ε-transition or by staying at the same state (as if there were ε self-loops at each state in $T_1$ and $T_2$).
Allowing all possible ε-matches leads to:
- redundant ε-paths in $T_1 \circ T_2$
- a wrong result when the semiring is non-idempotent
Filter $\Phi_{\epsilon\text{-match}}$: disallows redundant ε-paths, favoring matching actual ε-transitions.
- $Q_3 = \{0, 1, 2, \bot\}$, $i_3 = 0$ and $\varphi(e_1, e_2, q_3) = (e_1, e_2, q_3')$ where $q_3'$ is:
  - $0$ if $(o[e_1], i[e_2]) = (x, x)$ with $x \in B$,
  - $0$ if $(o[e_1], i[e_2]) = (\epsilon, \epsilon)$ and $q_3 = 0$,
  - $1$ if $(o[e_1], i[e_2]) = (\epsilon^L, \epsilon)$ and $q_3 \neq 2$,
  - $2$ if $(o[e_1], i[e_2]) = (\epsilon, \epsilon^L)$ and $q_3 \neq 1$,
  - $\bot$ otherwise.
- $\epsilon^L$: label of the added self-loops.
- This gives the composition algorithm of [Mohri, Pereira and Riley, 96].
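The case analysis for $q_3'$ translates directly into a small function. The integer encodings below (`kEpsilon`, `kEpsilonL`, `kBlocked`, and positive integers for the labels in $B$) are assumptions chosen for this sketch; the slides do not specify how labels or filter states are encoded.

```cpp
#include <cassert>

const int kEpsilon = 0;    // epsilon label
const int kEpsilonL = -1;  // label of the added self-loops (hypothetical encoding)
const int kBlocked = -2;   // blocking filter state (hypothetical encoding)

// Next filter state of the epsilon-matching filter for the label pair
// (o1, i2) = (o[e1], i[e2]) seen in filter state q3, with q3 in {0, 1, 2}.
int EpsilonMatchFilter(int o1, int i2, int q3) {
  assert(q3 >= 0 && q3 <= 2);
  if (o1 > 0 && o1 == i2) return 0;                            // (x, x), x in B
  if (o1 == kEpsilon && i2 == kEpsilon && q3 == 0) return 0;   // eps matched with eps
  if (o1 == kEpsilonL && i2 == kEpsilon && q3 != 2) return 1;  // self-loop in T1, eps in T2
  if (o1 == kEpsilon && i2 == kEpsilonL && q3 != 1) return 2;  // eps in T1, self-loop in T2
  return kBlocked;
}
```

States 1 and 2 remember which side has been moving alone on ε, so that switching sides, or matching two ε-transitions after one side has already moved alone, is blocked. That leaves a single canonical path among the equivalent ε-interleavings.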
16 Label-Reachability Filter
Disallows following an ε-path in $q_1$ that will fail to reach a non-ε label matching some transition in $q_2$.
Label reachability $r : Q_1 \times B \to \{0, 1\}$:
  $r(q, x) = 1$ if there exists a path from $q$ to some $q'$ with output label $x$, and $0$ otherwise.
Filter $\Phi_{reach}$: same as $\Phi_{trivial}$ except when $o[e_1] = \epsilon$ and $i[e_2] = \epsilon^L$; then $\varphi(e_1, e_2, 0) = (e_1, e_2, q_3')$ with
  $q_3' = 0$ if there exists $e_2'$ in $q_2$ such that $r(n[e_1], i[e_2']) = 1$, and $\bot$ otherwise.
[Figure: det(L), G (red/0.6, read/0.4) and their composition with the label-reachability filter. The composed machine has states (0,0), (1,0), (2,0), (3,0), (4,1), (4,2) and arcs r:ε, eh:ε, iy:ε, d:red/0.6, d:read/0.4, d:read/0.4; the ao:ε branch, which reaches only road and rode, is no longer followed.]
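One way to compute $r$ is a depth-first search over output-ε transitions. The sketch below assumes that "a path with output label $x$" means zero or more output-ε transitions followed by one transition that outputs $x$; that reading, the names `OutArc`, `ReachableLabels` and `ExampleDetLexicon`, and the integer word ids are all assumptions of this example. The example machine is a reading of the det(L) figure from the composition example (red = 1, read = 2, reed = 3, road = 4, rode = 5).

```cpp
#include <cassert>
#include <set>
#include <vector>

struct OutArc { int olabel; int next; };  // olabel 0 stands for epsilon

using ArcTable = std::vector<std::vector<OutArc>>;

// Depth-first search along output-epsilon arcs, recording each first
// non-epsilon output label encountered.
void Collect(const ArcTable& arcs, int q, std::vector<bool>& visited,
             std::set<int>& labels) {
  if (visited[q]) return;
  visited[q] = true;
  for (const OutArc& e : arcs[q]) {
    if (e.olabel != 0) {
      labels.insert(e.olabel);
    } else {
      Collect(arcs, e.next, visited, labels);
    }
  }
}

// R_q: the set of labels x with r(q, x) = 1.
std::set<int> ReachableLabels(const ArcTable& arcs, int q) {
  std::vector<bool> visited(arcs.size(), false);
  std::set<int> labels;
  Collect(arcs, q, visited, labels);
  return labels;
}

// det(L) as read from the figure: 0 -r-> 1, then eh -> 2, iy -> 3, ao -> 5,
// and the final d arcs into state 4 carry the word labels.
ArcTable ExampleDetLexicon() {
  ArcTable arcs(6);
  arcs[0].push_back({0, 1});  // r:eps
  arcs[1].push_back({0, 2});  // eh:eps
  arcs[1].push_back({0, 3});  // iy:eps
  arcs[1].push_back({0, 5});  // ao:eps
  arcs[2].push_back({1, 4});  // d:red
  arcs[2].push_back({2, 4});  // d:read
  arcs[3].push_back({2, 4});  // d:read
  arcs[3].push_back({3, 4});  // d:reed
  arcs[5].push_back({4, 4});  // d:road
  arcs[5].push_back({5, 4});  // d:rode
  return arcs;
}
```

With G accepting only red and read, the filter would block the ao:ε arc out of state 1, because the labels reachable from state 5 (road, rode) match nothing in G.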
17 Label-Reachability Filter with Label Pushing
When matching an ε-transition $e_1$ with an $\epsilon^L$-loop $e_2$:
- if there exists a unique $e_2'$ in $q_2$ such that $r(n[e_1], i[e_2']) = 1$, then allow matching $e_1$ with $e_2'$ instead
- this gives early output of $o[e_2']$
Filter $\Phi_{push\text{-}label}$: $Q_3 = B \cup \{\epsilon, \bot\}$ and $i_3 = \epsilon$
- the filter state encodes the label that has been consumed early
[Figure: det(L), G (red/0.6, read/0.4) and their composition with label pushing. The composed machine has states (0,0,ε), (1,0,ε), (2,0,ε), (3,1,read), (4,1,ε), (4,2,ε). The arc iy:read outputs read early on the way to (3,1,read), and the following d arc is d:ε/0.4.]
18 Label-Reachability Filter with Weight Pushing
When matching an ε-transition $e_1$ with an $\epsilon^L$-loop $e_2$:
- outputs early the $\oplus$-sum of the weights of the prospective matches
Reachable weight:
  $w_r(q_1, q_2) = \bigoplus_{e \in E[q_2],\ r(q_1, i[e]) = 1} w[e]$
Filter $\Phi_{push\text{-}weight}$: $Q_3 = K$, $i_3 = \bar{1}$ and $\bot = \bar{0}$
- the filter state encodes the weight that has been output early
- if $o[e_1] = \epsilon$ and $i[e_2] = \epsilon^L$, then $q_3' = w_r(n[e_1], q_2)$ and $w[e_2'] = q_3^{-1} \otimes q_3'$
[Figure: det(L), G (red/0.6, read/0.4) and their composition with weight pushing. The composed machine has states (0,0,1), (1,0,1), (2,0,1), (3,0,0.4), (4,1,1), (4,2,1). The arc iy:ε/0.4 carries the weight of read early, and the following d:read arc carries no further weight.]
19 Implementation: Representation of r
- Point representation: $R_q = \{x \in B : r(q, x) = 1\}$
  - inefficient in time and space
- Interval representation: $I_q = \{[x, y) : x, y \in \mathbb{N},\ [x, y) \subseteq R_q,\ x - 1 \notin R_q,\ y \notin R_q\}$
  - efficiency depends on the number of intervals for each $R_q$
  - one interval per state is trivial for a tree: found by DFS
  - one interval per state is possible if C1P (the consecutive ones property) holds
    - true if L has unique pronunciations, and preserved by determinization, minimization, closure and composition with C
    - an L with multiple pronunciations typically fails C1P. However, a modification of Hsu's (2002) C1P test gives a greedy algorithm for minimizing the number of intervals per state.
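The interval representation can be illustrated with a short sketch that collapses a sorted label set $R_q$ into maximal half-open intervals and answers membership queries by binary search. The names `Interval`, `ToIntervals` and `Contains` are hypothetical; how labels are numbered so that few intervals arise (the C1P question above) is outside this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Interval = std::pair<int, int>;  // half-open interval [first, second)

// Collapses a sorted (ascending) list of labels into maximal intervals.
std::vector<Interval> ToIntervals(const std::vector<int>& sorted_labels) {
  std::vector<Interval> out;
  for (int x : sorted_labels) {
    if (!out.empty() && out.back().second == x) {
      ++out.back().second;         // x extends the current interval
    } else if (out.empty() || out.back().second < x) {
      out.emplace_back(x, x + 1);  // gap: start a new interval
    }                              // otherwise x is a duplicate: ignore
  }
  return out;
}

// Membership test: r(q, x) = 1 exactly when some interval of I_q contains x.
bool Contains(const std::vector<Interval>& intervals, int x) {
  // First interval that begins strictly after x.
  auto it = std::upper_bound(
      intervals.begin(), intervals.end(), x,
      [](int value, const Interval& iv) { return value < iv.first; });
  if (it == intervals.begin()) return false;
  --it;  // last interval that begins at or before x
  return x < it->second;
}
```

For example, the label set {1, 2, 3, 7, 8, 10} becomes three intervals, [1, 4), [7, 9) and [10, 11). When a state needs only one interval, the membership test is two comparisons.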
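20 Implementation: Efficient Computation of $w_r$
Requires fast computation of
  $s_q(x, y) = \bigoplus_{e \in E[q],\ i[e] \in [x, y)} w[e]$
for $q$ in $T_2$, and $x$ and $y$ in $B = \mathbb{N}$.
Achieved by precomputing
  $c_q(x) = \bigoplus_{e \in E[q],\ i[e] < x} w[e]$
so that
  $s_q(x, y) = c_q(y) \ominus c_q(x)$
where $\ominus$ denotes the inverse of $\oplus$ (ordinary subtraction in the probability semiring).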
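The prefix-sum idea can be sketched as follows for the probability semiring, where the semiring sum is ordinary addition and its inverse is subtraction. The class name `RangeSum` and the struct `LabeledWeight` are assumptions of this example, and the arcs of a state are assumed to be sorted by input label. Other semirings need their own inverse of the sum; the tropical semiring, for instance, has none in general.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct LabeledWeight { int ilabel; double weight; };

// Cumulative weights over the arcs leaving one state q of T2,
// in the probability semiring (plus is +, its inverse is -).
class RangeSum {
 public:
  // arcs must be sorted by ilabel in ascending order.
  explicit RangeSum(const std::vector<LabeledWeight>& arcs) : cum_(1, 0.0) {
    for (const LabeledWeight& e : arcs) {
      assert(labels_.empty() || labels_.back() <= e.ilabel);
      labels_.push_back(e.ilabel);
      cum_.push_back(cum_.back() + e.weight);
    }
  }

  // c_q(x): sum of the weights of the arcs with ilabel < x.
  double Prefix(int x) const {
    const auto k =
        std::lower_bound(labels_.begin(), labels_.end(), x) - labels_.begin();
    return cum_[k];
  }

  // s_q(x, y): sum of the weights of the arcs with ilabel in [x, y).
  double Sum(int x, int y) const { return Prefix(y) - Prefix(x); }

 private:
  std::vector<int> labels_;  // sorted input labels
  std::vector<double> cum_;  // cum_[k] = total weight of the first k arcs
};
```

With the reachable labels stored as intervals, the reachable weight of a pair of states is the sum of `Sum(x, y)` over the intervals $[x, y)$ of $I_{q_1}$, so a state with a single interval costs two binary searches and one subtraction.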
21 Composition Design - Options
Composition options:
    typedef SortedMatcher<StdFst> SM;       // matcher type
    typedef SequenceComposeFilter<Arc> CF;  // composition filter type
    ComposeFstOptions<StdArc, SM, CF> opts;
    opts.matcher1 = new SM(fst1, MATCH_NONE, kNoLabel);
    opts.matcher2 = new SM(fst2, MATCH_INPUT, kNoLabel);
    opts.filter = new CF(fst1, fst2);
    StdComposeFst cfst(fst1, fst2, opts);
22 Composition Filters
Predefined filters:

  Name                           Description
  SequenceComposeFilter          Requires FST1 epsilons to be read before FST2 epsilons
  AltSequenceComposeFilter       Requires FST2 epsilons to be read before FST1 epsilons
  MatchComposeFilter             Requires FST1 epsilons be matched with FST2 epsilons
  LookAheadComposeFilter<F>      Supports lookahead in composition
  PushWeightsComposeFilter<F>    Supports pushing weights in composition
  PushLabelsComposeFilter<F>     Supports pushing labels in composition

Three lookahead composition filters, each templated on an underlying filter F, are added. All three can be used together by cascading them.
23 Composition: Matcher Design
Matchers can find and iterate through requested labels at FST states.
Matcher form:
    template <class F>
    class Matcher {
      typedef typename F::Arc Arc;
     public:
      void SetState(StateId s);       // Specifies current state
      bool Find(Label label);         // Checks state for match to label
      bool Done() const;              // No more matches
      const Arc& Value() const;       // Current arc
      void Next();                    // Advance to next arc
      bool LookAhead(const Fst<Arc> fst, StateId s, Weight &weight);  // (Optional) lookahead
    };
A LookAhead() method, given the language (FST + initial state) to expect, is added.
24 Matchers
Predefined matchers:

  Name                        Description
  SortedMatcher               Binary search on sorted input
  RhoMatcher<M>               ρ symbol handling
  SigmaMatcher<M>             σ symbol handling
  PhiMatcher<M>               φ symbol handling
  LabelLookAheadMatcher<M>    Lookahead along epsilon paths
  ArcLookAheadMatcher<M>      Lookahead one transition

Two lookahead matchers, each templated on an underlying matcher M, are added.
Special symbol matchers:

                  Consumes no symbol    Consumes symbol
  Matches all     ε                     σ
  Matches rest    φ                     ρ
25 Recognition Experiments
Acoustic model
- Broadcast News:
  - Trained on the 96 and 97 DARPA Hub4 AM training sets
  - PLP cepstra, LDA analysis, STC
  - Triphonic, 8k tied states, 16 components per state
  - Speaker adapted (both VTLN + CMLLR)
- Spoken Query Task:
  - Trained on > 1000 hrs of voice search queries
  - PLP cepstra, LDA analysis, STC
  - Triphonic, 4k tied states, [count not recoverable] components per state
  - Speaker independent
Language model
- Broadcast News:
  - 1996 Hub4 CSR LM training sets
  - 4-gram language model pruned to 8M n-grams
- Spoken Query Task:
  - Trained on > 1B words of google.com and voice search queries
  - 1 million word vocabulary
  - Katz back-off model, pruned to various sizes
26 Recognition Experiments
Precomputation before recognition:

                                                   Broadcast News             Spoken Query Task
  Construction method                              Time      RAM     Result   Time       RAM      Result
  Static (1) with standard composition             7 min     5.3G    0.5G     10.5 min   11.2G    1.4G
  Static (2) with generalized composition          2.5 min   2.9G    0.5G     4 min      5.3G     1.4G
  Dynamic (2) with generalized composition         none      none    0.2G     none       none     0.5G

[Figure: two panels, Broadcast News and Spoken Query Task.]
27 Recognition Experiments
A small part of the recognition transducer is visited during recognition (Spoken Query Task):
- Static: number of states in the recognition transducer: 25.4M
- Dynamic: number of states visited per second: 8K
Very large language models can be used in the first pass:
[Figure: Spoken Query Task, word error rate as a function of LM size (with Ciprian Chelba and Boulos Harb). x-axis: number of n-grams, from 1e+06 to 1e+09; y-axis: word error rate.]
28 Prior Work
- Caseiro and Trancoso (IEEE Trans. on ASLP 2006): developed a specialized composition for a pronunciation lexicon L. If pronunciations are stored in a trie, then the words readable from a node form a lexicographic interval, which can be used to disallow non-coaccessible epsilon paths.
- Cheng, et al. (ICASSP 2007); Oonishi, et al. (Interspeech 2008): use methods apparently similar to ours, but many details are left unspecified, such as the representation of the reachable label sets. There are no published complexities, but the published results show a very significant overhead for dynamic composition compared to a static recognition transducer.
- Our method:
  - uses a very efficient representation of the label sets
  - uses a very efficient computation of the weight pushing
  - has a small overhead between static and dynamic composition
29 Conclusions
This work:
- Introduces a generalized composition filter for weighted finite-state composition
- Presents composition filters that:
  - Remove useless epsilon paths
  - Push forward labels
  - Push forward weights
- The combination of these filters permits the composition of large speech-recognition context-dependent lexicons and language models much more efficiently in time and space than before
- Experiments on Broadcast News and a spoken query task show a 5% to 10% overhead for dynamic, runtime composition compared to static, offline composition. To our knowledge, this is the first such system with so little overhead.
IVR Studio 3.0 Guide. May-2013. Knowlarity Product Team
IVR Studio 3.0 Guide May-2013 Knowlarity Product Team Contents IVR Studio... 4 Workstation... 4 Name & field of IVR... 4 Set CDR maintainence property... 4 Set IVR view... 4 Object properties view... 4
Markov random fields and Gibbs measures
Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).
Software Verification: Infinite-State Model Checking and Static Program
Software Verification: Infinite-State Model Checking and Static Program Analysis Dagstuhl Seminar 06081 February 19 24, 2006 Parosh Abdulla 1, Ahmed Bouajjani 2, and Markus Müller-Olm 3 1 Uppsala Universitet,
Notes on Complexity Theory Last updated: August, 2011. Lecture 1
Notes on Complexity Theory Last updated: August, 2011 Jonathan Katz Lecture 1 1 Turing Machines I assume that most students have encountered Turing machines before. (Students who have not may want to look
Honors Class (Foundations of) Informatics. Tom Verhoeff. Department of Mathematics & Computer Science Software Engineering & Technology
Honors Class (Foundations of) Informatics Tom Verhoeff Department of Mathematics & Computer Science Software Engineering & Technology www.win.tue.nl/~wstomv/edu/hci c 2011, T. Verhoeff @ TUE.NL 1/20 Information
TED-LIUM: an Automatic Speech Recognition dedicated corpus
TED-LIUM: an Automatic Speech Recognition dedicated corpus Anthony Rousseau, Paul Deléglise, Yannick Estève Laboratoire Informatique de l Université du Maine (LIUM) University of Le Mans, France [email protected]
Deterministic Finite Automata
1 Deterministic Finite Automata Definition: A deterministic finite automaton (DFA) consists of 1. a finite set of states (often denoted Q) 2. a finite set Σ of symbols (alphabet) 3. a transition function
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem
Monitoring Metric First-order Temporal Properties
Monitoring Metric First-order Temporal Properties DAVID BASIN, FELIX KLAEDTKE, SAMUEL MÜLLER, and EUGEN ZĂLINESCU, ETH Zurich Runtime monitoring is a general approach to verifying system properties at
ANIMATION a system for animation scene and contents creation, retrieval and display
ANIMATION a system for animation scene and contents creation, retrieval and display Peter L. Stanchev Kettering University ABSTRACT There is an increasing interest in the computer animation. The most of
The P versus NP Solution
The P versus NP Solution Frank Vega To cite this version: Frank Vega. The P versus NP Solution. 2015. HAL Id: hal-01143424 https://hal.archives-ouvertes.fr/hal-01143424 Submitted on 17 Apr
Finding Liveness Errors with ACO
Hong Kong, June 1-6, 2008 1 / 24 Finding Liveness Errors with ACO Francisco Chicano and Enrique Alba Motivation Motivation Nowadays software is very complex An error in a software system can imply the
3. The Junction Tree Algorithms
A Short Course on Graphical Models 3. The Junction Tree Algorithms Mark Paskin [email protected] 1 Review: conditional independence Two random variables X and Y are independent (written X Y ) iff p X ( )
VoiceXML-Based Dialogue Systems
VoiceXML-Based Dialogue Systems Pavel Cenek Laboratory of Speech and Dialogue Faculty of Informatics Masaryk University Brno Agenda Dialogue system (DS) VoiceXML Frame-based DS in general 2 Computer based
CS103B Handout 17 Winter 2007 February 26, 2007 Languages and Regular Expressions
CS103B Handout 17 Winter 2007 February 26, 2007 Languages and Regular Expressions Theory of Formal Languages In the English language, we distinguish between three different identities: letter, word, sentence.
SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis
SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: [email protected] October 17, 2015 Outline
Lecture 2: Universality
CS 710: Complexity Theory 1/21/2010 Lecture 2: Universality Instructor: Dieter van Melkebeek Scribe: Tyson Williams In this lecture, we introduce the notion of a universal machine, develop efficient universal
