Discrete Hidden Markov Model Training Based on Variable Length Particle Swarm Optimization Algorithm

Xiaobin Li (1, *2), Jiansheng Qian (1), Zhikai Zhao (1)
1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221008, China
*2. School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China, wb0817002@163.com

Abstract

Expectation Maximization (EM) can be used to train a Discrete Hidden Markov Model (DHMM), but the state number must be specified in advance, and the method may make the HMM parameters converge to a locally optimal solution. This paper puts forward a novel DHMM training method based on variable length particle swarm optimization. The method has two advantages: it optimizes the state number, and it makes the model parameters converge toward a globally optimal solution. Experiments on synthetic data sets verify the effectiveness of the method.

Keywords: Discrete Hidden Markov Model, Particle swarm optimization, Expectation maximization

1. Introduction

The Hidden Markov Model (HMM) has proven its capacity in many areas [1, 2]. Training an HMM, i.e. estimating its parameters, is the key problem in applying the model in practice. The most widely used parameter estimation method is the Baum-Welch (BW) algorithm [3]. However, the BW algorithm is slow and may converge to a locally optimal solution. In addition, the state number of the HMM must be specified in advance: too many states make the learned model overfit, and too few make it underfit. Many scholars have therefore put forward optimization algorithms for the HMM parameters, such as the genetic algorithm [4], simulated annealing [5] and particle swarm optimization [6].
At the same time, the BIC, AIC and AIC3 criteria and cross validation are used to obtain the state number of an HMM. The HMM training problem is thus divided into two sub-problems, state number optimization and parameter optimization, and the two influence each other.

This paper presents a novel algorithm based on variable length particle swarm optimization for Discrete Hidden Markov Model (DHMM) training. The algorithm borrows ideas from paper [7], which introduces a variable length particle swarm optimization algorithm for social programming. The proposed algorithm has two distinct advantages:
(1) The state number of the DHMM does not have to be specified in advance. The algorithm determines the optimal state number automatically according to the relevant information criterion.
(2) The DHMM parameters, i.e. the transition probabilities and observation probabilities, approximately converge to a globally optimal solution for the given state number rather than to a locally optimal one.

The rest of the paper is organized as follows. Section 2 briefly reviews previous work on DHMM optimization; Section 3 explains in detail the variable length particle swarm algorithm applied to DHMM training; Section 4 reports optimization experiments on synthetic data sets; Section 5 draws the conclusion.

International Journal of Digital Content Technology and its Applications (JDCTA), Volume 6, Number 20, November 2012, doi:10.4156/jdcta.vol6.issue20.20

2. Related Work

The state number of an HMM must be specified before the HMM optimization procedure is carried out. Paper [8] proposes a state number estimation method based on the BIC criterion. Paper [9] shows that the AIC criterion is more suitable for assessing the HMM state number when the sample size is small.

Paper [10] compares several state number assessment algorithms and concludes that the AIC3 criterion has the advantage. Paper [11] puts forward cross validation, instead of the various information criteria, for determining the HMM state number.

Algorithms based on EM are prone to converge to a locally optimal solution, so the PSO algorithm was introduced for HMM training [12-14]. Experimental results in paper [6] show that PSO outperforms the BW algorithm and simulated annealing (SA) for HMM training in protein multiple sequence alignment (MSA). Paper [15] proposes the immune particle swarm optimization (IPSO) algorithm for HMM training; on MSA problems, the experimental results show that IPSO not only improves sequence alignment capability but also reduces the model training cost. Paper [16] proposes a particle repair and penalty method for use during HMM training, and the experimental results show that the method is better than BW. Papers [17, 18] put forward quantum particle swarm optimization (QPSO) and apply it to HMM parameter estimation for MSA. Compared with traditional PSO and BW, this algorithm achieves a lower classification error rate on some experimental data sets. Paper [19] introduces an improved particle swarm optimization algorithm (IPSO) to optimize the HMM parameters. Paper [20] combines PSO and BW to optimize the HMM state number and probability matrices in two phases.

However, all these existing HMM training techniques share one distinct drawback: the HMM state number must be specified or calculated in advance. Only once the state number is determined can the HMM parameters be optimized by particle swarm optimization. In the classical PSO algorithm the particle length is fixed, while in several application areas the particle length needs to be variable.
Paper [7] presents a variable length particle swarm algorithm for social programming, in which particles of unfixed length represent feasible solutions. During the optimization procedure, a particle of non-optimal length evolves toward the optimal particle length with a certain probability. Paper [21] proposes a PSO based on feasible solution sets, in which the conventional arithmetic operations of particle swarm optimization are replaced with logic operations on sets; the algorithm is successfully applied to the clustering problem. In this paper we borrow the ideas of papers [7, 21] and modify the algorithm to make it suitable for Discrete Hidden Markov Model (DHMM) training.

This paper presents a variable length particle swarm optimization (VLPSO) for DHMM training. The proposed method has two obvious advantages. First, the DHMM state number does not need to be specified in advance; it is determined by the algorithm according to an information criterion during the dynamic optimization procedure. Second, the DHMM parameters converge to an approximately globally optimal solution for the given state number, instead of converging to a locally optimal solution as with the EM algorithm.

3. Variable length particle swarm optimization for DHMM training

3.1. Particle swarm optimization

Particle swarm optimization (PSO) is an evolutionary computing technique inspired mainly by the collective behaviour of bird flocks. In a flock there is no central control mechanism, and the group's behaviour is the combination of the behaviour of all its members. Each particle's action is determined mainly by two factors: its own historical experience and its social relationship with the other particles. PSO has been applied in many areas, such as DNA sequence alignment and various parameter optimization problems.
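The two influences just described, a particle's own history and the experience shared by the swarm, can be sketched as a single update step. This is only a minimal illustration under stated assumptions: the function name `pso_step`, the list-based particle representation and the default coefficient values are hypothetical choices made for the example, not taken from the paper.

```python
import random


def pso_step(x, v, p_best, g_best, w=0.7, c1=2.0, c2=2.0, rng=random):
    """Return the new (position, velocity) of one particle.

    x, v   : current position and velocity (lists of floats)
    p_best : best position this particle has visited so far
    g_best : best position any particle has visited so far
    w      : inertia coefficient (illustrative default)
    c1, c2 : local and global learning coefficients (illustrative defaults)
    """
    new_x, new_v = [], []
    for j in range(len(x)):
        # Fresh random factors in [0, 1) are drawn for every dimension.
        r1, r2 = rng.random(), rng.random()
        vj = (w * v[j]
              + c1 * r1 * (p_best[j] - x[j])   # pull toward the particle's own best
              + c2 * r2 * (g_best[j] - x[j]))  # pull toward the swarm's best
        new_v.append(vj)
        new_x.append(x[j] + vj)
    return new_x, new_v
```

A particle that already sits on both best positions with zero velocity stays where it is, which is a quick sanity check of the update.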
In the PSO algorithm a particle p is generally represented as a tuple of four elements:

$$ p = \big(x_{ij}(t),\ v_{ij}(t),\ p_{ij}(t),\ g_{ij}(t)\big) \tag{1} $$

In the tuple, t is the optimization round, i is the particle index and j is the dimension. $x_{ij}(t)$ is the position of particle i in dimension j at round t; $v_{ij}(t)$ is the velocity of particle i in dimension j at round t; $p_{ij}(t)$ is the best position among all the positions that particle i has visited in dimension j up to round t; $g_{ij}(t)$ is the best position among all the positions that all particles have visited in dimension j up to round t. The movement of the particles through the search space is defined by the following two formulas.

$$ v_{ij}(t+1) = w\,v_{ij}(t) + c_1 r_{1j}\big(p_{ij}(t) - x_{ij}(t)\big) + c_2 r_{2j}\big(g_{ij}(t) - x_{ij}(t)\big) \tag{2} $$

$$ x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1) \tag{3} $$

Here w is the inertia coefficient. In some improved versions of PSO the coefficient is reduced gradually during the optimization, i.e. the inertia coefficient is large at the beginning of the optimization and becomes smaller later. $c_1$ and $c_2$ are the local and global learning coefficients respectively. $r_{1j}$ and $r_{2j}$ are random values regenerated for each dimension, subject to the constraint $0 \le r_{1j}, r_{2j} \le 1$.

The classical PSO algorithm runs as follows:
(1) Initialization. The algorithm produces several particles randomly and assigns them random initial positions and velocities.
(2) Assessment. The algorithm calculates the fitness of each particle in the population with a fitness function.
(3) Obtain the local optimal particle. The algorithm searches for the best solution found so far by each particle.
(4) Obtain the global optimal particle. The algorithm searches for the best solution found so far among all particles.
(5) Update the position and velocity of all particles.
(6) Return to step (2).
The algorithm keeps running until the number of rounds reaches a predetermined threshold or a satisfactory solution is found.

3.2. DHMM training based on VLPSO

For a DHMM with N states and M observed symbols, a particle of length N + N*N + N*M can be constructed.
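As a rough illustration of this encoding, the sketch below slices a flat particle of that length into a vector of N values, an N x N matrix and an N x M matrix. The function name `decode_particle` is hypothetical, and the clip-and-renormalise step is an assumption added so that every row is a valid probability distribution; the paper does not state how this constraint is enforced.

```python
def decode_particle(x, n_states, n_symbols):
    """Split a flat particle into (pi, A, B), renormalising each row."""
    N, M = n_states, n_symbols
    if len(x) != N + N * N + N * M:
        raise ValueError("particle length must be N + N*N + N*M")

    def normalise(row):
        # Assumed repair rule: clip negatives, then rescale the row to sum to one.
        row = [max(value, 0.0) for value in row]
        total = sum(row)
        if total == 0:
            return [1.0 / len(row)] * len(row)  # fall back to a uniform row
        return [value / total for value in row]

    pi = normalise(x[:N])                                   # first N entries
    A = [normalise(x[N + i * N: N + (i + 1) * N])           # next N*N entries
         for i in range(N)]
    offset = N + N * N
    B = [normalise(x[offset + i * M: offset + (i + 1) * M])  # last N*M entries
         for i in range(N)]
    return pi, A, B
```

With N = 2 and M = 4 the particle has 2 + 4 + 8 = 14 entries, so changing N changes the particle length, which is why a variable length swarm is needed when N itself is being optimised.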
The plus signs divide a particle into three parts: the first part holds the initial probability distribution over the N states, the second part holds the HMM transition probability distribution A, and the third part holds the HMM observation probability distribution B. When the sequence is known, the number of observed symbols M is easily identified, but the state number N is difficult to determine. Here we put forward a VLPSO algorithm to optimize the HMM. A particle in the VLPSO algorithm is represented by a tuple of five elements:

$$ p = \big(x_{ij}(t),\ v_{ij}(t),\ p_{ij}(t),\ p_{gj}(t),\ p_{aj}(t)\big) \tag{4} $$

In the tuple, t is the current optimization round, i is the particle index and j is the dimension. $x_{ij}(t)$ is the position of particle i in dimension j at round t; $v_{ij}(t)$ is the velocity of particle i in dimension j at round t; $p_{ij}(t)$ is the best position among all the positions that particle i has visited in dimension j up to round t; $p_{gj}(t)$ is the best position among all the positions that all particles have visited in dimension j up to round t; $p_{aj}(t)$ is the best position among all the positions that all particles have visited up to round t. The movement of the particles through the search space is described by the following two formulas.

$$ v_{ij}(t+1) = w\,v_{ij}(t) + c_1 r_{1j}\big(p_{ij}(t) - x_{ij}(t)\big) + c_2 r_{2j}\big(p_{gj}(t) - x_{ij}(t)\big) \tag{5} $$

$$ x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1) \tag{6} $$

In these formulas the coefficients have the same meaning as in the classical PSO algorithm. In addition to the changes of velocity and position, the VLPSO algorithm contains a dimension alteration step. Let $n_i^{(t)} = \mathrm{State}(x_i(t))$ be the state number of particle i at round t, and let $n_g^{(t)} = \mathrm{State}(p_a(t))$ be the globally optimal state number. At round t+1 the state number of the particle can move toward the globally optimal state number:

$$ n_i^{(t+1)} \rightarrow n_g^{(t)} \tag{7} $$

The alteration is carried out as follows:

$$ n_i^{(t+1)} = n_i^{(t)} + r\,\big(n_g^{(t)} - n_i^{(t)}\big) \tag{8} $$

where r is a generated random number with $0 \le r \le 1$. When the state number of the global best particle is greater than that of the particle being assessed, the particle increases its state number; otherwise the particle reduces it. Once the new state number is determined, the position and velocity of the particle are reinitialized, and the local optimal particles and the global optimal particle are updated accordingly.

The main procedure of the VLPSO algorithm is as follows.
(1) Initialization. The algorithm produces several particles randomly and assigns them random initial positions and velocities.
(2) Assessment. The algorithm calculates the fitness of each particle in the population with a fitness function.
(3) Obtain the local optimal particle. The algorithm searches for the best solution found so far by each particle with the same dimension.
(4) Obtain the global optimal particle. The algorithm searches for the best solution found so far among all particles.
(5) Update the position and velocity of all particles.
(6) Update the dimension of all particles.
(7) Return to step (2).
The algorithm keeps running until the number of rounds reaches a predetermined threshold or a satisfactory solution is found.

3.2.1. Fitness function

The forward procedure can be used to evaluate how well a given sequence fits an HMM. First, a forward variable $\alpha_t(i) = P(O_1 O_2 \ldots O_t,\ q_t = S_i \mid \lambda)$ is defined as the probability of the partial observation sequence $O_1 O_2 \ldots O_t$ (until time t) and of state $S_i$ at time t, given the model $\lambda$. The forward variables are calculated as follows:

(1) Initialization:

$$ \alpha_1(i) = \pi_i\, b_i(O_1), \quad 1 \le i \le N \tag{9} $$

(2) Induction:

$$ \alpha_{t+1}(j) = \Big[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\Big]\, b_j(O_{t+1}), \quad 1 \le t \le T-1,\ 1 \le j \le N \tag{10} $$

(3) Termination:

$$ P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i) \tag{11} $$

Because the value of $P(O \mid \lambda)$ is extremely small, precision is easily lost when it is used directly as the fitness function. The log-likelihood $\mathrm{Fitness}(x_i) = \log P(O \mid \lambda)$ is therefore usually applied to assess the fitness of the HMM. For multiple observed sequences, the likelihood (12) is introduced.

$$ \mathrm{Fitness}(x_i) = \frac{1}{L} \sum_{l=1}^{L} \log P\big(O^{(l)} \mid \lambda\big), \quad O = [o_1, o_2, \ldots, o_t, \ldots, o_T] \tag{12} $$

The evaluation function above does not take the influence of the HMM state number into account. When the VLPSO algorithm is used for HMM training, the state number has to be considered at the same time, so it should be incorporated into the likelihood evaluation function. Several standard criteria for the fitting capacity of a statistical model are available, such as AIC and BIC. For the likelihood evaluation of a DHMM, the number of observed symbols M of a known sequence is constant, so this parameter does not need to be taken into consideration. The DHMM state number N, in contrast, is a free variable, so it is the only quantity that needs to be considered.

3.2.2. Particle dimension adjusting probability

In step (6) of the VLPSO algorithm, the particle dimension is altered with an adjusting probability $P_{adjust}$ after several rounds with a fixed dimension. Three strategies are proposed here to determine the adjusting probability $P_{adjust}$.

(1) VLPSO_FIX, the first strategy, specifies a fixed constant probability c of dimension adjusting.

$$ P_{adjust}(t) = c \quad (0 \le c \le 1) \tag{13} $$

(2) VLPSO_LIN, the second strategy, increases the probability linearly with the round number. Let $P_{min}$ be the minimum dimension adjusting probability, $P_{max}$ the maximum dimension adjusting probability, R the total number of rounds designed and t the number of rounds elapsed since dimension adjusting began.
$$ P_{adjust}(t) = P_{min} + \frac{P_{max} - P_{min}}{R}\, t \tag{14} $$

In this strategy, particles keep their dimension until dimension adjusting begins; afterwards their dimension gradually approaches the dimension of the global optimal particle.

(3) VLPSO_EXP, the third strategy, varies $P_{adjust}$ according to the difference between the fitness values of the global optimal particle in two successive rounds. Let $p_g(t)$ and $p_g(t-1)$ be the global optimal particles in two successive rounds.

$$ P_{adjust}(t+1) = \exp\Big(-1 \cdot \big(\mathrm{Fitness}(p_g(t)) - \mathrm{Fitness}(p_g(t-1))\big)\Big) \tag{15} $$

In this strategy, when the fitness difference between the global optimal particles of two successive rounds is large, the dimension adjusting probability $P_{adjust}$ is small. At the beginning, therefore, the particle dimension has little opportunity to adjust. Later the fitness difference between successive rounds becomes smaller and the dimension adjusting probability becomes larger, i.e. there is more opportunity to adjust the particle dimension.

4. Experiments

4.1. Data set

In order to evaluate the proposed DHMM training method accurately, synthetic data sets are generated with Murphy's HMM toolbox. Four data sets (HMM1, HMM2, HMM3, HMM4) are generated, each containing 100 sequences of length 100.

HMM1 has 2 states and 4 observed symbols, and its parameters are assigned as below:
π = [0.5,0.5];
A = [0.2,0.8;
     0.3,0.7];
B = [0.2,0.4,0.3,0.1;
     0.1,0.5,0.2,0.2];

HMM2 has 4 states and 6 observed symbols, and its parameters are assigned as below:
π = [0.25,0.25,0.25,0.25];
A = [0.2,0.6,0.1,0.1;
     0.3,0.2,0.2,0.3;
     0.3,0.2,0.2,0.3;
     0.3,0.2,0.2,0.3];
B = [0.2,0.4,0.1,0.1,0.1,0.1;
     0.1,0.2,0.2,0.2,0.1,0.2;
     0.3,0.2,0.2,0.1,0.1,0.1;
     0.3,0.2,0.2,0.1,0.1,0.1];

HMM3 has 6 states and 8 observed symbols, and its parameters are assigned as below:
π = [0.25,0.25,0.2,0.1,0.1,0.1];
A = [0.1,0.1,0.1,0.1,0.1,0.5;
     0.1,0.1,0.1,0.1,0.5,0.1;
     0.1,0.1,0.1,0.5,0.1,0.1;
     0.1,0.1,0.5,0.1,0.1,0.1;
     0.1,0.5,0.1,0.1,0.1,0.1;
     0.5,0.1,0.1,0.1,0.1,0.1];
B = [0.3,0.1,0.1,0.1,0.1,0.1,0.1,0.1;
     0.1,0.3,0.1,0.1,0.1,0.1,0.1,0.1;
     0.2,0.2,0.1,0.1,0.1,0.1,0.1,0.1;
     0.15,0.25,0.1,0.1,0.1,0.1,0.1,0.1;
     0.25,0.15,0.1,0.1,0.1,0.1,0.1,0.1;
     0.0,0.4,0.1,0.1,0.1,0.1,0.1,0.1];

HMM4 has 8 states and 8 observed symbols, and its parameters are assigned as below:
π = [0.25,0.1,0.15,0.1,0.1,0.1,0.1,0.1]';
A = [0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.3;
     0.1,0.1,0.1,0.1,0.1,0.1,0.3,0.1;
     0.1,0.1,0.1,0.1,0.1,0.3,0.1,0.1;
     0.1,0.1,0.1,0.1,0.3,0.1,0.1,0.1;
     0.1,0.1,0.1,0.3,0.1,0.1,0.1,0.1;
     0.1,0.1,0.3,0.1,0.1,0.1,0.1,0.1;
     0.1,0.3,0.1,0.1,0.1,0.1,0.1,0.1;
     0.3,0.1,0.1,0.1,0.1,0.1,0.1,0.1];
B = [0.3,0.1,0.1,0.1,0.1,0.1,0.1,0.1;
     0.1,0.3,0.1,0.1,0.1,0.1,0.1,0.1;
     0.1,0.1,0.3,0.1,0.1,0.1,0.1,0.1;
     0.1,0.1,0.1,0.3,0.1,0.1,0.1,0.1;
     0.1,0.1,0.1,0.1,0.3,0.1,0.1,0.1;
     0.1,0.1,0.1,0.1,0.1,0.3,0.1,0.1;
     0.1,0.1,0.1,0.1,0.1,0.1,0.3,0.1;
     0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.3];

4.2. Algorithm parameters

Four algorithms are applied to train the HMM and are compared with each other: the BW algorithm, VLPSO_FIX, VLPSO_LIN and VLPSO_EXP. The main parameters of the BW algorithm are the state number and the iteration number. For convenience of calculation and comparison, the state number in the BW algorithm is set directly to the state number used to generate the synthetic data sets. The parameters of the VLPSO algorithms are listed in Table 1.

Table 1. Algorithm parameters of VLPSO

Parameter   VLPSO_FIX   VLPSO_LIN   VLPSO_EXP
N           100         100         100
R           40          40          40
Radjust     30          30          30
wstart      0.95        0.95        0.95
wend        0.2         0.2         0.2
C1          2           2           2
C2          2           2           2
Padjust     0.5         -           -
Pmax        -           1.0         -
Pmin        -           0.5         -

Among these parameters, N is the number of particles, R is the total number of rounds, Radjust is the round at which the particle dimension begins to adjust, wstart and wend are the initial and final velocity inertia weights, C1 and C2 are the learning factors in the local and global solution spaces respectively, Padjust is the particle dimension adjusting probability, Pmax is the maximum dimension adjusting probability and Pmin is the minimum dimension adjusting probability.

4.3. State number recognition

Here we apply the above algorithms to the data sets and summarize how often the state number is recognized correctly. Table 2 lists only the rate of correctly recognized state numbers. VLPSO_LIN and VLPSO_EXP give similar results.

Table 2. Percentage of correctly recognized state numbers (%)

        BIC   AIC   AICC   AIC3   CAIC   AICu   HQC
HMM1    72    32    35     52     75     56     45
HMM2    70    30    34     51     75     55     40
HMM3    69    28    32     48     73     55     38
HMM4    65    28    30     47     70     51     37

The experimental results show that BIC and CAIC are well suited for evaluating the likelihood, with 65% to 75% of state numbers recognized correctly. As the state number of the sample increases, however, the percentage of correctly recognized state numbers declines.

4.4. Likelihood comparison

We compare the likelihoods obtained with the four algorithms. Table 3 lists only the likelihood comparison between VLPSO_FIX and BW. For the BW algorithm, the state number used to generate each data set is applied directly.

Table 3. Log-likelihood under correctly recognized state numbers (×10^2)

        BW      BIC     AIC     AICC    AIC3    CAIC    AICu    HQC
HMM1    -1.28   -1.23   -1.23   -1.24   -1.23   -1.23   -1.23   -1.24
HMM2    -1.65   -1.62   -1.64   -1.63   -1.64   -1.60   -1.65   -1.63
HMM3    -1.88   -1.84   -1.84   -1.85   -1.86   -1.86   -1.87   -1.85
HMM4    -1.91   -1.84   -1.89   -1.88   -1.85   -1.89   -1.88   -1.84

The experimental results show that, under every evaluation criterion, the likelihood obtained by the VLPSO algorithm is at least as high as that obtained by the BW algorithm.
The other two algorithms, VLPSO_LIN and VLPSO_EXP, lead to similar conclusions.
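The log-likelihood values compared in this section come from the forward procedure of Section 3.2.1. The sketch below shows one way such a value might be computed for a single symbol sequence, with the forward variables rescaled at every step so that long sequences do not underflow. The function names, the BIC-style penalty and the free-parameter count used in `bic_fitness` are assumptions made for illustration; the paper does not spell out the exact form of its penalised fitness.

```python
import math


def log_likelihood(obs, pi, A, B):
    """Scaled forward pass: log P(obs | model) for one sequence of symbol indices."""
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]          # initialization
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:                                             # induction
            alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                     for j in range(N)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")                              # impossible sequence
        alpha = [a / scale for a in alpha]                    # rescale to sum to one
        log_p += math.log(scale)                              # accumulate log of the scales
    return log_p                                              # termination


def bic_fitness(sequences, pi, A, B):
    """Penalised fitness (assumed form): log-likelihood minus a BIC penalty."""
    N, M = len(pi), len(B[0])
    total = sum(log_likelihood(o, pi, A, B) for o in sequences)
    k = (N - 1) + N * (N - 1) + N * (M - 1)   # assumed count of free parameters
    n_obs = sum(len(o) for o in sequences)    # assumed sample size: total symbols
    return total - 0.5 * k * math.log(n_obs)


# HMM1 parameters from Section 4.1.
pi1 = [0.5, 0.5]
A1 = [[0.2, 0.8], [0.3, 0.7]]
B1 = [[0.2, 0.4, 0.3, 0.1], [0.1, 0.5, 0.2, 0.2]]
print(log_likelihood([0, 1], pi1, A1, B1))
```

Because the penalty grows with N, a particle with more states has to earn its extra parameters through a higher likelihood, which is what lets a criterion of this kind select the state number.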

4.5. Evolution of state number

[Figure 1. Number of particles according to state number. Panels: HMM1, HMM2, HMM3, HMM4]

Figure 1 illustrates the evolution of the state number under the BIC criterion with VLPSO_FIX. The figure has four sub-figures, one for each of the four data sets. Three lines are plotted in each sub-figure, corresponding to rounds 30, 35 and 40. The other two algorithms give similar results. The experimental results show that before dimension adjusting begins (round 30), each particle is optimized with a constant state number. Once the dimension adjusting process starts, all particles gradually alter their dimension toward the dimension of the global optimal particle.

5. Conclusion

Each iteration of the BW algorithm increases the likelihood on the data set, so the final result may be a local rather than a global maximum. Moreover, an initial state number must be specified in advance, which is impossible when there is little prior knowledge of the data set. This paper presents the VLPSO algorithm, an improved version of the traditional fixed length particle swarm algorithm, for DHMM training. From the experiments on synthetic data sets we conclude that the proposed VLPSO algorithm has two distinct advantages: first, the DHMM state number does not need to be specified in advance but is determined by the dynamic optimization; second, the DHMM parameters approximately converge to the globally optimal solution for the given state number rather than to a locally optimal solution.
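The dynamic selection of the state number summarised above can be sketched as two small functions: one returning the dimension adjusting probability under the three strategies of Section 3.2.2, and one moving a particle's state number toward that of the global best particle. The function names, the rounding of the state number to an integer and the cap of the exponential probability at 1 are assumptions of this sketch rather than details given in the paper.

```python
import math
import random


def adjust_probability(strategy, t, total_rounds, c=0.5, p_min=0.5, p_max=1.0,
                       fit_now=None, fit_prev=None):
    """Dimension adjusting probability for the FIX, LIN and EXP strategies."""
    if strategy == "FIX":
        return c                                             # constant probability
    if strategy == "LIN":
        return p_min + (p_max - p_min) / total_rounds * t    # linear growth in t
    if strategy == "EXP":
        # Small improvement of the global best fitness -> probability close to 1.
        return min(1.0, math.exp(-(fit_now - fit_prev)))     # cap at 1 is assumed
    raise ValueError("unknown strategy: " + strategy)


def adjust_state_number(n_i, n_g, p_adjust, rng=random):
    """Move state number n_i toward the global best n_g with probability p_adjust."""
    if rng.random() >= p_adjust:
        return n_i                        # keep the current dimension this round
    r = rng.random()                      # random step fraction in [0, 1)
    return n_i + round(r * (n_g - n_i))   # rounding to an integer is assumed
```

After a change of state number the particle would still need its position and velocity reinitialised for the new length, as described in Section 3.2; that step is left out here.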

The main question that remains for the VLPSO algorithm is the selection of the fitness function, which is closely tied to the data set. How to choose a suitable fitness function and make better use of VLPSO to train a DHMM with an appropriate structure is the focus of our future research.

6. Acknowledgement

This work was supported by the Chinese National High Technology Research and Development Program 863 under Grant No 2008AA062200 and Grant No 2012AA062103, by the Jiangsu Province Production Research Foundation under Grant No BY2009114, by the Xuzhou Industry Science Program under Grant No XX10A001, and by the Jiangsu Normal University Foundation under Grant No 10XLA13. The authors also wish to thank the reviewers of this paper.

7. References

[1] Y. Yang, N. Cheng, M. Zhang, "Research on activity recognition method based on human motion trajectory features," Journal of Convergence Information Technology, vol. 7, no. 1, pp. 79-85, 2012.
[2] Y. Yao, K. Xia, Y. Wu, "Speech word recognizer based on the HMM algorithm," International Journal of Advancements in Computing Technology, vol. 3, no. 10, pp. 371-377, 2011.
[3] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[4] Y. X. Li, S. Kwong, Q. H. He, J. He, J. C. Yang, "Genetic algorithm based simultaneous optimization of feature subsets and hidden Markov model parameters for discrimination between speech and non-speech events," International Journal of Speech Technology, vol. 13, no. 2, pp. 61-73, 2010.
[5] J.-S. Lee, C. H. Park, "Hybrid Simulated Annealing and Its Application to Optimization of Hidden Markov Models for Visual Speech Recognition," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 4, pp. 1188-1196, 2010.
[6] T. K. Rasmussen, T. Krink, "Improved Hidden Markov Model training for multiple sequence alignment by a particle swarm optimization-evolutionary algorithm hybrid," Biosystems, vol. 72, no. 1-2, pp. 5-17, 2003.
[7] N. Nedjah, L. Mourelle, M. O'Neill, F. Leahy, A. Brabazon, "Grammatical Swarm: A Variable-Length Particle Swarm Algorithm," Swarm Intelligent Systems, Studies in Computational Intelligence, pp. 59-74, Springer Berlin / Heidelberg, 2006.
[8] C. Keribin, "Consistent estimation of the order of mixture models," Sankhyā: The Indian Journal of Statistics, Series A, pp. 49-66, 2000.
[9] O. Lukočienė, J. K. Vermunt, "Determining the Number of Components in Mixture Models for Hierarchical Data," Advances in Data Analysis, Data Handling and Business Intelligence, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 241-249, Springer Berlin Heidelberg, 2010.
[10] J. Dias, "Latent Class Analysis and Model Selection," From Data and Information Analysis to Knowledge Engineering, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 95-102, Springer Berlin Heidelberg, 2006.
[11] G. Celeux, J. B. Durand, "Selecting hidden Markov model state number with cross-validated likelihood," Computational Statistics, vol. 23, no. 4, pp. 541-564, 2008.
[12] S. Aupetit, N. Monmarché, M. Slimane, "Hidden Markov Models Training Using Population-based Metaheuristics," Advances in Metaheuristics for Hard Optimization, pp. 415-438, 2008.
[13] S. Phon-Amnuaisuk, "Estimating HMM Parameters Using Particle Swarm Optimisation," Applications of Evolutionary Computing, Lecture Notes in Computer Science, pp. 625-634, Springer Berlin / Heidelberg, 2009.
[14] J. Meng, S. Xu, X. Wang, Y. Yi, H. Liu, "Swarm-based DHMM Training and Application in Time Sequences Classification," Journal of Computational Information Systems, vol. 6, no. 1, pp. 197-203, 2010.
[15] H.-W. Ge, Y.-C. Liang, "A Hidden Markov Model and Immune Particle Swarm Optimization-Based Algorithm for Multiple Sequence Alignment," AI 2005: Advances in Artificial Intelligence, Lecture Notes in Computer Science, pp. 756-765, Springer Berlin / Heidelberg, 2005.
[16] M. Macaš, D. Novák, L. Lhotská, "Constraints in Particle Swarm Optimization of Hidden Markov Models," Intelligent Data Engineering and Automated Learning - IDEAL 2006, Lecture Notes in Computer Science, pp. 1399-1406, Springer Berlin / Heidelberg, 2006.
[17] C. Li, H. Long, Y. Ding, J. Sun, W. Xu, "Multiple Sequence Alignment by Improved Hidden Markov Model Training and Quantum-Behaved Particle Swarm Optimization," Life System Modeling and Intelligent Computing, Lecture Notes in Computer Science, pp. 358-366, Springer Berlin / Heidelberg, 2010.
[18] J. Sun, X. Wu, W. Fang, Y. Ding, H. Long, W. Xu, "Multiple sequence alignment using the Hidden Markov Model trained by an improved quantum-behaved particle swarm optimization," Information Sciences, vol. 182, no. 1, pp. 93-114, 2010.
[19] C. Wang, D. Duan, X. Wang, "An Improved PSO and HMM Algorithm for Web Information Extraction," Journal of Henan Normal University (Natural Science), vol. 38, no. 05, pp. 65-68, 2010.
[20] J. Zhu, Y. Gao, "Adaptive particle swarm optimization for hidden Markov model training," Computer Engineering and Design, vol. 31, no. 01, pp. 157-160, 2010.
[21] C. Veenhuis, "A Set-Based Particle Swarm Optimization Method," Parallel Problem Solving from Nature - PPSN X, Lecture Notes in Computer Science, pp. 971-980, Springer Berlin / Heidelberg, 2008.