Simulator of the H.264 Video Codec Based on a Behavior Study of the Configuration Parameters

Iris Correa das Chagas Linck and Arthur Tórgo Gómez

Abstract

This work presents a hybrid algorithm that simulates the behavior of the H.264/AVC video encoder/decoder, or simply H.264 CODEC, used in the Brazilian Digital Television System (BDTS). The hybrid algorithm combines two metaheuristics, Tabu Search and a Genetic Algorithm, and seeks a good configuration of the six main parameters of the H.264 CODEC. The problem is approached as a combinatorial optimization problem known as the Parties Selection Problem. The proposed algorithm searches for a solution by simulating the behavior of these configuration parameters: bit rate, frame rate, the quantization parameters of the B (Bi-predicted), P (Predicted), and I (Intra-predicted) slices, and the number of B-slices in a Group of Pictures (GOP). The solutions it generated improved the video encoded by the H.264 CODEC in terms of both image quality and compression, and the hybrid algorithm proved to be a reliable support tool for configuring an H.264 CODEC for different video formats.

Index Terms: Genetic Algorithm, H.264 Video CODEC, Metaheuristic, Tabu Search.

I. INTRODUCTION

Algorithms for compressing and decompressing video, called CODECs, have been continually improved over the last decade to meet market demands. One of the latest is the H.264/MPEG-4 AVC standard, part of the third generation of video compression technology. However, choosing the appropriate CODEC and optimizing a real-time implementation for a specific application remains a difficult challenge: an optimal CODEC design must compress more data with great efficiency using limited computational resources, which is an arduous task [1]. Many algorithms to improve the H.264 CODEC have been proposed.
A Genetic Search Algorithm was proposed by Yuelei et al. [2] and applied to the Motion Estimation module of the H.264. Yasakethu et al. [3] proposed a rate-control technique using the Video Quality Metric (VQM) with an evolution-strategy algorithm capable of identifying the quantization parameters that maximize the subjective quality of the entire video. Moriyoshi et al. [4] proposed a PC-based real-time MPEG-4 video codec with fast adaptive motion-vector search software. These studies [2], [3], [4] have in common that they improve the performance of the H.264 codec through solutions embedded in the source code of the codec. This paper takes a fresh look at the problem: instead of trying to improve the source code, it proposes a hybrid algorithm that simulates the behavior of the H.264 configuration parameters, which are represented by means of an objective function. The proposed hybrid algorithm, called Simulator Metaheuristics Applied to a H.264 Codec (SMAC), was developed based on two metaheuristics: Tabu Search and a Genetic Algorithm. The six configuration parameters optimized by the SMAC are the bit rate, the frame rate, the quantization parameters of the B-, I-, and P-frames, and the number of B-frames in a Group of Pictures (GOP). The bit rate and frame rate act primarily on the quality of the video image, while the other parameters act directly on video compression. All of these parameters are part of the objective function of the proposed algorithm.

Iris C. C. Linck is with the University of Sinos Valley (UNISINOS), São Leopoldo, Brazil (telephone: 55 51 8458-5827, e-mail: linck.iris@gmail.com). Arthur T. Gómez is with the University of Sinos Valley (UNISINOS), São Leopoldo, Brazil (e-mail: atgomezbr@gmail.com). ISBN: 978-0-9803267-5-8.
The parameters contribute to the objective function in the same degree through their unbiased weights. In summary, the objective function of the proposed algorithm behaves as an objective metric that manages two policies: image quality and video compression. When the hybrid algorithm finds a solution, that solution is used to configure the H.264 codec, which then encodes a video so that we can verify whether the encoded video reached improved image quality, measured by its PSNR (Peak Signal-to-Noise Ratio) [5], and better compression. Experiments with videos of different formats were performed, and the results showed that the encoded video improved.

II. BEHAVIOR ANALYSIS OF THE H.264 CODEC SIMULATOR

The H.264 standard contains numerous configuration parameters that give it high flexibility and greatly affect its performance; an H.264 encoder can therefore perform poorly if it is configured improperly. The aim of this step of the study was to make the proposed objective function unbiased. To that end, the configuration parameters that make up the objective function, called decision variables, had to be calibrated in terms of the weights assigned to them. The SMAC algorithm was run 100 times in this step, assigning random weights to each decision variable; the random weight values followed a normal distribution within the range 0 to 100.
After the 100 SMAC runs, the mean of the values taken by each decision variable was calculated. The decision variable with the highest mean was assigned a normalized weight equal to 1, and the weight of each other decision variable was obtained by dividing the highest mean by that variable's own mean; the weights were then rounded. This process ensures that the objective function is unbiased: all decision variables contribute to the function in the same degree. The Objective Function (OF) was defined as:

Max OF = α1·BR + α2·FR + α3·(1/QI) + α4·(1/QP) + α5·(1/QB) + α6·(1/PF)    (1)

where BR is the bit rate; FR the frame rate; QI, QP, and QB the quantization parameters of the I-, P-, and B-frames, respectively; and PF the number of B-frames within the GOP. The coefficients α1, α2, α3, α4, α5, and α6 are the unbiased weights of the decision variables, and the decision variables take random values within the ranges suggested by the ITU-T [6]. Experiments were made to validate the proposed algorithm by observing the behavior of the decision variables as the SMAC progressively increases their weights. The experiment consisted of running the SMAC 300 times: the weight of the chosen decision variable was increased by 20 units every 30 executions, and the mean values of the decision variables over each 30 executions were calculated, yielding 10 mean values after the 300 runs. Six scenarios were assembled for this experiment, one for each decision variable: in the first the bit-rate weight was increased, in the second the frame-rate weight, in the third the QI weight, in the fourth the QP weight, in the fifth the QB weight, and in the sixth the PF weight. The algorithm was run 1,800 times in total to cover all the scenarios.
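The calibration step and objective function (1) can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the sample values in the comments are assumptions.

```python
# Sketch of the unbiased-weight calibration and objective function (1).
# Names (unbiased_weights, objective) and values are illustrative.

def unbiased_weights(mean_values):
    """Assign weight 1 to the variable with the highest mean; every other
    variable gets highest_mean / own_mean, rounded (as described above)."""
    highest = max(mean_values.values())
    return {name: round(highest / mean) for name, mean in mean_values.items()}

def objective(alpha, br, fr, qi, qp, qb, pf):
    """Objective function (1): higher bit/frame rates raise OF directly;
    the quantization parameters and B-frame count enter as reciprocals."""
    return (alpha["BR"] * br + alpha["FR"] * fr
            + alpha["QI"] / qi + alpha["QP"] / qp
            + alpha["QB"] / qb + alpha["PF"] / pf)
```

A variable whose mean over the 100 calibration runs is half the largest mean thus receives weight 2, doubling its per-unit contribution and equalizing the influence of all six variables.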
The results of the experiment with the first scenario are shown in Table I.

Table I. Mean values assumed by the decision variables when BR contributes more effectively to the Objective Function.

Fig. 1 shows the behavior curves of the objective function and its decision variables in the scenario in which BR has the highest degree of influence on the Objective Function.

Fig. 1. Behavior curves of the objective function and its decision variables within the first scenario.

Graph 1 (Fig. 1) shows that the objective function increases as the bit rate rises: when the OF reaches higher values, the image quality tends to be better, in agreement with the ITU-T [6]. Graph 2 (Fig. 1) shows the behavior curve of the frame rate, which fluctuated through the iterations. This behavior occurs because the frame rate is connected with changes in the types of frames. At iteration 5 the frame rate drops sharply, at the same moment that PF reaches a peak. In other words, the number of B-frames within the GOP increased, and according to the literature this contributes to the drop in frame rate because B-frames take more time to be encoded by the H.264 codec; B-frames are also the ones that undergo the greatest compression, according to the ITU-T [6]. These behaviors show that the hybrid algorithm simulates the real behavior of the H.264 codec in agreement with the literature [6]. Graph 3 (Fig. 1) shows the behavior curve of the quantization parameter of the I-frames, QI, which tends to have small oscillations. I-frames are the ones with low compression; they are encoded by the H.264 intra-prediction module, which deals with the spatial redundancy of the image, that is, inter-pixel redundancy.
It was observed that the range of values assumed by the QI parameter is greater than the QP and QB ranges (see Table I). According to the literature, a high compression factor for treating spatial redundancy does no great harm to the encoding of a video, but the same is not true for temporal redundancy. This is therefore acceptable behavior for the proposed simulator and further evidence that it behaves like a real codec. Graphs 4 and 5 (Fig. 1) show that QP and QB tend to be inversely related. QP is related to the compression factor of the P-frames, which are used as references for the B-frames: the simulator decreases the compression of the P-frames when the B-frames have high compression, a behavior that contributes to improving the image quality in the H.264 codec. The other scenarios corroborated the results of the first scenario, but some important results should be cited where the scenarios differ. In scenario 2 it was observed that the frame rate improves the objective function, though the function values were slightly below those obtained in
scenario 1. In scenarios 3 and 4, related to QI and QP respectively, it was observed that the objective function did not reach high values when variables related to video compression contribute more to the function. In scenario 5, increasing the contribution of QB caused a slight increase in the objective function, as in scenarios 3 and 4; but when QB achieved its highest contribution within the function, it caused a drop in the objective function. Scenarios 3, 4, and 5 show that when a variable related to frame compression has a high contribution within the objective function, it produces only a small improvement in the function, because increasing the frame compression is not conducive to better image quality.

III. COMPUTATIONAL ARCHITECTURE OF THE SIMULATOR OF THE H.264 CODEC

Fig. 2 shows the computational model of the SMAC, which illustrates the architecture of the proposed hybrid algorithm. The hybrid algorithm has three modules: the first receives an initial solution as the basis for beginning the search for a better solution; the second performs the Tabu Search; and the third performs the Genetic Algorithm. A fourth module, which is not part of the hybrid algorithm itself, shows that at the end of its execution the hybrid algorithm provides a solution to be used for setting the parameters of the H.264 codec.

Fig. 2. Computing architecture of the SMAC.

Current research using metaheuristics is turning to hybrid techniques, such as Burke et al. [7], [8], because they have shown better results by exploiting the specific advantages of each method. The first module of the model in Fig. 2, called the initial solution module, provides a viable initial solution to the Tabu Search, consisting of a set of values for the six configuration parameters, i.e., the decision variables of the objective function (1).
This initial solution came from the test jig (JIGA) of the DigConv Project [9]. The test jig is a system (hardware and software) that allows complete tests of the performance of an H.264 video CODEC for Brazilian Digital TV [10], [11], by monitoring and analyzing data and parameters from content generation and video coding/decoding through to display of the decoded video. The initial solution had to fit a proper profile/level of the H.264: the simulator works within the limits of the H.264 profile regarding the range of values allowed for the configuration variables of the codec. The Tabu Search (TS) module explores the search space around the initial solution. Its objective function is (1), and its neighborhood structure consists of generating random values for the decision variables according to the values defined or suggested in [6]. The tabu list was initially set to 7 tabu moves. The stopping criterion was a maximum number of iterations without improvement of the Objective Function value (nbmax), set to 100. The twenty best solutions found by the TS are stored in an elite candidate list, which is delivered to the Genetic Algorithm when the stopping criterion is reached. The Genetic Algorithm (GA) module receives the elite candidate list from the TS and uses it to generate its initial population. Each chromosome is a set of the six decision variables of (1), and the fitness function is (1). The selection strategy was tournament selection; the reproduction strategy for generating new individuals (offspring) consisted of crossover and mutation with probabilities of 0.8 and 0.2, respectively, using the geometrical crossover operator and one-point crossover. The number of generations, used as the stopping criterion, was 100.
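The Tabu Search module described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's implementation: the parameter ranges, the toy evaluate() stand-in for objective function (1), and the random-move neighborhood details are all assumptions.

```python
import random

# Sketch of the TS module: random-move neighborhood, tabu list of 7 moves,
# stop after nbmax non-improving iterations, elite list of the 20 best
# solutions for the GA. Ranges and evaluate() are illustrative assumptions.

RANGES = {"BR": (64, 10000), "FR": (1, 60), "QI": (0, 51),
          "QP": (0, 51), "QB": (0, 51), "PF": (0, 16)}

def evaluate(sol):
    # Toy stand-in for objective function (1) with unit weights.
    return (sol["BR"] / 10000 + sol["FR"] / 60
            + 1 / (1 + sol["QI"]) + 1 / (1 + sol["QP"])
            + 1 / (1 + sol["QB"]) + 1 / (1 + sol["PF"]))

def neighbor(sol):
    """Random move: re-draw one decision variable within its range."""
    var = random.choice(sorted(sol))
    new = dict(sol)
    new[var] = random.randint(*RANGES[var])
    return new, var

def tabu_search(initial, nbmax=100, tabu_size=7, elite_size=20):
    best, current = dict(initial), dict(initial)
    tabu, elite, stall = [], [], 0
    while stall < nbmax:
        cand, move = neighbor(current)
        score = evaluate(cand)
        if move in tabu and score <= evaluate(best):
            stall += 1          # tabu move without aspiration: reject it
            continue
        tabu = (tabu + [move])[-tabu_size:]
        current = cand
        elite = sorted(elite + [(score, cand)],
                       key=lambda t: -t[0])[:elite_size]
        if score > evaluate(best):
            best, stall = cand, 0
        else:
            stall += 1
    return best, [sol for _, sol in elite]
```

The elite list returned here would seed the GA's initial population, as the text describes.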
After the complete execution of the Tabu Search and the Genetic Algorithm, two solutions are presented, one from each algorithm. These solutions are compared, and if the solution found by the Genetic Algorithm module is better than the one found by the Tabu Search module, the hybrid algorithm returns to the Tabu Search module to try to optimize the GA solution. Otherwise, the hybrid algorithm finishes and the best solution found can be used to configure the H.264 encoder.

IV. EXPERIMENTS USING THE H.264 CODEC

This experiment consists of encoding videos of different formats with the H.264 codec configured with the solutions found by the proposed simulator. The results of the encoded videos were analyzed and compared with the initial-solution results in terms of PSNR [5] and video compression. Table II lists the videos used in the experiment. The first column gives the identification of the video, the second the format the video should take
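The control flow between the two modules can be sketched as below. This is one reading of the loop described above, with the TS and GA modules injected as stand-in functions; their signatures are assumptions for illustration.

```python
# Sketch of the hybrid TS/GA control flow: run TS, feed its elite list to
# the GA, and return to TS while the GA keeps improving on the TS solution.
# ts(solution) -> (best, elite_list); ga(elite_list) -> best; score is (1).

def hybrid(initial, ts, ga, score):
    ts_best, elite = ts(initial)
    ga_best = ga(elite)
    while score(ga_best) > score(ts_best):
        # GA improved on TS: try to optimize the GA solution with TS again.
        ts_best, elite = ts(ga_best)
        ga_best = ga(elite)
    # GA no longer improves: return the best of the two final solutions.
    return ts_best if score(ts_best) >= score(ga_best) else ga_best
```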
after its encoding and its original name, the third the resolution of the video, the fourth the profile and level used by the H.264 codec, the fifth the number of video frames, and the sixth the original size of the video in its original YUV 4:2:0 format.

Table II. List of videos used in the video encoding tests.

Table III shows the coding results for the videos of Table II. In short, each video was encoded by the H.264 codec in two steps: first, the DigConv project team encoded the video using default values for the configuration parameters of the H.264 codec; second, the video was encoded using the parameters of the best solution found by the SMAC. The degree of compression and the PSNR [5] obtained from these two encodings were then compared and analyzed. Note that the PSNR is an objective metric provided by the H.264 codec at the end of the video coding: the higher the PSNR, the better the image quality. The PSNR assumes values in the range 0 to 50, and PSNR values below 30 produce an unacceptable image quality.

Table III. Results of encoding videos using the H.264 codec.

According to Table III, V02 was encoded using three different solutions found by the proposed simulator, all of them better than the initial solution. In this experiment the simulator was configured with different values of nbmax and the tabu list size. The first configuration was nbmax = 100 and tabu list size = 50; the solution found was better than the initial solution because the SMAC obtained a video compression gain of 0.93% with a PSNR loss of only 0.28%. The SMAC thus found a solution that produced a slight loss in image quality, but the loss did not lead to an unacceptable image quality (PSNR < 30).
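For reference, the PSNR metric used above follows the standard definition for 8-bit samples. The sketch below shows the textbook formula, not the H.264 reference encoder's own reporting code.

```python
import math

# Standard PSNR computation between two equal-length sequences of 8-bit
# samples (e.g. flattened luma frames). Illustrative, not codec-specific.

def psnr(original, decoded, peak=255):
    """Peak Signal-to-Noise Ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")   # identical frames: no distortion
    return 10 * math.log10(peak ** 2 / mse)
```

A larger mean squared error between the original and the decoded frame yields a lower PSNR, which is why encodings with heavier compression tend to score lower on this metric.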
The simulator achieved better video compression to compensate for the small loss in PSNR. The second configuration used nbmax = 400 and tabu list size = 100. This solution increased the PSNR by 2.02%, but the video size increased by 34.38%. In this case the SMAC considered this the best solution of all, because it prioritizes the PSNR gain within the objective function. The third configuration used nbmax = 600 and tabu list size = 100. This solution did not achieve a PSNR gain, only minimizing the loss of this metric, but it achieved a video compression gain of 34.24%; the PSNR loss did not lead to an unacceptable PSNR value. In the experiments with video V02 it could be observed that the behavior of the SMAC was to prioritize the PSNR gain: the highest objective-function value was reached when a PSNR gain was achieved, and when no gain was attained the SMAC tended to minimize the PSNR loss and then prioritize the video compression gain as a way to compensate for the loss of image quality. Table III also shows that video V10 was encoded using four different solutions found by the proposed simulator, each with a different nbmax and tabu list size. In summary, the different solutions used to encode video V10 show the different behaviors of the proposed simulator: the best solutions were those that achieved gains in both PSNR and video compression, while the worst were those with some kind of loss in one of them. This shows that the SMAC aims to maximize both the image quality and the video compression rate. The other videos, V15, V29, V33, and V34, also achieved solutions better than the initial solution and obtained some kind of gain, as shown in Table III.

V.
CONCLUSION

This work proposed a hybrid algorithm, the Simulator Metaheuristics Applied to a H.264 Codec (SMAC), that combines Tabu Search with a Genetic Algorithm in order to study the behavior of the six main configuration parameters of the H.264/AVC video CODEC. The proposed hybrid algorithm was validated through the study of the dynamic behavior of these parameters, and experiments were done with the SMAC to obtain good solutions for the H.264 codec configuration. The SMAC was validated against the patterns of parameter behavior found in the literature and proved efficient in the experiments. It was observed that the algorithm takes as the best solution the one that makes the codec gain in both image quality and video compression rate; when such a solution is not reached, the SMAC attempts to strike a balance between image quality and video compression rate. The objective function of the proposed hybrid algorithm behaves like an objective metric that manages two policies: image quality and video compression rate. The results were considered satisfactory, since the proposed algorithm always reached a solution better than the initial solution, and the proposed algorithm solves the problem of setting the parameters of the H.264 codec used in Brazilian Digital TV.
REFERENCES

[1] J. Golston, A. Rao, "Video codecs tutorial: Trade-offs with H.264, VC-1 and other advanced codecs," Embedded Systems Conference Silicon Valley, 2006.
[2] X. Yuelei, B. Duyan, M. Baixin, "A Genetic Search Algorithm for Motion Estimation," IEEE Signal Processing Proceedings, 2000.
[3] S. L. P. Yasakethu, W. A. C. Fernando, S. Adedoyin, A. Kondoz, "A Rate Control Technique for Off Line H.264/AVC Video Coding Using Subjective Quality of Video," IEEE Transactions on Consumer Electronics, vol. 54, no. 3, 2008.
[4] T. Moriyoshi, H. Shinohara, T. Miyakazi, I. Kuroda, "Real-time software video codec with a fast adaptive motion vector search," Journal of VLSI Signal Processing, vol. 29, pp. 239-245, 2000.
[5] S. Winkler, P. Mohandas, "The evolution of video quality measurement: from PSNR to hybrid metrics," IEEE Transactions on Broadcasting, pp. 660-668, 2008.
[6] ITU-T, Recommendation H.264, "Advanced video coding for generic audiovisual services," 2007.
[7] E. K. Burke, P. D. Causmaecker, G. V. Berghe, "A Memetic Approach to the Nurse Rostering Problem," Applied Intelligence, vol. 15, pp. 199-214, Kluwer Academic Publishers, 2001.
[8] E. K. Burke, P. D. Causmaecker, G. V. Berghe, "A Hybrid Tabu Search Algorithm for the Nurse Rostering Problem," in B. McKay et al. (eds.), Simulated Evolution and Learning, Lecture Notes in Artificial Intelligence, vol. 1585, pp. 187-194, Springer, 1998.
[9] UNISINOS - University of the Sinos Valley, "Especificação de Software CODEC - Codificação e Decodificação de Sinais Fonte" [CODEC software specification: source signal coding and decoding], 2008.
[10] ABNT - Associação Brasileira de Normas Técnicas, NBR 15602-1, "Digital Terrestrial Television - video coding, audio coding and multiplexing. Part 1: video coding," Rio de Janeiro, 2008.
[11] ABNT - Associação Brasileira de Normas Técnicas, NBR 15601, "Televisão digital terrestre - Sistema de transmissão" [Digital terrestrial television: transmission system], Rio de Janeiro, 2008.