2013 8th International Conference on Communications and Networking in China (CHINACOM)

The Optimization of Parameters Configuration for AMR Codec in Mobile Networks

Nan Ha, Jing Wang, Zesong Fei, Wenzhi Li, Tingting Yuan, Runchuan Su, Xiao Yang and Xiaoqi Wang
School of Information and Electronics, Beijing Institute of Technology, China
Email: {hanan, wangjing, feizesong}@bit.edu.com
The Research Institution of China Mobile, China
Email: {yangxiao, wangxiaoqiyf}@chinamobile.com

Abstract: Although the pursuit of diverse data services has become the mainstream, speech quality remains vitally important for network operators in an increasingly competitive telecommunication market. The Adaptive Multi-Rate (AMR) codec, one of the most significant speech coding techniques, is recommended by the 3rd Generation Partnership Project (3GPP) for audio data compression. Many studies of the AMR codec, including its parameter configuration, have been carried out, but most of them rely on simulation. In this paper, the configuration of the Threshold (THR), Hysteresis (HYST) and Initial Codec Mode (ICM) for AMR-FR and AMR-HR is studied in a commercial network. A set of candidate schemes is put forward and tested to obtain the optimized configuration, which will be used in the current network to improve speech service quality.

Index Terms: AMR, codec mode, THR, HYST, ICM

I. INTRODUCTION

AMR (Adaptive Multi-Rate) is a multi-rate codec standard defined and recommended by 3GPP as the audio data compression scheme in the Global System for Mobile Communications (GSM) and the Universal Mobile Telecommunications System (UMTS). The use of a series of technologies, such as Algebraic Code Excited Linear Prediction (ACELP), discontinuous transmission (DTX) and voice activity detection (VAD), enables AMR to adapt to changes in the radio channel and to provide relatively good speech quality.
It is important for mobile operators to configure appropriate parameters in the current communication network in order to offer better audio services [1]. However, research on AMR parameter configuration is fairly inadequate at present, and the AMR parameters used in current networks are based on laboratory simulation results, which are not suited to the local channel environment [2]. This paper focuses on the performance of AMR-FR and AMR-HR in a GSM system under different settings of three parameters: THR, HYST and ICM. By testing the speech quality under different channel conditions with different parameter settings, the best parameter sets are obtained, which provides a new parameter control method for the current network. The result of this paper offers network operators a way of improving audio service quality.

This research work was supported by China National S&T Major Project (01ZX00010).

A. The principle of AMR technology

The AMR audio codec technology is an audio data compression scheme optimized for speech coding. AMR was originally designed for circuit-switched mobile radio systems. Due to its flexibility and robustness, it is also suitable for other real-time speech communication services over packet-switched networks such as the Internet [3]. It supports 8 speech codec modes with bit rates between 4.75 kbps and 12.2 kbps, and a low-rate background noise codec mode. The speech coder is capable of switching its bit rate every 20 ms. The speech codec rates supported in AMR are listed in Table I. The AMR technology makes it possible to switch between different codec modes, which offer different levels of sensitivity to bit errors [4]. Besides, AMR can dynamically select the codec mode that best matches the present channel condition, which allows the optimum trade-off between speech quality and system capacity to be achieved.
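The dynamic mode selection described above can be illustrated with a small sketch. It assumes the common GSM link adaptation convention in which the codec steps up one mode when C/I exceeds THR + HYST of the boundary above it and steps down one mode when C/I falls below THR of the boundary below it. The function names `next_mode` and `trace_modes` are hypothetical. The thresholds are the AMR-FR curve intersections reported in Section V, while the hysteresis values are placeholders chosen only for illustration and are not taken from this paper.

```python
# AMR-FR codec rate set used in this paper, from most robust to highest rate (kbps).
FR_MODES_KBPS = [4.75, 5.9, 7.4, 12.2]
# Switching thresholds in dB between neighbouring modes (AMR-FR intersections from Section V).
FR_THR_DB = [5.0, 8.0, 11.0]
# Hypothetical hysteresis values in dB, for illustration only.
FR_HYST_DB = [1.0, 1.0, 1.0]


def next_mode(mode, c_over_i_db, thr_db, hyst_db):
    """Return the codec mode index after one adaptation step.

    thr_db[i] and hyst_db[i] describe the boundary between mode i and mode i + 1.
    Assumed rule: step up if C/I > THR + HYST, step down if C/I < THR.
    """
    if mode < len(thr_db) and c_over_i_db > thr_db[mode] + hyst_db[mode]:
        return mode + 1
    if mode > 0 and c_over_i_db < thr_db[mode - 1]:
        return mode - 1
    return mode


def trace_modes(initial_mode, c_over_i_series, thr_db, hyst_db):
    """Apply next_mode to each C/I sample in turn and return the visited mode indices."""
    modes = []
    mode = initial_mode
    for c_over_i_db in c_over_i_series:
        mode = next_mode(mode, c_over_i_db, thr_db, hyst_db)
        modes.append(mode)
    return modes


if __name__ == "__main__":
    # Start from the 5.9 kbps mode (index 1) and follow a rising, then falling channel.
    visited = trace_modes(1, [4.0, 6.5, 9.5, 12.5, 8.5, 7.5, 4.0], FR_THR_DB, FR_HYST_DB)
    print([FR_MODES_KBPS[m] for m in visited])
```

With these placeholder values, a C/I of 8.5 dB keeps a call that is already in the 7.4 kbps mode at 7.4 kbps, but is not enough to lift a call from 5.9 kbps to 7.4 kbps. This gap between the up and down thresholds is what HYST controls.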
TABLE I: Source Codec Bit-rates for the AMR Codec
Codec mode     Source codec bit-rate
AMR 12.20      12.20 kbit/s (GSM EFR)
AMR 10.20      10.20 kbit/s
AMR 7.95       7.95 kbit/s
AMR 7.40       7.40 kbit/s (IS-641)
AMR 6.70       6.70 kbit/s (PDC-EFR)
AMR 5.90       5.90 kbit/s
AMR 5.15       5.15 kbit/s
AMR 4.75       4.75 kbit/s
AMR SID (a)    1.80 kbit/s
(a) Assuming SID frames are continuously transmitted.

The AMR speech codec modes can be grouped into three categories:

Full-Rate (FR) Codec Mode. Regular Pulse Excitation - Long-Term Prediction (RPE-LTP) technology is adopted, and eight bit rates are included: 12.2 kbps, 10.2 kbps, 7.95 kbps, 7.4 kbps, 6.7 kbps, 5.9 kbps, 5.15 kbps and 4.75 kbps. In this codec mode, one full-rate traffic channel occupies a whole physical channel.

Half-Rate (HR) Codec Mode. It adopts the Vector Sum Excited Linear Prediction (VSELP) codec, including the 7.4 kbps, 6.7 kbps, 5.9 kbps, 5.15 kbps and 4.75 kbps bit rates. Two half-rate traffic channels share one physical channel.

Enhanced Full-Rate (EFR) Codec Mode. This codec mode adopts the Algebraic Code Excited Linear Prediction (ACELP) technology with a bit rate of 12.2 kbps. One EFR traffic channel uses one physical channel.

978-1-799-10-7 01 IEEE

B. Mean Opinion Score (MOS)

The mean opinion score (MOS), specified by ITU-T Recommendation P.800, is used to evaluate the speech quality of a network from the user's point of view. It ranges from 5 (best) to 1 (worst), as shown in Table II [5]. In the testing process, sentences are read by both male and female speakers with a noise level below 30 dB. Listeners are required to give a single score that expresses their perception of the audio quality according to Table II. The MOS is the arithmetic mean of all the individual scores. Generally, in a digital mobile network, a MOS value between 4 and 4.5 is regarded as Toll Quality, since it meets the requirement of a toll call. At values around 3.5, many users notice the decline in speech quality and find it unsatisfactory, although the conversation can still be sustained; this is called Communication Quality. Values below 3 are referred to as Synthesis Quality, a condition in which the voice is hard to recognize [6].

TABLE II: MOS Scales Recommended by ITU-T
MOS   Quality of the speech   Description
5     excellent               Imperceptible
4     good                    Perceptible, but not annoying
3     fair                    Slightly annoying
2     poor                    Annoying
1     bad                     Intolerable

Listening tests are a relatively accurate method of obtaining the MOS value, but they are expensive and time-consuming [7]. Hence, an objective measurement of speech quality, the Perceptual Evaluation of Speech Quality (PESQ) algorithm, has been developed and is recommended in ITU-T P.862. Its basic principle is shown in Fig. 1 [8].

Fig. 1: PESQ speech quality measurement diagram

II. APPROACH OF RESEARCH

This paper focuses on the speech quality obtained when different THR, HYST and ICM values are configured for the AMR-FR and AMR-HR codec modes in a GSM system. By analyzing the testing results, the best parameter configuration of the AMR codec modes is proposed.

Fig. 2: Family of curves in AMR-FR (from 3GPP TR 26.975)

Fig. 3: Family of curves in AMR-HR (from 3GPP TR 26.975)

A. Codec rate sets selection

Fig. 2, taken from 3GPP TR 26.975, shows the family of curves for the clean-speech, error-free experiment in the Full Rate channel, giving the relationship between MOS and C/I [9]. The envelope of the AMR-FR curves is mainly formed by four codec rates: 12.2 kbps, 7.4 kbps, 5.9 kbps and 4.75 kbps, which will be used in the commercial network. Similarly, the three codec rates 7.4 kbps, 5.9 kbps and 4.75 kbps are identified in Fig. 3 and are used as the codec rate set in the Half Rate channel. The codec rate sets used in this paper are shown in Table III.

TABLE III: Selected Codec Rate Sets
Codec mode   Codec rates
AMR-FR       12.2 kbps, 7.4 kbps, 5.9 kbps, 4.75 kbps
AMR-HR       7.4 kbps, 5.9 kbps, 4.75 kbps

B. Codec rate switching threshold (THR) selection

1) The preliminary selection of THR: To find rough THR values, the first step is to test the speech quality for all the codec rates, as Table IV shows. All the (C/I, MOS) points are plotted and the curves are fitted with polynomial functions. The initial THR values are set to the C/I values at the intersections of the curves of two contiguous codec modes. These values form THR-Scheme1. The default values of HYST are AMR-FR: { , , }, AMR-HR: { , } (Unit: dB).

2) Setting the THR schemes: Once THR-Scheme1 has been defined for AMR-FR and AMR-HR, Scheme 2 and Scheme 3 are defined as shown in Fig. 4. Scheme 2 is the sum of THR-Scheme1 and the default HYST; Scheme 3 is the median of Scheme 1 and Scheme 2; Scheme 4 is set to the default value offered by a provider of telecommunications equipment in China, which was obtained from laboratory experiments.

Fig. 4: THR testing scheme

3) THR selection: Similarly, the speech quality of the eight schemes is evaluated separately and the curves relating MOS to C/I are drawn. For each of the two codec modes, the scheme that gives the highest MOS value is selected as the THR.

C. The hysteresis value (HYST) selection

Once the THR has been determined, it is used to study the HYST configuration. The HYST configurations in AMR-FR should satisfy the inequality

$$ HYST_1 < HYST_2 < HYST_3 \qquad (1) $$

The default values of HYST are AMR-FR: { , , }, AMR-HR: {1, } (Unit: dB), which are defined as Scheme 1. Another three AMR-FR schemes and two AMR-HR schemes are defined in accordance with Eq. (1). All seven HYST schemes are used to test the speech quality, and the scheme with the highest MOS value in each codec mode, evaluated across the full range of C/I, is the best HYST configuration.

D. The Initial Codec Mode (ICM) selection

The ICM selection is based on the THR and HYST results obtained in the previous steps. Due to the limitation of the devices in the current network, only two codec modes can be configured as the initial codec: the default codec rate (5.9 kbps) and the maximum codec rate (12.2 kbps for AMR-FR and 7.4 kbps for AMR-HR). The scheme with the highest MOS value is the best ICM configuration.

III. RELATED ALGORITHM

1) Data alignment: The calling subscriber and the called subscriber talk to each other 9 times in each call, which lasts for a few minutes. These speech samples are recorded to obtain the MOS value (MOS(x)) and the connect time of the call (ConnectTime). The Measurement Report (MR) is received every 480 ms on the Abis interface. The vector used to record the channel signal and channel condition information is denoted by MR(x):

$$ MR(x) = (Rxlev,\ Codec,\ AbisSetupTime,\ Time,\ CIR) \qquad (2) $$

Here Rxlev is short for received signal level, AbisSetupTime is the time at which the MR was recorded, and Time is the connect time of the call; there is a fixed difference between Time and ConnectTime. The CIR is the signal to noise ratio, which has a linear relation with C/I:

C/I = (CIR - 10)/ ...   (3)

Every 10 CIR values correspond to 1 MOS value, since each speech sample lasts for 4.8 s while an MR is received every 480 ms. That is, {CIR(10x), CIR(10x+1), ..., CIR(10x+9)} corresponds to MOS(x).

2) Data compression algorithm: A data compression algorithm is used to compress the 10 CIR values into 1 CIR value so as to obtain the pair {CIR, MOS}. Principal Component Analysis (PCA) is used to extract the most important information from the CIRs in each 4.8 s interval. The principal components are obtained as linear combinations of the original variables.

The 10 CIRs of one speech sample form the vector $x_n$:

$$ x_n = [x_{n1}, x_{n2}, \ldots, x_{n10}] \qquad (4) $$

All the CIR values compose the matrix $X_{n \times 10}$:

$$ X_{n \times 10} = \begin{bmatrix} x_{11} & \cdots & x_{1\,10} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{n\,10} \end{bmatrix} \qquad (5) $$

The correlation matrix $R = (r_{ij})_{10 \times 10}$ is computed as:

$$ r_{ij} = \frac{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)^2 \sum_{k=1}^{n} (x_{kj} - \bar{x}_j)^2}} \qquad (6) $$

The characteristic equation $|\lambda I - R| = 0$ is solved to obtain the eigenvalues $\lambda_i$, which are arranged as $\lambda_1 > \lambda_2 > \cdots > \lambda_{10}$. The corresponding eigenvectors $l_i$ $(i = 1, 2, \ldots, 10)$ give the principal components:

$$ Z = \begin{cases} z_1 = l_{11} x_1 + \cdots + l_{1\,10}\, x_{10} \\ \quad \vdots \\ z_n = l_{n1} x_1 + \cdots + l_{n\,10}\, x_{10} \end{cases} \qquad (7) $$
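The alignment and compression steps above can be sketched as follows. This is a minimal, dependency-free illustration rather than the processing chain actually used in the measurements. The function names `group_cir` and `first_principal_component` are hypothetical, and power iteration is used here as a stand-in for solving the characteristic equation in full, so only the dominant eigenvalue and its eigenvector are obtained.

```python
import math


def group_cir(cir_values, block=10):
    """Split a CIR sequence into consecutive blocks of `block` values.

    Each block corresponds to one speech sample, i.e. to one MOS value.
    An incomplete trailing block is dropped.
    """
    full = len(cir_values) // block
    return [list(cir_values[i * block:(i + 1) * block]) for i in range(full)]


def first_principal_component(rows, iters=500, tol=1e-12):
    """Return (scores, eigenvector, eigenvalue, contribution) of the first component.

    rows: list of equal-length lists, one list of CIR values per speech sample.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    if any(s == 0.0 for s in stds):
        raise ValueError("every column must vary to define a correlation matrix")
    # Standardize each column, so that averaged products give the correlation matrix.
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    corr = [[sum(z[k][i] * z[k][j] for k in range(n)) / n for j in range(p)]
            for i in range(p)]
    # Power iteration: repeatedly apply the matrix and renormalize.
    vec = [1.0 / math.sqrt(p)] * p
    lam = 0.0
    for _ in range(iters):
        w = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        # Rayleigh quotient as the current eigenvalue estimate.
        new_lam = sum(w[i] * sum(corr[i][j] * w[j] for j in range(p)) for i in range(p))
        converged = abs(new_lam - lam) < tol
        vec, lam = w, new_lam
        if converged:
            break
    # Score of each speech sample on the first principal component.
    scores = [sum(vec[j] * zk[j] for j in range(p)) for zk in z]
    # The trace of a correlation matrix is p, so the eigenvalues sum to p.
    contribution = lam / p
    return scores, vec, lam, contribution
```

Power iteration is chosen only to keep the sketch free of external libraries. A full eigendecomposition from a numerical library would give all ten components at once.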
$z_1$ is the first principal component, which has the largest possible variance. It is a linear combination of the 10 original variables. The contribution rate $p_i$ of each principal component is computed as:

$$ p_i = \frac{\lambda_i}{\sum_{n=1}^{10} \lambda_n} \qquad (8) $$

The contribution of $z_1$ is always between 80% and 85%, which makes it possible to compress the 10 CIRs into 1 value that represents the channel condition in the 4.8 s interval. Since $z_1$ is standardized when the correlation matrix is computed, the mean and variance are used to recover the value in its original form (C/I).

IV. TESTING ENVIRONMENT

The testing environment is shown in Fig. 5. MS A is the calling subscriber who starts the dialogue. The speech signal goes through the current network: it is received by a Base Transceiver Station (BTS), carried over the Abis interface to the Base Station Controller (BSC) and the Mobile Switching Center (MSC), then passed through a BSC and the Abis interface to another BTS, and finally received by the called subscriber (MS B). Drive Test (DT) equipment is used to record the audio and related information of the calling and called subscribers. A signaling monitoring system is used to gather the information of each call from the Abis signaling interface, where the MR is obtained. The calls are repeated 80 times in one test, so that 720 groups of {CIR, MOS} are gathered for each tested scheme.

Fig. 5: The testing environment (MS A and MS B with PESQ measurement and channel simulator; uplink and downlink to the BTS; Abis interface; BSC; MSC; signaling monitoring system)

TABLE IV: The Testing Schemes (Unit: dB)
Codec    THR (codec rates, kbps)   SCH_1 SCH_2 SCH_3 SCH_4
AMR-FR   THR_1 (4.75-5.9)          5.5 5.5 7
AMR-FR   THR_2 (5.9-7.4)           8 9.5 8.5 8.5
AMR-FR   THR_3 (7.4-12.2)          11 1 1 1
AMR-HR   THR_1 (4.75-5.9)          10 1 10.5 10
AMR-HR   THR_2 (5.9-7.4)           15.5 1

V. TESTING RESULTS

A. Testing result of the THR

1) Selection of the THR schemes: Fig. 6 shows the curves of MOS against C/I for the AMR-FR and AMR-HR codec rate sets that were defined in Section II.
The intersections of every two contiguous AMR-FR codec rates are at 5 dB, 8 dB and 11 dB, and the first of the two AMR-HR intersections is at 10 dB. Following the principle for THR schemes discussed in Section II, the THR schemes are listed in Table IV.

Fig. 6: Curves of MOS vs. C/I (panel AMR-FR: 4.75, 5.9, 7.4 and 12.2 kbps; panel AMR-HR: 4.75, 5.9 and 7.4 kbps)

2) Testing results of the THR: Fig. 7 shows the results of the 8 schemes listed in Table IV. According to Fig. 7, the following conclusions can be drawn for the THR:

Scheme 1 gives the highest MOS value in AMR-FR. Comparing the MOS values in different intervals of C/I, the curves of all the schemes approximately coincide when C/I is low, since the same codec rate is used. As C/I rises above 5 dB, the MOS values of Scheme 1 and one other scheme become higher than those of the remaining two schemes, because the codec mode switching event cannot occur in time for the latter.

In the AMR-HR codec mode, the MOS of the third scheme is clearly higher than that of the others. When C/I increases above 10 dB, the second and fourth schemes are lower than the others. The trends of the curves indicate that the THR value of Scheme 2 is too high, while that of Scheme 4 is a little too low.

Fig. 7: THR schemes for AMR-FR and AMR-HR (MOS vs. C/I for Scheme 1 to Scheme 4 in each codec mode)

In summary, when the AMR-FR codec mode is adopted, the THR should be configured as {5 dB, 8 dB, 11 dB}, while the THR for the AMR-HR mode should be {10.5 dB, 15.5 dB}.

B. The result of HYST

The HYST scheme combinations are set as defined in Section II, as Table V shows. After testing the speech quality and collecting the paired MOS and C/I data, the curves of the seven schemes in Table V are fitted in Fig. 8. The following conclusions are drawn from Fig. 8:

The last scheme for the AMR-FR codec mode achieves the best speech quality. The maximum disparity between schemes is lower than 0.1 in AMR-FR mode, which shows that the influence of HYST is much smaller than that of THR.

Scheme 1 for AMR-HR achieves the best MOS value among the three HYST schemes. As in the THR test, the same codec rate (4.75 kbps) is used in bad transmission conditions. The MOS value deteriorates from Scheme 1 to Scheme 3, which indicates that with a high HYST it is difficult to switch to a higher codec rate in time.

In summary, the HYST for the AMR-FR codec is selected as { dB, dB, dB} and for AMR-HR as {1 dB, dB}.

C. Result of ICM testing

In this paper, two codec rates are used to test the ICM. One is the default rate defined in 3GPP TS 45.009 and the other is the maximum rate [10]. The default rate is 5.9 kbps, and the maximum rate is 12.2 kbps for AMR-FR and 7.4 kbps for AMR-HR. Polynomial fitting is used to reveal the relationship between MOS and C/I. The fitted curves are depicted in Fig. 9, which shows several features common to the two codec modes.
The configuration of the ICM does not have much effect on the speech quality: the disparities in MOS are below 0.05. The default ICM value should be used when the channel is in a bad condition, that is, when C/I is lower than 5 dB for AMR-FR or 7 dB for AMR-HR. As the channel condition improves, the ICM should be set to the maximum codec rate.

TABLE V: The Testing Schemes (Unit: dB)
Codec    HYST (codec rates, kbps)   SCH_1 SCH_2 SCH_3 SCH_4
AMR-FR   HYST_1 (4.75-5.9)          1 1 1
AMR-FR   HYST_2 (5.9-7.4)           1 1
AMR-FR   HYST_3 (7.4-12.2)
AMR-HR   HYST_1 (4.75-5.9)          1 1 N/A
AMR-HR   HYST_2 (5.9-7.4)           N/A

Fig. 8: HYST schemes for AMR-FR and AMR-HR (MOS vs. C/I; Scheme 1 to Scheme 4 for AMR-FR, Scheme 1 to Scheme 3 for AMR-HR)

Fig. 9: ICM schemes for AMR-FR and AMR-HR (MOS vs. C/I for the default and maximum initial codec rates)

D. Final result

The optimized parameter configuration of AMR is shown in Table VI.

TABLE VI: Optimized Parameters Configuration for AMR
Codec    Scope                   THR (dB)   HYST (dB)   ICM
AMR-FR   4.75 kbps - 5.9 kbps    5                      5.9 kbps, C/I < 5 dB
AMR-FR   5.9 kbps - 7.4 kbps     8                      12.2 kbps, C/I > 5 dB
AMR-FR   7.4 kbps - 12.2 kbps    11
AMR-HR   4.75 kbps - 5.9 kbps    10.5       1           5.9 kbps, C/I < 7 dB
AMR-HR   5.9 kbps - 7.4 kbps     15.5                   7.4 kbps, C/I > 7 dB

VI. CONCLUSION

In this study, a large number of data groups from the tested schemes in the AMR-FR and AMR-HR codec modes are collected in order to optimize the configuration of three parameters: THR, HYST and ICM. Data processing, function fitting and analysis algorithms are used in the study. The conclusions of this parameter configuration study of the AMR codec for speech quality in the current network will serve as guidance for mobile operators. The new configuration is expected to offer better speech service and to earn a good reputation in the commercial network.

REFERENCES

[1] M. Werner and P. Vary, Speech Quality Improvement in UMTS by AMR Mode Switching, 4th Intl. ITG-Conference on Source and Channel Coding, Berlin, Jan. 2002.
[2] Hualin Li, Yifan Chen, Li Sun, The Research of parameter control for AMR codec mode, National Conference on Wireless & Mobile Communication, 01, pp.9-9.
[3] RFC 4867, RTP Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs, Page 5, April 2007.
[4] 3GPP TS 26.071, AMR speech CODEC; General description, Release 11, September 2012.
[5] ITU-T Recommendation P.800, Telephone Transmission Quality: Methods for objective and subjective assessment of quality, August 1996.
[6] Chuke Yi, Bin Tian, Qiang Fu, Speech Signal Processing, National Defence Industry Press, pp. 1-5, Beijing, 2000.
[7] Marc Werner, Thomas Junge, and Peter Vary, Quality Control for AMR Speech Channels in GSM Networks, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. iii-1076-9, 2004.
[8] ITU-T Recommendation P.862, Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs, Geneva, 2001.
[9] 3GPP TR 26.975, Performance Characterization of the AMR Speech Codec, version 1.1.0, Jan. 2000.
[10] 3GPP TS 45.009, Technical Specification Group GSM/EDGE Radio Access Network, version 11.0.0, Sep. 2012.