BACK CALCULATION PROCEDURE FOR THE STIFFNESS MODULUS OF CEMENT TREATED BASE LAYERS USING COMPUTATIONAL INTELLIGENCE BASED MODELS


Maryam Miradi
André A. A. Molenaar * a.a.a.molenaar@tudelft.nl
Martin F. C. van de Ven m.f.c.vandeven@tudelft.nl

All members of the Faculty of Civil Engineering and Geosciences, Delft University of Technology, P.O. Box 5048, 2600 GA Delft, the Netherlands
* corresponding author

tel: Fax:
Word count: figures + 3 tables = 2500 equivalent words
Total nr of words: = 7458

ABSTRACT

In the Netherlands, there is a need for a procedure that allows accurate estimation of the stiffness of cement bound base courses from deflection measurements, without the need to take a large number of cores. Such a procedure is needed to assure clients that the pavement has been built by the contractor as agreed upon in the contract. This paper describes the development of such a procedure. The procedure is developed using Computational Intelligence (CI) techniques, in particular Artificial Neural Networks (ANN) and Support Vector Machines (SVM), on a data set consisting of over 2000 deflection profiles calculated for a large number of three layer pavement structures using the BISAR PC software. The ANN and SVM models use falling weight deflectometer (FWD) deflection bowl parameters and the total pavement thickness as input. The total pavement thickness can be determined with radar measurements. The models proved capable of predicting the cement treated base course modulus with a high degree of accuracy and provide a quick and powerful tool for scanning the stiffness of cement bound base courses.

INTRODUCTION

Design, build, finance and maintain (DBFM) contracts, and especially design and build (DB) contracts, are gaining popularity in the Netherlands. In these contracts it is agreed that the contractor guarantees that the pavement condition stays above a certain minimum acceptance level throughout the contract period. In the Netherlands this period normally is 7 years. On the one hand, these contracts give the contractor a lot of freedom to select the types of materials and structures to be used; on the other hand, they imply that significant risks are taken by both the contractor and the client. One of the risks for the client is that the performance of the pavement after the contractual period might be far less than was anticipated. Because of the materials used, the structure as built might show undesired types of failure which only appear after a significant period of time and can result in significant maintenance needs. An example of such a problem is the use of certain types of slag, which might result in unevenness because of slow chemical reactions taking place in the slag. Another example is leaching of certain chemicals, resulting in environmental problems.

Since in DB and DBFM contracts the client has, in principle, no say about the materials to be used or about quality control in terms of gradation etc., clients are often concerned about the structural quality of the pavement as constructed. A question often asked is whether the pavement really will have the performance predicted by the contractor or, in other words, whether the pavement layers really have the stiffness and thickness assumed by the contractor in his design analyses. Because of less positive experiences, clients are especially concerned when a cement treated base course is proposed by the contractor. Such base courses, however, are gaining popularity because cement stabilization allows a wide variety of recycled materials to be used.
Such materials might be mixtures of recycled concrete and masonry, harbor dredging sludge, all kinds of slag etc. Lower quality materials, which in principle are not suited for use in base courses, can relatively easily be upgraded to higher quality materials usable for base courses, but the question always is what the long term performance will be.

To safeguard this, a protocol has been developed which requires the contractor to prove that the thickness and stiffness of the pavement layers are according to the design he proposed. If not, the contractor will be given a penalty and/or has to upgrade the structure such that it will give the pavement life agreed upon in the contract. An important part of that protocol is deflection testing by means of the FWD, combined with coring to determine the thickness of the layers. Using these values, the stiffness of the various layers is back calculated using a back analysis program based on multilayer linear elastic theory. The protocol prescribes that in the back calculation analysis a three layer system is assumed, consisting of the total asphalt thickness, the base course and the subgrade. A three layer system has to be assumed because the majority of the designs are based on three layer analyses. One of the disadvantages of the protocol is the need to take quite a number of cores for layer thickness evaluation.

Experience has shown that the protocol might give undesired results. Especially when the second layer, the base course, has a higher stiffness and a greater thickness than the top layer, the back calculated stiffness values are not always realistic. This is especially the case when the top layer is relatively thin (less than 75 mm). In those cases it is quite often observed that the stiffness of the top layer is overestimated while the stiffness of the base layer is underestimated.
If, because of this, the back calculated stiffness of the base course is lower than assumed in the design analyses, the contractor is penalized because he did not build the pavement as designed. It is clear that in this case the penalty is given unjustly, leading to all kinds of unnecessary disputes and even court cases.

The above mentioned situation has resulted in a need for a procedure that allows a rapid and accurate evaluation of the base course stiffness without the need to take a large number of cores. Such a procedure may be provided by artificial neural networks (ANN) or support vector machines (SVM), two strong modeling tools within the field of Computational Intelligence (CI), a sub-field of Artificial Intelligence (AI). AI is the science of making computers do things that require intelligence if done by human beings (1). Since it was believed that such a procedure could be developed using ANN and SVM, these techniques were applied in this study to a data base consisting of deflection profiles calculated for a large number of structures.

The remainder of this paper is organized as follows. First, a short introduction is given on ANN and SVM. Next, it is discussed how the data were simulated using the BISAR PC software. Then, the modeling results using ANN classification, ANN regression, SVM classification, and SVM regression are discussed. After that, the ANN regression model is validated using a new database. The paper ends with conclusions.

ARTIFICIAL NEURAL NETWORKS AND SUPPORT VECTOR MACHINES

Artificial neural networks (ANNs) are data processing systems. They are mathematical models inspired by human neural systems, trying to mimic the intelligence of humans. Their network structure is loosely modeled on that of the human brain and consists of many neurons connected to each other. In fact, these neurons are non-linear calculation units. Each connection between neurons has a weight. For modeling, the input data records are fed into the network. During the modeling process, the connection weights are updated in each iteration, adapting themselves to the input data. The final model is capable of predicting the output accurately even for new, unseen data. This ability is called generalization: through the modeling process, the ANN finds the general pattern (hidden relation) between input and output using only data, without the need for any prior knowledge about the problem. This important characteristic makes ANNs suitable modeling techniques for complicated problems about which little or no prior knowledge is available. For a detailed explanation of how ANNs work, the reader is referred to Haykin (2).

Traditional neural network approaches have suffered difficulties with generalization, producing models that can over-fit the data. This is a consequence of the optimization algorithms used for parameter selection and the statistical measures used to select the best model. These problems have more or less been solved by another, more recent CI-based modeling technique, the support vector machine (SVM). The foundations of SVM were developed by Vapnik (3), and the technique is gaining popularity due to its many attractive features and promising empirical performance.
Support vector machines are learning systems that use a hypothesis space of linear functions in a multi-dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. This learning strategy is a very powerful technique that, in the few years since its introduction, has already outperformed most other systems in a wide variety of applications (4). The formulation embodies the Structural Risk Minimization (SRM) principle, which has been shown to be superior (5) to the traditional Empirical Risk Minimization (ERM) principle employed by conventional neural networks. SRM minimizes an upper bound on the expected risk, as opposed to ERM, which minimizes the error on the training data. It is this difference which equips SVM with a greater ability to generalize, which is the goal in statistical learning. SVMs were developed to solve the classification problem, but they have since been extended to the domain of regression problems (6). A further explanation of SVM can be found in references (3, 5, 6).

DATA BASE

Table 1 shows the structures for which the deflection bowl was calculated. The FWD load used in the analyses was a 50 kN load, and the deflections were calculated at distances of 0, 300, 600, 900, 1200, 1500 and 1800 mm from the load centre.

TABLE 1 Overview of the combinations of the thickness and stiffness of the pavement layers for which the deflection bowls were calculated.

Variable                                              Value                                   Unit
Number of layers                                      3                                       -
Elastic modulus of asphalt layer (E_1)                4000, 6000, 8000, [...]                 MPa
Elastic modulus of cement treated base (E_2)          1500, 3000, 4500, 6000, 7500, 9000      MPa
Elastic modulus of subgrade (E_3)                     50, 100, 150, 200                       MPa
Poisson's ratio of asphalt layer (ν_1)
Poisson's ratio of cement treated base layer (ν_2)
Poisson's ratio of subgrade layer (ν_3)
Asphalt layer thickness (h_1)                         100, 150, 200, 250, 300                 mm
Cement treated base thickness (h_2)                   150, 200, 250, 300, 350, 400            mm

In total, the deflection bowls of 2880 structures were calculated. Figure 1 gives an example of how similar the deflection profiles of two different structures can be, indicating that back calculation of stiffness moduli using standard routines might not always be an easy task. The pavement with a base stiffness of 4500 MPa had a total thickness of 700 mm (the total thickness is the thickness of the asphalt layer plus the thickness of the base layer), while the pavement with a base course stiffness of 6000 MPa had a total thickness of 650 mm.

FIGURE 1 Two different structures can have two similar deflection profiles.

CI MODELS TO PREDICT THE STIFFNESS OF CEMENT TREATED BASE COURSES

In generalized form, the dependency of the base stiffness on the deflection bowl and the total thickness of the pavement can be written as:

E_b = f(E_1, E_2, E_3, h_t, d_0, d_{300}, d_{600}, d_{900}, d_{1200}, d_{1500}, d_{1800})    (1)

Where:
E_b = elastic modulus of cement treated base,
d_x = deflection at x mm from the loading centre,
h_t = the total pavement thickness (thickness asphalt layer + thickness base layer).

However, to improve the quality of model performance, some pre-investigation was done to decrease the number of input parameters. These investigations resulted in a decrease from eight input parameters in Equation (1) to five in Equation (2).

E_b = f(d_0, SCI, BDI, BCI, h_t)    (2)

Where:
SCI = d_0 - d_{300},
BDI = d_{300} - d_{600},
BCI = d_{600} - d_{900}.
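As an illustration of Equation (2), the five model inputs could be assembled from a measured deflection bowl roughly as follows. This is a minimal sketch: the function name `bowl_parameters`, the dictionary layout and the numerical deflection values are made up for the example (they are not taken from the BISAR data set), and the deflections are simply assumed to be in consistent units.

```python
def bowl_parameters(deflections, total_thickness_mm):
    """Return the five inputs of Equation (2): [d_0, SCI, BDI, BCI, h_t].

    deflections: dict mapping sensor offset in mm -> deflection
                 (hypothetical layout, any consistent unit).
    total_thickness_mm: asphalt thickness + base thickness, in mm.
    """
    d0 = deflections[0]
    sci = deflections[0] - deflections[300]    # SCI = d_0 - d_300
    bdi = deflections[300] - deflections[600]  # BDI = d_300 - d_600
    bci = deflections[600] - deflections[900]  # BCI = d_600 - d_900
    return [d0, sci, bdi, bci, total_thickness_mm]


# Made-up deflection bowl, for illustration only
example_bowl = {0: 250.0, 300: 200.0, 600: 160.0, 900: 130.0,
                1200: 105.0, 1500: 85.0, 1800: 70.0}

features = bowl_parameters(example_bowl, total_thickness_mm=650.0)
print(features)  # [250.0, 50.0, 40.0, 30.0, 650.0]
```

Note that the deflections at 1200, 1500 and 1800 mm are not used by the reduced input set, which is why only the first four sensor positions appear in the function.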

Given the discrete nature of E_b (see Table 1), both classification and regression modeling techniques were possible for back-calculation of E_b. In this section, the application of ANN and SVM to predict (back-calculate) E_b using both regression and classification is discussed.

Artificial neural network classification models.

In an ANN model, the process of fitting a model to the data is called training, because the model learns from the observations (data points) how the input parameters and the output parameter relate. The first stage of modeling was to partition the dataset (2880 data points) into a training set (1999 data points), a validation set (441 data points), and a test set (440 data points). The training set is used to train (fit) the model; the validation set is used to validate the training process and avoid over-fitting. Over-fitting means that the model starts to fit each single data point instead of finding a general pattern in the data. Finally, the test set is used to test the performance of the model after training has been carried out.

One of the important parameters in ANN modeling is the activation function. Earlier experiments of this team with ANN modeling have shown that the hyperbolic tangent activation function has excellent performance (7, 8). Trying different activation functions for this problem showed again that the hyperbolic tangent gives the highest model performance.

Determining the number of hidden layers and hidden neurons is a crucial step in ANN modeling. According to the universal approximation theorem (9), one hidden layer is enough to model almost all problems. The detailed mathematical description of this theorem is given by Haykin (2). For the optimal number of hidden neurons, an approach explained by Haykin (2) was used. Following this approach, the network was trained with different numbers of hidden neurons in one hidden layer (from 1 to 30) and was tested on the validation set each time.
The number of hidden neurons which results in the lowest validation error is the optimal number of hidden neurons. This is shown in Figure 2. For this model, the optimal number of hidden neurons was 10.

FIGURE 2 Finding the optimal number of hidden neurons (training and validation error versus number of hidden neurons).

The last step before training the model is to determine the optimal training algorithm. To do so, many types of training algorithms were tried, including Quasi-Newton backpropagation, conjugate gradient backpropagation, Levenberg-Marquardt backpropagation, one-step secant backpropagation, random order incremental update, and scaled conjugate gradient backpropagation. Quasi-Newton backpropagation and Levenberg-Marquardt backpropagation showed the lowest error. The classification with Quasi-Newton backpropagation was performed using the Alyuda NeuroIntelligence software. Quasi-Newton is one of the most popular algorithms in nonlinear optimization, with a reputation for fast convergence. It works by exploiting the observation that, on a quadratic (i.e. parabolic) error surface, one can step directly to the minimum using the Newton step, a calculation involving the Hessian matrix (the matrix of second partial derivatives of the error surface).
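The observation about the Newton step can be checked on a small worked example. The sketch below is only an illustration of that single idea, not of the training algorithm used in the software: it takes a hypothetical two-weight quadratic error surface E(w) = 0.5 w^T H w - b^T w, with made-up values for H and b, and uses the exact Hessian. Quasi-Newton methods differ in that they build up an approximation of the Hessian from successive gradients rather than computing it directly.

```python
def gradient(H, b, w):
    """Gradient of E(w) = 0.5 * w^T H w - b^T w for a 2x2 Hessian H."""
    return [H[0][0] * w[0] + H[0][1] * w[1] - b[0],
            H[1][0] * w[0] + H[1][1] * w[1] - b[1]]


def newton_step(H, b, w):
    """One Newton step: w_new = w - H^{-1} * gradient(w)."""
    g = gradient(H, b, w)
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    # Closed-form 2x2 inverse applied to the gradient
    delta = [(H[1][1] * g[0] - H[0][1] * g[1]) / det,
             (-H[1][0] * g[0] + H[0][0] * g[1]) / det]
    return [w[0] - delta[0], w[1] - delta[1]]


# Made-up quadratic error surface (H is symmetric positive definite)
H = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

w0 = [5.0, -3.0]            # arbitrary starting weights
w1 = newton_step(H, b, w0)  # a single step lands on the minimum

print(w1)                   # approximately [1/11, 7/11]
print(gradient(H, b, w1))   # approximately [0, 0]
```

On a real network error surface, which is not quadratic, such steps have to be repeated, and the step is only as good as the local quadratic approximation.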

Artificial neural network-based prediction of the stiffness of the cement treated base course

The dataset (2880 data points) was divided into a training set (1999 data points), a validation set (441 data points), and a test set (440 data points). Earlier experiments have shown that the hyperbolic tangent activation function has excellent performance (7, 8). The network was constructed with one hidden layer containing 10 hidden units, and the error function of the output was cross entropy. Using Haykin's method (2), the optimal number of hidden neurons was found using 10-fold cross validation. Of the training algorithms tried (Quasi-Newton backpropagation, conjugate gradient backpropagation, Levenberg-Marquardt backpropagation, one-step secant backpropagation, random order incremental update, and scaled conjugate gradient backpropagation), Quasi-Newton backpropagation and Levenberg-Marquardt backpropagation gave the lowest error. The classification with Quasi-Newton backpropagation was performed using the Alyuda NeuroIntelligence software.

The Quasi-Newton algorithm converged after 108 iterations; the correct classification rate (CCR) of the training set was 84%, while 79% of the validation set was classified correctly. Figure 3 shows the relative importance of the input variables after the model was trained. The figure shows that, for ANN classification, the total thickness (h_1 + h_2) and BDI (BDI = d_{300} - d_{600}; d_x = deflection measured at a distance of x mm from the load centre) are the most influential parameters.
The question might arise why the total pavement thickness was used as input parameter instead of the thickness of the individual layers. The reason is as follows. As mentioned before, one of the objectives was to develop a method which needs as few cores as possible. This can be achieved when the layer thickness is estimated by means of Ground Penetrating Radar (GPR) measurements. In such a case, only a limited number of cores is needed for calibration purposes. The total thickness was taken as explanatory variable since it is believed that this thickness can be estimated with a higher degree of accuracy than the thickness of the individual layers. This is because, on the radar images, the contrast between the cement treated base course and the underlying subgrade, which in the Netherlands in almost all cases consists of sand, is greater than the contrast between the asphalt layer and the cement treated base course.

The trained model was then applied to the test set. The correct classification rate of the test set was 83%. The result of the prediction using the ANN classifier is presented by means of the confusion matrix shown in Table 2.
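The three-way partition used for the ANN models (1999 training, 441 validation and 440 test data points out of 2880) could be reproduced along the following lines. This is a sketch under assumptions: the paper does not state how the data points were assigned to the three sets, so a seeded random shuffle is assumed here, and the function name `partition_indices` is hypothetical.

```python
import random


def partition_indices(n_total=2880, n_train=1999, n_val=441, seed=0):
    """Split record indices into training, validation and test sets.

    Assumption: random assignment with a fixed seed for repeatability;
    the remaining n_total - n_train - n_val records form the test set.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = partition_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 1999 441 440
```

Keeping the test indices untouched until training and model selection are finished is what allows the test set CCR to be read as an estimate of performance on unseen data.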

FIGURE 3 Relative input importance of the parameters used in the model.

TABLE 2 Confusion matrix for the test set (rows: actual output; columns: predicted output).

A confusion matrix is used for checking the accuracy of a classification. Each column of the matrix represents the predicted output values, while each row represents the actual output values (taken from the dataset). Output values are referred to as output classes because, in classification, each discrete output value is called a class. One benefit of a confusion matrix is that it is easy to see whether the classifier is confusing two classes. Table 2 shows, for example, that of the 77 data points with an actual output of 1500, 1 has been predicted as 3000 and 13 as [...]. The table also shows that the ANN predicts class 6000 much better than the other classes. For class 4500, 16 data points were assigned to the non-neighboring class 1500 and 10 data points to the neighboring class [...]. Therefore class 4500, with a total of 26 misclassified data points, is the worst class. In summary, the ANN predicts 82% of class 1500, 80% of class 3000, 64% of class 4500, 97% of class 6000, 88% of class 7500, and 92% of class 9000 correctly.
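The bookkeeping behind a confusion matrix, the overall CCR and the per-class percentages can be sketched as follows. The six class labels are the E_b values of Table 1; the short `actual` and `predicted` lists are made-up toy data for illustration and do not reproduce Table 2, and the function names are hypothetical.

```python
CLASSES = [1500, 3000, 4500, 6000, 7500, 9000]  # E_b classes in MPa


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def correct_classification_rate(matrix):
    """Fraction of all data points on the diagonal (CCR)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_accuracy(matrix):
    """Fraction of each actual class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else None
            for i, row in enumerate(matrix)]


# Toy data, for illustration only
actual = [1500, 1500, 3000, 4500, 4500, 6000, 7500, 9000]
predicted = [1500, 3000, 3000, 1500, 4500, 6000, 7500, 9000]

cm = confusion_matrix(actual, predicted)
print(correct_classification_rate(cm))  # 0.75
print(per_class_accuracy(cm))           # [0.5, 1.0, 0.5, 1.0, 1.0, 1.0]
```

Off-diagonal entries far from the diagonal, such as a 4500 structure predicted as 1500 in the toy data, correspond to the confusion between non-neighboring classes discussed above.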

Artificial neural network regression models.

Applying the regression power of ANN, a regression ANN model was developed. Again, the dataset (2880 data points) was partitioned into a training set (1999 data points), a validation set (441 data points), and a test set (440 data points). Experimenting with many different training algorithms led to the selection of the limited-memory Quasi-Newton backpropagation algorithm. The optimal architecture for the ANN model was one hidden layer with 17 hidden neurons. The root mean square error (RMSE) of the training and testing set were 251 and 277, respectively. The RMSE is determined by calculating the difference between the predicted output and the actual output for each data point, squaring these differences, summing the squared values, dividing this sum by the number of data points, and finally taking the square root. The trained model was then tested with the test set. The RMSE of the test set amounted to 258 with an R-squared of [...]. Figure 4 shows the scatter plot of the comparison between the actual output (from the dataset) and the predicted output (predicted by the ANN) using the test set. On request a CD containing the program can be made available.

FIGURE 4 Scatter plot of the ANN regression model.

Road engineering experts rated the outcome of the ANN modeling as not good enough. First of all, the scatter in the predicted E_b (Figure 4) was considered to be too large, and secondly, the confusion matrix showed too many wrong classifications. This implies that there are too many cases where the base modulus E_b is predicted either too high (which is beneficial to the contractor) or too low (which is bad for the contractor, because it implies that the structure is not approved although it fulfills the requirements).

Support vector machines. Classification models.

Because the results obtained by means of ANN were not considered to be accurate enough, another AI technique, Support Vector Machines (SVM), was used on the entire dataset.
Because it goes beyond the scope of this paper to explain the background of SVM, only the results obtained with this technique are described here. 70% of the data points were used for the training and validation sets and 30% for the test set. The kernel function used was a radial basis function with a power of 20. The training error of the SVM model was 1.3%, while the testing error was 1.8%. In other words, the CCR of SVM for the test set was 98.20%. The quality of the SVM classification for the six classes of cement treated base (CTB) stiffness is very impressive, as shown by the confusion matrix in Table 3. This table shows that SVM predicts class 1500 with 95%, 3000 with 98%, 4500 with 100%, 6000 with 99%, 7500 with 100%, and 9000 with 97% accuracy.

TABLE 3 SVM confusion matrix (actual output versus predicted output).

Support vector regression models.

The SV regression (SVR) technique has also been applied to the data set. Without going into the background of this technique, the results are presented hereafter. The trained SVR model had a root mean square error (RMSE) of 133 and an R-squared of [...]. The RMSE of the test set amounted to 181 with an R-squared of [...]. Figure 5 is the box plot of the output of the trained SVR model and the actual output (taken from the dataset). In this box-and-whisker plot, each box has lines at the lower quartile, median, and upper quartile values. A quartile is any of the three values which divide the sorted dataset into four equal parts (lower quartile, median, upper quartile). The whiskers are lines extending from each end of the box to show the extent of the rest of the data. Outliers are data with values beyond the ends of the whiskers and are indicated with +. Figure 5 shows that SVR predicts E_2 for values other than 1500 MPa with a very low error. On request a CD containing the program can be made available. Road engineering experts were very pleased with these results because the accuracy of the predictions was higher than was obtained by using ANN.

VALIDATION OF THE ANN REGRESSION MODEL

In spite of the fact that the ANN regression model was not considered to be the best one, it was decided to test the predictive capabilities of the model using a completely different data set.
As one can observe from Table 1, the original dataset consisted of pavement structures whose layer thicknesses and stiffnesses were varied in a rather systematic way. In fact, only 6 different values for the stiffness of the base were considered. The question was how well the model could predict the stiffness of the base course for pavement structures with layer thickness and stiffness combinations that differ from the combinations shown in Table 1. To answer this question, the deflection profiles of 100 additional structures were calculated using the BISAR PC software. Figure 6 shows the variation of the total pavement thickness (h_1 + h_2) of the additional pavement structures, while Figure 7 shows the deflection profiles of these structures. In Figure 8, the actual E_2 values as used in the BISAR calculations are compared to the E_2 values predicted by means of the ANN regression model. One can observe that a remarkably good fit between the actual and predicted values was obtained. Based on this result, the conclusion was drawn that, although the ANN regression model was initially rated as not good enough, it is still capable of giving very good predictions of the stiffness of the cement treated base course.
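The two measures used to judge the regression models, RMSE and R-squared, can be computed as sketched below. The RMSE follows the step-by-step description given for the ANN regression model. The paper does not define its R-squared, so the usual coefficient of determination (1 minus residual sum of squares over total sum of squares) is assumed here. The actual and predicted moduli are made-up values for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square the differences, average, take the root."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination (assumed definition of R-squared)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up base moduli in MPa, for illustration only
actual_e2 = [1500.0, 3000.0, 4500.0, 6000.0]
predicted_e2 = [1700.0, 2900.0, 4400.0, 6200.0]

print(round(rmse(actual_e2, predicted_e2), 1))       # 158.1
print(round(r_squared(actual_e2, predicted_e2), 4))  # 0.9911
```

Because the RMSE is expressed in the same unit as E_2 (MPa), it can be compared directly with the 1500 MPa spacing between the stiffness classes of Table 1.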

FIGURE 5 Scatter plot of the test set using SVR.

FIGURE 6 Total pavement thickness of the additional analyzed structures.

FIGURE 7 Deflection profiles of the additional analyzed pavement structures.

FIGURE 8 Comparison between actual and predicted E_2 values.

A similar check of the SV regression model still has to be performed but is expected to give at least similar results.

CONCLUSIONS

From the results of the models, the following conclusions have been drawn:
1. CI techniques have proven to be powerful tools for the development of models to predict pavement layer stiffness using the measured deflection profile and total pavement thickness as input.
2. Support vector regression has proven to produce better results than artificial neural network regression.
3. Extra validation of the ANN regression model showed that, in spite of its lesser performance, even this model is capable of accurately predicting the stiffness of cement treated base courses.

REFERENCES

1. Minsky, M., The Society of Mind. 1986, New York, USA: Simon and Schuster.
2. Haykin, S., Neural Networks: A Comprehensive Foundation. 2nd ed. 1999, New Jersey: Prentice Hall.
3. Vapnik, V.N., The Nature of Statistical Learning Theory. 1995, New York: Springer-Verlag.
4. Cristianini, N., Shawe-Taylor, J., An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. 2000: Cambridge University Press.
5. Gunn, S.R., Support Vector Machines for Classification and Regression, in Technical Report ISIS [...], Department of Electronics and Computer Science, University of Southampton: Southampton.
6. Vapnik, V.N., Golowich, S., Smola, A.J., Support vector method for function approximation, regression estimation, and signal processing. Advances in Neural Information Processing Systems, [...]: p. [...].
7. Miradi, M., Molenaar, A.A.A., Development of artificial neural network (ANN) models for maintenance planning of porous asphalt wearing courses. 2005, Road and Railway Research Laboratory, Delft University of Technology: Delft.
8. Miradi, M., Molenaar, A.A.A., Application of artificial neural network (ANN) to PA lifespan: forecasting models, in IEEE World Congress on Computational Intelligence. [...], Vancouver, Canada: Omnipress.
9. Hecht-Nielsen, R., Neurocomputing. 1990: Addison-Wesley.
10. Duin, R.P.W., Juszczak, P., De Ridder, D., Paclik, P., Pekalska, E., Tax, D.M.J., PRTools, a Matlab toolbox for pattern recognition. 2004, Delft University of Technology: Delft.