DIABETES TECHNOLOGY & THERAPEUTICS
Volume 2, Number 3, 2000
Mary Ann Liebert, Inc.

A Neural Network Approach for Insulin Regime and Dose Adjustment in Type 1 Diabetes

STAVROULA G. MOUGIAKAKOU, M.Sc. and KONSTANTINA S. NIKITA, Ph.D., M.D.

ABSTRACT

Background: A decision support system based on a neural network approach is proposed to advise on insulin regime and dose adjustment for type 1 diabetes patients.

Method: The system consists of two feed-forward neural networks, trained with the back-propagation algorithm with momentum and adaptive learning rate. The input to the system consists of the patient's glucose levels, insulin intake, and observed hypoglycemia symptoms during a short time period. The output of the first neural network provides the insulin regime, which is applied as input to the second neural network to estimate the appropriate insulin doses for a short time period.

Results: The system's ability to recommend an insulin regime is excellent, while its performance in adjusting the insulin dosages for a specific patient is highly dependent on the data set used during the training procedure.

Conclusions: Despite the limitations of computer-based approaches, this study shows that artificial neural networks can assist diabetes patients in insulin adjustment.

INTRODUCTION

A significant number of diabetes patients (type 1 diabetes patients) are dependent on insulin to bring their blood glucose into the normal range and avoid short- and long-term complications [1]. In recent years, many research efforts have been devoted to the development of decision support systems that simplify the lives of type 1 diabetes patients, using rule-based approaches [2], mathematical models [3,4], time-series analysis [5], and causal probabilistic networks [6,7]. The majority of these decision support systems have been used for educational purposes, since they are not able to accurately describe such a complex, nonlinear system as the metabolic system.

The application of neural networks to the simulation of complex systems has been successful in many research fields [8]. In general, artificial neural networks are applied in situations that cannot be easily described with rules or mathematical algorithms. In medicine, neural networks have been used for image and signal processing for diagnostic purposes [9,10], and as decision support systems [11,12]. In the field of diabetes mellitus, factors such as the information-intensive nature of diabetes management, the large number of variables to model, and the difficulty in modeling glucose metabolism justify the need for determining hidden patterns or algorithms through neural network techniques. In this context, neural networks have been used for the closed-loop control of glucose using subcutaneous glucose measurements and subcutaneous monomeric insulin analogues [13]. Moreover, the use of a neural network as a decision support system has been presented for insulin regime prescription [14], while the combined use of compartmental models and neural networks has been proposed in order to model blood glucose metabolism [15]. Recently, a hybrid artificial intelligence technique combining the principal component method and a feed-forward neural network has been introduced by Liszka-Hackzell [16] for predicting blood glucose levels in patients with diabetes.

In this paper, artificial neural networks are used to recommend the appropriate kind and dose of insulin to be taken by a type 1 diabetes patient for a short time period, based on information about the patient's glucose levels, insulin intake, and observed hypoglycemia symptoms during a previous short time period. The paper is organized as follows. In Section 2, the necessary background for neural networks is given, while in Section 3, the architecture of the proposed insulin advisory system is presented. In Section 4, the ability of the system to provide advice on insulin regime and dose adjustment is assessed using data from 21 patients suffering from type 1 diabetes.

Department of Electrical and Computer Engineering, National Technical University of Athens, Greece.

BACKGROUND

Neural networks

Artificial neural networks are networks of simple processing elements, called neurons or nodes, operating on their local data and communicating with other elements [17] (Fig. 1). Each neuron is able to receive input signals, to process them, and to send an output signal. Each neuron is connected to at least one other neuron, and each connection is characterized by a real number, known as a weight, that reflects the degree of importance of the given connection in the neural network. The neurons in a neural network are organized in two or more layers: the input layer, the hidden layer(s), and the output layer. According to the way the neurons are arranged and linked, neural networks can be grouped in two fundamental classes (Fig. 2): the feed-forward neural networks, where the signal flows from the input to the output layer, and the recurrent neural networks, where self-loops and feedback loops are allowed in the signal flow.

FIG. 1. The model of a typical neuron or node.

FIG. 2. a: Feed-forward and b: recurrent neural networks.

The main advantage of neural networks is that they are able to make use of unknown information that is hidden in the data and that the user is not able to extract. The process of capturing this unknown information is called learning or training of the neural network. There are two main types of learning process: supervised and unsupervised training. In supervised training, the inputs are applied to the neural network together with the correct outputs, and the weights are adjusted in order to minimize the difference between the neural network output and the desired output. The most popular algorithm of this category is back-propagation [17,18]. In unsupervised training, the desired output is not known; the system is provided with a group of facts (patterns) and then left to itself to settle down (or not) to a stable state after a number of iterations. The algorithms of this category are used in clustering applications [10,12].

Feed-forward neural networks and back-propagation

Multilayer feed-forward (MLF) neural networks trained with a back-propagation learning algorithm are the most popular neural networks. Each neuron in the input layer or in a hidden layer is connected with all neurons in the next layer. The information of the input layer is transmitted to the neurons of the hidden layer(s) and then to the output layer. The output value (activity) of the jth neuron is determined by the equation

$$o_j = f(u_j), \qquad (1)$$

where

$$u_j = \sum_i w_{ji} \, x_i + b_j, \qquad (2)$$

$f$ is the transfer (activation) function, $b_j$ is the so-called bias coefficient of the jth neuron, $w_{ji}$ is the weight characterizing the connection from the ith neuron to the jth neuron, and the summation in eq. (2) is carried out over all neurons $i$ transferring the signal to the jth neuron.

The MLF neural network operates in two modes: training mode and prediction (testing) mode. Accordingly, two data sets are needed: the training set and the testing set. The training procedure begins with arbitrary values of the weights.
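
As an illustration of eqs. (1) and (2), the following is a minimal Python sketch of the forward pass through an MLF network. It is not the implementation used in the study (which relied on Matlab); the function names, the layer sizes, and the toy weights are hypothetical, and the activation functions are chosen only as examples.

```python
import math

def logsig(u):
    """Log-sigmoid activation, with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def layer_forward(x, weights, biases, f):
    """Eqs. (1)-(2): o_j = f(sum_i w_ji * x_i + b_j) for every neuron j."""
    outputs = []
    for w_j, b_j in zip(weights, biases):
        u_j = sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + b_j
        outputs.append(f(u_j))
    return outputs

def mlf_forward(x, layers):
    """Propagate x through a list of (weights, biases, activation) layers."""
    for weights, biases, f in layers:
        x = layer_forward(x, weights, biases, f)
    return x

# Toy network (assumed values): 2 inputs -> 2 hidden neurons -> 1 output
hidden = ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], math.tanh)
output = ([[1.0, -1.0]], [0.0], logsig)
y = mlf_forward([1.0, 1.0], [hidden, output])
print(y)
```

Each row of a weight matrix holds the weights $w_{ji}$ of one neuron $j$, so the inner sum runs over the neurons $i$ of the previous layer, as in eq. (2).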
After the outputs of the neural network have been calculated, the following error function has to be minimized:

$$E = \frac{1}{2} \sum_{m=1}^{M} (y_m - o_m)^2, \qquad (3)$$

where $y_m$ and $o_m$ are the correct and the computed outputs of neuron $m$ in the output layer and $M$ is the number of neurons in the output layer. Minimization of the error function is carried out using the gradient (steepest) descent method [17]. Thus, the weights and the bias coefficients are iteratively updated according to the following equations:

$$\Delta w_{ji} = -\eta \frac{\partial E}{\partial w_{ji}} = \eta \, \delta_j \, o_i, \qquad (4)$$

$$\Delta b_j = \eta \, \delta_j, \qquad (5)$$

where $\eta$ is the learning rate and $o_i$ is the output of the ith neuron, which feeds the jth neuron. In eqs. (4)-(5), $\delta_j$ is the so-called error signal, which is evaluated as

$$\delta_j = \begin{cases} (y_j - o_j) \, f'_j(u_j), & \text{if } j \text{ denotes a neuron in the output layer} \\ f'_j(u_j) \sum_k \delta_k \, w_{kj}, & \text{if } j \text{ denotes a neuron in a hidden layer,} \end{cases} \qquad (6)$$

where the summation for a neuron in a hidden layer runs over all neurons $k$ in the next layer. The exact derivation of the above formulas can be found in the book by Haykin [17]. This algorithm is called back-propagation because the output error propagates from the output layer through the hidden layer(s) to the input layer.

The back-propagation algorithm is sensitive to the value of the learning rate $\eta$, which determines the size of the change of the weights in each iteration. To improve the stability and convergence of the algorithm, a momentum term is used, and eq. (4) is modified as [17,18]

$$\Delta w_{ji}(n) = \alpha \, \Delta w_{ji}(n-1) + \eta \, \delta_j \, o_i, \qquad (7)$$

where $\alpha \in [0,1]$ is the momentum and $n$ denotes the iteration number. The selection of the momentum and learning rate values is empirical [17,18]. Moreover, the use of a variable learning rate has been suggested to speed up the algorithm's convergence [17]. When the weight matrices result in lower error, the learning rate is increased ($\eta = inc \times \eta$, $inc > 1$). In the case of higher error, which would have a less desirable effect on training the network, the learning rate is decreased ($\eta = dec \times \eta$, $dec < 1$), the previous weight and bias matrices are kept, the momentum becomes zero, and the step is repeated.

When training is completed, the optimized values for the weight and bias matrices are used in the prediction mode (testing phase), where information flows forward through the network, from inputs to outputs. The network processes one example of the testing set at a time, producing an estimate of the output value(s). The resulting error is used as an estimate of the quality of prediction of the trained network.

NEURAL NETWORK SYSTEM FOR INSULIN ADJUSTMENT

In this section, the use of two feed-forward neural networks (Fig. 3) is proposed to provide decision support to a type 1 diabetes patient, related to insulin regime and dose adjustment for a 24-hour period, based on information about blood glucose levels, insulin intake, and observed hypoglycemic episodes during the previous 24-hour period. The six (6) most popular insulin regimes have been considered, corresponding to various combinations of short-acting (regular) insulin preparation (SA), intermediate-acting insulin preparation (IA), or their mixture (SA + IA), as presented in Table 1. The first neural network (NN1) is used to recommend the appropriate insulin regime, while the second one (NN2) estimates the exact SA and/or IA insulin doses, based on the insulin regime.

FIG. 3. A system of two neural networks. The output layer of the first neural network provides estimation for the insulin regime, whereas the second neural network estimates the corresponding insulin dosages.
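
As stated in the abstract, both networks are trained with back-propagation with momentum and adaptive learning rate. The training rule of eqs. (3)-(7), together with the learning-rate adaptation described in the Background section, can be sketched in Python as follows. This is only a sketch under stated assumptions, not the authors' Matlab implementation: batch updates, the values of `eta`, `alpha`, `inc`, and `dec`, the 2-3-1 network, the XOR toy data, and all function names are hypothetical choices made for illustration.

```python
import math
import random

def logsig(u):
    """Log-sigmoid activation; its derivative is o * (1 - o)."""
    return 1.0 / (1.0 + math.exp(-u))

def forward(net, x):
    """Return hidden and output activations for one input vector."""
    h = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    o = [logsig(sum(w * hi for w, hi in zip(row, h)) + b)
         for row, b in zip(net["W2"], net["b2"])]
    return h, o

def total_error(net, data):
    """Eq. (3), summed over all training examples (batch mode, assumed)."""
    return sum(0.5 * sum((y - o) ** 2 for y, o in zip(t, forward(net, x)[1]))
               for x, t in data)

def combine(a, b, fa, fb):
    """Elementwise fa * a + fb * b over (nested) lists of floats."""
    if isinstance(a, list):
        return [combine(ai, bi, fa, fb) for ai, bi in zip(a, b)]
    return fa * a + fb * b

def delta_sums(net, data):
    """Sums of delta_j * o_i (weights) and delta_j (biases), eqs. (4)-(6)."""
    g = {k: combine(v, v, 0.0, 0.0) for k, v in net.items()}  # all zeros
    for x, t in data:
        h, o = forward(net, x)
        # Output layer: delta_j = (y_j - o_j) * f'(u_j)
        d_out = [(y - oj) * oj * (1.0 - oj) for y, oj in zip(t, o)]
        # Hidden layer: delta_j = f'(u_j) * sum_k delta_k * w_kj
        d_hid = [hj * (1.0 - hj) * sum(dk * net["W2"][k][j]
                                        for k, dk in enumerate(d_out))
                 for j, hj in enumerate(h)]
        for j, dj in enumerate(d_out):
            g["b2"][j] += dj
            for i, hi in enumerate(h):
                g["W2"][j][i] += dj * hi
        for j, dj in enumerate(d_hid):
            g["b1"][j] += dj
            for i, xi in enumerate(x):
                g["W1"][j][i] += dj * xi
    return g

def train(net, data, eta=0.1, alpha=0.9, inc=1.05, dec=0.7, epochs=300):
    """Batch back-propagation with momentum and adaptive learning rate."""
    step = {k: combine(v, v, 0.0, 0.0) for k, v in net.items()}
    err = total_error(net, data)
    for _ in range(epochs):
        g = delta_sums(net, data)
        momentum = alpha
        for _retry in range(50):
            # Eq. (7): step(n) = momentum * step(n - 1) + eta * delta * o
            trial_step = {k: combine(step[k], g[k], momentum, eta) for k in net}
            trial = {k: combine(net[k], trial_step[k], 1.0, 1.0) for k in net}
            trial_err = total_error(trial, data)
            if trial_err < err:
                # Lower error: accept the step and raise the learning rate
                net, step, err = trial, trial_step, trial_err
                eta *= inc
                break
            # Higher error: keep the old weights, lower the learning rate,
            # set the momentum to zero and repeat the step
            eta *= dec
            momentum = 0.0
    return net, err

def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
# Toy XOR problem with a 2-3-1 network; the data are illustrative only
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
        ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
net0 = {"W1": rand_matrix(3, 2), "b1": [0.0] * 3,
        "W2": rand_matrix(1, 3), "b2": [0.0]}
initial_error = total_error(net0, data)
trained, final_error = train(net0, data)
print(initial_error, final_error)
```

Because a step is accepted only when it lowers the error, the training error in this sketch never increases from one accepted step to the next; the price is one extra forward pass over the data per trial step.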

The input variables of NN1 are applied to 11 input neurons and include the patient's glucose levels (measurements before breakfast, lunch, dinner, and bedtime) and recommended insulin doses (either SA or IA insulin, or their mixture). The input values, namely glucose measurements in the 0 to 500 mg/dl range and recommended insulin doses in the 0 to 50 units range, are normalized between 0.0 and 1.0. A value of zero in the input data represents no measurement or no insulin intake. One hidden layer with 16 neurons is used, and the output of NN1 is provided through six (6) neurons in the output layer. Thus, the output vector [100000] corresponds to Regime 0, the output vector [010000] to Regime 1, etc. The log-sigmoid activation function (Fig. 4a) is applied to obtain the output of neurons in the hidden and the output layers.

TABLE 1. THE SIX INSULIN REGIMES, AS A COMBINATION OF SHORT-ACTING (SA) AND INTERMEDIATE-ACTING INSULIN (IA)

Time slots: br (before breakfast), lu (before lunch), di (before dinner), bt (before sleep). Entries are listed in time-slot order; empty slots are not shown.

| Regime | Insulin preparations |
|---|---|
| 0 | SA + IA, SA + IA |
| 1 | SA, SA, SA, SA |
| 2 | SA + IA, SA, SA |
| 3 | SA, SA, SA, IA |
| 4 | SA, SA, SA, SA + IA |
| 5 | IA |

FIG. 4. a: Log-sigmoid, and b: tan-sigmoid activation functions.

The second neural network has six (6) input neurons corresponding to glucose measurements (before breakfast, lunch, dinner, and bedtime), observed hypoglycemia symptoms (0: negative, 1: positive), and insulin regime, which is the transformed binary output of NN1. The input data in NN2 follow the normalization of data used in NN1. One hidden layer with 11 neurons is used, while the output of NN2 is given by seven (7) output neurons corresponding to the pre-breakfast SA and IA (br_sa, br_ia), pre-lunch SA (lu_sa), pre-dinner SA and IA (di_sa, di_ia), and pre-sleep SA and IA (bt_sa, bt_ia) insulin doses. The tan-sigmoid (Fig. 4b) and log-sigmoid activation functions are applied to obtain the output of neurons in the hidden and output layers, respectively.

The training and the testing procedure of the system has been based on data from 21 patients [19] suffering from type 1 diabetes treated with multiple daily injections of insulin. These data contain information about the blood glucose levels as measured by invasive finger-lancing tests during the day in logical time slots (breakfast, lunch, dinner, bedtime), the insulin intake, and observed hypoglycemic symptoms. Each patient recorded blood glucose levels and insulin injections for more than 15 days. With the above information, a database has been created in Microsoft Access 2000 (Microsoft Corporation, Redmond, WA) and the patients have been classified according to the insulin regime taken. This classification resulted in 34 different cases. The principal patient data related to blood glucose levels and insulin dosages are summarized in Table 2. Twenty out of the 34 cases have been used in the training phase and 14 cases in the testing phase of the neural network system (Fig. 5). Special care has been taken to provide the neural networks with a balanced insulin regime training set.

TABLE 2. SUMMARY OF PATIENT DATA

| | Total: mean | Total: range | Training set: mean | Training set: range | Testing set: mean | Testing set: range |
|---|---|---|---|---|---|---|
| Glucose level (mg/dl) | 162 | 28-450 | 170.2 | 28-450 | 157.6 | 28-450 |
| SA insulin (U) | 7 | 1-22 | 9 | 1-22 | 9 | 1-22 |
| IA insulin (U) | 20 | 1-40 | 21 | 1-40 | 19 | 6-28 |

FIG. 5. Distribution of insulin regimes. a: Total histogram, b: training set consisting of 20 different cases, and c: testing set with the remaining 14 cases.

Both neural networks have been trained using the back-propagation algorithm with momentum and adaptive learning rate, as described in Section 2. The amount of time required for NN1 and NN2 training was 6 and 9 min, respectively, on a Pentium III PC (128 MB RAM), using Matlab 5.2.0 (The Mathworks, Inc., Natick, MA) and the ANN Toolbox.

RESULTS AND DISCUSSION

In the testing phase, the system processes data about a patient, corresponding to glucose measurements and insulin intake during the previous 24-hour period, and provides an estimate of the insulin regime and dosages for the next 24-hour period. The performance of NN1 was extremely good. The success in estimating the actual insulin regime was 100% and 94% for the training and testing sets, respectively.

The performance of NN2 has been assessed by computing the mean values, the standard deviations, and the correlation coefficients between actual and estimated insulin doses, separately for the training and the testing sets [20]. The statistical significance of the mean values and the correlation coefficients has been evaluated by applying the Student paired t-test, while the F-test for population variances has been used to evaluate the statistical significance of the standard deviations. In all the results, a 5% level of significance has been considered. A comparison between actual insulin doses and those estimated by NN2 is shown in Table 3.

TABLE 3. MEAN VALUE (STANDARD DEVIATION) AND CORRELATION COEFFICIENT (p < 0.05) BETWEEN ACTUAL INSULIN DOSES AND THOSE ESTIMATED BY NN2 DURING A 24-HOUR PERIOD. TRAINING AND TESTING OF NN2 HAS BEEN BASED ON THE SETS OF TABLE 2

| Dose | Set | Actual insulin dose (U) | Estimated insulin dose (U) | Correlation coefficient |
|---|---|---|---|---|
| br_sa | Training set | 8 (5) | 8 (3) | 0.61 |
| br_sa | Testing set | 6 (4) | 8 (3) | 0.65 |
| br_ia | Training set | 17 (15) | 18 (13) | 0.86 |
| br_ia | Testing set | 14 (8) | 22 (10) | 0.90 |
| lu_sa | Training set | 5 (4) | 5 (3) | 0.82 |
| lu_sa | Testing set | 4 (3) | 5 (3) | 0.57 |
| di_sa | Training set | 10 (5) | 10 (4) | 0.82 |
| di_sa | Testing set | 6 (4) | 9 (4) | 0.61 |
| di_ia | Training set | | | |
| di_ia | Testing set | | | |
| bt_sa | Training set | 2 (5) | 2 (4) | 0.91 |
| bt_sa | Testing set | 0 (1) | 1 (3) | 0.67 |
| bt_ia | Training set | 1 (3) | 1 (2) | 0.84 |
| bt_ia | Testing set | 0 (1) | 0 (1) | 0.73 |

The statistical significance tests show that, for some diabetes patients, the estimate of the appropriate insulin doses lies outside the 95% confidence limits, which means that the estimate of the insulin dose for a specific time slot during the day is not acceptable. Moreover, NN2 was completely unable to estimate the dose of IA insulin before dinner for both training and testing sets.
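
The per-dose comparison behind Table 3 relies on the mean, the standard deviation, the correlation coefficient, and a paired t statistic between actual and estimated doses. A minimal Python sketch of these quantities is given below; it is an illustration only, the dose values are hypothetical and not taken from the study data, the function names are assumptions, and the conversion of the t statistic into a p-value (as well as the F-test) is omitted.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_std(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def pearson_r(actual, estimated):
    """Correlation coefficient between actual and estimated doses."""
    ma, me = mean(actual), mean(estimated)
    cov = sum((a - ma) * (e - me) for a, e in zip(actual, estimated))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_e = sum((e - me) ** 2 for e in estimated)
    return cov / math.sqrt(var_a * var_e)

def paired_t(actual, estimated):
    """Paired t statistic of the differences (n - 1 degrees of freedom)."""
    d = [a - e for a, e in zip(actual, estimated)]
    return mean(d) / (sample_std(d) / math.sqrt(len(d)))

# Hypothetical doses in units (U) for one time slot
actual = [8.0, 6.0, 10.0, 7.0, 9.0]
estimated = [7.0, 7.0, 9.0, 8.0, 8.0]

print(mean(actual), sample_std(actual))
print(mean(estimated), sample_std(estimated))
print(pearson_r(actual, estimated), paired_t(actual, estimated))
```

A high correlation coefficient alone does not guarantee accurate doses, since it is insensitive to a constant offset between actual and estimated values; this is why the means are compared as well.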
The failure to estimate the pre-dinner IA dose can be the result of a lack of information in the training set, since only 50 out of a total of 755 insulin data used in the training set referred to this specific insulin dose, which is present only in Regime 0.

A preliminary data analysis has shown that the system performance for a specific patient is better if a subset of the same patient's data has been used during the system's training phase. Based on this observation, the performance of NN2 trained for a specific patient has been assessed. To this end, a patient following Regime 0 or Regime 5 has been considered, with information recorded for a total period of 47 days. Data for the first 35 days have been used in the training phase and the rest of the data in the testing phase. The estimates of NN1 were identical to the actual insulin regime for all the days of the testing set. The correlation coefficients between actual insulin doses and those estimated by NN2 are summarized in Table 4. A substantial improvement in the system's performance is observed as compared with the overall performance presented in Table 3.

TABLE 4. CORRELATION COEFFICIENT (p < 0.05) BETWEEN ACTUAL AND ESTIMATED INSULIN DOSAGES FOR A SPECIFIC TYPE 1 DIABETES PATIENT FOLLOWING REGIMES 0 AND 5. TRAINING OF NN2 HAS BEEN BASED ON DATA OF THE SAME PATIENT

Columns: br_sa, br_ia, lu_sa, di_sa, di_ia, bt_sa, bt_ia. Only four columns carry values; the values are listed in column order.

| Set | Correlation coefficients |
|---|---|
| Training set | 0.78, 0.92, 0.95, 0.93 |
| Testing set | 0.78, 0.98, 0.93, 0.95 |

The relatively low performance of NN2 when trained with data from a variety of patients can be attributed to the lack of important information in the database on patient-related factors, such as sex, age, years of diabetes, carbohydrate intake, physical activity, etc. Knowledge on insulin adjustment according to such factors has not been integrated in the system during the training phase, and consequently the system is not able to take patient characteristics into account when estimating insulin dosages in the testing phase. However, in the case of training for a specific patient using a sufficient amount of data, the system's performance is dramatically improved, since the neural networks are able to use some a priori knowledge of the patient's characteristics, which is hidden in the data.

CONCLUSIONS

A system of two feed-forward neural networks has been proposed to advise on insulin regime and dose adjustment for type 1 diabetes patients. The performance of the proposed neural network system depends on the data used for its training. Thus, the system's ability to provide accurate predictions for a patient is substantially improved if the system is trained using previous data of the same patient.
Furthermore, the system's performance and ability to generalize are expected to improve further if additional information about other important parameters is available, such as the patient's sex, age, years of diabetes, physical activity, diet, stress, and other physiological ailments.

REFERENCES

1. The Diabetes Control and Complications Trial Research Group: The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N Engl J Med 1993;329:977-986.
2. Skyler J, Skyler D, Seigler D, Sullivan MO: Algorithms for adjustment of insulin dosage by patients who monitor blood glucose. Diabetes Care 1981;4:311-318.
3. Berger M, Rodbard D: Computer simulation of plasma insulin and glucose dynamics after subcutaneous insulin injection. Diabetes Care 1989;12:725-736.
4. Lehmann ED, Deutsch T, Carson ER, Sönksen PH: AIDA: An interactive diabetes advisor. Comp Meth Progr Biomed 1994;41:183-203.
5. Deutsch T, Lehmann ED, Carson ER, Roudsari AV, Hopkins KD, Sönksen PH: Time series analysis and control of blood glucose levels in diabetic patients. Comp Meth Progr Biomed 1994;41:167-182.
6. Andreassen S, Benn JJ, Hovorka R, Olesen KG, Carson ER: A probabilistic approach to glucose prediction and insulin dose adjustment: description of metabolic model and pilot evaluation study. Comp Meth Progr Biomed 1994;41:153-165.
7. Cavan DA, Hejlesen OK, Hovorka R, Evans JA, Metcalfe JA, Cavan ML, Halim M, Andreassen S, Carson ER, Sönksen PH: Preliminary experience of the DIAS computer model in providing insulin dose advice to patients with insulin dependent diabetes. Comp Meth Progr Biomed 1998;56:157-164.
8. Hagan MT, Demouth HB, Beale MH, Hagan-Demouth B: Neural network design. Boston, MA: PWS Publishing. 1996.
9. Vijaya G, Kumar V, Verma HK: ANN-based QRS-complex analysis of ECG. J Med Eng Technol 1998;22:160-167.
10. Wright I, Gough N: Artificial neural network analysis of common femoral artery Doppler shift signals: classification of proximal disease. Ultrasound Med Biol 1999;25:735-743.
11. Yu-Chuan Li, Li Liu, Wen-Ta Chiu, Wen-Shan Jian: Neural network modeling for surgical decisions on traumatic brain injury patients. Int J Med Inf 2000;57:1-9.
12. Chee Peng Lim, Harrison RF, Kennedy RL: Application of autonomous neural network systems to medical classification tasks. Artif Intell Med 1997;11:215-239.
13. Trajanoski Z, Wach P: Neural predictive controller for insulin delivery using the subcutaneous route. IEEE Trans Biomed Eng 1998;45:1122-1134.
14. Ambrosiadou BV, Gogou G, Maglaveras N, Pappas C: Decision support for insulin regime prescription based on a neural-network approach. Med Inform 1996;21:23-34.
15. Tresp V, Briegel T, Moody J: Neural-network models for the blood glucose metabolism of a diabetic. IEEE Trans Neural Networks 1999;10:1204-1213.
16. Liszka-Hackzell JJ: Prediction of blood glucose levels in diabetic patients using a hybrid AI technique. Comput Biomed Res 1999;32:132-144.
17. Haykin S: Neural networks: A comprehensive foundation. New York: Prentice-Hall. 1999.
18. Masters T: Practical neural network recipes in C++. New York: Academic Press. 1993.
19. ftp://ftp.gmd.de/learning/neural/datasets/machine-learning-databases/aim-94/diabetes-data/, Spring Symposium on Artificial Intelligence in Medicine, 1994, Washington.
20. Brown D, Rothery P: Models in biology: Mathematics, statistics and computing. New York: John Wiley & Sons. 1994.

Address reprint requests to:
Konstantina S. Nikita
Department of Electrical and Computer Engineering
National Technical University of Athens
Iroon Polytechniou 9
Zografos 15 773
Athens, Greece
E-mail: knikita@cc.ece.ntua.gr