Gaussian Process Regression for Vessel Performance Monitoring
|
|
|
- Oliver Page
- 9 years ago
- Views:
Transcription
1 Gaussian Process Regression for Vessel Performance Monitoring Benjamin Pjedsted Pedersen, FORCE Technology, Copenhagen/Denmark, Jan Larsen, DTU Compute, Technical University of Denmark, Copenhagen/Denmark, Abstract It is showed how Gaussian Process Regression (GPR) can be used equally good or better than Artificial Neural Networks (ANN) for short and long term predictions of the energy consumption on a ship. Using different data sets, from five sister container ships, the method is tested with very different data quality, quantity and prediction horizon and shows that GPR is equally accurate and computational more efficient than ANN, but also offer the possible of finding the predictive variance. Furthermore, the used of so-called characteristic length-scales can be used for evaluating the importance of the input variables for the propulsion performance. A crude introduction to GPR is given together with how it has been applied for this purpose. 1. Introduction The recent years focus on fuel efficient ship operations have increased significantly. Partly because of the increasing oil prices, but also in general to reduce cost in the economically strained situations many shipowners have been in recently. Furthermore, the need for shipowners to show awareness for the environment and climate changes is increasing. And finally the political pressure to reduce emissions from shipping have been concretized through the International Maritime Organisation (IMO) by introducing of the Energy Efficiency Design Index (EEDI), which apply to all new buildings from Energy efficient operation of ships has many facets. It both comprises voyage planning, efficient loading and discharging, if possible optimal trim of the loading conditions, and long term factors as wear on the engines and other mechanical part together with fouling of the hull and propeller. Since the condition of the hull and propeller is difficult to assess it is also difficult to estimate the effect of fouling of the propulsion power. Traditionally the effect of fouling has been evaluated by comparing e.g. the actual fuel consumption with a theoretical estimate based on empirical methods and other available data as e.g. model tests. Two well-known and robust methods are Holtrop (1984) and Harvald (1983) that are based on statistical data from model tests of a large number of ships, which makes them more suitable for ship models and loading conditions resembling the ones tested. This is often not the case since model tests usually only are performed at the design and/or one ballast condition. Furthermore corrections for the environmental conditions, as the wind speed and direction (Isherwood (1972)), the wave height, period and directions (from e.g. Bhattacharyya (1978)) where both methods also are based on limited data. These factors often lead to large discrepancies from the operational measured energy consumption. In Pedersen and Larsen (2009) different regression method was tested, but the Artificial Neural Network (ANN) was found superior in all tests. Later tests performed using the Gaussian Process Regression method showed similar or better performance than the ANN, which led to a more elaborate study which is given herein. 2. Regression methods Gaussian Processes, GP, is non-parametric model that provides a flexible framework for regression. The regression function does not take a predetermined form but is constructed from information
2 derived from the data. The definition by Rasmussen and Williams (2006) is: ''A Gaussian process is a collection of random variables, any Gaussian process finite number of which have a joint Gaussian distribution.'' GP is the most flexible class of non-parametric function estimators which can be interpreted as an infinite ANN, Neil (1994). A GP can also be described as a multivariate Gaussian distribution over functions (instead of a scalar or vectors) and the general form can be written as in: f(x) GP m(x), k(x, x ) or N(μ, ) (1) where m(x) or µ is the mean function E[f(x)] and k(x,x') or is the covariance function E[(f(x)- m(x))(f(x')-m(x'))]. The mean function can be set to zero, and the covariance function used for this problem is the Squared Exponential (K SE ). cov(y) = K SE (x, x ) + σ n 2, (2) D K SE (x, x ) = σ 2 f exp 1 2 x d x d d=1 (3) l d 2 where σ f 2 is the predictive variance, σ n 2 is the noise variance, and l is the characteristic length-scale, D is the dimension of the input variables x, σ f 2, σ n 2 and l are also referred to as the hyperparameters. The length-scale can be thought of as the distance you have to move in input space for the function value to change. In the present problem, the input is multi-dimensional and thus there is one lengthscale for each of the input variables and the hyperparameters thus defined as: θ = {l 1, l 2,,l d, σ f 2, σ n 2 }. The model is trained by optimizing the covariance function with respect to the hyperparameters by maximum likelihood also referred to as Automated Relevance Determination routine, ARD. This leaves one optimum length-scale for each of the input variables l 1, l 2,,l d. Predictions by GPR are found by making a joint distribution of the training target values y=[y 1,,y N ] and the test values y t (x t ), and from this the predictive function value and variance can be derived to (4) and the predictive variance σ t 2 can be found as in the sum of the predicted covariance matrix cov(f(x t )) and the variance σ n 2 (5). where α = (K(x, x) + σ n 2 I) 1. f(x t ) = K(x t, x)αy, (4) σ t 2 = K(x t, x t ) K(x t, x)αk(x, x t ), (5) GPR is a fast method, but the complexity increases with the amount of input O(N 3 ), so it is not appropriate for analysis of large data sets. The training data is explicitly used for prediction, so these can be computationally expensive. One of the strengths of GPR is that from ARD, the length-scale is found, which determines how relevant each input variable is for the regression. Input variables with a small length-scale have a higher influence than the variables with high length-scales due to the short distance the input have to move in order to change the function. In the variable analysis all the length-scales are presented as the logarithm log(l) due to the large variation in the length-scales. Furthermore the prediction variance is calculated for every prediction. A detailed description of GPR and the specific software implementation used in this project is found in Rasmussen and Williams (2006) and the author's website (
3 2.1. Training In order to use the data set most efficiently, it was divided into several training/test sets. The most efficient use is by training with all the data except one (N-1) and test with the single remaining input/output variable that is not used for training. This can be done alternately N times so all data points are used for test at a time. This is referred to as ''Leave-One-Out'' (LOO). Prediction errors are calculated for each training/test set, and the mean value of the test error from all the test sets is referred to as the ''Cross-validation error''. The input x d i was normalized by the mean and standard deviation: x d i = x i d x d. This was done in std x d order to avoid too large variations in the input variables leading to large variations in the length-scale and hence the predictions. Since GPR with zero-mean is assumed, the output variable is centred around 0 with following normalization: y = y i y The initial guess of the hyperparameters occasionally resulted in local minima depending on the data length which was discovered by the prediction variance being very small or zero. To overcome this, the training/test set was also split up into MS parts accumulating to the full dataset (7). Each training was restarted with the final hyperparameters from the previous run in total MS-times as in (7). After a few tests, it was concluded that it was sufficient to restart the training twice depending on the data size, i.e. three trainings of the GPR, subsequently two restarts was used for all the trainings. θ x msi+1 = θ x msi (6) x msi x 0: i x MS (7) Where x is the total input data set and x msi is the multi start subset. 3. Data Based on the traditional performance evaluation using empirical methods for predicting the propulsion power the critical input variables have been identified. As expected speed through the water, draught (fore and aft), the wind sped and direction, sea state and temperatures are amongst the most important input variables for estimating the propulsion power. The specific variables are listed below in order of the expected importance. 1. Draught midship [m] 2. Trim Ta-Tf (Draught aft- draught fore) [m] 3. Ship speed [knots] 4. Relative wind speed [m/s] 5. Relative wind direction [degree] 6. Wave height [m] 7. Relative wave direction [degree] 8. Water temperature [degree C] 9. Air temperature [degree C] The output variable is either the propulsion power measured by a torsiometer or the specific fuel consumption Container ship data Noon reports from five sister container ships were systematically collected for a period of up to 10 years. The logging periods are long and contain both dry-dockings and hull cleanings, which give an interesting insight into how these operations influence the performance. The data is well organized, purged of irrelevant data and seems to be very consistent, especially the manual observations of the
4 wave height/direction and wind speed/direction. Unfortunately the dimensions and ship names had to be kept confidential, and only the performance data was available. This made it impossible compare with traditional performance methods where ship specific data are necessary. An overview of the data sets are given in Table 1 where the logging period until the first dry-docking is presented together with the total number of data points, number of dry dockings and hull cleanings within the period. Fig. 1: Speed and power distribution for the five container ships
5 Table 1: Noon reports from the container ships Ship ID Period year to first docking Total no. of No. Hull No. Docking NR cleaning Fig. 1 shows the distribution of the logged speed and shaft power, it indicates a narrow speed profile with a mean around 23 knots and with almost no occurrences of speeds out of the range of knots. The power distribution was much broader which indicates that the power is adjusted to meet the speed for different draught, environmental conditions, etc. In order to supplement the noon report data hindcast data. Hindcasts are weather information at a certain time and position in the past. The data is received by a tool developed for Seatrend (Performance Monitoring tool developed at FORCE Technology, at FORCE Technology based on weather information from the NOAA (National Oceanic Atmospheric Administration, US Dept. of Commerce) data base. This adds several new variables as to the noon reports, but it also limits the number of data, since not all areas are covered by the NOAA database. Many of the smaller sea i.e. the Mediterranean, North sea, Baltic sea, are not included. This reduces number of data of approximately the half. Table 2: Container ship input data x Data variable Source ID Unit 1 Speed through water Noon report NRxls.Ulog knots 2 Speed over ground Noon report NRxls.Uobs knots 3 Sea water temperature Noon report NRxls.Tsw deg 4 Mean draught (Ta+Tf)/2 Noon report NRxls.Tm m 5 Trim, Ta-Tf Noon report NRxls.Trim m 6 True wind speed Noon report NRxls.WindSpeed m/s 7 Relative wind direction Noon report NRxls.WindDir deg 8 Average relative winds speed during report period Hindcast HC. Vrel m/s 9 Average relative winds direction during report period Hindcast HC.gammarel deg 10 Average significant wave height during report period Hindcast HC.mean.Hs m 11 Average wave period during report period Hindcast HC.mean.Tp s 12 Variance of the significant wave height during report period Hindcast HC.var.Hs m² 13 Variance of the wave period during report period Hindcast HC.var.Tp s² 14 Variance of the wave direction during report period Hindcast HC.var.Td deg² 15 Variance of the winds speed during report period Hindcast HC.var.Ws m/s 16 Variance of the winds direction during report period Hindcast HC.var.gamma deg 17 Report date and tim (Matlab numeric value) Noon report NRxls.UTC numeric 18 Average winds speed during report period Hindcast HC.Ws m/s 19 Average winds direction during report period Hindcast HC.gamma deg 27 Sea state Noon report NRxls.SeaState m 28 Relative sea direction Noon report NRxls.TrueRelativeSeaDirection deg 36 Average shaft power Noon report NRxls.PropPower kw Since the weather observations are instantaneous values, usually taken around the reporting time, and other important variables are average values from the previous noon report time, it is not correct to use them together. It would have been ideal to have ''noon reports'' for every hour to increase the accuracy
6 of the hindcast weather information. As this was not possible, hindcasts were made between every noon report, but with one-hour intervals and equivalent position, and then the average of the values between the present and the previous noon report were found. Similarly, the variance of the hindcasts for every noon report period was determined and made available for input to give more detailed weather information. The goal was to find weather that is equivalent to the effect spent in the same period, but since this is not possible, it is believed to be better than using only one observation (noon report time) to represent the past 24 hours. Table 2 shows the list of the data used for the analysis. 4. Results The evaluation was done by comparing the relative prediction errors of energy consumption. As described in above, cross-validation has been performed by Leave-One-Out. The cross-validation error ω K is the mean error of the mean ω k from the k th subsets of all the relative tests errors ω nk. For LOO, K is equal to the length of the total data set N. Similarly the cross-validation value of the relative predictive standard deviations σ k has been determined. ω nk = EC nk EC nk EC nk (8) ω k = 1 N k ω N nk k n k =1 (9) K ω K = 1 ω K k=1 k (10) σ k = 1 N ω N nk ω k 2 k k n k =1 (11) K σ k = 1 σ K k=1 k (12) Where: EC nk is the predicted energy consumption of the n th input data for data subset k EC is the measured energy consumption of the n th input data for data subset k N k is the number of data points within each subset K is the number of subset, K=N for for LOO, N being the number data in the full data set ω nk is the relative prediction error of the n th input data for data subset k ω k is the average of the relative prediction errors for the k th subset ω K is average of the relative prediction errors 4.1. Results from Container ship #1 The data from Container ship #1 was used to make initial tests in order to determine the best combinations of the input variables. Table 3 shows the input variable combinations tested, with the cross validation errors and standard deviations listed in Table 4. From that the best prediction of the energy consumption (the lowest prediction error) is found using the input combination 13, 14 and 21, all having relative prediction error around 4% they all have in common that they are based on noon reports and hindcasts, and that the time is included as a variable. Omitting the time increase the prediction errors of up to 2.4% (input combination 14 and 15). Using input data from the noon reports only the best prediction error is found to 4.9% with input variable combination 18, where all available noon report input, except the speed over ground, is used including the time variable. Without the time variable the input combination 18 is identical to 17 which gives significant higher errors of 6.7%, for the noon reports data without time the rather simple
7 input combination 9, including only the logged speed, sea water temperature and the mean draught, yields the best prediction of an error of 5.5%. Table 3: Input variable setup combination for Noon Report data of the containership data Input combination ID NR.Ulog x x x x x x X x x x x 2 NR.Uobs x x x X x x 3 NR.Tsw x x x x x x x x x x x 4 NR.Tm x x x x x x x x x x x 5 NR.Trim x x x x x x x x x x 6 NR.True_wind_speed_m_s x x x x x x x x x x 7 NR.True_relative_wind_direction_deg x x x x x x x x x x 8 HC.Vrel x x x x 9 HC.gammarel x x x x 10 HC.mean.Hs x x x x x x 11 HC.mean.Tp x x x x x x 12 HC.var.Hs x x x x x x 13 HC.var.Tp x x x x x x 14 HC.var.Td x x x x x x 15 HC.var.Ws x x x x x x 16 HC.var.gamma x x x x x x 17 NR.UTC x x x x x 18 HC.Ws x x 19 HC.gamma x x 27 NR.Sea_state_m x x x x 28 NR.True_relative_sea_direction_deg x x x x Table 4: Relative cross validations errors predictive standard deviations. Input variable ω K σ k setup % % Fig. 2 shows the relative prediction error with error bars based on the predicted standard deviation. It is seen how the error bars generally increase with high prediction errors or areas with sparse data density.
8 Fig. 2: Relative prediction errors with error from the predicted standard deviation (input 21, years) In order to validate the prediction errors an Artificial Neural Network (ANN) has been trained and tested with the same setups as the GPR. The ANN is a feed forward network as described in Pedersen and Larsen (2009). Comparing the prediction methods in Fig. 3 show that the cross validation errors for GPR in most cases are lower or in the same order of magnitude as ANN, was satisfactory for the further study of GPR with performance data. Fig. 3: Comparison of prediction methods GPR and ANN 4.2. Results from Container ship #1-#5 Based on the findings in the previous sections the remaining 4 container ships was trained and evaluated similarly as for container ship #1, i.e. using input variable combination 9, 18, 20 and 21, in order to represent data with and without hindcast data, and with and without time as a variable. Furthermore each of the data set was split into five subset: one for the first year, another for the first two years of data, a third one for the first three year and so forth until the first dry-docking. Fig. 4 show that for the data with hindcast input (20 and 21) is only affected very little by the increasing number of data and the end up with prediction errors between 4 and 6%. For the data set using noon reports without the hindcast data there seems to be a significant benefit using more data, since the prediction errors decrease with time for most of the data set.
9 Fig. 4: Cross-validation errors of container ship #1-#5 5. Analysis of the Input-Output Variables In order to evaluate the influence of each of the input variables, different trainings have been performed with Container ship #1. The length-scales for each of the input variables have then been evaluated for the different training setups. Initially the first year of container ship #1 data was used with all the relevant input variables including the hindcast data, see Table 1. In Fig. 2 the length-scales are illustrated in a bar diagram for the input variable setup where time is included in variable setup 21 and the one without time 20 after training the data from the launching until the first dry-docking. Here all the available input variables are used, i.e. many data points are dismissed because of lacking
10 hindcast availability. For both input 20 and 21 the figure show a clear trend of the logged speed has the highest influence (input 1). The speed over the ground (input 2) is slightly higher together with hindcast wind speed and direction, and significant wave height have an equally influence. The hindcast wave period is generally less important. When the time (input 17) is introduced in variable setup 21, it has a significant influence itself, but is also influence the influence of other variables. The most dramatic drop between variable setup 20 and 21 is the seawater temperature (input 3). The time was introduced as an estimator for the hull fouling since the propulsion performance is expected to drop over time of this. The significant influence of the time variable confirms the strong relation between the expected to change in the performance over time. The draught and trim (input 4 and 5) have a smaller impact than expected but this might be due to only small variation for these variables in the current data set. Fig. 5: Logarithmic length-scales of input variable setups 20 and 21 (with time as a variable) Fig. 6: Logarithmic length-scales of input variable setups 17 and 18(with time as a variable), using only noon report data. Fig. 7: The accumulated length-scales of input variable setup 20 (without time), trained for the three periods of time 0-1, 0-2, years (4.5 year is the time of the first dry docking). Fig. 8: The accumulated length-scales of input variable setup 21 (including time input variable 17), trained for the three periods of time 0-1, 0-2, years (4.5 year is the time of the first dry docking). Fig. 6 is presenting the length-scales based on the training of data with all noon report data available and no hindcast data, input variable setup 17 and 18, without and with the time as input variable. Again it is confirmed that the ship speed is the variable with the most significant influence with very small length-scales for both the input combinations. For the variable setup 18, with time, the
11 remaining input variables are all with the same order of magnitude. For input variable setup 17, the seawater temperature (input 3) and the trim (input 5) indicates to have more influence than the remaining variables. In order asses the trend for shorter logging periods the data was trained also trained for 0-1 year, 0-2 years and 0-until the first dry-docking about 4.5 years. This has been performed for the variable setups 20 and 21 and the accumulated bars are shown in Fig. 7 and Fig. 8. The figures show that except for a few incidents the length-scales becomes shorter for longer data series. 6. Detection of performance trend The overall propulsion performance of a ship is expected to decrease over time mainly due to fouling of the hull and propeller. The change in performance can be determined by comparing the actual measured energy consumption EC propulsion power or fuel consumption with a calculated or predicted energy consumption EC of how the vessel should be able to perform. With GPR, EC predictions can be based on training of a previous period of time. The difference between the predicted and actual values, i.e. the prediction error, is thus a measure of the vessel propulsion performance. In the analysis, the relative prediction error ω = EC EC EC was used to evaluate the behaviour of the performance. The actual energy consumption EC is expected to increase over time due to the fouling, while ω is expected to decrease since the predicted values are based on the training data which are assumed not to be affected by fouling. It is thus desirable to train on the shortest possible period of time in order to limit the effect of a trend in the training data. Yet the training set should include a reasonable variation in the input variables. The training was performed on the entire training set and tested on the remaining data set resulting in a predicted energy consumption EC. The change in propulsion performance due to fouling is assumed to be linear, and the relative prediction error ω can be estimated as a linear function of the time. In order to account for the predicted variance, a ''weighted least square'' regression is performed on ω as a function of the time where the weights, w, are the inverse relative predicted variance (1 2 σ R ). This gives the prediction errors with large variance less influence on the linear regression model. As discussed previously the majority of the larger prediction errors also have a large standard deviation σ 2 R, so letting the inverse variance being the weights makes these prediction errors less relevant. Weighted least square regression is similar to normal least square, but with the residual ω i f(x, α) multiplied by the weights, w ii in the sum of square errors (13) which is minimized with respect to α. The x-axis is represented by the time t i. n E(α) = w ii ω i f(t i, α) 2 i=1 The function values of the performance trend can be define as a Vessel Performance Index, VPI: (13) VPI(t) = αt + β (14) The performance trend is only continuous in periods without external disturbances that change the propulsion performance. External disturbances can be known or unknown. The known disturbances are e.g. dry-docking and hull and propeller cleaning, and unknown disturbances could e.g. be sudden unknown damage of the propeller or rudder. All known disturbances or events are available for the container ships and the trend was thus found between the known events, including dry-docking DD, hull cleaning HCL and propeller cleaning PCL
12 for the container ship. All these events are expected to increase the relative prediction error ω, because the energy consumption is expected to drop. But for the hull and propeller cleaning, this effect can sometimes be difficult to detect due to its limited impact on the relatively large data scatter. Therefore the trend detection has been performed both between all the known events and the dry-dockings only. Fig. 9: Performance trends for container ship #1-#5 The time variable in input variable setups 18 and 21 increases linearly with time. This means that if time has relevance for the regression model, it will be dominant for prediction far into the future. Input combinations including the time were thus inapplicable for the trend detection. The best input variable setups without time was previously found to be 17 and 20, with 17 based exclusively on noon report data, and 20 including the hindcast data.
13 Given the considerations described above, EC for the five container ships were predicted based on two training periods, one and two years from launching, two input variable setups, 17 and 20. The trends were detected only between all the known events and the dry-dockings. The performance trends of container ship #1-#5 are illustrated in Fig. 9. For container ship #1 the intermediate events with short intervals gave too few data make reasonable trends from and subsequently only the dry-dockings was used, which gave a good picture of how the performance increase after a dry-docking, but afterwards drops with a higher rate than before. Container ship #2 has fewer events and thus more continuous development where only a small increase in the performance is detected. Container ship #3 have more events, but with reasonable time between and it thus become easier develop trend for after the first hull cleaning there is a drop in the performance, indicating that the hull cleaning have actually had a negative effect on the propulsion performance. The following drydocking has a small positive effect. Container ship #5 show a very clear trend of the second hull-cleaning having a temporary significant positive effect, but the slope of the performance trend drops dramatically, indicating that the anti fouling paint might have been damaged. The following dry-docking is bringing the vessel back to state which is actually better than at new. 7. Discussion With the methods described above, it is possible to detect the general trends of the change in the performance over time without any predefined definitions of the vessel. Using a combination of noon report data and hindcast data increases the prediction performance significant, even though it reduces the number of inputs. The dry-docking, hull and propeller cleanings do not always have the intended effect. The hull cleanings may have a positive immediate effect, but the long-term trend can be negative. It might also be that the event has no immediate effect, but the long-term effect can be beneficial. In order to evaluate this, detailed information about the event is needed such as what part of the ship was cleaned and what equipment was used. In general, the dry-dockings had a more consistent effect with a positive change in VPI after the docking, but the slope afterwards varied from being steeper than before, as for Container ship #1, to being flatter than the previous trend, as for Container ship #4.
14 References BHATTACHARYYA, R., (1978), Dynamics of Marine Vehicles, Ocean Engineering, Wiley- Interscience, 1 st Edition HARVALD, S.A. (1983), Resistance and Propulsion of Ships, John Wiley & Sons HOLTROP, J. (1984), A statistical re-analysis of resistance and propulsion data. International Shipbuilding Progress, p ISHERWOOD, R.M. (1972), Wind Resistance of Merchant Ships, Proceedings of Royal Institution of Naval Architects. NEIL, R. (1994), Priors for Infinite Networks, Technical Report CRG-TR-94-1, Department of Computer Science, University of Toronto PEDERSEN, B.P.; LARSEN, J. (2009), Prediction of Full-scale Propulsion Power Using Artificial Neural Networks, 8 th Int. Conf. Computer and IT Appl. Maritime Ind, Budapest, pp RASMUSSEN, C.; WILLIAMS, K. I. (2006), Gaussian Processes for Machine Learning, The MIT Presse, London, England, 1. Edition
Data-driven Vessel Performance Monitoring
Data-driven Vessel Performance Monitoring ErhvervsPhD projekt af Benjamin Pjedsted Pedersen FORCE Technology / DTU Poul Andersen Maritime Engineering, Department of Mechanical Engineering, DTU Jan Larsen
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
Gaussian Processes in Machine Learning
Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany [email protected] WWW home page: http://www.tuebingen.mpg.de/ carl
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
Long-term Stock Market Forecasting using Gaussian Processes
Long-term Stock Market Forecasting using Gaussian Processes 1 2 3 4 Anonymous Author(s) Affiliation Address email 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
D-optimal plans in observational studies
D-optimal plans in observational studies Constanze Pumplün Stefan Rüping Katharina Morik Claus Weihs October 11, 2005 Abstract This paper investigates the use of Design of Experiments in observational
Dimitris Theodosiou Managing Director Danaos Management Consultants SA.
Goal Based technical management systems for shipping Dimitris Theodosiou Managing Director Danaos Management Consultants SA. www.danaos.gr Technical Management Aims to : Guarantee vessel availability for
of Presentation Performance Management
2016 American Bureau of Shipping. All rights reserved. Title Vessel of Presentation Performance Management Soren Hansen Assistant Director, Vessel Performance Genoa 18 April 2016 ABS SEMINAR Presentation
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
Component Ordering in Independent Component Analysis Based on Data Power
Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals
Machine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
AP Physics 1 and 2 Lab Investigations
AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks
Local Gaussian Process Regression for Real Time Online Model Learning and Control
Local Gaussian Process Regression for Real Time Online Model Learning and Control Duy Nguyen-Tuong Jan Peters Matthias Seeger Max Planck Institute for Biological Cybernetics Spemannstraße 38, 776 Tübingen,
Introduction to Logistic Regression
OpenStax-CNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction
INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr.
INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr. Meisenbach M. Hable G. Winkler P. Meier Technology, Laboratory
Gamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
Gaussian Conjugate Prior Cheat Sheet
Gaussian Conjugate Prior Cheat Sheet Tom SF Haines 1 Purpose This document contains notes on how to handle the multivariate Gaussian 1 in a Bayesian setting. It focuses on the conjugate prior, its Bayesian
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Gerard Mc Nulty Systems Optimisation Ltd [email protected]/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I
Gerard Mc Nulty Systems Optimisation Ltd [email protected]/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy
Data Analysis on the ABI PRISM 7700 Sequence Detection System: Setting Baselines and Thresholds. Overview. Data Analysis Tutorial
Data Analysis on the ABI PRISM 7700 Sequence Detection System: Setting Baselines and Thresholds Overview In order for accuracy and precision to be optimal, the assay must be properly evaluated and a few
Implementing a Ship Energy Efficiency Management Plan (SEEMP) Guidance for shipowners and operators
Implementing a Ship Energy Efficiency Management Plan (SEEMP) Guidance for shipowners and operators Lloyd s Register, its affiliates and subsidiaries and their respective officers, employees or agents
Estimation and attribution of changes in extreme weather and climate events
IPCC workshop on extreme weather and climate events, 11-13 June 2002, Beijing. Estimation and attribution of changes in extreme weather and climate events Dr. David B. Stephenson Department of Meteorology
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
New Ferries for Gedser - Rostock. Development of a transport concept GR12. Ferry Conference in Copenhagen 22 nd November 2010
Development of a transport concept Ferry Conference in Copenhagen 22 nd November 2010 Lars Jordt Johannes Johannesson LJ / JOJ 22/11-2010 1 The routes of Scandlines LJ / JOJ 22/11-2010 2 The route Northern
Using kernel methods to visualise crime data
Submission for the 2013 IAOS Prize for Young Statisticians Using kernel methods to visualise crime data Dr. Kieran Martin and Dr. Martin Ralphs [email protected] [email protected] Office
> Capital Markets Day 2011
Wilh. Wilhelmsen ASA > Capital Markets Day 2011 Jan Eyvin Wang President and CEO 8 December 2011, Lysaker > Disclaimer This presentation may contain forward- looking expectations which are subject to risk
OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS
OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS CLARKE, Stephen R. Swinburne University of Technology Australia One way of examining forecasting methods via assignments
L13: cross-validation
Resampling methods Cross validation Bootstrap L13: cross-validation Bias and variance estimation with the Bootstrap Three-way data partitioning CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna CSE@TAMU
The influence of ship operational parameters on fuel consumption
Scientific Journals Maritime University of Szczecin Zeszyty Naukowe Akademia Morska w Szczecinie 2013, 36(108) z. 1 pp. 2013, 36(108) z. 1 s. ISSN 1733-8670 The influence of ship operational parameters
Exploratory Data Analysis
Exploratory Data Analysis Johannes Schauer [email protected] Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
Safe carriage of oil in extreme environments
Safe carriage of oil in extreme environments Stena Aframax. 117,100 DWT. Swedish-Finnish Ice Class 1A Super Stena Panamax. 74,999 DWT. Swedish-Finnish Ice Class 1A Stena P-MAX. 49,900 DWT. Swedish-Finnish
Multivariate normal distribution and testing for means (see MKB Ch 3)
Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................
FURTHER TECHNICAL AND OPERATIONAL MEASURES FOR ENHANCING ENERGY EFFICIENCY OF INTERNATIONAL SHIPPING
MARINE ENVIRONMENT PROTECTION COMMITTEE 67th session Agenda item 5 MEPC 67/5/XX [ ] August 2014 Original: ENGLISH FURTHER TECHNICAL AND OPERATIONAL MEASURES FOR ENHANCING ENERGY EFFICIENCY OF INTERNATIONAL
1 Teaching notes on GMM 1.
Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in
Multivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
Introduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
Propulsion Trends in LNG Carriers
This Is a Headline This is a subheadline Propulsion Trends in LNG Carriers Two-stroke Engines Contents Introduction...5 Market Development...6 Definition and types of liquefied gas...6 Types of LNG carriers
Machine Learning in Statistical Arbitrage
Machine Learning in Statistical Arbitrage Xing Fu, Avinash Patra December 11, 2009 Abstract We apply machine learning methods to obtain an index arbitrage strategy. In particular, we employ linear regression
Analysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
Lecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy
BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.
Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm
1 Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm Hani Mehrpouyan, Student Member, IEEE, Department of Electrical and Computer Engineering Queen s University, Kingston, Ontario,
Probabilistic user behavior models in online stores for recommender systems
Probabilistic user behavior models in online stores for recommender systems Tomoharu Iwata Abstract Recommender systems are widely used in online stores because they are expected to improve both user
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
A Study on the Comparison of Electricity Forecasting Models: Korea and China
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 675 683 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.675 Print ISSN 2287-7843 / Online ISSN 2383-4757 A Study on the Comparison
How To Optimise A Boat'S Hull
Hull form optimisation with Computational Fluid Dynamics Aurélien Drouet, Erwan Jacquin, Pierre-Michel Guilcher Overview Context Software Bulbous bow optimisation Conclusions/perspectives Page 2 HydrOcean
Detection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
A comparison between different volatility models. Daniel Amsköld
A comparison between different volatility models Daniel Amsköld 211 6 14 I II Abstract The main purpose of this master thesis is to evaluate and compare different volatility models. The evaluation is based
Cost optimization of marine fuels consumption as important factor of control ship s sulfur and nitrogen oxides emissions
Scientific Journals Maritime University of Szczecin Zeszyty Naukowe Akademia Morska w Szczecinie 2013, 36(108) z. 1 pp. 94 99 2013, 36(108) z. 1 s. 94 99 ISSN 1733-8670 Cost optimization of marine fuels
Statistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
Java Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
Model-based Synthesis. Tony O Hagan
Model-based Synthesis Tony O Hagan Stochastic models Synthesising evidence through a statistical model 2 Evidence Synthesis (Session 3), Helsinki, 28/10/11 Graphical modelling The kinds of models that
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Multiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
Evaluating ship collision risks
Evaluating ship collision risks Silveira, P., Teixeira, A.P, & Guedes Soares, C. IRIS Project risk management: Improving risk matrices using multiple criteria decision analysis Centre for Marine Technology
An Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia [email protected] Tata Institute, Pune,
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
Data Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
Artificial Neural Network and Non-Linear Regression: A Comparative Study
International Journal of Scientific and Research Publications, Volume 2, Issue 12, December 2012 1 Artificial Neural Network and Non-Linear Regression: A Comparative Study Shraddha Srivastava 1, *, K.C.
SOFTWARE & SERVICES. English TOTAL
SOFTWARE & SERVICES English TOTAL S H I P C A R E SOF T WAR E & SERVICES TOTA L S H I P C A R E Comprehensive Software and Services for Total Ship Care DESIGN HULL PrimeShip-IPCA MACHINERY Integrated System
Gaussian Process Training with Input Noise
Gaussian Process Training with Input Noise Andrew McHutchon Department of Engineering Cambridge University Cambridge, CB PZ [email protected] Carl Edward Rasmussen Department of Engineering Cambridge University
A Learning Algorithm For Neural Network Ensembles
A Learning Algorithm For Neural Network Ensembles H. D. Navone, P. M. Granitto, P. F. Verdes and H. A. Ceccatto Instituto de Física Rosario (CONICET-UNR) Blvd. 27 de Febrero 210 Bis, 2000 Rosario. República
Summary of specified general model for CHP system
Fakulteta za Elektrotehniko Eva Thorin, Heike Brand, Christoph Weber Summary of specified general model for CHP system OSCOGEN Deliverable D1.4 Contract No. ENK5-CT-2000-00094 Project co-funded by the
IMO. MSC/Circ.707 19 October 1995. Ref. T1/2.04 GUIDANCE TO THE MASTER FOR AVOIDING DANGEROUS SITUATIONS IN FOLLOWING AND QUARTERING SEAS
INTERNATIONAL MARITIME ORGANIZATION 4 ALBERT EMBANKMENT LONDON SE1 7SR Telephone: 020-7735 7611 Fax: 020-7587 3210 Telex: 23588 IMOLDN G IMO E MSC/Circ.707 19 October 1995 Ref. T1/2.04 GUIDANCE TO THE
Econometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
Part 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
Dimensionality Reduction: Principal Components Analysis
Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
Energy Efficiency of Ships: what are we talking about?
Energy Efficiency of Ships: what are we talking about? Flickr.com/OneEIghteen How do you measure ship efficiency? What is the best metric? What is the potential for regulation? CONTEXT In July 2011, the
CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION
No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
8.1 Summary and conclusions 8.2 Implications
Conclusion and Implication V{tÑàxÜ CONCLUSION AND IMPLICATION 8 Contents 8.1 Summary and conclusions 8.2 Implications Having done the selection of macroeconomic variables, forecasting the series and construction
3. Regression & Exponential Smoothing
3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a
Distributed computing of failure probabilities for structures in civil engineering
Distributed computing of failure probabilities for structures in civil engineering Andrés Wellmann Jelic, University of Bochum ([email protected]) Matthias Baitsch, University of Bochum ([email protected])
Bollard Pull. Bollard Pull is, the tractive force of a tug, expressed in metric tonnes (t) or kn.
Bollard Pull (Capt. P. Zahalka, Association of Hanseatic Marine Underwriters) Bollard Pull is, the tractive force of a tug, expressed in metric tonnes (t) or kn. This figure is not accurately determinable
A Forecasting Decision Support System
A Forecasting Decision Support System Hanaa E.Sayed a, *, Hossam A.Gabbar b, Soheir A. Fouad c, Khalil M. Ahmed c, Shigeji Miyazaki a a Department of Systems Engineering, Division of Industrial Innovation
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
Neural Networks: a replacement for Gaussian Processes?
Neural Networks: a replacement for Gaussian Processes? Matthew Lilley and Marcus Frean Victoria University of Wellington, P.O. Box 600, Wellington, New Zealand [email protected] http://www.mcs.vuw.ac.nz/
A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA
REVSTAT Statistical Journal Volume 4, Number 2, June 2006, 131 142 A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA Authors: Daiane Aparecida Zuanetti Departamento de Estatística, Universidade Federal de São
Standardization and Its Effects on K-Means Clustering Algorithm
Research Journal of Applied Sciences, Engineering and Technology 6(7): 399-3303, 03 ISSN: 040-7459; e-issn: 040-7467 Maxwell Scientific Organization, 03 Submitted: January 3, 03 Accepted: February 5, 03
Design and Operation of Fuel Efficient Ships. Jan de Kat Director, Energy Efficiency Operational and Environmental Performance Copenhagen
Design and Operation of Fuel Efficient Ships Jan de Kat Director, Energy Efficiency Operational and Environmental Performance Copenhagen Genova 21 November 2013 Energy efficiency: key issues High fuel
Planning And Scheduling Maintenance Resources In A Complex System
Planning And Scheduling Maintenance Resources In A Complex System Martin Newby School of Engineering and Mathematical Sciences City University of London London m.j.newbycity.ac.uk Colin Barker QuaRCQ Group
AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
Models for Product Demand Forecasting with the Use of Judgmental Adjustments to Statistical Forecasts
Page 1 of 20 ISF 2008 Models for Product Demand Forecasting with the Use of Judgmental Adjustments to Statistical Forecasts Andrey Davydenko, Professor Robert Fildes [email protected] Lancaster
RINA solutions for Ship Energy Governance
RINA solutions for Ship Energy Governance ENVIRONMENT & SUSTAINABILITY Marine Energy Business Assurance Transport & Infrastructures IFIs, Banks and Investors INNOVATION COMPANY PROFILE RINA SERVICES S.p.A.
Hydrostatically Balanced Loading
OIL COMPANIES INTERNATIONAL MARINE FORUM Hydrostatically Balanced Loading (December 1998) The OCIMF mission is to be recognised internationally as the foremost authority on the safe and environmentally
Master s Theory Exam Spring 2006
Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem
Shipping, World Trade and the Reduction of
Shipping, World Trade and the Reduction of United Nations Framework Convention on Climate Change International Maritime Organization Marine Environment Protection Committee International Chamber of Shipping
Basics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu [email protected] Modern machine learning is rooted in statistics. You will find many familiar
What is Data Mining, and How is it Useful for Power Plant Optimization? (and How is it Different from DOE, CFD, Statistical Modeling)
data analysis data mining quality control web-based analytics What is Data Mining, and How is it Useful for Power Plant Optimization? (and How is it Different from DOE, CFD, Statistical Modeling) StatSoft
Getting Correct Results from PROC REG
Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking
HT2015: SC4 Statistical Data Mining and Machine Learning
HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric
MATHEMATICS FOR ENGINEERING BASIC ALGEBRA
MATHEMATICS FOR ENGINEERING BASIC ALGEBRA TUTORIAL 3 EQUATIONS This is the one of a series of basic tutorials in mathematics aimed at beginners or anyone wanting to refresh themselves on fundamentals.
