Fuel Cell Health Monitoring Using Self Organizing Maps




A publication of CHEMICAL ENGINEERING TRANSACTIONS, VOL. 33, 2013
Guest Editors: Enrico Zio, Piero Baraldi
Copyright 2013, AIDIC Servizi S.r.l., ISBN 978-88-95608-24-2; ISSN 1974-9791
The Italian Association of Chemical Engineering. Online at: www.aidic.it/cet
DOI: 10.3303/CET1333171

Fuel Cell Health Monitoring Using Self Organizing Maps

Raïssa Onanena a,b,c, Latifa Oukhellou* a, Etienne Côme a, Samir Jemei c, Denis Candusso b, Daniel Hissel c, Patrice Aknin a

a Université Paris-Est, IFSTTAR-GRETTIA, 14-20 Boulevard Newton, F-77447 Champs-sur-Marne, France
b IFSTTAR, FCLAB Institute, Rue Thierry-Mieg, 90010 Belfort Cedex, France
c FEMTO-ST UMR CNRS 6174, FR FCLAB (FR CNRS 3539), University of Franche-Comte, Rue Thierry-Mieg, 90010 Belfort Cedex, France
latifa.oukhellou@ifsttar.fr

The durability of fuel cell technology is central to its spread and commercialization. There is therefore a growing need to build accurate diagnosis tools that can give the operating state of fuel cells during their use. When supervised machine learning approaches are used to build such diagnosis tools, they generally require a large amount of labeled data. Collecting and annotating these data can be difficult or time consuming. In this paper, the authors are interested in the monitoring of fuel cells in an unsupervised framework, meaning that no labels are required to learn the diagnosis model. The aim is to build a monitoring tool able to easily visualize the State Of Health of fuel cells from electrochemical impedance spectroscopy measurements, thus showing its evolution from the fault-free case ("normal" behaviour) to defective classes such as drying or flooding. The proposed approach is based on Self Organizing Maps (SOM), which have shown their performance in fault detection and prediction for many industrial systems. Because the data are automatically visualized in a two-dimensional space, the interpretation of the results becomes easy and intuitive.
The approach also allows the clustering of the data into different groups of classes, thus enabling the classification of new observations. Experimental results obtained on real data sets have shown the efficiency of the proposed approach with respect to standard supervised and unsupervised classification approaches.

1. Introduction

The durability of fuel cell technology is central to its spread and commercialization. There is therefore a growing need to build accurate diagnosis tools that can give the operating state of fuel cells during their use. Various diagnosis approaches have been proposed for that purpose. They include model-based approaches (Burford et al., 2004) and gray-box or black-box model approaches using fuzzy logic (Hissel et al., 2004) or neural networks (Nitsche et al., 2004). Non-parametric identification by Markov parameters has also been presented (Tsujioku et al., 2004), as have a fuzzy-clustering-based FC diagnosis approach (Hissel et al., 2007) and a diagnosis approach using Bayesian networks (Wang et al., 2012). As for pattern-recognition-based methods, the authors proposed in a previous study a supervised machine learning approach for FC diagnosis using Electrochemical Impedance Spectroscopy (EIS) measurements.

Supervised machine learning approaches for diagnosis require labeled data, i.e. a priori knowledge about the intrinsic structure of the dataset. Collecting and annotating these labels can be difficult and time consuming, and the risk of mislabeling the data is sometimes non-negligible. It is therefore worth considering the unsupervised learning framework to handle and study the data. In this paper, the state-of-health monitoring of PEM fuel cells using an unsupervised learning method is investigated.
In unsupervised learning, the main goal is to discover groups or similarities in the data set, and no label is required to learn the diagnosis model. The aim here is to build a monitoring tool able to easily visualize the health of fuel cells from electrochemical impedance spectroscopy measurements, thus showing its evolution from the fault-free case ("regular" behaviour) to defective classes such as drying or flooding. The proposed approach is based on Self Organizing Maps (SOM), which have shown their performance in fault detection and prediction for many industrial systems, such as vehicle cooling systems (Svensson et al., 2008) or aircraft engine fleet monitoring (Côme et al., 2010). Because the data are automatically visualized in a two-dimensional space, the interpretation of the results becomes easy and intuitive.

The paper is organized as follows. Section 2 presents the database used for diagnosis. Section 3 briefly introduces self-organizing maps. Finally, Section 4 presents and discusses the results obtained with the Self-Organizing Maps.

Please cite this article as: Onanena R., Oukhellou L., Come E., Jemei S., Candusso D., Hissel D., Aknin P., 2013, Fuel cell health monitoring using self organizing maps, Chemical Engineering Transactions, 33, 1021-1026 DOI: 10.3303/CET1333171

2. Database

The database used in this work was built from experiments on a 20-cell PEMFC stack operated under various operating conditions, comparable to real-life conditions. The stack was assembled with commercial perfluorosulfonic Membrane-Electrode Assemblies (MEA), graphite bipolar plates, and electrodes with a 100 cm² area. The experiments and EIS measurements were performed using a 10 kW test bench developed in-lab (Wasterlain et al., 2009). The operating conditions were generated using a 2-level fractional experimental design. This experimental design deals with the impact of five factors on the operating conditions:
- anodic Relative Humidity (RHa) [35 %, 75 %]
- cathodic Relative Humidity (RHc) [35 %, 75 %]
- anodic Stoichiometric Factor (FSa) [1.8, 3.0]
- cathodic Stoichiometric Factor (FSc) [2, 3]
- FC temperature.

The stack is then operated under these various conditions, and faults from drying to flooding are triggered by varying one or several parameters. Eleven experimental points, at which EIS spectra were measured, were considered during the experiments (Wasterlain et al., 2009). The current density was fixed at 0.5 A/cm². Together, the experiments yielded a database of EIS spectra, as presented in Table 1.
Table 1: Experimental points used to generate the dataset

Test Nb.     1    2    3    4    5    6    7    8    9    10   11
RHa (%)      35   35   35   35   35   75   75   75   75   75   75
RHc (%)      35   75   75   75   75   35   35   75   75   75   75
FSa          3    1.8  1.8  3    3    1.8  3    1.8  1.8  3    3
FSc          3    1.8  1.8  3    3    1.8  3    1.8  1.8  3    3
T (°C)       60   60   80   80   60   60   80   80   60   60   80
Nb of EIS    20   21   21   21   21   21   21   21   21   21   21

As a pre-processing step, a feature extraction was performed on the data because of its potential benefits for the performance of the diagnosis tools. These include:
- Facilitating data visualization and understanding.
- Reducing computational times.
- Mitigating the curse of dimensionality.

The extraction approach chosen was to take the real and imaginary parts of the spectra at the following six frequencies (12 features):
- 5 kHz: Re(f = 5 kHz), Im(f = 5 kHz)
- 500 Hz: Re(f = 500 Hz), Im(f = 500 Hz)
- 50 Hz: Re(f = 50 Hz), Im(f = 50 Hz)
- 5 Hz: Re(f = 5 Hz), Im(f = 5 Hz)
- 77.5 mHz: Re(f = 77.5 mHz), Im(f = 77.5 mHz)
- 50 mHz: Re(f = 50 mHz), Im(f = 50 mHz)

Therefore, for the following clustering step, the database is made up of 12 input variables.

3. SOM methodology

A self organizing map (SOM) is a type of Artificial Neural Network (ANN) that is trained within an unsupervised learning framework (no class labels are required). SOMs are well-known data mining tools, frequently used to project high-dimensional data into a low-dimensional space that is assumed to be well suited to capturing the structure of the original data. The map that results from this training therefore shows the different clusters present in the data. Like most Artificial Neural Networks, SOMs operate in two modes: training and mapping.

A Self-Organizing Map consists of neurons (or units) on a regular low-dimensional grid which defines a neighborhood relation between the prototypes. Each unit $i$ of the grid is represented by an N-dimensional weight vector or prototype vector $m_i$, where N is equal to the dimension of the input vectors. The neurons are connected to adjacent neurons by a neighborhood relation, which dictates the topology or structure of the map. Self-organization means that, after convergence, prototypes which are close on the grid will also be close in the original data space. After the SOM has been learned, new data can be projected by assigning them to the map location of their best matching unit (BMU), i.e. the prototype vector closest to the data point.

The SOM training algorithm resembles vector quantization algorithms such as k-means (Webb, 2011). The important distinction is that, in addition to the best matching unit, its topological neighbors on the map are also updated: the region around the best matching unit is stretched towards the input vector. The aim is to find a suitable set of weights for each unit so that the network estimates the distribution of the input data. The learning rule responsible for finding a suitable set of weights iterates the two following steps:

Competitive step: given an input vector $x(t)$, the distance between each unit $i$ of the SOM and the input vector is calculated with the Euclidean metric. The unit with the smallest distance is the one that most closely represents the current input, and is thus considered the winner or best matching unit (BMU) $c$ for that input:

$$ c = \arg\min_i \| x(t) - m_i(t) \| \qquad (1) $$

Cooperative step: the weights of the winning unit are moved towards the input. In addition, the winning unit's neighbors are also moved in this direction, but by an amount that decays with the distance of the neighbors from the winning unit. The update rule for the weight vector of unit $i$ is given by the following:

$$ m_i(t+1) = m_i(t) + \alpha(t)\, h_{ci}(t)\, [\, x(t) - m_i(t) \,] \qquad (2) $$

where $t$ denotes time, $\alpha(t)$ is the learning rate and $h_{ci}(t)$ is a neighborhood kernel centered on the winner unit. A Gaussian kernel is usually chosen:

$$ h_{ci}(t) = \exp\left( - \frac{\| r_c - r_i \|^2}{2 \sigma^2(t)} \right) \qquad (3) $$

where $r_c$ and $r_i$ are the positions of the BMU and of unit $i$ respectively on the map, and $\sigma(t)$ is the neighborhood radius.
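The competitive and cooperative steps of Eqs. (1)-(3) can be sketched in a few lines of NumPy. This is a minimal illustration of the sequential update rule only, not the Matlab toolbox code used for the experiments. The function names, the random initialization from data samples, the linear decay of the learning rate `alpha` and radius `sigma`, and all default values are assumptions made for the example.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Competitive step, Eq. (1): index of the prototype closest to x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

def som_update(weights, grid, x, alpha, sigma):
    """Cooperative step, Eqs. (2)-(3): pull the BMU and its map neighbours towards x."""
    c = best_matching_unit(weights, x)
    # Squared distance on the map (not in data space) between each unit and the BMU.
    grid_dist2 = np.sum((grid - grid[c]) ** 2, axis=1)
    h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))      # Gaussian kernel, Eq. (3)
    return weights + alpha * h[:, None] * (x - weights)

def train_som(data, rows=6, cols=12, n_epochs=20, alpha0=0.5, sigma0=3.0, seed=0):
    """Sequential SOM training on a rectangular grid (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    # Map coordinates of the units, in row-major order.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Assumption: prototypes start as randomly drawn data samples.
    weights = data[rng.integers(0, len(data), size=rows * cols)].astype(float)
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for idx in rng.permutation(len(data)):
            frac = step / n_steps
            alpha = alpha0 * (1.0 - frac)              # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # assumed linear decay with a floor
            weights = som_update(weights, grid, data[idx], alpha, sigma)
            step += 1
    return weights, grid
```

With a data matrix of shape (number of spectra, 12), `train_som(data)` would return one 12-dimensional prototype per map unit, and `best_matching_unit` can then be used to project a new observation onto the map.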
Both the radius and the learning rate decrease monotonically with time (Vesanto et al., 2000). From the initialization phase to the end of the training, the above process is iterated for each input vector of the data set, which results in a competition between different regions of the input space for units of the map. Dense regions of the input space will tend to attract more units than sparse ones, with the distribution of units in the weight space ultimately reflecting the distribution of the input data in the input space. Neighborhood learning also encourages topology preservation, meaning that units which are close on the map end up close in the weight space as well.

4. Results and discussion

4.1 Data clusters

The first experiment aims at demonstrating the ability of the SOM to perform a clustering of the dataset. The aim is to identify, in the trained SOM, groups of clusters where each cluster corresponds to an operating class. The training of the SOM was performed using the Matlab SOM toolbox. The learning was done under the default setting: the learning algorithm was the batch algorithm with linear initialization. Figure 1 presents the results obtained with the classical SOM. The topology of the map was fixed manually: it contains 6 x 12 neurons. Therefore, if each map unit is initially considered as a cluster, the result is a 72-class clustering. A more in-depth analysis of the map can be done using the U-matrix (Ultsch, 2003). The U-matrix represents the average distance between each map unit and its closest neighbors; it is consequently an interesting method for identifying groups of clusters among the map units. The scale on the left side of the U-matrix gives information about the distances in the U-matrix. In Figure 1, the redder colors show long distances between the nodes whereas the bluer colors show small distances between the nodes, thereby revealing the clusters in the dataset. The figure exhibits 3 or 4 main clusters.
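The U-matrix computation described above can be sketched as follows. The sketch assumes a rectangular grid with a 4-neighbourhood and prototypes stored in row-major order; the lattice actually used for Figure 1 is not stated in the text, so the function name and these layout choices are illustrative only.

```python
import numpy as np

def u_matrix(weights, rows, cols):
    """Average data-space distance from each unit's prototype to its grid neighbours.

    Assumes `weights` has shape (rows * cols, n_features) in row-major order
    and that neighbours are the up/down/left/right units of a rectangular grid.
    """
    w = weights.reshape(rows, cols, -1)
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(np.linalg.norm(w[r, c] - w[rr, cc]))
            u[r, c] = np.mean(dists)
    return u
```

Low values of `u` mark units whose prototypes lie close to those of their neighbours (the interior of a cluster), while ridges of high values mark the borders between clusters.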
In order to validate the visual assessment, the clusters can be validated through criteria such as compactness (members of each cluster should be as close to each other as possible) and separation (the clusters should be widely separated) (Legány et al., 2006). Six validity indices were tested (see Figure 2), corresponding to some of the most widely used indices. As can be seen in Figure 2, they indicate that the number of clusters in the dataset is likely to be 3 or 4, in agreement with the SOM.
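As an illustration of how a validity index trades compactness against separation, the sketch below computes the Davies-Bouldin index for several candidate numbers of clusters. The text does not name the six indices that were tested, so the choice of Davies-Bouldin is an assumption, as are the helper names and the simple k-means with deterministic farthest-first initialization used to produce the partitions.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd k-means with farthest-first initialisation (assumption of this sketch)."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):          # keep the old centroid if a cluster empties
                centroids[j] = points[labels == j].mean(axis=0)
    dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dist, axis=1), centroids

def davies_bouldin(points, labels):
    """Davies-Bouldin index: lower values mean compact, well separated clusters."""
    ids = np.unique(labels)
    cents = np.array([points[labels == j].mean(axis=0) for j in ids])
    # Compactness: mean distance of the members of each cluster to its centroid.
    scatter = np.array([np.linalg.norm(points[labels == j] - cents[i], axis=1).mean()
                        for i, j in enumerate(ids)])
    score = 0.0
    for i in range(len(ids)):
        # Worst-case ratio of joint scatter to centroid separation for cluster i.
        ratios = [(scatter[i] + scatter[j]) / np.linalg.norm(cents[i] - cents[j])
                  for j in range(len(ids)) if j != i]
        score += max(ratios)
    return score / len(ids)
```

In practice the index could be evaluated on the SOM prototypes or on the feature vectors for k = 2, 3, 4, ... and the value of k giving the lowest score retained as the suggested number of clusters.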

Figure 1: U-matrix obtained when training a SOM on the FC data

Figure 2: Evaluation of different cluster validity indices to automatically determine the number of clusters

4.2 Data labeling

In this second experiment, the SOM is used to identify mislabeled data. Following expert advice, preliminary labels were associated with the data points, given the prior knowledge available on the operating conditions. These preliminary classes or fault labels are presented in Figure 3, which shows the distribution of the data on the map according to this fault labeling. A part of the data seems to be mislabeled. For example, some data labeled as "minor drying" are clustered with data labeled as "flooding". Moreover, the severity of the fault seems to grow with the value of the principal components. Given these observations and the assumption that data generated with similar fault operating conditions belong to the same cluster, the labels were changed accordingly. The new distribution of labels in the data set and on the map is presented in Table 2 and Figure 4. This shows that SOMs can be considered useful tools for visualizing high-dimensional data.
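The caption of Figure 4 states that each corrected label follows a majority vote among its neighbours. One possible reading of that rule is sketched below, under the assumption that the "neighbours" of a data point are the other points sharing its best matching unit; the function name and this definition of the neighbourhood are hypothetical and not taken from the paper.

```python
from collections import Counter
import numpy as np

def relabel_by_majority(data, labels, weights):
    """Give every data point the most frequent label among points sharing its BMU.

    Assumption: neighbours are the points mapped to the same map unit.
    Ties are resolved in favour of the label encountered first.
    """
    bmus = np.array([int(np.argmin(np.linalg.norm(weights - x, axis=1))) for x in data])
    new_labels = list(labels)
    for unit in np.unique(bmus):
        members = np.flatnonzero(bmus == unit)
        majority, _ = Counter(labels[i] for i in members).most_common(1)[0]
        for i in members:
            new_labels[i] = majority
    return new_labels, bmus
```

A point whose expert label disagrees with most of the points mapped to the same unit is thus flagged and relabeled, which mirrors the "minor drying" points found among flooding data.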

Figure 3: The labelling of the data as initially done by the experts. A group of data points labelled as a drying fault is clustered with data points whose operating conditions are identified as flooding

Table 2: Labels of the data, with the fault severity

Fault mode               Drying                              Flooding
Severity                 moderate (Mod.D)   minor (Min.D)    moderate (Mod.F)   minor (Min.F)   slight (Sli.F)
Associated test number   1, 6   7   3   2, 4, 8, 10   5, 9   11

[Figure 4 plot: map units and clusters in the data (slight flooding, moderate flooding, minor flooding, minor drying, moderate drying), shown in the plane of Principal Component 1 (horizontal axis) and Principal Component 2 (vertical axis)]

Figure 4: The labelling of the data as finally obtained after the label changes. The modifications were made so that each "new" label corresponds to the labels of its neighbours by a majority vote.

4.3 Fault monitoring

Finally, the SOM has been used for the monitoring of the fuel cell. Two points of the data set, which represent measurements of the FC operating from flooding conditions to drying conditions, are projected on the map. Figure 5 shows how the operating point evolves on the map: it moves from top to bottom, which corresponds to the evolution from flooding operating conditions to drying conditions.

5. Conclusion

In this paper, an unsupervised approach based on Self Organizing Maps is used to monitor fuel cells. We have presented how a SOM can be used to visually identify the main clusters in the dataset, and then how it can be used to correct possible expert mislabeling of the data. Finally, experiments were carried out to show that a self-organizing map can be used to monitor the State-Of-Health of the fuel cell. This offers an alternative to supervised learning when labels are difficult or time consuming to collect.

Figure 5: The different labels of the map prototypes as learned during the training phase. On the right-hand side, the labels on the units after the modification of the mislabelled data

References

Burford D., Davis T., Mench M., 2004, Real-time electrolyte temperature measurement in an operating polymer electrolyte fuel cell, Proceedings of the Advanced Materials for Fuel Cells and Batteries Symposium, 204th Electrochemical Society Meeting.
Côme E., Cottrell M., Verleysen M., Lacaille J., 2010, Aircraft engine fleet monitoring using Self Organizing maps and edit distance, 18th European Symposium on Artificial Neural Networks, Bruges, Belgium.
Hissel D., Péra M.C., Kauffmann J.M., 2004, Diagnosis of automotive fuel cell power generators, Journal of Power Sources, 128(2), pp. 239-246.
Hissel D., Candusso D., Harel F., 2007, Fuzzy Clustering Durability Diagnosis of Polymer Electrolyte Fuel-Cells dedicated to transportation applications, IEEE Transactions on Vehicular Technology (56), pp. 2414-2420.
Kozlowski J.D., Byington C.S., Garga A.K., Watson M.J., Hay T.A., 2001, Model-based predictive diagnostics for electrochemical energy sources, IEEE Aerospace Conference.
Legány C., Juhász S., Babos A., 2006, Cluster validity measurement techniques, Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases, pp. 388-393.
Svensson M., Byttner S., Rognvaldsson T., 2008, Self-organizing maps for automatic fault detection in a vehicle cooling system, 4th International IEEE Conference on Intelligent Systems, vol. 3, pp. 8-12.
Tsujioku Y., Iwase M., Hatakeyama S., 2004, Analysis and modelling of a direct methanol fuel cell for failure diagnosis, IEEE IECON Conference, Busan, Korea.
Ultsch A., 2003, U-Matrix: A Tool to visualize Clusters in high dimensional Data, Dept. of Computer Science, University of Marburg, Research Report 36.
Vesanto J., Himberg J., Alhoniemi E., Parhankangas J., 2000, SOM Toolbox for Matlab 5 Documentation, Helsinki University of Technology, Helsinki, Finland.
Wang K., Pera M.-C., Hissel D., Yousfi Steiner N., 2012, SOFC modelling-based on discrete Bayesian network for system diagnosis use, Proceedings of the 8th Power Plant and Power System Control Symposium (PP&PSC) 2012, Toulouse, France, September 2-5.
Wasterlain S., Harel F., Candusso D., Hissel D., 2009, Development of a High Voltage Impedance Spectrometer and Diagnosis of Large PEFC stacks, European Fuel Cell Forum, Lucerne, Switzerland.
Webb A., Copsey K.D., 2011, Statistical Pattern Recognition, 3rd edition, John Wiley and Sons.