Development of a Network Configuration Management System Using Artificial Neural Networks

R. Ramiah*, E. Gemikonakli, and O. Gemikonakli**
*MSc Computer Network Management, **Tutor and Head of Department
Department of Computer Communications, School of Engineering and Information Sciences, Middlesex University, The Burroughs, Hendon, London, NW4 4BT, UK
{RR412, E.Gemikonakli, O.Gemikonakli}@mdx.ac.uk

Abstract
In this paper we present a structure for network monitoring built on a connectionist inference model that is based on graph theory. The model operates in conjunction with an information base to form a self-managing system that automatically detects and analyses network problems and takes the necessary corrective actions. Identifying malfunctions in networks is challenging but crucial to ensuring secure communication. The Connectionist Associative Memory Model (CAMM) is used to analyse corrupted network management information because it relies on simple set operations, which keeps computation minimal, and because its self-organising and dynamic structure makes it memory efficient. The simplicity and efficiency of CAMM are key properties for resolving network management issues, and this paper elaborates on the studies being performed in this area. The proposed approach can be further extended to address various network security concerns.

Keywords: network management and security; fault identification and recovery; CAMM

I. INTRODUCTION
Network configuration management is not only a daily and essential requirement in computer network systems but also part of a vast field of research and development. Computer networks change constantly to fulfil the operational and security requirements of the services they run and of the users they serve. Rigorous verification and maintenance therefore need to be carried out to ensure optimum network performance and security.
Throughout the years, the task of Network Managers has shifted from tedious and time-consuming monitoring, configuration and troubleshooting to mostly monitoring, with less human intervention required for configuration and troubleshooting. The scope of their work is moving towards higher-level interaction with the network components, with the complexity of the networks handled mostly by Artificial Intelligence (AI) entities.

To understand the systems in place and the current research and development relating to Network Configuration Management, we studied similar applications and dissected their concepts to evaluate their possible contributions to the development of a Network Configuration Management System using neural networks. These studies included biological nervous systems, in which the links and communications between components have been transposed to artificial ones; networks that consist of artificial neurons or nodes are called Artificial Neural Networks [7]. In Artificial Neural Networks, a good internal representation plays a significant role in storing and retrieving information efficiently. In traditional connectionist models the internal representation is a distributed representation of the data that is completely connected. This causes many unnecessary connections, which give rise to ambiguity and noise [3].

Neural networks can be used effectively in the recognition of sequences, such as biological data. Biological data (e.g. protein and DNA sequences) generally contain significant levels of noise and overlapping subsequences. The counterparts of such data in computer networks are Object Identifiers (OIDs) and the Management Information Base (MIB). OIDs are strings of numbers, allocated in a hierarchical manner, that are used in a variety of protocols.
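To make the hierarchical allocation of OIDs concrete, the following minimal Python sketch splits an OID into its numeric arcs and lists its ancestor prefixes. The function names (oid_arcs, ancestors, is_descendant) are illustrative assumptions and are not taken from the application described in this paper.

```python
def oid_arcs(oid: str) -> list[int]:
    """Split a dotted OID (with or without a leading dot) into integer arcs."""
    return [int(arc) for arc in oid.strip(".").split(".")]


def ancestors(oid: str) -> list[str]:
    """Return every proper prefix of the OID, from the root downwards."""
    arcs = oid_arcs(oid)
    return [".".join(str(a) for a in arcs[:n]) for n in range(1, len(arcs))]


def is_descendant(child: str, parent: str) -> bool:
    """True if `child` lies strictly below `parent` in the OID hierarchy."""
    c, p = oid_arcs(child), oid_arcs(parent)
    return len(c) > len(p) and c[: len(p)] == p


if __name__ == "__main__":
    # The OID used as an example later in this paper.
    for prefix in ancestors(".1.3.6.1.2.1.25.6.3.1.1"):
        print(prefix)
```

Comparing arcs as integers rather than comparing raw strings avoids treating 1.3.61 as a child of 1.3.6.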
One of the most widely used approaches to network monitoring and management is the Simple Network Management Protocol (SNMP). A MIB is a collection of management objects (OIDs) maintained on a network entity that may be remotely manipulated to achieve remote network management [4]. The MIB has a tree structure in which OIDs branch out, representing their relationships throughout the tree. OIDs uniquely identify each node of the tree starting from the root, as depicted in Fig. 1. Since the OID sequences used in MIB browsers have patterns and characteristics similar to those of DNA sequences, studies of the latter are relevant.

Although a number of artificial neural network-inspired sequence recognition models have been used to identify protein and DNA sequences and to classify biological data, these models tend to produce a single output and ignore the possibility of a fragment belonging to more than one sequence. A major drawback of existing models is thus their inability to capture multiple associations. Furthermore, conventional approaches can be both unattractive and computationally expensive when an OID has to be identified unambiguously from incomplete information. Although the Self-Organising Feature Map attempts to simulate some partial connectivity while remaining a fully connected model, it still has the drawback of unrelated connections that are considered redundant [9].

Fig. 1. Tree structure of a MIB with its corresponding OID

In the case of Network Configuration Management we believe it is possible to develop new learning strategies that make existing models capable of classifying OID sequences and of using these classifications to identify OIDs. The Connectionist Associative Memory Model (CAMM) relies on a two-phase learning process using Hebbian learning on a feed-forward type of network. The project involves the development of strategies that accommodate lateral connections and are thus expected to produce a flexible framework for solving different applications. We have adopted this approach to make the model capable of dealing with OIDs and developing cognitive models [1].
The next step was the development of learning algorithms for an existing connectionist associative memory model and the design of a dynamic database to act as the information base, storing the OIDs in the different formats required by sequence recognition and agent-based applications. Furthermore, we attempted to create and manage the network topology and to relate the objects to the different components of the network in the same database. The information base is designed to enable the use of relationships between components that determine the overall structure of the network topology. Thus, should any component that affects the operation of other sub-components fail to work as specified, the failure is generalised to that specific part of the network.

II. THEORETICAL BACKGROUND AND RESEARCH
Expert systems make effective use of knowledge bases, which are mostly built from knowledge elicited from human experts together with the pertinent network information. This information is usually obtained from the network itself. Expert system techniques, which can be classified as knowledge-based and rule-based, were probably among the very first AI techniques used to create automatic and intelligent network management systems. In the network management area, especially at the application level, it is hard to model a significant part of the problems that may occur. Lower-layer problems are well understood, while problems at the application layer are complex, application dependent, and distinct from one another. The reason may be the difficulty of modelling the reasoning over a collection of knowledge, or the nature of the problem to be solved [6].

A network is sometimes modelled as a finite state machine in order to capture certain nondeterministic behaviour [1].
However, some faults that exist in the implementation machine cannot be located and therefore cannot be corrected [8].

Advanced database techniques aim to have network operators interact only with the database. If an operator requires a change in network functioning, such as a change of routing scheme, the operator makes the change in the relevant database. This approach allows the operator to focus on what has to be done, while the database automatically implements the changes and deals with the associated mechanics. The advanced database design is also considered a bottom-up design rather than a modified version of existing commercial database systems. The result is a system architecture made specifically for network management. Finally, to determine the required Management Information Base functions efficiently, every effort is made to identify and exploit the special characteristics of network management data and transactions [5].

Although high-level models, including expert systems, finite state machines, and advanced database techniques, attempt to address these points by correlating low-level information at the node in order to take sequential relationships into account, they need fault specifications to set up the essential concept [1]. Such approaches cannot point out abnormal behaviour for which no information is available, which is the case in most situations. Additionally, the accuracy of high-level models decreases as the network evolves [6].

Pattern recognition attempts to solve real-world problems that are not clear-cut. Pattern recognition techniques tend to be much better than either expert systems or rule-based systems at solving problems that involve more information [2].

The monitoring hierarchy contains CAMM and neural networks. The CAMM employs only partial connectivity information to store sequences and retrieves them using set-theoretic operations. The model has two stages. At the beginning of the first stage, each input string is represented as a partially connected set of sub-strings and encapsulated in a topology to constitute a multi-graph. The connection sets $v_{i,j}^{i+k}$ (v-sets for short) are then formed. They collect the internal relations in the sequence and represent the connections between node $v_{i,j}$ and the nodes in the $(i+k)$th layer formed by the data elements of the input sequences, and together they describe the nodes and the neighbouring layers. The v-set representation for the five strings hat, hate, data, date and rare is shown in Figure 2 [1].

Fig. 2. Architecture of Stage 1 for five strings

At the end of the first stage a partially connected network representation is produced in which sub-strings are superimposed and therefore share the same storage space.
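The Stage 1 construction can be illustrated with a much-simplified Python sketch. It is not the CAMM implementation of Ref. [1]: it assumes that a node is identified by a (layer, symbol) pair, that layers are counted from 0, and that a v-set is simply the set of symbols observed in a later layer after a given node. The names build_vsets and nodes are hypothetical.

```python
from collections import defaultdict


def build_vsets(strings):
    """Map (i, symbol, j) with i < j to the set of symbols seen in layer j
    in any string that has `symbol` in layer i (layers counted from 0)."""
    vsets = defaultdict(set)
    for s in strings:
        for i, sym in enumerate(s):
            for j in range(i + 1, len(s)):
                vsets[(i, sym, j)].add(s[j])
    return vsets


def nodes(strings):
    """Distinct (layer, symbol) nodes; strings sharing a symbol in the same
    layer share the node, which is the superimposition described above."""
    return {(i, sym) for s in strings for i, sym in enumerate(s)}


if __name__ == "__main__":
    words = ["hat", "hate", "data", "date", "rare"]
    vsets = build_vsets(words)
    print(sorted(vsets[(1, "a", 2)]))   # symbols that follow 'a' in layer 1
    print(sorted(vsets[(0, "d", 3)]))   # symbols in layer 3 reachable from 'd'
    print(len(nodes(words)), "nodes for", sum(map(len, words)), "characters")
```

Under these assumptions the five strings need only 8 shared nodes for 19 characters, which gives a rough feel for why superimposed sub-strings save storage.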
Fig. 3. Graph created at the end of Stage 1 and topology of Stage 2

In the training part of Stage 2 (see Figure 3), the gradient descent technique is applied to those nodes that have been activated by the training set, in order to set the connections to exact values. All possible paths produced in Stage 1 are represented and the l-paths are calculated. A sequence is accepted if it accumulates to unity; otherwise it is rejected [1].

Neural networks are used to learn the normal behaviour of the network. The state probabilities are determined at each level in the monitoring hierarchy, where a full picture of a node's health is provided, so that the network manager can determine more accurately whether an alarm points to a problem [1].

Existing Network Management Systems are complex and therefore offer many different services and a large amount of information. This has triggered the development of network management applications using various approaches incorporating neural networks, artificial intelligence, etc. In Ref. [10], a structure for intelligent network management systems based on expert systems is presented; the resulting mechanism, called ExNet, is discussed there in detail. In this work, the Connectionist Inference Model shown in Fig. 4 has been incorporated into an existing architecture based on the use of expert systems. The model uses CAMM instead of expert systems to manage the system used for the analysis of the entire network and its services, and for proper performance management. The architecture shown in Fig. 4 has four main modules: the Monitor module, the Network Interface module, the Network Manager Interface module and the Connectionist Inference module. Each is explained in Ref. [1].
Fig. 4. General Architecture of Intelligent Network Monitoring [1]

III. WORK DONE & RESULTS
Following the general architecture, and after an exhaustive literature review of sequence recognition procedures and an analysis of the requirements, we developed an application in Java. It uses existing algorithms developed by the authors of Ref. [1], modified and extended with components that interact with the database and manipulate its data to create the different formats used by the main application. The database used alongside the main application (the modified CAMM) was created in Microsoft Office Access. The added components generate formats of the stored OIDs that increase the dimension size of the sequences, which exhaustive testing has shown to give improved results. The main application has accordingly been tuned to make optimal use of the newly generated sequences as inputs.

Throughout development, three main formats of the sequences have been analysed, tested and used as input to the main application. The reason is that traditional OIDs (e.g. .1.3.6.1.2.1.25.6.3.1.1) do not exploit the full performance of sequence recognition, so different versions of the OIDs were generated to increase the dimension size of the sequences. First, the OIDs were converted from their string format to a hexadecimal format, e.g. (2e312e332e362e312e322e312e252e362e332e312e31), and finally each occurrence of 2e, which represents a dot in hexadecimal, was replaced by random characters to give the sequences a larger dimension, e.g. (ek31wx33mc36eu31be32mn31fh25pa36ps33fx31gm31).
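The three formats can be sketched in a few lines of Python (used here for brevity; the application itself is written in Java). The sketch assumes plain ASCII hexadecimal encoding and two random lowercase letters per separator; the function names are hypothetical. Note that plain ASCII encoding turns the arc 25 into 3235, whereas the printed example above shows 25, so the authors' exact encoding of multi-digit arcs may differ from this sketch.

```python
import random
import string


def oid_to_hex(oid: str) -> str:
    """Encode each character of the dotted OID as two hex digits ('.' -> 2e)."""
    return oid.encode("ascii").hex()


def randomise_separators(oid: str, rng: random.Random) -> str:
    """Like oid_to_hex, but each dot becomes two random lowercase letters
    instead of the constant '2e', so separators differ between positions."""
    parts = []
    for ch in oid:
        if ch == ".":
            parts.append("".join(rng.choice(string.ascii_lowercase) for _ in range(2)))
        else:
            parts.append(format(ord(ch), "02x"))
    return "".join(parts)


if __name__ == "__main__":
    oid = ".1.3.6.1.2.1"
    print(oid_to_hex(oid))
    print(randomise_separators(oid, random.Random(2024)))  # seed chosen arbitrarily
```

Passing an explicit random.Random instance keeps the conversion reproducible, which matters if the same OID must map to the same stored sequence across runs.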
The database has been designed not only to store OIDs in different formats but also to record fault messages and to store the topology of the network. Relationships between the components, i.e. routers, switches, PCs (Personal Computers) and links, have been established so that stored queries can determine the connectivity status of each component and the status of the overall network, thus helping to ensure its proper operation and security. The results of the queries act as reports to Network Managers and are stored as network records.

The main application reads fault message sequences from the database (normally generated by traps set for network monitoring) and uses the modified CAMM algorithm to resolve each fault message. So far, the modified CAMM algorithm resolves fault messages to a reduced number of possibilities for each missing component. The following example illustrates the results achieved. A fault message is read from the database, e.g. (qw--fa33vm36ys31ub32ju--qz--rf36wo33xf31), where each - represents missing information. The fault message serves as input to the modified CAMM, which is trained on the fault-free set of OID sequences from the database and generates connections representing the links between the components of the sequences, as depicted and explained in the simple CAMM example of Figure 2 and Figure 3. The results take the form of colour labels showing the probability of the components that could fit the missing information [-]. The results for the fault message above are as follows:

Level affected   Component   Colour
3                3           R
4                1           R
23               3           R
24               1           R
27               1           R
                 2           R
                 3           R
28               0           R
                 5           R

Table 1. Test results of the modified CAMM

The level affected is the position of the - in the sequence, counting from 1. The colour key is:

Letter   Colour
BL       Black
W        White
G        Green
B        Blue
R        Red

Black has the lowest probability and Red the highest. From the set of OIDs in the database it is known that the OID corresponding to the faulty one is (qw31fa33vm36ys31ub32ju31qz25rf36wo33xf31). The modified CAMM has therefore resolved levels 3, 4, 23 and 24 to the correct component, and leaves 3 possibilities at level 27 and 2 possibilities at level 28. The components missing at levels 27 and 28, i.e. 2 and 5 respectively, are among the probable components and have the highest probability.

Different versions of the application have been developed and refined based on the results achieved. Throughout the development process, regular tests have been carried out to ensure that the application performs the required set of tasks and that the database can be read from and written to effectively and efficiently. Since the application cannot work without the database, the connection, created using Open Database Connectivity (ODBC), is vital and has been tested alongside the other tests.

IV. FUTURE WORKS
Having achieved promising results, the next step is to incorporate an additional component, in the form of an algorithm that eliminates the noisy parts to reach a single output at each level. This would allow the complete resolution of fault messages, so that the application can run entirely on its own and be incorporated in networks as a dynamic resolution module. Candidate concepts under study include an artificial neural network in which a Hebbian-based learning algorithm calculates and assigns weights between nodes and eliminates noisy parts based on those weights. The work can be extended to cover not only fault management but also network security: well-defined access and usage patterns can be used to train the system, which can then make decisions.
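The kind of multi-candidate output reported in Section III, and the ambiguity that the planned noise-elimination step is meant to remove, can be illustrated with a small Python sketch based on set intersection. This is a simplification under stated assumptions, not the modified CAMM: it stores only pairwise position links, assigns no probabilities or colours, counts positions from 0 rather than 1, and uses hypothetical names (build_links, resolve) and made-up stored sequences.

```python
from collections import defaultdict

GAP = "-"  # marks missing information, as in the fault messages above


def build_links(sequences):
    """Map (i, symbol, j), i != j, to the set of symbols seen at position j
    in any stored sequence that has `symbol` at position i."""
    links = defaultdict(set)
    for seq in sequences:
        for i, sym in enumerate(seq):
            for j, other in enumerate(seq):
                if i != j:
                    links[(i, sym, j)].add(other)
    return links


def resolve(links, message):
    """For each gap, intersect the symbol sets implied by every known position."""
    known = [(i, c) for i, c in enumerate(message) if c != GAP]
    out = {}
    for p, c in enumerate(message):
        if c != GAP:
            continue
        sets = [links.get((i, sym, p), set()) for i, sym in known]
        out[p] = set.intersection(*sets) if sets else set()
    return out


if __name__ == "__main__":
    stored = ["ab31cd25", "ab32cd25", "xy31cd36"]  # made-up fault-free sequences
    links = build_links(stored)
    print(resolve(links, "ab3-cd25"))  # two candidates remain at position 3
    print(resolve(links, "xy3-cd36"))  # a single candidate remains
```

When the known positions match more than one stored sequence, several candidates survive the intersection; a weighting scheme such as the Hebbian approach mentioned above would be one way to rank them.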
V. CONCLUSION
A solution to tackle security issues in network configuration management has been developed to achieve intelligent and secure network management. The results show that the modified CAMM, along with the optimisation of the OID format, has had an important impact on the resolution of sequential information. The developed system is capable of storing and retrieving multiple associations of OIDs and lends itself to parallel or distributed implementation. The strength of the pattern recognition used by the modified CAMM provides a better solution than rule-based systems, which do not use all of the information available for processing. Furthermore, the learning process of the modified CAMM is ongoing, and a best-match approach can be taken to solve problems for which no exact information is available, in contrast to rule-based systems, which are usually cumbersome. The database incorporated in the system provides scope for further improvement and gives Network Managers concise information on their network, enabling them to picture and understand a complex network more easily. Furthermore, the topologies and systems achieved enable new components to be added without major alterations to the existing system.

REFERENCES
[1] E. Gemikonakli, O. Gemikonakli, S. Bavan, (2008), Intelligent Network Monitoring Using a Connectionist Inference Model.
[2] G.A. Halse, Novel Approaches to the Monitoring of Computer Networks, Master Thesis, Computer Science, Grahamstown, South Africa, 2003.
[3] I. Mitchell, A.S. Bavan, (2000), A Connectionist Inference Model for Pattern-Directed Knowledge Representation, Expert Systems, 106-113.
[4] J.D. Case, C. Partridge, Case Diagrams: A First Step to Diagrammed Management Information Bases, p. 13.
[5] J.R. Haritsa, M.O. Ball, N. Roussopoulos, A. Datta and S. Baras, Managing Networks using Database Technology, Selected Areas in Communications, IEEE Journal, vol. 11, pp. 1360-1372, December 1993.
[6] N. Nuansri, T.S. Dillon and S. Singh, An Application of Neural Network and Rule-Based System for Network Management: Application Level Problems, Proceedings of the 30th Hawaii International Conference on System Sciences, vol. 6, pp. 474-483, January 1997.
[7] P. Picton, (2000), Neural Network, New York: Palgrave.
[8] Rouvellou and G.W. Hart, Inference of a Probabilistic Finite State Machine from its Output, Systems, Man and Cybernetics, IEEE Trans., vol. 25, pp. 424-437, March 1995.
[9] S. Bavan, M. Ford, Melina Kalaatzi, (2000), Genomic and Proteomic Sequence Recognition Using a Connectionist Inference Model, Society of Chemical Industry, 901-912.
[10] Y. Kim and S. Hariri, ExNet: An Intelligent Network Management System, WebNet98-World Conference of the WWW, Internet and Intranet, Orlando, November 1998.