Centralized logging system based on WebSockets protocol




Radomír Sohlich sohlich@fai.utb.cz
Jakub Janoštík janostik@fai.utb.cz
František Špaček spacek@fai.utb.cz

Abstract: The era of distributed systems and mobile devices brings new challenges in monitoring and controlling remote components. Components are usually monitored through log records. To obtain a comprehensive view of a distributed system, the logged information usually has to be centralized. There are many centralized logging solutions, such as Syslog, Graylog2, Logstash or the cloud service Loggly, that gather log messages and data from remote components and devices. These solutions are generally based on one-way data transfer from client to server. The simplest solutions essentially synchronize log files to obtain data from remote components. More sophisticated solutions periodically read a web service of the remote system or expose other protocol endpoints, such as the syslog protocol. This paper proposes a centralized logging solution based on WebSocket technology. Section 4 describes the features, architecture and communication scheme. Section 5 compares the proposed solution with existing applications. Section 6 discusses future work and enhancements of the proposed system.

Key Words: Centralized logging, Log4j, WebSockets, Syslog, Graylog2

1 Introduction

The era of emerging software with distributed architecture highlights the difficulty of monitoring and analyzing the functionality of remote components. The simplest way to track the behavior of system elements is to log their operations at runtime. These data record the program flow and also carry information describing system failures or malfunctions. The trivial logging solution is to write the data to local storage. This is sufficient if the whole system is located on the same machine.
A problem occurs if the system is spread across multiple devices and the components write logs to local files. In this case the information about the behavior of the whole system is scattered over separate files, which must be merged and analyzed.

There are two general approaches to the problem of distributed system logging. Both are based on centralizing the information on a single machine; they differ in how the data are collected. In the first approach, the components of the system write log records to their local storage, and a subsystem periodically synchronizes its log storage with the remote component. Alternatively, the log server sends a request to a specified source and receives the information from that source. The shortcoming of this approach is that the entire log file needs to be synchronized and the log records are not available in the centralized component in real time. The second approach exposes a receiver that communicates over a specific protocol. The remote component then sends the log messages directly to the log server. Alternatively, the remote component can contain a thick logging client that connects directly to remote storage (e.g. a database). In both approaches the log server is just a passive receiver of log data.

This paper proposes an experimental implementation based on the second approach, with some enhancements in server functionality. The solution is built on lightweight WebSocket technology, NoSQL storage and a Java application server. The main improvement is that WebSocket communication is used not only to send log data, but also to control the client settings and functionality.

Organization of paper: Section 3 describes the requirements and gives a general description of the solution. Section 4 describes the architecture and technologies. Section 5 contains a comparison with another centralized logging solution. The last section, Section 6, summarizes the results of testing and discusses future work.
ISBN: 978-1-61804-262-0 103

2 Related work

This problem area is fairly well covered, so a study of existing solutions was done (mostly Java platform implementations). There are some widely used systems and libraries that follow one of the approaches mentioned above.

Graylog2 [11] is a log capture and analysis tool. It supports flexible input types, including syslog, plaintext and GELF. Additionally, it is able to read from an HTTP API. Graylog2 uses MongoDB [12] as storage and Elasticsearch [13] to analyze and search through the log records.

Another related solution is Syslog-ng [14]. It supports a client-server mode, in which one instance of Syslog-ng on the client machine is configured to transfer log messages to the server machine through a specified channel (e.g. a UDP or TCP connection or the syslog protocol). A syslog-ng driver can also be used to write messages directly to remote storage (e.g. SQL or NoSQL storage). Syslog-ng does not provide a log analysis tool; this feature must be provided by a third-party tool.

Logback brings a very similar concept to Graylog2, but does not provide complete functionality for log analysis. To store and analyze logs, Elasticsearch must be integrated.

Loggly is a commercial cloud service of the kind commonly called logging as a service. The service is capable of gathering logs from every popular programming language or platform, and the data can be sent using almost every protocol (Syslog TCP, Syslog UDP, Syslog TCP w/ TLS, or HTTP/S). The disadvantage of this solution is that the system must be connected directly or indirectly (over a proxy) to the Internet.

3 Requirements

The analysis of the related projects reveals the main requirements for the proposed implementation:
- multi-platform (Linux, Windows) server solution
- flexible NoSQL storage for log records
- user-friendly web interface
- access through REST API
- client transfer protocol widely supported across commonly used programming languages
- lightweight client implementation
- easily implementable message format
- simple configuration from server side
- open-source

4 Architecture

The high-level architecture is very simple and is based on the client-server model [2]. The server side consists of an application that receives and processes logs, an application server and a persistence layer.

Fig. 1: Architecture design

4.1 Communication

As the solution requires communication in both directions (client to server, server to client), a suitable technology had to be selected. To ensure simplicity and versatility, a web-based protocol is preferred. Many two-way communication protocols have been designed that use the HTTP transport layer to benefit from existing infrastructure (authentication, secure transport, proxies). However, these protocols trade off efficiency against reliability, as HTTP was not originally designed for bidirectional communication [6]. The WebSocket protocol was designed to replace these tradeoffs. The protocol uses the HTTP transport layer as is and is designed to work on the standard port 80, or 443 for secure transport. After micro-benchmarks comparing HTTP alternatives with the WebSocket protocol, the WebSocket technology was selected. One of the advantages of the WebSocket protocol is that it uses a single TCP connection for the communication and so avoids repeatedly opening connections, which would reduce performance. Like basic HTTP, the WebSocket protocol is widely supported across programming platforms, so clients can be implemented for various platforms.

The log messages are JSON formatted and sent in WebSocket text frames to/from the client. The JSON format was chosen for its flexibility and its support in many programming languages.
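As an illustration of JSON-formatted log messages carried in text frames, the following Python sketch serializes and parses one record. The field names (`timestamp`, `level`, `logger`, `message`, `object`) are assumptions made for this example; the paper does not list the exact schema.

```python
import json
import time


def build_log_message(level, logger, message, obj=None):
    """Serialize one log record into a JSON string suitable for a text frame.

    All field names are hypothetical; the paper does not specify the schema.
    """
    record = {
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "level": level,
        "logger": logger,
        "message": message,
    }
    if obj is not None:
        record["object"] = obj  # optional JSON-serializable payload
    return json.dumps(record)


def parse_log_message(frame):
    """Decode a received text frame back into a dictionary."""
    return json.loads(frame)


frame = build_log_message("INFO", "orders", "order stored", {"orderId": 42})
decoded = parse_log_message(frame)
```

Because the frame is plain JSON text, a client in any language with a JSON library and a WebSocket implementation could produce the same message.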
The JSON log message contains all standard fields common for logging. There is also a field for an arbitrary object to be logged. This feature simplifies data-mining operations on log records. Remote reconfiguration of the logging client is implemented via text frames in a special format that differs from the standard log message; these frames travel from server to client. The idea behind this feature is that the server can remotely control the settings of each client, such as the log level or the identification of the component.

The communication scheme in Fig. 2 shows the entire process of establishing a connection and exchanging messages. After the WebSocket handshake, the server sends an initial configuration message to the client, which contains the log level (in this case FINE) and the identification of the component, if it has been preconfigured by the log server admin. After this exchange, the client sends log messages with the appropriate level. The reconfiguration message shows how the log level is set remotely (in this case to the INFO level).

Fig. 2: Communication scheme

4.2 Server

The server part is a Java Enterprise application running on the Wildfly [3] application server. The application implements a WebSocket endpoint for logging clients. The server contains the remote control logic, a user interface and an additional REST API that gives access to the functionality designed for log analysis and remote control of clients. The user interface covers client configuration, log analysis and a search engine. The persistence layer is based on the MongoDB NoSQL database. It was chosen for its flexibility and because it can be easily integrated with advanced indexing, searching and analysis tools (e.g. Elasticsearch, Kibana, Hadoop). The solution moves the logic of writing and processing log messages to the server side. The server implementation uses the MongoDB Java driver to write logs and processes the log messages asynchronously. The asynchronous writing increases throughput.
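The server-side behavior described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' Java implementation: the configuration frame layout (`type`, `level`, `component`) is hypothetical, a plain list stands in for MongoDB, and `on_text_frame` represents whatever callback a real WebSocket endpoint would invoke.

```python
import json
import queue
import threading


def config_frame(level, component=None):
    """Build a server-to-client configuration frame (hypothetical format)."""
    frame = {"type": "config", "level": level}
    if component is not None:
        frame["component"] = component
    return json.dumps(frame)


class AsyncLogWriter:
    """Queues incoming log frames and persists them on a background thread."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, store):
        self._store = store  # any object with append(); stands in for MongoDB
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def on_text_frame(self, frame):
        """Called per received frame; returns without waiting for storage."""
        self._queue.put(frame)

    def _drain(self):
        # Single consumer, so records are stored in arrival order.
        while True:
            frame = self._queue.get()
            if frame is self._STOP:
                break
            self._store.append(json.loads(frame))

    def close(self):
        """Persist everything queued so far, then stop the worker."""
        self._queue.put(self._STOP)
        self._worker.join()


store = []
writer = AsyncLogWriter(store)
initial = config_frame("FINE", component="billing")  # sent after the handshake
writer.on_text_frame(json.dumps({"level": "FINE", "message": "started"}))
writer.on_text_frame(json.dumps({"level": "INFO", "message": "ready"}))
writer.close()
```

Decoupling the receive path from the storage write is what lets the endpoint accept the next frame immediately; the queue absorbs short bursts while the worker persists records.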
4.3 Client

Thanks to WebSocket technology, the client can be implemented in various languages (C++, .NET, Java, JavaScript, Python and others). The experimental client is implemented in the Java programming language using the Jetty WebSocket Client API implementation. Serialization of LogMessage is implemented with the Jackson library. If the connection to the log server is not available, the client caches records, and once the connection is established again, it sends all cached logs to the server.

5 Comparison

To test the proposed implementation against an existing solution, the log4j2 NoSQL appender was chosen as the nearest matching solution. This comparison measures the performance of logging clients, where the log4j2 NoSQL appender uses the MongoDB Wire Protocol to transfer serialized messages. The custom client uses the WebSocket protocol as described above. The methodology of the comparison is as follows:

- create a logger object
- insert k log records (text logs, logs with exception)
- measure the duration of the insert operation

The benchmark is also implemented as a Java application, as Log4j2 is a Java library. The measurements were made on an empty database collection and in separate runs. Every measurement was repeated 40 times. The insertion of 1000 log records was chosen as the most representative sample size, considering that a common application does not insert more than hundreds of log records through one Logger instance.

When there is no additional object (exception) to serialize, the proposed solution shows a higher average time to insert 1000 logs. Figure 3 compares the average duration of inserting 1000 info log messages. The measured value of the experimental implementation is close to that of the log4j appender. Figure 4 displays the average duration of inserting 1000 log records containing an exception object. In this case the experimental implementation achieved a lower time.
This result is caused by the simpler implementation of exception serialization and also by moving the persistence operations to the server. The average duration is also nearly constant.
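The measurement procedure described in this section (create a logger, insert k records, time the insertion, repeat and average) can be sketched in Python as below. The original benchmark is a Java application; `InMemoryLogger` and its `log` method are hypothetical stand-ins for a real logging client, so the numbers this sketch produces say nothing about the systems compared in the paper.

```python
import statistics
import time


def measure_insert(make_logger, k, with_exception=False):
    """Create a logger, insert k records and return the elapsed seconds."""
    logger = make_logger()
    start = time.perf_counter()
    for i in range(k):
        if with_exception:
            logger.log("INFO", f"record {i}", ValueError("boom"))
        else:
            logger.log("INFO", f"record {i}")
    return time.perf_counter() - start


def average_duration(make_logger, k=1000, repetitions=40, with_exception=False):
    """Repeat the measurement and return the mean duration in seconds."""
    runs = [
        measure_insert(make_logger, k, with_exception)
        for _ in range(repetitions)
    ]
    return statistics.mean(runs)


class InMemoryLogger:
    """Hypothetical stand-in for a real logging client."""

    def __init__(self):
        self.records = []

    def log(self, level, message, exception=None):
        detail = repr(exception) if exception is not None else None
        self.records.append((level, message, detail))


avg_plain = average_duration(InMemoryLogger)
avg_exc = average_duration(InMemoryLogger, with_exception=True)
```

Passing a factory rather than a logger instance gives every repetition a fresh logger, which loosely mirrors running each measurement separately on an empty collection.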

Fig. 3: Average duration of inserting 1000 logs without exception

Fig. 4: Average duration of inserting 1000 logs with exception

6 Conclusion and future work

As described in the paper, there is a wide array of centralized logging solutions, from the simplest file replication to sophisticated cloud services like Loggly. We proposed a centralized logging system that benefits from the WebSocket protocol as a widely supported means of bidirectional communication. The protocol also uses existing infrastructure. The experimental solution is based on the Java platform, but clients could be implemented in other programming languages. The solution was compared with the existing Log4j NoSQL appender. The benchmark shows that even the non-optimized version of the implementation is comparable to the existing, widely used Log4j2 appender. The tests also show that the time to send a log record remains stable if the log record contains an exception object.

On the other hand, the comparison also reveals that there is room for optimization. The serialization of log messages could be improved, as it creates a performance bottleneck for the whole system. For Java, Python and C++, Protocol Buffers [15] could be a solution, but using this technology would eliminate the versatility of the message format.

There are also new opportunities to explore in remote configuration and control of client functionality. In the proposed system, reconfiguration of the log level and the component name is implemented, but further attributes and even remote functions could be added, e.g. gathering information about the remote system (utilization, resource usage), depending on the client platform.

References:
[1] RFC6455. The WebSocket Protocol. Internet Engineering Task Force (IETF), 2011. Available from: https://tools.ietf.org/html/rfc6455
[2] BERSON, Alex. Client-server architecture. McGraw-Hill, 1992.
[3] Wildfly [online]. 2013 [cit. 2014-10-29]. Available from: http://wildfly.org/
[4] Mozilla Developer Network: WebSockets [online]. 2014 [cit. 2014-10-29]. Available from: https://developer.mozilla.org/en-US/docs/WebSockets
[5] Qt Project: Qt WebSockets C++ Classes [online]. Available from: http://qt-project.org/doc/qt-5/qtwebsocketsmodule.html
[6] RFC6202. Known Issues and Best Practices for the Use of Long Polling and Streaming in Bidirectional HTTP. University of Rome Tor Vergata: Internet Engineering Task Force (IETF), 2011. Available from: https://tools.ietf.org/html/rfc6202
[7] CROCKFORD, Douglas. The application/json media type for JavaScript Object Notation (JSON). 2006.
[8] ABUBAKAR, Yusuf; ADEYI, ThankGod S.; AUTA, Ibrahim Gambo. Performance Evaluation of NoSQL Systems using YCSB in a resource Austere Environment. Performance Evaluation, 2014, 7.8.
[9] VAN CAMP, Balder. The State of Logging in Java 2013. In: Zeroturnaround [online]. 2013 [cit. 2014-10-29]. Available from: http://zeroturnaround.com/rebellabs/the-stateof-logging-in-java-2013/
[10] APACHE SOFTWARE FOUNDATION. Apache Log4j 2 [online]. 2014 [cit. 2014-10-29]. Available from: http://logging.apache.org/log4j/2.x
[11] TORCH GMBH - THE GRAYLOG2 COMPANY. Graylog2 [online]. Available from: http://www.graylog2.org/
[12] MONGODB, Inc. MongoDB [online]. Available from: http://www.mongodb.com/
[13] ELASTICSEARCH BV. Elasticsearch [online]. Available from: http://www.elasticsearch.org
[14] BALABIT IT SECURITY. Syslog-ng: The Foundation of Log Management [online]. 2014 [cit. 2014-10-29]. Available from: http://www.balabit.com/networksecurity/syslog-ng
[15] GOOGLE, Inc. Google Developers: Protocol Buffers [online]. 2014 [cit. 2014-10-29]. Available from: https://developers.google.com/protocol-buffers