Users Interface and the GridRM Monitoring System

A Web 2.0 User Interface for Wide-area Resource Monitoring

Garry Smith (1), garry.smith@computer.org
Mark Baker (1), mark.baker@computer.org
Javier Diaz Montes (2), javier.diaz@uclm.es

(1) School of Systems Engineering, University of Reading, UK.
(2) Universidad de Castilla-la Mancha, Spain.

ABSTRACT
Resource monitoring in distributed systems is required to understand the health of the overall system and to help identify particular problems, such as dysfunctional hardware or faulty system or application software. Monitoring systems such as GridRM provide the ability to connect to any number of different types of monitoring agents and provide different views of the system, based on a client's particular preferences. Web 2.0 technologies, and in particular mashups, are emerging as a promising technique for rapidly constructing rich user interfaces that combine and present data in intuitive ways. This paper describes a Web 2.0 user interface that was created to expose resource data harvested by the GridRM resource monitoring system.

Keywords
Distributed Systems, Web 2.0, Mashups, User Interface, Resource Monitoring.

1. INTRODUCTION
The term Web 2.0 is now an accepted way of referring to what is seen as the second generation of Web-based technologies that enhance collaboration and sharing via social-networking sites, wikis and folksonomies. The term became popular following the first O'Reilly Media Web 2.0 conference in 2004. At that stage exactly what Web 2.0 encompassed was hard to determine, but in the intervening years it has become much clearer. Web 2.0 is more than a set of new technologies and services. At its heart is a set of at least six influential ideas that are changing the way people interact online [1]: individual production and user generated content, harnessing the power of the masses, data on an epic scale, architecture of participation, network effects, and openness.
Web 2.0 is really based on services put together using the building blocks of the technologies and open standards that underpin the Internet and the Web. Such services include blogs, wikis, multimedia sharing services, content syndication, podcasting and content tagging services. One aspect of Web 2.0 that has strongly influenced developers is the significant improvement in graphical user interfaces and the way users interact with them. For example, where an interface is based on a mashup (a web application that combines data from more than one source into a single integrated service) and AJAX (which allows asynchronous updates of parts of the interface), a relative novice can now put together a simple user interface to popular applications. This ease of use and simplicity has influenced our work in designing a new Web-based user interface to the GridRM monitoring system. Given the potentially wide-ranging and dynamic content associated with monitoring wide-area distributed systems, the ability to rapidly create friendly and usable interfaces that focus on a particular aspect of the system is desirable for specialised clients that interact with the system.

This paper describes the implementation of a Web 2.0 user interface that is intended to act as an online demonstration of the GridRM [2] resource monitoring system. Section 2 addresses related work. Section 3 describes the GridRM system infrastructure. Section 4 describes a Web 2.0 user interface, showing how it integrates with GridRM, and highlights the technologies that were used. Section 5 evaluates the interface and compares it with previous attempts that utilised non-Web 2.0 approaches. Section 6 concludes and describes future work.

2. RELATED WORK
The EGEE GridMap Prototype [3] is a Web 2.0 interface that is focused on visualising the state of the Grid using Treemap visualisations. In particular, sites or services of the Grid are represented by rectangles of different size and colour, allowing two dimensions of data to be visualised simultaneously [4]. The intention is to reduce the amount of space that would normally be required if the data were displayed using tables or bar charts. The main focus of the interface appears to be the visual comparison of service availability and resource provision by different sites within a Virtual Organisation (VO). Mouse-overs and tool tips reveal summary site information, and clicking on a site causes a new Web page to be opened that displays further information. Although GridMap has been implemented using AJAX, only the site node colours and tool tip data appear to be updated dynamically. Overall GridMap has the look of a traditional Web 1.0 application, with the exception that the user is not required to refresh the main page. A backend GridMap server is used to retrieve and cache data from standard EGEE information systems (i.e. Berkeley Database Information Index (BDII), Gstat, SAM). Logically GridMap performs the same function as our global layer map: that of displaying summary information from different sites.

Other projects that have Web 2.0 interfaces to reveal monitoring information include MonitorUs [5]. MonitorUs is a Web site availability monitoring service that provides a portlet-like user interface (i.e. self-contained windows that can be moved around the screen, minimised and removed, using client-side JavaScript) constructed using the Google Web Toolkit (GWT) [6]. Users register a Web site's URL for monitoring; a backend service periodically connects to the URL and records availability and latency measurements that are subsequently plotted on a chart. The chart image appears to be generated at the server; client-side AJAX is used to refresh an IFrame in order to download an updated chart image. Our work is currently focused on charts that are plotted dynamically at the client side in order to allow the client to connect to any number of services and combine/display data in novel ways, on the fly.
Typical Web interfaces to monitoring systems are built using the likes of Perl, Servlets, Java Server Pages, Java applets, and HTML. For example, Ganglia [7] uses the RRDTool [8] to generate server-side charts of resource data that are displayed to the client in a static Web page. Other projects such as the GridPP Realtime monitor [9] use a Java applet to provide a geographical interface that displays job status in near real-time. For more interactive visualisations, GridPP also displays monitoring data using a custom desktop application [9], as well as Google Earth [9]. Traditional interfaces are typically restricted to displaying information in a predefined way, and can be difficult for users to extend in order to provide new views of their data, leading to problems of interface adaptation or reuse. We believe that a monitoring user interface should be based on reusable components and service-oriented access to data so that the interface can be quickly adapted to meet different users' needs and particular modes of working.

3. SYSTEM INFRASTRUCTURE
GridRM is an extensible, wide-area monitoring system that specialises in combining data from existing agents and monitoring systems so that a consistent view of resources can be achieved, regardless of the underlying system heterogeneity. It features a plug-in architecture whereby modules (drivers) can be inserted at runtime to allow new types of resources to be queried for information. GridRM also supports the use of concurrent naming schemas that allow clients to select, on a per-query basis, exactly how resources should be described, in order to meet each client's specific data requirements. The ability to connect to any number of different types of resource agents, and to provide different views of a given resource, provides a flexible mechanism that can be used to monitor all manner of resource attributes. GridRM is composed of a global and a local layer.
The global layer is focused on wide-area communication between remote sites, while the local layer is concerned with data gathering and access control from within a particular site. The global layer (shown in Figure 1) is implemented using Tycho [10], and consists of the following components:

- A distributed peer-to-peer registry, formed dynamically and made up of Tycho mediators located at each remote site.
  o Mediators provide local registry functionality and also cooperate with remote mediators in order to form a distributed wide-area registry.
- An asynchronous messaging API, provided by Tycho, which communicates locally over sockets and remotely via HTTP; alternatively SSL or HTTPS can be used for wide-area communications.
- Producer and consumer entities, which use the Tycho API in order to register with and communicate over the Tycho infrastructure. Producers are generally considered to be sources of information and events, while consumers are generally sinks of that information.
- GridRM gateways, which provide controlled access to resource information for a given site. Gateways support a plug-in architecture that allows modules to be added to each gateway in order to extend the range of underlying resource agents that can be monitored. Gateways are implemented as Tycho producers.
- GridRM clients, which locate and query gateways for resource information, subscribe to receive events (notifications), typically aggregate information from multiple sites, and either inject monitoring information into external systems (e.g. SORMA [11]) or provide a user interface for humans to interact with resource data directly. The clients are implemented as Tycho consumers.

Figure 1: GridRM Global Layer

Although only one client is shown in Figure 1, in reality there will be many clients present in the system. All communication between gateways and clients occurs using XML messages that are passed to the Tycho messaging system. Messages are composed of command(s), data, and security parts. The command part typically contains one or more Structured Query Language (SQL) statements and associated metadata identifying whether the command is for (near) real-time or historic resource data. In other cases, special keywords are used in the command section to specify particular administrative operations that should be applied to the gateway, e.g. to register new resources for monitoring. These command(s) are accompanied by corresponding attributes, e.g. a list of resources to apply the command to.

Tycho mediators multiplex messages to/from a particular site over the wide area; a single open port per site is required, regardless of the number of Tycho clients present at the site or wishing to communicate with the site. Tycho provides a number of transport protocols: GridRM uses HTTPS in order to benefit from transport layer security (message encryption) and to minimise any configuration to a site's organisational firewall. For example, a Tycho mediator could utilise the HTTPS port from an existing Web server and potentially reuse an existing firewall rule.

The approach to messaging currently used in GridRM (Tycho) has evolved from earlier prototypes that sent XML over HTTP (XOH) [12] directly to end points using URLs for addressing and command identification, in a similar way to the Representational State Transfer (REST) [13] methodology. XOH was originally chosen for its simplicity, ease of parsing, and minimal overhead when compared to other approaches such as SOAP [14]. In addition, at the time of their development, Grid standards and specifications were evolving and changing rapidly, and it was unclear which would win out, so XOH was deemed a sensible route forward. With hindsight, the decision was a good one as GridRM and Tycho use mature standards and can easily be made to work with emerging Grid standards. The current approach to messaging benefits from Tycho's mediator multiplexing capabilities, and while command information is now embedded within the body of the XML message (and not directly in the endpoint's URL as was previously the case), messages encoded in XML are still transmitted over HTTPS.

Figure 2: GridRM Local Layer: Gateway
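The three-part message structure described in this section (command, data, and security parts, with SQL statements and a real-time or historic flag in the command part) can be illustrated with a minimal sketch. The element and attribute names used here (`message`, `command`, `mode`, `sql`, `attribute`, `credential`) are assumptions made purely for illustration; the paper does not give the actual GridRM message schema, and this code does not use the Tycho API.

```python
import xml.etree.ElementTree as ET


def build_message(sql_statements, mode, resources, credential):
    """Compose an illustrative XML message with command, data and security parts.

    All element names are hypothetical; only the three-part layout and the
    use of SQL statements plus a real-time/historic flag come from the text.
    """
    if mode not in ("realtime", "historic"):
        raise ValueError("mode must be 'realtime' or 'historic'")

    message = ET.Element("message")

    # Command part: SQL statements plus metadata saying which kind of data is wanted.
    command = ET.SubElement(message, "command", {"mode": mode})
    for statement in sql_statements:
        ET.SubElement(command, "sql").text = statement
    # Attributes that accompany the command, e.g. the resources it applies to.
    for name in resources:
        ET.SubElement(command, "attribute", {"name": "resource"}).text = name

    # Data part: left empty in a request; a reply would carry results here.
    ET.SubElement(message, "data")

    # Security part: carries whatever credential the client presents.
    security = ET.SubElement(message, "security")
    ET.SubElement(security, "credential").text = credential

    return ET.tostring(message, encoding="unicode")


if __name__ == "__main__":
    print(build_message(["SELECT load FROM processor"], "realtime",
                        ["node1", "node2"], "guest"))
```

Keeping the command in the message body rather than in the endpoint URL, as the text notes, means the same endpoint can carry any command, which suits a mediator that multiplexes traffic over a single port.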

Figure 2 shows the local layer and the main gateway functionality; it also illustrates the driver plug-in mechanism that is used to gather data from a range of different agents. To place the gateway in context, Figure 2 is an expanded view of one of the GridRM Gateway boxes shown in Figure 1. A detailed discussion of the gateway's capabilities and monitoring techniques can be found in [2].

4. A WEB 2.0 USER INTERFACE
Although previous attempts at creating Web-based user interfaces for monitoring resource information, e.g. [15], were functionally correct, we were prompted to investigate Web 2.0 technologies due to a number of issues, including:

- The amount of code required to create the user interface was often non-trivial and diverted attention from core development interests, e.g. creating drivers for additional agents. As an example, the active resource map in [15] was coded from scratch using Java servlets, XSLT style sheets, Mercator projection map images and JavaScript in order to achieve a navigable and geographic resource view.
- In [15], queries are issued and results are refreshed on a per-page basis. Given a global view of core resource attributes, and the realisation that a number of fields are relatively static, a user's repeated requests for a dynamic field, e.g. system load, will result in unnecessarily downloading static information. Clearly, caching of static values at the servlet can be used, but the approach is not particularly suited to delivering global views of data that contain mixed static and dynamic content.
- The charts generated in [15] are created dynamically by the servlet and are based on results from a resource query. The inclusion of presentation logic (the generation of chart images) in the Web server increases load on that server; this work could be handed off to clients in order to achieve scalability in a heavily subscribed system.
- Our approach of using Java applets in 2001, as part of a prototype system to monitor Globus Monitoring and Discovery Services (MDS) [16], often incurred latencies on the order of minutes, over low bandwidth connections (e.g. 56 Kbytes/s), to download the byte code required to instantiate the user interface.

We anticipate that Web 2.0 technologies have the potential to enhance a user's experience of interacting with resource data by providing content that can be integrated, explored and rearranged on the fly to suit the user's preferences, by breaking down restrictions imposed by server generated (whole) page impressions. Furthermore, the opportunities for reuse provided by Web API providers such as Google and Amazon minimise the amount of time and overhead required to implement common functionality, such as overlaying navigable maps with data points. System scalability can also potentially be improved through the targeted reloading of data required to construct parts of the user's display. Not only can the amount of data requested from the server be reduced, but the overhead of constructing the user interface can also be confined to the client.

4.1 An Instantiation
In the first instance, the user interface is required to provide the following features:

- Charts and gauges that display monitoring data dynamically in near real-time, for multiple resources registered with a gateway.
- A map that represents gateways according to their geographical location and displays gateway metadata, e.g. status, registered drivers, and administrator information. Furthermore, it should be possible to optionally represent network links between remote sites.
- A registry browser that displays entries from Tycho's distributed registry in a graph and allows users to expand and collapse nodes as they browse registry metadata; this feature is useful to system developers and administrators.
Dynamic plotting of chart data is intended to engage the user by providing an interface with similar features to those provided by desktop applications, and to expose the underlying monitoring activities performed by GridRM. For example, users will receive immediate feedback in response to changes in monitoring policies or resource states. In addition to the aforementioned functional requirements, the framework should also encourage rapid development of new interface views through code reuse and data aggregation. For example, users may not need dynamically updated charts showing recent system load, but may prefer to view historical data from the past six months, which the client combines in order to produce custom reports.
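As an illustration of the kind of client-side aggregation mentioned above, the following sketch combines historic samples into a simple per-month report. It is a hypothetical example only: the function name, the sample format (timestamp and value pairs), and the choice of a monthly mean are assumptions, not part of the GridRM client described in this paper.

```python
from collections import defaultdict
from datetime import datetime


def monthly_average(samples):
    """Average (timestamp, value) samples per calendar month.

    `samples` is an iterable of (datetime, float) pairs, for example historic
    system-load readings already retrieved from one or more gateways.
    Returns a dict keyed by "YYYY-MM" in chronological order.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[(timestamp.year, timestamp.month)].append(value)
    # Sort by (year, month) so the report reads chronologically.
    return {
        f"{year:04d}-{month:02d}": sum(values) / len(values)
        for (year, month), values in sorted(buckets.items())
    }


if __name__ == "__main__":
    history = [
        (datetime(2008, 1, 5), 1.0),
        (datetime(2008, 1, 20), 3.0),
        (datetime(2008, 2, 1), 0.5),
    ]
    for month, load in monthly_average(history).items():
        print(month, load)
```

Because the combination happens at the client, the same raw query results could feed several different report views without further requests to the gateway.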

Figure 3 shows the architecture we have used to support the Web 2.0 interface. In Figure 3, component A corresponds to the GridRM Client box of Figure 1. Component A is a Java servlet that interacts with GridRM using the client API. URLs sent to component A are made up of a command and attributes that the servlet extracts in order to determine what query it should make to a gateway. The results are returned to the client as either XML or plain text, depending on the nature of the query. Java Server Pages (JSPs) (label B in Figure 3) containing AJAX determine how data is retrieved and displayed at the client. The AJAX within a JSP can request monitoring data from component A only (interaction 1 in Figure 3), or combine it with data from other sources on the Web, for example to overlay monitoring data on a map (e.g. interaction 2 in Figure 3).

Figure 3: The structure of the Web 2.0 user interface

Screenshots from the current instantiation of the Web 2.0 interface are shown in Figure 4 and Figure 5. An online demonstration of the prototype can be found at [17]. The user interface is structured around a tabbed layout that partitions the different resource views and allows further panes to be added in the future.

Dynamic charts created by FusionCharts [18] and implemented using Adobe Flash and JavaScript provide near real-time information for each resource registered with a GridRM gateway. The JSP initially retrieves from a gateway the number of resources that must be charted and then dynamically creates the JavaScript required to represent the charts at the client. When the JSP is loaded into the client's browser, the charts are instantiated and AJAX is used to independently request data from the servlet at specified intervals. In turn the servlet performs a query against the appropriate gateway.

In order to represent the location and availability of gateways, a mashup using Google Maps has been created. Because the Tycho registry is used to bind together all GridRM entities, it is trivial to locate gateways required for the map; any Tycho client can search for particular gateways that meet specific requirements (currently a Tycho client retrieves the data and outputs an XML file to a Web server directory). Search parameters can be matched against the name and type of entity as well as data contained in a service template, which the owning entity can update at runtime. Coloured pins denote the geographical location and status of gateways on the map; when an icon is clicked, a bubble containing summary information about registered resources, installed drivers and administrator contact details appears. In addition, network bandwidth and latency information between gateways can also be overlaid: a line is drawn between gateways which, when clicked by the user, reveals a bubble that contains summary network information.

Data collected from the registry is also displayed using the FreeMind project's [19] mind map viewer with the intention of creating a registry browser. The viewer is implemented using Flash and displays data as a navigable graph where nodes and attributes can be expanded and collapsed (see Figure 5). The browser is useful for inspecting the state of the registry and observing changes in gateway status.
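The request handling performed by component A, where a URL carries a command and attributes that are extracted to decide which gateway query to make, can be sketched as follows. The sketch is written in Python rather than as a Java servlet, and the URL layout, the `command` parameter name, and the example attribute names are all assumptions for illustration; the paper does not specify the actual URL format.

```python
from urllib.parse import urlsplit, parse_qs


def extract_request(url):
    """Split a request URL into a command and its accompanying attributes.

    Assumes a hypothetical layout in which the query string holds exactly one
    `command` parameter and any number of attribute parameters.
    """
    query = parse_qs(urlsplit(url).query)
    commands = query.pop("command", [])
    if len(commands) != 1:
        raise ValueError("exactly one 'command' parameter is required")
    # Single-valued attributes become strings; repeated ones stay as lists.
    attributes = {
        name: values[0] if len(values) == 1 else values
        for name, values in query.items()
    }
    return commands[0], attributes


if __name__ == "__main__":
    command, attributes = extract_request(
        "http://example.org/ui/data?command=query&gateway=siteA"
        "&resource=node1&resource=node2"
    )
    print(command, attributes)
```

A dispatcher built on this would map each command name to a gateway query and choose XML or plain text for the reply, as the text describes.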

Figure 4: The dynamically updated charts showing resource information

Figure 5: The Tycho registry viewer

5. BASIC UI EVALUATION
Although the interface is functionally equivalent to our previous efforts, the quality of the data presentation is much improved, both in terms of the professional presentation of the chart, geographical map, and mind map components that we have utilised, and because individual data items change dynamically. From a user perspective, data is updated seamlessly across a number of animated charts, and pins on the map appear and change colour automatically in response to changes across a wide area network of monitoring gateways. The mind mapped representation of the registry provides an intuitive way to work with the data and includes a text search input field and the ability to drill down through layers of nodes.

The perceived effort required to construct the interface was minimal compared to our previous attempts. For example, the registry viewer was incorporated in a matter of hours, instead of the days (or weeks) required with earlier interfaces.

The servlet (component A in Figure 3) binds the interface to the rest of the GridRM system and takes care of passing requests to GridRM and returning results. The servlet acts as a connector and performs operations that may be non-trivial to achieve using JavaScript alone. Examples include subscribing to receive events, delivering notifications to the user interface (in this case caching events for client-side AJAX to periodically retrieve), and locating/authenticating with remote gateways. For the purposes of the current demonstration, the servlet ignores any attempts by the client to provide authentication credentials. Instead the demonstration is restricted to a guest user credential, which is read from a servlet configuration file and applied to anonymous requests.
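The event caching role of the servlet mentioned above, where notifications are held at the server until client-side AJAX next polls for them, can be sketched as a small thread-safe buffer. This is a hedged illustration in Python rather than the servlet's Java code; the class name, the method names, and the bounded-buffer policy (dropping the oldest events when full) are assumptions, not details taken from the GridRM implementation.

```python
import threading
from collections import deque


class EventCache:
    """Hold events pushed by a subscription until a polling client collects them."""

    def __init__(self, max_events=1000):
        # Bounded buffer: when full, the oldest event is discarded (an assumed policy).
        self._events = deque(maxlen=max_events)
        self._lock = threading.Lock()

    def publish(self, event):
        """Called when a notification arrives from the monitoring system."""
        with self._lock:
            self._events.append(event)

    def drain(self):
        """Called on each client poll: return pending events and empty the cache."""
        with self._lock:
            pending = list(self._events)
            self._events.clear()
        return pending


if __name__ == "__main__":
    cache = EventCache(max_events=2)
    for name in ("gateway-up", "load-high", "gateway-down"):
        cache.publish(name)
    print(cache.drain())  # the two most recent events
    print(cache.drain())  # nothing left until new events arrive
```

Polling a cache like this keeps the browser side simple, at the cost of a delivery delay of up to one polling interval.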
In terms of the overhead incurred when attempting to achieve near real-time and continuous updates of data, the traditional Web 1.0 approach means that unnecessary load is incurred due to the requirement to reload an entire Web page at each update interval. In addition, server-side generation of images (e.g. charts) can lead to greater overheads for the server. Another issue is that periodically refreshing an entire page that contains dynamic, static and user input fields (e.g. a search text box) can mean that the user loses partially entered keyboard input. Web 2.0 technologies avoid those problems through selective page updates.

However, during testing we observed a number of undesirable features whilst monitoring resources in near real-time. Displaying multiple dynamic charts (40) on a Web page (each individually updating every 8 seconds) caused the Web browser's CPU utilisation on some clients to rise to approximately 50%, and in other cases up to 450 Mbytes of memory to be consumed by the Web browser. While the consumption of client resources is potentially excessive, overall the system behaves reasonably. Levels of consumption varied across different client platforms and browsers, indicating that browser implementations (in particular, plug-ins) require investigation. We will continue to investigate our use of the Flash charts in order to optimise CPU and memory utilisation for displaying near real-time chart data, but differences in client overhead have also made us consider better ways of presenting user data. One way would be to present the dynamic charts more logically, i.e. with fewer active on a page, or perhaps to create SVG charts that use AJAX to perform DOM updates. The intention is to compare Web browser embedded SVG support with other approaches in order to assess stability, overhead, and cross-browser compatibility.

6. CONCLUSIONS AND FUTURE WORK
The Web 2.0 user interface described in this paper is functionally similar to our previous attempts [20]. However, the presentation of data is far superior to our earlier efforts and the development overhead is reduced due to the code reuse offered by components and services such as AJAX, Google Maps, FusionCharts, and the FreeMind viewer. By supporting the aforementioned components with the user interface architecture shown in Figure 3, we have been able to successfully combine technologies that allow rapid Web-based interface creation with an existing system that provides access to wide area resource monitoring data.

As we have shown in this paper, attributes such as discovery, reliability, and security mechanisms that tend to be missing from Web 2.0 mashups can be provided, or at least augmented, by a layer of server-side logic that binds the interface into a larger system. In this paper we use a thin layer of server-side logic to automatically customise the AJAX source code before it reaches the client's Web browser. Future work will include an investigation of how best to natively authenticate Web 2.0 clients (i.e. without the server-side support described in this paper) in order to aggregate data from multiple services that use different security mechanisms.

In addition to the interface described in this paper we have been working on a JSR-168 portlet interface that is currently being hosted by a Gridsphere portal service. We will combine Web 2.0 and portlet technologies in order to investigate how best Web 2.0 can be included within the portlet lifecycle. In particular we will look at how to exploit interface configuration (including runtime configuration changes) and use portal single sign-on mechanisms within the Web 2.0 front end. Furthermore, we will continue to look into different technologies, e.g. SVG, RSS, Adobe Flex and emerging Web 2.0 services, in order to provide a more interactive user experience.

7. REFERENCES
[1] Paul Anderson, What is Web 2.0? Ideas, Technologies and Implications for Education, JISC Technology and Standards Watch, February 2007.
[2] M.A. Baker and G. Smith, GridRM: An Extensible Resource Monitoring System, Proceedings of the IEEE International Conference on Cluster Computing (Cluster 2003), Hong Kong, IEEE Computer Society Press, pp 207-215, 2003, ISBN 0-7695-2066-9.
[3] GridMap Prototype: Visualizing the "State" of the Grid, http://gridmap.cern.ch/gm/
[4] M. Bohm and R. Kubli, Visualizing the State of the Grid with GridMaps, EGEE'07 Conference, 1-5 October 2007.
[5] MonitorUs, Website monitoring, http://mon.itor.us
[6] Google Web Toolkit, http://code.google.com/webtoolkit
[7] Ganglia Monitoring System, http://ganglia.sourceforge.net
[8] RRDTool, http://oss.oetiker.ch/rrdtool
[9] GridPP Realtime Monitor, http://gridportal.hep.ph.ic.ac.uk/rtm
[10] M.A. Baker and M. Grove, Tycho: A Wide-area Messaging Framework with an Integrated Virtual Registry, Special Issue on Grid Technology of the International Journal of Supercomputing, (eds) George A. Gravvanis, John P. Morrison and Geoffrey C. Fox, Springer, March 23, 2007, ISSN: 1573-0484.
[11] Mario Macias, Garry Smith, Omer F. Rana, Jordi Guitart, and Jordi Torres, Enforcing Service Level Agreements using an Economically Enhanced Resource Manager, Workshop on Economic Models and Algorithms for Grid Systems (EMAGS 2007), Texas, USA, 2007.
[12] An example of client interaction using the GridRM XML-over-HTTP (XOH) API, http://gridrm.org/xoh.html, November 2004.
[13] Roy Thomas Fielding, Architectural Styles and the Design of Network-based Software Architectures, PhD Dissertation, University of California, Irvine, 2000.
[14] SOAP Version 1.2 Part 1: Messaging Framework (Second Edition), W3C Recommendation, 27 April 2007, http://www.w3.org/tr/soap12-part1
[15] GridRM UI Demo, November 2004, http://holly.dsg.port.ac.uk:8888/gridrmportal/
[16] M.A. Baker and G. Smith, A Prototype Grid-site Monitoring System, Version 1, DSG Technical Report 2002.01, http://www.acet.rdg.ac.uk/~gms/gs/pubs/reports/grid/dsgmonitoring.pdf
[17] GridRM Web 2.0 prototype interface, http://portals.rdg.ac.uk:8449/gridrmweb2
[18] FusionCharts, http://fusioncharts.com
[19] FreeMind, http://freemind.sourceforge.net/wiki/index.php
[20] Previous GridRM interfaces, http://gridrm.org/demo.html