Scalable monitoring and configuration tools for grids and clusters
|
|
|
- Lynne Gibson
- 10 years ago
- Views:
Transcription
1 Scalable monitoring and configuration tools for grids and clusters Philippe Augerat 1, Cyrill Martin 2, Benhur Stein 3 1 INPG, ID laboratory, Grenoble 2 BULL, INRIA, ID laboratory, Grenoble 3 UFSM, Santa-Maria, Brasil Abstract We present the Ka-admin project that addresses the problem of collecting, visualizing and feeding back any grid information, trace or snapshot, compliant to an XML-like model. Real use includes performance analysis of parallel applications and cluster administration. Ka-admin includes a generic filter module that processes monitored data independently of what they represent. Filters can remove, aggregate and transform data or pass them to external applications. The end user is responsible for activating the filters from within an interactive graphical interface. This allows him to focus on important information. We also present a scatter/gather module that allows efficient collection and distribution of data and commands in a large cluster. Early work on MPI/threads applications and system monitoring tools proved that the combination of both modules matches the objective of a scalable visualization of large data sets. 1. Introduction Architectures for distributed computing have become very popular over the last decades making a wide range of services available on the Internet. While resources and needs (high performance computing, data sharing, web services) exist, the problem is now to match software with the available resources. Resource monitoring has thus become a critical part of the global infrastructure. We are considering in this paper a virtual cluster built from a large number of Internet resources, and the problem of collecting, visualizing and tuning the cluster resources. A simple model for a virtual cluster could be an intranet or a grid of PC clusters because it has both a very large number of heterogeneous resources and a limited administrative scope. Thus we can expect administrators and users in such a model to be able to execute commands or access information on all nodes in the virtual computer. As far as collecting information is concerned, bwatch [1] was the first step to obtain a global view of a cluster. In the context of network administration, there are also numerous tools that monitor system information. Some of them like Performance Co-pilot [8] capture both PC system information and network device information. Finally, software has been developed to trace the behavior of dedicated applications. Vampir [5], Tau [6] and Pajé [12] are examples of such programs for high performance message passing computing. Once the information is collected, next problem is to move information collected from distributed agents to a centralized monitor server. Ganglia [2] uses a multicast protocol to achieve efficient real-time monitoring. ClusterProbe [11] is an open and flexible monitoring system based on a multi-protocol interface and a hierarchy of monitor servers to achieve scalability of communications. A complete monitoring tool should include or link to a management tool in order to have the administrator or the tool itself launch actions on the virtual cluster (reboot computers or schedule applications for instance). In addition to monitoring tools, Smile [3] provides administrative tools for clusters. Basically, it allows Unix commands to be distributed efficiently on a cluster. C3 [9] and rshp [7] projects also provide general frameworks for such parallel commands. A few works extend to multi-cluster and grids. The Grid Performance Working Group [4] propose a general framework for monitoring services and focuses on interoperability with other grid services such as authentification. The M3C project [10] joins several modules to monitor, manage a grid, schedule jobs, etc. The aim of the Ka-admin project is to design a generic tool that could leverage most existing software while being more scalable and efficient. The main concern is the definition of an extensible language and the associated filtering system that help focus on the important information. By combining the usage of Pajé, Performance Co-pilot and rshp, recent experiments have proved it possible to achieve our aim. The next section of this document describes the existing implementation and utilization of Ka-admin. The third
2 section deals with the expected properties of monitoring tools as far as genericity and scalability are concerned. 2. Description of the Ka-admin software Ka-admin is built from two independent modules. The first one, named scatter/gather module is in charge of collecting and updating data from/to the grid while the second named filter module is in charge of extracting interesting data to be displayed and selecting actions to be performed on the grid. Figure 1 shows the interaction between the two modules. for the connect phase. A comparison of the methods tested is shown in figure 2. As far as gathering information is concerned, a binomial tree performs better. Yet, a balance between the monitor server load and the network load should be also considered as well as the possible integration of intermediate filters on each node of the logical communication tree. On individual nodes, the scatter/gather module only consists in a wrapper to existing system tools. Actual implementation encapsulates either Performance Copilot software developed by SGI or homemade software. Performance Co-Pilot is a monitoring system by SGI for Figure 1: Ka-admin overview a. Data collection The scatter/gather module collects information from all node monitor agents then moves it to the server node, which may then feed back the agents with user-defined commands. The module is based on rshp, a general framework for starting programs on large-scale clusters. The rshp library was first used as an alternative solution to start the rsh command on different nodes in the cluster. Extensions to standard Unix commands also have been generated using the rshp library. Standard protocols (rsh, ssh) are used with modifications on the clients only. Using standard daemons instead of dedicated agents avoids splitting the scatter and gather functions in two different modules. To collect information from all nodes, rshp leverage the stdout/stdin properties of these protocols to create a permanent socket network that moves information from nodes to the centralized monitor node. This logical network can be built as a multicast network or as unicast binomial or n- ary trees Actual research on rshp is to define a generic framework to test and build the best permanent logical network for a grid of clusters. In general, the best case would be griddependant and rshp would adapt itself to the architecture. Early experiments on a 225 nodes Ethernet multiswitched cluster shows that the ideal network may change depending on the function (scatter or gather). The best method we found so far for scattering data is to parallel the rsh client (a three step connect/authenticate/start process) and to use a flat tree Irix and Linux computers. It monitors information which can alternatively be found in the /proc directory. It has both a C and shell interface. The system is designed to limit intrusion. If no monitoring software is available on a node or the user does not have the rights to use it, Ka-admin defaults to a command that access the /proc information directly. Figure 2: parallel rsh start on a 200 nodes cluster b. Data visualization The Ka-filters module extends Pajé [12], a scalable visualization tool designed for parallel applications. Main usage of Pajé is to visualize programs based upon MPI and threads libraries [13]. However, it has been designed to deal with various models and has been tested to visualize a data-flow programming interface and a java virtual machine behavior [14].
3 The genericity of Pajé is based on the independence of the trace format and the visualization tool. Pajé defines a grammar to describe a complex hierarchy of objects. A user can then define his own object model and generate compliant traces. Using Pajé to display traces, the user will interactively select a time window, adapt the granularity of the visualization (for instance processes vs. threads in the case of a parallel multi-threaded application) and finally filter non-important information. Pajé also allows generating automatic statistics or data aggregation. The figure 3 gives the model for an MPI/threads program. <Execution> <node> <thread> <state > </state> <MPI_event > </MPI_event> </thread> <semaphore> <state> </state> </semaphore> <lock> <state> </state> </lock> </node> </Execution> Figure 3: a simplified object model for a multithreaded MPI program A parallel program can be described as a hierarchy of objects (figure 3). The leaves in the tree representation of this hierarchy are called entities. Other items are called containers. The trace of the program (figure 4) is a list of time-stamped events: creation of a container, creation of an entity value, etc T1 PajeCreateContainer Execution T2 PajeCreateContainer Execution.node 0 T3 PajeCreateContainer node0.thread0 T4 PajeDefineEntityValue thread0.state T5 PajeDefineEntityValue thread0.mpi_event Tn PajeSetState thread0.state running Tp PajeSetState thread0.state stopped Figure 4: Part of a Pajé trace for an MPI/thread program The visualization of an MPI/threads program execution is shown on figure 5. A multi-level visualization is demonstrated (each thread running on node 1 is displayed, while all threads running on node 0 are grouped on one line and node 3 and 4 are grouped on one line. Another filtering functionality is demonstrated since only thread states and communications are shown. Other types of entities like semaphore states have been filtered out. The end user has complete control over this filters, thus can focus on the entities that matter. Various sources (system, application and middleware information) and traces (Vampir, Tau, etc) have been converted using the Pajé extended markup language. The next section gives several examples. Figure 5: Snapshot of Pajé monitoring windows
4 c. Real use: post-mortem visualization The Ka-admin system can be used to monitor postmortem information. Real use already concerns system log files and application behavior. Portable Batch System [16] is a widely used batch system that allows real-time node allocation. We use Kaadmin to trace and visualize node reservation in PBS. Tracing is done on the PBS job master node at job start. Figure 6 gives a simple model for a PBS trace. The pbs-task entity is initialized to nobody and when a user launches a PBS job, the pbs-task entity is set to the user name. The cluster administrator can use Pajé to visualize the cluster use. Figure 7 provides a view of a PBS trace. Each color matches a user. Pajé filters allow viewing a given list of users only. <grid> <cluster> <switch> <node> <pbs-task ></pbs-task> </node> </switch> </cluster> </grid> Figure 6: Simple model for monitoring PBS with Pajé. While the PBS package already provides a graphical view of a cluster, Ka-admin provides additional features. Firstly the Ka-admin filters natively provide data aggregation. For instance, the end user can easily set a filter to count the number of active nodes in the cluster (all nodes for which the pbs-task entity is not set to nobody). Practically, Pajé automatically creates a new entity in the cluster container and sets its value to the number of active nodes. Secondly any Pajé container or entity can be linked to an external command. Selecting an entity with the Pajé GUI will give access to this external command and pass the entity and its hierarchy as parameters to the command. For instance, the cluster administrator can kill a PBS job directly from the Ka-admin interface or send to active users. Another example is the monitoring of the Linpack HPL [17] application. To monitor system information during the execution of an application, the scatter/gather module starts Performance Co-pilot on individual node monitors and then collects and converts all data at the end of the run. Measurements with the Linpack parallel benchmark show that the intrusion (the overhead in the execution time) is the same as the sequential overhead caused by Performance Co-pilot on one node. Ka-admin allows monitoring any type of information collected by PCP. In the example in figure 8, statistics on the network interface, the memory and cpu loads have been captured by PCP. The figure 9 shows a Pajé snapshot of a partial HPL trace. The user selected only «network.out.bytes» entities. In addition, the user has created new objects, network.load and network.load.all using Pajé filters. The first filter automatically aggregates the network.out.bytes values on each switch to compute the network.load value. The second filter computes the aggregation on the whole cluster (network.load.all). <cluster> <switch> <node> <processor> <kernel.all.cpu.user> <kernel.all.cpu.idle> <kernel.all.cpu.sys> <network.out.bytes> <mem.freemem> Figure 8: An object model for tracing system information with Performance Co-pilot Figure 7: A PBS trace with Pajé
5 Figure 9: Pajé snapshot of Linpack execution and data aggregation With Ka-admin, it is also easy to mix heterogeneous data such as MPI traces and system load indices. One research work has started to take advantage of both granularity levels (application and system) for parallel application performance analysis. Another interesting feature of Ka-admin is to allow a view of several executions of the same application. 3. Genericity and scalability for grid monitoring tools A scalable view is one whose format, clarity, meaning and size are independent of the number of elements involved in computation [19]. From our experience with Pajé, we propose now two solutions to achieve this definition. a. Genericity Numerous administrative tools exist for clusters especially since every cluster administrator designs his own monitoring tool. Deciding which is the best would be as unfair and useless as comparing Latex and Microsoft Word. However this comparison is appropriate to underline that one major problem is the lack of a common language to describe data. Moreover, in order to handle various facility models (cluster, grid, etc) with heterogeneous resources, the language should be extensible to describe what is to be monitored. XML is an emerging standard that could help. As presented, the Pajé visualization tool already uses a similar language. Converting XML data to a Pajé trace is straightforward. The monitoring usages in the grid context include fault detection, security logging, performance analysis, scheduling, etc. Both an administrator (for log files, extreme system behaviors, software installation) and a user (for application tuning) may want to use and adapt the monitoring tool. This also encourages the defining of a high level grammar to describe the objects to be monitored independently of the monitoring tool. To allow both real-time and post-mortem visualization, the extensible language should also handle time stamped events that represent any object changes. A considerable work has been done in the context of the grid forum to define a schema for data formats for performance monitoring as well as a common interchange format for monitoring tools[20]. The monitoring tool should not restrict itself to display raw monitored data. It should also perform user-defined calculations upon the data (data aggregation in Pajé) or trigger action on the monitored resources (external shell commands in Pajé). The Pajé language could be extended to associate various executable commands (methods) with any monitored object. Finally, the adequate monitoring tool should capture changes in the grid architecture. While the virtual cluster language description may be fixed for a specific usage, the virtual computer description itself should be obtained from existing detection software or from interactions with a user. Events like creating/deleting a node are
6 possible in the Pajé language. This is also a requirement for fault tolerant monitoring. b. Scalability Monitoring a large number of computers combines two hard scalability problems. One of these is getting the information without much system and network intrusion (the producer problem). The other is focusing on the most important information within a large set of information (the consumer problem). care of what portions of trace data should be kept in memory and for controlling the trace reader whenever more data is necessary. Although it is not the case in current Pajé implementation, all modules may run on a different computer. Distributing the work of multiple storage controller will help storing a large amount of data. Distributing the other filters will help managing a very large number of entities. trace reader/ simulator trace filter storage controller filter filter visualization data flow control flow in-memory trace data Figure 10: Pajé workflow A solution to the producer problem is already handled by rshp, Ganglia or ClusterProbe. The next step should be to provide a generic tool with adaptative protocols on both the agent side and the wire side. Pajé handles the solution to the consumer problem by implementing a graph of filters. All filters share the same description language. A filter obtains a grid description and data from the previous filter in the queue and passes it to the next filter after activating one or several methods. Figure 10 summarizes Pajé workflow. The simplest example of a filter removes information from the previous one. For instance, it could remove computers that run Linux from a grid description. One filter may feed objects with information, for instance, may capture the CPU load for all nodes in the grid. Other filters can aggregate information to create new information. For instance, one filter can cumulate network traffic on all nodes on a communication switch to compute the switch load. Pajé implementation provides two types of a filter in a graph. The first acts on the flow of data, filtering it as data are read from the trace file. The second type of filter can dynamically change the view of all trace data stored in memory. While the first kind of filter acts as soon as data is available, the second kind acts on demand, answering to queries originated by user actions on the visualization interface. Between these two kinds of filters is a storage controller module, responsible for taking c. Interactivity Combining filters allows the user to work with a partial or aggregate view of the whole data set, then to go back to a complete detailed view. From this, we understand that interactivity is one of the solution to enforce visualization scalability. The user chooses which objects he wants to display, not the visualization tool. All objects may be hidden, and any combination of objects may be created. Ideally, the user may also control the information collection directly from the visualization tool. d. Perspectives The Ka-admin system is currently in use on a large cluster at INRIA Rhône-Alpes and is to be deployed on a grid of clusters in the VTHD project, the biggest French grid test-bed. Since the most powerful part of Pajé appears to be the filtering system, we will work mainly now on this module and make it independent from any graphical tool or monitoring tool. The filter module should be usable above other monitoring system such as Ganglia or ClusterProbe, helping them to achieve scalability. On the other hand, the scatter/gather module is to become self-adaptable to a heterogeneous architecture.
7 Future work includes: to parallel Ka-admin by distributing the filters on different computers, to provide a generic way to define external commands so that any administrative command could be performed from the Ka-admin interface, to leverage the Ka-admin features in the real-time context. 4. Conclusion We have presented the Ka-admin project that challenge the problem of collecting, visualizing and feeding back any grid information, trace or snapshot, compliant to an XML-like model. While current usage is to have one different monitoring tool for each cluster or application, we propose a generic filter module to process data only taking into account their description language not the description itself. Thus data can be filtered (removed, aggregated, transformed, passed to external applications) independently of what they represent. The end user is responsible for activating the filters he wants, allowing him to focus on interesting information. We also present a scatter/gather module that allows efficient collection and distribution of data and commands in a large cluster. Early work on MPI/threads applications and system load indices proved that the combination of both modules matches the objective of a scalable visualization of large data sets. References: [1] Jacek Radajewski. bwatch. [2] Matt Massie. Ganglia cluster monitoring toolkit. [3] Putchong Uthayopas et al., Interactive Management of Workstation Cluster Using WorldWide Web, ClusterComputing Conference (CCC 97), [4] Brian Tierney and all, White paper : A Grid Monitoring Service Architecture, Grid Performance Working Group [5] W.E. Nagel, A. Arnold, M. Weber, H-C. Hoppe, K. Solchenbach, VAMPIR: Visualization and Analysis of MPI Resources, Supercomputer 63, Vol.12, No. 1, pp , 1996 [6] A. Malony and S. Shende, Performance Technology for Complex Parallel and Distributed Systems, Proc. Third Austrian-Hungarian Workshop on Distributed and Parallel Systems, DAPSYS 2000, "Distributed and Parallel Systems: From Concepts to Applications," (Eds. G. Kotsis and P. Kacsuk) Kluwer, Norwell, MA, pp , [7] C. Martin, O. Richard. Parallel launcher for clusters of PC, submitted to World scientific [8] SGI Silicon Graphics Inc. Linux advanced cluster environment.http ://oss.sgi.com/projects/ace/. [9] C3, [10] M3C An Architecture for Monitoring and Managing Multiple Clusters [11] ClusterProbe: An Open, Flexible and Scalable Cluster Monitoring Tool Zhengyu Liang, Yundong Sun, and Cho-Li Wang, Department of Computer Science and Information Systems, The University of Hong Kong, Pokfulam Road, Hong Kong [12] J. Chassin de Kergommeaux, B. de Oliveira Stein, and P.E. Bernard. Pajé, an interactive visualization tool for tuning multi-threaded parallel applications. Parallel Computing, 26(10):1253_1274, aug [13] J. Chassin de Kergommeaux and B. de Oliveira Stein. Pajé : an extensible environment for visualizing multi-threaded programs executions. In A. Bode, W. Ludwig,T. Karl, and R. Wismüller, editors, Euro-Par 2000 Parallel Processing, Proc. 6th In-ternational Euro-Par Conference, volume 1900 of LNCS, pages 133_140. Springer, [14] F.-G. Ottogali, V. Olive, B. de Oliveira Stein, J. Chassin de Kergommeaux, and J.-M. Vincent. Visualization of distributed java applications for performance debugging. Soumis à publication. [15] Cyril Guilloud, Jacques Chassin de Kergommeaux, Philippe Augerat, Benhur Stein, Outil visuel d'administration système pour grappe de processeurs de grande taille, RENPAR 2001, 24, 27 avril 2001 [16] The Portable Batch System Software (PBS v2.2p13), Veridian, PBS Products Dept., Mountain View, CA, March [17] HPL [18] GNUstep [19] Couch, A.L. Categories and context in scalable execution visualization. Journal of Parallel and Distributed Computing, vol. 18, n. 2, 1993, pp ) [20] Global Grid Forum Performance Area
Application Monitoring in the Grid with GRM and PROVE *
Application Monitoring in the Grid with GRM and PROVE * Zoltán Balaton, Péter Kacsuk and Norbert Podhorszki MTA SZTAKI H-1111 Kende u. 13-17. Budapest, Hungary {balaton, kacsuk, pnorbert}@sztaki.hu Abstract.
Integration of the OCM-G Monitoring System into the MonALISA Infrastructure
Integration of the OCM-G Monitoring System into the MonALISA Infrastructure W lodzimierz Funika, Bartosz Jakubowski, and Jakub Jaroszewski Institute of Computer Science, AGH, al. Mickiewicza 30, 30-059,
Monitoring Clusters and Grids
JENNIFER M. SCHOPF AND BEN CLIFFORD Monitoring Clusters and Grids One of the first questions anyone asks when setting up a cluster or a Grid is, How is it running? is inquiry is usually followed by the
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM Albert M. K. Cheng, Shaohong Fang Department of Computer Science University of Houston Houston, TX, 77204, USA http://www.cs.uh.edu
Integrating TAU With Eclipse: A Performance Analysis System in an Integrated Development Environment
Integrating TAU With Eclipse: A Performance Analysis System in an Integrated Development Environment Wyatt Spear, Allen Malony, Alan Morris, Sameer Shende {wspear, malony, amorris, sameer}@cs.uoregon.edu
LSKA 2010 Survey Report Job Scheduler
LSKA 2010 Survey Report Job Scheduler Graduate Institute of Communication Engineering {r98942067, r98942112}@ntu.edu.tw March 31, 2010 1. Motivation Recently, the computing becomes much more complex. However,
CATS-i : LINUX CLUSTER ADMINISTRATION TOOLS ON THE INTERNET
CATS-i : LINUX CLUSTER ADMINISTRATION TOOLS ON THE INTERNET Jiyeon Kim, Yongkwan Park, Sungjoo Kwon, Jaeyoung Choi {heaven, psiver, lithmoon}@ss.ssu.ac.kr, [email protected] School of Computing, Soongsil
Performance Tools for Parallel Java Environments
Performance Tools for Parallel Java Environments Sameer Shende and Allen D. Malony Department of Computer and Information Science, University of Oregon {sameer,malony}@cs.uoregon.edu http://www.cs.uoregon.edu/research/paracomp/tau
Firewall Builder Architecture Overview
Firewall Builder Architecture Overview Vadim Zaliva Vadim Kurland Abstract This document gives brief, high level overview of existing Firewall Builder architecture.
Grid Scheduling Dictionary of Terms and Keywords
Grid Scheduling Dictionary Working Group M. Roehrig, Sandia National Laboratories W. Ziegler, Fraunhofer-Institute for Algorithms and Scientific Computing Document: Category: Informational June 2002 Status
A Survey Study on Monitoring Service for Grid
A Survey Study on Monitoring Service for Grid Erkang You [email protected] ABSTRACT Grid is a distributed system that integrates heterogeneous systems into a single transparent computer, aiming to provide
SAP Data Services 4.X. An Enterprise Information management Solution
SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification
Provisioning and Resource Management at Large Scale (Kadeploy and OAR)
Provisioning and Resource Management at Large Scale (Kadeploy and OAR) Olivier Richard Laboratoire d Informatique de Grenoble (LIG) Projet INRIA Mescal 31 octobre 2007 Olivier Richard ( Laboratoire d Informatique
End-user Tools for Application Performance Analysis Using Hardware Counters
1 End-user Tools for Application Performance Analysis Using Hardware Counters K. London, J. Dongarra, S. Moore, P. Mucci, K. Seymour, T. Spencer Abstract One purpose of the end-user tools described in
Manjrasoft Market Oriented Cloud Computing Platform
Manjrasoft Market Oriented Cloud Computing Platform Aneka Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.
An Introduction to High Performance Computing in the Department
An Introduction to High Performance Computing in the Department Ashley Ford & Chris Jewell Department of Statistics University of Warwick October 30, 2012 1 Some Background 2 How is Buster used? 3 Software
IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version 6.3.1 Fix Pack 2.
IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Hyper-V Server Agent Version 6.3.1 Fix Pack 2 Reference IBM Tivoli Composite Application Manager for Microsoft Applications:
JRastro: A Trace Agent for Debugging Multithreaded and Distributed Java Programs
JRastro: A Trace Agent for Debugging Multithreaded and Distributed Java Programs Gabriela Jacques da Silva 1, Lucas Mello Schnorr 1, Benhur de Oliveira Stein 1 1 Laboratório de Sistemas de Computação Universidade
Survey of execution monitoring tools for computer clusters
Survey of execution monitoring tools for computer clusters Espen S. Johnsen Otto J. Anshus John Markus Bjørndalen Lars Ailo Bongo Department of Computer Science, University of Tromsø September 29, 2003
MapCenter: An Open Grid Status Visualization Tool
MapCenter: An Open Grid Status Visualization Tool Franck Bonnassieux Robert Harakaly Pascale Primet UREC CNRS UREC CNRS RESO INRIA ENS Lyon, France ENS Lyon, France ENS Lyon, France [email protected]
A scalable file distribution and operating system installation toolkit for clusters
A scalable file distribution and operating system installation toolkit for clusters Philippe Augerat 1, Wilfrid Billot 2, Simon Derr 2, Cyrille Martin 3 1 INPG, ID Laboratory, Grenoble, France 2 INRIA,
Mitglied der Helmholtz-Gemeinschaft. System monitoring with LLview and the Parallel Tools Platform
Mitglied der Helmholtz-Gemeinschaft System monitoring with LLview and the Parallel Tools Platform November 25, 2014 Carsten Karbach Content 1 LLview 2 Parallel Tools Platform (PTP) 3 Latest features 4
Using an In-Memory Data Grid for Near Real-Time Data Analysis
SCALEOUT SOFTWARE Using an In-Memory Data Grid for Near Real-Time Data Analysis by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 IN today s competitive world, businesses
MIGRATING DESKTOP AND ROAMING ACCESS. Migrating Desktop and Roaming Access Whitepaper
Migrating Desktop and Roaming Access Whitepaper Poznan Supercomputing and Networking Center Noskowskiego 12/14 61-704 Poznan, POLAND 2004, April white-paper-md-ras.doc 1/11 1 Product overview In this whitepaper
Implementing Network Attached Storage. Ken Fallon Bill Bullers Impactdata
Implementing Network Attached Storage Ken Fallon Bill Bullers Impactdata Abstract The Network Peripheral Adapter (NPA) is an intelligent controller and optimized file server that enables network-attached
Cisco Application Networking Manager Version 2.0
Cisco Application Networking Manager Version 2.0 Cisco Application Networking Manager (ANM) software enables centralized configuration, operations, and monitoring of Cisco data center networking equipment
The System Monitor Handbook. Chris Schlaeger John Tapsell Chris Schlaeger Tobias Koenig
Chris Schlaeger John Tapsell Chris Schlaeger Tobias Koenig 2 Contents 1 Introduction 6 2 Using System Monitor 7 2.1 Getting started........................................ 7 2.2 Process Table.........................................
Ikasan ESB Reference Architecture Review
Ikasan ESB Reference Architecture Review EXECUTIVE SUMMARY This paper reviews the Ikasan Enterprise Integration Platform within the construct of a typical ESB Reference Architecture model showing Ikasan
PATROL From a Database Administrator s Perspective
PATROL From a Database Administrator s Perspective September 28, 2001 Author: Cindy Bean Senior Software Consultant BMC Software, Inc. 3/4/02 2 Table of Contents Introduction 5 Database Administrator Tasks
Generic Log Analyzer Using Hadoop Mapreduce Framework
Generic Log Analyzer Using Hadoop Mapreduce Framework Milind Bhandare 1, Prof. Kuntal Barua 2, Vikas Nagare 3, Dynaneshwar Ekhande 4, Rahul Pawar 5 1 M.Tech(Appeare), 2 Asst. Prof., LNCT, Indore 3 ME,
automates system administration for homogeneous and heterogeneous networks
IT SERVICES SOLUTIONS SOFTWARE IT Services CONSULTING Operational Concepts Security Solutions Linux Cluster Computing automates system administration for homogeneous and heterogeneous networks System Management
INTRODUCTION ADVANTAGES OF RUNNING ORACLE 11G ON WINDOWS. Edward Whalen, Performance Tuning Corporation
ADVANTAGES OF RUNNING ORACLE11G ON MICROSOFT WINDOWS SERVER X64 Edward Whalen, Performance Tuning Corporation INTRODUCTION Microsoft Windows has long been an ideal platform for the Oracle database server.
An Oracle White Paper October 2013. Oracle Data Integrator 12c New Features Overview
An Oracle White Paper October 2013 Oracle Data Integrator 12c Disclaimer This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should
Also on the Performance tab, you will find a button labeled Resource Monitor. You can invoke Resource Monitor for additional analysis of the system.
1348 CHAPTER 33 Logging and Debugging Monitoring Performance The Performance tab enables you to view the CPU and physical memory usage in graphical form. This information is especially useful when you
COMP5426 Parallel and Distributed Computing. Distributed Systems: Client/Server and Clusters
COMP5426 Parallel and Distributed Computing Distributed Systems: Client/Server and Clusters Client/Server Computing Client Client machines are generally single-user workstations providing a user-friendly
ERserver. iseries. Work management
ERserver iseries Work management ERserver iseries Work management Copyright International Business Machines Corporation 1998, 2002. All rights reserved. US Government Users Restricted Rights Use, duplication
Run your own Oracle Database Benchmarks with Hammerora
Run your own Oracle Database Benchmarks with Hammerora Steve Shaw Intel Corporation UK Keywords: Database, Benchmark Performance, TPC-C, TPC-H, Hammerora Introduction The pace of change in database infrastructure
Using In-Memory Computing to Simplify Big Data Analytics
SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed
Part I Courses Syllabus
Part I Courses Syllabus This document provides detailed information about the basic courses of the MHPC first part activities. The list of courses is the following 1.1 Scientific Programming Environment
- An Essential Building Block for Stable and Reliable Compute Clusters
Ferdinand Geier ParTec Cluster Competence Center GmbH, V. 1.4, March 2005 Cluster Middleware - An Essential Building Block for Stable and Reliable Compute Clusters Contents: Compute Clusters a Real Alternative
A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications
A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications Kai Li, Allen D. Malony, Robert Bell, and Sameer Shende University of Oregon, Eugene, OR 97403 USA, {likai,malony,bertie,sameer}@cs.uoregon.edu
Scalable Cloud Computing Solutions for Next Generation Sequencing Data
Scalable Cloud Computing Solutions for Next Generation Sequencing Data Matti Niemenmaa 1, Aleksi Kallio 2, André Schumacher 1, Petri Klemelä 2, Eija Korpelainen 2, and Keijo Heljanko 1 1 Department of
Implementing Probes for J2EE Cluster Monitoring
Implementing s for J2EE Cluster Monitoring Emmanuel Cecchet, Hazem Elmeleegy, Oussama Layaida, Vivien Quéma LSR-IMAG Laboratory (CNRS, INPG, UJF) - INRIA INRIA Rhône-Alpes, 655 av. de l Europe, 38334 Saint-Ismier
Version 14.0. Overview. Business value
PRODUCT SHEET CA Datacom Server CA Datacom Server Version 14.0 CA Datacom Server provides web applications and other distributed applications with open access to CA Datacom /DB Version 14.0 data by providing
PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems
PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems Pavan Balaji 1, Darius Buntinas 1, David Goodell 1, William Gropp 2, Jayesh Krishna 1, Ewing Lusk 1, and Rajeev Thakur 1
Cray: Enabling Real-Time Discovery in Big Data
Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects
No.1 IT Online training institute from Hyderabad Email: [email protected] URL: sriramtechnologies.com
I. Basics 1. What is Application Server 2. The need for an Application Server 3. Java Application Solution Architecture 4. 3-tier architecture 5. Various commercial products in 3-tiers 6. The logic behind
a division of Technical Overview Xenos Enterprise Server 2.0
Technical Overview Enterprise Server 2.0 Enterprise Server Architecture The Enterprise Server (ES) platform addresses the HVTO business challenges facing today s enterprise. It provides robust, flexible
UNISOL SysAdmin. SysAdmin helps systems administrators manage their UNIX systems and networks more effectively.
1. UNISOL SysAdmin Overview SysAdmin helps systems administrators manage their UNIX systems and networks more effectively. SysAdmin is a comprehensive system administration package which provides a secure
Microsoft SQL Server for Oracle DBAs Course 40045; 4 Days, Instructor-led
Microsoft SQL Server for Oracle DBAs Course 40045; 4 Days, Instructor-led Course Description This four-day instructor-led course provides students with the knowledge and skills to capitalize on their skills
Manjrasoft Market Oriented Cloud Computing Platform
Manjrasoft Market Oriented Cloud Computing Platform Innovative Solutions for 3D Rendering Aneka is a market oriented Cloud development and management platform with rapid application development and workload
MD Link Integration. 2013 2015 MDI Solutions Limited
MD Link Integration 2013 2015 MDI Solutions Limited Table of Contents THE MD LINK INTEGRATION STRATEGY...3 JAVA TECHNOLOGY FOR PORTABILITY, COMPATIBILITY AND SECURITY...3 LEVERAGE XML TECHNOLOGY FOR INDUSTRY
Exporting IBM i Data to Syslog
Exporting IBM i Data to Syslog A White Paper from Safestone Technologies By Nick Blattner, System Engineer www.safestone.com Contents Overview... 2 Safestone... 2 SIEM consoles... 2 Parts and Pieces...
Monitoring Infrastructure for Superclusters: Experiences at MareNostrum
ScicomP13 2007 SP-XXL Monitoring Infrastructure for Superclusters: Experiences at MareNostrum Garching, Munich Ernest Artiaga Performance Group BSC-CNS, Operations Outline BSC-CNS and MareNostrum Overview
Apache Web Server Execution Tracing Using Third Eye
Apache Web Server Execution Tracing Using Third Eye Raimondas Lencevicius Alexander Ran Rahav Yairi Nokia Research Center, 5 Wayside Road, Burlington, MA 01803, USA [email protected] [email protected]
SAS Grid Manager Testing and Benchmarking Best Practices for SAS Intelligence Platform
SAS Grid Manager Testing and Benchmarking Best Practices for SAS Intelligence Platform INTRODUCTION Grid computing offers optimization of applications that analyze enormous amounts of data as well as load
SequeLink Server for ODBC Socket
P RODUCT O VERVIEW Server for ODBC Socket Overview DataDirect is highly scalable, server-based middleware that gives you a complete platform for data connectivity. Common Servers offer the performance
A Protocol Based Packet Sniffer
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
Parallel Analysis and Visualization on Cray Compute Node Linux
Parallel Analysis and Visualization on Cray Compute Node Linux David Pugmire, Oak Ridge National Laboratory and Hank Childs, Lawrence Livermore National Laboratory and Sean Ahern, Oak Ridge National Laboratory
Analysis and Implementation of Cluster Computing Using Linux Operating System
IOSR Journal of Computer Engineering (IOSRJCE) ISSN: 2278-0661 Volume 2, Issue 3 (July-Aug. 2012), PP 06-11 Analysis and Implementation of Cluster Computing Using Linux Operating System Zinnia Sultana
Towards Smart and Intelligent SDN Controller
Towards Smart and Intelligent SDN Controller - Through the Generic, Extensible, and Elastic Time Series Data Repository (TSDR) YuLing Chen, Dell Inc. Rajesh Narayanan, Dell Inc. Sharon Aicler, Cisco Systems
Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
MPI / ClusterTools Update and Plans
HPC Technical Training Seminar July 7, 2008 October 26, 2007 2 nd HLRS Parallel Tools Workshop Sun HPC ClusterTools 7+: A Binary Distribution of Open MPI MPI / ClusterTools Update and Plans Len Wisniewski
Oracle Warehouse Builder 10g
Oracle Warehouse Builder 10g Architectural White paper February 2004 Table of contents INTRODUCTION... 3 OVERVIEW... 4 THE DESIGN COMPONENT... 4 THE RUNTIME COMPONENT... 5 THE DESIGN ARCHITECTURE... 6
Chapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
Petascale Software Challenges. Piyush Chaudhary [email protected] High Performance Computing
Petascale Software Challenges Piyush Chaudhary [email protected] High Performance Computing Fundamental Observations Applications are struggling to realize growth in sustained performance at scale Reasons
Module 12: Microsoft Windows 2000 Clustering. Contents Overview 1 Clustering Business Scenarios 2 Testing Tools 4 Lab Scenario 6 Review 8
Module 12: Microsoft Windows 2000 Clustering Contents Overview 1 Clustering Business Scenarios 2 Testing Tools 4 Lab Scenario 6 Review 8 Information in this document is subject to change without notice.
Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies
Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies Kurt Klemperer, Principal System Performance Engineer [email protected] Agenda Session Length:
Performance Monitoring of Parallel Scientific Applications
Performance Monitoring of Parallel Scientific Applications Abstract. David Skinner National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory This paper introduces an infrastructure
Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela
Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance
Communiqué 4. Standardized Global Content Management. Designed for World s Leading Enterprises. Industry Leading Products & Platform
Communiqué 4 Standardized Communiqué 4 - fully implementing the JCR (JSR 170) Content Repository Standard, managing digital business information, applications and processes through the web. Communiqué
Analyzing Network Servers. Disk Space Utilization Analysis. DiskBoss - Data Management Solution
DiskBoss - Data Management Solution DiskBoss provides a large number of advanced data management and analysis operations including disk space usage analysis, file search, file classification and policy-based
Chapter 1 - Web Server Management and Cluster Topology
Objectives At the end of this chapter, participants will be able to understand: Web server management options provided by Network Deployment Clustered Application Servers Cluster creation and management
Resource Utilization of Middleware Components in Embedded Systems
Resource Utilization of Middleware Components in Embedded Systems 3 Introduction System memory, CPU, and network resources are critical to the operation and performance of any software system. These system
ATLAS job monitoring in the Dashboard Framework
ATLAS job monitoring in the Dashboard Framework J Andreeva 1, S Campana 1, E Karavakis 1, L Kokoszkiewicz 1, P Saiz 1, L Sargsyan 2, J Schovancova 3, D Tuckett 1 on behalf of the ATLAS Collaboration 1
DIABLO VALLEY COLLEGE CATALOG 2014-2015
COMPUTER SCIENCE COMSC The computer science department offers courses in three general areas, each targeted to serve students with specific needs: 1. General education students seeking a computer literacy
Functions of NOS Overview of NOS Characteristics Differences Between PC and a NOS Multiuser, Multitasking, and Multiprocessor Systems NOS Server
Functions of NOS Overview of NOS Characteristics Differences Between PC and a NOS Multiuser, Multitasking, and Multiprocessor Systems NOS Server Hardware Windows Windows NT 4.0 Linux Server Software and
Cluster, Grid, Cloud Concepts
Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of
A Comparison on Current Distributed File Systems for Beowulf Clusters
A Comparison on Current Distributed File Systems for Beowulf Clusters Rafael Bohrer Ávila 1 Philippe Olivier Alexandre Navaux 2 Yves Denneulin 3 Abstract This paper presents a comparison on current file
Resource Management on Computational Grids
Univeristà Ca Foscari, Venezia http://www.dsi.unive.it Resource Management on Computational Grids Paolo Palmerini Dottorato di ricerca di Informatica (anno I, ciclo II) email: [email protected] 1/29
Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms
Volume 1, Issue 1 ISSN: 2320-5288 International Journal of Engineering Technology & Management Research Journal homepage: www.ijetmr.org Analysis and Research of Cloud Computing System to Comparison of
How To Understand The Concept Of A Distributed System
Distributed Operating Systems Introduction Ewa Niewiadomska-Szynkiewicz and Adam Kozakiewicz [email protected], [email protected] Institute of Control and Computation Engineering Warsaw University of
M.Sc. IT Semester III VIRTUALIZATION QUESTION BANK 2014 2015 Unit 1 1. What is virtualization? Explain the five stage virtualization process. 2.
M.Sc. IT Semester III VIRTUALIZATION QUESTION BANK 2014 2015 Unit 1 1. What is virtualization? Explain the five stage virtualization process. 2. What are the different types of virtualization? Explain
