A Scalable Control and Monitoring Framework to Aid the Development of Supercomputer Applications
Gregory R. Watson, IBM Systems & Technology Group
Carsten Karbach, Forschungszentrum Jülich GmbH
Wolfgang Frings, Forschungszentrum Jülich GmbH
Albert L. Rossi, Fermi National Accelerator Laboratory
Claudia Knobloch, Forschungszentrum Jülich GmbH

ABSTRACT
The development of scientific applications for parallel computing systems is becoming increasingly challenging. Petascale systems are now becoming readily available to the scientific computing community, and planning is underway to achieve exascale within the next decade. The vast power of these systems, coupled with a corresponding increase in application code complexity, is making the limitations of existing programming and performance tools ever more apparent. If developers are to utilize these systems effectively, a new generation of tools will be required that integrate seamlessly with each other and with the target systems on which they operate.

The open source Parallel Tools Platform (PTP) project was established in 2005 to create a best-practice integrated tool workbench designed to increase the productivity of parallel application development. PTP has increased in popularity over the years, and is now used by a growing community of developers in scientific and engineering fields. However, PTP must also adapt to the new petascale and exascale environments. In this paper we describe some of the recent changes to the PTP core infrastructure that will enable it to work effectively with these and future generations of high performance computing systems.

1. INTRODUCTION
Recent announcements have heralded a new generation of petascale systems, including most recently the National Center for Supercomputing Applications (NCSA) Blue Waters system and the National Center for Atmospheric Research (NCAR) Yellowstone machine.
The top 10 systems on the November 2011 TOP500 list all now exceed one petaflop peak performance. The drivers for this massive increase in computational power are the large and complex applications now being used to perform some of the most detailed and accurate numerical simulations ever contemplated. For these applications to be successful, the utmost level of performance must be extracted from the hardware; it is no use having a 10 petaflop system if only a fraction of the resource can actually be used. Unfortunately, the complexity of both the applications and the computer systems they run on is also stretching the limits of existing programming and performance tools. If developers are to utilize these systems effectively, a new generation of tools will be required that provide significantly improved capability over the current ones.

In 2005, the Parallel Tools Platform (PTP) project was established in order to advance the state of parallel application development and provide an integrating framework for the development of parallel tools. As PTP has gained in popularity, and is now being contemplated as the development environment for systems such as Blue Waters, it is becoming increasingly important for the platform to be able to support these petascale systems.

In this paper, we present our recent work to improve the scalability and usability of PTP. In Section 2 we will discuss the motivation for the changes in more detail.
In Section 3 we will present the overall architecture of PTP and the components on which the current work is concentrating. In Section 4 we will present improvements to the scalability of the platform, and in Section 5 we will provide details of how the environment is now significantly more extensible. Section 6 proposes some areas of future work, and Section 7 concludes the paper.

2. MOTIVATION
Many tools are available to aid developers of HPC applications, ranging from compilers to build systems to performance analysis and tuning tools. However, most of these either provide stand-alone GUIs or are command-line tools, which complicates the developer's work for a number of reasons. First, the developer must spend considerable time understanding and learning the different tool interfaces. Second, the tools typically do not share information, so the developer must set up and configure the tools with the same information multiple times. Finally, the developer's workflow is encumbered by the need to manually switch between tools in order to access the desired functionality.

Integrated development environments (IDEs) have long been used to overcome all these issues (and more), and are best practice for most of the software engineering industry. Strangely, HPC is one of the few disciplines that have not accepted the productivity benefits that IDEs have been shown to deliver, although this is now changing.

A number of past efforts have attempted to create integrated environments for developing parallel programs [1] [2] [3] but few, if any, of these survive today. In addition, there are a variety of tools available for monitoring job and system status on large high performance computing systems [4] [5], and some batch systems provide facilities for remote job submission and monitoring. To the authors' knowledge, PTP is unique in that it integrates not only a broad range of development tools, but also the systems themselves, allowing the developer to submit jobs and monitor activity on one or more target systems from within the development environment. This ability enables developers to streamline their development workflow so that they can avoid time-consuming and costly context switches between different tools, something that is essential for increasing developer productivity.

The PTP project brings together a range of tools for developing C, C++, Fortran and UPC applications into a single integrated environment. In addition to advanced editing, project management, and integration with version control systems (CVS, Subversion, and Git), there is also support for MPI development, and an integrated parallel debugger [6]. A number of performance tools have also been integrated (see footnote 4).
When PTP was first developed, even the largest systems were relatively small compared to today's machines (see footnote 5), and monitoring the entire system and the user's jobs was relatively straightforward. Advances in system size have led us to make changes to the PTP core infrastructure in order to improve the scalability of system and job monitoring and to simplify the process of adding support for new systems and job schedulers.

For scalable monitoring, we have based the implementation on an existing batch system monitoring tool called LLview [7] that is known to be highly scalable. This enables information about the user's job execution, along with a full overview of the system, to be viewed at regular intervals. Such a live view enables greater awareness of the target system, its complexity and the circumstances under which the user's jobs are running. The monitoring component is able to show the current usage of the full system, including the mapping between the jobs and the compute nodes and the load of the batch queues. We plan to extend this in forthcoming versions to display a prediction of the future system usage based on the current state. This would provide the user with, for example, more detailed reasons as to why a job has not yet been started by the batch system.

For extensibility, we have designed a completely generic framework for controlling job submission to a target system, whether interactive or batch. Support for a new type of job scheduler, for example, can be completely specified via an XML definition file. This specification includes commands to run on the target system to perform job related activities (e.g. submission, termination, etc.) as well as information on how to lay out the user interface so that users can supply resource related information required by the target system.

Footnote 4: University of Oregon's TAU, IBM's HPC Toolkit, and others.
Footnote 5: At the time, the world's largest system was a 1024 node cluster at Los Alamos National Laboratory.

3. ARCHITECTURE
Even in modest supercomputing installations, the computing resources routinely used by developers are scarce and must be shared by large numbers of users. Such systems are usually centrally located in specialized facilities, and must be accessed remotely. They are also typically highly customized systems, and rarely employ the same system software or development tools. PTP is a set of plug-ins for the Eclipse Platform that extends its functionality to provide various features for assisting HPC application developers in these types of environments, particularly those using MPI and other parallel programming models.

3.1 Control and Monitoring Frameworks
PTP provides developers with a variety of techniques for simplifying the way in which remote computing resources are accessed and utilized, and these have been discussed in detail elsewhere [8]. The two key components of PTP that are the focus of this paper are the control and monitoring frameworks. These are the primary mechanisms for hiding the intricacies of the complicated system software used on HPC machines, and are the means by which PTP users launch and debug jobs, and monitor activity on a target system.

The control framework provides an abstraction of a batch scheduler, interactive runtime system, or some other means of controlling jobs on a system. The monitoring framework provides a mechanism for monitoring the system and job-related activity on a target machine. The control and monitoring frameworks operate completely independently; however, it is also possible to link a control and monitor implementation so they can be used together when appropriate. PTP also allows multiple control and monitoring systems to be defined, each able to interact with different types of systems simultaneously, even if these are on completely independent machines.

Figure 1 shows a high-level architecture of PTP.
On the left, the Eclipse platform provides the user's main development environment, and acts as the client for developing HPC applications and accessing supercomputing resources; typically this client runs on the user's workstation or laptop.

The control framework issues commands to submit new jobs and perform operations on existing jobs (such as cancelling a job). The control framework is also responsible for handling standard output generated by batch or interactive jobs, standard input to interactive jobs, as well as initiating debug sessions. Launching a job via the control framework uses the normal Eclipse launch configuration mechanism.

The monitoring framework manages a content model as well as views of system and job information, and is responsible for collecting data from the batch system or interactive runtime system (or both, depending on the configuration of the target system) and presenting this information to the user. Communication with the target system for both control and monitoring is via a single SSH connection (raw TCP/IP sockets can also be used).
Figure 1: High-level architecture of PTP. On the left is the Eclipse client that normally runs on the user's workstation or laptop. On the right is the supercomputing resource that the user is developing and running applications on. Interaction between the client and the target system is required for launching, controlling, and monitoring applications. An agent is used on the target system to manage the formatting of monitoring data.

3.2 Model Driven Architecture
Both the control and monitoring frameworks use XML data formats to drive the user interface and other functions. The control component uses XML for its definition files, which provide information about the target system, such as the type of batch system, commands for job submission and control, as well as the layout of the launch configuration user interface. The monitor component uses XML for communicating monitoring and layout information between the Eclipse client and the target system.

PTP uses the Java Architecture for XML Binding (JAXB) to map XML information directly to Java classes so that each XML element has a corresponding Java representation. When the framework needs to access a configuration file, or receives an XML formatted message, the XML is unmarshalled and merged into the internal content model. Various parts of the Eclipse user interface, including the launch configuration and the monitoring and status views, are driven directly from this content model.

4. SCALABILITY
Adapting PTP to systems at petascale and beyond requires careful consideration of all aspects of the interaction between the local Eclipse client and the target system on which the user's jobs will be running. This is especially the case for system monitoring, which must process and present large amounts of monitoring data generated by these systems.
Unless extreme care is taken, it is very easy to overwhelm the user by presenting too much information, or to run into speed or memory constraints when trying to process the information within Eclipse. The scalability of PTP's architecture was demonstrated during SC11 in November 2011 by simulating a full scale BG/Q system of approximately 1.6M cores.

In the following sections, we will discuss the scalability features of the monitoring component in more detail, concentrating on three main areas for increasing scalability: the data representation, the user interface components, and remote data acquisition.

4.1 Monitor Data Representation
PTP's monitoring framework uses a client-server model. The server is responsible for collecting information about the target system and the jobs on the system. This information is then passed to the client where it is presented to the user via the Eclipse user interface. We have designed the Large-scale system Markup Language (LML), which is an XML schema that defines the structure of this monitoring data [9]. LML can be used to describe the status of arbitrarily large computer systems; there are no restrictions on the system's architecture or size.

LML is designed so that one instance (see footnote 6) provides a snapshot of the current system's state. It consists of a set of independently presentable graphical objects specified by simple elements such as table, textbox, and diagram, and more complex elements such as nodedisplay. The server generates these elements along with elements containing the data that is to be displayed.

The nodedisplay element is the most important part of LML for providing an overview of the system's state. It presents a graphical view of the system and displays the physical location of jobs currently running on the system. The nodedisplay element contains two children: scheme and data.
The scheme element defines the physical hierarchy of the target system, while the data element associates physical components with dynamic aspects of the system, such as the nodes on which jobs are running. Listing 1 shows an example of a nodedisplay element used to represent a system like the Jülich Blue Gene/P. The scheme element is used to define a system comprising 72 racks, where each rack has 32 node cards and a node card has 32 chips, each of which contains 4 cores. (Footnote 6: We use "instance" to refer to an XML document which is valid against the LML schema.) Following this is a data element that
uses the same hierarchy defined by the scheme element for addressing physical components. Every el element (el1, el2, etc.) within the data element has an oid attribute, which is used to reference other elements in the LML data (such as a user's job on the system). The oid attribute is also inherited by children of an element, which eliminates any redundancy in the data. Elements with identical oid attributes can be specified more compactly using ranges.

Listing 1: LML nodedisplay example

    <nodedisplay title="Jugene" id="nd">
      <!-- Physical system structure -->
      <scheme>
        <el1 tagname="rack" min="1" max="72">
          <el2 tagname="nodecard" min="1" max="32">
            <el3 tagname="chip" min="1" max="32">
              <el4 tagname="cpu" min="1" max="4"/>
            </el3>
          </el2>
        </el1>
      </scheme>
      <!-- Connect physical elements to current jobs -->
      <data>
        <el1 min="1" max="36" oid="j1" status="running">
          <el2 min="9" oid="j2"/>
        </el1>
        <el1 min="37" max="72" oid="empty" status="idle" description="racks broken"/>
      </data>
    </nodedisplay>

Using this approach, it is possible to represent system information to any level of detail required, or to vary the level of detail used to represent different parts of the system. By eliminating parts of the hierarchy that are not required to be displayed, it is very easy to reduce the volume of information transmitted to the client. Another scalability technique we employ is to avoid repetitious data, which can be collected together using special elements (e.g. job names and the colors used to identify the jobs in the view).

It is also possible to display the physical structure defined by the data element in different levels of detail. The tree described by the data element can be eliminated from the client's view at any level. This reduces the amount of detail shown and hence the view's complexity.
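As an illustration of how the scheme and data elements of Listing 1 fit together, the following Python sketch counts the leaf elements (cpus) assigned to each oid. It is not PTP's implementation: the function names are hypothetical, and it assumes a single-chain scheme, that a missing max attribute defaults to min, and that a child element overrides the oid inherited from its parent.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Listing 1 without the XML comments.
LML = """
<nodedisplay title="Jugene" id="nd">
  <scheme>
    <el1 tagname="rack" min="1" max="72">
      <el2 tagname="nodecard" min="1" max="32">
        <el3 tagname="chip" min="1" max="32">
          <el4 tagname="cpu" min="1" max="4"/>
        </el3>
      </el2>
    </el1>
  </scheme>
  <data>
    <el1 min="1" max="36" oid="j1" status="running">
      <el2 min="9" oid="j2"/>
    </el1>
    <el1 min="37" max="72" oid="empty" status="idle" description="racks broken"/>
  </data>
</nodedisplay>
"""

def leaves_per_instance(scheme):
    """Return (sizes, total): sizes[d] is the leaf count under ONE element at depth d."""
    counts = []
    el = scheme[0] if len(scheme) else None
    while el is not None:  # walk the single chain el1 -> el2 -> ...
        counts.append(int(el.get("max")) - int(el.get("min")) + 1)
        el = el[0] if len(el) else None
    sizes, product = [], 1
    for count in reversed(counts):
        sizes.append(product)
        product *= count
    return list(reversed(sizes)), product

def leaves_by_oid(el, depth, sizes, inherited):
    """Count leaves per oid below a data element, honouring oid inheritance."""
    oid = el.get("oid", inherited)
    lo = int(el.get("min"))
    hi = int(el.get("max", lo))  # assumption: max defaults to min
    per_instance = Counter()
    covered = 0
    for child in el:
        child_use = leaves_by_oid(child, depth + 1, sizes, oid)
        per_instance.update(child_use)
        covered += sum(child_use.values())
    per_instance[oid] += sizes[depth] - covered  # leaves not claimed by children
    span = hi - lo + 1
    return Counter({key: value * span for key, value in per_instance.items()})

root = ET.fromstring(LML)
sizes, total = leaves_per_instance(root.find("scheme"))
bar = Counter()
for el in root.find("data"):
    bar.update(leaves_by_oid(el, 0, sizes, None))
print(total, dict(bar))
```

Under these assumptions, node card 9 of each of the first 36 racks is attributed to j2 and the remainder of those racks to j1, so the per-oid counts always sum to the total number of cpus defined by the scheme.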
However, just eliminating the lower levels of the tree results in a loss of data for the user. To avoid this, we provide a mechanism that summarizes the lower levels into a flat data structure, known as a usage bar, that remains a valid representation of the child elements. A usage bar is defined as a map whose keys are job references and whose values are the amounts of the smallest units defined by the scheme element (which is cpu in the above example). This map can be generated for each data element in the nodedisplay by calculating the total number of leaf elements and the number of leaves assigned to each job. A usage bar disregards the connection between jobs and corresponding compute resources, but ensures job information is still presented regardless of which level of detail is shown.

LML also provides the table element, which is used to represent tabular data such as jobs running on the system. The first part of the table element comprises column elements that specify information about each column in the table. Following the column definitions are row and cell elements that specify the contents of the table. In order to reduce the amount of data transmitted, column elements can contain pattern elements which specify the condition under which row data will be included in the table. For example, if users only want to see their own jobs, a pattern would be added to an owner column specifying a user name to match. Only rows containing this user name would be included in the table data sent to the client.

4.2 User Interface
Once the client acquires an LML instance from the server, it must be rendered in the Eclipse user interface so it can be viewed by the user. In addition, the user interface must react to user input such that associated information across the different LML components is visually emphasized. For example, nodes on which a job is running are highlighted when the user selects the corresponding line in a table of job information.
Users can also customize the client view by hiding, positioning, and scaling graphical components individually. As a result, one LML instance can lead to a number of different client views.

Nodes View
The primary user interface component is the view in which the nodedisplay element is shown. Each of the physical elements specified by the scheme element is rendered as a rectangle, with children painted recursively within this rectangle. The data elements are then expanded and elements on the lowest levels are filled with colors to identify the jobs running on them. Figure 2 shows how this hierarchical arrangement is used to display a full scale (96 rack) Blue Gene/Q down to the node card level. Because both the client and server sides (as discussed in more detail below) allow the level of detail to be defined, it is possible to display systems of virtually any size.

The view also allows the user to zoom into physical elements to see more details about the subtree (also shown in Figure 2). This allows a high-level view to be used to avoid overwhelming the user with information, while allowing the user to exploit the detailed information available in the LML instance. A usage bar can also be generated for each data element summarizing the content of its subtree. If the view is collapsed to a lower level of detail, usage bars are painted into the rectangles instead of just filling them with a single color.

Currently the view presents only a single node of the data tree. This node is usually the root, but can be altered by zooming into subtrees. We plan to extend this to allow an arbitrary number of trees to be displayed in the view. This would be useful for displaying the first five racks of a system, for example.

Table View
A table view is used to render additional information provided by the target system, such as the list of queued or active jobs.
Once again, care must be taken not to overwhelm the user with information, as the size of these tables tends to increase with the size of the system, and because more information about jobs and physical elements is transmitted. To keep tables manageable, the user is able to sort table data, hide columns, and take advantage of mouse interaction to visually connect displayed information across all graphical components.
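The pattern-based row filtering described in Section 4.1 and the sorting offered by the table view can be illustrated with a small Python sketch. The row layout, column names, and the filter_rows helper are hypothetical and do not reproduce the LML table schema; the sketch only shows the idea of dropping non-matching rows before they are displayed.

```python
import re

def filter_rows(rows, patterns):
    """Keep rows whose cells fully match the pattern of every constrained column."""
    compiled = {column: re.compile(regex) for column, regex in patterns.items()}
    return [
        row for row in rows
        if all(rx.fullmatch(str(row.get(column, ""))) for column, rx in compiled.items())
    ]

# Hypothetical job table as it might be assembled on the server side.
jobs = [
    {"oid": "j1", "owner": "alice", "queue": "batch", "nodes": 512},
    {"oid": "j2", "owner": "bob", "queue": "debug", "nodes": 32},
    {"oid": "j3", "owner": "alice", "queue": "debug", "nodes": 64},
]

# Only rows owned by "alice" are kept; the result is then sorted by node count.
mine = filter_rows(jobs, {"owner": "alice"})
mine.sort(key=lambda row: row["nodes"], reverse=True)
print([row["oid"] for row in mine])
```

Applying the pattern before transmission, as the paper describes, shrinks the data sent to the client, whereas sorting is purely a presentation choice that can be left to the client.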
Figure 2: Screenshot of a full scale 96 rack IBM Blue Gene/Q simulator (1.6M cores). The left hand side of the display shows an overview of the entire system comprising 12 rows of racks (row 12 is scrolled off the screen). Each row comprises 8 racks containing 2 mid-planes, and each mid-plane contains 16 node cards. The middle image shows the display zoomed into one row, and the right image shows the display zoomed into a single rack.

4.3 Data Acquisition
The acquisition of monitoring data is also an important scaling issue, because obtaining the full system state of a large parallel system may require significant resources. Although LML provides a scalable data format to store information about the system components using a hierarchical structure, this is generally not the case for resource management systems such as batch schedulers. Most of these systems only provide a flat data representation of the system, for example as a list of nodes and associated state information. As a consequence, a full system query could comprise a huge number of elements, leading to long query times and large amounts of data. Moreover, large numbers of users performing such queries frequently and simultaneously would place an unacceptable burden on the resource management system.

For full system monitoring, as is implemented in PTP, the user experience can be improved by mapping various attribute values to hardware components. For example, the identifier of a batch job can be mapped to the node that it is running on in order to give a visual indication of the utilization of the system. In general, the full system view has to represent one or more N-to-M mappings of attributes to components. However, at least one side of these relations has to be minimized, otherwise scalability becomes an issue. To address this, LML provides the hierarchical tree model in the nodedisplay element, which can be directly exploited when generating the mapping information.
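One simple way to reduce a flat node list before it reaches the client is to merge consecutive nodes that run the same job into ranges, in the spirit of the min and max attributes of Listing 1. The Python sketch below illustrates this idea only; the to_ranges function and the sample data are hypothetical and are not taken from the PTP or LLview sources.

```python
def to_ranges(node_jobs):
    """Merge consecutive node indices that share a job id into (min, max, oid) tuples."""
    ranges = []
    for index, oid in sorted(node_jobs.items()):
        last = ranges[-1] if ranges else None
        if last and last[2] == oid and last[1] == index - 1:
            ranges[-1] = (last[0], index, oid)  # extend the current run
        else:
            ranges.append((index, index, oid))  # start a new run
    return ranges

# Flat view as a batch system might report it: one entry per node.
flat = {index: "j1" for index in range(1, 9)}
flat.update({index: "j2" for index in range(9, 13)})
flat[13] = "j1"

print(to_ranges(flat))
```

Thirteen flat entries collapse into three ranges here; on a large system with contiguous allocations the reduction is correspondingly greater.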
In systems that have logical or physical hierarchies of components (e.g. Blue Gene/Q has partition configurations that can be described as sets of base partitions, which are typically mid-planes or node cards), this information can be used directly to generate the mapping to inner tree nodes of the nodedisplay element. For resource management systems which do not provide such a hierarchical representation, the queries have to be optimized in another way. The key to minimizing scalability problems when transitioning from a flat structure to the tree model is to perform the mapping as early as possible, and at the highest level of abstraction. For a system such as Blue Gene/Q this would be the mid-plane (or node card), while for a traditional cluster this might be compute nodes consisting of several processors or cores.

The acquisition of monitoring data on the remote system is performed by a set of Perl scripts that use the standard batch system query functions to obtain data about jobs, nodes, and other useful status information. To allow more flexibility, functions which are related to a particular batch system are separated into a driver layer. Typically these functions are realized by individual small scripts, each querying and generating LML code for one type of information (e.g. jobs). This simplifies the process of adapting to a different or newer version of a batch system. The scripts also provide mapping tables from batch-system-specific attribute names to an attribute naming scheme defined by LML.

The LML data generated by these scripts is combined into an LML intermediate format, containing only a list of objects and, for each object, a list of corresponding attributes. When a client requests monitoring information, the request information and intermediate format are used to generate LML data containing the appropriate elements (e.g. table, nodedisplay, diagram, etc.) for displaying the data to the user. Storing the monitoring data in an intermediate format
provides a number of advantages. In particular, other tools can generate data in intermediate format, and this can be merged with the monitoring data to enhance the utility of the data. We plan to provide such a tool in the future, which simulates system usage based on the current usage and job load on the system. When merged with the monitoring data, this adds new attributes that show predicted start time, ending time, and the nodes on which a job will run.

4.4 Scaling Results
In order to demonstrate the scalability of the monitoring system, we used the results from a 96 rack Blue Gene/Q simulator that was demonstrated at SC11. This system is equivalent to the LLNL Sequoia system that recently became #1 on the TOP500 list. For the node card level of detail (as shown in Figure 2), the update time, including collection of the data from the target system, was less than 10s, which we consider well within acceptable refresh times. The system has also recently been monitored to node-level detail, with similar results.

We have also run a number of tests on a variety of XSEDE systems, including the National Institute for Computational Sciences (NICS) Kraken and Keeneland systems, and the Texas Advanced Computing Center's (TACC) Lonestar and Ranger systems, as well as Argonne National Laboratory's Blue Gene/P and Q. Monitoring of all these systems was within acceptable times.

5. EXTENSIBILITY
Eclipse provides a standard framework for launching applications, and PTP uses this mechanism to support launching applications via the control framework. The Eclipse launch framework allows the user to create a launch configuration, which encapsulates all the information necessary for the application to be successfully launched, such as the location of the executable and any required arguments, and to enter this information via a user interface. Once the launch is configured, the user selects a Run button, and the appropriate actions are taken to launch the application.
Launch configurations are persisted across Eclipse sessions, so once created they can be reused for future job launches.

For launching jobs via PTP's control framework, a launch configuration specifies the type of batch system being used on the target machine. The user does this by choosing the batch or runtime system type from a list, and then providing some additional connection and authentication information. This normally only needs to be done once.

Once the job is submitted, users may receive an indication of the job status (queued, running, etc.) via a view in the user interface, where they can see all the jobs they have submitted. The same interface is also used for controlling the job. If supported by the target system, the output from the job can be viewed directly from within the Eclipse user interface in a console view. In the case of interactive jobs, this output is displayed immediately as the job begins to execute. For batch jobs, the output is generally available once the job has completed.

5.1 Definition File Format
A key feature of the control framework is that all the launch information required to interact with the batch system is contained within a single XML definition file. Users are able to import definition files into their workspace in order to add support for additional systems. This conveniently overcomes the existing limitation where the set of supported batch or interactive systems is fixed for each PTP release, and also allows system administrators to make site-customized definition files available to their users.

The XML definition file specifies how the resource manager will interact with the system, and how information obtained from the batch system will be presented to the user in order to successfully launch a job. A definition file schema describes the format of this configuration file and defines the following main types of elements: attribute, command, and launch-tab.
Attributes are used to represent information that is passed between the user interface and the target system. Commands specify how jobs are to be launched and controlled, and how job status information is to be obtained. Launch tabs are used to define the user interface for entering job specific information. Listing 2 shows a section of the definition file for the PBS resource manager.

Listing 2: Example definition file

    <resource-manager-builder name="pbs-torque-generic">
      <control-data>
        <attribute name="queues" visible="false"/>
        <attribute name="destination" type="string">
          <description>Designation of the queue to which to submit the job.</description>
          <tooltip>Format: queue ].</tooltip>
          <default>debug</default>
        </attribute>
        <start-up-command name="get-queues">
          <arg>qstat</arg>
          <arg>-Q</arg>
          <arg>-f</arg>
          <stdout-parser delim="\n">
            <target ref="queues">
              <match>
                <expression>Queue: ([\w\d]+)</expression>
                <add field="value">
                  <entry valuegroup="1"/>
                </add>
              </match>
            </target>
          </stdout-parser>
        </start-up-command>
        <launch-tab>
          <basic>
            <title>Basic PBS Settings</title>
            <composite group="true">
              <widget type="combo" style="SWT.BORDER" readonly="true" savevalueto="destination">
                <layout-data>
                  <grid-data horizontalalign="SWT.FILL" horizontalspan="2" grabexcesshorizontal="false"/>
                </layout-data>
                <items-from>queues</items-from>
              </widget>
            </composite>
          </basic>
        </launch-tab>
      </control-data>
    </resource-manager-builder>
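To make the role of the stdout-parser element in Listing 2 more concrete, the following Python sketch applies the same regular expression to each line of some command output and collects the first capture group, which is what the entry with valuegroup="1" appears to request. This is only an approximation of the behaviour suggested by the listing, not PTP's parser: the parse_queues function is hypothetical, the use of a search rather than a full-line match is an assumption, and the sample text is illustrative rather than captured from a real system.

```python
import re

# The expression from Listing 2; group 1 is the queue name.
QUEUE_EXPRESSION = re.compile(r"Queue: ([\w\d]+)")

def parse_queues(stdout, delim="\n"):
    """Split the output on the delimiter and collect group 1 of every matching chunk."""
    queues = []
    for chunk in stdout.split(delim):
        match = QUEUE_EXPRESSION.search(chunk)  # assumption: search, not full match
        if match:
            queues.append(match.group(1))
    return queues

# Illustrative output in the style of "qstat -Q -f".
sample = (
    "Queue: batch\n"
    "    queue_type = Execution\n"
    "    enabled = True\n"
    "\n"
    "Queue: debug\n"
    "    queue_type = Execution\n"
)

print(parse_queues(sample))
```

In the definition file, the collected values populate the queues attribute, which the combo widget then offers to the user through its items-from element.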
Figure 3: Resources tab of the Torque launch configuration showing the basic settings that are configurable by the user.

Listing 2 gives a good indication of the variety of widgets and layout options that are available to batch system implementers. Although not shown in the listing, there are also elements for importing, editing and using external job scripts, for managing files automatically on the target system, and for job submission and control commands (e.g. terminate, hold, release). In addition, there are commands for specifying an interactive launch via a batch system, and for launching a debug job.

5.2 Model Driven Configuration

A content model is created directly from the XML definition files and is used by the control framework to dynamically generate the launch configuration user interface. This user interface comprises a number of tabs, each of which allows the user to supply a different kind of information required for the launch. For the control framework, one of these tabs, the Resources tab, is rendered directly from the widget elements specified in the launch-tab section of the XML definition file. Attributes and parameters defined in the configuration are then used to communicate user choices to the job submission command. This allows the tab to be completely customized to suit a particular batch system or runtime system. Figure 3 shows the basic settings tab for the Torque job scheduler definition.

The content model is also used by the control operations of the framework in order to interact with the target system. Activities such as job submission, querying job status, and handling standard output redirection are all driven by data obtained from the model elements. The framework also uses the presence or absence of model elements to determine whether particular actions are available. For example, the absence of a submit-interactive element would indicate that the resource manager supports batch-only submission.

6. FUTURE WORK

We have provided a range of techniques for improving scalability by reducing the volume of data transferred between the client and server. However, there is still additional work we plan to do in this area. In particular, we plan to implement a mechanism to send only the differences between two successive LML instances. Because few changes to system state typically occur in the short intervals used for monitoring data collection, transmitting only the differences can be quite efficient. For this approach to work, however, the server has to manage system states for every connection in order to be able to compute the difference between two successive LML instances. In addition, the LML schema has to be extended to support differences, so that incomplete data can still be well-formed and valid against the LML schema.
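The bookkeeping that such a difference mechanism implies can be sketched as follows. This is an illustrative Python sketch only, not the planned implementation: it assumes that a monitoring snapshot can be reduced to a dictionary mapping object identifiers to attribute dictionaries, and it does not model LML or its schema. The names compute_diff, apply_diff and SnapshotTracker are hypothetical.

```python
import copy


def compute_diff(previous, current):
    """Return the objects removed since the previous snapshot and the new or changed ones."""
    removed = sorted(key for key in previous if key not in current)
    changed = {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }
    return {"removed": removed, "changed": changed}


def apply_diff(state, diff):
    """Client side: rebuild the full snapshot from the last known state and a diff."""
    result = {key: value for key, value in state.items() if key not in diff["removed"]}
    result.update(diff["changed"])
    return result


class SnapshotTracker:
    """Server side: remember the last snapshot sent on each connection."""

    def __init__(self):
        self._last_sent = {}

    def next_update(self, connection_id, current):
        # A connection seen for the first time receives the full snapshot as its diff.
        previous = self._last_sent.get(connection_id, {})
        diff = compute_diff(previous, current)
        self._last_sent[connection_id] = copy.deepcopy(current)
        return diff


if __name__ == "__main__":
    tracker = SnapshotTracker()
    first = {"job1": {"state": "RUNNING"}, "job2": {"state": "QUEUED"}}
    second = {"job2": {"state": "RUNNING"}, "job3": {"state": "QUEUED"}}

    client_state = apply_diff({}, tracker.next_update("client-a", first))
    update = tracker.next_update("client-a", second)
    client_state = apply_diff(client_state, update)
    print(update["removed"], sorted(update["changed"]))
    print(client_state == second)
```

The per-connection dictionary in SnapshotTracker corresponds to the requirement that the server manage system state for every connection; its memory cost grows with the number of connected clients, which is one of the trade-offs of this approach.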
Our initial implementation provides generic support for job submission and monitoring using a number of batch systems, including PBS, Torque, ALPS, LoadLeveler, SLURM, and GridEngine. We also support interactive submission using IBM's Parallel Environment, Open MPI, MPICH2, and MVAPICH, as well as a simple remote launching capability. We are planning to add support for specific systems, such as those participating in the Extreme Science and Engineering Discovery Environment (XSEDE), as well as a range of other machines.

Finally, we will continue to make improvements and enhancements to other parts of PTP not discussed in this paper, including new refactoring tools for Fortran, improvements to the support for remote synchronized projects, and enhancements to the parallel debugger. Many of these new features and improvements will be available in the 6.0 release of PTP on June 27.

7. CONCLUSION

If PTP is to continue to provide best-practice tools for the development of parallel application codes, it must be scalable and extensible enough to meet the demands of the next generation of petascale systems and beyond. Scalability matters in a number of different areas of such a development environment, including the ability to manage an extremely large code base, the performance of integrated tools that operate on source and object code, the ability of user interfaces to present large amounts of data in a meaningful way, and the capacity to provide an abstraction of the very large systems that are being targeted by the developer. In this paper we have only examined the latter two.

In particular, we have described how PTP has been modified to scalably monitor a target system of arbitrary size and to present this information to the user in a useful manner. We have built this functionality on an existing framework that we know is highly scalable, and that has been used in production systems for some years.
In addition, we have discussed how a completely generic configuration system has been designed and implemented that significantly improves the ability of PTP to be extended to support additional batch and runtime systems. This configuration system provides a completely customizable way of interacting with the plethora of target environments and systems that are currently available. Although model-driven architectures are not new, this is the first time that such an approach has been used in a development environment.

We believe that these new enhancements will greatly encourage third-party developers to expand the support base of the platform. By doing so, we hope to expand the community of developers and users who see PTP as one of the key technologies for dealing with the complexities of petascale computing environments.

Acknowledgements

The authors would like to acknowledge the efforts of the many contributors without whom the Parallel Tools Platform would not exist. These include the Eclipse Foundation, Los Alamos National Laboratory, Monash University, IBM Corporation, University of Oregon, Oak Ridge National Laboratory, National Center for Supercomputing Applications and others, along with the many individuals who have shared their ideas and suggestions. Thanks also to Simon Wail for his work demonstrating PTP on BG/Q, and for providing the screenshots in Figure 2. This material is partly based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under its Agreement No. HR , the United States Department of Energy under Contract No. DE-FG02-06ER25752 and Program DE-PS02-08ER08-19, and by the National Science Foundation under award number OCI
More informationGrid Computing Approach for Dynamic Load Balancing
International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-1 E-ISSN: 2347-2693 Grid Computing Approach for Dynamic Load Balancing Kapil B. Morey 1*, Sachin B. Jadhav
More informationHPC Wales Skills Academy Course Catalogue 2015
HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses
More information