On-Line Monitoring: A Tutorial




Beth A. Schroeder
State University of New York, Binghamton

On-line monitoring can complement formal techniques to increase application dependability. This tutorial outlines the concepts and identifies the activities that comprise event-based monitoring, describing several representative monitoring systems.

Although monitoring has been around since the early 1960s with the advent of debuggers, the field has recently made some exciting advances. Monitoring systems today monitor distributed applications and are often themselves distributed. In addition, they are increasingly seen as a viable solution to areas of growing concern: lack of dependability and tools to support distributed applications. Monitoring has succeeded in these areas and has matured in its ability to give users freedom in defining what is to be monitored.

Monitoring gathers information about a computational process as it executes and can be classified by its functionality (see Figure 1). Dependability includes fault tolerance and safety. Performance enhancement includes dynamic system configuration, dynamic program tuning, and on-line steering [2]. Correctness checking is the monitoring of an application to ensure consistency with a formal specification. It can be used to detect runtime errors or as a verification technique. Security monitoring attempts to detect security violations such as illegal login or attempted file access. Control includes cases where the monitoring system is part of the target system, a necessary component in providing computational functionality. Debugging and testing employs monitoring techniques to extract data values from an application being tested. Performance evaluation uses monitoring to extract data from a system that is later analyzed to assess system performance.

I focus on four of the seven functional areas: dependability, performance enhancement, correctness checking, and security. The systems in these functional areas exhibit common characteristics.
First, the monitor functions as an external observer of the target software. Unlike control monitors, external observers are not required to provide computational functionality. Second, the systems are designed to monitor the target software and respond while the target software is operational. This forces the monitoring system to react in a timely manner to events as they occur in the target system. (Debuggers are not so constrained, because they either slow the application's execution rate or simply gather trace data for later analysis or replay.) Lastly, the monitoring component is a permanent part of the overall system, although at times it may run at reduced functionality. (This is unlike performance evaluation tools that are, like some hardware test tools, attached to a system.)

We call a monitoring system that is an external observer, monitors a fully functioning application, and is generally intended to be permanent an on-line monitoring system. These systems often do more than just gather information; they interpret the gathered information and respond appropriately. On-line monitoring systems can therefore provide increased robustness, security, fault tolerance, and adaptability.

Computer, 0018-9162/95/$4.00 © 1995 IEEE

CONCEPTS AND TERMINOLOGY

An on-line monitoring system is a process or set of possibly distributed processes whose function is the dynamic gathering, interpreting, and acting on information concerning an application as that application executes. In an event-based monitoring system, which I discuss here, the gathered information arrives at the monitoring system in the form of events. An event describes an activity usually involving just a small part of the application state space.

Events can be grouped into three primary categories: hardware-level events, process-level events, and application-dependent events.

Hardware-level events are low-level activities such as page faults, sampling of a cache miss counter, and I/O channel activity. In Autonet, for example, events include exceeded threshold on corrupt packets, exceeded threshold on stuck links, or excessive violations occurring from such things as static on a network line.

Process-level events are events observable external to the process. Figure 2 illustrates event activity at this level. Communication between a program and a file (or device) is evident by observing communication between the application and the file subsystem (in a Unix-based system). Communication between processes is similarly visible by observing activity occurring between an application and the interprocess communication subsystem. Process state information is available in the process control subsystem.

Application-dependent events describe activity internal to an application. The types of application-dependent events that a monitoring system defines for use depend on the monitoring system's purpose. Defining just the right event set can be difficult. What set of events is sufficient to capture the desired behavior in the application? Is it enough to capture changes to selected variables and messages passed between processes, or is a higher-level view needed to observe, for example, changes to the membership of a group of processes?
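The three event categories can be pictured as a tag carried by every event record. The following Python sketch is illustrative only: the names `EventLevel`, `Event`, and `group_by_level`, and the choice of fields, are assumptions made for this example and do not come from any of the monitoring systems discussed in this tutorial.

```python
from dataclasses import dataclass, field
from enum import Enum
import time


class EventLevel(Enum):
    """The three primary event categories named in the text."""
    HARDWARE = "hardware-level"             # e.g. page faults, I/O channel activity
    PROCESS = "process-level"               # e.g. file or interprocess communication
    APPLICATION = "application-dependent"   # activity internal to the application


@dataclass(frozen=True)
class Event:
    """One event: a small piece of the application state space (hypothetical layout)."""
    level: EventLevel
    name: str
    data: dict = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)


def group_by_level(events):
    """Return a dict mapping each category to the events that belong to it."""
    grouped = {level: [] for level in EventLevel}
    for event in events:
        grouped[event.level].append(event)
    return grouped


events = [
    Event(EventLevel.HARDWARE, "page_fault"),
    Event(EventLevel.APPLICATION, "group_membership_changed", {"members": 3}),
    Event(EventLevel.HARDWARE, "cache_miss_sample", {"count": 17}),
]
grouped = group_by_level(events)
```

Carrying the category on the event itself lets a monitor route hardware-level and process-level events, whose sets are usually fixed, separately from application-dependent events, whose set depends on the monitor's purpose.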
Sensors

A sensor is an entity that observes the behavior of a small part of the application system state space. Upon being triggered, a sensor generates an event. A sensor is triggered either by a change to the entity it observes or by a request from the monitoring system.

When triggered by a change to the entity, the sensor is said to trace the entity. Tracing is performed synchronously with the change in the value of the entity. When the value of the object changes, the sensor reports the new value to the monitoring system [4]. How does the sensor know there has been a change to the entity? Most frequently, sensors are placed in the target system at locations where changes to the entity occur. The sensor code is then executed immediately after the change occurs.

Sampling, on the other hand, is the on-demand collection of information by a monitoring program and is asynchronous with the change in the entity's value. When a monitoring routine decides to collect an entity's value, it sends a message to the appropriate sensor, and the sensor returns the current value [4].

In Figure 3, the monitoring system is notified whenever a change occurs to the temperature variable temp in the target system. The sensor is a small code segment in the application address space that is triggered by a change to temp. Upon being triggered, the sensor captures the variable's value and sends an event to the monitoring system.

Sensors can include additional functionality. The user can define conditions that must be satisfied before the monitoring system is notified. For example, the sensor may generate an event only when the temperature exceeds 100 degrees Celsius. The condition is effectively a filter, filtering some events while allowing others to pass.

Figure 1. Primary uses of monitoring.

Figure 2. Externally observable activities.

Figure 3. Sensing a value in a target system.

June 1995
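The difference between tracing, sampling, and a filtering condition can be sketched in a few lines of Python. This is a minimal illustration built around the temp example, not the code of any real monitoring system; the `Sensor` class, its method names, and the callback-style monitor are all assumptions introduced here.

```python
class Sensor:
    """Observes one named entity and reports to a monitor callback (hypothetical API)."""

    def __init__(self, entity, monitor, condition=None):
        self.entity = entity
        self.monitor = monitor        # callable taking (entity, value)
        self.condition = condition    # optional filter; None lets every event pass
        self._value = None

    def trace(self, new_value):
        """Tracing: run in the target right after the entity changes."""
        self._value = new_value
        if self.condition is None or self.condition(new_value):
            self.monitor(self.entity, new_value)

    def sample(self):
        """Sampling: the monitor asks for the current value on demand."""
        return self._value


received = []  # stands in for the monitoring system's event queue
sensor = Sensor(
    "temp",
    monitor=lambda entity, value: received.append((entity, value)),
    condition=lambda value: value > 100,   # notify only above 100 degrees Celsius
)

for reading in (95, 101, 99, 120):
    temp = reading        # the change to the observed entity ...
    sensor.trace(temp)    # ... is followed immediately by the sensor code

current = sensor.sample()
```

Here `received` ends up holding only the two readings above 100, while `sample()` still returns the latest value (120) regardless of the filter. That mirrors the text: the condition limits what the monitor is told about, not what the sensor can observe.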

Figure 4. Monitoring tasks. The dark oval represents preexecution tasks. The light gray ovals represent tasks that can be done either before or during execution. The tasks shown as white ovals must be done during execution.

Actions, event history, and interference

An action is the monitoring system's response to an event or set of events. Actions can, for example, alter the application state space, report some aspect of application behavior to the user, or start up a process.

Some monitoring systems maintain an event history, which may contain all events since system start-up or some subset of events (for example, the last twenty events of each sensed entity, or a count of the number of login events since the system was booted). An event history's size is constrained ultimately by the available storage space. Some kind of event history is necessary if the monitoring system is to evaluate behavior as it occurs over time.

Monitoring systems are often characterized by the level of interference they impose upon the application. If the monitoring system requires the use of application resources (that is, CPU time, I/O devices, or shared communication channels), it is said to be intrusive. Intrusive monitoring raises the possibility that, through collecting information to analyze target system behavior, one is altering that very behavior. This is referred to as the Heisenberg effect for software. If no resources are consumed, the monitoring system is nonintrusive. A nonintrusive monitoring system has no effect on the order and timing of events in the application. A monitoring system can be a nonintrusive gatherer, but intrusive when it executes actions. Most monitoring systems, particularly those that rely on software added to sensors, are intrusive to some degree. Completely nonintrusive monitoring systems use dedicated hardware for monitoring.

Monitoring approaches

There are three broad approaches to monitoring: hardware, software, and hybrid approaches.

Hardware monitoring requires instrumenting the hardware platform on which the application runs. Tsai et al. [5] use dedicated hardware to latch data directly off the target system's internal buses. Hardware approaches have the advantage of low intrusiveness but generally provide very low-level data.

Software monitoring requires instrumenting the application source code, system libraries, or compiler. Software approaches are generally more portable and present information at an abstraction level closer to the user's way of thinking than, say, binary code or assembly language instructions, making them easier to use than hardware approaches.

Hybrid monitoring brings together the nonintrusive nature of hardware approaches and the flexibility of software approaches. Most monitoring systems employ either software or hybrid monitoring.

Monitoring can occur either synchronously with application execution or asynchronously to the execution. Synchronous checking, or assertion checking, requires that the user add assertions to the application code. Assertions are checks to determine if, for a particular section of software, the relevant parts of the system state (for example, variable values or I/O signals) are within the bounds needed for that section to operate properly [6]. Assertions are placed directly in the application and can only be checked when encountered during execution. If more frequent checking is needed, asynchronous checking must be used. Asynchronous checking is done in an external process that receives events from the application. Most monitoring systems are of the latter kind.

Monitoring distributed systems

Monitoring distributed systems brings with it its own set of problems. The main issues in monitoring distributed systems are as follows:

• Delays in transferring information mean this information may be out of date.
• Variable delays in transferring information result in events arriving out of order.
• The number of objects generating monitoring information in a large system can easily swamp monitors.
• In the likely event that the distributed system is heterogeneous, a canonical form is needed to encode messages passed between heterogeneous machines.

MONITORING SYSTEM ACTIVITIES

Many activities, as derived from Snodgrass [1], go into making monitoring work. These activities are characterized by two traits: Is the task performed by the user or by the monitoring system? Is the task performed before execution, during execution, or both?

In Figure 4, preexecution tasks are shown as the darkest ovals (for example, Sensor setup). The light gray ovals represent tasks that can be done either before or during execution. The tasks shown as white ovals must be done during execution. It is desirable to provide as much flexibility as possible by delaying user tasks until later stages. This lets the user make adjustments as needed without recompiling the entire monitoring system and application. In some systems, activities are omitted or combined.

Sensor setup

Sensor setup usually precedes application execution. Sensor setup performed during execution is difficult in software-based, application-level monitoring systems because sensors are generally implemented as embedded code in the application data space.

Sensor configuration involves deciding what information each sensor will record and where the sensor will be located, and it can be done by the monitoring system. Sensor installation, on the other hand, involves placing the coded sensors at the correct locations and is generally done by the user. Automated sensor installation requires the use of dependency analysis like that used in parallel compilers [7].

An enabled sensor is ready to collect information. Some sensors are permanently enabled (that is, permanently on), whereas others can be individually or collectively enabled or disabled either automatically or at the user's direction. Sensor enabling is generally performed by the monitoring system.

Sensing is the runtime activity of collecting information about the application. When a sensed event occurs in an application, it must be conveyed to the monitor.
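Sensor enabling can be pictured as a flag that is consulted each time a sensor fires. The Python sketch below is a hedged illustration of permanently enabled sensors alongside sensors that are switched individually or collectively; the `SensorSwitchboard` class and all of its method names are hypothetical and are not drawn from any system described in this tutorial.

```python
class SensorSwitchboard:
    """Tracks which installed sensors are enabled (illustrative only)."""

    def __init__(self, convey):
        self.convey = convey      # callable that delivers an event to the monitor
        self.enabled = {}         # sensor name -> bool
        self.permanent = set()    # sensors that can never be switched off

    def install(self, name, enabled=True, permanent=False):
        self.enabled[name] = True if permanent else enabled
        if permanent:
            self.permanent.add(name)

    def set_enabled(self, name, on):
        """Enable or disable one sensor; permanent sensors ignore the request."""
        if name not in self.permanent:
            self.enabled[name] = on

    def set_all(self, on):
        """Enable or disable every switchable sensor collectively."""
        for name in self.enabled:
            self.set_enabled(name, on)

    def sense(self, name, value):
        """Runtime sensing: convey the event only if the sensor is enabled."""
        if self.enabled.get(name, False):
            self.convey((name, value))
            return True
        return False


conveyed = []
board = SensorSwitchboard(conveyed.append)
board.install("login", permanent=True)
board.install("temp")

board.set_all(False)              # collective disable; "login" stays on
board.sense("login", "beth")      # conveyed
board.sense("temp", 101)          # dropped, sensor is disabled
board.set_enabled("temp", True)   # individual enable
board.sense("temp", 102)          # conveyed
```

Keeping the flag in the monitoring layer rather than in each sensor is one possible design; it matches the observation that enabling is generally performed by the monitoring system, while installation is left to the user.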
The event is either conveyed immediately or can be delayed if the cost of conveying individual events is too high. How events are conveyed depends on the system architecture. In a single-processor or shared-memory system, an event can be written to a shared-memory location and the monitor notified by signal or interrupt. In a message-based environment, events are sent by message.

Event interpretation

Event interpretation is the heart of the monitoring system, where the monitoring system interprets the gathered information. What does the monitoring system need to allow it to make sense of the information?

Hardware-level and process-level monitoring systems have relatively simple event interpretation components. For these systems, the event set is usually fixed and known in advance, so the event interpretation specification can contain a specific response for each event type.

Application-dependent monitoring is more difficult. If a user is looking for error conditions or other violations, the system needs an event interpretation specification that contains either a complete description of correct behavior or a description of each error condition. In the former case, an incoming event is compared to the monitoring system's notion of what should be happening to see if the event is consistent with that notion. IDES [8] and Sankar and Mandal [9] use this approach. Where each error condition is described individually, an event arriving at the monitoring system is compared to the set of descriptions. A match indicates an error has occurred. Leveson [6] uses this approach in her synchronous monitoring scheme.

The description need not necessarily be of an error condition. ISSOS and Meta [10] both provide a language and data model (entity-relationship data model) for describing arbitrarily complex scenarios. Incoming events are directed to the appropriate behavior description. A match occurs when the events satisfy the description.

Action specification and execution

An action specification is a description supplied by the user of the action to be taken when significant behavior occurs. On recognizing a behavior, the monitoring system executes an appropriate action. Although most monitors perform some action, often the action does not alter the application's state. In some concurrent debuggers, for example, the monitoring system passively accepts events during execution, storing them in a repository. On program completion, the monitoring system invokes a program to analyze the events in the repository and format them for graphical display.

A SAMPLING OF MONITORING SYSTEMS

I have selected one representative sample from each of the functional areas that comprise on-line monitoring. Huang and Kintala's components enhance the dependability of an application; IDES monitors security violations; Sankar and Mandal's methodology provides correctness checking; and the Falcon system provides a mechanism for on-line steering of computationally intense parallel applications. With each sample I focus on the following key questions:

• Events. What are the events?
• Sensing. How is the event data gathered?
• Event interpretation. How does the monitoring system interpret events?
• Action execution. What response does the monitoring system make?

The samples amply demonstrate that there are varied and interesting ways to address each question. At the end of each sample I briefly discuss the potential advantages and/or disadvantages of the approach. See Table 1 for a summary of characteristics.

Table 1. Summary of monitoring system characteristics.

Characteristic | Huang and Kintala | IDES | Sankar and Mandal | Falcon
---------------------------------------------------------------------------
Purpose | Dependability | Security | Correctness checking | On-line program steering
Configure sensors | Sensor provided as library routine | User writes sensors before monitor execution | Sensors created by compiler | User writes sensors before execution
Install sensors | User manually adds sensors before execution | User manually adds sensors to kernel before execution | Sensors installed automatically before execution | User manually adds sensors before execution
Enable sensors | Always enabled | Always enabled | Always enabled | Enabled/disabled during execution
Event interpretation specification | Built into monitoring system | Profiles and statistical models define behavior | Annotations added to source | User describes event/action using view language
Event interpretation | Match occurs when heartbeat not received | Match occurs on every event | Match occurs on every event | Match occurs when events satisfy condition in view
Sensing | Sampling | Tracing | Tracing | Sampling and tracing
Action specification | Restart application on node or backup | Anomaly records provided to user | Diagnostic information provided to user | User encodes decision routines and actions
Action execution | Match occurs | Anomaly occurs | Inconsistency detected | Match occurs

Huang and Kintala

Huang and Kintala [11] have developed a set of software components that are easily incorporated into an existing application to enhance its level of fault tolerance:

• watchd watches an application process and recovers it in the event the process or node on which the process resides crashes;
• libft is used to specify and checkpoint critical data, recover checkpointed data, log events, and locate and reconnect a recovered process; and
• REPL provides facilities for on-line replication of user-specified files on a backup host.

I focus on watchd, the component responsible for gathering information to determine whether a process has crashed.
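Table 1 summarizes the watchd rule as a match that occurs when a heartbeat is not received, with a restart of the application as the action. That rule is essentially a timeout check, which the following Python sketch illustrates. It is not watchd's interface: the `HeartbeatWatcher` class, the injected clock, and the restart callback are assumptions made so the example is small and deterministic.

```python
class HeartbeatWatcher:
    """Restarts any process whose last heartbeat is older than `timeout` (sketch)."""

    def __init__(self, timeout, restart, clock):
        self.timeout = timeout    # seconds allowed between heartbeats
        self.restart = restart    # callable invoked with the process name
        self.clock = clock        # callable returning the current time
        self.last_seen = {}       # process name -> time of last heartbeat

    def heartbeat(self, process):
        """Record an 'I am alive' event; no other action is taken."""
        self.last_seen[process] = self.clock()

    def check(self):
        """Return the processes judged hung, restarting each one."""
        now = self.clock()
        restarted = []
        for process, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                self.restart(process)
                self.last_seen[process] = now   # fresh window after the restart
                restarted.append(process)
        return restarted


now = [0.0]       # a fake clock the example can advance by hand
restarts = []
watcher = HeartbeatWatcher(timeout=5.0, restart=restarts.append,
                           clock=lambda: now[0])

watcher.heartbeat("app")
now[0] = 4.0
first = watcher.check()     # within the window, nothing happens
now[0] = 6.0
second = watcher.check()    # heartbeat overdue, "app" is restarted
```

Injecting the clock is purely a testing convenience; a real deployment would use a monotonic system clock and would restart from an initial or checkpointed state rather than call a Python function.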
One approach to sensor setup is to augment the application process with a routine that periodically sends a heartbeat message (that is, an "I am alive" event) to watchd. When the event arrives, the monitoring system does nothing. If the heartbeat message is not received within a specified period of time, the monitoring system assumes the application is hung and restarts the target process at an initial or checkpointed state on the host node or backup node.

DISCUSSION. Watchd has the benefit of being minimally intrusive. The heartbeat routine is provided as a library routine, so the user need only link with the library and write code to periodically invoke the routine. On the other hand, the approach is limited to detecting only process crashes.

Intrusion Detection Expert System (IDES)

IDES [8] is a model for a real-time intrusion-detection expert system. The model proposes to detect numerous security violations ranging from attempted break-ins by outsiders to system penetrations and abuses by insiders.

Events in IDES include login, command execution, program execution, file access, file protection violation, and device access.

For IDES to be totally transparent to the user, sensors need to reside in the kernel. A likely place to install a sensor to detect attempts to log onto a Unix system, for example, would be in the login process. When a user ID is entered, the sensor is triggered and sends an event to IDES.

To interpret events, IDES matches an incoming event against a set of profiles. A profile characterizes a subject's behavior with respect to an object; it serves as a description of normal activity between a subject and object. Subjects are the initiators of activities (such as a user or process), while objects are the receivers of activities (such as files, records, or terminals). When a profile match is found, the monitor uses the event history and a statistical model identified in the profile to determine whether the current event is consistent with the normal behavior described by the profile. If the event is normal, it updates the profile. Otherwise, it is stored as an anomaly and reported to the operator.

The knowledge possessed by the monitoring system is of two kinds. The first kind is a set of activity rules that are applicable independent of the application being monitored; activity rules specify the action to be taken when a condition is satisfied. For example, when a match occurs between an event and a profile, the action taken is to update the profile and check for anomalous behavior. The second kind is the set of profiles, statistical metrics, and statistical models that are unique to an application and must be supplied by the user.

DISCUSSION. Adding a new object would, in the worst case, involve adding a sensor to the kernel, adjusting profiles to include the allowable actions on the new object, and installing a new statistical model for evaluating the reasonableness of an action taken against the object. It is conceivable to update the profiles and statistical models on-line. Adding a new sensor to the kernel would necessitate rebooting the system, at a minimum.

Sankar and Mandal

Sankar and Mandal [9] have developed a methodology to continuously monitor an executing Ada program for specification consistency. The user annotates an Ada program with constructs from Anna, a formal specification language. Annotations are predicates (that is, Boolean-valued expressions) that express constraints on Ada language constructs such as data objects, types, subtypes, subprograms, and exceptions. The annotation below constrains all values of type EVEN to be even numbers. EVEN_CONSTRAINT is the annotation name.

    type EVEN is new INTEGER;
    --| <<EVEN_CONSTRAINT>>
    --| where X : EVEN => X mod 2 = 0;

Sensing is achieved indirectly by adding annotations to the target system. By annotating the code, the user is at the same time selecting the sensor locations.
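One way to picture what an annotation such as EVEN_CONSTRAINT asks for at run time is a check that runs wherever a value of the constrained type is produced. The Python sketch below is an analogy only: it is neither Ada nor Anna, and the names `ConstraintViolation`, `check_even`, `double`, and `halve` are invented for this illustration.

```python
class ConstraintViolation(Exception):
    """Raised when a value is inconsistent with its annotation (hypothetical)."""


def check_even(x):
    """Stand-in for the EVEN_CONSTRAINT predicate: x mod 2 = 0."""
    if x % 2 != 0:
        raise ConstraintViolation(f"EVEN_CONSTRAINT violated: {x} is not even")
    return x


def double(n):
    # The call to check_even marks the spot where a sensor would sit.
    return check_even(2 * n)


def halve(n):
    # Halving an even number does not always give an even result.
    return check_even(n // 2)


ok = double(7)          # 14 satisfies the constraint and is returned unchanged

try:
    halve(6)            # 3 is odd, so the check raises
    diagnostic = None
except ConstraintViolation as exc:
    diagnostic = str(exc)
```

The example is synchronous: the check can only run when execution reaches the annotated location, which is the trade-off noted earlier for assertion checking.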
The compiler transforms the annotations into checking functions, each function as a separate task. In place of the transformed annotation, the compiler inserts a call to the checking function. The call statement then is a sensor, and the checking function task becomes a monitor. An event occurs when the call statement is executed.

Since each annotation is transformed into its own monitor, the monitor code is usually quite simple. In EVEN_CONSTRAINT above, the monitor checks that the parameter is even and returns the value if the condition is true or raises an exception otherwise. If an inconsistency occurs, diagnostic information is provided.

DISCUSSION. Because of its synchronous checking approach, sensors can be installed automatically by the compiler. However, adding another annotation requires recompiling the application software. Also, because a separate checking task is created for every annotation, a potentially large number of monitors can exist for an application.

Falcon

Falcon [12] is a set of tools that support on-line steering of parallel and distributed applications. The approach offers a way of providing interactivity to high-performance applications that separates the interactive component from the computationally intensive component and provides a dynamic link between the two. The interactive, or steering, component monitors the application, displays the information to the end user or submits it to a steering agent, accepts steering commands from the user, and enacts changes that affect the application's execution.

Events are user defined, involving application-specific data. As an example, an event can be generated when a thread has attempted to obtain a mutex lock and another generated when the thread has succeeded in obtaining the lock. The user defines the sensors using a sensor specification language and manually inserts them in the target program.
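The mutex example can be made concrete with a lock wrapper that emits one event on the attempt and another on success. This Python sketch does not use Falcon's sensor specification language, which is not shown in this tutorial; `MonitoredLock`, its `emit` callback, and the event tuple layout are assumptions chosen for illustration.

```python
import threading


class MonitoredLock:
    """A mutex whose acquisition is bracketed by two hand-inserted sensors."""

    def __init__(self, name, emit):
        self._lock = threading.Lock()
        self.name = name
        self.emit = emit    # callable that forwards an event to a monitor

    def __enter__(self):
        thread = threading.current_thread().name
        self.emit(("lock_attempt", self.name, thread))    # sensor 1: attempt
        self._lock.acquire()
        self.emit(("lock_acquired", self.name, thread))   # sensor 2: success
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False    # do not suppress exceptions from the guarded block


events = []    # list.append is safe to call from several threads in CPython
queue_lock = MonitoredLock("work_queue", events.append)


def worker():
    with queue_lock:
        pass    # the critical section would go here


threads = [threading.Thread(target=worker, name=f"t{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The time between a thread's `lock_attempt` and its `lock_acquired` event is the time it spent waiting, which is the kind of application-specific information a steering component could act on.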
Sensors forward events to a local monitor resident on the target program's processor, which collects the events and can apply filters to reduce the monitoring overhead or analysis tools to produce higher level information.

To interpret events, the gathered events (once filtered and analyzed) are matched against event/action records stored in a repository. When the condition in the event/action record is satisfied, a match occurs and the associated action is executed. The action might be to perform some actual steering action on the application, note the occurrence of some monitoring event for future reference, or simply forward the event for display or further processing.

DISCUSSION. Falcon is the most general of the monitors discussed. It supports sampled sensors, traced sensors, and traced sensors with filtering and computing capabilities. Sensors can be enabled and disabled. The monitoring system can be configured to meet the needs of the application. On-line steering is done either interactively by directly involving the user or by means of user-developed algorithms. Sensors, however, must be defined before execution begins and must be installed manually by the user.

ON-LINE MONITORING IS INCREASINGLY SEEN AS A VIABLE means of increasing application dependability. Formal methods, often regarded as a way of guaranteeing a certain level of dependability, are not without shortcomings. More importantly, they are difficult to apply comprehensively to large development projects. Design assumptions made by formal techniques to deal with unpredictability of the external environment or to simplify a problem can be violated at runtime. In some cases, it may simply be infeasible to formally verify some properties. On-line monitoring can be used to complement formal techniques to increase the overall dependability of an application.
In addition, monitoring distributed and parallel systems during execution can provide information that can be used to reconfigure the system, tune the application, steer its outcome, or provide visualization of behavior. With monitoring systems today being more general (less specific to a target architecture or application), they are promising tools for wider use in the future.

Acknowledgments

I thank Sudhir Aggarwal, State University of New York, Binghamton, for his time and suggestions given generously throughout the development of this article. I also thank Chandra Kintala, AT&T Bell Laboratories, and Karsten Schwan, Georgia Tech, for reviewing parts of the manuscript.

References

1. R. Snodgrass, "A Relational Approach to Monitoring Complex Systems," ACM Trans. Computer Systems, Vol. 6, No. 2, May 1988, pp. 156-196.
2. W. Gu, J. Vetter, and K. Schwan, "An Annotated Bibliography of Interactive Program Steering," SIGPlan Notices, Vol. 29, No. 9, Sept. 1994, pp. 140-148.
3. S. Mullender, ed., Distributed Systems, 2nd ed., ACM Press, New York, 1993, pp. 283-312.
4. M. Kaelbling and D. Ogle, "Minimizing Monitoring Costs: Choosing Between Tracing and Sampling," Proc. 23rd Int'l Conf. System Sciences, IEEE CS Press, Los Alamitos, Calif., Jan. 1990, pp. 314-320.
5. J.J.P. Tsai et al., "A Noninterference Monitoring and Replay Mechanism for Real-Time Software Testing and Debugging," IEEE Trans. Software Eng., Vol. 16, No. 8, Aug. 1990, pp. 897-916.
6. N. Leveson and T. Shimeall, "Safety Assertions for Process-Control Systems," Proc. 13th Int'l Symp. Fault-Tolerant Computing, IEEE CS Press, Los Alamitos, Calif., Order No. 477 (microfiche only), 1983, pp. 236-240.
7. D.M. Ogle, K. Schwan, and R. Snodgrass, "Application-Dependent Dynamic Monitoring of Distributed and Parallel Systems," IEEE Trans. Parallel and Distributed Systems, Vol. 4, No. 7, July 1993, pp. 762-778.
8. D.E. Denning, "An Intrusion-Detection Model," IEEE Trans. Software Eng., Vol. 13, No. 2, Feb. 1987, pp. 222-232.
9. S. Sankar and M. Mandal, "Concurrent Runtime Monitoring of Formally Specified Programs," Computer, Vol. 26, No. 3, Mar. 1993, pp. 32-41.
10. K. Marzullo et al., "Tools for Distributed Application Management," Computer, Vol. 24, No. 8, Aug. 1991, pp. 42-51.
11. Y. Huang and C. Kintala, "Software-Implemented Fault Tolerance: Technologies and Experience," Proc. 23rd Int'l Symp. on Fault-Tolerant Computing, IEEE CS Press, Los Alamitos, Calif., Order No. 3680-02T, 1993, pp. 2-9.
12. W. Gu et al., "Falcon: On-line Monitoring and Steering of Large-Scale Parallel Programs," Tech. Report GIT-CC-94-21, College of Computing, Georgia Institute of Technology, Atlanta, Ga., 1994.

Beth A. Schroeder is a PhD candidate in computer science at State University of New York at Binghamton. Current research interests include on-line monitoring, distributed real-time systems, and safety-critical systems. She received an MS in computer science from Temple University in 1991, an MBA from the University of La Verne in 1986, and a BSc in computer science from the University of Southern Mississippi in 1984.

Readers can contact the author via e-mail at beths@cc.gatech.edu.